Estimate by Patra and Sen the unknown component weight as well as the unknown distribution in admixture models

Estimation of unknown elements (by Patra and Sen method) under the admixture model with probability density function l: l = p*f + (1-p)*g, where g is the known component of the two-component mixture, p is the unknown proportion of the unknown component distribution f. More information in 'Details' below concerning the estimation method.

Usage

PatraSen_est_mix_model(
  data,
  method = c("lwr.bnd", "fixed", "cv"),
  c.n = NULL,
  folds = 10,
  reps = 1,
  cn.s = NULL,
  cn.length = 100,
  gridsize = 600
)

Arguments

data: Sample where the known component density of the admixture model has been transformed into a Uniform(0,1) distribution.
method: Either 'fixed' or 'cv', depending on whether compute the estimate based on the value of 'c.n' or use cross-validation for choosing 'c.n' (tuning parameter).
c.n: A positive number, with default value equal to 0.1 log(log(n)), where 'n' is the length of the observed sample.
folds: Number of folds used for cross-validation, default is 10.
reps: Number of replications for cross-validation, default is 1.
cn.s: A sequence of 'c.n' to be used for cross-validation (vector of values). Default is equally spaced grid of 100 values between .001 x log(log(n)) and 0.2 x log(log(n)).
cn.length: (default to 100) Number of equally spaced tuning parameter (between .001 x log(log(n)) and 0.2 x log(log(n))). Values to search from.
gridsize: (default to 600) Number of equally spaced points (between 0 and 1) to evaluate the distance function. Larger values are more computationally intensive but also lead to more accurate estimates.

Value

A list containing 'alp.hat' (estimate of the unknown component weight), 'Fs.hat' (list with elements 'x' and 'y' values for the function estimate of the unknown cumulative distribution function), 'dist.out' which is an object of the class 'dist.fun' using the complete data.gen, 'c.n' the value of the tuning parameter used to compute the final estimate, and finally 'cv.out' which is an object of class 'cv.mixmodel'. The object is NULL if method is "fixed".

Details

See Patra, R.K. and Sen, B. (2016); Estimation of a Two-component Mixture Model with Applications to Multiple Testing; JRSS Series B, 78, pp. 869--893.

Author

Xavier Milhaud xavier.milhaud.research@gmail.com

Examples

## Simulate data:
list.comp <- list(f = 'norm', g = 'norm')
list.param <- list(f = list(mean = 3, sd = 0.5),
                   g = list(mean = 0, sd = 1))
data1 <- rsimmix(n = 1500, unknownComp_weight = 0.8, list.comp, list.param)[['mixt.data']]
## Transform the known component of the admixture model into a Uniform(O,1) distribution:
list.comp <- list(f = NULL, g = 'norm')
list.param <- list(f = NULL, g = list(mean = 0, sd = 1))
data1_transfo <- knownComp_to_uniform(data = data1, comp.dist=list.comp, comp.param=list.param)
PatraSen_est_mix_model(data = data1_transfo, method = 'fixed',
                       c.n = 0.1*log(log(length(data1_transfo))), gridsize = 1000)$alp.hat
#> [1] 0.798