Estimation of both the weight and the distribution of the unknown component in an admixture model, using the Patra and Sen approach. Recall that the admixture probability density function (pdf) l is given by l = p*f + (1-p)*g, where g is the known component of the two-component mixture and p is the unknown proportion of the unknown component distribution f. See 'Details' below for more information on the estimation method.
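The admixture density above can be sketched in base R, without the package. This is only an illustration: the gamma and exponential components and the weight 0.33 are borrowed from the Examples section, on the assumption that the weight there refers to the gamma (unknown) component, and the names `f_pdf`, `g_pdf` and `l_pdf` are hypothetical.

```r
set.seed(1)
n <- 800
p <- 0.33  # assumed proportion of the unknown component f

# Unknown component f (gamma) and known component g (exponential).
f_pdf <- function(x) dgamma(x, shape = 2, scale = 0.5)
g_pdf <- function(x) dexp(x, rate = 0.25)

# Admixture pdf: l = p*f + (1-p)*g.
l_pdf <- function(x) p * f_pdf(x) + (1 - p) * g_pdf(x)

# Draw from l: each observation comes from f with probability p, else from g.
from_f <- rbinom(n, size = 1, prob = p) == 1
x <- ifelse(from_f,
            rgamma(n, shape = 2, scale = 0.5),
            rexp(n, rate = 0.25))

# Sanity check: a valid pdf integrates to 1 over its support.
total_mass <- integrate(l_pdf, lower = 0, upper = Inf)$value
```

In practice only g is known; the estimation problem is to recover p and f from `x` alone.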
Arguments
- samples
Sample to be studied.
- admixMod
An object of class admix_model, containing information about the known component distribution and its parameter(s).
- method
One of 'lwr.bnd', 'fixed' or 'cv': respectively, compute a lower bound of the mixing proportion, compute the estimate based on the supplied value of 'c.n', or choose 'c.n' (the tuning parameter) by cross-validation.
- c.n
(default to NULL) A positive number for the penalization, see reference below. If NULL, it is set to 0.1*log(log(n)).
- folds
(optional, default to 10) Number of folds used for cross-validation.
- reps
(optional, default to 1) Number of replications for cross-validation.
- cn.s
(optional) A vector of 'c.n' values to be used for cross-validation. Default is an equally spaced grid of 100 values between 0.001 x log(log(n)) and 0.2 x log(log(n)).
- cn.length
(optional, default to 100) Number of equally spaced tuning parameter values (between 0.001 x log(log(n)) and 0.2 x log(log(n))) to search over.
- gridsize
(default to 600) Number of equally spaced points (between 0 and 1) to evaluate the distance function. Larger values are more computationally intensive but also lead to more accurate estimates.
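The documented defaults for the tuning parameter can be reproduced in base R. This is a minimal sketch of the formulas stated above, not the package's internal code; it assumes n denotes the sample size, and the value n = 800 is simply taken from the Examples section.

```r
# Assumed sample size (matches the simulated data in the Examples section).
n <- 800

# Default penalization used when c.n = NULL: 0.1 * log(log(n)).
c.n_default <- 0.1 * log(log(n))

# Default cross-validation grid: cn.length equally spaced values
# between 0.001 * log(log(n)) and 0.2 * log(log(n)).
cn.length <- 100
cn.s <- seq(0.001 * log(log(n)), 0.2 * log(log(n)), length.out = cn.length)

# The default c.n lies inside the range searched by cross-validation.
c.n_in_range <- c.n_default > min(cn.s) && c.n_default < max(cn.s)
```

Passing a custom 'cn.s' would replace this grid, which may help when the default range looks too narrow or too coarse for a given sample size.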
Value
An object of class estim_PS, containing 10 attributes: 1) the number of samples studied (1 in this case); 2) the sample size; 3) the information about component distributions of the admixture model; 4) the estimation method (Patra and Sen here); 5) the estimated mixing weight (estimate of the unknown component proportion); 6) the estimated decontaminated CDF; 7) an object of the class 'dist.fun' (that gives the distance); 8) the tuning parameter 'c.n'; 9) the lower bound of the estimated mixing proportion (if such an option has been chosen); 10) the number of observations.
References
Patra RK, Sen B (2016). “Estimation of a two-component mixture model with applications to multiple testing.” Journal of the Royal Statistical Society Series B, 78(4), 869-893.
Author
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 800, weight = 0.33,
                      comp.dist = list("gamma", "exp"),
                      comp.param = list(list("shape" = 2, "scale" = 0.5),
                                        list("rate" = 0.25)))
data1 <- getmixtData(mixt1)
## Define the admixture model:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
## Estimation step:
estim_PS(samples = data1, admixMod = admixMod1, method = 'fixed')
#> Call:estim_PS(samples = data1, admixMod = admixMod1, method = "fixed")
#>
#> Estimate of the mixing weight (proportion of the unknown component distribution): 0.28