`R/admix_estim.R`

`admix_estim.Rd`

Estimates the component weights, the location shift parameter (when the unknown component density is symmetric), and the unknown component distribution, using one of several estimation techniques. Recall that the i-th admixture model has probability density function (pdf) l_i such that l_i = p_i * f_i + (1-p_i) * g_i, where g_i is the known component density. The unknown quantities p_i and f_i then have to be estimated.
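To make the model concrete, the following base-R sketch builds the mixture density l = p * f + (1 - p) * g for a single sample and draws observations from it. The normal components, the weight p = 0.8, and the helper names `dadmix` and `radmix` are illustrative assumptions and are not part of the package.

```r
# Illustrative only: dadmix() and radmix() are hypothetical helpers, not package functions.
# Admixture density l(x) = p * f(x) + (1 - p) * g(x), with f and g taken as normals here.
dadmix <- function(x, p, f.param, g.param) {
  p * dnorm(x, mean = f.param$mean, sd = f.param$sd) +
    (1 - p) * dnorm(x, mean = g.param$mean, sd = g.param$sd)
}

# Draw n observations: each one comes from f with probability p, otherwise from g.
radmix <- function(n, p, f.param, g.param) {
  from.f <- rbinom(n, size = 1, prob = p) == 1
  ifelse(from.f,
         rnorm(n, mean = f.param$mean, sd = f.param$sd),
         rnorm(n, mean = g.param$mean, sd = g.param$sd))
}

set.seed(1)
f.param <- list(mean = 0, sd = 1)    # "unknown" component (known here only because we simulate)
g.param <- list(mean = 2, sd = 0.7)  # known component
x <- radmix(400, p = 0.8, f.param = f.param, g.param = g.param)

# Sanity check: the mixture density integrates to 1, as any pdf should.
total.mass <- integrate(function(u) dadmix(u, 0.8, f.param, g.param),
                        lower = -Inf, upper = Inf)$value
```

In practice only the sample `x` and the known component `g` would be available; the estimation task is to recover p (and f) from them.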

admix_estim( samples = NULL, sym.f = FALSE, est.method = c("PS", "BVdk", "IBM"), comp.dist = NULL, comp.param = NULL )

Argument | Description |
---|---|
samples | A list of the K samples to be studied, all following admixture distributions. |
sym.f | A boolean indicating whether the unknown component densities are assumed to be symmetric or not. |
est.method | The estimation method to be applied. One of 'BVdk' (Bordes and Vandekerkhove estimator), 'PS' (Patra and Sen estimator), or 'IBM' (Inversion Best-Matching approach). The same estimation method is applied to each sample. For further details, see section 'Details' below. |
comp.dist | A list with 2*K elements corresponding to the component distributions (specified with R native names for these distributions) involved in the K admixture models. Elements, grouped by 2, refer to the unknown and known components of each admixture model. If there are unknown elements, they must be specified as 'NULL' objects. For instance, 'comp.dist' could be specified as follows with K = 3: list(f1 = NULL, g1 = 'norm', f2 = NULL, g2 = 'norm', f3 = NULL, g3 = 'norm'). |
comp.param | A list with 2*K elements corresponding to the parameters of the component distributions, each element being a list itself. The names used in this list must correspond to the native R argument names for these distributions. Elements, grouped by 2, refer to the parameters of the unknown and known components of each admixture model. If there are unknown elements, they must be specified as 'NULL' objects. For instance, 'comp.param' could be specified as follows (with K = 3): list(f1 = NULL, g1 = list(mean=0,sd=1), f2 = NULL, g2 = list(mean=3,sd=1.1), f3 = NULL, g3 = list(mean=-2,sd=0.6)). |
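As an illustration of how the paired elements of 'comp.dist' and 'comp.param' line up, the base-R sketch below evaluates the known component density of each model by prefixing the distribution name with "d" and passing the matching parameter list. This is only an assumption about how such lists can be consumed; it is not taken from the package internals, and `known_density` is a hypothetical helper.

```r
# K = 3 admixture models: unknown components are NULL, known components are normals.
comp.dist <- list(f1 = NULL, g1 = "norm",
                  f2 = NULL, g2 = "norm",
                  f3 = NULL, g3 = "norm")
comp.param <- list(f1 = NULL, g1 = list(mean = 0, sd = 1),
                   f2 = NULL, g2 = list(mean = 3, sd = 1.1),
                   f3 = NULL, g3 = list(mean = -2, sd = 0.6))

# Both lists hold 2*K elements, so K is half their length.
K <- length(comp.dist) / 2

# Hypothetical helper: density of the known component of the k-th model at x.
known_density <- function(k, x) {
  dist.name <- comp.dist[[2 * k]]   # known components sit at even positions
  params <- comp.param[[2 * k]]     # named with the native R argument names
  do.call(paste0("d", dist.name), c(list(x), params))
}

# Known component densities g_1(0), g_2(0), g_3(0).
dens.at.zero <- sapply(seq_len(K), known_density, x = 0)
```

Because the parameter names are the native R argument names, the same pattern would work for any distribution with a standard `d`-prefixed density function, for example "exp" with `list(rate = 2)`.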

A list containing the estimated weight of every unknown component distribution among admixture samples.

For further details on the different estimation techniques, see: i) IBM approach: https://hal.archives-ouvertes.fr/hal-03201760; ii) Patra and Sen estimator: Patra, R.K. and Sen, B. (2016), Estimation of a Two-component Mixture Model with Applications to Multiple Testing, JRSS Series B, 78, pp. 869--893; iii) BVdk estimator: Bordes, L. and Vandekerkhove, P. (2010), Semiparametric two-component mixture model when a component is known: an asymptotically normal estimator, Math. Meth. Stat., 19, pp. 22--41.

Xavier Milhaud xavier.milhaud.research@gmail.com

##### On a simulated example to see whether the true parameters are well estimated.
list.comp <- list(f1 = "norm", g1 = "norm",
                  f2 = "norm", g2 = "norm")
list.param <- list(f1 = list(mean = 0, sd = 1), g1 = list(mean = 2, sd = 0.7),
                   f2 = list(mean = 0, sd = 1), g2 = list(mean = -3, sd = 1.1))
## Simulate data:
sim1 <- rsimmix(n = 400, unknownComp_weight = 0.8,
                comp.dist = list(list.comp$f1, list.comp$g1),
                comp.param = list(list.param$f1, list.param$g1))$mixt.data
sim2 <- rsimmix(n = 600, unknownComp_weight = 0.65,
                comp.dist = list(list.comp$f2, list.comp$g2),
                comp.param = list(list.param$f2, list.param$g2))$mixt.data
## Estimate the mixture weights of the admixture models:
list.comp <- list(f1 = NULL, g1 = "norm",
                  f2 = NULL, g2 = "norm")
list.param <- list(f1 = NULL, g1 = list(mean = 2, sd = 0.7),
                   f2 = NULL, g2 = list(mean = -3, sd = 1.1))
estim <- admix_estim(samples = list(sim1, sim2), sym.f = TRUE, est.method = 'BVdk',
                     comp.dist = list.comp, comp.param = list.param)
estim
#> $unknownComp.weight
#> [1] 0.8012741 0.6427171
#>