Estimate the component weights, the location shift parameter (in case of a symmetric unknown component density), and the unknown component distribution using different estimation techniques. We remind that the i-th admixture model has probability density function (pdf) l_i such that: l_i = p_i * f_i + (1-p_i) * g_i, where g_i is the known component density. The unknown quantities p_i and f_i then have to be estimated.
Usage
admix_estim(samples, admixMod, est_method = c("PS", "BVdk", "IBM"), ...)
Arguments
- samples
A list of the K (K>0) samples to be studied, all following admixture distributions.
- admixMod
A list of objects of class admix_model, containing useful information about distributions and parameters.
- est_method
The estimation method to be applied. Can be one of 'BVdk' (Bordes and Vandekerkhove estimator), 'PS' (Patra and Sen estimator), or 'IBM' (Inversion Best-Matching approach) in the continuous case (continuous random variable). Only 'IBM' for discrete random variables. The same estimation method is performed on each sample if several samples are provided.
- ...
Optional arguments to estim_PS, estim_BVdk or estim_IBM depending on the choice made by the user for the estimation method.
Value
An object of class admix_estim, containing at least 5 attributes: 1) the number of samples under study; 2) the information about the mixture components (distributions and parameters); 3) the sizes of the samples; 4) the chosen estimation technique (one of 'BVdk', 'PS' or 'IBM'); 5) the estimated mixing proportions (weights of the unknown component distributions in the mixture model). In case of 'BVdk' estimation, one additional attribute corresponding to the estimated location shift parameter is included.
Details
For further details on the different estimation techniques, see references below on i) Patra and Sen estimator ; ii) Bordes and Vandekerkhove estimator ; iii) Inversion Best-Matching approach. Important note: estimation by 'IBM' requires at least two samples at hand, and provides unbiased estimators only if the distributions of unknown components are equal (meaning that it requires to perform previously this test between the pairs of samples, see ?admix_test).
References
Patra RK, Sen B (2016). “Estimation of a two-component mixture model with applications to multiple testing.” Journal of the Royal Statistical Society Series B, 78(4), 869-893. Bordes L, Delmas C, Vandekerkhove P (2006). “Semiparametric Estimation of a Two-Component Mixture Model Where One Component Is Known.” Scandinavian Journal of Statistics, 33(4), 733--752. ISSN 03036898, 14679469, http://www.jstor.org/stable/4616955. Bordes L, Vandekerkhove P (2010). “Semiparametric two-component mixture model with a known component: An asymptotically normal estimator.” Mathematical Methods of Statistics, 19(1), 22--41. doi:10.3103/S1066530710010023 . Milhaud X, Pommeret D, Salhi Y, Vandekerkhove P (2024). “Two-sample contamination model test.” Bernoulli, 30(1), 170--197. doi:10.3150/23-BEJ1593 .
Author
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 300, weight = 0.7,
comp.dist = list("norm", "norm"),
comp.param = list(list("mean" = -2, "sd" = 0.5),
list("mean" = 0, "sd" = 1)))
mixt2 <- twoComp_mixt(n = 250, weight = 0.85,
comp.dist = list("norm", "exp"),
comp.param = list(list("mean" = -2, "sd" = 0.5),
list("rate" = 1)))
data1 <- getmixtData(mixt1)
data2 <- getmixtData(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
knownComp_param = mixt2$comp.param[[2]])
# Estimation by different methods:
admix_estim(samples=list(data1), admixMod=list(admixMod1), est_method = "BVdk")
#> Warning: Estimation by 'BVdk' assumes the unknown component distribution
#> to have a symmetric probability density function.
#> Call:
#> admix_estim(samples = list(data1), admixMod = list(admixMod1),
#> est_method = "BVdk")
#>
#> Estimated mixing weight of the unknown component distribution in Sample 1: 0.65
#> Estimated location parameters of the unknown component distribution in Sample 1: -2.06
admix_estim(samples=list(data1,data2), admixMod=list(admixMod1,admixMod2), est_method = "PS")
#> Call:
#> admix_estim(samples = list(data1, data2), admixMod = list(admixMod1,
#> admixMod2), est_method = "PS")
#>
#> Estimated mixing weight of the unknown component distribution in Sample 1: 0.62
#> Estimated mixing weight of the unknown component distribution in Sample 2: 0.85
admix_estim(samples=list(data1,data2), admixMod=list(admixMod1,admixMod2), est_method = "IBM")
#> Warning: IBM estimators of two unknown proportions are reliable only if the two
#> corresponding unknown component distributions have been tested equal (see 'admix_test()').
#> Furthermore, when both the known and unknown component distributions of the mixture
#> models are identical, the IBM approach provides an estimation of the ratio of the
#> actual mixing weights rather than an estimation of the unknown weights themselves.
#> Call:
#> admix_estim(samples = list(data1, data2), admixMod = list(admixMod1,
#> admixMod2), est_method = "IBM")
#>
#> Pairwise estimation performed (IBM estimation method).
#> Remember that estimated weights are reliable only if the unknown component distributions
#> have previously been tested equal. See ?admix_test.
#>
#> ----- Pair 1 and 2 -----
#> Estimated mixing weight of the unknown component distribution in Sample 1: 0.78
#> Estimated mixing weight of the unknown component distribution in Sample 2: 0.96