Estimates the component weights, the location shift parameter (when the unknown component density is assumed symmetric), and the unknown component distribution, using one of several estimation techniques. Recall that the i-th admixture model has probability density function (pdf) l_i such that: l_i = p_i * f_i + (1-p_i) * g_i, where g_i is the known component density. The unknown quantities p_i and f_i then have to be estimated.
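To make the model concrete, here is a minimal base-R sketch of a single admixture sample and its density. It does not use the package: the names `p`, `from_unknown`, `admix_pdf`, `grid` and `dens` are chosen for this illustration only, and the unknown component f is set to a normal density purely so that data can be simulated.

```r
# Illustrative sketch only (base R, no package functions).
# Admixture model: l = p * f + (1 - p) * g, with g known and f unknown in practice.
set.seed(1)
n <- 1000
p <- 0.7  # weight of the unknown component f (assumed value for the sketch)

# Draw the latent component label of each observation
from_unknown <- rbinom(n, size = 1, prob = p) == 1

x <- numeric(n)
# "Unknown" component f: chosen here as N(-2, 0.5^2) only for simulation
x[from_unknown] <- rnorm(sum(from_unknown), mean = -2, sd = 0.5)
# Known component g: N(0, 1)
x[!from_unknown] <- rnorm(sum(!from_unknown), mean = 0, sd = 1)

# Mixture pdf, given the two component densities as functions
admix_pdf <- function(x, p, f, g) p * f(x) + (1 - p) * g(x)

grid <- seq(-5, 4, by = 0.01)
dens <- admix_pdf(grid, p,
                  f = function(x) dnorm(x, mean = -2, sd = 0.5),
                  g = function(x) dnorm(x, mean = 0, sd = 1))
```

In an estimation setting only `x` and the known density g are available; the weight `p` and the density f are the targets of the estimation.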

Usage

admix_estim(
  samples,
  admixMod,
  est.method = c("BVdk", "PS", "IBM"),
  sym.f = FALSE
)

Arguments

samples

(list) A list of the K (K>0) samples to be studied, all following admixture distributions.

admixMod

(list) A list of objects of class 'admix_model', containing useful information about distributions and parameters.

est.method

The estimation method to be applied. Can be one of 'BVdk' (Bordes and Vandekerkhove estimator), 'PS' (Patra and Sen estimator), or 'IBM' (Inversion Best-Matching approach). The same estimation method is applied to each sample. Important note: estimation by 'IBM' is unbiased only under H0, so choosing this method requires first performing the hypothesis test between the pairs of samples. For further details, see section 'Details' below.

sym.f

A boolean indicating whether the unknown component densities are assumed to be symmetric or not.

Value

An object of class 'admix_estim', containing at least 5 attributes: 1) the number of samples under study; 2) the information about the mixture components (distributions and parameters); 3) the sizes of the samples; 4) the chosen estimation technique (one of 'BVdk', 'PS' or 'IBM'); 5) the estimated mixing proportions (weights of the unknown component distributions in the mixture model). In case of 'BVdk' estimation, one additional attribute corresponding to the estimated location shift parameter is included.

Details

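For further details on the different estimation techniques, see the references below: i) Patra and Sen (2016) for the Patra and Sen estimator; ii) Bordes, Delmas and Vandekerkhove (2006) and Bordes and Vandekerkhove (2010) for the BVdk estimator; iii) Milhaud et al. (2024) for the IBM approach.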

References

Patra RK, Sen B (2016). “Estimation of a two-component mixture model with applications to multiple testing.” Journal of the Royal Statistical Society Series B, 78(4), 869-893.

Bordes L, Delmas C, Vandekerkhove P (2006). “Semiparametric Estimation of a Two-Component Mixture Model Where One Component Is Known.” Scandinavian Journal of Statistics, 33(4), 733-752. ISSN 03036898, 14679469, http://www.jstor.org/stable/4616955.

Bordes L, Vandekerkhove P (2010). “Semiparametric two-component mixture model with a known component: An asymptotically normal estimator.” Mathematical Methods of Statistics, 19(1), 22-41. doi:10.3103/S1066530710010023.

Milhaud X, Pommeret D, Salhi Y, Vandekerkhove P (2024). “Two-sample contamination model test.” Bernoulli, 30(1), 170-197. doi:10.3150/23-BEJ1593.

Author

Xavier Milhaud xavier.milhaud.research@gmail.com

Examples

## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 250, weight = 0.7,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
mixt2 <- twoComp_mixt(n = 200, weight = 0.85,
                      comp.dist = list("norm", "exp"),
                      comp.param = list(list("mean" = 3, "sd" = 1),
                                        list("rate" = 1)))
data1 <- getmixtData(mixt1)
data2 <- getmixtData(mixt2)

## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])

## Estimate on a single sample, assuming a symmetric unknown component:
admix_estim(samples = list(data1), admixMod = list(admixMod1),
            est.method = 'BVdk', sym.f = TRUE)
#> Call:
#> admix_estim(samples = list(data1), admixMod = list(admixMod1), 
#>     est.method = "BVdk", sym.f = TRUE)
#> 
#> Estimated mixing weight of the unknown component distribution in Sample 1: 0.69
#> 
## Estimate on two samples at once with the Patra and Sen estimator:
admix_estim(samples = list(data1, data2),
            admixMod = list(admixMod1, admixMod2),
            est.method = 'PS')
#> Call:
#> admix_estim(samples = list(data1, data2), admixMod = list(admixMod1, 
#>     admixMod2), est.method = "PS")
#> 
#> Estimated mixing weight of the unknown component distribution in Sample 1: 0.65
#> Estimated mixing weight of the unknown component distribution in Sample 2: 0.76
#>