Skip to contents

Estimate the decontaminated density of the unknown component in the admixture model under study, after inversion of the admixture cumulative distribution function. Recall that an admixture model follows the cumulative distribution function (CDF) L, where L = p*F + (1-p)*G, with g a known CDF and p and f unknown quantities.

Usage

decontaminated_density(sample1, estim.p, admixMod)

Arguments

sample1

Sample under study.

estim.p

The estimated weight of the unknown component distribution, related to the proportion of the unknown component in the admixture model studied.

admixMod

An object of class 'admix_model', containing useful information about distributions and parameters.

Value

An object of class 'decontaminated_density', containing 2 attributes: 1) the decontaminated density function; 2) the type of support for the underlying distribution (either discrete or continuous, useful for plots).

Details

The decontaminated density is obtained by inverting the admixture density, given by l = p*f + (1-p)*g, to isolate the unknown component f after having estimated p.

Author

Xavier Milhaud xavier.milhaud.research@gmail.com

Examples

## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 400, weight = 0.4,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
data1 <- getmixtData(mixt1)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1), admixMod = list(admixMod1),
                   est_method = 'PS')
## Determine the decontaminated version of the unknown density by inversion:
decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1],
                       admixMod = admixMod1)
#> Call:decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1], 
#>     admixMod = admixMod1)
#> 
#> function (x) 
#> (1/estim.p) * (l1_emp(x) - (1 - estim.p) * g1(x))
#> <bytecode: 0x7fedc9bb1d98>
#> <environment: 0x7fedc9bb1158>
#> 
#> Type of support: Continuous

####### Discrete support:
mixt1 <- twoComp_mixt(n = 5000, weight = 0.6,
                      comp.dist = list("pois", "pois"),
                      comp.param = list(list("lambda" = 3),
                                        list("lambda" = 2)))
mixt2 <- twoComp_mixt(n = 4000, weight = 0.8,
                      comp.dist = list("pois", "pois"),
                      comp.param = list(list("lambda" = 3),
                                        list("lambda" = 4)))
data1 <- getmixtData(mixt1)
data2 <- getmixtData(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1, data2),
                   admixMod = list(admixMod1, admixMod2), est_method = 'IBM')
#> Warning:  IBM estimators of two unknown proportions are reliable only if the two
#>     corresponding unknown component distributions have been tested equal (see 'admix_test()').
#>     Furthermore, when both the known and unknown component distributions of the mixture
#>     models are identical, the IBM approach provides an estimation of the ratio of the
#>     actual mixing weights rather than an estimation of the unknown weights themselves.
## Determine the decontaminated version of the unknown density by inversion:
decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1],
                       admixMod = admixMod1)
#> Call:decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1], 
#>     admixMod = admixMod1)
#> 
#> function (x) 
#> (1/estim.p) * (l1_emp(x) - (1 - estim.p) * g1(x))
#> <bytecode: 0x7fedc9bb1d98>
#> <environment: 0x7fedacfceeb0>
#> 
#> Type of support: Discrete

####### Finite discrete support:
mixt1 <- twoComp_mixt(n = 12000, weight = 0.6,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.3,0.4,0.3)),
                                        list("size" = 1, "prob" = c(0.6,0.3,0.1))))
mixt2 <- twoComp_mixt(n = 10000, weight = 0.8,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.3,0.4,0.3)),
                                        list("size" = 1, "prob" = c(0.2,0.6,0.2))))
data1 <- getmixtData(mixt1)
data2 <- getmixtData(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1, data2),
                   admixMod = list(admixMod1, admixMod2), est_method = 'IBM')
#> Warning:  IBM estimators of two unknown proportions are reliable only if the two
#>     corresponding unknown component distributions have been tested equal (see 'admix_test()').
#>     Furthermore, when both the known and unknown component distributions of the mixture
#>     models are identical, the IBM approach provides an estimation of the ratio of the
#>     actual mixing weights rather than an estimation of the unknown weights themselves.
## Determine the decontaminated version of the unknown density by inversion:
decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1],
                       admixMod = admixMod1)
#> Call:decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1], 
#>     admixMod = admixMod1)
#> 
#> function (x) 
#> (1/estim.p) * (l1_emp(x) - (1 - estim.p) * g1(x))
#> <bytecode: 0x7fedc9bb1d98>
#> <environment: 0x7fedc4477d90>
#> 
#> Type of support: Discrete