Probability density function of the unknown component — decontaminated

Estimate the decontaminated density of the unknown component in the admixture model under study, after inversion of the admixture cumulative distribution function. Recall that an admixture model follows the cumulative distribution function (CDF) L, where L = p*F + (1-p)*G, with g a known CDF and p and f unknown quantities.

Usage

decontaminated_density(sample1, estim.p, admixMod)

Arguments

sample1: Sample under study.
estim.p: The estimated weight, related to the proportion of the unknown component distribution in the admixture model studied.
admixMod: An object of class 'admix_model', containing useful information about known distribution(s) and parameter(s).

Value

An object of class 'decontaminated_density', containing 3 attributes: 1) the data under study; 2) the type of support for the underlying distribution (either discrete or continuous, useful for plots); 3) the decontaminated density function.

Details

The decontaminated density is obtained by inverting the admixture density, given by l = p*f + (1-p)*g, to isolate the unknown component f after having estimated p.

Author

Xavier Milhaud xavier.milhaud.research@gmail.com

Examples

## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 400, weight = 0.4,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
data1 <- getmixtData(mixt1)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1), admixMod = list(admixMod1),
                   est_method = 'PS')
## Determine the decontaminated version of the unknown density by inversion:
decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1],
                       admixMod = admixMod1)
#> Call:decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1], 
#>     admixMod = admixMod1)
#> 
#> function (x) 
#> (1/estim.p) * (l1_emp(x) - (1 - estim.p) * g1(x))
#> <bytecode: 0x7fa4d337fc08>
#> <environment: 0x7fa4d337ca18>
#> 

####### Discrete support:
mixt1 <- twoComp_mixt(n = 5000, weight = 0.6,
                      comp.dist = list("pois", "pois"),
                      comp.param = list(list("lambda" = 3),
                                        list("lambda" = 2)))
mixt2 <- twoComp_mixt(n = 4000, weight = 0.8,
                      comp.dist = list("pois", "pois"),
                      comp.param = list(list("lambda" = 3),
                                        list("lambda" = 4)))
data1 <- getmixtData(mixt1)
data2 <- getmixtData(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1, data2),
                   admixMod = list(admixMod1, admixMod2), est_method = 'IBM')
#> Warning:  IBM estimators of two unknown proportions are reliable only if the two
#>     corresponding unknown component distributions have been tested equal (see ?admix_test).
## Determine the decontaminated version of the unknown density by inversion:
decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1],
                       admixMod = admixMod1)
#> Call:decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1], 
#>     admixMod = admixMod1)
#> 
#> function (x) 
#> (1/estim.p) * (l1_emp(x) - (1 - estim.p) * g1(x))
#> <bytecode: 0x7fa4d337fc08>
#> <environment: 0x7fa4be544338>
#> 

####### Finite discrete support:
mixt1 <- twoComp_mixt(n = 12000, weight = 0.6,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.3,0.4,0.3)),
                                        list("size" = 1, "prob" = c(0.6,0.3,0.1))))
mixt2 <- twoComp_mixt(n = 10000, weight = 0.8,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.3,0.4,0.3)),
                                        list("size" = 1, "prob" = c(0.2,0.6,0.2))))
data1 <- getmixtData(mixt1)
data2 <- getmixtData(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1, data2),
                   admixMod = list(admixMod1, admixMod2), est_method = 'IBM')
#> Warning:  IBM estimators of two unknown proportions are reliable only if the two
#>     corresponding unknown component distributions have been tested equal (see ?admix_test).
## Determine the decontaminated version of the unknown density by inversion:
decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1],
                       admixMod = admixMod1)
#> Call:decontaminated_density(sample1 = data1, estim.p = est$estimated_mixing_weights[1], 
#>     admixMod = admixMod1)
#> 
#> function (x) 
#> (1/estim.p) * (l1_emp(x) - (1 - estim.p) * g1(x))
#> <bytecode: 0x7fa4d337fc08>
#> <environment: 0x7fa4d0f9cc98>
#>