Nonparametric estimation of the variance-covariance matrix of the Gaussian vector in the IBM approach

Estimate the variance-covariance matrix of the Gaussian vector at point 'z', when the Inversion - Best Matching (IBM) method is used to estimate the model parameters in two-sample admixture models. Recall that the two admixture models have respective probability density functions (pdf) l1 and l2, such that: l1 = p1*f1 + (1-p1)*g1 and l2 = p2*f2 + (1-p2)*g2, where g1 and g2 are the known component densities, f1 and f2 the unknown component densities, and p1 and p2 the weights of the unknown components. Further information on the IBM approach is given in 'Details' below.
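The admixture density recalled above can be sketched in base R. This is only an illustration, not code from the package: the helper name `admixture_pdf` is hypothetical, and the component f1 is written out explicitly only because the values are borrowed from the simulation settings in 'Examples' (Gamma unknown component, Exponential known component, weight 0.8). In a real analysis f1 is unknown.

```r
# Hypothetical helper (not part of the package):
# admixture density l(x) = p * f(x) + (1 - p) * g(x).
admixture_pdf <- function(x, p, f, g) {
  p * f(x) + (1 - p) * g(x)
}

# Components taken from the simulation settings in 'Examples'.
# f1 is known here only because the data are simulated.
f1 <- function(x) dgamma(x, shape = 2, scale = 3)  # "unknown" component
g1 <- function(x) dexp(x, rate = 1/3)              # known component
p1 <- 0.8                                          # weight of the unknown component

l1 <- function(x) admixture_pdf(x, p = p1, f = f1, g = g1)

# Sanity check: a valid pdf integrates to (approximately) 1 on its support.
total_mass <- integrate(l1, lower = 0, upper = Inf)$value
```

The same construction applies to l2 with p2, f2 and g2.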

Usage

IBM_estimVarCov_gaussVect(
  x,
  y,
  estim.obj,
  fixed.p1 = NULL,
  known.p = NULL,
  sample1,
  sample2,
  min_size = NULL,
  comp.dist = NULL,
  comp.param = NULL
)

Arguments

x: Time point at which the first underlying empirical process (related to the first parameter) is evaluated.
y: Time point at which the second underlying empirical process (related to the second parameter) is evaluated.
estim.obj: Object obtained from the estimation of the component weights related to the proportions of the unknown component in each of the two admixture models.
fixed.p1: Arbitrary value chosen by the user for the component weight related to the unknown component of the first admixture model. Only useful for optimization when the known components of the two models are identical (G1=G2, leading to unidimensional optimization).
known.p: (optional, NULL by default) Numeric vector with two elements, the known (true) mixture weights.
sample1: Observations of the first sample under study.
sample2: Observations of the second sample under study.
min_size: (optional, NULL by default) In the k-sample case, the minimal size among all samples (needed to account for the correction factor in the variance-covariance assessment). Not needed otherwise.
comp.dist: A list with four elements corresponding to the component distributions (specified with R native names for these distributions) involved in the two admixture models. The first two elements refer to the unknown and known components of the first admixture model, and the last two to those of the second admixture model. Unknown elements must be specified as 'NULL' objects. For instance, 'comp.dist' could be specified as follows: list(f1=NULL, g1='norm', f2=NULL, g2='norm').
comp.param: A list with four elements corresponding to the parameters of the component distributions, each element being a list itself. The names used in this list must correspond to the native R argument names for these distributions. The first two elements refer to the parameters of the unknown and known components of the first admixture model, and the last two to those of the second admixture model. Unknown elements must be specified as 'NULL' objects. For instance, 'comp.param' could be specified as follows: list(f1=NULL, g1=list(mean=0,sd=1), f2=NULL, g2=list(mean=3,sd=1.1)).
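To make the naming convention of 'comp.dist' and 'comp.param' concrete, here is a minimal base R sketch. It assumes that a native name such as 'norm' is combined with the usual d/p/q/r prefix; how the package resolves these names internally is not stated on this page, and the helper `known_density` is hypothetical.

```r
# Specification taken from the argument descriptions above.
comp.dist  <- list(f1 = NULL, g1 = "norm", f2 = NULL, g2 = "norm")
comp.param <- list(f1 = NULL, g1 = list(mean = 0, sd = 1),
                   f2 = NULL, g2 = list(mean = 3, sd = 1.1))

# Hypothetical helper (an assumption, not the package's internal code):
# evaluate a known component density from its native R name and parameters.
known_density <- function(x, dist, param) {
  # "norm" -> "dnorm"; the parameter names must match dnorm's arguments
  do.call(paste0("d", dist), c(list(x), param))
}

g1_at_0 <- known_density(0, comp.dist$g1, comp.param$g1)  # same as dnorm(0, mean = 0, sd = 1)
g2_at_3 <- known_density(3, comp.dist$g2, comp.param$g2)  # same as dnorm(3, mean = 3, sd = 1.1)
```

A misspelled parameter name (for instance 'std' instead of 'sd') would make such a call fail, which is why the names must match the native R argument names.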

Value

The estimated variance-covariance matrix of the Gaussian vector Z = (hat(p1), hat(p2), Dn(z)), at location '(x,y)'.

Details

See the paper presenting the IBM approach at the following HAL weblink: https://hal.archives-ouvertes.fr/hal-03201760

Author

Xavier Milhaud xavier.milhaud.research@gmail.com

Examples

# \donttest{
######## Analysis with simulated data:
## Simulate Gamma - Exponential admixtures :
list.comp <- list(f1 = "gamma", g1 = "exp",
                  f2 = "gamma", g2 = "exp")
list.param <- list(f1 = list(shape = 2, scale = 3), g1 = list(rate = 1/3),
                   f2 = list(shape = 2, scale = 3), g2 = list(rate = 1/5))
X.sim <- rsimmix(n=400, unknownComp_weight=0.8, comp.dist = list(list.comp$f1,list.comp$g1),
                 comp.param = list(list.param$f1, list.param$g1))$mixt.data
Y.sim <- rsimmix(n=350, unknownComp_weight=0.9, comp.dist = list(list.comp$f2,list.comp$g2),
                 comp.param = list(list.param$f2, list.param$g2))$mixt.data
## Real-life setting (the unknown components f1 and f2 are not specified):
list.comp <- list(f1 = NULL, g1 = "exp",
                  f2 = NULL, g2 = "exp")
list.param <- list(f1 = NULL, g1 = list(rate = 1/3),
                   f2 = NULL, g2 = list(rate = 1/5))
## Estimate the unknown component weights in the two admixture models:
estim <- IBM_estimProp(sample1 = X.sim, sample2 = Y.sim, known.prop = NULL, comp.dist = list.comp,
                       comp.param = list.param, with.correction = FALSE, n.integ = 1000)
IBM_estimVarCov_gaussVect(x = mean(X.sim), y = mean(Y.sim), estim.obj = estim,
                          fixed.p1 = estim[["p.X.fixed"]], known.p = NULL, sample1=X.sim,
                          sample2 = Y.sim, min_size = NULL,
                          comp.dist = list.comp, comp.param = list.param)
#>             [,1]        [,2]        [,3]
#> [1,]  1.96274074  0.09651599 -0.64451334
#> [2,]  0.09651599  1.54693148 -1.30361505
#> [3,] -0.65219884 -1.22003285  0.08545106
# }