`R/IBM_greenLight_criterion.R`

`IBM_greenLight_criterion.Rd`

Indicates whether the statistical test of equality between the unknown components needs to be performed when comparing two samples that follow admixture models. Based on the IBM approach; see 'Details' below.

IBM_greenLight_criterion(estim.obj, sample1, sample2, comp.dist = NULL, comp.param = NULL, min_size = NULL, alpha = 0.05)

Argument | Description |
---|---|
estim.obj | Object obtained from estimating the component weights, i.e. the proportions of the unknown component in each of the two admixture models under study. |
sample1 | Observations of the first sample under study. |
sample2 | Observations of the second sample under study. |
comp.dist | A list with four elements corresponding to the component distributions (specified with R native names for these distributions) involved in the two admixture models. The first two elements refer to the unknown and known components of the first admixture model, and the last two to those of the second admixture model. Unknown components must be specified as 'NULL' objects. For instance, 'comp.dist' could be specified as follows: list(f1=NULL, g1='norm', f2=NULL, g2='norm'). |
comp.param | A list with four elements corresponding to the parameters of the component distributions, each element being a list itself. The names used in this list must correspond to the native R argument names for these distributions. The first two elements refer to the parameters of the unknown and known components of the first admixture model, and the last two to those of the second admixture model. Parameters of unknown components must be specified as 'NULL' objects. For instance, 'comp.param' could be specified as follows: list(f1=NULL, g1=list(mean=0,sd=1), f2=NULL, g2=list(mean=3,sd=1.1)). |
min_size | (optional, NULL by default) In the k-sample case, the minimal size among all samples (needed for the correction factor in the variance-covariance assessment). Not used otherwise. |
alpha | Significance level at which the criterion is assessed (used to compute the confidence bands, at level 1 - alpha, of the estimators of the unknown component weights). |
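The naming conventions for 'comp.dist' and 'comp.param' are easy to get wrong, so a quick consistency check before calling the function can help. The sketch below is only an illustration in base R: `check_comp_spec` is a hypothetical helper, not part of the package, and it assumes that an R native name such as 'norm' maps to a density function named `dnorm`.

```r
# Hypothetical helper (not part of the package): checks that a
# comp.dist / comp.param pair follows the conventions described above.
check_comp_spec <- function(comp.dist, comp.param) {
  # Both lists must have four elements: f1, g1, f2, g2.
  stopifnot(length(comp.dist) == 4, length(comp.param) == 4)
  for (i in seq_along(comp.dist)) {
    dist_name <- comp.dist[[i]]
    if (is.null(dist_name)) {
      # Unknown component: its parameters must be NULL as well.
      stopifnot(is.null(comp.param[[i]]))
      next
    }
    # Assumed native R naming: 'norm' -> dnorm, 'exp' -> dexp, ...
    dens_fun <- get(paste0("d", dist_name), mode = "function")
    # Parameter names must be argument names of that density function.
    stopifnot(all(names(comp.param[[i]]) %in% names(formals(dens_fun))))
  }
  invisible(TRUE)
}

list.comp  <- list(f1 = NULL, g1 = "norm", f2 = NULL, g2 = "norm")
list.param <- list(f1 = NULL, g1 = list(mean = 0, sd = 1),
                   f2 = NULL, g2 = list(mean = 3, sd = 1.1))
check_comp_spec(list.comp, list.param)
```

A misspelled parameter name (for example `mu` instead of `mean`) makes the helper stop with an error, which is usually easier to diagnose than a failure deep inside the estimation step.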

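A list whose element 'green_light' is a boolean indicating whether it is worth tabulating the contrast distribution in order to answer the testing problem (f1 = f2), together with the confidence intervals of the two estimated unknown component weights ('conf_interval_p1' and 'conf_interval_p2').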
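As a rough illustration of how the returned value might be consumed, the sketch below builds a result object by hand, using the element names and numbers shown in the example output of this page, and branches on it. The helper `next_step` is hypothetical, and the reading that `TRUE` means tabulation is worthwhile is an assumption.

```r
# Hand-built object mirroring the printed example output; in practice it
# would come from a call to IBM_greenLight_criterion().
res <- list(
  green_light      = TRUE,
  conf_interval_p1 = c(0.6311164, 0.7495696),
  conf_interval_p2 = c(0.6718105, 0.8364227)
)

# Hypothetical helper: translate the criterion into a next step.
# Assumption: TRUE means tabulating the contrast distribution is worthwhile.
next_step <- function(res) {
  if (isTRUE(res$green_light)) {
    "tabulate the contrast distribution and test f1 = f2"
  } else {
    "no need to tabulate the contrast distribution"
  }
}

next_step(res)

# Width of each confidence interval, as a quick look at estimation precision.
ci_width <- sapply(res[c("conf_interval_p1", "conf_interval_p2")], diff)
ci_width
```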

See the paper presenting the IBM approach at the following HAL weblink: https://hal.archives-ouvertes.fr/hal-03201760

Xavier Milhaud xavier.milhaud.research@gmail.com

## Simulate data:
list.comp <- list(f1 = 'norm', g1 = 'norm', f2 = 'norm', g2 = 'norm')
list.param <- list(f1 = list(mean = 3, sd = 0.5), g1 = list(mean = 0, sd = 1),
                   f2 = list(mean = 3, sd = 0.5), g2 = list(mean = 5, sd = 2))
sample1 <- rsimmix(n = 550, unknownComp_weight = 0.7,
                   comp.dist = list(list.comp$f1, list.comp$g1),
                   comp.param = list(list.param$f1, list.param$g1))
sample2 <- rsimmix(n = 450, unknownComp_weight = 0.8,
                   comp.dist = list(list.comp$f2, list.comp$g2),
                   comp.param = list(list.param$f2, list.param$g2))
## Estimate the unknown component weights in the two admixture models in real-life setting:
list.comp <- list(f1 = NULL, g1 = 'norm', f2 = NULL, g2 = 'norm')
list.param <- list(f1 = NULL, g1 = list(mean = 0, sd = 1),
                   f2 = NULL, g2 = list(mean = 5, sd = 2))
estim <- IBM_estimProp(sample1[['mixt.data']], sample2[['mixt.data']], known.prop = NULL,
                       comp.dist = list.comp, comp.param = list.param,
                       with.correction = FALSE, n.integ = 1000)
IBM_greenLight_criterion(estim.obj = estim, sample1 = sample1[['mixt.data']],
                         sample2 = sample2[['mixt.data']], comp.dist = list.comp,
                         comp.param = list.param, min_size = NULL, alpha = 0.05)
#> $green_light
#> [1] TRUE
#>
#> $conf_interval_p1
#> [1] 0.6311164 0.7495696
#>
#> $conf_interval_p2
#> [1] 0.6718105 0.8364227
#>
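For readers who want to see what the simulated samples above represent, here is a minimal base-R sketch of drawing from a two-component admixture, where each observation comes from the unknown component with probability p and from the known component otherwise. `sim_admixture` is a hypothetical stand-in and makes no claim about how `rsimmix` is implemented; the `mixt.data` element name is borrowed from the example only for readability.

```r
# Hypothetical base-R stand-in for an admixture sampler (not the package's rsimmix).
sim_admixture <- function(n, p, r_unknown, r_known) {
  # TRUE = observation drawn from the unknown component (probability p).
  from_unknown <- runif(n) < p
  x <- numeric(n)
  x[from_unknown]  <- r_unknown(sum(from_unknown))
  x[!from_unknown] <- r_known(sum(!from_unknown))
  list(mixt.data = x, from_unknown = from_unknown)
}

set.seed(1)
# Same setting as the first simulated sample: weight 0.7 on the unknown
# component N(3, 0.5), known component N(0, 1).
s1 <- sim_admixture(550, 0.7,
                    r_unknown = function(k) rnorm(k, mean = 3, sd = 0.5),
                    r_known   = function(k) rnorm(k, mean = 0, sd = 1))

# Empirical share of observations drawn from the unknown component.
mean(s1$from_unknown)
```

In real data the labels in `from_unknown` are not observed, which is why the unknown component weight has to be estimated before the two unknown components can be compared.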