`R/PatraSen_cv_mixmodel.R`

`PatraSen_cv_mixmodel.Rd`

Estimates the unknown quantities, using the Patra and Sen method, in the admixture model with probability density function l = p*f + (1-p)*g, where g is the known component density, f is the unknown component density, and p is the unknown weight of f. The estimate of p is selected by cross-validation, which chooses the penalization level; see 'Details' below for further information.
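The model above, and the reason the function expects data in which the known component is Uniform(0,1), can be written out as follows. The symbols L, F and G (the cumulative distribution functions of l, f and g) are notation introduced here for illustration, and G is assumed continuous and strictly increasing.

$$
l(x) = p\,f(x) + (1-p)\,g(x), \qquad L(x) = p\,F(x) + (1-p)\,G(x).
$$

Applying the known cdf G to an observation X drawn from l gives, for u in [0,1],

$$
P\big(G(X) \le u\big) = L\big(G^{-1}(u)\big) = p\,F\big(G^{-1}(u)\big) + (1-p)\,u ,
$$

so after the transformation the known component contributes the Uniform(0,1) term (1-p)u, while the unknown component keeps the weight p.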

PatraSen_cv_mixmodel(data, folds = 10, reps = 1, cn.s = NULL, cn.length = NULL, gridsize = 200)

| Argument | Description |
|---|---|
| data | Sample in which the known component density of the admixture model has been transformed into a Uniform(0,1) distribution. |
| folds | (default to 10) Number of folds used for cross-validation. |
| reps | (default to 1) Number of replications for cross-validation. |
| cn.s | (default to NULL) A sequence of 'c.n' values to be used for cross-validation (vector of values). |
| cn.length | (default to NULL) Number of equally spaced tuning parameter values to search over, between 0.001 x log(log(n)) and 0.2 x log(log(n)). |
| gridsize | (default to 200) Number of equally spaced points (between 0 and 1) at which to evaluate the distance function. Larger values are more computationally intensive but also lead to more accurate estimates. |
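As a rough illustration of the inputs described above, the base-R sketch below builds a sample whose known component has been mapped to Uniform(0,1), and an equally spaced grid of candidate 'c.n' values over the range stated for cn.length. It does not call the package: the Gaussian components, the sample size, the names `u` and `cn_grid`, and the use of `seq()` to build the grid are assumptions made for the example, not the package's internal code.

```r
# Minimal base-R sketch; distributions and sizes are illustrative assumptions.
set.seed(1)
n <- 2000
p <- 0.85  # weight of the unknown component

# Draw from the admixture: unknown component N(3, 0.5^2), known component N(0, 1)
from_unknown <- rbinom(n, size = 1, prob = p) == 1
x <- ifelse(from_unknown,
            rnorm(n, mean = 3, sd = 0.5),
            rnorm(n, mean = 0, sd = 1))

# Probability integral transform with the known N(0, 1) cdf:
# observations from the known component become Uniform(0,1)
u <- pnorm(x, mean = 0, sd = 1)

# Equally spaced candidate tuning parameters between
# 0.001 * log(log(n)) and 0.2 * log(log(n))
cn.length <- 3
cn_grid <- seq(0.001 * log(log(n)), 0.2 * log(log(n)), length.out = cn.length)
```

A vector such as `u` plays the role of the `data` argument, and a vector such as `cn_grid` could be supplied through `cn.s` when a custom set of candidates is preferred over `cn.length`.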

A list containing 'alp.hat' (the estimate of the unknown component weight), 'Fs.hat' (a list with elements 'x' and 'y' giving the estimate of the unknown cumulative distribution function), 'dist.out' (an object of class 'dist.fun' computed using the complete data), 'c.n' (the value of the tuning parameter used to compute the final estimate), and 'cv.out' (an object of class 'cv.mixmodel'). The 'cv.out' object is NULL if method is "fixed".

See Patra, R.K. and Sen, B. (2016); Estimation of a Two-component Mixture Model with Applications to Multiple Testing; JRSS Series B, 78, pp. 869--893.

Xavier Milhaud xavier.milhaud.research@gmail.com

    ## Simulate data:
    comp.dist <- list(f = 'norm', g = 'norm')
    comp.param <- list(f = list(mean = 3, sd = 0.5), g = list(mean = 0, sd = 1))
    data1 <- rsimmix(n = 2000, unknownComp_weight = 0.85, comp.dist, comp.param)[['mixt.data']]

    ## Transform the known component of the admixture model into a Uniform(0,1) distribution:
    comp.dist <- list(f = NULL, g = 'norm')
    comp.param <- list(f = NULL, g = list(mean = 0, sd = 1))
    data1_transfo <- knownComp_to_uniform(data = data1,
                                          comp.dist = list(comp.dist$f, comp.dist$g),
                                          comp.param = list(comp.param$f, comp.param$g))

    ## Estimate the proportion of the unknown component of the admixture model:
    PatraSen_cv_mixmodel(data = data1_transfo, folds = 3, reps = 1, cn.s = NULL,
                         cn.length = 3, gridsize = 100)$alp.hat
    #> Warning: Make sure that data is transformed such that the known component is Uniformly(0,1) distributed.
    #> [1] 0.92