This pipeline integrates multiple independent steps in VSHunter.

cnv_pipe(CN_data, cores = 1, genome_build = c("hg19", "hg38"),
  de_novo = TRUE, reference_components = NULL, min_comp = 2,
  max_comp = 10, min_prior = 0.001, model_selection = "BIC",
  nrep = 1, niter = 1000, nTry = 12, nrun = 10, seed = 123456,
  plot_survey = TRUE, testRandom = FALSE, nmfalg = "brunet",
  tmp = FALSE)

Arguments

CN_data

a QDNAseqCopyNumbers object, or a list of data.frames in which each data.frame stores the copy-number profile of one sample and has the four required columns 'chromosome', 'start', 'end' and 'segVal'. Of note, the 'segVal' column should contain absolute copy number values.

cores

number of compute cores to use for this task. You can use the detectCores function to check how many cores are available. When running cnv_pipe, please do not use the maximum number of cores on your computer, as this may cause unexpected problems.

genome_build

genome build version, must be one of 'hg19' or 'hg38'.

de_novo

default is TRUE. If set to FALSE, reference components are used to generate the sample-by-component matrix before signatures are extracted.

reference_components

the result object from cnv_fitMixModels, default is NULL. When de_novo is FALSE and this argument is NULL, the reference components from the Nature Genetics paper are used.

min_comp

minimal number of components to fit, default is 2.

max_comp

maximal number of components to fit, default is 10.

min_prior

minimal prior value, default is 0.001. For details on custom settings, please refer to the flexmix package.

model_selection

model selection strategy, default is 'BIC'. For details on custom settings, please refer to the flexmix package.

nrep

number of runs for each number of components; only the solution with the maximum likelihood is kept.

niter

maximum number of iterations allowed to reach convergence.

nTry

the maximum number of signatures to try, default is 12. Of note, this value should be far less than the number of features or samples.

nrun

the number of runs to perform for each value in the range 2 to nTry, default is 10. According to the NMF package documentation, setting nrun to 50 is enough to achieve a robust result.

seed

seed number.

plot_survey

logical. If TRUE, plot best rank survey.

testRandom

logical. If TRUE, generate random data from the input to test the measurements. Default is FALSE, matching the function signature.

nmfalg

specification of the NMF algorithm.

tmp

whether to create a tmp directory to store temporary results, default is FALSE.
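
As an illustration of the list format accepted by CN_data, the following is a minimal sketch in base R. The sample names, the segment values and the helpers make_sample and check_cn_list are hypothetical and are not part of VSHunter; only the four column names come from this page. The non-negativity test is an assumed reading of "absolute copy number".

```r
# Hypothetical constructor for one sample's segment table (not a VSHunter function).
make_sample <- function(chromosome, start, end, segVal) {
  data.frame(
    chromosome = chromosome,
    start = start,
    end = end,
    segVal = segVal,
    stringsAsFactors = FALSE
  )
}

# Toy data: one data.frame per sample; all values are made up for illustration.
cn_list <- list(
  sample_A = make_sample(
    chromosome = c("1", "1", "2"),
    start = c(1, 5000001, 1),
    end = c(5000000, 9000000, 8000000),
    segVal = c(2, 3, 1)
  ),
  sample_B = make_sample(
    chromosome = c("1", "2"),
    start = c(1, 1),
    end = c(9000000, 8000000),
    segVal = c(2, 4)
  )
)

# Hypothetical helper (not a VSHunter function): returns one logical per sample.
# A sample passes if it has the four documented columns, a numeric and
# non-negative segVal (assumed meaning of "absolute"), and start <= end.
check_cn_list <- function(x) {
  required <- c("chromosome", "start", "end", "segVal")
  vapply(x, function(df) {
    is.data.frame(df) &&
      all(required %in% colnames(df)) &&
      is.numeric(df$segVal) &&
      all(df$segVal >= 0) &&
      all(df$start <= df$end)
  }, logical(1))
}

check_cn_list(cn_list)
```

A list that passes this check has the shape described for CN_data; whether cnv_pipe performs similar validation internally is not stated here.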

Value

a list containing the results of the NMF best rank survey, the NMF run, the signature matrix, the exposure list, etc.

Examples

# NOT RUN {
## load example copy-number data from TCGA
## (in an installed package, files under inst/extdata are found at "extdata")
load(system.file("extdata", "example_cn_list.RData", package = "VSHunter"))
## run cnv signature pipeline
result <- cnv_pipe(CN_data = tcga_segTabs, cores = 1, genome_build = "hg19")
# }
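
The example above uses a single core. The advice under the cores argument (check detectCores, but do not use every core) could be sketched as below. The helper pick_cores and the choice to reserve exactly one core are assumptions made for illustration and are not VSHunter behavior.

```r
# Hypothetical helper (not a VSHunter function): choose a core count that
# leaves `reserve` cores free. parallel::detectCores() may return NA on some
# platforms, so fall back to a single core in that case.
pick_cores <- function(reserve = 1L) {
  available <- parallel::detectCores()
  if (is.na(available)) {
    return(1L)
  }
  max(1L, as.integer(available) - as.integer(reserve))
}

n_cores <- pick_cores()
```

The resulting value can then be passed as cores = n_cores in the call to cnv_pipe.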