Perform NMF decomposition and then extract signatures.

sig_extract(
  nmf_matrix,
  n_sig,
  nrun = 10,
  cores = 1,
  method = "brunet",
  optimize = FALSE,
  pynmf = FALSE,
  use_conda = TRUE,
  py_path = "/Users/wsx/anaconda3/bin/python",
  seed = 123456,
  ...
)

Arguments

nmf_matrix

a matrix used for NMF decomposition, with rows indicating samples and columns indicating components.
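
As an illustration of the expected layout, the sketch below builds a small toy matrix with samples in rows and components in columns and runs a basic sanity check on it. The helper check_nmf_input() and the toy values are hypothetical and are not part of sigminer; the only assumption taken from NMF itself is that the input must be non-negative.

```r
# check_nmf_input() is a hypothetical helper, not a sigminer function.
# It only verifies properties that any NMF input needs: numeric, complete,
# and non-negative.
check_nmf_input <- function(m) {
  stopifnot(is.matrix(m), is.numeric(m))
  if (anyNA(m)) stop("matrix contains missing values")
  if (any(m < 0)) stop("matrix contains negative values")
  invisible(m)
}

# Toy data (made up for illustration): 4 samples (rows) x 3 components (columns)
toy <- matrix(
  c(5, 0, 2,
    1, 3, 0,
    0, 4, 6,
    2, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("sample", 1:4), paste0("component", 1:3))
)

check_nmf_input(toy)
dim(toy)
```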

n_sig

number of signatures. Please run sig_estimate to select a suitable value.

nrun

a numeric giving the number of runs to perform. Setting nrun to 30~50 is enough to achieve a robust result.
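
To show why several runs help, here is a minimal base R sketch, assuming that each run starts from a different random initialisation and that the fit with the lowest residual is kept. The helpers nmf_once() and best_of_runs() are hypothetical and use plain Frobenius-norm multiplicative updates; they are not how sig_extract or NMF::nmf() is implemented.

```r
# Hypothetical helper: one randomly initialised NMF run using
# Frobenius-norm multiplicative updates. Returns factors and residual.
nmf_once <- function(V, r, seed, n_iter = 100, eps = 1e-9) {
  set.seed(seed)
  W <- matrix(runif(nrow(V) * r), nrow(V), r)
  H <- matrix(runif(r * ncol(V)), r, ncol(V))
  for (i in seq_len(n_iter)) {
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  list(W = W, H = H, residual = sum((V - W %*% H)^2))
}

# Hypothetical helper: repeat the run nrun times with different seeds
# and keep the fit with the smallest residual.
best_of_runs <- function(V, r, nrun = 10, seed = 123456) {
  fits <- lapply(seq_len(nrun), function(k) nmf_once(V, r, seed = seed + k))
  residuals <- vapply(fits, function(f) f$residual, numeric(1))
  fits[[which.min(residuals)]]
}

set.seed(1)
V <- matrix(runif(30), 6, 5)  # toy non-negative matrix
best <- best_of_runs(V, r = 2, nrun = 5)
best$residual
```

Because random starts can end in different local optima, a larger nrun makes it more likely that at least one run reaches a good solution, at the cost of run time.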

cores

number of CPU cores to run NMF.

method

specification of the NMF algorithm. The default is 'brunet'. Available methods for NMF decomposition are 'brunet', 'lee', 'ls-nmf', 'nsNMF', 'offset'.
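
For intuition about the default method, the following base R sketch applies the multiplicative update rules for the Kullback-Leibler divergence, which the 'brunet' algorithm is based on. The function brunet_sketch() is a hypothetical name, and the code is a simplified illustration under those assumptions, not the implementation used by the NMF package.

```r
# brunet_sketch() is a hypothetical helper, not part of NMF or sigminer.
# V: non-negative matrix, r: number of factors (signatures).
brunet_sketch <- function(V, r, n_iter = 200, seed = 1, eps = 1e-9) {
  set.seed(seed)
  n <- nrow(V)
  p <- ncol(V)
  W <- matrix(runif(n * r), n, r)  # basis matrix
  H <- matrix(runif(r * p), r, p)  # coefficient matrix
  for (i in seq_len(n_iter)) {
    # H[a, j] <- H[a, j] * sum_i(W[i, a] * V[i, j] / (WH)[i, j]) / sum_i(W[i, a])
    R <- V / (W %*% H + eps)
    H <- H * (t(W) %*% R) / (colSums(W) + eps)
    # W[i, a] <- W[i, a] * sum_j(H[a, j] * V[i, j] / (WH)[i, j]) / sum_j(H[a, j])
    R <- V / (W %*% H + eps)
    W <- W * (R %*% t(H)) / rep(rowSums(H) + eps, each = n)
  }
  list(W = W, H = H)
}

# Toy example: a matrix with an exact rank-2 non-negative factorisation
set.seed(42)
V <- matrix(runif(8 * 2), 8, 2) %*% matrix(runif(2 * 6), 2, 6)
fit <- brunet_sketch(V, r = 2)
dim(fit$W)
dim(fit$H)
```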

optimize

if TRUE, refit the de novo signatures with the QP (quadratic programming) method; see sig_fit.

pynmf

if TRUE, use the Python NMF driver Nimfa. The seed is currently not used by this implementation.

use_conda

if TRUE, create an independent conda environment to run NMF.

py_path

path to the Python executable, e.g. '/Users/wsx/anaconda3/bin/python'. In the author's tests, this is more stable than use_conda=TRUE. You can install the Nimfa package yourself, or set use_conda to TRUE to install the required Python environment and then set this option.

seed

specification of the starting point or seeding method, which will compute a starting point, usually using data from the target matrix in order to provide a good guess.

...

other arguments passed to NMF::nmf().

Value

a list with class Signature.

References

Gaujoux, Renaud, and Cathal Seoighe. "A flexible R package for nonnegative matrix factorization." BMC bioinformatics 11.1 (2010): 367.

Mayakonda, Anand, et al. "Maftools: efficient and comprehensive analysis of somatic variants in cancer." Genome research 28.11 (2018): 1747-1756.

See also

sig_tally for getting the variation matrix, sig_estimate for estimating the signature number for sig_extract, sig_auto_extract for extracting signatures using the automatic relevance determination technique.

Author

Shixiang Wang

Examples

# \donttest{
load(system.file("extdata", "toy_copynumber_tally_W.RData",
  package = "sigminer", mustWork = TRUE
))
# Extract copy number signatures
res <- sig_extract(cn_tally_W$nmf_matrix, 2, nrun = 1)
#> NMF algorithm: 'brunet'
#> NMF seeding method: random
#> Iterations:    0/2000
#> Iterations:    1/2000
#> Iterations:   50/2000
#> Iterations:  100/2000
#> Iterations:  150/2000
#> Iterations:  200/2000
#> Iterations:  250/2000
#> Iterations:  300/2000
#> Iterations:  350/2000
#> Iterations:  400/2000
#> Iterations:  450/2000
#> DONE (converged at 470/2000 iterations)
# }