Changelog

Fixed a bug that generated a wrong data type when only one sample is handled (#463). Thanks to @selkamand.

Updated the sig_fit() documentation for better usage (#454).
Added cluster_col to show_group_enrichment().
Fixed the error returned by show_group_enrichment() when cluster_row = TRUE and return_list = TRUE.
Fixed the error in generating DBS and INDEL matrices when only one sample is provided as input (#453).

Supported human T2T genome and corresponding annotation data.
Updated COSMIC database to v3.4. SV and RNA-SBS signatures are included.

get_sig_db("latest_RNA-SBS_GRCh37")
get_sig_db("latest_SV_GRCh38")

Fixed a counting bug in generating the matrix for variation categories with strand bias (#445).

Updated pkg doc following the new CRAN feature (thanks to K from the CRAN team).
Added samps option to show_sig_exposure().

Example:

load(system.file("extdata", "toy_mutational_signature.RData",
                 package = "sigminer", mustWork = TRUE
))
# Show signature exposure
p1 <- show_sig_exposure(sig2, rm_space = TRUE)
p1

expo = sig_exposure(sig2)
show_sig_exposure(expo,
                  rm_space = TRUE,
                  samps = colnames(expo)[order(colSums(expo))])

Fixed the error in generating the SBS matrix when only one sample is provided as input (#432).

Removed package ‘copynumber’ from the Suggests field.
Supported the Ziyu Tao et al. approach for copy number segment classification.
Supported ce11 genome in read_vcf().
Added read_maf_minimal() to support a minimal MAF-like data as input.

Fixed the issue that the latest CN signatures from COSMIC had labels inconsistent with the built-in CN signatures (#421).

Sorted substitution mutation types by default in sig_tally().
Added parameter in sigprofiler_extract() to help generate input matrix file for calling SigProfiler directly.
Added some notes in sigprofiler_extract().
Added a utility function sigprofiler_reorder() for generating a SigProfiler input matrix file with the standard mutation type order.

Fixed a bug in plotting the CN chromosome distribution (#420, thanks to @jrcodina96).

Updated the latest COSMIC version from v3.2 to v3.3. A new reference for copy number signatures is now provided as latest_CN_GRCh37 (#412).
get_sig_similarity() now uses “SBS” as default reference.
Fixed bug in show_cn_circos().
Added group_enrichment2().

Fixed checking in sig_tally().

Added a vignette to introduce the analysis of copy number signatures.
Updated CNS_TCGA.
Enhanced group_enrichment() with reference group support.

Example:

set.seed(1234)
df <- dplyr::tibble(
  g1 = rep(LETTERS[1:3], c(50, 40, 10)),
  g2 = rep(c("AA", "VV", "XX"), c(50, 40, 10)),
  e1 = sample(c("P", "N"), 100, replace = TRUE),
  e2 = rnorm(100)
)

x1 = group_enrichment(df, grp_vars = c("g1", "g2"), 
                      enrich_vars = c("e1", "e2"), 
                      ref_group = c("B", "VV"))
x1

Added option for reading ASCAT objects in parallel.

Fixed error in extracting invalid regions (#396, thanks to @KirsieMin).

Enhanced the read_copynumber_seqz() to include minor copy number. (Thanks to yancey)
Added input range check in sig_estimate(). (#391)

Expanded output_* function by adding option sig_db.
Fixed the error raised when calling sigminer::get_genome_annotation() before loading the package.
Fixed the bug that get_pLOH_score() returned nothing for samples without LOH.

Added sig_unify_extract() as a unified signature extractor.
Fixed error showing reference signature profile for CNS_TCGA database.

Implemented the y_limits option in show_sig_profile() (#381).
Added function get_pLOH_score() for representing the genome that displayed LOH.
Added function read_copynumber_ascat() for reading ASCAT results (an ASCAT object in .rds format).
Added function get_intersect_size() for getting overlap size between intervals.
Added option to get_Aneuploidy_score() to remove short arms of chr13/14/15/21/22 from calculation.
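
Example (a sketch, not sigminer's implementation): the overlap size between intervals, as described for get_intersect_size() above, can be computed in base R along these lines. The helper name overlap_size() and its arguments are hypothetical, and coordinates are assumed to be continuous, so intervals that only touch have an overlap of 0.

```r
# Hypothetical helper (not sigminer's get_intersect_size()):
# overlap length between [x_start, x_end] and [y_start, y_end].
# Coordinates are treated as continuous; disjoint intervals give 0.
overlap_size <- function(x_start, x_end, y_start, y_end) {
  pmax(0, pmin(x_end, y_end) - pmax(x_start, y_start))
}

# Three interval pairs: partial overlap, touching, and disjoint
sizes <- overlap_size(
  x_start = c(1, 10, 30),
  x_end   = c(5, 20, 40),
  y_start = c(3, 20, 50),
  y_end   = c(8, 25, 60)
)
sizes
```

The function is vectorised through pmin() and pmax(), so many segment pairs can be compared in one call.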

Implemented Cohen-Sharir method-like Aneuploidy Score.
Enhanced error handling in show_sig_feature_corrplot() (#376).
Fixed INDEL classification.
Fixed end position determination in read_vcf().
Updated INDEL adjustment.
Included TCGA copy number signatures from SigProfiler.
Updated docs.

Preprocessed INDELs before labeling them in sig_tally() (#370).
Fixed sigprofiler_extract() extracting copy number signatures and rolled up sigprofiler version (#369).

Fixed output_sig() error in handling exposure plot with >9 signatures (#366).
Added limitsize = FALSE for ggsave() or ggsave2() to handle large figures.

Supported mm9 genome build.
Removed FTP link as CRAN suggested (#359).
Updated README.

BUG REPORTS

Fixed the SigProfiler installation error due to Python version in conda environment.
Fixed classification bug due to repeated function name call_component.
Fixed the bug in read_vcf() when reading VCF files with ## comment lines.

ENHANCEMENTS

Added support for the latest COSMIC v3.2 reference signatures. You can obtain them with:

for (i in c("latest_SBS_GRCh37", "latest_DBS_GRCh37", "latest_ID_GRCh37",
            "latest_SBS_GRCh38", "latest_DBS_GRCh38",
            "latest_SBS_mm9", "latest_DBS_mm9",
            "latest_SBS_mm10", "latest_DBS_mm10",
            "latest_SBS_rn6", "latest_DBS_rn6")) {
  message(i)
  get_sig_db(i)
}

Changed the default of keep_only_pass to FALSE.
Added RSS and unexplained variance calculation in get_sig_rec_similarity().
Added data check and filter in output_tally() and show_catalogue().
Enhanced show_group_enrichment() (#353) & added a new option to cluster rows.
Removed unnecessary CN classifications code in recent development.
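
Example (a sketch under stated assumptions): the RSS and unexplained variance added to get_sig_rec_similarity() above compare an observed catalogue with its reconstruction. The changelog does not give the exact formulas, so the definition of unexplained variance used here (RSS divided by the uncentered total sum of squares) is an assumption, and the toy matrices are made up for illustration.

```r
# Toy catalogue (rows = components, columns = samples) and a
# hypothetical reconstruction of it from extracted signatures.
catalog <- matrix(c(10, 20, 30, 40,
                     5, 15, 25, 35), nrow = 4)
reconstructed <- matrix(c(12, 18, 30, 41,
                           5, 14, 27, 35), nrow = 4)

# Residual sum of squares per sample
rss <- colSums((catalog - reconstructed)^2)

# Assumed definition: share of the (uncentered) total sum of
# squares that the reconstruction leaves unexplained
unexplained_variance <- rss / colSums(catalog^2)

rss
unexplained_variance
```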

NEW FUNCTIONS

DEPRECATED

Dropped the copy number “M” method to avoid misleading users into using or reading the wrong signature profile, and to keep the code simple.

BUG REPORTS

ENHANCEMENTS

Modified the default visualization of bp_show_survey().
Enhanced torch check.

NEW FUNCTIONS

read_sv_as_rs() and sig_tally.RS() for simplified genome rearrangement classification matrix generation (experimental).

DEPRECATED

BUG REPORTS

Fixed the assignment problem for matching pairs in bp_extract_signatures() by using the lpSolve package instead of my problematic code.

ENHANCEMENTS

Supported mm10 in read_vcf().
Removed large data files and stored them in Zenodo to reduce package size.
Added cores check.
Upgraded SP to v1.1.0 (needs testing).
Tried installing Torch before SP (needs testing).

NEW FUNCTIONS

DEPRECATED

BUG REPORTS

Fixed bug in silhouette calculation in bp_extract_signatures() (#332). PAY ATTENTION: this may affect results.
Fixed bug using custom signature name in show_sig_profile_loop().

ENHANCEMENTS

Subsetting signatures to plot is now available via the sig_names option.
sigminer is available on the bioconda channel: https://anaconda.org/bioconda/r-sigminer/
Updated ms strategy in sig_auto_extract() by assigning each signature to its best matched reference signatures.
Added get_shannon_diversity_index() to get diversity index for signatures (#333).
Added new method “S” (from Steele et al. 2019) for tallying copy number data (#329).
Included new (RS) reference signatures (related to #331).
Updated the internal code for getting relative activity in get_sig_exposure().
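
Example (a sketch, not sigminer's implementation): the diversity index behind get_shannon_diversity_index() above can be illustrated with the standard Shannon formula applied to relative signature exposure. The helper name shannon_index(), the use of the natural logarithm, and the toy exposure matrix are assumptions made for illustration.

```r
# Hypothetical helper: Shannon diversity H = -sum(p * log(p))
shannon_index <- function(p) {
  p <- p / sum(p)   # normalise to relative exposure
  p <- p[p > 0]     # zero contributions add nothing to the sum
  -sum(p * log(p))
}

# Toy exposure matrix: rows = signatures, columns = samples
expo <- matrix(
  c(10, 10, 10, 10,
    40,  0,  0,  0,
    25, 25,  0,  0),
  nrow = 4,
  dimnames = list(paste0("Sig", 1:4), c("S1", "S2", "S3"))
)

# One index per sample: highest for S1 (even exposure),
# zero for S2 (a single active signature)
div <- apply(expo, 2, shannon_index)
div
```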

NEW FUNCTIONS

bp_get_clustered_sigs() to get clustered mean signatures.

DEPRECATED

Updated author list.

BUG REPORTS

ENHANCEMENTS

Added a quick start vignette.
A new option highlight is added to show_sig_number_survey() and bp_show_survey2() to highlight a selected number.

NEW FUNCTIONS

DEPRECATED

BUG REPORTS

ENHANCEMENTS

A new option cut_p_value is added to show_group_enrichment() to cut continuous p values into bins.
A Python backend for sig_extract() is provided.
Users can now use sig_extract() and sig_auto_extract() directly without loading the NMF package first.
Added benchmark results for different extraction approaches in README.
The threshold for auto_reduce in sig_fit() is modified from 0.99 to 0.95 and similarity update threshold updated from >0 to >=0.01.
Removed pConstant option from sig_extract() and sig_estimate(). An auto-check function is now used to avoid the error from the NMF package caused by a component with no contribution in any sample.
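
Example (a sketch under stated assumptions): the idea behind the cut_p_value option above, binning continuous p values, can be shown with base R cut(). The bin edges and labels are assumptions chosen for illustration; the changelog does not state which ones show_group_enrichment() uses.

```r
p_values <- c(0.0004, 0.003, 0.02, 0.2, 0.8)

# Assumed bin edges; intervals are right-closed, and
# include.lowest = TRUE keeps p = 0 in the first bin
breaks <- c(0, 0.001, 0.01, 0.05, 1)
labels <- c("[0, 0.001]", "(0.001, 0.01]", "(0.01, 0.05]", "(0.05, 1]")

p_binned <- cut(p_values, breaks = breaks, labels = labels,
                include.lowest = TRUE)
table(p_binned)
```

Binned p values map naturally onto a discrete colour scale in a heatmap-style plot, which is easier to read than a continuous gradient.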

NEW FUNCTIONS

bp_show_survey2() to plot a simplified version for signature number survey (#330).
read_xena_variants() to read variant data from UCSC Xena as a MAF object for signature analysis.
get_sig_rec_similarity() for getting reconstructed profile similarity for Signature object (#293).
Added functions starting with bp_, which together provide a best practice for extracting signatures in cancer research. For more details, run ?bp in your R console.

DEPRECATED

Added data simulation.
Suppressed future warnings.
Fixed p value calculation in bootstrap analysis.
Fixed typo in show_cor(), thanks to @Miachol.
Added y_tr option in show_sig_profile() to transform y axis values.
Optimized default behavior of read_copynumber().
- Supported LOH records when the user provides the minor allele copy number.
- Set complement = FALSE as default.
- Removed the dependency between options use_all and complement.
Added visualization support for genome rearrangement signatures (#300).
Added four databases for reference signatures from https://doi.org/10.1038/s43018-020-0027-5 (#299).
Added new measure ‘CV’ for show_sig_bootstrap() (#298).
Added group_enrichment() and show_group_enrichment() (#277).
Optimized signature profile visualization (#295).
Updated ?sigminer documentation.
Added ms strategy to select optimal solution by maximizing cosine similarity to reference signatures.
Added same_size_clustering() for same size clustering.
Added show_cosmic() to support reading COSMIC signatures in web browser (#288).
Changed argument rel_threshold behavior in sig_fit() and get_sig_exposure(). Made them more consistent and allowed un-assigned signature contribution (#285).
Updated all COSMIC signatures to v3.1 and their aetiologies (#287).
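
Example (a sketch, not sigminer's implementation): the ‘CV’ measure added for show_sig_bootstrap() above is taken here to mean the coefficient of variation (standard deviation divided by mean) of bootstrap exposures. That reading and the toy matrix are assumptions made for illustration.

```r
# Hypothetical bootstrap exposures:
# rows = bootstrap replicates, columns = signatures
boot_expo <- matrix(
  c( 10,  12,   8,  10,
    100, 100, 100, 100),
  ncol = 2,
  dimnames = list(NULL, c("Sig1", "Sig2"))
)

# Coefficient of variation per signature: sd / mean
cv <- apply(boot_expo, 2, function(x) sd(x) / mean(x))
cv
```

A lower CV indicates that the exposure estimate for a signature is more stable across bootstrap replicates.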

Added more specific reference signatures from SigProfiler, e.g. SBS_mm9.
Supported data.frame as input object for sig in get_sig_similarity() and sig_fit().
Modified g_label option in show_group_distribution() to better control group names.
Added test option and variable checking in show_cor().
Updated output_sig() to output signature exposure distribution (#280).
Added show_cor() for general association analysis.
Added options in show_group_distribution() to control segments.

Fixed bugs when outputting only one signature.
Fixed label inverse bug in add_labels(), thanks to TaoTao for reporting.

Handled comma-separated indices in show_cosmic_signatures.
Added option set_order in get_sig_similarity() (#274).
Output more statistics in output_sig().
Fixed the default y axis title in show_sig_bootstrap_error(); it is now “Reconstruction error (L2 norm)”.

Added auto_reduce option in sig_fit* functions to improve signature fitting.
Returned cosine similarity for sample profile in sig_fit().
Set default strategy in sig_auto_extract() to ‘optimal’.
Supported searching the reference signature index in get_sig_cancer_type_index().
Output legacy COSMIC similarity for SBS signatures.
Added new option in sigprofiler_extract() to reduce failures when refit is enabled.
Output both relative and absolute signature exposure in output_sig().
Updated background color in show_group_distribution().
Modified the default theme for signature profile in COSMIC style.
Updated the copy number classification method.

Handled null catalogue.
Supported ordering the signatures for results from SigProfiler.
Supported importing refit results from SigProfiler.
Set optimize option in sig_extract() and sig_auto_extract().

Supported signature index separated by , in sig_fit() and sig_fit_bootstrap* functions.
Added output_* functions from sigflow.
Enhanced DBS search and error handling in sig_tally().
Added option highlight_genes in show_cn_group_profile() to show gene labels.
Added get_sig_cancer_type_index() to get reference signature index.
Added show_group_distribution() to show group distribution.
Added options in show_cn_profile() to show specified ranges and add copy number value labels.
Used package nnls instead of pracma for NNLS implementation in sig_fit().

Supported BSgenome.Hsapiens.1000genomes.hs37d5 in sig_tally().
Removed the conversion of MT to M in mutation data.
Fixed bug in extracting numeric signature names and signature orderings in show_sig_exposure().
Added letter_colors as an unexported discrete palette.

Added transform_seg_table().
Added show_cn_group_profile().
Added show_cn_freq_circos().
The sig_orders option in show_sig_profile() can now select and order signatures to plot.
Added show_sig_profile_loop() for better signature profile visualization.

Added option to control the SigProfilerExtractor to avoid issue in docker image build.

Some updates.
Compatible with SigProfiler 1.0.15

Tried to speed up joining adjacent segments in read_copynumber(), got 200% improvement.

Tried to speed up joining adjacent segments in read_copynumber(), got 20% improvement.
Added cosine() function.
Added and exported get_sig_db() to let users directly load signature database.
Added sigprofiler_extract() and sigprofiler_import() to call SigProfiler and import results.
Added read_vcf() for simply reading VCF files.
Implemented DBS-1248.
Added show_sig_profile_heatmap().
Supported mouse genome ‘mm10’ (#241).
Added read_copynumber_seqz() to read sequenza result directory.
Sped up the annotation process in read_copynumber().
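
Example (a sketch, not sigminer's cosine()): cosine similarity between two profiles, the measure referred to by the cosine() entry above, can be written in base R as follows. The helper is named cosine_sim() here to make clear it is a hypothetical stand-in, and the vectors are toy data.

```r
# Hypothetical helper: cosine similarity between two numeric vectors
cosine_sim <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

a <- c(1, 2, 3)
b <- c(2, 4, 6)   # same direction as a
d <- c(3, 0, -1)  # orthogonal to a

sim_ab <- cosine_sim(a, b)  # close to 1: identical profile shape
sim_ad <- cosine_sim(a, d)  # close to 0: no shared direction
```

Because the measure ignores vector length, two samples with different mutation counts but the same profile shape score close to 1.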

Fixed bug in OsCN feature calculation.
Removed useless options in read_maf().
Modified method ‘LS’ in sig_fit() to ‘NNLS’ and implemented it with the pracma package (#216).
Made the use_all option in read_copynumber() work correctly.
Fixed potential problem raised by unordered copy number segments (#217).
Fixed a typo: corrected MRSE to RMSE.
Added feature in show_sig_bootstrap_*() for plotting aggregated values.
Fixed bug when using get_groups() for clustering.
Fixed bug about using reference components from NatGen 2018 paper.
Added option highlight_size for show_sig_bootstrap_*().
Fixed bug about signature profile plotting for method ‘M’.

Added “scatter” in sig_fit() function to better visualize a few samples.
Added “highlight” option.
The lsei package was removed from CRAN, so I reset the default method to ‘QP’ and tried my best to keep the LS usage in sigminer (#189).
Made consistent copy number labels in show_sig_profile() and added input checking for this function.
Fixed inconsistent bootstrap results when using furrr; the solution is from https://github.com/DavisVaughan/furrr/issues/107.
Properly handled null-count sample in sig_fit() for methods QP and SA.
Supported boxplot or violin in show_sig_fit() and show_sig_bootstrap_* functions.
Added a job mode to sig_fit_bootstrap_batch to make it more useful in practice.
Added show_groups() to show the signature contribution in each group from get_groups().
Expanded clustering in get_groups() to the result of sig_fit().
Properly handled null-count samples in sig_fit_bootstrap_batch().
Added strand bias labeling for INDEL.
Added COSMIC TSB signatures.

Exported APOBEC result when the mode is ‘ALL’ in sig_tally().
Added batch bootstrap analysis feature (#158).
Supported all common signature plotting.
Added strand feature to signature profile.

Added profile plot for DBS and INDEL.
Fixed error for signature extraction in mode ‘DBS’ or ‘ID’.
Fixed the bug that method ‘M’ for CN tally did not work when cores > 1 (#161).

Added multiple methods for sig_fit().
Added feature sig_fit_bootstrap() for bootstrap results.
Added multiple classification methods for SBS signature.
Added strand bias enrichment analysis for SBS signature.
Moved multiple packages from field Imports to Suggests.
Added feature report_bootstrap_p_value() to report p values.
Added common DBS and ID signatures.
Updated citation.

Added merged transcript info for the hg19 and hg38 builds; this is available via data().
Added gene info for hg19 and hg38 build to extdata directory.

Removed fuzzyjoin package from dependency.
Moved ggalluvial package to the Suggests field.

To all users: this is a breakthrough version of sigminer. Most functions have been modified and more features have been implemented. Please read the reference list to see the function groups and their functionalities.

Please read the vignette for usage.

I hope it helps your research work and makes a new contribution to the scientific community.

sigminer 2.3.2

sigminer 2.3.1 (2024-05-11)

sigminer 2.3.0 (2023-12-12)

sigminer 2.2.2 (2023-08-21)

sigminer 2.2.1

sigminer 2.2.0 (2023-04-06)

sigminer 2.1.10

sigminer 2.1.9 (2022-11-09)

sigminer 2.1.8 (2022-10-20)

sigminer 2.1.7 (2022-08-31)

sigminer 2.1.6 (2022-08-09)

sigminer 2.1.5 (2022-06-30)

sigminer 2.1.4 (2022-04-26)

sigminer 2.1.3 (2022-03-10)

sigminer 2.1.2 (2021-12-15)

sigminer 2.1.1 (2021-10-29)

sigminer 2.1.0 (2021-09-22)

sigminer 2.0.5 (2021-09-03)

sigminer 2.0.4 (2021-08-03)

sigminer 2.0.3 (2021-07-18)

sigminer 2.0.2 (2021-06-17)

sigminer 2.0.1 (2021-05-19)

sigminer 2.0.0 (2021-04-01)

BUG REPORTS

ENHANCEMENTS

NEW FUNCTIONS

DEPRECATED

sigminer 1.2.5 (2021-02-20)

BUG REPORTS

ENHANCEMENTS

NEW FUNCTIONS

DEPRECATED

sigminer 1.2.4 (2021-01-30)

BUG REPORTS

ENHANCEMENTS

NEW FUNCTIONS

DEPRECATED

sigminer 1.2.2

BUG REPORTS

ENHANCEMENTS

NEW FUNCTIONS

DEPRECATED

sigminer 1.2.1 (2021-01-08)

BUG REPORTS

ENHANCEMENTS

NEW FUNCTIONS

DEPRECATED

sigminer 1.2.0

BUG REPORTS

ENHANCEMENTS

NEW FUNCTIONS

DEPRECATED

sigminer 1.1.0 (2020-11-11)

sigminer 1.0.19 (2020-09-28)

sigminer 1.0.18

sigminer 1.0.17

sigminer 1.0.16 (2020-09-12)

sigminer 1.0.15

sigminer 1.0.14

sigminer 1.0.13 (2020-08-27)

sigminer 1.0.12

sigminer 1.0.11

sigminer 1.0.10

sigminer 1.0.9

sigminer 1.0.8

sigminer 1.0.7 (2020-06-17)

sigminer 1.0.6 (2020-06-01)

sigminer 1.0.5 (2020-05-14)

sigminer 1.0.4

sigminer 1.0.3 (2020-04-30)

sigminer 1.0.2

sigminer 1.0.1

sigminer 1.0.0 (2020-03-31)