R/get_shannon_diversity_index.R
get_shannon_diversity_index.Rd
$$H = - \sum_{i=1}^{n} p_i \ln(p_i)$$
where n is the number of signatures identified in the sample with exposure > cutoff, and p_i is the normalized exposure of the i-th signature with exposure > cutoff. The exposures of these signatures are normalized to sum to 1.
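As a rough illustration of the formula above, the index for a single sample can be computed by hand. This is a minimal sketch, not the package's implementation: `shannon_index()` is a hypothetical helper, and it assumes that signatures are first filtered by the cutoff and the remaining exposures are then renormalized to sum to 1.

```r
# Hypothetical helper for illustration only; not part of sigminer.
shannon_index <- function(p, cutoff = 0.001) {
  # Keep only signatures whose relative exposure exceeds the cutoff
  p <- p[p > cutoff]
  # Assumption: renormalize the retained exposures so they sum to 1
  p <- p / sum(p)
  # H = -sum(p_i * ln(p_i)); log() is the natural logarithm in R
  -sum(p * log(p))
}

# Relative exposures (Sig1, Sig2, Sig3) of sample TCGA-AB-2802
h <- shannon_index(c(0, 0.8581541, 0.14184589))
h
```

Under these assumptions the result is approximately 0.408, in line with the diversity index reported for TCGA-AB-2802 in the example output. A sample dominated by a single signature gives an index of 0.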
get_shannon_diversity_index(rel_expo, cutoff = 0.001)
rel_expo: a data.frame with numeric columns indicating relative signature exposures for each sample. Typically this data can be obtained from get_sig_exposure().
cutoff: a relative exposure cutoff for filtering signatures, default is 0.1%.
a data.frame with one row per sample, containing the sample name and its diversity_index.
Steele, Christopher D., et al. "Undifferentiated sarcomas develop through distinct evolutionary pathways." Cancer Cell 35.3 (2019): 441-456.
# Load mutational signature
load(system.file("extdata", "toy_mutational_signature.RData",
package = "sigminer", mustWork = TRUE
))
# Get signature exposure
rel_expo <- get_sig_exposure(sig2, type = "relative")
rel_expo
#> sample Sig1 Sig2 Sig3
#> <char> <num> <num> <num>
#> 1: TCGA-AB-2802 0.00000000 0.8581541 0.14184589
#> 2: TCGA-AB-2803 0.00000000 0.9418556 0.05814437
#> 3: TCGA-AB-2804 0.00000000 1.0000000 0.00000000
#> 4: TCGA-AB-2805 0.01638910 0.9836109 0.00000000
#> 5: TCGA-AB-2806 0.02766703 0.9723330 0.00000000
#> ---
#> 182: TCGA-AB-3007 0.00000000 1.0000000 0.00000000
#> 183: TCGA-AB-3008 0.06370731 0.9362927 0.00000000
#> 184: TCGA-AB-3009 0.00000000 0.9129796 0.08702041
#> 185: TCGA-AB-3011 0.00000000 1.0000000 0.00000000
#> 186: TCGA-AB-3012 0.00000000 1.0000000 0.00000000
diversity_index <- get_shannon_diversity_index(rel_expo)
diversity_index
#> sample diversity_index
#> <char> <num>
#> 1: TCGA-AB-2802 0.40830022
#> 2: TCGA-AB-2803 0.22183088
#> 3: TCGA-AB-2804 0.00000000
#> 4: TCGA-AB-2805 0.08363194
#> 5: TCGA-AB-2806 0.12653657
#> ---
#> 182: TCGA-AB-3007 0.00000000
#> 183: TCGA-AB-3008 0.23704875
#> 184: TCGA-AB-3009 0.29558940
#> 185: TCGA-AB-3011 0.00000000
#> 186: TCGA-AB-3012 0.00000000