9 Datasets
9.1 Reference annotation
sigminer ships several reference annotation datasets that it uses for internal calculations. They can also be exported for other uses, either with data() or with get_genome_annotation().
Currently, there are the following datasets:
centromeres.hg19
centromeres.hg38
chromsize.hg19
chromsize.hg38
cytobands.hg19
cytobands.hg38
An example is given below:
data("centromeres.hg19")
head(centromeres.hg19)
## chrom left.base right.base
## 1 chr1 121535434 124535434
## 2 chr2 92326171 95326171
## 3 chr3 90504854 93504854
## 4 chr4 49660117 52660117
## 5 chr5 46405641 49405641
## 6 chr6 58830166 61830166
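As one illustration of using the exported table outside sigminer, the sketch below labels genomic positions as p arm, centromere, or q arm. The six rows are copied from the output above so the snippet runs without sigminer installed. The helper `locate_arm()` is hypothetical and not part of the package.

```r
# First six rows of centromeres.hg19, copied from the printed output above.
centro <- data.frame(
  chrom = paste0("chr", 1:6),
  left.base = c(121535434, 92326171, 90504854, 49660117, 46405641, 58830166),
  right.base = c(124535434, 95326171, 93504854, 52660117, 49405641, 61830166),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of sigminer): positions left of the
# centromere are labelled "p", positions right of it "q", and positions
# between left.base and right.base (inclusive) "centromere".
locate_arm <- function(chrom, pos, centro) {
  idx <- match(chrom, centro$chrom)
  if (anyNA(idx)) stop("unknown chromosome: ", chrom[is.na(idx)][1])
  ifelse(pos < centro$left.base[idx], "p",
    ifelse(pos > centro$right.base[idx], "q", "centromere"))
}

arms <- locate_arm(
  chrom = c("chr1", "chr1", "chr2"),
  pos = c(1e6, 122000000, 200000000),
  centro = centro
)
arms
```

With the full dataset loaded through data("centromeres.hg19"), the same helper should work on any chromosome in the table, assuming the column names stay as shown.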
get_genome_annotation() gives finer control over the returned data.frame, for example by selecting the data type, chromosomes, and genome build:
get_genome_annotation(
data_type = "chr_size",
chrs = c("chr1", "chr10", "chr20"),
genome_build = "hg19"
)
## chrom size
## 1 chr1 249250621
## 2 chr10 135534747
## 3 chr20 63025520
For more details, see ?get_genome_annotation.
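A returned chromosome size table can be used, for instance, to check that segment coordinates stay within chromosome bounds. The following is a minimal sketch under stated assumptions: the sizes are copied from the output above, and the `segs` table is made-up illustration data, not a sigminer object.

```r
# Rows copied from the get_genome_annotation() output above.
chr_size <- data.frame(
  chrom = c("chr1", "chr10", "chr20"),
  size = c(249250621, 135534747, 63025520),
  stringsAsFactors = FALSE
)

# Hypothetical segments; the third one runs past the end of chr20.
segs <- data.frame(
  chrom = c("chr1", "chr10", "chr20"),
  start = c(1, 50000000, 60000000),
  end = c(1000000, 60000000, 70000000),
  stringsAsFactors = FALSE
)

# Look up each segment's chromosome size and flag out-of-range coordinates.
segs$chrom_size <- chr_size$size[match(segs$chrom, chr_size$chrom)]
segs$in_range <- segs$start >= 1 & segs$end <= segs$chrom_size
segs
```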
9.2 Copy number components setting
The dataset CN.features is a predefined table of components used to identify copy number signatures with the "Wang" method.
Users can define a custom table with a similar structure and pass it to functions such as sig_tally().
Details on how this dataset was generated can be found at https://github.com/ShixiangWang/sigminer/blob/master/data-raw/CN-features.R.
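A custom table might look like the sketch below, which uses the same five columns as CN.features (feature, component, label, min, max). The cut points here are invented for illustration. Whether sig_tally() accepts a reduced feature set like this one, and under which argument, is an assumption to check in ?sig_tally.

```r
# A small custom component table with the same columns as CN.features.
# The cut points are illustrative only, not recommended values.
custom_features <- data.frame(
  feature = c(rep("CN", 4), rep("SS", 3)),
  component = c("CN[0]", "CN[1]", "CN[2]", "CN[>2]",
                "SS[<=3]", "SS[>3 & <=6]", "SS[>6]"),
  label = c("point", "point", "point", "range", "range", "range", "range"),
  min = c(0, 1, 2, 2, -Inf, 3, 6),
  max = c(0, 1, 2, Inf, 3, 6, Inf),
  stringsAsFactors = FALSE
)

# Basic consistency checks before handing the table to any function:
# "point" rows cover a single value, "range" rows cover a proper interval,
# and component names are unique.
is_point <- custom_features$label == "point"
checks <- c(
  points_ok = all(custom_features$min[is_point] == custom_features$max[is_point]),
  ranges_ok = all(custom_features$min[!is_point] < custom_features$max[!is_point]),
  unique_ok = !anyDuplicated(custom_features$component)
)
checks

# Assumption (not verified here): the table would be passed along the lines of
#   sig_tally(cn_object, method = "Wang", feature_setting = custom_features)
# and may need converting first with data.table::as.data.table().
```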
CN.features
## feature component label min max
## 1: BP10MB BP10MB[0] point 0 0
## 2: BP10MB BP10MB[1] point 1 1
## 3: BP10MB BP10MB[2] point 2 2
## 4: BP10MB BP10MB[3] point 3 3
## 5: BP10MB BP10MB[4] point 4 4
## 6: BP10MB BP10MB[5] point 5 5
## 7: BP10MB BP10MB[>5] range 5 Inf
## 8: BPArm BPArm[0] point 0 0
## 9: BPArm BPArm[1] point 1 1
## 10: BPArm BPArm[2] point 2 2
## 11: BPArm BPArm[3] point 3 3
## 12: BPArm BPArm[4] point 4 4
## 13: BPArm BPArm[5] point 5 5
## 14: BPArm BPArm[6] point 6 6
## 15: BPArm BPArm[7] point 7 7
## 16: BPArm BPArm[8] point 8 8
## 17: BPArm BPArm[9] point 9 9
## 18: BPArm BPArm[10] point 10 10
## 19: BPArm BPArm[>10 & <=20] range 10 20
## 20: BPArm BPArm[>20 & <=30] range 20 30
## 21: BPArm BPArm[>30] range 30 Inf
## 22: CN CN[0] point 0 0
## 23: CN CN[1] point 1 1
## 24: CN CN[2] point 2 2
## 25: CN CN[3] point 3 3
## 26: CN CN[4] point 4 4
## 27: CN CN[>4 & <=8] range 4 8
## 28: CN CN[>8] range 8 Inf
## 29: CNCP CNCP[0] point 0 0
## 30: CNCP CNCP[1] point 1 1
## 31: CNCP CNCP[2] point 2 2
## 32: CNCP CNCP[3] point 3 3
## 33: CNCP CNCP[4] point 4 4
## 34: CNCP CNCP[>4 & <=8] range 4 8
## 35: CNCP CNCP[>8] range 8 Inf
## 36: OsCN OsCN[0] point 0 0
## 37: OsCN OsCN[1] point 1 1
## 38: OsCN OsCN[2] point 2 2
## 39: OsCN OsCN[3] point 3 3
## 40: OsCN OsCN[4] point 4 4
## 41: OsCN OsCN[>4 & <=10] range 4 10
## 42: OsCN OsCN[>10] range 10 Inf
## 43: SS SS[<=2] range -Inf 2
## 44: SS SS[>2 & <=3] range 2 3
## 45: SS SS[>3 & <=4] range 3 4
## 46: SS SS[>4 & <=5] range 4 5
## 47: SS SS[>5 & <=6] range 5 6
## 48: SS SS[>6 & <=7] range 6 7
## 49: SS SS[>7 & <=8] range 7 8
## 50: SS SS[>8] range 8 Inf
## 51: NC50 NC50[<=2] range -Inf 2
## 52: NC50 NC50[3] point 3 3
## 53: NC50 NC50[4] point 4 4
## 54: NC50 NC50[5] point 5 5
## 55: NC50 NC50[6] point 6 6
## 56: NC50 NC50[7] point 7 7
## 57: NC50 NC50[>7] range 7 Inf
## 58: BoChr BoChr[1] point 1 1
## 59: BoChr BoChr[2] point 2 2
## 60: BoChr BoChr[3] point 3 3
## 61: BoChr BoChr[4] point 4 4
## 62: BoChr BoChr[5] point 5 5
## 63: BoChr BoChr[6] point 6 6
## 64: BoChr BoChr[7] point 7 7
## 65: BoChr BoChr[8] point 8 8
## 66: BoChr BoChr[9] point 9 9
## 67: BoChr BoChr[10] point 10 10
## 68: BoChr BoChr[11] point 11 11
## 69: BoChr BoChr[12] point 12 12
## 70: BoChr BoChr[13] point 13 13
## 71: BoChr BoChr[14] point 14 14
## 72: BoChr BoChr[15] point 15 15
## 73: BoChr BoChr[16] point 16 16
## 74: BoChr BoChr[17] point 17 17
## 75: BoChr BoChr[18] point 18 18
## 76: BoChr BoChr[19] point 19 19
## 77: BoChr BoChr[20] point 20 20
## 78: BoChr BoChr[21] point 21 21
## 79: BoChr BoChr[22] point 22 22
## 80: BoChr BoChr[23] point 23 23
## feature component label min max
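The component labels suggest how min and max are read: a "point" row matches a single value, while a "range" row matches values greater than min and up to and including max. The sketch below applies that reading to the CN rows copied from the table above. `match_component()` is a hypothetical helper written for illustration, not sigminer's own implementation.

```r
# The seven CN rows of CN.features, copied from the printout above.
cn_rows <- data.frame(
  component = c("CN[0]", "CN[1]", "CN[2]", "CN[3]", "CN[4]",
                "CN[>4 & <=8]", "CN[>8]"),
  label = c(rep("point", 5), "range", "range"),
  min = c(0, 1, 2, 3, 4, 4, 8),
  max = c(0, 1, 2, 3, 4, 8, Inf),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map each value to the first matching component.
# "point" rows match value == min; "range" rows match min < value <= max.
match_component <- function(x, setting) {
  vapply(x, function(v) {
    hit <- ifelse(setting$label == "point",
      v == setting$min,
      v > setting$min & v <= setting$max)
    if (any(hit)) setting$component[which(hit)[1]] else NA_character_
  }, character(1))
}

matched <- match_component(c(0, 2, 4, 5, 8, 12), cn_rows)
matched
```

Under this reading, a copy number of 4 falls in CN[4], while 5 and 8 both fall in CN[>4 & <=8], and 12 falls in CN[>8].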