A MAF file is the recommended input. When only VCF files are available, this function mimics a MAF object using the key columns c(1, 2, 4, 5, 7) of each VCF file (CHROM, POS, REF, ALT and FILTER in the standard VCF layout).

read_vcf(
  vcfs,
  samples = NULL,
  genome_build = c("hg19", "hg38", "T2T", "mm10", "mm9", "ce11"),
  keep_only_pass = FALSE,
  verbose = TRUE
)

Arguments

vcfs

paths to the VCF files.

samples

sample names for the VCF files.

genome_build

genome build version, e.g. "hg19".

keep_only_pass

if TRUE, keep only 'PASS' mutations for analysis.

verbose

if TRUE, print extra info.

Value

a MAF object.

Examples

vcfs <- list.files(system.file("extdata", package = "sigminer"), "\\.vcf$", full.names = TRUE)
# \donttest{
maf <- read_vcf(vcfs)
#> Reading file(s): /home/runner/work/_temp/Library/sigminer/extdata/test1.vcf, /home/runner/work/_temp/Library/sigminer/extdata/test2.vcf, /home/runner/work/_temp/Library/sigminer/extdata/test3.vcf
#> It seems /home/runner/work/_temp/Library/sigminer/extdata/test2.vcf has no normal VCF header, try parsing without header.
#> Annotating Variant Type...
#> Downloading https://zenodo.org/record/10360995/files/human_hg19_gene_info.rds to /home/runner/work/_temp/Library/sigminer/extdata/human_hg19_gene_info.rds
#> Annotating mutations to first matched gene based on database of hg19...
#> Transforming into a MAF object...
#> -Validating
#> --Non MAF specific values in Variant_Classification column:
#>   Unknown
#> -Summarizing
#> -Processing clinical data
#> --Missing clinical data
#> -Finished in 0.024s elapsed (0.039s cpu) 
maf <- read_vcf(vcfs, keep_only_pass = TRUE)
#> Reading file(s): /home/runner/work/_temp/Library/sigminer/extdata/test1.vcf, /home/runner/work/_temp/Library/sigminer/extdata/test2.vcf, /home/runner/work/_temp/Library/sigminer/extdata/test3.vcf
#> It seems /home/runner/work/_temp/Library/sigminer/extdata/test2.vcf has no normal VCF header, try parsing without header.
#> Annotating Variant Type...
#> Annotating mutations to first matched gene based on database of hg19...
#> Transforming into a MAF object...
#> -Validating
#> --Non MAF specific values in Variant_Classification column:
#>   Unknown
#> -Summarizing
#> -Processing clinical data
#> --Missing clinical data
#> -Finished in 0.023s elapsed (0.037s cpu) 
# }
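
To make the column selection in the description concrete, here is a minimal base-R sketch of picking columns c(1, 2, 4, 5, 7) from a VCF and keeping only 'PASS' records. It is not the implementation used by read_vcf(); the VCF text and the variable names (vcf_text, key, pass_only) are made up for illustration.

```r
# Hypothetical in-memory VCF with one meta line, a header line and two records.
vcf_text <- c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "chr1\t12345\t.\tA\tG\t50\tPASS\t.",
  "chr2\t67890\t.\tC\tT\t10\tLowQual\t."
)

# Drop the "##" meta lines; the "#CHROM" line is kept as the header.
body <- vcf_text[!grepl("^##", vcf_text)]

# Read everything as character so alleles such as "T" are not parsed as TRUE.
vcf <- read.delim(
  text = body,
  comment.char = "",
  check.names = FALSE,
  colClasses = "character"
)

# Columns 1, 2, 4, 5, 7 are CHROM, POS, REF, ALT and FILTER.
key <- vcf[, c(1, 2, 4, 5, 7)]

# Keep only records whose FILTER value is "PASS".
pass_only <- key[key$FILTER == "PASS", ]
```

Filtering on the FILTER column this way corresponds roughly to what keep_only_pass = TRUE is described as doing; the gene annotation and the conversion to a MAF object are left to read_vcf() itself.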