Volume 42 Issue 2
Mar. 2021
Yu-Fang Mao, Xi-Guo Yuan, Yu-Peng Cun. A novel machine learning approach (svmSomatic) to distinguish somatic and germline mutations using next-generation sequencing data. Zoological Research, 2021, 42(2): 246-249. doi: 10.24272/j.issn.2095-8137.2021.014

A novel machine learning approach (svmSomatic) to distinguish somatic and germline mutations using next-generation sequencing data

doi: 10.24272/j.issn.2095-8137.2021.014
Funds: This study was supported by the CAS Pioneer Hundred Talents Program and the National Natural Science Foundation of China (32070683) to Y.P.C.
  • ZR-2021-014 Supplementary Material.pdf
    Figures(1)  / Tables(1)