Citation: Yu-Fang Mao, Xi-Guo Yuan, Yu-Peng Cun. A novel machine learning approach (svmSomatic) to distinguish somatic and germline mutations using next-generation sequencing data. Zoological Research, 2021, 42(2): 246-249. doi: 10.24272/j.issn.2095-8137.2021.014