Volume 42 Issue 1
Jan. 2021
Jacobo Pardo-Seco, Alberto Gómez-Carballa, Xabier Bello, Federico Martinón-Torres, Antonio Salas. Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics. Zoological Research, 2021, 42(1): 87-93. doi: 10.24272/j.issn.2095-8137.2020.364

Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics

doi: 10.24272/j.issn.2095-8137.2020.364
#Authors contributed equally to this work
Funds: This study was supported by the GePEM (Instituto de Salud Carlos III (ISCIII)/PI16/01478/Cofinanciado FEDER), DIAVIR (Instituto de Salud Carlos III (ISCIII)/DTS19/00049/Cofinanciado FEDER; Proyecto de Desarrollo Tecnológico en Salud), Resvi-Omics (Instituto de Salud Carlos III (ISCIII)/PI19/01039/Cofinanciado FEDER), BI-BACVIR (PRIS-3; Agencia de Conocimiento en Salud (ACIS), Servicio Gallego de Salud (SERGAS), Xunta de Galicia; Spain), Programa Traslaciona Covid-19 (ACIS, Servicio Gallego de Salud (SERGAS), Xunta de Galicia; Spain) and Axencia Galega de Innovación (GAIN; IN607B 2020/08, Xunta de Galicia; Spain) to A.S.; and ReSVinext (Instituto de Salud Carlos III (ISCIII)/PI16/01569/Cofinanciado FEDER) and Enterogen (Instituto de Salud Carlos III (ISCIII)/PI19/01090/Cofinanciado FEDER) to F.M.-T.
More Information
  • Corresponding author: E-mail: antonio.salas@usc.es
  • Received Date: 2020-12-04
  • Accepted Date: 2020-12-31
  • Available Online: 2020-12-31
  • Publish Date: 2021-01-18
  • Analysis of SARS-CoV-2 genome variation using a minimal number of selected informative sites that together form a genetic barcode presents several drawbacks. We show that purely mathematical procedures for site selection should be supervised by the known phylogeny (i) to ensure that solid tree branches are represented instead of mutational hotspots with poor phylogeographic properties, and (ii) to avoid phylogenetic redundancy. We propose a procedure that prevents information redundancy in site selection by considering the cumulative informativeness of previously selected sites (as a proxy for phylogeny-based criteria). This procedure demonstrates that, for short barcodes (e.g., 11 sites), there are thousands of informative site combinations that improve on previous proposals. We also show that barcodes based on worldwide databases inevitably prioritize variants located at the basal nodes of the phylogeny, and most representative genomes at these ancestral nodes are no longer in circulation. Consequently, coronavirus phylodynamics cannot be properly captured by universal genomic barcodes, because most SARS-CoV-2 variation is generated in geographically restricted areas by the continuous introduction of domestic variants.
  • ZR-2020-364 Supplementary Data and Table S1.zip
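The abstract describes selecting barcode sites by the cumulative informativeness of the sites already chosen, so that redundant positions are not picked twice. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes, for illustration, that informativeness is measured as the Shannon entropy of the haplotypes formed by the selected sites and that sites are added greedily. The function names `haplotype_entropy` and `greedy_barcode` and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def haplotype_entropy(genomes, sites):
    """Shannon entropy (bits) of the haplotypes defined by `sites`."""
    counts = Counter(tuple(g[s] for s in sites) for g in genomes)
    total = len(genomes)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def greedy_barcode(genomes, k):
    """Pick up to k sites, each one maximizing the gain in haplotype entropy
    over the sites already chosen (assumed criterion, not the paper's).
    Stops early when no remaining site adds information."""
    n_sites = len(genomes[0])
    selected = []
    current = 0.0
    for _ in range(k):
        best_site, best_entropy = None, current
        for s in range(n_sites):
            if s in selected:
                continue
            h = haplotype_entropy(genomes, selected + [s])
            if h > best_entropy + 1e-12:  # require a strict gain
                best_site, best_entropy = s, h
        if best_site is None:
            break
        selected.append(best_site)
        current = best_entropy
    return selected


# Toy aligned sequences (hypothetical): sites 0 and 1 always change together,
# so they are redundant; site 2 varies independently of both.
toy = ["AAC", "AAT", "GGC", "GGT"]
barcode = greedy_barcode(toy, 3)
print(barcode)
```

On the toy data, a selection that ranked each site in isolation would treat sites 0 and 1 as equally good, whereas the cumulative criterion skips site 1 once site 0 is in the barcode because it adds no new haplotypes. A phylogeny-supervised selection, as the abstract recommends, would additionally check that each chosen site marks a stable branch and not a recurrent hotspot; that step is not modeled here.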
    Figures(1)  / Tables(1)
