Citation: Jacobo Pardo-Seco, Alberto Gómez-Carballa, Xabier Bello, Federico Martinón-Torres, Antonio Salas. 2021: Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics. Zoological Research, 42(1): 87-93. DOI: 10.24272/j.issn.2095-8137.2020.364

Pitfalls of barcodes in the study of worldwide SARS-CoV-2 variation and phylodynamics

Abstract: Analysis of SARS-CoV-2 genome variation using a minimal number of selected informative sites that form a genetic barcode presents several drawbacks. We show that purely mathematical procedures for site selection should be supervised by known phylogeny (i) to ensure that solid tree branches are represented instead of mutational hotspots with poor phylogeographic properties, and (ii) to avoid phylogenetic redundancy. We propose a procedure that prevents information redundancy in site selection by considering the cumulative informativeness of previously selected sites (as a proxy for phylogeny-based criteria). This procedure demonstrates that, for short barcodes (e.g., 11 sites), there are thousands of informative site combinations that improve on previous proposals. We also show that barcodes based on worldwide databases inevitably prioritize variants located at the basal nodes of the phylogeny, even though most representative genomes in these ancestral nodes are no longer in circulation. Consequently, coronavirus phylodynamics cannot be properly captured by universal genomic barcodes because most SARS-CoV-2 variation is generated in geographically restricted areas by the continuous introduction of domestic variants.
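
The idea of choosing each new site by what it adds to the sites already selected can be illustrated with a small sketch. This is not the authors' published procedure: the abstract does not give the informativeness measure, so the sketch assumes Shannon entropy of the joint haplotypes as a stand-in, and the names `joint_entropy` and `greedy_barcode` and the toy alignment are hypothetical. It also omits the supervision by known phylogeny that the abstract calls for.

```python
from collections import Counter
from math import log2


def joint_entropy(seqs, sites):
    """Shannon entropy (bits) of the haplotypes defined by `sites`.

    Assumed informativeness measure; the abstract does not specify one.
    """
    counts = Counter(tuple(s[i] for i in sites) for s in seqs)
    n = len(seqs)
    return -sum(c / n * log2(c / n) for c in counts.values())


def greedy_barcode(seqs, k):
    """Pick up to k sites, each one maximising the entropy gained
    on top of the sites already chosen (hypothetical helper)."""
    chosen = []
    current = 0.0
    candidates = set(range(len(seqs[0])))
    for _ in range(k):
        best_site, best_h = None, current
        for site in sorted(candidates):
            h = joint_entropy(seqs, chosen + [site])
            # Accept only sites that add information beyond `chosen`.
            if h > best_h + 1e-12:
                best_site, best_h = site, h
        if best_site is None:
            break  # every remaining site is redundant or invariant
        chosen.append(best_site)
        candidates.remove(best_site)
        current = best_h
    return chosen


# Toy alignment: sites 0 and 1 are perfectly linked, site 2 is invariant.
toy = ["AAAA", "AAAT", "CCAA", "CCAT"]
barcode = greedy_barcode(toy, 3)
```

On this toy alignment, site 1 is never selected: taken alone it is as variable as site 0, but once site 0 is in the barcode it contributes nothing new. Ranking sites independently would not detect this kind of redundancy.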

     
