A review on the bioinformatics pipelines for metagenomic research

YE Dan-Dan FAN Meng-Meng GUAN Qiong CHEN Hong-Ju MA Zhan-Shan

Citation: YE Dan-Dan, FAN Meng-Meng, GUAN Qiong, CHEN Hong-Ju, MA Zhan-Shan. A review on the bioinformatics pipelines for metagenomic research. Zoological Research, 2012, 33(6): 574-585. doi: 10.3724/SP.J.1141.2012.06574

Current status of bioinformatics platforms for metagenomic research

doi: 10.3724/SP.J.1141.2012.06574
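Funding: National Natural Science Foundation of China (grant no. 61175071); Yunnan Province High-End Science and Technology Talent Program; Yunnan Province Overseas High-Level Talent Program; Yunnan Provincial Innovation Team for Collaborative Research on "Computational Evolution and Natural Evolution", Chinese Academy of Sciences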
  • CLC number: Q811.4; Q75; Q349

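  • Abstract: The term metagenome, proposed by Handelsman et al (1998), refers to the collective genomes of all species in the microbial community of a given environmental sample (for example, the human and animal gut, breast milk, soil, lakes, glaciers, and oceans). Metagenomic techniques originated in environmental microbiology, and next-generation high-throughput sequencing has made their wide application possible. As in genomics, the current bottleneck in metagenomics is the efficient analysis of the massive data produced by high-throughput sequencing, so bioinformatics methods and platforms are key to metagenomic research. This paper introduces the main bioinformatics software and tools currently used in metagenomic research. Because the two major sequencing approaches in use, whole genome sequencing and amplicon sequencing, yield substantially different data and call for different analysis methods, the corresponding software platforms are introduced separately.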
  • [1] Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. 2000. Gene ontology: tool for the unification of biology[J]. Nat Genet, 25(1): 25-29.
    [2] Batzoglou S, Jaffe DB, Stanley K, Butler J, Gnerre S, Mauceli E, Berger B, Mesirov JP, Lander ES. 2002. ARACHNE: a whole-genome shotgun assembler[J]. Genome Res, 12(1): 177-189.
    [3] Borodovsky M, McIninch J. 1993. GeneMark: parallel gene recognition for both DNA strands[J]. Computers & Chemistry, 17(19): 123-133.
    [4] Burge C, Karlin S. 1997. Prediction of complete gene structures in human genomic DNA[J]. J Molecular Biol, 268(1): 78-94.
    [5] Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Peña AG, Goodrich JK, Gordon JI, Huttley GA, Kelley ST, Knights D, Koenig JE, Ley RE, Lozupone CA, McDonald D, Muegge BD, Pirrung M, Reeder J, Sevinsky JR, Turnbaugh PJ, Walters WA, Widmann J, Yatsunenko T, Zaneveld J, Knight R. 2010. QIIME allows analysis of high-throughput community sequencing data[J]. Nat Methods, 7(5): 335–336.
    [6] Cock PJ, Fields CJ, Goto N, Heuer ML, Rice PM. 2010. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants[J]. Nucleic Acids Res, 38(6): 1767-1771.
    [7] Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS, McGarrell DM, Marsh T, Garrity GM, Tiedje JM. 2009. The Ribosomal Database Project: improved alignments and new tools for rRNA analysis[J]. Nucleic Acids Res, 37: 141–145.
    [8] Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. 2005. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research[J]. Bioinformatics, 21: 3674-3676.
    [9] Delcher AL, Harmon D, Kasif S, White O, Salzberg SL. 2001. Improved microbial gene identification with GLIMMER[J]. Nucleic Acids Res, 27: 4636-4641.
    [10] Dennis G Jr, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA. 2003. DAVID: Database for Annotation, Visualization, and Integrated Discovery[J]. Genome Biol, 4(5): 3.
    [11] DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi D, Hu P and Andersen GL. 2006. Greengenes, a Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB[J]. Appl Environ Microbiol, 72(7): 5069-5072.
    [12] Frigaard NU, Martinez A, Mincer TJ, DeLong EF. 2006. Proteorhodopsin lateral gene transfer between marine planktonic Bacteria and Archaea[J]. Nature, 439(7078): 847–850.
    [13] Handelsman J, Rondon MR, Brady SF, Clardy J, Goodman RM. 1998. Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products[J]. Chem Biol, 5(10): 245-249.
    [14] He JZ, Zhang LM, Shen JP, Zhu YG. 2008. Advances and perspectives of metagenomics[J]. Acta Scientiae Circumstantiae, 25(2): 231-234. [贺纪正, 张丽梅, 沈菊培, 朱永官. 2008. 宏基因组学(Metagenomics)的研究现状和发展趋势. 环境科学学报, 25(2): 231-234.]
    [15] Huang X, Yang SP. 2005. Generating a genome assembly with PCAP[M] // Current Protocols in Bioinformatics. New York: John Wiley & Sons.
    [16] Miller JR, Koren S, Sutton G. 2010. Assembly algorithms for next-generation sequencing data[J]. Genomics, 95(6): 315-327.
    [17] Kanehisa M, Goto S. 2000. KEGG: kyoto encyclopedia of genes and genomes[J]. Nucleic Acids Res, 28(1): 27-30.
    [18] Korf I, Flicek P, Duan D, Brent MR. 2001. Integrating genomic homology into gene structure prediction[J]. Bioinformatics, 17(1): 140-148.
    [19] Krogh A, Mian IS, Haussler D. 1994. A hidden Markov model that finds genes in E. coli DNA[J]. Nucleic Acids Res, 22(22): 4768-4778.
    [20] Li H, He JJ, Zhang Y, Xu H, Chen GX. 2008a. Application of metagenomic technique in the exploring of uncultured environmental microbial gene resource[J]. Acta Ecol Sin, 28(4): 1762-1762, 1773. [李慧, 何晶晶, 张颖, 徐慧, 陈冠雄. 2008. 宏基因组技术在开发未培养环境微生物基因资源中的应用. 生态学报, 28(4): 1762-1762, 1773.]
    [21] Li R, Li Y, Kristiansen K, Wang J. 2008b. SOAP: short oligonucleotide alignment program[J]. Bioinformatics, 24(5): 713-714.
    [22] Ludwig W, Strunk O, Westram R, Richter L, Meier H, Yadhukumar, Buchner A, Lai T, Steppi S, Jobb G, Förster W, Brettske I, Gerber S, Ginhart AW, Gross O, Grumann S, Hermann S, Jost R, König A, Liss T, Lüßmann R, May M, Nonhoff B, Reichel R, Strehlow R, Stamatakis A, Stuckmann S, Vilbig A, Lenke M, Ludwig T, Bode A, Schleifer KH. 2004. ARB: a software environment for sequence data[J]. Nucleic Acids Res, 32(4): 1363-1371.
    [23] Lukashin AV, Borodovsky M. 1998. GeneMark.hmm: new solutions for gene finding[J]. Nucleic Acids Res, 26(4): 1107-1115.
    [24] Ma ZS. 2012. A note on extending Taylor’s power law for characterizing human microbial communities: Inspiration from comparative studies on the distribution patterns of insects and galaxies, and as a case study for medical ecology. [Online] Available: arXiv.org/abs/1205.3504 (2012/5/15).
    [25] Ma ZS, Geng JW, Abdo Z, Forney LJ. 2012. A Bird’s Eye View of Microbial Community Dynamics // Microbial Ecology Theory: Current Perspectives. Norwich, UK: Horizon Scientific Press: 57-70.
    [26] Maidak BL, Olsen GJ, Larsen N, Overbeek R, McCaughey MJ, Woese CR. 1997. The RDP (Ribosomal Database Project)[J]. Nucleic Acids Res, 25(1): 109–110.
    [27] Melsted P, Pritchard JK. 2011. Efficient counting of k-mers in DNA sequences using a Bloom filter[J]. BMC Bioinformatics, 12(1): 333.
    [28] Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, Flanigan MJ, Kravitz SA, Mobarry CM, Reinert KH, Remington KA, Anson EL, Bolanos RA, Chou HH, Jordan CM, Halpern AL, Lonardi S, Beasley EM, Brandon RC, Chen L, Dunn PJ, Lai Z, Liang Y, Nusskern DR, Zhan M, Zhang Q, Zheng X, Rubin GM, Adams MD, Venter JC. 2000. A whole-genome assembly of Drosophila. Science, 287(5461): 2196-2204.
    [29] Pevzner PA, Tang HX, Waterman MS. 2001. An Eulerian path approach to DNA fragment assembly[J]. Proc Natl Acad Sci USA, 98(17): 9748-9753.
    [30] Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W, Peplies J, Glockner FO. 2007. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB[J]. Nucleic Acids Res, 35(21): 7188-7196.
    [31] Salamov AA, Solovyev VV. 2000. Ab initio gene finding in Drosophila genomic DNA[J]. Genome Res, 10(4): 516-522.
    [32] Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, Lesniewski RA, Oakley BB, Parks DH, Robinson CJ, Sahl JW, Stres B, Thallinger GG, Van Horn DJ, Weber CF. 2009. Introducing mothur: open-source, platform independent, community-supported software for describing and comparing microbial communities[J]. Appl Environ Microbiol, 75(23): 7537-7541.
    [33] Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. 2009. ABySS: A parallel assembler for short read sequence data[J]. Genome Res, 19(6): 1117-1123.
    [34] Sun HX, Wang XJ. 2009. The development and future perspectives of DNA sequencing technology[J]. e-Science, 2(3): 19-29. [孙海汐, 王秀杰. 2009. DNA测序技术发展及其展望. e-Science技术, 2(3): 19-29.]
    [35] Sun RY, Li QF, Niu CJ, Lou AR. 2002. Basic Ecology [M]. Beijing: Higher Education Press: 112-144. [孙儒泳, 李庆芬, 牛翠娟, 娄安如. 2002. 基础生态学. 北京: 高等教育出版社: 112-114.]
    [36] Wang X. 2011. The Generation Algorithm Based on de Brujin Graph DNAContig[D]. Harbin: Harbin Institute of Technology. [王旭. 2011. 基于de Brujin图的DNAContig生成算法. 哈尔滨: 哈尔滨工业大学.]
    [37] Warren RL, Sutton GG, Jones SJ, Holt RA. 2007. Assembling millions of short DNA sequences using SSAKE[J]. Bioinformatics, 23(4): 500-501.
    [38] Wu QF. 2003. An introduction of several programs used in genomic analysis[J]. Hereditas, 25(6): 708-712. [吴清发. 2003. 基因组学研究中一些常用软件的概述. 遗传, 25(6): 708-712.]
    [39] Ye CX, Ma ZS, Cannon CH, Pop M, Yu DW. 2011a. SparseAssembler: de novo Assembly with the Sparse de Bruijn Graph.[Online] Available: arXiv.org/abs/ 1106.2603 (2011/6/14).
    [40] Ye CX, Cannon CH, Ma ZS, Yu DW, Pop M. 2011b. SparseAssembler2: Sparse k-mer Graph for Memory Efficient Genome Assembly.[Online] Available: arXiv.org/abs/1108.3556 (2011/8/17).
    [41] Ye CX, Ma ZS, Cannon CH, Pop M, Yu DW. 2012. Exploiting sparseness in de novo genome assembly[J]. BMC Bioinformatics, 13(S1): S1.
    [42] Ye J, Fang L, Zheng HK, Zhang Y, Chen J, Zhang ZJ, Wang J, Li ST, Li RQ, Bolund L, Wang J. 2006. WEGO: a web tool for plotting GO annotations[J]. Nucleic Acids Res, 34(S2): 293-297.
    [43] Zdobnov EM, Apweiler R. 2001. InterProScan: an integration platform for the signature-recognition methods in InterPro[J]. Bioinformatics, 17(9): 847-848.
    [44] Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs[J]. Genome Res, 18(5): 821-829.
    [45] Zhang H, Cui HZ. 2010. Metagenomics and its research progress[J]. China Animal Husbandry & Veterinary Medicine, 37(3): 87-90. [张辉, 崔焕忠. 2010. 宏基因组学及其研究进展, 中国畜牧兽医, 37(3): 87-90.]
    [46] Zhang EM, Hai R, Yu DZ. 2009. Research progress of gene prediction methods[J]. Chin J Vector Bio & Control, 20(3): 271-273. [张恩民, 海荣, 俞东征. 2009. 基因预测方法的研究进展. 中国媒介生物学及控制杂志, 20(3): 271-273.]
    [47] Zhou DQ. 1993. Laboratory Experiments in Microbiology[M]. Beijing: Higher Education Press, 396-398. [周德庆. 1993. 微生物学教程. 北京: 高等教育出版社, 396-398.]
    [48] Zhu HQ, Hu GQ, Yang YF, Wang J, She ZS. 2007. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes. BMC Bioinformatics, 8(1): 97.
Publication history
  • Received: 2012-09-17
  • Revised: 2012-11-15
  • Published: 2012-12-08
