Mao-Sen Ye, Jin-Yan Zhang, Dan-Dan Yu, Min Xu, Ling Xu, Long-Bao Lv, Qi-Yun Zhu, Yu Fan, Yong-Gang Yao. Comprehensive annotation of the Chinese tree shrew genome by large-scale RNA sequencing and long-read isoform sequencing. Zoological Research, 2021, 42(6): 692-709. doi: 10.24272/j.issn.2095-8137.2021.272

Comprehensive annotation of the Chinese tree shrew genome by large-scale RNA sequencing and long-read isoform sequencing

Funds: This study was supported by the National Natural Science Foundation of China (U1902215 to Y.G.Y. and 31970542 to Y.F.), Chinese Academy of Sciences (Light of West China Program xbzg-zdsys-201909 to Y.G.Y.), and Yunnan Province (202001AS070023 and 2018FB046 to D.D.Y. and 202002AA100007 to Y.G.Y.).
  • Corresponding author: E-mail: yaoyg@mail.kiz.ac.cn
  • Received Date: 2021-09-02
  • Accepted Date: 2021-09-23
  • Available Online: 2021-09-26
  • Publish Date: 2021-11-18
Abstract: The Chinese tree shrew (Tupaia belangeri chinensis) is emerging as an important experimental animal in multiple fields of biomedical research. Comprehensive reference genome annotation for both mRNA and long non-coding RNA (lncRNA) is crucial for developing animal models using this species. In the current study, we collected a total of 234 high-quality RNA sequencing (RNA-seq) datasets and two long-read isoform sequencing (ISO-seq) datasets and improved the annotation of our previously assembled high-quality chromosome-level tree shrew genome. We obtained a total of 3 514 newly annotated coding genes and 50 576 lncRNA genes. We also characterized the tissue-specific expression patterns and alternative splicing patterns of mRNAs and lncRNAs and mapped the orthologous relationships among 11 mammalian species using the current annotated genome. We identified 144 tree shrew-specific gene families, including interleukin 6 (IL6) and STT3 oligosaccharyltransferase complex catalytic subunit B (STT3B), which underwent significant changes in size. Comparison of the overall expression patterns in tissues and pathways across four species (human, rhesus monkey, tree shrew, and mouse) indicated that tree shrews are more similar to primates than to mice at the tissue-transcriptome level. Notably, the newly annotated purine rich element binding protein A (PURA) gene and the STT3B gene family showed dysregulation upon viral infection. The updated version of the tree shrew genome annotation (KIZ version 3: TS_3.0) is available at http://www.treeshrewdb.org and provides an essential reference for basic and biomedical studies using tree shrew animal models.
