Yu-Hao Yang, Juan Yang, Zi-Jia Liu, Yuan Li, Weibo Song, Naomi Stover, Xiao Chen. 2026. Deciphering ribosomal frameshifting determinants across species with a semi-supervised hybrid learning framework. Zoological Research. DOI: 10.24272/j.issn.2095-8137.2025.648

Deciphering ribosomal frameshifting determinants across species with a semi-supervised hybrid learning framework

  • Programmed ribosomal frameshifting (PRF) is a conserved translational recoding mechanism that expands proteome diversity and regulates important biological processes across viruses, prokaryotes, and eukaryotes. However, existing predictors are often limited by their reliance on extensive annotated data, poor performance in non-model organisms, and restricted applicability across species or frameshift types. Here, we present FScanpy, a hybrid model that combines histogram-based gradient boosting (HistGB) with a BiLSTM-CNN architecture for cross-species PRF prediction. The model integrates short-range codon-position features associated with ribosomal pausing with long-range sequence context that reflects structural constraints around candidate shift sites. By applying semi-supervised learning to 8,049 candidate sequences, FScanpy reduces reliance on manual annotation and improves generalization. Across multiple datasets, it achieved an area under the curve (AUC) of 0.951, and it also performed well on euplotid protists (AUC = 0.877), supporting its utility in non-model systems with non-canonical recoding patterns. Feature interpretation further identified AAA and GGG motifs, together with mRNA secondary-structure stability, as major determinants across datasets, highlighting conserved features of PRF from viruses to eukaryotes. These insights not only support the model’s accuracy but also deepen understanding of PRF’s conserved biological roles. By addressing challenges in feature representation and species generalizability, FScanpy provides an effective computational framework for PRF analysis, enabling the study of frameshifting sites and accelerating the discovery of evolutionarily divergent recoding events.
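For readers unfamiliar with the AUC values quoted above, the following is a minimal sketch of how an AUC can be computed from binary labels and predicted scores, using the rank-based definition (the probability that a randomly chosen positive outscores a randomly chosen negative, with ties counted as one half). The function name `roc_auc` and the toy labels and scores are illustrative assumptions; this is not FScanpy's evaluation code, and the numbers are not from the paper.

```python
def roc_auc(labels, scores):
    """Rank-based AUC for binary labels (1 = frameshift site, 0 = background).

    Hypothetical helper for illustration only; not part of FScanpy.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Count positive/negative pairs in which the positive scores higher;
    # a tie contributes half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up): 3 positives, 3 negatives, one misordered pair.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for a toy illustration; for datasets of realistic size a sort-based implementation or an established library routine would be the more practical choice.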