NeuroTIS+: An Improved Method for Translation Initiation Site Prediction in Full-Length mRNA Sequence via Primary Structural Information

Bibliographic Details
Main Authors: Wenqiu Xiao, Chao Wei
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/14/7866
Description
Summary: Translation initiation site (TIS) prediction in mRNA sequences is an essential component of transcriptome annotation and plays a crucial role in deciphering gene expression and regulation mechanisms. Numerous computational methods have been proposed and have achieved acceptable prediction accuracy. In our previous work, we developed NeuroTIS, a method for TIS prediction based on a hybrid dependency network combined with a deep learning framework that explicitly models label dependencies both within coding sequences (CDSs) and between CDSs and TISs. However, this method does not fully exploit the primary structural information within mRNA sequences. First, it captures label dependency only among three neighboring codon labels. Second, it neglects the heterogeneity of negative TISs originating from different reading frames, which exhibit distinct coding features in their vicinity. In this paper, under the framework of NeuroTIS, we propose an enhanced version, NeuroTIS+, which allows for more sophisticated codon label dependency modeling via temporal convolution and homogeneous feature building through an adaptive grouping strategy. Tests on transcriptome-wide human and mouse datasets demonstrate that the proposed method yields excellent prediction performance, significantly surpassing the existing state-of-the-art methods.
ISSN: 2076-3417
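
Illustrative note: the summary observes that negative TISs come from different reading frames. The sketch below shows one simple way such candidates could be separated by frame. It is not the grouping strategy of NeuroTIS+, whose details are not given in this record. The function name `group_candidate_tis_by_frame`, the toy sequence, and the choice to treat every non-annotated ATG as a negative candidate are all assumptions made for illustration.

```python
from collections import defaultdict


def group_candidate_tis_by_frame(mrna: str, true_tis: int) -> dict[int, list[int]]:
    """Group negative candidate start codons (ATG) by reading frame.

    Hypothetical helper: frame 0 means the candidate is in frame with the
    annotated TIS at index `true_tis`; frames 1 and 2 are the shifted frames.
    """
    # Accept RNA or DNA alphabets by mapping U to T (assumption for convenience).
    seq = mrna.upper().replace("U", "T")
    groups: dict[int, list[int]] = defaultdict(list)
    for pos in range(len(seq) - 2):
        # Every ATG other than the annotated TIS is treated as a negative candidate.
        if seq[pos:pos + 3] == "ATG" and pos != true_tis:
            frame = (pos - true_tis) % 3
            groups[frame].append(pos)
    return dict(groups)


# Toy sequence (not real data); the annotated TIS is the ATG at index 2.
example = "CCATGGCATGAAATGATGCTAA"
groups = group_candidate_tis_by_frame(example, true_tis=2)
print(groups)  # {2: [7], 1: [12, 15]}
```

Each frame group could then be given its own feature set or classifier, which is one plausible reading of "homogeneous feature building"; the record itself does not say how the groups are used.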