RESEARCH
Splice Site Prediction using Deep Learning
Spring 2021 – present
Keywords: Splice Site; Machine Learning; Deep Learning; Convolutional Neural Network; Ribonucleic Acid; Genome
Description:
Protein-coding genes in eukaryotes are split into coding regions (exons) and non-coding regions (introns), and both are copied into the precursor messenger ribonucleic acid (pre-mRNA). Before the mature mRNA carries the genetic information out of the nucleus to be expressed (translated into protein), splicing occurs. This process removes the introns from the transcript and joins the exons, and it is carried out by an RNA-protein complex called the spliceosome. A splice site is a point where an exon and an intron meet. The acceptor splice site lies at the intron-exon boundary (the 3′ end of the intron) and is marked by the consensus dinucleotide adenine-guanine (AG), read in the 5′ to 3′ orientation. The donor splice site lies at the exon-intron boundary (the 5′ end of the intron) and is marked by the consensus dinucleotide GT (GU in the RNA transcript), also read in the 5′ to 3′ direction.
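To make the consensus dinucleotides concrete, the following minimal Python sketch scans a DNA string for every GT and AG and reports their positions as candidate donor and acceptor sites. It is an illustration only, not the method used in this project: the function name find_candidate_sites and the toy sequence are hypothetical. Most GT and AG dinucleotides in a real genome are not true splice sites, which is why a learned classifier is needed to separate real sites from decoys.

```python
def find_candidate_sites(seq):
    """Return 0-based start positions of GT (candidate donor) and
    AG (candidate acceptor) dinucleotides, read 5' to 3'.

    Hypothetical helper for illustration; it only matches the
    consensus dinucleotides and does not predict true splice sites.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors


# Toy sequence (made up for this example, not real genomic data).
toy_sequence = "ATGGTAAGCTTTCAGGC"
donors, acceptors = find_candidate_sites(toy_sequence)
print("candidate donor positions:", donors)
print("candidate acceptor positions:", acceptors)
```

Each candidate position would then be scored by a model that looks at the surrounding sequence context.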
Although canonical GT/AG splice sites account for nearly all splice sites, accurately predicting splice sites also supports the prediction of alternative splicing. Alternative splicing is an important feature of eukaryotes: it allows them to produce different proteins from a single gene. Our research focuses on splice site prediction, because precise localization of splice sites can contribute significantly to identifying and analyzing gene structure and function.
Publications:
Code:
All our algorithms are public, open-source, and freely accessible through our GitHub repository.