miRNAFinder: An accurate plant pre-microRNA classifier with an
analysis of feature impact
Sandali Lokuge*1, Shyaman Jayasundara1,
Puwasuru Ihalagedara1 and Damayanthi Herath1
1 Department of Computer Engineering, Faculty of Engineering, University of Peradeniya, Sri Lanka
*E-mail: [email protected]
Abstract: MicroRNAs (miRNAs) are endogenous small noncoding RNAs that play an important role in post-transcriptional
gene regulation. Several machine learning-based studies have addressed miRNA identification
using miRNA features. Distinguishing real from pseudo pre-miRNAs is more difficult in plants than
in animals because plant pre-miRNAs are more diverse than animal pre-miRNAs. This study therefore
focuses on classifying real and pseudo pre-miRNAs in plants. We introduce a machine learning model based
on a set of 280 features comprising compositional, triplet element, motif, and thermodynamic features. We tested and
compared classification performance across different feature sets and four machine learning classifiers.
The random forest classifier gave the best classification performance with all 280 features, reaching 97%
accuracy on the training dataset.
Keywords - microRNA, machine learning, microRNA classification, pre-miRNA, plant
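
To make two of the feature families named in the abstract concrete, the following is a minimal sketch of how compositional (k-mer frequency) and triplet element features might be computed for one candidate sequence. It is not the authors' implementation: the function names `kmer_frequencies` and `triplet_elements` are hypothetical, the dot-bracket secondary structure is assumed to be supplied by an external folding tool, and the triplet encoding is a simplified version of the commonly used 32-element scheme (middle nucleotide plus the paired or unpaired state of three adjacent positions). The exact definitions behind the 280 features may differ.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGU"


def kmer_frequencies(seq, k=2):
    """Relative frequency of every k-mer over ACGU (compositional features).

    Hypothetical helper: windows containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def triplet_elements(seq, structure):
    """Simplified triplet element frequencies (4 nucleotides x 8 states = 32).

    `structure` is assumed to be a dot-bracket string of the same length as
    `seq`. Both bracket types are collapsed to '(' (paired); '.' is unpaired.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    seq = seq.upper().replace("T", "U")
    states = structure.replace(")", "(")
    keys = [n + "".join(s) for n in NUCLEOTIDES for s in product("(.", repeat=3)]
    # Each interior position contributes its nucleotide plus the states of
    # the previous, current, and next positions.
    counts = Counter(seq[i] + states[i - 1:i + 2] for i in range(1, len(seq) - 1))
    total = sum(counts[key] for key in keys)
    return {key: (counts[key] / total if total else 0.0) for key in keys}


def feature_vector(seq, structure):
    """Concatenate dinucleotide and triplet element features in a fixed order."""
    di = kmer_frequencies(seq, k=2)
    tri = triplet_elements(seq, structure)
    return [di[m] for m in sorted(di)] + [tri[t] for t in sorted(tri)]
```

A vector built this way (16 + 32 values here) could then be combined with motif and thermodynamic features and passed to a classifier such as a random forest; those steps are left out because they depend on tools and parameters not specified in this section.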