pmiRScan: a LightGBM based method for prediction of animal pre-miRNAs

Amin N, McGrath A, Chen Y-PP (2019) Evaluation of deep learning in non-coding RNA classification. Nat Mach Intell 1:246–256. https://doi.org/10.1038/s42256-019-0051-2

Barik A, Das S (2018) A comparative study of sequence- and structure-based features of small RNAs and other RNAs of bacteria. RNA Biol 15:95–103. https://doi.org/10.1080/15476286.2017.1387709

Bartel DP (2004) MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116:281–297. https://doi.org/10.1016/S0092-8674(04)00045-5

Batuwita R, Palade V (2009) microPred: effective classification of pre-miRNAs for human miRNA gene prediction. Bioinformatics 25:989–995. https://doi.org/10.1093/bioinformatics/btp107

Bisong E (2019) Introduction to scikit-learn. In: Building machine learning and deep learning models on Google Cloud Platform. Apress, Berkeley, CA, pp 215–229

Bugnon LA, Yones C, Milone DH, Stegmayer G (2021) Genome-wide discovery of pre-miRNAs: comparison of recent approaches based on machine learning. Brief Bioinform 22. https://doi.org/10.1093/bib/bbaa184

Chen C, Tsai Y, Chang F, Lin W (2020) Ensemble feature selection in medical datasets: combining filter, wrapper, and embedded feature selection results. Expert Syst 37. https://doi.org/10.1111/exsy.12553

Chen PY, Manninga H, Slanchev K, Chien M, Russo JJ, Ju J, Sheridan R, John B, Marks DS, Gaidatzis D, Sander C, Zavolan M, Tuschl T (2005) The developmental miRNA profiles of zebrafish as determined by small RNA cloning. Genes Dev 19:1288–1293. https://doi.org/10.1101/gad.1310605

Chen T, Guestrin C (2016) XGBoost. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, NY, USA, pp 785–794

Fernandez A, Garcia S, Herrera F, Chawla NV (2018) SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary. J Artif Intell Res 61:863–905. https://doi.org/10.1613/jair.1.11192

Fromm B, Høye E, Domanska D, Zhong X, Aparicio-Puerta E, Ovchinnikov V, Umu SU, Chabot PJ, Kang W, Aslanzadeh M, Tarbier M, Mármol-Sánchez E, Urgese G, Johansen M, Hovig E, Hackenberg M, Friedländer MR, Peterson KJ (2022) MirGeneDB 2.1: toward a complete sampling of all major animal phyla. Nucleic Acids Res 50:D204–D210. https://doi.org/10.1093/nar/gkab1101

Fu X, Zhu W, Cai L, Liao B, Peng L, Chen Y, Yang J (2019) Improved pre-miRNAs identification through mutual information of pre-miRNA sequences and structures. Front Genet 10. https://doi.org/10.3389/fgene.2019.00119

Ganju A, Khan S, Hafeez BB, Behrman SW, Yallapu MM, Chauhan SC, Jaggi M (2017) miRNA nanotherapeutics for cancer. Drug Discov Today 22:424–432. https://doi.org/10.1016/j.drudis.2016.10.014

Gardner PP, Giegerich R (2004) A comprehensive comparison of comparative RNA structure prediction approaches. BMC Bioinf 5:140. https://doi.org/10.1186/1471-2105-5-140

Garg A, Roske Y, Yamada S, Uehata T, Takeuchi O, Heinemann U (2021) PIN and CCCH Zn-finger domains coordinate RNA targeting in ZC3H12 family endoribonucleases. Nucleic Acids Res 49:5369–5381. https://doi.org/10.1093/nar/gkab316

Gonzales GB, De Saeger S (2018) Elastic net regularized regression for time-series analysis of plasma metabolome stability under sub-optimal freezing condition. Sci Rep 8:3659. https://doi.org/10.1038/s41598-018-21851-7

Griffiths-Jones S (2006) miRBase: the microRNA sequence database. In: MicroRNA protocols. Humana, New Jersey, pp 129–138

Guan D-G, Liao J-Y, Qu Z-H, Zhang Y, Qu L-H (2011) mirExplorer: detecting microRNAs from genome and next generation sequencing data using the AdaBoost method with transition probability matrix and combined features. RNA Biol 8:922–934. https://doi.org/10.4161/rna.8.5.16026

Gudyś A, Szcześniak MW, Sikora M, Makałowska I (2013) HuntMi: an efficient and taxon-specific approach in pre-miRNA identification. BMC Bioinf 14:83. https://doi.org/10.1186/1471-2105-14-83

Hemphill E, Lindsay J, Lee C, Măndoiu II, Nelson CE (2014) Feature selection and classifier performance on diverse biological datasets. BMC Bioinf 15:S4. https://doi.org/10.1186/1471-2105-15-S13-S4

Hertel J, Stadler PF (2006) Hairpins in a haystack: recognizing microRNA precursors in comparative genomics data. Bioinformatics 22:e197–e202. https://doi.org/10.1093/bioinformatics/btl257

Jiang P, Wu H, Wang W, Ma W, Sun X, Lu Z (2007) MiPred: classification of real and pseudo microRNA precursors using random forest prediction model with combined features. Nucleic Acids Res 35:W339–W344. https://doi.org/10.1093/nar/gkm368

Jouravleva K, Golovenko D, Demo G, Dutcher RC, Hall TMT, Zamore PD, Korostelev AA (2022) Structural basis of microRNA biogenesis by Dicer-1 and its partner protein Loqs-PB. Mol Cell 82:4049–4063.e6. https://doi.org/10.1016/j.molcel.2022.09.002

Kleftogiannis D, Theofilatos K, Likothanassis S, Mavroudi S (2015) YamiPred: a novel evolutionary method for predicting pre-miRNAs and selecting relevant features. IEEE/ACM Trans Comput Biol Bioinform 12:1183–1192. https://doi.org/10.1109/TCBB.2014.2388227

Kotsiantis SB (2013) Decision trees: a recent overview. Artif Intell Rev 39:261–283. https://doi.org/10.1007/s10462-011-9272-4

Kozomara A, Birgaoanu M, Griffiths-Jones S (2019) miRBase: from microRNA sequences to function. Nucleic Acids Res 47:D155–D162. https://doi.org/10.1093/nar/gky1141

Lee RC, Feinbaum RL, Ambros V (1993) The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75:843–854. https://doi.org/10.1016/0092-8674(93)90529-Y

Liang L, Hu W, Zhang Y, Ma K, Gu Y, Tian B, Li H (2021) An algorithm with LightGBM + SVM fusion model for the assessment of dynamic security region. E3S Web Conf 256:02022. https://doi.org/10.1051/e3sconf/202125602022

Li W, Godzik A (2006) Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22:1658–1659. https://doi.org/10.1093/bioinformatics/btl158

Lorenz R, Bernhart SH, Höner zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL (2011) ViennaRNA Package 2.0. Algorithms Mol Biol 6:26. https://doi.org/10.1186/1748-7188-6-26

Lorenz R, Flamm C, Hofacker I, Stadler P (2020) Efficient computation of base-pairing probabilities in multi-strand RNA folding. In: Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies. SCITEPRESS - Science and Technology Publications, pp 23–31

Ma Y, Yu Z, Han G, Li J, Anh V (2018) Identification of pre-microRNAs by characterizing their sequence order evolution information and secondary structure graphs. BMC Bioinf 19:521. https://doi.org/10.1186/s12859-018-2518-2

Mendes ND, Freitas AT, Sagot M-F (2009) Current tools for the identification of miRNA genes and their targets. Nucleic Acids Res 37:2419–2433. https://doi.org/10.1093/nar/gkp145

Nasiri H, Alavi SA (2022) A novel framework based on deep learning and ANOVA feature selection method for diagnosis of COVID-19 cases from chest X-ray images. Comput Intell Neurosci 2022:1–11. https://doi.org/10.1155/2022/4694567

Natekin A, Knoll A (2013) Gradient boosting machines, a tutorial. Front Neurorobot 7. https://doi.org/10.3389/fnbot.2013.00021

Nazarov PV, Kreis S (2021) Integrative approaches for analysis of mRNA and microRNA high-throughput data. Comput Struct Biotechnol J 19:1154–1162. https://doi.org/10.1016/j.csbj.2021.01.029

Niaz NU, Shahariar KMN, Patwary MJA (2022) Class imbalance problems in machine learning: a review of methods and future challenges. In: Proceedings of the 2nd International Conference on Computing Advancements. ACM, New York, NY, USA, pp 485–490

Nithin C, Mukherjee S, Basak J, Bahadur RP (2022) NCodR: a multi-class support vector machine classification to distinguish non-coding RNAs in Viridiplantae. Quant Plant Biol 3:e23.
