Please use this identifier to cite or link to this item: http://223.31.159.10:8080/jspui/handle/123456789/1825
Full metadata record
DC Field | Value | Language
dc.contributor.author | Hamid, Fiza |
dc.contributor.author | Mukherjee, Kanka |
dc.contributor.author | Chaudhary, Sakshi |
dc.contributor.author | Kaushik, Love |
dc.contributor.author | Kumar, Shailesh |
dc.date.accessioned | 2026-06-11T07:09:38Z |
dc.date.available | 2026-06-11T07:09:38Z |
dc.date.issued | 2026 |
dc.identifier.citation | DNA Research, (In Press) | en_US
dc.identifier.issn | 1756-1663 |
dc.identifier.other | https://doi.org/10.1093/dnares/dsag005 |
dc.identifier.uri | https://academic.oup.com/dnaresearch/advance-article/doi/10.1093/dnares/dsag005/8704208?login=true |
dc.identifier.uri | http://223.31.159.10:8080/jspui/handle/123456789/1825 |
dc.description | Accepted date: 26 May 2026 | en_US
dc.description.abstract | Fusion genes play crucial roles in plant biological processes but remain far less explored than their human counterparts, largely due to limited validated datasets and the absence of plant-specific prediction tools. Existing approaches often produce high false-positive rates, restricting reliable discovery. To address this gap, we developed Plant Fusion Gene Predictor (PFGPred), an ensemble machine learning framework that integrates Random Forest, XGBoost, and long short-term memory (LSTM) models into a meta-classifier for accurate identification of true and false fusion genes from RNA-Seq data. PFGPred was trained on a high-confidence dataset of fusion genes validated by both RNA-Seq and whole-genome sequencing from Arabidopsis thaliana, Oryza sativa, Triticum aestivum, and Zea mays, to predict and rank candidate fusion genes for future functional validation. It outperformed individual baseline models, achieving accuracies of 0.97 on training data and 0.77 on independent test data. When evaluated on human datasets, it achieved 0.71 accuracy with lower sensitivity, reflecting biological differences between plant and human fusion events. Comparative analyses confirmed that PFGPred reliably identifies validated fusions, demonstrating its utility as a cost-effective, plant-specific prediction tool for high-throughput fusion gene screening and functional genomics research. It is freely available as a web server at http://www.nipgr.ac.in/PFGPred. | en_US
dc.description.sponsorship | The authors gratefully acknowledge the BRIC-National Institute of Plant Genome Research (NIPGR), New Delhi, for providing research support. The authors extend their gratitude to the DBT e-Library Consortium (DeLCON) for providing access to e-material and the Computational Biology & Bioinformatics Facility (CBBF) of the NIPGR for their support. | en_US
dc.language.iso | en_US | en_US
dc.publisher | Oxford University Press | en_US
dc.subject | Fusion Transcripts | en_US
dc.subject | Gene Fusion | en_US
dc.subject | Machine Learning | en_US
dc.subject | Plant Fusion Gene | en_US
dc.subject | RNA Sequencing | en_US
dc.subject | Whole-Genome Sequencing | en_US
dc.title | PFGPred: A stack ensemble classifier for the identification of fusion genes in plants | en_US
dc.type | Article | en_US
Appears in Collections: Institutional Publications

Files in This Item:
File | Description | Size | Format
Kumar Shai_2026_6.pdf (Restricted Access) | | 714.71 kB | Adobe PDF


Items in IR@NIPGR are protected by copyright, with all rights reserved, unless otherwise indicated.