A Deep Learning Based Method for Predicting DNA N6-Methyladenine (6mA) Sites in Eukaryotes

dc.contributor.author: Roland, L.H.
dc.contributor.author: Wannige, C.T.
dc.date.accessioned: 2021-02-01T07:19:45Z
dc.date.available: 2021-02-01T07:19:45Z
dc.date.issued: 2020
dc.description.abstract: DNA N6-methyladenine (6mA) is an epigenetic modification involved in many biological regulation processes, such as DNA replication, DNA repair, transcription, and gene expression regulation. Until recently, it was unclear how widespread the 6mA modification is in eukaryotes, so DNA 6mA remains insufficiently studied in these organisms. Accurate genome-wide identification of 6mA sites provides a deeper understanding of the epigenetic modification process and the biological processes it involves. Existing experimental techniques are time-consuming, computational machine learning methods have room for performance improvement, and cross-species prediction of 6mA sites shows low performance. Hence, there is a need for a highly accurate, time-efficient method to predict the distribution of 6mA sites in eukaryotes. Deep learning models have shown higher accuracy in many bioinformatics experiments. In this regard, we develop a customized VGG16-based model using convolutional neural networks. We introduce a novel 3-dimensional encoding mechanism that extends the one-hot encoding method so that the given DNA sequences of length 41 bp can serve as VGG16 model input. In 10-fold cross-validation on the benchmark datasets, the proposed model achieves higher accuracies for the cross-species, rice, and M. musculus genomes. The cross-species dataset was prepared by integrating the benchmark datasets of rice and M. musculus. The model outperforms the existing computational tools SNNRice6mA and iIM-CNN, with a current validation accuracy of 97% for the prediction of 6mA sites. The model trained with cross-species data predicts 6mA sites of other species (Arabidopsis thaliana, Rosa chinensis, Drosophila, and yeast) with a prediction accuracy over 70%. Thus, this model can be used for the genome-wide prediction of 6mA sites in eukaryotes.
Keywords: DNA sequence encoding method, Deep learning, Epigenetics, Bioinformatics, DNA N6-methyladenine [en_US]
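The abstract mentions an encoding that extends one-hot encoding to three dimensions but does not spell out how. As a rough illustration of the starting point only, the sketch below shows ordinary one-hot encoding of a 41 bp sequence, followed by one possible (assumed, not the authors') way of adding a third axis. The channel order `ACGT`, the helper names `one_hot` and `add_channel_axis`, and the example sequence are all made up for this illustration.

```python
# Plain one-hot encoding of a DNA sequence (illustrative sketch only;
# this is NOT the 3-dimensional encoding proposed in the abstract).
ALPHABET = "ACGT"  # assumed column order, chosen for this example


def one_hot(seq):
    """Return a len(seq) x 4 matrix; bases outside ACGT become all zeros."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def add_channel_axis(matrix):
    """Wrap every entry in a list, giving a nested shape of (length, 4, 1).

    This is just one hypothetical way to obtain a 3-D array from a
    one-hot matrix; the source does not describe its own extension.
    """
    return [[[value] for value in row] for row in matrix]


# Made-up 41 bp sequence: 20 bases, one central base, 20 bases.
window = "ACGT" * 5 + "A" + "TGCA" * 5
encoded = one_hot(window)
tensor = add_channel_axis(encoded)

# Print the nested dimensions of the resulting structure.
print(len(tensor), len(tensor[0]), len(tensor[0][0]))
```

Each row of `encoded` has exactly one non-zero entry for a valid base, so a 41 bp sequence becomes a 41 x 4 binary matrix before any further reshaping.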
dc.identifier.isbn: 9789550481293
dc.identifier.uri: http://www.erepo.lib.uwu.ac.lk/bitstream/handle/123456789/5721/proceeding_oct_08-198.pdf?sequence=1&isAllowed=y
dc.language.iso: en [en_US]
dc.publisher: Uva Wellassa University of Sri Lanka [en_US]
dc.relation.ispartofseries: International Research Conference
dc.subject: Computer Science [en_US]
dc.subject: Information Science [en_US]
dc.subject: Computing and Information Management [en_US]
dc.subject: Biotechnology [en_US]
dc.title: A Deep Learning Based Method for Predicting DNA N6-Methyladenine (6mA) Sites in Eukaryotes [en_US]
dc.title.alternative: International Research Conference 2020 [en_US]
dc.type: Other [en_US]
Files
Original bundle
Name: proceeding_oct_08-198.pdf
Size: 31.75 KB
Format: Adobe Portable Document Format