Similarity detection between modern human genome and their ancestors DNA sequences by Deep Learning
Year of publication: 1400 (Solar Hijri)
Document type: Conference paper
Language: English
The full text of this paper has not been provided and is not available.
National scientific document ID:
IBIS10_043
Indexing date: 5 Tir 1401
Abstract:
Neanderthals were a species of human that lived in Europe and parts of western Asia, Central Asia, and northern China (Altai). The earliest signs of Neanderthals date back to about 350,000 years ago in Europe. There is ample genetic evidence that modern humans interbred with Neanderthals, Denisovans, and other archaic relatives.

In this study, we used deep learning to identify regions of Neanderthal introgression in the modern human genome. Existing methods for detecting Neanderthal ancestry in the genome, such as the hidden Markov model (HMM), are memoryless and do not account for relationships between distant nucleotides along a DNA sequence. We therefore used deep learning to process raw genomic sequences and to capture long-range dependencies between nucleotides with a long short-term memory (LSTM) network. This model performs better than linear models such as support vector machines (SVMs) or naive Bayes classifiers, so we recommend the LSTM approach for analyzing ancient biological data.

We first converted the DNA sequences into space-delimited k-mers. We then used a bag-of-words model to compare k-mer frequencies between sequences inherited from Neanderthals and sequences depleted of archaic ancestry. Finally, to classify the sequences, we learned word embeddings in a sequential model using the Keras Embedding layer. The model achieved an accuracy of 87.6% on the data set when classifying Neanderthal-introgressed sequences against depleted ones.

In the near future, we aim to use the LSTM model to look for similarities between modern humans and their ancestors in genomic data from patients with skin diseases.
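
The first two preprocessing steps mentioned in the abstract (k-mer conversion and a bag-of-words frequency comparison) might look roughly like the sketch below. This is not the authors' code: the function names `to_kmers` and `kmer_frequencies`, the choice of k = 4, the reading of the k-mers as words joined by spaces, and the toy sequences are all assumptions made for illustration. The embedding and LSTM stages are not shown.

```python
from collections import Counter


def to_kmers(seq, k=4):
    """Split a DNA sequence into overlapping k-mers joined by spaces.

    Hypothetical helper: treats each k-mer as a "word" so the sequence
    can be handled like a sentence.
    """
    seq = seq.upper()
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_frequencies(sentences):
    """Bag of words: relative frequency of each k-mer over all sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Toy sequences invented for illustration only (not real genomic data).
introgressed = ["ACGTACGTAC", "ACGTTTACGT"]
depleted = ["GGGCCCGGGC", "CCCGGGCCCG"]

freq_introgressed = kmer_frequencies(to_kmers(s) for s in introgressed)
freq_depleted = kmer_frequencies(to_kmers(s) for s in depleted)

# Rank k-mers by how strongly their frequency differs between the classes.
all_kmers = set(freq_introgressed) | set(freq_depleted)
ranked = sorted(
    all_kmers,
    key=lambda m: abs(freq_introgressed.get(m, 0.0) - freq_depleted.get(m, 0.0)),
    reverse=True,
)

for kmer in ranked[:5]:
    print(kmer, freq_introgressed.get(kmer, 0.0), freq_depleted.get(kmer, 0.0))
```

In a fuller pipeline, the space-joined k-mer strings would presumably be tokenized into integer indices before being fed to an embedding layer; that step depends on details the abstract does not give.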
Authors
Keivan Naseri
Department of Bioinformatics, Kish International Campus University of Tehran, Kish, Iran
Mahboobeh Golchinpour
Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran