Structural Variation Detection from Paired-end NGS data using Hidden Markov Model

Publication year: 1400 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper has not been provided and is not available.

National scientific document ID:

IBIS10_216

Indexing date: 5 Tir 1401

Abstract:

The objective of the present study was to identify regions of variation in the samples' genomes using a Hidden Markov Model. Single-read data from Illumina technology show the same correlation as Array-CGH data, so almost the same algorithms can be applied. In this study, Hidden Markov Models were applied to the more precise and abundant paired-end data, which is unusual for this type of algorithm. Two identification methods were used [2, 3]: 1) in the first method, ratios relative to normal samples were extracted using a specific threshold, the variant regions were labelled, and the Hidden Markov Model was then applied; 2) the second method used ground truth data and a support vector machine (SVM) to label variant regions, after which the Hidden Markov Model was applied to re-label them. Finally, to evaluate the model, artificial data were generated by simulation. After the variant regions were identified by the Hidden Markov Model, the percentages of detected duplications, deletions, and translocations were calculated. The novelty of this study lies in identifying structural variations in paired-end data with a Hidden Markov Model, whereas previous studies used single-read data. Furthermore, this study identifies translocations using a Hidden Markov Model for the first time.
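The first method described in the abstract (threshold labelling followed by a Hidden Markov Model) can be illustrated with a minimal sketch. The abstract does not give the thresholds, the state set, or the model parameters, so everything below is an assumption for illustration: the three states, the cutoffs `low` and `high`, the probabilities `stay` and `correct`, and the function names `threshold_labels` and `viterbi` are all hypothetical. The sketch covers only deletions and duplications from depth ratios; translocations, which depend on paired-end mapping information, are not modelled here.

```python
import math

# Assumed hidden states (illustrative only; not taken from the paper).
STATES = ("deletion", "normal", "duplication")


def threshold_labels(ratios, low=0.75, high=1.25):
    """Label each window from its sample/normal depth ratio.

    Returns 0 (deletion), 1 (normal) or 2 (duplication) per window.
    The cutoffs are assumed values, not the paper's.
    """
    labels = []
    for r in ratios:
        if r < low:
            labels.append(0)
        elif r > high:
            labels.append(2)
        else:
            labels.append(1)
    return labels


def viterbi(obs, stay=0.9, correct=0.8):
    """Return the most likely hidden state path for the threshold labels.

    stay:    assumed probability of remaining in the same state.
    correct: assumed probability that a label matches the true state.
    """
    if not obs:
        return []
    n = len(STATES)
    log_trans = [[math.log(stay if i == j else (1 - stay) / (n - 1))
                  for j in range(n)] for i in range(n)]
    log_emit = [[math.log(correct if i == k else (1 - correct) / (n - 1))
                 for k in range(n)] for i in range(n)]
    # Uniform start distribution.
    score = [math.log(1 / n) + log_emit[s][obs[0]] for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, score, pointers = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[s][o])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Made-up depth ratios: one noisy window, then a four-window deletion.
ratios = [1.0, 1.02, 0.98, 1.4, 1.0, 1.01, 0.5, 0.45, 0.55, 0.5, 1.0, 0.97]
raw = threshold_labels(ratios)
smoothed = viterbi(raw)
print([STATES[s] for s in smoothed])
```

With these assumed parameters, the single high-ratio window is absorbed into the surrounding normal region, while the run of four low-ratio windows is kept as a deletion. This is the kind of re-labelling the HMM step is meant to provide over raw thresholding.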

Authors

Muhammad Amin Rahimi

Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran