Imputation of ungenotyped individuals based on genotyped relatives using Machine Learning Methodology
Published in: Quarterly Journal of Epigenetics, Volume: 2, Issue: 2
Year of publication: 1400 (Solar Hijri calendar)
Document type: Journal article
Language: English
The full text of this article is available as a 9-page PDF.
National scientific document ID:
JR_JEPUSB-2-2_002
Indexing date: 12 Tir 1401
Abstract:
Machine learning methods have been used in genetic studies to build models that predict missing genotypes for both human and animal genetic variation. Genotype imputation is the process of predicting unknown genotypes. The objective of this study was to investigate machine learning as an imputation approach, to compare family-based methods, and to improve imputation performance in different scenarios. The accuracies of two methods, support vector machine (SVM) and random forest (RF), were also compared. The final population was simulated in the form of different family structures: 100 families, each consisting of one sire with a different number of genotyped progeny (2, 3, 4, 5 or 7). The number of markers was set to 5000 for the whole genome. The sires in these families and other scenarios (BothParents, sire/dam and one progeny, sire and maternal grandsire) were defined to investigate the ability of the machine learning algorithms to impute genotypes. Imputation accuracy ranged from 0.78 to 0.99 across scenarios. The lowest imputation accuracy was obtained in the sire and maternal grandsire scenario with both methods. Increasing the number of progeny from 2 to 3 considerably increased imputation accuracy for both SVM and RF. Imputation of non-genotyped individuals based on parent-offspring trios and pairs of close relatives is possible. However, when only one progeny and one parent, both parents, or the sire and maternal grandsire are genotyped, average imputation accuracy does not exceed 85%. Genotyped progeny are the best source of information for predicting the genotypes of ungenotyped individuals, and when the number of progeny exceeds 4, imputation accuracy rises above 95%. These results confirm that machine learning methods applied to family trios offer good accuracy and computational speed, and can be used in the estimation of breeding values.
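The abstract does not describe the simulation design or the features given to the classifiers, so the following is only a minimal sketch of the general idea under stated assumptions: markers are independent (no linkage), allele frequency is 0.5, each progeny has its own unrelated dam, and a simple frequency-table classifier stands in for SVM/RF so the example runs with the Python standard library alone. All function names (`simulate_family`, `fit`, `accuracy`, `run`) are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def simulate_family(n_progeny, n_markers, rng, freq=0.5):
    """Simulate one sire and its progeny; genotypes are coded 0/1/2."""
    def random_alleles():
        # Two independent alleles, each 1 with probability `freq` (assumption).
        return (int(rng.random() < freq), int(rng.random() < freq))

    sire = [random_alleles() for _ in range(n_markers)]
    progeny = []
    for _ in range(n_progeny):
        child = []
        for m in range(n_markers):
            dam = random_alleles()  # a new unrelated dam per progeny (assumption)
            # Mendelian inheritance: one random allele from each parent.
            child.append(rng.choice(sire[m]) + rng.choice(dam))
        progeny.append(child)
    return [a + b for a, b in sire], progeny


def features(progeny, m):
    """Order-free summary of the progeny genotypes at marker m."""
    return tuple(sorted(child[m] for child in progeny))


def fit(families):
    """Learn the most frequent sire genotype for each progeny pattern."""
    table = defaultdict(Counter)
    for sire, progeny in families:
        for m, genotype in enumerate(sire):
            table[features(progeny, m)][genotype] += 1
    return {pattern: counts.most_common(1)[0][0] for pattern, counts in table.items()}


def accuracy(model, families, default=1):
    """Proportion of sire genotypes imputed correctly."""
    hits = total = 0
    for sire, progeny in families:
        for m, genotype in enumerate(sire):
            # Patterns never seen in training fall back to the heterozygote.
            hits += model.get(features(progeny, m), default) == genotype
            total += 1
    return hits / total


def run(n_progeny, seed=0, n_families=100, n_markers=200):
    """Train on half of the simulated families, test on the other half."""
    rng = random.Random(seed)
    families = [simulate_family(n_progeny, n_markers, rng) for _ in range(n_families)]
    split = n_families // 2
    model = fit(families[:split])
    return accuracy(model, families[split:])


if __name__ == "__main__":
    # Progeny counts follow the abstract; 200 markers keep the run short
    # (the study itself used 5000 markers).
    for n in (2, 3, 4, 5, 7):
        print(n, "progeny:", round(run(n), 3))
```

In this toy setup accuracy should rise as more progeny are genotyped, which mirrors the trend reported in the abstract. The absolute values are not comparable to the study's figures, because the sketch ignores linkage between markers and does not use SVM or RF.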
Authors
Naeem Rastin Bojnord
Department of Animal Science, Science and Research Branch, Islamic Azad University, Tehran, Iran
Mehdi Aminafshar
Department of Animal Science, Science and Research Branch, Islamic Azad University, Tehran, Iran
Mahmood Honarvar
Department of Animal Science, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran
Nasser Emam Jomeh Kashan
Department of Animal Science, Science and Research Branch, Islamic Azad University, Tehran, Iran