Machine learning model for random forest acute oral toxicity prediction

Publication year: 1404 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this article is available as an 18-page PDF.

National scientific document ID: JR_GJESM-11-1_002

Indexing date: 11 Dey 1403

Abstract:

BACKGROUND AND OBJECTIVES: Reliable and precise prediction of acute oral toxicity is important for chemical safety and for advancing the sustainable development goals (SDGs), particularly SDG 3 (good health and well-being), SDG 6 (clean water and sanitation), and SDG 12 (responsible consumption and production). Traditional toxicity assessments are often time-consuming and costly, which motivates the search for more efficient approaches. This study aims to establish the most efficient method for constructing reliable and precise toxicity prediction models.

METHODS: Random forests, a robust ensemble method, were evaluated for predicting acute oral toxicity using a comprehensive dataset from the National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM) and the Environmental Protection Agency National Center for Computational Toxicology (NCCT). The dataset showed significant class imbalance: 8 percent very toxic versus 92 percent not very toxic. To address this imbalance, cost-sensitive learning and data resampling techniques, including both undersampling and oversampling, were used. A diverse set of two-dimensional molecular descriptors generated with the Rational Discovery Kit (RDKit) served as input features, and preprocessing involved normalization, validation, and feature selection.
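The two imbalance strategies named in the methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which weighting scheme or sampler was used, so the inverse-frequency weights below (the same heuristic scikit-learn applies for `class_weight="balanced"`) and the simple random undersampler are assumptions, and the toy data merely mirrors the reported 8 percent / 92 percent split.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_undersample(rows, labels, seed=0):
    """Keep all minority-class rows and an equally sized random sample of every other class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    minority_size = min(counts.values())
    kept = []
    for cls in counts:
        idx = [i for i, y in enumerate(labels) if y == cls]
        kept.extend(rng.sample(idx, minority_size))
    kept.sort()  # preserve the original row order
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy data: 1 = very toxic, 0 = not very toxic.
labels = [1] * 80 + [0] * 920
rows = [[float(i)] for i in range(len(labels))]  # placeholder descriptor vectors

weights = balanced_class_weights(labels)            # cost-sensitive learning
rows_bal, labels_bal = random_undersample(rows, labels)  # resampling
```

With these weights, a misclassified toxic compound costs about 11.5 times as much as a misclassified non-toxic one (6.25 versus roughly 0.54), whereas undersampling reaches a 1:1 class ratio by discarding most of the majority class.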
Hyper-parameter tuning was conducted using Bayesian optimization and cross-validation, and the performance of random forests was compared with that of gradient boosting, extreme gradient boosting, artificial neural networks, and the generalized linear model.

FINDINGS: The random forest models, particularly those using undersampling and cost-sensitive learning, performed best, achieving a sensitivity of 0.81, a specificity of 0.85, an accuracy of 0.85, and an area under the receiver operating characteristic curve of 0.89 on an independent test set. Feature importance analysis showed that the most influential molecular descriptors were those related to the van der Waals surface area and molecular quantum numbers. A surrogate decision tree developed from the random forest predictions reached an area under the curve of 0.929.

CONCLUSION: Random forest models effectively predicted acute oral toxicity, particularly when class imbalance was addressed through cost-sensitive learning and resampling. By leveraging explainable artificial intelligence techniques, including permutation feature importance, surrogate decision tree analysis, and local interpretable model-agnostic explanations, this study identified key molecular descriptors driving toxicity. This improves model interpretability and represents a significant step toward enhancing chemical safety while supporting the sustainable development goals.
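The reported metrics can be related to one another with a short sketch. The functions below are generic textbook definitions, not code from the paper, and the toy labels and scores are hypothetical. The last function rests on an assumption the abstract does not state, namely that the independent test set has roughly the same 8 percent prevalence of very toxic compounds as the full dataset.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy is the prevalence-weighted mean of sensitivity and specificity."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


# Hypothetical toy example: five compounds, predictions thresholded at 0.5.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)   # share of toxic compounds detected
specificity = tn / (tn + fp)   # share of non-toxic compounds cleared
auc = roc_auc(y_true, scores)

# Reported rates combined with the assumed 8 percent test-set prevalence.
implied_accuracy = accuracy_from_rates(0.81, 0.85, 0.08)
```

Under that prevalence assumption, the implied accuracy is 0.08 × 0.81 + 0.92 × 0.85 ≈ 0.847, which agrees with the reported 0.85. It also shows why accuracy on such imbalanced data mostly reflects specificity and says little about how many toxic compounds are caught.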

Keywords:

Explainable Artificial Intelligence, Machine Learning (ML), Random forest (RF), Rational Discovery Kit (RDKit), Sustainable Development Goals (SDGs), Toxicity

Authors

A.M. Elsayad

Department of Electrical Engineering, College of Engineering in Wadi Alddawasir, Prince Sattam Bin Abdulaziz University, Wadi Alddawasir 11991, Saudi Arabia

M.M. Zeghid

Computer Engineering and Networks Department, College of Engineering, PSAU, Saudi Arabia

K.A. Elsayad

Pharmacy Department, Cairo University Hospitals, Cairo University, Cairo 11662, Egypt

A.N. Khan

Department of Electrical Engineering, University of Engineering and Technology Peshawar, Pakistan

A.K.M. Baareh

Department of Applied Science, Ajloun University College, Al-Balqa Applied University, Ajloun, Jordan

A. Sadiq

Department of Electrical Engineering, University of Engineering and Technology Peshawar, Pakistan

S.A. Mukhtar

Computers and Systems Department, Electronics Research Institute, Cairo, Egypt

H.F. Ali

Computers and Systems Department, Electronics Research Institute, Cairo, Egypt

S. Abd El-kader

Computers and Systems Department, Electronics Research Institute, Cairo, Egypt
