Imbalanced Learning Techniques for Land Subsidence Prediction: Ensemble Methods and Data Balancing Strategies

Publication year: 1404 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper is available as a 15-page PDF.

National scientific document ID:

EMGBC09_056

Indexing date: 1 Azar 1404

Abstract:

Land subsidence poses significant threats to infrastructure, the environment, and public safety, making accurate prediction essential for disaster risk reduction and sustainable resource management. This study addresses the critical challenge of class imbalance in land subsidence prediction datasets, where subsidence events are rare compared with stable ground conditions, leading to biased models that detect actual subsidence occurrences poorly. We propose and evaluate several imbalanced learning approaches, including random under-sampling, cost-sensitive algorithms, and ensemble methods (bagging and boosting), for predicting land subsidence in Chaharmahal and Bakhtiari province, Iran. The study uses a dataset of 516 subsidence locations identified through InSAR analysis, along with 13 conditioning factors covering geological, hydrological, environmental, and anthropogenic variables. The imbalanced learning techniques are systematically compared using precision, recall, F1-score, and ROC-AUC. Results demonstrate that random under-sampling followed by Random Forest achieves the most balanced performance, with precision, recall, and F1-score all reaching 94% and a ROC-AUC of 98.4%. Bagging applied directly to the imbalanced data achieves high recall (96%) and ROC-AUC (99%) but suffers from lower precision because of false positives. The fine-tuned models are used to generate land subsidence susceptibility maps for the entire study area, revealing that the eastern and southeastern regions exhibit the highest susceptibility. Risk analysis shows that random under-sampling is the more conservative method, producing the most balanced risk distribution, with 9.1% and 8.1% of the area classified as high and very high risk, respectively. The findings highlight the critical importance of addressing class imbalance for achieving reliable subsidence prediction. This research provides valuable insights for improving early warning systems and supporting informed decision-making for land subsidence risk management.
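
The abstract names random under-sampling and the precision, recall, and F1-score metrics without showing them. The following is a minimal, standard-library sketch of those two ideas only, not the authors' implementation: the function names `random_undersample` and `precision_recall_f1`, the seed, and the toy 10-versus-90 dataset are all assumptions made for illustration, and no classifier (such as Random Forest) is trained here.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly drop samples from the larger classes until every class
    has as many samples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        # Sample without replacement so no point is duplicated.
        kept.extend(rng.sample(idxs, n_min))
    kept.sort()
    return [X[i] for i in kept], [y[i] for i in kept]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive (subsidence) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, made-up data (NOT the study's dataset):
# 10 "subsidence" points (label 1) versus 90 "stable" points (label 0).
X = [[float(i)] for i in range(100)]
y = [1] * 10 + [0] * 90

X_bal, y_bal = random_undersample(X, y, seed=42)
class_counts = Counter(y_bal)  # both classes now have 10 samples

# Metrics on a tiny hand-made prediction vector.
metrics = precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0])
```

In a setup like the one the abstract describes, the balanced `X_bal` and `y_bal` would be passed to a classifier, and the metrics would be computed on held-out data that keeps its original class ratio, so that the reported precision reflects the false positives discussed above.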

Authors

Khayyam Salehi

Department of Computer Science, Faculty of Mathematical Sciences, Shahrekord University, Iran

Maryam Karimi

Department of Computer Science, Faculty of Mathematical Sciences, Shahrekord University, Iran

Khosro Keyani

Department of Civil Engineering, Shahrekord Branch, Islamic Azad University, Shahrekord, Iran