Investigation of Deep Learning Optimization Algorithms in Scene Text Detection
Year of publication: 1402 (Solar Hijri)
Document type: Journal article
Language: English
The full text of this paper is available as a 12-page PDF file.
National scientific document ID:
JR_IECO-6-3_002
Indexing date: 10 Aban 1402
Abstract:
Scene text detection frameworks rely heavily on optimization methods for their successful operation. Choosing an appropriate optimizer is essential to the performance of recent scene text detection models. However, recent deep learning methods often employ various optimization algorithms and loss functions without explicitly explaining their selections. This paper presents a segmentation-based text detection pipeline capable of handling arbitrary-shaped text instances in images captured in the wild. We explore the effectiveness of well-known deep-learning optimizers in enhancing the pipeline's capabilities. Additionally, we introduce a novel Segmentation-based Attention Module (SAM) that enables the model to capture long-range dependencies of multi-scale feature maps and focus more accurately on regions likely to contain text instances. The performance of the proposed architecture is extensively evaluated through ablation experiments that explore the impact of different optimization algorithms and of the introduced SAM block. Furthermore, we compare the final model against state-of-the-art scene text detection techniques on three publicly available benchmark datasets, namely ICDAR15, MSRA-TD500, and Total-Text. Our experimental results demonstrate that the focal loss combined with the Stochastic Gradient Descent (SGD) + Momentum optimizer and a poly learning-rate policy achieves more robust and generalized detection performance than the other optimization strategies. Moreover, the architecture we use, empowered by the proposed SAM block, significantly enhances the overall detection performance, achieving competitive H-mean detection scores while maintaining superior efficiency in terms of Frames Per Second (FPS) compared to recent techniques. Our findings shed light on the importance of selecting appropriate optimization strategies and demonstrate the effectiveness of our proposed Segmentation-based Attention Module in scene text detection tasks.
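The optimization strategy named in the abstract (focal loss, SGD with momentum, poly learning-rate policy) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names are hypothetical, and the hyperparameters (`alpha=0.25`, `gamma=2.0`, `momentum=0.9`, `power=0.9`) are commonly used defaults assumed here for illustration, not values reported in the abstract.

```python
import math


def poly_lr(base_lr, step, max_steps, power=0.9):
    """Poly policy: decay base_lr towards 0 as step approaches max_steps.

    power=0.9 is an assumed, commonly used default.
    """
    return base_lr * (1.0 - step / max_steps) ** power


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for a single pixel.

    p is the predicted probability of the "text" class, y is the label (0 or 1).
    alpha and gamma are assumed defaults, not values taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)            # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t)^gamma down-weights pixels that are already classified well.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def sgd_momentum_step(w, grad, velocity, lr, momentum=0.9):
    """One SGD + Momentum update for a scalar parameter.

    The velocity accumulates past gradients; the parameter moves against it.
    """
    velocity = momentum * velocity + grad
    return w - lr * velocity, velocity
```

In a training loop, `poly_lr` would be evaluated once per iteration and its result passed as `lr` to `sgd_momentum_step`, with the gradient coming from the focal loss averaged over the pixels of the segmentation map.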
Authors
Zobeir Raisi
University of Waterloo, Waterloo, Canada. Chabahar Maritime University, Chabahar, Iran
John Zelek
University of Waterloo, Waterloo, Canada