Investigation of Deep Learning Optimization Algorithms in Scene Text Detection

Publication year: 1402 (Solar Hijri calendar)
Document type: Journal article
Language: English

The full text of this paper is available as a 12-page PDF.

National scientific document ID: JR_IECO-6-3_002

Indexing date: 10 Aban 1402

Abstract:

Scene text detection frameworks rely heavily on optimization methods. Choosing an appropriate optimizer is essential to the performance of recent scene text detection models. However, recent deep learning methods often employ various optimization algorithms and loss functions without explaining their selections. This paper presents a segmentation-based text detection pipeline capable of handling arbitrary-shaped text instances in images captured in the wild. We explore the effectiveness of well-known deep learning optimizers in enhancing the pipeline's capabilities. Additionally, we introduce a novel Segmentation-based Attention Module (SAM) that enables the model to capture long-range dependencies of multi-scale feature maps and focus more accurately on regions likely to contain text instances. The performance of the proposed architecture is extensively evaluated through ablation experiments that explore the impact of different optimization algorithms and the introduced SAM block. Furthermore, we compare the final model against state-of-the-art scene text detection techniques on three publicly available benchmark datasets: ICDAR15, MSRA-TD500, and Total-Text. Our experimental results demonstrate that focal loss combined with the Stochastic Gradient Descent (SGD) + Momentum optimizer and a poly learning-rate policy achieves more robust and generalized detection performance than other optimization strategies. Moreover, the architecture, empowered by the proposed SAM block, significantly enhances overall detection performance, achieving competitive H-mean detection scores while maintaining superior efficiency in terms of Frames Per Second (FPS) compared to recent techniques. Our findings highlight the importance of selecting appropriate optimization strategies and demonstrate the effectiveness of the proposed Segmentation-based Attention Module in scene text detection tasks.
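
The training recipe named in the abstract (focal loss, SGD with momentum, and a poly learning-rate policy) and the H-mean metric can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: all function names are hypothetical, and the default values (alpha = 0.25, gamma = 2.0, power = 0.9, momentum = 0.9) are commonly used choices assumed here, not settings reported in the abstract. The toy loop minimizes a one-dimensional quadratic rather than training a detector.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for a single pixel: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted text probability, y is the label (1 = text, 0 = background).
    alpha and gamma defaults are assumptions, not values taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def poly_lr(base_lr, step, max_steps, power=0.9):
    """Poly policy: base_lr * (1 - step / max_steps) ** power (power is assumed)."""
    return base_lr * (1.0 - step / max_steps) ** power


def sgd_momentum_step(w, grad, velocity, lr, momentum=0.9):
    """One SGD + Momentum update: v = momentum * v + grad, then w = w - lr * v."""
    velocity = momentum * velocity + grad
    return w - lr * velocity, velocity


def h_mean(precision, recall):
    """H-mean: harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def train_toy(steps=200, base_lr=0.1):
    """Toy stand-in for training: minimize f(w) = (w - 3)**2 with the recipe above."""
    w, v = 0.0, 0.0
    for step in range(steps):
        grad = 2.0 * (w - 3.0)                # gradient of the toy quadratic
        lr = poly_lr(base_lr, step, steps)    # learning rate decays toward zero
        w, v = sgd_momentum_step(w, grad, v, lr)
    return w


if __name__ == "__main__":
    print("toy optimum estimate:", train_toy())
    print("loss on easy vs. hard positive:", focal_loss(0.9, 1), focal_loss(0.1, 1))
    print("H-mean for P=0.8, R=0.9:", h_mean(0.8, 0.9))
```

The focal term (1 - p_t)**gamma shrinks the loss of well-classified pixels, which is why it is often paired with segmentation-style detectors where background pixels dominate; the poly policy lowers the step size smoothly so that the momentum updates settle late in training.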

Authors

Zobeir Raisi

University of Waterloo, Waterloo, Canada; Chabahar Maritime University, Chabahar, Iran

John Zelek

University of Waterloo, Waterloo, Canada
