Comparative Evaluation of Deep Learning Architectures for Printed and Handwritten Farsi OCR
Published in: Journal of Artificial Intelligence and Data Mining, Volume 14, Issue 1
Publication year: 1405
Document type: Journal article
Language: English
The full text of this article is available as a 13-page PDF.
National scientific document ID:
JR_JADM-14-1_002
Indexing date: 6 Dey 1404
Abstract:
Farsi optical character recognition remains challenging due to the script's cursive structure, positional glyph variations, and frequent diacritics. This study conducts a comparative evaluation of five foundational deep learning architectures widely used in OCR (two lightweight CRNN-based models aimed at efficient deployment and three Transformer-based models designed for advanced contextual modeling) to examine their suitability for the distinct characteristics of Farsi script. Performance was benchmarked on four publicly available datasets: Shotor and IDPL-PFOD2 for printed text, and Iranshahr and Sadri for handwritten text, using word-level accuracy, parameter count, and computational cost as evaluation criteria. CRNN-based models achieved high accuracy on the word-level datasets, reaching 99.42% (Shotor), 97.08% (Iranshahr), and 98.86% (Sadri), while maintaining smaller model sizes and lower computational demands. However, their accuracy dropped to 78.49% on the larger and more diverse line-level IDPL-PFOD2 dataset. Transformer-based models substantially narrowed this performance gap, exhibiting greater robustness to variations in font, style, and layout, with the best model reaching 92.81% on IDPL-PFOD2. To the best of our knowledge, this work is among the first comprehensive comparative studies of lightweight CRNN and Transformer-based architectures for Farsi OCR, encompassing both printed and handwritten scripts, and establishes a solid performance baseline for future research and deployment strategies.
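As a rough illustration of the word-level accuracy criterion named in the abstract, the sketch below scores a batch of recognized words against their ground-truth labels. It assumes accuracy means the fraction of exact string matches; the abstract does not state the precise definition or any text normalization the authors used, and the function name `word_accuracy` and the sample words are hypothetical.

```python
from typing import Sequence


def word_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the fraction of predictions that exactly match their reference.

    Assumption: exact string equality, with no normalization of diacritics,
    whitespace, or character variants. The paper's protocol may differ.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one sample is required")
    # Count the positions where the recognized word equals the label.
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return correct / len(references)


# Hypothetical example: two of the three words are recognized correctly.
sample_predictions = ["کتاب", "مدرسه", "دانش"]
sample_references = ["کتاب", "مدرسه", "دانشگاه"]
sample_score = word_accuracy(sample_predictions, sample_references)
print(f"word-level accuracy: {sample_score:.2%}")
```

Under this definition a single wrong character makes the whole word count as an error, so the metric is stricter than character-level accuracy.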
Authors
Fatemeh Asadi-Zeydabadi
Department of Electrical Engineering, Shahid Bahonar University of Kerman, Kerman, Iran.
Ali Afkari-Fahandari
Department of Electrical Engineering, Shahid Bahonar University of Kerman, Kerman, Iran.
Elham Shabaninia
Department of Applied Mathematics, Graduate University of Advanced Technology, Kerman, Iran.
Hossein Nezamabadi-pour
Department of Electrical Engineering, Shahid Bahonar University of Kerman, Kerman, Iran.