FATR: A Comprehensive Dataset and Evaluation Framework for Persian Text Recognition in Wild Images

Z. Raisi; V. Nazarzehi Had; E. Sarani; R. Damani

Published in: Journal of Electrical and Computer Engineering Innovations, Volume 13, Issue 2

Publication year: 1404 (Solar Hijri)

Document type: Journal article

Language: English

The full text of this article is available as a 10-page PDF.

Permanent link to this article:

https://civilica.com/doc/2299087

National scientific document ID:

JR_JECEI-13-2_007

Indexing date: 19 Tir 1404

Abstract:

Background and Objectives: Research on right-to-left scripts, particularly Persian text recognition in wild images, is limited by the lack of a comprehensive benchmark dataset. Applying state-of-the-art (SOTA) techniques on existing Latin or multilingual datasets often results in poor recognition performance for Persian scripts. This study aims to bridge this gap by introducing a comprehensive dataset for Persian text recognition and evaluating SOTA models on it.

Methods: We propose a Farsi (Persian) text recognition (FATR) dataset, which includes challenging images captured in various indoor and outdoor environments. Additionally, we introduce FATR-Synth, the largest synthetic Persian text dataset, containing over 200,000 cropped word images designed for pre-training scene text recognition models. We evaluate five SOTA deep learning-based scene text recognition models on the proposed datasets using the standard word recognition accuracy (WRA) metric. We also compare the performance of these recent architectures qualitatively on challenging sample images from the FATR dataset.

Results: Our experiments demonstrate that the performance of SOTA recognition models declines significantly when tested on the FATR dataset. However, when trained on synthetic and real-world Persian text datasets, these models demonstrate improved performance on Persian scripts.

Conclusion: Introducing the FATR dataset enhances the resources available for Persian text recognition, improving model performance. The proposed datasets, trained models, and code are available at https://github.com/zobeirraisi/FATDR.
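
As an illustration of the WRA metric named in the abstract, the sketch below uses the common definition: the fraction of word images whose predicted string exactly matches the ground truth. The abstract does not state the exact evaluation protocol, so the function name `word_recognition_accuracy`, the whitespace stripping, and the Unicode NFC normalization step are assumptions made here for illustration, not the authors' implementation.

```python
import unicodedata


def _normalize(text: str) -> str:
    # Assumption: strip surrounding whitespace and apply NFC normalization so
    # that visually identical Persian strings compare equal.
    return unicodedata.normalize("NFC", text.strip())


def word_recognition_accuracy(predictions, ground_truths):
    """Return the fraction of words predicted exactly right (0.0 to 1.0)."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    if not ground_truths:
        return 0.0
    # A word counts as correct only if the whole string matches.
    correct = sum(
        _normalize(pred) == _normalize(truth)
        for pred, truth in zip(predictions, ground_truths)
    )
    return correct / len(ground_truths)


# Hypothetical example: two of three cropped words are recognized correctly.
predicted = ["سلام", "کتاب", "آب"]
expected = ["سلام", "کتاب", "آن"]
wra = word_recognition_accuracy(predicted, expected)
```

Because a single wrong character makes the whole word count as an error, WRA is a strict measure; character-level measures such as edit distance are more forgiving on long words.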

Keywords:

Persian Scripts, Scene text recognition, Real-World datasets, Synthetic images, Deep Learning

Authors

Z. Raisi

Electrical Engineering Department, Chabahar Maritime University, Chabahar, Iran.

V. Nazarzehi Had

Electrical Engineering Department, Chabahar Maritime University, Chabahar, Iran.

E. Sarani

Electrical Engineering Department, Chabahar Maritime University, Chabahar, Iran.

R. Damani

Electrical Engineering Department, Chabahar Maritime University, Chabahar, Iran.
