An efficient hierarchical method for text region extraction in degraded document images
Paper title: An efficient hierarchical method for text region extraction in degraded document images
National paper ID: ICMVIP05_046
Published at the 5th Conference on Machine Vision and Image Processing, 1387 (Solar Hijri calendar)
Authors:
M. Valizadeh - Department of Electrical Engineering, Tarbiat Modares University, Tehran, Iran
E. Kabir - Department of Electrical Engineering, Tarbiat Modares University, Tehran, Iran
S. Jalili - Department of Electrical Engineering, Tarbiat Modares University, Tehran, Iran
Abstract:
This paper presents a clustering-based method for extracting text regions from degraded document images. The grayscale image is first decomposed into four sub-bands using the discrete wavelet transform. For each pixel, the corresponding components of the three detail sub-bands form the feature vector. Potential text regions are then extracted with the k-means clustering algorithm. We propose several heuristic constraints by which the candidate text regions are refined to eliminate non-text regions. Evaluated on a set of degraded documents captured with a camera, the method shows satisfactory results.
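The pipeline outlined in the abstract (wavelet decomposition, detail sub-band features, k-means) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Haar wavelet, the use of absolute coefficient values, k = 2, the deterministic centroid initialization, and the rule that the cluster with the largest mean detail magnitude is the text candidate are all assumptions made for illustration, and the function names are hypothetical. The heuristic refinement step is omitted because the abstract does not specify the constraints.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT (assumed wavelet); returns (LL, LH, HL, HH)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # crop to even dimensions
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # approximation sub-band
    lh = (a + b - c - d) / 2.0  # responds to horizontal edges
    hl = (a - b + c - d) / 2.0  # responds to vertical edges
    hh = (a - b - c + d) / 2.0  # responds to diagonal detail
    return ll, lh, hl, hh

def kmeans(x, k=2, iters=50):
    """Plain Lloyd's k-means; initial centers are spread over feature norms."""
    order = np.argsort(np.linalg.norm(x, axis=1))
    picks = order[np.linspace(0, len(x) - 1, k).astype(int)]
    centers = x[picks].copy()
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

def candidate_text_mask(gray, k=2):
    """Half-resolution boolean mask of candidate text positions."""
    _, lh, hl, hh = haar_dwt2(gray)
    # Feature vector per position: magnitudes of the three detail sub-bands
    # (taking absolute values is an assumption, not stated in the abstract).
    feats = np.stack([np.abs(lh), np.abs(hl), np.abs(hh)], axis=-1).reshape(-1, 3)
    labels, centers = kmeans(feats, k)
    # Assumption: the cluster with the largest detail magnitude holds the text.
    text_cluster = centers.sum(axis=1).argmax()
    return (labels == text_cluster).reshape(lh.shape)
```

The returned mask has half the resolution of the input because a one-level transform halves each dimension; a production version would more likely rely on an established wavelet library and a tested k-means implementation.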
Keywords: text extraction, degraded document, clustering, discrete wavelet transform, document segmentation
Paper page and full-text download: https://civilica.com/doc/52022/