CIVILICA We Respect the Science
(Specialized publisher of the country's conferences / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

An efficient hierarchical method for text region extraction in degraded document images

Paper title: An efficient hierarchical method for text region extraction in degraded document images
National paper ID: ICMVIP05_046
Published in the Fifth Conference on Machine Vision and Image Processing, 1387 (Solar Hijri calendar)
Authors:

M. Valizadeh - Department of Electrical Engineering, Tarbiat Modares University, Tehran, Iran
E. Kabir - Department of Electrical Engineering, Tarbiat Modares University, Tehran, Iran
S. Jalili - Department of Electrical Engineering, Tarbiat Modares University, Tehran, Iran

Abstract:
This paper presents a clustering-based method for extracting text regions from degraded document images. The grayscale image is first decomposed into four sub-bands using the discrete wavelet transform. For each pixel, the corresponding components of the three detail sub-bands form the feature vector. Potential text regions are then extracted with the k-means clustering algorithm. We propose several heuristic constraints that refine the candidate text regions by eliminating non-text regions. Evaluated on a set of degraded documents captured with a camera, the method shows satisfactory results.
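
The first two stages described in the abstract (wavelet decomposition, then k-means on the detail coefficients) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The abstract does not name the wavelet, the number of clusters, or the rule for choosing the text cluster, so the Haar wavelet, k = 2, the use of coefficient magnitudes as features, and the "highest detail energy means text" rule are all assumptions. The function names (haar_dwt2, kmeans, candidate_text_mask) are hypothetical, and the heuristic refinement step is omitted because the abstract does not specify it.

```python
def haar_dwt2(img):
    """One-level 2-D Haar DWT of a 2-D list with even height and width.

    Returns four sub-bands (approximation plus three details), each at
    half the input resolution. The Haar wavelet is an assumption; the
    abstract does not say which wavelet the authors used.
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    ll = [[0.0] * cols for _ in range(rows)]
    lh = [[0.0] * cols for _ in range(rows)]
    hl = [[0.0] * cols for _ in range(rows)]
    hh = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2.0  # approximation
            lh[i][j] = (a + b - c - d) / 2.0  # top-vs-bottom difference
            hl[i][j] = (a - b + c - d) / 2.0  # left-vs-right difference
            hh[i][j] = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def kmeans(points, k=2, iters=20):
    """Plain Lloyd's k-means on a list of equal-length tuples (k >= 2).

    Centers are seeded deterministically with points spread evenly over
    the range of total feature magnitude (an assumption made so that the
    sketch is reproducible).
    """
    ordered = sorted(points, key=sum)
    n = len(ordered)
    centers = [ordered[i * (n - 1) // (k - 1)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers


def candidate_text_mask(img, k=2):
    """Half-resolution boolean mask of candidate text positions.

    Assumptions: features are the magnitudes of the three detail
    coefficients, and the cluster whose center has the largest total
    magnitude is treated as text.
    """
    _, lh, hl, hh = haar_dwt2(img)
    rows, cols = len(lh), len(lh[0])
    feats = [(abs(lh[i][j]), abs(hl[i][j]), abs(hh[i][j]))
             for i in range(rows) for j in range(cols)]
    labels, centers = kmeans(feats, k)
    text_cluster = max(range(k), key=lambda c: sum(centers[c]))
    return [[labels[i * cols + j] == text_cluster for j in range(cols)]
            for i in range(rows)]
```

Calling candidate_text_mask on a grayscale image stored as a list of pixel rows returns a mask at half the input resolution, since each wavelet coefficient covers a 2x2 pixel block. A practical version would more likely use an array library and a tested wavelet implementation; plain lists are used here only to keep the sketch dependency-free.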

Keywords:
text extraction, degraded document, clustering, discrete wavelet transform, document segmentation

Paper page and full-text download: https://civilica.com/doc/52022/