A Cross-Lingual Text-To-Speech System for Hausa using DNN-Based Approach

Year of publication: 1398 (Solar Hijri calendar)
Document type: Journal article
Language: English

The full text of this article is available as an 11-page PDF file.

National scientific document ID:

JR_IJMEC-10-35_003

Indexing date: 3 Esfand 1398 (Solar Hijri calendar)

Abstract:

In recent years, speech technology has improved tremendously in terms of both its development and its applications. Speech technologies such as machine translation, automatic speech recognition, and speech synthesis are among the state of the art in today’s technology. Over the last few decades, the development of Text-to-Speech (TTS) systems, or artificial speech, has aimed at gradually improving intelligibility and naturalness. A TTS system generates speech output from a given input text. TTS systems have many applications for many kinds of users, most notably people who are visually impaired and people who cannot read. Major application areas of speech synthesis include document readers, speech translators, mobile read-aloud applications (such as a Google Maps reader), and announcement systems. A speech synthesis system also serves as an assistive tool for people with disabilities, for reading online text and information aloud, and as an automatic learning system for children. Despite these potential benefits, a TTS system is language dependent, and one has yet to be developed for many of the world’s languages, mostly because the necessary resources are lacking. Languages that lack the necessary resources are referred to as under-resourced languages. Hausa is one such language: it lacks the resources needed to develop a TTS system. The aim of this research is to develop a state-of-the-art TTS system for Hausa using minimal resources. Researchers have introduced several techniques for developing TTS systems for under-resourced languages, such as speaker adaptation, cross-lingual adaptation, and bootstrapping. Currently, the state-of-the-art TTS technology is the Deep Neural Network (DNN)-based speech synthesis system, which is available only for selected well-resourced languages such as English and Arabic.
The DNN-based speech synthesis system is the most advanced approach, offering higher intelligibility and naturalness than other existing systems. Using English resources as the basis, a DNN-based speech synthesis system was developed for Hausa with minimal resources by adopting the cross-lingual technique. The developed system was evaluated for intelligibility and naturalness by native Hausa speakers. It scored 4.20 out of 5 for naturalness and 4.10 out of 5 for intelligibility, which is better than the existing techniques used to develop TTS systems for under-resourced languages.

Authors

Abubakar Ahmad Aliero

Computer Science Department, Kebbi State University of Science and Technology, Aliero, Kebbi, Nigeria

Dalhatu Muhammed

Computer Science Department, Kebbi State University of Science and Technology, Aliero, Kebbi, Nigeria

Mumtaz Begum Binti Peer Mustafa

Faculty of Computer Science & Information Technology, University of Malaya, Malaysia

Muhammad Saidu A

ICT Department, Kebbi State University of Science and Technology, Aliero, Kebbi, Nigeria

Muhammad Garba

Computer Science Department, Kebbi State University of Science and Technology, Aliero, Kebbi, Nigeria