ArmanTTS single-speaker Persian dataset

Text-to-speech (TTS) is a complex process that can be modeled effectively with deep learning methods. Implementing deep learning models requires a suitable dataset. Since little work has been done in this field for the Persian language, this paper introduces a single-speaker dataset, ArmanTTS. We compare the characteristics of this dataset with those of several widely used datasets to show that ArmanTTS meets the necessary standards for training a Persian text-to-speech model. We also combine Tacotron 2 and HiFi-GAN to build a model that takes phonemes as input and outputs the corresponding speech. The mean opinion score (MOS) was 4.0 for real speech, 3.87 for the vocoder's predictions, and 2.98 for synthetic speech generated by the TTS model.
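
The abstract reports its results as MOS (mean opinion score) values. As a rough illustration of how such a figure is commonly computed, the sketch below averages listener ratings on a 1 to 5 scale and attaches a 95% confidence interval using a normal approximation. The ratings list and the function name are hypothetical, and the paper's actual evaluation protocol (number of listeners, rating scale, aggregation) is not described in this abstract, so this should be read as an assumption rather than the authors' procedure.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mos, half_width) for a list of listener ratings.

    half_width is the half-width of a 95% confidence interval under a
    normal approximation. The 1-5 scale is an assumption of this sketch.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    for r in ratings:
        if not 1 <= r <= 5:
            raise ValueError(f"rating out of range 1-5: {r}")
    mos = statistics.fmean(ratings)
    # Standard error of the mean, using the sample standard deviation.
    stderr = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, 1.96 * stderr


# Hypothetical ratings for one system; NOT data from the paper.
hypothetical_ratings = [5, 4, 4, 3, 5, 4, 4, 5]
mos, half_width = mean_opinion_score(hypothetical_ratings)
print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```

Reporting the interval alongside the mean makes it easier to judge whether gaps such as the one between vocoder output and full TTS output are larger than listener variability.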

Keywords:

dataset, vocoders, acoustic models

Authors

Mohammd Hasan Shamgholi

MSc Student, School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran

Vahid Saeedi

MSc Graduate, School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran

Javad Peymanfard

PhD Candidate, School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran

Leila Alhabib

BSc Student, School of Computer Engineering, Amirkabir University of Technology, Tehran, Iran

Hossein Zeinali

Assistant Professor, School of Computer Engineering, Amirkabir University of Technology, Tehran, Iran

Permanent link to this article:

https://civilica.com/doc/1675610

National scientific document ID:

CEITCONF06_046

Indexing date: 26 Khordad 1402 (16 June 2023)

How to cite this article:

To cite this article in your own research, you can use the following entry in your references section:

Shamgholi, Mohammd Hasan and Saeedi, Vahid and Peymanfard, Javad and Alhabib, Leila and Zeinali, Hossein, 1401, ArmanTTS single-speaker Persian dataset, 1st International Conference and 6th National Conference on Computers, information technology and applications of artificial intelligence, https://civilica.com/doc/1675610

Within the text, wherever a statement or finding from this article is mentioned, the following details are given in parentheses after it.
First citation: (1401, Shamgholi, Mohammd Hasan; Vahid Saeedi and Javad Peymanfard and Leila Alhabib and Hossein Zeinali)
Subsequent citations: (1401, Shamgholi; Saeedi and Peymanfard and Alhabib and Zeinali)
