Automatic speech recognition (ASR)

Alvand Naserghandi; Kosar Namakin

Published in: First International Congress on Artificial Intelligence in Medical Sciences

Year of publication: 1402 (Solar Hijri)

Document type: Conference paper

Language: English

The full text of this paper has not been provided and is not available.

Permanent link to this paper:

https://civilica.com/doc/1703085

National scientific document ID:

AIMS01_134

Indexing date: 1 Mordad 1402

Abstract:

Automatic speech recognition (ASR) is a technology that converts spoken language (an audio signal) into written text, which is often used as a command. When combined with natural language processing (NLP), speech technology can understand, interpret, and generate human language and perform tasks such as translation, transcription, automatic summarization, and topic segmentation. Proper medical transcription requires scribes and dictation recorders, which are costly, time-consuming, and inconvenient for patients. This creates an opportunity for ASR systems that accurately transcribe medical terminology, including product names, procedures, diagnoses, and diseases. With the help of such a system, medical speech-to-text features can be added quickly to voice-activated applications. Conversations between health care professionals and patients form the basis for a patient's diagnosis, treatment plan, and clinical record-keeping. The system can be used for a wide variety of purposes, including transcribing doctor-patient discussions for clinical documentation, recording phone calls for pharmacovigilance, subtitling online medical consultations, summarizing patient symptoms, and classifying patients into categories based on their symptoms. When accompanied by NLP techniques, ASR systems can provide valuable predictions such as patients' probable current and future symptoms, duration of treatment, and practical after-recovery considerations. In other words, this system can be called an intelligent doctor's assistant. Cardiology, neurology, obstetrics-gynecology, pediatrics, oncology, radiology, and urology are just a few of the specialty care fields in which this system can be utilized. The ability to perform these tasks allows AI models to support human workers by giving healthcare professionals more time to focus on personalized, face-to-face patient care. Another advantage is that the system is easy and convenient to use.
Current speech recognition systems, designed for general applications, provide acceptable accuracy, and fine-tuning them for medical purposes can boost their performance. ASR systems can be used easily, without prior training or expertise to set them up. On the other hand, speech recognition is more complicated than it seems. Developing it involves many difficulties, such as the quality of the input speech signal, the complexity of real-time implementation, and the need for a rich language model at both the letter and word levels. Another critical factor that affects ASR accuracy is the dataset. As the dataset size increases and the speech content becomes more relevant to medical applications, the overall accuracy of ASR systems will improve. Similar systems already exist abroad. However, we are currently preparing the first local version of this system, and we hope to launch it, as a first stage, in university teaching hospitals across the country.

Keywords:

automatic speech recognition, artificial intelligence, natural language processing

Authors

Alvand Naserghandi

Shahid Beheshti University of Medical Sciences, Tehran, Iran

Kosar Namakin

Shahid Beheshti University of Medical Sciences, Tehran, Iran