An Intelligent Persian Tourism Question-Answering Chatbot Based on BERT and RAG

Shirin Amini; Mohammad Ali Afshar Kazemi; Akbar Alam Tabriz; Seyed Mohammad Ali Khatami Firouzabadi

Published in: First International Conference on Management, Computer Science and Artificial Intelligence

Publication year: 1404 (Solar Hijri)

Document type: Conference paper

Language: Persian

The full text of this paper is available as a 19-page file in PDF and Word formats.

Permanent link to this paper:

https://civilica.com/doc/2631793

National scientific document ID:

ICMCAI01_138

Indexing date: 24 Khordad 1405

Abstract:

Introduction: With the rapid expansion of digital technologies and increasing competition in the tourism industry, providing intelligent, fast, and personalized services to tourists has become a fundamental requirement for tourism organizations. In this context, intelligent chatbots, as one of the most prominent applications of conversational artificial intelligence, play a critical role in enhancing tourist experiences, reducing operational costs, and improving service accessibility. Despite significant advances in deep language models, most existing tourism chatbots have been developed for high-resource languages. Persian, particularly in the specialized domain of tourism, still lacks accurate, well-documented, and systematically evaluated chatbot systems. Moreover, many previous studies, both domestic and international, have focused primarily on proposing a single chatbot solution, while comprehensive experimental comparisons between different chatbot architectures have received limited attention.

The main objective of this study is to design, implement, and empirically compare two types of intelligent Persian tourism question-answering chatbots based on distinct architectural paradigms. By analyzing their strengths, limitations, and application domains, this research aims to provide a scientific basis for selecting appropriate chatbot architectures in real-world tourism scenarios.

Methods: This research is applied in nature and adopts a mixed approach combining system design and experimental evaluation. In the first phase, to address the scarcity of domain-specific Persian data, a Persian tourism question-answering dataset was constructed following the SQuAD standard. This dataset was generated from more than 1,000 reliable Persian tourism sources related to destinations, attractions, accommodations, and travel services in Iran.

Next, a Persian language model based on the BERT architecture, named TookaBERT-Large, was fine-tuned on the constructed dataset and employed as the core answer extraction model. Subsequently, two distinct chatbot architectures were designed and implemented. The first chatbot followed an extractive question-answering approach, directly extracting answers from retrieved documents, with a strong emphasis on accuracy and answer reliability. The second chatbot was based on an Agentic Retrieval-Augmented Generation (RAG) architecture, combining vector-based retrieval, document ranking, and generative language models to produce more comprehensive and interactive responses.

The performance of both chatbots was evaluated using a combination of quantitative and qualitative metrics, including Exact Match, F1-score, answer coverage, response naturalness, response time, and user satisfaction.

Results and Discussion: The experimental results revealed that each chatbot demonstrated distinct performance characteristics across the evaluation dimensions. The extractive question-answering chatbot, powered by the fine-tuned TookaBERT-Large model, achieved higher accuracy scores and produced more reliable and well-grounded responses. This chatbot showed particularly strong performance on factual and informational queries related to tourism destinations, attractions, and static travel information.

In contrast, the Agentic RAG-based chatbot exhibited broader answer coverage and delivered more natural and engaging conversational interactions. The findings indicate that this chatbot outperformed the extractive model in user satisfaction, response naturalness, and the ability to handle complex, explanatory, and multi-part queries. However, this improved conversational capability was accompanied by occasional reductions in factual accuracy and longer response times.

Overall, the comparative analysis highlights that chatbot architecture selection has a significant impact on user experience, and that no single approach can fully address all requirements of the tourism industry.

Conclusions: This study demonstrates that the development of intelligent Persian tourism chatbots requires a goal-oriented and scenario-specific approach. Extractive chatbots are more suitable for accuracy-critical applications, while RAG-based chatbots are better suited to dynamic interactions and comprehensive tourist guidance. By providing a structured comparative framework, this research contributes to the development of next-generation Persian tourism chatbots and other conversational systems in service-oriented domains.
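
For readers unfamiliar with the two quantitative metrics named in the abstract, the sketch below shows how Exact Match and token-level F1 are commonly computed against a SQuAD-style record. It is an illustration only, not the authors' evaluation code: the record is an invented English example (not taken from the paper's dataset), the function names are hypothetical, and the normalization (Unicode NFKC, lowercasing, punctuation removal, whitespace tokenization) is a simplifying assumption that a Persian-aware pipeline would likely replace.

```python
import unicodedata
from collections import Counter

# Hypothetical SQuAD-style record: a context passage, a question, and
# gold answer spans with their character offsets into the context.
context = "Persepolis is located near Shiraz in Fars Province."
gold_text = "near Shiraz"
record = {
    "context": context,
    "question": "Where is Persepolis located?",
    "answers": {
        "text": [gold_text],
        "answer_start": [context.index(gold_text)],
    },
}


def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation, and split on whitespace.

    Assumption: this is a simplified normalization, not a Persian tokenizer.
    """
    text = unicodedata.normalize("NFKC", text).lower()
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    return cleaned.split()


def exact_match(prediction: str, gold_answers: list[str]) -> float:
    """Return 1.0 if the normalized prediction equals any gold answer."""
    pred = normalize(prediction)
    return float(any(pred == normalize(gold) for gold in gold_answers))


def f1_score(prediction: str, gold_answers: list[str]) -> float:
    """Return the best token-overlap F1 over all gold answers."""
    pred = normalize(prediction)
    best = 0.0
    for gold in gold_answers:
        gold_tokens = normalize(gold)
        # Multiset intersection counts each shared token at most as often
        # as it appears in both the prediction and the gold answer.
        overlap = sum((Counter(pred) & Counter(gold_tokens)).values())
        if overlap == 0:
            continue
        precision = overlap / len(pred)
        recall = overlap / len(gold_tokens)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


if __name__ == "__main__":
    golds = record["answers"]["text"]
    for candidate in ["Near Shiraz.", "near Shiraz in Fars", "Tehran"]:
        print(candidate, exact_match(candidate, golds), f1_score(candidate, golds))
```

Exact Match rewards only predictions identical to a gold span after normalization, while F1 gives partial credit for overlapping tokens, which is one reason the two metrics are usually reported together for extractive question answering.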

Keywords:

Intelligent Question Answering, Persian SQuAD Dataset, Retrieval-Augmented Generation (RAG), Tourism Industry, Tourism Chatbot

Authors

Shirin Amini

Department of Information Technology Management, KI.C., Islamic Azad University, Kish, Iran

Mohammad Ali Afshar Kazemi

Department of Industrial Management, CT.C., Islamic Azad University, Tehran, Iran (corresponding author).

Akbar Alam Tabriz

Department of Industrial Management and Information Technology, Faculty of Management and Accounting, Shahid Beheshti University, Tehran, Iran

Seyed Mohammad Ali Khatami Firouzabadi

Department of Operations Management and Information Technology, Faculty of Management and Accounting, Allameh Tabatabai University, Tehran, Iran