Detecting Machine-Generated Text in Academic Writing: Stylometric Fingerprinting of Humans and Large Language Models for Authorship Verification
Publication year: 1404
Document type: Conference paper
Language: English
The full text of this paper is available as a 21-page PDF.
National scientific document ID:
ICAICS01_039
Indexing date: 19 Khordad 1405
Abstract:
The proliferation of Large Language Models (LLMs) such as GPT-4 and ChatGPT has introduced significant challenges to maintaining academic integrity, necessitating robust methods for distinguishing human-written texts from machine-generated content. This paper provides a comprehensive review of contemporary techniques for detecting machine-generated text in academic writing, with a particular focus on stylometric fingerprinting and machine learning approaches. We systematically analyze key methodologies, including stylometric analysis (lexical diversity, syntactic complexity, and punctuation patterns), psycholinguistic mapping, and the trigram-cosine delta metric. Furthermore, we examine advanced machine learning models such as supervised classification, ensemble learning, and Graph Neural Networks (GNNs) integrated with pre-trained language models. The review also explores multilingual and cross-domain detection strategies, benchmark datasets, and performance evaluation metrics. Despite high accuracy rates reported in recent studies (up to 98%), significant challenges remain regarding generalizability across different LLMs and domains, computational efficiency, and ethical considerations related to privacy. The paper concludes that integrating stylometric analysis with advanced machine learning offers a promising pathway for safeguarding academic integrity, while emphasizing the need for continued research to address existing limitations.
Keywords:
Authors
Abdollah Givechi
PhD Student in Islamic Philosophy and Theology, Imam Sadiq University, Tehran, Iran