Detecting Machine-Generated Text in Academic Writing: Stylometric Fingerprinting of Humans and Large Language Models for Authorship Verification

Abdollah Givechi

Published in: The First International Conference on Artificial Intelligence and Emerging Computer Sciences: From Algorithm to Foresight

Year of publication: 1404 (Solar Hijri calendar)

Document type: Conference paper

Language: English

The full text of this paper is available as a 21-page PDF file.

Permanent link to this paper:

https://civilica.com/doc/2626890

National scientific document ID:

ICAICS01_039

Indexing date: 19 Khordad 1405 (Solar Hijri calendar)

Abstract:

The proliferation of Large Language Models (LLMs) such as GPT-4 and ChatGPT has introduced significant challenges to maintaining academic integrity, necessitating robust methods for distinguishing human-written texts from machine-generated content. This paper provides a comprehensive review of contemporary techniques for detecting machine-generated text in academic writing, with a particular focus on stylometric fingerprinting and machine learning approaches. We systematically analyze key methodologies, including stylometric analysis (lexical diversity, syntactic complexity, and punctuation patterns), psycholinguistic mapping, and the trigram-cosine delta metric. Furthermore, we examine advanced machine learning models such as supervised classification, ensemble learning, and Graph Neural Networks (GNNs) integrated with pre-trained language models. The review also explores multilingual and cross-domain detection strategies, benchmark datasets, and performance evaluation metrics. Despite high accuracy rates reported in recent studies (up to 98%), significant challenges remain regarding generalizability across different LLMs and domains, computational efficiency, and ethical considerations related to privacy. The paper concludes that integrating stylometric analysis with advanced machine learning offers a promising pathway for safeguarding academic integrity, while emphasizing the need for continued research to address existing limitations.
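
To make the stylometric features named in the abstract more concrete, the following is a minimal Python sketch of three simple proxies: type-token ratio for lexical diversity, mean sentence length as a rough stand-in for syntactic complexity, and punctuation marks per word for punctuation patterns. The function name `stylometric_features`, the feature definitions, and the naive tokenization are illustrative assumptions, not the method of the paper or of the studies it reviews.

```python
import re
import string
from typing import Dict


def stylometric_features(text: str) -> Dict[str, float]:
    """Compute three simple, assumed stylometric proxies for a text sample."""
    # Lowercased word tokens (ASCII letters and apostrophes only).
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on ., ! or ? followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]
    # All ASCII punctuation characters that occur in the text.
    punctuation = [ch for ch in text if ch in string.punctuation]

    n_words = len(words)
    if n_words == 0:
        # Avoid division by zero for empty or word-free input.
        return {
            "type_token_ratio": 0.0,
            "mean_sentence_length": 0.0,
            "punctuation_per_word": 0.0,
        }

    return {
        # Lexical diversity: distinct words divided by total words.
        "type_token_ratio": len(set(words)) / n_words,
        # Syntactic complexity proxy: average number of words per sentence.
        "mean_sentence_length": n_words / max(len(sentences), 1),
        # Punctuation pattern proxy: punctuation marks per word.
        "punctuation_per_word": len(punctuation) / n_words,
    }


sample = "The cat sat. The cat ran, fast!"
features = stylometric_features(sample)
```

Feature vectors of this kind could then be passed to a supervised classifier. Real detectors typically use far richer feature sets and more careful tokenization than this sketch.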

Keywords:

Machine-Generated Text Detection, Academic Writing, Stylometry, Large Language Models, Authorship Verification

Authors

Abdollah Givechi

PhD Student in Islamic Philosophy and Theology, Imam Sadiq University, Tehran, Iran