Discovery of Potential Topics from Blog Articles by Machine Learning

Publication year: 1393 (Solar Hijri calendar)
Document type: Journal article
Language: English

The full text of this paper is available as a 7-page PDF file.


National scientific document ID:

JR_ACSIJ-4-2_010

Indexing date: 7 Azar 1394 (Solar Hijri calendar)

Abstract:

This paper presents a method for discovering potential topics in the blogosphere. We define a potential topic as a phrase that is not yet popular but has the potential to become a hot topic. To discover potential topics, the method builds a classifier that detects the potentiality of a topic from topic frequency transitions in blog articles. First, the method extracts candidate potential topics from categorized blog articles, because categorization enables us to extract specialists. To select potential topics from the candidates, a classifier for detecting potential topics is then built from topic frequency transition data. For this learning step, we propose two types of learning methods: supervised learning and semi-supervised learning. Although supervised learning provides more precise results, it requires an enormous amount of labeled data, which is costly and difficult to create. On the other hand, semi-supervised learning can build a classifier from a small amount of labeled data and a large amount of unlabeled data. Experimental results with real blog data show the effectiveness of the proposed method.
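The semi-supervised idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the features, the classifier, or the semi-supervised algorithm, so everything below is an assumption made for illustration: the feature choice in `features`, the nearest-centroid classifier, the self-training loop, the `min_margin` threshold, and the toy frequency series are all hypothetical and are not the authors' implementation.

```python
from statistics import mean

def features(freqs):
    """Hypothetical features of a topic's frequency series:
    average level, average change per step, and most recent change."""
    deltas = [b - a for a, b in zip(freqs, freqs[1:])]
    return (mean(freqs), mean(deltas), deltas[-1])

def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(labeled):
    """Nearest-centroid model: one mean feature vector per label."""
    by_label = {}
    for x, y in labeled:
        by_label.setdefault(y, []).append(x)
    return {y: tuple(mean(col) for col in zip(*xs)) for y, xs in by_label.items()}

def predict(centroids, x):
    """Return (label, margin); margin is the gap between the two closest
    centroids. Assumes at least two labels are present."""
    ranked = sorted(centroids, key=lambda lab: dist2(centroids[lab], x))
    margin = dist2(centroids[ranked[1]], x) - dist2(centroids[ranked[0]], x)
    return ranked[0], margin

def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Self-training: repeatedly pseudo-label confident unlabeled
    examples, add them to the labeled set, and refit."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit(labeled)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(centroids, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                rest.append(x)
        if not confident:
            break
        labeled.extend(confident)
        pool = rest
    return fit(labeled)

# Toy data (invented): two labeled series and four unlabeled ones.
labeled = [
    (features([1, 2, 4, 7]), "potential"),
    (features([5, 5, 4, 5]), "not_potential"),
]
unlabeled = [features(s) for s in
             ([0, 1, 3, 6], [2, 3, 5, 9], [6, 5, 5, 6], [4, 4, 4, 3])]

model = self_train(labeled, unlabeled)
rising_label, _ = predict(model, features([1, 1, 3, 6]))
flat_label, _ = predict(model, features([5, 5, 5, 5]))
print(rising_label, flat_label)
```

Self-training is only one of several semi-supervised strategies. It was chosen here because it makes the role of the unlabeled data easy to see: the two labeled examples seed the model, and confidently pseudo-labeled series then refine the centroids.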

Authors

Yoshiaki YASUMURA

College of Engineering, Shibaura Institute of Technology, Saitama City, Japan

Yuhei KOSAKA

Graduate School of System Informatics, Kobe University, Kobe City, Japan

Hiroyoshi TAKAHASHI

Graduate School of System Informatics, Kobe University, Kobe City, Japan

Kuniaki UEHARA

Graduate School of System Informatics, Kobe University, Kobe City, Japan