Approaches Used in Focused Web Crawlers: A Systematic Mapping Study
Published in: 10th International Conference on Web Research
Publication year: 1403 (Solar Hijri calendar)
Document type: Conference paper
Language: English
The full text of this paper is available as a 9-page PDF file.
National scientific document ID:
IRANWEB10_013
Indexing date: 14 Mordad 1403 (4 August 2024)
Abstract:
Searching the web and retrieving information from it is today one of the most common uses of the Internet, and general-purpose search engines such as Google and Bing are used on a daily basis. The web crawler is the most important component of a search engine: it crawls the entire web by following the links on web pages and extracts their content. Focused web crawlers are a type of web crawler that restricts the crawling process to a specific section of online content, and they are used in vertical search engines. For example, a focused crawler may retrieve only certain types of media (such as PowerPoint files). In this paper, a systematic mapping study is conducted: the approaches used in the development of focused web crawlers are reviewed, and the advantages and disadvantages of each are discussed. In addition, two new approaches are identified and introduced. The study shows that the approach based on "ontology or semantics" is the most widely used in the development of focused web crawlers, and that the decision to use any of the reviewed approaches depends on the available resources and the existing constraints on development.
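The PowerPoint example in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a hypothetical in-memory link graph (`TOY_WEB`) in place of real HTTP fetching, and it treats "focus" as a simple file-extension filter. The approaches surveyed in the paper, such as ontology-based ones, would replace `is_target` with a far richer relevance model. All names below are illustrative.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical toy "web": page URL -> list of outgoing links (an assumption,
# standing in for downloading and parsing real pages).
TOY_WEB = {
    "http://example.org/": [
        "http://example.org/talks",
        "http://example.org/about.html",
    ],
    "http://example.org/talks": [
        "http://example.org/slides/intro.pptx",
        "http://example.org/slides/notes.pdf",
    ],
    "http://example.org/about.html": [
        "http://example.org/slides/team.ppt",
        "http://example.org/",
    ],
}

# The crawl is "focused" on PowerPoint files only.
TARGET_EXTENSIONS = (".ppt", ".pptx")


def is_target(url):
    """Return True if the URL path ends with one of the target extensions."""
    return urlparse(url).path.lower().endswith(TARGET_EXTENSIONS)


def toy_fetch_links(url):
    """Return the outgoing links of a page in the toy web (empty if unknown)."""
    return TOY_WEB.get(url, [])


def focused_crawl(seed, fetch_links, max_pages=100):
    """Breadth-first crawl from `seed`, collecting only target URLs."""
    frontier = deque([seed])
    seen = {seed}
    hits = []
    visited = 0
    while frontier and visited < max_pages:
        url = frontier.popleft()
        if is_target(url):
            # A matching file is recorded and not expanded further.
            hits.append(url)
            continue
        visited += 1
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return hits


if __name__ == "__main__":
    for hit in focused_crawl("http://example.org/", toy_fetch_links):
        print(hit)
```

Passing `fetch_links` as a parameter keeps the traversal separate from page retrieval, so the toy graph could be swapped for a real fetcher without changing the crawl loop.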
Authors:
Amir Noorzadeh
Master's Student in Computer Engineering, Islamic Azad University, Karaj, Iran