Categorizing Web Pages and Data Mining

  • Year of publication: 1396 (Solar Hijri calendar)
  • Published in: Fourth International Conference on Modern Studies in Computer Science and Information Technology
  • COI code: CONFITC04_181
  • Paper language: English

Authors

Zohreh Vahedi

Islamic Azad University of Ferdows, Department of Computer Engineering, Ferdows, South Khorasan, Iran

Abstract

The diversity of knowledge on the web has made identifying communication patterns in databases, and discovering knowledge within this mass of information, an attractive goal. The first step toward this goal is to classify web pages. Current machine learning techniques for content classification were originally developed for flat, simple text documents and do not make optimal use of web structures such as links and headers. Data mining is a set of techniques that lets a person move beyond conventional data processing and helps extract the information hidden in massive volumes of data. Today, most organizations collect and store data rapidly. However, using these data is not simple, and they cannot be treated as a single unit given their volume; they can be exploited properly only through techniques that combine statistics, computer science, and machine learning. Moreover, achieving meaningful results from web mining requires good data on the website, so effective management of web data is crucial. The ultimate goal of this descriptive paper is to enable organizations and small businesses to use data mining in their decisions, and to help e-commerce, as a major business of the future, enter new areas. The technology has a wide range of intended users, including, for example, research centers, companies operating in the web and database fields, analysts and managers of organizations, businesses, the Web (search engines) and …

Keywords

data mining, web mining, classification, e-commerce
