A Novel Framework of Anonymization Techniques for Big Data Applications in Interactive Information Retrieval Systems

Year of publication: 1398 (Solar Hijri)
Published in: 2nd National Conference on Interactive Information Retrieval
COI code: IIIRC02_001
Paper language: English

Authors

Mehdi Hasaninasab

ICT Research Institute (ITRC), Information Technology Institute, Tehran, Iran

Morteza Sargolzaei Javan

ICT Research Institute (ITRC), Information Technology Institute, Tehran, Iran

Ehsan Arianyan

ICT Research Institute (ITRC), Information Technology Institute, Tehran, Iran

Abstract

In today's digital world, organizations constantly transmit and receive data in different formats, at different rates, and over different technologies. Organizations recognize the opportunities and business value offered by Big Data technologies. However, many problems arise from the difficulty of understanding the complex dimensions involved in Big Data adoption. Open data is one of the most important topics in the big data domain. Its benefit to the development of interactive information retrieval systems, and consequently to the overall economic growth of countries, is undeniable. Before data is released, it should be anonymized (e.g., by removing the identity of the data owner) to avoid any privacy violation. Various anonymization techniques can be used; they differ in their algorithms, speed, scalability, and de-identification risk. Choosing the proper technique for each big data application is critical to reaching the desired efficiency. This paper surveys the most important anonymization techniques and groups them into two main categories: randomization and generalization. Moreover, it proposes a novel framework for choosing the best anonymization technique for six application categories: health, general data, finance, geo-information, social networks, and image data. Using this framework helps designers anonymize data efficiently before releasing it.
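
The two categories named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the record fields (`name`, `age`, `zip`, `income`), the function names, the bucket width, and the noise scale are all hypothetical choices made for illustration. Generalization replaces a value with a coarser one, while randomization perturbs a value with noise.

```python
import random


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: replace an exact age with a coarser range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code: str, keep: int = 3) -> str:
    """Generalization: keep a prefix of the code and mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def randomize_value(value: float, scale: float, rng: random.Random) -> float:
    """Randomization: add zero-mean uniform noise in [-scale, scale]."""
    return value + rng.uniform(-scale, scale)


def anonymize(record: dict, rng: random.Random) -> dict:
    """Drop the direct identifier (name), generalize the quasi-identifiers,
    and perturb the numeric attribute. Field names are hypothetical."""
    return {
        "age": generalize_age(record["age"]),
        "zip": generalize_zip(record["zip"]),
        "income": round(randomize_value(record["income"], 500.0, rng), 2),
    }


record = {"name": "Alice", "age": 37, "zip": "12345", "income": 52000.0}
print(anonymize(record, random.Random(0)))
```

A real release pipeline would also need to measure the residual risk of the output (for example, how many records share each generalized combination), which this sketch does not attempt.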

Keywords

Anonymization, Privacy, GDPR, Big data, Retrieval.
