An Unsupervised Learning Embedding Method Based on Semantic Hashing

Publication year: 1401 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this article is available as a 9-page PDF.

National scientific document ID:

JR_MSEEE-2-3_005

Indexing date: 2 Mehr 1403 (Solar Hijri)

Abstract:

Embedding learning is an essential task in Natural Language Processing (NLP) applications. Most existing methods measure the similarity between text chunks in a context using pre-trained word embeddings. However, providing labeled data for model training is costly and time-consuming, so the performance of these methods degrades when only limited amounts of training data are available. This paper presents an unsupervised sentence embedding method that effectively integrates semantic hashing into Kernel Principal Component Analysis (KPCA) to construct lower-dimensional embeddings that can be applied to any domain. Experiments conducted on benchmark datasets show that the generated embeddings are general-purpose and can capture semantic meaning from both small and large corpora.
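
The abstract does not spell out how hashing and KPCA are combined, so the following is only a minimal sketch of one plausible pipeline, not the authors' algorithm. It assumes bag-of-words counts as sentence features, random-hyperplane (SimHash-style) sign bits as the hash, and an RBF-style kernel on the normalized Hamming distance between codes. All function names and parameters (`bag_of_words`, `hash_codes`, `hamming_rbf_kernel`, `kernel_pca`, `embed`, `n_bits`, `gamma`) are hypothetical and chosen for illustration.

```python
import numpy as np


def bag_of_words(sentences):
    """Term-count vectors; an assumed stand-in for the real sentence features."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(sentences), len(vocab)))
    for row, s in enumerate(sentences):
        for w in s.lower().split():
            X[row, index[w]] += 1.0
    return X


def hash_codes(X, n_bits=256, seed=0):
    """Random-hyperplane hashing: one +1/-1 sign bit per random projection."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    return np.where(X @ planes >= 0, 1.0, -1.0)


def hamming_rbf_kernel(codes, gamma=4.0):
    """RBF-style kernel on the normalized Hamming distance between codes."""
    n_bits = codes.shape[1]
    hamming = (n_bits - codes @ codes.T) / 2.0  # number of differing bits
    return np.exp(-gamma * hamming / n_bits)


def kernel_pca(K, n_components=2):
    """Project the training points onto the top kernel principal components."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    eigvals, eigvecs = np.linalg.eigh(J @ K @ J)   # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def embed(sentences, n_bits=256, n_components=2, seed=0):
    """Hash the sentences, build the kernel on the codes, reduce with KPCA."""
    codes = hash_codes(bag_of_words(sentences), n_bits, seed)
    return kernel_pca(hamming_rbf_kernel(codes), n_components)


if __name__ == "__main__":
    demo = [
        "the cat sat on the mat",
        "a cat sat on a mat",
        "stock prices fell sharply today",
    ]
    print(embed(demo).round(3))
```

In this sketch no labels are used at any step, which matches the unsupervised setting described above. Sentences with overlapping vocabulary tend to share more hash bits and therefore land closer together after projection. A practical version would likely replace the bag-of-words features with richer representations and tune `n_bits`, `gamma`, and `n_components` on held-out data.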

Keywords:

Kernel Principal Component Analysis, Natural Language Processing, Semantic Hashing, Sentence Embedding

Authors

Javad Hamidzadeh

Faculty of Computer Engineering and Information Technology, Sadjad University, Mashhad, Iran.

Mona Moradi

Faculty of Electrical and Computer Engineering, Semnan University, Semnan, Iran.
