Whitened gradient descent, a new updating method for optimizers in deep neural networks

H. Gholamalinejad; H. Khosravi

Published in: Journal of Artificial Intelligence and Data Mining (JADM), Volume: 10, Issue: 4

Publication year: 1401 (Solar Hijri)

Document type: Journal article

Language: English

The full text of this paper is available as a 12-page PDF file.

This paper is classified under the following subject areas:

Artificial Intelligence > Deep Learning

Permanent link to this paper:

https://civilica.com/doc/1570950

National scientific document ID:

JR_JADM-10-4_002

Indexing date: 28 Azar 1401

Abstract:

Optimizers, which perform the weight updates, are vital components of deep neural networks. This paper introduces whitened gradient descent (WGD), a new updating method for optimizers based on gradient descent. The method is easy to implement, can be used in any optimizer based on the gradient descent algorithm, and does not significantly increase the training time of the network. It smooths the training curve and improves classification metrics. To evaluate the proposed algorithm, we performed 48 different tests on two datasets, CIFAR-100 and Animals-10, using three network structures: DenseNet121, ResNet18, and ResNet50. The experiments show that using WGD in gradient-descent-based optimizers significantly improves the classification results. For example, integrating WGD into the RAdam optimizer increased the accuracy of DenseNet from 87.69% to 90.02% on the Animals-10 dataset.
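The abstract states that the method can be plugged into any gradient-descent-based optimizer but does not give the WGD update rule itself. The following is therefore only a minimal sketch of the integration point, under stated assumptions: the function names (`standardize`, `sgd_momentum_step`), the `transform` hook, and the hyperparameter values are hypothetical, and the per-vector standardization used here is an illustrative stand-in, not the authors' whitening step.

```python
from statistics import fmean, pstdev
from typing import Callable, List, Tuple

Vector = List[float]


def standardize(grad: Vector, eps: float = 1e-8) -> Vector:
    """Illustrative stand-in transform: zero-mean, unit-variance gradient.

    ASSUMPTION: this is NOT the WGD rule from the paper; the abstract
    does not specify it. It only shows where such a step could sit.
    """
    mean = fmean(grad)
    std = pstdev(grad)  # population standard deviation
    return [(g - mean) / (std + eps) for g in grad]


def identity(grad: Vector) -> Vector:
    """Default hook: leave the gradient unchanged."""
    return grad


def sgd_momentum_step(
    weights: Vector,
    velocity: Vector,
    grad: Vector,
    lr: float = 0.1,
    momentum: float = 0.9,
    transform: Callable[[Vector], Vector] = identity,
) -> Tuple[Vector, Vector]:
    """One momentum-SGD step with an optional gradient transform hook."""
    g = transform(grad)  # hypothetical hook applied before the update
    new_velocity = [momentum * v + gi for v, gi in zip(velocity, g)]
    new_weights = [w - lr * v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Same weights and gradient, with and without the hook.
w, v = [1.0, -2.0, 0.5], [0.0, 0.0, 0.0]
grad = [0.2, -0.4, 0.8]
w_plain, _ = sgd_momentum_step(w, v, grad)
w_hooked, _ = sgd_momentum_step(w, v, grad, transform=standardize)
```

Keeping the transform as a separate callable leaves the optimizer's own update rule unchanged, which is one way to read the abstract's claim that the method applies to any gradient-descent-based optimizer.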

Keywords:

Deep Learning, Optimizer, Whitened Gradient Descent, Momentum

Authors

H. Gholamalinejad

Department of Computer, Faculty of Engineering, Bozorgmehr University of Qaenat, Qaen, Iran.

H. Khosravi

Faculty of Electrical Engineering, Shahrood University of Technology.
