Improving the Performance of Q-learning Using Simultaneous Q-values Updating

Publication year: 1393 (Solar Hijri)
Document type: Conference paper
Language: English
Views: 1,049

The full text of this paper is available for download as a 6-page PDF file.

National scientific document ID:

IINC02_020

Indexing date: 25 Farvardin 1394

Abstract:

Q-learning is one of the best model-free reinforcement learning algorithms. The goal is to find an estimate of the optimal action-value function, called the Q-value function. The Q-value function is defined as the expected sum of future rewards obtained by taking an action in the current state. The main drawback of Q-learning is that the learning process is expensive for the agent, especially in the beginning steps, because every state-action pair must be visited frequently in order to converge to the optimal policy. In this paper, the concept of opposite actions is used to improve the performance of the Q-learning algorithm, especially in the beginning steps of learning. Opposite actions allow two Q-values to be updated simultaneously: the agent updates the Q-value for each action it takes and for the corresponding opposite action, thus increasing the speed of learning. The novel Q-learning method based on the concept of opposite actions is simulated on the well-known grid-world test-bed problem. The results show the ability of the proposed method to improve the learning process.
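
The abstract describes the update idea only at a high level; the exact rule the authors apply to the opposite action is not given here. The Python sketch below therefore only illustrates the general mechanism of updating two Q-values per step on a grid world: the taken action is updated with the standard one-step Q-learning rule, and, under the purely illustrative assumption that the opposite action corresponds to the reversed transition with a negated reward, its paired Q-value is updated at the same time. The grid size, learning rate, and discount factor are placeholders, not values from the paper.

import numpy as np

# Minimal sketch of Q-learning with simultaneous opposite-action updates on a
# grid world. The opposite-update rule shown here (reversed transition with a
# negated reward) is an assumption for illustration, not the paper's method.

N = 5                                   # assumed grid size (hypothetical)
ACTIONS = [0, 1, 2, 3]                  # up, down, left, right
OPPOSITE = {0: 1, 1: 0, 2: 3, 3: 2}     # each action paired with its opposite
ALPHA, GAMMA = 0.1, 0.95                # assumed learning rate and discount

Q = np.zeros((N * N, len(ACTIONS)))     # tabular Q-values, states flattened row-major

def q_update(s, a, r, s_next):
    """Standard one-step Q-learning update for the state-action pair (s, a)."""
    td_target = r + GAMMA * Q[s_next].max()
    Q[s, a] += ALPHA * (td_target - Q[s, a])

def simultaneous_update(s, a, r, s_next):
    """Update the taken action and, at the same time, its opposite action."""
    q_update(s, a, r, s_next)
    # Assumed opposite update: taking the opposite action from s_next would
    # undo the move, so it is credited with the reversed transition and -r.
    q_update(s_next, OPPOSITE[a], -r, s)

# Example: the agent moves "up" from state 6 to state 1 and receives reward -1;
# both Q[6, up] and Q[1, down] are updated in the same step.
simultaneous_update(6, 0, -1.0, 1)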

Authors

Maryam Pouyan

Electrical and Computer Engineering Department, Hormozgan University, Bandar Abbas, Iran

Amin Mousavi

Electrical and Computer Engineering Department, Hormozgan University, Bandar Abbas, Iran

Shahram Golzari

Electrical and Computer Engineering Department, Hormozgan University, Bandar Abbas, Iran

Ahmad Hatam

Electrical and Computer Engineering Department, Hormozgan University, Bandar Abbas, Iran