CIVILICA We Respect the Science
(Specialized publisher of the country's conference proceedings / publication license number from the Ministry of Culture and Islamic Guidance: 8971)

Learning a model-free robotic continuous state-action task through contractive Q-network

Paper title: Learning a model-free robotic continuous state-action task through contractive Q-network
National paper ID: ISME25_102
Published in the 25th Annual International Conference on Mechanical Engineering (ISME25), 1396 (2017)
Authors:

MohammadJavad Davari Dolatabadi - MSc Student, Department of Mechatronics Eng., Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran
Khalil Alipour - Assistant Professor, Department of Mechatronics Eng., Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran
Alireza Hadi - Assistant Professor, Department of Mechatronics Eng., Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran
Bahram Tarvirdizadeh - Assistant Professor, Department of Mechatronics Eng., Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran

Abstract:
The main purpose of this paper is to ease the move toward autonomous learning robots using reinforcement learning (RL). The deterministic policy gradient algorithm is chosen to work in a model-free, continuous state-space configuration. Neural networks are chosen as the function approximators (FAs) for the actor and the critic in the algorithm. A novel method called the contractive Q-network is proposed for updating the critic FA (Q-network). Since this method requires fewer samples to learn a task, it is more efficient in this context. To show the efficiency of the developed method, two illustrative examples are conducted: first in the well-known puddle world, and then a Push Recovery (PR) task on a simulated humanoid robot. The robot learns how to recover from a variety of force directions and magnitudes.
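The abstract's actor-critic structure can be sketched in code. The snippet below is a minimal, illustrative deterministic actor-critic update in the style of the deterministic policy gradient (DPG) the paper builds on, using linear function approximators in place of the paper's neural networks; the proposed contractive Q-network critic update itself is not described in the abstract and is not reproduced here. All dimensions, step sizes, and function names are assumptions for illustration only.

```python
import numpy as np

# Illustrative DPG-style actor-critic update (NOT the paper's contractive
# Q-network method). Linear FAs stand in for the paper's neural networks.
rng = np.random.default_rng(0)
state_dim, action_dim = 4, 1                 # assumed toy dimensions
gamma, alpha_critic, alpha_actor = 0.99, 1e-2, 1e-3

# Critic Q(s, a) = w . [s; a]; actor mu(s) = theta^T s (deterministic policy).
w = rng.normal(scale=0.1, size=state_dim + action_dim)
theta = rng.normal(scale=0.1, size=(state_dim, action_dim))

def phi(s, a):
    return np.concatenate([s, a])            # joint state-action features

def mu(s):
    return theta.T @ s                       # deterministic policy output

def q(s, a):
    return w @ phi(s, a)                     # scalar action value

def update(s, a, r, s_next):
    """One semi-gradient TD(0) critic step and one DPG actor step."""
    global w, theta
    a_next = mu(s_next)                      # bootstrap with the current policy
    td_error = r + gamma * q(s_next, a_next) - q(s, a)
    w += alpha_critic * td_error * phi(s, a)         # critic: TD update
    grad_q_a = w[state_dim:]                         # dQ/da for the linear critic
    theta += alpha_actor * np.outer(s, grad_q_a)     # actor: (dmu/dtheta) * (dQ/da)
    return td_error

s = rng.normal(size=state_dim)
a = mu(s) + 0.1 * rng.normal(size=action_dim)        # exploration noise on the action
update(s, a, 1.0, rng.normal(size=state_dim))
```

The actor step follows the deterministic policy gradient chain rule, moving the policy parameters in the direction that increases the critic's value at the policy's own action; the paper's contribution replaces the critic update with a more sample-efficient scheme.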

Keywords:
Reinforcement Learning, Neural Network, Biped Robot, Deterministic Policy Gradient, Continuous State-Action

Paper page and full-text download: https://civilica.com/doc/634622/