PID Controller Tuning with Deep Reinforcement Learning Policy Gradient Methods
Published in: 29th Annual International Conference of the Iranian Society of Mechanical Engineers and 8th Conference on Thermal Power Plant Industry
Publication year: 1400 (Solar Hijri)
Document type: Conference paper
Language: English
The full text of this paper is available as a 6-page PDF file.
National scientific document ID:
ISME29_304
Indexing date: 13 Tir 1400
Abstract:
In this paper, the challenge of tuning a PID controller for a single-input single-output (SISO) system is addressed with a couple of reinforcement learning (RL) agents that automatically find the optimum values of the controller parameters (kp, ki, kd). First, a self-balancing robot with two coaxial wheels was simulated using the PyBullet physics library. Motors and an Inertial Measurement Unit (IMU) were added via PyBullet features. Next, the robot's environment was defined using the OpenAI Gym library. Both the state space and the action space of the RL agents are continuous, and an artificial neural network (ANN) was used as the function approximator in the agents. For higher computation speed and faster training, the agents were implemented with Microsoft COAX, JAX, and Haiku, since these libraries support GPU acceleration. Neural network backpropagation is a computationally expensive operation, and if the forward pass of the ANN demands more than the hardware can deliver, it may cause problems for real-time simulation (step-wise simulation remains possible in all cases). During training, the agents' properties were recorded and plotted. Finally, the agents were compared with one another and with a controller tuned manually by the classic method. Even with a PID controller in the loop, the system remains naturally unstable when the controller is untuned or randomly adjusted, so stability criteria (controller stability, pitch angle of the torso, linear or angular speed of the center of mass, etc.) should be considered in the reward function for the best possible results.
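The role of the reward function described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the agent's action is the gain triple (kp, ki, kd), and it replaces the PyBullet robot with a hypothetical linearized inverted pendulum, theta'' = A*theta + B*u. The constants A, B, FALL_ANGLE and FALL_PENALTY, the reward weights, and the function name episode_return are all illustrative assumptions. No learning agent is included; the sketch only computes the episodic return that such an agent would try to maximize.

```python
# Toy stand-in for the simulated robot (all constants are assumptions).
A = 20.0             # gravity term of the linearized pendulum, 1/s^2
B = 1.0              # control gain of the linearized pendulum
FALL_ANGLE = 0.5     # pitch angle (rad) at which the episode ends
FALL_PENALTY = 100.0 # extra penalty applied when the robot falls


class PID:
    """Discrete PID controller with a rectangular-rule integral."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None  # no derivative on the very first step

    def step(self, error):
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def episode_return(gains, steps=500, dt=0.01, theta0=0.05):
    """Simulate one episode and return the sum of rewards for the given gains."""
    kp, ki, kd = gains
    pid = PID(kp, ki, kd, dt)
    theta, omega = theta0, 0.0  # pitch angle (rad) and pitch rate (rad/s)
    total = 0.0
    for _ in range(steps):
        u = pid.step(0.0 - theta)      # setpoint: upright, theta = 0
        omega += (A * theta + B * u) * dt
        theta += omega * dt            # semi-implicit Euler integration
        if abs(theta) > FALL_ANGLE:    # the robot has fallen over
            total -= FALL_PENALTY
            break
        # Stability criteria enter the reward: pitch angle and pitch rate.
        total += -(theta ** 2) - 0.01 * omega ** 2
    return total


if __name__ == "__main__":
    print("zero gains :", episode_return((0.0, 0.0, 0.0)))
    print("hand gains :", episode_return((50.0, 0.0, 5.0)))
```

Under these assumptions, gains that stabilize the pendulum collect only small quadratic penalties, whereas gains that let it fall also incur the fall penalty, so the return separates stabilizing from non-stabilizing gain triples.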
Keywords:
Authors
Kasra Sinaei
Center of Advanced Systems and Technology, University of Tehran, Tehran
Mohammad Reza Ha'iri Yazdi
School of Mechanical Engineering, University of Tehran, Tehran