Use of PID Control During Training in Reinforcement Learning on a Two-Wheel Balancing Robot

The primary objective of this study was to shorten the training time of Reinforcement Learning (RL), a Machine Learning method, by using proportional-integral-derivative (PID) control during training. The study uses a balancing robot with two wheels on the same axis that can be controlled independently. While the robot is in balance, the RL software block observes how the PID block maintains the balance, so the RL block learns how to respond to disturbances without the robot physically falling and being raised again. Training RL requires creating approximately 500 policy / reward / path equations between the current-state and future-state matrices. The number of equations increases considerably when quantities such as previous position and acceleration are added. Approximately 1000 trial-and-error cycles are normally required for training, which means many fall / rise cycles. With the method we present, the RL block learned to keep the robot in balance, without falling and without requiring human intervention, in 400 trials. This shortened the training time by about 60%.
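The idea of an RL block that learns by watching a PID block can be illustrated with a minimal sketch. The abstract does not specify the learning algorithm, so tabular Q-learning is swapped in here purely for illustration. The plant is a toy linearized inverted pendulum rather than the actual robot, and every constant, gain, bin size, and function name below is a hypothetical choice, not a value from the study.

```python
import random

# All values below are illustrative assumptions, not taken from the study.
DT = 0.01                    # integration step in seconds
GRAVITY_OVER_LENGTH = 9.81   # g / l for a toy pendulum with l = 1 m
ACTIONS = [float(a) for a in range(-10, 11)]  # discrete acceleration commands
FALL_ANGLE = 0.5             # tilt (rad) that would count as a fall


class PID:
    """Textbook PID on the tilt angle, using the measured rate as the D term."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0

    def output(self, angle, rate):
        self.integral += angle * DT
        return -(self.kp * angle + self.ki * self.integral + self.kd * rate)


def discretize(angle, rate):
    """Map the continuous state to a coarse grid cell used as the Q-table key."""
    return (round(angle / 0.02), round(rate / 0.2))


def nearest_action(u):
    """Index of the discrete action closest to the continuous PID command."""
    return min(range(len(ACTIONS)), key=lambda i: abs(ACTIONS[i] - u))


def train_from_pid(steps=5000, alpha=0.1, gamma=0.95, seed=0):
    """Let the PID drive the plant while a Q-table is updated from what it does."""
    rng = random.Random(seed)
    pid = PID(kp=50.0, ki=1.0, kd=5.0)
    q = {}
    angle, rate, falls = 0.05, 0.0, 0
    for _ in range(steps):
        state = discretize(angle, rate)
        a = nearest_action(pid.output(angle, rate))   # action chosen by the PID
        disturbance = rng.uniform(-1.0, 1.0)          # random push on the body
        accel = GRAVITY_OVER_LENGTH * angle + ACTIONS[a] + disturbance
        rate += accel * DT
        angle += rate * DT
        if abs(angle) > FALL_ANGLE:
            falls += 1
        reward = 1.0 - (angle / FALL_ANGLE) ** 2      # positive while upright
        next_state = discretize(angle, rate)
        row = q.setdefault(state, [0.0] * len(ACTIONS))
        best_next = max(q.get(next_state, [0.0]))
        # Standard Q-learning update, applied to the PID-selected action.
        row[a] += alpha * (reward + gamma * best_next - row[a])
    return q, falls


def greedy_action(q, angle, rate):
    """Action the learned table would pick; 0.0 for states never observed."""
    row = q.get(discretize(angle, rate))
    return 0.0 if row is None else ACTIONS[row.index(max(row))]


if __name__ == "__main__":
    table, fall_count = train_from_pid()
    print("visited states:", len(table), "falls:", fall_count)
```

Because the PID keeps the toy pendulum near upright, the table is filled from transitions gathered without any fall, which mirrors the motivation stated above. A real robot would need a richer state, a check that the learned policy can take over from the PID, and values tuned on hardware.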
