Deep Reinforcement Learning-Based Controller Design for a Vertical Takeoff and Landing System Model

In this study, the Deep Deterministic Policy Gradient (DDPG) algorithm, a deep reinforcement learning method that combines artificial neural networks with reinforcement learning, was applied to a Vertical Takeoff and Landing (VTOL) system model to control its pitch angle. The algorithm was selected because conventional controllers such as Proportional-Integral-Derivative (PID) controllers, even with optimally tuned gains, cannot always generate a control signal that eliminates disturbances and unwanted environmental effects acting on the system. To address this problem, DDPG was chosen among deep reinforcement learning methods for its continuous action space: it produces control actions that maximize a reward function defined for the control objective, and it draws on the generalization ability of artificial neural networks. The agent was trained on the mathematical model of the VTOL system in the Simulink environment using a sinusoidal reference. The pitch-angle tracking performance of DDPG for sinusoidal and constant references was compared with that of a conventional PID controller in terms of mean square error, integral square error, integral absolute error, percentage overshoot, and settling time. The results are presented through simulation studies.
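
The comparison criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the rectangle-rule approximation of the integrals, the 2% settling band, and the underdamped second-order test response (used here in place of the VTOL model, which the abstract does not specify) are all assumptions made for illustration.

```python
import math


def tracking_metrics(t, y, r):
    """MSE, ISE and IAE of the error e = r - y on a uniform time grid t."""
    dt = t[1] - t[0]
    e = [ri - yi for ri, yi in zip(r, y)]
    mse = sum(ei * ei for ei in e) / len(e)      # mean square error
    ise = sum(ei * ei for ei in e) * dt          # integral square error (rectangle rule)
    iae = sum(abs(ei) for ei in e) * dt          # integral absolute error (rectangle rule)
    return {"mse": mse, "ise": ise, "iae": iae}


def step_metrics(t, y, r_final, band=0.02):
    """Percent overshoot and settling time for a constant reference r_final > 0.

    Settling time is taken as the first sample after which the output stays
    within +/- band * r_final of the reference (assumed 2% band by default).
    Returns None for the settling time if the last sample is still outside.
    """
    overshoot = max(0.0, (max(y) - r_final) / abs(r_final) * 100.0)
    tol = band * abs(r_final)
    last_outside = -1
    for i, yi in enumerate(y):
        if abs(yi - r_final) > tol:
            last_outside = i
    if last_outside == len(y) - 1:
        settling = None
    else:
        settling = t[last_outside + 1]
    return overshoot, settling


# Hypothetical test signal: unit-step response of an underdamped
# second-order system (zeta and wn are illustrative values only).
zeta, wn, dt = 0.5, 2.0, 0.01
wd = wn * math.sqrt(1.0 - zeta ** 2)
t = [i * dt for i in range(1001)]
y = [
    1.0 - math.exp(-zeta * wn * ti)
    * (math.cos(wd * ti) + zeta * wn / wd * math.sin(wd * ti))
    for ti in t
]
r = [1.0] * len(t)

metrics = tracking_metrics(t, y, r)
overshoot, settling = step_metrics(t, y, 1.0)
print(metrics)
print(overshoot, settling)
```

The same two functions could be applied to the logged pitch angle of each controller, so that the DDPG agent and the PID controller are scored on identical criteria. For a sinusoidal reference only `tracking_metrics` is meaningful, since overshoot and settling time are defined for a constant reference.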
