A theoretical demonstration for reinforcement learning of PI control dynamics for optimal speed control of DC motors by using Twin Delay Deep Deterministic Policy Gradient Algorithm

dc.authorid: Alagoz, Baris Baykant/0000-0001-5238-6433
dc.authorid: Herencsar, Norbert/0000-0002-9504-2275
dc.authorid: Kavuran, Gurkan/0000-0003-2651-5005
dc.authorwosid: Alagoz, Baris Baykant/ABG-8526-2020
dc.authorwosid: Herencsar, Norbert/A-6539-2009
dc.authorwosid: Kavuran, Gurkan/S-6935-2016
dc.contributor.author: Tufenkci, Sevilay
dc.contributor.author: Alagoz, Baris Baykant
dc.contributor.author: Kavuran, Gurkan
dc.contributor.author: Yeroglu, Celaleddin
dc.contributor.author: Herencsar, Norbert
dc.contributor.author: Mahata, Shibendu
dc.date.accessioned: 2024-08-04T20:53:08Z
dc.date.available: 2024-08-04T20:53:08Z
dc.date.issued: 2023
dc.department: İnönü Üniversitesi (en_US)
dc.description.abstract: To benefit from the advantages of Reinforcement Learning (RL) in industrial control applications, RL methods can be used for optimal tuning of classical controllers based on simulation scenarios of operating conditions. In this study, the Twin Delay Deep Deterministic (TD3) policy gradient method, an effective actor-critic RL strategy, is implemented to learn optimal Proportional Integral (PI) controller dynamics from a Direct Current (DC) motor speed control simulation environment. For this purpose, the PI controller dynamics are introduced to the actor network by using the PI-based observer states from the control simulation environment. A suitable Simulink simulation environment is adapted to perform the training process of the TD3 algorithm. The actor network learns the optimal PI controller dynamics by using a reward mechanism that implements the minimization of the optimal control objective function. A setpoint filter is used to describe the desired setpoint response, and step disturbance signals with random amplitude are incorporated into the simulation environment to improve disturbance rejection skills through experience-based learning. When the training task is completed, the optimal PI controller coefficients are obtained from the weight coefficients of the actor network. The performances of the optimal PI dynamics learned by the TD3 algorithm and by the Deep Deterministic Policy Gradient algorithm are compared. Moreover, the control performance improvement of this RL-based PI controller tuning method (RL-PI) is demonstrated relative to the performances of both integer-order and fractional-order PI controllers tuned by several popular metaheuristic optimization algorithms such as Genetic Algorithm, Particle Swarm Optimization, Grey Wolf Optimization, and Differential Evolution. (en_US)
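The abstract describes an actor network that acts on PI-based observer states, so that after training its weights can be read off as the PI coefficients. The idea can be sketched as follows; this is a minimal illustration, not the paper's implementation: the first-order DC motor model (gain K, time constant tau), the specific gains, and the function name `simulate_pi` are all hypothetical, and the TD3 training loop and Simulink environment are omitted entirely — the "actor weights" here simply stand in for learned PI gains.

```python
# Sketch: a linear "actor" over the PI observer states [e, integral of e]
# is exactly a PI control law, so its two weights are (Kp, Ki).
# Plant: assumed first-order DC motor speed model, dw/dt = (-w + K*u)/tau.

def simulate_pi(actor_weights, setpoint=1.0, dt=0.01, steps=2000):
    kp, ki = actor_weights        # actor weights play the role of PI gains
    K, tau = 2.0, 0.5             # hypothetical motor gain and time constant
    speed, integ = 0.0, 0.0
    for _ in range(steps):
        e = setpoint - speed      # observer state 1: tracking error
        integ += e * dt           # observer state 2: integral of error
        u = kp * e + ki * integ   # linear actor output = PI control signal
        speed += dt * (-speed + K * u) / tau  # Euler step of motor dynamics
    return speed

# With stable gains the closed loop settles to the setpoint.
final_speed = simulate_pi((2.0, 5.0))
```

In the paper's scheme, TD3 would adjust these two weights through the reward signal; here they are fixed only to show how a trained linear actor maps directly onto a PI controller.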
dc.identifier.doi: 10.1016/j.eswa.2022.119192
dc.identifier.issn: 0957-4174
dc.identifier.issn: 1873-6793
dc.identifier.scopus: 2-s2.0-85141914275 (en_US)
dc.identifier.scopusquality: Q1 (en_US)
dc.identifier.uri: https://doi.org/10.1016/j.eswa.2022.119192
dc.identifier.uri: https://hdl.handle.net/11616/100993
dc.identifier.volume: 213 (en_US)
dc.identifier.wos: WOS:000890664400010 (en_US)
dc.identifier.wosquality: Q1 (en_US)
dc.indekslendigikaynak: Web of Science (en_US)
dc.indekslendigikaynak: Scopus (en_US)
dc.language.iso: en (en_US)
dc.publisher: Pergamon-Elsevier Science Ltd (en_US)
dc.relation.ispartof: Expert Systems With Applications (en_US)
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institution Faculty Member (en_US)
dc.rights: info:eu-repo/semantics/closedAccess (en_US)
dc.subject: Deep reinforcement learning (en_US)
dc.subject: DC motor (en_US)
dc.subject: PI controller (en_US)
dc.subject: Twin-delayed deep deterministic policy gradient (en_US)
dc.subject: Metaheuristic optimization (en_US)
dc.title: A theoretical demonstration for reinforcement learning of PI control dynamics for optimal speed control of DC motors by using Twin Delay Deep Deterministic Policy Gradient Algorithm (en_US)
dc.type: Article (en_US)
