Deep Q-learning: a robust control approach


Varga, Balázs, 1990 (author): Chalmers University of Technology

Kulcsár, Balázs Adam, 1975 (author): Chalmers University of Technology

Haghir Chehreghani, Morteza, 1982 (author): Chalmers University of Technology


2022-10-29
2023
English.
In: International Journal of Robust and Nonlinear Control. Wiley. ISSN 1099-1239, 1049-8923. 33:1, pp. 526-544

Related links: https://research.cha... (primary) (free); https://research.cha...; https://doi.org/10.1...

Journal article (peer-reviewed)

Abstract

This work aims to build a bridge between robust control theory and reinforcement learning. Although reinforcement learning has shown admirable results in complex control tasks, the agent’s learning behaviour is opaque. Meanwhile, system theory has several tools for analyzing and controlling dynamical systems. This paper places deep Q-learning into a control-oriented perspective to study its learning dynamics with well-established techniques from robust control. An uncertain linear time-invariant model is formulated by means of the neural tangent kernel to describe learning. This novel approach allows giving conditions for stability (convergence) of the learning and enables the analysis of the agent’s behaviour in the frequency domain. The control-oriented approach makes it possible to formulate robust controllers that inject dynamical rewards as control input in the loss function to achieve better convergence properties. Three output-feedback controllers are synthesized: gain-scheduling H2, dynamical Hinf, and fixed-structure Hinf controllers. Compared to traditional deep Q-learning techniques, which involve several heuristics, setting up the learning agent with a control-oriented tuning methodology is more transparent and has well-established literature. The proposed approach uses neither a target network nor a randomized replay memory. The role of the target network is taken over by the control input, which also exploits the temporal dependency of samples (as opposed to a randomized memory buffer). Numerical simulations in different OpenAI Gym environments suggest that the Hinf-controlled learning can converge faster and receive higher scores (depending on the environment) compared to the benchmark Double deep Q-learning.
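
The link between a tangent kernel and a stability (convergence) condition can be illustrated with a toy example. The sketch below is not the paper's uncertain linear time-invariant model. It assumes a model that is linear in its parameters, so the tangent kernel Theta is constant, and plain gradient descent on a squared loss, under which the prediction error follows e_{k+1} = (I - lr * Theta) e_k. The feature vectors, the initial error, the learning rates, and all function names are hypothetical values chosen for illustration.

```python
import math

def ntk_gram(features):
    """Gram matrix Theta[i][j] = phi(x_i) . phi(x_j) of a linear-in-parameters model."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features] for fi in features]

def sym2_eigenvalues(m):
    """Eigenvalues of a symmetric 2x2 matrix, smallest first (closed form)."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def is_stable(theta, lr):
    """The error dynamics converge iff |1 - lr * lambda| < 1 for every eigenvalue of Theta."""
    return all(abs(1.0 - lr * lam) < 1.0 for lam in sym2_eigenvalues(theta))

def simulate_error(theta, e0, lr, steps):
    """Iterate e <- (I - lr * Theta) e and return the error norm after each step."""
    e = list(e0)
    norms = [math.hypot(*e)]
    for _ in range(steps):
        e = [e[i] - lr * sum(theta[i][j] * e[j] for j in range(2)) for i in range(2)]
        norms.append(math.hypot(*e))
    return norms

# Hypothetical feature vectors phi(x_1), phi(x_2) and initial prediction error.
features = [[1.0, 0.5], [0.2, 1.0]]
theta = ntk_gram(features)
e0 = [1.0, -1.0]

for lr in (0.5, 1.2):
    norms = simulate_error(theta, e0, lr, steps=50)
    print(f"lr={lr}: stable={is_stable(theta, lr)}, final error norm={norms[-1]:.3e}")
```

In this toy setting the learning rate is the only tuning knob, and the eigenvalue test plays the role of a stability condition for the learning dynamics.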

Subject headings

NATURAL SCIENCES -- Mathematics -- Computational Mathematics (hsv//eng)
ENGINEERING AND TECHNOLOGY -- Electrical Engineering, Electronic Engineering, Information Engineering -- Robotics (hsv//eng)
ENGINEERING AND TECHNOLOGY -- Electrical Engineering, Electronic Engineering, Information Engineering -- Control Engineering (hsv//eng)
ENGINEERING AND TECHNOLOGY -- Electrical Engineering, Electronic Engineering, Information Engineering -- Signal Processing (hsv//eng)

Keywords

Deep Q-learning
Robust control
Neural Tangent Kernel
Controlled learning

Publication and Content Type

art (subject category)
ref (subject category)
