SwePub
Safe Reinforcement Learning via a Model-Free Safety Certifier

Modares, Amir (author)
Sharif Univ Technol, Iran
Sadati, Nasser (author)
Sharif Univ Technol, Iran
Esmaeili, Babak (author)
Michigan State Univ, MI 48863 USA
Adib Yaghmaie, Farnaz (author)
Linköpings universitet, Reglerteknik, Tekniska fakulteten
Modares, Hamidreza (author)
Michigan State Univ, MI 48863 USA
2023
English.
In: IEEE Transactions on Neural Networks and Learning Systems. - IEEE. - ISSN 2162-237X, E-ISSN 2162-2388.
  • Journal article (peer-reviewed)
Abstract
This article presents a data-driven safe reinforcement learning (RL) algorithm for discrete-time nonlinear systems. A data-driven safety certifier is designed to intervene with the actions of the RL agent to ensure both safety and stability of its actions. This is in sharp contrast to existing model-based safety certifiers, which can result in convergence to an undesired equilibrium point or in conservative interventions that jeopardize the performance of the RL agent. To this end, the proposed method directly learns a robust safety certifier while completely bypassing the identification of the system model. The nonlinear system is modeled using linear parameter-varying (LPV) systems with polytopic disturbances. To avoid learning an explicit model of the LPV system, data-based $\lambda$-contractivity conditions are first provided for the closed-loop system to enforce robust invariance of a prespecified polyhedral safe set and the system's asymptotic stability. These conditions are then leveraged to directly learn a robust data-based gain-scheduling controller by solving a convex program. A significant advantage of the proposed direct safe learning over model-based certifiers is that it completely resolves conflicts between safety and stability requirements while assuring convergence to the desired equilibrium point. Data-based safety certification conditions are then provided using Minkowski functions. They are then used to seamlessly integrate the learned backup safe gain-scheduling controller with the RL controller. Finally, we provide a simulation example to verify the effectiveness of the proposed approach.
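As a reading aid, the sketch below illustrates the kind of Minkowski-function (gauge) safety check and backup-controller intervention that the abstract describes. It is a minimal illustration under stated assumptions, not the paper's algorithm: the paper learns the certifier and the gain-scheduling controller directly from data, whereas here `predict_next`, `theta_of`, and `K_sched` are hypothetical placeholders supplied by the user.

```python
import numpy as np

def gauge(F, x):
    """Minkowski (gauge) function of the polyhedral set S = {x : F x <= 1};
    a value <= 1 means x lies inside S."""
    return float(np.max(F @ x))

def safe_filter(x, a_rl, F, K_sched, theta_of, predict_next, lam=0.95):
    """Return the RL action if the candidate next state stays lambda-contractive
    inside S; otherwise fall back to the backup gain-scheduled action K(theta) x.

    Hypothetical placeholders (not from the paper):
      predict_next(x, a): one-step state predictor,
      theta_of(x):        scheduling-parameter map of the LPV model,
      K_sched(theta):     backup gain-scheduling feedback gain.
    """
    x_next = predict_next(x, a_rl)                      # candidate next state
    if gauge(F, x_next) <= lam * max(gauge(F, x), 1e-9):
        return a_rl                                     # RL action certified safe
    return K_sched(theta_of(x)) @ x                     # safe backup action
```

The `max(..., 1e-9)` guard only avoids division-like amplification of numerical noise near the origin; the essential idea is that the RL action is passed through unchanged whenever it keeps the gauge of the state shrinking by the factor $\lambda$.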

Subject headings

NATURVETENSKAP  -- Data- och informationsvetenskap -- Datavetenskap (hsv//swe)
NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Sciences (hsv//eng)

Keyword

Data-driven control; gain-scheduling control; reinforcement learning (RL); safe control

Publication and Content Type

ref (peer-reviewed)
art (journal article)

