Regret Minimization for Linear Quadratic Adaptive Controllers Using Fisher Feedback Exploration
- Colin, Kevin (author)
- KTH, Reglerteknik
- Ferizbegovic, Mina (author)
- KTH, Reglerteknik
- Hjalmarsson, Håkan, 1962- (author)
- KTH, Reglerteknik
- Institute of Electrical and Electronics Engineers (IEEE), 2022
- English.
Published in: IEEE Control Systems Letters. Institute of Electrical and Electronics Engineers (IEEE). ISSN 2475-1456. Vol. 6, pp. 2870-2875
Related links:
- https://kth.diva-por... (primary) (Raw object)
- https://urn.kb.se/re...
- https://doi.org/10.1...
Abstract
- In this letter, we study the trade-off between exploration and exploitation for linear quadratic adaptive control. This trade-off can be expressed as a function of the exploration and exploitation costs, called cumulative regret. It has been shown over the years that the optimal asymptotic rate of the cumulative regret is in many instances O(√T). In particular, this rate can be obtained by adding a white noise external excitation with a variance decaying as O(1/√T). As the amount of excitation is pre-determined, such approaches can be viewed as open-loop control of the external excitation. In this contribution, we approach the problem of designing the external excitation from a feedback perspective, leveraging the well-known benefits of feedback control for decreasing sensitivity to external disturbances and system-model mismatch, as compared to open-loop strategies. We base the feedback on the Fisher information matrix, which is a measure of the accuracy of the model. Specifically, the amplitude of the exploration signal is seen as the control input, while the minimum eigenvalue of the Fisher matrix is the variable to be controlled. We call such exploration strategies Fisher Feedback Exploration (F2E). We propose one explicit F2E design, called Inverse Fisher Feedback Exploration (IF2E), and argue that this design guarantees the optimal asymptotic rate for the cumulative regret. We provide theoretical support for IF2E and, in a numerical example, illustrate its benefits and compare it with the open-loop approach as well as a method based on Thompson sampling.
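The contrast between an open-loop excitation schedule and a feedback rule driven by the minimum eigenvalue of the information matrix can be illustrated with a small simulation. The sketch below is not the IF2E design from the letter, whose control law is not given in this record. It assumes a scalar system x[t+1] = a*x[t] + b*u[t] + w[t] with unit-variance noise, a fixed (non-adaptive) stabilizing gain `k`, an empirical information matrix (the running sum of z*z^T with z = (x, u)) as a stand-in for the Fisher information matrix, and a hypothetical clipped rule that sets the excitation variance to the shortfall between a target c*sqrt(t) and the current minimum eigenvalue. All names and parameter values (`simulate`, `min_eig_2x2`, `c`, `v_max`) are illustrative assumptions.

```python
import math
import random


def min_eig_2x2(m11, m12, m22):
    """Smallest eigenvalue of the symmetric matrix [[m11, m12], [m12, m22]]."""
    mean = 0.5 * (m11 + m22)
    radius = math.hypot(0.5 * (m11 - m22), m12)
    return mean - radius


def simulate(T=2000, a=0.9, b=1.0, k=0.5, c=1.0, v_max=1.0,
             feedback=True, seed=0):
    """Simulate x[t+1] = a*x[t] + b*u[t] + w[t] with u[t] = -k*x[t] + e[t].

    Returns the history of the minimum information eigenvalue and the
    total injected excitation energy (sum of e[t]**2).
    """
    rng = random.Random(seed)
    x = 0.0
    # Entries of the running sum of z*z^T, z = (x, u), assuming unit noise variance.
    m11 = m12 = m22 = 0.0
    lam_history = []
    energy = 0.0
    for t in range(1, T + 1):
        if feedback:
            # Hypothetical feedback rule (an assumption, not IF2E):
            # excite in proportion to the information shortfall, clipped to [0, v_max].
            shortfall = c * math.sqrt(t) - min_eig_2x2(m11, m12, m22)
            var = min(v_max, max(0.0, shortfall))
        else:
            # Open-loop schedule: excitation variance decays as 1/sqrt(t).
            var = c / math.sqrt(t)
        e = rng.gauss(0.0, math.sqrt(var))
        u = -k * x + e
        # Update the information matrix with the new regressor z = (x, u).
        m11 += x * x
        m12 += x * u
        m22 += u * u
        lam_history.append(min_eig_2x2(m11, m12, m22))
        energy += e * e
        # Advance the system with unit-variance process noise.
        x = a * x + b * u + rng.gauss(0.0, 1.0)
    return lam_history, energy
```

Calling `simulate(feedback=True)` and `simulate(feedback=False)` with the same seed gives two eigenvalue histories that can be compared against the target c*sqrt(t), along with the excitation energy each strategy spent. Without excitation the regressor (x, -k*x) is confined to one direction, so the minimum eigenvalue stays at zero; this is why some external excitation is needed in this toy setting.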
Subject headings
- ENGINEERING AND TECHNOLOGY -- Mechanical Engineering -- Vehicle Engineering (hsv//eng)
- NATURAL SCIENCES -- Physical Sciences -- Subatomic Physics (hsv//eng)
Keywords
- Regret minimization
- Fisher feedback exploration
- adaptive control
- linear quadratic regulator
Publication and content type
- ref (subject category)
- art (subject category)