SwePub

Reinforcement Learning Using Local Adaptive Models

Borga, Magnus (author)
Linköpings universitet, Bildbehandling, Tekniska högskolan
ISBN 9178715903
Linköping, Sweden : Linköping University, Department of Electrical Engineering, 1995
English, 119 pp.
Series: Linköping Studies in Science and Technology. Thesis, 0280-7971 ; 507
  • Licentiate thesis (other academic/artistic)
Abstract
  • In this thesis, the theory of reinforcement learning is described and its relation to learning in biological systems is discussed. Some basic issues in reinforcement learning, the credit assignment problem and perceptual aliasing, are considered. The methods of temporal difference are described. Three important design issues are discussed: information representation and system architecture, rules for improving the behaviour and rules for the reward mechanisms. The use of local adaptive models in reinforcement learning is suggested and exemplified by some experiments. This idea is behind all the work presented in this thesis. A method for learning to predict the reward called the prediction matrix memory is presented. This structure is similar to the correlation matrix memory but differs in that it is not only able to generate responses to given stimuli but also to predict the rewards in reinforcement learning. The prediction matrix memory uses the channel representation, which is also described. A dynamic binary tree structure that uses the prediction matrix memories as local adaptive models is presented. The theory of canonical correlation is described and its relation to the generalized eigenproblem is discussed. It is argued that the directions of canonical correlations can be used as linear models in the input and output spaces respectively in order to represent input and output signals that are maximally correlated. It is also argued that this is a better representation in a response generating system than, for example, principal component analysis since the energy of the signals has nothing to do with their importance for the response generation. An iterative method for finding the canonical correlations is presented. Finally, the possibility of using the canonical correlation for response generation in a reinforcement learning system is indicated.
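
The relation between canonical correlation and the generalized eigenproblem mentioned in the abstract can be illustrated with a short sketch. This is not the iterative method presented in the thesis; it is a direct eigen-decomposition under stated assumptions: the data are given as sample matrices `X` and `Y` with one observation per row, the covariance matrices are estimated from the samples, and both within-set covariance matrices are positive definite. The function name `canonical_correlations` and all variable names are hypothetical and chosen only for this illustration.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Sketch: canonical correlations from the generalized eigenproblem
    A w = rho * B w, with A holding the between-set covariances and
    B the within-set covariances (assumed positive definite)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n, p = X.shape
    q = Y.shape[1]

    # Sample covariance estimates.
    Cxx = X.T @ X / (n - 1)
    Cyy = Y.T @ Y / (n - 1)
    Cxy = X.T @ Y / (n - 1)

    # Block matrices of the generalized eigenproblem.
    A = np.zeros((p + q, p + q))
    A[:p, p:] = Cxy
    A[p:, :p] = Cxy.T
    B = np.zeros((p + q, p + q))
    B[:p, :p] = Cxx
    B[p:, p:] = Cyy

    # Reduce to a symmetric standard eigenproblem using B = L L^T.
    L = np.linalg.cholesky(B)
    L_inv = np.linalg.inv(L)
    M = L_inv @ A @ L_inv.T
    eigvals, eigvecs = np.linalg.eigh(M)

    # Eigenvalues come in +/- pairs; the largest min(p, q) are the
    # canonical correlations.
    order = np.argsort(eigvals)[::-1][: min(p, q)]
    rho = eigvals[order]
    W = L_inv.T @ eigvecs[:, order]  # map back to generalized eigenvectors
    return rho, W[:p], W[p:]


# Usage on synthetic data with one shared latent signal (illustrative only).
rng = np.random.default_rng(0)
z = rng.standard_normal(2000)
X = np.column_stack([z + 0.1 * rng.standard_normal(2000),
                     rng.standard_normal(2000)])
Y = np.column_stack([rng.standard_normal(2000),
                     z + 0.1 * rng.standard_normal(2000),
                     rng.standard_normal(2000)])
rho, Wx, Wy = canonical_correlations(X, Y)
print("canonical correlations:", rho)
```

The columns of `Wx` and `Wy` give directions in the input and output spaces whose projections are maximally correlated, regardless of how much signal energy lies along those directions. This is the contrast with principal component analysis that the abstract draws.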

Keywords

TECHNOLOGY

Publication and content type

vet (subject category)
lic (subject category)
