SwePub

Result list for the search "WFRF:(Landelius Tomas) srt2:(1993-1994)"


  • Results 1-5 of 5
1.
  • Landelius, Tomas, et al. (author)
  • A Dynamic Tree Structure for Incremental Reinforcement Learning of Good Behavior
  • 1994
  • Report (other academic/artistic) abstract
    • This paper addresses the idea of learning by reinforcement, within the theory of behaviorism. The reason for this choice is its generality and especially that the reinforcement learning paradigm allows systems to be designed that can improve their behavior beyond that of their teacher. The role of the teacher is to define the reinforcement function, which acts as a description of the problem the machine is to solve. Gained knowledge is represented by a behavior probability density function which is approximated with a number of normal distributions, stored in the nodes of a binary tree. It is argued that a meaningful partitioning into local models can only be accomplished in a fused space consisting of both stimuli and responses. Given a stimulus, the system searches for responses likely to result in highly reinforced decisions by treating the sum of the two normal distributions on each level in the tree as a distribution describing the system's behavior at that resolution. The resolution of the response, as well as the tree growing and pruning processes, are controlled by a random variable based on the difference in performance between two consecutive levels in the tree. This results in a system that will never be content but will indefinitely continue to search for better solutions.
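The coarse-to-fine idea above can be illustrated with a rough sketch. Two simplifications are assumptions, not details from the report: each tree level is given as a flat list of (mean, covariance) pairs, and "performance" is read as the average log-likelihood of stored decisions under that level's mixture (the report uses a random variable based on the performance difference; a deterministic threshold stands in for it here).

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal at x."""
    d = len(mean)
    diff = x - mean
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / \
        np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))

def level_log_likelihood(decisions, level):
    """Average log-density of stored decisions under the mixture formed
    by the normals on one tree level, given as (mean, cov) pairs."""
    dens = [np.mean([gaussian_pdf(x, m, c) for m, c in level])
            for x in decisions]
    return float(np.mean(np.log(dens)))

def keep_refining(decisions, coarse, fine, threshold=0.01):
    """Keep growing only while the finer level models the stored
    decisions noticeably better than the coarser one."""
    gain = level_log_likelihood(decisions, fine) - \
        level_log_likelihood(decisions, coarse)
    return gain > threshold
```

With two well-separated clusters of decisions, a two-component level wins clearly over a single global normal, so the sketch grows; when the finer level adds nothing, it stops.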
2.
  • Landelius, Tomas (author)
  • Behavior Representation by Growing a Learning Tree
  • 1993
  • Licentiate thesis (other academic/artistic) abstract
    • The work presented in this thesis is based on the basic idea of learning by reinforcement, within the theory of behaviorism. The reason for this choice is the generality of such an approach, especially that the reinforcement learning paradigm allows systems to be designed that can improve their behavior beyond that of their teacher. The role of the teacher is to define the reinforcement function, which acts as a description of the problem the machine is to solve. Learning is considered to be a bootstrapping procedure. Fragmented past experience of what to do when performing well is used for response generation. The new response, in its turn, adds more information to the system about the environment. Gained knowledge is represented by a behavior probability density function. This density function is approximated with a number of normal distributions, which are stored in the nodes of a binary tree. The tree structure is grown by applying a recursive algorithm to the stored stimulus-response combinations, called decisions. By considering both the response and the stimulus, the system is able to bring meaning to structures in the input signal. The recursive algorithm is first applied to the whole set of stored decisions. A mean decision vector and a covariance matrix are calculated and stored in the root node. The decision space is then partitioned into two halves across the direction of maximal data variation. This procedure is repeated recursively for each of the two halves of the decision space, forming a binary tree with mean vectors and covariance matrices in its nodes. The tree is the system's guide to response generation. Given a stimulus, the system searches for responses likely to result in highly reinforced decisions. This is accomplished by treating the sum of the normal distributions in the leaves as a distribution describing the behavior of the system. The sum of normal distributions, with the current stimulus held fixed, is finally used for random generation of the response. This procedure makes it possible for the system to have several equally plausible responses to one stimulus. Not applying maximum likelihood principles makes the system more explorative and reduces its risk of being trapped in local minima. The performance and complexity of the learning tree are investigated and compared to some well-known alternative methods. Also presented are some simple, yet principally important, experiments verifying the behavior of the proposed algorithm.
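The recursive growing procedure described in the abstract (mean vector and covariance matrix in each node, partition through the mean across the direction of maximal data variation) can be sketched roughly as follows. The `Node`/`grow_tree` names and the `min_size` stopping rule are illustrative choices, not details from the thesis.

```python
import numpy as np

class Node:
    """One node of the learning tree: a local normal model of decisions."""
    def __init__(self, decisions):
        self.mean = decisions.mean(axis=0)          # mean decision vector
        self.cov = np.cov(decisions, rowvar=False)  # covariance matrix
        self.left = self.right = None

def grow_tree(decisions, min_size=4):
    """Recursively split the stored decisions (stimulus-response vectors)
    across the direction of maximal data variation."""
    node = Node(decisions)
    if len(decisions) < 2 * min_size:
        return node
    # Direction of maximal variation = principal eigenvector of the covariance.
    eigvals, eigvecs = np.linalg.eigh(node.cov)
    direction = eigvecs[:, -1]
    # Partition through the mean into two halves along that direction.
    side = (decisions - node.mean) @ direction
    left, right = decisions[side < 0], decisions[side >= 0]
    if len(left) >= min_size and len(right) >= min_size:
        node.left = grow_tree(left, min_size)
        node.right = grow_tree(right, min_size)
    return node
```

Each recursion level thus refines the density approximation, with the fused stimulus-response ("decision") space partitioned rather than the stimulus space alone.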
3.
  • Landelius, Tomas, et al. (author)
  • Depth and Velocity from Orientation Tensor Fields
  • 1993
  • In: SCIA8.
  • Conference paper (peer-reviewed) abstract
    • This paper presents an algorithm for retrieving depth and velocity by estimating the 3D orientation in an image sequence, under the assumption of pure translation of the camera in a static scene. Quantitative error measurements are presented comparing the proposed algorithm to a gradient-based optical flow algorithm.
4.
5.
  • Landelius, Tomas, et al. (author)
  • The Learning Tree, A New Concept in Learning
  • 1993
  • In: Proceedings of the 2nd International Conference on Adaptive and Learning Systems.
  • Conference paper (peer-reviewed) abstract
    • In this paper, learning is considered to be the bootstrapping procedure where fragmented past experience of what to do when performing well is used for the generation of new responses, adding more information to the system about the environment. The gained knowledge is represented by a behavior probability density function, which is decomposed into a number of normal distributions using a binary tree. This tree structure is built by storing highly reinforced stimulus-response combinations, decisions, and calculating their mean decision vector and covariance matrix. Thereafter the decision space is divided, through the mean vector, into two halves along the direction of maximal data variation. The mean vector and the covariance matrix are stored in the tree node, and the procedure is repeated recursively for each of the two halves of the decision space, forming a binary tree with mean vectors and covariance matrices in its nodes. The tree is the system's guide to response generation. Given a stimulus, the system searches for decisions likely to give a high reinforcement. This is accomplished by treating the sum of the normal distributions in the leaves, using their mean vectors and covariance matrices as the distribution parameters, as a distribution describing the system's behavior. A response is generated by fixing the stimulus in this sum of normal distributions and using the resulting distribution, which turns out to be a new sum of normal distributions, for random generation of the response. This procedure also makes it possible for the system to have several equally plausible responses to one stimulus when this is appropriate. Not applying maximum likelihood principles leads to a more explorative system behavior, avoiding local minima traps.
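Fixing the stimulus in a sum of joint normals over (stimulus, response) is the standard operation of conditioning a Gaussian mixture: each component is conditioned on the stimulus block, and the mixture weights become the components' likelihoods of that stimulus. A minimal sketch (function names are illustrative, not from the paper; each leaf is a (mean, covariance) pair over the fused decision space):

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal at x."""
    d = len(mean)
    diff = x - mean
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / \
        np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))

def sample_response(stimulus, leaves, rng):
    """Fix the stimulus in a sum of joint normals: condition each leaf
    on the stimulus and sample from the resulting sum of normals."""
    ns = len(stimulus)
    weights, conds = [], []
    for mean, cov in leaves:
        ms, mr = mean[:ns], mean[ns:]
        Css, Csr = cov[:ns, :ns], cov[:ns, ns:]
        Crs, Crr = cov[ns:, :ns], cov[ns:, ns:]
        gain = Crs @ np.linalg.inv(Css)
        conds.append((mr + gain @ (stimulus - ms), Crr - gain @ Csr))
        weights.append(gaussian_pdf(stimulus, ms, Css))  # stimulus likelihood
    weights = np.array(weights) / np.sum(weights)
    k = rng.choice(len(leaves), p=weights)      # pick a component...
    m, c = conds[k]
    return rng.multivariate_normal(m, c)        # ...then sample its conditional
```

Because the response is drawn at random from the conditioned sum rather than taken at the maximum, two leaves that explain the stimulus equally well yield two equally plausible responses, which is the explorative behavior the abstract describes.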