SwePub
Search the SwePub database

Results for search "WFRF:(Landelius Tomas) srt2:(1995-1999)"

Search: WFRF:(Landelius Tomas) > (1995-1999)

  • Results 1-10 of 11
1.
  • Borga, Magnus, et al. (author)
  • A Unified Approach to PCA, PLS, MLR and CCA
  • 1997
  • Report (other academic/artistic), abstract:
    • This paper presents a novel algorithm for analysis of stochastic processes. The algorithm can be used to find the required solutions in the cases of principal component analysis (PCA), partial least squares (PLS), canonical correlation analysis (CCA) or multiple linear regression (MLR). The algorithm is iterative and sequential in its structure and uses on-line stochastic approximation to reach an equilibrium point. A quotient between two quadratic forms is used as an energy function and it is shown that the equilibrium points constitute solutions to the generalized eigenproblem.
  •  
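The core claim of the abstract, that the stationary points of a quotient between two quadratic forms solve a generalized eigenproblem, can be illustrated with a small batch sketch. This is not the paper's sequential on-line algorithm; the matrices and dimensions below are made-up toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 3)) @ np.diag([3.0, 1.0, 0.3])

A = np.cov(X, rowvar=False)    # numerator quadratic form (a covariance)
B = np.diag([1.0, 2.0, 4.0])   # denominator quadratic form (a metric)

# A w = lambda B w reduces to an ordinary symmetric eigenproblem on
# B^{-1/2} A B^{-1/2}; B is diagonal here, so B^{-1/2} is easy to form.
Bih = np.diag(1.0 / np.sqrt(np.diag(B)))
vals, vecs = np.linalg.eigh(Bih @ A @ Bih)
w = Bih @ vecs[:, -1]          # back-transformed top eigenvector

quotient = (w @ A @ w) / (w @ B @ w)
print(quotient, vals[-1])      # the quotient attains the largest eigenvalue
```

With B = I this is plain PCA; the paper's point is that PLS, CCA and MLR arise from other choices of A and B in the same quotient.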
2.
  •  
3.
  • Knutsson, Hans, et al. (author)
  • Generalized Eigenproblem for Stochastic Process Covariances
  • 1996
  • Report (other academic/artistic), abstract:
    • This paper presents a novel algorithm for finding the solution of the generalized eigenproblem where the matrices involved contain expectation values from stochastic processes. The algorithm is iterative and sequential in its structure and uses on-line stochastic approximation to reach an equilibrium point. A quotient between two quadratic forms is suggested as an energy function for this problem and is shown to have zero gradient only at the points solving the eigenproblem. Furthermore, it is shown that the algorithm for the generalized eigenproblem can be used to solve three important problems as special cases. For a stochastic process the algorithm can be used to find the directions for maximal variance, covariance, and canonical correlation as well as their magnitudes.
  •  
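The maximal-variance special case mentioned in the abstract can be sketched with a classic on-line stochastic approximation, Oja's rule, which is not the authors' algorithm but shows the same flavor of iterative, sample-by-sample convergence to an eigenvector. The covariance and learning rate below are made-up toy values.

```python
import numpy as np

rng = np.random.default_rng(1)
scales = np.array([3.0, 1.0])   # process covariance diag(9, 1); top eigenvector is e1

w = rng.standard_normal(2)
w /= np.linalg.norm(w)
lr = 0.01
for _ in range(5000):
    x = scales * rng.standard_normal(2)   # one on-line sample of the process
    y = w @ x
    w += lr * y * (x - y * w)             # Oja's rule: Hebbian term plus decay
w /= np.linalg.norm(w)

print(abs(w[0]))   # close to 1: w has aligned with the maximal-variance direction
```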
4.
  • Knutsson, Hans, et al. (author)
  • Learning Canonical Correlations
  • 1995
  • Report (other academic/artistic), abstract:
    • This paper presents a novel learning algorithm that finds the linear combination of one set of multi-dimensional variates that is the best predictor, and at the same time finds the linear combination of another set which is the most predictable. This relation is known as the canonical correlation and has the property of being invariant with respect to affine transformations of the two sets of variates. The algorithm successively finds all the canonical correlations beginning with the largest one. It is shown that canonical correlations can be used in computer vision to find feature detectors by giving examples of the desired features. When used on the pixel level, the method finds quadrature filters and when used on a higher level, the method finds combinations of filter output that are less sensitive to noise compared to vector averaging.
  •  
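The quantity this paper learns sequentially can be computed in batch form for comparison: the canonical correlations between two sets of variates are the singular values of the whitened cross-covariance. A minimal sketch, with made-up data in which the two sets share one latent signal:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10000
z = rng.standard_normal(n)                              # shared latent signal
X = np.column_stack([z + 0.1 * rng.standard_normal(n),  # first set of variates
                     rng.standard_normal(n)])
Y = np.column_stack([z + 0.1 * rng.standard_normal(n),  # second set of variates
                     rng.standard_normal(n)])

Xc, Yc = X - X.mean(0), Y - Y.mean(0)
Sxx, Syy, Sxy = Xc.T @ Xc / n, Yc.T @ Yc / n, Xc.T @ Yc / n

def inv_sqrt(S):
    # Symmetric inverse square root via the eigendecomposition.
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

# Canonical correlations = singular values of Sxx^{-1/2} Sxy Syy^{-1/2}.
rho = np.linalg.svd(inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy), compute_uv=False)
print(rho)   # one strong correlation (the shared z), one near zero
```

The affine invariance claimed in the abstract can be checked by replacing X with X @ T for any invertible T: the values in rho are unchanged.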
5.
  • Knutsson, Hans, 1950-, et al. (author)
  • Learning Multidimensional Signal Processing
  • 1998
  • In: Proceedings of the 14th International Conference on Pattern Recognition, vol 2. Linköping, Sweden: Linköping University, Department of Electrical Engineering, pp. 1416-1420
  • Report (other academic/artistic), abstract:
    • This paper presents our general strategy for designing learning machines as well as a number of particular designs. The search for methods allowing a sufficient level of adaptivity is based on two main principles: 1. simple adaptive local models and 2. adaptive model distribution. Particularly important concepts in our work are mutual information and canonical correlation. Examples are given of learning feature descriptors, modeling disparity, synthesis of a global 3-mode model, and a setup for reinforcement learning of on-line video coder parameter control.
  •  
6.
  • Landelius, Tomas, et al. (author)
  • Behaviorism and Reinforcement Learning
  • 1995
  • In: Proceedings, 2nd Swedish Conference on Connectionism, pp. 259-270
  • Conference paper (peer-reviewed)
  •  
7.
  • Landelius, Tomas, et al. (author)
  • Greedy adaptive critics for LPQ [i.e. LQR] problems: Convergence Proofs
  • 1996
  • Report (other academic/artistic), abstract:
    • A number of success stories have been told where reinforcement learning has been applied to problems in continuous state spaces using neural nets or other sorts of function approximators in the adaptive critics. However, the theoretical understanding of why and when these algorithms work is inadequate. This is clearly exemplified by the lack of convergence results for a number of important situations. To our knowledge only two such results have been presented for systems in the continuous state space domain. The first is due to Werbos and is concerned with linear function approximation and heuristic dynamic programming. Here no optimal strategy can be found, which is why the result is of limited importance. The second result is due to Bradtke and deals with linear quadratic systems and quadratic function approximators. Bradtke's proof is limited to ADHDP and policy iteration techniques where the optimal solution is found by a number of successive approximations. This paper deals with greedy techniques, where the optimal solution is directly aimed for. Convergence proofs for a number of adaptive critics, HDP, DHP, ADHDP and ADDHP, are presented. Optimal controllers for linear quadratic regulation (LQR) systems can be found by standard techniques from control theory, but the assumptions made in control theory can be weakened if adaptive critic techniques are employed. The main point of this paper is, however, not to emphasize the differences but to highlight the similarities and by so doing contribute to a theoretical understanding of adaptive critics.
  •  
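The successive-approximation view of LQR that underlies these adaptive critics can be sketched as a plain Riccati value iteration (an HDP-style backup with full model knowledge, not the report's model-free critics). The system matrices A, B, Q, R below are made-up toy values for a sampled double integrator.

```python
import numpy as np

# Hypothetical discrete-time system x' = A x + B u, cost x'Qx + u'Ru per step.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P = np.zeros((2, 2))
for _ in range(2000):
    # Value-iteration backup: greedy gain K for the current value P,
    # then one Riccati step toward the optimal P*.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K)

# At the fixed point, P satisfies the discrete algebraic Riccati equation.
resid = Q + A.T @ P @ A \
        - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A) - P
print(np.abs(resid).max())   # tiny residual: the iteration has converged
```

The greedy techniques analyzed in the report aim for this same fixed point directly, estimating the critic from data instead of using A and B explicitly.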
8.
  • Landelius, Tomas, et al. (author)
  • On-Line Singular Value Decomposition of Stochastic Process Covariances
  • 1995
  • Report (other academic/artistic), abstract:
    • This paper presents novel algorithms for finding the singular value decomposition (SVD) of a general covariance matrix by stochastic approximation, general in the sense that non-square, between-sets covariance matrices are also dealt with. For one of the algorithms, convergence is shown using results from stochastic approximation theory. Proofs of this sort, establishing both the point of equilibrium and its domain of attraction, have been reported very rarely for stochastic, iterative feature extraction algorithms.
  •  
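The flavor of such an on-line SVD can be sketched with a cross-coupled stochastic power iteration for the top singular pair of a non-square between-sets covariance. This is a toy sketch, not the paper's algorithm; the mixing matrix M and learning rate are made-up values.

```python
import numpy as np

rng = np.random.default_rng(3)
M = np.array([[2.0, 0.0, 0.0],
              [0.0, 0.5, 0.0]])   # y = M x, so Cxy = E[x y'] = M.T (3x2, non-square)

u = rng.standard_normal(3); u /= np.linalg.norm(u)
v = rng.standard_normal(2); v /= np.linalg.norm(v)
lr = 0.02
for _ in range(4000):
    x = rng.standard_normal(3)
    y = M @ x
    # In expectation u tracks Cxy v and v tracks Cxy' u: an on-line
    # power iteration toward the top singular pair of Cxy.
    u += lr * (x * (y @ v) - u)
    v += lr * (y * (x @ u) - v)
    u /= np.linalg.norm(u)
    v /= np.linalg.norm(v)

print(abs(u[0]), abs(v[0]))   # both close to 1: the top singular pair of Cxy
```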
9.
  • Landelius, Tomas, et al. (author)
  • Reinforcement Learning Adaptive Control and Explicit Criterion Maximization
  • 1996
  • Report (other academic/artistic), abstract:
    • This paper reviews an existing algorithm for adaptive control based on explicit criterion maximization (ECM) and presents an extended version suited for reinforcement learning tasks. Furthermore, assumptions under which the algorithm converges to a local maximum of a long-term utility function are given. Such convergence theorems are very rare for reinforcement learning algorithms working with continuous state and action spaces. A number of similar algorithms, previously suggested to the reinforcement learning community, are briefly surveyed in order to give the presented algorithm a place in the field. The relations between the different algorithms are exemplified by checking their consistency on a simple problem of linear quadratic regulation (LQR).
  •  
10.
  • Landelius, Tomas (author)
  • Reinforcement Learning and Distributed Local Model Synthesis
  • 1997
  • Doctoral thesis (other academic/artistic), abstract:
    • Reinforcement learning is a general and powerful way to formulate complex learning problems and acquire good system behaviour. The goal of a reinforcement learning system is to maximize a long term sum of instantaneous rewards provided by a teacher. In its extremum form, reinforcement learning only requires that the teacher can provide a measure of success. This formulation does not require a training set with correct responses, and allows the system to become better than its teacher. In reinforcement learning much of the burden is moved from the teacher to the training algorithm. The exact and general algorithms that exist for these problems are based on dynamic programming (DP), and have a computational complexity that grows exponentially with the dimensionality of the state space. These algorithms can only be applied to real world problems if an efficient encoding of the state space can be found. To cope with these problems, heuristic algorithms and function approximation need to be incorporated. In this thesis it is argued that local models have the potential to help solve problems in high-dimensional spaces and that global models have not. This is motivated with the bias-variance dilemma, which is resolved with the assumption that the system is constrained to live on a low-dimensional manifold in the space of inputs and outputs. This observation leads to the introduction of bias in terms of continuity and locality. A linear approximation of the system dynamics and a quadratic function describing the long term reward are suggested to constitute a suitable local model. For problems involving one such model, i.e. linear quadratic regulation problems, novel convergence proofs for heuristic DP algorithms are presented. This is one of few available convergence proofs for reinforcement learning in continuous state spaces. Reinforcement learning is closely related to optimal control, where local models are commonly used. Relations to present methods are investigated, e.g. adaptive control, gain scheduling, fuzzy control, and jump linear systems. Ideas from these areas are compiled in a synergistic way to produce a new algorithm for heuristic dynamic programming where function parameters and locality, expressed as model applicability, are learned on-line. Both top-down and bottom-up versions are presented. The emerging local models and their applicability need to be memorized by the learning system. The binary tree is put forward as a suitable data structure for on-line storage and retrieval of these functions.
  •  
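The idea of local models blended by applicability functions, central to the thesis abstract, can be illustrated with a toy one-dimensional example: two local linear fits weighted by Gaussian applicabilities approximate a nonlinear target far better than one global linear model. The target function, model centers, and widths are made-up values, and this is a static sketch rather than the thesis's on-line synthesis algorithm.

```python
import numpy as np

# Target: a nonlinear map that no single linear model fits well.
x = np.linspace(0.0, np.pi, 200)
f = np.sin(x)

def fit_weighted_line(x, f, w):
    # Weighted least-squares fit of f ~ a*x + b, returned as predictions.
    A = np.column_stack([x, np.ones_like(x)])
    s = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * s[:, None], f * s, rcond=None)
    return A @ coef

# One global linear model.
global_fit = fit_weighted_line(x, f, np.ones_like(x))

# Two local linear models with Gaussian applicability functions,
# blended by their normalized applicabilities.
centers, width = [np.pi / 4, 3 * np.pi / 4], 0.6
apps = np.array([np.exp(-((x - c) / width) ** 2) for c in centers])
apps /= apps.sum(axis=0)
local_fit = sum(a * fit_weighted_line(x, f, a) for a in apps)

rmse = lambda e: np.sqrt(np.mean(e ** 2))
print(rmse(global_fit - f), rmse(local_fit - f))   # local blend fits markedly better
```

Locality here plays the role the abstract assigns to model applicability: each simple model only needs to be right where its applicability is high.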