SwePub
Search the SwePub database

  Extended search

Result list for search "WFRF:(Arnekvist Isac)"

Search: WFRF:(Arnekvist Isac)

  • Result 1-8 of 8
1.
  •  
2.
  • Arnekvist, Isac, 1986- (author)
  • Transfer Learning using low-dimensional Representations in Reinforcement Learning
  • 2020
  • Licentiate thesis (other academic/artistic) abstract
    • Behaviors in Reinforcement Learning (RL) are often learned tabula rasa, requiring many observations and interactions in the environment. Performing this outside of a simulator, in the real world, often becomes infeasible due to the large number of interactions needed. This has motivated the use of Transfer Learning for Reinforcement Learning, where learning is accelerated by reusing experience from previous learning in related tasks. In this thesis, I explore how we can transfer from a simple single-object pushing policy to a wide array of non-prehensile rearrangement problems. I then explain how we can model task differences using a low-dimensional latent variable representation to make adaptation to novel tasks efficient. Lastly, the dependence on accurate function approximation is sometimes problematic, especially in RL, where statistics of target variables are not known a priori. I present observations, along with explanations, that small target variances together with momentum optimization of ReLU-activated neural network parameters lead to dying ReLUs.
  •  
3.
  • Arnekvist, Isac, 1986-, et al. (author)
  • VPE: Variational policy embedding for transfer reinforcement learning
  • 2019
  • In: 2019 International Conference on Robotics and Automation (ICRA). - Institute of Electrical and Electronics Engineers (IEEE). - 9781538660263 - 9781538660270, pp. 36-42
  • Conference paper (peer-reviewed) abstract
    • Reinforcement Learning methods are capable of solving complex problems, but resulting policies might perform poorly in environments that are even slightly different. In robotics especially, training and deployment conditions often vary and data collection is expensive, making retraining undesirable. Simulation training allows for feasible training times, but on the other hand suffers from a reality gap when applied in real-world settings. This raises the need for efficient adaptation of policies acting in new environments. We consider the problem of transferring knowledge within a family of similar Markov decision processes. We assume that Q-functions are generated by some low-dimensional latent variable. Given such a Q-function, we can find a master policy that adapts given different values of this latent variable. Our method learns both the generative mapping and an approximate posterior of the latent variables, enabling identification of policies for new tasks by searching only in the latent space, rather than the space of all policies. The low-dimensional space and master policy found by our method enable policies to quickly adapt to new environments. We demonstrate the method on both a pendulum swing-up task in simulation, and for simulation-to-real transfer on a pushing task. (A minimal latent-search sketch illustrating this adaptation step follows this entry.)
  •  
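The adaptation step described in the abstract above amounts to keeping the policy weights fixed and optimizing only over a low-dimensional latent variable. The sketch below is a hypothetical illustration of that idea, not the authors' implementation: the MasterPolicy network, the toy dynamics and reward in rollout_return, the latent dimension, and the use of plain random search (the paper instead learns an approximate posterior over the latent) are all assumptions made for brevity.

```python
# Minimal sketch (assumed setup, not the paper's code): adapt a latent-conditioned
# "master policy" to a new task by searching only over the low-dimensional latent z.
import torch
import torch.nn as nn

LATENT_DIM = 2  # assumed low-dimensional task latent

class MasterPolicy(nn.Module):
    """Policy conditioned on both the observation and a task latent z."""
    def __init__(self, obs_dim=3, act_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + LATENT_DIM, 64), nn.ReLU(),
            nn.Linear(64, act_dim), nn.Tanh())

    def forward(self, obs, z):
        return self.net(torch.cat([obs, z], dim=-1))

def rollout_return(policy, z, episode_len=100):
    """Toy stand-in for rolling the policy out in the new, unknown task."""
    obs, total = torch.zeros(3), 0.0
    for _ in range(episode_len):
        act = policy(obs, z)
        obs = torch.tanh(obs + 0.1 * act.detach())  # hypothetical dynamics
        total += -float(obs.abs().sum())            # hypothetical reward
    return total

policy = MasterPolicy()  # in the paper this would be pre-trained across related tasks
best_z, best_ret = None, float("-inf")
for _ in range(50):  # crude random search over z only; the paper uses a learned posterior
    z = torch.randn(LATENT_DIM)
    ret = rollout_return(policy, z)
    if ret > best_ret:
        best_z, best_ret = z, ret
print("selected latent:", best_z.tolist(), "estimated return:", best_ret)
```

The point of the sketch is that adaptation touches only the two latent coordinates, so a handful of rollouts in the new environment can suffice, whereas retraining all policy weights would not.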
4.
  •  
5.
  • Haustein, Joshua Alexander, 1987-, et al. (author)
  • Learning Manipulation States and Actions for Efficient Non-prehensile Rearrangement Planning
  • Other publication (other academic/artistic) abstract
    • This paper addresses non-prehensile rearrangement planning problems where a robot is tasked to rearrange objects among obstacles on a planar surface. We present an efficient planning algorithm that is designed to impose few assumptions on the robot's non-prehensile manipulation abilities and is simple to adapt to different robot embodiments. For this, we combine sampling-based motion planning with reinforcement learning and generative modeling. Our algorithm explores the composite configuration space of objects and robot as a search over robot actions, forward simulated in a physics model. This search is guided by a generative model that provides robot states from which an object can be transported towards a desired state, and a learned policy that provides corresponding robot actions. As an efficient generative model, we apply Generative Adversarial Networks. We implement and evaluate our approach for robots endowed with configuration spaces in SE(2). We demonstrate empirically the efficacy of our algorithm design choices and observe more than 2x speedup in planning time on various test scenarios compared to a state-of-the-art approach. (An illustrative generator-sampling sketch follows this entry.)
  •  
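As a rough illustration of the guidance mechanism described in the abstract above, the sketch below uses a GAN-style generator to propose robot SE(2) states from which an object might be pushed towards a desired pose; a sampling-based planner would forward-simulate push actions from such states instead of sampling uniformly over the composite configuration space. The StateGenerator network, the state layouts, the dimensions, and the fact that it is left untrained here are all hypothetical choices, not the paper's implementation.

```python
# Illustrative sketch only (assumed interfaces, not the paper's code):
# a generator proposing robot SE(2) states that a planner could seed its search with.
import torch
import torch.nn as nn

NOISE_DIM = 4
OBJ_DIM = 3    # object pose (x, y, theta) on the planar surface
ROBOT_DIM = 3  # robot pose in SE(2): (x, y, theta)

class StateGenerator(nn.Module):
    """Maps noise plus (current, desired) object poses to a candidate robot pose."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + 2 * OBJ_DIM, 64), nn.ReLU(),
            nn.Linear(64, ROBOT_DIM))

    def forward(self, noise, obj_now, obj_goal):
        return self.net(torch.cat([noise, obj_now, obj_goal], dim=-1))

# In the paper the generator is trained adversarially on successful pushing data;
# here it is left untrained purely to show the intended interface.
gen = StateGenerator()

obj_now = torch.tensor([0.2, 0.1, 0.0])
obj_goal = torch.tensor([0.5, 0.4, 0.0])

# Sample a batch of candidate robot states; push actions would then be
# forward-simulated from these states in a physics model.
noise = torch.randn(16, NOISE_DIM)
candidates = gen(noise, obj_now.expand(16, OBJ_DIM), obj_goal.expand(16, OBJ_DIM))
print(candidates.shape)  # torch.Size([16, 3])
```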
6.
  •  
7.
  • Haustein, Joshua Alexander, 1987-, et al. (author)
  • Non-prehensile Rearrangement Planning with Learned Manipulation States and Actions
  • 2018
  • In: Workshop on "Machine Learning in Robot Motion Planning" at the International Conference on Intelligent Robots and Systems (IROS) 2018.
  • Conference paper (peer-reviewed) abstract
    • In this work we combine sampling-based motion planning with reinforcement learning and generative modeling to solve non-prehensile rearrangement problems. Our algorithm explores the composite configuration space of objects and robot as a search over robot actions, forward simulated in a physics model. This search is guided by a generative model that provides robot states from which an object can be transported towards a desired state, and a learned policy that provides corresponding robot actions. As an efficient generative model, we apply Generative Adversarial Networks.
  •  
8.
  • Arnekvist, Isac, et al. (author)
  • The Effect of Target Normalization and Momentum on Dying ReLU
  • 2020
  • In: The 32nd annual workshop of the Swedish Artificial Intelligence Society (SAIS).
  • Conference paper (peer-reviewed) abstract
    • Optimizing parameters with momentum, normalizing data values, and using rectified linear units (ReLUs) are popular choices in neural network (NN) regression. Although ReLUs are popular, they can collapse to a constant function and "die", effectively removing their contribution from the model. While some mitigations are known, the underlying reasons for ReLUs dying during optimization are currently poorly understood. In this paper, we consider the effects of target normalization and momentum on dying ReLUs. We find empirically that unit variance targets are well motivated and that ReLUs die more easily when target variance approaches zero. To further investigate this matter, we analyze a discrete-time linear autonomous system and show theoretically how this relates to a model with a single ReLU and how common properties can result in dying ReLU. We also analyze the gradients of a single-ReLU model to identify saddle points and regions corresponding to dying ReLU and how parameters evolve into these regions when momentum is used. Finally, we show empirically that this problem persists, and is aggravated, for deeper models including residual networks. (A small empirical sketch of the target-scale experiment follows this entry.)
  •  
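The experimental setup in the abstract above can be approximated in a few lines of PyTorch: regress a ReLU network onto targets of decreasing scale using SGD with momentum, then count hidden units that are inactive on every training input ("dead" ReLUs). The network size, learning rate, momentum value, step count, and target function below are assumptions chosen for illustration; this is a sketch of the setup, not the authors' code, and it reports whatever counts it measures rather than the paper's results.

```python
# Sketch of a dying-ReLU experiment (assumed hyperparameters, not the paper's code).
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(256, 1)  # fixed regression inputs

def dead_relu_count(target_scale):
    """Train a small ReLU net with momentum and count hidden units that never activate."""
    net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
    y = target_scale * torch.sin(3 * x)  # target variance shrinks with target_scale
    opt = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)
    for _ in range(2000):
        opt.zero_grad()
        loss = ((net(x) - y) ** 2).mean()
        loss.backward()
        opt.step()
    with torch.no_grad():
        pre_act = net[0](x)  # hidden-layer pre-activations on the training set
        dead = (pre_act <= 0).all(dim=0).sum().item()
    return dead

# Shrinking the target scale mimics regressing onto poorly normalized,
# near-constant targets; the dead-unit counts can then be compared across scales.
for scale in [1.0, 0.1, 0.01]:
    print(f"target scale {scale}: {dead_relu_count(scale)} / 64 dead ReLUs")
```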