SwePub
Search the SwePub database


Results for the search: WFRF:(Togelius Julian Dr)

  • Results 1-2 of 2
1.
  • Hagelbäck, Johan, 1977- (author)
  • A Multi-Agent Potential Field Based Approach for Real-Time Strategy Game Bots
  • 2009
  • Licentiate thesis (other academic/artistic), abstract:
    • Computer games in general and Real-Time Strategy (RTS) games in particular provide a rich challenge for both human and computer-controlled players, the latter often denoted bots. The player or bot controls a large number of units that have to navigate partially unknown, dynamic worlds to pursue a goal. Navigation in such worlds can be complex and demand considerable computational resources. It is typically solved with some form of path planning algorithm, and much research has been conducted to improve the performance of such algorithms in dynamic worlds. The main goal of this thesis is to investigate an alternative approach for RTS bots based on Artificial Potential Fields, a technique originating in robotics. In robotics the technique has been used successfully for navigation in dynamic environments, and we show that Artificial Potential Fields can be used for navigation in an RTS game setting without any need for path planning. In the first three papers we define and demonstrate a methodology for creating multi-agent potential field based bots for an RTS scenario in which two tank armies battle each other. The fourth paper addresses incomplete information about the game world, referred to as the fog of war, and shows how Potential Field based bots can handle such environments. The final paper shows how a Potential Field based bot can be evolved to handle a more complex full RTS scenario, addressing resource gathering, base construction, technological development, and the construction of an army of different unit types. We show that Artificial Potential Fields are a viable option for several RTS game scenarios and that their performance, both in the ability to win a game and in the computational resources used, can match and even surpass that of traditional approaches based on path planning.
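The abstract's core idea, navigating by summed attractive and repulsive potentials instead of path planning, can be sketched in a few lines. This is an illustrative toy on a grid, not code from the thesis; the potential shapes and the tuning constants `influence` and `repel` are assumptions chosen for the example.

```python
import math

def potential_field_step(pos, goal, obstacles, influence=3.0, repel=2.0):
    """One navigation step on a grid: move to the neighbouring cell (or stay)
    with the highest potential. The goal attracts linearly; each obstacle
    within `influence` adds a repulsive term of the classic robotics form,
    continuous at the influence boundary."""
    def potential(p):
        value = -math.dist(p, goal)  # attractive: higher closer to the goal
        for obs in obstacles:
            d = max(math.dist(p, obs), 1e-6)
            if d < influence:
                value -= repel * (1.0 / d - 1.0 / influence) ** 2
        return value
    candidates = [(pos[0] + dx, pos[1] + dy)
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    return max(candidates, key=potential)

# A unit steers around an obstacle toward its goal with no path planning:
pos, goal = (0, 0), (10, 0)
obstacles = [(5, 0)]
for _ in range(20):
    pos = potential_field_step(pos, goal, obstacles)
print(pos)  # → (10, 0): the unit has skirted the obstacle and reached the goal
```

Note the design choice the thesis exploits: each step costs only a few local potential evaluations, so no global search over the map is ever performed.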
2.
  • Källström, Johan, 1976- (author)
  • Reinforcement Learning for Improved Utility of Simulation-Based Training
  • 2023
  • Doctoral thesis (other academic/artistic), abstract:
    • Team training in complex domains often requires a substantial number of resources, e.g. vehicles, machines, and role-players. For this reason, it may be difficult to realise efficient and effective training scenarios in a real-world setting. Instead, part of the training can be conducted in synthetic, computer-generated environments. In these environments trainees can operate simulators instead of real vehicles, while synthetic actors can replace human role-players to increase the complexity of the simulated scenario at low operating cost. However, constructing behaviour models for synthetic actors is challenging, especially for end users, who typically have no expertise in artificial intelligence. In this dissertation, we study how machine learning can be used to simplify the construction of intelligent agents for simulation-based training. A simulation-based air combat training system is used as a case study. The contributions of the dissertation are divided into two parts. The first part aims at improving the understanding of reinforcement learning in the domain of simulation-based training. First, a user study is conducted to identify important capabilities and characteristics of learning agents intended to support the training of fighter pilots. One of the most important capabilities identified is that the agents' behaviour can be adapted to different phases of training, as well as to the training needs of individual human trainees. Second, methods for learning how to coordinate with other agents are studied in simplified training scenarios, to investigate how the design of the agent's observation space, action space, and reward signal affects learning performance. Temporal abstractions and hierarchical reinforcement learning are found to improve the efficiency of learning, while also providing support for modelling doctrinal behaviour. In more complex settings, curriculum learning and related methods are expected to help find novel tactics even when sparse, abstract reward signals are used. Third, based on the results of the user study and the practical experiments, a system concept for a user-adaptive training system is developed to support further research. The second part of the contributions focuses on methods for utility-based multi-objective reinforcement learning, which incorporates knowledge of the user's utility function in the search for policies that balance multiple conflicting objectives. Two new agents for multi-objective reinforcement learning are proposed: the Tunable Actor (T-Actor) and the Multi-Objective Dreamer (MO-Dreamer). T-Actor provides decision support to instructors by learning a set of Pareto optimal policies, represented by a single neural network conditioned on objective preferences. This enables tuning of the agent's behaviour to fit trainees' current training needs. Experimental evaluations in gridworlds and in the target system show that T-Actor reduces the number of training steps required for learning. MO-Dreamer adapts online to changes in users' utility, e.g. changes in training needs. It does so by learning a model of the environment, which it uses for anticipatory rollouts with a diverse set of utility functions, exploring which policy to follow to optimise the return for a given set of objective preferences. An experimental evaluation shows that MO-Dreamer outperforms prior model-free approaches in terms of experienced regret, for frequent as well as sparse changes in utility. Overall, the research conducted in this dissertation contributes to improved knowledge about how to apply machine learning methods to the construction of simulation-based training environments. While our focus was on air combat training, the results are general enough to be applicable in other domains.
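The tuning idea behind preference-conditioned agents such as T-Actor, one agent whose behaviour shifts with the objective weights, can be illustrated with linear scalarisation of vector-valued returns. The policy names, return vectors, and weights below are invented for illustration and are not taken from the dissertation.

```python
# Each candidate policy has a vector return, one entry per objective
# (hypothetical objectives: training value for the trainee, safety margin).
policy_returns = {
    "aggressive": (9.0, 2.0),
    "balanced":   (6.0, 6.0),
    "cautious":   (2.0, 9.0),
}

def best_policy(preferences):
    """Utility-based selection: scalarise each vector return with a linear
    utility w . r and return the policy name that maximises it."""
    def utility(name):
        return sum(w * r for w, r in zip(preferences, policy_returns[name]))
    return max(policy_returns, key=utility)

# Adapting behaviour to the trainee's current needs amounts to moving the
# preference weights; the same mechanism picks a different Pareto point:
print(best_policy([0.9, 0.1]))  # → aggressive
print(best_policy([0.5, 0.5]))  # → balanced
print(best_policy([0.1, 0.9]))  # → cautious
```

T-Actor itself learns a single network conditioned on such preference vectors rather than selecting among fixed policies, but the utility-maximising selection step is the same in spirit.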
