SwePub

Result list for search "WFRF:(Wahlberg Bo Professor) srt2:(2020-2024)"

Search: WFRF:(Wahlberg Bo Professor) > (2020-2024)

  • Results 1-10 of 15
1.
  • Winqvist, Rebecka, 1996- (author)
  • Learning in the Loop : On Neural Network-based Model Predictive Control and Cooperative System Identification
  • 2023
  • Licentiate thesis (other academic/artistic). Abstract:
    • Within automatic control, the integration of machine learning methods has emerged as a central strategy for improving the performance and adaptivity of control systems. Significant progress has been made on several important aspects of the control loop, such as learning-based methods for system identification and parameter estimation, filtering and noise reduction, and controller synthesis. This thesis delves into learning for control, with particular emphasis on learning-based controllers and identification methods. The first part of the thesis investigates neural network-based Model Predictive Control (MPC). Different network structures are studied, both generic black-box networks and networks that weave MPC-specific information into their structure. These networks are compared and evaluated with respect to two performance measures through experiments on realistic two- and four-dimensional systems. The main novel aspect is the inclusion of gradient data in the training process, which is shown to improve the accuracy of the generated control signals. Furthermore, the experimental results show that an MPC-informed network structure leads to improved performance when the amount of training data is limited. Recognizing the importance of accurate mathematical models of the controlled system, the second part of the thesis turns its focus to learning-based identification methods. This branch of research deals with the characterization and modeling of dynamical systems using machine learning. The thesis contributes to the field by introducing cooperative system identification methods to improve parameter estimation. Specifically, tools from optimal transport are used to introduce a new and more general formulation of the correctional learning framework. This framework is based on a master-apprentice model, in which an expert agent (master) observes and modifies the collected data used by a learning agent (apprentice), with the aim of improving the apprentice's estimation process. By formulating correctional learning as an optimal transport problem, a more flexible framework is obtained, better suited to estimating complex system properties and to adapting to alternative action strategies.
2.
  • de Miranda de Matos Lourenço, Inês, 1994- (author)
  • Forward and Inverse Decision-Making in Adversarial, Cooperative, and Biologically-Inspired Dynamical Systems
  • 2021
  • Licentiate thesis (other academic/artistic). Abstract:
    • Decision-making is the mechanism of using available information to develop solutions to given problems by forming preferences and beliefs, or by selecting courses of action among several alternatives. It is the main focus of a variety of scientific fields such as robotics, finance, and neuroscience. In this thesis, we study the mechanisms that generate behavior in diverse decision-making settings (the forward problem) and how their characteristics can explain observed behavior (the inverse problem). Both problems take a central role in current research due to the desire to understand the features of system behavior, often under situations of risk and uncertainty. We study decision-making problems in the following three settings.
      In the first setting, we consider a decision-maker who forms a private belief (posterior distribution) on the state of the world by filtering private information. Estimating private beliefs is a way to understand what drives decisions. This forms a foundation for predicting, and counteracting, future actions. In the setting of adversarial systems, we answer the problems of i) how an adversary can estimate the private belief of the decision-maker by observing its decisions (under two different scenarios), and ii) how the decision-maker can protect its private belief by confusing the adversary. We exemplify the applicability of our frameworks in regime-switching Markovian portfolio allocation.
      In the second setting, we shift from an adversarial to a cooperative scenario. We consider a teacher-student framework similar to that used in learning from demonstration and transfer learning setups. An expert agent (teacher) knows the model of a system and wants to assist a learner agent (student) in performing identification for that system but cannot directly transfer its knowledge to the student. For example, the teacher's knowledge of the system might be abstract, or the teacher and student might be employing different model classes, which renders the teacher's parameters uninformative to the student. We propose correctional learning as an approach where, in order to assist the student, the teacher can intercept the observations collected from the system and modify them to maximize the amount of information the student receives about the system. We obtain finite-sample results for correctional learning of binomial systems.
      In the third and final setting, we shift our attention to cognitive science and the decision-making of biological systems, to obtain insight into the intrinsic characteristics of these systems. We focus on time perception, that is, how humans and animals perceive the passage of time, and solve the forward problem by designing a biologically-inspired decision-making framework that replicates the mechanisms responsible for time perception. We conclude that a simulated robot equipped with our framework is able to perceive time similarly to animals when it comes to their intrinsic mechanisms of interpreting time and performing time-aware actions. We then focus on the inverse problem. Based on the empirical action probability distribution of the agent, we are able to estimate the parameters it uses for perceiving time. Our work shows promising results when it comes to drawing conclusions regarding some of the characteristics present in biological timing mechanisms.
3.
  • de Miranda de Matos Lourenço, Inês, 1994- (author)
  • Learning from Interactions : Forward and Inverse Decision-Making for Autonomous Dynamical Systems
  • 2023
  • Doctoral thesis (other academic/artistic). Abstract:
    • Decision-making is the mechanism of using available information to generate solutions to given problems by forming preferences and beliefs and by selecting courses of action among several alternatives. In this thesis, we study the mechanisms that generate behavior (the forward problem) and how their characteristics can explain observed behavior (the inverse problem). Both problems play a pivotal role in contemporary research due to the desire to design sophisticated autonomous agents that serve as the building blocks for a smart society, amidst complexity, risk, and uncertainty. This work explores different parts of the autonomous decision-making process where agents learn from interacting with each other and the environment that surrounds them. We address fundamental problems of behavior modeling, parameter estimation in the form of beliefs, distributions, and reward functions, and finally interactions with other agents, which lay the foundation for a complete and integrative framework for decision-making and learning. The thesis is divided into four parts, each featuring a different information exchange paradigm.
      First, we model the forward problem of how a decision-maker forms beliefs about the world and the inverse problem of estimating these beliefs from the agent’s behavior. The private belief (posterior distribution) on the state of the world is formed according to a hidden Markov model by filtering private information. The ability to estimate private beliefs forms a foundation for predicting and counteracting future actions. We answer the problems of i) how the private belief of the decision-maker can be estimated by observing its decisions (under two different scenarios), and ii) how the decision-maker can protect its private belief from an adversary by confusing it. We exemplify the applicability of our frameworks in regime-switching Markovian portfolio allocation.
      In the second part, we study forward decision-making of biological systems and the inverse problem of how to obtain insight into their intrinsic characteristics. We focus on time perception (how humans and animals perceive the passage of time) and design a biologically-inspired decision-making framework using reinforcement learning that replicates timing mechanisms. We show that a simulated robot equipped with our framework is able to perceive time similarly to animals, and that by analyzing its performed actions we are able to estimate the parameters of timing mechanisms.
      Next, we consider teacher-student settings where a teacher agent can intervene in the decision-making process of a student agent to assist it in performing a task. In the third part, we propose correctional learning as an approach where the teacher can intercept the observations the student collects from the system and modify them to improve the estimation process of the student. We provide finite-sample results for batch correctional learning in system identification, generalize it to more complex systems using optimal transport, and lower-bound improvements on the estimate’s variance for the online case.
      Decision-making in teacher-student settings like the previous one requires both agents to have aligned models of understanding of each other. In the fourth and last part of this thesis, the teacher can, instead, alter the decisions of the decision-maker in a human-robot interaction setting. We use a confidence-based misalignment detection method that enables the robot to update its knowledge proportionally to its confidence in the human corrections, and propose a framework to disambiguate between misalignment caused by incorrectly learned features that do not generalize to new environments and features entirely missing from the robot’s model. We demonstrate the proposed framework on a 7-degree-of-freedom robot manipulator with physical human corrections and show how to initiate the model realignment process once misalignment is detected.
4.
  • González, Rodrigo A., 1992- (author)
  • Consistency and efficiency in continuous-time system identification
  • 2020
  • Licentiate thesis (other academic/artistic). Abstract:
    • Continuous-time system identification deals with the problem of building continuous-time models of dynamical systems from sampled input and output data. In this field, there are two main approaches: indirect and direct. In the indirect approach, a suitable discrete-time model is first determined, and then it is transformed into continuous time. On the other hand, the direct approach obtains a continuous-time model directly from the sampled data. In both approaches there exists a dichotomy between discrete-time data and continuous-time models, which can induce robustness issues and complications in the theoretical analysis of identification algorithms. These difficulties are addressed in this thesis.
      First, we consider the indirect approach to continuous-time system identification. For a zero-order hold sampling mechanism, this approach usually leads to a transfer function estimate with relative degree one, independent of the relative degree of the strictly proper true system. Inspired by the indirect prediction error method, we propose an indirect-approach estimator that enforces the desired number of poles and zeros in the continuous-time transfer function estimate, and show that the estimator is consistent and asymptotically efficient. A robustification of this method is also developed, by which the estimates are also guaranteed to deliver stable models.
      In the second part of the thesis, we analyze asymptotic properties of the Simplified Refined Instrumental Variable method for Continuous-time systems (SRIVC), which is one of the most popular direct identification methods. This algorithm applies an adaptive prefiltering to the sampled input and output that requires assumptions on the intersample behavior of the signals. We present a comprehensive analysis of the consistency and asymptotic efficiency of the SRIVC estimator while taking into account the intersample behavior of the input signal. Our results show that the SRIVC estimator is generically consistent when the intersample behavior of the input is known exactly and subsequently used in the implementation of the algorithm, and we give conditions under which consistency is not achieved. In terms of statistical efficiency, we compute the asymptotic Cramér-Rao lower bound for an output error model structure with Gaussian noise, and derive the asymptotic covariance of the SRIVC estimates. We conclude that the SRIVC estimator is asymptotically efficient under mild conditions, and that this property can be lost if the intersample behavior of the input is not carefully accounted for in the SRIVC procedure.
      Moreover, we propose and analyze the statistical properties of an extension of SRIVC that is able to deal with input signals that cannot be interpolated exactly via hold reconstructions. The proposed estimator is generically consistent for any input reconstructed using zero- or first-order-hold devices, and we show that it is generically consistent for continuous-time multisine inputs as well. Comparisons with the Maximum Likelihood technique and an analysis of the iterations of the method are provided, in order to reveal the influence of the intersample behavior of the output and to propose new robustifications to the SRIVC algorithm.
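      Illustrative note (not part of the catalog record): the indirect approach described in this abstract can be sketched for the simplest possible case. The code below is a minimal sketch, assuming a first-order system G(s) = b / (s + a), noise-free data, and zero-order-hold sampling with period h. All function names are hypothetical, and this is plain least squares followed by a conversion step, not the estimator proposed in the thesis. Because a first-order system already has relative degree one, the relative-degree issue mentioned in the abstract does not arise here.

```python
import math

def zoh_discretize(a, b, h):
    """Exact ZOH sampling of G(s) = b/(s + a): y[k+1] = ad*y[k] + bd*u[k]."""
    ad = math.exp(-a * h)
    bd = (b / a) * (1.0 - ad)
    return ad, bd

def simulate(ad, bd, u):
    """Simulate the discrete-time model from zero initial condition."""
    y = [0.0]
    for uk in u:
        y.append(ad * y[-1] + bd * uk)
    return y

def fit_discrete(u, y):
    """Step 1 of the indirect approach: least-squares fit of (ad, bd).

    Solves the 2x2 normal equations of y[k+1] = ad*y[k] + bd*u[k].
    Expects len(y) == len(u) + 1.
    """
    y0, y1 = y[:-1], y[1:]
    syy = sum(v * v for v in y0)
    suu = sum(v * v for v in u)
    syu = sum(p * q for p, q in zip(y0, u))
    sy1y = sum(p * q for p, q in zip(y1, y0))
    sy1u = sum(p * q for p, q in zip(y1, u))
    det = syy * suu - syu * syu
    ad = (sy1y * suu - sy1u * syu) / det
    bd = (sy1u * syy - sy1y * syu) / det
    return ad, bd

def to_continuous(ad, bd, h):
    """Step 2: invert the ZOH map to recover the continuous-time (a, b)."""
    a = -math.log(ad) / h
    b = bd * a / (1.0 - ad)
    return a, b

def identify_indirect(u, y, h):
    """Discrete-time fit followed by conversion to continuous time."""
    ad, bd = fit_discrete(u, y)
    return to_continuous(ad, bd, h)
```

      With noisy data the conversion step can fail (for example if the fitted ad is not in (0, 1)), which hints at the kind of robustness concern the abstract raises.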
5.
  • Lapandic, Dzenan (author)
  • Trajectory Tracking and Prediction-Based Coordination of Underactuated Unmanned Vehicles
  • 2023
  • Licentiate thesis (other academic/artistic). Abstract:
    • In this thesis, we study trajectory tracking and prediction-based control of underactuated unmanned aerial and surface vehicles. In the first part of the thesis, we examine trajectory tracking using prescribed performance control (PPC), assuming that the model parameters are unknown. Moreover, due to the underactuation, the original PPC is redesigned to accommodate the specifics of the considered underactuated systems. We prove the stability of the proposed control schemes and support the results with numerical simulations on quadrotor and boat models. Furthermore, we propose enhancements to the kinodynamic motion-planning via funnel control (KDF) framework that are based on the rapidly-exploring random tree (RRT) algorithm and B-splines to generate smooth trajectories and track them with PPC. We conducted real-world experiments and tested the advantages of the proposed enhancements to KDF. The second part of the thesis is devoted to the rendezvous problem of autonomous landing of a quadrotor on a boat based on distributed model predictive control (MPC) algorithms. We propose an algorithm that assumes minimal exchange of information between the agents, namely the rendezvous location, and an update rule to maintain the recursive feasibility of the landing. Moreover, we present a convergence proof without enforcing terminal set constraints. Finally, we investigated a leader-follower framework and presented an algorithm for multiple follower agents to land autonomously on the landing platform attached to the leader. Each agent is equipped with a trajectory predictor to handle cases of communication loss and to avoid inter-agent collisions. The algorithm is tested in a simulation scenario with the described challenges, and the numerical results support the theoretical findings.
6.
  • Mattila, Robert (author)
  • Hidden Markov Models: Identification, Inverse Filtering and Applications
  • 2020
  • Doctoral thesis (other academic/artistic). Abstract:
    • A hidden Markov model (HMM) comprises a state with Markovian dynamics that is hidden in the sense that it can only be observed via a noisy sensor. This thesis considers three themes in relation to HMMs, namely, identification, inverse filtering and applications.
      In order to employ an HMM, its parameters must first be identified (or estimated) from data. Traditional maximum-likelihood estimation procedures may, in practice, suffer from convergence to bad local optima and high computational cost. Recently proposed methods of moments address these shortcomings, but are less accurate. We explore how such methods can be extended to incorporate non-consecutive correlations in data so as to improve their accuracy (while still retaining their attractive properties).
      Motivated by applications in the design of counter-adversarial autonomous (CAA) systems, we then ask the question: Is it possible to estimate the parameters of an HMM from data sources other than raw measurements from its sensor? To answer this question, we consider a number of inverse filtering problems. First, we demonstrate how HMM parameters and sensor measurements can be reconstructed from posterior distributions from an HMM filter. Next, we show how to estimate such posterior distributions from actions taken by a rational agent. Finally, we bridge our results to provide a solution to the CAA problem of remotely estimating the accuracy of an adversary’s sensor based on its actions.
      Throughout the thesis, we motivate our results with applications in various domains. A real-world application that we investigate in particular detail is how the treatment of abdominal aortic aneurysms can be modeled in the Markovian framework. Our findings suggest that the structural properties of the optimal treatment policy are different from those recommended by current clinical guidelines; in particular, younger patients could benefit from earlier surgery. This indicates an opportunity for improved care of patients with the disease.
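      Illustrative note (not part of the catalog record): the "posterior distributions from an HMM filter" that the inverse filtering problems take as input are produced by the standard forward recursion. The code below is a minimal sketch of that textbook recursion for a finite-state HMM, assuming a row-stochastic transition matrix P and an observation matrix B. The function names and the example numbers are hypothetical and are not taken from the thesis.

```python
def hmm_filter_step(belief, P, B, y):
    """One HMM filter update (predict, then correct).

    belief[i] = Pr(x_k = i | y_1..y_k)
    P[i][j]   = Pr(x_{k+1} = j | x_k = i)   (each row sums to 1)
    B[j][y]   = Pr(observation y | state j)
    Returns the posterior over x_{k+1} given the new observation y.
    """
    n = len(belief)
    # Prediction: propagate the belief through the Markov dynamics.
    predicted = [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]
    # Correction: weight by the observation likelihood and normalize.
    unnormalized = [B[j][y] * predicted[j] for j in range(n)]
    total = sum(unnormalized)
    return [v / total for v in unnormalized]

def hmm_filter(belief, P, B, observations):
    """Run the filter over a sequence and return every posterior."""
    posteriors = []
    for y in observations:
        belief = hmm_filter_step(belief, P, B, y)
        posteriors.append(belief)
    return posteriors

# Example with made-up numbers: two states, two observation symbols.
P_example = [[0.9, 0.1], [0.2, 0.8]]
B_example = [[0.8, 0.2], [0.3, 0.7]]
posteriors_example = hmm_filter([0.5, 0.5], P_example, B_example, [0, 0, 1])
```

      Inverse filtering, as described in the abstract, goes in the opposite direction: given a sequence such as `posteriors_example`, recover information about P, B, or the observations.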
7.
  • Pereira, Goncalo Collares (author)
  • Adaptive Lateral Model Predictive Control for Autonomous Driving of Heavy-Duty Vehicles
  • 2023
  • Doctoral thesis (other academic/artistic). Abstract:
    • Autonomous Vehicle (AV) technology promises safer, greener, and more efficient means of transportation for everyone. AVs are expected to have their first big impact in closed environments, such as mining areas, ports, and construction sites, where Heavy-Duty Vehicles (HDVs) operate. This thesis addresses lateral motion control for autonomous HDVs using Model Predictive Control (MPC). Lateral control for HDVs still has many open questions to be addressed, in particular, precise path tracking while ensuring a smooth, comfortable, and stable ride, coping with both external and internal disturbances, and adapting to different vehicles and conditions.
      To address these challenges, a comprehensive control module architecture is designed to adapt seamlessly to different vehicle types and interface with various planning and localization modules. Furthermore, it is designed to address system delays, maintain certain error bounds, and respect actuation constraints.
      This thesis presents the Reference Aware MPC (RA-MPC) for autonomous vehicles. This controller is iteratively improved throughout the thesis. The RA-MPC introduces a method to systematically handle references generated by motion planners, which may use algorithms and vehicle models different from those of the controller. The controller uses the linear time-varying MPC framework and considers control input rate and acceleration constraints to account for steering limitations. Furthermore, multiple models and control inputs are considered throughout the thesis. Ultimately, curvature acceleration is used as the control input, which, together with stability ingredients, allows for stability guarantees under certain conditions via Lyapunov techniques.
      MPC is highly dependent on the prediction model used. This thesis proposes and compares different models. First, an offline-fitted, vehicle-specific nonlinear curvature response function is proposed and integrated into the kinematic bicycle model. The curvature response function is modeled as two Gaussian functions. To enhance the model's versatility and applicability to a fleet of vehicles, the nonlinear curvature response table kinematic model is presented. This model replaces the function with a table, which is estimated online by means of Kalman filtering, adapting to the current vehicle and operating conditions.
      All controllers and models are simulated and experimentally validated on Scania HDVs and iteratively compared to the previous state-of-the-art. The RA-MPC with the nonlinear curvature response table kinematic model is shown to be the best for the problems and conditions considered. The robustness and adaptiveness of the proposed approach are highlighted by testing different vehicle configurations (a haulage truck, a mining truck, and a bus), operating conditions, and scenarios. The model allows all vehicles to accomplish the scenarios with very similar performance. Overall, the results show an average absolute lateral error to path no bigger than 7 cm, and a worst-case deviation no bigger than 25 cm. These results demonstrate the controller's ability to handle a fleet of HDVs, without the need for vehicle-specific tuning or intervention from expert engineers.
8.
  • Persson, Linnea, 1992- (author)
  • Model Predictive Control for Cooperative Rendezvous of Autonomous Unmanned Vehicles
  • 2021
  • Doctoral thesis (other academic/artistic). Abstract:
    • This thesis investigates cooperative maneuvers for aerial vehicles autonomously landing on moving platforms. The objective has been to develop methods for safely performing such landings on real systems subject to a variety of disturbances, as well as physical and computational constraints. Two specific examples are considered: the landing of a fixed-wing drone on top of a moving ground carriage; and the landing of a quadcopter on the deck of a boat. The maneuvers are executed in a cooperative manner where both vehicles are allowed to take actions to reach their common objective, while respecting safety-based spatial constraints. Applications of such systems can be found in, for example, autonomous deliveries, emergency landings, and search and rescue missions. Particular challenges of cooperative landing maneuvers include the heterogeneous and nonlinear dynamics, the coupled control, the sensitivity to disturbances, and the safety criticality of performing a high-velocity landing maneuver.
      In this thesis, a cooperative landing algorithm based on Model Predictive Control (MPC) that includes spatial safety constraints for avoiding dangerous regions is developed. MPC offers many advantages for the autonomous landing problem, with its ability to explicitly consider dynamic equations, constraints, and disturbances directly in the computation of the control inputs. It is shown that the cooperative landing MPC can be decoupled into a horizontal and a vertical sub-problem. This result makes the optimization problems significantly less computationally demanding and facilitates the real-time implementation. The autonomous landing maneuver is further improved by the employment of a variable horizon. The variable-horizon MPC framework lets the finite horizon length become a part of the optimization problem, and makes it possible to always extend the horizon to the end of the landing maneuver. An algorithm for variable-horizon MPC that can be implemented on real-time systems is derived by the use of efficient update rules, and by taking into account the similarities between the multiple optimization problems that have to be solved in each sampling period. The algorithm is fast enough to be used even in time-critical systems with long horizons. Furthermore, the solution time of the variable-horizon MPC decreases as the target gets closer. This means that the computational demand becomes smaller in the most critical part of the landing maneuver.
      The algorithms are derived for two different landing systems, and are subsequently implemented in realistic simulations and in real-world outdoor flight tests through the WASP research arena. The results demonstrate both that the controllers are practically implementable on real systems with computational limitations, and that the suggested controller can successfully be used to perform the cooperative landing under the influence of external disturbances and under the constraint of various safety requirements.
9.
  • Li, Yibei, et al. (authors)
  • A Duality-Based Approach to Inverse Kalman Filtering
  • 2023
  • In: 22nd IFAC World Congress, Yokohama, Japan, July 9-14, 2023. Elsevier BV, pp. 10258-10263
  • Conference paper (peer-reviewed). Abstract:
    • In this paper, the inverse Kalman filtering problem is addressed using a duality-based framework, where certain statistical properties of uncertainties in a dynamical model are recovered from observations of its posterior estimates. The duality relation in inverse filtering and inverse optimal control is established. It is shown that the inverse Kalman filtering problem can be solved using results from a well-posed inverse linear quadratic regulator. Identifiability of the considered inverse filtering model is proved and a unique covariance matrix is recovered by a least squares estimator, which is also shown to be statistically consistent. Effectiveness of the proposed methods is illustrated by numerical simulations.
10.
  • Li, Yibei, 1993-, et al. (authors)
  • Identifiability and Solvability in Inverse Linear Quadratic Optimal Control Problems
  • 2021
  • In: Journal of Systems Science and Complexity. Springer Nature. ISSN 1009-6124, 1559-7067. 34:5, pp. 1840-1857
  • Journal article (peer-reviewed). Abstract:
    • In this paper, the inverse linear quadratic (LQ) problem over a finite time horizon is studied. Given the output observations of a dynamic process, the goal is to recover the corresponding LQ cost function. Firstly, by considering the inverse problem as an identification problem, its model structure is shown to be strictly globally identifiable under the assumption of system invertibility. Next, in the noiseless case, a necessary and sufficient condition is proposed for the solvability of a positive semidefinite weighting matrix, and its unique solution is obtained with two proposed algorithms under the condition of persistent excitation. Furthermore, a residual optimization problem is formulated to solve for a best-fit approximate cost function from sub-optimal observations. Finally, numerical simulations are used to demonstrate the effectiveness of the proposed methods.
