SwePub
Search the SwePub database


Results list for search "WFRF:(Johnander Joakim) srt2:(2023)"

Search: WFRF:(Johnander Joakim) > (2023)

  • Results 1-5 of 5
1.
  • Brissman, Emil, 1987-, et al. (author)
  • Recurrent Graph Neural Networks for Video Instance Segmentation
  • 2023
  • In: International Journal of Computer Vision. Springer. ISSN 0920-5691, 1573-1405. Vol. 131, pp. 471-495
  • Journal article (peer-reviewed), abstract:
    • Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method, which models the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach operates online at over 25 FPS and obtains 16.3 AP on the challenging OVIS benchmark, setting a new state-of-the-art. We further conduct detailed ablative experiments that validate the different aspects of our approach. Code is available at https://github.com/emibr948/RGNNVIS-PlusPlus.
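For intuition, a minimal Python (PyTorch) sketch of the recurrent track-memory idea the abstract describes: in each frame the current detections and the stored track embeddings are processed jointly, and the memory is carried to the next frame through a recurrent update. The class name, feature dimension and the attention/GRU choices are illustrative assumptions, not the authors' architecture; their implementation is at https://github.com/emibr948/RGNNVIS-PlusPlus.

import torch
import torch.nn as nn

class RecurrentTrackUpdater(nn.Module):
    # Per frame: jointly process all detections and the track memory, then
    # carry the updated memory to the next frame (the recurrent connection).
    def __init__(self, feat_dim=128):
        super().__init__()
        self.det_to_track = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.update = nn.GRUCell(feat_dim, feat_dim)           # recurrent memory update

    def forward(self, track_mem, det_feats):
        # track_mem: (num_tracks, D), det_feats: (num_detections, D)
        q = track_mem.unsqueeze(0)                              # (1, T, D) queries: tracks
        kv = det_feats.unsqueeze(0)                             # (1, N, D) keys/values: detections
        msg, _ = self.det_to_track(q, kv, kv)                   # detection-to-track messages
        return self.update(msg.squeeze(0), track_mem)           # updated memory, (T, D)

model = RecurrentTrackUpdater()
memory = torch.zeros(5, 128)                                    # five active tracks
for det_feats in [torch.randn(7, 128), torch.randn(3, 128)]:    # two consecutive frames
    memory = model(memory, det_feats)                           # memory persists across frames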
2.
  • Carrasco Limeros, Sandra, et al. (author)
  • Towards explainable motion prediction using heterogeneous graph representations
  • 2023
  • In: Transportation Research, Part C: Emerging Technologies. Pergamon-Elsevier Science Ltd. ISSN 0968-090X, 1879-2359. Vol. 157
  • Journal article (peer-reviewed), abstract:
    • Motion prediction systems play a crucial role in enabling autonomous vehicles to navigate safely and efficiently in complex traffic scenarios. Graph Neural Network (GNN)-based approaches have emerged as a promising solution for capturing interactions among dynamic agents and static objects. However, they often lack transparency, interpretability and explainability — qualities that are essential for building trust in autonomous driving systems. In this work, we address this challenge by presenting a comprehensive approach to enhance the explainability of graph-based motion prediction systems. We introduce the Explainable Heterogeneous Graph-based Policy (XHGP) model based on a heterogeneous graph representation of the traffic scene and lane-graph traversals. Distinct from other graph-based models, XHGP leverages object-level and type-level attention mechanisms to learn interaction behaviors, providing information about the importance of agents and interactions in the scene. In addition, capitalizing on XHGP's architecture, we investigate the explanations provided by the GNNExplainer and apply counterfactual reasoning to analyze the sensitivity of the model to modifications of the input data. This includes masking scene elements, altering trajectories, and adding or removing dynamic agents. Our proposal advances towards achieving reliable and explainable motion prediction systems, addressing the concerns of users, developers and regulatory agencies alike. The insights gained from our explainability analysis contribute to a better understanding of the relationships between dynamic and static elements in traffic scenarios, facilitating the interpretation of the results, as well as the correction of possible errors in motion prediction models, and thus contributing to the development of trustworthy motion prediction systems. The code to reproduce this work is publicly available at https://github.com/sancarlim/Explainable-MP/tree/v1.1.
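As a hedged illustration of the counterfactual reasoning mentioned in the abstract, the sketch below removes one dynamic agent at a time, reruns the predictor, and scores how far the target agent's predicted trajectory moves. The predict stand-in (constant velocity plus a small repulsion term) and the scene format are assumptions made for this example only, not the XHGP API; the actual code is at https://github.com/sancarlim/Explainable-MP/tree/v1.1.

import numpy as np

def predict(agent_histories):
    # Toy stand-in for a learned predictor: constant velocity for the target
    # agent (id 0), plus a small repulsion away from every other agent so that
    # removing an agent actually changes the output.
    hist = agent_histories[0]
    vel = hist[-1] - hist[-2]
    future = hist[-1] + vel * np.arange(1, 13)[:, None]            # (12, 2) future positions
    for agent_id, other in agent_histories.items():
        if agent_id == 0:
            continue
        diff = future - other[-1]
        dist = np.linalg.norm(diff, axis=1, keepdims=True) + 1e-6
        future = future + 0.1 * diff / dist                        # push away from neighbours
    return future

def counterfactual_sensitivity(agent_histories):
    # Larger score => removing that agent changes the target's prediction more.
    base = predict(agent_histories)
    scores = {}
    for agent_id in agent_histories:
        if agent_id == 0:
            continue                                               # never remove the target
        masked = {k: v for k, v in agent_histories.items() if k != agent_id}
        scores[agent_id] = float(np.linalg.norm(predict(masked) - base, axis=1).mean())
    return scores

scene = {i: np.cumsum(np.random.randn(10, 2), axis=0) for i in range(4)}
print(counterfactual_sensitivity(scene))                           # per-agent sensitivity scores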
3.
  • Carrasco Limeros, Sandra, et al. (author)
  • Towards trustworthy multi-modal motion prediction: Holistic evaluation and interpretability of outputs
  • 2023
  • In: CAAI Transactions on Intelligence Technology. Wiley. ISSN 2468-6557, 2468-2322. In Press
  • Journal article (peer-reviewed), abstract:
    • Predicting the motion of other road agents enables autonomous vehicles to perform safe and efficient path planning. This task is very complex, as the behaviour of road agents depends on many factors and the number of possible future trajectories can be considerable (multi-modal). Most prior approaches proposed to address multi-modal motion prediction are based on complex machine learning systems that have limited interpretability. Moreover, the metrics used in current benchmarks do not evaluate all aspects of the problem, such as the diversity and admissibility of the output. The authors aim to advance towards the design of trustworthy motion prediction systems, based on some of the requirements for the design of Trustworthy Artificial Intelligence. The focus is on evaluation criteria, robustness, and interpretability of outputs. First, the evaluation metrics are comprehensively analysed, the main gaps of current benchmarks are identified, and a new holistic evaluation framework is proposed. Then, a method for the assessment of spatial and temporal robustness is introduced by simulating noise in the perception system. To enhance the interpretability of the outputs and generate more balanced results in the proposed evaluation framework, an intent prediction layer that can be attached to multi-modal motion prediction models is proposed. The effectiveness of this approach is assessed through a survey that explores different elements in the visualisation of the multi-modal trajectories and intentions. The proposed approach and findings make a significant contribution to the development of trustworthy motion prediction systems for autonomous vehicles, advancing the field towards greater safety and reliability.
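To make the robustness assessment concrete, here is a small sketch under stated assumptions: the observed track is perturbed with increasing Gaussian "perception" noise, and the average displacement error (ADE) of a predictor is recorded at each noise level. The constant-velocity predictor, the 12-step horizon and the noise levels are placeholders, not the paper's models or settings.

import numpy as np

def constant_velocity_predict(history, horizon=12):
    # Placeholder predictor standing in for a multi-modal motion prediction model.
    vel = history[-1] - history[-2]
    return history[-1] + vel * np.arange(1, horizon + 1)[:, None]

def ade(pred, gt):
    # Average displacement error: mean Euclidean distance over the horizon.
    return float(np.linalg.norm(pred - gt, axis=1).mean())

rng = np.random.default_rng(0)
history = np.cumsum(np.ones((10, 2)) * 0.5, axis=0)         # clean observed track
ground_truth = history[-1] + 0.5 * np.arange(1, 13)[:, None]

for sigma in [0.0, 0.1, 0.3, 0.5]:                           # simulated perception noise (metres)
    noisy = history + rng.normal(0.0, sigma, history.shape)
    print(f"sigma={sigma:.1f}  ADE={ade(constant_velocity_predict(noisy), ground_truth):.3f}")

A robust model should degrade gracefully as the simulated sensor noise grows; plotting ADE against sigma gives a simple spatial-robustness summary of the kind the abstract describes.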
4.
  • Ljungbergh, William, et al. (author)
  • Raw or Cooked? Object Detection on RAW Images
  • 2023
  • In: Image Analysis. Springer. ISBN 9783031314346, 9783031314353. pp. 374-385
  • Conference paper (peer-reviewed), abstract:
    • Images fed to a deep neural network have in general undergone several handcrafted image signal processing (ISP) operations, all of which have been optimized to produce visually pleasing images. In this work, we investigate the hypothesis that the intermediate representation of visually pleasing images is sub-optimal for downstream computer vision tasks compared to the RAW image representation. We suggest that the operations of the ISP instead should be optimized towards the end task, by learning the parameters of the operations jointly during training. We extend previous works on this topic and propose a new learnable operation that enables an object detector to achieve superior performance when compared to both previous works and traditional RGB images. In experiments on the open PASCALRAW dataset, we empirically confirm our hypothesis.
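A minimal sketch of the joint-training idea stated above: a tiny ISP with trainable parameters sits in front of the detector, so gradients from the detection loss, rather than visual quality, shape the ISP. The specific operations used here (a per-channel gain and a learnable gamma applied to a 3-channel RAW tensor) are illustrative assumptions and not the learnable operation proposed in the paper.

import torch
import torch.nn as nn

class LearnableISP(nn.Module):
    # Maps the RAW input to the detector's input space with trainable parameters.
    def __init__(self):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(3, 1, 1))         # per-channel gain
        self.log_gamma = nn.Parameter(torch.zeros(1))         # gamma = exp(log_gamma) > 0

    def forward(self, raw):                                   # raw: (B, 3, H, W) in [0, 1]
        x = (raw * self.gain).clamp(min=1e-6)
        return x ** torch.exp(self.log_gamma)

isp = LearnableISP()
detector = nn.Conv2d(3, 8, kernel_size=3, padding=1)          # stand-in for a real detector
optimizer = torch.optim.SGD(list(isp.parameters()) + list(detector.parameters()), lr=1e-2)

raw = torch.rand(2, 3, 64, 64)                                 # fake RAW batch
loss = detector(isp(raw)).mean()                               # stand-in for a detection loss
loss.backward()
optimizer.step()                                               # ISP and detector update together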
5.
  • Meding, Isak, et al. (author)
  • You can have your ensemble and run it too - Deep Ensembles Spread Over Time
  • 2023
  • In: Proceedings - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023. IEEE Computer Society. pp. 4022-4031
  • Conference paper (peer-reviewed), abstract:
    • Ensembles of independently trained deep neural networks yield uncertainty estimates that rival Bayesian networks in performance. They also offer sizable improvements in terms of predictive performance over single models. However, deep ensembles are not commonly used in environments with limited computational budget - such as autonomous driving - since the complexity grows linearly with the number of ensemble members. An important observation that can be made for robotics applications, such as autonomous driving, is that data is typically sequential. For instance, when an object is to be recognized, an autonomous vehicle typically observes a sequence of images, rather than a single image. This raises the question: could the deep ensemble be spread over time? In this work, we propose and analyze Deep Ensembles Spread Over Time (DESOT). The idea is to apply only a single ensemble member to each data point in the sequence, and fuse the predictions over a sequence of data points. We implement and experiment with DESOT for traffic sign classification, where sequences of tracked image patches are to be classified. We find that DESOT obtains the benefits of deep ensembles, in terms of predictive and uncertainty estimation performance, while avoiding the added computational cost. Moreover, DESOT is simple to implement and does not require sequences during training. Finally, we find that DESOT, like deep ensembles, outperforms single models for out-of-distribution detection.
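To make the fusion scheme concrete, a short sketch under stated assumptions: each frame of a tracked patch sequence is classified by exactly one ensemble member, chosen round-robin, and the per-frame class probabilities are averaged into a single fused distribution. The linear classifiers and 32x32 patches are toy stand-ins, not the models or data used in the paper.

import torch
import torch.nn as nn

ensemble = [nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)) for _ in range(4)]

def desot_classify(patch_sequence, members):
    # patch_sequence: list of (3, 32, 32) tensors for one tracked object.
    probs = []
    for t, patch in enumerate(patch_sequence):
        member = members[t % len(members)]                     # one ensemble member per frame
        with torch.no_grad():
            logits = member(patch.unsqueeze(0))
        probs.append(logits.softmax(dim=-1))
    return torch.cat(probs).mean(dim=0)                        # fused class distribution

track = [torch.rand(3, 32, 32) for _ in range(6)]              # six frames of one tracked sign
fused = desot_classify(track, ensemble)
print(fused.argmax().item(), fused.max().item())

Because only one member runs per frame, the per-frame cost matches a single model, while the fused sequence-level prediction retains the ensemble-style predictive and uncertainty benefits the abstract reports.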