SwePub

Results for search "WFRF:(Johnander Joakim 1993)"

  • Results 1-4 of 4
1.
  • Brissman, Emil, 1987-, et al. (authors)
  • Recurrent Graph Neural Networks for Video Instance Segmentation
  • 2023
  • In: International Journal of Computer Vision. Springer. ISSN 0920-5691, 1573-1405. Vol. 131, pp. 471-495
  • Journal article (peer-reviewed). Abstract:
    • Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method, which models the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach operates online at over 25 FPS and obtains 16.3 AP on the challenging OVIS benchmark, setting a new state-of-the-art. We further conduct detailed ablative experiments that validate the different aspects of our approach. Code is available at https://github.com/emibr948/RGNNVIS-PlusPlus.
2.
  • Johnander, Joakim, 1993- (author)
  • Dynamic Visual Learning
  • 2022
  • Doctoral thesis (other academic/artistic). Abstract:
    • Autonomous robots act in a dynamic world where both the robots and other objects may move. The surround sensing systems of said robots therefore work with dynamic input data and need to estimate both the current state of the environment as well as its dynamics. One of the key elements to obtain a high-level understanding of the environment is to track dynamic objects. This enables the system to understand what the objects are doing; predict where they will be in the future; and in the future better estimate where they are. In this thesis, I focus on input from visual cameras, images. Images have, with the advent of neural networks, become a cornerstone in sensing systems. Image-processing neural networks are optimized to perform a specific computer vision task, such as recognizing cats and dogs, on vast datasets of annotated examples. This is usually referred to as offline training, and given a well-designed neural network, enough high-quality data, and a suitable offline training formulation, the neural network is expected to become adept at the specific task.
      This thesis starts with a study of object tracking. The tracking is based on the visual appearance of the object, achieved via discriminative convolution filters (DCFs). The first contribution of this thesis is to decompose the filter into multiple subfilters. This serves to increase the robustness during object deformations or rotations. Moreover, it provides a more fine-grained representation of the object state as the subfilters are expected to roughly track object parts. In the second contribution, a neural network is trained directly for object tracking. In order to obtain a fine-grained representation of the object state, it is represented as a segmentation. The main challenge lies in the design of a neural network able to tackle this task. While the common neural networks excel at recognizing patterns seen during offline training, they struggle to store novel patterns in order to later recognize them. To overcome this limitation, a novel appearance learning mechanism is proposed. The mechanism extends the state-of-the-art and is shown to generalize remarkably well to novel data. In the third contribution, the method is used together with a novel fusion strategy and failure detection criterion to semi-automatically annotate visual and thermal videos.
      Sensing systems need not only track objects, but also detect them. The fourth contribution of this thesis strives to tackle joint detection, tracking, and segmentation of all objects from a predefined set of object classes. The challenge here lies not only in the neural network design, but also in the design of the offline training formulation. The final approach, a recurrent graph neural network, outperforms prior works that have a runtime of the same order of magnitude.
      Last, this thesis studies dynamic learning of novel visual concepts. It is observed that the learning mechanisms used for object tracking essentially learn the appearance of the tracked object. It is natural to ask whether this appearance learning could be extended beyond individual objects to entire semantic classes, enabling the system to learn new concepts based on just a few training examples. Such an ability is desirable in autonomous systems as it removes the need to manually annotate thousands of examples of each class that needs recognition. Instead, the system is trained to efficiently learn to recognize new classes. In the fifth contribution, we propose a novel learning mechanism based on Gaussian process regression. With this mechanism, our neural network outperforms the state-of-the-art and the performance gap is especially large when multiple training examples are given.
      To summarize, this thesis studies and makes several contributions to learning systems that parse dynamic visuals and that dynamically learn visual appearances or concepts.
3.
  • Johnander, Joakim, 1993-, et al. (authors)
  • Video Instance Segmentation with Recurrent Graph Neural Networks
  • 2021
  • In: Pattern Recognition. Cham: Springer. ISBN 9783030926588, 9783030926595. pp. 206-221
  • Conference paper (peer-reviewed). Abstract:
    • Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method, which models the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach, operating at over 25 FPS, outperforms previous video real-time methods. We further conduct detailed ablative experiments that validate the different aspects of our approach.
4.
  • Ljungbergh, William, et al. (authors)
  • Raw or Cooked? : Object Detection on RAW Images
  • 2023
  • In: Image Analysis. Springer. ISBN 9783031314346, 9783031314353. pp. 374-385
  • Conference paper (peer-reviewed). Abstract:
    • Images fed to a deep neural network have in general undergone several handcrafted image signal processing (ISP) operations, all of which have been optimized to produce visually pleasing images. In this work, we investigate the hypothesis that the intermediate representation of visually pleasing images is sub-optimal for downstream computer vision tasks compared to the RAW image representation. We suggest that the operations of the ISP instead should be optimized towards the end task, by learning the parameters of the operations jointly during training. We extend previous works on this topic and propose a new learnable operation that enables an object detector to achieve superior performance when compared to both previous works and traditional RGB images. In experiments on the open PASCALRAW dataset, we empirically confirm our hypothesis.