SwePub
Search the SwePub database


Search: WFRF:(Brissman Emil)

  • Results 1-6 of 6
1.
  • Brissman, Emil, 1987-, et al. (author)
  • Camera Calibration Without Camera Access - A Robust Validation Technique for Extended PnP Methods
  • 2023
  • Conference paper (peer-reviewed)
    • Abstract: A challenge in image-based metrology and forensics is intrinsic camera calibration when the used camera is unavailable. The unavailability raises two questions: how to find the projection model that describes the camera, and how to detect incorrect models. In this work, we use off-the-shelf extended PnP methods to find the model from 2D-3D correspondences and propose a method for model validation. The most common strategy for evaluating a projection model is comparing different models' residual variances; however, this naive strategy cannot distinguish whether the projection model is potentially underfitted or overfitted. To this end, we model the residual errors for each correspondence, individually scale all residuals using a predicted variance, and test whether the new residuals are drawn from a standard normal distribution (a minimal sketch of this test follows the entry). We demonstrate the effectiveness of the proposed validation in experiments on synthetic data, simulating 2D detections and lidar measurements. Additionally, we provide experiments using data from an actual scene and compare non-camera-access and camera-access calibrations. Last, we use our method to validate annotations in MegaDepth.
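The validation step described in entry 1 reduces to a normality test on variance-scaled residuals. Below is a minimal sketch of that idea, assuming the per-correspondence variances come from some already-fitted residual-error model; the function names are hypothetical, not the authors' code.

```python
# Scale each residual by its predicted standard deviation, then test
# whether the scaled residuals look like draws from N(0, 1).
import numpy as np
from scipy import stats

def projection_model_plausible(residuals, predicted_var, alpha=0.05):
    """residuals: (N,) reprojection errors of the 2D-3D correspondences.
    predicted_var: (N,) per-correspondence variances (assumed given).
    Returns True if the projection model is not rejected at level alpha."""
    z = residuals / np.sqrt(predicted_var)   # individually scaled residuals
    _, p_value = stats.kstest(z, "norm")     # compare against N(0, 1)
    return p_value >= alpha
```

The point of the scaling is that both failure modes become visible: an underfitted model leaves the scaled residuals overdispersed, an overfitted one leaves them underdispersed, and either deviation from the standard normal drives the p-value down.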
2.
  • Brissman, Emil, 1987- (author)
  • Learning to Analyze Visual Data Streams for Environment Perception
  • 2023
  • Doctoral thesis (other academic/artistic)
    • Abstract: A mobile robot, instructed by a human operator, acts in an environment with many other objects. For an autonomous robot, however, human instructions should be minimal and high-level only, such as the ultimate task or destination. In order to increase the level of autonomy, it has become a foremost objective to mimic human vision using neural networks that take a stream of images as input and learn a specific computer vision task from large amounts of data. In this thesis, we explore several different models for surround sensing, each of which contributes to a better understanding of the environment. As its first contribution, this thesis presents an object tracking method for video sequences, which is a crucial component in a perception system. This method predicts a fine-grained mask to separate the pixels corresponding to the target from those corresponding to the background. Rather than tracking location and size, the method tracks the pixels initially assigned to the target in this so-called video object segmentation. For subsequent time steps, the goal is to learn how the target looks using features from a neural network. We named our method A-GAME, as it is based on generative modeling of a deep feature space that separates target and background appearances. In the second contribution of this thesis, we detect, track, and segment all objects from a set of predefined object classes; this information increases the robot's capability to perceive its surroundings. We experiment with a graph neural network that weighs all new detections and existing tracks. This model outperforms prior works by separating visually and semantically similar objects frame by frame. The third contribution investigates one limitation of anchor-based detectors, which classify pre-defined bounding boxes as either negative or positive and thus provide a limited set of handled object shapes. One idea is to learn an alternative instance representation. We experiment with a neural network that predicts the distance to the nearest object contour in different directions from each pixel. The network then computes an approximated signed distance function containing the respective instance information. Last, this thesis studies a concept within model validation. We observed that overfitting can increase performance on benchmarks. However, this opportunity is of little value for sensing systems in practice, since measurements, such as lengths or angles, are quantities that explain the environment. The fourth contribution of this thesis is an extended validation technique for camera calibration. This technique uses a statistical model for each error difference between an observed value and a corresponding prediction of the projective model. We compute a test over the differences and detect whether the projective model is incorrect.
3.
  • Brissman, Emil, et al. (author)
  • Predicting Signed Distance Functions for Visual Instance Segmentation
  • 2021
  • In: 33rd Annual Workshop of the Swedish Artificial Intelligence Society (SAIS). Institute of Electrical and Electronics Engineers (IEEE). ISBN 9781665442367, 9781665442374. pp. 5-10
  • Conference paper (peer-reviewed)
    • Abstract: Visual instance segmentation is a challenging problem and becomes even more difficult if objects of interest vary unconstrained in shape. Some objects are well described by a rectangle; however, this is hardly always the case. Consider, for instance, long, slender objects such as ropes. Anchor-based approaches classify predefined bounding boxes as either negative or positive and thus provide a limited set of shapes that can be handled. Defining anchor boxes that fit well to all possible shapes leads to an infeasible number of prior boxes. We explore a different approach and propose to train a neural network to compute distance maps along different directions. The network is trained at each pixel to predict the distance to the closest object contour in a given direction. By pooling the distance maps, we obtain an approximation to the signed distance function (SDF); a minimal sketch of this pooling follows the entry. The SDF may then be thresholded in order to obtain a foreground-background segmentation. We compare this segmentation to foreground segmentations obtained from the state-of-the-art instance segmentation method YOLACT. On the COCO dataset, our segmentation yields higher performance in terms of foreground intersection over union (IoU). However, while the distance maps contain information on the individual instances, it is not straightforward to map them to the full instance segmentation. We still believe that this idea is a promising research direction for instance segmentation, as it better captures the different shapes found in the real world.
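As a rough illustration of the pooling in entry 3: if the network outputs one signed distance map per direction, then per pixel the direction with the smallest absolute distance approximates the distance to the nearest contour overall. A minimal sketch under that assumption (the paper's exact pooling may differ, and the sign convention below, positive inside objects, is an assumption):

```python
# Pool per-direction signed distance maps into an approximate SDF,
# then threshold at zero to obtain a foreground-background mask.
import numpy as np

def approximate_sdf(distance_maps: np.ndarray) -> np.ndarray:
    """distance_maps: (D, H, W) signed distances to the closest object
    contour along D directions (assumed positive inside, negative outside).
    Per pixel, keep the direction with the smallest absolute distance."""
    idx = np.abs(distance_maps).argmin(axis=0)                     # (H, W)
    return np.take_along_axis(distance_maps, idx[None], axis=0)[0]

def foreground_mask(distance_maps: np.ndarray) -> np.ndarray:
    return approximate_sdf(distance_maps) > 0.0  # inside some instance
```

Thresholding the pooled map at zero gives the foreground-background segmentation the abstract compares against YOLACT; recovering per-instance masks from the same maps is the part the authors report as non-trivial.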
4.
  • Brissman, Emil, 1987-, et al. (author)
  • Recurrent Graph Neural Networks for Video Instance Segmentation
  • 2023
  • In: International Journal of Computer Vision. Springer. ISSN 0920-5691, e-ISSN 1573-1405. Vol. 131, pp. 471-495
  • Journal article (peer-reviewed)
    • Abstract: Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method that models the generic track management required to solve the video instance segmentation task is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks; past information is considered and processed via a recurrent connection (a toy sketch of this step follows the entry). We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach operates online at over 25 FPS and obtains 16.3 AP on the challenging OVIS benchmark, setting a new state of the art. We further conduct detailed ablative experiments that validate the different aspects of our approach. Code is available at https://github.com/emibr948/RGNNVIS-PlusPlus.
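Schematically, the per-frame step in entry 4 is message passing between the current detections and a recurrent memory of tracks. The toy sketch below makes strong simplifying assumptions (fixed-size embeddings, mean aggregation, a single GRU cell); for the real model, see the repository linked in the abstract.

```python
# Toy recurrent graph step: every track aggregates messages from all
# detections in the frame and updates its memory through a GRU cell.
import torch
import torch.nn as nn

class RecurrentTrackGraph(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.gru = nn.GRUCell(dim, dim)  # the recurrent connection over time

    def forward(self, det_feats: torch.Tensor, track_mem: torch.Tensor):
        """det_feats: (N, dim) detection embeddings in the current frame.
        track_mem: (M, dim) memory of previously seen tracks."""
        m, n = track_mem.size(0), det_feats.size(0)
        pairs = torch.cat([                      # all track-detection pairs
            track_mem[:, None, :].expand(m, n, -1),
            det_feats[None, :, :].expand(m, n, -1)], dim=-1)  # (M, N, 2*dim)
        messages = self.edge_mlp(pairs).mean(dim=1)  # aggregate per track
        return self.gru(messages, track_mem)         # updated track memories
```

Running such a cell once per frame is what allows track management to be learned jointly with the rest of the pipeline rather than hand-crafted, which is the framing of the abstract.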
5.
  • Johnander, Joakim, et al. (author)
  • A generative appearance model for end-to-end video object segmentation
  • 2019
  • In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Institute of Electrical and Electronics Engineers (IEEE). ISBN 9781728132938, 9781728132945. pp. 8945-8954
  • Conference paper (peer-reviewed)
    • Abstract: One of the fundamental challenges in video object segmentation is to find an effective representation of the target and background appearance. The best performing approaches resort to extensive fine-tuning of a convolutional neural network for this purpose. Besides being prohibitively expensive, this strategy cannot be truly trained end-to-end, since the online fine-tuning procedure is not integrated into the offline training of the network. To address these issues, we propose a network architecture that learns a powerful representation of the target and background appearance in a single forward pass. The introduced appearance module learns a probabilistic generative model of target and background feature distributions (a minimal caricature of this module follows the entry). Given a new image, it predicts the posterior class probabilities, providing a highly discriminative cue that is processed in later network modules. Both the learning and prediction stages of our appearance module are fully differentiable, enabling true end-to-end training of the entire segmentation pipeline. Comprehensive experiments demonstrate the effectiveness of the proposed approach on three video object segmentation benchmarks. We close the gap to approaches based on online fine-tuning on DAVIS17, while operating at 15 FPS on a single GPU. Furthermore, our method outperforms all published approaches on the large-scale YouTube-VOS dataset.
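The appearance module in entry 5 can be caricatured as a tiny generative classifier: fit one class-conditional distribution to target features and one to background features from the first frame, then score every new feature by its posterior. A minimal sketch with a single diagonal Gaussian per class and equal priors (the actual A-GAME module is richer; all names here are hypothetical):

```python
# Toy generative appearance model: one diagonal Gaussian per class,
# posterior class probabilities via Bayes' rule with equal priors.
import torch

def log_gaussian(x, samples, eps=1e-6):
    """Log-density (up to an additive constant) of rows of x under a
    diagonal Gaussian fitted to samples."""
    mu, var = samples.mean(0), samples.var(0) + eps
    return -0.5 * ((x - mu) ** 2 / var + var.log()).sum(-1)

def class_posteriors(feats, target_feats, background_feats):
    """feats: (P, C) features to classify. target_feats, background_feats:
    (K, C) features labeled in the first frame. Returns (P, 2) posteriors."""
    log_p = torch.stack([log_gaussian(feats, background_feats),
                         log_gaussian(feats, target_feats)], dim=-1)
    return log_p.softmax(dim=-1)  # column 1 is the target probability
```

Because fitting the Gaussians (means, variances) and evaluating the posterior are closed-form tensor operations, the whole module is differentiable, which is what makes the end-to-end training claimed in the abstract possible.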
6.
  • Johnander, Joakim, 1993-, et al. (author)
  • Video Instance Segmentation with Recurrent Graph Neural Networks
  • 2021
  • In: Pattern Recognition. Cham: Springer. ISBN 9783030926588, 9783030926595. pp. 206-221
  • Conference paper (peer-reviewed)
    • Abstract: Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method that models the generic track management required to solve the video instance segmentation task is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks; past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach, operating at over 25 FPS, outperforms previous real-time video methods. We further conduct detailed ablative experiments that validate the different aspects of our approach.
