SwePub
Search the SwePub database


Result list for the search "WFRF:(Schiele Bernt)"

Search: WFRF:(Schiele Bernt)

  • Results 1-5 of 5
1.
  • Ask, Erik, et al. (author)
  • Tractable and Reliable Registration of 2D Point Sets
  • 2014
  • In: Lecture Notes in Computer Science (Computer Vision - ECCV 2014, 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I). - Cham : Springer International Publishing. - 0302-9743 .- 1611-3349. - 9783319105895 - 9783319105901 ; 8689, pp. 393-406
  • Conference paper (peer-reviewed), abstract:
    • This paper introduces two new methods of registering 2D point sets over rigid transformations when the registration error is based on a robust loss function. In contrast to previous work, our methods are guaranteed to compute the optimal transformation, and at the same time, the worst-case running times are bounded by a low-degree polynomial in the number of correspondences. In practical terms, this means that there is no need to resort to ad-hoc procedures such as random sampling or local descent methods that cannot guarantee the quality of their solutions. We have tested the methods in several different settings, in particular, a thorough evaluation on two benchmarks of microscopic images used for histologic analysis of prostate cancer has been performed. Compared to the state-of-the-art, our results show that the methods are both tractable and reliable despite the presence of a significant amount of outliers.
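The paper's contribution is an exact solver with a polynomial worst-case bound; reproducing it is out of scope here. Purely as an illustration of the objective being optimized, the sketch below evaluates a rigid 2D registration error under a truncated quadratic robust loss. The loss choice, the `tau` parameter, and the grid search are illustrative assumptions, not the authors' method.

```python
import numpy as np

def rigid_transform(points, theta, t):
    """Apply a 2D rotation by angle theta plus translation t to an (N, 2) array."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return points @ R.T + t

def truncated_loss(residuals, tau=1.0):
    """Robust loss: squared error capped at tau**2, so each outlier adds a constant."""
    return np.minimum(np.sum(residuals**2, axis=1), tau**2)

def registration_cost(src, dst, theta, t, tau=1.0):
    """Total robust registration error for given point correspondences."""
    return truncated_loss(rigid_transform(src, theta, t) - dst, tau).sum()

# Toy example: recover a known rotation by grid search (illustration only;
# the paper gives a guaranteed-optimal polynomial-time solver, not a grid).
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 2))
dst = rigid_transform(src, theta=0.3, t=np.array([1.0, -2.0]))
dst[:5] += rng.normal(scale=5.0, size=(5, 2))   # inject a few gross outliers

thetas = np.linspace(-np.pi, np.pi, 721)
best = min(thetas, key=lambda th: registration_cost(src, dst, th, np.array([1.0, -2.0])))
print(f"recovered rotation ≈ {best:.3f} rad (true 0.3)")
```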
2.
  • Danielsson, Oscar, 1982- (author)
  • Shape-based Representations and Boosting for Visual Object Class Detection : Models and methods for representation and detection in single and multiple views
  • 2011
  • Doctoral thesis (other academic/artistic), abstract:
    • Detection of generic visual object classes (i.e. cars, dogs, mugs or people) in images is a task that humans are able to solve with remarkable ease. Unfortunately this has proven a very challenging task for computer vision. The reason is that different instances of the same class may look very different, i.e. there is a high intra-class variation. There are several causes for intra-class variation; for example (1) the imaging conditions (e.g. lighting and exposure) may change, (2) different objects of the same class typically differ in shape and appearance, (3) the position of the object relative to the camera (i.e. the viewpoint) may change and (4) some objects are articulated and may change pose. In addition, the background class, i.e. everything but the target object class, is very large. It is the combination of very high intra-class variation with a large background class that makes generic object class detection difficult. This thesis addresses this challenge within the AdaBoost framework. AdaBoost constructs an ensemble of weak classifiers to solve a given classification task and allows great flexibility in the design of these weak classifiers. This thesis proposes several types of weak classifiers that specifically target some of the causes of high intra-class variation. A multi-local classifier is proposed to capture global shape properties for object classes that lack discriminative local features, projectable classifiers are proposed to handle detection from multiple viewpoints, and finally gated classifiers are proposed as a generic way to handle high intra-class variation in combination with a large background class. All proposed weak classifiers are evaluated on standard datasets to allow performance comparison to other related methods.
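The thesis designs weak classifiers to plug into the AdaBoost framework. As background only, here is a minimal generic sketch of discrete AdaBoost with axis-aligned decision stumps in plain numpy; the multi-local, projectable, and gated classifiers proposed in the thesis are not reproduced, a stump merely stands in for the weak-learner slot they occupy.

```python
import numpy as np

def best_stump(X, y, w):
    """Exhaustively pick the feature/threshold/polarity stump with lowest weighted error."""
    best = (np.inf, None)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(sign * (X[:, j] - thr) > 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, (j, thr, sign))
    return best

def adaboost(X, y, rounds=20):
    """Discrete AdaBoost: reweight examples, combine stumps with weights alpha."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(rounds):
        err, (j, thr, sign) = best_stump(X, y, w)
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        pred = np.where(sign * (X[:, j] - thr) > 0, 1, -1)
        w *= np.exp(-alpha * y * pred)       # upweight misclassified examples
        w /= w.sum()
        ensemble.append((alpha, j, thr, sign))
    return ensemble

def predict(ensemble, X):
    score = sum(a * np.where(s * (X[:, j] - t) > 0, 1, -1) for a, j, t, s in ensemble)
    return np.sign(score)

# Toy usage: separate two Gaussian blobs with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (30, 2)), rng.normal(1, 1, (30, 2))])
y = np.r_[-np.ones(30), np.ones(30)]
clf = adaboost(X, y)
print((predict(clf, X) == y).mean())   # training accuracy
```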
3.
  • Hong, Xudong, et al. (author)
  • Visual Coherence Loss for Coherent and Visually Grounded Story Generation
  • 2023
  • In: Proceedings of the Annual Meeting of the Association for Computational Linguistics. - 0736-587X. - 9781959429777
  • Conference paper (peer-reviewed), abstract:
    • Local coherence is essential for text generation models. We identify two important aspects of local coherence within the visual storytelling task: (1) the model needs to represent re-occurrences of characters within the image sequence in order to mention them correctly in the story; (2) character representations should enable us to find instances of the same characters and distinguish different characters. In this paper, we propose a loss function inspired by a linguistic theory of coherence for learning image sequence representations. We further propose combining features from an object detector and a face detector to construct stronger character features. To evaluate visual grounding that current reference-based metrics do not measure, we propose a character matching metric to check whether the models generate referring expressions correctly for characters in input image sequences. Experiments on a visual story generation dataset show that our proposed features and loss function are effective for generating more coherent and visually grounded stories. Our code is available at https://github.com/vwprompt/vcl.
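The published loss is derived from a linguistic theory of coherence; its exact formulation is in the paper and the linked repository. Purely to illustrate the ingredients named in the abstract, the hypothetical contrastive stand-in below pulls embeddings of re-occurring characters together and penalizes distinct characters whose embeddings are too similar. The function name, margin, and cosine formulation are assumptions, not the published loss.

```python
import torch
import torch.nn.functional as F

def character_coherence_loss(embeddings, char_ids, margin=0.5):
    """
    Hypothetical contrastive stand-in (not the published loss):
    embeddings is (N, D) for character detections across an image sequence,
    char_ids[i] labels which character detection i belongs to. Same-character
    pairs are pulled together; different-character pairs are penalized when
    their cosine similarity exceeds 1 - margin.
    """
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t()                                   # pairwise cosine similarity
    same = char_ids.unsqueeze(0) == char_ids.unsqueeze(1)
    eye = torch.eye(len(char_ids), dtype=torch.bool)
    pos = sim[same & ~eye]                            # same character, different detection
    neg = sim[~same]                                  # different characters
    return (1 - pos).mean() + F.relu(neg - (1 - margin)).mean()

# Toy usage: 6 detections of 2 characters with random 16-d features.
emb = torch.randn(6, 16, requires_grad=True)
ids = torch.tensor([0, 0, 0, 1, 1, 1])
print(character_coherence_loss(emb, ids))
```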
4.
  • Johnander, Joakim, 1993- (author)
  • Dynamic Visual Learning
  • 2022
  • Doctoral thesis (other academic/artistic), abstract:
    • Autonomous robots act in a dynamic world where both the robots and other objects may move. The surround sensing systems of such robots therefore work with dynamic input data and need to estimate both the current state of the environment and its dynamics. One of the key elements in obtaining a high-level understanding of the environment is to track dynamic objects. This enables the system to understand what the objects are doing, to predict where they will be, and to estimate their positions more accurately over time. In this thesis, I focus on input from visual cameras, i.e. images. With the advent of neural networks, images have become a cornerstone in sensing systems. Image-processing neural networks are optimized to perform a specific computer vision task, such as recognizing cats and dogs, on vast datasets of annotated examples. This is usually referred to as offline training; given a well-designed neural network, enough high-quality data, and a suitable offline training formulation, the neural network is expected to become adept at the specific task.

      This thesis starts with a study of object tracking. The tracking is based on the visual appearance of the object, achieved via discriminative convolution filters (DCFs). The first contribution of this thesis is to decompose the filter into multiple subfilters. This increases robustness during object deformations or rotations. Moreover, it provides a more fine-grained representation of the object state, as the subfilters are expected to roughly track object parts. In the second contribution, a neural network is trained directly for object tracking. In order to obtain a fine-grained representation of the object state, it is represented as a segmentation. The main challenge lies in the design of a neural network able to tackle this task. While common neural networks excel at recognizing patterns seen during offline training, they struggle to store novel patterns in order to recognize them later. To overcome this limitation, a novel appearance learning mechanism is proposed. The mechanism extends the state-of-the-art and is shown to generalize remarkably well to novel data. In the third contribution, the method is used together with a novel fusion strategy and failure detection criterion to semi-automatically annotate visual and thermal videos.

      Sensing systems need not only track objects, but also detect them. The fourth contribution of this thesis tackles joint detection, tracking, and segmentation of all objects from a predefined set of object classes. The challenge here lies not only in the neural network design, but also in the design of the offline training formulation. The final approach, a recurrent graph neural network, outperforms prior works that have a runtime of the same order of magnitude.

      Last, this thesis studies dynamic learning of novel visual concepts. It is observed that the learning mechanisms used for object tracking essentially learn the appearance of the tracked object. It is natural to ask whether this appearance learning could be extended beyond individual objects to entire semantic classes, enabling the system to learn new concepts from just a few training examples. Such an ability is desirable in autonomous systems, as it removes the need to manually annotate thousands of examples of each class that needs to be recognized. Instead, the system is trained to efficiently learn to recognize new classes. In the fifth contribution, we propose a novel learning mechanism based on Gaussian process regression. With this mechanism, our neural network outperforms the state-of-the-art, and the performance gap is especially large when multiple training examples are given.

      To summarize, this thesis studies and makes several contributions to learning systems that parse dynamic visuals and that dynamically learn visual appearances or concepts.
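The fifth contribution builds its learning mechanism on Gaussian process regression, embedded inside a neural network. As background, here is a minimal numpy sketch of standard GP regression with an RBF kernel, used in a few-shot-classification flavour; the kernel choice, noise level, and one-hot regression targets are illustrative assumptions, not the thesis architecture.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior_mean(X_train, Y_train, X_test, noise=1e-2):
    """Standard GP regression predictive mean: k(X*, X) (K + sigma^2 I)^-1 Y."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    alpha = np.linalg.solve(K, Y_train)
    return rbf_kernel(X_test, X_train) @ alpha

# Few-shot flavour: given a handful of labelled feature vectors per class,
# regress one-hot targets and classify a test feature by the largest mean.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (5, 8)), rng.normal(3, 1, (5, 8))])
Y = np.vstack([np.tile([1, 0], (5, 1)), np.tile([0, 1], (5, 1))])
x_new = rng.normal(3, 1, (1, 8))
print(gp_posterior_mean(X, Y, x_new).argmax())   # expected class 1
```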
5.
  • Sturm, Jürgen, et al. (author)
  • CopyMe3D: Scanning and Printing Persons in 3D
  • 2013
  • In: Lecture Notes in Computer Science Vol. 8142 (Pattern Recognition, 35th German Conference, GCPR 2013, Saarbrücken, Germany, September 3-6, 2013. Proceedings). - Berlin, Heidelberg : Springer Berlin Heidelberg. - 1611-3349 .- 0302-9743. - 9783642406010 - 9783642406027 ; 8142, pp. 405-414
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we describe a novel approach to create 3D miniatures of persons using a Kinect sensor and a 3D color printer. To achieve this, we acquire color and depth images while the person is rotating on a swivel chair. We represent the model with a signed distance function which is updated and visualized as the images are captured for immediate feedback. Our approach automatically fills small holes that stem from self-occlusions. To optimize the model for 3D printing, we extract a watertight but hollow shell to minimize the production costs. In extensive experiments, we evaluate the quality of the obtained models as a function of the rotation speed, the non-rigid deformations of a person during recording, the camera pose, and the resulting self-occlusions. Finally, we present a large number of reconstructions and fabricated figures to demonstrate the validity of our approach.
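The abstract describes representing the model as a signed distance function updated as frames arrive. A minimal sketch of the standard weighted-average truncated SDF update, in the style of KinectFusion-type pipelines, conveys the idea; the truncation band, weighting, and the paper's hole filling and watertight-shell extraction are not reproduced.

```python
import numpy as np

def update_sdf(D, W, d_obs, w_obs=1.0, trunc=0.05):
    """
    Running update of a truncated signed distance volume: D holds per-voxel
    signed distances, W the accumulated weights, d_obs the signed distance of
    each voxel to the surface seen in the current depth frame. New measurements
    are folded in as a weighted average, so the model refines frame by frame
    while the person rotates on the swivel chair.
    """
    d = np.clip(d_obs, -trunc, trunc)        # truncate far-from-surface values
    valid = np.abs(d_obs) < 10 * trunc       # ignore voxels far outside the band
    D_new = np.where(valid, (W * D + w_obs * d) / (W + w_obs), D)
    W_new = np.where(valid, W + w_obs, W)
    return D_new, W_new

# Toy usage: fold one fake frame of per-voxel distances into an empty volume.
D = np.zeros((64, 64, 64)); W = np.zeros_like(D)
d_frame = np.random.default_rng(2).uniform(-0.1, 0.1, D.shape)
D, W = update_sdf(D, W, d_frame)
```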