SwePub
Search the SwePub database

  Advanced search

Hit list for the search "WFRF:(Felsberg Michael Professor)"

Search: WFRF:(Felsberg Michael Professor)

  • Results 1-10 of 27
Sort/group the hit list

Numbering | Reference | Cover image | Find
1.
  • Holmquist, Karl, 1992- (author)
  • Data-Driven Robot Perception in the Wild
  • 2023
  • Doctoral thesis (other academic/artistic) abstract
    • As technology continues to advance, the interest in relieving humans of tedious or dangerous tasks through automation increases. Some of the tasks that have received increasing attention are autonomous driving, disaster relief, and forestry inspection. Developing and deploying an autonomous robotic system in this type of unconstrained environment, in a safe way, is highly challenging. The system requires precise control and high-level decision making, both of which require a robust and reliable perception system to understand the surroundings correctly. The main purpose of perception is to extract meaningful information from the environment, be it in the form of 3D maps, dense classification of object and surface types, or high-level information about the position and direction of moving objects. Depending on the limitations and application of the system, various types of sensors can be used: lidars, to collect sparse depth information; cameras, to collect dense information for different parts of the visual spectrum, often the red-green-blue (RGB) bands; Inertial Measurement Units (IMUs), to estimate the ego motion; microphones, to interact with and respond to humans; GPS receivers, to get global position information; to mention just a few. This thesis investigates some of the necessities to approach the requirements of this type of system. Specifically, it focuses on data-driven approaches, that is, machine learning, which has been shown time and again to be the main contender for high-performance perception tasks in recent years. Although precision requirements might be high in industrial production plants, the environment there is relatively controlled and the task is fixed. Instead, this thesis studies some of the aspects necessary for complex, unconstrained environments, primarily outdoors and potentially near humans or other systems. 
The term in the wild refers exactly to the unconstrained nature of these environments, where the system can easily encounter something previously unseen and where the system might interact with unknowing humans. Some examples of such environments are city traffic, disaster relief scenarios, and dense forests. This thesis mainly focuses on the following three key aspects necessary to handle the types of tasks and situations that can occur in the wild: 1) generalizing to a new environment, 2) adapting to new tasks and requirements, and 3) modeling uncertainty in the perception system. First, a robotic system should be able to generalize to new environments and still function reliably. Papers B and G address this by using an intermediate representation that allows the system to handle much more diverse types of environments than otherwise possible. Paper B also investigates how robust the proposed autonomous driving system is to incorrect predictions, which are one of the likely results of changing the environment. Second, a robot should be sufficiently adaptive to allow it to learn new tasks without forgetting the previous ones. Paper E proposes a way to incrementally add new semantic classes to a trained model without access to the previous training data. The approach utilizes the uncertainty in the predictions to model the unknown classes, marked as background. Finally, the perception system will always be partially flawed, either because of the lack of modeling capabilities or because of ambiguities in the sensor data. To properly take this into account, it is fundamental that the system can estimate the certainty of its predictions. Paper F proposes a method for predicting the uncertainty in the model predictions when interpolating sparse data. Paper G addresses the ambiguities that exist when estimating the 3D pose of a human from a single camera image. 
  •  
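The abstract above mentions using prediction uncertainty to model unknown classes (Paper E). The abstract gives no implementation details; one common realization of that general idea, with an entropy threshold chosen arbitrarily here, is to mark high-entropy predictions as unknown/background:

```python
import numpy as np

def mark_unknown(probs, threshold=0.8):
    """Label each sample by its argmax class, but mark samples whose
    normalized softmax entropy exceeds `threshold` as unknown (-1)."""
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    ent /= np.log(probs.shape[-1])          # normalize entropy to [0, 1]
    labels = probs.argmax(axis=-1)
    labels[ent > threshold] = -1            # uncertain -> unknown/background
    return labels

# A confident prediction keeps its class; a near-uniform one becomes unknown.
probs = np.array([[0.97, 0.01, 0.01, 0.01],
                  [0.25, 0.25, 0.25, 0.25]])
labels = mark_unknown(probs)  # first sample -> class 0, second -> -1
```

This is only a sketch of the uncertainty-as-unknown principle, not the method proposed in the thesis.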
2.
  • Johnander, Joakim, 1993- (author)
  • Dynamic Visual Learning
  • 2022
  • Doctoral thesis (other academic/artistic) abstract
    • Autonomous robots act in a dynamic world where both the robots and other objects may move. The surround sensing systems of such robots therefore work with dynamic input data and need to estimate both the current state of the environment and its dynamics. One of the key elements in obtaining a high-level understanding of the environment is to track dynamic objects. This enables the system to understand what the objects are doing; predict where they will be in the future; and subsequently better estimate where they are. In this thesis, I focus on input from visual cameras, i.e., images. Images have, with the advent of neural networks, become a cornerstone in sensing systems. Image-processing neural networks are optimized to perform a specific computer vision task -- such as recognizing cats and dogs -- on vast datasets of annotated examples. This is usually referred to as offline training: given a well-designed neural network, enough high-quality data, and a suitable offline training formulation, the neural network is expected to become adept at the specific task. This thesis starts with a study of object tracking. The tracking is based on the visual appearance of the object, achieved via discriminative correlation filters (DCFs). The first contribution of this thesis is to decompose the filter into multiple subfilters. This serves to increase the robustness during object deformations or rotations. Moreover, it provides a more fine-grained representation of the object state, as the subfilters are expected to roughly track object parts. In the second contribution, a neural network is trained directly for object tracking. In order to obtain a fine-grained representation of the object state, it is represented as a segmentation. The main challenge lies in the design of a neural network able to tackle this task. 
While common neural networks excel at recognizing patterns seen during offline training, they struggle to store novel patterns in order to later recognize them. To overcome this limitation, a novel appearance learning mechanism is proposed. The mechanism extends the state-of-the-art and is shown to generalize remarkably well to novel data. In the third contribution, the method is used together with a novel fusion strategy and a failure detection criterion to semi-automatically annotate visual and thermal videos. Sensing systems need not only track objects, but also detect them. The fourth contribution of this thesis strives to tackle joint detection, tracking, and segmentation of all objects from a predefined set of object classes. The challenge here lies not only in the neural network design, but also in the design of the offline training formulation. The final approach, a recurrent graph neural network, outperforms prior works that have a runtime of the same order of magnitude. Last, this thesis studies dynamic learning of novel visual concepts. It is observed that the learning mechanisms used for object tracking essentially learn the appearance of the tracked object. It is natural to ask whether this appearance learning could be extended beyond individual objects to entire semantic classes, enabling the system to learn new concepts based on just a few training examples. Such an ability is desirable in autonomous systems, as it removes the need to manually annotate thousands of examples of each class that needs to be recognized. Instead, the system is trained to efficiently learn to recognize new classes. In the fifth contribution, we propose a novel learning mechanism based on Gaussian process regression. 
With this mechanism, our neural network outperforms the state-of-the-art, and the performance gap is especially large when multiple training examples are given. To summarize, this thesis studies and makes several contributions to learning systems that parse dynamic visuals and that dynamically learn visual appearances or concepts.
  •  
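The fifth contribution above is built on Gaussian process regression. As a rough illustration of the underlying tool only (the thesis applies it to deep feature maps, not 1-D toy data), standard GP posterior inference from a few labeled examples looks like this:

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel for 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and variance of GP regression (standard equations)."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_test, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    var = np.diag(rbf(x_test, x_test) - Ks @ np.linalg.solve(K, Ks.T))
    return mean, var

# Few-shot setting: four labeled examples of a smooth function.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.sin(x)
mean, var = gp_posterior(x, y, np.array([1.0, 1.5]))
# At the seen point x=1.0 the posterior mean stays close to sin(1.0),
# with small posterior variance.
```

The kernel, lengthscale, and noise level are illustrative choices, not values from the thesis.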
3.
  • Järemo Lawin, Felix, 1990- (author)
  • Learning Representations for Segmentation and Registration
  • 2021
  • Doctoral thesis (other academic/artistic) abstract
    • In computer vision, the aim is to model and extract high-level information from visual sensor measurements such as images, videos and 3D points. Since visual data is often high-dimensional, noisy and irregular, achieving robust data modeling is challenging. This thesis presents works that address challenges within a number of different computer vision problems. First, the thesis addresses the problem of phase unwrapping for multi-frequency amplitude modulated time-of-flight (ToF) ranging. ToF is used in depth cameras, which have many applications in 3D reconstruction and gesture recognition. While amplitude modulation in time-of-flight ranging can provide accurate measurements for the depth, it also causes depth ambiguities. This thesis presents a method to resolve the ambiguities by estimating the likelihoods of different hypotheses for the depth values. This is achieved by performing kernel density estimation over the hypotheses in a spatial neighborhood of each pixel in the depth image. The depth hypothesis with the highest estimated likelihood can then be selected as the output depth. This approach yields improvements in the quality of the depth images and extends the effective range in both indoor and outdoor environments. Next, point set registration is investigated, which is the problem of aligning point sets from overlapping depth images or 3D models. Robust registration is fundamental to many vision tasks, such as multi-view 3D reconstruction and object pose estimation for robotics. The thesis presents a method for handling density variations in the measured point sets. This is achieved by modeling a latent distribution representing the underlying structure of the scene. Both the model of the scene and the registration parameters are inferred in an Expectation-Maximization-based framework. Additionally, the thesis introduces a method for integrating features from deep neural networks into the registration model. 
It is shown that the deep features improve registration performance in terms of accuracy and robustness. Additionally, improved feature representations are generated by training the deep neural network end-to-end by minimizing registration errors produced by our registration model. Further, an approach for 3D point set segmentation is presented. As scene models are often represented using 3D point measurements, segmentation of these is important for general scene understanding. Learning models for segmentation requires a significant amount of annotated data, which is expensive and time-consuming to acquire. The approach presented in the thesis circumvents this by projecting the points into virtual camera views and rendering 2D images. The method can then exploit accurate convolutional neural networks for image segmentation and map the segmentation predictions back to the 3D points. This also allows for transfer learning using available annotated image data, thereby reducing the need for 3D annotations. Finally, the thesis explores the problem of video object segmentation (VOS), where the task is to track and segment target objects in each frame of a video sequence. Accurate VOS requires a robust model of the target that can adapt to different scenarios and objects. This needs to be achieved using only a single labeled reference frame as training data for each video sequence. To address the challenges in VOS, the thesis introduces a parametric target model, optimized to predict a target label derived from the mask annotation. The target model is integrated into a deep neural network, where its predictions guide a decoder module to produce target segmentation masks. The deep network is trained on labeled video data to output accurate segmentation masks for each frame. Further, it is shown that by training the entire network model in an end-to-end manner, it can learn a representation of the target that provides increased segmentation accuracy. 
  •  
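The phase-unwrapping idea in the abstract above can be illustrated with a toy sketch: depth hypotheses d + k·R are generated for two modulation frequencies, and each pixel keeps the hypothesis with the highest kernel density over all hypotheses in its spatial neighborhood. All parameter values below are made up for illustration and are not the thesis's estimator:

```python
import numpy as np

def unwrap_depth(wrapped_a, amb_a, wrapped_b, amb_b,
                 n_wraps=4, sigma=0.2, radius=1):
    """Resolve ToF depth ambiguities by scoring each hypothesis with a
    kernel density estimate over the neighborhood's hypotheses."""
    h, w = wrapped_a.shape
    hyp_a = np.stack([wrapped_a + k * amb_a for k in range(n_wraps)], -1)
    hyp_b = np.stack([wrapped_b + k * amb_b for k in range(n_wraps)], -1)
    pool = np.concatenate([hyp_a, hyp_b], axis=-1)  # hypotheses per pixel
    out = np.empty_like(wrapped_a)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - radius), min(h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(w, j + radius + 1)
            neigh = pool[i0:i1, j0:j1].ravel()
            # Gaussian-kernel density score for each candidate depth
            scores = [np.exp(-0.5 * ((c - neigh) / sigma) ** 2).sum()
                      for c in hyp_a[i, j]]
            out[i, j] = hyp_a[i, j, int(np.argmax(scores))]
    return out

# Flat surface at 7 m; ambiguity ranges 5 m and 3 m wrap it to 2 m and 1 m.
# Only the true depth (7 m) is consistent with both frequencies, so its
# hypothesis density is highest.
truth = np.full((6, 6), 7.0)
wa, wb = truth % 5.0, truth % 3.0
recovered = unwrap_depth(wa, 5.0, wb, 3.0)
```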
4.
  • Pahlberg, Tobias (author)
  • Wood fingerprint recognition and detection of thin cracks
  • 2017
  • Doctoral thesis (other academic/artistic) abstract
    • The first part of this thesis deals with recognition of wood fingerprints extracted from timber surfaces. It presents different methods to track sawn wood products through an industrial process using cameras. The possibility of identifying individual wood products comes from the biological variation of trees, where the genetic code, environment, and breakdown process mean that every board has a unique appearance. Wood fingerprint recognition faces many of the same challenges as human biometrics applications. The vision for the future is to be able to utilize existing imaging sensors in the production line to track individual products through a disordered and diverging product flow. The flow speed in wood industries is usually very high, 2-15 meters per second, with a high degree of automation. Wood fingerprints combined with automated inspection make it possible to tailor subsequent processing steps for each product and can be used to deliver customized products. Wood tracking can also give the machine operators vital feedback on the process parameters. The motivation for recognition comes from the need for the wood industry to keep track of products without using invasive methods, such as bar code stickers or painted labels. In the project Hol-i-Wood Patching Robot, an automatic scanner and robot system was developed. In this project, there was a wish to keep track of the shuttering panels that were going to be repaired by the automatic robots. In this thesis, three different strategies to recognize previously scanned sawn wood products are presented. The first approach uses feature detectors to find matching features between two images. This approach proved to be robust, even when subjected to moderate geometric and radiometric image distortions. The recognition accuracy reached 100% when using high-quality scans of Scots pine boards that had more than 20 knots. 
The second approach uses local knot neighborhood geometry to find point matches between images. The recognition accuracy reached above 99% when matching simulated Scots pine panels with realistically added noise in the knot positions, given the assumption that 85% of the knots could be detected. The third approach uses template matching to match a small part of a board against a large set of full-length boards. Cropping and heavy downsampling were applied in this study. The intensity-normalized algorithms using cross-correlation (CC-N) and the correlation coefficient (CCF-N) obtained the highest recognition accuracy and had very similar overall performance. For instance, the matching accuracy for the CCF-N method reached above 99% for query images of length 1 m when the pixel density was above 0.08 pixels/mm. The last part of this thesis deals with the detection of thin cracks on oak flooring lamellae using ultrasound-excited thermography and machine learning. Today, defects on wooden lamellae in the parquet flooring industry are largely graded and detected manually. The last appended paper investigates the possibility of using the ensemble methods random forests and boosting to automate the process. When friction occurs in thin cracks, they become warm and thus visible to a thermographic camera. Several image processing techniques were used to suppress noise and enhance likely cracks in the images. The best ensemble methods reached an average classification accuracy of 0.8, which was very close to the author's own manual attempt at separating the images (0.83).
  •  
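The third approach in the entry above relies on intensity-normalized template matching. A minimal brute-force version of a correlation-coefficient matcher, without the cropping and downsampling used in the study, might look like this:

```python
import numpy as np

def ncc(template, patch):
    """Normalized correlation coefficient between two equal-sized arrays."""
    t = template - template.mean()
    p = patch - patch.mean()
    denom = np.sqrt((t * t).sum() * (p * p).sum())
    return float((t * p).sum() / denom) if denom > 0 else 0.0

def match_template(image, template):
    """Slide `template` over `image`; return the best-match offset and score."""
    th, tw = template.shape
    ih, iw = image.shape
    best, best_pos = -1.0, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            s = ncc(template, image[y:y + th, x:x + tw])
            if s > best:
                best, best_pos = s, (y, x)
    return best_pos, best

rng = np.random.default_rng(1)
board = rng.random((40, 60))         # stand-in for a full-length board image
query = board[10:20, 25:40].copy()   # a cropped query patch
pos, score = match_template(board, query)
# pos recovers the crop location (10, 25) with score ~ 1.0
```

A production matcher would use FFT-based correlation rather than this O(n⁴) loop; the sketch only shows the normalization that makes the score invariant to intensity offset and gain.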
5.
  • Persson, Mikael, 1985- (author)
  • Visual Odometry in Principle and Practice
  • 2022
  • Doctoral thesis (other academic/artistic) abstract
    • Vision is the primary means by which we know where we are, what is nearby, and how we are moving. The corresponding computer-vision task is the simultaneous mapping of the surroundings and the localization of the camera. This task goes by many names, of which this thesis uses Visual Odometry. The name implies that the images are sequential and emphasizes the accuracy of the pose and the real-time requirements. The field has seen substantial improvements over the past decade, and visual odometry is used extensively in robotics for localization, navigation and obstacle detection. The main purpose of this thesis is the study and advancement of visual odometry systems, and it makes several contributions. The first is a high-performance stereo visual odometry system, which through geometrically supported tracking achieved top rank on the KITTI odometry benchmark. The second is a state-of-the-art perspective-three-point solver. Such solvers find the pose of a camera given the projections of three known 3D points and are a core part of many visual odometry systems. By reformulating the underlying problem, we avoided a problematic quartic polynomial, and as a result achieved substantially higher computational performance and numerical accuracy. The third is a system which generalizes stereo visual odometry to the simultaneous estimation of multiple independently moving objects. The main contribution here is a real-time system which allows the identification of generic moving rigid objects and the prediction of their trajectories, with applications to robotic navigation in dynamic environments. The fourth is an improved spline-type continuous pose trajectory estimation framework, which simplifies the integration of general dynamic models. The framework is used to show that visual odometry systems based on continuous pose trajectories are both practical and able to operate in real time. 
The visual odometry pipeline is considered from both a theoretical and a practical perspective. The systems described have been tested both on benchmarks and real vehicles. This thesis places the published work into context, highlighting key insights and practical observations.  
  •  
6.
  • Robinson, Andreas, 1975- (author)
  • Discriminative correlation filters in robot vision
  • 2021
  • Doctoral thesis (other academic/artistic) abstract
    • In less than ten years, deep neural networks have evolved into all-encompassing tools in multiple areas of science and engineering, due to their almost unreasonable effectiveness in modeling complex real-world relationships. In computer vision in particular, they have taken tasks such as object recognition, previously considered very difficult, and transformed them into everyday practical tools. However, neural networks have to be trained with supercomputers on massive datasets for hours or days, and this limits their ability to adjust to changing conditions. This thesis explores discriminative correlation filters, originally intended for tracking large objects in video, so-called visual object tracking. Unlike neural networks, these filters are small and can be quickly adapted to changes, with minimal data and computing power. At the same time, they can take advantage of the computing infrastructure developed for neural networks and operate within them. The main contributions in this thesis demonstrate the versatility and adaptability of correlation filters for various problems, while complementing the capabilities of deep neural networks. In the first problem, it is shown that when adapted to track small regions and points, they outperform the widely used Lucas-Kanade method, both in terms of robustness and precision. In the second problem, the correlation filters take on a completely new task. Here, they are used to tell different places apart in a 16 by 16 kilometer region of ocean near land. Given only a horizon profile - the coastline silhouette of islands and islets as seen from an ocean vessel - it is demonstrated that discriminative correlation filters can effectively distinguish between locations. In the third problem, it is shown how correlation filters can be applied to video object segmentation. 
This is the task of classifying individual pixels as belonging either to a target or to the background, given a segmentation mask provided with the first video frame as the only guidance. It is also shown that discriminative correlation filters and deep neural networks complement each other: where the neural network processes the input video in a content-agnostic way, the filters adapt to specific target objects. The combination is a real-time video object segmentation method. Finally, the segmentation method is extended beyond binary target/background classification to additionally consider distracting objects. This addresses the fundamental difficulty of coping with objects of similar appearance.
  •  
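A discriminative correlation filter in its simplest closed form (a MOSSE-style filter; the thesis's filters are more elaborate) is trained in the Fourier domain as H* = Σ G⊙F* / (Σ F⊙F* + λ), which is what makes these filters so cheap to train and update:

```python
import numpy as np

def train_filter(patches, target):
    """Closed-form correlation filter: H* = sum(G*conj(F)) / (sum(F*conj(F)) + lam)."""
    G = np.fft.fft2(target)
    A = np.zeros_like(G)
    B = np.zeros_like(G)
    for p in patches:
        F = np.fft.fft2(p)
        A += G * np.conj(F)
        B += F * np.conj(F)
    return A / (B + 1e-4)  # conjugate filter H*, with regularizer lam=1e-4

def response(H_conj, patch):
    """Correlation response; its peak locates the target."""
    return np.real(np.fft.ifft2(np.fft.fft2(patch) * H_conj))

# Desired output: a sharp Gaussian peak at the patch center.
size = 32
yy, xx = np.mgrid[0:size, 0:size]
g = np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / (2 * 2.0 ** 2))

patch = np.random.default_rng(0).random((size, size))
H = train_filter([patch], g)
r = response(H, patch)
peak = np.unravel_index(int(np.argmax(r)), r.shape)
# On its own training patch, the filter reproduces the desired peak at (16, 16).
```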
7.
  • Berg, Amanda, 1988- (author)
  • Detection and Tracking in Thermal Infrared Imagery
  • 2016
  • Licentiate thesis (other academic/artistic) abstract
    • Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as it is possible to measure a temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy. This thesis addresses the problem of detection and tracking in thermal infrared imagery. Visual detection and tracking of objects in video are research areas that have been, and currently are, subject to extensive research. Indications of their popularity are recent benchmarks such as the annual Visual Object Tracking (VOT) challenges, the Object Tracking Benchmarks, the series of workshops on Performance Evaluation of Tracking and Surveillance (PETS), and the workshops on Change Detection. Benchmark results indicate that detection and tracking are still challenging problems. A common belief is that detection and tracking in thermal infrared imagery are identical to detection and tracking in grayscale visual imagery. This thesis argues that this belief is not true. The characteristics of thermal infrared radiation and imagery pose certain challenges to image analysis algorithms. The thesis describes these characteristics and challenges as well as presents evaluation results confirming the hypothesis. Detection and tracking are often treated as two separate problems. However, some tracking methods, e.g., template-based tracking methods, base their tracking on repeated specific detections. They learn a model of the object that is adaptively updated. 
That is, detection and tracking are performed jointly. The thesis includes a template-based tracking method designed specifically for thermal infrared imagery, describes a thermal infrared dataset for evaluation of template-based tracking methods, and provides an overview of the first challenge on short-term, single-object tracking in thermal infrared video. Finally, two applications employing detection and tracking methods are presented.
  •  
8.
  • Berg, Amanda, 1988- (author)
  • Learning to Analyze what is Beyond the Visible Spectrum
  • 2019
  • Doctoral thesis (other academic/artistic) abstract
    • Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing camera price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as there exists a measurable temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy. This thesis addresses the problem of automatic image analysis in thermal infrared images with a focus on machine learning methods. The main purpose of this thesis is to study the variations in processing required by the thermal infrared data modality. In particular, three different problems are addressed: visual object tracking, anomaly detection, and modality transfer. All of these are research areas that have been, and currently are, subject to extensive research. Furthermore, they are all highly relevant for a number of different real-world applications. The first addressed problem is visual object tracking, a problem for which no prior information other than the initial location of the object is given. The main contribution concerns benchmarking of short-term single-object (STSO) visual object tracking methods in thermal infrared images. The proposed dataset, LTIR (Linköping Thermal Infrared), was integrated in the VOT-TIR2015 challenge, introducing the first ever organized challenge on STSO tracking in thermal infrared video. Another contribution, also related to benchmarking, is a novel, recursive method for semi-automatic annotation of multi-modal video sequences. 
Based on only a few initial annotations, a video object segmentation (VOS) method proposes segmentations for all remaining frames, and difficult parts in need of additional manual annotation are automatically detected. The third contribution to the problem of visual object tracking is a template tracking method based on a non-parametric probability density model of the object's thermal radiation using channel representations. The second addressed problem is anomaly detection, i.e., detection of rare objects or events. The main contribution is a method for truly unsupervised anomaly detection based on Generative Adversarial Networks (GANs). The method employs joint training of the generator and an observation-to-latent-space encoder, enabling stratification of the latent space and, thus, also separation of normal and anomalous samples. The second contribution addresses the previously unaddressed problem of obstacle detection in front of moving trains using a train-mounted thermal camera. Adaptive correlation filters are updated continuously, and missed detections of background are treated as detections of anomalies, or obstacles. The third contribution to the problem of anomaly detection is a method for characterization and classification of automatically detected district heating leakages for the purpose of false alarm reduction. Finally, the thesis addresses the problem of modality transfer between thermal infrared and visual spectrum images, a previously unaddressed problem. The contribution is a method based on Convolutional Neural Networks (CNNs), enabling perceptually realistic transformations of thermal infrared to visual images. By careful design of the loss function, the method becomes robust to image pair misalignments. The method exploits the human visual system's lower acuity for color differences than for luminance, separating the loss into a luminance part and a chrominance part.
  •  
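The modality-transfer contribution above separates its loss into luminance and chrominance parts. A sketch of that separation, assuming a BT.601 color transform and arbitrary weights (both are my assumptions, not the thesis's choices):

```python
import numpy as np

# BT.601 RGB -> (Y, Cb, Cr) transform; row 0 is luminance.
M = np.array([[ 0.299,   0.587,   0.114 ],
              [-0.1687, -0.3313,  0.5   ],
              [ 0.5,    -0.4187, -0.0813]])

def lum_chrom_loss(pred, target, w_lum=1.0, w_chrom=0.25):
    """L1 loss that penalizes luminance errors more than chrominance errors."""
    diff = np.abs((pred - target) @ M.T)  # per-pixel (Y, Cb, Cr) differences
    return w_lum * diff[..., 0].mean() + w_chrom * diff[..., 1:].mean()

rng = np.random.default_rng(0)
a = rng.random((4, 4, 3))
zero = lum_chrom_loss(a, a)          # identical images -> zero loss
gray = lum_chrom_loss(a, a + 0.1)    # a constant gray offset is a pure
                                     # luminance change under this transform
```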
9.
  • Brissman, Emil, 1987- (author)
  • Learning to Analyze Visual Data Streams for Environment Perception
  • 2023
  • Doctoral thesis (other academic/artistic) abstract
    • A mobile robot, instructed by a human operator, acts in an environment with many other objects. However, for an autonomous robot, human instructions should be minimal and consist only of high-level instructions, such as the ultimate task or destination. In order to increase the level of autonomy, it has become a foremost objective to mimic human vision using neural networks that take a stream of images as input and learn a specific computer vision task from large amounts of data. In this thesis, we explore several different models for surround sensing, each of which contributes to a better understanding of the environment. As its first contribution, this thesis presents an object tracking method for video sequences, which is a crucial component in a perception system. This method predicts a fine-grained mask to separate the pixels corresponding to the target from those corresponding to the background. Rather than tracking location and size, the method tracks the pixels initially assigned to the target in this so-called video object segmentation. For subsequent time steps, the goal is to learn how the target looks using features from a neural network. We named our method A-GAME, based on its generative modeling of the deep feature space, separating target and background appearances. In the second contribution of this thesis, we detect, track, and segment all objects from a set of predefined object classes. This information increases the robot's capability to perceive its surroundings. We experiment with a graph neural network to weigh all new detections and existing tracks. This model outperforms prior works by separating visually and semantically similar objects frame by frame. The third contribution investigates a limitation of anchor-based detectors, which classify pre-defined bounding boxes as either negative or positive and thus handle only a limited set of object shapes. One idea is to learn an alternative instance representation. 
We experiment with a neural network that predicts the distance to the nearest object contour in different directions from each pixel. The network then computes an approximated signed distance function containing the respective instance information. Last, this thesis studies a concept within model validation. We observed that overfitting can increase performance on benchmarks. However, this gain is of little value for sensing systems in practice, since measurements, such as lengths or angles, are quantities that describe the environment. The fourth contribution of this thesis is an extended validation technique for camera calibration. This technique uses a statistical model for each error difference between an observed value and a corresponding prediction of the projective model. We compute a test over the differences and detect whether the projective model is incorrect. 
  •  
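The fourth contribution above describes a statistical test over error differences for camera-calibration validation. The abstract does not specify the test; one generic form of the idea, assuming independent Gaussian errors (my assumption), is a chi-square goodness-of-fit test over normalized residuals:

```python
import numpy as np
from scipy import stats

def model_rejected(residuals, sigma, alpha=0.01):
    """Reject the projective model if the sum of squared normalized
    residuals exceeds the chi-square threshold at level `alpha`."""
    t = ((residuals / sigma) ** 2).sum()
    return bool(t > stats.chi2.ppf(1 - alpha, df=residuals.size))

rng = np.random.default_rng(0)
biased = rng.normal(0.8, 0.5, size=200)  # systematic error: model is off
clean = np.zeros(200)                    # perfect fit: model is consistent
# model_rejected(biased, 0.5) flags the model; model_rejected(clean, 0.5) does not
```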
10.
  • Eldesokey, Abdelrahman, 1989- (author)
  • Uncertainty-Aware Convolutional Neural Networks for Vision Tasks on Sparse Data
  • 2021
  • Doctoral thesis (other academic/artistic) abstract
    • Early computer vision algorithms operated on dense 2D images captured using conventional monocular or color sensors. Those sensors are passive in nature, providing limited scene representations based on reflected light, and are only able to operate under adequate lighting conditions. These limitations hindered the development of many computer vision algorithms that require some knowledge of the scene structure under varying conditions. The emergence of active sensors such as Time-of-Flight (ToF) cameras contributed to mitigating these limitations; however, they gave rise to many novel challenges, such as data sparsity stemming from multi-path interference and occlusion. Many approaches have been proposed to alleviate these challenges by enhancing the acquisition process of ToF cameras or by post-processing their output. Nonetheless, these approaches are sensor- and model-specific, requiring individual tuning for each sensor. Alternatively, learning-based approaches, i.e., machine learning, are an attractive solution to these problems, learning a mapping from the original sensor output to a refined version of it. Convolutional Neural Networks (CNNs) are one example of powerful machine learning approaches, and they have demonstrated remarkable success on many computer vision tasks. Unfortunately, CNNs naturally operate on dense data and cannot efficiently handle sparse data from ToF sensors. In this thesis, we propose a novel variation of CNNs, denoted Normalized Convolutional Neural Networks, that can directly handle sparse data very efficiently. First, we formulate a differentiable normalized convolution layer that takes sparse data and a confidence map as input. The confidence map provides information about valid and missing pixels to the normalized convolution layer, where the missing values are interpolated from their valid vicinity. 
Afterwards, we propose a confidence propagation criterion that allows building cascades of normalized convolution layers, similar to standard CNNs. We evaluated our approach on the task of unguided scene depth completion and achieved state-of-the-art results using an exceptionally small network. As a second contribution, we investigated the fusion of a normalized convolution network with standard CNNs employing RGB images. We study different fusion schemes and provide a thorough analysis of the different components of the network. By employing our best fusion strategy, we achieve state-of-the-art results on guided depth completion using a remarkably small network. Thirdly, to provide a statistical interpretation for confidences, we derive a probabilistic framework for normalized convolutional neural networks. This framework estimates the input confidence in a self-supervised manner and propagates it to provide a statistically valid output confidence. When compared against existing approaches to uncertainty estimation in CNNs, such as Bayesian deep learning, our probabilistic framework provides a higher-quality measure of uncertainty at a significantly lower computational cost. Finally, we employ our framework in a common task in CNNs, namely upsampling. We formulate the upsampling problem as a sparse problem and use normalized convolutional neural networks to solve it. In comparison to existing approaches, our proposed upsampler is structure-aware while being lightweight. We test our upsampler with various optical flow estimation networks and show that it consistently improves the results. When integrated with a recent optical flow network, it sets a new state-of-the-art on the most challenging optical flow dataset.
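The normalized convolution layer summarized in the abstract admits a compact sketch: each output pixel is a confidence-weighted average of its neighbourhood, and the propagated confidence is the fraction of the applicability (kernel) mass that fell on valid pixels. The following minimal NumPy illustration is a sketch under stated assumptions, not the thesis implementation; the function name, the uniform applicability kernel, and the explicit loops are choices made here for clarity.

```python
import numpy as np

def normalized_convolution(data, confidence, kernel, eps=1e-8):
    """One normalized-convolution step on a sparse 2D signal.

    data       : 2D array with arbitrary values at missing pixels
    confidence : 2D array in [0, 1]; 0 marks a missing pixel
    kernel     : non-negative applicability kernel (odd-sized)

    Missing pixels are interpolated from their valid neighbourhood;
    a propagated confidence map is returned alongside the output.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Zero-pad the confidence-masked signal and the confidence map.
    d = np.pad(data * confidence, ((ph, ph), (pw, pw)))
    c = np.pad(confidence, ((ph, ph), (pw, pw)))
    out = np.zeros(data.shape, dtype=float)
    out_conf = np.zeros(data.shape, dtype=float)
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            num = np.sum(kernel * d[i:i + kh, j:j + kw])
            den = np.sum(kernel * c[i:i + kh, j:j + kw])
            out[i, j] = num / (den + eps)          # confidence-weighted mean
            out_conf[i, j] = den / np.sum(kernel)  # propagated confidence
    return out, out_conf
```

With a single valid pixel and a uniform 3x3 kernel, every output pixel whose window reaches that pixel recovers its value, while pixels with no valid neighbour get output 0 and confidence 0, which is what lets such layers be cascaded.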
Publication type
doctoral thesis (20)
licentiate thesis (7)
Content type
other academic/artistic (27)
Author/editor
Felsberg, Michael, P ... (14)
Felsberg, Michael, P ... (10)
Forssén, Per-Erik, D ... (4)
Berg, Amanda, 1988- (2)
Åström, Freddie (2)
Felsberg, Michael, 1 ... (2)
Khan, Fahad Shahbaz, ... (2)
Larsson, Fredrik, 19 ... (2)
Gharaee, Zahra, 1986 ... (2)
Nordberg, Klas, Seni ... (2)
Matas, Jiri, Profess ... (1)
Folkesson, John, Ass ... (1)
Felsberg, Michael (1)
Bigun, Josef, Profes ... (1)
Lindeberg, Tony, Pro ... (1)
Öfjäll, Kristoffer, ... (1)
Ahlberg, Jörgen, Adj ... (1)
Moeslund, Thomas B., ... (1)
Ahlberg, Jörgen, Dr. ... (1)
Jenssen, Robert, Ass ... (1)
Danelljan, Martin, 1 ... (1)
Öfjäll, Kristoffer (1)
Knutsson, Hans, Prof ... (1)
Brissman, Emil, 1987 ... (1)
Leibe, Bastian, Prof ... (1)
Johnander, Joakim, 1 ... (1)
Häger, Gustav, 1988- (1)
Robinson, Andreas, 1 ... (1)
Khan, Fahad Shahbaz, ... (1)
Schiele, Bernt, Prof ... (1)
Sunnegårdh, Johan, 1 ... (1)
Andersson, Mats, Dr. (1)
Eldesokey, Abdelrahm ... (1)
Holmquist, Karl, 199 ... (1)
Persson, Mikael, 198 ... (1)
Bowden, Richard, Pro ... (1)
Jonsson, Erik, 1980- (1)
Daniilidis, Kostas (1)
Grelsson, Bertil (1)
Åström, Kalle, Profe ... (1)
Grelsson, Bertil, 19 ... (1)
Forssén, Per-Erik, D ... (1)
Kämäräinen, Joni, As ... (1)
Pahlberg, Tobias (1)
Hedborg, Johan, 1876 ... (1)
Koch, Reinhard, Prof ... (1)
Forssén, Per-Erik, d ... (1)
Klasén, Lena, Adj. P ... (1)
Krüger, Norbert, Pro ... (1)
Hagman, Olle, Profes ... (1)
University
Linköpings universitet (25)
Kungliga Tekniska Högskolan (1)
Luleå tekniska universitet (1)
Language
English (27)
Research subject (UKÄ/SCB)
Natural sciences (20)
Engineering and technology (5)
Agricultural sciences (1)
