SwePub

Result list for search "L773:1939 3539 srt2:(2015-2019)"

  • Result 1-10 of 18
1.
  • Azizpour, Hossein, 1985-, et al. (author)
  • Factors of Transferability for a Generic ConvNet Representation
  • 2016
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. IEEE Computer Society Digital Library. ISSN 0162-8828, 1939-3539. 38:9, pp. 1790-1802
  • Journal article (peer-reviewed)
    • Evidence is mounting that Convolutional Networks (ConvNets) are the most effective representation learning method for visual recognition tasks. In the common scenario, a ConvNet is trained on a large labeled dataset (source) and the feed-forward units activation of the trained network, at a certain layer of the network, is used as a generic representation of an input image for a task with relatively smaller training set (target). Recent studies have shown this form of representation transfer to be suitable for a wide range of target visual recognition tasks. This paper introduces and investigates several factors affecting the transferability of such representations. It includes parameters for training of the source ConvNet such as its architecture, distribution of the training data, etc. and also the parameters of feature extraction such as layer of the trained ConvNet, dimensionality reduction, etc. Then, by optimizing these factors, we show that significant improvements can be achieved on various (17) visual recognition tasks. We further show that these visual recognition tasks can be categorically ordered based on their similarity to the source task such that a correlation between the performance of tasks and their similarity to the source task w.r.t. the proposed factors is observed.
2.
  • Carreira, Joao, et al. (author)
  • Free-Form Region Description with Second-Order Pooling
  • 2015
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. ISSN 1939-3539. 37:6, pp. 1177-1189
  • Journal article (peer-reviewed)
    • Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained as a result of a bottom-up grouping process (segmentation) but use feature extractors developed for recognition on fixed-form (e.g. rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on feature extraction and description over free-form regions and study the relationship with their fixed-form counterparts. Our main contributions are novel pooling techniques that capture the second-order statistics of local descriptors inside such free-form regions. We introduce second-order generalizations of average and max-pooling that together with appropriate non-linearities, derived from the mathematical structure of their embedding space, lead to state-of-the-art recognition performance in semantic segmentation experiments without any type of local feature coding. In contrast, we show that codebook-based local feature coding is more important when feature extraction is constrained to operate over regions that include both foreground and large portions of the background, as typical in image classification settings, whereas for high-accuracy localization setups, second-order pooling over free-form regions produces results superior to those of the winning systems in the contemporary semantic segmentation challenges, with models that are much faster in both training and testing.
3.
  • Danelljan, Martin, 1989-, et al. (author)
  • Discriminative Scale Space Tracking
  • 2017
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. IEEE Computer Society. ISSN 0162-8828, 1939-3539. 39:8, pp. 1561-1575
  • Journal article (peer-reviewed)
    • Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles when encountered with large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale adaptive tracking approach by learning separate discriminative correlation filters for translation and scale estimation. The explicit scale filter is learned online using the target appearance sampled at a set of different scales. Contrary to standard approaches, our method directly learns the appearance change induced by variations in the target scale. Additionally, we investigate strategies to reduce the computational cost of our approach. Extensive experiments are performed on the OTB and the VOT2014 datasets. Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5 percent in average overlap precision on the OTB dataset. Additionally, our method is computationally efficient, operating at a 50 percent higher frame rate compared to the exhaustive scale search. Our method obtains the top rank in performance by outperforming 19 state-of-the-art trackers on OTB and 37 state-of-the-art trackers on VOT2014.
4.
  • Demisse, G. G., et al. (author)
  • Deformation Based Curved Shape Representation
  • 2018
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. IEEE Computer Society. ISSN 0162-8828, 1939-3539. 40:6, pp. 1338-1351
  • Journal article (peer-reviewed)
    • In this paper, we introduce a deformation based representation space for curved shapes in R^n. Given an ordered set of points sampled from a curved shape, the proposed method represents the set as an element of a finite dimensional matrix Lie group. Variation due to scale and location are filtered in a preprocessing stage, while shapes that vary only in rotation are identified by an equivalence relationship. The use of a finite dimensional matrix Lie group leads to a similarity metric with an explicit geodesic solution. Subsequently, we discuss some of the properties of the metric and its relationship with a deformation by least action. Furthermore, invariance to reparametrization or estimation of point correspondence between shapes is formulated as an estimation of sampling function. Thereafter, two possible approaches are presented to solve the point correspondence estimation problem. Finally, we propose an adaptation of k-means clustering for shape analysis in the proposed representation space. Experimental results show that the proposed representation is robust to uninformative cues, e.g., local shape perturbation and displacement. In comparison to state-of-the-art methods, it achieves a high precision on the Swedish and the Flavia leaf datasets and a comparable result on MPEG-7, Kimia99 and Kimia216 datasets.
5.
  • Fukui, Kazuhiro, et al. (author)
  • Difference subspace and its generalization for subspace-based methods
  • 2015
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. ISSN 0162-8828, 1939-3539. 37:11, pp. 2164-2177
  • Journal article (peer-reviewed)
    • Subspace-based methods are known to provide a practical solution for image set-based object recognition. Based on the insight that local shape differences between objects offer a sensitive cue for recognition, this paper addresses the problem of extracting a subspace representing the difference components between class subspaces generated from each set of object images independently of each other. We first introduce the difference subspace (DS), a novel geometric concept between two subspaces as an extension of a difference vector between two vectors, and describe its effectiveness in analyzing shape differences. We then generalize it to the generalized difference subspace (GDS) for multi-class subspaces, and show the benefit of applying this to subspace and mutual subspace methods, in terms of recognition capability. Furthermore, we extend these methods to kernel DS (KDS) and kernel GDS (KGDS) by a nonlinear kernel mapping to deal with cases involving larger changes in viewing direction. In summary, the contributions of this paper are as follows: 1) a DS/KDS between two class subspaces characterizes shape differences between the two respectively corresponding objects, 2) the projection of an input vector onto a DS/KDS realizes selective visualization of shape differences between objects, and 3) the projection of an input vector or subspace onto a GDS/KGDS is extremely effective at extracting differences between multiple subspaces, and therefore improves object recognition performance. We demonstrate validity through shape analysis on synthetic and real images of 3D objects as well as extensive comparison of performance on classification tests with several related methods; we study the performance in face image classification on the Yale face database B+ and the CMU Multi-PIE database, and hand shape classification of multi-view images.
6.
  • Henter, Gustav Eje, et al. (author)
  • Minimum entropy rate simplification of stochastic processes
  • 2016
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. IEEE. ISSN 0162-8828, 1939-3539. 38:12, pp. 2487-2500
  • Journal article (peer-reviewed)
    • We propose minimum entropy rate simplification (MERS), an information-theoretic, parameterization-independent framework for simplifying generative models of stochastic processes. Applications include improving model quality for sampling tasks by concentrating the probability mass on the most characteristic and accurately described behaviors while de-emphasizing the tails, and obtaining clean models from corrupted data (nonparametric denoising). This is the opposite of the smoothing step commonly applied to classification models. Drawing on rate-distortion theory, MERS seeks the minimum entropy-rate process under a constraint on the dissimilarity between the original and simplified processes. We particularly investigate the Kullback-Leibler divergence rate as a dissimilarity measure, where, compatible with our assumption that the starting model is disturbed or inaccurate, the simplification rather than the starting model is used for the reference distribution of the divergence. This leads to analytic solutions for stationary and ergodic Gaussian processes and Markov chains. The same formulas are also valid for maximum-entropy smoothing under the same divergence constraint. In experiments, MERS successfully simplifies and denoises models from audio, text, speech, and meteorology.
7.
  • Ismaeil, K. Al, et al. (author)
  • Real-Time Enhancement of Dynamic Depth Videos with Non-Rigid Deformations
  • 2017
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. IEEE Computer Society. ISSN 0162-8828, 1939-3539. 39:10, pp. 2045-2059
  • Journal article (peer-reviewed)
    • We propose a novel approach for enhancing depth videos containing non-rigidly deforming objects. Depth sensors are capable of capturing depth maps in real-time but suffer from high noise levels and low spatial resolutions. While solutions for reconstructing 3D details in static scenes, or scenes with rigid global motions have been recently proposed, handling unconstrained non-rigid deformations in relative complex scenes remains a challenge. Our solution consists in a recursive dynamic multi-frame super-resolution algorithm where the relative local 3D motions between consecutive frames are directly accounted for. We rely on the assumption that these 3D motions can be decoupled into lateral motions and radial displacements. This allows to perform a simple local per-pixel tracking where both depth measurements and deformations are dynamically optimized. The geometric smoothness is subsequently added using a multi-level L1 minimization with a bilateral total variation regularization. The performance of this method is thoroughly evaluated on both real and synthetic data. As compared to alternative approaches, the results show a clear improvement in reconstruction accuracy and in robustness to noise, to relative large non-rigid deformations, and to topological changes. Moreover, the proposed approach, implemented on a CPU, is shown to be computationally efficient and working in real-time. 
8.
  • Johnsson, Kerstin, et al. (author)
  • Low Bias Local Intrinsic Dimension Estimation from Expected Simplex Skewness
  • 2015
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. ISSN 1939-3539. 37:1, pp. 196-202
  • Journal article (peer-reviewed)
    • In exploratory high-dimensional data analysis, local intrinsic dimension estimation can sometimes be used in order to discriminate between data sets sampled from different low-dimensional structures. Global intrinsic dimension estimators can in many cases be adapted to local estimation, but this leads to problems with high negative bias or high variance. We introduce a method that exploits the curse/blessing of dimensionality and produces local intrinsic dimension estimators that have very low bias, even in cases where the intrinsic dimension is higher than the number of data points, in combination with relatively low variance. We show that our estimators have a very good ability to classify local data sets by their dimension compared to other local intrinsic dimension estimators; furthermore we provide examples showing the usefulness of local intrinsic dimension estimation in general and our method in particular for stratification of real data sets.
9.
  • Ma, Zhanyu, et al. (author)
  • Variational Bayesian Matrix Factorization for Bounded Support Data
  • 2015
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. ISSN 0162-8828, 1939-3539. 37:4, pp. 876-889
  • Journal article (peer-reviewed)
    • A novel Bayesian matrix factorization method for bounded support data is presented. Each entry in the observation matrix is assumed to be beta distributed. As the beta distribution has two parameters, two parameter matrices can be obtained, which matrices contain only nonnegative values. In order to provide low-rank matrix factorization, the nonnegative matrix factorization (NMF) technique is applied. Furthermore, each entry in the factorized matrices, i.e., the basis and excitation matrices, is assigned with gamma prior. Therefore, we name this method as beta-gamma NMF (BG-NMF). Due to the integral expression of the gamma function, estimation of the posterior distribution in the BG-NMF model can not be presented by an analytically tractable solution. With the variational inference framework and the relative convexity property of the log-inverse-beta function, we propose a new lower-bound to approximate the objective function. With this new lower-bound, we derive an analytically tractable solution to approximately calculate the posterior distributions. Each of the approximated posterior distributions is also gamma distributed, which retains the conjugacy of the Bayesian estimation. In addition, a sparse BG-NMF can be obtained by including a sparseness constraint to the gamma prior. Evaluations with synthetic data and real life data demonstrate the good performance of the proposed method.
10.
  • Mathe, Stefan, et al. (author)
  • Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition
  • 2015
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. ISSN 1939-3539. 37:7, pp. 1408-1424
  • Journal article (peer-reviewed)
    • Systems based on bag-of-words models from image features collected at maxima of sparse interest point operators have been used successfully for both computer visual object and action recognition tasks. While the sparse, interest-point based approach to recognition is not inconsistent with visual processing in biological systems that operate in 'saccade and fixate' regimes, the methodology and emphasis in the human and the computer vision communities remains sharply distinct. Here, we make three contributions aiming to bridge this gap. First, we complement existing state-of-the-art large scale dynamic computer vision annotated datasets like Hollywood-2 [1] and UCF Sports [2] with human eye movements collected under the ecological constraints of visual action and scene context recognition tasks. To our knowledge these are the first large human eye tracking datasets to be collected and made publicly available for video, vision.imar.ro/eyetracking (497,107 frames, each viewed by 19 subjects), unique in terms of their (a) large scale and computer vision relevance, (b) dynamic, video stimuli, (c) task control, as well as free-viewing. Second, we introduce novel dynamic consistency and alignment measures, which underline the remarkable stability of patterns of visual search among subjects. Third, we leverage the significant amount of collected data in order to pursue studies and build automatic, end-to-end trainable computer vision systems based on human eye movements. Our studies not only shed light on the differences between computer vision spatio-temporal interest point image sampling strategies and the human fixations, as well as their impact for visual recognition performance, but also demonstrate that human fixations can be accurately predicted, and when used in an end-to-end automatic system, leveraging some of the advanced computer vision practice, can lead to state-of-the-art results.