SwePub

Result list for search "L773:1057 7149 OR L773:1941 0042 srt2:(2015-2019)"

  • Results 1-10 of 15
1.
  • Asplund, Teo, et al. (author)
  • A Faster, Unbiased Path Opening by Upper Skeletonization and Weighted Adjacency Graphs
  • 2016
  • In: IEEE Transactions on Image Processing. - 1057-7149 .- 1941-0042. ; 25:12, pp. 5589-5600
  • Journal article (peer-reviewed). Abstract:
    • The path opening is a filter that preserves bright regions in the image in which a path of a certain length L fits. A path is a (not necessarily straight) line defined by a specific adjacency relation. The most efficient implementation known scales as O(min(L, d, Q)N) with the length of the path, L, the maximum possible path length, d, the number of graylevels, Q, and the image size, N. An approximation exists (parsimonious path opening) that has an execution time independent of path length. This is achieved by preselecting paths, and applying 1D openings along these paths. However, the preselected paths can miss important structures, as described by its authors. Here, we propose a different approximation, in which we preselect paths using a grayvalue skeleton. The skeleton follows all ridges in the image, meaning that no important line structures will be missed. An H-minima transform simplifies the image to reduce the number of branches in the skeleton. A graph-based version of the traditional path opening operates only on the pixels in the skeleton, yielding speedups of up to one order of magnitude, depending on image size and filter parameters. The edges of the graph are weighted in order to minimize bias. Experiments show that the proposed algorithm scales linearly with image size, and that it is often slightly faster for longer paths than for shorter paths. The algorithm also yields the most accurate results, as compared with a number of path opening variants, when measuring length distributions.
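The 1D opening that the parsimonious approximation applies along each preselected path can be sketched as follows. This is a brute-force illustration on a list of graylevels already sampled along one path. The function name `opening_1d`, the zero floor used when the path is shorter than L, and the sample profile are assumptions made for this example. It is not the skeleton-based algorithm proposed in the article.

```python
def opening_1d(values, length):
    """Flat 1D morphological opening with a segment of `length` samples."""
    n = len(values)
    if length < 1:
        raise ValueError("length must be at least 1")
    if length > n:
        # No segment of this length fits on the path, so nothing survives.
        # Assumption: graylevels are non-negative and 0 is the floor.
        return [0] * n
    # Erosion: minimum over every full window of `length` consecutive samples.
    eroded = [min(values[i:i + length]) for i in range(n - length + 1)]
    # Dilation: each sample takes the maximum eroded value among the
    # windows that contain it.
    opened = []
    for x in range(n):
        first = max(0, x - length + 1)
        last = min(x, n - length)
        opened.append(max(eroded[first:last + 1]))
    return opened


# Hypothetical graylevel profile sampled along one path.
profile = [0, 5, 5, 5, 0, 7, 0]
print(opening_1d(profile, 3))
```

With L = 3 the bright run of three samples is preserved, while the single-sample spike is suppressed because no segment of length 3 fits inside it.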
2.
  • Do, Thanh Toan, et al. (author)
  • Simultaneous feature aggregating and hashing for compact binary code learning
  • 2019
  • In: IEEE Transactions on Image Processing. - 1941-0042 .- 1057-7149. ; 28:10, pp. 4954-4969
  • Journal article (peer-reviewed). Abstract:
    • Representing images by compact hash codes is an attractive approach for large-scale content-based image retrieval. In most state-of-the-art hashing-based image retrieval systems, for each image, local descriptors are first aggregated as a global representation vector. This global vector is then subjected to a hashing function to generate a binary hash code. In previous works, the aggregating and the hashing processes are designed independently. Hence, these frameworks may generate suboptimal hash codes. In this paper, we first propose a novel unsupervised hashing framework in which feature aggregating and hashing are designed simultaneously and optimized jointly. Specifically, our joint optimization generates aggregated representations that can be better reconstructed by some binary codes. This leads to more discriminative binary hash codes and improved retrieval accuracy. In addition, the proposed method is flexible. It can be extended for supervised hashing. When the data label is available, the framework can be adapted to learn binary codes which minimize the reconstruction loss with respect to label vectors. Furthermore, we also propose a fast version of the state-of-the-art hashing method Binary Autoencoder to be used in our proposed frameworks. Extensive experiments on benchmark datasets under various settings show that the proposed methods outperform the state-of-the-art unsupervised and supervised hashing methods.
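The generic pipeline the abstract starts from (a global vector is hashed to a binary code, and retrieval compares codes) can be illustrated with a small sketch. It uses sign-of-projection hashing, which is a swapped-in textbook technique and not the jointly optimized method of the article. The names `hash_code`, `hamming`, `HYPERPLANES`, and the toy vectors are all assumptions for this example.

```python
def hash_code(vector, hyperplanes):
    """Pack one sign bit per hyperplane into an integer, first hyperplane first."""
    code = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer-packed codes."""
    return bin(a ^ b).count("1")


# Hypothetical fixed projection directions (a real system would learn these).
HYPERPLANES = [[1, 0], [0, 1], [1, 1], [1, -1]]

# Hypothetical global representation vectors for two database images.
database = {"img_a": [2, 1], "img_b": [-1, -2]}
codes = {name: hash_code(vec, HYPERPLANES) for name, vec in database.items()}

query = hash_code([1, 2], HYPERPLANES)
nearest = min(codes, key=lambda name: hamming(query, codes[name]))
print(nearest)
```

Packing the bits into an integer keeps each code compact and reduces the comparison to an XOR followed by a bit count, which is the property that makes hash codes attractive for large-scale retrieval.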
3.
  • Fu, Keren, 1988, et al. (author)
  • Normalized Cut-based Saliency Detection by Adaptive Multi-Level Region Merging
  • 2015
  • In: IEEE Transactions on Image Processing. - 1941-0042 .- 1057-7149. ; 24:12, pp. 5671-5683
  • Journal article (peer-reviewed). Abstract:
    • Existing salient object detection models favor over-segmented regions upon which saliency is computed. Such local regions are less effective at representing objects holistically and weaken the emphasis on entire salient objects. As a result, existing methods often fail to highlight an entire object against a complex background. Towards better grouping of objects and background, in this paper we consider graph cut, more specifically the Normalized graph cut (Ncut), for saliency detection. Since the Ncut partitions a graph in a normalized energy minimization fashion, the resulting eigenvectors of the Ncut contain good cluster information that may group visual contents. Motivated by this, we directly induce saliency maps via eigenvectors of the Ncut, contributing to accurate saliency estimation of visual clusters. We implement the Ncut on a graph derived from a moderate number of superpixels. This graph captures both intrinsic color and edge information of image data. Starting from the superpixels, an adaptive multi-level region merging scheme is employed to seek such cluster information from Ncut eigenvectors. With developed saliency measures for each merged region, encouraging performance is obtained after across-level integration. Experiments comparing against 13 existing methods on four benchmark datasets, including MSRA-1000, SOD, SED and CSSD, show that the proposed method, Ncut saliency (NCS), results in uniform object enhancement and achieves performance comparable to or better than the state-of-the-art methods.
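The criterion that the normalized cut minimizes can be evaluated directly for a given two-way partition. The sketch below uses the standard definition Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V) and assumes a symmetric weight matrix in which both groups have non-zero total weight. The function name `ncut_value` and the four-node toy graph are made up for illustration, and the eigenvector-based saliency computation of the article is not reproduced here.

```python
def ncut_value(weights, group_a):
    """Normalized cut value of the partition (group_a, remaining nodes)."""
    nodes = range(len(weights))
    a = set(group_a)
    b = set(nodes) - a
    if not a or not b:
        raise ValueError("both sides of the partition must be non-empty")
    # Total weight of the edges crossing the partition.
    cut = sum(weights[i][j] for i in a for j in b)
    # Total weight from each side to every node in the graph.
    assoc_a = sum(weights[i][j] for i in a for j in nodes)
    assoc_b = sum(weights[i][j] for i in b for j in nodes)
    return cut / assoc_a + cut / assoc_b


# Hypothetical graph: nodes 0-1 and nodes 2-3 form two tightly linked pairs.
W = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]

good = ncut_value(W, [0, 1])  # separates the two pairs
bad = ncut_value(W, [0, 2])   # splits both pairs
print(good < bad)
```

Because the cut weight is normalized by each side's total connectivity, the partition that keeps strongly connected nodes together scores lower than one that splits them.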
4.
  • Ge, Qi, et al. (author)
  • Structure-Based Low-Rank Model With Graph Nuclear Norm Regularization for Noise Removal
  • 2017
  • In: IEEE Transactions on Image Processing. - : Institute of Electrical and Electronics Engineers (IEEE). - 1057-7149 .- 1941-0042. ; 26:7, pp. 3098-3112
  • Journal article (peer-reviewed). Abstract:
    • Nonlocal image representation methods, including group-based sparse coding and block-matching 3-D filtering, have shown strong performance on low-level tasks. The nonlocal prior is extracted from each group consisting of patches with similar intensities. Grouping patches based on intensity similarity, however, gives rise to disturbance and inaccuracy in the estimation of the true images. To address this problem, we propose a structure-based low-rank model with graph nuclear norm regularization. We exploit the local manifold structure inside a patch and group the patches by the distance metric of manifold structure. With the manifold structure information, a graph nuclear norm regularization is established and incorporated into a low-rank approximation model. We then prove that the graph-based regularization is equivalent to a weighted nuclear norm and the proposed model can be solved by a weighted singular-value thresholding algorithm. Extensive experiments on additive white Gaussian noise removal and mixed noise removal demonstrate that the proposed method achieves better performance than several state-of-the-art algorithms.
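For orientation, the generic weighted singular-value thresholding operator can be written as below. This is the textbook form, stated under the assumption that the weights are non-negative and non-descending while the singular values are sorted in non-ascending order. The symbols Y (noisy patch group), X (low-rank estimate), and w_i are notation chosen for this sketch, and the article's own formulation and weights may differ.

```latex
% Weighted nuclear norm proximal problem (generic form, assumed notation)
\hat{X} = \arg\min_{X} \; \tfrac{1}{2}\,\lVert Y - X \rVert_F^2 + \sum_{i} w_i\,\sigma_i(X)

% With the SVD  Y = U \,\mathrm{diag}(\sigma_1, \dots, \sigma_n)\, V^{\top}
% and weights 0 \le w_1 \le \dots \le w_n, the minimizer is
\hat{X} = U \,\mathrm{diag}\bigl(\max(\sigma_1 - w_1, 0), \dots, \max(\sigma_n - w_n, 0)\bigr)\, V^{\top}
```

Each singular value is shrunk by its own weight, so large singular values (dominant structure) can be penalized less than small ones (mostly noise).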
5.
  • Khan, Fahad, et al. (author)
  • Recognizing Actions Through Action-Specific Person Detection
  • 2015
  • In: IEEE Transactions on Image Processing. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1057-7149 .- 1941-0042. ; 24:11, pp. 4422-4432
  • Journal article (peer-reviewed). Abstract:
    • Action recognition in still images is a challenging problem in computer vision. To facilitate comparative evaluation independently of person detection, the standard evaluation protocol for action recognition uses an oracle person detector to obtain perfect bounding box information at both training and test time. The assumption is that, in practice, a general person detector will provide candidate bounding boxes for action recognition. In this paper, we argue that this paradigm is suboptimal and that action class labels should already be considered during the detection stage. Motivated by the observation that body pose is strongly conditioned on action class, we show that: 1) the existing state-of-the-art generic person detectors are not adequate for proposing candidate bounding boxes for action classification; 2) due to limited training examples, the direct training of action-specific person detectors is also inadequate; and 3) using only a small number of labeled action examples, transfer learning is able to adapt an existing detector to propose higher quality bounding boxes for subsequent action classification. To the best of our knowledge, we are the first to investigate transfer learning for the task of action-specific person detection in still images. We perform extensive experiments on two benchmark data sets: 1) Stanford-40 and 2) PASCAL VOC 2012. For the action detection task (i.e., both person localization and classification of the action performed), our approach outperforms methods based on general person detection by 5.7% mean average precision (MAP) on Stanford-40 and 2.1% MAP on PASCAL VOC 2012. Our approach also significantly outperforms the state of the art with a MAP of 45.4% on Stanford-40 and 31.4% on PASCAL VOC 2012. We also evaluate our action detection approach for the task of action classification (i.e., recognizing actions without localizing them). For this task, our approach, without using any ground-truth person localization at test time, outperforms, on both data sets, state-of-the-art methods that do use person locations.
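The mean average precision figures quoted in the abstract are averages of per-class average precision. The sketch below computes the plain, non-interpolated form from a ranked list of correct/incorrect flags, assuming every relevant item appears somewhere in the ranking. Benchmark toolkits may use an interpolated variant, so this is not claimed to reproduce the official protocol. The function names, class names, and rankings are hypothetical.

```python
def average_precision(ranked_relevance):
    """Non-interpolated AP of a ranking given as booleans, best-scored first."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings_by_class):
    """Mean of the per-class AP values."""
    values = [average_precision(r) for r in rankings_by_class.values()]
    return sum(values) / len(values)


# Hypothetical ranked detections for two made-up action classes.
rankings = {
    "riding_bike": [True, False, True],
    "reading": [False, True],
}
print(mean_average_precision(rankings))
```

Averaging per class, and not over all detections pooled together, keeps classes with many examples from dominating the score.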
6.
  • Li, Yun, et al. (author)
  • Scalable coding of plenoptic images by using a sparse set and disparities
  • 2016
  • In: IEEE Transactions on Image Processing. - 1057-7149 .- 1941-0042. ; 25:1, pp. 80-91
  • Journal article (peer-reviewed). Abstract:
    • One of the light field capturing techniques is the focused plenoptic capturing. By placing a microlens array in front of the photosensor, the focused plenoptic cameras capture both spatial and angular information of a scene in each microlens image and across microlens images. The capturing results in a significant amount of redundant information, and the captured image usually has a large resolution. A coding scheme that removes the redundancy before coding can be advantageous for efficient compression, transmission and rendering. In this paper, we propose a lossy coding scheme to efficiently represent plenoptic images. The format contains a sparse image set and its associated disparities. The reconstruction is performed by disparity-based interpolation and inpainting, and the reconstructed image is later employed as a prediction reference for the coding of the full plenoptic image. As an outcome of the representation, the proposed scheme inherits a scalable structure with three layers. The results show that plenoptic images are compressed efficiently with over 60 percent bit rate reduction compared to HEVC intra, and with over 20 percent compared to HEVC block copying mode.
7.
  • Liu, Du, et al. (author)
  • Fractional-Pel Accurate Motion-Adaptive Transforms
  • 2019
  • In: IEEE Transactions on Image Processing. - : IEEE. - 1057-7149 .- 1941-0042. ; 28:6, pp. 2731-2742
  • Journal article (peer-reviewed). Abstract:
    • Fractional-pel accurate motion is widely used in video coding. For subband coding, fractional-pel accuracy is challenging since it is difficult to handle the complex motion field with temporal transforms. In our previous work, we designed integer accurate motion-adaptive transforms (MAT) which can transform integer accurate motion-connected coefficients. In this paper, we extend the integer MAT to fractional-pel accuracy. The integer MAT allows only one reference coefficient to be the low-band coefficient. In this paper, we design the transform such that it permits multiple references and generates multiple low-band coefficients. In addition, our fractional-pel MAT can incorporate a general interpolation filter into the basis vector, such that the high-band coefficient produced by the transform is the same as the prediction error from the interpolation filter. The fractional-pel MAT is always orthonormal. Thus, the energy is preserved by the transform. We compare the proposed fractional-pel MAT, the integer MAT, and the half-pel motion-compensated orthogonal transform (MCOT), while HEVC intra coding is used to encode the temporal subbands. The experimental results show that the proposed fractional-pel MAT outperforms the integer MAT and the half-pel MCOT. The gain achieved by the proposed MAT over the integer MAT can reach up to 1 dB in PSNR.
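The link between orthonormality and energy preservation can be checked numerically on the simplest orthonormal two-band transform, the two-point Haar pair. This is a stand-in chosen for illustration and not the motion-adaptive transform of the article. The function names and sample values are assumptions.

```python
import math


def haar_pair(a, b):
    """Orthonormal 2-point transform: returns (low-band, high-band)."""
    s = 1.0 / math.sqrt(2.0)
    return s * (a + b), s * (a - b)


def inverse_haar_pair(low, high):
    """Inverse transform (the transpose of the same orthonormal matrix)."""
    s = 1.0 / math.sqrt(2.0)
    return s * (low + high), s * (low - high)


# Hypothetical pair of motion-connected pixel values from two frames.
a, b = 120.0, 116.0
low, high = haar_pair(a, b)

energy_in = a * a + b * b
energy_out = low * low + high * high
print(math.isclose(energy_in, energy_out))
```

Because the energy is unchanged, distortion introduced when coding the subband coefficients carries over to the reconstructed pixels without amplification, which is the practical reason orthonormality is desirable in subband coding.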
8.
  • Mahdizadehaghdam, Shahin, et al. (author)
  • Deep Dictionary Learning: A PARametric NETwork Approach
  • 2019
  • In: IEEE Transactions on Image Processing. - 1941-0042 .- 1057-7149. ; 28:10, pp. 4790-4802
  • Journal article (peer-reviewed). Abstract:
    • Deep dictionary learning seeks multiple dictionaries at different image scales to capture complementary coherent characteristics. We propose a method for learning a hierarchy of synthesis dictionaries with an image classification goal. The dictionaries and classification parameters are trained by a classification objective, and the sparse features are extracted by reducing a reconstruction loss in each layer. The reconstruction objectives in some sense regularize the classification problem and inject source signal information in the extracted features. The performance of the proposed hierarchical method increases by adding more layers, which consequently makes this model easier to tune and adapt. The proposed algorithm furthermore shows a remarkably lower fooling rate in the presence of adversarial perturbation. The validation of the proposed approach is based on its classification performance using four benchmark datasets and is compared to a Convolutional Neural Network (CNN) of similar size.
9.
  • Markus, Nenad, et al. (author)
  • Learning Local Descriptors by Optimizing the Keypoint-Correspondence Criterion: Applications to Face Matching, Learning From Unlabeled Videos and 3D-Shape Retrieval
  • 2019
  • In: IEEE Transactions on Image Processing. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1057-7149 .- 1941-0042. ; 28:1, pp. 279-290
  • Journal article (peer-reviewed). Abstract:
    • Current best local descriptors are learned on a large data set of matching and non-matching keypoint pairs. However, data of this kind are not always available, since the detailed keypoint correspondences can be hard to establish. On the other hand, we can often obtain labels for pairs of keypoint bags. For example, keypoint bags extracted from two images of the same object under different views form a matching pair, and keypoint bags extracted from images of different objects form a non-matching pair. On average, matching pairs should contain more corresponding keypoints than non-matching pairs. We describe an end-to-end differentiable architecture that enables the learning of local keypoint descriptors from such weakly labeled data. In addition, we discuss how to improve the method by incorporating the procedure of mining hard negatives. We also show how our approach can be used to learn convolutional features from unlabeled video signals and 3D models.
10.
  • Oshima, Satoshi, et al. (author)
  • Modeling, Measuring, and Compensating Color Weak Vision
  • 2016
  • In: IEEE Transactions on Image Processing. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 1057-7149 .- 1941-0042. ; 25:6, pp. 2587-2600
  • Journal article (peer-reviewed). Abstract:
    • We use methods from Riemann geometry to investigate transformations between the color spaces of color-normal and color-weak observers. The two main applications are the simulation of the perception of a color-weak observer for a color-normal observer, and the compensation of color images in a way that a color-weak observer has approximately the same perception as a color-normal observer. The metrics in the color spaces of interest are characterized with the help of ellipsoids defined by the just-noticeable-differences between the colors, which are measured with the help of color-matching experiments. The constructed mappings are the isometries of Riemann spaces that preserve the perceived color differences for both observers. Among the two approaches to build such an isometry, we introduce normal coordinates in Riemann spaces as a tool to construct a global color-weak compensation map. Compared with the previously used methods, this method is free from approximation errors due to local linearizations, and it avoids the problem of shifting locations of the origin of the local coordinate system. We analyze the variations of the Riemann metrics for different observers obtained from new color-matching experiments and describe three variations of the basic method. The performance of the methods is evaluated with the help of semantic differential tests.