SwePub

Search results for "L773:9781665490627 OR L773:9781665490634"

  • Result 1-10 of 10
1.
  • Blomqvist, Christopher, et al. (author)
  • Joint Handwritten Text Recognition and Word Classification for Tabular Information Extraction
  • 2022
  • In: 2022 26th International Conference on Pattern Recognition (ICPR). - 9781665490627 - 9781665490634 ; pp. 1564-1570
  • Conference paper (peer-reviewed)
    • In this paper, we present a system for extracting tabular information from loosely structured handwritten documents. The system consists of three parts: (i) a U-Net-like CNN-based method for text detection and segmentation, (ii) a new attention-based method for simultaneous text recognition and classification of word parts, and (iii) a method for matching the word parts into a tabular structure for each entry. A key contribution is the observation that the new attention-based recognition and classification module enables improved spatial analysis of the tabular information. The method is evaluated on a unique historical document: The Swedish Wealth Tax of 1571, consisting of 11,453 pages of handwritten tax records. The evaluation shows that the system provides a significant improvement over the state of the art on the problem of tabular extraction from loosely structured historical documents.
2.
  • Edstedt, Johan, et al. (author)
  • VidHarm: A Clip Based Dataset for Harmful Content Detection
  • 2022
  • In: 2022 26th International Conference on Pattern Recognition (ICPR). - : Institute of Electrical and Electronics Engineers (IEEE). - 9781665490627 - 9781665490634 ; pp. 1543-1549
  • Conference paper (peer-reviewed)
    • Automatically identifying harmful content in video is an important task with a wide range of applications. However, there is a lack of professionally labeled open datasets available. In this work, VidHarm, an open dataset of 3589 video clips from film trailers annotated by professionals, is presented. An analysis of the dataset is performed, revealing, among other things, the relation between clip-level and trailer-level annotations. Audiovisual models are trained on the dataset and an in-depth study of modeling choices is conducted. The results show that performance is greatly improved by combining the visual and audio modalities, pre-training on large-scale video recognition datasets, and class-balanced sampling. Lastly, biases of the trained models are investigated using discrimination probing. VidHarm is openly available, and further details are available at the webpage https://vidharm.github.io/
3.
  • Gillsjö, David, et al. (author)
  • Semantic Room Wireframe Detection from a Single View
  • 2022
  • In: 26th International Conference on Pattern Recognition, 2022. - 9781665490634 - 9781665490627 ; pp. 1886-1893
  • Conference paper (peer-reviewed)
    • Reconstruction of indoor surfaces with limited texture information or with repeated textures, a situation common in walls and ceilings, may be difficult with a monocular Structure from Motion system. We propose a Semantic Room Wireframe Detection task to predict a Semantic Wireframe from a single perspective image. Such predictions may be used with shape priors to estimate the Room Layout and aid reconstruction. To train and test the proposed algorithm, we create a new set of annotations from the simulated Structured3D dataset. We show qualitatively that the SRW-Net handles complex room geometries better than previous Room Layout Estimation algorithms, while quantitatively outperforming the baseline in non-semantic Wireframe Detection.
4.
  • Berg, Axel, et al. (author)
  • Points to patches: Enabling the use of self-attention for 3D shape recognition
  • 2022
  • In: 2022 26th International Conference on Pattern Recognition (ICPR). - 2831-7475 .- 1051-4651. - 9781665490627 - 9781665490627 ; pp. 528-534
  • Conference paper (peer-reviewed)
    • While the Transformer architecture has become ubiquitous in the machine learning field, its adaptation to 3D shape recognition is non-trivial. Due to its quadratic computational complexity, the self-attention operator quickly becomes inefficient as the set of input points grows larger. Furthermore, we find that the attention mechanism struggles to find useful connections between individual points on a global scale. To alleviate these problems, we propose a two-stage Point Transformer-in-Transformer (Point-TnT) approach which combines local and global attention mechanisms, enabling both individual points and patches of points to attend to each other effectively. Experiments on shape classification show that such an approach provides more useful features for downstream tasks than the baseline Transformer, while also being more computationally efficient. In addition, we extend our method to feature matching for scene reconstruction, showing that it can be used in conjunction with existing scene reconstruction pipelines.
5.
  • Flood, Gabrielle, et al. (author)
  • Minimal Solvers for Point Cloud Matching with Statistical Deformations
  • 2022
  • In: 2022 26th International Conference on Pattern Recognition (ICPR). - 9781665490634
  • Conference paper (peer-reviewed)
    • An important issue in simultaneous localisation and mapping is how to match and merge individual local maps into one global map. This is addressed within the field of robotics and is crucial for multi-robot SLAM. There are a number of different ways to solve this task, depending on the representation of the map. To take advantage of matching and merging methods that allow for deformations of the local maps, it is important to find feature matches that capture such deformations. In this paper, we present minimal solvers for point cloud matching using statistical deformations. The solvers use either three or four point matches. These solve for either rigid or similarity transformation as well as shape deformation in the direction of the most important modes of variation. Given an initial set of tentative matches based on, for example, feature descriptors or machine learning, we use these solvers in a RANSAC loop to remove outliers among the tentative matches. We evaluate the methods on both synthetic and real data, compare them to RANSAC methods based on Procrustes, and demonstrate that the proposed methods improve on the current state of the art.
6.
  • Gravina, Michela, et al. (author)
  • Evaluating tumour bounding options for deep learning-based axillary lymph node metastasis prediction in breast cancer
  • 2022
  • In: 2022 26th International Conference on Pattern Recognition (ICPR). - : Institute of Electrical and Electronics Engineers (IEEE). - 9781665490627 ; pp. 4335-4342
  • Conference paper (peer-reviewed)
    • The involvement of axillary lymph node metastasis in breast cancer is one of the most important independent prognostic factors. While lymph node metastasis depends on the primary tumour's intrinsic behaviour, morphology and angioinvasivity, the involvement of the peritumoral tissue by the neoplastic cells also provides useful information on potential tumour aggressiveness. The lymph node status is currently evaluated by invasive histological procedures with possible complications, which calls for safer approaches. Among different imaging techniques, Dynamic Contrast Enhanced-Magnetic Resonance Imaging (DCE-MRI) highlights physiological and morphological characteristics, reflecting the behaviour and aggressiveness of breast lesions. In recent years, deep learning (DL) approaches, such as Convolutional Neural Networks, have gained increasing popularity for biomedical image processing. Thanks to their ability to autonomously learn from images the set of features for the specific task to solve, they allow finding non-invasive alternatives to the standard procedures used up to now. This paper aims to evaluate the applicability of DL approaches for axillary lymph node metastasis prediction, considering the primary tumour DCE-MRI sequence. Unlike other work in the literature, we include a detailed analysis of the influence of healthy tissue on lymph node tumour spread through the evaluation of different tumour bounding options. Promising results are reported on a dataset of 153 patients with 155 malignant lesions.
7.
  • Gummeson, Anna, et al. (author)
  • Fast and efficient minimal solvers for quadric based camera pose estimation
  • 2022
  • In: 2022 26th International Conference on Pattern Recognition, ICPR 2022. - 9781665490627 ; pp. 3973-3979
  • Conference paper (peer-reviewed)
    • In this paper, we address absolute camera pose estimation. An efficient (and standard) way to solve this problem is to use sparse keypoint correspondences. In many cases, however, point features are not available, or are unstable over time and viewing conditions. We propose a framework based on silhouettes of quadric surfaces, with special emphasis on cylinders. We provide a mathematical analysis of the problem of projected cylinders in particular, but also of general quadrics. We develop a number of minimal solvers for estimating camera pose from silhouette lines of cylinders, given different calibration and cylinder properties. These solvers can be used efficiently in bootstrapping robust estimation schemes, such as RANSAC. Note that even though we have lines as image features, this case differs from line-based pose estimation, since we do not have 2D-line to 3D-line correspondences. We perform synthetic accuracy and robustness tests and evaluate on a number of real case scenarios.
8.
  • Hsu, Pohao, et al. (author)
  • Extremely Low-light Image Enhancement with Scene Text Restoration
  • 2022
  • In: Proceedings - International Conference on Pattern Recognition. - 1051-4651. - 9781665490627 ; 2022, pp. 317-323
  • Conference paper (peer-reviewed)
    • Deep learning based methods have made impressive progress in enhancing extremely low-light images: the image quality of the reconstructed images has generally improved. However, we found that most of these methods could not sufficiently recover the image details, for instance the texts in the scene. In this paper, a novel image enhancement framework is proposed to restore the scene texts and the overall quality of the image simultaneously under extremely low-light conditions. In particular, we employ a self-regularised attention map, an edge map, and a novel text detection loss. The quantitative and qualitative experimental results show that the proposed model outperforms state-of-the-art methods in terms of image restoration, text detection, and text spotting on the See In the Dark and ICDAR15 datasets.
9.
  • Liu, Xixi, 1995, et al. (author)
  • Effortless Training of Joint Energy-Based Models with Sliced Score Matching
  • 2022
  • In: Proceedings - International Conference on Pattern Recognition. - 1051-4651. - 9781665490627 ; pp. 2643-2649
  • Conference paper (peer-reviewed)
    • Standard discriminative classifiers can be upgraded to joint energy-based models (JEMs) by combining the classification loss with a log-evidence loss. Hence, such models intrinsically allow detection of out-of-distribution (OOD) samples, and empirically also provide better-calibrated posteriors, i.e., prediction uncertainties. However, the training procedure suggested for JEMs (using stochastic gradient Langevin dynamics, or SGLD, to maximize the evidence) is reported to be brittle. In this work, we propose to utilize score matching, in particular sliced score matching, to obtain a stable training method for JEMs. We observe empirically that the combination of score matching with the standard classification loss leads to improved OOD detection and better-calibrated classifiers for otherwise identical DNN architectures. Additionally, we analyze the impact of replacing the regular soft-max layer for classification with a gated soft-max one in order to improve the intrinsic transformation invariance and generalization ability.
10.
  • Liu, Xixi, 1995, et al. (author)
  • Joint Energy-based Model for Deep Probabilistic Regression
  • 2022
  • In: Proceedings - International Conference on Pattern Recognition. - 1051-4651. - 9781665490627 ; 2022, pp. 2693-2699
  • Conference paper (peer-reviewed)
    • It is desirable that a deep neural network trained on a regression task not only achieves high prediction accuracy but also produces well-calibrated prediction posteriors, especially in safety-critical settings. Recently, energy-based models designed specifically to enrich regression posteriors have been proposed and achieve state-of-the-art results in object detection tasks. However, applying these models at prediction time is not straightforward, as the resulting inference methods require minimizing an underlying energy function. Furthermore, these methods empirically do not provide accurate prediction uncertainties. Inspired by recent joint energy-based models for classification, in this work we propose to utilize a joint energy model for regression tasks and describe the architectural differences needed in this setting. Within this framework, we apply our methods to three computer vision regression tasks. We demonstrate that joint energy-based models for deep probabilistic regression improve the calibration property, do not require expensive inference, and yield competitive accuracy in terms of the mean absolute error (MAE).