SwePub
Search the SwePub database


Result list for the search "L773:0031 3203 OR L773:1873 5142 srt2:(2020-2024)"


  • Result 1-10 of 10
1.
  • Gharaee, Zahra, 1986-, et al. (author)
  • Graph representation learning for road type classification
  • 2021
  • In: Pattern Recognition. - : Elsevier. - 0031-3203 .- 1873-5142. ; 120
  • Journal article (peer-reviewed), abstract:
    • We present a novel learning-based approach to graph representations of road networks employing state-of-the-art graph convolutional neural networks. Our approach is applied to realistic road networks of 17 cities from OpenStreetMap. While edge features are crucial to generate descriptive graph representations of road networks, graph convolutional networks usually rely on node features only. We show that the highly representative edge features can still be integrated into such networks by applying a line graph transformation. We also propose a method for neighborhood sampling based on a topological neighborhood composed of both local and global neighbors. We compare the performance of learning representations using different types of neighborhood aggregation functions in transductive and inductive tasks and in supervised and unsupervised learning. Furthermore, we propose a novel aggregation approach, the Graph Attention Isomorphism Network (GAIN). Our results show that GAIN outperforms state-of-the-art methods on the road type classification problem. [An illustrative code sketch follows this entry.]
  •  
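A minimal sketch of the line-graph transformation described in entry 1, assuming the networkx library is available; the toy road graph, its attribute names, and the helper function are invented for illustration, not taken from the paper:

    # Edges of the road graph become nodes of the line graph, so per-edge
    # features (road type, length, ...) can be consumed by node-based GCNs.
    import networkx as nx

    def to_line_graph_with_features(G: nx.Graph) -> nx.Graph:
        """Turn an edge-attributed graph into a node-attributed line graph."""
        L = nx.line_graph(G)
        for e in L.nodes:            # each line-graph node is an edge of G
            L.nodes[e].update(G.edges[e])
        return L

    G = nx.Graph()
    G.add_edge("a", "b", road_type="residential", length=120.0)
    G.add_edge("b", "c", road_type="primary", length=300.0)
    print(to_line_graph_with_features(G).nodes(data=True))

After the transformation, any node-based aggregation (including an attention-style one such as the paper's GAIN) can operate on what were originally edge features.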
2.
  • Guarrasi, Valerio, et al. (author)
  • Multimodal explainability via latent shift applied to COVID-19 stratification
  • 2024
  • In: Pattern Recognition. - : Elsevier. - 0031-3203 .- 1873-5142. ; 156
  • Journal article (peer-reviewed), abstract:
    • We are witnessing a widespread adoption of artificial intelligence in healthcare. However, most of the advancements in deep learning in this area consider only unimodal data, neglecting other modalities, even though a multimodal interpretation is necessary for supporting diagnosis, prognosis and treatment decisions. In this work, we present a deep architecture that jointly learns modality reconstructions and sample classifications using tabular and imaging data. The explanation of the decision is computed by applying a latent shift that simulates a counterfactual prediction, revealing the features of each modality that contribute the most to the decision, together with a quantitative score indicating each modality's importance. We validate our approach in the context of the COVID-19 pandemic using the AIforCOVID dataset, which contains multimodal data for the early identification of patients at risk of severe outcome. The results show that the proposed method provides meaningful explanations without degrading the classification performance. [An illustrative code sketch follows this entry.]
  •  
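The latent-shift idea in entry 2 can be illustrated with a toy linear autoencoder and classifier; everything below (the sizes, the linear maps, the shift schedule) is an assumption for demonstration only:

    import numpy as np

    rng = np.random.default_rng(0)
    D_img, D_tab, D_z = 8, 4, 3                  # toy modality/latent sizes
    enc = rng.normal(size=(D_img + D_tab, D_z))  # stand-in linear "encoder"
    dec = enc.T                                  # stand-in linear "decoder"
    w = rng.normal(size=D_z)                     # linear classifier in latent space

    x = rng.normal(size=D_img + D_tab)           # one sample: [image | tabular]
    z = x @ enc
    for lam in (0.0, 1.0, 2.0):                  # increasing counterfactual shift
        z_s = z - lam * w                        # move against the class score
        x_hat = z_s @ dec
        d_img = np.abs(x_hat[:D_img] - x[:D_img]).mean()
        d_tab = np.abs(x_hat[D_img:] - x[D_img:]).mean()
        print(f"lam={lam}: score={z_s @ w:+.2f}, img-delta={d_img:.3f}, tab-delta={d_tab:.3f}")

The modality whose reconstruction changes most under the shift is the one the toy "classifier" leans on, which is the intuition behind the paper's modality-importance score.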
3.
  • Gupta, Ankit, et al. (author)
  • Efficient High-Resolution Template Matching with Vector Quantized Nearest Neighbour Fields
  • 2024
  • In: Pattern Recognition. - : Elsevier. - 0031-3203 .- 1873-5142. ; 151
  • Journal article (peer-reviewed), abstract:
    • Template matching is a fundamental problem in computer vision with applications in fields including object detection, image registration, and object tracking. Current methods rely on nearest-neighbour (NN) matching, where the query feature space is converted to NN space by representing each query pixel with its NN in the template. NN-based methods have been shown to perform better under occlusions, appearance changes, and non-rigid transformations; however, they scale poorly with high-resolution data and high feature dimensions. We present an NN-based method that efficiently reduces the NN computations and introduces filtering in the NN fields (NNFs). A vector quantization step is introduced before the NN calculation to represent the template with k features, and the filter response over the NNFs is used to compare the template and query distributions over the features. We show that state-of-the-art performance is achieved on low-resolution data, and our method outperforms previous methods at higher resolutions. [An illustrative code sketch follows this entry.]
  •  
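A rough numpy sketch of the vector-quantization step from entry 3: the template is summarized by k codewords, query pixels are mapped to codewords, and candidate windows are scored by comparing codeword histograms. The scoring rule and all sizes are simplifications, not the paper's exact pipeline:

    import numpy as np

    def kmeans(X, k, iters=20, seed=0):
        rng = np.random.default_rng(seed)
        C = X[rng.choice(len(X), k, replace=False)]
        for _ in range(iters):
            a = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(1)
            C = np.array([X[a == j].mean(0) if (a == j).any() else C[j]
                          for j in range(k)])
        return C

    rng = np.random.default_rng(1)
    template = rng.normal(size=(16, 8))   # 16 template pixels, 8-dim features
    query = rng.normal(size=(500, 8))     # flattened query feature map
    C = kmeans(template, k=4)             # 4 codewords replace 16 NN targets
    t_hist = np.bincount(((template[:, None] - C) ** 2).sum(-1).argmin(1), minlength=4)
    q_assign = ((query[:, None] - C) ** 2).sum(-1).argmin(1)
    q_hist = np.bincount(q_assign[:16], minlength=4)   # one candidate window
    print("histogram distance:", np.abs(t_hist - q_hist).sum())

Quantizing to k codewords is what cuts the per-pixel NN search from the number of template pixels down to k, which is where the efficiency at high resolution comes from.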
4.
  • Johansson, Ulf, et al. (author)
  • Rule extraction with guarantees from regression models
  • 2022
  • In: Pattern Recognition. - : Elsevier. - 0031-3203 .- 1873-5142. ; 126
  • Journal article (peer-reviewed), abstract:
    • Tools for understanding and explaining complex predictive models are critical for user acceptance and trust. One such tool is rule extraction, i.e., approximating opaque models with less powerful but interpretable models. Pedagogical (or black-box) rule extraction, where the interpretable model is induced using the original training instances, but with the predictions from the opaque model as targets, has many advantages compared to the decompositional (white-box) approach. Most importantly, pedagogical methods are agnostic to the kind of opaque model used, and any learning algorithm producing interpretable models can be employed for the learning step. The pedagogical approach has, however, one main problem, clearly limiting its utility. Specifically, while the extracted models are trained to mimic the opaque model, there are absolutely no guarantees that this will transfer to novel data. This potentially low test set fidelity must be considered a severe drawback, in particular when the extracted models are used for explanation and analysis. In this paper, a novel approach that solves the test set fidelity problem by utilizing the conformal prediction framework is suggested for extracting interpretable regression models from opaque models. The extracted models are standard regression trees, but augmented with valid prediction intervals in the leaves. Depending on the exact setup, the use of conformal prediction guarantees that either the test set fidelity or the test set accuracy will be equal to a preset confidence level, in the long run. In the extensive empirical investigation, using 20 publicly available data sets, the validity of the extracted models is demonstrated. In addition, it is shown how normalization can be used to provide individualized prediction intervals, thus providing highly informative extracted models. [An illustrative code sketch follows this entry.]
  •  
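Entry 4's combination of pedagogical rule extraction with split-conformal intervals can be sketched as below, assuming scikit-learn is available; the synthetic data and the stand-in "opaque model" are invented:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(600, 2))
    opaque = lambda X: np.sin(X[:, 0]) + 0.5 * X[:, 1]  # stand-in black box
    y = opaque(X)                                 # targets are its predictions

    tree = DecisionTreeRegressor(max_depth=3).fit(X[:400], y[:400])  # interpretable proxy
    alpha = 0.1                                   # 90% fidelity target
    resid = np.abs(y[400:] - tree.predict(X[400:]))  # calibration nonconformity
    n = len(resid)
    q = np.quantile(resid, np.ceil((1 - alpha) * (n + 1)) / n)

    x_new = rng.uniform(-3, 3, size=(1, 2))
    pred = tree.predict(x_new)[0]
    print(f"tree prediction {pred:.2f}, interval [{pred - q:.2f}, {pred + q:.2f}]")

Because the tree is calibrated against the opaque model's own outputs, the interval guarantees fidelity to the opaque model rather than accuracy on the true labels; calibrating on true labels instead would target accuracy, matching the two setups the abstract mentions.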
5.
  • Khan, Rizwan, et al. (author)
  • A High Dynamic Range Imaging Method for Short Exposure Multiview Images
  • 2023
  • In: Pattern Recognition. - : Elsevier BV. - 0031-3203 .- 1873-5142. ; 137
  • Journal article (peer-reviewed), abstract:
    • The restoration and enhancement of multiview low dynamic range (MVLDR) images captured in low lighting conditions is a great challenge. Disparity maps are hardly reliable in practical, real-world scenarios and suffer from holes and artifacts due to the large baseline and angle deviation among multiple cameras in low lighting conditions. Furthermore, multiple images with additional information (e.g., ISO/exposure time) are required for the radiance map, which poses the additional challenge of deghosting to counter motion artifacts. In this paper, we propose a method to reconstruct multiview high dynamic range (MVHDR) images from MVLDR images without relying on disparity maps. We detect and accurately match feature points among the involved input views and gather brightness information from neighboring viewpoints to optimize an image restoration function based on input exposure gain, finally generating MVHDR images. Our method is very reliable and suitable for a wide baseline among sparse cameras. The proposed method requires only one image per viewpoint without any additional information and outperforms existing methods. [An illustrative code sketch follows this entry.]
  •  
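A loose sketch of exposure-gain-based fusion to accompany entry 5; the gain estimate and the averaging rule below are placeholders for the restoration function the paper actually optimizes:

    import numpy as np

    rng = np.random.default_rng(0)
    radiance = rng.uniform(0.05, 1.0, size=100)   # toy scene radiance
    view_a = radiance * 0.3                       # short exposure, gain 0.3
    view_b = radiance * 0.6                       # neighbouring view, gain 0.6

    gain = np.median(view_b / np.maximum(view_a, 1e-6))  # estimated exposure ratio
    merged = 0.5 * (view_a * gain + view_b)              # fuse on view_b's scale
    print("estimated gain:", round(float(gain), 3))
    print("max abs fusion error:", float(np.abs(merged - 0.6 * radiance).max()))

The point mirrored here is that brightness from matched pixels in neighbouring viewpoints is brought to a common radiance scale before merging, with no disparity map involved.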
6.
  • Lv, Zhihan, Dr. 1984-, et al. (author)
  • Memory-augmented neural networks based dynamic complex image segmentation in digital twins for self-driving vehicle
  • 2022
  • In: Pattern Recognition. - : Elsevier. - 0031-3203 .- 1873-5142. ; 132
  • Journal article (peer-reviewed), abstract:
    • As the amount of visual information continuously increases, people urgently need to identify image content in more detail in order to obtain richer information from images. This work explores dynamic complex image segmentation for self-driving vehicles under Digital Twins (DTs) based on Memory-Augmented Neural Networks (MANNs), so as to further improve self-driving performance in intelligent transportation. In view of the complexity of the environment and the dynamic changes of scenes in intelligent transportation, this work constructs a MANN-based segmentation model for dynamic complex images of self-driving vehicles under DTs by optimizing the deep learning algorithm and combining it with DT technology, so as to recognize the information in environment images during self-driving. Finally, the performance of the constructed model is analyzed on different image datasets (PASCAL VOC 2012, NYUDv2, PASCAL CONTEXT, and real self-driving complex traffic image data). The results show that, compared with other classical algorithms, the established MANN-based model reaches an accuracy of about 85.80%, shortens training time to 107.00 s with a test time of 0.70 s, and achieves a high speedup ratio. In addition, the average algorithm performance reaches its maximum at the energy-function parameter α = 0.06. Therefore, the proposed model shows high accuracy and short training time, which can provide an experimental reference for future image visual computing and intelligent information processing. [An illustrative code sketch follows this entry.]
  •  
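Only to illustrate the "memory-augmented" ingredient named in entry 6, here is a generic content-addressed key-value memory read and soft write in numpy; the paper's segmentation model is far more elaborate and is not reproduced here:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    keys = rng.normal(size=(5, 4))      # 5 memory slots with 4-dim keys
    values = rng.normal(size=(5, 4))

    query = rng.normal(size=4)
    attn = softmax(keys @ query)        # content-based addressing
    read = attn @ values                # differentiable memory read
    slot = int(attn.argmax())
    values[slot] = 0.9 * values[slot] + 0.1 * query   # soft write to best slot
    print("read vector:", np.round(read, 3))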
7.
  • Yao, Jie, et al. (author)
  • ADCNN : Towards learning adaptive dilation for convolutional neural networks
  • 2022
  • In: Pattern Recognition. - : Elsevier BV. - 0031-3203 .- 1873-5142. ; 123
  • Journal article (peer-reviewed), abstract:
    • Dilated convolution kernels are constrained by their shared dilation, keeping them from being aware of diverse spatial contents at different locations. We address such limitations by formulating the dilation as trainable weights with respect to individual positions. We propose Adaptive Dilation Convolutional Neural Networks (ADCNN), a lightweight extension that allows convolutional kernels to adjust their dilation value based on different contents at the pixel level. Unlike previous content-adaptive models, ADCNN dynamically infers pixel-wise dilation via modeling feed-forward inter-patterns, which provides a new perspective for developing adaptive network structures beyond sampling kernel spaces. Our evaluation results indicate that ADCNNs can be easily integrated into various backbone networks and consistently outperform their regular counterparts on various visual tasks. [An illustrative code sketch follows this entry.]
  •  
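Entry 7's per-position dilation can be made concrete with a 1-D toy: each position samples a 3-tap kernel at its own, possibly fractional, dilation realized by linear interpolation. The way the dilation is "inferred" below is a placeholder, not the paper's mechanism:

    import numpy as np

    def adaptive_dilated_conv1d(x, w, dilation):
        """3-tap convolution where position i uses spacing dilation[i]."""
        n, pos = len(x), np.arange(len(x))
        out = np.zeros(n)
        for tap, off in enumerate((-1, 0, 1)):
            s = np.clip(pos + off * dilation, 0, n - 1)  # fractional positions
            lo = np.floor(s).astype(int)
            hi = np.minimum(lo + 1, n - 1)
            frac = s - lo
            out += w[tap] * ((1 - frac) * x[lo] + frac * x[hi])
        return out

    x = np.sin(np.linspace(0, 6, 32))
    dil = 1.0 + 2.0 * np.abs(np.gradient(x))   # toy per-position dilation
    print(adaptive_dilated_conv1d(x, np.array([0.25, 0.5, 0.25]), dil)[:5])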
8.
  • Zamboni, Simone, et al. (author)
  • Pedestrian trajectory prediction with convolutional neural networks
  • 2022
  • In: Pattern Recognition. - : Elsevier BV. - 0031-3203 .- 1873-5142. ; 121
  • Journal article (peer-reviewed), abstract:
    • Predicting the future trajectories of pedestrians is a challenging problem with a range of applications, from crowd surveillance to autonomous driving. In the literature, methods for pedestrian trajectory prediction have evolved, transitioning from physics-based models to data-driven models based on recurrent neural networks. In this work, we propose a new approach to pedestrian trajectory prediction, introducing a novel 2D convolutional model. This new model outperforms recurrent models and achieves state-of-the-art results on the ETH and TrajNet datasets. We also present an effective system to represent pedestrian positions and powerful data augmentation techniques, such as the addition of Gaussian noise and the use of random rotations, which can be applied to any model. As an additional exploratory analysis, we present experimental results on the inclusion of occupancy methods to model social information, which empirically show that these methods are ineffective in capturing social interaction. [An illustrative code sketch follows this entry.]
  •  
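The two model-agnostic augmentations highlighted in entry 8 are easy to state exactly; the trajectory, noise level, and angle range below are illustrative:

    import numpy as np

    def augment(traj, rng, sigma=0.05):
        """Random global rotation plus additive Gaussian noise."""
        theta = rng.uniform(0, 2 * np.pi)
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        return traj @ R.T + rng.normal(0, sigma, traj.shape)

    rng = np.random.default_rng(0)
    traj = np.stack([np.linspace(0, 4, 8), np.zeros(8)], axis=1)  # 8 (x, y) steps
    print(augment(traj, rng).round(2))

Rotations exploit the fact that walking direction carries no label information, and Gaussian noise regularizes against sensor jitter; both apply to any trajectory model, as the abstract notes.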
9.
  • Lin, Che-Tsung, 1979, et al. (author)
  • Cycle-Object Consistency for Image-to-Image Domain Adaptation
  • 2023
  • In: Pattern Recognition. - : Elsevier BV. - 0031-3203. ; 138
  • Journal article (peer-reviewed), abstract:
    • Recent advances in generative adversarial networks (GANs) have been proven effective in performing domain adaptation for object detectors through data augmentation. While GANs are exceptionally successful, those methods that can preserve objects well in the image-to-image translation task usually require an auxiliary task, such as semantic segmentation, to prevent the image content from being too distorted. However, pixel-level annotations are difficult to obtain in practice. Alternatively, instance-aware image-translation models treat object instances and background separately. Yet, they require object detectors at test time, assuming that off-the-shelf detectors work well in both domains. In this work, we present AugGAN-Det, which introduces Cycle-object Consistency (CoCo) loss to generate instance-aware translated images across complex domains. The object detector of the target domain is directly leveraged in generator training and guides the preserved objects in the translated images to carry target-domain appearances. Compared to previous models, which, e.g., require pixel-level semantic segmentation to force the latent distribution to be object-preserving, this work only needs bounding-box annotations, which are significantly easier to acquire. Next, relative to instance-aware GAN models, our model, AugGAN-Det, internalizes global and object style transfer without explicitly aligning the instance features. Most importantly, a detector is not required at test time. Experimental results demonstrate that our model outperforms recent object-preserving and instance-level models and achieves state-of-the-art detection accuracy and visual perceptual quality. [An illustrative code sketch follows this entry.]
  •  
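A schematic of how the generator objective in entry 9 composes its terms: a cycle-consistency loss plus a detection loss from the frozen target-domain detector on translated images. Every function below is a stand-in; only the combination of terms is the point:

    import numpy as np

    def generator_loss(x_src, boxes_src, G, F, detector_tgt,
                       lam_cyc=10.0, lam_det=1.0):
        x_fake = G(x_src)                          # source -> target translation
        l_cyc = float(np.mean((F(x_fake) - x_src) ** 2))  # cycle consistency
        l_det = detector_tgt(x_fake, boxes_src)    # objects must stay detectable
        return lam_cyc * l_cyc + lam_det * l_det

    # toy stand-ins so the sketch runs end to end
    G = F = lambda x: 0.9 * x
    detector_tgt = lambda x, boxes: float(np.mean(np.abs(x)))  # fake detection loss
    print(generator_loss(np.ones((4, 4)), [(0, 0, 2, 2)], G, F, detector_tgt))

Because the detection loss is computed on translated images against the source boxes, object preservation is supervised with bounding boxes only, which is the abstract's contrast with segmentation-guided models.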
10.
  • Ng, Chun Chet, et al. (author)
  • When IC meets text: Towards a rich annotated integrated circuit text dataset
  • 2024
  • In: Pattern Recognition. - 0031-3203. ; 147
  • Journal article (peer-reviewed), abstract:
    • Automated Optical Inspection (AOI) is a process that uses cameras to autonomously scan printed circuit boards for quality control. Text is often printed on chip components, and it is crucial that this text is correctly recognized during AOI, as it contains valuable information. In this paper, we introduce ICText, the largest dataset for text detection and recognition on integrated circuits. Uniquely, it includes labels for character quality attributes such as low contrast, blurry, and broken. While loss-reweighting and Curriculum Learning (CL) have been proposed to improve object detector performance by balancing positive and negative samples and gradually training the model from easy to hard samples, these methods have had limited success with one-stage object detectors commonly used in industry. To address this, we propose Attribute-Guided Curriculum Learning (AGCL), which leverages the labeled character quality attributes in ICText. Our extensive experiments demonstrate that AGCL can be applied to different detectors in a plug-and-play fashion to achieve higher Average Precision (AP), significantly outperforming existing methods on ICText without any additional computational overhead during inference. Furthermore, we show that AGCL is also effective on the generic object detection dataset Pascal VOC. Our code and dataset will be publicly available at https://github.com/chunchet-ng/ICText-AGCL. [An illustrative code sketch follows this entry.]
  •  
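A toy rendering of the attribute-guided curriculum idea from entry 10: samples with fewer quality defects (low contrast, blurry, broken) are admitted to training first. The difficulty score and the schedule are assumptions, not the paper's exact recipe:

    import numpy as np

    rng = np.random.default_rng(0)
    # one row per text instance: [low_contrast, blurry, broken] flags
    attributes = rng.integers(0, 2, size=(10, 3))
    difficulty = attributes.sum(axis=1)      # more defects means harder
    order = np.argsort(difficulty)           # easy-first ordering

    for epoch in range(3):
        frac = (epoch + 1) / 3               # grow the admitted pool
        pool = order[: max(1, int(frac * len(order)))]
        print(f"epoch {epoch}: train on samples {pool.tolist()}")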