SwePub

Result list for search "WFRF:(Vu Xuan Son 1988 ) srt2:(2021)"

  • Results 1-4 of 4
1.
  • Vu, Xuan-Son, 1988-, et al. (author)
  • MC-OCR Challenge : Mobile-Captured Image Document Recognition for Vietnamese Receipts
  • 2021
  • In: Proceedings - 2021 RIVF International Conference on Computing and Communication Technologies, RIVF 2021. - : IEEE. - 9781665404358 ; , pp. 88-93
  • Conference paper (peer-reviewed)
    • The paper describes the organisation of the "Mobile Captured Receipt Recognition Challenge" (MC-OCR) task at the RIVF 2021 conference on recognizing fine-grained information in Vietnamese receipts captured using mobile devices. The task is organized as a multi-tasking model on a dataset containing 2,436 Vietnamese receipts. The participants were challenged to build a model capable of (1) predicting a receipt's quality based on readable information, and (2) recognizing the text of four required fields (i.e., "SELLER", "SELLER ADDRESS", "TIMESTAMP", and "TOTAL COST") in the receipts. The MC-OCR challenge ran for one month, and the top winners of each task will present their solutions at RIVF 2021. Participants competed on CodaLab.Org from 5th December 2020 to 23rd January 2021. All participants with valid submitted results were encouraged to submit their papers. Within one month, the challenge attracted 105 participants and recorded about 1,285 submission entries.
2.
  • Hatefi, Arezoo, 1990-, et al. (author)
  • Cformer: Semi-Supervised Text Clustering Based on Pseudo Labeling
  • 2021
  • In: CIKM '21. - New York, NY, USA : ACM Digital Library. - 9781450384469 ; , pp. 3078-3082
  • Conference paper (peer-reviewed)
    • We propose a semi-supervised learning method called Cformer for automatic clustering of text documents in cases where clusters are described by a small number of labeled examples, while the majority of training examples are unlabeled. We motivate this setting with an application in contextual programmatic advertising, a type of content placement on news pages that does not exploit personal information about visitors but relies on the availability of a high-quality clustering computed on the basis of a small number of labeled samples. To enable text clustering with little training data, Cformer leverages the teacher-student architecture of Meta Pseudo Labels. In addition to unlabeled data, Cformer uses a small amount of labeled data to describe the target clusters. Our experimental results confirm that the proposed model improves on the state of the art if a reasonable amount of labeled data is available. The models are comparatively small and suitable for deployment in constrained environments with limited computing resources. The source code is available at https://github.com/Aha6988/Cformer.
3.
  • Nguyen, Hoang D., et al. (author)
  • Modular Graph Transformer Networks for Multi-Label Image Classification
  • 2021
  • In: 35th AAAI Conference on Artificial Intelligence, AAAI 2021, 33rd Conference on Innovative Applications of Artificial Intelligence and the 11th Symposium on Educational Advances in Artificial Intelligence. - : Association for the Advancement of Artificial Intelligence. - 9781713835974 - 9781577358664 ; , pp. 9092-9100
  • Conference paper (peer-reviewed)
    • With the recent advances in graph neural networks, there is a rising number of studies on graph-based multi-label classification that consider object dependencies within visual data. Nevertheless, graph representations can become indistinguishable due to the complex nature of label relationships. We propose a multi-label image classification framework based on graph transformer networks to fully exploit inter-label interactions. The paper presents a modular learning scheme to enhance the classification performance by segregating the computational graph into multiple sub-graphs based on modularity. Our approach, named Modular Graph Transformer Networks (MGTN), is capable of employing multiple backbones for better information propagation over different sub-graphs guided by graph transformers and convolutions. We validate our framework on the MS-COCO and Fashion550K datasets to demonstrate improvements for multi-label image classification. The source code is available at https://github.com/ReML-AI/MGTN.
4.
  • Nguyen, Nhu-Van, et al. (author)
  • ICDAR 2021 Competition on Multimodal Emotion Recognition on Comics Scenes
  • 2021
  • In: Document Analysis and Recognition – ICDAR 2021. - Cham : Springer. - 9783030863364 - 9783030863371 ; , pp. 767-782
  • Conference paper (peer-reviewed)
    • The paper describes the "Multimodal Emotion Recognition on Comics scenes" competition presented at the ICDAR 2021 conference. This competition aims to tackle the problem of emotion recognition in comic scenes (panels). Emotions are assigned manually by multiple annotators to each comic scene in a subset of a public large-scale dataset of golden age American comics. As a multi-modal analysis task, the competition proposes to extract the emotions of comic characters in comic scenes based on visual information, text in speech balloons or captions, and onomatopoeia. Participants competed on CodaLab.org from December 16th 2020 to March 31st 2021. The challenge attracted 145 registrants; 21 teams joined the public test phase, and 7 teams competed in the private test phase. In this paper we present the motivation, dataset preparation, and task definition of the competition, and an analysis of the participants' performance and submitted methods. We believe that the competition has drawn attention from the document analysis community, in both computer vision and natural language processing, to the task of emotion recognition in documents.