SwePub

Result list for search "WFRF:(Vu Xuan Son 1988 ) srt2:(2024)"

Search: WFRF:(Vu Xuan Son 1988 ) > (2024)

  • Results 1-5 of 5
1.
  • Hatefi, Arezoo, 1990- (author)
  • Deep learning for news topic identification in limited supervision and unsupervised settings
  • 2024
  • Doctoral thesis (other academic/artistic)
    • In today's world, following news is crucial for decision-making and staying informed. With the growing volume of daily news, automated processing is essential for timely insights and in aiding individuals and corporations in navigating the complexities of the information society. Another use of automated processing is contextual advertising, which addresses privacy concerns associated with cookie-based advertising by placing ads solely based on web page content, without tracking users or their online behavior. Therefore, accurately determining and categorizing page content is crucial for effective ad placements. The news media, heavily reliant on advertising to sustain operations, represent a substantial market for contextual advertising strategies. Inspired by these practical applications and the advancements in deep learning over the past decade, this thesis mainly focuses on using deep learning for categorizing news articles into topics of varying granularity. Considering the dynamic nature of these applications and the limited availability of relevant labeled datasets for training models, the thesis emphasizes developing methods that can be trained effectively using unlabeled or partially labeled data. It proposes semi-supervised text classification models for categorizing datasets into predefined coarse-grained topics, where only a few labeled examples exist for each topic, while the majority of the dataset remains unlabeled. Furthermore, to better explore coarse-grained topics within news archives and streams and overcome the limitations of predefined topics in text classification, the thesis suggests deep clustering approaches that can be trained in unsupervised settings. Moreover, to address the identification of fine-grained topics, the thesis introduces a novel story discovery model for monitoring event-based topics in multi-source news streams.
Given that online news reporting often incorporates diverse modalities like text, images, video, and audio to convey information, the thesis finally initiates an investigation into the synergy between textual and visual elements in news article analysis. To achieve this objective, a text-image dataset was annotated, and a baseline was established for event-topic discovery in multimodal news streams. While primarily intended for news monitoring and contextual advertising, the proposed models can, more generally, be regarded as novel approaches in semi-supervised text classification, deep clustering, and news story discovery. Comparison with state-of-the-art baseline models demonstrates their effectiveness in addressing the respective objectives.
2.
  •  
3.
  • Szawerna, Maria Irena, et al. (author)
  • Pseudonymization Categories across Domain Boundaries
  • 2024
  • In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). - : ELRA and ICCL.
  • Conference paper (peer-reviewed)
    • Linguistic data, a component critical not only for research in a variety of fields but also for the development of various Natural Language Processing (NLP) applications, can contain personal information. As a result, its accessibility is limited, both from a legal and an ethical standpoint. One of the solutions is the pseudonymization of the data. Key stages of this process include the identification of sensitive elements and the generation of suitable surrogates in a way that the data is still useful for the intended task. Within this paper, we conduct an analysis of tagsets that have previously been utilized in anonymization and pseudonymization. We also investigate what kinds of Personally Identifiable Information (PII) appear in various domains. These reveal that none of the analyzed tagsets account for all of the PII types present cross-domain at the level of detailedness seemingly required for pseudonymization. We advocate for a universal system of tags for categorizing PIIs leading up to their replacement. Such categorization could facilitate the generation of grammatically, semantically, and sociolinguistically appropriate surrogates for the kinds of information that are considered sensitive in a given domain, resulting in a system that would enable dynamic pseudonymization while keeping the texts readable and useful for future research in various fields.
4.
  • Tran, Khanh-Tung, et al. (author)
  • NeuProNet: neural profiling networks for sound classification
  • 2024
  • In: Neural Computing & Applications. - : Springer Nature. - 0941-0643 .- 1433-3058. ; 36:11, pp. 5873-5887
  • Journal article (peer-reviewed)
    • Real-world sound signals exhibit various aspects of grouping and profiling behaviors, such as being recorded from identical sources, having similar environmental settings, or encountering related background noises. In this work, we propose novel neural profiling networks (NeuProNet) capable of learning and extracting high-level unique profile representations from sounds. An end-to-end framework is developed so that any backbone architectures can be plugged in and trained, achieving better performance in any downstream sound classification tasks. We introduce an in-batch profile grouping mechanism based on profile awareness and attention pooling to produce reliable and robust features with contrastive learning. Furthermore, extensive experiments are conducted on multiple benchmark datasets and tasks to show that neural computing models under the guidance of our framework gain significant performance gaps across all evaluation tasks. Particularly, the integration of NeuProNet surpasses recent state-of-the-art (SoTA) approaches on UrbanSound8K and VocalSound datasets with statistically significant improvements in benchmarking metrics, up to 5.92% in accuracy compared to the previous SoTA method and up to 20.19% compared to baselines. Our work provides a strong foundation for utilizing neural profiling for machine learning tasks.
5.
  • Volodina, Elena, et al. (author)
  • Introduction
  • 2024
  • In: Proceedings of the workshop on computational approaches to language data pseudonymization (CALD-pseudo 2024). - : Association for Computational Linguistics. - 9798891760851 ; , pp. ii-iii
  • Book chapter (other academic/artistic)