SwePub
Search the SwePub database


Result list for the search "WFRF:(Saini Rajkumar Dr. 1988 )"

Search: WFRF:(Saini Rajkumar Dr. 1988 )

  • Results 1-10 of 29
1.
  • Alonso, Pedro, 1986-, et al. (author)
  • Hate Speech Detection using Transformer Ensembles on the HASOC Dataset
  • 2020
  • In: Speech and Computer. - Cham : Springer. ; pp. 13-21
  • Conference paper (peer-reviewed) abstract
    • With the ubiquity and anonymity of the Internet, the spread of hate speech has been a growing concern for many years now. The language used for the purpose of dehumanizing, defaming or threatening individuals and marginalized groups not only threatens the mental health of its targets, as well as their democratic access to the Internet, but also the fabric of our society. Because of this, much effort has been devoted to manual moderation. The amount of data generated each day, particularly on social media platforms such as Facebook and Twitter, however, makes this a Sisyphean task. This has led to an increased demand for automatic methods of hate speech detection. Here, to contribute towards solving the task of hate speech detection, we worked with a simple ensemble of transformer models on a Twitter-based hate speech benchmark. Using this method, we attained a weighted F1-score of 0.8426, which we managed to further improve by leveraging more training data, achieving a weighted F1-score of 0.8504 and thus markedly outperforming the best-performing system in the literature. (A minimal sketch of the soft-voting ensemble idea follows this entry.)
  •  
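The ensembling idea referenced above can be illustrated with a short, hedged sketch: several transformer classifiers score the same tweets and their class probabilities are averaged (soft voting). The model names, label order, and averaging rule below are illustrative assumptions; the paper only states that a simple ensemble of transformer models was used, and in practice each model would first be fine-tuned on the HASOC training data.

```python
# Hedged sketch of a soft-voting transformer ensemble for hate speech detection.
# Model names and label order are assumptions; real use requires fine-tuning first.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAMES = ["bert-base-uncased", "roberta-base", "distilbert-base-uncased"]  # assumed

def ensemble_predict(texts):
    """Average class probabilities over several sequence classifiers (soft voting)."""
    summed = None
    for name in MODEL_NAMES:
        tokenizer = AutoTokenizer.from_pretrained(name)
        model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
        model.eval()
        enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
        with torch.no_grad():
            probs = torch.softmax(model(**enc).logits, dim=-1)
        summed = probs if summed is None else summed + probs
    # Label order (0 = not hate, 1 = hate) is an assumption for illustration only.
    return (summed / len(MODEL_NAMES)).argmax(dim=-1)

print(ensemble_predict(["an example tweet"]))
```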
2.
  • Alonso, Pedro, 1986-, et al. (author)
  • TheNorth at SemEval-2020 Task 12 : Hate Speech Detection using RoBERTa
  • 2020
  • In: The International Workshop on Semantic Evaluation. - : International Committee for Computational Linguistics. ; pp. 2197-2202
  • Conference paper (peer-reviewed) abstract
    • Hate speech detection on social media platforms is crucial as it helps to avoid severe harm to marginalized people and groups. The application of Natural Language Processing (NLP) and Deep Learning has garnered encouraging results in the task of hate speech detection. The expression of hate, however, is varied and ever-evolving; better detection systems therefore need to adapt to this variance. Because of this, researchers keep on collecting data and regularly come up with hate speech detection competitions. In this paper, we discuss our entry to one such competition, namely the English version of sub-task A of the OffensEval competition. Our contribution can be perceived through our results, which started at an F1-score of 0.9087 and, with the further refinements described here, climbed to 0.9166. This gives more support to our hypothesis that one of the variants of BERT, namely RoBERTa, can successfully differentiate between offensive and non-offensive tweets, given the proper preprocessing steps. (A minimal fine-tuning sketch follows this entry.)
  •  
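As referenced in the abstract, a hedged sketch of fine-tuning RoBERTa as a binary offensive/non-offensive classifier is shown below. The toy data, label convention, learning rate, and preprocessing are illustrative assumptions and not the authors' setup.

```python
# Hedged sketch: fine-tune roberta-base as a binary offensive-language classifier.
# Toy data and hyperparameters are placeholders, not the paper's configuration.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

texts = ["have a great day", "some offensive tweet"]   # placeholder tweets
labels = torch.tensor([0, 1])                           # 0 = not offensive, 1 = offensive (assumed)
enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):                                      # a few toy steps; real training iterates over batches
    loss = model(**enc, labels=labels).loss             # cross-entropy computed by the classification head
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```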
3.
  • Chhipa, Prakash Chandra, et al. (author)
  • Depth Contrast: Self-Supervised Pretraining on 3DPM Images for Mining Material Classification
  • Other publication (other academic/artistic) abstract
    • This work presents a novel self-supervised representation learning method to learn efficient representations without labels on images from a 3DPM sensor (3-Dimensional Particle Measurement; estimates the particle size distribution of material), utilizing RGB images and depth maps of mining material on the conveyor belt. Human annotations for material categories on sensor-generated data are scarce and cost-intensive, and representation learning without human annotations currently remains unexplored for mining materials and does not leverage the sensor-generated data. The proposed method, Depth Contrast, enables self-supervised learning of representations without labels on the 3DPM dataset by exploiting depth maps and inductive transfer. The proposed method outperforms ImageNet transfer learning on material classification in fully supervised learning settings and achieves an F1 score of 0.73. Further, the proposed method yields an F1 score of 0.65, an 11% improvement over ImageNet transfer learning, in a semi-supervised setting when only 20% of labels are used in fine-tuning. Finally, the proposed method shows improved performance generalization on linear evaluation. The implementation of the proposed method is available on GitHub. (A minimal contrastive-pairing sketch follows this entry.)
  •  
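The pairing of an RGB image with its depth map as the two contrastive views, as referenced in the abstract, can be sketched as below. The encoder, projection size, temperature, and NT-Xent formulation are common contrastive-learning defaults used here for illustration, not the authors' exact settings.

```python
# Hedged sketch of the Depth Contrast idea: RGB image and depth map form the
# positive pair for a SimCLR-style NT-Xent loss. Encoder and temperature are assumptions.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

encoder = resnet18(weights=None, num_classes=128)        # backbone acting as a 128-d projector (assumed)

def nt_xent(z1, z2, temperature=0.5):
    """Standard NT-Xent loss between two batches of paired embeddings."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                       # 2N x d
    sim = z @ z.t() / temperature                        # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))                    # exclude self-pairs
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

rgb = torch.randn(8, 3, 224, 224)                        # dummy RGB crops of conveyor-belt material
depth = torch.randn(8, 1, 224, 224).repeat(1, 3, 1, 1)   # depth map lifted to 3 channels (assumption)
loss = nt_xent(encoder(rgb), encoder(depth))
loss.backward()
```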
4.
  •  
5.
  • Chhipa, Prakash Chandra, et al. (author)
  • Magnification Prior: A Self-Supervised Method for Learning Representations on Breast Cancer Histopathological Images
  • 2023
  • In: Proceedings: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023). - : IEEE. - 9781665493468 ; pp. 2716-2726
  • Conference paper (peer-reviewed) abstract
    • This work presents a novel self-supervised pre-training method to learn efficient representations without labels on histopathology medical images, utilizing magnification factors. Other state-of-the-art works mainly focus on fully supervised learning approaches that rely heavily on human annotations. However, the scarcity of labeled and unlabeled data is a long-standing challenge in histopathology, and representation learning without labels remains largely unexplored in this domain. The proposed method, Magnification Prior Contrastive Similarity (MPCS), enables self-supervised learning of representations without labels on the small-scale breast cancer dataset BreakHis by exploiting the magnification factor, inductive transfer, and reduced human prior. The proposed method matches fully supervised state-of-the-art performance in malignancy classification when only 20% of labels are used in fine-tuning, and outperforms previous works in fully supervised learning settings for three public breast cancer datasets, including BreakHis. Further, it provides initial support for the hypothesis that reducing human prior leads to efficient representation learning in self-supervision, which will need further investigation. The implementation of this work is available online on GitHub. (A minimal magnification-pairing sketch follows this entry.)
  •  
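A hedged sketch of the magnification pairing referenced in the abstract is given below: two magnifications of the same specimen act as the positive pair for contrastive pretraining, in place of hand-crafted augmentations. The data layout and sampling rule are assumptions; BreakHis provides each specimen at 40X, 100X, 200X and 400X.

```python
# Hedged sketch: build positive pairs from two magnifications of the same specimen.
# The per-specimen dict layout and random pairing rule are illustrative assumptions.
import random
from torch.utils.data import Dataset

class MagnificationPairs(Dataset):
    """Yields two magnifications of the same histopathology specimen as a positive pair."""
    def __init__(self, samples, transform):
        # samples: list of dicts like {"40X": image, "100X": image, ...}, one per specimen
        self.samples = samples
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        views = self.samples[idx]
        m1, m2 = random.sample(sorted(views.keys()), 2)   # two distinct magnification factors
        return self.transform(views[m1]), self.transform(views[m2])

# Each (view1, view2) pair can then be fed to a contrastive objective such as
# the NT-Xent sketch shown after entry 3.
```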
6.
  • Chhipa, Prakash Chandra, 1986- (author)
  • Self-supervised Representation Learning for Visual Domains Beyond Natural Scenes
  • 2023
  • Licentiate thesis (other academic/artistic) abstract
    • This thesis investigates the possibility of efficiently adapting self-supervised representation learning to visual domains beyond natural scenes, e.g., medical imaging and non-RGB sensory images. The thesis contributes to i) formalizing the self-supervised representation learning paradigm in a unified conceptual framework and ii) proposing a hypothesis based on a supervision signal from the data, called the data-prior. Method adaptations following the hypothesis demonstrate significant progress in downstream task performance on microscopic histopathology and 3-Dimensional Particle Measurement (3DPM) mining-material non-RGB image domains.
Supervised learning has proven to obtain higher performance than unsupervised learning on computer vision downstream tasks, e.g., image classification, object detection, etc., but it imposes limitations due to its reliance on human supervision. To reduce human supervision, end-to-end learning, i.e., transfer learning, remains the proven approach for fine-tuning tasks but does not leverage unlabeled data. Representation learning in a self-supervised manner has successfully reduced the need for labelled data in the natural language processing and vision domains, and advances in learning effective visual representations without human supervision through self-supervised learning are thought-provoking.
This thesis performs a detailed conceptual analysis, method formalization, and literature study of the recent paradigm of self-supervised representation learning. The study's primary goal is to identify the common methodological limitations across the various approaches when adapting them to visual domains beyond natural scenes. The study finds a common component in the transformations that generate distorted views for invariant representation learning. A significant outcome of the study suggests that this component depends closely on human knowledge of the real world around the natural scene, which fits the visual domain of natural scenes well but remains sub-optimal for other, conceptually different visual domains.
A hypothesis is proposed to use the supervision signal from the data (data-prior) to replace the human-knowledge-driven transformations in self-supervised pretraining and thereby overcome the stated challenge. Two separate visual domains beyond natural scenes are considered to explore this hypothesis: breast cancer microscopic histopathology and 3DPM mining-material non-RGB images.
The first research paper explores breast cancer microscopic histopathology images by actualizing the data-prior hypothesis in terms of multiple magnification factors as the supervision signal from data, which is available in the public microscopic histopathology dataset BreakHis. It proposes a self-supervised representation learning method, Magnification Prior Contrastive Similarity, which adapts the contrastive learning approach by replacing the standard image view transformations (augmentations) with magnification factors. The contributions of the work are multi-fold: it achieves significant performance improvement in the downstream task of malignancy classification in both label-efficient and fully supervised settings, and the pretrained models show efficient knowledge transfer on two additional public datasets, supported by a qualitative analysis of the learned representations.
The second research paper investigates the 3DPM mining-material non-RGB image domain, where the material's pixel-mapped reflectance image and height (depth map) are captured. It actualizes the data-prior hypothesis by using depth maps of mining material on the conveyor belt. The proposed method, Depth Contrast, also adapts the contrastive learning method while replacing standard augmentations with depth maps for mining materials. It outperforms ImageNet transfer learning on material classification in fully supervised learning settings, in both fine-tuning and linear evaluation, and also shows consistent performance improvements in label-efficient settings.
In summary, the data-prior hypothesis shows one promising direction for optimal adaptation of contrastive learning methods in self-supervision to visual domains beyond natural scenes. However, a detailed study of the data-prior hypothesis is required to explore other, non-contrastive approaches of recent self-supervised representation learning, including knowledge distillation and information maximization.
  •  
7.
  • Keserwani, Prateek, et al. (author)
  • Quadbox : Quadrilateral bounding box based scene text detection using vector regression
  • 2021
  • In: IEEE Access. - : IEEE. - 2169-3536. ; 9, pp. 36802-36818
  • Journal article (peer-reviewed) abstract
    • Scene text appears with a wide range of sizes and arbitrary orientations. For detecting such text in a scene image, quadrilateral bounding boxes provide a much tighter fit than rotated rectangles. In this work, a vector regression method has been proposed for text detection in the wild to generate a quadrilateral bounding box. Bounding box prediction using direct regression requires predicting, from each position inside the quadrilateral, four vectors that vary drastically in length and orientation, which makes the vector prediction a difficult problem. To overcome this, we have proposed a centroid-centric vector regression that utilizes the geometry of the quadrilateral: we add the philosophy of indirect regression to direct regression by shifting all points within the quadrilateral to the centroid and afterwards performing vector regression from the shifted points. The experimental results show the improvement of the quadrilateral approach over the existing direct regression approach. The proposed method shows good performance on many existing public datasets, and it also demonstrates good results on an unseen dataset without being trained on it, which validates the approach's generalization ability. (A minimal sketch of the centroid-centric target construction follows this entry.)
  •  
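The centroid-centric target construction referenced in the abstract can be sketched as follows: rather than regressing the four corner vectors directly from each pixel, the targets are decomposed into a pixel-to-centroid vector plus four centroid-to-corner offsets. This decomposition is one reading of the abstract, not the authors' code.

```python
# Hedged sketch of centroid-centric regression targets for a quadrilateral box.
import numpy as np

def centroid_centric_targets(pixel_xy, quad_corners):
    """pixel_xy: (2,) position inside the quad; quad_corners: (4, 2) corner coordinates."""
    corners = np.asarray(quad_corners, dtype=float)
    pixel = np.asarray(pixel_xy, dtype=float)
    centroid = corners.mean(axis=0)              # centroid of the quadrilateral
    to_centroid = centroid - pixel               # one shared shift per pixel position
    corner_offsets = corners - centroid          # four short, similarly scaled vectors
    return to_centroid, corner_offsets

# At inference the corners are recovered as pixel + to_centroid + corner_offsets.
tc, offsets = centroid_centric_targets((50, 40), [(10, 10), (120, 20), (130, 90), (5, 80)])
print(offsets + tc + np.array([50.0, 40.0]))     # prints the original four corners
```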
8.
  • Keserwani, Prateek, et al. (author)
  • Robust Scene Text Detection for Partially Annotated Training Data
  • 2022
  • In: IEEE Transactions on Circuits and Systems for Video Technology (Print). - : Institute of Electrical and Electronics Engineers (IEEE). - 1051-8215 .- 1558-2205. ; 32:12, pp. 8635-8645
  • Journal article (peer-reviewed) abstract
    • This article analyzes the impact of training data containing un-annotated text instances, i.e., partial annotation, in scene text detection, and proposes a text region refinement approach to address it. Scene text detection is a problem that has attracted the attention of the research community for decades. Impressive results have been obtained for fully supervised scene text detection with recent deep learning approaches. These approaches, however, need a vast amount of completely labeled data, and the creation of such datasets is a challenging and time-consuming task. The research literature lacks an analysis of partially annotated training data for scene text detection. We have found that the performance of a generic scene text detection method drops significantly due to partial annotation of the training data. We have proposed a text region refinement method that provides robustness against partially annotated training data in scene text detection. The proposed method works as a two-tier scheme: text-probable regions are obtained in the first tier by applying a hybrid loss, which generates pseudo-labels to refine text regions in the second tier during training. Extensive experiments have been conducted on a dataset generated from ICDAR 2015 by dropping annotations at various drop rates, and on the publicly available SVT dataset. The proposed method exhibits a significant improvement over the baseline and existing approaches for partially annotated training data. (A minimal sketch of the pseudo-label loss idea follows this entry.)
  •  
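The two-tier pseudo-labelling idea referenced in the abstract is sketched below for a segmentation-style text detector: annotated text pixels are supervised normally, while un-annotated pixels are either turned into pseudo-positives (when the model is confident they are text) or ignored (when ambiguous). The thresholds and mixing rule are assumptions, not the paper's exact hybrid loss.

```python
# Hedged sketch of training with partially annotated text masks via pseudo-labels.
import torch
import torch.nn.functional as F

def partial_annotation_loss(pred_logits, partial_mask, pos_thresh=0.9, neg_thresh=0.1):
    """pred_logits, partial_mask: (B, 1, H, W); partial_mask is 1 only where text was annotated."""
    probs = torch.sigmoid(pred_logits).detach()
    annotated_text = partial_mask.bool()
    pseudo_text = (~annotated_text) & (probs > pos_thresh)     # likely un-annotated text
    confident_bg = (~annotated_text) & (probs < neg_thresh)    # likely true background
    keep = annotated_text | pseudo_text | confident_bg         # drop ambiguous pixels from the loss
    target = (annotated_text | pseudo_text).float()
    loss = F.binary_cross_entropy_with_logits(pred_logits, target, reduction="none")
    return loss[keep].mean()

pred = torch.randn(2, 1, 64, 64, requires_grad=True)           # dummy detector output
mask = (torch.rand(2, 1, 64, 64) > 0.95).float()               # sparse (partial) annotations
partial_annotation_loss(pred, mask).backward()
```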
9.
  • Kovács, Gyorgy, Postdoctoral researcher, 1984-, et al. (author)
  • Challenges of Hate Speech Detection in Social Media : Data Scarcity, and Leveraging External Resources
  • 2021
  • In: SN Computer Science. - Switzerland : Springer. - 2662-995X .- 2661-8907. ; 2:2
  • Journal article (peer-reviewed) abstract
    • The detection of hate speech in social media is a crucial task. The uncontrolled spread of hate has the potential to gravely damage our society and severely harm marginalized people or groups. A major arena for spreading hate speech online is social media. This significantly contributes to the difficulty of automatic detection, as social media posts include paralinguistic signals (e.g. emoticons and hashtags), and their linguistic content contains plenty of poorly written text. Another difficulty is presented by the context-dependent nature of the task and the lack of consensus on what constitutes hate speech, which makes the task difficult even for humans and makes the creation of large labeled corpora difficult and resource-consuming. The problem posed by ungrammatical text has been largely mitigated by the recent emergence of deep neural network (DNN) architectures that have the capacity to efficiently learn various features. For this reason, we proposed a deep natural language processing (NLP) model, combining convolutional and recurrent layers, for the automatic detection of hate speech in social media data. We applied our model to the HASOC2019 corpus and attained a macro F1 score of 0.63 in hate speech detection on the HASOC test set. The capacity of DNNs for efficient learning, however, also means an increased risk of overfitting, particularly with limited training data available (as was the case for HASOC). For this reason, we investigated different methods for expanding the resources used, exploring various opportunities such as leveraging unlabeled data and similarly labeled corpora, as well as the use of novel models. Our results showed that by doing so, it was possible to significantly increase the classification score attained. (A minimal sketch of a convolutional-recurrent classifier follows this entry.)
  •  
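The model family referenced in the abstract, combining convolutional and recurrent layers over token embeddings, can be sketched as below. The vocabulary size, layer widths, and the GRU choice are illustrative assumptions rather than the authors' architecture.

```python
# Hedged sketch of a convolutional-recurrent hate-speech classifier.
import torch
import torch.nn as nn

class ConvRecurrentClassifier(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=128, conv_channels=64, hidden=64, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, conv_channels, kernel_size=3, padding=1)
        self.gru = nn.GRU(conv_channels, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, classes)

    def forward(self, token_ids):                       # token_ids: (batch, seq_len) integer ids
        x = self.embed(token_ids).transpose(1, 2)       # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)    # local n-gram features
        _, h = self.gru(x)                              # h: (2, batch, hidden) for a 1-layer BiGRU
        h = torch.cat([h[0], h[1]], dim=1)              # concatenate both directions
        return self.out(h)

logits = ConvRecurrentClassifier()(torch.randint(0, 30000, (4, 32)))
print(logits.shape)                                     # torch.Size([4, 2])
```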
10.
  • Kovács, György, Postdoctoral researcher, 1984-, et al. (author)
  • Leveraging external resources for offensive content detection in social media
  • 2022
  • In: AI Communications. - : IOS Press. - 0921-7126 .- 1875-8452. ; 35:2, pp. 87-109
  • Journal article (peer-reviewed) abstract
    • Hate speech is a burning issue of today’s society that cuts across numerous strategic areas, including human rights protection, refugee protection, and the fight against racism and discrimination. The gravity of the subject is further demonstrated by António Guterres, the United Nations Secretary-General, calling it “a menace to democratic values, social stability, and peace”. One central platform for the spread of hate speech is the Internet and social media in particular. Thus, automatic detection of hateful and offensive content on these platforms is a crucial challenge that would strongly contribute to an equal and sustainable society when overcome. One significant difficulty in meeting this challenge is collecting sufficient labeled data. In our work, we examine how various resources can be leveraged to circumvent this difficulty. We carry out extensive experiments to exploit various data sources using different machine learning models, including state-of-the-art transformers. We have found that using our proposed methods, one can attain state-of-the-art performance detecting hate speech on Twitter (outperforming the winner of both the HASOC 2019 and HASOC 2020 competitions). It is observed that in general, adding more data improves the performance or does not decrease it. Even when using good language models and knowledge transfer mechanisms, the best results were attained using data from one or two additional data sets.
  •  
Publication type
conference paper (13)
journal article (9)
other publication (5)
licentiate thesis (2)
Content type
peer-reviewed (22)
other academic/artistic (7)
Author/editor
Saini, Rajkumar, Dr. ... (29)
Liwicki, Marcus (18)
Upadhyay, Richa (11)
Chhipa, Prakash Chan ... (8)
Mokayed, Hamam (6)
Rakesh, Sumit (6)
Kovács, György, Post ... (5)
Alonso, Pedro, 1986- (5)
Gupta, Vibha (5)
Liwicki, Foteini (4)
Roy, Partha Pratim (4)
De, Kanjar (4)
Uchida, Seiichi (4)
Abid, Nosheen, 1993- (2)
Eriksson, Johan (2)
Kumar, Pradeep (2)
Lindqvist, Lars (2)
Pal, Umapada (2)
Nordenskjold, Richar ... (2)
Keserwani, Prateek (2)
Prabhu, Sameer (2)
Rakesh, Sumit, 1987- (2)
Kumar, Rakesh (1)
Adewumi, Tosin, 1978 ... (1)
Kovács, György, 1984 ... (1)
Maki, Atsuto (1)
Grund Pihlgren, Gust ... (1)
Chhipa, Prakash Chan ... (1)
Liwicki, Marcus, Pro ... (1)
Lladós, Josep (1)
Dogra, Debi Prosad (1)
Faridghasemnia, Moha ... (1)
Javed, Saleha, 1990- (1)
Dhankhar, Ankit (1)
Kovács, Gyorgy, Post ... (1)
Kovács, György (1)
Adewumi, Oluwatosin (1)
Behera, Santosh Kuma ... (1)
Lavergne, Eric (1)
Murphy, Killian (1)
Alenezi, Fayadh (1)
Mishra, Ashish Ranja ... (1)
Shivakumara, Palaiah ... (1)
Chee Hin, Loo (1)
Pandey, Sachi (1)
Chouhan, Vikas (1)
Verma, Devanshi (1)
Rajrah, Shubham (1)
Santosh, KC (1)
Singh, Dinesh (1)
University
Luleå tekniska universitet (29)
Umeå universitet (1)
Örebro universitet (1)
Language
English (29)
Research subject (UKÄ/SCB)
Natural sciences (28)
Engineering and technology (4)
Medicine and health sciences (1)
Social sciences (1)
Humanities (1)
