SwePub
Search the SwePub database


Results for the search "WFRF:(Liwicki Marcus)"


  • Results 1-10 of 150
1.
  • Adewumi, Oluwatosin, 1978-, et al. (author)
  • Conversational Systems in Machine Learning from the Point of View of the Philosophy of Science—Using Alime Chat and Related Studies
  • 2019
  • In: Philosophies. - Switzerland : MDPI. - 2409-9287. ; 4:3
  • Journal article (peer-reviewed), abstract:
    • This essay discusses current research efforts in conversational systems from the philosophy of science point of view and evaluates some conversational systems research activities from the standpoint of the philosophical theory of naturalism. Conversational systems, or chatbots, have advanced over the decades and have now become mainstream applications. They are software that users can communicate with using natural language. Particular attention is given to the Alime Chat conversational system, already in industrial use, and the related research. The competitive nature of systems in production is a result of different researchers and developers trying to produce new conversational systems that can outperform previous or state-of-the-art systems. Different factors affect the quality of the conversational systems produced, and how one system is assessed as being better than another is a function of objectivity and of the relevant experimental results. This essay examines the research practices from, among others, Longino’s view on objectivity and Popper’s stand on falsification. Furthermore, the need for large, high-quality datasets is emphasized, in addition to the importance of the peer-review process in scientific publishing as a means of developing, validating, or rejecting theories, claims, or methodologies in the research community. In conclusion, open data and open scientific discussion fora should become more prominent than the merely publication-focused trend.
2.
  • Adewumi, Oluwatosin, 1978-, et al. (author)
  • Corpora Compared : The Case of the Swedish Gigaword & Wikipedia Corpora
  • 2020
  • Conference paper (peer-reviewed), abstract:
    • In this work, we show that the difference in performance between embeddings from differently sourced data for a given language can be due to factors other than data size. Natural language processing (NLP) tasks usually perform better with embeddings from bigger corpora; however, the breadth of the covered domain and the amount of noise can also play important roles. We evaluate embeddings based on two Swedish corpora, the Gigaword and Wikipedia corpora, in (intrinsic) analogy tests and discover that the embeddings from the Wikipedia corpus generally outperform those from the Gigaword corpus, which is the bigger of the two. Downstream tests will be required for a definitive evaluation.
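The intrinsic analogy evaluation described above can be reproduced in outline with gensim. A minimal sketch, assuming two word2vec-format vector files and a questions-words-style Swedish analogy file (all three file names are hypothetical placeholders):

```python
# Sketch: comparing two embeddings on an analogy test set with gensim.
from gensim.models import KeyedVectors

for name, path in [("gigaword", "sv_gigaword.vec"), ("wikipedia", "sv_wikipedia.vec")]:
    kv = KeyedVectors.load_word2vec_format(path, binary=False)
    # The analogy file uses the questions-words format: sections headed by
    # ": section-name", followed by lines of four words "a b c d" (a:b :: c:d).
    score, sections = kv.evaluate_word_analogies("swedish_analogies.txt")
    print(f"{name}: overall analogy accuracy = {score:.3f}")
```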
3.
  • Adewumi, Oluwatosin, 1978-, et al. (author)
  • Exploring Swedish & English fastText Embeddings
  • 2022
  • In: Artificial Intelligence and Cognition 2022, pp. 201-208
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we show that embeddings from relatively smaller corpora sometimes outperform those from larger corpora and we introduce a new Swedish analogy test set and make it publicly available. To achieve good performance in Natural Language Processing (NLP) downstream tasks, several factors play important roles: dataset size, the right hyper-parameters, and well-trained embeddings. We utilize the fastText tool for our experiments. We evaluate both the Swedish and English embeddings that we created using intrinsic evaluation (including analogy & Spearman correlation) and compare them with 2 common, publicly available embeddings. Our English continuous Bag-of-Words (CBoW)-negative sampling embedding shows better performance compared to the publicly available GoogleNews version. We also describe the relationship between NLP and cognitive science. We contribute the embeddings for research or other useful purposes by publicly releasing them.
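As a companion to the abstract above, a minimal gensim sketch of training a CBoW, negative-sampling fastText embedding; the corpus path and hyper-parameter values are illustrative assumptions, not the paper's settings:

```python
# Sketch: CBoW fastText embedding with negative sampling and character n-grams.
from gensim.models import FastText
from gensim.models.word2vec import LineSentence

sentences = LineSentence("corpus_sv.txt")  # one pre-tokenised sentence per line
model = FastText(
    sentences,
    sg=0,              # 0 = CBoW
    negative=5,        # negative sampling
    vector_size=300,
    window=5,
    min_count=5,
    min_n=3, max_n=6,  # character n-gram range; helpful for morphology-rich Swedish
    epochs=5,
)
model.wv.save("sv_fasttext.kv")
```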
4.
  • Adewumi, Oluwatosin, 1978-, et al. (author)
  • Exploring Swedish & English fastText Embeddings for NER with the Transformer
  • Other publication (other academic/artistic), abstract:
    • In this paper, our main contributions are showing that embeddings from relatively smaller corpora can outperform ones from far larger corpora and presenting a new Swedish analogy test set. To achieve good network performance in natural language processing (NLP) downstream tasks, several factors play important roles: dataset size, the right hyper-parameters, and well-trained embeddings. We show that, with the right set of hyper-parameters, good network performance can be reached even on smaller datasets. We evaluate the embeddings at the intrinsic and extrinsic levels, by deploying them on the Transformer in a named entity recognition (NER) task, and conduct significance tests. This is done for both Swedish and English. We obtain better performance in both languages on the downstream task with far smaller training data, compared to recently released, common crawl versions; and character n-grams appear useful for Swedish, a morphologically rich language.
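The significance tests mentioned above can take the form of a paired test over repeated runs. A minimal sketch with SciPy, using placeholder F1 scores rather than the paper's results:

```python
# Sketch: paired t-test between two embeddings' NER macro F1 across random seeds.
from scipy import stats

f1_smaller_corpus = [0.842, 0.851, 0.847, 0.839, 0.845]  # placeholder values
f1_larger_corpus  = [0.830, 0.836, 0.828, 0.834, 0.831]  # placeholder values

t, p = stats.ttest_rel(f1_smaller_corpus, f1_larger_corpus)
print(f"paired t-test: t = {t:.3f}, p = {p:.4f}")
# A small p-value suggests the gap is unlikely to be run-to-run noise alone.
```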
5.
  • Adewumi, Oluwatosin, 1978-, et al. (author)
  • Småprat : DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning
  • 2022
  • In: Vol. 3 (2022): Proceedings of the Northern Lights Deep Learning Workshop 2022. - : Septentrio Academic Publishing. - 2703-6928.
  • Conference paper (peer-reviewed), abstract:
    • Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates, by an empirical study, the potential for transfer learning of such models to the Swedish language. DialoGPT, an English-language pre-trained model, is adapted by training on three different Swedish conversational datasets obtained from publicly available sources: Reddit, Familjeliv and the GDC. Perplexity (an automated intrinsic metric) and human-evaluation surveys were used to assess the performance of the fine-tuned models. We also compare the DialoGPT experiments with an attention-mechanism-based seq2seq baseline model trained on the GDC dataset. The results indicate that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogues judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. The work agrees with the hypothesis that deep monolingual models learn some abstractions which generalize across languages. We contribute the code, datasets and model checkpoints and host the demos on the HuggingFace platform.
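Perplexity, the automated intrinsic metric used above, is the exponential of the model's mean cross-entropy on held-out text. A minimal sketch with HuggingFace transformers, using the public DialoGPT checkpoint and a placeholder sentence rather than the paper's fine-tuned models and data:

```python
# Sketch: perplexity of a causal language model on a held-out utterance.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small").eval()

ids = tok("Hej! Hur mår du idag?", return_tensors="pt").input_ids
with torch.no_grad():
    loss = model(ids, labels=ids).loss  # mean token cross-entropy
print(f"perplexity = {torch.exp(loss).item():.2f}")
```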
6.
  • Adewumi, Oluwatosin, 1978-, et al. (author)
  • T5 for Hate Speech, Augmented Data, and Ensemble
  • 2023
  • In: Sci. - : MDPI. - 2413-4155. ; 5:4
  • Journal article (peer-reviewed), abstract:
    • We conduct relatively extensive investigations of automatic hate speech (HS) detection using different State-of-The-Art (SoTA) baselines across 11 subtasks spanning six different datasets. Our motivation is to determine which of the recent SoTA models is best for automatic hate speech detection and what advantage, if any, methods such as data augmentation and ensembling may offer the best model. We carry out six cross-task investigations. We achieve new SoTA results on two subtasks—macro F1 scores of 91.73% and 53.21% for subtasks A and B of the HASOC 2020 dataset, surpassing previous SoTA scores of 51.52% and 26.52%, respectively. We achieve near-SoTA results on two others—macro F1 scores of 81.66% for subtask A of the OLID 2019 and 82.54% for subtask A of the HASOC 2021, in comparison to SoTA results of 82.9% and 83.05%, respectively. We perform error analysis and use two eXplainable Artificial Intelligence (XAI) algorithms (Integrated Gradient (IG) and SHapley Additive exPlanations (SHAP)) to reveal how two of the models (Bi-Directional Long Short-Term Memory Network (Bi-LSTM) and Text-to-Text-Transfer Transformer (T5)) make the predictions they do by using examples. Other contributions of this work are: (1) the introduction of a simple, novel mechanism for correcting Out-of-Class (OoC) predictions in T5, (2) a detailed description of the data augmentation methods, and (3) the revelation of the poor data annotations in the HASOC 2021 dataset by using several examples and XAI (buttressing the need for better quality control). We publicly release our model checkpoints and codes to foster transparency.
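Macro F1 averages the per-class F1 scores, so minority classes weigh as much as majority ones, and a majority-vote ensemble is one simple way to combine baselines like those above. A toy sketch with scikit-learn; the labels and predictions are placeholders, not the HASOC/OLID results:

```python
# Sketch: macro F1 for individual models and a majority-vote ensemble (binary labels).
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
preds = np.array([
    [0, 1, 1, 0, 0, 0, 1, 1],  # e.g. T5 predictions (placeholder)
    [0, 1, 0, 0, 1, 0, 1, 1],  # e.g. Bi-LSTM predictions (placeholder)
    [1, 1, 1, 0, 1, 0, 0, 1],  # e.g. a third baseline (placeholder)
])
y_vote = (preds.sum(axis=0) > preds.shape[0] / 2).astype(int)  # majority vote

for name, y in [("ensemble", y_vote)] + [(f"model {i}", p) for i, p in enumerate(preds)]:
    print(name, round(f1_score(y_true, y, average="macro"), 3))
```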
7.
  • Adewumi, Oluwatosin, 1978-, et al. (author)
  • Vector Representations of Idioms in Conversational Systems
  • 2022
  • In: Sci. - : MDPI. - 2413-4155. ; 4:4
  • Journal article (peer-reviewed), abstract:
    • In this study, we demonstrate that an open-domain conversational system trained on idioms or figurative language generates more fitting responses to prompts containing idioms. Idioms are a part of everyday speech in many languages and across many cultures, but they pose a great challenge for many natural language processing (NLP) systems that involve tasks such as information retrieval (IR), machine translation (MT), and conversational artificial intelligence (AI). We utilized the Potential Idiomatic Expression (PIE)-English idiom corpus for the two tasks that we investigated: classification and conversation generation. We achieved a state-of-the-art (SoTA) result of a 98% macro F1 score on the classification task by using the SoTA T5 model. We experimented with three instances of the SoTA dialogue model—the Dialogue Generative Pre-trained Transformer (DialoGPT)—for conversation generation. Their performances were evaluated using the automatic metric perplexity and a human evaluation. The results showed that the model trained on the idiom corpus generated more fitting responses to prompts containing idioms 71.9% of the time in comparison with a similar model that was not trained on the idiom corpus. We have contributed the model checkpoint/demo/code to the HuggingFace hub for public access.
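T5 treats classification as text-to-text: the input is a prefixed sentence and the "prediction" is a generated label word. A minimal inference sketch; the prefix and label vocabulary are illustrative assumptions, and a stock t5-small only produces meaningful labels after fine-tuning on an idiom corpus:

```python
# Sketch: text-to-text classification with T5 (labels are generated words).
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

ids = tok("classify idiom: He kicked the bucket last year.", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=3)
print(tok.decode(out[0], skip_special_tokens=True))  # e.g. "idiomatic" once fine-tuned
```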
8.
  • Adewumi, Oluwatosin, 1978- (author)
  • Vector Representations of Idioms in Data-Driven Chatbots for Robust Assistance
  • 2022
  • Doctoral thesis (other academic/artistic), abstract:
    • This thesis presents resources capable of enhancing solutions of some Natural Language Processing (NLP) tasks, demonstrates the learning of abstractions by deep models through cross-lingual transferability, and shows how deep learning models trained on idioms can enhance open-domain conversational systems. The challenges of open-domain conversational systems are many and include bland repetitive utterances, lack of utterance diversity, lack of training data for low-resource languages, shallow world-knowledge and non-empathetic responses, among others. These challenges contribute to the non-human-like utterances that open-domain conversational systems suffer from. They have hence motivated the active research in Natural Language Understanding (NLU) and Natural Language Generation (NLG), considering the very important role conversations (or dialogues) play in human lives. The methodology employed in this thesis involves an iterative set of scientific methods. First, it conducts a systematic literature review to identify the state-of-the-art (SoTA) and gaps, such as the challenges mentioned earlier, in current research. Subsequently, it follows the seven stages of the Machine Learning (ML) life-cycle, which are data gathering (or acquisition), data preparation, model selection, training, evaluation with hyperparameter tuning, prediction and model deployment. For data acquisition, relevant datasets are acquired or created, using benchmark datasets as references, and their data statements are included. Specific contributions of this thesis are the creation of the Swedish analogy test set for evaluating word embeddings and the Potential Idiomatic Expression (PIE)-English idioms corpus for training models in idiom identification and classification. In order to create a benchmark, this thesis performs human evaluation on the generated predictions of some SoTA ML models, including DialoGPT. As different individuals may not agree on all the predictions, the Inter-Annotator Agreement (IAA) is measured. A typical method for measuring IAA is Fleiss Kappa; however, it has a number of shortcomings, including high sensitivity to the number of categories being evaluated. Therefore, this thesis introduces the credibility unanimous score (CUS), which is more intuitive, easier to calculate and seemingly less sensitive to changes in the number of categories being evaluated. The results of human evaluation and comments from evaluators provide valuable feedback on the existing challenges within the models. These create the opportunity for addressing such challenges in future work. The experiments in this thesis test two hypotheses: 1) an open-domain conversational system that is idiom-aware generates more fitting responses to prompts containing idioms, and 2) deep monolingual models learn some abstractions that generalise across languages. To investigate the first hypothesis, this thesis trains English models on the PIE-English idioms corpus for classification and generation. For the second hypothesis, it explores cross-lingual transferability from English models to Swedish, Yorùbá, Swahili, Wolof, Hausa, Nigerian Pidgin English and Kinyarwanda.
From the results, the thesis’ additional contributions mainly lie in 1) confirmation of the hypothesis that an open-domain conversational system that is idiom-aware generates more fitting responses to prompts containing idioms, 2) confirmation of the hypothesis that deep monolingual models learn some abstractions that generalise across languages, 3) introduction of CUS and its benefits, 4) insight into the energy-saving and time-saving benefits of more optimal embeddings from relatively smaller corpora, and 5) provision of public access to the model checkpoints that were developed from this work. We further discuss the ethical issues involved in developing robust, open-domain conversational systems. Parts of this thesis are already published in the form of peer-reviewed journal and conference articles.
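For reference, the baseline IAA metric that the thesis compares its CUS against can be sketched directly; CUS itself is not reproduced here, since its formula is not given in this listing. A minimal Fleiss' kappa implementation with toy ratings:

```python
# Sketch: Fleiss' kappa for inter-annotator agreement.
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """counts[i, j] = number of annotators who put item i in category j."""
    n = counts.sum(axis=1)[0]                      # raters per item (assumed constant)
    p_j = counts.sum(axis=0) / counts.sum()        # overall category proportions
    P_i = ((counts ** 2).sum(axis=1) - n) / (n * (n - 1))  # per-item agreement
    P_bar, P_e = P_i.mean(), (p_j ** 2).sum()
    return (P_bar - P_e) / (1 - P_e)

# Toy data: 4 dialogue responses, 3 annotators, categories "human-like" / "not".
ratings = np.array([[3, 0], [2, 1], [1, 2], [3, 0]])
print(f"Fleiss' kappa = {fleiss_kappa(ratings):.3f}")
```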
9.
  • Adewumi, Oluwatosin, 1978-, et al. (author)
  • Word2Vec: Optimal Hyper-Parameters and Their Impact on NLP Downstream Tasks
  • Other publication (other academic/artistic), abstract:
    • Word2Vec is a prominent model for natural language processing (NLP) tasks. Similar inspiration is found in distributed embeddings for new state-of-the-art (SotA) deep neural networks. However, the wrong combination of hyper-parameters can produce poor-quality vectors. The objective of this work is to empirically show that an optimal combination of hyper-parameters exists and to evaluate various combinations. We compare them with the released, pre-trained original word2vec model. Both intrinsic and extrinsic (downstream) evaluations, including named entity recognition (NER) and sentiment analysis (SA), were carried out. The downstream tasks reveal that the best model is usually task-specific, that high analogy scores do not necessarily correlate positively with F1 scores, and that the same applies to a focus on data size alone. Increasing the vector dimension beyond a point leads to poor quality or performance. If ethical considerations to save time, energy and the environment are made, then reasonably smaller corpora may do just as well or even better in some cases. Besides, using a small corpus, we obtain better human-assigned WordSim scores, a better corresponding Spearman correlation and better downstream performances (with significance tests) compared to the original model, which was trained on a 100 billion-word corpus.
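The hyper-parameter search described above can be outlined as a small grid sweep in gensim; the corpus path and grid values are illustrative assumptions, and questions-words.txt refers to the standard Google analogy file:

```python
# Sketch: sweeping a few Word2Vec hyper-parameters and scoring each on analogies.
from itertools import product
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

sentences = LineSentence("corpus_en.txt")  # placeholder corpus path
for sg, window, dim in product([0, 1], [4, 8], [100, 300]):
    model = Word2Vec(sentences, sg=sg, window=window, vector_size=dim,
                     negative=5, min_count=5, epochs=5)
    acc, _ = model.wv.evaluate_word_analogies("questions-words.txt")
    print(f"sg={sg} window={window} dim={dim}: analogy accuracy {acc:.3f}")
```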
10.
  • Adewumi, Oluwatosin, 1978-, et al. (author)
  • Word2Vec: Optimal hyperparameters and their impact on natural language processing downstream tasks
  • 2022
  • In: Open Computer Science. - : Walter de Gruyter. - 2299-1093. ; 12:1, pp. 134-141
  • Journal article (peer-reviewed), abstract:
    • Word2Vec is a prominent model for natural language processing tasks. Similar inspiration is found in distributed embeddings (word-vectors) in recent state-of-the-art deep neural networks. However, the wrong combination of hyperparameters can produce embeddings of poor quality. The objective of this work is to empirically show that an optimal combination of Word2Vec hyperparameters exists and to evaluate various combinations. We compare them with the publicly released, original Word2Vec embedding. Both intrinsic and extrinsic (downstream) evaluations are carried out, including named entity recognition and sentiment analysis. Our main contributions include showing that the best model is usually task-specific, that high analogy scores do not necessarily correlate positively with F1 scores, and that performance is not dependent on data size alone. If ethical considerations to save time, energy, and the environment are made, then relatively smaller corpora may do just as well or even better in some cases. Increasing the dimension size of embeddings beyond a point leads to poor quality or performance. In addition, using a relatively small corpus, we obtain better WordSim scores, a better corresponding Spearman correlation, and better downstream performances (with significance tests) compared to the original model, which is trained on a 100 billion-word corpus.
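WordSim-style evaluation ranks word pairs by model cosine similarity and correlates the ranking with human ratings via Spearman's rho. A minimal gensim sketch with placeholder paths:

```python
# Sketch: Spearman correlation against human word-similarity judgements.
from gensim.models import KeyedVectors

kv = KeyedVectors.load("my_word2vec.kv")  # placeholder: a saved, trained embedding
pearson, spearman, oov = kv.evaluate_word_pairs("wordsim353.tsv")  # tab-separated pairs
print(f"Spearman rho = {spearman[0]:.3f} (OOV: {oov:.1f}%)")
```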
Type of publication
conference paper (78)
journal article (41)
other publication (11)
licentiate thesis (9)
research review (6)
doctoral thesis (4)
book chapter (1)
Type of content
peer-reviewed (115)
other academic/artistic (34)
Author/editor
Liwicki, Marcus (148)
Liwicki, Foteini (27)
Stricker, Didier (23)
Afzal, Muhammad Zesh ... (20)
Saini, Rajkumar, Dr. ... (19)
Mokayed, Hamam (16)
Pagani, Alain (16)
Adewumi, Oluwatosin, ... (13)
Hashmi, Khurram Azee ... (13)
Sandin, Fredrik, 197 ... (13)
Upadhyay, Richa (13)
Ingold, Rolf (12)
Chhipa, Prakash Chan ... (12)
Pondenkandath, Vinay ... (10)
Alberti, Michele (9)
Almqvist, Andreas (9)
Abid, Nosheen, 1993- (8)
Kovács, György, Post ... (8)
Seuret, Mathias (8)
Usman, Ali (8)
Adewumi, Tosin, 1978 ... (7)
Grund Pihlgren, Gust ... (7)
Alonso, Pedro, 1986- (6)
Javed, Saleha, 1990- (6)
Delsing, Jerker, 195 ... (5)
Nikolaidou, Konstant ... (5)
Kovács, György, 1984 ... (5)
De, Kanjar (5)
Uchida, Seiichi (5)
Rakesh, Sumit (5)
Saini, Rajkumar (4)
Belay, Birhanu (4)
Habtegebrial, Tewodr ... (4)
Nazir, Danish (4)
Gupta, Vibha (4)
Shankar, Priyamvada, ... (4)
Fischer, Andreas (3)
Brännvall, Rickard, ... (3)
Sabry, Sana Sabah (3)
Ahmad, Riaz (3)
Shridhar, Kumar (3)
Sandin, Fredrik (3)
Belay, Gebeyehu (3)
Pal, Umapada (3)
Chopra, Muskaan (3)
Gupta, Varun (3)
Nilsson, Jacob (3)
Park, Cheol Woo (3)
Taal, Cees (3)
Shivakumara, Palaiah ... (3)
Institution
Luleå tekniska universitet (149)
RISE (7)
Umeå universitet (2)
Uppsala universitet (2)
Örebro universitet (1)
Blekinge Tekniska Högskola (1)
Language
English (150)
Research subject (UKÄ/SCB)
Natural sciences (129)
Engineering and technology (39)
Social sciences (4)
Medical and health sciences (2)
Humanities (2)
Agricultural sciences (1)
