SwePub

Result list for search "hsv:(NATURVETENSKAP) hsv:(Data och informationsvetenskap) hsv:(Språkteknologi) srt2:(2020-2024)"

  • Results 1-10 of 914
1.
  • Norlund, Tobias, 1991, et al. (author)
  • Transferring Knowledge from Vision to Language: How to Achieve it and how to Measure it?
  • 2021
  • In: Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pp. 149-162, Punta Cana, Dominican Republic. - : Association for Computational Linguistics.
  • Conference paper (peer-reviewed), abstract:
    • Large language models are known to suffer from the hallucination problem in that they are prone to output statements that are false or inconsistent, indicating a lack of knowledge. A proposed solution to this is to provide the model with additional data modalities that complement the knowledge obtained through text. We investigate the use of visual data to complement the knowledge of large language models by proposing a method for evaluating visual knowledge transfer to text for uni- or multimodal language models. The method is based on two steps, 1) a novel task querying for knowledge of memory colors, i.e. typical colors of well-known objects, and 2) filtering of model training data to clearly separate knowledge contributions. Additionally, we introduce a model architecture that involves a visual imagination step and evaluate it with our proposed method. We find that our method can successfully be used to measure visual knowledge transfer capabilities in models and that our novel model architecture shows promising results for leveraging multimodal knowledge in a unimodal setting.
2.
  • Al Sabbagh, Khaled, 1987, et al. (author)
  • Improving Data Quality for Regression Test Selection by Reducing Annotation Noise
  • 2020
  • In: Proceedings - 46th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2020, pp. 191-194
  • Conference paper (peer-reviewed), abstract:
    • Big data and machine learning models have been increasingly used to support software engineering processes and practices. One example is the use of machine learning models to improve test case selection in continuous integration. However, one of the challenges in building such models is the identification and reduction of noise that often comes in large data. In this paper, we present a noise reduction approach that deals with the problem of contradictory training entries. We empirically evaluate the effectiveness of the approach in the context of selective regression testing. For this purpose, we use a curated training set as input to a tree-based machine learning ensemble and compare the classification precision, recall, and f-score against a non-curated set. Our study shows that using the noise reduction approach on the training instances gives better results in prediction with an improvement of 37% on precision, 70% on recall, and 59% on f-score.
3.
  • Samoaa, Hazem Peter, et al. (author)
  • A systematic mapping study of source code representation for deep learning in software engineering
  • 2022
  • In: IET Software. - : Institution of Engineering and Technology (IET). - 1751-8806 .- 1751-8814. ; 16:4, pp. 351-385
  • Journal article (peer-reviewed), abstract:
    • The usage of deep learning (DL) approaches for software engineering has attracted much attention, particularly in source code modelling and analysis. However, in order to use DL, source code needs to be formatted to fit the expected input form of DL models. This problem is known as source code representation. Source code can be represented via different approaches, most importantly, the tree-based, token-based, and graph-based approaches. We use a systematic mapping study to investigate in detail the representation approaches adopted in 103 studies that use DL in the context of software engineering. Thus, studies are collected from 2014 to 2021 from 14 different journals and 27 conferences. We show that each way of representing source code can provide a different, yet orthogonal view of the same source code. Thus, different software engineering tasks might require different (combinations of) code representation approaches, depending on the nature and complexity of the task. Particularly, we show that it is crucial to define whether the DL approach requires lexical, syntactical, or semantic code information. Our analysis shows that a wide range of different representations and combinations of representations (hybrid representations) are used to solve a wide range of common software engineering problems. However, we also observe that current research does not generally attempt to transfer existing representations or models to other studies even though there are other contexts in which these representations and models may also be useful. We believe that there is potential for more reuse and the application of transfer learning when applying DL to software engineering tasks.
4.
  • Barreiro, Anabela, et al. (author)
  • Multi3Generation : Multitask, Multilingual, Multimodal Language Generation
  • 2022
  • In: Proceedings of the 23rd Annual Conference of the European Association for Machine Translation. - : European Association for Machine Translation, pp. 345-346
  • Conference paper (peer-reviewed), abstract:
    • This paper presents the Multitask, Multilingual, Multimodal Language Generation COST Action – Multi3Generation (CA18231), an interdisciplinary network of research groups working on different aspects of language generation. This "meta-paper" will serve as reference for citation of the Action in future publications. It presents the objectives, challenges and the links for the achieved outcomes.
5.
  • Lindgren, Helena, Professor, et al. (author)
  • The wasp-ed AI curriculum : A holistic curriculum for artificial intelligence
  • 2023
  • In: INTED2023 Proceedings. - : IATED. - 9788409490264, pp. 6496-6502
  • Conference paper (peer-reviewed), abstract:
    • Efforts in lifelong learning and competence development in Artificial Intelligence (AI) have been on the rise for several years. These initiatives have mostly been applied to Science, Technology, Engineering and Mathematics (STEM) disciplines. Even though there has been significant development in Digital Humanities to incorporate AI methods and tools in higher education, the potential for such competences in Arts, Humanities and Social Sciences is far from being realised. Furthermore, there is an increasing awareness that the STEM disciplines need to include competences relating to AI in humanity and society. This is especially important considering the widening and deepening of the impact of AI on society at large and individuals. The aim of the presented work is to provide a broad and inclusive AI Curriculum that covers the breadth of the topic as it is seen today, which is significantly different from only a decade ago. It is important to note that with the curriculum we mean an overview of the subject itself, rather than a particular education program. The curriculum is intended to be used as a foundation for educational activities in AI to for example harmonize terminology, compare different programs, and identify educational gaps to be filled. An important aspect of the curriculum is the ethical, legal, and societal aspects of AI and to not limit the curriculum to the STEM subjects, instead extending to a holistic, human-centred AI perspective. The curriculum is developed as part of the national research program WASP-ED, the Wallenberg AI and transformative technologies education development program. 
6.
  • Singh, Avinash, 1986-, et al. (author)
  • Verbal explanations by collaborating robot teams
  • 2021
  • In: Paladyn - Journal of Behavioral Robotics. - : De Gruyter Open. - 2080-9778 .- 2081-4836. ; 12:1, pp. 47-57
  • Journal article (peer-reviewed), abstract:
    • In this article, we present work on collaborating robot teams that use verbal explanations of their actions and intentions in order to be more understandable to the human. For this, we introduce a mechanism that determines what information the robots should verbalize in accordance with Grice’s maxim of quantity, i.e., convey as much information as is required and no more or less. Our setup is a robot team collaborating to achieve a common goal while explaining in natural language what they are currently doing and what they intend to do. The proposed approach is implemented on three Pepper robots moving objects on a table. It is evaluated by human subjects answering a range of questions about the robots’ explanations, which are generated using either our proposed approach or two further approaches implemented for evaluation purposes. Overall, we find that our proposed approach leads to the most understanding of what the robots are doing. In addition, we further propose a method for incorporating policies driving the distribution of tasks among the robots, which may further support understandability.
7.
  • Ryazanov, Igor, et al. (author)
  • Deep Learning for Deep Waters: An Expert-in-the-Loop Machine Learning Framework for Marine Sciences
  • 2021
  • In: Journal of Marine Science and Engineering. - : MDPI AG. - 2077-1312. ; 9:2
  • Journal article (peer-reviewed), abstract:
    • Driven by the unprecedented availability of data, machine learning has become a pervasive and transformative technology across industry and science. Its importance to marine science has been codified as one goal of the UN Ocean Decade. While increasing amounts of, for example, acoustic marine data are collected for research and monitoring purposes, and machine learning methods can achieve automatic processing and analysis of acoustic data, they require large training datasets annotated or labelled by experts. Consequently, addressing the relative scarcity of labelled data is, besides increasing data analysis and processing capacities, one of the main thrust areas. One approach to address label scarcity is the expert-in-the-loop approach which allows analysis of limited and unbalanced data efficiently. Its advantages are demonstrated with our novel deep learning-based expert-in-the-loop framework for automatic detection of turbulent wake signatures in echo sounder data. Using machine learning algorithms, such as the one presented in this study, greatly increases the capacity to analyse large amounts of acoustic data. It would be a first step in realising the full potential of the increasing amount of acoustic data in marine sciences.
8.
  • Fredriksson, Teodor, 1992, et al. (author)
  • Machine Learning Algorithms for Labeling: Where and How They are Used?
  • 2022
  • In: SysCon 2022 - 16th Annual IEEE International Systems Conference, Proceedings.
  • Conference paper (peer-reviewed), abstract:
    • With the increased availability of new and better computer processing units (CPUs) as well as graphical processing units (GPUs), the interest in statistical learning and deep learning algorithms for classification tasks has grown exponentially. These classification algorithms often require the presence of fully labeled instances during the training period for maximum classification accuracy. However, in industrial applications, data is commonly not fully labeled, which both reduces the prediction accuracy of the learning algorithms and increases the project cost to label the missing instances. The purpose of this paper is to survey the current state-of-the-art literature on machine learning algorithms that are used for assisted or automatic labeling and to understand where these are used. We performed a systematic mapping study and identified 52 primary studies relevant to our research. This paper provides three main contributions. First, we identify the existing machine learning algorithms for labeling and we present a taxonomy of these algorithms. Second, we identify the datasets that are used to evaluate the algorithms and we provide a mapping of the datasets based on the type of data and the application area. Third, we provide a process to support people in industry to optimally label their dataset. The results presented in this paper can be used by both researchers and practitioners aiming to improve the missing labels with the aid of machine learning algorithms or to select appropriate datasets to compare new state-of-the-art algorithms in their respective application area.
9.
  • Fredriksson, Teodor, 1992, et al. (author)
  • Machine learning models for automatic labeling: A systematic literature review
  • 2020
  • In: ICSOFT 2020 - Proceedings of the 15th International Conference on Software Technologies. - : SCITEPRESS - Science and Technology Publications, pp. 552-566
  • Conference paper (peer-reviewed), abstract:
    • Automatic labeling is a type of classification problem. Classification has been studied with the help of statistical methods for a long time. With the explosion of new and better computer processing units (CPUs) and graphical processing units (GPUs), the interest in machine learning has grown exponentially, and we can use both statistical learning algorithms and deep neural networks (DNNs) to solve classification tasks. Classification is a supervised machine learning problem and there exists a large amount of methodology for performing such tasks. However, it is very rare in industrial applications that data is fully labeled, which is why we need good methodology to obtain error-free labels. The purpose of this paper is to examine the current literature on how to perform labeling using ML; we compare these models in terms of popularity and the data types they are used on. We performed a systematic literature review of empirical studies on machine learning for labeling. We identified 43 primary studies relevant to our search. From this we were able to determine the most common machine learning models for labeling. Lack of labeled instances is a major problem for industry, as supervised learning is the most widely used. Obtaining labels is costly in terms of labor and financial costs. Based on our findings in this review we present alternate ways for labeling data for use in supervised learning tasks.
10.
  • Hagström, Lovisa, 1995 (author)
  • A Picture is Worth a Thousand Words: Natural Language Processing in Context
  • 2023
  • Licentiate thesis (other academic/artistic), abstract:
    • Modern NLP models learn language from lexical co-occurrences. While this method has allowed for significant breakthroughs, it has also exposed potential limitations of modern NLP methods. For example, NLP models are prone to hallucinate, represent a biased world view and may learn spurious correlations to solve the data instead of the task at hand. This is to some extent the consequence of training the models exclusively on text. In text, concepts are only defined by the words that accompany them and the information in text is incomplete due to reporting bias. In this work, we investigate whether additional context in the form of multimodal information can be used to improve on the representations of modern NLP models. Specifically, we consider BERT-based vision-and-language models that receive additional context from images. We hypothesize that visual training primarily should improve on the visual commonsense knowledge, i.e. obvious knowledge about visual properties, of the models. To probe for this knowledge we develop the evaluation tasks Memory Colors and Visual Property Norms. Generally, we find that the vision-and-language models considered do not outperform unimodal model counterparts. In addition to this, we find that the models switch their answer depending on prompt when evaluated for the same type of knowledge. We conclude that more work is needed on understanding and developing vision-and-language models, and that extra focus should be put on how to successfully fuse image and language processing. We also reconsider the usefulness of measuring commonsense knowledge in models that cannot represent factual knowledge.
Publication type
conference paper (550)
journal article (179)
book chapter (67)
proceedings (editorship) (28)
doctoral thesis (22)
other publication (20)
report (13)
collection (editorship) (12)
licentiate thesis (9)
research review (8)
book (5)
artistic work (2)
Content type
peer-reviewed (741)
other academic/artistic (165)
popular science, debate, etc. (4)
Author/editor
Dobnik, Simon, 1977 (52)
Borin, Lars, 1957 (39)
Bernardy, Jean-Phili ... (32)
Volodina, Elena, 197 ... (29)
Kokkinakis, Dimitrio ... (26)
Nivre, Joakim, 1962- (25)
Skantze, Gabriel, 19 ... (23)
Johansson, Richard, ... (21)
Dannélls, Dana, 1976 (21)
Berdicevskis, Aleksa ... (21)
Howes, Christine, 19 ... (21)
Larsson, Staffan, 19 ... (21)
Stymne, Sara, 1977- (20)
Ilinykh, Nikolai, 19 ... (20)
Tahmasebi, Nina, 198 ... (19)
Cooper, Robin, 1947 (18)
Maraev, Vladislav, 1 ... (18)
Jönsson, Arne, 1955- (17)
Megyesi, Beáta, 1971 ... (17)
Breitholtz, Ellen (16)
Hengchen, Simon, 198 ... (16)
Chatzikyriakidis, St ... (15)
Megyesi, Beáta, Prof ... (15)
Sayeed, Asad, 1980 (14)
Liwicki, Marcus (12)
Székely, Eva (12)
Adesam, Yvonne, 1975 (12)
Ek, Adam, 1990 (12)
Saxena, Anju, 1959- (12)
Hardmeier, Christian (12)
Loáiciga, Sharid, 19 ... (12)
Virk, Shafqat, 1979 (12)
Alfter, David, 1986 (11)
Kopal, Nils (11)
Forsberg, Markus, 19 ... (10)
Edlund, Jens, Docent ... (10)
Vakili, Thomas (10)
Kurfalı, Murathan, 1 ... (10)
Dubossarsky, Haim (10)
Hammarlin, Mia-Marie (10)
Beskow, Jonas (9)
Noble, Bill (9)
Fornés, Alicia (9)
Gustafsson, Joakim, ... (9)
Kalpakchi, Dmytro (9)
Drewes, Frank (8)
Henriksson, Aron, 19 ... (8)
Boye, Johan (8)
Muñoz Sánchez, Ricar ... (8)
Waldispühl, Michelle ... (8)
Higher education institution
Göteborgs universitet (380)
Uppsala universitet (166)
Kungliga Tekniska Högskolan (108)
Chalmers tekniska högskola (90)
Stockholms universitet (57)
Linköpings universitet (57)
Umeå universitet (43)
Lunds universitet (31)
RISE (29)
Luleå tekniska universitet (20)
Linnéuniversitetet (15)
Högskolan i Halmstad (11)
Institutet för språk och folkminnen (8)
Blekinge Tekniska Högskola (7)
Jönköping University (6)
Högskolan i Borås (5)
Örebro universitet (3)
Högskolan i Skövde (3)
Karolinska Institutet (3)
Mälardalens universitet (2)
Malmö universitet (2)
Mittuniversitetet (2)
Karlstads universitet (2)
Sveriges Lantbruksuniversitet (2)
Högskolan Kristianstad (1)
Södertörns högskola (1)
Enskilda Högskolan Stockholm (1)
Language
English (895)
Swedish (16)
Estonian (2)
German (1)
Research subject (UKÄ/SCB)
Natural sciences (914)
Humanities (278)
Social sciences (67)
Engineering and technology (55)
Medical and health sciences (16)
