SwePub
Search the SwePub database

  Advanced search

Result list for search "WFRF:(Santini Marina 1960 ) "

Search: WFRF:(Santini Marina 1960 )

  • Results 1-10 of 19
1.
  • Blomqvist, Eva, et al. (authors)
  • Towards causal knowledge graphs - position paper
  • 2020
  • In: CEUR Workshop Proceedings. - : CEUR-WS. ; pp. 58-62
  • Conference paper (peer-reviewed), abstract:
    • In this position paper, we highlight that being able to analyse the cause-effect relationships for determining the causal status among a set of events is an essential requirement in many contexts and argue that it cannot be overlooked when building systems targeting real-world use cases. This is especially true for medical contexts where the understanding of the cause(s) of a symptom, or observation, is of vital importance. However, most approaches purely based on Machine Learning (ML) do not explicitly represent and reason with causal relations, and may therefore mistake correlation for causation. In the paper, we therefore argue for an approach to extract causal relations from text, and represent them in the form of Knowledge Graphs (KG), to empower downstream ML applications, or AI systems in general, with the ability to distinguish correlation from causation and reason with causality in an explicit manner. So far, the bottlenecks in KG creation have been scalability and accuracy of automated methods; hence, we argue that two novel features are required from methods for addressing these challenges, i.e. (i) the use of Knowledge Patterns to guide the KG generation process towards a certain resulting knowledge structure, and (ii) the use of a semantic referee to automatically curate the extracted knowledge. We claim that this will be an important step forward for supporting interpretable AI systems, and integrating ML and knowledge representation approaches, such as KGs, which should also generalise well to other types of relations, apart from causality. © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2.
  • Brännvall, Rickard, 1975-, et al. (authors)
  • Homomorphic encryption enables private data sharing for digital health : Winning entry to the Vinnova innovation competition Vinter 2021-22
  • 2022
  • In: 34th Workshop of the Swedish Artificial Intelligence Society, SAIS 2022. - : Institute of Electrical and Electronics Engineers Inc. - 9781665471268
  • Conference paper (peer-reviewed), abstract:
    • People living with type 1 diabetes often use several apps and devices that help them collect and analyse data for better monitoring and management of their disease. When such health-related data is analysed in the cloud, one must always carefully consider privacy protection and adhere to laws regulating the use of personal data. In this paper we present our experience at the pilot Vinter competition 2021-22 organised by Vinnova. The competition focused on digital services that handle sensitive diabetes-related data. The architecture that we proposed for the competition is discussed in the context of a hypothetical cloud-based service that calculates diabetes self-care metrics under strong privacy preservation. It is based on Fully Homomorphic Encryption (FHE), a technology that makes computation on encrypted data possible. Our solution promotes safe key management and data life-cycle control. Our benchmarking experiment demonstrates execution times that scale well for the implementation of personalised health services. We argue that this technology has great potential for AI-based health applications, opens up new markets for third-party providers of such services, and will ultimately promote patient health and a trustworthy digital society.
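The privacy-preserving metric computation described in the abstract above can be illustrated with a much simpler scheme than FHE. The sketch below is not the paper's implementation; it uses a textbook Paillier cryptosystem (additively homomorphic only) with tiny, insecure demo primes, and an invented glucose-readings scenario, to show how a server could sum encrypted health data without ever seeing the plaintext values.

```python
import random
from math import gcd

# Textbook Paillier cryptosystem (additively homomorphic).
# NOTE: tiny fixed primes for illustration only -- completely insecure.
p, q = 499, 547
n = p * q
n2 = n * n
g = n + 1
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)    # inverse of L(g^lam mod n^2)

def encrypt(m: int) -> int:
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    return ((pow(c, lam, n2) - 1) // n * mu) % n

# Hypothetical scenario: a patient's encrypted glucose readings (mg/dL).
readings = [110, 145, 98, 160]
ciphertexts = [encrypt(m) for m in readings]

# The server multiplies ciphertexts, which adds the plaintexts underneath.
encrypted_sum = 1
for c in ciphertexts:
    encrypted_sum = (encrypted_sum * c) % n2

total = decrypt(encrypted_sum)
print(total, sum(readings))  # the two values match
```

Multiplying ciphertexts adds plaintexts, which is enough for sums and means; the FHE schemes the paper refers to additionally support multiplication on encrypted data.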
3.
  • Capshaw, Riley, et al. (authors)
  • BERT is as Gentle as a Sledgehammer: Too Powerful or Too Blunt? It Depends on the Benchmark
  • 2021
  • Conference paper (other academic/artistic), abstract:
    • In this position statement, we wish to contribute to the discussion about how to assess quality and coverage of a model. We believe that BERT's prominence as a single-step pipeline for contextualization and classification highlights the need for benchmarks to evolve concurrently with models. Much recent work has touted BERT's raw power for solving natural language tasks, so we used a 12-layer uncased BERT pipeline with a linear classifier as a quick-and-dirty model to score well on the SemEval 2010 Task 8 dataset for relation classification between nominals. We initially expected there to be significant enough bias from BERT's training to influence downstream tasks, since it is well-known that biased training corpora can lead to biased language models (LMs). Gender bias is the most common example, where gender roles are codified within language models. To handle such training data bias, we took inspiration from work in the field of computer vision. Tang et al. (2020) mitigate human reporting bias over the labels of a scene graph generation task using a form of causal reasoning based on counterfactual analysis. They extract the total direct effect of the context image on the prediction task by "blanking out" detected objects, intuitively asking "What if these objects were not here?" If the system still predicts the same label, then the original prediction is likely caused by bias in some form. Our goal was to remove any effects from biases learned during BERT's pre-training, so we analyzed total effect (TE) instead. However, across several experimental configurations we found no noticeable effects from using TE analysis. One disappointing possibility was that BERT might be resistant to causal analysis due to its complexity. Another was that BERT is so powerful (or blunt?) that it can find unanticipated trends in its input, rendering any human-generated causal analysis of its predictions useless.
We nearly concluded that what we expected to be delicate experimentation was more akin to trying to carve a masterpiece sculpture with a self-driven sledgehammer. We then found related work where BERT fooled humans by exploiting unexpected characteristics of a benchmark. When we used BERT to predict a relation for random words in the benchmark sentences, it guessed the same label as it would have for the corresponding marked entities roughly half of the time. Since the task had nineteen roughly-balanced labels, we expected much less consistency. This finding repeated across all pipeline configurations; BERT was treating the benchmark as a sequence classification task! Our final conclusion was that the benchmark is inadequate: all sentences appeared exactly once with exactly one pair of entities, so the task was equivalent to simply labeling each sentence. We passionately claim from our experience that the current trend of using larger and more complex LMs must include concurrent evolution of benchmarks. We as researchers need to be diligent in keeping our tools for measuring as sophisticated as the models being measured, as any scientific domain does.
4.
  • Danielsson, Benjamin, et al. (authors)
  • Classifying Implant-Bearing Patients via their Medical Histories : a Pre-Study on Swedish EMRs with Semi-Supervised GAN-BERT
  • 2022
  • In: 2022 Language Resources and Evaluation Conference, LREC 2022. - : European Language Resources Association (ELRA). - 9791095546726 ; pp. 5428-5435
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we compare the performance of two BERT-based text classifiers whose task is to classify patients (more precisely, their medical histories) as having or not having implant(s) in their body. One classifier is a fully-supervised BERT classifier. The other one is a semi-supervised GAN-BERT classifier. Both models are compared against a fully-supervised SVM classifier. Since fully-supervised classification is expensive in terms of data annotation, with the experiments presented in this paper, we investigate whether we can achieve a competitive performance with a semi-supervised classifier based only on a small amount of annotated data. Results are promising and show that the semi-supervised classifier has a competitive performance when compared with the fully-supervised classifier. © licensed under CC-BY-NC-4.0.
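The motivation for GAN-BERT above, getting competitive accuracy from only a small amount of annotated data, can be illustrated with a far simpler semi-supervised scheme. The sketch below is not GAN-BERT: it is a toy self-training loop with a nearest-centroid classifier on synthetic 2-D data (standing in for document embeddings), where only ten points start out labelled and confident predictions on the unlabelled pool are gradually promoted to training data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for document embeddings of two patient classes.
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),   # class 0
               rng.normal(4.0, 1.0, (100, 2))])  # class 1
y_true = np.array([0] * 100 + [1] * 100)

# Only 5 labelled examples per class; the rest are unlabelled (-1).
y = np.full(200, -1)
y[:5] = 0
y[100:105] = 1

def centroids(X, y):
    return np.array([X[y == c].mean(axis=0) for c in (0, 1)])

# Self-training: repeatedly promote the most confident unlabelled points.
for _ in range(10):
    C = centroids(X, y)
    unlab = np.where(y == -1)[0]
    if len(unlab) == 0:
        break
    d = np.linalg.norm(X[unlab, None, :] - C[None, :, :], axis=2)
    margin = np.abs(d[:, 0] - d[:, 1])        # confidence = distance margin
    keep = margin > np.median(margin)
    y[unlab[keep]] = d[keep].argmin(axis=1)

# Final prediction for every point with the learned centroids.
C = centroids(X, y)
pred = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2).argmin(axis=1)
acc = (pred == y_true).mean()
print(f"accuracy with 10 initial labels: {acc:.2f}")
```

With well-separated clusters the ten seed labels are enough to place the centroids, which is the same intuition the paper tests at much larger scale with BERT representations.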
5.
  • Falkenjack, Johan, et al. (authors)
  • An Exploratory Study on Genre Classification using Readability Features
  • 2016
  • In: The Sixth Swedish Language Technology Conference (SLTC).
  • Conference paper (peer-reviewed), abstract:
    • We present a preliminary study that explores whether text features used for readability assessment are reliable genre-revealing features. We empirically explore the difference between genre and domain. We carry out two sets of experiments with both supervised and unsupervised methods. Findings on the Swedish national corpus (the SUC) show that readability cues are good indicators of genre variation.
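One concrete example of the readability features mentioned above is LIX (läsbarhetsindex), a classic Swedish readability measure. The sketch below computes it from scratch; the interpretation bands are the commonly cited approximate ones, not thresholds taken from the paper.

```python
import re

def lix(text: str) -> float:
    """LIX = words/sentences + 100 * long_words/words (long = >6 chars)."""
    words = re.findall(r"\w+", text, flags=re.UNICODE)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100 * len(long_words) / len(words)

def lix_band(score: float) -> str:
    # Approximate, commonly cited interpretation bands.
    if score < 30:
        return "easy"
    if score < 40:
        return "standard"
    if score < 50:
        return "factual prose"
    return "difficult"

sample = "The cat sat on the mat. The dog ran away quickly."
print(round(lix(sample), 1), lix_band(lix(sample)))
```

A feature vector of such scores (LIX, average word length, type-token ratio, and so on) is the kind of input the paper feeds to its supervised and unsupervised genre experiments.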
6.
  • Jerdhaf, Oskar, et al. (authors)
  • Focused Terminology Extraction for CPSs: The Case of "Implant Terms" in Electronic Medical Records
  • 2021
  • In: 2021 IEEE International Conference on Communications Workshops (ICC Workshops). - : Institute of Electrical and Electronics Engineers (IEEE).
  • Conference paper (peer-reviewed), abstract:
    • Language Technology is an essential component of many Cyber-Physical Systems (CPSs) because specialized linguistic knowledge is indispensable to prevent fatal errors. We present the case of automatic identification of implant terms. The need for automatic identification of implant terms stems from safety concerns: patients who have an implant may or may not be eligible for Magnetic Resonance Imaging (MRI). Normally, MRI scans are safe, but they are incompatible with some implants, so it is important to know whether a patient has one. At present, the process of ascertaining whether a patient could be at risk is lengthy, manual, and based on the specialized knowledge of medical staff. We argue that this process can be sped up, streamlined and made safer by sieving through patients’ medical records. In this paper, we explore how to discover implant terms in electronic medical records (EMRs) written in Swedish with an unsupervised approach. To this aim we use BERT, a state-of-the-art deep learning algorithm based on pre-trained word embeddings. We observe that BERT discovers a solid proportion of terms that are indicative of implants.
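The unsupervised discovery step described above boils down to ranking candidate words by how close their embeddings sit to known implant terms. The sketch below shows that ranking step only, with made-up toy vectors in place of real BERT embeddings (loading an actual Swedish BERT model is out of scope here), so the numbers and the word list are purely illustrative.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for BERT embeddings: in reality these would come from
# a pre-trained (Swedish) BERT model, one vector per corpus word.
embeddings = {
    "pacemaker": np.array([0.9, 0.1, 0.0]),
    "stent":     np.array([0.8, 0.2, 0.1]),
    "huvudvärk": np.array([0.1, 0.9, 0.2]),   # 'headache', not an implant
    "protes":    np.array([0.7, 0.3, 0.0]),   # 'prosthesis'
}

seed = embeddings["pacemaker"]               # one known implant term
candidates = [w for w in embeddings if w != "pacemaker"]

# Rank candidates by similarity to the seed term.
ranked = sorted(candidates, key=lambda w: cosine(seed, embeddings[w]),
                reverse=True)
print(ranked)  # implant-like terms come first, 'huvudvärk' last
```

In the papers' setting, the candidate list comes from the EMR corpus itself, and a clinician only needs to review the top of the ranking rather than every record.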
7.
8.
  • Jerdhaf, Oskar, et al. (authors)
  • Implant Terms: Focused Terminology Extraction with Swedish BERT - Preliminary Results
  • 2020
  • Conference paper (peer-reviewed), abstract:
    • Certain implants are imperative to detect before MRI scans. However, implant terms, like ‘pacemaker’ or ‘stent’, are sparse and difficult to identify in noisy and hastily written electronic medical records (EMRs). In this paper, we explore how to discover implant terms in Swedish EMRs with an unsupervised approach. To this purpose, we use BERT, a state-of-the-art deep learning algorithm, and fine-tune a model built on pre-trained Swedish BERT. We observe that BERT discovers a solid proportion of indicative implant terms.
9.
  • Rennes, Evelina, 1990-, et al. (authors)
  • The Swedish Simplification Toolkit : Designed with Target Audiences in Mind
  • 2022
  • In: 2nd Workshop on Tools and Resources for REAding DIfficulties, READI 2022 - collocated with the International Conference on Language Resources and Evaluation Conference, LREC 2022. - : European Language Resources Association (ELRA). - 9791095546849 ; pp. 31-38
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we present the current version of The Swedish Simplification Toolkit. The toolkit includes computational and empirical tools that have been developed over the years to explore a still neglected area of NLP, namely the simplification of “standard” texts to meet the needs of target audiences. Target audiences, such as people affected by dyslexia, aphasia, autism, but also children and second language learners, require different types of text simplification and adaptation. For example, while individuals with aphasia have difficulties in reading compounds (such as arbetsmarknadsdepartement, eng. ministry of employment), second language learners struggle with culture-specific vocabulary (e.g. konflikträdd, eng. afraid of conflicts). The toolkit allows users to select the types of simplification that meet the specific needs of the target audience they belong to. The Swedish Simplification Toolkit is one of the first attempts to overcome the one-size-fits-all approach that is still dominant in Automatic Text Simplification, and proposes a set of computational methods that, used individually or in combination, may help individuals reduce reading (and writing) difficulties.
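The compound-reading difficulty mentioned above (e.g. arbetsmarknadsdepartement) is typically tackled by splitting compounds against a lexicon. The sketch below is a minimal greedy splitter, not the toolkit's actual component; the three-entry lexicon, with the Swedish 's' joiner folded into the entries, and the longest-match-first strategy are simplifying assumptions.

```python
def split_compound(word, lexicon):
    """Greedy longest-prefix-first compound splitting against a lexicon."""
    if word in lexicon:
        return [word]
    # Try the longest head first; require every part to be >= 3 chars.
    for i in range(len(word) - 3, 2, -1):
        if word[:i] in lexicon:
            rest = split_compound(word[i:], lexicon)
            if rest:
                return [word[:i]] + rest
    return None  # no full segmentation found

# Tiny illustrative lexicon of compound parts.
lexicon = {"arbets", "marknads", "departement"}
print(split_compound("arbetsmarknadsdepartement", lexicon))
# -> ['arbets', 'marknads', 'departement']
```

A real simplification system would also choose between splitting, hyphenating or paraphrasing the compound depending on the target audience, which is exactly the per-audience selectivity the toolkit argues for.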
10.
  • Santini, Marina, 1960-, et al. (authors)
  • A Web Corpus for eCare : Collection, Lay Annotation and Learning - First Results
  • 2017
  • Conference paper (peer-reviewed), abstract:
    • In this position paper, we put forward two claims: 1) it is possible to design a dynamic and extensible corpus without running the risk of getting into scalability problems; 2) it is possible to devise noise-resistant Language Technology applications without affecting performance. To support our claims, we describe the design, construction and limitations of a very specialized medical web corpus, called eCare_Sv_01, and we present two experiments on lay-specialized text classification. eCare_Sv_01 is a small corpus of web documents written in Swedish. The corpus contains documents about chronic diseases. The sublanguage used in each document has been labelled as “lay” or “specialized” by a lay annotator. The corpus is designed as a flexible text resource, where additional medical documents will be appended over time. Experiments show that the lay-specialized labels assigned by the lay annotator are reliably learned by standard classifiers. More specifically, Experiment 1 shows that scalability is not an issue when increasing the size of the datasets to be learned from 156 up to 801 documents. Experiment 2 shows that lay-specialized labels can be learned despite the large number of disturbing factors, such as machine-translated documents or low-quality texts, that are numerous in the corpus.
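The lay-specialized classification experiments above used standard classifiers; the sketch below shows the idea with a tiny multinomial Naive Bayes built from scratch. The four training snippets and the vocabulary are invented for illustration, not taken from eCare_Sv_01.

```python
from collections import Counter
from math import log

# Invented miniature training set: lay vs specialized Swedish health text.
docs = [
    ("ont i magen och mycket trött", "lay"),
    ("svårt att sova och ont i huvudet", "lay"),
    ("gastroesofageal refluxsjukdom diagnostiserades hos patienten", "specialized"),
    ("patienten uppvisar kronisk obstruktiv lungsjukdom", "specialized"),
]

counts = {"lay": Counter(), "specialized": Counter()}
for text, label in docs:
    counts[label].update(text.split())
vocab = set().union(*counts.values())

def predict(text: str) -> str:
    best, best_lp = None, float("-inf")
    for label, c in counts.items():
        total = sum(c.values())
        # log prior (balanced classes) + add-one smoothed log likelihoods
        lp = log(0.5) + sum(
            log((c[w] + 1) / (total + len(vocab))) for w in text.split()
        )
        if lp > best_lp:
            best, best_lp = label, lp
    return best

print(predict("trött och ont i magen"))              # expect: lay
print(predict("kronisk lungsjukdom hos patienten"))  # expect: specialized
```

Even this toy model picks up the lexical signal the paper relies on: everyday symptom words versus clinical terminology, which is why the lay annotator's labels turn out to be learnable.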
