SwePub
Search the SwePub database


Result list for search "WFRF:(Henriksson Aron 1985 )"


  • Results 1-10 of 24
1.
  • Alam, Mahbub Ul, et al. (author)
  • Deep Learning from Heterogeneous Sequences of Sparse Medical Data for Early Prediction of Sepsis
  • 2020
  • In: Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies, Volume 5: HEALTHINF. Setúbal: SciTePress. ISBN 9789897583988, pp. 45-55
  • Conference paper (peer-reviewed), abstract:
    • Sepsis is a life-threatening complication to infections, and early treatment is key for survival. Symptoms of sepsis are difficult to recognize, but prediction models using data from electronic health records (EHRs) can facilitate early detection and intervention. Recently, deep learning architectures have been proposed for the early prediction of sepsis. However, most efforts rely on high-resolution data from intensive care units (ICUs). Prediction of sepsis in the non-ICU setting, where hospitalization periods vary greatly in length and data is more sparse, is not as well studied. It is also not clear how to learn effectively from longitudinal EHR data, which can be represented as a sequence of time windows. In this article, we evaluate the use of an LSTM network for early prediction of sepsis according to Sepsis-3 criteria in a general hospital population. An empirical investigation using six different time window sizes is conducted. The best model uses a two-hour window and assumes data is missing not at random, clearly outperforming scoring systems commonly used in healthcare today. It is concluded that the size of the time window has a considerable impact on predictive performance when learning from heterogeneous sequences of sparse medical data for early prediction of sepsis.
  •  
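The preprocessing step the abstract above hinges on, binning sparse, irregular EHR events into fixed-size time windows with explicit missingness indicators (so that a model can treat values as missing not at random), can be sketched as follows. This is a minimal illustration assuming numpy; all names, features and dimensions are hypothetical, and the paper's actual LSTM model is not reproduced here.

```python
import numpy as np

def windowize(events, n_features, window_hours=2.0, horizon_hours=24.0):
    """Bin irregular (time_h, feature_idx, value) events into fixed windows.

    Each window carries, per feature: the last observed value (zero-filled)
    and a missingness indicator. Keeping the indicator lets a downstream
    sequence model exploit *when* data was recorded, i.e. treat values as
    missing not at random.
    """
    n_windows = int(horizon_hours / window_hours)
    values = np.zeros((n_windows, n_features))
    observed = np.zeros((n_windows, n_features))  # 1.0 where a value was seen
    for t, f, v in events:
        w = min(int(t / window_hours), n_windows - 1)
        values[w, f] = v
        observed[w, f] = 1.0
    # Concatenate values and indicators -> sequence input for e.g. an LSTM.
    return np.concatenate([values, observed], axis=1)

# Example: heart rate (feature 0) measured at hours 0.5 and 3.1.
seq = windowize([(0.5, 0, 92.0), (3.1, 0, 110.0)], n_features=2)
print(seq.shape)  # (12, 4): 12 two-hour windows, 2 values + 2 indicators
```

The resulting sequence of window vectors would then be fed window by window to the recurrent model; the two-hour window size mirrors the best configuration reported in the abstract.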
2.
  • Alam, Mahbub Ul, et al. (author)
  • Terminology Expansion with Prototype Embeddings : Extracting Symptoms of Urinary Tract Infection from Clinical Text
  • 2021
  • In: Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2021) - Volume 5: HEALTHINF. Setúbal: SciTePress. ISBN 9789897584909, pp. 47-57
  • Conference paper (peer-reviewed), abstract:
    • Many natural language processing applications rely on the availability of domain-specific terminologies containing synonyms. To that end, semi-automatic methods for extracting additional synonyms of a given concept from corpora are useful, especially in low-resource domains and noisy genres such as clinical text, where nonstandard language use and misspellings are prevalent. In this study, prototype embeddings based on seed words were used to create representations for (i) specific urinary tract infection (UTI) symptoms and (ii) UTI symptoms in general. Four word embedding methods and two phrase detection methods were evaluated using clinical data from Karolinska University Hospital. It is shown that prototype embeddings can effectively capture semantic information related to UTI symptoms. Using prototype embeddings for specific UTI symptoms led to the extraction of more symptom terms compared to using prototype embeddings for UTI symptoms in general. Overall, 142 additional UTI symptom terms were identified, yielding a more than 100% increase compared to the initial seed set. The mean average precision across all UTI symptoms was 0.51, and as high as 0.86 for one specific UTI symptom. This study provides an effective and cost-effective solution to terminology expansion with small amounts of labeled data.
  •  
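The prototype-embedding idea described above, averaging seed-word vectors into a single prototype and ranking candidate terms by similarity to it, can be sketched in a few lines. A minimal illustration assuming numpy; the toy 2-d vectors and vocabulary are invented, whereas real embeddings would come from e.g. word2vec trained on clinical text.

```python
import numpy as np

def prototype(seed_vecs):
    """Average the seed-word vectors into a single prototype embedding."""
    return np.mean(seed_vecs, axis=0)

def expand(prototype_vec, vocab):
    """Rank candidate terms by cosine similarity to the prototype."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return sorted(vocab.items(), key=lambda kv: -cos(prototype_vec, kv[1]))

# Toy 2-d "embeddings" (hypothetical Swedish clinical terms).
vocab = {
    "dysuri":    np.array([0.9, 0.1]),   # close to the symptom seeds
    "miktion":   np.array([0.8, 0.3]),
    "frakturen": np.array([0.1, 0.95]),  # unrelated term
}
proto = prototype([np.array([1.0, 0.0]), np.array([0.9, 0.2])])
ranked = [term for term, _ in expand(proto, vocab)]
print(ranked[0])  # most prototype-like candidate: "dysuri"
```

A human reviewer would then validate the top-ranked candidates before adding them to the terminology, which is where the semi-automatic nature of the method lies.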
3.
  • Henriksson, Aron, 1985- (author)
  • Ensembles of Semantic Spaces : On Combining Models of Distributional Semantics with Applications in Healthcare
  • 2015
  • Doctoral thesis (other academic/artistic), abstract:
    • Distributional semantics allows models of linguistic meaning to be derived from observations of language use in large amounts of text. By modeling the meaning of words in semantic (vector) space on the basis of co-occurrence information, distributional semantics permits a quantitative interpretation of (relative) word meaning in an unsupervised setting, i.e., human annotations are not required. The ability to obtain inexpensive word representations in this manner helps to alleviate the bottleneck of fully supervised approaches to natural language processing, especially since models of distributional semantics are data-driven and hence agnostic to both language and domain. All that is required to obtain distributed word representations is a sizeable corpus; however, the composition of the semantic space is not only affected by the underlying data but also by certain model hyperparameters. While these can be optimized for a specific downstream task, there are currently limitations to the extent to which the many aspects of semantics can be captured in a single model. This dissertation investigates the possibility of capturing multiple aspects of lexical semantics by adopting the ensemble methodology within a distributional semantic framework to create ensembles of semantic spaces. To that end, various strategies for creating the constituent semantic spaces, as well as for combining them, are explored in a number of studies. The notion of semantic space ensembles is generalizable across languages and domains; however, the use of unsupervised methods is particularly valuable in low-resource settings, in particular when annotated corpora are scarce, as in the domain of Swedish healthcare. The semantic space ensembles are here empirically evaluated for tasks that have promising applications in healthcare. 
It is shown that semantic space ensembles – created by exploiting various corpora and data types, as well as by adjusting model hyperparameters such as the size of the context window and the strategy for handling word order within the context window – are able to outperform the use of any single constituent model on a range of tasks. The semantic space ensembles are used both directly for k-nearest neighbors retrieval and for semi-supervised machine learning. Applying semantic space ensembles to important medical problems facilitates the secondary use of healthcare data, which, despite its abundance and transformative potential, is grossly underutilized.
  •  
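The core mechanism of the dissertation above, querying several semantic spaces and combining their k-nearest-neighbour lists, can be sketched as follows. A minimal illustration assuming numpy; combining by summed reciprocal rank is one simple strategy chosen for the sketch (the dissertation explores several), and the toy spaces and vocabulary are invented.

```python
import numpy as np

def knn(space, query, k=3):
    """k nearest neighbours of `query` within one semantic space (cosine)."""
    q = space[query]
    sims = {w: float(v @ q / (np.linalg.norm(v) * np.linalg.norm(q)))
            for w, v in space.items() if w != query}
    return sorted(sims, key=sims.get, reverse=True)[:k]

def ensemble_knn(spaces, query, k=3):
    """Combine the constituent spaces' neighbour lists by summed
    reciprocal rank, so terms retrieved by several spaces rise to the top."""
    score = {}
    for space in spaces:
        for rank, w in enumerate(knn(space, query, k)):
            score[w] = score.get(w, 0.0) + 1.0 / (rank + 1)
    return sorted(score, key=score.get, reverse=True)

# Two toy 2-d spaces over the same vocabulary, standing in for spaces
# induced from different corpora or with different hyperparameters.
space1 = {"q": np.array([1.0, 0.0]), "b": np.array([0.9, 0.1]),
          "c": np.array([0.5, 0.5]), "a": np.array([0.0, 1.0])}
space2 = {"q": np.array([0.0, 1.0]), "b": np.array([0.1, 0.9]),
          "a": np.array([0.5, 0.5]), "c": np.array([1.0, 0.0])}
print(ensemble_knn([space1, space2], "q", k=2))
```

Here "b" ranks first because both spaces agree on it, which is exactly the effect that lets an ensemble outperform any single constituent space.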
4.
  • Henriksson, Aron, 1985-, et al. (author)
  • Multimodal fine-tuning of clinical language models for predicting COVID-19 outcomes
  • 2023
  • In: Artificial Intelligence in Medicine. ISSN 0933-3657, E-ISSN 1873-2860 ; vol. 146
  • Journal article (peer-reviewed), abstract:
    • Clinical prediction models tend only to incorporate structured healthcare data, ignoring information recorded in other data modalities, including free-text clinical notes. Here, we demonstrate how multimodal models that effectively leverage both structured and unstructured data can be developed for predicting COVID-19 outcomes. The models are trained end-to-end using a technique we refer to as multimodal fine-tuning, whereby a pre-trained language model is updated based on both structured and unstructured data. The multimodal models are trained and evaluated using a multicenter cohort of COVID-19 patients encompassing all encounters at the emergency department of six hospitals. Experimental results show that multimodal models, leveraging the notion of multimodal fine-tuning and trained to predict (i) 30-day mortality, (ii) safe discharge and (iii) readmission, outperform unimodal models trained using only structured or unstructured healthcare data on all three outcomes. Sensitivity analyses are performed to better understand how well the multimodal models perform on different patient groups, while an ablation study is conducted to investigate the impact of different types of clinical notes on model performance. We argue that multimodal models that make effective use of routinely collected healthcare data to predict COVID-19 outcomes may facilitate patient management and contribute to the effective use of limited healthcare resources.
  •  
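The fusion at the heart of multimodal fine-tuning, concatenating the language model's pooled text representation with structured features before a shared classification head, can be sketched as a forward pass. A minimal illustration assuming numpy; the dimensions, feature names and weights are all hypothetical, and only the head is shown (in actual multimodal fine-tuning, gradients from this head would also flow back into the pre-trained language model).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fused_logit(cls_embedding, structured, w_fusion, b):
    """Concatenate the [CLS]-style text embedding with structured features
    (vitals, labs, demographics) and apply a linear classification head."""
    x = np.concatenate([cls_embedding, structured])
    return float(x @ w_fusion + b)

# Hypothetical dimensions: a 768-d text embedding plus 20 structured features.
cls_vec = rng.standard_normal(768)   # stand-in for a language model output
vitals = rng.standard_normal(20)     # stand-in for structured EHR features
w = rng.standard_normal(788) * 0.01
p_mortality = sigmoid(fused_logit(cls_vec, vitals, w, b=0.0))
print(0.0 < p_mortality < 1.0)
```

Training three such heads on the same fused representation would correspond to the paper's three outcomes (30-day mortality, safe discharge, readmission).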
5.
  • Henriksson, Aron, 1985- (author)
  • Semantic Spaces of Clinical Text : Leveraging Distributional Semantics for Natural Language Processing of Electronic Health Records
  • 2013
  • Licentiate thesis (other academic/artistic), abstract:
    • The large amounts of clinical data generated by electronic health record systems are an underutilized resource, which, if tapped, has enormous potential to improve health care. Since the majority of this data is in the form of unstructured text, which is challenging to analyze computationally, there is a need for sophisticated clinical language processing methods. Unsupervised methods that exploit statistical properties of the data are particularly valuable due to the limited availability of annotated corpora in the clinical domain. Information extraction and natural language processing systems need to incorporate some knowledge of semantics. One approach exploits the distributional properties of language – more specifically, term co-occurrence information – to model the relative meaning of terms in high-dimensional vector space. Such methods have been used with success in a number of general language processing tasks; however, their application in the clinical domain has previously only been explored to a limited extent. By applying models of distributional semantics to clinical text, semantic spaces can be constructed in a completely unsupervised fashion. Semantic spaces of clinical text can then be utilized in a number of medically relevant applications. The application of distributional semantics in the clinical domain is here demonstrated in three use cases: (1) synonym extraction of medical terms, (2) assignment of diagnosis codes and (3) identification of adverse drug reactions. To apply distributional semantics effectively to a wide range of both general and, in particular, clinical language processing tasks, certain limitations or challenges need to be addressed, such as how to model the meaning of multiword terms and account for the function of negation: a simple means of incorporating paraphrasing and negation in a distributional semantic framework is here proposed and evaluated. 
The notion of ensembles of semantic spaces is also introduced; these are shown to outperform the use of a single semantic space on the synonym extraction task. This idea allows different models of distributional semantics, with different parameter configurations and induced from different corpora, to be combined. This is not least important in the clinical domain, as it allows potentially limited amounts of clinical data to be supplemented with data from other, more readily available sources. The importance of configuring the dimensionality of semantic spaces, particularly when – as is typically the case in the clinical domain – the vocabulary grows large, is also demonstrated.
  •  
6.
  • Lamproudis, Anastasios, et al. (author)
  • Evaluating Pretraining Strategies for Clinical BERT Models
  • 2022
  • In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association, pp. 410-416
  • Conference paper (peer-reviewed), abstract:
    • Research suggests that using generic language models in specialized domains may be sub-optimal due to significant domain differences. As a result, various strategies for developing domain-specific language models have been proposed, including techniques for adapting an existing generic language model to the target domain, e.g. through various forms of vocabulary modifications and continued domain-adaptive pretraining with in-domain data. Here, an empirical investigation is carried out in which various strategies for adapting a generic language model to the clinical domain are compared to pretraining a pure clinical language model. Three clinical language models for Swedish, pretrained for up to ten epochs, are fine-tuned and evaluated on several downstream tasks in the clinical domain. A comparison of the language models’ downstream performance over the training epochs is conducted. The results show that the domain-specific language models outperform a general-domain language model, although there is little difference in performance between the various clinical language models. However, compared to pretraining a pure clinical language model with only in-domain data, leveraging and adapting an existing general-domain language model requires fewer epochs of pretraining with in-domain data.
  •  
7.
  • Lamproudis, Anastasios, et al. (author)
  • Improving the Timeliness of Early Prediction Models for Sepsis through Utility Optimization
  • 2022
  • In: 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1062-1069
  • Conference paper (peer-reviewed), abstract:
    • Early prediction of sepsis can facilitate early intervention and lead to improved clinical outcomes. However, for early prediction models to be clinically useful, and also to reduce alarm fatigue, detection of sepsis needs to be timely with respect to onset, being neither too late nor too early. In this paper, we propose a utility-based loss function for training early prediction models, where utility is defined by a function according to when the predictions are made and in relation to onset as well as to specified early, optimal and late time points. Two versions of the utility-based loss function are evaluated and compared to a cross-entropy loss baseline. Experimental results, using real clinical data from electronic health records, show that incorporating the utility-based loss function leads to superior multimodal early prediction models, detecting sepsis both more accurately and more timely. We argue that improving the timeliness of early prediction models is important for increasing their utility and acceptance in a clinical setting.
  •  
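The utility-based loss described above, cross-entropy scaled by a function of when the prediction is made relative to onset and to specified early, optimal and late time points, can be sketched as follows. A minimal illustration: the trapezoid-style shape and the specific time points below are illustrative assumptions, not the paper's exact parameterization.

```python
import math

def utility(t, onset, early=-12.0, optimal=-6.0, late=3.0):
    """Utility of raising an alarm at time t (hours) relative to sepsis
    onset: zero if far too early or too late, ramping up to a peak around
    the optimal point and back down towards the late point."""
    dt = t - onset
    if dt <= early or dt >= late:
        return 0.0
    if dt <= optimal:
        return (dt - early) / (optimal - early)  # ramp up
    return (late - dt) / (late - optimal)        # ramp down

def utility_weighted_bce(p, y, t, onset):
    """Binary cross-entropy scaled by the utility of predicting at time t,
    so the model is rewarded most for alarms in the clinically useful window."""
    eps = 1e-12
    bce = -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return utility(t, onset) * bce

# An alarm 6 hours before onset carries full weight; one 4 hours after
# the late point carries none.
print(utility(0.0, 6.0), utility(10.0, 6.0))
```

Summing this weighted term over the time steps of a patient's stay yields a training objective that penalizes alarms which are technically correct but clinically untimely.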
8.
  • Lamproudis, Anastasios, et al. (author)
  • On the Impact of the Vocabulary for Domain-Adaptive Pretraining of Clinical Language Models
  • 2023
  • In: Biomedical Engineering Systems and Technologies. Springer Nature. ISBN 9783031388538, pp. 315-332
  • Book chapter (peer-reviewed), abstract:
    • Pretrained language models tailored to the target domain may improve predictive performance on downstream tasks. Such domain-specific language models are typically developed by pretraining on in-domain data, either from scratch or by continuing to pretrain an existing generic language model. Here, we focus on the latter situation and study the impact of the vocabulary for domain-adaptive pretraining of clinical language models. In particular, we investigate the impact of (i) adapting the vocabulary to the target domain, (ii) using different vocabulary sizes, and (iii) creating initial representations for clinical terms not present in the general-domain vocabulary based on subword averaging. The results confirm the benefits of adapting the vocabulary of the language model to the target domain; however, downstream performance is not particularly sensitive to the choice of vocabulary size, while the benefits of subword averaging are reduced after a modest amount of domain-adaptive pretraining.
  •  
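The subword-averaging strategy investigated above, initializing the embedding of a clinical term absent from the generic vocabulary as the mean of its generic subword embeddings rather than as a random vector, can be sketched as follows. A minimal illustration assuming numpy; the toy vocabulary, tokenizer and 4-d embedding table are invented stand-ins for a real WordPiece tokenizer and embedding matrix.

```python
import numpy as np

def init_new_token(term, subword_tokenize, embedding_table, vocab):
    """Initialize a new in-domain token's embedding as the average of the
    embeddings of the generic subwords the term previously decomposed into."""
    pieces = subword_tokenize(term)
    return np.mean([embedding_table[vocab[p]] for p in pieces], axis=0)

# Toy generic vocabulary and 4-d embeddings.
vocab = {"seps": 0, "##is": 1}
table = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0]])
vec = init_new_token("sepsis", lambda t: ["seps", "##is"], table, vocab)
print(vec)  # average of the "seps" and "##is" rows
```

The averaged vector gives the new token a semantically plausible starting point, which the chapter finds helpful early on, with the benefit fading after a modest amount of domain-adaptive pretraining.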
9.
  • Lamproudis, Anastasios, et al. (author)
  • Vocabulary Modifications for Domain-adaptive Pretraining of Clinical Language Models
  • 2022
  • In: Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies - HEALTHINF. SciTePress. ISBN 9789897585524, pp. 180-188
  • Conference paper (peer-reviewed), abstract:
    • Research has shown that using generic language models – specifically, BERT models – in specialized domains may be sub-optimal due to domain differences in language use and vocabulary. There are several techniques for developing domain-specific language models that leverage the use of existing generic language models, including continued and domain-adaptive pretraining with in-domain data. Here, we investigate a strategy based on using a domain-specific vocabulary, while leveraging a generic language model for initialization. The results demonstrate that domain-adaptive pretraining, in combination with a domain-specific vocabulary – as opposed to a general-domain vocabulary – yields improvements on two downstream clinical NLP tasks for Swedish. The results highlight the value of domain-adaptive pretraining when developing specialized language models and indicate that it is beneficial to adapt the vocabulary of the language model to the target domain prior to continued, domain-adaptive pretraining of a generic language model.
  •  
10.
  • Li, Xiu, et al. (author)
  • Automatic Educational Concept Extraction Using NLP
  • 2022
  • In: Methodologies and Intelligent Systems for Technology Enhanced Learning, 12th International Conference. Cham: Springer Nature. ISBN 9783031206177, 9783031206160, pp. 133-138
  • Conference paper (peer-reviewed), abstract:
    • Educational concepts are the core of teaching and learning. From the perspective of educational technology, concepts are essential meta-data: representative terms that can connect different learning materials and serve as the foundation for many downstream tasks. Some studies on automatic concept extraction have been conducted, but none have targeted the K-12 level or focused on the Swedish language. In this paper, we use a state-of-the-art Swedish BERT model to build an automatic concept extractor for the Biology subject using fine-annotated digital textbook data that cover all content for K-12. The model gives a recall measure of 72% and has the potential to be used in real-world settings for use cases that require high recall. Meanwhile, we investigate how input data features influence model performance and provide guidance on how to effectively use text data to achieve optimal results when building a named entity recognition (NER) model.
  •  


 