SwePub

Result list for search "WFRF:(Chomutare Taridzo)"

  • Results 1-8 of 8
1.
  • Berg, Hanna, et al. (author)
  • Building a De-identification System for Real Swedish Clinical Text Using Pseudonymised Clinical Text
  • 2019
  • In: Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019). - : Association for Computational Linguistics. - 9781950737772 ; , pp. 118-125
  • Conference paper (peer-reviewed). Abstract:
    • This article presents experiments with pseudonymised Swedish clinical text used as training data to de-identify real clinical text, with the future aim of transferring non-sensitive training data to other hospitals. Conditional Random Fields (CRF) and Long Short-Term Memory (LSTM) machine learning algorithms were used to train de-identification models. The two models were trained on pseudonymised data and evaluated on real data. For benchmarking, models were also trained on real data and evaluated on real data, as well as trained on pseudonymised data and evaluated on pseudonymised data. CRF showed better performance for some PHI types such as Date Part, First Name and Last Name, consistent with some reports in the literature. In contrast, poor performance on Location and Health Care Unit information was noted, partially due to the constrained vocabulary in the pseudonymised training data. It is concluded that it is possible to train transferable models based on pseudonymised Swedish clinical data, but even small narrative and distributional variation could negatively impact performance.
2.
  • Budrionis, Andrius, et al. (author)
  • Negation detection in Norwegian medical text : Porting a Swedish NegEx to Norwegian. Work in progress
  • 2018
  • Conference paper (peer-reviewed). Abstract:
    • This paper presents an initial effort in developing a negation detection algorithm for Norwegian clinical text. An evaluated version of NegEx for Swedish was extended to support Norwegian clinical text by translating the negation triggers, adding more negation rules, and using a pre-processed Norwegian ICD-10 diagnosis code list to detect symptoms and diagnoses. Due to limited access to Norwegian clinical text, the Norwegian NegEx was tested on Norwegian medical scientific text. NegEx found 70 negated symptoms/diagnoses in the combined text of 170 publications in the medical domain. The results are not completely evaluated because a gold standard is lacking. Some challenging erroneous tokenizations of Norwegian words were found, in addition to the need for improved preprocessing and matching techniques for the Norwegian ICD-10 code list. This work pointed out the weaknesses of the current implementation and provided insights for future work.
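The trigger-based idea behind NegEx, as summarised in the abstract above, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the trigger list, the termination term, the term list standing in for the pre-processed ICD-10 list, and the five-token scope window are all assumptions chosen for illustration.

```python
import re

# Hypothetical Norwegian negation triggers ("not", "no", "without").
# The trigger list actually used in the paper is not given in the abstract.
PRE_NEGATION_TRIGGERS = {"ikke", "ingen", "uten"}
# Hypothetical termination term ("but") that ends a negation scope.
TERMINATION_TERMS = {"men"}
# Hypothetical stand-in for the pre-processed ICD-10 term list
# ("fever", "cough", "diabetes").
CLINICAL_TERMS = {"feber", "hoste", "diabetes"}
# Assumed number of tokens after a trigger that fall inside its scope.
WINDOW = 5


def find_negated_terms(sentence):
    """Return the clinical terms that appear inside a negation scope."""
    tokens = re.findall(r"\w+", sentence.lower())
    negated = []
    for i, token in enumerate(tokens):
        if token not in PRE_NEGATION_TRIGGERS:
            continue
        # Scan forward from the trigger until the window or a termination term ends.
        for candidate in tokens[i + 1 : i + 1 + WINDOW]:
            if candidate in TERMINATION_TERMS:
                break
            if candidate in CLINICAL_TERMS and candidate not in negated:
                negated.append(candidate)
    return negated


print(find_negated_terms("Pasienten har ikke feber, men hoste."))
```

In the example sentence ("The patient does not have fever, but cough"), only the term before the termination word falls inside the negation scope.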
3.
  • Chomutare, Taridzo, et al. (author)
  • Combining deep learning and fuzzy logic to predict rare ICD-10 codes from clinical notes
  • 2022
  • In: Proceedings - 2022 IEEE International Conference on Digital Health (ICDH 2022). - Piscataway : IEEE. - 9781665481496 ; , pp. 163-168
  • Conference paper (peer-reviewed). Abstract:
    • Computer assisted coding (CAC) of clinical text into standardized classifications such as ICD-10 is an important challenge. For frequently used ICD-10 codes, deep learning approaches have been quite successful. For rare codes, however, the problem is still outstanding. To improve performance for rare codes, a pipeline is proposed that takes advantage of the ICD-10 code hierarchy to combine semantic capabilities of deep learning and the flexibility of fuzzy logic. The data used are discharge summaries in Swedish in the medical speciality of gastrointestinal diseases. Using our pipeline, fuzzy matching computation time is reduced and accuracy of the top 10 hits of the rare codes is also improved. While the method is promising, further work is required before the pipeline can be part of a usable prototype. Code repository: https://github.com/icd-coding/zeroshot.
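One way to read the abstract above is that the code hierarchy narrows the candidate set before any fuzzy comparison is made, which is what would cut the matching time. The Python sketch below illustrates that reading only. It is not taken from the linked repository: the chapter prediction is passed in as a plain argument instead of coming from a deep learning model, `difflib.SequenceMatcher` stands in for the fuzzy component, and the code table is a toy excerpt with abbreviated titles.

```python
from difflib import SequenceMatcher

# Toy excerpt of ICD-10 codes with abbreviated English titles (illustration only).
ICD10_TITLES = {
    "K35": "acute appendicitis",
    "K57": "diverticular disease of intestine",
    "K80": "cholelithiasis",
    "J18": "pneumonia, organism unspecified",
}


def rank_codes(phrase, predicted_chapter, top_n=10):
    """Fuzzy-rank only the codes under the chapter letter predicted upstream."""
    # Restricting to one branch of the hierarchy shrinks the candidate set.
    candidates = {
        code: title
        for code, title in ICD10_TITLES.items()
        if code.startswith(predicted_chapter)
    }
    scored = [
        (SequenceMatcher(None, phrase.lower(), title).ratio(), code)
        for code, title in candidates.items()
    ]
    scored.sort(reverse=True)
    return [code for _, code in scored[:top_n]]


# The misspelled phrase is still matched to the closest title in chapter K.
print(rank_codes("acute apendicitis", "K"))
```

With the full classification, the same filter would skip every comparison outside the predicted branch, so fuzzy-matching cost grows with the size of that branch and not with the size of the whole code list.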
4.
  • Chomutare, Taridzo, et al. (author)
  • De-Identifying Swedish EHR Text Using Public Resources in the General Domain
  • 2020
  • In: Digital Personalized Health and Medicine. - Amsterdam : IOS Press. - 9781643680828 - 9781643680835 ; , pp. 148-152
  • Conference paper (peer-reviewed). Abstract:
    • Sensitive data is normally required to develop rule-based or train machine learning-based models for de-identifying electronic health record (EHR) clinical notes; and this presents important problems for patient privacy. In this study, we add non-sensitive public datasets to EHR training data; (i) scientific medical text and (ii) Wikipedia word vectors. The data, all in Swedish, is used to train a deep learning model using recurrent neural networks. Tests on pseudonymized Swedish EHR clinical notes showed improved precision and recall from 55.62% and 80.02% with the base EHR embedding layer, to 85.01% and 87.15% when Wikipedia word vectors are added. These results suggest that non-sensitive text from the general domain can be used to train robust models for de-identifying Swedish clinical text; and this could be useful in cases where the data is both sensitive and in low-resource languages.
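The abstract above reports precision and recall separately. To compare the two configurations with a single number, the F1 score (the harmonic mean of precision and recall) can be derived from the reported figures. The abstract does not report F1, so the values below are computed here and do not come from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall in percent, as reported in the abstract.
baseline_f1 = f1_score(55.62, 80.02)    # base EHR embedding layer
wikipedia_f1 = f1_score(85.01, 87.15)   # with Wikipedia word vectors added

print(f"baseline F1: {baseline_f1:.2f}")
print(f"with Wikipedia vectors F1: {wikipedia_f1:.2f}")
```

The derived scores are roughly 65.6 and 86.1, so most of the gain comes from the large jump in precision.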
5.
  • Chomutare, Taridzo, et al. (author)
  • Improving Quality of ICD-10 (International Statistical Classification of Diseases, Tenth Revision) Coding Using AI : Protocol for a Crossover Randomized Controlled Trial
  • 2024
  • In: JMIR Research Protocols. - 1929-0748. ; 13
  • Journal article (peer-reviewed). Abstract:
    • Background: Computer-assisted clinical coding (CAC) tools are designed to help clinical coders assign standardized codes, such as the ICD-10 (International Statistical Classification of Diseases, Tenth Revision), to clinical texts, such as discharge summaries. Maintaining the integrity of these standardized codes is important both for the functioning of health systems and for ensuring data used for secondary purposes are of high quality. Clinical coding is an error-prone cumbersome task, and the complexity of modern classification systems such as the ICD-11 (International Classification of Diseases, Eleventh Revision) presents significant barriers to implementation. To date, there have only been a few user studies; therefore, our understanding is still limited regarding the role CAC systems can play in reducing the burden of coding and improving the overall quality of coding. Objective: The objective of the user study is to generate both qualitative and quantitative data for measuring the usefulness of a CAC system, Easy-ICD, that was developed for recommending ICD-10 codes. Specifically, our goal is to assess whether our tool can reduce the burden on clinical coders and also improve coding quality. Methods: The user study is based on a crossover randomized controlled trial study design, where we measure the performance of clinical coders when they use our CAC tool versus when they do not. Performance is measured by the time it takes them to assign codes to both simple and complex clinical texts as well as the coding quality, that is, the accuracy of code assignment. Results: We expect the study to provide us with a measurement of the effectiveness of the CAC system compared to manual coding processes, both in terms of time use and coding quality. 
      Positive outcomes from this study will imply that CAC tools hold the potential to reduce the burden on health care staff and will have major implications for the adoption of artificial intelligence-based CAC innovations to improve coding practice. Results are expected to be published in summer 2024. Conclusions: The planned user study promises a greater understanding of the impact CAC systems might have on clinical coding in real-life settings, especially with regard to coding time and quality. Further, the study may add new insights on how to meaningfully exploit current clinical text mining capabilities, with a view to reducing the burden on clinical coders, thus lowering the barriers and paving a more sustainable path to the adoption of modern coding systems, such as the new ICD-11.
6.
  • Lamproudis, Anastasios, et al. (author)
  • De-identifying Norwegian Clinical Text using Resources from Swedish and Danish
  • 2024
  • In: AMIA Annual Symposium Proceedings.
  • Conference paper (peer-reviewed). Abstract:
    • The lack of relevant annotated datasets represents one key limitation in the application of Natural Language Processing techniques in a broad number of tasks, among them Protected Health Information (PHI) identification in Norwegian clinical text. In this work, the possibility of exploiting resources from Swedish, a language very closely related to Norwegian, is explored. The Swedish dataset is annotated with PHI information. Different processing and text augmentation techniques are evaluated, along with their impact on the final performance of the model. The augmentation techniques, such as injection and generation of both Norwegian and Scandinavian Named Entities into the Swedish training corpus, were shown to increase performance in the de-identification task for both Danish and Norwegian text. This trend was also confirmed by the evaluation of model performance on a sample of Norwegian gastro-surgical clinical text.
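The entity-injection augmentation mentioned in the abstract above can be pictured as swapping annotated PHI tokens in Swedish training sentences for Norwegian surface forms while keeping the labels. The Python sketch below is only an illustration under assumptions: the name list, the `First_Name` label string, and the token/label pair format are hypothetical, and the paper's actual procedure may differ.

```python
import random

# Hypothetical list of Norwegian first names used as injection material.
NORWEGIAN_FIRST_NAMES = ["Ingrid", "Lars", "Sigrid"]


def inject_names(tagged_tokens, rng):
    """Replace tokens labelled First_Name with a sampled Norwegian name.

    tagged_tokens is a list of (token, label) pairs; labels are kept unchanged
    so the augmented sentence can be used directly as training data.
    """
    augmented = []
    for token, label in tagged_tokens:
        if label == "First_Name":
            token = rng.choice(NORWEGIAN_FIRST_NAMES)
        augmented.append((token, label))
    return augmented


# Swedish example sentence ("The patient Anna is feeling well") with one PHI token.
sentence = [("Patienten", "O"), ("Anna", "First_Name"), ("mår", "O"), ("bra", "O")]
augmented_sentence = inject_names(sentence, random.Random(0))
print(augmented_sentence)
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible between runs.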
7.
  • Ngo, Phuong, et al. (author)
  • Deidentifying a Norwegian clinical corpus - An effort to create a privacy-preserving Norwegian large clinical language model
  • 2024
  • In: Proceedings of the CALD-pseudo Workshop at the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024. - : Association for Computational Linguistics. ; , pp. 37-43
  • Conference paper (peer-reviewed). Abstract:
    • This study discusses the methods and challenges of deidentifying and pseudonymizing Norwegian clinical text for research purposes. The results of the NorDeid tool for deidentification and pseudonymization on different types of protected health information were evaluated and discussed, as well as the extension of its functionality with regular expressions to identify specific types of sensitive information. This research used a clinical corpus of adult patients treated in a gastro-surgical department in Norway, which contains approximately nine million clinical notes. The study also highlights the challenges posed by the unique language and clinical terminology of Norway and emphasizes the importance of protecting privacy and the need for customized approaches to meet legal and research requirements.
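The abstract above mentions extending a de-identification tool with regular expressions for specific types of sensitive information. A minimal Python sketch of that general technique follows. The patterns and placeholder labels are assumptions and are not NorDeid's rules: one pattern targets 11-digit Norwegian national identity numbers (six date digits followed by five more digits), the other targets dates written as `dd.mm.yyyy`.

```python
import re

# Assumed patterns for illustration; real rule sets would need validation
# against actual clinical notes (checksums, other date formats, and so on).
PATTERNS = {
    "NATIONAL_ID": re.compile(r"\b\d{6}\s?\d{5}\b"),
    "DATE": re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b"),
}


def mask_sensitive(text):
    """Replace every pattern match with a placeholder naming its type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


# Example note fragment ("Admitted 03.02.2021, national ID 01029912345.").
print(mask_sensitive("Innlagt 03.02.2021, fnr 01029912345."))
```

Rule-based patterns like these suit identifiers with a fixed shape, while names and locations generally need a learned model or lexicon.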
8.
  • Tayefi, Maryam, et al. (author)
  • Challenges and opportunities beyond structured data in analysis of electronic health records
  • 2021
  • In: Wiley Interdisciplinary Reviews. - : Wiley. - 1939-5108 .- 1939-0068. ; 13:6
  • Research review (peer-reviewed). Abstract:
    • Electronic health records (EHR) contain a lot of valuable information about individual patients and the whole population. Besides structured data, unstructured data in EHRs can provide extra, valuable information but the analytics processes are complex, time-consuming, and often require excessive manual effort. Among unstructured data, clinical text and images are the two most popular and important sources of information. Advanced statistical algorithms in natural language processing, machine learning, deep learning, and radiomics have increasingly been used for analyzing clinical text and images. Although there exist many challenges that have not been fully addressed, which can hinder the use of unstructured data, there are clear opportunities for well-designed diagnosis and decision support tools that efficiently incorporate both structured and unstructured data for extracting useful information and provide better outcomes. However, access to clinical data is still very restricted due to data sensitivity and ethical issues. Data quality is also an important challenge in which methods for improving data completeness, conformity and plausibility are needed. Further, generalizing and explaining the result of machine learning models are important problems for healthcare, and these are open challenges. A possible solution to improve data quality and accessibility of unstructured data is developing machine learning methods that can generate clinically relevant synthetic data, and accelerating further research on privacy preserving techniques such as deidentification and pseudonymization of clinical text.