SwePub

onr:"swepub:oai:DiVA.org:su-190020"
 


De-Identifying Swedish EHR Text Using Public Resources in the General Domain

Chomutare, Taridzo (author)
Norwegian Centre for E-health Research, Norway
Yigzaw, Kassaye Yitbarek (author)
Norwegian Centre for E-health Research, Norway
Budrionis, Andrius (author)
Norwegian Centre for E-health Research, Norway
Makhlysheva, Alexandra (author)
Norwegian Centre for E-health Research, Norway
Godtliebsen, Fred (author)
Norwegian Centre for E-health Research, Norway; UiT - The Arctic University of Norway, Norway
Dalianis, Hercules (author)
Stockholms universitet,Institutionen för data- och systemvetenskap,Norwegian Centre for E-health Research, Norway
Amsterdam : IOS Press, 2020
English.
In: Digital Personalized Health and Medicine. - Amsterdam : IOS Press. - ISBN 9781643680828, 9781643680835, pp. 148-152
  • Conference paper (peer-reviewed)
Abstract
Developing rule-based models or training machine learning-based models for de-identifying electronic health record (EHR) clinical notes normally requires sensitive data, which presents important problems for patient privacy. In this study, we add non-sensitive public datasets to EHR training data: (i) scientific medical text and (ii) Wikipedia word vectors. The data, all in Swedish, is used to train a deep learning model using recurrent neural networks. Tests on pseudonymized Swedish EHR clinical notes showed that precision and recall improved from 55.62% and 80.02%, respectively, with the base EHR embedding layer to 85.01% and 87.15% when Wikipedia word vectors were added. These results suggest that non-sensitive text from the general domain can be used to train robust models for de-identifying Swedish clinical text, which could be useful in cases where the data is sensitive and the language is low-resource.

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences (hsv//eng)

Keywords

EHR
clinical text
de-identification
deep learning
wiki word vectors
Computer and Systems Sciences

Publication and Content Type

ref (peer-reviewed)
kon (conference paper)
