Augmenting a De-identification System for Swedish Clinical Text Using Open Resources and Deep Learning
- Berg, Hanna (author)
- Stockholm University, Department of Computer and Systems Sciences
- Dalianis, Hercules (author)
- Stockholm University, Department of Computer and Systems Sciences
- Linköping : Linköping University Electronic Press, 2019
- English.
In: Proceedings of the Workshop on NLP and Pseudonymisation. - Linköping : Linköping University Electronic Press. - ISBN 9789179299965 ; pp. 8-15
- Related link:
https://urn.kb.se/re...
Abstract
- Electronic patient records are produced in abundance every day, and there is a demand to use them for research or management purposes. The records, however, contain information in the free text that can identify the patient, and therefore tools are needed to identify this sensitive information. The aim is to compare two machine learning algorithms, Long Short-Term Memory (LSTM) and Conditional Random Fields (CRF), applied to a Swedish clinical data set annotated for de-identification. The results show that CRF performs better than deep learning with LSTM, with CRF achieving its best result, an F1 score of 0.91, when more data from within the same domain is added. Adding general open data, on the other hand, did not improve the results.
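The F1 score reported in the abstract is the harmonic mean of precision and recall. The following is a minimal sketch of how such a score can be computed at token level for a de-identification tagger. The label names (`NAME`, `DATE`, `LOCATION`, `O`), the function `f1_score`, and the toy sequences are hypothetical illustrations, not the annotation scheme, evaluation code, or data used by the authors.

```python
def f1_score(gold, predicted, outside="O"):
    """Token-level precision, recall and F1 over non-outside labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    # True positive: a sensitive token predicted with exactly the gold label.
    tp = sum(1 for g, p in zip(gold, predicted) if g != outside and g == p)
    n_pred = sum(1 for p in predicted if p != outside)  # tokens flagged by the system
    n_gold = sum(1 for g in gold if g != outside)       # tokens flagged by annotators
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy labels, not data from the paper.
gold = ["O", "NAME", "NAME", "O", "DATE", "O", "LOCATION"]
pred = ["O", "NAME", "O", "O", "DATE", "DATE", "LOCATION"]
precision, recall, f1 = f1_score(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this sketch a missed sensitive token lowers recall and a falsely flagged token lowers precision. Published evaluations may instead score whole entity spans, which this token-level version does not attempt.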
Subject headings
- NATURAL SCIENCES -- Computer and Information Sciences -- Information Systems (hsv//eng)
Keywords
- de-identification
- electronic health records
- machine learning
- Swedish
- Computer and Systems Sciences
Publication and content type
- ref (subject category)
- kon (subject category)