SwePub

onr:"swepub:oai:DiVA.org:su-147956"
 


Semi-supervised medical entity recognition : A study on Spanish and Swedish clinical corpora

Pérez, Alicia (author)
Weegar, Rebecka (author)
Stockholms universitet, Institutionen för data- och systemvetenskap
Casillas, Arantza (author)
Gojenola, Koldo (author)
Oronoz, Maite (author)
Dalianis, Hercules (author)
Stockholms universitet, Institutionen för data- och systemvetenskap
Elsevier BV, 2017
English.
In: Journal of Biomedical Informatics. Elsevier BV. ISSN 1532-0464, 1532-0480. Vol. 71, pp. 16-30
  • Journal article (peer-reviewed)
Abstract
Objective: The goal of this study is to investigate entity recognition in Electronic Health Records (EHRs), focusing on Spanish and Swedish. A robust representation of the entities is of particular importance. In our case, we used unsupervised methods to generate such representations.

Methods: The significance of this work rests on its experimental design. The experiments were carried out under the same conditions for both languages. Several classification approaches were explored: maximum probability, CRF, Perceptron and SVM. The classifiers were enhanced by means of ensembles of semantic spaces and ensembles of Brown trees. To mitigate data sparsity without a significant increase in the dimension of the decision space, we propose clustered representations: hierarchical Brown clustering, represented by trees, and vector quantization of each semantic space.

Results: The semi-supervised approaches significantly improved on standard supervised techniques for both languages. Moreover, clustering the semantic spaces contributed to the quality of the entity recognition while keeping the dimension of the feature space two orders of magnitude lower than when using the semantic spaces directly.

Conclusions: The contributions of this study are: (a) a set of thorough experiments that enable comparisons of the influence of different types of features on different classifiers, exploring two languages other than English; and (b) the use of ensembles of clusters of Brown trees and semantic spaces on EHRs to tackle the scarcity of available annotated data.
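The two clustered representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the plain k-means quantizer, the function names, the prefix lengths and the toy two-dimensional vectors are all assumptions made for illustration. It shows how each word vector in a semantic space can be replaced by a single cluster ID, and how a Brown tree path can be turned into prefix features of several granularities.

```python
import random
from typing import Dict, List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec: Vector, centroids: List[List[float]]) -> int:
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(vec, centroids[i]))


def kmeans_codebook(vectors: List[Vector], k: int,
                    iterations: int = 20, seed: int = 0) -> List[List[float]]:
    """Plain k-means; the returned centroids act as the quantization codebook."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        buckets: List[List[Vector]] = [[] for _ in range(k)]
        for v in vectors:
            buckets[nearest(v, centroids)].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def quantize(embeddings: Dict[str, Vector], k: int, seed: int = 0) -> Dict[str, int]:
    """Map each word to the ID of its nearest codebook centroid."""
    words = sorted(embeddings)
    centroids = kmeans_codebook([embeddings[w] for w in words], k, seed=seed)
    return {w: nearest(embeddings[w], centroids) for w in words}


def brown_prefix_features(bit_path: str, lengths: Sequence[int] = (2, 4, 6)) -> List[str]:
    """Prefixes of a Brown tree bit path; shorter prefixes are coarser clusters."""
    return [f"brown{n}={bit_path[:n]}" for n in lengths if len(bit_path) >= n]


# Toy data (invented): two symptom-like and two drug-like words.
toy_space = {
    "feber": [0.8, 0.2],
    "hosta": [0.9, 0.1],
    "paracetamol": [0.1, 0.9],
    "ibuprofen": [0.2, 0.8],
}
clusters = quantize(toy_space, k=2)
prefixes = brown_prefix_features("010110", lengths=(2, 4))
```

In this sketch a classifier would receive one categorical cluster ID per word and semantic space instead of one real-valued feature per vector dimension, which is the kind of dimensionality saving the abstract refers to.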

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences (hsv//eng)

Keywords

Medical entity recognition
Supervised and unsupervised learning
Health records
Computer and Systems Sciences

Publication and Content Type

ref (subject category)
art (subject category)
