Discovering, selecting and exploiting feature sequence records of study participants for the classification of epidemiological data on hepatic steatosis
- Hielscher, Tommy (author)
- Völzke, Henry (author)
- Papapetrou, Panagiotis (author)
- Stockholms universitet, Institutionen för data- och systemvetenskap
- Spiliopoulou, Myra (author)
- 2018-04-09
- 2018
- English.
Part of: Proceedings of the 33rd Annual ACM Symposium on Applied Computing. - New York, NY, USA : Association for Computing Machinery (ACM). - 9781450351911 ; pp. 6-13
- Related link:
- https://urn.kb.se/re...
- https://doi.org/10.1...
Abstract
- In longitudinal epidemiological studies, participants undergo repeated medical examinations and are thus represented by a potentially large number of short examination outcome sequences. Some of those sequences may contain important information in various forms, such as patterns, with respect to the disease under study, while others may be on features of little relevance to the outcome. In this work, we propose a framework for Discovery, Selection and Exploitation (DiSelEx) of longitudinal epidemiological data, aiming to identify informative patterns among these sequences. DiSelEx combines sequence clustering with supervised learning to identify sequence groups that contribute to class separation. Newly derived and old features are evaluated and selected according to their redundancy and informativeness regarding the target variable. The selected feature set is then used to learn a classification model on the study data. We evaluate DiSelEx on cohort participants for the disorder "hepatic steatosis" and report on the impact on predictive performance when using sequential data in comparison to utilizing only the basic classifier.
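The pipeline outlined in the abstract (cluster the sequences, treat cluster membership as a derived feature, keep features that are informative and not redundant, then classify) can be illustrated with a small sketch. This is not the DiSelEx implementation: the abstract does not name the clustering algorithm, the selection criterion, or the classifier, so plain k-means, mutual information, and a majority-vote lookup table are stand-ins chosen for brevity. All names, thresholds, and the synthetic data are hypothetical.

```python
import math
from collections import Counter

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def kmeans(seqs, k, iters=20):
    """Plain k-means on equal-length sequences (stand-in for sequence clustering)."""
    centers = [list(seqs[0])]
    while len(centers) < k:  # farthest-point initialisation, deterministic
        centers.append(list(max(seqs, key=lambda s: min(sq_dist(s, c) for c in centers))))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(s, centers[c])) for s in seqs]
        for c in range(k):
            members = [s for s, lab in zip(seqs, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def mutual_information(xs, ys):
    """Mutual information in bits between two discrete variables."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def select_features(candidates, y, min_relevance=0.1, max_redundancy=0.9):
    """Keep features informative about y and not redundant with ones already kept.
    Both thresholds are hypothetical values for this toy example."""
    ranked = sorted(candidates, key=lambda name: -mutual_information(candidates[name], y))
    selected = []
    for name in ranked:
        if mutual_information(candidates[name], y) < min_relevance:
            continue  # carries too little information about the target
        if any(mutual_information(candidates[name], candidates[s]) > max_redundancy
               for s in selected):
            continue  # nearly duplicates an already selected feature
        selected.append(name)
    return selected

def fit_majority_classifier(rows, y):
    """Lookup table: predict the majority class seen for each feature combination."""
    votes = {}
    for row, label in zip(rows, y):
        votes.setdefault(row, Counter())[label] += 1
    default = Counter(y).most_common(1)[0][0]
    table = {row: c.most_common(1)[0][0] for row, c in votes.items()}
    return lambda row: table.get(row, default)

# Synthetic cohort: 40 participants, 3 examination waves, binary outcome.
n = 40
y = [i % 2 for i in range(n)]
jit = [0.05 * (i % 3) for i in range(n)]  # small deterministic measurement jitter
sequences = {
    # rises over time only for class 1
    "liver_marker": [[1 + jit[i], 1 + y[i] + jit[i], 1 + 2 * y[i] + jit[i]] for i in range(n)],
    # rescaled copy of the same marker, hence redundant
    "liver_marker_x2": [[2 * (1 + jit[i]), 2 * (1 + y[i] + jit[i]), 2 * (1 + 2 * y[i] + jit[i])]
                        for i in range(n)],
    # two flat levels that are unrelated to the outcome
    "unrelated": [[5 + (i // 2) % 2 + jit[i]] * 3 for i in range(n)],
}

# Discovery: one derived feature (cluster id) per sequence feature.
derived = {name: kmeans(seqs, k=2) for name, seqs in sequences.items()}
# Selection: relevance and redundancy filtering.
selected = select_features(derived, y)
# Exploitation: learn a classifier on the selected derived features.
rows = [tuple(derived[name][i] for name in selected) for i in range(n)]
predict = fit_majority_classifier(rows, y)
accuracy = sum(predict(r) == label for r, label in zip(rows, y)) / n
print(selected, accuracy)
```

On this toy data only `liver_marker` survives selection: the rescaled copy is dropped as redundant and the unrelated feature as uninformative. The accuracy is computed on the training data and is therefore only a sanity check, not an estimate of predictive performance.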
Subject headings
- NATURVETENSKAP -- Data- och informationsvetenskap -- Systemvetenskap, informationssystem och informatik (hsv//swe)
- NATURAL SCIENCES -- Computer and Information Sciences -- Information Systems (hsv//eng)
Keywords
- medical data mining
- patient similarity
- time-series clustering
- feature selection
- classification
- epidemiological studies
- hepatic steatosis
- Computer and Systems Sciences (data- och systemvetenskap)
Publication and content type
- ref (subject category)
- kon (subject category)