Obtaining Accurate and Comprehensible Data Mining Models: An Evolutionary Approach





Johansson, Ulf (author): Högskolan i Borås, Linköpings universitet, Institutionen för datavetenskap, Tekniska högskolan, Institutionen Handels- och IT-högskolan

Niklasson, Lars (thesis advisor)

Ziemke, Tom (thesis advisor)

Rögnvaldsson, Thorsteinn, Professor (opponent): Sektionen för informationsvetenskap, Data- och Elektroteknik, Högskolan i Halmstad


ISBN 9789185715343
Institutionen för datavetenskap, 2007
English, 254 pp.
Series: Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524; No. 1086

Related links: https://liu.diva-por... (primary) (Raw object); https://hb.diva-port... (primary) (Raw object); https://urn.kb.se/re...; https://urn.kb.se/re...

Doctoral thesis (other academic/artistic)

Abstract

When performing predictive data mining, the use of ensembles is claimed to virtually guarantee increased accuracy compared to the use of single models. Unfortunately, the problem of how to maximize ensemble accuracy is far from solved. In particular, the relationship between ensemble diversity and accuracy is not completely understood, making it hard to efficiently utilize diversity for ensemble creation. Furthermore, most high-accuracy predictive models are opaque, i.e. it is not possible for a human to follow and understand the logic behind a prediction. For some domains, this is unacceptable, since models need to be comprehensible. To obtain comprehensibility, accuracy is often sacrificed by using simpler but transparent models; a trade-off termed the accuracy vs. comprehensibility trade-off. With this trade-off in mind, several researchers have suggested rule extraction algorithms, where opaque models are transformed into comprehensible models, keeping an acceptable accuracy.
In this thesis, two novel algorithms based on Genetic Programming are suggested. The first algorithm (GEMS) is used for ensemble creation, and the second (G-REX) is used for rule extraction from opaque models. The main property of GEMS is the ability to combine smaller ensembles and individual models in an almost arbitrary way. Moreover, GEMS can use base models of any kind and the optimization function is very flexible, easily permitting inclusion of, for instance, diversity measures. In the experimentation, GEMS obtained accuracies higher than both straightforward design choices and published results for Random Forests and AdaBoost. The key quality of G-REX is the inherent ability to explicitly control the accuracy vs. comprehensibility trade-off. Compared to the standard tree inducers C5.0 and CART, and some well-known rule extraction algorithms, rules extracted by G-REX are significantly more accurate and compact.
Most importantly, G-REX is thoroughly evaluated and found to meet all relevant evaluation criteria for rule extraction algorithms, thus establishing G-REX as the algorithm to benchmark against.
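
Illustrative note (not part of the catalog record): the accuracy vs. comprehensibility trade-off described in the abstract can be sketched as a fitness function that rewards correct predictions and penalizes model size. This is a minimal sketch under stated assumptions, not the thesis's actual G-REX fitness function: the names `Candidate`, `fitness`, and `size_penalty`, the toy data, and the choice to score against true labels rather than against an opaque model's predictions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Row = Tuple[float, float]


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate rule set: a predictor plus a size measure."""
    name: str
    predict: Callable[[Row], int]
    size: int  # assumed complexity measure, e.g. nodes in the rule tree


def accuracy(candidate: Candidate, X: Sequence[Row], y: Sequence[int]) -> float:
    """Fraction of rows for which the candidate predicts the given label."""
    hits = sum(1 for row, label in zip(X, y) if candidate.predict(row) == label)
    return hits / len(y)


def fitness(candidate: Candidate, X: Sequence[Row], y: Sequence[int],
            size_penalty: float) -> float:
    """Accuracy minus a size penalty; a larger penalty favors smaller models."""
    return accuracy(candidate, X, y) - size_penalty * candidate.size


# Toy data (assumption): the label is 1 when both features exceed 0.5.
X = [(0.1, 0.9), (0.7, 0.8), (0.6, 0.2), (0.9, 0.9), (0.2, 0.1), (0.8, 0.6)]
y = [0, 1, 0, 1, 0, 1]

# Two hand-written candidates standing in for evolved rule trees.
small = Candidate("small", lambda r: int(r[0] > 0.5), size=3)
large = Candidate("large", lambda r: int(r[0] > 0.5 and r[1] > 0.5), size=7)


def best(size_penalty: float) -> str:
    """Name of the candidate with the highest fitness for a given penalty."""
    return max((small, large), key=lambda c: fitness(c, X, y, size_penalty)).name


if __name__ == "__main__":
    for penalty in (0.01, 0.05):
        print(penalty, best(penalty))
```

In this toy setting, raising `size_penalty` shifts the preferred candidate from the larger, more accurate rule to the smaller, less accurate one, which is the kind of explicit control over the trade-off that the abstract attributes to G-REX.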

Subject headings

NATURVETENSKAP -- Data- och informationsvetenskap -- Datavetenskap (hsv//swe)
NATURAL SCIENCES -- Computer and Information Sciences -- Computer Sciences (hsv//eng)
NATURVETENSKAP -- Data- och informationsvetenskap (hsv//swe)
NATURAL SCIENCES -- Computer and Information Sciences (hsv//eng)

Keywords

Rule extraction
Ensembles
Data mining
Genetic programming
Artificial neural networks
Computer science
Datalogi

Publication and Content Type

vet (subject category)
dok (subject category)


