SwePub
Results for search "LAR1:hb ;lar1:(his);pers:(Niklasson Lars)"

Search: LAR1:hb > Högskolan i Skövde > Niklasson Lars

  • Results 1-8 of 8
1.
  • Johansson, Ulf, et al. (author)
  • Evolving decision trees using oracle guides
  • 2009
  • In: 2009 IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2009) Proceedings. IEEE. ISBN 9781424427659; pp. 238-244
  • Conference paper (peer-reviewed)
    • Some data mining problems require predictive models to be not only accurate but also comprehensible. Comprehensibility enables human inspection and understanding of the model, making it possible to trace why individual predictions are made. Since most high-accuracy techniques produce opaque models, accuracy is, in practice, regularly sacrificed for comprehensibility. One frequently studied technique, often able to reduce this accuracy vs. comprehensibility tradeoff, is rule extraction, i.e., the activity where another, transparent, model is generated from the opaque one. In this paper, it is argued that techniques producing transparent models, either directly from the dataset or from an opaque model, could benefit from using an oracle guide. In the experiments, genetic programming is used to evolve decision trees, and a neural network ensemble is used as the oracle guide. More specifically, the datasets used by genetic programming when evolving the decision trees consist of several different combinations of the original training data and “oracle data”, i.e., training or test data instances, together with corresponding predictions from the oracle. In total, seven different ways of combining regular training data with oracle data were evaluated, and the results, obtained on 26 UCI datasets, clearly show that the use of an oracle guide improved the performance. As a matter of fact, trees evolved using training data only had the worst test set accuracy of all setups evaluated. Furthermore, statistical tests show that two setups, both using the oracle guide, produced significantly more accurate trees than the setup using training data only.
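The idea of mixing regular training data with "oracle data" can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_guided_dataset`, its three flags, and the callable `oracle` (standing in for the neural network ensemble) are assumptions made here. The abstract does not list the seven setups that were evaluated, so no correspondence between flag combinations and those setups is claimed.

```python
from typing import Callable, List, Sequence, Tuple

Instance = Sequence[float]
LabeledData = List[Tuple[Instance, int]]


def build_guided_dataset(
    train_x: Sequence[Instance],
    train_y: Sequence[int],
    test_x: Sequence[Instance],
    oracle: Callable[[Instance], int],
    use_true_labels: bool = True,
    use_oracle_train: bool = True,
    use_oracle_test: bool = True,
) -> LabeledData:
    """Combine regular training data with oracle-labeled instances.

    `oracle` is a hypothetical stand-in for the opaque model; it maps an
    instance to a predicted class label.
    """
    data: LabeledData = []
    if use_true_labels:
        # Regular training data: instances paired with their true labels.
        data.extend(zip(train_x, train_y))
    if use_oracle_train:
        # Oracle data: training instances labeled by the oracle's predictions.
        data.extend((x, oracle(x)) for x in train_x)
    if use_oracle_test:
        # Oracle data: test instances labeled by the oracle's predictions.
        # True test labels are never used here.
        data.extend((x, oracle(x)) for x in test_x)
    return data
```

The resulting list would then be handed to whatever tree inducer is used (genetic programming in the paper) in place of the plain training set.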
2.
  • Johansson, Ulf, et al. (author)
  • Genetic rule extraction optimizing brier score
  • 2010
  • In: Proceedings of the 12th Annual Genetic and Evolutionary Computation Conference, GECCO '10. New York: Association for Computing Machinery (ACM). ISBN 9781450300728; pp. 1007-1014
  • Conference paper (peer-reviewed)
    • Most highly accurate predictive modeling techniques produce opaque models. When comprehensible models are required, rule extraction is sometimes used to generate a transparent model based on the opaque one. Naturally, the extracted model should be as similar as possible to the opaque model. This criterion, called fidelity, is therefore a key part of the optimization function in most rule extraction algorithms. To the best of our knowledge, all existing rule extraction algorithms targeting fidelity use 0/1 fidelity, i.e., maximize the number of identical classifications. In this paper, we suggest and evaluate a rule extraction algorithm utilizing a more informed fidelity criterion. More specifically, the novel algorithm, which is based on genetic programming, minimizes the difference in probability estimates between the extracted and the opaque models by using the generalized Brier score as fitness function. Experimental results from 26 UCI data sets show that the suggested algorithm obtained considerably higher accuracy and significantly better AUC than both the exact same rule extraction algorithm maximizing 0/1 fidelity and the standard tree inducer J48. Somewhat surprisingly, rule extraction using the more informed fidelity metric normally resulted in less complex models, ensuring that the improved predictive performance was not achieved at the expense of comprehensibility.
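The difference between 0/1 fidelity and a Brier-score-based fidelity can be illustrated with a small sketch. It assumes the usual multi-class form of the Brier score (mean over instances of the summed squared differences between two probability vectors), applied here to the extracted model's estimates versus the opaque model's estimates. The function names are hypothetical, and the paper's exact fitness function may include further terms not described in the abstract.

```python
from typing import Sequence

Probs = Sequence[Sequence[float]]  # one probability vector per instance


def brier_fidelity(extracted: Probs, opaque: Probs) -> float:
    """Mean summed squared difference in class probabilities (lower is better)."""
    total = 0.0
    for p_row, q_row in zip(extracted, opaque):
        total += sum((p - q) ** 2 for p, q in zip(p_row, q_row))
    return total / len(extracted)


def zero_one_fidelity(extracted: Probs, opaque: Probs) -> float:
    """Fraction of instances where both models predict the same class."""
    def predicted(row: Sequence[float]) -> int:
        # Index of the most probable class.
        return max(range(len(row)), key=row.__getitem__)

    same = sum(predicted(p) == predicted(q) for p, q in zip(extracted, opaque))
    return same / len(extracted)
```

Two extracted models can classify every instance exactly like the opaque model, and so be indistinguishable under 0/1 fidelity, while differing considerably in how closely their probability estimates follow it. The Brier-based criterion separates them.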
3.
  • Johansson, Ulf, et al. (author)
  • Inconsistency: Friend or Foe
  • 2007
  • In: The International Joint Conference on Neural Networks. IEEE Press. ISBN 142441380X, 9781424413805, 9781424413799; pp. 1383-1388
  • Book chapter (other academic/artistic)
    • One way of obtaining accurate yet comprehensible models is to extract rules from opaque predictive models. When evaluating rule extraction algorithms, one frequently used criterion is consistency; i.e. the algorithm must produce similar rules every time it is applied to the same problem. Rule extraction algorithms based on evolutionary algorithms are, however, inherently inconsistent, something that is regarded as their main drawback. In this paper, we argue that consistency is an overvalued criterion, and that inconsistency can even be beneficial in some situations. The study contains two experiments, both using publicly available data sets, where rules are extracted from neural network ensembles. In the first experiment, it is shown that it is normally possible to extract several different rule sets from an opaque model, all having high and similar accuracy. The implication is that consistency in that perspective is useless; why should one specific rule set be considered superior? Clearly, it should instead be regarded as an advantage to obtain several accurate and comprehensible descriptions of the relationship. In the second experiment, rule extraction is used for probability estimation. More specifically, an ensemble of extracted trees is used in order to obtain probability estimates. Here, it is exactly the inconsistency of the rule extraction algorithm that makes the suggested approach possible.
4.
  • Johansson, Ulf, et al. (author)
  • Increasing Rule Extraction Accuracy by Post-processing GP Trees
  • 2008
  • In: Proceedings of the Congress on Evolutionary Computation. IEEE. ISBN 9781424418237, 9781424418220; pp. 3010-3015
  • Conference paper (peer-reviewed)
    • Genetic programming (GP) is a very general and efficient technique, often capable of outperforming more specialized techniques on a variety of tasks. In this paper, we suggest a straightforward novel algorithm for post-processing of GP classification trees. The algorithm iteratively, one node at a time, searches for possible modifications that would result in higher accuracy. More specifically, for each split, the algorithm evaluates every possible constant value and chooses the best. With this design, the post-processing algorithm can only increase training accuracy, never decrease it. In this study, we apply the suggested algorithm to GP trees extracted from neural network ensembles. Experimentation, using 22 UCI datasets, shows that the post-processing results in higher test set accuracies on a large majority of datasets. As a matter of fact, for two setups of three evaluated, the increase in accuracy is statistically significant.
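The post-processing step described in the abstract can be sketched roughly as below. This is an illustrative reconstruction under stated assumptions, not the paper's algorithm: the tree representation (`Leaf`, `Split`), the restriction to numeric `<=` splits, and the choice of candidate constants (the values of the split feature observed in the training data) are all assumptions made here. A new constant is kept only when it strictly improves training accuracy, which is why training accuracy can never decrease.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence, Union


@dataclass
class Leaf:
    label: int


@dataclass
class Split:
    feature: int       # index of the feature tested at this node
    threshold: float   # constant in the test x[feature] <= threshold
    left: "Node"       # branch taken when the test is true
    right: "Node"      # branch taken when the test is false


Node = Union[Leaf, Split]


def predict(node: Node, x: Sequence[float]) -> int:
    while isinstance(node, Split):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def accuracy(tree: Node, X, y) -> float:
    return sum(predict(tree, x) == label for x, label in zip(X, y)) / len(y)


def iter_splits(node: Node) -> Iterator[Split]:
    if isinstance(node, Split):
        yield node
        yield from iter_splits(node.left)
        yield from iter_splits(node.right)


def post_process(tree: Node, X, y) -> float:
    """Tune split constants one node at a time; return final training accuracy."""
    best = accuracy(tree, X, y)
    improved = True
    while improved:  # repeat until a full pass changes nothing
        improved = False
        for node in iter_splits(tree):
            best_threshold = node.threshold
            # Assumed candidate set: observed values of the split feature.
            for candidate in sorted({x[node.feature] for x in X}):
                node.threshold = candidate
                acc = accuracy(tree, X, y)
                if acc > best:  # accept strict improvements only
                    best, best_threshold, improved = acc, candidate, True
            node.threshold = best_threshold
    return best
```

The loop terminates because each accepted change strictly raises training accuracy and the set of candidate constants is finite.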
5.
  • Johansson, Ulf, et al. (author)
  • The Importance of Diversity in Neural Network Ensembles: An Empirical Investigation
  • 2007
  • In: IJCNN 2007 Conference Proceedings. IEEE. ISBN 9781424413799, 9781424413805, 142441380X; pp. 661-666
  • Conference paper (peer-reviewed)
    • When designing ensembles, it is almost an axiom that the base classifiers must be diverse in order for the ensemble to generalize well. Unfortunately, there is no clear definition of the key term diversity, leading to several diversity measures and many, more or less ad hoc, methods for diversity creation in ensembles. In addition, no specific diversity measure has been shown to have a high correlation with test set accuracy. The purpose of this paper is to empirically evaluate ten different diversity measures, using neural network ensembles and 11 publicly available data sets. The main result is that all diversity measures evaluated, in this study too, show low or very low correlation with test set accuracy. Having said that, two measures, double fault and difficulty, show slightly higher correlations than the other measures. The study furthermore shows that the correlation between accuracy measured on training or validation data and test set accuracy is also rather low. These results challenge ensemble design techniques where diversity is explicitly maximized or where ensemble accuracy on a hold-out set is used for optimization.
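For readers unfamiliar with the two measures singled out above, the sketch below computes them from a table of per-classifier correctness. It follows the definitions commonly used in the ensemble literature (double fault: the proportion of instances misclassified by both members of a pair, averaged over all pairs; difficulty: the variance, across instances, of the proportion of classifiers that are correct). The function names and the use of population variance are assumptions made here; the paper's exact formulations are not given in the abstract.

```python
from itertools import combinations
from statistics import pvariance
from typing import List, Sequence

# correct[c][i] is True when classifier c classifies instance i correctly.
Correctness = Sequence[Sequence[bool]]


def double_fault(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Proportion of instances that both classifiers get wrong."""
    both_wrong = sum(1 for a, b in zip(correct_a, correct_b) if not a and not b)
    return both_wrong / len(correct_a)


def mean_double_fault(correct: Correctness) -> float:
    """Double fault averaged over all pairs of ensemble members."""
    pairs = list(combinations(correct, 2))
    return sum(double_fault(a, b) for a, b in pairs) / len(pairs)


def difficulty(correct: Correctness) -> float:
    """Variance over instances of the fraction of classifiers that are correct."""
    n_classifiers = len(correct)
    proportions: List[float] = [
        sum(column) / n_classifiers for column in zip(*correct)
    ]
    return pvariance(proportions)
```

Both functions take only the correctness table, so they can be evaluated on training, validation, or test data alike.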
6.
  • Johansson, Ulf, et al. (author)
  • Using Imaginary Ensembles to Select GP Classifiers
  • 2010
  • In: Genetic Programming: 13th European Conference, EuroGP 2010, Istanbul, Turkey, April 7-9, 2010, Proceedings. Berlin, Heidelberg: Springer. ISBN 9783642121470, 9783642121487, 3642121470; pp. 278-288
  • Conference paper (peer-reviewed)
    • When predictive modeling requires comprehensible models, most data miners will use specialized techniques producing rule sets or decision trees. This study, however, shows that genetically evolved decision trees may very well outperform the more specialized techniques. The proposed approach evolves a number of decision trees and then uses one of several suggested selection strategies to pick one specific tree from that pool. The inherent inconsistency of evolution makes it possible to evolve each tree using all data, and still obtain somewhat different models. The main idea is to use these quite accurate and slightly diverse trees to form an imaginary ensemble, which is then used as a guide when selecting one specific tree. Simply put, the tree classifying the largest number of instances identically to the ensemble is chosen. In the experimentation, using 25 UCI data sets, two selection strategies obtained significantly higher accuracy than the standard rule inducer J48.
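The selection rule summarized in the abstract ("the tree classifying the largest number of instances identically to the ensemble is chosen") can be sketched as follows. This is a simplified illustration: the function name, the plain majority vote, and the tie handling are assumptions made here, and the abstract does not say which instances (training, test, or both) the agreement is measured on, nor how the several selection strategies differ.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def select_by_ensemble_agreement(
    predictions: Sequence[Sequence[int]],
) -> Tuple[int, List[int]]:
    """Pick the tree that agrees most often with the majority vote of all trees.

    predictions[t][i] is the label tree t predicts for instance i.
    Returns the index of the selected tree and the ensemble's predictions.
    """
    # The "imaginary ensemble": a majority vote over all evolved trees.
    # Vote ties go to the label encountered first (an arbitrary choice).
    ensemble = [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]

    # Count, per tree, how many instances it labels like the ensemble.
    agreement = [
        sum(p == e for p, e in zip(tree_predictions, ensemble))
        for tree_predictions in predictions
    ]

    # Agreement ties go to the tree with the lowest index.
    best_tree = max(range(len(predictions)), key=agreement.__getitem__)
    return best_tree, ensemble
```

Because the ensemble is used only as a guide during selection, the final model is still a single comprehensible tree.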
7.
  • König, Rikard, et al. (author)
  • Instance Ranking Using Ensemble Spread
  • 2007
  • In: Proceedings of the 2007 International Conference on Data Mining. CSREA Press. ISBN 1601320310, 9781601320315; pp. 73-78
  • Conference paper (peer-reviewed)
    • This paper investigates a technique for predicting ensemble uncertainty originally proposed in the weather forecasting domain. The overall purpose is to find out if the technique can be modified to operate on a wider range of regression problems. The main difference, when moving outside the weather forecasting domain, is the lack of the extensive statistical knowledge readily available for weather forecasting. In this study, three different modifications to the original technique are suggested. In the experiments, the modifications are compared to each other and to two straightforward techniques, using ten publicly available regression problems. Three of the techniques show promising results, especially one modification based on genetic algorithms. The suggested modification can accurately determine whether the confidence in ensemble predictions should be high or low.
8.
  • König, Rikard, et al. (author)
  • Using Genetic Programming to Increase Rule Quality
  • 2008
  • In: Proceedings of the Twenty-First International FLAIRS Conference (FLAIRS 2008). AAAI Press. ISBN 9781577353652; pp. 288-293
  • Conference paper (peer-reviewed)
    • Rule extraction is a technique aimed at transforming highly accurate opaque models like neural networks into comprehensible models without losing accuracy. G-REX is a rule extraction technique based on Genetic Programming that has previously performed well in several studies. This study has two objectives: to evaluate two new fitness functions for G-REX and to show how G-REX can be used as a rule inducer. The fitness functions are designed to optimize two alternative quality measures, area under ROC curves and a new comprehensibility measure called brevity. Rules with good brevity classify typical instances with few and simple tests and use complex conditions only for atypical examples. Experiments using thirteen publicly available data sets show that the two novel fitness functions succeeded in increasing brevity and area under the ROC curve without sacrificing accuracy. When compared to a standard decision tree algorithm, G-REX achieved slightly higher accuracy, but also added quality to the rules by increasing their AUC or brevity significantly.
Type of publication
conference paper (7)
book chapter (1)
Type of content
peer-reviewed (7)
other academic/artistic (1)
Author/editor
Johansson, Ulf (8)
König, Rikard (6)
Löfström, Tuve (3)
University
Högskolan i Borås (8)
Jönköping University (3)
Language
English (8)
Research subject (UKÄ/SCB)
Natural sciences (8)
