SwePub
Sök i LIBRIS databas

  Extended search

onr:"swepub:oai:gup.ub.gu.se/211857"
 

Search: onr:"swepub:oai:gup.ub.gu.se/211857" > Towards benchmarkin...

  • 1 of 1
  • Previous record
  • Next record
  •    To hitlist

Towards benchmarking feature subset selection methods for software fault prediction

Afzal, Wasif (author)
Mälardalens högskola,Inbyggda system,Bahria University, Islamabad, Pakistan
Torkar, Richard, 1971 (author)
Blekinge Tekniska Högskola,Gothenburg University,Göteborgs universitet,Institutionen för data- och informationsteknik (GU),Department of Computer Science and Engineering (GU),Blekinge Institute of Technology, Karlskrona, Sweden; Chalmers University of Technology, Sweden,Institutionen för programvaruteknik
 (creator_code:org_t)
Berlin, Heidelberg : Springer, 2016
2016
English.
In: Computational Intelligence and Quantitative Software Engineering. - Berlin, Heidelberg : Springer. - 9783319259642 - 9783319259628 ; , s. 33-58
  • Book chapter (peer-reviewed)
Abstract Subject headings
Close  
  • Despite the general acceptance that software engineering datasets often contain noisy, irrele- vant or redundant variables, very few benchmark studies of feature subset selection (FSS) methods on real-life data from software projects have been conducted. This paper provides an empirical comparison of state-of-the-art FSS methods: information gain attribute ranking (IG); Relief (RLF); principal com- ponent analysis (PCA); correlation-based feature selection (CFS); consistency-based subset evaluation (CNS); wrapper subset evaluation (WRP); and an evolutionary computation method, genetic programming (GP), on five fault prediction datasets from the PROMISE data repository. For all the datasets, the area under the receiver operating characteristic curve—the AUC value averaged over 10-fold cross- validation runs—was calculated for each FSS method-dataset combination before and after FSS. Two diverse learning algorithms, C4.5 and na ̈ıve Bayes (NB) are used to test the attribute sets given by each FSS method. The results show that although there are no statistically significant differences between the AUC values for the different FSS methods for both C4.5 and NB, a smaller set of FSS methods (IG, RLF, GP) consistently select fewer attributes without degrading classification accuracy. We conclude that in general, FSS is beneficial as it helps improve classification accuracy of NB and C4.5. There is no single best FSS method for all datasets but IG, RLF and GP consistently select fewer attributes without degrading classification accuracy within statistically significant boundaries.

Subject headings

NATURVETENSKAP  -- Data- och informationsvetenskap -- Programvaruteknik (hsv//swe)
NATURAL SCIENCES  -- Computer and Information Sciences -- Software Engineering (hsv//eng)
TEKNIK OCH TEKNOLOGIER  -- Elektroteknik och elektronik -- Datorsystem (hsv//swe)
ENGINEERING AND TECHNOLOGY  -- Electrical Engineering, Electronic Engineering, Information Engineering -- Computer Systems (hsv//eng)
NATURVETENSKAP  -- Data- och informationsvetenskap (hsv//swe)
NATURAL SCIENCES  -- Computer and Information Sciences (hsv//eng)

Keyword

benchmarking
feature subset
selection
fault prediction
software
software fault prediction
Empirical
Fault prediction
Feature subset selection

Publication and Content Type

ref (subject category)
kap (subject category)

Find in a library

To the university's database

  • 1 of 1
  • Previous record
  • Next record
  •    To hitlist

Find more in SwePub

By the author/editor
Afzal, Wasif
Torkar, Richard, ...
About the subject
NATURAL SCIENCES
NATURAL SCIENCES
and Computer and Inf ...
and Software Enginee ...
ENGINEERING AND TECHNOLOGY
ENGINEERING AND ...
and Electrical Engin ...
and Computer Systems
NATURAL SCIENCES
NATURAL SCIENCES
and Computer and Inf ...
Articles in the publication
Computational In ...
By the university
University of Gothenburg
Mälardalen University
Blekinge Institute of Technology

Search outside SwePub

Kungliga biblioteket hanterar dina personuppgifter i enlighet med EU:s dataskyddsförordning (2018), GDPR. Läs mer om hur det funkar här.
Så här hanterar KB dina uppgifter vid användning av denna tjänst.

 
pil uppåt Close

Copy and save the link in order to return to this view