SwePub

Classification of microarrays : synergistic effects between normalization, gene selection and machine learning

Önskog, Jenny (author)
Umeå University, Department of Plant Physiology; Computational Life Science Cluster (CLiC), Umeå University, Umeå, Sweden
Freyhult, Eva (author)
Uppsala University; Umeå University, Clinical Bacteriology; Department of Medical Sciences, Uppsala University, Academic Hospital, Uppsala, Sweden; Computational Life Science Cluster (CLiC), Umeå University, Umeå, Sweden
Landfors, Mattias (author)
Umeå University, Department of Mathematics and Mathematical Statistics; Clinical Bacteriology; Computational Life Science Cluster (CLiC), Umeå University, Umeå, Sweden
Rydén, Patrik (author)
Umeå University, Department of Mathematics and Mathematical Statistics; Computational Life Science Cluster (CLiC), Umeå University, Umeå, Sweden
Hvidsten, Torgeir R (author)
Umeå University, Department of Plant Physiology; Computational Life Science Cluster (CLiC), Umeå University, Umeå, Sweden
2011-10-07
2011
English.
In: BMC Bioinformatics. BioMed Central. ISSN 1471-2105; 12:1
  • Journal article (peer-reviewed)
Abstract
  • BACKGROUND: Machine learning is a powerful approach for describing and predicting classes in microarray data. Although several comparative studies have investigated the relative performance of various machine learning methods, these often do not account for the fact that performance (e.g. error rate) is the result of a series of analysis steps, of which the most important are data normalization, gene selection and machine learning.
  • RESULTS: In this study, we used seven previously published cancer-related microarray data sets to compare the effects on classification performance of five normalization methods, three gene selection methods with 21 different numbers of selected genes, and eight machine learning methods. Performance in terms of error rate was rigorously estimated by repeatedly employing a double cross-validation approach. Since performance varies greatly between data sets, we devised an analysis method that first compares methods within individual data sets and then visualizes the comparisons across data sets. We discovered both well-performing individual methods and synergies between different methods.
  • CONCLUSION: Support Vector Machines with a radial basis kernel, linear kernel or polynomial kernel of degree 2 all performed consistently well across data sets. We show that there is a synergistic relationship between these methods and gene selection based on the T-test and the selection of a relatively high number of genes. Also, we find that these methods benefit significantly from using normalized data, although it is hard to draw general conclusions about the relative performance of different normalization procedures.
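
The double cross-validation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes synthetic two-class data, skips normalization, ranks genes by a Welch t-statistic, and uses a nearest-centroid classifier as a simple stand-in for the Support Vector Machines evaluated in the paper. All function names (`make_data`, `select_genes`, `double_cv_error`, and so on) are hypothetical. The point it shows is that gene selection and the choice of the number of genes happen inside the training folds only, so the outer error estimate is not biased by the held-out samples.

```python
import random
from statistics import mean, variance

def make_data(n_per_class=20, n_genes=50, n_informative=5, shift=3.0, seed=0):
    """Synthetic expression matrix (assumption): the first genes differ between classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for j in range(n_informative):
                row[j] += shift * label
            X.append(row)
            y.append(label)
    return X, y

def select_genes(X, y, k):
    """Indices of the k genes with the largest absolute Welch t-statistic."""
    scores = []
    for j in range(len(X[0])):
        a = [r[j] for r, lab in zip(X, y) if lab == 0]
        b = [r[j] for r, lab in zip(X, y) if lab == 1]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

def centroid_predict(X_train, y_train, X_test, genes):
    """Nearest-centroid classifier on the selected genes (stand-in for an SVM)."""
    cents = {}
    for c in (0, 1):
        rows = [r for r, lab in zip(X_train, y_train) if lab == c]
        cents[c] = [mean(r[j] for r in rows) for j in genes]
    preds = []
    for r in X_test:
        dist = {c: sum((r[j] - m) ** 2 for j, m in zip(genes, cents[c])) for c in cents}
        preds.append(min(dist, key=dist.get))
    return preds

def stratified_folds(idx, y, n_folds, rng):
    """Split indices into folds while keeping both classes in every fold."""
    folds = [[] for _ in range(n_folds)]
    for c in (0, 1):
        members = [i for i in idx if y[i] == c]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % n_folds].append(i)
    return folds

def fold_errors(X, y, train, test, k):
    """Select genes on the training part only, then count test misclassifications."""
    X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
    genes = select_genes(X_tr, y_tr, k)
    preds = centroid_predict(X_tr, y_tr, [X[i] for i in test], genes)
    return sum(p != y[i] for p, i in zip(preds, test))

def cv_error(X, y, idx, k, n_folds, rng):
    """Plain cross-validated error rate on the samples listed in idx."""
    wrong = 0
    for test in stratified_folds(idx, y, n_folds, rng):
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        wrong += fold_errors(X, y, train, test, k)
    return wrong / len(idx)

def double_cv_error(X, y, ks, outer=5, inner=3, seed=0):
    """Outer loop estimates error; inner loop picks the number of genes."""
    rng = random.Random(seed)
    all_idx = list(range(len(X)))
    wrong = 0
    for test in stratified_folds(all_idx, y, outer, rng):
        held_out = set(test)
        train = [i for i in all_idx if i not in held_out]
        best_k = min(ks, key=lambda k: cv_error(X, y, train, k, inner, rng))
        wrong += fold_errors(X, y, train, test, best_k)
    return wrong / len(X)

if __name__ == "__main__":
    X, y = make_data()
    print("estimated error rate:", double_cv_error(X, y, ks=[2, 5, 20]))
```

Repeating `double_cv_error` with different seeds and averaging would approximate the "repeatedly employing" part of the description; swapping in a normalization step or a different classifier only requires changing `fold_errors`.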

Subject headings

NATURAL SCIENCES -- Biological Sciences -- Biochemistry and Molecular Biology (hsv//eng)

Keywords

statistical methods
expression
bioinformatics
features
tumors
cell

Publication and content type

ref (subject category)
art (subject category)
