SwePub

onr:"swepub:oai:DiVA.org:su-142052"
 


Order in the random forest

Karlsson, Isak, 1987- (author)
Stockholms universitet, Institutionen för data- och systemvetenskap
Boström, Henrik, Professor (thesis advisor)
Stockholms universitet, Institutionen för data- och systemvetenskap
Asker, Lars, Docent (thesis advisor)
Stockholms universitet, Institutionen för data- och systemvetenskap
Geurts, Pierre, Associate Professor (opponent)
Department of Electrical Engineering and Computer Science, University of Liège, Belgium
ISBN 9789176498279
Stockholm : Department of Computer and Systems Sciences, Stockholm University, 2017
English, 76 pp.
  • Doctoral thesis (other academic/artistic)
Abstract

In many domains, repeated measurements are systematically collected to obtain the characteristics of objects or situations that evolve over time or other logical orderings. Although the classification of such data series shares many similarities with traditional multidimensional classification, inducing accurate machine learning models using traditional algorithms is typically infeasible since the order of the values must be considered.

In this thesis, the challenges related to inducing predictive models from data series using a class of algorithms known as random forests are studied for the purpose of efficiently and effectively classifying (i) univariate, (ii) multivariate and (iii) heterogeneous data series, either directly in their sequential form or indirectly as transformed to sparse and high-dimensional representations. In the thesis, methods are developed to address the challenges of (a) handling sparse and high-dimensional data, (b) data series classification and (c) early time series classification using random forests. The proposed algorithms are empirically evaluated in large-scale experiments and practically evaluated in the context of detecting adverse drug events.

In the first part of the thesis, it is demonstrated that minor modifications to the random forest algorithm and the use of a random projection technique can improve the effectiveness of random forests when faced with discrete data series projected to sparse and high-dimensional representations. In the second part of the thesis, an algorithm for inducing random forests directly from univariate, multivariate and heterogeneous data series using phase-independent patterns is introduced and shown to be highly effective in terms of both computational and predictive performance. Then, leveraging the notion of phase-independent patterns, the random forest is extended to allow for early classification of time series and is shown to perform favorably when compared to alternatives.

The conclusions of the thesis not only reaffirm the empirical effectiveness of random forests for traditional multidimensional data but also indicate that the random forest framework can, with success, be extended to sequential data representations.

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences (hsv//eng)

Keywords

Machine learning
random forest
ensemble
time series
data series
sequential data
sparse data
high-dimensional data
Computer and Systems Sciences

Publication and Content Type

vet (subject category)
dok (subject category)
