Classification of Potentially Unwanted Programs Using Supervised Learning

Shahzad, Raja Muhammad Khurram (author): Blekinge Tekniska Högskola,Sektionen för datavetenskap och kommunikation

ISBN 9789172952478
Karlskrona : Blekinge Institute of Technology, 2013
English, 154 p.
Series: Blekinge Institute of Technology Licentiate Dissertation Series, 1650-2140 ; 2

Related links: https://bth.diva-por... (primary) (Raw object); https://urn.kb.se/re...

Licentiate thesis (other academic/artistic)

Abstract

Malicious software authors have shifted their focus from illegal and clearly malicious software to potentially unwanted programs (PUPs) to earn revenue. PUPs blur the border between legitimate and illegitimate programs and thus fall into a grey zone. Existing anti-virus and anti-spyware software is in many instances unable to detect previously unseen or zero-day attacks or to separate PUPs from legitimate software. Many tools also require frequent updates to be effective. Predicting the class of a particular piece of software can support users before they decide to install it. This licentiate thesis introduces approaches to distinguish PUPs from legitimate software based on supervised learning of file features represented as n-grams. The overall research method applied in this thesis is experimentation. For these experiments, malicious software applications were obtained from anti-malware industrial partners, and legitimate software applications were collected from various online repositories. The general steps of supervised learning, from data preparation (n-gram generation) to evaluation, were followed. Different data representations, such as byte codes and operation codes, with different configurations, such as fixed-size, variable-length, and overlap, were investigated to generate different n-gram sizes. The experimental variables were controlled to measure the correlation between n-gram size, the number of features required for optimal training, and classifier performance. The thesis results suggest that, despite the subtle difference between legitimate software and PUPs, this type of software can be classified accurately with low false positive and false negative rates. The results further suggest an optimal size of operation code-based n-grams for data representation. Finally, the results indicate that classification accuracy can be increased by using a customized ensemble learner that makes use of multiple representations of the data set. The investigated approaches can be implemented as a software tool that requires less frequent updates than existing commercial tools.
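
The data preparation step mentioned in the abstract (n-gram generation over byte codes or operation codes, with or without overlap) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `ngrams` and `ngram_counts`, the sample opcode sequence, and the sample bytes are hypothetical, and only the fixed-size configuration is shown.

```python
from collections import Counter


def ngrams(tokens, n, overlap=True):
    """Return fixed-size n-grams (as tuples) from a sequence of tokens.

    overlap=True slides the window one token at a time;
    overlap=False advances the window by n tokens (no shared tokens).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    step = 1 if overlap else n
    return [tuple(tokens[i:i + n]) for i in range(0, len(tokens) - n + 1, step)]


def ngram_counts(tokens, n, overlap=True):
    """Count how often each n-gram occurs; counts can serve as features."""
    return Counter(ngrams(tokens, n, overlap))


# Hypothetical operation-code sequence, made up for illustration.
opcodes = ["push", "mov", "call", "mov", "call", "ret"]
overlapping = ngrams(opcodes, 2)
non_overlapping = ngrams(opcodes, 2, overlap=False)

# The same function works on byte codes: iterating bytes yields integers.
byte_grams = ngrams(list(b"\x4d\x5a\x90\x00"), 2)
```

In a supervised learning setup, the counts returned by `ngram_counts` for each file would typically be turned into a feature vector and passed, together with the file's label, to a classifier.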

Subject headings: NATURAL SCIENCES; Computer and Inf ...; Computer Science ...