Visual Representations and Models: From Latent SVM to Deep Learning

Azizpour, Hossein, 1985- (author): KTH, Datorseende och robotik, CVAP, Computer Vision

Carlsson, Stefan, Professor (thesis advisor): KTH, Datorseende och robotik, CVAP

Caputo, Barbara, Associate Professor (opponent): IDIAP

ISBN 9789177291107
Stockholm, Sweden : KTH Royal Institute of Technology, 2016
English, 172 pages
Series: TRITA-CSC-A, 1653-5723 ; 21

Related links: https://kth.diva-por... (primary) (Raw object); https://urn.kb.se/re...

Doctoral thesis (other academic/artistic)

Abstract

Two important components of a visual recognition system are representation and model. Both involve selecting and learning the features that are indicative for recognition and discarding those that are uninformative. This thesis, in its general form, proposes different techniques within the frameworks of two learning systems for representation and modeling, namely latent support vector machines (latent SVMs) and deep learning.
First, we propose various approaches to group the positive samples into clusters of visually similar instances. Given a fixed representation, the sampled space of the positive distribution is usually structured. The proposed clustering techniques include a novel similarity measure based on exemplar learning, an approach for using additional annotation, and an augmentation of latent SVM that automatically finds clusters whose members can be reliably distinguished from the background class. In another effort, a strongly supervised deformable part model (DPM) is suggested to study how these models can benefit from privileged information. The extra information comes in the form of semantic part annotations (i.e. their presence and location) and is used to constrain the DPM's latent variables during or prior to the optimization of the latent SVM. Its effectiveness is demonstrated on the task of animal detection.
Finally, we generalize the formulation of discriminative latent variable models, including DPMs, to incorporate a new set of latent variables representing the structure or properties of negative samples. We therefore term them negative latent variables. We show that this generalization affects state-of-the-art techniques and helps visual recognition by explicitly searching for counter-evidence of an object's presence.
Following the resurgence of deep networks, the last works of this thesis focus on deep learning in order to produce a generic representation for visual recognition. A Convolutional Network (ConvNet) is trained on a large annotated image classification dataset called ImageNet with $\sim1.3$ million images. The activations at each layer of the trained ConvNet can then be treated as the representation of an input image. We show that such a representation is surprisingly effective for various recognition tasks, making it clearly superior to all the handcrafted features previously used in visual recognition (such as HOG in our first works on DPM). We further investigate how this representation can be improved for a given task. We propose various factors, applied before or after the training of the representation, that can improve the efficacy of the ConvNet representation. These factors are analyzed on 16 datasets from various subfields of visual recognition.
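The idea of reusing a fixed, pretrained representation with a simple classifier on top can be illustrated with a toy sketch. This is not the thesis's implementation: a fixed random projection followed by a ReLU stands in for the activations of a pretrained ConvNet layer, the data are synthetic, and all names (`frozen_features`, `train_linear_classifier`, `predict`) are hypothetical. Only the linear classifier on top of the frozen features is trained.

```python
import numpy as np

def frozen_features(x, w_frozen):
    """Stand-in for one layer of a pretrained ConvNet (an assumption):
    a fixed linear map followed by a ReLU. w_frozen is never updated."""
    return np.maximum(x @ w_frozen, 0.0)

def train_linear_classifier(feats, labels, n_classes, lr=0.1, epochs=200):
    """Softmax regression trained by gradient descent on fixed features."""
    n, d = feats.shape
    w = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = feats @ w + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # d(loss)/d(logits)
        w -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return w, b

def predict(feats, w, b):
    return np.argmax(feats @ w + b, axis=1)

rng = np.random.default_rng(0)

# Synthetic two-class "images": two Gaussian blobs in 20 dimensions.
n_per_class, dim, feat_dim = 100, 20, 64
x = np.vstack([
    rng.normal(-1.0, 1.0, (n_per_class, dim)),
    rng.normal(+1.0, 1.0, (n_per_class, dim)),
])
y = np.repeat([0, 1], n_per_class)

# Frozen "network" weights: fixed once, shared by every downstream task.
w_frozen = rng.normal(0.0, 1.0 / np.sqrt(dim), (dim, feat_dim))

# Extract the representation and L2-normalize each feature vector.
feats = frozen_features(x, w_frozen)
feats /= np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12

# Train only the linear classifier on top of the frozen representation.
w, b = train_linear_classifier(feats, y, n_classes=2)
train_accuracy = float(np.mean(predict(feats, w, b) == y))
print(f"training accuracy: {train_accuracy:.3f}")
```

In a realistic setting the stand-in `frozen_features` would be replaced by a forward pass through a network pretrained on ImageNet, and the classifier would be evaluated on held-out data rather than on the training set.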

Subject headings

TEKNIK OCH TEKNOLOGIER -- Elektroteknik och elektronik (hsv//swe)
ENGINEERING AND TECHNOLOGY -- Electrical Engineering, Electronic Engineering, Information Engineering (hsv//eng)
TEKNIK OCH TEKNOLOGIER -- Elektroteknik och elektronik -- Datorsystem (hsv//swe)
ENGINEERING AND TECHNOLOGY -- Electrical Engineering, Electronic Engineering, Information Engineering -- Computer Systems (hsv//eng)

Keyword

Computer Vision
Machine Learning
Artificial Intelligence
Deep Learning
Learning Representation
Deformable Part Models
Discriminative Latent Variable Models
Convolutional Networks
Object Recognition
Object Detection
Computer Science
Datalogi

Publication and Content Type

vet (subject category)
dok (subject category)

