SwePub

Record ID: swepub:oai:DiVA.org:liu-168856
 

Multi-stream Convolutional Networks for Indoor Scene Recognition

Anwer, Rao Muhammad (author)
Aalto Univ, Finland; Incept Inst Artificial Intelligence, U Arab Emirates
Khan, Fahad (author)
Linköpings universitet, Datorseende, Tekniska fakulteten; Incept Inst Artificial Intelligence, U Arab Emirates
Laaksonen, Jorma (author)
Aalto Univ, Finland
Zaki, Nazar (author)
United Arab Emirates Univ, U Arab Emirates
2019-08-22
2019
English.
In: COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2019, PT I. - Cham : SPRINGER INTERNATIONAL PUBLISHING AG. - ISBN 9783030298883, 9783030298876. - pp. 196-208
  • Conference paper (peer-reviewed)
Abstract
  • Convolutional neural networks (CNNs) have recently achieved outstanding results for various vision tasks, including indoor scene understanding. The de facto practice in state-of-the-art indoor scene recognition approaches is to use RGB pixel values as input to CNN models that are trained on large amounts of labeled data (ImageNet or Places). Here, we investigate CNN architectures that augment RGB images with estimated depth and texture information, as multiple streams, for monocular indoor scene recognition. First, we exploit recent advances in depth estimation from monocular images and use the estimated depth information to train a CNN model for learning deep depth features. Second, we train a CNN model to exploit the successful Local Binary Patterns (LBP) by using mapped coded images with explicit LBP encoding to capture the texture information available in indoor scenes. We further investigate different fusion strategies to combine the learned deep depth and texture streams with the traditional RGB stream. Comprehensive experiments are performed on three indoor scene classification benchmarks: MIT-67, OCIS and SUN-397. The proposed multi-stream network significantly outperforms the standard RGB network, achieving absolute gains of 9.3%, 4.7% and 7.3% on the MIT-67, OCIS and SUN-397 datasets, respectively.
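
The abstract mentions two ideas that a small example may make more concrete: encoding texture with Local Binary Patterns, and fusing several streams. The sketch below is illustrative only and is not the authors' implementation. It computes the basic 8-neighbour LBP code (not the mapped coding described in the paper) and fuses per-stream class probabilities by a weighted average, which is just one possible late-fusion strategy. The function names `lbp_codes` and `late_fusion`, the bit ordering, and the sample values are all assumptions made for illustration.

```python
# Illustrative sketch only: basic 3x3 LBP codes and score-level (late) fusion.
# Neither function is taken from the paper; names and choices are assumptions.

def lbp_codes(image):
    """Return the 8-neighbour LBP code for every interior pixel.

    `image` is a list of equal-length rows of grayscale intensities.
    Each neighbour >= centre contributes one bit, read clockwise from
    the top-left neighbour (most significant bit first).
    """
    # Neighbour offsets (row, col), clockwise starting at top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(image), len(image[0])
    codes = []
    for r in range(1, height - 1):
        row_codes = []
        for c in range(1, width - 1):
            centre = image[r][c]
            code = 0
            for dr, dc in offsets:
                bit = 1 if image[r + dr][c + dc] >= centre else 0
                code = (code << 1) | bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def late_fusion(stream_scores, weights=None):
    """Fuse per-stream class probabilities by a weighted average.

    `stream_scores` holds one probability list per stream (for example
    RGB, depth, texture). Uniform weights are assumed when none are given.
    """
    n_streams = len(stream_scores)
    if weights is None:
        weights = [1.0 / n_streams] * n_streams
    n_classes = len(stream_scores[0])
    return [sum(w * scores[k] for w, scores in zip(weights, stream_scores))
            for k in range(n_classes)]


if __name__ == "__main__":
    # Hypothetical 3x3 grayscale patch: one interior pixel, one LBP code.
    patch = [[5, 9, 1],
             [4, 6, 7],
             [2, 6, 8]]
    print(lbp_codes(patch))

    # Hypothetical two-class probabilities from three streams.
    rgb, depth, texture = [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]
    fused = late_fusion([rgb, depth, texture])
    print(fused.index(max(fused)))  # index of the predicted class
```

In a multi-stream setting, an image of such LBP codes would be fed to its own CNN alongside the RGB and depth streams. Feature-level fusion (concatenating stream features before a classifier) is another common option that this sketch does not cover.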

Subject headings

NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Vision and Robotics (hsv//eng)

Keywords

Scene recognition; Depth features; Texture features

Publication and content type

ref (subject category)
kon (subject category)
