Learning from images and speech with non-negative matrix factorization enhanced by input space scaling
- Driesen, J. (author)
- Van Hamme, H. (author)
- Kleijn, W. Bastiaan (author)
- KTH, Ljud- och bildbehandling
- IEEE, 2010
- English.
In: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings. IEEE. ISBN 9781424479030, pp. 1-6
- Related links:
- https://urn.kb.se/re...
- https://doi.org/10.1...
Abstract
- Computational learning from multimodal data is often done with matrix factorization techniques such as NMF (Non-negative Matrix Factorization), pLSA (Probabilistic Latent Semantic Analysis) or LDA (Latent Dirichlet Allocation). The different modalities of the input are to this end converted into features that are easily placed in a vectorized format. An inherent weakness of such a data representation is that only a subset of these data features actually aids the learning. In this paper, we first describe a simple NMF-based recognition framework operating on speech and image data. We then propose and demonstrate a novel algorithm that scales the inputs of this framework in order to optimize its recognition performance.
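As a rough illustration of the setting the abstract describes, the sketch below stacks a speech feature matrix and an image feature matrix, scales each modality, and factorizes the result with standard multiplicative-update NMF. It is a minimal sketch under stated assumptions, not the authors' framework: the data are random, the function names `nmf` and `scale_and_stack` are hypothetical, and the scaling weights are hand-picked, whereas the paper proposes an algorithm that chooses the input scaling to optimize recognition performance (the abstract does not give its details).

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate non-negative V (features x examples) as W @ H.

    Uses the standard multiplicative updates for the Frobenius
    objective; this is a generic NMF, not the paper's exact variant.
    """
    rng = np.random.default_rng(seed)
    n_features, n_examples = V.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_examples)) + eps
    for _ in range(n_iter):
        # Update activations, then the dictionary; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def scale_and_stack(speech, image, speech_weight=1.0, image_weight=1.0):
    """Scale each modality by a scalar and stack them row-wise.

    The weights are illustrative assumptions chosen by hand here.
    """
    return np.vstack([speech_weight * speech, image_weight * image])


# Synthetic non-negative features: 30 examples, 20 speech and 10 image dimensions.
rng = np.random.default_rng(1)
speech = rng.random((20, 30))
image = rng.random((10, 30))

V = scale_and_stack(speech, image, speech_weight=1.0, image_weight=2.0)
W, H = nmf(V, rank=4)
error = np.linalg.norm(V - W @ H)
print(f"reconstruction error: {error:.3f}")
```

Raising a modality's weight makes its reconstruction error count for more in the objective, which is one simple way to see why input scaling can change what the factorization learns.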
Subject headings
- NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)
Keywords
- Feature selection
- Image recognition
- Machine learning
- Multi-modal learning
- Vocabulary acquisition
Publication and content type
- ref (subject category)
- kon (subject category)