Examining the Combination of Multi-Band Processing and Channel Dropout for Robust Speech Recognition
- Kovács, György, 1984- (author)
- Luleå tekniska universitet, EISLAB; MTA-SZTE Research Group on Artificial Intelligence, Szeged, Hungary
- Tóth, László (author)
- Institute of Informatics, University of Szeged, Szeged, Hungary
- Van Compernolle, Dirk (author)
- Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Liwicki, Marcus (author)
- Luleå tekniska universitet, EISLAB
- The International Speech Communication Association (ISCA), 2019
- English.
In: Proc. Interspeech 2019. The International Speech Communication Association (ISCA), pp. 421-425
- Related links:
https://urn.kb.se/re...
https://doi.org/10.2...
Abstract
- A pivotal question in Automatic Speech Recognition (ASR) is the robustness of the trained models. In this study, we investigate the combination of two methods commonly applied to increase the robustness of ASR systems. On the one hand, inspired by auditory experiments and signal processing considerations, multi-band processing has been used for decades to improve the noise robustness of speech recognition. On the other hand, dropout is a commonly used regularization technique to prevent overfitting by keeping the model from becoming over-reliant on a small set of neurons. We hypothesize that the careful combination of the two approaches would lead to increased robustness, by preventing the resulting model from over-relying on any given band. To verify our hypothesis, we investigate various approaches for the combination of the two methods using the Aurora-4 corpus. The results obtained corroborate our initial assumption, and show that the proper combination of the two techniques leads to increased robustness, and to significantly lower word error rates (WERs). Furthermore, we find that the accuracy scores attained here compare favourably to those reported recently on the clean training scenario of the Aurora-4 corpus.
Subject headings
- NATURAL SCIENCES -- Computer and Information Sciences -- Computer Sciences (hsv//eng)
Keywords
- multi-band processing
- band-dropout
- robust speech recognition
- Aurora-4
- Machine Learning
Publication and content type
- ref (subject category)
- kon (subject category)