SwePub

Result list for search "WFRF:(Blomberg K) srt2:(1990-1999)"

  • Result 1-10 of 16
2.
  • Bertenstam, J, et al. (author)
  • THE WAXHOLM APPLICATION DATABASE
  • 1995
  • Conference paper (peer-reviewed)
    • This paper describes an application database collected in Wizard-of-Oz experiments in a spoken dialogue system, WAXHOLM. The system provides information on boat traffic in the Stockholm archipelago. The database consists of utterance-length speech files, their corresponding transcriptions, and log files of the dialogue sessions. In addition to the spontaneous dialogue speech, the material also comprises recordings of phonetically balanced reference sentences uttered by all 66 subjects. The paper describes the recording procedure as well as some characteristics of the speech data and the dialogue.
3.
  • Blomberg, Mats, et al. (author)
  • An experimental dialog system: WAXHOLM
  • 1993
  • Conference paper (peer-reviewed)
    • Recently we have begun to build the basic tools for a generic speech-dialogue system, WAXHOLM. The main modules, their function and internal communication have been specified. The different components are connected through a computer network. A preliminary version of the system has been tested, using simplified versions of the modules. We will give a general overview of the system and describe some of the components in more detail. Application-specific data are collected with the help of Wizard-of-Oz techniques. The dialogue system is used during the data collection and the wizard only replaces the speech recognition module.
4.
  • Blomberg, Mats, et al. (author)
  • Creation of unseen triphones from diphones and monophones using a speech production approach
  • 1996
  • Conference paper (peer-reviewed)
    • With limited training data, infrequent triphone models for speech recognition will not be observed in sufficient number. In this report, a speech production approach is used to predict the characteristics of unseen triphones by concatenating diphones and/or monophones in the parametric representation of a formant speech synthesiser. The parameter trajectories are estimated by interpolation between the endpoints of the original units. The spectral states of the created triphone are generated by the speech synthesiser. Evaluation of the proposed technique has been performed using spectral error measurements and recognition candidate rescoring of N-best lists. In both cases, the created triphones are shown to perform better than the shorter units from which they were constructed. 1. INTRODUCTION The triphone unit is the basic phone model in many current phonetic speech recognition systems. The reason for this is that triphones capture the coarticulation effect caused by the immediate pr...
5.
  • Blomberg, Mats, et al. (author)
  • Optimizing some parameters of a word recognizer used in car noise
  • 1990
  • In: STL-QPSR, 31:4, pp. 43-52
  • Journal article (peer-reviewed)
    • A speaker-dependent word recognition system has been modified to improve the performance in noise. Problems with word detection and noise compensation have been addressed by using a close-talk microphone and a "noise addition" method. The reference templates are recorded in relative silence. The additional environmental noise during the recognition phase is measured and is "added" to the reference templates before using them for template matching. The recognition performance has been tested in moving cars with references recorded in parked cars. Recordings of six male speakers have been used in this report to test the sensitivity of the recognition system to some essential parameters. The results from six male speakers and a twenty-word vocabulary show that adapting the endpoint detection threshold to the noise level is essential for good performance and that noise compensation is important at signal-to-noise ratios below 15 dB.
6.
  • Blomberg, Mats, et al. (author)
  • Speech recognition in the Waxholm dialog system
  • 1994
  • Conference paper (peer-reviewed)
    • The speech recognition component in the KTH "Waxholm" dialog system is described. It will handle continuous speech with a vocabulary of about 1000 words. The output of the recogniser is fed to a probabilistic, knowledge-based parser that contains a context-free grammar compiled into an augmented transition network.
7.
  • Elenius, K, et al. (author)
  • Comparing phoneme and feature based speech recognition using artificial neural networks
  • 1992
  • Conference paper (peer-reviewed)
    • An artificial neural network has been trained by the error backpropagation technique to recognise phonemes and words. The speech material was recorded by a male Swedish talker and was labelled by a phonetician. There were 38 output nodes corresponding to Swedish phonemes. The training algorithm was somewhat modified to increase the training speed. Introducing coarticulation information by adding simple recurrency to the net is shown to be more effective than expanding the size of the input spectral window. The phoneme recognition network was used with dynamic programming for time alignment to recognise connected digits. It was compared to a similar recogniser based on nine quasi-phonetic features instead of 38 phonemes. The phoneme-based system performed better than the feature-based one.
8.
  • Elenius, K.O.E., et al. (author)
  • Experiments with artificial neural networks for phoneme and word recognit
  • 1993
  • In: STL-QPSR, 34:1, pp. 47-56
  • Journal article (peer-reviewed)
    • An artificial neural network has been trained by the error back-propagation technique to recognise phonemes and words. The speech material was recorded by a male Swedish talker and was labelled by a phonetician. There were 38 output nodes corresponding to Swedish phonemes. Introducing coarticulation information by adding simple recurrency to the net is shown to be more effective than expanding the size of the input spectral window. The phoneme recognition network was used with dynamic programming for time alignment to recognise connected digits in a speaker-independent way. It was compared to a similar recogniser based on nine quasi-phonetic features instead of 38 phonemes. The phoneme-based system performed better than the feature-based one for five out of seven speakers.