SwePub

Hit list for the search "WFRF:(Salvi Giampiero) srt2:(2000-2004)"

Search: WFRF:(Salvi Giampiero) > (2000-2004)

  • Results 1-9 of 9
1.
  • Beskow, Jonas, et al. (author)
  • SYNFACE - A talking head telephone for the hearing-impaired
  • 2004
  • In: Computers Helping People with Special Needs. - Berlin: Springer. - ISBN 3540223347, pp. 1178-1185
  • Conference paper (peer-reviewed), abstract:
    • SYNFACE is a telephone aid for hearing-impaired people that shows the lip movements of the speaker at the other telephone, synchronised with the speech. The SYNFACE system consists of a speech recogniser that recognises the incoming speech and a synthetic talking head. The output from the recogniser is used to control the articulatory movements of the synthetic head. SYNFACE prototype systems exist for three languages: Dutch, English and Swedish, and the first user trials have just started.
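
For orientation, here is a minimal runnable sketch of the data flow this abstract describes: audio frames in, phone labels out, mouth shapes driven from those labels. Every name, the toy phone set and the phone-to-viseme map below are illustrative assumptions, not taken from the SYNFACE system:

    import numpy as np

    # Toy stand-ins for the SYNFACE pipeline: audio -> phone label -> mouth shape.
    PHONES = ["sil", "a", "i", "u", "p", "m", "s"]           # assumed toy phone set
    VISEMES = {"sil": "closed", "a": "open", "i": "spread",  # assumed phone-to-viseme map
               "u": "rounded", "p": "closed", "m": "closed", "s": "spread"}

    def recognise_frame(frame: np.ndarray) -> str:
        """Stand-in recogniser: pick the phone with the highest dummy score."""
        scores = np.abs(np.fft.rfft(frame))[:len(PHONES)]
        return PHONES[int(np.argmax(scores))]

    def animate(phone: str) -> str:
        """Map a phone label to a mouth shape for the talking head."""
        return VISEMES[phone]

    # One second of 8 kHz telephone audio, processed in 10 ms (80-sample) frames.
    audio = np.random.randn(8000)
    mouth_shapes = [animate(recognise_frame(f)) for f in audio.reshape(100, 80)]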
2.
  • Johansen, Finn Tore, et al. (author)
  • The COST 249 SpeechDat multilingual reference recogniser
  • 2000
  • Conference paper (peer-reviewed), abstract:
    • The COST 249 SpeechDat reference recogniser is a fully automatic, language-independent training procedure for building a phonetic recogniser. It relies on the HTK toolkit and a SpeechDat(II) compatible database. The recogniser is designed to serve as a reference system in multilingual recognition research. This paper documents version 0.93 of the reference recogniser and presents results on small-vocabulary recognition for seven languages.
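
The recipe itself is not reproduced in the abstract; the sketch below only outlines the kind of HTK flat-start and re-estimation sequence such an automatic procedure chains together. HCompV and HERest are real HTK tools, but every file name and option here is an illustrative assumption, not taken from the actual refrec scripts:

    import subprocess

    def run(cmd):
        """Run one HTK command, failing loudly if it returns non-zero."""
        subprocess.run(cmd, check=True)

    # 1. Flat start: initialise a prototype HMM from global data statistics.
    run(["HCompV", "-C", "config", "-f", "0.01", "-m",
         "-S", "train.scp", "-M", "hmm0", "proto"])

    # 2. A few rounds of embedded Baum-Welch re-estimation over all models.
    for i in range(3):
        run(["HERest", "-C", "config", "-S", "train.scp",
             "-I", "phones.mlf",                    # phone-level transcriptions
             "-H", f"hmm{i}/macros", "-H", f"hmm{i}/hmmdefs",
             "-M", f"hmm{i + 1}", "phonelist"])     # write models to hmm1, hmm2, ...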
3.
  • Karlsson, Inger, et al. (author)
  • SYNFACE - a talking face telephone
  • 2003
  • In: Proceedings of EUROSPEECH 2003, pp. 1297-1300
  • Conference paper (peer-reviewed), abstract:
    • The primary goal of the SYNFACE project is to make it easier for hearing-impaired people to use an ordinary telephone. This will be achieved by using a talking face connected to the telephone. The incoming speech signal will govern the speech movements of the talking face; hence the talking face will provide lip-reading support for the user. The project will define the visual speech information that supports lip-reading, and develop techniques to derive this information from the acoustic speech signal in near real time for three different languages: Dutch, English and Swedish. This requires the development of automatic speech recognition methods that detect information in the acoustic signal that correlates with the speech movements. This information will govern the speech movements in a synthetic face and synchronise them with the acoustic speech signal. A prototype system is being constructed, containing the results achieved so far in SYNFACE. This system will be tested and evaluated for the three languages by hearing-impaired users. SYNFACE is an IST project (IST-2001-33327) with partners from the Netherlands, the UK and Sweden. SYNFACE builds on experiences gained in the Swedish Teleface project.
4.
  • Lindberg, Børge, et al. (author)
  • A noise robust multilingual reference recogniser based on SpeechDat(II)
  • 2000
  • Conference paper (peer-reviewed), abstract:
    • An important aspect of noise robustness of automatic speech recognisers (ASR) is the proper handling of non-speech acoustic events. The present paper describes further improvements of an already existing reference recogniser towards achieving this kind of robustness. The reference recogniser applied is the COST 249 SpeechDat reference recogniser, a fully automatic, language-independent training procedure for building a phonetic recogniser (http://www.telenor.no/fou/prosjekter/taletek/refrec). The reference recogniser relies on the HTK toolkit and a SpeechDat(II) compatible database, and is designed to serve as a reference system in multilingual speech recognition research. The paper describes version 0.96 of the reference recogniser, which takes into account labelled non-speech acoustic events during training and provides robustness against these during testing. Results are presented on small and medium vocabulary recognition for six languages.
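
As a concrete illustration of taking labelled non-speech events into account, one common approach is to map the noise markers found in SpeechDat(II)-style transcriptions onto dedicated filler models that are trained alongside the phone models. The marker names below follow SpeechDat conventions, but the mapping itself is a sketch, not the paper's actual recipe:

    # Map SpeechDat-style noise markers to filler model names so that
    # labelled non-speech events are modelled rather than discarded.
    FILLERS = {
        "[fil]": "fil",   # filled pause ("ehm")
        "[spk]": "spk",   # speaker noise (lip smack, cough)
        "[sta]": "sta",   # stationary background noise
        "[int]": "int",   # intermittent background noise
    }

    def to_training_units(transcription: str) -> list[str]:
        """Replace noise markers with filler model names; keep words as-is."""
        return [FILLERS.get(token, token) for token in transcription.split()]

    print(to_training_units("[spk] stop [sta] the recording [fil]"))
    # ['spk', 'stop', 'sta', 'the', 'recording', 'fil']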
5.
  • Salvi, Giampiero (author)
  • Accent clustering in Swedish using the Bhattacharyya distance
  • 2003
  • In: Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS), Barcelona, Spain, pp. 1149-1152
  • Conference paper (peer-reviewed), abstract:
    • In an attempt to improve automatic speech recognition (ASR) models for Swedish, accent variations were considered. These have proved to be important variables in the statistical distribution of the acoustic features usually employed in ASR. The analysis of feature variability has revealed phenomena that are consistent with what is known from phonetic investigations, suggesting that a consistent part of the information about accents could be derived from those features. A graphical interface has been developed to simplify the visualization of the geographical distributions of these phenomena.
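
For reference, the Bhattacharyya distance between two Gaussian densities N(\mu_1, \Sigma_1) and N(\mu_2, \Sigma_2), used here to compare accent-dependent models, is, with \Sigma = (\Sigma_1 + \Sigma_2)/2:

    D_B = \frac{1}{8}(\mu_2 - \mu_1)^\top \Sigma^{-1} (\mu_2 - \mu_1)
          + \frac{1}{2}\ln\frac{\det\Sigma}{\sqrt{\det\Sigma_1 \det\Sigma_2}}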
6.
  • Salvi, Giampiero (author)
  • Truncation error and dynamics in very low latency phonetic recognition
  • 2003
  • In: Proceedings of Non Linear Speech Processing (NOLISP).
  • Conference paper (peer-reviewed), abstract:
    • The truncation error for a two-pass decoder is analyzed in a problem of phonetic speech recognition for very demanding latency constraints (look-ahead length < 100 ms) and for applications where successive refinements of the hypotheses are not allowed. This is done empirically in the framework of hybrid MLP/HMM models. The ability of recurrent MLPs, as a posteriori probability estimators, to model time variations is also considered, and its interaction with the dynamic modeling in the decoding phase is shown in the simulations.
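
To make the latency constraint concrete: at a typical 10 ms frame shift, a look-ahead under 100 ms means each label must be committed after seeing fewer than 10 future frames. The toy experiment below (an invented 3-phone model with random posteriors, not the paper's MLP/HMM setup) shows how such truncation can change decisions relative to full-utterance Viterbi decoding:

    import numpy as np

    rng = np.random.default_rng(0)
    N_PHONES, N_FRAMES, LOOKAHEAD = 3, 50, 9   # 9 frames ~ 90 ms at a 10 ms shift
    logpost = np.log(rng.dirichlet(np.ones(N_PHONES), size=N_FRAMES))
    logtrans = np.log(np.full((N_PHONES, N_PHONES), 0.1) + 0.7 * np.eye(N_PHONES))

    def viterbi(lp):
        """Best phone sequence for a block of log posteriors."""
        delta = lp[0].copy()
        back = np.zeros((len(lp), N_PHONES), dtype=int)
        for t in range(1, len(lp)):
            scores = delta[:, None] + logtrans    # scores[i, j]: best path ending i -> j
            back[t] = scores.argmax(axis=0)
            delta = scores.max(axis=0) + lp[t]
        path = [int(delta.argmax())]
        for t in range(len(lp) - 1, 0, -1):
            path.append(int(back[t, path[-1]]))
        return path[::-1]

    full = viterbi(logpost)                                  # unlimited look-ahead
    truncated = [viterbi(logpost[t:t + LOOKAHEAD + 1])[0]    # commit frame t after
                 for t in range(N_FRAMES)]                   # only LOOKAHEAD more frames
    print("frames changed by truncation:", sum(a != b for a, b in zip(full, truncated)))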
7.
  • Salvi, Giampiero (author)
  • Using accent information in ASR models for Swedish
  • 2003
  • In: Proceedings of INTERSPEECH'2003, pp. 2677-2680
  • Conference paper (peer-reviewed), abstract:
    • In this study accent information is used in an attempt to improve acoustic models for automatic speech recognition (ASR). First, accent-dependent Gaussian models were trained independently. The Bhattacharyya distance was then used in conjunction with agglomerative hierarchical clustering to define optimal strategies for merging those models. The resulting allophonic classes were analyzed and compared with the phonetic literature. Finally, accent "aware" models were built, in which the parametric complexity for each phoneme corresponds to the degree of variability across accent areas and to the amount of training data available for it. The models were compared to models with the same, but evenly spread, overall complexity, showing in some cases a slight improvement in recognition accuracy.
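
A compact sketch of the clustering step: pairwise Bhattacharyya distances between accent-dependent Gaussians feed an agglomerative hierarchical clustering, here via SciPy's linkage. The three "accent" Gaussians are invented toy values, not the paper's Swedish accent-area models:

    import numpy as np
    from scipy.cluster.hierarchy import linkage
    from scipy.spatial.distance import squareform

    def bhattacharyya(m1, S1, m2, S2):
        """Bhattacharyya distance between two multivariate Gaussians."""
        S = 0.5 * (S1 + S2)
        d = m2 - m1
        term1 = 0.125 * d @ np.linalg.solve(S, d)
        term2 = 0.5 * np.log(np.linalg.det(S) /
                             np.sqrt(np.linalg.det(S1) * np.linalg.det(S2)))
        return term1 + term2

    # One Gaussian per (toy) accent area: (mean, covariance) of some feature.
    accents = {"A": (np.array([0.0, 0.0]), np.eye(2)),
               "B": (np.array([0.2, 0.1]), np.eye(2)),
               "C": (np.array([2.0, 1.5]), 1.5 * np.eye(2))}
    names = list(accents)
    D = np.zeros((len(names), len(names)))
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i < j:
                D[i, j] = D[j, i] = bhattacharyya(*accents[a], *accents[b])

    Z = linkage(squareform(D), method="average")  # merge closest models first
    print(Z)  # A and B merge before C joins, mirroring their distances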
8.
  • Siciliano, C., et al. (author)
  • Intelligibility of an ASR-controlled synthetic talking face
  • 2004
  • In: Journal of the Acoustical Society of America. - ISSN 0001-4966, E-ISSN 1520-8524; 115(5), pp. 2428-
  • Journal article (peer-reviewed), abstract:
    • The goal of the SYNFACE project is to develop a multilingual synthetic talking face, driven by an automatic speech recognizer (ASR), to assist hearing-impaired people with telephone communication. Previous multilingual experiments with the synthetic face have shown that time-aligned synthesized visual face movements can enhance speech intelligibility in normal-hearing and hearing-impaired users [C. Siciliano et al., Proc. Int. Cong. Phon. Sci. (2003)]. Similar experiments are in progress to examine whether the synthetic face remains intelligible when driven by ASR output. The recognizer produces phonetic output in real time, in order to drive the synthetic face while maintaining normal dialogue turn-taking. Acoustic modeling was performed with a neural network, while an HMM was used for decoding. The recognizer was trained on the SpeechDAT telephone speech corpus. Preliminary results suggest that the currently achieved recognition performance of around 60% frames correct limits the usefulness of the synthetic face movements. This is particularly true for consonants, where correct place of articulation is especially important for visual intelligibility. Errors in the alignment of phone boundaries representative of those arising in the ASR output were also shown to decrease audio-visual intelligibility.
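
The "frames correct" figure quoted above is a frame-level accuracy: the fraction of fixed-length analysis frames whose recognised phone label matches the reference alignment. A minimal illustration with invented label sequences:

    def frames_correct(reference: list[str], recognised: list[str]) -> float:
        """Fraction of frames where the recognised label equals the reference."""
        assert len(reference) == len(recognised)
        hits = sum(r == h for r, h in zip(reference, recognised))
        return hits / len(reference)

    ref = ["s", "s", "i", "i", "i", "l", "l", "sil", "sil", "sil"]
    hyp = ["s", "s", "i", "i", "e", "l", "sil", "sil", "sil", "sil"]
    print(f"{frames_correct(ref, hyp):.0%} frames correct")  # 80% on this toy pair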
9.