SwePub
Hit list for search "WFRF:(Granström Björn)"

  • Results 1-50 of 93
1.
  • Carlson, Rolf, et al. (author)
  • Gunnar Fant 1920-2009 In Memoriam
  • 2009
  • In: Phonetica. - : Walter de Gruyter GmbH. - 0031-8388 .- 1423-0321. ; 66:4, pp. 249-250
  • Journal article (peer-reviewed)
2.
  • Engstrand, Olle, et al. (author)
  • In memoriam - Gösta Bruce (1947–2010)
  • 2010
  • In: Journal of the International Phonetic Association. - Cambridge : Cambridge University Press. - 0025-1003. ; 40:3, pp. 379-381
  • Journal article (other academic/artistic)
4.
  • Agelfors, Eva, et al. (author)
  • Synthetic visual speech driven from auditory speech
  • 1999
  • In: Proceedings of Audio-Visual Speech Processing (AVSP'99).
  • Conference paper (peer-reviewed), abstract:
    • We have developed two different methods for using auditory, telephone speech to drive the movements of a synthetic face. In the first method, Hidden Markov Models (HMMs) were trained on a phonetically transcribed telephone speech database. The output of the HMMs was then fed into a rule-based visual speech synthesizer as a string of phonemes together with time labels. In the second method, Artificial Neural Networks (ANNs) were trained on the same database to map acoustic parameters directly to facial control parameters. These target parameter trajectories were generated by using phoneme strings from a database as input to the visual speech synthesis. The two methods were evaluated through audiovisual intelligibility tests with ten hearing-impaired persons, and compared to “ideal” articulations (where no recognition was involved), a natural face, and to the intelligibility of the audio alone. It was found that the HMM method performs considerably better than the audio-alone condition (54% and 34% keywords correct, respectively), but not as well as the “ideal” articulating artificial face (64%). The intelligibility for the ANN method was 34% keywords correct.
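The first method in this abstract outputs a phoneme string with time labels that drives a rule-based visual synthesizer. A minimal sketch of that output stage, assuming an invented phoneme set and a single hypothetical "jaw opening" control parameter (none of these names, values or timings are from the paper):

```python
# Toy sketch: expand a time-labelled phoneme string into a per-frame
# control-parameter trajectory, as a rule-based visual synthesizer might.
# The phoneme set and jaw-opening targets below are invented examples.
jaw_target = {"a": 1.0, "m": 0.0, "s": 0.2, "sil": 0.0}

def trajectory(segments, frame_rate=100):
    """Expand (phoneme, start_s, end_s) segments into per-frame targets."""
    frames = []
    for phone, start, end in segments:
        n = round((end - start) * frame_rate)  # frames covered by this segment
        frames.extend([jaw_target[phone]] * n)
    return frames

# "silence, m, a" over 0.4 s at 100 frames/s -> 40 frames of targets.
traj = trajectory([("sil", 0.0, 0.1), ("m", 0.1, 0.2), ("a", 0.2, 0.4)])
print(len(traj), traj[10], traj[25])  # → 40 0.0 1.0
```

A real system would also smooth between targets (coarticulation rules) rather than emit piecewise-constant values.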
6.
  • Al Moubayed, Samer, et al. (author)
  • A robotic head using projected animated faces
  • 2011
  • In: Proceedings of the International Conference on Audio-Visual Speech Processing 2011. - Stockholm : KTH Royal Institute of Technology. ; , pp. 71-
  • Conference paper (peer-reviewed), abstract:
    • This paper presents a setup which employs virtual animated agents for robotic heads. The system uses a laser projector to project animated faces onto a three-dimensional face mask. This approach of projecting animated faces onto a three-dimensional head surface, as an alternative to using flat, two-dimensional surfaces, eliminates several deteriorating effects and illusions that come with flat surfaces for interaction purposes, such as exclusive mutual gaze and situated and multi-partner dialogues. In addition, it provides robotic heads with a flexible solution for facial animation which takes advantage of the advancements of facial animation using computer graphics over mechanically controlled heads.
7.
  • Al Moubayed, Samer, et al. (author)
  • Animated Faces for Robotic Heads : Gaze and Beyond
  • 2011
  • In: Analysis of Verbal and Nonverbal Communication and Enactment. - Berlin, Heidelberg : Springer Berlin/Heidelberg. - 9783642257742 ; , pp. 19-35
  • Conference paper (peer-reviewed), abstract:
    • We introduce an approach to using animated faces for robotics where a static physical object is used as a projection surface for an animation. The talking head is projected onto a 3D physical head model. In this chapter we discuss the different benefits this approach adds over mechanical heads. After that, we investigate a phenomenon commonly referred to as the Mona Lisa gaze effect. This effect results from the use of 2D surfaces to display 3D images and causes the gaze of a portrait to seemingly follow the observer no matter where it is viewed from. The experiment investigates the perception of gaze direction by observers. The analysis shows that the 3D model eliminates the effect and provides an accurate perception of gaze direction. At the end we discuss the different requirements of gaze in interactive systems, and explore the different settings these findings give access to.
8.
  • Al Moubayed, Samer, et al. (author)
  • Audio-Visual Prosody : Perception, Detection, and Synthesis of Prominence
  • 2010
  • In: 3rd COST 2102 International Training School on Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces. - Berlin, Heidelberg : Springer Berlin Heidelberg. - 9783642181832 ; , pp. 55-71
  • Conference paper (peer-reviewed), abstract:
    • In this chapter, we investigate the effects of facial prominence cues, in terms of gestures, when synthesized on animated talking heads. In the first study, a speech intelligibility experiment is conducted, where speech quality is acoustically degraded and the speech is then presented to 12 subjects through a lip-synchronized talking head carrying head-nod and eyebrow-raising gestures. The experiment shows that perceiving visual prominence as gestures, synchronized with the auditory prominence, significantly increases speech intelligibility compared to when these gestures are randomly added to speech. We also present a study examining the perception of the behavior of the talking heads when gestures are added at pitch movements. Using eye-gaze tracking technology and questionnaires for 10 moderately hearing-impaired subjects, the results of the gaze data show that users look at the face in a similar fashion to when they look at a natural face when gestures are coupled with pitch movements, as opposed to when the face carries no gestures. From the questionnaires, the results also show that these gestures significantly increase the naturalness and helpfulness of the talking head.
9.
  • Al Moubayed, Samer, et al. (author)
  • Auditory visual prominence : From intelligibility to behavior
  • 2009
  • In: Journal on Multimodal User Interfaces. - : Springer Science and Business Media LLC. - 1783-7677 .- 1783-8738. ; 3:4, pp. 299-309
  • Journal article (peer-reviewed), abstract:
    • Auditory prominence is defined as when an acoustic segment is made salient in its context. Prominence is one of the prosodic functions that have been shown to be strongly correlated with facial movements. In this work, we investigate the effects of facial prominence cues, in terms of gestures, when synthesized on animated talking heads. In the first study, a speech intelligibility experiment is conducted: speech quality is acoustically degraded and the fundamental frequency is removed from the signal, then the speech is presented to 12 subjects through a lip-synchronized talking head carrying head-nod and eyebrow-raising gestures, which are synchronized with the auditory prominence. The experiment shows that presenting prominence as facial gestures significantly increases speech intelligibility compared to when these gestures are randomly added to speech. We also present a follow-up study examining the perception of the behavior of the talking heads when gestures are added over pitch accents. Using eye-gaze tracking technology and questionnaires on 10 moderately hearing-impaired subjects, the results of the gaze data show that users look at the face in a similar fashion to when they look at a natural face when gestures are coupled with pitch accents, as opposed to when the face carries no gestures. From the questionnaires, the results also show that these gestures significantly increase the naturalness and the understanding of the talking head.
10.
  • Al Moubayed, Samer, 1982-, et al. (author)
  • Furhat : A Back-projected Human-like Robot Head for Multiparty Human-Machine Interaction
  • 2012
  • In: Cognitive Behavioural Systems. - Berlin, Heidelberg : Springer Berlin/Heidelberg. - 9783642345838 ; , pp. 114-130
  • Conference paper (peer-reviewed), abstract:
    • In this chapter, we first present a summary of findings from two previous studies on the limitations of using flat displays with embodied conversational agents (ECAs) in the context of face-to-face human-agent interaction. We then motivate the need for a three-dimensional display of faces to guarantee accurate delivery of gaze and directional movements, and present Furhat, a novel, simple, highly effective, and human-like back-projected robot head that utilizes computer animation to deliver facial movements and is equipped with a pan-tilt neck. After presenting a detailed summary of why and how Furhat was built, we discuss the advantages of using optically projected animated agents for interaction. We discuss using such agents in terms of situatedness, environment, context awareness, and social, human-like face-to-face interaction with robots where subtle nonverbal and social facial signals can be communicated. At the end of the chapter, we present a recent application of Furhat as a multimodal multiparty interaction system that was presented at the London Science Museum as part of a robot festival. We conclude by discussing future developments, applications and opportunities of this technology.
11.
  • Al Moubayed, Samer, et al. (author)
  • Furhat goes to Robotville: a large-scale multiparty human-robot interaction data collection in a public space
  • 2012
  • In: Proc. of LREC Workshop on Multimodal Corpora. - Istanbul, Turkey.
  • Conference paper (peer-reviewed), abstract:
    • In the four days of the Robotville exhibition at the London Science Museum, UK, during which the back-projected head Furhat in a situated spoken dialogue system was seen by almost 8,000 visitors, we collected a database of 10,000 utterances spoken to Furhat in situated interaction. The data collection is an example of a particular kind of corpus collection of human-machine dialogues in public spaces that has several interesting and specific characteristics, both with respect to the technical details of the collection and with respect to the resulting corpus contents. In this paper, we take the Furhat data collection as a starting point for a discussion of the motives for this type of data collection, its technical peculiarities and prerequisites, and the characteristics of the resulting corpus.
12.
  • Al Moubayed, Samer, et al. (author)
  • Studies on Using the SynFace Talking Head for the Hearing Impaired
  • 2009
  • In: Proceedings of Fonetik'09. - Stockholm : Stockholm University. - 9789163348921 ; , pp. 140-143
  • Conference paper (other academic/artistic), abstract:
    • SynFace is a lip-synchronized talking agent which is optimized as a visual reading support for the hearing impaired. In this paper we present the large-scale hearing-impaired user studies carried out for three languages in the Hearing at Home project. The user tests focus on measuring the gain in Speech Reception Threshold in Noise and the effort scaling when using SynFace by hearing-impaired people, where groups of hearing-impaired subjects with different impairment levels, from mild to severe and cochlear implants, are tested. Preliminary analysis of the results does not show significant gain in SRT or in effort scaling. But looking at the large cross-subject variability in both tests, it is clear that many subjects benefit from SynFace, especially with speech in stereo babble.
13.
  • Al Moubayed, Samer, et al. (author)
  • Talking with Furhat - multi-party interaction with a back-projected robot head
  • 2012
  • In: Proceedings of Fonetik 2012. - Gothenburg, Sweden. ; , pp. 109-112
  • Conference paper (other academic/artistic), abstract:
    • This is a condensed presentation of some recent work on a back-projected robotic head for multi-party interaction in public settings. We describe some of the design strategies and give some preliminary analysis of an interaction database collected at the Robotville exhibition at the London Science Museum.
14.
  • Al Moubayed, Samer, et al. (author)
  • Virtual Speech Reading Support for Hard of Hearing in a Domestic Multi-Media Setting
  • 2009
  • In: INTERSPEECH 2009. - BAIXAS : ISCA-INST SPEECH COMMUNICATION ASSOC. ; , pp. 1443-1446
  • Conference paper (peer-reviewed), abstract:
    • In this paper we present recent results on the development of the SynFace lip-synchronized talking head towards multilinguality, varying signal conditions and noise robustness in the Hearing at Home project. We then describe the large-scale hearing-impaired user studies carried out for three languages. The user tests focus on measuring the gain in Speech Reception Threshold in Noise when using SynFace, and on measuring the effort scaling when using SynFace by hearing-impaired people. Preliminary analysis of the results does not show significant gain in SRT or in effort scaling. But looking at inter-subject variability, it is clear that many subjects benefit from SynFace, especially with speech in stereo babble noise.
15.
  • Ambrazaitis, Gilbert, et al. (author)
  • Swedish word accents in a ‘confirmation’ context
  • 2007
  • In: Proceedings of Fonetik. - Speech, Music and Hearing Quarterly Progress and Status Report (TMH-QPSR). - 1104-5787. ; 50:1, pp. 49-52
  • Conference paper (other academic/artistic), abstract:
    • An exploratory study on the prosodic signaling of ‘confirmation’ in Swedish is presented. Pairs of subjects read short dialogs, constructed around selected target words, in a conversational style. The target utterances were produced with a rising-falling intonation, lacking any typical ‘focal accent’ (FA). Qualitative observations and acoustic measurements reveal that the signaling of the word accent contrast appears to be, to a certain degree, optional in a confirmation context. The results support the view that no tonal target needs to be assumed for accent I, and further suggest that utterance-level prominence can be realized by other means than the FA.
17.
  • Ayers, Gayle, et al. (author)
  • Modelling intonation in dialogue
  • 1995
  • In: Proceedings of the XIIIth International Congress of Phonetic Sciences. - 9171708367 ; 2, pp. 278-281
  • Conference paper (peer-reviewed)
19.
  • Beskow, Jonas, et al. (author)
  • Analysis and synthesis of multimodal verbal and non-verbal interaction for animated interface agents
  • 2007
  • In: VERBAL AND NONVERBAL COMMUNICATION BEHAVIOURS. - BERLIN : SPRINGER-VERLAG BERLIN. - 9783540764410 ; , pp. 250-263
  • Conference paper (peer-reviewed), abstract:
    • The use of animated talking agents is a novel feature of many multimodal spoken dialogue systems. The addition and integration of a virtual talking head has direct implications for the way in which users approach and interact with such systems. However, understanding the interactions between visual expressions, dialogue functions and the acoustics of the corresponding speech presents a substantial challenge. Some of the visual articulation is closely related to the speech acoustics, while there are other articulatory movements affecting speech acoustics that are not visible on the outside of the face. Many facial gestures used for communicative purposes do not affect the acoustics directly, but might nevertheless be connected on a higher communicative level in which the timing of the gestures could play an important role. This chapter looks into the communicative function of the animated talking agent, and its effect on intelligibility and the flow of the dialogue.
20.
  • Beskow, Jonas, et al. (author)
  • Expressive animated agents for affective dialogue systems
  • 2004
  • In: AFFECTIVE DIALOGUE SYSTEMS, PROCEEDINGS. - BERLIN : SPRINGER. - 3540221433 ; , pp. 240-243
  • Conference paper (peer-reviewed), abstract:
    • We present our current state of development regarding animated agents applicable to affective dialogue systems. A new set of tools is under development to support the creation of animated characters compatible with the MPEG-4 facial animation standard. Furthermore, we have collected a multimodal expressive speech database including video, audio and 3D point motion registration. One of the objectives of collecting the database is to examine how emotional expression influences articulatory patterns, to be able to model this in our agents. Analysis of the 3D data shows, for example, that variation in mouth width due to expression greatly exceeds that due to vowel quality.
21.
  • Beskow, Jonas, et al. (author)
  • Face-to-Face Interaction and the KTH Cooking Show
  • 2010
  • In: Development of multimodal interfaces. - Berlin, Heidelberg : Springer Berlin Heidelberg. - 9783642123962 ; , pp. 157-168
  • Conference paper (peer-reviewed), abstract:
    • We share our experiences with integrating motion capture recordings in speech and dialogue research by describing (1) Spontal, a large project collecting 60 hours of video, audio and motion capture of spontaneous dialogues, with special attention to motion capture and its pitfalls; (2) a tutorial where we use motion capture, speech synthesis and an animated talking head to allow students to create an active listener; and (3) brief preliminary results in the form of visualizations of motion capture data over time in a Spontal dialogue. We hope that, given the lack of writings on the use of motion capture for speech research, these accounts will prove inspirational and informative.
22.
  • Beskow, Jonas, et al. (author)
  • Focal accent and facial movements in expressive speech
  • 2006
  • In: Proceedings from Fonetik 2006, Lund, June 7-9, 2006. - Lund : Lund University. ; , pp. 9-12
  • Conference paper (other academic/artistic), abstract:
    • In this paper, we present measurements of visual, facial parameters obtained from a speech corpus consisting of short, read utterances in which focal accent was systematically varied. The utterances were recorded in a variety of expressive modes including Certain, Confirming, Questioning, Uncertain, Happy, Angry and Neutral. Results showed that in all expressive modes, words with focal accent are accompanied by a greater variation of the facial parameters than are words in non-focal positions. Moreover, interesting differences between the expressions in terms of different parameters were found.
23.
  • Beskow, Jonas, et al. (author)
  • Goda utsikter för teckenspråksteknologi
  • 2010
  • In: Språkteknologi för ökad tillgänglighet. - Linköping : Linköping University Electronic Press. - 9789173930949 - 9789173930956 ; , pp. 77-86
  • Conference paper (other academic/artistic), abstract:
    • Today there are major gaps in societal accessibility when it comes to sign language interpretation. New advances in computer and animation technology, together with the last decade's research on synthetic sign language interpretation, mean that there are now new opportunities to find technical solutions with the potential to considerably improve accessibility for sign language users, for certain types of services or situations. Sweden today has about 30,000 sign language users. The state of knowledge has developed greatly in recent years, both regarding the understanding and description of sign language and the technical prerequisites for analysing, storing and generating sign language. In this chapter we describe the various technologies required to develop sign language technology. Over the last decade, research on sign language technology has gathered pace, and a number of international projects have started. So far only a few applications have become generally available. We give examples of both research projects and early applications, especially from Europe, where development has been very strong. The prospects for starting Swedish development in this area must be considered good: the knowledge prerequisites are excellent, with technical expertise in language technology, multimodal registration and animation at KTH among others, combined with specialist knowledge of Swedish sign language and sign language use at Stockholm University.
24.
  • Beskow, Jonas, et al. (author)
  • Hearing at Home : Communication support in home environments for hearing impaired persons
  • 2008
  • In: INTERSPEECH 2008. - BAIXAS : ISCA-INST SPEECH COMMUNICATION ASSOC. - 9781615673780 ; , pp. 2203-2206
  • Conference paper (peer-reviewed), abstract:
    • The Hearing at Home (HaH) project focuses on the needs of hearing-impaired people in home environments. The project is researching and developing an innovative media-center solution for hearing support, with several integrated features that support perception of speech and audio, such as individual loudness amplification, noise reduction, audio classification and event detection, and the possibility to display an animated talking head providing real-time speechreading support. In this paper we provide a brief project overview and then describe some recent results related to the audio classifier and the talking head. As the talking head expects clean speech input, an audio classifier has been developed for the task of classifying audio signals as clean speech, speech in noise or other. The mean accuracy of the classifier was 82%. The talking head (based on technology from the SynFace project) has been adapted for German, and a small speech-in-noise intelligibility experiment was conducted where sentence recognition rates increased from 3% to 17% when the talking head was present.
25.
  • Beskow, Jonas, et al. (author)
  • Human Recognition of Swedish Dialects
  • 2008
  • In: Proceedings of Fonetik 2008. - Göteborg : Göteborgs universitet. - 9789197719605 ; , pp. 61-64
  • Conference paper (other academic/artistic), abstract:
    • Our recent work within the research project SIMULEKT (Simulating Intonational Varieties of Swedish) involves a pilot perception test, used for detecting tendencies in human clustering of Swedish dialects. 30 Swedish listeners were asked to identify the geographical origin of 72 Swedish native speakers by clicking on a map of Sweden. Results indicate, for example, that listeners from the south of Sweden are generally better at recognizing some major Swedish dialects than listeners from the central part of Sweden.
27.
  • Beskow, Jonas, et al. (author)
  • Innovative interfaces in MonAMI : The Reminder
  • 2008
  • In: Perception In Multimodal Dialogue Systems, Proceedings. - Berlin, Heidelberg : Springer Berlin Heidelberg. - 9783540693680 ; , pp. 272-275
  • Conference paper (peer-reviewed), abstract:
    • This demo paper presents the first version of the Reminder, a prototype ECA developed in the European project MonAMI, which aims at "mainstreaming accessibility in consumer goods and services, using advanced technologies to ensure equal access, independent living and participation for all". The Reminder helps users to plan activities and to remember what to do. The prototype merges ECA technology with other, existing technologies: Google Calendar and a digital pen and paper. This innovative combination of modalities allows users to continue using a paper calendar in the manner they are used to, whilst the ECA provides verbal notifications on what has been written in the calendar. Users may also ask questions such as "When was I supposed to meet Sara?" or "What's on my schedule today?"
28.
  • Beskow, Jonas, et al. (author)
  • Multimodal Interaction Control
  • 2009
  • In: Computers in the Human Interaction Loop. - Berlin/Heidelberg : Springer Berlin/Heidelberg. - 9781848820531 - 9781848820548 ; , pp. 143-158
  • Book chapter (peer-reviewed)
29.
  • Beskow, Jonas, et al. (author)
  • Recognizing and Modelling Regional Varieties of Swedish
  • 2008
  • In: INTERSPEECH 2008. - 9781615673780 ; , pp. 512-515
  • Conference paper (peer-reviewed), abstract:
    • Our recent work within the research project SIMULEKT (Simulating Intonational Varieties of Swedish) includes two approaches. The first involves a pilot perception test, used for detecting tendencies in human clustering of Swedish dialects. 30 Swedish listeners were asked to identify the geographical origin of Swedish native speakers by clicking on a map of Sweden. Results indicate, for example, that listeners from the south of Sweden are better at recognizing some major Swedish dialects than listeners from the central part of Sweden, which includes the capital area. The second approach concerns a method for modelling intonation using the newly developed SWING (Swedish INtonation Generator) tool, where annotated speech samples are resynthesized with rule-based intonation and audiovisually analysed with regard to the major intonational varieties of Swedish. We consider both approaches important in our aim to test and further develop the Swedish prosody model.
30.
  • Beskow, Jonas, et al. (author)
  • Resynthesis of Facial and Intraoral Articulation from Simultaneous Measurements
  • 2003
  • In: Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS'03). - Adelaide : Casual Productions. - 1876346493
  • Conference paper (other academic/artistic), abstract:
    • Simultaneous measurements of tongue and facial motion, using a combination of electromagnetic articulography (EMA) and optical motion tracking, are analysed to improve the articulation of an animated talking head and to investigate the correlation between facial and vocal tract movement. The recorded material consists of VCV and CVC words and 270 short everyday sentences spoken by one Swedish subject. The recorded articulatory movements are re-synthesised by a parametrically controlled 3D model of the face and tongue, using a procedure involving minimisation of the error between measurement and model. Using linear estimators, tongue data is predicted from the face and vice versa, and the correlation between measurement and prediction is computed.
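The last step of this abstract, predicting tongue data from face data with linear estimators and computing the correlation between measurement and prediction, can be sketched roughly as follows. The data, noise level and parameter dimensions are invented for illustration and merely stand in for the EMA and optical measurements:

```python
import numpy as np

# Invented stand-in data: per-frame face-marker and tongue-coil parameter
# vectors with a hidden linear relationship plus measurement noise.
rng = np.random.default_rng(1)
n_frames, n_face, n_tongue = 1000, 10, 4

face = rng.normal(size=(n_frames, n_face))
mixing = rng.normal(size=(n_face, n_tongue))
tongue = face @ mixing + 0.1 * rng.normal(size=(n_frames, n_tongue))

# Linear estimator: least-squares weights predicting tongue from face.
W, *_ = np.linalg.lstsq(face, tongue, rcond=None)
pred = face @ W

# Correlation between measurement and prediction, per tongue channel.
corr = [np.corrcoef(tongue[:, i], pred[:, i])[0, 1] for i in range(n_tongue)]
print(all(c > 0.9 for c in corr))  # high correlation on this synthetic data
```

Predicting face from tongue ("and vice versa" in the abstract) is the same fit with the roles of the two matrices swapped.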
31.
  • Beskow, Jonas, et al. (author)
  • Speech technology in the European project MonAMI
  • 2008
  • In: Proceedings of FONETIK 2008. - Gothenburg, Sweden : University of Gothenburg. - 9789197719605 ; , pp. 33-36
  • Conference paper (other academic/artistic), abstract:
    • This paper describes the role of speech and speech technology in the European project MonAMI, which aims at “mainstreaming accessibility in consumer goods and services, using advanced technologies to ensure equal access, independent living and participation for all”. It presents the Reminder, a prototype embodied conversational agent (ECA) which helps users to plan activities and to remember what to do. The prototype merges speech technology with other, existing technologies: Google Calendar and a digital pen and paper. The solution allows users to continue using a paper calendar in the manner they are used to, whilst the ECA provides notifications on what has been written in the calendar. Users may also ask questions such as “When was I supposed to meet Sara?” or “What’s on my schedule today?”
33.
  • Beskow, Jonas, et al. (author)
  • The MonAMI Reminder : a spoken dialogue system for face-to-face interaction
  • 2009
  • In: Proceedings of the 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009. - Brighton, U.K. ; , pp. 300-303
  • Conference paper (peer-reviewed), abstract:
    • We describe the MonAMI Reminder, a multimodal spoken dialogue system which can assist elderly and disabled people in organising and initiating their daily activities. Based on deep interviews with potential users, we have designed a calendar and reminder application which uses an innovative mix of an embodied conversational agent, digital pen and paper, and the web to meet the needs of those users as well as the current constraints of speech technology. We also explore the use of head pose tracking for interaction and attention control in human-computer face-to-face interaction.
34.
  • Beskow, Jonas, et al. (author)
  • The Swedish PFs-Star Multimodal Corpora
  • 2004
  • In: Proceedings of LREC Workshop on Models of Human Behaviour for the Specification and Evaluation of Multimodal Input and Output Interfaces. ; , pp. 34-37
  • Conference paper (peer-reviewed), abstract:
    • The aim of this paper is to present the multimodal speech corpora collected at KTH, in the framework of the European project PF-Star, and discuss some of the issues related to the analysis and implementation of human communicative and emotional visual correlates of speech in synthetic conversational agents. Two multimodal speech corpora have been collected by means of an opto-electronic system, which allows capturing the dynamics of emotional facial expressions with very high precision. The data has been evaluated through a classification test and the results show promising identification rates for the different acted emotions. These multimodal speech corpora will truly represent a valuable source to get more knowledge about how speech articulation and communicative gestures are affected by the expression of emotions.
35.
  • Beskow, Jonas, et al. (author)
  • Visual correlates to prominence in several expressive modes
  • 2006
  • In: INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING. - BAIXAS : ISCA-INST SPEECH COMMUNICATION ASSOC. ; , pp. 1272-1275
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we present measurements of visual, facial parameters obtained from a speech corpus consisting of short, read utterances in which focal accent was systematically varied. The utterances were recorded in a variety of expressive modes including certain, confirming, questioning, uncertain, happy, angry and neutral. Results showed that in all expressive modes, words with focal accent are accompanied by a greater variation of the facial parameters than are words in non-focal positions. Moreover, interesting differences between the expressions in terms of different parameters were found.
36.
  • Beskow, Jonas, et al. (author)
  • Visualization of speech and audio for hearing-impaired persons
  • 2008
  • In: Technology and Disability. - : IOS Press. - 1055-4181 .- 1878-643X. ; 20:2, pp. 97-107
  • Journal article (peer-reviewed), abstract:
    • Speech and sounds are important sources of information in our everyday lives for communication with our environment, be it interacting with fellow humans or directing our attention to technical devices with sound signals. For hearing-impaired persons this acoustic information must be supplemented or even replaced by cues using other senses. We believe that the most natural modality to use is the visual, since speech is fundamentally audiovisual and these two modalities are complementary. We are hence exploring how different visualization methods for speech and audio signals may support hearing-impaired persons. The goal in this line of research is to allow the growing number of hearing-impaired persons, children as well as the middle-aged and elderly, equal participation in communication. A number of visualization techniques are proposed and exemplified with applications for hearing-impaired persons.
37.
  • Bisitz, T., et al. (author)
  • Noise Reduction for Media Streams
  • 2009
  • In: NAG/DAGA'09 International Conference on Acoustics. - Red Hook, USA : Curran Associates, Inc. - 9781618391995
  • Conference paper (peer-reviewed)
38.
  • Blomberg, Mats, et al. (författare)
  • Children and adults in dialogue with the robot head Furhat - corpus collection and initial analysis
  • 2012
  • Ingår i: Proceedings of WOCCI. - Portland, OR : International Speech Communication Association (ISCA).
  • Konferensbidrag (refereegranskat)abstract
    • This paper presents a large-scale study in a public museum setting, where a back-projected robot head interacted with the visitors in multi-party dialogue. The exhibition was seen by almost 8000 visitors, out of which several thousand interacted with the system. A considerable portion of the visitors were children from around 4 years of age and adolescents. The collected corpus consists of about 10,000 user utterances. The head and a multi-party dialogue design allow the system to regulate the turn-taking behaviour, and help the robot to effectively obtain information from the general public. The commercial speech recognition component, presumably designed for adult speakers, had considerably lower accuracy for the children. Methods are proposed for improving the performance for that speaker category.
  •  
39.
  • Blomberg, Mats, et al. (författare)
  • Speech recognition based on a text-to-speech synthesis system
  • 1987
  • Konferensbidrag (refereegranskat)abstract
    • A major problem in large-vocabulary speech recognition is the collection of reference data and speaker normalization. In this paper we propose the use of synthetic speech as a means of handling this problem. An experimental scheme for such a system will be described.
  •  
40.
  • Borgqvist, Martin, et al. (författare)
  • Förstudie av testbädd för bränsleceller
  • 2013
  • Rapport (övrigt vetenskapligt/konstnärligt)abstract
    • Prestudy of test bed for fuel cells. This report investigates the potential needs and benefits of a Swedish national test bed for fuel cell and hydrogen technologies. The analysis is based on an interview study among 43 organisations within the field, as well as on inventory studies on existing test infrastructure in Sweden. The result is aggregated into a proposal that describes a test bed in terms of functionality and organisation.
  •  
41.
  • Botinis, A., et al. (författare)
  • Developments and paradigms in intonation research
  • 2001
  • Ingår i: Speech Communication. - 0167-6393 .- 1872-7182. ; 33:4, s. 263-296
  • Forskningsöversikt (refereegranskat)abstract
    • The present tutorial paper is addressed to a wide audience with different discipline backgrounds as well as variable expertise on intonation. The paper is structured into five sections. In Section 1, Introduction, basic concepts of intonation and prosody are summarised and cornerstones of intonation research are highlighted. In Section 2, Functions and forms of intonation, a wide range of functions from morpholexical and phrase levels to discourse and dialogue levels are discussed and forms of intonation with examples from different languages are presented. In Section 3, Modelling and labelling of intonation, established models of intonation as well as labelling systems are presented. In Section 4, Applications of intonation, the most widespread applications of intonation, especially technological ones, are presented and methodological issues are discussed. In Section 5, Research perspective, research avenues and ultimate goals as well as the significance and benefits of intonation research in the upcoming years are outlined.
  •  
42.
  • Bruce, Gösta, et al. (författare)
  • Developing the modelling of Swedish prosody in spontaneous dialogue
  • 1996
  • Ingår i: ICSLP 96 : proceedings, Fourth International Conference on Spoken Language Processing. - 0780335554 ; 1, s. 370-373
  • Konferensbidrag (refereegranskat)abstract
    • The main goal of our current research is the development of the Swedish prosody model. In our analysis of discourse and dialogue intonation, we are exploiting model-based resynthesis. By comparing synthesized default and fine-tuned pitch contours for the dialogues under study, we are able to isolate relevant intonation patterns. This analysis of intonation is related to an independent modelling of topic structure consisting of lexical-semantic analysis and text segmentation. Some results from our model-based acoustic analysis are presented, and its implementation in text-to-speech-synthesis is discussed.
  •  
43.
  •  
44.
  • Bruce, Gösta, et al. (författare)
  • Modelling intonation in varieties of swedish
  • 2008
  • Ingår i: Proceedings of the 4th International Conference on Speech Prosody, SP 2008. - : International Speech Communication Association. - 9780616220030 ; , s. 571-574
  • Konferensbidrag (refereegranskat)abstract
    • The research project Simulating intonational varieties of Swedish (SIMULEKT) aims to gain more precise and thorough knowledge about some major regional varieties of Swedish: South, Göta, Svea, Gotland, Dala, North, and Finland Swedish. In this research effort, the Swedish prosody model and different forms of speech synthesis play a prominent role. The two speech databases SweDia 2000 and SpeechDat constitute our main material for analysis. As a first test case for our prosody model, we compared Svea and North Swedish intonation in a pilot production-oriented perception test. Naïve Swedish listeners were asked to identify the most Svea and North sounding stimuli. Results showed that listeners can differentiate between the two varieties from intonation only. They also provided information on how intonational parameters affect listeners' impression of Swedish varieties. All this indicates that our experimental method can be used to test perception of different regional varieties of Swedish.
  •  
45.
  •  
46.
  • Bruce, Gösta, et al. (författare)
  • On the analysis of prosody in interaction
  • 1997
  • Ingår i: Computing Prosody: Computational Models for Processing Spontaneous Speech. - 038794804X ; , s. 43-59
  • Bokkapitel (populärvet., debatt m.m.)
  •  
47.
  •  
48.
  • Bruce, Gösta, et al. (författare)
  • SIMULEKT : modelling Swedish regional intonation
  • 2007
  • Ingår i: Proceedings of Fonetik 2007. - Stockholm : KTH Royal Institute of Technology. ; , s. 121-124
  • Konferensbidrag (övrigt vetenskapligt/konstnärligt)abstract
    • This paper introduces a new research project Simulating Intonational Varieties of Swedish (SIMULEKT). The basic goal of the project is to produce more precise and thorough knowledge about some major intonational varieties of Swedish. In this research effort the Swedish prosody model plays a prominent role. A fundamental idea is to take advantage of speech synthesis in different forms. In our analysis and synthesis work we will focus on some major intonational types: South, Göta, Svea, Gotland, Dala, North, and Finland Swedish. The significance of our project work will be within basic research as well as in speech technology applications.
  •  
49.
  •  
50.
  • Bruce, Gösta, et al. (författare)
  • Speech synthesis in spoken dialogue research
  • 1995
  • Ingår i: Proceedings of the 4th European Conference on Speech Communication and Technology (Eurospeech'95). ; 2, s. 1169-1172
  • Konferensbidrag (refereegranskat)
  •  
Typ av publikation
konferensbidrag (61)
tidskriftsartikel (12)
bokkapitel (11)
rapport (3)
doktorsavhandling (3)
licentiatavhandling (2)
forskningsöversikt (1)
Typ av innehåll
refereegranskat (68)
övrigt vetenskapligt/konstnärligt (23)
populärvet., debatt m.m. (2)
Författare/redaktör
Granström, Björn (86)
Beskow, Jonas (40)
House, David (28)
Bruce, Gösta (23)
Gustafson, Joakim (13)
Al Moubayed, Samer (12)
Skantze, Gabriel (10)
Salvi, Giampiero (7)
Edlund, Jens (6)
Blomberg, Mats (6)
Frid, Johan (5)
Enflo, Laura (5)
Agelfors, Eva (4)
Sundberg, Johan (3)
Spens, Karl-Erik (3)
Öhman, Tobias (3)
Strangert, Eva (2)
Botinis, A (2)
Friberg, Anders (2)
Lundeberg, Magnus (2)
Lindblom, Björn (2)
Al Moubayed, Samer, ... (2)
Tscheligi, Manfred (2)
Öster, Anne-Marie (2)
van Son, Nic (2)
Ormel, Ellen (2)
Herzke, Tobias (2)
Asnafi, Nader, 1960- (1)
Olsson, Håkan (1)
Aaltonen, Olli (1)
Engstrand, Olle (1)
Segerup, My (1)
Johansson, Christer (1)
Holmgren, Björn (1)
Nilsson, Björn (1)
Claesson, Ingvar (1)
Dahlquist, Martin (1)
Dahlquist, M (1)
Lundeberg, M (1)
Spens, K-E (1)
Karlsson, Inger (1)
Leisner, Peter (1)
Nordebo, Sven (1)
Alexanderson, Simon (1)
Mirning, Nicole (1)
Mirning, N. (1)
Öster, Ann-Marie (1)
Pettersson, Anders (1)
Megyesi, Beata (1)
Alexandersson, Anna (1)
Lärosäte
Kungliga Tekniska Högskolan (64)
Lunds universitet (23)
Uppsala universitet (2)
Stockholms universitet (2)
Umeå universitet (1)
Mälardalens universitet (1)
Örebro universitet (1)
RISE (1)
Karolinska Institutet (1)
Högskolan Dalarna (1)
Blekinge Tekniska Högskola (1)
Sveriges Lantbruksuniversitet (1)
Språk
Engelska (91)
Svenska (2)
Forskningsämne (UKÄ/SCB)
Naturvetenskap (56)
Humaniora (27)
Teknik (4)
Samhällsvetenskap (4)
Medicin och hälsovetenskap (1)
Lantbruksvetenskap (1)