SwePub
Search the SwePub database

  Advanced search

Result list for search "WFRF:(Al Moubayed Samer)"

Search: WFRF:(Al Moubayed Samer)

  • Results 1-10 of 58
Sort/group the result list

Numbering | Reference | Cover image | Find
1.
  • Abou Zliekha, M., et al. (author)
  • Emotional Audio-Visual Arabic Text to Speech
  • 2006
  • In: Proceedings of the XIV European Signal Processing Conference (EUSIPCO). - Florence, Italy.
  • Conference paper (peer-reviewed), abstract:
    • The goal of this paper is to present an emotional audio-visual text-to-speech system for the Arabic language. The system is based on two entities: an emotional audio text-to-speech system, which generates speech depending on the input text and the desired emotion type, and an emotional visual model, which generates the talking head by forming the corresponding visemes. The phoneme-to-viseme mapping and the emotion shaping use a 3-parametric face model based on the Abstract Muscle Model (a toy version of this mapping is sketched after this entry). We have thirteen viseme models and five emotions as parameters to the face model. The TTS produces the phonemes corresponding to the input text and the speech with the suitable prosody to convey the prescribed emotion. In parallel, the system generates the visemes and sends the controls to the facial model to animate the talking head in real time.
  •  
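
A minimal sketch of the phoneme-to-viseme lookup described above, in Python. The phoneme and viseme labels here are entirely hypothetical placeholders; the paper itself uses thirteen visemes, which are not enumerated in the abstract:

    # Hypothetical phoneme-to-viseme table (illustrative labels only).
    PHONEME_TO_VISEME = {
        "b": "bilabial", "m": "bilabial", "p": "bilabial",
        "f": "labiodental", "v": "labiodental",
        "a": "open", "i": "spread", "u": "rounded",
    }

    def phonemes_to_visemes(phonemes, default="neutral"):
        """Map a TTS phoneme sequence to viseme controls for a face model."""
        return [PHONEME_TO_VISEME.get(p, default) for p in phonemes]

    print(phonemes_to_visemes(["m", "a", "f", "i"]))
    # -> ['bilabial', 'open', 'labiodental', 'spread']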
2.
  • Al Dakkak, O., et al. (author)
  • Emotional Inclusion in An Arabic Text-To-Speech
  • 2005
  • In: Proceedings of the 13th European Signal Processing Conference (EUSIPCO). - Antalya, Turkey. - 9781604238211
  • Conference paper (peer-reviewed), abstract:
    • The goal of this paper is to present an emotional audio-visual text-to-speech system for the Arabic language. The system is based on two entities: an emotional audio text-to-speech system, which generates speech depending on the input text and the desired emotion type, and an emotional visual model, which generates the talking head by forming the corresponding visemes. The phoneme-to-viseme mapping and the emotion shaping use a 3-parametric face model based on the Abstract Muscle Model. We have thirteen viseme models and five emotions as parameters to the face model. The TTS produces the phonemes corresponding to the input text and the speech with the suitable prosody to convey the prescribed emotion. In parallel, the system generates the visemes and sends the controls to the facial model to animate the talking head in real time.
  •  
3.
  • Al Dakkak, O., et al. (author)
  • Prosodic Feature Introduction and Emotion Incorporation in an Arabic TTS
  • 2006
  • In: Proceedings of IEEE International Conference on Information and Communication Technologies. - Damascus, Syria. - 0780395212, pp. 1317-1322
  • Conference paper (peer-reviewed), abstract:
    • Text-to-speech is a crucial part of many man-machine communication applications, such as phone booking and banking and vocal e-mail, in addition to many applications for impaired persons, such as reading machines for the blind and talking machines for persons with speech difficulties. However, the main drawback of most speech synthesizers in talking machines is their metallic sound. In order to sound natural, we have to incorporate prosodic features as close as possible to natural prosody; this helps to improve the quality of the synthetic speech. Current research worldwide is directed towards better "automatic prosody generation".
  •  
4.
  • Agarwal, Priyanshu, et al. (author)
  • Imitating Human Movement with Teleoperated Robotic Head
  • 2016
  • In: 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). - 9781509039296, pp. 630-637
  • Conference paper (peer-reviewed), abstract:
    • Effective teleoperation requires real-time control of a remote robotic system. In this work, we develop a controller for realizing smooth and accurate motion of a robotic head, with application to a teleoperation system for the Furhat robot head [1], which we call TeleFurhat. The controller uses the head motion of an operator, measured by a Microsoft Kinect 2 sensor, as reference and applies a processing framework to condition and render the motion on the robot head. The processing framework includes a pre-filter based on a moving average filter, a neural network-based model for improving the accuracy of the raw pose measurements of the Kinect, and a constrained-state Kalman filter that uses a minimum jerk model to smooth motion trajectories and limit the magnitude of changes in position, velocity, and acceleration (a simplified version of this smoothing stage is sketched after this entry). Our results demonstrate that the robot can reproduce the human head motion in real time with a latency of approximately 100 to 170 ms while operating within its physical limits. Furthermore, viewers prefer our new method over rendering the raw pose data from the Kinect.
  •  
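
A minimal sketch, assuming hypothetical sampling and limit parameters, of the motion-conditioning pipeline described above: a moving-average pre-filter followed by a velocity- and acceleration-limited smoother. The limiter is a simplified stand-in for the paper's constrained-state Kalman filter with a minimum jerk model, not the authors' implementation:

    import numpy as np

    def moving_average(samples, window=5):
        """Pre-filter: smooth raw head-pose samples on one axis."""
        kernel = np.ones(window) / window
        return np.convolve(samples, kernel, mode="same")

    def limited_smooth(samples, dt=0.033, max_vel=2.0, max_acc=10.0):
        """Track the input while clamping per-step velocity and
        acceleration, so the output stays within the robot's limits."""
        out = np.empty_like(samples)
        out[0], vel = samples[0], 0.0
        for i in range(1, len(samples)):
            desired_vel = (samples[i] - out[i - 1]) / dt
            # limit acceleration first, then velocity
            desired_vel = np.clip(desired_vel, vel - max_acc * dt, vel + max_acc * dt)
            vel = np.clip(desired_vel, -max_vel, max_vel)
            out[i] = out[i - 1] + vel * dt
        return out

    # Hypothetical usage on one rotation axis (radians) sampled at ~30 Hz:
    raw = np.cumsum(np.random.randn(300) * 0.01)
    smooth = limited_smooth(moving_average(raw))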
5.
  • Al Moubayed, Samer, et al. (author)
  • A novel Skype interface using SynFace for virtual speech reading support
  • 2011
  • In: Proceedings from Fonetik 2011, June 8 - June 10, 2011. - Stockholm, Sweden, pp. 33-36
  • Conference paper (other academic/artistic), abstract:
    • We describe in this paper a support client interface to the IP telephony application Skype. The system uses a variant of SynFace, a real-time speech reading support system using facial animation. The new interface is designed for use by elderly persons and tailored for systems supporting touch screens. The SynFace real-time facial animation system has previously been shown to enhance speech comprehension for hearing-impaired persons. In this study we carry out at-home field studies on five subjects in the EU project MonAMI. We present insights from interviews with the test subjects on the advantages of the system, and on the limitations of such real-time speech reading technology in reaching the homes of the elderly and the hard of hearing.
  •  
6.
  • Al Moubayed, Samer, et al. (author)
  • A robotic head using projected animated faces
  • 2011
  • In: Proceedings of the International Conference on Audio-Visual Speech Processing 2011. - Stockholm : KTH Royal Institute of Technology, pp. 71-
  • Conference paper (peer-reviewed), abstract:
    • This paper presents a setup which employs virtual animated agents for robotic heads. The system uses a laser projector to project animated faces onto a three-dimensional face mask. Projecting animated faces onto a three-dimensional head surface, as an alternative to using flat, two-dimensional surfaces, eliminates several deteriorating effects and illusions that come with flat surfaces in interaction settings, such as those requiring exclusive mutual gaze and situated, multi-partner dialogue. In addition, it provides robotic heads with a flexible solution for facial animation that takes advantage of the advancements of computer-graphics facial animation over mechanically controlled heads.
  •  
7.
  • Al Moubayed, Samer, et al. (author)
  • Acoustic-to-Articulatory Inversion based on Local Regression
  • 2010
  • In: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010. - Makuhari, Japan. - 9781617821233, pp. 937-940
  • Conference paper (peer-reviewed), abstract:
    • This paper presents an acoustic-to-articulatory inversion method based on local regression. Two types of local regression, a non-parametric and a local linear regression, have been applied to a corpus containing simultaneous recordings of articulator positions and the corresponding acoustics. A maximum likelihood trajectory smoothing using the estimated dynamics of the articulators is also applied to the regression estimates. The average root mean square error in estimating articulatory positions, given the acoustics, is 1.56 mm for the non-parametric regression and 1.52 mm for the local linear regression (a sketch of local linear regression follows this entry). The local linear regression is found to perform significantly better than regression using Gaussian Mixture Models with the same acoustic and articulatory features.
  •  
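
A minimal sketch of local linear regression for this kind of inversion, assuming synthetic data in place of the articulatory corpus: for each acoustic query frame, an ordinary least-squares model is fitted on its k nearest training frames in acoustic space (the feature dimensions and k below are illustrative, not the paper's):

    import numpy as np

    def local_linear_regression(X_train, Y_train, x_query, k=20):
        """Predict articulator positions for one acoustic frame from a
        linear fit over the k nearest training frames in acoustic space."""
        dists = np.linalg.norm(X_train - x_query, axis=1)
        idx = np.argsort(dists)[:k]
        X_k = np.hstack([np.ones((k, 1)), X_train[idx]])  # bias column
        coef, *_ = np.linalg.lstsq(X_k, Y_train[idx], rcond=None)
        return np.concatenate(([1.0], x_query)) @ coef

    # Hypothetical usage: 12-dim acoustic features -> 2-dim articulator positions
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 12))
    Y = X[:, :2] * 1.5 + rng.normal(scale=0.1, size=(1000, 2))
    print(local_linear_regression(X, Y, X[0]))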
8.
  • Al Moubayed, Samer, et al. (author)
  • Analysis of gaze and speech patterns in three-party quiz game interaction
  • 2013
  • In: Interspeech 2013. - The International Speech Communication Association (ISCA), pp. 1126-1130
  • Conference paper (peer-reviewed), abstract:
    • In order to understand and model the dynamics between interaction phenomena such as gaze and speech in face-to-face multiparty interaction between humans, we need large quantities of reliable, objective data of such interactions. To date, this type of data is in short supply. We present a data collection setup using automated, objective techniques in which we capture the gaze and speech patterns of triads deeply engaged in a high-stakes quiz game. The resulting corpus consists of five one-hour recordings, and is unique in that it makes use of three state-of-the-art gaze trackers (one per subject) in combination with a state-of-the-art conical microphone array designed to capture roundtable meetings. Several video channels are also included. In this paper we present the obstacles we encountered and the possibilities afforded by a synchronised, reliable combination of large-scale multi-party speech and gaze data, and an overview of the first analyses of the data. Index Terms: multimodal corpus, multiparty dialogue, gaze patterns, multiparty gaze.
  •  
9.
  • Al Moubayed, Samer, et al. (author)
  • Animated Faces for Robotic Heads : Gaze and Beyond
  • 2011
  • In: Analysis of Verbal and Nonverbal Communication and Enactment. - Berlin, Heidelberg : Springer Berlin/Heidelberg. - 9783642257742, pp. 19-35
  • Conference paper (peer-reviewed), abstract:
    • We introduce an approach to using animated faces for robotics where a static physical object is used as a projection surface for an animation: the talking head is projected onto a 3D physical head model. In this chapter we discuss the benefits this approach adds over mechanical heads. We then investigate a phenomenon commonly referred to as the Mona Lisa gaze effect. This effect results from the use of 2D surfaces to display 3D images and causes the gaze of a portrait to seemingly follow the observer no matter where it is viewed from. The experiment investigates observers' perception of gaze direction. The analysis shows that the 3D model eliminates the effect and provides an accurate perception of gaze direction. Finally, we discuss the different requirements of gaze in interactive systems and explore the settings these findings give access to.
  •  
10.
  • Al Moubayed, Samer, et al. (author)
  • Audio-Visual Prosody : Perception, Detection, and Synthesis of Prominence
  • 2010
  • In: 3rd COST 2102 International Training School on Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces. - Berlin, Heidelberg : Springer Berlin Heidelberg. - 9783642181832, pp. 55-71
  • Conference paper (peer-reviewed), abstract:
    • In this chapter, we investigate the effects of facial prominence cues, in terms of gestures, when synthesized on animated talking heads. In the first study, a speech intelligibility experiment is conducted in which speech quality is acoustically degraded and the speech is then presented to 12 subjects through a lip-synchronized talking head carrying head-nod and eyebrow-raising gestures. The experiment shows that perceiving visual prominence gestures synchronized with auditory prominence significantly increases speech intelligibility, compared to when these gestures are randomly added to the speech. We also present a study examining the perception of the talking head's behavior when gestures are added at pitch movements. Using eye-gaze tracking technology and questionnaires with 10 moderately hearing-impaired subjects, the gaze data show that users look at the face in a fashion similar to how they look at a natural face when gestures are coupled with pitch movements, as opposed to when the face carries no gestures. The questionnaires also show that these gestures significantly increase the naturalness and helpfulness of the talking head.
  •  
Create references, e-mail, monitor and link
  • Results 1-10 of 58
