SwePub
Results list for the search "WFRF:(Stefanov Kalin)"

Search: WFRF:(Stefanov Kalin)

  • Results 1-10 of 22
1.
  • Adiban, Mohammad, et al. (author)
  • Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation
  • 2022
  • In: The 33rd British Machine Vision Conference Proceedings.
  • Conference paper (peer-reviewed), abstract:
    • We propose a multi-layer variational autoencoder method, which we call HR-VQVAE, that learns hierarchical discrete representations of the data. By utilizing a novel objective function, each layer in HR-VQVAE learns a discrete representation of the residual from previous layers through a vector quantized encoder. Furthermore, the representations at each layer are hierarchically linked to those at previous layers. We evaluate our method on the tasks of image reconstruction and generation. Experimental results demonstrate that the discrete representations learned by HR-VQVAE enable the decoder to reconstruct high-quality images with less distortion than the baseline methods, namely VQVAE and VQVAE-2. HR-VQVAE can also generate high-quality and diverse images that outperform state-of-the-art generative models, providing further verification of the efficiency of the learned representations. The hierarchical nature of HR-VQVAE i) reduces the decoding search time, making the method particularly suitable for high-load tasks, and ii) allows the codebook size to be increased without incurring the codebook collapse problem.
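
The layered residual quantization described in this abstract can be illustrated with a small sketch. The Python code below is a hypothetical toy with fixed random codebooks, not the paper's model (which learns the codebooks, encoder, decoder, and hierarchical links through a novel objective): each layer quantizes the residual left over by the previous layers, and the reconstruction is the sum of the quantized parts.

    import numpy as np

    def quantize(x, codebook):
        # Map each row of x to its nearest codebook entry (vector quantization).
        distances = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        idx = distances.argmin(1)
        return codebook[idx], idx

    def hierarchical_residual_vq(x, codebooks):
        # Encode x as a sum of quantized residuals, one codebook per layer.
        residual = x
        recon = np.zeros_like(x)
        indices = []
        for cb in codebooks:
            q, idx = quantize(residual, cb)  # quantize what earlier layers missed
            recon = recon + q                # accumulate the reconstruction
            residual = residual - q          # pass the remaining error down
            indices.append(idx)
        return recon, indices

    # Toy demo: 3 layers, each with a random 16-entry codebook over 8-dim vectors.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(32, 8))
    codebooks = [rng.normal(size=(16, 8)) for _ in range(3)]
    recon, codes = hierarchical_residual_vq(x, codebooks)
    print("reconstruction error:", np.mean((x - recon) ** 2))

With random codebooks the error only shrinks modestly per layer; in HR-VQVAE the codebooks are learned, which is what makes each layer's residual representation effective.
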
2.
  • Al Moubayed, Samer, et al. (author)
  • Human-robot Collaborative Tutoring Using Multiparty Multimodal Spoken Dialogue
  • 2014
  • Conference paper (peer-reviewed), abstract:
    • In this paper, we describe a project that explores a novel experimental setup towards building a spoken, multi-modally rich, and human-like multiparty tutoring robot. A human-robot interaction setup is designed, and a human-human dialogue corpus is collected. The corpus targets the development of a dialogue system platform to study verbal and nonverbal tutoring strategies in multiparty spoken interactions with robots which are capable of spoken dialogue. The dialogue task is centered on two participants involved in a dialogue aiming to solve a card-ordering game. Along with the participants sits a tutor (robot) that helps the participants perform the task, and organizes and balances their interaction. Different multimodal signals captured and auto-synchronized by different audio-visual capture technologies, such as a microphone array, Kinects, and video cameras, were coupled with manual annotations. These are used to build a situated model of the interaction based on the participants' personalities, their state of attention, their conversational engagement and verbal dominance, and how that is correlated with the verbal and visual feedback, turn-management, and conversation regulatory actions generated by the tutor. Driven by the analysis of the corpus, we will also show the detailed design methodologies for an affective and multimodally rich dialogue system that allows the robot to measure incrementally the attention states and the dominance of each participant, allowing the robot head Furhat to maintain a well-coordinated, balanced, and engaging conversation that attempts to maximize the agreement and the contribution to solving the task. This project sets the first steps to explore the potential of using multimodal dialogue systems to build interactive robots that can serve in educational, team building, and collaborative task solving applications.
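
As a rough illustration of the incremental dominance measurement mentioned in the abstract, the sketch below tracks each participant's share of voiced frames over time. The class and its inputs are hypothetical stand-ins for the project's actual attention and dominance models.

    from collections import defaultdict

    class VerbalDominanceTracker:
        """Incrementally track each participant's share of total speaking time.

        A toy proxy for verbal dominance: the fraction of all voiced frames
        contributed by each speaker so far.
        """

        def __init__(self):
            self.voiced_frames = defaultdict(int)
            self.total_voiced = 0

        def update(self, active_speakers):
            # active_speakers: ids of everyone speaking in the current frame.
            for speaker in active_speakers:
                self.voiced_frames[speaker] += 1
                self.total_voiced += 1

        def dominance(self):
            if self.total_voiced == 0:
                return {}
            return {s: n / self.total_voiced for s, n in self.voiced_frames.items()}

    # Usage: feed per-frame voice-activity results, e.g. from a microphone array.
    tracker = VerbalDominanceTracker()
    for frame in [["A"], ["A"], ["A", "B"], ["B"], []]:
        tracker.update(frame)
    print(tracker.dominance())  # {'A': 0.6, 'B': 0.4}
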
3.
  • Al Moubayed, Samer, et al. (author)
  • Multimodal Multiparty Social Interaction with the Furhat Head
  • 2012
  • Conference paper (peer-reviewed), abstract:
    • We will show in this demonstrator an advanced multimodal and multiparty spoken conversational system using Furhat, a robot head based on projected facial animation. Furhat is a human-like interface that utilizes facial animation for physical robot heads using back-projection. In the system, multimodality is enabled using speech and rich visual input signals such as multi-person real-time face tracking and microphone tracking. The demonstrator will showcase a system that is able to carry out social dialogue with multiple interlocutors simultaneously, with rich output signals such as eye and head coordination, lip-synchronized speech synthesis, and non-verbal facial gestures used to regulate fluent and expressive multiparty conversations.
4.
  • Al Moubayed, Samer, et al. (author)
  • Tutoring Robots: Multiparty Multimodal Social Dialogue With an Embodied Tutor
  • 2014
  • Conference paper (peer-reviewed), abstract:
    • This project explores a novel experimental setup towards building a spoken, multi-modally rich, and human-like multiparty tutoring agent. A setup is developed and a corpus is collected that targets the development of a dialogue system platform to explore verbal and nonverbal tutoring strategies in multiparty spoken interactions with embodied agents. The dialogue task is centered on two participants involved in a dialogue aiming to solve a card-ordering game. With the participants sits a tutor that helps the participants perform the task and organizes and balances their interaction. Different multimodal signals, captured and auto-synchronized by different audio-visual capture technologies, were coupled with manual annotations to build a situated model of the interaction based on the participants' personalities, their temporally-changing state of attention, their conversational engagement and verbal dominance, and the way these are correlated with the verbal and visual feedback, turn-management, and conversation regulatory actions generated by the tutor. At the end of this chapter we discuss the potential areas of research and development this work opens up and some of the challenges that lie in the road ahead.
5.
6.
7.
8.
  • Chollet, M., et al. (author)
  • Public Speaking Training with a Multimodal Interactive Virtual Audience Framework
  • 2015
  • In: ICMI '15 Proceedings of the 2015 ACM on International Conference on Multimodal Interaction. New York, NY, USA: ACM Digital Library, pp. 367-368.
  • Conference paper (peer-reviewed), abstract:
    • We have developed an interactive virtual audience platform for public speaking training. Users' public speaking behavior is automatically analyzed using multimodal sensors, and multimodal feedback is produced by virtual characters and generic visual widgets depending on the user's behavior. The flexibility of our system makes it possible to compare different interaction mediums (e.g. virtual reality vs. normal interaction), social situations (e.g. one-on-one meetings vs. large audiences), and trained behaviors (e.g. general public speaking performance vs. specific behaviors).
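
A minimal sketch of the behavior-to-feedback mapping this abstract describes: sensed speaker features are turned into cues that virtual characters or visual widgets could render. The feature names and thresholds below are invented for illustration and are not the framework's actual API.

    def audience_feedback(features):
        # features: dict with hypothetical keys:
        #   'speech_rate'  words per minute
        #   'gaze_ratio'   fraction of time looking toward the audience
        #   'loudness_db'  average speech level
        # Returns cues a virtual character or widget could render.
        cues = {}
        rate = features.get("speech_rate", 0)
        cues["pace_widget"] = ("slow down" if rate > 180
                               else "speed up" if rate < 110
                               else "ok")
        # An attentive virtual audience leans forward while the speaker keeps
        # eye contact; it looks away as implicit negative feedback.
        cues["audience_posture"] = ("lean forward"
                                    if features.get("gaze_ratio", 0.0) > 0.6
                                    else "look away")
        cues["volume_widget"] = ("louder"
                                 if features.get("loudness_db", 60.0) < 55.0
                                 else "ok")
        return cues

    print(audience_feedback({"speech_rate": 200, "gaze_ratio": 0.8, "loudness_db": 52}))
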
9.
  • Eyben, F., et al. (author)
  • Socially Aware Many-to-Machine Communication
  • 2012
  • Conference paper (peer-reviewed), abstract:
    • This report describes the output of the project P5: Socially Aware Many-to-Machine Communication (M2M) at the eNTERFACE’12 workshop. In this project, we designed and implemented a new front-end for handling multi-user interaction in a dialog system. We exploit the Microsoft Kinect device for capturing multimodal input and extract some features describing user and face positions. These data are then analyzed in real-time to robustly detect speech and determine both who is speaking and whether the speech is directed at the system or not. This new front-end is integrated into the SEMAINE (Sustained Emotionally colored Machine-human Interaction using Nonverbal Expression) system. Furthermore, a multimodal corpus has been created, capturing all of the system inputs in two different scenarios involving human-human and human-computer interaction.
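
The front-end decisions described in this abstract (who is speaking, and is the speech directed at the system?) could look roughly like the sketch below. The face and sound-source features are hypothetical stand-ins for the Kinect-derived features the project used.

    def attribute_speech(sound_angle_deg, faces, facing_tolerance_deg=15.0):
        # sound_angle_deg: direction of the detected sound source (e.g. from a
        #                  microphone array), 0 = straight ahead of the device.
        # faces: list of dicts with hypothetical keys
        #        {'id': str, 'angle_deg': float, 'yaw_deg': float},
        #        where 'angle_deg' is the face's bearing from the device and
        #        'yaw_deg' is head rotation (0 = facing the device).
        if not faces:
            return None, False
        # The speaker is the face whose bearing best matches the sound direction.
        speaker = min(faces, key=lambda f: abs(f["angle_deg"] - sound_angle_deg))
        # Speech counts as system-directed if the speaker roughly faces the device.
        addressed = abs(speaker["yaw_deg"]) < facing_tolerance_deg
        return speaker["id"], addressed

    faces = [{"id": "user1", "angle_deg": -20.0, "yaw_deg": 5.0},
             {"id": "user2", "angle_deg": 25.0, "yaw_deg": 40.0}]
    print(attribute_speech(-18.0, faces))  # ('user1', True)
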
10.
  • Koutsombogera, Maria, et al. (author)
  • The Tutorbot Corpus - A Corpus for Studying Tutoring Behaviour in Multiparty Face-to-Face Spoken Dialogue
  • 2014
  • Conference paper (peer-reviewed), abstract:
    • This paper describes a novel experimental setup exploiting state-of-the-art capture equipment to collect a multimodally rich, game-solving, collaborative multiparty dialogue corpus. The corpus is targeted and designed towards the development of a dialogue system platform to explore verbal and nonverbal tutoring strategies in multiparty spoken interactions. The dialogue task is centered on two participants involved in a dialogue aiming to solve a card-ordering game. The participants were paired into teams based on their degree of extraversion, as determined by a personality test. With the participants sits a tutor that helps them perform the task and organizes and balances their interaction; the tutor's behavior was assessed by the participants after each interaction. Different multimodal signals, captured and auto-synchronized by different audio-visual capture technologies, together with manual annotations of the tutor's behavior, constitute the Tutorbot corpus. This corpus is exploited to build a situated model of the interaction based on the participants' temporally-changing state of attention, their conversational engagement and verbal dominance, and their correlation with the verbal and visual feedback and conversation regulatory actions generated by the tutor.
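
To illustrate the temporally-changing attention state used in the situated model, here is a toy sketch that maps head yaw to the nearest attention target and smooths the estimate with a majority vote. The target bearings and window size are invented for illustration, not taken from the corpus annotations.

    from collections import Counter, deque

    # Hypothetical target bearings (degrees) as seen from one participant's seat.
    TARGETS = {"partner": -30.0, "tutor": 0.0, "task_cards": 45.0}

    def attention_target(head_yaw_deg):
        # Classify the nearest attention target from head yaw.
        return min(TARGETS, key=lambda t: abs(TARGETS[t] - head_yaw_deg))

    class SmoothedAttention:
        # Majority-vote smoothing over the last `window` frames.
        def __init__(self, window=15):
            self.history = deque(maxlen=window)

        def update(self, head_yaw_deg):
            self.history.append(attention_target(head_yaw_deg))
            return Counter(self.history).most_common(1)[0][0]

    tracker = SmoothedAttention(window=5)
    for yaw in [-28.0, -31.0, 2.0, -29.0, -33.0]:
        state = tracker.update(yaw)
    print(state)  # 'partner'
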