MRI-based vocal tract representations for the three-dimensional finite element synthesis of diphthongs

Arnela, Marc (author)

MRI-based vocal tract representations for the three-dimensional finite element synthesis of diphthongs

Article/chapter, English, 2019

Publisher, publication year, extent ...

IEEE Press, 2019
electronic (rdacarrier)

Numbers

LIBRIS-ID: oai:DiVA.org:kth-259580
URI: https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-259580
DOI: https://doi.org/10.1109/TASLP.2019.2942439

Supplementary language notes

Language: English
Summary in: English

Part of subdatabase

SwePub

Classification

Subject category: ref (swepub-contenttype)
Subject category: art (swepub-publicationtype)

Notes

QC 20211129
The synthesis of diphthongs in three dimensions (3D) involves the simulation of acoustic waves propagating through a complex 3D vocal tract geometry that deforms over time. Accurate 3D vocal tract geometries can be extracted from Magnetic Resonance Imaging (MRI), but because of long acquisition times, only static sounds can currently be studied with adequate spatial resolution. In this work, 3D dynamic vocal tract representations are built to generate diphthongs, based on a set of cross-sections extracted from MRI-based vocal tract geometries of static vowel sounds. A diphthong can then be generated by interpolating the location, orientation and shape of these cross-sections, which avoids interpolating full 3D geometries. Two options are explored to extract the cross-sections. The first is based on an adaptive grid (AG), which extracts the cross-sections perpendicular to the vocal tract midline, whereas the second resorts to a semi-polar grid (SPG) strategy, which fixes the cross-section orientations. The finite element method (FEM) is used to solve the mixed wave equation and synthesize the diphthongs [ɑi] and [ɑu] in the dynamic 3D vocal tracts. The outputs from a 1D acoustic model based on the Transfer Matrix Method are also included for comparison. The results show that the SPG and AG provide very close solutions in 3D, whereas significant differences are observed when using them in 1D. The SPG dynamic vocal tract representation is recommended for 3D simulations because it helps to prevent the collision of adjacent cross-sections.
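The cross-section interpolation described in the summary can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the article: the interpolation is taken to be linear in time, each cross-section is reduced to a centre point, a single orientation angle and a list of boundary radii, and the names `CrossSection`, `lerp`, `interpolate_section` and `interpolate_tract` as well as all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CrossSection:
    """Hypothetical, simplified cross-section (not the article's data model)."""
    center: Tuple[float, float, float]  # location of the section
    angle: float                        # orientation as a single angle (rad)
    radii: List[float]                  # shape: boundary radius at fixed polar angles


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a (t = 0) and b (t = 1)."""
    return (1.0 - t) * a + t * b


def interpolate_section(s0: CrossSection, s1: CrossSection, t: float) -> CrossSection:
    """Interpolate location, orientation and shape of one cross-section."""
    center = tuple(lerp(a, b, t) for a, b in zip(s0.center, s1.center))
    angle = lerp(s0.angle, s1.angle, t)
    radii = [lerp(a, b, t) for a, b in zip(s0.radii, s1.radii)]
    return CrossSection(center, angle, radii)


def interpolate_tract(v0: List[CrossSection], v1: List[CrossSection],
                      t: float) -> List[CrossSection]:
    """Interpolate section by section; both vowels need the same section count."""
    if len(v0) != len(v1):
        raise ValueError("both vowels must have the same number of cross-sections")
    return [interpolate_section(a, b, t) for a, b in zip(v0, v1)]


# Illustrative values only; they do not come from MRI data.
vowel_a = [CrossSection((0.0, 0.0, 0.0), 0.0, [1.0, 1.0]),
           CrossSection((0.0, 2.0, 0.0), 0.5, [2.0, 1.0])]
vowel_i = [CrossSection((0.0, 0.0, 0.0), 0.0, [1.0, 2.0]),
           CrossSection((2.0, 2.0, 0.0), 1.5, [1.0, 0.5])]

# Five frames moving from the first vowel (t = 0) to the second (t = 1).
frames = [interpolate_tract(vowel_a, vowel_i, k / 4) for k in range(5)]
```

Because only a small set of per-section parameters is interpolated, each intermediate frame stays a list of cross-sections from which a tract geometry could be rebuilt. With a fixed-orientation grid such as the SPG, the `angle` field would simply be identical in both vowels.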

Subject headings and genre

Added entries (persons, corporate bodies, meetings, titles ...)

Dabbaghchian, Saeed; KTH, Tal, musik och hörsel, TMH; (Swepub:kth)u1lksn0x (author)
Guasch, Oriol (author)
Engwall, Olov; KTH, Tal-kommunikation; (Swepub:kth)u1niocbs (author)
KTH; Tal, musik och hörsel, TMH (creator_code:org_t)

Related titles

In: IEEE Transactions on Audio, Speech, and Language Processing: IEEE Press, 27:12, pp. 2173-2182, ISSN 1558-7916, 1558-7924
