SwePub
Search results for "hsv:(NATURVETENSKAP) hsv:(Data och informationsvetenskap) hsv:(Medieteknik)"

1.
  • Elowsson, Anders (author)
  • Modeling Music: Studies of Music Transcription, Music Perception and Music Production
  • 2018
  • Doctoral thesis (other academic). Abstract:
    • This thesis presents ten studies within three important subareas of the research field "Music Information Retrieval" (MIR), a field focused on extracting information from music. Part A addresses music transcription, Part B music perception, and Part C music production. A concluding part (Part D) discusses the machine learning methodology and looks ahead.
      Part A presents systems that can transcribe music with respect to rhythm and polyphonic pitch. The first two publications describe methods for estimating tempo and the positions of beats in music audio. A method for computing the most salient periodicity (the "cepstroid") is described, along with how it can be used to guide the applied machine learning systems. The system for polyphonic pitch estimation can identify both sounding pitches and note onsets and offsets. This system is both pitch-invariant and invariant with respect to variations over time within sounding notes. The transcription systems are trained to predict several musical aspects in a hierarchical structure. The transcription results are the best reported in tests on several different datasets.
      Part B focuses on perceptual features in music. These can be predicted to model fundamental aspects of perception, but they can also be used as representations in models that try to classify overall music parameters. Models are presented that can predict the perceived speed and the perceived performed dynamics with high accuracy. Averaged ratings from around 20 listeners serve as target values during training and evaluation.
      Part C explores aspects related to music production. The first study analyzes variations in the mean spectrum between popular music tracks. The analysis shows that the level of percussive instruments is an important factor for the spectral distribution; the data suggest that this level is a better predictor of the spectrum than genre classifications. The second study in Part C deals with music composition. An algorithmic composition program is presented in which relevant musical parameters are combined in a hierarchical structure. A listening test is conducted to demonstrate the validity of the program and to examine the effect of certain parameters.
      The thesis concludes with Part D, which places the developed machine learning techniques in a broader context and proposes new methods for generalizing rhythm prediction. The first study discusses deep learning systems that predict different musical aspects in a hierarchical structure. Relevant concepts are presented together with suggestions for future implementations. The second study proposes a tempo-invariant method for processing the log-frequency domain of rhythm signals with convolutional neural networks. The proposed architecture can use magnitude, relative phase between rhythm channels, and the original phase from the frequency transform to tackle several important rhythm-related problems.
2.
  • Jönsson, Daniel (author)
  • Enhancing Salient Features in Volumetric Data Using Illumination and Transfer Functions
  • 2016
  • Doctoral thesis (other academic). Abstract:
    • The visualization of volume data is a fundamental component in the medical domain. Volume data is used in the clinical workflow to diagnose patients and is therefore of utmost importance. The amount of data is rapidly increasing as sensors, such as computed tomography scanners, become capable of measuring more details and gathering more data over time. Unfortunately, the increasing amount of data makes it computationally challenging to interactively apply high-quality methods to increase shape and depth perception. Furthermore, methods for exploring volume data have mostly been designed for experts, which prohibits novice users from exploring volume data. This thesis aims to address these challenges by introducing efficient methods for enhancing salient features through high-quality illumination as well as methods for intuitive volume data exploration.
      Humans interpret the world around them by observing how light interacts with objects. Shadows enable us to better determine distances, while shifts in color enable us to better distinguish objects and identify their shape. These concepts are also applicable to computer-generated content. The perception in volume data visualization can therefore be improved by simulating real-world light interaction. This thesis presents efficient methods that are capable of interactively simulating realistic light propagation in volume data. In particular, this work shows how a multi-resolution grid can be used to encode the attenuation of light from all directions using spherical harmonics and thereby enable advanced interactive dynamic light configurations. Two methods are also presented that allow photon mapping calculations to be focused on visually changing areas. The results demonstrate that photon mapping can be used in interactive volume visualization for both static and time-varying volume data.
      Efficient and intuitive exploration of volume data requires methods that are easy to use and reflect the objects that were measured. A value that has been collected by a sensor commonly represents the material existing within a small neighborhood around a location. Recreating the original materials is difficult since the value represents a mixture of them. This is referred to as the partial-volume problem. A method is presented that derives knowledge from the user in order to reconstruct the original materials in a way that is more in line with what the user would expect. Sharp boundaries are visualized where the certainty is high, while uncertain areas are visualized with fuzzy boundaries. The volume exploration process of mapping data values to optical properties through the transfer function has traditionally been complex and performed by expert users. A study at a science center showed that visitors favor the presented dynamic gallery method over the most commonly used transfer function editor.
3.
  • Koniaris, Christos, 1979- (author)
  • Perceptually motivated speech recognition and mispronunciation detection
  • 2012
  • In: European Union FP6-034362 research project ACORNS. - Stockholm : KTH Royal Institute of Technology. - 978-91-7501-468-5
  • Doctoral thesis (other academic). Abstract:
    • This doctoral thesis is the result of a research effort performed in two fields of speech technology, i.e., speech recognition and mispronunciation detection. Although the two areas are clearly distinguishable, the proposed approaches share a common hypothesis based on psychoacoustic processing of speech signals. The conjecture implies that the human auditory periphery provides a relatively good separation of different sound classes. Hence, it is possible to use recent findings from psychoacoustic perception together with mathematical and computational tools to model the auditory sensitivities to small speech signal changes.
      The performance of an automatic speech recognition system strongly depends on the representation used for the front-end. If the extracted features do not include all relevant information, the performance of the classification stage is inherently suboptimal. The work described in Papers A, B and C is motivated by the fact that humans perform better at speech recognition than machines, particularly in noisy environments. The goal is to make use of knowledge of human perception in the selection and optimization of speech features for speech recognition. These papers show that maximizing the similarity of the Euclidean geometry of the features to the geometry of the perceptual domain is a powerful tool to select or optimize features. Experiments with a practical speech recognizer confirm the validity of the principle. An approach to improving mel frequency cepstrum coefficients (MFCCs) through offline optimization is also shown. The method has three advantages: i) it is computationally inexpensive, ii) it does not use the auditory model directly, thus avoiding its computational cost, and iii) importantly, it provides better recognition performance than traditional MFCCs for both clean and noisy conditions.
      The second task concerns automatic pronunciation error detection. The research, described in Papers D, E and F, is motivated by the observation that almost all native speakers perceive, relatively easily, the acoustic characteristics of their own language when it is produced by speakers of the language. Small variations within a phoneme category, sometimes different for various phonemes, do not significantly change the perception of the language’s own sounds. Several methods are introduced, based on similarity measures of the Euclidean space spanned by the acoustic representations of the speech signal and the Euclidean space spanned by an auditory model output, to identify the problematic phonemes for a given speaker. The methods are tested for groups of speakers from different languages and evaluated against a theoretical linguistic study, showing that they can capture many of the problematic phonemes that speakers from each language mispronounce. Finally, a listening test on the same dataset verifies the validity of these methods.
4.
  • Pilarczyk, Kacper, et al. (author)
  • Molecules, semiconductors, light and information: Towards future sensing and computing paradigms
  • 2018
  • In: Reservoir Computing with Real-time Data for future IT (RECORD-IT).
  • Journal article (peer-reviewed). Abstract:
    • Over the last few years we have witnessed great progress in research devoted to unconventional computing, an unorthodox approach to information handling. It includes both novel algorithms and computing paradigms as well as completely new elements of circuitry: whole organisms (e.g., Physarum species), DNA, enzymes, various biomolecules, and molecular and nanoparticulate materials. One of the biggest challenges in this field is the realisation of in-materio computing, i.e., the utilisation of the properties of pristine materials, instead of high-tech structures, for advanced information processing. In this review we present recent achievements in the design of logic devices (binary, ternary and fuzzy) implemented in molecular and nanoscale components, photoelectrochemical chemosensing, photoactive memristive devices and reservoir computing systems. A common denominator for all these devices is the involvement of molecular species, semiconducting nanoparticles and light in information processing.
5.
  •  
6.
  •  
7.
  • Bresin, Roberto, et al. (author)
  • Looking for the soundscape of the future: preliminary results applying the design fiction method
  • 2020
  • In: Sound and Music Computing Conference 2020.
  • Conference paper (peer-reviewed). Abstract:
    • The work presented in this paper is a preliminary study in a larger project that aims to design the sound of the future through our understanding of the soundscapes of the present, and through methods of documentary filmmaking, sound computing and HCI. This work is part of a project that will complement and run parallel to Erik Gandini’s research project “The Future through the Present”, which explores how a documentary narrative can create a projection into the future, and develop a cinematic documentary aesthetics that releases documentary film from the constraints of dealing with the present or the past. The point of departure is our relationship to labour at a time when Robotics, VR/AR and AI applied to Big Data outweigh and augment our physical and cognitive capabilities, with automation expected to replace humans on a large scale within most professional fields. From an existential perspective, this poses the question of what we will do when we don’t have to work, and challenges us to formulate a new idea of work beyond its historical role. If the concept of work ethics changes, how would that redefine soundscapes? Will new sounds develop? Will sounds from the past resurface? In this paper we try to tackle these questions by first applying the Design Fiction method. In a workshop, twenty-three participants predicted both positive and negative future scenarios, including both lo-fi and hi-fi soundscapes, in which people will be able to control and personalize soundscapes. Results are presented, summarized and discussed.
8.
  •  
9.
  • Han, Xu, et al. (author)
  • Performance of piano trills: effects of hands, fingers, notes and emotions
  • 2019
  • In: Combined proceedings of the Nordic Sound and Music Computing Conference 2019 and the Interactive Sonification Workshop 2019. - Stockholm. ; pp. 9-15
  • Conference paper (peer-reviewed). Abstract:
    • A trill is a type of musical ornament. In automatic playback of piano music scores, trills are usually synthesised as a sequence of repeated notes with equal duration and dynamic level. This is not how trills are performed by pianists. In this study, trills were performed by three pianists on a Yamaha Disklavier and recorded as both audio and MIDI files. Note duration, inter-onset interval (IOI) and key velocity for each note were then extracted from the MIDI files and analyzed in relation to hands, notes and emotions. Four significant effects were found: 1) hand effect: trills on the right hand were on average performed at a faster rate, with shorter note duration, longer off duration and faster key velocity; 2) finger effect: within the two notes forming a trill, notes with the lower fingering number were performed with shorter off duration, while note duration and key velocity stayed close; 3) emotion effect: emotion mainly contributed to dynamic level; 4) crescendo effect: when a crescendo occurred, note duration and off duration compensated for each other and kept the IOI at an almost constant value.
10.
  • Hansen, Kjetil Falkenberg, Docent, 1972-, et al. (author)
  • Student involvement in sound and music computing research: Current practices at KTH and KMH
  • 2019
  • In: Combined proceedings of the Nordic Sound and Music Computing Conference 2019 and the Interactive Sonification Workshop 2019. - Stockholm. ; pp. 36-42
  • Conference paper (peer-reviewed). Abstract:
    • Engaging students in and beyond course activities has been a working practice for many years both at the KTH Sound and Music Computing group and at the KMH Royal College of Music. This paper collects experiences of involving students in research conducted within the two institutions.
      We describe how students attending our courses are given the possibility to be involved in our research activities, and we argue that their involvement both contributes to developing new research and benefits the students in the short and long term. Among the assignments, activities, and tasks we offer in our education programs are pilot experiments, prototype development, public exhibitions, performing, composing, data collection, analysis challenges, and bachelor and master thesis projects that lead to academic publications.
Access
free online (464)
Publication type
conference paper (721)
journal article (500)
book chapter (96)
report (50)
other publication (48)
doctoral thesis (39)
licentiate thesis (26)
proceedings (editor) (12)
edited collection (editor) (10)
book (7)
research review (4)
review (4)
patent (3)
artistic work (1)
Content type
peer-reviewed (1189)
other academic (1029)
popular science, debate, etc. (79)
Author/editor
Vasilakos, Athanasio ... (216)
Parnes, Peter, (97)
Åhlund, Christer, (87)
Synnes, Kåre, (84)
Zaslavsky, Arkady (83)
Andersson, Karl, (79)
Hallberg, Josef, (45)
Holmquist, Lars Erik ... (38)
Mitra, Karan, (38)
Hernwall, Patrik, (34)
Schmidt, Mischa, (30)
Brännström, Robert, (28)
Johansson, Dan, (27)
Bresin, Roberto, 196 ... (26)
Andersson, Karl, 197 ... (25)
Saguna, Saguna, (25)
Kaipainen, Mauri (24)
Wallin, Stefan (22)
Pargman, Daniel, (21)
Ranjan, Rajiv, (20)
Milrad, Marcelo, (20)
Schelén, Olov, (20)
Mejtoft, Thomas, 197 ... (19)
Hüttenrauch, Helge, (19)
Jayaraman, Prem Prak ... (18)
Granlund, Daniel, (17)
Kranz, Matthias, (17)
Holzapfel, André, 19 ... (16)
Lankoski, Petri (16)
Vasilakos, Athanasio ... (16)
Nugent, Chris, (15)
Scholl, Jeremiah (14)
Boytsov, Andrey (14)
Elkotob, Muslim, (13)
Wennberg, Paula, (13)
Räsänen, Minna, (13)
Nilsson, Marcus, (12)
Diewald, Stefan, (12)
Delsing, Jerker, (11)
Hedin, Björn, 1970-, (11)
Zapico, Jorge Luis, (11)
Roalter, Luis, (11)
Sjöström, Mårten, 19 ... (10)
Imran, Muhammad Al, (10)
Elblaus, Ludvig, 198 ... (10)
Hossain, Mohammad Sh ... (10)
Kumar, Neeraj, (10)
Phanse, Kaustubh, (10)
Bodin, Ulf (10)
Schülke, Anett, (10)
Institution
Luleå tekniska universitet (853)
Södertörns högskola (218)
Kungliga Tekniska Högskolan (146)
Linnéuniversitetet (71)
Linköpings universitet (50)
Chalmers tekniska högskola (44)
Umeå universitet (27)
Uppsala universitet (25)
Blekinge Tekniska Högskola (15)
Mittuniversitetet (11)
Karlstads universitet (11)
Örebro universitet (9)
Stockholms universitet (8)
Högskolan i Skövde (7)
Lunds universitet (6)
Högskolan i Halmstad (5)
Högskolan i Jönköping (4)
Göteborgs universitet (3)
Sveriges Lantbruksuniversitet (3)
Högskolan Kristianstad (1)
Högskolan Väst (1)
Mälardalens högskola (1)
Konstfack (1)
Kungl. Musikhögskolan (1)
Högskolan i Borås (1)
Högskolan Dalarna (1)
Language
English (1451)
Swedish (67)
Norwegian (1)
Research subject (UKÄ/SCB)
Natural sciences (1520)
Engineering and technology (158)
Social sciences (152)
Humanities (54)
Medical and health sciences (15)
Agricultural sciences (1)
