SwePub

Result list for search "L773:2329 9290"


  • Results 1-20 of 20
1.
  • Ahrens, Jens, 1978, et al. (authors)
  • Computation of Spherical Harmonic Representations of Source Directivity Based on the Finite-Distance Signature
  • 2021
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - 2329-9290 .- 2329-9304. ; 29, pp. 83-92
  • Journal article (peer-reviewed)
    • The measurement of directivity for sound sources that are not electroacoustic transducers is fundamentally limited because the source cannot be driven with arbitrary signals. A consequence is that directivity can only be measured at a sparse set of frequencies—for example, at the stable partial oscillations of a steady tone played by a musical instrument or from the human voice. This limitation prevents the data from being used in certain applications such as time-domain room acoustic simulations where the directivity needs to be available at all frequencies in the frequency range of interest. We demonstrate in this article that imposing the signature of the directivity that is obtained at a given distance on a spherical wave allows for all interpolation that is required for obtaining a complete spherical harmonic representation of the source’s directivity, i.e., a representation that is viable at any frequency, in any direction, and at any distance. Our approach is inspired by the far-field signature of exterior sound fields. It is not capable of incorporating the phase of the directivity directly. We argue based on directivity measurement data of musical instruments that the phase of such measurement data is too unreliable or too ambiguous to be useful. We incorporate numerically-derived directivity into the example application of finite difference time domain simulation of the acoustic field, which has not been possible previously.
2.
  • Ahrens, Jens, 1978, et al. (authors)
  • Spherical Harmonic Decomposition of a Sound Field Using Microphones on a Circumferential Contour Around a Non-Spherical Baffle
  • 2022
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - 2329-9290 .- 2329-9304. ; 30, pp. 3110-3119
  • Journal article (peer-reviewed)
    • Spherical harmonic (SH) representations of sound fields are usually obtained from microphone arrays with rigid spherical baffles whereby the microphones are distributed over the entire surface of the baffle. We present a method that overcomes the requirement for the baffle to be spherical. Furthermore, the microphones can be placed along a circumferential contour around the baffle. This greatly reduces the required number of microphones for a given spatial resolution compared to conventional spherical arrays. Our method is based on the analytical solution for SH decomposition based on observations along the equator of a rigid sphere that we presented recently. It incorporates a calibration stage in which the microphone signals due to a suitable set of calibration sound fields are projected onto the SH decomposition of those same sound fields on the surface of a notional rigid sphere by means of a linear filtering operation. The filter coefficients are computed from the calibration data via a least-squares fit. We present an evaluation of the method based on the application of binaural rendering of the SH decomposition of the signals from an 18-element array that uses a human head as the baffle and that provides 8th ambisonic order. We analyse the accuracy and robustness of our method based on simulated data as well as based on measured data from a prototype.
3.
  • Deppisch, Thomas, 1993, et al. (authors)
  • Direct and Residual Subspace Decomposition of Spatial Room Impulse Responses
  • 2023
  • In: IEEE/ACM Transactions on Audio Speech and Language Processing. - 2329-9290 .- 2329-9304. ; 31, pp. 927-942
  • Journal article (peer-reviewed)
    • Psychoacoustic experiments have shown that directional properties of the direct sound, salient reflections, and the late reverberation of an acoustic room response can have a distinct influence on the auditory perception of a given room. Spatial room impulse responses (SRIRs) capture those properties and thus are used for direction-dependent room acoustic analysis and virtual acoustic rendering. This work proposes a subspace method that decomposes SRIRs into a direct part, which comprises the direct sound and the salient reflections, and a residual, to facilitate enhanced analysis and rendering methods by providing individual access to these components. The proposed method is based on the generalized singular value decomposition and interprets the residual as noise that is to be separated from the other components of the reverberation. Large generalized singular values are attributed to the direct part, which is then obtained as a low-rank approximation of the SRIR. By advancing from the end of the SRIR toward the beginning while iteratively updating the residual estimate, the method adapts to spatio-temporal variations of the residual. The method is evaluated using a spatio-spectral error measure and simulated SRIRs of different rooms, microphone arrays, and ratios of direct sound to residual energy. The proposed method creates lower errors than existing approaches in all tested scenarios, including a scenario with two simultaneous reflections. A case study with measured SRIRs shows the applicability of the method under real-world acoustic conditions. A reference implementation is provided.
4.
  • Gunnarsson, Viktor, et al. (authors)
  • Binaural Auralization of Microphone Array Room Impulse Responses Using Causal Wiener Filtering
  • 2021
  • In: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING. - : Institute of Electrical and Electronics Engineers (IEEE). - 2329-9290 .- 2329-9304. ; 29, pp. 2899-2914
  • Journal article (peer-reviewed)
    • Binaural room auralization involves Binaural Room Impulse Responses (BRIRs). Dynamic binaural synthesis (i.e., head-tracked presentation) requires BRIRs for multiple head poses. Artificial heads can be used to measure BRIRs, but BRIR modeling from microphone array room impulse responses (RIRs) is becoming popular since personalized BRIRs can be obtained for any head pose with low extra effort. We present a novel framework for estimating a binaural signal from microphone array signals, using causal Wiener filtering and polynomial matrix formalism. The formulation places no explicit constraints on the geometry of the microphone array and enables directional weighting of the estimation error. A microphone noise model is used for regularization and to balance filter performance and noise gain. A complete procedure for BRIR modeling from microphone array RIRs is also presented, employing the proposed Wiener filtering framework. An application example illustrates the modeling procedure using a 19-channel spherical microphone array. Direct and reflected sound segments are modeled separately. The modeled BRIRs are compared to measured BRIRs and are shown to be waveform-accurate up to at least 1.5 kHz. At higher frequencies, correct statistical properties of diffuse sound field components are aimed for. A listening test indicates small perceptual differences to measured BRIRs. The presented method facilitates fast BRIR data set acquisition for use in dynamic binaural synthesis and is a viable alternative to Ambisonics-based binaural room auralization.
5.
  • Gunnarsson, Viktor (author)
  • Spectral Correction of Audio Objects in Stereophonic Rendering
  • 2024
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - : Institute of Electrical and Electronics Engineers (IEEE). - 2329-9290 .- 2329-9304. ; 32, pp. 3141-3156
  • Journal article (peer-reviewed)
    • This paper presents a comprehensive model for ear-signal level coloration in stereo amplitude panning, enabling the calculation of monaural correction filters that equalize the average coloration over a small area around the sweet spot. The model takes into account the speaker setup geometry, listener Head-Related Transfer-Functions (HRTFs), the employed pan-law, the direct-to-reflected sound ratio, and the correlation between the speaker signals at the listening position. Coloration in diffuse sound reproduction is also investigated. The coloration model is validated using binaural room impulse response measurements, and the correction filters are found to effectively reduce the difference in composite ear power spectrum between a discrete and virtual center source. A listening test on the perceived spectral difference between these two cases, with stereo setups in front of and behind the listener, indicates that the correction filter improves timbral similarity between a virtual and discrete center source for rear speaker panning. The test also indicates that remaining unmodeled coloration sources are large, especially for front panning. However, a second listening test finds that the correction filter improves accuracy of perceived direction in front panning by mitigating the phantom image elevation effect.
6.
  • Helmholz, Hannes, 1990, et al. (authors)
  • Effects of Additive Noise in Binaural Rendering of Spherical Microphone Array Signals
  • 2021
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - 2329-9290 .- 2329-9304. ; 29, pp. 3642-3653
  • Journal article (peer-reviewed)
    • Additive noise produced by the recording hardware will contribute to streamed signals from spherical microphone arrays under practical conditions. For the application of binaural reproduction and under the assumption that the noise is uncorrelated between the array channels, the spectral properties and the overall level of the rendered noise in the ear signals have been shown to be strongly influenced by the configuration of the array as well as of the processing pipeline. In a previous investigation, we determined the audibility thresholds for changes in the rendered noise due to listener head rotations as a function of the differences in noise level of individual array channels. In this article, we calibrate the instrumental metric of Composite Loudness Level to the perceptual data and predict audibility of changes in the additive noise due to head rotations for a broad set of array configurations and distributions of the noise levels across array channels. We demonstrate that some types of microphone layouts can produce audible variations even if the noise level is equal in all channels. This is particularly the case for sampling grids that exhibit negative quadrature weights such as the Lebedev and Fliege-Maier grids for some spherical harmonic orders. The analysis of configurations with unevenly distributed noise contributions shows that the influence of the noise from individual array channels is determined by the proximity of their virtual location to the relative trajectory of the ears.
7.
  • Krebs, Florian, et al. (authors)
  • Inferring metrical structure in music using particle filters
  • 2015
  • In: IEEE Transactions on Audio, Speech and Language Processing. - : IEEE Press. - 2329-9290 .- 2329-9304. ; 23:5, pp. 817-827
  • Journal article (peer-reviewed)
    • In this paper, we propose a new state-of-the-art particle filter (PF) system to infer the metrical structure of musical audio signals. The new inference method is designed to overcome the problem of PFs in multi-modal probability distributions, which arise due to tempo and phase ambiguities in musical rhythm representations. We compare the new method with a hidden Markov model (HMM) system and several other PF schemes in terms of performance, speed and scalability on several audio datasets. We demonstrate that using the proposed system the computational complexity can be reduced drastically in comparison to the HMM while maintaining the same order of beat tracking accuracy. Therefore, for the first time, the proposed system allows fast meter inference in a high-dimensional state space, spanned by the three components of tempo, type of rhythm, and position in a metric cycle.
8.
  • Adalbjörnsson, Stefan Ingi, et al. (authors)
  • Sparse Localization of Harmonic Audio Sources
  • 2016
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - 2329-9290. ; 24:1, pp. 117-129
  • Journal article (peer-reviewed)
    • In this paper, we propose a novel method for estimating the locations of near- and/or far-field harmonic audio sources impinging on an arbitrary, but calibrated, sensor array. Using a joint pitch and location estimation formed in two steps, we first estimate the fundamental frequencies and complex amplitudes under a sinusoidal model assumption, whereafter the location of each source is found by utilizing both the difference in phase and the relative attenuation of the magnitude estimates. As audio recordings often consist of multi-pitch signals exhibiting some degree of reverberation, where both the number of pitches and the source locations are unknown, we propose to use sparse heuristics to avoid the necessity of detailed a priori assumptions on the spectral and spatial model orders. The method’s performance is evaluated using both simulated and measured audio data, with the former showing that the proposed method achieves near-optimal performance, whereas the latter confirms the method’s feasibility when used with real recordings.
9.
  • Chettri, Bhusan, et al. (authors)
  • Dataset Artefacts in Anti-Spoofing Systems : A Case Study on the ASVspoof 2017 Benchmark
  • 2020
  • In: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING. - : Institute of Electrical and Electronics Engineers (IEEE). - 2329-9290. ; 28, pp. 3018-3028
  • Journal article (peer-reviewed)
    • The Automatic Speaker Verification Spoofing and Countermeasures Challenges motivate research in protecting speech biometric systems against a variety of different access attacks. The 2017 edition focused on replay spoofing attacks, and involved participants building and training systems on a provided dataset (ASVspoof 2017). More than 60 research papers have so far been published with this dataset, but none have sought to answer why countermeasures appear successful in detecting spoofing attacks. This article shows how artefacts inherent to the dataset may be contributing to the apparent success of published systems. We first inspect the ASVspoof 2017 dataset and summarize various artefacts present in the dataset. Second, we demonstrate how countermeasure models can exploit these artefacts to appear successful in this dataset. Third, for reliable and robust performance estimates on this dataset we propose discarding nonspeech segments and silence before and after the speech utterance during training and inference. We create speech start and endpoint annotations in the dataset and demonstrate how using them helps countermeasure models become less vulnerable to manipulation through artefacts found in the dataset. Finally, we provide several new benchmark results for both frame-level and utterance-level models that can serve as new baselines on this dataset.
10.
  • Dabbaghchian, Saeed, et al. (authors)
  • Synthesis of vowels and vowel-vowel utterances using a 3D biomechanical-acoustic model
  • 2018
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - 2329-9290.
  • Journal article (peer-reviewed)
    • A link is established between a 3D biomechanical and acoustic model allowing for the numerical synthesis of vowel sounds by contraction of the relevant muscles. That is, the contraction of muscles in the biomechanical model displaces and deforms the articulators, which in turn deform the vocal tract shape. The mixed wave equation for the acoustic pressure and particle velocity is formulated in an arbitrary Lagrangian-Eulerian framework to account for moving boundaries. The equations are solved numerically using the finite element method. Since the activation of muscles is not fully known for a given vowel sound, an inverse method is employed to calculate a plausible activation pattern. For vowel-vowel utterances, two different approaches are utilized: linear interpolation in either muscle activation or geometrical space. Although the former is the natural choice for biomechanical modeling, the latter is used to investigate the contribution of biomechanical modeling on speech acoustics. Six vowels [ɑ, ə, ɛ, e, i, ɯ] and three vowel-vowel utterances [ɑi, ɑɯ, ɯi] are synthesized using the 3D model. Results, including articulation, formants, and spectrogram of vowel-vowel sounds, are in agreement with previous studies. Comparing the spectrogram of interpolation in muscle and geometrical space reveals differences in all frequencies, with the most extended difference in the second formant transition.
11.
  • Elvander, Filip, et al. (authors)
  • Online Estimation of Multiple Harmonic Signals
  • 2017
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - 2329-9290. ; 25:2, pp. 273-284
  • Journal article (peer-reviewed)
    • In this paper, we propose a time-recursive multipitch estimation algorithm using a sparse reconstruction framework, assuming that only a few pitches from a large set of candidates are active at each time instant. The proposed algorithm does not require any training data, and instead utilizes a sparse recursive least-squares formulation augmented by an adaptive penalty term specifically designed to enforce a pitch structure on the solution. The amplitudes of the active pitches are also recursively updated, allowing for a smooth and more accurate representation. When evaluated on a set of ten music pieces, the proposed method is shown to outperform other general purpose multipitch estimators in either accuracy or computational speed, although not being able to yield performance as good as the state-of-the-art methods, which are being optimally tuned and specifically trained on the present instruments. However, the method is able to outperform such a technique when used without optimal tuning, or when applied to instruments not included in the training data.
12.
  • Leijon, Arne, et al. (authors)
  • Bayesian Analysis of Phoneme Confusion Matrices
  • 2016
  • In: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING. - : IEEE. - 2329-9290. ; 24:3
  • Journal article (peer-reviewed)
    • This paper presents a parametric Bayesian approach to the statistical analysis of phoneme confusion matrices measured for groups of individual listeners in one or more test conditions. Two different bias problems in conventional estimation of mutual information are analyzed and explained theoretically. Evaluations with synthetic datasets indicate that the proposed Bayesian method can give satisfactory estimates of mutual information and response probabilities, even for phoneme confusion tests using a very small number of test items for each phoneme category. The proposed method can reveal overall differences in performance between two test conditions with better power than conventional Wilcoxon significance tests or conventional confidence intervals. The method can also identify sets of confusion-matrix cells that are credibly different between two test conditions, with better power than a similar approximate frequentist method.
13.
  • Nordholm, Sven, et al. (authors)
  • Performance Limits in Subband Beamforming
  • 2003
  • In: IEEE transactions on speech and audio processing. - : IEEE. - 1063-6676 .- 1558-2353 .- 2329-9290. ; 11:3, pp. 193-203
  • Journal article (peer-reviewed)
    • This paper analyzes subband beamforming schemes mainly aimed at speech enhancement and acoustic echo suppression applications such as hands-free telephony for both mobile and office environments, internet telephony and video conferencing. Analytical descriptions of both causal finite-length and noncausal infinite-length subband microphone array structures are given. More specifically, this paper compares finite Wiener filter performance with the noncausal Wiener solution, giving a comprehensive theoretical suppression limit. It is shown that even short filters will yield a good approximation of the infinite solution, provided that the element spacing and temporal sampling are matched to the frequency band of interest. Typically, 10-20 FIR taps are sufficient in each subband.
14.
  • Petkov, Petko N., 1980-, et al. (authors)
  • Spectral Dynamics Recovery for Enhanced Speech Intelligibility in Noise
  • 2015
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - : Institute of Electrical and Electronics Engineers (IEEE). - 2329-9290. ; 23:2, pp. 327-338
  • Journal article (peer-reviewed)
    • Speech intelligibility in noisy environments decreases with an increase in the noise power. We hypothesize that the differences of subsequent short-term spectra of speech, which we collectively refer to as the speech spectral dynamics, can be used to characterize speech intelligibility. We propose a distortion measure to characterize the deviation of the dynamics of the noisy modified speech from the dynamics of natural speech. Optimizing this distortion measure, we derive a parametric relationship between the signal band-power before and after modification. The parametric nature of the solution ensures adaptation to the noise level, the speech statistics and a penalty on the power gain. A multi-band speech modification system based on the single-band optimal solution is designed under a total signal power constraint and evaluated in selected noise conditions. The results indicate that the proposed approach compares favorably to a reference method based on optimizing a measure of the speech intelligibility index. Very low computational complexity and high intelligibility gain make this an attractive approach for speech modification in a wide range of application scenarios.
15.
  • Schüldt, Christian, 1978-, et al. (authors)
  • Decay Rate Estimators and Their Performance for Blind Reverberation Time Estimation
  • 2014
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - 2329-9290. ; 22:8, pp. 1274-1284
  • Journal article (peer-reviewed)
    • Several approaches for blind estimation of reverberation time have been presented in the literature and decay rate estimation is an integral part of many, if not all, of such approaches. This paper provides both an analytical and experimental comparison, in terms of the bias and variance, of three common decay rate estimators: a straightforward linear regression approach as well as two maximum-likelihood based methods. Situations with and without interfering additive noise are considered. It is shown that the linear regression based approach is unbiased if no smoothing is applied, and that the estimation variance in the absence of noise is constantly about twice that of the maximum-likelihood based methods. It is shown that the methods that do not take possible noise into account suffer from similar estimation bias in the presence of noise. Further, a hybrid method, combining the noise robustness and low computational complexity advantages of the two different maximum-likelihood based methods, is presented.
16.
  • Shahrebabaki, Abdolreza Sabzi, et al. (authors)
  • Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models
  • 2022
  • In: IEEE/ACM transactions on audio, speech, and language processing. - : Institute of Electrical and Electronics Engineers (IEEE). - 2329-9290. ; 30, pp. 135-147
  • Journal article (peer-reviewed)
    • We investigate the problem of speaker-independent acoustic-to-articulatory inversion (AAI) in noisy conditions within the deep neural network (DNN) framework. In contrast with recent results in the literature, we argue that a DNN vector-to-vector regression front-end for speech enhancement (DNN-SE) can play a key role in AAI when used to enhance spectral features prior to AAI back-end processing. We experimented with single- and multi-task training strategies for the DNN-SE block finding the latter to be beneficial to AAI. Furthermore, we show that coupling DNN-SE producing enhanced speech features with an AAI trained on clean speech outperforms a multi-condition AAI (AAI-MC) when tested on noisy speech. We observe a 15% relative improvement in the Pearson's correlation coefficient (PCC) between our system and AAI-MC at 0 dB signal-to-noise ratio on the Haskins corpus. Our approach also compares favourably against using a conventional DSP approach to speech enhancement (MMSE with IMCRA) in the front-end. Finally, we demonstrate the utility of articulatory inversion in a downstream speech application. We report significant WER improvements on an automatic speech recognition task in mismatched conditions based on the Wall Street Journal corpus (WSJ) when leveraging articulatory information estimated by the AAI-MC system over spectral-alone speech features.
17.
  • Sward, Johan, et al. (authors)
  • Off-grid Fundamental Frequency Estimation
  • 2018
  • In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. - 2329-9290. ; 26:2, pp. 296-303
  • Journal article (peer-reviewed)
    • In this paper, we propose a gridless method for estimating an unknown number of fundamental frequencies. Starting with a conventional dictionary matrix, containing sets of candidate fundamental frequencies and their corresponding harmonics, a non-convex log-sum cost function is formed such that it imposes the harmonic structure and treats every fundamental frequency in the dictionary as a parameter. The cost function is iteratively decreased by minimizing a surrogate function, and, in each iteration, the fundamental frequencies are refined, whereas redundant parameters are omitted from the dictionary. The proposed method is tested on both real and simulated data, showing its preferred performance as compared to other state-of-the-art multi-pitch estimators.
18.
  • Venkitaraman, Arun, et al. (authors)
  • Binaural Signal Processing Motivated Generalized Analytic Signal Construction and AM-FM Demodulation
  • 2014
  • In: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING. - 2329-9290. ; 22:6, pp. 1023-1036
  • Journal article (peer-reviewed)
    • Binaural hearing studies show that the auditory system uses the phase-difference information in the auditory stimuli for localization of a sound source. Motivated by this finding, we present a method for demodulation of amplitude-modulated-frequency-modulated (AM-FM) signals using a signal and its arbitrary phase-shifted version. The demodulation is achieved using two allpass filters, whose impulse responses are related through the fractional Hilbert transform (FrHT). The allpass filters are obtained by cosine-modulation of a zero-phase flat-top prototype halfband lowpass filter. The outputs of the filters are combined to construct an analytic signal (AS) from which the AM and FM are estimated. We show that, under certain assumptions on the signal and the filter structures, the AM and FM can be obtained exactly. The AM-FM calculations are based on the quasi-eigenfunction approximation. We then extend the concept to the demodulation of multicomponent signals using uniform and non-uniform cosine-modulated filterbank (FB) structures consisting of flat bandpass filters, including the uniform cosine-modulated, equivalent rectangular bandwidth (ERB), and constant-Q filterbanks. We validate the theoretical calculations by considering application on synthesized AM-FM signals and compare the performance in the presence of noise with three other multiband demodulation techniques, namely, the Teager-energy-based approach, Gabor's AS approach, and the linear transduction filter approach. We also show demodulation results for real signals.
19.
  • Widmark, Simon (author)
  • Causal IIR Audio Precompensator Filters Subject to Quadratic Constraints
  • 2018
  • In: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING. - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 2329-9290. ; 26:10, pp. 1897-1912
  • Journal article (peer-reviewed)
    • Infinite impulse response (IIR) Wiener precompensator design, with constraints on causality, is here also extended to incorporate general quadratic constraints. A method for finding a linear quadratic optimal, causal discrete-time multiple-input multiple-output filter subject to a set of user defined constraints is proposed and analyzed. A method for designing causal filters subject to constraints on the power gains in a large number of small frequency intervals is also proposed. The resulting set of methods provides constrained stable IIR filters with optimal parameterization. Compared to finite impulse response Wiener filtering, the computational complexity is much lower; and compared to noncausal frequency domain designs, we gain control of the time-domain properties of the compensated system. The design methods are applied to a room compensation audio problem subject to filter power gain constraint(s) and are compared to a corresponding noncausal per-frequency method. The results are presented with audio filtering and sound field control as main motivating applications but the methods extend to other areas of linear feedforward controller design and Wiener filtering.
20.
  • Widmark, Simon (author)
  • Causal MSE-Optimal Filters for Personal Audio Subject to Constrained Contrast
  • 2019
  • In: IEEE-ACM Transactions on Audio, Speech and Language Processing (TASLP). - : IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. - 2329-9290. ; 27:5, pp. 972-987
  • Journal article (peer-reviewed)
    • A novel design method that generates causal pre-compensation filters is formulated. The resulting filters are constrained with respect to the amount of acoustic contrast they generate and are intended to be used for personal audio. The proposed method provides a more direct method for trading bright zone behavior against acoustic contrast as compared to other causal methods available. It also provides improved control over the temporal properties of the resulting filters as compared to the pre-existing non-causal methods. The resulting filters are analyzed by means of simulations, based on measured impulse responses of the sound-system-room interactions. The results of the simulations are compared to simulations of a frequency-domain optimal method with comparable objective, as proposed by Cai et al. and the results of the comparison are explained using the design equations. It is shown that the proposed method is viable, but that unattainable contrasts have a detrimental impact on the spectral bright zone behavior. A few different strategies for dealing with this problem are also proposed. It is demonstrated that the detrimental effect of increasingly strict causality constraints mainly concerns the lower frequency bright zone behavior in the system under investigation, but that the very highest attainable contrast levels may also be reduced somewhat.