SwePub

Results list for search "WFRF:(Kleijn W. Bastiaan)"


  • Results 1-10 of 126
1.
  • Alhaj Moussa, Obada, 1982-, et al. (author)
  • PITCH ENHANCEMENT MOTIVATED BY RATE-DISTORTION THEORY
  • 2014
  • Conference paper (peer-reviewed)
    • A pitch enhancement filter is designed with the objective to approach the optimal rate-distortion trade-off. The filter shows significant perceptual benefits, restating that information-theoretical and perceptual criteria are usually consistent. The filter is easy to implement and can be used as a complement to existing audio codecs. Our experiments show that it can improve the reconstruction quality of the AMR-WB standard.
2.
  • Alhaj Moussa, Obada, et al. (author)
  • Predictive Audio Coding Using Rate-Distortion-Optimal Pre-and-Post-Filtering
  • 2011
  • In: Applications of Signal Processing to Audio and Acoustics (WASPAA), 2011 IEEE Workshop on. - : IEEE conference proceedings. ; , pp. 213-216
  • Conference paper (peer-reviewed)
    • A natural approach to audio coding is to use a rate-distortion optimal design combined with a perceptual model. While this approach is common in transform coding, existing predictive-coding based audio coders are generally not optimal and they benefit from heuristically motivated post-filtering. As delay requirements often force the use of predictive coding, we consider audio coding with a pre- and post-filtered predictive structure that was recently proven to be asymptotically optimal in the rate-distortion sense [1]. We show that this audio coding is efficient in achieving the state-of-the-art performance. We also show that the pre-filter plays a relatively minor role. This leads to an analytic approach for optimizing the post-filter and the predictor at each rate, eliminating the need for manual re-tuning whenever a different rate is called for. In a subjective test, the theoretically optimized post-filter provided a better performance than a conventional post-filter.
3.
  • Chatterjee, Saikat, et al. (author)
  • Auditory Model-Based Design and Optimization of Feature Vectors for Automatic Speech Recognition
  • 2011
  • In: IEEE Transactions on Audio, Speech, and Language Processing. - 1558-7916 .- 1558-7924. ; 19:6, pp. 1813-1825
  • Journal article (peer-reviewed)
    • Using spectral and spectro-temporal auditory models along with perturbation-based analysis, we develop a new framework to optimize a feature vector such that it emulates the behavior of the human auditory system. The optimization is carried out in an offline manner based on the conjecture that the local geometries of the feature vector domain and the perceptual auditory domain should be similar. Using this principle along with a static spectral auditory model, we modify and optimize the static spectral mel frequency cepstral coefficients (MFCCs) without considering any feedback from the speech recognition system. We then extend the work to include spectro-temporal auditory properties into designing a new dynamic spectro-temporal feature vector. Using a spectro-temporal auditory model, we design and optimize the dynamic feature vector to incorporate the behavior of human auditory response across time and frequency. We show that a significant improvement in automatic speech recognition (ASR) performance is obtained for any environmental condition, clean as well as noisy.
4.
  • Chatterjee, Saikat, et al. (author)
  • AUDITORY MODEL BASED MODIFIED MFCC FEATURES
  • 2010
  • In: 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING. - 9781424442966 ; , pp. 4590-4593
  • Conference paper (peer-reviewed)
    • Using spectral and spectro-temporal auditory models, we develop a computationally simple feature vector based on the design architecture of existing mel frequency cepstral coefficients (MFCCs). Along with the use of an optimized static function to compress a set of filter bank energies, we propose to use a memory-based adaptive compression function to incorporate the behavior of human auditory response across time and frequency. We show that a significant improvement in automatic speech recognition (ASR) performance is obtained for any environmental condition, clean as well as noisy.
5.
  • Driesen, J., et al. (author)
  • Learning from images and speech with non-negative matrix factorization enhanced by input space scaling
  • 2010
  • In: 2010 IEEE Workshop on Spoken Language Technology, SLT 2010 - Proceedings. - : IEEE. - 9781424479030 ; , pp. 1-6
  • Conference paper (peer-reviewed)
    • Computational learning from multimodal data is often done with matrix factorization techniques such as NMF (Non-negative Matrix Factorization), pLSA (Probabilistic Latent Semantic Analysis) or LDA (Latent Dirichlet Allocation). The different modalities of the input are to this end converted into features that are easily placed in a vectorized format. An inherent weakness of such a data representation is that only a subset of these data features actually aids the learning. In this paper, we first describe a simple NMF-based recognition framework operating on speech and image data. We then propose and demonstrate a novel algorithm that scales the inputs of this framework in order to optimize its recognition performance.
6.
  • Ekman, Anders, et al. (author)
  • Double-Ended Quality Assessment System for Super-Wideband Speech
  • 2011
  • In: IEEE TRANS AUDIO SPEECH LANG. - 1558-7916. ; 19:3, pp. 558-569
  • Journal article (peer-reviewed)
    • This paper describes a double-ended quality assessment system for speech with a bandwidth of up to 14 kHz (so-called super-wideband speech). The quality assessment system is based on a combination of local and global features, where the local features are dependent on a time alignment procedure and the global features are not. The system is evaluated over a large set of subjectively scored narrowband, wideband and super-wideband speech databases. The system performs similarly to PESQ for narrowband speech and significantly better for wideband speech.
7.
  • Ekman, L. Anders, et al. (author)
  • Spectral envelope estimation and regularization
  • 2006
  • In: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing. ; , pp. 245-248
  • Conference paper (peer-reviewed)
    • A well-known problem with linear prediction is that its estimate of the spectral envelope often has sharp peaks for high-pitch speakers. These peaks are anomalies resulting from contamination of the spectral envelope by the spectral fine structure. We investigate the method of regularized linear prediction to find a better estimate of the spectral envelope and compare the method to the commonly used approach of bandwidth expansion. We present simulations over voiced frames of female speakers from the TIMIT database, where the envelope modeling accuracy is measured using a log spectral distortion measure. We also investigate the coding properties of the methods. The results indicate that the new regularized LP method is superior to bandwidth expansion, with an insignificant increase in computational complexity.
8.
  • Falk, Tiago H., et al. (author)
  • Noise Suppression Based on Extending a Speech-Dominated Modulation Band
  • 2007
  • In: INTERSPEECH 2007. - 9781605603162 ; , pp. 1469-1472
  • Conference paper (peer-reviewed)
    • Previous work on bandpass modulation filtering for noise suppression has resulted in unwanted perceptual artifacts and decreased speech clarity. Artifacts are introduced mainly due to half-wave rectification, which is employed to correct for negative power spectral values resultant from the filtering process. In this paper, modulation frequency estimation (i.e., bandwidth extension) is used to improve perceptual quality. Experiments demonstrate that speech-component lowpass modulation content can be reliably estimated from bandpass modulation content of speech-plus-noise components. Subjective listening tests corroborate that improved quality is attained when the removed speech lowpass modulation content is compensated for by the estimate.
9.
  • Faundez-Zanuy, M., et al. (author)
  • On the relevance of bandwidth extension for speaker identification
  • 2015
  • In: European Signal Processing Conference. - : EUSIPCO. - 2219-5491.
  • Journal article (peer-reviewed)
    • In this paper we discuss the relevance of bandwidth extension for speaker identification tasks. Mainly, we want to study whether it is possible to recognize voices that have been bandwidth extended. For this purpose, we created two different databases (microphonic and ISDN) of speech signals that were bandwidth extended from telephone bandwidth ([300, 3400] Hz) to full bandwidth ([100, 8000] Hz). We have evaluated different parameterizations, and we have found that the MELCEPST parameterization can take advantage of the bandwidth extension algorithms in several situations.
10.
  • Faundez-Zanuy, M, et al. (author)
  • The COST-277 European action : An overview
  • 2005
  • In: NONLINEAR ANALYSES AND ALGORITHMS FOR SPEECH PROCESSING. ; , pp. 1-9
  • Conference paper (peer-reviewed)
    • This paper summarizes the rationale for proposing the COST-277 "nonlinear speech processing" action, and the work done during these last four years. In addition, future perspectives are described.
Publication type
conference paper (69)
journal article (43)
doctoral thesis (7)
other publication (5)
research review (1)
licentiate thesis (1)
Content type
peer-reviewed (106)
other academic/artistic (20)
Author/editor
Kleijn, W. Bastiaan (117)
Li, Minyue (13)
Klejsa, Janusz (11)
Henter, Gustav Eje (9)
Petkov, Petko N. (9)
Kozica, Ermin (9)
Kleijn, W. Bastiaan, ... (7)
Vafin, R. (6)
Ozerov, Alexey (6)
Grancharov, Volodya (5)
Kim, Moo Young (5)
Zhang, Guoqiang (5)
Leijon, Arne (4)
Plasberg, Jan H. (4)
Vafin, Renat (4)
Huang, F. (3)
Lee, T. (3)
Nilsson, Mattias (3)
Chatterjee, Saikat (3)
Kubin, G (3)
Faundez-Zanuy, M. (3)
Samuelsson, Jonas (3)
Zhao, David Yuheng (3)
Leijon, Arne, Profes ... (3)
Heusdens, R. (3)
Kuropatwinski, Marci ... (3)
Petkov, Petko N., 19 ... (3)
Srinivasan, Sriram (3)
Lundin, Henrik (2)
Flierl, Markus (2)
Lindblom, Jonas (2)
Jensen, J. (2)
Hu, Xiaoming (2)
Taghia, Jalil (2)
Petkov, Petko (2)
Feldbauer, C. (2)
Guo, Jun (2)
Guoqiang, Zhang, 198 ... (2)
Kot, V. (2)
Niamut, O. A. (2)
Van De Par, S. (2)
Van Schijndel, N. H. (2)
Heusdens, Richard (2)
Ramchandran, Kannan (2)
Ma, Zhanyu (2)
Mohammadiha, Nasser, ... (2)
Pobloth, H. (2)
Östergaard, Jan (2)
Zhang, Guoqiang, 198 ... (2)
Zhi, Ruicong (2)
University
Kungliga Tekniska Högskolan (126)
Language
English (126)
Research subject (UKÄ/SCB)
Engineering and Technology (61)
Natural Sciences (38)
Humanities (3)
Social Sciences (2)
