Abstract: Acoustic features play an important role in improving the quality of the synthesised speech. Currently, the Mel spectrogram is a widely employed acoustic feature in most acoustic models.
Abstract: In this work, we propose CleanMel, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance ...
A study published in the journal Information Sciences introduces a novel framework for speech emotion recognition using dual-channel spectrograms and optimized deep features. Their proposed ...
We incorporate effective components of the TasNet into a freq-domain separation method. We introduce a solution for directly optimizing the separation criterion in freq-domain networks. Our exp ...
For the task of speech separation, previous study usually treats multi-channel and single-channel scenarios as two research tracks with specialized solutions developed respectively. Instead, we ...
Recent studies have successfully shown how style transfer can be applied on images from one domain to another. In this project we attempt to use this technique to embed emotions in spectrogram images.
一部の結果でアクセス不可の可能性があるため、非表示になっています。
アクセス不可の結果を表示する