Audio samples

This is the sample page for our submission 'Investigations on the optimal estimation of speech envelopes for the two-stage speech enhancement'.

Each sample has been individually rescaled to –26 dBov, based on its active speech level (ASL).
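As a rough illustration of this kind of level normalization, the Python sketch below scales a signal so that its active-frame level reaches a target in dBov. It is not the procedure used to prepare these samples, and it does not implement the ITU-T P.56 active speech level measurement. The function names, the 20 ms frame length, and the 40 dB activity threshold are assumptions chosen for illustration, and the signal is assumed to be a float array with full scale at ±1.0.

```python
import numpy as np


def active_level_dbov(x, fs, frame_ms=20.0, floor_db=40.0):
    """Rough active-level estimate in dBov (simplified, NOT ITU-T P.56).

    Assumes x is a float signal with full scale at +/-1.0, so that a mean
    power of 1.0 corresponds to 0 dBov.
    """
    n = max(1, int(fs * frame_ms / 1000))
    n_frames = len(x) // n
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    # Split into non-overlapping frames and compute the mean power of each.
    frames = np.asarray(x, dtype=float)[: n_frames * n].reshape(n_frames, n)
    power = np.mean(frames ** 2, axis=1)
    if power.max() <= 0.0:
        raise ValueError("signal contains no energy")
    # Hypothetical activity rule: a frame counts as active if its power is
    # within floor_db of the loudest frame.
    active = power > power.max() * 10 ** (-floor_db / 10)
    return 10 * np.log10(np.mean(power[active]))


def rescale_to_dbov(x, fs, target_dbov=-26.0):
    """Apply one broadband gain so the active level equals target_dbov."""
    gain_db = target_dbov - active_level_dbov(x, fs)
    return np.asarray(x, dtype=float) * 10 ** (gain_db / 20)
```

Because the gain is derived from active frames only, pauses do not pull the level down: a file that is half silence ends up with the same active level as one with continuous speech, although its overall RMS level is lower.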

To highlight the differences between the estimated speech envelopes, the excitation signals are also enhanced in the cepstral domain, using constant pitch amplification [1] and selective cepstral smoothing [2], to generate the new a priori SNR estimate.

Hover over a spectrogram to see the noisy input.
Click a spectrogram to enlarge or reset it.

Methods | Noisy input | Baseline: LSA | GRU-codebook | Regression | Clean reference

References

  1. Elshamy, S., Madhu, N., Tirry, W., & Fingscheidt, T. (2017). Instantaneous a priori SNR estimation by cepstral excitation manipulation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(8), 1592-1605.
  2. Song, Y., & Madhu, N. (2022). Improved CEM for speech harmonic enhancement in single channel noise suppression. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30, 2492-2503.