Research Article  |   April 01, 1993
A Cepstrum-Based Technique for Determining a Harmonics-to-Noise Ratio in Speech Signals
 
Author Affiliations & Notes
  • Guus de Krom
    Research Institute for Language and Speech, University of Utrecht, the Netherlands
  • Contact author: Guus de Krom, MA, Research Institute for Language and Speech, University of Utrecht, 3512 JK Utrecht, the Netherlands.
Article Information
Journal of Speech, Language, and Hearing Research, April 1993, Vol. 36, 254-266. doi:10.1044/jshr.3602.254
History: Received March 11, 1992; Accepted September 29, 1992
 

A new method to calculate a spectral harmonics-to-noise ratio (HNR) in speech signals is presented. The method involves discrimination between harmonic and noise energy in the magnitude spectrum by means of a comb-liftering operation in the cepstrum domain. Sensitivity of HNR to (a) additive noise and (b) jitter was tested with synthetic vowel-like signals, generated at 10 fundamental frequencies. All jitter and noise signals were analyzed at three window lengths in order to investigate the effect of the length of the analysis frame on the estimated HNR values. Results of a multiple linear regression analysis with noise or jitter, F0, and window length as predictors for HNR indicate a major effect of both noise and jitter on HNR, in that HNR decreases almost linearly with increasing noise levels or increasing jitter. The influence of F0 and window length on HNR is small for the jittered signals, but HNR increases considerably with increasing F0 or window length for the noise signals. We conclude that the method seems to be a valid technique for determining the amount of spectral noise, because it is almost linearly sensitive to both noise and jitter for a large part of the noise or jitter continuum. The strong negative relation between HNR and jitter illustrates that spectral noise measures cannot simply be taken as indicators of the actual amount of noise in the time signal. Instead, HNR integrates several aspects of the acoustic stability of the signal. As such, HNR may be a useful parameter in the analysis of voice quality, although it cannot be directly interpreted in terms of underlying glottal events or perceptual characteristics.
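The comb-liftering idea described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract gives no processing details, so everything below is an assumption made for illustration. This includes the Hann window, the known pitch period in samples (`period`), the lifter half-width (`half_width`), the absence of any baseline correction, and the definition of harmonic energy as total energy minus estimated noise energy, clipped at zero per bin. The names `dft`, `cepstral_hnr`, and `synth_frame` are hypothetical. A naive DFT keeps the example dependency-free; a real analysis would use an FFT library.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) DFT; adequate for the short frames used here."""
    n = len(x)
    sign = 1 if inverse else -1
    twiddle = [cmath.exp(sign * 2j * math.pi * m / n) for m in range(n)]
    out = [sum(x[t] * twiddle[(k * t) % n] for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def cepstral_hnr(frame, period, half_width=2):
    """Illustrative HNR in dB; `period` is the pitch period in samples (assumed known)."""
    n = len(frame)
    # Assumption: Hann window applied before the spectrum is taken.
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    spectrum = dft([s * w for s, w in zip(frame, hann)])
    log_mag = [math.log(abs(c) + 1e-12) for c in spectrum]
    # Real cepstrum: inverse DFT of the log magnitude spectrum.
    cepstrum = [c.real for c in dft(log_mag, inverse=True)]
    # Comb-lifter: zero the rahmonics (multiples of the pitch period),
    # together with their mirror images, so the result stays symmetric.
    liftered = list(cepstrum)
    for q in range(period, n // 2 + 1, period):
        for d in range(-half_width, half_width + 1):
            idx = q + d
            if 0 < idx < n:
                liftered[idx] = 0.0
                liftered[n - idx] = 0.0
    # Back to the spectral domain: log magnitude with the harmonic comb removed,
    # used here as the noise estimate (assumption: no baseline correction).
    noise_log = [c.real for c in dft(liftered)]
    harmonic = noise = 0.0
    for k in range(n // 2 + 1):
        total_k = math.exp(2 * log_mag[k])
        noise_k = math.exp(2 * noise_log[k])
        harmonic += max(total_k - noise_k, 0.0)  # assumed definition, clipped at zero
        noise += noise_k
    return 10 * math.log10(max(harmonic, 1e-300) / noise)


def synth_frame(n=256, period=32, noise_sd=0.0, seed=0):
    """Vowel-like test frame: ten harmonics plus optional Gaussian noise (hypothetical)."""
    rng = random.Random(seed)
    return [
        sum(math.cos(2 * math.pi * h * t / period) / h for h in range(1, 11))
        + rng.gauss(0.0, noise_sd)
        for t in range(n)
    ]


clean = cepstral_hnr(synth_frame(noise_sd=0.0), period=32)
noisy = cepstral_hnr(synth_frame(noise_sd=0.5), period=32)
print(f"clean: {clean:.1f} dB, noisy: {noisy:.1f} dB")
```

On these synthetic frames the noisy signal should yield the lower value, in line with the trend the abstract reports for additive noise. The absolute numbers depend on the assumptions listed above and should not be compared with the published results.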

Acknowledgments
The program for synthesizing the signals used in this experiment was written by Peter Pabon, Research Institute for Language and Speech, University of Utrecht. He and a number of other people have provided me with useful comments on the first drafts of this paper. Their help is gratefully acknowledged.