Research Article  |   December 01, 1993
Comparison of Fo Extraction Methods for High-Precision Voice Perturbation Measurements
 
Author Affiliations & Notes
  • Ingo R. Titze
    Department of Speech Pathology and Audiology, and National Center for Voice and Speech, The University of Iowa, Iowa City, and The Recording and Research Center, The Denver Center for the Performing Arts, Denver, CO
  • Haixiang Liang
    Department of Speech Pathology and Audiology, and National Center for Voice and Speech, The University of Iowa, Iowa City, and The Recording and Research Center, The Denver Center for the Performing Arts, Denver, CO
  • Contact author: Ingo R. Titze, PhD, National Center for Voice and Speech, Department of Speech Pathology and Audiology, The University of Iowa, Iowa City, IA 52242.
Article Information
Journal of Speech, Language, and Hearing Research, December 1993, Vol. 36, 1120-1133. doi:10.1044/jshr.3606.1120
History: Received March 9, 1992; Accepted April 1, 1993
 

Voice perturbation measures, such as jitter and shimmer, depend on accurate extraction of fundamental frequency (Fo) and amplitude of various waveform types. The extraction method directly affects the accuracy of the measures, particularly if several waveform types (with or without formant structure) are under consideration and if noise and modulation are present in the signal. For frequency perturbation, high precision is defined here as the ability to extract Fo to ±0.01% under conditions of noise and modulation. Three Fo-extraction methods and their software implementations are discussed and compared. The methods are cycle-to-cycle waveform matching, zero-crossing, and peak-picking. Interpolation between samples is added to make the extractions more accurate and reliable. The sensitivity of the methods to parameters such as sampling frequency, mean Fo, signal-to-noise ratio, frequency modulation, and amplitude modulation is explored.
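To make the role of interpolation concrete, the following is a minimal sketch of zero-crossing period extraction with linear interpolation between samples, followed by a simple jitter figure. It is an illustration under stated assumptions, not the authors' software: the function names, the synthetic test tone, the 20 kHz sampling rate, and the jitter definition (mean absolute cycle-to-cycle period difference relative to the mean period) are all choices made here for the example.

```python
import math


def zero_crossing_periods(signal, fs):
    """Return cycle periods in seconds from positive-going zero crossings.

    Each crossing time is refined by linear interpolation between the two
    samples that straddle zero (an assumption of this sketch; other
    interpolation schemes are possible).
    """
    crossings = []
    for n in range(len(signal) - 1):
        a, b = signal[n], signal[n + 1]
        if a < 0.0 <= b:
            # Fraction of one sample interval at which the straight line
            # from a to b reaches zero.
            frac = -a / (b - a)
            crossings.append((n + frac) / fs)
    return [t1 - t0 for t0, t1 in zip(crossings, crossings[1:])]


def mean_fo(periods):
    """Mean fundamental frequency in Hz: number of cycles over elapsed time."""
    return len(periods) / sum(periods)


def jitter_percent(periods):
    """Mean absolute difference of consecutive periods over mean period, in %."""
    diffs = [abs(p1 - p0) for p0, p1 in zip(periods, periods[1:])]
    mean_period = sum(periods) / len(periods)
    return 100.0 * (sum(diffs) / len(diffs)) / mean_period


if __name__ == "__main__":
    fs = 20000.0      # hypothetical sampling frequency in Hz
    f_true = 123.4    # hypothetical Fo of a noise-free, unmodulated test tone
    x = [math.sin(2.0 * math.pi * f_true * n / fs + 0.3)
         for n in range(int(0.5 * fs))]
    periods = zero_crossing_periods(x, fs)
    print("extracted mean Fo (Hz):", mean_fo(periods))
    print("residual jitter (%):", jitter_percent(periods))
```

Without the interpolation step, each crossing time would be quantized to a whole sample, so the period error would scale with the sampling interval. On a clean sinusoid the interpolated estimate is limited mainly by the curvature of the waveform near zero, which is why the residual jitter of this test tone should be very small. Noise and modulation, which the abstract identifies as the hard cases, are not modeled in this sketch.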

Acknowledgments
This work was supported by grant No. DC00387-05 from the National Institute on Deafness and Other Communication Disorders, for which we are grateful. The authors also express appreciation to Pamela Rios and Julie Lemke for manuscript preparation.