Research Article  |   October 01, 1998
Optimizing the Reliability of Speech Recognition Scores
 
Author Affiliations & Notes
  • Stanley A. Gelfand
    Queens College and the Graduate School of the City University of New York
  • Contact author: Stanley A. Gelfand, PhD, Department of Linguistics and Communication Disorders, Queens College of CUNY, 65-30 Kissena Boulevard, Flushing, NY 11367. Email: sagelfand@qc.edu
Article Information
Journal of Speech, Language, and Hearing Research, October 1998, Vol. 41, 1088-1102. doi:10.1044/jslhr.4105.1088
History: Received February 20, 1998; Accepted July 23, 1998
 

Speech recognition assessment involves a dilemma because clinicians want a test that is short and reliable, but statistical principles dictate that a short test is unreliable. Curves representing the variability of test scores based on the binomial model reveal that approximately 450 scorable items are needed in order to optimize the reliability of a speech recognition test. A testing approach was developed to achieve this sample size while retaining the principal features of the most commonly accepted speech recognition tests (i.e., monosyllabic words presented in an open-set format, verbal responses, and right/wrong scoring). It involves the use of an interactive computer program to present CNC words in 50 three-word groups, which are scored phonemically, resulting in 450 scorable items. Normal performance is described as a function of both presentation level and signal-to-noise ratio. Comparisons of test and retest scores for 100 individuals with normal hearing and 100 persons with sensorineural losses revealed that the approach achieves the degree of reliability predicted by the binomial model for both groups. Phoneme scores accounted for 99% of the variance of word scores for most of the performance range encountered in clinical practice, making it possible for test outcomes based on phonemic scoring to be expressed in terms of equivalent word recognition scores.
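The claim that roughly 450 scorable items are needed rests on the binomial model, under which the variability of an observed score shrinks with the square root of the number of items. The sketch below (not from the article) illustrates this relationship for a hypothetical true score of 70%, comparing conventional 25- and 50-word lists, a 150-word list, and the 450 phonemes yielded by phonemic scoring of 150 CNC words; the specific true score, the item counts, and the normal-approximation critical difference are illustrative assumptions, not figures reported in the study.

```python
# Illustrative sketch of binomial-model score variability (assumed values,
# not taken from the article). For a true score p and n scorable items, the
# standard deviation of the observed percent-correct score is
# 100 * sqrt(p * (1 - p) / n), so precision improves roughly as 1 / sqrt(n).

import math


def binomial_sd(p: float, n: int) -> float:
    """Standard deviation (in percentage points) of an observed score
    when the true recognition probability is p and there are n items."""
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


def critical_difference(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% critical difference (percentage points) between two
    independent administrations of an n-item test, using a normal
    approximation to the binomial."""
    return z * math.sqrt(2.0) * binomial_sd(p, n)


if __name__ == "__main__":
    true_score = 0.70  # hypothetical true recognition probability
    for n in (25, 50, 150, 450):
        print(f"n = {n:3d} items: SD = {binomial_sd(true_score, n):4.1f} pts, "
              f"95% critical difference = {critical_difference(true_score, n):4.1f} pts")
```

Under these assumptions, moving from a 25-word list to 450 phonemically scored items reduces the standard deviation of the score from roughly 9 percentage points to about 2, which is the sense in which the larger item count "optimizes" reliability.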

Acknowledgments
I am indebted to Teresa Schwander and Matthew Bakke, who assisted in various aspects of the test construction, and to Sharon Cohen, Shoshana Millman, and Cheryl Newman, who assisted in data collection. I would also like to thank Drs. Christopher W. Turner and Richard H. Wilson, as well as an anonymous reviewer, for their valuable comments and suggestions. This work was supported in part by City University of New York PSC-BHE Grants 663451, 665535, and 666535.