Research Article | June 01, 2000
An Auditory-Feedback-Based Neural Network Model of Speech Production That Is Robust to Developmental Changes in the Size and Shape of the Articulatory System
 
Author Affiliations & Notes
  • Daniel E. Callan
    ATR Human Information Processing Research Laboratories, Kyoto, Japan, and ATR-I Brain Activity Imaging Center, Kyoto, Japan
  • Ray D. Kent
    Department of Communicative Disorders, University of Wisconsin-Madison
  • Frank H. Guenther
    Department of Cognitive and Neural Systems, Boston University, Boston, MA
  • Houri K. Vorperian
    Department of Communicative Disorders, University of Wisconsin-Madison
  • Contact author: Daniel E. Callan, PhD, ATR Human Information Processing Research Laboratories, 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan. Email: dcallan@hip.atr.co.jp
Article Information
Journal of Speech, Language, and Hearing Research, June 2000, Vol. 43, 721-736. doi:10.1044/jslhr.4303.721
History: Received March 9, 1999; Accepted October 21, 1999
 

The purpose of this article is to demonstrate that self-produced auditory feedback is sufficient to train a mapping between auditory target space and articulator space under conditions in which the structures of speech production are undergoing considerable developmental restructuring. One challenge for competing theories that propose invariant constriction targets is that it is unclear what teaching signal could specify constriction location and degree so that a mapping between constriction target space and articulator space can be learned. It is predicted that a model trained by auditory feedback will accomplish speech goals, in auditory target space, by continuously learning to use different articulator configurations to adapt to the changing acoustic properties of the vocal tract during development. The Maeda articulatory synthesis component of the DIVA neural network model (Guenther et al., 1998) was modified to reflect the development of the vocal tract, using measurements taken from magnetic resonance images of children. After training, the model was able to maintain the 11 English vowel targets in auditory planning space, using varying articulator configurations, despite the morphological changes that occur during development. The vocal-tract constriction pattern (derived from the vocal-tract area function) and the formant values both varied over the course of development, in correspondence with morphological changes in the structures involved in speech production. Despite these changes in the acoustic properties of the vocal tract, the model demonstrated motor-equivalent speech production under lip-restriction conditions. It accomplished this in a self-organizing manner even though it had no prior experience with lip restriction during training.

Acknowledgments
This work was supported in part by NIDCD Grant 1R29 DC02852 and in part by NIH Grant 5R01 DC00319. We would like to thank Shinji Maeda and Mark Tiede for the use of their code.