Research Article  |   December 10, 2018
Developmental Shifts in Detection and Attention for Auditory, Visual, and Audiovisual Speech
 
Author Affiliations & Notes
  • Susan Jerger
    School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, Richardson
    Callier Center for Communication Disorders, Richardson, TX
  • Markus F. Damian
    School of Experimental Psychology, University of Bristol, United Kingdom
  • Cassandra Karl
    School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, Richardson
    Callier Center for Communication Disorders, Richardson, TX
  • Hervé Abdi
    School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, Richardson
  • Disclosure: The authors have declared that no competing interests existed at the time of publication.
  • Correspondence to Susan Jerger: sjerger@utdallas.edu
  • Editor-in-Chief: Frederick (Erick) Gallun
  • Editor: Lori J. Leibold
Article Information
Journal of Speech, Language, and Hearing Research, December 2018, Vol. 61, 3095-3112. doi:10.1044/2018_JSLHR-H-17-0343
History: Received September 10, 2017; Revised January 2, 2018; Accepted July 16, 2018

Purpose Successful speech processing depends on our ability to detect and integrate multisensory cues, yet there is minimal research on multisensory speech detection and integration by children. To address this need, we studied the development of speech detection for auditory (A), visual (V), and audiovisual (AV) input.

Method Participants were 115 typically developing children clustered into age groups between 4 and 14 years. Speech detection (quantified by response times [RTs]) was determined for 1 stimulus, /buh/, presented in A, V, and AV modes (articulating vs. static facial conditions). Performance was analyzed not only in terms of traditional mean RTs but also in terms of the faster versus slower RTs (defined by the 1st vs. 3rd quartiles of the RT distributions). These time regions were conceptualized, respectively, as reflecting optimal detection with efficient focused attention versus less optimal detection with inefficient focused attention due to attentional lapses.
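As an illustration of this quartile-based approach, the minimal sketch below (Python with NumPy; the function name and RT values are hypothetical and not taken from the study) shows how the mean, 1st-quartile, and 3rd-quartile summaries could be computed for a single participant-condition RT distribution.

```python
import numpy as np

def rt_summaries(rts):
    """Summarize one RT distribution (one child, one stimulus mode).

    Returns the traditional mean RT plus the 1st and 3rd quartiles,
    interpreted here as the faster RTs (efficient focused attention)
    and the slower RTs (attentional lapses), respectively.
    """
    rts = np.asarray(rts, dtype=float)
    return {
        "mean_rt": rts.mean(),
        "q1_fast": np.percentile(rts, 25),  # faster-RT region
        "q3_slow": np.percentile(rts, 75),  # slower-RT region
    }

# Hypothetical RTs in milliseconds for detecting /buh/ in the AV mode
print(rt_summaries([412, 388, 455, 501, 397, 620, 433, 465]))
```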

Results Mean RTs indicated better detection (a) of multisensory AV speech than of A speech only in 4- to 5-year-olds and (b) of A and AV inputs than of V input in all age groups. The faster RTs revealed that AV input did not improve detection in any group. The slower RTs indicated that (a) processing of silent V input was significantly faster for the articulating face than for the static face and (b) AV speech or facial input significantly minimized attentional lapses in all groups except 6- to 7-year-olds (a U-shaped developmental pattern). Apparently, the AV benefit observed in the mean performance of 4- to 5-year-olds arose from effects of attention.

Conclusions The faster RTs indicated that AV input did not enhance detection in any group, but the slower RTs indicated that AV speech and dynamic V speech (mouthing) significantly minimized attentional lapses and thus did influence performance. Overall, A and AV inputs were detected consistently faster than V input, a result consistent with stimulus-bound auditory processing in these children.

Acknowledgments
This research was supported by the National Institute on Deafness and Other Communication Disorders Grant DC-000421 to the University of Texas at Dallas. We thank the children and parents who participated and the researchers who assisted, namely, Aisha Aguilera, Carissa Dees, Nina Dinh, Nadia Dunkerton, Alycia Elkins, Brittany Hernandez, Demi Krieger, Rachel Parra McAlpine, Michelle McNeal, Jeffrey Okonye, and Kimberly Periman of the University of Texas at Dallas (data collection, analysis, and presentation) as well as Derek Hammons and Scott Hawkins of the University of Texas at Dallas and Drs. Brent Spehar and Nancy Tye-Murray of the Washington University School of Medicine (computer programming and stimuli recording/editing).