Newly Published
Research Article  |   September 19, 2018
Examining Factors Influencing the Viability of Automatic Acoustic Analysis of Child Speech
 
Author Affiliations & Notes
  • Thea Knowles
    School of Communication Sciences & Disorders, Western University, London, Ontario, Canada
    Health & Rehabilitation Sciences, Western University, London, Ontario, Canada
  • Meghan Clayards
    Linguistics, McGill University, Montréal, Québec, Canada
    School of Communication Sciences and Disorders, McGill University, Montréal, Québec, Canada
    Centre for Research on Brain, Language and Music, McGill University, Montréal, Québec, Canada
  • Morgan Sonderegger
    Linguistics, McGill University, Montréal, Québec, Canada
    Centre for Research on Brain, Language and Music, McGill University, Montréal, Québec, Canada
  • Disclosure: The authors have declared that no competing interests existed at the time of publication.
  • Correspondence to Thea Knowles: tknowle3@uwo.ca
  • Editor-in-Chief: Julie Liss
  • Editor: Megan McAuliffe
Article Information
Journal of Speech, Language, and Hearing Research, Newly Published. doi:10.1044/2018_JSLHR-S-17-0275
History: Received July 19, 2017; Revised January 22, 2018; Accepted May 17, 2018
 

Purpose Heterogeneous child speech was force-aligned to investigate whether (a) manipulating specific parameters could improve alignment accuracy and (b) forced alignment could be used to replicate published results on acoustic characteristics of /s/ production by children.

Method In Part 1, child speech from 2 corpora was force-aligned with a trainable aligner (Prosodylab-Aligner) under different conditions that systematically manipulated the input training data and the type of transcription used. Alignment accuracy was determined by comparing hand and automatic alignments in terms of how often they overlapped (%-Match) and the absolute differences in duration and boundary placement. Using mixed-effects regression, accuracy was modeled as a function of alignment condition, segment, and child age. In Part 2, forced alignments derived from a subset of the alignment conditions in Part 1 were used to extract the spectral center of gravity of /s/ productions from young children. These findings were compared to published results that used manual alignments of the same data.
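The accuracy measures named in the Method can be illustrated with a minimal sketch. The abstract does not give the exact definition of %-Match, so the code below assumes that a segment counts as a match when the midpoint of the automatic interval falls inside the hand-aligned interval. The `Interval` class and the function names are hypothetical and are not taken from the authors' analysis scripts.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One aligned segment; times are in seconds."""
    label: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start

    @property
    def midpoint(self) -> float:
        return (self.start + self.end) / 2


def compare(hand: Interval, auto: Interval) -> dict:
    """Per-segment accuracy measures for one hand/automatic pair."""
    return {
        # Assumed match criterion: automatic midpoint lies inside the hand interval.
        "match": hand.start <= auto.midpoint <= hand.end,
        "onset_diff": abs(auto.start - hand.start),
        "offset_diff": abs(auto.end - hand.end),
        "duration_diff": abs(auto.duration - hand.duration),
    }


def percent_match(pairs: list) -> float:
    """Percentage of (hand, auto) pairs that satisfy the match criterion."""
    if not pairs:
        raise ValueError("no segment pairs to compare")
    matches = sum(compare(hand, auto)["match"] for hand, auto in pairs)
    return 100 * matches / len(pairs)
```

Per-segment values of this kind would then serve as the outcome variables in a regression on alignment condition, segment, and child age.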

Results Overall, the results of Part 1 demonstrated that both training data more similar to the data to be aligned and phonetic transcription led to improvements in alignment accuracy. Speech from older children was aligned more accurately than speech from younger children. In Part 2, /s/ center of gravity extracted from force-aligned segments was found to diverge in the speech of male and female children, replicating the pattern found in previous work using manually aligned segments. This was true even for the least accurate forced alignment method.
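For readers unfamiliar with the measure, spectral center of gravity is commonly defined as the power-weighted mean frequency of a spectrum. The sketch below computes it for a short excerpt using a plain discrete Fourier transform. The windowing, filtering, and weighting choices used in the study are not stated here, so this is an illustration of the general definition under assumed settings, and the function names are hypothetical.

```python
import cmath
import math


def power_spectrum(samples: list) -> list:
    """Power at each non-negative DFT bin (naive O(n^2) transform)."""
    n = len(samples)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        power.append(abs(coeff) ** 2)
    return power


def center_of_gravity(samples: list, sample_rate: float) -> float:
    """Power-weighted mean frequency in Hz."""
    n = len(samples)
    power = power_spectrum(samples)
    total = sum(power)
    if total == 0:
        raise ValueError("signal has no energy")
    # Bin k corresponds to frequency k * sample_rate / n.
    weighted = sum((k * sample_rate / n) * p for k, p in enumerate(power))
    return weighted / total
```

In an analysis like Part 2, `samples` would be the audio between the start and end boundaries of an /s/ segment, which is why boundary placement could in principle affect the measure.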

Conclusions Alignment accuracy of child speech can be improved by using more specific training and transcription. However, poor alignment accuracy was not found to impede acoustic analysis of /s/ produced by even very young children. Thus, forced alignment presents a useful tool for the analysis of child speech.

Supplemental Material https://doi.org/10.23641/asha.7070105

Acknowledgments
This research was supported by the McGill Collaborative Research Development Fund awarded to Meghan Clayards, Aparna Nadig, Kristine Onishi, Morgan Sonderegger, and Michael Wagner. We thank Aparna Nadig, Kristine Onishi, and Michael Wagner for their contributions to a preliminary version of this work, which was presented at the 170th Meeting on Acoustics and appears in the proceedings (Knowles et al., 2015). We would also like to thank Hye-Young Bang for providing the manual alignments and acoustic analysis scripts for these data. Finally, we would like to thank A. Azzimmaturo, L. Bassford, L. Harrison, J. Kobelski, M. Schwartz, and A. Vong for assisting with data preparation.