Research Article  |   June 01, 1999
Automated Grammatical Tagging of Child Language Samples
 
Author Affiliations & Notes
  • Ron W. Channell
    Brigham Young University, Provo, UT
  • Bonnie W. Johnson
    University of Kansas, Lawrence
  • Contact author: Ron W. Channell, PhD, Audiology and Speech-Language Pathology, 128 TLRB, Brigham Young University, Provo, UT 84602.
  • Corresponding author: E-mail: channellR@byu.edu
Article Information
Journal of Speech, Language, and Hearing Research, June 1999, Vol. 42, 727-734. doi:10.1044/jslhr.4203.727
History: Received July 14, 1998; Accepted December 1, 1998
 

Recent studies of the automated grammatical categorization ("tagging") of words using probabilistic methods have reported substantial levels of accuracy: over 95% agreement with manual tagging for words from a variety of texts. However, the texts on which this method has been tested were written by adults and edited by publishers. The present study examined the accuracy with which such methods could tag transcribed conversational language samples from 30 normally developing children. On a word-by-word basis, automated accuracy levels ranged from 92.9% to 97.4%, averaging 95.1%. Accuracy at correctly tagging whole utterances was lower, ranging from 60.5% to 90.3%, with an average of 77.7%. Probabilistic methods of coding language samples hold potential as a viable tool for child language research. Further study and improvement of automated grammatical tagging are warranted and necessary before this technology can be put to widespread use.
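The two accuracy measures reported above can be illustrated with a short sketch. This is not the authors' program. It is a minimal illustration that assumes each utterance is already available as a list of tags, one per word, from both the automated tagger and the manual coder. The function name `tagging_agreement` and the tag labels (`PRO`, `V`, `DET`, `N`, `ADV`) are hypothetical and chosen only for the example.

```python
from typing import List, Tuple

Utterance = List[str]  # one grammatical tag per word


def tagging_agreement(
    auto: List[Utterance], manual: List[Utterance]
) -> Tuple[float, float]:
    """Return (word-level agreement, utterance-level agreement)."""
    if not manual or len(auto) != len(manual):
        raise ValueError("need the same, nonzero number of utterances")

    total_words = 0
    matched_words = 0
    matched_utterances = 0

    for auto_tags, manual_tags in zip(auto, manual):
        if len(auto_tags) != len(manual_tags):
            raise ValueError("utterances must have one tag per word in both codings")
        hits = sum(a == m for a, m in zip(auto_tags, manual_tags))
        total_words += len(manual_tags)
        matched_words += hits
        # An utterance counts as correct only if every word in it matches.
        if hits == len(manual_tags):
            matched_utterances += 1

    if total_words == 0:
        raise ValueError("no words to compare")

    return matched_words / total_words, matched_utterances / len(manual)


# Hypothetical three-utterance sample with one mistagged word.
manual_tags = [["PRO", "V", "DET", "N"], ["PRO", "V"], ["DET", "N", "V", "ADV"]]
auto_tags = [["PRO", "V", "DET", "N"], ["PRO", "N"], ["DET", "N", "V", "ADV"]]

word_acc, utt_acc = tagging_agreement(auto_tags, manual_tags)
print(f"word-level: {word_acc:.1%}, utterance-level: {utt_acc:.1%}")
```

In this toy sample a single mistagged word out of ten lowers word-level agreement only slightly, but it makes one of the three utterances count as wrong. This is why whole-utterance accuracy is necessarily no higher than word-by-word accuracy and is usually lower.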

Acknowledgments
We acknowledge Laurie G. Berrett, who collaborated on an earlier version of this study; Kim Smith of the BYU Humanities Research Center, who gave us access to the Brown University Corpus; Debbie Millet for interrater reliability; and the BYU College of Education for funding. We thank Melissa L. Barber, Lori Taylor Banta, and Elizabeth Chamberlain Mitchell for the language samples that they collected and transcribed as part of their master's theses. We thank Marc Fey and Steven Long for their encouragement and thoughtful comments.
Bonnie W. Johnson's contribution to this research was supported in part by a National Institute on Deafness and Other Communication Disorders Award #5 T32 DC0052, granted to Mabel L. Rice.
A copy of the latest version of the program, including a dictionary and probability matrix, associated utility programs, and an on-line technical manual, can be obtained without charge via email from the first author.