CHAPTER 15 DATA MINING WITH NATURAL LANGUAGE PROCESSING AND CORPUS LINGUISTICS: UNLOCKING ACCESS TO SCHOOL CHILDREN’S LANGUAGE IN DIVERSE CONTEXTS TO IMPROVE INSTRUCTIONAL AND ASSESSMENT PRACTICES
Alison L. Bailey, Anne Blackstock‐Bernstein, Eve Ryan, and Despina Pitsoulakis
Department of Education, University of California, Los Angeles, CA, USA
15.1 INTRODUCTION
In this chapter, we bring together the fields of corpus linguistics, natural language processing (NLP), and computing to describe how language samples of school‐age students were used to create a digital data system of Dynamic Language Learning Progressions (DLLPs). Our work offers a compelling instantiation of teacher practitioners, education researchers, computer scientists, and engineers working together to apply analytical techniques from corpus linguistics and big data principles in school settings. The collaboration was forged to mine language corpora (collections of verbatim language samples) of school‐age students’ oral and written language and to define trajectories of student language development for research and practice purposes. The project capitalizes on the efficiency that NLP offers through the automated linguistic analysis of transcribed oral language and text‐based data. The statistical outputs that NLP produces (e.g., number and length of sentences, identified parts of speech, word inventories) include not only frequencies, but, when coupled with the educational and demographic data also collected with ...
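To make the kinds of statistical outputs named above more concrete (number and length of sentences, word inventories), the following is a minimal Python sketch. It is not the DLLP project's pipeline: the function name `describe_sample`, the regex-based sentence splitting and tokenization, and the sample text are all assumptions made for illustration. Part-of-speech identification is left out because it would require a trained tagger.

```python
import re
from collections import Counter


def describe_sample(text):
    """Return simple descriptive statistics for one transcribed language sample.

    Hypothetical helper for illustration only; not the authors' method.
    """
    # Naive sentence split: break after ., !, or ? when followed by whitespace.
    # Real transcripts would usually need a proper sentence segmenter.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    # Naive tokenization: lowercase runs of letters and apostrophes.
    tokens_per_sentence = [re.findall(r"[a-z']+", s.lower()) for s in sentences]

    # Sentence length measured in word tokens.
    lengths = [len(tokens) for tokens in tokens_per_sentence]

    # Word inventory: each distinct word form with its frequency.
    inventory = Counter(w for tokens in tokens_per_sentence for w in tokens)

    return {
        "n_sentences": len(sentences),
        "sentence_lengths": lengths,
        "mean_sentence_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "word_inventory": inventory,
    }


# Made-up sample text, not taken from any corpus.
sample = "First you get the toothbrush. Then you put toothpaste on it. Then you brush!"
stats = describe_sample(sample)
print(stats["n_sentences"], stats["sentence_lengths"])
print(stats["word_inventory"].most_common(2))
```

In a setting like the one described, per-sample statistics of this kind could then be joined with educational and demographic records for each student; how that join is done here is not specified by the text.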