CHAPTER 3 FINDING PREDICTORS IN HIGHER EDUCATION

David Eubanks¹, William Evers Jr.², and Nancy Smith²

1 Assessment and Institutional Effectiveness, Furman University, Greenville, SC, USA

2 Eckerd College, St. Petersburg, FL, USA

As a practical matter, data mining is a requirement in higher education: it results from internal and external requests for numbers, and for the meaning behind them, in the service of decision-making. Traditional statistical summaries (typically parametric statistics) are still widely used to report descriptive information on operations, including enrollment and retention indicators, financial aid and revenue predictions, and learning assessments. This work is customarily done by institutional research departments. They are proficient at data wrangling (the often-tedious process of combining data from various sources, cleaning up problematic entries, and normalizing) using SQL databases and Excel or programming software, and they perform analyses with statistical software such as SPSS, SAS, and R. We will refer to this as the traditional sort of data mining. It produces tables and graphs of parametric estimates or distributions and frequencies, and it includes regression models, factor analysis, and any of a host of statistical tests.
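The three wrangling steps named above (combining sources, cleaning problematic entries, and normalizing) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure described by the authors: the two extracts, their field names (`student_id`, `hs_gpa`, `retained`, `aid_amount`), and every value in them are invented, and dropping records with a missing GPA is just one possible cleaning policy.

```python
from statistics import mean, pstdev

# Hypothetical extracts from two campus systems (all values invented).
enrollment = [
    {"student_id": "1001", "hs_gpa": "3.6", "retained": "Y"},
    {"student_id": "1002", "hs_gpa": "",    "retained": "n"},
    {"student_id": "1003", "hs_gpa": "2.9", "retained": "N"},
    {"student_id": "1004", "hs_gpa": "3.2", "retained": "y"},
]
financial_aid = [
    {"student_id": "1001", "aid_amount": "12000"},
    {"student_id": "1003", "aid_amount": "8000"},
    {"student_id": "1004", "aid_amount": "0"},
]


def wrangle(enrollment_rows, aid_rows):
    """Combine, clean, and normalize the two extracts into one table."""
    # Combine: index the aid extract by student ID for lookup.
    aid_by_id = {row["student_id"]: float(row["aid_amount"]) for row in aid_rows}
    combined = []
    for row in enrollment_rows:
        # Clean: skip records with a missing GPA (an assumed policy).
        if not row["hs_gpa"].strip():
            continue
        combined.append({
            "student_id": row["student_id"],
            "hs_gpa": float(row["hs_gpa"]),
            # Clean: map inconsistent Y/y/N/n codes to a boolean.
            "retained": row["retained"].strip().upper() == "Y",
            # Students absent from the aid extract are treated as receiving no aid.
            "aid_amount": aid_by_id.get(row["student_id"], 0.0),
        })
    # Normalize: express GPA as a z-score within this cohort.
    gpas = [r["hs_gpa"] for r in combined]
    mu, sigma = mean(gpas), pstdev(gpas)
    for r in combined:
        r["hs_gpa_z"] = (r["hs_gpa"] - mu) / sigma if sigma else 0.0
    return combined


def retention_rate(rows):
    """Share of students retained; returns 0.0 for an empty table."""
    return sum(r["retained"] for r in rows) / len(rows) if rows else 0.0


students = wrangle(enrollment, financial_aid)
print(f"{len(students)} usable records, retention rate {retention_rate(students):.2f}")
```

In practice the same steps are usually carried out with SQL joins or a statistical package; the plain-Python form is used here only to keep the sketch self-contained.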

In 2009, when the Journal of Educational Data Mining was founded, educational data mining (EDM) was based on institutional-level data and analysis of the type just mentioned. More recently, Romero and Ventura (2013) provided a synopsis ...
