Schools have long amassed data: tracking grades, attendance, textbook purchases, test scores, cafeteria meals, and the like. But little has actually been done with this information — whether due to privacy issues or technical capacities — to enhance students’ learning.

With the adoption of technology in more schools and with a push for more open government data, there are clearly a lot of opportunities for better data gathering and analysis in education. But what will that look like? It’s a politically charged question, no doubt, as some states are turning to things like standardized test score data in order to gauge teacher effectiveness and, in turn, retention and promotion.

I asked education theorist George Siemens, from the Technology Enhanced Knowledge Research Institute at Athabasca University, about the possibilities and challenges for data, teaching, and learning.

Our interview follows.

What kinds of data have schools traditionally tracked?

George Siemens: Schools and universities have long tracked a broad range of learner data — often drawn from applications (universities) or enrollment forms (schools). This data includes any combination of: location, previous learning activities, health concerns (physical and emotional/mental), attendance, grades, socio-economic data (parental income), parental status, and so on. Most universities will store and aggregate this data under the umbrella of institutional statistics.

Privacy laws differ from country to country, but generally will prohibit academics from accessing data that is not relevant to a particular class, course, or program. Unfortunately, most schools and universities do very little with this wealth of data, other than possibly producing an annual institutional profile report. Even a simple analysis of existing institutional data could raise the profile of potential at-risk students or reveal attendance or assignment submission patterns that indicate the need for additional support.

What new types of educational data can now be captured and mined?

George Siemens: In terms of learning analytics or educational data-mining, the growing externalization of learning activity (i.e. capturing how learners interact with content and the discourse they have around learning materials as well as the social networks they form in the process) is driven by the increased attention to online learning. For example, a learning management system like Moodle or Desire2Learn captures a significant amount of data, including time spent on a resource, frequency of posting, number of logins, etc. This data is fairly similar to what Google Analytics or Piwik collects regarding website traffic. A new generation of tools, such as SNAPP, uses this data to analyze social networks, degrees of connectivity, and peripheral learners. Discourse analysis tools, such as those being developed at the Knowledge Media Institute at the Open University, UK, are also effective at evaluating the qualitative attributes of discourse and discussions and rate each learner’s contributions by depth and substance in relation to the topic of discussion.
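
As an illustration that is not part of the interview, the idea of "degrees of connectivity" and "peripheral learners" can be sketched from forum reply data. This is a minimal sketch, not how SNAPP or any named tool works: the `(author, replied_to)` pair format, the function names, the learner names, and the `max_degree` threshold are all assumptions made for the example.

```python
from collections import defaultdict


def degree_by_learner(replies, learners):
    """Count the distinct discussion partners of each learner.

    `replies` is a list of (author, replied_to) pairs, a hypothetical
    export format. Direction is ignored: a reply connects both people.
    """
    partners = defaultdict(set)
    for author, replied_to in replies:
        if author != replied_to:  # ignore self-replies
            partners[author].add(replied_to)
            partners[replied_to].add(author)
    # Learners who never posted still appear, with degree 0.
    return {name: len(partners[name]) for name in learners}


def peripheral_learners(replies, learners, max_degree=1):
    """Return learners with at most `max_degree` partners (assumed cutoff)."""
    degrees = degree_by_learner(replies, learners)
    return sorted(name for name, d in degrees.items() if d <= max_degree)


# Made-up example data.
learners = ["ana", "ben", "chloe", "dev", "eli"]
replies = [
    ("ana", "ben"),
    ("ben", "chloe"),
    ("chloe", "ana"),
    ("dev", "ana"),
    ("ana", "ben"),  # repeated exchanges do not add new partners
]

degrees = degree_by_learner(replies, learners)
peripheral = peripheral_learners(replies, learners)
print(degrees)
print(peripheral)
```

A count of distinct partners is only the simplest connectivity measure; it says nothing about the depth or quality of the exchanges, which is what the discourse analysis tools mentioned above aim at.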

An area of data gathering that universities and schools are largely overlooking relates to the distributed social interactions learners engage in on a daily basis through Facebook, blogs, Twitter, and similar tools. Of course, privacy issues are significant here. However, as we are researching at Athabasca University, social networks can provide valuable insight into how connected learners are to each other and to the university. Potential models are already being developed on the web that would translate well to school settings. For example, Klout measures influence within a network and Radian6 tracks discussions in distributed networks.

The existing data gathering in schools and universities pales in comparison to the value of data mining and learning analytics opportunities that exist in the distributed social and informational networks that we all participate in on a daily basis. It is here, I think, that most of the novel insights on learning and knowledge growth will occur. When we interact in a learning management system (LMS), we do so purposefully — to learn or to complete an assignment. Our interaction in distributed systems is more “authentic” and can yield novel insights into how we are connected, our sentiments, and our needs in relation to learning success. The challenge, of course, is how to balance concerns of the Hawthorne effect with privacy.

Discussions about data ownership and privacy lag well behind what is happening in learning analytics. Who owns learner-produced data? Who owns the analysis of that data? Who gets to see the results of analysis? How much should learners know about the data being collected and analyzed?

I believe that learners should have access to the same dashboard for analytics that educators and institutions see. Analytics can be a powerful tool in learner motivation — how do I compare to others in this class? How am I doing against the progress goals that I set? If data and analytics are going to be used for decision making in teaching and learning, then we need to have important conversations about who sees what and what are the power structures created by the rules we impose on data and analytics access.

How can analytics change education?

George Siemens: Education is, today at least, a black box. Society invests significantly in primary, secondary, and higher education. Unfortunately, we don’t really know how our inputs influence or produce outputs. We don’t know, precisely, which academic practices need to be curbed and which need to be encouraged. We are essentially swatting flies with a sledgehammer and doing a fair amount of peripheral damage.

Learning analytics are a foundational tool for informed change in education. Over the past decade, calls for educational reform have increased, but very little is understood about how the system of education will be impacted by the proposed reforms. I sometimes fear that the solution being proposed to what ails education will be worse than the current problem. We need a means, a foundation, on which to base reform activities. In the corporate sector, business intelligence serves this “decision foundation” role. In education, I believe learning analytics will serve this role. Once we better understand the learning process — the inputs, the outputs, the factors that contribute to learner success — then we can start to make informed decisions that are supported by evidence.

However, we have to walk a fine line in the use of learning analytics. On the one hand, analytics can provide valuable insight into the factors that influence learners’ success (time on task, attendance, frequency of logins, position within a social network, frequency of contact with faculty members or teachers). Peripheral data analysis could include the use of physical services in a school or university: access to library resources and learning help services. On the other hand, analytics can’t capture the softer elements of learning, such as the motivating encouragement from a teacher and the value of informal social interactions. In any assessment system, whether standardized testing or learning analytics, there is a real danger that the target becomes the object of learning, rather than the assessment of learning.

With that as a caveat, I believe learning analytics can provide dramatic, structural change in education. For example, today, our learning content is created in advance of the learners taking a course in the form of curriculum like textbooks. This process is terribly inefficient. Each learner has differing levels of knowledge when they start a course. An intelligent curriculum should adjust and adapt to the needs of each learner. We don’t need one course for 30 learners; each learner should have her own course based on her life experiences, learning pace, and familiarity with the topic. The content in the courses that we take should be adaptive, flexible, and continually updated. The black box of education needs to be opened and adapted to the requirements of each individual learner.

In terms of evaluation of learners, assessment should be in-process, not at the conclusion of a course in the form of an exam or a test. Let’s say we develop semantically-defined learning materials and ways to automatically compare learner-produced artifacts (in discussions, texts, papers) to the knowledge structure of a field. Our knowledge profile could then reflect how we compare to the knowledge architecture of a domain — i.e. “you are 64% on your way to being a psychologist” or “you are 38% on your way to being a statistician.” Basically, evaluation should be done based on a complete profile of an individual, not only the individual in relation to a narrowly defined subject area.
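
As an illustration that is not part of the interview, a "percent of the way" figure could in the crudest case be the share of a field's concepts that show up in a learner's writing. This is a deliberately naive sketch: the concept list, the substring matching, and the `coverage` function are assumptions for the example, and real semantically defined materials would need far more than keyword lookup.

```python
def coverage(domain_concepts, artifacts):
    """Return (fraction of concepts found, sorted list of missing concepts).

    `domain_concepts` is a set of phrases standing in for the knowledge
    structure of a field; `artifacts` is a list of learner-written texts.
    Matching is a plain case-insensitive substring test (an assumption).
    """
    text = " ".join(artifacts).lower()
    seen = {concept for concept in domain_concepts if concept.lower() in text}
    missing = sorted(domain_concepts - seen)
    return len(seen) / len(domain_concepts), missing


# Made-up concept map and learner texts.
statistics_concepts = {
    "mean",
    "variance",
    "regression",
    "hypothesis test",
    "confidence interval",
}
artifacts = [
    "I compared the mean and variance of both groups.",
    "A linear regression fit the data well.",
]

score, missing = coverage(statistics_concepts, artifacts)
print(f"{round(score * 100)}% of listed concepts found; missing: {missing}")
```

Mentioning a term is not the same as understanding it, so a number like this is at best a starting signal for the in-process assessment described above.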

Programs of study should also include non-school-related learning (prior learning assessment). A student who volunteers with a local charity or a student who plays sports outside of school is acquiring skills and knowledge that are currently ignored by the school system. “Whole-person analytics” is required where we move beyond the micro-focus of exams. For students who return to university mid-career to gain additional qualifications, recognition for non-academic learning is particularly important.

Much of the current focus on analytics relates to reducing attrition or student dropouts. This is the low-hanging fruit of analytics. An analysis of the signals learners generate (or fail to generate, such as when they don’t log in to a course) can provide early indications of which students are at risk of dropping out. By recognizing these students and offering early interventions, schools can reduce dropouts dramatically.
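
As an illustration that is not part of the interview, an early-warning check over the kinds of signals mentioned here (missing logins, missing submissions) might look like the sketch below. The `LearnerActivity` record, the field names, and both thresholds are hypothetical; any real cutoff would have to be validated against an institution's own data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LearnerActivity:
    """Hypothetical per-learner summary pulled from an LMS."""
    name: str
    last_login: date
    assignments_due: int
    assignments_submitted: int


def at_risk_reasons(activity, today, max_idle_days=14, min_submission_rate=0.5):
    """Return a list of warning reasons; an empty list means no flag.

    Both thresholds are assumed values chosen for the example.
    """
    reasons = []
    idle_days = (today - activity.last_login).days
    if idle_days > max_idle_days:
        reasons.append(f"no login for {idle_days} days")
    if activity.assignments_due > 0:
        rate = activity.assignments_submitted / activity.assignments_due
        if rate < min_submission_rate:
            reasons.append(
                f"submitted {activity.assignments_submitted} "
                f"of {activity.assignments_due} assignments"
            )
    return reasons


# Made-up roster.
today = date(2011, 3, 1)
roster = [
    LearnerActivity("ana", date(2011, 2, 27), 4, 4),
    LearnerActivity("ben", date(2011, 2, 1), 4, 1),
]

flags = {a.name: at_risk_reasons(a, today) for a in roster}
for name, reasons in flags.items():
    print(name, reasons or "no flags")
```

Returning the reasons rather than a bare yes/no keeps the flag explainable, which matters if learners are to see the same analytics that educators see.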

All of this is to say that learning analytics serve as a foundation for informed change in education, altering how schools and universities create curriculum, deliver it, assess student learning, provide learning support, and even allocate resources.

What technologies are behind learning analytics?

George Siemens: Some of the developments in learning analytics track the development of the web as a whole, including the use of recommender systems, social network analysis, personalization, and adaptive content. We are at an exciting cross-over point between innovations in the technology space and research in university labs. Language recognition, artificial intelligence, machine learning, neural networks, and related concepts are being combined with the growth of social network services, collaborative learning, and participatory pedagogy.

The combination of technical and social innovations in learning offers huge potential for a better, more effective learning model. Together with Stephen Downes and Dave Cormier, I’ve experimented with “massive open online courses” over the past four years. This experimentation has resulted in software that we’ve developed to encourage distributed learning, while still providing a loose level of aggregation that enables analytics. Tools like Open Study take a similar approach: decentralized learning, centralized analytics. Companies like Grockit and Knewton are creating personalized adaptive learning platforms. Not to be outdone, traditional publishers like Pearson and McGraw-Hill are investing heavily in adaptive learning content and are starting to partner with universities and schools to deliver the content and even evaluate learner performance. Learning management system providers (such as Desire2Learn and Blackboard) are actively building analytics options into their offerings.

Essentially, in order for learning analytics to have a broad impact in education, the focus needs to move well beyond basic analytics techniques such as those found in Google Analytics. An integrated learning and knowledge model is required where the learning content is adaptive, prior learning is included in assessment, and learning resources are provided in various contexts (e.g. “in class today you studied Ancient Roman laws; two blocks from where you are now, a museum is holding a special exhibit on Roman culture”). The profile of the learner, not pre-planned content, needs to drive curriculum and learning opportunities.

What are the major obstacles facing education data and analytics?

George Siemens: In spite of the enormous potential they hold to improve education, learning analytics are not without concerns. Privacy for learners and teachers is a critical issue. While I see analytics as a means to improve learner success, opportunities exist to use analytics to evaluate and critique the performance of teachers. Data access and ownership are equally important issues: who should be able to see the analysis that schools perform on learners? Other concerns relate to error-correction in analytics. If educators rely heavily on analytics, effort should be devoted to evaluating the analytics models and understanding in which contexts those analytics are not valid.

With regard to the adoption of learning analytics, now is an exceptionally practical time to explore analytics. The complex challenges that schools and universities face can, at least partially, be illuminated through analytics applications.