
LAK'15: 2015 International Conference on Learning Analytics and Knowledge

Fullname: Proceedings of the Fifth International Conference on Learning Analytics and Knowledge
Editors: Paulo Blikstein; Josh Baron; Agathe Merceron; Grace Lynch; Nicole Maziarz; George Siemens
Location: Poughkeepsie, New York
Dates: 2015-Mar-16 to 2015-Mar-20
Publisher: ACM
Standard No: ISBN: 978-1-4503-3417-4; ACM DL: Table of Contents; hcibib: LAK15
Papers: 84
Pages: 435
Links: Conference Website
  1. Indicators and tools for awareness
  2. Student engagement and behaviour
  3. MOOCs -- assessments, connections and demographics
  4. Practice across boundaries
  5. Institutional perspectives
  6. Students at risk
  7. Student performance
  8. Predicting achievement
  9. MOOCs -- discussion forums
  10. Off-task behaviour / Bayesian knowledge tracing
  11. Writing and discourse analysis
  12. Learning analytics tools and frameworks
  13. Theoretical foundations for learning analytics
  14. Text and discourse analysis
  15. Learning strategies and tools
  16. Alternative methods of improving learning
  17. Interventions and remediations
  18. Analyses with LMS data
  19. Tutoring systems
  20. Curricula, network and discourse analysis
  21. Multilevel, multimodal and network analysis
  22. Workshop
  23. Posters

Indicators and tools for awareness

The LATUX workflow: designing and deploying awareness tools in technology-enabled learning settings BIBAFull-Text 1-10
  Roberto Martinez-Maldonado; Abelardo Pardo; Negin Mirriahi; Kalina Yacef; Judy Kay; Andrew Clayphan
Designing, deploying and validating learning analytics tools for instructors or students is a challenge requiring techniques and methods from different disciplines, such as software engineering, human-computer interaction, educational design and psychology. Whilst each of these disciplines has consolidated design methodologies, there is a need for more specific methodological frameworks within the cross-disciplinary space defined by learning analytics. In particular, there is no systematic workflow for producing learning analytics tools that are both technologically feasible and truly underpin the learning experience. In this paper, we present the LATUX workflow, a five-stage workflow to design, deploy and validate awareness tools in technology-enabled learning environments. LATUX is grounded in a well-established design process for creating, testing and re-designing user interfaces. We extend this process by integrating the pedagogical requirements to generate visual analytics that inform instructors' pedagogical decisions or intervention strategies. The workflow is illustrated with a case study in which collaborative activities were deployed in a real classroom.
Learning analytics beyond the LMS: the connected learning analytics toolkit BIBAFull-Text 11-15
  Kirsty Kitto; Sebastian Cross; Zak Waters; Mandy Lupton
We present a Connected Learning Analytics (CLA) toolkit, which enables data to be extracted from social media and imported into a Learning Record Store (LRS), as defined by the new xAPI standard. A number of implementation issues are discussed, and a mapping that will enable the consistent storage and then analysis of xAPI verb/object/activity statements across different social media and online environments is introduced. A set of example learning activities are proposed, each facilitated by the Learning Analytics beyond the LMS that the toolkit enables.
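For readers unfamiliar with the xAPI statements the toolkit stores in an LRS, a minimal sketch of one such verb/object statement follows; the actor, activity ID and thread name are hypothetical, and only the verb IRI follows the standard ADL vocabulary.
```python
# A minimal sketch of an xAPI-style statement for a social-media learning event.
# Field values (actor, object IDs) are hypothetical illustrations.
import json

statement = {
    "actor": {
        "objectType": "Agent",
        "name": "Example Learner",
        "mbox": "mailto:learner@example.edu",
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/commented",
        "display": {"en-US": "commented"},
    },
    "object": {
        "objectType": "Activity",
        "id": "http://example.org/forum/thread/42",
        "definition": {"name": {"en-US": "Week 3 discussion thread"}},
    },
}

print(json.dumps(statement, indent=2))  # payload that would be POSTed to an LRS
```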
Developing an evaluation framework of quality indicators for learning analytics BIBAFull-Text 16-20
  Maren Scheffel; Hendrik Drachsler; Marcus Specht
This paper presents results from the continuous process of developing an evaluation framework of quality indicators for learning analytics (LA). Building on a previous study, a group concept mapping approach that used multidimensional scaling and hierarchical clustering, the study presented here applies the framework to a collection of LA tools in order to evaluate it. Using the quantitative and qualitative results of this study, the first version of the framework was revisited so as to work towards an improved version of the evaluation framework of quality indicators for LA.
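As a rough illustration of the two techniques underlying the group concept mapping step, the sketch below runs multidimensional scaling followed by hierarchical clustering on a toy similarity matrix; it is not the authors' sorting data or cluster solution.
```python
# Sketch: multidimensional scaling followed by hierarchical clustering,
# the two steps underlying group concept mapping. The similarity matrix is toy data.
import numpy as np
from sklearn.manifold import MDS
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
similarity = rng.random((10, 10))
similarity = (similarity + similarity.T) / 2       # symmetrise
np.fill_diagonal(similarity, 1.0)
dissimilarity = 1.0 - similarity                   # MDS expects dissimilarities

coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissimilarity)

Z = linkage(coords, method="ward")                 # hierarchical clustering on the map
clusters = fcluster(Z, t=3, criterion="maxclust")  # cut into 3 clusters
print(clusters)
```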

Student engagement and behaviour

Exploring networks of problem-solving interactions BIBAFull-Text 21-30
  Michael Eagle; Drew Hicks; Barry Peddycord, III; Tiffany Barnes
Intelligent tutoring systems and other computer-aided learning environments produce large amounts of transactional data on student problem-solving behavior. In previous work we modeled the student-tutor interaction data as a complex network, and successfully generated automated next-step hints as well as visualizations for educators. In this work we discuss the types of tutoring environments that are best modeled by interaction networks, and how the empirical observations of problem solving result in common network features. We find that interaction networks exhibit the properties of scale-free networks, such as vertex degree distributions that follow a power law. We compare data from two versions of a propositional logic tutor, as well as two different representations of data from an educational game on programming. We find that statistics such as degree assortativity and the scale-free metric allow comparison of the network structures across domains, and provide insight into student problem-solving behavior.
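The network statistics named here can be computed directly from an interaction graph; the sketch below does so on a synthetic Barabási-Albert graph, using one common definition of the scale-free metric (the sum of deg(u)·deg(v) over edges), not the authors' tutor data.
```python
# Sketch: network statistics of the kind used to compare interaction networks.
# The graph is a synthetic scale-free (Barabasi-Albert) graph, not tutor data.
import networkx as nx
from collections import Counter

G = nx.barabasi_albert_graph(n=500, m=2, seed=0)

# Degree distribution (candidate for a power-law fit)
degree_counts = Counter(dict(G.degree()).values())

# Degree assortativity: do high-degree vertices attach to other high-degree vertices?
assortativity = nx.degree_assortativity_coefficient(G)

# One common "scale-free metric": s(G) = sum over edges of deg(u) * deg(v)
s_metric = sum(G.degree(u) * G.degree(v) for u, v in G.edges())

print(degree_counts.most_common(5), assortativity, s_metric)
```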
Towards better affect detectors: effect of missing skills, class features and common wrong answers BIBAFull-Text 31-35
  Yutao Wang; Neil T. Heffernan; Cristina Heffernan
The well-studied Baker et al. affect detectors of boredom, frustration, confusion and engagement concentration, built on the ASSISTments dataset, were used to predict state test scores, college enrollment, and even whether a student majored in a STEM field. In this paper, we present three attempts to improve upon current affect detectors. The first attempt analyzed the effect of missing skill tags in the dataset on the accuracy of the affect detectors. The results show a small improvement after correctly tagging the missing skill values. The second attempt added four features related to student classes for feature selection. The third attempt added two features that described information about students' common wrong answers for feature selection. Results showed that two out of the four detectors were improved by adding the new features.
Exploring college major choice and middle school student behavior, affect and learning: what happens to students who game the system? BIBAFull-Text 36-40
  Maria O. Z. San Pedro; Ryan S. Baker; Neil T. Heffernan; Jaclyn L. Ocumpaugh
Choosing a college major is a major life decision. Interests stemming from students' ability and self-efficacy contribute to eventual college major choice. In this paper, we consider the role played by student learning, affect and engagement during middle school, using data from an educational software system used as part of regular schooling. We use predictive analytics to leverage automated assessments of student learning and engagement, investigating which of these factors are related to a chosen college major. For example, we already know that students who game the system in middle school mathematics are less likely to major in science or technology, but what majors are they more likely to select? Using data from 356 college students who used the ASSISTments system during their middle school years, we find significant differences in student knowledge, performance, and off-task and gaming behaviors between students who eventually choose different college majors.

MOOCs -- assessments, connections and demographics

On the validity of peer grading and a cloud teaching assistant system BIBAFull-Text 41-50
  Tim Vogelsang; Lara Ruppertz
We introduce a new grading system, the Cloud Teaching Assistant System (CTAS), as an additional element to instructor grading, peer grading and automated validation in massive open online courses (MOOCs). The grading distributions of the different approaches are compared in an experiment consisting of 476 exam participants. 25 submissions were graded by all four methods. 451 submissions were graded only by peer grading and automated validation. The results of the experiment suggest that both CTAS and peer grading do not simulate instructor grading (Pearson's correlations: 0.36, 0.39). If the CTAS and not the instructor is assumed to deliver accurate grading, peer grading is concluded to be a valid grading method (Pearson's correlation: 0.76).
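The comparison rests on Pearson correlations between the grade vectors produced by each method; a minimal sketch with invented grades:
```python
# Sketch: comparing grading methods by Pearson correlation, as in the experiment.
# The grade vectors below are invented placeholders, not the study's data.
from scipy.stats import pearsonr

instructor = [78, 65, 90, 55, 82, 70, 61, 88]
peer       = [70, 72, 85, 60, 75, 80, 58, 90]
ctas       = [80, 63, 88, 57, 84, 68, 65, 85]

r_peer_instructor, _ = pearsonr(peer, instructor)
r_ctas_instructor, _ = pearsonr(ctas, instructor)
r_peer_ctas, _       = pearsonr(peer, ctas)

print(r_peer_instructor, r_ctas_instructor, r_peer_ctas)
```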
Examining engagement: analysing learner subpopulations in massive open online courses (MOOCs) BIBAFull-Text 51-58
  Rebecca Ferguson; Doug Clow
Massive open online courses (MOOCs) are now being used across the world to provide millions of learners with access to education. Many learners complete these courses successfully, or to their own satisfaction, but the high numbers who do not finish remain a subject of concern for platform providers and educators. In 2013, a team from Stanford University analysed engagement patterns on three MOOCs run on the Coursera platform. They found four distinct patterns of engagement that emerged from MOOCs based on videos and assessments. However, not all platforms take this approach to learning design. Courses on the FutureLearn platform are underpinned by a social-constructivist pedagogy, which includes discussion as an important element. In this paper, we analyse engagement patterns on four FutureLearn MOOCs and find that only two clusters identified previously apply in this case. Instead, we see seven distinct patterns of engagement: Samplers, Strong Starters, Returners, Mid-way Dropouts, Nearly There, Late Completers and Keen Completers. This suggests that patterns of engagement in these massive learning environments are influenced by decisions about pedagogy. We also make some observations about approaches to clustering in this context.
Socioeconomic status and MOOC enrollment: enriching demographic information with external datasets BIBAFull-Text 59-63
  John D. Hansen; Justin Reich
To minimize barriers to entry, massive open online course (MOOC) providers collect minimal demographic information about users. In isolation, these data are insufficient to address important questions about socioeconomic status (SES) and MOOC enrollment and performance. We demonstrate the use of third-party datasets to enrich demographic portraits of MOOC students and answer fundamental questions about SES and MOOC enrollment. We derive demographic information from registrants' geographic location by matching self-reported mailing addresses with data available from Esri at the census block group level and the American Community Survey at the zip code level. We then use these data to compare neighborhood income and parental education for US registrants in HarvardX courses to the US population as a whole. Overall, HarvardX registrants tend to reside in more affluent neighborhoods. Registrants on average live in neighborhoods with median incomes approximately 0.45 standard deviations higher than those of the US population. Higher levels of parental education are also associated with a higher likelihood of registration.
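The enrichment step amounts to joining self-reported locations against census-level tables and standardizing against a national figure; a rough pandas sketch, with hypothetical column names, zip codes and income figures:
```python
# Sketch: enriching minimal MOOC registration data with neighborhood-level
# census figures via a zip-code join. All values and column names are hypothetical.
import pandas as pd

registrants = pd.DataFrame({
    "user_id": [1, 2, 3],
    "zip_code": ["02138", "10027", "48104"],
})

acs = pd.DataFrame({                      # stand-in for ACS zip-code-level data
    "zip_code": ["02138", "10027", "48104"],
    "median_income": [89000, 52000, 61000],
})

enriched = registrants.merge(acs, on="zip_code", how="left")

US_MEDIAN, US_SD = 53000, 26000           # illustrative national figures only
enriched["income_z"] = (enriched["median_income"] - US_MEDIAN) / US_SD
print(enriched)
```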
How do you connect?: analysis of social capital accumulation in connectivist MOOCs BIBAFull-Text 64-68
  Srecko Joksimovic; Nia Dowell; Oleksandra Skrypnyk; Vitomir Kovanovic; Dragan Gaševic; Shane Dawson; Arthur C. Graesser
Connections established between learners via interactions are seen as fundamental for connectivist pedagogy. Connections can also be viewed as learning outcomes, i.e. learners' social capital accumulated through distributed learning environments. We applied linear mixed effects modeling to investigate whether social capital accumulation, interpreted through learners' centrality in course interaction networks, is influenced by the language learners use to express and communicate in two connectivist MOOCs. Interactions were distributed across three social media platforms, namely Twitter, blogs and Facebook. Results showed that learners in a cMOOC connect more easily with individuals who use a more informal, narrative style, but still maintain a deeper cohesive structure to their communication.
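A linear mixed effects model of this general shape (centrality as the outcome, language measures as fixed effects, a random intercept per grouping unit) can be sketched with statsmodels; the data, variable names and grouping below are hypothetical, not the study's.
```python
# Sketch: a linear mixed effects model relating network centrality to language
# features, with a random intercept per learner. Data and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "centrality": rng.random(n),
    "narrativity": rng.normal(size=n),   # e.g. a style measure of the learner's posts
    "cohesion": rng.normal(size=n),
    "learner": rng.choice([f"learner_{i}" for i in range(40)], size=n),
})

model = smf.mixedlm("centrality ~ narrativity + cohesion", df, groups=df["learner"])
result = model.fit()
print(result.summary())
```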

Practice across boundaries

Learning analytics: European perspectives BIBAFull-Text 69-72
  Rebecca Ferguson; Adam Cooper; Hendrik Drachsler; Gábor Kismihók; Anne Boyer; Kairit Tammets; Alejandra Martínez Monés
Since the emergence of learning analytics in North America, researchers and practitioners have worked to develop an international community. The organization of events such as SoLAR Flares and LASI Locals, as well as the move of LAK in 2013 from North America to Europe, has supported this aim. There are now thriving learning analytics groups in North America, Europe and Australia, with smaller pockets of activity emerging on other continents. Nevertheless, much of the work carried out outside these forums, or published in languages other than English, is still inaccessible to most people in the community. This panel, organized by Europe's Learning Analytics Community Exchange (LACE) project, brings together researchers from five European countries to examine the field from European perspectives. In doing so, it will identify the benefits and challenges associated with sharing and developing practice across national boundaries.
OpenCourseWare observatory: does the quality of OpenCourseWare live up to its promise? BIBAFull-Text 73-82
  Sahar Vahdati; Christoph Lange; Sören Auer
A vast amount of OpenCourseWare (OCW) is now being published online to make educational content accessible to larger audiences. The awareness of such courses among users and the popularity of systems providing such courses are increasing. However, in our subjective experience, OCW is frequently cursory, outdated or non-reusable. In order to obtain a better understanding of the quality of OCW, we assess quality in terms of fitness for use. Based on three OCW use case scenarios, we define a range of dimensions according to which the quality of courses can be measured. From the definition of each dimension a comprehensive list of quality metrics is derived. In order to obtain a representative overview of the quality of OCW, we performed a quality assessment on a set of 100 randomly selected courses obtained from 20 different OCW repositories. Based on this assessment we identify crucial areas in which OCW needs to improve in order to live up to its promise.

Institutional perspectives

Student privacy self-management: implications for learning analytics BIBAFull-Text 83-92
  Paul Prinsloo; Sharon Slade
Optimizing the harvesting and analysis of student data promises to clear the fog surrounding the key drivers of student success and retention, and provide potential for improved student success. At the same time, concerns are increasingly voiced around the extent to which individuals are routinely and progressively tracked as they engage online. The Internet, the very thing that promised to open up possibilities and to break down communication barriers, now threatens to narrow them again through the panopticon of mass surveillance.
   Within higher education, our assumptions and understanding of issues surrounding student attitudes to privacy are influenced both by the apparent ease with which the public appear to share the detail of their lives and our paternalistic institutional cultures. As such, it can be easy to allow our enthusiasm for the possibilities offered by learning analytics to outweigh consideration of issues of privacy.
   This paper explores issues around consent and the seemingly simple choice to allow students to opt in to or out of having their data tracked. We consider how three providers of massive open online courses (MOOCs) inform users of how their data are used, and discuss how higher education institutions can work toward an approach which engages and more fully informs students of the implications of learning analytics for their personal data.

Students at risk

Who, when, and why: a machine learning approach to prioritizing students at risk of not graduating high school on time BIBAFull-Text 93-102
  Everaldo Aguiar; Himabindu Lakkaraju; Nasir Bhanpuri; David Miller; Ben Yuhas; Kecia L. Addison
Several hundred thousand students drop out of high school every year in the United States. Interventions can help those who are falling behind in their educational goals, but given limited resources, such programs must focus on the right students, at the right time, and with the right message. In this paper, we describe an incremental approach that can be used to select and prioritize students who may be at risk of not graduating high school on time, and to suggest what may be the predictors of particular students going off-track. These predictions can then be used to inform targeted interventions for these students, hopefully leading to better outcomes.
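A minimal sketch of the general pattern described (train a classifier on a historical cohort, then rank the current cohort by predicted risk so limited intervention resources reach the highest-risk students first); the features, model and labels are generic stand-ins, not the authors' system.
```python
# Sketch: rank students by predicted risk of not graduating on time so that
# interventions can be prioritized. Features and data are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(7)
n = 1000
X = np.column_stack([
    rng.normal(2.5, 0.7, n),      # GPA
    rng.poisson(5, n),            # absences
    rng.integers(0, 2, n),        # prior retention flag
])
y = (X[:, 0] < 2.0) | (X[:, 1] > 9)          # synthetic "off-track" label

model = GradientBoostingClassifier().fit(X[:800], y[:800])

risk = model.predict_proba(X[800:])[:, 1]     # risk score for the current cohort
priority_order = np.argsort(-risk)            # highest-risk students first
print(priority_order[:10])
```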

Student performance

Collaborative multi-regression models for predicting students' performance in course activities BIBAFull-Text 103-107
  Asmaa Elbadrawy; R. Scott Studham; George Karypis
Methods that accurately predict the grade of a student on a given activity or course can identify students that are at risk of failing a course and allow their educational institution to take corrective actions. Though a number of prediction models have been developed, they either estimate a single model for all students based on their past course performance and interactions with learning management systems (LMS), or estimate student-specific models that do not take into account LMS interactions, thus failing to exploit fine-grain information related to a student's engagement. In this work we present a class of collaborative multi-regression models that are personalized to each student and also take into account features related to a student's past performance, engagement and course characteristics. These models use all historical information to estimate a small number of regression models shared by all students along with student-specific combination weights. This allows for information sharing while also generating personalized predictions. Our experimental evaluation on a large set of students, courses, and activities shows that these models are capable of improving the performance prediction accuracy by over 20%. In addition, we show that by analyzing the estimated models and the student-specific combination functions we can gain insights into the effectiveness of the educational material that is made available in the courses of different departments.
Investigating performance of students: a longitudinal study BIBAFull-Text 108-112
  Raheela Asif; Agathe Merceron; Mahmood Khan Pathan
This paper investigates how the academic performance of students evolves over the years of a study programme. To determine typical progression patterns over the years, students are described by a 4-tuple (x1, x2, x3, x4), these being the means of the clusters to which a student belongs in each year of the degree. For this purpose, two consecutive cohorts have been analyzed using X-means clustering. Interestingly, the patterns found in both cohorts show that a substantial number of students stay in the same kind of groups throughout their studies.
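A rough sketch of the progression-pattern idea: cluster students separately in each year and describe each student by the 4-tuple of cluster means they fall into. Standard k-means is used below as a stand-in for X-means, and the marks are synthetic.
```python
# Sketch: describe each student by a 4-tuple of per-year cluster means.
# KMeans is used here as a simple stand-in for X-means; marks are synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
marks = rng.uniform(40, 95, size=(300, 4))    # average mark per student per year

tuples = np.zeros_like(marks)
for year in range(4):
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(marks[:, [year]])
    # replace each student's mark with the mean of the cluster they fall into
    tuples[:, year] = km.cluster_centers_[km.labels_, 0]

print(tuples[:5].round(1))                    # e.g. (x1, x2, x3, x4) per student
```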
Using transaction-level data to diagnose knowledge gaps and misconceptions BIBAFull-Text 113-117
  Randall Davies; Rob Nyland; John Chapman; Gove Allen
The role of assessment in learning is to evaluate student comprehension and ability. Assessment instruments often function at the task level. What is rarely considered is the process students go through to reach the final solution. This often allows knowledge component gaps and misconceptions to go undetected. This research identified higher levels of knowledge component gaps and misunderstandings when assessing transaction-level knowledge component data than task-level final solution data. Final solution data showed little evidence that students had any misunderstanding or knowledge gaps about the use of absolute references. However, when analyzing these data at the transaction level we found evidence that far more students struggled than the analysis of the final solutions suggested.

Predicting achievement

Estimation of ability from homework items when there are missing and/or multiple attempts BIBAFull-Text 118-125
  Yoav Bergner; Kimberly Colvin; David E. Pritchard
Scoring of student item response data from online courses, and especially massive open online courses (MOOCs), is complicated by two challenges: potentially large amounts of missing data and allowances for multiple attempts to answer. Approaches to ability estimation with respect to both of these issues are considered using data from a large-enrollment electrical engineering MOOC. The allowance of unlimited multiple attempts sets up a range of observed-score and latent-variable approaches to scoring the constructed-response homework. With respect to missing data, two classical approaches are discussed: treating omitted items as incorrect or as missing at random (MAR). These treatments turn out to have slightly different interpretations depending on the scoring model. In all, twelve different homework scores are proposed based on combinations of scoring model and missing data handling. The scores are computed and correlations between each score and the final exam score are compared, with attention to different populations of course participants.
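The two classical missing-data treatments can be contrasted with a simple observed-score sketch: score omitted items as incorrect versus average only over attempted items (MAR-style), then correlate each score with the final exam. The data are synthetic and no latent-variable model is fitted.
```python
# Sketch: two observed-score treatments of missing homework items --
# omitted-as-incorrect versus missing-at-random (average over attempted items only).
# Responses and exam scores are synthetic; no IRT model is fitted here.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(5)
n_students, n_items = 300, 40
responses = rng.integers(0, 2, size=(n_students, n_items)).astype(float)
responses[rng.random((n_students, n_items)) < 0.3] = np.nan   # ~30% omitted

score_incorrect = np.nan_to_num(responses, nan=0.0).mean(axis=1)   # omit = wrong
score_mar = np.nanmean(responses, axis=1)                           # omit ignored

final_exam = 0.6 * score_mar + 0.4 * rng.random(n_students)          # synthetic

print(pearsonr(score_incorrect, final_exam)[0],
      pearsonr(score_mar, final_exam)[0])
```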
A time series interaction analysis method for building predictive models of learners using log data BIBAFull-Text 126-135
  Christopher Brooks; Craig Thompson; Stephanie Teasley
As courses become bigger, move online, and are deployed to the general public at low cost (e.g. through Massive Open Online Courses, MOOCs), new methods of predicting student achievement are needed to support the learning process. This paper presents a novel method for converting educational log data into features suitable for building predictive models of student success. Unlike cognitive modelling or content analysis approaches, these models are built from interactions between learners and resources, an approach that requires no input from instructional or domain experts and can be applied across courses or learning environments.
Predicting success: how learners' prior knowledge, skills and activities predict MOOC performance BIBAFull-Text 136-140
  Gregor Kennedy; Carleton Coffrin; Paula de Barba; Linda Corrin
While MOOCs have taken the world by storm, questions remain about their pedagogical value and high rates of attrition. In this paper we argue that MOOCs, which have open entry and open curriculum structures, place pressure on learners not only to have the requisite knowledge and skills to complete the course, but also the skills to traverse the course in adaptive ways that lead to success. The empirical study presented in the paper investigated the degree to which students' prior knowledge and skills, and their engagement with the MOOC as measured through learning analytics, predict end-of-MOOC performance. The findings indicate that prior knowledge is the most significant predictor of MOOC success, followed by students' ability to revise and revisit their previous work.
Likelihood analysis of student enrollment outcomes using learning environment variables: a case study approach BIBAFull-Text 141-145
  Scott Harrison; Renato Villano; Grace Lynch; George Chen
Tertiary institutions are increasing the emphasis on generating, collecting and analyzing student data as a means of targeting student support services. This study utilizes a data set from a regional Australian university to conduct a logistic regression analysis of student enrollment outcomes. The results indicate that demographic factors have a minor effect, while institutional and learning environment variables play a more significant role in determining student enrollment outcomes. Using grade distribution rather than grade point average provides better estimates of the effect particular grades have on enrollment outcomes. Moreover, analysis of an early alert system shows that early identification has a significant relationship to a student's choice to stay enrolled versus discontinuing, lapsing or becoming inactive in their enrollment. These results are vital in the targeting of student support services at the case study institution. The significant results indicate the importance of learning environment variables in understanding student enrollment outcomes at tertiary institutions. This analysis forms part of a much larger research project analyzing student retention at the institution.
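A minimal sketch of a logistic regression of enrollment outcomes on learning environment and demographic variables; the dataset, variable names and coding are hypothetical stand-ins for the institutional data described.
```python
# Sketch: logistic regression of an enrollment outcome on demographic and
# learning environment variables. Data, variables and coding are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 800
df = pd.DataFrame({
    "retained": rng.integers(0, 2, n),            # 1 = still enrolled
    "age": rng.integers(18, 45, n),
    "gpa": rng.uniform(1.5, 4.0, n),
    "early_alert": rng.integers(0, 2, n),         # flagged by an early alert system
    "online_mode": rng.integers(0, 2, n),
})

model = smf.logit("retained ~ age + gpa + early_alert + online_mode", data=df).fit()
print(model.summary())
```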

MOOCs -- discussion forums

Unsupervised modeling for understanding MOOC discussion forums: a learning analytics approach BIBAFull-Text 146-150
  Aysu Ezen-Can; Kristy Elizabeth Boyer; Shaun Kellogg; Sherry Booth
Massively Open Online Courses (MOOCs) have gained attention recently because of their great potential to reach learners. Substantial empirical study has focused on student persistence and their interactions with the course materials. However, most MOOCs include a rich textual dialogue forum, and these textual interactions are largely unexplored. Automatically understanding the nature of discussion forum posts holds great promise for providing adaptive support to individual students and to collaborative groups. This paper presents a study that applies unsupervised student understanding models originally developed for synchronous tutorial dialogue to MOOC forums. We use a clustering approach to group similar posts, compare the clusters with manual annotations by MOOC researchers, and further investigate clusters qualitatively. This paper constitutes a step toward applying unsupervised models to asynchronous communication, which can enable massive-scale automated discourse analysis and mining to better support students' learning.
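A minimal sketch of grouping forum posts without labels: vectorize the text and cluster it. TF-IDF plus k-means is used as a generic stand-in for the paper's dialogue-act models, and the posts are invented.
```python
# Sketch: unsupervised grouping of forum posts by textual similarity.
# TF-IDF + k-means is a generic stand-in, not the paper's specific model;
# the posts are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

posts = [
    "Can someone explain the difference between recall and precision?",
    "I am stuck on assignment 2, the grader keeps rejecting my file.",
    "Thanks for the great explanation, that cleared things up!",
    "What format does the grader expect for assignment 2?",
    "Precision is about false positives, recall about false negatives.",
    "Really enjoying this course so far, thank you all.",
]

X = TfidfVectorizer(stop_words="english").fit_transform(posts)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

for post, label in zip(posts, labels):
    print(label, post[:50])
```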
Crowd-sourced learning in MOOCs: learning analytics meets measurement theory BIBAFull-Text 151-155
  Sandra Milligan
This paper illustrates the promise of combining measurement theory and learning analytics for understanding effective MOOC learning. It reports findings from a study of whether and how MOOC log file data can assist in understanding how MOOC participants use (often) messy, chaotic forums to support complex, unpredictable, contingent learning processes. It is argued that descriptions of posting, voting and viewing behaviours do not in and of themselves provide insights about how learning is generated in MOOC forums. Rather, it is hypothesised that there is a skill involved in using forums to learn; that theory-informed descriptions of this skill illustrate how MOOC participants use forums differently as they progress from novice to expert; that the skill progression can be validated through the use of forum log file data; and that log file data can also be used to assess an individual MOOC participant's position in relation to this progression -- that is, to measure an individual's skill in learning through forums and similar educational settings. These hypotheses were examined using data drawn from forums in a large MOOC run at the University of Melbourne in 2013.
What do cMOOC participants talk about in social media?: a topic analysis of discourse in a cMOOC BIBAFull-Text 156-165
  Srecko Joksimovic; Vitomir Kovanovic; Jelena Jovanovic; Amal Zouaq; Dragan Gaševic; Marek Hatala
Creating meaning from a wide variety of available information and being able to choose what to learn are highly relevant skills for learning in a connectivist setting. In this work, various approaches have been utilized to gain insights into the learning processes occurring within a network of learners and to understand the factors that shape learners' interests and the topics to which learners devote significant attention. This study combines different methods to develop a scalable analytic approach for a comprehensive analysis of learners' discourse in a connectivist massive open online course (cMOOC). By linking techniques for semantic annotation and graph analysis with a qualitative analysis of learner-generated discourse, we examined how social media platforms (blogs, Twitter, and Facebook) and course recommendations influence content creation and the topics discussed within a cMOOC. Our findings indicate that learners tend to focus on several prominent topics that emerge very quickly in the course. They maintain that focus, with some exceptions, throughout the course, regardless of the readings suggested by the instructor. Moreover, the topics discussed across different social media differ, which can likely be attributed to the affordances of the different media. Finally, our results indicate a relatively low level of cohesion in the topics discussed, which might be an indicator of the diversity of conceptual coverage among the course participants.

Off-task behaviour / Bayesian knowledge tracing

Tracking student progress in a game-like learning environment with a Monte Carlo Bayesian knowledge tracing model BIBAFull-Text 166-170
  G.-H. Gweon; Hee-Sun Lee; Chad Dorsey; Robert Tinker; William Finzer; Daniel Damelin
The Bayesian Knowledge Tracing (BKT) model is a popular model used for tracking student progress in learning systems such as an intelligent tutoring system. However, the model is not free of problems. Well-recognized problems include the identifiability problem and the empirical degeneracy problem. Unfortunately, these problems are still poorly understood and how they should be dealt with in practice is unclear. Here, we analyze the mathematical structure of the BKT model, identify a source of the difficulty, and construct a simple Monte Carlo BKT model to analyze the problem in real data. Using the student activity data obtained from the ramp task module at the Concord Consortium, we find that the Monte Carlo BKT analysis is capable of detecting the identifiability problem and the empirical degeneracy problem, and, more generally, gives an excellent summary of the student learning data. In particular, the student activity monitoring parameter M emerges as the central parameter.
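For reference, the standard BKT update underlying any such analysis, given parameters p(L0), p(T), p(G), p(S), plus a crude Monte Carlo sweep over random parameter sets scored by likelihood on one synthetic response sequence; this illustrates the mechanics only and is not the authors' specific model.
```python
# Sketch: standard Bayesian Knowledge Tracing update, plus a crude Monte Carlo
# sweep over random parameter sets scored by likelihood on one synthetic sequence.
# This illustrates BKT mechanics, not the paper's specific Monte Carlo BKT model.
import numpy as np

def bkt_log_likelihood(obs, p_L0, p_T, p_G, p_S):
    """Log-likelihood of a 0/1 response sequence under BKT."""
    p_L, ll = p_L0, 0.0
    for correct in obs:
        p_correct = p_L * (1 - p_S) + (1 - p_L) * p_G
        ll += np.log(p_correct if correct else 1 - p_correct)
        # posterior that the skill was known, given the observation
        if correct:
            p_known = p_L * (1 - p_S) / p_correct
        else:
            p_known = p_L * p_S / (1 - p_correct)
        p_L = p_known + (1 - p_known) * p_T      # learning transition
    return ll

obs = [0, 0, 1, 0, 1, 1, 1, 1]                    # synthetic response sequence
rng = np.random.default_rng(0)
L0 = rng.uniform(0.01, 0.99, 5000)
T  = rng.uniform(0.01, 0.99, 5000)
G  = rng.uniform(0.01, 0.49, 5000)                # bounded guess/slip candidates
S  = rng.uniform(0.01, 0.49, 5000)
samples = np.column_stack([L0, T, G, S])
scores = [bkt_log_likelihood(obs, *s) for s in samples]
print(samples[int(np.argmax(scores))])            # best-fitting parameter set
```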
How does Bayesian knowledge tracing model emergence of knowledge about a mechanical system? BIBAFull-Text 171-175
  Hee-Sun Lee; Gey-Hong Gweon; Chad Dorsey; Robert Tinker; William Finzer; Daniel Damelin; Nathan Kimball; Amy Pallant; Trudi Lord
An interactive learning task was designed in a game format to help high school students acquire knowledge about a simple mechanical system involving a car moving on a ramp. This ramp game consisted of five challenges that addressed individual knowledge components with increasing difficulty. In order to investigate patterns of knowledge emergence during the ramp game, we applied the Monte Carlo Bayesian Knowledge Tracing (BKT) algorithm to 447 game segments produced by 64 student groups in two physics teachers' classrooms. Results indicate that, in the ramp game context, (1) the initial knowledge and guessing parameters were highly and significantly correlated, (2) the slip parameter was interpretable monotonically, (3) low guessing parameter values were associated with knowledge emergence while high guessing parameter values were associated with knowledge maintenance, and (4) the transition parameter showed the speed of knowledge emergence. By applying k-means clustering to ramp game segments represented in the three-dimensional space defined by the guessing, slip, and transition parameters, we identified seven clusters of knowledge emergence. We characterize these clusters and discuss implications for future research as well as for instructional game design.
Learning analytics in outer space: a Hidden Naïve Bayes model for automatic student off-task behavior detection BIBAFull-Text 176-183
  Wanli Xing; Sean Goggins
Learning analytics (LA) has invested much effort in the investigation of students' behavior and performance within learning systems. This paper expands the influence of LA to students' behavior outside of learning systems and describes a novel machine learning model which automatically detects students' off-task behavior as they interact with a learning system, ASSISTments, based solely on log file data. We first operationalize social cognitive theory to introduce two new variables, affect states and problem set, both of which can be automatically derived from the logs and can be considered to have a major influence on students' behavior. These two variables further serve as the feature vector data for a K-means clustering algorithm in order to quantify students' different behavioral characteristics. This quantified variable representing student behavior type expands the feature space and improves model performance compared with using only time- and performance-related features. In addition, an advanced Hidden Naïve Bayes (HNB) algorithm is implemented for off-task behavior detection and shows the best performance compared with traditional modeling techniques. Implications of the study are then discussed.
Penetrating the black box of time-on-task estimation BIBAFull-Text 184-193
  Vitomir Kovanovic; Dragan Gaševic; Shane Dawson; Srecko Joksimovic; Ryan S. Baker; Marek Hatala
All forms of learning take time. There is a large body of research suggesting that the amount of time spent on learning can improve the quality of learning, as represented by academic performance. The widespread adoption of learning technologies such as learning management systems (LMSs) has resulted in large amounts of data about student learning being readily accessible to educational researchers. One common use of this data is to measure the time that students have spent on different learning tasks (i.e., time-on-task). Given that LMSs typically only capture the times when students executed various actions, time-on-task measures are estimated from the recorded trace data. LMS trace data has been extensively used in many studies in the field of learning analytics, yet the problem of time-on-task estimation is rarely described in detail and the consequences that it entails are not fully examined.
   This paper presents the results of a study that examined the effects of different time-on-task estimation methods on the results of commonly adopted analytical models. The primary goal of this paper is to raise awareness of the issue of accuracy and appropriateness surrounding time-estimation within the broader learning analytics community, and to initiate a debate about the challenges of this process. Furthermore, the paper provides an overview of time-on-task estimation methods in educational and related research fields.
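A typical estimation method of the kind surveyed takes differences between consecutive event timestamps and caps implausibly long gaps; a small sketch of that heuristic, with an arbitrary 30-minute cutoff and invented events:
```python
# Sketch: a common heuristic for time-on-task estimation from LMS trace data --
# sum gaps between consecutive events per student, capping long gaps at a cutoff.
# The 30-minute cutoff and the events below are arbitrary illustrations.
from datetime import datetime
from collections import defaultdict

CUTOFF_SECONDS = 30 * 60

events = [  # (student, timestamp) pairs, assumed sorted per student
    ("s1", "2015-03-16 10:00:00"), ("s1", "2015-03-16 10:05:00"),
    ("s1", "2015-03-16 12:00:00"),  # long gap -> capped
    ("s2", "2015-03-16 09:00:00"), ("s2", "2015-03-16 09:20:00"),
]

by_student = defaultdict(list)
for student, ts in events:
    by_student[student].append(datetime.fromisoformat(ts))

time_on_task = {}
for student, stamps in by_student.items():
    gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    time_on_task[student] = sum(min(g, CUTOFF_SECONDS) for g in gaps)

print(time_on_task)   # estimated seconds on task per student
```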

Writing and discourse analysis

You've got style: detecting writing flexibility across time BIBAFull-Text 194-202
  Erica L. Snow; Laura K. Allen; Matthew E. Jacovina; Cecile A. Perret; Danielle S. McNamara
Writing researchers have suggested that students who are perceived as strong writers (i.e., those who generate texts that are rated as high quality) demonstrate flexibility in their writing style. While anecdotally this has been a commonly held belief among researchers, scientists, and educators, there is little empirical research to support this claim. This study investigates this hypothesis by examining how students vary in their use of linguistic features across 16 prompt-based essays. Forty-five high school students wrote 16 essays across 8 sessions within an Automated Writing Evaluation (AWE) system. Natural language processing (NLP) techniques and entropy analyses were used to calculate how rigid or flexible students were in their use of narrative linguistic features over time and how this trait related to individual differences in literacy ability and essay quality. Additional analyses indicated that NLP and entropy analyses reliably detected narrative flexibility (or rigidity) after session 2 and that this flexibility was related to students' prior literacy skills. These exploratory methodologies are important for researchers and educators, as they indicate that writing flexibility is indeed a trait of strong writers and can be detected rather quickly using a combination of textual features and dynamic analyses.
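The flexibility/rigidity idea rests on entropy over a student's distribution of stylistic choices: a writer who always produces the same style has low entropy, a flexible writer a higher one. A tiny sketch with hypothetical per-essay style labels:
```python
# Sketch: Shannon entropy as a simple flexibility index over the dominant
# linguistic style a student exhibits in each essay. Labels are hypothetical.
import numpy as np
from collections import Counter

def style_entropy(styles):
    counts = np.array(list(Counter(styles).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rigid_writer    = ["narrative"] * 14 + ["expository"] * 2
flexible_writer = ["narrative"] * 6 + ["expository"] * 5 + ["descriptive"] * 5

print(style_entropy(rigid_writer))     # low entropy  -> rigid style use
print(style_entropy(flexible_writer))  # higher entropy -> flexible style use
```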
Pssst... textual features... there is more to automatic essay scoring than just you! BIBAFull-Text 203-207
  Scott Crossley; Laura K. Allen; Erica L. Snow; Danielle S. McNamara
This study investigates a new approach to automatically assessing essay quality that combines traditional approaches based on assessing textual features with new approaches that measure student attributes such as demographic information, standardized test scores, and survey results. The results demonstrate that combining both text features and student attributes leads to essay scoring models that are on par with state-of-the-art scoring models. Such findings expand our knowledge of textual and non-textual features that are predictive of writing success.
OpenEssayist: a supply and demand learning analytics tool for drafting academic essays BIBAFull-Text 208-212
  Denise Whitelock; Alison Twiner; John T. E. Richardson; Debora Field; Stephen Pulman
This paper focuses on the use of a natural language analytics engine to provide feedback to students when preparing an essay for summative assessment. OpenEssayist is a real-time learning analytics tool, which operates through the combination of a linguistic analysis engine that processes the text in the essay, and a web application that uses the output of the linguistic analysis engine to generate the feedback. We outline the system itself and present analysis of observed patterns of activity as a cohort of students engaged with the system for their module assignments. We report a significant positive correlation between the number of drafts submitted to the system and the grades awarded for the first assignment. We can also report that this cohort of students gained significantly higher overall grades than the students in the previous cohort, who had no access to OpenEssayist. As a system that is content free, OpenEssayist can be used to support students working in any domain that requires the writing of essays.

Learning analytics tools and frameworks

DOP8: merging both data and analysis operators life cycles for technology enhanced learning BIBAFull-Text 213-217
  Nadine Mandran; Michael Ortega; Vanda Luengo; Denis Bouhineau
This paper presents DOP8, a data mining iterative cycle that improves on the classical data life cycle. While the latter only combines the data production and data analysis phases, DOP8 also integrates the analysis operators' life cycle. In this cycle, the data life cycle and the operators' life cycle meet in the data analysis step. This paper also presents a reification of DOP8 in a new computing platform, UnderTracks, which provides flexibility in storing and sharing data, operators and analysis processes. UnderTracks is compared with three types of platform: 'Storage platform', 'Analysis platform' and 'Storage and Analysis platform'. Several real TEL analysis scenarios are presented in the platform, (1) to test UnderTracks' flexibility in storing data and operators and (2) to test UnderTracks' flexibility in designing analysis processes.
A handwriting recognition system for the classroom BIBAFull-Text 218-222
  Eric Gross; Safwan Wshah; Isaiah Simmons; Gary Skinner
The Xerox Ignite™ Educator Support System (henceforth referred to simply as Ignite™) is a data collection, analysis, and visualization workflow and software solution to assist K-12 educators. To illustrate, suppose a third-grade teacher wants to know how well her class has grasped a lesson on fractions. She would first scan her students' homework and/or exams into the Ignite system via a range of multifunctional input devices. Xerox Ignite™ reads, interprets, and analyzes the students' work in minutes. Then the teacher can select how to view the data by choosing from numerous reports. Examples are: an "at a glance" class summary that shows who needs extra help in what areas and who is ready to move on; a "context" report showing how each skill for each student is progressing over time; a grade-level performance report that helps third-grade teachers share best practices and cluster students into learning groups; and a student feedback report that tells each student what he/she needs to improve upon. Ignite™'s intent is also to make it easier for districts to administer, score and evaluate content based on academic goals set for schools and students. The scanning and 'mark lifting' technology embedded in Ignite™ reduces the time needed to correct papers and frees time for the teacher to apply detailed insights to their day-to-day instruction tasks. Critical to this function is the automated reading of student marks, including handwriting, to enable the digitization of student performance at a detailed level. In this paper we present a system-level description of the Ignite™ handwriting recognition module and describe the challenges and opportunities presented in an educational environment.

Theoretical foundations for learning analytics

Critical realism and learning analytics research: epistemological implications of an ontological foundation BIBAFull-Text 223-230
  Tim Rogers
Learning analytics is a broad church that incorporates a range of topics and methodologies. As the field has developed some tension has emerged regarding a perceived contradiction between the implied constructivist ethos of the field and prevalent empirical practices that have been characterised as 'behaviourist' and 'positivist'. This paper argues that this tension is a sign of deeper metatheoretical faultlines that have plagued the social sciences more broadly. Critical realism is advanced as a philosophy of science that can help reconcile the apparent contradictions between the constructivist aims and the empirical practices of learning analytics and simultaneously can justify learning analytics' current methodological tolerance. The paper concludes that learning analytics, arrayed in realist terms, is essentially longitudinal and multimethodological, concerned with the socio-technical systems of learning and the problems of implementation, and has the potential to be emancipatory. Some methodological implications for learning analytics practice are discussed.

Text and discourse analysis

Topic facet modeling: semantic visual analytics for online discussion forums BIBAFull-Text 231-235
  I-Han Hsiao; Piyush Awasthi
In this paper, we propose a novel Topic Facet Model (TFM), a probabilistic topic model that assumes all words in a single sentence are generated from one topic facet. The model is applied to automatically extract forum post semantics for uncovering latent content structures. We further prototype a visual analytics interface to present online discussion forum semantics. We hypothesize that semantic modeling through analytics on open online discussion forums can help users examine post content by viewing the summarized topic facets. Our preliminary results demonstrate that TFM can be a promising method to extract topic specificity from conversational and relatively short texts in online programming discussion forums.
Effects of sequences of socially regulated learning on group performance BIBAFull-Text 236-240
  Inge Molenaar; Ming Ming Chiu
Past research shows that regulative activities (metacognitive or relational) can aid learning and that sequences of cognitive, metacognitive and relational activities affect subsequent cognition. Extending this research, this study examines whether sequences of socially regulated learning differ across low, medium or high performing groups. Scaffolded by a computer avatar, 54 primary school students (working in 18 groups of 3) discussed writing a report about a foreign country for 51,338 turns. Statistical discourse analysis (SDA) of these sequences of talk showed that in high performing groups, high cognition was preceded more often by high cognition and less often by denials or low cognition. In medium performing groups, high cognition was preceded more often by high cognition or planning. As these results indicate that different sequences among students' cognitive, metacognitive and relational activities are linked to levels of performance, they can inform a micro-temporal theory of socially shared regulation.
Developing a multiple-document-processing performance assessment for epistemic literacy BIBAFull-Text 241-245
  Simon Knight; Karen Littleton
The LAK15 theme "shifts the focus from data to impact", noting the potential for learning analytics based on existing technologies to have a scalable impact on learning for people of all ages. For such demand and potential scalability to be met, the challenge of addressing higher-order thinking skills must be tackled. This paper discusses one such approach -- the creation of an analytic and task model to probe epistemic cognition in complex literacy tasks. The research uses existing technologies in novel ways to build a conceptually grounded model of trace indicators for epistemic commitments in information seeking behaviors. We argue that such an evidence-centered approach is fundamental to realizing the potential of analytics, which should maintain a strong association with learning theory.
Are you reading my mind?: modeling students' reading comprehension skills with natural language processing techniques BIBAFull-Text 246-254
  Laura K. Allen; Erica L. Snow; Danielle S. McNamara
This study builds upon previous work aimed at developing a student model of reading comprehension ability within the intelligent tutoring system, iSTART. Currently, the system evaluates students' self-explanation performance using a local, sentence-level algorithm and does not adapt content based on reading ability. The current study leverages natural language processing tools to build models of students' comprehension ability from the linguistic properties of their self-explanations. Students (n = 126) interacted with iSTART across eight training sessions where they self-explained target sentences from complex science texts. Coh-Metrix was then used to calculate the linguistic properties of their aggregated self-explanations. The results of this study indicated that the linguistic indices were predictive of students' reading comprehension ability, over and above the current system algorithms. These results suggest that natural language processing techniques can inform stealth assessments and ultimately improve student models within intelligent tutoring systems.

Learning strategies and tools

Identifying learning strategies associated with active use of video annotation software BIBAFull-Text 255-259
  Abelardo Pardo; Negin Mirriahi; Shane Dawson; Yu Zhao; An Zhao; Dragan Gaševic
The higher education sector has seen a shift in teaching approaches over the past decade with an increase in the use of video for delivering lecture content as part of a flipped classroom or blended learning model. Advances in video technologies have provided opportunities for students to now annotate videos as a strategy to support their achievement of the intended learning outcomes. However, there are few studies exploring the relationship between video annotations, student approaches to learning, and academic performance. This study seeks to narrow this gap by investigating the impact of students' use of video annotation software coupled with their approaches to learning and academic performance in the context of a flipped learning environment. Preliminary findings reveal a significant positive relationship between annotating videos and exam results. However, negative effects of surface approaches to learning, cognitive strategy use and test anxiety on midterm grades were also noted. This indicates a need to better promote and scaffold higher order cognitive strategies and deeper learning with the use of video annotation software.
Planning for success: how students use a grade prediction tool to win their classes BIBAFull-Text 260-264
  Caitlin Holman; Stephen J. Aguilar; Adam Levick; Jeff Stern; Benjamin Plummer; Barry Fishman
Gameful course designs require a significant shift in approach for both students and instructors. Transforming a standard course into a good game involves fundamentally altering how the course functions, most notably by giving students greater control over their work. We have developed an application, GradeCraft, to support this shift in pedagogy. A key feature of the application is the Grade Predictor, where students can explore coursework options and plan pathways to success. We observed students in two gameful courses with differing designs using the Grade Predictor in similar ways: they spent similar amounts of time per session, increased usage when assignments were due and before making significant course decisions, predicted different types of assignments at different rates, and made more predictions in preparation for the end of semester. This study describes how students plan their coursework using the GradeCraft Grade Predictor tool.
A process mining approach to linking the study of aptitude and event facets of self-regulated learning BIBAFull-Text 265-269
  Sanam Shirazi Beheshitha; Dragan Gaševic; Marek Hatala
Research on self-regulated learning has taken two main paths: self-regulated learning as aptitudes and, more recently, self-regulated learning as events. This paper proposes the use of the Fuzzy Miner process mining technique to examine the relationship between students' self-reported aptitudes (i.e., achievement goal orientation and approaches to learning) and the strategies they follow in self-regulated learning. A pilot study is conducted to probe the method and the preliminary results are reported.
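The frequency structure that process-mining techniques such as Fuzzy Miner build on is the directly-follows relation between logged learning actions; the sketch below counts those relations over invented per-student traces and is not the Fuzzy Miner algorithm itself.
```python
# Sketch: counting directly-follows relations in per-student event traces --
# the frequency structure that process-mining techniques such as Fuzzy Miner
# build on. Traces are invented; this is not the Fuzzy Miner algorithm itself.
from collections import Counter

traces = {
    "s1": ["read_content", "take_quiz", "review_feedback", "take_quiz"],
    "s2": ["take_quiz", "read_content", "take_quiz", "review_feedback"],
    "s3": ["read_content", "read_content", "take_quiz"],
}

directly_follows = Counter()
for actions in traces.values():
    for a, b in zip(actions, actions[1:]):
        directly_follows[(a, b)] += 1

for (a, b), count in directly_follows.most_common():
    print(f"{a} -> {b}: {count}")
```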

Alternative methods of improving learning

Towards data-driven mastery learning BIBAFull-Text 270-274
  Behrooz Mostafavi; Michael Eagle; Tiffany Barnes
We have developed a novel data-driven mastery learning system to improve learning in complex procedural problem solving domains. This new system was integrated into an existing logic proof tool, and assigned as homework in a deductive logic course. Student performance and dropout were compared across three systems: The Deep Thought logic tutor, Deep Thought with integrated hints, and Deep Thought with our data-driven mastery learning system. Results show that the data-driven mastery learning system increases mastery of target tutor-actions, improves tutor scores, and lowers the rate of tutor dropout over Deep Thought, with or without provided hints.
Analysing reflective text for learning analytics: an approach using anomaly recontextualisation BIBAFull-Text 275-279
  Andrew Gibson; Kirsty Kitto
Reflective writing is an important learning task to help foster reflective practice, but even when assessed it is rarely analysed or critically reviewed due to its subjective and affective nature. We propose a process for capturing subjective and affective analytics based on the identification and recontextualisation of anomalous features within reflective text. We evaluate two human-supervised trials of the process, and so demonstrate the potential for an automated Anomaly Recontextualisation process for Learning Analytics.
Classifying student dialogue acts with multimodal learning analytics BIBAFull-Text 280-289
  Aysu Ezen-Can; Joseph F. Grafsgaard; James C. Lester; Kristy Elizabeth Boyer
Supporting learning with rich natural language dialogue has been the focus of increasing attention in recent years. Many adaptive learning environments model students' natural language input, and there is growing recognition that these systems can be improved by leveraging multimodal cues to understand learners better. This paper investigates multimodal features related to posture and gesture for the task of classifying students' dialogue acts within tutorial dialogue. In order to accelerate the modeling process by eliminating the manual annotation bottleneck, a fully unsupervised machine learning approach is utilized for this task. The results indicate that these unsupervised models are significantly improved with the addition of automatically extracted posture and gesture information. Further, even in the absence of any linguistic features, a model that utilizes posture and gesture features alone performed significantly better than a majority class baseline. This work represents a step toward achieving better understanding of student utterances by incorporating multimodal features within adaptive learning environments. Additionally, the technique presented here is scalable to very large student datasets.

Interventions and remediations

Automated detection of proactive remediation by teachers in reasoning mind classrooms BIBAFull-Text 290-294
  William L. Miller; Ryan S. Baker; Matthew J. Labrum; Karen Petsche; Yu-Han Liu; Angela Z. Wagner
Among the most important tasks of the teacher in a classroom using the Reasoning Mind blended learning system is proactive remediation: dynamically planned interventions conducted by the teacher with one or more students. While there are several examples of detectors of student behavior within an online learning environment, most have focused on behaviors occurring fully within the context of the system, and on student behaviors. In contrast, proactive remediation is a teacher-driven activity that occurs outside of the system, and its occurrence is not necessarily related to the student's current task within the Reasoning Mind system. We present a sensor-free detector of proactive remediation, which is able to distinguish these activities from other behaviors involving idle time, such as on-task conversation related to immediate learning activities and off-task behavior.
Reducing selection bias in quasi-experimental educational studies BIBAFull-Text 295-299
  Christopher Brooks; Omar Chavez; Jared Tritz; Stephanie Teasley
In this paper we examine the issue of selection bias in quasi-experimental (non-randomly controlled) educational studies. We provide background about common sources of selection bias and the issues involved in evaluating the outcomes of quasi-experimental studies. We describe two methods, matched sampling and propensity score matching, that can be used to overcome this bias. Using these methods, we describe their application through one case study that leverages large educational datasets drawn from higher education institutional data warehouses. The contribution of this work is the recommendation of a methodology and case study that educational researchers can use to understand, measure, and reduce selection bias in real-world educational interventions.
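A minimal sketch of propensity score matching on synthetic data: estimate the probability of receiving the intervention from covariates with logistic regression, then greedily match each treated student to the nearest-propensity untreated student.
```python
# Sketch: propensity score matching on synthetic data -- logistic regression for
# the propensity score, then greedy nearest-neighbour 1:1 matching without replacement.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)
n = 500
X = np.column_stack([rng.normal(3.0, 0.5, n), rng.normal(0, 1, n)])  # covariates
treated = (rng.random(n) < 1 / (1 + np.exp(-(X[:, 0] - 3.5)))).astype(int)

propensity = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

treated_idx = np.where(treated == 1)[0]
control_idx = list(np.where(treated == 0)[0])

matches = {}
for t in treated_idx:
    j = int(np.argmin(np.abs(propensity[control_idx] - propensity[t])))
    matches[t] = control_idx.pop(j)          # match and remove from the pool

print(len(matches), "matched pairs")
```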
Discovering clues to avoid middle school failure at early stages BIBAFull-Text 300-304
  Manuel Ángel Jiménez-Gómez; José María Luna; Cristóbal Romero; Sebastián Ventura
The use of data mining techniques in educational domains helps to find new knowledge about how students learn and how to improve resource management. Using these techniques to predict school failure is very useful for carrying out actions to avoid dropout. With this purpose, we try to determine the earliest stage at which the quality of the results allows the possibility of school failure to be clarified. We process real information from a Spanish high school by structuring the data into incremental datasets, which represent how students' academic records grow. Our study reveals an early and robust detection of the risky cases of school failure at the end of the first of four courses.

Analyses with LMS data

Combining observational and experiential data to inform the redesign of learning activities BIBAFull-Text 305-309
  Abelardo Pardo; Robert A. Ellis; Rafael A. Calvo
A main goal for learning analytics is to inform the design of a learning experience to improve its quality. The increasing presence of solutions based on big data has even questioned the validity of current scientific methods. Is this going to happen in the area of learning analytics? In this paper we postulate that if changes are driven solely by a digital footprint, there is a risk of focusing only on factors that are directly connected to numeric methods. However, if the changes are complemented with an understanding of how students approach their learning, the quality of the evidence used in the redesign is significantly increased. This reasoning is illustrated with a case study in which an initial set of activities for a first-year engineering course was shaped based only on the students' digital footprint. These activities were significantly modified after collecting qualitative data about the students' approaches to learning. We conclude the paper by arguing that the interpretation of learning analytics is improved when combined with qualitative data revealing how and why students engaged with the learning tasks in qualitatively different ways, which together provide a more informed basis for designing learning activities.
Formative and summative analyses of disciplinary engagement and learning in a big open online course BIBAFull-Text 310-314
  Daniel T. Hickey; Joshua D. Quick; Xinyi Shen
Situative theories of knowing and participatory approaches to learning and assessment were used to offer a big open online course on Educational Assessment using Google CourseBuilder in 2013. The course was started by 160 students and completed by 60, with relatively extensive instructor interaction with individual learners. This yielded much higher levels of engagement and learning than are typical of open or conventional online courses. The course was further refined and offered a second time in 2014, where it was started by 76 students and completed by 22, with a much lower level of support. Comparable levels of engagement and learning were obtained, suggesting that this participatory approach to learning and assessment can indeed be managed with more typical instructor support. Nonetheless, additional automation and streamlining are called for if the model is eventually to be used in massive online courses with thousands of students or as an autonomous, self-paced open course.
"Scaling up" learning design: impact of learning design activities on LMS behavior and performance BIBAFull-Text 315-319
  Bart Rienties; Lisette Toetenel; Annie Bryan
While substantial progress has been made in terms of predictive modeling in the Learning Analytics and Knowledge (LAK) community, one element that is often ignored is the role of learning design. Learning design establishes the objectives and pedagogical plans that can be evaluated against the outcomes captured through learning analytics. However, no empirical study is available that links the learning designs of a substantial number of courses with usage of Learning Management Systems (LMS) and learning performance. Using cluster and correlation analyses, in this study we compared how 87 modules were designed and how this affected (static and dynamic) LMS behavior and learning performance. Our findings indicate that academics seem to design modules with an "invisible" blueprint in mind. Our cluster analyses yielded four distinctive learning design patterns: constructivist, assessment-driven, balanced-variety and social constructivist modules. More importantly, learning design activities strongly influenced how students engaged online. Finally, learning design activities seem to have an impact on learning performance, in particular when modules rely on assimilative activities. Our findings indicate that learning analytics researchers need to be aware of the impact of learning design on LMS data over time, and on subsequent academic performance.
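As a rough illustration of this kind of analysis (not the authors' actual pipeline), the sketch below clusters modules by the proportion of learning time allocated to each activity type and relates cluster membership to an LMS engagement measure; the file name, activity types and the weekly_lms_minutes column are assumptions.

    # Illustrative clustering of learning designs; file and column names are assumed.
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    design = pd.read_csv("module_design.csv")       # hypothetical per-module data
    activity_cols = ["assimilative", "communication", "assessment",
                     "productive", "experiential"]   # assumed activity types

    X = StandardScaler().fit_transform(design[activity_cols])
    design["cluster"] = KMeans(n_clusters=4, random_state=0).fit_predict(X)

    # Compare an engagement measure across the four design clusters
    print(design.groupby("cluster")["weekly_lms_minutes"].mean())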

Tutoring systems

An analysis of the impact of action order on future performance: the fine-grain action model BIBAFull-Text 320-324
  Eric Van Inwegen; Seth Adjei; Yan Wang; Neil Heffernan
To better model students' learning, user modelling should be able to use the detailed sequence of student actions, not just their right/wrong scores, to model student knowledge. Our goal is to analyze the question: "Does it matter when a hint is used?" We look at students who use identical attempt counts to get the right answer and look for the impact of help use and action order on future performance. We conclude that students who use hints too early do worse than students who use hints later; however, students who use hints may at times perform as well as students who do not. This paper makes a novel contribution by showing for the first time that paying attention to the precise sequence of hints and attempts allows better prediction of students' performance, and by showing that, when we control for the number of attempts and hints, students who attempt problems before asking for hints show higher performance on the next question. This analysis shows that the pattern of hints and attempts, not just their numbers, is important.
Improving students' long-term retention performance: a study on personalized retention schedules BIBAFull-Text 325-329
  Xiaolu Xiong; Yan Wang; Joseph Barbosa Beck
Traditional practices of spacing and expanding retrieval practice have typically fixed their spacing intervals to one or a few predefined schedules [5, 7]. Few have explored the advantages of using personalized expanding intervals and scheduling systems that adapt to the knowledge levels and learning patterns of individual students. In this work, we are concerned with estimating the effects of personalized expanding intervals on improving students' long-term mastery of skills. We developed a Personalized Adaptive Scheduling System (PASS) within ASSISTments' retention and relearning workflow. After implementing PASS, we conducted a study to investigate the impact of personalized scheduling on long-term retention by comparing results from 97 classes in the summers of 2013 and 2014. We observed that students in PASS outperformed students in the traditional scheduling system on long-term retention performance (p = 0.0002), and that, in particular, students with a medium level of knowledge demonstrated reliable improvement (p = 0.0209) with an effect size of 0.27. In addition, the data gathered in this study helped to expose a few issues with the new system. These results suggest that personalized knowledge retrieval schedules are more effective than fixed schedules and that future work should continue examining approaches to optimize PASS.
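The general idea of an expanding retrieval schedule personalized by a mastery estimate can be sketched as follows; the multipliers and thresholds are invented for illustration and are not PASS's actual scheduling rules.

    # Generic expanding-interval scheduler (illustrative; not the PASS rules).
    def next_review_delay(previous_delay_days, mastery_estimate):
        # Expand the gap faster for well-mastered skills, slower otherwise
        if mastery_estimate >= 0.9:
            factor = 3.0
        elif mastery_estimate >= 0.6:
            factor = 2.0
        else:
            factor = 1.0   # do not expand until mastery improves
        return previous_delay_days * factor

    print(next_review_delay(7, 0.95))   # -> 21.0 days until the next retrieval test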

Curricula, network and discourse analysis

Curriculum analysis of CS departments based on CS2013 by simplified, supervised LDA BIBAFull-Text 330-339
  Takayuki Sekiya; Yoshitatsu Matsuda; Kazunori Yamaguchi
The curricula that higher education institutions offer are a key asset in enabling them to systematically educate their students. We have been developing a curriculum analysis method that can help to identify differences among curricula. On the basis of Computer Science Curricula 2013 (CS2013), a report released by the ACM and the IEEE Computer Society, we applied our method to 10 computer science (CS) related curricula offered by CS departments of universities in the United States. The method enables us to compare courses across universities. Through an analysis of the distribution of course syllabi, we found that CS2013 uniformly covers a wide area of computer science. Some universities emphasized human factors, while others attached greater importance to theoretical aspects. We also found that some CS departments offered not only a CS curriculum but also an electrical engineering one, and those departments tended to have more "Architecture and Organization (AR)" related courses. Furthermore, we found that even though "Information Assurance and Security (IAS)" has not yet become a very popular field, some universities are already offering IAS-related courses.
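The paper uses a simplified, supervised LDA variant; as a rough point of reference, plain LDA over syllabus text can be fitted with scikit-learn as sketched below. The file names and topic count are assumptions (18 mirrors the number of CS2013 knowledge areas).

    # Plain LDA over course syllabi (illustrative; not the authors' supervised variant).
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    syllabi = [open(f).read() for f in ["cs101.txt", "cs230.txt", "cs350.txt"]]
    vec = CountVectorizer(stop_words="english")
    X = vec.fit_transform(syllabi)

    lda = LatentDirichletAllocation(n_components=18, random_state=0)
    doc_topics = lda.fit_transform(X)   # rows: syllabi, columns: topic proportions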
"Twitter Archeology" of learning analytics and knowledge conferences BIBAFull-Text 340-349
  Bodong Chen; Xin Chen; Wanli Xing
The goal of the present study was to uncover new insights about the learning analytics community by analyzing Twitter archives from the past four Learning Analytics and Knowledge (LAK) conferences. Through descriptive analysis, interaction network analysis, hashtag analysis, and topic modeling, we found: extended coverage of the community over the years; increasing interactions among its members despite peripheral and non-persistent participation; increasingly dense, connected and balanced social networks; and increasingly diverse research topics. Detailed inspection of semantic topics uncovered insights complementary to the analysis of LAK publications in previous research.
Discourse cohesion: a signature of collaboration BIBAFull-Text 350-354
  Mihai Dascalu; Stefan Trausan-Matu; Philippe Dessus; Danielle S. McNamara
As Computer Supported Collaborative Learning (CSCL) becomes increasingly adopted as an alternative to classic educational scenarios, we face a growing need for automatic tools designed to support tutors in the time-consuming process of analyzing conversations and interactions among students. Therefore, building upon a cohesion-based model of discourse, we have validated ReaderBench, a system capable of evaluating collaboration from a social knowledge-building perspective. Collaboration emerges through the intertwining of different participants' points of view, and this process is reflected in the identified cohesive links between different speakers. Overall, the current experiments indicate that textual cohesion successfully detects collaboration between participants as ideas are shared and exchanged within an ongoing conversation.
Correlations between automated rhetorical analysis and tutors' grades on student essays BIBAFull-Text 355-359
  Duygu Simsek; Ágnes Sándor; Simon Buckingham Shum; Rebecca Ferguson; Anna De Liddo; Denise Whitelock
When assessing student essays, educators look for the students' ability to present and pursue well-reasoned and strong arguments. Such scholarly argumentation is often articulated through rhetorical metadiscourse, and educators necessarily examine metadiscourse in students' writing as a signal of the intellectual moves that make their reasoning visible. Students and educators could therefore benefit from powerful automated textual analysis able to detect rhetorical metadiscourse. However, there is a need to validate such technologies in higher education contexts, since they were originally developed for non-educational applications. This paper describes an evaluation study of a particular language analysis tool, the Xerox Incremental Parser (XIP), on undergraduate social science student essays, using the mark awarded as a measure of the quality of the writing. As part of this exploration, the study seeks to assess the quality of the XIP through correlational studies and multiple regression analysis.
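The correlational setup can be sketched schematically as follows: counts of rhetorically salient sentences per essay (of the kind a parser such as XIP might label) are correlated with tutor-awarded marks and then entered into a multiple regression. All numbers below are invented for illustration.

    # Schematic correlation and multiple regression (invented data).
    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.linear_model import LinearRegression

    summarizing = np.array([3, 1, 5, 2, 4, 6])   # salient-sentence counts per essay
    contrasting = np.array([2, 0, 4, 1, 3, 5])
    marks       = np.array([62, 48, 75, 55, 70, 81])

    print(pearsonr(summarizing, marks))          # simple correlation with the mark

    X = np.column_stack([summarizing, contrasting])
    model = LinearRegression().fit(X, marks)
    print(model.coef_, model.intercept_)         # multiple-regression weights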

Multilevel, multimodal and network analysis

Leveraging multimodal learning analytics to differentiate student learning strategies BIBAFull-Text 360-367
  Marcelo Worsley; Paulo Blikstein
Multimodal analysis has demonstrated effectiveness in studying and modeling several human-human and human-computer interactions. In this paper, we explore the role of multimodal analysis in the service of studying complex learning environments. We compare unimodal and multimodal, manual and semi-automated methods for examining how students learn in a hands-on, engineering design context. Specifically, we compare human annotations, speech, gesture and electro-dermal activation data from a study (N=20) in which students participated in two different experimental conditions. The experimental conditions have already been shown to be associated with differences in learning gains and design quality. Hence, one objective of this paper is to identify the behavioral practices that differed between the two experimental conditions, as this may help us better understand how the learning interventions work. An additional objective is to provide examples of how to conduct learning analytics research in complex environments and to show how the same algorithm, when used with different forms of data, can provide complementary results.
From contingencies to network-level phenomena: multilevel analysis of activity and actors in heterogeneous networked learning environments BIBAFull-Text 368-377
  Dan Suthers
Learning in social settings is a complex phenomenon that involves multiple processes at individual and collective levels of agency. Thus, a richer understanding of learning in socio-technical networks will be furthered by analytic methods that can move between and coordinate analyses of individual, small-group and network-level phenomena. This paper outlines Traces, an analytic framework designed to address these and other needs, and gives examples of the framework's practical utility using data from the Tapped In educator professional network. The Traces framework identifies observable contingencies between events and uses these to build more abstract models of interaction and ties represented as graphs. Applications are illustrated for the identification of sessions and of key participants within them, for relations between sessions as mediated by participants, and for longer-term participant roles.
Ubiquitous learning analytics in the context of real-world language learning BIBAFull-Text 378-382
  Kousuke Mouri; Hiroaki Ogata; Noriko Uosaki
This paper describes a visualization and analysis method for mining useful learning logs from the numerous learning experiences that learners accumulate in the real world as ubiquitous learning logs. A Ubiquitous Learning Log (ULL) is defined as a digital record of what learners have learned in daily life using ubiquitous technologies. It allows learners to log their learning experiences with photos, audio, video, location, RFID tag and sensor data, and to share and reuse ULLs with others. By constructing real-world corpora comprising accumulated ULLs, with information such as what, when, where, and how learners have learned in the real world, and by analyzing them, we can help learners learn more effectively. The proposed system predicts future learning opportunities, including learning patterns and trends, by analyzing learners' past ULLs. The prediction is made possible both by network analysis based on ULL information such as learners, knowledge, place and time, and by learners' self-analysis using a time-map. By predicting what they tend to learn next in their learning paths, the system provides learners with more learning opportunities. The accumulated data are so large, and the relationships among the data so complicated, that it is difficult to grasp how closely the ULLs are related to each other. Therefore, this paper proposes a system to help learners grasp relationships among learners, knowledge, place and time, using network graphs and network analysis.
An exploratory study using social network analysis to model eye movements in mathematics problem solving BIBAFull-Text 383-387
  Mengxiao Zhu; Gary Feng
Eye tracking is a useful tool to understand students' cognitive process during problem solving. This paper offers a unique perspective by applying techniques from social network analysis to eye movement patterns in mathematics problem solving. We construct and visualize transition networks using eye-tracking data collected from 37 8th grade students while solving linear function problems. By applying network analysis on the constructed transition networks, we find general transition patterns between areas of interest (AOIs) for all students, and we also compare patterns for high- and low-performing students. Our results show that even though students share general transition patterns during problem solving, high-performing students made more strategic transitions among AOI triples than low-performing students.
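The construction of such a transition network can be sketched as follows; the fixation sequence and AOI names are invented for illustration.

    # Turning an AOI fixation sequence into a weighted, directed transition network.
    import networkx as nx

    fixations = ["question", "graph", "equation", "graph", "options", "graph"]

    G = nx.DiGraph()
    for src, dst in zip(fixations, fixations[1:]):
        if src == dst:
            continue   # ignore repeated fixations within the same AOI
        w = G[src][dst]["weight"] + 1 if G.has_edge(src, dst) else 1
        G.add_edge(src, dst, weight=w)

    # Standard network measures can then be compared across student groups
    print(nx.out_degree_centrality(G))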

Workshop

It's about time: 4th international workshop on temporal analyses of learning data BIBAFull-Text 388-389
  Simon Knight; Alyssa F. Wise; Bodong Chen; Britte Haugan Cheng
Interest in analyses that probe the temporal aspects of learning continues to grow. The study of common and consequential sequences of events (such as learners accessing resources, interacting with other learners and engaging in self-regulatory activities) and how these are associated with learning outcomes, as well as the ways in which knowledge and skills grow or evolve over time are both core areas of interest. Learning analytics datasets are replete with fine-grained temporal data: click streams; chat logs; document edit histories (e.g. wikis, etherpads); motion tracking (e.g. eye-tracking, Microsoft Kinect), and so on. However, the emerging area of temporal analysis presents both technical and theoretical challenges in appropriating suitable techniques and interpreting results in the context of learning. The learning analytics community offers a productive focal ground for exploring and furthering efforts to address these challenges. This workshop, the fourth in a series on temporal analysis of learning, provides a focal point for analytics researchers to consider issues around and approaches to temporality in learning analytics.
Ethical and privacy issues in the application of learning analytics BIBAFull-Text 390-391
  Hendrik Drachsler; Tore Hoel; Maren Scheffel; Gábor Kismihók; Alan Berg; Rebecca Ferguson; Weiqin Chen; Adam Cooper; Jocelyn Manderveld
The large-scale production, collection, aggregation, and processing of information from various learning platforms and online environments have led to ethical and privacy concerns regarding potential harm to individuals and society. In the past, these types of concern have impacted on areas as diverse as computer science, legal studies and surveillance studies. Within a European consortium that brings together the EU project LACE, the SURF SIG Learning Analytics, the Apereo Foundation and the EATEL SIG dataTEL, we aim to understand the issues with greater clarity, and to find ways of overcoming the issues and research challenges related to ethical and privacy aspects of learning analytics practice. This interactive workshop aims to raise awareness of major ethics and privacy issues. It will also be used to develop practical solutions to advance the application of learning analytics technologies.
2nd int'l workshop on open badges in education (OBIE 2015): from learning evidence to learning analytics BIBAFull-Text 392-393
  Daniel Hickey; Jelena Jovanovic; Steve Lonn; James E. Willis
Open digital badges are Web-enabled tokens of learning and accomplishment. Unlike traditional grades, certificates, and transcripts, badges include specific claims about learning accomplishments and detailed evidence in support of those claims. Considering the richness of data associated with Open Badges, it is reasonable to expect a very powerful predictive element at the intersection of Open Badges and Learning Analytics. This could have substantial implications for recommending and exposing students to a variety of curricular and co-curricular pathways utilizing data sources far more nuanced than grades and achievement tests. Therefore, this workshop was aimed at: i) examining the potentials of Open Badges (including the associated data and resources) to provide new and potentially unprecedented data for analysis; ii) examining the kinds of Learning Analytics methods and techniques that could be suitable for gaining valuable insights from and/or making predictions based on the evidence (data and resources) associated with badges, and iii) connecting Open Badges communities, aiming to allow for the exchange of experiences and learning from different cultures and communities.
VISLA: visual aspects of learning analytics BIBAFull-Text 394-395
  Erik Duval; Katrien Verbert; Joris Klerkx; Martin Wolpers; Abelardo Pardo; Sten Govaerts; Denis Gillet; Xavier Ochoa; Denis Parra
In this paper, we briefly describe the goal and activities of the LAK15 workshop on Visual Aspects of Learning Analytics.
The 3rd LAK data competition BIBAFull-Text 396-397
  Hendrik Drachsler; Stefan Dietze; Eelco Herder; Mathieu d'Aquin; Davide Taibi; Maren Scheffel
The LAK Data Challenge 2015 continues the research efforts of the previous data competitions in 2013 and 2014 by stimulating research on the evolving fields of Learning Analytics (LA) and Educational Data Mining (EDM). Building on a series of activities of the LinkedUp project, the challenge aims to generate new insights and analyses of the LA & EDM disciplines and is supported by the LAK Dataset -- a unique corpus of LA & EDM literature, exposed in structured and machine-readable formats.

Posters

A learning analytics approach to characterize and analyze inquiry-based pedagogical processes BIBAFull-Text 398-399
  Carlos Monroy; Virginia Snodgrass Rangel; Elizabeth R. Bell; Reid Whitaker
Here we describe the use of learning analytics (LA) for investigating inquiry-based science instruction. We define several variables that quantify curriculum usage and leverage tools from process mining to examine inquiry-based pedagogical processes. These are initial steps toward measuring and modeling fidelity of implementation of a science curriculum. We use data from one school district's use of an online science curriculum (N=1,021 teachers and nearly 330,000 page views).
Predicting post-training readiness to work with computers: the predominance of log-based variables BIBAFull-Text 400-401
  Dalit Mor; Hagar Laks; Arnon Hershkovitz
In today's job market, computer skills are part of the prerequisites for many jobs. In this paper, we report on a study of readiness to work with computers (the dependent variable) among unemployed women (N=54) after they participated in a unique training program focused on computer skills and empowerment. Associations were explored between this variable and 17 variables from four categories: log-based, computer literacy and experience, job-seeking motivation and practice, and training satisfaction. Only two variables were associated with the dependent variable: knowledge post-test duration and satisfaction with content. When building a prediction model of the dependent variable, another feature was highlighted: the total number of actions on the course website over the duration of the course. Our analyses highlight the predominance of the log-based variables over the variables from the other categories, and we discuss this finding in detail.
Investigating the impact of a notification system on student behaviors in a discourse-intensive hybrid course: a case study BIBAFull-Text 402-403
  Zhenhua Xu; Alexandra Makos
This study investigated the effects of students' opting to use notification tools in a collaborative, discourse-intensive online graduate course. Social constructivism and self-expectancy theory were applied to frame our understanding of the interactive relationship between the use of the notification tools, students' online contribution behavior and students' self-expectancy. Log data from a 12-week hybrid (online and face-to-face) graduate course at a Canadian faculty of education were analyzed. Findings from the correlation, mediation and ANOVA analyses suggested that activation of the notification tool system positively affected students' contribution behavior, and that the influence of the notification tools on contribution behavior was partially mediated by students' self-expectancy.
Minimum information entropy based q-matrix learning in DINA model BIBAFull-Text 404-405
  Shiwei Ye; Yuan Sun; Haobo Wang; Yi Sun
Cognitive diagnosis models (CDMs) are of growing interest in test development and the measurement of learners' performance. The DINA (deterministic input, noisy, and gate) model is one of the most widely used models in CDM. In this paper, we propose a new method and present an alternating recursive algorithm that learns the Q-matrix and the uncertainty variables (slip and guessing parameters) for the DINA model, based on Boolean Matrix Factorization (BMF) and Minimized Information Entropy (MIE) respectively. Simulation results show that our algorithm for Q-matrix learning converges quickly to locally optimal solutions for the Q-matrix and the students' knowledge-state matrix A. This is especially important and applicable when the method is extended to big data.
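For context, the standard DINA item-response function (the model being estimated, not the authors' BMF/MIE procedure) can be written as a small sketch; the mastery profile, Q-matrix row and parameter values below are invented.

    # Standard DINA item-response probability (illustrative values).
    import numpy as np

    def dina_prob_correct(alpha, q, slip, guess):
        # eta = 1 iff the student masters every skill the item requires
        eta = int(np.all(alpha[q == 1] == 1))
        return (1 - slip) ** eta * guess ** (1 - eta)

    alpha = np.array([1, 0, 1])    # hypothetical skill-mastery vector
    q     = np.array([1, 0, 1])    # item requires skills 1 and 3
    print(dina_prob_correct(alpha, q, slip=0.1, guess=0.2))   # -> 0.9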
Integrated representations and small data: towards contextualized and embedded analytics tools for learners BIBAFull-Text 406-407
  Andreas Harrer; Tilman Göhnert
We present an approach to support learners by means of visualization and contextualization of learning analytics interventions in the learning process. We follow up on conceptual work by colleagues and derive further design principles oriented towards learners as recipients of LA results. These principles are demonstrated with implementations in two distinct projects that address learners' information needs in collaborative learning processes.
Frequent sequential interactions as opportunities to engage in temporal reasoning with an online GIS BIBAFull-Text 408-409
  Raymond Kang; Josh Radinsky; Leilah Lyons
Temporal reasoning (i.e., reasoning about relationships across time) is complex and difficult, particularly when engaged through complex media such as online Geographic Information System (GIS) applications. Partnering with Social Explorer (SE), a Web-based GIS application that allows users to create interactive visualizations of large sociological datasets, we engaged in frequent sequential pattern mining of a database of users' interactions with SE. The resulting frequent sequences provide initial descriptions of how SE affords opportunities to engage in temporal reasoning.
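The basic idea of frequent sequential pattern mining over interaction logs can be sketched as follows; the action names and support threshold are invented, and the sketch counts only contiguous subsequences, a simplification of full sequential pattern mining.

    # Counting frequent contiguous action subsequences (toy illustration).
    from collections import Counter

    sessions = [
        ["select_year", "play_timeline", "compare_years", "zoom"],
        ["select_year", "play_timeline", "zoom"],
        ["select_year", "compare_years", "play_timeline"],
    ]

    def frequent_ngrams(sessions, n=2, min_support=2):
        counts = Counter()
        for s in sessions:
            counts.update(tuple(s[i:i + n]) for i in range(len(s) - n + 1))
        return {seq: c for seq, c in counts.items() if c >= min_support}

    print(frequent_ngrams(sessions))   # e.g. ('select_year', 'play_timeline'): 2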
The bridge report: bringing learning analytics to low-income, urban schools BIBAFull-Text 410-411
  Aaron Hawn
Widespread adoption of learning analytics for risk prediction faces different challenges at low-income secondary schools than at post-secondary institutions, where such methods have been more widely adopted. To leverage the benefits of learning analytics for under-resourced communities, educators must overcome the barriers to adoption faced by local schools: internet access, data integration, data interpretation, and local alignment. We present the case study of an enhanced reporting tool for parents and teachers, the Bridge Report, locally designed to meet the needs of a low-income secondary school in New York City. Parent and teacher focus groups suggest that addressing local obstacles to learning analytics can create conditions for enthusiastic adoption by parents and teachers.
Improving undergraduate student achievement in large blended courses through data-driven interventions BIBAFull-Text 412-413
  Bernie Dodge; John Whitmer; James P. Frazee
This pilot study applied Learning Analytics methods to identify students at risk of not succeeding in two high-enrollment courses with historically low pass rates at San Diego State University: PSY 101 and STAT 119. With input from instructors, targeted interventions were developed and sent to participating students (n=882) suggesting ways to improve their performance. An experimental design was used, with half of the students randomly assigned to receive these interventions via email and the other half being analyzed for at-risk triggers but receiving no intervention. Pre-course surveys on student motivation [4] and prior subject matter knowledge were conducted, and students were asked to maintain weekly logs of their course-related activity online and offline. Regression analyses, incorporating feature selection methods to account for student demographic data, were used to compare the impact of the interventions between the control and experimental groups. Results showed that the interventions were associated with a higher final grade in one course, but only for a particular demographic group.
Increasing the accessibility of learning objects by automatic tagging BIBAFull-Text 414-415
  Katja Niemann
Data sets coming from the educational domain often suffer from sparsity. Hence, they might comprise potentially useful learning objects that are not findable by the users. In order to address this problem, we present a new way to automatically assign tags and classifications to learning objects offered by educational web portals that is solely based on the objects' usage.
Measuring student success using predictive engine BIBAFull-Text 416-417
  Shady Shehata; Kimberly E. Arnold
A basic challenge in delivering global education is improving student success. Institutions of education are increasingly focused on improving the graduation and retention rates of their students. In this poster, we describe the Student Success System (S3), which can measure student performance starting from the first weeks of the semester, and the adoption process for S3 at the University of Wisconsin System (UWS).
A learning system utilizing learners' active tracing behaviors BIBAFull-Text 418-419
  Kazushi Maruya; Junji Watanabe; Hiroyuki Takahashi; Shoji Hashiba
A monitoring system that does not disturb learners' motivation and attention is important, especially in online learning with massive numbers of participants. We propose a learning system, called the finger trail learning system (FTLS), that can monitor participants' learning attitude by means of their finger movements. On the display of the FTLS, letters are presented with low contrast in the initial state, and the contrast of the letters changes to high when they are traced by learners. We implemented the FTLS as an iOS application and confirmed that the software can be utilized to monitor learners' attitudes. In addition, we compared trails of finger movements between participants with high and low performance. The results show that the trail of finger movements recorded by the FTLS can be an index of learners' attitudes.
A case study to track teacher gestures and performance in a virtual learning environment BIBAFull-Text 420-421
  Roghayeh Barmaki; Charles E. Hughes
As part of normal interpersonal communication, people send and receive messages with their bodies, especially with their hands. Gestures play an important role in teacher-student classroom interactions. In the domain of education, many research projects have focused on the study of such gestures either in real classrooms or in tutorial settings with experienced teachers. Novice teachers especially need to understand the messages they send through nonverbal communication, as this can have a major effect on their ability to manage behaviors and deliver content. Such learning should optimally occur before experiencing the real classroom. To assist in this process, we have developed a virtual classroom environment -- TeachLivE -- and used it for teacher practice, reflection and assessment. This paper investigates the way teachers use gestures in the virtual classroom settings of TeachLivE. Biology and algebra teachers were evaluated in our study. Analysis of video recordings from the real and virtual environments seems to indicate that algebra teachers gesture significantly more often than biology teachers. These results have implications for providing useful feedback to participant teachers.
Qualitatively exploring electronic portfolios: a text mining approach to measuring student emotion as an early warning indicator BIBAFull-Text 422-423
  Frederick Nwanganga; Everaldo Aguiar; G. Alex Ambrose; Victoria Goodrich; Nitesh V. Chawla
The collection and analysis of student-level data is quickly becoming the norm across school campuses. More and more institutions are starting to use this resource as a window into better understanding the needs of their student population. In previous work, we described the use of electronic portfolio data as a proxy to measuring student engagement, and showed how it can be predictive of student retention. This paper highlights our ongoing efforts to explore and measure the valence of positive and negative emotions in student reflections and how they can serve as an early warning indicator of student disengagement.
Media multiplexity in connectivist MOOCs BIBAFull-Text 424-425
  Rafa Absar; Anatoliy Gruzd; Caroline Haythornthwaite; Drew Paulin
In this poster, we present work exploring the use of multiple social media platforms for learning in two connectivist MOOCs (cMOOCs), with the aim of developing and evaluating learning analytics methods for detecting and studying collaborative learning processes.
Using learning analytics to study cognitive disequilibrium in a complex learning environment BIBAFull-Text 426-427
  Marcelo Worsley; Paulo Blikstein
Cognitive disequilibrium has received significant attention for its role in fostering student learning in intelligent tutoring systems and in complex learning environments. In this paper, we add to and extend this discussion by analyzing the emergence of four affective states associated with disequilibrium (joy, surprise, neutrality and confusion) in a collaborative, hands-on engineering design task. Specifically, we compare two learning strategies to make salient how the strategies are associated with different affective states. This comparison is grounded in the construction of a probabilistic model of student affective state, defined by the frequency of each state and the rate of transition between affective states. Through this comparison we confirm prior research that highlights the importance of confusion as a marker of knowledge construction, but call into question the notion that surprise is a significant mediator of cognitive disequilibrium. Overall, we show how modeling learner affect is useful for understanding and improving learning in complex, hands-on learning environments.
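One way to build such a probabilistic model is to estimate a first-order transition matrix over affective-state labels, as in the sketch below; the label sequence is invented and this is not necessarily the authors' exact formulation.

    # Estimating affective-state transition probabilities from a label sequence.
    import numpy as np
    import pandas as pd

    states = ["neutral", "confusion", "confusion", "joy", "neutral", "surprise", "joy"]
    labels = ["joy", "surprise", "neutral", "confusion"]

    counts = pd.DataFrame(0.0, index=labels, columns=labels)
    for a, b in zip(states, states[1:]):
        counts.loc[a, b] += 1

    # Row-normalise to get P(next state | current state)
    transition = counts.div(counts.sum(axis=1).replace(0, np.nan), axis=0)
    print(transition.round(2))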
Analysis of learners' study logs: mouse trajectories to identify the occurrence of hesitation in solving word-reordering problems BIBAFull-Text 428-429
  Mitsumasa Zushi; Yoshinori Miyazaki; Ken Norizuki
In this paper, we describe a Web application we have been developing in order to help both teachers and learners notice the crucial aspects of solving word-reordering problems (WRPs). Also, we discuss ways to analyze the recorded mouse trajectories, response time, and drag and drop (D&D) logs, because these records are potential indicators of the degree of learners' understanding.
How do students interpret feedback delivered via dashboards? BIBAFull-Text 430-431
  Linda Corrin; Paula de Barba
Providing feedback directly to students on their engagement and performance in educational activities is important for supporting students' learning. However, questions have been raised about whether such data representations are adequate to inform reflection on, and planning and monitoring of, students' learning strategies. In this poster we present an investigation of how students interpret feedback delivered via learning analytics dashboards. The findings indicated that most students were able to articulate an interpretation of the dashboard feedback and to identify gaps between their expected and actual performance that could inform changes to their study strategies. However, there was also evidence of uncertain interpretation, both regarding the format of the feedback visualization and regarding the connection between the feedback and students' current strategies. The findings have been used to inform recommendations for enhancing the effectiveness of feedback delivered through dashboards, so that it provides value to students in developing effective learning strategies to meet their educational goals.
Learning analytics in Oz: what's happening now, what's planned, and where could it (and should it) go? BIBAFull-Text 432-433
  Tim Rogers; Cassandra Colvin; Deborah West; Shane Dawson
This poster outlines the process and purpose of two related Australian Office for Learning and Teaching (OLT) commissioned grants to investigate the current usage and future potential of learning analytics in Australian Higher Education, with a view to developing resources to guide Australian universities in their adoption of learning analytics. The commissioned grants run from February 2014 to June 2015. Preliminary results will be available for LAK 15.
Text mining approach to automate teamwork assessment in group chats BIBAFull-Text 434-435
  Antonette Shibani; Elizabeth Koh; Helen Hong
The increasing use of chat tools for learning and collaboration emphasizes the need for automating assessment. We propose a text mining approach to automate teamwork assessment in chat data. This supervised training approach can be extended to other domains for efficient assessment.
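A supervised text-mining setup of this kind can be sketched as follows; the chat turns, teamwork labels and classifier choice are assumptions, not the authors' system.

    # Illustrative supervised classification of chat turns into teamwork categories.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    chat_turns = ["let's split the tasks", "i disagree, check the data",
                  "good idea, I'll do the intro", "stop wasting time"]
    labels = ["coordination", "constructive_conflict", "coordination", "off_task"]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                        LogisticRegression(max_iter=1000))
    clf.fit(chat_turns, labels)
    print(clf.predict(["shall we divide the sections?"]))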