Data mining with DataShop

Ken Koedinger
CMU Director of PSLC
Professor of Human-Computer Interaction & Psychology
Carnegie Mellon University

Ryan S.J.d. Baker
PSLC/HCII
Carnegie Mellon University

Overview
• Motivation for educational data mining (next)
• DataShop
• Learning curves to improve cognitive models
• Past project example
• Conclusion

What is educational data mining?
“The area of scientific inquiry centered around the development of methods for making discoveries within the unique kinds of data that come from educational settings, and using those methods to better understand students and the settings which they learn in.” (Baker, under review)

What is educational data mining?
• More informally: using “large” data sets to answer educational and psychological questions
  • What “large” means is always changing
• Developing methods or algorithms to aid in discovery

What is educational data mining?
• One popular data source is “instrumented” computer tutors
  • Fine grained, longitudinal, often across contexts
• Other data sources
  • Records of online courses (e.g. WebCAT)
  • District or university-level student records
  • Example: www.icpsr.umich.edu/IAED

Educational Data Mining is a hot topic!
• 2008: First International Conference on Educational Data Mining
• 2008: Launch of Journal of Educational Data Mining
• 2009: Second International Conference on Educational Data Mining
  • Submissions due in March 2009
• www.educationaldatamining.org

Data Mining Questions & Methods
How can we reliably model student knowledge or achievement?
• Bayesian Knowledge Tracing
  • Simple type of “Bayes Net”, getting less simple all the time
• Item Response Theory (IRT)
  • Basis for standardized tests, SAT, GRE, TIMSS…
  • Version of “logistic regression”
• Many variations & generalizations …
  • See slides of Brian Junker’s EDM08 invited talk

Data Mining Questions & Methods
What’s the nature of knowledge students are learning? How can we discover cognitive models of student learning?
• Learning Factors Analysis (LFA)
  • Extends IRT to account for learning
  • Search algorithm: Discover cognitive model(s) that capture how student learning transfers over tasks over time
• Rule space, knowledge space, …

Data Mining Questions & Methods
How can we model students, beyond just what they know?
• Models of Choices: Metacognitive & Motivational
  • Help-seeking
  • Gaming the System
  • Off-Task Behavior
  • Self-explanation
  • Affect
• Involves prediction methods such as classification, regression (not just linear regression)

Data Mining Questions & Methods
What features of a tutor lead to the most learning?
• Learning Decomposition
  • Explores different rates of learning due to different forms of pedagogical support
  • Close relative of Learning Factors Analysis

Data Mining Questions & Methods
How to extract reliable inferences about causal mechanisms from correlations in data?
• Causal modeling using Tetrad

Data Mining Questions & Methods
And one generally useful tool for figuring out what’s going on, in any of these cases:
• Exploratory data analysis
  • Summary & visualization tools in DataShop
  • Tools in Excel
  • Clustering algorithms
  • Visualization packages

Overview
• Motivation for educational data mining
• DataShop (next)
• Learning curves to improve cognitive models
• Past project example
• Conclusion

Find DataShop at learnlab.org/datashop

Video Intro of DataShop …

DataShop – Dataset Tabs
• Datasets you can view or edit. You have to be a project member or PI for the dataset to appear here.
• Private datasets you can’t view. Email us and the PI to get access.
• Public datasets that you can view only.

Analysis Tools
• Dataset Info
• Performance Profiler
• Learning Curve
• Error Report
• Export
• Sample Selector

Dataset Info
• Meta data for given dataset
  • PIs get ‘edit’ privileges, others must request it
• Papers and Files storage
• Problem Breakdown table
• Dataset Metrics

Performance Profiler
Multipurpose tool to help identify areas that are too hard or easy
• Aggregate by
  • Step
  • Problem
  • KC
  • Dataset Level
• View measures of
  • Error Rate
  • Assistance Score
  • Avg # Hints
  • Avg # Incorrect
  • Residual Error Rate

Learning Curve
• Visualizes changes in student performance over time
• View by KC or Student, Assistance Score or Error Rate
• Time is represented on the x-axis as ‘opportunity’, or the # of times a student (or students) had an opportunity to demonstrate a KC

Error Report
• View by Problem or KC
• Provides a breakdown of problem information (by step) for fine-grained analysis of problem-solving behavior
• Attempts are categorized by student

Export
• Two types of export available
  • By Transaction
  • By Step
• Anonymous, tab-delimited file
• Easy to import into Excel!
• You can also export the Problem Breakdown table and LFA values!
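
The learning curve described above (error rate plotted against opportunity count for a KC) can be sketched from a step-level, tab-delimited export. The following is a minimal Python sketch, not DataShop's own implementation. The column names (student, kc, opportunity, first_attempt) and the sample rows are hypothetical and are not DataShop's actual export headers. It also assumes that any first attempt other than "correct" (an incorrect answer or a hint request) counts as an error.

```python
import csv
import io
from collections import defaultdict

# Hypothetical step-level export: one row per student step.
# Column names and rows are invented for illustration only.
SAMPLE_EXPORT = (
    "student\tkc\topportunity\tfirst_attempt\n"
    "s1\tcircle-area\t1\tincorrect\n"
    "s1\tcircle-area\t2\thint\n"
    "s1\tcircle-area\t3\tcorrect\n"
    "s2\tcircle-area\t1\tincorrect\n"
    "s2\tcircle-area\t2\tcorrect\n"
    "s2\tcircle-area\t3\tcorrect\n"
    "s2\tsquare-area\t1\tcorrect\n"
)


def learning_curve(tsv_text, kc):
    """Return {opportunity: error rate} for one KC, averaged over students."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if row["kc"] != kc:
            continue
        opportunity = int(row["opportunity"])
        totals[opportunity] += 1
        # Assumption: anything other than a correct first attempt is an error.
        if row["first_attempt"] != "correct":
            errors[opportunity] += 1
    return {opp: errors[opp] / totals[opp] for opp in sorted(totals)}


curve = learning_curve(SAMPLE_EXPORT, "circle-area")
for opportunity, error_rate in curve.items():
    print(opportunity, error_rate)
```

With real data, the same grouping would be run once per KC; a curve that does not fall with practice is the kind of signal the later slides use to question a KC model.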

Sample Selector
• Easily create a sample/filter to view a smaller subset of data
• Shared (only owner can edit) and private samples
• Filter by
  • Condition
  • Dataset Level
  • Problem
  • School
  • Student
  • Tutor Transaction

Help/Documentation
• Glossary of common terms, tied in with PSLC Theory wiki
• Extensive documentation with examples
• Contextual by tool/report
• http://learnlab.web.cmu.edu/datashop/help

New Features
• Manage Knowledge Component models
  • Create, Modify & Delete KC models within DataShop
• Addition of Latency Curves to Learning Curve Reporting
  • Time to Correct
  • Assistance Time
• Problem Rollup & Export
• Enhanced Contextual Help

Overview
• Motivation for educational data mining
• DataShop
• Learning curves to improve cognitive models (next)
• Past project example
• Conclusion

Cognitive Modeling Challenge
• Premise: High quality instructional design requires a high quality cognitive model of student thinking
• Problem: Creating such a Cognitive Model is hard to get right
  • Hard to program, but more importantly …
  • A high quality cognitive model requires a deep understanding of student thinking
  • Cognitive models created by intuition are often wrong (e.g., Koedinger & Nathan, 2004)

Significance of improving a cognitive model
• A better cognitive model means better:
  • Assessment
  • Instructional feedback & hints (model tracing)
  • Activity selection & pacing (knowledge tracing)
• Better cognitive models advance basic cognitive science

Using student data to build better cognitive models
• Cognitive Task Analysis methods
  • Think alouds, Difficulty Factors Assessment
  • Peer collaboration dialog analysis
  • General lecture Tuesday
  • TagHelper track
• Data mining of student interactions with on-line tutors
  • DataShop track

Knowledge components are the “germ theory” of transfer
• Germs are hidden elements that carry disease from one agent to another
• Knowledge components are hidden elements that carry learning experiences from one situation to another -- they account for transfer

DataShop Supports Theory Integration
• Makes micro theory concrete

Knowledge decomposability hypothesis
• Acquisition of academic competencies can be decomposed into units, called knowledge components, that yield predictions about student task performance & the transfer of learning.
• Not obviously true
  • “learning, cognition, knowing, and context are irreducibly co-constituted and cannot be treated as isolated entities or processes” (Barab & Squire, 2004)

Learning curves show performance changes over time
• Learning curves:
  • Student data
  • Statistical model fit (blue line)
• Based on micro level analysis: learning event opportunities
• Averaged across knowledge components

Not a smooth learning curve -> this knowledge component model is wrong. Does not capture genuine student difficulties.

This more specific knowledge component (KC) model (2 KCs) is also wrong -- still no smooth drop in error rate.

Ah! Now we get a smoother learning curve. A more specific decomposition (12 KCs) better tracks nature of student difficulties & transfer from one problem situation to another. (Rise near end due to fewer observations biased toward poorer students.)

Summary: KC model as “germ theory”
• Without decomposition, using just a single KC, “Geometry”, no smooth learning curve.
• But with decomposition, 12 KCs for area concepts, a smooth learning curve.
Upshot: A decomposed KC model fits learning & transfer data better than a “faculty theory” of mind

Overview
• Motivation for educational data mining
• DataShop
• Learning curves to improve cognitive models
• Past project example (next)
• Conclusion

Past Project Example
• Rafferty (Stanford) & Yudelson (Pitt)
• Analyzed a data set from Geometry
• Applied Learning Factors Analysis (LFA)
• Driving questions:
  • Are students learning at the same rate as assumed in prior LFA models?
  • Do we need different cognitive models (KC models) to account for low-achieving vs. high-achieving students?

A Statistical Model for Learning Curves
• Predicts whether student is correct depending on knowledge & practice
• Additive Factor Model (Draney et al., 1995; Cen, Koedinger, & Junker, 2006)
• Learning rate is different for different skills, but not for different students

Low-Start High-Learn (LSHL) group has a faster learning rate than other groups of students

Rafferty & Yudelson Results 2
• Is it “faster” learning or “different” learning?
• Fit with a more compact model is better for low start high learn
• Students with an apparent faster learning rate are learning a more “compact”, general and transferable domain model
• Resulted in best Young Researcher Track paper at AIED07

Overview
• Motivation for educational data mining
• DataShop
• Learning curves to improve cognitive models
• Past project example
• Conclusion (next)

Lots of interesting questions to be addressed with Ed Data Mining!!
• Assessment questions
  • Can on-line embedded assessment replace standardized tests?
  • Can assessment be accurate if students are learning during test?
• Learning theory questions
  • What are the “elements of transfer” in human learning?
  • Is learning rate driven by student variability or content variability?
  • Can conceptual change be tracked & better understood?
• Instructional questions
  • What instructional moves yield the greatest increases in learning?
  • Can we replace ANOVA with learning curve comparison to better evaluate learning experiments?
• Metacognition & motivation questions
  • Can student affect & motivation be detected in on-line click stream data?
  • Can student metacognitive & self-regulated learning strategies be detected in on-line click stream data?

Data Mining-Data Shop Offerings
• Data Mining Track:
  • Tues 9:15 Using DataShop for Exploratory Data Analysis
  • Tues 1:30 Learning from learning curves
    • Item Response Theory
    • Learning Factors Analysis
  • Wed 9:30 Discovery with Models
• General lecture:
  • Tues 3:30 Educational Data Mining
    • Bayesian models of knowledge tracing
    • Causal models with Tetrad

Questions?

Extra slides …

Sample tutor interactions (from 1997 version) that generated Geometry Area data set used in example of learning curves …
• TWO_CIRCLES_IN_SQUARE problem: Initial screen
• TWO_CIRCLES_IN_SQUARE problem: An error a few steps later
• TWO_CIRCLES_IN_SQUARE problem: Student follows hint & completes prob

Learning curve contrast in Physics dataset …

Not a smooth learning curve -> this knowledge component model is wrong. Does not capture genuine student difficulties.

More detailed cognitive model yields smoother learning curve. Better tracks nature of student difficulties & transfer. (Few observations after 10 opportunities yield noisy data.)

Best BIC (parsimonious fit) for Default (original) KC model
Better than simpler Single-KC model
And better than more complex Unique-step (IRT) model
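
The BIC comparison in these last slides can be illustrated with a small worked example. The following is a minimal Python sketch. It assumes the candidate KC models are fit with the Additive Factor Model cited earlier in the talk; the slides do not state how the BIC values were actually produced. The KC name, parameter values, log-likelihoods and observation counts are all hypothetical. The AFM form used is logit(p) = theta + sum over the step's KCs of (beta_k + gamma_k * T_k), and BIC = k ln(n) - 2 ln(L), where lower is better.

```python
import math


def afm_probability(theta, kcs, opportunities, beta, gamma):
    """Probability of a correct step under the Additive Factor Model.

    logit(p) = theta + sum over the step's KCs of (beta[k] + gamma[k] * T[k]),
    where T[k] is the number of prior opportunities on KC k.
    gamma is indexed by KC only, so the learning rate differs by skill
    but not by student.
    """
    logit = theta + sum(beta[k] + gamma[k] * opportunities[k] for k in kcs)
    return 1.0 / (1.0 + math.exp(-logit))


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: lower values mean a better trade-off."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical parameter values, for illustration only.
beta = {"circle-area": -0.5}   # KC easiness
gamma = {"circle-area": 0.2}   # KC learning rate

p_first = afm_probability(0.0, ["circle-area"], {"circle-area": 0}, beta, gamma)
p_fifth = afm_probability(0.0, ["circle-area"], {"circle-area": 4}, beta, gamma)
# Predicted success rises with practice whenever gamma is positive.
print(p_first < p_fifth)

# Hypothetical fits on the same 1000 observations: the larger model has a
# slightly higher log-likelihood but pays a penalty for 45 extra parameters.
print(bic(-100.0, 5, 1000) < bic(-99.0, 50, 1000))
```

This is the sense in which a mid-sized KC model can beat both a single-KC model (too coarse to fit well) and a unique-step model (fits closely but with many more parameters).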