The Q-matrix method: A new artificial intelligence tool for data mining
Dr. Tiffany Barnes
Kennedy 213, [email protected]
PhD - North Carolina State University
Sep 10, 2004

Overview
- Introduction
- Adaptive Teaching and Data Mining
- Student Model Extraction
- Conclusions & Future Work

Research challenge
- Turn the computer into a private tutor
- Diagnose and correct misconceptions
- Diagnosis tolerates careless errors & guesses
- Build a scientific approach to improving computer-based education
- Build in fault tolerance, robustness
- Optimize for student performance
- Optimize teaching strategies for effectiveness

The Problem
- Students take a tutorial and quiz online
- Determine what students know
- Redirect students to new/repeat material

Adaptive Tutorial Flow
[Flow diagram. Components: Question Engine, Diagnostic Engine, Concept Model, Teaching Strategy, student. Steps: ask questions, student responds, determine concept state, determine learning path, select new material.]

Data mining for knowledge
- Student behavior: known
- Student contents: unknown
- Assume contents affect behavior

Knowledge & student model
- Concepts
- Tutorial questions
- Student concepts
- Student responses
- Goal: Mine to extract student concepts

Data mining & adaptive teaching
- Problem understanding: effective direction of student learning
- Data understanding: data from online tutorials
- Data preparation: select relevant variables
- Modeling: Q-matrix, cluster, factor
- Evaluation of results: misconceptions diagnosed?
References: Data Mining Server @ http://dms.irb.hr/tutorial

How the model works
[Diagram labels: Tutorial & Questions, Teaching Strategy]
- Student response: 11100
- Q-matrix: 00011, 10010
- The response is matched against the predicted responses:
    01100  Err: 1
    01101  Err: 2
    11100  Err: 0
    11111  Err: 2
- The closest predicted response (Err: 0) belongs to concept state 01, so the student understands concept 2 but not concept 1.
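The matching step on this slide can be sketched in a few lines of Python. This is an illustration only, not the author's implementation: the function names are hypothetical, and the rule used to predict responses (a question is answered correctly only when every concept it requires is understood) is an assumption, chosen because it reproduces the four predicted responses listed on the slide.

```python
from itertools import product

# Q-matrix rows as printed on the slide: one row per concept, one column per question.
Q_MATRIX = ["00011", "10010"]


def ideal_response(state, q_matrix):
    """Predicted response for a concept state (tuple of 0/1, one entry per row).

    Assumed rule: question j is answered correctly only if every concept
    with a 1 in column j is understood in `state`.
    """
    bits = []
    for j in range(len(q_matrix[0])):
        ok = all(known or row[j] == "0" for known, row in zip(state, q_matrix))
        bits.append("1" if ok else "0")
    return "".join(bits)


def hamming(a, b):
    """Number of positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def diagnose(response, q_matrix):
    """Return (error, state) for the concept state whose prediction is closest."""
    states = product((0, 1), repeat=len(q_matrix))
    scored = [(hamming(response, ideal_response(s, q_matrix)), s) for s in states]
    return min(scored)  # ties are broken by state order


error, state = diagnose("11100", Q_MATRIX)
print(error, state)
```

Here `state` is a tuple with one entry per Q-matrix row, in row order; a careless error or guess simply adds to the error count without changing which state is nearest, unless it moves the response closer to another prediction.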

How the model works-2
- Concept state: a bit string that describes understanding
- Concept state 01: understands concept 2 but not concept 1
- Q-matrix: concepts v. questions
- Each state has an "ideal response vector" (IDR) computed from the Q-matrix

Binary Q-matrix example

        q1  q2  q3  q4  q5
Con1     0   0   0   1   1
Con2     1   0   0   1   0

Concept state   IDR
00              01100
01              11100
10              01101
11              11111

Research questions
- Are Q-matrix models interpretable?
- What factors affect Q-matrix extraction?
- How well does the Q-matrix method compare with other data mining methods?

Results on simulated students
- Brewer tested 2 Q-matrix extraction methods on simulated data: ideal students plus noise in the ideal response vectors
- The Q-matrix method needs few students for high noise tolerance; factor analysis needs many more
References: Brewer 1996. NCSU Masters Thesis.

Student model extraction
- Q-matrix, factor, and cluster models
- Compared for error on student data sets
- Q-matrix and cluster also compared by maps and by cluster convergence

Q-matrix model
- Assumes concepts underlie questions
- Students are in "concept states" C:
    C1 = 1: understands concept 1
    C2 = 0: doesn't get concept 2
- For each state, compute the IDR
- Assign students to the state with the closest IDR

Q-matrix creation
Until convergence criterion met:
1. Increment number of concepts
2. Create random q-matrix
3. Fill concept states & compute error
4. Vary q-matrix
5. Fill concept states & compute error
6. Repeat steps 4-5 until error not improving
7. Repeat steps 2-6 to avoid local minima

Factor analysis model
- Each tutorial question is a variable
- Create covariance matrix for the variables
- Derive eigenvectors/eigenvalues to explain most of the variance in the covariance matrix
- Assumes that linear combinations of the variables will be able to explain the variables
- Eigenvectors are rotated

Cluster analysis model
- Answer vectors treated as points in space
- Iterate until convergence:
    Choose random seeds from the data set
    Assign vectors to nearest seed
    Set new seeds to cluster medians
- Similar to the q-matrix method, where the ideal response vectors play the role of seeds

Q-matrix vs. Factor Analysis
- CFA generated 4 factors/matrix
- Compared to q-matrix with 4 concepts
- Factor matrix converted to 0/1: threshold of 0.3 -> 1, less -> 0
- Factor matrix used as q-matrix
- Error computed for both
- Q-matrix performed significantly better (at least 19% less error per student) on all 14 problems
- Smallest difference in performance when there is a large amount of variance in student answers

Q-matrix and factor errors per student
[Chart: errors per student (y-axis, 0 to 3) for the Factor and Q-matrix models across experiments Binq1, Binq2, Binq3, Count, Pf1 to Pf10.]

Ratio of q-matrix to factor error and relative # of distinct observations
[Chart: series "# diff ans/max" and "ratio q/fac" (y-axis, 0 to 1.2) across experiments Binq1, Binq2, Binq3, Count, Pf1 to Pf10.]

Q-matrix vs. Cluster Analysis
- Cluster analysis does not map to a q-matrix as factor analysis does
- However, q-matrices do form clusters of students in the same concept state
- Ran cluster analysis with the same number of clusters as the q-matrix
- Similar clusters generated by both

Clustering comparisons
- Determine equivalent concept state & cluster groupings (by largest overlap); these are in BOLD
- Count elements NOT in overlaps
- Overall diff = total NOT overlapping / total elements

Proof 8 Q-matrix Cluster Comparison
[Figure annotated "6/15 clus different". Labels in the figure: Con 0-4, Con 1-444, Con2-35, Con3-777; 105,205,305; 231; 274; 14,15; 16; 402,441,446,622; 546,646,744.]

Differences in cluster overlap
[Chart: ratio of different to total cluster assignments (y-axis, 0 to 0.6) for experiments b1, b2, b3, ct, p1 to p10.]

Q-matrix vs. Cluster Analysis 2
- Each cluster has a "seed"
- Distances from seeds determine cluster membership
- For each cluster, summed differences between seeds & answer vectors
- Total error less than that of q-matrix clusters for all experiments

Q-matrix vs. Cluster Analysis 3
- Why is total error less for clusters?
- Because we force the IDRs in the q-matrix method to be based on concepts
- This yields higher errors but more help in directing teaching strategies

Q-matrix v. Clusters Summary
- If we used cluster results, how would we determine what to do for each student after the analysis?
- Cluster and q-matrix analyses could be used together for large data sets
- Important: student outcomes

Conclusions
- Full automation of economically expandable adaptive teaching system
- Method for diagnosis of misconceptions
- Q-matrix model interpretable by humans
- Q-matrix outperforms factor analysis in student modeling
- Q-matrix forms clusters similar to those in cluster analysis

Future Work
- Any lesson can be augmented with a diagnostic engine
- Different teaching strategies can be compared
- Apply Q-matrix method to benchmark data mining datasets
- Perform detailed time analysis and determine improvements
- Cross-validation tests to determine accuracy of model
- Missing data adaptations

Thank you!
Email: [email protected]
This work was partially supported by NSF grants #9813902 and #0204222.

How the model works-2
- Student takes quiz
- Assigned to state with nearest IDR
- Error determined from difference between IDR & response
- Q-matrices varied until error over all students is minimized

Manual concept mapping
- Expert analysis of algebra tasks into rules
- Evolved into Q-matrix
- Relationship between questions & concepts
- Applications:
    Student assessment
    Group performance measure
    Finding new rules (student innovations)
References: Birenbaum, et al. 1993, Tatsuoka 1983.

Prediction of student data
- Hubal found that randomly generated rules were better predictors of student data than Tatsuoka's Q-matrix
- This suggests that student data should be used to generate dynamic Q-matrices
- Mining for what the students know!
References: Hubal 1992. NCSU Masters Thesis.
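Generating a Q-matrix from student data, along the lines of the creation steps listed earlier (random start, vary the q-matrix, keep changes that lower the total error, restart to avoid local minima), might look roughly like the sketch below. It is a simplified illustration under stated assumptions, not the author's implementation: the number of concepts is fixed rather than incremented, the "vary" step is a single-bit flip on a binary q-matrix, the restart count is arbitrary, and all function names are hypothetical.

```python
import random
from itertools import product


def ideal_response(state, q):
    """IDR for a concept state: question j is correct only if every
    concept required by column j of q is understood (assumed rule)."""
    return tuple(
        int(all(known or row[j] == 0 for known, row in zip(state, q)))
        for j in range(len(q[0]))
    )


def total_error(q, responses):
    """Sum over students of the distance to the nearest IDR."""
    idrs = [ideal_response(s, q) for s in product((0, 1), repeat=len(q))]
    return sum(
        min(sum(a != b for a, b in zip(r, idr)) for idr in idrs)
        for r in responses
    )


def extract_q_matrix(responses, n_concepts, restarts=20, seed=0):
    """Hill-climb from several random q-matrices; return (best_q, best_error)."""
    rng = random.Random(seed)
    n_questions = len(responses[0])
    best_q, best_err = None, float("inf")
    for _ in range(restarts):  # restarts guard against local minima
        q = [[rng.randint(0, 1) for _ in range(n_questions)]
             for _ in range(n_concepts)]
        err = total_error(q, responses)
        improved = True
        while improved:  # stop when no single flip lowers the error
            improved = False
            for c in range(n_concepts):
                for j in range(n_questions):
                    q[c][j] ^= 1  # vary the q-matrix by flipping one entry
                    new_err = total_error(q, responses)
                    if new_err < err:
                        err, improved = new_err, True
                    else:
                        q[c][j] ^= 1  # undo a flip that did not help
        if err < best_err:
            best_q, best_err = [row[:] for row in q], err
    return best_q, best_err
```

Each response is a tuple of 0/1 answers, one per question. Because the error is a non-negative integer that strictly decreases on every accepted flip, each climb terminates; whether the best q-matrix found is globally optimal depends on the restarts.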

Knowledge Assessment
- Comparison with expert models
- Remediation
- Tutorial effectiveness

Remediation
- Analyze student states and apply a teaching strategy to direct the next step
- Process: Find the least-understood concept, and have the student retake the first lesson related to that concept

Remediation results
- Self-guided choices compared with q-matrix choices
- Less than half of self-guided students chose differently
- Exam performance: q-predicted equal or worse than self-chosen
- Conclusion: q-matrix remediation at least as good as student-chosen remediation
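The remediation process described above (find the least-understood concept, then send the student back to the first lesson for it) can be sketched as a small selection function. The data layout here is an assumption made for illustration: the concept state is a list of per-concept understanding values between 0 and 1, and `lessons_by_concept` is a hypothetical mapping from concept index to its lessons in teaching order.

```python
def choose_remediation(concept_state, lessons_by_concept):
    """Return (concept index, lesson) for the least-understood concept.

    concept_state: per-concept understanding values, lower = less understood.
    lessons_by_concept: hypothetical map from concept index to ordered lessons.
    Ties go to the lowest concept index.
    """
    weakest = min(range(len(concept_state)), key=lambda c: concept_state[c])
    return weakest, lessons_by_concept[weakest][0]


# Example with made-up lesson names.
lessons = {0: ["lesson_1"], 1: ["lesson_3", "lesson_5"], 2: ["lesson_4"]}
print(choose_remediation([0.9, 0.2, 0.6], lessons))
```

With a binary concept state the same function picks the first concept that is not understood.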