Informatiseringscentrum

Patterns in Usage Data
Victor Maijer, University of Amsterdam
2 June 2006, Vancouver

Overview
- Introduction
- Data Mining
- Results
- Sakai & DM
- Conclusion

Introduction
• UvA founded in 1632 (Athenaeum Illustre)
• 7 schools (faculties), 1518 study programmes
• 25,000 students, 3,500 employees (2,000 academic staff)
• Blackboard has been our VLE since 1999, with 13,000 users per day
• We run OSP and regard Sakai as a potential successor to Blackboard

Strategic Information
Stakeholders need strategic information in order to make decisions.
Stakeholders are:
• Instructors
• Administrators
• Management
• Support
• Etc.

Data Warehouse
• Provides an integrated and total view of learning/collaboration systems
• Makes the systems' current and historical information easily available for decision making
• Makes decision-support transactions possible without hindering operational systems
• Presents a flexible and interactive source of strategic information

Architecture

Info for Administrators & Management

Why I went mining
• I had data, a lot
• I did it before
• I wanted to do some fun stuff
Official reason (the one I tell my boss):
• We needed strategic information about how our VLE evolved

What is Data Mining?
• Data mining is the extraction of implicit, previously unknown, and potentially useful information from data.
• Clustering is a data mining technique that applies when instances are to be divided into natural groups.
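The clustering definition above can be made concrete with a minimal sketch. It assumes plain k-means on a single attribute (the slides do not say which clustering algorithm was used) and borrows the document counts from the example that follows. The function name `kmeans_1d` and the evenly spread starting centroids are illustrative choices, not part of the original talk.

```python
# Document counts per course, taken from the example slide.
course_docs = {"ABBA": 36, "BEATLES": 4, "COLDPLAY": 30, "DARKHORSES": 2, "ELASTICA": 24}


def kmeans_1d(values, k=2, max_iter=100):
    """Plain k-means on a dict of name -> number (k >= 2).

    Returns a list of (centroid, [names]) pairs, one per cluster.
    """
    lo, hi = min(values.values()), max(values.values())
    # Deterministic start (an assumption): spread centroids evenly from min to max.
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: each course goes to its nearest centroid.
        groups = [[] for _ in range(k)]
        for name, v in values.items():
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(name)
        # Update step: each centroid becomes the mean of its group.
        new_centroids = [
            sum(values[n] for n in g) / len(g) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return list(zip(centroids, groups))


for centroid, members in kmeans_1d(course_docs):
    print(f"{', '.join(members)}: average {centroid:g} documents")
```

On this toy data the two groups come out with averages of 3 and 30 documents, matching the example. Real usage data has many attributes per site, so a tool such as Weka would normally do this step.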
Example

Course        Documents
ABBA          36
BEATLES       4
COLDPLAY      30
DARKHORSES    2
ELASTICA      24

Group   Members                     Average Docs
A       ABBA, COLDPLAY, ELASTICA    30
B       BEATLES, DARKHORSES         3

Procedure
• Determine mining questions
• Determine source (tables)
• Verify by changing items via GUI
• Identify needed output formats for analysis
• Define SQL queries
• Program scripts (Perl)
• Determine which clustering techniques you want to apply
• Analyze (cluster)
Weka is an excellent open-source Java tool for data mining: http://www.cs.waikato.ac.nz/ml/weka/

Domains clustered
• Course sites and their content
• Users (instructors)
• Sessions (students)

Site clusters
Basic usage (content + announcements)
Extended usage

Cluster   Size (%)   N
A         87         1547
B         7          122
C         4          66
D         2          43

[Figure: two bar charts per cluster (A to D); legend: DiscussionFora, Content, Announcement, Gradebook, Tests, Groups]

Content clusters

Cluster   Size (%)   N
A         91         1636
B         3          62
C         3          57
D         3          45

[Figure: bar chart per cluster (A to D); legend: Test, Assignment, Document, External Link, Folder]

Instructor activity clusters

Cluster   Size (%)   N
A         88         1443
B         7          115
C         4          61
D         1          15

[Figure: bar chart per cluster (A to D); legend: Announcements, Content, Dropbox, DiscussionBoard, Gradebook, Test]

Student session clusters

Cluster   Size (%)   N
A         91         1294K
B         6          90K
C         2          32K

[Figure: bar chart of Clicks and Dur(min) per cluster (A to C); data labels shown: 171.45, 63.4, 25.5, 29.7, 8.7, 3.8]

Extra
• Female students click significantly more than male students and have significantly longer sessions
• Any ideas?
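The student session clusters describe each session by two numbers, clicks and duration in minutes. A minimal sketch of how such features might be derived from a click log, before any clustering, is shown here. The log layout (session id plus ISO timestamp), the sample rows, and the function name `session_features` are all assumptions made for illustration; they are not the Blackboard schema or the Perl scripts used in the study.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical click log: (session_id, ISO timestamp). Invented sample rows.
clicks = [
    ("s1", "2006-03-01T09:00:00"),
    ("s1", "2006-03-01T09:01:30"),
    ("s1", "2006-03-01T09:04:00"),
    ("s2", "2006-03-01T10:00:00"),
    ("s2", "2006-03-01T10:30:00"),
]


def session_features(rows):
    """Return {session_id: (click_count, duration_in_minutes)}."""
    stamps_by_session = defaultdict(list)
    for session_id, stamp in rows:
        stamps_by_session[session_id].append(datetime.fromisoformat(stamp))
    features = {}
    for session_id, stamps in stamps_by_session.items():
        # Duration is taken as the time between the first and last click.
        duration_min = (max(stamps) - min(stamps)).total_seconds() / 60
        features[session_id] = (len(stamps), duration_min)
    return features


for session_id, (n_clicks, minutes) in session_features(clicks).items():
    print(f"{session_id}: {n_clicks} clicks, {minutes:g} min")
```

Measuring duration from first to last click undercounts the time spent on the final page, which is one reason logged session lengths should be read with care.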
Sakai & Data mining
• Our UvA pilots were too small to analyze
• Content can be clustered
• Events are difficult to cluster (not enough logging compared to Blackboard)

Implications
• Put rumours into perspective
• Differentiate by user group
  – Support
  – Functionality

Conclusion
• Methods
  – Clustering can be used to discover usage patterns
  – You need appropriate hardware for preprocessing and clustering
• Results
  – Basic usage (documents & announcements)
  – Duration of a session is a couple of minutes
  – Extended usage grows but is limited
• Sakai needs more logging if it wants to compete with Blackboard
• A Sakai warehouse would be nice

Evolvement
[Figure: chart labelled Users and Usage]