A Decision Support System to Improve e-Learning Environments
Marta Zorrilla, Diego García, Elena Álvarez
University of Cantabria, Spain

Motivation
• Nowadays most universities and educational centres offer e-learning courses
• BUT, due to the lack of face-to-face contact, instructors have real difficulty following the pace and progress of their students
• E-learning platforms provide some tools to monitor and track student activity
  ◦ Poor information
  ◦ Difficult to get a clear view of each student
[Figure labels: MATEP, Data warehouse]

Goal
• Develop a module to answer questions such as
  OLAP
  ◦ When do students connect to the system?
  ◦ Do they work online?
  ◦ How often do they use collaborative tools? Which one?
  ◦ Who leaves the course and when?
  DATA MINING
  ◦ Knowing students' profiles according to demographic and navigation information
  ◦ Grouping students according to their style of learning
  ◦ Which tools do they use together?
  ◦ Knowing drop-out students' profile
• Instructors can obtain models without requiring data mining knowledge

How to get it
• Defining templates for each question
  ◦ choose the variables
  ◦ specify how to obtain them (ETL)
  ◦ determine the data mining algorithm
  ◦ establish parameters of the algorithm
• Two main difficulties
  ◦ Choosing parameters of algorithms
  ◦ Showing the results in an easy and understandable way (GUI)

Web Service Architecture
[Figure: Web Service Architecture. Labels: Blackboard module, ..., Moodle module, Wrapper, Templates DB (XML schemas), Choose template, Generate XML file, Send data XML file, XML schema validation, Data validation, Parameters selection, Data Mining algorithms, View results, Visualize results, Refine results, Modify parameters]

Proposal of patterns: Student profile
◦ Variables: gender, age, number of sessions, time spent, average sessions per week, average time spent per week.
◦ Algorithms:
  ▪ EM (Expectation-Maximization) / SOM → number of clusters
  ▪ K-means

                 Average   Cluster 0   Cluster 1   Cluster 2
Age              22        24          22          21
Gender           Male      Male        Female      Male
TotalTime        1976      1313        2290        2085
Sessions         128       80          146         141
AvgTimeWeek      115       76          134         122
AvgSessionWeek   7         4           8           7
Instances                  26% (10)    44% (17)    31% (12)

Proposal of patterns: Resources used together
◦ Variables: sessionID, boolean variable for each resource used in the course.
◦ Algorithms:
  ▪ Apriori
  ▪ Borgelt (more efficient and lower number of rules)

DISCUSSION <- NoASSIGNEMENT NoCONTENT (43.9, 71.2)
DISCUSSION <- NoASSIGNEMENT NoCONTENT NoMAIL (32.5, 76.6)
DISCUSSION <- NoASSIGNEMENT NoCONTENT ORGANIZER NoMAIL (21.8, 70.5)
NoASSIGNEMENT <- DISCUSSION NoCONTENT NoMAIL (35.3, 70.5)
NoCONTENT <- DISCUSSION (61.3, 73.2)
NoCONTENT <- DISCUSSION NoASSIGNEMENT (39.7, 78.7)
NoCONTENT <- DISCUSSION NoASSIGNEMENT NoMAIL (32.4, 76.9)
NoCONTENT <- DISCUSSION NoASSIGNEMENT ORGANIZER (26.9, 70.3)
NoCONTENT <- DISCUSSION NoMAIL (49.3, 71.6)
NoMAIL <- (100.0, 80.6)
ORGANIZER <- (100.0, 80.3)
… 57 rules

Proposal of patterns: Session profile
◦ Variables: time spent in session, hits and time spent in content pages, hits and time spent in collaborative resources (mail, discussion, chat) and in the rest of the course resources.
◦ Algorithms:
  ▪ EM (Expectation-Maximization) / SOM → number of clusters
  ▪ K-means

Session profile
                          Average    Cluster 0   Cluster 1   Cluster 2   Cluster 3
SessionTime               14.0658    6.0482      27.0222     51.8116     66.7015
hit_mail                  0.6873     0.6191      2.2519      0.9799      0.6741
hit_discussion            8.9338     7.1086      23.9259     17.4246     16.9726
hit_chat                  0.0625     0.0021      2.3185      0.0402      0.0373
hit_contentpage           1.4481     0.6111      2.363       0.8769      11.5572
hit_assignments           1.1112     0.5813      3.3111      6.2739      1.4975
hit_weblinks              0.0672     0.0184      0.6222      0.0804      0.4428
hit_organizer             2.3489     1.5293      3.9259      3.1206      10.7015
hit_learningobjectives    0.1315     0.0955      0.7333      0.1834      0.301
hit_other                 1.0856     0.7269      5.3333      3.4648      1.5249
time_mail                 0.725      0.5591      4.1852      1.2965      0.9502
time_discussion           3.1068     1.9746      5.8667      9.0101      9.6592
time_chat                 0.0018     0           0.0741      0           0
time_contentpage          4.9652     1.7227      5.8963      3.2613      44.5
time_assignments          2.9017     0.6796      5.3111      27.7412     3.6517
time_weblinks             0.0321     0.0116      0.6296      0.0226      0.0821
time_organizer            0.6511     0.2741      1.1037      0.9573      4.6318
time_learningobjectives   0.0178     0.0148      0.0444      0.0201      0.0423
time_other                1.201      0.501       2.5407      8.3367      1.9254
Instances                            83% (4731)  2% (135)    7% (398)    7% (402)

Conclusions
• In the instructor's opinion, the models allow her
  ◦ to gain insight into her students and the use of the course.
  ◦ to validate or refute hypotheses used in the design of the learning process.
• It is necessary to show the results in a more intuitive way which allows instructors to interpret them easily

Key issues
• Towards Data Mining without parameters:
  ◦ Choosing algorithms and their parameters
• Data Mining models visualization:
  ◦ Showing the results in an easy and understandable way
    ▪ Java 2D/3D.
    ▪ Wrapper (Matlab, Mathematica, IRIS explorer, Graphviz, …)

Software architecture
[Figure labels, continued below]
◦ WEB Client: JavaScript / AJAX, DHTML / XML, CSS / XSLT
◦ Application Server: JSP / Servlets
◦ DB Access: ODBC / JDBC
◦ XML Parser: DOM ….
◦ Blackboard: SQL Server
◦ Moodle: MySQL
◦ Data mining
◦ Java EE v.5
◦ Display

Criteria for association results
- a sufficient number of rules: neither 3 nor 100, around 10; if the user wants more, more specific ones are generated
- no semantic repetition, i.e. the rules must be genuinely different (eggs and rice -> tomato, eggs -> tomato), both with very similar confidence (boost: more than 1% difference between one rule and another)
- a single attribute in the consequent is easier to read
- a small number of attributes in the antecedent (3 to 4), otherwise very difficult to interpret (perhaps in e-learning; in market-basket analysis, by contrast, it could make sense... depending on the initial number of attributes...
- no need to set input parameters (support, confidence, boost or lift, ...)
Borgelt and Balcazar !!! -> use positives only, or positives and negatives; if only positives are used, support and confidence will have to be lowered
Suggest removing the resource through which the tool is accessed
Predictive apriori…
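
The display criteria above can be sketched as a post-filter over mined rules. This is only an illustration, not the authors' implementation: the names Rule, select_rules, max_antecedent, min_boost and max_rules are hypothetical, the 1-point boost threshold is one reading of the note, and the pair printed after each rule on the "Resources used together" slide is assumed to be (support %, confidence %).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    consequent: frozenset
    antecedent: frozenset
    support: float      # percent (assumed meaning of the first number)
    confidence: float   # percent (assumed meaning of the second number)


def select_rules(rules, max_antecedent=4, min_boost=1.0, max_rules=10):
    """Keep a short, non-redundant list of rules for display."""
    # Criteria: one attribute in the consequent, a short antecedent.
    candidates = [
        r for r in rules
        if len(r.consequent) == 1 and len(r.antecedent) <= max_antecedent
    ]

    def is_redundant(rule):
        # A rule repeats the semantics of a more general rule (same
        # consequent, antecedent a strict subset) unless it raises the
        # confidence by more than min_boost percentage points.
        return any(
            other.consequent == rule.consequent
            and other.antecedent < rule.antecedent
            and rule.confidence - other.confidence <= min_boost
            for other in candidates
        )

    kept = [r for r in candidates if not is_redundant(r)]
    # Criterion: show around ten rules, highest confidence first.
    kept.sort(key=lambda r: r.confidence, reverse=True)
    return kept[:max_rules]


def rule(consequent, antecedent, support, confidence):
    return Rule(frozenset([consequent]), frozenset(antecedent),
                support, confidence)


# The five NoCONTENT rules from the "Resources used together" slide.
mined = [
    rule("NoCONTENT", ["DISCUSSION"], 61.3, 73.2),
    rule("NoCONTENT", ["DISCUSSION", "NoASSIGNEMENT"], 39.7, 78.7),
    rule("NoCONTENT", ["DISCUSSION", "NoASSIGNEMENT", "NoMAIL"], 32.4, 76.9),
    rule("NoCONTENT", ["DISCUSSION", "NoASSIGNEMENT", "ORGANIZER"], 26.9, 70.3),
    rule("NoCONTENT", ["DISCUSSION", "NoMAIL"], 49.3, 71.6),
]

selected = select_rules(mined)
for r in selected:
    print(sorted(r.consequent)[0], "<-", " ".join(sorted(r.antecedent)),
          (r.support, r.confidence))
```

Under these assumptions only two of the five rules survive: the general rule with DISCUSSION alone, and the one that adds NoASSIGNEMENT, which raises confidence by 5.5 points. Comparing every rule against all more general candidates, rather than only against rules already kept, is a design choice; the notes do not say which the authors intended.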