International Journal of Electronics and Computer Science Engineering
Available Online at www.ijecse.org
ISSN 2277-1956

Understanding Educational Data Mining (EDM)

Mr. Suhas G. Kulkarni 1, Mr. Ganesh C. Rampure 2, Mr. Bhagwat Yadav 3
1,2,3 Sinhgad School of Computer Studies, (Solapur, Maharashtra) India
1 [email protected] 2 [email protected] 3 [email protected]

Abstract- Educational Data Mining (EDM) has recently attracted a great deal of attention from researchers. It is an emerging multidisciplinary research area. EDM is the process of discovering useful information from raw data generated and collected by educational systems, information which can then be used by different stakeholders. Within EDM, different techniques and methods can be developed for exploring data originating from various educational information systems. Owing to the growing availability of educational data, it is a rich application area for data mining as well as for the learning sciences. EDM contributes to the study of how students learn and of the settings in which they learn. It enables data-driven decision making for improving current educational practice and learning material. The objective of this paper is to present a brief overview of EDM and to observe the development of the field.

Keywords – Educational Data Mining, KDD, EDM, ITS, LMS.

I. INTRODUCTION

Data mining, also known as Knowledge Discovery in Databases (KDD), is the field of discovering novel and potentially useful information from huge amounts of data. Data mining has been applied in a number of fields. In recent years, there has been increasing interest in the use of data mining to investigate scientific questions within educational research, an area of inquiry termed educational data mining.
Educational data mining (also referred to as “EDM”) is defined as the area of scientific inquiry centered around the development of methods for making discoveries within the unique kinds of data that come from educational settings, and using those methods to better understand students and the settings which they learn in.[1]

Recently, EDM has been emerging as a multidisciplinary research area owing to the introduction of new technologies. Substantial growth has been observed in the use of interactive learning environments, intelligent tutoring systems (ITSs), educational hypermedia systems and learning management systems (LMSs). Examples include general-purpose LMSs such as Moodle, specialized ITSs such as SQL Tutor, professional education and training systems such as simulators, and eHealth and patient education systems such as Philips Motiva. At the same time, the wider use of information and communication technologies in education has allowed the collection of huge amounts of data.[2] EDM aims to discover useful information from the huge amount of electronic data collected by these educational systems. The survey of educational data mining by C. Romero and S. Ventura provides an overview of the EDM process.[3]

Overview of EDM Process

An important and unique feature of educational data is that they are hierarchical. Data at the keystroke level, the answer level, the session level, the student level, the classroom level, the teacher level, and the school level are nested inside one another (Baker 2011; Romero and Ventura 2010).

ISSN 2277-1956/V2N2-773-777 IJECSE, Volume 2, Number 2 Milind Kansara et al.

An international conference dedicated to the field of EDM has been held since 2008.
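The hierarchical nesting of educational data described above can be made concrete with a small sketch. The Python below is only an illustration under assumed names: the classes `Answer`, `Session` and `Student` and the helper `student_accuracy` are hypothetical and are not taken from any EDM system. They model three of the levels (answer, session, student) and show how answer-level records can be rolled up to the student level.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Answer:
    """Answer-level record (hypothetical fields)."""
    item_id: str
    correct: bool


@dataclass
class Session:
    """Session-level record: answers nested inside one session."""
    session_id: str
    answers: List[Answer] = field(default_factory=list)


@dataclass
class Student:
    """Student-level record: sessions nested inside one student."""
    student_id: str
    sessions: List[Session] = field(default_factory=list)


def student_accuracy(student: Student) -> float:
    """Roll answer-level data up to the student level (fraction correct)."""
    answers = [a for s in student.sessions for a in s.answers]
    if not answers:
        return 0.0
    return sum(a.correct for a in answers) / len(answers)


student = Student(
    "s1",
    [
        Session("a", [Answer("q1", True), Answer("q2", False)]),
        Session("b", [Answer("q3", True), Answer("q4", True)]),
    ],
)
print(student_accuracy(student))  # 3 of 4 answers correct -> 0.75
```

Classroom, teacher and school levels could be added in the same way, each holding a list of the level below it. An analysis that ignores this nesting treats answers from the same student as if they were independent observations.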
The five aspirations for Educational Data Mining proposed by Bob Dolan and John Behrens (EDM 2012) each begin with the statement “We hope that Educational Data Mining will...”
1) Consider the broad range of the social and organizational aspects of education and its administration, including informal and ubiquitous learning;
2) Consider the broad range of inputs of digital artifacts that feed into the design of learning systems (not just the outcomes of system interactions);
3) Consider data mining as a human endeavour which is itself a proper topic of psychological, sociological and other academic disciplines;
4) Remember the fundamentals of quality data analysis regardless of computational techniques;
5) Provide information that celebrates the diversity of effective pedagogies and supports learning by the outliers, hidden clusters, and otherwise missed special groups of people that are lost in the averages or other insensitive aggregations. [4]

II. DATA MINING IN EDUCATION SYSTEM: OBJECTIVE

Data mining can be applied mainly in two different types of education system [3], i.e. the traditional classroom and distance education. These systems have different objectives and use different data sources, so different data mining techniques may need to be applied to each of them. Table I gives an overview of these systems with their key concepts.

TABLE I
Sr. No. | Type of education system | Key Concept
1 | Traditional class room | Face-to-face contact between educators and students, organized through lectures.
2 | Distance education | Techniques and methods providing access to educational programs for students who are separated by time and space from lecturers.
3 | Particular Web based Course | Specific courseware that uses standard HTML (Hyper Text Markup Language).
4 | Learning Content Management System | Offers a great variety of channels and workspaces to facilitate information sharing and communication between participants in a course, lets educators distribute information to students, produce content material, prepare assignments and tests, engage in discussions, manage distance classes and enable collaborative learning with forums, chats, file storage areas and news services.
5 | Adaptive and intelligent web-based educational systems | Provides an alternative to the traditional just-put-it-on-the-web approach in the development of web-based educational courseware.

Educational data mining researchers (e.g., Baker 2011; Baker and Yacef 2009) use five categories of technical methods to accomplish the goals of their research.[5,6]
1. Prediction - entails developing a model that can infer a single aspect of the data (predicted variable) from some combination of other aspects of the data (predictor variables).
2. Clustering - refers to finding data points that naturally group together and can be used to split a full dataset into categories.
3. Relationship mining - involves discovering relationships between variables in a dataset and encoding them as rules for later use.
4. Distillation for human judgment - is a technique that involves depicting data in a way that enables a human to quickly identify or classify features of the data.
5. Discovery with models - is a technique that involves using a validated model of a phenomenon (developed through prediction, clustering, or manual knowledge engineering) as a component in further analysis.[3]

III. DATA MINING PREPROCESSING AND DATA MINING TECHNIQUES

Data pre-processing transforms the original data into a shape suitable for use by a particular mining algorithm.
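As a minimal sketch of one such transformation, the Python below splits each user's time-ordered page references into sessions. The record layout `(user, timestamp, page)`, the function name `split_sessions` and the 30-minute inactivity timeout are assumptions made for illustration only; the source does not prescribe a log format or a timeout value.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

LogEntry = Tuple[str, datetime, str]  # (user, timestamp, page): assumed layout


def split_sessions(
    log: List[LogEntry], timeout: timedelta = timedelta(minutes=30)
) -> Dict[str, List[List[str]]]:
    """Group each user's page references into sessions.

    A new session starts whenever the gap since the user's previous
    request exceeds `timeout` (an assumed heuristic, not from the source).
    """
    sessions: Dict[str, List[List[str]]] = defaultdict(list)
    last_seen: Dict[str, datetime] = {}
    # Sort by user, then by time, so each user's requests are consecutive.
    for user, ts, page in sorted(log, key=lambda e: (e[0], e[1])):
        if user not in last_seen or ts - last_seen[user] > timeout:
            sessions[user].append([])  # open a new session for this user
        sessions[user][-1].append(page)
        last_seen[user] = ts
    return dict(sessions)


log = [
    ("u1", datetime(2013, 1, 1, 9, 0), "intro.html"),
    ("u1", datetime(2013, 1, 1, 9, 10), "quiz1.html"),
    ("u1", datetime(2013, 1, 1, 11, 0), "forum.html"),
    ("u2", datetime(2013, 1, 1, 9, 5), "intro.html"),
]
result = split_sessions(log)
print(result)
```

In this toy log, user `u1` ends up with two sessions because of the long gap before the last request, while `u2` has a single session. A real log would first need cleaning and user identification before this step could be trusted.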
Before a data mining algorithm is applied, a number of general data pre-processing tasks have to be addressed (Koutri et al., 2004; Zorrilla et al., 2005):
1. Data cleaning - removing irrelevant items and log entries that are not needed for the mining process, such as scripts, graphics, etc.
2. User identification - associating page references with the connected user.
3. Session identification - taking all of the page references for a given user and course in a log and breaking them up into user sessions.
4. Path completion - filling in page references that are missing due to browser and proxy server caching.
5. Transaction identification - breaking down sessions into smaller units, referred to as transactions or episodes.
6. Data transformation and enrichment - calculating new attributes from the existing ones, converting numerical attributes into nominal attributes, providing meaning to references contained in the log, etc.
7. Data integration - integrating and synchronizing data from heterogeneous sources.
8. Data reduction - reducing data dimensionality.

Along with these tasks, some special issues concerning web-based systems also have to be considered and addressed. The data mining techniques used in educational systems include statistics and visualization; web mining; clustering, classification and outlier detection; association rule mining and sequential pattern mining; and text mining.

IV. DEVELOPMENT IN THE FIELD OF EDM

In “Educational data mining: A survey from 1995 to 2005”, C. Romero and S. Ventura describe the development of the field of EDM up to 2005. Table II gives a brief idea of the major developments that took place after 2005.

TABLE II
Development in the field of EDM
Author (year) | Work carried out
Cristobal Romero, Sebastián Ventura, Pedro G. Espejo and Cesar Hervas (2008) | Data Mining Algorithms to Classify Students
Claudia Antunes (2008) | Acquiring Background Knowledge for Intelligent Tutoring Systems
Michel Desmarais, Alejandro Villarreal and Michel Gagnon (2008) | Adaptive Test Design with a Naive Bayes Framework
Hogyeong Jeong and Gautam Biswas (2008) | Mining Student Behavior Models in Learning-by-Teaching Environments
Nguyen Thai-Nghe, Tomáš Horváth and Lars Schmidt-Thieme (2011) | Factorization Models for Forecasting Student Performance
Nan Li, William Cohen, Kenneth R. Koedinger and Noboru Matsuda (2011) | A Machine Learning Approach for Automatic Student Model Discovery
Kelly Wauters, Piet Desmet and Wim Van Den Noortgate (2011) | Acquiring Item Difficulty Estimates: a Collaborative Effort of Data and Judgment
Shubhendu Trivedi, Zachary Pardos, Gábor Sárközy and Neil Heffernan (2011) | Spectral Clustering in Educational Data Mining
Annalies Vuong, Tristan Nixon and Brendon Towle (2011) | A Method for Finding Prerequisites Within a Curriculum
Vasile Rus, Arthur Graesser, Cristian Moldovan and Nobal Niraula (2012) | Automatic Discovery of Speech Act Categories in Educational Games
Shubhendu Trivedi, Zachary Pardos, Gábor Sárközy and Neil Heffernan (2012) | Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
John Kinnebrew and Gautam Biswas (2012) | Identifying Learning Behaviors by Contextualizing Differential Sequence Mining with Action Features and Performance Evolution
François Bouchet, John Kinnebrew, Gautam Biswas and Roger Azevedo (2012) | Identifying Students' Characteristic Learning Behaviors in an Intelligent Tutoring System Fostering Self-Regulated Learning
Terry Peckham and Gordon McCalla (2012) | Mining Student Behavior Patterns in Reading Comprehension Tasks
Tomas Obsivac, Lubos Popelinsky, Jaroslav Bayer, Jan Geryk and Hana Bydzovska (2012) | Predicting drop-out from social behaviour of students
Ma. Mercedes Rodrigo, Ryan S. J. D. Baker, Bruce McLaren, Alejandra Jayme and Thomas Dy (2012) | Development of a Workbench to Address the Educational Data Mining Bottleneck
Judi Mccuaig and Julia Baldwin (2012) | Identifying Successful Learners from Interaction Behaviour

V. APPLICATIONS OF DATA MINING

Educational data mining is a multidisciplinary research area; hence it is difficult to confine it to a few applications. To list a few, the primary applications of EDM are:
- Predicting student performance
- Student modelling
- Detecting undesirable student behaviours
- Analysis and visualization of data
- Providing feedback for supporting instructors
- Constructing courseware
- Planning and scheduling
- Recommendations for students
- Grouping students
- Social network analysis
- Developing concept maps

VI. CONCLUSION

Educational data mining is a young research area. It is an emerging field related to several well-established areas of research including e-learning, adaptive hypermedia, intelligent tutoring systems, web mining, data mining, etc. The application of data mining in educational systems has specific requirements not present in other domains, mainly the need to take into account pedagogical aspects of the learner and the system. Although educational data mining is a very recent research area, a substantial number of contributions have already been published. EDM brings together researchers and practitioners from computer science, education, psychometrics, statistics, psychology, etc.

REFERENCE

[1] Ryan S.J.d. Baker, “Data Mining for Education,” Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.
[2] Toon Calders, Mykola Pechenizkiy, “Introduction to the Special Section on Educational Data Mining,” SIGKDD Explorations, Volume 13, Issue 2.
[3] C. Romero, S. Ventura, “Educational data mining: A survey from 1995 to 2005,” Expert Systems with Applications 33 (2007) 135–146 (Science Direct).
[4] Bob Dolan and John Behrens, Pearson, “Five Aspirations for Educational Data Mining,” Proceedings of the 5th International Conference on Educational Data Mining, EDM 2012.
[5] R. Baker, K. Yacef, “The State of Educational Data Mining in 2009: A Review and Future Visions,” Journal of Educational Data Mining, 2009.
[6] Richard A. Huebner, “A survey of educational data-mining research,” Norwich University.
[7] Proceedings of the 1st International Conference on Educational Data Mining, EDM 2008.
[8] Proceedings of the 4th International Conference on Educational Data Mining, EDM 2011.
[9] Proceedings of the 5th International Conference on Educational Data Mining, EDM 2012.