International Journal of Electronics and Computer Science Engineering
Available Online at www.ijecse.org
ISSN- 2277-1956
Understanding Educational Data Mining (EDM)
Mr. Suhas G. Kulkarni 1, Mr. Ganesh C. Rampure 2, Mr. Bhagwat Yadav 3
1,2,3 Sinhgad School of Computer Studies, Solapur, Maharashtra, India
1 [email protected], 2 [email protected], 3 [email protected]
Abstract- Educational Data Mining (EDM) has recently attracted a great deal of attention from researchers. It is an
emerging multidisciplinary research area. EDM is the process of discovering useful information from the raw data
generated and collected by educational systems, information that can be used by different stakeholders. EDM develops
techniques and methods for exploring data originating from various educational information systems. Owing to the
growing availability of educational data, it is a rich application area for both data mining and the learning sciences.
EDM contributes to the study of how students learn and of the settings in which they learn. It enables data-driven
decision making for improving current educational practice and learning material. The objective of this paper is to
present a brief overview of EDM and to review developments in the field.
Keywords – Educational Data Mining, KDD, EDM, ITS, LMS.
I. INTRODUCTION
Data mining, also known as Knowledge Discovery in Databases (KDD), is the field of discovering novel and
potentially useful information from huge amounts of data. Data mining has been applied in a number of fields. In
recent years, there has been increasing interest in the use of data mining to investigate scientific questions within
educational research, an area of inquiry termed educational data mining. Educational data mining (also referred to as
“EDM”) is defined as the area of scientific inquiry centered around the development of methods for making
discoveries within the unique kinds of data that come from educational settings, and using those methods to better
understand students and the settings which they learn in.[1]
Recently, EDM has been emerging as a multidisciplinary research area owing to the introduction of new technologies.
Substantial growth has been observed in the use of interactive learning environments, intelligent tutoring systems
(ITS), educational hypermedia systems and learning management systems (LMS). Examples include general-purpose LMSs
such as Moodle, specialized ITSs such as SQL Tutor, professional education and training systems such as simulators,
and eHealth and patient education systems such as Philips Motiva. At the same time, the wider use of information and
communication technologies in education has allowed the collection of huge amounts of data.[2]
EDM aims to discover useful information from the huge amount of electronic data collected by these educational
systems. The survey of educational data mining by C. Romero and S. Ventura provides an overview of the EDM
process.[3]
Overview of EDM Process
ISSN 2277-1956/V2N2-773-777
IJECSE, Volume 2, Number 2
An important and unique feature of educational data is that they are hierarchical. Data at the keystroke level, the
answer level, the session level, the student level, the classroom level, the teacher level, and the school level are
nested inside one another (Baker 2011; Romero and Ventura 2010). An international conference dedicated to the field
of EDM has been held since 2008.
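The nesting of levels described above can be illustrated with a minimal Python sketch. The records, the identifiers (school_A, class_1, s01, etc.) and the helper accuracy_by_level are hypothetical and are not taken from any cited system; the sketch only shows how answer-level data might be rolled up to the student or classroom level.

```python
from collections import defaultdict

# Hypothetical answer-level records:
# (school, classroom, student, session, answered_correctly)
answers = [
    ("school_A", "class_1", "s01", "sess_1", True),
    ("school_A", "class_1", "s01", "sess_1", False),
    ("school_A", "class_1", "s01", "sess_2", True),
    ("school_A", "class_1", "s02", "sess_1", True),
    ("school_A", "class_2", "s03", "sess_1", False),
    ("school_A", "class_2", "s03", "sess_1", False),
]

# Order of nesting assumed for this sketch, outermost level first.
LEVELS = ("school", "classroom", "student", "session")


def accuracy_by_level(records, level):
    """Roll answer-level records up to the requested level of the hierarchy."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(lambda: [0, 0])  # key -> [correct, attempted]
    for record in records:
        key = record[:depth]              # e.g. (school, classroom, student)
        totals[key][0] += int(record[-1])
        totals[key][1] += 1
    return {key: correct / attempted
            for key, (correct, attempted) in totals.items()}


student_accuracy = accuracy_by_level(answers, "student")
classroom_accuracy = accuracy_by_level(answers, "classroom")
```

Because each key is a prefix of the full path through the hierarchy, the same records can be summarised at any level without restructuring the data.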
Bob Dolan and John Behrens (EDM 2012) set out five aspirations for Educational Data Mining, each of which completes
the statement “We hope that Educational Data Mining will...”
1) Consider the broad range of the social and organizational aspects of education and its administration,
including informal and ubiquitous learning;
2) Consider the broad range of inputs of digital artifacts that feed into the design of learning systems (not just the
outcomes of system interactions);
3) Consider data mining as a human endeavour which is itself a proper topic of psychological, sociological and
other academic disciplines;
4) Remember the fundamentals of quality data analysis regardless of computational techniques;
5) Provide information that celebrates the diversity of effective pedagogies and supports learning by the outliers,
hidden clusters, and otherwise missed special groups of people that are lost in the averages or other insensitive
aggregations. [4]
II. DATA MINING IN EDUCATION SYSTEM: OBJECTIVE
Data mining can be applied mainly in two different types of education system [3], i.e. the traditional classroom and
distance education. These systems have different objectives and use different data sources, so different data mining
techniques may need to be applied to each.
Table I gives an overview of these systems with their key concepts.
TABLE I
Sr. No. | Type of education system | Key Concept
1 | Traditional class room | Face-to-face contact between educators and students, organized through lectures.
2 | Distance education | Techniques and methods providing access to educational programs for students who are separated by time and space from lecturers.
3 | Particular Web based Course | Specific courseware that uses standard HTML (Hyper Text Markup Language).
4 | Learning Content Management System | Offers a great variety of channels and workspaces to facilitate information sharing and communication between participants in a course, lets educators distribute information to students, produce content material, prepare assignments and tests, engage in discussions, manage distance classes and enable collaborative learning with forums, chats, file storage areas, news services.
5 | Adaptive and intelligent web-based educational systems | Provides an alternative to the traditional just-put-it-on-the-web approach in the development of web-based educational courseware.
Educational data mining researchers (e.g., Baker 2011; Baker and Yacef 2009) use five categories of technical
methods to accomplish the goals of their research.[5,6]
1. Prediction - entails developing a model that can infer a single aspect of the data (predicted variable) from some
combination of other aspects of the data (predictor variables).
2. Clustering - refers to finding data points that naturally group together and can be used to split a full dataset into
categories.
3. Relationship mining - involves discovering relationships between variables in a dataset and encoding them as rules
for later use.
4. Distillation for human judgment - is a technique that involves depicting data in a way that enables a human to
quickly identify or classify features of the data.
5. Discovery with models - is a technique that involves using a validated model of a phenomenon (developed through
prediction, clustering, or manual knowledge engineering) as a component in further analysis.[3]
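As an illustration of the clustering category only, the following is a minimal Python sketch of the standard k-means procedure. The student feature pairs (hours spent online, quiz score), the choice of two clusters and the function name k_means are assumptions made for this example; they do not come from the studies cited above.

```python
import math


def k_means(points, k, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    # Deterministic start for the sketch: the first k points are the centroids.
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return assignment, centroids


# Hypothetical students described by (hours online per week, quiz score).
students = [(1.0, 35.0), (9.0, 88.0), (1.5, 40.0),
            (8.0, 92.0), (2.0, 30.0), (10.0, 85.0)]
labels, centers = k_means(students, k=2)
```

With this data the procedure separates the low-activity, low-score students from the high-activity, high-score students. In practice the features would normally be scaled first, since k-means is sensitive to the units of each attribute.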
III. DATA MINING PREPROCESSING AND DATA MINING TECHNIQUES
Data pre-processing allows the transformation of original data into a suitable shape to be used by a particular mining
algorithm. So, before applying the data mining algorithm, a number of general data pre-processing tasks have to be
addressed (Koutri et al., 2004; Zorrilla et al., 2005):
1. Data cleaning - removes irrelevant items and log entries that are not needed for the mining process, such as
scripts, graphics, etc.
2. User identification - associates page references with the connected user.
3. Session identification - takes all of the page references for a given user and course in a log and breaks them up
into user sessions.
4. Path completion - fills in page references that are missing due to browser and proxy server caching.
5. Transaction identification - breaks down sessions into smaller units, referred to as transactions or episodes.
6. Data transformation and enrichment - includes calculation of new attributes from the existing ones, conversion
of numerical attributes into nominal attributes, providing meaning to references contained in the log, etc.
7. Data integration - integrates and synchronizes data from heterogeneous sources.
8. Data reduction - reduces data dimensionality.
In addition, some special issues specific to web-based systems have to be considered and addressed.
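Three of the pre-processing tasks listed above (data cleaning, user identification and session identification) can be sketched in Python as follows. The log format, the 30-minute inactivity threshold and the list of ignored file suffixes are assumptions made for illustration; real LMS logs differ and the threshold is a design choice, not a value prescribed by the cited work.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)             # assumed inactivity threshold
IGNORED_SUFFIXES = (".js", ".css", ".png", ".gif")  # assumed irrelevant resources

# Hypothetical log entries: (user, timestamp, requested page)
log = [
    ("u1", "2012-03-01 10:00", "/course/intro.html"),
    ("u1", "2012-03-01 10:01", "/static/logo.png"),
    ("u1", "2012-03-01 10:05", "/course/quiz1.html"),
    ("u2", "2012-03-01 10:07", "/course/intro.html"),
    ("u1", "2012-03-01 11:30", "/course/unit2.html"),
]


def clean(entries):
    """Data cleaning: drop requests for scripts, style sheets and graphics."""
    return [e for e in entries if not e[2].endswith(IGNORED_SUFFIXES)]


def sessions_by_user(entries):
    """User and session identification: group page references per user and
    start a new session after a gap longer than SESSION_TIMEOUT."""
    sessions = {}
    last_seen = {}
    for user, stamp, page in sorted(entries, key=lambda e: (e[0], e[1])):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        if user not in last_seen or when - last_seen[user] > SESSION_TIMEOUT:
            sessions.setdefault(user, []).append([])  # open a new session
        sessions[user][-1].append(page)
        last_seen[user] = when
    return sessions


user_sessions = sessions_by_user(clean(log))
```

Here the user field is assumed to be already present in each entry; when only IP addresses or cookies are logged, user identification needs additional heuristics that this sketch does not cover.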
The data mining techniques used in educational systems include statistics and visualization, web mining,
clustering, classification and outlier detection, association rule mining and sequential pattern mining, text mining,
etc.
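To make one of these techniques concrete, the following minimal Python sketch computes support and confidence for simple one-to-one association rules. The transactions (sets of course resources used by each student) and the thresholds are hypothetical, and the exhaustive pairwise search stands in for a full algorithm such as Apriori, which would be needed for larger itemsets.

```python
from itertools import combinations

# Hypothetical transactions: resources each student used in one course.
transactions = [
    {"forum", "quiz", "video"},
    {"forum", "quiz"},
    {"quiz", "video"},
    {"forum", "quiz", "video"},
    {"video"},
]


def support(itemset, data):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in data) / len(data)


def pair_rules(data, min_support=0.5, min_confidence=0.7):
    """Return rules lhs -> rhs between single items that meet both thresholds."""
    items = sorted(set().union(*data))
    found = []
    for a, b in combinations(items, 2):
        joint = support({a, b}, data)
        if joint < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = joint / support({lhs}, data)
            if confidence >= min_confidence:
                found.append((lhs, rhs, joint, confidence))
    return found


rules = pair_rules(transactions)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

A rule such as forum -> quiz with high confidence would suggest that students who use the forum also tend to attempt the quiz, which an instructor could then examine further.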
IV. DEVELOPMENT IN THE FIELD OF EDM
In “Educational data mining: A survey from 1995 to 2005”, C. Romero and S. Ventura describe the development of the
field of EDM up to 2005. Table II gives a brief overview of the major developments that took place after 2005.
TABLE II
Development in the field of EDM
Author (year) | Work carried out
Cristobal Romero, Sebastián Ventura, Pedro G. Espejo and Cesar Hervas (2008) | Data Mining Algorithms to Classify Students
Claudia Antunes (2008) | Acquiring Background Knowledge for Intelligent Tutoring Systems
Michel Desmarais, Alejandro Villarreal and Michel Gagnon (2008) | Adaptive Test Design with a Naive Bayes Framework
Hogyeong Jeong and Gautam Biswas (2008) | Mining Student Behavior Models in Learning-by-Teaching Environments
Nguyen Thai-Nghe, Tomáš Horváth and Lars Schmidt-Thieme (2011) | Factorization Models for Forecasting Student Performance
Nan Li, William Cohen, Kenneth R. Koedinger and Noboru Matsuda (2011) | A Machine Learning Approach for Automatic Student Model Discovery
Kelly Wauters, Piet Desmet and Wim Van Den Noortgate (2011) | Acquiring Item Difficulty Estimates: a Collaborative Effort of Data and Judgment
Shubhendu Trivedi, Zachary Pardos, Gábor Sárközy and Neil Heffernan (2011) | Spectral Clustering in Educational Data Mining
Annalies Vuong, Tristan Nixon and Brendon Towle (2011) | A Method for Finding Prerequisites Within a Curriculum
Vasile Rus, Arthur Graesser, Cristian Moldovan and Nobal Niraula (2012) | Automatic Discovery of Speech Act Categories in Educational Games
Shubhendu Trivedi, Zachary Pardos, Gábor Sárközy and Neil Heffernan (2012) | Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
John Kinnebrew and Gautam Biswas (2012) | Identifying Learning Behaviors by Contextualizing Differential Sequence Mining with Action Features and Performance Evolution
François Bouchet, John Kinnebrew, Gautam Biswas and Roger Azevedo (2012) | Identifying Students' Characteristic Learning Behaviors in an Intelligent Tutoring System Fostering Self-Regulated Learning
Terry Peckham and Gordon McCalla (2012) | Mining Student Behavior Patterns in Reading Comprehension Tasks
Tomas Obsivac, Lubos Popelinsky, Jaroslav Bayer, Jan Geryk and Hana Bydzovska (2012) | Predicting drop-out from social behaviour of students
Ma. Mercedes Rodrigo, Ryan S. J. D. Baker, Bruce McLaren, Alejandra Jayme and Thomas Dy (2012) | Development of a Workbench to Address the Educational Data Mining Bottleneck
Judi Mccuaig and Julia Baldwin (2012) | Identifying Successful Learners from Interaction Behaviour
V. APPLICATIONS OF DATA MINING
Educational data mining is a multidisciplinary research area; hence it is difficult to confine it to a few
applications. To list a few, the primary applications of EDM are:
Predicting student performance
Student modelling
Detecting undesirable student behaviours
Analysis and visualization of data
Providing feedback for supporting instructors
Constructing courseware
Planning and scheduling
Recommendations for students
Grouping students
Social network analysis
Developing concept maps
VI. CONCLUSION
Educational data mining is a young research area. It is an emerging field related to several well-established areas
of research including e-learning, adaptive hypermedia, intelligent tutoring systems, web mining, data mining, etc.
The application of data mining in educational systems has specific requirements not present in other domains,
mainly the need to take into account pedagogical aspects of the learner and the system. Although educational data
mining is a very recent research area, a considerable number of contributions have already been published. EDM brings
together researchers and practitioners from computer science, education, psychometrics, statistics, psychology, etc.
REFERENCES
[1] Ryan S.J.d. Baker, “Data Mining for Education,” Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.
[2] Toon Calders, Mykola Pechenizkiy, “Introduction to the Special Section on Educational Data Mining,” SIGKDD Explorations, Volume 13, Issue 2.
[3] C. Romero, S. Ventura, “Educational data mining: A survey from 1995 to 2005,” Expert Systems with Applications 33 (2007) 135–146 (Science Direct).
[4] Bob Dolan and John Behrens, Pearson, “Five Aspirations for Educational Data Mining,” Proceedings of the 5th International Conference on Educational Data Mining - EDM 2012.
[5] R. Baker, K. Yacef, “The State of Educational Data Mining in 2009: A Review and Future Visions,” Journal of Educational Data Mining, 2009.
[6] Richard A. Huebner, “A survey of educational data-mining research,” Norwich University.
[7] Proceedings of the 1st International Conference on Educational Data Mining - EDM 2008.
[8] Proceedings of the 4th International Conference on Educational Data Mining - EDM 2011.
[9] Proceedings of the 5th International Conference on Educational Data Mining - EDM 2012.