APPLICATION OF BIG DATA IN EDUCATION
DATA MINING AND LEARNING ANALYTICS – A
LITERATURE REVIEW
Abstract:
The usage of learning management systems in education has been increasing in the
last few years. Students have started using mobile phones, primarily smartphones
that have become a part of their daily life, to access online content. Students'
online activities generate an enormous amount of data that goes unused because
traditional learning analytics cannot process it. This has resulted in the
penetration of Big Data technologies and tools into education, to process the large
amount of data involved. This study looks into the recent applications of Big Data
technologies in education and presents a review of literature available on
Educational Data Mining and Learning Analytics.
Architecture Diagram:
Existing system:
Using big data allows educational institutions to identify post-education
employment opportunities for graduates and to target education that aligns more
closely with employment market needs. It can also predict whether graduates will
be employed, unemployed, or in an undetermined situation with respect to job
opportunities. Big data can help stakeholders in the educational system better
understand vocational prospects for students and better assess student learning
programs for occupational compatibility. In a global learning environment, this
type of information can facilitate better educational and post-education
vocational planning, and may also prove useful to organizations as they make
hiring and budgeting decisions for college graduates in different disciplines.
Proposed System:
Research in education has resulted in several new pedagogical improvements.
Community-based learning environments have increased in number. In current
learning environments, users learn in online communities such as discussion
forums, online chats, instant messaging clients and Learning Management Systems
like Moodle. Recent learning methods like the Flipped Classroom depend heavily on
online activities. Several frameworks and models have been proposed for online
learning management systems to improve the learning experience. The entry of
open source projects into mobile computing has led to low-cost smartphones,
which are now in widespread use. Students have started using smartphones to
access learning content. As learning environments have become accessible
anywhere through the internet, students access their courses from any location
and engage in learning activities there. Students' activities in learning
management systems create a large amount of data that can be used to develop the
learning environment, help students learn, and improve the overall learning
experience.
The paper “Learning Individual Behavior in an Educational Game: A Data-Driven
Approach” describes a framework for modeling users' behavior. The proposed
system learns individual policies from the movement of the players in the game
and builds a cognitive model. The author states that this type of modeling will
help in understanding the learning processes of the user who interacts with the
system and in adapting the learning environment to the user.
Advantages:
An examination of the literature shows that the use of big data benefits higher
education in several ways. Learning analytics closely examine the educational
process in order to improve learning. Academic analytics apply algorithms to
various data points and make adjustments based on the results, again with the
aim of improving learning. Through careful analysis of big data, researchers can
extract useful information that benefits educational institutions, students,
instructors, and researchers in various ways. These stakeholder benefits include
targeted course offerings, curriculum development, student learning outcomes and
behavior, personalized learning, improved instructor performance,
post-educational employment opportunities, and improved research in the field of
education.
FUTURE WORK
1. Predicting students’ future learning behavior by creating student models that
incorporate such detailed information as students’ knowledge, motivation,
metacognition, and attitudes.
2. Discovering or improving domain models that characterize the content to be
learned and optimal instructional sequences.
3. Studying the effects of different kinds of pedagogical support that can be
provided by learning software.
4. Advancing scientific knowledge about learning and learners through building
computational models that incorporate models of the student, the domain, and the
software’s pedagogy.
Implementation Modules:
1. Data Collection
2. Prediction
3. Clustering
4. Map Reduce
5. Data Classification/Analysis
1. Data Collection
This study reviews literature chosen with a primary focus on Educational Data
Mining and Learning Analytics and their implications for higher education,
educational technology and instructional design.
Google Scholar was used to search and locate academic papers from
journals, conference proceedings and professional magazines with the
keywords “educational data mining” and “learning analytics”. A search
period was set, and the papers reviewed include both qualitative and
quantitative studies from researchers in the field of educational data
mining and learning analytics worldwide. The search for the keywords
in Google Scholar, sorted by relevance, yielded the results below.
2. Prediction
Prediction entails developing a model that can infer a single aspect of the data
(predicted variable) from some combination of other aspects of the data (predictor
variables). Examples of using prediction include detecting such student behaviors
as when they are gaming the system, engaging in off-task behavior, or failing to
answer a question correctly despite having a skill. Predictive models have been
used for understanding what behaviors in an online learning environment—
participation in discussion forums, taking practice tests and the like—will predict
which students might fail a class. Prediction shows promise in developing domain
models, such as connecting procedures or facts with the specific sequence and
amount of practice items that best teach them, and forecasting and understanding
student educational outcomes, such as success on posttests after tutoring [8].
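As a rough illustration of the prediction idea described above, the following Python sketch fits a small logistic regression that estimates the probability of a student failing a class from two predictor variables. The feature names (forum posts, practice tests taken), the toy records and the training settings are all hypothetical and are not taken from the reviewed literature. Any other predictive model could be substituted.

```python
import math

# Hypothetical toy records: (forum_posts, practice_tests_taken) -> 1 if the student failed.
students = [
    ((0, 0), 1), ((1, 0), 1), ((0, 1), 1), ((1, 1), 1),
    ((4, 3), 0), ((5, 2), 0), ((3, 4), 0), ((6, 5), 0),
]


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, learning_rate=0.1):
    """Fit weights and bias with plain stochastic gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, failed in data:
            score = weights[0] * features[0] + weights[1] * features[1] + bias
            error = sigmoid(score) - failed  # gradient of the log loss w.r.t. the score
            weights[0] -= learning_rate * error * features[0]
            weights[1] -= learning_rate * error * features[1]
            bias -= learning_rate * error
    return weights, bias


def predict_fail_probability(model, features):
    """Return the estimated probability that a student with these features fails."""
    weights, bias = model
    return sigmoid(weights[0] * features[0] + weights[1] * features[1] + bias)


model = train(students)
print(predict_fail_probability(model, (0, 0)))  # inactive student
print(predict_fail_probability(model, (5, 4)))  # active student
```

On this toy data the inactive student receives a failure probability above 0.5 and the active student one below 0.5. Real studies would need far more records and a held-out test set.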
3. Clustering
Clustering refers to finding data points that naturally group together and can be
used to split a full dataset into categories. Examples of clustering applications
are grouping students based on their learning difficulties and interaction
patterns, such as how and how much they use tools in a learning management
system [9], and grouping users for purposes of recommending actions and resources
to similar users. Data as varied as online learning resources, student cognitive
interviews, and postings in discussion forums can be analyzed using techniques
for working with unstructured data to extract characteristics of the data and
then clustering the results. Clustering can be used in any domain that involves
classifying, even to determine how much collaboration users exhibit based on
postings in discussion forums.
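To make the grouping idea concrete, here is a minimal k-means sketch in plain Python. It assumes each student is described by two hypothetical interaction features (logins per week, forum posts per week). The data, the feature choice and the initial centroids are invented for illustration only.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members (unchanged if it has none)."""
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == i]
        if not members:
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def kmeans(points, centroids, max_iterations=20):
    """Alternate assignment and update steps until the centroids stop moving."""
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assign(points, centroids), centroids


# Hypothetical students: (logins per week, forum posts per week).
activity = [(1, 0), (2, 1), (1, 2), (9, 8), (10, 10), (8, 9)]
labels, centers = kmeans(activity, [activity[0], activity[3]])
print(labels, centers)
```

The first three students end up in one group and the last three in another. In practice the number of clusters and the starting centroids strongly affect the result and would need to be chosen with care.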
4. Map Reduce:
Map Reduce is a programming model that has its roots in functional programming.
In addition to often producing short, elegant code for problems involving lists or
collections, this model has proven very useful for large-scale, highly parallel
data processing. The Map function reads a stream of data and parses it into
intermediate (key, value) pairs. When that is complete, the Reduce function is
called once for each unique key that was generated by Map and is given the key
and a list of all values that were generated for that key as a parameter. The
keys are presented in sorted order. As an example of using Map Reduce, consider
the task of counting the number of occurrences of each word in a large collection
of documents. The user-written Map function reads the document data and parses
out the words. For each word, it writes the (key, value) pair of (word, 1). That
is, the word is treated as the key and the associated value of 1 means that we
saw the word once. This intermediate data is then sorted by key by Map Reduce,
and the user's Reduce function is called for each unique key. In this example,
Reduce sums the list of 1s it receives to give the total count for that word.
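The word-count example above can be sketched in a single Python process as follows. This only imitates the Map, sort and Reduce phases with ordinary functions. It is not the API of Hadoop or any other Map Reduce framework, and the function names are chosen here for illustration.

```python
from collections import defaultdict


def map_words(document):
    """Map: emit a (word, 1) pair for every word in the document."""
    for word in document.lower().split():
        yield (word, 1)


def shuffle(pairs):
    """Group intermediate values by key and present the keys in sorted order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_counts(word, counts):
    """Reduce: called once per unique word with the list of its values."""
    return (word, sum(counts))


def word_count(documents):
    """Run the three phases in sequence over a collection of documents."""
    intermediate = [pair for doc in documents for pair in map_words(doc)]
    return [reduce_counts(word, counts) for word, counts in shuffle(intermediate)]


print(word_count(["big data in education", "data mining in education"]))
# [('big', 1), ('data', 2), ('education', 2), ('in', 2), ('mining', 1)]
```

In a real deployment the Map calls would run in parallel on separate splits of the input and the framework would handle the grouping step.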
5. Data Classification/Analysis
Articles were classified both quantitatively and qualitatively. The quantitative
analysis was used to classify the papers according to the publication year and the
type of publication in which the articles appeared. Papers were qualitatively
classified using open coded content analysis whereby each paper was reviewed to
identify themes and trends in the literature.
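The quantitative part of this classification amounts to tallying papers by attribute. A minimal Python sketch, using made-up records rather than the papers actually reviewed, might look like this:

```python
from collections import Counter

# Hypothetical paper records; the years and publication types are placeholders.
papers = [
    {"year": 2012, "type": "conference"},
    {"year": 2013, "type": "journal"},
    {"year": 2013, "type": "conference"},
    {"year": 2014, "type": "journal"},
    {"year": 2014, "type": "conference"},
    {"year": 2014, "type": "magazine"},
]


def tally(records, field):
    """Count how many records share each value of the given field."""
    return dict(sorted(Counter(record[field] for record in records).items()))


print(tally(papers, "year"))  # {2012: 1, 2013: 2, 2014: 3}
print(tally(papers, "type"))  # {'conference': 3, 'journal': 2, 'magazine': 1}
```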
Algorithm:
Parallel K-Means Algorithm Based on Map Reduce
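The source names this algorithm without describing it, so the following Python sketch shows only one common way to express a single k-means iteration as Map and Reduce steps. It is an assumption about the general approach, not the authors' implementation. Map assigns each point to its nearest centroid, and Reduce averages the points collected for each centroid. All function names and data are hypothetical.

```python
from collections import defaultdict


def kmeans_map(point, centroids):
    """Map: emit (index of nearest centroid, (point, 1))."""
    nearest = min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )
    return nearest, (point, 1)


def kmeans_reduce(values):
    """Reduce: average the points emitted for one centroid index."""
    totals = [0.0] * len(values[0][0])
    count = 0
    for point, n in values:
        for d, coordinate in enumerate(point):
            totals[d] += coordinate
        count += n
    return tuple(total / count for total in totals)


def kmeans_iteration(points, centroids):
    """Run one Map, group-by-key and Reduce pass; return the new centroids."""
    grouped = defaultdict(list)
    for point in points:  # each Map call is independent, so it could run in parallel
        key, value = kmeans_map(point, centroids)
        grouped[key].append(value)
    new_centroids = list(centroids)  # a centroid with no points keeps its position
    for key in sorted(grouped):
        new_centroids[key] = kmeans_reduce(grouped[key])
    return new_centroids


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
print(kmeans_iteration(points, [(1, 1), (9, 9)]))
# [(0.0, 1.0), (10.0, 11.0)]
```

A driver program would repeat this iteration, feeding the new centroids back in, until they stop changing or an iteration limit is reached.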
Configuration:
H/W System Configuration:
System : Pentium IV 2.4 GHz
Speed : 1.1 GHz
RAM : 256 MB (min)
Hard Disk : 40 GB
Key Board : Standard Windows Keyboard
Mouse : Logitech
Monitor : 15 VGA Color
S/W System Configuration:
Operating System : Windows XP/7
Application Server : Tomcat 5.0/6.0
Front End : HTML, Java, JSP
Scripts : JavaScript
Server side Script : Java Server Pages
Database : MongoDB
Database Connectivity : Robomongo-0.8.5-i386
Result Analysis
The annual International Conference on Educational Data Mining (EDM) and the
annual International Conference on Learning Analytics and Knowledge (LAK)
have seen many papers submitted and presented, showcasing the emerging and
fast-developing field of Educational Data Mining and Learning Analytics. The
trend in the number of papers submitted to and published at the two conferences
in past years clearly shows the growing interest in the field.
Conclusion:
As the data involved in education grows larger, the application of Big Data
techniques becomes increasingly necessary in learning environments. MOOCs are
good examples of learning environments that are resource hungry and have raised
the need for data mining in education. The recent trend in papers published in
EDM indicates the growth of data mining in the education field. Apart from EDM,
which this study examined, other communities are also researching this field,
and exploring those communities will provide greater insight into it.
Educational Data Mining is likely to reshape the way in which forthcoming
generations learn. As this study has reviewed only a small portion of the
available articles, there remains a need for a systematic study of the published
literature on the fast-growing field of applying big data in education and
learning.