APPLICATION OF BIG DATA IN EDUCATION DATA MINING AND LEARNING ANALYTICS – A LITERATURE REVIEW

Abstract:
The use of learning management systems in education has been increasing in the last few years. Students have started using mobile phones, primarily smartphones that have become part of their daily lives, to access online content. Students' online activities generate an enormous amount of data that goes unused because traditional learning analytics cannot process it. This has led to the adoption of Big Data technologies and tools in education to process the large amounts of data involved. This study looks into recent applications of Big Data technologies in education and presents a review of the literature available on Educational Data Mining and Learning Analytics.

Architecture Diagram:

Existing system:
Using big data allows educational institutions to identify post-education employment opportunities for graduates and helps them target education that aligns more closely with employment market needs. It can also predict whether graduates will be employed, unemployed, or in an undetermined job situation. Using big data can help stakeholders in the educational system better understand vocational prospects for students and better assess student learning programs for occupational compatibility. In a global learning environment, this type of information can not only facilitate better educational and post-education vocational planning, but may also prove useful to organizations as they make hiring and budgeting decisions for college graduates in different disciplines.

Proposed System:
Research in education has resulted in several new pedagogical improvements. Community-based learning environments have increased in number. In current learning environments, users learn in online communities such as discussion forums, online chats, instant messaging clients and various Learning Management Systems like Moodle.
Recent learning methods like the Flipped Classroom depend heavily on online activities. Several frameworks and models have been proposed for online learning management systems to improve the learning experience. The entry of open source projects into mobile computing has led to low-cost smartphones, which are now in widespread use. Students have started using smartphones to access learning content. As learning environments have become accessible anywhere through the internet, students access their courses from anywhere and engage in learning activities. Students' activities in learning management systems create a large amount of data that can be used to develop the learning environment, help students learn and improve the overall learning experience.

The paper “Learning Individual Behavior in an Educational Game: A Data-Driven Approach” describes a framework for modeling user behavior. The proposed system learns individual policies from the movement of the players in the game and builds a cognitive model. The paper states that this type of modeling will help in understanding the learning processes of the user who interacts with the system and in adapting the learning environment to the user.

Advantages:
Examination of the literature reveals how the use of big data benefits higher education. One benefit is learning analytics, which closely examines the educational process in order to improve learning. Another is academic analytics, which applies algorithms to various data points and makes adjustments based on the results to improve learning. Through careful analysis of big data, researchers can extract useful information that benefits educational institutions, students, instructors, and researchers in various ways.
These stakeholder benefits include targeted course offerings, curriculum development, student learning outcomes and behavior, personalized learning, improved instructor performance, post-educational employment opportunities, and improved research in the field of education.

FUTURE WORK
1. Predicting students' future learning behavior by creating student models that incorporate such detailed information as students' knowledge, motivation, metacognition, and attitudes.
2. Discovering or improving domain models that characterize the content to be learned and optimal instructional sequences.
3. Studying the effects of different kinds of pedagogical support that can be provided by learning software.
4. Advancing scientific knowledge about learning and learners through building computational models that incorporate models of the student, the domain, and the software's pedagogy.

Implementation Modules:
1. Data Collection
2. Prediction
3. Clustering
4. Map Reduce
5. Data Classification/Analysis

1. Data Collection
This study reviews literature chosen with a primary focus on Educational Data Mining and Learning Analytics and their implications for higher education, educational technology and instructional design. Google Scholar was used to search for and locate academic papers from journals, conference proceedings and professional magazines with the keywords “educational data mining” and “learning analytics”. A search period was set. The papers reviewed include both qualitative and quantitative studies from researchers in the field of educational data mining and learning analytics worldwide. The search for the keywords in Google Scholar, sorted by relevance, yielded the results below.

2. Prediction
Prediction entails developing a model that can infer a single aspect of the data (the predicted variable) from some combination of other aspects of the data (the predictor variables).
Examples of using prediction include detecting student behaviors such as gaming the system, engaging in off-task behavior, or failing to answer a question correctly despite having the required skill. Predictive models have been used to understand which behaviors in an online learning environment (participation in discussion forums, taking practice tests and the like) predict which students might fail a class. Prediction shows promise in developing domain models, such as connecting procedures or facts with the specific sequence and amount of practice items that best teach them, and in forecasting and understanding student educational outcomes, such as success on posttests after tutoring [8].

3. Clustering
Clustering refers to finding data points that naturally group together, and it can be used to split a full dataset into categories. Examples of clustering applications are grouping students based on their learning difficulties and interaction patterns, such as how and how much they use tools in a learning management system [9], and grouping users for the purpose of recommending actions and resources to similar users. Data as varied as online learning resources, student cognitive interviews, and postings in discussion forums can be analyzed using techniques for working with unstructured data to extract characteristics of the data, and the results can then be clustered. Clustering can be used in any domain that involves classifying, even to determine how much collaboration users exhibit based on postings in discussion forums.

4. Map Reduce:
Map Reduce is a programming model that has its roots in functional programming. In addition to often producing short, elegant code for problems involving lists or collections, this model has proven very useful for large-scale, highly parallel data processing. The Map function reads a stream of data and parses it into intermediate (key, value) pairs.
When that is complete, the Reduce function is called once for each unique key that was generated by Map, and it receives the key and a list of all values that were generated for that key as parameters. The keys are presented in sorted order.

As an example of using Map Reduce, consider the task of counting the number of occurrences of each word in a large collection of documents. The user-written Map function reads the document data and parses out the words. For each word, it writes the (key, value) pair (word, 1). That is, the word is treated as the key, and the associated value of 1 means that the word was seen once. Map Reduce then sorts this intermediate data by key and calls the user's Reduce function for each unique key. Summing the values in that key's list gives the number of occurrences of the word.

5. Data Classification/Analysis
Articles were classified both quantitatively and qualitatively. The quantitative analysis classified the papers according to the publication year and the type of publication in which the articles appeared. Papers were qualitatively classified using open-coded content analysis, whereby each paper was reviewed to identify themes and trends in the literature.

Algorithm:
Parallel K-Means Algorithm Based on Map Reduce

Configuration:
H/W System Configuration:
System - Pentium IV 2.4 GHz
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 40 GB
Key Board - Standard Windows Keyboard
Mouse - Logitech
Monitor - 15 VGA Color

S/W System Configuration:
Operating System : Windows XP/7
Application Server : Tomcat 5.0/6.0
Front End : HTML, Java, JSP
Scripts : JavaScript
Server side Script : Java Server Pages
Database : MongoDB
Database Connectivity : Robomongo-0.8.5-i386
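As an illustration of the Map Reduce module described above, the word-count task can be sketched in a few lines of Python. This is a minimal single-process sketch under stated assumptions, not the project's implementation: the function names `map_words`, `reduce_counts` and `run_map_reduce` and the two sample documents are hypothetical, and the grouping step runs in memory rather than on a distributed framework.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator, List, Tuple


def map_words(document: str) -> Iterator[Tuple[str, int]]:
    """Map: parse one document and emit a (word, 1) pair for each word."""
    for token in document.lower().split():
        word = token.strip(".,;:!?\"'()")
        if word:
            yield word, 1


def reduce_counts(word: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: called once per unique word with every value emitted for it."""
    return word, sum(values)


def run_map_reduce(
    documents: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> List[Tuple[str, int]]:
    """Hypothetical in-memory driver standing in for a real framework."""
    # Group the intermediate values by key (the "shuffle" step).
    groups = defaultdict(list)
    for document in documents:
        for key, value in mapper(document):
            groups[key].append(value)
    # Present the keys to the reducer in sorted order.
    return [reducer(key, groups[key]) for key in sorted(groups)]


# Hypothetical sample input, for illustration only.
documents = [
    "Big data in education",
    "Data mining and learning analytics in education",
]
word_counts = run_map_reduce(documents, map_words, reduce_counts)
print(word_counts)
```

Because the mapper handles each document independently and the reducer handles each key independently, both phases can in principle be spread across many machines, which is what makes the model attractive for large collections of educational data.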
Result Analysis
The annual International Conference on Educational Data Mining (EDM) and the annual International Conference on Learning Analytics and Knowledge (LAK) have seen many papers submitted and presented, showcasing the emerging and fast-developing field of Educational Data Mining and Learning Analytics. The trend in the number of articles submitted to and published at the two conferences in the past years clearly shows the growing interest in the field.

Conclusion:
As the data involved in education grows larger, applying Big Data techniques in learning environments becomes more and more necessary. MOOCs are good examples of learning environments that are resource hungry and that have raised the need for data mining in education. Recent trends in the papers published at EDM indicate the growth of data mining in education. Apart from EDM, which was covered in this study, other communities are also researching this field, and exploring those communities will provide greater insight into it. Educational Data Mining is likely to reshape the way forthcoming generations learn. As this study has reviewed only a tiny portion of the available articles, there remains a need for a systematic study of the published literature on the fast-growing field of big data applications in education and learning.