Faculty of Economic Sciences and Business Administration
Str. Teodor Mihali nr. 58-60, Cluj-Napoca, RO-400951
Tel.: 0264-41.86.52-5
Fax: 0264-41.25.70
[email protected]
www.econ.ubbcluj.ro

DETAILED SYLLABUS
Methods in Data Science

1. Information about the study program
1.1 University: Babeș-Bolyai
1.2 Faculty: Economic Sciences and Business Administration
1.3 Department: Business Information Systems
1.4 Field of study: Business Information Systems
1.5 Program level (bachelor or master): Master
1.6 Study program / Qualification: Business Modeling and Distributed Computing

2. Information about the subject
2.1 Subject title: Methods in Data Science
2.2 Course activities professor: Lect. Dr. Darie Moldovan
2.3 Seminar activities professor: Lect. Dr. Darie Moldovan
2.4 Year of study: I
2.5 Semester: I
2.6 Type of assessment: Summative
2.7 Subject regime: Mandatory

3. Total estimated time (teaching hours per semester)
3.1 Number of hours per week: 4, out of which: 3.2 course: 2; 3.3 seminar/laboratory: 2
3.4 Total number of hours in the curriculum: 56, out of which: 3.5 course: 28; 3.6 seminar/laboratory: 28

Time distribution (hours)
Study based on textbook, course support, references and notes: 38
Additional documentation in the library, through specialized databases and field activities: 24
Preparing seminars/laboratories, essays, portfolios and reports: 45
Tutoring: 8
Assessment (examinations): 4
Other activities: 0

3.7 Total hours for individual study: 119
3.8 Total hours per semester: 175
3.9 Number of credits: 7

NOTE: This document represents an informal translation performed by the faculty.

4. Preconditions (if necessary)
4.1 Curriculum: Not necessary
4.2 Skills: Basic programming skills, basic statistics knowledge

5. Conditions (if necessary)
5.1 For course development: Notebook, video projector, Internet connection
5.2 For seminar / laboratory development: Computers with Internet connection, specialized software: SAS Enterprise Miner, Weka

6. Acquired specific competences
Professional competences: Obtain key competences in data science:
- Cleaning and sampling data sets
- Data management
- Exploratory data analysis
- Prediction based on statistical methods
- Communication of results
Transversal competences: Gain competences in working within a team, dividing tasks, and learning from the different areas connected to the addressed problem.

7. Subject objectives (arising from the acquired specific competences)
7.1 Subject's general objective: Students must be familiar with data science methods and work through a data science project end to end.
7.2 Specific objectives: Students have to:
- learn how to analyze a dataset
- be able to access big data
- explore data and generate hypotheses
- use specific methods such as regression and classification for prediction
- communicate the results of their research using visualization tools and summaries

8. Contents
8.1 Course
1. Introduction. Course overview. About Data Science.
   Teaching methods: Lecture, demonstration, open discussion. Observations: 1 lecture.
2. Univariate linear regression. Applications.
   Teaching methods: Lecture, demonstration, open discussion. Observations: 1 lecture.
3. Multivariate linear regression. Applications.
   Teaching methods: Lecture, open discussion. Observations: 1 lecture.
4. Classification methods. Logistic regression. Decision trees.
   Teaching methods: Lecture, demonstration, open discussion. Observations: 2 lectures.
5. Neural networks.
   Teaching methods: Lecture, open discussion. Observations: 1 lecture.
6. Applying learning algorithms. Data preprocessing.
   Teaching methods: Lecture, open discussion. Observations: 1 lecture.
7. Results evaluation and model implementation.
   Teaching methods: Lecture, open discussion, case studies. Observations: 1 lecture.
8. Unsupervised learning. Clustering.
   Teaching methods: Lecture, open discussion, demonstration. Observations: 1 lecture.
9. Data mining applications.
   Teaching methods: Lecture, open discussion, demonstration. Observations: 1 lecture.
12. Large-scale data mining. HPC computing.
   Teaching methods: Lecture, demonstration. Observations: 4 lectures.

References:
1. Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed., Morgan Kaufmann, 2011.
2. Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning, Springer, 2009.
3. Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Cambridge, 2011.
4. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
5. Richard Duda, Peter Hart, David Stork, Pattern Classification, 2nd ed., John Wiley & Sons, 2001.
6. Drew Conway, John Myles White, Machine Learning for Hackers: Case Studies and Algorithms to Get You Started, O'Reilly Media, 2012.
7. Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
8. S. Haykin, Neural Networks and Learning Machines, 3rd ed., Prentice Hall, 2008.
9. J. Georges, J. Thompson, C. Wells, Applied Analytics Using SAS Enterprise Miner, Course Notes, SAS Publishing, 2010.

8.2 Seminar/laboratory
Teaching methods for every laboratory topic: Running examples and individual exercises / Homework.
- Demonstrative example case. Observations: 1 laboratory.
- Building a simple linear regression model. Observations: 1 laboratory.
- Multivariate linear regression in practice. Observations: 1 laboratory.
- Classification methods. Naïve Bayes, decision trees, logistic regression. Observations: 2 laboratories.
- Neural networks. Observations: 1 laboratory.
- Data visualization tools. Observations: 1 laboratory.
- Feature selection, sampling the datasets and other preprocessing operations. Observations: 1 laboratory.
- Models comparison. Deploying the solution. Observations: 1 laboratory.
- Clustering. Observations: 1 laboratory.
- Large datasets analysis, MapReduce tools, SAS EM HPC. Observations: 4 laboratories.

References: the same nine titles listed under 8.1 Course.

9. Corroboration / validation of the subject's content in relation to the expectations coming from representatives of the epistemic community, of the professional associations and of the representative employers in the program's field
This subject is included in the certification offered by the Association of Chartered Certified Accountants (ACCA).
The profession of data scientist has recently become very popular because of the growing volume of data available for analysis. Increasing computational power has opened a new field to statisticians and other specialists working with data: automated data analysis, which requires interdisciplinary skills in statistics, machine learning and their applications.

10. Assessment (examination)
10.4 Course
   10.1 Assessment criteria: Multiple-choice quiz
   10.2 Assessment methods: Multiple-choice grid test and practical exam on data mining software
   10.3 Weight in the final grade: 80%
10.5 Seminar/laboratory
   10.1 Assessment criteria: Practical exam
   10.2 Assessment methods: Homework assignments, laboratory activities
   10.3 Weight in the final grade: 20%
10.6 Minimum performance standard
- Minimum 50% of total points

Date of filling: 15.01.2016
Signature of the course professor: Lect. Dr. Darie Moldovan
Signature of the seminar professor: Lect. Dr. Darie Moldovan
Date of approval by the department: 21.01.2016
Head of department's signature: Prof. habil. Dr. Gheorghe Cosmin Silaghi