Download 510.650.doc - Johns Hopkins Carey Business School

Data Analytics
2 Credits
BU.510.650.XX
Class Day/Time & Start/End date
Semester
Class Location
Instructor
Full Name
Contact Information
Phone Number: (###) ###-####
E-mail Address:
Office Hours
Day/s
Times
Teaching Assistant
Full Name
E-mail Address:
Required Text and Learning Materials
There is no required textbook: all class materials will be available on our Blackboard website. However, several
books are very useful if you want to study data analytics in more depth. The best way to learn is by
doing, especially with programming.
Textbook (highly recommended; easy to follow, with many examples and data sets): Data Mining and
Business Analytics with R, by Johannes Ledolter;
Publisher: Wiley (2013); ISBN-13: 978-1118447147;
Available in Johns Hopkins online library: https://catalyst.library.jhu.edu/catalog/bib_4637122
Optional Textbook (solid primer, with theory and explanation): An Introduction to Statistical Learning with
Applications in R, by Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani;
Publisher: Springer (2013); ISBN-13: 978-1461471370;
Available in Johns Hopkins online library: https://catalyst.library.jhu.edu/catalog/bib_4654919
Optional Textbook (a great advanced text, but it requires some mathematical sophistication and goes beyond
the material we will be covering): The Elements of Statistical Learning: Data Mining, Inference, and
Prediction, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The book is free at
http://statweb.stanford.edu/~tibs/ElemStatLearn/index.html
Software:
• We require the R statistical software, which is powerful and free. R can be downloaded at the link
below: http://www.cran.r-project.org/
• RStudio is a free platform for both writing and running R, available at www.rstudio.org. Some students
find it friendlier than basic R (especially on Windows).
• The learning curve is steep, but students can become proficient in a few weeks. Some manuals are
very helpful for learning R, e.g., http://cran.r-project.org/manuals.html
• I provide limited software instruction, in-class demonstrations, and code to accompany lectures and
assignments. We do not assume that you have used R in a previous class. However, this is not a
class on R. Like any language, R is only learned by doing. You should install R as soon as possible
and familiarize yourself with basic operations.
BU.510.650.XX – Data Analytics – Instructor’s Name – Page 2 of 4

Additional resources: (a) Tutorials at data.princeton.edu/R are fantastic (and there are many others out
there). (b) YouTube introductions to R, e.g., the series from Google Developers.
Blackboard Site
A Blackboard course site is set up for this course. Each student is expected to check the site throughout the
semester, as Blackboard will be the primary venue for communication outside the classroom between the
instructors and the students. Students can access the course site at https://blackboard.jhu.edu. Support for
Blackboard is available at 1-866-669-6138.
Course Evaluation
As a research and learning community, the Carey Business School is committed to continuous improvement.
The faculty strongly encourages students to provide complete and honest feedback for this course. Please
take this activity seriously because we depend on your feedback to help us improve so you and your
colleagues will benefit. Information on how to complete the evaluation will be provided towards the end of the
course.
Disability Services
Johns Hopkins University and the Carey Business School are committed to making all academic programs,
support services, and facilities accessible. To determine eligibility for accommodations, please contact the
Carey Disability Services Office at the time of admission, and allow at least four weeks before the
first class meeting. Students should contact Rachel Pickett in the Disability Services office by phone at
410-234-9243, by fax at 443-529-1552, or by email: [email protected].
Important Academic Policies and Services
• Honor Code
• Statement of Diversity and Inclusion
• Student Success Center
• Inclement Weather Policy
Students are strongly encouraged to consult the Johns Hopkins Carey Business School Student Handbook
and Academic Catalog and the School website
http://carey.jhu.edu/students/student-resources/university-and-school-policies/
for detailed information regarding the above items.
Course Description
This course prepares students to gather, describe, and analyze data, using advanced statistical tools to
support operations, risk management, and response to disruptions. Analysis is done targeting economic and
financial decisions in complex systems that involve multiple partners. Topics include: probability, statistics,
hypothesis testing, experimentation, and forecasting. Prerequisite: BU.510.601 Statistical Analysis OR
BU.914.610 Quantitative Methods.
Course Overview
This is an advanced course in statistics, machine learning, and data-driven decision making. This course is
designed for students who wish to increase their capability to build, use, and interpret data analysis models for
business, health care, and other areas of quantitative management. This course prepares students to gather,
describe, and analyze real-world data, and to use advanced analytical tools to provide scientific guidance in
decision making.
Students are expected to have basic knowledge of calculus, probability, and statistics, along with other
quantitative background. Students should be comfortable with mathematical formulas and willing to develop
programming skills to analyze data.
Course topics include a review of basic statistical ideas, numerical and graphical methods for summarizing
data, linear regression, logistic regression, model choice and false discovery rates, multinomial and binary
regression, classification, decision trees, factor models, clustering, cross-validation, and other
emerging data analytics methods. The course presents real-world examples where a significant competitive
advantage has been obtained through large-scale data analysis. We learn both basic underlying concepts and
practical computational skills, including techniques for analysis of distributed data. Examples include
advertising, eCommerce, finance, health care, marketing, and revenue management. The ultimate goal is, of
course, to help students make better business decisions using advanced data analytics.
Student Learning Objectives for This Course
All Carey graduates are expected to demonstrate competence on four Learning Goals, operationalized
in eight Learning Objectives. These learning goals and objectives are supported by the courses Carey
offers. For a complete list of Carey learning goals and objectives, please refer to the website
http://carey.jhu.edu/faculty-research/learning-at-carey/learning-assessment.
Some of the learning objectives for this course are as follows:
1. Gather sufficient relevant data, conduct data analytics using scientific methods, and make
appropriate and powerful connections between analysis and real-world problems.
2. Demonstrate sophisticated understanding of the concepts and methods; know the exact
scope and possible limitations of each method; show the capability to use data analytics skills
to provide constructive guidance in decision making.
3. Use advanced techniques to conduct thorough and insightful analysis, and interpret the results
correctly with convincing and useful information.
4. Demonstrate substantial understanding of the real problems; conduct deep data analytics
using correct methods; draw reasonable conclusions with sufficient explanation and
elaboration.
5. Write an insightful and well-organized report for a real-world case study, including thorough
and thoughtful details.
6. Finally, students will develop the capability to make better business decisions by using
advanced techniques in data analytics.
Attendance Policy
Attendance and class participation are part of each student’s course grade. Students are expected to attend all
scheduled class sessions. Failure to attend class will result in an inability to achieve the objectives of the
course. Excessive absence will result in loss of points for participation. Regular attendance and active
participation are required for students to successfully complete the course.
Class participation is an important part of learning. If you have a question, it’s likely that others do as well. I
encourage active participation, and course grades will take into account students who make particularly strong
contributions.
Assignments
All students are expected to view the Carey Business School Honor Code/Code of Conduct tutorial and submit
their pledge online. Students who fail to complete and submit the pledge will have a registrar’s hold on their
account. Please contact the student services office via email [email protected] if you have any
questions.
Students are not allowed to use any electronic devices during in-class tests. Calculators will be provided if the
instructor requires them for test taking. Students must seek permission from the instructor to leave the
classroom during an in-class test. Test scripts must not be removed from the classroom during the test.
Homework: weekly individual homework assignments, due by midnight on the next class day. All homework
assignments should be submitted through the Blackboard links.
Group Projects: 2-4 students form a group and work on the projects as a team. Students can identify a
company or a scenario, collect data, and use techniques taught in class to study patterns in the data or to
predict future outcomes. Students are required to write a 4-6 page project report and present it in class using
PowerPoint slides. Details will be available shortly.
Final Exam: the final exam is an in-class, closed-book, individual written exam.
Late submissions, including assignments, projects, and exams, will not be accepted.
Study Group (not required, but highly recommended)
Many students learn better and faster when working in a group, so I encourage collaborative learning. You can
work together in a study group of 2-4 students to discuss class materials, homework assignments, and
projects on a weekly basis. However, each student must write their homework assignment individually, in
their own words: your text should reflect your own understanding of the materials. The study groups can
be different from your project groups.
Evaluation and Grading
Assignment                                          Learning Objectives   Weight
Attendance and participation in class discussion                          10%
Homework                                            1,2,3,4,5,6           30%
Project                                             1,2,3,4,5,6           20%
Final Exam                                          1,2,3,4,5,6           40%
Total                                                                     100%
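As a rough illustration of how the weights listed above combine into a single course score, here is a minimal sketch in Python (used here only for illustration; the course itself uses R). The function name `weighted_score` and the example scores are hypothetical, and the sketch assumes each component is scored on a 0-100 scale. The syllabus does not specify how a numeric score maps to a letter grade, so none is computed.

```python
# Weights taken from the grading breakdown above, as fractions of the course grade.
WEIGHTS = {
    "participation": 0.10,
    "homework": 0.30,
    "project": 0.20,
    "final_exam": 0.40,
}


def weighted_score(scores):
    """Return the weighted course score for component scores on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores, for illustration only.
example = {"participation": 90, "homework": 85, "project": 80, "final_exam": 75}
print(round(weighted_score(example), 1))
```

Because the final exam carries the largest weight (40%), a change of ten points on the exam moves the overall score by four points, whereas the same change in participation moves it by one.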
Important notes about grading policy:
The grade of A is reserved for those who demonstrate extraordinarily excellent performance. The grade of
A- is awarded only for excellent performance. The grade for good performance in this course is a B+/B. The
grades of D+, D, and D- are not awarded at the graduate level.
Please refer to the Carey Business School Student Handbook for grade appeal information:
http://carey.jhu.edu/students/student-handbook-and-academic-catalog/
Tentative Course Calendar*
*The instructors reserve the right to alter course content and/or adjust the pace to accommodate class
progress. Students are responsible for keeping up with all adjustments to the course calendar.
Week  Date  Weekly Objectives/Topics                                 Recommended Reading (book by Ledolter)  Assignments
1     Date  Introduction, Data Summarization and Visualization       Text, Ch 1, 2
2     Date  Linear and Nonlinear Regression, Model Selection         Text, Ch 3, 4, 5, 6                     HW 1 is due
3     Date  Classification, Logistic Regression, Poisson Regression  Text, Ch 7, 8, 9, 11                    HW 2 is due
4     Date  Clustering, Decision Trees                               Text, Ch 13, 14, 15, 16                 HW 3 is due
5     Date  Dimension Reduction                                      Text, Ch 17, 18                         HW 4 is due
6     Date  Text data, Time Series                                   Text, Ch 19, 20                         HW 5 is due
7     Date  Project Presentation                                                                             HW 5 is due
8     Date  Final Exam
Copyright Statement
Unless explicitly allowed by the instructor, course materials, class discussions, and examinations are created
for and expected to be used by class participants only. The recording and rebroadcasting of such material, by
any means, is forbidden. Violations are subject to sanctions under the Honor Code.