Mathematics 5490 and Computer Science 4570/5010: Big Data and Mining
Professor Craig C. Douglas, http://www.mgnet.org/~douglas
Syllabus, Spring 2016
Online syllabus: Throughout the entire semester, this syllabus will be online at the URL
http://www.mgnet.org/~douglas/Classes/bigdata/2016s-notes/syllabus.pdf
Graduate Course Description: An advanced topics course. Dynamic Big Data-Driven Application
Systems (DBDDAS) is a paradigm whereby applications and measurements become a symbiotic
feedback control system with the ability to dynamically incorporate additional Big Data into
executing applications and dynamically steer the measurement process, which provides more
accurate analysis and prediction, more precise controls, and more reliable outcomes. Data mining
is a paradigm to find hidden data and anomalies in either data sets or databases. The data can be
either static or dynamic and can come from streams that are not saved.
Undergraduate Course Description: The course will be similar to the graduate level course. There
will be more emphasis on just data mining, however.
Prerequisites: An eclectic group of students who are not afraid to program or use a computer and
manipulate data in new ways.
Registered auditors: Welcomed.
Classrooms: Ross Hall 247 (M/W 1:00-2:15) and Ross Hall 241 (occasionally)
Class web page: http://www.mgnet.org/~douglas/Classes/bigdata/2016s-index.html
Office hours: M/Tu 11:00-12:00 and W 10:00-11:00. Also, by appointment.
Homework: (Graduate) There will be some homework and a project in this course.
(Undergraduate) There will be homework specifically on data mining techniques.
Exams: There will be no exams.
Grading: (Graduate) Grades will be determined by how you do on your homework/project (90%)
and on class participation (10%). (Undergraduate) You can opt for the graduate scheme or
undergraduate assigned problems (100%). In either case, you are expected to arrive at class
on time. Class starts at exactly 1:00. You are responsible for any announcements made
and answers to questions given at the beginning of each class.
References: I will be lecturing from a number of sources, most of which will be placed on the
course web site during the semester. The following will be useful to you.
• Anand Rajaraman, Jure Leskovec, and Jeffrey D. Ullman, Mining of Massive Datasets,
Stanford, 2014. See Amazon.com for the hardcopy edition published by Cambridge
University Press in 2011. Most up to date and online at
http://infolab.stanford.edu/~ullman/mmds/bookL.pdf, 2015.
• http://www.mmds.org (companion web site to first reference).
• Wooyoung Kim, Parallel Clustering Algorithms: Survey,
http://grid.cs.gsu.edu/~wkim/index_files/SurveyParallelClustering.html, 2009.
Learning objectives: Students should know how Big Data-based DDDAS are designed and
implemented, and should understand the key issues in data mining.
Cheating Policy: Getting caught cheating or plagiarizing will result in a failing grade and possibly
much worse, including expulsion from the university and legal proceedings against you. Check
with the university handbook, http://uwadmnweb.uwyo.edu/REGISTRAR/bulletin/honor.html, for
more details.
Disability Policy: It is University of Wyoming policy to accommodate students, faculty, staff, and
visitors with disabilities. If you have a physical, learning, sensory, or psychological disability and
require accommodations, please let me know as soon as possible. You will need to register with
University Disability Support Services (UDSS) in the Student Educational Opportunity offices,
Room 330 Knight Hall, and provide UDSS with documentation of your disability. See the
university handbook for more details.
Topics:
• Introduction to Big Data, Data Mining, and Dynamic (Big) Data-Driven Application
Systems
• How many unique sentences are there in a given file?
• Data mining
• MapReduce and similar techniques
• Anomalies
• Finding similar items
• Data stream mining
• PageRank algorithms
• Market-basket algorithms
• Clustering algorithms
• Machine learning
• Web advertising techniques
• Dynamic Data Applications
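
As a small illustration of the "unique sentences" question in the topics list, the following Python sketch counts distinct sentences in a piece of text. It is not taken from the course materials. The function names, the naive sentence-splitting rule (a sentence ends at ".", "!", or "?" followed by whitespace), and the choice to ignore case and extra spacing are all assumptions made for this example. It also assumes the set of distinct sentences fits in memory, which may not hold for truly large files.

```python
import re


def count_unique_sentences(text):
    """Count distinct sentences in a string (illustrative sketch only)."""
    # Assumption: a sentence ends at '.', '!', or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text)
    # Assumption: sentences differing only in case or spacing are the same.
    normalized = {" ".join(p.split()).lower() for p in parts if p.strip()}
    return len(normalized)


def count_unique_sentences_in_file(path, encoding="utf-8"):
    """Hypothetical helper: read a whole file and count its distinct sentences."""
    with open(path, encoding=encoding) as handle:
        return count_unique_sentences(handle.read())


sample = "Big data is big. Data mining finds patterns. Big data is big."
print(count_unique_sentences(sample))  # prints 2
```

For a file too large to hold in memory, one option would be to store a hash of each sentence rather than the sentence itself, or to distribute the counting with a MapReduce-style approach like the one listed among the topics.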