Mathematics 5490 and Computer Science 4570/5010: Big Data and Mining
Professor Craig C. Douglas, http://www.mgnet.org/~douglas
Syllabus, Spring 2016

Online syllabus: Throughout the entire semester, this syllabus will be online at the URL http://www.mgnet.org/~douglas/Classes/bigdata/2016s-notes/syllabus.pdf

Graduate Course Description: An advanced topics course. Dynamic Big Data-Driven Application Systems (DBDDAS) is a paradigm whereby applications and measurements become a symbiotic feedback control system with the ability to dynamically incorporate additional Big Data into executing applications and dynamically steer the measurement process, which provides more accurate analysis and prediction, more precise controls, and more reliable outcomes. Data mining is a paradigm to find hidden data and anomalies in either data sets or databases. The data can be either static or dynamic and can come from streams that are not saved.

Undergraduate Course Description: The course will be similar to the graduate level course. There will be more emphasis on just data mining, however.

Prerequisites: An eclectic group of students who are not afraid to program or use a computer and manipulate data in new ways.

Registered auditors: Welcomed.

Classrooms: Ross Hall 247 (M/W 1:00-2:15) and Ross Hall 241 (occasionally)

Class web page: http://www.mgnet.org/~douglas/Classes/bigdata/2016s-index.html

Office hours: M/Tu 11:00-12:00 and W 10:00-11:00. Also, by appointment.

Homework: (Graduate) There will be some homework and a project in this course. (Undergraduate) There will be homework specifically on data mining techniques.

Exams: There will be no exams.

Grading: (Graduate) Grades will be determined by how you do on your homework/project (90%) and on class participation (10%). (Undergraduate) You can opt for the graduate scheme or undergraduate assigned problems (100%). In either case, you are expected to be at class on time. Class starts at exactly 1:00.
You are responsible for announcements and answers to questions at the beginning of all classes.

References: I will be lecturing from a number of sources, most of which will be placed on the course web site during the semester. The following will be useful to you.
• Anand Rajaraman, Jure Leskovec, and Jeffrey D. Ullman, Mining of Massive Datasets, Stanford, 2014. See Amazon.com for the hardcopy edition published by Cambridge University Press in 2011. The most up-to-date version is online at http://infolab.stanford.edu/~ullman/mmds/bookL.pdf, 2015.
• http://www.mmds.org (companion web site to the first reference).
• Wooyoung Kim, Parallel Clustering Algorithms: Survey, http://grid.cs.gsu.edu/~wkim/index_files/SurveyParallelClustering.html, 2009.

Learning objectives: Students should know how Big Data based DDDAS are designed and implemented and key issues in data mining.

Cheating Policy: Getting caught cheating or plagiarizing will result in a failing grade and possibly much worse, including expulsion from the university and legal proceedings against you. Check with the university handbook, http://uwadmnweb.uwyo.edu/REGISTRAR/bulletin/honor.html, for more details.

Disability Policy: It is University of Wyoming policy to accommodate students, faculty, staff, and visitors with disabilities. If you have a physical, learning, sensory, or psychological disability and require accommodations, please let me know as soon as possible. You will need to register with University Disability Support Services (UDSS) in the Student Educational Opportunity offices, Room 330 Knight Hall, and provide UDSS with documentation of your disability. See the university handbook for more details.

Topics:
• Introduction to Big Data, Data Mining, and Dynamic (Big) Data-Driven Application Systems
• How many unique sentences are there in a given file?
• Data mining
• MapReduce and similar techniques
• Anomalies
• Finding similar items
• Data stream mining
• PageRank algorithms
• Market-basket algorithms
• Clustering algorithms
• Machine learning
• Web advertising techniques
• Dynamic Data Applications
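
As a small illustration of the topic "How many unique sentences are there in a given file?", here is a minimal Python sketch. It is not taken from the course materials. The function names, the punctuation-based sentence splitter, and the case and whitespace normalization are all assumptions made for this example. It also reads the whole file into memory, so it only suits files that fit in RAM; larger inputs would likely need a streaming or MapReduce-style approach.

```python
import re


def count_unique_sentences(text: str) -> int:
    """Count distinct sentences in text using a naive punctuation-based splitter."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    # Real text (abbreviations, decimals, quotes) needs a better splitter.
    pieces = re.split(r"(?<=[.!?])\s+", text)
    # Assumption: collapse runs of whitespace and ignore case so that
    # trivially different copies of a sentence count as the same one.
    normalized = {" ".join(p.split()).lower() for p in pieces if p.strip()}
    return len(normalized)


def count_unique_sentences_in_file(path: str) -> int:
    """Hypothetical helper: read a UTF-8 text file and count its unique sentences."""
    with open(path, encoding="utf-8") as f:
        return count_unique_sentences(f.read())
```

The set holds one normalized copy of each distinct sentence, so memory use grows with the number of unique sentences. Storing a hash of each sentence instead of the sentence itself is one way to reduce that cost.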