Machine Learning - Department of Computer Science

General Information
Course Id: COSC 6342 Machine Learning
Time: MO/WE 2:30-4p
Instructor: Christoph F. Eick
Classroom: SEC 201
E-mail: [email protected]
Homepage: http://www2.cs.uh.edu/~ceick/
What is Machine Learning?
Machine Learning is the study of algorithms that
• improve their performance
• at some task
• with experience
Role of Statistics: inference from a sample
Role of Computer Science: efficient algorithms to
• solve optimization problems
• learn, represent, and evaluate models for inference
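The two roles above, statistical inference from a sample and efficient computation of the model, can be illustrated with a minimal sketch (not course material; the data are made up for illustration): fitting a line to a sample with closed-form least squares.

```python
# A minimal sketch of "improving performance with experience":
# estimate a line y ≈ a*x + b from a sample by least squares.
def fit_line(xs, ys):
    """Closed-form least-squares estimates of slope a and intercept b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

# "Experience": a sample drawn (noise-free here) from y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
a, b = fit_line(xs, ys)   # recovers a ≈ 2, b ≈ 1
```

Statistics justifies the estimator (inference from the sample); computer science supplies an efficient way to solve the underlying optimization problem, here in closed form.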
Example of a Decision Tree Model

Training Data:

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Decision Tree Model (splitting attributes: Refund, MarSt, TaxInc):

Refund = Yes → NO
Refund = No:
  MarSt = Married → NO
  MarSt = Single or Divorced:
    TaxInc < 80K → NO
    TaxInc ≥ 80K → YES

Classification Model in General:
f: {yes,no} × {married,single,divorced} × ℝ+ → {yes,no}
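The tree above can be written out directly as a classifier; the sketch below (illustrative code, not part of the course materials) hard-codes the slide's splits and checks them against the training data.

```python
# The decision tree from the slide, hard-coded: Refund -> MarSt -> TaxInc.
def classify(refund, marital_status, taxable_income):
    """Predict 'Cheat' (Yes/No) for one record using the slide's tree."""
    if refund == "Yes":
        return "No"                 # Refund = Yes       -> NO
    if marital_status == "Married":
        return "No"                 # MarSt = Married    -> NO
    # Single or Divorced: split on taxable income (in K) at 80K
    return "No" if taxable_income < 80 else "Yes"

# Training data from the slide: (Tid, Refund, Marital Status, Income in K, Cheat)
training_data = [
    (1,  "Yes", "Single",   125, "No"),
    (2,  "No",  "Married",  100, "No"),
    (3,  "No",  "Single",    70, "No"),
    (4,  "Yes", "Married",  120, "No"),
    (5,  "No",  "Divorced",  95, "Yes"),
    (6,  "No",  "Married",   60, "No"),
    (7,  "Yes", "Divorced", 220, "No"),
    (8,  "No",  "Single",    85, "Yes"),
    (9,  "No",  "Married",   75, "No"),
    (10, "No",  "Single",    90, "Yes"),
]

# This tree classifies every training record correctly.
assert all(classify(r, m, inc) == label for _, r, m, inc, label in training_data)
```

Note that the function has exactly the signature of the general classification model f: {yes,no} × {married,single,divorced} × ℝ+ → {yes,no}.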
Machine Learning Tasks
Supervised Learning
• Classification
• Prediction
Unsupervised Learning and Summarization of Data
• Association Analysis
• Clustering
Preprocessing
Reinforcement Learning and Adaptation
Activities Related to Models
• Learning parameters of models
• Choosing/comparing models
• Evaluating models (e.g. predicting their accuracy)
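As a taste of the unsupervised-learning task listed above, here is a tiny sketch (illustrative, not the course's code) of one-dimensional k-means with k = 2, alternating the assignment and centroid-update steps:

```python
# Minimal 1-D k-means with k=2: assign each point to its nearest center,
# then recompute each center as the mean of its cluster, and repeat.
def kmeans_1d(points, c1, c2, iters=10):
    for _ in range(iters):
        cluster1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        cluster2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        if cluster1:
            c1 = sum(cluster1) / len(cluster1)
        if cluster2:
            c2 = sum(cluster2) / len(cluster2)
    return c1, c2

# Two obvious groups (near 1 and near 10); k-means finds their means.
centers = kmeans_1d([1.0, 1.2, 0.8, 10.0, 10.5, 9.5], c1=0.0, c2=5.0)
```

Unlike the decision-tree example, no labels are involved: the structure (two clusters) is discovered from the data alone.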
Prerequisites
Background
 Probabilities
• Distributions, densities, marginalization, …
 Basic statistics
• Moments, typical distributions, regression
 Basic knowledge of optimization techniques
 Algorithms
• Basic data structures, complexity, …
 Programming skills
 We provide some background, but the class will be fast-paced
 Ability to deal with “abstract mathematical concepts”
Textbooks
Textbook:
Ethem Alpaydin, Introduction to Machine Learning, MIT Press, Second Edition, 2010.
Mildly Recommended Textbooks:
1. Christopher M. Bishop, Pattern Recognition and Machine Learning, 2006.
2. Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
Grading Spring 2014
2 Exams: 58-62%
3 Projects and 2 HW: 38-41%
Attendance: 1%
Remark: Weights are subject to change.
NOTE: PLAGIARISM IS NOT TOLERATED.
Topics Covered in 2014 (Based on Alpaydin)
Topic 1: Introduction to Machine Learning
Topic 18: Reinforcement Learning
Topic 2: Supervised Learning
Topic 3: Bayesian Decision Theory (excluding Belief Networks)
Topic 5: Parametric Model Estimation
Topic 6: Dimensionality Reduction Centering on PCA
Topic 7: Clustering 1: Mixture Models, K-Means, and EM
Topic 8: Non-Parametric Methods Centering on kNN and Density Estimation
Topic 9: Clustering 2: Density-Based Approaches
Topic 10: Decision Trees
Topic 11: Comparing Classifiers
Topic 12: Combining Multiple Learners
Topic 13: Linear Discrimination Centering on Support Vector Machines
Topic 14: More on Kernel Methods
Topic 15: Graphical Models Centering on Belief Networks
Topic 16: Success Stories of Machine Learning
Topic 17: Hidden Markov Models
Topic 19: Neural Networks
Topic 20: Computational Learning Theory
Remark: Topics 17, 19, and 20 will likely be only briefly covered or skipped due to lack of time. For Topic 16 your input is appreciated!
Course Elements
Total: 26-27 classes
• 18-19 lectures
• 3 course projects
• 2-3 classes for review and discussing course projects
• 1-2 classes will be allocated for student presentations
• 3 40-minute reviews
• 2 exams
• Graded and ungraded paper-and-pencil homework problems
• Course Webpage: http://www2.cs.uh.edu/~ceick/ML/ML.html
2014 Plan of Course Activities
1. Through March 15: Homework 1; Individual Project 1 (Reinforcement Learning and Adaptation: learn how to act intelligently in an unknown/changing environment); Homework 2.
2. We., March 5: Midterm Exam
3. March 16-April 5: Group Project 2 (TBDL)
4. April 6-April 26: Homework 3, Project 3 (TBDL)
5. Mo., May 5, 2p: Final Exam
Remark: The schedule is the same as in 2013, except that reinforcement learning will be covered after the introduction.
Schedule ML Spring 2013
Jan 14: Introduction
Jan 16: Introduction / Supervised Learning
Jan 21: Bayesian Decision Theory, Parametric Approaches
Jan. 23: Multivariate Methods, Homework 1
Jan. 28: Multivariate Methods, Dim. Reduction, Project 1
Jan. 30: Clustering 1
Feb. 5: Non-Parametric Methods, Review 1
…: Decision Trees, Review 2, Project 2, Midterm Exam
…: Decision Trees, Clustering 2, Reinforcement Learning
…: Reinforcement Learning
…: Ensembles, SVM
…: SVM, Project 3, Project 2 SP
…: Project 2 SP, More on Kernels, Project 3, Comparing Learners
…: Review 3, Graphical Models, Kaelbling Article, TE; Post Analysis Project 1, Review 4
Green: will use other teaching material
Dates to Remember
March 5 (or March 17) + May 5, 2p: Exams
April 6+8??: Project 2 Student Project Presentations
Jan. 20, March 10/12: No class (Spring Break)
Feb. 23, April 3/5, April 26: Submit Project Report/Software/Deliverable
Exams
 Will be open notes/textbook
 You will get a review list before the exam
 Exams will center (80% or more) on material that was covered in the lecture
 Exam scores will be immediately converted into number grades
 We only have 2009, 2011, and 2013 sample exams; I taught this course only three times recently.
Other UH-CS Courses with Overlapping Contents
1. COSC 6368: Artificial Intelligence
 Strong Overlap: Decision Trees, Bayesian Belief Networks
 Medium Overlap: Reinforcement Learning
2. COSC 6335: Data Mining
 Strong Overlap: Decision Trees, SVM, kNN, Density-Based Clustering
 Medium Overlap: K-Means, Decision Trees, Preprocessing/Exploratory DA, AdaBoost
3. COSC 6343: Pattern Classification?!?
 Medium Overlap: all classification algorithms and feature selection; discusses those topics from a different perspective.
Purpose of COSC 6342
Machine Learning is the study of how to build computer systems that learn from experience. It intersects with statistics, cognitive science, information theory, artificial intelligence, pattern recognition, and probability theory, among others. The course will explain how to build systems that learn and adapt using real-world applications. Its main themes include:
• Learning how to create models from examples that classify or predict
• Learning in unknown and changing environments
• Theory of machine learning
• Preprocessing
• Unsupervised learning and other learning paradigms
Course Objectives COSC 6342
Upon completion of this course, students
• will know what the goals and objectives of machine learning are
• will have a basic understanding of how to use machine learning to build real-world systems
• will have sound knowledge of popular classification and prediction techniques, such as decision trees, support vector machines, nearest-neighbor approaches, and regression
• will learn how to build systems that explore unknown and changing environments
• will get some exposure to machine learning theory, in particular how to learn models that exhibit high accuracy
• will have some exposure to more advanced topics, such as ensemble approaches, kernel methods, unsupervised learning, feature selection and generation, and density estimation