1-Intro - Fordham University Computer and Information Sciences

Machine Learning
Introduction
Class Info
• Office Hours
– Monday: 11:30 – 1:00
– Wednesday: 10:00 – 1:00
– Thursday: 11:30 – 1:00
• Course Text
– Tom Mitchell: Machine Learning
– Course notes
• Prerequisites
– CS1, CS2
Important
• Come to class. Pay attention. Ask questions.
– There are no stupid questions!!
• Come to my office hours
• Start the homework assignments early
• Homework in this class requires “thinking time”
• Read the textbook and notes
– The textbook can be difficult to read: very technical
Important
• The course material is difficult
• Material for every class requires complete
understanding of the material from all the
previous classes
– Come to my office hours!!
• The first two or three classes will cover important
mathematical background for the course
– You will be tested on this material
Why Machine Learning
• Sorting algorithms
– Can you write a program?
• Facial recognition
– Can you write a program?
– How do people do it? (Can we simulate this
process?!)
– Instead of writing a program by hand, we collect lots
of examples that specify the correct output for a given
input
– A machine learning algorithm then takes these
examples and produces a program that does the job
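The idea that a learning algorithm turns examples into a program can be sketched in a few lines. This is only an illustration, assuming a 1-nearest-neighbour learner as a stand-in for "a machine learning algorithm"; the name `learn` and the toy feature vectors are hypothetical and do not come from the slides.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[Sequence[float], str]  # (input vector, correct output)


def learn(examples: List[Example]) -> Callable[[Sequence[float]], str]:
    """Return a 'program' (a predictor) built from labelled examples.

    Assumption: 1-nearest-neighbour is used purely as a simple stand-in
    for a learning algorithm.
    """
    if not examples:
        raise ValueError("need at least one example")

    def predict(x: Sequence[float]) -> str:
        # Squared Euclidean distance from x to one stored example.
        def dist(example: Example) -> float:
            features, _ = example
            return sum((a - b) ** 2 for a, b in zip(features, x))

        _, label = min(examples, key=dist)
        return label

    return predict


# Hypothetical toy data: two numeric features per image.
training = [((0.9, 0.8), "face"), ((0.1, 0.2), "not face")]
recognizer = learn(training)
print(recognizer((0.8, 0.9)))  # the nearest stored example is labelled "face"
```

Nobody wrote the recognizer by hand here: changing the examples changes the program that `learn` returns.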
Find Waldo (2)
Examples of Machine Learning
Applications
• Predicting discrete labels (classification)
– Email spam filtering
• Predicting real numbers
– Predicting future stock prices or exchange rates
– Recommendation systems (e.g. Netflix competition, for movies)
• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
• Recognizing anomalies: monitoring
– Unusual sequences of credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant, or
in a hospital intensive care ward (epidemiology)
Three Niches for Machine Learning
• Data mining : using historical data to improve
decisions
– medical records → medical knowledge
• Software applications we can't program by hand
– autonomous driving
– speech recognition
• Self-customizing programs
– Newsreader that learns user interests
Typical Datamining Task
Data mining : using historical data to improve decisions
medical records → medical knowledge
• Given:
– 9714 patient records, each describing a pregnancy and birth
– Each patient record contains 215 features
• Learn to predict:
– Classes of future patients at high risk for Emergency Cesarean Section
Datamining Result
Data mining : using historical data to improve decisions
medical records → medical knowledge
One of 18 learned rules:
– If
• No previous vaginal delivery, and
• Abnormal 2nd Trimester Ultrasound, and
• Malpresentation at admission
– Then Probability of Emergency C-Section is 0.6
Over training data: 26/41 = .63,
Over test data: 12/20 = .60
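The two fractions quoted for the rule can be checked with a short calculation. The sketch below assumes that each denominator is the number of records matching the rule's conditions and each numerator is the number of those that were emergency C-sections; the slide does not spell this out, and the function name `rule_accuracy` is hypothetical.

```python
def rule_accuracy(correct: int, covered: int) -> float:
    """Fraction of the records covered by a rule for which it was right."""
    if covered <= 0:
        raise ValueError("rule must cover at least one record")
    return correct / covered


train_acc = rule_accuracy(26, 41)  # counts quoted on the slide
test_acc = rule_accuracy(12, 20)
print(f"train {train_acc:.2f}, test {test_acc:.2f}")  # train 0.63, test 0.60
```

The test figure is close to the training figure, which suggests the rule generalizes to patients it was not learned from.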
Credit Risk Analysis
• Rules learned from synthesized data:
– If
• Other-Delinquent-Accounts > 2, and
• Number-Delinquent-Billing-Cycles > 1
– Then Profitable-Customer? = No [Deny Credit Card application]
– If
• Other-Delinquent-Accounts = 0, and
• (Income > $30k) OR (Years-of-Credit > 3)
– Then Profitable-Customer? = Yes [Accept Credit Card application]
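Learned rules of this kind translate directly into code. The following is a minimal sketch of the two rules above; the `Applicant` class, its field names, and the choice to return `None` when neither rule fires are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Applicant:
    other_delinquent_accounts: int
    delinquent_billing_cycles: int
    income: float           # dollars (unit assumed from the "$30k" threshold)
    years_of_credit: float


def profitable_customer(a: Applicant) -> Optional[bool]:
    """Apply the two learned rules; None means neither rule fires."""
    if a.other_delinquent_accounts > 2 and a.delinquent_billing_cycles > 1:
        return False  # deny credit card application
    if a.other_delinquent_accounts == 0 and (
        a.income > 30_000 or a.years_of_credit > 3
    ):
        return True   # accept credit card application
    return None       # not covered by either rule


print(profitable_customer(Applicant(3, 2, 50_000, 5)))  # False
print(profitable_customer(Applicant(0, 0, 20_000, 4)))  # True
```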
Other Prediction Problems
• Customer purchase behavior
• Customer retention
• Process optimization
Displaying the structure of a set of documents
using a deep neural network (clustering vs classification)
Why Study Machine Learning?
• Engineering Better Computing Systems
– Develop systems that are too difficult/expensive to construct
manually because they require specific detailed skills or
knowledge tuned to a specific task (knowledge engineering
bottleneck).
– Develop systems that can automatically adapt and customize
themselves to individual users.
• Personalized news or mail filter
• Personalized tutoring
– Discover new knowledge from large databases (data mining).
• Market basket analysis (e.g. diapers and beer)
• Medical text mining (e.g. migraines to calcium channel blockers to
magnesium)
R. Mooney, Univ. of Texas at
Austin
Why Study Machine Learning?
• Cognitive Science
– Computational studies of learning may help us understand
learning in humans and other biological organisms.
• Hebbian neural learning
– “Neurons that fire together, wire together.”
• Humans’ relative difficulty of learning disjunctive concepts vs.
conjunctive ones.
• Power law of practice
– [Figure: log(perf. time) plotted against log(# training trials)]
Why Study Machine Learning?
• The Time is Ripe
– Many basic effective and efficient algorithms
available.
– Large amounts of on-line data available.
– Large amounts of computational resources
available.
Machine Learning and Statistics
• A lot of work in machine learning can be seen as a
rediscovery of things that were known in statistics
– Statistics: interpretation
– ML: prediction
• Machine learning often refers to tasks associated with
artificial intelligence (AI)
– Recognition
– Diagnosis
– Planning
– Robot control
• Goals can be autonomous machine performance, or
enabling humans to learn from data (data mining)
What is a Learning Problem?
• Learning = Improving with experience at some task
– Improve over task T
– with respect to performance measure P
– based on experience E
• E.g., Learn to play checkers
– T: Play checkers
– P: % of games won in world tournament
– E: opportunity to play against self
• What experience?
• What exactly should be learned?
• How shall it be represented?
• What specific algorithm to learn it?
Type of Training Experience
• Direct or indirect?
• Teacher or not?
Problem: is training experience representative of
performance goal?
Defining the Learning Task
Task T, Performance metric P, Experience E
T: Recognizing hand-written words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words
T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while
observing a human driver.
T: Categorize email messages as spam or legitimate.
P: Percentage of email messages correctly classified.
E: Database of emails, some with human-given labels
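For the spam task, the performance measure P can be computed directly from the labelled part of E. The sketch below is illustrative only: the keyword classifier and the four emails are hypothetical, and unlabelled emails are simply skipped when scoring.

```python
from typing import Callable, List, Optional, Tuple

Email = Tuple[str, Optional[str]]  # (text, human-given label or None)


def percent_correct(classify: Callable[[str], str], emails: List[Email]) -> float:
    """P: percentage of labelled emails that the classifier gets right."""
    labelled = [(text, label) for text, label in emails if label is not None]
    if not labelled:
        raise ValueError("no labelled emails to evaluate on")
    hits = sum(1 for text, label in labelled if classify(text) == label)
    return 100.0 * hits / len(labelled)


def keyword_classifier(text: str) -> str:
    # Hypothetical hand-written baseline, not a learned model.
    return "spam" if "free money" in text.lower() else "legitimate"


emails: List[Email] = [
    ("FREE MONEY, click now", "spam"),
    ("Meeting moved to 3pm", "legitimate"),
    ("You won a prize", "spam"),
    ("Lunch tomorrow?", None),  # unlabelled, ignored when scoring
]
print(percent_correct(keyword_classifier, emails))  # 2 of 3 labelled emails correct
```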
Designing a Learning System
• Choose the training experience
• Choose exactly what is to be learned, i.e. the target
function.
• Choose how to represent the target function.
• Choose a learning algorithm to infer the target function
from the experience.
[Figure: block diagram with components Environment/Experience, Learner, Knowledge, Performance Element]
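The four design choices can be made concrete in a small sketch. Everything specific here is an assumption chosen for illustration, not something the slides prescribe: the experience is a handful of (features, value) pairs, the target function is represented as a linear function of the features, and the learning algorithm is the LMS (least mean squares) weight-update rule.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (features, target value)


def predict(weights: List[float], features: Sequence[float]) -> float:
    """Linear representation of the target function; weights[0] is the bias."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_train(examples: List[Example], n_features: int,
              rate: float = 0.05, epochs: int = 500) -> List[float]:
    """LMS rule: nudge each weight in proportion to the prediction error."""
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        for features, target in examples:
            error = target - predict(weights, features)
            weights[0] += rate * error
            for i, x in enumerate(features):
                weights[i + 1] += rate * error * x
    return weights


# Hypothetical training experience generated by V(x) = 1 + 2x.
examples: List[Example] = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0)]
weights = lms_train(examples, n_features=1)
print(weights)  # should be close to [1.0, 2.0]
```

Swapping any one of the four choices (different experience, a non-linear representation, a different update rule) changes the learner without changing the overall structure.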
Types of learning task
• Supervised learning
– Learn to predict output when given an input vector
• Who provides the correct answer?
• Reinforcement learning (will not be covered in this course)
– Learn action to maximize payoff
• Not much information in a payoff signal
• Payoff is often delayed
• Unsupervised learning
– Create an internal representation of the input e.g. form
clusters; extract features
• How do we know if a representation is good?
– This is the new frontier of machine learning because
most big datasets do not come with labels.
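As an illustration of forming clusters without labels, here is a minimal one-dimensional k-means sketch. The algorithm choice, the function name `kmeans_1d`, the data points, and the starting centers are all assumptions made for this example.

```python
from typing import List


def kmeans_1d(points: List[float], centers: List[float],
              iterations: int = 10) -> List[float]:
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    centers = list(centers)
    for _ in range(iterations):
        clusters: List[List[float]] = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers


# Hypothetical unlabelled data with two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers = kmeans_1d(points, centers=[0.0, 6.0])
print(centers)  # approximately [1.0, 5.0]
```

No correct answers were supplied: the two groups emerge from the inputs alone, which is also why judging whether the representation is "good" is harder than in the supervised case.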
Some Issues in Machine Learning:
Textbook, Ch. 1
• What algorithms can approximate functions well (and when)?
• How does number of training examples influence accuracy?
• How does complexity of hypothesis representation impact it?
• How does noisy data influence accuracy?
• What are the theoretical limits of learnability?
• How can prior knowledge of learner help?
• What clues can we get from biological learning systems?
• How can systems alter their own representations?