CS 416 – Machine Learning
LECTURE 1
INTRODUCTION TO MACHINE LEARNING
Differentiate between Statistics, Data Mining, Machine Learning and Artificial Intelligence:
1. Statistics is about the numbers and quantifying the data. It offers many tools for
finding relevant properties of the data, and it is pretty close to pure mathematics.
2. Data Mining is about using Statistics as well as other programming methods to find
patterns hidden in the data so that you can explain some phenomenon. Data Mining
builds intuition about what is really happening in some data. It still leans a little more
towards math than programming, but uses both.
3. Machine Learning uses Data Mining techniques and other learning algorithms to build
models of what is happening behind some data so that it can predict future outcomes.
Math is the basis for many of the algorithms, but this leans more towards programming.
4. Artificial Intelligence uses models built by Machine Learning, along with other techniques,
to reason about the world and give rise to intelligent behavior, whether this is playing a
game or driving a robot/car. Artificial Intelligence has some goal to achieve: it predicts
how actions will affect the model of the world and chooses the actions that will best
achieve that goal. It is very programming based.
In short:
• Statistics quantifies numbers
• Data Mining explains patterns
• Machine Learning predicts with models
• Artificial Intelligence behaves and reasons
Why is Machine Learning Used?
Learning is used when:
• Human expertise does not exist (navigating on Mars)
• Humans are unable to explain their expertise (speech recognition)
• The solution changes over time (routing on a computer network)
• The solution needs to be adapted to particular cases (user biometrics)
Engr. Nadeem
What is Machine Learning?
1st Definition:
Arthur Samuel (1959). Machine Learning: Field of study that gives computers the ability to learn
without being explicitly programmed.
2nd Definition:
Explanation of 2nd Definition:
How do machines learn?
Regardless of whether the learner is a human or a machine, the basic learning process is similar.
It can be divided into three components as follows:
• Data input: It utilizes observation, memory storage, and recall to provide a factual basis
for further reasoning.
• Abstraction: It involves the translation of data into broader representations.
• Generalization: It uses abstracted data to form a basis for action.
Types of Machine Learning Algorithms:
1 - Supervised learning
It deals with learning a target function from training examples of its inputs and outputs.
Example: Regression and Classification.
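As a minimal sketch of supervised learning by regression, the code below fits a straight line to a few input/output pairs and then predicts the output for an unseen input. The training values and the function names fit_line and predict are assumptions made for illustration only; they are not part of the lecture.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept from training pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned target function to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training examples: inputs with their known outputs (y = 2x + 1).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)
print(predict(model, 5.0))  # prediction for an input not seen in training
```

The learner is never told the rule y = 2x + 1; it recovers the slope and intercept from the examples alone, which is the sense in which the target function is "learned".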
2 - Unsupervised learning
It attempts to learn patterns in the input for which no output values are available.
Example: Clustering
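One common clustering method is k-means, sketched below on one-dimensional data. The data points, the initial centers and the function name kmeans_1d are illustrative assumptions; note that no output labels are supplied anywhere.

```python
def kmeans_1d(points, centers, iterations=10):
    """Basic k-means on 1-D data; `centers` holds the initial guesses."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled inputs that visibly form two groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

The algorithm discovers the two groups purely from the spacing of the inputs. In practice the result can depend on the initial centers, so this is only one possible setup.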
3 - Reinforcement learning
It is a type of Machine Learning, and thereby also a branch of Artificial Intelligence. It
allows machines and software agents to automatically determine the ideal behavior
within a specific context, in order to maximize their performance.
Example: Game playing
o If there is no teacher, the player must be able to determine which actions were
critical to the outcome and then alter its heuristics accordingly.
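The idea of learning without a teacher can be sketched with tabular Q-learning, one well-known reinforcement learning algorithm, on a toy game. Everything here is an assumption for illustration: the game is a five-cell corridor, the player is rewarded only on reaching the last cell, and the learning rate, discount factor and episode count are arbitrary choices.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 wins the game
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Hypothetical toy environment: reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from experience; no correct moves are ever given."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore with purely random moves
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = next_state
    return q


q = q_learning()
# Greedy policy: in each non-goal cell, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The reward arrives only at the end of each game, yet the update rule passes credit backwards, so the player works out that stepping right was the critical action in every cell.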
4 - Recommender systems
Recommender systems or recommendation systems (sometimes replacing "system"
with a synonym such as platform or engine) are a subclass of information filtering
systems that seek to predict the 'rating' or 'preference' that a user would give to an item.
Example:
Steps to apply machine learning to your data:
Any machine learning task can be broken down into a series of more manageable steps.
1. Collecting data: Whether the data is written on paper, recorded in text files and
spreadsheets, or stored in an SQL database, you will need to gather it in an electronic
format suitable for analysis. This data will serve as the learning material an algorithm
uses to generate actionable knowledge.
2. Exploring and preparing the data: The quality of any machine learning project is based
largely on the quality of data it uses.
3. Training a model on the data: By the time the data has been prepared for analysis, you
are likely to have a sense of what you are hoping to learn from the data. The specific
machine learning task will inform the selection of an appropriate algorithm, and the
algorithm will represent the data in the form of a model.
4. Evaluating model performance: Because each machine learning model results in a
biased solution to the learning problem, it is important to evaluate how well the algorithm
learned from its experience. Depending on the type of model used, you might be able to
evaluate the accuracy of the model using a test dataset.
5. Improving model performance: If better performance is needed, it becomes necessary
to utilize more advanced strategies to augment the performance of the model.
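The steps above can be sketched end to end on a toy problem. The records (hours studied and pass/fail), the threshold model and all names below are hypothetical and chosen only to keep the example short; a real project would use far more data and a proper learning library.

```python
# Step 1: collect data. Hypothetical records: (hours_studied, passed).
raw = [(1.0, 0), (2.0, 0), (3.0, 0), (None, 1), (6.0, 1), (7.0, 1), (8.0, 1),
       (2.5, 0), (6.5, 1)]

# Step 2: explore and prepare. Drop rows with missing values.
clean = [(x, y) for x, y in raw if x is not None]

# Hold out the last two rows as a test set and train on the rest.
train, test = clean[:-2], clean[-2:]


# Step 3: train a model. Here: a threshold halfway between the class means.
def train_threshold(rows):
    neg = [x for x, y in rows if y == 0]
    pos = [x for x, y in rows if y == 1]
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2


def predict(threshold, x):
    return 1 if x > threshold else 0


threshold = train_threshold(train)

# Step 4: evaluate on data the model has not seen during training.
correct = sum(predict(threshold, x) == y for x, y in test)
accuracy = correct / len(test)
print(threshold, accuracy)
```

Step 5 would start from here: if the accuracy on the test set were too low, one could gather more data, engineer better inputs, or switch to a more capable model.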
The Role of Statistics and Computer Science in Machine Learning
• Role of Statistics: Inference from a sample
• Role of Computer Science: Efficient algorithms to
o Solve the optimization problem
o Represent and evaluate the model for inference
Examples of Machine Learning Problems
1. Spam Detection: Given email in an inbox, identify those email messages that are spam and
those that are not. Having a model of this problem would allow a program to leave non-spam
emails in the inbox and move spam emails to a spam folder. We should all be familiar with
this example.
2. Credit Card Fraud Detection: Given credit card transactions for a customer in a month,
identify those transactions that were made by the customer and those that were not. A
program with a model of this decision could refund those transactions that were fraudulent.
3. Digit Recognition: Given zip codes handwritten on envelopes, identify the digit for each
handwritten character. A model of this problem would allow a computer program to read
and understand handwritten zip codes and sort envelopes by geographic region.
4. Speech Understanding: Given an utterance from a user, identify the specific request made
by the user. A model of this problem would allow a program to understand and make an
attempt to fulfill that request. The iPhone with Siri has this capability.
5. Face Detection: Given a digital photo album of many hundreds of digital photographs,
identify those photos that include a given person. A model of this decision process would
allow a program to organize photos by person. Some cameras and software like iPhoto have
this capability.
6. Product Recommendation: Given a purchase history for a customer and a large inventory
of products, identify those products in which that customer will be interested and likely to
purchase. A model of this decision process would allow a program to make recommendations
to a customer and motivate product purchases. Amazon has this capability. Also think of
Facebook and GooglePlus, which recommend users to connect with after you
sign up.
7. Medical Diagnosis: Given the symptoms exhibited in a patient and a database of
anonymized patient records, predict whether the patient is likely to have an illness. A model
of this decision problem could be used by a program to provide decision support to medical
professionals.
8. Stock Trading: Given the current and past price movements for a stock, determine whether
the stock should be bought, held or sold. A model of this decision problem could provide
decision support to financial analysts.
9. Customer Segmentation: Given the pattern of behavior by a user during a trial period and
the past behaviors of all users, identify those users that will convert to the paid version of the
product and those that will not. A model of this decision problem would allow a program to
trigger customer interventions to persuade the customer to convert early or better engage in the
trial.
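To make one of these problems concrete, the sketch below treats Spam Detection as a classification task using a simplified naive Bayes word model with equal class priors. The four training messages and the names train and is_spam are made up for illustration; a real spam filter would need far more data and features.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label) with label True for spam."""
    counts = {True: Counter(), False: Counter()}
    for text, label in messages:
        counts[label].update(text.lower().split())
    return counts


def is_spam(counts, text):
    """Sum smoothed log-likelihood ratios of the words; positive means spam."""
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(counts[label].values()) for label in counts}
    score = 0.0
    for word in text.lower().split():
        # Laplace smoothing keeps unseen words from producing a zero probability.
        p_spam = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_ham = (counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_spam / p_ham)
    return score > 0


# Hypothetical labeled emails: True = spam, False = not spam.
training = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday", False),
]
counts = train(training)

print(is_spam(counts, "claim free money"))
print(is_spam(counts, "agenda for lunch"))
```

A program with such a model could leave messages scored as non-spam in the inbox and move the rest to a spam folder, as described in the first example.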