Chapter 10
Machine Learning: Symbol-Based
Contents
A Framework
Version Space Search
ID3: Decision Tree
CSC411
Artificial Intelligence
1
Machine Learning
AI systems grow from a minimal amount
of knowledge by learning
Herbert Simon (1983):
– Any change in a system that allows it to
perform better the second time on repetition of
the same task or on another task drawn from
the same population
Machine learning issues:
– Generalization from experience
Induction
Inductive biases
– Performance change: improve or degrade
Machine Learning Categories
Symbol-based learning
– Inductive learning -- learning by examples
– Supervised learning/unsupervised learning
Concept learning -- classification
Concept formation -- clustering
– Explanation-based learning
– Reinforcement learning
Neural/connectionist networks
Genetic/evolutionary learning
A general model of the learning process
Learning Components
Data and goals of learning task
– What is given: training instances
– What is expected
Knowledge representation
– Logic expressions
– Decision trees
– Rules
Operations
– Generalization/specialization
– Heuristic rules
– Weight adjustments
Concept space
– Search space: representation, format
Heuristic search
– Search control in the concept space
Learning By Examples
Patrick Winston (1975)
– Given a set of positive and a set of negative
examples
– Find a concept representation
– Semantic network representation
Example
– Learn a general definition of structural
concept, say “arch”
– Positive examples: examples of arch
What an arch looks like, to define the arch
– Negative examples: near misses
What an arch doesn’t look like, to avoid the overcoverage of arch
Examples and near misses for the concept “arch.”
Generalization of descriptions to include multiple
examples.
Generalization of descriptions to include multiple
examples (cont’d)
Specialization of a description to exclude a near miss. In
c we add constraints to a so that it can’t match with b.
Version Space Search
Inductive learning as search through a
concept space
Generalization imposes an ordering on the
concepts in the space and uses the
ordering to guide the search
Generalization
– Principles
Extend the coverage of instances
Shorten/shrink the constraints
– Operations
Replacing constants with variables
Dropping conditions from a conjunctive expression
Adding a disjunct to an expression
Replacing a concept with one of its parent concepts
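The ordering that generalization imposes on the concept space, and the first operation above, can be sketched in a few lines. This is only an illustration under assumptions not in the slides: a concept such as obj(X, red, ball) is written as a tuple of three attribute values, the string "?" stands for a variable, and the function names are hypothetical.

```python
VAR = "?"  # assumed marker for a variable such as X in obj(X, red, ball)

def covers(concept, instance):
    """True if each position of the concept is a variable or equals the instance value."""
    return all(c == VAR or c == i for c, i in zip(concept, instance))

def more_general_or_equal(c1, c2):
    """True if c1 >= c2 in the generalization ordering (c1 covers whatever c2 covers)."""
    return all(a == VAR or a == b for a, b in zip(c1, c2))

def replace_constant(concept, position):
    """Generalize a concept by replacing the constant at `position` with a variable."""
    return concept[:position] + (VAR,) + concept[position + 1:]

specific = ("small", "red", "ball")        # obj(small, red, ball)
general = replace_constant(specific, 0)    # obj(X, red, ball)

print(general)                                    # ('?', 'red', 'ball')
print(covers(general, ("large", "red", "ball")))  # True
print(more_general_or_equal(general, specific))   # True
```

Dropping a condition from a conjunction has the same effect in this representation, since a variable position places no constraint on the instance.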
A concept space:
• Initial state obj(X, Y, Z) might cover all instances: too general
• As more instances are added, X, Y, Z will be constrained
Version Space Search Algorithms
Characteristics of these algorithms
– Data-driven
Positive examples to generalize the concept
Negative examples to constrain the concept (avoid overgeneralization)
– Procedure:
Starting from the whole space
Reducing the size of the space as more examples are included
Finding regularities (rules) in the training data
– Generalization on these regularities (rules)
Three algorithms
– Reducing the size of the version space in a specific to general direction
– Reducing the size of the version space in a general to specific direction
– Combination of the above: candidate elimination algorithm
Negative Examples
The role of negative examples in preventing
overgeneralization by forcing the learner to specialize
concepts in order to exclude negative examples
Specific to General Search
Maintains a set S of candidate concepts,
the maximally specific generalizations
from the training instances
A concept c is maximally specific if it
– covers all positive examples, none of the
negative examples, and
– for any other concept c’ that covers the
positive examples, c≤c’
The algorithm uses
– Positive examples to generalize the candidate
concepts
– Negative examples to avoid overgeneralization
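A minimal sketch of a specific to general search may make the roles of the two kinds of examples concrete. It assumes a purely conjunctive representation (a tuple of attribute values, with "?" for a variable), so each candidate has exactly one most specific generalization; the training data and function names are hypothetical, not taken from the slides.

```python
VAR = "?"  # assumed marker for a variable

def covers(h, x):
    """True if hypothesis h matches instance x."""
    return all(a == VAR or a == b for a, b in zip(h, x))

def generalize(h, x):
    """Most specific generalization of h that also covers x."""
    return tuple(a if a == b else VAR for a, b in zip(h, x))

def specific_to_general(examples):
    """Return the set S of maximally specific candidates after all examples.

    `examples` is a list of (instance, is_positive) pairs.
    """
    S = None   # initialized from the first positive instance
    N = []     # negative instances seen so far
    for x, positive in examples:
        if positive:
            S = [x] if S is None else [generalize(s, x) for s in S]
            # Discard candidates that overgeneralize onto a known negative.
            S = [s for s in S if not any(covers(s, n) for n in N)]
        else:
            if S is not None:
                S = [s for s in S if not covers(s, x)]
            N.append(x)
    return S or []

examples = [
    (("small", "red", "ball"), True),
    (("large", "red", "cube"), False),
    (("small", "white", "ball"), True),
    (("large", "blue", "ball"), True),
]
print(specific_to_general(examples))  # [('?', '?', 'ball')]
```

An empty result means that no conjunctive concept in this representation covers all the positives while excluding the negatives seen so far.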
Specific to General Search Algorithm
For hypothesis set S:
Specific to general search of the version space learning
the concept “ball.”
General to Specific Search
Maintains a set G of maximally general
concepts
A concept c is maximally general if it
– covers none of the negative training examples,
and
– for any other concept c’ that covers no
negative training examples, c ≥ c’
The algorithm uses
– negative examples to specialize the candidate
concepts
– Positive examples to eliminate
overspecialization
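The general to specific direction can be sketched the same way. Specializing a variable requires knowing which values it may take, so this sketch assumes a small, made-up set of attribute domains; the "?" marker, the domains, the data, and the function names are all illustrative assumptions.

```python
VAR = "?"  # assumed marker for a variable

def covers(h, x):
    return all(a == VAR or a == b for a, b in zip(h, x))

def more_general_or_equal(h1, h2):
    return all(a == VAR or a == b for a, b in zip(h1, h2))

def specialize(g, n, domains):
    """Most general specializations of g that no longer cover negative n."""
    result = []
    for i, a in enumerate(g):
        if a == VAR:
            for v in domains[i]:
                if v != n[i]:
                    result.append(g[:i] + (v,) + g[i + 1:])
    return result

def general_to_specific(examples, domains):
    """Return the set G of maximally general candidates after all examples."""
    G = [(VAR,) * len(domains)]   # start with the most general concept
    P = []                        # positive instances seen so far
    for x, positive in examples:
        if positive:
            # Positives eliminate overspecialized candidates.
            G = [g for g in G if covers(g, x)]
            P.append(x)
        else:
            new_G = []
            for g in G:
                new_G.extend(specialize(g, x, domains) if covers(g, x) else [g])
            new_G = [g for g in new_G if all(covers(g, p) for p in P)]
            new_G = list(dict.fromkeys(new_G))  # drop duplicates, keep order
            # Keep only candidates not strictly below another candidate.
            G = [g for g in new_G
                 if not any(h != g and more_general_or_equal(h, g) for h in new_G)]
    return G

domains = (("small", "large"), ("red", "white", "blue"), ("ball", "brick", "cube"))
examples = [
    (("small", "red", "brick"), False),
    (("large", "white", "ball"), True),
    (("large", "blue", "cube"), False),
    (("small", "blue", "ball"), True),
]
print(general_to_specific(examples, domains))  # [('?', '?', 'ball')]
```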
General to Specific Search Algorithm
General to specific search of the version space learning
the concept “ball.”
Candidate Elimination Algorithm
Combination of above two algorithms into
a bidirectional search
Maintains two sets of candidate concepts
– G, the set of maximally general candidates
– S, the set of maximally specific candidates
The algorithm specializes G and
generalizes S until they converge on the
target concept.
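The bidirectional search can be sketched compactly for conjunctive concepts. The sketch below is an assumption-laden illustration, not the algorithm as printed on the slide: concepts are tuples with "?" for a variable, S is held as a single tuple (enough for this representation), the training data are assumed noise-free, and the domains and example instances are made up.

```python
VAR = "?"  # assumed marker for a variable

def covers(h, x):
    return all(a == VAR or a == b for a, b in zip(h, x))

def more_general_or_equal(h1, h2):
    return all(a == VAR or a == b for a, b in zip(h1, h2))

def candidate_elimination(examples, domains):
    """Return (S, G) after processing (instance, is_positive) pairs."""
    G = [(VAR,) * len(domains)]   # maximally general boundary
    S = None                      # maximally specific boundary
    for x, positive in examples:
        if positive:
            # Generalize S just enough to cover x; prune G members that miss x.
            S = x if S is None else tuple(a if a == b else VAR for a, b in zip(S, x))
            G = [g for g in G if covers(g, x)]
        else:
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)
                    continue
                # Specialize g minimally so that it excludes x.
                for i, a in enumerate(g):
                    if a != VAR:
                        continue
                    for v in domains[i]:
                        cand = g[:i] + (v,) + g[i + 1:]
                        # Keep only specializations that still lie above S.
                        if v != x[i] and (S is None or more_general_or_equal(cand, S)):
                            new_G.append(cand)
            new_G = list(dict.fromkeys(new_G))
            G = [g for g in new_G
                 if not any(h != g and more_general_or_equal(h, g) for h in new_G)]
    return S, G

domains = (("small", "large"), ("red", "white", "blue"), ("ball", "brick", "cube"))
examples = [
    (("small", "red", "ball"), True),
    (("small", "blue", "ball"), False),
    (("large", "red", "ball"), True),
    (("large", "red", "cube"), False),
]
S, G = candidate_elimination(examples, domains)
print(S, G)  # ('?', 'red', 'ball') [('?', 'red', 'ball')]
```

When S and G meet on the same concept, as they do for this data, the version space has converged; a fuller version would also report the case where a negative instance is covered by S, which signals inconsistent data.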
Candidate Elimination Algorithm
The candidate elimination algorithm learning the concept “red ball.”
Converging boundaries of the G and S sets in the
candidate elimination algorithm.
Decision Trees
Learning algorithms for inducing concepts
from examples
Characteristics
– A tree structure to represent the concept,
equivalent to a set of rules
– Entropy and information gain as heuristics for
selecting candidate concepts
– Handling noisy data
– Classification – supervised learning
Typical systems: ID3, C4.5, C5.0
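The entropy and information gain heuristics can be written down directly from their standard definitions: entropy is the negated sum of p * log2(p) over the class proportions, and the gain of an attribute is the entropy of the whole sample minus the size-weighted entropy of the subsets it induces. The sketch below assumes examples stored as dictionaries; the attribute names and the four rows are invented for illustration and are not the credit data from the slides.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` obtained by splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical toy sample: "income" separates the classes, "collateral" does not.
rows = [
    {"income": "low", "collateral": "none", "risk": "high"},
    {"income": "low", "collateral": "adequate", "risk": "high"},
    {"income": "high", "collateral": "none", "risk": "low"},
    {"income": "high", "collateral": "adequate", "risk": "low"},
]
print(entropy([r["risk"] for r in rows]))            # 1.0
print(information_gain(rows, "income", "risk"))      # 1.0
print(information_gain(rows, "collateral", "risk"))  # 0.0
```

An attribute with higher gain leaves purer subsets, which is why it is preferred as the next test in the tree.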
Data from credit history of loan applications
A decision tree for credit risk assessment.
A simplified decision tree for credit risk
assessment.
Decision Tree Construction Algorithm
The induction algorithm begins with a sample of correctly
classified members of the target categories.
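One way to sketch the induction step is a recursive, ID3-style procedure that picks the attribute with the highest information gain, partitions the sample on it, and recurses on each partition. The slide's own algorithm listing is not reproduced in this transcript, so the code below is an assumed reconstruction of the general technique; the rows, attribute names, and function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gain(rows, attribute, target):
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

def induce_tree(rows, attributes, target):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:          # all examples in one class: leaf
        return labels[0]
    if not attributes:                 # nothing left to test: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: gain(rows, a, target))
    rest = [a for a in attributes if a != best]
    return {best: {value: induce_tree([r for r in rows if r[best] == value], rest, target)
                   for value in sorted({r[best] for r in rows})}}

def classify(tree, row):
    """Follow the branches matching `row` until a leaf label is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[row[attribute]]
    return tree

# Hypothetical, correctly classified sample (not the slides' credit table).
rows = [
    {"income": "low", "history": "bad", "risk": "high"},
    {"income": "low", "history": "good", "risk": "high"},
    {"income": "low", "history": "good", "risk": "high"},
    {"income": "high", "history": "bad", "risk": "high"},
    {"income": "high", "history": "good", "risk": "low"},
    {"income": "high", "history": "good", "risk": "low"},
]
tree = induce_tree(rows, ["income", "history"], "risk")
print(tree)
```

On this sample "income" has the larger gain, so it becomes the root, and "history" is tested only under the high-income branch. A value that never occurred in training has no branch in this sketch, so `classify` would raise a `KeyError` for it.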
A partially constructed decision tree.
Another partially constructed decision tree.