Download Sample Lecture - University of Calgary

Statistical Methods for Optimization Algorithms
Robert Collier
Fundamental Concepts
A random variable is a variable...
...with a value that is the result of a measurement taken of a random process.
A probability distribution function is a function that describes the probability...
...that a random variable is assigned a specific value.
[Figure: example plots of the uniform, binomial, and normal distributions]
A cumulative distribution function is a function that describes the probability...
...that a random variable is assigned a value less than or equal to a specific value.
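As a minimal sketch of the two definitions above, assuming a fair six-sided die as the random process, the functions below compute the probability of one specific value and the probability of a value less than or equal to it. The names die_pmf and die_cdf are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def die_pmf(value, sides=6):
    """Probability that a fair die shows exactly `value`."""
    if 1 <= value <= sides:
        return Fraction(1, sides)
    return Fraction(0)


def die_cdf(value, sides=6):
    """Probability that a fair die shows `value` or less."""
    # Sum the probability of every outcome up to and including `value`.
    return sum((die_pmf(v, sides) for v in range(1, value + 1)), Fraction(0))


print(die_pmf(3), die_cdf(3))
```

Exact fractions are used so that the cumulative value at the largest face is exactly one rather than a rounded float.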
Elementary Statistics
The mean (μ) of a random variable is...
...the weighted average of the possible values that could be assigned.
The variance (σ²) of a random variable is...
...the average squared difference between each observation and the mean...
...and the square root of the variance is known as the standard deviation (σ).
It is very important to recognize that a sample data set...
...will often have a different mean, variance, and standard deviation...
...than that of the system from which the sample was taken.
Example:
Consider the sample {1, 3, 1, 2} of the outcome of a six-sided die...
Sample Mean (x̄) = 1.750000
True Mean (μ) = 3.500000
Sample Variance (s²) = 0.916667
True Variance (σ²) = 2.916667
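The figures in this example can be reproduced with a short sketch, shown here under two assumptions: the die is fair, and the sample variance divides by n - 1 (the convention that yields the 0.916667 quoted above), while the true variance divides by the number of faces.

```python
from fractions import Fraction
from statistics import mean, variance

sample = [1, 3, 1, 2]

# Statistics of the observed sample.
sample_mean = mean(sample)
sample_variance = variance(sample)  # divides by n - 1

# Statistics of the system itself: a fair die with faces 1..6.
faces = range(1, 7)
true_mean = Fraction(sum(faces), 6)
true_variance = sum((Fraction(f) - true_mean) ** 2 for f in faces) / 6

print(f"sample mean     = {sample_mean:.6f}")
print(f"true mean       = {float(true_mean):.6f}")
print(f"sample variance = {sample_variance:.6f}")
print(f"true variance   = {float(true_variance):.6f}")
```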
Central Limit Theorem
The sum of many random variables...
...taken independently, from the same distribution...
...approaches the normal distribution.
Example:
Consider the probability distribution for a group of six-sided dice...
If there is one die...
If there are two dice...
If there are three dice...
[Figure: three plots of the probability distribution of the sum, for one die (sums 1 to 6), two dice (sums 2 to 12), and three dice (sums 3 to 18)]
...the distribution of the sum approaches the normal distribution.
This is a crucial property, because many statistical tests assume normality.
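The dice example can be checked exactly with a small sketch that builds the distribution of the sum one die at a time. It assumes fair, independent dice; the function name dice_sum_distribution is hypothetical.

```python
from fractions import Fraction


def dice_sum_distribution(num_dice, sides=6):
    """Return {sum: probability} for the total shown by `num_dice` fair dice."""
    dist = {0: Fraction(1)}  # with no dice, the sum is 0 with certainty
    for _ in range(num_dice):
        updated = {}
        # Add one more die: every existing total can grow by each face value.
        for total, prob in dist.items():
            for face in range(1, sides + 1):
                key = total + face
                updated[key] = updated.get(key, Fraction(0)) + prob / sides
        dist = updated
    return dist


for n in (1, 2, 3):
    dist = dice_sum_distribution(n)
    print(n, [str(dist[s]) for s in sorted(dist)])
```

With one die the distribution is flat, with two it is triangular, and with three it already shows the rounded, bell-like profile that the theorem describes.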
Sample Mean Distribution
There are functions that generate random values from a normal distribution...
in Excel: NORMINV(RAND(), μ, σ)
in Matlab: random('Normal', μ, σ, 1, 1)
If one of these functions is used to generate a random sample...
...then the mean of the sample is, itself, a random variable...
...meaning that it has a probability distribution function of its own.
If the random sample is only a single value...
...the sample mean has the same distribution as the source.
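For readers without Excel or Matlab, the same idea can be sketched in Python. The sketch assumes the Excel formula is read as inverse-CDF sampling (a uniform random number passed through the inverse normal CDF); draw_normal and sample_mean are hypothetical helper names.

```python
import random
from statistics import NormalDist, fmean


def draw_normal(mu, sigma, rng):
    """Draw one value by passing a uniform number through the inverse normal CDF."""
    p = rng.random()
    while p == 0.0:  # inv_cdf is undefined at exactly 0, so redraw
        p = rng.random()
    return NormalDist(mu, sigma).inv_cdf(p)


def sample_mean(n, mu, sigma, rng):
    """Mean of a fresh random sample of size n; itself a random variable."""
    return fmean(draw_normal(mu, sigma, rng) for _ in range(n))


rng = random.Random(0)
# Repeating the experiment gives a different sample mean each time.
print([round(sample_mean(1, 5.0, 2.0, rng), 3) for _ in range(5)])
```

With a sample size of 1, each sample mean is simply a single draw from the source distribution, which is why the two distributions coincide in that case.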
However, as the random sample size increases...
...the distribution of the sample mean seems to narrow.
[Figure: distribution of the sample mean for sample sizes 5, 10, and 100]
As the sample size (n) increases, the standard deviation of the sample mean...
...becomes smaller than that of the source distribution.
σ_x̄ = σ / √n
When σ must be estimated from the sample, the shape of the standardized distribution approaches normal if the sample size is large...
...but if the sample size is smaller it will have a flatter peak and fatter tails...
...and it is referred to as the t-distribution.
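The narrowing of the sample mean distribution can be checked empirically with a rough simulation sketch. It assumes a normal source distribution with μ = 0 and σ = 1; the function name sample_mean_std, the number of trials, and the seed are arbitrary illustrative choices.

```python
import math
import random
import statistics


def sample_mean_std(n, mu=0.0, sigma=1.0, trials=5000, seed=0):
    """Estimate the standard deviation of the mean of samples of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


for n in (5, 10, 100):
    observed = sample_mean_std(n)
    predicted = 1.0 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1
    print(f"n={n:3d}  observed={observed:.4f}  predicted={predicted:.4f}")
```

The observed values should sit close to the predicted ones, with the small remaining gap attributable to the finite number of trials.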
Confidence Intervals
By specifying an interval under which some percent of the curve is located...
...the true mean lies within that same interval, with the same percent certainty.
[Figure: distribution curve centered on the sample mean, with 95.0 % of the area under the curve marked]
The interval is specified by parameter α, where: α = 1.000 - 0.950 = 0.050
Since, for this example, the sample size was 50, the boundary of the interval is at: sample mean + t_{0.05/2}(49), that is, (sample mean + 2.01)
The need for confidence intervals can be easily demonstrated...
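One possible demonstration is sketched below. It assumes that the 2.01 quoted above is a critical value that multiplies the standard error s / √n, that samples of size 50 are drawn from a normal distribution, and that the true mean (10.0) and standard deviation (2.0) are arbitrary illustrative choices. The names confidence_interval and coverage are hypothetical.

```python
import math
import random
import statistics

T_CRITICAL = 2.01  # critical value quoted in the slide for alpha = 0.05, 49 degrees of freedom


def confidence_interval(sample, t_critical=T_CRITICAL):
    """Return (low, high) bounds centered on the sample mean."""
    n = len(sample)
    center = statistics.fmean(sample)
    # Assumption: the critical value scales the standard error of the mean.
    half_width = t_critical * statistics.stdev(sample) / math.sqrt(n)
    return center - half_width, center + half_width


def coverage(true_mean=10.0, sigma=2.0, n=50, trials=4000, seed=1):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = confidence_interval(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / trials


print(f"fraction of intervals containing the true mean: {coverage():.3f}")
```

Individual sample means miss the true mean by varying amounts, but the fraction of intervals that contain it should come out near 0.95.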
The remaining slides have been omitted from this preview.
This graduate-level course in genetic algorithms was developed and taught by Robert Collier, MSc, BSc.
It was most recently offered at the University of Guelph during the summer 2011 semester.