Ismor Fischer, 5/29/2012
2.4 Summary
(Compare with first page of §2.3.)
[Figure: a population and a sample drawn from it, linked by statistical inference.]
POPULATION: Random Variable X, numerical (X discrete or X continuous), with the Distribution of X
Parameters
♦ Mean µ (“mu”)
♦ Variance σ²
♦ Standard Deviation σ (“sigma”)
SAMPLE, size n: Density Histogram (relative frequency of each xi)
Estimators µ̂ and σ̂ can be calculated via the following statistics:
♦ Mean x̄
♦ Variance s²
♦ Standard Deviation s
Comments:
• The population mean µ and variance σ² are defined in terms of expected value:
$$\mu = E[X] = \sum_{\text{all } x} x \, f(x), \qquad \sigma^2 = E[(X - \mu)^2] = \sum_{\text{all } x} (x - \mu)^2 f(x)$$
if X is discrete (with corresponding “integration formulas” if X is continuous), where
f(x) is the probability of value x occurring in the population, i.e., P(X = x). Later…
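The two sums can be evaluated directly for a small discrete distribution. The following is a minimal sketch, assuming a hypothetical population in which X is the face shown by a fair six-sided die; the dictionary `f` and the function names are illustrative and not part of the original notes.

```python
from fractions import Fraction

# Hypothetical population: X = face shown by a fair six-sided die.
# f maps each value x to its probability f(x) = P(X = x).
f = {x: Fraction(1, 6) for x in range(1, 7)}


def population_mean(f):
    """mu = E[X] = sum over all x of x * f(x)."""
    return sum(x * p for x, p in f.items())


def population_variance(f):
    """sigma^2 = E[(X - mu)^2] = sum over all x of (x - mu)^2 * f(x)."""
    mu = population_mean(f)
    return sum((x - mu) ** 2 * p for x, p in f.items())


mu = population_mean(f)
sigma2 = population_variance(f)
print("mean:", mu, "variance:", sigma2)
```

Exact fractions are used here so that the results are not blurred by floating-point rounding; any other discrete distribution can be substituted by changing `f`, provided its probabilities sum to 1.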
• If n is used instead of n − 1 in the denominator of s², the expected value of the resulting
statistic is ((n − 1)/n) σ², which is always less than σ². Consistent under- (or over-) estimation
of a parameter by a statistic is called bias. The formulas given for the sample mean and
variance (with n − 1 in the denominator) are unbiased estimators.
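The effect of the denominator can be checked exactly on a small case. The sketch below assumes a hypothetical fair-die population and independent samples of size n = 2 (drawn with replacement); it enumerates all 36 equally likely samples and averages the sample variance computed with each denominator. The names are illustrative only.

```python
from fractions import Fraction
from itertools import product

values = range(1, 7)  # hypothetical population: faces of a fair die
n = 2                 # sample size, drawn independently (with replacement)

# Population variance sigma^2, computed from its definition.
mu = Fraction(sum(values), len(values))
sigma2 = sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(sample, denom):
    """Sum of squared deviations from the sample mean, divided by denom."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / denom


# Every possible sample of size n is equally likely, so the expected value
# of a statistic is its plain average over all of them.
samples = list(product(values, repeat=n))
expected_unbiased = sum(sample_variance(s, n - 1) for s in samples) / len(samples)
expected_biased = sum(sample_variance(s, n) for s in samples) / len(samples)

print("sigma^2           :", sigma2)
print("E[s^2], n - 1     :", expected_unbiased)
print("E[s^2], n instead :", expected_biased)
```

In this small case the n − 1 version averages to σ² exactly, while the n version averages to (n − 1)/n times σ², which is the underestimation described above.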
• Chebyshev’s Inequality
Whatever the shape of the distribution, at least 75% of the values
lie within ±2 standard deviations of the mean, at least 89% lie
within ±3 standard deviations, etc. More generally, at least
$$\left(1 - \frac{1}{k^2}\right) \times 100\%$$
of the values lie within ±k standard deviations of the mean. (Note that
k > 1, but it need not be an integer!)
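The general bound is a one-line computation. A minimal sketch follows; the function name `chebyshev_lower_bound` is an illustrative choice, not something defined in the notes.

```python
def chebyshev_lower_bound(k):
    """Guaranteed minimum fraction of values within k standard deviations
    of the mean, from Chebyshev's Inequality: 1 - 1/k^2."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


# k need not be an integer.
for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of the values")
```

For k = 2 and k = 3 this reproduces the 75% and (approximately) 89% figures quoted above.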
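[Figure: portrait of Pafnuty Chebyshev (1821-1894), and a distribution curve whose horizontal axis is marked from µ − 3σ to µ + 3σ in steps of σ, with ≥ 75% of the values within µ ± 2σ and ≥ 89% within µ ± 3σ.]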
Exercise: Suppose that a population of individuals has a mean age of µ = 40 years,
and standard deviation of σ = 10 years. At least how much of the population is between
20 and 60 years old? Between 15 and 65 years old? What symmetric age interval about
the mean is guaranteed to contain at least half the population?
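One possible way to work the exercise, applying Chebyshev's Inequality with k equal to the half-width of each interval divided by σ = 10 (a sketch of the arithmetic, not necessarily the author's intended presentation):

```latex
\begin{align*}
[20, 60]:\quad & k = \frac{60 - 40}{10} = 2
  &&\Rightarrow\ 1 - \frac{1}{2^2} = 0.75 \quad (\text{at least } 75\%) \\
[15, 65]:\quad & k = \frac{65 - 40}{10} = 2.5
  &&\Rightarrow\ 1 - \frac{1}{2.5^2} = 0.84 \quad (\text{at least } 84\%) \\
\text{at least half}:\quad & 1 - \frac{1}{k^2} = \frac{1}{2}
  &&\Rightarrow\ k = \sqrt{2} \approx 1.414,\quad
  40 \pm 10\sqrt{2} \approx (25.9,\ 54.1) \text{ years}
\end{align*}
```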
Note: If the distribution is bell-shaped, then approximately 68% lie within ±1σ,
approximately 95% lie within ±2σ, and approximately 99.7% lie within ±3σ. For other
multiples of σ, percentages can be obtained via software or tables. These percentages are
much sharper than Chebyshev’s general result, which can be overly conservative, and they
can be used to check whether a distribution is reasonably bell-shaped for use in subsequent
testing procedures. (Later...)
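The bell-shaped percentages can be reproduced in software and set beside Chebyshev's guarantee. The sketch below assumes that "bell-shaped" means exactly normal, in which case the proportion within ±kσ equals erf(k/√2); the function names are illustrative.

```python
import math


def normal_within(k):
    """Proportion of a normal distribution within k standard deviations
    of its mean: P(|X - mu| <= k * sigma) = erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))


def chebyshev_lower_bound(k):
    """Chebyshev's guaranteed minimum proportion, 1 - 1/k^2 (0 at k = 1)."""
    return 1 - 1 / k**2


for k in (1, 2, 3):
    print(f"within ±{k}σ: normal ≈ {normal_within(k):.1%}, "
          f"Chebyshev guarantees at least {chebyshev_lower_bound(k):.1%}")
```

Comparing observed sample proportions within ±1, ±2, and ±3 sample standard deviations against the normal values gives a rough, informal check of bell shape.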