Probability Boot camp
Joel Barajas
October 13th 2008
Basic Probability

If we toss a coin twice, the sample space of outcomes is
{HH, HT, TH, TT}

Event – a subset of the sample space
Example: exactly one head comes up, i.e. {HT, TH}
Probability of this event: 2/4 = 1/2
Permutations

Suppose that we are given n distinct objects and wish to arrange r
of them in a line, where the order matters.
The number of arrangements is equal to:

nPr = n(n − 1)(n − 2) ... (n − r + 1) = n! / (n − r)!

Example: the rankings of the schools
Combination

If we want to select r objects without regard to order, we use the
combination. It is denoted by:

nCr = (n choose r) = n! / (r! (n − r)!)

Example: the toppings for a pizza
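As a quick check on both counting formulas, here is a short Python sketch; the concrete numbers (10 objects, choose or arrange 3) are illustrative, not from the slides:

```python
import math

# Permutations: ordered arrangements of r out of n distinct objects.
# Example: ranking 3 schools out of 10 candidates.
n_perms = math.perm(10, 3)   # 10 * 9 * 8 = 720

# Combinations: unordered selections of r out of n objects.
# Example: choosing 3 pizza toppings from 10 available.
n_combs = math.comb(10, 3)   # 10! / (3! * 7!) = 120

print(n_perms, n_combs)  # 720 120
```

`math.perm` and `math.comb` are available from Python 3.8 onward; for earlier versions the same values follow directly from the factorial formulas above.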
Venn Diagram

[Figure: Venn diagram of sample space S containing overlapping
events A and B; the overlap region is A ∩ B]
Probability Theorems

Theorem 1: The probability of an event lies between 0 and 1,
i.e. 0 ≤ P(E) ≤ 1.

Proof: Let S be the sample space and E be the event. The number of
elements in E can't be negative, and can't be greater than the
number of elements in S. Then
0 ≤ n(E) ≤ n(S)
0 / n(S) ≤ n(E) / n(S) ≤ n(S) / n(S)
or 0 ≤ P(E) ≤ 1
Probability Theorems

Theorem 2: The probability of an impossible event is 0, i.e.
P(E) = 0.

Proof: Since E has no elements, n(E) = 0. From the definition of
probability:
P(E) = n(E) / n(S) = 0 / n(S) = 0
Probability Theorems

Theorem 3: The probability of a sure event is 1, i.e. P(S) = 1,
where S is the sure event.

Proof: For a sure event, n(E) = n(S), since the number of elements
in the event E equals the number of elements in the sample space.
By the definition of probability:
P(S) = n(S) / n(S) = 1
Probability Theorems

Theorem 4: If two events A and B are such that A ⊆ B, then
P(A) ≤ P(B).

Proof: Since A is a subset of B, from set theory the number of
elements in A can't be more than the number of elements in B:
n(A) ≤ n(B)
n(A) / n(S) ≤ n(B) / n(S)
Then
P(A) ≤ P(B)
Probability Theorems

Theorem 5: If E is any event and E′ is the complement of E, then
P(E) + P(E′) = 1.

Proof: Let S be the sample space. Then
n(E) + n(E′) = n(S)
n(E) / n(S) + n(E′) / n(S) = 1
or
P(E) + P(E′) = 1
Computing Conditional Probabilities

The conditional probability P(A|B) is the probability of event A,
given that event B has occurred:

P(A | B) = P(A ∩ B) / P(B)

where P(A ∩ B) = joint probability of A and B
P(A) = marginal probability of A
P(B) = marginal probability of B
Computing Joint and Marginal Probabilities

The probability of a joint event, A and B:

P(A and B) = (number of outcomes satisfying A and B) /
             (total number of elementary outcomes)

Independent events:
P(B|A) = P(B), which is equivalent to
P(A and B) = P(A) P(B)

Bayes' Theorem: if A1, A2, …, An are mutually exclusive and
collectively exhaustive, then

P(Ai | B) = P(B | Ai) P(Ai) / [P(B | A1) P(A1) + … + P(B | An) P(An)]
Visualizing Events

Contingency Tables (example: drawing a card from a full deck of 52)

        Ace   Not Ace   Total
Black    2      24       26
Red      2      24       26
Total    4      48       52
Tree Diagrams

[Figure: tree diagram splitting the sample space (full deck of 52
cards) into black/red, then ace/not ace, with counts 2 and 24 on
each pair of branches]
Joint Probabilities Using a Contingency Table

         Event B1      Event B2      Total
A1       P(A1 ∩ B1)    P(A1 ∩ B2)    P(A1)
A2       P(A2 ∩ B1)    P(A2 ∩ B2)    P(A2)
Total    P(B1)         P(B2)         1

The cells in the body of the table are joint probabilities; the
row and column totals are marginal (simple) probabilities.
Example

Of the cars on a used car lot, 70% have air conditioning (AC) and
40% have a CD player (CD). 20% of the cars have a CD player but
not AC.

What is the probability that a car has a CD player, given that it
has AC?
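The slide poses the question without showing the arithmetic; a minimal Python sketch of the conditional-probability computation, using the percentages given above:

```python
p_ac = 0.70          # P(AC)
p_cd = 0.40          # P(CD)
p_cd_not_ac = 0.20   # P(CD and not AC)

# P(CD and AC) = P(CD) - P(CD and not AC)
p_cd_and_ac = p_cd - p_cd_not_ac

# Conditional probability: P(CD | AC) = P(CD and AC) / P(AC)
p_cd_given_ac = p_cd_and_ac / p_ac
print(round(p_cd_given_ac, 4))  # 0.2857
```

So a car with AC has about a 28.6% chance of also having a CD player.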
Introduction to Probability Distributions

Random Variable
Represents a possible numerical value from an uncertain event.
Random variables may be discrete or continuous.

Mean
μ = E(X) = Σ(i=1..N) Xi P(Xi)

Variance of a discrete random variable
σ² = Σ(i=1..N) [Xi − E(X)]² P(Xi) = E[(X − μ)²]

Standard deviation of a discrete random variable
σ = √σ² = √E[(X − μ)²]

where:
E(X) = expected value of the discrete random variable X
Xi = the ith outcome of X
P(Xi) = probability of the ith occurrence of X
Example: Toss 2 coins, let X = # of heads.
Possible number of heads = 0, 1, or 2.

X    P(X)
0    0.25
1    0.50
2    0.25

Expected value of X:
E(X) = (0 × 0.25) + (1 × 0.50) + (2 × 0.25) = 1.0

Standard deviation:
σ = √( Σ [Xi − E(X)]² P(Xi) )
  = √[ (0 − 1)²(0.25) + (1 − 1)²(0.50) + (2 − 1)²(0.25) ]
  = √0.50 ≈ 0.707
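The two-coin calculation above can be reproduced in a few lines of Python:

```python
# Distribution of X = number of heads in two fair coin tosses.
outcomes = {0: 0.25, 1: 0.50, 2: 0.25}

# Mean: E(X) = sum of x * P(x)
mean = sum(x * p for x, p in outcomes.items())

# Variance: sum of (x - E(X))^2 * P(x); standard deviation is its root
variance = sum((x - mean) ** 2 * p for x, p in outcomes.items())
std_dev = variance ** 0.5

print(mean, variance, round(std_dev, 3))  # 1.0 0.5 0.707
```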
The Covariance

The covariance measures the strength of the linear relationship
between two variables:

σXY = Σ(i=1..N) [Xi − E(X)][Yi − E(Y)] P(Xi, Yi)
    = E[(X − μX)(Y − μY)]

where:
X = discrete variable X
Xi = the ith outcome of X
Y = discrete variable Y
Yi = the ith outcome of Y
P(Xi, Yi) = probability of occurrence of the ith outcome of X and
the ith outcome of Y
Correlation Coefficient

A measure of the dependence of variables X and Y is given by

ρ = σXY / (σX σY)

If ρ = 0 then X and Y are uncorrelated.
Probability Distributions

Discrete probability distributions:
Binomial, Poisson, Hypergeometric, Multinomial

Continuous probability distributions:
Normal, Uniform, Exponential
Binomial Distribution Formula

P(X = c) = [n! / (c! (n − c)!)] p^c (1 − p)^(n−c)

P(X = c) = probability of c successes in n trials.
The random variable X denotes the number of 'successes' in n
trials (X = 0, 1, 2, ..., n).
n = sample size (number of trials or observations)
p = probability of "success" in a single trial
(does not change from one trial to the next)

Example: Flip a coin four times, let X = # heads:
n = 4
p = 0.5
1 − p = (1 − 0.5) = 0.5
X = 0, 1, 2, 3, 4
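The formula translates directly into code; evaluated for the four-coin-flip example, with P(exactly 2 heads) as an illustrative query:

```python
import math

def binomial_pmf(c, n, p):
    """P(X = c): probability of exactly c successes in n independent trials."""
    return math.comb(n, c) * p**c * (1 - p)**(n - c)

# Four fair coin flips (n = 4, p = 0.5): P(exactly 2 heads)
print(binomial_pmf(2, 4, 0.5))  # 0.375
```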
Binomial Distribution

The shape of the binomial distribution depends on the values of p
and n.

[Figure: P(X) over X = 0, 1, ..., 5 for n = 5, p = 0.1
(mass concentrated at small X) and for n = 5, p = 0.5 (symmetric)]
Binomial Distribution Characteristics

Mean
μ = E(X) = np

Variance and Standard Deviation
σ² = np(1 − p)
σ = √(np(1 − p))

where n = sample size
p = probability of success
(1 − p) = probability of failure
Multinomial Distribution

P(X1 = c1, ..., Xk = ck) = [n! / (c1! c2! ... ck!)] p1^c1 ... pk^ck

P(X1 = c1, ..., Xk = ck) = probability of observing counts
c1, ..., ck across the k categories in n trials.
The random variable Xi denotes the number of outcomes of category
i in n trials (Xi = 0, 1, 2, ..., n, with c1 + ... + ck = n).
n = sample size (number of trials or observations)
pi = probability of category i in a single trial

Example: You draw from 5 red, 4 blue and 3 yellow balls; let
Xi = # balls of each color:
n = 12
p = [5/12, 4/12, 3/12] ≈ [0.417, 0.333, 0.25]
The Normal Distribution

'Bell shaped'
Symmetrical
Mean, median and mode are equal

Location is determined by the mean, μ.
Spread is determined by the standard deviation, σ.
The random variable has an infinite theoretical range: −∞ to +∞.

The formula for the normal probability density function is

f(X) = [1 / (σ√(2π))] e^(−(1/2)[(X − μ)/σ]²)

Any normal distribution (with any mean and standard deviation
combination) can be transformed into the standardized normal
distribution Z, where Z = (X − mean) / std dev. We need to
transform X units into Z units:

f(Z) = [1 / √(2π)] e^(−(1/2)Z²)

where
e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
X = any value of the continuous variable
Comparing X and Z units

[Figure: the same distribution shown in X units (μ = 100, σ = 50,
with X = 200 marked) and in Z units (μ = 0, σ = 1, with Z = 2.0
marked)]

Note that the distribution is the same; only the scale has
changed. We can express the problem in original units (X) or in
standardized units (Z).
Finding Normal Probabilities

Suppose X is normal with mean 8.0 and standard deviation 5.0.
Find P(X < 8.6) = 0.5 + P(8.0 < X < 8.6)
The Standardized Normal Table

The row shows the value of Z to the first decimal point; the
column gives the value of Z to the second decimal point. The value
within the table gives the probability from Z = 0 up to the
desired Z value.

For example, the entry at row 2.0, column 0.00 is .4772, so
P(0 < Z < 2.00) = 0.4772 and
P(Z < 2.00) = 0.5 + 0.4772 = 0.9772
Relationship between Binomial & Normal distributions

If n is large and if neither p nor q = 1 − p is too close to zero,
the binomial distribution can be closely approximated by a normal
distribution, with standardized normal variable given by

Z = (X − np) / √(npq)

X is the random variable giving the number of successes in n
Bernoulli trials and p is the probability of success.
Z is asymptotically normal.
Normal Approximation to the Binomial Distribution

The binomial distribution is a discrete distribution, but the
normal is continuous.
To use the normal to approximate the binomial, accuracy is
improved if you use a continuity correction adjustment.

Example:
X is discrete in a binomial distribution, so P(X = 4) can be
approximated with a continuous normal distribution by finding
P(3.5 < X < 4.5)
Normal Approximation to the Binomial Distribution (continued)

The closer p is to 0.5, the better the normal approximation to the
binomial.
The larger the sample size n, the better the normal approximation
to the binomial.

General rule: the normal distribution can be used to approximate
the binomial distribution if
np ≥ 5 and n(1 − p) ≥ 5
Normal Approximation to the Binomial Distribution (continued)

The mean and standard deviation of the binomial distribution are
μ = np
σ = √(np(1 − p))

Transform binomial to normal using the formula:

Z = (X − μ) / σ = (X − np) / √(np(1 − p))
Using the Normal Approximation to the Binomial Distribution

If n = 1000 and p = 0.2, what is P(X ≤ 180)?

Approximate P(X ≤ 180) using a continuity correction adjustment:
P(X ≤ 180.5)

Transform to standardized normal:

Z = (X − np) / √(np(1 − p))
  = (180.5 − (1000)(0.2)) / √((1000)(0.2)(1 − 0.2)) ≈ −1.54

So P(X ≤ 180) ≈ P(Z ≤ −1.54) = 0.0618
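A short Python sketch of this approximation, using the standard normal CDF expressed through the error function (no table lookup needed):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p, x = 1000, 0.2, 180
# Continuity correction: P(X <= 180) is approximated by P(X <= 180.5)
z = (x + 0.5 - n * p) / math.sqrt(n * p * (1 - p))
prob = phi(z)
print(round(z, 2), round(prob, 4))  # z rounds to -1.54; prob is close to the table value 0.0618
```

The tiny discrepancy between `prob` and 0.0618 comes from the table rounding z to two decimals.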
Poisson Distribution

P(X) = (e^(−λ) λ^X) / X!

where:
X = discrete random variable (number of events in an area of
opportunity)
λ = expected number of events (constant)
e = base of the natural logarithm system (2.71828...)
Poisson Distribution Characteristics

Mean
μ = λ

Variance and Standard Deviation
σ² = λ
σ = √λ

where λ = expected number of events
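The Poisson mass function is one line of Python; the query P(X = 2) with λ = 3 is an illustrative choice, not from the slides:

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x! for a Poisson random variable."""
    return math.exp(-lam) * lam**x / math.factorial(x)

# With lam = 3 expected events: mean = 3, variance = 3, sd = sqrt(3)
print(round(poisson_pmf(2, 3.0), 4))  # 0.224
```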
Poisson Distribution Shape

The shape of the Poisson distribution depends on the parameter λ:

[Figure: P(x) for λ = 0.50 (mass concentrated near x = 0, plotted
over x = 0..7) and for λ = 3.00 (peak near x = 3, plotted over
x = 0..12)]
Relationship between Binomial & Poisson distributions

In a binomial distribution, if n is large and p (the probability
of success) is small, then it approximates to a Poisson
distribution with λ = np.
Relationship b/w Poisson & Normal distributions

The Poisson distribution approaches the normal distribution as
λ → ∞, with standardized normal variable given by

Z = (X − λ) / √λ

Are there any other distributions besides binomial and Poisson
that have the normal distribution as the limiting case?
The Uniform Distribution

The uniform distribution is a probability distribution that has
equal probabilities for all possible outcomes of the random
variable.

Also called a rectangular distribution.
Uniform Distribution Example

Example: uniform probability distribution over the range
2 ≤ X ≤ 6:

f(X) = 1 / (b − a) = 1 / (6 − 2) = 0.25 for 2 ≤ X ≤ 6

μ = (a + b) / 2 = (2 + 6) / 2 = 4

σ = √((b − a)² / 12) = √((6 − 2)² / 12) ≈ 1.1547
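The uniform-distribution quantities above reduce to direct arithmetic:

```python
# Uniform distribution on [a, b]: constant density 1 / (b - a)
a, b = 2, 6

density = 1 / (b - a)                 # f(X) = 0.25 for 2 <= X <= 6
mean = (a + b) / 2                    # mu = 4.0
std_dev = ((b - a) ** 2 / 12) ** 0.5  # sigma ≈ 1.1547

print(density, mean, round(std_dev, 4))  # 0.25 4.0 1.1547
```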
Sampling Distributions

Two cases: the sampling distribution of the mean and the sampling
distribution of the proportion.

A sampling distribution is a distribution of all of the possible
values of a statistic for a given size sample selected from a
population.
Developing a Sampling Distribution

Assume there is a population of size N = 4 (individuals A, B, C,
D). The random variable, X, is the age of individuals.
Values of X: 18, 20, 22, 24 (years)
Developing a Sampling Distribution (continued)

Summary measures for the population distribution:

μ = Σ Xi / N = (18 + 20 + 22 + 24) / 4 = 21

σ = √( Σ (Xi − μ)² / N ) = 2.236

[Figure: uniform distribution — P(x) = .25 for each of
x = 18, 20, 22, 24 (individuals A, B, C, D)]
Sampling Distribution of Means (continued)

Now consider all possible samples of size n = 2, sampling with
replacement — 16 possible samples:

1st    2nd Observation
Obs    18     20     22     24
18     18,18  18,20  18,22  18,24
20     20,18  20,20  20,22  20,24
22     22,18  22,20  22,22  22,24
24     24,18  24,20  24,22  24,24

The 16 sample means:

1st    2nd Observation
Obs    18  20  22  24
18     18  19  20  21
20     19  20  21  22
22     20  21  22  23
24     21  22  23  24
Sampling Distribution of Means (continued)

Summary measures of this sampling distribution:

μX̄ = Σ X̄i / N = (18 + 19 + 21 + … + 24) / 16 = 21

σX̄ = √( Σ (X̄i − μX̄)² / N )
    = √( [(18 − 21)² + (19 − 21)² + … + (24 − 21)²] / 16 ) = 1.58
Comparing the Population with its Sampling Distribution

Population (N = 4): μ = 21, σ = 2.236
Sample means distribution (16 samples of size n = 2): μX̄ = 21,
σX̄ = 1.58

[Figure: the flat population distribution, P(X) = .25 over
X = 18, 20, 22, 24 (A, B, C, D), next to the peaked sampling
distribution of X̄ over X̄ = 18, 19, …, 24]
Standard Error, Mean and Variance

Different samples of the same size from the same population will
yield different sample means.

A measure of the variability in the mean from sample to sample is
given by the standard error of the mean:

σX̄ = σ / √n

(This assumes that sampling is with replacement or sampling is
without replacement from an infinite population.)

Note that the standard error of the mean decreases as the sample
size increases.
Standard Error, Mean and Variance

If a population is normal with mean μ and standard deviation σ,
the sampling distribution of X̄ is also normally distributed with

μX̄ = μ and σX̄ = σ / √n

The Z value for the sampling distribution of X̄ is

Z = (X̄ − μX̄) / σX̄ = (X̄ − μ) / (σ / √n)
Sampling Distribution Properties

μX̄ = μ (i.e. X̄ is unbiased)

[Figure: a normal population distribution and the normal sampling
distribution of X̄, both centered at the same mean μ]
Sampling Distribution Properties (continued)

As n increases, σX̄ decreases.

[Figure: sampling distributions for a smaller sample size (wider)
and a larger sample size (narrower), both centered at μ]
If the Population is not Normal

We can apply the Central Limit Theorem: even if the population is
not normal, sample means from the population will be approximately
normal as long as the sample size is large enough.

Properties of the sampling distribution:

μX̄ = μ and σX̄ = σ / √n
Central Limit Theorem

As the sample size gets large enough (n ↑), the sampling
distribution becomes almost normal regardless of the shape of the
population.
If the Population is not Normal (continued)

Sampling distribution properties:
Central tendency: μX̄ = μ
Variation: σX̄ = σ / √n

[Figure: a non-normal population distribution with mean μ; its
sampling distribution becomes normal as n increases, narrowing
with larger sample size]
How Large is Large Enough?

For most distributions, n > 30 will give a sampling distribution
that is nearly normal.
For fairly symmetric distributions, n > 15.
For normal population distributions, the sampling distribution of
the mean is always normally distributed.
Example

Suppose a population has mean μ = 8 and standard deviation σ = 3.
Suppose a random sample of size n = 36 is selected.

What is the probability that the sample mean is between 7.8 and
8.2?
Example (continued)

Solution:

Even if the population is not normally distributed, the central
limit theorem can be used (n > 30)
… so the sampling distribution of X̄ is approximately normal
… with mean μX̄ = 8
… and standard deviation σX̄ = σ / √n = 3 / √36 = 0.5
Example (continued)

Solution (continued):

P(7.8 ≤ X̄ ≤ 8.2)
  = P( (7.8 − 8)/(3/√36) ≤ (X̄ − μ)/(σ/√n) ≤ (8.2 − 8)/(3/√36) )
  = P(−0.4 ≤ Z ≤ 0.4) = 0.3108
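The same central-limit-theorem calculation, verified numerically in Python via the error function:

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma, n = 8, 3, 36
se = sigma / math.sqrt(n)   # standard error = 0.5

# Standardize the endpoints of P(7.8 <= X-bar <= 8.2)
z_lo = (7.8 - mu) / se      # -0.4
z_hi = (8.2 - mu) / se      #  0.4
prob = phi(z_hi) - phi(z_lo)
print(round(prob, 4))  # 0.3108
```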
[Figure: the unknown population distribution (μ = 8), the sampling
distribution of X̄ between 7.8 and 8.2 (μX̄ = 8), and the
standardized normal distribution between −0.4 and 0.4 (μZ = 0);
the shaded area is .1554 on each side of the mean, totaling
0.3108]
Population Proportions

π = the proportion of the population having some characteristic

The sample proportion (p) provides an estimate of π:

p = X / n = (number of items in the sample having the
characteristic of interest) / (sample size)

0 ≤ p ≤ 1

p has a binomial distribution (assuming sampling with replacement
from a finite population or without replacement from an infinite
population)
Sampling Distribution of Proportions

For large values of n (n ≥ 30), the sampling distribution of p is
very nearly a normal distribution, with

μp = π and σp = √(π(1 − π) / n)

Standardize with

Z = (p − π) / σp = (p − π) / √(π(1 − π) / n)

(where π = population proportion)

[Figure: sampling distribution of p, P(p) plotted over
p = 0, .2, .4, .6, .8, 1]
Example

If the true proportion of voters who support Proposition A is
π = 0.4, what is the probability that a sample of size 200 yields
a sample proportion between 0.40 and 0.45?

i.e.: if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)?
Example (continued)

if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)?

Find σp:
σp = √(π(1 − π) / n) = √(0.4(1 − 0.4) / 200) = 0.03464

Convert to standard normal:
P(0.40 ≤ p ≤ 0.45)
  = P( (0.40 − 0.40)/0.03464 ≤ Z ≤ (0.45 − 0.40)/0.03464 )
  = P(0 ≤ Z ≤ 1.44)
Example (continued)

if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)?

Use the standard normal table:
P(0 ≤ Z ≤ 1.44) = 0.4251

[Figure: sampling distribution between p = 0.40 and 0.45,
standardized to Z between 0 and 1.44; shaded area 0.4251]
Point and Interval Estimates

A point estimate is a single number; a confidence interval
provides additional information about variability.

[Figure: the width of a confidence interval runs from the lower
confidence limit, through the point estimate, to the upper
confidence limit]
Point Estimates

We can estimate a population parameter with a sample statistic
(a point estimate):

Population Parameter    Point Estimate
Mean (μ)                X̄
Proportion (π)          p

How much uncertainty is associated with a point estimate of a
population parameter?
An interval estimate provides more information about a population
characteristic than does a point estimate.
Such interval estimates are called confidence intervals.
Confidence Interval Estimate

An interval gives a range of values:
  Takes into consideration variation in sample statistics from
  sample to sample
  Based on observations from 1 sample
  Gives information about closeness to unknown population
  parameters
  Stated in terms of level of confidence
  Can never be 100% confident
Estimation Process

From the population (mean, μ, is unknown), draw a random sample
and compute the sample mean, e.g. X̄ = 50. We can then make a
statement such as: "I am 95% confident that μ is between 40 & 60."
General Formula

The general formula for all confidence intervals is:

Point Estimate ± (Critical Value)(Standard Error)
Confidence Interval for μ (σ Known)

Assumptions
  Population standard deviation σ is known
  Population is normally distributed
  If population is not normal, use a large sample

Confidence interval estimate:

X̄ ± Z σ/√n

where X̄ is the point estimate,
Z is the normal distribution critical value for a particular level
of confidence, and
σ/√n is the standard error.
Finding the Critical Value, Z

Consider a 95% confidence interval:
1 − α = 0.95, so α/2 = 0.025 in each tail, and Z = ±1.96

[Figure: standard normal curve with α/2 = 0.025 in each tail;
Z = −1.96 marks the lower confidence limit and Z = 1.96 the upper
confidence limit, with the point estimate at 0]
Intervals and Level of Confidence

Intervals extend from X̄ − Z σ/√n to X̄ + Z σ/√n.

(1 − α)×100% of intervals constructed contain μ;
(α)×100% do not.

[Figure: sampling distribution of the mean with area 1 − α in the
center and α/2 in each tail; a series of confidence intervals
around different sample means x̄1, x̄2, …, most of which cover μ]
Example

A sample of 11 circuits from a large normal population has a mean
resistance of 2.20 ohms. We know from past testing that the
population standard deviation is 0.35 ohms.

Determine a 95% confidence interval for the true mean resistance
of the population.
Example (continued)

A sample of 11 circuits from a large normal population has a mean
resistance of 2.20 ohms. We know from past testing that the
population standard deviation is 0.35 ohms.

Solution:
X̄ ± Z σ/√n = 2.20 ± 1.96 (0.35/√11)
            = 2.20 ± 0.2068
1.9932 ≤ μ ≤ 2.4068
Interpretation

We are 95% confident that the true mean resistance is between
1.9932 and 2.4068 ohms.

Although the true mean may or may not be in this interval, 95% of
intervals formed in this manner will contain the true mean.
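The circuit-resistance interval can be reproduced in a few lines of Python:

```python
import math

x_bar, sigma, n = 2.20, 0.35, 11
z = 1.96  # critical value for 95% confidence

# Confidence interval: X-bar +/- Z * sigma / sqrt(n)
margin = z * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 4), round(upper, 4))  # 1.9932 2.4068
```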
Confidence Interval for μ (σ Unknown)

If the population standard deviation σ is unknown, we can
substitute the sample standard deviation, S.

This introduces extra uncertainty, since S varies from sample to
sample.

So we use the t distribution instead of the normal distribution.
Confidence Interval for μ (σ Unknown) (continued)

Assumptions
  Population standard deviation is unknown
  Population is normally distributed
  If population is not normal, use a large sample

Use Student's t distribution.
Confidence interval estimate:

X̄ ± t(n−1) S/√n
Student's t Distribution

The t is a family of distributions.
The t value depends on degrees of freedom (d.f.): the number of
observations that are free to vary after the sample mean has been
calculated.
d.f. = n − 1

Example: Suppose the mean of 3 numbers is 8.0.
Let X1 = 7 and X2 = 8. What is X3?
If the mean of these three values is 8.0, then X3 must be 9
(i.e., X3 is not free to vary).
Here, n = 3, so degrees of freedom = n − 1 = 3 − 1 = 2.
(2 values can be any numbers, but the third is not free to vary
for a given mean.)
Student's t Distribution

Note: t → Z as n increases.

t-distributions are bell-shaped and symmetric, but have 'fatter'
tails than the normal.

[Figure: standard normal (t with df = ∞) together with t (df = 13)
and t (df = 5), the lower-df curves having heavier tails]
Student's t Table

The body of the table contains t values, not probabilities. Rows
are degrees of freedom; columns are upper-tail areas.

df     .25     .10     .05
1      1.000   3.078   6.314
2      0.817   1.886   2.920
3      0.765   1.638   2.353

Let n = 3, so df = n − 1 = 2. For 90% confidence, the upper-tail
area is 0.05, giving t = 2.920.
Example

A random sample of n = 25 has X̄ = 50 and S = 8. Form a 95%
confidence interval for μ.

d.f. = n − 1 = 24, so t(0.025, 24) = 2.0639

The confidence interval is

X̄ ± t(n−1) S/√n = 50 ± (2.0639)(8/√25)

46.698 ≤ μ ≤ 53.302
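The same t-based interval in Python, with the critical value 2.0639 taken from the t table above rather than computed:

```python
import math

x_bar, s, n = 50, 8, 25
t_crit = 2.0639  # t value for 95% confidence, d.f. = 24 (from a t table)

# Confidence interval: X-bar +/- t * S / sqrt(n)
margin = t_crit * s / math.sqrt(n)   # 2.0639 * 8 / 5 = 3.302
lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 3), round(upper, 3))  # 46.698 53.302
```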
What is a Hypothesis?

A hypothesis is a claim (assumption) about a population parameter:

population mean
  Example: The mean monthly cell phone bill of this city is
  μ = $42

population proportion
  Example: The proportion of adults in this city with cell phones
  is π = 0.68
The Null Hypothesis, H0

States the claim or assertion to be tested.
Example: The average number of TV sets in U.S. homes is equal to
three (H0: μ = 3).

It is always about a population parameter, not about a sample
statistic:
H0: μ = 3 is a valid null hypothesis; H0: X̄ = 3 is not.
The Null Hypothesis, H0 (continued)

Begin with the assumption that the null hypothesis is true.
It always contains the "=", "≤" or "≥" sign.
It may or may not be rejected.
The Alternative Hypothesis, H1

It is the opposite of the null hypothesis.
  e.g., The average number of TV sets in U.S. homes is not equal
  to 3 (H1: μ ≠ 3)
It never contains the "=", "≤" or "≥" sign.
It may or may not be supported by the data.
It is generally the hypothesis that the researcher is trying to
support.
Hypothesis Testing Process

Claim: the population mean age is 50 (null hypothesis:
H0: μ = 50). Now select a random sample from the population.
Suppose the sample mean age is X̄ = 20. Is X̄ = 20 likely if
μ = 50? If not likely, REJECT the null hypothesis.
Level of Significance and the Rejection Region

Level of significance = α. The rejection region is marked off by
critical values.

Two-tail test:    H0: μ = 3, H1: μ ≠ 3 — α/2 in each tail
Lower-tail test:  H0: μ ≥ 3, H1: μ < 3 — α in the lower tail
Upper-tail test:  H0: μ ≤ 3, H1: μ > 3 — α in the upper tail
Hypothesis Testing

If we know that some data come from a certain distribution but the
parameter is unknown, we might try to predict what the parameter
is. Hypothesis testing is about working out how likely our
predictions are.
We then perform a test to decide whether or not we should reject
the null hypothesis in favor of the alternative.
We test how likely it is that the value we were given could have
come from the distribution with this predicted parameter.
A one-tailed test looks for an increase or decrease in the
parameter, whereas a two-tailed test looks for any change in the
parameter (increase or decrease).
We can perform the test at any level (usually 1%, 5% or 10%). For
example, performing the test at a 5% level means that there is a
5% chance of wrongly rejecting H0.
If we perform the test at the 5% level and decide to reject the
null hypothesis, we say "there is significant evidence at the 5%
level to suggest the hypothesis is false".
Hypothesis Testing Example

Test the claim that the true mean # of TV sets in US homes is
equal to 3. (Assume σ = 0.8)

1. State the appropriate null and alternative hypotheses
   H0: μ = 3, H1: μ ≠ 3 (this is a two-tail test)
2. Specify the desired level of significance and the sample size
   Suppose that α = 0.05 and n = 100 are chosen for this test
Hypothesis Testing Example (continued)

3. Determine the appropriate technique
   σ is known, so this is a Z test.
4. Determine the critical values
   For α = 0.05 the critical Z values are ±1.96
5. Collect the data and compute the test statistic
   Suppose the sample results are n = 100, X̄ = 2.84
   (σ = 0.8 is assumed known). So the test statistic is:

   Z = (X̄ − μ) / (σ/√n) = (2.84 − 3) / (0.8/√100)
     = −0.16 / 0.08 = −2.0
Hypothesis Testing Example (continued)

6. Is the test statistic in the rejection region?

With α/2 = 0.025 in each tail, reject H0 if Z < −1.96 or Z > 1.96;
otherwise do not reject H0.

Here, Z = −2.0 < −1.96, so the test statistic is in the rejection
region.
Hypothesis Testing Example (continued)

6 (continued). Reach a decision and interpret the result.

Since Z = −2.0 < −1.96, we reject the null hypothesis and conclude
that there is sufficient evidence that the mean number of TVs in
US homes is not equal to 3.
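The whole two-tail Z test above fits in a few lines of Python:

```python
import math

mu0, sigma, n, x_bar = 3, 0.8, 100, 2.84
z_crit = 1.96  # two-tail critical value at alpha = 0.05

# Test statistic: Z = (X-bar - mu0) / (sigma / sqrt(n))
z = (x_bar - mu0) / (sigma / math.sqrt(n))   # -0.16 / 0.08 = -2.0

# Reject H0 if the statistic falls in either tail
reject = abs(z) > z_crit
print(round(z, 2), reject)  # -2.0 True
```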
One-Tail Tests

In many cases, the alternative hypothesis focuses on a particular
direction.

H0: μ ≥ 3, H1: μ < 3 — this is a lower-tail test, since the
alternative hypothesis is focused on the lower tail below the mean
of 3.

H0: μ ≤ 3, H1: μ > 3 — this is an upper-tail test, since the
alternative hypothesis is focused on the upper tail above the mean
of 3.
Example: Upper-Tail Z Test for Mean (σ Known)

A phone industry manager thinks that customer monthly cell phone
bills have increased, and now average over $52 per month. The
company wishes to test this claim. (Assume σ = 10 is known.)

Form the hypothesis test:
H0: μ ≤ 52   the average is not over $52 per month
H1: μ > 52   the average is greater than $52 per month
             (i.e., sufficient evidence exists to support the
             manager's claim)

Suppose that α = 0.10 is chosen for this test.

Find the rejection region: reject H0 if Z > 1.28.
Review: One-Tail Critical Value

What is Z given α = 0.10? We need the Z value with area 0.90 below
it.

Standardized normal distribution table (portion):

Z      .07     .08     .09
1.1    .8790   .8810   .8830
1.2    .8980   .8997   .9015
1.3    .9147   .9162   .9177

P(Z < 1.28) ≈ 0.8997 ≈ 0.90, so the critical value is 1.28.
t Test of Hypothesis for the Mean (σ Unknown)

Hypothesis tests for μ: if σ is known, use a Z test; if σ is
unknown, use a t test.

Convert the sample statistic (X̄) to a t test statistic. The test
statistic is:

t(n−1) = (X̄ − μ) / (S/√n)
Example: Two-Tail Test (σ Unknown)

The average cost of a hotel room in New York is said to be $168
per night. A random sample of 25 hotels resulted in X̄ = $172.50
and S = $15.40. Test at the α = 0.05 level. (Assume the population
distribution is normal.)

H0: μ = 168
H1: μ ≠ 168
Example Solution: Two-Tail Test

H0: μ = 168
H1: μ ≠ 168

α = 0.05, n = 25, and σ is unknown, so use a t statistic.
Critical values: t24 = ±2.0639 (α/2 = .025 in each tail)

t(n−1) = (X̄ − μ) / (S/√n) = (172.50 − 168) / (15.40/√25) = 1.46

Since −2.0639 < 1.46 < 2.0639, do not reject H0: there is not
sufficient evidence that the true mean cost is different from
$168.
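The hotel-room t test in Python, again taking the critical value 2.0639 from the t table rather than computing it:

```python
import math

mu0, x_bar, s, n = 168, 172.50, 15.40, 25
t_crit = 2.0639  # two-tail critical value, d.f. = 24, alpha = 0.05

# Test statistic: t = (X-bar - mu0) / (S / sqrt(n))
t_stat = (x_bar - mu0) / (s / math.sqrt(n))   # 4.5 / 3.08

reject = abs(t_stat) > t_crit
print(round(t_stat, 2), reject)  # 1.46 False
```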
Errors in Making Decisions

Type I Error
  Reject a true null hypothesis
  Considered a serious type of error
  The probability of Type I Error is α
    Called the level of significance of the test
    Set by the researcher in advance
Errors in Making Decisions (continued)

Type II Error
  Fail to reject a false null hypothesis
  The probability of Type II Error is β
Type II Error

In a hypothesis test, a Type II error occurs when the null
hypothesis H0 is not rejected when it is in fact false.
Suppose we do not reject H0: μ ≥ 52 when in fact the true mean is
μ = 50.
Here, β = P(X̄ ≥ cutoff) if μ = 50.

[Figure: two normal curves centered at 50 and 52; sample means
below the cutoff reject H0: μ ≥ 52, those above do not; β is the
area of the μ = 50 curve above the cutoff]
Calculating β

Suppose n = 64, σ = 6, and α = 0.05.

cutoff = X̄ = μ − Z σ/√n = 52 − 1.645 (6/√64) = 50.766
(for H0: μ ≥ 52)

So β = P(X̄ ≥ 50.766) if μ = 50.

[Figure: reject H0: μ ≥ 52 below the cutoff 50.766; do not reject
above it; curves centered at 50 and 52]
Calculating β and Power of the Test (continued)

Suppose n = 64, σ = 6, and α = 0.05.

P(X̄ ≥ 50.766 | μ = 50) = P( Z ≥ (50.766 − 50) / (6/√64) )
                        = P(Z ≥ 1.02) = 1.0 − 0.8461 = 0.1539

Probability of Type II error: β = 0.1539
Power = 1 − β = 0.8461

The probability of correctly rejecting a false null hypothesis is
0.8461.
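The β and power calculation can be checked numerically; computing with the unrounded z (the slides round z to 1.02 before the table lookup) gives values within about 0.0005 of those above:

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu0, mu_true, sigma, n = 52, 50, 6, 64
z_alpha = 1.645  # lower-tail critical value at alpha = 0.05
se = sigma / math.sqrt(n)

# Cutoff below which H0: mu >= 52 is rejected
cutoff = mu0 - z_alpha * se   # 50.766

# beta = P(X-bar >= cutoff) when the true mean is 50
beta = 1 - phi((cutoff - mu_true) / se)
power = 1 - beta
print(round(cutoff, 3), round(beta, 4), round(power, 4))
```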
p-value

The probability value (p-value) of a statistical hypothesis test
is the probability, assuming the null hypothesis is true, of
obtaining a result at least as extreme as the one observed.
It is equal to the significance level of the test at which we
would only just reject the null hypothesis.
The p-value is compared with the chosen significance level of our
test and, if it is smaller, the result is significant. If the null
hypothesis were to be rejected at the 5% significance level, this
would be reported as "p < 0.05".
Small p-values suggest that the null hypothesis is unlikely to be
true. The smaller it is, the more convincing is the rejection of
the null hypothesis.
p-Value Example

Example: How likely is it to see a sample mean of 2.84 (or
something further from the mean, in either direction) if the true
mean is μ = 3.0? (n = 100, σ = 0.8)

X̄ = 2.84 is translated to a Z score of Z = −2.0

P(Z ≤ −2.0) = 0.0228
P(Z ≥ 2.0) = 0.0228

p-value = 0.0228 + 0.0228 = 0.0456
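The two-tail p-value above can be computed directly (the exact value is about 0.0455; the slide's 0.0456 reflects table rounding of 0.0228 per tail):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu0, sigma, n, x_bar = 3.0, 0.8, 100, 2.84
z = (x_bar - mu0) / (sigma / math.sqrt(n))   # -2.0

# Two-tail p-value: probability of a result at least this extreme
p_value = 2 * phi(-abs(z))
print(round(z, 2), round(p_value, 4))
```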
p-Value Example (continued)

Compare the p-value with α:
  If p-value < α, reject H0
  If p-value ≥ α, do not reject H0

Here: p-value = 0.0456 and α = 0.05.
Since 0.0456 < 0.05, we reject the null hypothesis.