October 19th lecture

Don’t cry because it is all
over, smile because it
happened 
Potential Problems in
Sampling
Poor Sampling Frame
Cost of Sampling
Built-In Bias
Cost of Sampling
Money
Time
Wide Geographic Region
Major Errors in
Sampling
Bias:
Consistent, repeated divergence in the
same direction of a sample statistic from
its associated population parameter.
Lack of Precision:
Large theoretical variation in a sample
statistic
Sampling Error
The difference between the sample statistic
and its corresponding population
parameter.
Population:
97, 103, 96, 99, 105
(Mean = 100)
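A minimal sketch of the sampling error for this population, assuming a hypothetical sample of three values (the sample is an illustration and does not come from the lecture):

```python
from statistics import mean

# Population values from the slide; the parameter of interest is the mean.
population = [97, 103, 96, 99, 105]
mu = mean(population)

# Hypothetical sample of three values drawn from the population (assumption).
sample = [97, 96, 99]
x_bar = mean(sample)

# Sampling error: sample statistic minus the corresponding population parameter.
sampling_error = x_bar - mu
print(f"population mean = {mu}, sample mean = {x_bar:.2f}, "
      f"sampling error = {sampling_error:.2f}")
```

A different sample would give a different sampling error, which is why a single sample statistic rarely equals the parameter exactly.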
Non-Sampling Errors
Survey Timing
Survey Mode
Interviewer – Subject Relationship
Survey Topic
Question Wording
Question Sequence
Statistical Significance
An observed effect so large that
it would rarely occur by
chance.
Hypothesis Testing
What is a Hypothesis?
A statement about the value of a
population parameter developed for
the purpose of testing.
Hypothesis Testing
What is Hypothesis Testing?
A procedure, based on sample
evidence and probability theory,
used to determine whether the
hypothesis is a reasonable
statement and should not be
rejected or is unreasonable and
should be rejected.
Hypothesis Testing
Examples of hypotheses made about a
population parameter are:
• The mean monthly income for systems analysts is $3,625.
• Twenty percent of all juvenile offenders are caught and sentenced to prison.
Hypothesis Testing
Null Hypothesis H0:
A statement about the value of a
population parameter.
Alternative Hypothesis H1:
A statement that is accepted if the
sample data provide evidence that
the null hypothesis is false.
Hypothesis Testing
Level of Significance:
The probability of rejecting the null
hypothesis when it is actually true.
Hypothesis Testing
Statistical testing is often done by
testing a hypothesis that you expect
to reject.
Null Hypothesis
Null Hypothesis H0: A statement about
the value of a population parameter.
Stating the current fact(s).
[Figure: Graphic Representation of the Population]
Alternative Hypothesis
Alternative (Research) Hypothesis H1:
A statement that is accepted if the
sample data provide evidence that
the null hypothesis is false.
[Figure: Graphic Representation of a Large Sample]
[Figure: Graphic Representation of the Population & Sample, each plotted on a Z axis]
Statistics! Statistics! Statistics!
Finish the Maze and we get to take a break!
Testing a Hypothesis
[Figure: sampling distribution with both tails labeled]
Testing a Hypothesis
One Tailed Test, .05 level of significance
More than; greater than; larger than; etc.
[Figure: Null Hypothesis Area with the Critical Region (.05 area) in the upper tail; Critical Value Z = 1.645]
One Tailed Test, .05
Smaller than; less than; etc.
[Figure: Null Hypothesis Area with the Critical Region in the lower tail; Critical Z = -1.645]
Two Tailed Test, .05
Not equal to; different than
[Figure: Null Hypothesis Area with a Critical Region in each tail; Critical Z = -1.96 and Critical Z = +1.96]
[Figure: Graphic Representation of Hypothesis Test Results]
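The critical Z values on these slides come from a standard normal table. As a rough check, they can also be computed with `NormalDist` from Python's standard library; the variable names below are illustrative only:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
alpha = 0.05               # level of significance used on the slides

# One-tailed test, "greater than": all of alpha sits in the upper tail.
upper_tail = std_normal.inv_cdf(1 - alpha)

# One-tailed test, "less than": all of alpha sits in the lower tail.
lower_tail = std_normal.inv_cdf(alpha)

# Two-tailed test, "not equal to": alpha is split evenly between both tails.
two_tail = std_normal.inv_cdf(1 - alpha / 2)

print(round(upper_tail, 3), round(lower_tail, 3), round(two_tail, 3))
```

The results agree with the 1.645 and 1.96 values shown in the figures, with the sign of the one-tailed value depending on which tail holds the critical region.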
This maze is longer than I thought.
Go Ahead and take a break!
Hypothesis Testing
State the null and alternative hypotheses
Select a level of significance
Formulate a decision rule
Identify the test statistic
Take a sample, arrive at a decision
(Reject or fail to reject the null)
Test for Sample Means
$$Z = \frac{\bar{X} - \mu}{s / \sqrt{N}}$$
$\bar{X}$ = Sample mean
μ = Hypothesized population mean
s = Sample standard deviation
N = Sample size
One Sample Mean Problem
A recent article in Vitality magazine reported
that the mean amount of leisure time per
week for American men is 40.0 hours. You
believe this figure is too large and decide to
conduct your own test. In a random sample
of 60 men, you find that the mean is 37.8
hours of leisure per week with a standard
deviation of 12.2 hours. Can you conclude
that the data in the article is too large? Use
the .05 significance level.
Step 1
State the null and alternative hypotheses.
H0: Mean = 40.0 hours
H1: Mean < 40.0 hours
Step 2
Select a level of significance.
This will be given to you. In this
problem, it is .05.
Step 3
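Establish the critical region by converting the level of
significance to a Z score.
.5000 - .0500 = .4500, which corresponds to Z = 1.64 in the standard normal table.
Because H1 is "less than", the critical region is the lower tail:
if the test statistic falls below -1.64, the null
hypothesis will be rejected.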
Step 4
Identify the test statistic.
$$Z = \frac{37.8 - 40.0}{12.2 / \sqrt{60}} = \frac{-2.2}{12.2 / 7.75} = \frac{-2.2}{1.57}$$
Test Statistic: Z = -1.40
Step 5
Arrive at a decision.
The test statistic falls in the null
hypothesis region. Therefore, we fail to
reject the null.
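The five steps for the leisure-time problem can be sketched in Python as follows. The function name `one_sample_z` is a hypothetical helper, not something from the lecture, and the critical value is computed rather than read from a table, so it comes out near -1.645 instead of the rounded -1.64:

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, hypothesized_mean, sample_sd, n):
    """Z statistic for a one-sample test of a mean (hypothetical helper)."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Values from the problem: H0: mean = 40.0, H1: mean < 40.0, alpha = .05.
z = one_sample_z(sample_mean=37.8, hypothesized_mean=40.0, sample_sd=12.2, n=60)

# Lower-tail critical value for a "less than" alternative.
critical = NormalDist().inv_cdf(0.05)

reject = z < critical
print(f"Z = {z:.2f}, critical Z = {critical:.3f}, reject H0: {reject}")
```

The computed statistic does not fall below the critical value, which matches the decision on the slide.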
Test for Two Sample Means
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
$\bar{X}_i$ = Mean for group i
$s_i$ = Standard deviation for group i
$n_i$ = Number in group i
Two Sample Means Problem
The board of directors at the Anchor Pointe Marina
is studying the usage of boats among its
members. A sample of 30 members who have
boats 10 to 20 feet in length showed that they
used their boats an average of 11 days last July.
The standard deviation of the sample was 3.88
days. For a sample of 40 members with boats 21
to 40 feet in length, the average number of days
they used their boats in July was 7.67 with a
standard deviation of 4.42 days. At the .02
significance level, can the board of directors
conclude that those with the smaller boats use
their crafts more frequently?
Step 1
State the null and alternative hypotheses.
H0: Large boat usage = small boat usage
H1: Smaller boat usage > large boat usage
Step 2
Select a level of significance.
This will be given to you. In this
problem it is .02.
Step 3
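Formulate a decision rule.
.5000 - .0200 = .4800, which corresponds to Z = 2.05 in the standard normal table.
If the test statistic exceeds 2.05, the null hypothesis will be rejected.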
Step 4
Identify the test statistic.
$$Z = \frac{11 - 7.67}{\sqrt{\frac{3.88^2}{30} + \frac{4.42^2}{40}}} = 3.35$$
Step 5
Arrive at a decision.
The test statistic falls in the critical region,
therefore we reject the null.
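The marina problem can be checked the same way. This is a sketch only; `two_sample_z` is a hypothetical helper name, and the critical value is computed directly instead of being looked up in a table:

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for the difference between two sample means (hypothetical helper)."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

# Group 1: boats 10 to 20 feet; group 2: boats 21 to 40 feet.
z = two_sample_z(mean1=11, sd1=3.88, n1=30, mean2=7.67, sd2=4.42, n2=40)

# Upper-tail critical value at the .02 significance level.
critical = NormalDist().inv_cdf(1 - 0.02)

reject = z > critical
print(f"Z = {z:.2f}, critical Z = {critical:.2f}, reject H0: {reject}")
```

The statistic exceeds the critical value, in line with the decision to reject the null.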
p-Value in Hypothesis Testing
• p-Value: The probability, assuming that the null hypothesis is true, of getting a value of the test statistic at least as extreme as the computed value for the test.
• If the p-value area is smaller than the significance level, H0 is rejected.
• If the p-value area is larger than the significance level, H0 is not rejected.
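As an illustration of these rules, the p-values for the two worked problems (Z = -1.40 and Z = 3.35) can be approximated with the standard normal distribution. The variable names are illustrative assumptions:

```python
from statistics import NormalDist

std_normal = NormalDist()

# Leisure-time problem: lower-tail test, Z = -1.40, significance level .05.
p_leisure = std_normal.cdf(-1.40)

# Marina problem: upper-tail test, Z = 3.35, significance level .02.
p_marina = 1 - std_normal.cdf(3.35)

print(f"leisure p-value = {p_leisure:.4f}, reject H0: {p_leisure < 0.05}")
print(f"marina p-value  = {p_marina:.4f}, reject H0: {p_marina < 0.02}")
```

The first p-value is larger than its significance level and the second is smaller, so the p-value approach leads to the same decisions as the critical-value approach in both problems.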
Statistical Significance
p-Value: The probability of getting a sample
outcome at least as far from what we would expect
as the one observed, if the null hypothesis is true.
The smaller the p-value, the stronger the
evidence that the null hypothesis is false.
Statistical Significance
P-values can be determined by
- computing the z-score
- using the standard normal table
The null hypothesis can be rejected if the p-value is small enough.
[Figure: P-Value, with Z = 1.64 and Z = 2.05 marked on the Z axis]