Hypothesis Testing
GTECH 201
Lecture 16
Overview of Today’s Topic

Formulation

Evaluation

Refining and Restating

Statistical Tests
What is a Hypothesis?





Unproven or unsubstantiated statement
You need to know the literature before you
can formulate a hypothesis statement
Data collection should support hypothesis
testing and evaluation
If hypothesis is tested and found to be
correct, then results can be refined (different
scenarios can be tested)
If partially correct, then hypothesis statement
needs to be refined (reworded)
Hypothesis Testing

Multi-step procedure that leads the researcher
from the hypothesis statement to the decision
regarding the hypothesis
6-step process

1. State null and alternate hypotheses
2. Select appropriate statistical test
3. Select level of significance
4. Delineate regions of rejection and nonrejection of hypotheses
5. Calculate test statistic
6. Make decision regarding null hypothesis
Step 1


State null and alternate hypotheses
Null hypothesis



A hypothesis to be tested
Usually represented as H0
Alternative hypothesis


A hypothesis considered as an alternate to the
null hypothesis
Usually represented as HA
Guidelines for Setting up H0, HA

Hypothesis tests concerning one parameter
Population mean, μ

A null hypothesis for a hypothesis test
concerning a population mean should always
specify a single value for that parameter, μ0
An equals (=) sign must appear in the null hypothesis
Therefore: H0: μ = μ0
H0: μ = μH
H0: μ − μH = 0
Guidelines, part 2

Alternative hypothesis



The choice of the alternative hypothesis
depends on and should reveal the purpose of
the hypothesis test
Null hypothesis and alternative hypothesis are
mutually exclusive
Three choices are possible

HA: μ ≠ μ0 (nondirectional)
HA: μ < μ0 (directional)
HA: μ > μ0 (directional)
Guidelines, part 3





HA: μ ≠ μ0
A test whose alternate hypothesis has a ≠ sign is called a
two-tailed test
The population mean, μ, is different from a
specified value, μ0
When a < sign appears in the alternate
hypothesis, the test is called a left-tailed test
When a > sign appears in the alternate
hypothesis, the test is called a right-tailed test
Setting up Hypotheses

A snack food company produces 454 g bags of
pretzels. Although the actual weights deviate slightly
from 454 g and vary from one bag to another,
the quality control team insists that the mean net
weight of the bags be maintained at 454 g. If the mean
net weight of the bags is lower or higher, it is likely to
cause problems.

If you work for the quality control team and you want
to decide whether the packaging machine is working
properly, how would you set up a hypothesis test?
Stating Hypotheses

H0: μ = 454 g
The packaging machine IS working properly
HA: μ ≠ 454 g
The packaging machine IS NOT working properly
Select Appropriate Test


One sample difference of means t test
Objective
Compare a random sample mean to a
population mean for difference
Requirements and assumptions
Random sample
Normally distributed population
Variable is measured at interval or ratio scale
Hypotheses
Test Statistic
Test Statistic
Test statistic = (X̄ − μ) / σ_X̄

X̄ = sample mean
μ = population mean
σ_X̄ = standard error of the mean
σ = population standard deviation
Level of Significance

α = 0.10 (90%); 0.05 (95%); 0.01 (99%)
Errors


Type I error: Rejecting the null hypothesis
when it is in fact true
Type II error: Not rejecting the null hypothesis
when it is in fact false
                                Null hypothesis is TRUE    Null hypothesis is FALSE
Do not reject null hypothesis   correct decision           type II error
Reject null hypothesis          type I error               correct decision
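One way to see what a Type I error means in practice is to simulate many samples from a population where the null hypothesis really is true and count how often it gets rejected. The sketch below is an illustration under stated assumptions: a hypothetical normal population with known σ, a two-tailed z test, and made-up defaults for the sample size, number of trials, and random seed.

```python
import math
import random
from statistics import NormalDist

def type_one_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Estimate how often a TRUE null hypothesis is rejected."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0  # hypothetical population; H0 is true by construction
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials

print(type_one_error_rate())
```

The printed proportion should land close to the chosen α, which is the sense in which the level of significance is the probability of a Type I error.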
Identify Regions of Rejection

Of null hypothesis





Two-tailed
Left tailed (directional)
Right tailed (directional)
Calculate test statistic
Make decision regarding null or alternate
hypothesis
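The three kinds of rejection region can be sketched as critical values on the standard normal curve. This is a minimal illustration, assuming a z-based test; the function name and the string labels for the tails are hypothetical. For a small-sample t test the same logic applies with t critical values instead.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return (lower, upper) critical z values; None means no bound on that side.

    tail is 'two', 'left', or 'right' (labels chosen for this sketch).
    """
    z = NormalDist()  # standard normal distribution
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return (-c, c)
    if tail == "left":
        return (z.inv_cdf(alpha), None)  # all of alpha in the left tail
    if tail == "right":
        return (None, z.inv_cdf(1 - alpha))  # all of alpha in the right tail
    raise ValueError("tail must be 'two', 'left', or 'right'")

for tail in ("two", "left", "right"):
    print(tail, critical_values(0.05, tail))
```

The null hypothesis is rejected when the calculated test statistic falls below the lower value or above the upper value.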
To Work in Class







We want to investigate demographic change in an
area
3500 households (HH)
You take a sample of 250 HH
Sample mean = 2.68; sample variance = 4.3
α = 0.10 (90%)
Now, we want to find out if the mean HH size in
this one area is typical or representative of the
national mean household size (2.61)
Use the six-step process to compare how closely
the sample that you have taken compares with the
national average HH size of 2.61
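One possible way to work through this exercise is sketched below. It rests on several assumptions that the slide does not state: a two-tailed alternative (the area's mean differs from 2.61), the normal approximation for the critical value (reasonable for n = 250), the sample standard deviation standing in for σ, and no finite-population correction for the 3500 households.

```python
import math
from statistics import NormalDist

n = 250
sample_mean = 2.68
sample_variance = 4.3
national_mean = 2.61  # value of the population mean under H0
alpha = 0.10

# Steps 1-3: H0: mu = 2.61, HA: mu != 2.61 (assumed two-tailed), alpha = 0.10
# Step 4: rejection region from the standard normal curve (assumption)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: test statistic, with the sample standard deviation used in place of sigma
standard_error = math.sqrt(sample_variance) / math.sqrt(n)
z = (sample_mean - national_mean) / standard_error

# Step 6: decision regarding the null hypothesis
reject = abs(z) > z_crit
print(f"z = {z:.3f}, critical value = ±{z_crit:.3f}, reject H0: {reject}")
```

Under these assumptions the statistic is about 0.53, well inside the nonrejection region bounded by ±1.645, so the null hypothesis is not rejected.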
Limits of Hypothesis Testing

Pre-selecting level of significance




Lacks a theoretical basis
Used for convenience
Binary nature of null and alternative
hypothesis
P-value or Probability value


Accepted approach
The exact significance level associated with
the calculated test statistic is determined
More About P-Value

We can define P-value as:


The exact probability of getting a test
statistic value of a given magnitude, IF the
null hypothesis is true
What is the probability of making a Type I
error

Type I error occurs when the null hypothesis is
rejected using the hypothesis testing procedure,
even though in reality the null hypothesis is true
Comparing Classical and P Value Approaches

Classical






State hypotheses
Decide on significance level
Select test
Delineate regions of rejection/nonrejection
Calculate the test statistic
State your conclusion in words

P-Value

State hypotheses
Decide on significance level
Compute the value of the test statistic
Determine P-value
If P ≤ α, reject null hypothesis; otherwise do not reject
State your conclusion in words
Guidelines for Using P-Value
P-value            Evidence against H0
P > 0.10           Weak or none
0.05 < P ≤ 0.10    Moderate
0.01 < P ≤ 0.05    Strong
P ≤ 0.01           Very strong
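These guidelines translate directly into a small lookup. The sketch below is illustrative only: the function name is hypothetical, and the treatment of values that fall exactly on a boundary (0.10, 0.05, 0.01) is an assumption that places each boundary in the stronger-evidence category.

```python
def evidence_against_h0(p):
    """Map a P-value to the strength of evidence against the null hypothesis."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    if p > 0.10:
        return "weak or none"
    if p > 0.05:
        return "moderate"
    if p > 0.01:
        return "strong"
    return "very strong"

for p in (0.20, 0.07, 0.03, 0.001):
    print(p, evidence_against_h0(p))
```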
Example


A random sample of 18 people with income
below the poverty level reveals their daily intake
of calcium

mean 747.4 mg

standard deviation 188 mg
Use the P-value approach to determine whether
the data provides sufficient evidence at the 5%
significance level to conclude that the mean
calcium intake of all Americans with income
below the poverty level is less than the required
daily allowance of 800 mg
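A sketch of how this example could be worked with the P-value approach follows. It assumes a left-tailed one-sample t test (H0: μ = 800 mg, HA: μ < 800 mg) with n − 1 = 17 degrees of freedom, and it treats 188 mg as the sample standard deviation. To keep the example self-contained, the t distribution's cumulative probability is approximated by numerical integration; the helper names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

n, sample_mean, sample_sd = 18, 747.4, 188.0
mu0, alpha = 800.0, 0.05

t_stat = (sample_mean - mu0) / (sample_sd / math.sqrt(n))
p_value = t_cdf(t_stat, df=n - 1)  # left-tailed test: P(T <= t_stat)
reject = p_value <= alpha
print(f"t = {t_stat:.3f}, P-value = {p_value:.3f}, reject H0: {reject}")
```

Under these assumptions the P-value comes out above 0.10, so at the 5% significance level the data do not provide sufficient evidence that the mean calcium intake is below 800 mg.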
Parametric and
Nonparametric Tests

Parametric tests


Require knowledge about population parameters
Assumptions made about population distribution



E.g., population is normally distributed
Sample data measured on Interval/Ratio scale
Non-parametric tests



Require no knowledge about population
parameters
Distribution-free
Some non-parametric tests are designed to be
applied to nominal and ordinal data (χ²)
– we will talk about these in the next lecture
Choices/Options




Run only a parametric test
Run only a non-parametric test
Run both tests
Goal




State the problem
Decide what inferential technique will be useful
Identify formulae associated with the technique
Interpret the results