Statistics Quick Overview
Class #3
Copyright by Michael S. Watson, 2012
A/B Testing in Obama’s 2008 Campaign

Objective: Maximize Sign-Up Rate
Source: http://www.youtube.com/watch?v=7xV7dlwMChc
Copyright by Michael S. Watson, 2012; Slides from Managerial Statistics book
So, What is Your Guess?
A/B Testing for On-Line Businesses

What is it?

Develop two versions of a page
Randomly show users different versions
Track how they do
Uses statistics to decide which is better
Answers yes/no questions

Why?

You have the data to do it
Web sites convert a small number of users
Some see a 40% increase in conversion
Source: Ben Tilly
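
The A/B loop described on this slide (two versions, random assignment, tracking) can be sketched in a few lines of Python. This is only a simulation under stated assumptions: the function name `run_ab_test`, the user count, and the "true" sign-up rates of 8% and 11% are hypothetical and are not taken from the slides.

```python
import random

def run_ab_test(n_users, true_rates, seed=0):
    """Randomly show each simulated user version A or B and track sign-ups."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    shown = {"A": 0, "B": 0}
    converted = {"A": 0, "B": 0}
    for _ in range(n_users):
        version = rng.choice(["A", "B"])      # random assignment to a version
        shown[version] += 1
        if rng.random() < true_rates[version]:  # simulated sign-up (hypothetical rates)
            converted[version] += 1
    rates = {v: converted[v] / shown[v] if shown[v] else 0.0 for v in shown}
    return shown, converted, rates

shown, converted, rates = run_ab_test(10_000, {"A": 0.08, "B": 0.11})
for version in ("A", "B"):
    print(version, shown[version], converted[version], round(rates[version], 3))
```

Deciding whether the observed difference is real or just noise is what the rest of the deck (standard errors, confidence intervals, hypothesis tests) is about.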
Some Lessons from A/B Testing

Explore before you refine

Example: ABC Family:
− Existing Website: Promotions for upcoming shows
− Radical Idea: People come to the website looking for old episodes
Result: +600% engagement
Some Lessons from A/B Testing

Words Matter, Call to action

Which button led to the biggest increase in donations?

Trick question. Depended on what campaign knew!
Thought Exercise with Our Packaging Example
Original Case (mean = 290, sd = 53)
If a store manager came to you and
said, “what will my sales be?” how
would you answer?
If CEO came to you and said, “what
will average sales be?” how would
you answer?
Less Variability (m = 290, sd = 5)
More Variability (m = 290, sd = 186)
Thought Exercise II- We Doubled The Samples
(mean = 290, sd = 53)
(mean = 290, sd = 53)
What do you think of these questions now?
If a store manager came to you and
said, “what will my sales be?” how
would you answer?
If CEO came to you and said, “what
will average sales be?” how would
you answer?
Sampling Distribution– Many times we are sampling
a population and need to find the true mean

The mean of the sample is denoted by $\bar{X}$

$\bar{X}$ estimates the true mean, µ

Is it a ‘good’ estimator?

It depends on a few things

The standard deviation of the population
The sample size
The distribution of the population (sometimes)
A good random sample and maybe a little luck
Sampling Distribution

$\bar{X}$ is approximately normally distributed with a mean of µ
and st dev of $\sigma/\sqrt{n}$

Since we never know the actual σ, we approximate it with
the sample standard deviation, s.

$s_{\bar{X}} = s/\sqrt{n}$ is commonly used in statistics

We call this term the standard error of the mean

Let’s see how this applies to our examples
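
As a quick check of the standard error formula, here is a minimal Python sketch. The standard deviation of 53 comes from the packaging example; the sample size of 36 is an assumption inferred from the 35 degrees of freedom quoted on a later slide, and the `sales` list is made-up data for illustration only.

```python
import math
import statistics

def standard_error(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

# Packaging example: s = 53; n = 36 is an assumption (35 degrees of freedom + 1).
se_packaging = standard_error(53, 36)
print(round(se_packaging, 2))  # 8.83

# From raw data (hypothetical weekly sales figures, for illustration only).
sales = [250, 310, 275, 340, 295, 260]
se_sample = standard_error(statistics.stdev(sales), len(sales))  # stdev = sample std dev
print(round(se_sample, 2))
```

Quadrupling the sample size halves the standard error, which is why the CEO's question about average sales gets easier to answer as more stores are sampled.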
Central Limit Theorem– General Idea

$\bar{X}$ is approximately normally distributed with a mean of µ
and st dev of $\sigma/\sqrt{n}$

In other words, as you take various samples, the
collection of these sample means will be approximately normally
distributed

The larger the value of n, the closer to normally distributed
The population data does not have to be normally
distributed
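
A small simulation can make the Central Limit Theorem concrete. The sketch below is an illustration, not part of the original slides: it assumes a skewed exponential population with mean 1 and standard deviation 1, draws many samples of size 30, and compares the spread of the sample means with $\sigma/\sqrt{n}$.

```python
import random
import statistics

def sample_means(n, draws, rng):
    """Draw `draws` samples of size n from a skewed population; return their means."""
    # Exponential population with mean 1 and standard deviation 1 (clearly not normal).
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(draws)
    ]

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 30
means = sample_means(n, 2000, rng)

print("mean of sample means:", round(statistics.fmean(means), 3))  # close to mu = 1
print("sd of sample means:  ", round(statistics.stdev(means), 3))  # close to sigma / sqrt(n)
print("sigma / sqrt(n):     ", round(1 / n ** 0.5, 3))
```

A histogram of `means` would look roughly bell-shaped even though the individual observations are strongly skewed.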
We Have 3 Measures for a Sample of Data

Mean (average)

Standard Deviation (sample standard deviation)

Standard Error of the Mean

Let’s build a confidence interval….
The t-distribution

The t-distribution resembles a standard normal but with
thicker ‘tails’

t-distributions are characterized by a feature called
degrees of freedom

t-distributions with higher degrees of freedom more
closely resemble the standard normal
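
To see the thicker tails numerically, the sketch below compares the area to the right of 2 for several degrees of freedom with the same area under the standard normal. It uses only the Python standard library; the helper `t_cdf` is a hypothetical name and integrates the t density with Simpson's rule, which is one simple choice and not necessarily how Excel computes it.

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, integrating the density with Simpson's rule."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def normal_cdf(x):
    """P(Z <= x) for the standard normal."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

dfs = (1, 5, 35, 1000)
tails = [1 - t_cdf(2.0, df) for df in dfs]
for df, tail in zip(dfs, tails):
    print(f"df = {df:>4}: area right of 2 = {tail:.4f}")
print(f"normal   : area right of 2 = {1 - normal_cdf(2.0):.4f}")
```

The tail area shrinks toward the normal value as the degrees of freedom grow, but stays above it.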
t-distributions with various Degrees of
Freedom
Excel: The t-distribution
The TDIST function requires three inputs

X (the function finds the area to the right of X)
Deg_freedom
Tails (inputting 1 tail finds the area to the right of X, 2 tails reports twice the area)

X must be a positive number
Excel: The inverse t-distribution
The TINV function requires two inputs

Probability
Deg_freedom

The function reports the value, t, that leaves the required probability
in the two tails combined (half of it to the right of t) for a t-dist with the
specified d.f.
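
For readers without Excel, the two functions can be imitated in plain Python. This is a rough sketch, not Excel's implementation: `t_cdf`, `tdist`, and `tinv` are hypothetical helper names, the cumulative probability is found by numerical integration, and `tinv` assumes Excel's documented two-tailed convention for TINV.

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, integrating the density with Simpson's rule."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def tdist(x, df, tails):
    """Like TDIST: area to the right of x (x >= 0), doubled when tails == 2."""
    if x < 0 or tails not in (1, 2):
        raise ValueError("x must be non-negative and tails must be 1 or 2")
    return tails * (1 - t_cdf(x, df))

def tinv(probability, df):
    """Like TINV: the t > 0 whose two-tailed area equals `probability`."""
    lo, hi = 0.0, 1.0
    while tdist(hi, df, 2) > probability:   # widen until the answer is bracketed
        hi *= 2
    for _ in range(60):                     # bisection on the two-tailed area
        mid = (lo + hi) / 2
        if tdist(mid, df, 2) > probability:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_95 = tinv(0.05, 35)
print(round(t_95, 2))                  # 2.03
print(round(tdist(t_95, 35, 2), 3))    # 0.05
```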
Sampling Distribution

$\bar{X}$ is approximately normally distributed with a mean of µ
and st dev of $\sigma/\sqrt{n}$

Since we never know the actual σ, we approximate it with
the sample standard deviation, s.

$t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a t-distribution with n-1 d.f.
Notation

$s_{\bar{X}} = s/\sqrt{n}$ is commonly used in statistics

We call this term the standard error of the mean
Interval Estimates

Our estimate of the true mean sales per store is 290.5

The standard error of the mean is 8.8

What proportion of samples like ours would be within 10
units of the true mean?

We can use the t-distribution to find out
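The Computations
$\text{Prob}(-10 \le \bar{X} - \mu \le 10)$
$= \text{Prob}\left(-10/s_{\bar{X}} \le (\bar{X} - \mu)/s_{\bar{X}} \le 10/s_{\bar{X}}\right)$
$= \text{Prob}(-10/s_{\bar{X}} \le t \le 10/s_{\bar{X}})$, since $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ and $s_{\bar{X}} = s/\sqrt{n}$
$= \text{Prob}(-10/8.8 \le t \le 10/8.8)$
$= \text{Prob}(-1.13 \le t \le 1.13)$
Area between -1.13 and 1.13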
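
The same area can be checked outside Excel. The sketch below is a standard-library approximation with a hypothetical helper `t_cdf`. It uses the unrounded standard error 8.8475 that appears on a later slide, which is what yields 1.13, and the 35 degrees of freedom from the packaging example.

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, integrating the density with Simpson's rule."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

se = 8.8475            # unrounded standard error of the mean (shown as 8.8 above)
df = 35                # degrees of freedom in the packaging example
t_value = 10 / se      # 10 units expressed in standard errors
prob = t_cdf(t_value, df) - t_cdf(-t_value, df)   # area between -t_value and +t_value

print(round(t_value, 2))   # 1.13
print(round(prob, 2))      # 0.73
```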
Where does this fall on t-distribution?
[Figure, not to scale: t-distribution with 35 degrees of freedom; horizontal axis marked at -1.13, 0, and 1.13]
Let’s Do This in Excel

Find the probability of +/- 10 units
Confidence

In this example, we say that we are 73% confident that the true
mean lies within 10 units of our estimate.

We must use the word confidence instead of probability as the
randomness is associated with our estimator and not the true mean
which is not random at all.

Usually, we work backwards from a desired level of confidence and
then find the range of the interval necessary to achieve that level.
95% Confidence Intervals

A 95% confidence interval takes on the form:
$\bar{X} \pm t_{\alpha/2,\,n-1}\, s_{\bar{X}}$

where $t_{\alpha/2,\,n-1}$ is the value needed to generate an area of α/2
in each tail of a t-distribution with n-1 degrees of freedom

Use the Excel formula CONFIDENCE.T for the margin $t_{\alpha/2,\,n-1}\, s_{\bar{X}}$

CONFIDENCE.T uses the following:

Alpha = 1 – Confidence you want
Std Dev = Std Deviation (not the std error of the mean)
Sample = sample size
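
A rough Python counterpart of this calculation is sketched below. The names `t_cdf`, `t_critical`, and `confidence_t` are hypothetical helpers built on the standard library, and the sample size of 36 is an assumption inferred from the 35 degrees of freedom used in the packaging example. The mean 290.5 and standard deviation 53 come from the slides.

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, integrating the density with Simpson's rule."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(alpha, df):
    """Value t with an area of alpha/2 to its right, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1 - t_cdf(hi, df) > alpha / 2:    # widen until the answer is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def confidence_t(alpha, std_dev, size):
    """Margin of error t_{alpha/2, n-1} * s / sqrt(n), in the spirit of CONFIDENCE.T."""
    return t_critical(alpha, size - 1) * std_dev / math.sqrt(size)

mean, s, n = 290.5, 53, 36              # n = 36 is an assumption (35 d.f. + 1)
margin = confidence_t(0.05, s, n)
lower, upper = mean - margin, mean + margin
print(f"95% CI: {lower:.1f} to {upper:.1f} (margin {margin:.1f})")
```

Note that the inputs are the standard deviation and the sample size; the division by $\sqrt{n}$ happens inside the function, matching the warning above about not passing the standard error.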
Test With Sample Data

Divide into groups

Work on one of the data sets

Find the Mean, Std Dev, Std Error of the Mean, and the
95% Confidence Intervals
Hypothesis Testing
Source for Hypothesis Testing: Dr Nicola Ward Petty and Creative Heuristics
Hypothesis Testing
We can say things about
a population from a
sample taken from the
population
Steps of Hypotheses Testing

Hypotheses

Significance

Sample

P-value

Decide
Hypothesis Testing: Step 1: The Hypothesis
H0- Null Hypothesis (everything else or the status quo)
Ha- Alternative Hypothesis (what you want to prove)

We are testing something about the underlying
population parameters

Null includes the equality sign (=, ≥, or ≤)
Test Marketing (Formally)
µ : average sales per week.
H0: µ is equal to or smaller than 275.
Ha: µ is greater than 275.
Hypothesis Testing, Step 2: Significance

Significance, or alpha (α), is generally set to 5%

It is the probability that the Null is rejected when it is really
correct,

Or a Type I Error
Hypothesis Testing: Step 3: Sample
Take a sample and
gather the statistics
about the sample (like
the mean, std dev, std
error of the mean, etc)
Hypothesis Testing, Step 4: P-Value


Different ways to calculate p-value if we are testing one
mean or two

One mean: Will the new packaging have sales greater than 275?

Two means: Is the Blue Package better than the Green Package?
We will start with one mean.

To start, we calculate the test statistic:
$t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$

The value for μ is the value in our Null hypothesis (we are
testing to see if this is the true population value)
Hypothesis Testing: P-Value:
Example with Packaging
Let’s Not Lose Track of the Intuition…

Is 290 larger than 275?


How much larger is 290 than 275 relative to the statistics
we have calculated?


What if sales had to be more than 400, more than 500, more than 320,
would you be comfortable about our hypothesis?
Hint– think about the standard deviation and the standard error of the
mean
How do you feel about our test?
Hypothesis Testing: P-Value:
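If 275 is the true mean (our Null Hypothesis), what is
the chance we drew a sample with an average of 290.54 or more?
[Figure: sampling distribution of the mean centered at 275, with the sample average 290.54 marked; standard error of the mean = 8.8475]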
Hypothesis Testing: P-Value:
Formal Statement Of Problem
µ : average sales per store
H0: µ is less than or equal to 275.
Ha: µ is greater than 275.
Hypothesis Testing: P-Value:
Computations
Test Statistic = $\dfrac{290.54 - 275}{8.8} = 1.76$
Case: When Null is ≤ and the sample mean is higher than
the null value:
P equals 1 minus the cumulative T.DIST function, or the T.DIST.RT function
Let’s test in Excel
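
As a cross-check on the Excel result, the test statistic and the right-tail p-value can be computed with the Python standard library. This is a sketch: `t_cdf` is a hypothetical helper that integrates the t density numerically, and the inputs are the sample mean 290.54, the null value 275, the unrounded standard error 8.8475, and 35 degrees of freedom from the packaging example.

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, integrating the density with Simpson's rule."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

x_bar, mu_0 = 290.54, 275      # sample mean and the value in the null hypothesis
se, df = 8.8475, 35            # standard error of the mean and degrees of freedom

t_stat = (x_bar - mu_0) / se           # how many standard errors above the null value
p_value = 1 - t_cdf(t_stat, df)        # area to the right, like T.DIST.RT

print(round(t_stat, 2))        # 1.76
print(f"{p_value:.1%}")        # 4.4%
```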
Hypothesis Testing Step 5: Decide
How to Use the P-Value
If p > Significance Level, Do Not Reject the Null
If p < Significance Level, Reject the Null
Hypothesis Testing: Decide:
How to Use the P-Value

Low p-value (e.g. 4.4%) means reject the null.

1 minus the p-value is maximum confidence on the
alternative hypothesis.

Average Weekly Sales will exceed 275
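
The decision rule is simple enough to write as a one-line function. In this sketch the name `decide` is hypothetical, and treating a p-value exactly equal to the significance level as "do not reject" is an assumption, since the slides only cover the strict inequalities.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject the null only when p is below alpha."""
    # Assumption: a p-value exactly equal to alpha is treated as "do not reject".
    return "reject the null" if p_value < alpha else "do not reject the null"

print(decide(0.044))   # reject the null
print(decide(0.268))   # do not reject the null
```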
Sales Distribution– How far away is 290 if the real mean is
275?
H0: µ is less than or = 275.
Ha: µ is greater than 275.
[Figure, not drawn to scale: t-distribution with the test statistic 1.7575 marked; area to its right = 4.4%]
Sales Distribution– How far away is 290 if the real mean is
285?
H0: µ is less than or = 285.
Ha: µ is greater than 285.
[Figure, not drawn to scale: t-distribution with the test statistic 0.6278 marked; area to its right = 26.8%]
Sales Distribution– How far away is 290 if the real mean is
265?
H0: µ is less than or = 265.
Ha: µ is greater than 265.
[Figure, not drawn to scale: t-distribution with the test statistic 2.89 marked; area to its right = 0.3%]
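
The three cases above differ only in the null value, so they can be reproduced in one loop. This is a standard-library sketch with a hypothetical helper `t_cdf`. It assumes the sample mean 290.54, standard error 8.8475, and 35 degrees of freedom from the packaging example, so the test statistics may differ from the slide values in the third decimal because of rounding in the inputs.

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, integrating the density with Simpson's rule."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

x_bar, se, df = 290.54, 8.8475, 35

results = {}
for mu_0 in (275, 285, 265):                 # the three null values on the slides
    t_stat = (x_bar - mu_0) / se
    p_value = 1 - t_cdf(t_stat, df)          # area to the right of the test statistic
    results[mu_0] = (t_stat, p_value)
    print(f"null mean {mu_0}: t = {t_stat:.2f}, p = {p_value:.1%}")
```

The closer the null value sits to the sample mean, the larger the p-value: the same sample rejects 265 and 275 at the 5% level but cannot reject 285.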