GM07: RESEARCH METHODOLOGY
UNIT 4-A: DATA ANALYSIS and REPORTING
1. Frequency Distribution
2. Cross Tabulation
3. Hypothesis Testing
Organizing Numerical Data
• Data in raw form (as collected):
24, 26, 24, 21, 27, 27, 30, 41, 32, 38
• Data in ordered array from smallest to largest:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
• Stem-and-leaf display:
2 144677
3 028
4 1
Frequency Distribution
• In a frequency distribution, one variable is
considered at a time.
• A frequency distribution for a variable produces a
table of frequency counts, percentages, and
cumulative percentages for all the values
associated with that variable.
Organizing Numerical Data
[Flow diagram: raw numerical data (e.g., 41, 24, 32, 26, 27, 27, 30, 24, 38, 21) is arranged into an ordered array (21, 24, 24, 26, 27, 27, 30, 32, 38, 41), then summarized as a stem-and-leaf display, as frequency distributions (presented as histograms, polygons, and tables), and as cumulative distributions (presented as an ogive).]
Tabulating Numerical Data: Frequency
Distributions
• Sort raw data in ascending order:
  12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find range: 58 - 12 = 46
• Select number of classes: 5 (usually between 5 and 15)
• Compute class interval (width): 10 (46/5, rounded up)
• Determine class boundaries (limits): 10, 20, 30, 40, 50, 60
• Compute class midpoints: 15, 25, 35, 45, 55
• Count observations & assign to classes
Frequency Distributions, Relative Frequency
Distributions and Percentage Distributions
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Class              Frequency   Relative Frequency   Percentage
10 but under 20        3              .15                15
20 but under 30        6              .30                30
30 but under 40        5              .25                25
40 but under 50        4              .20                20
50 but under 60        2              .10                10
Total                 20             1.00               100
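Below is a minimal Python sketch (not part of the original slides) that reproduces the frequency distribution above from the raw data; the class limits follow the "10 but under 20" convention.

```python
# Build the frequency distribution table from the ordered array above.
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]

n = len(data)
for lo, hi in classes:
    freq = sum(lo <= x < hi for x in data)        # "lo but under hi"
    print(f"{lo} but under {hi}: freq = {freq}, "
          f"relative = {freq / n:.2f}, pct = {100 * freq / n:.0f}%")
```

Running it prints the frequencies 3, 6, 5, 4, 2 shown in the table.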
Graphing Numerical Data:
The Histogram
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
[Histogram of the frequency distribution: bars of heights 3, 6, 5, 4, 2 over the class midpoints 15, 25, 35, 45, 55, with no gaps between bars; the horizontal axis is marked by the class boundaries.]
Graphing Numerical Data:
The Frequency Polygon
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
[Frequency polygon: the class frequencies plotted at the class midpoints 5, 15, 25, 35, 45, 55 (and beyond) and connected by line segments.]
Tabulating Numerical Data:
Cumulative Frequency
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Class              Cumulative Frequency   Cumulative % Frequency
10 but under 20             3                       15
20 but under 30             9                       45
30 but under 40            14                       70
40 but under 50            18                       90
50 but under 60            20                      100
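A short sketch (again, not from the slides) showing how the cumulative columns follow from the class frequencies:

```python
# Cumulative frequency and cumulative % frequency from the class counts above.
from itertools import accumulate

freqs = [3, 6, 5, 4, 2]                        # class frequencies
cum = list(accumulate(freqs))                  # [3, 9, 14, 18, 20]
cum_pct = [100 * c / cum[-1] for c in cum]     # [15.0, 45.0, 70.0, 90.0, 100.0]
print(cum, cum_pct)
```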
Graphing Numerical Data:
The Ogive (Cumulative % Polygon)
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
[Ogive: cumulative % frequency (0 to 100) plotted against the class boundaries 10, 20, 30, 40, 50, 60 (not midpoints).]
Graphing Bivariate Numerical Data
(Scatter Plot)
[Scatter plot: Mutual Funds — Total Year to Date Return (%) on the vertical axis against Net Asset Values on the horizontal axis.]
Tabulating and Graphing Categorical
Data:Univariate Data
Categorical Data
• Tabulating data: the summary table
• Graphing data: pie charts, bar charts, Pareto diagram
Summary Table
(for an Investor’s Portfolio)
Investment Category   Amount (in thousands)   Percentage
Stocks                        46.5               42.27
Bonds                         32                 29.09
CD                            15.5               14.09
Savings                       16                 14.55
Total                        110                100
(The variables are categorical.)
Graphing Categorical Data:
Univariate Data
[Diagram: categorical data are tabulated in a summary table and graphed as a pie chart, bar chart, or Pareto diagram, illustrated with the portfolio categories Stocks, Bonds, CD, and Savings.]
Bar Chart
(for an Investor’s Portfolio)
[Bar chart: Investor's Portfolio — amount in K$ (0 to 50) for Savings, CD, Bonds, and Stocks.]
Pie Chart
(for an Investor’s Portfolio)
[Pie chart: Amount Invested in K$ — Stocks 42%, Bonds 29%, Savings 15%, CD 14%. Percentages are rounded to the nearest percent.]
Pareto Diagram
[Pareto diagram: the bar-chart axis (left) shows the % invested in each category, plotted in descending order (Stocks, Bonds, Savings, CD); the line-graph axis (right) shows the cumulative % invested.]
Tabulating and Graphing Bivariate
Categorical Data
• Contingency tables: investment in thousands
Investment Category   Investor A   Investor B   Investor C   Total
Stocks                    46.5         55           27.5       129
Bonds                     32           44           19           95
CD                        15.5         20           13.5         49
Savings                   16           28            7           51
Total                    110          147           67          324
Tabulating and Graphing Bivariate
Categorical Data
• Side-by-side charts
[Side-by-side bar chart: Comparing Investors — amounts (0 to 60 K$) in Savings, CD, Bonds, and Stocks for Investor A, Investor B, and Investor C.]
Principles of Graphical Excellence
• Presents data in a way that provides
substance, statistics and design
• Communicates complex ideas with clarity,
precision and efficiency
• Gives the largest number of ideas in the most
efficient manner
• Almost always involves several dimensions
• Tells the truth about the data
“Chart Junk”
Bad Presentation vs. Good Presentation
[Bad: the minimum wage listed as decorated "chart junk" — 1960: $1.00; 1970: $1.60; 1980: $3.10; 1990: $3.80. Good: the same minimum wage values on a plain chart with a $0-$4 axis by year.]
No Relative Basis
Bad Presentation vs. Good Presentation
[Bad: A's received by students plotted as raw frequencies (0-300). Good: the same data plotted as percentages (0-30%).]
FR = Freshmen, SO = Sophomore, JR = Junior, SR = Senior
Compressing Vertical Axis
Bad Presentation vs. Good Presentation
[Bad: quarterly sales ($) on a compressed vertical axis (0-200). Good: quarterly sales on an uncompressed axis (0-50) for Q1-Q4.]
No Zero Point on Vertical Axis
Bad Presentation vs. Good Presentation
[Bad: monthly sales ($) on a vertical axis that starts at $36 (no zero point). Good: the same first six months of sales (J-J) on an axis that starts at $0.]
Statistics for Frequency Distribution
• Measures of central tendency
– Mean, median, mode, geometric mean
• Quartiles
• Measures of variation
– Range, interquartile range, variance and standard deviation, coefficient of variation
• Measures of shape
– Symmetric, skewed, using box-and-whisker plots
Measures of Central Tendency
Central Tendency
• Average (mean): $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$ (sample), $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$ (population)
• Median
• Mode
• Geometric mean: $X_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$
Mean (Arithmetic Mean)
• Mean (arithmetic mean) of data values
– Sample mean
(n = sample size)
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
– Population mean (N = population size)
$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$
Mean (Arithmetic Mean)
(continued)
• The most common measure of central
tendency
• Affected by extreme values (outliers)
[Dot plots: a data set on a 0-10 scale with Mean = 5; adding a large outlier (scale extended to 14) raises the mean to 6.]
Median
• Robust measure of central tendency
• Not affected by extreme values
[The same dot plots: Median = 5 in both cases; the outlier leaves the median unchanged.]
• In an ordered array, the median is the
“middle” number
– If n or N is odd, the median is the middle number
– If n or N is even, the median is the average of the
two middle numbers
Mode
• A measure of central tendency
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode
• There may be several modes
[Dot plot on a 0-14 scale: Mode = 9. Dot plot on a 0-6 scale with no repeated value: No Mode.]
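A minimal sketch with Python's statistics module; the data values here are illustrative (chosen so the mean is 5), not taken from the slides:

```python
import statistics

data = [1, 3, 5, 7, 9]
print(statistics.mean(data), statistics.median(data))   # 5 5

with_outlier = [1, 3, 5, 7, 14]                         # extreme value added
print(statistics.mean(with_outlier))                    # 6.0 (mean shifts)
print(statistics.median(with_outlier))                  # 5   (median does not)

print(statistics.mode([9, 9, 4, 5]))                    # 9
print(statistics.multimode([0, 1, 2, 3]))               # each value once: no single mode
```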
Geometric Mean
• Useful for measuring the rate of change of a variable over time
$X_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$
• Geometric mean rate of return
– Measures the status of an investment over time
$R_G = \left[(1 + R_1)(1 + R_2) \cdots (1 + R_n)\right]^{1/n} - 1$
Quartiles
• Quartiles split ordered data into four quarters of 25% each, bounded by Q1, Q2, and Q3.
• Position of the i-th quartile: position of $Q_i = \frac{i(n+1)}{4}$
Data in ordered array: 11 12 13 16 16 17 18 21 22
Position of Q1 = 1(9 + 1)/4 = 2.5, so Q1 = (12 + 13)/2 = 12.5
• Q1 and Q3 are measures of noncentral location
• Q2 = median, a measure of central tendency
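A sketch implementing the i(n + 1)/4 positional rule from this slide (note that library routines such as numpy.percentile use a different interpolation scheme by default):

```python
def quartile(sorted_data, i):
    """Quartile Qi located at position i*(n+1)/4 in the ordered data."""
    pos = i * (len(sorted_data) + 1) / 4
    lo = int(pos) - 1              # convert 1-based position to 0-based index
    frac = pos - int(pos)
    if frac == 0:
        return sorted_data[lo]
    return sorted_data[lo] + frac * (sorted_data[lo + 1] - sorted_data[lo])

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(quartile(data, 1))   # position 2.5 -> (12 + 13)/2 = 12.5
print(quartile(data, 2))   # 16, the median
print(quartile(data, 3))   # position 7.5 -> (18 + 21)/2 = 19.5
```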
Measures of Variation
Variation
• Range
• Interquartile range
• Variance (population variance, sample variance)
• Standard deviation (population standard deviation, sample standard deviation)
• Coefficient of variation
Range
• Measure of variation
• Difference between the largest and the smallest
observations:
Range  X Largest  X Smallest
• Ignores the way in which data are distributed
[Two dot plots over the values 7 to 12 with different shapes but the same Range = 12 - 7 = 5.]
Interquartile Range
• Measure of variation
• Also known as midspread
– Spread in the middle 50%
• Difference between the first and third
quartiles
Data in ordered array: 11 12 13 16 16 17 17 18 21
Interquartile range = Q3 - Q1 = 17.5 - 12.5 = 5
• Not affected by extreme values
Variance
• Important measure of variation
• Shows variation about the mean
– Sample variance:
$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
– Population variance:
$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
Standard Deviation
• Most important measure of variation
• Shows variation about the mean
• Has the same units as the original data
– Sample standard deviation:
$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
– Population standard deviation:
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$
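A sketch with numpy: ddof=1 gives the sample formulas (divide by n - 1), ddof=0 the population formulas (divide by N):

```python
import numpy as np

x = np.array([11, 12, 13, 16, 16, 17, 18, 21, 22], dtype=float)
print(np.var(x, ddof=1), np.std(x, ddof=1))   # sample variance and s.d.
print(np.var(x, ddof=0), np.std(x, ddof=0))   # population variance and s.d.
```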
Comparing Standard Deviations
[Dot plots over the values 11 to 21: Data A (Mean = 15.5, s = 3.338), Data B (Mean = 15.5, s = .9258), Data C (Mean = 15.5, s = 4.57) — the same mean but different spreads.]
Coefficient of Variation
• Measures relative variation
• Always in percentage (%)
• Shows variation relative to mean
• Is used to compare two or more sets of data
measured in different units
$CV = \left(\frac{S}{\bar{X}}\right) \times 100\%$
Comparing Coefficient
of Variation
• Stock A:
– Average price last year = $50
– Standard deviation = $5
• Stock B:
– Average price last year = $100
– Standard deviation = $5
• Coefficient of variation:
– Stock A: $CV = \left(\frac{S}{\bar{X}}\right) \times 100\% = \left(\frac{\$5}{\$50}\right) \times 100\% = 10\%$
– Stock B: $CV = \left(\frac{S}{\bar{X}}\right) \times 100\% = \left(\frac{\$5}{\$100}\right) \times 100\% = 5\%$
Shape of a Distribution
• Describes how data are distributed
• Measures of shape
– Symmetric or skewed
Left-skewed: Mean < Median < Mode
Symmetric: Mean = Median = Mode
Right-skewed: Mode < Median < Mean
Exploratory Data Analysis
• Box-and-whisker plot
– Graphical display of data using 5-number summary
[Box-and-whisker plot over a 4-12 scale, marked with the 5-number summary: X_smallest, Q1, Median (Q2), Q3, X_largest.]
Distribution Shape and
Box-and-Whisker Plot
[Box-and-whisker plots: left-skewed (Q2 closer to Q3), symmetric (Q2 midway between Q1 and Q3), right-skewed (Q2 closer to Q1).]
Cross-Tabulation
• While a frequency distribution describes one variable at a
time, a cross-tabulation describes two or more variables
simultaneously.
• Cross-tabulation results in tables that reflect the joint
distribution of two or more variables with a limited number
of categories or distinct values.
Gender and Internet Usage
                        Gender
Internet Usage     Male    Female    Row Total
Light (1)            5       10         15
Heavy (2)           10        5         15
Column Total        15       15         30
Two Variables Cross-Tabulation
• Since two variables have been cross-classified, percentages could be computed either columnwise, based on column totals, or rowwise, based on row totals.
• The general rule is to compute the percentages in the direction of the independent variable, across the dependent variable. The correct way of calculating percentages is shown in the next slide.
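A sketch with pandas reproducing the gender/usage counts and the column percentages that follow; the row-level data are reconstructed from the table above:

```python
import pandas as pd

df = pd.DataFrame({
    "gender": ["Male"] * 15 + ["Female"] * 15,
    "usage":  ["Light"] * 5 + ["Heavy"] * 10 + ["Light"] * 10 + ["Heavy"] * 5,
})
print(pd.crosstab(df["usage"], df["gender"]))                             # counts
print(pd.crosstab(df["usage"], df["gender"], normalize="columns") * 100)  # column %
```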
Internet Usage by Gender
                     Gender
Internet Usage    Male    Female
Light            33.3%    66.7%
Heavy            66.7%    33.3%
Column total      100%     100%
Gender by Internet Usage
               Internet Usage
Gender        Light    Heavy     Total
Male          33.3%    66.7%    100.0%
Female        66.7%    33.3%    100.0%
Introduction of a Third Variable in Cross-Tabulation
Original two variables:
• Some association between the two variables → introduce a third variable → either a refined association between the two variables, or no association between the two variables.
• No association between the two variables → introduce a third variable → either no change in the initial pattern, or some association between the two variables.
Three Variables Cross-Tabulation
Refine an Initial Relationship
The introduction of a third variable can result in four possibilities:
• As can be seen from the first table, 52% of unmarried respondents fell in the high-purchase category, as opposed to 31% of the married respondents. Before concluding
that unmarried respondents purchase more fashion clothing than those who
are married, a third variable, the buyer's sex, was introduced into the analysis.
• As shown in the table, in the case of females, 60% of the unmarried fall in the
high-purchase category, as compared to 25% of those who are married. On
the other hand, the percentages are much closer for males, with 40% of the
unmarried and 35% of the married falling in the high purchase category.
• Hence, the introduction of sex (third variable) has refined the relationship
between marital status and purchase of fashion clothing (original variables).
Unmarried respondents are more likely to fall in the high purchase category
than married ones, and this effect is much more pronounced for females than
for males.
Purchase of Fashion Clothing by
Marital Status
Purchase of              Current Marital Status
Fashion Clothing        Married      Unmarried
High                      31%           52%
Low                       69%           48%
Column totals            100%          100%
Number of respondents     700           300
Purchase of Fashion Clothing by
Marital Status
Purchase of                          Sex
Fashion                 Male                      Female
Clothing         Married   Not Married     Married   Not Married
High               35%        40%            25%        60%
Low                65%        60%            75%        40%
Column totals     100%       100%           100%       100%
Number of cases    400        120            300        180
Three Variables Cross-Tabulation
Initial Relationship was Spurious
• The first table shows that 32% of those with college degrees own an
expensive automobile, as compared to 21% of those without
college degrees. Realizing that income may also be a factor, the
researcher decided to reexamine the relationship between
education and ownership of expensive automobiles in light of
income level.
• In the second table, the percentages of those with and without college
degrees who own expensive automobiles are the same for each
of the income groups. When the data for the high income and
low income groups are examined separately, the association
between education and ownership of expensive automobiles
disappears, indicating that the initial relationship observed
between these two variables was spurious.
Ownership of Expensive
Automobiles by Education Level
Own Expensive                 Education
Automobile          College Degree   No College Degree
Yes                      32%               21%
No                       68%               79%
Column totals           100%              100%
Number of cases          250               750
Ownership of Expensive Automobiles by
Education Level and Income Levels
                                   Income
Own                     Low Income                High Income
Expensive          College     No College     College     No College
Automobile         Degree      Degree         Degree      Degree
Yes                  20%         20%            40%         40%
No                   80%         80%            60%         60%
Column totals       100%        100%           100%        100%
Number of
respondents          100         700            150          50
Three Variables Cross-Tabulation
Reveal Suppressed Association
• The first table shows no association between desire to travel abroad and age.
• When sex was introduced as the third variable, the second table was obtained. Among men, 60% of those under 45 indicated a desire to travel abroad, as compared to 40% of those 45 or older. The pattern was reversed for women, where 35% of those under 45 indicated a desire to travel abroad as opposed to 65% of those 45 or older.
• Since the association between desire to travel abroad and age runs in the opposite direction for males and females, the relationship between these two variables is masked when the data are aggregated across sex, as in the first table.
• But when the effect of sex is controlled, as in the second table, the suppressed association between desire to travel abroad and age is revealed for the separate categories of males and females.
Desire to Travel Abroad by Age
                                    Age
Desire to Travel Abroad    Less than 45    45 or More
Yes                            50%             50%
No                             50%             50%
Column totals                 100%            100%
Number of respondents          500             500
Desire to Travel Abroad by
Age and Gender
                                Sex
Desire to            Male                  Female
Travel               Age                    Age
Abroad          < 45     >= 45         < 45     >= 45
Yes              60%      40%           35%      65%
No               40%      60%           65%      35%
Column totals   100%     100%          100%     100%
Number of cases  300      300           200      200
Three Variables Cross-Tabulations
No Change in Initial Relationship
• Consider the cross-tabulation of family size and the tendency to eat out frequently in fast-food restaurants, as shown in the first table. No association is observed.
• When income was introduced as a third variable in the analysis, the second table was obtained. Again, no association was observed.
Eating Frequently in
Fast-Food Restaurants by Family Size
Eat Frequently in              Family Size
Fast-Food Restaurants       Small      Large
Yes                          65%        65%
No                           35%        35%
Column totals               100%       100%
Number of cases              500        500
Eating Frequently in Fast Food-Restaurants
by Family Size and Income
                                   Income
Eat Frequently in        Low Income            High Income
Fast-Food                Family size           Family size
Restaurants            Small    Large        Small    Large
Yes                     65%      65%          65%      65%
No                      35%      35%          35%      35%
Column totals          100%     100%         100%     100%
Number of respondents   250      250          250      250
Statistics Associated with
Cross-Tabulation: Chi-Square
• To determine whether a systematic association exists, the
probability of obtaining a value of chi-square as large or larger
than the one calculated from the cross-tabulation is estimated.
• An important characteristic of the chi-square statistic is the
number of degrees of freedom (df) associated with it. That is,
df = (r - 1) x (c -1).
• The null hypothesis (H0) of no association between the two variables will be rejected only when the calculated value of the test statistic is greater than the critical value of the chi-square distribution with the appropriate degrees of freedom.
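A sketch with scipy for the 2 x 2 gender/usage table above; correction=False matches the uncorrected chi-square formula:

```python
from scipy.stats import chi2_contingency

table = [[5, 10],    # Light:  Male, Female
         [10, 5]]    # Heavy:  Male, Female
chi2, p, df, expected = chi2_contingency(table, correction=False)
print(chi2, df, p)   # df = (2 - 1) x (2 - 1) = 1; reject H0 if p < alpha
```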
Statistics Associated with
Cross-Tabulation: Phi Coefficient
• The phi coefficient (φ) is used as a measure of the strength of association in the special case of a table with two rows and two columns (a 2 x 2 table).
• The phi coefficient is proportional to the square root of the chi-square statistic:
$\phi = \sqrt{\frac{\chi^2}{n}}$
• It takes the value of 0 when there is no association, which would be indicated by a chi-square value of 0 as well. When the variables are perfectly associated, phi assumes the value of 1 and all the observations fall just on the main or minor diagonal.
Statistics Associated with Cross-Tabulation
Contingency Coefficient
• While the phi coefficient is specific to a 2 x 2 table, the
contingency coefficient (C) can be used to assess the
strength of association in a table of any size.
$C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$
• The contingency coefficient varies between 0 and 1.
• The maximum value of the contingency coefficient
depends on the size of the table (number of rows and
number of columns). For this reason, it should be used
only to compare tables of the same size.
Statistics Associated with Cross-Tabulation
Cramer’s V
• Cramer's V is a modified version of the phi correlation coefficient, φ, and is used in tables larger than 2 x 2.
$V = \sqrt{\frac{\phi^2}{\min(r-1,\; c-1)}}$  or  $V = \sqrt{\frac{\chi^2 / n}{\min(r-1,\; c-1)}}$
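A sketch of the three strength-of-association measures just defined, applied to the chi-square value from the 2 x 2 example:

```python
import math

def phi(chi2, n):                        # 2 x 2 tables only
    return math.sqrt(chi2 / n)

def contingency_c(chi2, n):
    return math.sqrt(chi2 / (chi2 + n))

def cramers_v(chi2, n, r, c):            # r rows, c columns
    return math.sqrt((chi2 / n) / min(r - 1, c - 1))

chi2, n = 3.33, 30
print(phi(chi2, n), contingency_c(chi2, n), cramers_v(chi2, n, 2, 2))
```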
Statistics Associated with Cross-Tabulation
Lambda Coefficient
• Asymmetric lambda measures the percentage improvement in
predicting the value of the dependent variable, given the value of
the independent variable.
• Lambda also varies between 0 and 1. A value of 0 means no
improvement in prediction. A value of 1 indicates that the
prediction can be made without error. This happens when each
independent variable category is associated with a single category
of the dependent variable.
• Asymmetric lambda is computed for each of the variables (treating
it as the dependent variable).
• A symmetric lambda is also computed, which is a kind of average of
the two asymmetric values. The symmetric lambda does not make
an assumption about which variable is dependent. It measures the
overall improvement when prediction is done in both directions.
Other Statistics Associated with
Cross-Tabulation
• Other statistics like tau b, tau c, and gamma are available to
measure association between two ordinal-level variables. Both
tau b and tau c adjust for ties.
• Tau b is the most appropriate with square tables in which the
number of rows and the number of columns are equal. Its value
varies between +1 and -1.
• For a rectangular table in which the number of rows is different from the number of columns, tau c should be used.
• Gamma does not make an adjustment for either ties or table size.
Gamma also varies between +1 and -1 and generally has a higher
numerical value than tau b or tau c.
Cross-Tabulation in Practice
While conducting cross-tabulation analysis in practice, it is useful to proceed
along the following steps.
1. Test the null hypothesis that there is no association between the variables
using the chi-square statistic. If you fail to reject the null hypothesis, then
there is no relationship.
2. If H0 is rejected, then determine the strength of the association using an
appropriate statistic (phi-coefficient, contingency coefficient, Cramer's V,
lambda coefficient, or other statistics), as discussed earlier.
3. If H0 is rejected, interpret the pattern of the relationship by computing the
percentages in the direction of the independent variable, across the dependent
variable.
4. If the variables are treated as ordinal rather than nominal, use tau b, tau c, or
Gamma as the test statistic. If H0 is rejected, then determine the strength of
the association using the magnitude, and the direction of the relationship using
the sign of the test statistic.
Hypothesis Testing
 In statistics, a hypothesis is a claim or statement about
a property of a population.
 A hypothesis test (or test of significance) is a standard
procedure for testing a claim about a property of a
population.
Rare Event Rule for Inferential Statistics
If, under a given assumption, the probability of a
particular observed event is exceptionally small, we
conclude that the assumption is probably not correct.
Steps Involved in Hypothesis Testing
1. Formulate H0 and H1.
2. Select an appropriate test.
3. Choose the level of significance, α.
4. Collect data and calculate the test statistic.
5. Either (a) determine the probability associated with the test statistic and compare it with the level of significance α, or (b) determine the critical value of the test statistic, TSCR, and determine whether TSCR falls into the rejection or nonrejection region.
6. Reject or do not reject H0.
7. Draw the research conclusion.
Definitions
Critical Region
The critical region (or rejection region) is the set of all values of
the test statistic that cause us to reject the null hypothesis.
Significance Level
The significance level (denoted by α) is the probability that the test statistic will fall in the critical region when the null hypothesis is actually true. Common choices for α are 0.05, 0.01, and 0.10.
Critical Value
A critical value is any value that separates the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to rejection of the null hypothesis. It depends on the nature of the null hypothesis, the sampling distribution that applies, and the significance level α. For example, the critical value of z = 1.96 corresponds to a two-tailed significance level of α = 0.05.
Type I & Type II Errors
 A Type I error is the mistake of rejecting the null hypothesis when it is true.
 The symbol α (alpha) is used to represent the probability of a type I error.
 A Type II error is the mistake of failing to reject the null hypothesis when it is false.
 The symbol β (beta) is used to represent the probability of a type II error.
Controlling Type I and
Type II Errors
 For any fixed α, an increase in the sample size n will cause a decrease in β.
 For any fixed sample size n, a decrease in α will cause an increase in β. Conversely, an increase in α will cause a decrease in β.
 To decrease both α and β, increase the sample size.
Conclusions
in Hypothesis Testing
We always test the null hypothesis.
1. Reject the H0
2. Fail to reject the H0
Two-tailed,
Right-tailed,
Left-tailed Tests
• The tails in a distribution are the extreme
regions bounded by critical values.
Two-tailed Test
H0: μ = μ0
H1: μ ≠ μ0 (the alternative means "less than or greater than"; μ0 denotes the claimed value)
α is divided equally between the two tails of the critical region.
Right-tailed Test
H0: μ = μ0
H1: μ > μ0 (the critical region points right)
Left-tailed Test
H0: μ = μ0
H1: μ < μ0 (the critical region points left)
Decision Criterion
• Traditional method:
Reject H0 if the test statistic falls within the critical
region.
Fail to reject H0 if the test statistic does not fall within
the critical region.
• P-value method:
Reject H0 if P-value ≤ α (where α is the significance level, such as 0.05).
Fail to reject H0 if P-value > α.
• Another option:
Instead of using a significance level such as 0.05,
simply identify the P-value and leave the decision to
the reader.
Decision Criterion
• Confidence Intervals:
Because a confidence interval estimate of a
population parameter contains the likely
values of that parameter, reject a claim that
the population parameter has a value that is
not included in the confidence interval.
P-Value
The P-value (or p-value or probability value)
is the probability of getting a value of the test
statistic that is at least as extreme as the one
representing the sample data, assuming that
the null hypothesis is true. The null
hypothesis is rejected if the P-value is very
small, such as 0.05 or less.
Wording of Final Conclusion
Accept versus
Fail to Reject
 Some texts use “accept the null hypothesis.”
 We are not proving the null hypothesis.
 The sample evidence is not strong enough to
warrant rejection (such as not enough
evidence to convict a suspect).
Comprehensive Hypothesis Test
Statistical Tests
Parametric
• Interval or ratio scaled data
• Assumptions about the population probability distribution
• Examples: Z, t, F tests, etc.
Non-Parametric
• Nominal or ordinal data
• No assumption about the population probability distribution
• Examples: χ2, sign, Wilcoxon signed-rank, Kruskal-Wallis tests, etc.
A Classification of Hypothesis Testing Procedures
for Examining Differences
Hypothesis Tests
• Parametric tests (metric tests)
– One sample: t test, Z test
– Two or more samples, independent samples: two-group t test, Z test
– Two or more samples, paired samples: paired t test
• Non-parametric tests (nonmetric tests)
– One sample: chi-square, K-S, runs, binomial
– Two or more samples, independent samples: chi-square, Mann-Whitney, median, K-S
– Two or more samples, paired samples: sign, Wilcoxon, McNemar, chi-square
A Broad Classification of Hypothesis Tests
Hypothesis Tests
• Tests of association
• Tests of differences: distributions, means, proportions, medians/rankings
Applications of Z-test (n > 30)
• Test of significance for a single mean
• Test of significance for the difference of means
• Test of significance for the difference of standard deviations (s.d.)
• Testing a claim about a proportion
• Testing the difference of two proportions
Test of significance for single mean
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Where
• X̄ = sample mean
• μ = population mean
• σ = population standard deviation (s.d.)
• n = sample size
NOTE: If the population standard deviation σ is unknown, the estimated sample standard deviation $S = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}$ is used.
The 95% confidence interval for μ is $\bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}}$
The 99% confidence interval for μ is $\bar{X} \pm 2.57 \frac{\sigma}{\sqrt{n}}$
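A sketch of the single-mean Z test; the summary numbers are made up for illustration:

```python
import math
from scipy.stats import norm

x_bar, mu, sigma, n = 52.0, 50.0, 5.0, 36      # hypothetical summary values
z = (x_bar - mu) / (sigma / math.sqrt(n))      # Z = (X-bar - mu) / (sigma/sqrt(n))
p = 2 * (1 - norm.cdf(abs(z)))                 # two-tailed P-value
ci95 = (x_bar - 1.96 * sigma / math.sqrt(n),
        x_bar + 1.96 * sigma / math.sqrt(n))
print(z, p, ci95)                              # z = 2.4, p ~ 0.016
```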
Test of significance for difference of means
X X
Z 
1

2
1
n

1
2

2
2
n
2
Where
X 1, X 2are the means of first and second sample
 1,  2 are standard deviations of samples
n1, n 2 are the sample size of first sample & second
sample
2
 2not known then
1
• IF  12  and
and
are
2
Z 
X X
s
s

n
n
1
2
2
1
1
2
2
2
Test of significance for difference of standard
deviation (s.d.)
$Z = \frac{s_1 - s_2}{\sqrt{\frac{\sigma_1^2}{2n_1} + \frac{\sigma_2^2}{2n_2}}}$
• If σ1 and σ2 are not known, then
$Z = \frac{s_1 - s_2}{\sqrt{\frac{s_1^2}{2n_1} + \frac{s_2^2}{2n_2}}}$
Testing a Claim about a Proportion
n = number of trials
p̂ = x/n (sample proportion)
p = population proportion (used in the null hypothesis)
q = 1 - p
$Z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$
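A sketch of the one-proportion Z test with hypothetical counts:

```python
import math
from scipy.stats import norm

x, n, p0 = 60, 100, 0.5                # 60 successes in 100 trials; H0: p = 0.5
p_hat = x / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
print(z, 2 * (1 - norm.cdf(abs(z))))   # z = 2.0, two-tailed p ~ 0.046
```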
Testing difference of Two proportions
$Z = \frac{p_1 - p_2}{SE}$
Where
$SE = \sqrt{p^* (1 - p^*) \left(\frac{n_1 + n_2}{n_1 n_2}\right)}$
Note that p* is the combined two-sample proportion, weighted by the two sample sizes:
$p^* = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$
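A sketch of the two-proportion Z test using the pooled proportion p*; the sample proportions are hypothetical:

```python
import math
from scipy.stats import norm

n1, p1 = 200, 0.60
n2, p2 = 250, 0.48
p_star = (n1 * p1 + n2 * p2) / (n1 + n2)                      # pooled proportion
se = math.sqrt(p_star * (1 - p_star) * (n1 + n2) / (n1 * n2))
z = (p1 - p2) / se
print(z, 2 * (1 - norm.cdf(abs(z))))
```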
Applications of t-test (n ≤ 30)
• Test the significance of the mean of a random sample
• Test the difference between means of two samples (independent samples)
• Test the difference between means of two samples (dependent samples)
• Test the significance of an observed correlation coefficient
Test the significance of the mean of a
random sample
$t = \frac{(\bar{X} - \mu)\sqrt{n}}{S}$
Where
X̄ = sample mean
μ = population mean
S = standard deviation of the sample = $\sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}$
n = sample size
Degrees of freedom (d.f.) = n - 1
The 95% confidence interval for μ is $\bar{X} \pm t_{(0.05,\, n-1)} \frac{S}{\sqrt{n}}$
Test of significance for difference of
means
$t = \left(\frac{\bar{X}_1 - \bar{X}_2}{S}\right) \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$
Where
X̄1, X̄2, s1, s2, n1, n2 are the mean, standard deviation, and sample size of the first and second samples
S = combined standard deviation:
$S = \sqrt{\frac{\sum_{i=1}^{n_1} (X_i - \bar{X}_1)^2 + \sum_{i=1}^{n_2} (X_i - \bar{X}_2)^2}{n_1 + n_2 - 2}}$
If s1, s2 and n1, n2 are given:
$S = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$
d.f. = n1 + n2 - 2
Test the difference between means of two
samples (Dependent Samples)
$t = \frac{\bar{d}}{S / \sqrt{n}}$
Where
d̄ is the mean of the differences
n is the number of paired observations
S is the standard deviation of the differences = $\sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (d_i - \bar{d})^2}$
d.f. = n - 1
Test of significance of an observed
correlation coefficients
$t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$
Where
r is the correlation coefficient
n is the sample size
d.f. = n - 2
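A sketch of the significance test for an observed correlation coefficient; r and n are hypothetical:

```python
import math

r, n = 0.45, 20
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
print(t)    # compare with the critical t value at d.f. = n - 2 = 18
```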
Two Independent Samples F Test
An F test of sample variance may be performed if it is
not known whether the two populations have equal
variance. In this case, the hypotheses are:
H0: σ1² = σ2²
H1: σ1² ≠ σ2²
Two Independent Samples
F Statistic
The F statistic is computed from the sample variances as follows:
$F_{(n_1 - 1),\,(n_2 - 1)} = \frac{s_1^2}{s_2^2}$
where
n1 = size of sample 1
n2 = size of sample 2
n1 - 1 = degrees of freedom for sample 1
n2 - 1 = degrees of freedom for sample 2
s1² = sample variance for sample 1
s2² = sample variance for sample 2
Suppose we wanted to determine whether Internet usage was different for males as compared to females. A two-independent-samples t test was conducted.
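A sketch of both tests with scipy, using made-up usage scores for the two groups (note that stats.ttest_ind assumes equal variances by default):

```python
from scipy import stats

males   = [2, 3, 3, 4, 5, 6, 7, 8, 9, 10]    # hypothetical data
females = [1, 1, 2, 2, 3, 3, 4, 4, 5, 6]

f = stats.describe(males).variance / stats.describe(females).variance
print(f)                                                 # F = s1^2 / s2^2
print(stats.f.sf(f, len(males) - 1, len(females) - 1))   # upper-tail p-value
print(stats.ttest_ind(males, females))                   # two-group t test
```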
Nonparametric Tests
Nonparametric tests are used when the independent
variables are nonmetric. Like parametric tests,
nonparametric tests are available for testing variables
from one sample, two independent samples, or two
related samples.
Non-parametric Test
• Chi-square test
• Binomial
• Runs
• 1-sample K-S
• 2 independent samples
• K independent samples
• 2 dependent samples
• K dependent samples
Applications of χ2 Test
1. Goodness of Fit
2. Contingency Analysis (or Test of
Independence)
3. Test of population variance
Goodness of Fit
$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
Where Oi and Ei are the observed and expected frequencies
Degrees of freedom (d.f.) = n - 1
Note:
• No Ei should be less than 5; if so, cell(s) must be combined and the d.f. reduced accordingly.
• If some parameters (e.g., the mean or standard deviation) are calculated from the Oi in order to calculate the Ei, the d.f. should be reduced by 1 for each such parameter.
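A sketch of a goodness-of-fit test with scipy (a die assumed fair; the observed counts are hypothetical):

```python
from scipy.stats import chisquare

observed = [22, 17, 20, 26, 22, 13]     # counts from 120 hypothetical rolls
expected = [20] * 6                     # all E_i >= 5; d.f. = 6 - 1 = 5
print(chisquare(observed, expected))    # chi-square statistic and p-value
```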
Contingency Analysis (or Test of
Independence)
$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
Where Oi and Ei are the observed and expected frequencies
Degrees of freedom = (rows - 1)(columns - 1)
Note:
When the degrees of freedom equal 1 AND N < 50, adjust χ2 by Yates's correction factor, i.e.,
$\chi^2 = \sum_{i=1}^{n} \frac{(|O_i - E_i| - 0.5)^2}{E_i}$
unless $|O_i - E_i| < 0.5$, in which case the original $(O_i - E_i)^2$ term is preserved.
Test of population variance
$\chi^2 = \frac{(n - 1) S^2}{\sigma^2}$
Where
S² is the sample variance
σ² is the population variance
n is the number of observations
Degrees of freedom = n - 1
Nonparametric Tests One Sample
Sometimes the researcher wants to test whether the
observations for a particular variable could reasonably
have come from a particular distribution, such as the
normal, uniform, or Poisson distribution.
The Kolmogorov-Smirnov (K-S) one-sample test
is one such goodness-of-fit test. The K-S compares the
cumulative distribution function for a variable with a
specified distribution. Ai denotes the cumulative
relative frequency for each category of the theoretical
(assumed) distribution, and Oi the comparable value of
the sample frequency. The K-S test is based on the
maximum value of the absolute difference between Ai
and Oi. The test statistic is
$K = \max |A_i - O_i|$
Nonparametric Tests One Sample
• The decision to reject the null hypothesis is based on the value of K. The larger K is, the more confidence we have that H0 is false. For α = 0.05, the critical value of K for large samples (over 35) is given by $1.36/\sqrt{n}$. Alternatively, K can be transformed into a normally distributed z statistic and its associated probability determined.
• In the context of the Internet usage example, suppose we wanted to test whether the distribution of Internet usage was normal. A K-S one-sample test is conducted, yielding the data shown in the table below. The table indicates that the probability of observing a K value of 0.222, as determined by the normalized z statistic, is 0.103. Since this is more than the significance level of 0.05, the null hypothesis cannot be rejected, leading to the same conclusion. Hence, the distribution of Internet usage does not deviate significantly from the normal distribution.
K-S One-Sample Test for
Normality of Internet Usage
Test Distribution - Normal
Mean: 6.600
Standard Deviation: 4.296
Cases: 30
Most Extreme Differences
Absolute: 0.222   Positive: 0.222   Negative: -0.142
K-S z: 1.217   2-tailed p: 0.103
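A sketch of a K-S one-sample test for normality with scipy; the usage data are simulated stand-ins, and the mean and s.d. are supplied explicitly:

```python
import numpy as np
from scipy import stats

usage = np.random.default_rng(0).poisson(6.6, size=30)   # stand-in data
stat, p = stats.kstest(usage, "norm",
                       args=(usage.mean(), usage.std(ddof=1)))
print(stat, p)    # fail to reject normality if p > 0.05
```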
Nonparametric Tests One Sample
• The chi-square test can also be performed on a single variable
from one sample. In this context, the chi-square serves as a
goodness-of-fit test.
• The runs test is a test of randomness for dichotomous variables. This test is conducted by determining whether the
order or sequence in which observations are obtained is
random.
• The binomial test is also a goodness-of-fit test for
dichotomous variables. It tests the goodness of fit of the
observed number of observations in each category to the
number expected under a specified binomial distribution.
Nonparametric Tests
Two Independent Samples
• When the difference in the location of two populations is to be
compared based on observations from two independent samples, and
the variable is measured on an ordinal scale, the Mann-Whitney U
test can be used.
• In the Mann-Whitney U test, the two samples are combined and the
cases are ranked in order of increasing size.
• The test statistic, U, is computed as the number of times a score from
sample or group 1 precedes a score from group 2.
• If the samples are from the same population, the distribution of
scores from the two groups in the rank list should be random. An
extreme value of U would indicate a nonrandom pattern, pointing to
the inequality of the two groups.
• For samples of less than 30, the exact significance level for U is
computed. For larger samples, U is transformed into a normally
distributed z statistic. This z can be corrected for ties within ranks.
Nonparametric Tests
Two Independent Samples
• We examine again the difference in the Internet usage of males and
females. This time, though, the Mann-Whitney U test is used. The
results are given in Table .
• One could also use the cross-tabulation procedure to conduct a chi-square test. In this case, we will have a 2 x 2 table. One variable will be used to denote the sample, and will assume the value 1 for sample 1 and the value of 2 for sample 2. The other variable will be the binary variable of interest.
• The two-sample median test determines whether the two groups are
drawn from populations with the same median. It is not as powerful as
the Mann-Whitney U test because it merely uses the location of each
observation relative to the median, and not the rank, of each
observation.
• The Kolmogorov-Smirnov two-sample test examines whether the two
distributions are the same. It takes into account any differences
between the two distributions, including the median, dispersion, and
skewness.
Mann-Whitney U - Wilcoxon Rank
Sum W Test Internet Usage by Gender
Sex       Mean Rank   Cases
Male        20.93       15
Female      10.07       15
Total                   30

U = 31.000   W = 151.000   z = -3.406   2-tailed p = 0.001 (corrected for ties)
Note:
U = Mann-Whitney test statistic
W = Wilcoxon W statistic
z = U transformed into a normally distributed z statistic
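A sketch of the Mann-Whitney U test with scipy on hypothetical ordinal usage scores:

```python
from scipy.stats import mannwhitneyu

male   = [7, 8, 9, 9, 10, 11, 12, 12, 13, 14]   # hypothetical scores
female = [3, 4, 4, 5, 5, 6, 6, 7, 8, 9]
print(mannwhitneyu(male, female, alternative="two-sided"))
```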
Nonparametric Tests
Paired Samples
• The Wilcoxon matched-pairs signed-ranks test analyzes the
differences between the paired observations, taking into
account the magnitude of the differences.
• It computes the differences between the pairs of variables
and ranks the absolute differences.
• The next step is to sum the positive and negative ranks. The
test statistic, z, is computed from the positive and negative
rank sums.
• Under the null hypothesis of no difference, z is a standard
normal variate with mean 0 and variance 1 for large
samples.
Nonparametric Tests Paired Samples
• The example considered for the paired t test, whether the
respondents differed in terms of attitude toward the Internet
and attitude toward technology, is considered again. Suppose
we assume that both these variables are measured on ordinal
rather than interval scales. Accordingly, we use the Wilcoxon
test.
• The sign test is not as powerful as the Wilcoxon matched-pairs
signed-ranks test as it only compares the signs of the differences
between pairs of variables without taking into account the ranks.
• In the special case of a binary variable where the researcher
wishes to test differences in proportions, the McNemar test can
be used. Alternatively, the chi-square test can also be used for
binary variables.
Wilcoxon Matched-Pairs Signed-Rank Test
Internet with Technology
(Technology - Internet)   Cases   Mean rank
-Ranks                      23      12.72
+Ranks                       1       7.50
Ties                         6
Total                       30
z = -4.207   2-tailed p = 0.0000
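A sketch of the Wilcoxon matched-pairs signed-ranks test with scipy on hypothetical paired attitude scores:

```python
from scipy.stats import wilcoxon

internet   = [7, 6, 5, 7, 6, 7, 5, 6, 7, 6]   # hypothetical paired ratings
technology = [5, 4, 5, 6, 4, 5, 4, 5, 5, 4]
print(wilcoxon(internet, technology))         # statistic and two-tailed p-value
```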
A Summary of Hypothesis Tests
Related to Differences
Sample       Application     Level of Scaling   Test/Comments
One sample   Proportion      Metric             Z test
One sample   Distributions   Nonmetric          K-S and chi-square for goodness of fit; runs test for randomness; binomial test for goodness of fit for dichotomous variables
One sample   Means           Metric             t test, if variance is unknown; z test, if variance is known
A Summary of Hypothesis Tests
Related to Differences
Sample                    Application        Level of Scaling   Test/Comments
Two independent samples   Distributions      Nonmetric          K-S two-sample test for examining the equivalence of two distributions
Two independent samples   Means              Metric             Two-group t test; F test for equality of variances
Two independent samples   Proportions        Metric/Nonmetric   z test (metric); chi-square test (nonmetric)
Two independent samples   Rankings/Medians   Nonmetric          Mann-Whitney U test is more powerful than the median test
A Summary of Hypothesis Tests
Related to Differences
Sample           Application        Level of Scaling   Test/Comments
Paired samples   Means              Metric             Paired t test
Paired samples   Proportions       Nonmetric          McNemar test for binary variables; chi-square test
Paired samples   Rankings/Medians   Nonmetric          Wilcoxon matched-pairs signed-ranks test is more powerful than the sign test