Chapter 10
This multimedia product and its contents are protected under
copyright law. The following are prohibited by law:
• Any public performance or display, including transmission of
any image over a network;
• Preparation of any derivative work, including the extraction, in
whole or in part, of any images;
• Any rental, lease, or lending of the program.
Copyright © Allyn & Bacon 2008
Inferential statistics
Purpose
Error
Terminology
Hypothesis testing
Inferential tests
Criteria for evaluating the inferential
statistics reports in studies
The purpose of inferential statistics is to
draw inferences about a population on the
basis of an estimate from a sample
Inferential statistics - specific statistical
procedures that accomplish this purpose
The ultimate goal is to draw accurate
conclusions about the population
Two types of errors
Sampling errors
▪ Without measuring the entire population, the results can be
inaccurate due to sampling error
▪ The larger the proportion of the population that is sampled, the lower the
sampling error; the smaller the proportion of the population that is sampled,
the higher the sampling error
▪ A sample of 99% of a population is likely to show results that are very, very
similar to those that would have been found if everyone in the population
was measured
▪ A sample of 1% is likely to show results that are different from those in the
population; the question is how different the sample results are
▪ Need to estimate the level of sampling error relative to the inferences
being drawn
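A minimal simulation sketch of the point above, assuming a hypothetical population of 10,000 normally distributed scores (the population, its mean and SD, and the helper name `average_sampling_error` are illustrative, not from the chapter). It compares how far sample means fall from the population mean for a 1% sample and a 99% sample.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 test scores (mean about 70, SD about 10).
population = [random.gauss(70, 10) for _ in range(10_000)]
population_mean = statistics.fmean(population)

def average_sampling_error(fraction, trials=100):
    """Average absolute gap between a sample mean and the population mean."""
    n = int(len(population) * fraction)
    gaps = []
    for _ in range(trials):
        sample = random.sample(population, n)  # sampling without replacement
        gaps.append(abs(statistics.fmean(sample) - population_mean))
    return statistics.fmean(gaps)

error_small_sample = average_sampling_error(0.01)  # 1% of the population
error_large_sample = average_sampling_error(0.99)  # 99% of the population
print(f"1% sample:  average error {error_small_sample:.3f}")
print(f"99% sample: average error {error_large_sample:.3f}")
```

The 99% sample should land much closer to the population mean than the 1% sample, which is the pattern the slide describes.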
Measurement errors
Regardless of the sample size, the results can be
inaccurate due to measurement error
▪ Lack of validity
▪ Lack of reliability
Need to estimate the level of measurement error
relative to the inferences being drawn
Terminology
Null hypothesis
▪ No differences between groups
▪ No relationships between variables
Level of significance
▪ Probability of being wrong in rejecting the null hypothesis
▪ Known as alpha (α)
Types of errors
▪ Type I - rejecting the null hypothesis when it is true
▪ Type II - not rejecting (i.e., accepting) the null hypothesis when it is not true
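A rough simulation sketch of what alpha means in practice, under stated assumptions: both groups are drawn from the same hypothetical population (so the null hypothesis is true by construction), the population SD is treated as known, and a z test stands in for the t-test to keep the example within the standard library. Every rejection in this setup is a Type I error, so the rejection rate should sit near alpha.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable
alpha = 0.05
# Two-tailed critical value from the standard normal distribution.
critical_z = statistics.NormalDist().inv_cdf(1 - alpha / 2)

def one_study(n=30, sd=10.0):
    """Simulate one study in which the null hypothesis is true."""
    group_a = [random.gauss(50, sd) for _ in range(n)]
    group_b = [random.gauss(50, sd) for _ in range(n)]
    standard_error = sd * (2 / n) ** 0.5  # SD assumed known (an assumption)
    z = (statistics.fmean(group_a) - statistics.fmean(group_b)) / standard_error
    return abs(z) > critical_z  # True means the null was wrongly rejected

trials = 4000
type_i_rate = sum(one_study() for _ in range(trials)) / trials
print(f"Type I error rate over {trials} simulated studies: {type_i_rate:.3f}")
```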
Hypothesis testing exemplified with a comparison of
experimental and control groups
The five stages of the process
▪ State the null hypothesis - no difference between the mean scores for the
experimental and control groups
▪ Assume the null hypothesis is true to establish a base from which the
statistician can work
▪ The base is actually the sampling distribution of the test statistic, in this case the
sampling distribution of the difference between two means, t
▪ Through statistical theory we can establish the characteristics of this sampling
distribution (i.e., mean; standard deviation, known as the standard error in this
situation; and shape)
The five stages of the process (continued)
▪ Calculate the observed difference between the mean scores for the
two groups
▪ Compare the observed difference between mean scores to the
sampling distribution of the test statistic
▪ Accept or reject the null hypothesis based on this comparison
▪ If the observed difference is typical of the sampling distribution, the null
hypothesis is likely true and it is accepted
▪ If the observed difference is atypical of the sampling distribution, the null
hypothesis is likely untrue and it is rejected.
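The five stages can be sketched in code as follows. This is an illustration under assumptions, not the chapter's own worked example: the scores are hypothetical, the pooled-variance form of the independent samples t statistic is used, and the critical value 2.101 is the standard two-tailed table value for df = 18 at alpha = .05.

```python
import math
import statistics

# Hypothetical final exam scores; not data from the chapter.
experimental = [78, 85, 90, 72, 88, 95, 81, 79, 92, 86]
control = [70, 75, 82, 68, 77, 85, 73, 71, 80, 74]

# Stages 1-2: the null hypothesis says the population mean difference is 0,
# so the sampling distribution of the difference is centered on 0.
n1, n2 = len(experimental), len(control)
df = n1 + n2 - 2

# Stage 3: observed difference between the two group means.
observed_difference = statistics.mean(experimental) - statistics.mean(control)

# Standard error of the difference, using the pooled variance.
pooled_variance = (
    (n1 - 1) * statistics.variance(experimental)
    + (n2 - 1) * statistics.variance(control)
) / df
standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

# Stage 4: locate the observed difference in the sampling distribution.
t_statistic = observed_difference / standard_error

# Stage 5: decision, using the table value for df = 18, alpha = .05, two-tailed.
critical_t = 2.101
reject_null = abs(t_statistic) > critical_t
print(f"difference = {observed_difference:.2f}, "
      f"t({df}) = {t_statistic:.2f}, reject null: {reject_null}")
```

With these made-up scores the observed difference is atypical of the sampling distribution, so the null hypothesis is rejected.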
Issues related to statistical and practical
significance
▪ Statistical significance
▪ The typical or atypical nature of the comparison of the observed
difference to the sampling distribution can be estimated using
statistical theory
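The estimate is the probability of obtaining a difference at least this
extreme if the null hypothesis is true; it is commonly read as the
probability of being wrong in rejecting the null hypothesis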
It is stated as p = x where x is the specific probability of the
comparison (e.g., p = .001, p = .042, p = .56) or as p < y where y is
the alpha level (e.g., .10, .05, .01)
▪ There is always the possibility of making a mistake given that this is
based on a probability model
▪ Type I error - deciding to reject the null hypothesis when in reality it is true
▪ Type II error - accepting the null hypothesis when in reality it is false
▪ Typical levels of significance in education - .10, .05, and .01
▪ Factors affecting the level of significance
▪ The actual differences between the groups
▪ The degree to which sampling and measurement errors exist
▪ The size of the sample
Practical significance
▪ Practical significance is related to the importance and
usefulness of the results
▪ Estimates of practical significance
▪ For correlations the coefficient of determination (i.e., r²) is used
▪ For comparisons an effect size is used
Effect size is the difference between two group means in terms
of the control group standard deviation
Evaluating effect sizes – small (.30), moderate (.50), and large
(.75)
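A short sketch of both estimates of practical significance, assuming hypothetical scores. The effect size follows the definition above (mean difference divided by the control group standard deviation). The `describe` helper and its treatment of .30, .50, and .75 as lower cutoffs are assumptions made for illustration; the slide lists them only as benchmarks.

```python
import statistics

# Hypothetical scores; not data from the chapter.
experimental = [78, 85, 90, 72, 88, 95, 81, 79, 92, 86]
control = [70, 75, 82, 68, 77, 85, 73, 71, 80, 74]

# Effect size: mean difference expressed in control group standard deviations.
effect_size = (
    statistics.mean(experimental) - statistics.mean(control)
) / statistics.stdev(control)

def describe(es):
    """Label an effect size; using the benchmarks as cutoffs is an assumption."""
    size = abs(es)
    if size >= 0.75:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.30:
        return "small"
    return "below the small benchmark"

# Coefficient of determination for a correlation of +.63.
r = 0.63
r_squared = r ** 2

print(f"effect size = {effect_size:.2f} ({describe(effect_size)})")
print(f"r = {r}, r squared = {r_squared:.4f}")
```

Read this way, a correlation of +.63 accounts for roughly 40% of the variance shared by the two variables.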
Each consumer of the research must judge the
balance between the statistical significance
and the practical significance of the statistical
results given the context in which the results
might be used
Two types of inferential tests
Parametric - inferential procedures using
interval or ratio level data
Non-parametric - inferential procedures using
nominal or ordinal data
T-test
A comparison of the means for two groups
▪ Do the mean scores on the final exam differ for the
experimental and control groups?
▪ Independent samples t-test - compares the means of
two separate groups on one variable
▪ Posttest means for Group 1 and Group 2
▪ Dependent samples t-test - compares the means of two
variables for one group
▪ Pre-test and posttest means for Group 1
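A minimal sketch of the dependent samples case, assuming hypothetical pretest and posttest scores for one group of eight students. Because each student contributes both scores, the test works on the per-student differences rather than on two separate groups.

```python
import math
import statistics

# Hypothetical paired scores for the same eight students.
pretest = [60, 65, 70, 58, 72, 68, 75, 62]
posttest = [66, 70, 74, 65, 75, 75, 80, 64]

# One difference score per student (posttest minus pretest).
differences = [post - pre for pre, post in zip(pretest, posttest)]
n = len(differences)
df = n - 1

mean_difference = statistics.mean(differences)
# Standard error of the mean difference.
standard_error = statistics.stdev(differences) / math.sqrt(n)
t_statistic = mean_difference / standard_error

print(f"mean gain = {mean_difference:.3f}, t({df}) = {t_statistic:.2f}")
```

The resulting t is then compared with the critical value for df = 7 (2.365 at alpha = .05, two-tailed, from a standard t table). Where SciPy is available, `scipy.stats.ttest_rel` and `scipy.stats.ttest_ind` compute the dependent and independent versions along with p values.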
T-test (continued)
A determination of whether a relationship exists
▪ Does a correlation of +.63 between students’ math
attitudes and math achievement indicate a relationship
exists between these two variables?
▪ Correlation t-test - compares the magnitude of the
difference between a correlation coefficient and 0.00
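A sketch of the correlation t-test using the standard formula t = r·sqrt(n - 2) / sqrt(1 - r²) with df = n - 2. The correlation of +.63 comes from the example above; the sample sizes of 30 and 5 are hypothetical, chosen to show how much the answer depends on the number of students.

```python
import math

def correlation_t(r, n):
    """t statistic for testing whether a correlation differs from 0.00 (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.63
t_large_sample = correlation_t(r, 30)  # hypothetical n = 30, df = 28
t_small_sample = correlation_t(r, 5)   # hypothetical n = 5, df = 3

print(f"n = 30: t(28) = {t_large_sample:.2f}")
print(f"n = 5:  t(3)  = {t_small_sample:.2f}")
```

Against standard table values at alpha = .05, two-tailed (2.048 for df = 28 and 3.182 for df = 3), the same correlation of +.63 is significant with 30 students but not with 5.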
Analysis of variance (ANOVA)
A comparison of the means for two or more
groups
Omnibus ANOVA - a procedure that indicates
whether one or more pairs of means are different
Do the mean scores differ for the groups using cooperative group, lecture, or web-based
instruction?
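A minimal sketch of the omnibus F statistic for the three instruction groups, assuming hypothetical scores with five students per group. F is the ratio of between-group variability to within-group variability; the critical value 3.89 is the standard table value for df = (2, 12) at alpha = .05.

```python
import statistics

# Hypothetical scores; not data from the chapter.
groups = {
    "cooperative": [82, 85, 88, 90, 86],
    "lecture": [75, 78, 80, 72, 77],
    "web-based": [84, 80, 79, 83, 81],
}

all_scores = [score for scores in groups.values() for score in scores]
grand_mean = statistics.mean(all_scores)

# Between-group sum of squares: group means around the grand mean.
ss_between = sum(
    len(scores) * (statistics.mean(scores) - grand_mean) ** 2
    for scores in groups.values()
)
# Within-group sum of squares: scores around their own group mean.
ss_within = sum(
    (score - statistics.mean(scores)) ** 2
    for scores in groups.values()
    for score in scores
)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)
f_statistic = (ss_between / df_between) / (ss_within / df_within)

critical_f = 3.89  # table value for df = (2, 12), alpha = .05
print(f"F({df_between}, {df_within}) = {f_statistic:.2f}, "
      f"significant: {f_statistic > critical_f}")
```

A significant omnibus F says only that at least one pair of means differs; it does not say which pair.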
ANOVA (continued)
Multiple comparisons (i.e., post-hoc)
▪ Procedures that indicate which specific pairs of means are different as a
follow-up to a significant omnibus ANOVA result
▪ Do the mean scores differ between cooperative group and lecture, cooperative group and web-based, and lecture and web-based instruction?
▪ Two common tests
▪ Tukey
▪ Scheffé
Factorial ANOVA
A procedure that analyzes the difference between groups across two
or more independent variables
Do the mean scores differ for cooperative group, lecture, and web-based instruction for males and females?
Effects
▪ Main effects - differences between the levels of each independent variable
▪ Interaction effects - differences between combinations of the levels of each
independent variable
Analysis of covariance (ANCOVA)
A procedure that compares means after
statistically adjusting them for pretest differences
between groups
Very stringent assumptions must be met to
use this procedure
Adjusts for small to moderate - not large - pretest
differences
Multivariate statistics
Comparisons or relationships involving two or more dependent
variables
Comparison of means
▪ Are there differences in the attitudes and performances of students being
taught with lecture or web-based instruction?
▪ Specific tests
▪ Multivariate ANOVA (MANOVA)
▪ Multivariate ANCOVA (MANCOVA)
▪ Hotelling’s T²
Multivariate statistics (continued)
Relationships
▪ Are students’ affective traits (e.g., attitudes, self-esteem, preferences) predictive of their
knowledge (i.e., test scores) and skills (i.e.,
performances)?
▪ Canonical correlation
Chi-square - differences in frequencies across
different categories
Do mothers and fathers differ in their support of a
year-round school calendar?
Do the percentages of undergraduate, graduate,
and doctoral students differ in terms of their
support for the new class attendance policy?
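A sketch of the chi-square statistic for the first question, assuming a hypothetical 2 x 2 table of counts. Each expected count is what the cell would hold if support were unrelated to being a mother or father; the statistic sums the squared gaps between observed and expected counts, each divided by the expected count.

```python
# Hypothetical counts; rows are mothers and fathers, columns are support and oppose.
observed = [
    [40, 20],  # mothers
    [25, 35],  # fathers
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count if parent type and support were unrelated.
        expected = row_totals[i] * column_totals[j] / total
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 3.841  # table value for df = 1, alpha = .05
print(f"chi-square({df}) = {chi_square:.2f}, "
      f"significant: {chi_square > critical_value}")
```

No continuity correction is applied in this sketch. Where SciPy is available, `scipy.stats.chi2_contingency` performs the same test and returns a p value.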
Comparison of means
Mann Whitney U-test
Wilcoxon test
Kruskal-Wallis ANOVA
Relationships
Spearman r
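As an illustration of how these tests work on ordinal information, here is a minimal sketch of the Spearman correlation. The scores are hypothetical, and the shortcut formula 1 - 6Σd² / (n(n² - 1)) assumes no tied values. Raw scores are converted to ranks, and only the ranks enter the calculation.

```python
def ranks(values):
    """Rank from 1 (lowest) to n (highest); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(value) + 1 for value in values]

# Hypothetical attitude and achievement scores for six students.
attitude = [12, 15, 9, 20, 17, 11]
achievement = [55, 70, 50, 80, 72, 65]

attitude_ranks = ranks(attitude)
achievement_ranks = ranks(achievement)

n = len(attitude)
# Sum of squared differences between each student's two ranks.
sum_squared_differences = sum(
    (a - b) ** 2 for a, b in zip(attitude_ranks, achievement_ranks)
)
spearman_r = 1 - 6 * sum_squared_differences / (n * (n ** 2 - 1))
print(f"Spearman r = {spearman_r:.3f}")
```

With real data that contain ties, a library routine such as `scipy.stats.spearmanr` is the safer choice.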
Basic descriptive statistics are needed to
evaluate the inferential results
Inferential analyses report statistical
significance, not practical significance
Inferential analyses do not indicate internal or
external validity
The results depend on sample sizes
The appropriate statistical procedures are
used
The level of significance is interpreted
correctly
Caution is used to interpret nonparametric results from studies with few
subjects in one or more groups or
categories