Educational Research:
Data analysis and interpretation – 1
Descriptive statistics
EDU 8603
Educational Research
Richard M. Jacobs, OSA, Ph.D.
Statistics...

A set of mathematical procedures for
describing, synthesizing, analyzing,
and interpreting quantitative data
…the selection of an appropriate
statistical technique is determined by
the research design, hypothesis, and
the data collected
Preparing data for analysis...
Data must be accurately scored and
systematically organized to facilitate
data analysis:
• scoring: assigning a total to each
participant’s instrument
• tabulating: organizing the data in a
systematic manner
• coding: assigning numerals (e.g., ID)
to data

descriptive statistics...
…permit the researcher to describe
many pieces of data with a few
indices

statistics...
…indices calculated by the researcher
for a sample drawn from a population

parameters...
…indices calculated by the researcher
for an entire population
Types of descriptive statistics…
1. graphs
2. measures of central tendency
3. measures of variability

graphs...
…representations of data enabling the
researcher to see what the
distribution of scores looks like
1. Graphs…




frequency polygon
pie chart
boxplot
stem-and-leaf chart

measures of central tendency...
…indices enabling the researcher to
determine the typical or average
score of a group of scores
2. Measures of central tendency…



mode
median
mean

mode...
…the score attained by more
participants than any other score

median...
…the point in a distribution above and
below which are 50% of the scores

mean...
…the arithmetic average of the scores
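
As a minimal sketch of the three indices defined above, the following uses Python's standard `statistics` module. The list `scores` is hypothetical data invented for illustration; it does not come from the slides.

```python
import statistics

# Hypothetical scores for nine participants (illustrative data only).
scores = [70, 75, 75, 80, 85, 85, 85, 90, 95]

mode = statistics.mode(scores)      # score attained by the most participants
median = statistics.median(scores)  # point with half the scores on each side
mean = statistics.mean(scores)      # sum of the scores divided by their count

print("mode:", mode)
print("median:", median)
print("mean:", round(mean, 2))
```

With these particular scores the mode and median coincide while the mean is slightly lower, which shows that the three indices need not agree.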

measures of variability...
…indices enabling the researcher to
indicate how spread out a group of
scores is
3. Measures of variability…




range
quartile deviation
variance
standard deviation

range...
…the difference between the highest
and lowest score in a distribution

quartile deviation...
…one half of the difference between
the upper quartile (the 75th percentile) and
the lower quartile (the 25th percentile) in a
distribution

variance...
…a summary statistic indicating the
degree of variability among
participants for a given variable

standard deviation...
…the square root of variance providing
an index of variability in the
distribution of scores
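
The four measures of variability above can be sketched with Python's standard `statistics` module. The `scores` list is hypothetical. Two choices here are assumptions rather than anything the slides specify: the population formulas (`pvariance`, `pstdev`) are used, and the quartiles come from the default method of `statistics.quantiles`. Other quartile conventions can give slightly different values.

```python
import statistics

# Hypothetical scores for nine participants (illustrative data only).
scores = [70, 75, 75, 80, 85, 85, 85, 90, 95]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, _q2, q3 = statistics.quantiles(scores, n=4)
quartile_deviation = (q3 - q1) / 2

# Population variance and its square root, the standard deviation.
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

print("range:", score_range)
print("quartile deviation:", quartile_deviation)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```

If the scores are treated as a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` (which divide by n - 1) would be the usual substitutes.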
Normal distributions of data
(the normal curve)...
A bell-shaped distribution of scores
having four identifiable properties…
…50% of the scores fall above the
mean and 50% of the scores fall
below the mean
…the mean, median, and mode are the
same value
…most scores are near the mean and,
the farther from the mean a score is,
the fewer the number of participants
who attained that score
…the same number, or percentage, of
scores is between the mean and plus
one standard deviation as is between
the mean and minus one standard
deviation
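
The symmetry property above can be checked numerically. This is a sketch using `statistics.NormalDist` from the Python standard library; the standard normal curve (mean 0, standard deviation 1) is chosen only for convenience.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
curve = NormalDist(mu=0, sigma=1)

# Proportion of scores between the mean and +1 SD, and between -1 SD and the mean.
above = curve.cdf(1) - curve.cdf(0)
below = curve.cdf(0) - curve.cdf(-1)

# Proportion of scores within one and within two standard deviations of the mean.
within_one_sd = curve.cdf(1) - curve.cdf(-1)
within_two_sd = curve.cdf(2) - curve.cdf(-2)

print("mean to +1 SD:", round(above, 4))
print("-1 SD to mean:", round(below, 4))
print("within 1 SD:", round(within_one_sd, 4))
print("within 2 SD:", round(within_two_sd, 4))
```

The two one-sided proportions match, about 68% of scores lie within one standard deviation of the mean, and about 95% lie within two.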
Non-normal distributions of data
(skewed distributions)...
A non-bell-shaped distribution of scores
where…
…mean < median < mode
(a “negatively skewed distribution”)
…mean > median > mode
(a “positively skewed distribution”)
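
A small sketch of the two orderings above, using two hypothetical data sets built for illustration. The ordering of mean, median, and mode is a common rule of thumb for skewed data; it is not guaranteed for every skewed distribution.

```python
import statistics

# Hypothetical scores with a few extreme HIGH values (positive skew).
positive = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

# Mirror image of the same scores: a few extreme LOW values (negative skew).
negative = [20 - s for s in positive]

pos_mean = statistics.mean(positive)
pos_median = statistics.median(positive)
pos_mode = statistics.mode(positive)

neg_mean = statistics.mean(negative)
neg_median = statistics.median(negative)
neg_mode = statistics.mode(negative)

print("positive skew:", pos_mean, ">", pos_median, ">", pos_mode)
print("negative skew:", neg_mean, "<", neg_median, "<", neg_mode)
```

In both cases the mean is pulled toward the extreme scores, while the mode stays where most scores cluster.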

measures of relative position...
…indices enabling the researcher to
describe a participant’s performance
compared to the performance of all
other participants
4. Measures of relative position…


percentile ranks
standard scores

percentile rank...
…indicates the percentage of scores
that fall at or below a given score
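
A minimal sketch of the "at or below" definition above. The function name `percentile_rank` and the `scores` list are hypothetical, introduced only for illustration.

```python
def percentile_rank(scores, score):
    """Percentage of scores that fall at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


# Hypothetical scores for nine participants (illustrative data only).
scores = [70, 75, 75, 80, 85, 85, 85, 90, 95]

pr_80 = percentile_rank(scores, 80)
pr_95 = percentile_rank(scores, 95)

print("percentile rank of 80:", round(pr_80, 1))
print("percentile rank of 95:", round(pr_95, 1))
```

Under this definition the highest score in a group always has a percentile rank of 100.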

standard score...
…a measure of relative position

Types of standard scores...
…z score
…T score
…stanines

z score...
…a statistic expressing how far a score
is from the mean in terms of standard
deviation units

T score...
…a transformed z score that avoids
negative numbers and decimals by
multiplying the z score by 10 and
adding 50
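
The z score and T score definitions above can be sketched as follows. The `scores` list is hypothetical, and using the population standard deviation (`pstdev`) is an assumption; the slides do not say which formula is intended.

```python
import statistics

# Hypothetical scores for nine participants (illustrative data only).
scores = [70, 75, 75, 80, 85, 85, 85, 90, 95]

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # assumption: population standard deviation

# z score: distance from the mean in standard deviation units.
z_scores = [(s - mean) / sd for s in scores]

# T score: multiply the z score by 10 and add 50.
t_scores = [10 * z + 50 for z in z_scores]

for s, z, t in zip(scores, z_scores, t_scores):
    # T scores are typically reported as whole numbers, hence round(t).
    print(s, round(z, 2), round(t))
```

Transformed this way, the z scores have a mean of 0 and a standard deviation of 1, and the T scores have a mean of 50 and a standard deviation of 10.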

stanines...
…a standard score that divides a
distribution into nine parts
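
One common way to obtain a stanine from a z score is to compute 2z + 5, round to the nearest whole number, and limit the result to the range 1 to 9. The sketch below assumes that convention, which the slides do not spell out; the function name `stanine_from_z` is hypothetical.

```python
import math


def stanine_from_z(z):
    """Convert a z score to a stanine (1 to 9), assuming stanine = 2z + 5 rounded."""
    rounded = math.floor(2 * z + 5 + 0.5)  # round half up to a whole number
    return max(1, min(9, rounded))         # keep the result between 1 and 9


for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(z, stanine_from_z(z))
```

A score exactly at the mean (z = 0) falls in stanine 5, the middle of the nine parts.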

measures of relationship...
…indices enabling the researcher to
indicate the degree to which two sets
of scores are related
5. Measures of relationship…


Spearman Rho
Pearson r

correlations
…determines whether and to what
degree a relationship exists between
two or more quantifiable variables
…the degree of the relationship is
expressed as a coefficient of
correlation
…the presence of a correlation does
not indicate a cause-effect
relationship primarily because of the
possibility of multiple confounding
factors
Correlation coefficient…
-1.00: strong negative
0.00: no relationship
+1.00: strong positive

Spearman Rho...
…a measure of correlation used for
rank and ordinal data

Pearson r...
…a measure of correlation used for
data of interval or ratio scales
…assumes that the relationship
between the variables being
correlated is linear
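
To illustrate the difference between the two coefficients, here is a sketch in plain Python. Spearman Rho is computed as Pearson r applied to the ranks of the scores. The function names and the `hours` and `marks` data are hypothetical, chosen so that the relationship is perfectly ordered but not linear.

```python
import math


def pearson_r(x, y):
    """Pearson r: covariation of x and y scaled by their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def spearman_rho(x, y):
    """Spearman Rho: Pearson r computed on ranks instead of raw scores."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: marks always rise with hours, but along a curve.
hours = [1, 2, 3, 4, 5]
marks = [1, 8, 27, 64, 125]

print("Pearson r:", round(pearson_r(hours, marks), 3))
print("Spearman Rho:", round(spearman_rho(hours, marks), 3))
```

Because the ranks agree perfectly, Spearman Rho is 1.0, while Pearson r falls short of 1.0 because the relationship is not linear.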
Mini-Quiz…

True and false…
…the analysis of the data is as
important as any other component
of the research process
True

True and false…
…descriptive statistics are normally
computed separately for each
group in a research study
True

True and false…
…every instrument administered
must always be scored accurately
and consistently, using the same
procedures and criteria
True

True and false…
…tentative scoring procedures must
always be tried out beforehand by
administering the instrument to the
study participants
False

True and false…
…a computer should not be used to
perform an analysis that a
researcher has never completed by
hand or, at least, studied
extensively
True

True and false…
…the first step in data analysis is to
describe, or summarize, the data
using descriptive statistics
True

True and false…
…the number resulting from the
computation of a measure of central
tendency represents the typical
score attained by a group of
participants
True

True and false…
…the mean is the most precise,
stable index of typical performance
that is especially useful in
situations in which there are
extreme scores
False

True and false…
…unless a correlation coefficient is
used to compute the reliability of
an instrument in a causal-comparative or experimental study,
a correlation coefficient is only
computed in a correlation study
True

True and false…
…plus and/or minus two standard
deviations includes more than 99%
of the scores
False

True and false…
…standard scores are rarely used in
research studies
True

True and false…
…to test a hypothesis adequately,
more than descriptive statistics are
normally needed
True

True and false…
…if the extreme scores are at the
upper, or higher, end of the
distribution, it is said to be
positively skewed
True

True and false…
…the median of a set of scores
corresponds to the 50th percentile
True

True and false…
…a standard score is a measure of
relative position that is appropriate
when the data represent a nominal
scale
False

True and false…
…a z score expresses how far a
score is from the mean in terms of
standard deviation units
True

True and false…
…the Spearman Rho is the
appropriate measure of correlation
when the variables are expressed
as ranks instead of scores
True

True and false…
…the assumption associated with the
application of Pearson r is that the
relationship between the variables
being correlated is linear
True

Fill in the blank…
…statistics which permit the
researcher to describe many scores
with a small number of indices
descriptive statistics

Fill in the blank…
…the values calculated for a sample
drawn from a population
statistics

Fill in the blank…
…the values calculated for an entire
population
parameters

Fill in the blank…
…a convenient way to describe a set of
data with a single number
measures of central tendency

Fill in the blank…
…the index of central tendency
appropriate for nominal data
mode

Fill in the blank…
…the index of central tendency
appropriate for ordinal data
median

Fill in the blank…
…the index of central tendency
appropriate for interval or ratio data
mean

Fill in the blank…
…the score attained by more
participants than any other score
mode

Fill in the blank…
…the point in a distribution above and
below which are 50% of the scores
median

Fill in the blank…
…the arithmetic average of the scores
mean

Fill in the blank…
…the difference between the highest
and lowest score in a distribution
range

Fill in the blank…
…the measure of variability identifying
one half of the difference between
the 75th percentile and the 25th
percentile
quartile deviation

Fill in the blank…
…the measure of variability used for
interval and ratio data
standard deviation

Fill in the blank…
…the only appropriate measure of
variability for nominal data
range

Fill in the blank…
…in a normal distribution, the mean
+/- 1.00 standard deviation includes
about ____ % of the scores
68%

Fill in the blank…
…extreme scores at the lower end of
the distribution indicate a ______
skewed distribution
negatively

Fill in the blank…
…indices describing where a score is
in relation to all other scores
measures of relative position

Fill in the blank…
…indicates the percentage of scores
that fall at or below a given score
percentile ranks

Fill in the blank…
…if a set of scores is transformed into
a set of z scores, the new
distribution has a mean of ____ and
a standard deviation of ____
zero; one

Fill in the blank…
…a set of standard scores that divide a
distribution into nine parts
stanines

Fill in the blank…
…the most appropriate measure of
correlation when the sets of data to
be correlated represent either
interval or ratio scales
Pearson r
This module has focused on...
descriptive statistics
...the statistical procedures for
describing, synthesizing, analyzing,
and interpreting quantitative data
The next module will focus on...
inferential statistics
...the statistical procedures for
generalizing to a population of
individuals based on information
obtained from a limited number of
research participants