Chapter 2
Describing and Presenting a
Distribution of Scores
©2013, The McGraw-Hill Companies, Inc. All Rights Reserved
Chapter Objectives
After completing this chapter, you should be able to
1. Define all statistical terms that are presented.
2. Describe the four scales of measurement and provide
examples of each.
3. Describe a normal distribution and four curves for
distributions that are not normal.
4. Define the terms measures of central tendency and
measures of variability.
5. Define the three measures of central tendency, identify the
symbols used to represent them, describe their
characteristics, calculate them with ungrouped
data, and state how they can be used to interpret data.
6. Define the three measures of variability, identify the
symbols used to represent them, describe their
characteristics, calculate them with ungrouped data, and
state how they can be used to interpret data.
7. Describe the relationship of the standard deviation and
the normal curve.
8. Define percentile and percentile rank, identify the
symbols used to represent them, calculate them with
ungrouped data, and state how they can be used to
interpret data.
9. Define standard scores, calculate z-scores, and interpret
their meanings.
Statistical Terms
data
variable
population
sample
random sample
parameter
statistic (as contrasted
with statistics; page 1 of
textbook)
descriptive statistics
inferential statistics
discrete data
continuous data
ungrouped data
Scales of Measurement
(See Table 2.1)
Nominal Scale - lowest and most elementary; also
called categorical; naming level only; no
comparisons of categories
Ordinal Scale - order or rank; no indication of how
much better one score is than another
Interval Scale - order or rank; same distance exists
between each division; no true zero
Ratio Scale - possesses all characteristics of interval
scale and has true zero point
Normal Distribution
(See Figure 2.1)
Most statistical methods are based on assumption
that a distribution of scores is normal and that the
distribution can be graphically represented by the
normal curve (bell-shaped).
Normal distribution is theoretical and is based on the
assumption that the distribution contains an infinite
number of scores.
Characteristics of Normal Curve
• Bell-shaped curve
• Symmetrical distribution about vertical axis
of curve
• Greatest number of scores found in middle
of curve
• All measures of central tendency at vertical
axis
[Figure: normal curve with the mean, median, and mode labeled]
Different Curves
(see Figure 2.2)
• leptokurtic – very similar in ability;
homogeneous group
• platykurtic – wide range of ability;
heterogeneous group
• bimodal - two high points
• skewed - scores clustered at one end; positive
or negative
Analysis of Ungrouped Data
• Better understanding of data
• Interpret data
Score Rank
1. List scores in descending order.
2. Number the scores; highest score is number 1
and last score is the number of the total number
of scores.
3. Average rank of identical scores and assign
them the same rank (may determine the
midpoint and assign that rank).
Table 2.2 Rank of Volleyball Knowledge Test Scores
Position   Rank   Score
 1          1      96
 2          2      95
 3          3      93
 4          4.5    92
 5          4.5    92
 6          6.5    91
 7          6.5    91
 8          9      90
 9          9      90
10          9      90
11         12      89
12         12      89
13         12      89
14         16      88
15         16      88
16         16      88
17         16      88
18         16      88
19         20      87
20         20      87
21         20      87
22         22.5    86
23         22.5    86
24         24.5    85
25         24.5    85
26         26.5    84
27         26.5    84
28         28      83
29         29      82
30         30      81
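The three ranking steps can be sketched in Python. This is only an illustration and not part of the original slides: the function name `average_ranks` is hypothetical, and the scores are the 30 values listed in Table 2.2.

```python
from collections import defaultdict

# The 30 volleyball knowledge test scores from Table 2.2.
scores = [96, 95, 93, 92, 92, 91, 91, 90, 90, 90,
          89, 89, 89, 88, 88, 88, 88, 88, 87, 87,
          87, 86, 86, 85, 85, 84, 84, 83, 82, 81]

def average_ranks(values):
    """Map each score to its rank; tied scores share the mean of their positions."""
    ordered = sorted(values, reverse=True)                  # step 1: descending order
    positions = defaultdict(list)
    for position, score in enumerate(ordered, start=1):     # step 2: number the scores
        positions[score].append(position)
    # step 3: average the positions of identical scores
    return {score: sum(p) / len(p) for score, p in positions.items()}

ranks = average_ranks(scores)
print(ranks[92], ranks[88], ranks[81])  # 4.5 16.0 30.0
```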
Measures of Central Tendency
• descriptive statistics
• describe the middle characteristics of the
data (distribution of scores); represent
scores in a distribution around which other
scores seem to center
• most widely used statistics
• mean, median, and mode
Mean
The arithmetic average of a distribution of scores; most
generally used measure of central tendency.
Characteristics
• Most sensitive of all measures of central tendency
• Most appropriate measure of central tendency to
use for ratio data (may be used on interval data)
• Considers all information about the data and is used
to perform other statistical calculations
• Influenced by extreme scores, especially if the
distribution is small
Symbols Used to Calculate Mean
$\bar{X}$ = the mean (called X-bar)
$\Sigma$ = (Greek letter sigma) = “the sum of”
X = individual score
N = the total number of scores in distribution
Mean Formula: $\bar{X} = \frac{\Sigma X}{N}$
Table 2.3: $\bar{X} = \frac{2644}{30} = 88.1$
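As a quick check of the mean formula, the sketch below sums the 30 scores listed in Table 2.2, whose total matches the ΣX = 2644 quoted for Table 2.3. The variable names are illustrative assumptions.

```python
# The 30 scores listed in Table 2.2.
scores = [96, 95, 93, 92, 92, 91, 91, 90, 90, 90,
          89, 89, 89, 88, 88, 88, 88, 88, 87, 87,
          87, 86, 86, 85, 85, 84, 84, 83, 82, 81]

sum_x = sum(scores)   # sum of all scores
n = len(scores)       # N, the number of scores
mean = sum_x / n      # mean = sum of X divided by N

print(sum_x, n, round(mean, 1))  # 2644 30 88.1
```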
Median
Score that represents the exact middle of the distribution;
the fiftieth percentile; the score that 50% of the scores are
above and 50% of the scores are below.
Characteristics
• Not affected by extreme scores.
• A measure of position.
• Not used for additional statistical calculations.
• Represented by Mdn or P50.
Steps In Calculation of Median
1. Arrange the scores in ascending order.
2. Multiply N by .50.
3. If the number of scores is odd, P50 is the middle score
of the distribution.
4. If the number of scores is even, P50 is the arithmetic
average of the two middle scores of the distribution.
Table 2.3: .50(30) = 15
Fifteenth and sixteenth scores are 88
P50 = 88
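The odd/even rule for the median can be written as a short function. This is a sketch under the assumption that the scores are those listed in Table 2.2; the function name `median` is illustrative.

```python
# The 30 scores listed in Table 2.2.
scores = [96, 95, 93, 92, 92, 91, 91, 90, 90, 90,
          89, 89, 89, 88, 88, 88, 88, 88, 87, 87,
          87, 86, 86, 85, 85, 84, 84, 83, 82, 81]

def median(values):
    """Middle score for odd N; average of the two middle scores for even N."""
    ordered = sorted(values)          # ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median(scores))  # 88.0 (15th and 16th scores are both 88)
```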
Mode
Score that occurs most frequently; may have more than
one mode.
Characteristics
• Least used measure of central tendency.
• Not used for additional statistics.
• Not affected by extreme scores.
Table 2.3: Mode = 88
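Because a distribution may have more than one mode, a sketch of the calculation should return every score that ties for the highest frequency. The function name `modes` is an assumption, and the data are the scores listed in Table 2.2.

```python
from collections import Counter

# The 30 scores listed in Table 2.2.
scores = [96, 95, 93, 92, 92, 91, 91, 90, 90, 90,
          89, 89, 89, 88, 88, 88, 88, 88, 87, 87,
          87, 86, 86, 85, 85, 84, 84, 83, 82, 81]

def modes(values):
    """Return all scores that occur with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(score for score, count in counts.items() if count == highest)

print(modes(scores))  # [88]
```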
Which Measure of Central Tendency is
Best for Interpretation of Test Results?
(See Figure 2.3)
• Mean, median, and mode are the same for a
normal distribution, but test scores often will not
form a normal curve.
• The farther away from the mean and median the
mode is, the less normal the distribution.
• The mean and median are both useful measures.
• In most testing, the mean is the most reliable and
useful measure of central tendency; it is also used
in many other statistical procedures.
Measures of Variability
To provide a more meaningful interpretation of data,
you need to know how the scores spread.
Variability - the spread, or scatter, of scores; terms
dispersion and deviation often used
With the measures of variability, you can determine the
amount that the scores spread, or deviate, from the
measures of central tendency.
Descriptive statistics; reported with measures of central
tendency
Range
Determined by subtracting the lowest score from the
highest score; represents only the extreme scores.
Characteristics
1. Dependent on the two extreme scores.
2. Least useful measure of variability.
Formula: R = Hx - Lx
Table 2.3: R = 96 - 81 = 15
Quartile Deviation
Sometimes called semiquartile range; is the spread of
middle 50% of the scores around the median. Extreme
scores will not affect the quartile deviation.
Characteristics
1. Uses the 75th and 25th percentiles; difference between
these two percentiles is referred to as the interquartile
range.
2. Indicates the amount that needs to be added to, and
subtracted from, the median to include the middle
50% of the scores.
3. Usually not used in additional statistical calculations.
Quartile Deviation
Symbols
Q = quartile deviation
Q1 = 25th percentile or first quartile (P25) = score
in which 25% of scores are below and 75%
of scores are above
Q3 = 75th percentile or third quartile (P75) =
score in which 75% of scores are below
and 25% of scores are above
Steps for Calculation of Q3
1. Arrange scores in ascending order.
2. Multiply N by .75 to find 75% of the distribution.
3. Count up from the bottom score to the number
determined in step 2. Approximation and interpolation
may be required.
Steps for Calculation of Q1
1. Multiply N by .25 to find 25% of the distribution.
2. Count up from the bottom score to the number
determined in step 1.
To Calculate Q
Substitute values in formula: $Q = \frac{Q_3 - Q_1}{2}$
Quartiles
Q1 = 25%
Q2 = 50%
Q3 = 75%
Q4 = 100%
Q2 - Q1 = range of scores below median
Q3 - Q2 = range of scores above median
Table 2.3:
1. .75(30) = 22.5; twenty-second score = 90; twenty-third
score = 90; midway between the two scores is the
same score
Q3 (75%) = 90
2. .25(30) = 7.5; seventh score = 85; eighth score = 86;
midway between the two scores = 85.5
Q1 (25%) = 85.5
3. $Q = \frac{90 - 85.5}{2} = \frac{4.5}{2} = 2.25$
Table 2.3 (median = 88):
88 + 2.25 = 90.25
88 - 2.25 = 85.75
Theoretically, middle 50% of scores fall between the
scores of 85.75 and 90.25.
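The worked example can be reproduced with a short sketch. The helper `midpoint_at` is hypothetical and mirrors only the approach shown here (take the midpoint of the two scores on either side of position p × N); other quantile conventions exist and may give slightly different values. The data are the scores listed in Table 2.2.

```python
# The 30 scores listed in Table 2.2.
scores = [96, 95, 93, 92, 92, 91, 91, 90, 90, 90,
          89, 89, 89, 88, 88, 88, 88, 88, 87, 87,
          87, 86, 86, 85, 85, 84, 84, 83, 82, 81]

def midpoint_at(ordered, proportion):
    """Midpoint of the two scores on either side of position proportion * N.

    Assumption: this follows the worked example only (22.5 -> 22nd and 23rd
    scores) and needs 1 <= int(proportion * N) < N.
    """
    lower = int(proportion * len(ordered))   # e.g. 22 for .75(30) = 22.5
    return (ordered[lower - 1] + ordered[lower]) / 2

ordered = sorted(scores)                     # ascending order
q3 = midpoint_at(ordered, 0.75)
q1 = midpoint_at(ordered, 0.25)
q = (q3 - q1) / 2                            # quartile deviation
mdn = midpoint_at(ordered, 0.50)             # median

print(q1, q3, q)           # 85.5 90.0 2.25
print(mdn - q, mdn + q)    # 85.75 90.25
```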
Standard Deviation
• Most useful and sophisticated measure of variability.
• Describes the scatter of scores around the mean.
• Is a more stable measure of variability than the range
or quartile deviation because it depends on the weight
of each score in the distribution.
• Lowercase Greek letter sigma (σ) is used to indicate the
standard deviation of a population; letter s is used
to indicate the standard deviation of a sample.
• Since you generally will be working with small samples,
the formula for determining the standard deviation will
include (N - 1) rather than N.
Characteristics of Standard Deviation
1. Is the square root of the variance, which is the average
of the squared deviations from the mean. Population
variance is represented as $\sigma^2$ and the sample variance is
represented as $s^2$.
2. Is applicable to interval and ratio data, includes all
scores, and is the most reliable measure of variability.
3. Is used with the mean. In a normal distribution, one
standard deviation added to the mean and one standard
deviation subtracted from the mean includes the middle
68.26% of the scores.
4. With most data, a relatively small standard deviation
indicates that the group being tested has little
variability (performed homogeneously). A relatively
large standard deviation indicates the group has much
variability (performed heterogeneously).
5. Is used to perform other statistical calculations.
Symbols used to determine the standard deviation:
s = standard deviation
X = individual score
$\bar{X}$ = mean
N = number of scores
Σ = sum of
d = deviation score ($X - \bar{X}$)
Calculation of Standard Deviation with $\Sigma X^2$
1. Arrange scores into a series.
2. Find $\Sigma X$.
3. Square each of the scores and add to determine $\Sigma X^2$.
4. Insert the values into the formula
$s = \sqrt{\frac{N\Sigma X^2 - (\Sigma X)^2}{N(N - 1)}}$
Table 2.3:
$\Sigma X$ = 2644
$\Sigma X^2$ = 233,398
N = 30
s = 3.6
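The raw-score formula can be checked against the quoted sums. The sketch below assumes the data are the 30 scores listed in Table 2.2, which reproduce the ΣX and ΣX² values given for Table 2.3; the variable names are illustrative.

```python
import math

# The 30 scores listed in Table 2.2.
scores = [96, 95, 93, 92, 92, 91, 91, 90, 90, 90,
          89, 89, 89, 88, 88, 88, 88, 88, 87, 87,
          87, 86, 86, 85, 85, 84, 84, 83, 82, 81]

n = len(scores)
sum_x = sum(scores)                    # sum of the scores
sum_x2 = sum(x * x for x in scores)    # sum of the squared scores

# Sample standard deviation: sqrt((N * sum_x2 - sum_x^2) / (N * (N - 1)))
s = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

print(sum_x, sum_x2, round(s, 1))  # 2644 233398 3.6
```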
Calculation of Standard Deviation with $\Sigma d^2$
1. Arrange the scores into a series.
2. Calculate $\bar{X}$.
3. Determine d and $d^2$ for each score; calculate $\Sigma d^2$.
4. Insert the values into the formula
$s = \sqrt{\frac{\Sigma d^2}{N - 1}}$
Table 2.4:
$\bar{X}$ = 88.1
$\Sigma d^2$ = 373.5
N = 30
s = 3.6
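The deviation-score method gives the same result as the raw-score method. The sketch below assumes the Table 2.2 scores and compares the hand formula with `statistics.stdev` from the Python standard library, which also divides by N - 1.

```python
import math
import statistics

# The 30 scores listed in Table 2.2.
scores = [96, 95, 93, 92, 92, 91, 91, 90, 90, 90,
          89, 89, 89, 88, 88, 88, 88, 88, 87, 87,
          87, 86, 86, 85, 85, 84, 84, 83, 82, 81]

n = len(scores)
mean = sum(scores) / n
deviations = [x - mean for x in scores]      # d = X - mean for each score
sum_d2 = sum(d * d for d in deviations)      # sum of squared deviations
s = math.sqrt(sum_d2 / (n - 1))              # sample standard deviation

print(round(sum_d2, 1), round(s, 1))         # 373.5 3.6
print(math.isclose(s, statistics.stdev(scores)))  # True
```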
Interpretation of Standard Deviation
in Tables 2.3 and 2.4
s = 3.6
$\bar{X}$ = 88.1
88.1 + 3.6 = 91.7
88.1 - 3.6 = 84.5
In a normal distribution, 68.26% of the scores would fall
between 84.5 and 91.7.
Relationship of Standard Deviation and Normal
Curve (See Figure 2.4)
Based on the probability of a normal distribution, there is
an exact relationship between the standard deviation and
the proportion of area and scores under the curve.
1. 68.26% of the scores will fall between +1.0 and -1.0
standard deviations.
2. 95.44% of the scores will fall between +2.0 and
-2.0 standard deviations.
3. 99.73% of the scores will fall between +3.0 and -3.0
standard deviations.
4. Generally, scores will not exceed +3.0 and -3.0
standard deviations from the mean.
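These proportions can be computed directly from the normal curve. The sketch below uses the error function from Python's `math` module; the function name `proportion_within` is an assumption. The figures 68.26% and 95.44% match what results from doubling four-decimal table areas (0.3413 and 0.4772), so a direct computation differs from them in the last digit.

```python
import math

def proportion_within(k):
    """Proportion of a normal distribution within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * proportion_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```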
Figure 2.4 Characteristics of normal curve.
60-sec Sit-up Test Administered to Two
Fitness Classes
Class 1: $\bar{X}$ = 32, s = 2
Class 2: $\bar{X}$ = 28, s = 4
Figure 2.5 compares the spread of the two distributions.
Individual A in Class 1 completed 34 sit-ups and individual
B completed 34 sit-ups in Class 2. Both individuals have
the same score, but do not have the same relationship to
their respective means and standard deviations. Figure 2.6
compares the individual performances.
Calculation of Percentile Rank through
Use of Mean and Standard Deviation.
1. Calculate the deviation of the score from the mean.
$d = X - \bar{X}$
2. Calculate the number of standard deviation units the
score is from the mean (z-score).
No. of standard deviation units from the mean = $\frac{d}{s}$
3. Use Table 2.5 to determine where the percentile rank
of the score is on the curve. If a negative value is found in
step 1, the percentile rank will always be less than 50.
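The three steps can be sketched for the sit-up example (a score of 34 in a class with mean 32 and s = 2, and in a class with mean 28 and s = 4). Table 2.5 is not reproduced in this transcript, so the sketch substitutes the normal cumulative distribution computed with `math.erf`; it therefore assumes the scores are normally distributed. The function name `percentile_rank` is hypothetical.

```python
import math

def percentile_rank(score, mean, sd):
    """Approximate percentile rank of a score, assuming a normal distribution."""
    d = score - mean                  # step 1: deviation from the mean
    z = d / sd                        # step 2: standard deviation units (z-score)
    # step 3: area under the normal curve below z, used in place of Table 2.5
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

class_1 = percentile_rank(34, mean=32, sd=2)   # z = 1.0
class_2 = percentile_rank(34, mean=28, sd=4)   # z = 1.5
print(round(class_1, 1), round(class_2, 1))    # 84.1 93.3
```

The same raw score of 34 lands at a higher percentile rank in Class 2 because it lies 1.5 standard deviations above that class mean rather than 1.0.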
Which Measure of Variability is Best for
Interpretation of Test Results?
1. Range is the least desirable.
2. The quartile deviation is more meaningful than the
range, but it considers only the middle 50% of
the scores.
3. The standard deviation considers every score, is the
most reliable, and is the most commonly used
measure of variability.
Percentiles and Percentile Ranks
Percentile - a point in a distribution of scores
below which a given percentage of scores fall.
Examples - 60th percentile and 40th percentile
Percentile rank - percentage of the total scores
that fall below a given score in a distribution;
determined by beginning with the raw scores
and calculating the percentile ranks for the
scores.
Weakness of Percentiles
1. The relative distances between percentile scores are the
same, but the relative distances between the observed
scores are not.
2. Since percentile scores are based on the number of scores
in a distribution rather than the size of the score obtained, it
is sometimes more difficult to increase a percentile score at
the ends of the scale than in the middle.
3. Average performers (in middle of distribution) need only a
small change in their raw scores to produce a large change
in their percentile scores.
4. Below average and above average performers (at ends of
distribution) need a large change in their raw scores to
produce even a small change in their percentile scores.
Frequency Distribution
• Serve to group data
• Scores listed in ascending or descending order
• Number of times each score occurs indicated
• Percent of times each score occurs indicated
• Cumulative percent (percent of scores below a
given score) indicated
See Table 2.6
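A frequency distribution with these columns can be built as follows. Table 2.6 is not reproduced in this transcript, so the sketch uses the scores listed in Table 2.2 as sample data; the function name `frequency_distribution` and the row layout are assumptions.

```python
from collections import Counter

# The 30 scores listed in Table 2.2, used here as sample data.
scores = [96, 95, 93, 92, 92, 91, 91, 90, 90, 90,
          89, 89, 89, 88, 88, 88, 88, 88, 87, 87,
          87, 86, 86, 85, 85, 84, 84, 83, 82, 81]

def frequency_distribution(values):
    """Rows of (score, frequency, percent, cumulative percent below), descending."""
    counts = Counter(values)
    n = len(values)
    rows = []
    below = 0                                  # number of scores below the current one
    for score in sorted(counts):               # walk upward from the lowest score
        freq = counts[score]
        rows.append((score, freq, 100 * freq / n, 100 * below / n))
        below += freq
    return rows[::-1]                          # list the highest score first

for score, freq, pct, cum_pct in frequency_distribution(scores):
    print(f"{score:>3} {freq:>2} {pct:5.1f} {cum_pct:5.1f}")
```

For the score of 88, for example, the row shows a frequency of 5, 16.7 percent of the scores, and 40.0 percent of the scores below it.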
Graphs
1. Enable individuals to interpret data without reading
raw data or tables.
2. Different types of graphs are used.
Examples - histogram (column), frequency polygon
(line), pie chart, area, scatter, and pyramid
3. Standard guidelines should be used when
constructing graphs.
See figures 2.7, 2.8, 2.9, and 2.10.
Standard Scores
Provide method for comparing unlike scores; can
obtain an average score, or total score for unlike
scores.
z-score - represents the number of standard
deviations a raw score deviated from the mean
FORMULA
$z = \frac{X - \bar{X}}{s}$
z-Scores
Table 2.7 – Tennis Serve Scores
Scores of 88 and 54; $\bar{X}$ = 72.1; s = 10.8
$z = \frac{X - \bar{X}}{s}$
$z = \frac{88 - 72.1}{10.8} = \frac{15.9}{10.8} = 1.47$
$z = \frac{54 - 72.1}{10.8} = \frac{-18.1}{10.8} = -1.68$
INTERPRETATION?
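The two z-scores can be reproduced with a short sketch. It uses the mean of 72.1 that appears in the arithmetic above and s = 10.8; the function name `z_score` is illustrative. A positive z means the score lies above the mean, and a negative z means it lies below.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (score - mean) / sd

mean, sd = 72.1, 10.8          # values used in the tennis serve example
z_88 = z_score(88, mean, sd)
z_54 = z_score(54, mean, sd)
print(round(z_88, 2), round(z_54, 2))  # 1.47 -1.68
```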
z-Scores
• The z-scale has a mean of 0 and a standard deviation
of 1.
• Normally extends from –3 to +3 standard deviations.
• All standard scores are based on the z-score.
• Since z-scores are expressed in small numbers, involve
decimals, and may be positive or negative, many
testers do not use them.
Table 2.5 shows relationship of standard deviation units
and percentile rank.
T-Scores
T-scale
• Has a mean of 50.
• Has a standard deviation of 10.
• May extend from 0 to 100.
• Unlikely that any T-score will be beyond 20 or 80
(this range includes plus and minus 3 standard
deviations).
Formula
T-score = $50 + 10\left(\frac{X - \bar{X}}{s}\right) = 50 + 10z$
Figure 2.11 shows the relationship of z-scores,
T-scores, and the normal curve.
Figure 2.11 z-scores and T-scores plotted on normal curve.
T-Scores
Table 2.7 - Tennis Serve Scores
Scores of 88 and 54; $\bar{X}$ = 72.1; s = 10.8
T88 = 50 + 10(1.47) = 50 + 14.7 = 64.7 ≈ 65
T54 = 50 + 10(-1.68) = 50 + (-16.8) = 33.2 ≈ 33
(T-scores are reported as whole numbers)
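The T-score conversion can be sketched as a one-line function. The sketch uses the mean of 72.1 from the z-score arithmetic and s = 10.8, and rounds to a whole number as the slide describes; the function name `t_score` is an assumption.

```python
def t_score(score, mean, sd):
    """T = 50 + 10z, reported as a whole number."""
    z = (score - mean) / sd
    return round(50 + 10 * z)

mean, sd = 72.1, 10.8          # values used in the tennis serve example
print(t_score(88, mean, sd), t_score(54, mean, sd))  # 65 33
```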
T-Scores
T-scores may be used in same way as z-scores, but
usually preferred because:
• Only positive whole numbers are reported.
• Range from 0 to 100.
Sometimes confusing because a T-score of 60 or above is a good
score.
T-Scores
May convert raw scores in a distribution to T-scores
See Table 2.7
1. Number a column of T-scores from 20 to 80.
2. Place the mean of the distribution of the scores opposite
the T-score of 50.
3. Divide the standard deviation of the distribution by 10.
The standard deviation for the T-scale is 10, so each
T-score from 0 to 100 is one-tenth of the standard
deviation.
4. Add the value found in step 3 to the mean and each
subsequent number until you reach the T-score of 80.
5. Subtract the value found in step 3 from the mean and
each decreasing number until you reach the number 20.
6. Round off the scores to the nearest whole number.
*For some scores, lower scores are better (timed events).
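The six conversion steps amount to building a lookup table from T-scores to raw scores. The sketch below assumes higher raw scores are better (for timed events the direction would be reversed) and uses the tennis serve values of mean 72.1 and s = 10.8; the function name `t_score_table` is hypothetical.

```python
def t_score_table(mean, sd, low=20, high=80):
    """Map each T-score from low to high to the nearest whole raw score."""
    step = sd / 10                              # step 3: one T-score unit in raw-score units
    # steps 2, 4, 5: T = 50 sits at the mean; move one step per T-score unit
    # step 6: round each raw score to the nearest whole number
    return {t: round(mean + (t - 50) * step) for t in range(low, high + 1)}

table = t_score_table(72.1, 10.8)
print(table[20], table[40], table[50], table[60])  # 40 61 72 83
```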
Percentiles
• Are standard scores and may be used to compare scores
of different measurements.
• Change at different rates (remember comparison of low
and high percentile scores with middle percentiles), so
they should not be used to determine one score for
several different tests.
• May prefer to use T-scale when converting raw scores
to standard scores.