Lecture 3 - Illinois State University Department of Psychology

Tuesday August 27, 2013
Distributions: Measures of Central Tendency & Variability
Today: Finish up Frequency & Distributions, then turn to Means and Standard Deviations
First, hand in your homework.
Any questions from last time?
Grouped Frequency Table
A frequency table that uses intervals (range
of values) instead of single values
Pairs of Shoes
X Values   Freq   %     Cumulative %↑   Cumulative %↓
0-4        3      13    12              100
5-9        6      25    38              87
10-14      7      29    67              62
15-19      2      8     75              33
20-24      4      17    92              25
25-29      1      4     96              8
30-34      1      4     100             4
Total      24     100
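As a rough sketch of how a grouped frequency table like this one can be built, the Python below counts scores per interval and accumulates percentages. It assumes the 24 shoe counts listed later in the lecture are the data behind the table; the function name `grouped_table` is made up for illustration. It prints unrounded percentages, so the first row shows 12.5 where the slide shows a whole number.

```python
# Shoe counts from the lecture's "Number of shoes" example (assumed to be
# the data behind the grouped frequency table).
scores = [2, 2, 2, 5, 5, 5, 7, 8, 6, 10, 10, 12, 12, 13, 14, 14,
          15, 15, 20, 20, 20, 20, 25, 30]
WIDTH = 5  # interval width: 0-4, 5-9, 10-14, ...


def grouped_table(scores, width):
    """Return rows of (interval, frequency, percent, cumulative percent)."""
    n = len(scores)
    rows = []
    running = 0
    for low in range(0, max(scores) + 1, width):
        high = low + width - 1
        freq = sum(low <= x <= high for x in scores)  # scores in this interval
        running += freq                               # cumulative frequency so far
        rows.append((f"{low}-{high}", freq, 100 * freq / n, 100 * running / n))
    return rows


for label, freq, pct, cum in grouped_table(scores, WIDTH):
    print(f"{label:>5}  {freq:>2}  {pct:5.1f}  {cum:5.1f}")
```

The frequencies it produces (3, 6, 7, 2, 4, 1, 1) match the Freq column above.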
Frequency Graphs

Histogram
◦ Plot the different values against the frequency of each value
Frequency Graphs

Histogram (create one for class height)
◦ Step 1: make a frequency distribution table (a grouped frequency table may be used)
◦ Step 2: put the values along the bottom, left to right, lowest to highest
◦ Step 3: make a scale of frequencies along the left edge
◦ Step 4: make a bar above each value with a height equal to the frequency of that value
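The four steps can be sketched in a few lines of Python. This is only an illustration: it uses the lecture's shoe data rather than class heights, and it draws the bars sideways as text (one `#` per score), so the values run top to bottom instead of left to right. The helper name `histogram_lines` is hypothetical.

```python
from collections import Counter

# Shoe counts from the lecture's "Number of shoes" example.
scores = [2, 2, 2, 5, 5, 5, 7, 8, 6, 10, 10, 12, 12, 13, 14, 14,
          15, 15, 20, 20, 20, 20, 25, 30]
WIDTH = 5  # interval width, matching the 0-4, 5-9, ... grouping

# Step 1: grouped frequency table (interval start -> frequency).
freq = Counter((x // WIDTH) * WIDTH for x in scores)


def histogram_lines(freq, width):
    """Steps 2-4: intervals in ascending order, one bar per interval."""
    lines = []
    for low in range(0, max(freq) + width, width):
        label = f"{low}-{low + width - 1}"
        # A Counter returns 0 for intervals that contain no scores.
        lines.append(f"{label:>5} | {'#' * freq[low]}")
    return lines


for line in histogram_lines(freq, WIDTH):
    print(line)
```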
Frequency Graphs

Frequency polygon - essentially the
same, but uses lines instead of bars
Properties of distributions

Distributions are typically summarized with three features:
◦ Shape
◦ Center
◦ Variability (Spread)
Shapes of Frequency Distributions

◦ Unimodal, bimodal, and rectangular
◦ Symmetrical and skewed distributions
◦ Normal and kurtotic distributions
Next Topic
In addition to using tables and graphs to describe distributions, we can also provide numerical summaries.
Chapters 3 & 4

Measures of Central Tendency
◦ Mean
◦ Median
◦ Mode

Measures of Variability
◦ Standard Deviation & Variance (Population)
◦ Standard Deviation & Variance (Samples)

Effects of linear transformations on mean
and standard deviation
Self-monitor your understanding
◦ These topics should all be review from PSY 138, so I will move fairly quickly through the lecture.
◦ I will stop periodically to ask for questions.
◦ Please ask if you don’t understand something!
◦ If you are confused by this material, it will be very hard for you to follow and keep up with later topics.

Describing distributions

Distributions are typically described with three
properties:
◦ Shape: unimodal, symmetrical, skewed, etc.
◦ Center: mean, median, mode
◦ Spread (variability): standard deviation, variance
Which center when?

The choice depends on a number of factors, such as the scale of measurement and the shape of the distribution.
◦ The mean is the preferred measure in most cases, and it is closely related to measures of variability
◦ However, there are times when the mean isn’t the appropriate measure.
Which center when?

Use the median if:
◦ The distribution is skewed
◦ The distribution is ‘open-ended’ (e.g., the top answer on your questionnaire is ‘5 or more’)
◦ Data are on an ordinal scale (rankings)

Use the mode if the data are on a nominal scale
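To see why skew matters, here is a small Python sketch comparing the mean and the median. The incomes are invented purely for illustration; they are not from the lecture.

```python
from statistics import mean, median

# Hypothetical incomes in thousands of dollars (made up for this example).
# The last score is extreme, so the distribution is positively skewed.
incomes = [30, 32, 35, 38, 40, 400]

skewed_mean = mean(incomes)      # pulled toward the extreme score
skewed_median = median(incomes)  # average of the two middle scores, 35 and 38

print(round(skewed_mean, 2), skewed_median)
```

The single extreme score pulls the mean well above every other score, while the median stays near the bulk of the data.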
Self-monitor your understanding
We are about to turn to a discussion of calculating means.
Before we move on, any questions about when to use which measure of central tendency?

The Mean


The most commonly used measure of center
The arithmetic average
◦ Computing the mean
– The formula for the population mean (a parameter):
  $\mu = \frac{\sum X}{N}$  (add up all of the X’s, then divide by the total number in the population)
– The formula for the sample mean (a statistic):
  $M = \frac{\sum X}{n}$  (add up all of the X’s, then divide by the total number in the sample)
• Note: Sometimes $\bar{X}$ is used in place of M to denote the mean in formulas
The Mean

Number of shoes:
Men: 2, 2, 2, 5, 5, 5, 7, 8
Women: 6, 10, 10, 12, 12, 13, 14, 14, 15, 15, 20, 20, 20, 20, 25, 30
$M = \frac{\sum X}{n} = \frac{2+2+2+5+5+5+7+8}{8} = \frac{36}{8} = 4.5$
$M = \frac{\sum X}{n} = \frac{6+10+10+12+12+13+14+14+15+15+20+20+20+20+25+30}{16} = \frac{256}{16} = 16$
• Suppose we want the mean of the entire group. Can we simply add the two means together and divide by 2?
• (4.5 + 16)/2 = 20.5/2 = 10.25
• NO. Why not?
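One way to see the problem is to compare the average of the two means with the mean of all 24 scores pooled together. This Python sketch assumes the first eight scores are the men's and the remaining sixteen are the women's, as the group means in the lecture suggest.

```python
men = [2, 2, 2, 5, 5, 5, 7, 8]
women = [6, 10, 10, 12, 12, 13, 14, 14, 15, 15, 20, 20, 20, 20, 25, 30]

mean_men = sum(men) / len(men)        # 36 / 8
mean_women = sum(women) / len(women)  # 256 / 16

# Averaging the two means treats both groups as if they were the same size.
naive = (mean_men + mean_women) / 2

# Pooling every score gives the actual mean of the entire group.
everyone = men + women
overall = sum(everyone) / len(everyone)  # 292 / 24

print(naive, round(overall, 2))
```

The two results differ because the groups are unequal in size: the 16 women's scores should count twice as heavily as the 8 men's scores.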
The Weighted Mean

Number of shoes:
2, 2, 2, 5, 5, 5, 7, 8, 6, 10, 10, 12, 12, 13, 14, 14, 15, 15, 20, 20, 20, 20, 25, 30
Mean for men = 4.5
Mean for women = 16
$M_N = \frac{M_1 n_1 + M_2 n_2}{n_1 + n_2} = \frac{(4.5 \times 8) + (16 \times 16)}{8 + 16} = \frac{36 + 256}{24} = \frac{292}{24} = 12.17$
Need to take into account the number of scores in each mean ($n_1$ and $n_2$)
The Weighted Mean
Number of shoes:
2, 2, 2, 5, 5, 5, 7, 8, 6, 10, 10, 12, 12, 13, 14, 14, 15, 15, 20, 20, 20, 20, 25, 30
$M_N = \frac{M_1 n_1 + M_2 n_2}{n_1 + n_2} = \frac{(4.5 \times 8) + (16 \times 16)}{8 + 16} = \frac{36 + 256}{24} = \frac{292}{24} = 12.17$
Let’s check:
$M = \frac{\sum X}{n} = \frac{292}{24} = 12.17$
• Both ways give the same answer
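The weighted mean formula extends naturally to any number of groups. The following is a minimal Python sketch; the function name `weighted_mean` is an illustrative choice, not something from the lecture.

```python
def weighted_mean(means, sizes):
    """Combine group means, weighting each mean by its group size."""
    # Each product M * n recovers that group's sum of scores.
    total = sum(m * n for m, n in zip(means, sizes))
    return total / sum(sizes)


# Men: M = 4.5 with n = 8; women: M = 16 with n = 16.
combined = weighted_mean([4.5, 16], [8, 16])
print(round(combined, 2))
```

Only the group means and group sizes are needed, which is useful when the raw scores are not available.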
Self-monitor your understanding
We are about to move on to a quick discussion of calculating the median and mode.
Before we move on, any questions about the formulae for the population mean and the sample mean?
Questions about the weighted mean?

The median
The median is the score that divides a distribution exactly in half. Exactly 50% of the individuals in a distribution have scores at or below the median.
◦ Case 1: Odd number of scores in the distribution
– Step 1: put the scores in order
– Step 2: find the middle score
◦ Case 2: Even number of scores in the distribution
– Step 1: put the scores in order
– Step 2: find the middle two scores
– Step 3: find the arithmetic average of the two middle scores
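Both cases can be sketched in one short Python function. The example scores are illustrative; Python's built-in `statistics.median` follows the same rule.

```python
def median(scores):
    """Median following the two cases above (odd or even number of scores)."""
    ordered = sorted(scores)  # Step 1: put the scores in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Case 1: odd number of scores, so take the single middle score
        return ordered[mid]
    # Case 2: even number of scores, so average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 2, 5]))     # odd number of scores
print(median([8, 2, 6, 4]))  # even number of scores
```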
The mode
The mode is the score or category that has the greatest frequency.
◦ So look at your frequency table or graph and pick the value that has the highest frequency.
◦ Example with one peak: the mode is 5
◦ Example with two equally high peaks: the modes are 2 and 8
Note: if one peak were bigger than the other, it would be called the major mode and the other would be the minor mode
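Finding the mode amounts to counting how often each value occurs and keeping the value or values with the highest count. A small Python sketch follows; the score lists are invented so that they reproduce the two cases above (one mode at 5, and two modes at 2 and 8).

```python
from collections import Counter


def modes(scores):
    """Return every value that ties for the highest frequency."""
    counts = Counter(scores)    # the frequency table
    top = max(counts.values())  # the highest frequency
    return sorted(value for value, count in counts.items() if count == top)


print(modes([3, 5, 5, 5, 7]))        # one mode
print(modes([2, 2, 2, 4, 8, 8, 8]))  # two modes (bimodal)
```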
Self-monitor your understanding
We are about to switch to the topic of measures of variability.
Before we move on, any questions about measures of central tendency?

Describing distributions
Distributions are typically described with three
properties:
◦ Shape: unimodal, symmetric, skewed, etc.
◦ Center: mean, median, mode
◦ Spread (variability): standard deviation, variance
Variability of a distribution
Variability provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together.
◦ In other words, variability refers to the degree of “differentness” of the scores in the distribution.
◦ High variability means that the scores differ by a lot
◦ Low variability means that the scores are all similar
Standard deviation
The standard deviation is the most commonly used measure of variability.
◦ The standard deviation measures how far the scores in the distribution are from the mean of the distribution.
◦ Essentially, it is the average distance of the scores from the mean.
Computing standard deviation (population)
Step 1: Compute the deviation scores: Subtract the population mean from every score in the distribution.
Our population: 2, 4, 6, 8
$\mu = \frac{\sum X}{N} = \frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5.0$
X - μ = deviation scores
2 - 5 = -3
4 - 5 = -1
6 - 5 = +1
8 - 5 = +3
[Figure: number line from 1 to 10 with μ marked and the deviation of each score from μ]
Notice that if you add up all of the deviations they must equal 0.
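The deviations must sum to zero because the sum of (X - μ) equals ΣX minus N times μ, and μ is defined as ΣX / N, so the two terms cancel. A quick Python check on the lecture's population:

```python
population = [2, 4, 6, 8]
mu = sum(population) / len(population)  # population mean

# Deviation score for each individual: X - mu
deviations = [x - mu for x in population]
total_deviation = sum(deviations)

print(deviations)
print(total_deviation)
```

This is why the next step squares the deviations: a plain average of the deviations would be 0 for every distribution.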
Computing standard deviation (population)
Step 2: Get rid of the negative signs. Square the deviations
and add them together to compute the sum of the
squared deviations (SS).
X - μ = deviation scores
2 - 5 = -3
4 - 5 = -1
6 - 5 = +1
8 - 5 = +3
$SS = \sum (X - \mu)^2$
$= (-3)^2 + (-1)^2 + (+1)^2 + (+3)^2$
$= 9 + 1 + 1 + 9 = 20$
Computing standard deviation (population)
Step 3: Compute the variance (the average of the squared deviations)
◦ Divide by the number of individuals in the population.
variance = $\sigma^2 = \frac{SS}{N}$
Computing standard deviation (population)
Step 4: Compute the standard deviation. Take the square root of the population variance.
standard deviation = $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
Computing standard deviation (population)

To review:
◦ Step 1: compute deviation scores
◦ Step 2: compute the SS
– $SS = \sum (X - \mu)^2$
◦ Step 3: determine the variance
– take the average of the squared deviations
– divide the SS by N
◦ Step 4: determine the standard deviation
– take the square root of the variance
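The four steps can be written out as a short Python function and applied to the lecture's population of 2, 4, 6, 8. This is a sketch: the function name `population_sd` is illustrative, and `statistics.pstdev` from the standard library is included only as a cross-check.

```python
import math
from statistics import pstdev


def population_sd(scores):
    """Return (SS, variance, standard deviation) for a whole population."""
    n = len(scores)
    mu = sum(scores) / n
    deviations = [x - mu for x in scores]     # Step 1: deviation scores
    ss = sum(d ** 2 for d in deviations)      # Step 2: sum of squared deviations
    variance = ss / n                         # Step 3: divide SS by N
    return ss, variance, math.sqrt(variance)  # Step 4: square root of the variance


ss, variance, sigma = population_sd([2, 4, 6, 8])
print(ss, variance, round(sigma, 2))
print(round(pstdev([2, 4, 6, 8]), 2))  # library cross-check
```

For this population SS = 20, so the variance is 20/4 = 5 and the standard deviation is the square root of 5, about 2.24.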
Self-monitor your understanding
We are about to learn how to calculate sample standard deviations.
Before we move on, any questions about how to calculate population standard deviations?
Any questions about these terms: deviation scores, squared deviations, sum of squares, variance, standard deviation?
Any questions about these symbols: SS, $\sigma^2$, $\sigma$?
Computing standard deviation (sample)
The basic procedure is the same.
◦ Step 1: compute deviation scores
◦ Step 2: compute the SS
◦ Step 3: determine the variance
– This step is different
◦ Step 4: determine the standard deviation
Computing standard deviation (sample)
Step 1: Compute the deviation scores
◦ Subtract the sample mean from every individual in our distribution.
Our sample: 2, 4, 6, 8
$M = \frac{\sum X}{n} = \frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5.0$
X - M = deviation scores
2 - 5 = -3
4 - 5 = -1
6 - 5 = +1
8 - 5 = +3
[Figure: number line from 1 to 10 with M marked]
Computing standard deviation (sample)
Step 2: Determine the sum of the squared
deviations (SS).
X - M = deviation scores
2 - 5 = -3
4 - 5 = -1
6 - 5 = +1
8 - 5 = +3
$SS = \sum (X - M)^2$
$= (-3)^2 + (-1)^2 + (+1)^2 + (+3)^2$
$= 9 + 1 + 1 + 9 = 20$
Apart from notational differences, the procedure is the same as before.
Computing standard deviation (sample)
Step 3: Determine the variance
Recall: Population variance = $\sigma^2 = \frac{SS}{N}$
The variability of a sample is typically smaller than the population’s variability.
To correct for this, we divide by (n - 1) instead of just n:
Sample variance = $s^2 = \frac{SS}{n - 1}$
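A small enumeration can make the (n - 1) correction concrete. The Python sketch below treats 2, 4, 6, 8 as the entire population, lists every possible sample of size 2 drawn with replacement (both the sample size and the with-replacement assumption are choices made for this illustration), and averages the two candidate variance formulas across all of those samples.

```python
from itertools import product

population = [2, 4, 6, 8]
N = len(population)
mu = sum(population) / N
pop_variance = sum((x - mu) ** 2 for x in population) / N  # sigma^2


def sum_of_squares(sample):
    """SS around the sample's own mean M."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)


n = 2
# All 16 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))

# Average of each estimator across every possible sample.
avg_div_n = sum(sum_of_squares(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_of_squares(s) / (n - 1) for s in samples) / len(samples)

print(pop_variance, avg_div_n, avg_div_n_minus_1)
```

In this setup, dividing by n gives an average of 2.5, which underestimates the population variance of 5, while dividing by (n - 1) averages out to exactly 5.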
Computing standard deviation (sample)
Step 4: Determine the standard deviation
standard deviation = $s = \sqrt{s^2} = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$
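Applied to the lecture's sample of 2, 4, 6, 8, the sample procedure can be sketched in Python as follows. The function name `sample_sd` is illustrative, and `statistics.stdev` (which also divides by n - 1) is used only as a cross-check.

```python
import math
from statistics import stdev


def sample_sd(scores):
    """Return (SS, sample variance, sample standard deviation)."""
    n = len(scores)
    m = sum(scores) / n
    ss = sum((x - m) ** 2 for x in scores)    # Steps 1-2: deviations, then SS
    variance = ss / (n - 1)                   # Step 3: divide by n - 1, not n
    return ss, variance, math.sqrt(variance)  # Step 4: square root


ss, s_squared, s = sample_sd([2, 4, 6, 8])
print(ss, round(s_squared, 2), round(s, 2))
print(round(stdev([2, 4, 6, 8]), 2))  # library cross-check
```

With SS = 20 and n = 4, the sample variance is 20/3, about 6.67, and s is about 2.58. That is larger than the population value of about 2.24 computed from the same four scores.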
Self-monitor your understanding
Next, we’ll find out how changing our scores (adding, subtracting, multiplying, dividing) affects the mean and standard deviation.
Before we move on, any questions about the sample standard deviation?
About why we divide by (n - 1)?
About the following symbols:
◦ $s^2$
◦ $s$
Properties of means and standard deviations
                                    Mean      Standard deviation
Change/add/delete a given score     changes   changes
– This changes the total and/or the number of scores, which will change the mean and the standard deviation
$\mu = \frac{\sum X}{N}$
$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
Properties of means and standard deviations
                                        Mean      Standard deviation
Change/add/delete a given score         changes   changes
Add/subtract a constant to each score   changes
– All of the scores change by the same constant.
– But so does the mean
[Figure: the distribution and its mean shifted from M_old to M_new]
Properties of means and standard deviations
                                        Mean      Standard deviation
Change/add/delete a given score         changes   changes
Add/subtract a constant to each score   changes   No change
– It is as if you just pick up the distribution and move it over, but the spread (variability) stays the same
[Figure: the same distribution shifted from M_old to M_new]
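A quick numerical check of the "pick it up and move it over" idea, sketched in Python with the sample 2, 4, 6, 8 from earlier in the lecture. The constant 10 is an arbitrary choice for illustration.

```python
from statistics import mean, stdev

scores = [2, 4, 6, 8]
shifted = [x + 10 for x in scores]  # add the same constant to every score

print(mean(scores), mean(shifted))  # the mean moves by the constant
print(round(stdev(scores), 2), round(stdev(shifted), 2))  # the spread does not
```

Every deviation score (X - M) is unchanged because X and M both move by the same amount, so SS, the variance, and the standard deviation all stay the same.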
Properties of means and standard deviations
                                        Mean      Standard deviation
Change/add/delete a given score         changes   changes
Add/subtract a constant to each score   changes   No change
Multiply/divide a constant to each score
Example: scores 21 and 23 (plotted on a number line from 20 to 24), M = 22
21 - 22 = -1
23 - 22 = +1
$s = \sqrt{\frac{\sum (X - M)^2}{n - 1}} = \sqrt{\frac{(-1)^2 + (+1)^2}{2 - 1}} = \sqrt{2} = 1.41$
Properties of means and standard deviations
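                                           Mean      Standard deviation
Change/add/delete a given score            changes   changes
Add/subtract a constant to each score      changes   No change
Multiply/divide a constant to each score   changes   changes
– Multiply scores by 2
Example: scores 42 and 46 (plotted on a number line from 40 to 48), M = 44
42 - 44 = -2
46 - 44 = +2
$s = \sqrt{\frac{\sum (X - M)^2}{n - 1}} = \sqrt{\frac{(-2)^2 + (+2)^2}{2 - 1}} = \sqrt{8} = 2.83$
$s_{old} = 1.41$, so the new standard deviation is 2 times the old one.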
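The same comparison can be run in Python. This sketch assumes the two scores in the example are 21 and 23, which is what the deviation scores and the value of 1.41 on the slide imply.

```python
from statistics import mean, stdev

scores = [21, 23]                  # assumed from the slide's deviation scores
doubled = [2 * x for x in scores]  # multiply every score by 2

print(mean(scores), mean(doubled))  # the mean is multiplied by 2
print(round(stdev(scores), 2), round(stdev(doubled), 2))  # and so is the SD
```

Multiplying by a constant stretches every deviation by that constant, so the standard deviation is multiplied by it as well. Dividing works the same way in reverse.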