Ch03a

• Measures of central tendency
– We like to boil things down. It’s easier to
report a single number than many.
– In the behavioral and natural sciences, the
frequency distributions of many data sets are symmetrical.
For a normal distribution, one should be able to
imagine the frequency distribution histogram if given
the mean and standard deviation.
For a normal distribution, the mean, median, and mode
are the same number.
[Frequency distribution table: scores X = 2 through 8 with their frequencies f; the mean X̄ is left to be calculated]
Calculate the mean (a.k.a. M or X̄ for a sample and μ for
a population).
More terminology for populations vs. samples
• Population mean
μ = ΣX / N
• Sample mean
M = ΣX / n
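[Two frequency distribution histograms: scores X = 1 through 9 on the horizontal axis, frequency f on the vertical axis; the mean X̄ is left to be located]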
The mean is the arithmetic average, but it is also the
“balance point.” The total distance from the mean to all
points above it is equal to the total distance from the mean
to all points below it. The distances between the mean and
the individual points determine the variance. More on this later…
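The balance-point idea can be checked numerically. The sketch below uses a small set of hypothetical scores (not taken from the slides) and compares the total distance above the mean with the total distance below it.

```python
# Hypothetical scores, chosen only for illustration.
scores = [1, 2, 6, 6, 10]

mean = sum(scores) / len(scores)  # 25 / 5 = 5.0

# Total distance from the mean to the points above it and below it.
distance_above = sum(x - mean for x in scores if x > mean)
distance_below = sum(mean - x for x in scores if x < mean)

print("mean:", mean)
print("total distance above:", distance_above)
print("total distance below:", distance_below)
```

With these scores both totals come out to 7, so the distribution "balances" at 5.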
Sometimes the means from two or more samples need to
be combined. The importance of each mean is determined
by the size of the respective sample, so you shouldn’t
simply add the means and divide by two.
• Example
– Two classes take an exam
• Class 1: n = 4 and M = 65%
• Class 2: n = 45 and M = 89%
– Calculating the overall mean “unfairly” leads to
(65% + 89%) / 2 = 77%
The two courses should not be treated equally since one
has many more students than the other. The larger class
has more weight and should have more influence in
calculating the overall mean.
• Example
– Two classes take an exam
• Class 1: n = 4 and M = 65%
– This is like having 4 scores of 65% (ΣX1 = 260%)
• Class 2: n = 45 and M = 89%
– This is like having 45 scores of 89% (ΣX2 = 4005%)
– Weighted mean = (ΣX1 + ΣX2) / (n1 + n2)
= (260 + 4005) / (4 + 45) = 4265 / 49 ≈ 87%
• The weighted mean is the same number
you would get if you had totaled all of the
original scores and divided by N.
• The weighted mean calculation is a
shortcut. You can also calculate it by
multiplying each sample mean by the
proportion of total participants found in
each sample.
• Example
– Two classes take an exam
• Class 1: n = 4 and M = 65%
– This is like having 4 scores of 65% (ΣX1 = 260%)
• Class 2: n = 45 and M = 89%
– This is like having 45 scores of 89% (ΣX2 = 4005%)
Weighted mean = M1p1 + M2p2, where p is each sample’s proportion of the total participants
= 65% (4/49) + 89% (45/49) ≈ 87%
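Both routes to the weighted mean can be sketched in a few lines of Python. The function names below are hypothetical helpers introduced for illustration; the class sizes and means are the ones from the example.

```python
def weighted_mean_from_sums(means, sizes):
    """Rebuild each sample's total (M * n), add the totals, divide by total n."""
    total = sum(m * n for m, n in zip(means, sizes))
    return total / sum(sizes)


def weighted_mean_from_proportions(means, sizes):
    """Multiply each mean by its sample's share of all participants."""
    total_n = sum(sizes)
    return sum(m * (n / total_n) for m, n in zip(means, sizes))


means = [65, 89]   # Class 1 and Class 2 exam means, in percent
sizes = [4, 45]    # Class 1 and Class 2 sample sizes

by_sums = weighted_mean_from_sums(means, sizes)          # (260 + 4005) / 49
by_props = weighted_mean_from_proportions(means, sizes)  # 65(4/49) + 89(45/49)
unweighted = sum(means) / len(means)                     # the "unfair" 77

print(round(by_sums, 2), round(by_props, 2), unweighted)
```

The two weighted results agree up to floating-point rounding (about 87%), while the unweighted average of 77% understates the influence of the larger class.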
• Influencing the mean:
– It should be obvious to you that changing a
score, adding a score, or removing a score will
change the mean
– Adding a constant to (or subtracting it from) each
score will change the mean by the same amount
– Multiplying or dividing each score by a constant
will multiply or divide the mean by the same constant
• Think about converting meters to yards or ounces to
grams
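A short sketch of these two rules, using hypothetical heights in meters. The conversion factor of about 1.0936 yards per meter is approximate, and the `mean` helper is defined here only for the example.

```python
import math


def mean(scores):
    """Arithmetic mean: sum of the scores divided by their count."""
    return sum(scores) / len(scores)


heights_m = [1.5, 1.6, 1.8, 1.9]  # hypothetical scores, in meters
YARDS_PER_METER = 1.0936          # approximate conversion factor

original = mean(heights_m)
shifted = mean([x + 0.1 for x in heights_m])                # add a constant
converted = mean([x * YARDS_PER_METER for x in heights_m])  # multiply by a constant

# Adding 0.1 to every score adds 0.1 to the mean.
print(math.isclose(shifted, original + 0.1))
# Converting every score to yards converts the mean to yards.
print(math.isclose(converted, original * YARDS_PER_METER))
```

The comparisons use `math.isclose` rather than `==` because floating-point arithmetic can introduce tiny rounding differences.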
Figure 3.3 (p. 63) - Consider the changes just
described as applied to the scale model
Table 3.2 (p. 64)
The weight of a freight truck with and without a
200 lb. passenger.

Truck   Passenger   No Passenger
A       1200        1000
B       1600        1400
C       1400        1200
D       2100        1900
E       2500        2300
F       1400        1200
        X̄ = 1700    X̄ =
Table 3.3 (p. 65)
• The median is the score that divides the
distribution in half so that 50% are above
and 50% are below.
– With the scores listed in order: if the number of
scores is odd, the median is the middle score.
Otherwise, it is the average of the two middle scores.
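The odd/even rule above can be written as a small function. This is a minimal sketch of that rule only; the function name and the sample lists are illustrative.

```python
def median(scores):
    """Middle score of the ordered list, or the average of the two middle scores."""
    if not scores:
        raise ValueError("median requires at least one score")
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: take the single middle score.
        return ordered[middle]
    # Even number of scores: average the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_example = median([8, 3, 5])       # ordered: 3, 5, 8
even_example = median([1, 4, 2, 3])   # ordered: 1, 2, 3, 4
print(odd_example, even_example)
```

Sorting first matters: the middle position only identifies the median once the scores are in order.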
Figure 3.4 (p. 66) - The median divides the area of a
frequency distribution histogram in half.
(Blue = White)
Figure 3.5 (p. 67)
2.25
Figure 3.6 (p. 67) Sometimes the result looks strange,
but the rule still applies. You may round
numbers off to suit your study, since it can be awkward to
report some variables, such as the number of children in a
household, as non-discrete values.
Median = 2.5
Figure 3.7 (p. 68) mean = 4, median = 2.5
For a skewed distribution, the mean is pulled away from the
median in the direction of the skew.
Be careful!!
The mode is Luigi’s,
NOT 42
Table 3.4 (p. 70) The mode is a score or a
category that has the greatest frequency.
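The warning that the mode is the category rather than its frequency can be illustrated with `collections.Counter`. Only Luigi's frequency of 42 comes from the slide; the other category names and counts below are hypothetical placeholders.

```python
from collections import Counter

# Frequency of each category. Only "Luigi's": 42 is from the slide;
# "Restaurant A" and "Restaurant B" are hypothetical placeholders.
frequencies = Counter({"Luigi's": 42, "Restaurant A": 21, "Restaurant B": 16})

# most_common(1) returns a list with one (category, frequency) pair.
mode_category, mode_frequency = frequencies.most_common(1)[0]

print("mode:", mode_category)                 # the category is the mode
print("frequency of mode:", mode_frequency)   # 42 is how often it occurs
```

Reporting `mode_frequency` as the mode would be the mistake the slide cautions against.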
Figure 3.8 (p. 70) The term “bi-modal” (or multi-modal) is
casually used to describe a distribution with more than
one clear peak.
Figure 3.9 (p. 72) Use the median when describing distributions
with a few unusually high or low scores. What’s the mean?
Table 3.5 (p. 74) Use the median in cases in which no data are available
for a participant.
Table 3.6 (p. 75) Illustrating means in discrete groups.
Figure 3.10 (p. 76) Illustrating means in continuous groups.
Figure 3.11 (p. 76)
Figure 3.12 (p. 77)
Figure 3.13 (p. 78)
Using a Box-and-Whisker Plot to Illustrate skewed
distributions.
• Two classes have the following quiz scores

x     f1    f2
1     5     1
2     8     0
3     4     0
4     3     1
5     2     3
6     1     5
7     0     8
8     1     2
9     0     2
10    0     1

[Frequency distribution histograms: Scores for Class 1 and Scores for Class 2, with frequency on the vertical axis]
These scores cannot simply be reported with a mean
and standard deviation, because both distributions are skewed.
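To see why the mean alone is misleading here, the sketch below expands the frequency table into raw scores and compares the mean with the median for each class. It assumes the table reads as x = 1 through 10 with the f1 and f2 columns as listed in the code; `expand` is a hypothetical helper.

```python
from statistics import mean, median

scores = list(range(1, 11))              # x = 1 through 10
f1 = [5, 8, 4, 3, 2, 1, 0, 1, 0, 0]      # assumed Class 1 frequencies
f2 = [1, 0, 0, 1, 3, 5, 8, 2, 2, 1]      # assumed Class 2 frequencies


def expand(values, freqs):
    """Turn a frequency table into a flat list of raw scores."""
    return [x for x, f in zip(values, freqs) for _ in range(f)]


class1 = expand(scores, f1)
class2 = expand(scores, f2)

for name, data in (("Class 1", class1), ("Class 2", class2)):
    print(name, "n =", len(data),
          "mean =", round(mean(data), 2),
          "median =", median(data))
```

Under these assumptions the Class 1 mean sits above its median and the Class 2 mean sits below its median, which is the pull toward the tail described earlier for skewed distributions.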
Basics of SPSS (Appendix D in your book)
• Start SPSS
• Start by defining variables by clicking on the “Variable View” tab
Basics of SPSS
• We have two variables to consider
– Class
– Score
• Define these variables and return to “Data View”
• Generating Descriptive Statistics
Choose the variable of interest and move it
to the “Variable” window
Select the “Charts” button to display a frequency distribution histogram
• Generating Descriptive Statistics
Select “Explore” to create box-and-whisker
plots
The box has the median going through it and is bounded by
the 25th and 75th percentiles (1st and 3rd quartiles)
• The plot splits the scores into four equal groups by
defining the first, second, and third quartiles
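The numbers behind a box-and-whisker plot can be approximated outside SPSS. The sketch below assumes the Class 1 frequencies are 5, 8, 4, 3, 2, 1, 0, 1, 0, 0 for scores 1 through 10, and uses `statistics.quantiles` from the Python standard library (Python 3.8+). Its default "exclusive" method is only one of several quartile conventions, so values may differ slightly from what SPSS reports.

```python
from statistics import quantiles

# Assumed Class 1 frequency table: score -> frequency.
class1_table = {1: 5, 2: 8, 3: 4, 4: 3, 5: 2, 6: 1, 7: 0, 8: 1, 9: 0, 10: 0}

# Expand the table into a flat, ordered list of raw scores.
class1 = [x for x, f in class1_table.items() for _ in range(f)]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(class1, n=4)

summary = {
    "min": min(class1),
    "Q1": q1,        # 25th percentile: lower edge of the box
    "median": q2,    # 50th percentile: line through the box
    "Q3": q3,        # 75th percentile: upper edge of the box
    "max": max(class1),
}
print(summary)
```

Here the median coincides with the lower edge of the box, one sign of the positive skew in Class 1.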
• Visualizing the distribution with Box-and-Whisker Plots
[Box-and-whisker plots shown alongside the frequency histograms of Scores for Class 1 and Scores for Class 2]