Chapter 4: Descriptive Statistics
Chapter Objectives
When you finish this chapter you should be able to:
explain the concepts of central tendency, dispersion, and shape.
use Excel to obtain descriptive statistics and visual displays.
calculate and interpret common descriptive statistics.
identify the properties of common measures of central tendency.
calculate and interpret common measures of dispersion.
transform a data set into standardized values.
apply the Empirical Rule and recognize outliers.
calculate quartiles and other percentiles.
make and interpret box plots.
calculate the mean and standard deviation from grouped data.
Quiz Yourself
True/False Questions
T F 1. If the standard deviation of the ages of a female group of employees is 6 years and the standard
deviation of the ages of a male group in the same plant is 10 years, it indicates that there is more
spread in the ages of the female employees.
T F 2. The value of the mean times the number of observations equals the sum of the observations.
T F 3. The coefficient of variation is a measure of relative dispersion, which expresses the mean as a
percent of the standard deviation.
T F 4. Dispersion is the degree of variation in the data.
T F 5. The sum of the deviations from the mean is zero.
T F 6. The mean divides observations into two parts of equal size.
T F 7. The data set [0, 0, 1, 1] has the same variance as the data set [1, 1, 2, 2].
Multiple Choice Questions
1. A supermarket has determined that daily demand for eggs has a bell-shaped distribution, with a
mean of 55 cartons and a standard deviation of six cartons. If the supermarket begins each
morning with a stock of 61 cartons of eggs, approximately what percentage of days will there be a
surplus of eggs?
A. ≈16%
B. ≈32%
C. ≈68%
D. ≈84%
2. The average score for a class of 30 students was 75. The 20 male students in the class averaged 70.
The 10 female students in the class averaged:
A. 85
B. 80
C. 75
D. 70
3. A child was born into the Doe family each year for five consecutive years. What is the variance of
the ages of the Doe children?
A. 1.4
B. 1.6
C. 2.0
D. 2.5
4. A study of the scores on an in-plant course in management principles and the years of service of
the employees enrolled in the course yielded the following statistics:
Test Scores:        mean = 100   variance = 225
Years of Service:   mean = 5     variance = 81
Of test scores and years of service, which measure has the greater dispersion?
A. Test scores
B. Years of service
C. Both test scores and years of service have the same relative dispersion.
D. It is impossible to tell.

Textbook Price   Number of Textbooks
$25 to $35       2
35 to 45         16
45 to 55         5
55 to 65         7
65 to 75         20

5. Estimate the mean price of a textbook.
A. $554.00
B. $11.08
C. $60.00
D. $55.40
6. Estimate the standard deviation of the price of a textbook.
A. 14.03
B. 33.85
C. 196.78
D. 729.00
7. In December of 1999, a research team at Ohio State University reported that the median allowance
for U.S. teenagers is $50 per week, with a coefficient of skewness (as calculated in class) of 0.762.
While some teenagers received no allowance, others reported receiving $200 per week. Based on
this information, the mean allowance for U.S. teenagers is most likely
A. $50.
B. less than $50.
C. more than $50.
D. impossible to determine without more information.
8. A clothes store manager has sales data of trouser sizes for the last month’s sales. Which measure
of central tendency should the manager use, if the manager is interested in the most sellable size?
A. Mean
B. Mode
C. Median
D. Standard deviation
E. Interquartile range
9. A sample of 10 observations has a variance of 16. The sum of the squared deviations from the
sample mean is
A. 400
B. 256
C. 160
D. 144
The monthly amounts spent for food by families of four receiving food stamps approximate a symmetrical
distribution. The mean is $100 and the variance is $400. Use this information to answer the next two
questions.
10. Using the Empirical Rule, the range over which approximately 95% of families’ monthly food
expenditures falls is
A. $60 and $140
B. $0 and $400
C. $80
D. $120
11. Which is wider, the range over which Chebyshev predicts at least 68% or the range over which the
Empirical Rule predicts about 68%?
A. The ranges are the same size.
B. The range predicted by Chebyshev’s Rule is wider.
C. The range predicted by the Empirical Rule is wider.
D. It is not possible to perform this calculation.
Solved Problems From Text
4.8
a.
Sorted Data:
Mon   Tue   Wed   Thu
1     1     0     1
1     1     0     1
1     3     0     1
1     3     0     1
5     4     4     1
5     6     6     1
5     6     6     1
6     7     6     1
6     7     6     1
9     9     10    10
b.
Descriptive Statistics
                 Mon    Tue    Wed    Thu
mean             4      4.7    3.8    1.9
median           5      5      5      1
trimmed mean     4      4.7    3.8    1.9
geometric mean   2.89   3.76   NA     1.26
midrange         5      5      5      5.5
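The table values can be cross-checked with a short script. This is a minimal sketch, not the MegaStat procedure: the variable and helper names (`counts`, `geometric_mean`, `midrange`) are illustrative, the geometric mean is treated as undefined (NA) whenever a value is zero, and the trimmed mean is left out because the trimming rule used for the table is not stated here.

```python
import math
from statistics import mean, median

# Sorted daily values copied from part (a) of Problem 4.8.
counts = {
    "Mon": [1, 1, 1, 1, 5, 5, 5, 6, 6, 9],
    "Tue": [1, 1, 3, 3, 4, 6, 6, 7, 7, 9],
    "Wed": [0, 0, 0, 0, 4, 6, 6, 6, 6, 10],
    "Thu": [1, 1, 1, 1, 1, 1, 1, 1, 1, 10],
}


def geometric_mean(values):
    """Return the n-th root of the product, or None if any value is not positive."""
    if any(v <= 0 for v in values):
        return None  # shown as NA in the table
    return math.exp(sum(math.log(v) for v in values) / len(values))


def midrange(values):
    """Return the average of the smallest and largest values."""
    return (min(values) + max(values)) / 2


summary = {
    day: {
        "mean": mean(values),
        "median": median(values),
        "geometric mean": geometric_mean(values),
        "midrange": midrange(values),
    }
    for day, values in counts.items()
}

for day, stats in summary.items():
    print(day, stats)
```

Computing the geometric mean through logarithms avoids forming a large product, and returning `None` for Wednesday mirrors the NA entry caused by its zeros.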
c.
The geometric mean is very different from the other measures. The mean, median, and midrange
are close in value.
d.
The mean or median is a better measure of central tendency for this type of data. There are more
empty seats on Monday and Tuesday than on Wednesday and Thursday. If one stands by on Thursday,
there is little chance of getting on a flight compared with earlier in the week.
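Thursday's column, with nine values of 1 and a single 10, is a convenient case for practicing standardized values. The sketch below is illustrative only: it assumes the sample standard deviation (n - 1 divisor) and the common convention that values more than 2 standard deviations from the mean are unusual and values more than 3 are outliers. The names `thursday`, `z_scores`, and `within` are hypothetical.

```python
from statistics import mean, stdev

# Thursday values from part (a) of Problem 4.8.
thursday = [1, 1, 1, 1, 1, 1, 1, 1, 1, 10]

m = mean(thursday)
s = stdev(thursday)  # sample standard deviation (n - 1 divisor)

# Standardize each observation: z = (x - mean) / s.
z_scores = [(x - m) / s for x in thursday]

# Share of observations within 1, 2, and 3 standard deviations of the mean.
within = {
    k: sum(abs(z) <= k for z in z_scores) / len(z_scores)
    for k in (1, 2, 3)
}

print(f"mean = {m}, s = {s:.3f}")
print(f"z-score of the largest value: {max(z_scores):.2f}")
for k, share in within.items():
    print(f"within {k}s: {share:.0%}")
```

Under these assumptions the value 10 lies between 2 and 3 standard deviations above the mean, so it would be flagged as unusual but not as an outlier. With only ten strongly skewed observations, the shares need not match the Empirical Rule percentages.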
4.16
a.
                                Quiz 1   Quiz 2   Quiz 3   Quiz 4
Count                           10       10       10       10
Mean                            72.00    72.00    76.00    76.00
sample standard deviation       13.23    6.67     11.41    27.43
coefficient of variation (CV)   18.38%   9.26%    15.02%   36.09%
b.
Scores, on average, are higher for Quiz 3 and Quiz 4. Quiz 2 has the least relative variation and
Quiz 4 the most. As the quiz scores increase from quiz to quiz, the variation within a quiz
increases.
c.
While students fared the same on average on Quizzes 1 and 2, the small dispersion of Quiz 2
suggests that its scores were much more similar than those of Quiz 1. If the quizzes were given in order
from 1 to 4, the difference between student performance on 1 and that on 2 suggests that some
students began with a level of knowledge that enabled them to excel on the first quiz, but by the
second one their advantage had lessened. Moving from Quiz 3 to Quiz 4, the opposite situation
appears. While the relative dispersion of Quiz 3 is comparable to that of Quiz 1, the relative
dispersion of Quiz 4 is double that of the previous high. This last quiz appears to detect a great
deal of difference in the level of knowledge among the students.
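The coefficients of variation in part (a) follow from CV = 100 × s / mean. The sketch below recomputes them from the rounded means and standard deviations in the table, so the last digit can differ slightly from the published figures (Quiz 3, for example). The dictionary name `quizzes` is an illustrative choice.

```python
# (mean, sample standard deviation) for each quiz, as reported in part (a).
quizzes = {
    "Quiz 1": (72.00, 13.23),
    "Quiz 2": (72.00, 6.67),
    "Quiz 3": (76.00, 11.41),
    "Quiz 4": (76.00, 27.43),
}

# Coefficient of variation: standard deviation as a percent of the mean.
cv = {name: 100 * s / m for name, (m, s) in quizzes.items()}

for name, value in cv.items():
    print(f"{name}: CV = {value:.2f}%")

least = min(cv, key=cv.get)
most = max(cv, key=cv.get)
print(f"least relative variation: {least}; most: {most}")
```

Because the CV is unit-free, it allows Quizzes 1 and 2 (mean 72) to be compared directly with Quizzes 3 and 4 (mean 76).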
4.18
From MegaStat:
empirical rule
mean - 1s                      19.49
mean + 1s                      35.20
percent in interval (68.26%)   68.8%
mean - 2s                      11.64
mean + 2s                      43.05
percent in interval (95.44%)   96.9%
mean - 3s                      3.79
mean + 3s                      50.90
percent in interval (99.73%)   100.0%
b.
There are no outliers based on the empirical rule. No unusual data values.
c.
Assume the data are normally distributed based on the empirical rule results.
d.
Yes, the sample size is large enough.
4.20
a.
1st quartile   22.50
Median         26.00
3rd quartile   33.00
midhinge       27.75
b.
low extremes    0
low outliers    0
high outliers   0
high extremes   0
c.
The median number of customers is 26. Days with 22 or fewer customers are in the bottom
quartile. Days with 33 or more customers are in the upper quartile. The midhinge is 27.75 and is a
measure of central tendency. The box plot displays the first and third quartiles and the median.
The box plot indicates that there are no extreme values or outliers.
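The midhinge and the box plot fences can be reproduced from the two quartiles alone. This is a sketch under a stated assumption: it uses the conventional fence definitions (inner fences 1.5 × IQR and outer fences 3 × IQR beyond the quartiles), which may not be exactly what the software applied. The `classify` helper and the values passed to it are hypothetical.

```python
# Quartiles reported in part (a) of Problem 4.20.
q1, q3 = 22.50, 33.00

midhinge = (q1 + q3) / 2   # average of the first and third quartiles
iqr = q3 - q1              # interquartile range

# Conventional fences (an assumption, see the note above).
inner_fences = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
outer_fences = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)


def classify(x):
    """Label a value as 'extreme', 'outlier', or 'typical' relative to the fences."""
    if x < outer_fences[0] or x > outer_fences[1]:
        return "extreme"
    if x < inner_fences[0] or x > inner_fences[1]:
        return "outlier"
    return "typical"


print(f"midhinge = {midhinge}, IQR = {iqr}")
print(f"inner fences = {inner_fences}, outer fences = {outer_fences}")
print(classify(26.0))  # the median falls well inside the fences
```

Under these definitions a day would need fewer than 6.75 or more than 48.75 customers to count as an outlier, which is consistent with the zero counts in part (b).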
Quiz Yourself Answers
True/False
1 F
2 T
3 F
4 T
5 T
6 F
7 T
Multiple Choice
1 D
2 A
3 C
4 B
5 D
6 A
7 C
8 B
9 D
10 C
11 B
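The answers to multiple choice questions 5 and 6 can be checked with the grouped-data formulas, which treat every observation in a class as if it sat at the class midpoint. The sketch below assumes the sample (n - 1) divisor for the variance. The names `classes`, `grouped_mean`, `grouped_var`, and `grouped_sd` are illustrative.

```python
import math

# (class limits, frequency) from the textbook price table.
classes = [
    ((25, 35), 2),
    ((35, 45), 16),
    ((45, 55), 5),
    ((55, 65), 7),
    ((65, 75), 20),
]

n = sum(f for _, f in classes)

# Represent each class by its midpoint.
midpoints = [((lo + hi) / 2, f) for (lo, hi), f in classes]

# Grouped mean: frequency-weighted average of the midpoints.
grouped_mean = sum(m * f for m, f in midpoints) / n

# Grouped sample variance and standard deviation (n - 1 divisor).
grouped_var = sum(f * (m - grouped_mean) ** 2 for m, f in midpoints) / (n - 1)
grouped_sd = math.sqrt(grouped_var)

print(f"n = {n}")
print(f"estimated mean = {grouped_mean:.2f}")
print(f"estimated variance = {grouped_var:.2f}")
print(f"estimated standard deviation = {grouped_sd:.2f}")
```

The variance printed here matches one of the distractors in question 6, a reminder that the standard deviation is its square root.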