Chapter 3
Data Description
Section 3-3
Measures of Variation
Range
Variance
Sample Variance
Sample Standard Deviation
Shortcut
Section 3-3 Exercise #7
The number of incidents in which police were needed for a
sample of ten schools in Allegheny County is
7, 37, 3, 8, 48, 11, 6, 0, 10, 3. Assume the data
represent samples.
Find the range.
Use the shortcut formula for the unbiased
estimator to compute the variance and
standard deviation.
Is the data consistent or does it vary? Explain.
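The exercise above can be checked with a short Python sketch. It assumes the usual shortcut formula for the unbiased estimator, s² = [n ΣX² − (ΣX)²] / [n(n − 1)]; the function and variable names are illustrative and do not come from the slides.

```python
import math
import statistics

# Incidents for the ten sampled schools (from the exercise).
incidents = [7, 37, 3, 8, 48, 11, 6, 0, 10, 3]


def shortcut_variance(values):
    """Unbiased sample variance via the shortcut formula."""
    n = len(values)
    sum_x = sum(values)                  # sum of X
    sum_x2 = sum(x * x for x in values)  # sum of X squared
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))


data_range = max(incidents) - min(incidents)
variance = shortcut_variance(incidents)
std_dev = math.sqrt(variance)

# Cross-check against the definitional formula in the standard library.
assert math.isclose(variance, statistics.variance(incidents))

print(f"range = {data_range}")   # range = 48
print(f"s^2 = {variance:.1f}")   # s^2 = 254.7
print(f"s = {std_dev:.1f}")      # s = 16.0
```

A standard deviation of about 16 against a mean of 13.3 suggests that the data vary widely rather than being consistent.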
Finding the Sample Variance and
Standard Deviation for Grouped
Data
Section 3-3 Exercise #21
The data show the number of murders in 25
selected cities. Find the variance and
standard deviation.

Number      f
27-90      13
91-154      2
155-218     0
219-282     5
283-346     0
347-410     2
411-474     0
475-538     1
539-602     2
Class    Xm    f    f · Xm    f · Xm²
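A minimal sketch of the grouped-data calculation follows. It fills in the Xm, f · Xm, and f · Xm² columns and then applies the shortcut formula s² = [n Σ(f · Xm²) − (Σ f · Xm)²] / [n(n − 1)]. It assumes the eighth class runs from 475 to 538, so that every class has the same width and no two classes overlap; the variable names are illustrative.

```python
import math

# (lower limit, upper limit, frequency) for each class.
# Assumption: the eighth class is 475-538, matching the common class width.
classes = [
    (27, 90, 13), (91, 154, 2), (155, 218, 0), (219, 282, 5),
    (283, 346, 0), (347, 410, 2), (411, 474, 0), (475, 538, 1),
    (539, 602, 2),
]

n = sum(f for _, _, f in classes)  # total frequency
sum_f_xm = 0.0                     # sum of f * Xm
sum_f_xm2 = 0.0                    # sum of f * Xm^2
for lower, upper, f in classes:
    xm = (lower + upper) / 2       # class midpoint
    sum_f_xm += f * xm
    sum_f_xm2 += f * xm ** 2

variance = (n * sum_f_xm2 - sum_f_xm ** 2) / (n * (n - 1))
std_dev = math.sqrt(variance)

print(n, sum_f_xm, sum_f_xm2)                        # 25 4662.5 1582260.25
print(f"s^2 = {variance:.1f}, s = {std_dev:.1f}")    # s^2 = 29696.0, s = 172.3
```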
Section 3-3 Exercise #33
The mean of a distribution is 20 and the standard
deviation is 2. Answer each. Use Chebyshev’s theorem.
a. At least what percentage of the values will
fall between 10 and 30?
b. At least what percentage of the
values will fall between 12 and 28?
a. Subtract the mean from the larger value: 30 – 20 = 10.
Divide by the standard deviation to get k: 10/2 = 5.
b. Subtract the mean from the larger value: 28 – 20 = 8.
Divide by the standard deviation to get k: 8/2 = 4.
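Chebyshev's theorem says that at least 1 − 1/k² of the values lie within k standard deviations of the mean, for k greater than 1. The sketch below applies that bound to the two k values found above; the function name is illustrative, and it assumes each interval is symmetric about the mean, as both are here.

```python
def chebyshev_min_fraction(mean, std_dev, upper):
    """Minimum fraction of values within (upper - mean) of the mean."""
    k = (upper - mean) / std_dev
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


for label, upper in (("a", 30), ("b", 28)):
    fraction = chebyshev_min_fraction(20, 2, upper)
    print(f"{label}. at least {fraction:.2%}")
# a. at least 96.00%
# b. at least 93.75%
```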
Chebyshev’s theorem
The Empirical (Normal) Rule
Chebyshev’s theorem applies to any distribution
regardless of its shape. However, when a distribution
is bell-shaped (or what is called normal), the
following statements, which make up the empirical
rule, are true.
Approximately 68% of the data values will fall within 1
standard deviation of the mean.
Approximately 95% of the data values will fall within 2
standard deviations of the mean.
Approximately 99.7% of the data values will fall within
3 standard deviations of the mean.
Section 3-3 Exercise #41
The average U.S. yearly per capita consumption of citrus
fruits is 26.8 pounds. Suppose that the distribution of
fruit amounts consumed is bell-shaped with a standard
deviation equal to 4.2 pounds.
What percentage of Americans would you expect to
consume more than 31 pounds of citrus
fruit per year?
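Because 31 = 26.8 + 4.2, a consumption of 31 pounds lies 1
standard deviation above the mean. By the Empirical Rule, about
68% of consumption values fall within 1 standard deviation of
the mean, which leaves 32% outside that interval. A bell-shaped
distribution is symmetric, so 1/2 of 32%, or 16%, of Americans
would be expected to consume more than 31 pounds of citrus
fruit per year.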
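As a rough check, the sketch below repeats the Empirical Rule arithmetic and compares it with the tail area of an exact normal model. Using `statistics.NormalDist` is an addition for comparison only and is not part of the slides.

```python
from statistics import NormalDist

mean, std_dev, cutoff = 26.8, 4.2, 31.0

# How many standard deviations above the mean is the cutoff?
z = (cutoff - mean) / std_dev

# Empirical Rule: about 68% within 1 standard deviation, so the
# remaining 32% is split evenly between the two tails.
within_one_sd = 0.68
rule_tail = (1 - within_one_sd) / 2

# Exact upper-tail area under a normal model, for comparison.
exact_tail = 1 - NormalDist(mean, std_dev).cdf(cutoff)

print(f"z = {z:.2f}")                                  # z = 1.00
print(f"empirical rule: {rule_tail:.0%}")              # empirical rule: 16%
print(f"normal model:   {exact_tail:.1%}")             # normal model:   15.9%
```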
Section 3-4
Measures of Position
A z score or standard score for a value is obtained by
subtracting the mean from the value and dividing the
result by the standard deviation. The symbol for a
standard score is z. The formula is
z = (value – mean) / standard deviation
For a sample, z = (X – X̄) / s.
Section 3-4 Exercise #13
Which of the following exam scores has a better
relative position?
a. A score of 42 on an exam with X̄ = 39 and s = 4
b. A score of 76 on an exam with X̄ = 71 and s = 3
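Comparing relative positions comes down to comparing z scores. A minimal sketch, with an illustrative function name:

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations a value lies from the mean."""
    return (value - mean) / std_dev


z_a = z_score(42, 39, 4)
z_b = z_score(76, 71, 3)
better = "b" if z_b > z_a else "a"

print(f"z_a = {z_a:.2f}, z_b = {z_b:.2f}, better: {better}")
# z_a = 0.75, z_b = 1.67, better: b
```

The score of 76 has the higher z score, so it has the better relative position.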
Percentile Formula
Section 3-4 Exercise #22
Find the percentile ranks of each weight in the data set.
The weights are in pounds.
Data: 78, 82, 86, 88, 92, 97
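The sketch below computes the percentile ranks under one common textbook definition, which is an assumption here because the slide's formula did not survive: percentile = (number of values below X + 0.5) / (total number of values) × 100, rounded to the nearest whole number.

```python
weights = [78, 82, 86, 88, 92, 97]


def percentile_rank(values, x):
    """Percentile rank of x: (count below x + 0.5) / n * 100."""
    below = sum(1 for v in values if v < x)
    return (below + 0.5) / len(values) * 100


ranks = {w: round(percentile_rank(weights, w)) for w in weights}
print(ranks)
# {78: 8, 82: 25, 86: 42, 88: 58, 92: 75, 97: 92}
```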
Section 3-4 Exercise #23
For the weights in Exercise #22, what value corresponds to the
30th percentile?
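One way to locate the value at a given percentile, sketched under an assumed textbook procedure: order the data, compute c = n · p / 100, round c up to the next whole number if it is not whole and take the value in that position, or average the values in positions c and c + 1 if it is whole. The weights are the six values 78, 82, 86, 88, 92, 97 pounds; the function name is illustrative.

```python
import math

weights = [78, 82, 86, 88, 92, 97]


def value_at_percentile(values, p):
    """Value at the p-th percentile using the c = n * p / 100 procedure."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    c = len(ordered) * p / 100
    if c != int(c):
        # Not a whole number: round up and take that position (1-based).
        return ordered[math.ceil(c) - 1]
    # Whole number: average the c-th and (c + 1)-th values (1-based).
    position = int(c)
    return (ordered[position - 1] + ordered[position]) / 2


print(value_at_percentile(weights, 30))  # 82
```

Here c = 6 · 30 / 100 = 1.8, which rounds up to 2, so the second ordered value, 82, corresponds to the 30th percentile.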
Section 3-5
Exploratory Data Analysis
The Five-Number Summary and
Boxplots
A boxplot is a graph of a data set
obtained by drawing a horizontal line from
the minimum data value to Q1, drawing a
horizontal line from Q3 to the maximum
data value, and drawing a box whose
vertical sides pass through Q1 and Q3
with a vertical line inside the box passing
through the median or Q2.
Section 3-5 Exercise #1
Identify the five number summary and find
the interquartile range.
8, 12, 32, 6, 27, 19, 54
Data arranged in order: 6, 8, 12, 19, 27, 32, 54
Minimum: 6
Median: 19
Maximum: 54
Q1: 8
Q3: 32
Interquartile Range: 32 – 8 = 24
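A minimal sketch of the five-number summary for this exercise. It assumes the quartiles are found as the medians of the lower and upper halves of the ordered data, leaving out the overall median when the number of values is odd; other quartile conventions can give slightly different results. The function name is illustrative.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using median-of-halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # Skip the middle value when the count is odd.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


data = [8, 12, 32, 6, 27, 19, 54]
minimum, q1, med, q3, maximum = five_number_summary(data)

print(minimum, q1, med, q3, maximum)  # 6 8 19 32 54
print("IQR =", q3 - q1)               # IQR = 24
```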
Section 3-5 Exercise #9
Use the boxplot to identify the maximum value, minimum
value, median, first quartile, third quartile, and
interquartile range.
Information Obtained from a Boxplot
1. a. If the median is near the center of the box, the
distribution is approximately symmetric.
b. If the median falls to the left of the center of the
box, the distribution is positively skewed.
c. If the median falls to the right of the center, the
distribution is negatively skewed.
2. a. If the lines are about the same length, the
distribution is approximately symmetric.
b. If the right line is larger than the left line, the
distribution is positively skewed.
c. If the left line is larger than the right line, the
distribution is negatively skewed.
Section 3-5 Exercise #15
9.8   8.0   13.9   4.4   3.9   21.7
15.9   3.2   11.7   24.8   34.1   17.6
These data are the number of inches of
snow reported in randomly selected cities
for September 1 through January 10.
Construct a boxplot and comment on the
skewness of the data.
Data arranged in order: 3.2, 3.9, 4.4, 8.0, 9.8, 11.7, 13.9, 15.9, 17.6, 21.7, 24.8, 34.1
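The sketch below computes the values needed to draw the boxplot for the snowfall data and then compares the median's position in the box and the lengths of the two lines, following the skewness guidelines listed earlier. It assumes quartiles are the medians of the lower and upper halves of the ordered data; the variable names are illustrative.

```python
from statistics import median

snow = [9.8, 8.0, 13.9, 4.4, 3.9, 21.7, 15.9, 3.2, 11.7, 24.8, 34.1, 17.6]

ordered = sorted(snow)
half = len(ordered) // 2
lower = ordered[:half]
# Skip the middle value only when the count is odd (here it is even).
upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]

q1, med, q3 = median(lower), median(ordered), median(upper)
left_line = q1 - ordered[0]     # minimum to Q1
right_line = ordered[-1] - q3   # Q3 to maximum
box_center = (q1 + q3) / 2

print(f"min = {ordered[0]}, Q1 = {q1:.2f}, median = {med:.2f}, "
      f"Q3 = {q3:.2f}, max = {ordered[-1]}")
print(f"left line = {left_line:.2f}, right line = {right_line:.2f}")
print(f"median - box center = {med - box_center:.3f}")
```

The median (12.8) sits slightly to the left of the box center, and the right line (about 14.45) is much longer than the left line (about 3.0), so the data appear positively skewed.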
Section 3-5 Exercise #16
These data represent the volumes in cubic yards of the
largest dams in the United States and in South America.
Construct a boxplot of the data for each
region and compare the distributions.
United States    South America
125,628          311,539
92,000           274,026
78,008           105,944
77,700           102,014
66,500           56,242
62,850           46,563
52,435
50,000