Statistics
Review for CH 2 Test
CHAPTER 2 REVIEW
Data analysis is the art of describing data using graphs and numerical summaries. The purpose of data analysis
is to help us see and understand the most important features of a set of data. Section 2.1 focused on graphical
displays: pie charts, bar graphs, dotplots, stemplots, and histograms. Section 2.2 showed how to describe
shape, center, and spread for distributions of quantitative variables. Section 2.3 examined proper and improper
uses of graphs (line graphs, pictograms) and numerical summaries. Figure 2.30 organizes the big ideas. We plot
our data, and then we describe their center and spread using either the mean and standard deviation or the
median and IQR.
Figure 2.30 To describe the distribution of a quantitative variable, plot, interpret, and summarize.
Here is a review list of the most important skills you should have developed from your study of this
chapter.
A. DISPLAYING DISTRIBUTIONS
Recognize when a pie chart can and cannot be used.
Make a bar graph to display the distribution of a categorical variable or, more generally, to compare related quantities.
Interpret pie charts and bar graphs.
Make a line graph of a quantitative variable over time. Recognize patterns such as trends and seasonal variation
in line graphs.
Be aware of graphical abuses, especially in line graphs and pictograms.
Make a dotplot or stemplot of the distribution of a small set of observations.
Make a histogram of the distribution of a quantitative variable.
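As an illustration of the stemplot skill above, here is a minimal Python sketch. The function name stemplot, its split option, and the list of ages are all hypothetical and invented for this example; it assumes non-negative whole numbers, with the ones digit as the leaf.

```python
from collections import defaultdict

def stemplot(values, split=False):
    """Return a text stemplot for non-negative whole numbers.

    Stem = all digits except the last; leaf = the ones digit.
    With split=True each stem gets two lines: leaves 0-4, then leaves 5-9.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 1 if (split and leaf >= 5) else 0
        rows[(stem, half)].append(leaf)

    lo, hi = min(values) // 10, max(values) // 10
    halves = (0, 1) if split else (0,)
    lines = []
    for stem in range(lo, hi + 1):        # include empty stems so gaps stay visible
        for half in halves:
            leaves = "".join(str(d) for d in rows.get((stem, half), []))
            lines.append(f"{stem:>2} | {leaves}")
    return "\n".join(lines)

# Hypothetical data, invented for illustration only
ages = [46, 49, 53, 56, 57, 58, 60, 63, 64, 65,
        67, 67, 68, 71, 72, 73, 78, 80, 83, 90]
print(stemplot(ages))
print()
print(stemplot(ages, split=True))
```

Printing both versions side by side is one way to judge whether splitting the stems shows the shape of the distribution more clearly.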
B. DESCRIBING DISTRIBUTIONS (QUANTITATIVE VARIABLE)
Look for the overall pattern of a dotplot, stemplot, or histogram and for major deviations from the pattern.
Assess from a dotplot, stemplot, or histogram whether the shape of a distribution is roughly
symmetric, distinctly skewed, or neither. Assess whether the distribution has one or more major peaks.
Describe the overall pattern by giving numerical measures of center and spread in addition to a verbal
description of shape.
Decide which measures of center and spread are more appropriate: the mean and standard
deviation (especially for symmetric distributions) or the five-number summary (especially for skewed
distributions).
Identify outliers and give plausible explanations for them.
Compare distributions of categorical or quantitative variables.
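One common way to screen for possible outliers is the 1.5 x IQR rule, which flags any observation more than 1.5 x IQR below Q1 or above Q3. That rule is not stated in this review list, so treat the sketch below as an assumption about method. The function name flag_outliers and the data are hypothetical, and Python's statistics.quantiles uses a quartile convention that may differ slightly from a hand calculation.

```python
import statistics

def flag_outliers(values, k=1.5):
    """Return the observations lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)   # Q1, median, Q3
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical data, invented for illustration only
scores = [10, 12, 13, 14, 15, 16, 18, 45]
print(flag_outliers(scores))
```

A flagged value is only a candidate: the skill listed above still asks for a plausible explanation of each outlier.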
C. NUMERICAL SUMMARIES OF DISTRIBUTIONS
Find the median M and the quartiles Q1 and Q3 for a set of observations.
Give the five-number summary and draw a boxplot; use boxplots to compare distributions.
Find the mean and (using a calculator) the standard deviation s for a small set of observations.
Understand that the median is less affected by extreme observations than the mean. Recognize that skewness in
a distribution moves the mean away from the median toward the long tail.
Know the basic properties of the standard deviation: s = 0 only when all observations are identical; s increases
as the spread increases; s has the same units as the original measurements; s is pulled strongly up by outliers or
skewness.
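The numerical summaries above can be sketched in Python as follows. This assumes the hand-calculation convention in which Q1 and Q3 are the medians of the lower and upper halves of the sorted data (leaving out the overall median when the number of observations is odd); software often uses other quartile rules. The function name five_number_summary and the data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Q1 and Q3 are the medians of the lower and upper halves of the sorted
    data; the overall median is excluded from both halves when n is odd.
    """
    xs = sorted(values)
    n = len(xs)
    lower = xs[: n // 2]
    upper = xs[(n + 1) // 2:]
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

# Hypothetical data, invented for illustration only
data = [2, 4, 4, 5, 6, 7, 9]
with_outlier = data + [40]

for label, xs in [("original", data), ("with outlier", with_outlier)]:
    print(label, five_number_summary(xs),
          "mean =", round(statistics.mean(xs), 2),
          "s =", round(statistics.stdev(xs), 2))   # stdev divides by n - 1

# s = 0 only when all observations are identical
print(statistics.stdev([5, 5, 5, 5]))
```

Adding the single extreme value 40 moves the median only from 5 to 5.5, while the mean and s rise sharply, which is the resistance property described in the list.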
2.67 Longevity of presidents Table 2.9 shows the ages at death of U.S. presidents.
(a) Make a stemplot of these data. Did you decide to split the stems?
(b) Now make a histogram. Describe the shape, center, and spread of the distribution. Are there any outliers?
(c) Which plot is better at displaying this distribution: a stemplot or a histogram? Why?
2.69 Smoking-related deaths Below is a table from Smoking and Health Now, a report of the British Royal
College of Physicians. It shows the number and percent of deaths among men aged 35 and over from the chief
diseases related to smoking. One of the entries in the table is incorrect and an erratum slip was inserted to
correct it. Which entry is wrong, and what is the correct value?
2.70 Who sells cars? Figure 2.31 is a pie chart of the percent of passenger car sales in a given year by various
manufacturers.
(a) The artist has tried to make the graph more interesting by using the wheel of a car for the “pie.” Is the graph
still a correct display of the data? Explain your answer.
(b) Make a bar graph of the data. What advantage does your new graph have over the pie chart in Figure 2.31?
Figure 2.31 Passenger car sales by several manufacturers in the same year.
2.71 We pay high interest Figure 2.32 shows a graph taken from an advertisement for an investment that
promises to pay a higher interest rate than bank accounts and other competing investments. Is this graph a
correct comparison of the four interest rates? Explain your answer.
Figure 2.32 Pictogram comparing interest rates.
2.72 State SAT scores Figure 2.33 is a histogram of the average scores on the mathematics part of the SAT
exam for students in the 50 states and the District of Columbia. The distinctive overall shape of this
distribution implies that a single measure of center such as the mean or the median is of little value in
describing the distribution. Explain why.
Figure 2.33 Histogram of the average scores on the SAT Math exam for students in the
50 states and the District of Columbia.
2.73 Are you wired? In late 2006, the Pew Internet and American Life Project conducted a telephone survey of
935 adults and their children aged 12 to 17. One question they asked was “Do you, personally, happen to
have…a desktop computer? A cell phone? An iPod or other MP3 player? A laptop computer? A PDA like a Palm
Pilot or Blackberry?” The table below summarizes the responses.
Make a bar graph comparing teens’ and parents’ ownership of these devices.
2.74 Home run king In 1927, Babe Ruth broke Major League Baseball’s single-season home run record by
hitting 60 home runs. The Babe’s record stood until 1961, when Roger Maris hit 61 homers in a season. Almost
40 years later, Mark McGwire (70) and Sammy Sosa (68) excited baseball fans by smashing Maris’s record in the
same season. Barry Bonds captured the record in 2001 by hitting 73 home runs in a season. Below are data on
the number of home runs hit by Bonds and McGwire each season during the prime of their careers. Who is the
better home run hitter? Make comparative boxplots and provide numerical evidence to support your answer.
2.75 Getting more sleep An experiment was carried out with 10 patients to investigate the effectiveness of a
drug that was designed to increase sleep time. The data below show the number of additional hours of sleep
gained by each subject after taking the drug. (A negative value indicates that the subject got less sleep after
taking the drug.) Do these data provide sufficient evidence to conclude that the drug was effective? Follow
the four-step problem-solving process from Chapter 1 in answering this question.
1.9 0.8 1.1 0.1 −0.1 4.4 5.5 1.6 4.6 3.4
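For the numerical part of the analysis, a short Python sketch can compute the basic summaries of these ten values. The variable names are hypothetical, and the summaries alone do not settle whether the evidence is sufficient; they only describe the data.

```python
import statistics

# Additional hours of sleep for the 10 subjects, copied from the exercise
gain = [1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4]

mean_gain = statistics.mean(gain)
median_gain = statistics.median(gain)
sd_gain = statistics.stdev(gain)               # sample standard deviation s (n - 1)
n_gained = sum(1 for g in gain if g > 0)       # subjects who slept longer

print(f"mean = {mean_gain:.2f} h, median = {median_gain:.2f} h, s = {sd_gain:.2f} h")
print(f"{n_gained} of {len(gain)} subjects gained sleep")
```

A dotplot or stemplot of the same values would normally accompany these numbers before drawing any conclusion.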
2.76 Sleep: hours or minutes? Refer to the previous exercise. Suppose the sleep increase data are converted
from hours to minutes. How will this affect the mean, median, standard deviation, and IQR? Explain.
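One way to check an answer is to convert the data and recompute each summary. The sketch below uses the sleep data from the previous exercise; the helper name summaries is hypothetical, and statistics.quantiles uses a quartile convention that may give a slightly different IQR than a hand calculation, although the effect of the unit change is the same under any convention.

```python
import math
import statistics

# Additional hours of sleep, copied from the previous exercise
hours = [1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4]
minutes = [h * 60 for h in hours]              # change of units: hours -> minutes

def summaries(xs):
    """Return the mean, median, sample standard deviation, and IQR of xs."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return {
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "sd": statistics.stdev(xs),
        "iqr": q3 - q1,
    }

in_hours = summaries(hours)
in_minutes = summaries(minutes)
for name in in_hours:
    ratio = in_minutes[name] / in_hours[name]
    print(f"{name}: {in_hours[name]:.3f} h -> {in_minutes[name]:.1f} min "
          f"(ratio {ratio:.1f})")
```

Comparing the ratio column across the four rows shows how measures of center and measures of spread each respond to multiplying every observation by the same constant.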