Student Notes – Prep Session Topic: Exploring Data
Content
The AP Statistics topic outline contains a long list of items in the category titled Exploring Data.
These items are copied below. In this session we will work only on sections A, B, C, and E.
I. Exploring Data: Describing patterns and departures from patterns
Exploratory analysis of data makes use of graphical and numerical techniques to study patterns
and departures from patterns. Emphasis should be placed on interpreting information from graphical and numerical displays and summaries.
A. Constructing and interpreting graphical displays of distributions of univariate data (dotplot,
stemplot, histogram, cumulative frequency plot)
1. Center and spread
2. Clusters and gaps
3. Outliers and other unusual features
4. Shape
B. Summarizing distributions of univariate data
1. Measuring center: median, mean
2. Measuring spread: range, interquartile range, standard deviation
3. Measuring position: quartiles, percentiles, standardized scores (z-scores)
4. Using boxplots
5. The effect of changing units on summary measures
C. Comparing distributions of univariate data (dotplots, back-to-back stemplots, parallel boxplots)
1. Comparing center and spread: within group, between group variation
2. Comparing clusters and gaps
3. Comparing outliers and other unusual features
4. Comparing shapes
D. Exploring bivariate data
1. Analyzing patterns in scatterplots
2. Correlation and linearity
3. Least-squares regression line
4. Residual plots, outliers, and influential points
5. Transformations to achieve linearity: logarithmic and power transformations
E. Exploring categorical data
1. Frequency tables and bar charts
2. Marginal and joint frequencies for two-way tables
3. Conditional relative frequencies and association
4. Comparing distributions using bar charts
Gloria Barrett and Daren Starnes, Virginia Advanced Study Strategies
December, 2009
Formulas Provided
There are two formulas related to Topic I, Sections A, B, C, and E that are provided on the formula
sheet:
\[ \bar{x} = \frac{\sum x_i}{n} \]
and
\[ s_x = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2} \]
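As a check on how the two formula-sheet formulas work, here is a minimal Python sketch that computes the sample mean and sample standard deviation directly from their definitions. The data values and the function names `sample_mean` and `sample_sd` are made up for illustration and do not come from any exam question.

```python
import math
import statistics


def sample_mean(xs):
    """x-bar: the sum of the observations divided by n."""
    return sum(xs) / len(xs)


def sample_sd(xs):
    """s_x: square root of the sum of squared deviations divided by n - 1."""
    xbar = sample_mean(xs)
    squared_deviations = sum((x - xbar) ** 2 for x in xs)
    return math.sqrt(squared_deviations / (len(xs) - 1))


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("mean:", sample_mean(data))
print("sample SD:", sample_sd(data))
# The standard library's stdev also divides by n - 1, so it should agree.
print("statistics.stdev:", statistics.stdev(data))
```

Note the divisor n - 1 rather than n: this is the sample standard deviation, which is the one given on the formula sheet.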
Calculator Use
To save time on the exam, you may want to use your calculator to compute summary statistics.
Specifically, you will want to know how to enter data in lists and calculate 1-Var Stats.
Note: When you use your calculator for computations on Free Response questions, it will be very
important to provide proper communication and support for your work. AP Exam readers are instructed not to consider calculator syntax as sufficient support for answers.
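For readers who want to see what a 1-Var Stats style summary contains, the sketch below reproduces the usual values in Python. It assumes the common classroom convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the overall median left out of both halves when n is odd. The function name `one_var_stats` and the scores are hypothetical, and your calculator or software may use a slightly different quartile rule.

```python
from statistics import mean, median, stdev


def one_var_stats(data):
    """Return a dictionary of summary statistics for a list of numbers.

    Assumption: quartiles are medians of the lower and upper halves,
    excluding the middle value when the number of observations is odd.
    Requires at least two observations.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return {
        "n": n,
        "mean": mean(xs),
        "Sx": stdev(xs),      # sample standard deviation (divisor n - 1)
        "min": xs[0],
        "Q1": median(lower),
        "Med": median(xs),
        "Q3": median(upper),
        "max": xs[-1],
    }


# Hypothetical test scores, not taken from any exam question.
scores = [62, 70, 74, 78, 81, 85, 88, 93, 97]
summary = one_var_stats(scores)
for name, value in summary.items():
    print(name, value)
print("IQR", summary["Q3"] - summary["Q1"])
```

On a Free Response question you would still need to report these values in context and show supporting work; output like this is not sufficient support on its own.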
Reminders about concepts and communication
• If you are asked to make a graph, be sure to include a title, labels on the horizontal and vertical axes, and scales on both axes (if appropriate). Also, if the graph includes multiple data sets (for example, parallel boxplots), be sure to label each plot.
• Be careful when you describe the shape of a mound-shaped, approximately symmetric distribution. The distribution may or may not be normal. Graders will accept the description "approximately normal," but they will not accept a claim that the distribution is normal based only on a mound-shaped, symmetric graph.
• Be careful to use the correct term when you describe the shape of a uniform distribution.
• If you are asked to provide information about a distribution based on a graph, you should always comment on center, shape, and spread. If there are unusual features, for example outliers, clusters, or gaps, you should also comment on those. All discussion should be in context.
• If you are asked to compare two distributions based on graphs, be sure to compare and describe the center, shape, and spread. Simply listing these features for both samples without a direct comparison has earned students no credit in the past. Also, saying that shapes are similar without describing the shape will not receive full credit.
• Right skewed is the same as skewed toward large values; left skewed is the same as skewed toward small values.
• If a distribution is approximately symmetric, the mean and median will be close in value. If a distribution is skewed, the mean will generally be pulled away from the median in the direction of the tail. So generally it will be correct to say “since the distribution is skewed to the right, we expect the mean to be greater than the median.”
• Knowing that the mean and median are unequal does not guarantee that the shape of the distribution is skewed. So it is risky (and generally not correct) to say something like “since the mean is greater than the median, the distribution is skewed to the right.”
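The last two reminders can be checked numerically. The Python sketch below uses two made-up data sets: one with a long right tail, where the mean exceeds the median as expected, and one with a single low value forming a left tail, where the mean still exceeds the median. The sign of the third central moment is used here as a rough numerical indicator of the direction of skew; that choice, the function name `third_central_moment`, and both data sets are assumptions of this sketch rather than part of the notes.

```python
from statistics import mean, median


def third_central_moment(xs):
    """Average cubed deviation from the mean.

    Positive values suggest a longer right tail, negative values a
    longer left tail. Used here only as a rough indicator of skew.
    """
    m = mean(xs)
    return sum((x - m) ** 3 for x in xs) / len(xs)


# Hypothetical data with a long right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9, 15]

# Hypothetical data with one low value stretching the left tail,
# yet the mean still lands above the median.
counterexample = [-3] + [1] * 6 + [2] * 5

for name, xs in [("right_skewed", right_skewed),
                 ("counterexample", counterexample)]:
    print(name,
          "mean =", round(mean(xs), 3),
          "median =", median(xs),
          "third moment =", round(third_central_moment(xs), 3))
```

The second data set is why "mean greater than median" should not be used by itself to justify a claim of right skew: always look at a graph of the data.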
Multiple Choice Questions from 2002 AP Exam
7. Suppose that the distribution of a set of scores has a mean of 47 and a standard deviation of 14. If 4 is added to each score, what will be the mean and the standard deviation of the distribution of new scores?
     Mean   Standard Deviation
A)   51     14
B)   51     18
C)   47     14
D)   47     16
E)   47     18
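The general principle behind this kind of question (outline item B.5, the effect of changing units on summary measures) can be explored on made-up data. The sketch below adds a constant to every value and, separately, multiplies every value by a constant, then compares the summary statistics. The scores are hypothetical and are not the data from the exam question.

```python
import math
from statistics import mean, stdev

# Hypothetical scores, chosen only for illustration.
scores = [30, 41, 47, 52, 65]

shifted = [x + 4 for x in scores]   # add a constant to every score
doubled = [2 * x for x in scores]   # multiply every score by a constant

print("original:", mean(scores), stdev(scores))
print("shifted :", mean(shifted), stdev(shifted))
print("doubled :", mean(doubled), stdev(doubled))

# Adding a constant moves every value, and so the center, by that amount,
# but leaves the deviations from the mean (and so the spread) unchanged.
shift_keeps_sd = math.isclose(stdev(shifted), stdev(scores))

# Multiplying by a positive constant scales both center and spread.
scale_multiplies_sd = math.isclose(stdev(doubled), 2 * stdev(scores))

print(shift_keeps_sd, scale_multiplies_sd)
```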
14. The boxplots shown above summarize two data sets, I and II. Based on the boxplots, which of the following statements about these two data sets CANNOT be justified?
A) The range of data set I is equal to the range of data set II.
B) The interquartile range of data set I is equal to the interquartile range of data set II.
C) The median of data set I is less than the median of data set II.
D) Data set I and data set II have the same number of data points.
E) About 75% of the values in data set II are greater than or equal to about 50% of the values in data set I.
20. A small town employs 34 salaried, nonunion employees. Each employee receives an annual salary increase of between $500 and $2,000 based on a performance review by the mayor's staff. Some employees are members of the mayor's political party, and the rest are not. Students at the local high school form two lists, A and B, one for the raises granted to employees who are in the mayor's party, and the other for raises granted to employees who are not. They want to display a graph (or graphs) of the salary increases in the student newspaper that readers can use to judge whether the two groups of employees have been treated in a reasonably equitable manner. Which of the following displays is least likely to be useful to readers for this purpose?
A) Back-to-back stemplots of A and B
B) Scatterplot of B versus A
C) Parallel boxplots of A and B
D) Histograms of A and B that are drawn to the same scale
E) Dotplots of A and B that are drawn to the same scale
27. The figure above shows a cumulative relative frequency histogram of 40 scores on a test given in an AP Statistics class. Which of the following conclusions can be made from the graph?
A) There is greater variability in the lower 20 test scores than in the higher 20 test scores.
B) The median test score is less than 50.
C) Sixty percent of the students had test scores above 80.
D) If the passing score is 70, most students did not pass the test.
E) The horizontal nature of the graph for test scores of 60 and below indicates that those scores occurred most frequently.
Multiple choice question from “Acorn Book”
Descriptive Statistics
5. Some descriptive statistics for a set of test scores are shown above. For this test, a certain student
has a standardized score of z = –1.2. What score did this student receive on the test?
(a) 266.28
(b) 779.42
(c) 1008.02
(d) 1083.38
(e) 1311.98
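Questions like this one rely on the relationship between a raw score and its standardized score, z = (x - mean) / SD, which rearranges to x = mean + z * SD. The descriptive statistics for the question are not reproduced in these notes, so the Python sketch below uses a made-up mean and standard deviation purely to show the conversion in both directions. The function names are hypothetical.

```python
import math


def z_from_score(x, mean, sd):
    """Standardized score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


def score_from_z(z, mean, sd):
    """Invert the z-score formula to recover the raw score."""
    return mean + z * sd


# Hypothetical summary statistics, NOT the values from the exam question.
hypothetical_mean = 100
hypothetical_sd = 15

raw = score_from_z(-1.2, hypothetical_mean, hypothetical_sd)
print("raw score for z = -1.2:", raw)
print("z recovered from that raw score:",
      z_from_score(raw, hypothetical_mean, hypothetical_sd))
```

A negative z-score always corresponds to a raw score below the mean, which is a quick way to rule out answer choices before doing any arithmetic.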
AP Exam Free Response Questions for Practice and Discussion
2006, #1: Comparing catapults
Solution—2006, #1
Notes for part (a):
Notes for part (c):
2006 Form B, #1: Home sales
Solution—2006 Form B, #1
2001, #1 LA Rainfall
Solution, 2001 #1
2005 Form B, #1
Solution—2005 Form B, #1