Name ______________________________ Block _____
STATISTICS Unit 2 STUDY GUIDE – Topics 6-10
Part 1: Vocabulary
For each word, be sure you know the definition, the formula, or what the graph looks like.
A. association
B. boxplot
C. center
D. conditional distribution
E. empirical rule
F. five-number summary
G. histogram
H. independent
I. interquartile range
J. lower quartile
K. marginal distribution
L. mean
M. mean absolute deviation
N. median
O. mode
P. modified boxplot
Q. outliers
R. outlier test
S. range
T. relative risk
U. resistant
V. segmented bar graph
W. skewed left
X. skewed right
Y. side-by-side stemplot
Z. Simpson’s paradox
AA. standard deviation
BB. standardization
CC. stemplot
DD. symmetric
EE. two-way table
FF. upper quartile
GG. variability
HH. z-score
_______ 1.
A graph for a quantitative variable that divides a distribution into 25% segments.
_______ 2.
A graph for a quantitative variable that divides a distribution into 25% segments and shows all mathematical
outliers.
_______ 3.
The minimum, Q1, median, Q3, and maximum.
_______ 4.
The “middle” of a distribution that can be described by the mean, median, or mode.
_______ 5.
The “middle” of a distribution that is also known as the average.
_______ 6.
The “middle” of a distribution that is the most frequently occurring number.
_______ 7.
The “middle” of a distribution that divides the list of numbers in half.
_______ 8.
A graph for a quantitative variable that has a column for part of the numbers and rows for the other part of
the numbers.
_______ 9.
A graph for a quantitative variable and a categorical binary variable that has a column for part of the
numbers and rows off to the left and right for the other part of the numbers.
_______ 10.
The description of a distribution’s shape that has a peak in the middle and tapers off evenly to the left and to
the right.
_______ 11.
The description of a distribution’s shape that has a peak on the left and tapers off to the right.
_______ 12.
The description of a distribution’s shape that has a peak on the right and tapers off to the left.
_______ 13.
About 68% of the data falls within 1 standard deviation of the mean,
about 95% of the data falls within 2 standard deviations of the mean, and
about 99.7% of the data falls within 3 standard deviations of the mean.
_______ 14.
Q3 – Q1
_______ 15.
maximum – minimum
_______ 16.
Q3 + (IQR * 1.5) and Q1 – (IQR * 1.5)
_______ 17.
The proportion of an event in one category compared to the proportion of the same event in a different
category. This value tells you how many times more likely the event is to occur in the first category than in
the second.
_______ 18.
Q3. This value divides a distribution into 75% and 25% segments.
_______ 19.
Q1. This value divides a distribution into 25% and 75% segments.
_______ 20.
A value (or values) that are significantly far away from the rest of the data.
_______ 21.
A measure of spread that is calculated by (1) subtracting the mean from each number in a distribution,
(2) taking the absolute value of each of the differences, then (3) taking the average of these differences.
_______ 22.
A measure of spread that is calculated by (1) subtracting the mean from each number in a distribution,
(2) squaring the differences, (3) adding the squared values, (4) dividing that sum by n – 1, and (5) taking
the square root of the quotient.
_______ 23.
When one variable has an effect on another variable, there is this between them.
_______ 24.
When one variable does not have any effect on another variable, they are said to be this.
_______ 25.
A graph for two categorical variables where one of the variables is represented in columns and the other
variable is represented as segments within the columns.
_______ 26.
This is a measure of how many standard deviations a value is from the mean.
_______ 27.
A phenomenon where overall proportions contradict proportions in separate categories.
_______ 28.
The process of measuring different distributions in standard deviations so comparisons can be made
between them.
_______ 29.
A tool for organizing two categorical variables in rows and columns.
_______ 30.
Proportions that are calculated within the columns of a two-way table.
_______ 31.
Proportions that are calculated within the margins of a two-way table.
_______ 32.
A measurement that doesn’t change when outliers are present is said to be this.
_______ 33.
Another term for the spread.
_______ 34.
A graph for a quantitative variable that is similar to a dotplot, but uses columns.
Part 2: General Knowledge Questions
35. What are the different measures of center?
_________________________
_________________________
_________________________
36. What are the different measures of spread?
_________________________
_________________________
_________________________
37. What are the 3 different shapes a distribution can have?
_________________________
_________________________
_________________________
38. What are the five values listed in the five-number summary?
_______________
_______________
_______________
_______________
_______________
39. Based on the five-number summary, what percent of the data falls:
Below the Q1? _______________
Between the Q1 and the Q3? _______________
Below the Median? _______________
Between the Min and the Q3? _______________
Below the Q3? _______________
Between the Q1 and the Max? _______________
40. For a normal distribution, what is the proportion of data that falls within:
one standard deviation of the mean? _______________
two standard deviations of the mean? _______________
three standard deviations of the mean? _______________
What is this pattern called? ________________________________________
Using the histograms provided, choose the most appropriate graph for each description.
_____ 41. The mean is greater than the median.
_____ 42. The standard deviation is largest.
_____ 43. The graph is skewed left.
_____ 44. The median is greater than the mean.
_____ 45. The graph is a normal distribution.
_____ 46. The graph is skewed right.
[Histograms A, B, C, and D are not reproduced here.]
Match each of the following graphs with the proper description.
_____ 47. Bar Graph
_____ 48. Box Plot
_____ 49. Dot Plot
_____ 50. Histogram
_____ 51. Modified Box Plot
_____ 52. Segmented Bar Graph
_____ 53. Stem Plot
[Graphs A through F and the key are not reproduced here.]
G.
0 | 5 5 6 8
1 | 0 1 3 4 7 9 9 9
2 | 0 0 1 2 3 5
For each of the graphs, state what type of variables are represented by that type of graph and how many variables
can be represented at a time.
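graph                      number of variables    type of variables
54. Bar Graph              __________             __________
55. Box Plot               __________             __________
56. Dot Plot               __________             __________
57. Histogram              __________             __________
58. Modified Box Plot      __________             __________
59. Scatter Plot           __________             __________
60. Segmented Bar Graph    __________             __________
61. Stem Plot              __________             __________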
Match each term with the appropriate letter, formula, or equation.
Please use capital letters.
A. Max – Min
B. z = (x – μ) / σ
C. Q1 – (1.5*IQR)
D. Q3 + (1.5*IQR)
E. Q3 – Q1
_____ 62. The interquartile range.
_____ 63. Test for lower outliers.
_____ 64. The range.
_____ 65. Test for upper outliers.
_____ 66. The z-score.
Part 3: Short Answer / Extended Response
Topic 6: Given the number of times an event occurs out of how many total occurrences for two different groups, you should
be able to create a two-way table, a segmented bar graph and calculate the relative risk.
Toward the end of 2003, there were many warnings that the flu season would be especially severe and many more
people chose to obtain a flu vaccine than in previous years. In January 2004, the Centers for Disease Control and
Prevention magazine published the results of a study that looked at workers at Children’s Hospital in Denver,
Colorado. Of the 1000 people who had chosen to receive the flu vaccine (before November 1, 2003), 149 still
developed flu-like symptoms. Of the 402 people who did not get the vaccine, 68 developed flu-like symptoms.
a. Create a two-way table for the data in the paragraph above.
[Blank two-way table with a TOTAL row and a TOTAL column]
b. Calculate the conditional distributions and write the proportions in the lower right corners of the table.
c. Create a segmented bar graph based on the conditional distributions.
KEY:
d. What is the relative risk of developing flu-like symptoms? Show all work.
e. Are these variables independent? __________
Why or why not?
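The calculations in parts a, b, and d can be sketched in Python as a way to check hand work. This is a minimal sketch, not the required method: the dictionary layout and the names `table`, `conditional_distribution`, and `relative_risk` are illustrative assumptions, and putting the unvaccinated group in the numerator of the relative risk is a choice made here, not something the problem specifies.

```python
# Counts taken from the flu vaccine paragraph above.
vaccinated_total = 1000
vaccinated_sick = 149
unvaccinated_total = 402
unvaccinated_sick = 68

# Two-way table as a nested dict: one entry per vaccine group, one count per outcome.
table = {
    "vaccine": {"symptoms": vaccinated_sick,
                "no symptoms": vaccinated_total - vaccinated_sick},
    "no vaccine": {"symptoms": unvaccinated_sick,
                   "no symptoms": unvaccinated_total - unvaccinated_sick},
}

def conditional_distribution(row):
    """Proportion of each outcome within one vaccine group."""
    total = sum(row.values())
    return {outcome: count / total for outcome, count in row.items()}

conditional = {group: conditional_distribution(row) for group, row in table.items()}

# Relative risk of symptoms: unvaccinated group compared with vaccinated group.
# (Which group goes in the numerator is an assumption of this sketch.)
relative_risk = conditional["no vaccine"]["symptoms"] / conditional["vaccine"]["symptoms"]

for group, dist in conditional.items():
    print(group, {outcome: round(p, 3) for outcome, p in dist.items()})
print("relative risk:", round(relative_risk, 2))
```

If the two conditional distributions were identical, the relative risk would be exactly 1; comparing them is one way to think about part e.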
Topic 7: Given quantitative data or quantitative data that is divided into categories, you should be able to create a histogram,
a stemplot, or a side-by-side stemplot then describe the distribution using SOCS.
Arby’s Sandwiches            fat/oz
Arby’s Melt with Cheddar     3.5 *
Arby Q                       2.8
Bac’n Cheddar Deluxe         4.2 *
Beef ’n Cheddar              4.2 *
Giant Roast Beef             3.5
Junior Roast Beef            3.2
Regular Roast Beef           3.5
Super Roast Beef             3.1
Breaded Chicken Fillet       3.9
Chicken Cordon Bleu          3.9 *
Grilled Chicken BBQ          1.8
Grilled Chicken Deluxe       2.5 *
Roast Chicken Club           3.6 *
Roast Chicken Deluxe         2.9 *
Roast Chicken Santa Fe       3.4
French Dip                   3.2
Hot Ham ’n Swiss             2.5 *
Italian Sub                  3.6 *
Philly Beef ’n Swiss         4.5 *
Roast Beef Sub               3.9
Triple Cheese Melt           5.4 *
Turkey Sub                   2.8
Roast Beef Deluxe            1.6
Roast Chicken Deluxe         0.9
Roast Turkey Deluxe          1.0
Fish Fillet                  3.5
Ham ’n Cheese                2.4 *
Ham ’n Cheese Melt           2.7 *
a. Create a histogram of the Arby’s data.
b. Describe the distribution using SOCS.
c. Create a stemplot for the Arby’s data.
Rough Draft:
Final Copy:
d. Create a side-by-side stemplot for the Arby’s data. (An asterisk indicates a sandwich with cheese; those without an
asterisk do not have cheese.)
e. Describe the cheese distribution and the no cheese distribution using SOCS.
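One way to check a hand-drawn side-by-side stemplot is to build it in Python. The sketch below assumes the ones digit is the stem and the tenths digit is the leaf, and it pairs each fat/oz value with a True/False cheese flag taken from the asterisks in the data above. The helper names `split_leaves` and `side_by_side` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

# fat/oz values from the Arby's data above; True marks a sandwich with cheese (asterisk).
data = [
    (3.5, True), (2.8, False), (4.2, True), (4.2, True), (3.5, False),
    (3.2, False), (3.5, False), (3.1, False), (3.9, False), (3.9, True),
    (1.8, False), (2.5, True), (3.6, True), (2.9, True),
    (3.4, False), (3.2, False), (2.5, True), (3.6, True), (4.5, True),
    (3.9, False), (5.4, True), (2.8, False), (1.6, False), (0.9, False),
    (1.0, False), (3.5, False), (2.4, True), (2.7, True),
]

def split_leaves(values):
    """Map each stem (ones digit) to its sorted leaves (tenths digit)."""
    leaves = defaultdict(list)
    for v in values:
        stem, leaf = divmod(round(v * 10), 10)
        leaves[stem].append(leaf)
    return {stem: sorted(ls) for stem, ls in leaves.items()}

cheese = split_leaves(v for v, has_cheese in data if has_cheese)
no_cheese = split_leaves(v for v, has_cheese in data if not has_cheese)

def side_by_side(left, right):
    """Build the plot lines; left-hand leaves read outward from the stem."""
    stems = range(min(min(left), min(right)), max(max(left), max(right)) + 1)
    lines = []
    for s in stems:
        left_part = " ".join(str(leaf) for leaf in reversed(left.get(s, [])))
        right_part = " ".join(str(leaf) for leaf in right.get(s, []))
        lines.append(f"{left_part:>12} | {s} | {right_part}")
    return lines

print(f"{'cheese':>12} | s | no cheese")
for line in side_by_side(cheese, no_cheese):
    print(line)
print("Key: 3 | 5 means 3.5 fat/oz")
```

Printing the two sides against a shared stem makes it easier to compare shape, center, and spread for part e.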
Topic 8: Given quantitative data in a list or in a table, you should be able to calculate the mean, the median and the mode.
Also, look over the review packet from this topic. The main concepts were mean, median, mode, comparing dotplots,
and using a calculator to generate the 3 measures of center.
a. The table represents the number of friends students reported having in their first block class. Determine
each of the three measures of center from the table. (Round to the nearest tenth if rounding is necessary.)
# friends    1    2    3    4    5    6    7    8
frequency    0    1    2    7    15   18   12   10
mean = __________
median = __________
mode = __________
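The three measures of center for a frequency table can be checked with a short Python sketch. It assumes the table is stored as a dictionary named `freq` (an illustrative choice) and that there is a single mode.

```python
# Frequency table from the problem above: number of friends -> frequency.
freq = {1: 0, 2: 1, 3: 2, 4: 7, 5: 15, 6: 18, 7: 12, 8: 10}

# Mean: weight each value by its frequency, then divide by the number of students.
n = sum(freq.values())
mean = sum(value * count for value, count in freq.items()) / n

# Median: expand the table into a sorted list and read off the middle position.
expanded = [value for value, count in sorted(freq.items()) for _ in range(count)]
middle = n // 2
if n % 2 == 1:
    median = expanded[middle]
else:
    median = (expanded[middle - 1] + expanded[middle]) / 2

# Mode: the value with the highest frequency (assumes a single mode).
mode = max(freq, key=freq.get)

print(n, round(mean, 1), median, mode)
```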
b. Use the Arby’s data from the Topic 7 example to calculate each of the following measurements. Check your
answers by entering the data into your calculator and using 1-Var Stats and a calculator-generated dotplot.
(Round to the nearest tenth.)
mean = __________
median = __________
mode = __________
shape = ____________________
spread = ____________________
Topic 9: Look over your review sheets from this chapter. The main concepts were range, IQR, and standard deviation.
We also calculated the MAD and standard deviation by hand, looked at the Empirical Rule and z-scores so we could
compare distributions that were measured on different scales.
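The by-hand steps for the MAD and the standard deviation described in Part 1, along with the z-score formula z = (x – μ) / σ, can be sketched in Python. The data list below is hypothetical, made up only for this illustration, and the sample standard deviation (dividing by n – 1) stands in for σ.

```python
import math

# Hypothetical data, made up for this sketch (not from the study guide).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

# Mean absolute deviation: average of the absolute differences from the mean.
mad = sum(abs(x - mean) for x in data) / n

# Standard deviation: sum the squared differences, divide by n - 1, take the square root.
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

def z_score(x, center, spread):
    """Number of standard deviations that x lies above (+) or below (-) the center."""
    return (x - center) / spread

print(round(mean, 1), round(mad, 1), round(sd, 1), round(z_score(9, mean, sd), 2))
```

Because a z-score is measured in standard deviations, values from distributions on different scales can be compared once each is converted with its own mean and standard deviation.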
Topic 10: Look over your review sheets from this chapter. The main concepts were boxplots, modified boxplots, the
5-number summary and calculating outliers. We also used our calculator to send groups, ungroup them, modify lists,
sort lists (with an ID list), and create graphs.
Use the five-number summary to answer the questions below.
Minimum    Q1    Median    Q3    Maximum
7          15    18        33    75
a. What is the IQR? _______________
b. What is the range? _______________
c. An upper outlier would be any number that falls above what value?
d. A lower outlier would be any number that falls below what value?
e. Construct a regular boxplot for the data above.
f. Would it be possible to make a modified boxplot? __________ Why or why not?
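The outlier test from Part 1, Q3 + (IQR * 1.5) and Q1 – (IQR * 1.5), can be applied to this five-number summary with a few lines of Python. This is a sketch for checking hand calculations; the variable names are illustrative.

```python
# Five-number summary from the table above.
minimum, q1, median, q3, maximum = 7, 15, 18, 33, 75

iqr = q3 - q1                   # interquartile range
data_range = maximum - minimum  # range

# Outlier test: values beyond these cutoffs count as outliers.
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr

# With only the summary, we can still tell whether the extremes are outliers.
has_upper_outlier = maximum > upper_fence
has_lower_outlier = minimum < lower_fence

print(iqr, data_range, upper_fence, lower_fence)
print(has_upper_outlier, has_lower_outlier)
```

The summary alone shows whether the minimum or maximum is an outlier. Drawing a modified boxplot would still need the original data values, because each whisker ends at the most extreme value that is not an outlier.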