Download Quant Cont.

Histograms
 Used with numerical data
 Bars touch on histograms
 For comparative histograms – use two separate
graphs with the same scale on the horizontal
axis
 A histogram is used when there are too many values of a
quantitative variable for a stemplot or dotplot.
 Divide the range of the data into groups of equal width
 Count the number of individuals in each group
 Draw the histogram; title it and label the axes
 There is no horizontal space between bars unless a
group is empty
 Calculator
 STAT, choose 1: Edit – type values into L1
 Set up histogram – 2nd Y= (Stat Plot)
   Enter 1 (Plot 1)
   ON
   Type: histogram
   Xlist: L1
   Freq: 1
 Quick graph
   ZOOM, choose 9
   Trace to look at class intervals
 Set window to match intervals
 Graph – Trace
Ages (months) of 15 students:
195 204 204 192 192 193 209 194 199 204 204 192 214 222 209
[Figure: histogram of age (months) for 15 students; horizontal axis from 192 to 222 in steps of 6, vertical axis frequency from 0 to 5]
 Center
 Shape
 Spread
 Outliers
 Cautions: Pancake and skyscraper effect
States differ widely with respect to the percentage of college students who are enrolled in public
institutions. The U.S. Department of Education provided the accompanying data on this percentage for
the 50 U.S. states for fall 1999. Create a histogram to display this data and then give a brief description of
the distribution. (Use a minimum of 40 and a maximum of 100, with class widths of 10.)
Percentage of College Students Enrolled in Public Institutions
95 73 63 92 96 75 77 87 76 52
81 74 91 90 65 69 75 88 80 62
85 95 86 93 85 73 70 82 56 80
80 91 89 84 76 82 55 81 60 82
72 89 79 89 92 81 56 84 43
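The grouping and counting steps for this exercise can be sketched in Python. This is only an illustration, not part of the original slides: the function name `frequency_table` is hypothetical, and the list holds the 49 values that appear in this transcript (the exercise mentions 50 states, so one value seems to be missing from the extracted text). The classes follow the exercise: minimum 40, maximum 100, width 10.

```python
# Percentages as they appear in the transcript (49 values; one of the
# 50 states appears to be missing from the extracted text).
percentages = [
    95, 73, 63, 92, 96, 75, 77, 87, 76, 52,
    81, 74, 91, 90, 65, 69, 75, 88, 80, 62,
    85, 95, 86, 93, 85, 73, 70, 82, 56, 80,
    80, 91, 89, 84, 76, 82, 55, 81, 60, 82,
    72, 89, 79, 89, 92, 81, 56, 84, 43,
]


def frequency_table(data, minimum, maximum, width):
    """Count values in equal-width classes [low, low + width).

    Hypothetical helper: assumes every value satisfies
    minimum <= value < maximum.
    """
    counts = {}
    low = minimum
    while low < maximum:
        counts[(low, low + width)] = 0
        low += width
    for value in data:
        # Lower limit of the class this value falls into.
        class_low = minimum + ((value - minimum) // width) * width
        counts[(class_low, class_low + width)] += 1
    return counts


table = frequency_table(percentages, 40, 100, 10)

# Print a text histogram: one star per state in each class.
for (low, high), count in table.items():
    print(f"{low} to < {high}: {count:2d} {'*' * count}")
```

With these values the tallest class is 80 to < 90 and the counts trail off toward the lower percentages, which is the pattern to look for when describing the shape.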
Complete the frequency table below and construct the corresponding histogram.
Class          Count
25 to < 34     •
34 to < 43     •
43 to < 52     •
52 to < 61     •
61 to < 70     •
70 to < 79     •
Describe the shape: roughly symmetric, roughly skewed left, roughly skewed right, or no
discernible shape.
Describe the spread of the distribution. …………………………
What is the center of the distribution? (Hint: look at the original data set)
……………
Do there appear to be any obvious outliers? If so, name them.
…………………………………
What is the width of each class in the histogram? …………
Could this data set be represented by a pie graph? Why or why not?
Boxplots
 Advantages
   ease of construction
   convenient handling of outliers
   construction is not subjective (as it is for histograms)
 Used with medium or large size data sets (n > 10)
 Useful for comparative displays
 Boxplot
   find the five-number summary: Min, Q1, Med, Q3, Max
   draw box from Q1 to Q3
   draw median as center line in the box
   extend whiskers to min & max
 Modified boxplot
   display outliers
   fences mark off mild & extreme outliers
   whiskers extend to largest (smallest) data value inside the fence
 ALWAYS use modified boxplots in this class!!!

A report from the U.S. Department of Justice gave the
following percent increase in federal prison populations in
20 northeastern & mid-western states in 1999.
5.9 4.5 1.3 6.9 3.5 5.0 5.9 4.5 5.6 4.1
6.3 4.8 7.2 6.4 5.5 5.3 8.0 4.4 7.2 3.2
Create a modified boxplot. Describe the distribution.
Use the calculator to create a modified boxplot.
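As a check on the calculator, the numbers behind a modified boxplot can be sketched in Python. Several things here are assumptions rather than statements from the slides: the function name `modified_boxplot_summary` is hypothetical, the quartiles are taken as the medians of the lower and upper halves of the sorted data (a common textbook convention), the fences sit 1.5 × IQR beyond the quartiles, and the stray value 3.2 is treated as the 20th observation.

```python
from statistics import median

# Percent increases as listed in the transcript; 3.2 is assumed to be
# the 20th value, displaced during text extraction.
increases = [
    5.9, 4.5, 1.3, 6.9, 3.5, 5.0, 5.9, 4.5, 5.6, 4.1,
    6.3, 4.8, 7.2, 6.4, 5.5, 5.3, 8.0, 4.4, 7.2, 3.2,
]


def modified_boxplot_summary(data):
    """Five-number summary, fences, whisker ends, and outliers.

    Hypothetical helper; assumes at least two data values.
    """
    values = sorted(data)
    half = len(values) // 2
    # Quartiles as medians of the lower and upper halves
    # (the middle value is left out when n is odd).
    q1 = median(values[:half])
    q3 = median(values[-half:])
    iqr = q3 - q1
    # Assumed fence rule: 1.5 * IQR beyond each quartile.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "min": values[0],
        "q1": q1,
        "median": median(values),
        "q3": q3,
        "max": values[-1],
        "fences": (low_fence, high_fence),
        # Whiskers stop at the most extreme values inside the fences.
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


summary = modified_boxplot_summary(increases)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumptions the only value outside the fences is 1.3, so the lower whisker stops at 3.2 and the upper whisker reaches the maximum, 8.0.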
[Figure: example boxplots showing a symmetrical shape, an approximately symmetrical shape, and a skewed shape]
Evidence suggests that a high indoor radon
concentration might be linked to the development of
childhood cancers. The data that follow are the
radon concentrations in two different samples of
houses. The first sample consisted of houses in
which a child was diagnosed with cancer. Houses in
the second sample had no recorded cases of
childhood cancer.
Cancer
10 21 5 23 15 11 9 13 27 13 39 22 7
20 45 12 15 3 8 11 18 16 23 16 9 57
16 21 18 38 37 10 15 11 18 210 22 11 16
17 33 10
No Cancer
9 38 11 12 29 5 7 6 8 29 24 12 17
11 11 3 9 33 17 55 11 29 13 24 7 11
21 6 39 29 7 8 55 9 21 9 3 85 11 14
Create parallel boxplots. Compare the distributions.
[Figure: parallel boxplots of radon concentration for the Cancer and No Cancer groups; horizontal axis Radon, 0 to 200]
The median radon concentration for the no cancer
group is lower than the median for the cancer
group. The range of the cancer group is larger than
the range for the no cancer group. Both
distributions are skewed right. The cancer group
has outliers at 39, 45, 57, and 210. The no cancer
group has outliers at 55 and 85.
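The comparison above can be reproduced with a short Python sketch that also separates mild from extreme outliers. The helper names `quartiles` and `classify_outliers` are hypothetical, and two conventions are assumed rather than taken from the slides: quartiles are the medians of the lower and upper halves of the sorted data, and the inner and outer fences sit 1.5 × IQR and 3 × IQR beyond the quartiles.

```python
from statistics import median

# Radon concentrations as listed in the transcript.
cancer = [
    10, 21, 5, 23, 15, 11, 9, 13, 27, 13, 39, 22, 7,
    20, 45, 12, 15, 3, 8, 11, 18, 16, 23, 16, 9, 57,
    16, 21, 18, 38, 37, 10, 15, 11, 18, 210, 22, 11, 16,
    17, 33, 10,
]
no_cancer = [
    9, 38, 11, 12, 29, 5, 7, 6, 8, 29, 24, 12, 17,
    11, 11, 3, 9, 33, 17, 55, 11, 29, 13, 24, 7, 11,
    21, 6, 39, 29, 7, 8, 55, 9, 21, 9, 3, 85, 11, 14,
]


def quartiles(data):
    """Return (Q1, median, Q3) using medians of the two halves."""
    values = sorted(data)
    half = len(values) // 2
    return median(values[:half]), median(values), median(values[-half:])


def classify_outliers(data):
    """Return (median, mild outliers, extreme outliers).

    Assumed rule: mild outliers lie beyond 1.5 * IQR from the
    quartiles, extreme outliers beyond 3 * IQR.
    """
    q1, med, q3 = quartiles(data)
    iqr = q3 - q1
    mild, extreme = [], []
    for value in sorted(data):
        if value < q1 - 3 * iqr or value > q3 + 3 * iqr:
            extreme.append(value)
        elif value < q1 - 1.5 * iqr or value > q3 + 1.5 * iqr:
            mild.append(value)
    return med, mild, extreme


for label, group in (("Cancer", cancer), ("No Cancer", no_cancer)):
    med, mild, extreme = classify_outliers(group)
    print(f"{label}: median={med}, range={max(group) - min(group)}, "
          f"mild={mild}, extreme={extreme}")
```

Under these assumptions the flagged values match the ones named above: 39, 45, 57, and 210 for the cancer group, and 55 (which occurs twice) and 85 for the no cancer group.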
 Which terms best represent the data?
 The median best describes skewed data, because the mean is pulled toward the tail
 While the mean, variance and standard deviation describe
symmetrical data
 Spread – how far from the mean the data stretch
 To calculate the variance – square the differences
between the mean and each data value.
 Variance (s²) - a measure of how far a set of numbers is
spread out.
   A variance of zero indicates that all the values are identical
   A small variance indicates a small spread, while a large variance
    means the numbers are spread out
 Standard Deviation (s) - the square root of the variance; shows how much
variation or dispersion from the average exists
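The variance calculation can be sketched in Python using the ages of the 15 students from the histogram example. The function name `sample_variance` is hypothetical, and dividing by n − 1 is an assumption: the slides do not state the divisor, but that is the usual definition when the variance is written as s².

```python
import math
import statistics

# Ages (months) of the 15 students from the histogram example.
ages = [195, 204, 204, 192, 192, 193, 209, 194, 199, 204,
        204, 192, 214, 222, 209]


def sample_variance(data):
    """Sum of squared differences from the mean, divided by n - 1.

    Hypothetical helper; the n - 1 divisor is an assumption.
    """
    mean = sum(data) / len(data)
    squared_diffs = [(x - mean) ** 2 for x in data]
    return sum(squared_diffs) / (len(data) - 1)


s2 = sample_variance(ages)   # variance
s = math.sqrt(s2)            # standard deviation

print(f"mean = {sum(ages) / len(ages):.1f}")
print(f"variance = {s2:.2f}")
print(f"standard deviation = {s:.2f}")

# The standard library gives the same results.
print(math.isclose(s2, statistics.variance(ages)))
print(math.isclose(s, statistics.stdev(ages)))

# Identical values have zero variance.
print(sample_variance([7, 7, 7]))
```

Taking the square root returns the spread to the original units (months here), which is why the standard deviation is usually the value reported.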