251solnG1 1/31/08 (Open this document in 'Page Layout' view!)
G. Measures of Dispersion and Asymmetry.
1. Range
Downing & Clark, problem 7 above (Use data to find IQR). Review solutions and terms on page 41 (36 in 3rd ed.) of Downing & Clark.
2. The Variance and Standard Deviation of Ungrouped Data.
Text exercises 3.1b, 3.2b, 3.6, 3.37, 3.24 [3.1b, 3.2b, 3.7, 3.37, 3.23] (3.1b, 3.2b, 3.7, 3.23, 3.33)
3. The Variance and Standard Deviation of Grouped Data.
Text exercises 3.28, 3.30 (3.68, 3.70) (work 3.30 in thousands), Downing & Clark pg 42 or 37, problems 6, 7 (Find sample standard
deviation; hint: run problem 6 in hundreds) (Note that you can use the Excel or Minitab techniques in the graded assignment to
compute and sum the $fx$ and $fx^2$ columns in problems 6 and 7.), Problems G1, G2. Graded Assignment 1
4. Skewness and Kurtosis.
Find the standard deviation, coefficient of variation and measures of skewness in Text problem 3.1, 3.2. Problems G3A, G4 (See
251wrksht).
5. Review
a. Grouped Data.
b. Ungrouped Data.
Sections 1 and 2 are in this document.
----------------------------------------------------------------------------------------------------------------------------.
Downing and Clark, pg. 37, Application 7 (Use data to find IQR or midrange):
Solution: From section F we know
$Q_1 = x_{.75} = 30 + \left[\frac{.25(2953) - 558}{350}\right]10 = 35.15$.
$Q_3 = x_{.25} = 90 + \left[\frac{.75(2953) - 2138}{175}\right]10 = 94.386$.
So $IQR = Q_3 - Q_1 = 94.386 - 35.15 = 59.236$.
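The two quartile calculations follow one interpolation pattern, which can be checked with a short Python sketch. The function name and parameter names are hypothetical, and reading the numbers as lower class boundary, total count, cumulative frequency below the class, class frequency and class width is an assumption inferred from the arithmetic, since section F is not in this document.

```python
def grouped_quantile(p, n, lower, cum_below, freq, width):
    """Interpolate the p-th quantile inside the class assumed to contain it.

    lower     -- lower boundary of that class (assumed meaning)
    cum_below -- cumulative frequency below that class (assumed meaning)
    freq      -- frequency of that class (assumed meaning)
    width     -- class width (assumed meaning)
    """
    return lower + (p * n - cum_below) / freq * width


q1 = grouped_quantile(0.25, 2953, 30, 558, 350, 10)
q3 = grouped_quantile(0.75, 2953, 90, 2138, 175, 10)
iqr = q3 - q1
print(round(q1, 3), round(q3, 3), round(iqr, 3))
```

Rounded to three places, the values agree with the figures 35.15, 94.386 and 59.236 in the solution.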
Exercise 3.1b: The numbers are 7, 4, 9, 8, 2. Find the range, IQR, variance, standard deviation and
coefficient of variation.
Solution: In order these numbers are 2, 4, 7, 8, 9. $n = 5$.
Index   $x$   $x^2$   $x - \bar{x}$    $(x - \bar{x})^2$
1        2      4     2 - 6 = -4           16
2        4     16     4 - 6 = -2            4
3        7     49     7 - 6 =  1            1
4        8     64     8 - 6 =  2            4
5        9     81     9 - 6 =  3            9
Sum     30    214              0           34
i) Range: $range = 9 - 2 = 7$.
ii) Variance: First, find the sample mean. $\bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6$. The preferred method for finding the variance is the computational formula, which has the advantage that the $x - \bar{x}$ and $(x - \bar{x})^2$ columns are not needed. $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = \frac{214 - 5(6)^2}{5 - 1} = \frac{214 - 180}{4} = \frac{34}{4} = 8.5$. If we use the definitional formula instead, $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{34}{5 - 1} = \frac{34}{4} = 8.5$.
iii) IQR: To find the first quartile, $p = .25$, so $position = p(n + 1) = .25(6) = 1.5 = a.b$. So $a = 1$ and $.b = .5$. Thus $Q_1 = x_{.75} = x_a + .b(x_{a+1} - x_a) = x_1 + .5(x_2 - x_1) = 2 + .5(4 - 2) = 2 + 1 = 3$. To find the third quartile, $p = .75$, so $position = p(n + 1) = .75(6) = 4.5 = a.b$. So $a = 4$ and $.b = .5$. Thus $Q_3 = x_{.25} = x_a + .b(x_{a+1} - x_a) = x_4 + .5(x_5 - x_4) = 8 + .5(9 - 8) = 8 + 0.5 = 8.5$. Finally, $IQR = Q_3 - Q_1 = x_{.25} - x_{.75} = 8.5 - 3 = 5.5$.
iv) Standard deviation: $s = \sqrt{s^2} = \sqrt{8.5} = 2.915$.
v) Coefficient of variation: $C = \frac{\text{std. deviation}}{\text{mean}} = \frac{s}{\bar{x}} = \frac{2.915}{6} = 0.4859$ or 48.59%.
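As a check on Exercise 3.1b, here is a minimal Python sketch that computes the sample variance by both the computational and the definitional formula, then the standard deviation and coefficient of variation. The variable names are illustrative only.

```python
import math

x = [7, 4, 9, 8, 2]
n = len(x)
mean = sum(x) / n

# Computational formula: (sum of x^2 - n * mean^2) / (n - 1)
var_comp = (sum(v * v for v in x) - n * mean ** 2) / (n - 1)

# Definitional formula: sum of squared deviations from the mean / (n - 1)
var_def = sum((v - mean) ** 2 for v in x) / (n - 1)

s = math.sqrt(var_comp)   # sample standard deviation
cv = s / mean             # coefficient of variation

print(mean, var_comp, var_def, round(s, 3), round(cv, 4))
```

Both formulas give 8.5 here; the computational form needs only the running sums of $x$ and $x^2$.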
Exercise 3.2b: The numbers are 7, 4, 9, 7, 3, 12. Find the range, IQR, variance, standard deviation and
coefficient of variation.
Solution: In order these are 3, 4, 7, 7, 9, 12. $n = 6$.
Index   $x$   $x^2$   $x - \bar{x}$    $(x - \bar{x})^2$
1        3      9     3 - 7 = -4           16
2        4     16     4 - 7 = -3            9
3        7     49     7 - 7 =  0            0
4        7     49     7 - 7 =  0            0
5        9     81     9 - 7 =  2            4
6       12    144    12 - 7 =  5           25
Sum     42    348               0           54
i) Range: $range = 12 - 3 = 9$.
ii) Variance: First, find the sample mean. $\bar{x} = \frac{\sum x}{n} = \frac{42}{6} = 7$. By the computational method $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = \frac{348 - 6(7)^2}{6 - 1} = \frac{348 - 294}{5} = \frac{54}{5} = 10.8$. If we use the definitional formula instead, $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{54}{6 - 1} = \frac{54}{5} = 10.8$.
iii) IQR: To find the first quartile, $p = .25$, so $position = p(n + 1) = .25(7) = 1.75 = a.b$. So $a = 1$ and $.b = .75$. Thus $Q_1 = x_{.75} = x_a + .b(x_{a+1} - x_a) = x_1 + .75(x_2 - x_1) = 3 + .75(4 - 3) = 3.75$. To find the third quartile, $p = .75$, so $position = p(n + 1) = .75(7) = 5.25 = a.b$. So $a = 5$ and $.b = .25$. Thus $Q_3 = x_{.25} = x_a + .b(x_{a+1} - x_a) = x_5 + .25(x_6 - x_5) = 9 + .25(12 - 9) = 9 + 0.75 = 9.75$. Finally, $IQR = Q_3 - Q_1 = x_{.25} - x_{.75} = 9.75 - 3.75 = 6$. (A more accurate answer than the text answer)
iv) Standard deviation: $s = \sqrt{s^2} = \sqrt{10.8} = 3.286$.
v) Coefficient of variation: $C = \frac{\text{std. deviation}}{\text{mean}} = \frac{s}{\bar{x}} = \frac{3.286}{7} = 0.4695$ or 46.95%.
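The position rule used for the quartiles in part iii), $position = p(n + 1) = a.b$ followed by $x_a + .b(x_{a+1} - x_a)$, can be sketched in Python as below. The function name is hypothetical, and clamping to the smallest or largest value when the position falls outside 1 to $n$ is an added assumption that the solutions here never need.

```python
def quantile_by_position(data, p):
    """Value at position p(n + 1) in the sorted data, interpolating linearly."""
    xs = sorted(data)
    n = len(xs)
    position = p * (n + 1)
    a = int(position)      # whole part a
    b = position - a       # fractional part .b
    if a < 1:              # assumption: clamp below the first value
        return xs[0]
    if a >= n:             # assumption: clamp above the last value
        return xs[-1]
    # x_a + .b (x_{a+1} - x_a); the 1-based x_a is stored at xs[a - 1]
    return xs[a - 1] + b * (xs[a] - xs[a - 1])


data = [7, 4, 9, 7, 3, 12]   # Exercise 3.2b
q1 = quantile_by_position(data, 0.25)
q3 = quantile_by_position(data, 0.75)
print(q1, q3, q3 - q1)
```

Applied to the Exercise 3.1b numbers 7, 4, 9, 8, 2, the same function returns 3 and 8.5 for the two quartiles.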
Exercise 3.6 [3.7 in 9th]: The x numbers are 568, 570, 575, 578, 584 and the y numbers are 573, 574,
575, 577, 578. These are the inner diameters of two grades of tires. a) Find the mean, median and standard
deviation for each of the two grades. b) Which grade is providing better quality? c) How would your answer
change if we change the 578 for grade y to 588?
These are, in order, as below. $n = 5$.
Index    $x$      $x^2$      $y$      $y^2$
1        568     322624      573     328329
2        570     324900      574     329476
3        575     330625      575     330625
4        578     334084      577     332929
5        584     341056      578     334084
Sum     2875    1653289     2877    1655443
a) Grade x: First, find the sample mean. $\bar{x} = \frac{\sum x}{n} = \frac{2875}{5} = 575.0$. The median is the middle number, 575, and the variance is $s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = \frac{1653289 - 5(575.0)^2}{5 - 1} = \frac{164.0}{4} = 41.0$. $s_x = \sqrt{s_x^2} = \sqrt{41.0} = 6.40$.
Grade y: First, find the sample mean. $\bar{y} = \frac{\sum y}{n} = \frac{2877}{5} = 575.4$. The median is the middle number, 575, and the variance is $s_y^2 = \frac{\sum y^2 - n\bar{y}^2}{n - 1} = \frac{1655443 - 5(575.4)^2}{5 - 1} = \frac{17.2}{4} = 4.3$. $s_y = \sqrt{s_y^2} = \sqrt{4.3} = 2.07$.
According to the Instructor’s Solutions Manual
                      Grade X    Grade Y
Mean                  575        575.4
Median                575        575
Standard deviation    6.40       2.07
b) According to the Instructor’s Solutions Manual:
If quality is measured by the average inner diameter, Grade X tires provide slightly better quality because X’s
mean and median are both equal to the expected value, 575 mm. If, however, quality is measured by
consistency, Grade Y provides better quality because, even though Y’s mean is only slightly larger than the
mean for Grade X, Y’s standard deviation is much smaller. The range in values for Grade Y is 5 mm compared
to the range in values for Grade X which is 16 mm.
c) According to the Instructor’s Solutions Manual:
                      Grade X    Grade Y, Altered
Mean                  575        577.4
Median                575        575
Standard deviation    6.40       6.11
In the event the fifth Y tire measures 588 mm rather than 578 mm, Y’s average inner diameter becomes
577.4 mm, which is larger than X’s average inner diameter, and Y’s standard deviation swells from 2.07
mm to 6.11 mm. In this case, X’s tires are providing better quality in terms of the average inner diameter
with only slightly more variation among the tires than Y’s.
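The figures quoted for the two grades, including the altered Grade Y, can be reproduced with Python's standard `statistics` module. This is only a checking sketch; the `summarize` helper and the list names are illustrative and not part of the exercise.

```python
import statistics

grade_x = [568, 570, 575, 578, 584]
grade_y = [573, 574, 575, 577, 578]
grade_y_altered = [573, 574, 575, 577, 588]   # fifth Y tire changed to 588


def summarize(data):
    """Return (mean, median, sample standard deviation with n - 1 divisor)."""
    return statistics.mean(data), statistics.median(data), statistics.stdev(data)


for name, data in [("X", grade_x), ("Y", grade_y), ("Y altered", grade_y_altered)]:
    mean, median, s = summarize(data)
    print(name, mean, median, round(s, 2))
```

Changing one observation moves the Grade Y mean and standard deviation noticeably, while the median stays at 575.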
Exercise 3.37 (3.23 in 8th edition): The data set ‘batteries’ consists of the numbers 342, 426, 317, 545,
264, 451, 1049, 631, 512, 266, 492, 562 and 298. a) Produce a 5-number summary. b) Construct a box-and-whiskers plot and describe the shape.
Solution: The numbers in order, with their positions, are as below. $n = 13$.
Index    1    2    3    4    5    6    7    8    9   10   11   12    13
x      264  266  298  317  342  426  451  492  512  545  562  631  1049
a) The five number summary consists of the lowest value, the first quartile, the median, the third quartile
and the highest value.
(i) The lowest value is 264.
(ii) To find the first quartile, $p = .25$, so $position = p(n + 1) = .25(14) = 3.5 = a.b$. So $a = 3$ and $.b = .5$. Thus $Q_1 = x_{.75} = x_a + .b(x_{a+1} - x_a) = x_3 + .5(x_4 - x_3) = 298 + .5(317 - 298) = 307.5$.
(iii) For the median, $position = p(n + 1) = .5(14) = 7$ and $x_7 = 451$.
(iv) To find the third quartile, $p = .75$, so $position = p(n + 1) = .75(14) = 10.5 = a.b$. So $a = 10$ and $.b = .5$. Thus $Q_3 = x_{.25} = x_a + .b(x_{a+1} - x_a) = x_{10} + .5(x_{11} - x_{10}) = 545 + .5(562 - 545) = 553.5$.
(v) The highest value is 1049.
According to the Instructor’s Solutions Manual
3.23 (a) Five-number summary: 264 307.5 451 553.5 1,049
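For comparison, the same five-number summary can be obtained from Python's standard library. This is a sketch rather than the method used in the solution: `statistics.quantiles` with `method="exclusive"` (available from Python 3.8) places the quartiles at positions $p(n + 1)$, which matches the position rule used here.

```python
import statistics

batteries = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]

# n=4 asks for the three cut points that split the data into quarters.
q1, median, q3 = statistics.quantiles(batteries, n=4, method="exclusive")

five_number = (min(batteries), q1, median, q3, max(batteries))
print(five_number)
```

The default `method` is already `"exclusive"`; it is spelled out because the `"inclusive"` method uses different positions and would generally give different quartiles.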
b) According to the Instructor’s Solutions Manual:
[Box-and-whisker plot of the battery data; horizontal axis from 0 to 1500, one point marked with an asterisk.]
The distribution is right-skewed.
c) In the 9th edition, you are asked to compare the answer to b) with the results of 3.13(h). According to the
Instructor’s Solutions Manual the answer to 3.13h (3.12h) is
3.13 (a) Mean = 473.46. Median = 451. There is no mode.
The median seems to be a better descriptive measure of the data, since it is closer to the observed values
than is the mean.
(h) The shape of the distribution of the original data is right-skewed, since the mean is larger than the
median.
So, looking at the box-and-whisker plot, we can say
(c) Because the data set is small, one very large value (1,049) skews the distribution to the right.
Exercise 3.24 [3.23 in 9th edition]: You have data on $N = 1024$ mutual funds. For 1-year total returns you
have a population mean of 8.20, a population standard deviation of 2.75, a range from -2.0 to +17.1, a first
quartile of 5.5 and a third quartile of 10.5. a) According to the empirical rule, what percentage of the returns
is within 1 standard deviation of the mean? b) within 2 standard deviations of the mean? c) According to the
Tchebyschev rule what percentage of the returns should be within 1, 2 or 3 standard deviations of the mean?
d) According to the Tchebyschev rule, between what two amounts should there be 93.75% of the returns?
Solution: Note that $\mu = 8.20$ and $\sigma = 2.75$. The Empirical rule states that about 68% will be within 1 standard deviation of the mean and about 95% are within 2 standard deviations of the mean. The Tchebyschev rule states $P(|x - \mu| \ge k\sigma) \le \frac{1}{k^2}$. What this means is that for any value of $k$ above or equal to 1, we can divide a distribution into two types of areas. The center is the area between $\mu - k\sigma$ and $\mu + k\sigma$. The tails are the areas below $\mu - k\sigma$ and above $\mu + k\sigma$. The fraction of the data in the tails cannot exceed $\frac{1}{k^2}$ and the fraction in the center will be at least $1 - \frac{1}{k^2}$. In the manual's part (d), the fraction within $k = 2$ standard deviations of the mean is $1 - \frac{1}{2^2} = \frac{3}{4}$. In the manual's part (f), they ask where at least 93.75% of the data must be. Since 93.75% is 15/16, we can say $1 - \frac{1}{k^2} = \frac{15}{16}$, so $\frac{1}{k^2} = \frac{1}{16}$ and $k = 4$. So according to the Instructor’s Solutions Manual the answer to 3.23 is:
3.23 (a) 68%
(b) 95%
(c) not calculable
(d) 75%
(e) 88.89%
(f) $\mu - 4\sigma$ to $\mu + 4\sigma$, or -2.8 to 19.2
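The Tchebyschev bounds in this exercise are easy to verify numerically. The sketch below uses hypothetical helper names: one gives the lower bound $1 - 1/k^2$ on the fraction in the center, and the other solves that expression for $k$.

```python
import math


def chebyshev_center_fraction(k):
    """Lower bound on the fraction of data within k standard deviations (k >= 1)."""
    return 1 - 1 / k ** 2


def chebyshev_k(fraction):
    """Solve 1 - 1/k^2 = fraction for k (requires 0 <= fraction < 1)."""
    return math.sqrt(1 / (1 - fraction))


mu, sigma = 8.20, 2.75
k = chebyshev_k(0.9375)                      # 93.75% = 15/16
low, high = mu - k * sigma, mu + k * sigma

print(chebyshev_center_fraction(2), round(chebyshev_center_fraction(3), 4))
print(k, round(low, 1), round(high, 1))
```

For $k = 1$ the bound is 0, so the rule says nothing useful about the fraction within one standard deviation.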
Exercise 3.33 (in 8th edition): Note that $\mu = 28.20$ and $\sigma = 6.75$. The Empirical rule states that about 67% will be within 1 standard deviation of the mean and about 95% are within 2 standard deviations of the mean. The Tchebyschev rule states $P(|x - \mu| \ge k\sigma) \le \frac{1}{k^2}$. What this means is that for any value of $k$ above or equal to 1, we can divide a distribution into two types of areas. The center is the area between $\mu - k\sigma$ and $\mu + k\sigma$. The tails are the areas below $\mu - k\sigma$ and above $\mu + k\sigma$. The fraction of the data in the tails cannot exceed $\frac{1}{k^2}$ and the fraction in the center will be at least $1 - \frac{1}{k^2}$. In the manual's part (d), the fraction within $k = 2$ standard deviations of the mean is $1 - \frac{1}{2^2} = \frac{3}{4}$. In the manual's part (f), they ask where at least 93.75% of the data must be. Since 93.75% is 15/16, we can say $1 - \frac{1}{k^2} = \frac{15}{16}$, so $\frac{1}{k^2} = \frac{1}{16}$ and $k = 4$. So according to the Instructor’s Solutions Manual the answer to 3.33 is:
3.33 (a) 67%
(b) 90%-95%
(c) not calculable
(d) 75%
(e) 88.89%
(f) $\mu - 4\sigma$ to $\mu + 4\sigma$, or 1.2 to 55.2
Parts not copied ©2003 Roger Even Bove