251y9852 10/05/98
ECO251 QBA1
FIRST HOUR EXAM
Name __________________
Class MWF TR 10 11 12:30
OCTOBER 8, 1998
Part I.
Explain or Define the Following (13 Points Maximum)
a) Field (1)
See below
b) Cell (1)
A location in a field
c) Stub (1)
See below
d) I have a table describing the entire output of automobiles by a plant. It groups the output
according to number of cylinders (4, 6, or 8) and miles per gallon on highways. If we define the following
categories:
A 'Under 30 miles per gallon'
B '30 or more miles per gallon'
C '4 Cylinders'
D '6 Cylinders'
Are the following classes: (3)
              Mutually Exclusive?   Collectively Exhaustive?
A and B               y                       y
C and D               y                       n
A, B and D            n                       y
e) I have a table of the percentage return on equity of a group of banks for 1994. If the mean return
is (rounded to the nearest per cent) 19% and the median is 24%, is the distribution skewed? To the right or
the left? Where would you expect the mode to be relative to these two numbers? (2)
Skewed to the left, since the mean is below the median. The mode should be above 24%.
f) Ogive (1)
A graph of the cumulative distribution
g) Interval Data (1)
Data like temperatures, where differences between numbers are
meaningful, but where there is no bottom or zero on the scale.
h) Third Quintile (1)
A point with 3/5 or 60% of the data below it, $x_{.40}$.
i) Leptokurtic (1)
Sharp-peaked – coefficient of excess is positive.
j) Explain the difference between a statistic and a parameter (2)
A statistic comes from a sample; a parameter describes a population.
k) I have a table of the pizza delivery times over an evening. If the lowest is 16.5 minutes and the
highest is 47.5 minutes and I want the data presented in seven intervals, what intervals would you use?
Explain using an appropriate formula (3)
Since the highest is 47.5 and the lowest is 16.5, use $\frac{47.5 - 16.5}{7} = 4.43$. I would suggest rounding up to 4.5 or 5. A width of 5 gives intervals like 15-19.99, 20-24.99, 25-29.99, 30-34.99, 35-39.99, 40-44.99, 45-49.99.
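The width calculation can be checked with a short Python sketch. The helper names `class_width` and `make_classes`, and the starting point of 15, are illustrative assumptions and not part of the exam; each class is read as lower limit inclusive, upper limit exclusive.

```python
def class_width(low, high, k):
    """Minimum class width: the range divided by the number of classes."""
    return (high - low) / k

def make_classes(start, width, k):
    """Return k (lower, upper) pairs; upper limits are exclusive."""
    return [(start + i * width, start + (i + 1) * width) for i in range(k)]

raw_width = class_width(16.5, 47.5, 7)  # 31 / 7, about 4.43
width = 5                               # rounded up to a convenient value
classes = make_classes(15, width, 7)    # (15, 20), (20, 25), ..., (45, 50)

print(round(raw_width, 2))
print(classes)
```

Any start at or below 16.5 works as long as the last class still covers 47.5.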
l) Box Plot (2)
See below. Or see text page 85.
m) Inner and Outer Fences (2)
See text page 85.
Part II. Give Formulas for the Following (12 Points maximum):
a) Population Variance (Definitional formula for ungrouped data) (2) $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
b) Harmonic Mean (1) $\frac{1}{x_h} = \frac{1}{n} \sum \frac{1}{x}$
c) Population Mean (Grouped data) (1) $\mu = \frac{\sum fx}{N}$
d) Sample Standard Deviation (Use words!) (1) The square root of the sample variance.
e) Coefficient of Variation (For a sample) (1) $C = \frac{s}{\bar{x}}$
f) Sample Skewness (1) $k_3 = \frac{n}{(n-1)(n-2)} \sum (x - \bar{x})^3 = \frac{n}{(n-1)(n-2)} \left[ \sum x^3 - 3\bar{x} \sum x^2 + 2n\bar{x}^3 \right]$
g) Sample Variance (Grouped Data - Computational Formula) (2) $s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1}$
h) Kurtosis (Population - Ungrouped) (2) $\frac{\sum (x - \mu)^4}{N}$
i) Pearson's Measure of Skewness (2) $SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}}$
j) Compute the Geometric Mean of the numbers 3, 5, 7 and 8 (2)
$x_g = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n} = \left( \prod x \right)^{1/n}$ or $\ln x_g = \frac{1}{n} \sum \ln(x)$
$x_g = (3 \cdot 5 \cdot 7 \cdot 8)^{1/4} = \sqrt[4]{840} = 5.38356$
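As a quick check, both forms of the geometric mean (the n-th root of the product and the exponential of the mean log) can be computed in Python. This is a minimal sketch; the function name `geometric_mean` is an illustrative choice.

```python
import math

def geometric_mean(values):
    """Geometric mean via logs: exp of the average of ln(x)."""
    n = len(values)
    return math.exp(sum(math.log(v) for v in values) / n)

x = [3, 5, 7, 8]
g_product = math.prod(x) ** (1 / len(x))  # fourth root of 840
g_log = geometric_mean(x)                 # same value by the log formula

print(round(g_product, 5), round(g_log, 5))
```

The log form is usually preferred for long lists because the raw product can overflow.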
k) Explain Chebyschef's Inequality (A formula or diagram may be used) (3)
P x    k  
1
k2
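The inequality can be illustrated numerically. The sketch below treats the four numbers from item j (3, 5, 7, 8) as a small population, which is an assumption made only for illustration, and compares the share of points at least k standard deviations from the mean with the bound 1/k².

```python
import math

def chebyshev_bound(k):
    """Upper bound on the proportion at least k sigmas from the mean."""
    return 1 / k**2

def fraction_outside(data, k):
    """Observed proportion of a population with |x - mu| >= k * sigma."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in data) / n)
    return sum(1 for v in data if abs(v - mu) >= k * sigma) / n

data = [3, 5, 7, 8]
for k in (1.5, 2, 3):
    print(k, fraction_outside(data, k), chebyshev_bound(k))
```

The observed proportion never exceeds the bound, whatever the shape of the distribution.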
Part III. Do the Following Problems (25 Points)
To control costs, a company takes a sample of salespersons’ expenditures in entertaining clients.
Results are as follows: 155 130 107 143 123 137
Compute the Following:
a) Mean Expenditures (1)
b) The Median (1)
c) The Standard Deviation (3)
d) The 80th Percentile (2)

   x      x^2        index   x in order
  155    24025         1        107
  130    16900         2        123
  107    11449         3        130
  143    20449         4        137
  123    15129         5        143
  137    18769         6        155
  795   106721

$n = 6$, $\sum x = 795$, $\sum x^2 = 106721$, $\sum (x - \bar{x}) = 0.00$, $\sum (x - \bar{x})^2 = 1383.5$.
$\sum x^2$ is not $\left( \sum x \right)^2 = 795^2$, as some of you seem to have fooled yourselves into believing. Nor is $\sum (x - \bar{x})^2$ equal to $\left( \sum x - \bar{x} \right)^2 = (795 - 132.5)^2$. If you had tried these in any of the homework problems, you would have found that they didn't work.
a) $\bar{x} = \frac{\sum x}{n} = \frac{795}{6} = 132.5$
b) $position = p(n+1) = a.b = .5(7) = 3.5$. $x_{1-p} = x_a + .b(x_{a+1} - x_a)$, so $x_{1-.5} = x_{.5} = x_3 + .5(x_4 - x_3) = 130 + .5(137 - 130) = \frac{130 + 137}{2} = 133.5$
c) $s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{1383.5}{5} = 276.7$ or $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{106721 - 6(132.5)^2}{5} = 276.7$
$s = \sqrt{276.7} = 16.6343$
d) The 80th percentile has 80% below it. $position = p(n+1) = a.b = .8(7) = 5.6$. $x_{1-p} = x_a + .b(x_{a+1} - x_a)$, so $x_{1-.8} = x_{.2} = x_5 + .6(x_6 - x_5) = 143 + .6(155 - 143) = 150.2$
2. An automobile is taken from each of 50 production runs and tested for miles per gallon. A summary of
the results is shown below:
Mileage        f    F    Midpoint x      fx        fx^2          fx^3
29.80-30.39    5    5      30.1        150.5     4530.05     136354.505
30.40-30.99    8   13      30.7        245.6     7539.92     231475.544
31.00-31.59   12   25      31.3        375.6    11756.28     367971.564
31.60-32.19   13   38      31.9        414.7    13228.93     422002.867
32.20-32.79    9   47      32.5        292.5     9506.25     308953.125
32.80-33.39    3   50      33.1         99.3     3286.83     108794.073
Total         50                      1578.2    49848.26    1575551.678

$n = \sum f = 50$, $\sum fx = 1578.2$, $\sum fx^2 = 49848.26$, and $\sum fx^3 = 1575551.678$.
$\sum f(x - \bar{x}) = 0.0000$, $\sum f(x - \bar{x})^2 = 33.9552$, and $\sum f(x - \bar{x})^3 = -2.3245$. As usual half of the people who tried to compute these last three sums got them wrong. See the example in the outline.
a. Calculate the Cumulative Frequency (1) (See above)
b. Calculate the Mean (1) $\bar{x} = \frac{\sum fx}{n} = \frac{1578.2}{50} = 31.564$
c. Calculate the Median (2) $position = p(n+1) = .5(51) = 25.5$. This is above 25 and below 38, so the interval is 31.60 to 32.19. $x_{1-p} = L_p + \left[ \frac{pN - F}{f_p} \right] w$, so $x_{1-.5} = x_{.5} = 31.60 + \left[ \frac{.5(50) - 25}{13} \right] 0.6 = 31.60$
d. Calculate the Mode (1) The mode is the midpoint of the largest group. Since 13 is the largest frequency,
the modal group is 31.60 to 32.19 and the mode is 31.9.
e. Calculate the Variance (3) $s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{49848.26 - 50(31.564)^2}{49} = 0.6930$ or $s^2 = \frac{\sum f(x - \bar{x})^2}{n-1} = \frac{33.9552}{49} = 0.6930$
f. Calculate the Standard Deviation (2) $s = \sqrt{0.693} = 0.8325$
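The grouped mean, variance, and standard deviation can be verified with a few lines of Python. This is a minimal sketch using the class midpoints and frequencies from the table; the variable names are illustrative.

```python
import math

mid = [30.1, 30.7, 31.3, 31.9, 32.5, 33.1]   # class midpoints x
freq = [5, 8, 12, 13, 9, 3]                  # class frequencies f

n = sum(freq)
sum_fx = sum(f * x for f, x in zip(freq, mid))
sum_fx2 = sum(f * x * x for f, x in zip(freq, mid))

mean = sum_fx / n
var_comp = (sum_fx2 - n * mean**2) / (n - 1)   # computational formula
var_def = sum(f * (x - mean) ** 2 for f, x in zip(freq, mid)) / (n - 1)
sd = math.sqrt(var_comp)

print(round(mean, 3), round(var_comp, 4), round(var_def, 4), round(sd, 4))
```

Both variance formulas agree, which is a useful check against the squaring mistakes mentioned earlier.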
g. Calculate the Interquartile Range (3) First Quartile: $position = p(n+1) = .25(51) = 12.75$. This is just below 13, so the group is 30.40 to 30.99. $x_{1-p} = L_p + \left[ \frac{pN - F}{f_p} \right] w$ gives us
$Q_1 = x_{1-.25} = x_{.75} = 30.40 + \left[ \frac{.25(50) - 5}{8} \right] 0.6 = 30.962$
Third Quartile: $position = p(n+1) = .75(51) = 38.25$. This is just above 38, so the group is 32.20 to 32.79. $x_{1-.75} = x_{.25} = 32.20 + \left[ \frac{.75(50) - 38}{9} \right] 0.6 = 32.167$. Since this produces a number below 32.20, the lower limit of that group, I have directed you to use 31.60 to 32.19. This is the only time that I have seen this formula break down.
$Q_3 = x_{1-.75} = x_{.25} = 31.60 + \left[ \frac{.75(50) - 25}{13} \right] 0.6 = 32.177$
$IQR = Q_3 - Q_1 = 32.177 - 30.962 = 1.215$.
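The grouped-data quartile formula can be sketched in Python as follows. Note one deliberate difference, which is an assumption of this sketch and not the rule used in the solution: the class is chosen as the first one whose cumulative frequency reaches pN, not by the position p(n+1). For this table that choice lands on 31.60 to 32.19 for the third quartile directly, so the breakdown described above does not arise. The name `grouped_quantile` is illustrative.

```python
def grouped_quantile(lowers, freq, width, p):
    """x = L + ((pN - F) / f) * w, with the class picked by cumulative f >= pN."""
    n = sum(freq)
    target = p * n
    cum = 0                      # F: cumulative frequency below the class
    for lower, f in zip(lowers, freq):
        if cum + f >= target:
            return lower + (target - cum) / f * width
        cum += f
    raise ValueError("p must be between 0 and 1")

lowers = [29.80, 30.40, 31.00, 31.60, 32.20, 32.80]   # lower class limits
freq = [5, 8, 12, 13, 9, 3]
w = 0.6

q1 = grouped_quantile(lowers, freq, w, 0.25)
med = grouped_quantile(lowers, freq, w, 0.50)
q3 = grouped_quantile(lowers, freq, w, 0.75)
iqr = q3 - q1

print(round(q1, 4), round(med, 2), round(q3, 3), round(iqr, 3))
```

Working with unrounded quartiles gives an interquartile range of about 1.214, slightly below the 1.215 obtained from the rounded values.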
h. Calculate a Statistic showing Skewness and interpret it (3)
$k_3 = \frac{n}{(n-1)(n-2)} \left[ \sum fx^3 - 3\bar{x} \sum fx^2 + 2n\bar{x}^3 \right] = \frac{50}{(49)(48)} \left[ 1575551.678 - 3(31.564)(49848.26) + 2(50)(31.564)^3 \right] = -0.0494$
or $k_3 = \frac{n}{(n-1)(n-2)} \sum f(x - \bar{x})^3 = \frac{50}{(49)(48)} (-2.3245) = -0.0494$
or $g_1 = \frac{k_3}{s^3} = \frac{-0.0494}{0.8325^3} = -0.086$
or Pearson's Measure of Skewness $SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(31.564 - 31.9)}{0.8325} = -1.211$.
Because of the negative sign, all of these imply skewness to the left.
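The three skewness measures can be recomputed from the midpoints and frequencies. This is a minimal Python sketch; the variable names are illustrative, and the mode of 31.9 is taken from part d.

```python
import math

mid = [30.1, 30.7, 31.3, 31.9, 32.5, 33.1]   # class midpoints x
freq = [5, 8, 12, 13, 9, 3]                  # class frequencies f

n = sum(freq)
mean = sum(f * x for f, x in zip(freq, mid)) / n
s = math.sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freq, mid)) / (n - 1))

# Sum of cubed deviations, then the sample third moment k3
m3 = sum(f * (x - mean) ** 3 for f, x in zip(freq, mid))
k3 = n / ((n - 1) * (n - 2)) * m3

g1 = k3 / s**3              # relative skewness
mode = 31.9                 # midpoint of the modal class
sk = 3 * (mean - mode) / s  # Pearson's measure as defined in Part II

print(round(m3, 4), round(k3, 4), round(g1, 3), round(sk, 3))
```

Summing the cubed deviations directly avoids the large cancelling terms in the computational formula, where a small slip in $\sum fx^3$ changes the answer noticeably.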
i. Make a Frequency Polygon of the Data (Neatness Counts!)(2)