251y9912 10/7/99251 (Open this document in 'Page Layout' view!)
ECO251 QBA1
Name __________________
FIRST HOUR EXAM
SECTION MWF TR 11AM 12:30
OCTOBER 7, 1999
Note that in 2000 all exams will be open book. The exam below is a closed book exam. This means that
parts I and II will be changed considerably. Questions like d) and l) in part I and j), k) and l) in part II will
still appear.
$x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w$
Part I. Explain or Define the Following (13 Points Maximum)
a) Field (1) See diagram at right
b) Cell (1) A location in a field
c) Stubhead (1) The top area of a stub –
see diagram.
[Diagram: parts of a statistical table. Labels: Table Number, Title, Headnote, Stub Head, Stub (row labels), Boxhead (Master Caption and Column Labels), Field (cells), Footnotes, Source Note.]
d) I have a table describing the grades received by a class (A, B, C, D, F or NG) and their year (Fr,
So, Jr, Sr). There are no individuals in the class who are not regularly enrolled undergraduates. If
we define the following categories:
C1 'Person received A or B'
C2 'Person received C, D, or F'
C3 'Person is Freshman or Sophomore'
C4 'Person is Junior or Senior'
Are the following classes: (3)
              Mutually Exclusive?    Collectively Exhaustive?
C1 and C2     __y__                  __n__
C3 and C4     __y__                  __y__
C1, C3, C4    __n__                  __y__
C1 and C2 are mutually exclusive because no individual in C1 can also be in C2.
C1, C3 and C4 are collectively exhaustive because everyone must be in at least one of these
classes.
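The two properties in d) can be checked mechanically with sets. The sketch below is only an illustration: the `universe` of (grade, year) pairs is a hypothetical roster with one student for every possible combination, not data from the exam, and the helper names are made up for this example.

```python
from itertools import product

grades = ["A", "B", "C", "D", "F", "NG"]
years = ["Fr", "So", "Jr", "Sr"]
# Hypothetical roster: one student for every (grade, year) combination.
universe = set(product(grades, years))

c1 = {s for s in universe if s[0] in ("A", "B")}
c2 = {s for s in universe if s[0] in ("C", "D", "F")}
c3 = {s for s in universe if s[1] in ("Fr", "So")}
c4 = {s for s in universe if s[1] in ("Jr", "Sr")}


def mutually_exclusive(*classes):
    # No student may belong to more than one of the classes.
    return all(a.isdisjoint(b)
               for i, a in enumerate(classes)
               for b in classes[i + 1:])


def collectively_exhaustive(*classes):
    # Every student must belong to at least one of the classes.
    return set().union(*classes) == universe


for name, group in [("C1 and C2", (c1, c2)),
                    ("C3 and C4", (c3, c4)),
                    ("C1, C3, C4", (c1, c3, c4))]:
    print(name, mutually_exclusive(*group), collectively_exhaustive(*group))
```

C1 and C2 fail to be exhaustive only because of the NG grade, which belongs to neither class.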
e) The mean, median and mode have something in common as to what they measure, which they
do not share with, say, the variance. What is it? Make a sketch showing where they would be
relative to one another in a distribution that is skewed to the left. (4) They are all measures of
central tendency. The sketch should show the mean on the left, the mode on the right and the
median between them.
f) Frequency Polygon (What does it look like, what does it show? A labeled sketch might work
best.) (1) A line graph with x on the x-axis and frequency on the y-axis.
g) Interval Data (1) Quantitative data like temperature where the interval between two values has
a constant meaning and there is no real bottom or top. Contrast it with ordinal or ratio data.
h) Third Quintile (1) A point with 3/5 or 60% of the data below it.
i) Mesokurtic (1) Not flat or sharp peaked, or, better, with a zero coefficient of excess, $\gamma_2$.
j) Box Plot (2) See text page 85. A double firecracker diagram that shows the median and the
quartiles.
k) Is the mean of a population a statistic or a parameter? (1) Numbers that describe a population
are parameters.
l) My firm has a large number of stores throughout the country with annual sales between 2 and
40 million dollars. If this data is to be presented in nine classes, what intervals would you use?
Explain your reasoning using the appropriate formula. (3) Use
$\frac{\text{highest} - \text{lowest}}{\text{no. of classes}} = \frac{40 - 2}{9} = 4.2222$. So the interval should be above 4.2222, maybe 4.5. You might try 0-4.49, 4.5-8.99,
9-13.49, 13.5-17.99, 18-22.49, 22.5-26.99, 27-31.49, 31.5-35.99, and 36-40.49.
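The class-width reasoning in l) can be sketched in a few lines. The function name `class_width` and the choice to round up to the next multiple of 0.5 are assumptions made for this illustration; any convenient width above the minimum would do.

```python
import math


def class_width(lowest, highest, n_classes, round_to=0.5):
    """Return the minimum class width and a convenient rounded-up width."""
    minimum = (highest - lowest) / n_classes
    # Round up to the next multiple of round_to (an assumed convenience step).
    convenient = math.ceil(minimum / round_to) * round_to
    return minimum, convenient


minimum, width = class_width(2, 40, 9)
# Lower limits of the nine classes, starting at 0 as in the answer above.
lower_limits = [i * width for i in range(9)]
print(round(minimum, 4), width, lower_limits)
```

The last class starts at 36 and is 4.5 wide, so it still covers the largest store at 40.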
m) If I pick an item at random from a very large population with a median of 531, what is the
approximate probability that the item I pick will be above 531? (2) Since half the numbers are
above the median, the probability is 50%.
n) If you answered question m) and the distribution in m) is skewed to the right and I pick another
item from the population, is the chance that it is above the mean greater or less than the
probability you gave in m), or is it unchanged? (A picture may help you decide) (1)
If the distribution is skewed to the right, the mean is above the median. If half the numbers are
above the median, less than half must be above the mean, so the chance is less than 50%.
Part II. Give formulas for the following or compute an appropriate answer, showing your work (12 Points
maximum):
a) Population Variance (Definitional formula for ungrouped data) (2)
b) Root-mean-square (2)
c) Sample Mean (Grouped data) (1)
d) Sample Standard Deviation (Use words!) (1)
e) Coefficient of Variation (For a population) (1)
f) Population Skewness (1)
g) Sample Variance (Grouped Data - Computational Formula) (2)
h) Kurtosis (Population - Ungrouped) (2)
i) Pearson's Measure of Skewness (2)
j) Compute the Harmonic Mean of the numbers 3, 30 and 300 (2)
k) If a population of 1500 items has a mean of 13555 and a standard deviation of 100, according
to Chebyshef's Inequality, what is the minimum number of items that could be between 13255 and
13855? (3)
l) If you answered k) and you now learn that the population is unimodal and symmetrical, would
you raise or lower your estimate in k), or would you leave it unchanged? (1)
Solution: a) Population Variance (Definitional formula for ungrouped data) (2) $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
b) Root-mean-square (2) $x_{rms}^2 = \frac{1}{n}\sum x^2$ or $x_{rms} = \sqrt{\frac{1}{n}\sum x^2}$
c) Sample Mean (Grouped data) (1) $\bar{x} = \frac{\sum fx}{n}$
d) Sample Standard Deviation (Use words!) (1) The square root of the sample variance.
e) Coefficient of Variation (For a population) (1) $C = \frac{\sigma}{\mu}$
f) Population Skewness (1) $\mu_3 = \frac{1}{N}\sum (x - \mu)^3 = \frac{1}{N}\left[\sum x^3 - 3\mu\sum x^2 + 2N\mu^3\right]$
g) Sample Variance (Grouped Data - Computational Formula) (2) $s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n - 1}$
h) Kurtosis (Population - Ungrouped) (2) $\mu_4 = \frac{\sum (x - \mu)^4}{N}$
i) Pearson's Measure of Skewness (2) $SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}}$
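j) Compute the Harmonic Mean of the numbers 3, 30 and 300 (2)
$\frac{1}{x_h} = \frac{1}{n}\sum \frac{1}{x}$ so
$\frac{1}{x_h} = \frac{1}{3}\left(\frac{1}{3} + \frac{1}{30} + \frac{1}{300}\right) = \frac{1}{3}\left(\frac{100}{300} + \frac{10}{300} + \frac{1}{300}\right) = \frac{1}{3}\left(\frac{111}{300}\right) = \frac{111}{900}$
and $x_h = \frac{900}{111} = 8.108$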
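The harmonic mean in j) can be reproduced with exact fractions, which makes the intermediate value 111/900 visible. This is a small sketch; the variable names are chosen for illustration, and `statistics.harmonic_mean` from the standard library is used only as a cross-check.

```python
from fractions import Fraction
from statistics import harmonic_mean

values = [3, 30, 300]

# Mean of the reciprocals: (1/3)(1/3 + 1/30 + 1/300) = 111/900.
reciprocal_mean = sum(Fraction(1, v) for v in values) / len(values)

# The harmonic mean is the reciprocal of that mean: 900/111.
x_h = 1 / reciprocal_mean

print(reciprocal_mean, x_h, round(float(x_h), 3))
print(harmonic_mean(values))  # library cross-check
```

`Fraction` reduces 111/900 and 900/111 to lowest terms when printing, so they appear as 37/300 and 300/37.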
k) If a population of 1500 items has a mean of 13555 and a standard deviation of 100, according
to Chebyshef's Inequality, what is the minimum number of items that could be between 13255
and 13855? (3) Chebyshef's Inequality says $P(|x - \mu| \ge k\sigma) \le \frac{1}{k^2}$, which means that the
proportion in the tails outside the interval $\mu - k\sigma$ to $\mu + k\sigma$ is at most $\frac{1}{k^2}$, so that the
proportion within the same interval is at least $1 - \frac{1}{k^2}$. In this case $k = \frac{13855 - 13555}{100} = 3$, so the
proportion is at least $1 - \frac{1}{k^2} = 1 - \frac{1}{9} = \frac{8}{9}$, and $1500\left(\frac{8}{9}\right) = 1333.33$, so the number is at least 1334.
l) If you answered k) and you now learn that the population is unimodal and symmetrical, would
you raise or lower your estimate in k), or would you leave it unchanged? (1) The empirical rule
says that almost all (99.7%) is within 3 standard deviations, so raise the estimate.
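The Chebyshef bound in k) can be computed directly. In this sketch the function name `chebyshev_min_count` is hypothetical, and it assumes the interval is symmetric about the mean, so only the upper limit is needed to find k.

```python
import math
from fractions import Fraction


def chebyshev_min_count(n_items, mean, std_dev, upper):
    """Minimum number of items within k standard deviations of the mean."""
    k = Fraction(upper - mean, std_dev)   # half-width in standard deviations
    proportion = 1 - 1 / k**2             # at least this share lies inside
    # Round up, since a count of items must be a whole number.
    return k, proportion, math.ceil(n_items * proportion)


k, proportion, count = chebyshev_min_count(1500, 13555, 100, 13855)
print(k, proportion, count)
```

For comparison, the empirical rule quoted in l) puts 99.7% of 1500 items, roughly 1495, inside the same interval, which is why the estimate rises.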
Part III. Do the Following Problems (25 Points)
In a period of 8 days you make the following numbers of sales (in millions):
Day:     1     2     3     4     5     6     7     8
Sales:   9.2   10.2  9.2   11.2  18.2  12.2  13.2  14.2
Compute the Following (assuming that the numbers are a sample):
a) Mean Sales (1)
b) The Median (1)
c) The Standard Deviation (3)
d) The 7th Decile (2)
Solution: Compute the Following:
Note that x is in order.
Index   $x$     $x^2$      $x - \bar{x}$   $(x - \bar{x})^2$
1       9.2     84.64      -3.0            9.00
2       9.2     84.64      -3.0            9.00
3       10.2    104.04     -2.0            4.00
4       11.2    125.44     -1.0            1.00
5       12.2    148.84     0.0             0.00
6       13.2    174.24     1.0             1.00
7       14.2    201.64     2.0             4.00
8       18.2    331.24     6.0             36.00
Total   97.6    1254.72    0.0             64.00
$n = 8$, $\sum x = 97.6$, $\sum x^2 = 1254.72$, $\sum (x - \bar{x}) = 0.00$, $\sum (x - \bar{x})^2 = 64.00$.
$\sum x^2$ is not $\left(\sum x\right)^2 = (97.6)^2$, as some of you seem to have fooled yourselves into believing. Nor is
$\sum (x - \bar{x})^2$ equal to $\left(\sum x - \bar{x}\right)^2 = (97.6 - 12.2)^2$. If you had tried these in any of the homework problems,
you would have found that these tricks didn't work.
a) $\bar{x} = \frac{\sum x}{n} = \frac{97.6}{8} = 12.2$
b) position $= p(n + 1) = a.b = .5(9) = 4.5$
$x_{1-p} = x_a + .b(x_{a+1} - x_a)$ so $x_{1-.5} = x_{.5} = x_4 + .5(x_5 - x_4) = 11.2 + .5(12.2 - 11.2) = \frac{11.2 + 12.2}{2} = 11.7$
c) $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = \frac{1254.72 - 8(12.2)^2}{7} = 9.1429$ or
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{64.00}{7} = 9.1429$
$s = \sqrt{9.1429} = 3.0237$
d) The 7th decile has 70% below it. position $= p(n + 1) = a.b = .7(9) = 6.3$
$x_{1-p} = x_a + .b(x_{a+1} - x_a)$ so $x_{1-.7} = x_{.3} = x_6 + .3(x_7 - x_6) = 13.2 + .3(14.2 - 13.2) = 13.5$
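The four answers above can be checked with a short script. This is a sketch under stated assumptions: the helper `fractile` is a hypothetical name that implements the position = p(n + 1) interpolation rule used in b) and d), and it assumes the position falls between 1 and n. Library percentile functions often use other conventions and may give slightly different values.

```python
from statistics import mean, stdev

sales = sorted([9.2, 10.2, 9.2, 11.2, 18.2, 12.2, 13.2, 14.2])


def fractile(sorted_x, p):
    """Value with proportion p below it, using position = p(n + 1)."""
    position = p * (len(sorted_x) + 1)
    a = int(position)        # whole part of the position (a)
    b = position - a         # fractional part (.b)
    lower = sorted_x[a - 1]  # x_a, since positions are 1-based
    upper = sorted_x[min(a, len(sorted_x) - 1)]  # x_(a+1), capped at the last item
    return lower + b * (upper - lower)


x_bar = mean(sales)
median = fractile(sales, 0.5)
s = stdev(sales)             # sample standard deviation (divides by n - 1)
decile_7 = fractile(sales, 0.7)
print(x_bar, median, s, decile_7)
```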
2. A retailer finds that the lead time (time between receipt and fulfillment) for a sample of orders is as
below:
Days        Frequency
0-4.9       7
5-9.9       8
10-14.9     13
15-19.9     10
20-24.99    12
a. Calculate the Cumulative Frequency (1)
b. Calculate the Mean (1)
c. Calculate the Median (2)
d. Calculate the Mode (1)
e. Calculate the Variance (3)
f. Calculate the Standard Deviation (2)
g. Calculate the Interquartile Range (3) (Assume that the 3rd quartile is in 15-19.9)
h. Calculate a Statistic showing Skewness and Interpret it (3)
i. Make an Ogive of the Data (Neatness Counts!) (2)
Solution: x is the midpoint of the class.
Days        $f$    $F$    $x$     $fx$     $fx^2$      $fx^3$
0-4.9       7      7      2.5     17.5     43.75       109
5-9.9       8      15     7.5     60.0     450.00      3375
10-14.9     13     28     12.5    162.5    2031.25     25391
15-19.9     10     38     17.5    175.0    3062.50     53594
20-24.99    12     50     22.5    270.0    6075.00     136688
Total       50                    685.0    11662.5     219156

$x$     $x - \bar{x}$   $f(x - \bar{x})$   $f(x - \bar{x})^2$   $f(x - \bar{x})^3$
2.5     -11.2           -78.4              878.08               -9834.50
7.5     -6.2            -49.6              307.52               -1906.62
12.5    -1.2            -15.6              18.72                -22.46
17.5    3.8             38.0               144.40               548.72
22.5    8.8             105.6              929.28               8177.67
Total                   0.0                2278.0               -3037.20

$n = \sum f = 50$, $\sum fx = 685.0$, $\sum fx^2 = 11662.5$, $\sum fx^3 = 219156$,
$\sum f(x - \bar{x}) = 0.00$, $\sum f(x - \bar{x})^2 = 2278.0$, and $\sum f(x - \bar{x})^3 = -3037.20$. As usual half of the people who tried to compute
these last three sums got them wrong.
a. Calculate the Cumulative Frequency (1) (See above) The cumulative frequency is the whole F column.
b. Calculate the Mean (1) $\bar{x} = \frac{\sum fx}{n} = \frac{685.0}{50} = 13.7$
c. Calculate the Median (2) position $= p(n + 1) = .5(51) = 25.5$. This is above 15 and below 28, so the
interval is 10-14.9. $x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w$ so $x_{1-.5} = x_{.5} = 10 + \left[\frac{.5(50) - 15}{13}\right] 5 = 13.846$
d. Calculate the Mode (1) The mode is the midpoint of the largest group. Since 13 is the largest frequency,
the modal group is 10 to 15 and the mode is 12.5.
e. Calculate the Variance (3) $s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n - 1} = \frac{11662.5 - 50(13.7)^2}{49} = 46.4898$ or
$s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{2278.0}{49} = 46.4898$
f. Calculate the Standard Deviation (2) $s = \sqrt{46.4898} = 6.8183$
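The grouped-data mean, variance and standard deviation in b), e) and f) can be verified with the class midpoints and frequencies from the table. This is a minimal sketch; the variable names are illustrative only.

```python
from math import sqrt

midpoints = [2.5, 7.5, 12.5, 17.5, 22.5]
freqs = [7, 8, 13, 10, 12]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x**2 for f, x in zip(freqs, midpoints))

x_bar = sum_fx / n

# Computational formula: (sum of f*x^2 - n * x_bar^2) / (n - 1).
var_computational = (sum_fx2 - n * x_bar**2) / (n - 1)

# Definitional formula: sum of f*(x - x_bar)^2 / (n - 1).
var_definitional = sum(f * (x - x_bar)**2
                       for f, x in zip(freqs, midpoints)) / (n - 1)

s = sqrt(var_computational)
print(n, sum_fx, sum_fx2, x_bar, var_computational, var_definitional, s)
```

Both variance formulas give the same value up to floating-point rounding, which is a quick way to catch an arithmetic slip in either column of sums.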
g. Calculate the Interquartile Range (3) First Quartile: position $= p(n + 1) = .25(51) = 12.75$. This is above
$F = 7$ and below $F = 15$, so the group is 5 to 9.9. $x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w$ gives us
$Q1 = x_{1-.25} = x_{.75} = 5 + \left[\frac{.25(50) - 7}{8}\right] 5 = 8.4375$.
Third Quartile: position $= p(n + 1) = .75(51) = 38.25$. This is just above 38, so the group is 20 to 24.9.
$x_{1-.75} = x_{.25} = 20 + \left[\frac{.75(50) - 38}{12}\right] 5 = 19.792$. Since this produces a number below 20, I have directed you
to use 15 to 19.9. This actually does not seem to be a breakdown in the formula. It should be footnoted to
say that if $\left[\frac{pN - F}{f_p}\right]$ is negative, go back to the previous group. Nevertheless some rethinking seems
indicated.
$Q3 = x_{1-.75} = x_{.25} = 15 + \left[\frac{.75(50) - 28}{10}\right] 5 = 19.75$. $IQR = Q3 - Q1 = 19.75 - 8.4375 = 11.3125$.
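The grouped-data fractile formula, together with the suggested footnote about stepping back one group when the bracketed term is negative, can be sketched as follows. The function name `grouped_fractile` is hypothetical, and the step-back rule is implemented exactly as the footnote describes rather than as an established convention.

```python
lower_limits = [0, 5, 10, 15, 20]
freqs = [7, 8, 13, 10, 12]
width = 5


def grouped_fractile(p):
    """Value with proportion p below it, from L_p + [(pN - F) / f_p] * w."""
    n = sum(freqs)
    position = p * (n + 1)
    # F for each class: cumulative frequency of all classes below it.
    below = [sum(freqs[:i]) for i in range(len(freqs))]
    # First class whose cumulative frequency reaches the position.
    i = next(j for j, f in enumerate(freqs) if position <= below[j] + f)
    # Footnote rule: if pN - F is negative, go back to the previous group.
    if p * n - below[i] < 0:
        i -= 1
    return lower_limits[i] + (p * n - below[i]) / freqs[i] * width


q1 = grouped_fractile(0.25)
median = grouped_fractile(0.5)
q3 = grouped_fractile(0.75)
iqr = q3 - q1
print(q1, median, q3, iqr)
```

For the third quartile the position 38.25 first selects the 20 to 24.9 class, the bracketed term is negative, and the function falls back to 15 to 19.9, matching the directed answer.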


h. Calculate a Statistic showing Skewness and interpret it (3)
$k_3 = \frac{n}{(n - 1)(n - 2)}\left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{50}{(49)(48)}\left[219156 - 3(13.7)(11662.5) + 2(50)(13.7)^3\right]$
$= 0.02126(-3037.45) = -64.5716$. The computer got -64.5657 for this calculation.
or $k_3 = \frac{n}{(n - 1)(n - 2)}\sum f(x - \bar{x})^3 = \frac{50}{(49)(48)}(-3037.20) = -64.5663$
or $g_1 = \frac{k_3}{s^3} = \frac{-64.566}{(6.8183)^3} = -0.2037$
or Pearson's Measure of Skewness $SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(13.7 - 12.5)}{6.8183} = 0.5280$
Because of the negative sign, the first two measures imply skewness to the left, while the last one implies
skewness to the right.
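The two versions of $k_3$ and the relative skewness $g_1$ can be recomputed from the midpoints and frequencies without rounding the intermediate sums. This is a sketch with illustrative variable names; because it keeps $\sum fx^3$ at its unrounded value 219156.25, the computational form lands on about -64.566 rather than the hand-rounded -64.5716.

```python
from math import sqrt

midpoints = [2.5, 7.5, 12.5, 17.5, 22.5]
freqs = [7, 8, 13, 10, 12]
pairs = list(zip(freqs, midpoints))

n = sum(freqs)
x_bar = sum(f * x for f, x in pairs) / n
sum_fx2 = sum(f * x**2 for f, x in pairs)
sum_fx3 = sum(f * x**3 for f, x in pairs)

scale = n / ((n - 1) * (n - 2))

# Computational form: uses the raw sums of f*x^2 and f*x^3.
k3_computational = scale * (sum_fx3 - 3 * x_bar * sum_fx2 + 2 * n * x_bar**3)

# Definitional form: uses the cubed deviations from the mean.
k3_definitional = scale * sum(f * (x - x_bar)**3 for f, x in pairs)

s = sqrt(sum(f * (x - x_bar)**2 for f, x in pairs) / (n - 1))
g1 = k3_definitional / s**3   # relative skewness; negative means left skew

print(k3_computational, k3_definitional, g1)
```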
i. Make an Ogive of the Data (Neatness Counts!) (2) An ogive is a graph of the cumulative distribution. It
must begin at $y = 0$ (not necessarily $x = 0$) and should end with a horizontal line. It cannot include negative
slopes.
Your points are:
$x$:   Up to 0   5    10    15    20    25    above 25
$F$:   0         7    15    28    38    50    50
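The ogive points are just the running total of the frequencies paired with the class boundaries. A minimal sketch, with the boundary list written out by hand as an assumption about where each class ends:

```python
from itertools import accumulate

boundaries = [0, 5, 10, 15, 20, 25]   # assumed class boundaries, in days
freqs = [7, 8, 13, 10, 12]

# Cumulative frequency F at each boundary, starting from 0 at x = 0.
cumulative = [0] + list(accumulate(freqs))
points = list(zip(boundaries, cumulative))

# An ogive never slopes downward: each F is at least the one before it.
non_decreasing = all(a <= b for a, b in zip(cumulative, cumulative[1:]))
print(points, non_decreasing)
```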