Econ 57
Economic Statistics
Spring 2006
HW 6 assignment
1.
Connecticut vs Teal, 1982, was concerned with a promotion test that was passed by 26 of
48 black employees and 206 of 259 (79.53%) white employees.
a. Use these data to test the null hypothesis that the pass rate of black employees (X/n) is
equal to that of white employees (using p=79.53%), against the alternative hypothesis
that the pass rate is lower than that of white employees. Be careful to specify exactly
what your null hypothesis is and what your alternative hypothesis is. Using a decision
rule of 2%, what do you conclude from the p-value here?
b. Now test the null hypothesis that the pass rate of black employees is equal to that of white
employees (using 79.53%), against the alternative hypothesis that the pass rate is higher
than that of white employees. Be careful to specify exactly what your null hypothesis and
what your alternative hypothesis are. Using a decision rule of 2%, what do you conclude
from the p-value here?
c. Now test the null hypothesis that the pass rate of black employees is equal to that of white employees
(using 79.53%), against the alternative hypothesis that the pass rate is not equal to that of
white employees. Be careful to specify exactly what your null hypothesis and what your
alternative hypothesis are. Using a decision rule of 2%, what do you conclude from the p-value here?
d. Explain how (a), (b), and (c) above forced you to set up the problem differently each time
and explain how the set-up of the problem affected your p-values and interpretation of the
results. Summarize how specifying an alternative hypothesis matters to hypothesis
testing.
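The three set-ups in (a), (b), and (c) differ only in which tail of the binomial distribution is summed. The sketch below is one way to see this, assuming an exact binomial calculation (the course may instead intend Excel's BINOMDIST or a normal approximation). The function names are illustrative, and doubling the smaller tail is only one common convention for a two-sided p-value.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def lower_tail(x, n, p):
    """P(X <= x): p-value when the alternative is 'pass rate is lower'."""
    return sum(binom_pmf(k, n, p) for k in range(0, x + 1))

def upper_tail(x, n, p):
    """P(X >= x): p-value when the alternative is 'pass rate is higher'."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))

def two_sided(x, n, p):
    """Two-sided p-value, assuming the 'double the smaller tail' convention."""
    return min(1.0, 2 * min(lower_tail(x, n, p), upper_tail(x, n, p)))

# Figures from the problem: 26 of 48 passed; null pass rate is 206/259.
n, x, p0 = 48, 26, 206 / 259
print("lower-tail p-value:", lower_tail(x, n, p0))
print("upper-tail p-value:", upper_tail(x, n, p0))
print("two-sided p-value: ", two_sided(x, n, p0))
```

Each p-value is then compared with the 2% decision rule. Note that the same data give very different p-values depending on the direction of the alternative.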
2a.
An article entitled “Tanning Risk: Few Adults Apply Sunscreen,” by Jesse Drris,
abcnews.com, 8/8/01, at http://abcnews.go.com/sections/living/DailyNews/, reported on
sunscreen behavior. (A survey was conducted by telephone Aug. 1-5 among a random
national sample of 1,023 adults.) Using tans.xls, test the null hypothesis that people are
equally likely to use and not use sunscreen against the alternative hypothesis that people
are not equally likely to use and not use sunscreen, using a 5% decision rule. (Define
“not” using sunscreen as rarely or never using sunscreen.)
2b.
Using p = 0.5 when estimating the standard deviation, calculate a 95 percent confidence
interval for the probability of rarely or never using sunscreen. What does this confidence
interval tell you about the probability of “not” using sunscreen?
2c.
Compare the result in (a) to the result in (b) above. Explain the relationship between
confidence intervals and hypothesis tests.
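The link between the test and the interval can be sketched numerically. The example below assumes a normal approximation with z = 1.96 and uses p = 0.5 for the standard deviation, as the problem instructs. The sample proportion 0.40 is a made-up placeholder, not a value from tans.xls, and the function names are illustrative.

```python
from math import sqrt

def conservative_ci(p_hat, n, z=1.96, p_for_sd=0.5):
    """95% interval for a proportion, using p_for_sd to estimate the std. dev."""
    se = sqrt(p_for_sd * (1 - p_for_sd) / n)
    return p_hat - z * se, p_hat + z * se

def two_sided_z_reject(p_hat, n, p0=0.5, z=1.96):
    """True if a two-sided 5% z-test rejects H0: p = p0."""
    se = sqrt(p0 * (1 - p0) / n)
    return abs(p_hat - p0) / se > z

n = 1023          # sample size reported in the article
p_hat = 0.40      # HYPOTHETICAL sample proportion, for illustration only

low, high = conservative_ci(p_hat, n)
print(f"95% CI: ({low:.4f}, {high:.4f})")
print("test rejects p = 0.5:   ", two_sided_z_reject(p_hat, n))
print("0.5 lies outside the CI:", not (low <= 0.5 <= high))
```

Because both calculations use the same standard error here, the test rejects p = 0.5 exactly when 0.5 falls outside the interval.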
3.
Question 7.32 on page 377.
4.
In datamining.xls, each column represents one sample from the same population,
where p = probability of female = 50%. For each of the 100 samples (each of size 80),
test the hypothesis that the fraction of females is 50% against the alternative that the
fraction of females is not equal to 50%, using a decision rule of 5%. How many times
did you reject the null? What does this exercise illustrate about the probability of type I
errors?
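The same experiment can be mimicked by simulation. The sketch below does not read datamining.xls. It assumes fresh random samples drawn with a true p of 0.5 and a two-sided z-test with critical value 1.96; the function name and seed are arbitrary choices for illustration.

```python
import random
from math import sqrt

def count_rejections(num_samples=100, n=80, p=0.5, z_crit=1.96, seed=0):
    """Draw num_samples samples of size n with true proportion p and count
    how often a two-sided z-test of H0: proportion = p rejects."""
    rng = random.Random(seed)
    se = sqrt(p * (1 - p) / n)          # standard error under the null
    rejections = 0
    for _ in range(num_samples):
        females = sum(rng.random() < p for _ in range(n))
        z = (females / n - p) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections

print("rejections out of 100 samples:", count_rejections())
```

Every rejection in this simulation is a type I error, because the null hypothesis is true by construction. Changing the seed changes the count, but over many samples the rejection rate should settle near the 5% decision rule.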
5a.
In general, if your p-value were less than 5%, could you be absolutely sure that the null
that you were rejecting was wrong? Carefully explain your reasoning.
5b.
In general, if your p-value were more than 5%, could you be absolutely sure that the null
that you were not rejecting was true?
5c.
What is the difference between not rejecting the null and accepting the null? Why should
you be careful to make a distinction between these two ways of phrasing your
conclusion?
6.
Using Excel >> Tools >> Data Analysis >> Random Number Generation, create
three columns of data. The first column will be a random sample of size 16
drawn from a normal distribution centered at .5 and with a standard deviation
of .15. The second will be a random sample of size 1000 drawn from the same
normal distribution. The third will be a random sample of size 10,000 drawn
from the same normal distribution. Generate column 1 by entering in these
parameters (then repeat for column 2 and 3):
Number of variables: 1,
Number of Random Numbers: 16, 1000, or 10000.
Distribution: Normal
Mean = .5
Std dev = .15
Random seed = (any number)
Output Range = A5 (for column 1), B5 (for column 2), C5 (for column 3).
For each of these columns, calculate the sample mean, x-bar, the sample standard
deviation, S, and the test statistic, t = (x-bar - .48)/(S/sqrt(n)), to test the null hypothesis
μ = .48 against the alternative μ ≠ .48. Now answer the following questions:
a. Compare the sample standard deviations, S, across the 3 columns.
b. Does the sample standard deviation, S, fall as n rises or stay approximately
constant at around .15? Explain.
c. Distinguish between the standard deviation of the sample (of size n) and the
standard deviation of the sample mean, x-bar. Give a numerical example of this
difference by using the data you generated.
d. Compare the test statistics, t, across the 3 columns. Do you notice anything?
e. What happens to the probability of finding a “statistically significant” difference
from a μ of .48 as n rises? Explain.
f. Why are you more likely to find a “statistically significant” result with a larger
sample than a smaller one?
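For readers working outside Excel, the three columns can be approximated with a short script. This is a sketch only: it uses Python's random number generator rather than Excel's, so the numbers will not match a spreadsheet, and the function name and seed are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, stdev

def summarize(n, mu=0.5, sigma=0.15, mu0=0.48, seed=0):
    """Draw n values from Normal(mu, sigma) and return x-bar, S,
    the standard error S/sqrt(n), and t = (x-bar - mu0) / (S/sqrt(n))."""
    rng = random.Random(seed)
    data = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = mean(data)
    s = stdev(data)                 # sample standard deviation (n - 1 divisor)
    se = s / sqrt(n)                # standard deviation of the sample mean
    t = (x_bar - mu0) / se
    return x_bar, s, se, t

for n in (16, 1000, 10000):
    x_bar, s, se, t = summarize(n)
    print(f"n={n:>5}  x-bar={x_bar:.4f}  S={s:.4f}  S/sqrt(n)={se:.5f}  t={t:.2f}")
```

In the output, S estimates the population standard deviation of .15 at every sample size, while S/sqrt(n) shrinks as n grows. That shrinking denominator is what drives the test statistic.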