Part 1. For each of the following questions fill in the blanks. Each question is worth 3
points.
1. When researchers look for a relationship between two categorical variables for individuals in the
__________, they measure those categorical variables on individuals in the __________.
2. Another name for the alternative hypothesis is the __________ hypothesis.
3. The __________ hypothesis is usually written to express the fact that ‘nothing is happening.’
4. A __________ test is a statistical procedure that is used to determine whether or not there is a
relationship between two categorical variables.
5. Using the relative frequency approach, we can define the probability of any specific outcome as
the __________ of times it occurs over the long run.
6. The __________ represents the average value of any measurement over the long run.
7. For a 95% confidence interval, the value of 95% is called the _______________.
8. When a relationship or value from a sample is so strong that we can effectively rule out chance as
an explanation, we say that the result is _______________.
Part 2. For each of the following questions circle the correct response. Each
question is worth 3 points.
9. Which of the following statements is true about chi-square tests?
a. A large chi-square test statistic results in a large p-value.
b. A large p-value means that there is a good chance that the relationship is statistically
significant.
c. If the two variables are not related in the population, then less than 5% of the samples
you could ever take would give you a test statistic of 3.84 or larger.
d. All of the above.
10. Which of the following is a true probability?
a. -.22
b. 120%
c. 1
d. None of the above
11. Suppose the outcomes of births within a given family are independent of each other, and a couple
has already had four boys. Which of the following best describes the probability that their next
baby will be a girl?
a. Approximately 50%
b. Much less than 50%
c. Much greater than 50%
d. Not enough information to tell
12. Which of the following statements is false?
a. Sample results will always be very close to their respective population values.
b. Sample results vary from one sample to the next.
c. The key to interpreting statistical results is to understand what kind of dissimilarity we
should expect to see in various samples from the same population.
d. None of the above statements are false.
13. Which of the following is a correct interpretation of a 90% confidence interval?
a. 90% of the random samples you could select would result in intervals that contain the
true population value.
b. 90% of the population values should be close to our sample results.
c. Once a specific sample has been selected, the probability that its resulting confidence
interval contains the true population value is 90%.
d. All of the above statements are true.
14. What does it mean for a confidence interval for the difference of two means to contain zero?
a. You are unable to say there is a difference in the population means.
b. Different samples could give results in either direction; completely above zero,
completely below zero, or containing zero.
c. The confidence interval will contain some negative numbers and some positive numbers.
d. All of the above.
15. Suppose a confidence interval for the difference in mean weight loss for two different weight loss
programs (Program 1 – Program 2) is entirely above zero. What does this mean?
a. We can’t say with any confidence that there is a difference in mean weight loss for the
populations of people on these two programs.
b. We can say with confidence that there is a difference in mean weight loss for the
populations of people on these two programs; further, we can say that the average weight
loss on Program 1 is higher.
c. We can say with confidence that there is a difference in mean weight loss for the
populations of people on these two programs; further, we can say that the average weight
loss on Program 2 is higher.
d. None of the above.
Part 3. For each of the following questions give a short answer. Use complete sentences.
Each question is worth 3 points.
16. Suppose you want to investigate whether there is a relationship between the gender of college
students and whether or not they wear hats in school. What would be your null hypothesis and
your alternative hypothesis (in words)? Be sure to label clearly which hypothesis is which.
17. The airlines routinely report their on-time flight percentages, which can be interpreted as
probabilities. What method of finding probabilities was most likely used in determining this?
18. Tell whether the following statement is correct; if it is not correct, explain the problem. “If the
probability of a single birth resulting in a boy is .51, then the probability of it resulting in a girl is
also .51.”
19. Which would be wider, a 90% confidence interval or a 95% confidence interval? (Assume both of
them were calculated using the same sample data.) Explain your answer.
Part 4. Make sure to show all work in the following questions!
20. One of the major hospitals in the country conducted a study to check whether there is a relationship
(association) between pet ownership and survival after major surgery. Ninety-two patients were followed
after major surgery; they were classified as pet owners or not, and their survival status after one
year was determined. The data obtained by the hospital are summarized in the table below:

           PET (YES)     PET (NO)      TOTAL
ALIVE      50 (     )    28 (     )    78
DEAD        3 (     )    11 (     )    14
TOTAL      53            39            92

Do the data provide evidence that there is a relationship between pet ownership and survival after
major surgery?
a. (3 points) Formulate appropriate null and alternative hypotheses to be tested.
b. (3 points) Compute the expected counts in your table under the assumption of no
association and fill them in the parentheses (     ) in the table.
c. (3 points) Compute the value of the chi-square test statistic.
d. (3 points) Decide whether the null hypothesis should be rejected or not, explain your decision,
and clearly answer the question posed in the problem.
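The arithmetic for parts b through d can be checked with a short Python sketch. It assumes the table reads as 50 and 3 for pet owners and 28 and 11 for non-owners (the chi-square value does not depend on which outcome is labelled alive or dead), and it uses the 3.84 cutoff quoted in question 9. The variable names are illustrative and not part of the exam.

```python
# Observed counts, assuming rows = pet owner (yes, no) and
# columns = the two survival outcomes; the labels are illustrative.
observed = [[50, 3], [28, 11]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under no association: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

CUTOFF = 3.84  # 5% cutoff for a 2x2 table, as quoted in question 9
reject_null = chi_square > CUTOFF

print([[round(e, 2) for e in row] for row in expected])
print(round(chi_square, 2), reject_null)
```

A statistic above the cutoff would lead to rejecting the hypothesis of no association.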
21. (3 points) Suppose a class of 100 students took their statistics final and their grades are shown in
the table below. Choose one student at random. What is the probability that he/she received a B or
a C?
Grade                 A     B     C     D     F
Number of students    25    28    34    10    3
22. (3 points) Suppose the chance of picking up a cold from someone by shaking hands with them is
.02 (assuming you don’t know whether they have a cold or not), and that each encounter you have
is independent of the others. Suppose you shake hands with 5 people in a given day. What is the
probability that you don’t pick up a cold from any of these people?
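Because the encounters are independent, the probabilities of avoiding a cold multiply. A minimal Python sketch of that calculation (the variable names are illustrative):

```python
p_cold = 0.02      # chance of catching a cold from one handshake
handshakes = 5     # independent encounters in the day

# Independence lets us multiply the per-encounter "no cold" probabilities.
p_no_cold = (1 - p_cold) ** handshakes

print(round(p_no_cold, 4))
```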
23. (3 points) Suppose an “Instant Lotto” ticket costs $3, and the chances of winning the $80 prize are
1/1000. There are no other prizes. What is your expected value for this game for each ticket you
buy? Give an interpretation of your answer.
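The expected value is the probability-weighted average of the net outcomes. The sketch below assumes the $3 ticket price is paid whether or not the ticket wins, so a winning ticket nets $77; the names are illustrative.

```python
ticket_cost = 3.0
prize = 80.0
p_win = 1 / 1000

# Net outcomes, assuming the ticket price is never refunded.
net_if_win = prize - ticket_cost
net_if_lose = -ticket_cost

expected_value = p_win * net_if_win + (1 - p_win) * net_if_lose

print(round(expected_value, 2))
```

A negative expected value means that, averaged over many tickets, the player loses that amount per ticket.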
24. Suppose numerous random samples of size 2,500 are taken from a population made up of 20% cell
phone owners.
a. (3 points) What is the approximate shape of the frequency curve made from the proportions of cell
phone owners in the various samples of size 2,500 from this population? Give the mean and
standard deviation of that curve.
b. (3 points) Suppose you took a random sample of size 2,500 from this population and found that
17.6% of them owned a cell phone. Is this considered to be a reasonable value given the size of
this sample? Use the standardized score in your answer.
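For sample proportions, the frequency curve is centered at the population proportion with standard deviation sqrt(p(1 - p)/n). A minimal Python sketch of the standardized score for the observed 17.6% (variable names are illustrative):

```python
import math

p = 0.20          # population proportion of cell phone owners
n = 2500          # sample size
p_hat = 0.176     # proportion observed in the single sample

# Standard deviation of the sample proportions.
sd = math.sqrt(p * (1 - p) / n)

# Standardized score: how many standard deviations p_hat lies from p.
z = (p_hat - p) / sd

print(round(sd, 4), round(z, 2))
```

A standardized score this far from zero would be unusual for a bell-shaped curve.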
25. Suppose that test scores on a particular exam have a mean of 77 and standard deviation of 5, and
that they have a bell-shaped curve. Suppose you take numerous random samples of size 100 from
this population.
a. (3 points) Describe the shape and give the mean and standard deviation of the resulting
frequency curve.
b. (3 points) Suppose you take a single random sample of size 100 from this population, and you
get a mean test score of 78. What is the chance of observing a value like that or larger? Use a
standardized score to justify your answer.
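For sample means, the standard deviation of the frequency curve is the population standard deviation divided by sqrt(n). The sketch below computes the standardized score and the upper-tail area of a normal curve using the standard library's `math.erfc`; the variable names are illustrative.

```python
import math

mu = 77.0         # population mean
sigma = 5.0       # population standard deviation
n = 100           # sample size
x_bar = 78.0      # observed sample mean

# Standard deviation of the sample means.
sd_mean = sigma / math.sqrt(n)

# Standardized score of the observed sample mean.
z = (x_bar - mu) / sd_mean

# Upper-tail area of the standard normal curve beyond z.
upper_tail = 0.5 * math.erfc(z / math.sqrt(2))

print(sd_mean, z, round(upper_tail, 4))
```

The exact tail area is close to the 2.5% that the empirical rule gives for values more than two standard deviations above the mean.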
26. Suppose a shipment of oranges is advertised to weigh 5 pounds per bag. We know that not every
bag can contain exactly 5 pounds of oranges. We decide to take a random sample of 100 bags of
oranges and find out what they tell us about the population of all bags in this shipment. We are
only interested in whether or not the bags are underweight, so each bag is weighed and counted as
underweight if it weighs less than 5 pounds. Five bags in our sample of 100 were found to be
underweight.
a. (3 points) Compute a 95% confidence interval for the proportion of bags in the shipment that
are underweight.
b. (3 points) Suppose the grocery store that ordered the oranges will reject the shipment if they
believe, based on these sample results, that more than 10% of the bags in the entire truckload are
underweight. Based on our sample, will they have to return this shipment? Explain your answer.
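A confidence interval for a proportion takes the sample proportion plus or minus a multiplier times its standard error. The sketch below assumes a multiplier of 1.96 for 95% confidence (some courses round this to 2, which widens the interval slightly); the names are illustrative.

```python
import math

underweight = 5
n = 100
MULTIPLIER = 1.96   # assumed multiplier for 95% confidence
THRESHOLD = 0.10    # store rejects if more than 10% are underweight

p_hat = underweight / n
std_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = MULTIPLIER * std_error

ci_low = p_hat - margin
ci_high = p_hat + margin

# The shipment is in question only if the interval reaches past the threshold.
interval_below_threshold = ci_high < THRESHOLD

print(round(ci_low, 4), round(ci_high, 4), interval_below_threshold)
```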
27. The introductory biology class at a large university is taught to hundreds of students each
semester. For planning purposes, the instructor wants to find out the average amount of time that
students would use to take the first quiz, if they could have as long as necessary to take it. She
takes a random sample of 100 students from this population and finds that their average time for
taking the quiz is 24 minutes, and the standard deviation is 16 minutes.
a. (3 points) Find a 95% confidence interval for the average time to take this quiz for the whole
population of students who take the class.
b. (3 points) Suppose the professor expects the average time to take the quiz to be 23 minutes. Do
you have enough evidence to say that the professor is wrong in her estimate of the average time
to take this quiz? Base your answer on the confidence interval you obtained in part a.
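The same interval idea applies to a mean, using the sample standard deviation divided by sqrt(n) as the standard error. This sketch again assumes a multiplier of 1.96 for 95% confidence, and the variable names are illustrative.

```python
import math

x_bar = 24.0        # sample mean quiz time, in minutes
s = 16.0            # sample standard deviation, in minutes
n = 100             # sample size
MULTIPLIER = 1.96   # assumed multiplier for 95% confidence
claimed_mean = 23.0 # the professor's expected average time

std_error = s / math.sqrt(n)
margin = MULTIPLIER * std_error

ci_low = x_bar - margin
ci_high = x_bar + margin

# If the claimed mean lies inside the interval, the sample does not contradict it.
claim_inside_interval = ci_low <= claimed_mean <= ci_high

print(round(ci_low, 3), round(ci_high, 3), claim_inside_interval)
```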