Chapter 1
1. A survey of 1420 U.S. undergraduate English majors asked which Shakespearean
play was most relevant in the year 2000. What is the population and the sample?
Ans. The population is all U.S. undergraduate English majors. The sample is the 1420 U.S.
undergraduate English majors who were surveyed.
2. Decide which method of data collection you would use to gather data for the
study. Explain. Twenty-five students are randomly selected from each grade
level at a high school and surveyed about their study habits.
Ans. Stratified sampling, because each grade level must be represented in the study.
3. You are a researcher for a professional research firm. Your firm has won a
contract to do a study for an automobile industry publication. The publication
would like to get its readers’ (engineers, manufacturers, researchers, and
developers) thoughts on the future of automobiles, such as what type of fuel they
think will be used in the future. The publication would like to get input from
those who work for automakers and from those who work for automaker suppliers.
The publication has given you its readership database and the 25 questions it
would like to ask (sample questions from a previous study are given below).
It is too expensive to contact all the readers, so you need to determine a way to
contact a representative sample of the entire readership population.
How will the internal combustion engine of the future be fueled?

Fuel             Percent Responding
Gasoline         38.1%
Hydrogen         23.2%
Diesel Fuel      19.6%
Natural Gas      11.4%
Other            6.7%
No response      1%

When will affordable fuel cell vehicles be on the market?

Time                                     Percent Responding
5 years or less                          5.4%
More than 5 years to at most 10 years    33.6%
More than 10 years                       48.7%
Not likely                               7.6%
No response                              4.7%

A. a) What sampling technique would you use to select the sample for the study?
Why?
b) Will the technique you choose in part a give you a sample that is
representative of the population?
c) Describe the method for collecting data.
d) Identify possible flaws or biases in your study.
B. a) What type of data do you expect to collect: qualitative, quantitative, or
both? Why?
b) What levels of measurement do you think the data in the study will be?
Why?
c) Will the data collected for the study represent a population or a sample?
Why?
d) Will the numerical descriptions of the data be parameters or statistics?
Ans: Possible Responses
A. a) Systematic sampling, using every 4th or 5th subscriber from the publication's
subscriber database. This is cost effective and simple to run, and it avoids
having to sort the subscribers' jobs into categories for a stratified sample.
However, cluster sampling by region (North, East, South, and West) is also a
viable option.
b) Yes, the technique should give a representative sample of the readership population.
c) Data will be collected with a survey of 25 multiple-choice questions.
d) Possible flaws include nonresponse bias and response bias from leading
questions. For example, readers with certain types of jobs might be more likely
to reply, while others might not reply at all.
B. a) Quantitative, because the data are the percentages of readers who gave each
answer to the questions.
b) Ratio, because the percentages can be ordered from lowest to highest, their
differences can be found, and so can their ratios. For example, it is
acceptable to say that approximately twice as many readers think gasoline
(38.1%) will be the fuel of the future as think diesel fuel (19.6%) will be.
c) Sample
d) Statistics
4. You are conducting a survey to find out how many students at your school say
math is their favorite subject. Describe how you would do this by using stratified and
cluster sampling (assume that your school goes from grade nine through twelve).
Ans: Stratified: You would survey a random sample of students from each grade level.
Cluster: You would survey all of the students in one randomly chosen grade level.
5. From a group of 999 people in which each person is randomly assigned a number,
use the third row of the random number table to randomly select ten people.
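The difference between the two designs can be sketched in a few lines of Python. This is only an illustration: the roster, the student IDs, the 40 students per grade, and the sample size of 10 are all made-up assumptions, and treating each grade level as a cluster follows the answer above.

```python
import random

# Hypothetical roster: 40 students per grade (IDs are made up for illustration).
roster = {grade: [f"g{grade}-s{i}" for i in range(1, 41)] for grade in (9, 10, 11, 12)}
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Stratified: draw a random sample of 10 students from every grade level.
stratified = {grade: rng.sample(students, 10) for grade, students in roster.items()}

# Cluster: pick one grade level at random and survey every student in it.
chosen_grade = rng.choice(sorted(roster))
cluster = roster[chosen_grade]

print({grade: len(sample) for grade, sample in stratified.items()})
print(chosen_grade, len(cluster))
```

Every grade appears in the stratified sample, while the cluster sample contains one whole grade and nothing from the others.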
Ans: 596, 547, 196, 627, 386, 500, 40, 535, 894, and 31
Chapter 2
1. When data are skewed right, the mean is:
a. Greater than the mode and less than the median?
b. Greater than the mode and greater than the median?
c. Less than the mode and greater than the median?
d. Less than the mode and less than the median?
Ans: b
2. In a normal distribution (bell curve), what percentage of the data lies within one
standard deviation of the mean?
a. 68%
b. 34%
c. 27%
d. 56%
Ans: a
3. Use the following data set:
50 51 54 56 59 60 61 61 61
63 64 65 68 69 70 71 71 75
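a) Make a stem-and-leaf plot of the data.
b) Make a box-and-whisker plot of the data.
ANS:
a)
5 | 0 1 4 6 9
6 | 0 1 1 1 3 4 5 8 9
7 | 0 1 1 5
Key: 5 | 0 = 50
b) [Box-and-whisker plot of the data; axis marked at 50, 60, and 70]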
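The box-and-whisker plot is drawn from the five-number summary. A minimal sketch of that computation follows; it assumes the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data, and the function name is only illustrative.

```python
from statistics import median

data = [50, 51, 54, 56, 59, 60, 61, 61, 61,
        63, 64, 65, 68, 69, 70, 71, 71, 75]

def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles taken as medians of each half."""
    xs = sorted(values)
    half = len(xs) // 2
    lower = xs[:half]    # values below the median position
    upper = xs[-half:]   # values above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

print(five_number_summary(data))
```

Other quartile conventions (for example, interpolation methods used by some software) can give slightly different Q1 and Q3 values.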
4. The weights of 20 baseball players have a distribution with a mean of 195 pounds
and a standard deviation of 25 pounds. Use z-scores to determine if the weights
of the following players are unusual.
a) 251 pounds
b) 162 pounds
c) 219 pounds
d) 178 pounds
ANS: a) Unusual; b) Not unusual; c) Not unusual; d) Not unusual
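The z-scores behind these answers can be checked with a short sketch. It assumes the usual rule of thumb that a value is unusual when it lies more than 2 standard deviations from the mean; the helper names are illustrative.

```python
MEAN, SD = 195, 25  # pounds, from the problem statement

def z_score(x, mean=MEAN, sd=SD):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def is_unusual(x):
    # Convention assumed here: more than 2 standard deviations from the mean.
    return abs(z_score(x)) > 2

for weight in (251, 162, 219, 178):
    print(weight, round(z_score(weight), 2), is_unusual(weight))
```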
Chapter 3
1. Are the following events independent or dependent?
Flipping a coin and getting heads and rolling a die and obtaining a five
Ans: Independent, one outcome does not affect the other.
2. The state government wants to construct a new interstate highway and gets 16
bids for the project. 4 of the 16 bidding companies will be picked to sponsor the
project. In how many ways can the 4 companies be picked?
Ans. 1,820 different ways
3. Find the probability of selecting four consecutive twos when four cards are drawn
without replacement from a standard deck of 52 playing cards. Round your
answer to four decimal places.
ANS: Not provided 
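No answer is given above, but the multiplication rule for dependent events can be applied directly: (4/52)(3/51)(2/50)(1/49). The sketch below carries out that product with exact fractions; the function name and parameters are illustrative.

```python
from fractions import Fraction

def prob_all_same_rank(draws=4, rank_count=4, deck_size=52):
    """P(every card drawn without replacement has one given rank)."""
    p = Fraction(1)
    for i in range(draws):
        # After i successful draws, i cards of the rank and i cards overall are gone.
        p *= Fraction(rank_count - i, deck_size - i)
    return p

p_four_twos = prob_all_same_rank()
print(p_four_twos, round(float(p_four_twos), 4))
```

The exact probability is so small that it rounds to zero at four decimal places.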
4. If a student is randomly selected, find the probability that the student is a senior,
given that the student owns a credit card. Round your answer to three decimal places.

Class     Credit Card Carrier    Not a Credit Card Carrier    Total
Junior    26                     40                           66
Senior    32                     45                           77
Total     58                     85                           143

ANS: Not provided
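Although no answer is given, the conditional probability can be read from the table by restricting attention to the credit card carriers. A small sketch, with the dictionary layout and function name chosen only for illustration:

```python
# Counts from the table: (class, carries a credit card) -> number of students
counts = {
    ("Junior", True): 26, ("Junior", False): 40,
    ("Senior", True): 32, ("Senior", False): 45,
}

def conditional_probability(counts, event_class, given_card):
    """P(class = event_class | credit-card status = given_card)."""
    given_total = sum(n for (_, card), n in counts.items() if card == given_card)
    return counts[(event_class, given_card)] / given_total

p_senior_given_card = conditional_probability(counts, "Senior", True)
print(round(p_senior_given_card, 3))
```

This is 32 seniors out of the 58 students who carry a credit card.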
5. Decide if the events are mutually exclusive:
Event A: Randomly select a person who uses the internet at least twice a week.
Event B: Randomly select a person who has not used the internet in seven days.
Ans: Mutually Exclusive – The two events A and B cannot occur at the same time.
6. The starting lineup for a softball team consists of ten players. How many
different batting orders are possible using the starting lineup?
Ans: 3,628,800 batting orders (Permutation)
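The two counting answers in this chapter (questions 2 and 6) differ in whether order matters. A quick check using Python's standard `math` module:

```python
import math

ways_to_pick_bids = math.comb(16, 4)   # order does not matter: combination
batting_orders = math.factorial(10)    # order matters, all 10 players used

print(ways_to_pick_bids, batting_orders)
```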
Chapter 4
1. You are taking a multiple-choice quiz that consists of five questions. Each
question has four possible answers, only one of which is correct. To complete the
quiz, you randomly guess the answer to each question. Find the probability of
guessing:
a) Exactly 3 answers correctly
b) At least three answers correctly
c) Fewer than three answers correctly
Ans: a) 0.088 b) 0.104 c) 0.896
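These are binomial probabilities with n = 5 trials and p = 0.25 per guess. The sketch below evaluates the binomial formula directly; the function name is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.25  # five questions, one correct choice out of four
exactly_3 = binomial_pmf(3, n, p)
at_least_3 = sum(binomial_pmf(k, n, p) for k in range(3, n + 1))
fewer_than_3 = 1 - at_least_3  # complement of "at least 3"

print(round(exactly_3, 3), round(at_least_3, 3), round(fewer_than_3, 3))
```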
2. Decide whether the distribution is a probability distribution.

x       5       6       7       8
P(x)    0.28    0.21    0.43    0.15

Ans: No. The probabilities sum to 1.07, which is greater than 1.
3. A six-sided die is rolled 3 times. Find the probability of rolling exactly one 6.
Ans: 0.347
4. Determine what type of distribution the following scenario represents, and use that
distribution to answer the included question.
As a hungry boy, Peter really enjoys visiting his favorite fast food chain, Taco Bell.
However, his best friend, Fabio, claims that the service is terrible because
forty-three percent of the time he receives the wrong order. Peter makes a bet with
Fabio that he will not receive a wrong order from Taco Bell until his fifth visit (not
counting the other times Fabio visited, so start from 0). What is the probability
that Peter is correct?
Ans: Geometric distribution; P(5) = (0.57)^4(0.43) ≈ 0.045
5. Create a discrete probability distribution and graph for the following
scenario. The owner of a local ice cream shop wants to see the probability
distribution of the number of toppings (1 – 4) that people put on their ice cream and
determine whether or not it is normally distributed. After keeping a tally of 112
customers, he placed his results in the following frequency distribution.

Number of toppings (x)    1     2     3     4
Frequency (f)             12    47    30    23

Ans:
x       1        2        3        4
P(x)    0.107    0.420    0.268    0.205

[Histogram "Number of Ice Cream Toppings": relative frequency (0 to 0.5) versus number of toppings (1 to 4), drawn with no space between the bars]
6. A local pet shelter is selling $4 raffle tickets as part of a fundraising program.
The first prize is a vacation getaway valued at $3150, and the second prize is a
camping tent valued at $450. The remaining prizes are fifteen $25 Target gift
certificates. The number of tickets sold is 5000. Find the expected net gain to the
player for one play of the game. Is the player expected to win or lose?
Ans: E(x) = -$3.205, so the player is expected to lose.
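The expected net gain is the sum of each net outcome times its probability. A minimal sketch of that sum, with variable names chosen for illustration:

```python
TICKET_PRICE = 4
TICKETS_SOLD = 5000
prizes = {3150: 1, 450: 1, 25: 15}  # prize value -> number of tickets winning it

losing_tickets = TICKETS_SOLD - sum(prizes.values())
# Net gain = prize value minus ticket price; a losing ticket nets -TICKET_PRICE.
expected_net_gain = sum(
    (value - TICKET_PRICE) * count / TICKETS_SOLD for value, count in prizes.items()
) + (-TICKET_PRICE) * losing_tickets / TICKETS_SOLD

print(round(expected_net_gain, 3))
```

Equivalently, the total prize value per ticket (3975/5000 = 0.795) minus the $4 price gives the same result.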
7. Assume the probability that you will make a hole in one on any given swing is
0.19. Find the probability that you:
a. Make your first hole in one on the fifth swing
b. Make your first hole in one on the first, second, or third swing
c. Do not make a hole in one on the first three swings
Ans: a. 0.082; b. 0.469; c. 0.531
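These answers follow from the geometric distribution, P(x) = p(1 - p)^(x - 1), which assumes independent swings with the same success probability each time. A short sketch, with an illustrative function name:

```python
def geometric_pmf(k, p):
    """P(first success occurs on trial k), k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

p = 0.19
on_fifth = geometric_pmf(5, p)
within_three = sum(geometric_pmf(k, p) for k in (1, 2, 3))
none_in_three = (1 - p) ** 3  # complement of "within three"

print(round(on_fifth, 3), round(within_three, 3), round(none_in_three, 3))
```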
Chapter 5
1. A survey indicates that for each trip to the supermarket, a shopper spends an
average of μ = 45 minutes with a standard deviation of σ = 12 minutes. The
length of time spent in the store is normally distributed and is represented by the
variable x. A shopper enters the store. Find the probability that the shopper will
be in the store between 24 and 54 minutes.
Ans: 0.733
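The probability is the area under the normal curve between x = 24 and x = 54 (z = -1.75 and z = 0.75). It can be checked with the standard library's `statistics.NormalDist`:

```python
from statistics import NormalDist

shopping_time = NormalDist(mu=45, sigma=12)  # minutes
p_between = shopping_time.cdf(54) - shopping_time.cdf(24)

print(round(p_between, 3))
```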
2. 29% of people in the US say they are confident that passenger trips to the moon
will occur in their lifetime. You randomly select 200 people in the US and ask
each if he or she thinks passenger trips will occur in his/her lifetime. What is the
probability that at least 50 will say yes?
Ans. 0.9066
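3. Shoppers spend an average of 65 minutes at the grocery store every week with a
standard deviation of 15 minutes. Assuming the times are normally distributed, use
z-scores to find the probability that a shopper spends:
a) Between 50 and 80 minutes
b) 95 or more minutes
Ans. a) 0.683 (z = -1 to z = 1); b) 0.023 (z ≥ 2)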
4. Find the z-score that corresponds to each percentile:
a) P5
b) P10
c) P15
d) P20
Ans: a = -1.65; b = -1.28; c = -1.04; d = -0.84
Chapter 6
1. A video gamer wishes to estimate the mean level of all players on his server. In a
random sample of 100 players, the mean level is found to be 17.8. Similar
experiments in the past have found the population standard deviation to be 1.8.
Assuming the population is normally distributed, construct a 90% confidence
interval for the population mean level.
Ans. 17.5039 < μ < 18.0961
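Because the population standard deviation is known, the interval is x̄ ± z_c·σ/√n. A minimal sketch using `statistics.NormalDist` for the critical value; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for a mean when the population sigma is known."""
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_c * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_interval(17.8, 1.8, 100, 0.90)
print(round(low, 4), round(high, 4))
```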
2. A jeweler randomly selects and weighs 30 diamonds. The sample standard
deviation is 53 grams. Construct a 99% confidence interval for the population
standard deviation. Assume the weights are normally distributed.
Ans. 39.45 < σ < 78.79 (using χ²R = 52.336 and χ²L = 13.121 with d.f. = 29)
3. Use a normal distribution or a t-distribution to construct a 95% confidence
interval for the population mean. Justify your decision. If neither distribution can
be used, explain why not.
You take a random sample of 25 sports cars and record the miles per gallon for
each. The data are listed below. Assume the miles per gallon are normally
distributed.
24 24 27 20 26 23 18 29 24 22 22 27 26
20 28 30 23 24 19 22 24 26 23 24 25
Ans: Use the t-distribution because n < 30, the miles per gallon are normally distributed, and
the population standard deviation is unknown. (22.762, 25.238)
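The interval can be reproduced from the raw data. In this sketch the critical value 2.064 is an assumption taken from a standard t-table (95% confidence, d.f. = 24), since the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

mpg = [24, 24, 27, 20, 26, 23, 18, 29, 24, 22, 22, 27, 26,
       20, 28, 30, 23, 24, 19, 22, 24, 26, 23, 24, 25]

T_CRITICAL = 2.064  # assumed t-table value: 95% confidence, d.f. = 24

x_bar = mean(mpg)
s = stdev(mpg)  # sample standard deviation (n - 1 divisor)
margin = T_CRITICAL * s / sqrt(len(mpg))
interval = (x_bar - margin, x_bar + margin)

print(x_bar, s, tuple(round(v, 3) for v in interval))
```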
4. A container of car oil is supposed to contain 1000 milliliters of oil. A quality
control manager wants to be sure that the standard deviation of the oil container is
less than 20 milliliters. He randomly selects 10 cans of oil with a mean of 997
milliliters and a standard deviation of 32 milliliters. Use these sample results to
construct a 95% confidence interval for the true value of the population standard
deviation.
Ans: (22.01, 58.42)
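The interval for σ comes from √((n - 1)s²/χ²R) < σ < √((n - 1)s²/χ²L). In the sketch below, the critical values 19.023 and 2.700 are assumptions taken from a chi-square table (95% confidence, d.f. = 9), and the function name is illustrative.

```python
from math import sqrt

def sigma_interval(n, s, chi2_right, chi2_left):
    """Confidence interval for a population standard deviation."""
    ss = (n - 1) * s**2
    return sqrt(ss / chi2_right), sqrt(ss / chi2_left)

# Assumed chi-square table values for 95% confidence with d.f. = 9.
CHI2_RIGHT, CHI2_LEFT = 19.023, 2.700

low, high = sigma_interval(n=10, s=32, chi2_right=CHI2_RIGHT, chi2_left=CHI2_LEFT)
print(round(low, 2), round(high, 2))
```

Since the whole interval lies above 20 milliliters, the sample does not support the manager's hope that the standard deviation is below 20.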
5. Find the Critical Value tc for a 95% confidence level when the sample size is 12.
Ans. tc = 2.201
6. You randomly select 12 cities in Georgia and measure the summer temperature
in each city. The sample mean is 98° with a sample standard deviation
of 5°. Find the 95% confidence interval for the mean temperature. Assume
temperatures are normally distributed.
Ans: (94.823, 101.177)
Chapter 7
1. A local basketball club claims that the length of time to play an entire game has a
standard deviation of more than 10 minutes.
a. What would be a consequence of a Type I error in this case?
b. What would be a consequence of a Type II error in this case?
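Ans: a. A Type I error would be rejecting the null hypothesis that the standard deviation is
less than or equal to 10 minutes when it is actually true.
b. A Type II error would be failing to reject the null hypothesis that the standard deviation
is less than or equal to 10 minutes when it is actually false.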
2. The teachers of Chandler High School report that their students have 3 or fewer
hours of homework every night, for all of their classes combined. You disagree
with this statement and randomly select a sample of 450 students. You find that
the calculated p-value is 0.0312. At the 5% significance level, what can you
conclude?
Ans. Reject H0, because the p-value of 0.0312 is less than 0.05. At the 5% level, there is
enough evidence to reject the claim that CHS students have 3 or fewer hours of homework
every night.
3. You represent a company that claims the mean number of products it ships
every year is 68. Write the null and alternative hypotheses to test this
claim.
Ans:
H0: μ = 68 (claim)
Ha: μ ≠ 68
4. A repair company states that the mean price for every repair they make is less
than $100. You are thinking about hiring this repair company to fix your broken
fridge, and find that the mean repair cost of 5 fridges that they repaired was $75
with a standard deviation of $12.50. At α = 0.01, do you have enough evidence to
support the company’s claim?
Ans.
H0: μ ≥ 100
Ha: μ < 100 (claim)
Reject H0. There is enough evidence at the 1% level to support the claim that the average
repair cost for broken fridges is less than $100.
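The decision rests on the standardized test statistic t = (x̄ - μ)/(s/√n). In this sketch, the critical value -3.747 is an assumption taken from a t-table (left-tailed, α = 0.01, d.f. = 4), and the function name is illustrative.

```python
from math import sqrt

def t_statistic(sample_mean, mu_0, s, n):
    """Standardized test statistic for a one-sample t-test."""
    return (sample_mean - mu_0) / (s / sqrt(n))

T_CRITICAL = -3.747  # assumed t-table value: left-tailed, alpha = 0.01, d.f. = 4

t = t_statistic(sample_mean=75, mu_0=100, s=12.50, n=5)
reject_h0 = t < T_CRITICAL  # left-tailed test: reject when t falls below the cutoff

print(round(t, 3), reject_h0)
```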
Chapter 8
1. Find the critical value for the indicated test, level of significance, and given
sample size. Assume samples are independent, normal, random, and that the
population variances are not equal!
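Two-tailed; α = 0.10; n1 = 10, n2 = 12
Ans: t0 = ±1.833, using d.f. = min(n1 - 1, n2 - 1) = 9 because the variances are not
assumed equal. (If the variances were assumed equal, d.f. = n1 + n2 - 2 = 20 and
t0 = ±1.725.)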
2. Classify the two given examples as independent or dependent. Explain your
answer.
Sample 1: The weights of 51 adults
Sample 2: The weights of the same 51 adults after participating in a diet and exercise
program for one month.
Ans: Dependent because the same adults were sampled.
3. Test the claim about the difference between two population proportions for the
given level of significance and the given sample statistics.
Claim: p1 ≠ p2; x1 = 35, n1 = 70; x2 = 36, n2 = 60; α = 0.01
Ans. z ≈ -1.142; p ≈ 0.254
Fail to reject the null hypothesis at the 1% level. There is not enough evidence to
support the claim that p1 ≠ p2.
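The test statistic and p-value can be checked with a pooled two-proportion z-test. The sketch assumes a two-tailed test, which is what the reported p-value corresponds to; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z statistic and two-tailed p-value for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    se = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

z, p_value = two_proportion_z(35, 70, 36, 60)
print(round(z, 3), round(p_value, 3))
```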
4. Decide whether the following are independent or dependent. Explain reasoning.
a) Sample 1: test scores for 14 statistics students on a pretest
Sample 2: test scores for the same 14 statistics students on a posttest
b) Sample 1: results of a medicine on a sample of 26 patients
Sample 2: results of a placebo on a sample of 26 patients
Ans: a. The two samples are dependent because the same 14 students are used for both
samples.
b. The two samples are independent because different patients are in each sample; no
patient takes both the medicine and the placebo.
5. A real estate agent claims that there is no difference between the mean household
incomes from two neighborhoods. The mean income of 12 randomly selected
households from the first neighborhood was $12,250 with a standard deviation of
$1200. In the second neighborhood, 10 randomly selected households had a mean
income of $17,500 with a standard deviation of $950. Assume normal
distributions and equal population variances. Test the claim at α = 0.05.
Ans: Two-sample t-test (pooled): t = -11.202. Conclusion: Reject H0. At the 5%
level, there is enough evidence to reject the real estate agent’s claim that there is no
difference between the mean household incomes of the two neighborhoods.
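The pooled test statistic can be reproduced from the summary statistics. A minimal sketch, with an illustrative function name:

```python
from math import sqrt

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic assuming equal population variances."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se

t = pooled_t(12_250, 1200, 12, 17_500, 950, 10)
print(round(t, 3))
```

With d.f. = n1 + n2 - 2 = 20, a statistic this far from zero is well inside the rejection region for a two-tailed test at α = 0.05.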
6. An advertising agency claims that there is no difference between the mean annual
income of doctors and the mean annual income of lawyers. Five hundred doctors and
five hundred lawyers are surveyed to find their mean annual income and the results are
recorded below. The two samples are independent. Do the results support the
advertiser’s claim at the 5% level?
Doctors: x̄1 = 125,000; s1 = 28,000; n1 = 500
Lawyers: x̄2 = 118,000; s2 = 25,000; n2 = 500
Ans: Reject the null. There is enough evidence to reject the claim that the mean annual
income of doctors and the mean annual income of lawyers are equal at the 5% level.
7. A travel agent claims that the proportion of people who develop violent diarrhea
after visiting a Latin American country is the same as the proportion of people who
develop violent diarrhea in a South American country. After polling 200 visitors of
Latin America and 150 of South America, ABC 15 investigators found that 30% of
visitors to the Latin American country developed debilitating diarrhea and 65% of
visitors to the South American country developed such diarrhea. Test the claim at the
5% level that the proportions are the same.
Ans: Reject the null hypothesis (z ≈ -6.51, well beyond the critical values ±1.96). There is
enough evidence at the 5% level to reject the claim that the proportions are equal.
Chapter 9
1. Calculate the correlation coefficient (r) and make a conclusion about the type of
correlation. The data are the number of hours 15 statistics students spent studying
for a test (X) and their scores on that test (Y).

X    0    1    1    2    4    4    5    6    3    2    5    4    3    5    6
Y    48   53   56   65   77   80   87   94   72   63   81   78   70   84   98

Ans: r = 0.989, hours spent studying and test scores have a strong positive linear
relationship.
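The value of r can be verified from the paired data with the computational formula for the correlation coefficient. The function name below is illustrative.

```python
from math import sqrt

hours = [0, 1, 1, 2, 4, 4, 5, 6, 3, 2, 5, 4, 3, 5, 6]
scores = [48, 53, 56, 65, 77, 80, 87, 94, 72, 63, 81, 78, 70, 84, 98]

def pearson_r(xs, ys):
    """Sample correlation coefficient from the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x**2) * sqrt(n * sum_y2 - sum_y**2)
    return numerator / denominator

r = pearson_r(hours, scores)
print(round(r, 3))
```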
2. Interpret the meaning of the following coefficient of determination: r² = 0.978
Ans: About 97.8% of the variation in y is explained by the regression line (the relationship
between x and y); the remaining 2.2% is unexplained.
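4. Find the equation of the linear regression line given that the line passes through
the point (10, 5):
Sy = 5.789; Sx = 10.596; r = 0.85
Ans: ŷ ≈ 0.464x + 0.356, since the slope is m = r(Sy/Sx) = 0.85(5.789/10.596) ≈ 0.464
and the intercept is b = 5 - 0.464(10) ≈ 0.356.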
5. Perform a t-test for correlation to see if there is a significant correlation between
age and number of magazine subscriptions. Use α = 0.05.

Age                       10    22    37    48    50
Magazine Subscriptions    0     2     2     3     2

Ans: Fail to reject H0 (r ≈ 0.816, t ≈ 2.446 < t0 = 3.182 with d.f. = 3). There is not
enough evidence at the 5% level to conclude that there is a significant correlation
between age and number of magazine subscriptions.
Chapter 10
1. The following table shows the price per gallon for a random sample of exterior
deck treatments. At the 10% level, is there enough evidence to show that at least one
of the treatments has a mean price different from the rest? (Hint: ANOVA test)

Semitransparent treatments    24    23    22    17    21    17
Lightly tinted treatments     51    14    21    16
Clear treatments              13    13    10    12    22

Ans: Fail to reject H0. At the 10% level, there is not enough
evidence to show that at least one of the treatments has a different mean price.
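The one-way ANOVA statistic is the ratio of the variance between samples to the variance within samples. The sketch below assumes the 15 prices split into the three treatments in the order they are listed (6, 4, and 5 values), and the critical value 2.81 is an assumption taken from an F-table (α = 0.10, d.f.N = 2, d.f.D = 12).

```python
from statistics import mean

groups = {
    "semitransparent": [24, 23, 22, 17, 21, 17],
    "lightly tinted": [51, 14, 21, 16],
    "clear": [13, 13, 10, 12, 22],
}

def one_way_anova_f(samples):
    """F statistic = (variance between groups) / (variance within groups)."""
    all_values = [x for sample in samples for x in sample]
    grand_mean = mean(all_values)
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    return (ss_between / df_between) / (ss_within / df_within)

F_CRITICAL = 2.81  # assumed F-table value: alpha = 0.10, d.f.N = 2, d.f.D = 12

f_stat = one_way_anova_f(list(groups.values()))
print(round(f_stat, 2), f_stat > F_CRITICAL)
```

The null hypothesis is rejected only when the statistic exceeds the critical value.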
2. Biologists were studying the eating habits of pythons and anacondas in the
wilderness. Eight pythons and six anacondas were observed. The variance of
time between meals for the pythons was five months. The variance of time
between meals for the anacondas was eight months. The biologists claim that the
anacondas eat more often than the pythons do. Test this claim at
α = 0.05.
Ans: Fail to reject the null hypothesis. Only the variances are given, so a two-sample
F-test on the variances applies: F = 8/5 = 1.6 < F0 = 3.97 with d.f.N = 5 and d.f.D = 7.
There is insufficient evidence at the 5% level to conclude that anacondas eat more than
pythons.
3. The following contingency table shows the results of a random sample of 550 company
CEOs classified by age and size of company. At α=0.01, can you conclude that the CEOs’
ages are related to company size?

                 Age of CEOs
Company size     39 and under    40-49    50-59    60-69    70+    Total
Small/Midsize    42              69       108      60       21     300
Large            5               18       85       120      22     250
Total            47              87       193      180      43     550

Ans: H0: The CEOs’ ages are independent of company size
Ha: The CEOs’ ages are dependent on company size
d.f. = (2-1)(5-1) = 4; α = 0.01
Critical value: χ² = 13.277
Test statistic: χ² ≈ 77.9
Reject the null hypothesis. At the 1% level, there is enough evidence to support the claim
that CEOs’ ages are dependent on company size.
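The test statistic sums (O - E)²/E over all ten cells, where each expected frequency is E = (row total)(column total)/(sample size). A minimal sketch of that calculation, with an illustrative function name and the critical value taken from the answer above:

```python
observed = [
    [42, 69, 108, 60, 21],  # small/midsize companies, by CEO age group
    [5, 18, 85, 120, 22],   # large companies, by CEO age group
]

def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

chi2, df = chi_square_independence(observed)
CHI2_CRITICAL = 13.277  # critical value quoted in the answer (alpha = 0.01, d.f. = 4)
print(round(chi2, 1), df, chi2 > CHI2_CRITICAL)
```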
4. A medical researcher claims that specially treated intravenous solution decreases the
variance of the time required for nutrients to enter the blood stream. Independent
samples from each type of solution are randomly selected, and the results are shown in
the table. At α=0.01, is there enough evidence to support the researcher’s claim?
Assume the populations are normally distributed.
Normal Solution: n = 25, s² = 180
Treated Solution: n = 20, s² = 56
b) Identify the claim and state the null and alternative hypotheses.
c) Specify the level of significance.
d) Determine the degrees of freedom for the numerator and denominator.
e) Find the critical value and identify the rejection region.
f) Use the F-test to find the F test statistic.
g) Conclude.
Ans: Let σ1² be the variance for the normal solution and σ2² the variance for the
treated solution.
H0: σ1² ≤ σ2²
Ha: σ1² > σ2² (claim)
α = 0.01
d.f. (numerator) = 24
d.f. (denominator) = 19
F0 = 2.92; the rejection region is F > 2.92
F = 180/56 ≈ 3.214
Reject the null hypothesis; the evidence supports the claim at the 1% level.