Download Chapter 3 - Statistics - Western Michigan University

Chapter 3
(1) 40 secretaries at a company had the following number of years of
secretarial experience:
10, 2, 16, 3, 6, 5, 11, 1, 14, 3, 6, 2, 1, 1, 8, 10, 18, 4, 1, 9,
13, 4, 7, 2, 9, 16, 8, 3, 5, 2, 12, 0, 6, 12, 4, 2, 13, 8, 6, 10.
Make a relative frequency histogram with 5 intervals and label the
vertical axis.
(2) The following data is for the amount purchased by 200 customers.
(Note two intervals are wider.) Draw a histogram from this information.
Amount of
purchase ($)    Frequency
 0 - 20            36
20 - 30            42
30 - 40            50
40 - 50            34
50 - 60            16
60 - 70            12
70 - 100           10
(3) The following histogram was made from 1000 observations on the
lifetimes of a product as reported by customers. The vertical axis is
density % per hour. The heights of the rectangles are given.
     |
 4.0 +---+
     |   |
     |   |
     |   |
 2.5 |   +---+
     |   |   |
     |   |   |
 1.0 |   |   +-------+
     |   |   |       |
  .5 |   |   |       +-----------+
     |   |   |       |           |
     +---+---+---+---+---+---+---+----- lifetime
     0   10  20  30  40  50  60  70     (hours)
What percent of the lifetimes exceed 40 hours?
What interval contains the median lifetime?
(4) The following data is the number of cars sold by 25 salespeople in
2003.
15,19,20,22,24,25,25,25,26,26,27,29,30,31,31,33,
35,38,40,42,50,51,60,65.
Make a histogram.
Use intervals: 10-30,30-50,50-70.
A number on a boundary is counted in the interval to the right.
(5) A mail company was concerned about the time it takes for them to
receive payment for an order. They collected the following data on the
number of days until payment was received.
10,12,17,20,21,22,22,23,23,23,25,25,25,26,27,29,30,33,35,42
Make an accurate histogram using intervals 10-20,20-30,30-40,40-50.
A number on a boundary is counted in the interval to the right.
Chapter 4
(1) Before microwave ovens are sold, the manufacturer must check to
ensure that the radiation coming through the door is below a specified
safe limit. The amounts of radiation leakage (mW/cm^2), with the
door closed, from 5 randomly selected ovens are given below.
15, 18, 9, 10, 5
What are the mean and median for the radiation leakage data
respectively?
What is the standard deviation of the radiation leakage data?
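As a quick check on hand calculations for this kind of problem, the summary statistics can be sketched with Python's standard library. Whether the course divides by n or by n - 1 when computing the SD is not stated here, so both versions are shown; the variable names are illustrative only.

```python
import statistics

# Radiation leakage readings from the problem statement (mW/cm^2).
leakage = [15, 18, 9, 10, 5]

mean = statistics.mean(leakage)
median = statistics.median(leakage)
sd_population = statistics.pstdev(leakage)  # divides by n
sd_sample = statistics.stdev(leakage)       # divides by n - 1

print(f"mean = {mean}, median = {median}")
print(f"SD (divide by n)     = {sd_population:.2f}")
print(f"SD (divide by n - 1) = {sd_sample:.2f}")
```

The two SD values differ noticeably here because the sample has only 5 observations.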
(2) 50 water samples were collected from a lake. Chloride
concentration was measured for each. The data had mean 50 and SD 10,
with roughly a symmetrical shape.
(a) About what percentage of the data will be between 40 and 60?
(b) Draw a rough (smooth) sketch of the histogram.
(3) Determine the mean, median and standard deviation of the following
sample of 3 numbers: 33, 40, 38.
(4) Consider the following 25 observations. 32, 35, 35, 36, 39, 40, 40,
40, 41, 43, 45, 48, 49, 50, 51, 54, 55, 57, 60, 61, 68, 72, 76, 85, 88.
(a) Give a dotplot on the axis below.
(b) Give a rough estimate of the standard deviation.
--|---------|---------|---------|---------|---------|---------|-----
  30        40        50        60        70        80        90
(5) In a 5 week period at a factory, the count of the number of workers
who were paid overtime wages each week was 23,60,42,77,48.
Determine the mean, median and standard deviation.
(6) Calculate the mean and standard deviation of the following
observations:
14, 15, 18, 21.
(7) Estimate, roughly, the mean and SD for the histogram below on test
scores.
(8) Weights of a group of men average 190 lbs with SD of 25 lbs.
John weighs 230 lbs so he is above average by ________ SDs.
Mark is 1.4 SDs below average so his weight is about ________ lbs.
Men with weights within 1.2 SDs of average have weights from
_______ lbs to _______ lbs.
Chapter 5
(1) The time for an emergency medical squad to arrive at the sports
center at the edge of a particular town is normally distributed
with mean 20 minutes and standard deviation 8 minutes.
What is the probability that it will take the medical squad more than 30
minutes to arrive at the sports center?
What is the arrival time below which 10% of all arrival times fall?
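Normal-curve questions like this one reduce to an area under the curve and a percentile lookup. A minimal sketch using `statistics.NormalDist` from the Python standard library, with the mean and SD taken from the problem statement (the variable names are illustrative):

```python
from statistics import NormalDist

# Arrival time in minutes: mean 20, SD 8, as stated in the problem.
arrival = NormalDist(mu=20, sigma=8)

# Area to the right of 30 minutes.
p_more_than_30 = 1 - arrival.cdf(30)

# Time below which 10% of arrivals fall (the 10th percentile).
tenth_percentile = arrival.inv_cdf(0.10)

print(f"P(arrival > 30 min) = {p_more_than_30:.4f}")
print(f"10th percentile     = {tenth_percentile:.2f} min")
```

The same two calls (`cdf` for an area, `inv_cdf` for a percentile) cover the other problems in this chapter once the mean and SD are swapped in.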
(2) The Graduate Management Admission Test (GMAT) is given for
students wishing to attend a graduate business program.
Educational Testing Service (ETS), the company that administers
the test, has found that the scores are approximately normally
distributed, with a mean of 500 and a standard deviation of 100.
If a business school admissions office uses the 75th percentile as a
cutoff point, does a score of 552 guarantee admission?
a. No, 552 is less than the 75th percentile.
b. With a normal distribution, the 75th percentile is not possible to obtain.
c. It's not clear, since 552 is the 75th percentile.
d. Yes, 552 is higher than the 75th percentile.
e. Not enough information is given for this problem.
(3) Grades on a test are normally distributed with an average of 75 and
a SD of 10. We want to pass 80% of the students.
What should the passing score be?
What percentage of students score less than 60?
(4) The times for students to finish a test follow a normal
distribution with mean 50 minutes and standard deviation 5 minutes.
(a) What percent of the students finish the test within 56 minutes?
(b) If we want 90% of the students to finish the test, how much time
should be allowed?
(5) The voltage of a battery X has a normal distribution with mean 10
and variance .25 volts. Batteries with voltages less than 9.2 volts
are defective. What is the probability that the battery is defective?
(6) The number of customer phone complaints that a company receives in
one month is normally distributed with mean 100 and standard deviation
20. What is the probability of between 90 and 125 phone complaints
next month?
Chapter 8
(1) Calculate the correlation coefficient for the following two
samples.
x | 1 3 7 9
--|-----------
y | 8 5 13 10
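The correlation coefficient can be computed directly from its definition: the sum of products of deviations, divided by the square root of the product of the two sums of squared deviations. The sketch below applies that formula to the four (x, y) pairs in the table; the helper name `correlation` is illustrative.

```python
import math

x = [1, 3, 7, 9]
y = [8, 5, 13, 10]

def correlation(xs, ys):
    """Correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

r = correlation(x, y)
print(f"r = {r:.3f}")
```

Here the deviation products sum to 24 and the squared deviations to 40 and 34, so r = 24 / sqrt(40 * 34).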
(2) In a study of how salaries are related to years of experience in a
company, five employees were selected at random and the following
information obtained:
Employee   Years of experience x   Salary ($1000) y
   1                3                     20
   2                5                     24
   3                6                     27
   4                8                     33
   5                8                     26
(a) Compute the correlation coefficient of this data.
(b) Determine the equation of the regression line. (needs chpt 12)
(3) For the plot below,
estimate the average x,
the standard deviation of x,
and the correlation coefficient.
[Figure: Plot of x vs y. The x axis runs from 20 to 80 and the y axis from 80 to 180.]
(4) Make a plot of the following 6 data points on 2 variables x and y.
x | 1 2 3 4 5 6
--|------------
y | 1 3 2 5 4 5
The largest x is ____
The smallest y is ____
The average x is ____
The SD of x is about 1 2 5 6 (choose one)
The correlation is about 0 .2 .5 (choose one)
(5) Suppose x and y have correlation of R = -.4 .
As x increases, y ______ .
Chapter 9
(1) Suppose height and weight are measured for two groups, each group
having 100 men:
Group I:  Choose 100 men from a European background.
Group II: Choose 100 men from a Mexican background.
Each group has a correlation of about .70.
If the combined data for the 200 men is plotted, do you think the
correlation will be lower, about equal, or higher than .70?
Draw a figure to explain your reasoning.
(2) Researchers studying a large group of college men found a
correlation of -0.3 between GPA and hours per week playing computer
games. They concluded that time spent playing computer games reduces
GPA. What do you think of this conclusion?
(3) A beach refreshment stand wanted to measure daily drink sales
versus temperature, so as to help predict weekly supply needs, and so
it randomly selected 13 days from the previous summer and recorded
drink sales and high temperature for the day.
High Temp        88  94  84  96 100  95  90  84  83  90  77  88  81
Drink Sales($)  373 390 338 357 296 357 381 368 340 418 239 390 319
[Figure: scatter plot of drink sales ($200 to $450) versus high temperature (75 to 105).]
From the data, the following statistics were measured
                        X        Y
Average                88.5    351.2
Standard deviation      6.36    44.86
r (correlation)         .36
The manager asserts that, since the correlation is positive, anytime
the next day’s temperature is predicted to be higher than the current
day, they should expect to sell more drinks, and if the predicted
temperature is lower, they should expect to sell fewer. What do you
think of this conclusion? Explain.
(4) The president of an international agricultural company wants to see
what relationship exists between the average monthly temperature and
total production per acre on certain crops. When looking at the
numbers supplied, she notices that the correlation measurement from the
European based operations is calculated using degrees Celsius and
kilograms, while the same American based number is calculated in
degrees Fahrenheit and pounds. Wanting to make sure that she is
“comparing apples to apples” she calls in her data people and tells
them to get her the correlation numbers using the same units. The data
people defend the numbers saying it doesn’t matter which units are
used, the correlation calculation will be the same. Do you agree?
Explain.
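One way to explore the units question is to compute a correlation on one data set twice, once in metric units and once after converting to Fahrenheit and pounds. The sketch below does this with made-up numbers (the temperature and yield values are hypothetical, chosen purely for illustration, and the helper `correlation` is not from the problem set).

```python
import math

def correlation(xs, ys):
    """Correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical monthly readings, invented for this illustration only.
temp_c = [10.0, 14.0, 18.0, 22.0, 26.0]        # degrees Celsius
yield_kg = [310.0, 355.0, 420.0, 405.0, 470.0]  # kilograms per acre

# The same readings expressed in Fahrenheit and pounds.
temp_f = [c * 9 / 5 + 32 for c in temp_c]
yield_lb = [kg * 2.20462 for kg in yield_kg]

r_metric = correlation(temp_c, yield_kg)
r_us = correlation(temp_f, yield_lb)
print(f"metric units: r = {r_metric:.6f}")
print(f"US units:     r = {r_us:.6f}")
```

This only shows that converting units leaves the correlation of a given data set unchanged; the European and American operations are different data sets, so their correlations can still differ for other reasons.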
(5) A study is performed in which a random sample of adults (18yrs+),
taken at a chain of farm/gardening supply stores, were required to take
a short physical strength test and asked how many hours per week they
spend gardening. The strength test, which consisted of several
different measurements involving upper and lower body activities, was
recorded as a sum of all these measurements. In addition, a short
list of personal questions, such as age, weight, and occupation, were
also recorded.
A correlation coefficient of R=.1 was measured between
gardening hours and strength. The board of the chain of stores
concludes that there is not a strong relation between gardening hours
and strength and proposes to let the study rest and not pursue it any
further. Would you agree? If not, what further steps might you take
with the data?
Chapter 10
(1) For a group of college freshmen the correlation between x = high
school GPA and y = college GPA is R = .3 .
A student with x at the 80th percentile is predicted to have college
GPA at the ________ percentile.
A student with x at the 10th percentile is predicted to have college
GPA at the ________ percentile.
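A percentile prediction of this kind can be sketched in three steps: convert the x percentile to standard units, multiply by R to get the predicted y in standard units, and convert back to a percentile. The sketch assumes both GPAs follow the normal curve reasonably well, which the problem does not state explicitly; the function name is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1
r = 0.3                    # correlation given in the problem

def predicted_percentile(x_percentile, r):
    """Predicted y percentile, assuming both variables are roughly normal."""
    z_x = std_normal.inv_cdf(x_percentile / 100)  # x in standard units
    z_y = r * z_x                                 # regression estimate for y
    return 100 * std_normal.cdf(z_y)

p80 = predicted_percentile(80, r)
p10 = predicted_percentile(10, r)
print(f"x at 80th percentile -> y near the {p80:.0f}th percentile")
print(f"x at 10th percentile -> y near the {p10:.0f}th percentile")
```

Both predictions land closer to the 50th percentile than the starting point, which is the regression effect.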
(2) Regression analysis was used to study height and weight of men. A
man who is 62 inches tall is predicted to weigh 140 lbs. Therefore, a
man who weighs 140 lbs is predicted to be 62 inches tall. Is this
conclusion true or false (why)?
(3) A university math course gives an exam every Tuesday morning,
outside of lecture, which all enrolled students must take. The graded
exams, from which the scores are approximately normally distributed,
are returned Wednesday in lecture. At this point, students scoring at
or below the 50th percentile must return for another test over the same
material on Thursday morning, with the understanding that their score
for the week will be two thirds the higher and one third the lower of
the two scores. Students scoring above the 50th percentile have the
option of sticking with their Tuesday exam score, or taking the
Thursday exam with the same understanding as above as to the
calculation of their score for the week. One of the graduate students
responsible for grading these exams explained “while we can’t stop you
from retaking the exam if you scored above the 50th percentile, it
appears most people who do so are overconfident and/or don’t
sufficiently prepare themselves for the second exam, since their scores
tend to go down in percentile ranking in comparison with their first
score.” Would you agree with this graduate student? Explain.
(4) A study group in a social sciences class is assigned to study what,
if any, predictable relationship exists between household income and
total monthly expenditure on pharmacy/first aid/medical products at
their stores. They go to the two local grocery stores in town to obtain
data regarding this question. Store 1 was able to supply the average
monthly pharmacy/first aid/medical products expenditure for four
different household income levels. Store 2, on the other hand, was
able to supply the average household income level for four different
monthly expenditure levels for pharmacy/first aid/medical products. In
plotting the averages on a graph, both sets of data show a positive
correlation, but the line from the store 1 data is quite different from
that of the store 2 data. Person 1 in the study group says that this
suggests a difference in spending habits between people at different
ends of town. Person 2 says they cannot make such a conclusion yet, as
the two lines represent different measurements. Which person do you
believe? Explain.
(5) The company “UFIT” made a study of the effects of its patented
daily stretching routine on flexibility. The study consisted of
having a random sample of 1000 6th-10th grade students participate in
this daily stretching program for six weeks. A flexibility test was
administered to each test subject just before, and at the end of, the
six week program. The large sample size produced an extremely wide
range of “before” flexibility test scores, and the “before” and “after”
test score pairs yielded an unusually high correlation coefficient of
0.9.
UFIT introduced its program to employees at a large corporation,
claiming that, upon completion of its six week program, it could fairly
accurately predict the increase in an employee’s score on the
flexibility test, citing the above study. Does UFIT have strong
evidence for this claim? Explain.
Chapter 11
(1) A study of taxi drivers in a large Chicago company showed
x = gas mileage 1st month (mpg) mean = 25, SD = 3
y = gas mileage for 2nd month (mpg) mean = 27, SD = 2
with a correlation of R = .8 .
(a) Estimate the average gas mileage in the 2nd month for drivers who
get 23 mpg in the 1st month.
(b) For drivers who get 23 mpg in the 1st month, give an interval
containing 95% of their gas mileages for the 2nd month.
(c) Drivers at the 80th percentile in the 1st month tend to be at the
80th percentile in the 2nd month. True or false? Explain.
(d) Give the equation of the regression line. (needs chpt 12)
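Parts (a), (b) and (d) all follow from the five summary numbers. The sketch below uses slope = R x SD_y / SD_x, an r.m.s. error of sqrt(1 - R^2) x SD_y, and a range of plus or minus 2 r.m.s. errors as the rough 95% interval; the choice of 2 rather than 1.96 is an assumption about the course convention.

```python
import math

# Summary statistics from the problem statement.
mean_x, sd_x = 25, 3   # 1st month mpg
mean_y, sd_y = 27, 2   # 2nd month mpg
r = 0.8

slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

x0 = 23
predicted = intercept + slope * x0

# Spread of 2nd-month mileage around the regression line.
rms_error = math.sqrt(1 - r ** 2) * sd_y
low = predicted - 2 * rms_error
high = predicted + 2 * rms_error

print(f"regression line: y = {intercept:.2f} + {slope:.3f} x")
print(f"estimate at x = 23: {predicted:.2f} mpg")
print(f"rough 95% range: {low:.2f} to {high:.2f} mpg")
```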
(2) For a certain population of people suppose x = income in 2000 and
y = income in 2007. (in $1000 units). Data showed the following:
x has mean 30 and SD 10
y has mean 35 and SD 12
the correlation is R = .6
We want to predict y for those people with x = 35. Give the prediction
and also give a 90% range (an interval which contains about 90% of the
2007 incomes for people having 2000 incomes of 35).
(3) Suppose we want to predict y = final test score based on knowledge
of x = entrance test score for students in freshman chemistry classes.
We have data from 50 students last semester on x and y which showed
x has mean 60 and SD 15
y has mean 70 and SD 12
the correlation is R = .7
Give a prediction of the final test score for a student who scores 80
on the entrance test and give an interval which should contain the
final test score with a 95% chance.
(4) Data on n = 30 fish of a certain species was collected to study the
relationship between x = length and y = weight. Calculations yielded
x-bar = 18 inches
sum{ (xi - x-bar)^2 } = 840
y-bar = 9.1 lbs
sum{ (yi - y-bar)^2 } = 580
sum{ (xi - x-bar)(yi - y-bar) } = 450
(a) Predict the weight of a fish of length 22 inches.
(b) Give a 95% interval estimate for the average weight of 22 inch
fish.
(5) Below are graphs of residuals vs. fitted values from linear
regression fits of various data sets. Are there any that indicate that
linear regression fits may not be appropriate? Why?
[Figure: four "Versus Fits" plots of Residual against Fitted Value, labeled A, B, C and D (responses C10, C11, C7 and C6).]
(6) A hospital wants to know what, if any, relationship exists between
the average number of overtime hours per week a nurse works and the
number of “personal” days they take each year. The manager in
personnel gets a call one morning from the chief financial officer’s
office with the request for this information. In particular, the
manager is told “we need to know ASAP if we can have a regression of
“personal” days taken on average overtime hours with an r.m.s. error no
greater than two days.” Quickly looking at the available data on his
computer screen, the manager sees the following summary statistics:
Average over all nurses’ overtime hours/week=8.73 SD=4.02 N=81
Average # of “personal” days over all nurses=3.1 SD=1.97 N=81
No correlation between the two is currently available.
The manager immediately assures the caller that, in fact, such a
regression is obtainable, and should be available by late afternoon if
not sooner. Is the manager justified in his confidence? Explain.
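The r.m.s. error of a regression depends on the correlation through the factor sqrt(1 - r^2), so it can be tabulated for a few candidate correlations even though the actual one is unknown. A small sketch, assuming the r.m.s. error is computed as sqrt(1 - r^2) x SD_y; the trial values of r are arbitrary.

```python
import math

sd_days = 1.97  # SD of "personal" days, from the summary statistics

def rms_error(r, sd_y):
    """r.m.s. error of the regression of y on x for correlation r."""
    return math.sqrt(1 - r ** 2) * sd_y

# r = 0 gives the largest possible r.m.s. error.
worst_case = rms_error(0.0, sd_days)

# Arbitrary trial correlations; the sign of r does not matter.
errors = {r: rms_error(r, sd_days) for r in (0.0, 0.3, 0.6, 0.9)}
for r, err in errors.items():
    print(f"r = {r:.1f}: r.m.s. error = {err:.2f} days")
```

Since sqrt(1 - r^2) never exceeds 1, the r.m.s. error can never exceed the SD of the "personal" days.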
Chapter 12
(1) To study the relationship of x = age of sunfish and y = weight of
sunfish, a random sample of 25 fish were caught in a lake and the
following data was recorded (x in years, y in ounces):
x has mean 6 and SD 3 (years)
y has mean 14 and SD 2 (oz)
the correlation is R = .7
Determine the regression equation and give a 90% interval estimate of
the average weight of sunfish 4 years old.
(2) A compact car was used in an experiment to study the effect of
weight in the car on gas mileage. Mileage was determined with 5
different weights as follows.
x  Weight (100 lbs) | 25    27    29    31    33
y  Mileage (m/y)    | 38.1  36.2  37.5  36.9  35.3
Determine the regression line and RMS. (Mileage is the dependent
variable.) These calculations may help:
x-bar = 29
y-bar = 36.8
sum(x - x-bar)^2 = 40
sum(y - y-bar)^2 = 4.8
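A least-squares line can be fitted directly from the five data pairs. The sketch below assumes the pairing (25, 38.1), (27, 36.2), (29, 37.5), (31, 36.9), (33, 35.3), which reproduces the stated x-bar, y-bar and sums of squares, and it divides the sum of squared residuals by n when computing the r.m.s. error; a course that divides by n - 2 would get a larger value.

```python
import math

weight = [25, 27, 29, 31, 33]             # x, in units of 100 lbs
mileage = [38.1, 36.2, 37.5, 36.9, 35.3]  # y, the dependent variable

n = len(weight)
x_bar = sum(weight) / n
y_bar = sum(mileage) / n
sxx = sum((x - x_bar) ** 2 for x in weight)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, mileage))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# r.m.s. size of the residuals (dividing by n, an assumed convention).
residuals = [y - (intercept + slope * x) for x, y in zip(weight, mileage)]
rms_error = math.sqrt(sum(e ** 2 for e in residuals) / n)

print(f"mileage = {intercept:.3f} + ({slope:.3f}) * weight")
print(f"r.m.s. error = {rms_error:.3f}")
```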
(3) A personnel manager for a large corporation feels that there may be
a relationship between absenteeism and age and would like to use the
age of a worker to develop a model to predict the number of days absent
during a calendar year. A random sample of 80 workers was selected and
a regression analysis was performed. Use the following Minitab
regression output to answer the questions that follow.
MTB > regress 'days' 1 'age';
The regression equation is
days = 21.6 - 0.268 age
rms = 1.615
What is the intercept of the regression line?
Give a one sentence interpretation of the slope of the regression line.
John Smith is 40 years old - give a 95% prediction interval for the
number of days he will be absent in the next year.
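The Minitab output gives everything needed for a rough prediction interval: plug the age into the fitted line, then go about 2 r.m.s. errors either side. This is a sketch under that rule of thumb; it ignores the extra uncertainty from estimating the line itself, which a full prediction interval would include.

```python
# Values read from the Minitab output above.
intercept = 21.6
slope = -0.268
rms = 1.615

age = 40
predicted_days = intercept + slope * age

# Rough 95% range: prediction plus or minus 2 r.m.s. errors (assumed rule).
low = predicted_days - 2 * rms
high = predicted_days + 2 * rms

print(f"predicted days absent at age {age}: {predicted_days:.2f}")
print(f"rough 95% prediction interval: {low:.2f} to {high:.2f} days")
```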
(4) A company wants to explore what factors affect the sales volumes of
its sales representatives. A regression analysis was performed on the
total vacation time taken each year and the total sales volume for each
of the company’s 40 sales representatives, yielding the following
regression line of total sales (in $10,000’s) on vacation time:
Total sales = 18.53 + 1.21x(total # weeks yearly vacation taken)
From this analysis, the CEO reasons that if he forces every one of
his sales reps to take an additional week of vacation, he can increase
company sales, on average, by
(# of sales reps.)x(average increase in sales volume)x$10,000
=(40)x(1.21)x$10,000
= $484,000
Do you agree with this strategy? Explain.
(5) Data on midterm score and final score on 5 students is given.
(a) Estimate a regression line to predict final score from midterm
score. (Note partial calculations)
(b) Give RMS
midterm x | 65 75 80 85 90
final y   | 85 85 92 75 78
x-bar = 79
y-bar = 83
Chapter 16
(1) A bet on one number in roulette, like “10”, pays 35 to 1.
Make a box model for your winnings if you bet $1 on the number 10.
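A box model can be written out as a list of tickets, one per pocket on the wheel. The sketch below assumes an American wheel with 38 pockets (1 to 36, 0 and 00), which the problem does not state: one ticket pays +$35 and the other 37 tickets lose the $1 stake.

```python
import math

# Assumption: 38 pockets, so 1 winning ticket and 37 losing tickets.
box = [35] * 1 + [-1] * 37

box_average = sum(box) / len(box)
box_sd = math.sqrt(sum((t - box_average) ** 2 for t in box) / len(box))

print(f"tickets in box: {len(box)}")
print(f"average of box (expected winnings per $1 bet): {box_average:.4f}")
print(f"SD of box: {box_sd:.2f}")
```

With a 37-pocket wheel the box would hold 36 losing tickets instead, and the average would change accordingly.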
(2) Make a box model, if you can, for your winnings if you alter the bet
in (1) by
(a) betting $1 on “10”, plus $1 on “odd”.
(b) betting $1 on “10”, plus $1 on “even”.
Assume the payoff for ‘odd’ or ‘even’ is “even money”.
(3) Suppose that on 10 rolls of the roulette wheel, the numbers come up:
12 33 6 16 8 4 18 29 5 14
What is your net gain if you bet as in 2(a)?
What is your net gain if you bet as in 2(b)?
(4) Make a box model, if you can, for your winnings if you alter the bet
in (1) by betting $1 on “10” plus $1 on “black”.
(5) Your cousin is a fair shot from the free throw line, for his level
of experience, with a shooting percentage you can assume to be 50%. He
wants to make a bet with you that he can hit 14 out of 20 free throws,
or 70%, which you know to be possible but not probable. You consider a
counter bet of 44/70, less than 63%, which, although lower than 70%,
might be better according to the law of averages. Which bet, assuming
your cousin’s shooting percentage is distributed normally about the
mean=50%, is actually better for you?
Chapter 17
(1) You can draw at random a number from the bowl {1 2 3} and
you "win" the amount drawn.
In 100 draws, how much would you expect to win?
What is the standard error of the amount you win?
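For a sum of draws, the expected value is (number of draws) x (average of box) and the standard error is sqrt(number of draws) x (SD of box). A minimal sketch, assuming the 100 draws are made with replacement:

```python
import math

box = [1, 2, 3]
draws = 100  # assumed to be made at random with replacement

box_average = sum(box) / len(box)
box_sd = math.sqrt(sum((t - box_average) ** 2 for t in box) / len(box))

expected_total = draws * box_average
standard_error = math.sqrt(draws) * box_sd

print(f"expected winnings: {expected_total:.0f}")
print(f"standard error:    {standard_error:.2f}")
```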
(2) Consider
(a) the chance of between 40 and 60 heads in 100 tosses of a coin
and
(b) the chance of between 400 and 600 heads in 1000 tosses of a coin.
Which is larger, (a) or (b)? Explain your reasoning.
(3) Consider a true-false test with 100 questions. A student will
randomly guess at each question. Specify a box model for this
situation. What is the EV and SE of his score?
(2) Repeat the previous exercise, but assume the student has a 70%
chance on any question of getting it right.
(4) Consider a multiple choice test of 50 questions. Each question has
4 foils ( 4 responses to choose from ).
If a student randomly guesses at each question, set up a box model and
give the EV and SE of his score if
(i) score = number of correct responses
(ii) score = number of correct responses - (1/4) number of wrong
responses .
(5) On any day a salesman will sell 2 cars with a 20% chance, 1 car
with a 30% chance and no cars with a 50% chance. Over 30 sales days,
how many cars will he sell? To respond set up a box model and give
the EV and SE of his sales total.
(6) One hospital has 100 births and another has 200 births. Which one
has a higher chance of having more than 55% boys?
(7) Assuming boy-girl births are equally likely, what is the chance of
more than 1300 girls in the next 2500 births?
Chapter 18
(1) We believe that on the average a machine makes 15 defective parts
out of 1000. What is the probability of 6 or more defective parts in
the next 200 parts?
(2) 10% of the invoices received by a company have some error. If 200
invoices are sampled and carefully checked for error, what is the
probability 30 or more of these will have errors?
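Counts like the number of invoices with errors can be handled with the normal approximation: find the expected count and its standard error, convert the cutoff to standard units, and read off the area. The sketch below applies this to the invoice problem and uses a continuity correction (29.5 rather than 30), which is an assumed convention; without it the answer is slightly smaller.

```python
import math
from statistics import NormalDist

n, p = 200, 0.10  # sample size and error rate from the problem

expected = n * p
se = math.sqrt(n * p * (1 - p))

# "30 or more" starts at 29.5 once the continuity correction is applied.
z = (29.5 - expected) / se
prob = 1 - NormalDist().cdf(z)

print(f"expected = {expected:.0f}, SE = {se:.2f}")
print(f"z = {z:.2f}, P(30 or more) is about {prob:.3f}")
```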
(3) With its personnel and procedures, a company's invoices have an
error about 5% of the time. If 1000 invoices are selected at random
and checked for accuracy, what is the probability that the number of
invoices with errors is in the range from 40 to 60?
(4) A company is developing a new product and plans to survey potential
customers in a target area. Suppose that 30% of the people would say
"yes" to the question "would you buy it for $25". If the company takes
a random sample of 200 people, what is the chance that more than 70 of
these would say "yes"?
(5) A box contains these 4 numbers {1 2 3 4}.
100 draws will be made at random from this box.
What is the chance the sum of the numbers drawn is 270 or more?
(6) A box of numbers has an average of 25 and a standard deviation of
10. If 60 numbers are drawn at random, what is the chance the sum is
less than 2400?
Chapter 20
(1) Suppose 40% of the students at WMU prefer Pepsi to Coke. If we
take a random sample of 400 students, what is the chance that
less than 35% of them favor Pepsi to Coke?
(2) Suppose 30% of the employees at a company favor new work rules. If
100 employees are sampled at random, give an interval having 95%
probability of containing the sample proportion of employees favoring
new work rules.
(3) Suppose that 40% of the students at WMU regularly drive a car to
campus. If we take a random sample of 300 students what is the
probability that we would find between 37% and 43% of the sample drive
a car regularly to campus?
Also, give a range having 99% probability for the sample proportion of
students regularly driving a car to campus.
(4) One poll used a sample of 1000 people in a city of population
50,000. Another poll used a sample of 1000 people in a city of
population 500,000.
True or false - the 1st poll is likely to be more accurate than the 2nd.
Explain your reasoning.
(5) Western Michigan University and Kalamazoo College, having
undergraduate enrollment totals of 22756 and 1387 respectively, are
interested in comparing the number of “in-state” students they have
among undergraduates. A random sample of 350 students is drawn from
the enrollment data bases, and a student is counted as “in-state” if
they are recorded as paying in-state tuition, and estimates and SE’s
are calculated. Which estimate do you expect to be more accurate when
(a) “Draws” from the data-bases are made without replacement?
(b) “Draws” from the data-bases are made with replacement?
Chapter 21
(1) A magazine publisher wishes to estimate the true proportion(p) of
subscribers who have fixed incomes. The publisher randomly selects 400
subscribers and finds that 64 of them have fixed incomes.
Find a 99% confidence interval for the true proportion of
subscribers that have fixed incomes.
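A confidence interval for a proportion is the sample proportion plus or minus a normal-curve multiplier times the estimated standard error sqrt(p(1 - p)/n). A minimal sketch for the 99% interval, using the sample proportion in place of the unknown true proportion when estimating the SE:

```python
import math
from statistics import NormalDist

n, count = 400, 64  # sample size and number with fixed incomes

p_hat = count / n
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Multiplier leaving 0.5% in each tail, for 99% confidence.
z = NormalDist().inv_cdf(0.995)

low = p_hat - z * se
high = p_hat + z * se

print(f"sample proportion = {p_hat:.3f}, SE = {se:.4f}")
print(f"99% confidence interval: {low:.3f} to {high:.3f}")
```

Changing the argument of `inv_cdf` to 0.95 or 0.975 gives the multipliers for the 90% and 95% intervals asked for in the later problems.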
(2) Candidate B wishes to estimate the proportion of registered voters
who favor him. A random sample of 250 registered voters was selected.
There were 150 voters who favored candidate B in the sample.
Construct a 90% confidence interval for the true proportion of
registered voters who favor candidate B.
Which of the following statements is the correct interpretation
of the confidence interval in the above.
a. We are 90% confident that the true proportion lies in the interval.
b. We are 10% confident that the true proportion does not lie in the interval.
c. We are 90% confident that the given interval, before sampling occurs, will contain the true proportion.
d. We are 90% confident that the sample proportion lies in the interval.
e. All of the above are correct interpretations.
(3) Suppose that 20% of the fish in a lake are contaminated with too
high a level of mercury. If we sample 50 fish at random from the lake
and test them, what would be a range of values for the sample
proportion of fish with high mercury, say a range having 95% chance?
(4) A company wants to estimate the proportion of its cast door handles
that would pass a careful inspection. A random sample of 100 such door
handles had 86 pass the inspection. Give a 99% confidence interval for
the population proportion.
(5) A random sample of 200 students was taken at WMU. It was found
that 50 of these sampled students owned their own computer. Give
a 95% confidence interval for the proportion of students at WMU
who own their own computer.
(6) A poll was taken of 500 adult residents in a city. They found that
150 of these people had seen a new TV advertisement. Give a 99%
confidence interval for the proportion of adult residents in the city
who have seen the advertisement.
(7) A company that makes washing machines collected data from 200
customers and found that 150 of them were "very satisfied" with the
product. Give a 90% confidence interval for the proportion of
customers overall who are "very satisfied" with their washing machine.
Chapter 23
(1) The PCB level of fish from a lake has mean 20 and variance 40. We
sample 10 fish and measure their PCB level. What is the probability
that the mean of this sample is less than 15? Assume a normal
distribution.
(2) A portable instrument can take an air sample and measure the carbon
monoxide (CO) level in the air. Suppose we want to estimate the
average CO level in a city on a given day with a 95% confidence
interval. We take a CO measurement at each of 15 different locations
in the city and find the sample mean is 21.2 and the sample variance is
65.6.
Give the 95% confidence interval.
(3) The time for a bank teller to serve a customer is 3.2 min on the
average with a standard deviation of .40 min. Consider the average
service time for the next 25 customers; what is the probability this
average exceeds 3.4 min?
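For an average of n observations, the standard error is the SD divided by sqrt(n), and the normal curve then gives the chance. A minimal sketch for the bank teller problem, assuming the 25 service times behave like independent draws so that the normal approximation applies:

```python
import math
from statistics import NormalDist

mu, sigma, n = 3.2, 0.40, 25  # minutes, from the problem statement

se_mean = sigma / math.sqrt(n)

# Distribution of the sample average under the normal approximation.
average_time = NormalDist(mu=mu, sigma=se_mean)
prob = 1 - average_time.cdf(3.4)

print(f"SE of the average = {se_mean:.2f} min")
print(f"P(average > 3.4 min) is about {prob:.4f}")
```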
(4) For a population of residences in a city the monthly electric bill
averages $75 with a standard deviation of $30. If a random sample of
20 residences is selected, what is the chance that the sample
mean will exceed $85?
(5) Grade point averages of students at a college have a mean of 2.6
with a standard deviation of .6. If a random sample of 10 students is
selected,
(a)what is the probability that the sample mean is less than 2.3?
(b)Give a 95% probability range for the sample mean.
(6) A company has a list of hundreds of accounts that are past due.
The amount due for these accounts has a mean of $300 and a standard
deviation of $200.
If the company selects 25 of these accounts at random,
(a)what is the chance the mean of the sample is less than $250?
(b)Give a 95% probability range for the sample mean.
(7) For a study of houses in a target residential area they want a 95%
confidence interval for the average size of the houses (square feet of
floor space). Give the result if a random sample of 30 houses had an
average size of 1800 sq.ft. with a standard deviation of 400 sq.ft.
Chapter 26
(1) Angeles Printing failed to meet their delivery dates 20% of the
time. New management procedures have been implemented in an attempt to
reduce this rate. After 6 months, 150 orders were randomly selected,
and it was found that 18 orders had not been completed on time.
Suppose we wish to test if the failing delivery rate has been
reduced. What are the appropriate hypotheses?
a. H0: p=0.12 vs. H1: p<0.12
b. H0: p=0.20 vs. H1: p>0.20
c. H0: p=0.12 vs. H1: p>0.12
d. H0: p=0.20 vs. H1: p<0.20
e. H0: p=0.20 vs. H1: p not equal 0.20
What is the value of the test statistic, and what is your
conclusion for testing the above hypotheses if you test at alpha=0.05?
a. -2.45, Accept H0
b. -2.45, Reject H0
c. -1.65, Accept H0
d. 2.45, Accept H0
e. -1.65, Reject H0
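The z statistic for a one-sample test of a proportion compares the sample proportion with the null value, using the standard error computed under the null hypothesis. The sketch below assumes the null value is the old failure rate of 20% and the alternative is a lower rate, and takes the counts from the problem statement.

```python
import math
from statistics import NormalDist

p0 = 0.20          # failure rate under the null hypothesis (assumed setup)
n, late = 150, 18  # orders sampled and orders not completed on time

p_hat = late / n
se_null = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null

# One-sided p-value for the alternative p < 0.20.
p_value = NormalDist().cdf(z)

print(f"sample proportion = {p_hat:.2f}")
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```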
(2) A tire company guarantees that a particular brand of tire has a
mean lifetime of 42 thousand miles or more. A consumer test agency,
wishing to verify this claim, randomly selected 10 of the tires and
recorded the lifetimes. The sample mean was 41 thousand miles and
the sample standard deviation was 3.59 thousand miles. Suppose
we wish to test the claim at alpha=0.05 using the hypotheses below.
H0: mean=42 vs. H1: mean<42
What are the value of the test statistic and your conclusion?
a. -1.83, Reject H0
b. -2.79, Reject H0
c. -0.88, Reject H0
d. -1.83, Accept H0
e. -0.84, Accept H0
Give the p-value for the test using the t distribution.
a. p-value < 0.025
b. 0.025 < p-value < 0.050
c. 0.050 < p-value < 0.100
d. 0.100 < p-value < 0.250
e. p-value > 0.25
(3) Last year a class of 60 students gave Math 200 an overall rating of
3.8 on a 5 point scale. This year the course was reorganized and new
materials used, and a class of 40 students gave the class a 4.1 rating.
The Math Department concluded that the new course is an improvement
over the old from the students' point of view. Is this good evidence
for such a conclusion?
Explain!
(4) A standard medicine to treat warts has a cure rate of 65%. A new
medicine was tested on 100 patients, and 70 of these patients were
cured. Is this enough evidence to conclude that the new medicine is an
improvement?
Specify appropriate hypotheses and carry out the test.
(5) Two-year old fish raised in a pond have weighed 3.1 lbs on the
average in the past. A new diet was used on 40 fish and their average
weight at 2 years was 3.3 lbs with a standard deviation of .5 lbs. Is
this good evidence to conclude that the new diet improves weight gain?
Test appropriate hypotheses with alpha = .01.
(6) A winery wants the pH level of its wine to be 6.0. To check on this
a random sample of 10 bottles is tested. The sample mean is 6.1 with a
standard deviation of .15. Test appropriate hypotheses with alpha=.05.
Use the t test.
(7) A car manufacturer wants to show that less than 10% of its
transmissions have a customer complaint in the first 3 years
of operation. It samples 200 of its cars and finds that 12
have had such a complaint. Is this enough evidence to support
the manufacturer? Test appropriate hypotheses with significance
level alpha=.05.
(8) A population of children had an average of $75 in dental expenses
per child in 2002. Is the average the same for 2007? To get data, a
random sample of 50 children was selected and the 2007 dental expenses
for this sample averaged $85. Assume the population standard deviation
is $30. Carry out a test of the hypothesis that the 2007 average dental
expense is $75 using a significance level of 5%. (Do all the steps!)
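The steps for problem (8) can be laid out as in the Python sketch below. It assumes a two-sided alternative (H0: mean = 75 vs. H1: mean not equal to 75) and uses the z test because the population standard deviation is given.

```python
import math
from statistics import NormalDist

# Step 1: hypotheses (assumed two-sided): H0: mu = 75 vs. H1: mu != 75.
mu0 = 75.0

# Step 2: data from problem (8).
n = 50
sample_mean = 85.0
sigma = 30.0          # population standard deviation, given in the problem
alpha = 0.05

# Step 3: test statistic z = (x_bar - mu0) / (sigma / sqrt(n)).
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

# Step 4: two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: conclusion at the stated significance level.
reject_h0 = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```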
(9) A national survey showed that high school students spend 20 hours
per week watching TV on the average. A local educator, wondering
whether this is true in Kalamazoo, took a random sample of 30 high
school students and found they watched TV an average of 17 hours per
week with a standard deviation of 6 hours. Should we reject the
hypothesis that Kalamazoo high school students watch TV an average of
20 hours per week?
Hypotheses?
Test Statistic?
p-value?
Conclusion?
(10) The breaking strength of fishing line is measured by the force
needed to break a 5 foot segment. A new batch of fishing line is being
run and to test it 20 spools are selected at random and tested. The
average of these 20 breaking strengths is 5.9 lbs. The machines
produce fishing line so that the breaking strengths have a standard
deviation of .4 lbs. Test the hypothesis that the average breaking
strength of the batch is 6 lbs (the advertised value). Use
a 5% significance level.
(11) A biologist's report claimed that 30% of the salmon in Gull Lake
were infected with a fungus. A random sample of 50 salmon from the
lake was taken and 25 of these had the fungus. Does this refute the
claim? Set up appropriate hypotheses and carry out a test.
Chapter 27
(1) A study of salaries of men and women health professionals was done
to see if there was any difference in average salary. The following
data was obtained. Is it evidence of a significant difference?
                             Men      Women
Sample size                   50         50
Sample mean                33000      32000
Sample standard deviation   2000       2500
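A possible calculation for problem (1) is sketched below in Python. It assumes the two samples are independent, uses a two-sided alternative, and relies on the large-sample z approximation (both samples have 50 observations) with unpooled variances.

```python
import math
from statistics import NormalDist

# Summary data from the table in problem (1).
n_men, mean_men, sd_men = 50, 33000.0, 2000.0
n_women, mean_women, sd_women = 50, 32000.0, 2500.0

# Standard error of the difference between two independent sample means.
standard_error = math.sqrt(sd_men**2 / n_men + sd_women**2 / n_women)

# Assumed hypotheses: H0: mu_men = mu_women vs. H1: mu_men != mu_women.
z = (mean_men - mean_women) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

The problem does not state a significance level, so the p-value is left to be judged against whichever level is chosen.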
(2) 100 parts randomly selected from parts supplied by Company I had 10
defective parts, and 200 parts selected from parts supplied by Company
II had 30 defective parts. Does this data indicate that the companies
have different defective rates? Answer this by specifying appropriate
hypotheses and carrying out the test.
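The two-proportion test in problem (2) can be sketched in Python as follows. This assumes H0: p1 = p2 vs. H1: p1 not equal to p2 and uses the pooled proportion in the standard error, which is one common textbook form of the test.

```python
import math
from statistics import NormalDist

# Data from problem (2).
n1, defective1 = 100, 10   # Company I
n2, defective2 = 200, 30   # Company II

p1 = defective1 / n1
p2 = defective2 / n2

# Pooled estimate of the common defective rate under H0: p1 = p2.
p_pooled = (defective1 + defective2) / (n1 + n2)
standard_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

z = (p1 - p2) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```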
(3) The lengths of fish in a lake have a distribution with mean 10
inches and standard deviation 3 inches. If two fish are caught from
the lake, what are the mean and standard deviation of the difference
between their lengths?
(4) Acidity was measured in water samples from 8 lakes upwind of a city
and in 12 lakes downwind. Determine if acidity levels are higher
downwind if the following data resulted. Use alpha = .05.
            sample size   sample mean (pH)   sample standard deviation (pH)
upwind           8              6.5                      .3
downwind        12              6.1                      .2
Note: Something with a pH<7.0 is considered acidic, and the lower the
pH below this point, the higher the acidity.
(5) A survey was conducted this morning at WMU to study awareness of
current affairs. Last night Congress voted on a major issue. In a
sample of 100 men, 62 knew of this issue and in a sample of 100
women, 50 knew of this issue. Is this evidence of a difference
between men and women on their knowledge of this issue? Test
hypotheses with alpha =.05.
(6) Two light bulb companies are to be compared as to the lifetimes of
their 40 watt bulbs. We want to test the hypothesis that these bulbs
have the same average lifetimes against a two-sided alternative
hypothesis with alpha = .05. Carry out the testing with the following
data.
                             Company I   Company II
Sample size                      50          50
Sample mean                   650 hrs     675 hrs
Sample standard deviation      60 hrs      40 hrs
Chapter 28
(1) A company buys a certain part from 3 suppliers. In a study of the
quality of these parts, it tests 200 parts from
each supplier selected at random from their inventory. The results are
given in the following table.
                 Outcome
Supplier     Good   Defective
   1          180       20
   2          175       25
   3          190       10
Does this show a significant difference in quality level between the 3
suppliers? Test with a significance level of .05.
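The chi-square statistic for the supplier table in problem (1) can be computed as in the Python sketch below. Expected counts are taken as (row total x column total) / grand total, and the critical value 5.991 is assumed from a standard chi-square table for 2 degrees of freedom at the .05 level.

```python
# Observed counts from problem (1): one row per supplier, [good, defective].
observed = [
    [180, 20],  # supplier 1
    [175, 25],  # supplier 2
    [190, 10],  # supplier 3
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell of the table.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: chi-square table value for df = 2 at the .05 level.
CRITICAL_VALUE = 5.991
significant = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.3f}, df = {df}, significant: {significant}")
```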
(2) A random sample of 200 residents (18yrs+) of a downtown area is
questioned regarding marital status and attitude towards a proposed
shopping mall near downtown. The following data was tabulated:
                 Marital Status
Attitude       Married   Single
Favorable         73       94
Unfavorable       10       23
Is there enough evidence here to conclude that the attitude towards the
mall is dependent upon marital status? What might be done (added) to
this study to more accurately reflect any existing dependence?
(3) A random sample of 150 WMU students was questioned at 9am
regarding their opinion of the state budget deficit, and both their
response and gender were recorded, thereby allowing a chi-square test of
independence. Then, a second random sample of the same size was asked
the same questions at 2pm of the same day and recorded similarly. Can
you pool the results of these two chi-square tests? (Are they independent?)
(4) At the time the first survey was taken in (3), students of the
sample were asked to have their “significant other”, if they had one,
come by the student center and participate in the same survey at noon
that day. Can the results of the noon survey and test be pooled with
those of the 9am survey? with the 2pm survey?
(5) A computer role playing game engine is designed to randomly generate
wandering opponents (oppn) for the player with the following challenge
(level) distribution relative to the level of the player (plr):
Lvl. of oppn      plr lvl-1   plr lvl   plr lvl+1   plr lvl+2
% of opponents       30         40         20          10
During beta-testing, a random sample of 1000 wandering opponents was
recorded, with the following observed frequencies:
Lvl. of oppn      plr lvl-1   plr lvl   plr lvl+1   plr lvl+2
Frequency            280        414       189         117
Does the random generator part of the game engine need to be
reexamined?
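A goodness-of-fit calculation for problem (5) is sketched below in Python. The problem does not state a significance level, so the .05 level is an assumption here, and the critical value 7.815 is taken from a standard chi-square table for 3 degrees of freedom.

```python
# Design distribution (percent of opponents) and observed counts, problem (5).
levels = ["plr lvl-1", "plr lvl", "plr lvl+1", "plr lvl+2"]
design_percent = [30, 40, 20, 10]
observed = [280, 414, 189, 117]

total = sum(observed)

# Expected counts if the generator follows the design distribution.
expected = [total * pct / 100 for pct in design_percent]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: number of categories minus 1.
df = len(levels) - 1

# Assumption: .05 significance level; table value for df = 3.
CRITICAL_VALUE = 7.815
exceeds_critical = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.3f}, df = {df}, "
      f"exceeds critical value: {exceeds_critical}")
```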