MAT 2210 Statistics
Minitab Project #2:
Discrete and Continuous Probability
Distributions
Summer 2011 – Professor A. Jones
Due Date: _________________
Perform these tasks using Minitab. You will submit your final document that contains all answers
and requested justification. Copy and paste the required output, tables, and graphs to this
document where indicated. Part of being a good statistician is being able to make your point in a
clear, organized way with sufficient and effective evidence. Make sure the submitted document is
organized in a way that your answers are clear and easy to read. Also, be sure that you answer each
part of the question fully.
All data sources may be downloaded from my website: http://academic.pgcc.edu/~ajones. It is
suggested that you download the files using Internet Explorer.
Score
Part     Points Possible     Points Earned
A        10
B        10
C        10
Total    30
Name ______________________________________________________________
PART A: Binomial and Geometric Probability Calculations [10 points]
Minitab will also do geometric, binomial, and Poisson probability calculations in much the same way as your TI calculator.
Question #1 below gives guidance on how to obtain these values from Minitab. Practice the procedure
and confirm your answers against those provided. On questions #2 and #3, you will apply the same procedures used in
the first problem. Copy the results from the session window as documentation to show that you used Minitab to find the
answers to questions #2 and #3.
1.
(NO ANSWERS ARE REQUIRED ON QUESTION 1.) An article reported that 29% of the home mortgage loans made by
Washington Mutual, Inc. in 2008 were high cost loans (mainly sub-prime loans.) Suppose you choose five loans at
random from all loans processed by Washington Mutual in 2008.
a. Determine the probability that exactly 1 of the 5 loans was a high cost loan. This is equivalent to finding P(X = 1).
To do this on Minitab, CALC→PROBABILITY DISTRIBUTION→BINOMIAL. After selecting the PROBABILITY option,
enter the appropriate parameters (number of trials = 5 and probability of event = 0.29.) Then INPUT CONSTANT of
1 (this is the value of the random variable X for which we want the probability.) The probability that exactly one of
the five loans is a high cost loan is displayed in the session window as
x     P( X = x )
1     0.368469
There is a 36.85% chance that exactly one of the five loans is a high cost loan.
b. Determine the probability that 3 or fewer of the loans are high cost. This is equivalent to finding P(X ≤ 3). This is a
cumulative probability that includes the probabilities that X = 0, X = 1, X = 2, or X = 3. We can use Minitab’s
cumulative option to calculate this, just as on the TI calculator. To do this on Minitab, CALC→PROBABILITY
DISTRIBUTION→BINOMIAL. After selecting the CUMULATIVE PROBABILITY option, enter the appropriate
parameters (number of trials = 5 and probability of event = 0.29). Then INPUT CONSTANT of 3. The probability that
three or fewer of the five loans are high cost is displayed in the session window as
x     P( X <= x )
3     0.972840
There is a 97.28% chance that three or fewer of the five loans are high cost loans.
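As a cross-check on the Minitab session output in parts a and b, the same binomial probabilities can be sketched in a few lines of Python. This is only an illustration, not part of the Minitab procedure; the function names binomial_pmf and binomial_cdf are made up for this sketch.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial count with n trials and event probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k): add up the individual probabilities for 0, 1, ..., k."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Question 1 values: 5 loans, 29% of loans are high cost
exactly_one = binomial_pmf(1, 5, 0.29)
three_or_fewer = binomial_cdf(3, 5, 0.29)

print(f"P(X = 1)  = {exactly_one:.6f}")     # 0.368469, as in part a
print(f"P(X <= 3) = {three_or_fewer:.6f}")  # 0.972840, as in part b
```

The cumulative function simply adds the probabilities for X = 0 through X = 3, which is what the CUMULATIVE PROBABILITY option reports.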
c. Determine the probability that the first high cost loan is the fifth loan chosen. This is a geometric probability,
P(X = 5). To do this on Minitab, CALC→PROBABILITY DISTRIBUTION→GEOMETRIC. After selecting the
PROBABILITY option, enter the appropriate parameter (probability of event = 0.29; the geometric distribution has no
number-of-trials parameter). Then INPUT CONSTANT of 5 (this is the value of the random variable X for which we want
the probability). The probability that the first high cost loan is the fifth loan chosen is displayed in the session window as
x     P( X = x )
5     0.0736939
There is a 7.37% chance that the first high cost loan is the fifth loan chosen.
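The geometric probability in part c can be checked the same way. The sketch below assumes X counts the trial on which the first high cost loan appears (X = 1, 2, 3, ...), so the first four loans are not high cost and the fifth is. The function name geometric_pmf is made up for this illustration.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(first event occurs on trial k): k - 1 non-events followed by one event."""
    return (1 - p) ** (k - 1) * p

# Question 1c values: event probability 0.29, first high cost loan on the 5th draw
first_on_fifth = geometric_pmf(5, 0.29)

print(f"P(X = 5) = {first_on_fifth:.7f}")  # 0.0736939, as in part c
```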
2.
In a survey conducted by the Society for Human Resource Management, 68% of workers said that employers have the
right to monitor their telephone conversations at work. Assume that the probability of choosing a worker who agrees
with this statement is 68%. Suppose that a random sample of 20 workers is selected and asked this question. Use the
same format as in question #1 to find the answers to the following questions. Copy your results from Minitab to the
document as I have done above and write a complete sentence to answer the question.
a. What is the probability that exactly 10 of the workers agree with this statement?
b. What is the probability that 15 or fewer of the workers agree with this statement?
c. What is the probability that at least 10 of the workers agree with this statement?
d. Give the expected value and standard deviation for the distribution of the number of workers who agree with this statement.
(Note: Minitab will not do this for you. Use what you know about the binomial distribution.)
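Two ideas in this question go beyond a single menu command: an "at least" probability is found through the complement of a cumulative probability, and the mean and standard deviation of a binomial count come from the formulas np and sqrt(np(1 − p)). A minimal sketch follows, using the question 1 values (n = 5, p = 0.29) purely as an illustration rather than the values in this question; the function name binomial_cdf is made up.

```python
from math import comb, sqrt

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for a binomial count with n trials and event probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p = 5, 0.29  # question 1 values, used here only as an illustration

# "At least 2" is the complement of "1 or fewer": P(X >= 2) = 1 - P(X <= 1)
at_least_two = 1 - binomial_cdf(1, n, p)

mean = n * p                # expected value of a binomial count
sd = sqrt(n * p * (1 - p))  # standard deviation of a binomial count

print(f"P(X >= 2) = {at_least_two:.6f}")
print(f"mean = {mean:.2f}, standard deviation = {sd:.4f}")
```

Note that the complement uses k − 1, not k: for "at least 2" the cumulative probability is taken at 1.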
3.
Based on last month’s customer surveys, it is known that 4% of food orders at McDonald’s are filled incorrectly. Use
the same format as in question #1 to find the answers to the following questions. Copy your results from Minitab to
the document as I have done above and write a complete statement to answer the question.
a. What is the probability that in the first 10 orders of the day, there will be exactly one wrong order?
b. What is the probability that in the first 10 orders of the day, there will be at least one wrong order?
c. What is the probability that the first order filled incorrectly is the 10th order? (Be careful! This is not binomial.)
d. What is the probability that the first order filled incorrectly occurs on or before the 10th order?
e. How many orders can you expect before an order is filled incorrectly? (Note: Minitab will not do this for you. Use
what you know about the geometric distribution.)
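For reference, the cumulative geometric probability and the geometric mean can be sketched as follows. The example reuses the question 1 event probability (p = 0.29) only as an illustration, and it assumes X counts the trial on which the first event occurs. The function name geometric_cdf is made up.

```python
def geometric_cdf(k: int, p: float) -> float:
    """P(first event occurs on or before trial k) = 1 - P(no event in k trials)."""
    return 1 - (1 - p) ** k

p = 0.29  # question 1 value, used here only as an illustration

on_or_before_fifth = geometric_cdf(5, p)
expected_trials = 1 / p  # mean of a geometric distribution counted in trials

print(f"P(X <= 5) = {on_or_before_fifth:.6f}")
print(f"expected number of trials = {expected_trials:.2f}")
```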
PART B: Assessing Normality [10 points]
A study was recently conducted by researchers at the University of Maryland to assess whether the mean body temperature
of humans is indeed 98.6° Fahrenheit, as has long been thought. The researchers obtained the body temperatures of 93
healthy adults (n = 93), and these data are provided as the Minitab worksheet BodyTemperature. (Additional details of
the study, if you are interested, can be found in the Journal of the American Medical Association.)
We may wonder whether body temperature is normally distributed. With a sample of
n = 93, we would not expect the data to look perfectly normal. We will consider three ways to assess whether a distribution is
approximately normal.
4.
The first way is through visual inspection. We will graph the data and compare it to a normal model.
a. Open the data file BodyTemperature and calculate the basic descriptive statistics as you have done in a previous
Minitab assignment. Copy and paste these descriptive statistics below and highlight the sample mean x̄ and
sample standard deviation s of these temperature data on the document.
b. Create a SIMPLE histogram of the temperatures as you have done on an earlier assignment. In a sentence or two,
describe the distribution in terms of shape. Do not copy the graph at this time.
c. Create another histogram, but this time select the option WITH FIT. This will overlay a “perfect normal curve”
with the mean and standard deviation calculated in part a on top of the histogram to help you compare the
histogram to a perfect normal curve. Copy this graph below.
d. Comment on the normality based on this visual inspection and comparison to the perfect normal curve. Do the
data appear to follow a normal curve (approximately)?
5.
A second way to consider normality is to see how well the data from the 93 subjects fit the Empirical Rule for normal
populations.
a. We wish to see how many of the data lie within x̄ ± s, x̄ ± 2s, and x̄ ± 3s. Let’s standardize all 93 observations
and look at the z-scores. You did this in a previous assignment. You do not need to copy these z-scores, but you will
use them in part b.
b. Determine the percentage of z-scores that fall between -1 and 1. Do the same for z-scores between -2 and 2 and
between -3 and 3. It may be helpful to put them in order (sort them). (Note: If a z-score is between -1 and 1, it is
also between -2 and 2.) Comment on how closely these percentages mirror the known percentages for any normal
distribution.
6.
The third way to determine normality is to produce a normal probability plot (sometimes called a QQ plot). This is
essentially a scatter plot of the actual data against the “ideal” values from a normal distribution. If the points fall
approximately along a straight line with positive slope, then a normal distribution is a reasonable model. Points that
deviate from the straight line indicate departures from a normal model.
a. To make a normal probability plot in Minitab, GRAPH→PROBABILITY PLOT. You must select the column that
contains your data. Copy the normal probability plot below.
b. Using your normal probability plot from part a, comment on your assessment of whether the data follow an
approximately normal distribution. Does any part of the data deviate from the normal distribution more than
others?
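The second and third checks described above (z-score percentages and the normal probability plot) can be sketched in Python to show what is being computed. The readings below are made up for illustration and are NOT the BodyTemperature worksheet. The plotting position (i − 0.5)/n is one common choice and is an assumption here; Minitab's own plot may use a different formula and axis scaling.

```python
from statistics import NormalDist, mean, stdev

# Made-up readings for illustration only; not the BodyTemperature data.
temps = [97.1, 97.6, 97.9, 98.0, 98.2, 98.2, 98.4, 98.6, 98.7, 98.8, 99.0, 99.4]

# Standardize: z = (x - sample mean) / sample standard deviation
xbar, s = mean(temps), stdev(temps)
z_scores = [(x - xbar) / s for x in temps]

def percent_within(z_values, k):
    """Percentage of z-scores strictly between -k and k."""
    inside = sum(1 for z in z_values if -k < z < k)
    return 100 * inside / len(z_values)

# Compare with the Empirical Rule values of about 68%, 95%, and 99.7%.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {percent_within(z_scores, k):.1f}%")

# Normal scores: the "ideal" z-values expected from a normal sample of this size.
# Assumed plotting position: (i - 0.5) / n for the i-th smallest observation.
n = len(temps)
normal_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each pair is (ideal value, observed value); a roughly straight-line pattern
# in these points supports a normal model.
plot_points = list(zip(normal_scores, sorted(temps)))
for ideal, observed in plot_points:
    print(f"{ideal:7.3f}  {observed:5.1f}")
```

With only 12 made-up values, the percentages are coarse; with the 93 observations in the worksheet they can be compared more meaningfully with the Empirical Rule.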
PART C: Normal and Uniform Calculations [10 points]
Minitab will also do normal and uniform calculations in much the same way as your TI calculator. Question #7 below
gives guidance on how to obtain these values from Minitab. On question #8, you will apply what you have
learned in question #7 about normal procedures. On question #9, you will perform these procedures for a
uniform distribution.
7.
(NO ANSWERS ARE REQUIRED ON QUESTION 7). The length of human pregnancies from conception to birth is a
normal random variable with mean 266 days and standard deviation of 16 days.
a. What is the probability that a randomly selected woman has a pregnancy lasting less than 240 days (that’s about 8
months)? To do this on Minitab, CALC→PROBABILITY DISTRIBUTION→NORMAL. After selecting the CUMULATIVE
option, enter the appropriate mean and standard deviation. Then INPUT CONSTANT of 240 (this is the observation
x.) The probability that a woman has a pregnancy lasting less than 240 days is displayed in the session window as
x       P( X <= x )
240     0.0520813
There is a 5.21% chance that a woman has a pregnancy lasting less than 240 days.
b. What is the probability that a randomly selected woman has a pregnancy lasting between 240 and 270 days
(roughly 8 and 9 months)? To answer this question, find the areas to the left of 240 and 270 separately and then
subtract the areas to find the area between 240 and 270. (We need to subtract the values on a calculator or by
hand.) The session window displays
x       P( X <= x )
240     0.0520813
x       P( X <= x )
270     0.598706
Subtracting gives 0.598706 - 0.0520813 = 0.546625, so there is a 54.66% chance that a woman has a pregnancy
lasting between 240 and 270 days.
c. How long do the longest 20% of pregnancies last? This is an INVERSE CUMULATIVE PROBABILITY problem with
0.8 as the INPUT CONSTANT, because the longest 20% of pregnancies lie above the 80th percentile. The cut point
of the longest 20% of pregnancies is displayed in the session window as
P( X <= x )     x
0.8             279.466
The longest 20% of pregnancies last 279.466 days or longer.
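The three normal calculations in question 7 can be reproduced with Python's standard library as a cross-check on the session window values. This is a sketch for comparison only, not a replacement for the Minitab output you are asked to copy.

```python
from statistics import NormalDist

# Length of pregnancy: normal with mean 266 days and standard deviation 16 days
pregnancy = NormalDist(mu=266, sigma=16)

# Part a: area to the left of 240 (CUMULATIVE PROBABILITY)
less_than_240 = pregnancy.cdf(240)

# Part b: area between 240 and 270 = (area left of 270) - (area left of 240)
between_240_and_270 = pregnancy.cdf(270) - pregnancy.cdf(240)

# Part c: cut point with 80% of the area to its left (INVERSE CUMULATIVE PROBABILITY)
longest_20_cutoff = pregnancy.inv_cdf(0.80)

print(f"P(X < 240)       = {less_than_240:.7f}")        # 0.0520813
print(f"P(240 < X < 270) = {between_240_and_270:.4f}")  # 0.5466
print(f"80th percentile  = {longest_20_cutoff:.3f}")    # 279.466
```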
8.
The annual ground coffee expenditures for households are approximately normally distributed with a mean of $44.74
and a standard deviation of $10.00. Use the same format as in question #7 to find the answers to the following
questions. Copy your results from Minitab to the document as I have done above and write a complete statement
to answer the question.
a. What is the probability that a household spends less than $30.00 annually on coffee?
b. What is the probability that a household spends more than $60.00 annually on coffee?
c. What is the probability that a household spends between $35 and $50 annually on coffee?
d. The bottom 10% of households spend less than how much on coffee?
e. The top 25% of households spend more than how much on coffee? (This cut point is Q3.)
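Question 7 did not show a "more than" probability or a lower-tail cut point, so here is a minimal sketch of both ideas. It reuses the pregnancy distribution from question 7 (mean 266, standard deviation 16) rather than the coffee data, and the cut points 280 days and 10% are chosen only for illustration.

```python
from statistics import NormalDist

# Question 7 distribution, reused here only as an illustration
pregnancy = NormalDist(mu=266, sigma=16)

# "More than" is the complement of the cumulative probability:
# P(X > 280) = 1 - P(X <= 280)
more_than_280 = 1 - pregnancy.cdf(280)

# A "bottom 10%" cut point has 10% of the area to its left,
# so the inverse cumulative probability is taken at 0.10.
shortest_10_cutoff = pregnancy.inv_cdf(0.10)

print(f"P(X > 280) = {more_than_280:.4f}")
print(f"10th percentile = {shortest_10_cutoff:.2f} days")
```

For a "top" percentage, convert to the area on the left first, as in question 7c, where the longest 20% corresponded to 0.8.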
9.
The scheduled commuting time on MARC from Union Station to Philadelphia is supposedly 120 minutes. However, in
reality it is uniformly distributed between 110 and 135 minutes: U(110, 135). Use the same format as in question #7
to find the answers to the following questions. Copy your results from Minitab to the document as I have done and
write a complete statement to answer the question.
a. What is the probability that a train will reach Philadelphia ahead of schedule (i.e., will need less than 120 minutes
of travel time)?
b. What is the probability that a train will need between 115 and 125 minutes to reach Philadelphia?
c. What is the probability that a train will need more than 130 minutes to reach Philadelphia?
d. Determine the time so that 25% of trains will reach Philadelphia by that time.
e. Find the IQR of travel time to Philadelphia. (Hint: You have done some work to help answer this question in part
d.)
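For a uniform distribution, the cumulative probability is simply the fraction of the interval that lies to the left of x, and the inverse works backward from that fraction. The sketch below uses a hypothetical U(0, 10) distribution, not the train data, and the function names uniform_cdf and uniform_inv_cdf are made up for this illustration.

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """P(X <= x) for X uniform on [a, b]: the fraction of the interval left of x."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def uniform_inv_cdf(prob: float, a: float, b: float) -> float:
    """The value x with P(X <= x) = prob, for X uniform on [a, b]."""
    return a + prob * (b - a)

a, b = 0.0, 10.0  # hypothetical U(0, 10), chosen only for illustration

below_4 = uniform_cdf(4, a, b)                             # area left of 4
between_2_and_7 = uniform_cdf(7, a, b) - uniform_cdf(2, a, b)  # subtract two areas
above_9 = 1 - uniform_cdf(9, a, b)                         # complement for "more than"

q1 = uniform_inv_cdf(0.25, a, b)
q3 = uniform_inv_cdf(0.75, a, b)
iqr = q3 - q1  # interquartile range

print(f"P(X < 4) = {below_4:.2f}")
print(f"P(2 < X < 7) = {between_2_and_7:.2f}")
print(f"P(X > 9) = {above_9:.2f}")
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}, IQR = {iqr:.2f}")
```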