Statistics
MINITAB - Lab 6
Chebyshev's Rule and the Empirical Rule
1.
Open the Minitab worksheet called VARIABLES.MTW that we used last week. It can be
found on the online class page for this course. This worksheet contains two sets of randomly
generated variables, stored in columns C1 and C2 and named VARS1 and VARS2
respectively. If you haven’t already done so, download it now and
open it.
2.
We have two rules that aid our interpretation of the standard deviation of a set of values.

Chebyshev's Rule states that, for any value of k greater than 1, at least $1 - 1/k^2$
of the measurements in any set of data (regardless of shape) will fall within
k standard deviations of the mean.

The Empirical Rule applies only to symmetrical, mound-shaped distributions. It states
that approximately 68% of values will fall within ±1 standard deviation of the mean,
95% within ±2 standard deviations of the mean and 99.7% within ±3 standard
deviations of the mean.
Note that Chebyshev's Rule specifies the lowest proportion to be found within the intervals
whereas the Empirical Rule specifies approximate percentages/proportions within the
intervals.
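As a rough cross-check outside Minitab, the two rules can be put side by side in a few lines of Python. This is only a sketch: the function name `chebyshev_lower_bound` and the `EMPIRICAL_RULE` lookup are illustrative names chosen here, not part of the lab, and the Empirical Rule figures are simply the approximate percentages quoted above.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of values within k standard deviations of the mean.

    Follows the statement of Chebyshev's Rule above, which requires k > 1.
    """
    if k <= 1:
        raise ValueError("Chebyshev's Rule as stated here requires k > 1")
    return 1 - 1 / k**2


# Approximate proportions quoted by the Empirical Rule (mound-shaped data only).
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (2, 3):
    print(
        f"k = {k}: Chebyshev guarantees at least {chebyshev_lower_bound(k):.3f}, "
        f"Empirical Rule suggests about {EMPIRICAL_RULE[k]:.3f}"
    )
```

The Chebyshev figure is a guaranteed minimum for any data set, so it is always the smaller of the two numbers for a given k.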
To examine the spread of measurements within a data set it is often useful to standardise
them. This means that we express each measurement as the number of standard
deviations it lies above or below the mean.
Z-scores (standardised scores) for a sample are calculated by:
$$z = \frac{x - \bar{x}}{s}$$
where x is an individual measurement, $\bar{x}$ is the sample mean and s is the sample standard deviation.
Type the following commands at the Minitab command prompt.
MTB > let c3=(c1-mean(c1))/stdev(c1)
MTB > let c4=(c2-mean(c2))/stdev(c2)
Columns C3 and C4 now contain the z-scores (standardised scores) for VARS1 and
VARS2. We now wish to check how the distributions of the two sets of variables correspond
to Chebyshev's Rule and the Empirical Rule.
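For readers who want to see the same standardisation outside Minitab, here is a minimal Python sketch. The data in `sample` are made up for illustration and are not the contents of VARIABLES.MTW; the helper names `z_scores` and `proportion_within` are likewise assumptions of this sketch. It uses the sample standard deviation (n - 1 divisor), matching the definition of s given above.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise values using the sample mean and sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 divisor)
    return [(x - m) / s for x in values]


def proportion_within(z, k):
    """Proportion of z-scores lying in the interval -k <= z <= k."""
    return sum(1 for value in z if -k <= value <= k) / len(z)


# Hypothetical data for illustration only, NOT the VARS1 or VARS2 columns.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = z_scores(sample)

for k in (1, 2, 3):
    print(f"Proportion with -{k} <= Z <= {k}: {proportion_within(z, k):.3f}")
```

The proportions printed for each k are the "Actual" figures that would be compared against the two rules.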
Use the following command to get a tally with cumulative counts and percents for the two
sets of variables. NB: These variables are already sorted in ascending order of
magnitude.
MTB > TALLY C1 C2;
SUBC > CUMCOUNTS.
Now fill in the following table:
(1) Fill in the expected values for Chebyshev’s and the Empirical Rule
(2) Fill in the actual values for VARS1 and VARS2
VARS1:
Standardised Interval    Empirical (Expected)    Chebyshev's (Expected)    Actual
-1 ≤ Z ≤ 1               ________                ________                  ________
-2 ≤ Z ≤ 2               ________                ________                  ________
-3 ≤ Z ≤ 3               ________                ________                  ________
VARS2:
Standardised Interval    Empirical (Expected)    Chebyshev's (Expected)    Actual
-1 ≤ Z ≤ 1               ________                ________                  ________
-2 ≤ Z ≤ 2               ________                ________                  ________
-3 ≤ Z ≤ 3               ________                ________                  ________
How well do the distributions of measurements around the mean agree with the two rules?
_______________________________________________________________________
_______________________________________________________________________
_______________________________________________________________________
Why does VARS2 not follow the Empirical Rule?
_______________________________________________________________________
Discrete Probability Distribution
3.
Last week we looked at the Normal distribution as an example of a continuous probability
distribution. This week we will look at the binomial as an example of a discrete probability
distribution. Recall that the probability function for a binomial with n trials and p as the
probability of success is
$$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$$
and that the cumulative distribution function for the binomial is
n x
n x






p
1

p

 
x 0  x 
x
P(X  x) =
Suppose we define a success as getting a tail on a toss of a fair coin, so that the probability of
getting a tail is 0.5. We want to calculate two probabilities:
a) The probability of getting exactly 5 tails from 10 tosses of a coin.
b) The probability of getting 5 tails or fewer from 10 tosses of a coin.
The answer to a) is given in Minitab by using the probability distribution function (PDF)
command and specifying the correct n and p values.
MTB > pdf 5;
SUBC> binomial 10 .5.
What is this probability? ___________________________
The answer to b) is given by using the cumulative distribution function (CDF) command and
again specifying the correct n and p values.
MTB > cdf 5;
SUBC> binomial 10 .5.
What is this probability? _________________________
Why is the probability in b) greater than the probability in a)?
___________________________________________________________________
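To check the Minitab answers by hand, the binomial probability function and cumulative distribution function can be written directly from their formulas. The Python below is a minimal sketch using only the standard library; `binomial_pmf` and `binomial_cdf` are names chosen for this illustration and are not Minitab commands.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x): the sum of P(X = k) for k = 0, 1, ..., x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


# Coin example from the lab: 10 tosses, probability of a tail = 0.5.
n, p = 10, 0.5
exactly_five = binomial_pmf(5, n, p)    # corresponds to part a)
five_or_fewer = binomial_cdf(5, n, p)   # corresponds to part b)

print(f"P(X = 5)  = {exactly_five:.4f}")
print(f"P(X <= 5) = {five_or_fewer:.4f}")
```

Because the cumulative function adds the probabilities of 0, 1, 2, 3 and 4 tails to the probability of exactly 5 tails, its result can never be smaller than the single-value probability.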
The probability of passing the driving test at the first attempt is 65%. On a particular day a
driving school has 20 clients taking the test for the first time. The driving school advertises a
75% pass rate. Assuming that the driving school's clients have no more chance of passing the
test than anyone else, what is the probability that exactly 75% of the driving school's clients will
pass the test that day?
________ .
What is the probability that at least 75% of their clients will pass that day? ________.
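An "at least" probability is an upper tail, which the cumulative distribution function does not give directly: since 75% of 20 clients is 15, P(X ≥ 15) = 1 − P(X ≤ 14). The Python sketch below illustrates that complement step under the assumptions of the question (n = 20, p = 0.65); the helper names are again illustrative only. In Minitab terms this would presumably correspond to subtracting the result of a `cdf 14` command from 1.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x) for the same binomial."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


n, p = 20, 0.65          # 20 first-time clients, 65% chance each of passing
target = 15              # 75% of 20 clients

exactly = binomial_pmf(target, n, p)
# Upper tail via the complement: P(X >= 15) = 1 - P(X <= 14).
at_least = 1 - binomial_cdf(target - 1, n, p)

print(f"P(X = {target})  = {exactly:.4f}")
print(f"P(X >= {target}) = {at_least:.4f}")
```

Note that the complement uses 14, not 15, because the outcome X = 15 must stay in the upper tail.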
Assignment:
Which of the following is most likely:
To get an ace (i.e. a one) on a roll of a die
OR
To get at least one ace in 6 rolls of a die
OR
To get at least 2 aces in 12 rolls of a die?
REVISION SUMMARY
After this lab you should be able to:
- Understand Chebyshev’s Rule and the Empirical Rule. (Refer back to your lecture notes if
  necessary.)
- Understand what a z-score is and calculate z-scores in Minitab
- Understand the difference between a discrete and a continuous random variable
- Calculate probabilities for a binomial distribution
- Understand the difference between the cdf and pdf commands in Minitab
END