Statistics MINITAB - Lab 6
Chebyshev's Rule and the Empirical Rule

1. Open the Minitab worksheet called VARIABLES.MTW that we used last week. It can be found in the online class for this course. This worksheet contains two sets of randomly generated variables, stored in columns C1 and C2 and named VARS1 and VARS2 respectively. If you haven't already done so, download it now and open it.

2. We have two rules that aid our interpretation of the standard deviation of a set of values.

Chebyshev's Rule states that, for any value of k greater than 1, at least a proportion $1 - 1/k^2$ of the measurements in any set of data (regardless of shape) will fall within k standard deviations of the mean.

The Empirical Rule applies only to symmetrical, mound-shaped distributions. It states that approximately 68% of values will fall within 1 standard deviation of the mean, 95% within 2 standard deviations of the mean and 99.7% within 3 standard deviations of the mean.

Note that Chebyshev's Rule specifies the lowest proportion to be found within each interval, whereas the Empirical Rule specifies the approximate percentage/proportion within each interval.

To examine the spread of measurements within a data set it is often useful to standardise them. This means that we express each measurement as the number of standard deviations it lies above or below the mean. Z-scores (standardised scores) for a sample are calculated by:

$$z = \frac{x - \bar{x}}{s}$$

where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation.

Type the following commands at the Minitab command prompt.

    MTB > let c3=(c1-mean(c1))/stdev(c1)
    MTB > let c4=(c2-mean(c2))/stdev(c2)

Columns C3 and C4 now contain the z-scores (standardised scores) for VARS1 and VARS2.

We now wish to check how the distributions of the two sets of variables correspond to Chebyshev's Rule and the Empirical Rule.

Use the following command to get a tally with cumulative counts and percents for the two sets of variables. NB.
These variables are already sorted in ascending order of magnitude.

    MTB > TALLY C1 C2;
    SUBC> CUMCOUNTS.

Now fill in the following tables:
(1) Fill in the expected values for Chebyshev's Rule and the Empirical Rule.
(2) Fill in the actual values for VARS1 and VARS2.

VARS1:

    Standardised Interval   Empirical (Expected)   Chebyshev's (Expected)   Actual
    -1 ≤ Z ≤ 1              ________               X                        ________
    -2 ≤ Z ≤ 2              ________               ________                 ________
    -3 ≤ Z ≤ 3              ________               ________                 ________

VARS2:

    Standardised Interval   Empirical (Expected)   Chebyshev's (Expected)   Actual
    -1 ≤ Z ≤ 1              ________               X                        ________
    -2 ≤ Z ≤ 2              ________               ________                 ________
    -3 ≤ Z ≤ 3              ________               ________                 ________

How do the distributions of measurements around the mean agree with the two rules?
_______________________________________________________________________
_______________________________________________________________________
_______________________________________________________________________

Why does VARS2 not follow the Empirical Rule?
_______________________________________________________________________

Discrete Probability Distribution

3. Last week we looked at the Normal distribution as an example of a continuous probability distribution. This week we will look at the binomial as an example of a discrete probability distribution. Recall that the probability function for a binomial with n trials and p as the probability of success is

$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$$

and that the cumulative distribution function for the binomial is

$$P(X \le x) = \sum_{i=0}^{x} \binom{n}{i} p^i (1 - p)^{n - i}$$

Suppose we define a success as getting a tail on a toss of a fair coin, so that the probability of getting a tail is 0.5. We want to calculate two probabilities:
a) The probability of getting exactly 5 tails from 10 tosses of a coin.
b) The probability of getting 5 tails or fewer from 10 tosses of a coin.

The answer to a) is given in Minitab by using the probability distribution function (PDF) command and specifying the correct n and p values.

    MTB > pdf 5;
    SUBC> binomial 10 .5.

What is this probability?
___________________________

The answer to b) is given by using the cumulative distribution function (CDF) command and again specifying the correct n and p values.

    MTB > cdf 5;
    SUBC> binomial 10 .5.

What is this probability? _________________________

Why is the probability in b) greater than the probability in a)?
___________________________________________________________________

The probability of passing the driving test at the first attempt is 65%. On a particular day a driving school has 20 clients taking the test for the first time. The driving school advertises a 75% pass rate. Assuming that the driving school's clients have no more chance of passing the test than anyone else, what is the probability that exactly 75% of the driving school's clients (that is, 15 of the 20) will pass the test that day? ________. What is the probability that at least 75% of their clients will pass that day? ________.

Assignment: Which of the following is most likely: to get an ace (i.e. a one) on a roll of a die, OR to get at least one ace in 6 rolls of a die, OR to get at least 2 aces in 12 rolls of a die?

REVISION SUMMARY

After this lab you should be able to:
- Understand Chebyshev's Rule and the Empirical Rule. (Refer back to your lecture notes if necessary.)
- Understand what a z-score is and calculate z-scores in Minitab.
- Understand the difference between a discrete and a continuous random variable.
- Calculate probabilities for a binomial distribution.
- Understand the difference between the cdf and pdf commands in Minitab.

END
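Supplementary note: if you would like to check your Minitab output by hand, the calculations in this lab can be sketched outside Minitab. The Python below is only an illustration and is not part of the lab. The function names and the small data set are made up for the example, and it uses a different n, p and k from the exercises so that it does not give away the answers. It assumes the sample standard deviation (n - 1 divisor), which is what Minitab's stdev function computes.

```python
from math import comb
from statistics import mean, stdev


def z_scores(values):
    """Standardise each value: (x - sample mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 divisor)
    return [(x - m) / s for x in values]


def proportion_within(z, k):
    """Observed proportion of z-scores lying in the interval [-k, k]."""
    return sum(1 for v in z if -k <= v <= k) / len(z)


def chebyshev_lower_bound(k):
    """Chebyshev's Rule: at least 1 - 1/k^2 of values lie within k SDs (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule needs k greater than 1")
    return 1 - 1 / k**2


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x): sum of the binomial probabilities for 0, 1, ..., x."""
    return sum(binomial_pmf(i, n, p) for i in range(x + 1))


# Hypothetical data, invented for this illustration (not from VARIABLES.MTW).
data = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(data)

print(proportion_within(z, 1.5), ">=", chebyshev_lower_bound(1.5))
print(binomial_pmf(2, 4, 0.5))  # exactly 2 successes in 4 trials
print(binomial_cdf(2, 4, 0.5))  # 2 or fewer successes in 4 trials
```

Here binomial_pmf plays the role of the pdf command and binomial_cdf the role of the cdf command. Because the cdf adds up the pdf values for 0, 1, ..., x, it can never be smaller than the pdf value at x alone.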