Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
S1 STATISTICS SLBS THE NORMAL DISTRIBUTION This is a model for continuous random variables. P(a < X < b) a b x The graph is symmetrical about the mean μ Therefore, mode = median = mean Total area under the curve = 1 (unity) The shaded area gives the probability that X is between a and b We say that the random variable X is distributed normally with mean value μ and variance σ2 (these are the parameters of the distribution) This statement is represented by the notation: X ~ N (μ, σ2) E[X] = μ Var[X] = σ2 JMcC 1 S1 STATISTICS SLBS These two distributions have different locations (means) but the same the same dispersion (variance):- X X 1 c 2 d These two distributions have the same location (mean) but different dispersions (variances):X 1 X2 c Note:- Total area under all curves = 1 JMcC 2 S1 STATISTICS SLBS Returning to the area under the curve needed to calculate probabilities:- Standard techniques (integration) cannot be used to evaluate this area. Statistical tables have to be referred to. It would require an infinite number of such tables to cope with all possible values for the mean and variance, so all such distributions are compared with one special one:- THE STANDARD NORMAL DISTRIBUTION This has mean value 0 and variance 1 It is represented by the letter Z Z ~ N (0, 1) 0 Φ (z) = P (Z < z) z Z this is the Cumulative Distribution Function (Φ = Phi pronounced “fie”) The table on p 201 gives the values of the Normal Distribution Function JMcC 3 S1 STATISTICS SLBS HOW TO READ THE TABLES Always use a sketch Examples 1) P (Z < 0.72) = 0.7642 Z 0.72 2) P (Z > 0.72) = 1 – P(Z < 0.72) = 1 – 0.7642 = 0.2358 Z 0.72 3) P (Z > − 0.6) = P(Z < 0.6) by symmetry, this is the same as -0.6 JMcC Z = 0.7257 0.6 Z 4 S1 STATISTICS 4) SLBS P (Z < − 1.2) = P(Z > 1.2) = 1 – P(Z < 1.2) symmetry => = 1 – 0.8849 = 0.1151 5) 1.2 Z -1.2 Z P (− 0.4 < Z < 1.75) = P(Z < 1.75) – P(Z < – 0.4) = P(Z < 1.75) – [1 – P(Z < 0.4)] = 0.9599 – 1 + 0.6554 = 0.6153 -0.4 1.75 Z Ex 9A p 179 JMcC 5 S1 STATISTICS SLBS USING TABLES TO FIND z GIVEN THE PROBABILITY The tables can be read in reverse! If you know that P (Z < k) = 0.8888 Look for 0.8888 in the main table in one of the P(Z<z) columns The corresponding value of z gives the value of k z P(Z<z) 1.22 0.8888 Hence, k = 1.22 Example P(Z<z) = 0.3 find z Tables only give values of P(Z<z) from 0.5 up to 1 So z will be a negative quantity (see Example 4) P(Z<−z) = 1 − P(Z<z) = 0.7 Tables give −z = 0.52 hence z = − 0.52 PERCENTAGE POINTS TABLE This is another table that can be used and is very popular with the examiners! (on p202) This reads the main table backwards! It gives the values of z in the main table for which Z exceeds with probability p i.e. P(Z>z) = 1 − P(Z<z) = p Whenever possible this table should be used to find a value of z given a value for p = P(Z>z) JMcC 6 S1 STATISTICS SLBS Example If 30% of the values of Z are greater than z Then z = 0.5244 (This is found by looking up 0.3 under the column headed p and reading off the corresponding value of z) If told that 70% of the values of Z are greater than z Then z = − 0.5244 (This is found by considering symmetry) Note: this table enables you to deal only with certain percentages:50% 40% (+ 60%) 30% (+70%) 20% (+80%) 15% (+85%) 10% (+90%) 5% (+95%) 2.5% (+97.5%) 1% (+99%) 0.5% (+99.5%) 0.1% (+99.9%) 0.05% (+99.95%) Ex 9B p 181 JMcC 7 S1 STATISTICS SLBS STANDARDISING A NORMAL DISTRIBUTION Any Normal Distribution can be mapped onto the Standard Normal Distribution by a simple method of coding. Subtracting the mean value shifts the axis of symmetry from z = μ to the value where z = 0 Dividing by the standard deviation (σ) reduces the variance to 1 If X ~ N(μ, σ2) Then Z = X – where Z ~ N(0, 1) Given the mean and the variance, you can find the probability that X lies within any given range. Example 2 If X ~ N(14,2 ) find P(13 < X < 16) 13 – 14 16 – 14 = P < Z < 2 2 = P( – 0·5 < Z < 1) = P(Z < 1) – P(Z < – 0.5) = P(Z < 1) – [1 – P(Z < 0.5)] = 0.8413 – [1 – 0.6915] = 0.5328 -0.5 JMcC 1 Z 8 S1 STATISTICS SLBS SOME STANDARD RESULTS For any normal distribution of X, the probability that X lies within 1 standard deviation of the mean can be found: X can be mapped onto Z 0 Z P( – 1 < Z < 1) = P(Z < 1) – P(Z < – 1) = P(Z < 1) – P(Z > 1) = P(Z < 1) – [1 – P(Z < 1)] = 2P(Z < 1) – 1 = 2 0·8413 – 1 = 0·6826 68% Similarly 0 Z 0 Z P(within 2 s.d. of mean) = 2P(Z < 2) – 1 = 0·9544 95% P(within 3 s.d. of mean) = 2P(Z < 3) – 1 = 0·9974 99·7% JMcC 9 S1 STATISTICS SLBS If you are given the mean and variance and the probability that X is less than or greater than an unknown value, you can find that unknown value. Example X has a normal distribution with mean 5 and variance 0.81. Find x, such that P(X < x) = 0.01 X ~ N(5, 0.92) x – 5 P(X < x) = P Z < 0·9 = 0·01 = 0.01 x – 5 0 0.01 X 0 0.9 X 5 – x 0.9 Tables only give values of p(Z<z) from 0.5 up to 1 x – 5 5 – x Symmetry P Z > – = P Z > = 0·01 0·9 0·9 5 – x P Z < = 0·99 0·9 and tables 5 – x 0·9 = 2·32 5 – x = 2·32 0·9 x = 5 – 2·32 0·9 = 2·912 Ex 9C p 184 JMcC 10 S1 STATISTICS SLBS FINDING THE MEAN AND/OR THE VARIANCE The most difficult problems are when you are given the probability that X is less than or greater than a specific value together with the mean (or variance) and you have to calculate the variance (or mean) of the distribution. Example X is normally distributed with standard deviation 10 P(X > 40) = 0.25. Find the mean of the distribution X ~ N(μ, 102) 40 – P(X > 40) = P Z > = 0·25 0.25 40 – X 10 40 – P Z < = 0·75 10 40 – 10 = 0·675 40 – = 0·675 10 = 6·75 = 40 – 6·75 = 33·25 JMcC 11 S1 STATISTICS SLBS If you have to find both the mean and the variance, you will need two pieces of information and use simultaneous equations Example X is normally distributed P(X > 76) = 0.0354 and P(Z < 47) = 0.1 Find the mean and the variance. P(X > 76) = P Z > 0.1 0.0354 P Z < 76 – = 0.0354 76 – = 1 – 0.0354 = 0.9646 Tables 47 – 0 76 – 76 – = 1.81 76 – = 1.81 (1) X 47 – P(X < 47) = P Z < = 0·1 47 – 47 – = 0·1 P Z < – P Z > – = 0·9 – 47 P Z < = 0·9 Tables – 47 = 1·28 – 47 = 1·28 (2) (1) + (2) 29 = 3·09 = 9·39 to 2dp Subst in (2) = 59·01 to 2dp JMcC Ex 9D p 188 12 S1 STATISTICS SLBS PROBLEMS IN CONTEXT Example A lift states that its maximum safe load is 90kg. The weights of people using the building are normally distributed with mean 67kg and variance 60 kg2. The lift is meant for books and equipment only. What is the probability that someone who is tempted to use the lift exceeds the safety limit? X ~ N (67, 60) => σ = √60 67 90 X 90 – 67 P(X > 90) = P Z > 60 = P(Z > 2·97) = 1 – P(Z < 2·97) = 1 – 0·9984 = 0·0016 JMcC 13 S1 STATISTICS SLBS Example On a farm, the middle 80% of the eggs produced have masses between 45g and 70g. Assuming that the masses of the eggs are normally distributed, find their mean and standard deviation. 10% 10% 70 45 Symmetry = 45 + 70 2 X = 57·5 g P(X < 70) = 0·9 70 – 57·5 P Z < = 0·9 Tables 70 – 57·5 = 1·28 70 – 57·5 = 1·28 12·5 = 1·28 = 9·77g to 2dp JMcC 14 S1 STATISTICS SLBS Example The mass of beans in a tin is normally distributed with mean 250g and standard deviation 10g. It is a legal requirement that no more than 5% of the tins should have a mass less than that stated on the label. What mass should the label state? X ~ N(250, 102) 5% W 250 X W – 250 250 – W P(X < W) = P Z < = P Z > = 0·05 10 10 Percentage Points table 250 – W 10 = 1·6449 W = – (1·6449 10) + 250 = 233·551 The label should read 233g Mixed Ex 9D JMcC 15