
A Mathematical Skills Fundamental for the Pulp and Paper Industry

STATISTICS AND PROBABILITIES 4

Facilitator Guide

NQF Level 4
Credits: 5
Unit Standard 9015

Compiled by:
Amanda Gilfillan
Johan Els
for FIETA

Sparrow Research and Industrial Consultants © July 2005

Learning Outcomes

Upon studying this module, the learner will be able to apply his / her knowledge of statistics and probability to:
• critically interrogate and effectively communicate results
• look at samples in terms of size and representativeness
• understand what a normal distribution is
• have a basic understanding of probability

Specific Outcomes

Unit Standard 9015: Apply knowledge of statistics and probability to critically interrogate and effectively communicate findings on life-related problems
• Critique and use techniques for collecting, organising and representing data
• Use theoretical and experimental probability to develop models
• Critically interrogate and use probability and statistical models

Facilitator Guide US 9015 – Statistics and Probabilities 4
Sparrow Research and Industrial Consultants © July 2005

Table of Contents

CHAPTER 1: PROBABILITY
1 WHAT IS PROBABILITY?
2 CALCULATING PROBABILITY
2.1 Theoretical probability
2.2 Experimental probability
2.3 Subjective probability
3 NOTATION
4 ADDITIONAL RULES
4.1 Mutually exclusive events
4.2 Exhaustive
5 TREE DIAGRAMS
6 TERMS USED
6.1 Probability statements
6.2 Sample space
6.3 Joint and disjoint outcomes
6.4 Independent and dependent outcomes
7 SUMMARY

CHAPTER 2: STATISTICS
1 THE NORMAL DISTRIBUTION
1.1 Skewed and symmetrical distributions
1.2 Normal distribution curves
1.3 Characteristics of the normal curve
1.4 Skewed data
2 SAMPLE SIZE AND REPRESENTATIVENESS
2.1 Sample representativeness
2.2 Sample size
2.3 Central limit theorem
3 STATISTICS IN THE MEDIA
3.1 Misleading statistics

Annexure A – Normal distribution table

CHAPTER 1: PROBABILITY

On completion of this chapter, you will be able to:
• have a basic understanding of probability

1 WHAT IS PROBABILITY?

Probability is an attempt to quantify (put a value to) uncertainty by measuring or calculating the likelihood of some event happening or not happening. Since we need a measure that can be easily understood, probabilities are usually represented either as percentages (say, an 80% chance) or as decimals with two places, that is, as a fraction of the whole, for example: “the probability is 0,50” (out of 1,00).

Probability is the relative frequency with which a certain event will occur in the long run.

In real life, we make many decisions based on our perception of probability. People who won’t do anything unless they are certain of its success will never do anything at all.
Typical examples are:
• “This furniture should last a long time!” (before buying new dining room furniture)
• “A computer course should really get my career going!” (before deciding to enrol for new studies)
• “Rather just buy ice-cream for dessert, everybody will enjoy that” (when preparing a menu for weekend guests)
• “This fence will keep the burglars out!”

2 CALCULATING PROBABILITY

A few essential definitions:
• An event is the outcome of an experiment or trial
• The probability of an event occurring = (number of successful outcomes) / (total number of outcomes)
• Probability is measured on a scale from 0 to 1

Exercise 1 - Probabilities

1. When two football teams play against each other, how many outcomes are there?
Mostly three – win, lose or draw. When there are rules to force a result (penalty shootout, golden goal, etc.) then there are only two outcomes – win or lose.

2. What is the probability that the following is true:
a) tomorrow is Wednesday
1/7 (It is a Wednesday once every seven days)
b) the sun will shine tomorrow
½ (“shine” or “not shine” are the possible outcomes, treated here as equally likely)

3. If the probability that Susan is on time is ⅓, what is the probability that she is late?
⅔ (1 – ⅓)

4. An ordinary die is thrown. What is the probability that the number thrown is:
a) less than 3
Less than 3 is 1 and 2, so 2/6 = ⅓
b) more than 7
0
c) a factor of 6?
Factors of 6 are 1, 2, 3, 6, so 4/6 = ⅔

There are several methods of determining probability:

2.1 Theoretical probability

Theoretical probability is when we can establish probability by using previous knowledge. For example, the probability of throwing a 5 with a fair die is one out of six, or 1/6.

2.2 Experimental probability

Experimental (or empirical) probability means we have to carry out a large number of trials to determine the probability of a particular outcome. For example, to determine whether a slice of bread will land with the butter side down, we will have to drop a large number of buttered slices. By recording the outcomes we could find the probability of any one slice landing buttered side down.

2.3 Subjective probability

If it is not possible or practical to carry out a large number of trials, a subjective probability has to be formed. To determine whether it will rain on New Year’s Day, we would have to look at the weather records for many previous years. On the basis of that information it would be possible to estimate the chances that it will rain this New Year’s Day.

3 NOTATION

Consider 100 learners registered at Richardsbay Technical College, of whom 20 take mathematics courses, 15 are studying chemistry and 8 are studying both maths and chemistry. We can illustrate this information in a Venn diagram.

[Venn diagram: 12 learners take maths only, 8 take both maths and chemistry, 7 take chemistry only and 73 take neither.]

In a Venn diagram each group of objects is represented by a circle, and the numbers inside the circle represent the number of people (objects) in that particular set. So for maths that would be 12 + 8 = 20. The number in the intersection represents the number of people taking both maths and chemistry, and the number outside the circles but inside the box represents the number of people in the group who take neither maths nor chemistry.

The probability that a learner takes maths (indicated by “M”) is written as P(M), where
P(M) = 20/100 = 1/5

Similarly, the probability that a randomly chosen learner takes chemistry is
P(C) = 15/100 = 3/20

The shaded region is written as M ∩ C, and is read as M intersection C.
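The counts in the college example can be checked with a short Python sketch. The learner numbering (0 to 99) and the names `learners`, `maths`, `chemistry` and `prob` are assumptions made purely for illustration; only the group sizes come from the text.

```python
from fractions import Fraction

# Hypothetical numbering: learners 0..99. Which learner is which is an assumption;
# only the group sizes (20 maths, 15 chemistry, 8 both) come from the example.
learners = set(range(100))
maths = set(range(0, 20))        # 20 learners take maths
chemistry = set(range(12, 27))   # 15 learners take chemistry, 8 of them also take maths


def prob(event, sample_space=learners):
    """Probability = number of successful outcomes / total number of outcomes."""
    return Fraction(len(event), len(sample_space))


p_m = prob(maths)                 # P(M)
p_c = prob(chemistry)             # P(C)
p_both = prob(maths & chemistry)  # P(M ∩ C): the overlap of the two circles

print(p_m, p_c, p_both)
```

Using `Fraction` keeps the answers as exact fractions in lowest terms, which matches the way the guide writes its results.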
The probability that a learner takes both maths and chemistry is written P(M ∩ C), where
P(M ∩ C) = 8/100 = 2/25

The probability that a learner takes either maths or chemistry is written as P(M U C), and is read as M union C.
P(M U C) = (12 + 8 + 7)/100 = 27/100

We can also find the probability that a learner does not study maths from the Venn diagram. The notation P(M’) is used for “not Maths”. Thus
P(M’) = (7 + 73)/100 = 80/100 = 4/5

It is important to point out that the same result follows from:
P(M’) = 1 – P(M) = 1 – 1/5 = 4/5
This is known as complementary probability.

P(M’ ∩ C) is the probability that a learner does not study maths but does study chemistry.
P(M’ ∩ C) = 7/100

Exercise 2 – Venn diagrams

1. Draw a Venn diagram to show the following information. In a group of 40 learners, 25 play netball, 17 play hockey and 5 play both. Use your diagram to find:
a) The probability that a student chosen at random from the group will play netball.
b) The probability that the student plays either netball or hockey.
c) The probability that the student does not play netball but does play hockey.

a) P(N) = 25/40 = 5/8
b) P(N U H) = (20 + 5 + 12)/40 = 37/40
c) P(N’ ∩ H) = 12/40 = 3/10

2. You may find it useful to use a Venn diagram to answer the question. In a class of 65 learners 15 are left-handed. There are 41 girls in the class, of whom 4 are left-handed. If a student is chosen at random, calculate the probability that the student is:
a) right-handed.
b) a left-handed male.
c) a right-handed female.

a) P(L’) = 50/65 = 10/13
b) P(L ∩ G’) = 11/65
c) P(L’ ∩ G) = 37/65

3. In a row of 30 houses, three have security fences and an alarm system. Eleven houses have neither a fence nor an alarm system, and 17 have an alarm system. What is the probability that a house chosen at random has:
a) A security fence but no alarm system.
b) Either an alarm or a security fence?

Of the 30 houses, 30 – 11 = 19 have a fence, an alarm or both. Of the 17 with an alarm, 3 also have a fence, so 14 have an alarm only and 19 – 17 = 2 have a fence only.
a) P(S ∩ A’) = 2/30 = 1/15
b) P(S U A) = (14 + 3 + 2)/30 = 19/30

4. In a secondary school class there are 28 pupils. Seven are in the chess team and play snooker. There are 16 pupils involved in snooker and ten in the chess team. Find the probability that a student chosen at random:
a) is only playing chess.
b) is in either the snooker or chess team.
c) is in neither the snooker nor the chess team.

a) P(C ∩ S’) = 3/28
b) P(S U C) = (9 + 7 + 3)/28 = 19/28
c) P((S U C)’) = 9/28

5. In a road of 30 houses, 25 are known to have mobile phones. Eight of these houses have a video recorder. One house has neither the phone nor the video recorder. If one house is chosen at random what is the probability that:
a) it has a video recorder
b) it does not have a video recorder
c) it has either a video recorder or a mobile phone

a) P(V) = (8 + 4)/30 = 12/30 = 2/5
b) P(V’) = (17 + 1)/30 = 18/30 = 3/5, or 1 – P(V) = 1 – 2/5 = 3/5
c) P(V U M) = (17 + 8 + 4)/30 = 29/30

6. A group of 60 people were asked if they had watched the cricket game or the news during the past week on TV. Thirty-five said they had watched cricket, 20 said they had watched the news and 14 said they had watched neither. What is the probability that a person chosen at random watched:
a) both
b) cricket but not the news
c) either cricket or the news?
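When a question gives only group totals, as in question 6, the overlap can be recovered by counting before any probability is worked out. The sketch below is one way to check the arithmetic in Python; the variable names are illustrative assumptions, and the counts are those given in the question.

```python
from fractions import Fraction

total = 60      # people asked
cricket = 35    # watched cricket
news = 20       # watched the news
neither = 14    # watched neither

either = total - neither          # watched at least one of the two
both = cricket + news - either    # these people are counted twice when the groups are added
cricket_only = cricket - both     # watched cricket but not the news

p_both = Fraction(both, total)
p_cricket_only = Fraction(cricket_only, total)
p_either = Fraction(either, total)

print(p_both, p_cricket_only, p_either)
```

The same three lines of counting (either, both, one group only) apply to any two-circle Venn diagram problem in this exercise.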
a) P(C ∩ N) = 9/60 = 3/20
b) P(C ∩ N’) = 26/60 = 13/30
c) P(C U N) = (26 + 9 + 11)/60 = 46/60 = 23/30

4 ADDITIONAL RULES

The diagram shows two events A and B. The shaded area represents P(A U B), that is, the probability of A or B occurring. This can be found by adding the probability of A to the probability of B and subtracting the probability of both A and B occurring. This gives the important result:

P(A U B) = P(A) + P(B) – P(A ∩ B)

For the maths and chemistry example:
P(M) = 20/100; P(C) = 15/100 and P(M ∩ C) = 8/100
So P(A U B) = P(A) + P(B) – P(A ∩ B) gives
P(M U C) = 20/100 + 15/100 – 8/100 = 27/100

P(A ∩ B) is subtracted because the members of the intersection are included in both P(A) and P(B) and would otherwise be counted twice.

4.1 Mutually exclusive events

A card is chosen from a pack of 52 playing cards. If the card is a spade then it cannot be a red card. Events such as these, where two outcomes cannot occur at the same time, are said to be mutually exclusive and we have:

P(R U S) = P(R) + P(S)

4.2 Exhaustive

If P(R U S) = 1 then the events R and S are said to be exhaustive. This means that R and S between them cover every possible outcome. For example, if a playing card is chosen at random it can be either red or black. Half of the pack is red and the other half is black.

P(red card) = ½ ; P(black card) = ½
P(red U black) = ½ + ½ = 1

A card is drawn at random from an ordinary pack of 52 playing cards. Find the probability that the card is:
a) a spade or a heart
b) a spade or an ace

Solution
A card cannot be a spade and a heart at the same time, so they are mutually exclusive.
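The addition rule can also be checked by listing all 52 cards and counting. This is a minimal Python sketch, not part of the original guide; the rank and suit labels and the helper `prob` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = set(product(ranks, suits))   # 52 (rank, suit) pairs, no jokers


def prob(event):
    """Number of successful outcomes divided by the total number of outcomes."""
    return Fraction(len(event), len(deck))


spades = {card for card in deck if card[1] == "spades"}
hearts = {card for card in deck if card[1] == "hearts"}
aces = {card for card in deck if card[0] == "A"}

# Mutually exclusive events: the intersection is empty, so nothing is subtracted.
p_spade_or_heart = prob(spades | hearts)
assert p_spade_or_heart == prob(spades) + prob(hearts)

# Not mutually exclusive: the ace of spades belongs to both events.
p_spade_or_ace = prob(spades | aces)
assert p_spade_or_ace == prob(spades) + prob(aces) - prob(spades & aces)

print(p_spade_or_heart, p_spade_or_ace)
```

Counting the union directly and applying the addition rule give the same answer in both cases, which is the point of subtracting P(A ∩ B).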
a) P(S U H) = P(S) + P(H)
= 1/4 + 1/4
= 1/2
or
P(S U H) = P(S) + P(H)
= 13/52 + 13/52
= 26/52
= 1/2

b) P(S U A) = P(S) + P(A) – P(S ∩ A)
= 13/52 + 4/52 – 1/52
= 16/52
= 4/13
or
P(S U A) = P(S) + P(A) – P(S ∩ A)
= 1/4 + 1/13 – 1/52
= 4/13

Exercise 3 - Probabilities

1. A card is drawn at random from a pack of 52 playing cards. Work out the probability that:
a) The card is either a heart or a queen.
P(H U Q) = P(H) + P(Q) – P(H ∩ Q)
= 13/52 + 4/52 – 1/52
= 16/52
= 4/13
b) The card is either a heart or a diamond.
P(H U D) = P(H) + P(D) – P(H ∩ D)
= 13/52 + 13/52 – 0/52
= 26/52
= 1/2
c) The card is either red or a ten.
P(R U T) = P(R) + P(T) – P(R ∩ T)
= 26/52 + 4/52 – 2/52
= 28/52
= 7/13

2. A ten-sided die, numbered 1 to 10, is thrown. Calculate the probability that:
a) The number scored is a prime number.
Prime numbers: 2; 3; 5; 7
P(P) = 4/10 = 2/5
b) The number scored is either a prime number or a multiple of 4.
P(P U M4) = P(P) + P(M4) – P(P ∩ M4)
= 4/10 + 2/10 – 0/10
= 6/10
= 3/5
c) The number scored is either a multiple of 4 or a multiple of 3.
P(M4 U M3) = P(M4) + P(M3) – P(M4 ∩ M3)
= 2/10 + 3/10 – 0/10
= 5/10
= 1/2

3. If E and F are two events such that P(E) = ¼; P(F) = ½, and P(E ∩ F) = ⅛, find:
a) P(E U F)
P(E U F) = P(E) + P(F) – P(E ∩ F)
= 1/4 + 1/2 – 1/8
= (2 + 4 – 1)/8
= 5/8
b) P((E U F)’)
P((E U F)’) = 1 – P(E U F)
= 1 – 5/8
= 3/8

4. For each of the following pairs of events, X and Y, say whether or not they are mutually exclusive and / or exhaustive.
a) A student is chosen at random from a tutor group.
Event X: student is right-handed, Event Y: student is left-handed.
mutually exclusive and exhaustive
b) A fair die is thrown. Event X: die shows a multiple of 3, Event Y: die shows a prime number.
not mutually exclusive or exhaustive
c) A card is dealt from a pack of playing cards. Event X: the card is a spade, Event Y: the card is a King.
not mutually exclusive or exhaustive

5 TREE DIAGRAMS

Tree diagrams are very useful in solving probability problems, either where one event is repeated, or where more than one event occurs.

1. The probability that Jeffrey is late for class on any one day is 0,15 and is independent of whether he was late on the previous day. Find the probability that Jeffrey will:
a) be late on Monday and Tuesday
b) arrive on time on one of these days

Solution

Outcome             Probability
Late-Late           0,15 x 0,15 = 0,0225
Late-On time        0,15 x 0,85 = 0,1275
On time-Late        0,85 x 0,15 = 0,1275
On time-On time     0,85 x 0,85 = 0,7225

a) 0,0225 = 2,25% chance (late-late)
b) 0,1275 + 0,1275 = 0,255 = 25,5% chance (on time-late, and late-on time)

2. If we investigate the probability of drawing an ace when two cards are drawn from a pack one after the other without replacement, the tree will look like this.

Outcome                     Probability
Ace – Ace                   4/52 x 3/51 = 0,0045
Ace – Other card            4/52 x 48/51 = 0,0724
Other card – Ace            48/52 x 4/51 = 0,0724
Other card – Other card     48/52 x 47/51 = 0,8507

3. If we now extend the diagram to a third draw of cards, the probability will look like this:

Exercise 4 - Probabilities

1. Research shows that if it rains on one day in Cape Town, the probability that it will rain the next day is 30%. If it is a fair day the probability that it will rain the next day is 15%.
Use a tree diagram and determine the following: If it is a fair day on Monday what is the probability that it will rain on Wednesday?

A probability of 30% can be written as 0,3 and 15% as 0,15. If it is fair on Monday the probability that it will rain on Tuesday is 0,15 and the probability that it will be a fair day 0,85. If it does rain on Tuesday the probability that it will rain on Wednesday is 0,3 and that it will be fair 0,7. If Tuesday is fair the probability that it will rain on Wednesday is 0,15 and that it will be fair 0,85. There is therefore a 72% probability (0,85 x 0,85) that a fair Monday is followed by two more fair days. Rain on Wednesday can be reached along two branches of the tree:

P(rain on Wednesday) = 0,15 x 0,3 + 0,85 x 0,15 = 0,045 + 0,1275 = 0,1725 (about 17%)

6 TERMS USED

6.1 Probability statements

It is usual to make statements about probability in the form
P(A) = …………….
This is read as: “The probability of an event ‘A’ happening is …..”

Thus the probability of drawing a spade from a pack of cards is
P(S) = 13/52 = 0,25 (25%)
This is because there are 13 spades in a pack of cards and a pack has 52 cards.

6.2 Sample space

Sample space refers to the set of all possible outcomes. When taking cards from a pack the sample space contains 52 outcomes (without the jokers). There are 7 days in a week, so the sample space for days of the week contains 7 outcomes, while the sample space for months in a year contains 12. There are 7 grades in primary school, so that sample space also contains 7 outcomes.

6.3 Joint and disjoint outcomes

In many situations events are disjoint or mutually exclusive. If you consider picking Aces and Kings from a pack of cards, the two events are mutually exclusive. If you do pick an Ace it can never be a King, and if you pick a King it can never be an Ace. However, if the two events are a King and a Spade, it is possible to pick the King of Spades. Such events are said to be joint or non-exclusive.

6.4 Independent and dependent outcomes

The question whether events are dependent on one another sometimes arises.
With independent events the result of the one event has no effect on the outcome of another event. If we toss a coin in the air (Heads or Tails), the result of one throw has no effect on the result of the next throw. If we remove an Ace from a pack of cards we reduce the probability of finding another Ace in the pack, so the outcome is influenced by the first event. This is called a dependent outcome. The successive chances of picking an Ace would be:

Full pack              P(A) = 4/52 = 0,077
One ace removed        P(A) = 3/51 = 0,059
Two aces removed       P(A) = 2/50 = 0,040
Three aces removed     P(A) = 1/49 = 0,020

Exercise 5 - Probabilities

Where necessary, give the answer correct to three decimal places.

1. What are the chances of selecting from a full pack of cards (excluding jokers):
a) The jack of diamonds?
1/52
b) Any jack?
4/52
c) Any diamond?
13/52

2. From a new pack of cards only the aces, kings, queens, jacks and tens are taken. What are the chances of selecting from this reduced pack:
Aces = 4 ; Kings = 4 ; Queens = 4 ; Jacks = 4 ; Tens = 4
Total = 20
a) Any queen?
4/20 = 1/5
b) Any diamond?
5/20 = 1/4
c) The queen of hearts?
1/20

3. The probability that a taxi will arrive on time or late is put at 0,78. What is the probability that the taxi will arrive early?
1 – 0,78 = 0,22

4. A bag contains three green balls and five yellow balls. One ball is chosen and its colour noted before being replaced in the bag. A second ball is selected and its colour noted. With the help of a probability tree work out:
Green = 3 and Yellow = 5
Total = 8 balls
So Green = ⅜ = 0,375 and Yellow = ⅝ = 0,625
a) The probability that two green balls are chosen.
P(Green, Green) = ⅜ x ⅜ = 9/64
b) The probability that the two balls are different colours.
Two different colours can be drawn as follows: draw Green then Yellow, or draw Yellow then Green, therefore:
P(2 colours) = P(Green, Yellow) + P(Yellow, Green)
P(2 colours) = ⅜ x ⅝ + ⅝ x ⅜ = 15/64 + 15/64 = 30/64 = 15/32

5. At an activity holiday centre children choose either painting or drama as their activity on the first morning and horse-riding, football or swimming as their activity on the first afternoon. Past records show that in the morning 60% choose painting, and in the afternoon 45% choose horse-riding and 30% choose football, and that the afternoon choices are made independently of the morning choices.
a) Draw a tree diagram to illustrate the probabilities of the various choices for a child selected at random.
P(P, H) = 60% x 45% = 27%
P(P, F) = 60% x 30% = 18%
P(P, S) = 60% x 25% = 15%
P(D, H) = 40% x 45% = 18%
P(D, F) = 40% x 30% = 12%
P(D, S) = 40% x 25% = 10%
b) Use your tree diagram to find the probability that a child selected at random chooses:
i. Drama in the morning and swimming in the afternoon.
0,1
ii. Neither drama nor horse-riding.
0,18 + 0,15 = 0,33

6. In a group of 50 students, 18 take History and 26 take English. If 14 students take neither History nor English, find the probability that a student chosen at random takes:
a) Both History and English
36 students take at least one subject (50 – 14), so 18 + 26 – 36 = 8 take both.
P(H ∩ E) = 8/50 = 4/25
b) History but not English
P(H ∩ E’) = 10/50 = 1/5
c) Either History or English.
P(H U E) = 36/50 = 18/25

7 SUMMARY

1. Probability is a way of measuring the likelihood of some event taking place.
a) The probability of events A and B both taking place = P(A ∩ B)
b) The probability of event A or event B or both events occurring = P(A U B)
c) The probability of event A not taking place = P(A’)
d) The sum of the probability of event A occurring and A not occurring is equal to 1: P(A) + P(A’) = 1 so P(A’) = 1 – P(A)
e) The addition rule for probability is P(A U B) = P(A) + P(B) – P(A ∩ B)
f) For mutually exclusive events P(A U B) = P(A) + P(B) and P(A ∩ B) = 0
g) Probabilities are usually given as a percentage or as a decimal to two places (e.g. the probability is 80% or 0,80)
h) The smallest value a probability can have is 0 – the event will never happen.
i) The largest value is 1 – the event is bound to happen.
2. Statements of probabilities always refer to the long run.
3. The sample space is the set of all possible outcomes.
4. Tree diagrams and Venn diagrams are techniques for investigating probabilities.

CHAPTER 2: STATISTICS

On completion of this chapter, you will be able to:
• look at samples in terms of size and representativeness
• understand what a normal distribution is
• critically interrogate and effectively communicate results

No one can afford to be without some knowledge of statistical methods today. In any working situation you will have to deal with some form of statistics. In this unit standard we will apply our knowledge of statistics and probability to:
• critically interrogate and effectively communicate results
• look at samples in terms of size and representativeness
• understand what a normal distribution is
• have a basic understanding of probability.

1 THE NORMAL DISTRIBUTION

1.1 Skewed and symmetrical distributions

By now, you should be able to calculate averages and measures of dispersion in order to describe any data collected.
However, it is still possible to have three sets of data with the same mean and standard deviation, but completely different values. Suppose a company owns three petrol stations. Their average weekly wages are shown below:

Garage     Mean (R)     Standard Deviation (R)
A          180          10
B          180          10
C          180          10

The wage structure appears to be the same, but plots of the three frequency curves might be as follows:

The three petrol stations clearly have very different wage structures, but this was not apparent from the mean and standard deviation.

1.2 Normal distribution curves

The concept of a ‘normal’ distribution curve is a very important one in statistics. Whether we find the average by the mean, the median or the mode, we need to know how the data that make up the distribution under discussion are spread around the average chosen. Are they symmetrically or asymmetrically (skew) arranged around the average? Even if they are symmetrically arranged, are they widely spread or narrowly spread?

For example, if we take five pieces of data:
98 99 100 101 102
The average is clearly 100 (total 500 ÷ 5) and the data are closely clustered around the average.

By contrast, consider the following five pieces of data:
25 50 100 150 175
The average is still 100 (500 ÷ 5) but now the data are widely spread around the average.

Both sets of data are symmetrically distributed around the average, but the distributions are far from alike.

In any case of normal distribution the three averages must by definition coincide.
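The two five-value data sets above can be compared numerically. The sketch below uses Python's standard `statistics` module; the names `narrow` and `wide` are labels chosen here for illustration, and the population standard deviation is used as an assumed measure of spread.

```python
import statistics

narrow = [98, 99, 100, 101, 102]   # closely clustered around 100
wide = [25, 50, 100, 150, 175]     # widely spread around 100

for name, data in (("narrow", narrow), ("wide", wide)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    spread = statistics.pstdev(data)   # population standard deviation
    print(f"{name}: mean={mean}, median={median}, standard deviation={spread:.1f}")
```

Both sets share the same mean and median, so only the standard deviation reveals how differently the values are spread.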
The arithmetic mean and the central item (the median) of a symmetrically arranged distribution must coincide, while the mode, being the most frequent item, is always found at the high point of the curve and is therefore also centrally placed.

Consider an experiment in which ten truly balanced coins are tossed into the air. We would expect to have five heads and five tails, but in any actual experiment we might have other results, such as 6:4 or 7:3, or, rather more rarely, 10:0. If we repeat this experiment 100 times, we might have results as shown in this table.

Number of heads     0   1   2   3   4   5   6   7   8   9   10
Frequency           1   3   5   10  16  24  17  11  7   3   1

We could draw this set of data as a histogram (bar chart):

[Histogram: Heads and Tails Distribution; x-axis: Number of Heads; y-axis: Frequency]

If we consider the kind of curve that will result from a very large number of experiments with a large number of coins, we would expect a normal distribution curve.

1.3 Characteristics of the normal curve

The characteristics of a normal curve are the following:
• It is bell-shaped.
• It is symmetrical about the mean.
• It extends indefinitely in both directions, but in practice it is indistinguishable from the horizontal axis once we get more than three standard deviations either side of the mean.
• The parts of the curve which approach the horizontal axis are called the tails.
• If we know the mean and the standard deviation of the curve, the curve is completely determined mathematically.

A large number of 1 kg bags of sugar are weighed to check how accurately they have been filled, and the results are shown in the histogram below:

[Histogram: Weight Distribution of Sugar Bags; x-axis: Weight (kg), from 0,945 to 1,055; y-axis: Frequency]

This histogram is similar to the histograms you might obtain for a variety of different types of data, such as heights or weights of learners, analysis of a chemical component, thickness of paper, strength of tissue, etc.

The most important features of this histogram are:
• It is (almost) symmetrical about the mean.
• There are more values close to the mean than further away from the mean.

As the values of σ (standard deviation) and μ (mean) vary, so the shape of the curve will change, although it will always retain its characteristic bell shape.

In each example the area under the curve is equal to one. This agrees with the previous chapter, where the probabilities of all possible outcomes add up to 1. The ‘peak’ occurs at the mean μ, while the value of the standard deviation, σ, determines the spread of the curve – tall and thin or wide and flat.

It is true, however, that for all these different normal curves almost all (99,7%) of the values lie within ± 3 standard deviations of the mean, approximately 95% of all values lie within ± 2 standard deviations of the mean, and approximately 68% of all values lie within ± 1 standard deviation of the mean.

As a normal distribution curve is continuous, the probability that x lies between a and b can be calculated by finding the area under the curve between the values a and b. The probability is written as
P(a < x < b)

The area under the normal distribution curve can be calculated by converting a distribution into a “standard normal distribution”. The surface area (and therefore the probability) for the standard normal curve is given in normal distribution tables.
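The proportions of values lying within 1, 2 and 3 standard deviations of the mean can be checked numerically. The sketch below is an illustration rather than part of the guide: it relies on the known relationship between the normal curve and the error function `math.erf` in Python's standard library, and the helper name `within` is an assumption.

```python
import math


def within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s) of the mean: {within(k):.1%}")
```

The result does not depend on the particular mean or standard deviation, which is why the same three percentages apply to every normal curve.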
The standard normal distribution is the special normal distribution where the mean μ = 0 and the standard deviation σ = 1. The curve shows the standard normal distribution with its mean (μ = 0) in the middle and a range of ±3σ.

All normal distributions can be converted into the standard normal distribution by a process known as standardising. In order to use the standard normal tables it is necessary to convert a distribution to a standard normal distribution. This is done by calculating the value of z using the formula:

$$z = \frac{X - \mu}{\sigma}$$

One only needs to know the mean and standard deviation of a sample to be able to convert any value into a value on the standard normal curve. This applies, for example, to lengths and weights of learners, weight of sugar bags, chlorine concentration in water, strength and thickness of paper, etc.

The average basis weight for imported 80 g/m² paper is in actual fact 81,1 g/m², with a standard deviation of 1,3 g/m². Calculate z for basis weights of 79 g/m², 82 g/m² and 83 g/m². Also determine the z value for the mean basis weight of 81,1 g/m².

$$z = \frac{X - \mu}{\sigma} = \frac{79 - 81{,}1}{1{,}3} = -1{,}62$$

$$z = \frac{X - \mu}{\sigma} = \frac{82 - 81{,}1}{1{,}3} = 0{,}69$$

$$z = \frac{X - \mu}{\sigma} = \frac{83 - 81{,}1}{1{,}3} = 1{,}46$$

$$z = \frac{X - \mu}{\sigma} = \frac{81{,}1 - 81{,}1}{1{,}3} = 0$$

So what is the use of this conversion? Let us first investigate the meaning of the results we obtained by finding these values on the standardised normal curve:

In itself, each z value gives at least a general feeling of how far the original value is from the mean (in terms of the standard deviation). However, from the standard normal tables one can also obtain the cumulative area under the curve, which indicates the probability that a value would fall within this range.
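The four z values in the basis weight example can be reproduced with a few lines of Python. This is a minimal sketch for checking the arithmetic, not part of the original guide; the function name `z_score` and the constant names are hypothetical.

```python
def z_score(x, mean, std_dev):
    """Standardise x: distance from the mean in units of standard deviation."""
    return (x - mean) / std_dev

MEAN_BASIS_WEIGHT = 81.1  # g/m2, from the example
STD_DEV = 1.3             # g/m2, from the example

for weight in (79.0, 82.0, 83.0, 81.1):
    z = z_score(weight, MEAN_BASIS_WEIGHT, STD_DEV)
    print(f"X = {weight} g/m2 gives z = {z:.2f}")
```

A negative z simply means that the value lies below the mean.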
The surface area (and therefore probability) is written as:

Φ(z) = P(Z < z)

This means that the standard normal curve tables always give the probability of a value less than z, in other words, the area to the left of z on the curve.

Reading from the table is done by finding the correct z value on the left and top axes and then reading the corresponding Φ(z) value from the body of the table.

Find Φ(z) for z = 0,58, z = 0,46 and z = 0,03 by reading the values from the table:

Φ(0,58) = 0,71904 = 71,904%
Φ(0,46) = 0,67724 = 67,724%
Φ(0,03) = 0,51197 = 51,197%

The next step is to work through the whole process in a single example.

For the basis weight problem in the previous example, z values were calculated for basis weights of 79 g/m², 82 g/m², 83 g/m² and 81,1 g/m². Using these z values, determine the following probabilities:
a) The imported paper has a basis weight of less than 81,1 g/m².
b) The imported paper has a basis weight of less than 83 g/m².
c) The imported paper has a basis weight of less than 79 g/m².
d) The imported paper has a basis weight between 79 and 83 g/m².

From the standard normal tables one can obtain P(Z < z):

a) For X = 81,1 g/m², z = 0,00. From the table P(Z < 0,00) = 0,50 = 50%. It is to be expected that 50% of the values would be smaller than the average!

b) For X = 83 g/m², z = 1,46. From the table P(Z < 1,46) = 0,92785 = 92,785%.

c) For X = 79 g/m², z = -1,62. Since the table gives no probabilities for negative values of z, the probability has to be calculated as follows:

P(Z < z) = 1 - P(Z < |z|), which in this case is: P(Z < -1,62) = 1 - P(Z < 1,62)

where |z| is the positive value of z, which is the value looked up in the table.
Therefore, from the table:

P(Z < -1,62) = 1 - P(Z < 1,62) = 1 - 0,94738 = 0,05262 = 5,262%

d) For the imported paper to have a basis weight between 79 and 83 g/m², the area indicated in the following diagram needs to be determined:

The probability is therefore:

P(Z < 1,46) - P(Z < -1,62) = 92,785% - 5,262% = 87,523%

Find the area under the standard normal curve:
a. between 1 and 2 standard deviations from the mean
b. between 0,5 and 1,5 standard deviations from the mean.

Solution:

a. For z = 2 the table indicates Φ(2) = 0,97725, and for z = 1, Φ(1) = 0,84134. As explained before, the area between z = 1 and z = 2 is given as Φ(2) - Φ(1).

Φ(2) - Φ(1) = 0,97725 - 0,84134 = 0,13591 (13,591%)

b. For z = 1,5 the table indicates Φ(1,5) = 0,93319, and for z = 0,5, Φ(0,5) = 0,69146.

Φ(1,5) - Φ(0,5) = 0,93319 - 0,69146 = 0,24173 (24,173%)

Since the total area under the curve equals 1, we can see from the above diagram that, to find the area under the normal curve for z > 1,5, we can simply find the area under the curve for z < 1,5 and subtract this value from 1.

P(Z > 1,5) = 1 - Φ(1,5) = 1 - 0,93319 = 0,06681

Find the area under the curve which is more than:
a. 1 standard deviation above the mean
b. 2,4 standard deviations above the mean

Solution:

a. First draw a sketch to show the area required.

P(Z > 1,0) = 1 - Φ(1,0) = 1 - 0,84134 = 0,15866

b. First draw a sketch to show the area required.

P(Z > 2,4) = 1 - Φ(2,4) = 1 - 0,99180 = 0,00820

Note: The table only gives values of Φ(z) for z ≥ 0. One can work out values for z < 0 by remembering that the normal curve is symmetrical about the mean (z = 0) and using, for z < 0:

Φ(z) = 1 - Φ(-z)

Find the area under the curve that is less than:
a. 1 standard deviation below the mean
b. 1,75 standard deviations below the mean.

Solution:

a. First draw a sketch to show the area required. As can be seen, this is the same as the area more than 1 standard deviation above the mean. The formula used is:

Φ(-1,0) = 1 - Φ(1,0) = 1 - 0,84134 = 0,15866

b. On the same basis as a. above:

Φ(-1,75) = 1 - Φ(1,75) = 1 - 0,95994 = 0,04006

Exercise 6 – Normal distribution

1. Find the probability of an event occurring between 1 and 1,5 standard deviations from the mean.

The area to the left of z = 1,5 is 0,93319 and the area to the left of z = 1 is 0,84134. Therefore the area between z = 1 and z = 1,5 is:

P(1,0 < Z < 1,5) = Φ(1,5) - Φ(1,0)

From the tables: Φ(1,5) = 0,93319; Φ(1,0) = 0,84134

P(1,0 < Z < 1,5) = 0,93319 - 0,84134 = 0,09185

2. Find the probability of an event occurring which is more than 1,3 standard deviations above the mean.

First draw a sketch to show the area required.

Shaded area = P(Z > 1,3) = 1 - Φ(1,3) = 1 - 0,90320 = 0,09680

3. Find the area under the curve that lies more than 2 standard deviations below the mean.

First draw a sketch to show the area required.

P(Z < -2,0) = 1 - Φ(2,0) = 1 - 0,97725 = 0,02275

1.4 Skewed data

Although the normal distribution is an important theoretical distribution, it is unlikely to be exactly met in real life. However, a number of sets of data tend towards normality, e.g. educational data, sample quality control measurements, or some market research information, particularly where the numbers concerned are very large.

Most sets of data will display some skewness, and the reader will see that this causes some separation between the mean, the median and the mode. Since we find it helpful, when describing any set of data, to know where the 'average' is, it follows that our three averages (if they do not coincide, as they do in a normal distribution) may give us some clues as to the extent of the skewness they display. It is this observed difference between the averages which is used to measure skewness.

2 SAMPLE SIZE AND REPRESENTATIVENESS

2.1 Sample representativeness

The choice of a sample is in itself a major statistical exercise, because the type of sample must be balanced against the cost of collecting it: if the sample choice is narrowed down to lower the cost, it will not be as representative as a completely random sample. For example, an enquiry about the family expenditure of South Africans based on citizens who live in Westville will be less representative of the total population than one based on the citizens of the whole Durban area, or one expanded further to all the people staying near the KwaZulu-Natal coast. However, interviewing a thousand people in Westville or Durban is relatively easy, while interviewing a thousand people in rural areas is not so easy, and will involve greater travelling costs and much more time.

Ensuring that the sample is free from bias is not just a matter of avoiding obvious tendencies. Bias of an unconscious nature must not be included either. For example, a survey on the educational level of South Africans could not take its sample solely from universities: it should be fairly obvious that the degree of achievement would be grossly overstated.
However, if you conducted a survey by questioning the first five thousand people you encountered in Smith Street, it would not necessarily be biased, but it would not be representative of the whole population either.

In order to overcome such problems, random sampling is used. It is important to realise that although random sampling is a method of selecting a sample free of bias, it cannot guarantee that a respondent who happens to be selected is not biased: the randomly selected respondent may still have very biased views on the matter in hand.

A major problem in survey sampling is matching the studied population to the target population. As an example, after an in-line sample has been analysed in a pulping plant's chemical recovery unit, it is not available for further analysis or investigation, since it either re-enters the system or is disposed of. Samples belonging to certain strata may not be externally recognisable or extractable from their environment (e.g. certain polymers in a fibre stream). Sometimes there is a convenient list of groups in a population, which can then be used to specify the sample that is sought, but this is not always the case: there is no list of people who buy stationery.

2.2 Sample size

Before carrying out a test or survey it is often useful to be able to calculate the size of sample needed to give the mean of the population with a certain accuracy. Suppose the mean height of a population was to be quoted to the nearest centimetre, with 95% confidence. What size of sample is required?

The factors that affect the sample size are:
• The variation between different people. This can be expressed as the standard deviation. The standard deviation is therefore a function of the characteristics of the specific parameter (height, weight, hair colour, etc.) and the specific population under consideration (see graph below).
• The confidence level that we require. To be 80% sure, a much smaller sample is required than when you want to be 99% sure.
• The limit within which the variation should be established. For example, it would be much more difficult to establish the average height to within 1 mm than to within 100 mm with 99% confidence.

2.3 Central limit theorem

The central limit theorem provides us with a method to determine sample size. In fact, it is useful in a number of cases:
• Given a confidence level (e.g. 99% certainty) and limit (e.g. within 10 cm), the required sample size can be determined.
• Given a confidence level (e.g. 99% certainty) and sample size (e.g. 30 respondents), the limit can be determined.
• Given a sample size (e.g. 30 respondents) and limit (e.g. within 10 cm), the confidence level can be determined.

To determine these parameters, the following equation is used:

$$X - \mu = \frac{z\,\sigma}{\sqrt{n}}$$

X - μ = limit (distance from the mean value)
μ = mean
z = standardised score from the standard normal distribution
σ = standard deviation
n = sample size

The mean height of a population, with a standard deviation of 5 cm, was to be quoted to the nearest centimetre, with 95% confidence. What size of sample is required?

Solution:

Since the height has to be quoted to the nearest centimetre, the limit is actually 0,5 cm either side of the mean value. For a confidence level of 95%, the standardised score from the standard normal distribution tables is z = 1,96 (Φ(z) = 0,975).

$$\frac{1{,}96 \times 5}{\sqrt{n}} = 0{,}5$$

$$\sqrt{n} = \frac{1{,}96 \times 5}{0{,}5} = 19{,}6$$

$$n = 384{,}16$$

Since the sample size must be an integer, a sample size of 385 is required to be within 0,5 cm.
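The height calculation above can be written as a small reusable function. The sketch below is illustrative and not part of the original guide. It assumes the equation is rearranged to n = (zσ / limit)² and that the result is always rounded up to the next whole number; the function name `required_sample_size` is hypothetical.

```python
import math

def required_sample_size(z, std_dev, limit):
    """Smallest whole sample size n satisfying limit >= z * std_dev / sqrt(n)."""
    n_exact = (z * std_dev / limit) ** 2
    return math.ceil(n_exact)  # round up: a fraction of a respondent is impossible

# Height example: sigma = 5 cm, limit = 0,5 cm, 95% confidence (z = 1,96).
n = required_sample_size(z=1.96, std_dev=5.0, limit=0.5)
print(f"required sample size: {n}")
```

Rounding up rather than to the nearest integer is a deliberate choice: rounding down would give a limit slightly wider than the one asked for.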
From this example it can be seen that to achieve a small limit the required size of sample is quite large. A balance must be kept between the size of the limit and the size of the sample. If we enlarge the limit to within 4 cm (±2 cm), the following sample size can be calculated:

$$\frac{1{,}96 \times 5}{\sqrt{n}} = 2$$

$$\sqrt{n} = \frac{1{,}96 \times 5}{2} = 4{,}9$$

$$n = 24{,}01$$

Rounded up, a sample of 25 is sufficient. This is clearly a much more easily achievable target.

The mean fuel consumption of a fleet of 9 Toyota taxis has been determined by the owner, Mr Twala, as 10,5 km/l with a standard deviation of 1 km/l. He claims that the economy of similar Toyotas should "definitely" be between 10,0 and 11,0 km/l. What is the actual confidence level of his determination?

Solution:

The consumption limit claimed is actually 0,5 km/l either side of the mean value.

$$X - \mu = \frac{z\,\sigma}{\sqrt{n}} \quad\Rightarrow\quad 0{,}5 = \frac{z \times 1{,}0}{\sqrt{9}}$$

$$z = \frac{0{,}5 \times \sqrt{9}}{1{,}0} = 1{,}5$$

From the tables it can be seen that:

Φ(1,5) = 0,93319

The top and bottom tails are therefore each:

P(Z > 1,5) = 1 - 0,93319 = 0,06681

The top and bottom tails together are therefore:

0,06681 × 2 = 0,13362

The probability of the value being between 10 km/l and 11 km/l is therefore:

P(-1,5 < Z < 1,5) = 1 - 0,13362 = 0,86638 or 86,638%

Mr Twala is therefore a bit optimistic in using the word "definitely". He would be more accurate to say that he is 86,638% sure that similar Toyotas will achieve between 10 and 11 km/l!

Exercise 7 – Sample size

Find the size of sample required at the:
a) 90%
b) 95%
confidence level for the following distributions and intervals.

1. σ = 15 cm, interval ±8 cm

a)
$$X - \mu = \frac{z\,\sigma}{\sqrt{n}} \quad\Rightarrow\quad 8 = \frac{1{,}645 \times 15}{\sqrt{n}}$$

$$\sqrt{n} = \frac{15 \times 1{,}645}{8} = 3{,}084 \quad\Rightarrow\quad n = 9{,}51$$

Since one cannot take 9,51 samples, n ≥ 10

b)
$$X - \mu = \frac{z\,\sigma}{\sqrt{n}} \quad\Rightarrow\quad 8 = \frac{1{,}96 \times 15}{\sqrt{n}}$$

$$\sqrt{n} = \frac{15 \times 1{,}96}{8} = 3{,}675 \quad\Rightarrow\quad n = 13{,}51$$

Since one cannot take 13,51 samples, n ≥ 14

The answers below are matched to the stated values of σ and interval using the same formula, with z = 1,645 for 90% and z = 1,96 for 95%.

2. σ = 25 cm, interval 20 cm
a) n ≥ 4,23, ∴ n ≥ 5
b) n ≥ 6,0025, ∴ n ≥ 7

3. σ = 12 cm, interval 0,5 cm
a) n ≥ 1558,67, ∴ n ≥ 1559
b) n ≥ 2212,76, ∴ n ≥ 2213

4. σ = 50 cm, interval 10 cm
a) n ≥ 67,65, ∴ n ≥ 68
b) n ≥ 96,04, ∴ n ≥ 97

5. σ = 0,5 cm, interval 0,1 cm
a) n ≥ 67,65, ∴ n ≥ 68
b) n ≥ 96,04, ∴ n ≥ 97

6. σ = 22,5 cm, interval 8 cm
a) n ≥ 21,405, ∴ n ≥ 22
b) n ≥ 30,39, ∴ n ≥ 31

7. σ = 1000 cm, interval 50 cm
a) n ≥ 1082,41, ∴ n ≥ 1083
b) n ≥ 1536,64, ∴ n ≥ 1537

8. σ = 0,1 cm, interval 0,001 cm
a) n ≥ 27060,25, ∴ n ≥ 27061
b) n ≥ 38416, ∴ n ≥ 38416

9. σ = 3,4 cm, interval 0,25 cm
a) n ≥ 500,51, ∴ n ≥ 501
b) n ≥ 710,54, ∴ n ≥ 711

3 STATISTICS IN THE MEDIA

Sometimes statistics are misrepresented to imply a much better (or worse) situation than is really the case. In this section we will consider a few techniques used to create misleading statistics.

3.1 Misleading statistics

3.1.1 Axes' scales

The starting point of the scale on one or both axes can be changed. Diagrams A and B show the South African budgeted expenditure on education since 1999. Which is the misleading display? Discuss.
[Diagram A: Education Budget. Vertical axis: Budget (R billion), scale from 0 to 50. Horizontal axis: 1999 to 2004.]

[Diagram B: Education Budget. Vertical axis: Budget (R billion), scale from 25 to 50. Horizontal axis: 1999 to 2004.]

The spacing of the scale on one or both axes can also be changed. Diagrams C and D show the price of 1 US dollar in South African rand from 1979 to 2002. Which is the misleading display? Discuss.

[Diagram C: Average Rand / US Dollar Exchange Rate (1979 - 2002). Vertical axis: ZAR / US$, scale from 0 to 12. Horizontal axis: every year from 1979 to 2002.]

[Diagram D: Average Rand / US Dollar Exchange Rate (1979 - 2002). Vertical axis: ZAR / US$, scale from 0 to 12. Horizontal axis: every second year from 1979 to 2001.]

3.1.2 Perspective

Perspective in 3D can be misused to make things seem larger or smaller than is really the case. The pie charts below display the percentage owned by shareholders in a company. Why can the second display be considered misleading?

3.1.3 Size

Area or volume can be misused. These displays show the increase in turnover in the company from 2001 to 2004. Are any of them misleading?

Exercise 8 – Displaying statistics

The table below shows a dramatic increase in the number of women diagnosed with HIV each year. In 1990, females accounted for just 15% of HIV diagnoses, but in 2004 that figure was 43%.

1. Draw a graph of your choice to illustrate the findings for 1990 up to 2004.
2. Is there any noticeable trend in your graph?
3. Discuss the decline in 2004. Do you think it is real, or is it a statistical or administrative change in measurement?
4. What is your forecast of what would happen in the future?

HIV infected individuals and AIDS cases by year of diagnosis and sex

Year of diagnosis   HIV Male   HIV Female   HIV Total*   AIDS Male   AIDS Female   AIDS Total
1989 or earlier        12892         1319        14238        3398           155         3553
1990                    2177          371         2553        1145            97         1242
1991                    2278          450         2728        1253           138         1391
1992                    2202          541         2744        1405           173         1578
1993                    2101          533         2635        1551           238         1789
1994                    2042          534         2577        1626           227         1853
1995                    2085          572         2657        1487           284         1771
1996                    2117          587         2704        1172           272         1444
1997                    2093          669         2763         863           217         1080
1998                    2086          757         2844         593           195          788
1999                    2154          943         3099         564           192          756
2000                    2482         1390         3872         586           245          831
2001                    3108         1959         5068         487           240          727
2002                    3575         2636         6211         548           319          867
2003                    3980         3156         7136         496           390          886
2004                    3681         2721         6403         395           303          698
Total                  51053        19138        70232       17569          3685        21254

Solution

1. [Line chart: HIV and AIDS in South Africa. Vertical axis: Cases Diagnosed, scale from 0 to 4500. Horizontal axis: 1990 to 2004. Series: HIV Male, HIV Female, AIDS Male, AIDS Female.]

2. There is a sharp increase in HIV diagnoses from 1998 to 2003, followed by a decline in all cases in 2004.

3. It is important to investigate the results to determine the reason for the changes.

4. HIV cases will in future convert to AIDS, with the result that AIDS cases may well increase due to the high number of HIV diagnoses in the late 1990s and early 2000s. The rising trend in HIV is hopefully broken, and diagnoses may settle at a low plateau in future.
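The two percentages quoted in the introduction to Exercise 8 can be recovered from the HIV columns of the table. The sketch below is illustrative and not part of the original guide. It assumes the female share is taken against male plus female diagnoses rather than the Total* column, since that column appears to include a few cases outside the two listed sexes (the footnote for the asterisk is not reproduced here). The names `HIV_BY_YEAR` and `female_share` are hypothetical.

```python
# HIV diagnoses copied from the exercise table: year -> (male, female)
HIV_BY_YEAR = {
    1990: (2177, 371),  1991: (2278, 450),  1992: (2202, 541),
    1993: (2101, 533),  1994: (2042, 534),  1995: (2085, 572),
    1996: (2117, 587),  1997: (2093, 669),  1998: (2086, 757),
    1999: (2154, 943),  2000: (2482, 1390), 2001: (3108, 1959),
    2002: (3575, 2636), 2003: (3980, 3156), 2004: (3681, 2721),
}

def female_share(male, female):
    """Fraction of diagnoses with a recorded sex that were female."""
    return female / (male + female)

# Print the female share of HIV diagnoses for each year in the table.
for year, (male, female) in HIV_BY_YEAR.items():
    print(f"{year}: {female_share(male, female) * 100:.1f}% female")
```

Plotting these yearly shares, rather than the raw counts, is one way to show the trend in question 2 without the distortion caused by changes in the total number of diagnoses.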
Annexure A – Normal distribution table

The body of the table gives Φ(z) = P(Z < z). The row gives z to one decimal place and the column gives the second decimal place.

z     0,00     0,01     0,02     0,03     0,04     0,05     0,06     0,07     0,08     0,09
0,0   0,50000  0,50399  0,50798  0,51197  0,51595  0,51994  0,52392  0,52790  0,53188  0,53586
0,1   0,53983  0,54380  0,54776  0,55172  0,55567  0,55962  0,56356  0,56749  0,57142  0,57535
0,2   0,57926  0,58317  0,58706  0,59095  0,59483  0,59871  0,60257  0,60642  0,61026  0,61409
0,3   0,61791  0,62172  0,62552  0,62930  0,63307  0,63683  0,64058  0,64431  0,64803  0,65173
0,4   0,65542  0,65910  0,66276  0,66640  0,67003  0,67364  0,67724  0,68082  0,68439  0,68793
0,5   0,69146  0,69497  0,69847  0,70194  0,70540  0,70884  0,71226  0,71566  0,71904  0,72240
0,6   0,72575  0,72907  0,73237  0,73565  0,73891  0,74215  0,74537  0,74857  0,75175  0,75490
0,7   0,75804  0,76115  0,76424  0,76730  0,77035  0,77337  0,77637  0,77935  0,78230  0,78524
0,8   0,78814  0,79103  0,79389  0,79673  0,79955  0,80234  0,80511  0,80785  0,81057  0,81327
0,9   0,81594  0,81859  0,82121  0,82381  0,82639  0,82894  0,83147  0,83398  0,83646  0,83891
1,0   0,84134  0,84375  0,84614  0,84849  0,85083  0,85314  0,85543  0,85769  0,85993  0,86214
1,1   0,86433  0,86650  0,86864  0,87076  0,87286  0,87493  0,87698  0,87900  0,88100  0,88298
1,2   0,88493  0,88686  0,88877  0,89065  0,89251  0,89435  0,89617  0,89796  0,89973  0,90147
1,3   0,90320  0,90490  0,90658  0,90824  0,90988  0,91149  0,91309  0,91466  0,91621  0,91774
1,4   0,91924  0,92073  0,92220  0,92364  0,92507  0,92647  0,92785  0,92922  0,93056  0,93189
1,5   0,93319  0,93448  0,93574  0,93699  0,93822  0,93943  0,94062  0,94179  0,94296  0,94408
1,6   0,94520  0,94630  0,94738  0,94845  0,94950  0,95053  0,95154  0,95254  0,95352  0,95449
1,7   0,95543  0,95637  0,95728  0,95818  0,95907  0,95994  0,96080  0,96164  0,96246  0,96327
1,8   0,96407  0,96485  0,96562  0,96638  0,96712  0,96784  0,96856  0,96926  0,96995  0,97062
1,9   0,97128  0,97193  0,97257  0,97320  0,97381  0,97441  0,97500  0,97558  0,97615  0,97670
2,0   0,97725  0,97778  0,97831  0,97882  0,97932  0,97982  0,98030  0,98077  0,98124  0,98169
2,1   0,98214  0,98257  0,98300  0,98341  0,98382  0,98422  0,98461  0,98500  0,98537  0,98574
2,2   0,98610  0,98645  0,98679  0,98713  0,98745  0,98778  0,98809  0,98840  0,98870  0,98899
2,3   0,98928  0,98956  0,98983  0,99010  0,99036  0,99061  0,99086  0,99111  0,99134  0,99158
2,4   0,99180  0,99202  0,99224  0,99245  0,99266  0,99286  0,99305  0,99324  0,99343  0,99361
2,5   0,99379  0,99396  0,99413  0,99430  0,99446  0,99461  0,99477  0,99492  0,99506  0,99520
2,6   0,99534  0,99547  0,99560  0,99573  0,99585  0,99598  0,99609  0,99621  0,99632  0,99643
2,7   0,99653  0,99664  0,99674  0,99683  0,99693  0,99702  0,99711  0,99720  0,99728  0,99736
2,8   0,99744  0,99752  0,99760  0,99767  0,99774  0,99781  0,99788  0,99795  0,99801  0,99807
2,9   0,99813  0,99819  0,99825  0,99831  0,99836  0,99841  0,99846  0,99851  0,99856  0,99861
3,0   0,99865  0,99869  0,99874  0,99878  0,99882  0,99886  0,99889  0,99893  0,99896  0,99900
3,1   0,99903  0,99906  0,99910  0,99913  0,99916  0,99918  0,99921  0,99924  0,99926  0,99929
3,2   0,99931  0,99934  0,99936  0,99938  0,99940  0,99942  0,99944  0,99946  0,99948  0,99950
3,3   0,99952  0,99953  0,99955  0,99957  0,99958  0,99960  0,99961  0,99962  0,99964  0,99965
3,4   0,99966  0,99968  0,99969  0,99970  0,99971  0,99972  0,99973  0,99974  0,99975  0,99976
3,5   0,99977  0,99978  0,99978  0,99979  0,99980  0,99981  0,99981  0,99982  0,99983  0,99983
3,6   0,99984  0,99985  0,99985  0,99986  0,99986  0,99987  0,99987  0,99988  0,99988  0,99989
3,7   0,99989  0,99990  0,99990  0,99990  0,99991  0,99991  0,99992  0,99992  0,99992  0,99992
3,8   0,99993  0,99993  0,99993  0,99994  0,99994  0,99994  0,99994  0,99995  0,99995  0,99995
3,9   0,99995  0,99995  0,99996  0,99996  0,99996  0,99996  0,99996  0,99996  0,99997  0,99997
