Unit 2: Modeling Random Behavior

Week 3: Probability

By now you should appreciate that real engineering data exhibit variability. Probability is the tool used to model such variability. Statisticians use probability in much the same way as chemical engineers use chemistry. Chemistry provides the chemical engineer with the language necessary to solve typical problems in chemical engineering, in much the same way as probability is the language of uncertainty. Ultimately, probability represents a standardised measure of chance – with zero meaning no chance and unity meaning something is certain to occur.

Example: Nickel-Hydrogen batteries

A Nickel–Hydrogen battery (NiH2 or Ni–H2) is a rechargeable electrochemical power source based on Nickel and Hydrogen. It differs from a Nickel–metal hydride (NiMH) battery by the use of hydrogen in a pressurized cell at up to 8.5 MPa. The screen shot below gives more details on the structure of this type of battery. The blistering of sintered Nickel electrodes within this type of battery has been of continuing concern to users of long life batteries containing these electrodes. Blistering can be affected by many parameters that are sensitive both to the manufacturing process and to the electrodes' operating conditions. Some of these parameters are the active material loading level and the ability of the electrode to vent gases due to overcharge. Also, the active material impregnation process can enhance blister formation. Consider a manufacturer of these Nickel-Hydrogen batteries, who has run into problems with cells shorting out prematurely. The manufacturer therefore cut open 100 recently manufactured cells (each of which has 60 nickel plates within it) and observed the following results.

                                         Did the cell short?
                                         Yes        No
Did the cell have             Yes        35         15
blistered plates?             No          5         45

So, for example, 35 of the 100 cells shorted and on further inspection these cells also had one or more blisters on the Nickel plates.
But 15 of the cells were fine even though some of the plates had blisters on them. This tends to suggest that there must be a critical (i.e. minimum) number of blisters required before a cell shorts out. 45 cells without any blistering were fine, but 5 cells with no blisters did short - presumably because other failure mechanisms are at work within the manufactured batteries.

A. Defining Probability

A random experiment is one that can result in different outcomes. In the above example there are four possible outcomes for the tested cell: the cell shorted (call this Event A), the cell did not short (Event B), the cell had blistered plates (Event C) and the cell did not have blistered plates (Event D). The set of all possible outcomes of a random experiment is called the sample space, S. In this example the sample space is made up of four outcomes, S = {Event A, Event B, Event C, Event D}. Probability is a pure number between 0 and 1 that reflects the chances of a particular event occurring. If an event is absolutely certain to occur, that event has a probability of one associated with it. If on the other hand an event can't possibly occur, that event has a probability of zero. If an experiment is repeated N times and a particular event is observed in n of these experiments, then the ratio n/N is defined to be the probability P of that event occurring (provided N is very large):

P(Event) = n/N

Of the N = 100 cells tested, n = 35 + 5 = 40 of them had shorted. Thus the probability of event A occurring is

P(A) = 40/100 = 0.40 (or 40%)

Also

P(B) = (15 + 45)/100 = 0.60 (or 60%)

Notice that events A and B cannot occur together (either the cell has failed or it has not) and so these events are said to be mutually exclusive. Mutually exclusive events cannot occur at the same time. Further,

P(C) = (35 + 15)/100 = 0.50 (or 50%) and P(D) = (5 + 45)/100 = 0.50 (or 50%)

Again, events C and D are mutually exclusive.

B.
Combinations of events (AND, OR events)

Events A and C are not mutually exclusive, as 35 of the cells that shorted also had blistered plates. Similarly, events B and D are not mutually exclusive. It is therefore possible to calculate a probability of two or more events like these occurring together. For example, the probability of observing a cell that has shorted and has blistered plates is

P(A AND C) = 35/100 = 0.35 (or 35%)

Again, the probability of observing a cell that has not shorted and has no blistered plates is

P(B AND D) = 45/100 = 0.45 (or 45%)

It is also possible to look at the probability of one of a number of events occurring. For example, Event A or C. (Given that if A and C occur, we have the result that A or C has occurred, so when we say A or C we actually mean A or C or both.) Some n = 35 + 5 + 15 = 55 out of the N = 100 cells had either shorted or had blistered plates and so

P(A OR C) = 55/100 = 0.55 (or 55%)

Similarly, some n = 15 + 45 + 5 = 65 out of the N = 100 cells had not shorted or did not have blistered plates and so

P(B OR D) = 65/100 = 0.65 (or 65%)

C. The Addition Rule

The addition rule of probability states

P(A OR C) = P(A) + P(C) – P(A AND C)

This can be verified using the above illustrations:

P(A OR C) = 0.40 + 0.50 – 0.35 = 0.55

which is exactly the same answer as that given above using the n/N definition of probability. For mutually exclusive events P(A AND C) would equal zero, so that the addition rule simplifies to

P(A OR C) = P(A) + P(C) for mutually exclusive events only.

D. Conditional Probability

Often two or more events will be related. Conditional probability quantifies the chances of one event occurring given that the others have occurred. Dependent events have conditional probabilities associated with them, whereas independent events do not. In the battery example, the occurrence of event A is dependent upon the presence of blistered plates, as more cells short in the presence of blistered plates.
The conditional probability of event A occurring, given that event C has already occurred is written as P(A|C). Given that event C has already occurred the reduced number of observations is N = 35 + 15 = 50. Out of these some 35 cells shorted, thus P(A|C) = 35/50 = 0.7 (or 70%) This probability differs from P(A) and so events A and C are said to be dependent. If A and C were independent then P(A) would be the same irrespective of whether event C had occurred or not. So for independent events P(A|C) = P(A). E. The Product Rule The Product rule of probability states P(A|C) = P(A AND C) / P(C) This can be proved using the above illustrations P(A|C) = 0.35/0.5 = 0.70 which is exactly the same answer as that given above using the n/N definition of probability. The product rule is more often expressed in the following way by rearranging the last equation P(A AND C) = P(A|C) P(C) For independent events P(A|C) = P(A) so that the product rule simplifies to P(A AND C) = P(A) P(C) for independent events only. All the above calculations and more are contained in the screen shot below. F. Systems Reliability Many physical systems (e.g. bridges, car engines, air-conditioning systems, biological and ecological systems) and non-physical systems (e.g. chains of command in civilian or military organizations and quality control systems in manufacturing plants) may be viewed as assemblies of many interacting elements. The elements are often arranged in mechanical or logical series or parallel configurations and the reliability of a system is easily calculated from the reliability of its components using the above rules of probability. Series systems function properly only when all their components function properly. Thus the addition rule of probability applies. 
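The addition rule, product rule and conditional probability calculations above can be reproduced with a short script. This is a minimal sketch; the counts and variable names simply mirror the battery-cell table from the text.

```python
# Probabilities from the battery-cell table (counts taken from the text).
N = 100
shorted_blistered = 35   # shorted AND blistered plates
ok_blistered = 15        # did not short, blistered plates
shorted_clean = 5        # shorted, no blisters
ok_clean = 45            # did not short, no blisters

P_A = (shorted_blistered + shorted_clean) / N   # P(cell shorted) = 0.40
P_C = (shorted_blistered + ok_blistered) / N    # P(blistered plates) = 0.50
P_A_and_C = shorted_blistered / N               # P(shorted AND blistered) = 0.35

# Addition rule: P(A OR C) = P(A) + P(C) - P(A AND C)
P_A_or_C = P_A + P_C - P_A_and_C                # 0.55

# Product rule rearranged: P(A|C) = P(A AND C) / P(C)
P_A_given_C = P_A_and_C / P_C                   # 0.70

print(P_A, P_C, P_A_or_C, P_A_given_C)
```

Note that P(A|C) = 0.70 differs from P(A) = 0.40, which is exactly the dependence between shorting and blistering discussed above.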
Examples are chains made out of links, highways that may be closed to traffic due to accidents at different locations, the food chains of certain animal species, and layered company organizations in which information is passed from one hierarchical level to the next. Let Bi be the event that component i fails. Also denote by P(Bi) the probability that component i fails. The probability of failure of a system with k components arranged in series is then P(Systems failure) = P(B1 or B2 …. or Bk) For components that can’t fail together (i.e. mutually exclusive components) this simplifies to P(Systems failure) = P(B1) + P(B2) + …. + P(Bk) Example: Welded parts Consider the screen shot below showing a simple component manufactured by welding together 3 separate parts. The welds are arranged in series as the structure will clearly fail if any one of the welds break. Suppose the probability of an individual weld failing is 0.01 and the welds are independent of each other. The screen shot shows how to calculate the probability of the part failing. Note that the probability of the part failing is close to twice that for a single weld. Thus probability theory tells us that simple series systems (i.e. those with fewer parts (e.g. fewer welds)) will be more reliable than those with many parts. A parallel system fails only if all its components fail. Thus the product rule of probability applies. For example, if an office has k copy machines, it is possible to copy a document if at least one machine is in good working condition. The probability of failure of a system with k components in parallel is then P(Systems failure)= P(B1 and B2 …. and Bk) If components fail independently of each other this simplifies to P(Systems failure) = P(B1) x P(B2) x…. x P(Bk) Example: Safety valves Consider the pressure vessel with three safety valves shown in the following screenshot. 
The use of three safety valves is an example of a triply redundant system provided that destructive over pressure can only occur if all the valves fail. One valve is sufficient to release pressure. If this is the case the three valves are said to be arranged in parallel. Suppose also that the probability of a safety valve failing is 0.15 and that these valves are independent of each other. Notice that the chances of destructive over pressure are a lot less than the chances of an individual safety valve failing. This is the benefit derived from designing in parallel. If only two safety valves had been built into the pressure vessel the probability of destructive over pressure would rise to: P(Systems failure) = P(B1) x P(B2) = 0.15 x 0.15 = 0.0225 Building additional safety valves has cost implications and a sensible balance needs to be struck between economic cost and safety. Weeks 4 & 5: Random Variables A. Random Variables In most applications of statistics to engineering problems, the researcher is only concerned with one or maybe just a few numbers that are associated with the outcomes of the experiment being carried out. For example, in the inspection of a manufactured product the engineer may be interested only in the number of defectives or in the study of the performance of a miniature rechargeable battery the engineer may be interested only in its power and lifetime. VOICE OF EXPERIENCE Random variables provide a way to model the behavior of real data. In general, each outcome of an experiment may be associated with a number by specifying a rule of association. Such a rule of association is called a random variable – a variable because different numerical values are possible and random because all these values are associated with situations involving an element of chance or uncertainty. An upper case letter denotes a random variable and a lower case letter denotes a specific value for this variable. 
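The systems reliability examples above (the welded part and the safety valves) can be sketched in a few lines. This is an illustrative sketch: the exact series formula 1 − Π(1 − pᵢ) for independent components is used rather than the mutually exclusive approximation ΣP(Bᵢ), and the number of welds (two, joining the three parts) is an assumption inferred from the remark that the part's failure probability is close to twice that of a single weld.

```python
# Series vs parallel system reliability, using the rules from the text.
# Assumed inputs: two independent welds each failing with probability 0.01,
# and three independent safety valves each failing with probability 0.15.

def series_failure(p, k):
    """A series system fails if ANY of its k independent parts fails."""
    return 1.0 - (1.0 - p) ** k

def parallel_failure(p, k):
    """A parallel system fails only if ALL k independent parts fail."""
    return p ** k

weld_fail = series_failure(0.01, 2)    # ~0.0199, close to twice 0.01
valves3 = parallel_failure(0.15, 3)    # 0.003375, far below 0.15
valves2 = parallel_failure(0.15, 2)    # 0.0225, as quoted in the text

print(weld_fail, valves3, valves2)
```

For small failure probabilities the exact series result (0.0199) and the simple sum (0.02) agree closely, which is why the text's simplified addition rule works well in practice.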
Example: Failure of a 9Cr steel

Suppose four specimens of a 9Cr steel alloy are put on test at a load of 150 MPa and a temperature of 550 °C. The engineer is interested in how many of these specimens will fail before 1,000 hours of testing at this condition. This creates a random variable X = number of failed specimens, which can take the values x = 0, 1, 2, 3 and 4. The actual outcome of this experiment is uncertain – it will not be known until the end of the test (i.e. after 1,000 hours) what value X actually takes. For example, if at the end of the test two specimens are observed to have failed, then x = 2. A random variable can be either discrete or continuous in nature. A discrete random variable is one that can assume at most a countable number of values (e.g. the number of defects in a new car or the number of defective silicon chips in a lot). A continuous random variable is one that can assume any real value over a specified interval (e.g. the outside diameter of a pen barrel or the times between breakdowns of a stamping press at a car assembly plant). One way of specifying the uncertainty associated with a random variable is through the function

F(x) = P(X ≤ x)

F(x) is the probability that the random variable X will take on the value x or less at the end of the experiment. F(x) is called the cumulative distribution function (or cdf for short). When X takes on its smallest possible value, the probability of observing a smaller value than this is zero. Then when X takes on its largest possible value, the probability of observing this value or less is one. Thus F(x) varies from zero through to one as X increases. The cdf also provides a convenient way of finding the probability that a random variable lies within a certain range:

P(x1 ≤ X ≤ x2) = P(X ≤ x2) – P(X ≤ x1) = F(x2) – F(x1)

Another approach commonly used to represent a random variable is through the use of the probability density function (or pdf for short).
The pdf of a random variable X is defined as the derivative of the cdf:

f(x) = dF(x)/dx

As integrals define areas under curves, it follows that P(x1 ≤ X ≤ x2) can also be found by calculating the area under the pdf between x1 and x2. As the maximum value for F(x) is one, it follows that the total area under the pdf must also sum to one.

Example: Battery failures

Suppose that a battery failure time, measured in hours, has a cdf given by

F(x) = 1 – 1/(x + 1)²

The probability that the battery lasts between x1 = 1 and x2 = 2 hours is therefore

P(x1 ≤ X ≤ x2) = P(1 ≤ X ≤ 2) = F(x2) – F(x1) = F(2) – F(1)

F(2) = 1 – 1/(2 + 1)² = 8/9 ; F(1) = 1 – 1/(1 + 1)² = 3/4

Thus

P(x1 ≤ X ≤ x2) = 8/9 – 3/4 = 5/36

The pdf is given by

f(x) = dF(x)/dx = 2/(x + 1)³

Working with only the pdf,

P(x1 ≤ X ≤ x2) = ∫ from 1 to 2 of 2/(x + 1)³ dx = [–1/(x + 1)²] from 1 to 2 = –1/(2 + 1)² + 1/(1 + 1)² = –1/9 + 1/4 = 5/36

More details on this type of calculation can be seen in the screen shot below (click on it to access the actual Excel file).

B. Distributions

Depending on the characteristics of the random variable being studied, there are a variety of different cdf's available. This module will consider only some of the many cdf's available to practicing engineers. These cdf's are described by formulas that depend on parameter values. These are population parameters in that they should ideally be calculated from the population of values on X.

i. The Binomial Distribution

VOICE OF EXPERIENCE: The binomial distribution models defective/non-defective data.

The simplest discrete random variable is one that can take on just two values. Such a random variable can be used to model the outcome of a coin toss, whether an important safety valve is open or shut, whether an item or component is defective or not, and so on. Typically, the outcomes are labeled 0 (for not defective) and 1 (for defective) and this random variable is defined by the parameter p, where p is the probability that X takes on the value 1.
Such a simple random variable is called a Bernoulli random variable. Many experiments and tasks carried out by engineers can be thought of as consisting of a sequence of Bernoulli trials, such as the repeated examination of critical components and parts to determine whether they are defective. In such cases, the random variable of interest is the number of observed successes obtained within a fixed number of trials, n. Such a random variable is called a binomial random variable. Specifically, if n independent Bernoulli trials X1, ……, Xn are performed, each with a probability p of being defective, then the random variable X = X1 + X2 + …. + Xn is said to have a binomial distribution with parameters n and p, which is written as

X ~ B(n, p)

The pdf of a B(n, p) random variable is given by the formula

f(x) = [n!/((n – x)! x!)] p^x (1 – p)^(n – x)

where ! stands for factorial (e.g. 3! = 3 x 2 x 1 = 6). The mean and variance are in turn given by

Mean = np ; Variance = np(1 – p)

These equations are easily derived from the rules of probability looked at in an earlier unit. This derivation is best done via an example.

Example: Polyester yarn

A chemical engineer monitors a dyeing process for polyester yarn used in clothing by comparing a sample of the yarn against a standard colour chart. The engineer accepts or rejects the entire batch based on the result of this comparison. Historically, this dyeing process averages 25% rejected batches, so in the terminology above p = 0.25. Each shift, the process dyes four batches and so, in the terminology above, the number of trials is n = 4. Now take X = 1 as an example. There are only four ways in which this can happen:

Possibility 1: X = X1 + X2 + X3 + X4 = 1 + 0 + 0 + 0 = 1. (Only the first batch is defective.) The probability that this outcome will occur is the probability that X1 = 1 And X2 = 0 And X3 = 0 And X4 = 0, which of course requires the use of the product rule.
Further, the outcome of the second batch is independent of what happened to the first batch and what will happen to the third batch, and so the simpler product rule for independent events is applicable:

P[X1 = 1 And X2 = 0 And X3 = 0 And X4 = 0] = p(1 – p)(1 – p)(1 – p)

Possibility 2: X = X1 + X2 + X3 + X4 = 0 + 1 + 0 + 0 = 1. (Only the second batch is defective.)

P[X1 = 0 And X2 = 1 And X3 = 0 And X4 = 0] = (1 – p)p(1 – p)(1 – p)

Possibility 3: X = X1 + X2 + X3 + X4 = 0 + 0 + 1 + 0 = 1. (Only the third batch is defective.)

P[X1 = 0 And X2 = 0 And X3 = 1 And X4 = 0] = (1 – p)(1 – p)p(1 – p)

Possibility 4: X = X1 + X2 + X3 + X4 = 0 + 0 + 0 + 1 = 1. (Only the fourth batch is defective.)

P[X1 = 0 And X2 = 0 And X3 = 0 And X4 = 1] = (1 – p)(1 – p)(1 – p)p

Notice that X = 1 can therefore occur in four separate ways, and so X = 1 if possibility 1 Or possibility 2 Or possibility 3 Or possibility 4 occurs. Or events require the addition rule of probability and, because these outcomes are mutually exclusive (the researcher can only observe one of the possibilities above during a single shift),

P[X = 1] = P(possibility 1 Or possibility 2 Or possibility 3 Or possibility 4) = p(1 – p)(1 – p)(1 – p) + (1 – p)p(1 – p)(1 – p) + (1 – p)(1 – p)p(1 – p) + (1 – p)(1 – p)(1 – p)p

P[X = 1] = 4p¹(1 – p)³

The number 4 in this last expression is the number of ways that X can equal 1. This can be worked out quickly using the first part of the pdf for B(n, p), i.e. using n!/((n – x)! x!). When x = 1 and n = 4 this gives

4!/((4 – 1)! 1!) = 24/6 = 4

Now look at the second part of the pdf for B(n, p) above. This part, p^x (1 – p)^(n – x), is simply p¹(1 – p)³ when n = 4 and x = 1. Now with p = 0.25,

P[X = 1] = 4(0.25)(1 – 0.25)³ = 0.4219

There is a 42.19% chance that by the end of the shift the control engineer will have observed one batch with defective colour. The probabilities associated with the other values for X are easily worked out using the pdf for B(n, p).
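The binomial probabilities in this example can be reproduced with a short sketch, using Python's built-in comb function for the counting factor n!/((n – x)! x!):

```python
# Binomial pmf for the polyester-yarn example:
# n = 4 batches per shift, each rejected with probability p = 0.25.
from math import comb

def binomial_pmf(x, n, p):
    # f(x) = n!/((n - x)! x!) * p**x * (1 - p)**(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 4, 0.25
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(probs[1])              # 4 * 0.25 * 0.75**3 = 0.421875
mean = n * p                 # np = 1.0 rejected batch per shift
variance = n * p * (1 - p)   # np(1 - p) = 0.75
```

The five probabilities probs[0] to probs[4] sum to one, as they must for a complete distribution.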
Play around with the p value in cell B7 of this Excel file and notice how the binomial distribution is symmetric around its peak point when p = 0.5, which is a characteristic of the very important normal distribution to be discussed below (i.e. when p = 0.5 and n is large there is very little difference between these two distributions).

ii. The Uniform Distribution

The uniform distribution can be used when any value for the random variable is equally likely to occur. The shorthand for saying X is uniformly distributed between the limits a and b is X ~ U(a, b), with parameters a and b. The cdf for a uniform random variable is written as

F(x) = (x – a)/(b – a) for a ≤ x ≤ b

The pdf is given by

f(x) = dF(x)/dx = 1/(b – a)

The mean and variance are in turn given by

Mean = (a + b)/2 ; Variance = (b – a)²/12

[Figure: the cdf F(x) plotted against x rises linearly from 0 at x = a to 1 at x = b, with slope 1/(b – a).]

Example: Manufactured metal pins and pearls

Suppose that a manufactured metal pin has a diameter that has a uniform distribution between a = 4.182 mm and b = 4.185 mm. Thus X ~ U(4.182, 4.185), and so the mean pin diameter is (4.182 + 4.185)/2 = 4.1835 mm with a standard deviation of (4.185 – 4.182)/√12 = 0.00087 mm. The probability that a pin selected at random from this manufacturing process fits in a hole that has a diameter of 4.184 mm is

F(4.184) = P(X ≤ 4.184) = (4.184 – 4.182)/(4.185 – 4.182) = 0.6667 (or 66.67%)

When pearl oysters are opened, pearls of various sizes are typically found. Suppose that each oyster contains a pearl with a diameter in mm that has a U(0, 10) distribution. The mean pearl diameter is therefore (0 + 10)/2 = 5 mm with a standard deviation of (10 – 0)/√12 = 2.89 mm.
If pearls with a diameter of at least 4 mm have commercial value, then the probability that a randomly selected oyster from a recent catch of oysters has commercial value is

P(X ≥ 4) = 1 – F(4) = 1 – (4 – 0)/(10 – 0) = 0.60 (or 60%)

More details on this type of calculation can be seen in the screen shot below (click on it to access the actual Excel file).

iii. The Poisson Distribution

The Poisson distribution can be used when the random variable being studied is the count of the number of events within a specified boundary.

VOICE OF EXPERIENCE: The Poisson distribution models counts such as the number of defects.

For example, the number of hairline fractures in a 10 m long steel girder, or the number of blisters per battery plate, or the number of cooling pump failures within a UK nuclear power plant over a 10 year period, or the number of failures of air conditioning equipment in a Boeing 720 aircraft over 300 hours of flight. The shorthand for saying X is Poisson distributed is X ~ P(λ), with parameter λ. If the events are uniformly or randomly spaced within the specified boundaries, then the cdf for X is given by

F(x) = e^(–λ) Σ (from i = 0 to x) λ^i/i!

and the pdf is given by

f(x) = e^(–λ) λ^x/x!

The mean λ determines how quickly the cdf approaches its maximum value of 1.

[Figure: Poisson cdfs F(x) for means 0.5, 1 and 2, plotted for x = 0 to 10.]

The mean and variance are in turn given by

Mean = Variance = λ

Example: Steel girders

An engineer examines the edges of steel girders for hairline fractures. The girders are 10 m long, and it is discovered after inspecting many such girders that they have an average of 5 fractures each and that the fractures are always randomly spaced out on the girders. The probability that a randomly selected girder, on inspection, has three or fewer hairline fractures is

F(3) = P(X ≤ 3) = e^(–5) (5⁰/0! + 5¹/1! + 5²/2! + 5³/3!) = e^(–5) (1 + 5 + 25/2 + 125/6) = 0.2650

The probability that a randomly selected girder, on inspection, has exactly three hairline fractures is

F(3) – F(2) = 0.2650 – e^(–5) (5⁰/0! + 5¹/1! + 5²/2!) = 0.2650 – 0.1247 = 0.1404

More details on this type of calculation can be seen in the screen shot below (click on it to access the actual Excel file).

iv. The Exponential Distribution

VOICE OF EXPERIENCE: The exponential distribution models the length of time or space between successive events.

The exponential distribution can be used when the random variable being studied is the amount of time or space between successive events occurring in a Poisson process. For example, the size of the gaps between fractures on the edges of steel girders or the times between failures of air conditioning equipment in Boeing 720 aircraft. The shorthand for saying X is exponentially distributed is X ~ Exp(λ), with parameter λ. The cdf is given by

F(x) = 1 – e^(–λx)

and the pdf is

f(x) = dF(x)/dx = λe^(–λx)

The mean and variance are in turn given by

Mean = 1/λ ; Variance = 1/λ²

The mean determines how quickly the cdf approaches its maximum value of 1. The Poisson and exponential distributions are closely related, as is illustrated in the next example.

[Figure: exponential cdfs F(x) for means 0.5, 1.0 and 2.0, plotted for x = 0 to 10.]

Example: Steel girders revisited

An engineer examines the edges of steel girders for hairline fractures. The girders are 10 m long, and it is discovered after inspecting many such girders that they have an average of 5 fractures each and also that the fractures appear to be uniformly spaced on the girders. If a girder has an average of 5 fractures, then there are an average of 6 gaps between fractures or between the ends of the girder and the adjacent fractures. The average length of these gaps between fractures is therefore 10/6 = 1.67 m, and so 1.67 = 1/λ, or λ = 1/1.67 = 0.6 per metre.
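The Poisson probabilities used in the girder example above can be reproduced directly from the cdf formula:

```python
# Poisson cdf for the girder example: lambda = 5 fractures per 10 m girder.
from math import exp, factorial

def poisson_cdf(x, lam):
    """F(x) = exp(-lam) * sum over i = 0..x of lam**i / i!"""
    return exp(-lam) * sum(lam**i / factorial(i) for i in range(x + 1))

lam = 5
F3 = poisson_cdf(3, lam)                 # P(X <= 3), about 0.2650
P_exactly_3 = F3 - poisson_cdf(2, lam)   # F(3) - F(2), about 0.1404

print(round(F3, 4), round(P_exactly_3, 4))
```

The difference F(3) – F(2) picks out the single probability P(X = 3), since the cdf accumulates the pmf one count at a time.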
Thus the number of fractures follows a Poisson distribution with λ = 5, whilst the gap between these fractures follows an exponential distribution with λ = 0.6 per metre. In this case, the probability that a measured gap on a randomly selected 10 m girder is less than or equal to 0.8 m long is

F(0.8) = P(X ≤ 0.8) = 1 – e^(–0.6(0.8)) = 0.3812

If a 2.5 m segment of girder is selected, the average number of fractures it contains is 0.25(5) = 1.25. The probability that this segment contains at most 1 fracture is

F(1) = P(X ≤ 1) = e^(–1.25) (1.25⁰/0! + 1.25¹/1!) = e^(–1.25) (1 + 1.25) = 0.6446

v. The Weibull Distribution

The Weibull distribution is often used to model times to certain events, such as the time or number of miles driven for brake pads to wear out, or the lifetime of boiler components in a power plant, or the lifetime of a particular bacterium at a certain high temperature. However, this distribution is used to model a variety of other types of random variables occurring in engineering - especially extreme or maximum values. The shorthand for saying X is Weibull distributed is X ~ W(λ, β), with parameters λ and β. The cdf is given by

F(x) = 1 – e^(–(λx)^β)

and the pdf is

f(x) = dF(x)/dx = βλ^β x^(β–1) e^(–(λx)^β)

The mean and variance are in turn given by

Mean = (1/λ) Γ(1 + 1/β) ; Variance = (1/λ²) {Γ(1 + 2/β) – [Γ(1 + 1/β)]²}

The function Γ(k) is the gamma function. If k is a positive integer then Γ(k) = (k – 1)!. Except for these special cases there is no closed form expression for the gamma function. It can however be worked out numerically in Excel using the Gammaln() function, which calculates the natural log of the gamma function. λ is called the scale parameter and β is called the shape parameter. The Weibull distribution is very useful in applied engineering research as its pdf can exhibit a wide variety of shapes, depending on the choice of values for the parameters. Notice in the following figure that when β = 1, the Weibull distribution has the same shape as the exponential distribution, i.e.
it is the exponential distribution (this can be seen also by setting β = 1 in the above Weibull cdf and comparing the result to the cdf for the exponential distribution). The Weibull distribution is therefore simply a generalization of the exponential distribution.

[Figure: Weibull pdfs f(x) for λ = 1 with β = 0.5, 2 and 3, plotted for x = 0 to 7.]

Example: Pitting corrosion

A study of pitting corrosion was carried out on buried cast iron oil and gas pipelines. The study found that the maximum pit depth data followed a Weibull distribution with λ = 0.45 and β = 1.8. The average maximum pit depth is therefore

Mean = (1/0.45) Γ(1 + 1/1.8) = (1/0.45)(0.889) = 1.98 mm

with a variance of

Variance = (1/0.45²) {Γ(1 + 2/1.8) – [Γ(1 + 1/1.8)]²} = 1.29 mm²

The probability that a randomly selected dug up pipe, on inspection, has a maximum pit depth of 2 mm or less is

F(2) = P(X ≤ 2) = 1 – e^(–(0.45 × 2)^1.8) = 0.5628

vi. The Normal Distribution

Among all the distributions used in applied research, the single most important is the normal distribution. There are two fundamental reasons for this. First, the behavior of many physical phenomena can be modeled well using this distribution. Secondly, and perhaps more importantly, the behavior of averages can, when using large samples, be modeled well using this distribution. This latter fact leads to the development of many tests for important hypotheses (see unit 4 for more on this). The shorthand for saying X is normally distributed is X ~ N(μ, σ), with parameters μ and σ. In fact, if a random variable is normally distributed then its mean value equals μ and σ² is its variance (as these are population parameters, μ is the population mean and σ the population standard deviation). The pdf is

f(x) = dF(x)/dx = (1/√(2πσ²)) exp{–½[(x – μ)/σ]²}

Unfortunately there is no analytical solution to the integral of f(x) and so there is no closed form expression for F(x).
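Although F(x) for the normal distribution has no closed form, it is routinely evaluated numerically. A minimal sketch, using the standard error-function identity F(x) = ½[1 + erf((x – μ)/(σ√2))] and the erf function available in Python's math module:

```python
# Numerical evaluation of the normal cdf via the error function.
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative probability P(X <= x) for X ~ N(mu, sigma)."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

# Coverage within one, two and three standard deviations of the mean:
within_1sd = normal_cdf(1) - normal_cdf(-1)   # ~0.6827 (about 68%)
within_2sd = normal_cdf(2) - normal_cdf(-2)   # ~0.9545 (about 95%)
within_3sd = normal_cdf(3) - normal_cdf(-3)   # ~0.9973 (about 99.7%)

print(within_1sd, within_2sd, within_3sd)
```

This is the same computation that Excel's =NormsDist() function and the printed Z Table perform behind the scenes.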
The integral must be evaluated numerically and this creates some minor complications. Using numerical integration procedures it has been shown that:

1. Approximately 68% of all the possible values that X can take on are within the numeric range μ ± 1σ
2. Approximately 95% of all the possible values that X can take on are within the numeric range μ ± 2σ
3. Virtually all (approximately 99.7%) the possible values that X can take on are within the numeric range μ ± 3σ

Further, the pdf is bell shaped and symmetric around the peak point. The peak point is located at the mean value μ, and μ ± 1σ represents the points of inflection for the pdf, so that σ essentially determines how wide the distribution is. For this reason μ is sometimes called the location parameter and σ the shape parameter. All this is illustrated in the screen shot below. Any normally distributed random variable can be converted into a standard normal random variable, Z. This is important because of the inability to integrate f(x). Because any normal variable can be converted into a standard normal variable, it is only required to numerically integrate various areas under this single standard normal distribution and tabulate the results for future use. In fact the expression (x – μ)/σ in the formula for f(x) above is a standard normal variable, which is usually given the symbol Z:

Z = (X – μ)/σ

So if X is a random variable following a normal distribution, its rescaled value, Z, follows the standard normal distribution, whose pdf is

f(z) = (1/√(2π)) exp{–z²/2}

The following screenshot shows the numerically calculated integral for f(z), i.e. the values for F(z) corresponding to various values for z. A table of F(z) is shown below and can be found at the back of any decent textbook on statistics. This table is usually referred to as the Z Table. Alternatively, F(z) can be "looked up" in Excel using the =NormsDist(z) function. The Z Table can be read the other way around, i.e.
finding the z value associated with a given value for F(z), in Excel using the =NormsInv(p) function (where p is the given value of F(z)). Notice the F(z) values shown in the Z table below are the probabilities of observing the shown Z value or less. It follows from the last two screenshots that a normally distributed variable and its standardized equivalent are quite specifically related. When a normally distributed variable, X, takes on a value equal to its mean, the corresponding standardized value is zero. So a standardized normal variable has a mean of zero. When a normally distributed variable takes on a value equal to its mean plus one standard deviation, the corresponding standardized value is unity. So a standardized normal variable has a standard deviation of one. All of this is best illustrated using an example.

Example: Concrete blocks

A company manufactures concrete blocks that are used for construction purposes. Suppose that the weights of the individual concrete blocks produced by this company are normally distributed with a mean value of μ = 11 kg and a standard deviation of σ = 0.3 kg. What percentage of concrete blocks produced by this company weighs 10.5 kg or less? The standardized value for 10.5 kg is

Z = (10.5 – 11)/0.3 = –1.67

From the Z table above, the probability of observing this Z value or less is 0.0475, so 4.75% of manufactured blocks will have a weight of 10.5 kg or less.

[Figure: the normal pdf of concrete block weights from 10.1 kg to 11.9 kg, with the corresponding Z values from –3 to 3 shown on a second axis; the shaded area below 10.5 kg equals =NormsDist(–1.67) = 0.0475.]

vii. Transformations of the normal distribution

If a random variable X has a normal distribution, so that Z has a standard normal distribution, then its square has a chi square distribution with parameter v = 1. v is referred to as the degrees of freedom.
More generally, if there are v normally distributed variables that are independent of each other, the squares of their standardized values, once added together, follow a chi square distribution with v degrees of freedom. That is,

χ²_v = (Z1)² + (Z2)² + …. + (Zv)²

has a pdf given by

f(y) = [1/(2^(v/2) Γ(v/2))] y^((v/2) – 1) exp(–y/2)

where y denotes the value taken by χ²_v. As the chi square distribution is essentially a transformation of the standard normal distribution, it is not surprising that again there is no closed form expression for F(χ²_v). Again, values for F(χ²_v) have been numerically calculated and tabulated for various values of v. Values from this table can be looked up in Excel using the function =ChiDist(x, v) (note that ChiDist returns the upper tail probability, i.e. 1 – F(x)).
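A quick Monte Carlo sketch confirms this relationship: the sum of v squared standard normal variables has mean v and variance 2v, the known mean and variance of a chi square distribution. The choice v = 5 and the number of trials are arbitrary, illustrative values.

```python
# Monte Carlo check: sums of v squared standard normals behave like
# a chi square variable with v degrees of freedom (mean v, variance 2v).
import random

random.seed(42)      # fixed seed so the run is reproducible
v = 5                # degrees of freedom (illustrative choice)
trials = 100_000

samples = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(v))
           for _ in range(trials)]

mean = sum(samples) / trials
variance = sum((s - mean) ** 2 for s in samples) / trials

print(round(mean, 2), round(variance, 2))   # close to v = 5 and 2v = 10
```

Simulation like this is a useful sanity check whenever a tabulated distribution (or an Excel function such as =ChiDist) is not to hand.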