Introduction to Statistics
STATISTICS and PROBABILITY
LECTURE: PROBABILITY DISTRIBUTIONS
Prof. Dr. İrfan KAYMAZ
Atatürk University, Engineering Faculty, Department of Mechanical Engineering
P.S. These lecture notes are mainly based on the reference given on the last page.

Objectives of this lecture
After listening carefully to this lecture, you should be able to do the following:
- Determine probabilities from probability mass functions, and the reverse.
- Determine probabilities from cumulative distribution functions, and cumulative distribution functions from probability mass functions, and the reverse.
- Determine probabilities from probability density functions.
- Determine probabilities from cumulative distribution functions, and cumulative distribution functions from probability density functions, and the reverse.

Random Variables
A variable that associates a number with the outcome of a random experiment is called a random variable. More precisely, a random variable is a function that assigns a real number to each outcome in the sample space of a random experiment.
Particular notation is used to distinguish the random variable (rv) from the real number it takes:
- The rv is denoted by an uppercase letter, such as X.
- After the experiment is conducted, the measured value is denoted by a lowercase letter, such as x = 70.
- X and x are shown in italics, e.g., P(X = x).
© John Wiley & Sons, Inc. Applied Statistics and Probability for Engineers, by Montgomery and Runger.

Continuous & Discrete Random Variables
A discrete random variable is a rv with a finite (or countably infinite) range. Its values are usually integer counts, e.g., number of errors or number of bit errors per 100,000 transmitted (rate). The ends of the range of rv values may be finite (0 ≤ x ≤ 5) or infinite (x ≥ 0).
A continuous random variable is a rv with an interval (either finite or infinite) of real numbers for its range.
Its precision depends on the measuring instrument.

Examples of Discrete & Continuous RVs
Continuous rv’s:
- Electrical current and voltage.
- Physical measurements, e.g., length, weight, time, temperature, pressure.
Discrete rv’s:
- Number of scratches on a surface.
- Proportion of defective parts among 100 tested.
- Number of transmitted bits received in error.
- Number of common stock shares traded per day.

Probability Distributions
A random variable X associates the outcomes of a random experiment with a number on the number line. The probability distribution of the random variable X is a description of the probabilities associated with the possible numerical values of X.
A probability distribution of a discrete random variable can be:
- A list of the possible values along with their probabilities.
- A formula that is used to calculate the probability in response to an input of the random variable’s value.

Example: Digital Channel
There is a chance that a bit transmitted through a digital transmission channel is received in error. Let X equal the number of bits received in error in the next 4 transmitted. The associated probability distribution of X is shown as a graph and as a table.
Figure 3-1 Probability distribution for bits in error.

x        P(X = x)
0        0.6561
1        0.2916
2        0.0486
3        0.0036
4        0.0001
Total    1.0000

Probability Mass Function
Suppose a loading on a long, thin beam places mass only at discrete points.
This represents a probability distribution where the beam is the number line over the range of x and the probabilities represent the mass. That’s why it is called a probability mass function.
Figure 3-2 Loading at discrete points on a long, thin beam.

Probability Mass Function Properties
For a discrete random variable X with possible values $x_1, x_2, \ldots, x_n$, a probability mass function is a function such that:
(1) $f(x_i) \ge 0$
(2) $\sum_{i=1}^{n} f(x_i) = 1$
(3) $f(x_i) = P(X = x_i)$

Example: Wafer Contamination
Let the random variable X denote the number of wafers that need to be analyzed to detect a large particle. Assume that the probability that a wafer contains a large particle is 0.1, and that the wafers are independent. Determine the probability distribution of X.
Let p denote a wafer in which a large particle is present and let a denote a wafer in which it is absent.
The sample space is: S = {p, ap, aap, aaap, …}
The range of the values of X is: x = 1, 2, 3, 4, …

Probability distribution (first four values):

x    P(X = x)           Value
1    0.1                0.1
2    (0.9)(0.1)         0.09
3    (0.9)^2 (0.1)      0.081
4    (0.9)^3 (0.1)      0.0729
Sum of the first four values: 0.3439

Cumulative Distribution Functions
Example 3-6: From Example 3.4, we can express the probability of three or fewer bits being in error, denoted as P(X ≤ 3). The event (X ≤ 3) is the union of the mutually exclusive events (X = 0), (X = 1), (X = 2), (X = 3). From the table:

x        P(X = x)    P(X ≤ x)
0        0.6561      0.6561
1        0.2916      0.9477
2        0.0486      0.9963
3        0.0036      0.9999
4        0.0001      1.0000
Total    1.0000

P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.9999
P(X = 3) = P(X ≤ 3) - P(X ≤ 2) = 0.9999 - 0.9963 = 0.0036
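The running totals in the table for the digital channel example can be reproduced in a few lines. This is a minimal sketch: the dictionary `pmf` holds the table values from the example, while the helper name `cdf_from_pmf` is an illustrative choice and not something from the lecture.

```python
from itertools import accumulate

# PMF of X = number of bits in error among the next 4 transmitted
# (values copied from the digital channel table).
pmf = {0: 0.6561, 1: 0.2916, 2: 0.0486, 3: 0.0036, 4: 0.0001}


def cdf_from_pmf(pmf):
    """Return {x: P(X <= x)} by accumulating the PMF over sorted x values."""
    xs = sorted(pmf)
    running_totals = accumulate(pmf[x] for x in xs)
    return dict(zip(xs, running_totals))


cdf = cdf_from_pmf(pmf)

# P(X <= 3) is read directly from the CDF.
p_at_most_3 = cdf[3]
# P(X = 3) is the jump of the CDF at x = 3.
p_exactly_3 = cdf[3] - cdf[2]

print(round(p_at_most_3, 4), round(p_exactly_3, 4))  # prints 0.9999 0.0036
```

The subtraction in the last step is the "reverse" direction named in the lecture objectives: each PMF value is recovered as the difference between consecutive CDF values.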
Cumulative Distribution Function Properties
The cumulative distribution function is built from the probability mass function and vice versa.
The cumulative distribution function of a discrete random variable X, denoted as F(x), is:
$$F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$$
For a discrete random variable X, F(x) satisfies the following properties:
(1) $F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$
(2) $0 \le F(x) \le 1$
(3) If $x \le y$, then $F(x) \le F(y)$

Example: Cumulative Distribution Function
Determine the probability mass function of X from this cumulative distribution function:
$$F(x) = \begin{cases} 0.0, & x < -2 \\ 0.2, & -2 \le x < 0 \\ 0.7, & 0 \le x < 2 \\ 1.0, & 2 \le x \end{cases}$$
Each jump in F(x) gives the probability at that point, so the PMF is:
f(-2) = 0.2 - 0.0 = 0.2
f(0) = 0.7 - 0.2 = 0.5
f(2) = 1.0 - 0.7 = 0.3
Figure 3-3 Graph of the CDF.

Example: Sampling without Replacement
A day’s production of 850 parts contains 50 defective parts. Two parts are selected at random without replacement. Let the random variable X equal the number of defective parts in the sample. Create the CDF of X.
$$P(X = 0) = \frac{800}{850} \cdot \frac{799}{849} = 0.886$$
$$P(X = 1) = 2 \cdot \frac{800}{850} \cdot \frac{50}{849} = 0.111$$
$$P(X = 2) = \frac{50}{850} \cdot \frac{49}{849} = 0.003$$
Therefore,
F(0) = P(X ≤ 0) = 0.886
F(1) = P(X ≤ 1) = 0.997
F(2) = P(X ≤ 2) = 1.000
Figure 3-4 CDF. Note that F(x) is defined for all x, $-\infty < x < \infty$, not just 0, 1 and 2.

Continuous Density Functions
Density functions, in contrast to mass functions, distribute probability continuously along an interval. The loading on the beam between points a and b is the integral of the function between points a and b.
Figure 4-1 Density function as a loading on a long, thin beam. Most of the load occurs at the larger values of x.
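The beam analogy can be checked numerically. The sketch below assumes a hypothetical density, f(x) = 2x on [0, 1], chosen only because it puts most of its load at the larger values of x; it is not a function from the lecture. The names `integrate` and `density` are likewise illustrative, and the midpoint rule stands in for whatever integration method one prefers.

```python
def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def density(x):
    # Hypothetical density (an assumption for illustration): f(x) = 2x on [0, 1].
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


# The whole beam carries a total load (probability) of 1.
total = integrate(density, 0.0, 1.0)

# Load between two points = integral of the density between those points.
lower_half = integrate(density, 0.0, 0.5)
upper_half = integrate(density, 0.5, 1.0)

print(f"total={total:.4f} lower={lower_half:.4f} upper={upper_half:.4f}")
```

For this assumed density the upper half of the interval carries three times the load of the lower half, which mirrors the shape described for Figure 4-1.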
Continuous Density Functions
A probability density function f(x) describes the probability distribution of a continuous random variable. It is analogous to the beam loading.
Figure 4-2 Probability is determined from the area under f(x) from a to b.

Probability Density Function
For a continuous random variable X, a probability density function is a function such that:
(1) $f(x) \ge 0$, which means that the function is always non-negative.
(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
(3) $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$ = area under f(x) from a to b.
(4) $P(X = x) = 0$, which means there is no area exactly at x.

Histograms
A histogram is a graphical display of data showing a series of adjacent rectangles. Each rectangle has a base which represents an interval of data values. The height of the rectangle creates an area which represents the relative frequency associated with the values included in the base.
A continuous probability distribution f(x) is a model approximating a histogram. Each bar has the same area as the integral of f(x) between the limits of that bar.
Figure 4-3 Histogram approximates a probability density function.

Area of a Point
If X is a continuous random variable, for any $x_1$ and $x_2$,
$$P(x_1 \le X \le x_2) = P(x_1 < X \le x_2) = P(x_1 \le X < x_2) = P(x_1 < X < x_2) \quad (4\text{-}2)$$
which implies that P(X = x) = 0.
From another perspective: as $x_1$ approaches $x_2$, the area (probability) becomes smaller and smaller. When $x_1$ equals $x_2$, the area (probability) is zero.
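One way to see why the four probabilities in (4-2) coincide is sketched below. It uses only the area rule $P(a \le X \le b) = \int_a^b f(x)\,dx$ with a = b, plus additivity over mutually exclusive events; the symbol u is just a dummy integration variable.

```latex
\begin{align*}
P(X = x) &= \int_{x}^{x} f(u)\,du = 0, \\
P(x_1 \le X \le x_2) &= P(X = x_1) + P(x_1 < X \le x_2) \\
                     &= 0 + P(x_1 < X \le x_2).
\end{align*}
```

Dropping the endpoint $x_2$ works the same way, so including or excluding either endpoint does not change the probability of an interval for a continuous random variable.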
Example: Hole Diameter
Let the continuous random variable X denote the diameter of a hole drilled in a sheet metal component. The target diameter is 12.5 mm. Random disturbances to the process result in larger diameters. Historical data show that the distribution of X can be modeled by $f(x) = 20e^{-20(x-12.5)}$, x ≥ 12.5 mm. If a part with a diameter larger than 12.60 mm is scrapped, what proportion of parts is scrapped?
Figure 4-5
$$P(X > 12.60) = \int_{12.6}^{\infty} 20e^{-20(x-12.5)}\,dx = e^{-20(12.6-12.5)} = e^{-2} = 0.135$$

Cumulative Distribution Functions
The cumulative distribution function of a continuous random variable X is
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du \quad \text{for } -\infty < x < \infty \quad (4\text{-}3)$$

Example: Electric Current
For the copper wire current measurement in Exercise 4-1, the cumulative distribution function (CDF) consists of three expressions to cover the entire real number line.
$$F(x) = \begin{cases} 0, & x < 0 \\ 0.05x, & 0 \le x \le 20 \\ 1, & 20 < x \end{cases}$$
Figure 4-6 This graph shows the CDF as a continuous function.

Example: Hole Diameter
For the drilling operation in Example 4-2, F(x) consists of two expressions. This shows the proper notation.
$$F(x) = 0 \quad \text{for } x < 12.5$$
$$F(x) = \int_{12.5}^{x} 20e^{-20(u-12.5)}\,du = 1 - e^{-20(x-12.5)} \quad \text{for } x \ge 12.5$$
Figure 4-7 This graph shows F(x) as a continuous function.

Density vs. Cumulative Functions
The probability density function (PDF) is the derivative of the cumulative distribution function (CDF). The cumulative distribution function (CDF) is the integral of the probability density function (PDF).
Given F(x),
$$f(x) = \frac{dF(x)}{dx}$$
as long as the derivative exists.

Exercise: Reaction Time
The time until a chemical reaction is complete (in milliseconds, ms) is approximated by this CDF:
$$F(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-0.01x}, & 0 \le x \end{cases}$$
What is the PDF? What proportion of reactions is complete within 200 ms?

Exercise: Reaction Time (solution)
The PDF is the derivative of the CDF:
$$f(x) = \frac{dF(x)}{dx} = \begin{cases} \dfrac{d}{dx}\,0 = 0, & x < 0 \\ \dfrac{d}{dx}\left(1 - e^{-0.01x}\right) = 0.01e^{-0.01x}, & 0 \le x \end{cases}$$
The proportion of reactions complete within 200 ms is:
$$P(X < 200) = F(200) = 1 - e^{-2} = 0.8647$$

Next Week
Most commonly used probability functions.

References
Douglas C. Montgomery, George C. Runger, Applied Statistics and Probability for Engineers, John Wiley & Sons, Inc.