5 Joint Probability Distributions and Random Samples

5.1 Jointly Distributed Random Variables
5.2 Expected Values, Covariance, and Correlation
5.3 Statistics and Their Distributions
5.4 The Distribution of the Sample Mean
5.5 The Distribution of a Linear Combination
Supplementary Exercises
Bibliography

Introduction

In this chapter, we first discuss probability models for the joint behavior of several random variables, putting special emphasis on the case in which the variables are independent of one another. We then study expected values of functions of several random variables, including covariance and correlation as measures of the degree of association between two variables.

5.1 Jointly Distributed Random Variables

There are many experimental situations in which more than one random variable (rv) will be of interest to an investigator. We shall first consider joint probability distributions for two discrete rv's, then for two continuous rv's.

The Joint Probability Mass Function for Two Discrete Random Variables

The probability mass function (pmf) of a single discrete rv X specifies how much probability mass is placed on each x value. The joint pmf of two discrete rv's X and Y describes how much probability mass is placed on each possible pair of values (x, y).

Example 1  To study the joint distribution of people's height and weight, let X = (X1, X2), where X1 is a person's height and X2 is the person's weight.

Example 2  When two fair dice are rolled in an honest manner, let Y = (Y1, Y2), where Y1 is the number shown on the first die and Y2 is the number shown on the second die.

Example 3  A box has six tickets, labeled 1 to 6. Two tickets are selected from the box by sampling without replacement. Let X = (X1, X2), where X1 and X2 denote the labels of the first and second tickets selected.

DEFINITION  Let X and Y be two discrete rv's defined on the sample space S of an experiment. The joint probability mass function p(x, y) is defined for each pair of numbers (x, y) by
$$p(x, y) = P(X = x \text{ and } Y = y)$$
Let A be any set consisting of pairs of (x, y) values. Then the probability $P[(X, Y) \in A]$ is obtained by summing the joint pmf over pairs in A:
$$P[(X, Y) \in A] = \sum_{(x, y) \in A} p(x, y)$$
A function p(x, y) can be used as a joint pmf provided that $p(x, y) \ge 0$ for all x and y and $\sum_x \sum_y p(x, y) = 1$.

Example  When two fair dice are rolled in an honest manner, let Y1 be the number shown on the first die and Y2 the number shown on the second. Each of the 36 pairs is equally likely, so the joint pmf is p(y1, y2) = 1/36 for y1 = 1, ..., 6 and y2 = 1, ..., 6.

Example 5.1  Let X = the deductible amount on an auto policy and Y = the deductible amount on a homeowner's policy. The joint pmf is

  p(x, y)     y = 0    y = 100    y = 200
  x = 100      .20       .10        .20
  x = 250      .05       .15        .30

DEFINITION  The marginal probability mass functions of X and of Y, denoted by pX(x) and pY(y), respectively, are given by
$$p_X(x) = \sum_y p(x, y) \qquad p_Y(y) = \sum_x p(x, y)$$
The pmf of one of the variables alone is obtained by summing p(x, y) over values of the other variable. The result is called a marginal pmf because, when the p(x, y) values appear in a rectangular table, the sums are just the marginal (row or column) totals.

In the two-dice example, every row total and every column total of the joint table equals 1/6, so the marginal pmfs are pY1(y1) = 1/6 for y1 = 1, ..., 6 and pY2(y2) = 1/6 for y2 = 1, ..., 6.
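As a small added illustration (not part of the original notes), the marginal pmfs of Example 5.1 can be computed mechanically from the joint table by row and column sums. The sketch below is in Python; the dictionary layout and the function name marginals are illustrative choices, not anything fixed by the text.

```python
# Minimal sketch: marginal pmfs from a joint pmf table (Example 5.1 values).
from collections import defaultdict

joint_pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def marginals(pmf):
    """Sum the joint pmf over the other variable to get pX and pY."""
    p_x, p_y = defaultdict(float), defaultdict(float)
    for (x, y), p in pmf.items():
        p_x[x] += p
        p_y[y] += p
    return dict(p_x), dict(p_y)

p_x, p_y = marginals(joint_pmf)
print(p_x)   # {100: 0.5, 250: 0.5}        (up to floating-point rounding)
print(p_y)   # {0: 0.25, 100: 0.25, 200: 0.5}
```

The row totals reproduce the marginal pmf of X derived by hand in Example 5.2 below.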
Example 5.2 (Example 5.1 continued)  The possible X values are x = 100 and x = 250, so computing row totals in the joint probability table yields
$$p_X(100) = p(100, 0) + p(100, 100) + p(100, 200) = .50$$
$$p_X(250) = p(250, 0) + p(250, 100) + p(250, 200) = .50$$
The marginal pmf of X is then
$$p_X(x) = \begin{cases} .5 & x = 100, 250 \\ 0 & \text{otherwise} \end{cases}$$
Similarly, the marginal pmf of Y is obtained from the column totals as
$$p_Y(y) = \begin{cases} .25 & y = 0, 100 \\ .50 & y = 200 \\ 0 & \text{otherwise} \end{cases}$$
so $P(Y \ge 100) = p_Y(100) + p_Y(200) = .75$, as before.

The Joint Probability Density Function for Two Continuous Random Variables

DEFINITION  Let X and Y be continuous rv's. Then f(x, y) is the joint probability density function for X and Y if for any two-dimensional set A
$$P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$$
In particular, if A is the two-dimensional rectangle $\{(x, y): a \le x \le b,\ c \le y \le d\}$, then
$$P[(X, Y) \in A] = P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx$$
For f(x, y) to be a candidate for a joint pdf, it must satisfy (1) $f(x, y) \ge 0$ and (2) $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$.

We can think of f(x, y) as specifying a surface at height f(x, y) above the point (x, y) in a three-dimensional coordinate system. Then $P[(X, Y) \in A]$ is the volume underneath this surface and above the region A, analogous to the area under a curve in the one-dimensional case. This is illustrated in Figure 5.1.

[Figure 5.1  P[(X, Y) ∈ A] = volume under the density surface f(x, y) above the shaded rectangle A]

Example 5.3  Suppose the joint pdf of (X, Y) is given by
$$f(x, y) = \begin{cases} \tfrac{6}{5}(x + y^2) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$
To verify that this is a legitimate pdf, note that $f(x, y) \ge 0$ and
$$\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = \int_0^1\!\!\int_0^1 \tfrac{6}{5}x\,dx\,dy + \int_0^1\!\!\int_0^1 \tfrac{6}{5}y^2\,dx\,dy = \tfrac{6}{10} + \tfrac{6}{15} = 1$$
Also,
$$P\!\left(0 \le X \le \tfrac14,\ 0 \le Y \le \tfrac14\right) = \int_0^{1/4}\!\!\int_0^{1/4} \tfrac{6}{5}(x + y^2)\,dx\,dy = .0109$$

Example  The pdf of the vector (X, Y) is given by
$$f(x, y) = \begin{cases} c\,e^{-3(x + y)} & x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$$
Determine (1) the constant c and (2) the probability $P\{(X, Y) \in G\}$, where G is the triangular region in the first quadrant below the line x + y = 1.

Solution  (1) Since the joint pdf must integrate to 1,
$$c \int_0^{\infty} e^{-3x}\,dx \int_0^{\infty} e^{-3y}\,dy = \frac{c}{9} = 1, \quad\text{so } c = 9$$
(2)
$$P\{(X, Y) \in G\} = \int_0^1\!\!\int_0^{1 - x} 9e^{-3(x + y)}\,dy\,dx = \int_0^1 3e^{-3x}\left[1 - e^{-3(1 - x)}\right]dx = 1 - 4e^{-3} \approx .80$$

Example  The pdf of the vector (X, Y), whose components are positive random variables, is given by
$$f(x, y) = y\,e^{-(x + y)}, \quad x > 0,\ y > 0$$
Determine the probability that Y > X.

Solution
$$P(Y > X) = \int_0^{\infty}\!\!\int_0^{y} y\,e^{-(x + y)}\,dx\,dy = \int_0^{\infty} y\,e^{-y}\left(1 - e^{-y}\right)dy = \tfrac34$$

DEFINITION  The marginal probability density functions of X and Y, denoted by fX(x) and fY(y), respectively, are given by
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{for } -\infty < x < \infty$$
$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \quad \text{for } -\infty < y < \infty$$
As with joint pmf's, each of the two marginal density functions can be computed from the joint pdf of X and Y.
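The calculations in Example 5.3 can also be checked numerically. The sketch below is an added illustration using scipy.integrate; it verifies that the density integrates to 1, reproduces the rectangle probability .0109, and evaluates the marginal density at x = 0.5 (which, integrating the definition above over y, works out to (6/5)x + 2/5 on [0, 1]).

```python
# Numerical check of Example 5.3 (illustrative sketch, not from the notes).
from scipy.integrate import dblquad, quad

# dblquad integrates func(y, x): y is the inner variable, x the outer one.
f = lambda y, x: 1.2 * (x + y**2)          # joint pdf on the unit square

total, _ = dblquad(f, 0, 1, 0, 1)          # should be 1
rect, _ = dblquad(f, 0, 0.25, 0, 0.25)     # P(0 <= X <= 1/4, 0 <= Y <= 1/4)
fx_half, _ = quad(lambda y: 1.2 * (0.5 + y**2), 0, 1)   # marginal f_X(0.5)

print(round(total, 4), round(rect, 4), round(fx_half, 4))
# 1.0 0.0109 1.0   (f_X(0.5) = 1.2*0.5 + 0.4 = 1.0)
```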
Example  The pdf of X and Y is given by
$$f(x, y) = x\,e^{-x(y + 1)}, \quad x > 0,\ y > 0$$
Determine the marginal density functions.

Solution
$$f_X(x) = \int_0^{\infty} x\,e^{-x(y + 1)}\,dy = e^{-x}, \quad x > 0$$
$$f_Y(y) = \int_0^{\infty} x\,e^{-x(y + 1)}\,dx = \frac{1}{(y + 1)^2}, \quad y > 0$$

Example  Suppose the pdf of the random vector (X, Y) is given by
$$f(x, y) = \begin{cases} x^2 + \dfrac{xy}{3} & 0 \le x \le 1,\ 0 \le y \le 2 \\ 0 & \text{otherwise} \end{cases}$$
Determine (1) the marginal density functions fX(x) and fY(y), and (2) the probabilities P{X + Y > 1} and P{Y > X}.

Solution  (1) The marginal density functions are
$$f_X(x) = \int_0^2 \left(x^2 + \frac{xy}{3}\right)dy = 2x^2 + \frac{2}{3}x, \quad 0 \le x \le 1$$
(and 0 otherwise). Similarly,
$$f_Y(y) = \int_0^1 \left(x^2 + \frac{xy}{3}\right)dx = \frac{1}{3} + \frac{y}{6}, \quad 0 \le y \le 2$$
(and 0 otherwise).
(2) Integrating the joint pdf over the part of the support lying above the line x + y = 1,
$$P\{X + Y > 1\} = \int_0^1\!\!\int_{1 - x}^{2} \left(x^2 + \frac{xy}{3}\right)dy\,dx = \frac{65}{72}$$
and over the part of the support lying above the line y = x,
$$P\{Y > X\} = \int_0^1\!\!\int_{x}^{2} \left(x^2 + \frac{xy}{3}\right)dy\,dx = \frac{17}{24}$$

Exercise 1  The probability density function of the continuous random vector (X, Y) is given by
$$f(x, y) = \begin{cases} 2x & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$
Determine the marginal density functions fX(x) and fY(y).

Exercise 2  The probability density function of the continuous random vector (X, Y) is given by
$$f(x, y) = \begin{cases} 1/2 & x + y \le 2,\ x \ge 0,\ y \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
a. Determine the marginal density functions fX(x) and fY(y).
b. Calculate the probability that X > 2Y.

Independent Random Variables

In Chapter 2 we pointed out that one way of defining independence of two events A and B is to require that $P(A \cap B) = P(A) \cdot P(B)$. We now use an analogous definition for the independence of two rv's.

DEFINITION  Two random variables X and Y are said to be independent if for every pair of x and y values
$$p(x, y) = p_X(x) \cdot p_Y(y) \quad \text{when X and Y are discrete}$$
or
$$f(x, y) = f_X(x) \cdot f_Y(y) \quad \text{when X and Y are continuous}$$
If this condition is not satisfied for all (x, y), then X and Y are said to be dependent.

Example 5.6  In the insurance situation of Examples 5.1 and 5.2,
$$p(100, 100) = .10 \ne (.5)(.25) = p_X(100) \cdot p_Y(100)$$
so X and Y are not independent.

Example 5.8  Suppose that the lifetimes of two components are independent of one another and that the first lifetime, X1, has an exponential distribution with parameter λ1, whereas the second, X2, has an exponential distribution with parameter λ2. Then the joint pdf is
$$f(x_1, x_2) = f_{X_1}(x_1) \cdot f_{X_2}(x_2) = \begin{cases} \lambda_1 e^{-\lambda_1 x_1} \cdot \lambda_2 e^{-\lambda_2 x_2} & x_1 > 0,\ x_2 > 0 \\ 0 & \text{otherwise} \end{cases}$$
Let λ1 = 1/1000 and λ2 = 1/1200, so that the expected lifetimes are 1000 hours and 1200 hours, respectively. The probability that both component lifetimes are at least 1500 hours is
$$P(1500 \le X_1,\ 1500 \le X_2) = P(1500 \le X_1) \cdot P(1500 \le X_2) = e^{-\lambda_1(1500)} \cdot e^{-\lambda_2(1500)} = (.2231)(.2865) = .0639$$

Example  Three white balls and three red balls are placed at random into three boxes labeled 1, 2, 3. Let X denote the number of white balls in the first box and Y the number of red balls in the second box. Determine the joint probability distribution of (X, Y).

Because the white balls and the red balls are placed independently, X and Y are independent, and each has a Binomial(3, 1/3) distribution, so
$$p_{ij} = P\{X = i,\ Y = j\} = P\{X = i\} \cdot P\{Y = j\} = \binom{3}{i}\left(\frac13\right)^{i}\left(\frac23\right)^{3 - i} \binom{3}{j}\left(\frac13\right)^{j}\left(\frac23\right)^{3 - j}$$

  X \ Y       0        1        2        3
  0        64/729   96/729   48/729    8/729
  1        96/729    16/81     8/81   12/729
  2        48/729     8/81     4/81    6/729
  3         8/729   12/729    6/729    1/729
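Because X and Y in the example above are independent Binomial(3, 1/3) variables, the joint table is simply the outer product of the two marginal pmfs. The short Python sketch below is an added illustration (exact rational arithmetic; the helper name binom_pmf is just a local choice) that reconstructs the table and checks that it sums to 1.

```python
# Joint pmf of two independent Binomial(3, 1/3) variables (illustrative sketch).
from fractions import Fraction
from math import comb

def binom_pmf(k, n=3, p=Fraction(1, 3)):
    """Exact binomial pmf using rational arithmetic."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

joint = {(i, j): binom_pmf(i) * binom_pmf(j) for i in range(4) for j in range(4)}

print(joint[(0, 0)], joint[(1, 1)], joint[(3, 3)])   # 64/729 16/81 1/729
print(sum(joint.values()))                            # 1
```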
Example 5  Put three indistinguishable balls at random into three boxes labeled 1, 2, 3. Let X denote the number of balls in the first box and Y the number of balls in the second box. Determine the joint probability distribution of (X, Y).

Solution  Condition on the number of balls in the second box:
$$p_{ij} = P\{X = i,\ Y = j\} = P\{X = i \mid Y = j\} \cdot P\{Y = j\}, \quad 0 \le i + j \le 3$$
where
$$P\{Y = j\} = \binom{3}{j}\left(\frac13\right)^{j}\left(\frac23\right)^{3 - j}, \quad 0 \le j \le 3$$
and, since each of the remaining 3 − j balls is equally likely to go into the first or the third box,
$$P\{X = i \mid Y = j\} = \binom{3 - j}{i}\left(\frac12\right)^{i}\left(\frac12\right)^{3 - j - i}, \quad 0 \le i + j \le 3$$
Multiplying and simplifying gives
$$p_{ij} = \frac{1}{27} \cdot \frac{3!}{i!\,j!\,(3 - i - j)!}, \quad 0 \le i + j \le 3$$
so the joint probability distribution of (X, Y) is

  X \ Y      0      1      2      3
  0        1/27    1/9    1/9   1/27
  1         1/9    2/9    1/9     0
  2         1/9    1/9     0      0
  3        1/27     0      0      0

Exercise 1  Two cards are drawn from a special deck consisting of two hearts and two diamonds, and they are placed face down in front of us. The two cards are then turned over and their suits are observed. Let X1 and X2 be defined as follows:
X1 = 1 if the card on our left is a heart, 0 if it is a diamond
X2 = 1 if the card on our right is a heart, 0 if it is a diamond
Determine the joint probability distribution of X = (X1, X2).

  X1 \ X2     0      1
  0          1/6    1/3
  1          1/3    1/6

5.2 Expected Values, Covariance, and Correlation

PROPOSITION  Let X and Y be jointly distributed rv's with pmf p(x, y) or pdf f(x, y), according to whether the variables are discrete or continuous. Then the expected value of a function h(X, Y), denoted by E[h(X, Y)] or μ_h(X,Y), is given by
$$E[h(X, Y)] = \begin{cases} \displaystyle\sum_x \sum_y h(x, y)\,p(x, y) & \text{if X and Y are discrete} \\[2mm] \displaystyle\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} h(x, y)\,f(x, y)\,dx\,dy & \text{if X and Y are continuous} \end{cases}$$

Example 5.13  Five friends have purchased tickets to a certain concert. If the tickets are for seats 1-5 in a particular row and the tickets are randomly distributed among the five, what is the expected number of seats separating any particular two of the five?
Let X and Y denote the seat numbers of the first and second individuals, respectively. Possible (X, Y) pairs are {(1, 2), (1, 3), ..., (5, 4)}, and the joint pmf of (X, Y) is
$$p(x, y) = \begin{cases} \dfrac{1}{20} & x = 1, \ldots, 5;\ y = 1, \ldots, 5;\ x \ne y \\ 0 & \text{otherwise} \end{cases}$$
The number of seats separating the two individuals is h(X, Y) = |X − Y| − 1. The accompanying table gives h(x, y) for each possible (x, y) pair.

  h(x, y)   x = 1   x = 2   x = 3   x = 4   x = 5
  y = 1      --       0       1       2       3
  y = 2       0      --       0       1       2
  y = 3       1       0      --       0       1
  y = 4       2       1       0      --       0
  y = 5       3       2       1       0      --

Thus
$$E[h(X, Y)] = \sum_{(x, y)} h(x, y)\,p(x, y) = \sum_{x = 1}^{5}\,\sum_{\substack{y = 1 \\ y \ne x}}^{5} (|x - y| - 1) \cdot \frac{1}{20} = 1$$

Covariance

When two random variables X and Y are not independent, it is frequently of interest to assess how strongly they are related to one another.

DEFINITION  The covariance between two rv's X and Y is
$$\operatorname{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \begin{cases} \displaystyle\sum_x \sum_y (x - \mu_X)(y - \mu_Y)\,p(x, y) & \text{X, Y discrete} \\[2mm] \displaystyle\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\,f(x, y)\,dx\,dy & \text{X, Y continuous} \end{cases}$$

Example 5.15  The joint and marginal pmf's for X = automobile policy deductible amount and Y = homeowner policy deductible amount in Example 5.1 were

  p(x, y)     y = 0    y = 100    y = 200    pX(x)
  x = 100      .20       .10        .20       .5
  x = 250      .05       .15        .30       .5
  pY(y)        .25       .25        .50

from which μX = Σ x pX(x) = 175 and μY = 125. Therefore,
$$\operatorname{Cov}(X, Y) = \sum_{(x, y)} (x - 175)(y - 125)\,p(x, y) = (100 - 175)(0 - 125)(.20) + \cdots + (250 - 175)(200 - 125)(.30) = 1875$$

The following shortcut formula for Cov(X, Y) simplifies the computations.

PROPOSITION
$$\operatorname{Cov}(X, Y) = E(XY) - \mu_X \cdot \mu_Y$$
Further properties of covariance:
$$\operatorname{Cov}(X, X) = V(X) \qquad \operatorname{Cov}(X, Y) = \operatorname{Cov}(Y, X) \qquad \operatorname{Cov}(aX + b, Y) = a\operatorname{Cov}(X, Y)$$
$$\operatorname{Cov}(X, Y + Z) = \operatorname{Cov}(X, Y) + \operatorname{Cov}(X, Z)$$
$$V(X \pm Y) = \operatorname{Cov}(X \pm Y,\ X \pm Y) = V(X) + V(Y) \pm 2\operatorname{Cov}(X, Y)$$
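As a numerical check of Example 5.15 and of the shortcut formula (an added sketch, not from the original notes), the covariance of the insurance deductibles can be computed from the joint table both from the definition and from E(XY) − μXμY:

```python
# Covariance for the insurance deductibles of Example 5.15 (illustrative sketch).
joint_pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

mu_x = sum(x * p for (x, y), p in joint_pmf.items())    # 175
mu_y = sum(y * p for (x, y), p in joint_pmf.items())    # 125

# Definition: E[(X - mu_X)(Y - mu_Y)]
cov_def = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint_pmf.items())
# Shortcut: E(XY) - mu_X * mu_Y
e_xy = sum(x * y * p for (x, y), p in joint_pmf.items())
cov_short = e_xy - mu_x * mu_y

print(mu_x, mu_y, cov_def, cov_short)   # about 175, 125, 1875, 1875
```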
COROLLARY  If the Xi's are independent, then Cov(Xi, Xj) = 0 for i ≠ j, and
$$V\!\left(\sum_{i = 1}^{n} X_i\right) = \sum_{i = 1}^{n} V(X_i)$$

Example 5.16  The joint and marginal pdf's of X = amount of almonds and Y = amount of cashews were
$$f(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases}$$
$$f_X(x) = \begin{cases} 12x(1 - x)^2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
with fY(y) obtained by replacing x by y in fX(x). It is easily verified that μX = μY = 2/5, and
$$E(XY) = \int_0^1\!\!\int_0^{1 - x} xy \cdot 24xy\,dy\,dx = \int_0^1 8x^2(1 - x)^3\,dx = \frac{2}{15}$$
Thus Cov(X, Y) = 2/15 − (2/5)² = 2/15 − 4/25 = −2/75.

Example  Suppose a random variable X ~ Bin(12, 0.5), another random variable Y ~ N(0, 1), and Cov(X, Y) = −1. Find the variances and the covariance of V = 4X + 3Y + 1 and W = −2X + 4Y.
$$\operatorname{Var}(V) = \operatorname{Var}(4X + 3Y + 1) = 16\operatorname{Var}(X) + 9\operatorname{Var}(Y) + 24\operatorname{Cov}(X, Y)$$
$$\operatorname{Var}(W) = \operatorname{Var}(-2X + 4Y) = 4\operatorname{Var}(X) + 16\operatorname{Var}(Y) - 16\operatorname{Cov}(X, Y)$$
$$\operatorname{Cov}(V, W) = E(VW) - E(V)E(W) = E[(4X + 3Y + 1)(-2X + 4Y)] - E(4X + 3Y + 1) \cdot E(-2X + 4Y)$$
With Var(X) = 12(.5)(.5) = 3 and Var(Y) = 1, these give Var(V) = 48 + 9 − 24 = 33 and Var(W) = 12 + 16 + 16 = 44; expanding the covariance by bilinearity, Cov(V, W) = −8Var(X) + 12Var(Y) + 10Cov(X, Y) = −22.

Example  Let X have the binomial distribution with parameters n and p; determine V(X). Solution: a binomial rv is a sum of n independent Bernoulli rv's. A Bernoulli rv Xi takes the value 1 with probability p and 0 with probability q = 1 − p, so
$$E(X_i) = p, \quad E(X_i^2) = p, \quad V(X_i) = p - p^2 = p(1 - p) = pq$$
Writing X = X1 + X2 + ... + Xn with X1, X2, ..., Xn independent,
$$V(X) = V(X_1) + V(X_2) + \cdots + V(X_n) = npq$$

Correlation

DEFINITION  The correlation coefficient of X and Y, denoted by Corr(X, Y), ρ_{X,Y}, or just ρ, is defined by
$$\rho_{X, Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}$$

Example 5.17  It is easily verified that in the insurance problem of Example 5.15, E(X²) = 36,250, σ²X = 36,250 − (175)² = 5625, σX = 75, E(Y²) = 22,500, σ²Y = 6875, and σY = 82.92. This gives
$$\rho = \frac{1875}{(75)(82.92)} = .301$$

PROPOSITION
1. If a and c are either both positive or both negative, Corr(aX + b, cY + d) = Corr(X, Y).
2. For any two rv's X and Y, −1 ≤ Corr(X, Y) ≤ 1.

PROPOSITION
1. If X and Y are independent, then ρ = 0, but ρ = 0 does not imply independence.
2. ρ = 1 or −1 if and only if Y = aX + b for some numbers a and b with a ≠ 0.

Example 5.18  Let X and Y be discrete rv's with joint pmf
$$p(x, y) = \begin{cases} \dfrac14 & (x, y) = (-4, 1), (4, -1), (2, 2), (-2, -2) \\ 0 & \text{otherwise} \end{cases}$$
The value of Y is completely determined by the value of X, so the two variables are completely dependent. However, ρ_{XY} = 0. Although there is perfect dependence, there is also a complete absence of any linear relationship!

5.3 Statistics and Their Distributions

The observations in a single sample were denoted in Chapter 1 by x1, x2, ..., xn. Consider selecting two different samples of size n from the same population distribution. The xi's in the second sample will virtually always differ at least a bit from those in the first sample.

For example, a first sample of n = 3 cars of a particular type might result in fuel efficiencies x1 = 30.7, x2 = 29.4, x3 = 31.1, whereas a second sample may give x1 = 28.8, x2 = 30.0, and x3 = 31.1. Before we obtain data, there is uncertainty about the value of each xi. Because of this uncertainty, before the data becomes available we view each observation as a random variable and denote the sample by X1, X2, ..., Xn (uppercase letters for random variables).

This variation in observed values in turn implies that the value of any function of the sample observations, such as the sample mean, sample standard deviation, or sample fourth spread, also varies from sample to sample. That is, prior to obtaining x1, ..., xn, there is uncertainty as to the value of x̄, the value of s, and so on.

DEFINITION  A statistic is any quantity whose value can be calculated from sample data. Prior to obtaining data, there is uncertainty as to what value of any particular statistic will result. Therefore, a statistic is a random variable and will be denoted by an uppercase letter; a lowercase letter is used to represent the calculated or observed value of the statistic.
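To make the sample-to-sample variation concrete, the two n = 3 fuel-efficiency samples quoted above give different observed values of the statistics X̄ and S. A minimal Python sketch (added here purely as an illustration, using the data from the text):

```python
# Observed values of the statistics x-bar and s for the two samples above.
from statistics import mean, stdev

sample1 = [30.7, 29.4, 31.1]
sample2 = [28.8, 30.0, 31.1]

for sample in (sample1, sample2):
    print(f"x-bar = {mean(sample):.2f}, s = {stdev(sample):.2f}")
# x-bar = 30.40, s = 0.89
# x-bar = 29.97, s = 1.15
```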
Random Samples

The probability distribution of any particular statistic depends not only on the population distribution (normal, uniform, etc.) and the sample size n, but also on the method of sampling.

DEFINITION  The rv's X1, X2, ..., Xn are said to form a (simple) random sample of size n if
1. The Xi's are independent rv's.
2. Every Xi has the same probability distribution.
That is, the Xi's are independent and identically distributed (i.i.d.).

Deriving the Sampling Distribution of a Statistic

Probability rules can be used to obtain the distribution of a statistic provided that it is a "fairly simple" function of the Xi's and either there are relatively few different X values in the population or else the population distribution has a "nice" form (see Examples 5.20 and 5.21).

Simulation Experiments (omit)

The second method of obtaining information about a statistic's sampling distribution is to perform a simulation experiment. This method is usually used when a derivation via probability rules is too difficult or complicated to be carried out. Such an experiment is virtually always done with the aid of a computer. The following characteristics of an experiment must be specified:
1. The statistic of interest (X̄, S, a particular trimmed mean, etc.)
2. The population distribution (normal with μ = 100 and σ = 15, uniform with lower limit A = 5 and upper limit B = 10, etc.)
3. The sample size n (e.g., n = 10 or n = 50)
4. The number of replications k (e.g., k = 500)
Then use a computer to obtain k different random samples, each of size n, from the designated population distribution. For each such sample, calculate the value of the statistic and construct a histogram of the k calculated values. This histogram gives the approximate sampling distribution of the statistic. The larger the value of k, the better the approximation will tend to be (the actual sampling distribution emerges as k → ∞). In practice, k = 500 or 1000 is usually enough if the statistic is "fairly simple."

Example 5.23  Consider a simulation experiment in which the population distribution is quite skewed. Figure 5.12 shows the density curve for a certain type of electronic control (this is actually a lognormal distribution with E(ln(X)) = 3 and V(ln(X)) = .4). Again the statistic of interest is the sample mean X̄. The experiment utilized 500 replications and considered the same four sample sizes as in Example 5.22. The resulting histograms, along with a normal probability plot from MINITAB for the 500 x̄ values based on n = 30, are shown in Figure 5.13.

[Figure 5.12  Density curve for the simulation experiment of Example 5.23; E(X) = μ = 21.7584, V(X) = σ² = 82.1449]

5.4 The Distribution of the Sample Mean

PROPOSITION  Let X1, X2, ..., Xn be a random sample from a distribution with mean value μ and standard deviation σ. Then
1. $E(\bar{X}) = \mu_{\bar{X}} = \mu$
2. $V(\bar{X}) = \sigma^2_{\bar{X}} = \sigma^2 / n$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
In addition, with T0 = X1 + ... + Xn (the sample total), E(T0) = nμ, V(T0) = nσ², and $\sigma_{T_0} = \sqrt{n}\,\sigma$.
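A simulation experiment of the kind described above can be used to check this proposition. The sketch below is an added illustration with numpy, drawing k = 500 samples of size n = 30 from a lognormal population; the log-scale mean 3 and log-scale standard deviation 0.4 are assumptions chosen here because they reproduce the E(X) = 21.76 and V(X) = 82.14 quoted for Figure 5.12.

```python
# Simulation sketch: sampling distribution of X-bar for a skewed lognormal population.
import numpy as np

rng = np.random.default_rng(seed=1)
k, n = 500, 30
samples = rng.lognormal(mean=3.0, sigma=0.4, size=(k, n))
xbars = samples.mean(axis=1)              # 500 realized values of X-bar

print(round(xbars.mean(), 2))             # near mu = 21.76
print(round(xbars.std(ddof=1), 2))        # near sigma/sqrt(n) = 9.06/sqrt(30) = 1.65
# A histogram of xbars approximates the sampling distribution of X-bar.
```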
Example 5.24  In a notched tensile fatigue test on a titanium specimen, the expected number of cycles to first acoustic emission (used to indicate crack initiation) is μ = 28,000, and the standard deviation of the number of cycles is σ = 5000. Let X1, X2, ..., X25 be a random sample of size 25, where each Xi is the number of cycles on a different randomly selected specimen. Then E(X̄) = μ = 28,000, and the expected total number of cycles for the 25 specimens is E(T0) = nμ = 25(28,000) = 700,000. The standard deviations of X̄ and T0 are
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{5000}{\sqrt{25}} = 1000 \qquad \sigma_{T_0} = \sqrt{n}\,\sigma = \sqrt{25}\,(5000) = 25{,}000$$
If the sample size increases to n = 100, E(X̄) is unchanged, but σ_X̄ = 500, half of its previous value (the sample size must be quadrupled to halve the standard deviation of X̄).

The Case of a Normal Population Distribution

Looking back to the simulation experiment of Example 5.22, we see that when the population distribution is normal, each histogram of x̄ values is well approximated by a normal curve. The precise result follows.

PROPOSITION  Let X1, X2, ..., Xn be a random sample from a normal distribution with mean μ and standard deviation σ. Then for any n, X̄ is normally distributed (with mean μ and standard deviation σ/√n), as is T0 (with mean nμ and standard deviation √n σ).

Example 5.25  The time that it takes a randomly selected rat of a certain subspecies to find its way through a maze is a normally distributed rv with μ = 1.5 min and σ = .35 min. Suppose five rats are selected. Let X1, X2, ..., X5 denote their times in the maze. Assuming the Xi's to be a random sample from this normal distribution, what is the probability that the total time T0 = X1 + X2 + ... + X5 for the five is between 6 and 8 min?
By the proposition, T0 has a normal distribution with μ_T0 = nμ = 5(1.5) = 7.5 and variance σ²_T0 = nσ² = 5(.1225) = .6125, so σ_T0 = .783. To standardize T0, subtract μ_T0 and divide by σ_T0:
$$P(6 \le T_0 \le 8) = P\!\left(\frac{6 - 7.5}{.783} \le Z \le \frac{8 - 7.5}{.783}\right) = P(-1.92 \le Z \le .64) = \Phi(.64) - \Phi(-1.92) = .7115$$
Determination of the probability that the sample average time X̄ (a normally distributed variable) is at most 2.0 min requires μ_X̄ = μ = 1.5 and σ_X̄ = σ/√n = .35/√5 = .1565. Then
$$P(\bar{X} \le 2.0) = P\!\left(Z \le \frac{2.0 - 1.5}{.1565}\right) = P(Z \le 3.19) = \Phi(3.19) = .9993$$

The Central Limit Theorem

When the Xi's are normally distributed, so is X̄ for every sample size n. The simulation experiment of Example 5.23 suggests that even when the population distribution is highly nonnormal, averaging produces a distribution more bell-shaped than the one being sampled. A reasonable conjecture is that if n is large, a suitable normal curve will approximate the actual distribution of X̄. The formal statement of this result is the most important theorem of probability.

THEOREM (The Central Limit Theorem)  Let X1, X2, ..., Xn be a random sample from a distribution with mean μ and variance σ². Then if n is sufficiently large, X̄ has approximately a normal distribution with $\mu_{\bar{X}} = \mu$ and $\sigma^2_{\bar{X}} = \sigma^2 / n$, and T0 also has approximately a normal distribution with $\mu_{T_0} = n\mu$ and $\sigma^2_{T_0} = n\sigma^2$. The larger the value of n, the better the approximation.
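A quick simulation (an added sketch, not from the original notes) illustrates the CLT in action. The Exponential(1) population and the sample sizes below are arbitrary choices for illustration; as n grows, the sample-mean distribution loses its skew and becomes bell-shaped.

```python
# CLT illustration (added sketch): sample means from a skewed Exponential(1) population.
import numpy as np

rng = np.random.default_rng(seed=2)
k = 5000                                   # number of replications

for n in (1, 5, 30):
    xbars = rng.exponential(scale=1.0, size=(k, n)).mean(axis=1)
    z = (xbars - xbars.mean()) / xbars.std()
    skew = float((z**3).mean())            # 0 for a perfectly normal shape
    print(f"n = {n:2d}   simulated skewness of X-bar = {skew:.2f}")
# The skewness drops toward 0 (roughly like 2/sqrt(n) here), so the distribution of
# X-bar becomes more and more bell-shaped as n grows, as the CLT asserts.
```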
Example 5.26  When a batch of a certain chemical product is prepared, the amount of a particular impurity in the batch is a random variable with mean value 4.0 g and standard deviation 1.5 g. If 50 batches are independently prepared, what is the (approximate) probability that the sample average amount of impurity X̄ is between 3.5 and 3.8 g? According to the rule of thumb to be stated shortly, n = 50 is large enough for the CLT to be applicable. X̄ then has approximately a normal distribution with mean value μ_X̄ = 4.0 and σ_X̄ = 1.5/√50 = .2121, so
$$P(3.5 \le \bar{X} \le 3.8) = P\!\left(\frac{3.5 - 4.0}{.2121} \le Z \le \frac{3.8 - 4.0}{.2121}\right) = \Phi(-.94) - \Phi(-2.36) = .1645$$

Other Applications of the Central Limit Theorem

The CLT can be used to justify the normal approximation to the binomial distribution discussed in Chapter 4. Recall that a binomial variable X is the number of successes in a binomial experiment consisting of n independent success/failure trials with p = P(S) for any particular trial. Define new rv's X1, X2, ..., Xn by
$$X_i = \begin{cases} 1 & \text{if the } i\text{th trial results in a success} \\ 0 & \text{if the } i\text{th trial results in a failure} \end{cases} \quad (i = 1, \ldots, n)$$
Because the trials are independent and P(S) is constant from trial to trial, the Xi's are iid (a random sample from a Bernoulli distribution). The CLT then implies that if n is sufficiently large, both the sum and the average of the Xi's have approximately normal distributions. When the Xi's are summed, a 1 is added for every S that occurs and a 0 for every F, so X1 + ... + Xn = X. The sample mean of the Xi's is X/n, the sample proportion of successes. That is, both X and X/n are approximately normal when n is large. The necessary sample size for this approximation depends on the value of p: when p is close to .5 the Bernoulli distribution is reasonably symmetric, whereas it is quite skewed when p is near 0 or 1. Using the approximation only if both np ≥ 10 and n(1 − p) ≥ 10 ensures that n is large enough to overcome any skewness in the underlying Bernoulli distribution.

[Figure 5.17  Two Bernoulli distributions: (a) p = .4 (reasonably symmetric); (b) p = .1 (very skewed)]

PROPOSITION  Let X1, X2, ..., Xn be a random sample from a distribution for which only positive values are possible [P(Xi > 0) = 1]. Then if n is sufficiently large, the product Y = X1 X2 ··· Xn has approximately a lognormal distribution. To verify this, note that
$$\ln(Y) = \ln(X_1) + \ln(X_2) + \cdots + \ln(X_n)$$
Since ln(Y) is a sum of independent, identically distributed rv's, the CLT implies that it is approximately normal when n is large, so Y itself has approximately a lognormal distribution.

5.5 The Distribution of a Linear Combination

The sample mean X̄ and the sample total T0 are special cases of a type of random variable that arises very frequently in statistical applications.

DEFINITION  Given a collection of n random variables X1, ..., Xn and n numerical constants a1, ..., an, the rv
$$Y = a_1 X_1 + \cdots + a_n X_n = \sum_{i = 1}^{n} a_i X_i$$
is called a linear combination of the Xi's.

PROPOSITION  Let X1, X2, ..., Xn have mean values μ1, ..., μn, respectively, and variances σ1², ..., σn², respectively.
1. Whether or not the Xi's are independent,
$$E(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1 E(X_1) + a_2 E(X_2) + \cdots + a_n E(X_n) = a_1 \mu_1 + \cdots + a_n \mu_n$$
2. If X1, X2, ..., Xn are independent,
$$V(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1^2 V(X_1) + a_2^2 V(X_2) + \cdots + a_n^2 V(X_n) = a_1^2 \sigma_1^2 + \cdots + a_n^2 \sigma_n^2$$
and
$$\sigma_{a_1 X_1 + \cdots + a_n X_n} = \sqrt{a_1^2 \sigma_1^2 + \cdots + a_n^2 \sigma_n^2}$$
3. For any X1, X2, ..., Xn,
$$V(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = \sum_{i = 1}^{n}\sum_{j = 1}^{n} a_i a_j \operatorname{Cov}(X_i, X_j)$$

Example 5.28  A gas station sells three grades of gasoline: regular unleaded, extra unleaded, and super unleaded. These are priced at $1.20, $1.35, and $1.50 per gallon, respectively. Let X1, X2, and X3 denote the amounts of these grades purchased (gallons) on a particular day. Suppose the Xi's are independent with μ1 = 1000, μ2 = 500, μ3 = 300, σ1 = 100, σ2 = 80, and σ3 = 50. The revenue from sales is Y = 1.2X1 + 1.35X2 + 1.5X3, and
$$E(Y) = 1.2\mu_1 + 1.35\mu_2 + 1.5\mu_3 = \$2325$$
$$V(Y) = (1.2)^2 \sigma_1^2 + (1.35)^2 \sigma_2^2 + (1.5)^2 \sigma_3^2 = 31{,}689 \qquad \sigma_Y = \sqrt{31{,}689} = \$178.01$$
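As a check on Example 5.28 (an added sketch), the mean and standard deviation of the revenue follow directly from the proposition; the snippet below also confirms them by simulation, assuming, purely for illustration, that the three daily demands are independent normal rv's with the stated means and standard deviations.

```python
# Mean and sd of a linear combination (Example 5.28), exactly and by simulation.
import numpy as np

a = np.array([1.20, 1.35, 1.50])          # prices per gallon
mu = np.array([1000.0, 500.0, 300.0])     # mean gallons sold
sigma = np.array([100.0, 80.0, 50.0])     # sd of gallons sold

mean_Y = a @ mu                            # 2325.0
sd_Y = np.sqrt((a**2 * sigma**2).sum())    # about 178.01

rng = np.random.default_rng(seed=3)
X = rng.normal(mu, sigma, size=(100_000, 3))   # independent normals (an assumption here)
Y = X @ a
print(mean_Y, round(sd_Y, 2), round(Y.mean(), 1), round(Y.std(ddof=1), 1))
# 2325.0 178.01, with simulated values close to these
```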
The Difference Between Two Random Variables

An important special case of a linear combination results from taking n = 2, a1 = 1, and a2 = −1.

COROLLARY  E(X1 − X2) = E(X1) − E(X2) and, if X1 and X2 are independent, V(X1 − X2) = V(X1) + V(X2).

Example 5.29  A certain automobile manufacturer equips a particular model with either a six-cylinder engine or a four-cylinder engine. Let X1 and X2 be the fuel efficiencies of independently and randomly selected six-cylinder and four-cylinder cars, respectively. With μ1 = 22, μ2 = 26, σ1 = 1.2, and σ2 = 1.5,
$$E(X_1 - X_2) = \mu_1 - \mu_2 = 22 - 26 = -4$$
$$V(X_1 - X_2) = \sigma_1^2 + \sigma_2^2 = (1.2)^2 + (1.5)^2 = 3.69 \qquad \sigma_{X_1 - X_2} = \sqrt{3.69} = 1.92$$
If we relabel so that X1 refers to the four-cylinder car, then E(X1 − X2) = 4, but the variance of the difference is still 3.69.

The Case of Normal Random Variables

PROPOSITION  If X1, X2, ..., Xn are independent, normally distributed rv's (with possibly different means and/or variances), then any linear combination of the Xi's also has a normal distribution. In particular, the difference X1 − X2 between two independent, normally distributed variables is itself normally distributed.

Example 5.30  The total revenue from the sale of the three grades of gasoline on a particular day was Y = 1.2X1 + 1.35X2 + 1.5X3, and we calculated μY = 2325 and (assuming independence) σY = 178.01. If the Xi's are normally distributed, the probability that the revenue exceeds 2500 is
$$P(Y > 2500) = P\!\left(Z > \frac{2500 - 2325}{178.01}\right) = P(Z > .98) = 1 - \Phi(.98) = .1635$$

Proofs for the Case n = 2

For the result concerning expected values, suppose that X1 and X2 are continuous with joint pdf f(x1, x2). Then
$$E(a_1 X_1 + a_2 X_2) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} (a_1 x_1 + a_2 x_2)\,f(x_1, x_2)\,dx_1\,dx_2$$
$$= a_1 \int_{-\infty}^{\infty} x_1 \left[\int_{-\infty}^{\infty} f(x_1, x_2)\,dx_2\right] dx_1 + a_2 \int_{-\infty}^{\infty} x_2 \left[\int_{-\infty}^{\infty} f(x_1, x_2)\,dx_1\right] dx_2$$
$$= a_1 \int_{-\infty}^{\infty} x_1 f_{X_1}(x_1)\,dx_1 + a_2 \int_{-\infty}^{\infty} x_2 f_{X_2}(x_2)\,dx_2 = a_1 E(X_1) + a_2 E(X_2)$$
Summation replaces integration in the discrete case. The argument for the variance result does not require specifying whether either variable is discrete or continuous. Recalling that V(Y) = E[(Y − μY)²],
$$V(a_1 X_1 + a_2 X_2) = E\{[a_1 X_1 + a_2 X_2 - (a_1\mu_1 + a_2\mu_2)]^2\}$$
$$= E\{a_1^2 (X_1 - \mu_1)^2 + a_2^2 (X_2 - \mu_2)^2 + 2a_1 a_2 (X_1 - \mu_1)(X_2 - \mu_2)\}$$
The expression inside the braces is a linear combination of the variables Y1 = (X1 − μ1)², Y2 = (X2 − μ2)², and Y3 = (X1 − μ1)(X2 − μ2), so carrying the E operation through to the three terms gives a1²V(X1) + a2²V(X2) + 2a1a2Cov(X1, X2), as required.
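A final numerical check of Example 5.30 (an added sketch): under the stated normality and independence assumptions, the probability that daily revenue exceeds $2500 can be evaluated directly with scipy.stats.norm, and it agrees with the hand calculation.

```python
# P(Y > 2500) for the normally distributed revenue of Example 5.30 (added sketch).
from math import sqrt
from scipy.stats import norm

mu_Y = 1.2 * 1000 + 1.35 * 500 + 1.5 * 300                        # 2325.0
sd_Y = sqrt(1.2**2 * 100**2 + 1.35**2 * 80**2 + 1.5**2 * 50**2)   # about 178.01

p = norm.sf(2500, loc=mu_Y, scale=sd_Y)    # survival function = 1 - CDF
print(round(p, 4))                          # about .163 (the text's .1635 uses the rounded z = .98)
```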