Lecture 6

In the last lecture we went over discrete random variables. Now we will extend those concepts to continuous random variables.

Continuous random variables

The probability distribution of a continuous random variable is described in terms of a probability density function, $f(x)$. Note that $f(x)$ does not give a probability. It gives probability per unit value of $x$, which is why it is called a probability density function. The probability that $X$ lies between $a$ and $b$ is given by

$$P(a < X < b) = \int_a^b f(x)\,dx$$

A valid density must also satisfy

1) $f(x) \ge 0$
2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Since $P(X = x) = 0$ for a continuous random variable, the following holds true:

$$P(x_1 \le X \le x_2) = P(x_1 < X \le x_2) = P(x_1 \le X < x_2) = P(x_1 < X < x_2)$$

Example

Determine the value of $k$ in the following probability density function. Find the probability that $X$ lies between 0 and 1.

$$f(x) = \begin{cases} k(1 + 2x), & 0 < x < 2 \\ 0, & \text{otherwise} \end{cases}$$

Solution:

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

$$\int_0^2 k(1 + 2x)\,dx = \left[kx + kx^2\right]_{x=0}^{x=2} = 2k + 4k = 1$$

$$k = 1/6$$

Now,

$$P(0 < X < 1) = \left[k(x + x^2)\right]_{x=0}^{x=1} = 2k = \frac{1}{3} \approx 0.333$$

Cumulative distribution function

Another way to define the probability distribution of a random variable is in terms of the probability that $X$ is less than or equal to $x$:

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$$

With this definition,

$$P(a < X < b) = F(b) - F(a)$$

Example (continued)

Determine $F(x)$. Find the probability that $X$ lies between 0 and 1.

Solution: For $0 < x < 2$,

$$F(x) = \int_0^x \frac{1 + 2u}{6}\,du = \frac{x + x^2}{6}$$

with $F(x) = 0$ for $x \le 0$ and $F(x) = 1$ for $x \ge 2$.

$$P(0 < X < 1) = F(1) - F(0) = \frac{2}{6} - \frac{0}{6} \approx 0.333$$

Mean and Variance

The mean value of any random variable $X$ is also called the expected value of $X$ and is denoted by $E(X)$ or $\mu$. The mean calculated from a random sample is denoted by $\bar{x}$. The mean value of any data set is given by summing up all the values and dividing by the total number of values. If the same value occurs more than once, it enters the sum once for each time it occurred.
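This translates to

$$E(X) = \frac{\sum_{\text{all } x} x\,n_x}{N} = \sum_{\text{all } x} x\,p(x)$$

where $n_x$ is the number of times the value $x$ occurs among the $N$ values and $p(x) = n_x/N$. For a continuous variable this becomes

$$E(X) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$$

Similarly, the variance of a random variable $X$ is denoted by $V(X)$ or $\sigma^2$. It is the average of the squared deviations of each data point from the mean (a measure of spread). The square root of the variance is the standard deviation ($\sigma$). The standard deviation has the same dimensions as the mean. The standard deviation computed from a random sample is denoted by $s$.

$$s^2 = \frac{\sum_{\text{all } x} n_x (x - \bar{x})^2}{N - 1} \approx \sum_{\text{all } x} p(x)(x - \bar{x})^2$$

The two sides differ by the factor $N/(N-1)$, so they agree closely when $N$ is large. For a continuous variable this becomes

$$V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$

$$V(X) = \int_{-\infty}^{\infty} (x^2 + \mu^2 - 2x\mu)\, f(x)\,dx$$

$$V(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx + \int_{-\infty}^{\infty} \mu^2 f(x)\,dx - \int_{-\infty}^{\infty} 2x\mu f(x)\,dx$$

Using $\int_{-\infty}^{\infty} f(x)\,dx = 1$ and $\int_{-\infty}^{\infty} x f(x)\,dx = \mu = E(X)$,

$$V(X) = E(X^2) + E(X)^2 - 2E(X)^2 = E(X^2) - E(X)^2$$

Example (continued)

Find the mean and variance of $X$.

Solution:

$$E(X) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$$

$$E(X) = \mu = \int_0^2 \frac{x(1 + 2x)}{6}\,dx$$

$$E(X) = \left[\frac{x^2}{12} + \frac{x^3}{9}\right]_{x=0}^{x=2} = \frac{11}{9} \approx 1.222$$

$$V(X) = \sigma^2 = E(X^2) - \mu^2$$

$$E(X^2) = \int_0^2 \frac{x^2(1 + 2x)}{6}\,dx$$

$$E(X^2) = \left[\frac{x^3}{18} + \frac{x^4}{12}\right]_{x=0}^{x=2} = \frac{16}{9}$$

$$V(X) = \sigma^2 = \frac{16}{9} - \frac{121}{81} = \frac{23}{81} \approx 0.284$$

Population median, quartiles and percentiles

The $p$th percentile of a random variable $X$ is the value $x_p$ such that $p\%$ of the values of $X$ are below it. That is,

$$\int_{-\infty}^{x_p} f(x)\,dx = \frac{p}{100}$$

The median of $X$ is the value $x_m$ such that 50% of the values of $X$ are below it and 50% are above it. That is,

$$\int_{-\infty}^{x_m} f(x)\,dx = 0.5$$

The first quartile of $X$ is the value $x_{q1}$ such that 25% of the values of $X$ are below it. That is,

$$\int_{-\infty}^{x_{q1}} f(x)\,dx = 0.25$$

Similarly, the third quartile may be defined as the value $x_{q3}$ such that 75% of the values of X are below it:

$$\int_{-\infty}^{x_{q3}} f(x)\,dx = 0.75$$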
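
The quantities in this lecture can be checked numerically for the running example $f(x) = k(1 + 2x)$ on $0 < x < 2$. The following is a minimal sketch in Python, not part of the original notes. The helper names `integrate_poly`, `cdf` and `quantile` are hypothetical, chosen only for illustration. It assumes the density is a polynomial, so the integrals can be evaluated exactly with fractions, and it solves $F(x) = p$ by bisection, which is one simple way among several to find a percentile.

```python
from fractions import Fraction

LOWER, UPPER = 0, 2  # support of the example density (assumption: 0 < x < 2)


def integrate_poly(coeffs, a, b):
    """Exactly integrate sum(coeffs[i] * x**i) from a to b."""
    a, b = Fraction(a), Fraction(b)
    return sum(
        Fraction(c) * (b ** (i + 1) - a ** (i + 1)) / (i + 1)
        for i, c in enumerate(coeffs)
    )


# Unnormalised density 1 + 2x, stored as polynomial coefficients [1, 2].
shape = [1, 2]

# Normalisation: the density must integrate to 1 over its support.
k = 1 / integrate_poly(shape, LOWER, UPPER)
pdf = [k * c for c in shape]


def cdf(x):
    """F(x) = P(X <= x) for the example density."""
    if x <= LOWER:
        return Fraction(0)
    if x >= UPPER:
        return Fraction(1)
    return integrate_poly(pdf, LOWER, x)


def quantile(p, tol=1e-10):
    """Solve F(x) = p for x by bisection on the support (0 < p < 1)."""
    lo, hi = float(LOWER), float(UPPER)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Multiplying the density by x (or x**2) shifts the coefficients by one (or two) places.
mean = integrate_poly([0] + pdf, LOWER, UPPER)             # E(X)
second_moment = integrate_poly([0, 0] + pdf, LOWER, UPPER)  # E(X^2)
variance = second_moment - mean ** 2                        # E(X^2) - E(X)^2

print("k =", k)                           # 1/6
print("P(0 < X < 1) =", cdf(1) - cdf(0))  # 1/3
print("E(X) =", mean)                     # 11/9
print("V(X) =", variance)                 # 23/81
print("first quartile ~", round(quantile(0.25), 4))
print("median ~", round(quantile(0.5), 4))
print("third quartile ~", round(quantile(0.75), 4))
```

For this particular density the percentile equation $(x + x^2)/6 = p/100$ is a quadratic and could also be solved in closed form. Bisection is used here only because it carries over unchanged to densities whose cumulative distribution function cannot be inverted by hand.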