
STAT 421 Lecture Notes

3.3 Cumulative Distribution Functions

Definition

The cumulative distribution function (c.d.f.) of a random variable X is the function F(x) = Pr(X ≤ x) for all x ∈ R. The c.d.f. exists and is defined in the same manner for all random variables (discrete, continuous, and mixed). Note that F : R → [0, 1].

Example: Suppose that X ∼ Bin(2, p). Then,

\[
F(x) = \begin{cases}
0, & x < 0 \\
(1 - p)^2, & 0 \le x < 1 \\
(1 - p)^2 + 2p(1 - p), & 1 \le x < 2 \\
1, & x \ge 2.
\end{cases}
\]

Numerical values for this c.d.f. can be computed using the R command pbinom(0:n,n,p) (replace n and p with numerical values). The figure below is a plot of the c.d.f. of a binomial random variable with parameters n = 12 and p = .3. The R code for plotting the graph is in the script entitled DistributionPlotter.R.

[Figure: c.d.f. of X ∼ Bin(12, .3); horizontal axis x from 0 to 12, vertical axis F(x) from 0.0 to 1.0]

In this instance (because X is discrete and integer-valued), Pr(X = k) can be recovered from the c.d.f. because

Pr(X = k) = Pr(X ≤ k) − Pr(X < k) = Pr(X ≤ k) − Pr(X ≤ k − 1) = F(k) − F(k − 1).

Properties of the cumulative distribution function

1. F is nondecreasing. In other words, for all x1 < x2, F(x1) ≤ F(x2). This property can be verified by setting up the events E1 = {X ≤ x1} and E2 = {X ≤ x2}. Then, E1 ⊂ E2 ⇒ Pr(E1) ≤ Pr(E2) ⇒ F(x1) ≤ F(x2).

2. lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1. The proof is straightforward if the result of exercise 12, p. 51 is used.

3. Every c.d.f. is continuous from the right (from above). A c.d.f. need not be continuous, as the figure above shows; it may be discontinuous at countably many points. It is, however, always right-continuous. Specifically,

\[
\lim_{y \to x^+} F(y) = F(x).
\]

The notation lim_{y→x+} describes a limiting process where y → x and y > x. A cumulative distribution function need not be continuous from the left (from below). Instead, it may be that lim_{y→x−} F(y) < F(x), in which case we say that there is a discontinuity at x, or that there is a jump in F at x.

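The relation Pr(X = k) = F(k) − F(k − 1) and the nondecreasing property can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function names binom_pmf and binom_cdf are hypothetical helpers written for this illustration (they play the role that pbinom plays in R) and are not taken from DistributionPlotter.R.

```python
from math import comb, floor

def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ Bin(n, p), with k an integer in 0..n."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(x, n, p):
    """F(x) = Pr(X <= x) for X ~ Bin(n, p) and any real x."""
    if x < 0:
        return 0.0
    upper = min(floor(x), n)  # largest support point that is <= x
    return sum(binom_pmf(k, n, p) for k in range(upper + 1))

# Illustrative parameters (the same ones used for the plotted c.d.f.).
n, p = 12, 0.3

# Height of the jump at each support point k: F(k) - F(k - 1).
jumps = [binom_cdf(k, n, p) - binom_cdf(k - 1, n, p) for k in range(n + 1)]

for k, jump in enumerate(jumps):
    # Each jump should agree with Pr(X = k) up to rounding error.
    print(k, round(jump, 6), round(binom_pmf(k, n, p), 6))

# F is nondecreasing: the values F(0), F(1), ..., F(n) never go down.
values = [binom_cdf(k, n, p) for k in range(n + 1)]
print(all(a <= b for a, b in zip(values, values[1:])))
```

Because binom_cdf accepts any real x, it also reproduces the step shape of F: for example, binom_cdf(0.5, 2, 0.3) equals (1 − p)² with p = .3, the value of F on the interval 0 ≤ x < 1 in the Bin(2, p) example.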
The binomial c.d.f. plotted above illustrates such jumps: there is one at every point in the support of X. If there is a jump in F at x, then the height of the jump is Pr(X = x). On the graph of a c.d.f., the height of each jump is the probability that the random variable takes on the value at the horizontal coordinate of the jump. To see why, write F(x⁻) = lim_{y→x−} F(y). This left-hand limit always equals Pr(X < x), whether or not it equals F(x) = Pr(X ≤ x). Consequently,

F(x) − F(x⁻) = Pr(X ≤ x) − Pr(X < x) = Pr(X = x).

Computing probabilities from cumulative distribution functions

The following results are useful for computing probabilities involving the random variable X:

1. Pr(X > x) = 1 − F(x).

2. For x1 < x2, F(x2) − F(x1) = Pr(X ≤ x2) − Pr(X ≤ x1) = Pr(x1 < X ≤ x2).

3. Pr(X < x) = F(x⁻) = lim_{y→x−} F(y). If F(x⁻) = F(x), then Pr(X < x) = Pr(X ≤ x).

4. Pr(X = x) = F(x) − F(x⁻).

Suppose that X is discrete and that a and b belong to the support of X. Then,

1. Pr(a < X < b) = 0 if and only if F(x) is constant over the open interval (a, b).

2. Pr(X = x) > 0 if and only if F is discontinuous at x. This fact follows from F(x) − F(x⁻) = Pr(X = x) > 0.

Suppose that X is continuous. Then,

1. F is continuous everywhere. Because X is continuous,

\[
0 = \Pr(X = x) = F(x) - F(x^-) = F(x) - \lim_{y \to x^-} F(y) \;\Rightarrow\; F(x) = \lim_{y \to x^-} F(y).
\]

Thus, F is continuous from below, and since F is continuous from above everywhere (true of every c.d.f.), F is continuous everywhere.

2. The relationship between density functions and cumulative distribution functions is

\[
F(x) = \Pr(X \le x) = \int_{-\infty}^{x} f(t)\,dt.
\]

Furthermore, at every x such that f is continuous,

\[
\frac{dF(x)}{dx} = f(x).
\]

Recall example 3.2.8 (page 105 in the textbook) where

\[
f(x) = \begin{cases}
\dfrac{1}{(1 + x)^2}, & 0 < x \\
0, & \text{otherwise.}
\end{cases}
\]

From this definition,

\[
\lim_{x \to 0^+} f(x) = 1 \ne 0 = f(0).
\]

Hence, f is discontinuous at 0 and F′ = dF/dx does not exist at 0.
Thus, f ≠ F′, since F′ does not exist at 0 whereas f does exist at 0 (in fact, f(0) = 0). Unfortunately, the derivative of the cumulative distribution function is not necessarily equal to the probability density function at every point.

Example

Suppose that

\[
F(x) = \begin{cases}
1 - e^{-x/3}, & x > 0 \\
0, & x \le 0.
\end{cases}
\]

F is continuous everywhere and differentiable at every x ≠ 0, so a p.d.f. is obtained by differentiating: f = F′ for x ≠ 0, with f(0) set to 0. It is

\[
f(x) = \begin{cases}
\tfrac{1}{3} e^{-x/3}, & x > 0 \\
0, & x \le 0.
\end{cases}
\]

Quantile Functions

Definition 3.3.2 Suppose that X is a random variable with c.d.f. F. We define the function F⁻¹ according to

\[
F^{-1}(p) = \min\{x : F(x) \ge p\}.
\]

That is, F⁻¹(p) is the minimum value of x such that F(x) ≥ p. In some cases, F⁻¹ truly is the inverse of F, but often F⁻¹ is not the inverse function because F is not one-to-one. If F is one-to-one, then there is a unique value of x such that F(x) = p, and the inverse function exists.

The quantile function maps the open interval (0, 1) to R. F⁻¹ is defined on the open interval because

\[
\lim_{x \to -\infty} F(x) = 0 \quad \text{and} \quad \lim_{x \to \infty} F(x) = 1,
\]

so p = 0 and p = 1 need not correspond to any finite x.

F⁻¹(p) is called the p quantile of X. We also say that F⁻¹(p) is the 100p percentile of X. Several quantiles are named:

1. F⁻¹(.5) is the median. Suppose that X is continuous. Then Pr[X ≤ F⁻¹(.5)] = .5 and the median partitions the distribution of X into two sets with equal probability.

2. F⁻¹(.25) is the first quartile.

3. F⁻¹(.75) is the third quartile.

Example

Suppose that X ∼ Unif(a, b). Then,

\[
f(x) = \begin{cases}
\dfrac{1}{b - a}, & a \le x \le b \\
0, & \text{otherwise,}
\end{cases}
\]

and

\[
F(x) = \begin{cases}
0, & x < a \\
\dfrac{x - a}{b - a}, & a \le x \le b \\
1, & b < x.
\end{cases}
\]

The quantile function is q = F⁻¹(p). We can determine a convenient expression for F⁻¹ by solving the equation

\[
p = \frac{q - a}{b - a}
\]

for q. The solution is q = a + p(b − a). The graph of the quantile function is a line segment connecting the points (0, a) and (1, b). The median of X is

\[
q = a + \tfrac{1}{2}(b - a) = \frac{a + b}{2}.
\]

The quantile function for discrete distributions

Consider X ∼ Bin(12, .3). The c.d.f. is shown below.
The table below shows the pairings between ranges in p and the quantiles. The ranges can be identified in the figure as the vertical gaps between the line ends and the points vertically above the line ends. Since F⁻¹(p) is defined to be the smallest value of x such that p ≤ F(x), the .4 quantile is 3 because F(2) = .253 < .4 ≤ .492 = F(3).

[Figure: c.d.f. of X ∼ Bin(12, .3); horizontal axis x from 0 to 12, vertical axis F(x) from 0.0 to 1.0]

F⁻¹(p)    p
0         (0, .014]
1         (.014, .085]
2         (.085, .253]
3         (.253, .492]
4         (.492, .724]
5         (.724, .882]
...       ...
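The rule "smallest x such that F(x) ≥ p" translates directly into a search over the support. The sketch below is an illustration in Python using only the standard library; binom_cdf_table and binom_quantile are hypothetical names chosen for this example (R users would normally call the built-in qbinom instead).

```python
from math import comb

def binom_cdf_table(n, p):
    """Return the list [F(0), F(1), ..., F(n)] for X ~ Bin(n, p)."""
    total, table = 0.0, []
    for k in range(n + 1):
        total += comb(n, k) * p**k * (1 - p)**(n - k)  # add Pr(X = k)
        table.append(total)
    return table

def binom_quantile(prob, n, p):
    """Smallest x in {0, ..., n} with F(x) >= prob, for 0 < prob < 1."""
    if not 0 < prob < 1:
        raise ValueError("prob must lie in the open interval (0, 1)")
    for x, cdf_value in enumerate(binom_cdf_table(n, p)):
        if cdf_value >= prob:
            return x
    # Reached only if rounding error leaves the computed F(n) just below prob.
    return n

# Quantiles of X ~ Bin(12, .3) for a few probabilities.
for prob in (0.01, 0.25, 0.4, 0.5, 0.75):
    print(prob, binom_quantile(prob, 12, 0.3))
```

For example, binom_quantile(0.4, 12, 0.3) returns 3, in agreement with the table row that pairs the quantile 3 with the range (.253, .492]. Every p in the same range maps to the same quantile, which is why the quantile function of a discrete random variable is a step function rather than a true inverse of F.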
