STAT 421 Lecture Notes

3.2 Continuous Distributions

In the previous section, the emphasis was upon discrete random variables. Attention shifts now to continuous random variables. Earlier it was said that a continuous random variable $X$ differs from a discrete random variable in that it takes on every value in some interval, so it is not possible to enumerate the set $X(S)$, where $S$ is the sample space.[1] This section begins a more in-depth discussion of continuous random variables.

[1] In contrast, the support of a discrete random variable is countable.

Definition. $X$ is a continuous random variable (r.v.) if there exists a non-negative function $f$ defined on $\mathbb{R}$ such that for every interval $I = [a, b]$ of $\mathbb{R}$, the probability that $X$ takes on a value in $I$ is

$$\Pr(a \le X \le b) = \int_a^b f(x)\,dx.$$

Similarly, if $X$ is continuous, then the probability that $X$ takes on a value in an unbounded interval $(-\infty, b]$ is

$$\Pr(X \le b) = \int_{-\infty}^{b} f(x)\,dx$$

and the probability that $X$ takes on a value in $[a, \infty)$ is

$$\Pr(X \ge a) = \int_a^{\infty} f(x)\,dx.$$

$f$ is the probability density function (p.d.f.) of $X$, and the closure of $\{x \mid f(x) > 0\}$ is the support of $X$. For instance, the closure of $[-1, 0) \cup (0, 1]$ is $[-1, 1]$.

Every p.d.f. satisfies two properties:

1. $f(x) \ge 0$ for all $x \in \mathbb{R}$.
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

The probability of obtaining a value in an interval does not depend on whether the interval is open or closed, since

$$\Pr(a \le X \le b) = \int_a^b f(x)\,dx = \Pr(a \le X < b).$$

Furthermore,

$$0 = \Pr(a \le X \le b) - \Pr(a \le X < b) = \Pr(X = b).$$

As this relationship holds for all $b$, we conclude that $\Pr(X = x) = 0$ for all $x \in \mathbb{R}$, and state that the probability that a continuous r.v. takes on any single point is 0.

Because $\Pr(X = x) = 0$ for all $x$, $f$ can be modified at countably many points in $\mathbb{R}$, yet the probability that $X$ takes on a value in a particular interval remains the same. Thus, p.d.f.s are not unique (the p.d.f. of a random variable can be defined differently yet all variations yield the same probabilities).
The convention, however, is to say that $f$ is the p.d.f. of $X$ though, properly, we should say that $f$ is a p.d.f. for $X$.

Example. A continuous random variable has a uniform distribution on an interval $[a, b]$ if $f(x)$ is constant on $[a, b]$. Specifically, $X$ is uniformly distributed on $[a, b]$ if

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise.} \end{cases}$$

The denominator $b - a$ is determined by the condition that the p.d.f. integrates to 1. We can verify that the denominator is correct by writing $f(x) = c$ on $[a, b]$ and solving for $c$:

$$1 = \int_a^b c\,dx = c\,x\Big|_a^b = c(b - a).$$

For $a \le x_1 \le x_2 \le b$,

$$\Pr(x_1 \le X \le x_2) = \frac{x_2 - x_1}{b - a}.$$

We write $X \sim \mathrm{Unif}(a, b)$.

Example. Consider a slightly different p.d.f. given by $f(x) = cx$, where the support of $X$ is $[a, b]$ and $0 \le a$. The value of $c$ is determined from the two requisite properties of p.d.f.s ($f(x) \ge 0$ for all $x \in \mathbb{R}$, and $\int_{-\infty}^{\infty} f(x)\,dx = 1$). The first property implies $c > 0$. The second property implies

$$1 = \int_a^b cx\,dx = c\,\frac{b^2 - a^2}{2} \;\Rightarrow\; c = \frac{2}{b^2 - a^2}.$$

If $[a, b] = [0, 1]$, then

$$f(x) = \begin{cases} 2x, & 0 \le x \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

If $[a, b] = [0, 2]$, then

$$f(x) = \begin{cases} \dfrac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{otherwise.} \end{cases}$$

The coefficient $c$ is called a normalizing constant.

Computing probabilities from probability density functions

The probability that a continuous random variable takes on a value in some interval $[a, b]$ is computed by integrating the p.d.f. over the interval, that is, by finding an antiderivative and evaluating it at the endpoints, as illustrated above using the uniform random variable. The interval need not be bounded. For example, suppose that

$$f(x) = \begin{cases} \dfrac{1}{(1 + x)^2}, & 0 \le x \\ 0, & \text{otherwise.} \end{cases}$$

Then,

$$\Pr(X \ge 2) = \int_2^{\infty} \frac{1}{(1 + x)^2}\,dx = -\frac{1}{1 + x}\Big|_2^{\infty} = \frac{1}{3}.$$

Remarks

1. Online symbolic integrators are available for computing some indefinite integrals, for example, http://integrals.wolfram.com/index.jsp. Use them to check your work but do not become dependent on them.

2. The probability density function evaluated at $x \in \mathbb{R}$ is not a probability. For example,

$$f(x) = \begin{cases} 5e^{-5x}, & 0 < x \\ 0, & \text{otherwise} \end{cases}$$

is a p.d.f. and $f(1/5) = 5e^{-1} \approx 1.84 > 1$.
Obviously, $f(1/5)$ is not a probability in any sense.

3. Mixed distributions are modified continuous distributions that allow specific, countably many values of $x$ to be taken on with positive probability. For example, a measuring device may have a threshold value below which it cannot yield an accurate reading. Whenever the underlying random variable producing the measurements yields a value below the threshold, the machine is set up to report the threshold. Suppose that the underlying random variable is $X$ and its p.d.f. is $f(x)$. Further, suppose that $p$ is the probability that $X$ will be less than or equal to the threshold value $x_0$, i.e., $0 < p = \Pr(X \le x_0)$. Then, the random variable representing a measurement made of a randomly selected population unit, say $Y$, may be modeled according to

$$\Pr(Y \in A) = p\,I(x_0 \in A) + (1 - p)\int_A f(y)\,dy.$$

The set $A$ is some interval of $\mathbb{R}$ and $I(x \in A)$ is an indicator variable that takes on the value 1 if $x \in A$ and 0 otherwise.

4. Many of the textbook problems ask for plots of the p.d.f. and related functions. It's best to learn now how to use R for plotting. There is a script posted at http://www.math.umt.edu/steele/STAT421/Homework/ called DistributionPlotter.R which constructs two plots: the probability function for a binomial random variable and the p.d.f. of an important continuous random variable. Try modifying the program when you are asked to plot these functions. To use the program, download it to your computer. Open R, go to the File dropdown menu (leftmost in the toolbar), and open the file. A simple editor will open with the program. Highlight some or all of the lines, hold down the Ctrl key and press the R key. The highlighted lines will execute. Output appears in the console window, and a graphics device window will open if the code contains plotting functions.
To copy the plot to the clipboard or save it as a file, place the cursor in the graphics device window and right-click with the mouse.
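
The worked examples in this section can also be checked numerically. The following is a minimal sketch, written in Python rather than R purely for illustration, and it is not part of DistributionPlotter.R. The helper names (`integrate`, `linear_pdf`, `tail_pdf`, `exp_pdf`) are hypothetical, and the midpoint rule is just one simple choice of numerical integrator. The sketch checks that the normalizing constant $c = 2/(b^2 - a^2)$ makes $f(x) = cx$ integrate to 1, that $\Pr(X \ge 2) = 1/3$ for $f(x) = 1/(1 + x)^2$, and that a density value such as $f(1/5)$ for $f(x) = 5e^{-5x}$ can exceed 1 even though the corresponding probabilities cannot.

```python
import math


def integrate(f, lo, hi, n=20_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def linear_pdf(a, b):
    """Return f(x) = c*x on [a, b] (0 <= a < b) with c = 2 / (b^2 - a^2)."""
    c = 2.0 / (b**2 - a**2)
    return lambda x: c * x if a <= x <= b else 0.0


def tail_pdf(x):
    """f(x) = 1 / (1 + x)^2 for x >= 0, and 0 otherwise."""
    return 1.0 / (1.0 + x) ** 2 if x >= 0 else 0.0


def exp_pdf(x):
    """f(x) = 5 * exp(-5x) for x > 0, and 0 otherwise."""
    return 5.0 * math.exp(-5.0 * x) if x > 0 else 0.0


# Normalizing constants: both densities should integrate to (about) 1.
total_01 = integrate(linear_pdf(0, 1), 0, 1)
total_02 = integrate(linear_pdf(0, 2), 0, 2)

# Pr(X >= 2) = 1 - Pr(0 <= X < 2), because the p.d.f. integrates to 1.
tail_prob = 1.0 - integrate(tail_pdf, 0, 2)

# A density value may exceed 1; a probability never does.
density_at_fifth = exp_pdf(1 / 5)
prob_up_to_fifth = integrate(exp_pdf, 0, 1 / 5)

print(f"integral of 2x on [0, 1]:     {total_01:.6f}")
print(f"integral of x/2 on [0, 2]:    {total_02:.6f}")
print(f"Pr(X >= 2) for 1/(1+x)^2:     {tail_prob:.6f}")
print(f"f(1/5) for 5*exp(-5x):        {density_at_fifth:.4f}")
print(f"Pr(0 < X <= 1/5), same p.d.f.: {prob_up_to_fifth:.6f}")
```

The tail probability is computed through the complement so that only a bounded interval has to be integrated numerically; the exact value from the antiderivative remains the reference answer.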