MATH 7550-01 INTRODUCTION TO PROBABILITY FALL 2011

Lecture 7. Densities. Distribution functions.

Here I want to introduce the concept of a measure $\mu$ on a measurable space $(X, \mathcal{X})$ being $\sigma$-finite. This means that there exists an infinite sequence of sets $B_i \in \mathcal{X}$ such that $\mu(B_i) < \infty$ and $\bigcup_{i=1}^{\infty} B_i = X$. The Lebesgue measure $\lambda^n$ is $\sigma$-finite (we can take $B_i = [-i, i]^n$), while the counting measure $\#$ is not if the space $X$ is uncountable.

Now, how is the integral $\int_A f \, d\mu$ of a measurable function $f$ with respect to the measure $\mu$ over some subset $A \subseteq X$, and not over the whole $X$, defined? The simplest way to do so is to take by definition, for $A \in \mathcal{X}$,

$$\int_A f(x)\,\mu(dx) = \int_X I_A(x) \cdot f(x)\,\mu(dx). \tag{7.1}$$

This definition leads to the same result as if we consider simple functions on $A$, then arbitrary nonnegative measurable functions on $A$, etc.

Now, we started the whole digression about Lebesgue integrals when we spoke about distribution densities; our last formula before that was (5.13). If a set function $\nu$ is given by formula (5.13) for all $C \in \mathcal{X}$, it is necessarily countably additive: for disjoint $A_1, A_2, \ldots, A_i, \ldots$

$$\begin{aligned}
\nu\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) &= \int_{\bigcup_{i=1}^{\infty} A_i} f(x)\,\mu(dx) = \int_X I_{\bigcup_{i=1}^{\infty} A_i}(x) \cdot f(x)\,\mu(dx) \\
&= \int_X \sum_{i=1}^{\infty} I_{A_i}(x) \cdot f(x)\,\mu(dx) = \sum_{i=1}^{\infty} \int_X I_{A_i}(x) \cdot f(x)\,\mu(dx) = \sum_{i=1}^{\infty} \nu(A_i)
\end{aligned} \tag{7.2}$$

(proved above for nonnegative integrands; so something remains here to be proved).

So a set function $\nu$ having a density with respect to a measure $\mu$ is necessarily countably additive. (In Kolmogorov & Fomin's book countably additive set functions are called charges, because such functions provide a mathematical model for electric charges, $\nu(A)$ being the total charge, positive or negative, carried by the region $A$.)
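As a small numerical illustration of (7.1) and (7.2), here is a sketch on a finite space, where every integral reduces to a finite sum. The space, the point weights `mu_weights` and the function `f` are hypothetical choices made only for this example; they are not taken from the lecture.

```python
from fractions import Fraction

# Hypothetical finite space X = {0, ..., 5}; the measure mu is given by point weights.
X = range(6)
mu_weights = {x: Fraction(1, 2) for x in X}  # assumption: mu{x} = 1/2 for every point


def f(x):
    """A nonnegative function playing the role of a density (an arbitrary choice)."""
    return Fraction(x * x)


def indicator(A, x):
    """I_A(x): 1 if x belongs to A, otherwise 0."""
    return 1 if x in A else 0


def integral_over(A):
    """Integral of f over A, computed as in (7.1): sum over all of X of I_A * f * mu."""
    return sum(indicator(A, x) * f(x) * mu_weights[x] for x in X)


def nu(C):
    """Set function having density f with respect to mu."""
    return integral_over(C)


# Three disjoint sets and their union, as in (7.2) (finitely many sets here).
A1, A2, A3 = {0, 1}, {2, 5}, {3}
union = A1 | A2 | A3

print(nu(union), nu(A1) + nu(A2) + nu(A3))
```

On a finite space only finite additivity can be observed, so this checks the algebra of (7.2) but says nothing about the interchange of the infinite sum and the integral.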
A density $f(x)$ of $\nu$ with respect to $\mu$ is, generally, not unique: if we change the function $f(x)$ arbitrarily on a non-empty set $A$ having zero $\mu$-measure (such sets may not exist for a measure $\mu$; they certainly do exist for the Lebesgue measure), the integrals $\int_C f \, d\mu$ do not change, and the changed $f$ is still a version of density.

Theorem 7.1. The density of a countably additive set function $\nu$ with respect to $\mu$ is almost unique (of course supposing that it exists); i. e., if the measurable functions $f_1$, $f_2$ are two versions of this density:

$$\int_C f_1(x)\,\mu(dx) = \int_C f_2(x)\,\mu(dx) = \nu(C) \tag{7.3}$$

for every $C \in \mathcal{X}$, then

$$f_1(x) = f_2(x) \text{ almost everywhere}, \tag{7.4}$$

that is,

$$\mu\{x : f_1(x) \ne f_2(x)\} = 0. \tag{7.5}$$

"Proof": Let $A = \{x : f_1(x) < f_2(x)\}$; then we have $\int_A [f_2(x) - f_1(x)]\,\mu(dx) = \int_A f_2 \, d\mu - \int_A f_1 \, d\mu = \nu(A) - \nu(A) = 0$. The integral of the function $f_2(x) - f_1(x)$ that is strictly positive on $A$ is equal to 0, and it follows from this that $\mu(A) = 0$. Similarly, for $B = \{x : f_1(x) > f_2(x)\}$ we have $\mu(B) = 0$; and $\mu\{x : f_1(x) \ne f_2(x)\} = \mu(A) + \mu(B) = 0$.

However, it turns out that the statement of Theorem 7.1 is false (although it's pretty easy to make it correct by introducing some extra conditions), and the "proof" contains a mistake.

Problem 7: Produce an example of measures $\mu$ and $\nu$ on a space $(X, \mathcal{X})$ such that $\nu(C) = \int_C f_1(x)\,\mu(dx) = \int_C f_2(x)\,\mu(dx)$, but $\mu\{x : f_1(x) \ne f_2(x)\} \ne 0$. Show at which point the "proof" of Theorem 7.1 fails.

I'll give the corrected version of Theorem 7.1 at the end of the next version of Lecture note # 7 (not in the present version, since I want you to solve Problem 7 by yourselves).

The notations for (every version of) the density of $\nu$ with respect to $\mu$ will be

$$f(x) = \frac{\nu(dx)}{\mu(dx)} = \frac{d\nu}{d\mu}(x). \tag{7.6}$$

Why such notations are used, we'll discuss later.

It is clear that if $\nu$ has a density with respect to $\mu$, then for $C \in \mathcal{X}$

$$\mu(C) = 0 \Rightarrow \nu(C) = 0. \tag{7.7}$$

A countably additive set function $\nu$ is called absolutely continuous with respect to $\mu$ if the implication (7.7) is satisfied for every $C \in \mathcal{X}$. The notation for absolute continuity is

$$\nu \ll \mu. \tag{7.8}$$

The following theorem (a real big one, not so easy to prove) is proved in measure theory:

Theorem 7.2 (Radon-Nikodym's Theorem: see, e. g., Kolmogorov & Fomin's book, Theorem 2 of Section 34). Let a measure $\mu$ on $(X, \mathcal{X})$ be $\sigma$-finite, and let $\nu$ be a finite countably additive set function on $(X, \mathcal{X})$ that is absolutely continuous with respect to $\mu$. Then $\nu$ has a density $f(x)$ with respect to $\mu$.

Note that the Lebesgue measure $\lambda^n$ is $\sigma$-finite ($B_i = \{x : |x| \le i\}$); so a density with respect to the Lebesgue measure, $\frac{\mu_\xi(dx)}{\lambda^n(dx)} = \frac{\mu_\xi(dx)}{dx}$, exists if and only if $\mu_\xi$ is absolutely continuous with respect to the Lebesgue measure. (Usually, for shortness, we speak of just continuous distributions. Random variables having such distributions are called continuous random variables.)

It turns out that the description of discrete distributions by means of their "probability mass functions" $p_\xi(x) = P\{\xi = x\}$ is also a description involving the density: not with respect to the Lebesgue measure, but with respect to the counting measure $\#$:

$$p_\xi(x) = \frac{\mu_\xi(dx)}{\#(dx)}. \tag{7.9}$$

Indeed, this means that for every set $C \in \mathcal{X}$

$$P\{\xi \in C\} = \int_C p(x)\,\#(dx). \tag{7.10}$$

By (6.28), this is the same as

$$P\{\xi \in C\} = \sum_{x \in C} p(x), \tag{7.11}$$

and this is just the formula (5.8).

This is a reason for treating probability mass functions of discrete distributions and probability densities of (absolutely) continuous ones in parallel ways (e. g., the statistical maximum-likelihood estimates are defined by the same formulas in the discrete and in the continuous case).

Note that since the counting measure $\#$ is not, generally, $\sigma$-finite, the above results about existence and (almost) uniqueness of the density cannot be applied.
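To make (7.9)-(7.11) concrete, the following sketch treats a probability mass function as a density with respect to the counting measure. The distribution (the number of heads in three fair coin tosses) and the helper names `counting_measure` and `prob` are assumptions introduced only for illustration.

```python
from fractions import Fraction

# Hypothetical example: xi is the number of heads in 3 fair coin tosses.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}


def counting_measure(C):
    """#(C): the number of points in the finite set C."""
    return len(C)


def prob(C):
    """P{xi in C} as in (7.11): the integral of the pmf over C with respect to
    the counting measure is just the sum of p(x) over x in C."""
    return sum(pmf.get(x, Fraction(0)) for x in C)


print(prob({1, 2}))        # mass carried by the set {1, 2}
print(prob({0, 1, 2, 3}))  # total mass

# On a one-point set, the ratio "measure of the set / counting measure of the set"
# recovers the pmf, which mirrors the notation in (7.9).
for x in pmf:
    assert prob({x}) / counting_measure({x}) == pmf[x]
```

Since the only set of counting measure zero is the empty set, changing `pmf` at any single point changes `prob` of some set, which fits the remark that uniqueness holds here without "almost".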
As for uniqueness, it still is there, and even without "almost" (because there are no non-empty sets with $\#(A) = 0$); but it does not follow from $\mu_\xi(\emptyset) = 0$ that this distribution is a discrete one. (Nothing at all can follow from $\mu_\xi(\emptyset) = 0$, because this is a general property of all measures.)

Sometimes it is reasonable to consider densities of probability distributions not with respect to the Lebesgue measure, or to the counting measure $\#$, but with respect to some other measures. For example, if we consider random variables taking values in an infinite-dimensional space: in such spaces no infinite-dimensional Lebesgue measure can be defined. It is worthwhile to consider the density $\frac{d\mu_\xi}{d\mu_\eta}$ of one distribution with respect to another.

Every distribution of a random variable is a probability measure (i. e., a measure whose value on the largest set is equal to 1); and every probability measure $\mu$ on a measurable space $(X, \mathcal{X})$ is the distribution of some random variable taking values in $(X, \mathcal{X})$ (in fact, of infinitely many random variables). Indeed, we can take the probability space $(\Omega, \mathcal{F}, P)$ in this way: $\Omega = X$, $\mathcal{F} = \mathcal{X}$, $P = \mu$; we define on this space the random variable $\xi(\omega) = \omega$; and the distribution $\mu_\xi$ of this random variable will be nothing but the measure $\mu$: $\mu_\xi(C) = P\{\omega : \xi(\omega) \in C\} = P(C) = \mu(C)$.

In Lecture 5, discrete distributions and (absolutely) continuous distributions were introduced. Are there distributions that do not belong to these two classes? Of course, there are: mixtures of discrete and continuous distributions: $\mu(C) = p_1 \cdot \mu_1(C) + p_2 \cdot \mu_2(C)$, where $p_1, p_2 > 0$, $p_1 + p_2 = 1$, $\mu_1$ is a discrete distribution, and $\mu_2$ an absolutely continuous one. The mixture is neither discrete nor continuous, because

$$\sum_{x \in \mathbb{R}^1} \mu\{x\} = p_1 \ne 0, 1, \tag{7.12}$$

while for discrete distributions it should be equal to 1, and for continuous, to 0.

Are there distributions on the real line that are not such mixtures (and not discrete, not absolutely continuous)?
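The mixture formula and (7.12) can be checked on a concrete case. In this sketch the weights 1/4 and 3/4, the point mass at 0 and the uniform distribution on [0, 1] are all hypothetical choices, and measures are evaluated only on closed intervals [a, b].

```python
from fractions import Fraction

p1, p2 = Fraction(1, 4), Fraction(3, 4)  # hypothetical mixture weights, p1 + p2 = 1


def mu1(a, b):
    """Discrete part: unit point mass at 0, evaluated on the closed interval [a, b]."""
    return Fraction(1) if a <= 0 <= b else Fraction(0)


def mu2(a, b):
    """Absolutely continuous part: uniform distribution on [0, 1], evaluated on [a, b]."""
    lo, hi = max(a, Fraction(0)), min(b, Fraction(1))
    return max(hi - lo, Fraction(0))


def mu(a, b):
    """The mixture mu = p1 * mu1 + p2 * mu2 on the closed interval [a, b]."""
    return p1 * mu1(a, b) + p2 * mu2(a, b)


def atom(x):
    """mu{x}: the mass that the mixture puts on the single point x."""
    return mu(x, x)


# The only point carrying positive mass is 0, so the sum in (7.12) reduces to atom(0).
print(atom(0), atom(Fraction(1, 2)), mu(-1, 2))
```

Here the sum of the point masses equals `p1`, which is neither 0 nor 1, in line with (7.12).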
We'll return to this question after considering distribution functions.

Let $\xi$ be a real-valued random variable. Its distribution function is a function on $(-\infty, \infty)$ defined by

$$F(x) = F_\xi(x) = P\{\xi \le x\}. \tag{7.13}$$

Clearly the distribution function depends only on the distribution $\mu$ of a random variable:

$$F(x) = F_\mu(x) = \mu(-\infty, x]. \tag{7.14}$$

Theorem 7.3. We have, for all $-\infty < a \le b < \infty$:

$$P\{a < \xi \le b\} = F_\xi(b) - F_\xi(a). \tag{7.15}$$

The proof is very simple: Clearly, $(-\infty, b] = (-\infty, a] \cup (a, b]$, and these intervals are disjoint; the same holds for their inverse images under the mapping $\xi$:

$$\{\xi \le b\} = \{\xi \le a\} \cup \{a < \xi \le b\}, \tag{7.16}$$

and by (finite) additivity of $P$,

$$P\{\xi \le b\} = P\{\xi \le a\} + P\{a < \xi \le b\}, \tag{7.17}$$

which leads to (7.15).

Theorem 7.4. If $F(x)$ is the distribution function of a random variable $\xi$, then

$$F(x) \text{ is non-decreasing}, \tag{7.18}$$

$$F(\infty) = 1, \tag{7.19}$$

$$F(-\infty) = 0, \tag{7.20}$$

$$F(a+) = F(a) \text{ for } a \in (-\infty, \infty), \tag{7.21}$$

$$F(a-) = P\{\xi < a\} \tag{7.22}$$

($F(\infty)$, $F(-\infty)$, $F(a+)$, $F(a-)$ are the notations for the limits $\lim_{x \to \infty} F(x)$, $\lim_{x \to -\infty} F(x)$, the right-hand limit at $a$: $\lim_{x \to a+} F(x)$, and the left-hand limit $\lim_{x \to a-} F(x)$. Note that by definition the formula (7.13) defines $F(x)$ on the real line, and not on the extended real line).

Equality (7.21) means that every distribution function is continuous from the right at every point of the real axis.

Proof. The statement (7.18) follows from Theorem 7.3. For a monotone function, limits at $\pm\infty$ and one-sided limits at every finite point necessarily exist, so the limits (7.19)-(7.22) do exist.

The limit at $+\infty$ is equal to the limit along every sequence of numbers going to $\infty$; e. g.,

$$F(\infty) = \lim_{n \to \infty} F(n) = \lim_{n \to \infty} P\{\xi \le n\}. \tag{7.23}$$

The sequence of events $B_n = \{\xi \le n\}$ is clearly non-decreasing, so by Theorem 3.3 we have:

$$F(\infty) = P\bigl(\lim_{n \to \infty} \{\xi \le n\}\bigr). \tag{7.24}$$

What is the limit of this sequence of events?
It is a non-decreasing sequence, so the limit is the union:

$$\lim_{n \to \infty} \{\xi \le n\} = \bigcup_{n=1}^{\infty} \{\xi \le n\}. \tag{7.25}$$

But this union clearly is the whole sample space $\Omega$: for every sample point $\omega \in \Omega$, there exists at least one natural number $n$ such that $n > \xi(\omega)$ (or $n \ge \xi(\omega)$). So

$$F(\infty) = P(\Omega) = 1. \tag{7.26}$$

Now to (7.20):

$$F(-\infty) = \lim_{n \to \infty} F(-n) = \lim_{n \to \infty} P\{\xi \le -n\} = P\bigl(\lim_{n \to \infty} \{\xi \le -n\}\bigr) = P\Bigl(\bigcap_{n=1}^{\infty} \{\xi \le -n\}\Bigr) = P(\emptyset) = 0 \tag{7.27}$$

(the events $\{\xi \le -n\}$ form a non-increasing sequence with (clearly) empty intersection).

To (7.21) (with a non-increasing sequence):

$$F(a+) = \lim_{n \to \infty} P\{\xi \le a + 1/n\} = P\Bigl(\bigcap_{n=1}^{\infty} \{\xi \le a + 1/n\}\Bigr) = P\{\xi \le a\} = F(a) \tag{7.28}$$

(the event $\{\xi \le a\}$ occurs if and only if all events $\{\xi \le a + 1/n\}$ occur: the "if" part by limit passage in the inequality $\xi \le a + 1/n$, and the "only if": $\xi \le a \Rightarrow \xi \le a + 1/n$ for all natural $n$).

Finally, (7.22) (again a non-decreasing sequence of events):

$$\bigcup_{n=1}^{\infty} \{\xi \le a - 1/n\} = \{\xi < a\} \tag{7.29}$$

(if $\xi(\omega)$ is less than or equal to at least one $a - 1/n$, it is certainly less than $a$; and if $\xi(\omega)$ is less than $a$, there exists a natural number $n$ such that $a - 1/n \in [\xi(\omega), a)$, and $\omega$ belongs to the event in the left-hand side of (7.29)), so

$$F(a-) = \lim_{n \to \infty} P\{\xi \le a - 1/n\} = P\Bigl(\bigcup_{n=1}^{\infty} \{\xi \le a - 1/n\}\Bigr) = P\{\xi < a\}. \tag{7.30}$$

From (7.13) and (7.22) we obtain easily:

$$P\{\xi = a\} = P\{\xi \le a\} - P\{\xi < a\} = F(a) - F(a-) = F(a+) - F(a-); \tag{7.31}$$

this is the jump of the distribution function $F$ at the point $a$.

Now, the corrected formulation of the almost-uniqueness theorem:

Theorem 7.1′. Let the measure $\mu$ be $\sigma$-finite. Then the density of a countably additive set function $\nu$ with respect to $\mu$ is almost unique (supposing that it exists).

Try to work out the proof by yourself (anyway I don't have to give proofs of measure-theory theorems).
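Returning to distribution functions, the statements (7.15), (7.21), (7.22) and (7.31) can be checked by direct computation for a discrete distribution. The sketch below again uses a hypothetical pmf (the number of heads in three fair coin tosses); the names `F`, `F_left` and `prob_interval` are illustrative only.

```python
from fractions import Fraction

# Hypothetical example: xi is the number of heads in 3 fair coin tosses.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}


def F(x):
    """Distribution function F(x) = P{xi <= x}, as in (7.13)."""
    return sum(p for k, p in pmf.items() if k <= x)


def F_left(a):
    """P{xi < a}, which by (7.22) is the left-hand limit F(a-)."""
    return sum(p for k, p in pmf.items() if k < a)


def prob_interval(a, b):
    """P{a < xi <= b}, computed directly from the pmf."""
    return sum(p for k, p in pmf.items() if a < k <= b)


# Theorem 7.3: P{a < xi <= b} = F(b) - F(a).
assert prob_interval(0, 2) == F(2) - F(0)

# (7.31): the jump of F at a equals P{xi = a}.
for a in pmf:
    assert F(a) - F_left(a) == pmf[a]

# (7.21) and (7.22) along the sequences a + 1/n and a - 1/n, here with a = 1.
for n in range(2, 50):
    assert F(1 + Fraction(1, n)) == F(1)
    assert F(1 - Fraction(1, n)) == F_left(1)
```

For this step function the one-sided limits are already attained for every n >= 2, so the loop only illustrates the limits and does not prove them.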