Convergence concepts in probability theory
Probability theory 2010

- Definitions and relations between convergence concepts
- Sufficient conditions for almost sure convergence
- Convergence via transforms
- The law of large numbers and the central limit theorem

Coin-tossing: relative frequency of heads

[Figure: relative frequency of heads in three simulated coin-tossing trajectories (Series 1-3), plotted against the number of tosses, 0-400.]

Convergence of each trajectory? Convergence in probability?

Convergence to a constant

The sequence {X_n} of random variables converges almost surely to the constant c if and only if
$P(\{\omega :\ X_n(\omega) \to c \text{ as } n \to \infty\}) = 1.$

The sequence {X_n} of random variables converges in probability to the constant c if and only if, for all $\varepsilon > 0$,
$P(\{\omega :\ |X_n(\omega) - c| > \varepsilon\}) \to 0 \text{ as } n \to \infty.$

An (artificial) example

Let X_1, X_2, ... be a sequence of independent binary random variables such that
$P(X_n = 1) = 1/n$ and $P(X_n = 0) = 1 - 1/n.$
Does X_n converge to 0 in probability? Does X_n converge to 0 almost surely? Is there a common exception set?

The law of large numbers for random variables with finite variance

Let {X_n} be a sequence of independent and identically distributed random variables with mean $\mu$ and variance $\sigma^2$, and set $S_n = X_1 + \dots + X_n$. Then
$P(|S_n/n - \mu| > \varepsilon) \to 0 \text{ as } n \to \infty, \text{ for all } \varepsilon > 0.$

Proof: Assume that $\mu = 0$. Then, by Chebyshev's inequality,
$P(|S_n/n| > \varepsilon) \le \frac{\operatorname{Var}(S_n/n)}{\varepsilon^2} = \frac{\sigma^2}{n\varepsilon^2} \to 0 \text{ as } n \to \infty.$

Convergence to a random variable: definitions

The sequence {X_n} of random variables converges almost surely to the random variable X if and only if
$P(\{\omega :\ X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = 1.$
Notation: $X_n \xrightarrow{a.s.} X$ as $n \to \infty$.

The sequence {X_n} of random variables converges in probability to the random variable X if and only if, for all $\varepsilon > 0$,
$P(\{\omega :\ |X_n(\omega) - X(\omega)| > \varepsilon\}) \to 0 \text{ as } n \to \infty.$
Notation: $X_n \xrightarrow{p} X$ as $n \to \infty$.

Convergence to a random variable: an example

Assume that the concentration of NO in air is continuously recorded, and let X_t be the concentration at time t. Consider the random variables
$Y = \max_{0 \le t \le 1} X_t, \qquad Y_n = \max(X_0, X_{1/n}, X_{2/n}, \dots, X_1).$
Does Y_n converge to Y in probability? Does Y_n converge to Y almost surely?

Convergence in distribution: an example

Let $X_n \in \mathrm{Bin}(n, c/n)$. Then the distribution of X_n converges to a Po(c) distribution as $n \to \infty$.

[Figure: probability functions of the Bin(20, 0.1) distribution and the Poisson distribution with the same mean c = np = 2, for k = 0, 1, ..., 10.]

Convergence in distribution and in norm

The sequence X_n converges in distribution to the random variable X as $n \to \infty$ iff
$F_{X_n}(x) \to F_X(x) \text{ as } n \to \infty$ for all x where $F_X(x)$ is continuous.
Notation: $X_n \xrightarrow{d} X$ as $n \to \infty$.

The sequence X_n converges in quadratic mean to the random variable X as $n \to \infty$ iff
$E|X_n - X|^2 \to 0 \text{ as } n \to \infty.$
Notation: $X_n \xrightarrow{2} X$ as $n \to \infty$.

Relations between the convergence concepts

[Diagram: almost sure convergence implies convergence in probability, convergence in r-mean implies convergence in probability, and convergence in probability implies convergence in distribution.]

Convergence in probability implies convergence in distribution

Note that, for all $\varepsilon > 0$,
$P(X_n \le x) = P(X_n \le x,\ |X_n - X| \le \varepsilon) + P(X_n \le x,\ |X_n - X| > \varepsilon) \le P(X \le x + \varepsilon) + P(|X_n - X| > \varepsilon).$

Convergence almost surely vs. convergence in r-mean

Consider a branching process in which the offspring distribution has mean 1. Does it converge to zero almost surely? Does it converge to zero in quadratic mean?

Let X_1, X_2, ... be a sequence of independent random variables such that
$P(X_n = n^2) = 1/n^2$ and $P(X_n = 0) = 1 - 1/n^2.$
Does X_n converge to 0 in probability? Does X_n converge to 0 almost surely? Does X_n converge to 0 in quadratic mean?
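The last example is easy to probe numerically. The sketch below is an added illustration, not part of the original notes: the tolerance eps = 0.5, the sample size 200 000 and the chosen values of n are arbitrary. It estimates $P(|X_n| > \varepsilon)$, which equals $1/n^2$ and tends to 0 (convergence in probability), and $E|X_n|^2 = n^2$, which grows without bound (no convergence in quadratic mean). Almost sure convergence cannot be read off from a fixed n; it is settled by the Borel-Cantelli lemmas later in these notes.

```python
import numpy as np

rng = np.random.default_rng(2010)  # illustrative seed, chosen only for reproducibility

def sample_xn(n, size):
    """Draw `size` copies of X_n, where P(X_n = n^2) = 1/n^2 and P(X_n = 0) = 1 - 1/n^2."""
    return np.where(rng.random(size) < 1.0 / n**2, float(n**2), 0.0)

eps = 0.5
for n in [5, 20, 50]:
    x = sample_xn(n, 200_000)
    p_exceed = np.mean(np.abs(x) > eps)   # estimates P(|X_n| > eps) = 1/n^2, tending to 0
    second_moment = np.mean(x**2)         # estimates E|X_n|^2 = n^2, growing without bound
    print(f"n={n:3d}  P(|X_n|>{eps}) ~ {p_exceed:.5f}   E|X_n|^2 ~ {second_moment:.1f}")
```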
Relations between different types of convergence to a constant

[Diagram: the same implications as above, and in addition, when the limit is a constant, convergence in distribution also implies convergence in probability.]

Convergence via generating functions

Let X, X_1, X_2, ... be a sequence of non-negative, integer-valued random variables, and suppose that
$g_{X_n}(t) \to g_X(t) \text{ as } n \to \infty.$
Then
$X_n \xrightarrow{d} X \text{ as } n \to \infty.$
Is the limit function of a sequence of generating functions a generating function?

Convergence via moment generating functions

Let X, X_1, X_2, ... be a sequence of random variables, and suppose that
$\psi_{X_n}(t) \to \psi_X(t) \text{ as } n \to \infty, \text{ for } |t| \le h.$
Then
$X_n \xrightarrow{d} X \text{ as } n \to \infty.$
Is the limit function of a sequence of moment generating functions a moment generating function?

Convergence via characteristic functions

Let X, X_1, X_2, ... be a sequence of random variables, and suppose that
$\varphi_{X_n}(t) \to \varphi_X(t) \text{ as } n \to \infty, \text{ for all } t.$
Then
$X_n \xrightarrow{d} X \text{ as } n \to \infty.$
Is the limit function of a sequence of characteristic functions a characteristic function?

Convergence to a constant via characteristic functions

Let X_1, X_2, ... be a sequence of random variables, and suppose that
$\varphi_{X_n}(t) \to e^{itc} \text{ as } n \to \infty.$
Then
$X_n \xrightarrow{p} c \text{ as } n \to \infty.$

The law of large numbers (for variables with finite expectation)

Let {X_n} be a sequence of independent and identically distributed random variables with expectation $\mu$, and set $S_n = X_1 + \dots + X_n$. Then
$\bar X_n = \frac{S_n}{n} \xrightarrow{p} \mu \text{ as } n \to \infty.$

The strong law of large numbers (for variables with finite expectation)

Let {X_n} be a sequence of independent and identically distributed random variables with expectation $\mu$, and set $S_n = X_1 + \dots + X_n$. Then
$\bar X_n = \frac{S_n}{n} \xrightarrow{a.s.} \mu \text{ as } n \to \infty.$

The central limit theorem

Let {X_n} be a sequence of independent and identically distributed random variables with mean $\mu$ and variance $\sigma^2$, and set $S_n = X_1 + \dots + X_n$. Then
$\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0, 1) \text{ as } n \to \infty.$

Proof: If $\mu = 0$, we get
$\varphi_{S_n/(\sigma\sqrt{n})}(t) = \left(\varphi_X\!\left(\tfrac{t}{\sigma\sqrt{n}}\right)\right)^n = \left(1 - \tfrac{t^2}{2n} + o\!\left(\tfrac{t^2}{n}\right)\right)^n \to e^{-t^2/2} \text{ as } n \to \infty.$

Rate of convergence in the central limit theorem

$\sup_x \left| F_{(S_n - n\mu)/(\sigma\sqrt{n})}(x) - \Phi(x) \right| \le 0.7975\,\frac{E|X|^3}{\sigma^3\sqrt{n}}$

Example: $X \in U(0,1)$ gives
$0.7975\,\frac{E|X|^3}{\sigma^3\sqrt{n}} = 0.7975\,\frac{1/4}{(1/12)^{3/2}\sqrt{n}} \approx \frac{8.3}{\sqrt{n}}.$

Sums of exponentially distributed random variables

[Figure: probability density function and cumulative distribution function of the Gamma(10, 1) distribution, compared with the N(10, sqrt(10)) approximation.]

Convergence of empirical distribution functions

$\hat F_n(x) = \frac{\#\{\text{observations} \le x\}}{n}, \qquad \sqrt{n}\,\bigl(\hat F_n(x) - F(x)\bigr) \xrightarrow{d} N\bigl(0,\ F(x)(1 - F(x))\bigr).$

Proof: write $\hat F_n(x)$ as a sum of indicator functions.

Bootstrap techniques: the original distribution is replaced with the empirical distribution.

Resampling techniques - the bootstrap method

Observed data: $x_1, x_2, \dots, x_N$. Sampling with replacement from the observed data gives the resampled data $x_1^*, x_2^*, \dots, x_N^*$.

[Diagram: a sample of observed values, e.g. 62, 90, 22, 41, 34, 67, 88, 79, ..., is resampled with replacement to produce bootstrap samples of the same size.]

Characteristics of infinite sequences of events

Let {A_n, n = 1, 2, ...} be a sequence of events, and define
$A_* = \liminf_{n\to\infty} A_n = \bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty} A_m, \qquad A^* = \limsup_{n\to\infty} A_n = \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} A_m.$

Example: Consider a queueing system and let A_n = {the queueing system is empty at time n}.

The probability that an event occurs infinitely often - Borel-Cantelli's first lemma

Let {A_n, n = 1, 2, ...} be an arbitrary sequence of events. Then
$\sum_{n=1}^{\infty} P(A_n) < \infty \;\Rightarrow\; P(A_n \text{ i.o.}) = 0.$

Is the converse true?
Example: Consider a queueing system and let A_n = {the queueing system is empty at time n}.

The probability that an event occurs infinitely often - Borel-Cantelli's second lemma

Let {A_n, n = 1, 2, ...} be a sequence of independent events. Then
$\sum_{n=1}^{\infty} P(A_n) = \infty \;\Rightarrow\; P(A_n \text{ i.o.}) = 1.$

Necessary and sufficient conditions for almost sure convergence of independent random variables

Let X_1, X_2, ... be a sequence of independent random variables. Then
$X_n \xrightarrow{a.s.} 0 \text{ as } n \to \infty \iff \sum_{n=1}^{\infty} P(|X_n| > \varepsilon) < \infty \text{ for all } \varepsilon > 0.$

Exercises: Chapter VI
6.1, 6.6, 6.9, 6.10, 6.17, 6.21, 6.25, 6.49
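As a closing numerical remark, the criterion above links back to the two artificial examples earlier in these notes. The sketch below is an added illustration (the tolerance eps = 0.5 and the truncation at 10^6 terms are arbitrary): for $P(X_n = 1) = 1/n$ the partial sums of $P(|X_n| > \varepsilon)$ grow like $\log n$, so that independent sequence does not converge to 0 almost surely, whereas for $P(X_n = n^2) = 1/n^2$ the sums converge (to $\pi^2/6$), so that sequence does.

```python
import numpy as np

# Partial sums of sum_n P(|X_n| > eps) for the two examples from these notes.
# Criterion: for independent X_n, X_n -> 0 a.s. iff the sum is finite for every eps > 0.
eps = 0.5  # any eps in (0, 1) gives P(|X_n| > eps) = P(X_n != 0) in both examples

n = np.arange(1, 1_000_001)
tail_first  = np.cumsum(1.0 / n)     # P(X_n = 1)   = 1/n   : partial sums diverge (~ log n)
tail_second = np.cumsum(1.0 / n**2)  # P(X_n = n^2) = 1/n^2 : partial sums converge to pi^2/6

for k in [10, 1_000, 1_000_000]:
    print(f"N={k:>9}:  sum 1/n = {tail_first[k-1]:8.3f}   sum 1/n^2 = {tail_second[k-1]:.5f}")
# The first column keeps growing (no a.s. convergence to 0);
# the second stabilises near pi^2/6 ~ 1.64493 (X_n -> 0 almost surely).
```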