2 THE HILBERT SPACE L²

Definition: Let (Ω, 𝒜, P) be a probability space. The set of all random variables X: Ω → ℝ satisfying EX² < ∞ is denoted by L².

Remark: EX² < ∞ implies E|X| < ∞ (or equivalently EX ∈ ℝ), because |X| ≤ X² + 1 ⇒ E|X| ≤ EX² + 1.

Proposition: The set L² together with the pointwise scalar multiplication defined for X ∈ L² and λ ∈ ℝ by (λX)(ω) = λ(X(ω)), ω ∈ Ω, and the pointwise addition defined for X, Y ∈ L² by (X+Y)(ω) = X(ω) + Y(ω), ω ∈ Ω, is a vector space.

Proof:
(i) The two operations are closed, because
X ∈ L², λ ∈ ℝ ⇒ EX² < ∞ ⇒ E(λX)² = λ²EX² < ∞ ⇒ λX ∈ L²
and
X, Y ∈ L² ⇒ EX², EY² < ∞ ⇒ E(X+Y)² ≤ E(2X² + 2Y²) < ∞ ⇒ X+Y ∈ L².
(ii) The associative, commutative, and distributive properties
(X+Y)+Z = X+(Y+Z), (λµ)X = λ(µX), X+Y = Y+X, λ(X+Y) = (λX)+(λY), (λ+µ)X = (λX)+(µX)
follow immediately from the pointwise definitions of the two operations. For example, if X, Y, Z ∈ L², then
((X+Y)+Z)(ω) = (X+Y)(ω) + Z(ω) = (X(ω)+Y(ω)) + Z(ω) = X(ω) + (Y(ω)+Z(ω)) = X(ω) + (Y+Z)(ω) = (X+(Y+Z))(ω), ω ∈ Ω.
(iii) The random variable 0, which is identically zero on Ω, satisfies the property X+0 = X ∀X ∈ L² of a zero vector.
(iv) For all X ∈ L² there exists an inverse vector −X defined by (−X)(ω) = −(X(ω)), ω ∈ Ω, satisfying −X+X = 0.
(v) 1·X = X.

Exercise: Show that a function ⟨·,·⟩: L² × L² → ℝ can be defined by ⟨X,Y⟩ = EXY, which satisfies for X, Y, Z ∈ L² and λ ∈ ℝ
⟨X+Y,Z⟩ = ⟨X,Z⟩ + ⟨Y,Z⟩, ⟨λX,Y⟩ = λ⟨X,Y⟩, ⟨X,Y⟩ = ⟨Y,X⟩, ⟨X,X⟩ ≥ 0.

Solution:
−∞ < E(−X²−Y²) ≤ EXY ≤ E(X²+Y²) < ∞ ⇒ EXY ∈ ℝ,
⟨X+Y,Z⟩ = E(X+Y)Z = EXZ + EYZ = ⟨X,Z⟩ + ⟨Y,Z⟩,
⟨λX,Y⟩ = E(λX)Y = λEXY = λ⟨X,Y⟩,
⟨X,Y⟩ = EXY = EYX = ⟨Y,X⟩,
⟨X,X⟩ = EXX = EX² ≥ 0.

The function ⟨·,·⟩ satisfies all the properties of an inner product except for ⟨X,X⟩ = 0 ⇔ X = 0, because EX² = 0 implies only that P(X=0) = 1, but not that X(ω) = 0 for all ω ∈ Ω. Analogously, the function ‖X‖ = ⟨X,X⟩^½ satisfies all the properties of a norm except for ‖X‖ = 0 ⇔ X = 0. To circumvent this problem we identify two random variables if they are equal almost surely, i.e., we switch from the individual random variables X ∈ L² to equivalence classes [X] = {Y ∈ L²: P(Y=X) = 1} of random variables which agree almost surely.
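The inner-product identities from the exercise above can be checked numerically. In this minimal sketch (my own illustration, not part of the notes), a large i.i.d. sample plays the role of (Ω, 𝒜, P), and ⟨X,Y⟩ = EXY is estimated by a sample mean; the distributions chosen below are arbitrary L² examples.

```python
import numpy as np

# Approximate (Omega, A, P) by a large i.i.d. sample and check the
# inner-product properties <X,Y> = E[XY] via Monte Carlo averages.
rng = np.random.default_rng(0)
n = 1_000_000
X = rng.normal(1.0, 2.0, n)    # an L^2 random variable (EX^2 < infinity)
Y = rng.uniform(-1.0, 1.0, n)  # another L^2 random variable
Z = rng.exponential(1.0, n)

def inner(A, B):
    return np.mean(A * B)  # <A,B> = E[AB], estimated by the sample mean

lam = 3.0
# additivity: <X+Y, Z> = <X,Z> + <Y,Z>
assert np.isclose(inner(X + Y, Z), inner(X, Z) + inner(Y, Z))
# homogeneity: <lam X, Y> = lam <X,Y>
assert np.isclose(inner(lam * X, Y), lam * inner(X, Y))
# symmetry and nonnegativity: <X,Y> = <Y,X>, <X,X> >= 0
assert np.isclose(inner(X, Y), inner(Y, X))
assert inner(X, X) >= 0.0
```

Additivity, homogeneity, and symmetry hold exactly for sample means (they are linear identities), so the assertions pass up to floating-point rounding regardless of the sample drawn.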
Definition: Defining for equivalence classes [X], [Y] of almost surely equal elements of L² and λ ∈ ℝ
[X] + [Y] = [X+Y], λ[X] = [λX], ⟨[X],[Y]⟩ = ⟨X,Y⟩,
we obtain an inner product space, which is again denoted by L².

Proposition: The inner product space L² of equivalence classes of almost surely equal random variables with finite variances is complete, i.e.,
Xn ∈ L² for all n, ‖Xm − Xn‖ → 0 (m, n → ∞) ⇒ ∃X ∈ L²: ‖Xn − X‖ → 0.
Thus L² is a Hilbert space.

Remark: Norm convergence ‖Xn − X‖ → 0 is equivalent to mean square convergence ‖Xn − X‖² = E(Xn − X)² → 0.

Exercise: Show that the relation ~ defined by X ~ Y ⇔ P(X=Y) = 1 is indeed an equivalence relation by verifying the reflexive, symmetric, and transitive properties
X ~ X, X ~ Y ⇒ Y ~ X, X ~ Y, Y ~ Z ⇒ X ~ Z ∀X, Y, Z ∈ L².

Solution: The reflexive and symmetric properties are immediate. The transitive property is satisfied, because
{ω: X(ω)=Z(ω)} ⊇ {ω: X(ω)=Y(ω)=Z(ω)}
⇒ {ω: X(ω)=Z(ω)}ᶜ ⊆ {ω: X(ω)=Y(ω)=Z(ω)}ᶜ = ({ω: X(ω)=Y(ω)} ∩ {ω: Y(ω)=Z(ω)})ᶜ = {ω: X(ω)=Y(ω)}ᶜ ∪ {ω: Y(ω)=Z(ω)}ᶜ
⇒ P({ω: X(ω)=Z(ω)}ᶜ) ≤ P({ω: X(ω)=Y(ω)}ᶜ) + P({ω: Y(ω)=Z(ω)}ᶜ) = 0.

Proposition: If E(Xn − X)² → 0 and E(Yn − Y)² → 0, then
(i) EXn → EX,
(ii) EXnYn → EXY,
(iii) Cov(Xn,Yn) → Cov(X,Y),
(iv) Var(Xn) → Var(X).

Proof:
(i) EXn = E(Xn·1) = ⟨Xn,1⟩ → ⟨X,1⟩ = E(X·1) = EX,
(ii) EXnYn = ⟨Xn,Yn⟩ → ⟨X,Y⟩ = EXY,
(iii) Cov(Xn,Yn) = EXnYn − EXnEYn → EXY − EXEY = Cov(X,Y),
(iv) Var(Xn) = Cov(Xn,Xn) → Cov(X,X) = Var(X),
where (i) and (ii) use the continuity of the inner product.

Definition: The conditional expectation of X ∈ L² given a closed subspace S ⊆ L², which contains the constant function 1, is defined to be the projection of X onto S, i.e., E(X|S) = P_S(X).

Remark: The conditional expectation satisfies ‖X − E(X|S)‖² < ‖X − Y‖² for all other elements Y of S.

Definition: The conditional expectation of X ∈ L² given X1,…,Xn ∈ L² is defined to be the projection of X onto the closed subspace M(X1,…,Xn) spanned by all random variables of the form g(X1,…,Xn), where g: ℝⁿ → ℝ is some measurable function, i.e., E(X|X1,…,Xn) = P_{M(X1,…,Xn)}(X).

Remarks:
(i) It follows from span(1,X1,…,Xn) ⊆ M(X1,…,Xn) that
‖X − E(X|X1,…,Xn)‖² ≤ ‖X − E(X|span(1,X1,…,Xn))‖².
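The proposition on mean square convergence can be illustrated by simulation. In this sketch (an illustration of my own; the particular sequences Xn, Yn are arbitrary choices), Xn = X + noise with E(Xn − X)² = 1/n² → 0, so the sample moments of (Xn, Yn) should approach those of (X, Y).

```python
import numpy as np

# If X_n -> X and Y_n -> Y in mean square, then EX_n -> EX,
# EX_nY_n -> EXY, and Cov(X_n,Y_n) -> Cov(X,Y).
rng = np.random.default_rng(1)
m = 500_000
X = rng.normal(0.0, 1.0, m)          # EX = 0, Var(X) = 1
Y = 2.0 * X + rng.normal(0.0, 1.0, m)  # EXY = 2, Cov(X,Y) = 2

for n in (1, 10, 100):
    # X_n = X + noise with E(X_n - X)^2 = 1/n^2 -> 0 (mean square convergence)
    Xn = X + rng.normal(0.0, 1.0 / n, m)
    Yn = Y + rng.normal(0.0, 1.0 / n, m)
    print(n, np.mean(Xn), np.mean(Xn * Yn), np.cov(Xn, Yn)[0, 1])
# As n grows, the printed estimates approach EX = 0, EXY = 2, Cov(X,Y) = 2.
```

The theoretical limits here follow from Y = 2X + η with η independent of X: EXY = 2·EX² = 2 and Cov(X,Y) = 2·Var(X) = 2.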
(ii) For elements of L² the definition of E(X|X1,…,Xn) above coincides with the more general definition of conditional expectation as the mean of the conditional distribution.

Exercise: Show that the bivariate normal density
f(x) = f(x1,x2) = (1 / (2π √det Σ)) exp(−½ (x−µ)ᵀ Σ⁻¹ (x−µ))
with mean vector µ = (µ1,µ2)ᵀ and covariance matrix
Σ = ( σ1²     ρσ1σ2 )
    ( ρσ1σ2   σ2²  )
factors into two univariate normal densities: the marginal density f1 with mean µ1 and variance σ1², and the conditional density f2|1 with mean µ2 + ρσ2(x1−µ1)/σ1 and variance (1−ρ²)σ2².

Solution: Putting z1 = (x1−µ1)/σ1, z2 = (x2−µ2)/σ2, using
Σ⁻¹ = (1 / (σ1²σ2²(1−ρ²))) ( σ2²      −ρσ1σ2 )
                            ( −ρσ1σ2   σ1²   )
and completing squares, we obtain
(x−µ)ᵀΣ⁻¹(x−µ)
= (σ2²(x1−µ1)² − 2ρσ1σ2(x1−µ1)(x2−µ2) + σ1²(x2−µ2)²) / (σ1²σ2²(1−ρ²))
= (z1² − 2ρz1z2 + z2²) / (1−ρ²)
= (z1² − ρ²z1²)/(1−ρ²) + (ρ²z1² − 2ρz1z2 + z2²)/(1−ρ²)
= z1² + (z2 − ρz1)²/(1−ρ²).
Thus, since det Σ = σ1²σ2²(1−ρ²),
f(x1,x2) = (1/√(2πσ1²)) exp(−½ z1²) · (1/√(2π(1−ρ²)σ2²)) exp(−½ (z2−ρz1)²/(1−ρ²)),
where the first factor is the density of x1 with mean µ1 and variance σ1², and the second factor is the density of x2 with mean µ2 + ρσ2(x1−µ1)/σ1 and variance (1−ρ²)σ2².

Remark: The last exercise shows that in the case of a bivariate normal random vector (X1,X2) the mean of the conditional distribution of X2 given X1 is a linear function of 1 and X1. More generally, if (X,X1,…,Xn)ᵀ has a multivariate normal distribution, then
E(X|X1,…,Xn) = E(X|span(1,X1,…,Xn)).
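The linearity of the conditional mean can be checked by simulation. In this sketch (parameter values µ1, µ2, σ1, σ2, ρ are my own choices), least-squares projection of X2 onto span(1, X1) should recover the slope ρσ2/σ1 and the matching intercept from the exercise above.

```python
import numpy as np

# For bivariate normal (X1, X2), E(X2 | X1) = mu2 + rho*(sigma2/sigma1)*(X1 - mu1),
# i.e. linear in 1 and X1, so the L^2 projection onto span(1, X1) attains it.
rng = np.random.default_rng(2)
mu1, mu2, s1, s2, rho = 1.0, -2.0, 2.0, 3.0, 0.6
n = 1_000_000
Z1 = rng.normal(size=n)
Z2 = rng.normal(size=n)
X1 = mu1 + s1 * Z1
X2 = mu2 + s2 * (rho * Z1 + np.sqrt(1 - rho**2) * Z2)  # Corr(X1, X2) = rho

# project X2 onto span(1, X1) by ordinary least squares
A = np.column_stack([np.ones(n), X1])
(b0, b1), *_ = np.linalg.lstsq(A, X2, rcond=None)

slope = rho * s2 / s1          # theoretical slope: 0.9
intercept = mu2 - slope * mu1  # theoretical intercept: -2.9
print(b0, b1)  # close to (intercept, slope) up to Monte Carlo error
```

Fitting any additional measurable features of X1 (e.g. X1², sin X1) would leave these coefficients essentially unchanged, reflecting the remark that E(X2|X1) already lies in span(1, X1) in the Gaussian case.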