Prof. Dr. J. Franke — Basic Statistics

3. Numerical characteristics of random variables

3.1 Expectation

X discrete random variable with values x_1, \dots, x_m;
discrete uniform distribution: pr(X = x_j) = \frac{1}{m}, j = 1, \dots, m.

Expectation (or mean) of X:

EX = \frac{1}{m} \sum_{j=1}^{m} x_j

More generally, with arbitrary probability weights pr(X = x_j) = p_j:

EX = \sum_{j=1}^{m} x_j p_j = \sum_{j=1}^{m} x_j \, pr(X = x_j)

Expectation EX of an arbitrary real-valued random variable

a) The distribution of X is discrete with values x_1, x_2, \dots and probability weights p_j = pr(X = x_j):

EX = \sum_{j=1}^{\infty} x_j p_j = \sum_{j=1}^{\infty} x_j \, pr(X = x_j)

b) X has probability density p(x):

EX = \int_{-\infty}^{\infty} x \, p(x) \, dx

There are (rare) cases where EX does not exist!

Examples:

a) X Poisson-distributed, P(\lambda), with values 0, 1, 2, \dots:

EX = \sum_{j=0}^{\infty} j \, p_j = \sum_{j=0}^{\infty} j \, \frac{\lambda^j}{j!} e^{-\lambda} = \lambda

b) X exponentially distributed, Exp(\lambda):

EX = \int_{-\infty}^{\infty} x \, p(x) \, dx = \int_{0}^{\infty} x \, \lambda e^{-\lambda x} \, dx = \frac{1}{\lambda}

c) B(n, p): EX = np

d) N(\mu, \sigma^2): EX = \mu

e) Weibull(\lambda, \beta): EX = \lambda^{-1/\beta} \, \Gamma(1 + \frac{1}{\beta}), where \Gamma denotes the Gamma function, with \Gamma(n + 1) = n!

In general: if X has a probability density which is symmetric around \mu, then EX = \mu.

Expectations of functions of a random variable

X random variable with values in a set \mathcal{X}, f a real-valued function on \mathcal{X}; Ef(X) = ?

1st approach: determine the distribution of the random variable Y = f(X),
e.g. for X \sim Exp(\lambda): Y = \sqrt{X} is Weibull-distributed with \beta = 2.

2nd approach:

a) X with values x_j and probability weights p_j, j = 1, 2, \dots:

Ef(X) = \sum_{j=1}^{\infty} f(x_j) p_j = \sum_{j=1}^{\infty} f(x_j) \, pr(X = x_j)

b) X with probability density p(x):

Ef(X) = \int_{-\infty}^{\infty} f(x) \, p(x) \, dx

Law of large numbers: X_1, \dots, X_N independent realisations of the same real-valued random variable X with EX = \mu.
Then,

\bar{X}_N = \frac{1}{N} \sum_{j=1}^{N} X_j \to \mu \quad \text{for } N \to \infty

(randomness disappears due to averaging). More precisely: pr(\bar{X}_N \to \mu) = 1.

Interpretation of the expectation EX: repeat the experiment which results in X very often, in an independent manner, giving independent X_1, \dots, X_N with the same distribution as X. Then,

\bar{X}_N = \frac{1}{N} \sum_{j=1}^{N} X_j \approx EX

[Figure: plot of 50 sample means of N Exp(\lambda) variables, for various sample sizes N]

Rules of calculation for expectations

The expectation is linear: for arbitrary constants c_1, \dots, c_N,

E(c_1 X_1 + \dots + c_N X_N) = c_1 EX_1 + \dots + c_N EX_N

Factorization of the expectation for independent X_1, \dots, X_N:

E(X_1 \cdot \dots \cdot X_N) = EX_1 \cdot \dots \cdot EX_N

In particular, for i.i.d. X_1, \dots, X_N:

E(X_1 \cdot \dots \cdot X_N) = (EX_1)^N

3.2 Variance

X real-valued random variable with expectation EX = \mu. Its variance is

var X = E(X - EX)^2 = E(X - \mu)^2

If X has a density p(x), then, in particular,

var X = \int_{-\infty}^{\infty} (x - \mu)^2 p(x) \, dx

Standard deviation of X: \sigma(X) = \sqrt{var X}.

If X is N(\mu, \sigma^2)-distributed, then var X = \sigma^2.

Rules of calculation for variances

var(cX) = c^2 \, var X, \qquad \sigma(cX) = |c| \, \sigma(X)

Additivity of the variance for independent X_1, \dots, X_N:

var \sum_{j=1}^{N} X_j = \sum_{j=1}^{N} var X_j

In particular, for i.i.d. X_1, \dots, X_N:

var \sum_{j=1}^{N} X_j = N \cdot var X_1

var \bar{X}_N = var\Big(\frac{1}{N} \sum_{j=1}^{N} X_j\Big) = \frac{1}{N} var X_1 \to 0

so \bar{X}_N \to const, and const = EX_1, since E\bar{X}_N = \frac{1}{N} \sum_{j=1}^{N} EX_j = EX_1.

3.3 Dependent random variables: covariance and correlation

X, Y real-valued random variables with probability densities p_X, p_Y;
V = (X, Y) random vector with values in \mathbb{R}^2, with two-dimensional density p_V(x, y):

pr(a \le X \le b, \; c \le Y \le d) = \int_{c}^{d} \int_{a}^{b} p_V(x, y) \, dx \, dy

X, Y independent \iff p_V(x, y) = p_X(x) \cdot p_Y(y)

Instead of investigating the whole function p_V(x, y), a single number serves as a measure for the strength of dependence: the covariance resp. correlation of X and Y.
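The law of large numbers and the rule var \bar{X}_N = var X_1 / N from above can be checked with a short simulation. This is a minimal NumPy sketch, not part of the original slides; the rate \lambda = 2, the seed, and the sample sizes are arbitrary illustration choices. For Exp(\lambda) we have EX = 1/\lambda and var X = 1/\lambda^2.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0                  # rate of the Exp(lambda) distribution (arbitrary choice)
mu = 1.0 / lam             # EX = 1/lambda
var_x = 1.0 / lam**2       # var X = 1/lambda^2

for n in (10, 100, 10_000):
    # 50 independent sample means of N = n Exp(lambda) variables each,
    # mirroring the "50 sample means" plot on the slides
    means = rng.exponential(scale=1.0 / lam, size=(50, n)).mean(axis=1)
    print(f"N={n:6d}  mean of X_bar = {means.mean():.4f}  "
          f"var of X_bar = {means.var():.6f}  (theory: {var_x / n:.6f})")
```

As N grows, the 50 sample means concentrate around \mu = 0.5 and their empirical variance shrinks roughly like var X_1 / N, i.e. the randomness disappears due to averaging.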
Covariance cov(X, Y) of two real-valued random variables X, Y:

cov(X, Y) = E\big[(X - EX) \cdot (Y - EY)\big] = E(X \cdot Y) - EX \cdot EY

Correlation:

corr(X, Y) = \frac{cov(X, Y)}{\sigma(X) \cdot \sigma(Y)} = \frac{cov(X, Y)}{\sqrt{var X \cdot var Y}}

Correlation is scale invariant: corr(aX, bY) = corr(X, Y) for a, b > 0.

We always have: -1 \le corr(X, Y) \le +1.

How to interpret the value of the correlation?

1) X, Y independent \implies cov(X, Y) = 0 and corr(X, Y) = 0; X, Y are then called uncorrelated.

2) Y proportional to X, i.e. Y = c \cdot X for some c \neq 0:

cov(X, Y) = cov(X, cX) = c \cdot \underbrace{cov(X, X)}_{= var X} = c \cdot var X, \qquad var Y = c^2 \cdot var X

corr(X, Y) = \begin{cases} +1 & c > 0 \\ -1 & c < 0 \end{cases}

In general, this holds for Y = c \cdot X + d, too.

3) X, Y uncorrelated, i.e. corr(X, Y) = 0, does not imply (!) that X, Y are independent.

Extreme counterexample: X is U(-1, +1)-distributed and Y = X^2. Then corr(X, Y) = 0, although Y is completely determined by X (but in a nonlinear manner).

Summary: correlation measures the degree of linear dependence of X and Y.

Important special case: X, Y jointly normally distributed. Then: X, Y independent \iff corr(X, Y) = 0.

Visualisation of dependence: scatter plots

Two measurements from each object give two-dimensional data (x_1, y_1), \dots, (x_N, y_N) \in \mathbb{R}^2, modelled as random vectors (X_j, Y_j), j = 1, \dots, N, and shown as a scatter plot.

If X_j, Y_j do not influence each other (uncorrelated), the plot is typically an ellipse with main axes parallel to the coordinate axes (normal distribution!); if X_j, Y_j in addition have the same variance, a circle. If X_j, Y_j do influence each other, with corr(X_j, Y_j) > 0 resp. < 0, the point cloud increases or decreases.

[Figure: Gaussian data — uncorrelated, equal variances]
[Figure: Gaussian data — uncorrelated, unequal variances]
[Figure: Gaussian data — positive correlation]
[Figure: Gaussian data — negative correlation]
[Figure: independent exponential component variables]
[Figure: non-Gaussian data — positive correlation]
[Figure: deterministically dependent, but uncorrelated]
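The properties of covariance and correlation above can also be illustrated empirically. This is a hedged NumPy sketch, not from the slides: the coefficients 0.8 and 0.6 are arbitrary choices that make the theoretical correlation exactly 0.8, and the last lines reproduce the U(-1, +1), Y = X^2 counterexample, where cov(X, Y) = E(X^3) - EX \cdot E(X^2) = 0.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Linearly dependent Gaussian pair: Y = 0.8 X + 0.6 Z with X, Z independent N(0, 1),
# so var Y = 0.64 + 0.36 = 1 and corr(X, Y) = cov(X, Y) = 0.8
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)

corr_xy = np.corrcoef(x, y)[0, 1]               # empirical corr(X, Y)
corr_scaled = np.corrcoef(3 * x, 5 * y)[0, 1]   # scale invariance: corr(aX, bY), a, b > 0
print(f"corr(X, Y)   = {corr_xy:.3f}")
print(f"corr(3X, 5Y) = {corr_scaled:.3f}")

# Counterexample: X ~ U(-1, 1), Y = X^2 is fully determined by X, yet uncorrelated
u = rng.uniform(-1.0, 1.0, size=n)
print(f"corr(U, U^2) = {np.corrcoef(u, u**2)[0, 1]:.3f}")
```

The first two printed values agree (scale invariance), and the last one is close to 0 even though U^2 is a deterministic function of U: correlation only detects linear dependence.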