Download L2: Lecture notes: Distributions

Distributions of random variables

A random variable (r.v.) is a real function $X: S \to \mathbb{R}$ on the sample space $S$ (a quantitative aspect of the random experiment). The range is $S_X = \{X(s) \mid s \in S\}$.

We distinguish discrete and continuous r.v.'s:
X is discrete iff (= if and only if) $S_X$ is countable, that is: $S_X = \{x_1, x_2, \ldots, x_n\}$ or $S_X = \{x_1, x_2, \ldots\}$.
If X is continuous, then $S_X$ is usually an interval in $\mathbb{R}$ (a definition follows later).

Discrete r.v. X: the event X = x is $\{s \in S \mid X(s) = x\}$.
The probability (mass) function of X: $P: x \mapsto P(X = x)$.

Requirements for a probability function P:
1. $P(X = x) \ge 0$ for every $x \in S_X$
2. $\sum_{x \in S_X} P(X = x) = 1$

The distribution of a discrete r.v. X consists of a table or formula for $P(X = x)$ for all $x \in S_X$.

The measure for the centre of the distribution is the expectation or expected value of X:

$$E(X) = \sum_{x \in S_X} x \, P(X = x),$$

provided that the sum is absolutely convergent.
Notation: $E(X) = EX = \mu_X = \mu$. Interpretation: "weighted average".

Properties of E(X):
1. If $P(X = x)$ is symmetric about $x = c$, then $E(X) = c$
2. $E\,g(X) = \sum_{x \in S_X} g(x) \, P(X = x)$
3. $E(aX + b) = aE(X) + b$
4. $E[a\,g(X) + b\,h(X)] = a\,E\,g(X) + b\,E\,h(X)$

Note that in general $E(X^2) \ne (E(X))^2$!

Functions of X and their expectation:
$E(X^k)$ is the kth moment of X.
$\mathrm{var}(X) = E(X - \mu_X)^2$ is the variance of X. Notation: $\mathrm{var}(X) = \sigma_X^2 = \sigma^2$.
$\sigma_X = \sqrt{\mathrm{var}(X)}$ is the standard deviation of the r.v. X.
var(X) and $\sigma_X$ are both measures of spread for the distribution.

Properties of var(X) and $\sigma_X$:
1. $\mathrm{var}(X) \ge 0$ and $\sigma_X \ge 0$
2. $\mathrm{var}(X) = E(X^2) - \mu_X^2$ (formula for computations)
3. $\mathrm{var}(X) > 0 \Rightarrow E(X^2) > \mu_X^2$, and $\mathrm{var}(X) = 0 \Rightarrow P(X = \mu_X) = 1$
4. $\mathrm{var}(aX + b) = a^2 \, \mathrm{var}(X)$ and $\sigma_{aX+b} = |a| \, \sigma_X$

Chebyshev's inequality:

$$P(|X - \mu_X| \ge c) \le \frac{\mathrm{var}(X)}{c^2} \quad \text{for all } c > 0$$

Discrete distributions and characteristics:

| Name | P(X = k) | E(X) | var(X) |
|---|---|---|---|
| Binomial B(n, p) | $\binom{n}{k} p^k (1-p)^{n-k}$, for k = 0, 1, …, n | $np$ | $np(1-p)$ |
| Poisson(µ) | $\frac{\mu^k}{k!} e^{-\mu}$, for k = 0, 1, … | $\mu$ | $\mu$ |
| Geometric(p) | $(1-p)^{k-1} p$, for k = 1, 2, … | $\frac{1}{p}$ | $\frac{1-p}{p^2}$ |
| Hypergeometric | $\binom{R}{k}\binom{N-R}{n-k} / \binom{N}{n}$, for k = 0, 1, …, n | $np$ (with $p = \frac{R}{N}$) | $np(1-p) \cdot \frac{N-n}{N-1}$ |

Properties (linking the distributions):
1. If the parts of the population are large compared to the sample size ($> 5n^2$), the hypergeometric probabilities can be approximated with the binomial.
2. If X ~ B(n, p) for large n and small p so that np > 10, X is approximately Poisson distributed with µ = np.

When to use these common discrete distributions?

Binomial: "the number of successes in n Bernoulli trials".
Ex: X = "# sixes in 25 rolls of a die", X ~ B(25, 1/6).

Geometric: "the trial number of the first success when performing Bernoulli trials".
Ex: X = "# of the first roll of a die that results in a 6".
Property: $P(X > k) = (1-p)^k$, k = 0, 1, 2, …

Hypergeometric: "the number of red balls selected when n balls are selected at random without replacement from an urn that contains R red and N-R white balls".
Ex: X = "# of girls when 5 persons are selected at random from a group of 8 boys and 12 girls" (N = 20, R = 12, n = 5).

Poisson: "the number of rare events in a period and/or space".
Ex: X = "# of traffic accidents on a busy road on a day".

Continuous random variables

X is a continuous random variable if there exists a non-negative function f(x), defined for all real x, so that for every (measurable) set B:

$$P(X \in B) = \int_B f(x)\,dx$$

f(x) or $f_X(x)$ is the probability density function.

Requirements:
1. $f(x) \ge 0$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Note: f(x) is not a probability, but for small dx > 0 we have $P(x < X \le x + dx) \approx f(x)\,dx$.

$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$ is the cumulative distribution function (c.d.f.). Notation: $F_X(x) = F(x)$.

Properties of F(x) for every random variable X:
1. $a < b \Rightarrow F(a) \le F(b)$ (F is non-decreasing)
2. $\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$
3. $\lim_{x \downarrow a} F(x) = F(a)$ (F is right continuous)
4. $P(X > x) = 1 - F(x)$
5. $P(X = x) = F(x) - \lim_{u \uparrow x} F(u)$

Properties of the density function f and c.d.f. F of a continuous r.v. X:
1. F(x) is a continuous function
2. $f(x) = \frac{d}{dx} F(x)$
3. $P(X = x) = 0$
4. $P(a < X < b) = P(a \le X \le b) = F(b) - F(a) = \int_a^b f(x)\,dx$

The expectation of a continuous r.v. X:

$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$

(provided that the integral is absolutely convergent).

A function Y = g(X) of a continuous r.v. X, when the density function $f_X$ is known:
The density function $f_Y(y)$ can be determined in 3 steps:
1. Express $F_Y(y) = P(g(X) \le y)$ in $F_X$.
2. Determine $f_Y(y) = \frac{d}{dy} F_Y(y)$.
3. Use the known distribution $f_X$.

$$E(Y) = E\,g(X) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$$

Especially: $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$.

All properties of E(X) and var(X) hold for continuous random variables as well, e.g. $\mathrm{var}(X) = E(X^2) - \mu_X^2$.

Properties of $f_X(x)$:
1. If $f_X(x)$ is symmetric about x = c, then E(X) = c
2. Linear transformation Y = aX + b with a ≠ 0 (known $f_X(x)$): $f_Y(y) = \frac{1}{|a|} f_X\!\left(\frac{y-b}{a}\right)$

Common continuous distributions

| Name | Probability density function | E(X) | var(X) |
|---|---|---|---|
| Uniform U(a, b) | $f(x) = \frac{1}{b-a}$, for x in [a, b] | $\frac{a+b}{2}$ | $\frac{(b-a)^2}{12}$ |
| Exponential (λ) | $f(x) = \lambda e^{-\lambda x}$, for x ≥ 0 | $\frac{1}{\lambda}$ | $\frac{1}{\lambda^2}$ |
| Standard normal N(0, 1) | $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ | 0 | 1 |
| Normal N(µ, σ²) | $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/(2\sigma^2)}$ | $\mu$ | $\sigma^2$ |

These distributions are often used as a model of the stochastic reality:
Uniform: random numbers from an interval
Exponential: waiting times, serving times
Normal: quantities or variables in nature, economy etc., varying around an average

Some properties of these continuous distributions:
1. An exponential variable has no memory: $P(X > x + y \mid X > x) = P(X > y)$. This follows from the exponential property $P(X > x) = e^{-\lambda x}$, x ≥ 0.
2. If X ~ U(0, 1), then Y = (b - a)X + a ~ U(a, b).
3. If X ~ N(µ, σ²), then Y = aX + b ~ N(aµ + b, a²σ²) for a ≠ 0.
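The definitions of E(X), var(X) and Chebyshev's inequality for a discrete r.v. can be checked numerically. The following is a minimal sketch in Python (standard library only) for the dice example X ~ B(25, 1/6); the helper names `binomial_pmf` and `expectation` and the choice c = 4 are illustrative assumptions, not part of the notes.

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(X=0), ..., P(X=n)] for X ~ B(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def expectation(pmf, g=lambda x: x):
    """E g(X) = sum of g(x) * P(X=x) over the range 0, 1, ..., len(pmf) - 1."""
    return sum(g(x) * px for x, px in enumerate(pmf))

n, p = 25, 1 / 6                      # number of sixes in 25 rolls of a die
pmf = binomial_pmf(n, p)

mu = expectation(pmf)                              # should match n * p
var = expectation(pmf, lambda x: x**2) - mu**2     # computational formula E(X^2) - mu^2

c = 4.0                                            # arbitrary illustrative value
tail = sum(px for x, px in enumerate(pmf) if abs(x - mu) >= c)
chebyshev_bound = var / c**2                       # upper bound on the tail probability

print(mu, var, tail, chebyshev_bound)
```

The exact tail probability is typically much smaller than the Chebyshev bound, since the inequality holds for every distribution with the given variance.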
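The hypergeometric example (the number of girls among 5 persons selected from 8 boys and 12 girls) can be worked out exactly. This sketch assumes N = 20, R = 12 and n = 5 as read off from that example, and uses exact fractions so that the table formulas for E(X) and var(X) can be compared without rounding; `hypergeom_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(N, R, n):
    """P(X=k) = C(R,k) * C(N-R,n-k) / C(N,n) for k = 0, ..., n, as exact fractions."""
    return {k: Fraction(comb(R, k) * comb(N - R, n - k), comb(N, n))
            for k in range(n + 1)}

N, R, n = 20, 12, 5          # 20 persons, 12 girls, sample of 5
pmf = hypergeom_pmf(N, R, n)

mean = sum(k * pk for k, pk in pmf.items())
var = sum(k * k * pk for k, pk in pmf.items()) - mean**2

# Table formulas with p = R/N
p = Fraction(R, N)
mean_formula = n * p
var_formula = n * p * (1 - p) * Fraction(N - n, N - 1)

print(mean, var)
```

The factor (N - n)/(N - 1) is what distinguishes the variance from the binomial variance np(1 - p); it approaches 1 when the population is large compared to the sample.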
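For a continuous r.v. the sums become integrals. As a rough numerical check of the exponential row of the table and of the memoryless property, the sketch below integrates the density with a simple midpoint rule. The rate λ = 2, the truncation point 40 and the helper `integrate` are assumptions made for illustration only.

```python
from math import exp

def integrate(h, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of h over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(h(lo + (i + 0.5) * width) for i in range(steps))

lam = 2.0                                  # assumed rate parameter
f = lambda x: lam * exp(-lam * x)          # exponential density for x >= 0
upper = 40.0                               # truncation point; the tail beyond it is negligible

total = integrate(f, 0.0, upper)                       # should be close to 1
mean = integrate(lambda x: x * f(x), 0.0, upper)       # should be close to 1/lam
second = integrate(lambda x: x * x * f(x), 0.0, upper) # E(X^2)
var = second - mean**2                                 # should be close to 1/lam^2

def survival(x):
    """P(X > x) = exp(-lam * x) for x >= 0."""
    return exp(-lam * x)

x, y = 0.7, 1.3
conditional = survival(x + y) / survival(x)            # P(X > x + y | X > x)

print(total, mean, var, conditional, survival(y))
```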
