LECTURE 5: Discrete random variables: probability mass functions and expectations

• Readings: Sections 2.1-2.3, start 2.4

Lecture outline

• Random variables: the idea and the definition
  – Discrete: take values in a finite or countable set
• Probability mass function (PMF)
• Random variable examples
  – Bernoulli
  – Uniform
  – Binomial
  – Geometric
• Expectation (mean, average) and its properties
  – The expected value rule
  – Linearity

Random variables: the idea and the formalism

• A random variable (“r.v.”) associates a value (a number) to every possible outcome
• Mathematically: a function from the sample space Ω to the real numbers
• It can take discrete or continuous values
• Notation: random variable X; numerical value x
• We can have several random variables defined on the same sample space
• A function of one or several random variables is also a random variable
  – meaning of X + Y
  – meaning of X ≥ 0
Probability mass function (PMF) of a discrete r.v. X

• It is the “probability law” or “probability distribution” of X
• If we fix some x, then “X = x” is an event
• Notation:
  pX(x) = P(X = x) = P({ω ∈ Ω s.t. X(ω) = x})
• Properties: pX(x) ≥ 0,  Σx pX(x) = 1

• Example: X = number of coin tosses until first head
  – assume independent tosses, P(H) = p > 0

  pX(k) = P(X = k) = P(T T ··· T H) = (1 − p)^(k−1) p,  k = 1, 2, ...

  – geometric PMF
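The two PMF properties can be checked numerically for the geometric example. A minimal Python sketch (the function name and the value p = 0.3 are ours, chosen for illustration):

```python
# Geometric PMF from the slide: p_X(k) = (1 - p)**(k - 1) * p, k = 1, 2, ...
def geometric_pmf(k, p):
    return (1 - p) ** (k - 1) * p

p = 0.3  # an illustrative value of P(H), not from the slides

# Nonnegativity holds for every k, and the values sum to 1;
# truncating the infinite sum at k = 200 already captures essentially all the mass.
values = [geometric_pmf(k, p) for k in range(1, 201)]
assert all(v >= 0 for v in values)
print(round(sum(values), 6))  # -> 1.0
```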
PMF calculation

• To find pX(x):
  – collect all possible outcomes for which X is equal to x
  – add their probabilities
  – repeat for all x

• Example: Two independent rolls of a fair tetrahedral die
  F: first roll
  S: second roll
  X = min(F, S)
  pX(2) = ?

  [grid of the 16 equally likely outcomes, F = first roll vs. S = second roll]

• Example: Two independent rolls of a fair tetrahedral die; every possible outcome has probability 1/16
  X = first roll, Y = second roll, Z = X + Y
  Find pZ(z): for each z, collect all outcomes for which Z is equal to z, and add their probabilities
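The “collect outcomes, add probabilities” recipe is easy to mechanize. A sketch for the min(F, S) example above, enumerating all 16 equally likely outcomes (the computed value of pX(2) is our own calculation, left blank on the slide):

```python
from fractions import Fraction
from itertools import product

# Two independent rolls of a fair tetrahedral die: each of the 16
# outcomes (f, s) has probability 1/16.  X = min(F, S).
pmf = {}
for f, s in product(range(1, 5), repeat=2):
    x = min(f, s)
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 16)

print(pmf[2])  # 5/16: the outcomes (2,2), (2,3), (2,4), (3,2), (4,2)
assert sum(pmf.values()) == 1   # the probabilities of a PMF sum to 1
```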
The simplest random variable: Bernoulli with parameter p ∈ [0, 1]

  X = 1, w.p. p
      0, w.p. 1 − p

• Models a trial that results in success/failure, Heads/Tails, etc.
• Indicator r.v. of an event A: IA = 1 iff A occurs

Discrete uniform random variable; parameters a, b

• Parameters: integers a, b; a ≤ b
• Experiment: Pick one of a, a + 1, ..., b at random; all equally likely
• Sample space: {a, a + 1, ..., b}
• Random variable X: X(ω) = ω
• Model of: complete ignorance
• PMF: pX(x) = 1/(b − a + 1), for x = a, a + 1, ..., b
• Special case: a = b, a constant/deterministic r.v.
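A discrete uniform PMF is a dictionary with one entry per value in {a, ..., b}. A small sketch (the parameter values a = 2, b = 6 are ours, not from the slides):

```python
from fractions import Fraction

# Discrete uniform on {a, ..., b}: every value has probability 1/(b - a + 1).
a, b = 2, 6  # illustrative parameters
pmf = {x: Fraction(1, b - a + 1) for x in range(a, b + 1)}

assert sum(pmf.values()) == 1       # a valid PMF
assert pmf[a] == Fraction(1, 5)     # here b - a + 1 = 5
```

The special case a = b gives a single value with probability 1, i.e. a deterministic r.v.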
Binomial random variable; parameters: positive integer n; p ∈ [0, 1]

• Experiment: n independent tosses of a coin with P(Heads) = p
• Sample space: set of sequences of H and T, of length n
• Random variable X: number of Heads observed
• Model of: number of successes in a given number of independent trials
• Model based on conditional probabilities: biased coin, P(H) = p, P(T) = 1 − p

  pX(k) = (n choose k) p^k (1 − p)^(n−k),  for k = 0, 1, ..., n

[sequential tree diagram for n = 3 tosses, branch probabilities p and 1 − p, leaves HHH, HHT, HTH, HTT, THH, THT, TTH, TTT]

[plots of the binomial PMF for n = 3, p = 0.2; n = 3, p = 0.5; n = 10, p = 0.5; n = 100, p = 0.5; n = 100, p = 0.1]
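The binomial PMF is straightforward to evaluate with `math.comb`. A sketch using the first parameter pair from the plots (n = 3, p = 0.2):

```python
from math import comb

# Binomial PMF: p_X(k) = C(n, k) * p**k * (1 - p)**(n - k), k = 0, ..., n
def binomial_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 3, 0.2  # one of the parameter pairs plotted on the slide
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
print([round(q, 4) for q in probs])  # -> [0.512, 0.384, 0.096, 0.008]
assert abs(sum(probs) - 1.0) < 1e-12  # the PMF sums to 1
```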
Geometric random variable; parameter p: 0 < p ≤ 1

• Experiment: infinitely many independent tosses of a coin; P(Heads) = p
• Sample space: set of infinite sequences of H and T
• Random variable X: number of tosses until the first Heads
• Model of: waiting times; number of trials until a success

  pX(k) = P(X = k) = (1 − p)^(k−1) p,  k = 1, 2, ...

  P(no Heads ever) = 0

[plot of the geometric PMF for p = 1/3]
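The claim P(no Heads ever) = 0 follows from P(X > n) = (1 − p)^n, which vanishes as n grows. A quick numerical check with p = 1/3, the value used in the slide's plot:

```python
# Geometric r.v.: number of tosses until the first Heads.
# P(X > n) = (1 - p)**n -> 0 as n -> infinity, so P(no Heads ever) = 0 when p > 0.
p = 1 / 3  # the value used in the slide's PMF plot
for n in (10, 100, 1000):
    print(n, (1 - p) ** n)   # the tail probability shrinks geometrically
```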
Expectation/mean of a random variable

• Motivation: Play a game 1000 times. Random gain at each play described by:

  X = 1, w.p. 1/6
      2, w.p. 1/2
      4, w.p. 1/3

• “Average” gain: roughly (1000 · (1/6) · 1 + 1000 · (1/2) · 2 + 1000 · (1/3) · 4) / 1000
• Definition:

  E[X] = Σx x pX(x)

• Interpretations:
  – Center of gravity of PMF
  – Average in large number of independent repetitions of the experiment
    (to be substantiated later in this course)
• Caution: If we have an infinite sum, it needs to be well-defined.
  We assume Σx |x| pX(x) < ∞

Expectation of a Bernoulli r.v.

  X = 1, w.p. p
      0, w.p. 1 − p

• If X is the indicator of an event A, X = IA:

Expectation of a uniform r.v.

• Uniform on 0, 1, ..., n

  pX(x) = 1/(n + 1), for x = 0, 1, ..., n

  E[X] = 0 × 1/(n + 1) + 1 × 1/(n + 1) + ··· + n × 1/(n + 1) = n/2
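The definition E[X] = Σx x pX(x) is a one-line sum in code. A sketch for the game example above, plus a check of the uniform result E[X] = n/2 (the choice n = 4 is ours):

```python
from fractions import Fraction

# E[X] = sum_x x * p_X(x) for the game example on the slide.
pmf = {1: Fraction(1, 6), 2: Fraction(1, 2), 4: Fraction(1, 3)}
E = sum(x * px for x, px in pmf.items())
print(E)  # -> 5/2

# Uniform on 0, 1, ..., n: E[X] = n/2 (checked here for n = 4).
n = 4
E_unif = sum(Fraction(x, n + 1) for x in range(n + 1))
assert E_unif == Fraction(n, 2)
```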
Expectation as a population average

• n students
• Weight of ith student: xi
• Experiment: pick a student at random, all equally likely
• Random variable X: weight of selected student
  – assume the xi are distinct

  pX(xi) = 1/n

  E[X] = (x1 + x2 + ··· + xn)/n
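With equally likely students, the expectation is just the ordinary average of the weights. A sketch with illustrative numbers (not from the slide):

```python
from fractions import Fraction

# Picking one of n students uniformly at random: p_X(x_i) = 1/n,
# so E[X] reduces to the ordinary average of the weights.
weights = [55, 60, 62, 70, 83]  # illustrative, distinct values
n = len(weights)
E = sum(Fraction(x, n) for x in weights)

assert E == Fraction(sum(weights), n)
print(E)  # -> 66
```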
and let Y Y= == g(X) E[Y ] = ypY (y) � � � � y – Hard: E [Y ] = ypyp (y) Hard:E[Y E ] = yp (y) – ]] = Y –– Hard: Hard: [Y[Y = ypYY (y) (y) E[Y ] = � yyy y Y � � � � g(x)pX (x) x– – Easy: [Y[Y ]= (x) Easy:EE ] = g(x)p g(x)p (x) XX – Easy: E[Y ] = x xg(x)pX (x) X x In general, E[g(X)] �= g(Ex[X]) • • Caution: E[X]) Caution:InIngeneral, general,E[g(X)] E[g(X)]�=�=g(g( E[X]) • “Average” gain: X= • Definition: Elementary properties of expectations Elementary properties of expectations • Definition: 0, then E[X] ≥ 0 • If X ≥ 0, then E[X] ≥ 0 1, 2, 4, E[X] = w.p. w.p. w.p. X � x 1/6 1/2 1, 1/3 2, = 4, xpX (x) E[X] = w.p. w.p. w.p. � xp x {a, b}, then a ≤ E[X] ≤ b • If X ∈ {a, b}, then a ≤ E[X] ≤ b • Interpretations: Elementary properties of expectations a constant, E[c] = c – Center of gravity of PMF • If c is a constant, E[c] = c • If X ≥ 0, then E[X] ≥ 0 – Average in large number of repetitions of the experi (to be substantiated later in this course) • If a ≤ X ≤ b, then a ≤ E[X] ≤ b Elementary properties of expectations • If c is a constant, E[c] = c • If X ≥ 0, then E[X] ≥ 0 • If a ≤ X ≤ b, then a ≤ E[X] ≤ b • If c is a constant, E[c] = c • Example: Uniform on 0, 1, . . . , n The expected for calculating E[g( The expected value rule, for calculating E[g(X)] •value If werule, fix some x, x • Notation: • then If we“X fix = som Properties of expectations • Notation: Pr • It is the “probability law” • It is X thebe Let The expected value rule, for calculating pXrule, (x) • =for P (X = x)pX =( • Notation: The Properties expected value calculatin Notation: of expectations • Let X be a r.v. and let Y = g(X)of expectations • If we fix some x, then “X • Let X be a r.v. and let Properties If we fix The expected rule,•for for calc – Hard: E The expected value valuep rule, calcula (x) = P (X = x � X • Properties: p (x) ≥ 0 � of •Xexpectations Properties: pX • Let X be a r.v. 
• Example: let X take the values 2, 3, 4, 5 with probabilities 0.1, 0.2, 0.3, 0.4,
  and let Y = g(X)

  – Averaging over y: E[Y] = Σ_y y pY(y)

  – Averaging over x: E[Y] = Σ_x g(x) pX(x)

  Both averages give the same number; the second avoids having to find pY first.

• Example: X = number of independent coin tosses until first head
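The two ways of averaging can be compared directly. A sketch using the PMF from this example and a hypothetical g(x) = (x − 4)² (chosen so that g is not one-to-one, which is exactly when deriving pY takes real work):

```python
from fractions import Fraction
from collections import defaultdict

# PMF from the slide's example: x = 2, 3, 4, 5 with probs 0.1, 0.2, 0.3, 0.4
pX = {2: Fraction(1, 10), 3: Fraction(2, 10), 4: Fraction(3, 10), 5: Fraction(4, 10)}
g = lambda x: (x - 4) ** 2        # hypothetical g; the slide leaves g generic

# Hard way: first derive p_Y (merging x's that map to the same y), then average over y
pY = defaultdict(Fraction)
for x, p in pX.items():
    pY[g(x)] += p
EY_hard = sum(y * p for y, p in pY.items())

# Easy way (expected value rule): average g(x) directly under p_X
EY_easy = sum(g(x) * p for x, p in pX.items())

print(EY_hard, EY_easy)   # both equal 1
```

The merging step in the "hard" way is what the expected value rule lets you skip.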
  – assume independent tosses, P(H) = p > 0

  – geometric PMF:

    pX(k) = P(X = k) = P(T T · · · T H) = (1 − p)^(k−1) p,  k = 1, 2, . . .

    (the first k − 1 tosses are tails, the kth toss is a head)

Properties: If α, β are constants, then:

• E[α] = α

• E[αX] = α E[X]

• E[αX + β] = α E[X] + β
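A quick numerical sanity check of the geometric PMF (with an assumed value p = 0.25, truncating the infinite sum far enough out that the tail is negligible) confirms the probabilities sum to 1, and suggests the mean 1/p that will be derived later in the course:

```python
# Geometric PMF: p_X(k) = (1 - p)^(k - 1) * p, for k = 1, 2, ...
p = 0.25
pmf = lambda k: (1 - p) ** (k - 1) * p

# Truncate the infinite sums; the tail beyond k = 500 is astronomically small
total = sum(pmf(k) for k in range(1, 500))
mean = sum(k * pmf(k) for k in range(1, 500))

print(total, mean)   # ≈ 1.0 and ≈ 4.0 (= 1/p)
```
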
Linearity of expectation: E[aX + b] = a E[X] + b

• Intuitive

• Derivation, based on the expected value rule, with g(x) = ax + b:

  E[aX + b] = Σ_x (ax + b) pX(x) = a Σ_x x pX(x) + b Σ_x pX(x) = a E[X] + b

• The same rule with g(x) = x² gives the second moment: E[X²] = Σ_x x² pX(x);
  note that in general E[X²] ≠ (E[X])²
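The derivation above can be mirrored numerically: compute E[aX + b] term by term via the expected value rule and compare it with a E[X] + b. A sketch reusing the game PMF from earlier (a and b are arbitrary illustrative constants):

```python
from fractions import Fraction

# Reuse X = 1, 2, 4 w.p. 1/6, 1/2, 1/3; any PMF would do
pX = {1: Fraction(1, 6), 2: Fraction(1, 2), 4: Fraction(1, 3)}
a, b = 3, 7

EX = sum(x * p for x, p in pX.items())
# Left side, via the expected value rule with g(x) = a*x + b:
E_aXb = sum((a * x + b) * p for x, p in pX.items())

print(E_aXb, a * EX + b)   # both 29/2
```

Since E[X] = 5/2, both sides come out to 3 · 5/2 + 7 = 29/2, matching linearity exactly.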