Chapter 5 – Statistical Review
• Chapter 5 is a brief review of statistical concepts. It is NOT a replacement for a statistics course.
• By the end of the chapter, you will be able to:
1) Identify population and sample data and perform population and sample statistical calculations.
2) Define, interpret and evaluate statistics.
3) Demonstrate the use of statistical tables.
4) Construct confidence intervals and hypothesis tests from sample data.
5) Begin to calculate OLS estimations.

5.1 Simple Economic Models and Random Components
• Consider the linear economic model: Yi = β1 + β2Xi + єi
• The variable Y is related to another variable X
- e.g. utility is related to hours of TV watched
• єi (epsilon) represents error: everything included in Y that is not explained by X
- e.g. quality of the TV show, quality of the popcorn, other facts of life

5.1 Observed or Random Components
• єi is the RANDOM ERROR TERM; it takes on values according to chance
• Since Yi depends on єi, it is also random
• β1 + β2Xi is assumed to be fixed in most simple models (which simplifies everything)
- Referred to as the deterministic part of the model
- X, β1 and β2 are non-random
• β1 and β2 are unknown, and must be estimated

5.1 Example
• Consider the function: Utilityi = β1 + β2Sistersi + єi
• Happiness depends on the number of sisters
• єi captures: number of brothers, income, and other factors (e.g. bad data collection and shocks)
• Utility and Sisters are observable
• Utility and єi are random
• β1 and β2 must be estimated (< or > 0?)

5.2 Random Variables and Probabilities
• Random Variable – a variable whose value is determined by the outcome of a chance experiment
- e.g. the sum of a dice roll, a card taken out of a deck, the performance of a stock, oil discovered in a province, the gender of a new baby
• Some outcomes can be more likely than others (e.g. a greater chance of discovering oil in Alberta; with two dice, a roll of 8 is more likely than a roll of 5)

5.2 Random Variables
• Discrete Variable – can take on a finite # of values
- e.g. a dice roll, a card picked
• Continuous Variable – can take on any value within a range
- e.g. height, weight, time
• Variables are often assumed discrete to aid in calculations and economic assumptions (e.g. money in increments of 1 cent)

5.2 Probability Terminology
Experiment – any action that creates at least one possible outcome
- e.g. taking a test, speeding, flipping a coin
Trial – one occurrence of an experiment
- e.g. taking the Econ 299 midterm, speeding tomorrow morning, flipping a coin once
Outcome – any result of an experiment
- e.g. passing the test, getting a speeding ticket, getting a "tails"

5.2 Probability Terminology
• Probabilities are assigned to the various outcomes
Sample Space – the set of all possible outcomes from a random experiment
- e.g. S = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} (the sum of two dice)
- e.g. E = {Pass exam, Fail exam, Fail horribly}
Event – a subset of the sample space (see the counting sketch below)
- e.g. B = {3, 6, 9, 12} ⊆ S
- e.g. F = {Fail exam, Fail horribly} ⊆ E
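To make the counting concrete, here is a minimal Python sketch (an added illustration, not part of the original slides) that builds the two-dice sample space and counts the outcomes in the event B = {3, 6, 9, 12}:

```python
from itertools import product
from fractions import Fraction

# Sample space for the sum of two six-sided dice: all 36 ordered rolls.
rolls = [d1 + d2 for d1, d2 in product(range(1, 7), repeat=2)]

# Event B = {3, 6, 9, 12}, a subset of the sample space S = {2, ..., 12}.
B = {3, 6, 9, 12}
p_B = Fraction(sum(1 for s in rolls if s in B), len(rolls))
print(p_B)  # 12/36 reduces to 1/3
```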
5.2 Quiz Example
Students do a 4-question quiz with each question worth 2 marks (no part marks). Handing in the quiz is worth 2 marks, and there is a 1-mark bonus question.
Events:
- getting a zero (not handing in the quiz)
- getting at least 40% (handing it in with at least 1 right)
- getting 100% or more (all right, or all right plus the bonus question)
- getting 110% (all right plus the bonus question)
*Events contain one or more possible outcomes

5.2 Probability
Probability = the likelihood of an event occurring (between 0 and 1)
P(a) = Prob(a) = probability that event a will occur
P(Y=y) = probability that the random variable Y will take on value y
P(ylow < Y < yhigh) = probability that the random variable Y takes on any value between ylow and yhigh

5.2 Probability Examples
P(true love) = probability that you will find true love
P(Sleep = 8 hours) = probability that the random variable Sleep will take on the value 8 hours
P($80 < Wedding Gift < $140) = probability that the random variable Wedding Gift takes on any value between $80 and $140

5.2 Probability Calculations
IF probability can be calculated, two methods are:
1) P(a) = (# of outcomes including a) / (# of total outcomes)
2) P(a) = (# of trials resulting in a) / (# of trials)
- Method 1 requires full information about equally likely outcomes (a die roll)
- Method 2 estimates a probability from repeated trials (e.g. asking names at a party)

5.2 Probability of Small World's Die
In the board game Small World, the 6-sided die has a blank (0) on 3 sides, and a 1, 2, and 3 on the remaining sides:
P(0) = 3/6; P(1) = 1/6; P(2) = 1/6; P(3) = 1/6

5.2 Probability Extremes
If Prob(a) = 0, the event will never occur
- e.g. Canada moves to Europe; the price of cars drops below zero; your instructor turns into a giant llama
If Prob(b) = 1, the event will always occur
- e.g. you will get a mark on your final exam; you will either marry your true love or not; the sun will rise tomorrow

5.2 Probability Terminology
Mutually Exclusive Events – cannot occur at the same time
- e.g. rolling both a 3 and an 11; being both dead and alive
Exhaustive Events – cover all possible outcomes
- e.g. the sum of two dice must lie within S = {2, …, 12}; a person is either married or not married
Complement – everything other than the event
- e.g. being dead is the complement to being alive; rolling an 11 or 12 is the complement to rolling a 2, 3, 4, 5, 6, 7, 8, 9, or 10

5.2 Probability Rules
1) Probability Range
P(a) must be greater than or equal to 0 and less than or equal to 1: 0 ≤ P(a) ≤ 1
2) Something Must Happen
Σ P(a) = 1

5.2 Probability Rules
3) Exhaustive Rule
If any set of events (e.g. {A, B, C}) is exhaustive, then P(A or B or C) = 1
ex) Prob(winning, losing or tying) = 1
4a) Addition Rule (Mutually Exclusive)
If any set of events (e.g. {A, B, C}) is mutually exclusive, then P(A or B or C) = P(A) + P(B) + P(C)
ex) the probability of marrying the person to your right or the person to your left

5.2 Probability Rules
4b) Addition Rule (General)
P(A or B) = P(A) + P(B) − P(A and B)
Example:
P(marrying a fellow student OR someone named Joe)
= P(marry a fellow student) + P(marry Joe) − P(marry a fellow student named Joe)

5.2 Probability Rules
5) Complement Rule
P(not A) = 1 − P(A)
Example: P(not eating ice cream) = 1 − P(eating ice cream)
6) Multiplication Rule (A and B are independent/unrelated)
P(A and B) = P(A)P(B)
Example: P(owning a cat AND having black hair) = P(owning a cat)P(having black hair)
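To see rule 4b in action, the general addition rule can be checked by brute-force counting over the two-dice sample space. This is an added sketch, not course material, and the events A and B below are arbitrary choices:

```python
from fractions import Fraction
from itertools import product

rolls = [(d1, d2) for d1, d2 in product(range(1, 7), repeat=2)]

def prob(event):
    """P(event) by counting equally likely outcomes (Method 1 above)."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

A = lambda r: sum(r) >= 10   # the sum is 10, 11, or 12
B = lambda r: r[0] == 6      # the first die shows a 6

lhs = prob(lambda r: A(r) or B(r))
rhs = prob(A) + prob(B) - prob(lambda r: A(r) and B(r))
print(lhs, rhs, lhs == rhs)  # 1/4 1/4 True
```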
5.2 Probability of Small World
I have the last action in Small World. Colin has 81 points, and I have 72, plus 8 guaranteed points this turn. (If our points are tied, I win.) I have 3 final attacks with the die. I lose each attack on a roll of zero; otherwise I win. Each attack I win gets me one point. What is the probability that I will win and Colin will lose?
Recall: P(0) = 3/6; P(1) = 1/6; P(2) = 1/6; P(3) = 1/6
I lose overall only if I lose all three attacks, each with probability P(0) = 3/6 = 0.5. By the complement and multiplication rules:
P(Win) = 1 − P(Lose)
P(Win) = 1 − P(Lose 1st)P(Lose 2nd)P(Lose 3rd)
P(Win) = 1 − (0.5)(0.5)(0.5)
P(Win) = 1 − 0.125
P(Win) = 0.875
There is an 87.5% chance that I will win.

5.2 Probability Examples
1) P(coin flip = heads) = 1/2
2) P(2 coin flips = 2 heads) = 1/4
3) Probability of tossing 6 heads in a row = 1/64
4) Probability of rolling less than 4 with 1 six-sided die = 3/6
5) Probability of throwing a 13 with 2 dice = 0
6) Probability of winning rock, paper, scissors = 1/3 (or 3/9)
7) Probability of being in love or not in love = 1
8) Probability of passing the course = ?

5.2.1 Probability Density Functions
• The probability density function (pdf) summarizes the probabilities associated with possible outcomes
Discrete Random Variables – pdf
f(y) = Prob(Y = y)
Σ f(y) = 1
- (the sum of the probabilities of all possible outcomes is one)

5.2.1 Dice Example
• The probabilities of rolling each possible sum of two six-sided dice
• Each number comes from different die combinations: 7 = {1+6, 2+5, 3+4, 4+3, 5+2, 6+1}
y:    2     3     4     5     6     7     8     9     10    11    12
f(y): 1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
Exercise: construct the same table for one 4-sided and one 8-sided die.

5.2.1 Probability Density Functions
Continuous Random Variables – pdf
f(y) = pdf for a continuous random variable Y
∫ f(y)dy = 1
(the integral of the pdf over all possibilities is one)
- probabilities are measured as areas under the pdf, which must be non-negative
- technically, the probability of any ONE exact value is zero

5.2.1 Continuous Headache
Continuous Random Variables – pdf
f(y) = 0.2 for 2 < y < 7
f(y) = 0 for y < 2 or y > 7
P(3 < Y < 7) = ∫ from 3 to 7 of 0.2 dy = [0.2y] from 3 to 7 = 0.2(7) − 0.2(3) = 0.8
Continuous probabilities are the area under the pdf curve.

5.3 Expected Values
Expected Value – a measure of central tendency; the center of the distribution; the population mean
- If the variable were collected an infinite number of times, what average/mean would we expect?
Discrete Variable: μY = E(Y) = Σ y f(y)
Continuous Variable: μY = E(Y) = ∫ y f(y)dy

5.3 Expected Example
What is the expected value of the sum of two dice?
E(Y) = Σ y f(y)
= 2(1/36) + 3(2/36) + … + 11(2/36) + 12(1/36)
= 7
Exercise: What is the expected value of rolling a 4-sided and an 8-sided die? A 6-sided and a 10-sided die?

5.3 Expected Application – Pascal's Wager
Pascal's Wager, from philosopher, mathematician, and physicist Blaise Pascal (1623–62), argued that belief in God could be justified through expected value:
- If you live as if God exists, you get huge rewards if you are right, and wasted some time and effort if you're wrong
- If you live as if God does not exist, you save some time and effort if you're right, and suffer huge penalties if you're wrong

5.3 Expected Application – Pascal's Wager
Mathematically (with illustrative numbers):
E(belief) = Σ utility × f(utility)
= (utility if God exists)P(God exists) + (utility if no God)P(no God)
= 1,000,000(0.01) + 50(0.99)
= 10,000 + 49.5
= 10,049.5

5.3 Expected Application – Pascal's Wager
E(no belief) = Σ utility × f(utility)
= (utility if God exists)P(God exists) + (utility if no God)P(no God)
= −1,000,000(0.01) + 150(0.99)
= −10,000 + 148.5
= −9,851.5
Since −9,851.5 is less than 10,049.5, Pascal argued that belief in God is rational.
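The dice exercises in 5.3 can be checked by enumeration. A small Python sketch, added for illustration (the helper name expected_sum is ours, not the course's):

```python
from fractions import Fraction
from itertools import product

def expected_sum(*dice):
    """E(sum) for fair dice with the given numbers of sides."""
    outcomes = list(product(*(range(1, s + 1) for s in dice)))
    return Fraction(sum(sum(o) for o in outcomes), len(outcomes))

print(expected_sum(6, 6))    # 7 (two six-sided dice)
print(expected_sum(4, 8))    # 7 (E = 2.5 + 4.5)
print(expected_sum(6, 10))   # 9 (E = 3.5 + 5.5)
```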
5.3.1 Properties of Expected Values
a) Constant Property
E(a) = a if a is a constant or non-random variable
- e.g. E(14) = 14
- e.g. E(β1 + β2Xi) = β1 + β2Xi
b) Constants and Random Variables
E(a + bW) = a + bE(W)
if a and b are non-random and W is random

5.3.1 Properties of Expected Values
Applications:
If E(єi) = 0, then
E(Yi) = E(β1 + β2Xi + єi) = β1 + β2Xi + E(єi) = β1 + β2Xi
E(6-sided + 10-sided) = E(6-sided) + E(10-sided) = 3.5 + 5.5 = 9

5.3.1 Properties of Expected Values
c) "Not So Fast" Property
E(WV) ≠ E(W)E(V) (in general)
E(W/V) ≠ E(W)/E(V)
d) Non-Linear Functions
E(W^k) = Σ w^k f(w)
e.g. for Y = the sum of two six-sided dice:
E(Y²) = 2²(1/36) + 3²(2/36) + … + 11²(2/36) + 12²(1/36) ≈ 54.83

5.4 Variance
Consider the following 3 midterm distributions:
1) Average = 70%; everyone in the class received 70%
2) Average = 70%; half the class received 50% and half received 90%
3) Average = 70%; most of the class was in the 70's, with a few 100's and a few 40's (the future Bachelors of Pottery)

5.4 Variance
Although these midterm results share the same average, their distributions differ greatly. While the first set of results is clustered together, the other two are quite dispersed.
Variance – a measure of dispersion (how far a distribution is spread out) for a random variable

5.4 Variance Formula
σY² = Var(Y) = E[(Y − E(Y))²] = E(Y²) − [E(Y)]²
Discrete Random Variable: σY² = Var(Y) = Σ (y − E(Y))² f(y)
Continuous Random Variable: σY² = Var(Y) = ∫ (y − E(Y))² f(y)dy

5.4 Variances
Example 1: E(Y) = 70; Yi = 70 for all i
Var(Y) = Σ (y − E(Y))² f(y) = (70 − 70)²(1) = 0
If all outcomes are the same, there is no variance.

Example 2: E(Y) = 70; f(50) = 0.5, f(90) = 0.5
Var(Y) = (50 − 70)²(0.5) + (90 − 70)²(0.5) = 200 + 200 = 400

Example 3: E(Y) = 70; f(40) = 1/5, f(70) = 3/5, f(100) = 1/5
Var(Y) = (40 − 70)²(1/5) + (70 − 70)²(3/5) + (100 − 70)²(1/5) = 180 + 0 + 180 = 360

5.4 Standard Deviation
While variance is a good tool for measuring dispersion, it is difficult to represent graphically (e.g. on a bell curve). Standard deviation is more useful for a visual view of dispersion.
Standard Deviation = Variance^(1/2)
sd(W) = [Var(W)]^(1/2); σ = (σ²)^(1/2)

5.4 SD Examples
In our first example, σ = 0^(1/2) = 0; no dispersion exists
In our second example, σ = 400^(1/2) = 20
In our third example, σ = 360^(1/2) ≈ 19.0
Results were most dispersed in the second example.

5.4.1 Properties of Variance
a) Constant Property
Var(a) = 0 if a is a constant or non-random variable
- e.g. Var(14) = 0
- e.g. Var(β1 + β2Xi) = 0
b) Constants and Random Variables
Var(a + bW) = b²Var(W)
if a and b are non-random and W is random

5.4.1 Properties of Variance
Applications:
If Var(єi) = k, then
Var(Yi) = Var(β1 + β2Xi + єi) = 0 + Var(єi) = k
Exercise: calculate the variance of:
a) a coin flip
b) a 4-sided die roll
c) both a and b, where the coin flip represents 0 or 1

5.4.1 Properties of Variance
c) Covariance Property
If W and V are random variables, and a, b, and c are non-random, then
Var(a + bW + cV) = b²Var(W) + c²Var(V) + 2bcCov(W,V)
where covariance will be examined in 5.6

5.4.1 Properties of Variance
Application:
If Var(cost of gas) = 10 cents, Var(cost of a Slurpee) = 5 cents, and Cov(cost of gas, cost of a Slurpee) = −1 cent, then
Var(cost of gas + cost of 2 Slurpees)
= b²Var(G) + c²Var(Sl) + 2bcCov(G, Sl)
= 1²(10) + 2²(5) + 2(1)(2)(−1)
= 10 + 20 − 4
= 26 cents
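A quick way to check the three midterm variances above is to code the discrete formula Var(Y) = Σ (y − E(Y))² f(y) directly. An illustrative sketch (not from the slides), with each pdf represented as a dict:

```python
def var(pdf):
    """Var(Y) = sum of (y - E(Y))^2 * f(y) for a discrete pdf {y: f(y)}."""
    mean = sum(y * f for y, f in pdf.items())
    return sum((y - mean) ** 2 * f for y, f in pdf.items())

print(var({70: 1.0}))                       # 0.0   (example 1)
print(var({50: 0.5, 90: 0.5}))              # 400.0 (example 2)
print(var({40: 0.2, 70: 0.6, 100: 0.2}))    # 360.0 (example 3)
```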
5.5 Joint Probability Density Functions
Sometimes we are interested in the isolated occurrence or effects of one variable. In that case, a simple pdf is appropriate. Often, however, we are interested in more than one variable or effect. In this case it is useful to use:
- Joint Probability Density Functions
- Conditional Probability Density Functions

5.5 Joint Probability Density Functions
Joint Probability Density Function – summarizes the probabilities associated with the outcomes of pairs of random variables
f(w,z) = Prob(W = w and Z = z)
ΣΣ f(w,z) = 1
Similar statements are valid for continuous random variables.

5.5 Joint PDF and You
Love and Econ Example: On Valentine's Day, Jonny both wrote an Econ 299 midterm and sent a dozen roses to his love interest. He can either pass or fail the midterm, and his beloved can either embrace or spurn him.
E = {Pass, Fail}; L = {Embrace, Spurn}
The joint pdf is:
P(pass and embrace) = 0.32
P(pass and spurn) = 0.08
P(fail and embrace) = 0.48
P(fail and spurn) = 0.12
(Notice that Σ f(E,L) = 0.32 + 0.08 + 0.48 + 0.12 = 1)

5.5 Joint and Marginal pdf's
Marginal (individual) pdf's can be determined from joint pdf's. Simply add all of the joint probabilities containing the desired outcome of one of the variables.
e.g. f(Y=7) = Σ f(Y=7, Z=zi)
The probability that Y = 7 is the sum of ALL joint probabilities where Y = 7.

5.5 Love and Economics
f(pass) = f(pass and embrace) + f(pass and spurn) = 0.32 + 0.08 = 0.40
f(fail) = f(fail and embrace) + f(fail and spurn) = 0.48 + 0.12 = 0.60
Exercise: find f(embrace) and f(spurn)
Notice: since passing and failing are exhaustive outcomes, Prob(pass or fail) = 1. Also, since they are mutually exclusive,
Prob(pass or fail) = Prob(pass) + Prob(fail) = 0.4 + 0.6 = 1

5.5 Conditional Probability Density Functions
Conditional Probability Density Function – summarizes the probabilities associated with the possible outcomes of one random variable conditional on the occurrence of a specific value of another random variable
Conditional pdf = joint pdf / marginal pdf
or Prob(a|b) = Prob(a and b) / Prob(b)
(the probability of "a" GIVEN "b")

5.5 Conditional Love and Economics
From our previous example:
Prob(pass|embrace) = Prob(pass and embrace)/Prob(embrace) = 0.32/0.80 = 0.4
Prob(fail|embrace) = Prob(fail and embrace)/Prob(embrace) = 0.48/0.80 = 0.6
Prob(spurn|pass) = Prob(pass and spurn)/Prob(pass) = 0.08/0.40 = 0.2
Prob(spurn|fail) = Prob(fail and spurn)/Prob(fail) = 0.12/0.60 = 0.2
Exercise: calculate the other conditional pdf's:
Prob(pass|spurn), Prob(fail|spurn), Prob(embrace|pass), Prob(embrace|fail)

5.5 Statistical Independence
If two random variables (W and V) are statistically independent (one's outcome doesn't affect the other at all), then
f(w,v) = f(w)f(v)
and:
1) f(w) = f(w|any v)
2) f(v) = f(v|any w)
as seen in the Love and Economics example.
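The Love and Economics numbers can be reproduced mechanically from the joint pdf: marginals are sums over the other variable, conditionals are joint/marginal, and independence means the joint factors into the marginals. A sketch added for illustration:

```python
# Joint pdf from the Love and Econ example: f(E, L).
joint = {("pass", "embrace"): 0.32, ("pass", "spurn"): 0.08,
         ("fail", "embrace"): 0.48, ("fail", "spurn"): 0.12}

# Marginal pdf: sum the joint probabilities over the other variable.
f_pass = sum(p for (e, l), p in joint.items() if e == "pass")        # 0.40
f_embrace = sum(p for (e, l), p in joint.items() if l == "embrace")  # 0.80

# Conditional pdf: Prob(pass | embrace) = Prob(pass and embrace) / Prob(embrace).
p_pass_given_embrace = joint[("pass", "embrace")] / f_embrace        # 0.40

# Independence check: f(pass, embrace) equals f(pass) * f(embrace).
print(abs(joint[("pass", "embrace")] - f_pass * f_embrace) < 1e-12)  # True
```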
Joint pdf’s are as follows: Prob(Thrones and Happy) = 0.7 Prob(Thrones and Sad)=0.05 Prob(Yodeling and Happy)=0.10 Prob(Yodeling and Sad)=0.15 59 5.5 Statistically Dependent Example Calculate Marginal pdf’s: Prob(H)=Prob(T and H) + Prob(Y and H) =0.7+0.10 =0.8 Prob(S)=Prob(T and S) + Prob(Y and S) =0.05+0.15 =0.2 Prob(H)+Prob(S)=1 60 5.5 Statistically Dependent Example Calculate Marginal pdf’s: Prob(T)=Prob(T and H) + Prob(T and S) =0.7+0.05 =0.75 Prob(Y) =Prob(Y and H) + Prob(Y and S) =0.10+0.15 =0.25 Prob(T)+Prob(Y)=1 61 5.5 Statistically Dependent Example Calculate Conditional pdf’s: Prob(H|T)=Prob(T and H)/Prob(T) =0.7/0.75 =0.93 Prob(H|Y)=Prob(Y and H)/Prob(Y) =0.10/0.25 =0.4 Exercise: Calculate the other conditional pdf’s62 5.5 Statistically Depressant Example Notice that since these two variables are NOT statistically independent – Game of Thrones is utility enhancing – our above property does not hold. P(Happy) ≠ P(Happy given Thrones) 0.8 ≠ 0.93 P(Sad) ≠ P(Sad given Yodeling) 0.2 ≠ 0.6 (1-0.4) 63 5.5 Conditional Expectations and Variance Assuming that our variables take numerical values (or can be interpreted numerically), conditional expectations and variances can be taken: E(P|Q=500)=Σpf(p|Q=500) Var(P|Q=500)=Σ[p-E(P|Q=500)]2f(p|Q=500) Ie) money spent on a car and resulting utility 64 (both random variables expressed numerically). 5.5 Conditional Expectations and Variance Example: A consumer can spend $5000 or $10,000 on a car, yielding utility of 10 or 20. The conditional probabilities are : f(10|$5,000)=0.7 f(20|$5,000)=0.3 E(U|P=$5000) =ΣUf(U|P=$5000) =10(0.7) +20(0.3) =13 Var(U|P=$5K) =Σ[U-E(U|P=$5K)]2f(U|P=$5K) =(10-13)2(0.7)+(20-13)2(0.3) 65 = 21 5.6 Covariance and Correlation If two random variables are NOT statistically independent, it is important to measure the amount of their interconnectedness. Covariance and Correlation are useful for this. Covariance and Correlation are also useful in model testing, as you will learn in Econ 399. 66 5.6 Covariance Covariance – a measure of the degree of linear dependence between two random variables. A positive covariance indicates some degree of positive linear association between the two variables (the opposite likewise applies) Cov(V,W)=E{[W-E(W)][V-E(V)]} 67 5.6 Discrete and Continuous Covariance Discrete Random Variable: Cov(V ,W ) (v E (v))( w E ( w)) f (v, w) v w Continuous Random Variable: Cov(V ,W ) (v E (v))( w E ( w)) f (v, w)vw v w 68 5.6 Covariance Example Joe can buy either a burger ($2) or ice cream ($1) and experience utility of 1 or zero. C={$1, $2}, U={0,1} Prob($1 Prob($1 Prob($2 Prob($2 and and and and 0)=0.2 1)=0.6 0)=0.1 1)=0.1 69 5.6 Covariance Example Prob($1 Prob($1 Prob($2 Prob($2 and and and and 0)=0.2 1)=0.6 0)=0.1 1)=0.1 Prob($1)=0.2+0.6=0.8 Prob($2)=0.1+0.1=0.2 Prob(0)=0.2+0.1=0.3 Prob(1)=0.6+0.1=0.7 70 5.6 Covariance Example E(C) =∑cf(c) =$1(0.8)+$2(0.2) =$1.20 E(U) = ∑uf(u) = 0(0.3)+1(0.7) =0.7 71 5.6 Covariance Example E(C) =$1.20 E(U) =0.7 Cov(C,U)=∑∑(c-E(C))(u-E(U))f(c,u) =(1-1.20)(0-0.7)(0.2) +(1-1.20)(1-0.7)(0.6)+(2-1.20)(0-0.7)(0.1) +(2-1.20)(1-0.7))0.1) =(-0.2)(-0.7)(0.2)+(-0.2)(0.3)(0.6) +(0.8)(-0.7)(0.1)+(0.8)(0.3)(0.1) =0.028-0.036-0.056+0.032 =-0.032 (Negative Relationship) 72 5.6 Correlation Covariance is an unbounded measure of interdependence between two variables. Often, it is useful to obtain a BOUNDED measure of interdependence between two variables, as this opens the door for comparison. Correlation is such a bounded variable, as it lies between -1 and 1. 
5.6 Correlation
Covariance is an unbounded measure of the interdependence between two variables. Often it is useful to obtain a BOUNDED measure of interdependence, as this opens the door for comparison. Correlation is such a bounded measure, as it always lies between −1 and 1.

5.6 Correlation Formulas
Corr(V,W) = Corr(W,V) = ρVW = Cov(V,W) / (σV σW)
= ΣΣ (v − E(V))(w − E(W)) f(v,w) / [Var(V)Var(W)]^(1/2)

5.6 Correlation Example
From the data above:
Var(C) = Σ (c − E(C))² f(c) = (1 − 1.2)²(0.8) + (2 − 1.2)²(0.2) = 0.032 + 0.128 = 0.16
Var(U) = Σ (u − E(U))² f(u) = (0 − 0.7)²(0.3) + (1 − 0.7)²(0.7) = 0.147 + 0.063 = 0.21
Corr(C,U) = Cov(C,U)/[sd(C)sd(U)] = −0.04/[0.16(0.21)]^(1/2) ≈ −0.22
This still represents a negative, though fairly weak, relationship.

5.6 Graphical Correlation
If correlation = 1, observations of the two variables lie on an upward-sloping line.
If correlation = −1, observations of the two variables lie on a downward-sloping line.
If correlation is between 0 and 1, observations of the two variables will be scattered along an upward-sloping line.
If correlation is between 0 and −1, observations of the two variables will be scattered along a downward-sloping line.

5.6 Correlation, Covariance and Independence
Covariance, correlation and independence have the following relationship:
If two random variables are independent, their covariance (correlation) is zero.
INDEPENDENCE => ZERO COVARIANCE
If two variables have zero covariance (correlation), they may or may not be independent.
ZERO COVARIANCE ≠> INDEPENDENCE
From these relationships, we know that
Non-zero covariance => dependence, but
Dependence ≠> non-zero covariance

5.7 Population vs. Sample Data
Population Data – full information on the ENTIRE population
- includes population probabilities (a pdf)
- uses the previous formulas
- ex) data on an ENTIRE class
Sample Data – partial information from a RANDOM SAMPLE (a smaller selection) of the population
- individual data points (no pdf)
- uses the following formulas
- ex) a study of 2,000 random students

5.7 Estimators
Population Expected Value: μ = E(Y) = Σ y f(y)
Sample Mean: Ybar = Σ Yi / N
Note: from this point on, the sample mean of Y may be written as Ybar (and likewise Xbar, etc.). For example, no equation editor is available over email, so answers may be in that format.

5.7 Sample Mean Example
Tom and Rodney both go to a 4-day gaming convention. How much each spends (S) per day is listed below:
Day:    1    2    3    4
Tom:    110  85   90   135
Rodney: 200  20   10   220
Tom: Sbar = (110 + 85 + 90 + 135)/4 = 105
Rodney: Sbar = (200 + 20 + 10 + 220)/4 = 105

5.7 Estimators
Population Variance: σY² = Var(Y) = Σ [y − E(Y)]² f(y)
Sample Variance: SY² = Σ (Yi − Ybar)² / (N − 1)

5.7 Sample Variance Example
Tom: S² = [(110 − 105)² + (85 − 105)² + (90 − 105)² + (135 − 105)²]/(4 − 1)
= (25 + 400 + 225 + 900)/3 ≈ 517
Rodney: S² = [(200 − 105)² + (20 − 105)² + (10 − 105)² + (220 − 105)²]/(4 − 1)
= (9025 + 7225 + 9025 + 13225)/3 ≈ 12,833
Even though Tom and Rodney spent the same average amount every day ($105), Rodney's spending was MUCH more spread out (it changed more from day to day), as seen by the variance.
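The Tom and Rodney figures here and in the next few slides can be reproduced with Python's statistics module, whose estimators all use the N − 1 (sample) divisor; note that covariance and correlation require Python 3.10+. Added for illustration:

```python
from statistics import mean, variance, stdev, covariance, correlation
# variance/stdev/covariance/correlation all divide by N - 1;
# covariance and correlation need Python 3.10+.

tom = [110, 85, 90, 135]
rodney = [200, 20, 10, 220]

print(mean(tom), mean(rodney))    # both means are 105
print(variance(tom))              # ≈ 516.7 (1550/3; the slides round to 517)
print(variance(rodney))           # ≈ 12833.3 (38500/3)
print(stdev(tom), stdev(rodney))  # ≈ 22.7 and ≈ 113.3
print(covariance(tom, rodney))    # 2350.0
print(correlation(tom, rodney))   # ≈ 0.91: a very strong positive correlation
```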
5.7 Estimators
Population Standard Deviation: σY = (σY²)^(1/2)
Sample Standard Deviation: SY = (SY²)^(1/2)

5.7 Sample Standard Deviation Example
Tom: S² ≈ 517, so S = 517^(1/2) ≈ 22.7
Rodney: S² ≈ 12,833, so S = 12,833^(1/2) ≈ 113.3

5.7 Estimators
Population Covariance: Cov(V,W) = ΣΣ (v − E(V))(w − E(W)) f(v,w)
Sample Covariance: Cov(V,W) = Σ (Vi − Vbar)(Wi − Wbar) / (N − 1)

5.7 Covariance Example
Cov(Tom, Rodney) = Σ (Ti − Tbar)(Ri − Rbar)/(N − 1)
= [(110 − 105)(200 − 105) + (85 − 105)(20 − 105) + (90 − 105)(10 − 105) + (135 − 105)(220 − 105)]/(4 − 1)
= (475 + 1700 + 1425 + 3450)/3
= 2350
There is a POSITIVE covariance between Tom and Rodney's spending. They seem to go up and down at the same time.

5.7 Estimators
Population Correlation: ρVW = Corr(V,W) = Cov(V,W)/(σV σW)
Sample Correlation: rVW = Corr(V,W) = Cov(V,W)/(SV SW)

5.7 Correlation Example
Corr(Tom, Rodney) = Cov(Tom, Rodney)/(S_Tom S_Rodney) = 2350/[(22.7)(113.3)] ≈ 0.91
There is a VERY STRONG positive correlation between Tom and Rodney's spending.

5.7 Estimators
Population Regression Function: Yi = β1 + β2Xi + єi
Estimated Regression Function: Yhati = β1hat + β2hat(Xi)

5.7 Estimators
OLS Estimation:
β2hat = Σ (Xi − Xbar)(Yi − Ybar) / Σ (Xi − Xbar)² = Cov(X,Y)/SX²
β1hat = Ybar − β2hat(Xbar)
Note: β2hat may also be written as b2hat.

5.7 OLS Estimation
Regressing Tom's spending on Rodney's:
Shat_Ti = β1hat + β2hat(S_Ri)
β2hat = Cov(Tom, Rodney)/S²_Rodney = 2350/12,833 ≈ 0.183
β1hat = Sbar_T − β2hat(Sbar_R) = 105 − 0.183(105) ≈ 85.8
Shat_Ti ≈ 85.8 + 0.183(S_Ri)
Note: although mathematically there is a relationship between Tom and Rodney's spending, in reality there is no direct connection – both are influenced by a 3rd variable: most people spend the most on the first and last days of the convention.

5.7 Estimators Example 2
Given the data set:
Price:    4, 3, 3, 6
Quantity: 10, 15, 20, 15
find the sample means, variances, covariance, correlation, and OLS estimates.

Sample means:
Pbar = (4 + 3 + 3 + 6)/4 = 4
Qbar = (10 + 15 + 20 + 15)/4 = 15

Sample variances:
Sp² = [(4 − 4)² + (3 − 4)² + (3 − 4)² + (6 − 4)²]/(N − 1) = (0 + 1 + 1 + 4)/3 = 2
Sq² = [(10 − 15)² + (15 − 15)² + (20 − 15)² + (15 − 15)²]/(N − 1) = (25 + 0 + 25 + 0)/3 = 50/3

Sample covariance:
Cov(P,Q) = [(4 − 4)(10 − 15) + (3 − 4)(15 − 15) + (3 − 4)(20 − 15) + (6 − 4)(15 − 15)]/(N − 1)
= (0 + 0 − 5 + 0)/3 = −5/3

Sample correlation:
Corr(P,Q) = Cov(P,Q)/(Sp Sq) = (−5/3)/[2(50/3)]^(1/2) = (−5/3)/(10/√3) ≈ −0.289

OLS estimation:
b2hat = Σ (Pi − Pbar)(Qi − Qbar) / Σ (Pi − Pbar)² = −5/6
b1hat = Qbar − b2hat(Pbar) = 15 − (−5/6)(4) = 90/6 + 20/6 = 110/6
Qhati = 110/6 − (5/6)Pi
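A hand-rolled OLS helper (an illustrative sketch; ols is our name, not a course function) reproduces Example 2's slope and intercept from the formulas b2hat = Σ(Xi − Xbar)(Yi − Ybar)/Σ(Xi − Xbar)² and b1hat = Ybar − b2hat(Xbar):

```python
def ols(x, y):
    """OLS estimates: b2 = Cov(x, y) / Var(x), b1 = ybar - b2 * xbar."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b2 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    b1 = ybar - b2 * xbar
    return b1, b2

price = [4, 3, 3, 6]
quantity = [10, 15, 20, 15]
print(ols(price, quantity))   # (≈ 18.33, ≈ -0.833): Qhat = 110/6 - (5/6)P
```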
5.7.1 Estimators as Random Variables
Each of these estimators gives us a result based upon the data available, so two different data sets can yield two different point estimates. The value of a point estimate can therefore be seen as the result of a chance experiment – obtaining a data set. Each point estimate is thus a random variable, with a probability distribution that can be analyzed using the expectation and variance operators.

5.7.1 Estimators Distribution
Since the sample mean is a random variable, we can apply the expectation and summation rules to find its expected value:
Ybar = Σ Yi / N
E(Ybar) = E(Σ Yi / N) = (1/N) E(Σ Yi) = (1/N) Σ E(Yi) = (1/N)(N μY) = μY

5.7.1 Estimators Distribution
If we make the simplifying assumption that there is no covariance between data points (e.g. one person's consumption is unaffected by the next person's consumption), we can easily calculate the variance of the sample mean:
Var(Ybar) = Var(Σ Yi / N) = (1/N)² Var(Σ Yi) = (1/N)² Σ Var(Yi) = (1/N)²(N σY²) = σY²/N

5.7.1 Estimators Distribution
Although we can't observe the population variance of Ybar, we can calculate its sample counterpart:
Var(Ybar) = σY²/N, so SampleVar(Ybar) = SY²/N

5.8 Common Economic Distributions
In order to test assumptions and models, economists need to be familiar with the following distributions:
- Normal
- t
- Chi-square
- F
For full examples and explanations of these tables, please refer to a statistics text.

5.8 Normal Distribution
The standard Normal (Z) distribution produces a symmetric bell-shaped curve with a mean of zero and a standard deviation of one.
The probability that z > 0 is always 0.5
The probability that z < 0 is always 0.5
Z-tables generally (but not always) measure the area from the centre.
Probabilities decrease as you move away from the center.

5.8 Normal Example
Weekly weight gain can be argued to have a normal distribution:
- on average, no weight is gained or lost
- a few pounds may be gained or lost
- it is very unlikely to lose or gain many pounds
Find Prob(gain between 0 and 1 pound):
Prob(0 < z < 1) = 0.3413 = 34.13%
Find Prob(lose more than 2 pounds):
Prob(z < −2) = 0.5 − 0.4772 = 0.0228 (2.28%)
Find Prob(do not gain more than 2 pounds):
Prob(z < 2) = 0.5 + 0.4772 = 0.9772 (97.72%)

5.8 Converting to a Standard Normal Distribution
Z distributions assume that the mean is zero and the standard deviation is one. If this is not the case, the variable needs to be converted to a standard normal using the formula:
Z = (X − μX)/σX

5.8 Assignment Example
The average for the Fall 2005 Assignment #2 was 82%. The standard deviation was approximately 6. What is the probability of a random student getting above 90%?
Prob(Y > 90) = Prob[(Y − 82)/6 > (90 − 82)/6]
= Prob(Z > 1.33)
= 0.5 − Prob(0 < Z < 1.33)
= 0.5 − 0.4082
= 9.18%
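Where the slides use printed Z-tables, the same areas can be pulled from a statistics library. A sketch assuming SciPy is available (scipy.stats.norm):

```python
from scipy.stats import norm  # assumes SciPy is installed

mu, sigma = 82, 6
# P(Y > 90) with Y ~ N(82, 6^2): standardize, then take the upper-tail area.
z = (90 - mu) / sigma
print(1 - norm.cdf(z))                   # ≈ 0.0912
print(norm.sf(90, loc=mu, scale=sigma))  # same area; the 9.18% above uses z rounded to 1.33
```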
5.8 Assignment Example
What is the probability of getting a mark in the 80's?
Prob(79 < Y < 90) = Prob[(79 − 82)/6 < (Y − 82)/6 < (90 − 82)/6]
= Prob(−0.5 < Z < 1.33)
= Prob(0 < Z < 0.5) + Prob(0 < Z < 1.33)
= 0.1915 + 0.4082
= 0.5997 = 59.97%

5.8 Assignment Example
Find the mark (Y*) such that there is a 15% probability that Y < Y* (the bottom 15% of the class).
(Since 0.15 < 0.5, Z* < 0)
Prob(Z < Z*) = 0.5 − Prob(0 < Z < −Z*)
0.15 = 0.5 − Prob(0 < Z < −Z*)
Prob(0 < Z < −Z*) = 0.35
From the tables, −Z* = 1.04, therefore Z* = −1.04
We know that Z = (X − μ)/σ, so
X = μ + Z(σ) = 82 + (−1.04)(6) = 82 − 6.24 = 75.76
There is a 15% chance that a student scored less than 75.76%.

5.8 Other Distributions
All other distributions depend on DEGREES OF FREEDOM, which generally depend on two things:
- Sample size (as the sample size rises, so do the degrees of freedom)
- Complication of the test (more complicated statistical tests reduce degrees of freedom)
Simple conclusions are easier to make than complicated ones.

5.8 t-Distribution
t-distributions can involve 1-tail or 2-tail tests; interpolation is often needed within the table.
Example 1: find the critical t-values (±t*) that cut a total of 1% off both tails with 27 df.
(Note: 1% off both tails = 0.5% off each tail)
For p = 0.495, df = 27 gives t* = ±2.77
Example 2: find the critical t-value (t*) that cuts 1% off the right tail with 35 df.
For 1T = 0.01: df = 30 gives t* = 2.46; df = 40 gives t* = 2.42
Since 35 is halfway between 30 and 40, a good approximation for df = 35 is t* = (2.46 + 2.42)/2 = 2.44

5.8 t-Distribution
Typically, the following variable (similar to the normal Z variable seen earlier) will have a t-distribution (we will see examples later):
t = [Estimator − E(Estimator)] / sample sd(Estimator)

5.8 Chi-Square Distribution
Chi-square tests are 1-tail tests; interpolation is often needed within the table.
Example: find the critical chi-square value that cuts 5% off the right tail with 2 df.
For right tail = 0.05, df = 2: critical chi-square value = 5.99

5.8 F-Distribution
F-tests are 1-tail tests; interpolation is often needed within the table.
Example: find the critical F-value (F*) that cuts 1% off the right tail with 3 df in the numerator and 80 df in the denominator.
For right tail = 0.01, df1 = 3:
df2 = 60 gives F* = 4.13; df2 = 120 gives F* = 3.95
Since 80 is 1/3 of the way between 60 and 120, our F-value should be 1/3 of the way between 4.13 and 3.95:
Approximation: F* = 4.13 − (4.13 − 3.95)/3 = 4.07

5.8 Distribution Usage
Different tests of models will use different tables, as we will see later in the course. In general:
1) Normal tables do distribution estimations
2) t-tables do simple tests
3) F-tables do simultaneous tests – Prob(a & b)
4) Chi-square tables do complicated tests devised by mathematicians smarter than you or me (they invented them, we use them)
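The table lookups and interpolations in 5.8 can be checked against exact quantiles, again assuming SciPy is available. The interpolated values above come out close to, but not exactly equal to, the true ones:

```python
from scipy.stats import t, chi2, f  # assumes SciPy is installed

print(t.ppf(1 - 0.005, 27))    # ≈ 2.771: 1% split over two tails, 27 df
print(t.ppf(1 - 0.01, 35))     # ≈ 2.438: vs the interpolated 2.44
print(chi2.ppf(1 - 0.05, 2))   # ≈ 5.991: 5% right tail, 2 df
print(f.ppf(1 - 0.01, 3, 80))  # ≈ 4.04: vs the interpolated 4.07
```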
5.9 Confidence Intervals
Thus far, all our estimates have been POINT estimates: a single number emerges as our estimate of an unknown parameter, e.g. Xbar = 3.74.
Even if we have good data and an estimator with a small variance, the chance that our estimate exactly equals the true value is very low. e.g. even if a coin is expected to come up heads half the time, the chance that it does exactly that in an experiment is very low.

5.9.1 Constructing Confidence Intervals
Confidence intervals, or interval estimators, acknowledge the underlying uncertainty and are an alternative to point estimators. A confidence interval proposes a range of values in which the true parameter could lie, with a given probability. Confidence intervals can be constructed because our point estimates are RANDOM VARIABLES.

5.9.1 Degrees of Freedom
When given actual population data, we converted to a z-score: Z = (X − μ)/σ
With random samples, we convert to a t-score:
t = [Xbar − E(Xbar)] / sample sd(Xbar), with n − 1 degrees of freedom
This is proven by various complicated central limit theorems.

5.9.1 CI's and Alpha
Probabilities of confidence intervals are denoted by α (alpha). Given α, we construct a 100(1 − α)% confidence interval. If α = 5%, we construct a 95% confidence interval.
P(lower limit < true parameter < upper limit) = 1 − α

5.9.1 Formula
Given a repeated sample, we want to construct confidence intervals for the mean such that:
P{−t* ≤ (Xbar − μX)/s_Xbar ≤ t*} = 1 − α
where t has n − 1 degrees of freedom and ±t* cuts α/2 off each tail.
Rearranging, we get:
P{Xbar − t*(s_Xbar) ≤ μX ≤ Xbar + t*(s_Xbar)} = 1 − α
Our final formula becomes:
CI for μX = Xbar ± t*(s_Xbar)
or, in general: CI for the true value = estimate ± t*(s_estimate)
which gives us an upper and a lower bound for our CI.

5.9.1 Example
Flipping a coin has given us 25 heads (with a value of 1) and 15 tails (with a value of zero). Find the 95% CI if n = 40.
We therefore have:
Cbar = [25(1) + 15(0)]/40 = 25/40 = 0.625

5.9.1 IMPORTANT – Estimated Standard Deviation of a Sample Mean
We have already seen that the sample standard deviation is found through the formula:
SY = [Σ (Yi − Ybar)²/(N − 1)]^(1/2)
The standard deviation of a sample mean is found through:
s_Ybar = SY/√N

5.9.1 Example
SC = {[25(1 − 0.625)² + 15(0 − 0.625)²]/(40 − 1)}^(1/2) = (9.375/39)^(1/2) ≈ 0.49
s_Cbar = SC/√N = 0.49/√40 ≈ 0.077
A 95% CI has 2.5% off each tail. If n = 40, t* = 2.02.
CI for μC = Cbar ± t*(s_Cbar) = 0.625 ± 2.02(0.077)
CI for μC = [0.47, 0.78]

5.9.1 Interpretation
In this example, we have a confidence interval of [0.47, 0.78]. In other words, in repeated samples, 95% of intervals constructed this way will include the true probability of getting a "heads" when flipping the coin.

5.9.1 Confidence Requirements
In order to construct a confidence interval, one needs:
a) a point estimate of the parameter
b) the estimated standard deviation of the parameter
c) a critical value from a probability distribution (or α and the sample size, n)
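Putting the three confidence-interval requirements together for the coin example, a minimal sketch (illustrative; the critical value 2.02 is taken from the t-table as in the slides):

```python
from math import sqrt

data = [1] * 25 + [0] * 15               # 25 heads (1), 15 tails (0), n = 40
n = len(data)
cbar = sum(data) / n                      # point estimate: 0.625
s = sqrt(sum((x - cbar) ** 2 for x in data) / (n - 1))  # sample sd ≈ 0.49
se = s / sqrt(n)                          # sd of the sample mean ≈ 0.077
t_star = 2.02                             # t critical value: 39 df, 2.5% per tail
print(cbar - t_star * se, cbar + t_star * se)   # ≈ 0.47 and ≈ 0.78
```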
5.10 Hypothesis Testing
After a model has been derived, it is often useful to test various hypotheses:
- Are a pair of dice weighted towards a number (say 11)?
- Does a player get blackjack more often than he should?
- Will raising tuition increase graduation rates?
- Will soaring gas costs decrease car sales?
- Will the recession affect Xbox sales?
- Does fancy wrapping increase the appeal of Christmas presents?
- Does communication between rivals affect price?

5.10 Hypothesis Testing
Question: is our data CONSISTENT with a particular parameter having a specific value?
Although we may observe an outcome (e.g. a blackjack player has 150% of his starting chips, when the average outcome should be 80%), we need to test whether this outcome is:
1) consistent with typical chance, or
2) inconsistent – perhaps showing cheating

5.10 Hypothesis Testing
Testing the consistency of a hypothesized parameter:
1) Form a null and an alternate hypothesis.
H0 = null hypothesis = the parameter is equal to a number
Ha = alternate hypothesis = the parameter is not equal to that number
ex) H0: Outcome = 0.8; Ha: Outcome ≠ 0.8
2) Collect appropriate sample data.
3) Select an acceptable probability (α) of rejecting the null hypothesis when it is true.
- a Type One error
- the lower α is, the less likely we are to find a sample that rejects the null hypothesis
- α is often 10%, 5%, or 1%
4) Construct an appropriate test statistic.
- ensure the test statistic can be calculated from the sample data
- ensure its distribution is appropriate to what is being tested (e.g. a t-statistic for a test of the mean)
5) Establish reject and do-not-reject regions.
- construct a bell curve
- the tails are the Reject H0 regions
- the centre is the Do Not Reject H0 region
6) Compare the test statistic to the critical statistic.
- if the test statistic lies in the tails, reject
- if the test statistic doesn't lie in the tails, do not reject
- never ACCEPT
7) Interpret the results.

5.10 Hypothetical Example
Johnny is a poker player who wins an average of 8 games out of every ten (from 120 games, with a standard deviation of 0.5). Test the hypothesis that Johnny never wins.
1) H0: W = 0; Ha: W ≠ 0
2) We have estimated W = 8; the standard deviation was 0.5
3) We let α = 1%; we want a strong result
4) t = (estimate − hypothesis)/sd = (8 − 0)/0.5 = 16
5) t* for n − 1 = 119 df, α = 1%: t* = 2.62
6) t = 16 > t* = 2.62: reject H0
7) Allowing for a 1% chance of a Type One error, we reject the null hypothesis that Johnny never wins at poker. According to our data, it is consistent that Johnny sometimes wins at poker.

5.10 Hypothesis Example 2
Tom spends the following on 4 days of gaming conventions: $110, $85, $90, $135. Test the hypothesis that Tom spends an average of $100 a day (α = 5%).
1) H0: μS = 100; Ha: μS ≠ 100
2) From before:
Sbar = (110 + 85 + 90 + 135)/4 = 105
S² = [(110 − 105)² + (85 − 105)² + (90 − 105)² + (135 − 105)²]/(4 − 1) = (25 + 400 + 225 + 900)/3 ≈ 517
sd(Sbar) = S/√N = 517^(1/2)/√4 ≈ 11.4
3) We let α = 5%
4) t = (estimate − hypothesis)/sd(Sbar) = (105 − 100)/11.4 ≈ 0.439
5) t* (3 df, α = 0.05) = 3.18
6) −t* < t < t*: do not reject H0. At a 5% level of significance, our data are consistent with Tom spending an average of $100 a day.
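Finally, a sketch of Hypothesis Example 2 end-to-end (illustrative code, not course material; the critical value 3.18 comes from the t-table as in the slides):

```python
from math import sqrt

spend = [110, 85, 90, 135]
n = len(spend)
sbar = sum(spend) / n                               # sample mean: 105.0
s2 = sum((x - sbar) ** 2 for x in spend) / (n - 1)  # sample variance ≈ 517
se = sqrt(s2 / n)                                   # sd of the sample mean ≈ 11.4

t_stat = (sbar - 100) / se   # H0: mean daily spending is $100
print(t_stat)                # ≈ 0.44; |t| < t* = 3.18 (3 df, α = 5%) → do not reject H0
```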