Econ 140
More on Univariate Populations
Lecture 4

Today's Plan
• Examining known distributions:
  – Normal distribution & standard normal curve
  – Student's t distribution
  – F distribution & χ² distribution
• Note: you should have a handout for today's lecture with all tables and a cartoon
• Brief statements about:
  – Bivariate populations and conditional probabilities
  – Joint and marginal probabilities

Standard Normal Curve (6)
• Going back to our earlier question: What is the probability that someone earns between $300 and $400, $P(300 \le Y \le 400)$?
• With $\mu = 316.6$ and $\sigma^2 = 25608$, so $\sigma = \sqrt{25608} \approx 160$:
  $Z_{300} = \frac{300 - 316.6}{160} = -0.104$
  $Z_{400} = \frac{400 - 316.6}{160} = 0.52$
• From the standard normal table:
  $P(-0.104 \le Z \le 0) = 0.0418$
  $P(0 \le Z \le 0.52) = 0.1985$
  $P(-0.104 \le Z \le 0.52) = 0.0418 + 0.1985 = 0.2403$

Standard Normal Curve (7)
• We know from using our PDF that the chance of someone earning between $300 and $400 is around 23%, so 0.24 is a good approximation
• Now we can ask: What is the probability that someone earns between $253 and $316, $P(253 \le Y \le 316)$?
  $Z_1 = \frac{253 - 316.6}{160} = -0.3975$
  $Z_2 = \frac{316 - 316.6}{160} = -0.0038$
• Both values lie below the mean, so we subtract the smaller area from the larger one:
  $P(-0.3975 \le Z \le 0) = 0.1554$
  $P(-0.0038 \le Z \le 0) = 0.0020$
  $P(-0.3975 \le Z \le -0.0038) = 0.1554 - 0.0020 = 0.1534 \approx 15.3\%$

Standard Normal Curve (8)
• There are instructions for how you can do this using Excel: L4_1.xls. Note how to use STANDARDIZE and NORMDIST and what they represent
• Our spreadsheet has 3 examples of different earnings intervals, using the same distribution that we used today
• Testing the normality assumption: we know the approximate shape of the earnings distribution (L3_79.xls). It is slightly skewed. Is normality a good assumption?
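The two interval probabilities worked out on the standard normal slides can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` stands in for the printed Z table and for the role that STANDARDIZE and NORMDIST play in the lecture's spreadsheet. The function names `standardize` and `interval_probability` are illustrative, not taken from the lecture.

```python
from statistics import NormalDist

# Population values quoted in the lecture.
MEAN = 316.6
VARIANCE = 25608
SD = VARIANCE ** 0.5  # about 160.02; the slides round this to 160


def standardize(y, mean=MEAN, sd=SD):
    """Convert an earnings value y into a Z score."""
    return (y - mean) / sd


def interval_probability(low, high, mean=MEAN, sd=SD):
    """P(low <= Y <= high) for a normal Y, via the standard normal CDF."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(standardize(high, mean, sd)) - z.cdf(standardize(low, mean, sd))


if __name__ == "__main__":
    # Compare with the table-based answers of about 0.24 and about 0.15.
    print(round(interval_probability(300, 400), 4))
    print(round(interval_probability(253, 316), 4))
```

Because the exact CDF is used instead of rounded table entries, the results should agree with the slide values only to about two decimal places.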
• The use of NORMSINV in Excel is shown in L4_2.xls

Student's T-Distribution
• Starting next week, we'll be looking more closely at sample statistics
• In sample statistics, we have a sample that is small relative to the population size
• We do not know the true population mean and variance
  – So we take samples, and from those samples we estimate a mean $\bar{Y}$ and variance $S_Y^2$

T-Distribution Properties
• Fatter tails than the Z distribution
• Variance is $\nu/(\nu - 2)$ for $\nu > 2$, where $\nu$ is the degrees of freedom ($n - 1$ for a sample of $n$ observations)
• When n becomes large (usually over 30), the t approximates the normal curve
• The t-distribution is also centered on a mean of zero
• The t lets us approximate probabilities for small samples

F and χ² Distributions
• Chi-squared distribution: the square of a standard normal (Z) variable is distributed χ² with one degree of freedom (df).
• Chi-squared is skewed. As df increases, the χ² approximates a normal.
• F-distribution: deals with sample data. F stands for R. A. Fisher, who derived the distribution. F tests if variances are equal.
• F is skewed and positive. As sample sizes grow infinitely large, the F approximates a normal. F has two parameters: degrees of freedom in the numerator and in the denominator.

A recap on the story so far
• Probability is concerned with random events.
• Nearly all data is the outcome of a 'random draw': a sample drawn at random.
• The probability of earning particular amounts
  – Relationship between a sample and population
  – Using standard normal tables
• Introduction to the t-distribution
• Introduction to the F and χ² distributions

A quick note on bivariate probability
• Bivariate populations and conditional probabilities
• Joint and marginal probabilities

A Simple E.C.P Example
• Introduce bivariate probability with an example of empirical classical probability (ECP).
• Consider a fictitious computer company.
We might ask the following questions:
  – What is the probability that consumers will actually buy a new computer?
  – What is the probability that consumers are planning to buy a new computer?
  – What is the probability that consumers are planning to buy and actually will buy a new computer?
  – Given that a consumer is planning to buy, what is the probability of a purchase?

A Simple E.C.P Example (2)
• Think of probability as relating to the outcome of a random event (recap)
• All probabilities fall between 0 (null) and 1 (certain): $0 \le P(A) \le 1$
• The probability of any event A is $P(A) = \frac{m}{n}$ with $A = \{a_1, a_2, a_3, \dots, a_n\}$, where m is the number of outcomes in which A occurs and n is the number of possible outcomes

A Simple E.C.P Example (3)
• The cumulative frequency is: $\sum_i P(a_i) = 1$
• The sample space (of 1000 observations) looks like this:

                        Actually Purchase
  Plan to Purchase   Yes (b1)   No (b2)   Total
  Yes (a1)              200        50       250
  No (a2)               100       650       750
  Total                 300       700      1000

• Before we move on we'll look at some simple definitions

A Simple E.C.P Example (4)
• If we have an event A there will be a complement to A, which we'll call A' or B
• Computing marginal probabilities
  – Event A consists of two outcomes, a1 and a2: $A = \{a_1, a_2\}$
  – The complement B also consists of two outcomes, b1 and b2: $B = \{b_1, b_2\}$
  – Two events are mutually exclusive if both events cannot occur
  – A set of events is collectively exhaustive if one of the events must occur

A Simple E.C.P Example (5)
• Computing marginal probabilities:
  $P(A) = P(A \cap B_1) + P(A \cap B_2) + \dots + P(A \cap B_k)$
  where $B_1, \dots, B_k$ are mutually exclusive and collectively exhaustive, and k is some arbitrarily large number
• If A = planned to purchase and B = actually purchased:
  P(planned to buy) = P(planned & did) + P(planned & did not)
  $= \frac{200}{1000} + \frac{50}{1000} = \frac{250}{1000} = 0.25$

                        Actually Purchase
  Plan to Purchase   Yes (b1)   No (b2)   Total
  Yes (a1)              200        50       250
  No (a2)               100       650       750
  Total                 300       700      1000

A Simple E.C.P Example (6)
• If the two events, A and B, are mutually exclusive, then $P(A \text{ or } B) = P(A) + P(B)$
  – The general rule is written as: $P(A \text{ or } B) = P(A) + P(B) - P(A \cap B)$
  – Example: the probability that you draw a heart or a spade from a deck of cards
• They're mutually exclusive events:
  P(Heart or Spade) = P(Heart) + P(Spade) - P(Heart and Spade)
  $= \frac{13}{52} + \frac{13}{52} - 0 = \frac{26}{52} = \frac{1}{2} = 0.50$

A Simple E.C.P Example (6)
• Probability that someone planned to buy or actually did buy: use the general addition rule:
  $P(A \text{ or } B) = P(A) + P(B) - P(A \cap B)$
• If A is planning to purchase and B is actually purchasing, we can plug in the marginal probabilities to find
  $\frac{250}{1000} + \frac{300}{1000} - \frac{200}{1000} = \frac{350}{1000} = 0.35$

                        Actually Purchase
  Plan to Purchase   Yes (b1)   No (b2)   Total
  Yes (a1)              200        50       250
  No (a2)               100       650       750
  Total                 300       700      1000

• Joint probability, P(A and B): planned and actually purchased (the 200 cell)

Conditional Probabilities
• Let's leave the example for a while and consider conditional probabilities.
• Conditional probabilities are represented as P(Y|X)
• This looks similar to the conditional mean function: $\mu_{Y|X}$
• We'll use this to lead into regression line inference.
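The marginal and addition-rule calculations from the computer company example can be reproduced directly from the 1000-observation table. This is a minimal sketch: the dictionary layout, the labels, and the function names are illustrative assumptions, and only the four cell counts come from the lecture.

```python
from fractions import Fraction

# Cell counts from the lecture's table, keyed as (plan to purchase, actually purchase).
COUNTS = {
    ("plan_yes", "buy_yes"): 200,
    ("plan_yes", "buy_no"): 50,
    ("plan_no", "buy_yes"): 100,
    ("plan_no", "buy_no"): 650,
}
TOTAL = sum(COUNTS.values())  # 1000 observations


def joint(plan, buy):
    """Joint probability of one cell: cell count divided by the total."""
    return Fraction(COUNTS[(plan, buy)], TOTAL)


def marginal_plan(plan):
    """Marginal probability of a planning outcome: sum the joints across its row."""
    return sum(joint(plan, buy) for buy in ("buy_yes", "buy_no"))


def marginal_buy(buy):
    """Marginal probability of a purchase outcome: sum the joints down its column."""
    return sum(joint(plan, buy) for plan in ("plan_yes", "plan_no"))


def planned_or_bought():
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return (marginal_plan("plan_yes")
            + marginal_buy("buy_yes")
            - joint("plan_yes", "buy_yes"))


if __name__ == "__main__":
    print(float(marginal_plan("plan_yes")))  # P(planned to buy)
    print(float(planned_or_bought()))        # P(planned or bought)
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared with the slide values of 0.25 and 0.35 without floating-point rounding.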

Conditional Probabilities (2)
• Probabilities will be defined as
  $p_{jk} = P(X = X_j, Y = Y_k), \quad j = 1, \dots, J; \; k = 1, \dots, K$
• If we sum over j and k, we will get 1, or:
  $\sum_j \sum_k p_{jk} = 1$
• We define the conditional probability as f(X|Y)
  – This is read "a function of X given Y"
  – We can define this as:
    $f(X|Y) = \frac{\text{Joint probability of } X \text{ \& } Y}{\text{Marginal probability of } Y}$

Conditional Probabilities (3)
• Similarly we can define f(Y|X):
  $f(Y|X) = \frac{\text{Joint probability of } Y \text{ \& } X}{\text{Marginal probability of } X}$
• Looking at our example spreadsheet, we have a sample of weekly earnings and years of education: L5_1.XLS.
• There are two statements on the spreadsheet that will clarify the difference between joint and conditional probabilities

Conditional Probabilities (4)
• The joint probability is a relative frequency and it asks:
  – How many people earn between $600 and $799 and have 10 years of education?
• The conditional probability asks:
  – How many people earn between $600 and $799 given they have 10 years of education?
• On the spreadsheet I've outlined the cells that contain the highest probability for each number of completed years of education
  – There's a pattern you should notice

Conditional Probabilities (5)
• We can use the same data to graph the conditional mean function
  – The graph shows the same pattern we saw in the outlined cells
  – The conditional probability table gives us a small distribution around each year of education

Conditional Probabilities (6)
• To summarize, conditional probabilities can be written as
  $f(X|Y) = \frac{P(X \text{ \& } Y)}{P(Y)} = \frac{\text{Joint probability of } X \text{ \& } Y}{\text{Marginal probability of } Y}$
  – This is read as "the probability of X given Y"
  – For example: the probability that someone earns between $200 and $300, given that he/she has completed 10 years of education
• Joint probabilities are written as P(X&Y)
  – This is read as "the probability of X and Y"
  – For example: the probability that someone earns between $200 and $300 and has 10 years of education

A Marketing Example
• Now we'll look at joint probabilities again using the marketing example from earlier in the lecture.
• We will look at:
  – Marginal probabilities: P(A) or P(B)
  – Joint probabilities: P(A&B)
  – Conditional probabilities: $P(A|B) = \frac{P(A \text{ \& } B)}{P(B)}$

Marketing Example (2)
• Here's the matrix:

                        Actually Purchase
  Plan to Purchase   Yes        No        Total
  Yes                   200        50       250
  No                    100       650       750
  Total                 300       700      1000

• Let's look at the probability you purchased a computer given that you planned to purchase:
  P(actually purchased | planned to purchase) $= \frac{200}{250} = 0.8 = 80\%$
• The joint probability that you purchased and planned to purchase: 200/1000 = 0.2 = 20%

What we've done
• Introduction to the standardized normal (Z) distribution
• Introduction to the t-distribution
• Introduction to the F and χ² distributions
• Bivariate probabilities: calculate marginal, joint, and conditional probabilities
  – Computer company example
  – Earnings and years of education
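The conditional probability from the marketing example follows from dividing a joint probability by the matching marginal. The sketch below illustrates this using the lecture's table counts. The function names and the "yes"/"no" labels are assumptions made for the example, and `p_plan_given_buy` applies the same rule in the other direction, which the slides state as P(A&B)/P(B) but do not evaluate.

```python
from fractions import Fraction

# Cell counts from the lecture's matrix, keyed as (plan to purchase, actually purchase).
COUNTS = {
    ("yes", "yes"): 200,
    ("yes", "no"): 50,
    ("no", "yes"): 100,
    ("no", "no"): 650,
}
TOTAL = sum(COUNTS.values())  # 1000 observations


def p_joint(plan, buy):
    """Joint probability P(plan & buy)."""
    return Fraction(COUNTS[(plan, buy)], TOTAL)


def p_plan(plan):
    """Marginal probability of the planning outcome (row total / grand total)."""
    return sum(p_joint(plan, buy) for buy in ("yes", "no"))


def p_buy(buy):
    """Marginal probability of the purchase outcome (column total / grand total)."""
    return sum(p_joint(plan, buy) for plan in ("yes", "no"))


def p_buy_given_plan(buy, plan):
    """Conditional probability P(buy | plan) = P(plan & buy) / P(plan)."""
    return p_joint(plan, buy) / p_plan(plan)


def p_plan_given_buy(plan, buy):
    """Conditional probability P(plan | buy) = P(plan & buy) / P(buy)."""
    return p_joint(plan, buy) / p_buy(buy)


if __name__ == "__main__":
    print(float(p_joint("yes", "yes")))           # joint: planned and purchased
    print(float(p_buy_given_plan("yes", "yes")))  # purchased, given planned
```

The division by the marginal is what restricts attention to the 250 consumers who planned to buy, which is why the conditional probability (200/250) is so much larger than the joint probability (200/1000).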