Statistics – OR 155, Section 1
J. S. Marron, Professor
Department of Statistics and Operations Research

Class Information
• Go to Blackboard
• Choose this course
• In Control Panel, go to “Course Information”
• Read document in “Course Info”

Class Information
Important Change: IA’s Office Hours: Mon. 10:30-11:30, Wed. 10:30-11:30 (now on Blackboard)

Audio Recording Info
• Audio Recordings + Displayed Slides
• Stored at: http://coursecast.unc.edu
• Choose “Students”
• Choose “155STOR, Section 1: J S Marron”
• Choose “Test Recording”
In case needed:
• User ID: 155STOR
• Password: 155STOR

Reading In Textbook
Approximate Reading for Today’s Material: Pages 237-250
Approximate Reading for Next Class: Pages 251-266

Random Sampling
How Accurate?
• Can (& will) calculate using “probability”
• Justifies term “scientific sampling”
• 2nd improvement over quota sampling

Random Sampling
What is random?
Simple Random Sampling: Each member of the population is equally likely to be in the sample
Key Idea: Different from “just choose some”

Random Sampling HW
Interesting Question: What is the % of Male Students at UNC?
(Your chance of a date, or take 100% minus it to get your chance)
HW for Class Problem # 1: C1: Class Handout
http://stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/HWAsst/Stor155HWC1.pdf

Random Sampling HW
Serious Problem that has arisen: See Blackboard (note format of Question and Answer) (please use this, and send me an email)

Random Sampling HW
Hint on “Equally Likely”: Suppose a 2-page phone book & only 4 students: Al, Bob, Carl, Doug (with Doug alone on his page)
Sampling Scheme 1:
• Randomly choose page
• Randomly choose person on that page
Problem: Doug is chosen half the time, so not “equally likely”
Sampling Scheme 2:
• Randomly choose page
• Randomly choose i = 1,…,max page size
• If no person i on chosen page, repeat both draws

More On Surveys
More Common Sense: How you ask the question makes a big difference
HW: 3.75, 3.74 (Note: work & turn in, in order assigned)

More about Sampling
The “simple random sample” (recall “each equally likely”) can be expensive (e.g. nationwide political poll, collected by personal interview)
So there are many cheaper variations:
– Stratified Sampling
– Multi Stage Sampling
– See text, Section 3.2
– And there are many others as well

Get up to Speed on EXCEL
HW C2: Class Problem 2 (on Blackboard – HW Assignments)
Recall: only turn in one printed page (per problem part) (recall HW rules in Course Info)
Note: you can also write on that sheet (e.g. your name & highlight answer)

Get up to Speed on EXCEL
Critical Step: Load Analysis ToolPak:
• Circular Office Button (upper left)
• Excel Options (low button)
• Add-Ins (menu on left)
• Manage Excel Add-Ins > Go
• Check: Analysis ToolPak > OK
• Makes “Analysis Box” appear on right, under “Data” tab

Random Sampling
EXCEL generation of random samples:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg1.xls
Goal 1: Generate Random Numbers
EXCEL approaches:
• RAND function (under Formulas > Function Library > Math & Trig)
• Data Analysis > Random Number Generation

EXCEL Random Sampling
Goal 2: Randomly Reorder List
EXCEL approach:
• Highlight block with list & random num’s
• Sort whole thing on numbers
Goal 3: Random Sample from List
• Choose 1st subset from random re-order
• Works since each is equally likely in each spot

EXCEL Details
RAND:
• Can also get from menu under fx
• Can find on “All” menu
• Note no (explicit) inputs
• Just put in desired cell
• Drag downwards for several random #s
• Caution: these change on each re-computation
• Thus not recommended for this

EXCEL Details
Data Analysis > Random Number Generation:
• Set: # Variables: 1, Distribution: Uniform (over [0,1])
• Generates Fixed List (doesn’t change with re-computation) (note entries are “just numbers”)
• Thus stable for later interpretation
• Recommended for random sample choice

EXCEL Details
Sorting Lists:
• Highlight Block with Both:
– Names to sort
– Random numbers
• Data > Sort & Filter > Sort
• Choose Column to sort on (B or C)
• Result is random re-ordering of List

Random Sampling HW
HW: C3: For the letters A – L, use EXCEL to:
(a) Put in a random order.
(b) Choose a random sample of 6.
(Hints: for (a), want each equally likely; for (b), reorder, and choose a subset)

Chapter 4: Probability
Goal: quantify (get numerical) uncertainty
• Key to answering questions above
(e.g. what is “natural variation” in a random sample?)
(e.g. which effects are “significant”?)
Idea: Represent “how likely” something is by a number

Simple Probability
E.g. (will use for a while, since simplicity gives easy insights)
Roll a die (6 sided cube, faces 1,2,…,6)
• 1 of 6 faces is a “4”
• So say “chances of a 4” are: “1 out of 6” = 1/6
• What does that number mean?
• How do we find such for harder problems?
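The Excel recipe for Goals 2 and 3 earlier in these notes (attach a random number to each list entry, sort on the numbers, keep the first few) can be sketched outside Excel as well. The following is only an illustrative Python sketch: the function names `random_reorder` and `random_sample` and the seed value are assumptions made for this example, not part of the course materials.

```python
import random


def random_reorder(items, rng):
    """Randomly reorder a list by sorting on attached random numbers.

    Mirrors the spreadsheet recipe: one Uniform[0,1) number per entry,
    then sort the whole block on that column.
    """
    keyed = [(rng.random(), item) for item in items]
    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed]


def random_sample(items, k, rng):
    """Take the first k entries of a random reordering."""
    return random_reorder(items, rng)[:k]


# A fixed seed (hypothetical choice) plays the role of the "fixed list"
# of random numbers: results do not change on re-computation.
rng = random.Random(155)
letters = list("ABCDEFGHIJKL")

shuffled = random_reorder(letters, rng)
sample = random_sample(letters, 6, rng)
print(shuffled)
print(sample)
```

Taking the first 6 entries works for the same reason given on the slide: after a random reordering, every letter is equally likely to land in each spot.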
Simple Probability
A way to make this precise: “Frequentist Approach”
In many replications (repeat of die roll), expect about 1/6 of total will be 4s
Terminology (attach buzzwords to ideas):
Think about “outcomes” (e.g. #s on die) from an “experiment” (e.g. roll die, observe #)

Simple Probability
Quantify “how likely” outcomes are by assigning “probabilities”
I.e. a number between 0 and 1, to each outcome, reflecting “how likely”:
Intuition:
• 0 means “can’t happen”
• ½ means “happens half the time”
• 1 means “must happen”

Simple Probability
HW: C4: Match one of the probabilities: 0, 0.01, 0.3, 0.6, 0.99, 1 with each statement about an event:
a. Impossible, can’t occur.
b. Certain, will happen on every trial.
c. Very unlikely, but will occur once in a long while.
d. Event will occur more often than not.

Simple Probability
Main Rule: Sum of all probabilities (i.e. over all outcomes) is 1:
E.g. for die rolling: P{1} = P{2} = P{3} = P{4} = P{5} = P{6} = 1/6, so
$\sum_{i=1}^{6} P\{i\} = 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1$
(Σ is the Greek letter “sigma”, for sum)

Simple Probability
HW: 4.23, 4.24a
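The frequentist reading of “1/6” above can be checked by simulation. The sketch below is an illustration under assumptions (the function name `fraction_of_fours` and the seed are made up for this example): it rolls a simulated die many times and reports the fraction of 4s, which should settle near 1/6 ≈ 0.1667 as the number of rolls grows.

```python
import random


def fraction_of_fours(n_rolls, rng):
    """Roll a fair six-sided die n_rolls times; return the fraction of 4s."""
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    return rolls.count(4) / n_rolls


rng = random.Random(2009)  # hypothetical seed, only for reproducibility
for n in (60, 600, 6000, 60000):
    # With few rolls the fraction wanders; with many it approaches 1/6.
    print(n, round(fraction_of_fours(n, rng), 4))
```

Small runs can be noticeably off from 1/6; the point of the frequentist view is the behavior over many replications.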
Probability
General Rules for assigning probabilities:
i. Frequentist View (what happens in many repetitions?)
ii. Equally Likely: for n outcomes, P{one outcome} = 1/n (e.g. die rolling)
iii. Based on Observed Frequencies
e.g. life tables summarize when people die
Gives “prob of dying” at a given age, “life expectancy”
iv. Personal Choice:
– Reflecting “your assessment”
– E.g. Oddsmakers
– Careful: requires some care (key is prob’s need to sum to 1)
HW: 4.25

Probability - Events
More Terminology (to carry this further):
• An event is a set of outcomes
Die Rolling: “an even #” is the event {2, 4, 6}
Notes:
– If betting on even, don’t care about #, only even or odd
– Thus events are our foundation
– Each outcome is an event: the set containing just that outcome
– So event is the more general concept

Probability on Events
Sample Space is the set of all outcomes = “event with everything that can happen”
Extend Probability to Events by: P{event} = sum of probs of outcomes in event:
$P\{\text{event}\} = \sum_{\text{outcomes } o \text{ in event}} P\{o\}$

Probability
Technical Summary:
• A probability model is a sample space
• I.e. set of outcomes, plus a probability, P
• P assigns numbers to events
• Events are sets of outcomes

Probability Function
The probability, P, is a “function”, defined on a set of events
Recall function in math: $f(x) = 3x^2 + 2$ (plug in x, get out f(x))
Probability: P{event} = “how likely” (plug in event, get out “how likely”)

Probability Function
E.g. Die Rolling
• Sample Space = {1, 2, 3, 4, 5, 6}
• “an even #” is the event {2, 4, 6} (a “set”)
• P{“even”} = P{2, 4, 6} = $\sum_{o \in \{2,4,6\}} P\{o\}$ = P{2} + P{4} + P{6} = 1/6 + 1/6 + 1/6 = 3/6 = ½
• Fits, since expect “even # half the time”

Probability HW
HW: 4.21, 4.31

And now for something completely different
Is this class too “monotone”?
• Easier to understand?
• Calm environment enhances learning?
• Or does it induce somnolence?
What is “somnolence”?
Google definition: Sleepiness, a condition of semiconsciousness approaching coma.
And now for something completely different
Recall last class’s Student Questionnaire…
I asked you for:
• Name
• Major
• Contact Info
• Background…

And now for something completely different
One (previous class) response:

And now for something completely different
OK, will try to send your mind in a different direction
Hopefully, a mental break … (not on the Homework Assignment!)

And now for something completely different
• Did you hear about the constipated mathematician?
• He worked it out with a pencil!
• Apologies for the juvenile nature…
• But there is an important point: The pencil is a powerful mathematical tool

And now for something completely different
The pencil is a powerful mathematical tool
• An old student:
– I was once “good in math”
– But suddenly lost that
– Reason: tried to do too much in head
– Because: never learned power of the pencil

And now for something completely different
The pencil is a powerful mathematical tool
• For us: now is time to start using pencil
• I do PowerPoint in class
• You use pencil on HW (and exams)
• Change in mindset, from Excel…

Probability
Now stretch ideas with a more interesting e.g.
E.g. Political Polls, Simple Random Sampling
2 views:
1. Each individual equally likely to be in sample
2. Each possible sample is equally likely
Allows for simple Probability Modelling

Simple Random Sampling
• Sample Space is set of all possible samples
• An Event is a set of some samples
E.g. For population A, B, C, D
– Each is a voter
– Only 4, so easy to work out

S. R. S. Example
For population A, B, C, D, draw a S. R. S. of size 2
Sample Space = {(A,B), (A,C), (A,D), (B,C), (B,D), (C,D)}
(outcomes, i.e. possible samples of size 2)

S. R. S. Example
Now assign P, using “equally likely” rule:
P{A,B} = P{A,C} = … = P{C,D} = 1/(# samples) = 1/6
An interesting event is: “C in sample” = {(A,C), (B,C), (C,D)} (set of samples with C in them)

S. R. S. Example
$P\{\text{C in sample}\} = \sum_{\text{samples with C}} P\{\text{sample}\} = \sum_{\text{samples with C}} \frac{1}{6} = \frac{1}{6} \cdot (\#\text{ samples with C}) = \frac{1}{6} \cdot 3 = \frac{1}{2}$
i.e. happens “half the time”.

S. R. S. Probability HW
HW C5: Abby, Bob, Mei-Ling, Sally and Roberto work for a firm. Two will be chosen at random to attend an overseas meeting. The choice will be made by drawing names from a hat (this is an S. R. S. of 2).
a. Write down all possible choices of 2 of the 5 names. This is the sample space.
b. Random choice makes all choices equally likely. What is the probability of each choice? (1/10)
c. What is the prob. that Sally is chosen? (4/10)
d. What is the prob. that neither Bob, nor Roberto is chosen? (3/10)

Political Polls Example
What is your chance of being in a poll of 1000, from S.R.S. out of 200,000,000? (crude estimate of # of U. S. voters)
Recall each sample is equally likely so:
$P\{\text{sample with you}\} = \frac{\#\text{ samples with you}}{\text{total } \#\text{ samples}}$
Problem: the total # of samples is really big: about $2.66 \times 10^{5733}$ (a number with over 5,700 digits, too big for easy handling….)

Political Polls Example
More careful calculation:
$P\{\text{sample with you}\} = \frac{\#\text{ samples with you}}{\text{total } \#\text{ samples}} = \frac{\binom{199,999,999}{999}}{\binom{200,000,000}{1,000}} = \frac{199,999,999! \,/\, (199,999,000! \; 999!)}{200,000,000! \,/\, (199,999,000! \; 1000!)} = \frac{1000}{200,000,000} = \frac{1}{200,000}$
Makes sense, since you are “equally likely to be in the sample”
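Both S. R. S. calculations above can be checked by direct counting. The sketch below is only an illustration (the helper name `prob_in_sample` is an assumption made for this example). It lists every size-2 sample from A, B, C, D and counts those containing C, then uses exact integer arithmetic to confirm that the very large poll counts reduce to 1/200,000.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def prob_in_sample(population, n, member):
    """P{member is in an S.R.S. of size n}, by listing every possible sample.

    Each sample is equally likely, so the probability of the event is
    (# samples containing member) / (total # samples).
    """
    samples = list(combinations(population, n))
    with_member = [s for s in samples if member in s]
    return Fraction(len(with_member), len(samples))


# Small example: population A, B, C, D and samples of size 2.
p_c = prob_in_sample("ABCD", 2, "C")
print(p_c)

# Poll example: listing samples is hopeless, but the counts are binomial
# coefficients, and Python integers handle thousands of digits exactly.
N, n = 200_000_000, 1000
total = comb(N, n)             # total # of possible samples
with_you = comb(N - 1, n - 1)  # you are in; choose the other 999
p_you = Fraction(with_you, total)
print(len(str(total)), "digits in the total number of samples")
print(p_you)
```

The exact fraction matches the cancellation in the factorial formula: everything divides out except 1000/200,000,000.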