Chapter 13

Objectives:
1. The student will be able to differentiate between the population regression equation and the sample regression equation, as well as between a population and a sample correlation analysis.
2. The student will be able to compute (using Excel) and interpret the correlation coefficient.
3. The student will be able to compute and interpret the intercept and slope of a regression equation.
4. The student will be able to compute and interpret the standard error of the estimate, the coefficient of determination, and the hypothesis test for a regression equation.
5. The student will be able to use regression as a forecasting tool.
6. The student will be able to interpret a confidence interval for the estimates of the dependent variable.

SCATTER DIAGRAM (Pg 12 and 463)
PLOTTING DATA IS INFORMATIVE
Excel: use chart wizard/scatter. The variable in the farthest left column is placed on the X axis.

CORRELATION ANALYSIS DESCRIBES THE STRENGTH OF A LINEAR RELATIONSHIP BETWEEN TWO VARIABLES

CORRELATION COEFFICIENT: $-1 \le r \le +1$
Excel > tools/data analysis/correlation
-1 implies the strongest possible negative (inverse) relationship.
+1 implies the strongest possible positive (direct) relationship.
0 implies no relationship at all (with assumptions).

Formula 13-1:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y}$$

Coefficient of determination: $r^2$

CORRELATION/REGRESSION
1. State a hypothesis. Your hypothesis should include a statement regarding causation.
   2/27/07 "Is an Economist Qualified To Solve Puzzle of Autism?" WSJ A1

   RELATIONSHIP                       INSTRUMENTAL VARIABLE
   Police and violent crime           Election year
   Vietnam war and future earnings    Draft lottery
   TV and autism                      Rain or percent who subscribe to cable

   http://www.correlated.org/
2. Gather data
3. Graph data
4. Statistical analysis
5. Hypothesis test

SPURIOUS CORRELATIONS (Pg 465)
Stats-Basic Statistics-correlation (SEE HELP SHEET)
7/13/05 "Drink More, Earn More (& Give More)" WSJ
UWP AODA Spring survey on Excel
1/12/10 Watching TV Linked To Higher Risk of Death, WSJ D1

STEPS OF HYPOTHESIS TESTING (Pg 332-338)
1. Formulate the null hypothesis Ho. Formulate the alternative hypothesis Ha in statistical terms.
2. Set the level of significance $\alpha$ and the sample size n.
3. Select the appropriate test statistic and rejection rule.
4. Collect the data and calculate the test statistic (13-2).
5. If the calculated value of the test statistic falls in the rejection region, then reject Ho. If the calculated value of the test statistic does not fall in the rejection region, then do not reject Ho.

The Student t-table value is the minimum number of standard deviations needed to reject Ho.
The t-statistic (13-2) is the actual number of standard deviations from Ho.
If the actual number of standard deviations is greater than the minimum, reject Ho.

REGRESSION ANALYSIS
Stats-Regression
Regression describes a linear relationship by using the mathematical equations below.

Population regression: $Y = A + Bx$
Sample regression: $y = a + bx$
a = intercept
b = slope

Formulas 13-4 and 13-5:
$$b = r\,\frac{s_y}{s_x} \qquad (13\text{-}4)$$
$$a = \bar{y} - b\,\bar{x} \qquad (13\text{-}5)$$

[2. Give me a graphical example of a relationship between two variables that would not be represented well by a straight line.]

DEPENDENT VARIABLE    INDEPENDENT VARIABLE    OTHER VARIABLES
SALES                 ADVERTISING BUDGET
CONSUMPTION           INCOME
OUTPUT PER ACRE       FERTILIZER
GRADES                STUDY TIME
HOUSING               MORTGAGE RATES

CONSUMPTION: illustrate graphically.

REGRESSION
$$\sum (y - \bar{y})^2 = \sum (y' - \bar{y})^2 + \sum (y - y')^2$$
Total Variation (SST) = Explained (SSR) + Unexplained (SSE), where $y'$ is the value of y predicted by the regression line.

POPULATION
$$Y_i = B_0 + B_1 X_i + E_i$$
ERROR EXISTS BECAUSE:
1. THERE IS NOT A PERFECT LINEAR RELATIONSHIP
2. THERE ARE OTHER VARIABLES THAT INFLUENCE THE DEPENDENT VARIABLE

STANDARD ERROR OF THE ESTIMATE (13-6)
$$s_{y|x} = \sqrt{\frac{\sum (y - y')^2}{n - 2}}$$

PREDICTING THE DEPENDENT VARIABLE
- POINT ESTIMATE
- INTERVAL ESTIMATE
Stats-Regression-options

ASSUMPTIONS OF REGRESSION (Pg 480):
1. For each value of X, there is a group of Y values, and these Y values are normally distributed.
2. The data are linearly related.
3. The standard deviations of these normal distributions are equal.
4. The Y values are statistically independent. (Time Series)

HYPOTHESIS TEST FOR THE SLOPE (13-2)
$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$

INFERENCES ABOUT THE SLOPE OF THE REGRESSION LINE
NULL HYPOTHESIS Ho: B = 0
Accepting the null hypothesis implies there is likely no relationship between your independent and dependent variables in your population, with an unknown risk of error.
ALTERNATIVE HYPOTHESIS Ha: B ≠ 0
Rejecting the null hypothesis, and therefore accepting the alternative, implies there is likely a relationship between your independent and dependent variables in the population, with a known risk of error.

If the computed "t" value from Minitab is greater than the table value, reject the null hypothesis and accept the alternative.
If the computed "t" value from Minitab is less than the table value, accept the null hypothesis.

Type I error: rejecting a true hypothesis
Type II error: accepting a false hypothesis
Type I and II error in our legal system: 10/17/09 Presumption of Guilt, WSJ W1

Excel: select tools/Data Analysis/Regression
Unemployment rates on Excel

Chapter 5

Objectives:
1. Students will be able to interpret and compute relative frequency, classical, and subjective probability.
2. Students will be able to define and compute conditional probability.
3. Students will be able to distinguish independent from dependent events and relate them in conditional probability.
4. Students will be able to draw probability tree diagrams.
5. Students will be able to apply the rules of addition and multiplication.
6. Students will be able to relate Bayes' Theorem, conditional probability, and dependent events to the tree diagram.
7. Students will be able to determine the number of possible permutations and combinations.

PROBABILITY
A measure of the likelihood that an event in the future will happen; it can only assume a value between 0 and 1 inclusive.
$0 \le P(X) \le 1$

PROBABILITY OF EVENT A
P(A) = (Number of favorable outcomes in the sample space) / (Total number of outcomes in the sample space)

OBJECTIVE - DETERMINED THROUGH EXPERIMENTATION
CLASSICAL - EVENTS WITH EQUALLY LIKELY OUTCOMES
HISTORICAL - RELATIVE FREQUENCY DISTRIBUTION
SUBJECTIVE - PERSONAL ESTIMATE OF THE LIKELIHOOD OF AN EVENT

Relative frequency: the probability that an event will be the result of a random experiment is assigned as the proportion of times that event occurs as the outcome of the experiment in the long run.

INCOME/DAY    REL. FREQ.
0 < 100       .1
100 < 200     .2
200 < 300     .4
300 < 400     .2
400 < 500     .1

SET - A WELL-DEFINED COLLECTION OF OBJECTS
OUTCOME - A particular result of an experiment (Pg 141)
SAMPLE SPACE - THE SET OF ALL POSSIBLE OUTCOMES
SAMPLE SPACE: COIN FLIP, DIE, DICE, DECK OF CARDS, POKER HAND
Table #2 Pg 179 JEP Summer 1999

COUNTING A SAMPLE SPACE
ROLL A TWO OR A THREE: SAMPLE SPACE IS SIX
ROLL A TWO AND A THREE ON TWO CONSECUTIVE ROLLS: SAMPLE SPACE IS 36 = 6 X 6

EVENT - A collection of one or more outcomes of an experiment
COMPLEMENT - ALL OUTCOMES THAT ARE NOT PART OF THE EVENT
MUTUALLY EXCLUSIVE - The occurrence of any one event means that none of the others can occur at the same time. EVENTS THAT CANNOT OCCUR SIMULTANEOUSLY (Pg 142)
COLLECTIVELY EXHAUSTIVE - At least one of the events must occur when an experiment is conducted.
NO OTHER EVENTS ARE POSSIBLE
P(COLLECTIVELY EXHAUSTIVE EVENTS) = 1

COUNTING TECHNIQUES (Page 165-170)

                        Value can only be    Value can be chosen    Order is
                        chosen 1 time        multiple times         Important
COMBINATION             X
PERMUTATION             X                                           X
Multiplication Rule*                         X                      X

*Value can be chosen multiple times because there are multiple groups and a value can be chosen from each group.

Combination (example: Royal Flush):
$${}_N C_X = \frac{N!}{X!\,(N - X)!}$$
Permutation (example: Trifecta):
$${}_N P_X = \frac{N!}{(N - X)!}$$
N = TOTAL NUMBER OF OBJECTS
X = TOTAL NUMBER USED OUT OF N OBJECTS

Multiplication Rule for the number of possibilities between two or more groups: if there are m ways of doing one thing (in group #1) and n ways of doing another thing (in group #2), there are m x n ways between both groups.

[By allowing people to make multiple choices for the first time among six racial categories (White, Black, Asian, American Indian or Alaska Native, Native Hawaiian or Pacific Islander, or some other), the census offered a mosaic of 63 racial options. Someone might check both "white" and "Asian," for example, if they have ancestors of both races. Are 63 racial options correct?] WSJ 3/2/01

RULES FOR PROBABILITY

ADDITION RULE
P(A or B) = P(A) + P(B) - P(A and B)

1. EVENTS THAT ARE MUTUALLY EXCLUSIVE - Special Rule (Page 158, 5-2)
   P(A OR B) = P(A) + P(B)
   P(HEART OR SPADE) = P(H) + P(S) = 1/4 + 1/4 = 1/2 (PAGE 150 TEXT)

2. EVENTS THAT ARE NOT MUTUALLY EXCLUSIVE - General Rule (5-4)
   P(A OR B) = P(A) + P(B) - P(A & B)
   P(HEART OR JACK) = 13/52 + 4/52 - 1/52 = 16/52

   P(A) = .275, P(B) = .275. Determine the probability of A or B.
   Company "A" has 200 employees: 55 have accounting degrees and 55 have business degrees, while 10 individuals have both business and accounting degrees.
   1. How many employees have college degrees?
   2. Determine the probability of picking someone at random from company "A" and this person having a college degree.
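The addition rule and the counting formulas above can be checked with a few lines of Python. This is only a sketch, not part of the course materials: the helper name `prob_a_or_b` is made up for illustration, and the 8-horse field in the trifecta line is an assumed number, since the notes do not give a field size. Exact fractions are used so the card results come out as 1/2 and 4/13 rather than rounded decimals.

```python
from fractions import Fraction
from math import comb, perm


def prob_a_or_b(p_a, p_b, p_a_and_b=0):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B).

    Leave p_a_and_b at 0 for mutually exclusive events (special rule).
    """
    return p_a + p_b - p_a_and_b


# Mutually exclusive: a card cannot be both a heart and a spade.
heart_or_spade = prob_a_or_b(Fraction(13, 52), Fraction(13, 52))

# Not mutually exclusive: the jack of hearts is counted twice, so subtract it.
heart_or_jack = prob_a_or_b(Fraction(13, 52), Fraction(4, 52), Fraction(1, 52))

# Company "A": 200 employees, 55 accounting, 55 business, 10 with both.
employees, accounting, business, both = 200, 55, 55, 10
with_degree = accounting + business - both
p_degree = prob_a_or_b(
    Fraction(accounting, employees),
    Fraction(business, employees),
    Fraction(both, employees),
)

# Census question: choose any non-empty subset of six categories.
census_options = sum(comb(6, k) for k in range(1, 7))

# Combination: number of 5-card poker hands (order does not matter).
poker_hands = comb(52, 5)

# Permutation: ordered top-three finishers, ASSUMING an 8-horse field.
trifecta_orders = perm(8, 3)

print(heart_or_spade)          # 1/2
print(heart_or_jack)           # 4/13
print(with_degree, p_degree)   # 100 1/2
print(census_options)          # 63
print(poker_hands)             # 2598960
print(trifecta_orders)         # 336
```

Under these inputs, 100 of the 200 employees hold a degree, which agrees with .275 + .275 - .05 = .5, and the count of non-empty subsets of six categories is 63.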
MULTIPLICATION RULE FOR PROBABILITY (Pg 156)
P(A and B) = P(A) · P(B|A)

Joint probability is the chance that two events will occur together or simultaneously. Example: savings and checking account.
Key words that are used to indicate joint probability are: and, both, and neither.

Independent events: events are independent if the occurrence of one is unrelated to the occurrence of the other.
P(A AND B) = P(A) X P(B)
This formula applies whether the events occur simultaneously or sequentially.

If the events are not independent, they are dependent, or have conditional probability.
P(A AND B) = P(A) X P(B|A)

[Problem 28]
P(A|B) = P(A and B) / P(B)
P(ACE) = 4/52
B REPRESENTS THE NEW SAMPLE SPACE
P(ACE|DIAMOND) = 1/13
INDEPENDENT EVENTS

A MANUFACTURING FIRM BUYS 80% OF A GIVEN INPUT USED IN PRODUCTION FROM COMPANY K, AND 4% ARE DEFECTIVE. IT ALSO BUYS 20% FROM COMPANY L, AND 6% ARE DEFECTIVE.
P(K) = .8    P(L) = .2    P(D|K) = .04    P(D|L) = .06
1. DETERMINE
   P(K&D) = P(K) X P(D|K) = .8 X .04 = .032
   P(D&K) = P(D) X P(K|D) =

          Good Parts    DEF Parts    Total Parts
INPUTS    956           44           1000
K         768           32           800
L         188           12           200

[Problem 29] Police records reveal that 10% of the accident victims who are wearing seat belts sustain serious injury, while 50% of those who are not wearing seat belts sustain serious injury. Police estimate that 60% of the people riding in cars use seat belts. Police are called to investigate an accident in which one person is seriously injured. Estimate the probability that he was wearing his seat belt at the time of the crash.

LISKA'S FIVE STEP METHOD TO SUCCESS IN SOLVING CONDITIONAL PROBABILITY

STEP 1: Skip to the end of the problem and determine what conditional probability the problem is asking for, and write this in probability language. Example: P(a|b)
STEP 2: Use formula 5-7 (5-6) in the text. The numerator of the formula is always the joint probability of the two events a and b. The denominator is the probability of the given information, P(b).
STEP 3: Draw the outcomes tree for this problem. The first branch of the outcomes tree will start with event a. The other branches will be the complement of event a. The key will be how you have written the question in step #1. If you have written the question as P(a|b), the first branch of your tree is event a.
STEP 4: Now read the problem and identify the probabilities of the events listed on your outcomes tree and place them on the tree.
STEP 5: Place the appropriate joint probability of a and b in the numerator and the probability of event b in the denominator. The probability of event b can be found by adding the joint probabilities of all the branches of your outcomes tree that lead to event b.

USING THE EXAMPLE ABOVE

STEP 1: Estimate the probability that the injured person was wearing his seat belt at the time of the crash: P(S|I)

STEP 2: P(S|I) = P(S AND I) / P(I)    (FORMULA 5-6 or 5-7)

STEP 3: OUTCOMES TREE                                   JOINT PROB

                                  P(I|S)=.1     I  -------- .06
                   S   P(S)=.6
                                  P(I'|S)=.9    I' -------- .54
   AUTO ACCIDENT
                                  P(I|S')=.5    I  -------- .20
                   S'  P(S')=.4
                                  P(I'|S')=.5   I' -------- .20

   S  = SEAT BELT
   S' = NO SEAT BELT
   I  = SERIOUS INJURY
   I' = NO SERIOUS INJURY

STEP 4: SEE ABOVE OUTCOMES TREE

STEP 5: P(S|I) = .06 / .26 = .23

Exponent 2/2/06, 11/6/08 and 2/12/09
9/9/09 Medicine's Dangerous Guessing Game, WSJ A19

LISKA'S FIVE STEP METHOD TO SUCCESS IN SOLVING CONDITIONAL PROBABILITY USING THE MATRIX APPROACH

STEP 1: Skip to the end of the problem and determine what conditional probability the problem is asking for, and write this in probability language. Example: P(a|b)
STEP 2: Use formula 5-7 in the text. The numerator of the formula is always the joint probability of the two events a and b. The denominator is the probability of the given information, P(b).
STEP 3: Construct an outcomes tree and begin with the main event (a). Form a probability matrix; the main event should be the event whose probability you are trying to find given the new information (the person has been injured).
List the other events that can occur in this sample space and their prior probability. List the conditional probability P(b|a): the probability of being injured given the seat belt was on, and the probability of being injured knowing the seat belt was not on. List the joint probability of the two events, P(a and b) and P(a' and b). For the outcomes tree, see the notes above.

MAIN EVENT    PRIOR PROB    COND. PROB OF INJURY    JOINT PROB.
SEAT BELT     .6            P(I|S)  = .1            .06  (SEAT BELT AND INJURY)
NO BELT       .4            P(I|S') = .5            .20
PROBABILITY OF INJURY P(I) =                        .26

STEP 4: Now read the problem and identify the probabilities of the events listed on your outcomes tree and place them in the matrix as shown.
STEP 5: Place the appropriate joint probability of a and b in the numerator and the probability of event b in the denominator. The probability of event b can be found by adding the joint probabilities.

P(S|I) = P(S AND I) / P(I) = .06 / .26 = .23

ANOTHER APPROACH IS TO CONSTRUCT A PROBABILITY TABLE OF THE EVENTS, SHOWN BELOW:*

                JOINT PROBABILITY
                Injury    No Injury    Probability
SEAT BELT       .06       .54          .6
NO SEAT BELT    .20       .20          .4
TOTAL           .26       .74

* PLEASE NOTE THIS TABLE COMES FROM THE BRANCHES OF THE ABOVE OUTCOMES TREE.

THE JOINT PROBABILITY
Computations of joint probability are shown below.
P(S AND I)   = P(S)  X P(I|S)   = .6 X .1 = .06
P(S AND I')  = P(S)  X P(I'|S)  = .6 X .9 = .54
P(S' AND I)  = P(S') X P(I|S')  = .4 X .5 = .20
P(S' AND I') = P(S') X P(I'|S') = .4 X .5 = .20

P(S|I) = P(S AND I) / P(I) = .06 / .26 = .23
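The five steps above amount to multiplying each prior by its conditional probability, summing the joint probabilities that lead to the observed event, and dividing. The Python sketch below follows that recipe under stated assumptions: the function name `posterior` and the dictionary layout are illustrative choices, not anything from the text. It is applied first to the seat belt problem and then to the supplier problem, using the probabilities given in the notes.

```python
def posterior(priors, likelihoods, target):
    """Return (P(target | evidence), joint probabilities, P(evidence)).

    priors:      P(event) for each branch of the outcomes tree
    likelihoods: P(evidence | event) for each branch
    target:      the branch whose revised probability is wanted
    """
    # Joint probability of each branch: P(event and evidence).
    joints = {event: priors[event] * likelihoods[event] for event in priors}
    # P(evidence) is the sum of the joint probabilities leading to it.
    p_evidence = sum(joints.values())
    return joints[target] / p_evidence, joints, p_evidence


# Seat belt problem: evidence is a serious injury (I).
belt_priors = {"S": 0.6, "S'": 0.4}        # P(S), P(S')
injury_given = {"S": 0.1, "S'": 0.5}       # P(I|S), P(I|S')
p_s_given_i, belt_joints, p_i = posterior(belt_priors, injury_given, "S")

# Supplier problem: evidence is a defective part (D).
supplier_priors = {"K": 0.8, "L": 0.2}     # P(K), P(L)
defect_given = {"K": 0.04, "L": 0.06}      # P(D|K), P(D|L)
p_k_given_d, supplier_joints, p_d = posterior(supplier_priors, defect_given, "K")

print(round(p_i, 2), round(p_s_given_i, 2))   # 0.26 0.23
print(round(p_d, 3), round(p_k_given_d, 3))   # 0.044 0.727
```

The supplier result agrees with the parts table: 32 of the 44 defective parts per 1000 come from company K, and 32/44 is about .727.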