FINAL EXAM REVIEW
STATISTICS

WHEN TO USE EACH TEST
• Regression
  • Checks if one variable influences, predicts, or is related to the other variable.
• Two-sided TTest
  • Compares two averages to see if there is a difference.
• One-sided TTest
  • Compares two averages to see if one average is higher than, or lower than, the other one.
• Confidence interval
  • Compares the average of the data to a specific number.

HOW TO DO A REGRESSION TEST
• Get the data
• Go to Data Analysis
• Choose Regression

HOW TO DO A TWO-SIDED TTEST
• Organize data (if needed)
• To get the P-value:
  • =TTEST(array1, array2, 2, 3)

HOW TO DO A ONE-SIDED TTEST
• Organize data (if needed)
• Check if the averages of the data agree with the claim.
  • If they do not, do not perform the TTest
  • If they do, go ahead and find the P-value
• To get the P-value:
  • =TTEST(array1, array2, 1, 3)

HOW TO DO A CONFIDENCE INTERVAL
• Organize data (if needed)
• Find the standard deviation
  • =STDEV(data)
• Find the size
  • =COUNT(data)
• Find the margin of error
  • Alpha = 0.05
  • =CONFIDENCE.T(alpha, stdev, size)
• Find the average
  • =AVERAGE(data)
• Find the confidence interval [lower bound, upper bound]
  • LB: = average - margin of error
  • UB: = average + margin of error

In the state of Missouri, the salaries of 20 randomly selected basket weavers were recorded. The average of these salaries was found to be $28,000, with a standard deviation of $4,000. Construct a 95% confidence interval for the true average salary of basket weavers in Missouri. Round your answer to two decimal places. (Hint: your answer should be exactly the same as the one given.)
[26127.94, 29872.06]

It is believed that Mercedes models have lower highway gas mileage (on average) than Toyota models. Given the data, your job is to confirm or disprove this assertion. (Use CARS04-1 data)
• What test to perform? One-sided TTest
• What is the P-value/margin of error? 8.406E-06
• Statistical interpretation? Since P-value is small, we are confident that the Mercedes average is lower than the Toyota average.
• Conclusion? We are confident that this was a reasonable claim.

It is believed that the more bedrooms a house has, the higher the selling price is going to be. Given the data, your job is to confirm or disprove this assertion. (Use HOUSE_PRICE5VARIABLES data)
• What test to perform? Regression
• What is the P-value/margin of error? 0.000166
• Statistical interpretation? Since P-value is small, we are confident that the number of bedrooms influences the selling price of the house.
• Conclusion? We are confident that the above assertion is correct.

Test whether a higher number of assaults in Florida increases the number of murders that are expected in Florida. (Use US_CRIME data)
• What test to perform? Regression
• What is the P-value/margin of error? 1.998E-05
• Statistical interpretation? Since P-value is small, we are confident that the slope of the regression line is not zero.
• Conclusion? We are confident that the above assertion is correct.

(Use Data_Huricanes_Comprehensive data)
• One-sided TTest
• 0.10033
• Since P-value is too large, the test is inconclusive.
• We are not confident that this was a reasonable claim.

Is it reasonable to claim that students with 8 or more visits to open lab sessions have an average final grade greater than 78? (Use LABVISITS data)
• What test to perform? Confidence Interval
• What is the P-value/margin of error? 3.622127
• Statistical interpretation? The confidence interval for the true average final grade of students with 8 or more visits to open lab is [78.64, 85.89].
• Conclusion? We are confident that students with 8 or more visits to open lab sessions have an average final grade greater than 78.
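
The confidence interval steps and the Missouri basket-weaver example can also be checked outside of Excel. The following is a minimal Python sketch, assuming only the standard library; the helper names `t_pdf`, `t_cdf`, `t_critical`, and `confidence_interval` are illustrative and not part of the course material. It computes the margin of error the same way CONFIDENCE.T does (the t critical value with size - 1 degrees of freedom, times stdev / sqrt(size)), finding the critical value numerically instead of calling a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, integrating the density on [0, x] with Simpson's rule."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """Two-sided critical value t with P(|T| > t) = alpha, found by bisection."""
    lo, hi = 0.0, 100.0
    target = 1 - alpha / 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def confidence_interval(average, stdev, size, alpha=0.05):
    """Return (lower bound, upper bound, margin of error)."""
    margin = t_critical(alpha, size - 1) * stdev / math.sqrt(size)
    return average - margin, average + margin, margin

# Basket-weaver example: average 28,000, stdev 4,000, size 20, alpha 0.05.
lb, ub, margin = confidence_interval(28000, 4000, 20)
print(f"[{lb:.2f}, {ub:.2f}]")  # prints [26127.94, 29872.06]
```

With raw data instead of summary numbers, `statistics.mean(data)`, `statistics.stdev(data)`, and `len(data)` would play the roles of =AVERAGE, =STDEV, and =COUNT.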
(Use MANBODYNEW21116 data)
• Confidence Interval
• 1.10354
• The confidence interval is [18.844, 21.051].
• No, we cannot claim that the above assertion is correct.

(Use PRELAW_NURSING data)
• Two-sided TTest
• 0.000629
• Since P-value is small, we are confident that the averages are different.
• We are confident that this was a reasonable claim.

• Double Bell
• 855
• 55-60
• Stdev < Median < Average
• Bachelor’s Degree
• Professional Degree
• Professional Degree
• High School Completion
• Professional Degree

A random experiment was conducted where Person A tossed seven coins and recorded the number of “heads”. Person B rolled four dice and recorded the smallest number out of the four dice. Simulate this scenario (use 10000-long columns) and answer the questions.
• Which of the two persons (A or B) is more likely to get the number 2? Person B
• What is the probability that Person B obtains the number “3” or “4”? Around 18%
• Which of the two persons will have higher variation in their outcomes? Person A
• Which of the two persons will on average get a higher number? Person A
• What is the probability of Person A getting a number strictly between 3 and 6 (that is, 4 or 5)? Around 44%
• Which of the persons has a higher probability of getting the number 3 or smaller? Person B

• About 68% of data
  • Average ± stdev
• About 95% of data
  • Average ± 2*stdev
• About 99.7% of data
  • Average ± 3*stdev

In the state of Florida, the starting salary of mechanical engineers follows a normal distribution with mean 62,000 and standard deviation 5,000. With the above information, simulate 10000 starting engineers.
• What would be a range [A to B] which would contain 95% of the starting salaries of mechanical engineers? Between 52,000 and 72,000
• What is the approximate probability that a randomly picked starting engineer will have a salary below 57,000? Around 15%
• What is the approximate probability that a randomly picked starting engineer will have a salary of 73,000 and above? Around 1%
• What is the approximate probability that a randomly picked starting engineer will have a salary between 65,000 and 72,000? Around 25%

• What values are designed to describe the center of the variable? Mean and Median
• Which number describes the variation of the variable? Standard Deviation
• Which two values are influenced by outliers? Mean and Standard Deviation
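
The two simulation exercises (coins versus dice, and the Florida starting salaries) can be sketched in Python as well. This is only an illustration, assuming the standard library in place of 10000-long Excel columns; the seed value and the helper name `share` are arbitrary choices, not part of the course material.

```python
import random
from statistics import NormalDist, mean, stdev

N = 10_000                  # length of each simulated "column"
rng = random.Random(2024)   # fixed seed (arbitrary) so reruns give the same numbers

# Person A: number of heads in seven coin tosses.
a = [sum(rng.random() < 0.5 for _ in range(7)) for _ in range(N)]
# Person B: smallest value among four dice.
b = [min(rng.randint(1, 6) for _ in range(4)) for _ in range(N)]

def share(values, condition):
    """Fraction of simulated values that satisfy the condition."""
    return sum(1 for v in values if condition(v)) / len(values)

p_a_is_2 = share(a, lambda v: v == 2)
p_b_is_2 = share(b, lambda v: v == 2)
p_b_3_or_4 = share(b, lambda v: v in (3, 4))
p_a_4_or_5 = share(a, lambda v: 3 < v < 6)   # strictly between 3 and 6

# Starting salaries: normal with mean 62,000 and standard deviation 5,000.
salaries = [rng.gauss(62_000, 5_000) for _ in range(N)]
p_below_57k = share(salaries, lambda s: s < 57_000)
p_73k_up = share(salaries, lambda s: s >= 73_000)
p_65k_72k = share(salaries, lambda s: 65_000 <= s <= 72_000)

# Exact normal probability, for comparison with the simulated share.
exact_below_57k = NormalDist(62_000, 5_000).cdf(57_000)

print(f"P(A = 2) ~ {p_a_is_2:.3f}   P(B = 2) ~ {p_b_is_2:.3f}")
print(f"P(B is 3 or 4) ~ {p_b_3_or_4:.3f}   P(A is 4 or 5) ~ {p_a_4_or_5:.3f}")
print(f"average A = {mean(a):.2f}, B = {mean(b):.2f}")
print(f"stdev   A = {stdev(a):.2f}, B = {stdev(b):.2f}")
print(f"P(salary < 57,000) ~ {p_below_57k:.3f} (exact {exact_below_57k:.3f})")
print(f"P(salary >= 73,000) ~ {p_73k_up:.3f}")
print(f"P(65,000 <= salary <= 72,000) ~ {p_65k_72k:.3f}")
```

Simulated shares move by roughly a percentage point from run to run, which is why the answers are stated as "around" a value rather than as exact probabilities.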