Correlation
• The apparent relation between two variables.

Trend
• A pattern of average behaviour that occurs over time.

Scatter plot questions
• Which of the scatter plots indicate the strongest trends?
• Which of the scatter plots do not indicate a trend?
• If a line of best fit were drawn on each of the scatter plots that show a trend, describe the slope of each line.
• Which do you think show strong positive correlation and which show strong negative correlation?

Make your own scatter plot and examine the trends
• http://staff.argyll.epsb.ca/jreed/math9/strand 4/scatterPlot.htm

Correlation Coefficient, r
• A number from +1 to -1 that gives the strength and direction of the relationship between two variables.

Positive Correlation
• If there is a positive correlation, the coefficient is a number between 0 and 1.
• If there is no relationship between the predicted values and the actual values, the correlation coefficient is 0 or very close to 0 (the predicted values are no better than random numbers).
• As the strength of the relationship between the predicted values and the actual values increases, so does the correlation coefficient.
• A perfect fit gives a coefficient of 1.0. Thus the closer the coefficient is to 1, the stronger the positive relationship.

Negative Correlation
• If there is a negative correlation, the coefficient is a number between 0 and -1.
• If there is no relationship between the predicted values and the actual values, the correlation coefficient is 0 or very close to 0 (the predicted values are no better than random numbers).
• As the strength of the relationship between the predicted values and the actual values increases, the correlation coefficient moves closer to -1.
• A perfect fit gives a coefficient of -1.0. Thus the closer the coefficient is to -1, the stronger the negative relationship.

Coefficient of Determination, r²
• A number from 0 to +1 that gives the relative strength of the relationship between two variables.
• If r² = 0.44, this means 44% of the variation in the dependent variable is accounted for by variation in the independent variable.
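The definitions of r and r² above can be checked with a short calculation. The sketch below is one common way to compute the correlation coefficient (the Pearson formula); the function name `pearson_r` and the small sample lists are made up for illustration and are not part of the original slides.

```python
import math


def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired data (Pearson formula)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample data: one perfectly rising and one perfectly falling set
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

r_up = pearson_r(xs, rising)     # perfect positive fit, so r should be +1
r_down = pearson_r(xs, falling)  # perfect negative fit, so r should be -1

# The coefficient of determination is r squared, so the sign is lost
print("rising: ", r_up, r_up ** 2)
print("falling:", r_down, r_down ** 2)
```

Both sets give the same r², which is why r² describes only the strength of the relationship while the sign of r gives its direction.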
Example
• r = 0.9
• This means there is a strong positive correlation.
• Calculate r² and explain what it means.

Residual Value
• The vertical distance between a data point and the line of best fit.

Residual Plot

To do: Make a scatter plot using Open Calc for the data below

x   -4   -3    0    5   15   19   21   28   36
y    8    5   12   15   27   30   32   38   50

1. Enter data for x vertically in Column 1
2. Enter data for y vertically in Column 2
3. Highlight data and click INSERT and then CHART
4. Select XY (Scatter) to get a scatter plot
5. Click NEXT, NEXT, FINISH

To do: Make a line of best fit using Open Calc for the data below

x   -4   -3    0    5   15   19   21   28   36
y    8    5   12   15   27   30   32   38   50

1. Highlight the graph
2. Click INSERT, TREND LINE
3. Choose LINEAR
4. Click SHOW EQUATION and SHOW COEFFICIENT before pressing OK
5. What is the equation?
6. What is R²?
7. What does R² tell you about the correlation?

To do: Make a scatter plot using Open Calc for the data below

x   -4   -2    0    2    4    6    8   10   12
y    8   10   12   14   17   18   20   21   24

1. Enter data for x vertically in Column 1
2. Enter data for y vertically in Column 2
3. Highlight data and click INSERT and then CHART
4. Select XY (Scatter) to get a scatter plot
5. Click NEXT, NEXT, FINISH

To do: Make a line of best fit using Open Calc for the data below

x   -4   -2    0    2    4    6    8   10   12
y    8   10   12   14   17   18   20   21   24

1. Highlight the graph
2. Click INSERT, TREND LINE
3. Choose LINEAR
4. Click SHOW EQUATION and SHOW COEFFICIENT before pressing OK
5. What is the equation?
6. What is R²?
7. What does R² tell you about the correlation?

Compare Graph 1 and 2
• Which one of your two graphs shows stronger correlation? How can you tell?
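The spreadsheet answers can be cross-checked by hand or with a short script. The sketch below assumes the linear trend line is an ordinary least-squares fit, which is the usual choice but is not stated in the slides; the helper names `fit_line`, `residuals` and `r_squared` are made up for illustration. Residuals are computed here as actual y minus predicted y, so points below the line have negative residuals.

```python
def fit_line(xs, ys):
    """Least-squares line of best fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Signed vertical distance from each data point to the line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in y that the fitted line accounts for."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum(e ** 2 for e in residuals(xs, ys, slope, intercept))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Data for Graph 1 and Graph 2, copied from the exercises above
x1 = [-4, -3, 0, 5, 15, 19, 21, 28, 36]
y1 = [8, 5, 12, 15, 27, 30, 32, 38, 50]
x2 = [-4, -2, 0, 2, 4, 6, 8, 10, 12]
y2 = [8, 10, 12, 14, 17, 18, 20, 21, 24]

slope1, intercept1 = fit_line(x1, y1)
slope2, intercept2 = fit_line(x2, y2)
res1 = residuals(x1, y1, slope1, intercept1)
r2_1 = r_squared(x1, y1, slope1, intercept1)
r2_2 = r_squared(x2, y2, slope2, intercept2)

print(f"Graph 1: y = {slope1:.3f}x + {intercept1:.3f}, R^2 = {r2_1:.4f}")
print(f"Graph 2: y = {slope2:.3f}x + {intercept2:.3f}, R^2 = {r2_2:.4f}")
```

Plotting the values in `res1` against `x1` gives a residual plot for Graph 1. The graph whose R² is closer to 1 shows the stronger correlation.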