Chapter 7: Scatterplots, Associations, and Correlations
A.P. Statistics

Scatterplots
• Scatterplots are the best way to start observing the relationship between two quantitative variables.

Describing Scatterplots
• Direction: positive, negative, or none
• Form: linear, curved, clustered, etc.
• Strength: at this point, strong, moderate, or weak
• Unusual features: outliers, clusters, etc.

Variables
• Explanatory (or predictor) variable: attempts to explain the observed outcome. Placed on the x-axis.
• Response variable: measures an outcome. Placed on the y-axis.
• Be logical when deciding which variable plays which role.

Correlation
• Measures the direction and strength of the linear relationship between two quantitative variables.
• Given as r:

$$ r = \frac{1}{n-1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right) = \frac{\sum z_x z_y}{n-1} $$

Correlation: Graphical
[Figure: original data and standardized data]

Correlation Conditions
• Quantitative Variables Condition
• Straight Enough Condition
• Outlier Condition: report the correlation with and without the outlier.

Correlation Properties
1. Correlation makes no distinction between explanatory and response variables.
2. Correlation requires both variables to be quantitative.
3. Because r uses standardized values of the observations, r does not change when we change the units of measurement of x, y, or both.
4. A positive r indicates a positive association between the variables, and a negative r indicates a negative association.
5. The correlation is always a number between -1 and 1. The strength of the linear relationship increases as r moves away from 0 toward either -1 or 1.
6. Correlation measures the strength of only a linear relationship.
7. Like the mean and standard deviation, r is not resistant: it is strongly affected by a few outliers.
8. Correlation has no units. It should not be expressed as a percentage.
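The formula for r and a few of the properties above can be checked numerically. The following is a minimal sketch, not part of the original slides: the height and weight values are made-up illustration data, and the helper name `correlation` is hypothetical. It computes r from z-scores, then checks that swapping x and y or changing units leaves r unchanged, and that a single outlier can change r a great deal.

```python
import numpy as np

def correlation(x, y):
    """Compute r as the sum of z_x * z_y divided by (n - 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Standardize with the sample standard deviation (ddof=1 divides by n - 1).
    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    return float(np.sum(zx * zy) / (n - 1))

# Hypothetical data: heights in inches, weights in pounds.
height_in = np.array([60, 62, 65, 68, 70, 72])
weight_lb = np.array([110, 120, 140, 155, 165, 180])

r = correlation(height_in, weight_lb)

# Property 1: swapping explanatory and response variables does not change r.
r_swapped = correlation(weight_lb, height_in)

# Property 3: changing units (inches to cm, pounds to kg) does not change r.
r_metric = correlation(height_in * 2.54, weight_lb * 0.4536)

# Property 7: one unusual point (tall but light) can change r substantially.
r_outlier = correlation(np.append(height_in, 75), np.append(weight_lb, 100))

print("r:", r)
print("r with x and y swapped:", r_swapped)
print("r in metric units:", r_metric)
print("r with one outlier added:", r_outlier)
```

Reporting both `r` and `r_outlier` side by side is one way to follow the Outlier Condition: give the correlation with and without the unusual point.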
Other Information
• Correlation is not a complete description of two-variable data, even when the relationship is linear.
• You should also give the means and standard deviations of both x and y.

Straightening Scatterplots
• If a relationship between two quantitative variables is not linear, we can re-express the data to straighten the form. Then we can use correlation and the other measures that come from a linear relationship.
• We will go into more depth about this in Chapter 10.
• See page 154 in your text for a graphic.

Problems?
• Don't say "correlation" when you mean "association".
• Don't correlate categorical data.
• Be sure the association is linear.
• Beware of outliers.
• Don't confuse correlation with causation.
• Watch out for lurking variables.

Correlation vs. Causation
• Scatterplots and correlations never prove causation.
• The only thing that can show causation is a randomized controlled experiment.
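The re-expression idea from the Straightening Scatterplots slide can be illustrated with a small sketch. The slides do not say which re-expression to use, so the logarithm here is an assumption chosen for illustration, and the data are made up to follow an exact exponential curve.

```python
import numpy as np

# Hypothetical data following an exponential curve: y = 3 * 2^x.
x = np.arange(1, 11, dtype=float)
y = 3.0 * 2.0 ** x

# Correlation of the raw data: the form is curved, so r understates
# how tightly y depends on x.
r_raw = float(np.corrcoef(x, y)[0, 1])

# Re-express y with a logarithm (an assumed choice of re-expression).
# log(y) = log(3) + x * log(2) is a straight line in x.
log_y = np.log(y)
r_log = float(np.corrcoef(x, log_y)[0, 1])

print("r for (x, y):", r_raw)
print("r for (x, log y):", r_log)
```

After re-expression the points fall on a line, so the Straight Enough Condition holds and r describes the relationship much better than it did for the raw data.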