Chapter 3: Variability
• Mean Sometimes Not Sufficient
• Frequency Distributions
• Normal Distribution
• Standard Deviation

What City Has Temperatures to My Liking?
• Person 1: Likes Seasons and Variability
• Person 2: Likes Consistency, Cool Temps

Average Temperature by City (1961-1990), in °F

City                    Avg. temp.
Duluth                  38.5
Juneau                  40.6
Bismarck                41.6
Burlington              44.6
Great Falls             44.8
Minneapolis-St. Paul    44.9
Portland                45.4
Sioux Falls             45.5
Spokane                 47.3
Buffalo                 47.7
Detroit                 48.6
Chicago                 49
Cleveland               49.6
Denver                  50.3
Pittsburgh              50.3
Providence              50.4
Omaha                   50.6
Boise                   50.9
Boston                  51.3
Salt Lake City          52
Seattle-Tacoma          52
Indianapolis            52.3
Kansas City             53.6
Portland                53.6
Philadelphia            54.3
New York                54.7
Baltimore               55.1
St. Louis               56.1
San Francisco           57.1
Washington              58
Nashville               59.1
Norfolk                 59.2
Oklahoma City           60
Charlotte               60.1
Atlanta                 61.3
Memphis                 62.3
Los Angeles             63
Columbia                63.1
San Diego               64.2
Jackson                 64.2
Dallas-Fort Worth       65.4
Houston                 67.9
New Orleans             68.1
Phoenix                 72.6
Miami                   75.9
Honolulu                77.2

Temperature
• Proximity to Ocean
• Elevation
• Latitude: South-North
• Climate:
  • Precipitation
  • Humidity

Temperature Variation Across Cities in 2011
[Figure: temperature distributions for Boston, San Francisco, San Diego, Austin, and Tampa Bay; axis ticks at 30, 60, and 90]

Similar Mean, Different Distributions
[Figure: temperature distributions for Seattle, Portland, Boston, and Omaha]

Normal Distribution
• Adolphe Quételet (1796-1874)
• ‘Quetelet Index’: Weight / Height² (“Body Mass Index”)

Normal Distribution
Two Metrics: Mean and Standard Deviation

Calculating Standard Deviation
• A deviation is the difference between an actual data point and the mean.
• Deviations are calculated by taking each value and subtracting the mean: $\text{deviation} = x_i - \bar{x}$

Sum the Deviations?
• Deviations cancel out because some are positive and others are negative.
• The overall sum would be 0.
• Not useful.

Sum of Squared Deviations
• Therefore, we square each deviation.
• We get the sum of squares (SS): $SS = \sum (x_i - \bar{x})^2$
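The deviation and sum-of-squares steps above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and the sample is simply the five coldest cities from the temperature chart (Duluth through Great Falls).

```python
# Five coldest city averages from the chart (Duluth ... Great Falls).
temps = [38.5, 40.6, 41.6, 44.6, 44.8]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def deviations(values):
    """Each value minus the mean: x_i - x_bar."""
    m = mean(values)
    return [x - m for x in values]


def sum_of_squares(values):
    """Sum of squared deviations (SS)."""
    return sum(d ** 2 for d in deviations(values))


devs = deviations(temps)

# Positive and negative deviations cancel, so their plain sum is
# zero up to floating-point rounding.
print("sum of deviations:", round(sum(devs), 10))

# Squaring removes the signs, so the total no longer cancels.
print("sum of squares:", round(sum_of_squares(temps), 3))
```

Rounding is applied only for display, since floating-point arithmetic can leave a tiny nonzero remainder in the sum of deviations.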

Variance
• The sum of squares is a good measure of overall variability, but is dependent on the number of scores.
• We calculate the average variability by dividing by the number of scores (n).
• This value is called the variance ($s^2$): $s^2 = \frac{SS}{n} = \frac{\sum (x_i - \bar{x})^2}{n}$

Standard Deviation
• Variance is measured in units squared.
• This isn’t a very meaningful metric, so we take the square root.
• This is the standard deviation (s): $s = \sqrt{s^2} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$

[Figure: distribution with the median marked at 55; axis ticks at 2, 19, 36, 53, 70, 87, and 104]
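As a rough sketch of the variance and standard deviation steps, the Python below divides the sum of squares by the number of scores (n), as the slides describe. The function names and the five-city sample (the coldest cities from the temperature chart) are assumptions chosen for illustration. The result is cross-checked against the standard library's `statistics.pvariance` and `statistics.pstdev`, which also divide by n.

```python
import math
import statistics

# Five coldest city averages from the chart (Duluth ... Great Falls).
temps = [38.5, 40.6, 41.6, 44.6, 44.8]


def variance(values):
    """Sum of squared deviations divided by the number of scores (n)."""
    m = sum(values) / len(values)
    ss = sum((x - m) ** 2 for x in values)
    return ss / len(values)


def standard_deviation(values):
    """Square root of the variance, back in the original units."""
    return math.sqrt(variance(values))


var = variance(temps)
sd = standard_deviation(temps)

print(f"variance: {var:.4f} (degrees squared)")
print(f"standard deviation: {sd:.4f} (degrees)")

# Cross-check against the standard library's divide-by-n versions.
assert math.isclose(var, statistics.pvariance(temps))
assert math.isclose(sd, statistics.pstdev(temps))
```

Many textbooks divide by n - 1 instead when estimating variance from a sample. In Python that corresponds to `statistics.variance` and `statistics.stdev`, which would give slightly larger values here.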