Histogram, Box plot, Dot plot

[Figure: dot plot "Hurricanes" (Ch04_Hurricanes), horizontal axis from 0 to 8]
Number of hurricanes that occurred each year from 1944 through 2000 as reported by Science magazine

3 Characteristics of data: Shape, Center, Spread

Shape of the Data: Symmetric
The IQ scores of 60 randomly selected 5th graders

Shape of the Data: Symmetric
The age of all US Presidents at the time they took office
Notice that this distribution has only one mode

Shape of the Data: Bimodal
The winning times in the Kentucky Derby from 1875 to the present. Why two modes?
The length of the track was reduced from 1.5 miles to 1.25 miles in 1896. The race officials thought that 1.5 miles was too far.

Shape of the Data: Skewed (LEFT, RIGHT)
Data for two different variables for all female heart attack patients in New York state in one year. One is skewed left; the other is skewed right. Which is which?

Center and Spread of Data
Maximum: 100th percentile
Q3: 75th percentile
Median: 50th percentile
Q1: 25th percentile
Minimum: 0th percentile
These numbers are called the 5 number summary. The median measures the center of the data. Q3 - Q1 = Interquartile range (IQR) measures the spread.

Measures of Central Tendency and Dispersion
Central Tendency: Mean, Median, and Mode
Dispersion or Spread: Range, IQR, Standard Deviation, and Variance

Examples of Uses for Standard Deviation and Variance:
• Factory Processes
• Stocks
• Weather
• Sports Teams
• Grades
• Attendance to events
• ?????
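The 5 number summary and the IQR described above can be sketched in Python. This is a minimal illustration, not a definitive method: textbooks and software use several quartile conventions, and this sketch assumes Q1 and Q3 are the medians of the lower and upper halves of the sorted data (leaving out the overall median when the count is odd). The function names and the `counts` data are hypothetical, made up for the example.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data; the overall median is left out of both halves
    when the number of values is odd. Other conventions exist.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so the halves are the same size.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


def iqr(values):
    """Interquartile range: Q3 - Q1."""
    _, q1, _, q3, _ = five_number_summary(values)
    return q3 - q1


# Hypothetical yearly counts, invented for illustration only.
counts = [0, 1, 1, 2, 2, 3, 3, 3, 4, 5, 7]
print(five_number_summary(counts))  # (0, 1, 3, 4, 7)
print(iqr(counts))                  # 3
```

Here the median (3) describes the center and the IQR (4 - 1 = 3) describes the spread of the middle half of the data.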
Symbols:
• $s^2$ = Sample Variance
• $s$ = Sample Standard Deviation
• $\sigma^2$ = Population Variance
• $\sigma$ = Population Standard Deviation
• $\bar{x}$ = Mean

Center and Spread of Data
\[ \bar{x} = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{\text{sum of all numbers}}{\text{number of numbers}} \]
The mean or average is a measure of the center of a distribution.

Center and Spread of Data
\[ \text{mean absolute deviation} = \frac{\sum_{i=1}^{N} \lvert x_i - \bar{x} \rvert}{N} \]
The mean of the absolute deviation of each number. Mean absolute deviation (mad) measures the spread of the data (you learned this last year).

Center and Spread of Data
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N} \]
Variance: the mean of the squares of the deviation of each number. The formula given above is for the population variance.

Center and Spread of Data
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}} \]
Standard deviation: the square root of the variance. This quantity has the same units as the data. This is one of the most common measures of the spread of a distribution. The formula given above is for the population standard deviation.

Center and Spread of Data
Mean Absolute Deviation:
\[ \frac{\sum_{i=1}^{N} \lvert x_i - \bar{x} \rvert}{N} \]
Variance:
\[ \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N} \]
Standard Deviation:
\[ \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}} \]
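As a worked example of the formulas above, the following Python sketch computes the mean, the mean absolute deviation, the population variance, and the population standard deviation by dividing by N in each case. The function names and the `scores` data are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
from math import sqrt


def mean(values):
    """Sum of all numbers divided by the number of numbers."""
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Mean of the absolute deviations from the mean."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)


def population_variance(values):
    """Mean of the squared deviations from the mean (divides by N)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def population_std_dev(values):
    """Square root of the population variance; same units as the data."""
    return sqrt(population_variance(values))


# Hypothetical data, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(scores))                     # 5.0
print(mean_absolute_deviation(scores))  # 1.5
print(population_variance(scores))      # 4.0
print(population_std_dev(scores))       # 2.0
```

By hand: the deviations from the mean of 5 are -3, -1, -1, -1, 0, 0, 2, 4. Their absolute values sum to 12, so the mean absolute deviation is 12 / 8 = 1.5; their squares sum to 32, so the variance is 32 / 8 = 4 and the standard deviation is 2. The sample versions ($s^2$ and $s$) divide by N - 1 instead of N and are not shown here.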