Unit 1 Statistics
Analyzing Data

Analyzing Data
• Measures of Central Tendency
  – How can we best describe the middle of the data?
• Measures of Variability
  – How can we describe how the data varies, how spread out it is?

Measures of Central Tendency
• Arithmetic Mean (\(\bar{X}\))
  – Average
    • Find the total of all the data.
    • Divide by the number of pieces of data.
• If \(X_1\) is the first piece of data, \(X_2\) is the second, and \(X_n\) is the last (nth) piece, then
  \[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]

Measures of Central Tendency
• Median
  – Middle number
    • List the data in numerical order, then find the middle number.
  – If you have an even number of values, find the mean of the two centre data pieces.
• Mode
  – Most popular response
    • Appears the most frequently.

Measures of Variability
• Range
  – Determine the difference from the lowest to the highest piece of data.

Example
• The marks for a chemistry test were
  • 65, 66, 100, 63, 64, 63, 60
  – Find the
    • Arithmetic Mean
    • Median
    • Mode
    • Range

Assignment
WS: Measures of Central Tendency

Measures of Central Tendency
A new kind of average....

Trimmed Mean
• An average calculated AFTER removing a small percentage of the largest and smallest values.
  – After removing the specified values, the trimmed mean is found the same way as an arithmetic mean.

Example 1
• At a figure skating competition, judges drop the highest and lowest scores before determining the competitor's average score.
• Determine the trimmed mean for a competitor with the following scores:
  – 6.0, 8.1, 8.3, 9.1, 9.9.

Weighted Mean
• Used in situations where not all data points are treated as having equal weight.
  – Your term mark is a weighted mean, and so is your final mark!
• Each category receives an average score, and each score is assigned a percentage of worth.

Example 2
• In her job as a server at a restaurant, Glenna earned:
  – 2 tips of $6.00,
  – 3 tips of $8.00,
  – 3 tips of $10.00,
  – 6 tips of $12.00.
• Determine her average tip.
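The measures above can be checked with a short Python sketch. It is only one way to set up the calculations: the helper names (data_range, trimmed_mean, weighted_mean) are made up for illustration, the trimmed mean assumes that exactly one value is dropped from each end (as in the figure skating example), and the weighted mean uses the number of tips of each size as the weights.

```python
from statistics import mean, median, multimode


def data_range(values):
    """Range: difference between the highest and lowest piece of data."""
    return max(values) - min(values)


def trimmed_mean(values, drop=1):
    """Mean after dropping `drop` values from each end (assumed: 1 per end)."""
    ordered = sorted(values)
    kept = ordered[drop:len(ordered) - drop]
    return mean(kept)


def weighted_mean(values, weights):
    """Sum of value * weight, divided by the total weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Chemistry test marks from the Example slide
marks = [65, 66, 100, 63, 64, 63, 60]
print(mean(marks))        # 481 / 7, about 68.7
print(median(marks))      # 64
print(multimode(marks))   # [63]
print(data_range(marks))  # 40

# Example 1: figure skating scores, highest and lowest dropped
scores = [6.0, 8.1, 8.3, 9.1, 9.9]
print(round(trimmed_mean(scores), 2))  # 8.5

# Example 2: Glenna's tips, weighted by how many tips of each size she earned
tips = [6.00, 8.00, 10.00, 12.00]
counts = [2, 3, 3, 6]
print(round(weighted_mean(tips, counts), 2))  # 9.86
```

Notice how the single mark of 100 pulls the arithmetic mean (about 68.7) well above the median (64).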
Example 3
• A student receives
  – A term mark of 84%
  – A score on the midyear of 75%
  – A final exam score of 80%
• Determine her final mark!
  – Remember the weights: 55%, 15%, 30%

Assignment
WS: Measures of Central Tendency

Measures of Variability
Percentile Rank

Percentile Rank
• Comparison of averages is not enough.
• Consider a class with the following marks
  • 80%, 80%, 80%, 90%, 20%, 70%, and 65%
  – The mean of the class is 69%
    » A person with 70% could claim to be “above average”.

Percentile Rank
• Percentage of scores less than or equal to a particular score
• Our example
  – 7 scores all together
    • Only 2 of them are lower than 70%
  – The student was in the bottom half of the class!

Percentile Rank
\[ \text{Percentile Rank} = \frac{B + 0.5E}{n} \times 100 \]
• Where
  – B = number of scores below a given score
  – E = number of scores equal to the given score
    • E=1 if no equal scores
  – n = number of scores
• Percentile Rank is always rounded up.

Our Example
• B=2
  – 2 scores < 70%
• E=1
  – 1 score = 70%
• n=7
  – 7 scores total
\[ \text{P.R.} = \frac{2 + 0.5(1)}{7} \times 100 \approx 35.7 \]
• PR = 36
  – P36
  – 36th percentile
  – Scored as well as or better than 36% of the class.

Special Percentiles
• Median
  – 50th Percentile
  – P50
• Upper quartile
  – Median of upper half of data
  – 75th Percentile
  – P75
• Lower quartile
  – Median of lower half of data
  – 25th Percentile
  – P25

Example
• Wendy is 1.7 m tall. She is taller than 65 of the students in her grade, and no one is the same height as she is. There are 139 students in her grade.
  – What percentage of students are taller than Wendy?

Assignment
Page 390 #1 – 7

Measures of Variability
Standard Deviation

Calculating Standard Deviation
1. Find the mean: \(\bar{X}\)
2. Find the difference between each number and the mean: \(X - \bar{X}\)
3. Square the differences: \((X - \bar{X})^2\)
4. Add up (sum) the squares: \(\sum (X - \bar{X})^2\)
5. Divide the sum of the squares by \(n - 1\): \(\frac{\sum (X - \bar{X})^2}{n - 1}\)
6. Find the square root of this number: \(\sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}\)
• You may find a chart useful to help keep track of the calculations…

  X    | \(X - \bar{X}\) | \((X - \bar{X})^2\)
  X1   |                 |
  X2   |                 |
  Etc. |                 |
Then find:
\[ \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \]

Example
• A group of Senior 4 students had the following scores on a math test
  – 42, 53, 59, 66, 68, 68, 71, 76, 83, 94
• Find the mean, range, and Standard Deviation for the data

  X  | \(X - \bar{X}\) | \((X - \bar{X})^2\)
  42 |                 |
  53 |                 |
  59 |                 |
  66 |                 |
  68 |                 |
  68 |                 |
  71 |                 |
  76 |                 |
  83 |                 |
  94 |                 |

\[ \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \]

Example 2
• Another school wrote the same test and had the following scores
  – 42, 62, 66, 66, 68, 68, 70, 70, 74, 94
• Find the mean, range, and Standard Deviation for the data

  X  | \(X - \bar{X}\) | \((X - \bar{X})^2\)
  42 |                 |
  62 |                 |
  66 |                 |
  66 |                 |
  68 |                 |
  68 |                 |
  70 |                 |
  70 |                 |
  74 |                 |
  94 |                 |

\[ \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \]

Assignment
Page 399 #1 – 7

Measures of Variability
Normal Distributions

Normal Distributions
• A family of graphs that have the same general shape and characteristics.
  – Bell-shaped curves
  – Mean is in the center of the curve
    • Curve is symmetric about mean
  – Mean equals median
    • Equal # of data pieces above and below mean
  – Data is more concentrated in the middle than the ends.

Normal Distributions
• Many data sets have a normal distribution if you collect a very large sample
  – Height
  – Weight
  – IQ score
  – Marks
  – Life Expectancy

Normal Distribution
• Closely related to standard deviation
• Curve is divided into sections
  – Each section is one more standard deviation away from the mean

Normal Distribution
• All normal distributions have the same percentage of data in each section
  – 68% of the data are within 1 standard deviation of the mean (34% on each side).
  – Only 0.30% of the data are more than 3 SDs away from the mean.

Example
• A company manufactures batteries whose life follows a normal distribution. The batteries have an average (mean) lifespan of 90 hours, with a standard deviation of three hours.

Example
• What percentage of batteries will have a lifespan of less than 87 hrs?
• What percentage of batteries will have a lifespan greater than 87 hrs but less than 96 hrs?
• If 5000 batteries are made per day, how many will have a lifespan greater than 93 hrs?
• If your school bought 200 of these batteries, how many of these would last less than 84 hrs?

Assignment
Page 408 #1 – 8

Statistics
Strength of Relationships

Correlation Coefficient
• Statisticians try to find whether a relationship exists between two variables.
• Correlation coefficient (r)
  – Numerical value assigned to a relationship
  – Describes the strength of the relationship.
  – Between -1 and +1

Positive Correlation
• The closer the r-value is to +1, the stronger the positive correlation.
• As the values of one set of data (x) increase, the values of the second set of data (y) also increase.

Negative Correlation
• The closer the r-value is to -1, the stronger the negative correlation.
• As the values of one set of data (x) increase, the values of the second set of data (y) decrease.

[Figure: Golf Scores]

Zero Correlation
• An r-value of 0 indicates a zero correlation.
• Weak correlations will have values close to zero.

Assignment
Page 422 #1 – 7
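The slides describe what the r-value means but do not say how it is calculated. As a rough sketch, the Python below assumes the usual Pearson correlation coefficient, which may not be the method the course intends. The function name pearson_r and the three small data sets are made up purely to illustrate the positive, negative, and zero cases.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed formula, not from the slides)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products and squares of the differences from each mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / sqrt(sxx * syy)


# Hypothetical data, chosen only to show the three cases
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # y rises as x rises
y_down = [10, 8, 6, 4, 2]  # y falls as x rises
y_flat = [2, 1, 3, 1, 2]   # no overall upward or downward trend

print(pearson_r(x, y_up))    # 1.0, perfect positive correlation
print(pearson_r(x, y_down))  # -1.0, perfect negative correlation
print(pearson_r(x, y_flat))  # approximately 0, zero correlation
```

Real data rarely gives exactly +1, -1, or 0; what matters is how close the r-value is to each of those values.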