UNIT 8: Statistical Measures

Statistics: the practice of analyzing a set of data

Measures of Central Tendency: numbers that represent the middle of the data
• Mean ($\bar{x}$): Arithmetic average (sum of the data values divided by the number of values)
• Median: Middle of the data listed in ascending order (count inward from the Max and Min; if the middle is not an exact data value, find the midpoint of the last two data values)
• Mode: Most commonly occurring number(s) (MORE THAN ONE value if several are repeated the same number of times, or NONE if no values repeat)

Measures of Variation: Variance, Standard Deviation
• Dispersion: How spread out / scattered a set of data is
• Range: Difference between the highest and lowest data values (MAX – MIN)
• Interquartile Range (IQR): The difference between Q3 and Q1 (Q3 – Q1)
• Outlier: Any data value more than 1.5 × IQR below Q1 or above Q3
• *Standard Deviation (σ): A measure of how much the data is spread out
• Variance (σ²): Measures how much the data differs from the mean **We don’t use variance**

5 Number Summary:
• Min: Minimum value (0th percentile)
• Q1: Quartile 1 (25th percentile)
• Med (Q2): Median (50th percentile)
• Q3: Quartile 3 (75th percentile)
• Max: Maximum or Q4 (100th percentile)
[Figure: box plot labeled Min, Q1, Med (Q2), Q3, Max (Q4)]

Calculator Commands: One-Variable Statistics
• Input data: [STAT] → [EDIT] → L1
• [STAT] → [CALC] → [1: 1-Var STATS]
• σx = standard deviation
• $\bar{x}$ = mean
• n = number of entries
• med = median
• Etc.

Ex: Find the mean, median, mode, and standard deviation of each data set.

(1) Test Scores: 85, 76, 88, 91, 85, 58, 88, 91, 97, 91, 88, 97, 97
• Mean =
• Median =
• Mode =
• Stand. Deviation =

(2) GPAs: 3.42, 3.91, 3.33, 3.57, 3.45, 4.0, 3.65, 3.71, 3.35, 3.82, 3.67, 3.88, 3.76, 3.41, 3.62
• Mean =
• Median =
• Mode =
• Stand. Deviation =

Normal Distribution
• “Bell Curve”
• Mean = Median

Skewed data is described based on the location of the tail:
• Positively Skewed (Right): “Mean is to the Right of the Median”
• Negatively Skewed (Left): “Mean is to the Left of the Median”

Normal Distribution: 68 – 95 – 99 Rule
$\bar{x}$ = Mean, Median; σ = Standard Deviation
Percent of values in each band (rounded), from left to right, with band edges at $\bar{x} - 3\sigma$, $\bar{x} - 2\sigma$, $\bar{x} - \sigma$, $\bar{x}$, $\bar{x} + \sigma$, $\bar{x} + 2\sigma$, $\bar{x} + 3\sigma$:
0.5% | 2% | 13.5% | 34% | 34% | 13.5% | 2% | 0.5%
• 68% of values are within 1 standard deviation of the mean: $(\bar{x} - \sigma, \bar{x} + \sigma)$
• 95% of values are within 2 standard deviations of the mean: $(\bar{x} - 2\sigma, \bar{x} + 2\sigma)$
• 99% of values are within 3 standard deviations of the mean: $(\bar{x} - 3\sigma, \bar{x} + 3\sigma)$

Normal Distribution Curve #1: The number of problems missed on a quiz follows a normal distribution with a mean of 15 and a standard deviation of 4. Draw a normal distribution for the number of problems missed.
[Figure: normal curve with the horizontal axis (# wrong) marked at 3, 7, 11, 15, 19, 23, 27 and bands labeled 0.5%, 2%, 13.5%, 34%, 34%, 13.5%, 2%, 0.5%; 68% are within 1 standard deviation (between 11 and 19)]
Related Questions:
• What percent of students missed between 11 and 19 problems? 68%
• What percent of students missed more than 23 problems? 2.5%
• What percent of students missed fewer than 23 problems? 97.5%
• What percent of students missed fewer than 3 problems? 0.5%
• What percent of students missed fewer than 11 problems? 16%

Normal Distribution Curve #2: The final exam scores follow a normal distribution with a mean of 80 and a standard deviation of 6. Draw a normal distribution for the exam scores.
[Figure: normal curve with the horizontal axis (score) marked at 62, 68, 74, 80, 86, 92, 98 and bands labeled 0.5%, 2%, 13.5%, 34%, 34%, 13.5%, 2%, 0.5%]
Related Questions:
• What percent of students scored between 80 and 92? 47.5%
• What percent of students scored between 68 and 86? 81.5%
• What percent of students scored less than 98? 99.5%
• What percent of students scored higher than 74? 84%
• What percent of students scored less than 80? 50%

Normal Distribution Curve #3: The class mean height is 60 inches. We know that 68% of the students are between 56 and 64 inches tall.
1) What is the standard deviation?
2) 50% of the students are shorter than …?
3) 17% of students are taller than …?
4) 2.5% of students are shorter than …?
5) 84% of students are taller than …?

Sampling and Error

Biased vs Unbiased
• When you sample a section of the population (for example, by giving a poll), there are good and bad ways to do it.
• An unbiased sample is one that represents a good cross-section of the population.
• A biased sample is one that does not adequately represent the population.
• Ex. 1: You want to determine how many people in a school are going to college, so you ask every third person in an AP Calculus class. (biased or unbiased?) BIASED
• Ex. 2: You want to find out people’s favorite kind of food, so you ask 100 people at the food court at the mall. (biased or unbiased?) UNBIASED
• The bigger the sample is, the more accurate the results will be (the more closely it will reflect the population).
• The Margin of Sampling Error (ME) is a numerical way to determine the difference between how a sample responds and how the population responds.

Margin of Error
• If p represents the percentage (written as a decimal) of people with a particular response in a sample of n people, then 95% of the time the population's response will be within one ME of p (between p – ME and p + ME):
$ME = 2\sqrt{\dfrac{p(1-p)}{n}}$
• Example: 1500 people were asked a question and 38% responded “yes”.
• The margin of error (ME) is 2.5%: $ME = 2\sqrt{\dfrac{0.38(1-0.38)}{1500}} \approx 0.025$
• That means there is a 95% chance that between (38 – 2.5) = 35.5% and (38 + 2.5) = 40.5% of the people in the population would answer “Yes”.
• (38% with a margin of error of 2.5%)

Determine whether each situation would produce a random (unbiased) sample:
• 1. Putting the names of all the seniors in a hat, then drawing names from the hat to select a sample of seniors. UNBIASED
• 2. Determining the shopping preferences of the students at your school by asking people at the mall. BIASED
• 3. Finding the average height of the students in your school by using the members of the football team. BIASED
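The margin of error formula above can be checked with a few lines of code. The following is a minimal Python sketch, not part of the original notes; the function name `margin_of_error` is made up for illustration, and it assumes p is entered as a decimal (0.38 for 38%). It reproduces the worked example with 1500 people and 38% answering “yes”.

```python
import math


def margin_of_error(p: float, n: int) -> float:
    """Approximate 95% margin of sampling error: ME = 2 * sqrt(p * (1 - p) / n).

    p is the sample proportion as a decimal (0.38 for 38%),
    n is the number of people in the sample.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be a proportion between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return 2 * math.sqrt(p * (1 - p) / n)


# Worked example from the notes: 1500 people asked, 38% answered "yes".
p, n = 0.38, 1500
me = margin_of_error(p, n)
low, high = p - me, p + me

print(f"ME = {me:.1%}")                      # ME = 2.5%
print(f"Interval: {low:.1%} to {high:.1%}")  # Interval: 35.5% to 40.5%
```

The same function can be called with other values of p and n, after converting a count such as "x out of n" into a proportion by dividing x by n.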
Find the Margin of Error to the nearest tenth of a percent:
• 4. p = 16%, n = 400
• 5. 934 out of the 2150 students said they read the newspaper every day:
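For readers without a graphing calculator, the one-variable statistics from the start of this unit can also be computed in software. The sketch below is an illustration only, using Python's standard `statistics` module on the test-score data from example (1). Two assumptions are not stated in the notes: the calculator's σx is taken to be the population standard deviation (`pstdev`), and Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. The helper name `quartiles` is made up for this example.

```python
import statistics

# Data set (1) from the notes: test scores.
scores = [85, 76, 88, 91, 85, 58, 88, 91, 97, 91, 88, 97, 97]

mean = statistics.mean(scores)        # arithmetic average
median = statistics.median(scores)    # middle value of the sorted data
modes = statistics.multimode(scores)  # list of every most-common value
sigma = statistics.pstdev(scores)     # population standard deviation (assumed to match σx)


def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when the number of values is odd, the middle value
    is left out of both halves.
    """
    if len(data) < 2:
        raise ValueError("need at least two data values")
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]   # values below the middle
    upper = ordered[-half:]  # values above the middle
    return statistics.median(lower), statistics.median(upper)


q1, q3 = quartiles(scores)
iqr = q3 - q1
# Outlier rule from the notes: more than 1.5 * IQR below Q1 or above Q3.
outliers = [x for x in scores if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print(f"mean = {mean:.2f}, median = {median}, modes = {modes}")
print(f"standard deviation = {sigma:.2f}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}, outliers = {outliers}")
```

Other software may use a different quartile convention, so Q1 and Q3 can differ slightly from a calculator's output on some data sets.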