(Day 2)

In a normal distribution, approximately…
◦ 68% of all data fall within 1 standard deviation of the mean.
◦ 95% of all data fall within 2 standard deviations of the mean.
◦ 99.7% of all data fall within 3 standard deviations of the mean.
◦ Remember, the amount of data will have an impact on how close these numbers are to reality.

[Figure: normal curve divided at each standard deviation. The areas, from left to right, are 0.15%, 2.35%, 13.5%, 34%, 34%, 13.5%, 2.35%, 0.15%. The answers below imply a mean of 500 and a standard deviation of 100.]

1. How much of the data is greater than 500? Answer: 50%
2. How much of the data falls between 400 and 500? Answer: 34%
3. How much of the data falls between 200 and 400? Answer: 50 - 34 - 0.15 = 15.85%
4. How much of the data falls between 300 and 700? Answer: 95%
5. How much of the data is less than 300? Answer: 100 - 50 - 34 - 13.5 = 2.5%

What kind of curve is being used if the area under the curve is defined by a proportion (a value between 0 and 1)? A DENSITY CURVE

To use the 68-95-99.7 Rule, it also must be a normal curve:
◦ Symmetric
◦ Bell shaped
◦ Single peak
◦ Tails fall off
◦ No outliers

You can use Chebychev’s Theorem if the distribution is not bell shaped, or if the shape is not known. The portion of any data set lying within k standard deviations (k > 1) of the mean is at least:

$$1 - \frac{1}{k^2}$$

If you wanted to know how much data fell within 3 standard deviations of the mean of a data set and it wasn’t a bell shaped curve or you didn’t know the shape:

$$1 - \frac{1}{3^2} = 1 - \frac{1}{9} = \frac{8}{9} \approx 88.9\%$$

A z-score is used to standardize numbers which may be on different scales.

$$z = \frac{x - \text{mean}}{\text{standard deviation}}$$

◦ x = non-standardized value
◦ z = standardized value

The answer will typically fall between -3.49 and +3.49.

Z-scores represent the number of standard deviations above or below the mean the number is.
◦ If z = 1, the value is 1 standard deviation above the mean.
◦ If z = -2, the value is 2 standard deviations below the mean.

During the 2003 regular season, the Kansas City Chiefs (NFL) scored 63 touchdowns.
During the 2003 regular season, the Tampa Bay Storm (Arena Football) scored 119 touchdowns.
The mean number of touchdowns for Kansas City is 37.4 with a standard deviation of 9.3.
The mean number of touchdowns for Tampa Bay is 111.7 with a standard deviation of 17.3.

Find the z-score for each.
◦ KC = (63 - 37.4) / 9.3 ≈ 2.75
◦ TB = (119 - 111.7) / 17.3 ≈ 0.42

Kansas City had a better record of touchdowns for the season (much higher above the mean).

The Cth percentile of a distribution is a value such that C percent of the observations lie below it and the rest lie above it.
◦ 80th percentile = 80% below, 20% above = Top 20%
◦ 90th percentile = 90% below, 10% above = Top 10%
◦ 99th percentile = 99% below, 1% above = Top 1%

Pg. 297-306
◦ #7-14, 16-18, 25, 27

Important Notes
◦ For the window settings, the two numbers in the brackets represent the minimum and maximum; the subscript number represents the scale.
◦ Add in the age for President Obama: 47 years old.
◦ Question 4 now changes to 44 data points since we have now included President Obama in the data.
◦ In order to sort the L1 list into L2 and L3, highlight the L2 and L3 headings and type in L1. This will copy the data into each column. To sort L2 in ascending order hit STAT, #2, L2; to sort L3 in descending order hit STAT, #3, L3.
◦ Add these 2 questions:
  What percent are between 42.32 and 60.80 years old? How does this compare with what should happen with the 68-95-99.7 rule?
  What percent are below age 59.64 years? How does this compare with what should happen with the 68-95-99.7 rule?
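
As a check on the 68-95-99.7 rule questions earlier in these notes, the sketch below compares the rule's rounded percentages with exact normal-curve areas. The mean of 500 and standard deviation of 100 are assumptions inferred from the worked answers (they are not stated in the notes), and the helper name `percent_between` is made up for this illustration.

```python
from statistics import NormalDist

# Assumed from the worked answers: mean 500, standard deviation 100.
dist = NormalDist(mu=500, sigma=100)

def percent_between(low, high):
    """Exact percent of the normal distribution lying between low and high."""
    return 100 * (dist.cdf(high) - dist.cdf(low))

within_1 = percent_between(400, 600)    # rule of thumb: about 68
within_2 = percent_between(300, 700)    # rule of thumb: about 95
within_3 = percent_between(200, 800)    # rule of thumb: about 99.7
above_mean = 100 * (1 - dist.cdf(500))  # half the data lies above the mean
below_300 = 100 * dist.cdf(300)         # rule of thumb: about 2.5

for label, value in [("within 1 sd", within_1), ("within 2 sd", within_2),
                     ("within 3 sd", within_3), ("above 500", above_mean),
                     ("below 300", below_300)]:
    print(f"{label}: {value:.2f}%")
```

The exact areas differ slightly from the rule because 68, 95, and 99.7 are rounded values; for example, the exact area below 300 is a little under the 2.5% the rule gives.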
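
Chebychev’s Theorem from earlier in these notes can be evaluated for any k > 1. The following is a minimal sketch; the function name `chebyshev_lower_bound` is chosen only for this illustration.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of any data set within k standard deviations of the mean."""
    if k <= 1:
        # The theorem is stated for k > 1; at k = 1 the bound is 0 and says nothing.
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

bound_2 = chebyshev_lower_bound(2)  # at least 3/4 of the data
bound_3 = chebyshev_lower_bound(3)  # at least 8/9 of the data

print(f"k=2: at least {100 * bound_2:.1f}%")
print(f"k=3: at least {100 * bound_3:.1f}%")
```

Note that these are lower bounds that hold for any shape; a bell-shaped distribution will have more data within the same range (about 95% and 99.7% for k = 2 and k = 3).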
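
The touchdown comparison can be reproduced with a short sketch of the z-score formula. The function and variable names here are illustrative assumptions; the numbers are the ones given in the example.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

kc = z_score(63, 37.4, 9.3)     # Kansas City Chiefs
tb = z_score(119, 111.7, 17.3)  # Tampa Bay Storm

print(f"KC z-score: {kc:.2f}")
print(f"TB z-score: {tb:.2f}")
print("Further above its mean:", "KC" if kc > tb else "TB")
```

Comparing z-scores rather than raw touchdown counts is what makes the two teams comparable, since the two leagues score on very different scales.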
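
The two added questions ask what percent of a data list falls in a range or below a value, which is the same idea as a percentile. The sketch below shows one way to count this. The list `sample_ages` is made-up illustrative data, not the presidential ages from the assignment, and both function names are hypothetical.

```python
def percent_below(data, value):
    """Percent of observations strictly below value."""
    return 100 * sum(1 for x in data if x < value) / len(data)

def percent_between(data, low, high):
    """Percent of observations from low to high, inclusive."""
    return 100 * sum(1 for x in data if low <= x <= high) / len(data)

# Made-up values for illustration only; substitute the real list (L1).
sample_ages = [42, 47, 51, 54, 55, 57, 61, 64, 68, 69]

in_range = percent_between(sample_ages, 42.32, 60.80)
below = percent_below(sample_ages, 59.64)

print(f"Between 42.32 and 60.80: {in_range:.1f}%")
print(f"Below 59.64: {below:.1f}%")
```

With the real data, these percentages can be set beside the 68-95-99.7 rule's predictions to judge how close the distribution is to normal.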