
Chapter 2 Descriptive Statistics

I. Section 2-1
   A. Steps to Constructing Frequency Distributions
      1. Determine number of classes (may be given to you)
         a. Should be between 5 and 15 classes.
      2. Find the Range
         a. The Maximum minus the Minimum.
            1) Use the TI-84 to sort the data.
               a) STAT – Edit – enter numbers into L1
               b) STAT – SortA(L1) will put the numbers into ascending order.
               c) STAT – SortD(L1) will put the numbers into descending order.
      3. Find the Class Width
         a. Range divided by the number of classes.
            1) Always round UP!!
               a) Even if class width comes out to a whole number, go up one.
      4. Find the Lower Limits
         a. Begin with the minimum value in your data set, and then add the class width to that to get the next Lower Limit.
            1) Repeat as many times as needed to get the required number of classes.
      5. Find the Upper Limits
         a. The Upper Limit of the first class is one less than the Lower Limit of the second class.
            1) Add the class width to each Upper Limit until you have the necessary number of classes.
      6. Find the Lower Boundaries
         a. Subtract one-half unit from each Lower Limit (Do NOT round these!)
      7. Find the Upper Boundaries
         a. Add one-half unit to each Upper Limit.
      8. Find the Midpoints of each class.
         a. The mean of the Lower and Upper Limits of the class (Do NOT round).
            1) Could also use the mean of the boundaries for this.
      9. Frequency Distribution
         a. Place a tally mark in each class for every piece of data that fits there.
         b. Add up the tally marks – these are your frequencies for each class.
      10. Relative Frequencies
         a. Divide the class frequencies by the total number of data points to find the percentage of the total represented by each class.
      11. Cumulative Frequencies
         a. The total number of tallies for each class, plus all those that came before.
            1) The cumulative frequency of the last class must equal the number of data points used.
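The steps above can also be sketched in code as a way to check hand work. This is only a minimal sketch, assuming whole-number data (so "one-half unit" is 0.5); the function name `frequency_distribution` and the `ClassRow` record are made up for illustration, and the data is the 30-value list from the hand-out example.

```python
from collections import namedtuple

# Hypothetical record type for one class (row) of the table.
ClassRow = namedtuple(
    "ClassRow", "lower upper lb ub midpoint freq rel_freq cum_freq"
)


def frequency_distribution(data, num_classes):
    """Build a frequency distribution for whole-number data."""
    minimum, maximum = min(data), max(data)
    data_range = maximum - minimum
    # Always round the class width UP. Integer division plus one also
    # goes up one when the quotient is already a whole number.
    width = data_range // num_classes + 1

    rows = []
    cumulative = 0
    for i in range(num_classes):
        lower = minimum + i * width      # lower limits step by the width
        upper = lower + width - 1        # one less than the next lower limit
        freq = sum(1 for x in data if lower <= x <= upper)
        cumulative += freq
        rows.append(ClassRow(
            lower, upper,
            lower - 0.5, upper + 0.5,    # class boundaries
            (lower + upper) / 2,         # midpoint (not rounded)
            freq, freq / len(data), cumulative,
        ))
    return rows


data = [72, 84, 61, 76, 104, 76, 86, 92, 80, 88,
        98, 76, 97, 82, 84, 67, 70, 81, 82, 89,
        74, 73, 86, 81, 85, 78, 82, 80, 91, 83]

table = frequency_distribution(data, 6)
for row in table:
    print(f"{row.lower:>3}-{row.upper:<3}  {row.lb:>5}-{row.ub:<5}  "
          f"{row.midpoint:>5}  {row.freq:>2}  {row.rel_freq:.2f}  {row.cum_freq:>2}")
```

For the hand-out data with 6 classes, the class width comes out to 8 and the frequencies are 2, 7, 11, 7, 2, 1, with the last cumulative frequency equal to 30.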
Frequency Distribution Example (Separate Hand-Out from Notes Outline)

EXAMPLE: Use the table of 30 numbers below to fill in a frequency distribution of 6 classes.

   STAT – Edit – Enter these 30 numbers into L1 on your calculator.
   STAT – SortA(L1) – 2nd 1 enters L1 into the parentheses.
   STAT – Edit – to see the new list, in order.

Original data:
   72   84   61   76  104   76   86   92   80   88
   98   76   97   82   84   67   70   81   82   89
   74   73   86   81   85   78   82   80   91   83

Sorted data:
   61   67   70   72   73   74   76   76   76   78
   80   80   81   81   82   82   82   83   84   84
   85   86   86   88   89   91   92   97   98  104

Max Value: 104
Min Value: 61
Range: 104 - 61 = 43
Class Width: 43/6 ≈ 7.2 – Round UP to 8!

Lower Limits: The Minimum Value is the first Lower Limit. Add the class width down the column.
Upper Limits: The first Upper Limit is one less than the 2nd Lower Limit. Add the class width down the column.
Check to be sure that the Maximum Value fits in the last class. Since 104 fits between 101 and 108, we are good. If the Maximum value does NOT fit into the last class, you did something wrong. DO IT AGAIN!!

Lower Boundaries: Subtract one-half unit from lower limits to get lower boundaries.
Upper Boundaries: Add one-half unit to upper limits to get upper boundaries.
Midpoints: Find the mean of the limits (or boundaries) to find the midpoint of each class.

   LL    UL    LB      UB      MdPt
   61    68    60.5    68.5    64.5
   69    76    68.5    76.5    72.5
   77    84    76.5    84.5    80.5
   85    92    84.5    92.5    88.5
   93    100   92.5    100.5   96.5
   101   108   100.5   108.5   104.5
Frequency: Count how many data points fit in each class and enter that into the Frequency column.

Relative Frequency: Relative Frequency is the class frequency divided by the total frequency. In this case, we have 30 pieces of data, so we divide by 30. The Relative Frequency column MUST add up to 1 (allowing for rounding)!! If it does NOT, you are wrong. DO IT AGAIN!!

Cumulative Frequency: Cumulative Frequency is the Frequency of each class, plus the frequencies of the classes that came before it. The last class must have a cumulative frequency that matches the number of data points.

   LL    UL    LB      UB      MdPt    Freq.   Rel. Freq.      Cum. Freq.
   61    68    60.5    68.5    64.5    2       2/30 = 0.07     2
   69    76    68.5    76.5    72.5    7       7/30 = 0.23     9
   77    84    76.5    84.5    80.5    11      11/30 = 0.37    20
   85    92    84.5    92.5    88.5    7       7/30 = 0.23     27
   93    100   92.5    100.5   96.5    2       2/30 = 0.07     29
   101   108   100.5   108.5   104.5   1       1/30 = 0.03     30
   Total                               30      1.00

   B. Steps to Constructing a Frequency Histogram
      1. Label the horizontal axis with the class boundaries.
      2. Label the vertical axis with the frequencies.
      3. Draw a bar graph with bars that touch, using the frequencies from your frequency distribution.
   C. Steps to Constructing a Relative Frequency Histogram
      1. Label the horizontal axis with the class boundaries.
      2. Label the vertical axis with the relative frequencies (as proportions or percentages).
      3. Draw a bar graph with bars that touch, using the relative frequencies from your frequency distribution.
   D. Steps to Constructing an Ogive
      1. Label the horizontal axis with the upper class boundaries.
      2. Label the vertical axis with the cumulative frequencies, from 0 up to the total number of data points.
      3. Place a dot above each upper class boundary at the height of that class’s cumulative frequency, and connect the dots.
         a. This chart will always end at the total number of data points.

Assignments:
   Classwork: Pages 49-51 #1-25 Odds
   Homework: Pages 51-54 #28-42 Evens

II. Section 2-2
   A. Stem and Leaf Plot
      1. Use the extreme values (the minimum and the maximum) to decide which stems you need.
      2. Go through the data points, placing the leaves beside the appropriate stems.
      3. If you have too many data points, you can use two lines per stem, with leaves 0-4 on the first line and leaves 5-9 on the second line.

EXAMPLE: Use the table of 30 numbers below to fill in a stem and leaf plot.

   72   84   61   76  104   76   86   92   80   88
   98   76   97   82   84   67   70   81   82   89
   74   73   86   81   85   78   82   80   91   83

   Stem | Leaves
    10  |
     9  |
     8  | 4
     7  | 2 6
     6  | 1

The plot shows the first four data points (72, 84, 61, 76). Continue in this manner until all 30 data points are represented in the stem and leaf plot.

   B. Dot Plot
      1. Use a horizontal line, numbered from lowest data value to highest.
         a. Place a dot above the line at each data point, stacking dots for repeated values.
            1) This allows you to see visually whether you have a tight grouping of data points, and where it is, if it exists.
   C. Pie Chart
      1. Used to describe parts of a whole.
         a. Multiply the relative frequency you calculated earlier by 360 (the number of degrees in a circle) to find the number of degrees that each class will take up.
            1) The calculated number of degrees is the central angle of that class's slice of the circle.
               a) Use a protractor to draw your angles.
   D. Scatter Plot
      1. Used to visually examine the possible relationship between two different variables.
         a. Place one variable on the vertical axis, and the other on the horizontal.
            1) Graph them as if one was the x-value of an ordered pair and the other was the y-value.
      2. The closer the dots are to forming a line, the stronger the linear relationship.
         a. If the slope is upward, the relationship has a positive correlation.
         b. If the slope is downward, the relationship has a negative correlation.
      3. To do a Scatter Plot on the TI-84, follow these steps.
         a. Turn STAT Plots on.
            1) 2nd Y=, Enter
            2) Highlight On, Enter
         b. Go to STAT and Edit.
            1) Enter x-values into L1, and y-values into L2.
         c. Press the Window button, and set your Xmin and Xmax values to match the data in L1.
            1) Repeat for Ymin and Ymax values to match L2.
         d. Press Graph to see the scatterplot.
      4. To get the equation of the line of best fit, go to STAT and Calc, then select LinReg (4).
         a. The slope and y-intercept will be given to you.
      5. To graph the line with the scatterplot, manually enter the equation into the Y= window and press Graph.

Assignments:
   Classwork: Page 62 #1-12
   Homework: Pages 63-66 #18, 22, 27, 28, 33, 36

Quiz on Lessons 2-1 and 2-2 on Friday!!

III. Section 2-3
   A. Measures of Central Tendency
      1. Mean – The sum of all data points divided by the number of values.
         a. This is the one that we most often think of when we say “average”.
            1) It’s also the one most affected by an extreme value (either high or low).
      2. Median – The middle number (or mean of the two middle numbers) when the data points are put into order.
         a. The point which has as many data values above it as there are below it.
      3. Mode – The value that happens the most often (highest frequency).
   B. Shapes of Distributions
      1. Symmetric – Data bunched in the middle, with equal distribution on either side.
      2. Uniform – Data is spread evenly across the whole spectrum.
      3. Skewed Data – Named by the “tail”.
         a. Skewed right means most of the data values are at the left (low) end of the range, with the tail stretching to the right.
         b. Skewed left means that most of the data values are at the right (high) end of the range, with the tail stretching to the left.

IV. Section 2-4
   A. Measures of Variation
      1. Range – The difference between the highest value and the lowest value. (Maximum minus Minimum)
         a. Easy to compute but only uses two numbers from a data set.
      2. Deviation – The difference between the value of a data point and the mean of the data set.
         a. In a population, the deviation of x is x − μ. (μ is the Greek letter “mu”, pronounced “moo”)
         b. In a sample, the deviation of x is x − x̄. (x̄ is pronounced “x bar”)
         c. The sum of the deviations of a set of data will always be zero.
      3. Population Measures of Variance
         a. Population Variance – The sum of the squares of the deviations, divided by N (the number of data points in the population).
            1) Find the deviations, and then square them (this makes them all non-negative, so they don’t cancel each other out).
               a) Add up the squared deviations, and then divide by the number of data points.
         b. Population Standard Deviation – The square root of the population variance.
      4. Sample Measures of Variance
         a. Sample Variance – The sum of the squares of the deviations, divided by n - 1 (one less than the number of data points in the sample).
         b. Sample Standard Deviation – The square root of the sample variance.
   B. Empirical Rule
      1. All symmetric bell-shaped distributions have the following characteristics:
         a. About 68% of data points will occur within one standard deviation of the mean.
         b. About 95% of data points will occur within two standard deviations of the mean.
         c. About 99.7% of data points will occur within three standard deviations of the mean.
   C. Chebychev’s Theorem
      1. This applies to ANY distribution, regardless of its shape.
         a. The portion of data lying within k standard deviations (k > 1) of the mean is at least 1 − 1/k².
            1) For k = 2, at least 1 – ¼ = ¾ or 75% of the data will be within 2 standard deviations of the mean.
            2) For k = 3, at least 1 – 1/9 = 8/9 or 88.9% of the data will be within 3 standard deviations of the mean.

V. Section 2-5 – Measures of Position
   A. Quartiles
      1. Q1, Q2 and Q3 divide the data into 4 equal parts.
         a. Q2 is the same as the median, or the middle value.
         b. Q1 is the median of the data below Q2.
         c. Q3 is the median of the data above Q2.
      2. Box and Whisker Plot
         a. Left whisker runs from lowest data value to Q1.
         b. Box runs from Q1 to Q3, with a line through it at Q2.
            1) The distance from Q1 to Q3 is called the interquartile range.
         c. Right whisker runs from Q3 to highest data value.
         d. To draw a box-and-whisker plot on the TI-84, follow these steps.
            1) Enter the data values into L1 in STAT Edit.
            2) Turn on your Stat Plots (2nd Y=), and select the plot type with the box-and-whisker shown.
            3) Set your window to match the data.
               a) Xmin should be less than your lowest data point.
               b) Xmax should be more than your highest data point.
            4) Press Graph. The box-and-whisker plot should appear.
               a) Press the Trace button and you can see exactly which values make up the Min, Q1, Median, Q3, and the Max.
   B. Percentiles
      1. Divide the data into 100 parts. There are 99 percentiles (P1, P2, P3, … P99).
         a. P50 = Q2 = the median.
         b. P25 = Q1
         c. P75 = Q3
      2. A 63rd percentile score means that this person did as well as or better than 63% of the people who took that test.
      3. The cumulative frequency from Section 2-1 can help us find the percentile.
   C. Z-Scores
      1. Also called the “standard score”, it represents the number of standard deviations that a data value is away from the mean.
         a. z = (value − mean) / (standard deviation) = (x − μ) / σ
      2. A z-score of less than -2 or greater than 2 is considered to be unusual.
         a. Remember that about 95% of data points should be within 2 standard deviations of the mean (if the data has a symmetric, bell-shaped distribution).
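The quartile and z-score definitions above can be checked with a short script. This is a rough sketch, not a replacement for the TI-84 steps: the helper names `quartiles` and `z_score` are made up for illustration, the data is the 30-value list from the Section 2-1 hand-out, and the list is assumed to be a whole population (so μ and σ are used rather than the sample versions).

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3): Q2 is the median, Q1 and Q3 are the
    medians of the data below and above Q2."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # When n is odd, skip the middle value itself.
    upper_half = ordered[half + n % 2:]
    return (statistics.median(lower_half),
            statistics.median(ordered),
            statistics.median(upper_half))


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


data = [72, 84, 61, 76, 104, 76, 86, 92, 80, 88,
        98, 76, 97, 82, 84, 67, 70, 81, 82, 89,
        74, 73, 86, 81, 85, 78, 82, 80, 91, 83]

q1, q2, q3 = quartiles(data)
iqr = q3 - q1                      # interquartile range

# Assumption: the 30 values are the whole population.
mu = statistics.mean(data)
sigma = statistics.pstdev(data)

# Values with a z-score below -2 or above 2 are flagged as unusual.
unusual = sorted(x for x in data if abs(z_score(x, mu, sigma)) > 2)

print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
print("Unusual values:", unusual)
```

For this data set the quartiles are Q1 = 76, Q2 = 82 and Q3 = 86, so the interquartile range is 10, and the only values flagged as unusual are 61 and 104. Other tools (including some calculator modes) may use slightly different quartile rules, so small differences from this sketch are possible on other data.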
