Survey
Frequency Distributions and Central Tendency

There are many ways to collect data, and all of them have some drawbacks. For example, advice columnist Ann Landers once asked her readers if they would still have children, given a chance to do it over again. She reported in her column a few weeks later that 70% of parents said having kids was not worth it.1

1 Yates, D., et al. The Practice of Statistics. 2000.

It is true that 70% of those responding to this survey said they would not have kids if they could do it all over again. But do you think you can use this data to draw the conclusion that 70% of parents feel this way? Why or why not?

This survey used what is called a voluntary response sample. Only those who felt strongly about the issue chose to respond. So instead of saying 70% of parents would not have children again, the best we could say is that 70% of those who feel strongly about the issue would not have children again.

One sampling method that is fairly reliable is a random sample.

Definition: A random sample of n individuals from a population is a sample chosen in such a way that each individual in that population has an equal chance of being selected.

One way to analyze collected data is to look at how the data is grouped. To do this, we will examine the frequency distribution.

Example: A random sample of 25 college students is surveyed and asked how many minutes a day, on average, they spend reading their math textbook. The answers are listed below:

14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16, 12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9

We will use our calculators to plot the frequency distribution. Once the plot is set up, pressing ZOOM and choosing ZOOMSTAT will make the frequency distribution graph appear. To set up the plot, first clear all lists on your calculator. Then input this data into L1. Hit 2ND-STAT PLOT (above the Y= key), and hit ENTER to choose plot 1. Select ON, and the bar graph (3rd selection) for Type.
Xlist should be set to L1 by default. Hit WINDOW. If Xscl is not equal to 1, change it. You might also want a Yscl of 1. Your plot should look something like this: [figure: bar graph of the frequency distribution]

Now the height of the first bar represents the number of 1's, the second the number of 2's, etc. Using this chart, we can write out a frequency table:

minutes     1   2   3   4   5   6   7   8
frequency   1   1   1   1   1   2   2   0

minutes     9  10  11  12  13  14  15  16
frequency   4   2   5   2   1   1   0   1

We can use our plot of the frequency distribution to draw a frequency polygon. A frequency polygon connects the midpoint of the top of each bar on the frequency distribution plot, using a 0 value for the top of the hypothetical 0th and (n + 1)th bars.

Your calculator will also regroup the data. Under WINDOW, change Xscl to 4, and hit GRAPH (you might also need to change your Ymax). Now our rectangles represent the number of people who spent 1 to 4, 5 to 8, 9 to 12, and 13 to 16 minutes daily reading their math textbook.

Another way to analyze data is to calculate its central tendency.

Definition: The central tendency of a data set is a value given to the center, or middle, of the set.

There are three different measures of central tendency for frequency distributions: mean, median, and mode. The mean of a frequency distribution is the most familiar. For probability distributions, the central tendency was the expected value.

Definition: The mean of n numbers x_1, x_2, ..., x_n is

$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$

This process is tedious for long lists of numbers. Luckily, our calculators can do this. To find the mean of the numbers in L1, just hit STAT, scroll to CALC, and hit ENTER to get 1-Var Stats. That will appear on your home screen. Hit L1 and ENTER, and a long list of statistics appears. x̄ is the mean of the numbers in L1.

Other useful information on your screen:
∑x = sum of L1
∑x² = sum of squares of values in L1
Sx = sample standard deviation (next section)
σx = standard deviation of population (we won't use this)
n = how many numbers in L1
minX = smallest number in L1
Q1 = first quartile, the beginning of the interquartile range (a measure of spread; we won't use this)
Med = median (coming up soon)
Q3 = third quartile, the end of the interquartile range (see Q1)
maxX = largest number in L1

In math, x̄ is the name given to the mean of a sample. This number can be used to get an idea of what the mean value is for the entire population. The symbol typically used for the mean of the entire population is µ (mu). Our goal is to draw appropriate conclusions about the population by examining a sample.

The shape of the frequency distribution is most commonly used to decide the best measurement of central tendency. When the frequency distribution is shaped like a bell curve, the mean is the best measure of central tendency (more on this in section 9.3). When the frequency distribution is skewed, the median can be a better measure of central tendency.

Definition: The median of a data set is the middle entry of that set when the entries are listed in order.

Note that for our data set of time spent reading math, the data was slightly skewed to the left (the tail stretches toward the smaller values, so most of the data sits at the larger end of our set of numbers). Thus, the median would be a good measure of central tendency for this data set. Using 1-Var Stats, we already found that the median of this data is 9. Since there are 25 numbers in this list of data, 12 are less than (or equal to) 9 and 12 are more than (or equal to) 9.

To check, we can sort our list. Under the STAT menu, choose 2, SortA( (sort ascending). Hit ENTER, and SortA( will appear on your home screen. Input L1 and hit ENTER. "Done" will appear on the home screen. If you now go to STAT and hit 1 for Edit, you will see L1 in ascending order. Check to see if we got the correct median.

If you have an even number of numbers in your data set, the median is the mean of the middle 2.

Example: Find the median of 100, 114, 125, 135, 150, 172. These numbers are already in ascending order.
There are 2 numbers less than 125 and 2 greater than 135; therefore (125 + 135)/2 = 130 is the median.

The final measure of central tendency is the mode. This measure is not often used, but it does tend to cancel out the effect of unusually large or unusually small elements in a data set.

Definition: The mode of a data set is the most frequently occurring element in that data set. A data set can have more than one mode.

Examples:
1. Find the mode or modes: 1, 7, 2, 2, 5, 7, 2, 5, 7
2. Find the mode or modes: 21, 32, 46, 32, 49, 32, 49

Classwork: Below is a sampling of ages of motorcyclists at the time they were fatally injured in traffic accidents:

17, 38, 27, 14, 18, 34, 16, 42, 28, 24, 40, 20, 23, 31, 37, 21, 30, 25, 17, 28, 33, 25, 23, 19, 51, 18, 29

1. Input the data in L1 in your calculator.
2. Graph a frequency distribution with bar width (Xscl) of 1.
3. Graph a frequency distribution with bar width of 10.
4. Find the mean and median of the data.
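
For readers who want to cross-check the calculator work away from the calculator, the reading-time example from earlier in this section can be sketched in Python. This is only an illustration and not part of the calculator procedure: the helper names `frequency_table` and `grouped_counts` are made up for this sketch, and it relies on Python's standard `statistics` and `collections` modules (Python 3.8 or later for `multimode`).

```python
from collections import Counter
from statistics import mean, median, multimode

# Minutes per day spent reading the math textbook (the 25-student example).
minutes = [14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16,
           12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9]


def frequency_table(data, low, high):
    """Count how often each whole number from low to high occurs in data.

    Hypothetical helper: mirrors a bar graph with a bar width (Xscl) of 1.
    """
    counts = Counter(data)
    return {value: counts[value] for value in range(low, high + 1)}


def grouped_counts(data, low, high, width):
    """Count the data in consecutive groups of `width` whole numbers.

    Hypothetical helper: mirrors regrouping the bars with a larger Xscl.
    Returns a dict keyed by (group_start, group_end).
    """
    groups = {}
    for start in range(low, high + 1, width):
        end = start + width - 1
        groups[(start, end)] = sum(1 for x in data if start <= x <= end)
    return groups


table = frequency_table(minutes, 1, 16)
groups = grouped_counts(minutes, 1, 16, 4)

print("frequency table:", table)
print("groups of 4:", groups)
print("mean:", mean(minutes))
print("median:", median(minutes))
print("mode(s):", multimode(minutes))

# With an even number of values, median() averages the middle two.
print("median of even-sized list:", median([100, 114, 125, 135, 150, 172]))
```

The counts in `table` should agree with the frequency table written out from the bar graph, and the median should agree with the value of 9 read from 1-Var Stats. `multimode` is used rather than `mode` because a data set can have more than one mode.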