Section 13.1: What You Will Learn
• Sampling Techniques
• Random Sampling
• Systematic Sampling
• Cluster Sampling
• Stratified Sampling
• Convenience Sampling

Copyright 2013, 2010, 2007, Pearson, Education, Inc.

Statistics
Statistics is the art and science of gathering, analyzing, and making inferences (predictions) from numerical information, called data, obtained in an experiment.

Statistics is divided into two main branches.
• Descriptive statistics is concerned with the collection, organization, and analysis of data.
• Inferential statistics is concerned with making generalizations or predictions from the data collected.

Statisticians
A statistician’s interest lies in drawing conclusions about possible outcomes through observations of only a few particular events.
• The population consists of all items or people of interest.
• The sample includes some of the items in the population.

When a statistician draws a conclusion from a sample, there is always the possibility that the conclusion is incorrect.

Types of Sampling
• Random sample: each item in the population has an equal probability of being selected.
• Systematic sample: items are selected at a fixed interval, for example every 8th item.
• Cluster sample (area sampling): the population is divided into sections, some sections are selected at random, and every item in each selected section is included.
• Stratified sample: the population is divided into groups according to some characteristic, and a random sample is then selected from each group.
• Convenience sample: uses data that are easily obtained, and can be extremely biased.

Example 1: Identifying Sampling Techniques
Identify the sampling technique used to obtain a sample in the following. Explain your answer.
a) Every 20th soup can coming off an assembly line is checked for defects.
   Answer: Systematic sampling
b) A $50 gift certificate is given away at the Annual Bankers Convention. Tickets are placed in a bin, and the tickets are mixed up. Then the winning ticket is selected by a blindfolded person.
   Answer: Random sampling
c) Children in a large city are classified based on the neighborhood school they attend. A random sample of five schools is selected. All the children from each selected school are included in the sample.
   Answer: Cluster sampling
d) The first 50 people entering a zoo are asked if they support an increase in taxes to support a zoo expansion.
   Answer: Convenience sampling
e) Viewers of the USA Network are classified according to age. Random samples from each age group are selected.
   Answer: Stratified sampling

Section 13.2: What You Will Learn
• Misuses of Statistics
• What Is Not Said
• Vague or Ambiguous Words
• Draw Irrelevant Conclusions
• Charts and Graphs

Misuses of Statistics
Many individuals, businesses, and advertising firms misuse statistics to their own advantage.

When examining statistical information, consider the following:
• Was the sample used to gather the statistical data unbiased and of sufficient size?
• Is the statistical statement ambiguous; that is, could it be interpreted in more than one way?
What Is Not Said
“Four out of five dentists recommend sugarless gum for their patients who chew gum.”
• The advertisement does not state the sample size or the number of times the experiment was performed to obtain the desired results.
• The advertisement does not mention that possibly only 1 out of 100 dentists recommended gum at all.

Vague or Ambiguous Words
Vague or ambiguous words also lead to statistical misuses or misinterpretations. The word average is one such culprit. There are at least four different “averages,” some of which are discussed in Section 13.4.

During contract negotiations, it is not uncommon for an employer to state publicly that the average salary of its employees is $45,000, whereas the employees’ union states that the average is $40,000. Who is lying? Actually, both sides may be telling the truth. Each side will use the average that best suits its needs to present its case.

Another vague word is largest. For example, ABC claims that it is the largest department store in the United States. Does that mean largest profit, largest sales, largest building, largest staff, largest acreage, or largest number of outlets?

Draw Irrelevant Conclusions
Still another deceptive technique used in advertising is to state a claim from which the public may draw irrelevant conclusions.

For example, a disinfectant manufacturer claims that its product killed 40,760 germs in a laboratory in 5 seconds. “To prevent colds, use disinfectant A.” It may well be that the germs killed in the laboratory were not related to any type of cold germ.
Charts and Graphs
Charts and graphs can also be misleading. Even though the data are displayed correctly, adjusting the vertical scale of a graph can give a different impression.

While each graph presents identical information, the vertical scales have been altered. The graph in part (a) appears to show a greater increase than the graph in part (b), again because of a different scale.

Consider a claim that if you invest $1, by next year you will have $2. This type of claim is sometimes misrepresented. Actually, your investment has only doubled, but the area of the square on the right is four times that of the square on the left. By expressing the amounts as cubes, you increase the volume eightfold.

A circle graph can be misleading if the parts of the graph do not add up to 100%. A pie chart whose percents sum to 183%, for example, is wrong.

Despite the examples presented in this section, you should not be left with the impression that statistics is used solely for the purpose of misleading or cheating the consumer. There are many important and necessary uses of statistics. Most statistical reports are accurate and useful. You should realize, however, the importance of being an aware consumer.
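The fourfold and eightfold figures in the square and cube examples of this section follow from how area and volume scale with side length. As a brief worked check, assuming the picture for $2 is drawn with each side twice as long as the picture for $1 (the side length s is notation introduced here, not from the slides):

```latex
\[
\frac{\text{area of larger square}}{\text{area of smaller square}} = \frac{(2s)^2}{s^2} = 4,
\qquad
\frac{\text{volume of larger cube}}{\text{volume of smaller cube}} = \frac{(2s)^3}{s^3} = 8
\]
```

So an amount that has only doubled looks four or eight times larger when it is drawn this way.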
Section 13.3: What You Will Learn
• Frequency Distributions
• Histograms
• Frequency Polygons
• Stem-and-Leaf Displays
• Circle Graphs

Frequency Distribution
A piece of data is a single response to an experiment. A frequency distribution is a listing of observed values and the corresponding frequency of occurrence of each value. It is usually presented as a table.

Example 1: Frequency Distribution
The number of children per family is recorded for 64 families surveyed. Construct a frequency distribution of the data.

Number of children      Number of families
(observed values)       (frequency)
0                        8
1                       11
2                       18
3                       11
4                        6
5                        4
6                        2
7                        1
8                        2
9                        1
Total                   64

Rules for Data Grouped by Classes
A more general frequency distribution groups the data into classes:
1. The classes should be of the same “width.”
2. The classes should not overlap.
3. Each piece of data should belong to only one class.
It is often suggested that there be 5 to 12 classes.

Definitions
Classes: 0–4, 5–9, 10–14, 15–19, 20–24, 25–29
The lower class limits are the first numbers of the classes (0, 5, 10, 15, 20, 25). The upper class limits are the second numbers (4, 9, 14, 19, 24, 29).
The midpoint of a class is found by adding the lower and upper class limits and dividing the sum by 2.

Example 3: A Frequency Distribution of Family Income
The following set of data represents the family income (in thousands of dollars, rounded to the nearest hundred) of 15 randomly selected families.

46.5  65.2  35.5  31.8  52.4
40.3  45.8  44.6  39.8  44.7
53.7  56.3  40.9  48.8  50.7

Construct a frequency distribution with a first class of 31.5–37.6.
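One way to carry out the grouping in Example 3 is sketched below in Python. It is an illustration rather than the textbook's method. It assumes that the second class begins at 37.7, so each class is 6.2 wide, and it stores values in tenths as integers to avoid floating-point rounding at class boundaries. The names `class_index` and `frequency_distribution` are made up for this sketch.

```python
from collections import Counter

# Family incomes from Example 3, in thousands of dollars.
incomes = [46.5, 65.2, 35.5, 31.8, 52.4, 40.3, 45.8, 44.6,
           39.8, 44.7, 53.7, 56.3, 40.9, 48.8, 50.7]

FIRST_LOWER = 315  # lower limit of the first class (31.5), in tenths
CLASS_WIDTH = 62   # assumed width: 37.7 - 31.5 = 6.2, in tenths


def class_index(value):
    """Return the 0-based index of the class that contains value."""
    tenths = round(value * 10)
    return (tenths - FIRST_LOWER) // CLASS_WIDTH


def frequency_distribution(data):
    """Return a list of (lower limit, upper limit, frequency) rows."""
    counts = Counter(class_index(x) for x in data)
    rows = []
    for i in range(max(counts) + 1):
        lower = (FIRST_LOWER + i * CLASS_WIDTH) / 10
        upper = (FIRST_LOWER + (i + 1) * CLASS_WIDTH - 1) / 10
        rows.append((lower, upper, counts.get(i, 0)))
    return rows


for lower, upper, freq in frequency_distribution(incomes):
    print(f"{lower:.1f}-{upper:.1f}: {freq}")
```

Under these assumptions the classes do not overlap, all have the same width, and every income falls in exactly one class, as the rules for grouped data require.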
Solution: First sort the data (from smallest to largest).

Histograms
A histogram is a graph with observed values on its horizontal scale and frequencies on its vertical scale.

Example 4: Construct a Histogram
Construct a histogram of the frequency distribution developed in Example 1.

Frequency Polygon
Frequency polygons are line graphs with scales the same as those of the histogram; that is, the horizontal scale indicates observed values and the vertical scale indicates frequency.

Example 5: Construct a Frequency Polygon
Construct a frequency polygon of the frequency distribution in Example 1.
Comment: two extra points must be added, one at the left end and one at the right end. Both lie on the horizontal axis.

Stem-and-Leaf Display
A stem-and-leaf display is a tool that organizes and groups the data while allowing us to see the actual values that make up the data.

Example 8: Constructing a Stem-and-Leaf Display
The table below indicates the ages of a sample of 20 guests who stayed at Captain Fairfield Inn Bed and Breakfast. Construct a stem-and-leaf display.

29  31  39  43  56
60  62  59  58  32
47  27  50  28  71
72  44  45  44  68

Solution
Stem   Leaves
2      9 7 8
3      1 9 2
4      3 7 4 5 4
5      6 9 8 0
6      0 2 8
7      1 2

Example 9: Circus Performances
Eight hundred people who attended a Ringling Bros. and Barnum & Bailey Circus were asked to indicate their favorite performance. The circle graph shows the percentage of respondents that answered tigers, elephants, acrobats, jugglers, and other. Determine the number of respondents for each category.

Solution
Tigers: 304
Elephants: 208
Acrobats: 136
Jugglers: 112
Other performance: 40

Section 13.4: What You Will Learn
1. Averages: mean, median, mode, and midrange
2. Measures of position: percentiles and quartiles

Measures of Central Tendency
An average is a number that is representative of a group of data. There are at least four different averages:
• Mean
• Median
• Mode
• Midrange
Each will result in a number near the center of the data; therefore, averages are referred to as measures of central tendency.

Mean (or Arithmetic Mean)
The mean, \(\bar{x}\), is the sum of the data divided by the number of pieces of data:
\[ \bar{x} = \frac{\sum x}{n} \]
where \(\sum x\) is the sum of all the data and \(n\) is the number of pieces of data.

Example 1: Determine the Mean
Determine the mean age of a group of patients at a doctor’s office if the ages of the individuals are 28, 19, 49, 35, and 49.
\[ \bar{x} = \frac{180}{5} = 36 \]

Median
The median is the value in the middle of a set of ranked data.

Example 2: Determine the Median
Determine the median age of a group of patients at a doctor’s office if the ages of the individuals are 28, 19, 49, 35, and 49.
Median = 35
Comment: there is an odd number of pieces of data, so the median is the middle value of the ranked data 19, 28, 35, 49, 49.

Example 3: Determine the Median of an Even Number of Pieces of Data
Determine the median of the following set of data.
a) 9, 14, 16, 17, 11, 16, 11, 12
Median = 13 (the two middle values of the ranked data are 12 and 14, and their mean is 13)

Mode
The mode is the piece of data that occurs most frequently.

Example 4: Determine the Mode
Determine the mode age of a group of patients at a doctor’s office if the ages of the individuals are 28, 19, 49, 35, and 49.
Mode = 49

Midrange
The midrange is the value halfway between the lowest (L) and highest (H) values in a set of data.
\[ \text{Midrange} = \frac{\text{lowest value} + \text{highest value}}{2} \]

Example 5: Determine the Midrange
Determine the midrange age of a group of patients at a doctor’s office if the ages of the individuals are 28, 19, 49, 35, and 49.
Midrange = (19 + 49)/2 = 34

Measures of Position
Measures of position are often used to make comparisons. Two measures of position are percentiles and quartiles.

Percentiles
There are 99 percentiles dividing a set of data into 100 equal parts.
A score in the nth percentile means that you outperformed about n% of the population who took the test and that about (100 – n)% of the people taking the test performed better than you did.

Quartiles
Quartiles divide data into four equal parts.

To Determine the Quartiles of a Set of Data
1. Order the data from smallest to largest.
2. Q2 = the median. Q2 divides the ranked data into a lower half and an upper half.
3. Q1 = the median of the lower half of the data.
4. Q3 = the median of the upper half of the data.
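The four averages and the quartile procedure above can be sketched in Python as follows. This is a minimal illustration, not the textbook's own method. The helper names `midrange` and `quartiles` are made up for this sketch, and it assumes that when the number of data values is odd, the median itself is left out of both halves.

```python
from statistics import mean, median, multimode


def midrange(data):
    """Value halfway between the lowest and highest values."""
    return (min(data) + max(data)) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves procedure."""
    ranked = sorted(data)
    half = len(ranked) // 2
    lower = ranked[:half]
    # Assumption: for an odd count, the median is excluded from both halves.
    upper = ranked[half + 1:] if len(ranked) % 2 else ranked[half:]
    return median(lower), median(ranked), median(upper)


ages = [28, 19, 49, 35, 49]               # patient ages from the examples
print("mean:", mean(ages))
print("median:", median(ages))
print("mode(s):", multimode(ages))        # a list, since ties are possible
print("midrange:", midrange(ages))

values = [9, 14, 16, 17, 11, 16, 11, 12]  # even number of pieces of data
print("median:", median(values))
print("quartiles:", quartiles(values))
```

Statistical software often computes quartiles by interpolation instead, so its Q1 and Q3 may differ slightly from the values this procedure gives.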
Example 8: Finding Quartiles
Electronics World is concerned about the high turnover of its sales staff. A survey was done to determine how long (in months) the sales staff had been in their current positions. The responses of 27 sales staff follow. Determine Q1, Q2, and Q3.

25   3   7  15  31  36  17  21   2
11  42  16  23  16  21   9  20   5
 8  12  27  14  39  24  18   6  10

Q2 = 16, Q1 = 9, Q3 = 24

Section 13.5: What You Will Learn
• Range
• Standard Deviation

Measures of Dispersion
Range and standard deviation are measures of dispersion. Measures of dispersion are used to indicate the spread of the data.

Range
The range is the difference between the highest and lowest values; it indicates the total spread of the data.
Range = highest value – lowest value

Example 1: Determine the Range
The amount of caffeine, in milligrams, of 10 different soft drinks is given below. Determine the range of these data.
38, 43, 26, 80, 55, 34, 40, 30, 35, 43
Range = 80 – 26 = 54 milligrams

Standard Deviation
The standard deviation measures how much the data differ from the mean. It is symbolized with s when it is calculated for a sample, and with σ (Greek letter sigma) when it is calculated for a population.

The standard deviation, s, of a set of data can be calculated using the following formula.
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]

To Find the Standard Deviation of a Set of Data
1. Find the mean of the set of data.
2. Make a chart having three columns: Data, Data – Mean, (Data – Mean)².
(Steps 3 – 6: see book.)
7. Divide the sum obtained in Step 6 by n – 1, where n is the number of pieces of data.
8. Determine the square root of the number obtained in Step 7. This number is the standard deviation of the set of data.

Example 3: Determine the Standard Deviation of Stock Prices
The following are the prices of nine stocks on the New York Stock Exchange. Determine the standard deviation of the prices.
$17, $28, $32, $36, $50, $52, $66, $74, $104

Solution: The mean is \(\bar{x} = 51\). (See Excel.) Use the formula
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{5836}{9 - 1}} = \sqrt{729.5} \approx 27.01 \]
The standard deviation, to the nearest cent, is $27.01.

Another method: using a TI-84, S = 27.01.

Section 13.6: What You Will Learn
• Rectangular Distribution
• J-shaped Distribution
• Bimodal Distribution
• Skewed Distribution
• Normal Distribution
• z-Scores

Rectangular Distribution
All the observed values occur with the same frequency.

J-shaped Distribution
The frequency is either constantly increasing or constantly decreasing.

Bimodal Distribution
Two nonadjacent values occur more frequently than any other values in a set of data.

Skewed Distribution
A skewed distribution has more of a “tail” on one side than the other.
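The range and standard deviation calculations from Section 13.5 can be checked with a short Python sketch. It is an illustration under the sample formula given there (dividing by n – 1). The function names `data_range` and `sample_std` are made up for this sketch, and `statistics.stdev` from the standard library serves as an independent cross-check.

```python
import math
import statistics


def data_range(data):
    """Difference between the highest and lowest values."""
    return max(data) - min(data)


def sample_std(data):
    """Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = [(x - mean) ** 2 for x in data]
    return math.sqrt(sum(squared_deviations) / (n - 1))


caffeine = [38, 43, 26, 80, 55, 34, 40, 30, 35, 43]  # milligrams
prices = [17, 28, 32, 36, 50, 52, 66, 74, 104]       # dollars

print("range:", data_range(caffeine))
print("s (by formula):", round(sample_std(prices), 2))
print("s (statistics.stdev):", round(statistics.stdev(prices), 2))
```

For a population standard deviation (σ), the divisor would be n rather than n – 1; `statistics.pstdev` computes that version.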
Skewed Distribution
[Figure: smoothing the histograms of the skewed distributions to form curves]
[Figure: the relationship between the mean, median, and mode for curves that are skewed to the right and to the left]

Normal Distribution
The most important distribution is the normal distribution.

Properties of a Normal Distribution
• The graph of a normal distribution is called the normal curve.
• The normal curve is bell shaped and symmetric about the mean.
• In a normal distribution, the mean, median, and mode all have the same value and all occur at the center of the distribution.

Empirical Rule
• Approximately 68% of all the data lie within one standard deviation of the mean (in both directions).
• Approximately 95% of all the data lie within two standard deviations of the mean (in both directions).
• Approximately 99.7% of all the data lie within three standard deviations of the mean (in both directions).

z-Scores
z-scores (or standard scores) determine how far, in terms of standard deviations, a given score is from the mean of the distribution.

The formula for finding z-scores (or standard scores) is
\[ z = \frac{\text{value of piece of data} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma} \]

Example 2: Finding z-scores
A normal distribution has a mean of 80 and a standard deviation of 10. Find z-scores for the following values.
a) 90: z = 1
b) 95: z = 1.5
c) 80: z = 0
d) 64: z = −1.6

To Determine the Percent of Data Between any Two Values
Look up the percent that corresponds to each z-score in Table 13.7.
a) For a negative z-score, use Table 13.7(a).
b) For a positive z-score, use Table 13.7(b).
c) When finding the percent of data to the right of a z-score, either
   (1) use the complement: area to the right of the z-score = 1 – area to the left of the z-score, or
   (2) use symmetry: area to the right of the z-score = area to the left of the negative of the z-score.
d) When finding the percent of data between two z-scores, subtract the smaller percent from the larger percent.
Finally, change the areas you found to percents.

Example 5: Horseback Rides
Assume that the length of time for a horseback ride on the trail at Triple R Ranch is normally distributed with a mean of 3.2 hours and a standard deviation of 0.4 hour.
a) What percent of horseback rides last at least 3.2 hours? 50%
b) What percent of horseback rides last less than 2.8 hours? 15.87%
c) What percent of horseback rides are at least 3.7 hours? 10.56%
d) What percent of horseback rides are between 2.8 hours and 4.0 hours? 81.85%
e) In a random sample of 500 horseback rides at Triple R Ranch, how many are at least 3.7 hours? Approximately 53 horseback rides last at least 3.7 hours.
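The z-score formula and the horseback-ride calculations can be reproduced without a printed table. The Python sketch below is an illustration, not the slides' method: in place of Table 13.7 it computes the area to the left of a z-score from the error function `math.erf`. The function names `z_score` and `percent_below` are made up for this sketch.

```python
import math


def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


def percent_below(z):
    """Percent of a normal distribution lying to the left of z."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))


MEAN, SD = 3.2, 0.4  # horseback ride length in hours

at_least_3_2 = 100 - percent_below(z_score(3.2, MEAN, SD))  # complement rule
below_2_8 = percent_below(z_score(2.8, MEAN, SD))
at_least_3_7 = 100 - percent_below(z_score(3.7, MEAN, SD))  # complement rule
between = percent_below(z_score(4.0, MEAN, SD)) - below_2_8  # larger - smaller
rides = 500 * at_least_3_7 / 100

print(f"at least 3.2 h: {at_least_3_2:.2f}%")
print(f"less than 2.8 h: {below_2_8:.2f}%")
print(f"at least 3.7 h: {at_least_3_7:.2f}%")
print(f"between 2.8 h and 4.0 h: {between:.2f}%")
print(f"rides of at least 3.7 h out of 500: about {round(rides)}")
```

Because a printed table rounds each area to four decimal places, a table-based answer can differ from the computed one in the last digit. For part (d), subtracting rounded table areas gives 0.9772 – 0.1587 = 0.8185, while the unrounded difference is about 0.8186.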