* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download REVIEW - CBSD.org
Survey
Document related concepts
Transcript
NAME: _______________________________ Ch. 1 & 2 review packet MULTIPLE CHOICE: 1. Were the extinctions that occurred in the last ice age more frequent among species of animals with large body sizes? A researcher gathers data on the average body mass (in kilograms) of all species known to have existed at that time. These measurements are values of (a) a categorical variable. (b) a quantitative variable. (c) an invalid variable. (d) a margin of error. 2. An example of a categorical variable is (a) the name of the college a student attends. (b) a student's weight in kilograms. (c) a student's class rank, such as 25th out of 364. (d) a student's sex (male or female). (e) Both (a) and (d). The stock market did well during the 1990s. Here are the percent total returns (change in price plus dividends paid) for the Standard & Poor's 500 stock index: The next five questions are related to this situation. 3. The median return during this period is (a) 5.5 (b) 20.07 (c) 23.0 (d) 25.8 (e) 28.6 4. The third quartile of these returns is (a) 7.6 (b) 30.5 (c) 31.1 (d) 31.7 (e) 33.4 5. The mean return is (a) 20.07 (b) 20.69 (d) 25.8 (e) 33.4 (d) 20.07 (e) 25.8 (c) 22.3 6. The standard deviation of the returns is (a) 13.75 (b) 13.98 (c) 14.74 7. A well-drawn histogram should have (a) bars all the same width (b) no space between bars (unless a class has no observations) (c) a clearly marked vertical scale (d) all of these Here is a table of the undergraduate enrollment at a large state university, broken down by class: 8. To make a correct graph of the distribution of students by class, you could use (a) a bar graph. (b) a pie chart. (c) a histogram. (d) all of (a), (b), and (c). (e) (a) or (b), but not (c). 9. For a distribution that is skewed to the right, usually (a) the mean will be larger than the median (b) the median will be larger than the mean (c) the first quartile will be larger than the third quartile (d) the standard deviation will be negative (e) the minimum will be larger than the maximum 10. Here is a set of data: 1300, 18, 25, 19, -7, 24. Which observation is the outlier? (a) 1300 (b) 25 (c) 19 (d) -7 (e) 24 11. You have calculated that the value of the standard deviation for a set of data is -2.5. You can conclude that (a) the data are skewed to the left (b) the data are skewed to the right (c) there is very little variation in the data (d) all of the observations in the data set are negative (e) you made an error in your calculations 12. You calculate that the standard deviation of a set of observations is 0. This tells you that (a) you made an arithmetic mistake. (b) all the observations have the same value. (c) there is no straight-line association. (d) the mean must also be 0. (e) Both (b) and (d) are true. 13. The five-number summary of a distribution consists of (a) mean, median, standard deviation, and two quartiles. (b) minimum, maximum, mean, median, and standard deviation. (c) minimum, maximum, median, and two quartiles (d) mean, standard deviation, correlation, and two quartiles 14. The standard deviation is a measure of (a) the center of a distribution (b) the variability of a distribution (c) the association between two variables (d) the standardized value of a variable 15. The box in the center of a boxplot marks (a) the full range covered by the data (b) the range covered by the middle half of the data (c) the range covered by the middle three-quarters of the data (d) the span one standard deviation on each side of the mean (e) the span two standard deviations on each side of the mean 16. In a class of 31 students, how many would be above the third quartile? (a) 31 (b) 16 (c) 15 (d) 8 (e) 7 FREE RESPONSE: 1) What is a distribution of a variable? A picture that shows what values the variable takes and how often it takes those values 2) What types of values can standard deviation take? Positive values from 0 to infinity 3) If I have a standard deviation of 0, what does this mean? All the data points are the same number (like 5, 5, 5, 5, 5, 5, 5) 4) What percentage of the data is above Q3? 25% 5) What percentage of the data is below Q1? 25% 6) What percentage of the data is inside the box of a boxplot? 50% 7) What is a census? When a researcher attempts to contact every individual in the population 8) What is the difference between an observational study & an experiment? Observational study does not influence/change the individuals in any way- just looks for information that already exists. Experiments impose a treatment on the individuals, changing them in some way in order to measure the response. 9) Identify the following variables as categorical or quantitative a. Number of calculator games on each students calculator b. What type of car each student drives (brand of car, Toyota, Honda, etc.) c. GPA d. Number of mp3 songs each student has on their computer e. Whether a student is for or against the death penalty f. Amount of time spend studying for the Ch. 4 test g. Letter grade on the chapter 8 test h. Number of majors offered at a given college 10) Quant Cat Quant Quant Cat Quant Cat Quant The following is info on hotels in the area. exercise internet room room price restaurants service room rate Hotel Pool Comfort Inn out y 0 1 y $149 Fairfield Inn in y 0 1 n $119 Baymont Inn out y 0 1 y $60 Chase Hotel out n 15 0 n $139 Courtyard Marriott in y 0 1 dinner $114 Hilton in y 10 2 y $156 Best Western in y 9.95 2 y $145 a. Identify the individuals HOTELS b. Identify the variables measured and label each as categorical or quantitative Pool Categorical Exercise room Cat Internet Price Quant Restaurants Quant Room service Cat Room rate Quant 11) For the following, identify the population, sample, individuals, and variables as clearly as possible: a. A truckload of apples arrives at an apple juice production plant. The plant’s quality control team selects 3 large buckets of apples from various locations in the truck. These apples are inspected carefully, and the entire truckload is either accepted or rejected based on the 3 buckets. Population = All apples that arrive at the plant Sample = the 3 large buckets taken Individuals = apples Variables = the quality of the apple b. I decide to survey local residents about their thoughts on the President. I send out 100 surveys to local households in Bucks County and ask the adults to fill it out and send it back. Population = All Bucks County residents Sample = the 100 adults in the chosen households Individuals = adults Variables = thoughts on the president c. A study on cancer patients took a random sample of 80 patients from the lung cancer ward at Temple University Hospital. They then tested their reactions to a new drug to see if there were any improvements on their conditions. Population = All cancer patients Sample = the 80 selected lung cancer patients at Temple University Hospital Individuals = cancer patients Variables = their condition 12) Identify the following as an experiment or an observational study: a. Before a new variety of frozen muffins is put on the market, it is subjected to extensive taste testing. People are asked to taste the new muffin and a competing brand and say which one they prefer. EXPERIMENT b. A teacher wants to know about how her students spend their time after school. She passes out a survey that asks numerous questions about this topic. The survey is anonymous. OBSERVATIONAL STUDY c. Teenage drivers have long been blamed for being bad drivers, and prone to accidents. To decide whether this is a fair statement, a teenage statistics student goes to a local police precinct and obtains a list of all accidents in the past 3 years. She then selects 100 of these accidents and looks to see the age of the person at fault (teenage or adult). OBSERVATIONAL STUDY d. There has long been a debate about whether whitening toothpaste does as good of a job as whitening strips. A researcher has 20 people use whitening strips for 5 days and another 20 people use whitening toothpaste for 30 days, and then compares their beginning tooth shade to their new tooth shade. EXPERIMENT 13) For each graph state whether the mean is higher, lower, or about the same as the median. Explain. HIGHER HIGHER ABOUT THE SAME 14) Below is a two-way table for Gender vs. View of Death Penalty for a sample. a. b. c. d. e. f. What percent of the class are males? 12/28 = 42.86% What percent of the class are For the death penalty? 19/28 = 67.86% What percent of the class are female and For the death penalty? 10/28 = 35.71% What percent of the females are For the death penalty? 10/16 = 62.5% What percent of the class are females, given that they are For the d.p.? 10/19 = 52.63% What percent of the females are Against the death penalty? 6/16 = 37.5% 15) Below is a Two-Way table for Gender vs Hair Color a. b. c. d. e. What percent of the males have brown hair? 10/14 = 71.43% What percent of those that have brown hair are males? 10/20 = 50% What percent of the class are males and have brown hair? 10/34 = 29.41% What percent of females have blond hair? 7/20 = 35% Of only the males find the percent of each hair color. Brown Black Blond Males 1/14 = 7.14% 3/14 = 21.43% 10/14 = 71.43% f. Of only the females find the percent of each hair color. Brown Black Females 3/20 = 15% 7/20 = 35% Blond 10/20 = 50% 16) Given the following info about the mean and median, decide the shape of the distribution of the data, and pick a measure of center to report. a. Mean = 170 Median = 80 Shape = RIGHT SKEW Center = MEDIAN b. Mean = 100 Median = 180 Shape = LEFT SKEW Center = MEDIAN c. Mean = 79 Median = 85 Shape = SYMMETRIC Center = MEAN 17) Below is a list of the number of pieces of Halloween candy that 50 randomly selected children got on Halloween night. This data is in the list CANDY. 62 62 63 65 66 66 67 70 71 72 72 73 73 73 74 75 75 76 76 76 76 77 77 77 78 78 78 79 80 80 81 81 82 82 85 86 91 88 95 89 96 89 99 91 100 106 111 112 120 130 a. Create a relative frequency histogram of the data below. Be sure to label all your axes! 42% 22% 14% 60 70 80 90 CANDY 100 200 300 b. Based on the shape of the distribution, what measures of center & spread would you report? Since the distribution is right skewed, I would report the MEDIAN & IQR as the center & spread. c. Calculate the following statistics mean st. dev. n min 82.02 14.78 50 62 Q1 med Q3 max IQR 73 78 89 130 16 d. Test for outliers using the 1.5*IQR rule. Show all work, and list any outliers you find. A = 73 – (1.5*16) = 49 B = 89 + (1.5*16) = 113 Normal data = (49, 113) Outliers: 120, 130 e. Create a boxplot of the data. 62 73 78 89 CANDY 112 120 130 f. Describe the distribution below (shape, center, spread, etc.) Right skewed, unimodal. Center at median of 78 pieces, IQR of 16 pieces, and range of (62, 130) pieces. There are two outliers at 120 and 130 pieces. 18) The data below is in the list DATA2. 82 98 106 108 85 99 106 109 89 100 106 110 90 102 107 111 90 103 108 111 112 112 113 113 113 115 116 116 118 118 119 119 119 119 119 122 129 108 111 118 122 91 108 112 118 122 95 126 113 124 103 127 115 126 104 120 121 a. Create a stemplot, splitting your stems. (Sort the list in the calculator to help you) Key: 8|2 = 82 b. Calculate the following statistics mean st. dev. n min 110.46 10.94 57 82 Q1 med Q3 max IQR 105 112 119 129 14 c. Describe the distribution (shape, center, spread) Left Skewed, unimodal. Center at median of 112 units, IQR of 14 units, and range of (82, 129) units. 19) The data in the lists YOGA3 and YOGA8 (below) are the ages of participants in a 3pm & 8pm Yoga class. YOGA3 35 57 68 75 68 53 64 73 84 38 59 68 76 75 55 68 75 85 40 60 69 77 51 62 73 77 YOGA8 18 19 22 28 56 18 20 24 31 44 18 19 23 29 71 18 20 25 31 44 18 19 24 29 43 19 20 25 39 19 a. Find the following summary statistics for the 3pm class and the 8pm class. mean st. dev. n min Q1 med Q3 max IQR YOGA3 64.81 13.38 26 35 57 68 75 85 18 YOGA8 27.77 12.75 30 18 19 23.5 31 71 12 b. Create parallel boxplots of the two yoga classes. YOGA 3 YOGA 8 18 19 23.5 31 35 44 56 57 68 71 75 85 c. Write a few sentences to compare the shape, center, and spread of the two plots. The shape of the 3pm class is slightly left skewed, while the 8pm class is right skewed. Both are unimodal. The 8pm class has 2 outliers at 56 and 71 years old. The center for the 3pm class is at the median of 68 years old, which is much higher than the median of the 8pm class which is 23.5 years old. The IQR of the 3pm class is 18 years old, which is higher than the IQR of the 8pm class which is 12 years old. The range of the 3pm class is (18, 71) years old which is wider than the range of the 8pm class which is (35, 85) years old. d. What age was the top 25% of YOGA3 above? How about the top 25% of YOGA8? Top 25% of YOGA3 was above 75 years old, top 25% of Yoga8 is above 31 years old. e. Where are the middle 50% of YOGA3 ages ? How about the middle 50% of YOGA8 ages? Middle 50% of YOGA3 was btw (19, 31) and the middle 50% of YOGA8 was btw (57, 75). 20) During a budget meeting, local school board members decided to review class size information to determine if budgets were correct. Summary statistics are shown in the table. a. Find the IQR 11 students b. Check for outliers using the 1.5 rule. Show work. A = 29 – (1.5*11) = 12.5 Normal data = (12.5, 56.5) B = 40 + (1.5*11) = 56.5 NO OUTLIERS x S min Q1 med Q3 Max 33.39 5.66 17 29 36 40 48 c. Based on the statistics, what shape does the distribution have? Because the mean & median are close, the distribution is roughly symmetric d. Describe the distribution. Roughly symmetric, mean of 33.39 students, standard deviation of 5.66 students, range of (17, 48) students. 21) Describe the following distributions (shape center, spread, etc.). a. Plot1 Mean = 65 S = 18.5 Min = 12 Q1 = 34 Med = 67 Q3 = 91 Max = 117 10 20 30 40 50 60 70 80 90 Symmetric, bimodal center at the mean of 65 units standard deviation of 18.5 units range of (12, 117) units 100 110 120 b. Plot 2 Mean = 71 S = 17.8 Min = 11 Q1 = 63 Med = 91 Q3 = 103 Max = 119 10 20 30 40 50 60 70 80 90 Right skewed, unimodal center at the median of 91 units IQR of 40 units range of (11, 119) units 100 110 120 c. Plot 3 Mean = 52 S = 16.7 Min = 17 Q1 = 39 Med = 58 Q3 = 79 Max = 107 10 20 30 40 50 60 70 80 90 100 110 120 Symmetric, unimodal center at the mean of 52 units standard deviation of 16.7 units range of (17, 107) units