Practice Exercises For QA251 Elementary Statistics
Professor K. Leppel

Introduction & Data Collection

1. Suppose you want to know the percentage of US citizens who deliberately underpay their federal income taxes. You conduct a survey in which you ask 1000 randomly selected people the question, “Do you cheat on your federal income taxes?” Would you expect that your resulting measurement would be subject to systematic error, sampling (random) error, or both? Explain.

2. Would the variable race (with options: White, Black, Asian, Native American) be classified as a qualitative/categorical variable or a quantitative/numerical variable? If it is qualitative/categorical, is it nominal or ordinal? If it is quantitative/numerical, is it interval or ratio?

3. Would the variable “distance individual works from home” be classified as a qualitative/categorical variable or a quantitative/numerical variable? If it is qualitative/categorical, is it nominal or ordinal? If it is quantitative/numerical, is it interval or ratio?

Descriptive Statistics

1. Consider the following data for a sample of twenty students. The numbers represent the number of hours worked per week by students who were employed.

   6 6 6 6 8 9 9 10 10 10 13 16 20 20 20 20 24 27 30 33

   a. Determine the mean, median, and mode.
   b. Determine the range, mean absolute deviation, variance, standard deviation, and coefficient of variation.
   c. What would the variance and standard deviation be if these data represented the whole population instead of just a sample?
   d. THINK ABOUT IT: Suppose you got an answer of 3 for the mean of this data. Without redoing your calculations, how do you know that this answer cannot be correct?

2. Consider the following hypothetical distribution of monthly salaries for a population of 1000 members of a particular graduating class for a particular college.

   salaries         frequency
   [ 750,  850)     250
   [ 850,  950)     300
   [ 950, 1050)     200
   [1050, 1150)     150
   [1150, 1250)     100

   a. Compute the mean and median.
      What is the modal category?
   b. Compute the mean absolute deviation, variance, and standard deviation.
   c. What would the variance and standard deviation be if these data represented just a sample instead of the whole population?
   d. Draw a histogram of the distribution, showing the absolute and relative frequencies on the same graph.
   e. THINK ABOUT IT: Suppose you got an answer of 2500 for the median of this data. Without redoing your calculations, how do you know that this answer cannot be correct?

Probability

1. A student is taking a quiz. A question asks, “Which 5 of the following 10 organisms fit in the biological category discussed in the chapter?” The student did not read the chapter and decides to guess. From how many different answers can he pick?

2. A student is taking a multiple-choice quiz. There are four questions. The first two questions each have four choices (a,b,c,d); the other two questions each have 5 choices (a,b,c,d,e). The student is clueless and decides to guess. From how many different possible sets of answers to the four questions can she pick?

3. A student is taking a matching quiz. There are 9 questions and 9 answers. Each answer is used exactly once. The student has no idea and decides to guess. From how many different possible sets of answers to the 9 questions can he pick?

4. Suppose a group consists of 5 students. Three students are selected at random to do a presentation. How many different sets of presenters are possible?

5. Suppose a group consists of 5 students. If 3 students in the group must do a presentation and each of the 3 students in the group must do a particular part of a presentation (for example: introduction, analysis, and conclusion) and it matters who is doing what part, how many different possibilities can be chosen from the 5 students?

6. Suppose a class consists of 3 sophomores, 4 juniors, and 2 seniors (no freshmen). We want one person from each of the class years to do a presentation.
   How many different sets of presenters are there?

7. Consider the following joint distribution of students with tattoos and body piercings.

                  piercings
   tattoos     yes       no        total
   yes         0.1250    0.0625    0.1875
   no          0.4375    0.3750    0.8125
   total       0.5625    0.4375    1.0000

   a. What is the probability that a randomly selected student has both piercings and tattoos?
   b. What is the probability that a randomly selected student has piercings (either with or without tattoos)?
   c. What is the probability that a randomly selected student has tattoos (either with or without piercings)?
   d. What is the probability that a randomly selected student has piercings or tattoos (or both)?
   e. If a randomly selected student has tattoos, what is the probability the student has piercings?
   f. If a randomly selected student has piercings, what is the probability the student has tattoos?
   g. Are piercings and tattoos mutually exclusive events? Explain.
   h. Are piercings and tattoos independent events? Explain.
   i. THINK ABOUT IT: Suppose that for the probability that a randomly selected student has tattoos, you got an answer of -0.2. How do you know that this answer cannot be correct?

8. a. Consider a set of students. Suppose that 56% have piercings. Of those with piercings, 22% have tattoos. Of those without piercings, 14% have tattoos. If a randomly selected student has tattoos, what is the probability that the student also has piercings? Use the table below to work out the problem. (Notice that this is the same question as 7e but the information is provided differently.)

   Workspace

   Piercings (P)      Pr(P)    Pr(T | P)    Pr(T∩P) = Pr(T | P) Pr(P)    Pr(P | T) = Pr(T∩P) / Pr(T)
   piercings - yes
   piercings - no

   b. THINK ABOUT IT: Suppose that for the answer to part 8a, you got 7. How do you know that this answer cannot be correct?

9.
   Suppose that at a particular university, 15% of the students are majoring in the Business school, 20% are in the Engineering school, and the remaining students are in the other colleges/schools of the university. Suppose also that 20% of the Business students are female, 10% of the Engineering students are female, and 60% of the students in the other colleges/schools are female. If a randomly selected student is female, what is the probability that the student is majoring in (a) the Business school, (b) the Engineering school, (c) the other colleges/schools? (Hint: Set up a table similar to the one in question 8 to solve this problem.)

   Workspace

Discrete Random Variables & Probability Distributions

1. Suppose the following is the joint distribution of two random variables X and Y.

               X
   Y        0       1       total
   0        0.49    0.21    0.70
   1        0.21    0.09    0.30
   total    0.70    0.30    1.00

   a. Determine the mean and variance of X.
   b. Determine the mean and variance of Y.
   c. Determine the expected value of XY.
   d. Determine the covariance of X and Y.
   e. Determine the correlation coefficient of X and Y.
   f. THINK ABOUT IT: Suppose that for the answer to part e, you got 2. How do you know that this answer cannot be correct?

2. On average, two customers enter a candy shop per minute. What is the probability that in a given minute,
   a. exactly one customer will enter the shop?
   b. at least one customer will enter the shop?
   c. at most one customer will enter the shop?
   d. THINK ABOUT IT: Suppose that for the answer to part a, you got 0.3 and for part b, you got 0.2. How do you know that one or both of these answers is not correct?

3. Suppose you are taking a quiz and there is a multiple choice question on which you are clueless. You close your eyes and pick an answer at random. If there are five choices, what is the probability that you guess correctly?

4. A bowl contains 20 candies; 15 are chocolate and 5 are vanilla. You select 5 at random. What is the probability that all 5 are chocolate?

5.
   A bowl contains 200 candies; 150 are chocolate and 50 are vanilla. You select 5 at random. What is the probability that all 5 are chocolate?

6. A bowl contains 200 candies; 100 are chocolate, 50 are vanilla, and 50 are strawberry. You select 5 at random. What is the probability that all 5 are chocolate?

Exercises using the Continuous Uniform, Z, and t distributions

1. Suppose that the scores on a standardized test at a very large university can be approximated by a continuous uniform distribution with a low score of 45 and a high score of 95.
   a. Sketch the distribution.
   b. What proportion of the scores were between 60 and 80?
   c. What is the mean of the scores?
   d. What are the variance and standard deviation of the scores?
   e. THINK ABOUT IT: Suppose when you drew your distribution in part a, you had a rectangle that went from 45 to 95 on the horizontal axis and had a height of 1. How do you know that this graph cannot be correct?

2. Suppose that 60% of the people in a large population support a particular ballot referendum. If a random sample of 2400 people is taken, what is the probability that at least 1500 support the referendum? Use the normal approximation to the binomial to calculate this probability. Remember to use the continuity correction.

3. Suppose the grades in a class are normally distributed with a mean of 75 and a standard deviation of 6. What is the probability that the grade of a randomly selected student will be more than 81?

4. Suppose that the average grade in a class of 1000 is 75. A sample of 16 observations is taken. The standard deviation of class grades is 6. What is the probability that the sample mean is more than 79.42?

5. Suppose that the average grade in a class of 1000 is 75. A sample of 16 observations is taken. The standard deviation of grades in the sample is 6. What is the probability that the sample mean is more than 79.42?

6. Suppose that the average grade in a class of 100 is 75. A sample of 16 observations is taken.
   The standard deviation of class grades is 6. What is the probability that the sample mean is more than 78.595?

7. Suppose that the average grade in a class of 100 is 75. A sample of 16 observations is taken. The standard deviation of grades in the sample is 6. What is the probability that the sample mean is more than 78.595?

Exercises on Confidence Intervals and Determining Appropriate Sample Size

I. Consider the following sample data on GPAs of the students in a previous statistics class. Use the data to answer questions 1 to 5 below.

   males: females: 2.5 3 2.101 2.8 2.2 2.5 2.5 2.9 2.3 2.28 2.4 3.26 2.9 3.5 2.7 2.9 2.567 3 2.7 3 2.8 3.087 2.6 2.5 2.5 2.16 1.9 3.3 3.87 2.8 2.09 2.8 3.5 2.5

1. Using the sample of data on male GPAs, calculate the 90% confidence interval for the average GPA for the entire male population. Assume that you know that the standard deviation of the population is 0.44. Do the same for the female population.

2. Using the sample of data on male GPAs, calculate the 90% confidence interval for the average GPA for the entire male population. Do the same for the female population.

3. Calculate the 90% confidence interval for the difference in the mean GPAs for the male and female populations. Assume that you know that the standard deviations of both populations are 0.44.

4. Calculate the 90% confidence interval for the difference in the mean GPAs for the male and female populations.

5. Calculate the 90% confidence interval for the difference in the mean GPAs for the male and female populations. Assume that the standard deviations of the populations are believed equal.

II. Use the following information to answer questions 6 and 7. Suppose you have a sample of 230 males and 110 females. Suppose also that 26.1% of the male sample has a GPA of 3.0 or better and that 27.3% of the female sample has a GPA of 3.0 or better.

6. Calculate the 90% confidence interval for the population proportion of male students with a GPA of 3.0 or better.
   Do the same for females.

7. Calculate the 90% confidence interval for the difference in the population proportions of male and female students with a GPA of 3.0 or better.

8. THINK ABOUT IT: Suppose that in problem 1, for the males, for the sample mean GPA you got 2.7 and for the 90% confidence interval for the population mean you got 2.8 < μ < 3.2. How do you know that your numbers cannot be correct?

III. Determining Appropriate Sample Size

9. Suppose you plan to estimate the mean of a particular population. The variance is known to be 125. If you want to be 95% certain that your estimate is within 5 units of the correct value, how many observations should you sample?

10. Suppose you plan to estimate the proportion of a particular population with a specified characteristic. You want to be 95% certain that your estimate is within 0.04 of the correct proportion.
   a. If you have no idea what the population proportion is, how many observations should you sample?
   b. If you think that the population proportion might be about 0.25, how many observations should you sample?
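
For checking hand calculations in section III, the following is a minimal Python sketch of the usual sample-size formulas, n = z^2 * variance / E^2 for a mean with known variance and n = z^2 * p(1 - p) / E^2 for a proportion, where E is the desired margin of error. The function names (z_critical, sample_size_for_mean, sample_size_for_proportion) are hypothetical and not part of the course materials. The sketch also assumes a two-sided normal critical value and rounds up to the next whole observation; a z-table value such as 1.96 should lead to the same rounded result in most cases.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value (about 1.96 for 0.95)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_mean(variance, margin, confidence=0.95):
    """Smallest n with z * sqrt(variance / n) <= margin (variance known)."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * variance / margin ** 2)


def sample_size_for_proportion(margin, p=0.5, confidence=0.95):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin.

    p = 0.5 maximizes p * (1 - p), so it is the conservative default
    when nothing is known about the population proportion.
    """
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    # Inputs taken from questions 9, 10a, and 10b.
    print(sample_size_for_mean(variance=125, margin=5))
    print(sample_size_for_proportion(margin=0.04))
    print(sample_size_for_proportion(margin=0.04, p=0.25))
```

Rounding up rather than to the nearest integer is deliberate: rounding down would leave the margin of error slightly larger than requested.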