Math109 homework 1 solutions
CHAPTER 1 INTRODUCTION TO STATISTICS
Copyright © 2012 Pearson Education, Inc. Publishing as Prentice Hall.

3a. Descriptive statistics involve the statement “76% of women and 60% of men had a physical examination within the previous year.”
b. An inference drawn from the study is that a higher percentage of women had a physical examination within the previous year.

Section 1.1: 1-10, 11, 13, 14, 17, 19, 21, 29, 30, 32, 33, 36, 38, 40, 41, 46, 47

1.1 EXERCISE SOLUTIONS

1. A sample is a subset of a population.
2. It is usually impractical (too expensive and time consuming) to obtain all the population data.
3. A parameter is a numerical description of a population characteristic. A statistic is a numerical description of a sample characteristic.
4. Descriptive statistics and inferential statistics
5. False. A statistic is a numerical measure that describes a sample characteristic.
6. True
7. True
8. False. Inferential statistics involves using a sample to draw conclusions about a population.
9. False. A population is the collection of all outcomes, responses, measurements, or counts that are of interest.
10. False. A statistic can differ from sample to sample.
11. The data set is a population because it is a collection of the heights of all the players on a school’s basketball team.
12. The data set is a population because it is a collection of the energy collected from all the wind turbines on the wind farm.
13. The data set is a sample because the collection of the 500 spectators is a subset within the population of the stadium’s 42,000 spectators.
14. The data set is a population because it is a collection of the annual salaries of all pharmacists at a pharmacy.
15. Sample, because the collection of the 20 patients is a subset within the population
16. The data set is a population since it is a collection of the number of televisions in all U.S. households.
17. Population, because it is a collection of all golfers’ scores in the tournament
18. Sample, because only the age of every third person entering the clothing store is recorded
19. Population, because it is a collection of all the U.S. presidents’ political parties
20. Sample, because the collection of the 10 soil contamination levels is a subset in the population
21. Population: Party of registered voters in Warren County
    Sample: Party of Warren County voters responding to online survey
22. Population: All students who donate at a blood drive
    Sample: The students who donate and have type O+ blood
23. Population: Ages of adults in the United States who own cellular phones
    Sample: Ages of adults in the United States who own Samsung cellular phones
24. Population: Income of all homeowners in Texas
    Sample: Income of homeowners in Texas with mortgages
25. Population: Collection of all adults in the United States
    Sample: Collection of 1000 adults surveyed
26. Population: Collection of all infants in Italy
    Sample: Collection of 33,043 infants in the study
27. Population: Collection of all adults in the U.S.
    Sample: Collection of 1442 adults surveyed
28. Population: Collection of all people
    Sample: Collection of 1600 people surveyed
29. Population: Collection of all registered voters
    Sample: Collection of 800 registered voters surveyed
30. Population: Collection of all students at a college
    Sample: Collection of 496 students surveyed
31. Population: Collection of all women in the U.S.
    Sample: Collection of the 546 U.S. women surveyed
32. Population: Collection of all U.S. vacationers
    Sample: Collection of the 791 vacationers surveyed
33. Population: Collection of all Fortune magazine’s top 100 companies to work for
    Sample: Collection of the 85 companies who responded to the questionnaire
34. Population: Collection of all light bulbs from the day’s production
    Sample: Collection of the 20 light bulbs selected from the day’s production
35. Statistic. The value $68,000 is a numerical description of a sample of annual salaries.
36. Statistic. 43% is a numerical description of a sample of high school students.
37. Parameter. The 62 surviving passengers out of 97 total passengers is a numerical description of all of the passengers of the Hindenburg that survived.
38. Parameter. 52% is a numerical description of the total number of governors.
39. Statistic. 8% is a numerical description of a sample of computer users.
40. Parameter. 12% is a numerical description of all new magazines.
41. Statistic. 44% is a numerical description of a sample of all people.
42. Parameter. 21.0 is a numerical description of ACT scores for all graduates.
43. The statement “56% are the primary investors in their household” is an application of descriptive statistics.
    An inference drawn from the sample is that an association exists between U.S. women and being the primary investor in their household.
44. The statement “spending at least $2000 for their next vacation” is an example of descriptive statistics.
    An inference drawn from the sample is that United States vacationers are associated with spending more than $2000 for their next vacation.
45. Answers will vary.
46. (a) The volunteers in the study represent the sample.
    (b) The population is the collection of all individuals who completed the math test.
    (c) The statement “three times more likely to answer correctly” is an application of descriptive statistics.
    (d) An inference drawn from the sample is that individuals who are not sleep deprived will be three times more likely to answer math questions correctly than individuals who are sleep deprived.
47. (a) An inference drawn from the sample is that senior citizens who live in Florida have better memory than senior citizens who do not live in Florida.
    (b) It implies that if you live in Florida, you will have better memory.
48. (a) An inference drawn from the sample is that the obesity rate among boys ages 2 to 19 is increasing.
    (b) It implies the same trend will continue in future years.
49. Answers will vary.

1.2 DATA CLASSIFICATION

1.2 Try It Yourself Solutions

1a. One data set contains names of cities and the other contains city populations.
b. City: Nonnumerical
   Population: Numerical
c. City: Qualitative
   Population: Quantitative
2a. (1) The final standings represent a ranking of basketball teams.
    (2) The collection of phone numbers represents labels. No mathematical computations can be made.
b. (1) Ordinal, because the data can be put in order
   (2) Nominal, because you cannot make calculations on the data
3a. (1) The data set is the collection of body temperatures.
    (2) The data set is the collection of heart rates.
b. (1) Interval, because the data can be ordered and meaningful differences can be calculated, but it does not make sense to write a ratio using the temperatures
   (2) Ratio, because the data can be ordered, can be written as a ratio, you can calculate meaningful differences, and the data set contains an inherent zero

Section 1.2: 7-14, 21, 22, 25, 28, 30

1.2 EXERCISE SOLUTIONS

1. Nominal and ordinal
2. Ordinal, interval, and ratio
3. False. Data at the ordinal level can be qualitative or quantitative.
4. False. For data at the interval level, you can calculate meaningful differences between data entries. You cannot calculate meaningful differences at the nominal or ordinal level.
5. False. More types of calculations can be performed with data at the interval level than with data at the nominal level.
6. False. Data at the ratio level can be placed in a meaningful order.
7. Qualitative, because telephone numbers are merely labels
8. Quantitative, because the heights of hot air balloons are a numerical measure
9. Quantitative, because the body temperatures of patients are a numerical measure
10. Qualitative, because the eye colors are merely labels
11. Quantitative, because the lengths of songs on an MP3 player are numerical measures
12. Quantitative, because the carrying capacities of pickups are numerical measures
13. Qualitative, because the player numbers are merely labels
14. Qualitative, because student ID numbers are merely labels
15. Quantitative, because weights of infants are a numerical measure
16. Qualitative, because species of trees are merely labels
17. Qualitative, because the poll results are merely responses
18. Quantitative, because wait times at a grocery store are a numerical measure
19. Qualitative. Ordinal. Data can be arranged in order, but differences between data entries make no sense.
20. Qualitative. Nominal. No mathematical computations can be made and data are categorized using names.
21. Qualitative. Nominal. No mathematical computations can be made and data are categorized using names.
22. Quantitative. Ratio. A ratio of two data values can be formed so one data value can be expressed as a multiple of another.
23. Qualitative. Ordinal. The data can be arranged in order, but differences between data entries are not meaningful.
24. Quantitative. Ratio. The ratio of two data values can be formed so one data value can be expressed as a multiple of another.
25. Ordinal
26. Ratio
27. Nominal
28. Ratio
29. (a) Interval (b) Nominal (c) Ratio (d) Ordinal
30. (a) Interval (b) Nominal (c) Interval (d) Ratio
31. An inherent zero is a zero that implies “none.” Answers will vary.
32. Answers will vary.
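
The level-of-measurement answers in Section 1.2 all rest on the same cumulative rule: each level allows every operation of the level below it plus one more. The following is a minimal Python sketch of that rule. The names `Level` and `meaningful_operations` are made up for illustration and do not come from the textbook.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four levels of measurement, ordered from least to most informative."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def meaningful_operations(level):
    """Return the operations that make sense for data at the given level."""
    ops = ["categorize"]              # every level lets you label or categorize entries
    if level >= Level.ORDINAL:
        ops.append("order")           # entries can be arranged in a meaningful order
    if level >= Level.INTERVAL:
        ops.append("subtract")        # differences between entries are meaningful
    if level >= Level.RATIO:
        ops.append("divide")          # an inherent zero makes ratios meaningful
    return ops


# Body temperatures (interval) versus heart rates (ratio), as in Try It Yourself 3.
print(meaningful_operations(Level.INTERVAL))
print(meaningful_operations(Level.RATIO))
```

Read this way, exercise 5 of Section 1.2 follows directly: the interval list is strictly longer than the nominal list.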
In an o 1, 2, 4-10, 11,DATA 13, a15, 17-22, 32, 33, 35to part AND 1.3 COLLECTION EXPERIMENTAL DESIGN observational study, a researcher measures characteristics of interest of part of a population but 1.3 EXERCISE SOLUTIONS 1.3 DATA COLLECTION does not change existing conditions. AND EXPERIMENTAL DESIGN 1. In Try an experiment, a treatment is applied to part of a population and responses are observed. In an 1.3 Itincludes Yourself Solutions 2. observational A census entire population; a sample includesofonly a portion ofof thea population. 1.3 Try It Yourself study,the a Solutions researcher measures characteristics interest of part population but does not change existing conditions. 1a. (1) Focus: Effect ofevery exercise on relieving depression 3. (1) In aFocus: random sample, member of the population has an equal chance of being selected. In a 1a. Effect of exercise on relieving depression simple random sample, every possible sample of the same size has an equal chance of being 2. A census includes the entire population; a sample includes only a portion of the population. selected. (2) Focus: Success of graduates (2) Focus: Success of graduates 3. In a random sample, every member of the population has an equal chance of being selected. In a 4. Replication is the repetition of an experiment using a large group of subjects. It is important simple random sample, every possible sample of the same size has an equal chance of being because it gives validity to the results. selected. Copyright © 2012 Pearson Education, Inc. Publishing as Prentice Hall. Copyright 5. True© 2012 Pearson Education, Inc. Publishing as Prentice Hall. 4.8 Replication repetition of an using a large group of subjects. It is important CHAPTER 1is the INTRODUCTION TOexperiment STATISTICS because it gives validity to the results. 6. False. A double-blind experiment is used to decrease the placebo effect. 7. False. 
Using sampling guarantees that members of each group within a population will 8 CHAPTER 1 stratified INTRODUCTION TO STATISTICS 5. True be sampled. 7. False. Using stratified sampling guarantees that members of each group within a population will 6.8. False. AAdouble-blind experiment is used to decrease the placebo effect. False. census is a count of an entire population. be sampled. Copyright © 2012 Pearson Education, Inc. Publishing as Prentice Hall. 8. False. count of ansample, entire population. 9. False.ATocensus selectisaasystematic a population is ordered in some way and then members of the population are selected at regular intervals. 9. False. To select a systematic sample, a population is ordered in some way and then members of population selected regular intervals. Copyright © 2012 Pearsonare Education, Inc.atPublishing as Prentice Hall. 10. the True 10. True 11. Use a census because all the patients are accessible and the number of patients is not too large. 11. Use a census because all the patients are accessible and the number of patients is not too large. 12. Perform an observational study because you want to observe and record motorcycle helmet usage. 12. Perform an observational study because you want to observe and record motorcycle helmet usage. 13. In this study, you want to measure the effect of a treatment (using a fat substitute) on the human digestive system. So, you would the want to perform an experiment. 13. In this study, you want to measure effect of a treatment (using a fat substitute) on the human digestive system. So, you would want to perform an experiment. 14. It would be nearly impossible to ask every customer whether he or she would still buy a product 14. It would be nearly impossible ask every he or she data. would still buy a product with a warning label. So, youtoshould usecustomer a surveywhether to collect these with a warning label. So, you should use a survey to collect these data. 15. 
Because it is impractical to create this situation, you would want to use a simulation. 15. Because it is impractical to create this situation, you would want to use a simulation. 16. Perform an observational study because you want to observe and record how often people wash 16. Perform an observational study because you want to observe and record how often people wash their hands in public restrooms. their hands in public restrooms. 17. (a) The Theexperimental experimentalunits units 30–35 females the treatment. 17. (a) areare thethe 30–35 yearyear old old females beingbeing givengiven the treatment. One One treatment is used. treatment is used. (b) A Aproblem problemwith withthe thedesign design is that there be some onpart theof part the researchers (b) is that there maymay be some bias bias on the theof researchers if he if he orshe sheknows knowswhich whichpatients patients were given A way to eliminate this problem or were given the the realreal drug.drug. A way to eliminate this problem would would betotomake makethe thestudy study into a double-blind experiment. be into a double-blind experiment. (c) study if the researcher did not whichwhich patients (c) The Thestudy studywould wouldbebea double-blind a double-blind study if the researcher didknow not know patients received oror thethe placebo. receivedthe thereal realdrug drug placebo. 18. (a) areare thethe 80 80 people withwith earlyearly signssigns of arthritis. One treatment is used.is used. 18. (a) The Theexperimental experimentalunits units people of arthritis. One treatment (b) A problem with the design is that the sample size is small. The experiment could be with a warning label. So, you should use a survey to collect these data. 15. Because it is impractical to create this situation, you would want to use a simulation. 16. Perform an observational study because you want to observe and record how often people wash their hands in public restrooms. CHAPTER 1 INTRODUCTION TO STATISTICS 9 17. 
(a) The experimental units are the 30–35 year old females being given the treatment. One treatment used. are divided into strata (rural and urban), and a sample is selected from each 20. Because theispersons stratum, this is a stratified sample. (b) A problem with the design is that there may be some bias on the part of the researchers if he or she knows which were patients weredue given theCHAPTER real drug. to eliminate problem would 21. Because the students chosen to their convenience of location (leaving the library), 1 A way INTRODUCTION TOthis STATISTICS 9this is to make thesample. study into double-blind abe convenience Biasamay enter intoexperiment. the sample because the students sampled may not be representative of the ofstrata students. there may beisanselected association 20. Because the persons arepopulation divided into (ruralFor andexample, urban), and a sample from between each (c)stratum, The spent study would be a double-blind this at is a stratified sample. time the library and drinkingstudy habits.if the researcher did not know which patients received the real drug or the placebo. 21. due to into theirgrids convenience of location (leaving the library), this isthis 22.Because Becausethe thestudents disasterwere areachosen was divided and thirty grids were then entirely selected, sample.units Biasare may into thewith sample because studentsdamaged sampled may others, not 18. (a)a isconvenience The experimental theenter 80 people early signs ofseverely arthritis. One treatment is be used. a cluster sample. Certain grids may have been much morethe than so this representative the population is a possible of source of bias. of students. For example, there may be an association between spent at the library and drinking (b)time A problem with the design is that habits. the sample size is small. The experiment could be replicated to increase validity. 23. Simple random sampling is used because each customer has an equal chance of being contacted, 22. 
(c) In a placebo-controlled double-blind experiment, neither the subject nor the experimenter knows whether the subject is receiving a treatment or a placebo. The experimenter is informed after all the data have been collected.
(d) The group could be randomly split into 20 males or 20 females in each treatment group.

19. Each U.S. telephone number has an equal chance of being dialed and all samples of 1400 phone numbers have an equal chance of being selected, so this is a simple random sample. Telephone sampling only samples those individuals who have telephones, are available, and are willing to respond, so this is a possible source of bias.
20. Because the persons are divided into strata (rural and urban), and a sample is selected from each stratum, this is a stratified sample.
21. Because the students were chosen due to their convenience of location (leaving the library), this is a convenience sample. Bias may enter into the sample because the students sampled may not be representative of the population of students. For example, there may be an association between time spent at the library and drinking habits.
22. Because the disaster area was divided into grids and thirty grids were then entirely selected, this is a cluster sample. Certain grids may have been much more severely damaged than others, so this is a possible source of bias.
23. Simple random sampling is used because each customer has an equal chance of being contacted, and all samples of 580 customers have an equal chance of being selected.
24. Systematic sampling is used because every tenth person entering the shopping mall is sampled. It is possible for bias to enter the sample if, for some reason, there is a regular pattern to people entering the shopping mall.
25. Because a sample is taken from each one-acre subplot (stratum), this is a stratified sample.
26. Each telephone has an equal chance of being dialed and all samples of 1012 phone numbers have an equal chance of being selected, so this is a simple random sample. Telephone sampling only samples those individuals who have telephones and are willing to respond, so this is a possible source of bias.
27. Answers will vary.
28. Answers will vary.
29. Answers will vary.
30. Answers will vary.
31. Census, because it is relatively easy to obtain the ages of the 115 residents.
32. Sampling, because the population of subscribers is too large to easily record their favorite movie type. Random sampling would be advised since it would be easy to randomly select subscribers and then record their favorite movie type.
33. Question is biased because it already suggests that eating whole-grain foods is good for you. The question might be rewritten as “How does eating whole-grain foods affect your health?”
34. Question is biased because it already suggests that text messaging while driving increases the risk of a crash. The question might be rewritten as “Does text messaging while driving increase the risk of a crash?”
35. Question is unbiased because it does not imply how much exercise is good or bad.

Copyright © 2012 Pearson Education, Inc. Publishing as Prentice Hall.
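The sampling designs named in solutions 19 through 26 (simple random, systematic, stratified, cluster) can be contrasted with a short Python sketch. The population of 100 numbered units, the two strata, the ten clusters, and all function names below are assumptions made for illustration; none of them come from the exercises.

```python
import random


def simple_random_sample(population, n, rng):
    """Every subset of size n has the same chance of being chosen."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Pick a random start among the first k units, then take every k-th unit."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from every stratum."""
    return [unit for stratum in strata
            for unit in rng.sample(stratum, n_per_stratum)]


def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(clusters, n_clusters)
    return [unit for cluster in chosen for unit in cluster]


rng = random.Random(0)                      # fixed seed so the run is repeatable
population = list(range(1, 101))            # hypothetical units numbered 1..100
strata = [population[:50], population[50:]]  # hypothetical two-stratum split
clusters = [population[i:i + 10] for i in range(0, 100, 10)]  # ten "grids"

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 3, rng)
```

The systematic sample shows the bias risk mentioned in solution 24: if the population has a pattern that repeats every ten units, every selected unit falls at the same point of that pattern.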
e. Answers will vary.

CHAPTER 2 DESCRIPTIVE STATISTICS

Section 2.1
o 1, 2, 3, 21, 25, 34, 37

2.1 EXERCISE SOLUTIONS

1. Organizing the data into a frequency distribution may make patterns within the data more evident. Sometimes it is easier to identify patterns of a data set by looking at a graph of the frequency distribution.
2. If there are too few or too many classes, it may be difficult to detect patterns because the data are too condensed or too spread out.
3. Class limits determine which numbers can belong to that class. Class boundaries are the numbers that separate classes without forming gaps between them.
19a. Number of classes = 7 b. Least frequency ≈ 10 c. Greatest frequency ≈ 300 d. Class width = 10
20a. Number of classes = 7 b. Least frequency ≈ 100 c. Greatest frequency ≈ 900 d. Class width = 5
21a. 50 b. 22.5-23.5 pounds
22a. 50 b. 64-66 inches
23a. 42 b. 29.5 pounds c. 35 d. 2
24a. 48 b. 66 inches c. 20 d. 6
25a. Class with greatest relative frequency: 8-9 inches
Class with least relative frequency: 17-18 inches
b. Greatest relative frequency ≈ 0.195
Least relative frequency ≈ 0.005
c. Approximately 0.015
26a. Class with greatest relative frequency: 19-20 minutes
Class with least relative frequency: 21-22 minutes
b. Greatest relative frequency ≈ 40%
Least relative frequency ≈ 2%
c. Approximately 33%
27. Class with greatest frequency: 29.5-32.5
Classes with least frequency: 11.5-14.5 and 38.5-41.5
28. Class with greatest frequency: 7.75-8.25
Class with least frequency: 6.25-6.75
29. $\text{Class width} = \frac{\text{Range}}{\text{Number of classes}} = \frac{39 - 0}{5} = 7.8 \Rightarrow 8$

Class    Frequency, f   Midpoint   Relative frequency   Cumulative frequency
0-7      8              3.5        0.32                 8
8-15     8              11.5       0.32                 16
16-23    3              19.5       0.12                 19
24-31    3              27.5       0.12                 22
32-39    3              35.5       0.12                 25
$\sum f = 25$, $\sum \frac{f}{n} = 1$

Classes with greatest frequency: 0-7, 8-15
Classes with least frequency: 16-23, 24-31, 32-39

The graph shows that most of the pungencies of the peppers were between 36 and 43 Scoville units. (Answers will vary.)

33. $\text{Class width} = \frac{\text{Range}}{\text{Number of classes}} = \frac{514 - 291}{8} = 27.875 \Rightarrow 28$

Class      Frequency, f   Midpoint   Relative frequency   Cumulative frequency
291-318    5              304.5      0.1667               5
319-346    4              332.5      0.1333               9
347-374    3              360.5      0.1000               12
375-402    5              388.5      0.1667               17
403-430    6              416.5      0.2000               23
431-458    4              444.5      0.1333               27
459-486    1              472.5      0.0333               28
487-514    2              500.5      0.0667               30
$\sum f = 30$, $\sum \frac{f}{n} = 1$

The graph shows that the most frequent reaction times were between 403 and 430 milliseconds. (Answers will vary.)

34. $\text{Class width} = \frac{\text{Range}}{\text{Number of classes}} = \frac{2888 - 2456}{5} = 86.4 \Rightarrow 87$

Class        Frequency, f   Midpoint   Relative frequency   Cumulative frequency
2456-2542    7              2499       0.28                 7
2543-2629    3              2586       0.12                 10
2630-2716    2              2673       0.08                 12
2717-2803    4              2760       0.16                 16
2804-2890    9              2847       0.36                 25
$\sum f = 25$, $\sum \frac{f}{n} = 1$

The graph shows that the most common pressures at fracture time were between 2804 and 2890 pounds per square inch. (Answers will vary.)

35. $\text{Class width} = \frac{\text{Range}}{\text{Number of classes}} = \frac{55 - 24}{5} = 6.2 \Rightarrow 7$

Class    Frequency, f   Midpoint   Relative frequency   Cumulative frequency
24-30    9              27         0.30                 9
31-37    8              34         0.27                 17
38-44    10             41         0.33                 27
45-51    2              48         0.07                 29
52-58    1              55         0.03                 30
$\sum f = 30$, $\sum \frac{f}{n} = 1$

Class with greatest relative frequency: 10-24
Class with least relative frequency: 55-69

37. $\text{Class width} = \frac{\text{Range}}{\text{Number of classes}} = \frac{462 - 138}{5} = 64.8 \Rightarrow 65$

Class      Frequency, f   Midpoint   Relative frequency   Cumulative frequency
138-202    12             170        0.46                 12
203-267    6              235        0.23                 18
268-332    4              300        0.15                 22
333-397    1              365        0.04                 23
398-462    3              430        0.12                 26
$\sum f = 26$, $\sum \frac{f}{n} = 1$

Class with greatest relative frequency: 138-202
Class with least relative frequency: 333-397

38. $\text{Class width} = \frac{\text{Range}}{\text{Number of classes}} = \frac{14 - 6}{5} = 1.6 \Rightarrow 2$

Class    Frequency, f   Midpoint   Relative frequency   Cumulative frequency
6-7      3              6.5        0.12                 3
8-9      10             8.5        0.38                 13
10-11    6              10.5       0.23                 19
12-13    6              12.5       0.23                 25
14-15    1              14.5       0.04                 26
$\sum f = 26$, $\sum \frac{f}{n} = 1$

c. It appears that the auto industry (dealers and repair shops) accounts for the largest portion of complaints filed at the BBB. (Answers will vary.)
6a, b.
c. It appears that the longer an employee is with the company, the larger the employee’s salary will be.
7a, b.
c. The average bill increased from 1998 to 2004, then it hovered around $50.00 from 2004 to 2008.

Section 2.2
o 1, 5-8, 17, 23, 30, 34, 35-38

2.2 EXERCISE SOLUTIONS

1. Quantitative: stem-and-leaf plot, dot plot, histogram, time series chart, scatter plot.
Qualitative: pie chart, Pareto chart
2. Unlike the histogram, the stem-and-leaf plot still contains the original data values. However, some data are difficult to organize in a stem-and-leaf plot.
3. Both the stem-and-leaf plot and the dot plot allow you to see how data are distributed, determine specific data entries, and identify unusual data values.
4. In a Pareto chart, the height of each bar represents frequency or relative frequency and the bars are positioned in order of decreasing height with the tallest bar positioned at the left.
5. b
6. d
7. a
8. c
9. 27, 32, 41, 43, 43, 44, 47, 47, 48, 50, 51, 51, 52, 53, 53, 53, 54, 54, 54, 54, 55, 56, 56, 58, 59, 68, 68, 68, 73, 78, 78, 85
Max: 85 Min: 27
10. 12.9, 13.3, 13.6, 13.7, 13.7, 14.1, 14.1, 14.1, 14.1, 14.3, 14.4, 14.4, 14.6, 14.9, 14.9, 15.0, 15.0, 15.0, 15.1, 15.2, 15.4, 15.6, 15.7, 15.8, 15.8, 15.8, 15.9, 16.1, 16.6, 16.7
Max: 16.7 Min: 12.9
11. 13, 13, 14, 14, 14, 15, 15, 15, 15, 15, 16, 17, 17, 18, 19
Max: 19 Min: 13
12. 214, 214, 214, 216, 216, 217, 218, 218, 220, 221, 223, 224, 225, 225, 227, 228, 228, 228, 228, 230, 230, 231, 235, 237, 239
Max: 239 Min: 214
13. Sample answer: Users spend the most amount of time on MySpace and the least amount of time on Twitter. Answers will vary.
14. Sample answer: Motor vehicle thefts decreased between 2003 and 2008. Answers will vary.
15. Answers will vary. Sample answer: Tailgaters irk drivers the most, while too cautious drivers irk drivers the least.
16. Answers will vary. Sample answer: The most frequent incident occurring while driving and using a cell phone is swerving. Twice as many people “sped up” than “cut off a car.”
17. Key: 6 | 7 = 67
6 | 7 8
7 | 3 5 5 6 9
8 | 0 0 2 3 5 5 7 7 8
9 | 0 1 1 1 2 4 5 5
It appears that most grades for the biology midterm were in the 80s or 90s. (Answers will vary.)
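Solutions 9 through 12 read data values back out of a stem-and-leaf plot, and solution 17 builds one. As a minimal sketch of the reverse direction, the Python below rebuilds a plot from the ordered list given in solution 9. The function name and the choice to print empty stems are assumptions for illustration, not part of the printed solutions.

```python
def stem_and_leaf(values):
    """Group two-digit integers by tens digit (stem), listing units digits (leaves)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows.setdefault(stem, []).append(leaf)
    # Keep empty stems so gaps in the data remain visible in the plot.
    return {s: rows.get(s, []) for s in range(min(rows), max(rows) + 1)}


# Data listed in solution 9 of Section 2.2.
data = [27, 32, 41, 43, 43, 44, 47, 47, 48, 50, 51, 51, 52, 53, 53, 53,
        54, 54, 54, 54, 55, 56, 56, 58, 59, 68, 68, 68, 73, 78, 78, 85]

plot = stem_and_leaf(data)
lines = [f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
         for stem, leaves in plot.items()]

print("Key: 2 | 7 = 27")
print("\n".join(lines))
```

Because every leaf is kept, the original entries (and therefore Max and Min) can be recovered from the plot, which is the property solution 2 contrasts with a histogram.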
23.
Category         Frequency, f   Relative frequency   Angle
United States    15             0.375                135
Italy            4              0.100                36
Ethiopia         1              0.025                9
South Africa     2              0.050                18
Tanzania         1              0.025                9
Kenya            8              0.200                72
Mexico           4              0.100                36
Morocco          1              0.025                9
Great Britain    1              0.025                9
Brazil           2              0.050                18
New Zealand      1              0.025                9
$\sum f = 40$, $\sum \frac{f}{N} = 1$, sum of angles = 360

Most of the New York City Marathon winners are from the United States and Kenya. (Answers will vary.)

24.
Category                            Frequency, f   Relative frequency   Angle
Science, aeronautics, exploration   8947           0.479                172.4
Space operations                    6176           0.331                119.2
Education                           126            0.007                2.5
Cross-agency support                3401           0.182                65.5
Inspector general                   36             0.002                0.7
$\sum f = 18{,}686$, $\sum \frac{f}{N} \approx 1$, sum of angles ≈ 360

It appears that most of NASA’s budget was spent on science, aeronautics, and exploration. (Answers will vary.)

29. It appears that it was hottest from May 7 to May 11. (Answers will vary.)
30. It appears that the largest decrease in manufacturing as a percent of GDP was from 2000 to 2001. (Answers will vary.)
31. Variable: Scores
Key: 5 5 5.5
5 5 6 2 6 8 7 0 1 7 5 6 8 0 2 3 8 5 6 7 8 8 9 3 3 9 0 9 5 5 8 9 10 0
It appears that most scores on the final exam in economics were in the 80’s and 90’s. (Answers will vary.)
34a. It appears that the number of registrations is increasing over time. (Answers will vary.)
b. It appears that the number of crashes is decreasing over time. (Answers will vary.)
c. It appears that the number of registrations is increasing over time. (Answers will vary.)
d. It appears that the number of crashes is decreasing over time. (Answers will vary.)
35a. The graph is misleading because the large gap from 0 to 90 makes it appear that the sales for the 3rd quarter are disproportionately larger than the other quarters. (Answers will vary.)
b.
36a. The graph is misleading because the vertical axis has no break. The percent of middle schoolers that responded “yes” appears three times larger than either of the others when the difference is only 10%. (Answers will vary.)
b.
37a. The graph is misleading because the angle makes it appear as though the 3rd quarter had a larger percent of sales than the others, when the 1st and 3rd quarters have the same percent.
b.
38a. The graph is misleading because the “OPEC countries” bar is wider than the “non-OPEC countries” bar.
b.
39a. At Law Firm A, the lowest salary was $90,000 and the highest salary was $203,000. At Law Firm B, the lowest salary was $90,000 and the highest salary was $190,000.
b. There are 30 lawyers at Law Firm A and 32 lawyers at Law Firm B.

Class    Midpoint, x   Frequency, f   xf
63-69    66            10             660
70-76    73            5              365
77-83    80            8              640
84-90    87            6              552
$N = 50$, $\sum xf = 3265$
d. $\mu = \frac{\sum xf}{N} = \frac{3265}{50} \approx 65.3$
The mean age of the 50 richest people is 65.3.

Section 2.3
o 1-8, 9-16, 19, 20, 23, 42, 45, 49, 56, 58

2.3 EXERCISE SOLUTIONS

1. True
2. False. All quantitative data sets have a median.
3. True
4. True
5. 1, 2, 2, 2, 3 (Answers will vary.)
6. 2, 4, 5, 5, 6, 8 (Answers will vary.)
7. 2, 5, 7, 9, 35 (Answers will vary.)
8. 1, 2, 3, 3, 3, 4, 5 (Answers will vary.)
9. Skewed right because the “tail” of the distribution extends to the right.
10. Symmetric because the left and right halves of the distribution are approximately mirror images.
11. Uniform because the bars are approximately the same height.
12. Skewed left because the “tail” of the distribution extends to the left.
13. (11), because the distribution values range from 1 to 12 and has (approximately) equal frequencies.
14. (9), because the distribution has values in the thousands of dollars and is skewed right due to the few executives that make a much higher salary than the majority of the employees.
15. (12), because the distribution has a maximum value of 90 and is skewed left due to a few students scoring much lower than the majority of the students.
16. (10), because the distribution is rather symmetric due to the nature of the weights of seventh grade boys.
17. $\bar{x} = \frac{\sum x}{n} = \frac{64}{13} \approx 4.9$
mode = 4 (occurs 3 times)
18. $\bar{x} = \frac{\sum x}{n} = \frac{396}{10} = 39.6$
mode = 39 (occurs 3 times)
19. $\bar{x} = \frac{\sum x}{n} = \frac{76.8}{7} \approx 11.0$
mode = 11.7 (occurs 3 times)
20. $\bar{x} = \frac{\sum x}{n} = \frac{2004}{10} = 200.4$
mode = none
The mode cannot be found because no data points are repeated.
21. $\bar{x} = \frac{\sum x}{n} = \frac{686.8}{32} \approx 21.46$
mode = 20.4 (occurs 2 times)
22. $\bar{x} = \frac{\sum x}{n} = \frac{1223}{20} \approx 61.2$
mode = 80, 125
The modes do not represent the center of the data set because they are large values compared to the rest of the data.
23. $\bar{x}$ = not possible (nominal data)
median = not possible (nominal data)
mode = “Eyeglasses”
The mean and median cannot be found because the data are at the nominal level of measurement.
24. $\bar{x}$ = not possible (nominal data)
median = not possible (nominal data)
mode = “Money needed”
The mean and median cannot be found because the data are at the nominal level of measurement.
25. $\bar{x} = \frac{\sum x}{n} = \frac{1194.4}{7} \approx 170.63$
mode = none
The mode cannot be found because no data points are repeated.
26. $\bar{x}$ = not possible (nominal data)
median = not possible (nominal data)
mode = “Mashed”
The mean and median cannot be found because the data are at the nominal level of measurement.

41.
Source        Score, x   Weight, w   x·w
Homework      85         0.05        4.25
Quiz          80         0.35        28
Project       100        0.20        20
Speech        90         0.15        13.5
Final exam    93         0.25        23.25
$\sum w = 1$, $\sum (x \cdot w) = 89$
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{89}{1} = 89$

42.
Source   Score, x   Weight, w   x·w
MBAs     92,500     8           740,000
BAs      68,000     17          1,156,000
$\sum w = 25$, $\sum (x \cdot w) = 1{,}896{,}000$
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{1{,}896{,}000}{25} = 75{,}840$

43.
Balance, x   Days, w   x·w
$523         24        12,552
$2415        2         4830
$250         4         1000
$\sum w = 30$, $\sum (x \cdot w) = 18{,}382$
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{18{,}382}{30} \approx \$612.73$

44.
Balance, x   Days, w   x·w
$759         15        11,385
$1985        5         9925
$1410        5         7050
$348         6         2088
$\sum w = 31$, $\sum (x \cdot w) = 30{,}448$
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{30{,}448}{31} \approx \$982.19$

45.
Grade   Points, x   Credits, w   x·w
B       3           3            9
B       3           3            9
A       4           4            16
D       1           2            2
C       2           3            6
$\sum w = 15$, $\sum (x \cdot w) = 42$
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{42}{15} = 2.8$

46.
Source        Score, x   Weight, w   x·w
Engineering   85         9           765
Business      81         13          1053
Math          90         5           450
$\sum w = 27$, $\sum (x \cdot w) = 2268$
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{2268}{27} = 84$

47.
Source        Score, x   Weight, w   x·w
Homework      85         0.05        4.25
Quiz          80         0.35        28
Project       100        0.20        20
Speech        90         0.15        13.5
Final exam    85         0.25        21.25
$\sum w = 1$, $\sum (x \cdot w) = 87$
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{87}{1} = 87$

49.
Class    Midpoint, x   Frequency, f   x·f
29-33    31            11             341
34-38    36            12             432
39-43    41            2              82
44-48    46            5              230
$n = 30$, $\sum (x \cdot f) = 1085$
$\bar{x} = \frac{\sum (x \cdot f)}{n} = \frac{1085}{30} \approx 36.2$ miles per gallon

50.
Class    Midpoint, x   Frequency, f   x·f
22-27    24.5          16             392
28-33    30.5          2              61
34-39    36.5          2              73
40-45    42.5          3              127.5
46-51    48.5          1              48.5

56. $\text{Class width} = \frac{\text{Range}}{\text{Number of classes}} = \frac{6 - 1}{6} \approx 0.8333$

Class   Frequency, f
1       6
2       5
3       4
4       6
5       4
6       5
$\sum f = 30$
Shape: Uniform

$x - \bar{x}$   $(x - \bar{x})^2$   $(x - \bar{x})^2 f$
254             64,516              3,870,960
454.5           206,570.25          14,459,917.5
$\sum (x - \bar{x})^2 f = 28{,}715{,}797.5$
d. $s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} = \sqrt{\frac{28{,}715{,}797.5}{999}} \approx 169.5$

Section 2.4
o 1-10, 11, 13, 19, 21, 22, 23, 31, 32, 33, 38

2.4 EXERCISE SOLUTIONS

1. The range is the difference between the maximum and minimum values of a data set. The advantage of the range is that it is easy to calculate. The disadvantage is that it uses only two entries from the data set.
2. A deviation $(x - \mu)$ is the difference between an entry x and the mean of the data µ. The sum of the deviations is always zero.
3. The units of variance are squared. Its units are meaningless. (Example: dollars²)
4. The standard deviation is the positive square root of the variance. Because squared deviations can never be negative, the standard deviation and variance can never be negative.
5. {9, 9, 9, 9, 9, 9, 9}
$n = 7$
$\bar{x} = \frac{\sum x}{n} = \frac{63}{7} = 9$

x    $x - \bar{x}$   $(x - \bar{x})^2$
9    0               0
9    0               0
9    0               0
9    0               0
9    0               0
9    0               0
9    0               0
$\sum x = 63$, $\sum (x - \bar{x}) = 0$, $\sum (x - \bar{x})^2 = 0$
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{0}{6}} = 0$

6. {3, 3, 3, 7, 7, 7}
$N = 6$
$\mu = \frac{\sum x}{N} = \frac{30}{6} = 5$

x    $x - \mu$   $(x - \mu)^2$
3    –2          4
3    –2          4
3    –2          4
7    2           4
7    2           4
7    2           4
$\sum x = 30$, $\sum (x - \mu) = 0$, $\sum (x - \mu)^2 = 24$
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{\frac{24}{6}} = \sqrt{4} = 2$

7. When calculating the population standard deviation, you divide the sum of the squared deviations by N, then take the square root of that value. When calculating the sample standard deviation, you divide the sum of the squared deviations by $n - 1$, then take the square root of that value.
8. When given a data set one would have to determine if it represented the population or if it was a sample taken from the population. If the data are a population, then σ is calculated. If the data are a sample, then s is calculated.
9. Similarity: Both estimate proportions of the data contained within k standard deviations of the mean.
Difference: The Empirical Rule assumes the distribution is bell-shaped. Chebychev’s Theorem makes no such assumption.
10. You must know that the distribution is bell-shaped.
11. Range = Max – Min = 12 – 5 = 7
$\mu = \frac{\sum x}{N} = \frac{90}{10} = 9$

x     $x - \mu$   $(x - \mu)^2$
9     0           0
5     –4          16
9     0           0
10    1           1
11    2           4
12    3           9
7     –2          4
7     –2          4
8     –1          1
12    3           9
$\sum x = 90$, $\sum (x - \mu) = 0$, $\sum (x - \mu)^2 = 48$
$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{48}{10} = 4.8$
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{4.8} \approx 2.2$

12. Range = Max – Min = 25 – 15 = 10
$\mu = \frac{\sum x}{N} = \frac{266}{14} = 19$

x     $x - \mu$   $(x - \mu)^2$
18    –1          1
20    1           1
19    0           0
21    2           4
19    0           0
17    –2          4
15    –4          16
17    –2          4
25    6           36
22    3           9
19    0           0
20    1           1
16    –3          9
18    –1          1
$\sum x = 266$, $\sum (x - \mu) = 0$, $\sum (x - \mu)^2 = 86$
$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{86}{14} \approx 6.1$
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{\frac{86}{14}} \approx 2.5$
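The distinction drawn in solution 7 (divide by N for a population, by n − 1 for a sample) can be checked numerically. The sketch below is a minimal illustration using the data sets from solutions 5 and 6; the function names are chosen here for readability and are not taken from the text.

```python
import math


def population_std(values):
    """sigma = sqrt( sum((x - mu)^2) / N ), dividing by the population size N."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std(values):
    """s = sqrt( sum((x - xbar)^2) / (n - 1) ), dividing by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))


sigma_ex6 = population_std([3, 3, 3, 7, 7, 7])  # solution 6 treats this set as a population
s_ex5 = sample_std([9] * 7)                     # solution 5 treats this set as a sample

print(sigma_ex6)  # 2.0
print(s_ex5)      # 0.0
```

For the same data the sample formula never gives a smaller value than the population formula, because the same sum of squared deviations is divided by a smaller number.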
13. Range = Max – Min = 19 – 4 = 15
$\bar{x} = \frac{\sum x}{n} = \frac{108}{9} = 12$

x     $x - \bar{x}$   $(x - \bar{x})^2$
4     –8              64
15    3               9
9     –3              9
12    0               0
16    4               16
8     –4              16
11    –1              1
19    7               49
14    2               4
$\sum x = 108$, $\sum (x - \bar{x}) = 0$, $\sum (x - \bar{x})^2 = 168$
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{168}{9 - 1} = 21$
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{21} \approx 4.6$

15. Range = Max – Min = 96 – 23 = 73
16. Range = Max – Min = 34 – 24 = 10
17. Range = Max – Min = 98 – 74 = 24
18. Range = Max – Min = 6.7 – 0.5 = 6.2
19a. Range = Max – Min = 38.5 – 20.7 = 17.8
b. Range = Max – Min = 60.5 – 20.7 = 39.8
20. Changing the maximum value of the data set greatly affects the range.
21. Graph (a) has a standard deviation of 24 and graph (b) has a standard deviation of 16 because graph (a) has more variability.
22. Graph (a) has a standard deviation of 2.4 and graph (b) has a standard deviation of 5 because graph (b) has more variability.
23. Company B. An offer of $33,000 is two standard deviations from the mean of Company A’s starting salaries, which makes it unlikely. The same offer is within one standard deviation of the mean of Company B’s starting salaries, which makes the offer likely.

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{0.00467689}{8}} \approx 0.02418$
b. It appears from the data that the batting averages for Team A are more variable than the batting averages for Team B. The batting averages for Team A have a higher mean and a higher median than those for Team B.

29a. Greatest sample standard deviation: (ii)
Data set (ii) has more entries that are farther away from the mean.
Least sample standard deviation: (iii)
Data set (iii) has more entries that are close to the mean.
b. The three data sets have the same mean but have different standard deviations.
30a. Greatest sample standard deviation: (i)
Data set (i) has more entries that are farther away from the mean.
Least sample standard deviation: (iii)
Data set (iii) has more entries that are close to the mean.
b. The three data sets have the same mean, median, and mode, but have different standard deviations.
31a. Greatest sample standard deviation: (ii)
Data set (ii) has more entries that are farther away from the mean.
Least sample standard deviation: (iii)
Data set (iii) has more entries that are close to the mean.
b. The three data sets have the same mean, median, and mode, but have different standard deviations.
32a. Greatest sample standard deviation: (iii)
Data set (iii) has more entries that are farther away from the mean.
Least sample standard deviation: (i)
Data set (i) has more entries that are close to the mean.
b. The three data sets have the same mean and median but have different modes and standard deviations.
33. $(1300, 1700) \rightarrow (1500 - 1(200),\ 1500 + 1(200)) \rightarrow (\bar{x} - s,\ \bar{x} + s)$
68% of the farms have values between $1300 and $1700 per acre.
34. 95% of the data falls between $\bar{x} - 2s$ and $\bar{x} + 2s$.
$\bar{x} - 2s = 2400 - 2(450) = 1500$
$\bar{x} + 2s = 2400 + 2(450) = 3300$
95% of the farms have values between $1500 and $3300 per acre.
35a. $n = 75$
68%(75) = (0.68)(75) ≈ 51 farms have values between $1300 and $1700 per acre.
b. $n = 25$
68%(25) = (0.68)(25) ≈ 17 farms have values between $1300 and $1700 per acre.
36a. $n = 40$
95%(40) = (0.95)(40) ≈ 38 farms have values between $1500 and $3300 per acre.
b. $n = 20$
95%(20) = (0.95)(20) ≈ 19 farms have values between $1500 and $3300 per acre.
37. {$950, $1000, $2000, $2180} are outliers. They are more than 2 standard deviations from the mean (1100, 1900). $2180 is very unusual because it is more than 3 standard deviations from the mean.
38. {$1045, $1490, $3325, $3800} are outliers. They are more than 2 standard deviations from the mean (1500, 3300). $1045 and $3800 are very unusual because they are more than 3 standard deviations from the mean.
39. $(\bar{x} - 2s,\ \bar{x} + 2s) \rightarrow (1.14, 5.5)$ are 2 standard deviations from the mean.
$1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = 0.75$
At least 75% of the eruption times lie between 1.14 and 5.5 minutes.
If n = 32, at least (0.75)(32) = 24 eruptions will lie between 1.14 and 5.5 minutes.
40. $1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = 0.75$
At least 75% of the 400-meter dash times lie within 2 standard deviations of the mean.
$(\bar{x} - 2s,\ \bar{x} + 2s) \rightarrow (54.97, 59.17)$
At least 75% of the 400-meter dash times lie between 54.97 and 59.17 seconds.
41.
x    f     xf    $x - \bar{x}$   $(x - \bar{x})^2$   $(x - \bar{x})^2 f$
0    5     0     –2.1            4.41                22.05
1    11    11    –1.1            1.21                13.31
2    7     14    –0.1            0.01                0.07
They are more than 2 standard deviations from the 68% (1100, of the 1900). farms $2180 have values betweenbecause $1300itand $1700 acre. deviations from mean is very unusual is more thanper 3 standard the mean. 95% of the data falls between x xx 2400 450 2s 2400 s2 450 1500 2 s and x 2s . {$1045, $1490, $3325, $3800} are outliers. They are more than 2 standard deviations from the x 2(1500, s 2400 450 and 3300 mean 3300). 2 $1045 $3800 are very unusual because they are more than 3 standard deviations from the mean. 95% of the farms have values between $1500 and $3300 per acre. s, x 2 s 1.14, 39. nx 275 35a. 68%(75) = (0.68)(75) 1 1 1 1 1 1 2 b. n k 225 4 2 68%(25) = (0.68)(25) minutes. 5.5 are 2 standard deviations from the mean. 51 farms have values between $1300 and $1700 per acre. At least 75% of the eruption times lie between 1.14 and 5.5 0.75 17 farms have values between $1300 and $1700 per acre. If n = 32, at least (0.75)(32) = 24 eruptions will lie between 1.14 and 5.5 minutes. 40. 1 k2 1 1 1 0.75 2 4 2 At least 75% of the 400-meter dash times lie within 2 standard deviations of the mean. 1 1 Copyright © 2012 Pearson Education, Inc. Publishing as Prentice Hall. x 2 s, x 2s 54.97, 59.17 At least 75% of the 400-meter dash times lie between 54.97 and 59.17 seconds. 41. x f xf 0 1 2 5 11 7 0 11 14 x x –2.1 –1.1 –0.1 x x 4.41 1.21 0.01 2 x x 2 22.05 13.31 0.07 f Best actress: 35.9, 11.4 43.7 0.49 8.7 33 35.9 x Kate Winslet: x 33: z 0.25 11.4 c. Sean Penn’s age is 0.49 standard deviation above the mean of the best actors. Kate Winslet’s age is 0.25 standard deviation below the mean of the best actresses. Neither actor’s age is unusual. b. Sean Penn: x 48 : z x 48 Section 2.5 o (odd), 17, SOLUTIONS 25-28, 29, 30, 37, 39, 44, 58, 59 2.51-13 EXERCISE 1. The soccer team scored fewer points per game than 75% of the teams in the league. 2. The salesperson sold more hardware equipment than 80% of the other sales people. 3. 
The student scored higher than 78% of the students who took the actuarial exam. 4. The child’s IQ is higher than 93% of the other children in the same age group. 5. The interquartile range of a data set can be used to identify outliers because data values that are greater than Q3 1.5 IQR or less than Q1 1.5 IQR are considered outliers. 6. Quartiles are special cases of percentiles. Q1 is the 25th percentile, Q2 is the 50th percentile, and Q3 is the 75th percentile. 7. False. The median of a data set is a fractile, but the mean may or may not be fractile depending on 80 CHAPTER 2 the distribution ofDESCRIPTIVE the data. STATISTICS 8. b.True 9. True 10. False. The five numbers you need to graph a box-and-whisper plot are the minimum, the maximum, Q1 , Q3 , and the median Q2 . 24a. 11. False. The 50th percentile is equivalent to Q2 . CHAPTER 2 DESCRIPTIVE STATISTICS 12. False. Any score equal to the mean will have a corresponding z-score of zero. 79 b. Min IQR==1,Q3Q1 Q13, Q270 – 130 140 5, Q3 = 8, Max = 9 unusual. 13. False. A z-score of 2 2.5 is considered b. 17a. Min = 900, Q1 1250, Q2 1500, Q3 1950, Max = 2100 14. True b. IQR = Q3 Q1 1950 – 1250 = 700 15a. Min = 10, Q1 13, Q2 15, Q3 17, Max = 20 18a.b. Min Q2– 1365, Q3 Q1 Q1 50,17 = 4Q3 70, Max = 85 IQR==25, 25. not– skewed Q3 Data Q1 are70 50 = 20 or symmetric. b. None. IQR =The 16a. Min = 100, Q1 130, Q2 205, Q3 270, Max = 320 26. Skewed the,data to the in the box-and-whisker plot. 19a. Min = right. 1.9 , Most Q1 of 0.5 Q2 lie0.1, Q3 left0.7, Max = 2.1 Q 0.7the data 0.5lie to = 1.2 b. Skewed IQR = Qleft. 3 1 27. Most of the right in the box-and-whisker plot. Copyright © 2012 Pearson Education, Inc. Publishing as Prentice Hall. 28. Symmetric. left and=to2.1 the right of the median. , Q1data are 0.3evenly , Q2 spaced 0.2, Q3to the 0.4, Max 20a. Min = 1.3The 0.4 0.3 = 0.7 b. IQR = Q Q1 Q3 C 29. Q1 B, Q3 2 A, 25% of the values are below B, 50% of the values are below A, and 75% of the values are below 21a. C. 30. 
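The outlier rule stated in solution 5 can be written as a short function. The sketch below applies it to the quartiles reported in solution 15 ($Q_1 = 13$, $Q_3 = 17$); the list of values passed to `find_outliers` is hypothetical, made up only to show one value beyond each fence, and the function names are likewise assumptions.

```python
def iqr_fences(q1, q3, multiplier=1.5):
    """Return (lower fence, upper fence) = (Q1 - m*IQR, Q3 + m*IQR)."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


def find_outliers(values, q1, q3):
    """Values lying strictly outside the fences are flagged as outliers."""
    low, high = iqr_fences(q1, q3)
    return [v for v in values if v < low or v > high]


# Quartiles from solution 15: Q1 = 13, Q3 = 17, so IQR = 4.
low, high = iqr_fences(13, 17)
print(low, high)  # 7.0 23.0

# Hypothetical values, not from the exercise data.
flagged = find_outliers([5, 10, 13, 15, 17, 20, 25], 13, 17)
print(flagged)  # [5, 25]
```

With the minimum of 10 and maximum of 20 reported in solution 15a, no entry of that data set falls outside these fences.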
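21a. Min = 24, $Q_1 = 28$, $Q_2 = 35$, $Q_3 = 41$, Max = 60
b.
30. $P_{10} = T$, $P_{50} = R$, $P_{80} = S$
10% of the values are below T, 50% of the values are below R, and 80% of the values are below S.
31a. $Q_1 = 2$, $Q_2 = 4$, $Q_3 = 5$
b.
35a. 5 b. 50% c. 25%
36a. $17.65 b. 50% c. 50%
37. A → z = –1.43
B → z = 0
C → z = 2.14
The z-score 2.14 is unusual because it is so large.
38. A → z = –1.54
B → z = 0.77
C → z = 1.54
None of the z-scores are unusual.
39a. Statistics: $x = 75$: $z = \frac{x - \mu}{\sigma} = \frac{75 - 63}{7} \approx 1.71$
Biology: $x = 25$: $z = \frac{x - \mu}{\sigma} = \frac{25 - 23}{3.9} \approx 0.51$
b. The student had a better score on the statistics test.
42a. Statistics: $x = 63$: $z = \frac{x - \mu}{\sigma} = \frac{63 - 63}{7} = 0$
Biology: $x = 23$: $z = \frac{x - \mu}{\sigma} = \frac{23 - 23}{3.9} = 0$
b. The student performed equally well on the two tests.
43a. $x = 34{,}000$: $z = \frac{x - \mu}{\sigma} = \frac{34{,}000 - 35{,}000}{2{,}250} \approx -0.44$
$x = 37{,}000$: $z = \frac{x - \mu}{\sigma} = \frac{37{,}000 - 35{,}000}{2{,}250} \approx 0.89$
$x = 30{,}000$: $z = \frac{x - \mu}{\sigma} = \frac{30{,}000 - 35{,}000}{2{,}250} \approx -2.22$
The tire with a life span of 30,000 miles has an unusually short life span.
b. $x = 30{,}500$: $z = \frac{x - \mu}{\sigma} = \frac{30{,}500 - 35{,}000}{2{,}250} = -2$ → 2.5th percentile
$x = 37{,}250$: $z = \frac{x - \mu}{\sigma} = \frac{37{,}250 - 35{,}000}{2{,}250} = 1$ → 84th percentile
$x = 35{,}000$: $z = \frac{x - \mu}{\sigma} = \frac{35{,}000 - 35{,}000}{2{,}250} = 0$ → 50th percentile
44a. $x = 34$: $z = \frac{x - \mu}{\sigma} = \frac{34 - 33}{4} = 0.25$
$x = 30$: $z = \frac{x - \mu}{\sigma} = \frac{30 - 33}{4} = -0.75$
$x = 42$: $z = \frac{x - \mu}{\sigma} = \frac{42 - 33}{4} = 2.25$
The fruit fly with a life span of 42 days has an unusually long life span.
b. $x = 29$: $z = \frac{x - \mu}{\sigma} = \frac{29 - 33}{4} = -1$ → 16th percentile
$x = 41$: $z = \frac{x - \mu}{\sigma} = \frac{41 - 33}{4} = 2$ → 97.5th percentile
$x = 25$: $z = \frac{x - \mu}{\sigma} = \frac{25 - 33}{4} = -2$ → 2.5th percentile
58. $\text{Percentile} = \frac{\text{Number of data values less than } x}{\text{Total number of data values}} \cdot 100 = \frac{3}{73} \cdot 100 \approx$ 4th percentile
59. $\text{Percentile} = \frac{\text{Number of data values less than } x}{\text{Total number of data values}} \cdot 100 = \frac{30}{73} \cdot 100 \approx$ 41st percentile
60a. $Q_1 = 9$, $Q_2 = 11$, $Q_3 = 13$
IQR = $Q_3 - Q_1$ = 13 – 9 = 4
1.5 × IQR = 6
$Q_1 - 1.5 \times \text{IQR} = 9 - 6 = 3$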