STT 315 Pathak - Exam 1 - Practice Exam Solutions

www.LionTutors.com
STT 315 – Exam 1 – Practice Exam Solutions
1. A – The science that deals with collecting, classifying, summarizing, organizing, analyzing, and interpreting information or data
2. A – Quantitative
Quantitative variable – Values that come in meaningful, non-arbitrary numbers.
Quantitative variables are either discrete or continuous. Discrete variables are typically counts, while continuous variables are typically measurements. Since height is a measurement, it is a continuous random variable.
3. B – Qualitative
Qualitative variable – Raw data made up of labels, group names, or category names that don’t necessarily have a logical order. It does not make sense to perform arithmetic operations (such as finding the average) on qualitative variables.
4. A – True
Not all numerical variables are quantitative! Although a phone number is numerical, it is not a meaningful, non-arbitrary number. It would not make sense to perform arithmetic operations on a variable like phone number. A phone number is an example of an identifier variable, which is a special type of qualitative variable. Other examples of identifier variables are: student ID number, social security number, tracking number, invoice number, etc.
5. B – False
Although student ID number is a numerical variable, it is not quantitative. Student ID number is a special type of categorical variable called an identifier variable.
6. A – Sample
A sample is a subset of a population. A sample is the group of units that are actually measured or surveyed. Since the “street pharmacist” only surveyed 10 of his 50 “patients,” the data that he collected is from a sample. If he were to survey all 50 of his patients, he would have data about the population.
7. D – Data from people in the sample are used to obtain information about the population
All of the other answer choices (A, B, and C) are incorrect because the words “sample” and “population” were swapped.
8. C – Qualitative data are raw data made up of group or category names that don’t necessarily have a logical order, while quantitative data are values that come in the form of meaningful, non-arbitrary numbers
9. A – Qualitative
Although the six categories were input into the computer as numbers, they still represent categorical data.
10. A – 1,950
590 + 720 + 640 = 1,950
11. D – Unimodal
There is only one prominent peak in this distribution. Please make sure that you take your time on the exam and read your answer choices carefully. A lot of people who do not read carefully will accidentally choose “uniform” instead of “unimodal.”
12. B – False
The shape of a distribution can be described as either symmetric or skewed. If a distribution is not symmetric, it must be skewed. If a distribution is skewed, it will either be skewed to the left or skewed to the right. This distribution is not skewed; therefore, it must be symmetric. If we were to fold this distribution in half, both sides would look like mirror images of each other.
13. B – There are no outliers
Outlier – A data point that is not consistent with the majority of the data
You can identify outliers in a histogram by looking for gaps in the data. Gaps in a histogram represent areas of a distribution where there are no recorded values. If there is a gap in a histogram, and then there are more data values after the gap, those values are typically outliers.
14. D – 16 to 19 credits
Our three measures of center are: mean, median, and mode.
15. C – 22 to 25 credits
16. A – 5.88
You could use the STAT function of the TI-84 Plus calculator to find the mean here; however, it is probably easier to just use the formula:
Mean = (sum of the five values) / 5 = 29.4 / 5 = 5.88
17. C – Qualitative
18. B – False
Age and GPA are quantitative variables; however, student ID is not a quantitative variable. It is a special type of categorical variable called an identifier variable.
19. A – True
Frequency is the number of observations in a data set that fall into a particular class. The frequency for “Freshman” is 2 because 2 observations fall into the class “Freshman.” Relative frequency is the frequency divided by the total number of observations in the data set. The relative frequency for “Freshman” is 50% because 2/4 is 0.50.
20. A – True
The range is sensitive to outliers. This means that the presence of outliers will have a large effect on the range.
21. B – False
The IQR (interquartile range) is the upper quartile minus the lower quartile. It is a resistant (robust) summary measure. This means that the presence of outliers will not have a large effect on the IQR.
22. D – The mean is also referred to as the “average”
A is incorrect – The median is the middle value in a data set, not the mean
B is incorrect – The range is the difference between the largest and the smallest number in a data set, not the mean
C is incorrect – The mode is the most frequently occurring value in a data set, not the mean
E is incorrect – Half of the values in a data set are greater than the median, and half of the values are less than the median, not the mean
23. C – The mean is significantly smaller than the median
Symmetric distributions: Mean = Median
Skewed to the right: Mean > Median
Skewed to the left: Mean < Median
24. A – Bar chart
We can use a bar chart and a pie chart to display information about qualitative variables. Histograms and stem-and-leaf plots are used to display information about quantitative data. The mean, median, and mode are all summary measures that are used for quantitative data.
25. B – False
Student ID number is an example of an identifier variable.
26. B – False
When we create a histogram we can use frequencies (the count of occurrences) or relative frequencies (frequency/n) on the Y-axis. Using relative frequencies instead of frequencies will never change the shape of the histogram.
27. B – Accuracy
We use measures of (1) center, (2) spread, and (3) shape to describe a distribution.
Center = Location
Spread = Variability
Shape = Skewness
Within each pair, the two terms are used synonymously. It is fair game for any of these terms to be used on your exam.
28. B – False
Not all numerical variables are quantitative. Quantitative variables are meaningful, non-arbitrary numbers. Identifier variables (e.g., zip codes, social security numbers, student ID numbers, tracking numbers) are qualitative.
29. A – 6
10 to 19 + 20 to 29 = 4 + 2 = 6
30. B – False
We never leave spaces between bins in a histogram; therefore, a space in a histogram implies that no data values fall into that range, and we call this a gap in the histogram. There are two gaps in this histogram, at 50–59 percent and 60–69 percent. No values fell into these ranges.
31. D – 70 to 79 percent accepted
An outlier is a data value that is not consistent with the majority of the data. When looking at a histogram, data values that appear after a gap are typically outliers.
32. E – Both A and D
Positively skewed and skewed to the right mean the same thing. When a distribution is skewed to the right, the numbers on the right side of the number line (higher values) are spread out more than the numbers on the left side of the number line (lower values).
33. A – True
The shape of a distribution can be described as either symmetric or skewed. If a distribution is symmetric, there will be zero skewness.
34. D – Both A and C
A is incorrect – Mode is the most frequently occurring value in a data set, not the average. The mean is oftentimes called the average.
35. A – True
36. B – The mean is greater than the median because the data set is skewed to the right
Skewed to the left – The numbers on the left side of the number line are more spread out than the numbers on the right.
Skewed to the right – The numbers on the right side of the number line are more spread out than the numbers on the left.
Skewed to the right: Mean > Median
Skewed to the left: Mean < Median
Use the TI-84 Plus for the Next Five Problems
37. D – 69.9
Calculator output for sample mean = x̄
38. A – 26.44
Calculator output for sample standard deviation = Sx
39. A – 75
Calculator output for median = Med
40. D – 80
Range = Max – Min
Calculator output for max = MaxX
Calculator output for min = MinX
Range = 98 – 18
Range = 80
41. C – 38.5
IQR = Q3 – Q1
IQR = 90 – 51.5
IQR = 38.5
42. A – 57.5
Σx = Sum of all x values
n = Number of values in the data set
Mean = Σx / n = 57.5
43. B – 13
Count all of the observations in the rightmost column (called the leaf)
44. C – 90
Range = Max – Min
Range = 99 – 9
Range = 90
45. D – 78
Median position = (n + 1)/2 = (13 + 1)/2 = 7th position
Median = 78
46. B – 70.08
Mean = (9 + 17 + 66 + 69 + 74 + 77 + 78 + 81 + 81 + 81 + 89 + 90 + 99)/13
Mean = 911/13
Mean = 70.076 = 70.08
47. A – 26.83
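As a cross-check on problems 43 to 47, the summary statistics can be recomputed from the thirteen values listed in the solution to problem 46. This is only a sketch using Python's standard `statistics` module in place of the TI-84 Plus; the variable names are chosen for illustration.

```python
import statistics

# The thirteen data values as listed in the solution to problem 46.
data = [9, 17, 66, 69, 74, 77, 78, 81, 81, 81, 89, 90, 99]

n = len(data)                        # problem 43: number of observations
mean = statistics.mean(data)         # problem 46: 911 / 13
median = statistics.median(data)     # problem 45: 7th value of the sorted data
data_range = max(data) - min(data)   # problem 44: Max - Min
sample_sd = statistics.stdev(data)   # problem 47: divides by n - 1, like Sx

print(n, round(mean, 2), median, data_range, round(sample_sd, 2))
```

Note that `statistics.stdev` is the sample standard deviation; `statistics.pstdev` would give the population version and a slightly smaller number.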
48. C – $32,750
New mean = a(x̄) + b
New mean = $500(55.5) + $5,000
New mean = $27,750 + $5,000
New mean = $32,750
49. C – $2,400
New standard deviation = |a| * s
New standard deviation = |$500| * 4.8
New standard deviation = $2,400
Remember, we do not include the constant b in linear transformation problems that relate to measures of spread (range, IQR, standard deviation).
50. B – $23,000
New median = a(median) + b
New median = $500(36) + $5,000
New median = $18,000 + $5,000
New median = $23,000
51. A – $31,500
IQR = Q3 – Q1
The IQR for our original data, on the number of passes caught, is 76 – 13 = 63
New IQR = |a| * IQR
New IQR = |$500| * 63
New IQR = $31,500
Remember, we do not include the constant b in linear transformation problems that relate to measures of spread (range, IQR, standard deviation).
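The rules used in problems 48 to 51 can be collected into one short sketch: for new value = a * x + b, measures of center pick up both a and b, while measures of spread are only scaled by |a|. The numbers are the ones quoted in the solutions above; the variable names are illustrative assumptions.

```python
# Linear transformation: new value = a * x + b (in dollars).
a, b = 500, 5000

# Original summary statistics quoted in solutions 48-51.
mean_x, median_x, sd_x = 55.5, 36, 4.8
q1_x, q3_x = 13, 76

# Measures of center use both the multiplier a and the constant b.
new_mean = a * mean_x + b
new_median = a * median_x + b

# Measures of spread use only |a|; adding b shifts every value equally.
new_sd = abs(a) * sd_x
new_iqr = abs(a) * (q3_x - q1_x)

print(f"{new_mean:,.0f} {new_median:,.0f} {new_sd:,.0f} {new_iqr:,.0f}")
```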
52. B – No
Observations with a z-score greater than 3 in absolute value (|z| > 3) are considered outliers.
z = (x – x̄)/s = (66 – x̄)/s = -1.625
66 is not an outlier because |z| = 1.625 is not greater than 3.
53. B – Friend
We can gain some insight about the shape of a distribution from the median position within the box of a boxplot. If the median is positioned in the center of the box, we have a symmetric distribution.
54. C – Pet
If the median is positioned closer to the 1st quartile (bottom of the box), a distribution is skewed to the right (positive skew).
55. B – Friend
IQR = Q3 – Q1
When comparing boxplots side by side, the boxplot with the largest box will have the largest IQR. The boxplot for friend has the largest IQR. The distance from Q3 (top of the box) to Q1 (bottom of the box) is the largest.
56. C – Pet
In a boxplot, an asterisk is used to represent an outlier. The only boxplot with an outlier is pet.
57. C – Pet
The line in the middle of the box represents the median. Pet has a median of approximately 70, which is smaller than the median for friend (approximately 90) and the median for control (approximately 85).
58. C – Mean < Median
This distribution is going to be skewed to the left. The 1 receptionist will be paid significantly less than the 15 highly paid lawyers, so the numbers on the left side of the number line will be more spread out than the numbers on the right. Whenever a distribution is skewed to the left, the mean will be drawn towards small outliers, so the mean will be less than the median.
59. C – The mean lies to the right of the median
Whenever we have a distribution that is skewed to the right, the mean will be drawn towards large outliers, so the mean will be greater than the median.
60. F – All of the above
For example, if the original data was measured in inches, we would make sure to report the mean, median, mode, range, and standard deviation in terms of inches as well.
61. D – 1; sample
z = (x – x̄)/s = 1
We know that this is the z-score for the sample because of the notation that is used. x̄ is the notation for sample mean, and s is the notation for sample standard deviation. If this were population data, we would use μ to represent the population mean and σ to represent the population standard deviation. We would calculate the z-score the same way, but the notation that we use would just be different.
62. B – 5.5
μ = 3.1, σ = 0.8, z-score = 3.0
z = (x – μ)/σ
3.0 = (x – 3.1)/0.8
2.4 = x – 3.1
x = 5.5
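The algebra in problem 62 amounts to rearranging z = (x – μ)/σ into x = μ + zσ. A minimal sketch of that rearrangement, with a helper name (`value_from_z`) that is an assumption made for illustration:

```python
def value_from_z(z: float, mean: float, sd: float) -> float:
    """Invert z = (x - mean) / sd to recover the raw value x."""
    return mean + z * sd

mu, sigma = 3.1, 0.8            # population mean and standard deviation
x = value_from_z(3.0, mu, sigma)

# Plugging x back into the z-score formula should return the original z.
z_check = (x - mu) / sigma

print(round(x, 2), round(z_check, 2))
```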
63. C – 90.23%
At least 1 – 1/k² of the data falls within k standard deviations of the mean.
1 – 1/3.2² = 0.9023 = 90.23%
64. A – True
1 – 1/k² = 1 – 1/2² = 0.75
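The bound used in problems 63 and 64 is a one-line formula, so it is easy to check numerically. A small sketch, where the function name is an illustrative assumption and the rule is taken to apply only for k > 1:

```python
def min_fraction_within(k: float) -> float:
    """Lower bound 1 - 1/k^2 on the fraction of data within k standard deviations."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

print(round(min_fraction_within(3.2), 4))  # problem 63
print(min_fraction_within(2))              # problem 64
```

Because this is a lower bound that holds for any distribution shape, the actual percentage in a particular data set can be higher.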
65. B – False
If a distribution is unimodal and symmetric, according to the Empirical Rule, 99.7% of data will fall within three standard deviations of the mean, not 95%. 95% of data will fall within two standard deviations of the mean.
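The Empirical Rule percentages are rounded areas under a normal curve. As an illustration, assuming an exactly normal distribution (the rule itself is only an approximation for real bell-shaped data), they can be reproduced with Python's `statistics.NormalDist`:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.1%}")
```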
66. F – B and D
z-score (200 meter) = (21.60 – 24.20)/2 = -2.6/2 = -1.3
z-score (shot put) = (52 – 45)/5 = 7/5 = +1.4
A z-score of -1.3 in the 200 meter means that we performed 1.3 standard deviations worse than the average, and a z-score of +1.4 in the shot put means we performed 1.4 standard deviations better than the average. We performed better in the shot put than we did in the 200 m.
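The two z-scores in problem 66 can be recomputed with a small helper. This sketch only reproduces the arithmetic; the function and variable names are illustrative assumptions.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z_200m = z_score(21.60, mean=24.20, sd=2)   # 200 meter
z_shot_put = z_score(52, mean=45, sd=5)     # shot put

print(round(z_200m, 1), round(z_shot_put, 1))
```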
67. B – Median
When a data set is symmetric, the mean is the preferred measure of center.
When a data set is skewed, the median is the preferred measure of center.
This is because the mean is a sensitive summary measure (meaning it is very influenced by outliers) and the median is a resistant summary measure (meaning it is not very influenced by outliers).
68. B – False
A five-number summary of a data set provides useful information about the variability of a data set, not the center.
Remember that a boxplot is simply a visual representation of a data set’s five-number summary; therefore, if you are asked, a boxplot is a representation of a data set’s variability as well.
69. B – Level of education, eye color, gender
A is incorrect – Weight is a quantitative variable
C is incorrect – Length and width are both quantitative variables
D is incorrect – Price, temperature, and thickness are all quantitative variables
70. D – Unimodal and skewed to the right
This histogram is unimodal because there is one prominent peak, occurring around 0.
This histogram is skewed to the right because the numbers on the right side of the number line are more spread out than the numbers on the left.
71. E – C. R. Rao
72. B – 38
The median is the middle data value when the data is organized in order from smallest to largest. Ordering the data gives us: 9, 37, 39, 55. Since there are two numbers in the middle (37 and 39), the median is the average of those two numbers, which is 38.
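The even-count median rule from problem 72 can be written out step by step. In this sketch the input order of the four values is an assumption (the solution only gives them sorted), and the hand calculation is compared with `statistics.median`:

```python
import statistics

values = [55, 9, 39, 37]         # same four values; unsorted order assumed
ordered = sorted(values)         # [9, 37, 39, 55]

# With an even count there are two middle values; the median is their average.
mid = len(ordered) // 2
median_by_hand = (ordered[mid - 1] + ordered[mid]) / 2

print(ordered, median_by_hand, statistics.median(values))
```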
73. A – 36%
n = 25
Students 64–66 inches = 5
Students 66–68 inches = 4
Students 64–68 inches = 5 + 4 = 9
Percentage of students 64–68 inches = 9/25 = 0.36 = 36%
74. D – 0.47
normalcdf(500, 600, 550, 80.4) = 0.47
75. A – 0.69
normalcdf(–9^99, 590, 550, 80.4) = 0.69
76. C – 0.31
normalcdf(590, 9^99, 550, 80.4) = 0.31
77. B – 644.47
invNorm(0.88, 550, 80.4) = 644.47
78. C – 539.90
invNorm(0.45, 550, 80.4) = 539.90
The input that we enter for “area” when using the invNorm function is always the area to the left of some specified x value, P(X ≤ b). We are given information about the area to the right of the x value, so we must first find the area to the left of the x value before entering it into our calculator: 1 – 0.55 = 0.45.
79. A – 108.46
Q3 = invNorm(0.75, 550, 80.4) = 604.23
Q1 = invNorm(0.25, 550, 80.4) = 495.77
IQR = Q3 – Q1
IQR = 604.23 – 495.77 = 108.46
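For readers without a TI-84 Plus, problems 74 to 79 can be approximated in Python. This sketch assumes that `NormalDist.cdf` plays the role of normalcdf with a lower bound of negative infinity, and that `NormalDist.inv_cdf` plays the role of invNorm; the variable names are illustrative.

```python
from statistics import NormalDist

# Mean 550 and standard deviation 80.4, as used in solutions 74-79.
scores = NormalDist(mu=550, sigma=80.4)

p_between = scores.cdf(600) - scores.cdf(500)   # normalcdf(500, 600, 550, 80.4)
p_below_590 = scores.cdf(590)                   # area to the left of 590
p_above_590 = 1 - scores.cdf(590)               # area to the right of 590

x_88 = scores.inv_cdf(0.88)                     # invNorm(0.88, 550, 80.4)
x_45 = scores.inv_cdf(1 - 0.55)                 # convert right-tail area to left-tail first
iqr = scores.inv_cdf(0.75) - scores.inv_cdf(0.25)

print(round(p_between, 2), round(p_below_590, 2), round(p_above_590, 2))
print(round(x_88, 2), round(x_45, 2), round(iqr, 2))
```

No large stand-in such as 9^99 is needed here, because `cdf` already integrates from negative infinity.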
80. B – 0.5915
We are looking for the probability that a student does not get admitted. Since you need a SAT score of 1560 to get into MSU, we must look for the area under the curve below 1560.
P(x < 1560) = normalcdf(–9^99, 1560, 1492, 294) = 0.59145 = 0.5915
81. B – 1976
Applicants that score in the top 5% are in the 95th percentile. We know this because 95% of the data is at or below their score. We can use the TI-84 Plus to find the score for the 95th percentile:
invNorm(0.95, 1492, 294) = 1975.587 = 1976
82. A – 1869
invNorm(0.90, 1492, 294) = 1868.78 = 1869
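The SAT calculations in problems 80 to 82 follow the same pattern and can be checked the same way. This is a sketch using Python's `statistics.NormalDist` as a stand-in for the calculator functions, with illustrative variable names:

```python
from statistics import NormalDist

# Mean 1492 and standard deviation 294, as used in solutions 80-82.
sat = NormalDist(mu=1492, sigma=294)

p_not_admitted = sat.cdf(1560)     # area below the required score of 1560
top_5_cutoff = sat.inv_cdf(0.95)   # 95th percentile
p90_score = sat.inv_cdf(0.90)      # 90th percentile

print(round(p_not_admitted, 4), round(top_5_cutoff), round(p90_score))
```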
83. B – John W. Tukey