Chapter 1 Highlights
Variable - what we are measuring
Quantitative - numerical, where mathematical operations make sense. These have UNITS.
Categorical - puts individuals into categories
Numbers don't always mean Quantitative...
Frequency
vs. Relative Frequency
vs. cumulative frequency
vs. relative cumulative frequency
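A minimal Python sketch of how the four kinds of frequency relate. The data set (number of siblings for 10 students) is made up for illustration only.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: number of siblings reported by 10 students.
data = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]

counts = Counter(data)
values = sorted(counts)
n = len(data)

frequency = [counts[v] for v in values]            # count of each value
relative = [f / n for f in frequency]              # count / total
cumulative = list(accumulate(frequency))           # running total of counts
relative_cumulative = [c / n for c in cumulative]  # running total / total

print("value  freq  rel  cum  relcum")
for v, f, r, c, rc in zip(values, frequency, relative, cumulative, relative_cumulative):
    print(f"{v:>5} {f:>5} {r:>5.2f} {c:>4} {rc:>6.2f}")
```

The last cumulative frequency always equals the number of observations, and the last relative cumulative frequency always equals 1.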
Two-Way Tables and Marginal Distributions
Distributions are of VARIABLES, not individual values!!!
To examine a marginal distribution,
1) Use the data in the table to calculate the marginal distribution (in percents) of the row or column totals.
2) Make a graph to display the marginal distribution.
Note: Percents are often more informative than counts, especially when comparing groups of different sizes.
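Step 1 above can be sketched in Python as follows. The two-way table, its labels, and its counts are hypothetical.

```python
# Hypothetical two-way table of counts: rows are one categorical variable,
# columns are another. All labels and counts are made up.
table = {
    "Female": {"Yes": 30, "No": 20},
    "Male": {"Yes": 10, "No": 40},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of the row variable: row totals as percents of everyone.
row_marginal = {
    label: 100 * sum(row.values()) / grand_total for label, row in table.items()
}

# Marginal distribution of the column variable: column totals as percents of everyone.
column_labels = list(next(iter(table.values())))
column_marginal = {
    col: 100 * sum(row[col] for row in table.values()) / grand_total
    for col in column_labels
}

print(row_marginal)
print(column_marginal)
```

Each marginal distribution uses the grand total as its denominator, so its percents add to 100.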
A Conditional Distribution of a variable describes the values of that variable among individuals who have a specific value of another variable.
To examine or compare conditional distributions,
1) Select the row(s) or column(s) of interest.
2) Use the data in the table to calculate the conditional distribution (in percents) of the row(s) or column(s).
3) Make a graph to display the conditional distribution.
Use a side-by-side bar graph or segmented bar graph to compare distributions.
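Steps 1 and 2 above can be sketched in Python as follows. The table and the helper name conditional_distribution are hypothetical. The key difference from a marginal distribution is the denominator: each row is divided by its own total, not the grand total.

```python
# Hypothetical two-way table of counts (labels and numbers are made up).
table = {
    "Female": {"Yes": 30, "No": 20},
    "Male": {"Yes": 10, "No": 40},
}

def conditional_distribution(row):
    """Percents of each column value among the individuals in one row."""
    total = sum(row.values())
    return {value: 100 * count / total for value, count in row.items()}

# One conditional distribution of the column variable for each row value.
conditional = {label: conditional_distribution(row) for label, row in table.items()}

for label, dist in conditional.items():
    print(label, dist)
```

Here 60% of the "Female" row answered "Yes" against 20% of the "Male" row, which is the kind of comparison a side-by-side or segmented bar graph makes visible.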
There are three main ways to display quantitative data:
- Dotplots
- Stemplots
  - split
  - back-to-back
- Histograms
How to create a dotplot:
1) Draw a horizontal axis (a number line) and label it with the variable name.
2) Scale the axis from the minimum to the maximum value.
3) Mark a dot above the location on the horizontal axis corresponding to each data value.
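A rough text-only sketch of the dotplot steps in Python. The function name text_dotplot and the data are hypothetical, and the sketch assumes whole-number data with at most three digits.

```python
from collections import Counter

def text_dotplot(data, label):
    """Build a text dotplot for whole-number data (one '*' per observation)."""
    counts = Counter(data)
    # Scale the axis from the minimum to the maximum value.
    axis = list(range(min(data), max(data) + 1))
    lines = []
    # Stack dots above each axis value, drawing the tallest level first.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join(" * " if counts[v] >= level else "   " for v in axis)
        lines.append(row.rstrip())
    # The horizontal axis, followed by the variable name.
    lines.append(" ".join(f"{v:^3}" for v in axis).rstrip())
    lines.append(label)
    return "\n".join(lines)

# Hypothetical data: goals scored in 8 games.
print(text_dotplot([1, 2, 2, 3, 3, 3, 5, 2], "Goals scored"))
```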
How to make a stemplot:
1) Separate each observation into a stem (all but the final digit) and a leaf (the final digit).
2) Write all possible stems from the smallest to the largest in a vertical column and draw a vertical line to the right of the column.
3) Write each leaf in the row to the right of its stem. Arrange the leaves in increasing order out from the stem.
4) Provide a key that explains in context what the stems and leaves represent.
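The four steps above can be sketched in Python. The function name stemplot and the test scores are hypothetical, and the sketch assumes non-negative whole numbers.

```python
from collections import defaultdict

def stemplot(data):
    """Return the rows of a stemplot for non-negative whole numbers."""
    leaves = defaultdict(list)
    for value in sorted(data):
        # Stem: all but the final digit. Leaf: the final digit.
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    rows = []
    # Write every possible stem from smallest to largest, even one with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        row_leaves = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {row_leaves}".rstrip())
    return rows

scores = [72, 85, 91, 68, 77, 85, 59, 93, 80, 74]  # hypothetical test scores
rows = stemplot(scores)
print("\n".join(rows))
print("Key: 7 | 2 means a score of 72")
```

Sorting the data first means the leaves in each row come out in increasing order without a separate step.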
Splitting Stems and Back-to-Back Stemplots
When data values are “bunched up”, we can get a better picture of the distribution by splitting stems.
Two distributions of the same quantitative variable can be compared using a back-to-back stemplot with common stems.
How to make a histogram:
1) Divide the range of data into classes of equal width.
2) Find the count (frequency) or percent (relative frequency) of individuals in each class.
3) Label and scale your axes and draw the histogram. The height of the bar equals its frequency. Adjacent bars should touch, unless a class contains no individuals.
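Steps 1 and 2 above can be sketched in Python. The function name class_counts, the data, and the choice of classes (width 10, starting at 50) are all hypothetical. Each class includes its left endpoint and excludes its right endpoint.

```python
def class_counts(data, start, width, n_classes):
    """Count observations in classes [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_classes
    for x in data:
        index = int((x - start) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{x} falls outside the classes")
        counts[index] += 1
    return counts

data = [59, 68, 72, 74, 77, 80, 85, 85, 91, 93]  # hypothetical test scores
counts = class_counts(data, start=50, width=10, n_classes=5)  # frequencies
percents = [100 * c / len(data) for c in counts]              # relative frequencies

# A sideways text histogram: one '#' per observation in the class.
for i, c in enumerate(counts):
    low = 50 + 10 * i
    print(f"[{low}, {low + 10}): {'#' * c}")
```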
(Using your calculator)
1. Enter the data into L1.
(press the STAT button, highlight EDIT and choice #1 and press ENTER).
2. Turn on the stat-plot.
(press 2nd and the Y= button to select STAT PLOT, highlight choice #1 and press ENTER, select ON and press ENTER, select the histogram under TYPE and press ENTER)
3. Adjust your window.
(press the WINDOW button; enter your minimum value (smaller than the smallest observation) for Xmin, enter your maximum value (larger than the largest observation) for Xmax, enter the length of your classes for Xscl (i.e. what you are counting by to get from Xmin to Xmax), adjust your Ymin=0 and Ymax appropriately)
OR
Go to ZOOM and select #9 ZoomStat
Using Histograms Wisely
Here are several cautions based on common mistakes students make when using histograms.
1) Don’t confuse histograms and bar graphs.
2) Don’t use counts (in a frequency table) or percents (in a relative frequency table) as data.
3) Use percents instead of counts on the vertical axis when comparing distributions with different numbers of observations.
4) Just because a graph looks nice, it’s not necessarily a meaningful display of data.
Relative Frequency Histogram
This type of histogram displays proportions or percents rather than counts.
Cumulative Frequency Histogram (Ogive)
Examine the Distribution
Look for the OVERALL pattern and any striking DEVIATIONS from that pattern
Describe the shape, center, and spread and determine if there are any outliers (don't forget your SOCS!)
Shape
Skewed or symmetric?
Symmetric - the left and right hand sides of the histogram are approximately mirror images of each other
Skewed right - the right side of the histogram extends MUCH farther out than the left side ("tail" goes to the right)
Skewed left - the left side of the histogram extends MUCH farther out than the right side ("tail" goes to the left)
Uniform distribution - doesn't appear to have any modes - pretty much the same height across the whole distribution
Measures of Center
We have two ways of numerically measuring the center of a quantitative data set - the Median and the Mean.
Both of these can be considered to give us the "average" of a data set.
Some issues with notation:
There are two ways to write the mean: μ (population mean) and x̄ (sample mean).
The choice depends on whether you are talking about the entire POPULATION of interest or just a SAMPLE from the entire population.
Unless you are 100% positive you have the data from the ENTIRE population, use x̄. If you see μ being used, then the data must be from the entire population.
Comparing the Mean and Median
In a symmetric distribution the mean and median are VERY close together.
In a skewed distribution the mean will be greater than or less than the median, depending upon the skew. The larger the difference between the two, the greater the skew.
If the mean is greater than the median, the distribution is skewed right
If the mean is smaller than the median, the distribution is skewed left
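A small Python illustration of the comparison above, using the standard library's statistics module. The three data sets are made up so that the effect is easy to see.

```python
from statistics import mean, median

# Hypothetical data sets chosen to show each case.
symmetric = [1, 2, 3, 4, 5]
skewed_right = [1, 2, 3, 4, 40]   # one very large value pulls the mean up
skewed_left = [-30, 6, 7, 8, 9]   # one very small value pulls the mean down

for name, data in [("symmetric", symmetric),
                   ("skewed right", skewed_right),
                   ("skewed left", skewed_left)]:
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

The median barely moves when one extreme value is added, while the mean is pulled toward the tail. That is why the mean lands on the tail side of the median in a skewed distribution.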
Measures of Spread
As with measures of center, we have two different ways to measure the spread in quantitative data - quartiles and IQR, and the standard deviation and variance.
Standard Deviation - (written as σ - population or s - sample) and Variance - (written as σ² - population or s² - sample)
· The standard deviation gives a measure of the "average" distance that data points fall from the mean
· s = 0 ONLY when there is NO SPREAD - this only happens when every observation is the SAME; otherwise s > 0
· The more spread out the observations are, the greater s will be
· s has the same units of measurement as the observations do
· Like we saw with the mean, s is not resistant
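A short Python sketch of the two standard deviations, computed by hand and checked against the standard library. The data set is hypothetical. The sketch assumes the usual definitions, which are not spelled out above: the sample version s divides the sum of squared deviations by n - 1, and the population version σ divides by n.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

x_bar = mean(data)
squared_deviations = [(x - x_bar) ** 2 for x in data]

# Sample standard deviation s: divide by n - 1 (assumed standard definition).
s = sqrt(sum(squared_deviations) / (len(data) - 1))
# Population standard deviation sigma: divide by n.
sigma = sqrt(sum(squared_deviations) / len(data))

print(f"s = {s:.3f}, library stdev = {stdev(data):.3f}")
print(f"sigma = {sigma:.3f}, library pstdev = {pstdev(data):.3f}")

# No spread at all: every observation is the same, so s = 0.
print(stdev([3, 3, 3]))
```

The variance is the square of the corresponding standard deviation, which is why it is in squared units while s keeps the units of the data.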
Choosing measures of center and spread
1. FIVE-NUMBER SUMMARY or Median and IQR
The Five-Number Summary gives a quick summary of both the center and spread of your data. Some people also consider giving the IQR with the Median to be a sufficient measure of center and spread.
It contains the Minimum observation, Q1, the Median, Q3, and the Maximum observation.
Use when the distribution is skewed or has strong outliers
Used to create another graphical display of quantitative data - the BOXPLOT
2. The Mean and Standard Deviation
Use for reasonably symmetric distributions that are free of outliers
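A minimal Python sketch of the five-number summary and IQR. The function name and data are hypothetical. Quartile conventions vary, so this sketch assumes one common classroom rule: Q1 is the median of the observations below the overall median's position and Q3 is the median of those above it. It needs at least two observations.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for two or more observations."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, leave the middle observation out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

scores = [59, 68, 72, 74, 77, 80, 85, 85, 91, 93]  # hypothetical test scores
minimum, q1, med, q3, maximum = five_number_summary(scores)
iqr = q3 - q1

print(minimum, q1, med, q3, maximum)
print("IQR =", iqr)
```

Other software may use a different quartile rule and report slightly different values for Q1 and Q3 on small data sets.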
Boxplot
· A graph of the five-number summary
· A central box spans the quartiles, Q1 and Q3, with a line marking the median, M.
· Lines extend from the edge of the box (Q1 and Q3) out to the minimum and maximum values, respectively. IF THERE ARE OUTLIERS: DO NOT extend the lines to outliers. Only extend to the minimum and maximum values that are NOT outliers. Mark outliers with an asterisk.
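A Python sketch of the numbers needed to draw such a boxplot. The notes above do not say how to decide which values are outliers, so this sketch assumes the common 1.5 × IQR rule: anything more than 1.5 × IQR below Q1 or above Q3 is flagged. The function name, the data, and the quartile convention (medians of the lower and upper halves) are also assumptions.

```python
from statistics import median

def boxplot_parts(data):
    """Box edges, median, whisker ends, and outliers (assumed 1.5 x IQR rule)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + 1:] if n % 2 else ordered[half:])
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in ordered if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": median(ordered),
        "q3": q3,
        # Whiskers stop at the most extreme values that are NOT outliers.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        # Outliers are plotted individually (for example with an asterisk).
        "outliers": [x for x in ordered if x < low_fence or x > high_fence],
    }

parts = boxplot_parts([1, 2, 3, 4, 5, 6, 7, 30])  # hypothetical data
print(parts)
```

In this made-up data set the value 30 is flagged, so the upper whisker stops at 7 rather than reaching the maximum.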
How to use the calculator for numerical summaries and boxplots:
(Using your calculator)
1. Enter the data into L1.
(press the STAT button, highlight EDIT and choice #1 and press ENTER).
For Numerical Summaries:
2. Press the STAT button, arrow over to CALC
3. Select 1-Var Stats
4. You will get a list of values on your main screen. Arrow through to find all necessary values.
mean
standard deviation
Minimum Observation
Q1
Median
Q3
Maximum
For Boxplot:
2. Turn on the stat-plot.
(press 2nd and the Y= button to select STAT PLOT, highlight choice #1 and press ENTER, select ON and press ENTER)
3. Select the FIRST boxplot option under "TYPE" - this one graphs outliers
4. Adjust your window. (ZOOM, select #9 ZoomStat)