STAT 200 On-Line
Guided Exercise 1
Be sure to:
• Please submit your answers in a Word file to Sakai at the same place you downloaded the file
• Remember you can paste any Excel or JMP output into a Word file (use Paste Special for best results).
• Put your name and the Assignment # on the filename: e.g. IlventoGuided1.doc
Answer as completely as you can and show your work.
1. A new popular statistical item is a poster of a collection of interesting statistics and graphics. The one on the right is a collection of numbers on aging in the U.S. It was put together by an insurance company to emphasize we will be living longer and therefore we will need to plan better for things like retirement, health care, and living arrangements. Some of the data are based on surveys, some from life tables, and some based on models that project into the future (the general source is given on the graph; larger, more detailed pictures are given on the second and third pages). The statistics chosen and the way they are presented are designed to catch your interest and stimulate discussion. Please note, it is an insurance company that is presenting these ideas and their goal is to sell products.
The details are better seen on the next two pages and the source is given below (you can also search for “Live Longer Slate” and find it)
http://www.slate.com/articles/health_and_science/prudential/2013/08/why_you_need_to_start_thinking_about_the_big_truths_with_living_longer.html
Review the data and answer the following questions.
a. What figures stand out to you as being particularly well presented? In other words, which are effective in making their point?
No right or wrong answer here. I like comparing the probabilities of being left handed, blonde or playing an instrument with living to 100.
b. Do any of the figures seem suspect to you, or based on an agenda, or perhaps presented in a way to distort or bias an issue? I don’t mean to imply there is anything wrong with the numbers, but one might quibble with what is presented (or not) or the way in which they are presented.
No right or wrong answer here. I would have liked more information on the oldest cities. Did cities need to be a certain size? Why did they choose 60+?
Jimmy Fallon already replaced Leno!!!!
2. A researcher in Delaware wanted to see the effect of an education program for hospital patients who had heart attacks on their likelihood of returning to the hospital in 30 days (referred to as recidivism). The education program consisted of more involved training on diet, exercise, weight, and sticking to the recommendations of the physician. The education program was given to a random sample of patients during 2011 and the results were compared to a control group who did not receive the training. An analysis of the data shows that the group receiving the training had a significant reduction in recidivism compared to the control group.
a) What is the unit of analysis in this study?
The heart patient
b) Identify the data collection method for this study
Experimental design, since there is a treatment group and a control group. The treatment was given to a random sample of patients.
c) Would the study involve descriptive or inferential statistics?
The researcher wanted to describe the data, but she also was interested in inferring to all heart patients admitted to a hospital. So it is inferential.
d) What is the population (or sample) of interest to the researchers?
All heart patients admitted to a hospital.
3. Todd Andrlik, founder and editor of Journal of the American Revolution (allthingsliberty.com), wrote a piece about how young many of the founding fathers were when the Declaration of Independence was first signed in 1776. There were 56 signers of the Declaration of Independence and their ages are given below, sorted by age.
OBS  Person                  Age  Gender  State
  1  Thomas Lynch             26  Male    South Carolina
  2  Edward Rutledge          26  Male    South Carolina
  3  George Walton            27  Male    Georgia
  4  Thomas Heyward           29  Male    South Carolina
  5  Benjamin Rush            30  Male    Pennsylvania
  6  Elbridge Gerry           31  Male    Massachusetts
  7  Thomas Jefferson         33  Male    Virginia
  8  Thomas Stone             33  Male    Maryland
  9  William Hooper           34  Male    North Carolina
 10  Arthur Middleton         34  Male    South Carolina
 11  James Wilson             34  Male    Pennsylvania
 12  Samuel Chase             35  Male    Maryland
 13  William Paca             35  Male    Maryland
 14  John Penn                35  Male    North Carolina
 15  George Clymer            37  Male    Pennsylvania
 16  Thomas Nelson, Jr.       37  Male    Virginia
 17  Charles Carroll          38  Male    Maryland
 18  Francis Hopkinson        38  Male    New Jersey
 19  Carter Braxton           39  Male    Virginia
 20  John Hancock             39  Male    Massachusetts
 21  John Adams               40  Male    Massachusetts
 22  William Floyd            41  Male    New York
 23  Button Gwinnett          41  Male    Georgia
 24  Francis Lightfoot Lee    41  Male    Virginia
 25  Robert Morris            42  Male    Pennsylvania
 26  Thomas McKean            42  Male    Delaware
 27  George Read              42  Male    Delaware
 28  Samuel Huntington        44  Male    Connecticut
 29  Richard Henry Lee        44  Male    Virginia
 30  Robert Treat Paine       45  Male    Massachusetts
 31  Richard Stockton         45  Male    New Jersey
 32  William Williams         45  Male    Connecticut
 33  Josiah Bartlett          46  Male    New Hampshire
 34  Joseph Hewes             46  Male    North Carolina
 35  George Ross              46  Male    Pennsylvania
 36  William Whipple          46  Male    New Hampshire
 37  Caesar Rodney            47  Male    Delaware
 38  William Ellery           48  Male    Rhode Island
 39  Oliver Wolcott           49  Male    Connecticut
 40  Abraham Clark            50  Male    New Jersey
 41  Benjamin Harrison        50  Male    Virginia
 42  Lewis Morris             50  Male    New York
 43  George Wythe             50  Male    Virginia
 44  John Morton              51  Male    Pennsylvania
 45  Lyman Hall               52  Male    Georgia
 46  Samuel Adams             53  Male    Massachusetts
 47  John Witherspoon         53  Male    New Jersey
 48  Roger Sherman            55  Male    Connecticut
 49  James Smith              56  Male    Pennsylvania
 50  Philip Livingston        60  Male    New York
 51  George Taylor            60  Male    Pennsylvania
 52  Matthew Thornton         62  Male    New Hampshire
 53  Francis Lewis            63  Male    New York
 54  John Hart                65  Male    New Jersey
 55  Stephen Hopkins          69  Male    Rhode Island
 56  Benjamin Franklin        70  Male    Pennsylvania
a. Create a stem and leaf plot of the data (you can do this by “hand” in Word in the table below). To do this you need to decide on the stems and then the leaves.
Stem  Leaf
1
2     6679
3     0133444555778899
4     0111222445556666789
5     0000123356
6     002359
7     0
8
Or
Stem  Leaf
2*    6679
3     0133444
3*    555778899
4     011122244
4*    5556666789
5     00001233
5*    56
6     0023
6*    59
7     0
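The stem and leaf plot can also be built programmatically. The following is a minimal sketch in Python, assuming the ages are typed in from the table of signers; the names `AGES` and `stem_and_leaf` are illustrative and not part of the exercise. It uses the tens digit as the stem and the units digit as the leaf, which reproduces the first (unsplit) version of the plot.

```python
from collections import defaultdict

# Ages of the 56 signers, copied from the table of signers (sorted by age).
AGES = [26, 26, 27, 29, 30, 31, 33, 33, 34, 34, 34, 35, 35, 35, 37, 37,
        38, 38, 39, 39, 40, 41, 41, 41, 42, 42, 42, 44, 44, 45, 45, 45,
        46, 46, 46, 46, 47, 48, 49, 50, 50, 50, 50, 51, 52, 53, 53, 55,
        56, 60, 60, 62, 63, 65, 69, 70]


def stem_and_leaf(values):
    """Return {stem: leaves}, with the tens digit as stem and units digit as leaf."""
    plot = defaultdict(str)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem] += str(leaf)
    return dict(plot)


if __name__ == "__main__":
    for stem, leaves in stem_and_leaf(AGES).items():
        print(f"{stem} | {leaves}")
```

Stems with no observations (1 and 8 in the hand-drawn table) are simply absent from the returned dictionary.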
b. Calculate the mean, median, and mode for this data. The sum of all the values is Sum(x) = 2,479.
Mean = Sum(x)/n = 2479/56 = 44.27 or 44.3
Median is the middle value. Since n is even, the median is the average of the 28th and 29th values = (44 + 44)/2 = 44
Mode is the most frequent value. 46 and 50 occur 4 times each, so there are two modes. Or you could say the mode is undefined.
c. Briefly describe the distribution - focus on the shape of the distribution, and whether there are any outliers or strange values
The distribution appears symmetrical and mound shaped, with its center in the mid-40s. There are no large outliers.
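The three measures of center can be checked with a short script. This is a sketch using Python's standard `statistics` module, assuming the ages are entered from the table of signers; the variable names are illustrative only.

```python
from statistics import median, multimode

# Ages of the 56 signers, copied from the table of signers (sorted by age).
AGES = [26, 26, 27, 29, 30, 31, 33, 33, 34, 34, 34, 35, 35, 35, 37, 37,
        38, 38, 39, 39, 40, 41, 41, 41, 42, 42, 42, 44, 44, 45, 45, 45,
        46, 46, 46, 46, 47, 48, 49, 50, 50, 50, 50, 51, 52, 53, 53, 55,
        56, 60, 60, 62, 63, 65, 69, 70]

n = len(AGES)
age_sum = sum(AGES)
age_mean = age_sum / n          # Sum(x) / n
age_median = median(AGES)       # averages the 28th and 29th values when n = 56
age_modes = multimode(AGES)     # every value tied for the highest count

print(f"n = {n}, Sum(x) = {age_sum}")
print(f"mean = {age_mean:.2f}, median = {age_median}, modes = {age_modes}")
```

`multimode` returns all tied values rather than a single one, which matches the answer that 46 and 50 share the highest frequency.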
Below is the output from JMP software.
Age
[Distribution plot of Age; axis ticks at 30, 40, 50, 60, 70]

Quantiles
100.0%  maximum   70.0
 99.5%            70.0
 97.5%            69.6
 90.0%            60.6
 75.0%  quartile  50.0
 50.0%  median    44.0
 25.0%  quartile  35.5
 10.0%            30.7
  2.5%            26.0
  0.5%            26.0
  0.0%  minimum   26.0

Summary Statistics
Mean                   44.3
Std Dev                10.7
Std Err Mean            1.4
Upper 95% Mean         47.1
Lower 95% Mean         41.4
N                      56.0
Sum                  2479.0
Variance              114.1
Skewness                0.5
Kurtosis               -0.2
CV                     24.1
N Missing               0.0
Median                 44.0
Range                  44.0
Interquartile Range    14.5

Stem and Leaf
Stem  Leaf        Count
7     0               1
6     59              2
6     0023            4
5     56              2
5     00001233        8
4     5556666789     10
4     011122244       9
3     555778899       9
3     0133444         7
2     6679            4
2|6 represents 26
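Several of the JMP summary statistics can be reproduced by hand or with a few lines of code. The sketch below uses Python's standard library and assumes the ages are entered from the table of signers. The quartile method is an assumption: Python's "exclusive" convention happens to match the quartiles JMP reports for these data, but the exercise does not state which rule JMP applies.

```python
from math import sqrt
from statistics import quantiles, stdev, variance

# Ages of the 56 signers, copied from the table of signers (sorted by age).
AGES = [26, 26, 27, 29, 30, 31, 33, 33, 34, 34, 34, 35, 35, 35, 37, 37,
        38, 38, 39, 39, 40, 41, 41, 41, 42, 42, 42, 44, 44, 45, 45, 45,
        46, 46, 46, 46, 47, 48, 49, 50, 50, 50, 50, 51, 52, 53, 53, 55,
        56, 60, 60, 62, 63, 65, 69, 70]

n = len(AGES)
age_mean = sum(AGES) / n
age_variance = variance(AGES)        # sample variance (divides by n - 1)
age_sd = stdev(AGES)                 # square root of the sample variance
age_se = age_sd / sqrt(n)            # standard error of the mean
age_cv = 100 * age_sd / age_mean     # coefficient of variation, in percent
age_range = max(AGES) - min(AGES)

# Assumption: "exclusive" is one common quartile convention, not necessarily JMP's.
q1, q2, q3 = quantiles(AGES, n=4, method="exclusive")
age_iqr = q3 - q1

print(f"variance = {age_variance:.1f}, sd = {age_sd:.1f}, se = {age_se:.1f}")
print(f"CV = {age_cv:.1f}, range = {age_range}, Q1 = {q1}, Q3 = {q3}, IQR = {age_iqr}")
```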
4. For the Signers of the Declaration of Independence data above, let’s now focus on two nominal level variables: Gender and State.
a. For gender, how would you summarize the distribution for this variable? Think in terms of how we might describe data to talk about this variable. Is it in fact a variable?
Gender is not a variable, it is a constant. All the signers were male.
b. For State, there were 13 original colonies. Use the table below to make a frequency table of the information. Then summarize the results in words. You can decide how you might organize the states: alphabetically, by north and south, or in frequency order. How you organize the states will affect how you can use cumulative frequencies to describe the data.
State           Frequency  Relative frequency  Cumulative relative frequency
Connecticut         4      4/56 = .0714         .0714
Delaware            3      .0536                .1250
Georgia             3      .0536                .1786
Maryland            4      .0714                .2500
Massachusetts       5      .0893                .3393
New Hampshire       3      .0536                .3929
New Jersey          5      .0893                .4821
New York            4      .0714                .5536
North Carolina      3      .0536                .6071
Pennsylvania        9      .1607                .7679
Rhode Island        2      .0357                .8036
South Carolina      4      .0714                .8750
Virginia            7      .1250               1.0000
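The relative and cumulative relative frequencies follow mechanically from the counts. A minimal sketch in Python, assuming the counts are keyed in alphabetical order as in the table; `STATE_COUNTS` and `frequency_table` are illustrative names, not part of the exercise.

```python
from itertools import accumulate

# Number of signers per state, in alphabetical order.
STATE_COUNTS = {
    "Connecticut": 4, "Delaware": 3, "Georgia": 3, "Maryland": 4,
    "Massachusetts": 5, "New Hampshire": 3, "New Jersey": 5, "New York": 4,
    "North Carolina": 3, "Pennsylvania": 9, "Rhode Island": 2,
    "South Carolina": 4, "Virginia": 7,
}


def frequency_table(counts):
    """Return rows of (level, count, relative frequency, cumulative relative frequency)."""
    total = sum(counts.values())
    running = accumulate(counts.values())  # running total of the counts
    return [(level, count, count / total, cum / total)
            for (level, count), cum in zip(counts.items(), running)]


if __name__ == "__main__":
    for level, count, rel, cum in frequency_table(STATE_COUNTS):
        print(f"{level:<15} {count:>2}  {rel:.4f}  {cum:.4f}")
```

Because the cumulative column depends on row order, sorting the dictionary by count before calling `frequency_table` gives the frequency-ordered version of the table instead.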
JMP can organize it alphabetically or in ascending (or descending) order.
[Bar charts of State, one in alphabetical order and one in ascending order of count]

State (alphabetical order)
Frequencies
Level            Count    Prob  Cum Prob
Connecticut          4  0.0714    0.0714
Delaware             3  0.0536    0.1250
Georgia              3  0.0536    0.1786
Maryland             4  0.0714    0.2500
Massachusetts        5  0.0893    0.3393
New Hampshire        3  0.0536    0.3929
New Jersey           5  0.0893    0.4821
New York             4  0.0714    0.5536
North Carolina       3  0.0536    0.6071
Pennsylvania         9  0.1607    0.7679
Rhode Island         2  0.0357    0.8036
South Carolina       4  0.0714    0.8750
Virginia             7  0.1250    1.0000
Total               56  1.0000    1.0000
N Missing 0
13 Levels

State (ascending order of count)
Frequencies
Level            Count    Prob  Cum Prob
Rhode Island         2  0.0357    0.0357
Delaware             3  0.0536    0.0893
Georgia              3  0.0536    0.1429
New Hampshire        3  0.0536    0.1964
North Carolina       3  0.0536    0.2500
Connecticut          4  0.0714    0.3214
Maryland             4  0.0714    0.3929
New York             4  0.0714    0.4643
South Carolina       4  0.0714    0.5357
Massachusetts        5  0.0893    0.6250
New Jersey           5  0.0893    0.7143
Virginia             7  0.1250    0.8393
Pennsylvania         9  0.1607    1.0000
Total               56  1.0000    1.0000
N Missing 0
13 Levels
5. Below is the data for infant mortality for 44 countries, and the same data for OECD countries. The
Organization for Economic Co-operation and Development (OECD) is an international economic
organization of 34 countries, founded in 1961 to stimulate economic progress and world trade. It is a forum
of countries describing themselves as committed to democracy and the market economy, providing a
platform to compare policy experiences, seek answers to common problems, identify good practices
and coordinate domestic and international policies of its members (Wikipedia,
https://en.wikipedia.org/wiki/Organisation_for_Economic_Co-operation_and_Development). OECD’s web site provided some data
on infant mortality for 44 countries. Infant mortality (the rate of death of children under 1 year of age per
1,000 live births) is a measure of development. The table below has the data for 44 countries and the 34
OECD countries.
a. Create a stem and leaf plot of the data (you can do this by “hand” in Word in the table below by typing
in the stems and the leaves). Do this for the 44 countries and the 34 OECD countries.
b. Calculate the mean, median, and mode for this data
c. Briefly describe the distribution - focus on the shape of the distribution, and whether there are any
outliers or strange values
The sum of x, Sum(x), for all 44 countries is 292.30 and the Sum(x) for the 34 OECD countries is 128.20.
COUNTRY            IM
Iceland           1.3
Finland           1.7
Slovenia          1.7
Estonia           2.0
Japan             2.0
Norway            2.3
Spain             2.4
Sweden            2.4
Czech Rep.        2.5
Denmark           2.5
Israel            2.5
Austria           2.6
Germany           2.8
Italy             2.9
Korea             2.9
Portugal          2.9
Australia         3.1
Switzerland       3.3
Belgium           3.5
Ireland           3.5
United Kingdom    3.5
France            3.6
Luxembourg        3.6
Greece            3.7
Lithuania         3.7
Netherlands       4.0
New Zealand       4.4
Latvia            4.4
Poland            4.5
Canada            4.8
Hungary           5.0
United States     5.0
Slovak Rep.       5.1
Chile             7.0
Russian Fed.      8.2
Costa Rica        8.4
Turkey           10.2
China            10.9
Brazil           12.3
Mexico           13.0
Colombia         17.5
Indonesia        24.5
South Africa     32.8
India            41.4

Stem  Leaf
 1    377
 2    0034455568999
 3    135556677
 4    04458
 5    001
 6
 7    0
 8    24
 9
10    29
11
12    3
13    0
14
15
16
17    5
18
19
20
21
22
23
24    5
25
26
27
28
29
30
31
32    8
33
34
35
36
37
38
39
40
41    4
1|3 represents 1.3

The mean for the 44 countries is 6.64 (292.30/44 = 6.64) while the median is 3.60. Since n = 44 is even, the median is the average of the two middle values, the 22nd (3.6) and the 23rd (3.6): (3.6 + 3.6)/2 = 3.6. The mean is pulled upward by the extreme values in the data. The most extreme values are for Indonesia (24.5), South Africa (32.8), and India (41.4). There is no single modal value. Three values occur 3 times: 2.5, 2.9 and 3.5.
This distribution is highly skewed to the right with a few extreme outliers.
OECD Country       IM
Iceland           1.3
Finland           1.7
Slovenia          1.7
Estonia           2.0
Japan             2.0
Norway            2.3
Spain             2.4
Sweden            2.4
Czech Rep.        2.5
Denmark           2.5
Israel            2.5
Austria           2.6
Germany           2.8
Italy             2.9
Korea             2.9
Portugal          2.9
Australia         3.1
Switzerland       3.3
Belgium           3.5
Ireland           3.5
United Kingdom    3.5
France            3.6
Luxembourg        3.6
Greece            3.7
Netherlands       4.0
New Zealand       4.4
Poland            4.5
Canada            4.8
Hungary           5.0
United States     5.0
Slovak Rep.       5.1
Chile             7.0
Turkey           10.2
Mexico           13.0

Stem  Leaf
 1    377
 2    0034455568999
 3    13555667
 4    0458
 5    001
 6
 7    0
 8
 9
10    2
11
12
13    0
14
1|3 represents 1.3

The mean for the 34 countries is 3.77 (128.20/34 = 3.77) while the median is 3.20. Since n = 34 is even, the median is the average of the two middle values, the 17th (3.1) and the 18th (3.3): (3.1 + 3.3)/2 = 3.2. The two measures of center are close, but the mean is pulled upward somewhat by a few extreme values in the data. The most extreme values are for Mexico (13.0), Turkey (10.2), and Chile (7.0). There is no single modal value. Three values occur 3 times: 2.5, 2.9 and 3.5.
This distribution is slightly skewed to the right with a few extreme outliers. However, compared with the data for all 44 countries, this skew is light.
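The contrast between the two groups can be checked numerically. The sketch below, in Python, assumes the infant mortality values are entered from the two country tables; the split into `OECD_IM` and `NON_OECD_IM` is an illustrative convenience, obtained by listing the ten countries that appear only in the 44-country table. The gap between mean and median is used here as a rough indicator of how far the extreme values pull the mean.

```python
from statistics import mean, median, multimode

# Infant mortality (deaths per 1,000 live births) for the 34 OECD countries.
OECD_IM = [1.3, 1.7, 1.7, 2.0, 2.0, 2.3, 2.4, 2.4, 2.5, 2.5, 2.5, 2.6,
           2.8, 2.9, 2.9, 2.9, 3.1, 3.3, 3.5, 3.5, 3.5, 3.6, 3.6, 3.7,
           4.0, 4.4, 4.5, 4.8, 5.0, 5.0, 5.1, 7.0, 10.2, 13.0]

# The ten countries in the 44-country table that are not in the OECD table:
# Lithuania, Latvia, Russian Fed., Costa Rica, China, Brazil, Colombia,
# Indonesia, South Africa, India.
NON_OECD_IM = [3.7, 4.4, 8.2, 8.4, 10.9, 12.3, 17.5, 24.5, 32.8, 41.4]

ALL_IM = OECD_IM + NON_OECD_IM


def summarize(values):
    """Return (n, mean, median, modes) for a list of rates."""
    return len(values), mean(values), median(values), multimode(values)


if __name__ == "__main__":
    for label, values in (("All 44", ALL_IM), ("OECD 34", OECD_IM)):
        n, avg, mid, modes = summarize(values)
        print(f"{label}: n={n}, mean={avg:.2f}, median={mid:.2f}, "
              f"mean - median={avg - mid:.2f}, modes={modes}")
```

The mean exceeds the median by about 3.0 for all 44 countries but by under 0.6 for the OECD group, which is consistent with the much stronger right skew described for the full set.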