Transcript
```
Introduction to statistics and data
Looking at numbers…

Group exercise: What’s the math
problem in each of the four examples
I’ve given you?
EXAMPLE 1.
Experimental treatment
Standard treatment
Table 2. Outcome volume for the experimental and standard groups; mean (SD).
Location            Week 0                       Week 12                      Change (Week 0 – Week 12)
                    experimental  standard       experimental  standard       experimental  standard
Affected side       3135 (748)*   3333 (1368)*   2982 (715)*   3331 (1383)*   –154 (168)    –2 (306)
Contralateral side  2595 (672)    2654 (761)     2553 (606)    2631 (736)     –42 (193)     –23 (219)
* p< .05 greater than the contralateral side
EXAMPLE 2.
Objective: The study objective is to determine the efficacy of a new
treatment cream as a therapeutic option for eczema.
Methods: Prospective study under institutional review board approval
of ten patients with eczema, who were all treated with the experimental
cream. Three blinded independent investigators evaluated overall
improvement, as well as changes in scaliness and redness, graded on a
quartile (0-3) scale: 0=none, 1=mild (1-33%), 2=moderate (34-66%),
3=excellent (67-100%).
Results: All patients showed overall improvement as measured by
blinded investigators. Of patients showing overall improvement, 78%
were graded as having either excellent or moderate improvement.
Ninety-six percent of subjects demonstrated improvements in scaliness
and redness.
Limitations: Small sample size
EXAMPLE 3.
Table 1 -- Baseline characteristics by height and follow-up for incident cancer in the Million Women Study
Height in cm*
<155
152·8 (4·1)
All women
165
164·9 (2·9)
170
169·0 (2·9)
≥175
173·8 (4·3)
160·9 (6·4) ‡
388 515
56·2 (4·9)
42 862 (22%)
72 763 (20%)
43 324 (22%)
65 622 (18%)
69 607 (37%)
139 607 (37%)
40 296 (10%)
82 436 (35%)
288 893
56·0 (4·8)
73 119 (19%)
51 678 (19%)
92 126 (24%)
42 004 (15%)
147 103 (39%)
108 550 (38%)
33 267 (12%)
67 118 (34%)
143 289
56·0 (4·8)
48 190 (17%)
26 147 (19%)
73 597 (26%)
18 370 (13%)
116 614 (42%)
57 852 (41%)
17 985 (13%)
127 826 (33%)
46 138
55·8 (4·8)
23 262 (16%)
8 369 (19%)
36 742 (26%)
5 320 (12%)
58 339 (42%)
20 176 (45%)
6 900 (15%)
91 287 (32%)
1 297 124
56·1 (4·9)
7 664 (17%)
20·5
11 734 (26%)
18·0
18 699 (42%)
37·4
10·8
44 074 (31%)
Age at first birth, n (%) ≥25 years 67 250 (33%)
61 042 (35%)
129 031 (38%) 103 017 (41%) 52 677 (43%)
17 492 (46%)
Postmenopausal, n (%)
162 551 (81%) 136 544 (81%) 269 384 (81%) 197 618 (80%) 97 855 (80%)
30 900 (79%)
Ever use of oral contraceptives, n (%)
133 979 (58%) 114 105 (59%) 228 669 (60%) 173 520 (61%) 85 522 (60%)
Current use of HRT, n (%)
75 151 (33%)
63 865 (33%)
128 891 (34%) 98 086 (34%)
48 516 (34%)
15 637 (34%)
Follow-up for cancer incidence
Woman-years, millions
2·1
1·8
3·5
2·6
1·3
0·4
Number of incident cancers
15 792
14 213
28 806
22 571
11 902
4 092
*
The categories of height are those reported at recruitment, and mean values are those measured in a randomly selected sample.
‡
Standardised to the distribution of categories of self-reported height in our whole analysis population.
38·2
80·5
27 571 (60%)
33·6
155
156·5 (2·3)
160
160·4 (2·9)
Mean measured height (SD)
Characteristics at recruitment
Number of women
233 516
196 773
Mean age, years (SD)
56·3 (4·9)
56·2 (4·9)
Socioeconomic status, n (%) in lowest quintile
59 220 (26%)
Current smokers, n (%)
50 775 (23%)
40 500 (22%)
Alcohol intake, n (%) ≥7 units per week
47 138 (20%)
Body-mass index, n (%) BMI ≥30 54 550 (25%)
38 493 (20%)
Strenuous exercise, n (%) once a week or more
76 917 (35%)
Age at menarche, n (%) ≥14 years 79 858 (35%)
69 718 (36%)
Parity, n (%) nulliparous
22 827 (10%)
19 149 (10%)
Number of full-term pregnancies, n (%) with three or more
11·7
97 376
EXAMPLE 4.
Original data:
Data re-use:
Clinical Data Example

1. Kline et al. (2002)


The researchers analyzed data from 934 emergency
room patients with suspected pulmonary embolism (PE).
The researchers wanted to know what clinical factors
predicted PE.
I will use four variables from their dataset today:
Pulmonary embolism (yes/no)
Age (years)
Shock index = heart rate/systolic BP
Shock index categories = shock index divided into
10 groups (lowest to highest shock index)
Descriptive Statistics
Types of Variables: Overview
Categorical: binary, nominal, ordinal
Quantitative: discrete, continuous
binary: 2 categories
nominal: + more categories
ordinal: + order matters
discrete: + numerical
continuous: + uninterrupted
Categorical Variables

Also known as “qualitative.”

Categories.



treatment groups
exposure groups
disease status
Categorical Variables

Dichotomous (binary) – two levels







Treatment/placebo
Disease/no disease
Exposed/Unexposed
Pulmonary Embolism (yes/no)
Male/female
Categorical Variables

Nominal variables – Named categories
Order doesn’t matter!



The blood type of a patient (O, A, B, AB)
Marital status
Occupation
Categorical Variables

Ordinal variable – Ordered categories. Order
matters!







Staging in breast cancer as I, II, III, or IV
Birth order—1st, 2nd, 3rd, etc.
Letter grades (A, B, C, D, F)
Ratings on a scale from 1-5
Ratings on: always; usually; many times; once in a
while; almost never; never
Age in categories (10-20, 20-30, etc.)
Shock index categories (Kline et al.)
Quantitative Variables

Numerical variables; may be
arithmetically manipulated.




Counts
Time
Age
Height
Quantitative Variables

Discrete Numbers – a limited set of distinct
values, such as whole numbers.





Number of new AIDS cases in CA in a year (counts)
Years of school completed
The number of children in the family (cannot have a half
a child!)
The number of deaths in a defined time period (cannot
have a partial death!)
Roll of a die
Quantitative Variables

Continuous Variables - Can take on any
number within a defined range.







Time-to-event (survival time)
Age
Blood pressure
Serum insulin
Speed of a car
Income
Shock index (Kline et al.)
Review Question 1
Which of the following variables would be
considered a continuous variable?
a.
b.
c.
d.
e.
Favorite fruit
Gender
Age at first birth
Parity
Review Question 2
Which of the following variables would be
considered a nominal (categorical) variable?
a.
b.
c.
d.
e.
Favorite fruit
Gender
Age at first birth
Parity
Looking at Data
How are the data distributed?
Where is the center?
What is the range?
What’s the shape of the distribution (e.g.,
Gaussian, binomial, exponential, skewed)?
Are there “outliers”?
Are there data points that don’t make
sense?
The first rule of statistics:
USE COMMON SENSE!
90% of the information is
contained in the graph.
Frequency Plots (univariate)
Categorical variables: Bar Chart
Continuous variables: Box Plot, Histogram
Bar Chart


Used for categorical variables to show
frequency or proportion in each
category.
Translate the data from frequency
tables into a pictorial representation…
Bar Chart: categorical variables
[Figure: bar chart with categories “no” and “yes”]
Bar Chart for SI categories
Note how much easier it is to extract information
from a bar chart than from a table!
[Figure: bar chart; x-axis: Shock Index Category (1 to 10); y-axis: Number of Patients (0.0 to 200.0)]
Box plot and histograms

To show the distribution (shape, center,
range, variation) of continuous
variables.
Shape of a Distribution

Describes how data are distributed

Measures of shape

Symmetric or skewed
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
Box Plot: Shock Index
[Figure: box plot of SI; y-axis: Shock Index Units (0.0 to 2.0)]
maximum (1.7)
Outliers (points beyond the “whisker”)
“whisker”: Q3 + 1.5 IQR = .8 + 1.5(.25) = 1.175
75th percentile (0.8)
median (.66)
25th percentile (0.55)
interquartile range (IQR) = .8 - .55 = .25
minimum (or Q1 - 1.5 IQR)
Histogram of SI
Bins of size 0.1 (automatically generated)
Note the “right skew”
[Figure: histogram; x-axis: SI (0.0 to 2.0); y-axis: Percent (0.0 to 25.0)]
Histogram
100 bins (too much detail)
[Figure: histogram; x-axis: SI (0.0 to 2.0); y-axis: Percent (0.0 to 6.0)]
Histogram
2 bins (too little detail)
[Figure: histogram; x-axis: SI (0.0 to 2.0); y-axis labeled Percent (0.0 to 200.0)]
Box Plot: Shock Index
Also shows the “right skew”
[Figure: box plot of SI; y-axis: Shock Index Units (0.0 to 2.0)]
Distribution Shape and Box-and-Whisker Plot
[Figure: three box plots marking Q1, Q2, Q3, one each for Left-Skewed, Symmetric, Right-Skewed]
Box Plot: Age
More symmetric
[Figure: box plot of AGE; y-axis: Years (0.0 to 100.0); marked: minimum, 25th percentile, median, 75th percentile, maximum, interquartile range]
Histogram: Age
Not skewed, but not bell-shaped either…
[Figure: histogram; x-axis: AGE (Years) (0.0 to 100.0); y-axis: Percent (0.0 to 14.0)]
(n=25)
Starting with politics…
Health Care Law
writing…
Optimism…
Diet…
Habits…
Homework and optimism?
(bivariate)
Review Question 3
Which of the following graphics should
be used for categorical variables?
a. Histogram
b. Box plot
c. Bar Chart
d. Stem-and-leaf plot
Review Question 4
What is the first thing you should do
when you get new data?
a.
b.
c.
d.
Run a ttest
Calculate a p-value
Run multivariate regression
Review Question 5
Approximately what percent of subjects had
pulses between 80 and 90?
a. 200%
b. 100%
c. 90%
d. 50%
e. 10%
[Figure: histogram; x-axis: PULSE_OX (60.0 to 120.0); y-axis: Percent (0.0 to 40.0)]
Review Question 6
What is the maximum pulse that any subject had?
a. =100
b. <=100
c. >100
d. >=100
[Figure: histogram; x-axis: PULSE_OX (60.0 to 120.0); y-axis: Percent (0.0 to 40.0)]
Review Question 7
This distribution of the variable (pulse) would
be described as?
a. Symmetric
b. Right-skewed
c. Left-skewed
[Figure: histogram; x-axis: PULSE_OX (60.0 to 120.0); y-axis: Percent (0.0 to 40.0)]
Measures of central
tendency



Mean
Median
Mode
Central Tendency

Mean – the average; the balancing
point
calculation: the sum of values divided by the
sample size
In math shorthand:
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
Mean: example
Some data:
Age of participants: 17 19 21 22 23 23 23 38
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{17 + 19 + 21 + 22 + 23 + 23 + 23 + 38}{8} = 23.25$
Mean of age in Kline’s data
Descriptive Statistics Report
Page/Date/Time: 1 3/30/2006 10:25:14 AM
Database: C:\Program Files\NCSS97\Data\Dawson\kline.S0
Means Section of AGE
Parameter  Mean      Median  Geometric Mean  Harmonic Mean  Sum    Mode
Value      50.19334  49      46.66865        43.00606       46730  49
[Figure: histogram of AGE; x-axis 0.0 to 100.0; y-axis: Percent (0.0 to 14.0)]
Mean of age in Kline’s data
The balancing point
[Figure: histogram of AGE; x-axis 0.0 to 100.0; y-axis: Percent (0.0 to 14.0)]
Mean of Pulmonary Embolism?
(Binary variable?)
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{181 \times 1 + 750 \times 0}{931} = \frac{181}{931} = .1944$
[Figure: histogram; x-axis: PE (0.0 to 1.0); y-axis: Percent (0.0 to 100.0); bars: 80.56% (750) at 0 and 19.44% (181) at 1]
Mean

The mean is affected by extreme values
(outliers)
[Figure: two dot plots on a 0 to 10 axis]
Data 1 2 3 4 5: Mean = $\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$
Data 1 2 3 4 10: Mean = $\frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$
Central Tendency

Median – the exact middle value
Calculation:


If there are an odd number of observations,
find the middle value
If there are an even number of observations,
find the middle two values and average them.
Median: example
Some data:
Age of participants: 17 19 21 22 23 23 23 38
Median = (22+23)/2 = 22.5
Median of age in Kline’s data
Means Section of AGE
Parameter  Mean      Median  Geometric Mean  Harmonic Mean  Sum    Mode
Value      50.19334  49      46.66865        43.00606       46730  49
[Figure: histogram; x-axis: AGE (Years) (0.0 to 100.0); y-axis: Percent (0.0 to 14.0)]
Median of age in Kline’s data
50% of mass on each side of the median
[Figure: histogram of AGE; x-axis 0.0 to 100.0; y-axis: Percent (0.0 to 14.0)]
Does PE have a median?

Yes, if you line up the 0’s and 1’s, the
middle number is 0.
Median

The median is not affected by extreme
values (outliers).
[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]
Central Tendency

Mode – the value that occurs most
frequently
Mode: example
Some data:
Age of participants: 17 19 21 22 23 23 23 38
Mode = 23 (occurs 3 times)
Mode of age in Kline’s data
Means Section of AGE
Parameter  Mean      Median  Geometric Mean  Harmonic Mean  Sum    Mode
Value      50.19334  49      46.66865        43.00606       46730  49
Mode of PE?

0 appears more than 1, so 0 is the
mode.
Mode




Not affected by extreme values
Used for either numerical or categorical data
There may be no mode
There may be several modes
[Figure: dot plot on a 0 to 14 axis; Mode = 9]
[Figure: dot plot on a 0 to 6 axis; No Mode]
Which measure of central
tendency is “best”?


Mean is generally used, unless extreme
values (outliers) exist
Then median is often used, since the
median is not sensitive to extreme values.

Example: Median home prices may be
reported for a region – less sensitive to
outliers
Measures of
Variation/Dispersion




Range
Percentiles/quartiles
Interquartile range
Standard deviation/Variance
Range

Difference between the largest and the
smallest observations.
Range of age: 94 years - 15 years = 79 years
[Figure: histogram; x-axis: AGE (Years) (0.0 to 100.0); y-axis: Percent (0.0 to 14.0)]
Range of PE?

1-0 = 1
Quartiles
25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
The first quartile, Q1, is the value for which
25% of the observations are smaller and 75%
are larger
Q2 is the same as the median (50% are
smaller, 50% are larger)
Only 25% of the observations are greater than
the third quartile
Interquartile Range

Interquartile range = 3rd quartile – 1st
quartile = Q3 – Q1
Interquartile Range: age
minimum = 15 | 25% | Q1 = 35 | 25% | Median (Q2) = 49 | 25% | Q3 = 65 | 25% | maximum = 94
Interquartile range = 65 – 35 = 30
Sample Variance

Average (roughly) of squared deviations
of values from the mean
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}$
Why squared deviations?

Adding deviations will yield a sum of 0.
Absolute values are tricky!
Squares eliminate the negatives.

Result:



Increasing contribution to the variance as
you go farther from the mean.
Standard Deviation



Most commonly used measure of variation
Has the same units as the original data
$S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}$
Calculation Example:
Sample Standard Deviation
Age data (n=8) : 17 19 21 22 23 23 23 38
n=8
Mean = $\bar{X}$ = 23.25
$S = \sqrt{\frac{(17 - 23.25)^2 + (19 - 23.25)^2 + \cdots + (38 - 23.25)^2}{8 - 1}} = \sqrt{\frac{281.5}{7}} \approx 6.3$
Std. dev is a measure of the “average” scatter
around the mean.
Estimation method: if the distribution is bell
shaped, the range is around 6 SD, so here
rough guess for SD is 79/6 = 13
[Figure: histogram; x-axis: AGE (Years) (0.0 to 100.0); y-axis: Percent (0.0 to 14.0)]
Std. Deviation age
Variation Section of AGE
Parameter  Variance  Standard Deviation
Value      333.1884  18.25345
Std Dev of Shock Index
Std. dev is a measure of the “average” scatter
around the mean.
Estimation method: if the distribution is bell
shaped, the range is around 6 SD, so here
rough guess for SD is 1.4/6 = .23
[Figure: histogram; x-axis: SI (0.0 to 2.0); y-axis: Count (0.0 to 250.0)]
Std. Deviation SI
Variation Section of SI
Parameter  Variance      Standard Deviation  Std Error of Mean  Interquartile Range  Range
Value      4.155749E-02  0.2038566           6.681129E-03       0.2460432            1.430856
Std. Dev of binary variable, PE
$S = \sqrt{\frac{181 \times (1 - .1944)^2 + 750 \times (0 - .1944)^2}{931 - 1}} = \sqrt{\frac{145.8}{930}} = .3959$
Std. dev is a measure of the “average” scatter around the mean.
[Figure: bar chart of PE; bars: 80.56% and 19.44%]
Std. Deviation PE
Variation Section of PE
Parameter  Variance  Standard Deviation
Value      0.156786  0.3959621
Comparing Standard Deviations
[Figure: three dot plots, each on an axis from 11 to 21]
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.570
Slide from: Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall
Bienaymé-Chebyshev Rule

Regardless of how the data are distributed,
a certain percentage of values must fall
within k standard deviations from the mean:
Note use of μ (mu) to represent “mean”.
Note use of σ (sigma) to represent “standard deviation.”
At least (1 - 1/1^2) = 0% within k=1 (μ ± 1σ)
At least (1 - 1/2^2) = 75% within k=2 (μ ± 2σ)
At least (1 - 1/3^2) = 89% within k=3 (μ ± 3σ)
Symbol Clarification




S = Sample standard deviation
(example of a “sample statistic”)
σ = Standard deviation of the entire
population (example of a “population
parameter”) or from a theoretical
probability distribution
$\bar{X}$ = Sample mean
µ = Population or theoretical mean
**The beauty of the normal curve:
No matter what μ and σ are, the area between μ-σ and
μ+σ is about 68%; the area between μ-2σ and μ+2σ is
about 95%; and the area between μ-3σ and μ+3σ is
about 99.7%. Almost all values fall within 3 standard
deviations.
68-95-99.7 Rule
68% of the data (within μ ± 1σ)
95% of the data (within μ ± 2σ)
99.7% of the data (within μ ± 3σ)
Summary of Symbols







S^2 = Sample variance
S = Sample standard dev
σ^2 = Population (true or theoretical) variance
σ = Population standard dev.
$\bar{X}$ = Sample mean
µ = Population mean
IQR = interquartile range (middle 50%)
Review Question 8
All of the following are measures of data
variation EXCEPT:
a. Variance
b. Interquartile range
c. Standard deviation
d. Range
e. Mean
Review Question 9
All of the following are influenced by outliers
EXCEPT:
a. Variance
b. Interquartile range
c. Standard deviation
d. Range
e. Mean
Review Question 10
If you have right-skewed data, which of the
following will be true?
a. Mean > median
b. Mean >= median
c. Median >= mean
d. Median > mean
e. Mean = median
Review Question 11
How much of your data is guaranteed to fall
within 2 standard deviations of the mean?
a. None: there are no guarantees.
b. 95%
c. 99%
d. 75%
e. 89%
What’s wrong with
this graph?
from: ER Tufte. The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut,
1983, p.69
Notice the X-axis
From: Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot
Wainer, H. 1997, p.29.
Correctly scaled X-axis…
Report of the Presidential Commission on the Space Shuttle
Challenger Accident, 1986 (vol 1, p. 145)
The graph excludes the observations where no O-rings failed.
Smooth curve at least shows the trend toward failure at high and
low temperatures…

http://www.math.yorku.ca/SCS/Gallery/
Even better: graph all the data (including non-failures)
using a logistic regression model
Tappin, L. (1994). "Analyzing data relating to the Challenger disaster".
Mathematics Teacher, 87, 423-426
What’s wrong with
this graph?
from: ER Tufte. The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut,
1983, p.74
What’s the message here?
Diagraphics II, 1994
Diagraphics II, 1994
For more examples…

http://www.math.yorku.ca/SCS/Gallery/
Class exercise

What’s wrong with these graphs?
Both graphs from: Johnson R. Just the Essentials of Statistics. Duxbury Press, 1995.
“Lying” with statistics

statistics…
Example 1: projected statistics
1935: 1/1500
1960: 1/600
1985: 1/150
2000: 1/74
2006: 1/60
http://www.melanoma.org/mrf_facts.pdf
Example 1: projected statistics
How do you think these statistics are
calculated?
How do we know what the lifetime risk of a
person born in 2006 will be?
Example 1: projected statistics
Interestingly, a clever clinical researcher recently
went back and calculated (using SEER data) the
actual lifetime risk (or risk up to 70 years) of
melanoma for a person born in 1935.
Closer to 1/150 (one order of magnitude off)
(Martin Weinstock of Brown University, AAD conference 2006)
Example 2: propagation of
statistics


In many papers and reviews of eating
disorders in women athletes, authors cite
the statistic that 15 to 62% of female
athletes have disordered eating.
I’ve found that this statistic is attributed to
about 50 different sources in the literature
and cited all over the place with or without
citations...
For example…



In a recent review (Hobart and Smucker, The
Physician, 2000):
“Although the exact prevalence of the female
athlete triad is unknown, studies have
reported disordered eating behavior in 15 to
62 percent of female college athletes.”
No citations given.
And…


Fact Sheet on eating disorders:
“Among female athletes, the
prevalence of eating disorders is
reported to be between 15% and
62%.”
Citation given: Costin, Carolyn. (1999)
The Eating Disorder Source Book: A
comprehensive guide to the causes,
treatment, and prevention of eating
disorders. 2nd edition. Lowell House: Los
Angeles.
And…



From a Fact Sheet on disordered eating
from a college website:
“Eating disorders are significantly higher
(15 to 62 percent) in the athletic
population than the general population.”
No citation given.
And…


“Studies report between 15% and
62% of college women engage in
problematic weight control behaviors
(Berry & Howe, 2000).” (in The Sport
Journal, 2004)
Citation: Berry, T.R. & Howe, B.L.
(2000, Sept). Risk factors for
disordered eating in female university
athletes. Journal of Sport Behavior,
23(3), 207-219.
And…


1999 NY Times article
“But informal surveys suggest that 15
percent to 62 percent of female athletes
are affected by disordered behavior that
ranges from a preoccupation with losing
weight to anorexia or bulimia.”
And

“It has been estimated that the prevalence of
disordered eating in female athletes ranges from
15% to 62%.” ( in Journal of General Internal
Medicine 15 (8), 577-590.)
Citations:
Steen SN. The competitive athlete. In: Rickert VI,
Management. New York, NY: Chapman and Hall;
1996:223-47.
Tofler IR, Stryer BK, Micheli LJ. Physical and
emotional problems of elite female gymnasts. N
Engl J Med. 1996;335:281-3.
Where did the statistics come
from?
The 15%: Dummer GM, Rosen LW, Heusner WW, Roberts PJ, and
Counsilman JE. Pathogenic weight-control behaviors of young
competitive swimmers. Physician Sportsmed 1987; 15: 75-84.
The “to”: Rosen LW, McKeag DB, Hough DO, Curley VC. Pathogenic
weight-control behaviors in female athletes. Physician Sportsmed.
1986; 14: 79-86.
The 62%: Rosen LW, Hough DO. Pathogenic weight-control behaviors
of female college gymnasts. Physician Sportsmed 1988; 16:140-146.
Where did the statistics come
from?

Study design? Control group?



Cross-sectional survey (all)
No non-athlete control groups
Population/sample size?




Convenience samples
Rosen et al. 1986: 182 varsity athletes from two
midwestern universities (basketball, field hockey, golf,
running, swimming, gymnastics, volleyball, etc.)
Dummer et al. 1987: 486 swimmers aged 9-18 at a
swim camp
Rosen et al. 1988: 42 college gymnasts from 5 teams
at an athletic conference
Where did the statistics come
from?

Measurement?


Instrument: Michigan State University Weight
Control Survey
Disordered eating = at least one pathogenic
weight control behavior:







Self-induced vomiting
fasting
Laxatives
Diet pills
Diuretics
In the 1986 survey, they required use 1/month; in the
1988 survey, they required use twice-weekly
In the 1988 survey, they added fluid restriction
Where did the statistics come
from?

Findings?



Rosen et al. 1986: 32% used at least one
“pathogenic weight-control behavior”
(ranges: 8% of 13 basketball players to
73.7% of 19 gymnasts)
Dummer et al. 1987: 15.4% of swimmers
used at least one of these behaviors
Rosen et al. 1988: 62% of gymnasts used
at least one of these behaviors
Citation Tree…
Figure 4A from: Smith N P et al. J Exp Biol 2007;210:1576-1583.
Figure 4B from: Smith N P et al. J Exp Biol 2007;210:1576-1583.
Homework





Problem Set 1
Fill out a “Journal Article Review Sheet” (on class
website).
Who wants to lead journal article discussion next
week?
References







http://www.math.yorku.ca/SCS/Gallery/
Kline et al. Annals of Emergency Medicine 2002; 39: 144-152.
Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall
Tappin, L. (1994). "Analyzing data relating to the Challenger disaster". Mathematics Teacher, 87, 423-426
Tufte. The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut, 1983.
Wainer, H. Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot. 1997.
Johnson R. Just the Essentials of Statistics. Duxbury Press, 1995.
```
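The transcript's worked examples for the eight participant ages (mean 23.25, median 22.5, mode 23, sample standard deviation of about 6.3) can be checked with Python's standard `statistics` module. This is a minimal sketch and not part of the original slides. The variable names are illustrative, and the quartile convention (`method="inclusive"`) is an assumption, since the slides do not say which convention their software uses.

```python
import statistics

# Ages of the eight participants used in the slides' worked examples.
ages = [17, 19, 21, 22, 23, 23, 23, 38]

mean_age = statistics.mean(ages)      # sum of values divided by n
median_age = statistics.median(ages)  # n is even, so the two middle values are averaged
mode_age = statistics.mode(ages)      # most frequent value

# Sample variance and standard deviation use the n - 1 denominator.
var_age = statistics.variance(ages)
sd_age = statistics.stdev(ages)

# Quartile conventions differ between packages; "inclusive" is an assumption here.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")
iqr = q3 - q1

# Box-plot rule from the slides: points beyond Q3 + 1.5 IQR (or below Q1 - 1.5 IQR)
# are drawn as outliers.
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr
outliers = [a for a in ages if a > upper_fence or a < lower_fence]

print("mean:", mean_age, "median:", median_age, "mode:", mode_age)
print("variance:", round(var_age, 2), "sd:", round(sd_age, 2))
print("Q1:", q1, "Q3:", q3, "IQR:", iqr, "outliers:", outliers)
```

With these data the single large value (38) pulls the mean above the median and is the only point flagged by the 1.5 IQR rule, which matches the slides' remark that the mean is sensitive to extreme values while the median is not.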
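The transcript treats pulmonary embolism as a 0/1 variable (181 ones and 750 zeros) and quotes both the Bienaymé-Chebyshev bound and the 68-95-99.7 rule. The sketch below is an illustration, not something from the original lecture. It recomputes the binary mean and standard deviation, then compares the two rules for k = 1, 2, 3. The helper names `chebyshev_lower_bound` and `normal_area` are hypothetical; the normal-curve areas come from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

# Counts from the slides: 181 patients with PE (coded 1), 750 without (coded 0).
n_yes, n_no = 181, 750
n = n_yes + n_no

# The mean of a 0/1 variable is the proportion of ones.
p = n_yes / n

# Sample SD with the n - 1 denominator, following the slides' formula.
sum_sq = n_yes * (1 - p) ** 2 + n_no * (0 - p) ** 2
sd = math.sqrt(sum_sq / (n - 1))


def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of values within k SDs of the mean, for any distribution."""
    return max(0.0, 1 - 1 / k**2)


def normal_area(k: float) -> float:
    """Area under a normal curve between mu - k*sigma and mu + k*sigma."""
    z = NormalDist()  # standard normal: mu = 0, sigma = 1
    return z.cdf(k) - z.cdf(-k)


print("mean of PE:", round(p, 4), "sd of PE:", round(sd, 4))
for k in (1, 2, 3):
    print(k, round(chebyshev_lower_bound(k), 2), round(normal_area(k), 3))
```

The Chebyshev figures are guarantees that hold for any distribution, so they are lower than the normal-curve areas, which apply only when the data are approximately bell-shaped.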