N318b Winter 2002
Nursing Statistics
Lecture 4
Normal distribution, Z-scores,
Central Limit Theorem,
Probability
Today’s Class
- Normal distribution
- Z-scores
- Central limit theorem
- << 10 min break >>
- Probability
- Applying knowledge to assigned readings (Wolfe et al., 1996)
No work group today!
School of
Nursing
Institute for Work & Health
Nur 318b 2002 Lecture 4: page 2
A Quick Review
from Last Week
Data presentation
Bar graphs, pie charts
Histograms, polygons (lines)
Box plots
Measures of asymmetry
Skew
Kurtosis
Normal Distribution
“what is all the fuss about?!”
- Statistics is a branch of applied math
- Most statistical tests are based on a set of basic assumptions about the data
- Most assumptions refer to the distribution of the data
- If the assumptions are not true, the tests are not valid!
Review: How do you check normality of data?
The (Standard) Normal Curve
- a hypothetical distribution that forms the basis of statistical theory (also called the Gaussian curve)
(See Figure 3.1 in textbook, page 64)
Why use normal curve?
- Many variables are normally distributed
- Many tests require a normal distribution
- Allows for tests of inference, since study results can be compared against it (i.e. it is a probability or “chance” distribution)
“Understanding the normal curve
prepares you for understanding
the concept of hypothesis testing”
(Textbook page 64)
Where did the normal curve come from?
- There is an elegant mathematical formula (theory) underlying the distribution (you don’t need to know it!)
- Discovered in the 1700s by de Moivre, developed later by Gauss (1800s), and then used by Galton (medicine)
- Another example of mathematical theory helping to explain observed phenomena
What is the normal curve used for?
- Test if your observed value (e.g. BP) is different from the expected value (i.e. you can use standardized or Z-scores to check this)
- Estimate the precision of an observed study mean (i.e. confidence intervals)
- Tests based on the probability (likelihood) that observed results “fit” the normal curve
What are the properties of the normal curve?
- X-axis is measured in SDs (from the mean)
- Y-axis is frequency (units or counts)
- Mean, median and mode are all the same
- Symmetrical (“bell-shaped”) around the mean
- +/- 1 SD includes about 68% of the population
- +/- 2 SDs includes about 95% of the population
- “Tails” hold a very small % of the population
(REMEMBER: total area under curve = 100% or 1.0)
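The percentages quoted for the curve are rounded. As a rough check, the short Python sketch below computes the exact area within +/- k SD of the mean using only the standard library. The helper name area_within is an illustrative choice, not something from the textbook.

```python
import math

def area_within(k: float) -> float:
    """Proportion of a normal population lying within +/- k SD of the mean."""
    # For a standard normal variable Z, P(-k < Z < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"+/- {k} SD: {area_within(k) * 100:.2f}%")
# +/- 1 SD: 68.27%
# +/- 2 SD: 95.45%
# +/- 3 SD: 99.73%
```

The small remainder outside +/- 2 SD (about 4.5%) is what sits in the two tails.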
Standard normal curve
[Figure: standard normal curve centred on the mean, with markers at -2 SD, -1 SD, +1 SD and +2 SD. +/- 1 SD either side of the mean includes about 68% of the sample; +/- 2 SD includes about 95% of the sample.]
Z-scores
If a variable is normally distributed, then observed values (including means) can be converted to a z-score
WHY?
Test if your study mean (e.g. BP) is different from the expected value
A Z-score is just the “distance” from the population mean, measured in SDs
Z-scores – an example
HOW?
$Z = \frac{X - \bar{x}}{SD}$
$\bar{x}$ = sample mean
SD = sample SD
A population has a mean systolic BP of 110 mmHg and an SD of 15 mmHg
What proportion (%) of people have a BP between 95 and 120?
Z-scores – an example
$Z = \frac{X - \bar{x}}{SD}$
$\bar{x}$ = sample mean
SD = sample SD
$Z_1 = \frac{X - \bar{x}}{SD} = \frac{95 - 110}{15} = -1.0$
$Z_2 = \frac{X - \bar{x}}{SD} = \frac{120 - 110}{15} = 0.67$
Z-scores – an example
Now need to extract % values from the Z-scores using a table (e.g. Appendix A, pg. 417-8 of textbook)
Negative Z values give % areas to the left of the mean; positive Z values give % areas to the right of the mean
From Table in Appendix A
Z1 = -1.0 = 34.13% (between 95 and 110)
Z2 = 0.67 = 24.86% (between 110 and 120)
Total area = 34.13 + 24.86 = 58.99%
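The table lookup can be cross-checked in software. The sketch below is one possible way to do it, using Python's standard-library NormalDist with the population values from the example (mean 110, SD 15). The variable names are illustrative only.

```python
from statistics import NormalDist

# Population values from the example: mean 110 mmHg, SD 15 mmHg
bp = NormalDist(mu=110, sigma=15)

def z_score(x: float) -> float:
    """Distance of x from the mean, measured in SDs."""
    return (x - bp.mean) / bp.stdev

z1 = z_score(95)
z2 = z_score(120)

# Area under the curve between 95 and 120
area = bp.cdf(120) - bp.cdf(95)

print(f"Z1 = {z1:.2f}, Z2 = {z2:.2f}")
print(f"P(95 < BP < 120) = {area * 100:.1f}%")
```

The result should come out at roughly 58.9%, a shade below the table-based 58.99%, because the table lookup rounds Z2 to 0.67 before reading off the area.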
Z-scores – example 2
What proportion (%) of people have
a systolic BP above 140?
$Z = \frac{X - \bar{x}}{SD} = \frac{140 - 110}{15} = 2.0$
From Table in Appendix A
Z = 2.0 = 47.72% between 110 and 140
But this represents what?
Proportion > 140 = 50 – 47.72 = 2.28%
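For the "above 140" question, the step of subtracting the table value from 50% can be mirrored directly in code. This is a minimal sketch assuming the same population (mean 110, SD 15). The names mean_to_140 and above_140 are hypothetical labels for the two areas.

```python
from statistics import NormalDist

bp = NormalDist(mu=110, sigma=15)  # population values from the example

z = (140 - bp.mean) / bp.stdev

# Area between the mean and 140 (what the Z table reports)
mean_to_140 = bp.cdf(140) - 0.5

# Area in the upper tail: half the curve minus the area above
above_140 = 0.5 - mean_to_140

print(f"Z = {z:.1f}")
print(f"110 to 140: {mean_to_140 * 100:.2f}%")  # 47.72%
print(f"above 140:  {above_140 * 100:.2f}%")    # 2.28%
```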
Central Limit Theorem What is it?
When large enough samples (e.g. n >= 25) are drawn from a population with a known variance, the sample mean will be approximately normally distributed
i.e. if you plot the sample means ($\bar{x}$'s) you get a bell curve
Theorem holds even if the underlying distribution is moderately non-normal (e.g. a bit skewed)
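A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not part of the lecture: it draws samples of n = 25 from a right-skewed (exponential) population with mean 1 and SD 1, and then looks at how the sample means behave. The choice of population, the seed and the 5000 repetitions are all arbitrary.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def sample_mean(n: int) -> float:
    """Mean of n draws from a right-skewed (exponential) population."""
    return statistics.fmean(rng.expovariate(1.0) for _ in range(n))

n = 25
means = [sample_mean(n) for _ in range(5000)]

centre = statistics.fmean(means)
spread = statistics.stdev(means)

# Share of sample means within +/- 1 SD of their own centre
within_1sd = sum(abs(m - centre) <= spread for m in means) / len(means)

print(f"mean of sample means: {centre:.3f}")
print(f"SD of sample means:   {spread:.3f}")
print(f"within +/- 1 SD:      {within_1sd * 100:.1f}%")
```

Even though the individual values are skewed, the sample means should cluster around 1 with a spread near 1 / sqrt(25) = 0.2, and roughly 68% of them should fall within one SD of the centre, as expected for a bell curve.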
Central Limit Theorem –
What is its importance?
- Now have the ability to statistically test the likelihood of an observed (sample) mean
- Variation (“dispersion”) of sample means about the true mean is called the “standard error” (SE) of the mean
- SE (of mean) and SD (of sample) are directly related mathematically
- $SE = SD / \sqrt{n}$ (where n = sample size)
Z-scores – for means
How likely is it (i.e. what %) that a sample of size n = 100 will have a mean systolic BP > 113 (assuming $\mu$ = 110 and $\sigma$ = 15)?
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{113 - 110}{15 / 10} = 2.0$
Want Z-scores >= about 2!
From Z-score Table in Appendix A
Z = 2.0 = 47.72% of area to right of $\mu$
But once again this represents what?
Sample means between 110 and 113 mmHg
$\bar{x}$ > 113 = 50 – 47.72 = 2.28%
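The Z-score for a sample mean differs from the earlier examples only in its denominator, which is the standard error rather than the SD. The sketch below shows one way to write that calculation, assuming the slide's values (population mean 110, SD 15, n = 100). The function name prob_mean_above is a hypothetical helper.

```python
import math
from statistics import NormalDist

def prob_mean_above(threshold: float, mu: float, sigma: float, n: int):
    """Return (Z, probability) that a sample mean of size n exceeds threshold."""
    se = sigma / math.sqrt(n)           # standard error of the mean
    z = (threshold - mu) / se           # Z-score for the sample mean
    return z, 1 - NormalDist().cdf(z)   # upper-tail area of the standard normal

z, p = prob_mean_above(113, mu=110, sigma=15, n=100)
print(f"Z = {z:.1f}, P(mean > 113) = {p * 100:.2f}%")  # Z = 2.0, 2.28%
```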
Effect of sample size on mean?
What happens if sample size drops to 10 (i.e. n = 10, $\bar{x}$ > 113 and $\mu$ = 110, $\sigma$ = 15)?
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{113 - 110}{15 / 3.16} = 0.63$
From Table in Appendix A
Z = 0.63 = 23.57%
But once again this represents what?
- sample means that fall between 110 and 113 mmHg
For $\bar{x}$ > 113 = 50 – 23.57 = 26.43%
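To see the effect of sample size more generally, the same tail probability can be tabulated for several values of n. In the sketch below, n = 10 and n = 100 come from the slides, while n = 25 and n = 400 are extra values added purely for illustration.

```python
import math
from statistics import NormalDist

MU, SIGMA, THRESHOLD = 110, 15, 113  # values from the slides

def prob_mean_above(n: int) -> float:
    """Probability that the mean of a sample of size n exceeds THRESHOLD."""
    se = SIGMA / math.sqrt(n)        # SE shrinks as n grows
    z = (THRESHOLD - MU) / se
    return 1 - NormalDist().cdf(z)

# n = 25 and n = 400 are assumed extra sizes, not from the lecture
results = {n: prob_mean_above(n) for n in (10, 25, 100, 400)}

for n, p in results.items():
    se = SIGMA / math.sqrt(n)
    print(f"n = {n:>3}: SE = {se:.2f}, P(mean > 113) = {p * 100:.2f}%")
```

With n = 10 the probability is about 26%, falling to about 2.3% at n = 100: the larger the sample, the less likely its mean is to stray 3 mmHg above the population mean by chance.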
10 minute break !
Probability
Think of it as a statistical measure of chance
- A proportion (e.g. %) that lets you make intelligent guesses about future events
- Often expressed as a “p-value”
- p-value “rules” in (quantitative) research
$P(\text{event}) = \frac{\text{number of events}}{\text{number of subjects}}$
(Often expressed as a % when multiplied by 100)
Probability – cont’d
- You read a well-done clinical trial that followed 1000 women with breast cancer (BC), 200 of whom had died from BC at 5 years
- You then see a woman with BC on the ward and she asks you if she is going to live – what do you tell her?
She has a 20% probability, or a 1 in 5 chance, of dying from BC within 5 years
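The 20% figure is simply the number of events divided by the number of subjects. As a minimal sketch of that arithmetic (the variable names are illustrative):

```python
from fractions import Fraction

deaths = 200   # women who died from BC within 5 years
women = 1000   # women followed in the trial

p = Fraction(deaths, women)  # P(event) = events / subjects

print(p)                      # 1/5
print(f"{float(p) * 100:.0f}%")  # 20%
```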
Probability – cont’d

What if she then tells you she is node
negative and the tumour was small?

Then she tells you her mother and
sister both died from BC by age 45
Probability is a way of quantifying risk
or likelihood of events occurring
(usually according to a set of criteria)
Probability – Facts

- Probabilities are always between 0 and 1
(0 = min value = no chance)
(1 = max value = definite event)
- P-value = “probability due to chance”; the cut-off is arbitrarily “set” at p <= 0.05 in most cases, but it can vary from 0.2 to < 0.01
- P-value refers to the “tails” of the normal curve distribution (lower = better!)


Probability – Rules

Conditional Probabilities
probability of event A given event B

Multiplication Rule (Independence!)
probability of A and B = P(A) x P(B)

Addition Rule (Mutually exclusive!)
probability of A or B = P(A) + P(B)
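These rules can be checked by brute-force counting on a small sample space. The sketch below uses two fair six-sided dice, which is a hypothetical illustration and not an example from the lecture. It also relies on the standard definition P(A given B) = P(A and B) / P(B), and on the fact that the addition rule concerns "A or B" for events that cannot occur together.

```python
from fractions import Fraction
from itertools import product

# Hypothetical illustration (not from the lecture): two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event) -> Fraction:
    """P(event) = outcomes where the event occurs / all outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_is_six(o):
    return o[0] == 6

def second_is_six(o):
    return o[1] == 6

def total_at_least_10(o):
    return sum(o) >= 10

# Multiplication rule: the two dice are independent.
p_both_six = prob(lambda o: first_is_six(o) and second_is_six(o))
print(p_both_six == prob(first_is_six) * prob(second_is_six))  # True

# Addition rule: a total of 2 and a total of 12 are mutually exclusive.
p_2_or_12 = prob(lambda o: sum(o) in (2, 12))
print(p_2_or_12 == prob(lambda o: sum(o) == 2) + prob(lambda o: sum(o) == 12))  # True

# Conditional probability: P(A given B) = P(A and B) / P(B).
p_a_and_b = prob(lambda o: total_at_least_10(o) and first_is_six(o))
p_a_given_b = p_a_and_b / prob(first_is_six)
print(p_a_given_b)  # 1/2
```

Using exact fractions rather than decimals keeps the equality checks free of rounding error.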
Part 2:
Application to the
Assigned Reading
Wolfe et al. (1996)
Quick summary of the paper:
- an etiologic study aimed at exploring possible causal pathways between back pain and osteoarthritis (OA) of the knee
- a 3-year consecutive series of 368 knee OA patients via a rheumatology clinic
- cross-sectional questionnaire assessment of key study variables (possible bias?)
Wolfe et al. (1996)
Typical example of a sophisticated
multistage exploratory analysis
Descriptive analysis
Exploratory univariate analysis
Causal pathway multivariate analysis
Some questions …
What does Figure 1 tell us?
Why did they group BMI in quartiles?
Some questions …
Do you understand the major
features of the data in Table 1?
What do all the columns mean?
e.g. “unadjusted” vs. “adjusted”
Odds ratios and confidence intervals will be studied later (CIs in the next lecture!)
Next Week - Lecture 5:
Inference testing, Type I and
Type II errors, p-values, and
Confidence Intervals
For next week’s class please review:
1. Page 14 in syllabus
2. Textbook Chapter 3, pages 80-91
3. Syllabus papers:
i) Birenbaum et al. (1996)
ii) Gulick (1995)
Research Practicum
- Can those who signed up please stay for a few extra minutes to decide placements?
- Do those who signed up last term and did NOT get placed want to be put back in the “pool” to be placed?