Easy (and not so easy) questions to ask about adolescent health data
J. Dennis Fortenberry MD MS
Indiana University School of Medicine
Four types of questions about health data
• About data users
• About data production
• About data quality
• About data inferences
Questions about users
Questions about the users
• Who are the end-users
• What are the data skills of the end-users
• What are the conditions of use
Questions about data production
• Why were the data collected
• Who collected the data
• How were the data collected
• How were the data processed
• Who is represented by the data
• Who is left out – and why
• Are there issues of privacy and confidentiality
Questions about data quality
• Missing data
• Incorrect data
• Coded data
• Out of range data
• Accuracy
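The checks for missing and out-of-range data can be sketched in a few lines. This is a minimal illustration, not a method from the slides: the `ages` values, the `VALID_RANGE` limits, and the `quality_report` helper are all hypothetical.

```python
from typing import Optional

# Hypothetical survey responses: reported ages of adolescents.
# None marks a missing value.
ages: list[Optional[int]] = [14, 15, None, 17, 140, 16, -1, None]

# Assumed plausible range for this example only (not from the slides).
VALID_RANGE = (10, 19)


def quality_report(values, valid_range):
    """Count missing, out-of-range, and usable entries in a list of values."""
    low, high = valid_range
    missing = sum(1 for v in values if v is None)
    out_of_range = sum(
        1 for v in values if v is not None and not (low <= v <= high)
    )
    usable = len(values) - missing - out_of_range
    return {"missing": missing, "out_of_range": out_of_range, "usable": usable}


report = quality_report(ages, VALID_RANGE)
print(report)
```

A check like this flags implausible values but cannot detect incorrect values that fall inside the plausible range.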
Precision and Accuracy
[Figure: four targets, each marked with the true value, showing measurements that are accurate and precise, precise only, accurate only, and neither accurate nor precise]
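One way to put numbers on the four cases is to treat accuracy as the distance of the average measurement from the true value (bias) and precision as the spread of repeated measurements. The sketch below uses made-up readings; `TRUE_VALUE`, `readings`, and `bias_and_spread` are hypothetical names chosen for illustration.

```python
import statistics

TRUE_VALUE = 100.0  # hypothetical true value


def bias_and_spread(measurements, true_value):
    """Return (bias, spread): mean error and sample standard deviation."""
    bias = statistics.mean(measurements) - true_value
    spread = statistics.stdev(measurements)
    return bias, spread


# Made-up repeated measurements for each of the four cases.
readings = {
    "accurate and precise": [99.8, 100.1, 100.0, 100.2, 99.9],
    "precise only": [109.8, 110.1, 110.0, 110.2, 109.9],
    "accurate only": [92.0, 108.0, 100.0, 95.0, 105.0],
    "neither": [102.0, 118.0, 110.0, 105.0, 115.0],
}

for label, values in readings.items():
    bias, spread = bias_and_spread(values, TRUE_VALUE)
    print(f"{label:22s} bias={bias:6.2f}  spread={spread:5.2f}")
```

Small bias indicates accuracy and small spread indicates precision; the two can vary independently.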
Questions about data inferences
What type of data is available
• Nominal
• Ordinal
• Interval
• Ratio
Measurement Scales
• Nominal
• Ordinal
A nominal scale produces categorical variables
• Interval
• Ratio
Interval and ratio scales produce continuous variables
Examples of Measurement Scales
• Nominal: Gender
• Ordinal: Tanner Stage
• Interval: Temperature (°F)
• Ratio: Blood Pressure
What type of descriptive statistics are needed
• Mean
• Median
• Shape of distribution
• Variation – standard deviation
• Proportion
[Figure: normal curve marked at the mean, ±1 SD, ±1.96 SD, and ±2 SD]
For a normal curve, the cutoffs for a traditional two-tailed alpha of 0.05 lie at ±1.96 SD, nearly two standard deviation units from the mean
Standard Deviation
• A measure of variability within a sample
• Positive square root of variance
• For a normal curve, the area between -1 SD and +1 SD is about 68% of the area under the curve
• The area between -2 SD and +2 SD is about 95.4%
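The descriptive statistics and the normal-curve areas above can be checked with Python's standard `statistics` module. The `heights_cm` sample is hypothetical and is used only to show the calls.

```python
import statistics
from statistics import NormalDist

# Hypothetical sample of heights in centimeters.
heights_cm = [150, 152, 155, 158, 160, 160, 162, 165, 168, 170]

mean = statistics.mean(heights_cm)
median = statistics.median(heights_cm)
variance = statistics.variance(heights_cm)  # sample variance
sd = statistics.stdev(heights_cm)           # positive square root of variance

print(f"mean={mean}, median={median}, SD={sd:.2f}")

# Area under a standard normal curve within 1 and 2 SD of the mean.
z = NormalDist()  # mean 0, SD 1
area_1sd = z.cdf(1) - z.cdf(-1)
area_2sd = z.cdf(2) - z.cdf(-2)
print(f"within 1 SD: {area_1sd:.1%}, within 2 SD: {area_2sd:.1%}")
```

The percentages apply to a normal distribution; a skewed sample can have a very different share of values within one standard deviation of its mean.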
Confidence Intervals
• Range of values containing the true mean with a given level of certainty
• 95% CI commonly used
• 95% CI = mean ± 1.96 SE
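A short sketch of the 95% CI calculation, assuming the standard error of the mean is the sample standard deviation divided by the square root of the sample size. The `systolic_bp` readings and the `ci95` helper are hypothetical.

```python
import math
import statistics


def ci95(sample):
    """Return (lower, upper) bounds of mean ± 1.96 SE for a sample."""
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    # 1.96 is the normal-curve multiplier; with small samples a
    # t-distribution multiplier is often used instead.
    return mean - 1.96 * se, mean + 1.96 * se


# Hypothetical systolic blood pressure readings (mmHg).
systolic_bp = [108, 112, 115, 110, 120, 118, 105, 114, 116, 111]

lower, upper = ci95(systolic_bp)
print(f"mean={statistics.mean(systolic_bp):.1f}, 95% CI=({lower:.1f}, {upper:.1f})")
```

Because the standard error shrinks as the sample grows, larger samples give narrower intervals.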
The Null Hypothesis
H0: A does not differ from B
H1: A differs from B
where A and B are two variables of interest
Types of Error in Statistical Testing
• Type 1: Rejection of a ‘true’ null hypothesis
• Type 2: Acceptance of (failure to reject) a ‘false’ null hypothesis
One-Tail versus Two-Tails
• One-tailed tests are used to assess a directional hypothesis
• One-tailed tests have greater power to detect a difference in the hypothesized direction
• One-tailed tests can be used when there is a solid theoretical or empirical basis for the expected direction
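The power advantage of a one-tailed test comes from its lower cutoff: all of alpha sits in one tail instead of being split between two. The sketch below compares the cutoffs on a standard normal curve, assuming alpha = 0.05.

```python
from statistics import NormalDist

ALPHA = 0.05  # assumed significance level
z = NormalDist()  # standard normal: mean 0, SD 1

# One-tailed: all of alpha in one tail.
one_tailed_cutoff = z.inv_cdf(1 - ALPHA)
# Two-tailed: alpha split evenly between both tails.
two_tailed_cutoff = z.inv_cdf(1 - ALPHA / 2)

print(f"one-tailed cutoff: {one_tailed_cutoff:.3f} SD")
print(f"two-tailed cutoff: ±{two_tailed_cutoff:.3f} SD")
```

A test statistic between the two cutoffs would be significant in a one-tailed test but not in a two-tailed test, which is why the direction should be justified before the data are examined.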
Elements of Statistical Power
• The statistical test
• Level of Alpha
• 1-Tailed / 2-Tailed
• Sample Size
• The difference to be detected
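To see how these elements interact, the sketch below approximates power for one particular test, a two-sample z-test with equal group sizes and a known common standard deviation. The choice of test, the `approx_power` helper, and the example numbers are assumptions made for illustration.

```python
from statistics import NormalDist


def approx_power(delta, sigma, n_per_group, alpha=0.05, two_tailed=True):
    """Approximate power of a two-sample z-test with equal group sizes.

    Uses a normal approximation. For a two-tailed test, the very small
    chance of rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_crit = z.inv_cdf(1 - tail_alpha)
    # Standard error of the difference between two group means.
    se = sigma * (2 / n_per_group) ** 0.5
    return z.cdf(abs(delta) / se - z_crit)


# Hypothetical scenario: detect a difference of half a standard deviation.
for n in (16, 32, 64, 128):
    two = approx_power(delta=0.5, sigma=1.0, n_per_group=n)
    one = approx_power(delta=0.5, sigma=1.0, n_per_group=n, two_tailed=False)
    print(f"n={n:3d} per group: two-tailed={two:.2f}, one-tailed={one:.2f}")
```

Power rises with sample size, with a larger difference to be detected, with a larger alpha, and when a one-tailed test replaces a two-tailed one.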
What type of inferential statistics are appropriate
• Correlation
• Chi square
• t test
• Risk ratio and Odds ratio
What is a Risk Ratio
                   Disease Status
Risk Factor    Present    Absent    Total
Present        a          b         a+b
Absent         c          d         c+d
Total          a+c        b+d       a+b+c+d
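Using the cell labels a, b, c, and d from the 2×2 table, the risk ratio is the risk of disease among those with the risk factor, a/(a+b), divided by the risk among those without it, c/(c+d). The counts below are hypothetical, and `risk_ratio` is a helper name chosen for this sketch.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2x2 table.

    a: risk factor present, disease present
    b: risk factor present, disease absent
    c: risk factor absent,  disease present
    d: risk factor absent,  disease absent
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have disease.
rr = risk_ratio(a=20, b=80, c=10, d=90)
print(f"risk ratio = {rr:.2f}")
```

A risk ratio of 1 means the risk is the same in both groups; values above 1 mean higher risk among those with the risk factor.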
What is an Odds Ratio
                   Disease Status
Risk Factor    Present    Absent    Total
Present        a          b         a+b
Absent         c          d         c+d
Total          a+c        b+d       a+b+c+d
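Using the same cell labels, the odds ratio compares the odds of disease with the risk factor, a/b, to the odds without it, c/d, which simplifies to (a×d)/(b×c). The counts below are hypothetical, and `odds_ratio` is a helper name chosen for this sketch.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a: risk factor present, disease present
    b: risk factor present, disease absent
    c: risk factor absent,  disease present
    d: risk factor absent,  disease absent
    """
    # (a / b) / (c / d) rewritten as a cross-product ratio.
    return (a * d) / (b * c)


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have disease.
oratio = odds_ratio(a=20, b=80, c=10, d=90)
print(f"odds ratio = {oratio:.2f}")
```

For these counts the odds ratio (2.25) is larger than the ratio of the two risks (0.20/0.10 = 2); the two measures come closer together as the disease becomes rarer.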
Who do the data represent, and can the data be applied to other groups
• Representativeness
• Generalizability
Questions?