Easy (and not so easy) questions to ask about adolescent health data
J. Dennis Fortenberry MD MS
Indiana University School of Medicine

Four types of questions about health data
• About data users
• About data production
• About data quality
• About data inferences

Questions about the users
• Who are the end-users?
• What are the data skills of the end-users?
• What are the conditions of use?

Questions about data production
• Why were the data collected?
• Who collected the data?
• How were the data collected?
• How were the data processed?
• Who is represented by the data?
• Who is left out, and why?
• Are there issues of privacy and confidentiality?

Questions about data quality
• Missing data
• Incorrect data
• Coded data
• Out-of-range data
• Accuracy

Precision and Accuracy
[Figure: four panels showing measurements relative to the true value: accurate and precise; precise only; accurate only; neither accurate nor precise]

Questions about data inferences

What type of data is available
• Nominal
• Ordinal
• Interval
• Ratio

Measurement Scales
• Nominal
• Ordinal
A nominal scale produces categorical variables
• Interval
• Ratio
Interval and ratio scales produce continuous variables

Examples of Measurement Scales
• Nominal: Gender
• Ordinal: Tanner Stage
• Interval: Temperature (°F)
• Ratio: Blood Pressure

What type of descriptive statistics are needed
• Mean
• Median
• Shape of distribution
• Variation (standard deviation)
• Proportion

[Figure: normal curve centered on the mean, with markers at -1 SD and +1 SD, -2 SD and +2 SD, and -1.96 SD and +1.96 SD]
For a normal curve, the traditional alpha of 0.05 (two-tailed) corresponds to 1.96 standard deviation units from the mean, which is nearly two

Standard Deviation
• A measure of variability within a sample
• Positive square root of the variance
• The area between -1 SD and +1 SD is 68% of the area under the normal curve
• The area between -2 SD and +2 SD is 95.4%

Confidence Intervals
• Range of values containing the true mean with a given level of certainty
• 95% CI commonly used
• 95% CI = mean ± 1.96 SE

The Null Hypothesis
H0: A does not differ from B
H1: A differs from B
Where A and B are two variables of interest

Types of Error in Statistical Testing
• Type 1: Rejection of a ‘true’ null hypothesis
• Type 2: Acceptance of (failure to reject) a ‘false’ null hypothesis

One-Tail versus Two-Tails
• One-tailed tests are used to assess a directional hypothesis
• One-tailed tests have greater power to detect a difference in the hypothesized direction
• One-tailed tests can be used when there is a solid theoretical or empirical basis for the direction

Elements of Statistical Power
• The statistical test
• Level of alpha
• 1-tailed / 2-tailed
• Sample size
• The difference to be detected

What type of inferential statistics are appropriate
• Correlation
• Chi square
• t test
• Risk ratio and odds ratio

What is a Risk Ratio

                 Disease Status
Risk Factor    Present    Absent    Total
Present        a          b         a+b
Absent         c          d         c+d
Total          a+c        b+d       a+b+c+d

What is an Odds Ratio

                 Disease Status
Risk Factor    Present    Absent    Total
Present        a          b         a+b
Absent         c          d         c+d
Total          a+c        b+d       a+b+c+d

Who do the data represent, and can the data be applied to other groups
• Representativeness
• Generalizability

Questions?
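Worked example: descriptive statistics and a 95% CI. The slides give the formula 95% CI = mean ± 1.96 SE without a worked case, so the following is a minimal Python sketch of that calculation. It assumes SE means the standard error of the mean (SD divided by the square root of the sample size), which is the usual definition but is not spelled out on the slides. The function name `mean_ci95` and the blood pressure readings are hypothetical, made up for illustration only.

```python
import math
import statistics


def mean_ci95(values):
    """Return (mean, sd, se, lower, upper) for a normal-approximation 95% CI.

    Assumes SE = SD / sqrt(n), the standard error of the mean.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD: positive square root of the sample variance
    se = sd / math.sqrt(n)          # standard error of the mean (assumed definition of SE)
    half_width = 1.96 * se          # 1.96 is the multiplier used on the slide
    return mean, sd, se, mean - half_width, mean + half_width


# Hypothetical systolic blood pressure readings (mmHg), invented for this example
readings = [110, 112, 118, 121, 115, 109, 125, 117, 113, 120]
mean, sd, se, lower, upper = mean_ci95(readings)
print(f"mean={mean:.1f}  SD={sd:.2f}  SE={se:.2f}  95% CI=({lower:.1f}, {upper:.1f})")
```

With small samples, a multiplier from the t distribution is often preferred over 1.96; the sketch keeps 1.96 only to match the slide.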
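Worked example: areas under the normal curve. The standard deviation slide quotes 68% of the area within 1 SD and 95.4% within 2 SD, and the normal curve slide ties an alpha of 0.05 to 1.96 SD. A short sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later) can be used to check those figures. The helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
std_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k):
    """Area under the standard normal curve between -k SD and +k SD."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Areas within 1, 1.96, and 2 standard deviations of the mean
areas = {k: area_within(k) for k in (1.0, 1.96, 2.0)}

# Two-tailed critical value for alpha = 0.05: 2.5% of the area in each tail
z_crit = std_normal.inv_cdf(1 - 0.05 / 2)

for k, area in areas.items():
    print(f"within +/-{k} SD: {area:.4f}")
print(f"two-tailed critical value for alpha=0.05: {z_crit:.3f}")
```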
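Worked example: risk ratio and odds ratio. The risk ratio and odds ratio slides show only the 2x2 table layout (cells a, b, c, d) and not the formulas. The sketch below uses the standard textbook definitions, which are an addition here rather than something stated on the slides: the risk ratio compares the proportion with disease among those with and without the risk factor, and the odds ratio is the cross-product ad/bc. The function names and the counts are hypothetical.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2x2 table.

    a, b: disease present / absent among those WITH the risk factor
    c, d: disease present / absent among those WITHOUT the risk factor
    """
    risk_with_factor = a / (a + b)
    risk_without_factor = c / (c + d)
    return risk_with_factor / risk_without_factor


def odds_ratio(a, b, c, d):
    """Odds ratio from the same 2x2 table: (a/b) / (c/d) = ad / bc."""
    return (a * d) / (b * c)


# Hypothetical counts, invented for illustration
a, b, c, d = 20, 80, 10, 90
rr = risk_ratio(a, b, c, d)    # (20/100) / (10/100)
odds = odds_ratio(a, b, c, d)  # (20*90) / (80*10)
print(f"risk ratio = {rr:.2f}, odds ratio = {odds:.2f}")
```

Neither function guards against zero cells; a table with b, c, or a whole row equal to zero would need separate handling.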