Common Statistical Tests
  - Descriptive statistics: common in all types of studies; the first step in reporting findings
  - Continuous variables (e.g., pain VAS, age, cholesterol): t-test, ANOVA, Pearson correlation, linear regression
  - Categorical or nominal variables (e.g., gender, death, categorical scales): chi-square test, relative risks, proportions, Mantel-Haenszel, Spearman correlation (for ordered categories), logistic regression
  * Most of these tests assume random sampling or random group assignment, an assumption that is frequently violated.

Descriptive Statistics
  - Measures of central tendency: mean, median, mode
  - Measures of variability: standard deviation, standard error, confidence intervals, range of scores
  - Frequency distribution: how many people are in each level of the variable
  - Proportions: proportion (%) of the sample at each level; often also referred to as a frequency distribution

Central Tendency
  - Mean: the mathematical average; used when the distribution is normal
  - Median: the 50th percentile (½ of scores below, ½ above); used when the distribution is skewed
  - Mode: the score with the highest frequency; seldom reported

Measures of Variability
  - Standard deviation (S.D.): variability of scores around the mean in your sample (the spread of scores in your sample)
      E.g., with a mean of 100 and an S.D. of 10, and scores that are normally distributed, about 68% of scores lie between 90 and 110 and about 95% lie within 2 standard deviations of the mean (80 to 120)
  - Standard error (S.E.): a measure of how imprecisely the sample mean estimates the true population mean (S.E. = S.D. / √n)
      Often used incorrectly in the presentation of results: the standard error is smaller than the standard deviation, so reporting it in place of the S.D. makes the data look less variable

Measures of Variability (continued)
  - Range of scores: the lowest and highest scores observed
  - Confidence intervals: a range of values that we are fairly confident will include the true value we are interested in
      E.g., mean = 100, 95% CI 85-105: if we repeated the measurement on 100 samples and calculated a 95% confidence interval for each, about 95 of those intervals would include the true value

Why Statistical Tests?
  - Error arises from sampling
  - Concept of H0: the assumption of equality (or of no association)
  - How likely is it that the observed results are due to chance? This is what the P value addresses
  - Rejecting H0 when it is true => type I error, conventionally set at 0.05
  - Accepting H0 when it is false => type II error, conventionally set at 0.2

Frequency Distribution
  [Figure: bar chart of the number of responders by age group (20-29, 30-39, 40-49, 50-59, 60-69 years); vertical axis from 0 to 45]

Statistical Tests
  1. Detecting a difference:
      Comparing mean blood pressure
      Comparing the sex distribution across different fields of study
  2. Detecting an association:
      Determining the association between personality type and field of study
      Determining the association between Chlamydia infection and IHD

Statistical Tests for Comparison

  Variable                    Groups                Independent          Paired
  Quantitative (mean)         Two groups            Independent t test   Paired t test
  Quantitative (mean)         Three or more groups  ANOVA                Repeated measures
  Qualitative (percentage)    Two groups            Chi-square           McNemar
  Qualitative (percentage)    Three or more groups  Chi-square           Cochran

Statistical Analysis: Student's T-Test
  - Measures differences between group means
  - Requires continuous data; assumes a normal distribution in each group and random sampling
  - Considers variability within groups
  - T-test for independent samples, t-test for dependent (paired) samples

Statistical Analysis: Analysis of Variance
  - Similar in concept to the t-test
  - Used when there are more than two groups (e.g., experimental group, placebo group, alternative medication group)
  - Requires continuous variables, a normal distribution in each group, and random sampling

Statistical Analysis: Chi-Square
  - Tests differences between proportions; for discrete data
  - Simplest case is the 2 x 2 table
  - Compares the observed cell counts with the counts expected if there were no association
  - Mantel-Haenszel: an extension of chi-square; a way of calculating adjusted odds ratios for stratified data

Chi Square (example)

                 Depressed    Not Depressed   Total
  Smoker         89 (33%)     179 (67%)       268
  Nonsmoker      131 (17%)    647 (83%)       778
  Total          220          826             1046

Chi Square (cell notation)

                 Depressed    Not Depressed   Total
  Smoker         a            b               a+b
  Nonsmoker      c            d               c+d
  Total          a+c          b+d             T = a+b+c+d

Statistical Tests for Detecting an Association
  - Correlation
  - Regression

Correlation Coefficients
  - Possible values range from -1 to +1
  - -1 = perfect negative correlation: as exposure increases, disease (health condition) decreases
  - 0 = no relationship, or no linear relationship
  - +1 = perfect positive correlation: as exposure increases, disease increases

Other Statistics
  - Logistic regression
      Yields odds ratios (cohort, case-control, cross-sectional studies)
      Odds ratio = odds that an exposed person develops the disease : odds that a non-exposed person develops the disease
      Crude OR: takes only exposure and outcome into consideration
      Adjusted OR: odds ratio after taking the other factors/confounders in the model into consideration

Other Statistics (continued)
  - Linear regression
      Used when the outcome is continuous
      Closely related to correlation
      Can adjust for other factors/confounders in the model
  - Cox proportional hazards
      Used when the outcome is time to an event (time to death, recovery, onset of symptoms)
      A regression model
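
As a worked illustration of the chi-square slides, the sketch below runs the smoker/depression 2 x 2 table through the standard Pearson chi-square formula for cells a, b, c, d and also computes the crude odds ratio. The function names are hypothetical, no continuity correction is applied, and only the Python standard library is used; the slides themselves do not prescribe any software, so this is one possible way to check the numbers, not the presenter's method.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table.

    Cell layout follows the slide: rows are exposure (smoker, nonsmoker),
    columns are outcome (depressed, not depressed).
    """
    total = a + b + c + d
    numerator = total * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # With 1 degree of freedom the upper-tail probability of a
    # chi-square variable equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def crude_odds_ratio(a, b, c, d):
    """Odds of the outcome in the exposed (a/b) over the unexposed (c/d)."""
    return (a * d) / (b * c)


# Counts taken from the smoker/depression table in the slides.
a, b, c, d = 89, 179, 131, 647

chi2, p_value = chi_square_2x2(a, b, c, d)
odds_ratio = crude_odds_ratio(a, b, c, d)

print(f"chi-square = {chi2:.2f}, p = {p_value:.2g}")
print(f"crude odds ratio = {odds_ratio:.2f}")
```

The odds ratio here is crude in the sense used on the logistic regression slide: it considers only exposure and outcome. An adjusted odds ratio would need stratified data (Mantel-Haenszel) or a regression model, neither of which this sketch attempts.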