| Goal | Scale of measurement: Interval/ratio | Ordinal | Nominal |
|---|---|---|---|
| Describe one group | Mean, standard deviation | Median; interquartile range | Proportion |
| Compare one group to a hypothetical value | One-sample t-test; Z-test (if the true variance is known, or the sample size is very big) | Wilcoxon signed-rank test | Chi-square test |
| Compare two paired groups | Paired t-test | Wilcoxon signed-rank test | McNemar's test |
| Compare two unpaired groups | Unpaired/two-sample t-test | Mann-Whitney U test | Chi-square test |
| Compare three or more paired groups | Repeated measures ANOVA | Friedman test | Cochran's Q |
| Compare three or more unpaired groups | ANOVA | Kruskal-Wallis | Chi-square test |
| Quantify association between 2 variables | Pearson correlation | Spearman correlation | Contingency coefficients |
| Predict value from another measured variable | Simple linear regression | | |

Test: One-sample Z-test
When to use: Compare one group (mean) to a hypothetical value if the variance is known or the sample is very large. For ratio/interval data. Null hypothesis: $H_0: \mu = \mu_0$
Assumptions: The sample comes from a normally distributed population; random sampling; independent observations.
Formula:
$$Z_{calc} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
Where:
$\bar{x}$: sample mean (for a paired sample, the mean of the differences);
$\mu_0$: hypothetical value under the null hypothesis;
$\sigma$: population standard deviation (SD) or, for a paired sample, the standard deviation of the differences; for very large samples it is approximated by the sample standard deviation;
$n$: sample size.
Reject H0 if: $|Z_{calc}| > |Z_{\alpha}|$, where $\alpha$ is the significance level (generally 0.05) and $|\cdot|$ stands for absolute value.

Test: One-sample or paired t-test
When to use: Compare one group to a hypothetical value, or two paired groups. For ratio/interval data. Null hypothesis: $H_0: \mu = \mu_0$
Assumptions: The sample comes from a normally distributed population; random sampling.
Formula:
$$T_{calc} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
Where:
$\bar{x}$: sample mean (for a paired sample, the mean of the differences);
$\mu_0$: hypothetical population mean under the null hypothesis (for a paired sample, typically zero);
$s$: sample standard deviation (for a paired sample, the SD of the differences);
$n$: sample size.
Reject H0 if: $|T_{calc}| > |T_{\alpha;\nu}|$, i.e., reject if the absolute value of $T_{calc}$ is above the absolute critical value, where $\alpha$ is the significance level and $\nu$ the degrees of freedom: $\nu = n - 1$.

Test: Unpaired/two-sample t-test
When to use: Compares the means of two independent groups. For ratio/interval data. Null hypothesis: $H_0: \mu_1 = \mu_2$ or, equivalently, $H_0: \mu_1 - \mu_2 = 0$
Assumptions: The samples come from normally distributed populations; both populations have identical variances (i.e. $\sigma_1 = \sigma_2$); random sampling; independent observations.
Formula:
$$T_{calc} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{S_P \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
Where $S_P$ is the pooled standard deviation, calculated by:
$$S_P = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$$
and:
$\bar{x}_1$, $\bar{x}_2$: sample means of populations 1 and 2;
$S_1$, $S_2$: sample SDs of populations 1 and 2;
$n_1$, $n_2$: sample sizes for populations 1 and 2;
$(\mu_1 - \mu_2)_0$: hypothesized difference between the means of the populations (generally zero).
Reject H0 if: $|T_{calc}| > |T_{\alpha;\nu}|$, i.e., reject if the absolute value of $T_{calc}$ is above the absolute critical value, where $\alpha$ is the significance level and $\nu$ the degrees of freedom: $\nu = n_1 + n_2 - 2$.

Test: Wilcoxon signed-rank test
When to use: Compare one group to a hypothetical value, or two paired groups. For ordinal data (or ratio/interval data if the sample is small and the data are not normal). Null hypothesis: $H_0: M = M_0$ (where $M$ represents the median)
Assumptions: Distribution symmetrical (does not need to be normal); random sampling.
Formula:
$T_+$ = sum of the ranks having a positive sign
$T_-$ = sum of the ranks having a negative sign
How to do it:
1. For paired data: for each data point, calculate the difference between the 2 groups. For comparison of one group to a hypothetical value: for each data point, subtract the hypothesized median value.
2. Rank the absolute differences from smallest to largest (i.e., 1 for the smallest absolute difference).
3. Add the corresponding signs (+ for originally positive differences, - for negative ones).
4. Calculate $T_+$ and $T_-$. Note: if one of them is calculated, the other can be obtained from $T_+ = n(n+1)/2 - T_-$, where $n$ is the sample size.
Reject H0 if: Choose $T_+$ or $T_-$ (whichever is smaller) and compare it with the critical value (from a table). Reject if it is below or equal to the critical value.

Test: Mann-Whitney U test (also called Wilcoxon rank-sum, or Wilcoxon-...)
When to use: Compare two unpaired groups. For ordinal data (or ratio/interval data if the sample is small and the data are not normal). Null hypothesis: $H_0: M_1 = M_2$ (where $M$ represents the median)
Assumptions: Random sampling; independent observations.
Formula:
$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 \quad \text{and} \quad U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$$
Alternatively, one can be obtained from the other through:
$$U_1 = n_1 n_2 - U_2$$
Where:
$R_1$, $R_2$: sums of the ranks for populations 1 and 2;
$n_1$, $n_2$: sample sizes for populations 1 and 2.
Reject H0 if: Choose $U_1$ or $U_2$ (whichever is smaller) and compare it with the critical value (from a table). Reject if it is below or equal to the critical value.

Test: Chi-square test
When to use: Compares two or more groups. For nominal data. Null hypothesis: H0: the variable represented in the rows is independent of the variable represented in the columns.
Assumptions: Random sampling; independent observations.
Formula: First put the data in the form of a contingency table, then calculate:
$$\chi^2_{calc} = \sum_{i=1}^{R} \sum_{j=1}^{C} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
Where:
$R$: total number of rows;
$C$: total number of columns;
$O_{ij}$: number of observations (frequency) in row $i$, column $j$;
$E_{ij}$: expected frequency in row $i$, column $j$, which is calculated as:
$$E_{ij} = \frac{R_i \, C_j}{n}$$
Where:
$R_i$: sum of all the frequencies of row $i$ (calculated as $\sum_{j=1}^{C} O_{ij}$);
$C_j$: sum of all the frequencies of column $j$ (calculated as $\sum_{i=1}^{R} O_{ij}$);
$n$: sample size, i.e. the sum of all frequencies (calculated as $\sum_{i=1}^{R} \sum_{j=1}^{C} O_{ij}$).
Reject H0 if: $\chi^2_{calc} > \chi^2_{\alpha;\nu}$, i.e., reject if the calculated value is above the critical value, where $\alpha$ is the significance level and $\nu$ the degrees of freedom: $\nu = (R - 1)(C - 1)$.

Test: ANOVA
When to use: Compares three or more groups. For ratio/interval data. Null hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3 \ldots$
Assumptions: The samples come from normally distributed populations; all populations have identical variances (i.e. $\sigma_1 = \sigma_2 = \sigma_3 \ldots$); random sampling; independent observations.
Formula:
$$F_{calc} = \frac{MSTR}{MSE}$$
Where $MSTR$ is the mean square of the treatment and $MSE$ is the mean squared (residual) error, calculated as:
$$MSTR = \frac{SSTR}{df_{TR}} \quad \text{and} \quad MSE = \frac{SSE}{df_E}$$
Where $SSTR$ is the sum of squares of the treatment, $SSE$ the sum of squares of the (residual) error, $df_{TR}$ the degrees of freedom of the treatment and $df_E$ the degrees of freedom of the (residual) error, calculated as:
$$SSTR = \sum_{i=1}^{g} n_i (\bar{x}_i - \bar{x})^2 \qquad SSE = \sum_{i=1}^{g} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$
$$df_{TR} = g - 1 \quad \text{and} \quad df_E = n - g$$
Where:
$x_{ij}$: observation $j$ in group $i$;
$\bar{x}_i$: mean of group $i$;
$n_i$: sample size of group $i$;
$\bar{x}$: total mean (averaging all values, independently of the group);
$g$: total number of groups;
$n$: total sample size.
Also $SST = SSTR + SSE$, where $SST$ is the total sum of squares:
$$SST = \sum_{i=1}^{g} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2$$
Reject H0 if: $F_{calc} > F_{\alpha;(df_{TR};df_E)}$, i.e., reject if the calculated value is above the critical value, where $\alpha$ is the significance level, $df_{TR} = g - 1$ and $df_E = n - g$.
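As a rough illustration, the test statistics described above can be computed by hand in plain Python. This is a minimal sketch using only the standard library. All function names are hypothetical. Two choices are assumptions not stated in the text: tied absolute differences share their average rank, and zero differences are dropped before ranking. The chi-square degrees of freedom are taken as (R-1)(C-1), the usual value for an R x C contingency table. Critical values still have to be looked up in a table.

```python
import math
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """T_calc = (mean - mu0) / (s / sqrt(n)); for paired data pass the differences."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


def pooled_t(x1, x2, diff0=0.0):
    """Two-sample T with pooled SD; diff0 is the hypothesized mean difference."""
    n1, n2 = len(x1), len(x2)
    sp = math.sqrt(((n1 - 1) * stdev(x1) ** 2 + (n2 - 1) * stdev(x2) ** 2)
                   / (n1 + n2 - 2))
    return ((mean(x1) - mean(x2)) - diff0) / (sp * math.sqrt(1 / n1 + 1 / n2))


def average_ranks(values):
    """Rank from smallest (rank 1) upward; ties share the average rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the tie
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def signed_rank_sums(diffs):
    """Return (T_plus, T_minus). Assumption: zero differences are dropped."""
    nonzero = [d for d in diffs if d != 0]
    ranks = average_ranks([abs(d) for d in nonzero])
    t_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(nonzero, ranks) if d < 0)
    return t_plus, t_minus


def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # E_ij = R_i * C_j / n
            stat += (observed - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(table[0]) - 1)


def anova_f(groups):
    """One-way ANOVA: F_calc = MSTR / MSE with df_TR = g - 1 and df_E = n - g."""
    g = len(groups)
    n = sum(len(grp) for grp in groups)
    grand_mean = sum(sum(grp) for grp in groups) / n
    sstr = sum(len(grp) * (mean(grp) - grand_mean) ** 2 for grp in groups)
    sse = sum((x - mean(grp)) ** 2 for grp in groups for x in grp)
    return (sstr / (g - 1)) / (sse / (n - g))
```

For example, `anova_f([[1, 2, 3], [4, 5, 6], [7, 8, 9]])` gives SSTR = 54 and SSE = 6, so F = (54/2)/(6/6) = 27, and `signed_rank_sums([1.5, -0.5, 2.0, -1.0, 3.0])` returns rank sums that add up to n(n+1)/2 = 15.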