HYPOTHESIS TESTING

Hypothesis Tests: Purpose
• Sampling => sampling error exists.
• We want to know if sampling error is a likely explanation for our observed results.
• Null hypothesis: assumed to be true for the purpose of the hypothesis test. Rejecting it implies "acceptance" of the conclusion we wish to verify.

Hypothesis Testing and Truth

                          True State of Nature: Null Hypothesis is...
Research Conclusion       TRUE                              FALSE
DO NOT REJECT Ho          Correct decision                  Error: Type II
                          (confidence level, P = 1 - α)     (P = β)
REJECT Ho                 Error: Type I                     Correct decision
                          (significance level, P = α)       (power of test, P = 1 - β)

Hypothesis Testing Procedure
1. Specify the null and alternative hypotheses (the null hypothesis is what we want to reject).
2. Choose a statistical test and the sampling distribution for the chosen test statistic (see notes on choosing appropriate tests).
3. Specify the confidence level (i.e. the required precision: what α, what β?).
4. Compute the sample statistics.
5. Compute the test statistic and determine its probability under the null hypothesis, using the sampling distribution: if we took repeated samples and the null were true, how likely is the result we obtained?

Hypothesis Testing: Final Step
Compare the obtained probability with the specified significance level, and reject or do not reject. If the test asks "is the parameter different from 0?", then large values of the test statistic are unlikely when the null is true, so the probability of observing such a large value given the null is very small. Therefore, REJECT hypotheses whose p value (sig. value) is LESS than the level of significance.

Notes on "p-Values" and Hypothesis Testing
(from Sawyer and Peter (1983), "The Significance of Statistical Significance," Journal of Marketing Research, v. 20 (May), pp. 122-133)

DEFN: the p value is P(evidence | hypothesis).
e.g. Ho: µ1 = µ2, tested with the t statistic t = [(x̄1 - x̄2) - (µ1 - µ2)] / s(x̄1 - x̄2); p = .05 means that if the null is true (i.e. no difference), then the probability of getting a mean difference this large or larger is 1/20.

Interpreting "p" Values
1. p = .05 does NOT imply a .05 probability that the results are due to chance, or a 95% probability that the observed value is "true."
• That would be P(hypothesis | evidence).
• The p value is calculated assuming (with probability 1) that the difference is due to chance.
• So we are actually accepting or rejecting the assumption, held with probability 1, that chance caused the difference.
2. The p value is not a summary of the data.
• It is not a measure of how strong or dependable the result is.
• Results are not "more" or "less" significant: p = .001 is not "highly significant," or more significant than p = .05.
3. Statistical significance does not equal practical significance.
• Results are statistically significantly different from the null; they are NOT "significant" or "highly significant."

Sample Size and the Probability of a Hypothesis
• Sampling error is explicitly included in significance tests. Therefore, if a given relationship is found to be statistically significant at a given confidence level, more confidence should be placed in the result if the study had a smaller rather than a larger sample.
• A larger sample may be more representative for non-sampling reasons, but significance with a small sample is a more conservative test.
• This is not a recommendation for overly small samples, but anything will be significant in an extremely large sample.

Limitations of Significance Tests
• The researcher influences the objectivity of the test via:
  - greater sample size, to increase power
  - greater reliability of measures, to increase power
  - post hoc changes to the level of significance (e.g. .05 to .10, or one-tailed vs. two-tailed tests)
  - control over non-manipulated variables
• Note: null results are usually attributed to measurement error or sample size.

Description of Results
• Focus on effect sizes and confidence intervals, not just whether or not a result is significant.
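The testing procedure above can be sketched numerically. Below is a minimal Python sketch of a large-sample (z) test of Ho: µ1 = µ2, using only the standard library; the data values and variable names are invented for illustration, and with only 10 observations per group the normal approximation is used purely to keep the example simple:

```python
from statistics import NormalDist, mean, stdev

# Step 1: Ho: mu1 = mu2 (no difference); H1: mu1 != mu2.
# Step 2: large-sample test of a mean difference -> z statistic,
#         standard normal sampling distribution under the null.
# Step 3: significance level alpha = .05, two-tailed.
alpha = 0.05

# Step 4: sample statistics (invented data, for illustration only).
x1 = [5.1, 4.8, 5.6, 5.0, 5.3, 4.9, 5.4, 5.2, 5.5, 5.0]
x2 = [4.4, 4.6, 4.2, 4.7, 4.5, 4.3, 4.8, 4.1, 4.6, 4.4]
n1, n2 = len(x1), len(x2)
diff = mean(x1) - mean(x2)                                # observed mean difference
se = (stdev(x1) ** 2 / n1 + stdev(x2) ** 2 / n2) ** 0.5  # std. error of the difference

# Step 5: test statistic and its probability under the null.
z = diff / se
p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p value

# Final step: compare p with alpha, and report the effect with a
# 95% confidence interval, not just a reject/do-not-reject verdict.
ci = (diff - 1.96 * se, diff + 1.96 * se)
print(f"diff = {diff:.2f}, z = {z:.2f}, p = {p:.4f}, reject Ho: {p < alpha}")
print(f"95% CI for the difference: ({ci[0]:.2f}, {ci[1]:.2f})")
```

With these made-up numbers the mean difference is large relative to its standard error, so the p value falls below α and Ho is rejected; the confidence interval then describes how big the difference plausibly is, which the p value alone does not.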
If you want to say how confident you are about your results, then report the confidence interval!

Description of Results
• Aggregate vs. individual results: remember that even with reasonably large effect sizes, a large proportion of individuals in the two groups will be similar, or even in reverse order. (e.g. a difference of .8 standard deviations between two group means still leaves 52.6% of the two populations overlapped)

Data Analysis Process
1. What's the problem to be answered?
2. What do we want to do about it?
   - Compare groups
   - Compare variables
   - Relationships among variables
3. What does the data look like?
   - sample size (n)
   - dependent variable(s) / independent variable(s)
   - nominal, ordinal, interval, ratio
   - description, distribution
   - missing values, outliers, errors (data "cleaning")
   - research design: repeated measures, independent measures, time-related
   - number of groups / treatments compared
   - number of relationships to be investigated
4. What is the appropriate test / test statistic?
   - assumptions underlying the test statistic: any violations? need to change the test to accommodate them?
   - finalize: test, variables, options
5. Run the analysis.
6. Check the output: did it do what we wanted it to? Check "n", variables, options.
7. Interpretation
   - Results: effect size, confidence interval, statistical significance of tests, the researcher's influence on those results
   - Limitations: research design, statistical, alternative explanations

LEVEL OF MEASUREMENT      ONE SAMPLE            TWO OR MORE SAMPLES,          TWO OR MORE SAMPLES,
                                                INDEPENDENT                   RELATED
Nominal                   Chi-square*           Chi-square*                   McNemar*; Cochran's Q
Ordinal                   Kolmogorov-Smirnov*   Mann-Whitney*;                Wilcoxon*
                                                Kruskal-Wallis (>2 groups)*
Interval, small sample    t-test*               t-test*                       Paired-sample t-test*
Interval, large sample    z test*               z test*; ANOVA*               Repeated-measures ANOVA*
                                                (GLM in SPSS)

Basic Considerations in Choosing a Multivariate Statistical Test

Level of measurement of the criterion?
(Nominal (N), Ordinal (O), or Interval/Ratio (I))

Number of criterion variables?
• One => Dependence Analysis
• None => Interdependence Analysis

Dependence analysis, by level of measurement of criterion and predictors:

Criterion         Predictors N                       Predictors O                  Predictors I
Nominal           Contingency Coefficient;           STOP                          Discriminant Analysis
                  Index of Predictive Association
Ordinal           STOP                               Spearman's Rank Correlation   STOP
Interval/Ratio    Regression Analysis with           STOP                          Regression Analysis
                  Dummy Variables

Interdependence analysis, by level of measurement of the variables:
• Nominal: Factor Analysis with Dummy Variables
• Ordinal: Kendall's Coefficient of Concordance
• Interval/Ratio: Cluster Analysis
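The univariate selection table above amounts to a simple lookup on (level of measurement, sample design). A toy Python sketch of that lookup; the dictionary keys and function name are my own labels, not from the slides:

```python
# Toy encoding of the univariate test-selection table.
# Keys: (level of measurement, sample design); values: suggested test(s).
UNIVARIATE_TESTS = {
    ("nominal", "one sample"): "Chi-square",
    ("nominal", "independent samples"): "Chi-square",
    ("nominal", "related samples"): "McNemar / Cochran's Q",
    ("ordinal", "one sample"): "Kolmogorov-Smirnov",
    ("ordinal", "independent samples"): "Mann-Whitney / Kruskal-Wallis (>2 groups)",
    ("ordinal", "related samples"): "Wilcoxon",
    ("interval, small sample", "one sample"): "t-test",
    ("interval, small sample", "independent samples"): "t-test",
    ("interval, small sample", "related samples"): "Paired-sample t-test",
    ("interval, large sample", "one sample"): "z test",
    ("interval, large sample", "independent samples"): "z test / ANOVA (GLM in SPSS)",
    ("interval, large sample", "related samples"): "Repeated-measures ANOVA",
}

def choose_test(level: str, design: str) -> str:
    """Return the table's suggested test for a measurement level and design."""
    return UNIVARIATE_TESTS[(level, design)]

print(choose_test("ordinal", "related samples"))   # -> Wilcoxon
print(choose_test("nominal", "one sample"))        # -> Chi-square
```

This is only a mnemonic for the table: in practice the choice also depends on the assumption checks in step 4 of the Data Analysis Process (distribution, outliers, violations), which no lookup can automate.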