Hypothesis Testing
Chapter 13

Hypothesis Testing
Decision-making process
Statistics used as a tool to assist with decision-making
Scientific hypothesis is a statement of the predicted relationship among the variables
Null hypothesis is a statement of no relationship among the variables

[Figure: Null Hypothesis Not Rejected. One total population, from which both the sample reared in a sterile environment and the sample reared in an enriched environment are drawn.]

[Figure: Null Hypothesis Rejected. Two populations: the total population of rats reared in a sterile environment and the total population of rats reared in an enriched environment, each with the sample used in the study.]

Hypothesis Testing In Experimental Studies
Your research design determines the kind of statistical test you will use.
Experimental studies test hypotheses, while quasi-experimental studies tend to focus more on generating hypotheses.

Research Designs/Approaches

Experimental
– Purpose: Test for cause/effect relationships
– Time frame: Current
– Degree of control: High
– Example: Comparing two types of treatments for anxiety.

Quasi-experimental
– Purpose: Test for cause/effect relationships without full control
– Time frame: Current or past
– Degree of control: Moderate to high
– Example: Gender differences in visual/spatial abilities.

Nonexperimental correlational
– Purpose: Examine relationship between two variables
– Time frame: Current (cross-sectional) or past
– Degree of control: Low to medium
– Example: Relationship between studying style and grade point average.

Ex post facto
– Purpose: Examine the effect of past event on current functioning
– Time frame: Past and current
– Degree of control: Low to medium
– Example: Relationship between history of child abuse and depression.

Nonexperimental correlational (predictive)
– Purpose: Examine relationship between 2 variables where 1 is measured later
– Time frame: Future
– Degree of control: Low to moderate
– Example: Relationship between history of depression and development of cancer.

Cohort-sequential
– Purpose: Examine change in a variable over time in overlapping groups
– Time frame: Future
– Degree of control: Low to moderate
– Example: How mother-child negativity changed over adolescence.

Survey
– Purpose: Assess opinions or characteristics that exist at a given time
– Time frame: Current
– Degree of control: None or low
– Example: Voting preferences before an election.

Qualitative
– Purpose: Discover potential relationships; descriptive
– Time frame: Past or current
– Degree of control: None or low
– Example: People’s experiences of quitting smoking.

Tests of Significance
The Question / Null Hypothesis / Statistical Test
– Group difference between means of 2 different groups / H0: $\mu_{g1} = \mu_{g2}$ / t-independent
– Difference between 2 means of related groups / H0: $\mu_{g1a} = \mu_{g1b}$ / t-dependent
– Difference between means of 3 groups / H0: $\mu_{g1} = \mu_{g2} = \mu_{g3}$ / ANOVA
– Group relationship between 2 variables / H0: $\rho_{xy} = 0$ / t-test for significance of correlation
– Group relationship between 2 correlations / H0: $\rho_{ab} = \rho_{cd}$ / t-test for significance of difference between 2 correlations

Experimental Designs
Examines differences between experimentally manipulated groups or variables (e.g., one group gets a certain drug and the other gets a placebo).
At minimum, the experimental (independent) variable has two levels (e.g., drug vs. placebo).
– Advantage is that you can determine causality.
– Disadvantage is cost, and many variables cannot be experimentally manipulated (e.g., smoke exposure over time).

Null Hypothesis Significance Testing
Null hypothesis
– Results are due to “chance” (H0)
Alternative (scientific) hypothesis
– Results are due to a true “effect” (H1)
Assess
– Assuming H0 is true, what is the probability or “chance” of obtaining the data we did?
Decide
– If the chance is small enough, reject H0 and infer the “effect” is real.
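The Assess and Decide steps can be illustrated with an exact permutation test, which asks directly how often a difference at least as large as the observed one would arise if group labels were assigned by chance. This is only a sketch: the scores for the sterile and enriched groups are hypothetical values invented for illustration, the function name `exact_permutation_p` is an assumption, and the permutation test is swapped in here for simplicity rather than being one of the tests named in the slides.

```python
from itertools import combinations

# Hypothetical scores, invented for illustration (not data from the slides).
sterile = [10, 12, 11, 13, 9]
enriched = [15, 17, 14, 16, 18]


def mean(xs):
    return sum(xs) / len(xs)


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    Under H0 the group labels are arbitrary, so every way of splitting the
    pooled scores into groups of the same sizes is equally likely.
    """
    pooled = group_a + group_b
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Assess: probability of data this extreme, assuming H0 is true.
p_value = exact_permutation_p(sterile, enriched)

# Decide: reject H0 only if that probability is below alpha.
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(p_value, decision)
```

With these made-up scores the two groups do not overlap at all, so only 2 of the 252 possible relabelings are as extreme as the observed split.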
Experimental Designs: Hypothesis Testing
Type of experimental research design (between subject vs. within subject), by number of independent variables and number of groups or levels of the independent variable:

One independent variable
– Two groups: independent samples t-test (between subject); correlated t-test (within subject)
– More than two groups: one-way ANOVA (between subject); repeated measures ANOVA (within subject)

Two independent variables
– Two or more groups, or two or more levels of each independent variable: two-way ANOVA (between subject)

Parametric Vs. Non-Parametric Statistics: Two-Sample Cases
Level of measurement / Related Samples / Independent Samples
– Nominal / McNemar test / Fisher exact test; χ² test
– Ordinal / Sign test; Wilcoxon matched-pairs signed-ranks test / Median test; Mann-Whitney U test
– Interval / t-test for matched pairs / t-independent test

Parametric Vs. Non-Parametric Statistics: > 2-Sample Cases
Level of measurement / Related Samples / Independent Samples
– Nominal / Cochran Q test / χ² test
– Ordinal / Friedman 2-way ANOVA / Kruskal-Wallis one-way ANOVA
– Interval / Repeated measures ANOVA / ANOVA

Parametric Vs. Non-Parametric Statistics: Correlation
Level of measurement / Correlation
– Nominal / Contingency coefficient
– Ordinal / Spearman rank correlation; Kendall rank correlation, etc.
– Interval / Pearson’s correlation coefficient

[Figure: Sampling Distribution of Mean Difference Scores. Normal curve with regions marking 95% of all cases and 99% of all cases.]

Critical Values of T
Need to determine the degrees of freedom
– df = N-2
Need to determine the p value for rejecting the null hypothesis (alpha)
Need to determine if this is a 1-tailed or 2-tailed level of significance.

T-Values
t(120) = 2.00, p < 0.05

What is one of the major criticisms of employing statistical tests of the null hypothesis to determine if effects are true?
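Before turning to that question, the reported t value of 2.00 with 120 degrees of freedom can be checked numerically. The sketch below computes a two-tailed p value by integrating the Student t density with Simpson's rule, using only the Python standard library. The function names `t_pdf` and `two_tailed_p` and the step count are assumptions made for illustration; in practice a statistics package or a printed table of critical values would normally be used.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=10_000):
    """Two-tailed p value for an observed t, via Simpson's rule.

    Integrates the density from 0 to |t| (steps must be even), doubles it
    to get the central area, and returns the remaining area in both tails.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return 1.0 - 2.0 * central_half


p = two_tailed_p(2.00, 120)
print(round(p, 4))  # a little under 0.05, consistent with p < 0.05
```

Because the value falls just below 0.05, the same t would not reach significance at the stricter 0.01 level, which is why alpha and the number of tails have to be fixed before consulting the table.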
Limitations of Statistical Tests of the Null Hypothesis
Does not take into account the size of the difference between means (effect size)

Analysis of Variance (ANOVA)
$F = MS_{bet} / MS_{within}$
Essentially the between-group variance divided by the within-group variance.
If the groups come from similar populations, the variance between the groups will be similar to the variance within groups (null hypothesis is not rejected).

ANOVA
Between-group variance consists of:
– Variability due to the effect of the independent variable (treatment effect)
– Variability due to chance factors
Within-group variance consists of:
– Variability in data within the treatment groups that is due to chance, since if the treatment effect were consistent, all subjects within a treatment group would experience a similar magnitude of effect.

Analysis of Variance (ANOVA)
$F = MS_{bet} / MS_{within}$
MS refers to the mean square: the sum of squares divided by the appropriate degrees of freedom.
df for $MS_{bet}$ is the number of groups minus 1.
df for $MS_{within}$ is the total number of scores in the experiment minus the number of groups.

ANOVA
$MS_{bet}$ = treatment effect + chance variability
$MS_{within}$ = chance variability
Ratio will be close to 1 if there is no treatment effect
F(2,144) = 5.56, p < 0.05.

Two-Way ANOVA
Where you have 2 independent variables, each having at least 2 levels. For example,
– Drug dose (none vs. 5 mg)
– Delivery mode (intravenous vs. oral)
Factorial design so you can test both main effects and interaction effects

Mixed Model: 2 Between Subject Factors, 1 Within Subject Factor
Where you have 2 independent variables, each having at least 2 levels. For example,
– Drug dose (none vs. 5 mg)
– Delivery mode (intravenous vs. oral)
One within subject factor with, for example, 3 levels
– Pre-treatment, 3 and 6 months follow-up
Factorial design so you can test both main effects and interaction effects (including 3-way interaction effects)

Rejecting the Null Hypothesis
Null hypothesis can be rejected but not accepted
Arguments made for allowing some flexibility in being able to conclude the null hypothesis is true:
– No other studies of the phenomenon have rejected the null hypothesis
– P value for the test of the null hypothesis is large (e.g., > .20 or .40).
– Research design is sufficiently powerful

Errors in Statistical Decision-Making
Type I error: falsely rejecting the null hypothesis
– With alpha set at .05, there is a 5% chance (5 in 100) of rejecting the null hypothesis when it is actually true
Type II error: failing to reject the null hypothesis when it is false

External Validity
Chapter 14

Goals of Psychology Research
Goal is to understand the underlying laws governing the behaviour of organisms.
The more the results of your study inform us about these underlying laws, the more valuable the findings.
The importance of the findings is limited by their internal and external validity.

External Validity
Extent to which the results of the study can be generalized across different persons, settings, and times.
Typically think of generalizing to specific populations (e.g., North American elementary school students) rather than to the world at large.
Best safeguard is random selection, but this is not usually feasible.

Threats to External Validity
Lack of population validity
Lack of ecological validity
Lack of time validity

Population Validity
Generalizing to the defined population (i.e., target population) from which the sample was drawn.
Sample is drawn from the experimentally accessible population.

[Figure: Population Validity. Target population, experimentally accessible population, sample.]

Population Validity
Threatened by a selection by treatment interaction:
– Treatment results may not be exactly reproducible in target population.
– Even willingness to volunteer for studies has been shown to result in a selection by treatment interaction effect.

Ecological Validity
Extent to which the results can be generalized across settings or environmental conditions.
– E.g., would the treatment effect observed in patients recruited from a 1st class medical centre be the same as the treatment effect observed in patients recruited from a local community hospital?

Ecological Validity
Multiple-Treatment Interference
– Sequencing effect whereby exposure to one treatment influences responses to another treatment; or
– Exposure to one experiment influences response in another experiment (e.g., sophisticated participants).

Ecological Validity
Hawthorne Effect
– Knowing one is in a study can affect one’s behaviour
– Participant bias effects (e.g., social acceptability, compliance)
Novelty or Disruption Effect
– Effects are simply due to novelty and wear off once novelty diminishes.

Ecological Validity
Experimenter Effect
– An enthusiastic experimenter/clinician may get different effects than a clinician who is implementing the treatment in routine care.
Pre-testing Effect
– Administering a pre-test may sensitize the participant so that he/she responds differently to the experiment than would have occurred without a pre-test.

Temporal Validity
Extent to which the results would generalize to other times
– Results might vary depending on the time elapsed between presentation of the independent variable and the measurement of the dependent variable.

Temporal Validity
Seasonal Variation
– Variation that appears regularly over time (e.g., change in traffic accident rates between daylight savings time and non-daylight savings time).
– Fixed-time variation: variation at specific, predictable time points
– Variable-time variation: we don’t know when the variation will occur, but when it occurs there are predictable responses.
Temporal Validity
Cyclical Variation
– Predictable variation within people or other organisms
Personological Variation
– Variation in the characteristics of the individual over time

Internal Vs. External Validity
Tends to be an inverse relationship
– As internal validity increases, external validity decreases
In testing for between-group differences, you want to minimize within-group variability and maximize between-group differences
To do so you want to ensure high control over factors that could confound the results, but this often results in increasingly artificial experimental conditions.

When Is External Validity Less Important
When you don’t need to demonstrate that “X” will happen but rather that “X” can happen.
Sometimes the main goal is to test a theory, and the extent to which it reflects “real life” is less important.