Inferential Statistics

Which Statistic Do I Use?
• Dependent Variable Type
  • Continuous
  • Categorical
• Number of Factors (Independent Variables)
  • One
  • Two or More (Factorial Analysis)
• Number of Levels (of the Independent Variable)
  • Two
  • Three or More (Between or Repeated Measures)

Random Error
• will be responsible for some difference in the means

Inferential Statistics
• give the probability that the difference between means reflects random error rather than a real difference

Null Hypothesis (H0):
• states that there is no difference between the sample means
• any observed difference is due to random error
• the Null Hypothesis is rejected when there is a low probability that the obtained results are due to random error

Research Hypothesis (H1):
• states that there is a difference between the sample means
• the difference is not due to random error
• the difference is due to the Independent Variable

Probability
• probability is the likelihood of the occurrence of some event or outcome
• if the probability is low, we reject the possibility of random error
• a significant result is one that is very unlikely if the null hypothesis is correct
• alpha level: the probability required for significance
• the most common alpha level is p < 0.05
• if there is less than a 5% chance that the results were due to random error, the results are considered statistically significant

Samples and Populations
Samples are subsets of the population. Inferential statistics reflect what would happen if you took multiple samples.

Sampling
If a dependent variable within the population is normally distributed and we can calculate or estimate the mean and standard deviation of the population, then we can use probabilities to determine whether the independent variable has caused a significant change in the dependent variable. This is the essence of the scientific method. To begin, we must collect a representative sample of our much larger population.

A Representative Sample is one in which all significant subgroups of the population are represented. Random Sampling is used to increase the chances of obtaining a representative sample: it assures that everyone in the population of interest is equally likely to be chosen as a subject. The larger the random sample, the more likely it is to be representative of the population.

Randomization
• assures that any extraneous variable will affect all participants equally
• uses lists of random numbers or random number generators
• any variable that cannot be held constant is controlled by randomization
• can be used for scheduling the ordering of events

Random Numbers
Step 1: make a numbered list of all of your experimental participants.
Step 2: flip through the pages of the random number table and arbitrarily put your finger on a page.
Step 3: read the numbers in sequence, either down or across.
Step 4: assign those numbers, in order, to your list of participants; if a number is duplicated, skip it and go to the next one.
Step 5: assign the participants with the highest numbers to group 1 and the ones with the lowest numbers to group 2. (A short code sketch of the same procedure follows below.)
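The same random-assignment procedure can be carried out with a software random number generator instead of a printed table. Below is a minimal Python sketch of that idea; the participant labels, the pool of random numbers, and the even two-group split are all hypothetical choices made for illustration.

```python
import random

# Step 1: a numbered list of experimental participants (hypothetical labels)
participants = ["P01", "P02", "P03", "P04", "P05", "P06", "P07", "P08"]

# Steps 2-4: give every participant a unique random number
# (random.sample draws without replacement, so duplicates never occur)
random_numbers = random.sample(range(1000), k=len(participants))
assignments = list(zip(participants, random_numbers))

# Step 5: highest numbers go to group 1, lowest numbers to group 2
assignments.sort(key=lambda pair: pair[1], reverse=True)
half = len(assignments) // 2
group_1 = [p for p, _ in assignments[:half]]
group_2 = [p for p, _ in assignments[half:]]

print("Group 1:", group_1)
print("Group 2:", group_2)
```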
Who Needs Random Sampling?
Hite Report Survey on Women's Relationships: Percentage of Respondents
[Figure: Hite Report Findings, women's disclosure about relationships (nonrandom sample): roughly 70% of respondents reported having had an affair and roughly 95% reported feeling harassed.]

We Need Random Sampling
When the Survey Was Redone with a Random Sample: Percentage of Respondents
[Figure: Findings with a random sample: Had an Affair 14%, Felt Harassed 3%.]

Sampling Techniques
• Probability Sampling
  • Simple Random Sampling
  • Stratified Random Sampling
  • Cluster Sampling
• Non-Probability Sampling
  • Haphazard Sampling
  • Quota Sampling

Evaluating Samples
• Sampling Frame
  • the actual population of individuals from which a random sample will be drawn
  • must assess how well the sampling frame matches the overall population
• Response Rate
  • the percentage of people who actually take the survey
  • determines the amount of bias in the final data
  • a low response rate makes the results less accurate
  • do all you can to increase the response rate
• Reasons for Using Convenience Sampling
  • most research in psychology uses non-probability sampling
  • it saves time and money and is generalizable
  • the goal of the research is to study relationships, not to estimate population values

Sampling Distribution
If the sample is completely random, then the sample mean should be a good estimate of the population mean. When we take multiple samples from our population, we will get a range of means. We can build a Distribution of Sample Means, sometimes called a Sampling Distribution. The Central Limit Theorem states that the distribution of sample means approaches a normal distribution when n is large. In such a distribution of an unlimited number of sample means, the mean of the sample means will equal the population mean:

µM = µ

The Sampling Distribution is a probability distribution of all possible outcomes due simply to chance, based on the assumption that the null hypothesis is true. When your outcome becomes highly unlikely based on pure chance, we reject the Null Hypothesis. Science sets this low probability at p < 0.05, or only a 5% chance that the result is due to chance alone.

Standard Error of the Mean
The standard deviation of the distribution of sample means is called the standard error of the mean, or standard error for short. It is represented by the following formula:

σM = σ / √n

Since the standard deviation of the population is sometimes difficult to get, a good estimate of the standard error uses the standard deviation of the sample. This formula is shown below:

sM = s / √n

Type I and Type II Error
Anytime you observe a difference in behavior between groups, it may exist for two reasons: (1) there is a real difference between the groups, or (2) the results are due to error involved in sampling. This error can be described in two ways:

Type I error is when you reject the null hypothesis when you shouldn't have, because the null hypothesis is actually true: there is no difference between your groups.

Type II error is when you fail to reject the null hypothesis when you should have, because there really is a significant difference between your groups.

The probability of committing a Type I error is designated by alpha. An alpha level of 0.05 is reasonable and widely accepted by scientists. The null hypothesis can be rejected if there is less than a 0.05 probability of committing a Type I error (p < .05).
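A quick simulation can make the sampling distribution and the standard error concrete. The Python sketch below assumes a hypothetical, normally distributed population with mean 100 and standard deviation 15 and a sample size of 25; those numbers are purely illustrative. It draws many random samples, builds the distribution of sample means, and checks that the mean of the sample means approximates µ and that their standard deviation matches σM = σ / √n.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 100.0, 15.0   # hypothetical population mean and standard deviation
n = 25                    # size of each random sample
num_samples = 10_000      # number of samples drawn from the population

# Draw many random samples and record each sample mean
sample_means = rng.normal(mu, sigma, size=(num_samples, n)).mean(axis=1)

print("Mean of the sample means:", sample_means.mean())       # close to mu
print("SD of the sample means:  ", sample_means.std(ddof=0))  # close to sigma / sqrt(n)
print("Standard error formula:  ", sigma / np.sqrt(n))        # 3.0
```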
One-Tailed Hypothesis
If the scientific hypothesis predicts a direction for the results, we say it is a One-Tailed Hypothesis because the entire rejection region (alpha) falls in one specific directional tail of the sampling distribution. If the sample mean falls in this area, we can reject the null hypothesis.

Two-Tailed Hypothesis
If the scientific hypothesis does not predict a direction for the results, we say it is a Two-Tailed Hypothesis because alpha is split between both tails of the distribution. If the sample mean falls in either of these areas, we can reject the null hypothesis.

Degrees of Freedom
The term degrees of freedom refers to the number of scores within a data set that are free to vary. In any sample with a fixed mean, the sum of the deviation scores equals zero. If your sample has n = 10, the first 9 scores are free to vary, but the 10th score must take the specific value that makes the sum of the deviations equal zero. Therefore, in a single sample the degrees of freedom equal n - 1.

Which Statistic Do I Use?
• Dependent Variable Type
  • Continuous
  • Categorical
• Number of Factors (Independent Variables)
  • One
  • Two or More (Factorial Analysis)
• Number of Levels (of the Independent Variable)
  • Two
  • Three or More (Between or Repeated Measures)

t-test
• examines whether two groups are significantly different from each other
• must specify the null hypothesis and significance level (alpha)
• we calculate our t and determine where it lies on the sampling distribution
• the t-test is a ratio between the group mean difference and the variability within groups

t-test Types
• One-Sample t-test
  • for comparison of one sample to a population
• Independent t-test
  • for comparison of two independent samples
• Paired or Correlated t-test
  • for comparison of two correlated samples

Analysis of Variance (ANOVA)
• compares three or more groups
• can compare two or more Independent Variables
• must specify the null hypothesis and significance level (alpha)
• One-Way ANOVA
• Two-Way ANOVA

One-Way ANOVA
• compares at least three levels
• compares only one Independent Variable

Variance Source    Sum of Squares    Degrees of Freedom    Mean Square    F ratio
Between            SSbg              dfbg                  MSbg           F
Within             SSwg              dfwg                  MSwg

Main Effect
The Main Effect is the effect that an independent variable has on the dependent variable. In a One-Way ANOVA, there is only one Main Effect since there is only one independent variable, or Factor, in the experiment.

The null hypothesis for the Main Effect is:
H0: µ1 = µ2 = . . . = µk

The research hypothesis for the Main Effect is:
H1: at least one of the sample means comes from a different population distribution than the others

Two-Way ANOVA
• compares two or more levels
• compares at least two Independent Variables

Variance Source    Sum of Squares    Degrees of Freedom    Mean Square    F ratio
Rows               SSr               dfr                   MSr            Fr
Columns            SSc               dfc                   MSc            Fc
Interaction        SSr x c           dfr x c               MSr x c        Fr x c
Within             SSwg              dfwg                  MSwg
Total              SStotal           dftotal

Chi-Square
The Chi-Square (χ²) is used for the analysis of nominal data. Remember that nominal data are categorical data without any order of value. Two good examples of nominal data are "yes-no" and "true-false" answers on a survey. Chi-Square analyses can be either One-Way, with one independent variable, or Two-Way, with two independent variables.
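All of the tests summarized above are available in common statistics libraries. As one possible illustration, the Python sketch below uses scipy.stats with invented data to run an independent-samples t-test, a one-way ANOVA across three groups, and a chi-square test on a 2 x 2 table of yes/no counts; the decision rule from earlier applies in each case: reject H0 when p < .05.

```python
import numpy as np
from scipy import stats

# Hypothetical scores for three treatment groups (continuous dependent variable)
group_a = np.array([12, 15, 14, 10, 13, 16])
group_b = np.array([18, 20, 17, 21, 19, 22])
group_c = np.array([25, 23, 27, 24, 26, 28])

# Independent-samples t-test: one IV with two levels
t_stat, p_val = stats.ttest_ind(group_a, group_b)
print(f"t-test:     t = {t_stat:.2f}, p = {p_val:.4f}")

# One-way ANOVA: one IV with three levels
f_stat, p_val = stats.f_oneway(group_a, group_b, group_c)
print(f"ANOVA:      F = {f_stat:.2f}, p = {p_val:.4f}")

# Chi-square test: nominal (categorical) data, e.g. yes/no answers in two conditions
#                yes  no
contingency = [[30, 10],   # condition 1
               [18, 22]]   # condition 2
chi2, p_val, dof, expected = stats.chi2_contingency(contingency)
print(f"Chi-square: X2 = {chi2:.2f}, df = {dof}, p = {p_val:.4f}")
```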