What are the five steps used in hypothesis testing?

The first step is to state the research question. This is usually stated in words, without mathematical or statistical notation; it is simply the question you are trying to answer in your research. For example, if you were studying the average weight of California residents and wanted to determine whether it is above the national average, your research question might be “Is the average weight of California residents greater than 170 pounds?”

The second step is to specify the null and alternative hypotheses. The null hypothesis states the conventional wisdom about the population parameter you are studying, and the alternative hypothesis states what you are trying to show. Continuing the example, the null hypothesis is that the average weight of California residents is equal to 170 pounds, and the alternative hypothesis is that it is greater than 170 pounds.

The third step is to calculate the test statistic. Choose the test statistic most appropriate for the data you are analyzing: a z-score, a t-score, or one of many others. In the example, you would use the z-score if you knew the population standard deviation of the weights (required for the z-test).

The fourth step is to compute the p-value or to state the rejection region. The p-value is the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one calculated in step 3. Alternatively, but equivalently, you can determine the rejection region: the set of test-statistic values, beyond a critical value, that would lead to rejection of the null hypothesis.

The final step is to state your conclusion. If, for example, the p-value calculated in the previous step was very small (e.g., 0.0001), you would state that there is sufficient evidence to reject the null hypothesis and conclude that the average weight of California residents is greater than the national average of 170 pounds.

How do you decide on an alpha significance level? Give an example.

You select an alpha level to limit the probability of rejecting the null hypothesis when it is actually true. Typically alpha is set to 0.01 or 0.05. This means that, if the null hypothesis is true, the chance of falsely rejecting it is 1% or 5%, depending on the choice of alpha. If rejecting a true null hypothesis would be catastrophic, you would choose a smaller alpha to reduce the chance of a false rejection. Continuing with the weight example, you would likely set alpha at about 0.05, as falsely rejecting the null hypothesis is unlikely to have catastrophic consequences.

Why would you use a z test rather than a t test? Which do you think you will use more often? Why?

A t-test is more likely to be used than a z-test, because the z-test has more restrictions: it requires either a large sample or a known population standard deviation. In practice the population standard deviation is rarely known and has to be estimated from the data, and when you estimate the standard deviation it is appropriate to use a t-test.

When is it appropriate to use a one-tailed test versus a two-tailed test? Does direction of the test affect statistical significance? Explain.

You use a one-tailed test when the alternative hypothesis specifies a direction, either larger or smaller than the value in the null hypothesis.
For example, the test in our running example is one-tailed, because the alternative hypothesis states that the average weight of California residents is GREATER than the national average. If the alternative hypothesis had only stated that the weight was different (greater or less, without specifying which), we would use a two-tailed test. Direction does affect statistical significance: with a one-tailed test the null hypothesis is easier to reject in the hypothesized direction, because all of the probability alpha is concentrated in one tail of the distribution, whereas with a two-tailed test alpha is split between the two tails.

Why is it more ethical to decide on the tails of the test before collecting and analyzing the data?

Deciding on the tails beforehand prevents the analyst from simply choosing whichever test (one-tailed or two-tailed) makes it easiest to reject the null hypothesis. For example, a statistician who did not choose the tails before collecting data might collect the data and calculate a test statistic. If the null hypothesis was not rejected using a two-tailed test, the unethical statistician might then switch to a one-tailed test, hoping that it results in a rejection. This is a poor use of the scientific method, because the statistician is modifying the test to get the hoped-for rejection of the null hypothesis.
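
To make the effect of switching tails concrete, here is a minimal Python sketch of a one-sample z-test for the weight example. Only the null value of 170 pounds comes from the example above. The sample mean (174.5), the population standard deviation (25), the sample size (100), and the function name `one_sample_z_test` are made-up assumptions for illustration, not real data or a standard library routine.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="greater"):
    """Hypothetical helper: one-sample z-test with a known population sigma.

    tail is "greater", "less", or "two-sided" and must match the
    alternative hypothesis chosen BEFORE looking at the data.
    Returns (z, p_value, reject_null).
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1

    # Step 3: test statistic (distance of the sample mean from mu0,
    # measured in standard errors).
    z = (sample_mean - mu0) / (sigma / sqrt(n))

    # Step 4: p-value, i.e. probability under the null hypothesis of a
    # statistic at least as extreme as the observed one.
    if tail == "greater":
        p_value = 1.0 - std_normal.cdf(z)
    elif tail == "less":
        p_value = std_normal.cdf(z)
    elif tail == "two-sided":
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'greater', 'less', or 'two-sided'")

    # Step 5: reject the null hypothesis when the p-value is below alpha.
    return z, p_value, p_value < alpha


# Made-up sample: 100 residents, mean 174.5 lb, assumed known sigma of 25 lb.
for tail in ("greater", "two-sided"):
    z, p, reject = one_sample_z_test(174.5, 170, 25, 100, alpha=0.05, tail=tail)
    print(f"{tail:>9}: z = {z:.2f}, p = {p:.4f}, reject null = {reject}")

# Equivalent rejection-region view: the one-tailed critical value at alpha = 0.05.
critical_z = NormalDist().inv_cdf(1 - 0.05)
print(f"one-tailed critical z = {critical_z:.3f}")
```

With these made-up numbers z = 1.8, so the one-tailed p-value (about 0.036) falls below 0.05 while the two-tailed p-value (about 0.072) does not. The same data lead to opposite conclusions depending on the tails, which is why the choice has to be made before the data are analyzed.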