Lecture 3 Outline: Thurs, Sept 11
• Chapters 1.3-1.4
• Probability model for 2-group randomized experiment
• Randomization test p-value
• Probability model for random sampling

Vocabulary of Experiments
• A study is an experiment when we actually do something to people, animals or objects to observe the response.
• Experimental units are the things to which treatments are applied, e.g., people, rats, samples of materials or pieces of land.
• When the units are human beings, they are called subjects.
• A specific experimental condition applied to the units is called a treatment.
• The “control” is a treatment that serves as the baseline against which all other treatments are compared.
• Creativity study: Experimental units? Treatments?

Probability Model for 2-treatment Randomized Experiment
• Creativity Study
– The chance mechanism for randomizing units to treatment groups ensures that every subset of 24 subjects has the same chance of becoming the intrinsic group.
– For example, 23 red and 24 black cards could be shuffled and dealt, one to each subject; the subjects with black cards would be the intrinsic group.
– Tables of random numbers can also be used to assign units to groups (e.g., give each unit a random number and assign the units with the 24 highest numbers to group 1).

Additive Treatment Effect Model
• Potential outcomes: For each subject, we can imagine what the subject’s outcome would be if placed in the extrinsic group ($Y$) and what it would be if placed in the intrinsic group ($Y^*$). We only see the outcome for the group to which the subject was actually assigned.
• Additive treatment effect model: For every subject, $Y^* = Y + \delta$.
• $\delta$ is a parameter: an unknown constant that describes a key feature of the model for answering questions of interest.

Test for Treatment Effect
• Meaning of $\delta$:
– $\delta > 0$: Intrinsic questionnaire improves creativity.
– $\delta = 0$: Intrinsic questionnaire (treatment) makes no difference.
– $\delta < 0$: Intrinsic questionnaire makes creativity worse.
• Hypothesis testing: Questions of interest are translated into questions about parameters in probability models.
• Null hypothesis ($H_0$): $\delta = 0$ (group status has no effect on outcome)
• Alternative hypothesis ($H_a$): $\delta \neq 0$ (group status has an effect on outcome)

Test Statistic
• A test statistic is a numerical data summary for testing a hypothesis. We try to find a test statistic that tends to be small when the null hypothesis is true and large when the alternative hypothesis is true.
• Test statistic for 2-group randomized experiment:
– Let $\bar{Y}_1$ be the sample mean of the outcome for units assigned to group 1.
– Let $\bar{Y}_2$ be the sample mean of the outcome for units assigned to group 2.
– Test statistic: $T = |\bar{Y}_2 - \bar{Y}_1|$

Testing the Null Hypothesis
• If there is no treatment effect ($H_0$), subjects would receive the same outcome regardless of their assigned group.
• If there is a treatment effect, subjects will receive a higher (or lower) outcome if they are assigned to one group rather than the other.
• Therefore a large value of $T = |\bar{Y}_1 - \bar{Y}_2|$ argues against the null hypothesis.
• But how large is large? Even if there is no treatment effect, $T$ will not necessarily equal 0, because the random assignment can result in an uneven mix of abilities.

Randomization Test p-value
• The observed value of the test statistic can be large because
– (a) there is an effect of the treatment, or
– (b) the random assignment resulted in an uneven mix.
• A randomization test p-value is the probability associated with explanation (b): the probability, assuming no treatment effect, that random assignment alone produces a test statistic at least as large as the one observed.
• The smaller the p-value, the less believable (b) is as an explanation.

Exact Calculation of the p-value
• The p-value is the probability that $T \ge 4.14$ (the observed value of $T$) if, in fact, there is no treatment effect. The probability is based on the random assignment of units to groups.
• Important starting point: If there is no treatment effect, then the creativity score for an individual would have been the same had they been assigned to the other group.
• Exact calculation of p-value
– Calculate $T$ for every possible grouping of the 47 numbers into groups of size 23 and 24.
– The p-value is the proportion of regroupings with $T \ge 4.14$.

Exact Calculation of p-value
• If there is no treatment effect, subjects would receive the same outcome regardless of their assigned group.
• Distribution of the test statistic $T$ if there is no treatment effect:
– For every possible random assignment of units into two groups, calculate $T$ using the observed outcomes.
– Each possible random assignment, and hence its value of $T$, is equally likely.
• The p-value is the probability that, if the null hypothesis were true (no treatment effect), $T$ would be greater than or equal to the observed value $T_0$.

Example
• Suppose the creativity study had just six students. Suppose the three students assigned to the intrinsic group had scores of 12, 20 and 28. The three students assigned to the extrinsic group had scores of 10, 18 and 26.
• Calculate the p-value for testing whether there is a treatment effect.

P-value for Creativity Study
• For the actual creativity study, using a computer program, the p-value is .011.
• Conclusion: either
– (i) there is no treatment effect and we happened to get an uneven randomization, or
– (ii) there is a treatment effect.
• The probability associated with (i) is .011. So either there is a treatment effect or we obtained an unusual (about one-in-a-hundred) randomization.
• A p-value of around .01 is considered strong evidence against the null hypothesis; see pg. 47.

One-sided vs. Two-sided Tests
• For some problems, we might know that the treatment effect is $\ge 0$ or $\le 0$ and want to use a one-sided alternative hypothesis:
– (i) $H_a$: $\delta > 0$, or
– (ii) $H_a$: $\delta < 0$
• The appropriate test statistics for the one-sided alternative hypotheses are $T = \bar{Y}_2 - \bar{Y}_1$ for testing (i) and $T = \bar{Y}_1 - \bar{Y}_2$ for testing (ii), where it is assumed that group 2 is the “treatment” group and group 1 is the “control” group.
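The exact calculation described above (compute T for every possible regrouping, then take the proportion at least as large as the observed value) can be sketched in a few lines of Python for the six-student example. This is only an illustration, not the computer program used for the actual study: the function name `exact_randomization_p_value` is hypothetical, the statistic is the two-sided absolute difference in group means, and full enumeration is only practical for small studies.

```python
from itertools import combinations


def exact_randomization_p_value(group1, group2):
    """Two-sided exact randomization test (hypothetical helper).

    Returns (observed T, p-value), where T = |mean(group2) - mean(group1)|
    and the p-value is the proportion of all regroupings with T >= observed.
    """
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    total = sum(pooled)
    observed = abs(sum(group2) / n2 - sum(group1) / n1)

    n_groupings = 0
    n_at_least_as_large = 0
    # Under H0 the outcomes are fixed; only the group labels change.
    for idx in combinations(range(len(pooled)), n1):
        s1 = sum(pooled[i] for i in idx)   # sum of the relabelled group 1
        s2 = total - s1                    # the rest form group 2
        t = abs(s2 / n2 - s1 / n1)
        n_groupings += 1
        # Small tolerance guards against floating-point ties.
        if t >= observed - 1e-12:
            n_at_least_as_large += 1
    return observed, n_at_least_as_large / n_groupings


extrinsic = [10, 18, 26]   # group 1 ("control") in the slide example
intrinsic = [12, 20, 28]   # group 2 ("treatment") in the slide example
t0, p = exact_randomization_p_value(extrinsic, intrinsic)
print(t0, p)  # observed T = 2.0; 14 of the 20 groupings have T >= 2, so p = 0.7
```

For the one-sided alternatives, the same loop would apply with the absolute value dropped and the difference taken in the direction stated on the slide.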
Randomization Distribution and p-value
• Defn.: The randomization distribution of a test statistic describes its possible values over all the ways the randomization could have turned out.
• The p-value of the randomization test is the proportion of the randomization distribution that is at least as large as the observed test statistic.

Approximating the p-value
• For the creativity study, there are about $1.6 \times 10^{13}$ different groupings.
• Approximating the randomization test p-value:
– (i) Monte Carlo simulation: Randomly choose many groupings. Approximate the randomization distribution by the histogram of the test statistic for the randomly chosen groupings.
– (ii) (Chapter 2) The randomization distribution of the “t-statistic” is approximated by the “t”-distribution.

Probability Model for Random Sampling
• Consider taking random samples from two populations with respective means $\mu_1$ and $\mu_2$. Are the means different?
– $H_0$: $\mu_1 - \mu_2 = 0$
– $H_1$: $\mu_1 - \mu_2 \neq 0$
– Test statistic: $T = |\bar{Y}_1 - \bar{Y}_2|$

Probability Model for Random Sampling
• The sampling distribution of $T = |\bar{Y}_1 - \bar{Y}_2|$ is represented by a histogram of all values of the statistic from all possible samples that can be drawn from the two populations.
• The p-value for testing a hypothesis (and confidence intervals) follows from an understanding of the sampling distribution. We will discuss sampling distributions in Ch. 2.
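Returning to option (i) under "Approximating the p-value": when the number of groupings is too large to enumerate, the randomization distribution can be approximated by shuffling the group labels many times. The sketch below is a minimal illustration under stated assumptions: the function name `monte_carlo_p_value`, the number of draws and the seed are hypothetical choices, and the statistic is the two-sided absolute difference in group means. It is run on the six-student example, where full enumeration of the 20 groupings gives an exact p-value of 14/20 = 0.7, so the estimate should land close to that.

```python
import random


def monte_carlo_p_value(group1, group2, n_draws=10_000, seed=0):
    """Approximate the two-sided randomization p-value (hypothetical helper).

    Repeatedly shuffles the pooled outcomes, splits them into groups of the
    original sizes, and returns the fraction of draws with T >= observed T.
    """
    rng = random.Random(seed)          # seeded for reproducibility
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    observed = abs(sum(group2) / n2 - sum(group1) / n1)

    hits = 0
    for _ in range(n_draws):
        rng.shuffle(pooled)            # one random regrouping
        t = abs(sum(pooled[n1:]) / n2 - sum(pooled[:n1]) / n1)
        # Small tolerance guards against floating-point ties.
        if t >= observed - 1e-12:
            hits += 1
    return hits / n_draws


extrinsic = [10, 18, 26]
intrinsic = [12, 20, 28]
estimate = monte_carlo_p_value(extrinsic, intrinsic, n_draws=20_000, seed=1)
print(estimate)  # an estimate of the exact p-value; varies with seed and n_draws
```

The estimate's precision improves roughly with the square root of the number of draws, so more draws are needed when the p-value of interest is small, as with the .011 reported for the actual creativity study.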