
Unit 3 Notes: Statistical Inference Testing

• In Chapter 15 we learn about 2 Sampling Distributions, which are the distributions of all the sample proportions/means coming from samples having the same size n.
  o The Sampling Distribution of Proportions: N(p, √(pq/n))
  o The Sampling Distribution of Means: N(µ, σ/√n)
  o Both are approximately normal and have similar conditions to check:
    - Randomization, the 10% condition, and a large-enough sample (success/failure condition np ≥ 10 and nq ≥ 10 for proportions; less exact for means)
  o We use the normal model to answer questions about how likely it is for a sample proportion or sample mean to fall in a certain range of values.
  o Note: To use the sampling distribution to determine how unusual a sample statistic is, we need to know the true population parameters (p for proportions; µ and σ for means).
• In Chapter 16 we build confidence intervals around p̂ using SE(p̂) to make inferences about the true population proportion p.
  o The confidence interval is given by (p̂ − ME, p̂ + ME).
  o ME (margin of error) = z* · SE(p̂) = z* · √(p̂q̂/n)
  o z* is the z-score critical value that "cuts off" the middle of the standard normal curve corresponding to our confidence level (e.g. the middle 90% for a 90% confidence interval). We find this by sketching the normal curve and then using invNorm(tail probability) on our calculators (e.g. for 90% confidence each tail holds 0.05, and invNorm(0.05) ≈ −1.645, so z* ≈ 1.645).
  o Our confidence interval tells us that we are 95% (for example) confident that the true population proportion p is within that interval.
  o In this case we have a sample proportion p̂ but usually no population proportion.
  o We use the 1-PropZInt function on our calculator to do this. Note: you may need to calculate x from p̂.
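
The Chapter 16 interval can also be checked away from the calculator. The following is a minimal Python sketch of the formula (p̂ − ME, p̂ + ME) with ME = z* · √(p̂q̂/n). The function name `one_prop_z_interval`, its parameters, and the sample counts are made up for illustration, and `NormalDist().inv_cdf` stands in for the calculator's invNorm.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, conf_level=0.95):
    """Return (low, high) for a one-proportion z-interval.

    x = number of successes, n = sample size (hypothetical names).
    """
    p_hat = x / n
    q_hat = 1 - p_hat
    # z* cuts off the middle conf_level of the standard normal curve,
    # so the area to its left is 1 - (1 - conf_level) / 2.
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    me = z_star * sqrt(p_hat * q_hat / n)  # margin of error
    return p_hat - me, p_hat + me


# Made-up sample: 120 successes in 400 trials, 90% confidence.
low, high = one_prop_z_interval(120, 400, conf_level=0.90)
print(round(low, 3), round(high, 3))
```

The sketch only does the arithmetic; the Independence and Sample Size conditions still have to be checked by hand before the interval means anything.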
  o To get x for 1-PropZInt, recall that p̂ = x/n, so x = (p̂)(n).
  o Assumptions & Conditions:
    - Independence Assumption (check the Randomization Condition and the 10% Condition: the sample should be no more than 10% of the population)
    - Sample Size Assumption (check the Success/Failure Condition: we need at least 10 successes and 10 failures)
  o We also use the following formula to compute the sample size n needed to give us a confidence interval with a certain margin of error at a certain confidence level:
    n = (z*)² (p̂)(q̂) / (ME)²
• In Chapter 17 we use the One Proportion Z-Test (1-PropZTest) to decide whether a given sample proportion is consistent with a population having a known (or hypothesized) proportion.
  o In this case we have a hypothesized population proportion and a single sample proportion.
  o We start by writing a null hypothesis and an alternative hypothesis:
    - H0: p = ___
    - HA: p ≠ ___, p < ___, or p > ___
  o To use the 1-PropZTest we must check the following Assumptions & Conditions:
    - Independence Assumption (check the Randomization Condition and the 10% Condition)
    - Sample Size Assumption (check the Success/Failure Condition)
  o The 1-PropZTest gives us a P-value that we compare to an alpha level to decide whether to reject the null hypothesis. The P-value is the probability of observing a sample proportion at least as extreme as ours if the null hypothesis is true.
    - If P < alpha, reject the null hypothesis; otherwise we fail to reject the null hypothesis.
• In Chapter 18 we make inferences about the true population mean, given a sample mean, using Student's t-models.
  o Student's t-models rely on certain assumptions. Check the following conditions before making a t-interval or doing a t-test:
    - Independence Assumption, check:
      · Randomization Condition
      · 10% Condition
    - Normal Population Assumption, check:
      · Nearly Normal Condition (the data come from a distribution that is unimodal and symmetric)
  o A One-Sample T-Interval around the sample mean ȳ gives a confidence interval for the true population mean.
    - We are 95% (for example) confident that the true population mean is within that interval.
    - Use Stat -> Tests -> #8 T-Interval in most cases (that is, whenever you don't know the actual population standard deviation). #7 Z-Interval can be used in the rare cases where the population standard deviation is known.
    - Enter either raw data in L1 or enter the sample statistics.
    - Make sure you have checked the conditions and have nearly normal data.
  o A One-Sample T-Test tests whether our sample is consistent with a population having a given (hypothesized) population mean. Stat -> Tests -> #2 T-Test
    - H0: µ = µ0 (the hypothesized mean)
    - HA: µ is ≠, <, or > µ0
• In Chapter 19 we examine the meaning of the P-value. The P-value is the probability of seeing results at least as extreme as our sample by chance alone, if the null hypothesis were true.
  o We also explore how confidence intervals relate to hypothesis tests. A confidence interval corresponds to a two-tailed hypothesis test with α = 100% − confidence level, e.g. an α of .05 corresponds to 95% confidence.
  o We also define Type I and Type II error.
    - Type I: we reject H0 when H0 is actually true
    - Type II: we fail to reject H0 when H0 is actually false
• In Chapter 20 we look at situations where we have 2 samples that we want to compare (i.e. determine whether they come from the same or different populations).
  o First we consider confidence intervals around p̂₁ − p̂₂.
    - We note that the sampling distribution of p̂₁ − p̂₂ is approximately normal, N(p₁ − p₂, √(p₁q₁/n₁ + p₂q₂/n₂)), and we use the Normal model to build a confidence interval around p̂₁ − p̂₂.
    - On our calculator we do this using 2-PropZInt.
    - In this case we know p̂₁ and p̂₂ but not p₁ or p₂. We find the confidence interval for the true difference between the proportions of the two groups.
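
As a rough check on what 2-PropZInt reports, the two-proportion interval can be sketched in Python. This is only an illustration under stated assumptions: the function name `two_prop_z_interval`, its parameters, and the counts are made up, and because p₁ and p₂ are unknown the standard error is estimated from the sample proportions p̂₁ and p̂₂.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, conf_level=0.95):
    """Return (low, high) for a confidence interval around p1_hat - p2_hat.

    x1, x2 = successes in each group; n1, n2 = group sizes (hypothetical names).
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Standard error of the difference, estimated from the sample proportions.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Made-up groups: 60 of 200 successes versus 45 of 250 successes.
low, high = two_prop_z_interval(60, 200, 45, 250)
print(round(low, 3), round(high, 3))
```

If the resulting interval does not contain 0, that agrees with rejecting H0: p₁ = p₂ in a two-tailed test at the matching alpha level.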
    - Assumptions and Conditions:
      · Independence Assumptions:
        Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.
        The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.
        Independent Groups Assumption: The two groups we're comparing must be independent of each other.
      · Sample size condition:
        Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.
  o We also use the Two Proportion Z-Test (2-PropZTest) to test whether our two sample proportions could come from populations with the same proportion (i.e. whether the two proportions are really the same or really different).
    - H0: p₁ = p₂
    - HA: p₁ ≠ p₂, p₁ < p₂, or p₁ > p₂
    - This test gives us a P-value, which we compare to our target alpha level to decide whether to reject the null hypothesis.
  o Then we examine the difference between two sample means. We use the Student's t sampling model and estimate the standard error using the data.
    - The confidence interval we build is called a two-sample t-interval (for the difference in means).
      · Stat -> Tests -> 0: 2-SampTInt (note: Don't pool)
    - The corresponding hypothesis test is called a two-sample t-test.
      · Hypotheses: H0: µ₁ = µ₂; HA: µ₁ is ≠, <, or > µ₂
      · Stat -> Tests -> 4: 2-SampTTest (Don't pool)
    - Check the following Assumptions and Conditions for both groups:
      · Independence Assumption, check: Randomization Condition; 10% Condition
      · Normal Population Assumption, check: Nearly Normal Condition (the data come from a distribution that is unimodal and symmetric)
      · Independent Groups Assumption (the two groups are independent!)
• In Chapter 21 we look at paired data (for example, you have two values like before/after for each participant, or the data are otherwise paired in a natural way).
  This is an example of a blocked experimental design.
  o YOU CANNOT USE A 2-SAMPLE T-TEST WITH PAIRED DATA.
  o We examine the pairwise differences.
    - Because it is the differences we care about, we treat them as if they were the data and ignore the original two sets of data.
  o Check the following Assumptions and Conditions:
    - Paired Data Assumption: The data must be paired.
    - Independence Assumption (the differences must be independent of each other), check:
      · Randomization Condition
      · 10% Condition
    - Normal Population Assumption: We need to assume that the population of differences follows a Normal model.
      · Nearly Normal Condition: Check this with a histogram of the differences.
  o A paired t-test is just a one-sample t-test (Stat -> Tests -> #2 T-Test) for the mean of the pairwise differences, µd.
    - Hypotheses:
      · H0: µd = 0
      · HA: µd ≠, >, or < 0
    - Enter the differences (L1 − L2) in L3 of your calculator and use this as your data for the test.
    - The sample size is the number of pairs.
  o You can find a confidence interval for µd by entering the differences (L1 − L2) in L3 of your calculator and then using Stat -> Tests -> #8 T-Interval.
• In Chapter 22 we look at something a little different. The Chi-Square (χ²) model looks at counts of categorical data. There are three related tests (we'll focus on Goodness-of-Fit and the Test of Independence).
  o Assumptions and Conditions. For all χ² tests check:
    - Counted Data Condition: The data must be counts.
    - Independence Assumption (the counts in the cells should be independent of each other).
      · Check the Randomization Condition: The individuals who have been counted and whose counts are available for analysis should be a random sample from some population.
    - Sample Size Assumption (we must have enough data for the methods to work).
      · Check the Expected Cell Frequency Condition: The expected count in each cell must be at least 5.
  o The χ² Goodness-of-Fit test compares the observed distribution of a single categorical variable to an expected distribution based on a theory or model.
    - Hypotheses:
      · H0: The categorical counts are distributed according to the given model (which is…)
      · HA: The categorical counts are not distributed according to the model
    - Using your calculator:
      · Put your observed data in L1 and your expected (model) data in L2.
      · TI-84+: Stat -> Tests -> χ²GOF-Test
      · TI-83: use L3 to store (L1 − L2)²/L2; then χ² = sum(L3) (access sum through 2nd STAT -> MATH). To find the P-value: DISTR -> 7: χ²cdf(Ans, 1E99, d.f.)
      · d.f. (degrees of freedom) is (# categories − 1)
    - Note: the χ² test statistic is computed as a sum of "components"; each component is given by (observed − expected)² / expected.
  o Tests of homogeneity compare the distributions of several groups for the same categorical variable.
    - Hypotheses:
      · H0: The distribution of ______ is the same for each group (identify groups)
      · HA: The distribution of ______ is not the same for each group
    - Compute a test of homogeneity in the same way you compute a χ² Test of Independence.
  o The χ² Test of Independence examines counts from a single group for evidence of an association between two categorical variables.
    - Hypotheses:
      · H0: _______ and _______ are independent
      · HA: _______ and _______ are not independent
    - Using the calculator:
      · MATRX (which is 2nd x⁻¹) -> EDIT, highlight [A], and hit ENTER.
      · Adjust the size of matrix A (rows × columns) and fill it with the table values (leave out the totals).
      · STAT -> Tests -> C: χ²-Test, with Observed: [A] and Expected: [B], then Calculate (the calculator computes the expected counts and stores them in [B]).
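
The TI-83 goodness-of-fit steps above (store the components, sum them, count the degrees of freedom) can be mirrored in a short Python sketch. The function name `chi_square_gof` and the observed/expected counts are made up for illustration. The sketch stops at the test statistic and d.f.; the P-value is still left to χ²cdf on the calculator.

```python
def chi_square_gof(observed, expected):
    """Return (chi2, df, components) for a goodness-of-fit test.

    observed and expected are lists of counts, one entry per category.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one count per category")
    # Expected Cell Frequency Condition: every expected count at least 5.
    if any(e < 5 for e in expected):
        raise ValueError("expected count below 5 in at least one cell")
    # Each component is (observed - expected)^2 / expected.
    components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    chi2 = sum(components)
    df = len(observed) - 1  # number of categories minus 1
    return chi2, df, components


# Made-up counts: 100 observations over 5 categories, uniform model.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]
chi2, df, components = chi_square_gof(observed, expected)
print(chi2, df)
```

Looking at the individual components shows which categories contribute most to a large χ² value.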
