Statistics & Data Analysis
Course Number: B01.1305
Course Section: 31
Meeting Time: Wednesday 6-8:50 pm
Hypothesis Testing

Class Outline
• Review of midterm exam
• Hypothesis Testing
• One-sample tests
• Two-sample tests
• P-values
• Relationship with Confidence Intervals

Professor S. D. Balkin -- July 1, 2002

Review of Last Class
• Statistical Inference
• Point Estimation
• Confidence Intervals

Reminder: Statistical Inference
Problem of Inferential Statistics:
• Make inferences about one or more population parameters based on observable sample data
Forms of Inference:
• Point estimation: single best guess regarding a population parameter
• Interval estimation: specifies a reasonable range for the value of the parameter
• Hypothesis testing: isolating a particular possible value for the parameter and testing whether this value is plausible given the available data

Point Estimators
• Computing a single statistic from the sample data to estimate a population parameter
• Choosing a point estimator:
  • What is the shape of the distribution?
  • Do you suspect outliers exist?
  • Plausible choices:
    • Mean
    • Median
    • Mode
    • Trimmed Mean

Confidence Intervals
• Specification of a "probable range" for a parameter
• Used to understand how statistics may vary from sample to sample
• States explicit allowance for random sampling error (not selection biases)
• We have 95% confidence that the population parameter falls within the bounds of the interval
• Or… the interval is the result of a process that in the long run has a 95% probability of being correct

Hypothesis Testing
Chapter 8

Overview
• A research hypothesis typically states that there is a real change, a real difference, or real effect in the underlying population or process.
• The opposite, null hypothesis then states that there is no real change, difference, or effect
• The basic strategy of hypothesis testing is to try to support a research hypothesis by showing that the sample results are highly unlikely, assuming the null hypothesis, and more likely, assuming the research hypothesis
• The strategy can be implemented in equivalent ways: by creating a formal rejection region, by obtaining a p-value, or by seeing whether the null hypothesis value falls within a confidence interval
• There are risks of false positive and false negative errors
• Tests of a mean usually are based on the t-distribution
• Tests of a proportion may be done by using a normal approximation

Overview
• Very often sample data will suggest that something relevant is happening in the underlying population
  • A sample of potential customers may show that a higher proportion prefer a new brand to the existing one
  • A sampling of telephone response time by reservation clerks may show an increase in average customer waiting time
  • A sample of service times may indicate customers are receiving poorer service than the company thinks it is providing
• The question is whether the apparent effect in the sample is an indication of something happening in the underlying population, or whether the apparent effect is merely a fluke

What is Hypothesis Testing
• Method for checking whether an apparent result from a sample could possibly be due to randomness
• Checks on how strong the evidence is
• Are sample data reflecting a real effect or random fluke?
• Results of a hypothesis test indicate how good the evidence is, not how important the result is
Motivating Case Study #1
• FCC has been receiving complaints from customers ordering new telephone service
• Big telecommunications company tells the FCC that the average time a new customer has to wait for new service installation is 72 hours (excluding weekends) with a standard deviation of 24 hours
• The FCC randomly samples 100 new customers from the telecom company and asks how long each had to wait for new service installation

Testing Hypotheses
• Research Hypothesis, or Alternative Hypothesis, is what the researcher is trying to prove
  • Denoted: $H_a$
• Null Hypothesis is the denial of the research hypothesis. It is what the researcher is trying to disprove
  • Denoted: $H_0$

Hypothesis Testing Components
• Define research hypothesis direction:
  • One-sided ($<$ or $>$)
  • Two-sided ($\ne$)
• Strategy is to attempt to support the research hypothesis by contradicting the null hypothesis
  • The null hypothesis is contradicted if, when assuming it is true, the sample data are highly unlikely, and more likely given the research hypothesis
• Test Statistic: Summary of the sample data

Basic Logic
1. Assume that $H_0: \mu = 72$ is true
2. Calculate the value of the test statistic
   • Sample mean, proportion, etc.
3. If this value is highly unlikely, reject $H_0$ and support $H_a$
We can use the sampling distribution to determine what values of the test statistic are sufficiently unlikely given the null hypothesis
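The basic logic above can be sketched numerically for the telecom case study. The snippet below is only an illustration: it assumes the sample mean is approximately normal under $H_0$ (reasonable for n = 100), and the candidate sample means in the loop are hypothetical values chosen to show how the tail probability shrinks, not data from the case study.

```python
from math import sqrt
from statistics import NormalDist

# Values claimed by the telecom company in the case study
mu_0 = 72.0    # hypothesized mean wait (hours) under H0
sigma = 24.0   # claimed population standard deviation (hours)
n = 100        # FCC sample size

# Standard error of the sample mean if H0 is true
std_error = sigma / sqrt(n)

# Sampling distribution of the sample mean under H0 (normal approximation)
null_dist = NormalDist(mu=mu_0, sigma=std_error)


def upper_tail_prob(sample_mean: float) -> float:
    """P(sample mean >= observed value), assuming H0 is true."""
    return 1.0 - null_dist.cdf(sample_mean)


# Hypothetical sample means, for illustration only
for candidate in (73.0, 75.0, 78.0):
    print(f"sample mean {candidate}: upper-tail probability "
          f"{upper_tail_prob(candidate):.4f}")
```

The farther a sample mean lies above 72 hours, the smaller this probability, which is the sense in which a value becomes "highly unlikely" under the null hypothesis.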
Rejection Region
• Specification of the rejection region must recognize the possibility of error
  • Type I Error: Rejecting the null hypothesis when in fact it is true
    • In establishing a rejection region, we must specify the maximum tolerable probability of this type of error (denoted $\alpha$)
  • Type II Error: Failing to reject the null hypothesis when in fact it is false (beyond scope)
• Rejection region can be based on the sampling distribution of the sample statistic
  • Remember, we want to reject the null hypothesis if the value of the test statistic is highly unlikely assuming $H_0$ is true
  • Can use the tails of a normal distribution

Rejection Region
[Figure: distribution centered at $\mu = 72$]

Rejection Region (cont)
• To determine whether or not to reject the null hypothesis, we can compute the number of standard errors the sample statistic lies above the assumed population mean
• This is done by computing a z-statistic for the sample mean:
  $z = \dfrac{\bar{Y} - \mu_0}{\sigma / \sqrt{n}}$

Rejection Region (cont)
• For $\alpha = 0.05$, reject $H_0: \mu = 72$ if the observed value of $\bar{Y}$ is more than $1.645\,\sigma_{\bar{Y}}$ above $\mu = 72$
• For $\alpha = 0.05$, reject $H_0: \mu = 72$ if the computed z statistic is greater than 1.645
[Figure: rejection region of area $\alpha = 0.05$ in the right tail, starting at $\mu + 3.948$ with $\mu = 72$]

Example
• The FCC sample of 100 randomly selected new service customers resulted in a mean of 80 hours.
• Set up the hypothesis test
• Calculate the test statistic
• Interpret the hypothesis test

Example
• A researcher claims that the amount of time urban preschool children age 3-5 watch television has a mean of 22.6 hours and a standard deviation of 6.1 hours.
• A market research firm believes this is too low
• The television habits of a random sample of 60 urban preschool children are measured and resulted in the following
  • Sample mean: 25.2
• Should the researcher's claim be rejected at an $\alpha$ value of 0.01?
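One way to work the two examples above is sketched below, using the z-statistic for a known standard deviation and an upper-tail rejection region. The function names are hypothetical helpers introduced for this sketch. The FCC example does not state a significance level, so $\alpha = 0.05$ is assumed there, following the rejection region slide.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean: float, mu_0: float, sigma: float, n: int) -> float:
    """Number of standard errors the sample mean lies above mu_0."""
    return (sample_mean - mu_0) / (sigma / sqrt(n))


def upper_critical_value(alpha: float) -> float:
    """z value that cuts off a right-tail area of alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)


# FCC example: H0: mu = 72 vs Ha: mu > 72 (alpha = 0.05 is an assumption)
z_fcc = z_statistic(sample_mean=80.0, mu_0=72.0, sigma=24.0, n=100)
reject_fcc = z_fcc > upper_critical_value(0.05)

# Television example: H0: mu = 22.6 vs Ha: mu > 22.6 at alpha = 0.01
z_tv = z_statistic(sample_mean=25.2, mu_0=22.6, sigma=6.1, n=60)
reject_tv = z_tv > upper_critical_value(0.01)

print(f"FCC: z = {z_fcc:.3f}, reject H0: {reject_fcc}")
print(f"TV:  z = {z_tv:.3f}, reject H0: {reject_tv}")
```

Both statistics come out near 3.3, well beyond the critical values of 1.645 and 2.326, so under these assumptions the null hypothesis is rejected in each case.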
Summary for Z Test with Known $\sigma$
$H_0: \mu = \mu_0$
$H_a$: 1. $\mu > \mu_0$   2. $\mu < \mu_0$   3. $\mu \ne \mu_0$
Test Statistic: $z = \dfrac{\bar{Y} - \mu_0}{\sigma / \sqrt{n}}$
Rejection Region: 1. $z > z_{\alpha}$   2. $z < -z_{\alpha}$   3. $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$

Example
• A researcher claims that the amount of time urban preschool children age 3-5 watch television has a mean of 22.6 hours and a standard deviation of 6.1 hours.
• A market research firm believes this is incorrect, but does not know in which direction
• The television habits of a random sample of 60 urban preschool children are measured and resulted in the following
  • Sample mean: 25.2
• Should the researcher's claim be rejected at an $\alpha$ value of 0.01?

Z-values Worth Remembering
$z_{0.05} = 1.645$
$z_{0.025} = 1.96$
$z_{0.01} = 2.326$
$z_{0.005} = 2.576$

P-Value
• Probability of a test statistic value equal to or more extreme than the actual observed value
• Recall basic strategy
  • Hope to support the research hypothesis and reject the null hypothesis by showing that the data are highly unlikely assuming that the null hypothesis is true
  • As the test statistic gets farther into the rejection region, the data become more unlikely, hence the weight of evidence against the null hypothesis becomes more conclusive and the p-value becomes smaller

P-Value (cont)
• Small p-values indicate strong, conclusive evidence for rejecting the null hypothesis
• Computation is straightforward in our z-test example:
  One-tailed p-value $= P(Z > z)$
• Compute the p-value for our telecom example
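A minimal sketch of the p-value computation follows, applied to the telecom example (one-tailed) and to the two-sided version of the television example. The helper name `z_test_p_value` and its `alternative` labels are hypothetical, introduced only for this illustration. The two-sided case doubles the tail area beyond $|z|$.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_test_p_value(z: float, alternative: str) -> float:
    """p-value of a z test for the given direction of the research hypothesis."""
    if alternative == "greater":      # Ha: mu > mu_0
        return 1.0 - STD_NORMAL.cdf(z)
    if alternative == "less":         # Ha: mu < mu_0
        return STD_NORMAL.cdf(z)
    if alternative == "two-sided":    # Ha: mu != mu_0
        return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


# Telecom example: sample mean 80, H0: mu = 72, sigma = 24, n = 100
z_telecom = (80.0 - 72.0) / (24.0 / sqrt(100))
p_telecom = z_test_p_value(z_telecom, "greater")

# Television example, two-sided: sample mean 25.2, H0: mu = 22.6, sigma = 6.1, n = 60
z_tv = (25.2 - 22.6) / (6.1 / sqrt(60))
p_tv_two_sided = z_test_p_value(z_tv, "two-sided")

print(f"telecom one-tailed p-value: {p_telecom:.5f}")
print(f"television two-sided p-value: {p_tv_two_sided:.5f}")
```

Both p-values fall below 0.001, so the evidence against the null hypothesis is strong in either example, including at the 0.01 level asked for in the television example.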
P-Value (cont)
• P-value is also referred to as attained level of significance
  • Results of a test are said to be statistically significant at the specified p-value
• Statistically significant says the difference between what is observed and what is assumed correct is most likely not due to random variation
• It DOES NOT MEAN the difference is important!
• It DOES NOT tell you that the difference is meaningful from a business perspective (practical significance)
• With a large enough sample size, any difference can become statistically significant

P-Value for a z Test
The p-value is the probability, assuming that the null hypothesis is true, of obtaining a test statistic at least as extreme as the observed value.
$H_a: \mu > \mu_0$, p-value $= P(z > z_{actual})$
$H_a: \mu < \mu_0$, p-value $= P(z < z_{actual})$
$H_a: \mu \ne \mu_0$, p-value $= 2\,P(z > |z_{actual}|)$

Hypothesis Testing with the t Distribution
• Population standard deviation is rarely known
• Basic ideas of hypothesis testing are not changed, we simply switch sampling distributions
  $t = \dfrac{\bar{Y} - \mu_0}{s / \sqrt{n}} \sim t_{n-1}$ (a t distribution with $n-1$ df)

T Test for Hypotheses about $\mu$
$H_0: \mu = \mu_0$
$H_a$: 1. $\mu > \mu_0$   2. $\mu < \mu_0$   3. $\mu \ne \mu_0$
Test Statistic: $t = \dfrac{\bar{Y} - \mu_0}{s / \sqrt{n}}$
Rejection Region: 1. $t > t_{\alpha}$   2. $t < -t_{\alpha}$   3. $|t| > t_{\alpha/2}$
where $t_{\alpha}$ cuts off a right-tail area of $\alpha$ in a t distribution with $n-1$ degrees of freedom.

Example
• Airline institutes a 'snake system' waiting line at its counters to try to reduce the average waiting time
• Mean waiting time under specific conditions with the previous system was 6.1.
• A sample of 14 waiting times is taken
  • Sample mean: 5.043
  • Standard deviation: 2.266
• Test the null hypothesis of no change against an appropriate research hypothesis using $\alpha = 0.10$.
  • Calculate the rejection region
  • Calculate the t-statistic
  • Perform and interpret the hypothesis test
  • Calculate the associated p-value

Example
• Performance based benefits are a way of giving employees more of a stake in their work
• A study was conducted to find out how managers of 343 firms view the effectiveness of various kinds of employee relations programs
• Each rated the effect of employee stock ownership on product quality using a scale from -2 (large negative effect) to 2 (large positive effect).
  • Sample Mean: 0.35
  • Standard Error: 0.14
• Do managers view employee stock ownership as a worthwhile technique?
  • Create a 95% confidence interval for the population parameter
  • Perform a hypothesis test that the population mean isn't equal to zero

Example
• To help your restaurant marketing campaign target the right age levels, you want to find out if there is a statistically significant difference, on the average, between the age of your customers and the age of the general population in town, which is 43.1 years.
• A random sample of 50 customers shows an average of 33.6 years with a standard deviation of 16.2 years
• Perform a two-sided test at the 1% significance level
• What is the p-value?

t-Test Assumptions
• Hypothesis tests allow for random variation, but not for bias
• Measurements are statistically independent
• Underlying population distribution should be symmetric
  • Skewness affects p-value

Hypothesis Testing a Proportion
• We can also perform hypothesis tests for proportions / percentages by using a normal approximation to the binomial distribution
  $z = \dfrac{y - n\pi_0}{\sqrt{n\pi_0(1 - \pi_0)}}$, where $y$ is the number of successes
  $z = \dfrac{\hat{\pi} - \pi_0}{\sqrt{\pi_0(1 - \pi_0)/n}}$, where $\hat{\pi}$ is the proportion of successes

Testing a Population Proportion
$H_0: \pi = \pi_0$
$H_a$: 1. $\pi > \pi_0$   2. $\pi < \pi_0$   3. $\pi \ne \pi_0$
Test Statistic: $z = \dfrac{y - n\pi_0}{\sqrt{n\pi_0(1 - \pi_0)}}$
Rejection Region: 1. $z > z_{\alpha}$   2. $z < -z_{\alpha}$   3. $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
Note: $\pi_0$ is the null-hypothesis value of the population proportion.

Example
• A company figures out that the launch of their new product will only be successful if more than 23% of consumers try the product
• Based on a pilot study of 205 consumers, you expect 44.1% of consumers to try it
• How sure are you that the percentage of people who will try the new product is above the break-even point of 23%?

Using A Confidence Interval
• Construct a confidence interval (say at 95% confidence) in the usual way
• If $\mu_0$ is outside the interval, it is not a reasonable value for the population parameter, and you reject the null hypothesis in favor of the research hypothesis
• Why does this work?
  • Confidence interval says that the probability that the population parameter is in the random confidence interval is 0.95.
  • If the null hypothesis is true, then the probability that $\mu_0$ is in the interval is also 95%
  • When the null is true, you will make the correct decision in 95% of all cases

R Tutorial on Hypothesis Testing

Testing Two Samples
• Can test whether two samples are significantly different or not, on the average
  • Unpaired test: test whether two independent columns of numbers are different
  • Paired test: test whether two columns of numbers are different when there is a natural pairing between them

R Tutorial on Two Sample Hypothesis Testing

Next Time…
• Regression Analysis
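As a closing illustration, the proportion test from the product-launch example can be sketched as follows. This uses the normal approximation to the binomial with the standard error computed under $H_0$. The significance level $\alpha = 0.05$ and the function name are assumptions of this sketch, since the example itself does not specify them.

```python
from math import erfc, sqrt
from statistics import NormalDist


def proportion_z_statistic(p_hat: float, pi_0: float, n: int) -> float:
    """z statistic for a sample proportion, standard error computed under H0."""
    return (p_hat - pi_0) / sqrt(pi_0 * (1.0 - pi_0) / n)


# Product-launch example: H0: pi = 0.23 vs Ha: pi > 0.23
n_consumers = 205       # pilot study size
p_hat_launch = 0.441    # observed proportion who would try the product
pi_0_launch = 0.23      # break-even proportion

z_launch = proportion_z_statistic(p_hat_launch, pi_0_launch, n_consumers)

# Upper-tail p-value P(Z > z); erfc stays accurate for very small tail areas
p_launch = 0.5 * erfc(z_launch / sqrt(2.0))

alpha = 0.05  # assumed significance level, not given in the example
reject_launch = z_launch > NormalDist().inv_cdf(1.0 - alpha)

print(f"z = {z_launch:.2f}, one-tailed p-value = {p_launch:.2e}, "
      f"reject H0: {reject_launch}")
```

The z statistic is about 7.2, so under the normal approximation the pilot data give very strong evidence that the true proportion exceeds the 23% break-even point.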