
Welcome to the Unit 5 Seminar for MM305! I hope you have had a good evening so far. I have a lot of information to share with you tonight. We may not be able to go over all of it in our seminar, but I have written it (hopefully) in a straightforward enough way that you can go over it after the seminar and follow the concepts. My notes will give you the big picture; for more detail, you also need to read the book along with them. It is not too late to catch up in this class, but you need to hurry a little, as the concepts are going to get busier and you will need to spend more time to understand them. It won’t necessarily get harder, just busier. I don’t mean to scare you. I just want you to be aware of it so you can plan your time accordingly.

Need Help?
Please use all the options to get help in this class. You can use our office hours on Mondays and Wednesdays (8:00 pm to 9:00 pm ET on AIM) to get one-on-one help. You can also email me your questions, but I really prefer that you post them on the board (under the Any Questions link) so other students can also benefit from the questions and their answers. You can also use the NetTutor online tutoring service that is sponsored by Kaplan. To access the service, just click on the NetTutor icon on your "MyDesk" page. Is anyone using it on a regular basis and, if so, are you happy with the service? I know I have talked about this before, but it won’t hurt to share it again.

Excel Note:
It seems most of you are using Excel and are not experiencing too many problems. The problems will become a little more challenging, and Excel will really help. Do practice, though; it will help. You can get information about any statistical procedure by typing the name of the procedure into Excel’s Help. You will get an explanation and an example for that command or procedure (e.g., mean, standard deviation, regression).
Definition: Point Estimates
In most cases we don’t know the mean and standard deviation of a particular characteristic of interest (like height) in a large population (think of census data, for example). So we estimate those values by taking a sample from the population and calculating the mean and standard deviation of that sample. We call the sample mean and sample standard deviation the “point estimates” of the population mean and population standard deviation. (A point estimate is just a single number used to estimate some parameter of a population.)

Definition: Interval Estimates
Since the sample mean and standard deviation may not be very accurate (i.e., the sample may not represent the population well), we set an interval around the sample mean and state that this interval contains the true population mean with a certain degree of confidence. This is called a confidence interval.

Sampling from a population
Suppose we draw 20 samples from a population and calculate the mean of each. We would expect only about 1 in 20 of the sample means to fall outside the interval that is 2 standard errors above and below the population mean, μ. (In the figure, the arrow points to the sample whose mean is outside that interval.)

95% Confidence Interval
Now suppose we draw a single sample from the population and build an interval of 2 standard errors around the sample mean. In most cases, the true mean of the population, μ, will be within that interval. (The exception would be a sample whose mean falls outside the interval around the true mean.)

Example: Confidence Interval
Suppose we observe that, in a sample of 50 commuters, the average length of travel to work is 30 minutes with a population standard deviation of 2.5 minutes. What is the standard error for the sampling distribution?
Answer: 2.5 / √(50) ≈ 0.35

To create the 95% confidence interval, take 2 standard errors (2 × 0.35 = 0.7) and subtract it from and add it to the sample mean: [30 – 0.7, 30 + 0.7] = [29.3, 30.7]. We can be 95% confident that the true population mean is within that interval.

Confidence Interval
Suppose we observe that, in a sample of 100 cereal boxes, the average weight of the cereal is 26.8 ounces with a population standard deviation of 2 ounces.

Everyone: What is the standard error for the sampling distribution?
Answer: 2 / √(100) = 0.2

Everyone: Construct a 95% confidence interval for the mean.
Answer: 2 standard errors = 0.4, so subtract it from and add it to the mean: [26.8 – 0.4, 26.8 + 0.4] = [26.4, 27.2]. We can be 95% confident that the true population mean is within that interval.

Excel: Confidence Interval
Suppose we observe that, in a sample of 50 commuters, the average length of travel to work is 30 minutes with a population standard deviation of 2.5 minutes.
Click on a cell, type =CONFIDENCE(0.05,2.5,50) in the Excel input box, and click OK. The three arguments are alpha (0.05 for 95% confidence), the standard deviation, and the sample size; the sample mean is not part of this command. The result is about 0.69295, which rounds to 0.7. (Excel uses the more exact multiplier of about 1.96 standard errors rather than 2, which is why the result is slightly less than 0.7.) This value is the margin of error, so the interval for the average length of travel to work is 30 +/- 0.7 minutes: 30 – 0.7 = 29.3 and 30 + 0.7 = 30.7. We are 95% confident that the average commute time is between 29.3 and 30.7 minutes.

Everyone: Use Excel to determine a Confidence Interval
Suppose we observe that, in a sample of 100 cereal boxes, the average weight of the cereal is 26.8 ounces with a population standard deviation of 2 ounces. Use Excel to construct a 95% confidence interval for the mean.
Answer: =CONFIDENCE(0.05,2,100)
Using Excel, the above is equal to 0.391993. The confidence interval is therefore: [26.8 – 0.39, 26.8 + 0.39] = [26.41, 27.19]

Hypothesis Testing
Basically, the t test statistic and the Z test statistic are used in hypothesis testing to reject, or fail to reject, a claim. The claim is usually the null hypothesis (called H0). If we reject H0, we automatically accept the alternative hypothesis (called H1), because that is the only other option (kind of like plan B) available to us. The null and alternative hypotheses are complements of each other. For example, if the null hypothesis claims that the mean value of something is less than or equal to a certain value (the book calls this directional), then the alternative would be that the mean value is greater than that value. Or, if the null says the mean is equal to a certain value, then the alternative says the mean is NOT equal to that value.
The book calls this non-directional because it can go in either direction.

Classical Approach
Calculate a test statistic, t or Z. Formulas for calculating t and Z are in the book: the Z test statistic is given on page 318 and the t test statistic on page 328. There are 3 possible tests:
1. Right tailed test (described on pages 320 and 321)
2. Left tailed test (similar to a right tailed test)
3. Two tailed test (described on pages 318 and 319)
The flow chart on page 316 is a great help in guiding you on which method to use for any particular hypothesis situation.

Difference Between T and Z tests
The only major difference is that, to find a t value from Table t in the back of the book, you need to take TWO things to the t table. One is alpha (which you already know about and which is usually given), and the other is the DEGREES of FREEDOM. The degrees of freedom (df) is just a number that helps us get a more accurate value for our t statistic. Its value is the sample size, n, minus 1 (n – 1). It is basically another factor that comes into play to bring accuracy to the calculations based on different sample sizes. That is all there is to degrees of freedom for us!

Example: t table
The t table is located just before the Z table on the inside cover of the book. As Table t shows, if you are looking for a t value when alpha is 5% (one-sided or one tailed test) and the sample size is 74, you go to the table and look up t (0.05, 73). The t value when the degrees of freedom is 73 (sample size 74) and alpha is 0.05 is 1.666. Let me know if you are not getting this value from the t table.

T table
Just remember that the alpha at the top of each column is the shaded (blue) tail area shown in the t distribution figure with Table t, and the values in the body of the table are the t values that cut off that area. As the sample size approaches infinity, the t distribution approaches the standard normal distribution and the two curves become identical.
So, for example, Z at alpha 0.05 = 1.645, which is the same value as t (df = infinity, alpha 0.05) = 1.645.

In Practice
The t test is the more practical case, as we usually don’t know the standard deviation of the population. If you recall, when we know the standard deviation of the population we use the Z test. Now we use Student’s t test (whose formula is very similar to the Z test formula) because the standard deviation of the population is not known. The only difference is that we use the standard deviation of the sample, s, in the formula instead of sigma. The calculation and conclusion of a t test are very similar to those of a Z test. We call the value obtained from the formula the calculated t or calculated Z.

Example: t-test
A study of the process costs indicates that the average weight of the diamonds must be greater than 0.5 carat in order for the process to be operated at a profitable level. Do the six diamond-weight measurements, 0.46, 0.61, 0.52, 0.48, 0.57, 0.54, present sufficient evidence to indicate that the average weight of the diamonds produced by the process is in excess of 0.5 carat?
We use a t test because the population standard deviation is not known and the sample size is 6 (less than 30). It is a one-sided test because the question is about the value being “greater than”.
H0: population average weight of the diamonds (mu) = 0.5
H1: population average weight of the diamonds (mu) > 0.5

Example: t-test
We choose alpha to be 0.05 (rejecting the top 5% of the t values). The degrees of freedom is the sample size minus 1, so the degrees of freedom (df) for this problem is 6 – 1 = 5. The critical t value has the format t (alpha, df). So, for this problem, it is t (0.05, 5) = 2.015 (from the t table in the back of the book). That is, we will reject H0 if the calculated t (calculated using the formula) is greater than the maximum acceptable table t, which is 2.015 for this problem. In that case, we say the calculated t is too large to be accepted according to our 5% policy.
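The diamond example above can be checked with a short Python sketch (Python is not part of the seminar; Excel or a calculator works just as well). It assumes the usual one-sample t formula, t = (sample mean – 0.5) / (s / √n), which the notes refer to but do not print, and it takes the critical value 2.015 straight from the t table rather than computing it. The variable names are illustrative only.

```python
import math
import statistics

# Six diamond weights (carats) from the example.
weights = [0.46, 0.61, 0.52, 0.48, 0.57, 0.54]
mu0 = 0.5            # mean claimed under H0
t_critical = 2.015   # t(0.05, df = 5), copied from the t table quoted in the notes

n = len(weights)
sample_mean = statistics.mean(weights)
sample_sd = statistics.stdev(weights)        # sample standard deviation (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)

# Assumed one-sample t formula: (sample mean - claimed mean) / standard error.
t_calculated = (sample_mean - mu0) / standard_error

# Right tailed test: reject H0 only if the calculated t exceeds the table t.
reject_h0 = t_calculated > t_critical

print(f"mean = {sample_mean:.3f}, s = {sample_sd:.4f}, df = {n - 1}")
print(f"calculated t = {t_calculated:.3f}, critical t = {t_critical}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Running it gives a calculated t of roughly 1.3, which is below 2.015, so the sketch reports that H0 is not rejected.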
Example: t-test
So the rejection region for alpha = 5% and (6 – 1) = 5 degrees of freedom is where the calculated t (using the formula) is greater than 2.015 (look at the t distribution figure at the top of the t table in the back of the book; the red area is the rejection region). If you use the t formula for this problem, you will find the calculated t value to be about 1.32. In this case the calculated t is less than the critical t (table t); therefore, we do not reject H0. This implies that the data do not present sufficient evidence to indicate that the mean diamond weight exceeds 0.5 carat.

P-Value Approach
The calculations, the meaning of alpha and the p-value, and the conclusion process are the same in both methods, but the formulas are a little different. We will get familiar with getting a t value from the table in our seminar a little later on tonight. The steps are outlined on pages 322 and 323.

p-value
The p-value is the probability in the “tail” area beyond the calculated test statistic. In the classical approach, you either reject or fail to reject. It doesn’t give any information about whether a different value of alpha would have given the opposite conclusion. In the p-value approach, you find the smallest level of alpha at which the null hypothesis would be rejected. For example, if the p-value is 0.2 and it is a one tailed test, it indicates that, if the null hypothesis were true, the probability of getting a sample mean at least as extreme as the one observed is twenty percent. (Pretty high, and you would not want to reject the null.) If the p-value is 0.0001, a sample result this extreme would be very, very unlikely if the null hypothesis were true. (In this case, you would reject the null.)

(Figures: the p-value shown as the tail area beyond the test statistic value for a right tailed test and for a left tailed test. For a two tailed test, find the area in one tail and double it.)
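The three tail cases above can be sketched in Python for a Z test. This is only an illustration under stated assumptions: it uses the standard normal distribution from Python's standard library (which has no t distribution, so a t test would still need the table or a statistics package), and the function name z_p_value and the calculated Z of 1.5 are made up for the example.

```python
from statistics import NormalDist


def z_p_value(z, tail):
    """Return the p-value for a calculated Z; tail is 'right', 'left', or 'two'."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "right":
        return 1.0 - std_normal.cdf(z)               # area to the right of z
    if tail == "left":
        return std_normal.cdf(z)                     # area to the left of z
    if tail == "two":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))  # one tail area, doubled
    raise ValueError("tail must be 'right', 'left', or 'two'")


alpha = 0.05
z = 1.5  # hypothetical calculated Z, not taken from the seminar examples

for tail in ("right", "left", "two"):
    p = z_p_value(z, tail)
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{tail} tailed: p-value = {p:.4f} -> {decision}")
```

Comparing the p-value with alpha gives the same decision as comparing the calculated Z with the table Z, and it also shows how close the call was.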
