TESTING IN MINITAB

The Z test is used when the population standard deviation is known. It is also used when the standard deviation is unknown but the sample size is large. In this latter case the (adjusted) sample variance replaces the population variance in the test statistic. For this case you can either:

(a) Use DESCRIPTIVE STATISTICS or COLUMN STATISTICS to calculate the sample standard deviation. Enter this value in the box labelled SIGMA when finding the Z-test statistic and p-value in MINITAB using note 1.

(b) Use note 2 to calculate the t-test statistic, which is identical to the Z-test statistic for this case. The p-value will be based on the t(n-1) distribution, where n is the sample size. The distribution of t tends to N(0,1) as the degrees of freedom tend to infinity, so for a large sample the p-value based on t(n-1) will be approximately the p-value based on the N(0,1) distribution.

1. Z TEST IN MINITAB

The test is of H0:µ=µ0 against one of the three possible alternatives, Ha:µ<µ0, Ha:µ>µ0 or Ha:µ≠µ0.

Click on STATISTICS then BASIC STATISTICS then 1-SAMPLE Z. In the dialogue box: Enter the appropriate column holding the data in the VARIABLES box. Click in the check box for TEST MEAN and type the H0 value of µ, µ0, in the box. Click on OPTIONS and pull down the choices for ALTERNATIVE. This allows you to specify if Ha:µ<µ0 (LESS THAN), if Ha:µ≠µ0 (NOT EQUAL) or if Ha:µ>µ0 (GREATER THAN). Choose the appropriate one. Click OK. In the box labelled SIGMA type the known standard deviation. Click OK.

MINITAB will give you various calculations including the sample size N, the value of Z and the p-value P. Note that the p-value is given to 3 decimal places, so a p-value in MINITAB of 0.000 means that the actual p-value is smaller than 0.0005. Similarly a p-value in MINITAB of 1.000 means that the actual p-value is ≥ 0.9995.

2. T TEST IN MINITAB

Click on STATISTICS then BASIC STATISTICS then 1-SAMPLE T. In the dialogue box: Enter the appropriate column holding the data in the VARIABLES box.
Click in the check box for TEST MEAN and type the H0 value of µ, µ0, in the box. Click on OPTIONS and pull down the choices for ALTERNATIVE. This allows you to specify if Ha:µ<µ0 (LESS THAN), if Ha:µ≠µ0 (NOT EQUAL) or if Ha:µ>µ0 (GREATER THAN). Choose the appropriate one. Click OK. Click OK again.

MINITAB will give you various calculations including the sample size N, the value of T and the p-value P.

3. CHI-SQUARED TEST IN MINITAB

The test is of H0:σ=σ0 against one of the three possible alternatives (i) Ha:σ<σ0, (ii) Ha:σ>σ0, (iii) Ha:σ≠σ0. MINITAB does not calculate the test statistic and p-value directly.

(a) CALCULATING THE TEST STATISTIC

Click on CALC then COLUMN STATISTICS. Click on the check box for STANDARD DEVIATION and enter the appropriate variable in the INPUT VARIABLE box. In the OPTIONAL STORAGE box type A. Click OK. The sample standard deviation is then stored in the constant A.

Then use CALCULATOR to calculate (n-1)S²/σ0². In the EXPRESSION box you will need:

    ((value of sample size) - 1)*(A**2)/((value of σ0)**2)

Specify a column for the result. The result (the test statistic) will appear in the first entry of that column.

(b) CALCULATING THE p-VALUE

Click on CALC then PROBABILITY DISTRIBUTION then CHI-SQUARED. Click in the check box for CUMULATIVE PROBABILITY. Enter the value of ((sample size)-1) in the box for DEGREES OF FREEDOM. Either enter the value of the test statistic in the INPUT CONSTANT box, or enter the name of the column where it is stored in the INPUT VARIABLE box. You do not need to store the value. Click OK.

The value of the cumulative probability is printed out. I will call it P.
For case (i) the p-value is just P.
For case (ii) the p-value = 1-P.
For case (iii) the p-value = 2 × (the smaller of P and (1-P)).

GOODNESS OF FIT TESTS IN MINITAB

1. Preliminary Calculations (if needed)

If there are parameters to be fitted under H0 these must be estimated first. You may also use MINITAB to group data.

(i) Calculating the sample mean from an ungrouped frequency table (e.g. for estimating a Poisson mean)

Type the values of x in column C1 and the frequencies in column C2. Let N be the total frequency. Use CALCULATOR to calculate C1*C2/N and store this in column C3. Use COLUMN STATISTICS to calculate the sum of C3 and store it in the constant A. A now holds the sample mean. It will also be printed out in the session window. Now delete column C3. Columns C1 and C2 hold the frequency table. This may need to be adjusted (or deleted and re-entered) if some grouping is required. See Note 2.

(ii) Calculating the sample mean and standard deviation from ungrouped data (e.g. estimating the mean and standard deviation for the normal distribution)

Type the data in column C1. Use COLUMN STATISTICS to calculate the sample mean of C1 and store it in the constant A, and the sample standard deviation of C1 and store it in the constant B.

(iii) Grouping data (in column C1) to form a frequency table

If there are k categories, the first step is to code each data value x to its category number, which will then be stored in column C2. Click on MANIP, then CODE, then NUMERIC TO NUMERIC. In the box for CODE DATA FROM enter C1. In the box for INTO COLUMNS enter C2. There are pairs of boxes for ORIGINAL VALUES and NEW VALUES. In the first of each pair type the range of values of x (e.g. if the data are integers, 21:30 represents all values x such that 21≤x≤30, which is equivalent to 20<x≤30) and in the second box type the category number. Category 1 should correspond to the smallest range of values of x and category k to the largest.

Obtain the CHARACTER GRAPH HISTOGRAM of C2. The histogram appears in the session window and shows the categories and the frequency for each category (the frequency table). You may need to condense the data by amalgamating categories. Delete C1 and C2. You can then use these columns to enter the frequency table.
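
For readers who want to check the preliminary calculations outside MINITAB, the following is a minimal Python sketch of the same three steps. It is an illustration only: the function names, the example numbers and the category end points are hypothetical and do not come from the handout.

```python
import statistics
from bisect import bisect_left
from collections import Counter

def mean_from_frequency_table(values, freqs):
    """Step (i): sample mean from a frequency table, i.e. sum of x*f/N."""
    n = sum(freqs)
    return sum(x * f for x, f in zip(values, freqs)) / n

def code_to_categories(data, upper_ends):
    """Step (iii): map each x to a category number 1..k.

    upper_ends holds the upper end points of categories 1..k-1, so
    category j covers (previous end point) < x <= upper_ends[j-1],
    and category k takes every value above the last end point.
    """
    return [bisect_left(upper_ends, x) + 1 for x in data]

# Hypothetical frequency table: x in C1, frequencies in C2.
x_values = [0, 1, 2, 3]
frequencies = [10, 20, 15, 5]
sample_mean = mean_from_frequency_table(x_values, frequencies)

# Hypothetical ungrouped data for steps (ii) and (iii).
raw = [12, 25, 30, 31, 47, 20, 21]
raw_mean = statistics.mean(raw)
raw_sd = statistics.stdev(raw)  # sample (n-1) standard deviation

# Categories: x<=20, 21:30, 31:40, and everything above 40.
categories = code_to_categories(raw, upper_ends=[20, 30, 40])
frequency_table = Counter(categories)

print(sample_mean, raw_mean, raw_sd)
print(sorted(frequency_table.items()))
```

The end points are treated as inclusive upper limits, matching the 20<x≤30 convention described for the CODE dialogue.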

2. Calculating the goodness of fit test from the frequency table

(i) WHEN THERE IS NO GROUPING

(a) Entering the frequency table

Enter the values of x in column C1 (or the category number if the categories are qualitative). Enter the observed frequencies in column C2.

(b) Calculating the (fitted) probabilities when H0 is true

If the probabilities are specified, enter them in column C3. If the categories are single values of a random variable X, and H0 specifies that X has a standard distribution (e.g. binomial), MINITAB will calculate these for you as follows: Click on CALC then PROBABILITY DISTRIBUTION then select the appropriate distribution. Enter the parameter values (use the estimates if appropriate; these may be typed in, or you can enter the constant in which the estimate has been stored). Click in the box for PROBABILITY DENSITY. In the box for INPUT VARIABLE type C1. In the box for the OUTPUT VARIABLE type C3.

(c) Calculating the goodness of fit test statistic

C2 contains the observed frequencies and C3 contains the (fitted) probabilities when H0 is true. We will now put the values of (O-E)²/E in column C4. Click on CALC then CALCULATOR. Suppose the total frequency is N. In the box for EXPRESSION enter

    ((C2-N*C3)**2)/(N*C3)

Enter C4 in the box for STORE RESULT IN. Click OK.

The test statistic φ² is just the sum of the entries in C4. Click on CALC then COLUMN STATISTICS. Then: Click in the check box for SUM. In the INPUT VARIABLE box enter C4. (You may also use OPTIONAL STORAGE: enter a constant name, e.g. C.) Click OK. The value of φ² will be printed out in the session window (and stored in C if storage is used).

(d) Finding the p-value

Click on CALC then PROBABILITY DISTRIBUTION then CHI-SQUARED. Click in the check box for CUMULATIVE PROBABILITY. In the box DEGREES OF FREEDOM enter (k-p-1), where k is the number of categories and p is the number of parameters estimated.
In the INPUT CONSTANT box enter the calculated value of φ² you obtained (or type in C if the value was stored in the constant C). Click OK. The value printed out in the session window is P(χ²≤observed φ²) = (1 - p-value), so the p-value is 1 minus the printed value.

(ii) WHEN THERE IS GROUPING

(a) Entering the frequency table

Enter the UPPER END POINTS for x for the (amalgamated) categories in column C1, BUT DO NOT ENTER THE LAST VALUE. (If the distribution to be fitted is discrete, the upper end point should be the largest discrete value of x in that category.) Enter ALL the frequencies in column C2 (including the last). If there are k values in C2, there will be (k-1) values in C1.

(b) Calculating the (fitted) probabilities

In C3 we will put the cumulative probabilities based on the fitted distribution specified by H0. Click on CALC then PROBABILITY DISTRIBUTION then (NAME OF DISTRIBUTION). Then click in the check box for CUMULATIVE PROBABILITY. Enter C1 in the INPUT COLUMN. In OPTIONAL STORAGE enter C3. There will be boxes to enter the parameters. Either enter the parameters specified by H0, or the estimates for these parameters (or the constants storing the estimates). Click OK.

C3 now has (k-1) values. Type a 1 as entry k of C3 (this corresponds to the cumulative fitted probability for the upper end point of the last category). The fitted probabilities required are C3(1), then (C3(2)-C3(1)),...,(C3(k)-C3(k-1)).

The differences (C3(2)-C3(1)),...,(C3(k)-C3(k-1)) are obtained as follows: Click on STATISTICS then TIME SERIES then DIFFERENCES. Enter C3 in the box for SERIES. Enter C4 in the box for STORE DIFFERENCE IN. The default in the box for LAG is 1 (this is the value we want to use). Click OK. C4 now has a * in the first entry followed by the differences. Replace the * by clicking in the cell and typing the value of C3(1). Then C4 holds the fitted probabilities.

(c) Calculating the goodness of fit test statistic

C2 contains the observed frequencies and C4 contains the (fitted) probabilities when H0 is true. We will now put the values of (O-E)²/E in column C5. Click on CALC then CALCULATOR. Suppose the total frequency is N. In the box for EXPRESSION enter

    ((C2-N*C4)**2)/(N*C4)

Enter C5 in the box for STORE RESULT IN. Click OK.

The test statistic φ² is just the sum of the entries in C5. Click on CALC then COLUMN STATISTICS. Then: Click in the check box for SUM. In the INPUT VARIABLE box enter C5. (You may also use OPTIONAL STORAGE: enter a constant name, e.g. C.) Click OK. The value of φ² will be printed out in the session window (and stored in C if storage is used).

(d) Finding the p-value

Click on CALC then PROBABILITY DISTRIBUTION then CHI-SQUARED. Click in the check box for CUMULATIVE PROBABILITY. In the box DEGREES OF FREEDOM enter (k-p-1), where k is the number of categories and p is the number of parameters estimated. In the INPUT CONSTANT box enter the calculated value of φ² you obtained (or type in C if the value was stored in the constant C). Click OK. The value printed out in the session window is P(χ²≤observed φ²) = (1 - p-value), so the p-value is 1 minus the printed value.
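
As a cross-check on the grouped procedure above, here is a minimal Python sketch that follows the same column-by-column steps: cumulative probabilities at the upper end points (C3), their lag-1 differences (C4), the (O-E)²/E terms (C5) and the chi-squared p-value. It is a sketch under stated assumptions, not MINITAB's own method: the function names, the Poisson example, the estimated mean of 1.3 and the observed frequencies are all hypothetical, and the chi-squared cumulative probability is computed with a simple series for the incomplete gamma function that is adequate for moderate values of the statistic.

```python
import math

def chi2_cdf(x, df):
    """P(chi-squared with df degrees of freedom <= x), using the series
    for the regularized lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def poisson_cdf(x, mean):
    """P(X <= x) for a Poisson variable with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i)
               for i in range(int(x) + 1))

def grouped_goodness_of_fit(upper_ends, observed, cdf, n_estimated):
    """upper_ends has k-1 entries (the last end point is omitted);
    observed has k entries; n_estimated is p, the number of fitted
    parameters."""
    k = len(observed)
    # C3: cumulative fitted probabilities, with 1 typed in as entry k.
    cumulative = [cdf(u) for u in upper_ends] + [1.0]
    # C4: first entry is C3(1), the rest are lag-1 differences of C3.
    fitted = [cumulative[0]] + [cumulative[i] - cumulative[i - 1]
                                for i in range(1, k)]
    n = sum(observed)
    # Sum of C5: the (O-E)^2/E terms with E = N * fitted probability.
    statistic = sum((o - n * p) ** 2 / (n * p)
                    for o, p in zip(observed, fitted))
    df = k - n_estimated - 1
    p_value = 1.0 - chi2_cdf(statistic, df)
    return fitted, statistic, df, p_value

# Hypothetical example: categories x=0, x=1, x=2 and x>=3, with a
# Poisson mean of 1.3 assumed to have been estimated from the data.
observed = [10, 20, 15, 5]
fitted, statistic, df, p_value = grouped_goodness_of_fit(
    upper_ends=[0, 1, 2],
    observed=observed,
    cdf=lambda u: poisson_cdf(u, 1.3),
    n_estimated=1,
)
print(fitted, statistic, df, p_value)
```

Because the last cumulative entry is fixed at 1, the fitted probabilities always sum to 1, which is why the last upper end point is never entered in C1.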