Introduction to Hypothesis Testing Using the Normal Distribution, the T Test and SAS®

Arthur L. Carpenter
California Occidental Consultants

Key Words: normal, t test, hypothesis testing, statistics

Introduction

The basic statistical concepts of hypothesis testing using the normal distribution and the t test will be presented to the non-statistician. The workshop is designed to assist the manager and researcher who must, through the course of events, come into contact with statisticians and statistical analyses.

Within the last few years, the study and application of statistics has become a very specialized field and, as with many specialties, it has developed its own jargon and ways of doing things. Managers and many researchers are not always in a position to keep up with the application of the latest statistical techniques. Many of them may have had a statistics class or two during their college career; however, one or two courses of undergraduate statistics do not a statistician make. The manager must therefore interface with statisticians and/or the statistical results of analyses without sufficient tools to realize the maximum benefit.

This workshop will supply the manager or researcher with the verbiage and jargon necessary to communicate with statisticians, and with the tools to read and understand basic computer generated output from statistical analyses.

The Normal Probability Distribution

Probability distributions play an important role, not only in understanding the relationships within the data, but also as part of the testing of hypotheses. When taking samples from a population, some outcomes are usually more likely than others. A plot of the likelihood or probability of each of the outcomes gives a curve known as the probability distribution.

It is not unusual for samples taken in field or laboratory experiments to follow a known probability distribution. This is often due to characteristics of the population being sampled. The number of heads in a series of coin flips, or of sixes in a series of die rolls, follows the binomial distribution, while classroom grades often follow a bell shaped or normal distribution (curve). Prior knowledge of these distributions, and of the characteristics of the populations that the samples are drawn from, can be very useful to the researcher.

A large number of experiments result in values that are continuous. The population density per square mile could be any value from zero to several thousand, including non-integers. The distribution of people's heights is also not discrete. Height and density must be measured on a continuous scale and result in continuous probability distributions.

One of the most important distributions in the field of statistics is the normal distribution. Its graph, the bell-shaped normal curve, is often well known even by people who know little or nothing about statistics. This distribution, which depends on two parameters (mean and variance), has been shown to occur in countless experimental samples.

When taking large samples from a normally distributed population it is reasonable to assume that the sample will also be distributed normally. This is not the case for smaller samples. As the sample size (N) falls below about 30, the standardized sample mean more closely follows the t distribution.

[Figure: Two normal curves with equal variances and unequal means.]
[Figure: Two normal curves with equal means and unequal variances.]
[Figure: Two normal curves with unequal means and unequal variances.]

We need a way to standardize the many possible normal curves so that they can be compared with each other as well as to tables of probabilities. The distribution of a normal random variable with a mean of 0 and a variance of 1 is known as a standard normal distribution. Any normal distribution can be transformed into a standard normal by subtracting the mean and dividing by the standard deviation.
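As a rough illustration of the transformation just described, the sketch below standardizes a small set of values in two ways. The data set names TIMES and STDTIMES, the variable HOURS, and the five data values are made up for illustration, and the mean of 2000 and standard deviation of 300 are simply assumed to be known population values.

```sas
/* Hypothetical data: five failure times in hours (made-up values). */
DATA TIMES;
   INPUT HOURS @@;
   /* Data step version: subtract an ASSUMED population mean (2000) */
   /* and divide by an ASSUMED population standard deviation (300). */
   Z = (HOURS - 2000) / 300;
   CARDS;
1200 1850 2000 2150 2600
;
RUN;

/* PROC STANDARD version: rescales HOURS using the SAMPLE mean and  */
/* standard deviation, so that HOURS in STDTIMES has mean 0, std 1. */
PROC STANDARD DATA=TIMES MEAN=0 STD=1 OUT=STDTIMES;
   VAR HOURS;
RUN;

PROC PRINT DATA=STDTIMES;
RUN;
```

In the printed data set, Z holds the data step result and HOURS holds the PROC STANDARD result. The two agree only when the sample mean and standard deviation happen to equal the assumed population values, because PROC STANDARD always works from the estimates in the data it is given.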
    Z = (x - mean) / (standard deviation)

This transformation can be done using PROC STANDARD or it can be done in the data step.

The probability levels associated with most tests of hypotheses are based on the area under the normal curve. This area can be equated to probability because both the area and the probability are constrained to lie between 0 and 1.

The probability tables found in the back of most basic statistics books are based on the standard normal curve. Fortunately SAS users never need to look up values in tables, since SAS provides the tables in the form of functions.

Example 1

The average time to failure for a disk drive is 2000 hours, with a standard deviation of 300 hours. Assuming that times to failure are normally distributed, find the probability that a disk drive will fail in less than 1200 hours.

    DATA _NULL_;
       Z = (1200 - 2000) / 300;
       PR = PROBNORM(Z);
       PUT Z= PR=;
    RUN;

    (Z=-2.667 PR=0.00383)

The t distribution looks very much like the normal; however, it has heavier tails and a third parameter, degrees of freedom. As the degrees of freedom decrease, the distribution becomes wider and flatter. The degrees of freedom for a given sample is one less than the sample size (d.f. = N - 1). The degrees of freedom (DF) allow the test to account for the uncertainty introduced when the population variance is estimated with the sample variance.

The probability of a given t value is determined in much the same way as it is for a normal distribution. Tables are available in most statistics books, and SAS has a built in function which will calculate the probability directly.

The formula for the T statistic is similar to that of the Z:

    T = (samplemean - populationmean) / (STD/SQRT(n))

Example 2

A manufacturer of computer hard disks collected failure time information on 15 of its disks. If the sampled disks had an average failure time of 2000 hours and a standard deviation of 300 hours, find the probability that the true mean time to failure is less than 1200 hours. Compare the results to those obtained in Example 1.

    DATA _NULL_;
       DF = 15 - 1;
       T = (1200 - 2000) / (300/SQRT(15));
       PR = PROBT(T,DF);
       PUT T= PR=;
    RUN;

    (T=-10.33 PR=3.13E-8)

Hypothesis Testing

The testing of statistical hypotheses is very important to the researcher who needs to make decisions or inferences based on experimental results. According to Walpole and Myers (1972), "A statistical hypothesis is an assumption or statement, that may or may not be true, concerning one or more populations." Before the researcher can reach a statistical conclusion, statistical tests need to be run. Before the tests can be conducted, the statement of hypotheses must be made.

Because of the increasing statistical sophistication of those who evaluate the validity of experimental results, especially governmental regulatory boards, the experimenters must also become more sophisticated. Although this is great for statisticians, it is not always necessary.

Consider for example the case where the CAP scores (California Academic Proficiency, tests of the quality of a school) for graduating seniors in two schools differ by 150 points (with a standard deviation of 25). We KNOW that the scores of the two schools are different and we DON'T need a statistical test to tell us. Just by inspection we can say that the two schools scored differently. If, however, a statement must be made as to the probability that the schools are the same, a statistical test must be made. It is easy to say the schools are different, but how sure are we? And how different are they?

Statistical tests based on specific hypotheses allow the researcher to quantify the probability of making a mistaken conclusion. There are two ways of committing an error when dealing with a hypothesis (usually referred to as a null hypothesis). We can conclude that the null hypothesis is false when it is really true, or we can conclude that the null hypothesis is true when it is really false. These are known as Type I and Type II errors.

    TYPE I:  Reject the null hypothesis when it is true.
    TYPE II: Accept the null hypothesis when it is false.

Example 3

The process used to produce widgets has been shown to be normally distributed with a mean of 25 widgets per hour and a standard deviation of 4.33 widgets. A new process has been suggested which may be better (more widgets per hour can be produced). It is anticipated that the new process will produce 32.5 widgets per hour with the same standard deviation. What is the probability of producing 32.5 or more widgets per hour if the true mean is 25? Does this suggest that Widgets Inc. should convert to the new process? How sure are we of the recommendation?

    H0: There is no difference between processes.
    H1: The new process is better.

The alternative hypothesis (H1) may be either one sided, as in this example, or two sided (the two processes are not equal).

    DATA _NULL_;
       Z = (32.5 - 25) / 4.33;
       PR = 1 - PROBNORM(Z);
       PUT Z= PR=;
    RUN;

    (Z=1.732 PR=0.0416)

The probability level of .04 is low enough to reject the null hypothesis. The new process looks like it could be better: if the two processes were really the same, there would be only about 4 chances in a hundred of seeing a rate this high. However, we do not know any of the true statistics of the new process. We might recommend a trial study.

It is possible to make a type II error when we do not reject the null hypothesis. Failure to reject the null hypothesis is NOT the same as accepting it as true! A type II error is made when the null hypothesis is accepted as true when it is actually false. The probability of making a type II error is designated by the Greek letter beta (β), and (1 - β) is known as the power of the test.
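As a rough sketch of what a power calculation involves, the data step below reuses the numbers from Example 3. It rests on assumptions that are not part of the original example: a one sided test at an alpha level of .05, and an alternative distribution that is normal with the anticipated mean of 32.5 and the same standard deviation of 4.33. The variable names are chosen only for illustration.

```sas
DATA _NULL_;
   MU0   = 25;      /* mean under the null hypothesis (Example 3)   */
   MU1   = 32.5;    /* ASSUMED mean under the alternative           */
   SIGMA = 4.33;    /* standard deviation, assumed equal under both */
   ALPHA = 0.05;    /* ASSUMED one sided significance level         */

   /* Smallest value that leads to rejection of the null hypothesis */
   CRIT  = MU0 + PROBIT(1 - ALPHA) * SIGMA;

   /* Type II error: falling below CRIT when the true mean is MU1   */
   BETA  = PROBNORM((CRIT - MU1) / SIGMA);
   POWER = 1 - BETA;

   PUT CRIT= BETA= POWER=;
RUN;
```

Under these assumptions the power comes out at roughly 0.53, so a test of this kind would miss a real improvement of this size nearly half the time. Changing the assumed alternative mean or the alpha level changes the answer.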
Although the power should always be calculated, in practice it rarely is, because calculating it requires some assumptions about the alternative distribution.

The probability of committing a type I error is called the level of significance of the test and is usually designated by the Greek letter alpha (α). Hence the term alpha level. A customary alpha level is 5% (α = .05).

Given the wide variety of types of experimental designs, it is not surprising that there are also a great many ways to test hypotheses. Three of the more common classes of these tests are tests of location (e.g. equality of means), tests of independence, and tests of dispersion (e.g. equality of variances).

Tests of location are usually tests of equality of means and include the z test, t test, analysis of variance (ANOVA) and analysis of covariance (ANCOVA). The chi-square test (there are several tests that use the chi-square statistic) is a test of independence of two or more sets of data. Tests of dispersion, or variance, include the F test, Bartlett's test, Cochran's test and others.

The true underlying distribution is rarely known in practice. Usually samples are collected and the population statistics are estimated from the sample, e.g. sample mean and sample variance. The variability associated with the sample mean is quantified using the standard error (standard deviation of the sample mean). The uncertainty inherent in the estimated statistics (mean and variance) must be taken into consideration when calculating the tests of hypotheses.

The t test is used when sampling from normal populations and the sample size is small (i.e. <50) and/or when the population variance is unknown (otherwise we can use the Z test).

    T = (samplemean - populationmean) / (STD/SQRT(n))

Example 4

The manufacturer of glass microscope lenses wants to be sure that her processing produces no more than 25 'pits' per slide. She randomly selects 5 slides and counts the pits on each. Her sample yields a mean of 28 pits and a standard deviation of 4.33. Is there reason to suspect that the process is indeed the pits?

    H0: The mean number of pits is 25.
    H1: The mean number of pits is greater than 25.

    DATA _NULL_;
       T = (28 - 25)/(4.33/SQRT(5));
       DF = 5 - 1;
       PR = 1 - PROBT(T,DF);
       PUT T= DF= PR=;
    RUN;

    (T=1.55 DF=4 PR=0.0981)

The mean is not significantly greater than 25 at the .05 level; however, it would have been at the .1 level.

The t test can also be used to determine if the means of two samples are different. The samples do not have to have the same sample size and the underlying variances do not need to be the same. PROC TTEST is often used for these types of comparisons. It automatically produces a two tailed probability and checks the assumption of equality of variance.

Example 5

The years of service of union and non-union workers were collected and are to be compared.

    DATA UNION;
       INPUT UNION $ YEARS @@;
       CARDS;
    Y 25 Y 26 Y 30 Y 25 Y 31 Y 27 Y 24
    N 19 N 21 N 30 N 25 N 21 N 23
    ;

    PROC TTEST DATA=UNION;
       CLASS UNION;
       VAR YEARS;
       TITLE1 'EXAMPLE 5';
       TITLE2 'COMPARISON OF SENIORITY';
    RUN;

    EXAMPLE 5
    COMPARISON OF SENIORITY

    TTEST PROCEDURE

    Variable: YEARS

    UNION   N   Mean        Std Dev    Std Error   Minimum     Maximum
    N       6   23.166667   3.920034   1.600347    19.000000   30.000000
    Y       7   26.857143   2.672612   1.010152    24.000000   31.000000

    Variances   T         DF     Prob>|T|
    Unequal     -1.9501    8.6   0.0844
    Equal       -2.0110   11.0   0.0695

    For H0: Variances are equal, F'=2.15  DF=(5,6)  Prob>F'=0.3782

Paired comparisons are of use when two determinations have been made on each replicate. This technique is often used in before and after studies. PROC TTEST is not used because the samples are paired. A data step is used to create a variable of the difference, and the difference is compared to zero. The PRT option in PROC MEANS can be used to make the comparison. PRT is always a two tailed test for the hypothesis that the difference is zero.

Example 6

A detergent manufacturer would like to show that Brand X really does make whites brighter. A measure of reflected light was recorded before and after washing.

    DATA DIFF;
       SET TTEST.EX6;
       DIFF = AFTER - BEFORE;
    RUN;

    PROC MEANS DATA=DIFF MEAN N STDERR T PRT;
       VAR DIFF;
       TITLE 'EXAMPLE 6';
       TITLE2 'PAIRED T TEST';
    RUN;

    EXAMPLE 6
    PAIRED T TEST

    Analysis Variable : DIFF

    N Obs   N   Mean      Std Error   T         Prob>|T|
    5       5   0.10000   0.06324     1.58113   0.1890

One of the assumptions of most tests of hypotheses concerning equality of means is that the sampling variances are equal. Often the validity of this assumption is unknown, but it can be tested. PROC TTEST makes this comparison automatically. When the variances are unequal, adjustments are made to the t test results.

The F test, which is used by PROC TTEST, is itself subject to assumptions and becomes biased if the underlying distributions are not normally distributed. Other, more sophisticated procedures can be easily programmed in the data step; however, they are outside the scope of this workshop.

Recent research has shown that most parametric tests are fairly robust (resistant) to departures from the assumptions of normality and equality of variance. This is especially true if the sample sizes are large and equal.

ABOUT THE AUTHOR

Arthur L. Carpenter has over fourteen years of experience as a statistician and data analyst and has served as a senior consultant with California Occidental Consultants, CALOXY, since 1983. His publications list includes a number of papers and posters presented at SUGI, and he has developed and presented several courses and seminars on statistics and SAS programming.

CALOXY offers SAS contract programming and in-house SAS training nationwide. This workshop was adapted from the three day course "PC/SAS Introduction to Statistics for Managers".

Arthur L. Carpenter
California Occidental Consultants
4239 Serena Avenue
Oceanside, CA 92056-5018
(619) 724-8579

REFERENCES

Benjamin, Jack and C. Allin Cornell, Probability, Statistics, and Decision for Civil Engineers, McGraw-Hill Book Company, 1970.

Schlotzhauer, Sandra D. and Ramon C. Littell, SAS System for Elementary Statistical Analysis, SAS Institute, Inc., 1987.

Walpole, Ronald E. and Raymond H. Myers, Probability and Statistics for Engineers and Scientists, Macmillan Company, New York, 1972.

TRADEMARK INFORMATION

SAS is a registered trademark of the SAS Institute, Inc., Cary, NC, USA.