Key points

An estimate is an indication of the value of an unknown quantity based on observed data.
A population is the entire collection of people or things you are interested in.
A census is a measurement of all the units in the population.
A population parameter is a number that results from measuring all the units in the population.
A sampling frame is the specific list or data source from which the sample is drawn, e.g., a telephone book.
A unit of analysis is the type of object of interest, e.g., arsons, fire departments, firefighters.
A sample is a subset of the units in the population.
A statistic is a number that results from measuring all the units in the sample.
Statistics derived from samples are used to estimate population parameters.
The mean x̄ and standard deviation s of a sample are statistics. They are used as estimates of the parameters. Statistics are variables: their values change from sample to sample.
A sample mean, denoted x̄ (pronounced "x-bar"), is the average of n observations. It measures the center of the observed data values.
A sample standard deviation, denoted s, measures the typical deviation of the n observations from their mean. It measures the spread or dispersion of the observed data values.
As you increase the sample size, the distribution of the sample mean (look at its histogram) becomes more bell-shaped, regardless of the shape of the population distribution.

N = the number of cases in the sampling frame
n = the number of cases in the sample
NCn = the number of combinations (subsets) of n cases from N
f = n/N = the sampling fraction

INFERENTIAL STATISTICS

Table of Contents

SAMPLE
SAMPLING
REASONS FOR SAMPLING
    ECONOMY
    TIME FACTOR
    VERY LARGE POPULATION
    PARTLY ACCESSIBLE POPULATIONS
    THE DESTRUCTIVE NATURE OF THE OBSERVATION
    ACCURACY AND SAMPLING
BIAS AND ERROR IN SAMPLING
SAMPLING ERROR
NON SAMPLING ERROR
POPULATION PARAMETER AND SAMPLE STATISTICS
PROBABILITY OR RANDOM SAMPLING
    Simple Random Sampling
    Stratified Random Sampling
    Systematic Random Sampling
    Cluster Random Sampling
    Multistage Random Sampling
    Sequential Random Sampling
NON PROBABILITY SAMPLING
    Purposive Sampling
    Quota Sampling
    Convenience Sampling
DIFFERENCES BETWEEN RANDOM AND NON RANDOM SAMPLING
Sampling techniques: Advantages and disadvantages
How to Choose the Best Sampling Method
SAMPLING DISTRIBUTION
    Why the sampling distribution is important
    Central Limit Theorem
    Variability of a Sampling Distribution
SAMPLING DISTRIBUTION OF MEANS
    Sampling distribution in case of without replacement
    Sampling distribution of difference between means
    Sampling distribution of proportions
    Sampling distribution of differences of proportions
Objectives

SAMPLING

SAMPLE:
A sample is a group of units selected from a larger group (the population). By studying the sample, it is hoped to draw valid conclusions about the larger group. A sample is generally selected for study because the population is too large to study in its entirety.
The sample should be representative of the general population. This is often best achieved by random sampling (probability sampling). Also, before collecting the sample, it is important that the researcher carefully and completely defines the population, including a description of the members to be included.

Example
In a classroom of 30 students in which half the students are male and half are female, a representative sample might include six students: three males and three females.

SAMPLING
Sampling is a process used in statistical analysis in which a predetermined number of observations is taken from a larger population. There are two major categories of sampling:
Probability sampling
Non-probability sampling

Examples
1. Conducting a poll to predict the winner of an upcoming election.
2. Inspecting a sample of parts to determine if the entire lot meets requirements.
3. Sometimes "measuring" or "testing" something destroys it. The government requires automakers who want to sell cars in the U.S. to demonstrate that their cars can survive certain crash tests. The company cannot be expected to crash every car to see if it survives, so it crashes only a sample of cars.

REASONS FOR SAMPLING

ECONOMY:
There is an economic advantage to using a sample in research: taking a sample requires fewer resources than a census.

TIME FACTOR:
A sample may provide the needed information quickly. For example, suppose you are a doctor and a disease has broken out in a village within your area of jurisdiction. The disease is contagious, it kills within hours, and nobody knows what it is. You are required to conduct quick tests to help save the situation. If you attempt a census of those affected, they may be long dead by the time you arrive with your results. In such a case, just a few of those already infected could be used to provide the required information.
VERY LARGE POPULATION:
Many populations about which inferences must be made are quite large. For example, consider the population of high school seniors in the United States of America, a group numbering 4,000,000. The responsible government agency has to plan for how they will be absorbed into the different departments and the private sector. Employers would like specific knowledge about the students' plans in order to make compatible plans to absorb them during the coming year. But the sheer size of the population makes it physically impossible to conduct a census. In such a case, selecting a representative sample may be the only way to get the information required from high school seniors.

PARTLY ACCESSIBLE POPULATIONS:
Some populations are so difficult to get access to that only a sample can be used. Examples include people in prison, crashed aeroplanes in the deep seas, and presidents. The inaccessibility may be economic or time related. Natural disasters are another example, such as a flood that occurs once every 100 years, or the flood of Noah's days, which has never occurred again.

THE DESTRUCTIVE NATURE OF THE OBSERVATION:
Sometimes the act of observing the desired characteristic of a unit of the population destroys it for the intended use. Good examples of this occur in quality control. For example, to test whether a fuse is defective, it must be destroyed. To obtain a census of the quality of a lorry load of fuses, you would have to destroy all of them. This is contrary to the purpose served by quality-control testing. In this case, only a sample should be used to assess the quality of the fuses.

ACCURACY AND SAMPLING:
A sample may be more accurate than a census. A sloppily conducted census can provide less reliable information than a carefully obtained sample.
BIAS AND ERROR IN SAMPLING
Sampling bias is a tendency to favor the selection of units that have particular characteristics. A sample is expected to mirror the population from which it comes; however, there is no guarantee that any sample will be precisely representative of that population. Chance may dictate that a disproportionate number of untypical observations are made. In the case of testing fuses, for example, the sample may contain a higher or lower proportion of faulty fuses than the population as a whole.

SAMPLING ERROR
Sampling error is incurred when the statistical characteristics of a population are estimated from a subset, or sample, of that population. Since the sample does not include all members of the population, statistics on the sample, such as means and quantiles, generally differ from parameters on the entire population. For example, if one measures the height of a thousand individuals from a country of one million, the average height of the thousand is typically not the same as the average height of all one million people in the country. Since sampling is typically done to determine the characteristics of a whole population, the difference between the sample and population values is considered a sampling error. "Increasing the sample size can decrease the sampling error."

NON SAMPLING ERROR
A non-sampling error is a statistical error caused by human error to which a specific statistical analysis is exposed. These errors can include, but are not limited to, data entry errors, biased questions in a questionnaire, biased processing or decision making, inappropriate analysis conclusions, and false information provided by respondents.

POPULATION PARAMETER AND SAMPLE STATISTICS
A parameter is a value, usually unknown (and which therefore has to be estimated or tested), used to represent a certain population characteristic.
For example, the population mean is a parameter that is often used to indicate the average value of a quantity. Within a population, a parameter is a fixed value which does not vary. Each sample drawn from the population has its own value of any statistic that is used to estimate this parameter. For example, the mean of the data in a sample is used to give information about the overall mean in the population from which that sample was drawn.

For example, say you want to know the mean income of the subscribers to a particular magazine, which is a parameter of a population. You draw a random sample of 100 subscribers and determine that their mean income is $27,500 (a statistic). You conclude that the population mean income μ is likely to be close to $27,500 as well. This example is one of statistical inference.

The population standard deviation is denoted σ and the population variance σ².

PROBABILITY OR RANDOM SAMPLING

Simple Random Sampling
In statistics, a simple random sample is a subset of individuals (a sample) chosen from a larger set (a population). Each individual is chosen randomly and entirely by chance, such that each individual has the same probability of being chosen at any stage during the sampling process, and each subset of k individuals has the same probability of being chosen for the sample as any other subset of k individuals. This process and technique is known as simple random sampling. A simple random sample is an unbiased surveying technique.

In small populations and often in large ones, such sampling is typically done "without replacement", i.e., one deliberately avoids choosing any member of the population more than once. Although simple random sampling can be conducted with replacement instead, this is less common and would normally be described more fully as simple random sampling with replacement.

Example 1
Let's say you have a population of 1,000 people and you wish to choose a simple random sample of 50 people.
First, each person is numbered 1 through 1,000. Then, you generate a list of 50 distinct random numbers between 1 and 1,000, and the individuals assigned those numbers are the ones you include in the sample.

Example 2
Figure 1.1: An example of simple random sampling of 10 subjects, represented by the red "stickmen", selected at random from a total of 50 subjects.

Case Study: Selecting a simple random sample of students
A simple random sample of 25 students is to be selected from a school of 500 students. Using a list of all 500 students, each student is given a number (1 to 500), and these numbers are written on small pieces of paper. All 500 papers are put in a box, and the box is shaken vigorously to ensure randomization. Then, 25 papers are taken out of the box and the numbers are recorded. The students with these numbers constitute the simple random sample.

Stratified Random Sampling:
Stratification is the process of dividing members of the population into homogeneous subgroups before sampling.

Example 1
Figure 1.2: The 50 subjects in Figure 1.2 have been stratified (divided) into two subgroups: one of 30 subjects (outlined in blue) and one of 20 subjects (outlined in green). A sample of 10 subjects has been selected, but they have not been picked entirely at random. Instead, 6 have been selected at random from the 30 blue subjects and 4 have been selected at random from the 20 green subjects, to ensure that the blue and green individuals are proportionately represented in the sample of 10 selected individuals.

Example 2
A survey is conducted on household water supply in a district comprising 2,000 households, of which 400 (or 20%) are urban and 1,600 (or 80%) are rural. It is suspected that access to safe water sources is much more satisfactory in urban areas than in rural areas. A decision is made to sample 200 households altogether, but to include 100 urban households and 100 rural households, rather than the 40 urban and 160 rural households that proportionate sampling would give.
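The case study and the Figure 1.2 example can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names, the seed values, and the use of consecutive integers as unit labels are assumptions made for the sketch. The standard library call `random.sample` draws without replacement, which matches the "papers in a box" procedure.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(units), n)

def proportionate_stratified_sample(strata, n, seed=None):
    """Draw a simple random sample inside each stratum, sized in
    proportion to the stratum's share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Assumes n * len(members) / total is a whole number,
        # as it is in the examples above.
        k = round(n * len(members) / total)
        sample[name] = rng.sample(list(members), k)
    return sample

# Case study: 25 students out of 500, numbered 1 to 500.
students = simple_random_sample(range(1, 501), 25, seed=1)

# Figure 1.2: 30 "blue" and 20 "green" subjects, sample of 10.
strata = {"blue": range(1, 31), "green": range(31, 51)}
picked = proportionate_stratified_sample(strata, 10, seed=1)

print(len(students), {name: len(ids) for name, ids in picked.items()})
```

With these inputs the stratified draw takes 6 blue and 4 green subjects, as in the figure. A disproportionate design such as the 100 urban and 100 rural households would instead fix the number drawn from each stratum by hand.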
Probability base and non-probability base:
Representation of the subgroups can be proportionate or disproportionate. For example, if you wanted to sample 100 farmers from a population of farmers in which 90% are male and 10% are female, a proportionate stratified sample would select 90 males and 10 females. But you may want to know more about the women farmers than is possible in a sample of only ten subjects. So you can select a disproportionate stratified sample; for example, you could select 50 males and 50 females.

Systematic Random Sampling:
Systematic sampling is a method of selecting sample members from a larger population according to a random starting point and a fixed, periodic interval. Typically, every "nth" member is selected from the total population for inclusion in the sample.

Example 1
An example of systematic sampling: every tenth subject is selected systematically from a total of 50 subjects.

Example 2
A systematic sample is to be selected from 1,200 students from the same school. The required sample size is 100. The sampling interval is found by dividing the study population by the sample size: 1,200 ÷ 100 = 12, so the sampling interval is 12. The number of the first student to be included in the sample should be chosen randomly, for example by blindly picking one out of twelve pieces of paper, numbered 1 to 12. If number 6 is picked, then every twelfth student will be included in the sample, starting with student number 6 (students 6, 18, 30, and so on up to 1,194), until 100 students have been selected.

Cluster Random Sampling:
Cluster sampling is a sampling technique used when "natural" but relatively homogeneous groupings are evident in a statistical population. It is often used in marketing research. In this technique, the total population is divided into groups (or clusters) and a simple random sample of the groups is selected.
Then the required information is collected from a simple random sample of the elements within each selected group.

Example 1
Let's say that a researcher is studying the academic performance of high school students in the United States and wants to choose a cluster sample based on geography. First, the researcher would divide the entire population of the United States into clusters, in this case states. Then, the researcher would select either a simple random sample or a systematic random sample of those states. Let's say the researcher chose a random sample of 15 states and wanted a final sample of 5,000 students. The researcher would then select those 5,000 high school students from those 15 states through either simple or systematic random sampling.

Example 2
The most common cluster used in research is a geographical cluster. For example, a researcher wants to survey the academic performance of high school students in Spain.
1. The researcher divides the entire population (the population of Spain) into different clusters (cities).
2. The researcher then selects a number of clusters, depending on the research, through simple or systematic random sampling.
3. From the selected clusters (randomly selected cities), the researcher can either include all the high school students as subjects or select a number of subjects from each cluster through simple or systematic random sampling.
The important thing to remember about this sampling technique is to give all the clusters equal chances of being selected.

Multistage Random Sampling:
Multistage sampling is a complex form of cluster sampling. Cluster sampling is a type of sampling which involves dividing the population into groups (or clusters). Then, one or more clusters are chosen at random and everyone within the chosen cluster is sampled. In multistage sampling, rather than sampling everyone in a chosen cluster, a further sample is drawn within each chosen cluster, possibly over several stages.

Example 1
For instance, when a polling organization samples US voters, it does not take a simple random sample (SRS).
Since voter lists are compiled by counties, the pollsters might first take a sample of the counties and then sample within the selected counties. This illustrates two stages. In some instances, they might use even more stages. At each stage, they might take a stratified random sample on gender, race, income level, or any other useful variable on which they could get information before sampling.

Example 2
For example, household surveys conducted by the Australian Bureau of Statistics begin by dividing metropolitan regions into "collection districts" and selecting some of these collection districts (first stage). The selected collection districts are then divided into blocks, and blocks are chosen from within each selected collection district (second stage). Next, dwellings are listed within each selected block, and some of these dwellings are selected (third stage). This method means that a list of dwellings is needed only for the selected blocks, not for the whole region.

NON PROBABILITY SAMPLING

Sequential Random Sampling:
Sequential sampling is a non-probability sampling technique in which the researcher picks a single subject or a group of subjects in a given time interval, conducts the study, analyzes the results, then picks another group of subjects if needed, and so on.

Example 1
If a business organization wanted to determine the need for a new product, it might use sequential sampling as part of its research process. The business might distribute questionnaires to a selected group of potential customers, asking for responses to questions or scenarios that would help to measure the respondents' perceptions of the idea for the potential product.

Example 2
A manufacturing plant might pull every fourth product made with a new type of material or process off the assembly line for close evaluation.
The testing of that sample portion of the output would verify whether or not the new material or process contributed to a final product that met the required specifications.

Purposive Sampling:
A purposive, or judgmental, sample is one that is selected based on knowledge of a population and the purpose of the study. It relies on the researcher's own expertise.

Example 1
If a researcher is studying the nature of school spirit as exhibited at a school pep rally, he or she might interview people who did not appear to be caught up in the emotions of the crowd, or students who did not attend the rally at all. In this case, the researcher is using a purposive sample because those being interviewed fit a specific purpose or description.

Example 2
In a study in which a researcher wants to know what it takes to graduate summa cum laude from college, the only people who can give the researcher first-hand advice are the individuals who graduated summa cum laude. With this very specific and very limited pool of individuals who can be considered as subjects, the researcher must use judgmental sampling.

Quota Sampling
A quota sample is a survey design in which interviewers recruit respondents according to a set of guidelines that will result in an overall sample with certain proportions of people with various social characteristics.

Example 1
The requirement might be to produce a collection of interviews that is evenly divided between men and women, has certain percentages of people from different races and age categories, and so on.

Example 2
Let's say, for example, that you want to obtain a proportional quota sample of 100 people based on gender. First you would need to find out the proportion of the population that is men and the proportion that is women. If you found out that the larger population is 40% women and 60% men, you would need a sample of 40 women and 60 men for a total of 100 respondents.
You would start sampling and continue until you got those proportions, and then you would stop. So, if you have already got 40 women for the sample, but not 60 men, you would continue to sample men and discard any legitimate women respondents that came along.

Convenience Sampling:
A convenience sample is simply one in which the researcher uses any subjects that are available to participate in the research study. This could mean stopping people on a street corner as they pass by or surveying passersby in a mall. It could also mean surveying friends, students, or colleagues that the researcher has regular access to.

Example 1
Let's say that a researcher who is also a professor at a university is interested in studying drinking behaviors among college students. The professor teaches a Sociology 101 class to mostly college freshmen and decides to use his or her class as the study sample. He or she passes out surveys during class for the students to complete and hand in.

Example 2
Convenience sampling is often used when statistical data gathered from a specific group of people is desired. For example, if a company wants to figure out what flavor of pizza sells best among college students, it could poll an average local college and take the result as a rough indication of the preferences of college students in general.

DIFFERENCES BETWEEN RANDOM AND NON RANDOM SAMPLING
The differences between Probability (Random) Sampling and Non-Probability (Non-Random) Sampling are summarized below.
Probability (Random) Sampling                     Non-Probability (Non-Random) Sampling
Allows the use of statistics, tests hypotheses    Exploratory research, generates hypotheses
Can estimate population parameters                Population parameters are not of interest
Eliminates bias                                   Adequacy of the sample can't be known
Must have random selection of units               Cheaper, easier, quicker to carry out

Sampling techniques: Advantages and disadvantages

Technique: Simple Random
Description: Random sample from whole population.
Advantages: Highly representative if all subjects participate; the ideal.
Disadvantages: Not possible without complete list of population members; potentially uneconomical to achieve; can be disruptive to isolate members from a group; time-scale may be too long, data/sample could change.

Technique: Stratified Random
Description: Random sample from identifiable groups (strata), subgroups, etc.
Advantages: Can ensure that specific groups are represented, even proportionally, in the sample(s) (e.g., by gender), by selecting individuals from strata list.
Disadvantages: More complex, requires greater effort than simple random; strata must be carefully defined.

Technique: Cluster
Description: Random samples of successive clusters of subjects (e.g., by institution) until small groups are chosen as units.
Advantages: Possible to select randomly when no single list of population members exists, but local lists do; data collected on groups may avoid introduction of confounding by isolating members.
Disadvantages: Clusters in a level must be equivalent and some natural ones are not for essential characteristics (e.g., geographic: numbers equal, but unemployment rates differ).

Technique: Stage
Description: Combination of cluster (randomly selecting clusters) and random or stratified random sampling of individuals.
Advantages: Can make up probability sample by random at stages and within groups; possible to select random sample when population lists are very localized.
Disadvantages: Complex, combines limitations of cluster and stratified random sampling.

Technique: Purposive
Description: Hand-pick subjects on the basis of specific characteristics.
Advantages: Ensures balance of group sizes when multiple groups are to be selected.
Disadvantages: Samples are not easily defensible as being representative of populations due to potential subjectivity of researcher.

Technique: Quota
Description: Select individuals as they come to fill a quota by characteristics proportional to populations.
Advantages: Ensures selection of adequate numbers of subjects with appropriate characteristics.
Disadvantages: Not possible to prove that the sample is representative of designated population.

Technique: Snowball
Description: Subjects with desired traits or characteristics give names of further appropriate subjects.
Advantages: Possible to include members of groups where no lists or identifiable clusters even exist (e.g., drug abusers, criminals).
Disadvantages: No way of knowing whether the sample is representative of the population.

Technique: Volunteer, accidental, convenience
Description: Either asking for volunteers, or the consequence of not all those selected finally participating, or a set of subjects who just happen to be available.
Advantages: Inexpensive way of ensuring sufficient numbers of a study.
Disadvantages: Can be highly unrepresentative.

How to Choose the Best Sampling Method
In this section, we illustrate how to choose the best sampling method by working through a sample problem. Here is the problem:

Problem Statement
At the end of every school year, the state administers a reading test to a sample of third graders. The school system has 20,000 third graders, half boys and half girls. There are 1,000 third-grade classes, each with 20 students. The maximum budget for this research is $3,600. The only expense is the cost to proctor each test session, which amounts to $100 per session. The purpose of the study is to estimate the reading proficiency of third graders, based on sample data. School administrators want to maximize the precision of this estimate without exceeding the $3,600 budget. What sampling method should they use?

Finding the "best" sampling method is a four-step process. We work through each step below.

List goals. This study has two main goals: (1) maximize precision and (2) stay within budget.
Identify potential sampling methods.
Test methods. A key part of the analysis is to test the ability of each potential sampling method to satisfy the research goals. Specifically, we will want to know the level of precision and the cost associated with each potential method. For our test, we use the standard error to measure precision. The smaller the standard error, the greater the precision.
Choose best method. In this example, the cost of each sampling method is identical, so none of the methods has an advantage on cost. However, the methods do differ with respect to precision (as measured by standard error). Cluster sampling provides the most precision (i.e., the smallest standard error), so cluster sampling is the best method.

SAMPLING DISTRIBUTION
1) The sampling distribution is a theoretical distribution of a sample statistic.
2) There is a different sampling distribution for each sample statistic.
3) The sampling distribution of the mean is a special case of the sampling distribution.
4) The Central Limit Theorem relates the parameters of the sampling distribution of the mean to the population model and is very important in statistical thinking.

Why is the sampling distribution important?
We use the sampling distribution of a statistic to determine the probability that the value of the statistic is like other possible sample values. It helps us determine the likelihood of error in concluding there is a relationship when there is not, or in concluding that two statistics are different. The sampling distribution is derived assuming the null hypothesis is correct. The sampling distribution says: if there is no relationship between x and y, these are the statistics we would expect, and these are their associated probabilities.
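The idea of a sampling distribution as the distribution of a statistic over all possible samples can be made concrete by brute-force enumeration. The sketch below is only an illustration under stated assumptions: the five-value population, the sample size of 2, sampling with replacement, and the function name are all chosen for the example. It uses exact fractions so that the mean and variance of the sampling distribution can be compared with the population values without rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sampling_distribution_of_mean(population, n):
    """Exact distribution of the sample mean over all ordered
    samples of size n drawn with replacement."""
    counts = Counter()
    for sample in product(population, repeat=n):
        counts[Fraction(sum(sample), n)] += 1
    total = len(population) ** n
    return {mean: Fraction(c, total) for mean, c in sorted(counts.items())}

population = [1, 2, 3, 4, 5]  # illustrative values (an assumption)
n = 2
dist = sampling_distribution_of_mean(population, n)

# Mean and variance of the sampling distribution.
expected = sum(m * p for m, p in dist.items())
variance = sum(m * m * p for m, p in dist.items()) - expected ** 2

# Population mean and variance (dividing by N, not N - 1).
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

print(expected == mu)             # mean of sample means equals mu
print(variance == sigma2 / n)     # variance of sample means equals sigma^2 / n
```

Both comparisons print True for this population. Enumeration is only practical for very small N and n, since the number of samples grows as N to the power n.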
Central Limit Theorem
The central limit theorem states that the sampling distribution of the sample mean will be normal or nearly normal if the sample size is large enough.

How large is "large enough"? As a rough rule of thumb, many statisticians say that a sample size of 30 is large enough. If you know something about the shape of the sample distribution (the distribution of the observed data), you can refine that rule. The sample size is large enough if any of the following conditions apply.
The population distribution is normal.
The sample distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
The sample distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
The sample size is greater than 40, without outliers.

The exact shape of any normal curve is totally determined by its mean and standard deviation. Therefore, if we know the mean and standard deviation of a statistic, we can find the mean and standard deviation of the sampling distribution of the statistic (assuming that the statistic came from a "large" sample).

Sampling distribution
The distribution of a sample statistic is called a sampling distribution. In particular, the probability distribution of all the possible sample means is called the sampling distribution of the mean.

Suppose that we draw all possible samples of size n from a given population. Suppose further that we compute a statistic (e.g., a mean, proportion, or standard deviation) for each sample. The probability distribution of this statistic is called a sampling distribution.

Variability of a Sampling Distribution:
The variability of a sampling distribution is measured by its standard deviation or variance. It depends upon three factors:
N: the number of observations in the population.
n: the number of observations in the sample.
The way the random sample is chosen (with or without replacement).
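For the sample mean, the dependence on these three factors can be written out explicitly. The sketch below relies on two standard results: the standard error is σ/√n when sampling with replacement, and it is multiplied by the finite population correction √((N − n)/(N − 1)) when sampling without replacement. The function name and the numbers used (σ = 10, n = 100) are illustrative assumptions.

```python
import math

def standard_error_of_mean(sigma, n, N=None, replacement=True):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n: sample size
    N: population size (required when sampling without replacement)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    se = sigma / math.sqrt(n)
    if not replacement:
        if N is None or n > N:
            raise ValueError("need a population size N with n <= N")
        # Finite population correction.
        se *= math.sqrt((N - n) / (N - 1))
    return se

sigma, n = 10.0, 100  # illustrative values (assumptions)
with_repl = standard_error_of_mean(sigma, n)
for N in (100_000, 1_000):
    without = standard_error_of_mean(sigma, n, N=N, replacement=False)
    print(N, round(with_repl, 4), round(without, 4))
```

When N is much larger than n the correction factor is close to 1 and the two schemes give almost the same standard error; when the sample is a tenth of the population the without-replacement value is visibly smaller.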
The sampling distribution has roughly the same sampling error whether sampling is done with or without replacement, provided the population size is much larger than the sample size. When the sample is a substantial fraction of the population (say, 1/10), sampling without replacement gives a smaller sampling error.

What is the difference between a sampling distribution and a population distribution?

A population distribution is the distribution of the values of all the units in the population of interest, the population about which we want to draw conclusions. A sampling distribution is the distribution of a sample statistic over all possible samples of a given size drawn from that population.

SAMPLING DISTRIBUTION OF MEANS

To demonstrate the properties of a sampling distribution, let us consider a simple example. Suppose that our population consists of the N = 5 numbers 1, 2, 3, 4, 5. The mean (µ) and the standard deviation (σ) of this population are given by

$\mu = \frac{\sum X}{N} = \frac{15}{5} = 3$

$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{\frac{10}{5}} = 1.4142$

CASE I: SAMPLING DISTRIBUTION WITH REPLACEMENT (N = 5, n = 2)

Suppose that we draw all possible samples of size n = 2 with replacement and then compute the sample mean x̄ for each sample. There are N^n = 5^2 = 25 samples of size 2 which can be drawn with replacement. These samples are:

Sample  Mean    Sample  Mean    Sample  Mean    Sample  Mean    Sample  Mean
(1,1)   1       (2,1)   1.5     (3,1)   2       (4,1)   2.5     (5,1)   3
(1,2)   1.5     (2,2)   2       (3,2)   2.5     (4,2)   3       (5,2)   3.5
(1,3)   2       (2,3)   2.5     (3,3)   3       (4,3)   3.5     (5,3)   4
(1,4)   2.5     (2,4)   3       (3,4)   3.5     (4,4)   4       (5,4)   4.5
(1,5)   3       (2,5)   3.5     (3,5)   4       (4,5)   4.5     (5,5)   5

x̄      f     Pr(x̄)    x̄·Pr(x̄)    x̄²·Pr(x̄)
1      1     1/25     1/25       1/25
1.5    2     2/25     3/25       4.5/25
2      3     3/25     6/25       12/25
2.5    4     4/25     10/25      25/25
3      5     5/25     15/25      45/25
3.5    4     4/25     14/25      49/25
4      3     3/25     12/25      48/25
4.5    2     2/25     9/25       40.5/25
5      1     1/25     5/25       25/25
Total  25             75/25 = 3  250/25 = 10

$E(\bar{x}) = \sum \bar{x} \cdot \Pr(\bar{x}) = 3$

$V(\bar{x}) = E(\bar{x}^2) - [E(\bar{x})]^2 = 10 - 3^2 = 1$

To verify the functional relationships:

1. $E(\bar{x}) = \mu$: µ = 15/5 = 3, so 3 = 3.
2. $V(\bar{x}) = \frac{\sigma^2}{n}$: σ² = 2, so σ²/n = 2/2 = 1, and 1 = 1.

CASE II: SAMPLING DISTRIBUTION WITH REPLACEMENT (N = 5, n = 3)

Suppose that we draw all possible samples of size n = 3 with replacement and then compute the sample mean x̄ for each sample. There are N^n = 5^3 = 125 samples of size 3 which can be drawn with replacement. These samples are:

Sample  Mean    Sample  Mean    Sample  Mean    Sample  Mean    Sample  Mean
1,1,1   1.00    2,1,1   1.33    3,1,1   1.67    4,1,1   2.00    5,1,1   2.33
1,1,2   1.33    2,1,2   1.67    3,1,2   2.00    4,1,2   2.33    5,1,2   2.67
1,1,3   1.67    2,1,3   2.00    3,1,3   2.33    4,1,3   2.67    5,1,3   3.00
1,1,4   2.00    2,1,4   2.33    3,1,4   2.67    4,1,4   3.00    5,1,4   3.33
1,1,5   2.33    2,1,5   2.67    3,1,5   3.00    4,1,5   3.33    5,1,5   3.67
1,2,1   1.33    2,2,1   1.67    3,2,1   2.00    4,2,1   2.33    5,2,1   2.67
1,2,2   1.67    2,2,2   2.00    3,2,2   2.33    4,2,2   2.67    5,2,2   3.00
1,2,3   2.00    2,2,3   2.33    3,2,3   2.67    4,2,3   3.00    5,2,3   3.33
1,2,4   2.33    2,2,4   2.67    3,2,4   3.00    4,2,4   3.33    5,2,4   3.67
1,2,5   2.67    2,2,5   3.00    3,2,5   3.33    4,2,5   3.67    5,2,5   4.00
1,3,1   1.67    2,3,1   2.00    3,3,1   2.33    4,3,1   2.67    5,3,1   3.00
1,3,2   2.00    2,3,2   2.33    3,3,2   2.67    4,3,2   3.00    5,3,2   3.33
1,3,3   2.33    2,3,3   2.67    3,3,3   3.00    4,3,3   3.33    5,3,3   3.67
1,3,4   2.67    2,3,4   3.00    3,3,4   3.33    4,3,4   3.67    5,3,4   4.00
1,3,5   3.00    2,3,5   3.33    3,3,5   3.67    4,3,5   4.00    5,3,5   4.33
1,4,1   2.00    2,4,1   2.33    3,4,1   2.67    4,4,1   3.00    5,4,1   3.33
1,4,2   2.33    2,4,2   2.67    3,4,2   3.00    4,4,2   3.33    5,4,2   3.67
1,4,3   2.67    2,4,3   3.00    3,4,3   3.33    4,4,3   3.67    5,4,3   4.00
1,4,4   3.00    2,4,4   3.33    3,4,4   3.67    4,4,4   4.00    5,4,4   4.33
1,4,5   3.33    2,4,5   3.67    3,4,5   4.00    4,4,5   4.33    5,4,5   4.67
1,5,1   2.33    2,5,1   2.67    3,5,1   3.00    4,5,1   3.33    5,5,1   3.67
1,5,2   2.67    2,5,2   3.00    3,5,2   3.33    4,5,2   3.67    5,5,2   4.00
1,5,3   3.00    2,5,3   3.33    3,5,3   3.67    4,5,3   4.00    5,5,3   4.33
1,5,4   3.33    2,5,4   3.67    3,5,4   4.00    4,5,4   4.33    5,5,4   4.67
1,5,5   3.67    2,5,5   4.00    3,5,5   4.33    4,5,5   4.67    5,5,5   5.00

x̄      f     Pr(x̄)     x̄·Pr(x̄)      x̄²·Pr(x̄)
1.00   1     1/125     1/125        1/125
1.33   3     3/125     4/125        5.33/125
1.67   6     6/125     10/125       16.67/125
2.00   10    10/125    20/125       40/125
2.33   15    15/125    35/125       81.67/125
2.67   18    18/125    48/125       128/125
3.00   19    19/125    57/125       171/125
3.33   18    18/125    60/125       200/125
3.67   15    15/125    55/125       201.67/125
4.00   10    10/125    40/125       160/125
4.33   6     6/125     26/125       112.67/125
4.67   3     3/125     14/125       65.33/125
5.00   1     1/125     5/125        25/125
Total  125             375/125 = 3  1208.33/125 = 9.67

$E(\bar{x}) = \sum \bar{x} \cdot \Pr(\bar{x}) = 3$

$V(\bar{x}) = E(\bar{x}^2) - [E(\bar{x})]^2 = 9.67 - 3^2 = 0.67$

To verify the functional relationships:

1. $E(\bar{x}) = \mu$: µ = 3, so 3 = 3.
2. $V(\bar{x}) = \frac{\sigma^2}{n}$: σ² = 2, so σ²/n = 2/3 = 0.67, and 0.67 = 0.67.

CASE III: SAMPLING DISTRIBUTION WITHOUT REPLACEMENT (N = 5, n = 2)

Suppose now that we draw all possible samples of size 2 from our population 1, 2, 3, 4, 5 without replacement and compute the sample mean for each sample. There are 10 such samples:

Sample  Mean    Sample  Mean
(1,2)   1.5     (2,4)   3
(1,3)   2       (2,5)   3.5
(1,4)   2.5     (3,4)   3.5
(1,5)   3       (3,5)   4
(2,3)   2.5     (4,5)   4.5

x̄      f     Pr(x̄)    x̄·Pr(x̄)    x̄²·Pr(x̄)
1.5    1     1/10     1.5/10     2.25/10
2      1     1/10     2/10       4/10
2.5    2     2/10     5/10       12.5/10
3      2     2/10     6/10       18/10
3.5    2     2/10     7/10       24.5/10
4      1     1/10     4/10       16/10
4.5    1     1/10     4.5/10     20.25/10
Total  10             30/10 = 3  97.5/10 = 9.75

To verify the results:

1. $E(\bar{x}) = \mu$: E(x̄) = Σ x̄·Pr(x̄) = 3 and µ = 15/5 = 3, so 3 = 3.
2. $V(\bar{x}) = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$:

$V(\bar{x}) = E(\bar{x}^2) - [E(\bar{x})]^2 = 9.75 - 3^2 = 0.75$

$\sigma^2 = \frac{\sum (X-\mu)^2}{N} = \frac{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2}{5} = 2$

$\frac{\sigma^2}{n} \cdot \frac{N-n}{N-1} = \frac{2}{2} \cdot \frac{5-2}{5-1} = 0.75$

0.75 = 0.75, hence proved.

CASE IV: SAMPLING DISTRIBUTION WITHOUT REPLACEMENT (N = 4, n = 3)

A population consists of the four numbers 2, 4, 6, 8. All the possible samples of size n = 3 which can be drawn without replacement from this population are given in the table below.

Sample    Mean
(2,4,6)   4
(2,4,8)   4.67
(2,6,8)   5.33
(4,6,8)   6

x̄      f    Pr(x̄)   x̄·Pr(x̄)    x̄²·Pr(x̄)
4      1    1/4     4/4        16/4
4.67   1    1/4     4.67/4     21.78/4
5.33   1    1/4     5.33/4     28.44/4
6      1    1/4     6/4        36/4
Total  4            20/4 = 5   102.22/4 = 25.56

To verify the results:

1. $E(\bar{x}) = \mu$: E(x̄) = Σ x̄·Pr(x̄) = 5 and µ = 20/4 = 5, so 5 = 5.
2. $V(\bar{x}) = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$:

$V(\bar{x}) = E(\bar{x}^2) - [E(\bar{x})]^2 = 25.56 - 25 = 0.56$

$\sigma^2 = \frac{\sum (X-\mu)^2}{N} = \frac{(2-5)^2 + (4-5)^2 + (6-5)^2 + (8-5)^2}{4} = 5$

$\frac{\sigma^2}{n} \cdot \frac{N-n}{N-1} = \frac{5}{3} \cdot \frac{4-3}{4-1} = 0.56$

Hence proved.

CASE V: SAMPLING DISTRIBUTION OF DIFFERENCE BETWEEN MEANS

Suppose we have two infinite populations I and II with means µ1 and µ2 and standard deviations σ1 and σ2 respectively. Let x̄1 be the mean of a sample of size n1 from population I and x̄2 be the mean of a sample of size n2 from population II, drawn independently of the first sample. The means of samples of size n1 from population I yield a sampling distribution of x̄1 with mean µ1 and standard deviation σ1/√n1. Similarly, the means of samples of size n2 from population II yield a sampling distribution of x̄2 with mean µ2 and standard deviation σ2/√n2.

From all combinations of these samples from the two populations we can obtain a distribution of the differences of means, x̄1 - x̄2, which is called the sampling distribution of differences of the means.
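The construction just described, pairing every sample mean from population I with every sample mean from population II, can be sketched by brute force. This is a minimal illustration under stated assumptions: the two small populations, the sample sizes and the helper names are hypothetical, and sampling is with replacement so that the samples behave as if drawn from infinite populations. The sketch compares the mean and variance of the enumerated differences with µ1 - µ2 and σ1²/n1 + σ2²/n2, the standard results for independent samples.

```python
from fractions import Fraction
from itertools import product


def mean_distribution(population, n):
    """All sample means for samples of size n drawn with replacement."""
    return [Fraction(sum(s), n) for s in product(population, repeat=n)]


def moments(values):
    """Mean and variance of a list of equally likely values."""
    k = len(values)
    mean = sum(values) / k
    var = sum(v * v for v in values) / k - mean ** 2
    return mean, var


pop1, pop2 = [4, 6, 8], [1, 2, 3]  # hypothetical populations (assumption)
n1, n2 = 2, 2                      # hypothetical sample sizes (assumption)

# Every combination of a sample mean from I with a sample mean from II.
diffs = [a - b for a, b in product(mean_distribution(pop1, n1),
                                   mean_distribution(pop2, n2))]
mean_d, var_d = moments(diffs)

mu1, var1 = moments([Fraction(x) for x in pop1])
mu2, var2 = moments([Fraction(x) for x in pop2])

print("enumerated :", mean_d, var_d)
print("by formula :", mu1 - mu2, var1 / n1 + var2 / n2)
```

Both printed lines should agree exactly, because the fractions are never rounded.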
The mean and variance of this sampling distribution are denoted by $\mu_{\bar{x}_1 - \bar{x}_2}$ and $\sigma^2_{\bar{x}_1 - \bar{x}_2}$ and are given by

$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_{\bar{x}_1} - \mu_{\bar{x}_2} = \mu_1 - \mu_2$

$\sigma^2_{\bar{x}_1 - \bar{x}_2} = V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \sigma^2_{\bar{x}_1} + \sigma^2_{\bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$

Suppose that population I consists of the 3 numbers (4, 6, 8) and population II consists of the 3 numbers (1, 2, 3), and that samples are drawn with replacement. For populations I and II: N1 = 3, n1 = 2, N2 = 3, n2 = 2.

Population I        Population II
Sample   x̄1         Sample   x̄2
4,4      4          1,1      1
4,6      5          1,2      1.5
4,8      6          1,3      2
6,4      5          2,1      1.5
6,6      6          2,2      2
6,8      7          2,3      2.5
8,4      6          3,1      2
8,6      7          3,2      2.5
8,8      8          3,3      3

Difference table (x̄1 - x̄2)

x̄1 \ x̄2   1     1.5   2     1.5   2     2.5   2     2.5   3
4         3     2.5   2     2.5   2     1.5   2     1.5   1
5         4     3.5   3     3.5   3     2.5   3     2.5   2
6         5     4.5   4     4.5   4     3.5   4     3.5   3
5         4     3.5   3     3.5   3     2.5   3     2.5   2
6         5     4.5   4     4.5   4     3.5   4     3.5   3
7         6     5.5   5     5.5   5     4.5   5     4.5   4
6         5     4.5   4     4.5   4     3.5   4     3.5   3
7         6     5.5   5     5.5   5     4.5   5     4.5   4
8         7     6.5   6     6.5   6     5.5   6     5.5   5

d = x̄1 - x̄2   f     Pr(d)    d·Pr(d)      d²·Pr(d)
1             1     1/81     1/81         1/81
1.5           2     2/81     3/81         4.5/81
2             5     5/81     10/81        20/81
2.5           6     6/81     15/81        37.5/81
3             10    10/81    30/81        90/81
3.5           10    10/81    35/81        122.5/81
4             13    13/81    52/81        208/81
4.5           10    10/81    45/81        202.5/81
5             10    10/81    50/81        250/81
5.5           6     6/81     33/81        181.5/81
6             5     5/81     30/81        180/81
6.5           2     2/81     13/81        84.5/81
7             1     1/81     7/81         49/81
Total         81             324/81 = 4   1431/81 = 17.67

E(x̄1 - x̄2) = E(d) = 4 and E(d²) = 17.67.

To verify the results:

1. $E(d) = \mu_1 - \mu_2$: E(d) = 4 and µ1 - µ2 = 6 - 2 = 4, so 4 = 4. Hence proved.
2. $V(d) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$:

$\sigma_1^2 = \frac{(4-6)^2 + (6-6)^2 + (8-6)^2}{3} = \frac{8}{3}$

$\sigma_2^2 = \frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3} = \frac{2}{3}$

$\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} = \frac{8}{3} \cdot \frac{1}{2} + \frac{2}{3} \cdot \frac{1}{2} = \frac{5}{3} = 1.67$

$V(d) = E(d^2) - [E(d)]^2 = 17.67 - 4^2 = 1.67$

Hence proved.

SAMPLE PROPORTION

A proportion is the fraction of the total that possesses a certain attribute of interest. Let us take an example to understand the concept of proportion. Suppose a student guesses at the answer on every question in a 300-question exam.
If he gets 60 questions correct, then his proportion of correct guesses is 60/300 = 0.20. If he gets 75 questions correct, then his proportion of correct guesses is 75/300 = 0.25. The proportion of correct guesses is simply the number of correct guesses divided by the total number of questions.

Now, let X denote the number of successes out of a sample of n observations. If each observation is a success with probability p independently of the other observations, then X is a binomial random variable with parameters n and p. Furthermore, the proportion of successes in the sample is also a random variable and is computed as

$\hat{p} = \frac{X}{n}$

Sampling distribution of proportions:

Consider an experiment that results in a success on each trial with probability p or a failure with probability q = 1 - p. To obtain a sample of size n we perform n trials of the experiment, so we are sampling from an infinite population. For example, the population may be all possible tosses of a fair coin, in which the probability of getting a head (success) is p = 1/2. The number of successes X then has mean $\mu = np$ and standard deviation $\sigma = \sqrt{npq}$, and the sample proportion $\hat{p}$ has mean $p$ and standard deviation $\sqrt{pq/n}$.

Question 01: A population consists of five members. The marital status of each member is given below, where M and S stand for married and single respectively.

Member          1   2   3   4   5
Marital status  M   S   M   S   S

a) Determine the proportion of married members in the population.
b) Select all possible samples of two members from this population (i) with replacement, (ii) without replacement, and compute the proportion of married members in each sample.
Solution:

a) Since there are 2 married members in the population of N = 5, p = 2/5 = 0.4 or 40%.

b) (i) There are N^n = 5^2 = 25 possible samples of size n = 2 which can be drawn with replacement from the population. These samples are given below.

Sample  p̂      Sample  p̂      Sample  p̂      Sample  p̂      Sample  p̂
(1,1)   1      (2,1)   0.5    (3,1)   1      (4,1)   0.5    (5,1)   0.5
(1,2)   0.5    (2,2)   0      (3,2)   0.5    (4,2)   0      (5,2)   0
(1,3)   1      (2,3)   0.5    (3,3)   1      (4,3)   0.5    (5,3)   0.5
(1,4)   0.5    (2,4)   0      (3,4)   0.5    (4,4)   0      (5,4)   0
(1,5)   0.5    (2,5)   0      (3,5)   0.5    (4,5)   0      (5,5)   0

p̂      f     Pr(p̂)    p̂·Pr(p̂)    p̂²     p̂²·Pr(p̂)
0      9     9/25     0          0      0
0.5    12    12/25    6/25       0.25   3/25
1      4     4/25     4/25       1      4/25
Total  25             E(p̂) = 0.4        E(p̂²) = 0.28

To verify the results:

1) $E(\hat{p}) = p$: 0.4 = 0.4.
2) $V(\hat{p}) = \frac{pq}{n}$:

$V(\hat{p}) = E(\hat{p}^2) - [E(\hat{p})]^2 = 0.28 - (0.4)^2 = 0.12$

$\frac{pq}{n} = \frac{(0.4)(1 - 0.4)}{2} = 0.12$

0.12 = 0.12, hence proved.

(ii) Without replacement, there are 10 possible samples of size n = 2:

Sample  p̂      Sample  p̂
(1,2)   0.5    (2,4)   0
(1,3)   1      (2,5)   0
(1,4)   0.5    (3,4)   0.5
(1,5)   0.5    (3,5)   0.5
(2,3)   0.5    (4,5)   0

p̂      f     Pr(p̂)    p̂·Pr(p̂)    p̂²     p̂²·Pr(p̂)
0      3     3/10     0          0      0
0.5    6     6/10     3/10       0.25   1.5/10
1      1     1/10     1/10       1      1/10
Total  10             E(p̂) = 0.4        E(p̂²) = 0.25

To verify the results:

1) $E(\hat{p}) = \mu_{\hat{p}} = p$: 0.4 = 0.4.
2) $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \sqrt{\frac{N-n}{N-1}}$:

$\sigma_{\hat{p}} = \sqrt{E(\hat{p}^2) - [E(\hat{p})]^2} = \sqrt{0.25 - 0.16} = \sqrt{0.09} = 0.3$

$\sqrt{\frac{pq}{n}} \sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{(0.4)(0.6)}{2}} \sqrt{\frac{5-2}{5-1}} = \sqrt{0.12 \times 0.75} = \sqrt{0.09} = 0.3$

Hence proved.

SAMPLING DISTRIBUTION OF DIFFERENCES OF PROPORTIONS

Consider independent samples of size n1 and n2 drawn at random from two binomial populations with parameters p1, q1 and p2, q2 respectively. We denote the proportion of successes in each sample by P1 and P2. From all combinations of these samples from the two populations we can obtain the distribution of the differences P1 - P2, which is called the sampling distribution of differences of proportions. The mean and standard deviation are given below.
Mean: $\mu_{P_1 - P_2} = p_1 - p_2$

Standard deviation: $\sigma_{P_1 - P_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$

Question 1: Let P1 denote the proportion of odd numbers in a random sample of size n1 = 2 drawn with replacement from a finite population of size N1 = 3: (3, 6, 9). Similarly, let P2 denote the proportion of odd numbers in a random sample of size n2 = 3 drawn with replacement from a finite population of size N2 = 2: (6, 7). Form the sampling distribution of P1 - P2, find its mean and variance, and verify the results.

Solution: In population 1, N1 = 3 and n1 = 2, so there are N^n = 3^2 = 9 possible samples which can be drawn with replacement from this population. These samples are:

Sample  P1     Sample  P1     Sample  P1
3,3     1      6,3     1/2    9,3     1
3,6     1/2    6,6     0      9,6     1/2
3,9     1      6,9     1/2    9,9     1

In population 2, N2 = 2 and n2 = 3, so there are N^n = 2^3 = 8 possible samples which can be drawn with replacement from this population. These samples are:

Sample  P2     Sample  P2
6,6,6   0      7,6,6   1/3
6,6,7   1/3    7,6,7   2/3
6,7,6   1/3    7,7,6   2/3
6,7,7   2/3    7,7,7   1

Difference table (P1 - P2). The row for P1 = 1/2 occurs four times and the row for P1 = 1 occurs four times, giving 9 x 8 = 72 differences in all.

P1 \ P2        0      1/3    1/3    1/3    2/3    2/3    2/3    1
0   (x1)       0      -1/3   -1/3   -1/3   -2/3   -2/3   -2/3   -1
1/2 (x4)       1/2    1/6    1/6    1/6    -1/6   -1/6   -1/6   -1/2
1   (x4)       1      2/3    2/3    2/3    1/3    1/3    1/3    0

d = P1 - P2   f     f(d)     d·f(d)     d²      d²·f(d)
-1            1     1/72     -1/72      1       1/72
-2/3          3     3/72     -2/72      4/9     4/216
-1/2          4     4/72     -2/72      1/4     1/72
-1/3          3     3/72     -1/72      1/9     1/216
-1/6          12    12/72    -2/72      1/36    1/216
0             5     5/72     0          0       0
1/6           12    12/72    2/72       1/36    1/216
1/3           12    12/72    4/72       1/9     4/216
1/2           4     4/72     2/72       1/4     1/72
2/3           12    12/72    8/72       4/9     16/216
1             4     4/72     4/72       1       4/72
Total         72             12/72              48/216

Mean: $\mu_{P_1 - P_2} = \sum (P_1 - P_2) f(P_1 - P_2) = \frac{12}{72} = \frac{1}{6}$

Variance: $\sigma^2_{P_1 - P_2} = \sum (P_1 - P_2)^2 f(P_1 - P_2) - \mu^2_{P_1 - P_2} = \frac{48}{216} - \frac{1}{36} = \frac{7}{36}$

The proportions of odd numbers in populations 1 and 2 are p1 = 2/3 and p2 = 1/2 respectively.

To verify the results:

1) $\mu_{P_1 - P_2} = p_1 - p_2 = \frac{2}{3} - \frac{1}{2} = \frac{1}{6}$
2) $\sigma^2_{P_1 - P_2} = \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2} = \frac{(2/3)(1/3)}{2} + \frac{(1/2)(1/2)}{3} = \frac{1}{9} + \frac{1}{12} = \frac{7}{36}$

which agrees with the results obtained above.

ASSESSMENT QUESTIONS

Question 1: A population consists of the 4 numbers 3, 7, 11, 15. Considering all possible samples of size n = 2 which can be drawn from this population, find i) the population mean, ii) the population standard deviation, iii) the mean of the sampling distribution of means, iv) the standard deviation of the sampling distribution of means. Verify iii) and iv) directly from i) and ii) by one of the suitable formulae. Then compute $\mu_{\bar{x}}$, $\sigma^2_{\bar{x}}$ and $\sigma_{\bar{x}}$ directly, without forming the frequency/sampling distribution of means, for sampling without replacement, and thus verify the results.

Question 2: Random samples of size 2 are selected from the finite population consisting of the numbers 3, 5, 7, 9, 11, 13.
a) Find the mean and standard deviation of this population.
b) List the 15 possible random samples (n = 2) that can be selected from this population and calculate their means.
c) Use the results of part (b) to construct the sampling distribution of the means of these samples.
d) Calculate the mean µ and variance σ² of the probability distribution in part (c) and compare them with the results obtained in part (a).

Question 3: A city currently does not have a National Football League team. Fifty-four percent of all the city's residents are in favor of attracting an NFL team. A random sample of 1000 of the city's residents is selected and asked if they would want an NFL team.
A. What is the probability that the percentage of those residents polled who are in favor of attracting an NFL team is less than 50%?
B. What is the probability that the percentage of those residents polled who are in favor of attracting an NFL team is more than 3% from the actual percentage of 54%?

Question 4: Let P1 denote the proportion of even numbers in a random sample of size n1 = 2 drawn without replacement from a population of size N1 = 3 consisting of the values 4, 6, 9. Similarly, let P2 denote the proportion of even numbers in a random sample of size n2 = 2 drawn without replacement from a population of size N2 = 3 consisting of the values 2, 3, 5. Find the mean and the variance of the difference of the two proportions and verify the results.
1. $\mu_{P_1 - P_2} = \sum (P_1 - P_2) f(P_1 - P_2)$
2. $\sigma^2_{P_1 - P_2} = \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}$

Question 5: A population consists of the 6 values 1, 3, 5, 7, 9, 11. Take all the possible samples of size 2 which can be drawn i) with replacement, ii) without replacement from this population. Find the sample means and form a sampling distribution of the mean in the case of sampling with replacement. Compute the mean and variance directly in the case of sampling without replacement. Find the means and variances and verify the results:
i) $E(\bar{x}) = \mu$, $V(\bar{x}) = \frac{\sigma^2}{n}$
ii) $E(\bar{x}) = \mu$, $V(\bar{x}) = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$

Question 6: The weights of 1000 students of a college are normally distributed with a mean of 68.5 kg and a standard deviation of 2.7 kg. If 200 random samples of 25 students each are obtained from this population, find the expected mean and standard deviation of the sampling distribution of means if sampling is done i) with replacement, ii) without replacement. Also verify the respective results.

Question 7: Draw all possible random samples of size n1 = 2 with replacement from the population 3, 4, 5. Similarly, draw all possible random samples of size n2 = 2 with replacement from another finite population 1, 2, 3.
a) Find the sample means x̄1 and x̄2 and the possible differences between the sample means of the two populations.
b) Form a sampling distribution of x̄1 - x̄2, compute its mean and variance, and verify the results for the difference between means.

Question 8: A population consists of the 7 numbers 1, 1, 2, 3, 4, 5, 6. Draw all possible samples of size n = 3 without replacement from this population and find the sample proportion of odd numbers in each sample. Construct the sampling distribution of proportions and verify the respective results:
1) $E(\hat{p}) = \mu_{\hat{p}}$
2) $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \sqrt{\frac{N-n}{N-1}}$

Objectives

1. When each member of a population has an equally likely chance of being selected, this is called:
a) A nonrandom sampling method
b) A quota sample
c) A snowball sample
d) An equal probability selection method

2. Which of the following techniques yields a simple random sample?
a) Choosing volunteers from an introductory psychology class to participate
b) Listing the individuals by ethnic group and choosing a proportion from within each ethnic group at random
c) Numbering all the elements of a sampling frame and then using a random number table to pick cases from the table
d) Randomly selecting schools, and then sampling everyone within the school

3. Which of the following is not true about stratified random sampling?
a) It involves a random selection process from identified subgroups
b) Proportions of groups in the sample must always match their population proportions
c) Disproportional stratified random sampling is especially helpful for getting large enough subgroup samples when subgroup comparisons are to be done
d) Proportional stratified random sampling yields a representative sample

4. Which of the following statements are true?
a) The larger the sample size, the greater the sampling error
b) The more categories or breakdowns you want to make in your data analysis, the larger the sample needed
c) The fewer categories or breakdowns you want to make in your data analysis, the larger the sample needed
d) As sample size decreases, so does the size of the confidence interval

5. Which of the following formulae is used to determine how many people to include in the original sampling?
a) Desired sample size/Desired sample size + 1
b) Proportion likely to respond/desired sample size
c) Proportion likely to respond/population size
d) Desired sample size/Proportion likely to respond

6. Which of the following sampling techniques is an equal probability selection method (i.e., EPSEM) in which every individual in the population has an equal chance of being selected?
a) Simple random sampling
b) Systematic sampling
c) Proportional stratified sampling
d) Cluster sampling using the PPS technique
e) All of the above are EPSEM

7. Which of the following is not a form of nonrandom sampling?
a) Snowball sampling
b) Convenience sampling
c) Quota sampling
d) Purposive sampling
e) They are all forms of nonrandom sampling

8. Which of the following will give a more "accurate" representation of the population from which a sample has been taken?
a) A large sample based on the convenience sampling technique
b) A small sample based on simple random sampling
c) A large sample based on simple random sampling
d) A small cluster sample

9. Sampling in qualitative research is similar to which type of sampling in quantitative research?
a) Simple random sampling
b) Systematic sampling
c) Quota sampling
d) Purposive sampling

10. Which of the following would generally require the largest sample size?
a) Cluster sampling
b) Simple random sampling
c) Systematic sampling
d) Proportional stratified sampling

Answers: 1. D  2. C  3. B  4. B  5. D  6. E  7. E  8. C  9. D  10. A

1) Choose the pair of symbols that completes the sentence: -------------- is a parameter, whereas -------------- is a statistic.
a) N, µ
b) n, s
c)
d)

2) In which of the following situations would $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ be the correct formula to use for computing $\sigma_{\bar{x}}$?
a) Sampling from infinite population without replacement
b) Sampling from infinite or finite population without replacement
c) Sampling from finite population without replacement
d) Both b and c but not a

3) Suppose that a population has standard deviation 5. What is the standard deviation of the sampling distribution of the mean for samples of size n = 25?
a) 5
b) 25
c) 1
d) 0.2

4) A border patrol checkpoint that stops every 10th passenger van is using
a) Stratified sampling
b) Cluster sampling
c) Systematic sampling
d) Sequential sampling

5) Standard error of the mean is the standard deviation of the
a) Population
b) Statistic
c) Sample
d) Sampling distribution of means

6) If samples of size n are drawn without replacement from a population of size N with mean (µ) and variance (σ²), the standard error of the sample mean would be
a) √ …
b) …
c) (N - n)/(N - 1) · …
d) (N - n)/(N - 1) · …

7) In sampling without replacement
a) n < N
b) n > N
c) n ≤ N
d) n ≥ N

8) The standard error increases when the sample size is
a) Increased
b) Decreased
c) Small
d) Large

9) The number of possible samples drawn with replacement, as compared to without replacement, would be
a) More
b) Less
c) Equal
d) None

10) The difference between a statistic and a parameter is called
a) Sampling distribution
b) Sampling error
c) Systematic error
d) Non sampling error

11) Which of the following statements best describes the relationship between a parameter and a statistic?
a) A parameter has a sampling distribution with the statistic as its mean.
b) A parameter has a sampling distribution that can be used to determine what values the statistic is likely to have in repeated samples.
c) A parameter is used to estimate a statistic.
d) A statistic is used to estimate a parameter.

12) Sampling distribution of x̄ is the
a) probability distribution of the sample mean
b) probability distribution of the sample proportion
c) mean of the sample
d) mean of the population

13) A simple random sample of 100 observations was taken from a large population. The sample mean and the standard deviation were determined to be 80 and 12 respectively. The standard error of the mean is
a) 1.20
b) 0.12
c) 8.00
d) 0.80

14) The probability distribution of all possible values of the sample proportion p̂ is the
a) probability density function of p̂
b) sampling distribution of x̄
c) same as p̂, since it considers all possible values of the sample proportion
d) sampling distribution of p̂

15) Since the sample size is always smaller than the size of the population, the sample mean
a) must always be smaller than the population mean
b) must be larger than the population mean
c) must be equal to the population mean
d) can be smaller, larger, or equal to the population mean

16) Standard deviation of all possible x̄ values is called the
a) standard error of proportion
b) standard error of the mean
c) mean deviation
d) central variation

17) As the sample size becomes larger, the sampling distribution of the sample mean approaches a
a) binomial distribution
b) Poisson distribution
c) normal distribution
d) chi-square distribution

STATISTICAL INFERENCE: HYPOTHESIS TESTING

Statistical inference:

Statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation. It is concerned with making predictions or inferences about a population from observations and analyses of a sample. This means that by using the sample results from a particular population, we can draw conclusions about the whole population and its characteristics. But keep in mind that the sample should be large enough to represent the population; in other words, it should be a representative part of it.
Hypothesis:

A statistical hypothesis is a claim (assertion, statement, belief or assumption) about an unknown population parameter value. For example, an investment company claims that the average return across all its investments is 20 percent. To test such claims, sample data are collected and analyzed. On the basis of the sample findings, the hypothesized value of the population parameter is accepted or rejected.

STEPS IN HYPOTHESIS TESTING

Specification of hypothesis:

The Null Hypothesis
An assumption to be tested for possible rejection is called the null hypothesis and is denoted by H0. Rejection of the null hypothesis leads to the acceptance of an alternative hypothesis.

Alternate Hypothesis
Any hypothesis that is different from the null hypothesis and is set up in parallel to the null hypothesis is called an alternative hypothesis and is denoted by H1.

Types of Hypothesis

Directional:
Directional hypotheses are those where one can predict the direction (the effect of one variable on the other as 'positive' or 'negative'). For example: girls perform better than boys ('better than' shows the direction predicted).

One tail test:
A one-tailed test looks for an increase or a decrease in the parameter. In a one-tailed test, the critical region has just one part (the shaded area in the original figure). If our sample value lies in this region, we reject the null hypothesis in favor of the alternative. Suppose we are looking for a definite decrease. Then the critical region will be to the left. Note, however, that in the one-tailed test the value of the parameter can be as high as you like.

Non-Directional:
Non-directional hypotheses are those where one does not predict the kind of effect but can state a relationship between variable 1 and variable 2.
For example: there will be a difference in the performance of girls and boys (without defining what kind of difference).

Two tail test:
A two-tailed test looks for any change in the parameter, whether an increase or a decrease. A two-tailed t-test divides α in half, placing half in each tail. The null hypothesis in this case is a particular value, and the alternative hypothesis allows for departures from it in either direction, one positive and one negative. The critical value of t, t_crit, is written with both a plus and a minus sign (±). For example, the critical value of t when there are ten degrees of freedom (d.f. = 10) and α is set to .05 is t_crit = ±2.228. The sampling distribution model used in a two-tailed t-test is illustrated below:

Level of significance:

The significance level of a statistical hypothesis test is a fixed probability of wrongly rejecting the null hypothesis H0 if it is in fact true. The significance level is usually denoted by α:

Significance level = P(type I error) = α

It is the probability of a type I error (explained below) and is set by the investigator in relation to the consequences of such an error. Usually, the significance level is chosen to be 0.05 (or equivalently, 5%). That is, we want to make the significance level as small as possible in order to protect the null hypothesis and to prevent, as far as possible, the investigator from inadvertently making false claims.

Type I error:

In a hypothesis test, a type I error occurs when the null hypothesis is rejected when it is in fact true; that is, H0 is wrongly rejected. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug; i.e. H0: There is no difference between the two drugs on average.
A type I error would occur if we concluded that the two drugs produced different effects when in fact there was no difference between them.

Type II error:

A type II error occurs when the null hypothesis H0 is not rejected when it is in fact false. The probability of a type II error is generally unknown, but is symbolized by β and written

P(type II error) = β

In the previous example, a type II error would occur if it was concluded that the two drugs produced the same effect, i.e. that there is no difference between the two drugs on average, when in fact they produced different ones. A type II error is frequently due to sample sizes being too small.

Table showing Type I and Type II errors

Decision     H0 is true                                H0 is false
Accept H0    Correct decision with confidence (1-α)    Type II error (β)
Reject H0    Type I error (α)                          Correct decision with confidence (1-β)

Which error is more dangerous?

Both errors can be dangerous, depending upon the situation faced. For example, in court, if the null hypothesis is that the defendant is innocent, convicting an innocent person is a Type I error and releasing a criminal is a Type II error. In some situations the Type I error is the more dangerous one; in others the Type II error is.

Test statistics:

A test statistic is a quantity calculated from our sample of data. Its value is used to decide whether or not the null hypothesis should be rejected in our hypothesis test.
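As a minimal sketch of how a test statistic feeds the decision, assume a one-sample z test for a mean with known population standard deviation and a two-tailed alternative. The function names and all numbers below are hypothetical and chosen only for illustration; the critical value comes from the standard normal distribution in Python's `statistics` module.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """z statistic for H0: mu = mu0 when the population sigma is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def two_tailed_decision(z, alpha=0.05):
    """Return the critical value and whether z lies in the critical region."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # alpha is split between tails
    return z_crit, abs(z) > z_crit


# Hypothetical data, not taken from the text.
z = one_sample_z(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
z_crit, reject = two_tailed_decision(z, alpha=0.05)
print(f"z = {z:.2f}, critical value = +/-{z_crit:.2f}, reject H0: {reject}")
```

Lowering `alpha` widens the region in which H0 is not rejected, which reduces the chance of a type I error at the cost of a larger chance of a type II error.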
Case | No. of parameters | Parameter | σ or σ² | n | Statistic
Case 1 | 1 | μ | Known | - | $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Case 2 | 2 | μ1 − μ2 | Known | - | $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
Case 3 | 1 | μ | Unknown | n > 30 | $z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
Case 4 | 2 | μ1 − μ2 | Unknown | n > 30 | $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}$, where $\sigma_{\bar{x}_1 - \bar{x}_2}$ is the standard error of the difference
Case 5 | 1 | μ | Unknown | n < 30 | $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ with n-1 d.f.
Case 6 | 2 | μ1 − μ2 | Unknown | n < 30 | $t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$ with n-1 d.f.
Case 7 | 1 | P | - | - | $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Case 8 | 2 | P1 − P2 | - | - | $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Case 9 | 1 | σ² | - | - | $\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2}$
Case 10 | 2 | Two σ² | - | - | $F = \dfrac{s_1^2}{s_2^2}$
Case 11 | 1 | α | - | - | $t = \dfrac{a - \alpha}{S.E.(a)}$ with n-2 d.f.
Case 12 | 1 | β | - | - | $t = \dfrac{b - \beta}{S.E.(b)}$ with n-2 d.f.
For χ² tests on an r × c contingency table, the degrees of freedom are (r – 1)(c – 1).

Calculation: Calculate the standard error of the sample statistic. Use the standard error to convert the observed value of the sample statistic to a standardized value.

Critical region: The critical region CR, or rejection region RR, is a set of values of the test statistic for which the null hypothesis is rejected in a hypothesis test. That is, the sample space for the test statistic is partitioned into two regions; one region (the critical region) will lead us to reject the null hypothesis H0, the other will not. So, if the observed value of the test statistic is a member of the critical region, we conclude "Reject H0"; if it is not a member of the critical region then we conclude "Do not reject H0".

Conclusion: The final conclusion is made by comparing the test statistic (which is a summary of the information observed in the sample) to the decision rule. The final conclusion will be either to reject the null hypothesis (because the sample data are very unlikely if the null hypothesis is true) or not to reject the null hypothesis (because the sample data are not very unlikely).

SPECIAL NOTE
Acceptance of hypothesis: In real life cases, when we reject the null hypothesis, it does not mean that we are accepting the alternate hypothesis.
That's because a hypothesis test does not determine which hypothesis is true, or even which is most likely; it only assesses whether the available evidence is sufficient to reject the null hypothesis.

Conclusion: When we test a hypothesis, we are not actually carrying out the situation itself; we are only working on paper, so we cannot say that we are making a decision on the basis of the null and alternate hypotheses. We say instead that we are drawing a conclusion on that basis. As discussed above, when we reject the null hypothesis, it does not mean that we are accepting the alternate hypothesis; i.e. we are drawing the "conclusion" that we reject the null, but we are not making the "decision" that we accept the alternate.

Understanding through example: Look at it in terms of "innocent until proven guilty" in a courtroom: As the person analyzing data, you are the judge. The hypothesis test is the trial, and the null hypothesis is the defendant. The alternative hypothesis is like the prosecution, which needs to make its case beyond a reasonable doubt (say, with 95% certainty). If the evidence presented doesn't prove the defendant is guilty beyond a reasonable doubt, you still have not proved that the defendant is innocent. But based on the evidence, you can't reject that possibility. So how would that verdict be announced? It enters the court record as "Not guilty". That phrase is perfect: "Not guilty" doesn't mean the defendant is innocent, because that has not been proven. It just means the prosecution couldn't prove its case to the necessary "beyond a reasonable doubt" standard. It failed to convince the judge to abandon the assumption of innocence. If you follow that rationale, then you can see that "failure to reject the null" is just the statistical equivalent of "not guilty." In a trial, the burden of proof falls to the prosecution.
When analyzing data, the entire burden of proof falls to the sample data you've collected. Just as "not guilty" is not the same thing as "innocent," neither is "failing to reject" the same as "accepting" the null hypothesis. So the next time you're looking to hang around at the local Nulls Angels clubhouse, remember that "failing to reject the null" is not "accepting the null." Knowing the difference just might get Tiny to buy you a drink.

Z Table:
How to Read a Z Table: Example: Percent of population between 0 and 0.45. Start at the row for 0.4, and read along until 0.45: there is the value 0.1736, and 0.1736 is 17.36%. So 17.36% of the population is between 0 and 0.45 standard deviations from the mean.

T Table:
How to read a t table: First determine the value of α and the degrees of freedom. Then locate the degrees of freedom in the rows (vertically) and α in the columns (horizontally). The entry at the intersection of that row and column is the required critical value.

CHI SQUARE GOODNESS OF FIT
When an analyst attempts to fit a statistical model to observed data, he or she may wonder how well the model actually reflects the data. How "close" are the observed values to those which would be expected under the fitted model? One statistical test that addresses this issue is the chi-square goodness of fit test. This test is commonly used to test association of variables in two-way tables, where the assumed model of independence is evaluated against the observed data. In general, the chi-square test statistic is of the form

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

If the computed test statistic is large, then the observed and expected values are not close and the model is a poor fit to the data.

The following are properties of the goodness-of-fit test: The data are the observed frequencies. This means that there is only one data value for each category.
The degrees of freedom are one less than the number of categories, not one less than the sample size. It is always a right tail test. It has a chi-square distribution. The value of the test statistic doesn't change if the order of the categories is switched.

TEST OF INDEPENDENCE
In the test for independence, the claim is that the row and column variables are independent of each other. This is the null hypothesis. The multiplication rule says that if two events are independent, then the probability of both occurring is the product of the probabilities of each occurring. It is the key to working the test for independence. If you end up rejecting the null hypothesis, then the assumption must have been wrong and the row and column variables are dependent. Remember, all hypothesis testing is done under the assumption that the null hypothesis is true.

The test statistic used is the same as in the chi-square goodness-of-fit test. The principle behind the test for independence is the same as the principle behind the goodness-of-fit test. The test for independence is always a right tail test. In fact, you can think of the test for independence as a goodness-of-fit test where the data are arranged in table form. This table is called a contingency table.

The test statistic has a chi-square distribution when the following assumptions are met: The data are obtained from a random sample. The expected frequency of each category is at least 5.

The following are properties of the test for independence: The data are the observed frequencies. The data are arranged in a contingency table. The degrees of freedom are the degrees of freedom for the row variable times the degrees of freedom for the column variable, (r – 1)(c – 1). It is not one less than the sample size; it is the product of the two degrees of freedom. It is always a right tail test. It has a chi-square distribution.
The expected value is computed by taking the row total times the column total and dividing by the grand total. The value of the test statistic doesn't change if the order of the rows or columns is switched. The value of the test statistic doesn't change if the rows and columns are interchanged (transpose of the matrix).

TEST FOR HOMOGENEITY
The test for homogeneity is a method, based on the chi-square statistic, for testing whether two or more multinomial distributions are equal.

When is this test used? The data are multinomial data in a contingency table or two-way cross-classification table. All expected values are at least 5. Another rule of thumb is that there are more than 4 cells, the average of the expected values is at least 5, and the smallest expected value is at least 1. The cells have counts or frequencies. It doesn't work if the data are percentages or relative frequencies! Either the row totals or the column totals are fixed. The data come from multiple samples which are independent.

This test is used to see if the different samples come from populations with the same distribution. Each cell will have a frequency or count. It is necessary to find row, column, and grand totals. There will be i categories and j samples or distributions. The notation below assumes the categories are the rows and the samples are the columns. If they are switched, all the calculations and results will be the same.

How to decide which of the three χ² tests above is the appropriate one to use?

Goodness of Fit: Use the Goodness of Fit Test when you want to decide whether a population with unknown distribution "fits" a known distribution. In this case there will be a single qualitative survey question or a single outcome of an experiment from a single population.
Goodness of fit is typically used to see if the population is uniform (all outcomes occur with equal frequency), the population is normal, or the population is the same as another population with known distribution. The null and alternative hypotheses are:
H0: The population fits the given distribution.
Ha: The population does not fit the given distribution.

Independence: Use the Test for Independence when you want to decide whether two variables are independent or dependent. In this case there will be two qualitative survey questions or experiments and a contingency table will be constructed. The goal is to see if the two variables are unrelated (independent) or related (dependent). The null and alternative hypotheses are:
H0: The two variables are independent.
Ha: The two variables are dependent.

Homogeneity: Use the Test for Homogeneity when you want to decide if two populations with unknown distribution have the same distribution as each other. In this case there will be a single qualitative survey question or experiment given to two different populations. The null and alternative hypotheses are:
H0: The two populations follow the same distribution.
Ha: The two populations have different distributions.

FISHER'S EXACT TEST
The Fisher's Exact test procedure calculates an exact probability value for the relationship between two dichotomous variables, as found in a two by two cross table. The program calculates the difference between the data observed and the data expected, considering the given marginals and the assumptions of the model of independence. It addresses the same question as the Chi-square test for independence; however, the Chi-square gives only an estimate of the true probability value, an estimate which might not be very accurate if the marginals are very uneven or if there is a small value (less than five) in one of the cells. In such cases the Fisher exact test is a better choice than the Chi-square.
However, in many cases the Chi-square is preferred because the Fisher exact test is difficult to calculate. The probability of observing a given set of frequencies a, b, c and d in a 2 x 2 contingency table, given fixed row and column marginal totals and sample size n, is:

$$p = \frac{(a + b)!\,(a + c)!\,(c + d)!\,(b + d)!}{a!\,b!\,c!\,d!\,n!}$$

SPECIAL CASE OF CONTINGENCY
A 2×2 contingency table shows the frequencies of occurrence of all combinations of the levels of two dichotomous variables, in a sample of size n. A schematic form of such a table is given by the figure below.

[Figure: schematic 2×2 contingency table]

A research question of interest is often whether the variables summarized in a contingency table are independent of each other. The test to determine if this is so depends on which, if any, of the margins are fixed, either by design or for the purposes of the analysis. For example, in a randomized trial in which the number of subjects to be randomized to each treatment group has been specified, the row margins would be fixed but the column margins would not (it is customary to use rows for treatments and columns for outcomes). In a matched study, however, in which one might sample 100 cases (smokers, say) and 1000 controls (non-smokers), and then test each of these 1100 subjects for the presence or absence of some exposure that may have predicted their own smoking status (perhaps a parent who smoked), it would be the column margins that are fixed. In a random and unrestrained sample, in which each subject sampled is then cross-classified by two attributes (say smoking status and gender), neither margin would be fixed. Finally, in Fisher's famous tea-tasting experiment, in which a lady was to guess whether the milk or the tea infusion was first added to the cup by dividing 8 cups into two sets of 4, both the row and the column margins would be fixed by the design.
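As a small illustration of the probability formula for a 2 x 2 table with fixed margins, the sketch below evaluates it for the tea-tasting design just described (8 cups, 4 of each kind). The function name `table_probability` and the cell layout a, b / c, d are assumptions made for this example only; the outcome evaluated is the one in which all 8 cups are classified correctly.

```python
from math import factorial


def table_probability(a, b, c, d):
    """Probability of the 2x2 table [[a, b], [c, d]] when all margins are fixed."""
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(a + c)
                 * factorial(c + d) * factorial(b + d))
    denominator = (factorial(a) * factorial(b) * factorial(c)
                   * factorial(d) * factorial(n))
    return numerator / denominator


# Tea tasting: all 4 "milk first" cups and all 4 "tea first" cups identified correctly.
p_all_correct = table_probability(4, 0, 0, 4)
print(p_all_correct)  # 1/70, about 0.0143

# The probabilities of every table sharing these margins add up to 1.
total = sum(table_probability(a, 4 - a, 4 - a, a) for a in range(5))
print(total)
```

Because both margins are fixed at 4, only the count a in the top-left cell is free, which is why the loop over a = 0, ..., 4 enumerates every possible table.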
Yet in the first case mentioned, that of a randomized trial with fixed row margins but not fixed column margins, the column margins may be treated as fixed for the purposes of the analysis, so as to ensure exactness. When the row and column margins are fixed, either by design or for the analysis, independence can be tested using Fisher's exact test. This test is based on the hypergeometric distribution and it is computationally intensive, especially in large samples. Therefore, Fisher advocated the use of Pearson's statistic,

$$\chi^2 = \frac{n\,(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$$

F-Test
The F-distribution is formed by the ratio of two independent chi-square variables divided by their respective degrees of freedom. Since F is formed by chi-square, many of the chi-square properties carry over to the F distribution. The F-values are all non-negative. The distribution is non-symmetric. The mean is approximately 1. There are two independent degrees of freedom, one for the numerator and one for the denominator. There are many different F distributions, one for each pair of degrees of freedom.

The F-test is designed to test if two population variances are equal. It does this by comparing the ratio of two variances. So, if the variances are equal, the ratio of the variances will be 1. Since $(n - 1)s^2/\sigma^2$ is a chi-square variable, the F statistic is $F = \dfrac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$. If the null hypothesis is true ($\sigma_1^2 = \sigma_2^2$), this simplifies dramatically to $F = s_1^2/s_2^2$. This ratio of sample variances will be the test statistic used. If the null hypothesis is false, then we will reject the null hypothesis that the ratio was equal to 1 and our assumption that the variances were equal.

There are several different F-tables. Each one has a different level of significance. So, find the correct level of significance first, and then look up the numerator degrees of freedom and the denominator degrees of freedom to find the critical value. You will notice that all of the tables only give level of significance for right tail tests.
Because the F distribution is not symmetric, and there are no negative values, you may not simply take the opposite of the right critical value to find the left critical value. The way to find a left critical value is to reverse the degrees of freedom, look up the right critical value, and then take the reciprocal of this value. For example, the critical value with 0.05 on the left with 12 numerator and 15 denominator degrees of freedom is found by taking the reciprocal of the critical value with 0.05 on the right with 15 numerator and 12 denominator degrees of freedom.

Assumptions: The larger variance should always be placed in the numerator. The test statistic is F = s1^2 / s2^2 where s1^2 > s2^2. Divide alpha by 2 for a two tail test and then find the right critical value. If standard deviations are given instead of variances, they must be squared. When the degrees of freedom aren't given in the table, go with the value with the larger critical value (this happens to be the smaller degrees of freedom). This is so that you are less likely to reject in error (type I error). The populations from which the samples were obtained must be normal. The samples must be independent.

How to read an F table: Find the column that corresponds to the relevant numerator degrees of freedom, r1. Find the three rows that correspond to the relevant denominator degrees of freedom, r2. Find the one row, from the group of three rows, that is headed by the probability of interest, whether it's 0.01, 0.025, or 0.05. Determine the F-value where the r1 column and the probability row intersect.

Analysis of Variance-ANOVA

ONE-WAY ANOVA
A One-Way Analysis of Variance is a way to test the equality of three or more means at one time by using variances.

Assumptions
The populations from which the samples were obtained must be normally or approximately normally distributed. The samples must be independent.
The variances of the populations must be equal.

Hypothesis
The null hypothesis will be that all population means are equal; the alternative hypothesis is that at least one mean is different. In the following, lower case letters apply to the individual samples and capital letters apply to the entire set collectively. That is, n is one of many sample sizes, but N is the total sample size.

Grand Mean
The grand mean of a set of samples is the total of all the data values divided by the total sample size. This requires that you have all of the sample data available to you, which is usually the case, but not always. It turns out that all that is necessary to perform a one-way analysis of variance are the number of samples, the sample means, the sample variances, and the sample sizes. Another way to find the grand mean is to find the weighted average of the sample means. The weight applied is the sample size.

Total Variation
The total variation (not variance) is the sum of the squares of the differences of each data value from the grand mean. It is made up of the between group variation and the within group variation. The whole idea behind the analysis of variance is to compare the ratio of between group variance to within group variance. If the variance caused by the interaction between the samples is much larger than the variance that appears within each group, then it is because the means aren't the same.

Between Group Variation
The variation due to the interaction between the samples is denoted SS(B) for Sum of Squares Between groups. If the sample means are close to each other (and therefore to the grand mean) this will be small. There are k samples involved with one data value for each sample (the sample mean), so there are k-1 degrees of freedom. The variance due to the interaction between the samples is denoted MS(B) for Mean Square Between groups.
This is the between group variation divided by its degrees of freedom. It is also denoted by $s_b^2$.

Within Group Variation
The variation due to differences within individual samples is denoted SS(W) for Sum of Squares Within groups. Each sample is considered independently; no interaction between samples is involved. The degrees of freedom are equal to the sum of the individual degrees of freedom for each sample. Since each sample has degrees of freedom equal to one less than its sample size, and there are k samples, the total degrees of freedom is k less than the total sample size: df = N - k. The variance due to the differences within individual samples is denoted MS(W) for Mean Square Within groups. This is the within group variation divided by its degrees of freedom. It is also denoted by $s_w^2$. It is the weighted average of the variances (weighted with the degrees of freedom).

F test statistic
Recall that an F variable is the ratio of two independent chi-square variables divided by their respective degrees of freedom. Also recall that the F test statistic is the ratio of two sample variances. It turns out that's exactly what we have here. The F test statistic is found by dividing the between group variance by the within group variance. The degrees of freedom for the numerator are the degrees of freedom for the between group (k-1) and the degrees of freedom for the denominator are the degrees of freedom for the within group (N-k).

Summary Table
All of this sounds like a lot to remember, and it is. However, there is a table which makes things really nice.

Source | SS | df | MS | F
Between | SS(B) | k-1 | SS(B) / (k-1) | MS(B) / MS(W)
Within | SS(W) | N-k | SS(W) / (N-k) |
Total | SS(W) + SS(B) | N-1 | |

Notice that each Mean Square is just the Sum of Squares divided by its degrees of freedom, and the F value is the ratio of the mean squares.
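The one-way ANOVA quantities described above can be computed from summary statistics alone (sample sizes, means and variances). The following is a minimal sketch; the function name `one_way_anova` and the three small samples are hypothetical and are not taken from the notes.

```python
def one_way_anova(sizes, means, variances):
    """Return (SS(B), SS(W), MS(B), MS(W), F) from per-sample summary statistics."""
    k = len(sizes)   # number of samples
    N = sum(sizes)   # total sample size
    # Grand mean as the weighted average of the sample means (weights = sample sizes).
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / N
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((n - 1) * v for n, v in zip(sizes, variances))
    ms_between = ss_between / (k - 1)   # df = k - 1
    ms_within = ss_within / (N - k)     # df = N - k
    return ss_between, ss_within, ms_between, ms_within, ms_between / ms_within


# Hypothetical samples [1, 2, 3], [2, 3, 4], [3, 4, 5]:
# each has size 3 and sample variance 1; the means are 2, 3 and 4.
ssb, ssw, msb, msw, f_stat = one_way_anova([3, 3, 3], [2.0, 3.0, 4.0], [1.0, 1.0, 1.0])
print(ssb, ssw, msb, msw, f_stat)  # 6.0 6.0 3.0 1.0 3.0
```

The F value returned here would then be compared with the F critical value for k-1 numerator and N-k denominator degrees of freedom, taken from an F table.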
Do not put the largest variance in the numerator; always divide the between variance by the within variance. If the between variance is smaller than the within variance, then the means are really close to each other and you will fail to reject the claim that they are all equal. The degrees of freedom of the F-test are in the same order they appear in the table.

Decision Rule
The decision will be to reject the null hypothesis if the test statistic from the table is greater than the F critical value with k-1 numerator and N-k denominator degrees of freedom. If the decision is to reject the null, then at least one of the means is different. However, the ANOVA does not tell you where the difference lies. For this, you need another test, either the Scheffe or Tukey test.

TWO-WAY ANOVA
The two-way analysis of variance is an extension of the one-way analysis of variance. There are two independent variables (hence the name two-way).

Assumptions
The populations from which the samples were obtained must be normally or approximately normally distributed. The samples must be independent. The variances of the populations must be equal. The groups must have the same sample size.

Hypothesis
There are three sets of hypotheses with the two-way ANOVA. The null hypotheses for each of the sets are given below. The population means of the first factor are equal. This is like the one-way ANOVA for the row factor. The population means of the second factor are equal. This is like the one-way ANOVA for the column factor. There is no interaction between the two factors. This is similar to performing a test for independence with contingency tables.

Factors
The two independent variables in a two-way ANOVA are called factors. The idea is that there are two variables, factors, which affect the dependent variable. Each factor will have two or more levels within it, and the degrees of freedom for each factor is one less than the number of levels.
Treatment Groups
Treatment groups are formed by making all possible combinations of the two factors. For example, if the first factor has 3 levels and the second factor has 2 levels, then there will be 3x2=6 different treatment groups.

Main Effect
The main effect involves the independent variables one at a time. The interaction is ignored for this part. Just the rows or just the columns are used, not mixed. This is the part which is similar to the one-way analysis of variance. Each of the variances calculated to analyze the main effects is like the between variances.

Interaction Effect
The interaction effect is the effect that one factor has on the other factor. The degrees of freedom here are the product of the two degrees of freedom for each factor.

Within Variation
The within variation is the sum of squares within each treatment group. You have one less than the sample size (remember all treatment groups must have the same sample size for a two-way ANOVA) degrees of freedom for each treatment group. The total number of treatment groups is the product of the number of levels for each factor. The within variance is the within variation divided by its degrees of freedom. The within group is also called the error.

F-Tests
There is an F-test for each of the hypotheses, and the F-test is the mean square for each main effect and the interaction effect divided by the within variance. The numerator degrees of freedom come from each effect, and the denominator degrees of freedom is the degrees of freedom for the within variance in each case.

Two-Way ANOVA Table
It is assumed that main effect A has a levels (and A = a-1 df), main effect B has b levels (and B = b-1 df), n is the sample size of each treatment, and N = abn is the total sample size. Notice the overall degree of freedom is once again one less than the total sample size.
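The degrees-of-freedom bookkeeping for a balanced two-way ANOVA can be sketched as follows. The function name `two_way_anova_df` is made up for this illustration; the 3 x 2 design matches the treatment-group example above, and the group size n = 5 is an assumed value.

```python
def two_way_anova_df(a, b, n):
    """Degrees of freedom for a balanced two-way ANOVA: a x b treatment groups of size n."""
    df = {
        "A": a - 1,                  # main effect A
        "B": b - 1,                  # main effect B
        "A*B": (a - 1) * (b - 1),    # interaction effect
        "within": a * b * (n - 1),   # n - 1 for each of the a*b treatment groups
    }
    df["total"] = a * b * n - 1      # one less than the total sample size N = abn
    return df


df = two_way_anova_df(a=3, b=2, n=5)
print(df)  # {'A': 2, 'B': 1, 'A*B': 2, 'within': 24, 'total': 29}
```

The four component degrees of freedom add up to the total, which is a quick check that a two-way ANOVA table has been filled in consistently.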
Source | SS | df | MS | F
Main Effect A | Given | A, a-1 | SS / df | MS(A) / MS(W)
Main Effect B | Given | B, b-1 | SS / df | MS(B) / MS(W)
Interaction Effect | Given | A*B, (a-1)(b-1) | SS / df | MS(A*B) / MS(W)
Within | Given | N - ab, ab(n-1) | SS / df |
Total | sum of others | N - 1, abn - 1 | |

MULTIPLE COMPARISON TEST
In a one-way ANOVA, the F statistic tests whether the treatment effects are all equal, i.e. that there are no differences among the means of the J groups. A significant F value indicates that there are differences in the means, but it does not tell you where those differences are, e.g. group 1's mean might be different from group 2's mean but not different from group 3's mean. To isolate where the differences are, you could do a series of pairwise t-tests. The problem with this is that the significance levels can be misleading. For example, if you have 7 groups, there will be 21 pairwise comparisons of means; if using the .05 level of significance, you would expect about one statistically significant difference (21 × .05 = 1.05) even if no differences exist. Therefore, various methods have been developed for doing multiple comparisons of group means.

LSD
LSD stands for Least Significant Difference t test. This test does not control the overall probability of concluding that some pairs of means are different when in fact they are equal, i.e. it doesn't matter if you are comparing 1 pair of means or 100, no adjustment is made for the number of comparisons. The formula is:

$$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{MS(W)\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$

BONFERRONI
The Bonferroni adjustment is the simplest. It basically multiplies each of the significance levels from the LSD test by the number of tests performed, i.e. J*(J-1)/2. If this value is greater than 1, then a significance level of 1 is used.

SIDAK
While simple, the Bonferroni adjustment actually overcompensates for the fact that multiple comparisons are being made, e.g.
if you do 21 tests, the probability is NOT 1.05 that at least one of them will be significant at the .05 level; rather, it is $1 - .95^{21} = .659$. The Sidak adjustment computes the level of significance as $1 - (1 - \text{LSD significance})^{J(J-1)/2}$.

SCHEFFE
The Scheffe test takes a somewhat different approach. The Scheffe test computes an F statistic with d.f. = J-1, N-J: Scheffe $F = \text{LSD}^2/(J - 1)$, where LSD is the LSD t statistic.

PROBLEMS RELATED TO EACH CASE

Case 1: (For testing μ when σ or σ² is known)

Q1: The mean lifetime of electric light bulbs produced by a company has in the past been 1120 hours with a standard deviation of 125 hours. A sample of 100 electric bulbs recently chosen from a supply of newly produced bulbs showed a mean lifetime of 1070 hours. Test the hypothesis that the mean lifetime of bulbs has not changed, using a 5% level of significance.

Solution:
Step 1: Specification of hypothesis
H0: μ = 1120 hours (the mean lifetime of the bulbs has not changed)
H1: μ ≠ 1120 hours (the mean lifetime of the bulbs has changed) (two-tailed)
Step 2: Level of Significance
α = 0.05 (5%). Standard deviation is known.
Step 3: Test Statistics
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
Step 4: Calculation
$$z_{cal} = \frac{1070 - 1120}{125/\sqrt{100}} = -\frac{50\,(10)}{125} = -4$$
Step 5: Critical Region
Reject H0 if $|z_{cal}| \ge z_{tab}$. Here $z_{tab} = 1.96$, and $|-4| \ge 1.96$.
Step 6: Conclusion
Since the absolute value of the calculated z exceeds the critical value $z_{tab} = 1.96$ (z lies in the critical region), we reject H0 at the 5% level of significance. We therefore conclude that the mean lifetime of the bulbs has changed.

Q2: The mean weight of a tablet of a certain drug is claimed to be 50 mg. A sample of 100 tablets showed a mean weight of 50.15 mg with a standard deviation of 0.4 mg. Using a 1% level of significance, can we conclude that the desired weight is not properly maintained?
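The one-sample z computation used in Case 1 can be checked with a few lines of Python. This is a sketch only: the helper name `z_test_two_tailed` is made up for this illustration, and the critical value is computed from the standard normal distribution rather than read from a printed table, so it differs slightly from the rounded table values 1.96 and 2.58. The inputs are the figures stated in Q1 and Q2 above.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_tailed(xbar, mu0, sigma, n, alpha):
    """Return (z_cal, z_tab, reject_H0) for a two-tailed one-sample z-test."""
    z_cal = (xbar - mu0) / (sigma / sqrt(n))
    z_tab = NormalDist().inv_cdf(1 - alpha / 2)  # alpha is split between the two tails
    return z_cal, z_tab, abs(z_cal) >= z_tab


# Q1: bulbs, H0: mu = 1120 hours, alpha = 0.05
q1 = z_test_two_tailed(xbar=1070, mu0=1120, sigma=125, n=100, alpha=0.05)
# Q2: tablets, H0: mu = 50 mg, alpha = 0.01
q2 = z_test_two_tailed(xbar=50.15, mu0=50, sigma=0.4, n=100, alpha=0.01)

for name, (z_cal, z_tab, reject) in (("Q1", q1), ("Q2", q2)):
    print(f"{name}: z_cal = {z_cal:.2f}, z_tab = {z_tab:.3f}, reject H0: {reject}")
```

For Q1 this gives z_cal = -4 and for Q2 z_cal = 3.75; in both cases the statistic falls in the critical region.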
Step 1: Specification of hypothesis
H0: μ = 50 mg (the weight of the tablet is properly maintained)
H1: μ ≠ 50 mg (the weight of the tablet is not properly maintained) (two-tailed)
Step 2: Level of Significance
α = 0.01 (1%). Standard deviation is known.
Step 3: Test Statistics
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
Step 4: Calculation
$$z_{cal} = \frac{50.15 - 50}{0.4/\sqrt{100}} = \frac{0.15\,(10)}{0.4} = 3.75$$
Step 5: Critical Region
Reject H0 if $|z_{cal}| \ge z_{tab}$. Here $z_{tab} = 2.58$, and $|3.75| \ge 2.58$.
Step 6: Conclusion
Since the calculated value of z is greater than the critical value $z_{tab}$ (z lies in the critical region), we reject H0 at the 1% level of significance. We therefore conclude that the weight of the tablet is not properly maintained.

Practice Questions
Q1: A manufacturer supplies the rear axles for U.S. Postal Service mail trucks. These axles must be able to withstand 80,000 pounds per square inch in stress tests, but an excessively strong axle raises production costs significantly. Long experience indicates that the standard deviation of the strength of its axles is 4,000 pounds per square inch. The manufacturer selects a sample of 100 axles from production, tests them, and finds that the mean stress capacity of the sample is 79,600 pounds per square inch.
Q2: It has been found from experience that the mean breaking strength of a particular brand of thread is 9.63N with a standard deviation of 1.40N. Recently a sample of 36 pieces of thread showed a mean breaking strength of 8.93N. Can we conclude at 5% and 1% levels of significance that the threads have become inferior?

Case 2: (For testing μ1 and μ2 when σ or σ² is known)

Q1: A firm believes that the tires produced by process A on average last longer than tires produced by process B.
To test this belief, random samples of tires produced by the two processes were tested and the results are:

Process | Sample Size | Average Lifetime (in km) | Standard Deviation (in km)
A | 50 | 22,400 | 1000
B | 50 | 21,800 | 1000

Is there evidence at a 5% level of significance that the firm is correct in its belief?

Solution:
Step 1: Specification of hypothesis
Let us take the null hypothesis that tires produced by process A do not last longer, on average, than tires produced by process B, that is
H0: (μ1 - μ2) ≤ 0
H1: (μ1 - μ2) > 0 (upper tail)
Step 2: Level of Significance
α = 0.05 (5%). Standard deviation is known.
Step 3: Test Statistics
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$
Step 4: Calculation
$$z_{cal} = \frac{22{,}400 - 21{,}800}{\sqrt{\dfrac{1000^2}{50} + \dfrac{1000^2}{50}}} = \frac{600}{\sqrt{20{,}000 + 20{,}000}} = \frac{600}{200} = 3$$
Step 5: Critical Region
Reject H0 if $z_{cal} > z_{tab}$. Here $z_{tab} = 1.645$, and 3 > 1.645.
Step 6: Conclusion
Since the calculated value of z exceeds the critical value $z_{tab} = 1.645$ (z lies in the critical region), we reject H0 at the 5% level of significance. We therefore conclude that the tires produced by process A last longer than those produced by process B.

Q2: A random sample of size 6 from a normal population with variance 24 gave mean = 15. A sample of size 8 from a normal population with variance 80 gave mean = 13. Test H0: μ1 - μ2 = 0 against the alternative that the difference is not equal to 0.

Solution:
Step 1: Specification of hypothesis
H0: μ1 = μ2, i.e. μ1 - μ2 = 0
H1: μ1 - μ2 ≠ 0 (two-tailed)
Step 2: Level of Significance
α = 0.05 (5%). Standard deviation is known.
Step 3: Test Statistics
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
Step 4: Calculation
$z_{cal} = \dfrac{(15 - 13) - 0}{\sqrt{\dfrac{24}{6} + \dfrac{80}{8}}} = \dfrac{2}{\sqrt{4 + 10}} = \dfrac{2}{\sqrt{14}} = 0.535$
Step 5: Critical Region
Reject H0 if $|z_{cal}| \geq z_{tab}$. For a two-tailed test at the 5% level, $z_{tab} = 1.96$, and $0.535 < 1.96$.
Step 6: Conclusion
Since the calculated value of z is less than the critical value $z_{tab}$ (z does not lie in the critical region), we do not reject H0 at the 5% level of significance.

Practice Questions
Q1: On an examination in a statistics course, the average mark of 50 boys was 72 with a population standard deviation of 8, while the average mark of 45 girls was 75. Test the hypothesis at the (a) 5% and (b) 1% level of significance that the boys' performance is inferior to that of the girls.
Q2: Two samples A and B detailed below were taken from normal populations of standard deviation 2.5. Decide whether the difference of sample means is significant at the 0.05 level of significance.

A   16 18 23 26 19 24 25 23 21 22
B   20 21 23 25 27 24 26 24 28 25

Case 3: (For testing $\mu$ when $\sigma$ or $\sigma^2$ is unknown and n > 30)
Q1: A company claims that the average lifetime of its product is 2000 hours. A random sample of 64 products is put on test and their lifetime in hours is recorded. The following sums are obtained from the lifetimes: $\sum x = 127808$ and $\sum (x - \bar{x})^2 = 9694.6$. Test the hypothesis that the manufacturer is overestimating the lifetimes of the products. Take $\alpha = 0.01$.
Solution:
Step 1: Specification of hypothesis
H0: $\mu = 2000$ hours
H1: $\mu < 2000$ hours (Lower Tail)
Step 2: Level of Significance
$\alpha = 0.01$ (1%). Standard deviation is unknown.
Step 3: Test Statistics
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
Step 4: Calculation
$\bar{x} = \dfrac{127808}{64} = 1997$ and $s = \sqrt{\dfrac{9694.6}{64}} = 12.31$
$z_{cal} = \dfrac{1997 - 2000}{12.31/\sqrt{64}} = -\dfrac{3\,(8)}{12.31} = -1.95$
Step 5: Critical Region
Reject H0 if $z_{cal} < -z_{tab}$. For a lower-tailed test at the 1% level, $z_{tab} = 2.33$, and $-1.95 \nless -2.33$.
Step 6: Conclusion
Since the calculated value of z does not fall below $-z_{tab}$ (z does not lie in the critical region), we do not reject H0 at the 1% level of significance. We therefore do not have sufficient evidence that the mean lifetime is less than 2000 hours.

Q2: Individual filing of income tax returns prior to 30th June had an average refund of Rs. 1200. Consider the population of last-minute filers who file their returns during the last week of June. For a random sample of 400 individuals who filed a return between 25 and 30 June, the sample mean refund was Rs. 1054 and the sample standard deviation was Rs. 1600. Using the 5% level of significance, test the belief that the individuals who wait until the last week of June to file their returns get a lower refund than the early filers.
Solution:
Step 1: Specification of hypothesis
H0: $\mu \geq 1200$
H1: $\mu < 1200$ (Lower Tail)
Step 2: Level of Significance
$\alpha = 0.05$ (5%). Standard deviation is unknown.
Step 3: Test Statistics
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
Step 4: Calculation
$z_{cal} = \dfrac{1054 - 1200}{1600/\sqrt{400}} = -\dfrac{146}{80} = -1.825$
Step 5: Critical Region
Reject H0 if $z_{cal} < -z_{tab}$. For a lower-tailed test at the 5% level, $z_{tab} = 1.645$, and $-1.825 < -1.645$.
Step 6: Conclusion
Since the calculated value of z falls below $-z_{tab}$ (z lies in the critical region), we reject H0 at the 5% level of significance. We therefore conclude that the mean refund of the last-minute filers is less than Rs. 1200.

Practice Questions
Q1: A packaging device is set to fill detergent powder packets with a mean weight of 5 kg, with a standard deviation of 0.21 kg. The weight of packets can be assumed to be normally distributed. The weight of packets is known to drift upwards over a period of time due to machine fault, which is not tolerable.
A random sample of 100 packets is taken and weighed. This sample has a mean weight of 5.03 kg. Can we conclude that the mean weight produced by the machine has increased? Use a 5% level of significance.
Q2: The mean life span of a sample of fluorescent LEDs produced by a company is found to be 1600 days with a standard deviation of 150 days. Test the hypothesis that the mean life span of fluorescent LEDs produced in general is higher than the mean life of 1570 days at the $\alpha = 0.01$ level of significance.

Case 4: (For testing $\mu_1$ and $\mu_2$ when $\sigma$ or $\sigma^2$ is unknown and n > 30)
Q1: An experiment was conducted to compare the mean time in days required to recover from the common cold for persons given a daily dose of 4 mg of vitamin C versus those who were not given a vitamin supplement. Suppose that 35 adults were randomly selected for each treatment category and that the mean recovery times and standard deviations for the two groups were as follows:

                            Vitamin C   No Vitamin Supplement
Sample size                 35          35
Sample mean                 5.8         6.9
Sample standard deviation   1.2         2.9

Test the hypothesis that the use of vitamin C reduces the mean time required to recover from a common cold and its complications, at the level of significance $\alpha = 0.05$.
Solution:
Step 1: Specification of hypothesis
Let $\mu_1$ be the mean recovery time with vitamin C and $\mu_2$ the mean recovery time without a supplement. A reduction means $\mu_1 < \mu_2$, so
H0: $(\mu_1 - \mu_2) \geq 0$
H1: $(\mu_1 - \mu_2) < 0$ (Lower Tail)
Step 2: Level of Significance
$\alpha = 0.05$ (5%). Standard deviation is unknown.
Step 3: Test Statistics
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
Step 4: Calculation
$z_{cal} = \dfrac{(5.8 - 6.9) - 0}{\sqrt{\dfrac{1.2^2}{35} + \dfrac{2.9^2}{35}}} = \dfrac{-1.1}{\sqrt{0.041 + 0.240}} = \dfrac{-1.1}{0.530} = -2.07$
Step 5: Critical Region
Reject H0 if $z_{cal} < -z_{tab}$. Here $z_{tab} = 1.645$, and $-2.07 < -1.645$.
Step 6: Conclusion
Since the calculated value of z falls below $-z_{tab}$ (z lies in the critical region), we reject H0 at the 5% level of significance. We therefore conclude that the use of vitamin C reduces the mean recovery time.
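
The arithmetic of a large-sample two-mean z test like the vitamin C example can be checked with a short script. The following is a minimal sketch in Python, not part of the original notes: the function name `two_sample_z` is hypothetical, the sample standard deviations stand in for the unknown population values (reasonable only because both samples exceed 30), and the test is taken to be lower-tailed because the claim is that vitamin C reduces recovery time.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, mean2, sd1, sd2, n1, n2, delta0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = delta0 (hypothetical helper)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2 - delta0) / standard_error


# Summary figures from the vitamin C example (group 1 = vitamin C).
z_cal = two_sample_z(5.8, 6.9, 1.2, 2.9, 35, 35)

alpha = 0.05
# Lower-tail critical value: reject H0 when z_cal falls below it.
z_crit = NormalDist().inv_cdf(alpha)
# One-sided p-value for the lower-tailed alternative.
p_value = NormalDist().cdf(z_cal)

print(f"z_cal = {z_cal:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0" if z_cal < z_crit else "do not reject H0")
```

Using `NormalDist().inv_cdf` instead of a printed table avoids rounding the critical value; for an upper-tailed test the comparison would be `z_cal > NormalDist().inv_cdf(1 - alpha)`.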
Q2: The Educational Testing Service conducted a study to investigate differences between the scores of female and male students on the Mathematics Aptitude Test. The study identified a random sample of 562 female and 852 male students who had achieved the same high score on the mathematics portion of the test. That is, the female and male students were viewed as having similar high ability in mathematics. The verbal scores for the two samples are given below:

                            Female   Male
Sample mean                 547      525
Sample standard deviation   83       78

Do the data support the conclusion that, given populations of female and male students with similar high ability in mathematics, the female students will have a significantly higher verbal ability? Test at the $\alpha = 0.05$ significance level. What is your conclusion?
Solution:
Step 1: Specification of hypothesis
Let $\mu_1$ be the mean verbal score of the female students and $\mu_2$ that of the male students.
H0: $(\mu_1 - \mu_2) \leq 0$
H1: $(\mu_1 - \mu_2) > 0$ (Upper Tail)
Step 2: Level of Significance
$\alpha = 0.05$ (5%). Standard deviation is unknown.
Step 3: Test Statistics
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
Step 4: Calculation
$z_{cal} = \dfrac{(547 - 525) - 0}{\sqrt{\dfrac{83^2}{562} + \dfrac{78^2}{852}}} = \dfrac{22}{\sqrt{12.258 + 7.140}} = \dfrac{22}{4.404} = 4.995$
Step 5: Critical Region
Reject H0 if $z_{cal} > z_{tab}$. Here $z_{tab} = 1.645$, and $4.995 > 1.645$.
Step 6: Conclusion
Since the calculated value of z exceeds the critical value $z_{tab}$ (z lies in the critical region), we reject H0 at the 5% level of significance.

Practice Questions
Q1: The mean height of 50 male students of group I is 68.2 inches with a standard deviation of 2.5 inches, while 0 male students of group II have a mean height of 67.5 inches with a standard deviation of 2.8 inches. Test the hypothesis that male students of group I are taller than male students of group II at the 0.05 level of significance.
Q2: A farmer claims that the average yield of corn of variety-----variety B by at least 12 bushels per acre.----- with a standard deviation of 6.28 bushels per acre, while variety B yielded on average 77.8 bushels per acre with a standard deviation of 5.64 bushels per acre. Test the farmer's claim using a 0.05 level of significance.

Case 5: (For testing $\mu$ when $\sigma$ or $\sigma^2$ is unknown and n < 30)
Q1: Researchers are interested in whether the mean level of enzyme B in a certain population is different from 120. They measure levels of enzyme B in a sample of 15 individuals and find that the mean is $\bar{x} = 96$ and the sample standard deviation is $s = 35$.
Step 1: Specification of hypothesis
H0: $\mu = 120$
H1: $\mu \neq 120$ (Two Tailed)
Step 2: Level of Significance
$\alpha = 0.05$ (5%). Standard deviation is unknown.
Step 3: Test Statistics
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ with n − 1 d.f.
Step 4: Calculation
$t_{cal} = \dfrac{96 - 120}{35/\sqrt{15}} = -2.65$
Step 5: Critical Region
Reject H0 if $|t_{cal}| \geq t_{tab}(n-1)$. Here $t_{tab}(n-1) = 2.145$ (d.f. = 15 − 1 = 14), and $|-2.65| \geq 2.145$.
Step 6: Conclusion
Since the absolute value of $t_{cal}$ exceeds the critical value $t_{tab}$ (t lies in the critical region), we reject H0 at the 5% level of significance.

Q2: The average breaking strength of steel rods is specified to be 18.5 thousand kg. To check this, a sample of 14 rods was tested. The mean and standard deviation obtained were 17.85 and 1.955, respectively. Test the significance of the deviation.
Step 1: Specification of hypothesis
H0: $\mu = 18.5$
H1: $\mu \neq 18.5$ (Two Tailed)
Step 2: Level of Significance
$\alpha = 0.05$ (5%). Standard deviation is unknown.
Step 3: Test Statistics
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ with n − 1 d.f.
Step 4: Calculation
$t_{cal} = \dfrac{17.85 - 18.5}{1.955/\sqrt{14}} = -1.24$
Step 5: Critical Region
Reject H0 if $|t_{cal}| \geq t_{tab}(n-1)$. Here $t_{tab}(n-1) = 2.16$ (d.f. = 14 − 1 = 13), and $|-1.24| < 2.16$.
Step 6: Conclusion
Since the absolute value of $t_{cal}$ does not exceed the critical value $t_{tab}$ (t does not lie in the critical region), we do not reject H0 at the 5% level of significance.

Practice Questions
Q1: An automobile tire manufacturer claims that the average life of a particular grade of tire is more than 20,000 km when used under normal conditions. A random sample of 16 tires was tested and a mean and standard deviation of 22,000 km and 5,000 km respectively were computed. Assuming the life of the tires in km to be approximately normally distributed, decide whether the manufacturer's claim is valid.
Q2: A random sample of 22 fifth grade pupils has a grade point average of 5.0 in maths with a standard deviation of 0.452, where marks range from 1 (worst) to 6 (excellent). The grade point average (GPA) of all fifth grade pupils of the last five years is 4.7. Is the GPA of the 22 pupils different from the population's GPA?

Pupil          1    2    3    4    5    6    7    8    9    10   11
Grade points   5    5.5  4.5  5    5    6    5    5    4.5  5    5
Pupil          12   13   14   15   16   17   18   19   20   21   22
Grade points   4.5  4.5  5.5  4    5    5    5.5  4.5  5.5  5    5.5
Mean = 5.0, Variance = 0.2045

Case 6: (For testing $\mu_1$ and $\mu_2$ when $\sigma$ or $\sigma^2$ is unknown and n < 30)
Q1: A researcher interested in employee satisfaction and productivity measured the number of units produced by employees at a plant before and after a company-wide pay raise occurred. The researcher hypothesized that production would be higher after the raise compared to before the raise. Assume that the difference scores are normally distributed and let $\alpha = 0.05$.
Participant   Before   After
1             7        7
2             4        5
3             8        9
4             8        9
5             6        6
6             6        6
7             5        5
8             5        4
9             7        7

Solution:
Step 1: Specification of hypothesis
H0: There is no difference in the number of units produced before and after the raise, or the number of units was higher before the raise.
H1: The number of units produced was higher after the raise. (One Tailed)
Step 2: Level of Significance
$\alpha = 0.05$ (5%). Standard deviation is unknown.
Step 3: Test Statistics
$t = \dfrac{\bar{D} - \mu_D}{S_D/\sqrt{n}}$ with n − 1 d.f., where D = Before − After
Step 4: Calculation
Calculate the difference scores and the intermediate numbers for the SS formula:

Participant   Before   After   Difference Score (D)   D²
1             7        7       0                      0
2             4        5       −1                     1
3             8        9       −1                     1
4             8        9       −1                     1
5             6        6       0                      0
6             6        6       0                      0
7             5        5       0                      0
8             5        4       +1                     1
9             7        7       0                      0
n = 9                          ΣD = −2                ΣD² = 4

$\bar{D} = -2/9 = -0.222$
$SS_D = \sum D^2 - \dfrac{(\sum D)^2}{n} = 4 - \dfrac{(-2)^2}{9} = 3.55$
$S_D = \sqrt{\dfrac{SS_D}{n-1}} = \sqrt{\dfrac{3.55}{9-1}} = 0.66$
Apply the formula:
$t_{cal} = \dfrac{\bar{D} - \mu_D}{S_D/\sqrt{n}} = \dfrac{-0.222 - 0}{0.66/\sqrt{9}} = \dfrac{-0.222}{0.222} = -0.999$
Step 5: Critical Region
Because D = Before − After, higher production after the raise corresponds to a negative mean difference, so reject H0 if $t_{cal} \leq -t_{tab}(n-1)$. Here $t_{tab} = 1.86$ (d.f. = 9 − 1 = 8), and $-0.999 > -1.86$.
Step 6: Conclusion
Since the calculated value of t does not fall below $-t_{tab}(n-1)$ (t does not lie in the critical region), we do not reject H0 at the 5% level of significance.

Q2: A sociologist is interested in the decay of long-term memory and compares the number of errors in memory that an individual made after 1 week and after 1 year for a specific crime event. Participants viewed a videotape of a bank robbery and were asked a number of specific questions about the video 1 week after viewing it. They were asked the same questions 1 year after seeing the video. The number of memory errors was recorded for each participant at each time period. The researchers asked whether or not there was a significant difference in the number of errors in the two time periods. Assume that the difference scores are normally distributed and let $\alpha = 0.05$.
Subject   One Week   One Year
1         5          7
2         4          5
3         6          9
4         8          9
5         6          6
6         5          6
7         4          5
8         5          4
9         7          7

Solution:
Step 1: Specification of hypothesis
H0: There is no difference in the number of errors made at 1 week and at 1 year.
H1: There is a difference in the number of errors made at 1 week and at 1 year. (Two Tailed)
Step 2: Level of Significance
$\alpha = 0.05$ (5%). Standard deviation is unknown.
Step 3: Test Statistics
$t = \dfrac{\bar{D} - \mu_D}{S_D/\sqrt{n}}$ with n − 1 d.f.
Step 4: Calculation
Calculate the difference scores and the intermediate numbers for the SS formula:

Subject   One Week   One Year   Difference Score (D)   D²
1         5          7          −2                     4
2         4          5          −1                     1
3         6          9          −3                     9
4         8          9          −1                     1
5         6          6          0                      0
6         5          6          −1                     1
7         4          5          −1                     1
8         5          4          +1                     1
9         7          7          0                      0
n = 9                           ΣD = −8                ΣD² = 18

$\bar{D} = -8/9 = -0.8888$
$SS_D = \sum D^2 - \dfrac{(\sum D)^2}{n} = 18 - \dfrac{(-8)^2}{9} = 10.88$
$S_D = \sqrt{\dfrac{SS_D}{n-1}} = \sqrt{\dfrac{10.88}{9-1}} = 1.166$
Apply the formula:
$t_{cal} = \dfrac{\bar{D} - \mu_D}{S_D/\sqrt{n}} = \dfrac{-0.888 - 0}{1.1666/\sqrt{9}} = -2.2855$
Step 5: Critical Region
Reject H0 if $|t_{cal}| \geq t_{tab}(n-1)$. Here $t_{tab} = 2.306$ (d.f. = 9 − 1 = 8), and $|-2.2855| < 2.306$.
Step 6: Conclusion
Since the absolute value of $t_{cal}$ does not exceed the critical value $t_{tab}(n-1)$ (t does not lie in the critical region), we do not reject H0 at the 5% level of significance.

Practice Questions
Q1: Suppose you are interested in developing a counseling technique to reduce stress within marriages. You randomly select two samples of married individuals out of ten churches in the association. You provide Group 1 with group counseling and study materials. You provide Group 2 with individual counseling and study materials. At the conclusion of the treatment period, you measure the level of marital stress in the group members.
Here are the scores: Group 1 Group 2 25 17 29 29 26 24 27 33 21 26 28 31 14 27 29 23 23 14 21 26 20 27 26 32 18 25 32 23 16 21 17 20 20 32 17 23 20 30 26 12 26 23 7 18 29 32 24 19

Q2: A professor wants to empirically measure the impact of stated course objectives and learning outcomes in one of her classes. The course is divided into four major units, each with an exam. For units I and III, she instructs the class to study lecture notes and reading assignments in preparation for the exams. For units II and IV, she provides clearly written instructional objectives. These objectives form the basis for the exams in units II and IV. Test scores from I and III are combined, as are scores from II and IV, giving a total possible score of 200 for the "with" and "without" objectives conditions. Do students achieve significantly better when learning and testing are tied together with objectives? ($\alpha = 0.01$) Here is a random sample of 10 scores from her class:

           Instructional Objectives
Subject    Without (I, III)   With (II, IV)
1          165                182
2          178                189
3          143                179
4          187                196
5          186                188
6          127                153
7          138                154
8          155                178
9          157                169
10         171                191
           ΣX = 1607          ΣY = 1779

Case 7: (For testing p when $\sigma$ or $\sigma^2$ is known)
Q1: Suppose that you interview 1000 exiting voters about whom they voted for as governor. Of the 1000 voters, 550 reported that they voted for the Democratic candidate. Is there sufficient evidence to suggest that the Democratic candidate will win the election at the .01 level?
Solution:
Step 1: Specification of hypothesis
H0: $p = 0.5$
H1: $p > 0.5$ (Upper Tail)
Step 2: Level of Significance
$\alpha = 0.01$ (1%)
Step 3: Test Statistics
$z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Step 4: Calculation
$\hat{p} = 550/1000 = 0.55$
$z_{cal} = \dfrac{0.55 - 0.5}{\sqrt{0.5\,(1 - 0.5)/1000}} = 3.16$
Step 5: Critical Region
Reject H0 if $z_{cal} > z_{tab}$. For an upper-tailed test at the 1% level, $z_{tab} = 2.33$, and $3.16 > 2.33$.
Step 6: Conclusion
Since the calculated value of z exceeds the critical value $z_{tab}$ (z lies in the critical region), we reject H0 at the 1% level of significance. So there is sufficient evidence to suggest that the Democratic candidate will win.

Q2: The CEO of a large electric utility claims that 80 percent of his 1,000,000 customers are very satisfied with the service they receive. To test this claim, the local newspaper surveyed 100 customers, using simple random sampling. Among the sampled customers, 73 percent say they are very satisfied. Based on these findings, can we reject the CEO's hypothesis that 80% of the customers are very satisfied? Use a 0.05 level of significance.
Step 1: Specification of hypothesis
H0: $p = 0.80$
H1: $p \neq 0.80$ (Two Tailed)
Step 2: Level of Significance
$\alpha = 0.05$ (5%)
Step 3: Test Statistics
$z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Step 4: Calculation
$z_{cal} = \dfrac{0.73 - 0.8}{\sqrt{0.8\,(0.2)/100}} = -1.75$
Step 5: Critical Region
Reject H0 if $|z_{cal}| \geq z_{tab}$. Here $z_{tab} = 1.96$, and $|-1.75| < 1.96$.
Step 6: Conclusion
The above calculation shows that 1.75 is not in the rejection region. Therefore we fail to reject H0.

Practice Questions
Q3: 1500 randomly selected pine trees were tested for traces of the Bark Beetle infestation. It was found that 153 of the trees showed such traces. Test the hypothesis that more than 10% of the Tahoe trees have been infested. (Use a 5% level of significance.)
Q4: Suppose the CEO claims that at least 80 percent of the company's 1,000,000 customers are very satisfied. Again, 100 customers are surveyed using simple random sampling. The result: 73 percent are very satisfied.
Based on these results, should we accept or reject the CEO's hypothesis? Assume a significance level of 0.05.

Case 8: (For testing $p_1$ and $p_2$ when $\sigma$ or $\sigma^2$ is known)
Q1: Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The company states that the drug is equally effective for men and women. To test this claim, they choose a simple random sample of 100 women and 200 men from a population of 100,000 volunteers. At the end of the study, 38% of the women caught a cold, and 51% of the men caught a cold. Based on these findings, can we reject the company's claim that the drug is equally effective for men and women? Use a 0.05 level of significance.
Solution:
Step 1: Specification of hypothesis
H0: $p_1 = p_2$
H1: $p_1 \neq p_2$ (Two Tailed)
Step 2: Level of Significance
$\alpha = 0.05$ (5%)
Step 3: Test Statistics
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Step 4: Calculation
$\hat{p} = \dfrac{\hat{p}_1 n_1 + \hat{p}_2 n_2}{n_1 + n_2} = \dfrac{0.38\,(100) + 0.51\,(200)}{100 + 200} = \dfrac{140}{300} = 0.467$
$SE = \sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{0.467 \times 0.533 \times \left(\dfrac{1}{100} + \dfrac{1}{200}\right)} = \sqrt{0.003733} = 0.061$
$z_{cal} = \dfrac{0.38 - 0.51}{0.061} = -2.13$
Step 5: Critical Region
Reject H0 if $|z_{cal}| \geq z_{tab}$. For a two-tailed test at the 5% level, $z_{tab} = 1.96$, and $2.13 \geq 1.96$.
Step 6: Conclusion
The above calculation shows that 2.13 is in the rejection region. Therefore we reject H0.

Q2: Suppose the previous example is stated a little bit differently. Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The company states that the drug is more effective for women than for men. To test this claim, they choose a simple random sample of 100 women and 200 men from a population of 100,000 volunteers.
Step 1: Specification of hypothesis
H0: $p_1 = p_2$
H1: $p_1 < p_2$ (Lower Tail)
Step 2: Level of Significance
$\alpha = 0.05$ (5%)
Step 3: Test Statistics
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Step 4: Calculation
$\hat{p} = \dfrac{\hat{p}_1 n_1 + \hat{p}_2 n_2}{n_1 + n_2} = \dfrac{(0.38 \times 100) + (0.51 \times 200)}{100 + 200} = \dfrac{140}{300} = 0.467$
$SE = \sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{0.467 \times 0.533 \times \left(\dfrac{1}{100} + \dfrac{1}{200}\right)} = \sqrt{0.003733} = 0.061$
$z_{cal} = \dfrac{0.38 - 0.51}{0.061} = -2.13$
Step 5: Critical Region
Reject H0 if $z_{cal} < -z_{tab}$. For a lower-tailed test at the 5% level, $z_{tab} = 1.645$, and $-2.13 < -1.645$.
Step 6: Conclusion
The above calculation shows that −2.13 is in the rejection region. Therefore we reject H0.

Practice Questions
Q1: Consider a production process that produced 10,000 widgets in January and experienced a total of 100 rejected widgets after a quality control inspection (i.e., failure rate = 0.01, success rate = 0.99). A Six Sigma project was deployed to fix this problem and by March the improvement plan was in place. In April, the process produced 8,000 widgets and
Q2: Researchers want to test the effectiveness of a new anti-anxiety medication. In clinical testing, 64 out of 200 people taking the medication report symptoms of anxiety. Of the people receiving a placebo, 92 out of 200 report symptoms of anxiety. Is the medication working any differently than the placebo? Test this claim using $\alpha = 0.05$.

CASE 9: (For testing one variance)
Q1: A cigarette manufacturer wishes to test the claim that the variance of the nicotine content of its cigarettes is 0.644. Nicotine content is measured in milligrams and is assumed normally distributed. A sample of 20 cigarettes has a standard deviation of 1.00 milligram. At $\alpha = 0.05$, is there enough evidence to reject the manufacturer's claim?
Solution:
Step 1: Specification of hypothesis
H0: $\sigma^2 = 0.644$
H1: $\sigma^2 \neq 0.644$ (Two Tailed)
Step 2: Level of significance
$\alpha = 0.05$
Step 3: Test statistics
$\chi^2 = \dfrac{(n-1)\,s^2}{\sigma^2}$ with n − 1 d.f.
Step 4: Calculation
$\chi^2_{cal} = \dfrac{(20 - 1)(1.0)^2}{0.644} = 29.5$ with 19 d.f.
Step 5: Critical region
Reject H0 if $\chi^2_{cal} \geq \chi^2_{0.025}(n-1)$ or $\chi^2_{cal} \leq \chi^2_{0.975}(n-1)$. Here $\chi^2_{0.025}(19) = 32.852$ and $\chi^2_{0.975}(19) = 8.907$, and $8.907 < 29.5 < 32.852$.
Step 6: Conclusion
We fail to reject H0, i.e. we do not have enough evidence to reject the manufacturer's claim that the variance of the nicotine content of the cigarettes is equal to 0.644.

Q2: A pharmaceutical company is considering the purchase of new bottling machines to increase efficiency. The factory currently makes use of machines that fill cough syrup bottles whose volume of medicine has a standard deviation of 1.6 mL. The new machine they are considering was tested on 30 bottles, producing a batch with a standard deviation of 1.25 mL. Does this machine produce a standard deviation less than 1.6 mL at the 0.05 significance level?
Solution:
Step 1: Specification of hypothesis
H0: $\sigma^2 \geq 2.56$
H1: $\sigma^2 < 2.56$ (Lower Tail)
Step 2: Level of significance
$\alpha = 0.05$
Step 3: Test statistics
$\chi^2 = \dfrac{(n-1)\,s^2}{\sigma^2}$ with n − 1 d.f.
Step 4: Calculation
$\chi^2_{cal} = \dfrac{(30 - 1)(1.25)^2}{2.56} = 17.700$ with 29 d.f.
Step 5: Critical region
Reject H0 if $\chi^2_{cal} < \chi^2_{tab}(n-1)$, where for a lower-tailed test the tabulated value is the one with area 0.95 to its right. Here $\chi^2_{0.95}(29) = 17.708$, and $17.700 < 17.708$.
Step 6: Conclusion
We reject H0, i.e. this machine produces a variance less than 2.56 (a standard deviation less than 1.6 mL), although the calculated value lies only just inside the critical region.

Practice Questions
Q1: In a study in which the subjects were 15 patients suffering from pulmonary sarcoid disease, blood gas determinations were made. The variance of the sample was 450. Test the hypothesis that the population variance is less than 250.
Q2: A nutritionist claims that the standard deviation of the number of calories in 1 tablespoon of the major brands of pancake syrup is 60. A sample of major brands of syrup is selected, and the number of calories is shown.
At $\alpha = 0.10$, can the claim be rejected?
53 210 100 200 100 220 210 100 240 200 100 210 100 210 100 210 100 60

CASE 10: (For testing ratio of variances)
Q1: The variability in the amount of impurities present in a batch of chemicals used for a particular process depends on the length of time that the process is in operation. Suppose a sample of size 25 is drawn from the normal process which is to be compared to a sample of a new process that has been developed to reduce the variability of impurities.

        Sample 1   Sample 2
n       25         25
s²      1.04       0.51

Solution:
Step 1: Specification of hypothesis
H0: $\sigma_1^2 = \sigma_2^2$
H1: $\sigma_1^2 > \sigma_2^2$ (Upper Tail)
Step 2: Level of significance
$\alpha = 0.05$
Step 3: Test statistics
$F = \dfrac{s_1^2}{s_2^2}$ with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ d.f.
Step 4: Calculation
$F_{cal} = \dfrac{1.04}{0.51} = 2.04$ with 24 and 24 d.f.
Step 5: Critical region
Reject H0 if $F_{cal} > F_{tab}(\nu_1, \nu_2)$. Here $F_{tab}(24, 24) = 1.9838$, and $2.04 > 1.9838$.
Step 6: Conclusion
We reject H0 and conclude that the variability in the new process (Sample 2) is less than the variability in the original process (Sample 1).

Q2: A math test is given in two classrooms. In the first classroom (21 students) the mean was 84.3 and the variance was 16.8. In the second classroom (16 students) the mean was 83.7 with a variance of 42.6. Are the two classroom variances different?
Solution:
Step 1: Specification of hypothesis
Take the classroom with the larger sample variance (the second classroom) as population 1, so that the F ratio is at least 1.
H0: $\sigma_1^2 = \sigma_2^2$
H1: $\sigma_1^2 > \sigma_2^2$ (Upper Tail)
Step 2: Level of significance
$\alpha = 0.05$
Step 3: Test statistics
$F = \dfrac{s_1^2}{s_2^2}$ with $\nu_1$ and $\nu_2$ d.f.
Step 4: Calculation
$F_{cal} = \dfrac{42.6}{16.8} = 2.54$ with 15 and 20 d.f.
Step 5: Critical region
Reject H0 if $F_{cal} > F_{tab}(\nu_1, \nu_2)$. Here $F_{tab}(15, 20) = 2.2033$, and $2.54 > 2.2033$.
Step 6: Conclusion
We reject H0 and conclude that the variance in the second classroom is significantly larger, so the two classroom variances are significantly different.

Practice Questions
Q1: A tire manufacturer claims that the variance of the diameters in a certain tire model is 8.6.
A random sample of ten tires has a variance of 4.3. At $\alpha = 0.01$, is there enough evidence to reject the manufacturer's claim? Assume the population is normally distributed.
Q2: A manufacturer wishes to determine whether there is less variability in the silver plating done by Company 1 than that done by Company 2. Independent random samples yield the following results. Do the populations have different variances?

CASE 11: (For testing $\alpha$)
Q1:
H0: $\alpha = 0$
H1: $\alpha \neq 0$ (Two Tailed)
Level of significance = 5%
$t_{cal} = \dfrac{a - \alpha}{S.E_a} = \dfrac{5.8253 - 0}{0.82702} = 7.0437$
Reject H0 if $|t_{cal}| \geq t_{tab}$, where $t_{tab} = t_{\alpha/2}(\nu) = t_{0.025}(16) = 2.120$.
$|7.0437| \geq 2.120$, so we reject H0.

CASE 12: (For testing $\beta$)
Q1:
H0: $\beta = 0$
H1: $\beta \neq 0$ (Two Tailed)
Level of significance = 5%
$t_{cal} = \dfrac{b - \beta}{S.E_b} = \dfrac{0.5676 - 0}{0.0183} = 31.01$
Reject H0 if $|t_{cal}| \geq t_{tab}$, where $t_{tab} = t_{\alpha/2}(\nu) = t_{0.025}(16) = 2.120$.
$|31.01| \geq 2.120$, so we reject H0.

GOODNESS OF FIT TEST (PROBLEMS)
Q1: If we toss a die 150 times and find that we have the following distribution of rolls, is the die fair?

Face           1    2    3    4    5    6
No. of rolls   22   21   22   27   22   36

Solution:
Step 1: Specification of hypothesis
H0: The distribution is binomial with n = 6, where p is not specified and is estimated from the data.
H1: The distribution is not binomial.
Step 2: Level of significance
$\alpha = 0.05$
Step 3: Test Statistics
$\chi^2_{cal} = \sum_{i=1}^{n} \dfrac{(o_i - e_i)^2}{e_i}$ with n − 1 − k d.f., where n = no. of categories and k = no. of estimated parameters
Step 4: Calculation
We have to fit a binomial distribution to the data and then compare observed and expected frequencies. The mean of the observed distribution gives the estimate of p:
$\mu = \dfrac{\sum f\,x}{\sum f} = \dfrac{564}{150} = 3.76 = np$
$p = 3.76/6 = 0.6267$ and $q = 1 - p = 1 - 0.626 = 0.374$
The expected frequencies are $e_i = p_i \cdot \sum f$ with $p_i = \binom{n}{x} p^x q^{n-x}$:

Face (x)   No. of rolls (f)   f·x           p_i      e_i     (o_i − e_i)²/e_i
1          22                 22            0.0275   4.125   77.45
2          21                 42            0.115    17.25   0.815
3          22                 66            0.257    38.55   7.105
4          27                 108           0.322    48.33   9.41
5          22                 110           0.216    32.4    3.338
6          36                 216           0.0606   9.09    79.66
Total      Σf = 150           Σf·x = 564                     χ²_cal = 177.78

Step 5: Critical region
Reject H0 if $\chi^2_{cal} \geq \chi^2_{tab}(n-1-k)$. With n − 1 − k = 6 − 1 − 1 = 4 d.f., $\chi^2_{tab}(4) = 9.488$, and $177.78 \geq 9.488$.
Step 6: Conclusion
We reject H0, i.e. the distribution is not binomial.

Practice Questions
Q1: The letter distribution of the 5 most popular letters in the English language is known to be approximately

letter   E    T    N    R    O
freq.    29   21   17   17   16

That is, when E, T, N, R, O appear, on average 29 times out of 100 it is an E and not one of the other 4. This information is useful in cryptography to break some basic secret codes. Suppose a text is analyzed and the numbers of E, T, N, R and O's are counted. The following distribution is found:

letter   E     T     N    R    O
freq.    100   110   80   55   14

Do a chi-square goodness of fit hypothesis test to see if the letter proportions for this text are pE = .29, pT = .21, pN = .17, pR = .17, pO = .16 or are different.
Q2: A new casino game involves rolling 3 dice. The winnings are directly proportional to the total number of sixes rolled. Suppose a gambler plays the game 100 times, with the following observed counts:

Number of Sixes   0    1    2    3
Number of Rolls   48   35   15   3

The casino becomes suspicious of the gambler and wishes to determine whether the dice are fair. What do they conclude?

TEST OF INDEPENDENCE (PROBLEMS)
Q1: Suppose you have the following categorical data set.
Table. Incidence of three types of malaria in three tropical regions.

            Asia   Africa   South America   Totals
Malaria A   31     14       45              90
Malaria B   2      5        53              60
Malaria C   53     45       2               100
Totals      86     64       100             250

Step 1: Specification of hypothesis
H0: No relationship between location and type of malaria.
H1: Relationship between location and type of malaria.
Step 2: Level of significance
$\alpha = 0.05$
Step 3: Test statistics
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$ with (c − 1)(r − 1) d.f.
Step 4: Calculation
We could now set up the following table:

Observed   Expected   |O − E|   (O − E)²   (O − E)²/E
31         30.96      0.04      0.0016     0.0000516
14         23.04      9.04      81.72      3.546
45         36.00      9.00      81.00      2.25
2          20.64      18.64     347.45     16.83
5          15.36      10.36     107.33     6.99
53         24.00      29.00     841.00     35.04
53         34.40      18.60     345.96     10.06
45         25.60      19.40     376.36     14.70
2          40.00      38.00     1444.00    36.10

Chi Square = 125.516
Degrees of Freedom = (c − 1)(r − 1) = 2(2) = 4
Step 5: Critical region
Reject H0 if $\chi^2_{cal} \geq \chi^2_{tab}\big((c-1)(r-1)\big)$. Here $\chi^2_{tab}(4) = 9.488$, and $125.516 \geq 9.488$.
Step 6: Conclusion
Thus, we reject the null hypothesis that there is no relationship between location and type of malaria.

Q2: Suppose you conducted a drug trial on a group of animals and you hypothesized that the animals receiving the drug would show increased heart rates compared to those that did not receive the drug. You conduct the study and collect the following data:

              Heart Rate Increased   No Heart Rate Increase   Total
Treated       36                     14                       50
Not treated   30                     25                       55
Total         66                     39                       105

Step 1: Specification of hypothesis
H0: The proportion of animals whose heart rate increased is independent of drug treatment.
H1: The proportion of animals whose heart rate increased is associated with drug treatment. (Upper Tail)
Step 2: Level of significance
$\alpha = 0.05$
Step 3: Test statistics
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$ with (c − 1)(r − 1) d.f.
Step 4: Calculation

Observed   Expected   |O − E|   (O − E)²   (O − E)²/E
36         31.42      4.58      20.976     0.667
14         18.57      4.57      20.884     1.124
30         34.57      4.57      20.884     0.604
25         20.42      4.58      20.976     1.027

Chi square = 3.422
Degrees of Freedom = (2 − 1) × (2 − 1) = 1
Step 5: Critical region
Reject H0 if $\chi^2_{cal} \geq \chi^2_{tab}\big((c-1)(r-1)\big)$. Here $\chi^2_{tab}(1) = 3.841$, and $3.422 < 3.841$.
Step 6: Conclusion
The calculated $\chi^2$ value is 3.422, which is less than the table value of 3.841. So we do not reject H0.
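
The expected counts and the chi-square statistic for a contingency table such as the heart-rate example can be reproduced in a few lines. The following is a minimal sketch in plain Python, not part of the original notes: the function name `chi_square_independence` is hypothetical, and the tabulated value 3.841 is simply the one quoted in the text for 1 d.f. at the 5% level.

```python
def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count for each cell = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Rows: treated, not treated. Columns: heart rate increased, no increase.
observed = [[36, 14], [30, 25]]
chi2_cal, df, expected = chi_square_independence(observed)

chi2_tab = 3.841  # table value for alpha = 0.05 and 1 d.f., as quoted in the text
print(f"chi-square = {chi2_cal:.3f} with {df} d.f.")
print("reject H0" if chi2_cal >= chi2_tab else "do not reject H0")
```

Because the script keeps the expected counts unrounded, its statistic can differ in the third decimal place from a hand calculation that rounds each expected count to two decimals; the decision is unaffected here.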
Practice Questions
Q1: In a certain town, there are about one million eligible voters. A simple random sample of 10000 eligible voters was chosen to study the relationship between sex and participation in the last election. The results are summarized in the following 2×2 (read two by two) contingency table:

              Men    Women
Voted         2792   3591
Didn't vote   1486   2131

We want to check whether being a man or a woman (columns) is independent of having voted in the last election (rows). In other words, are sex and voting independent?
Q2: Each respondent in the Current Population Survey of March 1993 was classified as employed, unemployed, or outside the labor force. The results for men in California aged 35-44 can be cross-tabulated by marital status, as follows:

                     Married   Widowed, divorced, or separated   Never married
Employed             679       103                               114
Unemployed           63        10                                20
Not in labor force   42        18                                25

Men of different marital status seem to have different distributions of labor force status. Or is this just chance variation? (You may assume the table results from a simple random sample.)

TEST FOR HOMOGENEITY (PROBLEMS)
Q1: We selected a random sample of 20 males from the population of males in the school and another, independent, random sample of 16 females from the population of females in the school. Within each sample we classify the students as Democrat, Republican and Independent.
Do a chi-square test of homogeneity to see whether political party preference differs by gender, using the data given below:

           Democrat   Republican   Independent   Totals
Male          11           7            2           20
Female         7           8            1           16
Totals        18          15            3           36

Step 1: Specification of hypothesis (upper-tailed test)
H0: Political party preference is independent of gender
H1: Political party preference is dependent on gender

Step 2: Level of significance
α = 0.05

Step 3: Test statistic
$$\chi^2_{cal} = \frac{N^2}{AB}\left[\sum \frac{a_i^2}{c_i} - \frac{A^2}{N}\right]$$
with n − 1 d.f., where A and B are the two sample sizes (A = 20 males, B = 16 females), N = A + B, a_i are the frequencies in the first sample, c_i are the column totals and n is the number of categories.

Step 4: Calculation

a_i    a_i²    a_i²/c_i
11     121     6.722
 7      49     3.267
 2       4     1.333
               Σ = 11.322

$$\chi^2_{cal} = \frac{36^2}{20(16)}\left[11.322 - \frac{400}{36}\right] = 4.05\,(11.322 - 11.111) = 0.855$$

Step 5: Critical region
Reject H0 if χ²cal ≥ χ²tab with (n − 1) = 2 d.f.
Here χ²tab = 5.991, and 0.855 < 5.991.

Step 6: Conclusion
We fail to reject H0, i.e. political party preference is independent of gender.

Q2: In a study of the television viewing habits of children, a developmental psychologist selects a random sample of 300 first graders: 100 boys and 200 girls. Each child is asked which of the following TV programs they like best: The Lone Ranger, Sesame Street, or The Simpsons. Results are shown in the contingency table below.

                      Viewing Preferences
                Lone      Sesame    The         Row
                Ranger    Street    Simpsons    total
Boys              50        30        20         100
Girls             50        80        70         200
Column total     100       110        90         300

Do the boys' preferences for these TV programs differ significantly from the girls' preferences? Use a 0.05 level of significance.
Step 1: Specification of hypothesis (upper-tailed test)
H0: Boys' preferences for these TV programs do not differ from girls' preferences
H1: Boys' preferences for these TV programs differ from girls' preferences

Step 2: Level of significance
α = 0.05

Step 3: Test statistic
$$\chi^2_{cal} = \frac{N^2}{AB}\left[\sum \frac{a_i^2}{c_i} - \frac{A^2}{N}\right]$$
with n − 1 d.f., where A = 100 (boys), B = 200 (girls), N = 300, a_i are the boys' frequencies, c_i are the column totals and n is the number of categories.

Step 4: Calculation

a_i    a_i²    a_i²/c_i
50     2500    25.000
30      900     8.182
20      400     4.444
               Σ = 37.626

$$\chi^2_{cal} = \frac{300^2}{100(200)}\left[37.626 - \frac{10000}{300}\right] = 4.5\,(37.626 - 33.333) = 19.318$$

Step 5: Critical region
Reject H0 if χ²cal ≥ χ²tab with (n − 1) = 2 d.f.
Here χ²tab = 5.991, and 19.318 ≥ 5.991.

Step 6: Conclusion
We reject H0, i.e. boys' preferences for these TV programs differ from girls' preferences.

Practice Questions

Q1: A survey of drivers was taken to see if they had been in an accident during the previous year, and if so whether it was a minor or major accident. The results are tabulated by age group:

                  Accident Type
AGE          None    Minor    Major
under 18      67      10        5
18-25         42       6        5
26-40         75       8        4
40-65         56       4        6
over 65       57      15        1

Do a chi-square hypothesis test of homogeneity to see if there is a difference in distributions based on age.

Q2: To determine if there was an association between race and opinions about schools, researchers surveyed 3 randomly selected groups of parents and asked them "Are high schools in your state doing an excellent, good, fair or poor job or don't you know enough to say?"

FISHER'S EXACT TEST (PROBLEMS)

Q1: Use Fisher's exact test to test the hypothesis that inoculation is independent of immunity from attack among a population exposed to a certain disease.

                Not inoculated   Inoculated
Not attacked          3              5
Attacked             10              2

Solution:

Step 1: Specification of hypothesis (two-tailed test)
H0: Inoculation is independent of immunity
H1: Inoculation is dependent on immunity

Step 2: Level of significance
α = 0.05

Step 3: Test statistic
$$P = \frac{(a+b)!\,(a+c)!\,(c+d)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$
where a, b, c, d are the four cell frequencies and n = a + b + c + d.

Step 4: Calculations

TABLE 0
                Not inoculated   Inoculated   Total
Not attacked          3              5           8
Attacked             10              2          12
Total                13              7          20

P0 = 0.0477

TABLE 1
                Not inoculated   Inoculated   Total
Not attacked      3 − 1 = 2      5 + 1 = 6       8
Attacked         10 + 1 = 11     2 − 1 = 1      12
Total                13              7          20

P1 = 0.0043

TABLE 2
                Not inoculated   Inoculated   Total
Not attacked      3 − 2 = 1      5 + 2 = 7       8
Attacked         10 + 2 = 12     2 − 2 = 0      12
Total                13              7          20

P2 = 0.0001

Step 5: Critical region
Reject H0 if 2(Grand P) ≤ α, where
Grand P = P0 + P1 + P2 = 0.0477 + 0.0043 + 0.0001 = 0.0521
So 2(Grand P) = 0.1042

Step 6: Conclusion
As 2(Grand P) = 0.1042 is greater than α = 0.05, we fail to reject H0, i.e. the data do not provide sufficient evidence that inoculation and immunity are dependent.

Practice Questions

Q1: Suppose, in a fictitious experiment, 4 subjects in an Experimental Group and 4 subjects in a Control Group are asked to solve an anagram problem. Three of the 4 subjects in the Experimental Group and none of the subjects in the Control Group solved the problem. The table below shows the results in a contingency table.

                 Experimental   Control   Total
Solved                3            0        3
Did Not Solve         1            4        5
Total                 4            4        8

Perform Fisher's exact test.

ONE WAY ANOVA PROBLEM

A study compared the number of hours of relief provided by five different brands of antacid administered to 25 different people, each with stomach acid considered strong.
The results are given below:

          A        B        C        D        E       TOTAL
         4.4      5.8      4.8      2.9      4.6
         4.6      5.2      5.9      2.7      4.3
         4.5      4.9      4.9      2.9      3.8
         4.1      4.7      4.6      3.9      5.2
         3.8      4.6      4.3      4.3      4.4
Tj      21.4     25.2     24.5     16.7     22.3     110.1
Tj²    457.96   635.04   600.25   278.89   497.29   2469.43
∑xij²   92.02   127.94   121.51    57.81   100.49    499.77

C.F. = T..² / (n × r) = (110.1)² / 25 = 484.88
TSS = ∑∑xij² − C.F. = 499.77 − 484.88 = 14.89
BSS = ∑Tj² / r − C.F. = 2469.43/5 − 484.88 = 9.006
WSS = TSS − BSS = 14.890 − 9.006 = 5.884

With 5 brands and 25 observations, the between-samples d.f. are 5 − 1 = 4, the within-samples d.f. are 25 − 5 = 20 and the total d.f. are 25 − 1 = 24.

SOV               SS       d.f.    MS       F ratio
Between samples    9.006     4     2.251     7.65
Within samples     5.884    20     0.2942
Total             14.890    24

Step 1: Specification of hypothesis
H0: µ1 = µ2 = µ3 = µ4 = µ5
H1: Not all of the brand means are equal

Step 2: Level of significance
α = 0.05

Step 3: Test statistic
F = MSS / MSE with d.f. (ν1, ν2)

Step 4: Calculation
Fcal = 2.251 / 0.2942 = 7.65

Step 5: Critical region
Reject H0 if Fcal ≥ Ftab (ν1, ν2)
Here Ftab (4, 20) = 2.87, and 7.65 ≥ 2.87.

Step 6: Conclusion
We reject H0, i.e. the brands differ in the mean number of hours of relief they provide.

Practice Questions

Q1: Three training methods were compared to see whether they lead to greater productivity after training. The following are productivity measures for individuals trained by each method.

Method 1    45   40   50   39   53   44
Method 2    59   43   47   51   39   49
Method 3    41   37   43   40   52   37

At the 0.05 level of significance, do the three training methods lead to different levels of productivity?

Q2: A research study was conducted to examine the clinical efficacy of a new antidepressant. Depressed patients were randomly assigned to one of three groups: a placebo group, a group that received a low dose of the drug, and a group that received a moderate dose of the drug. After four weeks of treatment, the patients completed the Beck Depression Inventory. The higher the score, the more depressed the patient. The data are presented below. Compute the appropriate test.
Placebo    Low Dose    Moderate Dose
  38          22            14
  47          19            26
  39           8            11
  25          23            18
  42          31             5

TWO WAY ANOVA PROBLEMS

Q1: When a restaurant server writes a friendly note or draws a "happy face" on your restaurant check, is this just a friendly act or is there a financial incentive? Psychologists conducted a randomized experiment to investigate whether drawing a happy face on the back of a restaurant bill increased the average tip given to the server. One female server and one male server in a Philadelphia restaurant either did or did not draw a happy face on checks during the experiment. In all they drew happy faces on 45 checks and did not draw happy faces on 44 checks. The sequence of drawing the happy faces or not was random. Complete the following two-way ANOVA table, then perform the appropriate F tests for main effects and interaction and state your conclusions.

Source         DF     SS         MS        F
Message        ---    14.7       ---       ---
Gender         ---    ---        2602.0    ---
Interaction    ---    438.7      ---       ---
Error          ---    ---        109.8     ---
Total          ---    12407.9    ---       ---

Solution:

The completed table is:

Source         DF     SS         MS        F
Message         1     14.7       14.7      0.134
Gender          1     2602.0     2602.0    23.7
Interaction     1     438.7      438.7     4.0
Error          85     9333.0     109.8     -
Total          88     12407.9    -         -

TEST 1

Step 1: Specification of hypothesis
H0: No main effect of message
H1: A main effect of message exists

Step 2: Level of significance
α = 0.05

Step 3: Test statistic
F0 = MSS / MSE

Step 4: Calculation
F0 = 14.7/109.8 = 0.134 with numerator degrees of freedom 1 and denominator degrees of freedom 85.

Step 5: Conclusion
This test statistic corresponds to a p-value of 0.7152. We do not have any evidence to reject the null hypothesis that there is no main effect of message on the average amount a server gets tipped.
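The degrees of freedom and F ratios in the completed two-way ANOVA table can be checked with a short Python sketch. The dictionary and variable names are illustrative only; the mean squares are the values given in the problem, and each source of variation is assumed to have 1 d.f. because both factors (message, gender) have two levels.

```python
# Rebuild the d.f. column and F ratios of the happy-face ANOVA table.
n_checks = 45 + 44                 # total number of restaurant checks
df_total = n_checks - 1            # 88
df_effect = 1                      # two levels per factor -> (2 - 1) = 1 d.f. each
df_error = df_total - 3 * df_effect  # message, gender and interaction removed

ms_error = 109.8                   # mean square for error, from the table
mean_squares = {"Message": 14.7, "Gender": 2602.0, "Interaction": 438.7}

# Each F ratio is the effect mean square divided by the error mean square.
f_ratios = {source: ms / ms_error for source, ms in mean_squares.items()}

for source, f in f_ratios.items():
    print(f"{source:<12} F = {f:.3f} with d.f. ({df_effect}, {df_error})")
```

The p-values quoted in the solution are not reproduced here, since that would need an F-distribution routine from a statistics library.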
TEST 2

Step 1: Specification of hypothesis
H0: No main effect of gender
H1: A main effect of gender exists

Step 2: Level of significance
α = 0.05

Step 3: Test statistic
F0 = MSS / MSE

Step 4: Calculation
F0 = 2602/109.8 = 23.7 with numerator d.f. = 1 and denominator d.f. = 85.

Step 5: Conclusion
This test statistic corresponds to a p-value of less than 0.0001. We have very strong evidence that a main effect of gender does exist.

TEST 3

Step 1: Specification of hypothesis
H0: No interaction effect between gender and message
H1: An interaction effect between gender and message exists

Step 2: Level of significance
α = 0.05

Step 3: Test statistic
F0 = MSS / MSE

Step 4: Calculation
F0 = 438.7/109.8 = 4.0 with numerator d.f. = 1 and denominator d.f. = 85.

Step 5: Conclusion
This test statistic corresponds to a p-value of 0.0487. We have evidence to reject the null hypothesis of no interaction at the α = 0.05 level. We have reason to believe that there is an interaction effect.

Practice Questions

Q1: As a budding psychologist, you wonder whether you can teach old dogs new tricks. So you go to the pound and adopt 15 old dogs and 15 puppies. Then you attempt to teach each of the 30 dogs one of the standard dog tricks, "sit", "stay", and "roll over." Teaching only one trick to each dog, you keep a record of how many days it takes before they learn the tricks. The results of your experiment are listed in the table below. Use that data to conduct a two-way analysis of variance to determine if old dogs can learn new tricks.
                              Type of Trick
                     "Sit"         "Shake"       "Roll Over"
                     (Column 1)    (Column 2)    (Column 3)
Puppies (Row 1)          2             4              6
                         1             5              9
                         3             4              7
                         1             6              8
                         2             7             10
Old Dogs (Row 2)         2             9             13
                         5            10             12
                         2            11             15
                         4            13             17
                         3             7             13

STATISTICAL INFERENCE - ESTIMATION

"Statistical inference is the art of drawing conclusions or inferences about the population from the limited information contained in the sample."

Statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation. The term is used to describe systems of procedures that can be used to draw conclusions from datasets arising from systems affected by random variation, such as observational errors, random sampling, or random experimentation.

Inferential Statistics

Inferential statistics is concerned with making predictions or inferences about a population from observations and analyses of a sample.
That is, we can take the results of an analysis using a sample and generalize them to the larger population that the sample represents. In order to do this, however, it is imperative that the sample is representative of the group to which the results are being generalized. To address this issue of generalization, we have tests of significance. A chi-square test or t-test, for example, tells us the probability that the results of our analysis could have occurred by chance when there is no relationship at all between the variables we studied in the population we studied. Examples of inferential statistics include linear regression analyses, logistic regression analyses, ANOVA, correlation analyses, structural equation modeling, and survival analysis, to name a few.

Two important areas of statistical inference are estimation and testing of hypotheses.

Hypothesis: A statistical hypothesis is a claim (assertion, statement, belief or assumption) about an unknown population parameter value. For example, an investment company may claim that the average return across all its investments is 20 percent. To test such claims, sample data are collected and analyzed. On the basis of the sample findings, the hypothesized value of the population parameter is accepted or rejected.

Estimation: The method used to estimate the value of a population parameter from the value of the corresponding sample statistic. For example, a company may need to estimate consumer awareness of its products. In such cases the decision maker needs the following concepts, which are useful for drawing statistical inferences about the unknown value of a population or process parameter.
The procedures for making judgements about a population parameter from a sample statistic are referred to as statistical estimation, or simply estimation. Estimation is further divided into point estimation and interval estimation.

POINT ESTIMATION
The object of point estimation is to obtain a single number from the sample which will represent the unknown value of the population parameter. Population parameters (population mean, population variance, population proportion etc.) are estimated from the corresponding sample statistics (sample mean, sample variance, sample proportion). A statistic used to estimate a population parameter is called a point estimator, or simply an estimator, and the specific numerical value which we obtain for an estimator in a given problem is called an estimate.

CRITERIA FOR POINT ESTIMATORS:
A point estimator is considered a good estimator if it satisfies the following criteria:
Unbiasedness
Consistency
Efficiency
Sufficiency

UNBIASEDNESS
An estimator is defined to be unbiased if the statistic used as an estimator has its expected value equal to the true value of the population parameter being estimated.

CONSISTENCY
An estimator is said to be consistent if the statistic used as an estimator becomes closer and closer to the population parameter being estimated as the sample size n increases.

EFFICIENCY
An unbiased estimator is defined to be efficient if the variance of its sampling distribution is smaller than that of the sampling distribution of any other unbiased estimator of the same parameter.

SUFFICIENCY
An estimator is defined to be sufficient if the statistic used as an estimator uses all the information that is contained in the sample. Any statistic that is not computed from all values in the sample is not a sufficient estimator.

CONFIDENCE INTERVAL
A point estimator (e.g. a sample mean) calculated from sample data provides a single number as an estimate of the population parameter.
A point estimator cannot be expected to be exactly equal to the population parameter. For example, the mean of a sample taken from a population may assume different values for different samples, so a sample mean obtained from one sample will generally not be exactly equal to the population mean. We therefore estimate an interval of values within which the population parameter may be expected to lie with a certain degree of confidence. Estimating a population parameter by a range of values is known as interval estimation, and the interval (a, b) that will include the population parameter with a high probability (e.g. 0.90, 0.95 or 0.99) is known as a confidence interval. A 90%, 95% or 99% confidence interval means that we are 90%, 95% or 99% confident that our computed interval does in fact contain the unknown population parameter.

The limits a and b are called the lower and upper confidence limits of the interval; the probability 0.90, 0.95 or 0.99 is called the confidence coefficient or level of confidence and is denoted by 1 − α. The probability that the interval does not contain the parameter is denoted by α. Since the total area under the probability curve is one, the level of confidence is always equal to 1 − α.

$$\Pr\left[-Z_{\alpha/2} \le Z \le Z_{\alpha/2}\right] = 1-\alpha$$
$$\Pr\left[-Z_{\alpha/2} \le \frac{\bar{x}-\mu}{S.E} \le Z_{\alpha/2}\right] = 1-\alpha$$
$$\Pr\left[-Z_{\alpha/2}\,S.E \le \bar{x}-\mu \le Z_{\alpha/2}\,S.E\right] = 1-\alpha$$
$$\Pr\left[-\bar{x}-Z_{\alpha/2}\,S.E \le -\mu \le -\bar{x}+Z_{\alpha/2}\,S.E\right] = 1-\alpha$$
$$\Pr\left[\bar{x}+Z_{\alpha/2}\,S.E \ge \mu \ge \bar{x}-Z_{\alpha/2}\,S.E\right] = 1-\alpha$$
$$\Pr\left[\bar{x}-Z_{\alpha/2}\,S.E \le \mu \le \bar{x}+Z_{\alpha/2}\,S.E\right] = 1-\alpha$$

In short, the 100(1 − α)% C.I. for µ when σ is known, with S.E = σ/√n, can be written as
$$\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

TABLE:
CONFIDENCE INTERVAL    Z VALUE
90%                    1.65
95%                    1.96
99%                    2.58
99.9%                  3.291

CONFIDENCE INTERVAL FOR POPULATION MEAN WHEN POPULATION VARIANCE IS KNOWN
We want to construct a 95% confidence interval for the population mean when the population variance is known. Let x̄ denote the sample mean.
We know that the sampling distribution of the sample mean x̄ is normal with mean µ and standard deviation σ/√n. Thus the statistic
$$Z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$$
will be normally distributed with mean zero and standard deviation one. The range
$$\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$
is called the 95% confidence interval for the population mean.

QUESTION: Given a random sample of 25 observations from a normal population for which the mean is unknown and the s.d. is 5, suppose the sample mean is found to be 45. Find the i) 95% and ii) 99% confidence intervals for the population mean.

LARGE SAMPLE CONFIDENCE INTERVAL FOR POPULATION MEAN WHEN POPULATION VARIANCE IS UNKNOWN
In establishing a confidence interval for the mean we have used the value of the population standard deviation to determine the width of the interval. But the standard deviation of the population, like the mean of the population, is usually unknown. In this situation, when the sample size is large (n > 30), the population standard deviation may be approximated by the sample standard deviation s. Thus when n is large and the population standard deviation is unknown, a 100(1 − α)% confidence interval for µ is given by
$$\bar{x} \pm Z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

QUESTION: The mean and standard deviation of the maximum loads supported by 60 cables are 11.09 and 0.73 tons respectively. Find the i) 95% and ii) 99% confidence intervals for the mean of the maximum loads of all cables produced by the company.

QUESTION: To estimate the average weekly income of unskilled workers in a large city, an investigator collects weekly income data from a random sample of 75 unskilled workers. The mean and standard deviation are found to be Rs. 127 and Rs. 15 respectively. Compute the a) 90% and b) 80% confidence intervals for the mean weekly income.

CONFIDENCE INTERVAL FOR DIFFERENCE BETWEEN POPULATION MEANS
From each of two populations an independent random sample is drawn. The sample means, x̄1 and x̄2, are calculated.
The difference x̄1 − x̄2 is an unbiased estimator of the difference between the two population means, µ1 − µ2. The variance of the estimator is (σ1²/n1) + (σ2²/n2), so when the population variances are known a 100(1 − α)% confidence interval for µ1 − µ2 is
$$(\bar{x}_1-\bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}$$

QUESTION: A research team is interested in the difference between serum uric acid levels in patients with and without Down's syndrome. In a large hospital for the treatment of the mentally retarded, a sample of 12 individuals with Down's syndrome yielded a mean of x̄1 = 4.5 mg/100 ml. In a general hospital a sample of 15 normal individuals of the same age and sex were found to have a mean value of x̄2 = 3.4 mg/100 ml. If it is reasonable to assume that the two populations of values are normally distributed with variances equal to 1 and 1.5, find the 95 percent confidence interval for µ1 − µ2.
Given:
x̄1 = 4.5    n1 = 12    σ1² = 1
x̄2 = 3.4    n2 = 15    σ2² = 1.5

QUESTION: Two independent samples of 100 machinists and 100 carpenters are taken to estimate the difference between the weekly wages of the two categories of workers. The sample mean wages for the machinists and carpenters are x̄1 = Rs. 345 and x̄2 = Rs. 340 respectively. The population variances for machinists and carpenters are σ1² = 196 and σ2² = 204. Determine the a) 90% and b) 99% confidence intervals for the true difference between the average wages of machinists and carpenters.

LARGE SAMPLE CONFIDENCE INTERVAL FOR DIFFERENCE BETWEEN TWO POPULATION MEANS
When the population variances σ1² and σ2² are unknown and the populations are not normally distributed, we can still obtain a confidence interval for the difference between two population means provided the sample sizes are large. In this situation, when the sample sizes are large (n1 > 30, n2 > 30), σ1² and σ2² may be approximated by the sample variances s1² and s2² respectively, and a 100(1 − α)% confidence interval for the difference between two population means is
$$(\bar{x}_1-\bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}$$

QUESTION: Students from schools A and B are compared on the basis of their scores on an aptitude test. Two random samples of 90 and 100 students are selected from the schools.
The sample means are 76.4 and 81.2, whereas the sample standard deviations are 8.2 and 7.6 respectively. Establish a 98% confidence interval for the difference in population mean scores between students of schools A and B.

CONFIDENCE INTERVAL FOR PROPORTIONS
Just as we have established a confidence interval for the mean, we can obtain a confidence interval for the binomial parameter p. The interval is based on the estimator P, the proportion of successes in a sample of size n. We have noted earlier that the distribution of the estimator P is approximately normal with mean equal to p and standard deviation
$$\sqrt{\frac{p(1-p)}{n}}$$
when p is not too close to 0 or 1. Thus a 100(1 − α)% confidence interval for p is
$$P \pm Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$
This interval depends on the population proportion p, which is generally unknown. However, when n is large the population proportion p is approximated by the sample proportion P. Thus for large n, a 100(1 − α)% confidence interval for p is
$$P \pm Z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}}$$

QUESTION: In a random sample of 100 articles, 10 are found to be defective. Obtain a 95% confidence interval for the true proportion of defectives in the population of such articles.

QUESTION: In a random sample of 500 farmers in a certain rural area, 41 were found to be unemployed. Compute a 99% confidence interval for the rate of unemployment in that area.

CONFIDENCE INTERVALS FOR DIFFERENCE BETWEEN PROPORTIONS
For large n1 and n2, a 100(1 − α)% confidence interval for the difference of two binomial parameters p1 − p2 is given by
$$(P_1-P_2) \pm Z_{\alpha/2}\sqrt{\frac{P_1(1-P_1)}{n_1}+\frac{P_2(1-P_2)}{n_2}}$$
where P1 and P2 are the proportions of successes in random samples of sizes n1 and n2 respectively.

QUESTION: In random samples of 400 adults and 600 teenagers who watched a certain TV programme, 100 adults and 300 teenagers indicated that they liked it. Construct the a) 95% and b) 99% confidence intervals for the difference in proportions of all adults and all teenagers who watched the programme and liked it.
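As an illustration of how a large-sample interval for a difference between proportions is evaluated, the sketch below applies the usual formula (P1 − P2) ± Z·sqrt(P1(1 − P1)/n1 + P2(1 − P2)/n2) to the TV programme data above. The function name `diff_proportion_ci` is hypothetical, and the Z values 1.96 and 2.58 are taken from the Z table given earlier in these notes.

```python
import math

def diff_proportion_ci(x1, n1, x2, n2, z):
    """Large-sample confidence interval for p1 - p2 (illustrative helper)."""
    p1 = x1 / n1                      # sample proportion in the first sample
    p2 = x2 / n2                      # sample proportion in the second sample
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# 100 of 400 adults and 300 of 600 teenagers liked the programme.
ci_95 = diff_proportion_ci(100, 400, 300, 600, z=1.96)
ci_99 = diff_proportion_ci(100, 400, 300, 600, z=2.58)

print(f"95% C.I.: ({ci_95[0]:.4f}, {ci_95[1]:.4f})")
print(f"99% C.I.: ({ci_99[0]:.4f}, {ci_99[1]:.4f})")
```

Both intervals lie entirely below zero, which suggests that a smaller proportion of adults than teenagers liked the programme.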
CONFIDENCE INTERVAL FOR POPULATION MEAN BASED ON SMALL SAMPLES
The procedure for determining a confidence interval for the population mean based on small samples is the same as for large samples, except that we use the t-distribution instead of the standard normal distribution. A confidence interval for the population mean can be computed in this way when the population variance is unknown and the sample is small. In general, if the population distribution is normal and σ² is unknown, a 100(1 − α)% confidence interval for the mean is given by
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
with n − 1 degrees of freedom.

A comparison of this interval with the large-sample formula shows that for small samples we replace z by t and σ by s, the sample estimate of σ. As n increases, both methods tend towards agreement.

QUESTION: A sample of 12 measurements of the breaking strength of cotton threads gave a mean of 209 grams and a standard deviation of 35 grams. Find the 95% and 99% confidence limits for the actual mean breaking strength.

QUESTION: Five measurements of the reaction time of an individual to a certain stimulus were recorded as 0.28, 0.30, 0.27, 0.33, 0.31 seconds. Find the 95% and 99% confidence intervals for the actual mean reaction time.

CONFIDENCE INTERVAL FOR DIFFERENCE BETWEEN POPULATION MEANS μ1 − μ2 BASED ON SMALL SAMPLES
Consider random samples of sizes n1 and n2 from normal populations with variances σ1² and σ2² respectively. Let x̄1 and x̄2 be the respective sample means. A confidence interval for the difference between the two population means μ1 and μ2 can be computed when the population variances are unknown and the sample sizes are small.
If σ1² = σ2², we can estimate the common variance by the pooled variance sp², given by
$$s_p^2 = \frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}$$
Hence the 100(1 − α)% confidence interval for μ1 − μ2 is
$$(\bar{x}_1-\bar{x}_2) \pm t_{\alpha/2}\,s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$$
with (n1 + n2 − 2) degrees of freedom.

QUESTION: Two random samples of size n1 = 9 and n2 = 16 from two independent populations having normal distributions provide the means and standard deviations x̄1 = 64, x̄2 = 59, s1 = 6 and s2 = 5. Find a 95% confidence interval for μ1 − μ2 assuming σ1 = σ2.

QUESTION: A random sample of 10 university professors gave their salaries, in thousands of Rs., as 13, 11, 19, 15, 22, 20, 14, 17, 14, 15. Another random sample of 5 college professors gave their salaries, in thousands of Rs., as 9, 12, 8, 10, 16. Construct a 95% confidence interval for the difference between the mean salaries of university and college professors, assuming that their variances are equal.

CONFIDENCE INTERVAL FOR PAIRED OBSERVATIONS
Now consider estimation procedures for the difference of two population means when the samples are not independent and the variances of the two populations are not necessarily equal. When the pairs are independent and the two samples are selected from normal populations, the differences d1, d2, ..., dn constitute a single random sample from a population of differences which is normally distributed with mean µd and variance σd². A 100(1 − α)% confidence interval for µd is then
$$\bar{d} \pm t_{\alpha/2}\,\frac{s_d}{\sqrt{n}}$$
with n − 1 degrees of freedom, where d̄ and s_d are the mean and standard deviation of the sample differences.

QUESTION: Twenty college freshmen were divided into 10 pairs, each member of a pair having approximately the same I.Q. One member of each pair was selected at random and assigned to a mathematics section using programmed materials only. The other member of each pair was assigned to a section in which the teacher lectured. At the end of the semester each group was given the same examination and the following results were recorded:

Pair    Programmed Material    Lecture
 1             76                 81
 2             70                 52
 3             85                 87
 4             58                 70
 5             91                 86
 6             75                 77
 7             82                 90
 8             64                 73
 9             79                 85
10             88                 83

Find a 98% confidence interval for the true difference in the two learning procedures.
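A paired-observations interval of the form d̄ ± t·s_d/√n can be sketched for the data above as follows. This is an illustrative calculation rather than the notes' own solution: the differences are taken as programmed material minus lecture (an arbitrary choice of sign), and the value t = 2.821 for 9 degrees of freedom and α/2 = 0.01 is assumed from a standard t table.

```python
import math

programmed = [76, 70, 85, 58, 91, 75, 82, 64, 79, 88]
lecture    = [81, 52, 87, 70, 86, 77, 90, 73, 85, 83]

# Work with the within-pair differences, treated as a single sample.
d = [p - l for p, l in zip(programmed, lecture)]
n = len(d)
d_bar = sum(d) / n
s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))  # divisor n - 1

t_crit = 2.821  # assumed t-table value for 98% confidence with n - 1 = 9 d.f.
margin = t_crit * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

print(f"mean difference = {d_bar:.2f}, s_d = {s_d:.3f}")
print(f"98% C.I. for the mean difference: ({lower:.2f}, {upper:.2f})")
```

The interval contains zero, so on this sketch the data do not point to a clear difference between the two learning procedures.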
DEGREES OF FREEDOM
In many statistical problems we are required to determine the degrees of freedom. This refers to a positive whole number that indicates the lack of restrictions in our calculations. The degrees of freedom are the number of values in a calculation that we can vary.

Student t Distribution
Degrees of freedom play an important role when using the Student t-score table. There are actually several t-score distributions. We differentiate between these distributions by use of degrees of freedom. Here the probability distribution that we use depends upon the size of our sample. If our sample size is n, then the number of degrees of freedom is n − 1. For instance, a sample size of 22 would require us to use the row of the t-score table with 21 degrees of freedom.

Chi-Square Distribution
The use of a chi-square distribution also requires the use of degrees of freedom. Here, in an identical manner as with the t distribution, the sample size determines which distribution to use. If the sample size is n, then there are n − 1 degrees of freedom.

Standard deviation
Another place where degrees of freedom show up is in the formula for the standard deviation. This occurrence is not as overt, but we can see it if we know where to look. To find a standard deviation we are looking for the "average" deviation from the mean. However, after subtracting the mean from each data value and squaring the differences, we end up dividing by n − 1 rather than n as we might expect. The presence of the n − 1 comes from the number of degrees of freedom. Since the n data values and the sample mean are being used in the formula, there are n − 1 degrees of freedom.

Advanced Techniques
More advanced statistical techniques use more complicated ways of counting the degrees of freedom.
When calculating the test statistic for two means with independent samples of n1 and n2 elements, the number of degrees of freedom has quite a complicated formula. It can be estimated by using the smaller of n1 − 1 and n2 − 1.

Another example of a different way to count the degrees of freedom comes with an F test. In conducting an F test we have k samples, each of size n. The degrees of freedom in the numerator are k − 1 and in the denominator k(n − 1).

Chi-square
A chi-square test is a statistical test commonly used for testing independence and goodness of fit. Testing independence determines whether two or more observations across two populations are dependent on each other (that is, whether one variable helps to estimate the other). Testing for goodness of fit determines if an observed frequency distribution matches a theoretical frequency distribution. In both cases the equation to calculate the chi-square statistic is
$$\chi^2 = \sum \frac{(O-E)^2}{E}$$
where O equals the observed frequency and E the expected frequency. The result of a chi-square test, along with the degrees of freedom, is used with a previously calculated table of chi-square distributions to find a p-value. The p-value can then be used to determine the significance of the test.

Data used in a chi-square analysis have to satisfy the following conditions:
They must be randomly drawn from the population.
They must be reported in raw counts of frequency.
The measured variables must be independent.
The observed frequencies cannot be too small.
The values of the independent and dependent variables must be mutually exclusive.

There are two types of chi-square test.

The chi-square test for goodness of fit, which compares the expected and observed values to determine how well an experimenter's predictions fit the data.

The chi-square test for independence, which compares two sets of categories to determine whether the two groups are distributed differently among the categories.
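To make the goodness-of-fit version of the statistic Σ(O − E)²/E concrete, here is a minimal Python sketch. The die-roll counts are hypothetical data invented for this illustration (they do not come from these notes), and the critical value 11.070 for 5 degrees of freedom at α = 0.05 is assumed from a standard chi-square table.

```python
# Goodness of fit: do 60 rolls of a die look consistent with a fair die?
observed = [8, 12, 9, 11, 10, 10]      # hypothetical counts for faces 1..6
n_rolls = sum(observed)
expected = [n_rolls / 6] * 6           # a fair die gives equal expected counts

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                 # categories minus one
critical_value = 11.070                # assumed table value, alpha = 0.05, 5 d.f.

print(f"chi-square = {chi_square:.3f} with {df} d.f.")
print("reject H0" if chi_square >= critical_value else "fail to reject H0")
```

The same loop applies to any goodness-of-fit problem once the expected frequencies have been worked out from the hypothesized distribution.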
QUESTION: Calculate the chi-square value and state the null and alternative hypotheses, with α = 0.05.

                LUNGS   GUMS   THROAT   BONES   SKIN   TOTAL
ADDICTED          23     78     100      20      29     250
NOT ADDICTED       4     11      13      25       7      60
TOTAL             27     89     113      45      36     310

QUESTION: Five coins were tossed 1000 times and the number of heads observed is given below. Test whether a binomial distribution gives a satisfactory fit to these data.

NO. OF HEADS     F
     0           38
     1          144
     2          342
     3          287
     4          164
     5           25

ANOVA
A study compared the number of hours of relief provided by five different brands of antacid administered to 25 different people, each with stomach acid considered strong. The results are given below:

          A        B        C        D        E       TOTAL
         4.4      5.8      4.8      2.9      4.6
         4.6      5.2      5.9      2.7      4.3
         4.5      4.9      4.9      2.9      3.8
         4.1      4.7      4.6      3.9      5.2
         3.8      4.6      4.3      4.3      4.4
Tj      21.4     25.2     24.5     16.7     22.3     110.1
Tj²    457.96   635.04   600.25   278.89   497.29   2469.43
∑xij²   92.02   127.94   121.51    57.81   100.49    499.77

C.F. = T..² / (n × r) = (110.1)² / 25 = 484.88
TSS = ∑∑xij² − C.F. = 499.77 − 484.88 = 14.89
BSS = ∑Tj² / r − C.F. = 2469.43/5 − 484.88 = 9.006
WSS = TSS − BSS = 14.890 − 9.006 = 5.884

With 5 brands and 25 observations, the between-samples d.f. are 5 − 1 = 4, the within-samples d.f. are 25 − 5 = 20 and the total d.f. are 25 − 1 = 24.

SOV               SS       d.f.    MS       F ratio
Between samples    9.006     4     2.251     7.65
Within samples     5.884    20     0.2942
Total             14.890    24

1. Specification of hypothesis
H0: µ1 = µ2 = µ3 = µ4 = µ5
H1: Not all of the brand means are equal

2. Level of significance
α = 5%

3. Test statistic
F = MSS / MSE with d.f. (ν1, ν2)

4. Calculation
Fcal = 2.251 / 0.2942 = 7.65

5. Critical region
Reject H0 if Fcal ≥ Ftab
F0.05 (4, 20) = 2.87

6. Conclusion
Since 7.65 ≥ 2.87, H0 is rejected.

QUESTION: Three training methods were compared to see whether they lead to greater productivity after training. The following are productivity measures for individuals trained by each method.

Method 1    45   40   50   39   53   44
Method 2    59   43   47   51   39   49
Method 3    41   37   43   40   52   37

At the 0.05 level of significance, do the three training methods lead to different levels of productivity?
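The one-way ANOVA arithmetic (correction factor, total, between and within sums of squares, then the F ratio) can be sketched in Python for the training-methods data above. The variable names are illustrative, and the critical value F0.05(2, 15) = 3.68 is an assumption taken from a standard F table, not a figure from these notes.

```python
groups = {
    "Method 1": [45, 40, 50, 39, 53, 44],
    "Method 2": [59, 43, 47, 51, 39, 49],
    "Method 3": [41, 37, 43, 40, 52, 37],
}

all_values = [x for values in groups.values() for x in values]
n_total = len(all_values)            # total number of observations
k = len(groups)                      # number of samples (treatments)

cf = sum(all_values) ** 2 / n_total                  # correction factor
tss = sum(x * x for x in all_values) - cf            # total sum of squares
bss = sum(sum(v) ** 2 / len(v) for v in groups.values()) - cf  # between samples
wss = tss - bss                                      # within samples

df_between = k - 1
df_within = n_total - k
f_cal = (bss / df_between) / (wss / df_within)

f_tab = 3.68  # assumed table value for alpha = 0.05 with (2, 15) d.f.
print(f"BSS = {bss:.3f}, WSS = {wss:.3f}, F = {f_cal:.3f}")
print("reject H0" if f_cal >= f_tab else "fail to reject H0")
```

Note that the within-samples degrees of freedom are the total number of observations minus the number of samples, here 18 − 3 = 15.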
ANALYSIS OF CORRELATION AND REGRESSION

Table of contents:
4.1 INTRODUCTION
4.1.1 Qualitative vs. quantitative variables
4.1.2 Correlation and causation
4.1.3 Correlation and regression
4.2 CORRELATION ANALYSIS
4.2.1 Definition
4.2.2 Properties
4.2.3 Solved examples
4.2.3.1 Example 1
4.2.3.2 Example 2
4.2.3.3 Example 3
4.3 SCATTER PLOT
4.3.1 Why scatter plot?
4.3.2 Types of scatter plot
4.3.2.1 Positive correlation
4.3.2.2 Negative correlation
4.3.2.3 No correlation
4.3.2.4 Curvilinear relation
4.3.2.5 Strong positive correlation
4.3.2.6 Strong negative correlation
4.4 SIMPLE LINEAR REGRESSION
4.4.1 Simple linear regression model
4.4.2 Estimating unknown regression coefficients α, β
4.4.3 Solved example
4.5 STATISTICAL INFERENCE - CORRELATION
4.5.1 Hypothesis testing for ρ
4.5.2 Confidence interval for ...
4.6 STATISTICAL INFERENCE - REGRESSION
4.6.1 Hypothesis test for α and β
4.6.2 Confidence interval for α and β

CORRELATION AND SIMPLE REGRESSION

Definition: The strength of the relationship between any two variables is called correlation. It is denoted by "r". Correlation analysis is the statistical tool we can use to describe the degree to which one variable is linearly related to another.

INTRODUCTION: Correlation is a quantitative estimate of the relationship between two or more variables. When we refer to simple correlation, we are measuring the strength and direction of the relationship between a dependent variable (Y) and one single independent variable (X), which is why it is called simple correlation. This is in contrast to multiple correlation, where we are measuring the strength and direction of the relationship between a dependent variable (Y) and more than one independent variable (X).
Why it is used: Correlation analysis is used in conjunction with regression analysis to measure how well the regression line explains the variation of the dependent variable.

Quantitative vs. Qualitative

Quantitative variables are variables measured on a numeric scale. Height, weight, response time, subjective rating of pain, temperature, and score on an exam are all examples of quantitative variables.

Qualitative variables are variables with no natural sense of ordering. They are therefore measured on a nominal scale. For instance, hair color (black, brown, gray, red, and yellow) is a qualitative variable.

How an association between two variables is analyzed depends on the types of the variables:
1) Both variables are qualitative: we analyze an association through a comparison of conditional probabilities and represent the data using contingency tables. Examples of qualitative variables are gender and class standing.
2) Both variables are quantitative: we consider how one variable, called the response variable, changes in relation to changes in the other variable, called the explanatory variable. Graphically we use scatter plots to display two quantitative variables. Examples are age, height, and weight (i.e., things that are measured).
3) One variable is categorical and the other is quantitative, for instance gender and height. These are best compared by using side-by-side box plots to display any differences or similarities in the center and variability of the quantitative variable (e.g., height) across the categories (e.g., male and female).

Correlation and causation

If there is a significant linear correlation between two variables, then one of five situations can be true:
1) There is a direct cause and effect relationship.
2) There is a reverse cause and effect relationship.
3) The relationship may be caused by a third variable.
4) The relationship may be caused by complex interactions of several variables.
5) The relationship may be coincidental.

If we conduct a study and establish a strong correlation, does this mean we also have causation? That is, if two variables are related, does that imply that one variable causes the other to occur?

Consider smoking cigarettes and lung cancer: does smoking cause lung cancer? Initially this was answered as YES, but the answer was based only on a strong correlation between smoking and lung cancer. Not until scientific research verified that smoking can lead to lung cancer was causation established. If you were to review the history of cigarette warning labels, the first mandated label only mentioned that smoking was hazardous to your health. Not until 1981 did the label mention that smoking causes lung cancer.

To establish causation one must rule out the possibility of lurking variable(s). The best method to accomplish this is through a solid design of your experiment, preferably one that uses a control group.

Regression and correlation

Regression and correlation analyses are statistical tools that, when properly used, can significantly help people to make decisions. Unfortunately, they are often misused. As a result, decision makers often make inaccurate forecasts and less than desirable decisions.

CORRELATION ANALYSIS

Definition

It is a technique to determine the degree to which variables are linearly related to one another. Correlation analysis measures the relationship between two items, for example, a security's price and an indicator. The resulting value (called the "correlation coefficient") shows whether changes in one item (e.g., an indicator) are accompanied by changes in the other item (e.g., the security's price).
Spearman's rank correlation coefficient

It is written in short as the Greek letter rho (ρ) or sometimes as r_s. It is a number that shows how closely two sets of data are linked. It can only be calculated for data that can be put in order, highest to lowest. For example, if you have data for how expensive different computers are, and data for how fast the computers are, you could see whether they are linked, and how closely they are linked, using ρ.

Correlation Coefficient "r"

Properties: The quantity "r", called the linear correlation coefficient, measures the strength and the direction of a linear relationship between two variables. The linear correlation coefficient is sometimes referred to as the Pearson product moment correlation coefficient in honor of its developer, Karl Pearson. The mathematical formula for computing "r" is:

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where n is the number of pairs of data.

The value of "r" is such that -1 ≤ r ≤ +1. The + and - signs are used for positive linear correlations and negative linear correlations, respectively.

Positive correlation: If x and y have a strong positive linear correlation, "r" is close to +1. An "r" value of exactly +1 indicates a perfect positive fit. Positive values indicate a relationship between the x and y variables such that as values for x increase, values for y also increase.

Negative correlation: If x and y have a strong negative linear correlation, "r" is close to -1. An "r" value of exactly -1 indicates a perfect negative fit. Negative values indicate a relationship between x and y such that as values for x increase, values for y decrease.

No correlation: If there is no linear correlation or a weak linear correlation, "r" is close to 0.
A value near zero means that there is a random or nonlinear relationship between the two variables.

Note that "r" is a dimensionless quantity; that is, it does not depend on the units employed.

A perfect correlation of ±1 occurs only when the data points all lie exactly on a straight line. If r = +1, the slope of this line is positive. If r = -1, the slope of this line is negative.

A correlation greater than 0.8 in absolute value is generally described as strong, whereas a correlation less than 0.5 in absolute value is generally described as weak. These values can vary based upon the "type" of data being examined. A study utilizing scientific data may require a stronger correlation than a study using social science data.

Coefficient of Determination, r² or R²

The coefficient of determination, r², is useful because it gives the proportion of the variance (fluctuation) of one variable that is predictable from the other variable. It is a measure that allows us to determine how certain one can be in making predictions from a certain model or graph.

The coefficient of determination is the ratio of the explained variation to the total variation.

The coefficient of determination is such that 0 ≤ r² ≤ 1, and it denotes the strength of the linear association between x and y.

The coefficient of determination represents the percentage of the variation in the data that is accounted for by the line of best fit. For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained.

The coefficient of determination is a measure of how well the regression line represents the data. If the regression line passed exactly through every point on the scatter plot, it would be able to explain all of the variation.
The further the line is away from the points, the less it is able to explain.

Correlation Examples

EXAMPLE 1: The following samples represent a perfect positive linear correlation.
X = [-8.1, 1.0, -14.3, 4.2, -10.1, 4.3, 6.3, 5.0, 15.1, -2.2]
Y = [-9.8, -0.7, -16.0, 2.5, -11.8, 2.6, 4.6, 3.3, 13.4, -3.9]
Compute the correlation coefficient of X and Y:
CORRELATE(X, Y)
r = 1.00000

EXAMPLE 2: The following samples represent a high negative linear correlation.
X = [1.8, -2.7, 0.7, -0.5, -1.3, -0.9, 0.6, -1.5, 2.5, 3.0]
Y = [-4.7, 9.8, -3.7, 2.8, 5.1, 3.9, -3.6, 5.8, -7.3, -7.4]
Compute the correlation coefficient of X and Y:
CORRELATE(X, Y)
r = -0.979907

EXAMPLE 3: The following samples represent a poor linear correlation.
X = [-1.8, 0.1, -0.1, 1.9, 0.5, 1.1, 1.9, 0.3, -0.2, -1.0]
Y = [1.5, -1.0, -0.6, 1.1, 0.7, -0.7, 1.1, -0.1, 0.6, -0.1]
Compute the correlation coefficient of X and Y:
CORRELATE(X, Y)
r = 0.0322859

SCATTER PLOT

The scatter diagram is a tool for determining the potential correlation between two different sets of variables, i.e., how one variable changes with the other variable. This diagram simply plots pairs of corresponding data from two variables, which are usually two variables in a process being studied. The scatter diagram does not determine the exact relationship between the two variables, but it does indicate whether they are correlated or not. By itself, it also does not predict cause and effect relationships between these variables.

Why is it used?
1) To quickly confirm a hypothesis that two variables are correlated.
2) To provide a graphical representation of the strength of the relationship between two variables.
3) To serve as a follow-up step to a cause-effect analysis, to establish whether a change in an identified cause can indeed produce a change in its identified effect.
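The three CORRELATE examples above can be reproduced with a short Python function. This is a sketch only: `pearson_r` is a name chosen here for illustration, and it applies the usual computational formula for the Pearson coefficient, r = [n∑xy - ∑x∑y] / √([n∑x² - (∑x)²][n∑y² - (∑y)²]), where n is the number of pairs of data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Data from Examples 1 to 3
x1 = [-8.1, 1.0, -14.3, 4.2, -10.1, 4.3, 6.3, 5.0, 15.1, -2.2]
y1 = [-9.8, -0.7, -16.0, 2.5, -11.8, 2.6, 4.6, 3.3, 13.4, -3.9]
x2 = [1.8, -2.7, 0.7, -0.5, -1.3, -0.9, 0.6, -1.5, 2.5, 3.0]
y2 = [-4.7, 9.8, -3.7, 2.8, 5.1, 3.9, -3.6, 5.8, -7.3, -7.4]
x3 = [-1.8, 0.1, -0.1, 1.9, 0.5, 1.1, 1.9, 0.3, -0.2, -1.0]
y3 = [1.5, -1.0, -0.6, 1.1, 0.7, -0.7, 1.1, -0.1, 0.6, -0.1]

r1, r2, r3 = pearson_r(x1, y1), pearson_r(x2, y2), pearson_r(x3, y3)
print(f"r1 = {r1:.5f}, r2 = {r2:.6f}, r3 = {r3:.7f}")
```

In Example 1 every Y value equals the corresponding X value minus 1.7, which is why the points lie exactly on a line and the coefficient is +1.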
To make a scatter diagram for two variables requiring confirmation of correlation, the following simple steps are usually followed:
1) Collect pairs of data for the two variables and tabulate them;
2) Draw the x- and y-axes of the diagram, with scales that increase to the right for the x-axis and upward for the y-axis;
3) Assign the data for one variable to the x-axis (the independent variable) and the data for the other variable to the y-axis (the dependent variable);
4) Plot the data pairs on the scatter diagram, encircling (as many times as necessary) all data points that are repeated.

Interpretation of the resulting scatter diagram is as simple as looking at the pattern formed by the points. If the data points plotted on the scatter diagram are all over the place with no discernible pattern whatsoever, then there is no correlation at all between the two variables of the scatter diagram. An example of a scatter diagram that shows no correlation is shown in Figure 1.

Figure 1: A Scatter Diagram showing no correlation

There is positive correlation between two sets of data if an increase in the x-value is accompanied by an increase in the y-value. Figure 2a shows a scatter diagram that exhibits positive correlation. Note that in such a correlation, the data points form a perceivable diagonal line that goes from the lower left to the upper right corner.

Not all sets of data pairs will exhibit a strong positive correlation, even if an increase in the x-value is generally accompanied by an increase in the y-value. An example of this 'weak' type of positive correlation is shown in the scatter diagram of Figure 2b, which is said to exhibit just a 'possible positive correlation.' This scatter diagram still shows a perceivable diagonal line going in the upper right direction, but the points are more spread apart than in a scatter diagram with strong positive correlation.
Figure 2: Scatter Diagrams showing positive correlation (a, left) and just a possible positive correlation (b, right)

If the scatter diagram also shows a perceivable diagonal line, but the line goes in the direction opposite to that of positive correlation (i.e., from the upper left to the lower right corner), as shown in Figure 3a, then the data pairs exhibit negative correlation. This means that y decreases as x increases. Again, the negative correlation is strong if the line formed by the data points is narrow and well defined. If the negative correlation is not strong, resulting in data points that are not closely packed together, then there is just a 'possible negative correlation.' An example of a scatter diagram for this type of correlation is shown in Figure 3b.

Figure 3: Scatter Diagrams showing negative correlation (a, left) and just a possible negative correlation (b, right)

Determining the exact nature of the correlation between variables can lead to benefits. These include:
1) Better understanding of cause-effect relationships.
2) Reduction of data gathering requirements.
3) Establishment of more effective process controls.
4) Easier development of check and balance schemes, etc.

SIMPLE LINEAR REGRESSION

When you think of regression, think of prediction. A regression uses the historical relationship between an independent and a dependent variable to predict the future values of the dependent variable.

When one independent variable is used in a regression, it is called a simple regression; when two or more independent variables are used, it is called a multiple regression. There are several different classes of regression procedures, each with varying degrees of complexity and explanatory power. The most basic type of regression is simple linear regression.
A simple linear regression uses only one independent variable, and it describes the relationship between the independent variable and the dependent variable as a straight line.

Regression Model: In simple linear regression, the model used to describe the relationship between a single dependent variable y and a single independent variable x is

$$y = a_0 + a_1 x + \varepsilon$$

a0 and a1 are referred to as the model parameters, and ε is a probabilistic error term that accounts for the variability in y that cannot be explained by the linear relationship with x. If the error term were not present, the model would be deterministic; in that case, knowledge of the value of x would be sufficient to determine the value of y.

Least squares method: Either a simple or a multiple regression model is initially posed as a hypothesis concerning the relationship among the dependent and independent variables. The least squares method is the most widely used procedure for developing estimates of the model parameters.

The following table lists the monthly sales and advertising expenditures for all of last year by a digital electronics company. In this case, you would plot last year's data for monthly sales and advertising expenditures as shown on the scatter plot below. (Data for independent and dependent variables must be from the same period of time.)

Scatter plots are effective in visually identifying relationships between variables. These relationships can be expressed mathematically in terms of a correlation coefficient, which is commonly referred to as a correlation. Correlations are indices of the strength of the relationship between two variables. They can be any value from -1 to +1. (Correlations are covered in greater detail in the Covariance and Correlation topic of this section.)
When you use regression to predict future values of the dependent variable, the ideal correlation between the independent and dependent variable is high: in absolute value terms, somewhere between 0.5 and 0.99. Viewing the scatter plot above, you can see that there appears to be some degree of correlation between the level of advertising expenditure and product awareness. When calculated, this correlation equals 0.89. This historical data will enable you to predict the relationship between the two variables in the future, before any further expense is incurred. In order to make these predictions, a regression line must be drawn from the information appearing in the scatter plot.

Regression Line

The figure below is the same as the scatter plot above, with the addition of a regression line fitted to the historical data.

The regression line is the line with the smallest possible sum of squared vertical distances between itself and the data points. As you can see, the regression line touches some data points, but not others. The vertical distances of the data points from the regression line are called error terms.

A regression line will always contain error terms because, in reality, independent variables are never perfect predictors of the dependent variables. There are many uncontrollable factors in the business world. The error term exists because a regression model can never include all possible variables; some predictive capacity will always be absent, particularly in simple regression.

The typical procedure for finding the line of best fit is called the least-squares method. This calculation is usually performed using computer software. In this calculation, the fit of a candidate line is measured by taking the difference between each data point and the line, squaring each difference, and adding the values together; the best fit is the line for which this sum is smallest.
The least-squares method is based upon the principle that the sum of the squared errors should be made as small as possible so that the regression line has the least error. Once this line is determined, it can be extended beyond the historical data to predict future levels of product awareness, given a particular level of advertising expenditure.

The extension of the line of regression requires the assumption that the underlying process causing the relationship between the two variables is valid beyond the range of the sample data. Regression is a powerful business tool due to its ability to predict future relationships between variables such as these.

When you run a regression in Excel or in a statistics program, the program will provide you with a report. The details of these reports, and the definitions of all the terms included in the report, are beyond the scope of the course.

Equation of a Regression Line

You may recall the equation of a straight line from your review of the Linear Functions topic in the Algebra section of this course. Variables, constants, and coefficients are represented in the equation of a line as

$$f(x) = mx + b$$

x represents the independent variable.
f(x) represents the dependent variable.
The constant b denotes the y-intercept: this is the value of the dependent variable when the independent variable is equal to zero.
The coefficient m describes the movement in the dependent variable as a result of a given movement in the independent variable.

In finance, linear regressions are commonly used to describe the returns of an individual security (dependent variable) compared to the returns of the market in general (independent variable). The equation for the simple linear regressions used to describe security movements is also a straight line and is expressed in a format which, while similar, does contain a couple of twists.
The equation below is a regression equation for a straight line describing the relationship between the returns of security i and the market in general.

$$r_i = \alpha + \beta r_m + \varepsilon_i$$

r_i represents the return of security i and is the dependent variable.
r_m represents the return of the market in general and is the independent variable.
β is the slope of the regression line, and it describes the level of movement in security i as a result of a unit of movement in the market in general.
α is the y-intercept of the regression line.
ε_i is an error term that describes the distance between an actual data point and the corresponding point on the regression line.

The graph below provides a visual depiction of this regression line. The returns of the market in general are represented in this graph by the returns of the S&P 500, a common surrogate for market returns.

You may be familiar with discussions in financial circles about the beta (β) of a security being a measure of the security's risk. The risk measure of beta is calculated using regression techniques. Beta, the slope of the regression line, was described above as the level of movement in the returns of a given security for each unit of movement in the market in general. A security with a high beta is considered risky and will experience big swings in its returns as compared to those of the market. A security with a low beta is considered less risky and will have returns that fluctuate less than those of the market. The alpha term (α) in the regression equation of a security represents the security's propensity to move independently of the market. The alpha and beta of a security cannot be observed directly but are estimated, based on the past performance of the security, through regression analysis.

Example

The example data in Table 1 are plotted in Figure 1. You can see that there is a positive relationship between X and Y.
If you were going to predict Y from X, the higher the value of X, the higher your prediction of Y.

Table 1: Example data
X      Y
1.00   1.00
2.00   2.00
3.00   1.30
4.00   3.75
5.00   2.25

Figure 1: A scatter plot of the example data.

Linear regression consists of finding the best-fitting straight line through the points. The best-fitting line is called a regression line. The black diagonal line in Figure 2 is the regression line and consists of the predicted score on Y for each possible value of X. The vertical lines from the points to the regression line represent the errors of prediction. As you can see, the red point is very near the regression line; its error of prediction is small. By contrast, the yellow point is much higher than the regression line and therefore its error of prediction is large.

Figure 2: A scatter plot of the example data. The black line consists of the predictions, the points are the actual data, and the vertical lines between the points and the black line represent errors of prediction.

The error of prediction for a point is the value of the point minus the predicted value (the value on the line). Table 2 shows the predicted values (Y') and the errors of prediction (Y-Y'). For example, the first point has a Y of 1.00 and a predicted Y (called Y') of 1.21. Therefore, its error of prediction is -0.21.

Table 2: Example data.
X      Y      Y'      Y-Y'     (Y-Y')²
1.00   1.00   1.210   -0.210   0.044
2.00   2.00   1.635    0.365   0.133
3.00   1.30   2.060   -0.760   0.578
4.00   3.75   2.485    1.265   1.600
5.00   2.25   2.910   -0.660   0.436

You may have noticed that we did not specify what is meant by "best-fitting line". By far the most commonly used criterion for the best-fitting line is the line that minimizes the sum of the squared errors of prediction. That is the criterion that was used to find the line in Figure 2.
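The least-squares criterion can be illustrated on the Table 1 data. The Python sketch below is a minimal example under stated assumptions: `fit_line` and `sum_squared_errors` are names chosen here, the slope is computed as ∑(x - x̄)(y - ȳ) / ∑(x - x̄)², the intercept as ȳ - slope × x̄, and the comparison line (slope 0.5, intercept 0.5) is an arbitrary alternative picked only to show that it has a larger sum of squared errors.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(x, y, slope, intercept):
    """Sum of squared errors of prediction, (Y - Y')^2, for a given line."""
    return sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))


# Table 1 example data
x = [1.00, 2.00, 3.00, 4.00, 5.00]
y = [1.00, 2.00, 1.30, 3.75, 2.25]

slope, intercept = fit_line(x, y)
sse_best = sum_squared_errors(x, y, slope, intercept)
sse_other = sum_squared_errors(x, y, 0.5, 0.5)  # arbitrary comparison line

print(f"Y' = {slope:.3f}X + {intercept:.3f}")
print(f"SSE (fitted line) = {sse_best:.5f}, SSE (comparison line) = {sse_other:.5f}")
```

Any other choice of slope and intercept gives a sum of squared errors at least as large as that of the fitted line, which is what "best-fitting" means under this criterion.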
The last column in Table 2 shows the squared errors of prediction. The sum of the squared errors of prediction shown in Table 2 is lower than it would be for any other regression line.

The formula for a regression line is

Y' = bX + A

where Y' is the predicted score, b is the slope of the line, and A is the Y intercept. The equation for the line in Figure 2 is

Y' = 0.425X + 0.785

For X = 1, Y' = (0.425)(1) + 0.785 = 1.21.
For X = 2, Y' = (0.425)(2) + 0.785 = 1.64.

Example

This example concerns the regression of anatomical dead space (y, in ml) on height (x, in cm). The value of b that minimizes the sum of squared deviations of the points from the line, the least squares estimate, is given by

$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} \qquad \text{and} \qquad a = \bar{y} - b\bar{x}$$

It can be shown that

$$\sum (x - \bar{x})(y - \bar{y}) = \sum xy - n\bar{x}\bar{y}$$

which is of use because all the components of this expression have already been calculated in the calculation of the correlation coefficient. Applying the figures from that calculation to the formulae for the regression coefficients gives b = 1.033 and a = -82.4.

Therefore, in this case, the equation for the regression of y on x becomes

y = 1.033x - 82.4

This means that, on average, for every increase in height of 1 cm the increase in anatomical dead space is 1.033 ml over the range of measurements made.

The line representing the equation is shown superimposed on the scatter diagram of the data. The way to draw the line is to take three values of x, one on the left side of the scatter diagram, one in the middle and one on the right, and substitute these in the equation, as follows:

If x = 110, y = (1.033 × 110) - 82.4 = 31.2
If x = 140, y = (1.033 × 140) - 82.4 = 62.2
If x = 170, y = (1.033 × 170) - 82.4 = 93.2

Example

Suppose we measured the height and weight of a random sample of adults in shopping malls in the U.S. We want to predict weight from height in the population.
Table 2.1
                          Ht       Wt
                          61       105
                          62       120
                          63       120
                          65       160
                          65       120
                          68       145
                          69       175
                          70       160
                          72       185
                          75       210
N                         10       10
Mean                      67       150
Variance (S²)             20.89    1155.5
Standard Deviation (S)    4.57     33.99
Correlation (r) = .94

It is customary to talk about the regression of Y on X, hence the regression of weight on height in our example. The regression equation of our example is Y = -316.86 + 6.97X, where -316.86 is the intercept (a) and 6.97 is the slope (b). We could also write that weight = -316.86 + 6.97 × height. The slope value means that for each inch we increase in height, we expect to increase approximately 7 pounds in weight (increase does not mean change in height or weight within a person; rather it means a difference between people who have a certain height or weight). The intercept is the value of Y that we expect when X is zero. So if we had a person 0 inches tall, they should weigh -316.86 pounds. Of course we do not find people who are zero inches tall and we do not find people with negative weight. It is often the case in psychology that the value of the intercept has no meaningful interpretation.

STATISTICAL INFERENCE - Correlation

Hypothesis testing and confidence interval

Formulas:

$$S.E_b = \frac{s_{y.x}}{\sqrt{\sum (x_i - \bar{x})^2}} \qquad s_{y.x} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$$

Suppose that we took 7 mice and measured their body weight and their length from nose to tail. We obtained the following results and want to know if there is any relationship between the measured variables. [To keep the calculations simple, we will use small numbers.]

Mouse   Units of weight (x)   Units of length (y)
1       1                     2
2       4                     5
3       3                     8
4       4                     12
5       8                     14
6       9                     19
7       8                     22

Procedure

(1) Plot the results on graph paper. This is the essential first step, because only then can we see what the relationship might be: is it linear, logarithmic, sigmoid, etc.?
In our case the relationship seems to be linear, so we will continue on that assumption. If it does not seem to be linear we might need to transform the data.

(2) Set out a table as follows and calculate ∑x, ∑y, ∑x², ∑y², ∑xy, x̄ (mean of x) and ȳ (mean of y).

          Weight (x)   Length (y)   x²          y²           xy
Mouse 1   1            2            1           4            2
Mouse 2   4            5            16          25           20
Mouse 3   3            8            9           64           24
Mouse 4   4            12           16          144          48
Mouse 5   8            14           64          196          112
Mouse 6   9            19           81          361          171
Mouse 7   8            22           64          484          176
Total     ∑x = 37      ∑y = 82      ∑x² = 251   ∑y² = 1278   ∑xy = 553
Mean      x̄ = 5.286    ȳ = 11.714

(3) Calculate ∑dx² = ∑x² - (∑x)²/n = 55.429 in our case.

(4) Calculate ∑dy² = ∑y² - (∑y)²/n = 317.429 in our case.

(5) Calculate ∑dxdy = ∑xy - (∑x)(∑y)/n (this can be positive or negative) = 119.571.

(6) Calculate r (correlation coefficient):

$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \, \sum d_y^2}} = 0.9014 \text{ in our case.}$$

(7) Look up r in a table of correlation coefficients (ignoring the + or - sign). The number of degrees of freedom is two less than the number of points on the graph (5 df in our example because we have 7 points). If our calculated r value exceeds the tabulated value at p = 0.05 then the correlation is significant. Our calculated value (0.9014) does exceed the tabulated value (0.754). It also exceeds the tabulated value for p = 0.01 but not for p = 0.001. If the null hypothesis were true (that there is no relationship between length and weight) we would have obtained a correlation coefficient as high as this in less than 1 in 100 times. So we can be confident that weight and length are positively correlated in our sample of mice.

Suppose that we had the following results from an experiment in which we measured the growth of a cell culture (as optical density) at different pH levels.

pH     Optical density
3      0.1
4      0.2
4.5    0.25
5      0.32
5.5    0.33
6      0.35
6.5    0.47
7      0.49
7.5    0.53

We plot these results (see below) and they suggest a straight-line relationship.
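The hand calculations for the mouse data and for the pH data above both reduce to the same three corrected sums. The Python sketch below is one way to check them; the function name `corrected_sums` and the variable names are assumptions made for illustration. It uses ∑dx² = ∑x² - (∑x)²/n, ∑dy² = ∑y² - (∑y)²/n and ∑dxdy = ∑xy - (∑x)(∑y)/n, then takes r = ∑dxdy / √(∑dx² ∑dy²) for the correlation, and slope = ∑dxdy / ∑dx² with intercept = ȳ - slope × x̄ for the line of best fit.

```python
from math import sqrt


def corrected_sums(x, y):
    """Return (sum dx^2, sum dy^2, sum dxdy) for paired data."""
    n = len(x)
    s_dx2 = sum(a * a for a in x) - sum(x) ** 2 / n
    s_dy2 = sum(b * b for b in y) - sum(y) ** 2 / n
    s_dxdy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return s_dx2, s_dy2, s_dxdy


# Mouse data: units of weight (x) and units of length (y)
mouse_weight = [1, 4, 3, 4, 8, 9, 8]
mouse_length = [2, 5, 8, 12, 14, 19, 22]
m_dx2, m_dy2, m_dxdy = corrected_sums(mouse_weight, mouse_length)
r_mouse = m_dxdy / sqrt(m_dx2 * m_dy2)
print(f"mouse data: r = {r_mouse:.4f}")

# Cell culture data: pH (x) and optical density (y)
ph = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
density = [0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]
p_dx2, p_dy2, p_dxdy = corrected_sums(ph, density)
slope = p_dxdy / p_dx2
intercept = sum(density) / len(density) - slope * sum(ph) / len(ph)
print(f"pH data: y = {slope:.3f}x + ({intercept:.3f})")
```

For the mouse data the computed r is then compared with the tabulated value for n - 2 = 5 degrees of freedom, exactly as in step (7).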
Using the same procedures as for correlation, set out a table as follows and calculate $\sum x$, $\sum y$, $\sum x^2$, $\sum y^2$, $\sum xy$, $\bar{x}$ (mean of x) and $\bar{y}$ (mean of y).

         pH (x)   Optical density (y)   x^2      y^2       xy
         3        0.1                   9        0.01      0.3
         4        0.2                   16       0.04      0.8
         4.5      0.25                  20.25    0.0625    1.125
         5        0.32                  25       0.1024    1.6
         5.5      0.33                  30.25    0.1089    1.815
         6        0.35                  36       0.1225    2.1
         6.5      0.47                  42.25    0.2209    3.055
         7        0.49                  49       0.2401    3.43
         7.5      0.53                  56.25    0.2809    3.975
Total    49       3.04                  284      1.1882    18.2
Mean     5.444    0.3378

Now calculate $\sum x^2 - \frac{(\sum x)^2}{n} = 17.22$ in our case.

Calculate $\sum y^2 - \frac{(\sum y)^2}{n} = 0.1614$ in our case.

Calculate $\sum xy - \frac{\sum x \sum y}{n}$ (this can be positive or negative) $= +1.649$.

Now we want to use regression analysis to find the line of best fit to the data. We have done nearly all the work for this in the calculations above. The regression equation for y on x is:

$$y = bx + a$$

where b is the slope and a is the intercept (the point where the line crosses the y axis).

We calculate b as:

$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \frac{1.649}{17.22} = 0.0958$$ in our case.

We calculate a as:

$$a = \bar{y} - b\bar{x}$$

From the known values of $\bar{y}$ (0.3378), $\bar{x}$ (5.444) and b (0.0958) we thus find a (-0.1837).

So the equation for the line of best fit is: y = 0.096x - 0.184 (to 3 decimal places).

To draw the line through the data points, we substitute in this equation. For example:
When x = 4, y = 0.200, so one point on the line has the x, y coordinates (4, 0.200);
When x = 7, y = 0.488, so another point on the line has the x, y coordinates (7, 0.488).

It is also true that the line of best fit always passes through the point with co-ordinates $(\bar{x}, \bar{y})$, so we actually need only one other calculated point in order to draw a straight line.

EXAMPLE
Question?
Solution

Confidence Interval for α:

$$\Pr\left[a - t_{\alpha/2}(\nu)\, S.E_a \le \alpha \le a + t_{\alpha/2}(\nu)\, S.E_a\right] = 1 - \alpha$$

$$S.E_a = s_{y.x}\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2}} = 0.82702$$

$$\Pr\left[5.82539 - 2.583 \times 0.8270 \le \alpha \le 5.82539 + 2.583 \times 0.8270\right]$$

$$\Pr\left[3.69 \le \alpha \le 7.961\right]$$

Confidence Interval for β:

$$\Pr\left[b - t_{\alpha/2}(\nu)\, S.E_b \le \beta \le b + t_{\alpha/2}(\nu)\, S.E_b\right] = 1 - \alpha$$

$$S.E_b = \frac{s_{y.x}}{\sqrt{\sum (x_i - \bar{x})^2}} = \frac{2.58072}{\sqrt{19828}} = \frac{2.58072}{140.8119} = 0.0183$$

$$\Pr\left[0.5676 - 2.583 \times 0.0183 \le \beta \le 0.5676 + 2.583 \times 0.0183\right]$$

$$\Pr\left[0.5676 - 0.0473 \le \beta \le 0.5676 + 0.0473\right]$$

$$\Pr\left[0.5203 \le \beta \le 0.6148\right]$$

Hypothesis Testing of α:
H0: α = 0
H1: α ≠ 0
Level of significance: 5%

$$t = \frac{a - \alpha}{S.E_a} = \frac{5.8253 - 0}{0.82702} = 7.0437$$

Reject H0 if |t cal| ≥ t tab, where t tab = $t_{\alpha/2}(\nu)$ with ν = 16 degrees of freedom (2.583 is the tabulated value used here).
|7.0437| ≥ 2.583
So, reject H0.

Hypothesis Testing of β:
H0: β = 0
H1: β ≠ 0
Level of significance: 5%

$$t = \frac{b - \beta}{S.E_b} = \frac{0.5676 - 0}{0.0183} = 31.01$$

Reject H0 if |t cal| ≥ t tab, where t tab = $t_{\alpha/2}(\nu)$ with ν = 16 degrees of freedom (2.583 is the tabulated value used here).
|31.01| ≥ 2.583
So, reject H0.

NON PARAMETRIC TESTING

INTRODUCTION
Until now, everything that we have done in statistics was based on the assumption that the data are normally distributed. But sometimes we do not know the real distribution of the data. In this case, we use non parametric tests. A non parametric test is one that makes no assumption about the specific shape of the population from which a sample is drawn. It is also called a distribution free test.

When we know what distribution we are dealing with, it is much more practical and useful to use a test that is designed for that specific distribution. For example, if we are dealing with a normal distribution, we can use a z, t or F test for performing inferences about parameters.
But when we do not know about the population, it is better to use a test that fits any kind of distribution. In this way, we will be prepared for any condition.

A non parametric test should be used instead of its parametric counterpart whenever:
1. Data are of the nominal or ordinal scales of measurement.
2. There are definite outliers, or
3. Data are of interval or ratio scale measurement but one or more other assumptions, such as the normality of the underlying population distribution, are not met.

When our data are normally distributed, the mean is equal to the median and we use the mean as our measure of center. However, if our data are skewed, then the median is a much better measure of center. Therefore, just as the z, t and F tests make inferences about the population mean(s), nonparametric tests make inferences about the population median(s).

Advantages of Non parametric Testing
1. Fewer assumptions about the population. Non parametric tests do not assume the population has any specific distribution.
2. The techniques can be applied when sample sizes are very small.
3. Samples with data of the nominal and ordinal scales of measurement can be tested.
4. In most cases, computations are easier than for the parametric counterpart.

Disadvantages of Non parametric Testing
1. Compared to a parametric test, the information in the data is used less efficiently, and the power of the test will be lower.
2. They are less sensitive than their parametric counterparts when the assumptions of the parametric method are met. Therefore, larger differences are needed before rejecting the null hypothesis.
Number of samples                    Non parametric test                          Parametric test
One                                  Wilcoxon signed rank test, one sample        t-test, z-test, one sample
Two, samples dependent               Wilcoxon signed rank test, paired samples    t-test, z-test, paired samples
Two, samples independent             Wilcoxon rank sum test                       t-test, z-test, two independent samples
More than two, samples dependent     Friedman test                                Randomized block analysis of variance
More than two, samples independent   Kruskal-Wallis test                          One way analysis of variance

THE RUNS TEST FOR RANDOMNESS
The runs test evaluates the randomness of a series of observations by analyzing the number of runs it contains. A run is the consecutive appearance of one or more observations that are similar.

Assumptions
1. The sample data are arranged according to some scheme (such as a time series).
2. The data fall into two separate categories (such as above and below a specific value).
3. The runs test is based on the order in which the data occur, not on the frequency of the data.

PROCEDURE
State the null and alternative hypotheses
Ho: the sequence is random
H1: the sequence is not random

For nominal data with two categories:
1. Determine n1 and n2, the number of observations of each type.
2. Count the number of runs, T.

For ordinal, interval or ratio data:
1. Determine the median, m, of the values.
2. Identify each data value with a (+) if x ≥ m and with a (-) if x < m.
3. Determine n1 and n2, the number of (+) and (-) observations.
4. Count the number of runs, T.

Test statistic

$$z = \frac{T - \left(\frac{2 n_1 n_2}{n} + 1\right)}{\sqrt{\frac{2 n_1 n_2 \left(2 n_1 n_2 - n\right)}{n^2 (n - 1)}}}$$

Where
T = the number of runs
n1 = the number of observations of the first type
n2 = the number of observations of the second type
n = the total number of observations, n1 + n2

Critical region

Conclusion

EXAMPLE
A political activist claims to have "randomly" stopped persons at a street corner and asked them to sign his petition and give their age. During his first half on the street, 30 people signed the document and gave their age. The ages of the signers, in the order in which they signed, are:

30 33 15 59 35 29 68 69 38 43 15 36 35 30 61 74 56 47 68 18 22 12 58 45 65 64 49 38 58 45

At the 0.05 LOS, evaluate the randomness of the ages for this sequence of 30 respondents.

SOLUTION
Ho: the ages are in a random order
H1: the ages are not in a random order
α = 0.05

$$z = \frac{T - \left(\frac{2 n_1 n_2}{n} + 1\right)}{\sqrt{\frac{2 n_1 n_2 \left(2 n_1 n_2 - n\right)}{n^2 (n - 1)}}}$$

The age values have a median of 44 years. Each age is converted to (+) when it is 44 or greater and to (-) otherwise:

30 -   33 -   15 -   59 +   35 -   29 -   68 +   69 +   38 -   43 -
15 -   36 -   35 -   30 -   61 +   74 +   56 +   47 +   68 +   18 -
22 -   12 -   58 +   45 +   65 +   64 +   49 +   38 -   58 +   45 +

Total number of runs = T = 10. There are n1 = 15 and n2 = 15 and the total sample size is n1 + n2 = 30.

Substituting the values in the formula:

$$z = \frac{10 - \left(\frac{2 \times 15 \times 15}{30} + 1\right)}{\sqrt{\frac{2 \times 15 \times 15 \left(2 \times 15 \times 15 - 30\right)}{30^2 \times 29}}} = \frac{10 - 16}{2.691} = -2.23$$

For a two tail test at the 0.05 LOS, the critical z values are -1.96 and +1.96. The observed z value is outside these limits, and the null hypothesis is rejected.

PROBLEMS
1. The News and Clarion kept a record of the gender of people who call the circulation office to complain about delivery problems with the Sunday paper. For a recent Sunday these data are as follows:

F F F M M F M F F F F M M M F M F M F F F F M M M M M M

Using the 0.05 LOS, test this sequence for randomness.
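The runs test procedure for ordinal, interval or ratio data can be sketched in a few lines of Python. This is only an illustration under the formula given above: the function name `runs_test` is a hypothetical choice, and the sketch covers the median-split case only, not nominal data.

```python
import math
from statistics import median


def runs_test(values):
    """Return (T, n1, n2, z) for a runs test about the sample median."""
    m = median(values)
    # True marks a (+) observation (x >= m), False marks a (-) observation
    signs = [v >= m for v in values]
    n1 = sum(signs)
    n2 = len(signs) - n1
    n = n1 + n2
    # A new run starts at every change of sign along the sequence
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean_runs = 2 * n1 * n2 / n + 1
    var_runs = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - mean_runs) / math.sqrt(var_runs)
    return runs, n1, n2, z


# Ages of the 30 petition signers, in signing order
ages = [30, 33, 15, 59, 35, 29, 68, 69, 38, 43,
        15, 36, 35, 30, 61, 74, 56, 47, 68, 18,
        22, 12, 58, 45, 65, 64, 49, 38, 58, 45]

runs, n1, n2, z = runs_test(ages)
print(runs, n1, n2, round(z, 2))
```

Applied to the ages in the example, the sketch gives the same number of runs and the same z value as the hand calculation, which is then compared with the critical values -1.96 and +1.96.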
SIGN TEST METHOD:
It is one of the simplest statistical tests, and it focuses on the median rather than the mean as a measure of central tendency. The only assumption made in performing the test is that the variables come from a continuous distribution. It is called the sign test because we use pluses and minuses as the new data in performing the calculations. We illustrate its use with a single sample and a paired sample. It is useful when we are not able to use the t test because the assumption of normality has been violated.

ONE SAMPLE:
To perform the test, we replace each observation by a plus sign or a minus sign depending upon whether the observation is above or below Mo, the hypothesized value of the population median. We discard any observation that equals Mo and reduce the sample size accordingly. We denote the total number of plus and minus signs by n. The test statistic X is defined as the number of times the less frequent sign (plus or minus) occurs. Under the null hypothesis, the sampling distribution of X is binomial with parameters n and 1/2. We determine the critical region by calculating the binomial probabilities. To reach the significance level α, we add the probabilities from both tails in the case of a two-tailed test; in the case of a one-tailed test, the probabilities in the desired tail are added to reach α. In case the population is symmetric, the null hypothesis may be stated as Ho: μ = μo.

TWO SAMPLE:
Let Xi and Yi denote the observations from the first sample and the second sample respectively. We replace the difference Xi - Yi by a plus sign if Xi > Yi and by a minus sign if Xi < Yi, and we ignore the pair if Xi = Yi, i.e., zero differences are dropped from the analysis. Let n represent the number of plus and minus signs and let X stand for the number of times the less frequent sign (plus or minus) occurs. Then the sampling distribution of X is binomial with parameters n and 1/2.
The rest of the procedure is the same as in the one sample sign test. In case the sample sizes are not equal, some of the values of the larger sample are to be discarded (the data must be from matched pair samples). The two sample sign test may be used to test the hypothesis Ho: μ1 = μ2 when the underlying populations are assumed to be symmetric.

With large n we use the normal approximation to the binomial distribution b(n, 1/2). The statistic X is then approximately normal with mean n/2 and standard deviation $\sqrt{n/4} = \sqrt{n}/2$. In other words, the test statistic under Ho becomes

$$Z = \frac{X - n/2}{\sqrt{n/4}}$$ without correction for continuity,

$$Z = \frac{\left(X \pm \tfrac{1}{2}\right) - n/2}{\sqrt{n}/2}$$ with correction for continuity.

We reject or accept Ho, applying the usual decision rules. In applying the normal approximation to the binomial distribution, n is taken as large when both np and nq are at least 5. As p = 1/2, we can therefore use the normal approximation when n exceeds 10.

The procedure for testing the hypothesis that the population median has a specified value Mo, in the case of one sample, is given below:

Test of null hypothesis
Ho: population median M = Mo
Ha: population median M ≠ Mo (or M > Mo or M < Mo)

Significance level α

The test statistic is X, the number of times the less frequent sign (+ or -) occurs, which is binomially distributed. If n, the number of pluses and minuses, exceeds 10, the test statistic is

$$z = \frac{X - n/2}{\sqrt{n/4}}$$

which, if Ho is true, is approximately standard normal.

Computations. Subtract Mo, the hypothesized value of the population median, from each observation of the sample, i.e. find the differences Xi - Mo. Write down a + sign if the difference is positive and a - sign if the difference is negative; ignore the zero differences, if any. Denote by n the number of + and - signs (i.e., non-zero differences) and by X the number of times the less frequent sign occurs. Compute either the extreme probabilities of the binomial variable X or the value of Z, as the case may be.
The critical region depends on the test statistic, the alternative hypothesis and the significance level α. Apply the usual decision rule to reject or accept the null hypothesis.

The procedure in the case of the two-sample sign test would be the same except for the following two steps:

Ho: the two populations are identical, or they have equal medians, M1 = M2. It is tested against an appropriate alternative hypothesis.

Computation. Subtract each observation of the second sample, say Yi, from the corresponding observation of the first sample, say Xi, i.e. find the differences Xi - Yi. Write a plus sign if the difference is positive and a minus sign if the difference is negative. Discard zero differences, if any; and so on.

EXAMPLE 1
You're an analyst for Chef-Boy-R-Dee. You've asked 7 people to rate a new ravioli on a 5-point Likert scale (1 = terrible to 5 = excellent). The ratings are: 2 5 3 4 1 4 5. At the .05 level, is there evidence that the median rating is less than 3?

Solution:
Ho: η = 3
Ha: η < 3
α = .05
S = 2 (ratings 1 and 2 are less than η = 3 among 2, 5, 3, 4, 1, 4, 5)
p-value = P(X ≥ 2) = 1 - P(X ≤ 1) = .937 (binomial table, n = 7, p = 0.50)
Reject Ho if the p-value ≤ α. Since 0.937 > 0.05, do not reject Ho.
Conclusion: There is no evidence that the median is less than 3.

EXAMPLE 2
An experimenter wants to determine the effectiveness of a certain reducing diet. 12 persons were put on the diet; their weights before and after they tried the diet are shown below:

Person           1     2     3     4     5     6     7     8     9     10    11    12
Weight before    202   154   183   180   228   164   139   165   175   245   237   163
Weight after     195   154   178   199   220   157   135   180   108   106   227   155

Use the sign test at the 5% significance level to test the hypothesis that the diet is not effective against the alternative that it is effective.
Solution:
Ho: the diet is not effective, which is equivalent to testing the hypothesis Ho: P[+ sign] = P[- sign] = 1/2, and
Ha: the diet is effective, i.e. P[+ sign] > 1/2 (one-tailed test)
α = 0.05
X = number of times the less frequent sign occurs. Under Ho, X is binomially distributed.

Calculation: Subtracting the weights after from the weights before they tried the diet, and writing down a plus sign for each positive difference and a minus sign for each negative difference, we get

+ 0 + - + + + - - + + +

Thus n = 11, the number of + and - signs, ignoring the zero difference, and X = 3, the number of minus signs (the less frequent sign). Therefore

$$P(X \le 3) = \left(\tfrac{1}{2}\right)^{11} + 11\left(\tfrac{1}{2}\right)^{11} + 55\left(\tfrac{1}{2}\right)^{11} + 165\left(\tfrac{1}{2}\right)^{11} = \frac{232}{2048} = 0.113$$

Reject Ho if P ≤ 0.05. Since 0.113 > 0.05, we do not reject Ho: the computed probability is greater than the significance level, so there is not sufficient evidence that the diet is effective.

PROBLEMS
1. The following data show employees' rates of defective work before and after a change in the wage incentive plan. Compare the two sets of data to see whether the change lowered the defective units produced. Use the 0.10 LOS.

Before   8   7   6   9   7   10   8   6   5   8   10   8
After    5   8   6   9   8   10   7   5   6   9   8    6

2. After collecting data on the amount of air pollution in Los Angeles, the Environmental Protection Agency decided to issue strict new rules to govern the amount of hydrocarbons in the air. For the next year, it took monthly measurements of this pollutant and compared them to the preceding year's measurements for corresponding months. Based on the following data, does the EPA have enough evidence to conclude with 95% confidence that the new rules were effective in lowering the amount of hydrocarbons in the air? To justify these laws for another year, it must conclude at α = 0.10 that they are effective. Will these laws still be in effect next year?
Month    Last year   This year
Jan      7.0         5.3
Feb      6.0         6.1
March    5.4         5.6
April    5.9         5.7
May      3.9         3.7
June     5.7         4.7
July     6.9         6.7

WILCOXON SIGNED RANK TEST FOR ONE SAMPLE
For one sample, the Wilcoxon signed rank method tests whether the sample could have been drawn from a population having a hypothesized value as its median. It is the non parametric counterpart of the one-sample z or t test. This procedure makes no assumption except that the sample we have is randomly taken from a population with a symmetric frequency distribution. The symmetry assumption does not imply normality, simply that the values are distributed in roughly the same way above and below the median.

Following are the steps of the Wilcoxon signed rank test:

Stating the null and alternative hypotheses

        Two-tail test   Left-tail test   Right-tail test
Ho:     m = mo          m ≥ mo           m ≤ mo
H1:     m ≠ mo          m < mo           m > mo

Where m = the population median
mo = a value that has been specified

Level of significance (α)

Test statistic, W
For each of the observed values, calculate di = xi - mo.
Ignoring observations where di = 0, rank the |di| values so the smallest |di| will have a rank of 1. If there are ties, assign each of the tied rows the average of the ranks they are occupying.
For observations where xi > mo, list the rank in the R+ column.
The test statistic is the sum of the R+ column, W = ∑R+.

Critical value of W
The Wilcoxon signed rank table lists the lower and upper critical values for various levels of significance, with n = the number of observations for which di ≠ 0. The rejection region will be in either one or both tails, depending on the null hypothesis being tested.

Conclusion

EXAMPLE
An environmental activist believes her community's drinking water contains at least the 40 ppm (parts per million) limit recommended by health officials for a certain metal.
In response to her claim, the health department samples and analyzes drinking water from a sample of 11 households in the community. The results are residue levels of 39, 20.2, 40, 32.2, 30.5, 26.5, 42.1, 45.6, 42.1, 29.9, and 40.9 ppm. At the 0.05 LOS, can we conclude that the community's drinking water might equal or exceed the 40 ppm recommended limit?

SOLUTION
Specification of hypothesis:
Ho: m ≥ 40 ppm
H1: m < 40 ppm
LOS: α = 0.05

Test statistic and calculations:

Household   Observed concentration xi   di = xi - mo   |di|    Rank   R+     R-
A           39                          -1.0           1.0     2             2
B           20.2                        -19.8          19.8    10            10
C           40                          0              0       -
D           32.2                        -7.8           7.8     6             6
E           30.5                        -9.5           9.5     7             7
F           26.5                        -13.5          13.5    9             9
G           42.1                        2.1            2.1     3.5    3.5
H           45.6                        5.6            5.6     5      5
I           42.1                        2.1            2.1     3.5    3.5
J           29.9                        -10.1          10.1    8             8
K           40.9                        0.9            0.9     1      1
Total                                                                 13.0   42.0

W = ∑R+ = 13.0

The critical value of W can be found from the table of critical values for the Wilcoxon signed rank test. For n = 10 nonzero differences and α = 0.05 the critical value is 11. The test statistic, W = 13, exceeds 11, and the null hypothesis cannot be rejected at the 0.05 level. At the 0.05 level, we are unable to reject the possibility that the city's water supply might have at least 40 ppm of the metal.

NOTE: For n = the number of observations for which di ≠ 0, the Wilcoxon signed rank statistic will be
W = 0 if all the di values are negative
W = n(n+1)/2 if all the di values are positive

THE NORMAL APPROXIMATION
When the number of observations for which di ≠ 0 is n ≥ 20, a z test will be a close approximation to the Wilcoxon signed rank test. This is possible because the W distribution approaches a normal curve as n becomes larger. The format for this approximation is shown below:

$$z = \frac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$$

Where W = sum of the R+ ranks
n = the number of observations for which di ≠ 0

Our example has only n = 10 nonzero differences, so this approximation will be rougher than if n were 20 or larger.
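The ranking step and the normal approximation can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and rounding the differences to 9 decimal places is an assumption added here so that tied values such as 42.1 - 40 compare as equal in floating-point arithmetic.

```python
import math


def signed_rank_statistic(values, m0):
    """Return (W, n): the sum of the R+ ranks and the count of nonzero differences."""
    # Rounding is an added safeguard so that tied differences compare equal
    diffs = [round(v - m0, 9) for v in values]
    diffs = [d for d in diffs if d != 0]  # drop observations where di = 0
    ordered = sorted(abs(d) for d in diffs)
    # Tied |di| values share the average of the ranks they occupy
    rank_of = {}
    for a in set(ordered):
        first = ordered.index(a) + 1
        rank_of[a] = first + (ordered.count(a) - 1) / 2
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    return w_plus, len(diffs)


def normal_approx_z(w, n):
    """z approximation to the Wilcoxon signed rank statistic."""
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mean_w) / sd_w


# Residue levels (ppm) for the 11 households, hypothesized median 40 ppm
residue = [39, 20.2, 40, 32.2, 30.5, 26.5, 42.1, 45.6, 42.1, 29.9, 40.9]

w_plus, n_nonzero = signed_rank_statistic(residue, 40)
z = normal_approx_z(w_plus, n_nonzero)
print(w_plus, n_nonzero, round(z, 2))
```

For the drinking water data the sketch returns the same W and n as the table above; the resulting z is only a rough guide here because n is below 20.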
Substituting the values of W and n into the normal approximation formula, we get z = -1.48. The critical value of z will be -1.645 for a left tail test. Since the calculated value of z lies to the right of this critical value, the null hypothesis cannot be rejected.

PROBLEMS
1. According to the director of a country tourist bureau, there is a median of 10 hours of sunshine per day during the summer months. For a random sample of 20 days during the past three summers, the number of hours of sunshine has been recorded as shown below. Use the 0.05 level in evaluating the director's claim.

8 9 8 10 9 7 7 9 7 7 9 8 11 9 10 7 8 11 8 12 hours

2. Use the Wilcoxon signed rank test for the following randomly selected observations in examining whether the population median could be greater than 37.0. The LOS is 0.01.

34.6 40.0 33.8 47.7 41.4 40.2 47.0 39.5 36.1 48.1 39.1 45.0 45.7 46.6

WILCOXON SIGNED RANK TEST FOR COMPARING PAIRED SAMPLES
The Wilcoxon signed rank test for paired samples is the non parametric equivalent of the paired sample z or t test. It is used when we want to make inferences about the median difference between the two populations. This test too assumes that the data are continuous and of the interval or ratio scales of measurement. The population of d values is assumed to be symmetric or nearly so, but need not be normally distributed or have any other specific shape. In this application, the measurement of interest is the difference between paired observations, i.e., di = xi - yi.

Applying the Wilcoxon signed rank test to paired samples is nearly identical to its use for one sample.
The steps are as below:

State the null and alternative hypotheses

        Two-tail test   Left-tail test   Right-tail test
Ho:     md = 0          md ≥ 0           md ≤ 0
H1:     md ≠ 0          md < 0           md > 0

Where md = the population median of di = xi - yi

Determine the level of significance

Test statistic, W
For each of the matched pairs, calculate the difference between the two responses, di = xi - yi.
List the set of n absolute differences.
Rank the absolute values of the differences.
Group all the positive differences and the negative differences separately (under R+ and R-).
The sum of the ranks of the positive differences, ∑R+, is the Wilcoxon signed rank test statistic.

Critical value of W

Conclusion

An example will give a clearer picture of this application.

EXAMPLE
An athletics coach wishes to test the value to his athletes of an intensive period of weight training, so he selects twelve 400 m runners from his region and records their times, in seconds, to complete this distance. They then undergo his plan of weight training and have their times for 400 m measured again. The table below summarizes the results.

Athlete   A      B      C      D      E      F      G      H      I      J      K      L
Before    51.0   49.8   49.5   50.1   51.6   48.9   52.4   50.6   53.1   48.6   52.9   53.4
After     50.6   50.4   48.9   49.1   51.6   47.6   53.5   49.9   51.0   48.5   50.6   51.7

Use the Wilcoxon signed rank test to investigate the hypothesis that the training programme will significantly improve the athletes' times for the 400 metres.

SOLUTION
Ho: md = 0 (the population median of di = xi - yi is 0, i.e. the training programme has no effect)
H1: md > 0 (the population median of di = xi - yi is greater than 0, i.e. the training programme improves time)
Level of significance = 0.05

Athlete   Before   After   di     |di|   Rank   R+     R-
A         51.0     50.6    0.4    0.4    2      2
B         49.8     50.4    -0.6   0.6    3.5           3.5
C         49.5     48.9    0.6    0.6    3.5    3.5
D         50.1     49.1    1.0    1.0    6      6
E         51.6     51.6    0.0    0.0    -
F         48.9     47.6    1.3    1.3    8      8
G         52.4     53.5    -1.1   1.1    7             7
H         50.6     49.9    0.7    0.7    5      5
I         53.1     51.0    2.1    2.1    10     10
J         48.6     48.5    0.1    0.1    1      1
K         52.9     50.6    2.3    2.3    11     11
L         53.4     51.7    1.7    1.7    9      9
Total                                           55.5   10.5

W = ∑R+ = 55.5

As the table shows, the difference is calculated for each pair of values, the absolute values of di are obtained, and then they are ranked. Finally the ranks associated with positive differences are listed in the R+ column and added to get the observed value of the test statistic, W.

The test statistic is W = 55.5. For n = 11 nonzero differences and α = 0.05, the Wilcoxon signed rank table gives lower and upper critical values for W of 14 and 52 respectively. The observed value W = 55.5 falls outside these limits, so the null hypothesis is rejected. Based on these data the weight training programme does improve the athletes' times for 400 m.

PROBLEMS
1. Eight subjects were asked to perform a simple puzzle assembly under normal conditions and under conditions of stress. During the stress condition the subjects were told that a mild shock would be delivered 3 minutes after the start of the experiment and every 30 seconds thereafter until the task was completed. Blood pressure readings were taken under both conditions. Data in the accompanying table represent the highest reading during the experiment. Do the data present sufficient evidence to indicate higher blood pressure readings during conditions of stress? Test at the 0.01 level of significance.

Subject   1     2     3     4     5     6     7     8
Normal    126   117   115   118   118   128   125   120
Stress    130   118   125   120   121   125   130   120

2. The Ministry of Defence is considering which of two shoe leathers it should adopt for its new army boot. They are particularly interested in how boots made from these leathers wear, so 15 soldiers are selected at random and each man wears one boot of each type. After six months the wear, in mm, for each boot is recorded as follows:

Soldier     1     2     3     4     5     6     7     8
Leather A   5.4   2.6   4.3   1.1   3.3   6.6   4.4   3.5
Leather B   4.7   3.2   3.8   2.3   3.6   7.2   4.4   3.9

3. Nine experts rated two brands of Colombian coffee in a taste testing experiment. A rating on a 7 point scale (1 = extremely unpleasing and 7 = extremely pleasing) is given for each of four characteristics: taste, aroma, richness and acidity. The following table displays the summated ratings, accumulated over all four characteristics.

Expert   Brand A   Brand B
C.C.     24        26
S.E.     27        27
E.G.     19        22
B.L.     24        27
C.M.     22        25
C.N.     26        27
G.N.     27        26
R.M.     25        27
P.V.     22        23

WILCOXON RANK SUM TEST FOR COMPARING TWO INDEPENDENT SAMPLES
The Wilcoxon signed rank test for matched pairs cannot be performed when either the two samples are independent or the two samples have different sizes. For such situations, the Wilcoxon rank sum test is used.

The Wilcoxon rank sum test is used to test for a difference between two samples. It is the nonparametric counterpart to the two-sample z or t test. Instead of comparing two population means, we compare two population medians. It has the same assumptions as the previous test.

This test is also known as the Mann Whitney U test. The procedure for performing this test is:

State the null and alternative hypotheses

        Two-tail test   Left-tail test   Right-tail test
Ho:     m1 = m2         m1 ≥ m2          m1 ≤ m2
H1:     m1 ≠ m2         m1 < m2          m1 > m2

Where m1 and m2 are the population medians

Level of significance

Test statistic, W
Select the smaller of the two samples as sample 1.
In case the sample sizes are equal, either of the samples can be considered as sample 1.
Rank the combined data values as if they were from a single group. (The smallest data value gets the rank of 1, the next smallest 2, and so on.) If there is a tie between values, each tied value gets the average of the ranks that the values are occupying.
List the ranks for data values from sample 1 in the R1 column and the ranks for data values from sample 2 in the R2 column.
The calculated or observed value of the test statistic is W = ∑R1.

Critical value of W
The Wilcoxon rank sum table lists lower and upper critical values for the test, with n1 and n2 as the number of observations in the respective samples. The rejection region will be in either one or both tails, depending on the null hypothesis being tested.

Conclusion

EXAMPLE
In evaluating the flexibility of rubber tie-down lines, an inspector selects random samples of the straps and counts the number of 360 degree twists each will withstand before breaking. For 7 lines from one production lot, the number of turns before breaking is 112, 105, 83, 102, 144, 85 and 50. For ten lines from a second production lot, the number of turns before breaking is 91, 183, 141, 219, 105, 138, 146, 84, 134 and 106. At the 0.05 level, can it be concluded that the two production lots have the same median flexibility?

SOLUTION
Ho: m1 = m2
H1: m1 ≠ m2
α = 0.05

Sample 1   Rank (R1)   Sample 2   Rank (R2)
112        10          91         5
105        7.5         183        16
83         2           141        13
102        6           219        17
144        14          105        7.5
85         4           138        12
50         1           146        15
                       84         3
                       134        11
                       106        9
Total      44.5                   108.5

W = ∑R1 = 44.5

The sum of the ranks of the values from sample 1 is W = 44.5, the observed value of the test statistic. From the table, for n1 = 7 and n2 = 10 at the 0.05 level, the lower and upper critical values of W are 43 and 83 respectively.
Since the calculated value W = 44.5 lies within the limits, the null hypothesis cannot be rejected at the 0.05 level. Our conclusion is that the median flexibility of the two lots could be the same.

The Normal Approximation:
When n1 ≥ 10 and n2 ≥ 10, a normal approximation to the Wilcoxon rank sum test can be used. The format is as follows:

Z-test approximation to the Wilcoxon rank sum test:
Test statistic:

$$Z = \frac{W - \frac{n_1 (n + 1)}{2}}{\sqrt{\frac{n_1 n_2 (n + 1)}{12}}}$$

where n = n1 + n2.

Although our example falls short of the n1 ≥ 10 and n2 ≥ 10 rule of thumb, it will be used to demonstrate the normal approximation. The results will be more "approximate" than if the rule of thumb had been met. Substituting W = 44.5, n1 = 7, n2 = 10 and n = 17 into the above expression we find that z = -1.81. For a two tail test at the 0.05 level the critical z values are -1.96 and +1.96. The calculated z is between these limits; therefore, the null hypothesis cannot be rejected.

PROBLEMS
1. Given the two samples below, use the Wilcoxon rank sum test to test the null hypothesis that the population medians are equal against the alternative that they are not equal.

Sample 1   Sample 2
40 35 44 42 46 28 39 50 37 45 27 35 32 34 49 37 48 49 51 45 44 34 36 50 49 37 48

2. Many states are considering lowering the blood-alcohol level at which a driver is designated as driving under the influence (DUI) of alcohol. An investigator for a legislative committee designed the following test to study the effect of alcohol on reaction time. Ten participants consumed a specified amount of alcohol. Another group of ten participants consumed the same amount of a nonalcoholic drink, a placebo. The two groups did not know whether they were receiving alcohol or the placebo. The twenty participants' average reaction times (in seconds) to a series of simulated driving situations are reported in the following table. Does it appear that alcohol consumption increases reaction time?
Placebo   0.90   0.37   1.63   0.83   0.95   0.78   0.86   0.61   0.38   1.97
Alcohol   1.46   1.45   1.76   1.44   1.11   3.07   0.98   1.27   2.56   1.32

KRUSKAL-WALLIS TEST
It is the non-parametric counterpart of one-way analysis of variance, used to determine whether three or more samples originate from the same distribution. It is similar to the Mann-Whitney U test, but applicable to more than two sample groups.

Assumptions:
1) The variable has a continuous distribution.
2) The data are at least ordinal.
3) The samples are independent.
4) The samples need not be drawn from normally distributed populations with equal variances.

SOLVING KRUSKAL WALLIS TEST
METHOD:
State the null and alternative hypotheses.
In this test the null hypothesis is that the medians of the k populations are the same, i.e. m1 = m2 = ... = mk. The test is one tail and is carried out as follows:
Ho: m1 = m2 = ... = mk (the population medians are equal)
H1: at least one mj differs from the others (the population medians are not equal)

Level of significance

Test statistic:
Rank the combined data values as if they were from a single group. The smallest value gets the rank of 1, the next smallest gets 2, and so on; in case of a tie, each of the tied values gets their average rank. Add the ranks for the data values from each of the k groups, obtaining ∑R1, ∑R2, through ∑Rk. The calculated value of the test statistic is:

$$H = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1)$$

Where:
n = the sum of the sample sizes in all samples
Rj = the sum of the ranks in the jth sample
nj = the size of the jth sample

Critical value
The distribution of H is closely approximated by the chi-square distribution whenever each sample size is at least 5. For α = the level of significance for the test, the critical H is the chi-square value for which df = k - 1 and the upper tail area is α. If the calculated H exceeds the critical value, the null hypothesis is rejected. Otherwise it cannot be rejected.
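The calculation of H can be sketched in Python as below, using the filling-time data from the worked example in this section. The function name `kruskal_wallis_h` is a hypothetical choice, and the sketch applies the formula exactly as stated above, with average ranks for ties but no further tie correction (an assumption, since the text does not describe one).

```python
def kruskal_wallis_h(groups):
    """H statistic from pooled ranks; tied values receive their average rank."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    def avg_rank(v):
        # First position of v in the pooled order, shifted to the middle of its ties
        first = pooled.index(v) + 1
        return first + (pooled.count(v) - 1) / 2

    # Sum over groups of (rank sum)^2 / group size
    total = sum(sum(avg_rank(v) for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Filling times for the three machines, 5 workers per machine
machine_1 = [25.40, 26.31, 24.10, 23.74, 25.10]
machine_2 = [23.40, 21.80, 23.50, 22.75, 21.60]
machine_3 = [20.00, 22.20, 19.75, 20.60, 20.40]

h = kruskal_wallis_h([machine_1, machine_2, machine_3])
print(round(h, 2))
```

The resulting H is compared with the chi-square critical value for df = k - 1 at the chosen level of significance.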
Conclusion: State the decision about H0 in terms of the problem.

EXAMPLE: As production manager, you want to see if 3 filling machines have different filling times. You assign 15 similarly trained and experienced workers, 5 per machine, to the machines. At the .05 level, is there a difference in the distribution of filling times?

Machinery 1   Machinery 2   Machinery 3
25.40         23.40         20.00
26.31         21.80         22.20
24.10         23.50         19.75
23.74         22.75         20.60
25.10         21.60         20.40

SOLUTION

H0: Identical distribution.
H1: At least 2 differ.
α = .05

$$H = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1)$$

with ν = k − 1 degrees of freedom.

Calculations:

Machinery 1   Rank   Machinery 2   Rank   Machinery 3   Rank
25.40         14     23.40         9      20.00         2
26.31         15     21.80         6      22.20         7
24.10         12     23.50         10     19.75         1
23.74         11     22.75         8      20.60         4
25.10         13     21.60         5      20.40         3
              ∑R1 = 65             ∑R2 = 38             ∑R3 = 17

Substituting the values in the formula with n = 15:

$$H = \frac{12}{15(16)} \left( \frac{65^2}{5} + \frac{38^2}{5} + \frac{17^2}{5} \right) - 3(16) = 0.05(1191.6) - 48 = 11.58$$

Critical value: The critical value for H is the chi-square statistic corresponding to an upper-tail area of α = 0.05 and df = k − 1 = 2. Referring to the chi-square table, this is found to be 5.991.

Reject H0 if Hcal ≥ Htab: 11.58 ≥ 5.991

The calculated H = 11.58 exceeds 5.991, so the null hypothesis is rejected at the 0.05 level. At this level we conclude that the three populations do not all have the same median.

PROBLEMS:

1. For the following independent and random samples, use the 0.05 level of significance in testing whether the population medians could be equal.

Sample 1   Sample 2   Sample 3
31.3       24.9       36.0
30.7       20.8       37.7
35.4       22.2       31.0
36.1       24.9       28.4
30.3       21.4       31.7
25.5       24.1       32.6

2. In testing three different rubber compounds, a tire manufacturer finds the tread lives of the tires made from each to be as given below. At the 0.05 level of significance, could the three compounds deliver the same median tread life?

Design 1   34  38  33  30  30
Design 2   46  43  39  46  36
Design 3   48  39  33  35  41

3. In an agriculture test, each of four organic compounds is applied to a sample of plants.
At the end of the 4 weeks, the heights of the plants are as shown below. At the 0.025 level, are the compounds equally effective in promoting plant growth?

Formula 1   18  18  20  20  18
Formula 2    9  13  20  16  13
Formula 3   14   8   8  17   8

Statistical Quality Control

Introduction:

Statistics: Statistics deals with collecting a sufficient amount of data to obtain reliable results. The science of statistics handles this data in order to draw certain conclusions. Its techniques find extensive applications in quality control, production planning and control, business charts, linear programming, etc.

Inferential Statistics: Inferential statistics is a valuable tool because it allows us to look at a small sample and make statements about the whole population. Samples must be pulled RANDOMLY from a population so that the sample truly represents the population. Every unit in a population must have an equal chance of being selected for the sample to be truly random. The distribution or shape of the data is important to know for analytical purposes. The most common distribution is the bell-shaped or normal distribution. Parameters can be estimated from sample statistics. Two of the most common parameters are the mean and the standard deviation. The mean (or average, denoted by μ) measures central tendency; it is estimated by the sample mean, or x-bar. The standard deviation (σ) measures the spread of the data and is estimated by the sample standard deviation.

Quality: In manufacturing, a measure of excellence or a state of being free from defects, deficiencies and significant variations. It is brought about by strict and consistent commitment to certain standards that achieve uniformity of a product in order to satisfy specific customer or user requirements. Quality is a relative term and is generally explained with reference to the end use of the product. Quality is thus defined as fitness for purpose.
Dimensions of Quality
• Performance - Will it do the intended job?
• Reliability - How often does it fail?
• Durability - How long does the product last?
• Serviceability - How easy is it to repair the product?
• Aesthetics - What does the product look like?
• Features - What does the product do?
• Perceived Quality - What is the reputation of the company or its product?
• Conformance to Standards - Is the product made exactly as the designer intended?

Control: Control is a system for measuring and checking or inspecting a phenomenon. It suggests when to inspect, how often to inspect and how much to inspect. Control ascertains the quality characteristics of an item, compares them with prescribed quality standards and separates defective items from non-defective ones.

Statistical Quality Control (SQC) is the term used to describe the set of statistical tools used by quality professionals. SQC is used to analyze quality problems and solve them. Statistical quality control refers to the use of statistical methods in the monitoring and maintaining of the quality of products and services. All the tools of SQC are helpful in evaluating the quality of services, and SQC uses different tools to analyze a quality problem.

One method, referred to as acceptance sampling, can be used when a decision must be made to accept or reject a group of parts or items based on the quality found in a sample.
Objective of Statistical Quality Control

Quality control is very important for every company. Quality control includes the service quality given to the customer, company management leadership, commitment of management, continuous improvement, fast response, actions based on facts, employee participation and a quality-driven culture.

The main objectives of the quality control module are the control of material reception, internal rejections, clients, claims and providers, and the evaluation and follow-up of the related corrective actions. These systems and methods guide all quality activities. The development and use of performance indicators is linked, directly or indirectly, to customer requirements and satisfaction, and to management.

Three SQC Categories

Statistical quality control (SQC) encompasses three broad categories:
1. Descriptive statistics, e.g. the mean, standard deviation, and range.
2. Statistical process control (SPC):
   • Involves inspecting the output from a process
   • Quality characteristics are measured and charted
   • Helpful in identifying in-process variations
3. Acceptance sampling: used to randomly inspect a batch of goods to determine acceptance/rejection. Does not help to catch in-process problems.

Descriptive statistics involves describing quality characteristics and relationships. SPC involves inspecting a random sample of output from a process for a characteristic. Acceptance sampling involves batch sampling by inspection.

Sources of Variation

Variation exists in all processes. Variation can be categorized as either:
• Common or random causes of variation: random causes that we cannot identify; unavoidable, e.g. slight differences in process variables like diameter, weight, service time, temperature
• Assignable causes of variation: causes that can be identified and eliminated, e.g.
poor employee training, worn tool, machine needing repair

Traditional Statistical Tools

Descriptive statistics include:
• The mean: a measure of central tendency.
• The range: the difference between the largest and smallest observations in a set of data.
• The standard deviation: measures the amount of data dispersion around the mean.
• Distribution of data: the shape may be normal (bell shaped) or skewed.

Analysis of Patterns on Control Charts

When do you have a problem with your process?
• One or more points outside of the control limits
• A run of at least seven points (up, down, or above or below the center line)
• Two of three consecutive points outside the 2-sigma warning limits, but still inside the control limits
• Four of five consecutive points beyond the 1-sigma limits
• An unusual or nonrandom pattern in the data

Setting Control Limits
• Limits are set from the percentage of values under the normal curve
• Control limits balance risks like Type I error

SPC Methods: Control Charts

Control charts show sample data plotted on a graph with a center line (CL), an upper control limit (UCL), and a lower control limit (LCL).

Control charts for variables are used to monitor characteristics that can be measured, e.g. length, weight, diameter, time.

Control charts for attributes are used to monitor characteristics that have discrete values and can be counted, e.g. % defective, number of flaws in a shirt, number of broken eggs in a box.

The normal distribution is the basis for the charts and requires the following assumptions:
• The quality characteristic to be monitored is adequately modelled by a normally distributed random variable.
• The parameters μ and σ for the random variable are the same for each unit, and each unit is independent of its predecessors or successors.
• The inspection procedure is the same for each sample and is carried out consistently from sample to sample.

Control Charts

The control chart is a graph used to study how a process changes over time. Data are plotted in time order.
A control chart always has a central line for the average, an upper line for the upper control limit and a lower line for the lower control limit. Lines are determined from historical data. By comparing current data to these lines, you can draw conclusions about whether the process variation is consistent (in control) or is unpredictable (out of control, affected by special causes of variation).

When to use a control chart?
• Controlling ongoing processes by finding and correcting problems as they occur.
• Predicting the expected range of outcomes from a process.
• Determining whether a process is stable (in statistical control).
• Analyzing patterns of process variation from special causes (nonroutine events) or common causes (built into the process).
• Determining whether the quality improvement project should aim to prevent specific problems or to make fundamental changes to the process.

Basic components of control charts
• A centerline, usually the mathematical average of all the samples plotted;
• Lower and upper control limits defining the constraints of common cause variations;
• Performance data plotted over time.

Types of the control charts

1 Variables control charts
Variable data are measured on a continuous scale. For example: time, weight, distance or temperature can be measured in fractions or decimals. Applied to data with continuous distribution.

2 Attributes control charts
Attribute data are counted and cannot have fractions or decimals. Attribute data arise when you are determining only the presence or absence of something: success or failure, accept or reject, correct or not correct. For example, a report can have four errors or five errors, but it cannot have four and a half errors.
Applied to data following discrete distribution.

Variables Control Charts

Suppose we have a scatter plot with a response variable on the vertical axis and a representation of time (such as hours, shifts, days, weeks, or months) on the horizontal axis. This scatter plot shows the nature of the response over time. For example, we might see trends, shifts, sudden jumps, and so on. If we add horizontal limit lines to the plot to indicate standards, the scatter plot becomes a control chart. When the plotted points fall inside these limit lines, the process yielding the response is said to be in control. When the process yields responses that are outside these limits, the process is said to be out of control.

The limit lines set a range of "normal behavior." They are based on past experience with the process and give a frame of reference for judging current outcomes. Because of natural variation in the process, the responses will not be exactly the same. They will bounce up and down. As long as the response stays within the limits, we need take no corrective action. However, once a measurement occurs outside the limits, we must investigate the cause and take appropriate corrective action.

Use x-bar and R-bar charts together:
• They are used to monitor different variables
• X-bar and R-bar charts reveal different problems

Control charts for variables use:
• x-bar charts to monitor the changes in the mean of a process (central tendencies)
• R-bar charts to monitor the dispersion or variability of the process

A system can show acceptable central tendencies but unacceptable variability, or acceptable variability but unacceptable central tendencies.

X-Bar Chart: The X-bar chart monitors the process location over time, based on the average of a series of observations, called a subgroup. X-bar / Range charts are used when you can rationally collect measurements in groups (subgroups) of between two and ten observations.
Each subgroup represents a "snapshot" of the process at a given point in time. The charts' x-axes are time based, so that the charts show a history of the process. For this reason, data should be time-ordered; that is, entered in the sequence in which it was generated. If this is not the case, then trends or shifts in the process may not be detected, but instead attributed to random (common cause) variation.

For subgroup sizes greater than ten, use X-bar / Sigma charts, since the range statistic is a poor estimator of process sigma for large subgroups. In fact, the subgroup sigma is always a better estimate of subgroup variation than the subgroup range. The popularity of the Range chart is only due to its ease of calculation, dating to its use before the advent of computers. For subgroup sizes equal to one, an Individual-X / Moving Range chart can be used.

X-bar charts are efficient at detecting relatively large shifts in the process average, typically shifts of ±1.5 sigma or larger. The larger the subgroup, the more sensitive the chart will be to shifts, provided a rational subgroup can be formed.

Mean or x bar chart: Control over the average quality is exercised by the control chart of averages, typically called the X bar chart. The construction of an x bar chart is based on the central limit theorem. This states that regardless of the distribution of the population, the distribution of X-bar (the mean of each sample drawn from the population) will tend to follow a normal distribution as the sample size increases. The theorem also states that the mean of the sample means, denoted $\bar{\bar{X}}$ (x bar bar), will equal the mean of the population µ, i.e. $\mu = \bar{\bar{X}}$, and that the standard deviation of the sampling distribution, $\sigma_{\bar{x}}$, will be the population standard deviation divided by the square root of the sample size n, i.e. $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.

The Chart Construction Process: In order to construct x bar and R charts, we must first find our upper- and lower-control limits.
This is done by utilizing the following formulae:

In case when µ and σ are known:

$$UCL = \mu + \frac{3\sigma}{\sqrt{n}}, \qquad LCL = \mu - \frac{3\sigma}{\sqrt{n}}, \qquad CL = \bar{\bar{X}} \text{ or } \mu$$

These formulas are valid in theory, but since we usually know neither the population process mean nor the standard deviation, they cannot be used directly and both must be estimated from the process itself. First, the R chart is constructed. If the R chart validates that the process variation is in statistical control, the x bar chart is constructed.

In case when σ and µ are not known:

1. Find the mean of each subgroup, $\bar{X}_1, \bar{X}_2, \bar{X}_3, \dots, \bar{X}_k$, and the grand mean of all subgroups using:

$$\bar{\bar{X}} = \frac{\bar{X}_1 + \bar{X}_2 + \dots + \bar{X}_k}{k}$$

2. Find the UCL and LCL using the following equations:

$$UCL = \bar{\bar{X}} + A_2\bar{R}, \qquad CL = \bar{\bar{X}}, \qquad LCL = \bar{\bar{X}} - A_2\bar{R}$$

3. Plot the LCL, UCL, centre line, and subgroup means on graph paper.

4. Interpret the data to determine if the process is in control.

A2 is read from the table of control chart factors; the second column of that table gives the value of A2 for each subgroup or sample size n.

Example of Constructing an X-bar Chart: A quality control inspector at the Cocoa Fizz soft drink company has taken three samples with four observations each of the volume of bottles filled. If the standard deviation of the bottling operation is .2 ounces, use the data below to develop control charts with limits of 3 standard deviations for the 16 oz. bottling operation.
                       Time 1    Time 2    Time 3
Observation 1          15.8      16.1      16.0
Observation 2          16.0      16.0      15.9
Observation 3          15.8      15.8      15.9
Observation 4          15.9      15.9      15.8
Sample means (X-bar)   15.875    15.975    15.9
Sample ranges (R)      0.2       0.3       0.2

Center line and control limit formulas:

$$\bar{\bar{x}} = \frac{\bar{x}_1 + \bar{x}_2 + \dots + \bar{x}_k}{k}, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

where k is the number of sample means and n is the number of observations within each sample.

$$UCL_{\bar{x}} = \bar{\bar{x}} + z\sigma_{\bar{x}}, \qquad LCL_{\bar{x}} = \bar{\bar{x}} - z\sigma_{\bar{x}}$$

[Figure: X-Bar Control Chart]

R chart: A control chart for dispersion:

An x-bar chart is used to plot the location value for each sample, whereas an R chart is used to plot the variation of each sample as measured by the sample range. The range chart (R chart) is used to control the variability or dispersion in the quality of a product in the process of production. For each sample of size n, we calculate a measure of variation, here the sample range. The frequency distribution of these values approximates the normal distribution, with variance $\sigma_R^2$, so this distribution can be used to establish control limits and to keep the variability in the process under control.

The procedure for constructing an R chart is as follows:

Take k random samples, each of size n, and let Ri be the value of the range in the ith sample. The range is found by subtracting the minimum value of a sample from its maximum value: R = Xmax − Xmin.

The mean of the sample ranges is given by

$$\bar{R} = \frac{R_1 + R_2 + \dots + R_k}{k} = \frac{\sum R}{k}$$

Control limits are set as under:

$$CL = \bar{R}, \qquad UCL = \bar{R} + 3\sigma_R, \qquad LCL = \bar{R} - 3\sigma_R$$

where $\sigma_R$ is the standard error of the range, that is, the standard deviation of the ranges of all possible samples of size n from a given population. The average range $\bar{R}$ provides an estimate of the mean of the sample range, the random variable of interest.
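The range procedure above can be sketched in Python. This is a minimal illustration, not part of the original text: the function name `r_chart_limits` is hypothetical, the data are the three Cocoa Fizz samples from the bottling example, and, because $\sigma_R$ is not known for that example, the limits are written with the tabulated factors D3 = 0 and D4 = 2.28 for subgroups of size 4 (an assumption standing in for $\bar{R} \pm 3\sigma_R$).

```python
def r_chart_limits(samples, d3, d4):
    """Return (ranges, r_bar, lcl, ucl) for an R chart.

    samples: list of subgroups, each a list of measurements
    d3, d4:  tabulated R-chart factors for the subgroup size (assumed inputs)
    """
    # Range of each subgroup: maximum minus minimum.
    ranges = [max(s) - min(s) for s in samples]
    # Center line: the mean of the subgroup ranges.
    r_bar = sum(ranges) / len(ranges)
    return ranges, r_bar, d3 * r_bar, d4 * r_bar


# Bottle volumes (ounces) at Time 1, Time 2 and Time 3.
samples = [
    [15.8, 16.0, 15.8, 15.9],
    [16.1, 16.0, 15.8, 15.9],
    [16.0, 15.9, 15.9, 15.8],
]
ranges, r_bar, lcl, ucl = r_chart_limits(samples, d3=0.0, d4=2.28)
print([round(r, 1) for r in ranges])
print(round(r_bar, 3), round(lcl, 3), round(ucl, 3))
```

Passing the factors in as arguments keeps the sketch independent of any particular factor table; a subgroup range above the upper limit would signal that the process variability needs investigation.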
Sample   Obs 1    Obs 2    Obs 3    Obs 4    Obs 5    Avg      Range
1        10.68    10.689   10.776   10.798   10.714   10.732   0.116
2        10.79    10.86    10.601   10.746   10.779   10.755   0.259
3        10.78    10.667   10.838   10.785   10.723   10.759   0.171
4        10.59    10.727   10.812   10.775   10.73    10.727   0.221
5        10.69    10.708   10.79    10.758   10.671   10.724   0.119
6        10.75    10.714   10.738   10.719   10.606   10.705   0.143
7        10.79    10.713   10.689   10.877   10.603   10.735   0.274
8        10.74    10.779   10.11    10.737   10.75    10.624   0.669
9        10.77    10.773   10.641   10.644   10.725   10.710   0.132
10       10.72    10.671   10.708   10.85    10.712   10.732   0.179
11       10.79    10.821   10.764   10.658   10.708   10.748   0.163
12       10.62    10.802   10.818   10.872   10.727   10.768   0.250
13       10.66    10.822   10.893   10.544   10.75    10.733   0.349
14       10.81    10.749   10.859   10.801   10.701   10.783   0.158
15       10.66    10.681   10.644   10.747   10.728   10.692   0.103
Averages                                              10.728   0.2204

Step 2. Determine Control Limit Formulas and Necessary Tabled Values

x-bar chart control limits: $UCL = \bar{\bar{x}} + A_2\bar{R}$, $LCL = \bar{\bar{x}} - A_2\bar{R}$
R chart control limits: $UCL = D_4\bar{R}$, $LCL = D_3\bar{R}$

n     A2     D3     D4
2     1.88   0      3.27
3     1.02   0      2.57
4     0.73   0      2.28
5     0.58   0      2.11
6     0.48   0      2.00
7     0.42   0.08   1.92
8     0.37   0.14   1.86
9     0.34   0.18   1.82
10    0.31   0.22   1.78
11    0.29   0.26   1.74

where $\bar{R} = \sum R / 15$ and R = largest observation − lowest observation in each sample.

Steps 3 & 4: Calculate x-bar Chart and Plot Values

$$UCL = \bar{\bar{x}} + A_2\bar{R} = 10.728 + 0.58(0.2204) = 10.856$$
$$LCL = \bar{\bar{x}} - A_2\bar{R} = 10.728 - 0.58(0.2204) = 10.600$$

[Figure: x-bar chart of the 15 sample means with UCL and LCL; x-axis: Sample, y-axis: Means]

Steps 5 & 6: Calculate R-chart and Plot Values

$$UCL = D_4\bar{R} = (2.11)(0.2204) \approx 0.465$$
$$LCL = D_3\bar{R} = (0)(0.2204) = 0$$

[Figure: R chart of the 15 sample ranges with UCL, LCL and R-bar; x-axis: Sample, y-axis: R]

There may arise two situations:

1 If σR is known, then
a. CL = d2 σR
b. UCL = R̄ + d2 σR
c. LCL = R̄ − d1 σR
Where d1 and d2 are the constants that depend on the sample size as shown above.

2 If σR is not known, then its value can be estimated, and the control limits for the R chart are given by:

$$CL = \bar{R}, \qquad LCL = D_3\bar{R}, \qquad UCL = D_4\bar{R}$$

Example of x-bar and R charts: Step 1: Calculate sample means, sample ranges, mean of means, and mean of ranges.

Range Chart: The lower and upper control limits for the range chart are calculated using the formulas:

1 If the variation in R is known: $UCL = \bar{R} + 3\sigma_R$, $LCL = \bar{R} - 3\sigma_R$
2 If the variation in R is unknown: $UCL = D_4\bar{R}$, $LCL = D_3\bar{R}$

Range Chart Limits when n = 1: the moving ranges of size two replace the usual ranges in the formulas above. All calculations remain the same after this substitution, except that there are only k − 1 ranges to plot.

Control Chart for Range (R): Factors for three sigma control limits

Center line and control limit formulas:

$$\bar{R} = \frac{0.2 + 0.3 + 0.2}{3} = .233$$
$$UCL_R = D_4\bar{R} = 2.28(.233) = .53$$
$$LCL_R = D_3\bar{R} = 0.0(.233) = 0.0$$

[Figure: R-Bar Control Chart]

Second Method for the X-bar Chart Using R-bar and the A2 Factor: This method is used when sigma for the process distribution is not known.

Control limits solution:

$$\bar{R} = \frac{0.2 + 0.3 + 0.2}{3} = .233$$
$$UCL_{\bar{x}} = \bar{\bar{x}} + A_2\bar{R} = 15.92 + 0.73(.233) = 16.09$$
$$LCL_{\bar{x}} = \bar{\bar{x}} - A_2\bar{R} = 15.92 - 0.73(.233) = 15.75$$

Chart: X-bar
Focuses on: Average
Subgroup sample size: Two and above
Advantages: Does a good job at detecting sudden, large jumps in the process average. Simple to understand. Used often, so there is a large body of knowledge about its use.
Disadvantages: Slow to detect drifts in the average. Not good at detecting small changes in the process average.

Chart: R
Focuses on: Variability
Subgroup sample size: Two and above
Advantages: Good at detecting sudden jumps. Easy to compute and understand.
Disadvantages: Ignores a lot of information about the variability, especially when the subgroup size is large.

CONTROL CHART FOR ATTRIBUTES:

Control charts for attributes are used to understand whether or not products under inspection satisfy certain characteristics.
In other words, the attribute (quality characteristic) charts are typically based on the classification of products or services as defective or non-defective. This class of charts does not include any measure of variation comparable to an R chart derived from the range in samples. They are similar to R charts in the sense that their control limits are three standard errors away from the mean of all possible values.

C chart: control chart for defects per unit:

Sometimes the characteristics representing the quality of a product or service are discrete in nature and the data are obtained by counting, such as whether a machine is working or idle, or defects in automobiles, machine components, or the services rendered by a restaurant, a store or a bank. The C chart is used in situations where the opportunities for a defect in each production unit, or for a complaint from a customer, are very large while the probability of occurrence per unit is very small and constant. The outcome of such sampling can be described by a Poisson distribution.

Example: During an examination of pieces of equal length, the following numbers of defects were observed: 2, 3, 4, 0, 5, 6, 7, 4, 3, 2. Draw a control chart for the number of defects. Comment on whether the process is under control or not.

Solution

Step 1: Let c denote the number of defects per piece. Then the average number of defects in the 10 samples is

$$\bar{c} = \frac{\sum c}{N} = \frac{2+3+4+0+5+6+7+4+3+2}{10} = 3.6$$

Thus the control limits are:

$$CL = \bar{c} = 3.6$$
$$UCL = \bar{c} + 3\sqrt{\bar{c}} = 3.6 + 3\sqrt{3.6} = 3.6 + 5.692 = 9.292$$
$$LCL = \bar{c} - 3\sqrt{\bar{c}} = 3.6 - 5.692 = -2.092, \text{ taken as } 0$$

The control chart for c is based on these limits. From the calculated values, draw a graph with three lines: the central line, the upper limit (the maximum value allowed by the control limits) and the lower limit (the lowest possible value of the data). Any point outside these limits would indicate that the process is out of control. Here all ten counts lie between 0 and 9.292, so the process is under control.
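The c chart calculation above can be checked with a short Python sketch. It is an illustration only: the function name `c_chart_limits` is hypothetical, the counts are the ten defect counts from the example, the upper limit is taken as the mean count plus three times its square root, and the lower limit is floored at zero because a defect count cannot be negative.

```python
import math


def c_chart_limits(counts):
    """Return (c_bar, lcl, ucl) for a c chart with three-sigma limits."""
    c_bar = sum(counts) / len(counts)
    spread = 3 * math.sqrt(c_bar)  # Poisson: standard deviation = sqrt(mean)
    lcl = max(0.0, c_bar - spread)  # a count cannot fall below zero
    ucl = c_bar + spread
    return c_bar, lcl, ucl


defects = [2, 3, 4, 0, 5, 6, 7, 4, 3, 2]
c_bar, lcl, ucl = c_chart_limits(defects)

# 1-based positions of any counts that fall outside the limits.
out_of_control = [i + 1 for i, c in enumerate(defects) if c < lcl or c > ucl]
print(round(c_bar, 1), lcl, round(ucl, 3), out_of_control)
```

An empty `out_of_control` list corresponds to a process with no points outside the limits.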
P chart: control chart for proportions of defectives:

The p chart is designed to control the percentage (or proportion) of defectives per sample and is based on the distribution of the proportion (or fraction) of defectives in each sample. The assumption that attributes classified as either good or bad follow the binomial distribution implies that
a) there are only two possible outcomes (good or defective),
b) the outcomes occur randomly, and
c) the probability of either outcome remains unchanged for each trial.

Since the number of defects C per unit can be converted into a fraction (proportion) defective by dividing C by the sample size, the p chart may be used in place of the C chart. The p chart has two advantages over the C chart:
1. Expressing the defectives as a percentage or proportion of the given population is more meaningful.
2. When the sample size varies from sample to sample, the p chart gives a more meaningful and simpler presentation.

If the sample size is constant, the primary difference between the C chart and the p chart is only in the computation of the control limits.

The steps for construction of control limits for the p chart are as follows:

1. Compute the proportion of defective items in each sample by dividing the number of defectives xi recorded in the sample by its size ni:

$$p_1 = \frac{x_1}{n_1}, \quad p_2 = \frac{x_2}{n_2}, \quad \text{in general } p = \frac{\text{number of defectives}}{\text{sample size } n}$$

2. Obtain the mean and variance of p from all samples combined, i.e. the average proportion defective, which for k samples of equal size n is

$$\bar{p} = \frac{\text{total number of defectives in all the samples combined}}{\text{total number of items in all the samples combined}} = \frac{p_1 + p_2 + p_3 + \dots + p_k}{k}$$

and

$$\sigma_p^2 = \frac{\bar{p}\,\bar{q}}{n} = \frac{\bar{p}(1-\bar{p})}{n}$$

3. The control limits for the p chart are given by

$$UCL = \bar{p} + 3\sigma_p = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$
$$LCL = \bar{p} - 3\sigma_p = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$

where σp is the standard error (deviation) of the proportion.
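The three steps above can be sketched in Python. This is a minimal illustration under stated assumptions: the function name `p_chart_limits` is hypothetical, the samples are assumed to be of equal size n, the defective counts are made-up illustration values (not data from this text), and the limits are clipped to the interval from 0 to 1 because a proportion cannot lie outside it.

```python
import math


def p_chart_limits(defectives, n):
    """Return (p_bar, sigma_p, lcl, ucl) for equal-size samples of n items."""
    total_items = n * len(defectives)
    # Step 2: average proportion defective over all samples combined.
    p_bar = sum(defectives) / total_items
    sigma_p = math.sqrt(p_bar * (1 - p_bar) / n)
    # Step 3: three-sigma limits, clipped to the valid range of a proportion.
    lcl = max(0.0, p_bar - 3 * sigma_p)
    ucl = min(1.0, p_bar + 3 * sigma_p)
    return p_bar, sigma_p, lcl, ucl


# Hypothetical counts of defectives in five samples of 50 items each.
defectives = [4, 6, 5, 3, 7]
n = 50
p_bar, sigma_p, lcl, ucl = p_chart_limits(defectives, n)

# Step 1: the per-sample proportions that would be plotted on the chart.
proportions = [x / n for x in defectives]
print(round(p_bar, 3), round(sigma_p, 4), lcl, round(ucl, 4))
```

Multiplying the proportions and limits by 100 gives the same chart in percent defective.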
While constructing the p chart it is generally preferred to express it in terms of percent defective rather than fraction defective. The percent defective is 100p.

Control Charts for Attributes: P-Charts & C-Charts

Attributes are discrete events: yes/no, pass/fail.

Use P-Charts for quality characteristics that are discrete and involve yes/no or good/bad decisions, e.g.:
• Number of leaking caulking tubes in a box of 48
• Number of broken eggs in a carton

Use C-Charts for discrete defects when there can be more than one defect per unit, e.g.:
• Number of flaws or stains in a carpet sample cut from a production run
• Number of complaints per customer at a hotel

P-Chart Example: A production manager for a tire company has inspected the number of defective tires in five random samples with 20 tires in each sample. The table below shows the number of defective tires in each sample of 20 tires. Calculate the control limits.

Sample   Number of Defective Tires   Number of Tires in each Sample   Proportion Defective
1        3                           20                               .15
2        2                           20                               .10
3        1                           20                               .05
4        2                           20                               .10
5        1                           20                               .05
Total    9                           100                              .09

Solution:

$$CL = \bar{p} = \frac{\text{\# Defectives}}{\text{Total Inspected}} = \frac{9}{100} = .09$$
$$\sigma_p = \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = \sqrt{\frac{(.09)(.91)}{20}} = 0.064$$
$$UCL_p = \bar{p} + z\sigma_p = .09 + 3(.064) = .282$$
$$LCL_p = \bar{p} - z\sigma_p = .09 - 3(.064) = -.102, \text{ taken as } 0$$

[Figure: P-Control Chart]

C-Chart Example: The number of weekly customer complaints is monitored in a large hotel using a c-chart. Develop three sigma control limits using the data table below.

Week    Number of Complaints
1       3
2       2
3       3
4       1
5       3
6       3
7       2
8       1
9       3
10      1
Total   22

Solution:

$$CL = \bar{c} = \frac{\text{\# complaints}}{\text{\# of samples}} = \frac{22}{10} = 2.2$$
$$UCL_c = \bar{c} + z\sqrt{\bar{c}} = 2.2 + 3\sqrt{2.2} = 6.65$$
$$LCL_c = \bar{c} - z\sqrt{\bar{c}} = 2.2 - 3\sqrt{2.2} = -2.25, \text{ taken as } 0$$

[Figure: C-Control Chart]

Acceptance Sampling

A statistical measure used in quality control. A company cannot test every one of its products, either because testing ruins the products or because the volume of products is too large. Acceptance sampling solves this by testing a sample of product for defects.
The process involves batch size, sample size and the number of defects acceptable in the batch. This process allows a company to measure the quality of a batch with a specified degree of statistical certainty without having to test every unit of product. The statistical reliability of a sample is generally measured by a t-statistic.

Probability is a key factor in acceptance sampling, but it is not the only factor. If a company makes a million products and tests 10 units with one defect, an assumption would be made on probability that 100,000 of the 1,000,000 are defective. However, this could be a grossly inaccurate representation. More reliable conclusions can be made by testing more than 10 units, and by running more than one test and averaging the results. When done correctly, acceptance sampling is a very effective tool in quality control.

SAMPLING PLANS

A "lot," or batch, of items can be inspected in several ways, including the use of single, double, or sequential sampling.

Single Sampling

Two numbers specify a single sampling plan: the number of items to be sampled (n) and a pre-specified acceptable number of defects (c). If the number of defects found in the sample is less than or equal to the acceptance number c, the whole batch will be accepted. If there are more than c defects, the whole lot will be rejected or subjected to 100% screening.

Double Sampling

Often a lot of items is so good or so bad that we can reach a conclusion about its quality by taking a smaller sample than would have been used in a single sampling plan. If the number of defects in this smaller sample (of size n1) is less than or equal to some lower limit (c1), the lot can be accepted. If the number of defects exceeds an upper limit (c2), the whole lot can be rejected. But if the number of defects in the n1 sample is between c1 and c2, a second sample (of size n2) is drawn.
The cumulative results determine whether to accept or reject the lot. The concept is called double sampling.

Sequential Sampling

Multiple sampling is an extension of double sampling, with smaller samples used sequentially until a clear decision can be made. When units are randomly selected from a lot and tested one by one, with the cumulative number of inspected pieces and defects recorded, the process is called sequential sampling.

If the cumulative number of defects exceeds an upper limit specified for that sample, the whole lot will be rejected. If the cumulative number of defects is less than or equal to the lower limit, the lot will be accepted. But if the number of defects falls within these two boundaries, we continue to sample units from the lot. It is possible in some sequential plans for the whole lot to be tested, unit by unit, before a conclusion is reached.

Selection of the best sampling approach (single, double, or sequential) depends on the types of products being inspected and their expected quality level. A very low-quality batch of goods, for example, can be identified quickly and more cheaply with sequential sampling. This means that the inspection, which may be costly and/or destructive, can end sooner. On the other hand, in many cases a single sampling plan is easier and simpler for workers to conduct even though the number sampled may be greater than under other plans.

Acceptance quality level (AQL): This is the minimum level of quality acceptable in a given lot. It is expressed as the decimal fraction or percentage of defectives in a lot that can be considered satisfactory by the consumer.
For example, if the acceptable quality is 20 defectives in a lot of 1000 items, then AQL = 20/1000 = 2%.

Advantages and limitations of statistical quality control:

ADVANTAGES:

Reduction in cost: Since only a fraction of the output is inspected, the cost of inspection is greatly reduced.

Greater efficiency: Not only is cost reduced, but efficiency also goes up, because the work of inspection is considerably reduced and much of the boredom is avoided.

Easy to apply: An excellent feature of quality control is that it is easy to apply. Once the system is established, it can be operated even by a person who has not had extensive specialized training or a higher mathematical background. It may appear difficult, but because the underlying statistical principles are based on common sense, the quality control method finds wide application.

Early detection of faults: Quality control ensures early detection of faults and hence minimum waste from rejected production. The moment a sample point falls outside the control limits, it is taken as a danger signal and the necessary corrective action is taken. With 100% inspection, by contrast, unwanted variations in quality may be detected only at a stage when a large amount of faulty product has already been produced, so there would be a big wastage. A control chart provides a graphic picture of how production is proceeding and tells management where to look for trouble.

LIMITATIONS:

Despite its several advantages, quality control is not a cure for all quality evils. The techniques of quality control should not be used mechanically. Instead, they should be matched to the process being studied. Applying a standard procedure without adequate study of the process is extremely dangerous.
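The danger signal described under "Early detection of faults" can be made concrete with a small calculation. The following is a minimal sketch, assuming the conventional three-sigma p chart (centre line at the average fraction defective, limits at three standard errors on either side, with the lower limit floored at zero). The function names and the defect counts are hypothetical and chosen only for illustration; they are not taken from the text.

```python
import math


def p_chart_limits(defectives, sample_size):
    """Return (centre line, LCL, UCL) for a three-sigma p chart.

    Assumes every sample has the same size.
    """
    k = len(defectives)
    p_bar = sum(defectives) / (k * sample_size)          # average fraction defective
    sigma = math.sqrt(p_bar * (1 - p_bar) / sample_size)  # standard error of a sample fraction
    lcl = max(0.0, p_bar - 3 * sigma)                     # a fraction cannot be negative
    ucl = min(1.0, p_bar + 3 * sigma)                     # nor exceed one
    return p_bar, lcl, ucl


def out_of_control(defectives, sample_size):
    """Return the 1-based numbers of samples whose fraction falls outside the limits."""
    _, lcl, ucl = p_chart_limits(defectives, sample_size)
    return [i + 1 for i, d in enumerate(defectives)
            if not (lcl <= d / sample_size <= ucl)]


# Hypothetical defect counts for 10 samples of 100 items each (illustration only).
counts = [4, 6, 5, 3, 7, 5, 4, 16, 6, 4]
centre, lcl, ucl = p_chart_limits(counts, 100)
print(f"centre line = {centre:.4f}, LCL = {lcl:.4f}, UCL = {ucl:.4f}")
print("samples outside the limits:", out_of_control(counts, 100))
```

With these illustrative counts, only the eighth sample lies above the upper limit, which is the kind of point that would prompt a search for an assignable cause before more faulty output is produced.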
Practice Questions:

Q no 1: The following data refer to defects found at the inspection of the first 10 samples of size 100. Use them to obtain the upper and lower control limits for percentage defective in samples of 100. Represent the first ten sample results on the chart you prepare, showing the central line and control limits.

Sample number      1  2  3  4  5  6  7  8  9  10
No. of defectives  2  1  1  3  2  3  4  2  2  0

Q no 2: The average percentage of defectives in 23 samples of 2000 rubber belts each is found to be 16%. Indicate how to construct the relevant control chart, give its upper and lower control limits, and tell whether the process is under control or not.

Q no 3: The following table gives the number of defects observed in 8 woollen carpets passing as satisfactory. Construct the control chart for the number of defects.

Carpet no.      1  2  3  4  5  6  7  8  9  10
No. of defects  3  4  5  6  3  3  5  3  6  2

Q no 4: Todd Olmstead is the Meals on Wheels dispatcher for the Atlanta metropolitan area. He wants meals delivered to clients within 30 minutes of leaving the kitchens. Meals with longer delivery times tend to be too cold when they arrive. Each of his 10 volunteer drivers is responsible for delivering 15 meals daily. Over the past month, Todd has recorded the percentage of each day's 150 meals that were delivered on time:

Day        1      2      3      4      5      6      7      8      9      10
% on-time  89.33  81.33  95.33  88.67  96.00  86.67  98.00  84.00  90.67  80.67

Day        11     12     13     14     15     16     17     18     19     20
% on-time  88.00  86.67  96.67  85.33  78.67  89.33  89.33  78.67  94.00  94.00

Day        21     22     23     24     25     26     27     28     29     30
% on-time  99.33  95.33  94.67  92.67  81.33  89.33  99.33  90.67  92.00  88.00

a) Help Todd construct a p chart from these data.
b) How does your chart show that the attribute "fraction of meals delivered on time" is out of control?
c) What action do you recommend for Todd?
Q no 5: The following table gives the inspection data for 10 samples of 100 items each, concerning the production of bottle corks. Construct a control chart.

Sample no.          1     2     3     4     5     6     7     8     9     10
Size of sample      100   100   100   100   100   100   100   100   100   100
No. of defectives   5     3     3     6     5     6     8     10    10    4
Fraction defective  0.05  0.03  0.03  0.06  0.05  0.06  0.08  0.10  0.10  0.04

Q no 6: A food company puts mango juice into cans advertised as containing 10 ounces of the juice. Immediately after filling, 20 samples are taken by a random method (one every 30 minutes) and the cans are weighed; each sample consists of 4 cans. The sample values are tabulated below. The weights in the table are given in units of 0.01 ounces in excess of 10 ounces. For example, the weight of the juice drained from the first can of the first sample is 10.15 ounces, which exceeds 10 ounces by 0.15 ounces (10.15 - 10 = 0.15). Since the unit in the table is 0.01 ounces, the excess is recorded as 15 units in the table. Construct the control chart.

Sample number  I   II  III  IV
1              15  12  13   20
2              10  8   8    14
3              8   15  17   10
4              12  17  11   12
5              18  13  15   4
6              20  16  14   20
7              15  19  23   17
8              13  23  14   16
9              9   8   18   5
10             6   10  24   20
11             5   12  20   15
12             3   15  18   18
13             6   18  12   10
14             12  9   15   18
15             15  15  6    16
16             18  17  8    15
17             13  16  5    4
18             10  20  8    10
19             5   15  10   12
20             6   14  2    14

Q no 7: The following data show the values of the sample mean and range for 10 samples of size 5 each. Calculate the values for the central line and control limits for the mean chart and the range chart, and determine whether the process is in control.

Sample  1     2     3     4     5     6    7     8    9     10
Mean    11.2  11.8  10.8  11.6  11.0  9.6  10.4  9.6  10.6  10.0
Range   7     4     8     5     7     4    8     4    7     9

Conversion factors required for n = 5 are given in the table of control chart factors.

Q no 8: Draw a suitable chart for the following data pertaining to the number of foreign-coloured threads (considered as defects) in 15 pieces of cloth of 2 m × 2 m in a certain make of synthetic fibre, and state your conclusion.

7, 12, 3, 20, 21, 5, 4, 3, 10, 8, 0, 9, 6, 7, 20

Q no 9: After finding out that his luggage had arrived in San Antonio while his destination was Omaha, Will Richardson, a statistician for USA Airlines, decided to do some research. For the last three weeks, Will has sampled 200 passengers daily and determined the proportion of luggage delivered to the expected destination, with the results given below:

Day                 1     2     3     4     5     6     7
Proportion correct  0.89  0.91  0.93  0.95  0.94  0.96  0.92

Day                 8     9     10    11    12    13    14
Proportion correct  0.91  0.93  0.90  0.88  0.94  0.97  0.94

Day                 15    16    17    18    19    20    21
Proportion correct  0.95  0.92  0.93  0.92  0.91  0.93  0.89

a) Help Will construct a p chart from these data.
b) Is this luggage delivery process in control?

Q no 10: The Take-Charge Company produces batteries. From time to time a random sample of six batteries is selected from the output and the voltage of each battery is measured, to be sure that the system is under control. Here are statistics on 16 such samples.

Sample  Mean  Range    Sample  Mean  Range
1       4.99  0.41     9       5.01  0.49
2       4.87  0.57     10      5.19  0.56
3       4.85  0.59     11      5.40  0.44
4       5.26  0.74     12      5.15  0.63
5       5.09  0.74     13      5.00  0.35
6       5.02  0.21     14      4.89  0.45
7       5.13  0.56     15      4.99  0.54
8       5.09  0.92     16      5.05  0.33

a. What type of control chart should be used here? Why?
b. What is the centre line of the chart?
c. What is the lower control limit?
d. What is the upper control limit?
e. Draw the control chart on a piece of graph paper.
f. How do you interpret the chart?

Q no 11: A machine is set to deliver an item of a given weight. Ten samples of size 5 were recorded.
The relevant data are as follows:

Sample     1   2   3   4   5   6   7   8   9   10
Mean (X̄)   15  17  15  18  17  14  18  15  17  16
Range (R)  7   7   4   9   8   7   12  4   11  5

Calculate the values for the central line and control limits for the mean chart and the range chart, and then comment on the state of control. Conversion factors for n = 5 are given in the table above.

Q no 12: Ross Darrow is a flight operations analyst for Spacious Skies. He has been assigned the task of monitoring flights at the company's hub airport in the Southeast. Each day, Spacious Skies has 240 takeoffs scheduled from this hub. Ross has been concerned about the fraction of flights with late departures, and four weeks ago he instituted procedures designed to reduce the fraction. Use the data for the last 30 weekdays to construct a p chart to see whether his new procedures have been successful. What further action, if any, should Ross consider?

Number of late departures:

Week  M   T   W   TH  F
1     26  19  26  22  24
2     19  19  20  18  18
3     17  9   13  10  12
4     14  14  13  9   10
5     12  15  14  15  16
6     18  17  16  18  17

Q no 13: R&H Bloch is a large accounting firm specializing in the preparation of individual federal tax returns. The firm is very conservative in its practices and tries to avoid having more than 2% of its clients audited. As part of a summer internship, Jane Bloch has been asked to see whether this goal is being met on a consistent basis. For each week during a 16-week interval centered on April 15 of last year, she has randomly selected 125 returns prepared by the firm.

Week ending  2/25  3/04  3/11  3/18  3/25  4/01  4/08  4/15
# audited    2     1     2     3     5     4     5     6

Week ending  4/22  4/29  5/06  5/13  5/20  5/27  6/03  6/10
# audited    3     1     1     3     2     2     3     2

a) Are significantly more than 2% of R&H Bloch's clients being audited? State and test appropriate hypotheses using all 2000 clients in Jane's sample.
b) Notwithstanding your result in part (a), construct a p chart based on Jane's data. Is there anything evident in the chart that Jane should bring to the attention of the partners in the firm?