MATHEMATICS WORKSHEETS FOR SCHOOLS
Volume 1: Number 2
Series Editor: Patrick Johnson

Maths ensures quality in the Bio-medical Industry

a. Curriculum Links:
   i. Hypothesis Testing
b. Levels:
   i. Leaving Certificate Mathematics
   ii. Junior Certificate Science

Boston Scientific Corporation (NYSE: BSX) is the world’s leading developer, manufacturer and marketer of less invasive medical devices, which provide effective alternatives to traditional surgery by reducing trauma, complexity and risk to the patient. Such less invasive procedures also shorten procedure and patient recovery times substantially, which in turn lowers costs. Because the company makes products that affect people’s health, and lives, it is essential that it puts quality first in everything it does.

Mathematical Information

What is a hypothesis test?

A hypothesis test examines a claim about some characteristic of a population, e.g. the average length of nails produced by a company. The population is generally too large to test in full, so we draw a random sample from it. The sample should be chosen at random so that it is unbiased, i.e. it does not inadvertently favour the characteristic that we are testing. It should also be large enough to be representative of the population.

Every sample is different, so the sample value that we are testing (the test statistic) varies from sample to sample, i.e. it has a range. We therefore set a range of values that we expect the test statistic to fall within if the claim is true. If the test statistic lies within this range, we do not reject the claim being tested.

The one-sample Z-test is one of many procedures available for hypothesis testing. It allows us to determine whether the difference between the sample mean (x̄) and the population mean (µ) is statistically significant.
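The sample-to-sample variation described above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the population is taken to be normal, the values 161 and 5.2 are borrowed from the worked example later in this worksheet, and the function name `sample_means` and the number of repeats are chosen purely for illustration.

```python
import random
import statistics


def sample_means(mu, sigma, n, repeats, seed=0):
    """Draw `repeats` random samples of size n and return each sample's mean.

    Assumption: the population is normal with mean mu and standard
    deviation sigma. The seed only makes the run repeatable.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


if __name__ == "__main__":
    means = sample_means(mu=161, sigma=5.2, n=30, repeats=2000)
    # The sample means differ from one sample to the next ...
    print("smallest and largest sample mean:", min(means), max(means))
    # ... and their spread should be close to sigma / sqrt(n).
    print("spread of the sample means:", statistics.stdev(means))
    print("sigma / sqrt(n):", 5.2 / 30 ** 0.5)
```

The spread of the simulated sample means is what the standard error measures, which is why a test statistic is judged against a range of values and not a single number.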
For large sample sizes, the mean of a sample is approximately normally distributed for many underlying distributions, and so the Z-test is commonly used when testing hypotheses. The normal distribution (also referred to as the Gaussian distribution) is a continuous probability distribution that is often used as an approximation to describe real-valued data that tend to cluster around a single mean value. The graph of the normal distribution is symmetrical, bell-shaped and centred about its mean, with its spread determined by its standard deviation (see Figure 1). Examples of data that can be represented by the normal distribution include height, weight, IQ (Intelligence Quotient) and measurements from many manufacturing processes.

Hypothesis testing is an example of statistical inference, which refers to using information from a sample to draw a conclusion about the population. We use a hypothesis test to make inferences about one or more populations when sample data are available.

NCE-MSTL and Engineers Ireland Mathematics Worksheets for Schools, Vol. 1 Number 2. Series Ed. Patrick Johnson; Writing Team Patrick Johnson, Tim Brophy and Barry Fitzgerald. www.steps.ie www.nce-mstl.ie

A coronary stent is a metal tube used to unblock the arteries; inserting one can help prevent a heart attack. A stent is usually inserted into an artery by key-hole surgery, a surgical procedure which is far less stressful for a patient than open-heart by-pass surgery. The stent is inserted using a device called a balloon catheter. Once in position, the catheter is inflated and opens the stent, compressing the cholesterol that has built up in the artery. This widens the passage through which blood flows in the artery.

Figure 1: The standard normal distribution.

How does a hypothesis test work?

There are two hypotheses in a significance test: the null and alternative hypotheses.
A null hypothesis (H0) is the hypothesis to be tested. It is assumed to be true unless the data indicate otherwise. The null hypothesis usually assumes that any variation in the data is due to chance, whereas the alternative hypothesis (Ha), which is the opposite of the null hypothesis, assumes that the variation reflects a real effect.

All hypothesis tests follow the same steps:
1. Assume H0 is true.
2. Determine how different the sample is from what you would expect if H0 were true.
3. If the test statistic is sufficiently unlikely under the assumption that H0 is true, then reject H0 in favour of Ha.

Once we reach a conclusion in terms of H0 (e.g. Reject H0 or Fail to reject H0), the final step is to interpret this conclusion in terms of the original problem statement, i.e. write our conclusion down in simple English for everyone to understand.

Biological Information

The heart is a muscle that pumps blood containing food and oxygen through blood vessels to all parts of the body. Like the rest of the body, the heart also needs food and oxygen, which reach it through the coronary arteries. However, if a substance known as cholesterol sticks to the inner walls of the arteries, they become narrow, which decreases the oxygen supply to the heart muscle. This condition is known as coronary artery disease (CAD) and can cause heart attacks.

Example

Company X manufactures coronary stents. They currently have several suppliers of tubing but want to reduce this so that they only have two main suppliers. Before deciding which two suppliers to keep, they want to assess the performance of each supplier. Once tubing is received, it is processed through laser machines which form the stent pattern. The company requires that all stents weigh on average 161 mg (milligrams) with a standard deviation (σ) of 5.2 mg¹. Company X completed all Monday’s production using Supplier A.
They take a sample of 30 stents from the day’s production, weigh them and then conduct an analysis to compare the sample with the required company standard. Below are the weights (in mg) that were recorded for the 30 stents (n = 30).

166.0  168.9  164.5  161.0  169.0  160.5
161.7  157.8  158.3  163.3  166.5  159.0
154.6  167.7  167.1  164.0  152.7  161.9
157.6  164.1  165.5  160.9  170.0  153.8
161.7  175.6  157.5  165.5  157.2  158.0

An example of a hypothesis test on the mean of a single sample where the population standard deviation is known is now presented. The table below lists all the information that we have relating to the problem:

                     Population     Sample
Mean                 µ = 161        x̄ = calculated from sample
Standard Deviation   σ = known      s = calculated from sample
Size                 N = unknown    n = 30

¹ If the population standard deviation (σ) is known then we use this value; otherwise we use the sample standard deviation (s) when calculating the test statistic.

STEP 1
The first step is always to identify the hypotheses. This means determining the null hypothesis (H0) and the alternative hypothesis (Ha) for the given question. Since we are only interested in whether the mean weight of our sample differs from the required mean weight, this is a two-tailed test (the sample mean could be under or over the population mean). Therefore our hypotheses are:

H0 : µ = 161
Ha : µ ≠ 161

STEP 2
The second step is to determine the test statistic. This is a Z-value, determined from our sample, that we will be testing. The formula used for determining the test statistic is:

$$TS = \frac{\bar{x} - \mu}{\mathrm{StErr}(\bar{x})}$$

where $\mathrm{StErr}(\bar{x})$ is the standard error of the sample mean and is given by $\sigma/\sqrt{n}$.

Determine the test statistic for the sample in question.

STEP 3
We now determine the distribution of our sample. Since n is large (≥ 30), we can assume that the sample mean is approximately normally distributed, and so we will use the normal tables when determining the critical value(s).

STEP 4
The critical value(s) for a hypothesis test is a threshold against which the value of the test statistic (calculated in Step 2) is compared to determine whether or not the null hypothesis is rejected. Since this is a two-tailed test (determined in Step 1), there will be two critical values. The critical values are found by looking up the normal tables once we know the significance level of the test.

The significance level (α) of a statistical hypothesis test is a fixed probability of wrongly rejecting the null hypothesis H0 if it is in fact true. Usually the significance level is chosen to be 0.05, i.e. there is a 5% chance of wrongly rejecting the null hypothesis. This means that, when the null hypothesis is true, we accept that on average one test in every 20 will wrongly reject it. This is a compromise between safety and practicality.

The critical value is a Z value and is often represented as $Z_{\alpha/2}$. To determine the critical value, divide the significance level (0.05) by 2, since we have a two-tailed test. Find the cell in the body of the normal tables (it is the body of the tables that we look up because the significance level is a probability) with a value closest to 0.025 and read off the associated Z value. If your tables give cumulative probabilities, look up 1 - 0.025 = 0.975 instead. Based on this Z value we can determine the critical region (or rejection region) for our test.

Calculate the critical values for this test.

STEP 5
We now ask the question: “Does the test statistic lie in the critical region?”

STEP 6
Draw your conclusion from the result in Step 5. At this stage we attempt to interpret the result in Step 5 and phrase it in terms of the original problem specification.

Teacher Page

Solutions

STEP 2 SOLUTION
To determine the test statistic we use the formula:

$$TS = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

We first need to calculate x̄, the mean of the sample in question. Summing all the values in the sample (4871.9) and dividing by 30, we find that x̄ = 162.397. Therefore the test statistic is:

$$TS = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{162.397 - 161}{5.2/\sqrt{30}} \approx 1.47$$

STEP 4 SOLUTION
In this example the associated Z value is 1.96. By symmetry of the normal distribution the critical values are +1.96 and -1.96, as shown in the figure below. From this we can now determine the critical region (or rejection region) for our test: Z > 1.96 or Z < -1.96.

[Figure: standard normal curve with critical values -1.96 and 1.96 marked; “Reject H0” in each tail and “Fail to Reject H0” between the critical values.]

STEP 5 SOLUTION
Since TS = 1.47 and the critical region is Z > 1.96 or Z < -1.96, we can clearly see that the test statistic does not lie in the critical, or rejection, region.

STEP 6 SOLUTION
The conclusion that we can draw from the result in Step 5 might be something like, “Based on this sample we do not have enough evidence to reject the null hypothesis at the 5% significance level. Therefore we have found no significant difference between the mean weight of stents made from Supplier A’s tubing and the required company standard, i.e. the sample is consistent with Supplier A’s tubing meeting the required standard”.

Comments and Suggestions

History
William Sealy Gosset is famous as a statistician who worked for Guinness in the early 1900s. He is best known by his pen name, Student, and for his work on Student’s t-distribution, which deals with the problem of small samples (n < 30).

Overview
Students will be involved in conducting a hypothesis test and then be required to relate the answers back to the contextual problem.

Hints
Hypothesis testing of a population mean from a large sample (n ≥ 30) highlights the importance of the normal distribution, and at the same time draws attention to the many opportunities for real-life applications of the normal distribution.

Accurate sketches of statistical distributions can be generated on this website: http://www.socr.ucla.edu/SOCR.html
Other statistical resources are available from the Census at School website: http://www.censusatschool.ie/
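The six steps of the worked example can also be checked numerically. The sketch below is one possible way to do so, not part of the original worksheet: the function name `one_sample_z_test` and its return format are illustrative assumptions, and the critical value comes from Python's standard-library `statistics.NormalDist` in place of printed normal tables.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Weights (mg) of the 30 stents sampled from Monday's production.
WEIGHTS_MG = [
    166.0, 168.9, 164.5, 161.0, 169.0, 160.5,
    161.7, 157.8, 158.3, 163.3, 166.5, 159.0,
    154.6, 167.7, 167.1, 164.0, 152.7, 161.9,
    157.6, 164.1, 165.5, 160.9, 170.0, 153.8,
    161.7, 175.6, 157.5, 165.5, 157.2, 158.0,
]


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-tailed one-sample Z-test with known population sigma.

    Hypothetical helper for illustration: returns the sample mean,
    the test statistic, the critical value and the decision.
    """
    n = len(sample)
    x_bar = fmean(sample)                          # sample mean
    std_err = sigma / sqrt(n)                      # StErr = sigma / sqrt(n)
    ts = (x_bar - mu0) / std_err                   # Step 2: test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # Step 4: critical value
    reject = abs(ts) > z_crit                      # Step 5: in critical region?
    return {"mean": x_bar, "ts": ts, "z_crit": z_crit, "reject": reject}


if __name__ == "__main__":
    result = one_sample_z_test(WEIGHTS_MG, mu0=161, sigma=5.2)
    print(f"sample mean    = {result['mean']:.3f}")
    print(f"test statistic = {result['ts']:.2f}")
    print(f"critical value = {result['z_crit']:.2f}")
    print("Reject H0" if result["reject"] else "Fail to reject H0")
```

Rounded as in the solutions, this gives a sample mean of 162.397, a test statistic of 1.47 and critical values of ±1.96, so the decision is to fail to reject H0.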