Normal Distribution

Normal distribution
We often meet situations where the frequency of an occurrence depends on its distance from the mean value: values close to the mean are very frequent, values far from the mean are less frequent.
- Human height, temperature, machined products, sales, financial data
[Figure: normal curve with lower tail and upper tail, axis from -3 to 3]

Normal distribution: position
The mean $\mu$ fixes the position of the curve on the axis.
[Figure: normal curves for $\mu = 0$, $\mu = -1$ and $\mu = -2$]

Normal distribution: spread
A smaller $\sigma$ narrows the curve; a larger $\sigma$ widens it.
[Figure: narrow and wide normal curves, axis from -3 to 3]

Standard normal curve (Z distribution, $\mu = 0$, $\sigma = 1$)
Area under the curve = 1.

Cumulative area to the left of z:
z       -3       -2       -1       0     1        2        3
area    0.0013   0.0228   0.1587   0.5   0.8413   0.9772   0.9987

Area of each band (either side of the mean):
0 to 1: 0.3413     1 to 2: 0.1359     2 to 3: 0.0214

Area within $\pm 1\sigma$: 68.26%
Area within $\pm 2\sigma$: 95.44%
Area within $\pm 3\sigma$: 99.74%

Apple corporation 2012 Daily Returns
Z distribution, $\mu = 0.11\%$, $\sigma = 1.84\%$

z        -3       -2       -1       0       1       2       3
return   -5.41%   -3.57%   -1.73%   0.11%   1.95%   3.79%   5.63%

1. What is the probability, for any given day, of a return greater than 0.5%?
2. What is the probability, for any given day, of a loss greater than 2%?
3. What is the probability, for any given day, of a return between 0 and 1%?
4. What is the probability, for any given day, of a gain or loss greater than 3%?

In every case we convert the return $x$ to a z-score, $z = \frac{x - \mu}{\sigma}$, and read the cumulative area from the standard normal curve.

1. Return greater than 0.5%
$z = \frac{0.5 - 0.11}{1.84} = 0.21$
=NORM.DIST(0.21,0,1,TRUE) = 0.58
$P = 1 - 0.58 = 0.42$

2. Loss greater than 2%
$z = \frac{-2 - 0.11}{1.84} = -1.15$
=NORM.DIST(-1.15,0,1,TRUE) = 0.125
$P = 0.125$

3. Return between 0 and 1%
$z = \frac{1 - 0.11}{1.84} = 0.48$, =NORM.DIST(0.48,0,1,TRUE) = 0.684
$z = \frac{0 - 0.11}{1.84} = -0.06$, =NORM.DIST(-0.06,0,1,TRUE) = 0.476
$P = 0.684 - 0.476 \approx 0.21$

4. Gain or loss greater than 3%
$z = \frac{3 - 0.11}{1.84} = 1.57$, 1 - NORM.DIST(1.57,0,1,TRUE) = 0.058
$z = \frac{-3 - 0.11}{1.84} = -1.69$, =NORM.DIST(-1.69,0,1,TRUE) = 0.046
$P = 0.058 + 0.046 = 0.104$

Mean and variance

Statistical quality control
High Way Paving Inc. (HWP) is a company specializing in residential road surfacing using low-noise pavement. Recycled rubber can be added to asphalt mixtures to reduce road noise. However, resistance to flow must be maintained within very tight limits, otherwise the mixture may be too thick or too "watery". The goal is a viscosity of 3200.
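As a check on the Apple daily-returns arithmetic earlier in this section, the four probabilities can be recomputed without rounding the z-scores. This is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the original slides, and the normal model with $\mu = 0.11\%$ and $\sigma = 1.84\%$ is taken from the slides as given.

```python
from statistics import NormalDist

# Parameters quoted on the slides, in percent per day.
MU = 0.11
SIGMA = 1.84

# Normal model of daily returns (an assumption made by the slides).
returns = NormalDist(mu=MU, sigma=SIGMA)

# 1. P(return > 0.5%): area to the right of 0.5.
p_gain_over_half = 1 - returns.cdf(0.5)

# 2. P(loss greater than 2%): area to the left of -2.
p_loss_over_2 = returns.cdf(-2.0)

# 3. P(0% < return < 1%): difference of two cumulative areas.
p_between_0_and_1 = returns.cdf(1.0) - returns.cdf(0.0)

# 4. P(gain or loss greater than 3%): both tails added together.
p_move_over_3 = returns.cdf(-3.0) + (1 - returns.cdf(3.0))

for label, p in [
    ("return > 0.5%", p_gain_over_half),
    ("loss > 2%", p_loss_over_2),
    ("0% < return < 1%", p_between_0_and_1),
    ("gain or loss > 3%", p_move_over_3),
]:
    print(f"P({label}) = {p:.3f}")
```

Because the z-scores are not rounded to two decimals here, the results can differ from the slide values in the third decimal place.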
Over several years of production and quality measurements, HWP has determined that the viscosity population mean and standard deviation are:
$\mu = 3200$, $\sigma = 150$

Statistical quality control
During manufacture of each batch of asphalt, the quality control specialist takes 15 samples of the material and tests the viscosity. There is no way to test every kg of asphalt (the population), so the company must take samples. From those samples HWP must then draw conclusions about the entire batch.

Statistical quality control

Sample   Sample mean ($\bar{x}$)
1        3210.73
2        3150.13
3        3345.54
4        3190.67
5        3217.90
6        3301.45
7        3100.72
8        3413.01
9        3023.59

Range        Frequency
2950-3049    I
3050-3149    I
3150-3249    IIII
3250-3349    II
3350-3449    I

We call this the sampling distribution (distribution of sample means).
$E(\bar{x}) = 3217.08$

Viscosity sampling distribution
[Figure: histogram of the nine sample means; bins 2950-3049, 3050-3149, 3150-3249, 3250-3349, 3350-3449; frequency axis from 0 to 4.5]

$E(\bar{x}) = \mu$ as n becomes "large"
$E(\bar{x})$ = the expected value of $\bar{x}$
$\mu$ = the population mean

If we take many random samples from the population, each with its own sample mean, and then create a distribution based on all of those sample means, the mean of that sampling distribution is equal to the mean of the population.
• The expected value of the sampling distribution of $\bar{x}$ is at best going to be an estimate of $\mu$,
• We would have to take every sample from the population to match the population mean perfectly (but what is the point of sampling then?),
• The best we are going to be able to do is find an interval estimate for the population mean $\mu$,
• Our interval estimate will be influenced by sample size and the degree of confidence we are satisfied with.

Standard deviation of the $\bar{x}$ sampling distribution
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$\sigma_{\bar{x}}$ - standard deviation of $\bar{x}$ (standard error of the mean)
$\sigma$ - standard deviation of the population
$n$ - sample size
In this case we know the standard deviation of the population, which is rarely the case.

With $\sigma = 150$ and $n = 15$:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{150}{\sqrt{15}} = 38.7$
With increasing sample size, $\sigma_{\bar{x}}$ gets smaller.

A larger sample size decreases the standard error. The values of $\bar{x}$ will have less variation and therefore be closer to $\mu$.
$n = 15$: $\sigma_{\bar{x}} = \frac{150}{\sqrt{15}} = 38.7$
$n = 135$: $\sigma_{\bar{x}} = \frac{150}{\sqrt{135}} = 12.9$
$n = 500$: $\sigma_{\bar{x}} = \frac{150}{\sqrt{500}} = 6.71$

Statistical quality control
Because $\sigma$ is known and every sample has $n = 15$, all nine samples share the same standard error, $\sigma_{\bar{x}} = 38.7$, while their sample means differ.

• The standard error is a standard deviation; it allows us to calculate z-scores and therefore the area (probability) under the curve for a certain region,
• Any point estimator is an estimate and will contain error,
• This error can be reduced by selecting a large sample from the population from which to estimate a parameter,
• The error component means we cannot determine a single value of the parameter; we can only provide a range or interval that may cover the parameter → confidence interval,
• Most often we do not know the standard deviation and have to estimate it.

Hypothesis testing
$H_0: \mu = \mu_0$
$H_a: \mu \neq \mu_0$
$\mu$ is the true mean of the population under analysis.
$\mu_0$ is the hypothesized mean of the population under analysis.
Is the true mean the same as the hypothesized mean?

Example 1
A bottled water company states on the product label that each bottle contains 355 ml of water. You work for a government agency that protects consumers by testing product volumes. A sample of 50 bottles is tested. Establish the null and alternative hypotheses.
What is our assumption? We assume that the 355 ml on the label is true. So:
$H_0: \mu = 355$ ml
$H_a: \mu \neq 355$ ml
If the data indicate the bottles are being filled properly, then we fail to reject the null; we fail to reject our assumption. We are not saying we have proven the null, just that our assumption held up.

Example 2
According to the United States Department of Agriculture, in 2006 the average farm size in the state of Texas was 2.3 km². Since the decades-long trend has been for farm sizes to increase due to large agribusiness, we want to analyze whether farm size in 2015 is larger than it was in 2006. Establish the null and alternative hypotheses.
What is our assumption? We assume that there has been no change in farm size since 2006. We wish to see if farm size has increased since 2006. So:
$H_0: \mu \leq 2.3$ km²
$H_a: \mu > 2.3$ km²

Example 3
During the 2010-2011 English Premier League season, Manchester United home matches had an average attendance of 74,691. A club marketing analyst would like to see if attendance decreased during the most recent season. Establish the null and alternative hypotheses.
What is our assumption? We assume that attendance remained the same. We wish to see if attendance has decreased since 2010-2011. So:
$H_0: \mu \geq 74{,}691$
$H_a: \mu < 74{,}691$

Two scenarios - $\sigma$ known or unknown?
1. The population standard deviation $\sigma$ is given.
2. The population standard deviation $\sigma$ is not given and we have to estimate it with the sample standard deviation, $s$.
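Looking back at the asphalt-viscosity example, the standard-error figures and the mean of the nine sample means are easy to verify numerically. The following is a small sketch, assuming the slide values $\sigma = 150$ and the nine sample means as listed; the helper name `standard_error` is illustrative only.

```python
import math
from statistics import mean

SIGMA = 150  # population standard deviation of viscosity, as given on the slides


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sampling distribution of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


# The nine sample means reported for the quality-control example.
sample_means = [3210.73, 3150.13, 3345.54, 3190.67, 3217.90,
                3301.45, 3100.72, 3413.01, 3023.59]

# Mean of the (small) observed sampling distribution.
mean_of_means = mean(sample_means)
print(f"mean of sample means = {mean_of_means:.2f}")

# The standard error shrinks as the sample size grows.
for n in (15, 135, 500):
    print(f"n = {n:3d}: standard error = {standard_error(SIGMA, n):.2f}")
```

With only nine samples the mean of the sample means is close to, but not exactly, the population mean of 3200; the equality $E(\bar{x}) = \mu$ holds for the full sampling distribution.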
• When $\sigma$ is given (or n > 100) we use the standard normal z-distribution,
• When $\sigma$ is not given we use a t-distribution with n-1 degrees of freedom,
• It is better to check the data for normality.

Z-test for a single mean
$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
$\bar{x}$ - sample mean
$\mu_0$ - hypothesized population mean
$\sigma$ - population standard deviation
$n$ - sample size

The Standard Normal Curve
The standard normal curve (SNC) can also describe the sampling distribution of the mean $\bar{x}$: the distribution of many sample means of a given sample size.
[Figure: standard normal curve centred on $\mu$, axis from -3 to 3]

95% Probability Interval
95% of all sample means ($\bar{x}$) fall in the central region. The area in the tails is called alpha, so we have 5% left, evenly divided between both tails.
$\alpha = 5\%$ or 0.05
$\frac{\alpha}{2} = \frac{5\%}{2} = 2.5\%$ or 0.025 in each tail

95% Probability Interval
By doing so, we can assign z-scores (t-scores) to the upper and lower boundaries of the 95% interval. If we know the population standard deviation $\sigma$, we treat the sampling distribution as a standard normal curve (z-curve). If not, we have to estimate $\sigma$ and use the t-distribution (t-curve).
95% of all sample means ($\bar{x}$) lie within $\mu \pm 1.96\sigma_{\bar{x}}$.

95% Probability Interval
What is the standard deviation of the sampling distribution? It is the standard error of the mean $\sigma_{\bar{x}}$ (not $\sigma$, the standard deviation of the population). The standard error of the mean depends on:
• Sample size ($\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$).
95% of all sample means ($\bar{x}$) lie within $\mu \pm 1.96\sigma_{\bar{x}}$.

Two-tailed z-test rejection region
The critical value is determined by $\alpha$ and by whether we are using the z- or the t-distribution.
$H_0: \mu = \mu_0$
$H_a: \mu \neq \mu_0$
$\alpha = 0.05$
With $\alpha$ and $\sigma$ known, we consult the z-table and find the corresponding z-scores for a two-tailed test.
Critical values: $z = -1.96$ and $z = +1.96$
Rejection regions: $\frac{\alpha}{2} = 0.025$ in each tail; the nonrejection region lies between the critical values.

The Student's t-distribution
1. In general, the t-distribution is shorter in the middle and fatter in the tails.
2. More probability in the tails, less near the mean, greater chance of extreme values.
3. There isn't just one t-distribution.
4. There is a t-distribution for every sample size.
5. Degrees of freedom: n-1.
6. The smaller the sample size, the shorter and fatter the distribution, with more tail probability.
7. However, as n becomes large, the t-distribution tends to the z-distribution.
Area (probability) under the curve = 1.
When we do not know $\sigma$ we have to estimate it. This estimation, this uncertainty, forces us to use the Student's t-distribution.

The Student's t-distribution: 95% probability interval
By doing so, we can assign t-scores to the upper and lower boundaries of the 95% interval for each sample size. If we do not know the population standard deviation $\sigma$, we treat the sampling distribution as a t-distribution with n-1 degrees of freedom.
For n = 10 (df = 9): $t_{\alpha/2} = \pm 2.262$, compared with $z = \pm 1.96$.

95% distribution comparison
Z-distribution: $\pm 1.96$

n      df    Interval
10     9     ±2.262
30     29    ±2.045
75     74    ±1.993
100    99    ±1.984

Sample standard deviation
• When we do not know the population standard deviation, we use the sample standard deviation to approximate it,
• This approximation comes at a cost, though, in terms of our interval estimate,
• We must use the t-distribution instead of the z-distribution to account for this estimation of $\sigma$,
• Every sample size will have its own t-distribution with degrees of freedom df = n-1,
• Our standard error will now be:
$s_{\bar{x}} = \frac{s}{\sqrt{n}}$ where $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Different standard errors
• In the previous examples we knew the population standard deviation $\sigma$ and it was therefore fixed in the standard error formula,
• This meant that all samples of the same size had the same standard error,
• When $\sigma$ is unknown we estimate it with the sample standard deviation, $s$,
• Since every sample has its own $s$, samples of the same size do not necessarily have the same standard error,
• The randomness of sample selection is reflected in the sample's standard deviation and therefore in its standard error.

The Student's t-distribution: 95% probability interval
$s_{\bar{x}} = \frac{s}{\sqrt{n}}$
$s_{\bar{x}}$ is largely dependent on sample size. The standard deviation of the sampling distribution is now the standard error of the mean. If the sample size is small, $s_{\bar{x}}$ becomes larger and thus the distribution becomes wider.

T-test for a single mean
$t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
$\bar{x}$ - sample mean
$\mu_0$ - hypothesized population mean
$s$ - sample standard deviation
$n$ - sample size
Question: Is this t-test value in the nonrejection region or the rejection region, based on df = n-1?

Two-tailed t-test rejection region, n = 20
With $\alpha$ known, $\sigma$ unknown, and a sample of 20 (df = 19), we consult the t-table and find the corresponding t-scores for a two-tailed test.
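Instead of a printed t-table, the critical values can be approximated numerically. The sketch below is one possible stdlib-only approach, not the method used on the slides: it integrates the t density with Simpson's rule and inverts the tail area by bisection. The function names, the step count and the search range of 0 to 50 are assumptions chosen for illustration; a statistics library would normally be used instead.

```python
import math


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))


def t_upper_tail(t: float, df: int, steps: int = 1000) -> float:
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha: float, df: int, two_tailed: bool = True) -> float:
    """Positive critical value whose upper-tail area is alpha (or alpha / 2)."""
    tail = alpha / 2 if two_tailed else alpha
    lo, hi = 0.0, 50.0  # assumed search range; wide enough for the cases below
    for _ in range(40):  # bisection: the tail area decreases as t grows
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Sample sizes from the comparison table, plus the n = 20 case.
for n in (10, 20, 30, 75, 100):
    print(f"n = {n:3d}, df = {n - 1:2d}: t = ±{t_critical(0.05, n - 1):.3f}")
```

As n grows, the printed values approach the z-distribution value of 1.96, which is the behaviour described in the list of t-distribution properties.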
$H_0: \mu = \mu_0$
$H_a: \mu \neq \mu_0$
$\alpha = 0.05$
Critical values: $t = -2.093$ and $t = +2.093$
Rejection regions: $\frac{\alpha}{2} = 0.025$ in each tail; the nonrejection region lies between the critical values.

General t-distribution properties
1. A smaller sample size means more sampling error.
2. This sampling error due to small n means a higher probability of extreme sample means.
3. More probability in the tails means the center hump of the t-distribution must come downward.
4. This process pushes the distribution downward and outward, thus moving the critical values.
5. Given the same $\alpha$ and $s$, a smaller n will push the critical values outward into the tails due to the uncertainty associated with small n.

Hypothesis testing procedure
1. Start with a clear research problem.
2. Establish the hypotheses, null and alternative.
3. Determine the appropriate statistical test and sampling distribution.
4. Choose $\alpha$.
5. State the decision rule.
6. Gather sample data.
7. Calculate the test statistic.
8. State the statistical conclusion.
9. Make a decision.

Business analyst salaries
A report from 6 years ago indicated that the average gross salary for a business analyst was $69,873. Since this survey is now outdated, the Bureau of Labor Statistics (BLS) wishes to test this figure against current salaries to see if current salaries are statistically different from the old ones. For this study, the BLS takes a sample of 12 current salaries. Based on this sample, we found s = $14,985. We do not know $\sigma$ and therefore estimate it using $s$.

Business analyst salaries
1. Establish hypotheses
$H_0: \mu = \$69{,}873$
$H_a: \mu \neq \$69{,}873$
2. Determine the appropriate statistical test and sampling distribution
This will be a two-tailed test: salaries can be higher or lower. Since $\sigma$ is unknown and n is small, we will use the t-distribution:
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

Business analyst salaries
3. Specify the error rate (significance level)
$\alpha = 0.05$
4. State the decision rule
For df = 11: if t > 2.201, reject $H_0$; if t < -2.201, reject $H_0$.
Rejection regions: $\frac{\alpha}{2} = 0.025$ in each tail.

Business analyst salaries
5. Gather data
n = 12, $\bar{x} = \$79{,}180$
6. Calculate the test statistic
$\bar{x} = \$79{,}180$, $\mu_0 = \$69{,}873$, $s = \$14{,}985$, $n = 12$
$t = \frac{\$79{,}180 - \$69{,}873}{\$14{,}985 / \sqrt{12}} = 2.15$

Business analyst salaries
7 and 8. State the statistical conclusion
$H_0: \mu = \$69{,}873$ OK!
$H_a: \mu \neq \$69{,}873$
Since the t-statistic (2.15) is in the nonrejection region, we fail to reject the null hypothesis. It is not "out of the ordinary" for this sample to have come from a population with $\mu = \$69{,}873$ when df = 11.

Business analyst salaries, n = 15
3. Specify the error rate (significance level)
$\alpha = 0.05$
4. State the decision rule
For df = 14: if t > 2.145, reject $H_0$; if t < -2.145, reject $H_0$.
The nonrejection region shrinks!

Business analyst salaries, n = 15
5. Gather data
n = 15, $\bar{x} = \$79{,}180$
6. Calculate the test statistic
$\bar{x} = \$79{,}180$, $\mu_0 = \$69{,}873$, $s = \$14{,}985$, $n = 15$
$t = \frac{\$79{,}180 - \$69{,}873}{\$14{,}985 / \sqrt{15}} = 2.41$

Business analyst salaries, n = 15
7 and 8. State the statistical conclusion
$H_0: \mu = \$69{,}873$
$H_a: \mu \neq \$69{,}873$ OK!
Since 2.41 > 2.145, the t-statistic is in the rejection region and we reject the null hypothesis.
1. The larger n decreased the standard deviation of the sampling distribution, narrowing it and making $\bar{x}$ stand further out on its own; it is more likely to belong to a different population that does not overlap much with $\mu_0$. This created separation between $\bar{x}$ and $\mu_0$.
2. The larger n led to a higher df. That shrank the nonrejection region.

Starbucks customer satisfaction
Starbucks is interested in assessing customer satisfaction in Toronto. To conduct the study, Starbucks asks 25 customers in the city: "Compared to other coffee houses in Toronto, would you say the customer service at Starbucks is much better than average (5), better than average (4), average (3), worse than average (2), much worse than average (1)?" ("Likert scale"). The mean rating was determined to be 3.5. Based on this sample, the standard deviation was found to be s = 1.4.

Starbucks customer satisfaction
1. Establish hypotheses
$H_0: \mu \leq 3$
$H_a: \mu > 3$
2. Determine the appropriate statistical test and sampling distribution
This will be a one-tailed test: we are interested in a better-than-average rating. Since $\sigma$ is unknown and n is small, we will use the t-distribution:
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

Starbucks customer satisfaction, n = 25
3. Specify the error rate (significance level)
$\alpha = 0.01$
4. State the decision rule
For df = 24: if t > 2.492, reject $H_0$.
Rejection region: $\alpha = 0.01$ in the upper tail.

Starbucks customer satisfaction
5. Gather data
n = 25, $\bar{x} = 3.5$
6. Calculate the test statistic
$\bar{x} = 3.5$, $\mu_0 = 3$, $s = 1.4$, $n = 25$
$t = \frac{3.5 - 3}{1.4 / \sqrt{25}} = 1.79$

Starbucks customer satisfaction
7 and 8. State the statistical conclusion
$H_0: \mu \leq 3$ OK!
$H_a: \mu > 3$
Since t = 1.79 is below the critical value, we fail to reject the null hypothesis that customer satisfaction is average or below.

The p-value method
Based on our $\alpha = 0.01$, we know that 1% of the area (probability) is in the upper tail past $t_{crit} = 2.492$. In the p-value method, we ask how much area (probability) lies above our test statistic of t = 1.79. Using a t-table or Excel (=T.DIST.RT(1.79,24)) we find that this area is 0.043. Since 0.043 is greater than $\alpha = 0.01$, we fail to reject $H_0$.
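The test statistics and the one-tailed p-value from the salary and Starbucks examples can be reproduced with a short script. This is a sketch under stated assumptions: the helper names are hypothetical, and the tail area is obtained by Simpson's-rule integration of the t density, standing in for Excel's T.DIST.RT rather than reproducing its implementation.

```python
import math


def one_sample_t(sample_mean: float, mu0: float, s: float, n: int) -> float:
    """t statistic for a single mean: (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / math.sqrt(n))


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))


def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Business analyst salaries: same mean and s, two sample sizes.
t_salary_12 = one_sample_t(79180, 69873, 14985, 12)
t_salary_15 = one_sample_t(79180, 69873, 14985, 15)
print(f"salaries, n = 12: t = {t_salary_12:.2f} (critical ±2.201)")
print(f"salaries, n = 15: t = {t_salary_15:.2f} (critical ±2.145)")

# Starbucks: one-tailed test with alpha = 0.01 and df = 24.
ALPHA = 0.01
t_starbucks = one_sample_t(3.5, 3, 1.4, 25)
p_starbucks = t_upper_tail(t_starbucks, df=24)
decision = "reject H0" if p_starbucks < ALPHA else "fail to reject H0"
print(f"Starbucks: t = {t_starbucks:.2f}, p = {p_starbucks:.3f} -> {decision}")
```

Comparing the p-value with $\alpha$ and comparing the t statistic with the critical value are two views of the same decision rule, so they always lead to the same conclusion.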
We want to determine whether our sample mean (330.6) indicates that this year's average energy cost is significantly different from last year's average energy cost of $260.

ANOVA
Suppose we want to compare three sample means, $\bar{x}_1$, $\bar{x}_2$ and $\bar{x}_3$, to see if there is a difference between them.
Question: Do all three of these means come from a common population? Is one mean so far away from the other two that it is not from the same population?
[Figure: three sample distributions with means $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$ at different locations relative to the overall mean]

We are not asking if the means are exactly equal. We are asking if each mean likely came from the larger overall population.
Null hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3$

Multiple t-tests
A pairwise comparison means three t-tests, each with $\alpha = 0.05$:
$H_0: \mu_1 = \mu_2$; $\alpha = 0.05$
$H_0: \mu_1 = \mu_3$; $\alpha = 0.05$
$H_0: \mu_2 = \mu_3$; $\alpha = 0.05$
Each test has a 5% Type I error rate (95% confidence), and the error compounds with each t-test:
$0.95 \times 0.95 \times 0.95 = 0.857$
$\alpha = 1 - 0.857 = 0.143$

ANOVA: Analysis of Variance
ANOVA is a variability ratio:
• Variability AMONG/BETWEEN the sample means: the distance of each sample mean from the overall mean.
• Variability AROUND/WITHIN the samples: the internal spread of each sample.

$\frac{\text{Variance Between}}{\text{Variance Within}}$

ANOVA: Analysis of Variance
Total variance has two components: variance between and variance within.
Variance Between + Variance Within = Total Variance
Partitioning: separating total variance into its component parts.
If the variability BETWEEN the means (distance from the overall mean) in the numerator is relatively large compared to the variance WITHIN the samples (internal spread) in the denominator, the ratio will be much larger than 1. The samples then most likely do not come from a common population → reject the null hypothesis that the means are equal.

ANOVA: Analysis of Variance

Variance Between / Variance Within   Decision               Interpretation
LARGE / small                        Reject $H_0$           At least one mean is an outlier and each distribution is narrow; distinct from each other.
similar / similar                    Fail to reject $H_0$   Means are close to the overall mean and/or distributions overlap a bit; hard to distinguish.
small / LARGE                        Fail to reject $H_0$   The means are close to the overall mean and/or distributions melt together.

ANOVA: Analysis of Variance
Variance Between + Variance Within = Total Variance
$F = \frac{\text{Variance Between}}{\text{Variance Within}}$ (the F-ratio)

ANOVA: Analysis of Variance Example
University study skills
Twenty-one students at the Autonomous University of Madrid (AUM) in Spain were selected for an informal study about student study skills; 7 first-year, 7 second-year, and 7 third-year undergraduates were randomly selected. The students were given a study-skills assessment having a maximum score of 100. As researchers we are interested in whether or not a difference exists somewhere between the three different year levels. We will conduct this analysis using a one-way ANOVA.
ANOVA: Analysis of Variance Example
University study skills
Random sample within each group (columns/groups):

Year 1 Scores   Year 2 Scores   Year 3 Scores
82              71              64
93              62              73
61              85              87
74              94              91
69              78              56
70              66              78
53              71              87

Group means:
$\bar{x}_1 = 71.71$   $\bar{x}_2 = 75.29$   $\bar{x}_3 = 76.57$
Overall mean (the mean of all 21 scores taken together):
$\bar{\bar{x}} = 74.52$

Variance and the sum of squares
Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
It averages the squared differences between each observation and the sample mean.
Sum of squares: $SS = \sum (x - \bar{x})^2$

Partitioning the sum of squares
SST - total sum of squares
SSC - column/groups (between) sum of squares
SSE - within/error sum of squares

SST (total) sum of squares
Every one of the 21 scores is compared with the overall mean $\bar{\bar{x}} = 74.52$:
$SST = \sum_{j=1}^{C} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2$

SSC (column/groups) sum of squares
Each group mean (71.71, 75.29, 76.57) is compared with the overall mean $\bar{\bar{x}} = 74.52$, weighted by the group size $n_j$:
$SSC = \sum_{j=1}^{C} n_j (\bar{x}_j - \bar{\bar{x}})^2$

SSE (within/error) sum of squares
Every score is compared with the mean of its own group:
$SSE = \sum_{j=1}^{C} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2$

Formulas for ANOVA
N = total observations, C = number of columns
SSC, sum of squares (columns): $df_{columns} = C - 1$, $MSC = \frac{SSC}{df_{columns}}$
SSE, sum of squares (within/error): $df_{error} = N - C$, $MSE = \frac{SSE}{df_{error}}$
SST, sum of squares (total): $df_{total} = N - 1$
$F = \frac{MSC}{MSE}$

Formulas for ANOVA - our case
N = 21, C = 3
SSC: $df_{columns} = 2$, $MSC = \frac{88.67}{2} = 44.33$
SSE: $df_{error} = 18$, $MSE = \frac{2812.57}{18} = 156.25$
SST: $df_{total} = 20$
$F = \frac{44.33}{156.25} = 0.28$

ANOVA chart

Source of variance   df   SS        MS       F
Between (columns)    2    88.67     44.33    0.28
Within (error)       18   2812.57   156.25
Total                20   2901.24

$F = \frac{MSC}{MSE}$, compared with the critical value $F_{\alpha, df_C, df_E}$.
With $df_C = 2$ and $df_E = 18$: $F_{0.05,2,18}$ = F.INV.RT(0.05,2,18) in Excel, so $F_{crit} = 3.55$.
Is the F-statistic larger than F-crit? NO. Fail to reject $H_0$. There is no significant difference in mean test score by year of student.
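The sums of squares in the ANOVA chart can be recomputed directly from the 21 scores. The following is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value 3.55 is copied from the slides rather than computed.

```python
from statistics import mean

# Study-skills scores by year, as listed in the example table.
groups = {
    "Year 1": [82, 93, 61, 74, 69, 70, 53],
    "Year 2": [71, 62, 85, 94, 78, 66, 71],
    "Year 3": [64, 73, 87, 91, 56, 78, 87],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)
N = len(all_scores)  # total observations
C = len(groups)      # number of columns (groups)

# Between-groups sum of squares: group means vs. overall mean, weighted by group size.
ssc = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
# Within-groups sum of squares: each score vs. its own group mean.
sse = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)
# Total sum of squares: each score vs. the overall mean.
sst = sum((x - grand_mean) ** 2 for x in all_scores)

msc = ssc / (C - 1)  # mean square between (df = C - 1)
mse = sse / (N - C)  # mean square within (df = N - C)
f_stat = msc / mse

F_CRIT = 3.55  # F(0.05; 2, 18), value taken from the slides

print(f"SSC = {ssc:.2f}, SSE = {sse:.2f}, SST = {sst:.2f}")
print(f"MSC = {msc:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print("reject H0" if f_stat > F_CRIT else "fail to reject H0")
```

The partition can be checked in the output: SSC and SSE add up to SST, which is the "Variance Between + Variance Within = Total Variance" idea expressed in sums of squares.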
How can we predict data statistically?
1. Collect a sample (the data).
2. Calculate statistical parameters.
3. Fit a distribution.
4. Predict future results → confidence interval.
In what range, with a specific certainty, do we expect future results?

Confidence intervals

Gumball guessing game
The object of the game is to guess how many gumballs are in the jar.
• If you guess within 5 (±5) I will give you the gum plus 50 PLN,
• If you guess within 15 (±15) I will give you the gum plus 15 PLN,
• If you guess within 30 (±30) I will give you the gum.
But before you start guessing you have to decide: how confident are you?
Confidence interval β’ Note that in the preceding examples we used the terms βconfidentβ and βintervalβ in the form of ±, β’ When estimating a population parameter using a sample statistics it is never going to be perfect; there will always be an error, β’ To estimate a population parameter without an error we would have to include all the sample in the population ο no practical sense, β’ We can express that error, or uncertainty, using an interval estimate: πππππ‘ ππ π‘ππππ‘π ± ππππππ ππ πππππ C.I. and standard error β’ In the previous exmaples we discussed the standard error of the mean, β’ To find the standard error of the mean (SEM) we need to know two things: 1) the population standard deviation and 2) the sample size, β’ Most often we do not know the population standard deviation (PSD) and therefore we have to estimate it, β’ Also remember that for any PSD, increasing the sample size reduces standard error, β’ So we are left with idea that the confidence interval will be affected by all these points: standard deviation, sample size and level of βconfidenceβ we are satisfied with. The Standard Normal Curve SNC is also a sampling distribution of the mean π₯. The distribution of many sample means of give sample size. π -3 -2 -1 0 1 2 3 95% Probability Interval The are in the tails is called alpha. Therefore we have 5% left evenly divided between both tails. πΌ = 5% ππ 0.05 π 95% of all sample means (π₯) are in here. 5% = 2.5% ππ 0.025 2 5% = 2.5% ππ 0.025 2 πΆ π πΆ π -3 -2 -1 0 1 2 3 95% Probability Interval By doing so, we can assign z-scores (t-scores) to the uppar and lower boundary of the 95% interval. If we the population standard deviation π we treat the sampling distribution as a standard normal curve (z-curve). If not we have to estimate π and use t distribution (t-curve). π 95% of all sample means (π₯) are in here. βπ. πππ πΆ π -3 -2 +π. πππ π ± π. 
95% Probability Interval

What is the standard deviation of the sampling distribution? It is the standard error of the mean σ_x̄ (not σ, the standard deviation of the population). The standard error of the mean depends on:
• Sample size:

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

[Figure: standard normal curve centered on μ; 95% of all sample means (x̄) lie between −1.96σ_x̄ and +1.96σ_x̄, with α/2 in each tail.]

95% Probability Interval

We take many samples of the same size and ask of each one: does this sample's interval contain μ? As soon as a sample mean steps outside the dotted region (μ ± 1.96σ_x̄), μ is no longer in its interval. Samples of the same size have the same standard error σ_x̄, so the 95% "width" is the same for all samples of that size.

Interpretation
• The randomness lies in the elements chosen for the sample, NOT in the population mean.
• The confidence level is the probability of obtaining a representative sample.
• It is the proportion of samples of size n for which our estimate, the sample mean x̄, is within a certain distance (±) of the true population mean μ.
• For any one sample, the sample mean either is within the ± interval of the true mean or it is not (no probability).
• The confidence level IS NOT the probability that the population mean lies within the interval.

Example

To estimate the mean amount spent per customer at a TGI Friday's, data was collected for 75 customers. We are to assume the population standard deviation is $4.
1. At 95% confidence, what is the margin of error?
2. If the sample mean is $20, what is the 95% confidence interval for the population mean (all customers)?

Example 1. At 95% confidence, what is the margin of error?

n = 75, σ = 4

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{75}} \approx 0.462 \]

\[ \bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm 1.96 \cdot 0.462 \approx \bar{x} \pm 0.91 \]

Example 2. If the sample mean is $20, what is the 95% confidence interval for the population mean (all customers)?

\[ \bar{x} \pm 0.91 = 20 \pm 0.91 \]

That is, from $19.09 to $20.91.

95% Probability Interval

All samples of n = 75 will have 0.91 as the margin of error, assuming σ is known to be 4.
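The margin-of-error arithmetic in the TGI Friday's example (n = 75, σ = 4, sample mean $20) can be checked with a short Python sketch. It assumes σ is known and takes the z boundary from the standard library's `statistics.NormalDist`; the helper name `z_interval` is purely illustrative and does not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (margin_of_error, lower, upper) for a mean when sigma is known.

    Hypothetical helper for illustration; not part of the original slides.
    """
    alpha = 1.0 - confidence
    # Two-sided boundary: leave alpha/2 in each tail (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    standard_error = sigma / sqrt(n)
    margin = z * standard_error
    return margin, sample_mean - margin, sample_mean + margin


# Values from the example: n = 75, sigma = 4, sample mean = 20.
margin, lower, upper = z_interval(sample_mean=20.0, sigma=4.0, n=75)
print(f"margin of error = {margin:.2f}")
print(f"95% interval = ({lower:.2f}, {upper:.2f})")
```

Because the margin depends only on σ, n and the confidence level, calling the helper with a different sample mean shifts the interval but leaves its width unchanged, which is the point made about all samples of n = 75.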
95% of all intervals made using x̄ ± 0.91 will contain the unknown POPULATION MEAN. If we take another 100 samples of n = 75 and make intervals x̄ ± 0.91, about 95 of them will contain μ.

[Figure: sampling distribution centered on μ; 95% of all sample means (x̄) lie in the central region, with 0.025 in each tail. The sample interval 20 ± 0.91 runs from 19.09 to 20.91: we are 95% confident.]

Example

Brick-and-mortar stores like Castorama etc. have a difficult time dealing with the practice of "show-rooming", whereby customers come into the store, examine items, leave and then buy them cheaper on Allegro. We call these "false customers". Sales associates spend time with customers who have no intention of buying anything at the store, which leads to unproductive time. Management in one company determined that if less than 15% of a salesperson's 8-hour day (4320 seconds) is spent with "false customers", then show-rooming is not a major problem.

Example

To determine whether current show-rooming is really a problem based on the company's standards, 125 sales associates are randomly selected and their service times with false customers are measured. For this problem we are going to assume a population standard deviation of σ = 1958 seconds. The results are:

Standard error of the mean

n = 125, x̄ = 3661.5, σ = 1958

\[ \sigma_{\bar{x}} = \frac{1958}{\sqrt{125}} = 175.13 \]

95% Confidence Interval

\[ \bar{x} \pm 1.96\,\sigma_{\bar{x}} = 3661.5 \pm 1.96 \cdot 175.13 = 3661.5 \pm 343.25 \]

95% Probability Interval

We are 95% confident that a salesperson spends between 3318.25 s and 4004.75 s interacting with false customers each 8 h day. There is still a 5% chance that we selected an unrepresentative sample, so the true mean could lie outside this interval, for example as high as the 4320 s concern point. False customers are not a great enough drain to justify policy changes.

[Figure: sampling distribution with 0.025 in each tail; the interval 3661.5 ± 343.25 runs from 3318.25 to 4004.75, and the concern point of 4320 seconds lies beyond the upper bound.]

Summing up
• The randomness lies in the elements chosen for the sample, NOT in the population mean.
• The confidence level is the probability of obtaining a representative sample.
• It is the proportion of samples of size n for which our estimate, the sample mean x̄, is within a certain distance (±) of the true population mean μ.
• For any one sample, the sample mean either is within the ± interval of the true mean or it is not (no probability).
• The confidence level IS NOT the probability that the population mean lies within the interval.

The Student's T-distribution
1. In general, the t-distribution is shorter in the middle and fatter in the tails.
2. More probability in the tails, less near the mean, greater chance of extreme values.
3. There isn't just one t-distribution.
4. There is a t-distribution for every sample size.
5. Degrees of freedom (n − 1).
6. The smaller the sample size, the shorter and fatter the distribution, with more tail probability.
7. However, as n becomes large, the t-distribution tends to the z-distribution.

Area (probability) under the curve = 1. When we do not know σ we have to estimate it. This estimation, and the uncertainty it brings, forces us to use the Student's t-distribution.

The Student's T-distribution: 95% Prob. Int.

If we do not know the population standard deviation σ, we treat the sampling distribution as a t-distribution with n − 1 degrees of freedom. By doing so, we can assign t-scores to the upper and lower boundary of the 95% interval for each sample size. The area (probability) under the curve is still 1.

[Figure: for n = 10 (df = 9), 95% of all sample means (x̄) lie between t = −2.26 and t = +2.26, which is wider than the z boundaries of −1.96 and +1.96.]

95% Distribution comparison

Z-distribution: ±1.96

n      df     Interval
10     9      ±2.262
30     29     ±2.045
75     74     ±1.993
100    99     ±1.984

Sample standard deviation
• When we do not know the population standard deviation, we use the sample standard deviation to approximate it,
• This approximation comes at a cost, though, in terms of our interval estimate,
• We must use the t-distribution instead of the z-distribution to account for this estimation of σ,
• Every sample size will have its own t-distribution with degrees of freedom df = n − 1,
• Our standard error will now be:

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} \]

Different standard errors
• In the previous examples we knew the population standard deviation σ and it was therefore fixed in the standard error formula,
• This meant that all samples of the same size had the same standard error,
• When σ is unknown we estimate it with the sample standard deviation, s,
• Since every sample has its own s, samples of the same size do not necessarily have the same standard error,
• The randomness of sample selection is represented in its standard deviation and therefore in its standard error.

The Student's T-distribution: 95% Prob. Int.

The standard deviation of the sampling distribution is now the estimated standard error of the mean, s_x̄ = s/√n, which is largely dependent on sample size. If the sample size is small, s_x̄ becomes larger and thus the distribution becomes wider. 95% of all sample means (x̄) lie in the central region.

Example

To estimate the mean amount spent per customer at a TGI Friday's, data was collected for 15 customers. The sample standard deviation is $4.
1. At 95% confidence, what is the margin of error?
2. If the sample mean is $20, what is the 95% confidence interval for the population mean (all customers)?

Example 1. At 95% confidence, what is the margin of error?

n = 15, s = 4, df = 14, t = 2.145

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{4}{\sqrt{15}} = 1.03 \]

\[ \bar{x} \pm 2.145\,s_{\bar{x}} = \bar{x} \pm 2.145 \cdot 1.03 = \bar{x} \pm 2.21 \]

Example 2. If the sample mean is $20, what is the 95% confidence interval for the population mean (all customers)?
\[ \bar{x} \pm 2.21 = 20 \pm 2.21 \]

That is, from $17.79 to $22.21.

95% Probability Interval

n = 15, s = 4, df = 14, α = 0.05, t = 2.145

We are not saying that there is a 95% probability that the population mean is in this particular interval. We are saying that there is a 95% probability that an interval built this way contains the population mean. The randomness is in the interval, not the mean.

[Figure: t-curve with df = 14; 95% of all sample means (x̄) lie in the central region, and the interval 20 ± 2.21 runs from 17.79 to 22.21.]

Example

Brick-and-mortar stores like Castorama etc. have a difficult time dealing with the practice of "show-rooming", whereby customers come into the store, examine items, leave and then buy them cheaper on Allegro. We call these "false customers". Sales associates spend time with customers who have no intention of buying anything at the store, which leads to unproductive time. Management in one company determined that if less than 15% of a salesperson's 8-hour day (4320 seconds) is spent with "false customers", then show-rooming is not a major problem.

Example

To determine whether current show-rooming is really a problem based on the company's standards, 30 sales associates are randomly selected and their service times with false customers are measured. For this problem we are going to use the sample standard deviation, s = 1958 seconds. The results are:

Standard error of the mean

n = 30, x̄ = 3661.5, s = 1958

\[ s_{\bar{x}} = \frac{1958}{\sqrt{30}} = 357.48 \]

95% Confidence Interval

n = 30, s = 1958, df = 29, α = 0.05, t = 2.045

\[ \bar{x} \pm 2.045\,s_{\bar{x}} = 3661.5 \pm 2.045 \cdot 357.48 = 3661.5 \pm 731.05 \]

95% Probability Interval

We are 95% confident that a salesperson spends between 2930.45 s and 4392.55 s interacting with false customers each 8 h day. The 4320 s concern point now lies inside the interval, so we cannot rule out a true mean that high. By the company's standard, false customers may in fact be draining resources, and policy intervention is needed.
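The t-based interval for the show-rooming sample of 30 associates can be reproduced with a minimal Python sketch. The standard library has no t-distribution, so the sketch assumes a small lookup table of two-sided 95% boundaries copied from the values quoted in the slides; the names `T_95`, `t_interval` and `CONCERN_POINT` are illustrative only.

```python
from math import sqrt

# Two-sided 95% t boundaries keyed by degrees of freedom, copied from the
# values quoted in the slides (an assumption of this sketch, not computed here).
T_95 = {9: 2.262, 14: 2.145, 29: 2.045, 74: 1.993, 99: 1.984}

CONCERN_POINT = 4320.0  # 15% of an 8-hour day, in seconds


def t_interval(sample_mean, s, n):
    """Return (margin, lower, upper) at 95% when sigma is estimated by s.

    Hypothetical helper; raises KeyError if df = n - 1 is not in T_95.
    """
    t = T_95[n - 1]
    standard_error = s / sqrt(n)  # s_xbar = s / sqrt(n)
    margin = t * standard_error
    return margin, sample_mean - margin, sample_mean + margin


# Values from the example: n = 30, sample mean = 3661.5 s, s = 1958 s.
margin, lower, upper = t_interval(sample_mean=3661.5, s=1958.0, n=30)
print(f"margin of error = {margin:.2f}")
print(f"95% interval = ({lower:.2f}, {upper:.2f})")
print("concern point inside interval:", lower <= CONCERN_POINT <= upper)
```

Checking whether the concern point falls inside the interval is the same comparison the slide makes when deciding whether show-rooming can be ruled out as a problem.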
[Figure: t-curve for n = 30; the concern point of 4320 seconds falls at t = 1.84, below the interval's upper bound of 4392.55.]

Increase sample size from 30 to 50

95% Confidence Interval

n = 50, s = 1958, df = 49, α = 0.05, t = 2.010

\[ \bar{x} \pm 2.010\,s_{\bar{x}} = 3661.5 \pm 2.010 \cdot 276.90 \approx 3661.5 \pm 556.58 \]

We are 95% confident that a salesperson spends between 3104.92 s and 4218.08 s interacting with false customers each 8 h day. There is still a 5% chance that we selected an unrepresentative sample, so the true mean could lie outside this interval, for example as high as the 4320 s concern point. False customers are not a great enough drain to justify policy changes.

[Figure: t-curve for n = 50; the concern point of 4320 seconds falls at t = 2.38, beyond the interval's upper bound of 4218.08.]

Estimating size of the sample

Margin of Error

point estimate ± margin of error

\[ \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]

z_{α/2}: z boundary of the interval probability
σ/√n: standard error

\[ E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]

1. We choose E, our margin of error.
2. We choose our confidence probability boundary z_{α/2}.
3. We are given, or we estimate, the population standard deviation σ.
4. Solve for n.

Margin of Error

\[ E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad \sqrt{n} = \frac{z_{\alpha/2}\,\sigma}{E} \quad\Rightarrow\quad n = \frac{(z_{\alpha/2})^{2}\,\sigma^{2}}{E^{2}} \]

Margin of Error
• Solving for the sample size requires the population standard deviation σ. Most often we do not know it, so we have to use an estimate or "planning value" in its place. Options:
1. Estimate σ from previous studies using the same population of interest.
2. Conduct a pilot study to select a preliminary sample. Use the sample standard deviation from the pilot study.
3. Use judgment or a best guess for σ. A common guess is the data range (high − low) divided by 4.

Example

How large a sample should be selected to provide a 95% confidence interval with a margin of error (E) of 8, assuming the population standard deviation is σ = 36?

\[ n = \frac{(z_{\alpha/2})^{2}\,\sigma^{2}}{E^{2}} = \frac{(1.96)^{2}\,(36)^{2}}{8^{2}} = 77.8 \]

To have 95% of our sample means fall within ±8 of μ, we need a sample size of 78 (always round up).

Example

The question we are asking is: "What minimum sample size is necessary to produce 95% confidence that the sample mean is within ±8 of the true population mean?" As we increase the sample size we reduce the standard error, and our sample most likely becomes more representative of the population. In graphical terms, we set the upper and lower boundary. Then we increase the sample size, pulling the distribution up in the middle and inward on the sides.

Example

The larger sample size ensures that more sample means are within the given margin of error, due to the fact that a large sample is more representative of the overall population. A larger sample size will be required when:
1. A smaller margin of error is required.
2. A higher level of confidence is required.
3. Or both.
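The sample-size formula n = (z_{α/2})² σ² / E² from the margin-of-error slides can be sketched in Python as follows. The sketch assumes a planning value for σ is available and uses `statistics.NormalDist` for the z boundary; the helper name `required_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin_of_error, confidence=0.95):
    """Return (exact_n, rounded_up_n) so that z * sigma / sqrt(n) <= margin_of_error.

    Hypothetical helper for illustration; sigma is a planning value.
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    n_exact = (z * sigma / margin_of_error) ** 2
    # Round up: rounding down would leave the margin of error slightly too wide.
    return n_exact, ceil(n_exact)


# Values from the example: sigma = 36, E = 8, 95% confidence.
n_exact, n_required = required_sample_size(sigma=36.0, margin_of_error=8.0)
print(f"exact n = {n_exact:.1f}, required sample size = {n_required}")
```

Halving the margin of error to 4 roughly quadruples the required sample, since n grows with 1/E²; this is why a smaller margin of error or a higher confidence level both call for a larger sample.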