Estimations from sample data

When we knew the population mean and standard deviation, we used the standard normal (z) distribution to determine the proportion of sample means that fell within a given number of standard error units from the known population mean. In inferential statistics, we typically use the mean and standard deviation of a single sample as our best estimates of the unknown population mean and standard deviation. If we establish the sample mean as the midpoint of an interval estimate for the population mean, the resulting interval may or may not include the actual value of the population mean.

Definitions:

Interval limits: the lower and upper values of the interval estimate.
Confidence interval: an interval for which there is a specified degree of certainty that the actual population parameter will fall within the interval.
Confidence coefficient / confidence level: expresses the degree of certainty that an interval will include the actual value of the population parameter. Coefficients are expressed as a number between 0 and 1 (.95). Levels are expressed as percentages (95%).
Accuracy: the difference between the sample statistic and the actual parameter. Sometimes called sampling error.

Student t-distribution

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

Degrees of freedom (df) = n - 1

Confidence intervals

$$\bar{x} \pm t \frac{s}{\sqrt{n}}$$

Where:
$\bar{x}$ = sample mean
t = value corresponding to the desired level of confidence
s = standard deviation of the sample
n = sample size

Sample Size

So far, we took the results from our random samples and constructed a confidence interval. Now, let's work the problem backwards; that is, let's start with the confidence level and the maximum error we are willing to accept, and figure out how large our sample needs to be to achieve them. You probably won't know the population standard deviation. You can use estimates based on other studies, conduct a small-scale pilot test, or estimate it as 1/6th of the range of the data values.
$$n = \frac{z^2 \cdot \sigma^2}{e^2}$$

Where:
n = the required sample size
z = value for which ±z corresponds to the desired level of confidence
σ = known or estimated value for the standard deviation
e = maximum likely error that is acceptable

Hypothesis Testing

The null hypothesis (H0) is the statement that will be tested. You can consider it the status quo, business as usual, or the accepted truth. Everything that's not the null hypothesis becomes the alternate hypothesis (Ha). Between the null and the alternate, all possibilities must be accounted for.

Type I Error: rejecting the null hypothesis when it is actually true. The probability of making a type I error is determined by the significance level of the test (α).
Type II Error: failing to reject the null hypothesis when the alternate hypothesis is actually true. The probability of making a type II error depends on several factors, but I don't think we discussed any of them.

Steps for hypothesis testing

1. Formulate the null and alternate hypotheses.
2. Select the significance level.
   a. The significance level is the area beneath the tail(s) of the probability distribution. We can choose any significance level, but in practice, .05 is the most common. Rarely would we use a significance level > .10, and sometimes we'll go as low as .01.
3. Calculate the value of the test statistic.
   a. We're familiar with z and t values (the choice depends on sample size and whether σ is known).
4. Identify the critical value(s) for the test statistic and state the decision rule.
5. Compare the calculated and critical values to reach a conclusion about the null hypothesis.

p-value

The p-value is the probability that we would see a test statistic at least as extreme as the one we observed, if the null hypothesis were true. If the p-value is less than our level of significance, we have sufficient evidence to reject the null hypothesis.
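
To make the formulas in these notes concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not part of the original notes: the function names, the sample data, and the numbers in the usage lines are all hypothetical. The t value is passed in by hand (2.262 is the usual table value for 95% confidence with df = 9), and the hypothesis-test helper assumes a two-tailed z test with σ known.

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, t_value):
    """Interval limits for the population mean: x_bar ± t * s / sqrt(n)."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    margin = t_value * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def required_sample_size(confidence, sigma, e):
    """n = z^2 * sigma^2 / e^2, rounded up to a whole observation."""
    # z such that ±z encloses the desired confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * sigma ** 2 / e ** 2)


def z_test_p_value(x_bar, mu_0, sigma, n):
    """Two-tailed p-value for H0: mu = mu_0, assuming sigma is known."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data: 10 observations, so df = 9 and t = 2.262 at 95%.
sample = [12, 15, 11, 14, 13, 16, 12, 14, 13, 15]
lower, upper = confidence_interval(sample, t_value=2.262)
print(round(lower, 3), round(upper, 3))

# Hypothetical planning values: 95% confidence, sigma = 10, e = 2.
print(required_sample_size(0.95, sigma=10, e=2))  # → 97

# Hypothetical test: sample mean 52 vs. H0 mean 50, sigma = 5, n = 25.
alpha = 0.05
p = z_test_p_value(x_bar=52, mu_0=50, sigma=5, n=25)
print("reject H0" if p < alpha else "fail to reject H0")
```

The sample size is rounded up rather than to the nearest integer, since rounding down would leave the maximum likely error slightly above e.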