Statistics Final Exam Quick Notes

Exam Layout and Content

Question 1: Threshold 5 (10 marks)
● Calculating probabilities using the sampling distribution of the sample mean
● Calculating the standard error of the sampling distribution of the sample mean for:
  ○ a finite population
  ○ an infinite population

Question 2: Threshold 6 (10 marks)
● Calculating/constructing the confidence interval using the z-distribution
● Interpreting a confidence interval
● Hypothesis testing when the population standard deviation is known

Questions 3 and 4: any material from lectures 9, 10, 11
● Lecture 9: Sampling and Sampling Distribution (Threshold 5)
● Lecture 10: Confidence Interval (Threshold 6)
● Lecture 11: Hypothesis Testing (Threshold 6)

Notes

Lecture 9: Sampling and Sampling Distribution (T5)

Introduction
● A sample consists of individual observations that are random draws from a population
● A sample mean is an estimate of the true population mean
● But, based on one sample, how accurately can you estimate the population parameter (i.e. population mean)?
● A sampling distribution of a statistic tells us how close the statistic (e.g. the sample mean) is likely to be to the parameter (e.g. population mean)
● Sampling error: the difference between a statistic computed from a sample and the corresponding population parameter. It can be reduced by using a larger sample.

Central Limit Theorem
● The sampling distribution of the sample mean is approximately normal if at least one of these conditions is met:
  1. The original (population) distribution is normal
  2. The sample size n ≥ 30 (regardless of the original distribution)

The Sampling Distribution of the Sample Mean
● DEFINITION: The sampling distribution of the mean is the probability distribution of sample means, with all samples having the same sample size n and taken from the same population.
● It is characterised by:
  ○ Mean: $\mu_{\bar{x}} = \mu$
  ○ Standard deviation (the standard error): $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
* Note: x̄ space is the distribution of averages (sample means); x space is the distribution of the individual observations in an ordinary sample

Standard Error (SE)
● Is the standard deviation of all possible sample means
● The higher the standard error, the lower the accuracy of the estimate

Standard Error Formulas
● Infinite population: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
● Finite population (size N): $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$
* Note: $\sqrt{(N - n)/(N - 1)}$ is the finite correction factor (FCF)
* Note: if the population standard deviation is unknown, we replace σ with s

Standardising the Sampling Distribution (using Z-score) and Finding Probabilities
● We can standardise the sampling distribution of the sample mean using the z-score: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
● We can now use the Z table to find probabilities.

Sampling Distribution of the Sample Proportion
● Let X be the number of times a particular outcome (success) occurs in n repeated trials
● To estimate the population proportion of success, p, we use the sample proportion: $\hat{p} = X / n$
● If both np ≥ 5 and nq ≥ 5 (where q = 1 − p), then the sample size is considered large.
● In that case the sampling distribution of the proportion can be approximated by a normal probability distribution with mean p and standard error $\sqrt{pq / n}$

Lecture 10: Confidence Interval (T6)

Introduction
● Confidence intervals are constructed to provide an estimate of how close the sample mean is to the population mean after accounting for a certain margin of error.
● Whether the population standard deviation is known or unknown determines whether we use the z or the t distribution to obtain the critical value, which is then used to calculate the confidence interval.

Point Estimate
● The sample mean, a single value, is referred to as the point estimate.

Confidence Interval (CI)
● To make statements about unknown population parameters (e.g. population mean) with greater confidence, we can develop an interval estimator
● An interval estimator draws inference about a population by estimating the value of the unknown population parameter using an interval.
● This interval is called the Confidence Interval (CI)

* Further Explanation
● Different samples taken from a population will give different means (x̄)
● So we give an interval instead of a specific point
● That interval says: the real (i.e. population) mean lies somewhere in the interval, with the stated level of confidence
● We are trying to capture the pop. mean in the interval. The wider the interval, the more likely the pop. mean will be in that interval

Constructing a Confidence Interval (standard deviation is known)
Confidence interval (CI) = point estimate (x̄) ± (critical value) × (standard error)
Confidence interval (CI) = $\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
* Note: (critical value) × (standard error) is the margin of error (E)

Constructing a Confidence Interval (standard deviation is unknown)
● In most sampling situations the population standard deviation is unknown.
● Instead of the population standard deviation σ we use the sample standard deviation, s
● Instead of the z distribution we use the t distribution to obtain the critical value (t critical value) and then the confidence interval.
Confidence interval (CI) = $\bar{x} \pm t_{\alpha/2,\, n-1} \dfrac{s}{\sqrt{n}}$
Where n − 1 = degrees of freedom (df)

Commonly Used Confidence Levels
● 90% confidence → z critical value = 1.645
● 95% confidence → z critical value = 1.96
● 99% confidence → z critical value = 2.576

Determining the Appropriate Sample Size
● Formula: $n = \left( \dfrac{z_{\alpha/2}\, \sigma}{E} \right)^2$
● E is the margin of error
● Always ROUND UP, e.g. 47.000000256 → round to 48

Confidence Interval of Population Proportion
● Population proportion: for qualitative variables the pop. proportion is a parameter of interest.
● Confidence interval (CI) = $\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$, where $\hat{q} = 1 - \hat{p}$

Determining the Appropriate Sample Size (Population Proportion)
● Formula: $n = \dfrac{z_{\alpha/2}^2\, \hat{p}\hat{q}}{E^2}$ (always round up)

When to Use the Z or T Distribution for Confidence Interval Computation
● Population standard deviation σ is known → use the z distribution
● Population standard deviation σ is unknown (s is used instead) → use the t distribution with df = n − 1

Lecture 11: Hypothesis Testing (T6)

Introduction
● The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favour of a certain belief about a population parameter
● e.g. is there statistical evidence in a random sample of potential customers that supports the belief that consumers spend more than $20 on an online purchase?
You have been employed to test the belief…

Summary of Steps in Hypothesis Testing
● STEP 1: formulate the hypotheses
● STEP 2: determine alpha (α), the level of significance
● STEP 3: determine the standardised test statistic
● STEP 4: determine the critical value
● STEP 5: write the decision rule and draw a conclusion

STEP 1: Set the null hypothesis (H0) and alternate hypothesis (HA)
● Claim always goes to HA
● Equal sign always goes to H0 (≤, ≥, =)
● We are testing a population parameter (e.g. μ)
● Choose from 3 scenarios:
  ○ Lower tail: H0: μ ≥ μ0, HA: μ < μ0
  ○ Upper tail: H0: μ ≤ μ0, HA: μ > μ0
  ○ Two tailed: H0: μ = μ0, HA: μ ≠ μ0

Identifying the Rejection Region
● Rejection region = range of values such that if the test statistic falls into that range, we REJECT the NULL HYPOTHESIS in favour of the alternate hypothesis
● The rejection region lies in the lower tail, the upper tail, or both tails, matching the scenario chosen (the shaded regions in the lecture diagrams).

STEP 2: determine alpha (α), the level of significance
● Types of error:
  ○ Type 1 error: occurs when we reject a true null hypothesis (i.e. reject H0, but H0 is true)
  ○ Type 2 error: occurs when we don't reject a false null hypothesis (i.e. don't reject H0, but H0 is false)
● The probability of a type 1 error is denoted as α (alpha)
● P(making type 1 error) = α
● α is called the level of significance
● α = 1%, 5%, 10% are frequently used in practice
● 1 − α → the confidence level
  5% significance level → α = 5% = 0.05 → confidence level = 1 − 0.05 = 0.95
  10% significance level → α = 10% = 0.10 → confidence level = 1 − 0.10 = 0.90

STEP 3: determine the standardised test statistic
● σ is known: $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
● σ is unknown: $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$

STEP 4: determine the critical value
● σ is known: Z crit → use the Z table, you need α
● σ is unknown: t crit → use the t table, you need α and df (df = n − 1)

Common critical values (z)
● α = 0.10: one tailed 1.28, two tailed 1.645
● α = 0.05: one tailed 1.645, two tailed 1.96
● α = 0.01: one tailed 2.33, two tailed 2.576

STEP 5: write the decision rule and draw a conclusion
● Making a decision requires a comparison between the test statistic and the critical value (taken here as a positive number)
  LOWER TAIL: if test stat < −critical value → reject H0, otherwise don't reject
  UPPER TAIL: if test stat > +critical value → reject H0, otherwise don't reject
  TWO TAILED: if test stat < −critical value OR test stat > +critical value → reject H0, otherwise don't reject

Hypothesis Testing: P-Value Method
● The p-value of a test is the minimum level of significance that is required to reject the null hypothesis.

P-Value Calculation
● For a 2-sided test, the p-value is calculated as $2 \cdot P(Z > |z|)$; for a 1-sided test it is the area in the tail beyond the test statistic
● If p-value < α, we reject H0
● If p-value ≥ α, we do not reject H0

Example
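As a worked illustration of the notes above, the following Python sketch runs the $20 online-purchase belief through the z-based formulas. It is a minimal sketch under stated assumptions, not the lecture's own example: the sample figures (n = 36, x̄ = 21.5, σ = 4) are invented for illustration, the function names are hypothetical, σ is assumed known, and `statistics.NormalDist` from the Python standard library stands in for the Z table.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution, used in place of the Z table


def standard_error(sigma, n, N=None):
    """Standard error of the sample mean; the FCF is applied only when N is given."""
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))  # finite correction factor
    return se


def prob_mean_below(x, mu, sigma, n):
    """P(sample mean < x), found by standardising with the z-score."""
    return Z.cdf((x - mu) / standard_error(sigma, n))


def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z_crit = Z.inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * standard_error(sigma, n)  # margin of error E
    return xbar - margin, xbar + margin


def sample_size(sigma, margin, confidence=0.95):
    """Smallest n giving the requested margin of error (always rounded up)."""
    z_crit = Z.inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_crit * sigma / margin) ** 2)


def z_test(xbar, mu0, sigma, n, alpha=0.05, tail="upper"):
    """One-sample z test; returns (test statistic, p-value, reject H0?)."""
    z = (xbar - mu0) / standard_error(sigma, n)
    if tail == "upper":
        p = 1 - Z.cdf(z)
    elif tail == "lower":
        p = Z.cdf(z)
    else:  # two tailed
        p = 2 * (1 - Z.cdf(abs(z)))
    return z, p, p < alpha


# Invented sample: 36 customers, mean spend 21.5, known sigma of 4.
# H0: mu <= 20, HA: mu > 20 (the claim goes to HA), alpha = 0.05.
se = standard_error(4, 36)
z, p, reject = z_test(21.5, 20, 4, 36, alpha=0.05, tail="upper")
lo, hi = z_interval(21.5, 4, 36, confidence=0.95)
n_needed = sample_size(4, 1.0, confidence=0.95)
p_below = prob_mean_below(21, 20, 4, 36)

print(f"SE = {se:.4f}, z = {z:.2f}, p-value = {p:.4f}, reject H0: {reject}")
print(f"95% CI = ({lo:.3f}, {hi:.3f}), n for E = 1: {n_needed}")
```

With these invented figures the test statistic is about 2.25, which exceeds the upper-tail critical value of 1.645, so both the critical value method and the p-value method lead to rejecting H0. When σ is unknown, the same steps apply with s and a t critical value (df = n − 1), which this sketch does not cover.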
