CLASS NOTES: Distribution of Sample Means

Population: The set of all the individuals of interest in a particular study (typically large).

Parameter: Any measure obtained by having measured an entire population.

Sample: A set of individuals selected from a population, usually intended to represent the population in a research study.

Statistic: Any measure obtained by having measured a sample.

Application: So, if you hear of a measure being described as a "parameter," then it is referring to a measure associated w/ a population. If you hear of a measure being described as a "statistic," then it is referring to a measure associated w/ a sample.

Techniques in Sampling

When using inferential statistics, the sample used should be representative of the population & contain the same elements as the population.

Random Sampling: Demands that each member of the entire population has an equal chance of being included. No member of the population may be systematically excluded. If everyone does not have an equal chance of being selected, then the sample is not considered random. Such is the case if subjects are allowed to select themselves.

Stratified, or Quota, Sampling: Samples are selected in such a way that the same percentages found in the population are present in the sample. So, the researcher must know beforehand what some of the major population characteristics are & then deliberately select a sample that shares these characteristics in the same proportions.

Example: If a hospital contains
* 5% ages 12 & below
* 20% ages 13 to 30
* 35% ages 31 to 59
* 40% ages 60 & above
then your sample of this group must contain the same percentages for it to be a stratified, or quota, sample.

Situation Sampling: The researcher collects observations in a variety of settings & circumstances.

Example: A sample of cancer patients at 20 different cancer treatment centers throughout the US.
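The hospital example can be turned into sample quotas w/ a little arithmetic. Below is a minimal Python sketch, assuming a total sample size of 200 & made-up patient ID lists. The helper names stratum_quotas & stratified_sample are hypothetical, chosen only for illustration.

```python
import random

# Age-group shares taken from the hospital example in the notes.
STRATA_SHARES = {
    "12 & below": 0.05,
    "13 to 30": 0.20,
    "31 to 59": 0.35,
    "60 & above": 0.40,
}


def stratum_quotas(sample_size, shares):
    """Number of people to draw from each stratum, rounded to whole people."""
    return {name: round(sample_size * share) for name, share in shares.items()}


def stratified_sample(population_by_stratum, sample_size, shares, seed=None):
    """Randomly draw each stratum's quota from that stratum's members."""
    rng = random.Random(seed)
    quotas = stratum_quotas(sample_size, shares)
    return {
        name: rng.sample(population_by_stratum[name], quotas[name])
        for name in shares
    }


# Made-up patient IDs for each age group (illustration only).
patients = {name: [f"{name}-{i}" for i in range(500)] for name in STRATA_SHARES}

quotas = stratum_quotas(200, STRATA_SHARES)
sample = stratified_sample(patients, 200, STRATA_SHARES, seed=1)
print(quotas)
print({name: len(ids) for name, ids in sample.items()})
```

W/ a sample of 200, the quotas come out to 10, 40, 70 & 80 people, which are the same 5%, 20%, 35% & 40% shares as the hospital. Within each age group the people are still picked at random.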
Sampling error: The discrepancy, or amount of error, b/t a sample statistic & its corresponding population parameter. Remember that sampling error is a natural occurrence. It is not a mistake.

Outliers: When one or two scores in a large random sample fall very far from the mean. Either the distribution is not normal or some measurement error has crept in.

Bias: When most of the sampling error loads up on one side, so that the sample means consistently either overestimate or underestimate the population mean. Bias is a constant sampling error in one direction.

Distribution of Sample Means: The collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population.

Sampling Distribution: A distribution of statistics obtained by selecting all the possible samples of a specific size from a population. The distribution of sample means is the sampling distribution of M.

[Diagram: Population -> Sample 1, Sample 2, Sample 3 -> M1, M2, M3]

Application:
* Each sample will have its own individuals, its own scores & its own sample mean.
* The distribution of sample means provides a method for organizing all of the different sample means into a single picture that shows how the sample means are related to each other & how they are related to the overall population mean.
* Sample means should pile up around the population mean.
* The pile of sample means should tend to form a normal-shaped distribution.
* The larger the sample size, the closer the sample means should be to the population mean.

Example: If the entire set contains 100 samples, then the probability of obtaining any specific sample is 1 out of 100:

P = 1/100

The distribution of sample means contains all possible samples, so it is necessary to have all the possible values in order to compute probabilities.

Central Limit Theorem: Provides a precise description of the distribution that would be obtained if you selected every possible sample, calculated every sample mean, & constructed the distribution of the sample means. So, for any population w/ mean µ & standard deviation σ, the distribution of sample means for sample size n will have a mean of µ & a standard deviation of σ / √n, & will approach a normal distribution as n approaches infinity.

Application: This serves as a cornerstone for much of inferential statistics. The value of the central limit theorem comes from 2 different facts:
1. It describes the distribution of sample means for any population, no matter what shape, mean or standard deviation.
2. The distribution of sample means approaches a normal distribution very rapidly. So, by the time n = 30, the distribution is almost perfectly normal.

The Central Limit Theorem identifies the 3 basic characteristics that describe any distribution: shape, central tendency & variability.

The shape of the distribution of sample means: Will be almost perfectly normal if either of the following two conditions is satisfied:
1. The population from which the samples are selected is a normal distribution.
2. The number of scores in each sample is relatively large, around 30 or more.
However, increasing the sample size beyond 30 does not produce much additional improvement in how well the sample represents the population.

The mean of the distribution of sample means (the expected value of M): The mean of the distribution of sample means is equal to µ (the population mean). I.e., the mean of the distribution of sample means will always be identical to the population mean. This mean value is called the expected value of M. The expected value of M is sometimes expressed as µM, but since it is always equal to µ, it is not necessary to notate the expected value of M as µM.

[Diagram: Population -> Sample 1, Sample 2, Sample 3 -> M1, M2, M3 -> µM]

Example: Population = Stress rate for 100 volunteers
* Sample 1 (30 volunteers): M1 = 74
* Sample 2 (50 volunteers): M2 = 70
* Sample 3 (20 volunteers): M3 = 78

µM = ΣM / N = (74 + 70 + 78) / 3 = 222 / 3 = 74

Remember that N in this case represents the number of samples, not the number of all participants.

The standard error of M: The standard deviation of the distribution of sample means.
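The expected value of M & the standard error just defined can be checked by brute force on a tiny population. Below is a minimal Python sketch, assuming a made-up population of four scores (2, 4, 6, 8) & samples of n = 2 drawn w/ replacement. None of these numbers come from the notes. It lists every possible sample, computes every sample mean, & compares the results w/ µ & σ / √n.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

# Hypothetical population of scores (illustration only, not from the notes).
population = [2, 4, 6, 8]
n = 2  # size of each sample

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# One sample mean (M) per sample: this list IS the distribution of sample means.
sample_means = [mean(s) for s in samples]

mu = mean(population)           # population mean
sigma = pstdev(population)      # population standard deviation
mu_M = mean(sample_means)       # mean of the distribution of sample means
sigma_M = pstdev(sample_means)  # standard deviation of the sample means

print("number of possible samples:", len(samples))
print("population mean:", mu, "| mean of sample means:", mu_M)
print("SD of sample means:", sigma_M, "| sigma / sqrt(n):", sigma / sqrt(n))
```

B/c every possible sample is included, the mean of the sample means matches µ exactly, & the standard deviation of the sample means matches σ / √n up to rounding. W/ a real population there are far too many possible samples to list, which is why the formulas are used instead.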
The standard error measures the standard amount of difference b/t M & µ that is reasonable to expect simply by chance. This is like what the standard deviation is to regular raw data, but here we are working w/ whole sample mean values, which changes the formula.

Standard error of M = σM = standard distance b/t M & µ

Standard error = σM = σ / √n

Example: σM = σ / √n = 2.23 / √3 = 2.23 / 1.73 = 1.29

The standard error of M is represented as σM. The standard error of M is very important, as it specifies precisely how well a sample mean estimates its population mean, or how much error you should expect on average b/t M & µ.

The law of large numbers: States that the larger the sample size (n), the more probable it is that the sample mean will be close to the population mean.

The magnitude of the standard error is determined by two factors:
1) the size of the sample
2) the standard deviation of the population from which the sample is selected

There is an inverse relationship b/t sample size & standard error. Bigger samples = smaller error; smaller samples = bigger error. B/c of this rule, if you have n = 1, then the standard error & the standard deviation are the same (σM = σ).

So, the equation σM = σ / √n satisfies the following 2 requirements:
1) As sample size (n) increases (↑), standard error decreases (↓).
2) When the sample consists of a single score (n = 1), the standard error is the same as the standard deviation (σM = σ).

Probability & the Distribution of Sample Means

The primary use of the distribution of sample means is to find the probability associated w/ any specific sample.

Example: Population of SAT scores w/ µ = 500 & σ = 100; sample size n = 25.

B/c of the rules, we know that:
1) The distribution of sample means is normal b/c the population of SAT scores is normal.
2) The distribution has a mean of 500 b/c the population mean is 500.

What is the probability that the sample mean will be greater than M = 540?

Step 1: State the probability you are looking for: P(M > 540) = ?
Step 2: Find your standard error. This is the third thing the rules tell us: 3) for n = 25, the distribution has a standard error of σM = 20.

σM = σ / √n = 100 / √25 = 100 / 5 = 20

Step 3: Locate your z-score, using the z-score formula for sample means: z = (M - µ) / σM

zM = (M - µ) / σM = (540 - 500) / 20 = 40 / 20 = 2.00

Step 4: Draw out your distribution. Since your z-score is +2.00, your sample mean of 540 falls 2 standard errors above the mean.

Step 5: Find the area beyond the z-score of +2.00 on your Unit Normal Table. Your answer would be 0.0228.

Answer: There is a 2.28% chance of obtaining a random sample of n = 25 SAT scores w/ a sample mean greater than 540, so it is very unlikely.

The difference b/t the z-score formula for an X value & the z-score formula for a sample mean is in the denominator. For an X value, your denominator is the standard deviation (σ). For the sample mean, your denominator is the standard error (σM).

Here is a chart to help you w/ your new symbols & formulas:

Sample Means
Concept | Symbol | Formula
Sample Mean | M |
Mean of the Sample Means | µM | µM = ΣM / N
Standard Error of M | σM or SEM | σM = σ / √n
Z-score formula for sample means | z or zM | z = (M - µ) / σM or zM = (M - µM) / σM

To complete the standard error formula, you first need to compute the population standard deviation:

σ = √[ (ΣX² - (ΣX)² / N) / N ]
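The steps of the SAT example can be reproduced in a few lines of Python. This is a sketch only, assuming Python 3.8 or later (for statistics.NormalDist, which stands in for the Unit Normal Table). The helper names standard_error & z_for_sample_mean are hypothetical, chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard error of M: sigma divided by the square root of n."""
    return sigma / sqrt(n)


def z_for_sample_mean(M, mu, sigma, n):
    """z-score for a sample mean: (M - mu) divided by the standard error."""
    return (M - mu) / standard_error(sigma, n)


# Values from the SAT example in the notes.
mu, sigma, n, M = 500, 100, 25, 540

sigma_M = standard_error(sigma, n)      # Step 2: standard error
z = z_for_sample_mean(M, mu, sigma, n)  # Step 3: z-score for M = 540
p_greater = 1 - NormalDist().cdf(z)     # Step 5: area beyond z in the upper tail

print("standard error:", sigma_M)
print("z-score:", z)
print("P(M > 540):", round(p_greater, 4))

# With a single score (n = 1) the standard error equals sigma.
print("standard error when n = 1:", standard_error(sigma, 1))
```

The printed probability rounds to 0.0228, matching the table lookup in Step 5. The last line shows the n = 1 case, where the standard error equals the standard deviation.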