Sampling distributions
Chapter 6, ST 315
Nutan S. Mishra
Department of Mathematics and Statistics
University of South Alabama

Useful links
• http://oak.cats.ohiou.edu/~wallacd1/ssample.html
• http://garnet.acns.fsu.edu/~jnosari/05.PDF
• http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/

Sampling distribution
In chapter 2 we defined a population parameter as a function of all the population values. Let the population consist of N observations; then the population mean and the population standard deviation are parameters:

\[ \mu = f(x_1, x_2, \ldots, x_N) = \frac{\sum_{i=1}^{N} x_i}{N} \]

\[ \sigma = g(x_1, x_2, \ldots, x_N) = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}} \]

For a given population, the parameters are fixed values.

Sampling distribution
On the other hand, if we draw a sample of size n from a population of size N, then a function of the sample values is called a statistic. For example, the sample mean and the sample standard deviation are sample statistics:

\[ \bar{x} = f(x_1, x_2, \ldots, x_n) = \frac{\sum_{i=1}^{n} x_i}{n} \]

\[ s = g(x_1, x_2, \ldots, x_n) = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} \]

Since we can draw a large number of different samples from the population, the value of a sample statistic varies from sample to sample.

Sampling distribution
Since the value of a sample statistic varies from sample to sample, the statistic itself is a random variable and has a probability distribution. For example, the sample mean x̄ is a random variable and it has a probability distribution.

Example: start with a toy example
Let the population consist of 5 students who took a math quiz worth 5 points. The names of the students and their scores are as follows:

Student   A   B   C   D   E
Score     2   3   4   4   5

For this population the mean is µ = 3.6 and the standard deviation is σ = 1.02.

Sampling distribution
Now we repeatedly draw samples of size three from the population of size 5. There are 10 possible samples, listed below. The population parameters are µ = 3.6 and s.d. σ = 1.02.

Sample   Students   Sample values   x̄      s
1        A,B,C      2,3,4           3      1
2        A,B,D      2,3,4           3      1
3        A,B,E      2,3,5           3.33   1.53
4        A,C,D      2,4,4           3.33   1.15
5        A,C,E      2,4,5           3.67   1.53
6        A,D,E      2,4,5           3.67   1.53
7        B,C,D      3,4,4           3.67   0.58
8        B,C,E      3,4,5           4      1
9        B,D,E      3,4,5           4      1
10       C,D,E      4,4,5           4.33   0.58

Sampling distribution
X = score of a student in the math quiz

Population distribution

x    f   P(x)
2    1   .2
3    1   .2
4    2   .4
5    1   .2

Sampling distribution of the sample mean

x̄      f   P(x̄)
3      2   .2
3.33   2   .2
3.67   3   .3
4      2   .2
4.33   1   .1

Thus we see that the sample mean x̄ is a new random variable and has a probability distribution.
Question: What is the mean of this random variable and what is its variance?

Sampling distribution
Let N be the size of the population and n be the size of the sample. In both cases below, the mean of the sample mean is $\mu_{\bar{x}} = \mu$.

If n/N > .05, the standard deviation of the sample mean is

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \]

and if n/N ≤ .05, the standard deviation of the sample mean is

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

Sampling distribution of sample mean
Theorem: Let X be a random variable with population mean µ and population standard deviation σ. If we collect samples of size n, then the new random variable, the sample mean x̄, has the same mean µ and standard deviation σ/√n. We denote them as follows:

\[ \text{mean of } \bar{x}: \quad \mu_{\bar{x}} = \mu \]

\[ \text{standard deviation of } \bar{x}: \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

Sampling distribution of sample mean
It is easy to see that the standard deviation of the sample mean decreases as the sample size increases. The mean of the sample mean is unaffected by a change in sample size.
The sample mean is called an estimator of the population mean, because whenever the population mean is unknown we use the sample mean in its place.

Sampling distribution of sample mean

x̄      P(x̄)
3      .2
3.33   .2
3.67   .3
4      .2
4.33   .1

When we compute the mean and variance from the above table, they are $\mu_{\bar{x}} = 3.6$ and $\sigma_{\bar{x}}^2 \approx 0.173$, so $\sigma_{\bar{x}} \approx 0.42$. The mean equals the population mean µ = 3.6. Because n/N = 3/5 > .05, the standard deviation agrees with the corrected formula: $(1.02/\sqrt{3})\sqrt{(5 - 3)/(5 - 1)} \approx 0.42$.

Sampling distribution of sample mean
We have seen that the distribution of the sample mean x̄ is derived from the distribution of x. Thus the distribution of x is called the parent distribution.
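The toy example can be checked by brute force. The following is a minimal sketch, not part of the original slides: it enumerates every sample of size 3 from the five quiz scores, builds the list of sample means, and compares their mean and standard deviation with the formulas given earlier. The variable names (`scores`, `sample_means`, `predicted`) are illustrative choices, and only the Python standard library is assumed.

```python
from itertools import combinations
from math import isclose, sqrt
from statistics import mean, pstdev

# Quiz scores of the five students in the toy population.
scores = {"A": 2, "B": 3, "C": 4, "D": 4, "E": 5}
N = len(scores)   # population size
n = 3             # sample size

population = list(scores.values())
mu = mean(population)        # population mean
sigma = pstdev(population)   # population sd (divides by N)

# Every possible sample of n students, drawn without replacement.
samples = list(combinations(scores, n))
sample_means = [mean(scores[name] for name in sample) for sample in samples]

# Mean and sd of the sampling distribution of the sample mean.
# Each sample is equally likely, so plain averages are appropriate.
mu_xbar = mean(sample_means)
sigma_xbar = pstdev(sample_means)

# Value predicted by the corrected formula, since n/N = 0.6 > 0.05.
predicted = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

print(f"number of samples    = {len(samples)}")
print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"mean of sample means = {mu_xbar:.2f}")
print(f"sd of sample means   = {sigma_xbar:.4f} (predicted {predicted:.4f})")
```

Because the sample here is 60% of the population, the uncorrected value σ/√n would overstate the spread of x̄; the enumeration matches the corrected formula instead.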
The next question is: what is the relationship between the parent distribution and the sampling distribution of x̄?

Sampling distribution of sample mean
Let the distribution of x be normal with mean µ and standard deviation σ. This is equivalent to saying that the parent population is normal with mean µ and standard deviation σ. If we draw a sample of size n from such a population, then
• The mean of x̄, that is $\mu_{\bar{x}}$, is equal to the mean of the population µ.
• The standard deviation of x̄, that is $\sigma_{\bar{x}}$, is equal to σ/√n.
• The shape of the distribution of x̄ is normal, whatever the value of n.

Sampling distribution of sample mean
If X ~ N(µ, σ) then x̄ ~ N(µ, σ/√n), where n is the size of the sample drawn from the population.

Central Limit Theorem
For a large sample size, the sampling distribution of x̄ is approximately normal, irrespective of the shape of the population distribution.
What sample size is considered large? A sample of size ≥ 30 is considered large.

Sampling distribution of sample mean
Assume that the population standard deviation σ is known.
• If the random sample comes from a normal population, the sampling distribution of the sample mean is normal regardless of the size of the sample.
• If the shape of the parent population is not known or not normal, then the distribution of the sample mean is approximately normal whenever n is large (≥ 30). This is the central limit theorem.
• If the shape of the parent population is not known or not normal and the sample size is small, then we cannot readily say anything about the shape of the sampling distribution.

Sampling distribution of sample mean
When the population standard deviation is unknown:
• If the sample size is large, the sampling distribution of the sample mean is still approximately normal.
• If the sample size is small, then

\[ t = \frac{\bar{X} - \mu}{S / \sqrt{n}} \]

is a random variable having a t-distribution with parameter n - 1, where

\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]

About t-distribution
• t is a special continuous distribution.
• It is symmetric about zero.
• It has a bell-shaped curve like the normal distribution.
• Its variance depends on a parameter called the degrees of freedom, which is the only parameter of the t-distribution.
• The variance of t approaches 1 as n → ∞.
• In other words, t approaches Z as n → ∞.
• The t-values are tabulated for different values of the right-tail area and the degrees of freedom.

Sampling distribution of sample mean

          σ known   σ unknown
n ≥ 30    Normal    Approx. normal
n < 30    Normal    t

For the t-distribution: assume that the parent population is approximately normal.
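The central limit theorem and the t statistic can both be illustrated with a short simulation. This is a sketch under stated assumptions rather than anything from the slides: the parent population is taken to be Exponential(1) (clearly non-normal, with µ = 1 and σ = 1), and the seed, the number of replications `reps`, and the function names `simulate_sample_means` and `t_statistic` are all illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, pstdev, stdev


def simulate_sample_means(n, reps, rng):
    """Draw `reps` samples of size n from an Exponential(1) population
    (an assumed, skewed parent) and return the list of sample means."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]


def t_statistic(sample, mu):
    """Compute t = (x-bar - mu) / (S / sqrt(n)), where S uses the n - 1 divisor."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / sqrt(n))


rng = random.Random(0)      # fixed seed so the run is repeatable
mu, sigma = 1.0, 1.0        # mean and sd of the Exponential(1) parent
n, reps = 30, 5000          # "large" sample size by the n >= 30 rule

means = simulate_sample_means(n, reps, rng)
se = sigma / sqrt(n)        # theoretical sd of the sample mean

# If x-bar is approximately normal, about 95% of the sample means
# should fall within 1.96 standard errors of mu.
within = sum(abs(m - mu) <= 1.96 * se for m in means) / reps

print(f"mean of sample means = {mean(means):.3f} (theory {mu:.3f})")
print(f"sd of sample means   = {pstdev(means):.3f} (theory {se:.3f})")
print(f"share within 1.96 se = {within:.3f}")

# t statistic for one small sample from the quiz example (students A, B, C).
print(f"t for sample 2,3,4 with mu = 3.6: {t_statistic([2, 3, 4], 3.6):.3f}")
```

Lowering `n` to, say, 5 in this sketch makes the skewness of the parent population visible again in the sample means, which is the small-sample case where the slides say nothing can readily be concluded about the shape.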