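Math 4030 – 8a
Population and Sample; Sample Mean Distribution
4/29/2017

Sample in terms of iid RVs (Sec. 6.1):
• Population: a random variable X with a certain distribution (discrete or continuous).
• Sample (of size n): n independent random variables that have the same (identical) distribution as X.
A random sample of size n can be viewed as an n-dimensional random vector whose components are independent and identically distributed; their common distribution is called the population or underlying distribution.

Parameter vs. Statistic
• Parameters are numbers that summarize data for an entire population.
• Statistics are numbers that summarize data from a sample, i.e. some subset of the entire population.
Sample statistics are random variables, while population parameters are not.

An example of sample means:
Population: S = {0, 1, 2, 3, 3, 5, 6, 9}
Consider random samples of size 3. The following samples are equally likely to be formed (some entries repeat because the population contains two 3s):
{012, 013, 013, 015, 016, 019, 023, 023, 025, 026, 029, 033, 035, 036, 039, 035, 036, 039, 056, 059, 069;
123, 123, 125, 126, 129, 133, 135, 136, 139, 135, 136, 139, 156, 159, 169;
233, 235, 236, 239, 235, 236, 239, 256, 259, 269;
335, 336, 339, 356, 359, 369;
356, 359, 369;
569}
Number of samples: $\binom{8}{3} = {}_8C_3 = \frac{8 \cdot 7 \cdot 6}{3 \cdot 2 \cdot 1} = 56$.

Sampling distribution of the Mean (Sec. 6.2)
X has any distribution with mean $\mu$ and standard deviation $\sigma$. Let
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
be the sample mean from an (independent) sample of size n. Then
$\mu_{\bar{X}} = E(\bar{X}) = \mu$ and $\sigma_{\bar{X}}^2 = \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$.

If the population is finite (of size N) and the sample is drawn without replacement, the sample variables cannot be independent. However, we still have
$\mu_{\bar{X}} = E(\bar{X}) = \mu$ and $\sigma_{\bar{X}}^2 = \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$,
where $\frac{N - n}{N - 1}$ is the Finite Population Correction Factor.

Chebyshev's Theorem:
$P(|X - \mu_X| \ge k \sigma_X) \le \frac{1}{k^2}$, for all $k \ge 1$.
Apply it to the sample mean of n observations: take $X = \bar{X}_n$, so that $\mu_{\bar{X}_n} = \mu$ and $\sigma_{\bar{X}_n} = \frac{\sigma}{\sqrt{n}}$. Then
$P\left(|\bar{X}_n - \mu| \ge k \frac{\sigma}{\sqrt{n}}\right) \le \frac{1}{k^2}$, for all $k \ge 1$.
Setting $\varepsilon = k \frac{\sigma}{\sqrt{n}}$, i.e. $k = \frac{\varepsilon \sqrt{n}}{\sigma}$, gives
$P(|\bar{X}_n - \mu| \ge \varepsilon) \le \frac{\sigma^2}{n \varepsilon^2}$,
and therefore
$\lim_{n \to \infty} P(|\bar{X}_n - \mu| \ge \varepsilon) = 0$, for any $\varepsilon > 0$.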
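Returning to the population S = {0, 1, 2, 3, 3, 5, 6, 9} from the sample-means example: the finite-population formulas for $E(\bar{X})$ and $\mathrm{Var}(\bar{X})$ can be checked by brute force. The following is a minimal Python sketch, not part of the original slides. It assumes sampling without replacement and treats the two 3s as distinct items; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Population from the slide; the two 3s are distinct items.
population = [0, 1, 2, 3, 3, 5, 6, 9]
N, n = len(population), 3

# Population parameters (variance uses divisor N, not N - 1).
mu = Fraction(sum(population), N)
sigma2 = sum((Fraction(x) - mu) ** 2 for x in population) / N

# All equally likely samples of size 3, drawn without replacement.
samples = list(combinations(population, n))
means = [Fraction(sum(s), n) for s in samples]

# Mean and variance of the sampling distribution of the sample mean.
mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

# Variance predicted by sigma^2 / n times the finite population correction.
fpc_variance = sigma2 / n * Fraction(N - n, N - 1)

print("number of samples:", len(samples), "expected:", comb(N, n))
print("E(X-bar):", mean_of_means, "population mean:", mu)
print("Var(X-bar):", var_of_means, "formula with FPC:", fpc_variance)
```

Exact fractions are used so that the enumerated variance and the formula can be compared for equality rather than within a rounding tolerance.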
Law of Large Numbers (Theorem 6.2):
If $X_1, X_2, \ldots, X_n$ is a sample from a population with (finite) mean $\mu$ and (finite) variance $\sigma^2$, then for any arbitrary (small and) positive number $\varepsilon$,
$\lim_{n \to \infty} P(|\bar{X} - \mu| \ge \varepsilon) = 0$,
where
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
is the sample mean.
Related idea: (long-run) relative frequency and probability.

The Central Limit Theorem (Theorem 6.3)
X has any distribution with mean $\mu$ and standard deviation $\sigma$. Let
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
be the sample mean from an (independent) sample of size n. Then, if n is large, $\bar{X}$ is approximately $N\left(\mu, \frac{\sigma^2}{n}\right)$, or equivalently
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ is approximately $N(0, 1)$.
Furthermore, if the population variance is unknown, we may use the sample standard deviation s, i.e.
$Z = \frac{\bar{X} - \mu}{s / \sqrt{n}}$ is approximately $N(0, 1)$
if n is large ($n \ge 30$).

Sample mean distribution
• If the population is normally distributed with known mean and variance, then the sample mean is normally distributed.
• For a large sample (at least 30), the sample mean is approximately normally distributed (Central Limit Theorem).
• If the population is normally distributed with unknown variance,
$t = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t(n - 1)$,
where S is the sample standard deviation and $t(n - 1)$ is the t-distribution with $n - 1$ degrees of freedom (Sec. 6.3).

Computing with $X \sim t(\nu)$, where $t_\alpha$ denotes the value with upper-tail area $\alpha$:
• Excel: P(X < t) = T.DIST(t, ν, TRUE); $t_\alpha$ = T.INV(1 − α, ν)
• R: P(X < t) = pt(t, ν); $t_\alpha$ = qt(1 − α, ν)
• Table: Table 4

Distribution of Sample Means:
• $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$, or $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$. Use Table 3.
• $t = \frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t(n - 1)$. Use Table 4 for n < 30 and Table 3 for n ≥ 30.

Population and Sample:
• Population: Identify the population according to our research objective(s).
• Draw a sample: ensure the sample preserves the same characteristics as the population.
• Sample: Conduct a survey/experiment to collect data; organize and summarize the data.
• Back to the population: Based on the data analysis results, make inferences about the population (e.g. estimation/prediction).
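The Central Limit Theorem and the n ≥ 30 rule of thumb can be illustrated with a small simulation. The sketch below is not from the slides. It assumes, purely for illustration, an exponential population with rate 1 (so $\mu = 1$ and $\sigma = 1$), a fixed random seed, and 5000 repetitions; all names are hypothetical. It standardizes each sample mean with the known $\sigma$ and checks how often Z falls inside the central 95% region of N(0, 1).

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

# Assumed population for illustration: Exponential(rate=1), mean 1, sd 1.
MU, SIGMA = 1.0, 1.0
N_SAMPLE = 30   # sample size, matching the n >= 30 rule of thumb
REPS = 5000     # number of simulated samples


def simulate_sample_means(n, reps, seed=2017):
    """Return `reps` sample means, each from n iid Exponential(1) draws."""
    rng = random.Random(seed)
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]


means = simulate_sample_means(N_SAMPLE, REPS)

# Standard error of the mean: sigma / sqrt(n).
se = SIGMA / sqrt(N_SAMPLE)

# Standardize each sample mean: Z = (X-bar - mu) / (sigma / sqrt(n)).
z_scores = [(m - MU) / se for m in means]

# Two-sided 95% critical value of the standard normal distribution.
z_crit = NormalDist().inv_cdf(0.975)

# Share of simulated Z values inside (-z_crit, z_crit); about 0.95 if the CLT holds well.
coverage = sum(abs(z) < z_crit for z in z_scores) / REPS

print("mean of sample means:", round(fmean(means), 4), "vs mu =", MU)
print("sd of sample means:", round(stdev(means), 4), "vs sigma/sqrt(n) =", round(se, 4))
print("share of |Z| below", round(z_crit, 3), ":", coverage)
```

Only the standard library is used here, so the normal distribution comes from `statistics.NormalDist`. The t-distribution case would need the Excel or R functions listed above, or a statistics package.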