Statistics of samples
by Gilberto E. Urroz, March 2006

Populations and samples
● POPULATION
   – All individuals
   – All objects
   – All measurements
● Populations, samples and statistical inference

Population size
● Finite population
   – e.g., students in this class
● Infinite population
   – e.g., possible values of a length
● Extremely large populations
   – e.g., the population of the U.S.
   – treat as infinite

Samples
● Should be random or unbiased
   – Each element equally likely to be chosen
● Biased sample = not representative
● Described by sample statistics

Numeric sample
● Example: monthly precipitation data (in)
   [0.05, 0.07, 0.10, 0.12, 0.22, 0.50, ..., 0.10]
● Represent it as the list: [x1, x2, ..., xn]
● n = sample size

Data Reduction
● Extracting sample statistics out of a numerical sample

Sample statistics
● Measures of central tendency
   – Mean, geometric mean, harmonic mean, median, mode(s)
● Measures of spread or variation
   – Mean deviation, median deviation, variance, standard deviation, range, interquartile range
● Coefficient of variation
● Measures that split the data
   – Quartiles, percentiles, deciles
● Moments
   – Skewness, kurtosis

Maple context-menu for statistics
● Click on list, right-click, choose Statistics

Entering data into Maple
● Type the data
● Generate random data
   – LinearAlgebra[RandomMatrix](1,n) or
   – with(Statistics);
   – X := RandomVariable(Normal(μ,σ))
   – Sample(X,n)
● Read data from a file
   – Tools>Assistants>Import Data...
   – Stored as a matrix
   – LinearAlgebra[Column]: extract columns
   – Convert to a list

Measures of central tendency
● Mean
● Quadratic mean
● Geometric mean
● Harmonic mean
● Median
● Mode(s) = value(s) that repeat the most in sample

Deviations from the mean
● Differences between each data value and the mean
● Sum of deviations equals zero
● Uses ...

Measures of spread
● Mean deviation, or mean absolute deviation
● Variance
● Standard deviation, s = square root of variance
● Population standard deviation (for finite populations)

Quartiles, Inter-quartile range
● Q1 = first quartile
   – 25% of data below Q1, 75% above
● Q2 = median (second quartile)
   – 50% of data below Q2, 50% above
● Q3 = third quartile
   – 75% of data below Q3, 25% above
● IQR = Q3-Q1
   – Contains 50% of the data

Coefficient of variation
● Maple definition
● Alternative definition:

Skewness & Kurtosis
● Skewness
● Kurtosis

Five point summary
● Contains (1) Minimum, (2) Lower hinge, (3) Median, (4) Upper hinge, (5) Maximum

Data Summary
● Mean
● Standard deviation
● Skewness
● Kurtosis
● Minimum value
● Maximum value
● Cumulative weight = n (for a sample without weights)
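
Worked example (a sketch, not part of the original slides)

The statistics listed above can be illustrated with a short computation. The sketch below uses Python's standard library rather than Maple, so the function names are Python's and not Maple's. The helper name summarize and the data values are hypothetical, chosen only for illustration. Quartiles follow the default "exclusive" interpolation of statistics.quantiles, and other tools (Maple included) may use a different convention and return slightly different values. The coefficient of variation is taken here as s divided by the mean, which is one common definition and an assumption on my part.

```python
import math
import statistics


def summarize(sample):
    """Return common sample statistics for a list of positive numbers."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Deviations from the mean; their sum should be (numerically) zero.
    deviations = [x - mean for x in sample]
    # Quartiles with Python's default "exclusive" method (an assumption;
    # other software may interpolate differently).
    q1, q2, q3 = statistics.quantiles(sample, n=4)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return {
        "n": n,
        "mean": mean,
        "quadratic_mean": math.sqrt(sum(x * x for x in sample) / n),
        "geometric_mean": statistics.geometric_mean(sample),
        "harmonic_mean": statistics.harmonic_mean(sample),
        "median": statistics.median(sample),
        "modes": statistics.multimode(sample),
        "sum_of_deviations": sum(deviations),
        "mean_abs_deviation": sum(abs(d) for d in deviations) / n,
        "variance": statistics.variance(sample),  # n - 1 divisor
        "std_dev": s,
        "pop_std_dev": statistics.pstdev(sample),  # n divisor
        "range": max(sample) - min(sample),
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "iqr": q3 - q1,
        # Coefficient of variation as s / mean (assumed definition).
        "coeff_of_variation": s / mean,
    }


# Hypothetical data, made up for this illustration only.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
stats = summarize(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The n - 1 divisor in variance and std_dev corresponds to the sample statistics, while pop_std_dev uses the divisor n, matching the population standard deviation for finite populations.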