Statistics of samples
by Gilberto E. Urroz, March 2006

Populations, samples and statistical inference

Populations and samples
● POPULATION
– All individuals
– All objects
– All measurements
Population size
● Finite population
– e.g., students in this class
● Infinite population
– e.g., possible values of a length
● Extremely large populations
– e.g., the population of the U.S.
– treat as infinite
Samples
● Should be random or unbiased
– Each element equally likely to be chosen
● Biased sample = not representative
● Described by sample statistics

Numeric sample
● Example: monthly precipitation data (in)
[0.05, 0.07, 0.10, 0.12, 0.22, 0.50, ..., 0.10]
● Represent it as the list:
[x1, x2, ..., xn]
● n = sample size
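To make the difference between a random sample and a biased sample concrete, here is a minimal sketch in Python (used only for illustration; the slides themselves work in Maple). The population values and the helper name `draw_random_sample` are hypothetical.

```python
import random

def draw_random_sample(population, n, seed=None):
    """Simple random sample without replacement: each element is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical finite population: 100 measurements labelled 1..100
population = list(range(1, 101))

random_sample = draw_random_sample(population, 10, seed=1)
biased_sample = population[:10]  # always the 10 smallest values: not representative

print(len(random_sample))                       # 10, i.e. n = sample size
print(sum(biased_sample) / len(biased_sample))  # 5.5, far below the population mean of 50.5
```

The biased sample always reports a mean of 5.5 regardless of how the population is distributed, which is why it cannot be used to describe the population.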
Sample statistics
● Measures of central tendency
– Mean, geometric mean, harmonic mean, median, mode(s)
● Measures of spread or variation
– Mean deviation, median deviation, variance, standard deviation, range, interquartile range
● Coefficient of variation
● Measures that split the data
– Quartiles, percentiles, deciles
● Moments
– Skewness, kurtosis

Data Reduction
Extracting sample statistics out of a numerical sample
Maple context-menu for statistics
● Click on list, right-click, choose Statistics

Entering data into Maple
● Type the data
● Generate random data
– LinearAlgebra[RandomMatrix](1,n)
or
– with(Statistics);
– X := RandomVariable(Normal(μ,σ))
– Sample(X,n)
● Read data from a file
– Tools>Assistants>Import Data...
– Stored as a matrix
– LinearAlgebra[Column] – extract columns
– Convert to a list
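For readers following along without Maple, the two data-entry routes above (generating normally distributed values, and reading a file then extracting one column as a list) can be sketched in Python. This is an analogy, not a translation of the Maple commands; the helper names and the CSV contents below are hypothetical.

```python
import csv
import io
import random

def normal_sample(mu, sigma, n, seed=None):
    """Draw n values from a normal distribution with mean mu and std dev sigma."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

def read_column(csv_text, column):
    """Parse CSV text into rows (a matrix), then extract one column as a list of floats."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return [float(row[column]) for row in rows]

# Hypothetical file contents: month number, precipitation (in)
csv_text = "1,0.05\n2,0.07\n3,0.10\n"

precip = read_column(csv_text, 1)
x = normal_sample(10.0, 2.0, 5, seed=0)

print(precip)   # [0.05, 0.07, 0.1]
print(len(x))   # 5
```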
Measures of central tendency
● Mean
● Quadratic mean
● Geometric mean
● Median
● Harmonic mean
● Mode(s) = value(s) that repeat the most in sample
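The formulas for these measures did not survive in the transcript, so the sketch below uses the standard textbook definitions, computed with Python's `statistics` module (Python is an illustration choice; the slides use Maple). The sample values and the helper `quadratic_mean` are hypothetical.

```python
import math
import statistics

def quadratic_mean(data):
    """Root mean square: square root of the mean of the squared values."""
    return math.sqrt(sum(x * x for x in data) / len(data))

data = [1, 2, 2, 4, 8]  # hypothetical sample of positive values

mean = statistics.mean(data)             # arithmetic mean: 17/5 = 3.4
gmean = statistics.geometric_mean(data)  # (1*2*2*4*8)**(1/5) = 128**0.2
hmean = statistics.harmonic_mean(data)   # 5 / (1/1 + 1/2 + 1/2 + 1/4 + 1/8)
median = statistics.median(data)         # middle value of the sorted sample: 2
modes = statistics.multimode(data)       # most frequent value(s): [2]
qmean = quadratic_mean(data)             # sqrt(89/5)

print(mean, median, modes)
```

For a sample of positive values the four means always satisfy harmonic ≤ geometric ≤ arithmetic ≤ quadratic, which is a quick sanity check on any implementation.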
Deviations from the mean
● Differences between each data value and the mean
● Sum of deviations equals zero
● Uses

Measures of spread
● Mean Deviation, or mean absolute deviation
● Variance
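A small worked example of deviations, mean absolute deviation and variance, sketched in Python. The slide formulas are missing from the transcript, so the sample-variance form with an n - 1 denominator is an assumption here; the data values are hypothetical.

```python
import statistics

data = [2.0, 4.0, 4.0, 6.0, 9.0]  # hypothetical sample
n = len(data)
mean = sum(data) / n               # 25/5 = 5.0

# Deviations from the mean: each data value minus the mean
deviations = [x - mean for x in data]  # [-3.0, -1.0, -1.0, 1.0, 4.0]
sum_dev = sum(deviations)              # 0.0: the deviations always cancel out

# Mean absolute deviation: average size of the deviations, ignoring sign
mean_abs_dev = sum(abs(d) for d in deviations) / n       # (3+1+1+1+4)/5 = 2.0

# Sample variance (assumed n - 1 denominator): average squared deviation
sample_var = sum(d * d for d in deviations) / (n - 1)    # 28/4 = 7.0

print(sum_dev, mean_abs_dev, sample_var)  # 0.0 2.0 7.0
```

Because the signed deviations sum to zero, they cannot measure spread directly; taking absolute values or squares is what makes the mean deviation and the variance informative.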
Standard deviation
● s = square root of variance
● Population standard deviation (for finite populations)

Coefficient of variation
● Maple definition
● Alternative definition:

Quartiles, Inter-quartile range
● Q1 = first quartile
– 25% of data below Q1, 75% above
● Q2 = median (second quartile)
– 50% of data below Q2, 50% above
● Q3 = third quartile
– 75% of data below Q3, 25% above
● IQR = Q3-Q1
– Contains 50% of the data

Skewness & Kurtosis
● Skewness
● Kurtosis
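The standard deviation, coefficient of variation and quartile definitions above can be illustrated with a short Python sketch. Several conventions are assumed rather than taken from the slides: the coefficient of variation is computed as standard deviation divided by mean, and the quartiles use the default ("exclusive") interpolation of `statistics.quantiles`; Maple's own definitions may differ in detail. The data values are hypothetical.

```python
import statistics

data = [2.0, 4.0, 4.0, 6.0, 9.0]  # hypothetical sample, mean = 5.0

# Sample standard deviation: square root of the sample variance (n - 1 denominator)
s = statistics.stdev(data)        # sqrt(7), about 2.6458

# Population standard deviation: denominator n, for a complete finite population
sigma = statistics.pstdev(data)   # sqrt(5.6), about 2.3664

# Coefficient of variation (assumed definition: s / mean), a unit-free spread measure
cv = s / statistics.mean(data)    # about 0.5292

# Quartiles split the sorted data into four parts; IQR spans the middle half
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 3.0 4.0 7.5 4.5
```

Note that Q2 coincides with the median of the sample, and that different quartile conventions can give slightly different Q1 and Q3 for small samples.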
Five point summary
Contains
(1) Minimum, (2) Lower hinge, (3) Median,
(4) Upper hinge, (5) Maximum
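A minimal Python sketch of a five point summary follows. It assumes Tukey's hinge convention (each hinge is the median of one half of the sorted data, with the overall median included in both halves when n is odd); whether Maple uses exactly this convention is not stated in the slides. The function name and data are hypothetical.

```python
import statistics

def five_point_summary(data):
    """Return (minimum, lower hinge, median, upper hinge, maximum)."""
    xs = sorted(data)
    n = len(xs)
    half = (n + 1) // 2       # for odd n the median belongs to both halves
    lower = xs[:half]         # lower half of the sorted data
    upper = xs[n - half:]     # upper half of the sorted data
    return (xs[0],
            statistics.median(lower),
            statistics.median(xs),
            statistics.median(upper),
            xs[-1])

data = [7, 1, 3, 9, 5, 11, 13]   # hypothetical sample
print(five_point_summary(data))  # (1, 4.0, 7, 10.0, 13)
```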
Data Summary
● Mean
● Standard deviation
● Skewness
● Kurtosis
● Minimum value
● Maximum value
● Cumulative weight = n (for a sample without weights)
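The seven quantities of the data summary can be computed together, as in the Python sketch below. The skewness and kurtosis formulas are assumptions: they use the plain moment ratios m3/m2^(3/2) and m4/m2^2 (non-excess kurtosis, so a normal distribution gives about 3), and Maple's exact normalization may differ. The function name and dictionary keys are hypothetical.

```python
import math

def data_summary(data):
    """Summary statistics for an unweighted numeric sample."""
    n = len(data)
    mean = sum(data) / n
    # Central moments of order 2, 3 and 4 (denominator n)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "standard_deviation": math.sqrt(m2 * n / (n - 1)),  # sample form, n - 1
        "skewness": m3 / m2 ** 1.5,   # 0 for a symmetric sample
        "kurtosis": m4 / m2 ** 2,     # non-excess kurtosis
        "minimum": min(data),
        "maximum": max(data),
        "cumulative_weight": n,       # equals n when every value has weight 1
    }

summary = data_summary([1.0, 2.0, 3.0, 4.0, 5.0])
print(summary["mean"], summary["skewness"], summary["cumulative_weight"])  # 3.0 0.0 5
```

For this symmetric sample the skewness is zero, the kurtosis is 6.8 / 4 = 1.7, and the standard deviation is the square root of 2.5.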