```
Sampling

‘Scientific sampling’ is random sampling
  Simple random samples
  Systematic random samples
  Stratified random samples
  Random cluster samples
What?
Why?
How?
What is random sampling?

Simple random sample – Every sample with the same number of observations has the same probability of being chosen
Systematic random sample – Choose the first sample member randomly, then take every k-th member after it
Stratified random sample – Choose simple random samples from each of the mutually exclusive strata of a population
Cluster sample – Choose a simple random sample of groups or clusters
Why sample randomly?

To make valid statistical inferences about a population
Conclusions from a non-probability sample can be questioned
Conclusions from a self-selected sample are SLOP
How can samples be randomly chosen?

Random number generators (software)
Ping pong balls in a hopper
Other mechanical devices
Random number tables
Slips of paper in a ‘hat’
With or without replacement

Descriptive Statistics – Graphic Guidelines

Pie charts – categorical variables, nominal data, e.g. ‘religion’
Bar charts – categorical or numerical variables, nominal or interval data, e.g. ‘religion’ or ‘margin debt’; time series or cross-sectional data
Line graphs – numerical variables, interval data, e.g. margin debt; time series data
Histograms – numerical variables, interval data, e.g. golf scores; cross-sectional data; depicts the SHAPE of a frequency distribution
  Stem and leaf plot – quick and dirty histogram
  Ogive – depicts a cumulative percentage frequency distribution
Scattergram – two quantitative variables, e.g. margin vs. the market value
Graphic Deception – some widely used methods

Graphs without a scale on one axis
Captions or titles intended to influence
Reporting only absolute changes in value and not percentage changes
Changing the scale of the vertical axis with breaks or truncations
Changing the scale of the horizontal axis
Changing the width as well as the height of bars or pictogram figures
Summary of data types and available graphic techniques

Numeric
  Cross-sectional data: histograms, percentage histograms, ogives, stem and leaf plots, box plots
  Time-series data: line charts, bar charts
Nominal
  Pie charts
  Bar charts
  Complex pie or bar charts
Describing the frequency distribution
for numerical, cross sectional data



Shape
Center
Describing distributions

SHAPE

Graphs






Histograms
Percentage histograms
Ogives
Stem and leaf plots
Box plots
Words

Symmetric, skewed, bell shaped, flat, peaked
Descriptive Statistics –

CENTER

Quantitative measures




Mean
Median
Mode
Mid-point of the range
Descriptive Statistics –

Numeric Measures – cont’d.



Range
Symmetric distributions



Standard deviation
Variance
Skewed distributions





Quartiles
Min
Max
Interquartile range
Percentiles
Z scores and t scores

Measure distance from the mean in standard deviations
E.g. T score for bone density: 1 to 2.5 standard deviations below the norm (mean) for a 23-year-old indicates osteopenia; 2.5 or more below indicates osteoporosis

(X – μ)/σ = z score
(X – X̄)/s = t score
Empirical Rule

For mound-shaped distributions

About 68% of observations are within one standard deviation of the mean
About 95% of observations are within two standard deviations of the mean
Almost all (99.7%) observations are within three standard deviations of the mean
Chebyshev’s Rule

For all distributions

Let k be greater than or equal to 1
At least 1 – 1/k² of the observations are within k standard deviations of the mean
Examples
k = 1: the bound is zero, so possibly no observations are within one standard deviation of the mean
k = 2: at least 3/4 of observations must be within two standard deviations of the mean
k = 3: at least 8/9 of observations must be within three standard deviations of the mean
```
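The four random sampling designs named in the slides can be sketched with a software random number generator. This is a minimal illustration, not part of the original deck: the population of 100 ID numbers, the two strata, the ten clusters, and all function names are hypothetical choices made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Every subset of size n is equally likely (drawn without replacement)."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Pick a random start among the first `step` members, then every step-th member."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from each mutually exclusive stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Draw a simple random sample of whole clusters and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(42)  # fixed seed (an assumption) so the sketch is reproducible

# Hypothetical population: ID numbers 1..100.
population = list(range(1, 101))
# Hypothetical strata: two non-overlapping halves of the population.
strata = {"low": population[:50], "high": population[50:]}
# Hypothetical clusters: ten blocks of ten consecutive IDs.
clusters = {f"block{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)

# Sampling WITH replacement: the same member may be drawn more than once.
with_replacement = rng.choices(population, k=10)
```

`rng.sample` draws without replacement and `rng.choices` draws with replacement, which mirrors the "with or without replacement" distinction on the slide.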
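The Empirical Rule and Chebyshev's Rule from the final slides can be compared numerically. The sketch below is an illustration under stated assumptions rather than anything from the original deck: the two simulated data sets (one mound-shaped, one right-skewed), the seed, and the function names are all hypothetical. It standardizes each observation with the sample mean and sample standard deviation, as in the slide's (X – X̄)/s formula, and counts the fraction of observations within k standard deviations.

```python
import random
import statistics

def standardize(data):
    """Distance of each observation from the sample mean, in sample standard deviations."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation s
    return [(x - mean) / sd for x in data]

def fraction_within(data, k):
    """Observed fraction of observations within k standard deviations of the mean."""
    scores = standardize(data)
    return sum(abs(z) <= k for z in scores) / len(scores)

def chebyshev_bound(k):
    """Chebyshev's lower bound 1 - 1/k^2, valid for any distribution when k >= 1."""
    if k < 1:
        raise ValueError("k must be greater than or equal to 1")
    return 1 - 1 / k**2

rng = random.Random(0)  # fixed seed (an assumption) for reproducibility

# Hypothetical data: a mound-shaped sample and a strongly right-skewed sample.
mound = [rng.gauss(50, 10) for _ in range(10_000)]
skewed = [rng.expovariate(1.0) for _ in range(10_000)]

for k in (1, 2, 3):
    print(f"k={k}: Chebyshev bound {chebyshev_bound(k):.3f}, "
          f"mound-shaped {fraction_within(mound, k):.3f}, "
          f"skewed {fraction_within(skewed, k):.3f}")
```

For the mound-shaped data the observed fractions should land near the Empirical Rule's 68%, 95%, and 99.7%. For the skewed data they need not, but they can never fall below the Chebyshev bound.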