Transcript
```
Fitting Models by Judgment
Least Squares Formulations

Statistics can fit a variety of mathematical
models to data by minimizing the sum of the
squared departures of individual data
points from the model.


This is a mathematical technique.
The distribution or other mathematical model
is being fit to actual data or an actual situation.
Sometimes your knowledge of the subject is
as powerful a tool as least squares.
Help Available

Histograms and basic statistics give us some
great initial ideas.
But histograms can have some wild interval jumps
even for a perfectly normal distribution.

Cumulative Probability Plots
Computers can show them, but you can also take
control and plot them on paper.
Special types of paper linearize data so that
you can read key model parameters off of the
slope and intercept.
The Normal Probability Plot
We have seen that
cumulative probability plots
come out as a straight line on normal probability paper.
X axis: put an appropriate linear scale for your data here.
Y axis: plot the cumulative percentage of your
sample data here.
Example: with 10 samples
Sort the samples from smallest to largest.
Plot the first sample at its
X axis value with its Y
axis value at 10%.
The second sample would
plot at 20%.
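A minimal worked sketch of the rule above, assuming the samples are sorted from
smallest to largest and that sample i of n plots at 100 * i / n percent (the
general formula is an assumption inferred from the 10% and 20% values):
  i:        1    2    3    4    5    6    7    8    9    10
  percent:  10   20   30   40   50   60   70   80   90   100
Under this rule the largest sample lands at 100%, which lies at infinity on a
normal probability scale and cannot be plotted. A commonly used alternative,
not taken from these slides, is 100 * (i - 0.5) / n, giving 5%, 15%, ..., 95%.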
What Can Happen to Results
It might plot as a straight line (if it is
perfectly normal).
Whoopee, I got it!
Why Graph Statistics That
Computers Spit Out Instantly?
Because a lot of almost normal distributions
are not normal.
A common defect is that much engineering
data has a definite range over which it can
physically occur, while the normal model goes
from minus to plus infinity.
This constriction of the outer-limit data causes
droopy tails on the graph.
The slope determines the standard deviation.
Simple mathematical fitting can over-emphasize
the tails.
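A sketch of how a fitted line can be read, assuming the data are on the X axis
and the cumulative percentage is on the normal probability Y axis (these
readings follow from the normal model itself, not from the slides):
  mean                ~ X value where the line crosses 50%
  standard deviation  ~ (X at 84% - X at 16%) / 2
For example, if a hypothetical line crosses 16% at X = 8, 50% at X = 10, and
84% at X = 12, the mean is about 10 and the standard deviation is about
(12 - 8) / 2 = 2. In this layout a steeper line means a smaller standard
deviation.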
Heavy Tails
It is common for physical situations to
deviate from the model in the tail areas.
This is one of the reasons I cautioned
against doing 99.99% confidence
intervals: those extreme values
are likely to give you garbage.
I have found normal models useful
if only my outer 5% or less of the tails
bent off in unwanted directions.
(But this also gives you a clue how
far you can go with a confidence
interval.)
By setting your own model slope you can preferentially fit your model to the central
data you will be more likely to use.
The Lognormal Distribution
Lognormal distributions can be normalized
by taking the log of the data
(but the transform can have issues in
dealing with real-world data).
Lognormal
Probability Paper Can Allow