Download Normal Probability Plot

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Probability wikipedia , lookup

History of statistics wikipedia , lookup

Statistics wikipedia , lookup

Transcript
STATGRAPHICS – Rev. 11/7/2013
Normal Probability Plot
Summary ......................................................................................................................................... 1
Data Input........................................................................................................................................ 2
Analysis Summary .......................................................................................................................... 2
Normal Probability Plot .................................................................................................................. 3
Summary
The Normal Probability Plot is used to help judge whether or not a sample of numeric data
comes from a normal distribution. If it does not, you can often determine the type of departure
from normality by examining the way in which the data deviate from the normal reference line.
Sample StatFolio: probplot.sgp
Sample Data
The file bottles.sgd contains the measured bursting strength of n = 100 glass bottles, similar to a
dataset contained in Montgomery (2005). The table below shows a partial list of the data from
that file:
strength
255
232
282
260
255
233
240
255
254
259
235
262
 2013 by StatPoint Technologies, Inc.
Normal Probability Plot - 1
STATGRAPHICS – Rev. 11/7/2013
Data Input
The data to be analyzed consist of a single numeric column containing n = 2 or more
observations.

Data: numeric column containing the data to be summarized.

Select: subset selection.
Analysis Summary
The Analysis Summary shows the number of observations in the data column.
Probability Plot - strength
Data variable: strength
100 values ranging from 225.0 to 282.0
Also displayed are the largest and smallest values.
 2013 by StatPoint Technologies, Inc.
Normal Probability Plot - 2
STATGRAPHICS – Rev. 11/7/2013
Normal Probability Plot
This pane displays the probability plot.
Normal Probability Plot
99.9
n:100
Median:255.0
Sigma:8.14815
W:0.97781
P:0.4162
99
percentage
95
80
50
20
5
1
0.1
220
240
260
strength
280
300
The plot is constructed in the following manner:

The data are sorted from smallest to largest and the order statistics are determined. By
definition, the j-th order statistic is the j-th smallest observation in the sample, denoted by
x(j).

The data are then plotted at the positions

 j  0.375  
 x( j ) ,  1 
 
 n  0.25  

(1)
where  1 (u ) indicates the inverse standard normal distribution evaluated at u.

If desired, a straight line is fit to the data and added to the plot.
The normal probability plot is created in such a way that, if the data are random samples from a
normal distribution, they should lie approximately along a straight line. In the above plot, the
deviation of the values from the reference line at both ends indicates that the data may come
from a distribution with relatively longer tails than a normal distribution.
 2013 by StatPoint Technologies, Inc.
Normal Probability Plot - 3
STATGRAPHICS – Rev. 11/7/2013
Pane Options

Direction: the orientation of the plot. If vertical, the Percentage is displayed on the vertical
axis. If horizontal, Percentage is displayed on the horizontal axis.

Fitted Line: the method used to fit the reference line to the data. If Using Quartiles, the line
passes through the median when Percentage equals 50 with a slope determined from the
interquartile range. If Using Least Squares, the line is fit by least squares regression of the
normal quantiles on the observed order statistics. If Using Mean and Sigma, the line is
determined from the mean and standard deviation of the n observations. The method based
on quartiles puts more weight on the shape of the data near the center and is often able to
show deviations from normality in the tails that would not be evident using the other
methods.

Include - Percentile: displays a reference line at the specified percentile. The percentile is
estimated using the fitted line.

Include – Confidence Limits: displays confidence limits around the fitted line (only
available when estimating the line Using mean and sigma). The confidence level applies to
each percentile separately.
 2013 by StatPoint Technologies, Inc.
Normal Probability Plot - 4

STATGRAPHICS – Rev. 11/7/2013
Include – Shapiro-Wilk Test: displays the calculated value of the Shapiro-Wilk W test and
its associated P value. This test is described in the document titled Distribution Fitting –
Uncensored Data. A small P-value indicates that the data are not well modeled by a normal
distribution.
The Direction and Fitted Line defaults are determined from the settings on the EDA tab of the
Preferences dialog box on the Edit menu.
Example – Using Mean and Sigma with Confidence Limits and a Selected Percentile
The plot below displays the estimated line based on the sample mean and standard deviation, the
estimated 90th percentile, and the 95% confidence limits.
Normal Probability Plot
with 95% limits
99.9
n:100
Mean:254.64
Sigma:10.6823
W:0.97781
P:0.4162
99
percentage
95
90.0
80
50
20
5
1
268.33 [265.747,271.493]
0.1
220
240
260
strength
280
300
Using the Mean and Sigma method on the sample data produces a more “normal” looking plot,
since the line is forced to fit both the core and the tails.
 2013 by StatPoint Technologies, Inc.
Normal Probability Plot - 5