The Determination of Normal Ranges from Routine Laboratory Data
George J. Neumann
A method is described which is potentially capable of closely estimating the normal
range from laboratory data. The estimation is made on probability paper using a
purposely truncated form of the “normal” distribution. A fictitious set of data has
been used to illustrate the efficiency of estimation of normals. The method has been
used to estimate the normal range of blood urea.
The need for a simple and reliable method for determining normal ranges is widely recognized. A number of methods have been suggested for this purpose (1-5); however, to date none has been shown to be completely satisfactory when treating data from a heterogeneous population such as is obtained in the hospital laboratory as daily routine procedure. This paper discusses a method which has the potential of treating such data, and describes a modification which will improve the results of the procedure.
Probit Analysis
The method of Hoffman (1) for deriving normal ranges from laboratory data is a simplified form of a more mathematic treatment called probit analysis. In his book, Finney (6) gives a short history of the probit method, which dates back to a suggestion by Fechner in 1860, progresses to the normal equivalent deviation (N.E.D.) of Gaddum in 1933, and Fisher's maximum likelihood analysis in 1935. Although probit analysis has had many years in which to be used and developed, relatively little has been done in connection with heterogeneous distributions.
From the Laboratory, Ellis Hospital, Schenectady, N. Y. 12308.
Received for publication Sept. 27, 1967; accepted for publication Feb. 16, 1968.

The probit is a unit defined in terms of the standardized (normal) Gaussian distribution curve. The equation for the Gaussian distribution is:

$$p = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2} \tag{1}$$

where $\mu$ is the mean and $\sigma$ the standard deviation. If a new unit, $u$, is defined as:

$$u = \frac{x-\mu}{\sigma} \tag{2}$$

and substituted into Equation 1, the result is:

$$p = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2} \tag{3}$$

which is the equation for the standardized normal distribution. The mean and standard deviation of the standardized distribution are 0 and 1, respectively. The corresponding equation for the cumulative distribution is then:

$$P = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} e^{-t^2/2}\, dt \tag{4}$$

It is readily seen that the parameter $u$ is the N.E.D. of Gaddum. Since it is more convenient to work with positive numbers, the probit is defined as:

$$\text{Probit} = u + 5 \tag{5}$$
When probits are plotted against values for a cumulative normal distribution, the result is a straight line. The effect of the probit transformation is shown in Fig. 1. Since the 0 and 100% probabilities correspond to -∞ and +∞, respectively, these points do not normally fall on the probit line in practical plotting.

It should be pointed out that in probit plotting the mean occurs at a probit of 5 (50%), and the slope of the line is the reciprocal of the standard deviation. This will be true only if the values are normally distributed or can be transformed to a normally distributed function.

The use of probits is of definite advantage when one needs mathematic accuracy, since the units may be treated as relating to a straight line. However, the cost in time and complexity seems not worth the investment unless computers are available. Therefore, the mathematic methods of probit analysis will be passed over in favor of the graphical estimates which are more economical of time. Those interested in the probit calculations are referred to Finney's book (6).
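The two properties of the probit line stated above (mean at a probit of 5, slope equal to the reciprocal of the standard deviation) can be checked numerically. The following is only a sketch: the population parameters (mean 28, standard deviation 7) and the helper name `probit` are illustrative assumptions, not values taken from the paper.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit(p):
    """Return the probit (N.E.D. + 5) of a cumulative probability 0 < p < 1."""
    return STANDARD_NORMAL.inv_cdf(p) + 5.0


# Hypothetical Gaussian population, chosen only for illustration.
mean, sd = 28.0, 7.0
population = NormalDist(mean, sd)

values = [14.0, 21.0, 28.0, 35.0, 42.0]
probits = [probit(population.cdf(v)) for v in values]

# Slope between successive points of the probit plot.
slopes = [
    (probits[i + 1] - probits[i]) / (values[i + 1] - values[i])
    for i in range(len(values) - 1)
]

for v, y in zip(values, probits):
    print(f"value {v:5.1f}  probit {y:.3f}")
print("slopes:", [round(s, 4) for s in slopes], " 1/sd:", round(1 / sd, 4))
```

Every slope equals 1/sd to within rounding, and the value equal to the mean maps to a probit of 5.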
Clinical Chemistry, Vol. 14, No. 10, 1968
Estimations from Probability Paper
Probability paper (No. 468000)* is graduated according to a Gaussian probability distribution in such a way that percentages may be plotted as their corresponding normal deviates in much the same way as one uses semilogarithmic graph paper to plot numbers as their corresponding logarithms. When the cumulative percentage of a Gaussian distribution is plotted against value, a straight line will result. The use of probability paper thus avoids the need for tables of probit transformation. Of course, the use of the paper makes it impossible to derive straight-line statistics without returning to the use of probits.

* Keuffel and Esser Co., Cleveland, Ohio.

Fig. 1. Effect of probit transformation. See text for details.

To use probability paper, it is necessary to list the possible values or groups of values in consecutive order. It makes no difference whether the order is increasing or decreasing, but it seems conventional to list in order of increasing value. The frequency of occurrence is then noted for each value, as well as the cumulative frequency, and the cumulative percentage is calculated for each value. Table 1 demonstrates the procedure for data representing a fictitious, nonhomogeneous population.
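The tabulation just described (frequency, cumulative frequency, cumulative percentage) can be sketched in a few lines. The frequencies below are obtained by differencing the combined cumulative incidences of Table 1. The total of 20,000 is an assumption: it is implied by the printed percentages but not stated in the text. The function name is hypothetical.

```python
from itertools import accumulate


def cumulative_percentages(frequencies, total):
    """Return (cumulative frequencies, cumulative percentages of total)."""
    cumulative = list(accumulate(frequencies))
    percent = [100.0 * c / total for c in cumulative]
    return cumulative, percent


values = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# Differences of the combined cumulative incidence column of Table 1.
frequencies = [77, 1378, 5145, 4750, 3034, 2816, 1850, 750, 172]
TOTAL = 20000  # assumption: implied by the printed percentages

cumulative, percent = cumulative_percentages(frequencies, TOTAL)
for v, c, p in zip(values, cumulative, percent):
    print(f"{v:3d} {c:6d} {p:6.2f}")
```

The percentages produced this way are the ones that would be plotted against value on probability paper.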
The cumulative percentage is then plotted against the value on "normal" probability paper (Fig. 2). Note that the composite curve obtained in Fig. 2 is not a straight line due to the fact that the data are derived from two overlapping Gaussian distributions. The exact form of the curve will depend upon (1) the distance between the means, (2) the standard deviation of each distribution, and (3) the relative proportions of the distributions.

If the normal range of Curve B is evaluated by the method of Hoffman (1), extending the best straight line to intersect the 5 and 95% points (90% limits, as suggested by Hoffman), we would obtain 18.5-49.0, as compared to the original 13.5-42.0. If, on the other hand, one were to use the limits of the straight-line portion of this curve as suggested by Waid (2), the lower limit of the range would be equal to or less than 10, and the upper limit equal to 30 or perhaps 35, depending on exactly where the straight line is terminated.

Thus, it can be seen that these methods are at best of limited usefulness as they stand. If it were possible, however, to dissect the mixed distribution in such a way as to restore the original plots, the method should be more accurate and more applicable. Hoffman (1) alludes to such a technic in Hald (7). Dissection by this technic involves fitting a parabolic curve to the logarithmic form of the equation for the Gaussian curve. There are two objections to this: (1) the arithmetic is cumbersome, and (2) the technic is useful only when one side of one of the distributions is essentially unaffected by the presence of the other. It would be much more desirable to have a simpler technic, preferably one involving no more cumbersome arithmetic than that encountered with the fitting of a straight line, and one which is independent of the degree of overlapping of the constituent populations. Such a technic is implied by Hald's discussion of the truncated normal distribution (7). The truncated distribution is a normally distributed population which has been cut off at some point so that the sample is an incomplete population.

Table 1. Data for Two Arbitrary Populations (B and C) and Their Combined Frequency

        Cumulative incidence   Cumulative                Calculated cumulative
                               combined     Cumulative   incidence
Value   B        C             incidence    %            B        C
10      68       9             77           0.4          70       7
20      1370     85            1455         7.3          1460     0
30      6190     500           6600         33.0         6600     0
40      9500     1850          11350        56.7         9680     1670
50      9984     4400          14384        71.9         9994     4354
60      -        7200          17200        86.0         -        7200
70      -        9050          19050        95.2         -        9050
80      -        9800          19800        99.0         -        9500
90      -        9972          19972        99.86        -        9972
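The curvature of the composite plot can be reproduced numerically. The sketch below assumes that Curves B and C are Gaussian with the "Original" 95% limits listed in Table 3 (13.5-42.5 and 26.0-79.0), that the mean lies midway between the limits, that the standard deviation is one quarter of their difference, and that the two populations are mixed in equal proportions. These parameters are inferred for illustration and are not stated explicitly in the text.

```python
from statistics import NormalDist


def from_95_limits(lower, upper):
    """Gaussian with mean at the midpoint and sd = (upper - lower) / 4."""
    return NormalDist((lower + upper) / 2.0, (upper - lower) / 4.0)


# Assumed parameters, taken from the "Original" row of Table 3.
curve_b = from_95_limits(13.5, 42.5)
curve_c = from_95_limits(26.0, 79.0)
standard = NormalDist()

values = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# Equal-proportion mixture of the two cumulative distributions.
mixture = [0.5 * (curve_b.cdf(v) + curve_c.cdf(v)) for v in values]
# Normal deviates: the vertical scale of probability paper.
deviates = [standard.inv_cdf(p) for p in mixture]
slopes = [
    (deviates[i + 1] - deviates[i]) / (values[i + 1] - values[i])
    for i in range(len(values) - 1)
]

for v, p in zip(values, mixture):
    print(f"{v:3d}  {100 * p:6.2f}%")
print("slopes:", [round(s, 3) for s in slopes])
```

For a single Gaussian population the slopes would all be equal; here they vary along the curve, which is the departure from a straight line seen in the composite plot.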
Fig. 2. Probability plot of fictitious nonhomogeneous distribution (see Table 1). Composite curve, unbroken line; Curve B, open circles; Curve C, half-closed circles; line extrapolated according to Hoffman (1), closed circles. Horizontal axis: units.
The effect of truncation on the probability plot is shown in Fig. 3. It can be seen that the curve of the truncated distribution asymptotically approaches the value at which the distribution is truncated. The curve is derived from the values of Column 4 in Table 2, where the point of truncation is 30 units. An estimate of the degree of truncation is obtained by extending the best straight line to this value and reading the percent represented by the point of intersection (Point A, Fig. 3). The calculation to this point is the first cycle referred to below. This estimate can then be used to reconstruct the original distribution as described below.

Fig. 3. Dissection of overlapping populations. First cycle (truncated distribution), open circles; second cycle, half-closed circles, vertically split; fourth cycle (reconstructed Curve B), closed circles; reconstructed Curve C, half-closed circles, horizontally split. Horizontal axis: units.

Table 2. Dissection of Overlapping Populations

        Cumulative   Cumulative   Truncated cumulative %, by cycle
Value   incidence    %            1       2       3       4
10      77           0.4          1.16    0.9     0.8     0.7
20      1455         7.3          22.0    17.0    15.4    14.6
30      6600         33.0         100     77.0    70.0    66.0
40      11350        56.7         -       -       -       -
50      14384        71.9         -       -       -       -
60      17200        86.0         -       -       -       -
70      19050        95.2         -       -       -       -
80      19800        99.0         -       -       -       -
90      19972        99.86        -       -       -       -
Method
It is not known at this point how many values must be used to obtain statistically significant results. It is suggested that not less than 200 values be collected, and more would be preferable. The values are sorted as described above and the cumulative percentages obtained, including all values. The results are then plotted on probability paper which will yield a curve similar to that represented by the unbroken line in Fig. 2. A straight line is fitted by eye through those points obviously representing a straight line (closed circles, Fig. 2).

One then selects the values closest to the limits of the straight line as the points of truncation. In the present instance, there is only one point of truncation at 30 units. The cumulative percent for each value between the points of truncation is recalculated using the total cumulative frequency of the included values as the total frequency of the sample (Table 2, first four columns). The resulting data are plotted on probability paper yielding a truncated curve (open circles, Fig. 3). A straight line is visually fitted to these points and extended to the point of truncation (Point A, Fig. 3). The percentage read from the point of intersection can then be used to calculate an estimate of the total number of observations (N) in this sample from this population. In the present example the calculation is as follows:

$$N = \frac{6600}{0.77} \approx 8570$$
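The arithmetic of one cycle can be sketched as follows. The percentage at Point A is read from a line fitted by eye, so it is supplied here as an input rather than computed. The value 77% is taken from the second-cycle column of Table 2, and the function name is a hypothetical one chosen for illustration.

```python
def recalculate_cycle(cumulative_incidence, percent_at_truncation):
    """Return (estimated N, new cumulative percentages) for the next cycle.

    cumulative_incidence: cumulative counts up to and including the point
        of truncation (the last entry is the count at that point).
    percent_at_truncation: percentage read where the fitted straight line
        meets the point of truncation (Point A in Fig. 3).
    """
    count_at_truncation = cumulative_incidence[-1]
    n_estimate = count_at_truncation / (percent_at_truncation / 100.0)
    new_percent = [100.0 * c / n_estimate for c in cumulative_incidence]
    return n_estimate, new_percent


# Cumulative incidences at 10, 20 and 30 units (Table 2).
counts = [77, 1455, 6600]
n_est, pct = recalculate_cycle(counts, 77.0)
print(round(n_est), [round(p, 1) for p in pct])
```

With 77% read at Point A, the recalculated percentages round to 0.9, 17.0 and 77.0, matching the second-cycle column of Table 2.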
The figure thus derived can then be used in a second cycle of calculations to calculate new percentages and to plot and make a new estimate of N. Should the point of truncation occur at the lower values of the distribution, the difference between the observed and the calculated values of N is added to the incidence of each value before calculating new percentages.

The procedure as described for the first cycle can be applied as often as necessary to obtain a sufficiently straight line. I have arbitrarily chosen to repeat until the limiting value deviates less than 2 percentage units from the value obtained. When a sufficiently straight line has been obtained the values corresponding to 2.5 and 97.5% are read from the graph. These are the 95% limits of the normal range; the mean is read at the 50% point on the drawn line. The standard deviation may be estimated by subtracting the lower limit of normal from the upper limit of normal and dividing by four.
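Reading the limits from the final straight line can also be expressed numerically. The sketch below assumes the line has already been expressed in probit units as probit = intercept + slope × value. The slope and intercept shown are hypothetical (they correspond to a mean of 28 and a standard deviation of 7) and are not results from the paper.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def value_at_percent(percent, slope, intercept):
    """Value at which the line probit = intercept + slope * value reaches percent."""
    probit = STANDARD_NORMAL.inv_cdf(percent / 100.0) + 5.0
    return (probit - intercept) / slope


def normal_range(slope, intercept):
    """Return (lower limit, mean, upper limit, estimated sd) from the line."""
    lower = value_at_percent(2.5, slope, intercept)
    mean = value_at_percent(50.0, slope, intercept)
    upper = value_at_percent(97.5, slope, intercept)
    sd_estimate = (upper - lower) / 4.0  # the paper's divide-by-four rule
    return lower, mean, upper, sd_estimate


# Hypothetical fitted line: slope 1/7, probit of 5 at a value of 28.
slope, intercept = 1.0 / 7.0, 1.0
lower, mean, upper, sd_estimate = normal_range(slope, intercept)
print(f"95% limits {lower:.1f}-{upper:.1f}, mean {mean:.1f}, sd {sd_estimate:.2f}")
```

Dividing by four approximates the exact factor of 2 × 1.96 = 3.92, so the estimated standard deviation runs about 2% below the reciprocal of the slope.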
The application of this method to the data of Table 1 is shown in Table 2 and Fig. 3. The last two columns of Table 1 show the calculated incidence of each value for the two distributions. The reconstructed curves agree favorably with the originals. The results of the application are compared in Table 3 with the original, Hoffman's interpretation, and Waid's interpretation.
Application of Method to Blood Urea
The method has been applied to a series of 626 urea values obtained by the routine AutoAnalyzer method. Values were grouped using multiples of five as midpoints, with a class interval of 5 units. This grouping was chosen on the basis of the standard deviation of the analysis. The original plot and the reconstructed normal curve are shown in Fig. 4.

The normal range as calculated from the data is 8-20 mg./100 ml. These values agree well with those quoted by Henry (8) for the overall population. Preliminary data on differences between age groups and sexes tend to confirm previous data quoted by Henry.
Table 3. Comparison of "Normal" Values

Method         Curve B       Curve C
Original       13.5-42.5     26.0-79.0
Hoffman (1)    18.5-49.0     17.5-69.0
Waid (2)       10.0*-30.0    60.0-90.0
Present        13.5-41.0     26.0-79.0

* The lower limit can be shown to be less than 5 if more complete data are used (see Fig. 2).
Fig. 4. Determination of normal values: urea. Original plot, open circles; reconstructed normal distribution, closed circles. Horizontal axis: urea (mg/100 ml).
Summary
A method has been described which is potentially capable of closely estimating the normal range from laboratory data. The estimation is made on probability paper using a purposely truncated form of the normal (Gaussian) distribution. The objections to previously published methods have been overcome, and no more complicated calculations than the calculation of percentages are necessary. A fictitious set of data has been used to illustrate the efficiency of estimation of normals. The method has also been used to estimate the normal range of blood urea. The method is applicable only when the distribution is Gaussian or can be transformed to a Gaussian distribution (e.g., lognormal).
References
1. Hoffman, R. G., Statistics in the practice of medicine. J. Am. Med. Assoc. 185, 864 (1963).
2. Waid, M., Quoted by Sparapani, A., and Berry, R. E., The range of normal values in the quality control of clinical chemistry. Am. J. Clin. Pathol. 42, 133 (1964).
3. Herrera, L., The precision of percentiles in establishing normal limits in medicine. J. Lab. Clin. Med. 52, 34 (1958).
4. Henry, R. J., Clinical Chemistry: Principles and Technics. Hoeber, New York, 1964, p. 147.
5. Henry, R. J., and Dryer, R. L., Standard Methods of Clinical Chemistry (Vol. 4). Acad. Press, New York, 1963, p. 205.
6. Finney, D. J., Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve (ed. 2). Cambridge Univ. Press, Cambridge, England, 1962.
7. Hald, A., Statistical Theory with Engineering Applications. Wiley, New York, 1962.
8. Henry, R. J., Clinical Chemistry: Principles and Technics. Hoeber, New York, 1964, p. 275.