The Statistics of a Function
Sheldon P. Gordon
Farmingdale State College of New York
Farmingdale, NY, 11735
[email protected]
Florence S. Gordon
New York Institute of Technology
Old Westbury, NY 11768
[email protected]
Mailing address:
61 Cedar Road
East Northport, NY 11731
One of the most important interpretations of the definite integral in a modern
calculus course is the fact that it gives the mean value of a function on an interval. Thus,
if a function f is defined on an interval [a, b], then the mean, or average value, of f on
that interval is given by
\[ \text{Mean value} = \frac{1}{b-a}\int_a^b f(x)\,dx. \tag{1} \]
This suggests the possibility that other ideas from statistics may also carry over in
natural ways to a function f on an interval [a, b]. In this article, we will investigate
the meaning of several other statistical measures associated with a function f on an
interval, including the variance, the standard deviation, and the median, as well as the
shape of the distribution of values of the function about the mean.
The Standard Deviation of a Function
We denote the mean value of a function f by μ, so that μ represents the average of
all the possible values of f between a and b. Suppose that the interval [a, b] is partitioned
into n subintervals, each of length ∆x = (b – a)/n, so that n = (b – a)/∆x. This gives rise to
a presumably large number of uniformly spaced points x_0 = a, x_1 = a + ∆x, …, x_n = b. If
n is sufficiently large, the mean μ of f on the interval should be reasonably well
approximated by the mean of the n + 1 values f(x_0), f(x_1), f(x_2), …, f(x_n), so that
\[ \text{Mean of } f \approx \frac{\sum_{i=0}^{n} f(x_i)}{n} = \frac{1}{b-a}\sum_{i=0}^{n} f(x_i)\,\Delta x, \]
and this multiple of a Riemann Sum in the last expression is a good approximation to the
definite integral in (1).
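As a rough illustration of this approximation, the following Python sketch averages the values of a function at uniformly spaced points and compares the result with the exact mean from Formula (1). The function name `sample_mean` and the choice of f(x) = e^x on [0, 1] with n = 5000 are assumptions made only for this illustration. The sketch divides by the number of sampled values, n + 1, rather than by n; the two agree in the limit.

```python
import math

def sample_mean(f, a, b, n):
    """Mean of the n + 1 values f(x_0), ..., f(x_n) at uniformly spaced points."""
    dx = (b - a) / n
    values = [f(a + i * dx) for i in range(n + 1)]
    # Divide by the number of values actually sampled (n + 1).
    return sum(values) / len(values)

# Illustrative choice: f(x) = e^x on [0, 1], whose exact mean is e - 1.
approx = sample_mean(math.exp, 0.0, 1.0, 5000)
exact = math.e - 1.0
print(approx, exact)
```

Increasing n should bring the two printed numbers closer together, which is the limiting process that turns the sum into the integral.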
We now consider the variance of the function on this interval, which is a measure
of the spread of the values of the function f about the mean. If n is sufficiently large, the
variance of f on the interval should be reasonably well approximated by the variance of
the n + 1 values f(x_0), f(x_1), f(x_2), …, f(x_n), so that
\[ \text{Variance of } f \approx \frac{\sum_{i=0}^{n} [f(x_i)-\mu]^2}{n} = \frac{1}{b-a}\sum_{i=0}^{n} [f(x_i)-\mu]^2\,\Delta x, \]
which is a multiple of a Riemann sum. (We use the formula for the variance of a
population to make things algebraically simpler; since we eventually take the limit as n
→ ∞, it does not make a difference in the final results.) To obtain the exact value for the
variance, we take the limit as n → ∞, or equivalently as ∆x → 0, and so find
\[
\begin{aligned}
\text{Variance of } f &= \lim_{\Delta x \to 0} \frac{1}{b-a}\sum_{i=0}^{n} [f(x_i)-\mu]^2\,\Delta x
 = \frac{1}{b-a}\int_a^b [f(x)-\mu]^2\,dx \\
&= \frac{1}{b-a}\int_a^b [f^2(x) - 2\mu f(x) + \mu^2]\,dx.
\end{aligned}
\]
However, since \(\mu = \frac{1}{b-a}\int_a^b f(x)\,dx\), which will be a constant, we can simplify this
expression for the variance to get
\[
\begin{aligned}
\text{Variance of } f &= \frac{1}{b-a}\int_a^b f^2(x)\,dx - 2\mu\cdot\frac{1}{b-a}\int_a^b f(x)\,dx + \mu^2\cdot\frac{1}{b-a}\int_a^b 1\,dx \\
&= \frac{1}{b-a}\int_a^b f^2(x)\,dx - 2\mu\cdot\mu + \frac{1}{b-a}\,\mu^2 (b-a) \\
&= \frac{1}{b-a}\int_a^b f^2(x)\,dx - 2\mu^2 + \mu^2 \\
&= \frac{1}{b-a}\int_a^b f^2(x)\,dx - \mu^2.
\end{aligned}
\]
We now define the standard deviation σ of the function f about the mean μ as
\[ \sigma = \sqrt{\text{Variance}} = \sqrt{\frac{1}{b-a}\int_a^b f^2(x)\,dx - \mu^2}. \tag{2} \]
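Formula (2) can also be evaluated numerically when the integrals are awkward to do by hand. The Python sketch below is one possible way to do this, using midpoint Riemann sums for both integrals. The name `mean_and_std`, the default of 100,000 subintervals, and the test function e^x on [0, 1] are assumptions chosen for illustration, not part of the article.

```python
import math

def mean_and_std(f, a, b, n=100_000):
    """Approximate mu (Formula 1) and sigma (Formula 2) with midpoint Riemann sums."""
    dx = (b - a) / n
    integral_f = 0.0    # approximates the integral of f(x) over [a, b]
    integral_f2 = 0.0   # approximates the integral of f(x)^2 over [a, b]
    for i in range(n):
        y = f(a + (i + 0.5) * dx)
        integral_f += y * dx
        integral_f2 += y * y * dx
    mu = integral_f / (b - a)
    variance = integral_f2 / (b - a) - mu * mu
    # Guard against a tiny negative value caused by rounding.
    return mu, math.sqrt(max(variance, 0.0))

# Illustrative check on f(x) = e^x over [0, 1], where both integrals are elementary.
mu, sigma = mean_and_std(math.exp, 0.0, 1.0)
print(mu, sigma)
```

The same call works for any function that can be evaluated pointwise, for example `mean_and_std(math.sin, 0.0, math.pi)`.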
Let’s see what this formula gives us in terms of some of the elementary functions.
Example 1 Consider the linear function f(x) = αx + β on the interval [0, 1]. Using
Formula (1), the mean of the function is
\[ \mu = \frac{1}{1-0}\int_0^1 (\alpha x+\beta)\,dx = \left(\frac{\alpha x^2}{2} + \beta x\right)\Big|_0^1 = \frac{\alpha}{2}+\beta = f\left(\tfrac{1}{2}\right). \]
Thus, as we might have expected, the mean of the linear function is equal to the value at
the midpoint of the interval. (The same is true for any linear function – its mean over any
interval is always equal to the value of the linear function at the midpoint of the interval.)
Next, we use Formula (2) to calculate the standard deviation, which is
\[ \sigma = \sqrt{\frac{1}{1-0}\int_0^1 (\alpha x+\beta)^2\,dx - \left(\frac{\alpha}{2}+\beta\right)^2}. \]
After some manipulation, we find that this reduces to σ = √(α²/12) ≈ 0.288675α (taking α > 0). The values
of the function on the interval [0, 1] range from β (when x = 0) to α + β (when x = 1) and
are centered vertically at ½α + β. Because σ is somewhat more than ¼ of α, the extreme values
lie ½α ≈ 1.73σ from the mean, and we conclude that the entire collection of values of the
linear function lies within two standard deviations of the mean.
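The manipulation referred to in this example is short enough to write out. The following is a sketch of one way to carry it out, using only Formula (2) and the mean ½α + β found above:

```latex
\[
\begin{aligned}
\sigma^2 &= \int_0^1 (\alpha x+\beta)^2\,dx - \left(\frac{\alpha}{2}+\beta\right)^2 \\
&= \left(\frac{\alpha^2}{3} + \alpha\beta + \beta^2\right) - \left(\frac{\alpha^2}{4} + \alpha\beta + \beta^2\right) \\
&= \frac{\alpha^2}{12}.
\end{aligned}
\]
```

The constant β cancels, as it should: shifting a function vertically changes its mean but not its spread.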
Example 2 Consider the exponential function f(x) = e^x on [0, 1]. Formula (1) gives
\[ \mu = \frac{1}{1-0}\int_0^1 e^x\,dx = e^x\Big|_0^1 = e - 1 \approx 1.71828. \]
Formula (2) then gives
\[ \sigma = \sqrt{\frac{1}{1-0}\int_0^1 (e^x)^2\,dx - (e-1)^2} = \sqrt{\int_0^1 e^{2x}\,dx - (e-1)^2}, \]
which eventually reduces to
\[ \sigma = \sqrt{-\tfrac{1}{2}\,(e^2 - 4e + 3)} \approx 0.49197, \]
or about ½. Consequently, a spread of one standard deviation about the mean extends
from a minimum height of 1.22631 to a maximum height of 2.21025, as shown in Figure
1, while the function extends from a minimum height of 1 to a maximum height of e ≈
2.71828. Furthermore, the entire set of values of the function on this interval lies
roughly within two standard deviations of the mean: the minimum is about 1.46σ below
the mean and the maximum is about 2.03σ above it.
[Figure 1: graph of f(x) = e^x on [0, 1] showing the spread of one standard deviation about the mean]
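For readers who want to see how the expression for σ in Example 2 reduces, here is a sketch of the algebra, starting from Formula (2) with μ = e − 1. The factored form on the last line is an added observation, not part of the original text; it makes it easy to see that the quantity under the square root is positive.

```latex
\[
\begin{aligned}
\sigma^2 &= \int_0^1 e^{2x}\,dx - (e-1)^2
          = \frac{e^2 - 1}{2} - \left(e^2 - 2e + 1\right) \\
&= -\tfrac{1}{2}\left(e^2 - 4e + 3\right)
 = \tfrac{1}{2}\,(e-1)(3-e) \approx 0.24204 .
\end{aligned}
\]
```

Taking the square root gives σ ≈ 0.49197.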
Example 3 Consider the family of power functions f(x) = x^n on [0, 1]. Formula (1) gives
\[ \mu = \frac{1}{1-0}\int_0^1 x^n\,dx = \frac{x^{n+1}}{n+1}\Big|_0^1 = \frac{1}{n+1}. \]
Formula (2) then gives
\[ \sigma = \sqrt{\frac{1}{1-0}\int_0^1 (x^n)^2\,dx - \left(\frac{1}{n+1}\right)^2} = \sqrt{\int_0^1 x^{2n}\,dx - \left(\frac{1}{n+1}\right)^2}, \]
which eventually reduces to
\[ \sigma = \sqrt{\frac{1}{2n+1} - \frac{1}{(n+1)^2}}. \]
For instance, for f(x) = x², we have μ = ⅓ and σ = √(4/45) ≈ 0.298142, and a spread of one
standard deviation about the mean extends from a height of 0.03519 to a height of
0.63148, as shown in Figure 2. The function lies in this band for x between about 0.188
and 0.795, so the band encompasses roughly 61% of the totality of all values of the
function on this interval. A spread of two standard deviations reaches up to a height of
about 0.9296, which the function exceeds only for x greater than about 0.964, so it
encompasses roughly 96% of the values.
If n = 3, we have μ = ¼ and σ = √(1/7 − 1/16) ≈ 0.283473, so that a spread of one
standard deviation about the mean extends from a minimum height of -0.033473 to a
maximum height of 0.53347, as shown in Figure 3. Because the function is never negative
on this interval, the band contains every value with x below about 0.811, or roughly 81%
of the totality of all values. A spread of two standard deviations reaches up to a height
of about 0.8169 and encompasses roughly 93% of the values.
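[Figure 2: graph of f(x) = x² on [0, 1] showing the spread of one standard deviation about the mean]
[Figure 3: graph of f(x) = x³ on [0, 1] showing the spread of one standard deviation about the mean]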
All power functions with n > 0 pass through the origin and the point (1, 1); when
n > 1, the graph is concave up. The higher the power n, the slower the initial growth and
then the faster the final spurt to reach the final height of 1. Consequently, it makes sense
that the mean of the values of the function will be smaller when n is larger; there are
more “small” values for the function. In turn, once n is at least 2, there is less variation
among the values of the function and so the standard deviation also becomes smaller as n
increases. (From n = 1 to n = 2 it rises slightly, from about 0.2887 to about 0.2981.)
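The closed forms from Example 3 make it easy to tabulate how the mean, the standard deviation, and the share of values within one or two standard deviations change with n. The Python sketch below is one way to do so. The helper names `power_stats` and `coverage` are hypothetical, and "share of values" is taken to mean the fraction of the interval [0, 1] on which x^n lies inside the band, which matches the uniformly spaced sampling used throughout the article.

```python
import math

def power_stats(n):
    """Closed forms for f(x) = x**n on [0, 1]: mu = 1/(n+1), sigma from Formula (2)."""
    mu = 1.0 / (n + 1)
    sigma = math.sqrt(1.0 / (2 * n + 1) - mu * mu)
    return mu, sigma

def coverage(n, k):
    """Fraction of [0, 1] on which x**n lies within k standard deviations of the mean."""
    mu, sigma = power_stats(n)
    lo = max(mu - k * sigma, 0.0)  # x**n takes no values below 0 on [0, 1]
    hi = min(mu + k * sigma, 1.0)  # ... and none above 1
    # x**n is increasing on [0, 1], so the band lo <= x**n <= hi is a single x-interval.
    return hi ** (1.0 / n) - lo ** (1.0 / n)

# Columns: n, mean, standard deviation, share within 1 sigma, share within 2 sigma.
for n in range(1, 6):
    mu, sigma = power_stats(n)
    print(n, round(mu, 5), round(sigma, 6),
          round(coverage(n, 1), 3), round(coverage(n, 2), 3))
```

Reading down the printed columns shows the mean falling steadily with n, while the standard deviation peaks at n = 2 before declining.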
The Median of a Function
We next turn to the idea of finding the median value of a function f on an interval [a, b].
As before, we start with the linear function f(x) = αx + β. It is apparent that the median
value should occur midway between f(a) and f(b), which is precisely the same height as
the mean μ. In fact, if f is any monotonic function on the interval [a, b], the median
value always occurs precisely at the midpoint of the interval, since precisely half of the
values will be below this level and half will be above it. Thus, for a monotonic function,
the median is considerably easier to find than the mean is.
Things are not quite so simple when the function is not monotonic, however. For
some classes of functions, we can find the median using geometric arguments. For most
functions, however, it does not seem possible to get an exact value for the median and we
have to settle for using Monte Carlo simulations to estimate the value. We illustrate
some of the potential complications that can arise with the function f(x) = x² on various
intervals. We begin with the interval [0, 2]. Since the function is monotonic on this
interval, the median occurs at the midpoint of the interval, x = 1, so that the median value
is 1.
Next, consider the interval [-1, 3]. Because the function is not monotonic on this
interval, we reason as follows. The length of the interval is 4. Half of all the values,
which are the smaller values (in this case at most y = 1), occur between x = -1 and x = 1
and the other half of all the values, which are the larger values (y = 1 or larger), occur
between x = 1 and x = 3. Therefore, the median value for the function on this interval is
1; half of the values are below this level and half are above it.
Next, consider the interval [-1, 4], whose length is 5. The smaller values (those
less than 1) occur between x = -1 and x = 1 and the equivalent number of larger values
(now 4 or larger) all occur between x = 2 and x = 4. Consequently, the median height
must occur between x = 1 and x = 2. However, on this subinterval, the function is
monotonic, so the median occurs at the midpoint, x = 1.5, and is therefore equal to 2.25.
Now consider the interval [-1, 1] having length 2. We seek the height such that
half of the values of the function are above this level and half are below it. We use the
symmetry of the function on this interval to reason that the middle half of the interval
extends from x = -0.5 to x = 0.5, and so the median height is simply f(0.5) = 0.25 = ¼. In
a comparable way, consider the interval [-3, 3] having length 6. The middle half of the
interval extends from x = -1.5 to x = 1.5, and so the median is f(1.5) = 2.25.
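The geometric arguments for f(x) = x² can be condensed into a single condition: the median height m is the level at which the set of x with x² ≤ m occupies exactly half the length of the interval. The following sketch restates two of the cases above in that form; the notation m for the median is introduced here only for convenience.

```latex
\[
\begin{aligned}
\text{On } [-1, 4] \ (m \ge 1):&\quad \operatorname{length}\{x : x^2 \le m\} = \sqrt{m} + 1 = \tfrac{5}{2}
   \;\Longrightarrow\; \sqrt{m} = \tfrac{3}{2},\quad m = \tfrac{9}{4} = 2.25, \\
\text{On } [-1, 1]:&\quad \operatorname{length}\{x : x^2 \le m\} = 2\sqrt{m} = 1
   \;\Longrightarrow\; \sqrt{m} = \tfrac{1}{2},\quad m = \tfrac{1}{4} = 0.25 .
\end{aligned}
\]
```

Both results agree with the values found by the midpoint and symmetry arguments.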
On the other hand, suppose we have the function f(x) = sin(x²), say on the interval
[0, 2], whose graph is shown in Figure 4. The function is definitely not monotonic, nor
is there any symmetry that we can utilize to deduce the value of the median. In fact, there
seems to be no obvious way to calculate the median. (Perhaps some interested readers
can develop an analytic approach to determine the median of such a function on an
arbitrary interval.) Instead, we resort to a simple numerical approach. Using a program
such as Excel, we can create a spreadsheet that generates a table containing a large
number of values (we use 5000 points) of any function on any desired interval and then
have it calculate the median of the values in the table. On the interval [0, 2], we then
find that this function has a median value of 0.16867.
[Figure 4: graph of f(x) = sin(x²)]
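The tabulate-and-take-the-median procedure described above does not depend on a spreadsheet. As a rough sketch, the same idea can be written in a few lines of Python. The function name `tabulated_median` is hypothetical, and the default of 5000 uniformly spaced points simply mirrors the table size mentioned in the text. The sketch is checked here against the x² cases whose medians were found exactly.

```python
import statistics

def tabulated_median(f, a, b, n=5000):
    """Median of the values of f at n uniformly spaced points of [a, b]."""
    dx = (b - a) / (n - 1)
    return statistics.median(f(a + i * dx) for i in range(n))

def square(x):
    return x * x

# Cases worked out by hand in the text: exact medians are 2.25, 0.25 and 1.
print(tabulated_median(square, -1.0, 4.0))
print(tabulated_median(square, -1.0, 1.0))
print(tabulated_median(square, -1.0, 3.0))
```

Because the estimate comes from a finite table, it should be expected to match the exact median only to two or three decimal places; a larger n tightens the agreement.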
The Distribution of the Values of a Function
Having found the mean and standard deviation of a function f on an interval [a, b], we
next consider the distribution of the values of the function about the mean in the sense of
determining the pattern or shape of that distribution: are the function’s values
roughly normally distributed about the mean, or is there any other pattern in the
distribution?
We again start with a linear function. After a little thought, it should be evident
that the values of the function will always be uniformly distributed about the mean:
within any vertical interval of a given size, there will always be the same number of
points. To check this out, consider the distribution of the values of the linear
function f(x) = 4x + 5 on the interval [0, 1]; its mean is f(0.5) = 7. The distribution of the
values of the function can be seen in the histogram in Figure 5, which validates our
expectation that the pattern will be roughly uniform. (The short bar at the left is a
consequence of the way that the subintervals were defined.)
[Figure 5: histogram of the values of f(x) = 4x + 5 on [0, 1]; mean = 7]
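A histogram of this kind can be produced by sampling the function at uniformly spaced points and counting how many values fall in each of several equal-height bins. The Python sketch below shows one way to do that. The name `value_histogram`, the choice of 10 bins, and the 5000 sample points are assumptions made for illustration, and the bin edges will not necessarily match those used for Figure 5.

```python
def value_histogram(f, a, b, bins=10, n=5000):
    """Counts of f's values, sampled at n uniformly spaced points, in equal-width bins."""
    dx = (b - a) / (n - 1)
    values = [f(a + i * dx) for i in range(n)]
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("f is constant on the sample; no spread to bin")
    width = (hi - lo) / bins
    counts = [0] * bins
    for y in values:
        k = min(int((y - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[k] += 1
    return counts

# The linear function from the text: every bin should hold about the same count.
counts = value_histogram(lambda x: 4 * x + 5, 0.0, 1.0)
print(counts)
```

Passing a nonlinear function instead, such as `math.exp`, shows at a glance how the counts pile up in some bins and thin out in others.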
Now let’s consider some nonlinear functions. Suppose we look at the
distribution of the values of the exponential function f(x) = e^x on the
interval [0, 1]. As we found above, the mean of the values for this function is
e – 1 ≈ 1.71828. The resulting histogram of the values of the function is shown in
Figure 6, from which we observe that the values are clearly skewed to the right. To
understand why this is the case, think about the graph of the exponential function,
which is increasing and concave up. The function therefore grows relatively slowly at the
left and ever more rapidly as x increases. Consequently, it makes sense that the values at
the left will be much more tightly clustered than the values at the right and therefore we
should expect that the values of the function will be skewed to the right.
[Figure 6: histogram of the values of f(x) = e^x on [0, 1]; mean = 1.71828]
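This observation suggests some general rules for any monotonic function. If the
function is increasing and concave up, the distribution of function values will be skewed
to the right. If the function is increasing and concave down, the distribution will be
skewed to the left. The same rules hold for decreasing functions, because reflecting a
graph left to right reverses the direction of the function but changes neither its
concavity nor its collection of values. Thus, if the function is decreasing and concave
up, the distribution will be skewed to the right. Finally, if the function is decreasing
and concave down, the distribution will be skewed to the left. In every case the values
cluster at the heights where the graph is flattest.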
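The direction of skew for each combination of monotonicity and concavity can be checked numerically. The Python sketch below estimates the skewness (third standardized moment) of the values of four sample functions on [0, 1]; a positive result indicates skew to the right and a negative result skew to the left. The function name `skewness` and the four sample functions are choices made for this illustration only.

```python
import math

def skewness(f, a, b, n=20000):
    """Third standardized moment of f's values at the n midpoints of a uniform partition."""
    dx = (b - a) / n
    values = [f(a + (i + 0.5) * dx) for i in range(n)]
    mu = sum(values) / n
    var = sum((y - mu) ** 2 for y in values) / n
    third = sum((y - mu) ** 3 for y in values) / n
    return third / var ** 1.5

# One illustrative function for each combination, all on [0, 1].
cases = {
    "increasing, concave up   (e^x)": math.exp,
    "increasing, concave down (sqrt x)": math.sqrt,
    "decreasing, concave up   (e^-x)": lambda x: math.exp(-x),
    "decreasing, concave down (1 - x^2)": lambda x: 1.0 - x * x,
}
for label, f in cases.items():
    print(label, round(skewness(f, 0.0, 1.0), 3))
```

In this sketch the sign of the result follows the concavity of the function rather than its direction: both concave-up examples come out positive and both concave-down examples come out negative.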
What if the function is not monotonic on the given interval?
Consider f(x) = x² on the interval [-1, 3]. Using Formula (1), we find that the
mean value of the function on this interval is 7/3 ≈ 2.3333. The corresponding
histogram showing the distribution of the values of the function about the
mean is in Figure 7. The possible heights extend from 0 to 9, and the values are
obviously highly skewed to the right, with most of them at heights near 0.
[Figure 7: histogram of the values of f(x) = x² on [-1, 3]; mean = 2.3333]
The preponderance of heights near 0 in the histogram follows from two
factors. First, the closer we are to the origin, the more slowly the function grows or
decays, so the points will be more tightly clustered there. Second, because there is a
turning point at the origin, the smaller values (those that are less than 1) will each occur
twice, again leading to a denser distribution of points near a height of 0.
As a final example, consider f(x) = cos x on the interval [0, 2π]. The mean value
of the function is obviously 0. What can we expect of the distribution of values about
this mean? The values of the function clearly extend from -1 to 1. We should
expect denser clusters of points near the height of each of the turning points of the
function, namely about a height of 1 and a height of -1. We should also expect lower
densities near the heights corresponding to each of the inflection points, where the
function grows or decays most rapidly. All the inflection points occur at a height of 0,
so the density should be lowest near the mean. The corresponding histogram is shown in
Figure 8, where we see that these predictions are borne out.
[Figure 8: histogram of the values of f(x) = cos x on [0, 2π]; mean = 0]
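The prediction for the cosine function can be checked without drawing a histogram by comparing how many sampled values fall in bands of equal height near the top, the bottom, and the middle of the range. The Python sketch below does this; the function name `fraction_in_band`, the band height of 0.2, and the 5000 sample points are assumptions chosen for illustration.

```python
import math

def fraction_in_band(f, a, b, lo, hi, n=5000):
    """Fraction of n uniformly spaced sample points of [a, b] where lo <= f(x) <= hi."""
    dx = (b - a) / n
    hits = sum(1 for i in range(n) if lo <= f(a + (i + 0.5) * dx) <= hi)
    return hits / n

# Three bands of equal height 0.2: near the maximum, near the mean, near the minimum.
near_top = fraction_in_band(math.cos, 0.0, 2 * math.pi, 0.8, 1.0)
near_zero = fraction_in_band(math.cos, 0.0, 2 * math.pi, -0.1, 0.1)
near_bottom = fraction_in_band(math.cos, 0.0, 2 * math.pi, -1.0, -0.8)
print(near_top, near_zero, near_bottom)
```

The two outer bands should each capture a noticeably larger share of the values than the band around the mean, in line with the reasoning about turning points and inflection points.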
Acknowledgment The work described in this article was supported by the Division of
Undergraduate Education of the National Science Foundation under grants DUE-0089400, DUE-0310123, and DUE-0442160. However, the views expressed do not
necessarily reflect those of the Foundation.
October 10, 2007
Barbara Rives, Editor
The AMATYC Review
Abilene Christian University
204 Hardin Administration Building
ACU Box 29140
Abilene, Texas 79699-9140
Dear Barbara,
Enclosed please find five copies of an article entitled The Statistics of a Function for your
consideration for possible publication in The AMATYC Review.
Thanks for your kind attention. I look forward to hearing your decision on this article.
Hope all is going well.
Sincerely yours,
Sheldon P. Gordon