The Pythagorean Theorem and Its Consequences
Jim Emery
Edited: 8/4/13
Contents
1 Pythagoras: Biographical Sketch
2 Eight Proofs of the Pythagorean Theorem
2.1 Proof I: Euclid’s Elements
2.2 Proof II: The Ascent of Man
2.3 Proof III: Garfield
2.4 Proof IV: An Arrangement of Four Triangles in a Square of side a + b
2.5 Proof V: An Arrangement of Four Triangles in a Square of side c
2.6 Remarks on Geometric Proofs Versus Algebraic Proofs
2.7 Proof VI: Equating Two Square Arrangements
2.8 Proof VII: Triangle Area Proportional to the Hypotenuse Squared
2.9 Proof VIII: Similarity and Proportion
3 A Crisis in Greek Mathematics: What is a real number?
4 Inequalities
5 Euclidean Distance
6 Distance Functions and the Metric Space
7 Vector Spaces and Inner Product Spaces
8 Normed Linear Spaces
9 Normed Linear Spaces and Functional Analysis
10 Hilbert Space and $\ell^2$
11 Orthogonality, Orthogonal Polynomials, Fourier Series
12 Projections
13 Linear Least Squares Problems as Geometric Problems: Orthogonality and the Pythagorean Theorem
14 Elementary Formulation of the Least Squares Problem for Straight Line Fitting
15 A Geometric View of the Least Squares Problem
16 Bibliography
1 Pythagoras: Biographical Sketch
Pythagoras proclaimed that "All is Number" (that is, all is mathematics).
Pythagoras was born in Samos about 570 BC and died about 495 BC. Knowledge about him is vague and uncertain. He is said to have related mathematics to music, believed in reincarnation, and founded a secret religion in
southern Italy in the town of Croton, a Greek colony. Much that is said about him
may be apocryphal. But perhaps he was the first to call himself a philosopher (lover of wisdom). Many later philosophers claimed to have been
influenced by his ideas. The Pythagorean theorem itself may have originated
in the cultures of the Babylonians and Indians, although he may have been
the first to write down a formal proof of this theorem, earlier versions being
folklore and tradition.
2 Eight Proofs of the Pythagorean Theorem
Most proofs are obvious from geometrical figures. Some proofs are algebraic;
many use the concepts of similar triangles and proportions.
2.1 Proof I: Euclid’s Elements
In our figure for Euclid’s proof, which appears in his work The Elements, two overlaid shaded triangles are congruent, and so have equal areas.
Corresponding to each triangle is a rectangle of double its area.
One such rectangle is a square on a side of the original right triangle. The
other makes up a portion of the square on the hypotenuse. So suppose the
triangle sides are $a$ and $b$ and the hypotenuse is $c$. Then $a^2$
is equal to the area of a sub-rectangle of the square on the hypotenuse $c$.
Similarly, $b^2$ is equal to the area of the rest of the square
on the hypotenuse. Thus
$$a^2 + b^2 = c^2.$$
2.2 Proof II: The Ascent of Man
Jacob Bronowski devotes several pages to a proof of the Pythagorean
theorem in his book, The Ascent of Man, and in his television series. This
occurs in the chapter called The Music of the Spheres and in an episode
similarly titled in the television series. See the figures captioned The Ascent
of Man.

Figure 1: The Pythagorean Theorem. The area of a square on the
hypotenuse of a right triangle is equal to the sum of the squares on the sides.
$a^2 + b^2 = c^2$, where here $a = 6$, $b = 8$, $c = 10$.

Figure 2: Proof I: Euclid’s Elements. The short side of the triangle is $a$,
the long side is $b$ and the hypotenuse is $c$. The more darkly shaded triangle,
rotated counterclockwise by 90 degrees, will fall exactly on the more lightly
shaded triangle. So these two triangles are congruent. The line from the top
vertex divides the square on the hypotenuse $c$ into a left rectangle $L$ and a
right one $R$. The dark triangle has area $b^2/2$, because its base has length $b$,
as does its height. The area of the lightly shaded triangle is 1/2 that of the
right sub-rectangle $R$. Therefore the area of $R$ is $b^2$. Repeating the argument
on the left side of the figure with two new triangles, we find the area of $L$ is
$a^2$. Therefore $c^2 = a^2 + b^2$.

Figure 3: Proof II: The Ascent of Man. Jacob Bronowski in his book
The Ascent of Man discusses this proof on pages 156-162. The book is
based on the 1972 BBC television series of the same name. The small side
of the triangle is $a$, the long side $b$, the hypotenuse $c$. The area of the left
figure is $c^2$. The shaded inner square has side length $b - a$. Rearranging the
pieces of the left figure we get the right figure consisting of a small square of
area $a^2$, and a larger composite square of area $b^2$. Therefore $a^2 + b^2 = c^2$.

Figure 4: Proof II: The Ascent of Man. Jacob Bronowski in his book
The Ascent of Man discusses this proof on pages 156-162. The book is
based on the 1972 BBC television series of the same name. The small side
of the triangle is $a$, the long side $b$, the hypotenuse $c$. The area of the left
figure is $c^2$. The shaded inner square has side length $b - a$. Rearranging the
pieces of the left figure we get the right figure consisting of a square region
of area $b^2$ (shaded region), and a square region of area $a^2$ (unshaded region).
Therefore $a^2 + b^2 = c^2$.
2.3 Proof III: Garfield
James A. Garfield contributed an original proof of the Pythagorean theorem.
Of course most proofs of this theorem are rather similar. I had heard about
Garfield’s proof many times, but had not actually seen it. However, his
proof is presented in the book: Welchons, Krickenberger, Pearson, Plane
Geometry.
I graduated from James A. Garfield elementary school in Long Beach,
California, a few years back, so I am closely connected to Garfield. Garfield
was one of our assassinated presidents, a rather interesting person, an exception to our rather dull and dim-witted group of presidents in general. His
assassin Charles Guiteau had a connection with the Oneida Community in
Oneida, New York. This was a 19th century social experiment devoted to
"free" love. For an interesting treatment of these matters see Sarah Vowell’s
book Assassination Vacation. If you are not familiar with Sarah, her quirky
personality and her squeaky voice, as heard on This American Life, you
are really missing out.
Garfield’s proof uses two copies of the triangle, which has
short side $a$, long side $b$, and hypotenuse $c$. We rest one copy on its short
side $a$, the other on the long side $b$, so that the two triangles touch at a
point. Then we add a line joining the top vertex of the first triangle to
the top vertex of the second triangle, getting a trapezoid (see the Garfield
figure). A trapezoid is a quadrilateral with two parallel opposite sides. The
area of the trapezoid is the average length of its two parallel sides times
the perpendicular distance between its parallel sides (this can be shown by
decomposing the trapezoid into two triangles by drawing a diagonal). Here the
parallel sides have lengths $a$ and $b$, and the distance between them is $a + b$. So the
area of the trapezoid is
$$A = (a + b)\,\frac{a + b}{2} = \frac{1}{2}(a^2 + 2ab + b^2) = \frac{a^2}{2} + ab + \frac{b^2}{2}.$$
On the other hand writing the area as the sum of the areas of the three
triangles, we have
$$A = \frac{ab}{2} + \frac{ab}{2} + \frac{c^2}{2} = ab + \frac{c^2}{2}.$$
Equating these two expressions for $A$, we obtain
$$a^2 + b^2 = c^2.$$

Figure 5: Proof III: Garfield’s Proof. The long side of the triangle is
$b$, the short side $a$, the hypotenuse $c$. The area of the trapezoid is $A =
(a + b)(a + b)/2 = a^2/2 + ab + b^2/2$. The area as the sum of the three
triangles is $A = ab + c^2/2$. Equating the two expressions for $A$ we obtain the
result $a^2 + b^2 = c^2$.
2.4 Proof IV: An Arrangement of Four Triangles in a Square of side a + b
Consider the figure called An Arrangement of Four Triangles in a
Square of side a + b. The short side of the triangle is $a$, the long side $b$, and
the hypotenuse $c$. The area of the enclosing square is $A = (a+b)^2 = a^2 + 2ab + b^2$. The area
of the four triangles and the inner square is $A = 4(ab/2) + c^2 = 2ab + c^2$.
Equating these two expressions for the area $A$ and cancelling $2ab$, we have $a^2 + b^2 = c^2$.
2.5 Proof V: An Arrangement of Four Triangles in a Square of side c
Consider the figure called An Arrangement of Four Triangles in a
Square of side c. The short side of the triangle is $a$, the long side $b$, and
the hypotenuse $c$. The area of the enclosing square is $A = c^2$. The area of
the four triangles and the inner square is $A = 4(ab/2) + (b - a)^2 = 2ab + b^2 - 2ab + a^2 = a^2 + b^2$.
Equating these two expressions for the area $A$, we have $a^2 + b^2 = c^2$.
2.6 Remarks on Geometric Proofs Versus Algebraic Proofs
Euclid’s proof is purely geometric, with no reliance on algebra. The figure
titled An Arrangement That Leads to a Geometric Proof will lead
to another purely geometric proof. Most of the other proofs involve
a slight amount of algebra.
2.7 Proof VI: Equating Two Square Arrangements
Referring to the figure for Proof VI, the left enclosing square and the right
enclosing square have the same area. Therefore the sum of the areas of the
two shaded squares on the left is equal to the area of the shaded square on
the right. That is, the area of the square on the short side of the triangle,
plus the area of the square on the long side of the triangle, is equal to the area of the
square on the hypotenuse.
Figure 6: Proof IV: An Arrangement of Four Triangles in a Square
of side a + b. The short side of the triangle is $a$, the long side $b$, and the
hypotenuse $c$. The area of the enclosing square is $A = (a + b)^2$. The area
of the four triangles and the inner square is $A = 4(ab/2) + c^2 = 2ab + c^2$.
Equating these two expressions for the area $A$ we have $a^2 + b^2 = c^2$.

Figure 7: Proof V: An Arrangement of Four Triangles in a Square of
side c. The short side of the triangle is $a$, the long side $b$, and the hypotenuse
$c$. The area of the enclosing square is $A = c^2$. The area of the four triangles
and the inner square is $A = 4(ab/2) + (b - a)^2 = a^2 + b^2$. Equating these
two expressions for the area $A$ we have $a^2 + b^2 = c^2$.

Figure 8: An Arrangement That Leads to a Geometric Proof. By
equating this square with a certain second square also of side $a + b$ we arrive at
a clearly geometric proof of the Pythagorean Theorem that uses no algebra,
and could have been employed by the Greeks, who did not have algebra
available. They used numbers in their arguments, but to them numbers were
line segment lengths. Through this means Euclid treated the concept of
proportional numbers.

Figure 9: Proof VI: Equating Two Square Arrangements. The left
enclosing square and the right enclosing square have the same area. Therefore
the sum of the areas of the two shaded squares on the left is equal to the
area of the shaded square on the right. That is, the area of the square on the
short side of the triangle, plus the area of the square on the long side of the triangle, is
equal to the area of the square on the hypotenuse.
2.8 Proof VII: Triangle Area Proportional to the Hypotenuse Squared
See the figure for Proof VII. Let the short side of the triangle be $a$, the
long side $b$, and the hypotenuse $c$. Similar right triangles have areas proportional to the squares of their hypotenuses, with the same proportionality constant, say $\alpha$. This follows because similar triangles have
corresponding sides that are proportional, so the ratios of triangle sides
are the same for similar triangles. This is established in Euclidean geometry, and is the basis of trigonometry. In particular, similar triangles have the
same acute angles. Let one of them be $\theta$. Then the area of the triangle is
$A = ab/2 = (c\cos\theta)(c\sin\theta)/2 = \alpha c^2$, where $\alpha = \cos(\theta)\sin(\theta)/2$. The vertical
line, which is the altitude from the right angle to the hypotenuse, divides the triangle into two triangles similar to the original, a left one and a right one.
The hypotenuse of the left sub-triangle is $a$, and that of the right one is $b$. Thus their areas
are $\alpha a^2$ and $\alpha b^2$. The area of the original triangle is $\alpha c^2$, and it is the sum of the areas of the two sub-triangles. So $\alpha a^2 + \alpha b^2 = \alpha c^2$.
Thus $a^2 + b^2 = c^2$.
2.9 Proof VIII: Similarity and Proportion
Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse $c$.
The vertical line divides the triangle into two sub-triangles, both similar to
the original. Referring to the figure, corresponding sides are proportional. The hypotenuse $c$
is divided into two segments, $c_1$ on the left and $c_2$ on the right. We have
$$\frac{a}{c_1} = \frac{c}{a}, \quad \text{so} \quad a^2 = c_1 c,$$
and
$$\frac{b}{c_2} = \frac{c}{b}, \quad \text{so} \quad b^2 = c_2 c.$$
Then
$$a^2 + b^2 = (c_1 + c_2)c = c^2.$$
Figure 10: Proof VII: Area Proportional to the Hypotenuse Squared.
Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse
$c$. Similar right triangles have areas proportional to the squares of their
hypotenuses, with the same proportionality constant, say $\alpha$. This follows because similar triangles have corresponding sides that are proportional, so
the ratios of triangle sides are the same for similar triangles. This is established in Euclidean geometry, and is the basis of trigonometry. In particular, similar triangles have the same acute angles. Let one of them be $\theta$.
Then the area of the triangle is $A = ab/2 = (c\cos\theta)(c\sin\theta)/2 = \alpha c^2$, where
$\alpha = \cos(\theta)\sin(\theta)/2$. The vertical line divides the triangle into two similar triangles, a left one and a right one. The hypotenuse of the left sub-triangle is
$a$, and that of the right one is $b$. Thus their areas are $\alpha a^2$ and $\alpha b^2$. The area of the original
triangle is $\alpha c^2$. So $\alpha a^2 + \alpha b^2 = \alpha c^2$. Thus $a^2 + b^2 = c^2$.

Figure 11: Proof VIII: Similarity and Proportion. Let the short side
of the triangle be $a$, the long side $b$, and the hypotenuse $c$. The vertical
line divides the triangle into two similar triangles. Corresponding sides are
proportional. The hypotenuse $c$ is divided into two segments, $c_1$ on the left and $c_2$ on the
right. We have $a/c_1 = c/a$, so $a^2 = c_1 c$. And $b/c_2 = c/b$, so $b^2 = c_2 c$. Then
$a^2 + b^2 = (c_1 + c_2)c = c^2$.
3 A Crisis in Greek Mathematics: What is a real number?
For the Greeks, numbers were lengths of line segments. Fractions (rational
numbers) are obtained by dividing line segments into equal pieces. They
discovered that the diagonal of a unit square cannot be equal to any whole-number multiple of
a fractional division of the unit length. This was a big problem for
their concept of number!
We show that the square root of a prime number is not rational. Suppose
the integer $p$ is a prime, having no factors other than 1 and itself. Suppose $\sqrt{p}$ could be written as
a rational number, say as a fraction $m/n$, where $m$ and $n$ have no common
factor (if they did, we could divide out the common factors):
$$\sqrt{p} = \frac{m}{n}.$$
Squaring, we have
$$p = \frac{m^2}{n^2}.$$
Then
$$n^2 p = m^2.$$
Hence $p$ divides $m^2$, and because $p$ is prime it must be a factor of $m$, say
$$m = pr.$$
Then
$$n^2 p = p^2 r^2,$$
so $n^2 = p r^2$. But this implies that $p$ is a factor of $n$. This contradicts our assumption that
$m$ and $n$ had no common factor. Therefore the square root of a prime is not
a rational number.
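The argument above can be accompanied by a small computational experiment. The Python sketch below searches for integers $m$, $n$ with $m^2 = p n^2$ over a finite range of denominators. The function names and the search bound are our own illustrative choices, and a finite search is of course no substitute for the proof; it merely shows the search coming up empty for a prime and succeeding for a perfect square.

```python
from math import isqrt

def is_perfect_square(k):
    """Return True if the non-negative integer k is a perfect square."""
    r = isqrt(k)
    return r * r == k

def rational_sqrt_candidates(p, max_n):
    """Return all pairs (m, n) with 1 <= n <= max_n and m*m == p*n*n,
    that is, all fractions m/n in that range whose square is exactly p."""
    return [(isqrt(p * n * n), n)
            for n in range(1, max_n + 1)
            if is_perfect_square(p * n * n)]

print(rational_sqrt_candidates(2, 1000))   # prime: no candidates are found
print(rational_sqrt_candidates(4, 3))      # 4 is a perfect square
```

Exact integer arithmetic is used throughout (`math.isqrt`), so the search is not affected by floating point rounding.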
The real numbers can be defined rigorously as Dedekind cuts, or as equivalence
classes of Cauchy sequences.
LEAST UPPER BOUND AXIOM. If A is any nonempty set of the real
numbers R that is bounded above, then A has a least upper bound.
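4 Inequalities
The Cauchy-Schwarz and Minkowski inequalities (Goldberg).
Cauchy-Schwarz:
$$\left| \sum_{n=1}^{\infty} s_n t_n \right| \le \left[ \sum_{n=1}^{\infty} s_n^2 \right]^{1/2} \left[ \sum_{n=1}^{\infty} t_n^2 \right]^{1/2}$$
Minkowski:
$$\left[ \sum_{n=1}^{\infty} (s_n + t_n)^2 \right]^{1/2} \le \left[ \sum_{n=1}^{\infty} s_n^2 \right]^{1/2} + \left[ \sum_{n=1}^{\infty} t_n^2 \right]^{1/2}$$
If $a$, $b$, $c$ are vectors in a normed vector space, the triangle inequality is
$$\|c - a\| \le \|b - a\| + \|c - b\|.$$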
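For finite sequences, both inequalities can be checked numerically. The Python sketch below computes the gap (right side minus left side) in each inequality; the function names and the sample sequences are our own, chosen only for illustration.

```python
import math

def cauchy_schwarz_gap(s, t):
    """Right side minus left side of
    |sum s_n t_n| <= sqrt(sum s_n^2) * sqrt(sum t_n^2)."""
    lhs = abs(sum(x * y for x, y in zip(s, t)))
    rhs = math.sqrt(sum(x * x for x in s)) * math.sqrt(sum(y * y for y in t))
    return rhs - lhs

def minkowski_gap(s, t):
    """Right side minus left side of
    sqrt(sum (s_n + t_n)^2) <= sqrt(sum s_n^2) + sqrt(sum t_n^2)."""
    lhs = math.sqrt(sum((x + y) ** 2 for x, y in zip(s, t)))
    rhs = math.sqrt(sum(x * x for x in s)) + math.sqrt(sum(y * y for y in t))
    return rhs - lhs

s = [3.0, -1.0, 2.0]
t = [1.0, 4.0, -2.0]
print(cauchy_schwarz_gap(s, t), minkowski_gap(s, t))

# When t is a positive multiple of s, both gaps vanish up to rounding.
t_parallel = [2.0 * x for x in s]
print(cauchy_schwarz_gap(s, t_parallel), minkowski_gap(s, t_parallel))
```

Both gaps are non-negative, and they shrink to zero (up to rounding) when one sequence is a positive multiple of the other.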
5 Euclidean Distance
From the Pythagorean Theorem we are able to define the Euclidean distance
between points. So if we have two points with respective coordinates $p_1 =
(x_1, y_1, z_1)$ and $p_2 = (x_2, y_2, z_2)$, the distance between the points is
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$
So now we can talk about the nearness of points and thus talk about concepts such as continuity and differentiability. Also we are able to formulate
the ideas of analytic geometry.
For example, we are able to define the ellipse as the locus of points for which the sum of the distances to two fixed points, called the foci, is constant. Doing this we arrive at the canonical representation of an ellipse with the equation
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$
and the standard equation of the ellipsoid
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$
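The distance formula translates directly into code. The Python sketch below defines a distance function that works in any dimension and then checks numerically that points of an ellipse have a constant sum of distances to the two foci. The semi-axes 5 and 3 (so the foci lie at $(\pm 4, 0)$ and the constant sum is $2a = 10$) and all names are illustrative assumptions of ours.

```python
import math

def distance(p1, p2):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(p1, p2)))

print(distance((0, 0, 0), (1, 2, 2)))

# Ellipse x^2/a^2 + y^2/b^2 = 1 with semi-axes a > b.
a, b = 5.0, 3.0
focal = math.sqrt(a * a - b * b)          # distance from center to each focus
f1, f2 = (-focal, 0.0), (focal, 0.0)

# Sample points on the ellipse via the parametrization (a cos t, b sin t).
sums = []
for k in range(12):
    t = 2 * math.pi * k / 12
    p = (a * math.cos(t), b * math.sin(t))
    sums.append(distance(p, f1) + distance(p, f2))
print(sums)
```

Every entry of `sums` equals $2a$ up to rounding, which is the defining property of the ellipse.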
6 Distance Functions and the Metric Space
A metric is a distance function $\rho$ defined on some set of points $M$ with the
following four properties. For points $a$, $b$, $c$ of $M$,
(i) $\rho(a, b) \ge 0$,
(ii) $\rho(a, b) = \rho(b, a)$,
(iii) $\rho(a, a) = 0$,
and
(iv) $\rho(a, c) \le \rho(a, b) + \rho(b, c)$.
An open ball about the point $a$ of radius $r$, $B(a, r)$, is the set of all points $p$
such that
$$\rho(a, p) < r.$$
A metric space $(M, \rho)$ consists of a set $M$ with a metric $\rho$.
An open set in a metric space is a set $A$ so that for every point $a$ in $A$
there exists some open ball about $a$ that is a subset of $A$. A metric space $M$
and the class of all its open subsets form a topological space.
The metric for ordinary Euclidean two dimensional space is defined by
the Pythagorean Theorem. So let point $p_1 = (x_1, y_1)$ and point $p_2 = (x_2, y_2)$.
Then the Euclidean distance between the points is the square root of the
sum of the squares of the differences of the coordinates,
$$d(p_1, p_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2},$$
which by the Pythagorean Theorem is the length of the line segment connecting the two points.
The triangle inequality says that the sum of the lengths of two
sides of a triangle is greater than or equal to the length of the third side.
This is metric property (iv):
$$\rho(a, c) \le \rho(a, b) + \rho(b, c).$$
For this simple two dimensional case it follows from the law of cosines. If sides of lengths $a$ and $b$ meet at the angle $\theta$, and $c$ is the length of the opposite side, then
$$c^2 = a^2 + b^2 - 2ab\cos(\theta) \le a^2 + b^2 + 2ab = (a + b)^2,$$
because $\cos(\theta) \ge -1$. Hence $c \le a + b$.
For more general arguments see lineara.pdf, Topics In Linear Algebra
and Its Applications by James Emery.
7 Vector Spaces and Inner Product Spaces
8 Normed Linear Spaces
9 Normed Linear Spaces and Functional Analysis
10 Hilbert Space and $\ell^2$
11 Orthogonality, Orthogonal Polynomials, Fourier Series
12 Projections
13 Linear Least Squares Problems as Geometric Problems: Orthogonality and the Pythagorean Theorem
14 Elementary Formulation of the Least Squares Problem for Straight Line Fitting
The traditional way of deriving least squares equations is to write the expression for the sum of the squared differences between the given "data" and
the approximating function, and then to set the partial derivatives with respect to the coefficients of the approximating function to zero. Let us do
this for the case of fitting a straight line to given data. Assume the model
$f(x) = ax + b$ and minimize
$$r(a, b) = \sum_{i=1}^{n} (a x_i + b - y_i)^2.$$
The conditions for a minimum are
$$\frac{\partial r}{\partial a} = \sum_{i=1}^{n} 2 x_i (a x_i + b - y_i) = 0,$$
$$\frac{\partial r}{\partial b} = \sum_{i=1}^{n} 2 (a x_i + b - y_i) = 0.$$
We get a two by two system of equations:
$$a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i,$$
$$a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} 1 = \sum_{i=1}^{n} y_i.$$
These equations are known as the normal equations of the problem. They
have a unique solution if the determinant is not zero, that is, if
$$n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \ne 0.$$
If the $x$ values are not all equal, this follows from the Cauchy-Schwarz inequality applied to the vectors $(1, 1, \ldots, 1)$ and $(x_1, x_2, \ldots, x_n)$. The general
problem can be viewed more naturally as being geometric.
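The two by two normal equations for the straight line fit can be solved in closed form by Cramer's rule. The following Python sketch does this; the function name `fit_line` and the sample data are our own assumptions, used only to illustrate the formulas of this section.

```python
def fit_line(xs, ys):
    """Least squares fit of f(x) = a*x + b, solving the 2x2 normal equations
        a*sum(x^2) + b*sum(x) = sum(x*y)
        a*sum(x)   + b*n      = sum(y)
    by Cramer's rule. Returns (a, b)."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # zero exactly when all x values are equal
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

# Data lying exactly on y = 2x + 1, so the fit should recover a = 2, b = 1.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(slope, intercept)
```

When all the $x$ values coincide the determinant vanishes and the sketch raises an error rather than dividing by zero.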
15 A Geometric View of the Least Squares Problem
The abstract linear least squares problem may be formulated as approximation in a vector space by some element of a subspace. Often this vector space
is a space of functions. As examples, the subspace could be generated by a
basis such as
$$1, x, x^2, x^3, \ldots,$$
or such as
$$1, \cos(\omega t), \sin(\omega t), \cos(2\omega t), \sin(2\omega t), \ldots$$
The first case would be a polynomial, or power series, approximation. And
the second would be a Fourier or trigonometric approximation. So consider
a vector space $V$ with an inner product of $u$ with $v$, written as $(u, v)$. Given
a subspace $S$ and an arbitrary element $g$ of $V$, we are to find the element in
$S$ that best approximates $g$ in the norm corresponding to the inner product.
The $L^2$ norm for functions is based on the inner product
$$(f, g) = \int f g,$$
and for sequences is based on the inner product
$$(f, g) = \sum_{i=1}^{n} f_i g_i.$$
This $L^2$ norm corresponds directly to the "squares" part of the least squares
approximation. But the theory carries through for an arbitrary inner product. The norm defined by an inner product is
$$\|f\| = (f, f)^{1/2}.$$
A solution $f \in S$ minimizes
$$(f - g, f - g) = \|f - g\|^2.$$
We will show that the problem is solved by the orthogonal projection of a
vector into a subspace. One can think of this as analogous to the simple
geometric problem of projecting a vector in space onto a plane. Think of a
vector from the origin to a point, and think of a plane through the origin,
not containing this vector. The plane is a vector space. The vector in the
plane closest to the original vector is obviously the orthogonal projection of
the vector onto the plane. The same thing happens in the general problem,
where the plane becomes the subspace. For example, the subspace might be
the set of all cubic polynomials, and the problem is to best fit the data with
a cubic polynomial.
Two vectors are orthogonal, i.e. perpendicular, if their inner product is
zero. We require a preliminary theorem to prove the main proposition.
Pythagorean Theorem. If $v_1$ is orthogonal to $v_2$, then
$$\|v_1 + v_2\|^2 = \|v_1\|^2 + \|v_2\|^2.$$
Proof.
$$(v_1 + v_2, v_1 + v_2) = (v_1, v_1) + 2(v_1, v_2) + (v_2, v_2) = (v_1, v_1) + (v_2, v_2).$$
Proposition. If $f \in S$ and $(g - f, h) = 0$ for all $h \in S$, then $f$ is a solution to the
least squares problem.
Proof. Let $s \in S$. We have
$$\|g - s\|^2 = \|(g - f) + (f - s)\|^2 = \|g - f\|^2 + \|f - s\|^2 \ge \|g - f\|^2.$$
By assumption, $g - f$ is orthogonal to the subspace $S$, and $f - s$ is in $S$. So
the second equality is a consequence of the Pythagorean Theorem.
We have shown that
$$\|g - s\| \ge \|g - f\| \quad \text{for all } s \in S,$$
so $f$ is the best approximation to $g$ in $S$, and this completes the proof.
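The proposition can be illustrated with the plane-in-space picture described earlier. The Python sketch below projects a vector $g$ in three dimensional space onto the plane spanned by two vectors, then checks the orthogonality condition and the Pythagorean identity used in the proof. The particular vectors and all function names are our own illustrative choices, and exact rational arithmetic (`fractions.Fraction`) is used so that the equalities hold exactly.

```python
from fractions import Fraction

def inner(u, v):
    """Inner product (u, v) = sum of u_i * v_i."""
    return sum(x * y for x, y in zip(u, v))

def sub(u, v):
    return [x - y for x, y in zip(u, v)]

def norm_sq(u):
    return inner(u, u)

def project_onto_plane(g, f1, f2):
    """Orthogonal projection of g onto span{f1, f2}, found by requiring
    (g - f, f1) = 0 and (g - f, f2) = 0 for f = c1*f1 + c2*f2."""
    g11, g12, g22 = inner(f1, f1), inner(f1, f2), inner(f2, f2)
    r1, r2 = inner(f1, g), inner(f2, g)
    det = g11 * g22 - g12 * g12
    if det == 0:
        raise ValueError("f1 and f2 are linearly dependent")
    c1 = (r1 * g22 - g12 * r2) / det
    c2 = (g11 * r2 - g12 * r1) / det
    return [c1 * x + c2 * y for x, y in zip(f1, f2)]

F = Fraction
g = [F(1), F(2), F(3)]
f1 = [F(1), F(1), F(0)]
f2 = [F(0), F(1), F(1)]

f = project_onto_plane(g, f1, f2)
s = [x + y for x, y in zip(f1, f2)]   # some other element of the plane

print(f)
print(norm_sq(sub(g, s)), norm_sq(sub(g, f)) + norm_sq(sub(f, s)))
```

For any other element $s$ of the plane, the squared distance from $g$ to $s$ splits into the squared distance from $g$ to $f$ plus the squared distance from $f$ to $s$, so no $s$ can be closer to $g$ than $f$ is.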
Notice that a unique solution always exists because $f$ is the unique orthogonal projection of $g$ into $S$. For finite dimensional subspaces the solution can be
formulated as a solution to a set of $n$ linear equations in $n$ unknowns. Let $S$
equal the span of $f_1, \ldots, f_n$. Let the solution be
$$f = c_1 f_1 + c_2 f_2 + \cdots + c_n f_n.$$
Then the minimum condition is equivalent to
$$(f_i, c_1 f_1 + c_2 f_2 + \cdots + c_n f_n - g) = 0, \quad i = 1, \ldots, n.$$
This is the same as
$$c_1 (f_i, f_1) + c_2 (f_i, f_2) + \cdots + c_n (f_i, f_n) = (f_i, g), \quad i = 1, \ldots, n.$$
These $n$ linear equations in $n$ unknowns are called the normal equations of
the problem. In the usual case, $S$ is a space of discrete functions. These are
functions defined on a finite domain. Suppose there are $m$ data values so
that the domain is
$$\{p_1, p_2, \ldots, p_m\}.$$
We identify the function $f_i$ with the vector
$$f_i = \begin{bmatrix} f_i(p_1) \\ f_i(p_2) \\ \vdots \\ f_i(p_m) \end{bmatrix}.$$
So $f_i$ is an $m$ dimensional column vector of values of the $i$th function. We can
formulate the minimum conditions with matrices. The inner product is then
the transpose of the first vector times the second. We write the transpose of
a vector $v$ as $v^t$. We have $(f_i, f_j) = f_i^t f_j$. Then
$$c_1 (f_i, f_1) + c_2 (f_i, f_2) + \cdots + c_n (f_i, f_n) = (f_i, g), \quad i = 1, \ldots, n.$$
Thus
$$\begin{bmatrix} f_i^t f_1 & \cdots & f_i^t f_n \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = f_i^t g.$$
If we let $A$ be an $m$ row by $n$ column matrix, whose $i$th column is $f_i$, then
$$A = \begin{bmatrix} f_1 & f_2 & \cdots & f_n \end{bmatrix}.$$
Written out,
$$A = \begin{bmatrix} f_1(p_1) & f_2(p_1) & \cdots & f_n(p_1) \\ f_1(p_2) & f_2(p_2) & \cdots & f_n(p_2) \\ \vdots & \vdots & & \vdots \\ f_1(p_m) & f_2(p_m) & \cdots & f_n(p_m) \end{bmatrix}.$$
Also let
$$B = \begin{bmatrix} g(p_1) \\ \vdots \\ g(p_m) \end{bmatrix}.$$
The normal equations become
$$A^t A \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = A^t B.$$
Note that the original approximation problem in this form is a system of $m$
equations in $n$ unknowns
$$A \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \approx B.$$
Any linear system of this form with $m > n$ can be interpreted as a least
squares problem and has an approximate least squares solution. The matrices
$A$ and $B$ are a convenient input set to a general linear least squares solver
(see the listing of subroutine llsq).
There is always a unique solution to the linear least squares problem.
The solution is the orthogonal projection into the subspace. But there will
be more than one solution to the normal equations if the given functions
spanning the subspace are not linearly independent. The normal equations
have a solution, so they are consistent. From the theory of linear equations,
if the determinant $D$ of the coefficient matrix of the normal equations is
not zero, then there is a unique solution. Then we can solve the equations
either by inverting the coefficient matrix, or by Gaussian elimination. If $D$ is
zero, then there is more than one solution, and such solutions will involve one or
more variables of arbitrary value. Gaussian elimination will fail. The $D = 0$
solution can be computed by using elementary row operations, which can be
done numerically or with various computer algebra programs.
When we are
concerned only with the discrete space, it does not matter that there are
multiple solutions to the normal equations, because any solution set of coefficients
gives a linear combination equal to the unique projection into the subspace.
The various solutions just give different linear combinations of dependent
vectors that equal the same vector. On the other hand, if points other than the
sample points are in the relevant domain of the functions, then the multiple
solutions may give function solutions that are not the same on this extended
domain. To illustrate, compare functions $f$ and $g$, where $f(x) = x(x - 1)$
is equal to zero on the domain $\{0, 1\}$, but is not zero on
the extended domain of all real numbers. Let $g$ be the true zero function,
$g(x) = 0$. The two functions agree on $\{0, 1\}$, but give different values on
an extended domain. Frequently we want to use the least squares solution
for interpolation between the given data points, and so the case of multiple
solutions to the normal equations does have consequences.
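A small concrete case makes the point about interpolation. In the Python sketch below the spanning functions are $1$, $x$, $x^2$ and the sample domain is $\{0, 1\}$, where $x$ and $x^2$ take identical values and so are linearly dependent as discrete functions. The data values and names are hypothetical, chosen by us for illustration. Two different coefficient sets reproduce the data exactly, so both solve the normal equations, yet they disagree between the sample points.

```python
def evaluate(coeffs, x):
    """Evaluate c0*1 + c1*x + c2*x^2 at the point x."""
    c0, c1, c2 = coeffs
    return c0 + c1 * x + c2 * x * x

sample_points = [0, 1]
data = [1, 3]

solution_a = (1, 2, 0)   # the function 1 + 2x
solution_b = (1, 0, 2)   # the function 1 + 2x^2

# On the sample domain both coefficient sets give the same vector of values.
values_a = [evaluate(solution_a, p) for p in sample_points]
values_b = [evaluate(solution_b, p) for p in sample_points]
print(values_a, values_b)

# Off the sample domain they differ.
print(evaluate(solution_a, 0.5), evaluate(solution_b, 0.5))
```

The difference of the two fitted functions is $2x - 2x^2 = -2x(x - 1)$, a multiple of the function $x(x - 1)$ used in the illustration above.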
We will show that if $f_1, \ldots, f_n$ are linearly independent then the normal
equations have a unique solution. This is obvious because in this case $f_1, \ldots, f_n$
is a basis of $S$ and the unique solution $f$ in $S$ has unique components with
respect to this basis. It is also a direct consequence of the following proposition.
Proposition. If $f_1, \ldots, f_n$ are the linearly independent columns of a matrix
$A$, which has $m > n$ rows, then $\det(A^t A)$ is not equal to zero.
Proof. Suppose the determinant is zero. Then there exist $c_1, c_2, \ldots, c_n$, not
all zero, such that
$$c_1 \begin{bmatrix} (f_1, f_1) \\ (f_2, f_1) \\ \vdots \\ (f_n, f_1) \end{bmatrix} + c_2 \begin{bmatrix} (f_1, f_2) \\ (f_2, f_2) \\ \vdots \\ (f_n, f_2) \end{bmatrix} + \cdots + c_n \begin{bmatrix} (f_1, f_n) \\ (f_2, f_n) \\ \vdots \\ (f_n, f_n) \end{bmatrix} = 0.$$
Let
$$v = c_1 f_1 + \cdots + c_n f_n.$$
The $i$th row of this equation shows that $(f_i, v) = 0$, for $i = 1, \ldots, n$. Because $v$ is a linear combination of the $f_i$, it follows that
$(v, v) = 0$. This implies $v = 0$, and so, by linear independence, each $c_i$ is zero. This is a contradiction,
so the proposition is true.
Example 1. We are to fit the function
$$y = f(x) = a \sin(x) + b \cos(x)$$
to the data

x     y
1.0   3.0
2.5   5.6
3.4   7.8

Apply the sin function to the $x$ values to get the first column of matrix $A$
and the cos function to get the second column. Let vector $B$ be the $y$ values.
The normal equations are
$$A^t A C = A^t B$$
or in terms of the components
$$\begin{bmatrix} 1.13154358 & 0.22224325 \\ 0.22224325 & 1.86845642 \end{bmatrix} C = \begin{bmatrix} 3.882636366 \\ -10.40652323 \end{bmatrix}.$$
The solution is
$$C = \begin{bmatrix} 4.6334245 \\ -6.120705005 \end{bmatrix}.$$
So
$$f(x) = 4.6334245 \sin(x) - 6.120705005 \cos(x).$$
The following program does the linear least squares computations.
c+ llsq  least squares solution of a*c=b (solving for c)
      subroutine llsq(a,ia,m,n,ws,c,b,ier)
c parameters
c   a-m by n matrix. declared row dimension ia.
c   ws-working storage vector of length m
c   c-vector of size n
c   b-vector of size m
c   ier-return parameter: ier=0 normal return,ier=1 normal equations
c   nearly singular,ier=2 normal equations singular.
c
      dimension a(ia,1),b(1),c(1),ws(1)
c compute lower elements of jth column of transpose(a)*a
      do 50 j=1,n
      do 18 i=j,n
      s=0.
      do 15 k=1,m
      s=s+a(k,i)*a(k,j)
15    continue
18    ws(i)=s
c
c compute jth element of right side vector
      s=0.
      do 40 k=1,m
40    s=s+a(k,j)*b(k)
      c(j)=s
c
c store lower elements of jth column in a
      do 19 i=j,n
19    a(i,j)=ws(i)
c
50    continue
c fill in upper values
      do 60 i=1,n
      do 60 j=i,n
      a(i,j)=a(j,i)
60    continue
      ib=1
      mm=1
      eps=1.e-12
      inv=0
c solve normal equations
      call gausse(a,ia,c,ib,n,mm,inv,eps,det,ier)
      return
      end
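For readers without a Fortran compiler, here is a rough Python counterpart of the same computation, applied to the data of Example 1. It is a sketch under our own assumptions and not a translation of the author's code: the routine gausse is not listed in this document, so the sketch substitutes a plain Gaussian elimination with partial pivoting, and the singularity tolerance and all names other than llsq are ours.

```python
import math

def llsq(A, B):
    """Least squares solution C of A*C ~= B via the normal equations
    (A^t A) C = A^t B. A is a list of m rows of length n, B has length m."""
    m, n = len(A), len(A[0])
    # Form the n-by-n matrix A^t A and the right side vector A^t B.
    ata = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    atb = [sum(A[k][i] * B[k] for k in range(m)) for i in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [row + [rhs] for row, rhs in zip(ata, atb)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    C = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * C[j] for j in range(i + 1, n))
        C[i] = (aug[i][n] - tail) / aug[i][i]
    return C

# Example 1: fit y = a*sin(x) + b*cos(x).
xs = [1.0, 2.5, 3.4]
ys = [3.0, 5.6, 7.8]
A = [[math.sin(x), math.cos(x)] for x in xs]
C = llsq(A, ys)
print(C)
```

The computed coefficients can be compared with the values of $C$ quoted in Example 1.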
16 Bibliography
[1] Heath T. L. (translator), Euclid’s Elements, 3 Volumes, Dover, 1956.
[2] Welchons A. M., Krickenberger W. R., Pearson Helen R., Plane Geometry, Ginn and Company, 1958. Garfield Proof p. 253.
[3] Bronowski Jacob, The Ascent of Man, Little Brown and Company,
1973.
[4] Halmos Paul R., Introduction to Hilbert Space: And the Theory
of Spectral Multiplicity, Chelsea, 1951. Halmos was an assistant to John
von Neumann.
[5] Halmos Paul R., Finite Dimensional Vector Spaces, Springer-Verlag,
1975.
[6] Diggins Julia E., String, Straight-Edge and Shadow: The Story of
Geometry, The Viking Press, 1965. This is a book for junior high school
students, and elementary school teachers. A very nice short book with pictures, a history of the Greeks and Pythagoras, as well as some interesting
mathematical discussions I had not seen elsewhere.
[7] Pedoe Dan, Geometry and the Liberal Arts, St Martins Press, 1976.
[8] Vowell Sarah, Assassination Vacation, Simon and Schuster, 2005.
[9] Goldberg Richard R., Methods of Real Analysis, Blaisdell Publishing
Company, 1964.