The Pythagorean Theorem and Its Consequences

Jim Emery

Edited: 8/4/13

Contents

1 Pythagoras: Biographical Sketch
2 Eight Proofs of the Pythagorean Theorem
    2.1 Proof I: Euclid's Elements
    2.2 Proof II: The Ascent of Man
    2.3 Proof III: Garfield
    2.4 Proof IV: An Arrangement of Four Triangles in a Square of side a + b
    2.5 Proof V: An Arrangement of Four Triangles in a Square of side c
    2.6 Remarks on Geometric Proofs, Versus Algebraic Proofs
    2.7 Proof VI: Equating Two Square Arrangements
    2.8 Proof VII: Triangle Area Proportional to the Hypotenuse Squared
    2.9 Proof VIII: Similarity and Proportion
3 A Crisis in Greek Mathematics: What is a real number?
4 Inequalities
5 Euclidean Distance
6 Distance Functions and the Metric Space
7 Vector Spaces and Inner Product Spaces
8 Normed Linear Spaces
9 Normed Linear Spaces and Functional Analysis
10 Hilbert Space and $\ell^2$
11 Orthogonality, Orthogonal Polynomials, Fourier Series
12 Projections
13 Linear Least Squares Problems as Geometric Problems: Orthogonality and the Pythagorean Theorem
14 Elementary Formulation of the Least Squares Problem for Straight Line Fitting
15 A Geometric View of the Least Squares Problem
16 Bibliography

List of Figures

1 The Pythagorean Theorem
2 Proof I: Euclid's Elements
3 Proof II: The Ascent of Man
4 Proof II: The Ascent of Man
5 Proof III: Garfield's Proof
6 Proof IV: An Arrangement of Four Triangles in a Square of side a + b
7 Proof V: An Arrangement of Four Triangles in a Square of side c
8 An Arrangement That Leads to a Geometric Proof
9 Proof VI: Equating Two Square Arrangements
10 Proof VII: Area Proportional to the Hypotenuse Squared
11 Proof VIII: Similarity and Proportion

1 Pythagoras: Biographical Sketch

Pythagoras proclaimed that "All is Number" (that is, all is Mathematics). Pythagoras was born in Samos about 570 BC and died about 495 BC. Knowledge about him is vague and uncertain. He is said to have related mathematics to music, believed in reincarnation, and founded a secret religion in southern Italy in the town of Croton, a Greek colony. Much that is said about him may be apocryphal. But perhaps he was the first to call himself a philosopher (lover of wisdom). Many later philosophers claimed to have been influenced by his ideas. The Pythagorean theorem itself may have originated in the cultures of the Babylonians and Indians, although he may have been the first to write down a formal proof of this theorem, earlier versions being folklore and tradition.
2 Eight Proofs of the Pythagorean Theorem

Most proofs are evident from geometrical figures. Some proofs are algebraic; many use the concept of similarity of triangles and proportions.

Figure 1: The Pythagorean Theorem. The area of the square on the hypotenuse of a right triangle is equal to the sum of the areas of the squares on the sides: $a^2 + b^2 = c^2$, where here $a = 6$, $b = 8$, $c = 10$.

2.1 Proof I: Euclid's Elements

In our figure for Euclid's proof, which appears in his work The Elements, two overlaid shaded triangles are congruent, and so have equal areas. Corresponding to each triangle is a rectangle of double its area. One such rectangle is a square on a side of the original right triangle. The other makes up a portion of the square on the hypotenuse. So suppose the triangle sides are $a$ and $b$ and the hypotenuse is $c$. Then $a^2$ is equal to the area of a sub-rectangle of the square on the hypotenuse $c$. Similarly, $b^2$ is equal to the area of the rest of the square on the hypotenuse. Thus $a^2 + b^2 = c^2$.

Figure 2: Proof I: Euclid's Elements. The short side of the triangle is $a$, the long side is $b$ and the hypotenuse is $c$. The more darkly shaded triangle, rotated counterclockwise by 90 degrees, will fall exactly on the more lightly shaded triangle. So these two triangles are congruent. The line from the top vertex divides the square on the hypotenuse $c$ into a left rectangle $L$ and a right one $R$. The dark triangle has area $b^2/2$, because its base has length $b$, as does its height. The area of the lightly shaded triangle is 1/2 that of the right sub-rectangle $R$. Therefore the area of $R$ is $b^2$. Repeating the argument on the left side of the figure with two new triangles, we find the area of $L$ is $a^2$. Therefore $c^2 = a^2 + b^2$.

2.2 Proof II: The Ascent of Man

Jacob Bronowski devotes several pages to discussing a proof of the Pythagorean theorem in his book, The Ascent of Man, and in his television series. This occurs in the chapter called The Music of the Spheres and in an episode similarly titled in the television series. See the figures captioned The Ascent of Man.

Figure 3: Proof II: The Ascent of Man. Jacob Bronowski in his book The Ascent of Man discusses this proof on pages 156-162. The book is based on the 1972 BBC television series of the same name. The small side of the triangle is $a$, the long side $b$, the hypotenuse $c$. The area of the left figure is $c^2$. The shaded inner square has side length $b - a$. Rearranging the pieces of the left figure we get the right figure, consisting of a small square of area $a^2$ and a larger composite square of area $b^2$. Therefore $a^2 + b^2 = c^2$.

Figure 4: Proof II: The Ascent of Man. Jacob Bronowski in his book The Ascent of Man discusses this proof on pages 156-162. The book is based on the 1972 BBC television series of the same name. The small side of the triangle is $a$, the long side $b$, the hypotenuse $c$. The area of the left figure is $c^2$. The shaded inner square has side length $b - a$. Rearranging the pieces of the left figure we get the right figure, consisting of a square region of area $b^2$ (shaded region) and a square region of area $a^2$ (unshaded region). Therefore $a^2 + b^2 = c^2$.

2.3 Proof III: Garfield

James A. Garfield contributed an original proof of the Pythagorean theorem. Of course most proofs of this theorem are rather similar. I had heard about Garfield's proof many times, but had not actually seen it. However, his proof is presented in the book: Welchons, Krickenberger, Pearson, Plane Geometry. I graduated from James A. Garfield elementary school in Long Beach, California, a few years back, so I am closely connected to Garfield. Garfield was one of our assassinated presidents, a rather interesting person, an exception to our rather dull and dim-witted group of presidents in general. His assassin Charles Guiteau had a connection with the Oneida Community in Oneida, New York.
This was a 19th-century social experiment devoted to "free" love. For an interesting treatment of these matters see Sarah Vowell's book Assassination Vacation. If you are not familiar with Sarah, her quirky personality and her squeaky voice, as heard on This American Life, you are really missing out.

Garfield's proof consists in using two copies of the triangle, which has short side $a$, long side $b$, and hypotenuse $c$. We rest one copy on its short side $a$, the other on the long side $b$, so that the two triangles touch at a point. Then we add a line joining the top vertex of the first triangle to the top vertex of the second triangle, getting a trapezoid (see the Garfield figure). A trapezoid is a quadrilateral with two parallel opposite sides. The area of the trapezoid is the average length of its two parallel sides times the perpendicular distance between them (this can be shown by decomposing the trapezoid into two triangles by drawing a diagonal). Here the parallel sides have lengths $a$ and $b$, and the distance between them is $a + b$. So the area of the trapezoid is

\[ A = (a + b)\,\frac{a + b}{2} = \frac{1}{2}(a^2 + 2ab + b^2) = \frac{a^2}{2} + ab + \frac{b^2}{2}. \]

On the other hand, writing the area as the sum of the areas of the three triangles, we have

\[ A = \frac{ab}{2} + \frac{ab}{2} + \frac{c^2}{2} = ab + \frac{c^2}{2}. \]

Equating these two expressions for $A$, we obtain $a^2 + b^2 = c^2$.

Figure 5: Proof III: Garfield's Proof. The long side of the triangle is $b$, the short side $a$, the hypotenuse $c$. The area of the trapezoid is $A = (a + b)(a + b)/2 = a^2/2 + ab + b^2/2$. The area as the sum of the three triangles is $A = ab + c^2/2$. Equating the two expressions for $A$ we obtain the result $a^2 + b^2 = c^2$.

2.4 Proof IV: An Arrangement of Four Triangles in a Square of side a + b

Consider the figure called An Arrangement of Four Triangles in a Square of side a + b. The short side of the triangle is $a$, the long side $b$, and the hypotenuse $c$. The area of the enclosing square is $A = (a + b)^2$. The area of the four triangles and the inner square is $A = 4(ab/2) + c^2 = 2ab + c^2$.
Equating these two expressions for the area $A$ and cancelling $2ab$ from both sides, we have $a^2 + b^2 = c^2$.

Figure 6: Proof IV: An Arrangement of Four Triangles in a Square of side a + b. The short side of the triangle is $a$, the long side $b$, and the hypotenuse $c$. The area of the enclosing square is $A = (a + b)^2$. The area of the four triangles and the inner square is $A = 4(ab/2) + c^2 = 2ab + c^2$. Equating these two expressions for the area $A$ we have $a^2 + b^2 = c^2$.

2.5 Proof V: An Arrangement of Four Triangles in a Square of side c

Consider the figure called An Arrangement of Four Triangles in a Square of side c. The short side of the triangle is $a$, the long side $b$, and the hypotenuse $c$. The area of the enclosing square is $A = c^2$. The area of the four triangles and the inner square is $A = 4(ab/2) + (b - a)^2 = a^2 + b^2$. Equating these two expressions for the area $A$ we have $a^2 + b^2 = c^2$.

Figure 7: Proof V: An Arrangement of Four Triangles in a Square of side c. The short side of the triangle is $a$, the long side $b$, and the hypotenuse $c$. The area of the enclosing square is $A = c^2$. The area of the four triangles and the inner square is $A = 4(ab/2) + (b - a)^2 = a^2 + b^2$. Equating these two expressions for the area $A$ we have $a^2 + b^2 = c^2$.

2.6 Remarks on Geometric Proofs, Versus Algebraic Proofs

Euclid's proof is purely geometric, with no reliance on algebra. The figure titled An Arrangement That Leads to a Geometric Proof leads to another purely geometric proof. Most of the other proofs are algebraic, involving a slight amount of algebra.

2.7 Proof VI: Equating Two Square Arrangements

Referring to the figure for Proof VI, the left enclosing square and the right enclosing square have the same area. Therefore the sum of the areas of the two shaded squares on the left is equal to the area of the shaded square on the right. That is, the area of the square on the short side of the triangle plus the area of the square on the long side of the triangle is equal to the area of the square on the hypotenuse.
Figure 8: An Arrangement That Leads to a Geometric Proof. By equating this square with a certain second square, also of side $a + b$, we arrive at a clearly geometric proof of the Pythagorean Theorem that uses no algebra, and could have been employed by the Greeks, who did not have algebra available. They used number in their arguments, but to them numbers were line segment lengths. Through this means Euclid treated the concept of proportional numbers.

Figure 9: Proof VI: Equating Two Square Arrangements. The left enclosing square and the right enclosing square have the same area. Therefore the sum of the areas of the two shaded squares on the left is equal to the area of the shaded square on the right. That is, the area of the square on the short side of the triangle plus the area of the square on the long side of the triangle is equal to the area of the square on the hypotenuse.

2.8 Proof VII: Triangle Area Proportional to the Hypotenuse Squared

See the figure for Proof VII. Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse $c$. Similar right triangles have areas proportional to the squares of their hypotenuses, with the same proportionality constant, say $\alpha$. This follows because similar triangles have corresponding sides that are proportional. Also the ratios of triangle sides are the same for similar triangles. This is established in Euclidean geometry, and is the basis of trigonometry. In particular, similar triangles have the same acute angles. Let one of them be $\theta$. Then the area of the triangle is

\[ A = \frac{ab}{2} = \frac{(c\cos\theta)(c\sin\theta)}{2} = \alpha c^2, \qquad \text{where } \alpha = \frac{\cos\theta\,\sin\theta}{2}. \]

The vertical line in the figure (the altitude from the right angle to the hypotenuse) divides the triangle into two triangles similar to the original, a left one and a right one. The hypotenuse of the left sub-triangle is $a$, that of the right one is $b$. Thus their areas are $\alpha a^2$ and $\alpha b^2$. The area of the original triangle is $\alpha c^2$, and it is the sum of the areas of the two sub-triangles. So $\alpha a^2 + \alpha b^2 = \alpha c^2$. Thus $a^2 + b^2 = c^2$.
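The area bookkeeping in Proof VII can be illustrated numerically. The following is a minimal Python sketch, not part of the original argument: the function name `proof_vii_check` is hypothetical, and it assumes that $\theta$ is the angle between the hypotenuse and the side of length $a$, so that $a = c\cos\theta$, $b = c\sin\theta$ and $\alpha = \cos\theta\sin\theta/2$. Each area is computed twice, once as base times height over two and once as $\alpha$ times the squared hypotenuse.

```python
import math

def proof_vii_check(c, theta):
    """Compare direct triangle areas with alpha * hypotenuse**2 (Proof VII)."""
    a = c * math.cos(theta)            # leg adjacent to theta (assumed convention)
    b = c * math.sin(theta)            # leg opposite theta
    alpha = math.cos(theta) * math.sin(theta) / 2.0
    h = a * math.sin(theta)            # altitude from the right angle to the hypotenuse
    c1 = a * math.cos(theta)           # left segment of the hypotenuse
    c2 = b * math.sin(theta)           # right segment of the hypotenuse
    return {
        "whole": (a * b / 2.0, alpha * c * c),
        "left": (h * c1 / 2.0, alpha * a * a),
        "right": (h * c2 / 2.0, alpha * b * b),
    }

# The 6-8-10 triangle of Figure 1.
areas = proof_vii_check(10.0, math.atan2(8.0, 6.0))
for name, (direct, scaled) in areas.items():
    print(name, round(direct, 6), round(scaled, 6))
```

For each of the three triangles the two numbers agree, and the left and right areas add up to the whole area. This is only a check in floating point and, since it relies on the trigonometric functions, it is not an independent proof.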
2.9 Proof VIII: Similarity and Proportion

Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse $c$. The vertical line divides the triangle into two sub-triangles, both similar to the original. Referring to the figure, corresponding sides are proportional. The hypotenuse $c$ is divided into two segments, $c_1$ on the left and $c_2$ on the right. We have

\[ \frac{a}{c_1} = \frac{c}{a}, \quad \text{so } a^2 = c_1 c, \]

and

\[ \frac{b}{c_2} = \frac{c}{b}, \quad \text{so } b^2 = c_2 c. \]

Then $a^2 + b^2 = (c_1 + c_2)c = c^2$. Thus $a^2 + b^2 = c^2$.

Figure 10: Proof VII: Area Proportional to the Hypotenuse Squared. Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse $c$. Similar right triangles have areas proportional to the squares of their hypotenuses, with the same proportionality constant, say $\alpha$. This follows because similar triangles have corresponding sides that are proportional. Also the ratios of triangle sides are the same for similar triangles. This is established in Euclidean geometry, and is the basis of trigonometry. In particular, similar triangles have the same acute angles. Let one of them be $\theta$. Then the area of the triangle is $A = (ab)/2 = (c\cos\theta)(c\sin\theta)/2 = \alpha c^2$, where $\alpha = \cos\theta\sin\theta/2$. The vertical line divides the triangle into two similar triangles, a left one and a right one. The hypotenuse of the left sub-triangle is $a$, that of the right one $b$. Thus their areas are $\alpha a^2$ and $\alpha b^2$. The area of the original triangle is $\alpha c^2$. So $\alpha a^2 + \alpha b^2 = \alpha c^2$. Thus $a^2 + b^2 = c^2$.

Figure 11: Proof VIII: Similarity and Proportion. Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse $c$. The vertical line divides the triangle into two similar triangles. Corresponding sides are proportional. The hypotenuse $c$ is divided into two segments, $c_1$ on the left and $c_2$ on the right. We have $a/c_1 = c/a$, so $a^2 = c_1 c$. And $b/c_2 = c/b$, so $b^2 = c_2 c$. Then $a^2 + b^2 = (c_1 + c_2)c = c^2$.

3 A Crisis in Greek Mathematics: What is a real number?

For the Greeks, numbers were lengths of line segments.
Fractions (rational numbers) are obtained by dividing line segments into equal pieces. They discovered that the diagonal of a square cannot be equal to any whole-number multiple of a fractional division of the side of the square. This is a big problem for their concept of number!

We show that the square root of a prime number is not rational. So suppose the integer $p$ is a prime, having no factors other than 1 and itself. Suppose $\sqrt{p}$ could be written as a rational number, as a fraction say $m/n$, where $m$ and $n$ have no common factor (if they did, we could divide out the common factors):

\[ \sqrt{p} = \frac{m}{n}. \]

Squaring we have

\[ p = \frac{m^2}{n^2}. \]

Then $n^2 p = m^2$. Hence $p$ divides $m^2$, and because $p$ is prime it must be a factor of $m$, say $m = pr$. Then

\[ n^2 p = p^2 r^2, \]

so $n^2 = p r^2$. But this implies that $p$ is a factor of $n$. This contradicts our assumption that $m$ and $n$ had no common factor. Therefore the square root of a prime is not a rational number.

The real numbers can be defined as Dedekind cuts, or as equivalence classes of Cauchy sequences.

LEAST UPPER BOUND AXIOM. If $A$ is any nonempty set of the real numbers $\mathbb{R}$ that is bounded above, then $A$ has a least upper bound.

4 Inequalities

The Cauchy-Schwarz and Minkowski inequalities (see Goldberg [9]).

Cauchy-Schwarz:

\[ \sum_{n=1}^{\infty} s_n t_n \le \left[ \sum_{n=1}^{\infty} s_n^2 \right]^{1/2} \left[ \sum_{n=1}^{\infty} t_n^2 \right]^{1/2} \]

Minkowski:

\[ \left[ \sum_{n=1}^{\infty} (s_n + t_n)^2 \right]^{1/2} \le \left[ \sum_{n=1}^{\infty} s_n^2 \right]^{1/2} + \left[ \sum_{n=1}^{\infty} t_n^2 \right]^{1/2} \]

If $a$, $b$, $c$ are vectors in a normed vector space, then (triangle inequality)

\[ \|c - a\| \le \|b - a\| + \|c - b\|. \]

5 Euclidean Distance

From the Pythagorean Theorem we are able to define the Euclidean distance between points. So if we have two points with respective coordinates $p_1 = (x_1, y_1, z_1)$ and $p_2 = (x_2, y_2, z_2)$, the distance between the points is

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}. \]

So now we can talk about the nearness of points, and thus talk about concepts such as continuity and differentiability. Also we are able to formulate the ideas of analytic geometry. For example, we are able to define the ellipse as the locus of points for which the sum of the distances to two fixed points, called the foci, is constant.
Doing this we arrive at the canonical representation of an ellipse with the equation

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \]

and the standard equation of the ellipsoid

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1. \]

6 Distance Functions and the Metric Space

A metric is a distance function $\rho$ defined on some set of points $M$ with the following four properties. For points $a$, $b$, $c$ of $M$,

\[ \rho(a, b) \ge 0 \qquad \text{(i)} \]
\[ \rho(a, b) = \rho(b, a) \qquad \text{(ii)} \]
\[ \rho(a, a) = 0 \qquad \text{(iii)} \]

and

\[ \rho(a, c) \le \rho(a, b) + \rho(b, c) \qquad \text{(iv)} \]

An open ball about the point $a$ of radius $r$, $B(a, r)$, is the set of all points $p$ such that $\rho(a, p) < r$.

A metric space $(M, \rho)$ consists of a set $M$ with a metric $\rho$. An open set in a metric space is a set $A$ such that for every point $a$ in $A$ there exists some open ball about $a$ that is a subset of $A$. A metric space $M$ and the class of all its open subsets form a topological space.

The metric for ordinary Euclidean two dimensional space is defined by the Pythagorean Theorem. So let point $p_1 = (x_1, y_1)$ and point $p_2 = (x_2, y_2)$. Then the Euclidean distance between the points is the square root of the sum of the squared differences of the coordinates,

\[ d(p_1, p_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}, \]

which by the Pythagorean Theorem is the length of the line segment connecting the two points. The triangle inequality says that the sum of the lengths of two adjacent sides of a triangle is at least the length of the opposite side. This is metric property (iv):

\[ \rho(a, c) \le \rho(a, b) + \rho(b, c) \qquad \text{(iv)} \]

For this simple two dimensional case it follows from the law of cosines. In a triangle with side lengths $a$, $b$, $c$, where $\theta$ is the angle between the sides of lengths $a$ and $b$,

\[ c^2 = a^2 + b^2 - 2ab\cos(\theta) \le a^2 + b^2 + 2ab = (a + b)^2, \]

so $c \le a + b$. For more general arguments see lineara.pdf, Topics In Linear Algebra and Its Applications, by James Emery.
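The Euclidean metric and property (iv) are easy to try out on concrete points. The following Python sketch is only an illustration under stated assumptions: the names `euclidean_distance` and `satisfies_triangle_inequality` are hypothetical, points are plain tuples of coordinates, and the small tolerance `tol` is an arbitrary allowance for floating point rounding.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

def satisfies_triangle_inequality(a, b, c, tol=1e-12):
    """Check metric property (iv): rho(a, c) <= rho(a, b) + rho(b, c)."""
    direct = euclidean_distance(a, c)
    detour = euclidean_distance(a, b) + euclidean_distance(b, c)
    return direct <= detour + tol

# The 6-8-10 right triangle: the distance from the origin to (6, 8).
print(euclidean_distance((0.0, 0.0), (6.0, 8.0)))  # prints 10.0

# Going from a to c through b is never shorter than going directly.
print(satisfies_triangle_inequality((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)))  # prints True
```

The same function works in two or three dimensions, because the formula is the same sum of squared coordinate differences in each case.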
7 Vector Spaces and Inner Product Spaces

8 Normed Linear Spaces

9 Normed Linear Spaces and Functional Analysis

10 Hilbert Space and $\ell^2$

11 Orthogonality, Orthogonal Polynomials, Fourier Series

12 Projections

13 Linear Least Squares Problems as Geometric Problems: Orthogonality and the Pythagorean Theorem

14 Elementary Formulation of the Least Squares Problem for Straight Line Fitting

The traditional way of deriving least squares equations is to write the expression for the sum of the squared differences between the given "data" and the approximating function, and then to set the partial derivatives with respect to the coefficients of the approximating function to zero. Let us do this for the case of fitting a straight line to given data. Assume the model

\[ f(x) = ax + b \]

and minimize

\[ r(a, b) = \sum_{i=1}^{n} (a x_i + b - y_i)^2. \]

The conditions for a minimum are

\[ \frac{\partial r}{\partial a} = \sum_{i=1}^{n} 2 x_i (a x_i + b - y_i) = 0, \]

\[ \frac{\partial r}{\partial b} = \sum_{i=1}^{n} 2 (a x_i + b - y_i) = 0. \]

We get a two by two system of equations:

\[ a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i, \]

\[ a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} 1 = \sum_{i=1}^{n} y_i. \]

These equations are known as the normal equations of the problem. They have a unique solution if the determinant is not zero, that is if

\[ n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \ne 0. \]

If the $x$ values are not all equal, this follows from the Cauchy-Schwarz inequality applied to the vectors $(1, 1, \ldots, 1)$ and $(x_1, x_2, \ldots, x_n)$. The general problem can be viewed more naturally as being geometric.

15 A Geometric View of the Least Squares Problem

The abstract linear least squares problem may be formulated as approximation in a vector space by some element of a subspace. Often this vector space is a space of functions. As examples, the subspace could be generated by a basis such as

\[ 1, x, x^2, x^3, \ldots, \]

or such as

\[ 1, \cos(\omega t), \sin(\omega t), \cos(2\omega t), \sin(2\omega t), \ldots \]

The first case would be a polynomial, or power series, approximation. And the second would be a Fourier, or trigonometric, approximation.
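Returning for a moment to the straight line fit of Section 14, its two by two normal equations can be solved in closed form. The Python sketch below is one way to do that, using Cramer's rule; the function name `fit_line` is hypothetical, and the exact comparison of the determinant with zero is a simplification that a careful numerical code would replace with a tolerance.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a*x + b.

    Solves the 2-by-2 normal equations
        a*sum(x^2) + b*sum(x) = sum(x*y)
        a*sum(x)   + b*n      = sum(y)
    by Cramer's rule.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be nonempty and of equal length")
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx            # zero exactly when all x values are equal
    if det == 0:
        raise ValueError("all x values are equal; normal equations are singular")
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

# Data lying exactly on y = 2x + 1 should be reproduced.
print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))  # prints (2.0, 1.0)
```

The determinant test in the code is the same condition stated in Section 14: the system is singular only when every $x_i$ has the same value.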
So consider a vector space $V$ with an inner product of $u$ with $v$ written as $(u, v)$. Given a subspace $S$ and an arbitrary element $g$ of $V$, we are to find the element in $S$ that best approximates $g$ in the norm corresponding to the inner product. The $L^2$ norm for functions is based on the inner product

\[ (f, g) = \int f g, \]

and for sequences is based on the inner product

\[ (f, g) = \sum_{i=1}^{n} f_i g_i. \]

This $L^2$ norm corresponds directly to the "squares" part of the least squares approximation. But the theory carries through for an arbitrary inner product. The norm defined by an inner product is

\[ \|f\| = (f, f)^{1/2}. \]

A solution $f \in S$ minimizes

\[ (f - g, f - g) = \|f - g\|^2. \]

We will show that the problem is solved by the orthogonal projection of a vector onto a subspace. One can think of this as analogous to the simple geometric problem of projecting a vector in space onto a plane. Think of a vector from the origin to a point, and think of a plane through the origin, not containing this vector. The plane is a vector space. The vector in the plane closest to the original vector is the orthogonal projection of the vector onto the plane. The same thing happens in the general problem, where the plane becomes the subspace. For example, the subspace might be the set of all cubic polynomials, and the problem is to best fit the data with a cubic polynomial.

Two vectors are orthogonal, i.e. perpendicular, if their inner product is zero. We require a preliminary theorem to prove the main proposition.

Pythagorean Theorem. If $v_1$ is orthogonal to $v_2$, then

\[ \|v_1 + v_2\|^2 = \|v_1\|^2 + \|v_2\|^2. \]

Proof.

\[ (v_1 + v_2, v_1 + v_2) = (v_1, v_1) + 2(v_1, v_2) + (v_2, v_2) = (v_1, v_1) + (v_2, v_2). \]

Proposition. If $f \in S$ and

\[ (g - f, h) = 0, \quad \forall h \in S, \]

then $f$ is a solution to the least squares problem.

Proof. Let $s \in S$. We have

\[ \|g - s\|^2 = \|(g - f) + (f - s)\|^2 = \|g - f\|^2 + \|f - s\|^2 \ge \|g - f\|^2. \]

By assumption, $g - f$ is orthogonal to the subspace $S$, and $f - s$ is in $S$.
So the second equality is a consequence of the Pythagorean Theorem. We have shown that

\[ \|g - s\| \ge \|g - f\|, \quad \forall s \in S, \]

so $f$ is the best approximation to $g$ in $S$, and this completes the proof.

Notice that a unique solution always exists because $f$ is the unique orthogonal projection of $g$ into $S$. For finite dimensional subspaces the solution can be formulated as a solution to a set of $n$ linear equations in $n$ unknowns. Let $S$ equal the span of $f_1, \ldots, f_n$. Let the solution be

\[ f = c_1 f_1 + c_2 f_2 + \cdots + c_n f_n. \]

Then the minimum condition is equivalent to

\[ (f_i, c_1 f_1 + c_2 f_2 + \cdots + c_n f_n - g) = 0, \quad i = 1, \ldots, n. \]

This is the same as

\[ c_1 (f_i, f_1) + c_2 (f_i, f_2) + \cdots + c_n (f_i, f_n) = (f_i, g), \quad i = 1, \ldots, n. \]

These $n$ linear equations in $n$ unknowns are called the normal equations of the problem.

In the usual case, $S$ is a space of discrete functions. These are functions defined on a finite domain. Suppose there are $m$ data values, so that the domain is $\{p_1, p_2, \ldots, p_m\}$. We identify the function $f_i$ with the vector

\[ f_i = \begin{pmatrix} f_i(p_1) \\ f_i(p_2) \\ \vdots \\ f_i(p_m) \end{pmatrix}. \]

So $f_i$ is an $m$ dimensional column vector of values of the $i$th function. We can formulate the minimum conditions with matrices. The inner product is then the transpose of the first vector times the second. We write the transpose of a vector $v$ as $v^t$. We have

\[ (f_i, f_j) = f_i^t f_j. \]

Then the $i$th normal equation

\[ c_1 (f_i, f_1) + c_2 (f_i, f_2) + \cdots + c_n (f_i, f_n) = (f_i, g), \quad i = 1, \ldots, n, \]

becomes

\[ \begin{pmatrix} f_i^t f_1 & \cdots & f_i^t f_n \end{pmatrix} \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = f_i^t g. \]

Let $A$ be the $m$ row by $n$ column matrix whose $i$th column is $f_i$,

\[ A = \begin{pmatrix} f_1 & f_2 & \cdots & f_n \end{pmatrix}. \]

Written out,

\[ A = \begin{pmatrix} f_1(p_1) & f_2(p_1) & \cdots & f_n(p_1) \\ f_1(p_2) & f_2(p_2) & \cdots & f_n(p_2) \\ \vdots & \vdots & & \vdots \\ f_1(p_m) & f_2(p_m) & \cdots & f_n(p_m) \end{pmatrix}. \]

Also let

\[ B = \begin{pmatrix} g(p_1) \\ \vdots \\ g(p_m) \end{pmatrix}. \]

The normal equations become

\[ A^t A \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = A^t B. \]

Note that the original approximation problem in this form is a system of $m$ equations in $n$ unknowns,

\[ A \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} \approx B. \]

Any linear system of this form with $m > n$ can be interpreted as a least squares problem and has an approximate least squares solution.
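The passage from the overdetermined system to the normal equations can be sketched in a few lines of Python. This is a minimal illustration and not the author's solver: the function names are hypothetical, matrices are plain lists of rows, the pivot threshold `1e-12` is an arbitrary assumption, and forming $A^t A$ explicitly is done here for clarity rather than for numerical robustness.

```python
def normal_equations(A, B):
    """Return (AtA, AtB) for an m-by-n matrix A (list of rows) and length-m vector B."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    AtB = [sum(A[k][i] * B[k] for k in range(m)) for i in range(n)]
    return AtA, AtB

def solve_linear(M, rhs, eps=1e-12):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]   # work on a copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < eps:
            raise ValueError("normal equations are singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                    # back substitution
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def least_squares(A, B):
    """Least squares solution c of A c ~ B via the normal equations."""
    AtA, AtB = normal_equations(A, B)
    return solve_linear(AtA, AtB)

# Four equations in two unknowns: columns are the functions 1 and x
# sampled at x = 0, 1, 2, 3, and B holds values of 1 + 2x.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
B = [1.0, 3.0, 5.0, 7.0]
print(least_squares(A, B))
```

Since the data in this usage example lie exactly in the span of the two columns, the computed coefficients should be 1 and 2 up to rounding.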
The matrices $A$ and $B$ are a convenient input set to a general linear least squares solver (see the listing of subroutine llsq).

There is always a unique solution to the linear least squares problem. The solution is the orthogonal projection into the subspace. But there will be more than one solution to the normal equations if the given functions spanning the subspace are not linearly independent. The normal equations have a solution, so they are consistent. From the theory of linear equations, if the determinant $D$ of the coefficient matrix of the normal equations is not zero, then there is a unique solution. Then we can solve the equations either by inverting the coefficient matrix, or by Gaussian elimination. If $D$ is zero, then there is more than one solution, and each such solution will involve one or more variables of arbitrary value. Gaussian elimination will fail. The $D = 0$ solution can be computed by using elementary row operations, which can be done numerically or with various computer algebra programs.

When we are concerned only with the discrete space, it does not matter that there are multiple solutions to the normal equations, because any solution set of coefficients gives a linear combination equal to the unique projection into the subspace. The various solutions just give different linear combinations of dependent vectors that equal the same vector. On the other hand, if points other than the sample points are in the relevant domain of the functions, then the multiple solutions may give function solutions that are not the same on this extended domain. To illustrate, compare functions $f$ and $g$, where $f(x) = x(x - 1)$ is equal to zero on the domain $\{0, 1\}$, but is not zero on the extended domain of all real numbers. Let $g$ be the true zero function, $g(x) = 0$. The two functions agree on $\{0, 1\}$, but give different values on an extended domain.
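The effect of dependent spanning functions can be seen in a tiny Python experiment. This is an illustrative sketch only: the basis functions $x$ and $x^2$ and the sample set $\{0, 1\}$ are assumptions chosen for the example, and `combo` is a hypothetical helper. Sampled on $\{0, 1\}$ both basis functions give the same vector $(0, 1)$, so the coefficient pairs $(1, 0)$ and $(0, 1)$ describe the same discrete vector while describing different functions.

```python
def combo(coeffs, x):
    """Evaluate c1*x + c2*x**2 for the assumed basis functions x and x^2."""
    c1, c2 = coeffs
    return c1 * x + c2 * x * x

samples = [0.0, 1.0]                 # the sample points
first, second = (1.0, 0.0), (0.0, 1.0)

# Both coefficient pairs give identical values at every sample point.
same_on_samples = all(combo(first, x) == combo(second, x) for x in samples)

# Between the sample points the two functions disagree.
gap_between = combo(first, 0.5) - combo(second, 0.5)

print(same_on_samples, gap_between)  # prints True 0.25
```

The difference of the two combinations is $x - x^2 = -x(x - 1)$, which up to sign is the function $f$ of the preceding paragraph.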
Frequently we want to use the least squares solution for interpolation between the given data points, and so the case of multiple solutions to the normal equations does have consequences.

We will show that if $f_1, \ldots, f_n$ are linearly independent then the normal equations have a unique solution. This is obvious, because in this case $f_1, \ldots, f_n$ is a basis of $S$ and the unique solution $f$ in $S$ has unique components with respect to this basis. It is also a direct consequence of the following proposition.

Proposition. If $f_1, \ldots, f_n$ are the linearly independent columns of a matrix $A$, which has $m > n$ rows, then $\det(A^t A)$ is not equal to zero.

Proof. Suppose the determinant is zero. Then the columns of $A^t A$ are linearly dependent, so there exist $c_1, c_2, \ldots, c_n$, not all zero, such that

\[ c_1 \begin{pmatrix} (f_1, f_1) \\ (f_2, f_1) \\ \vdots \\ (f_n, f_1) \end{pmatrix} + c_2 \begin{pmatrix} (f_1, f_2) \\ (f_2, f_2) \\ \vdots \\ (f_n, f_2) \end{pmatrix} + \cdots + c_n \begin{pmatrix} (f_1, f_n) \\ (f_2, f_n) \\ \vdots \\ (f_n, f_n) \end{pmatrix} = 0. \]

Let

\[ v = c_1 f_1 + \cdots + c_n f_n. \]

The first equation shows that

\[ (f_i, v) = 0, \quad \text{for } i = 1, \ldots, n. \]

It follows that $(v, v) = 0$. This implies $v = 0$, and so, because $f_1, \ldots, f_n$ are linearly independent, each $c_i$ is zero. This is a contradiction, so the proposition is true.

Example 1. We are to fit the function

\[ y = f(x) = a \sin(x) + b \cos(x) \]

to the data

    x      y
    1.0    3.0
    2.5    5.6
    3.4    7.8

Apply the sin function to the $x$ values to get the first column of matrix $A$, and the cos function to get the second column. Let vector $B$ be the $y$ values. The normal equations are

\[ A^t A C = A^t B, \]

or in terms of the components

\[ \begin{pmatrix} 1.13154358 & 0.22224325 \\ 0.22224325 & 1.86845642 \end{pmatrix} C = \begin{pmatrix} 3.882636366 \\ -10.40652323 \end{pmatrix}. \]

The solution is

\[ C = \begin{pmatrix} 4.6334245 \\ -6.120705005 \end{pmatrix}. \]

So

\[ f(x) = 4.6334245 \sin(x) - 6.120705005 \cos(x). \]

The following program does the linear least squares computations.

c+ llsq least squares solution of a*c=b (solving for c)
      subroutine llsq(a,ia,m,n,ws,c,b,ier)
c parameters
c   a-m by n matrix. declared row dimension ia.
c     on return a is overwritten by transpose(a)*a.
c   ws-working storage vector of length m
c   c-vector of size n
c   b-vector of size m
c   ier-return parameter: ier=0 normal return,ier=1 normal equations
c     nearly singular,ier=2 normal equations singular.
c
      dimension a(ia,1),b(1),c(1),ws(1)
c compute lower elements of jth column of transpose(a)*a
      do 50 j=1,n
      do 18 i=j,n
      s=0.
      do 15 k=1,m
      s=s+a(k,i)*a(k,j)
   15 continue
   18 ws(i)=s
c
c compute jth element of right side vector
      s=0.
      do 40 k=1,m
   40 s=s+a(k,j)*b(k)
      c(j)=s
c
c store lower elements of jth column in a
      do 19 i=j,n
   19 a(i,j)=ws(i)
c
   50 continue
c fill in upper values
      do 60 i=1,n
      do 60 j=i,n
      a(i,j)=a(j,i)
   60 continue
      ib=1
      mm=1
      eps=1.e-12
      inv=0
c solve normal equations
c (gausse is an external gaussian elimination routine, not listed here)
      call gausse(a,ia,c,ib,n,mm,inv,eps,det,ier)
      return
      end

16 Bibliography

[1] Heath T. L. (translator), Euclid's Elements, 3 Volumes, Dover, 1956.

[2] Welchons A. M., Krickenberger W. R., Pearson Helen R., Plane Geometry, Ginn and Company, 1958. Garfield Proof p. 253.

[3] Bronowski Jacob, The Ascent of Man, Little Brown and Company, 1973.

[4] Halmos Paul R., Introduction to Hilbert Space: And the Theory of Spectral Multiplicity, Chelsea, 1951. Halmos was a student of John von Neumann.

[5] Halmos Paul R., Finite Dimensional Vector Spaces, Springer-Verlag, 1975.

[6] Diggins Julia E., String, Straight-Edge and Shadow: The Story of Geometry, The Viking Press, 1965. This is a book for junior high school students and elementary school teachers. A very nice short book with pictures, a history of the Greeks and Pythagoras, as well as some interesting mathematical discussions I had not seen elsewhere.

[7] Pedoe Dan, Geometry and the Liberal Arts, St Martins Press, 1976.

[8] Vowell Sarah, Assassination Vacation, Simon and Schuster, 2005.

[9] Goldberg Richard R., Methods of Real Analysis, Blaisdell Publishing Company, 1964.