Statistics 359a
Regression Analysis

Necessary Background Knowledge – Statistics
• expectations of sums
• variances of sums
• distributions of sums of normal random variables
• t distribution – assumptions and use
• calculation of confidence intervals
• simple tests of hypotheses and p-values

Necessary Background Knowledge – Linear Algebra
• multiplication of conformable matrices
• transpose of a matrix
• determinant of a square matrix
• inverse of a square matrix
• eigenvalues of a square matrix
• quadratic forms

Origin of Least Squares
Introduction of the metric system and the length of a meter
• 1790 – French National Assembly commissions the French Academy of Sciences to design a simple decimal-based system of weights and measures
• 1791 – French Academy defines the meter to be $10^{-7}$, or one ten-millionth, of the length of the meridian through Paris from the north pole to the equator.

Adrien-Marie Legendre
• Legendre serves on the French commission in 1792 to determine the length of the meridian quadrant
• measurements of latitude made in 1795
• complex calculations made from the measurements in 1799
• Legendre proposes the method of least squares in 1805 to determine the length of a meter

Data
• old French units of measurement: 1 module = 2 toises
• old French to imperial English: 1 toise = 6.395 feet
• metric to imperial: 1 meter = 3.2808 feet

From Spherical Geometry
S S L L C sin( L L) cos( L L) 28500 28500
S arc length
C is related to the length of the meridian quadrant (90D)
D 28500/(1 C ) length of one degree of an arc in modules
is related to the ellipticity of the earth

Including measurement errors, the data and model reduce to:
1 0.003398 C ( 4.912) (0.590)
2 0.000475 C (2.720) (0.027)
3 0.002625 C (0.048) (0.324)
4 0.001529 C (2.914) (0.277)
5 0.000279 C (4.765) (0.014)

Solution is:
D = 28497.78 modules
90D = 2564800.2 modules = length of the meridian quadrant
Therefore 1 meter = $10^{-7}$ × 2564800.2 modules = 0.256480 modules = 0.512960 toises = 3.280 feet
modern meter = 3.2808 feet

Origin
of the Term “Regression”
• Francis Galton, 1886, ‘Regression towards mediocrity in hereditary stature.’ Journal of the Anthropological Institute, 15: 246–263
• See JSTOR under UWO library databases

Data on Heights of Children and Parents
‘Regression Line’

Theoretical Basis
For X and Y bivariate normal with equal means $\mu$, equal variances, and correlation $\rho$:
$$E(Y \mid X = x) = \mu + \rho\,(x - \mu)$$
$$E(Y \mid X = x) = x + (\rho - 1)(x - \mu)$$
For $0 < \rho < 1$: $E(Y \mid X = x) < x$ for $x > \mu$, and $E(Y \mid X = x) > x$ for $x < \mu$.

Example in Data Analysis Through Regression
• Relationship between the price of a violin bow and its attributes, such as age, shape and ornamentation

Violin Bow Example
The following data on violin bows made by W.E. Hill and Sons of London, England are taken from the internet site www.maestronet.com/pricehist.html. The data show the prices of bows sold at Sotheby’s auction house in the years 1994-97. Also given are data on various factors that may affect the price of a bow:
• the year of the sale (in case of price inflation or deflation)
• the year of manufacture (or age: are antique bows more or less valuable?)
• the weight of the bow in grams (do buyers like heavier or lighter bows?)
• the shape of the bow (is there an aesthetic effect on the price?)
• presence or absence of ornamental gold
• presence or absence of ornamental pearl
• whether the bow has a tortoiseshell frog or an ebony frog
Only the bows for which the approximate year of manufacture has been given are included in the data set. Prices from other auction houses and for other bow makers, as well as for violins, are available at the same site, but only Sotheby’s gives the year of manufacture. A Minitab file of the data is at O:\359\bows.mtb.

Price in U.S.
Dollars, year of sale, and attributes of each bow:

Price (US$)  Year of Sale  Year Made  Weight (g)  Shape  Gold Accessories  Tortoiseshell Frog  Pearl Accessories
1874         1997          1957       59.0        O      N                 N                   N
2436         1997          1935       62.0        R      N                 N                   N
7498         1997          1920       62.0        R      Y                 Y                   N
1142         1996          1945       59.5        O      N                 N                   Y
1935         1996          1890       57.5        R      N                 N                   N
1759         1996          1900       56.0        O      N                 N                   N
5278         1996          1950       57.0        O      Y                 Y                   Y
4905         1995          1920       58.0        R      Y                 N                   N
7994         1995          1920       60.0        O      Y                 Y                   Y
2543         1995          1926       62.5        R      N                 N                   Y
1769         1994          1935       61.0        R      N                 N                   N
1592         1994          1960       61.0        R      N                 N                   Y
3716         1994          1935       55.0        O      Y                 Y                   Y
2477         1994          1925       59.0        R      N                 N                   Y
2654         1994          1930       58.0        R      N                 N                   N
3362         1994          1935       58.0        R      N                 Y                   Y

Shape: O = octagonal, R = round. Y = present, N = absent.

Price and Date of Sale
• 1995 seems to be a more expensive year
• Is the effect confounded with some other attribute common to 1995?
[Figure: Violin Bows - Price and Sale Date (Price vs. Year Sold)]

Price and Year of Manufacture
• Is there anything special about 1920?
• Is there a quadratic trend in the data?
[Figure: Violin Bows - Price and Year of Manufacture (Price vs. Year Made)]

Price and Weight of the Bow
• Is there any trend with respect to the weight?
[Figure: Violin Bows - Price and Weight in Grams (Price vs. Weight)]

Octagonal vs. Round Bows
• No apparent trend
[Figure: Violin Bows - Price and Shape (Shape, 1 = round, 0 = octagonal, vs. Price)]

The Gold Standard?
• The presence of gold on a bow generally makes it more expensive
[Figure: Violin Bows - Price and Gold Accessories (Gold, 1 = present, 0 = absent, vs. Price)]

Tortoise Shell Frogs
• Some evidence of added expense for tortoise shell
[Figure: Violin Bows - Price and Tortoise Shell Frogs (Frog, 1 = present, 0 = absent, vs. Price)]

Price and Pearl Accessories
• No apparent effect
[Figure: Violin Bows - Price and Pearl Accessories (Pearl, 1 = present, 0 = absent, vs. Price)]

Prediction
• Can we use the model built with the current data to predict the future price of a bow?
• Example: some 1999 data from auctions
• 1920 bow, 60.5 g, round, with gold and pearl accessories: $4098
• 1933 bow, 61 g, octagonal, with pearl accessories only: $2421
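
The prediction question above can be explored with a small least-squares fit. The sketch below is only an illustration, not the model used in the course: it assumes the three Y/N columns of the bow data are, in order, gold accessories, tortoiseshell frog and pearl accessories, codes them as 0/1 indicators, and fits price on the gold indicator alone. The function names `fit_ols` and `predict` are hypothetical helpers written for this example.

```python
# Bow data transcribed from the 16 Sotheby's sales in this section.
# Assumption: the Y/N columns are gold, tortoiseshell frog, pearl (1 = present).
prices = [1874, 2436, 7498, 1142, 1935, 1759, 5278, 4905,
          7994, 2543, 1769, 1592, 3716, 2477, 2654, 3362]
gold = [0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0]
tortoise = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1]
pearl = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1]


def fit_ols(columns, y):
    """Return least-squares coefficients (intercept first).

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan
    elimination with partial pivoting.
    """
    n = len(y)
    # Design matrix with a leading column of ones for the intercept.
    X = [[1.0] + [float(col[i]) for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):
        # Move the row with the largest entry in column k into the pivot position.
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        # Eliminate column k from every other row.
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[r][p] / A[r][r] for r in range(p)]


def predict(coefs, values):
    """Fitted price for one bow, given its predictor values."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], values))


# Illustrative one-predictor model: price on the gold indicator only.
coefs = fit_ols([gold], prices)
for label, has_gold, actual in [("1920 round bow, gold and pearl", 1, 4098),
                                ("1933 octagonal bow, pearl only", 0, 2421)]:
    fitted = predict(coefs, [has_gold])
    print(f"{label}: predicted {fitted:.0f}, actual {actual}")
```

With gold as the only predictor, the fitted values are simply the two group means: about $5878 for bows with gold and about $2140 for bows without, against the 1999 sale prices of $4098 and $2421. Passing more columns, for example `fit_ols([gold, tortoise, pearl], prices)`, fits a multiple regression in the same way; with only 16 observations, each added predictor should be weighed against the risk of overfitting.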