Sveriges lantbruksuniversitet (Swedish University of Agricultural Sciences)
Department of Aquatic Resources, Institute for Freshwater Research, Drottningholm
Wednesday 2016-08-24, GAM course, Statistics@SLU, Ultuna

GAM models of eel recruits: few data, many zeroes, many questions
Willem . Dekker @ SLU . SE

[Figure: eel life cycle. Labels: Leptocephalus, Glass eel, Yellow eel, Silver eel, Sargasso.]

[Figure: Production (× 1000 ton) by year, 1950 to 2010. Series: Fishery, Aquaculture, Restock.]

[Figure: Glass eel recruitment index (log scale) by year, 1950 to 2010. Series: AdCP, AdTC, Albu, AlCP, Ebro, Ems, GiCP, GiSc, GiTC, Imsa, Katw, Lauw, Loi, Maig, MiPo, MiSp, Nalo, RhDO, RhIj, Ring, SeHM, SevN, Stel, Tibe, Vida, Vil, YFS2, Yser.]

C. Puke, Svensk Fiskeri Tidskrift 1955: Uppsamling och transport av ålyngel i nedre Norrland (Collection and transport of elvers in lower Norrland)

[Figure: Eel trap at Skallböle, Ljungan, central Sweden. Number of eels in trap per year (0 to 3,000), 1950 to 1980. Annotations: 1950 Skallböle dam built; 1982 Viforsen dam built; 1 eel; 2 eels.]

[Figure: map of the Swedish trapping sites, latitude 55 to 63 °N, longitude 10 to 20 °E, with Oslo marked and a year printed beside some sites; alongside, number per year (log scale, 0.01 to 10,000,000) by year, 1940 to 2010. Sites: Alsterån, Ätran, Botorpsströmmen, Dalälven, Emån, Gavleån, Göta älv, Helgeån, Kävlingeån, Kilaån, Lagan, Ljungan, Ljusnan, Mörrumsån, Morupsån, Motala ström, Nissan, Nyköpingsån, Råån, Rönne å, Skräbeån, Suseån, Tvååkers kanal, Viskan.]

Test two alternatives (Oslo, Age): uses a lot of df's.
Need to minimize df usage: parsimony!
Now: not yet comparing hypotheses; to be continued!
[Figure: mean individual weight (g) and number per year plotted against distance to Oslo (km), 0 to 1500. Labelled points: Oslo, Göta älv, Mörrumsån.]

Two hypotheses:
Svärdson 1976 - shrinking distribution
Dekker 2004 - older eels declined first

Selecting an appropriate model
Data: 1 observation per year per site, if available. Numbers or total weight (and number/kg); all converted to numbers, now.
#years = 75
#sites = 24
#obs = 994 (75*24 = 1800, nearly half is missing)

Model terms: year = time trend; site = local conditions and peculiarities.
Y = year + site   (assumes equal trends)
Y = year + site + year*site   (repeats the data)
Y = year + site + βsite*continuous(year)   (different patterns, but linear patterns)
Y = year + site + βyear*continuous(site)   (different trends over time, linear in site)
Y = year + site + βsite*Mandel(year)
Y = year + site + βyear*Mandel(site)
Y = spline(year, site)

reduction of 6% on df = 6 for Age
reduction of 20% on df = 21 for Oslo

[Figure: Mandel(year) by year, 1940 to 2010.]
Mandel J. 1959. The analysis of Latin squares with a certain type of row-column interaction. Technometrics 1, 379-387.

GAM results
[Figure: fitted regression surface, spline2(yearclass, age), by birth-year, 1950 to 2010; one curve per age, age 0 to age 7.]

Giving up too soon?
[Figure: residuals plotted against series age and against years to go; annotation: "series stop here".]

Near zero is uncertain
[Figure: diagnostic panels on log scales. Axis labels: Observed, Predicted, sorted residuals, sorted normal, raw residual squared.]

Normal: σ² ≈ const
Poisson: σ² ≈ μ
Gamma: σ² ≈ μ²
Log-normal, log(y+1): for μ > 1 ≈ Gamma, for μ < 1 ≈ Normal. Choose the “1” well!
Or Negative Binomial, but automatic fit often foolish.
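The remark that log(y+1) behaves like a Gamma model above 1 and like a Normal model below 1 can be checked with a first-order (delta-method) approximation: if log(y + c) has constant variance s2, then Var[y] is roughly s2 * (μ + c)². The sketch below is only an illustration under that assumption; the function names and the parameters c and s2 are hypothetical and do not come from the slides.

```python
def implied_variance(mu, c=1.0, s2=1.0):
    """Approximate Var[y] when log(y + c) has constant variance s2.

    Delta method: Var[log(y + c)] ~ Var[y] / (mu + c)**2,
    hence Var[y] ~ s2 * (mu + c)**2. (c and s2 are illustrative.)
    """
    return s2 * (mu + c) ** 2


def local_exponent(mu, c=1.0):
    """Slope of log Var[y] against log mu for the variance above.

    Near 0 the variance is roughly constant (Normal-like);
    near 2 it is roughly proportional to mu**2 (Gamma-like).
    """
    return 2.0 * mu / (mu + c)


if __name__ == "__main__":
    for mu in (0.01, 0.1, 1.0, 10.0, 100.0, 10000.0):
        print(
            f"mu={mu:>8}: variance ~ {implied_variance(mu):.4g}, "
            f"exponent ~ {local_exponent(mu):.2f}"
        )
```

The switch between the two regimes happens around μ = c, which is why the added constant (the “1”) has to be chosen with the scale of the data in mind.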
Conclusions
a. GAM is a flexible technique for fitting complex relations
b. Parsimony of resulting models is a pro
c. Lack of analytical interpretation is a con; GAM is a descriptive technique
d. Know your data!
e. Know your (statistical) assumptions, and their effects. Check!
f. Do not rely on default settings: they might not apply to you. Check!

[Figure, repeated: fitted regression surface, spline2(yearclass, age), by birth-year, 1950 to 2010; one curve per age, age 0 to age 7.]

A. Paul Weber 1974: Tote Fische - denn sie wissen nicht, was sie tun! (Dead fish - for they know not what they do!)