GAM for modelling the annual recruitment data of young eels

Sveriges lantbruksuniversitet
Swedish University of Agricultural Sciences
Department of Aquatic Resources
Institute for Freshwater Research
Drottningholm
Wednesday 2016-08-24 – GAM course, Statistics@SLU, Ultuna
GAM models of eel recruits:
few data, many zeroes, many questions
Willem . Dekker @ SLU . SE
[Figure: eel life cycle. Labels: Sargasso, leptocephalus, glass eel, yellow eel, silver eel]
[Figure: production (× 1000 ton, 0 to 30) by year, 1950-2010. Series: Fishery, Aquaculture, Restock]
[Figure: glass eel recruitment index (log scale, 0.1 to 1000) by year, 1950-2010. Series: AdCP, AdTC, Albu, AlCP, Ebro, Ems, GiCP, GiSc, GiTC, Imsa, Katw, Lauw, Loi, Maig, MiPo, MiSp, Nalo, RhDO, RhIj, Ring, SeHM, SevN, Stel, Tibe, Vida, Vil, YFS2, Yser]
C. Puke, Svensk Fiskeri Tidskrift 1955 – Uppsamling och transport av ålyngel i nedre Norrland (Collection and transport of young eels in lower Norrland)
Eel trap at Skallböle, Ljungan, central Sweden
[Figure: number of eels in trap (0 to 3,000) by year, 1950-1980. Annotations: "← 1950 Skallböle dam built", "1982 Viforsen dam built →", and point labels "2 eels" and "1 eel"]
[Figure: map of the sampling sites, longitude 10-20 °E, latitude 55-63 °N. Sites are labelled by river; some labels carry a year between 1973 and 2012]
[Figure: time series by river (log scale, 0.01 to 10,000,000) by year, 1940-2010. Series: Alsterån, Ätran, Botorpsströmmen, Dalälven, Emån, Gavleån, Göta älv, Helgeån, Kävlingeån, Kilaån, Lagan, Ljungan, Ljusnan, Mörrumsån, Morupsån, Motala ström, Nissan, Nyköpingsån, Råån, Rönne å, Skräbeån, Suseån, Tvååkers kanal, Viskan]
Two hypotheses:
Svärdson 1976 - shrinking distribution
Dekker 2004 - older eels declined first
Testing the two alternatives uses a lot of df's.
Need to minimize df-usage: parsimony!
Now: not yet comparing hypotheses; to be continued!
[Figure: mean individual weight (g) and number per year against distance to Oslo (km), 0 to 1500. Labelled points include Oslo, Göta Älv and Mörrumsån; panel labels "Oslo" and "Age"]
Selecting an appropriate model
Data:
1 observation per year per site, if available.
Numbers or total weight (and number/kg), now all converted to numbers.
#years = 75 (time trend)
#sites = 24 (local conditions and peculiarities)
#obs = 994 (75*24 = 1800, so nearly half is missing)
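The bookkeeping on this slide can be sketched in a few lines. The helper name `to_numbers` and the example weight are assumptions made for illustration; only the counts 75, 24 and 994 come from the slide.

```python
# Convert a catch reported as total weight into numbers, and check
# how much of the year-by-site grid is actually observed.

def to_numbers(total_weight_kg: float, number_per_kg: float) -> float:
    """Hypothetical helper: numbers = total weight (kg) * number per kg."""
    return total_weight_kg * number_per_kg

n_years, n_sites, n_obs = 75, 24, 994   # counts from the slide

n_cells = n_years * n_sites             # full year-by-site grid
missing_fraction = 1 - n_obs / n_cells

print(n_cells)                          # 1800
print(round(missing_fraction, 3))       # 0.448
print(to_numbers(2.5, 3000.0))          # made-up example: 7500.0
```

About 45% of the possible year-by-site cells are empty, which is why any model with a separate parameter per cell has nothing left to estimate error from.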
Candidate models:
Y = year + site (assumes equal trends)
Y = year + site + year*site (repeats the data)
Y = year + site + βsite*continuous(year) (different patterns, but linear patterns)
Y = year + site + βyear*continuous(site) (different trends over time, linear in site)
Y = year + site + βsite*Mandel(year)
Y = year + site + βyear*Mandel(site)
Y = spline(year, site)
Reduction of 6% on df = 6 for Age
Reduction of 20% on df = 21 for Oslo
[Figure: Mandel(year), about -3 to 3, by year, 1940-2010]
Mandel J. 1959. The analysis of Latin squares with a certain type of row-column interaction. Technometrics 1, 379-387.
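How many df each candidate model uses can be checked numerically as the rank of its design matrix. The sketch below is not the analysis from the talk: the missingness pattern is random and the site covariate `z` is made up, both purely as assumptions, and only the counts 75, 24 and 994 are taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_sites, n_obs = 75, 24, 994        # counts from the slides

# Synthetic missingness (assumption): observe 994 random cells of the grid.
cells = np.sort(rng.choice(n_years * n_sites, size=n_obs, replace=False))
year, site = np.divmod(cells, n_sites)       # factor codes per observation

def dummies(codes, n_levels):
    """One-hot indicator columns for a factor."""
    out = np.zeros((codes.size, n_levels))
    out[np.arange(codes.size), codes] = 1.0
    return out

Y, S = dummies(year, n_years), dummies(site, n_sites)
t = (year - year.mean())[:, None]            # continuous(year), centred
z = rng.normal(size=n_sites)[site][:, None]  # continuous(site): made-up covariate

designs = {
    "year + site": [Y, S],
    "year + site + year*site": [Y, S, dummies(cells, n_years * n_sites)],
    "year + site + b_site*continuous(year)": [Y, S, S * t],
    "year + site + b_year*continuous(site)": [Y, S, Y * z],
}
# Model df = rank of the design matrix (aliased columns do not count).
df = {name: int(np.linalg.matrix_rank(np.hstack(parts)))
      for name, parts in designs.items()}

for name, d in df.items():
    print(f"{name:40s} df = {d:4d}, residual df = {n_obs - d}")
```

With one observation per cell, the full year*site interaction has exactly as many parameters as observations, so no residual df remain; that is the sense in which it "repeats the data". The two linear-interaction models sit between that extreme and the additive model.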
GAM results
[Figure: fitted regression surface, spline2(yearclass, age). One curve per age, age 0 to age 7, against birth-year, 1950-2010]
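The trade-off between flexibility and df behind such a fitted surface can be illustrated in one dimension. The sketch below is a plain penalised regression spline written with NumPy, not the two-dimensional spline used in the talk; the basis, the knots, the penalty values and the data are all assumptions chosen for illustration.

```python
import numpy as np

def spline_basis(x, knots):
    """Columns: 1, x, and one truncated line (x - k)+ per knot."""
    x = np.asarray(x, dtype=float)
    hinge = np.clip(x[:, None] - knots[None, :], 0.0, None)
    return np.column_stack([np.ones_like(x), x, hinge])

def fit_penalized(x, y, knots, lam):
    """Ridge-penalise only the knot coefficients; return fit and effective df."""
    X = spline_basis(x, knots)
    D = np.diag([0.0, 0.0] + [1.0] * len(knots))     # no penalty on the straight line
    H = X @ np.linalg.solve(X.T @ X + lam * D, X.T)  # hat matrix
    return H @ y, float(np.trace(H))

# Made-up example data: a declining log-index with noise (not the eel data).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 75)                 # 75 years rescaled to [0, 1]
y = 3.0 - 4.0 * x**2 + rng.normal(scale=0.3, size=x.size)
knots = np.linspace(0.05, 0.95, 10)

for lam in (1e-6, 1e-2, 1e2):
    fitted, edf = fit_penalized(x, y, knots, lam)
    rss = float(np.sum((y - fitted) ** 2))
    print(f"lambda = {lam:g}: effective df = {edf:.2f}, RSS = {rss:.2f}")
```

The effective df is the trace of the hat matrix: it falls from close to the number of basis columns towards 2 (a straight line) as the penalty grows. GAM software typically chooses the penalty automatically, so it is worth checking which df the fit actually ended up with.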
Giving up too soon?
[Figure: residual (about -10 to 6) against series age (0 to 80 years), and residual against years to go (50 down to 0), with the marker "series stop here" at 0 years to go]
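The two horizontal axes of this diagnostic can be derived from the (site, year) records alone. The sketch below assumes that "series age" means years since the first observation at a site and "years to go" means years until its last observation; that reading, the function name and the example records are assumptions, not taken from the slides.

```python
from collections import defaultdict

def series_position(records):
    """For each (site, year) record return (site, year, series_age, years_to_go).

    series_age  = years since the first observation at that site
    years_to_go = years until the last observation at that site
    """
    years_by_site = defaultdict(list)
    for site, year in records:
        years_by_site[site].append(year)
    out = []
    for site, year in records:
        first, last = min(years_by_site[site]), max(years_by_site[site])
        out.append((site, year, year - first, last - year))
    return out

# Made-up records (site names are placeholders, not the eel series).
records = [("A", 1950), ("A", 1951), ("A", 1960), ("B", 1975), ("B", 1980)]
for row in series_position(records):
    print(row)
```

Plotting model residuals against either derived variable shows whether series tend to end, or start, on unusually low or high values.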
Near zero is uncertain
[Figure: diagnostic panels. Sorted residuals against sorted normal quantiles; raw residual squared against predicted (log-log scale); predicted against observed (log-log scale)]
Normal: σ² ≈ const
Poisson: σ² ≈ μ
Gamma: σ² ≈ μ²
Log-normal, log(y+1): for μ > 1 ≈ Gamma, for μ < 1 ≈ Normal. Choose the "1" well!
Or Negative Binomial, but the automatic fit is often foolish.
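The variance relations listed here can be read off a plot of squared residuals against predicted values on log-log axes: the slope is roughly 0, 1 or 2. The sketch below checks that on simulated data, using the true mean in place of a fitted value; the sample size, parameter values and the `shifted_lognormal` helper are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000

def variance_slope(mu, y):
    """Slope of log(residual^2) on log(mu): about 0, 1, 2 for Normal, Poisson, Gamma."""
    resid2 = (y - mu) ** 2
    return float(np.polyfit(np.log(mu), np.log(resid2), 1)[0])

def log_uniform(low, high):
    """Means spread evenly on a log scale between low and high."""
    return np.exp(rng.uniform(np.log(low), np.log(high), size=n))

def shifted_lognormal(mu, c=1.0, s=0.5):
    """y such that log(y + c) is Normal with sd s and E[y] = mu."""
    return (mu + c) * np.exp(s * rng.normal(size=mu.size) - s**2 / 2) - c

mu = log_uniform(10.0, 1e4)
slopes = {
    "Normal": variance_slope(mu, mu + rng.normal(scale=5.0, size=n)),
    "Poisson": variance_slope(mu, rng.poisson(mu)),
    "Gamma": variance_slope(mu, rng.gamma(shape=4.0, scale=mu / 4.0)),
    "log(y+1), mu >> 1": variance_slope(mu, shifted_lognormal(mu)),
}
mu_small = log_uniform(1e-4, 1e-1)
slopes["log(y+1), mu << 1"] = variance_slope(mu_small, shifted_lognormal(mu_small))

for name, slope in slopes.items():
    print(f"{name:20s} slope = {slope:5.2f}")
```

For the log(y+c) model the standard deviation of y is proportional to μ + c, so it behaves like a Gamma where μ is well above c and like a Normal where μ is well below c. That is why the choice of the added constant matters most for the observations near zero.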
Conclusions
a. GAM is a flexible technique for fitting complex relations
b. Parsimony of resulting models is a pro
c. Lack of analytical interpretation is a con, GAM is a descriptive technique
d. Know your data!
e. Know your (statistical) assumptions, and their effects. Check!
f. Do not rely on default settings – they might not apply for you. Check!
A. Paul Weber 1974: Tote Fische - denn sie wissen nicht, was sie tun! (Dead fish - for they know not what they do!)