WORLD METEOROLOGICAL ORGANIZATION
OPERATIONAL HYDROLOGY REPORT No. 33
STATISTICAL DISTRIBUTIONS
FOR
FLOOD FREQUENCY ANALYSIS
SECRETARIAT OF THE WORLD METEOROLOGICAL ORGANIZATION - GENEVA - SWITZERLAND
1989
CONTENTS
CHAPTER 6 METHODS OF CHOOSING BETWEEN DISTRIBUTIONS
6.1 Introduction
6.2 Influence of Outliers
6.3 Traditional Methods
6.4 Recent Approaches
6.5 Summary
CHAPTER 7 DISTRIBUTIONS PREVIOUSLY CHOSEN OR RECOMMENDED FOR NATIONAL USE
7.1 WMO Survey
7.2 Selected Cases
CHAPTER 8 CONCLUDING REMARKS
8.1 Types of model
8.2 The modelling problem
8.3 Descriptive ability of distributions
8.4 Predictive ability and robustness
8.5 Parameter estimation
8.6 At-site and at-site/regional estimation
8.7 Arid and semi-arid zones
8.8 Regional homogeneity
8.9 Necessity for flow gauging
8.10 Interpretation and use of flood frequency estimates
REFERENCES
APPENDIX 1 VOLUME FLOODS
APPENDIX 2 ESTIMATES OF POPULATION MOMENTS AND THEIR BIASES
APPENDIX 3 MOMENT RATIO DIAGRAMS
APPENDIX 4 PARAMETER ESTIMATION BY PROBABILITY WEIGHTED MOMENTS (PWM)
APPENDIX 5 NUMERICAL EXAMPLES
APPENDIX 6 WMO SURVEY ON DISTRIBUTION TYPES CURRENTLY IN USE FOR FREQUENCY ANALYSIS OF EXTREMES OF FLOODS BY HYDROLOGICAL AND OTHER SERVICES
FOREWORD
Choosing a statistical distribution for flood frequency analysis has remained a difficult problem for
hydrologists, designers of hydraulic structures, irrigation engineers and planners of water resources. Several
distribution types are used to estimate extremes of flows and precipitation, but the merits of their applicability to
different types of data and for different purposes have not been clearly established. The reasons for operational use of a
particular distribution type in many countries are frequently subjective or historical, as confirmed by the 1983 WMO
survey, results of which are given in this report.
In addition to the survey on current practices of countries with regard to selection and application of statistical
distributions, the WMO Commission for Hydrology, at its sixth session (1980), decided that a report containing detailed
guidance on the merits and selection of distribution types for flood frequency analysis should be prepared. Similar
guidance on selection of distribution types for extremes of precipitation was published by WMO in 1981 (Operational
Hydrology Report No. 15 - WMO-No. 560). Bearing this in mind, the Secretariat arranged for the preparation of the
present report which has been written by Dr. C. Cunnane of the Department of Engineering Hydrology, University
College Galway, Ireland.
I should like to express the gratitude of the World Meteorological Organization to Members who have
willingly participated in the survey and to Dr. Cunnane for preparing this excellent report on a very important subject.
(G.O.P Obasi)
Secretary-General
SUMMARY
Since the beginning of this century significant developments have taken place in the methods for statistical
flood frequency analysis using models of annual maximum (AM) series and partial duration (PD) series.
The AM and PD models are defined and compared and the relation between return period T and the distribution
function F(Q) of flood magnitudes in each series is developed, leading to the Q - T relation. General statistical
properties of observed AM and PD series are then outlined, following which a discussion of the problems encountered in
modelling these series satisfactorily is given.
Methods of estimating flood distribution quantiles using at-site and regional data, both separately and together,
are outlined in Chapter 4, with some examples in Appendix 5. These include the efficient and robust methods based on
regionally averaged probability weighted moments, Bayesian methods and the regionally estimated TCEV model.
Regional homogeneity and flood quantile estimation for arid zones are also discussed in Chapter 4.
Properties of flood quantile estimators are discussed in Chapter 5 which includes discussion of robustness and
efficiency of competing models and the effects on quantile estimates of regional heterogeneity, of temporal dependence
in flood series and of spatial dependence between flood series.
Methods of choosing between statistical distributions are discussed in Chapter 6 where it is pointed out that
conventional goodness of fit tests are of little value in this context. Studies conducted in a number of countries aimed
at selecting a "best" distribution for nationwide use in the AM model are outlined in Chapter 7, followed by a summary
and concluding remarks in Chapter 8.
In addition to the traditional methods of testing the goodness of fit of distributions to observed flood series the
report discusses the progress which has been made in recent years through "behaviour" related studies and robustness
studies. Behaviour studies examine the statistical characteristics of samples of flood data within several regions on the
one hand and of random samples drawn from candidate distributions on the other. All of the traditionally used
distributions fail to behave like real hydrological data and only the recently developed Wakeby and TCEV distributions
seem to be satisfactory.
Robustness studies have shown that the Wakeby (WAK) and General Extreme Value (GEV) distributions
when used regionally, with parameters estimated by probability weighted moments (PWM), are robust and least
sensitive to changes in the unknown parent distribution which is being modelled. TCEV tends to be unbiased but less
efficient than these two. On the other hand, the log Pearson Type 3 (LP3) distribution with parameter estimates
obtained by moments from logarithms of the data is the least robust of the popular methods and it is extremely sensitive to
changes in the unknown underlying parent distribution. Therefore LP3 cannot henceforth be recommended generally for
flood frequency estimation.
In the use of Wakeby and GEV distributions parameter estimation, where possible, should be carried out on a
regional basis. Parameter estimation by PWM, although not quite as efficient as maximum likelihood, is almost free
of bias, easy to use and generally unaffected by outliers.
Illustrative examples of parameter estimation are given in the appendices of the report. Appendix 6 gives
a summary and an analysis of replies to a questionnaire on distribution types currently in use for frequency
analysis of extremes of precipitation and floods by national Hydrological and other Services. This survey reveals that the
Extreme Value Type I (EV1) and the log-normal distributions are the most commonly used. The Weibull formula is
favoured most for plotting position although its continued use for this purpose is not recommended in this report.
This report recommends that flood estimates be based on joint use of at-site and regional data using an index
flood method of quantile estimation. It also recommends that flow measurements be commenced at any site, or in any
region, which is being contemplated as the site of any water resources project. To date, no amount of formulae or
statistical dexterity can make up for the lack of data on which to base an accurate estimate of mean annual flood and
hence of floods of higher return periods at a site.
CHAPTER 1
BASIC CONCEPTS OF FLOOD FREQUENCY ANALYSIS
1.1 Introduction
In flood frequency analysis the objective is to estimate a flood magnitude corresponding to any required return
period of occurrence. The resulting magnitude-return period relationship will be referred to as the Q - T relationship.
1.1.1 RETURN PERIOD
Flood peaks do not occur with any fixed pattern in time or in magnitude. Successive exceedances of some
magnitude are separated by varying intervals of time, Figure 1.1. For any arbitrary discharge Q' we define return period
as the average of the inter-event times,
$$T(Q') = \mathrm{Average}(\tau_1, \tau_2, \tau_3, \ldots) \tag{1.1}$$
a straightforward definition of return period which does not entail any reference to probability. The concept of
probability is introduced in flood frequency models and the resulting magnitude-return period relationships in Section
1.5 below. Return period is also referred to as recurrence interval.
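The definition in equation (1.1) can be illustrated with a minimal sketch. It assumes that the times of individual flood peaks (in years from the start of the record) and their discharges are already available; the function name, the data and the threshold of 250 are hypothetical and serve only as an illustration.

```python
from statistics import mean

def empirical_return_period(event_times, magnitudes, q_threshold):
    """Mean interval between successive exceedances of q_threshold.

    event_times: times of flood peaks in years, sorted ascending (illustrative input).
    magnitudes:  peak discharges, same length as event_times.
    Returns None when fewer than two exceedances exist (no interval to average).
    """
    exceed_times = [t for t, q in zip(event_times, magnitudes) if q > q_threshold]
    # Inter-event times tau_1, tau_2, ... between successive exceedances.
    intervals = [b - a for a, b in zip(exceed_times, exceed_times[1:])]
    return mean(intervals) if intervals else None

# Illustrative data only: peak times (years) and discharges (m^3/s).
times = [0.4, 1.1, 2.7, 3.2, 5.9, 6.5, 8.8, 9.3]
peaks = [120, 310, 95, 280, 400, 150, 330, 90]

# Four peaks exceed 250, giving three intervals with a mean of about 2.57 years.
print(empirical_return_period(times, peaks, 250))
```

With a short record only a few intervals are available for the larger discharges, which is one reason the probabilistic models of the following sections are needed.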
Figure 1.1 Quantities used in definition of return period
It may be that the hydrological regime at any river site changes due either to man's direct influence on the
catchment or to climatic change. The latter may of course be very difficult to detect or quantify. This report deals only
with situations where neither of these changes occurs at a detectable rate.
1.2 Flood frequency models
At any river site it is usually assumed that nature provides a unique Q - T relationship and that Q is a
monotonically increasing function of T. In order to estimate this natural Q - T relationship from a good quality
continuous hydrometric record of N years duration it is necessary to resort to a statistical or stochastic model of the
continuous hydrograph which retains information in the hydrograph relevant to the Q - T relationship and discards the
rest.
Three such models are:
(a) annual maximum series model, AM.
(b) partial duration series, PD, or peaks over a threshold, POT model.
(c) time series model, TS.
This report is primarily concerned with the AM model; the PD model receives some attention while the TS
model, from which a Q - T relation may be obtained by simulation (Quimpo (1967), Hall and O'Connell (1972)), is
acknowledged but is considered outside the scope of this report. If the assumptions of independence of flood peaks are
violated, see Section 2.3, then use of a TS model would be one way to proceed.
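The difference between the AM and PD series drawn from the same record can be seen in a minimal sketch. It assumes that the flood peaks have already been identified as independent events and labelled with their year; the data, the threshold of 200 and the function names are illustrative and not taken from the report.

```python
def annual_maximum_series(peaks):
    """peaks: iterable of (year, discharge). Returns the largest peak of each year, in year order."""
    am = {}
    for year, q in peaks:
        if year not in am or q > am[year]:
            am[year] = q
    return [am[y] for y in sorted(am)]

def partial_duration_series(peaks, threshold):
    """All peaks exceeding the threshold, regardless of the year in which they occur."""
    return [q for _, q in peaks if q > threshold]

# Illustrative record: (year, peak discharge in m^3/s).
peaks = [(1980, 210), (1980, 340), (1981, 120), (1982, 410),
         (1982, 390), (1982, 150), (1983, 95), (1984, 300)]

am = annual_maximum_series(peaks)               # one value per year, large or small
pd_series = partial_duration_series(peaks, 200) # only the peaks above the threshold
n_years = len(am)
rate = len(pd_series) / n_years                 # mean number of PD peaks per year

print(am)         # [340, 120, 410, 95, 300]
print(pd_series)  # [210, 340, 410, 390, 300]
print(rate)       # 1.0
```

In this example the AM series retains the small maxima of 1981 and 1983, while the PD series drops them and keeps the secondary peaks of 1980 and 1982 instead.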
Even if the statistical parameters of these models were known perfectly for a given river site, it must be
assumed that a particular value of Q would be attributed differing values of T by each of these models and further that
these values of T would differ from the value in nature. Thus, if for some Q, T is the true value of return period in
nature and $T_{AM}$ and $T_{PD}$ are the values attributed to Q by the two models it must be assumed that
$$T_{AM} \neq T_{PD} \neq T \tag{1.2}$$
Under certain not unreasonable assumptions, Langbein (1949) showed that $T_{AM} \neq T_{PD}$, the greatest difference
being for small values of T. Under his conditions the difference converges to 0.5 year as T increases. The difference for
small values of T is a result of sampling the time base of the entire hydrograph discretely in units of a year, a time
interval which is of the same order of magnitude as the return period of small flood peaks. This distortion of the time
scale is not present to an obvious degree in the partial duration model and it may be reasonable to assume that $T_{PD} = T$.
Langbein deduced that
$$T_{AM} = \frac{1}{1 - \exp(-1/T_{PD})} \tag{1.3}$$
which gives for instance $T_{AM}$ = 1.58, 2.54 and 100.50 for $T_{PD}$ = 1.00, 2.00 and 100.00
respectively. Thus the relative difference is greatest for small values of T (WMO, 1983). The derivation of equation (1.3) has
been discussed by Chow (1950) and by Takeuchi (1984).
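As a quick numerical check of equation (1.3), T_AM = 1 / (1 - exp(-1/T_PD)), the following sketch evaluates the relation and its inverse. The function names are illustrative; the inverse is obtained by simple algebra from equation (1.3) and applies only for T_AM greater than one year.

```python
import math

def t_am_from_t_pd(t_pd):
    """Langbein's relation, equation (1.3): AM return period from PD return period."""
    return 1.0 / (1.0 - math.exp(-1.0 / t_pd))

def t_pd_from_t_am(t_am):
    """Inverse of equation (1.3), from exp(-1/T_PD) = 1 - 1/T_AM; requires t_am > 1."""
    return -1.0 / math.log(1.0 - 1.0 / t_am)

for t_pd in (1.0, 2.0, 5.0, 10.0, 100.0):
    t_am = t_am_from_t_pd(t_pd)
    # The absolute difference shrinks towards 0.5 year as T_PD grows,
    # while the relative difference is largest for small T_PD.
    print(f"T_PD = {t_pd:6.2f}  T_AM = {t_am:6.2f}  difference = {t_am - t_pd:.3f}")
```

The first, second and last rows reproduce the values 1.58, 2.54 and 100.50 quoted in the text.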
The validity of equation (1.3) has been empirically examined by a number of investigators. Beard (1974)
concluded that PD frequencies are related to AM frequencies differently in different regions and thus that empirical
relationships should be used rather than a single theoretical relationship. Beran and Nozdryn-Plotnicki (1977) report on
a study of low return period floods on British rivers. They found that Langbein's result does not hold exactly even
though the form of the relationship between $T_{AM}$ and $T_{PD}$ is approximately correct. Yevjevich and Taesombut (1978)
report, inter alia, on a comparison of return periods of fixed flood magnitudes estimated by (a) annual maximum model
and (b) partial duration series model. The latter $T_{PD}$ values are compared with $T_{PD}$ values calculated by Langbein's
formula from the $T_{AM}$ values obtained in (a). In one river the agreement is excellent for $T_{PD}$ values of two years and
less while in the second river the agreement is remarkably good for return periods up to ten years.
1.3 Relative merits of AM, PD and TS models
Apart from the relative abilities of the models in representing the parent Q - T relationship as discussed above, their
relative merits can also be considered under two other headings: identification of data series and statistical efficiency of
estimates of $Q_T$ by each model.
(a) Identification of data series
The series of annual maximum floods can be extracted without difficulty from a hydrometric record. The
frequency of problems of identification caused by a flood continuing from the end of one year into the beginning of the
next can be considerably reduced by selecting a date within the dry season as the commencement of the hydrometric year.
The extraction of the partial duration series of floods is less straightforward because of occasional bunching of flood
peaks. The possibility that such peak magnitudes are not statistically independent has led to a certain amount of unease
about the validity of statistical methods used with this model. In addition, prior to 1963 (see Borgman, 1963) no
engineering literature dealt with the peaks over a threshold type of problem in which the number of peaks included
differed from the number of years of record. Where the number of peaks equals the number of years of record, the series is known as the annual exceedance series.
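The two identification problems described in (a) can be sketched as follows. The first function shows how the start month of the hydrometric year changes which peaks become annual maxima; the second applies a simple rule for bunched peaks. The rule used here (keep only the largest of any over-threshold peaks less than a chosen number of days apart) and the 7-day gap are assumptions made for illustration, not a criterion given in this report, and all names and data are hypothetical.

```python
from datetime import date

def hydrometric_year(d, start_month):
    """Label a date by the calendar year in which its hydrometric year begins."""
    return d.year if d.month >= start_month else d.year - 1

def annual_maxima(records, start_month=1):
    """records: iterable of (date, discharge). Returns {hydrometric year: maximum}."""
    am = {}
    for d, q in records:
        y = hydrometric_year(d, start_month)
        am[y] = max(q, am.get(y, q))
    return am

def decluster(records, threshold, min_gap_days):
    """Among over-threshold peaks closer together than min_gap_days, keep the largest."""
    kept = []
    for d, q in sorted(r for r in records if r[1] > threshold):
        if kept and (d - kept[-1][0]).days < min_gap_days:
            if q > kept[-1][1]:
                kept[-1] = (d, q)   # replace the retained peak of this cluster
        else:
            kept.append((d, q))     # start a new cluster
    return kept

# Illustrative record with one flood straddling the turn of the calendar year.
records = [(date(1990, 2, 15), 300), (date(1990, 12, 30), 500),
           (date(1991, 1, 2), 480), (date(1991, 3, 10), 200),
           (date(1992, 1, 20), 350)]

print(annual_maxima(records, start_month=1))  # the same flood supplies both 1990 and 1991
print(annual_maxima(records, start_month=8))  # year starting in the (assumed) dry season
print(decluster(records, threshold=250, min_gap_days=7))
```

With a calendar year the flood of late December 1990 and early January 1991 supplies two successive annual maxima (500 and 480); starting the year in August, taken here as the dry season, it is counted once.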
(b) Statistical efficiency of estimates of $Q_T$ by each model
Denoting the estimate of $Q_T$ obtained by the AM method as $\hat{Q}_T$ and that obtained from the same hydrometric
record by the PD method as $\hat{Q}_T^*$, it is usually observed that these two estimates are unequal. Furthermore the sampling
variance of $\hat{Q}_T$ is not equal to that of $\hat{Q}_T^*$, i.e. $\mathrm{var}(\hat{Q}_T) \neq \mathrm{var}(\hat{Q}_T^*)$. From a statistical point of view the method which
has the smaller sampling variance enjoys an advantage. Under certain common assumptions Cunnane (1973) examined
the relative values of $\mathrm{var}(\hat{Q}_T)$ and $\mathrm{var}(\hat{Q}_T^*)$ and found that $\mathrm{var}(\hat{Q}_T) < \mathrm{var}(\hat{Q}_T^*)$ provided $\lambda < 1.65$, where $\lambda$ is the mean
number of peaks per year included in the PD series. When $\lambda > 1.65$ the opposite was true. This shows that the AM
method is statistically more efficient than the PD method when $\lambda$ is small but less efficient when $\lambda$ is large. In many
practical situations the assumptions of the PD model may not be valid if $\lambda$ is increased to too high a level, certainly if
$\lambda > 3$.
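The kind of comparison described above can be explored with a small Monte Carlo sketch. It is not a reproduction of Cunnane's (1973) analysis: it assumes Poisson arrivals of independent, exponentially distributed peaks, fits an EV1 distribution to the annual maxima by the method of moments, and fits an exponential distribution to the exceedances of a threshold chosen to give on average lam peaks per year (lam stands for the mean number of PD peaks per year). All names, parameter values and estimators are illustrative choices, and the variances printed depend on them.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649

def gumbel_variate(T):
    """EV1 reduced variate for an annual-maximum return period T (years)."""
    return -math.log(-math.log(1.0 - 1.0 / T))

def simulate_peaks(n_years, rate, beta, rng):
    """Per-year lists of exponential peak magnitudes with Poisson-distributed counts."""
    years = []
    for _ in range(n_years):
        peaks, t = [], rng.expovariate(rate)
        while t < 1.0:                       # arrivals falling inside this year
            peaks.append(rng.expovariate(1.0 / beta))
            t += rng.expovariate(rate)
        years.append(peaks)
    return years

def am_estimate(years, T):
    """EV1 fitted by moments to the annual maxima (0 for a year with no peak)."""
    am = [max(p) if p else 0.0 for p in years]
    alpha = statistics.stdev(am) * math.sqrt(6.0) / math.pi
    u = statistics.mean(am) - EULER_GAMMA * alpha
    return u + alpha * gumbel_variate(T)

def pd_estimate(years, T, lam):
    """Exponential exceedances over a threshold giving lam peaks per year on average."""
    ranked = sorted((q for p in years for q in p), reverse=True)
    m = round(lam * len(years))              # number of peaks retained
    q0 = ranked[m]                           # threshold: the (m+1)-th largest peak
    beta = statistics.mean(ranked[:m]) - q0  # mean exceedance above the threshold
    # Poisson-exponential model expressed in terms of the AM return period T.
    return q0 + beta * (math.log(lam) + gumbel_variate(T))

def compare(lam, T=50, n_years=30, reps=400, seed=1):
    """Sampling variances of the AM and PD estimates over repeated synthetic records."""
    rng = random.Random(seed)
    am_vals, pd_vals = [], []
    for _ in range(reps):
        years = simulate_peaks(n_years, rate=5.0, beta=1.0, rng=rng)
        am_vals.append(am_estimate(years, T))
        pd_vals.append(pd_estimate(years, T, lam))
    return statistics.variance(am_vals), statistics.variance(pd_vals)

for lam in (1.0, 2.0, 3.0):
    v_am, v_pd = compare(lam)
    print(f"lambda = {lam:.1f}  var(AM) = {v_am:.3f}  var(PD) = {v_pd:.3f}")
```

Because the estimators differ from those analysed in the studies cited, the value of lam at which the two variances cross in such an experiment need not coincide with 1.65.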
These results have been re-examined by Yevjevich and Taesombut (1978) and by Taesombut and Yevjevich
(1978) with the help of simulated flows obtained from a time series model. Their results suggest that a mean number of peaks per year $\lambda$ >
1.8 or 1.9 may be required to ensure greater efficiency of PD estimates of $Q_T$, the change from $\lambda > 1.65$ being a result
of small but significant differences between parameters estimated from AM and PD simulated series (Taesombut and
Yevjevich, 1978, Chapter 7.2 and Tables 7.1 and 7.2). On the basis of very restricted simulation results Tavares and da
Silva (1983) reported that Cunnane's (1973) expression for $\mathrm{var}(\hat{Q}_T^*)$ overestimates the true sampling variance if $\lambda < 1$
and underestimates it if $\lambda > 2$. Rosbjerg (1984) gives an improved exact expression for $\mathrm{var}(\hat{Q}_T^*)$ for both the case of
independent peaks and that of serially correlated peaks.
Apart from these restricted comparisons no further objective comparisons based on bias and efficiency of $Q_T$
estimates have been published and the TS model has never been considered in this way.
1.4 Measures of flood magnitude
Some aspect of the flood wave such as its peak level, H, peak discharge, Q, or volume, V, must be designated
to represent its magnitude. Which of these is chosen depends on the use to which the frequency curve is to be put.
Another aspect which is of interest, particularly in the economic evaluation of road closures and crop inundations, is the
duration D during which a given level is exceeded. If interest is confined to a single cross-section of a river it may
sometimes be sufficient to analyse the series of peak levels rather than the series of peak discharges. However, if there
have been changes in channel geometry at the site during the period of record the series of levels is nonhomogeneous
and unsuitable for ordinary statistical analysis. This disadvantage is not shared, however, by the corresponding series of
discharges. In addition there are other reasons why the series of discharges is preferable to the series of levels:
(i) the results of the analysis of discharges can be related to catchment characteristics and thereby transferred
to other sites on the river or other catchments and
(ii) the effects on levels at the site in question of some hydraulic structure downstream of it or of flood
protection works can only be obtained via discharges.
The volume of the flood wave is of paramount importance whenever storage plays more than a minor part in
the situation being examined, for instance in reservoir problems, while it is also important in some flood protection
schemes.
1.4.1 SINGLE VARIABLE DESCRIPTION
Despite the importance of the flood volume, the majority of flood frequency research and published flood data
has been based on flood peak discharge. While flood volumes and durations have been studied at individual sites for
investigating particular projects, few data or results have been published in the periodical hydrological literature.
Exceptions are an extensive study of the distribution of flood volumes on 64 catchments published by NERC (1975, 1.5)
and a guide to a particular method of flood volume frequency analysis prepared by Beard (1962) and U.S. Corps of
Engineers (1975). This lack of easily available, published data has forced researchers to consider peak values only.
In keeping with the availability of the material and the popularity of flood peaks, the main body of this
Report deals with distribution of flood peak magnitude Q. In applications, the recommendations made can be taken to
apply to instantaneous flood peaks or to peaks of daily mean flows. It should be borne in mind that these quantities
may differ considerably on small catchments. The question of the distribution of flood volumes and durations is
considered separately in Appendix 1.
1.5 Magnitude-return period (Q - T) relationships
1.5.1 ANNUAL MAXIMUM SERIES MODEL
A series of annual maximum floods ($Q_1, Q_2, \ldots, Q_N$) is assumed to form a random sample from a
stationary population in which Q is a random variable with distribution $\Pr(Q \le q) = F(q)$. The variate value with
exceedance probability $1/T$ is said to have return period T. Denoting this value $Q_T$ it is such that:
$$1 - F(Q_T) = 1/T \tag{1.4}$$
The justification for this may be seen by considering a sequence of Bernoulli trials, in each of which the
outcome is either an exceedance or a non-exceedance of $Q_T$. If the probability of exceedance of $Q_T$ in each trial is
$p = 1/T$, then in a sequence of M years (trials) the expected number of exceedances is $Mp = M/T$ by the binomial
distribution. The expected number of exceedances of $Q_T$ in T years is thus $T/T = 1$. Further, the length of interval
elapsing between successive exceedances in such a sequence has the Geometric distribution with expected value
$1/p = T$. This result about the average interval between successive exceedances of $Q_T$ is consistent with the concept
on which the definition of equation 1.1 is based. If $F(Q_T)$ is known, equation 1.4, after some algebraic manipulation,
provides the Q - T relation. For instance, if F(.) is the extreme value type 1 or Gumbel distribution, EV1, equation 1.4
gives:
$$\Pr(Q \le Q_T) = \exp\left[-\exp\left(-\frac{Q_T - u}{\alpha}\right)\right] = 1 - \frac{1}{T} \tag{1.5}$$
where u and $\alpha$ are EV1 location and scale parameters, which leads to
$$Q_T = u - \alpha \ln[-\ln(1 - 1/T)] = u + \alpha y_T \tag{1.6}$$
where
$$y_T = -\ln[-\ln(1 - 1/T)]. \tag{1.7}$$
An alternative form is:
$$Q_T = \mu + \sigma K_T \tag{1.8}$$
where $\mu$ and $\sigma$ are population mean and standard deviation and
$$K_T = -\frac{\sqrt{6}}{\pi}\left[0.5772 + \ln\left(-\ln\left(1 - \frac{1}{T}\right)\right)\right] \tag{1.9}$$
is a frequency factor depending only on T (Chow, 1951). In equations (1.4) to (1.9) T is synonymous with $T_{AM}$. In
distributions such as 2-parameter gamma or log-normal as well as most three parameter distributions $K_T$ is a function of
the shape (skewness) parameter as well as of T.
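That the frequency factor form (1.8) agrees with the parameter form (1.6) can be checked numerically. The sketch below assumes the standard EV1 moment relations, mean = u + 0.5772 $\alpha$ and standard deviation = $\pi\alpha/\sqrt{6}$, which are not stated in the passage above; the function names and numerical values are hypothetical.

```python
import math

EULER = 0.5772  # Euler's constant to the precision used in equation 1.9

def frequency_factor_ev1(T):
    """EV1 frequency factor K_T of equation 1.9."""
    return -(math.sqrt(6.0) / math.pi) * (EULER + math.log(-math.log(1.0 - 1.0 / T)))

def ev1_moments(u, alpha):
    """Population mean and standard deviation of EV1 (assumed standard relations)."""
    return u + EULER * alpha, math.pi * alpha / math.sqrt(6.0)

u, alpha, T = 100.0, 30.0, 100   # hypothetical values
mu, sigma = ev1_moments(u, alpha)
q_from_factor = mu + sigma * frequency_factor_ev1(T)            # equation 1.8
q_from_params = u - alpha * math.log(-math.log(1.0 - 1.0 / T))  # equation 1.6
print(round(frequency_factor_ev1(T), 3), round(q_from_factor, 2), round(q_from_params, 2))
```

Both routes give the same $Q_T$ because the constant 0.5772 cancels between the mean and the frequency factor.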
1.5.2 PARTIAL DURATION SERIES (OR PEAKS OVER A THRESHOLD) MODEL
In this model most of the flow hydrograph is disregarded and the hydrograph is viewed as a series of randomly
spaced flood peaks of random magnitude. For ease of statistical modelling and also for ease of identification of the
values which form the series, only the series of peaks exceeding an arbitrary threshold $q_0$ is considered. The most
general concept is that of a joint probability density function $f(\tau_1, q_1, \tau_2, q_2, \tau_3, q_3, \ldots)$ which expresses jointly
the random distribution of times of occurrence of peaks exceeding a threshold $q_0$ and the magnitudes of the peaks. A
very general treatment of the probability statements involved has been given by Todorovic (1970) while particular
models from his general approach have been dealt with by Todorovic and Zelenhasic (1970) and by Todorovic and
Rousselle (1971). Earlier and somewhat less general treatments were given by Borgman (1963), Shane and Lynn (1964)
and by Bernier (1967). In particular, each of these showed that if the number of flood peaks exceeding some value $q_0$ (a
threshold value) in some interval of time such as a year has a Poisson distribution with parameter $\lambda$ then the number of
events exceeding a greater value $q'$ is also Poisson distributed with parameter $\lambda' = \lambda p$ where $p = \Pr(q \ge q' \mid q \ge q_0)$. Here p
is a conditional probability, being the proportion of all peaks exceeding $q_0$ which also exceed $q'$.
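The thinning property $\lambda' = \lambda p$ can be illustrated by simulation. The sketch below is an assumption-laden illustration, not a method from the cited papers: peak magnitudes above $q_0$ are taken to be exponential, and all names and numerical values are hypothetical.

```python
import math
import random

def simulate_pd_year(rng, lam, q0, beta):
    """Peaks over q0 in one year: Poisson(lam) count, exponential exceedances."""
    peaks, t = [], rng.expovariate(lam)   # exponential gaps give Poisson arrivals
    while t < 1.0:
        peaks.append(q0 + rng.expovariate(1.0 / beta))
        t += rng.expovariate(lam)
    return peaks

rng = random.Random(1)
lam, q0, beta = 3.0, 50.0, 20.0     # hypothetical values
q_prime = 70.0                      # higher threshold q' > q0
years = [simulate_pd_year(rng, lam, q0, beta) for _ in range(20000)]

all_peaks = [q for year in years for q in year]
p = sum(q >= q_prime for q in all_peaks) / len(all_peaks)   # proportion exceeding q'
counts = [sum(q >= q_prime for q in year) for year in years]
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / len(counts)

# For a Poisson count the mean and variance should both be close to lam * p.
print(round(lam * p, 3), round(mean_count, 3), round(var_count, 3))
```

Under these assumptions the annual number of peaks above $q'$ should have mean and variance both near $\lambda p$, which is the signature of a Poisson count.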
Most practical applications make use of this least general model which assumes a Poisson distribution for the
number of events exceeding $q_0$ per year or at most a seasonally varying Poisson distribution. In either case the
parameter may be estimated by the mean number of events exceeding $q_0$ per year when estimates of flood magnitudes are
required for return periods well in excess of one year as in such circumstances the seasonality aspect has little effect.
The seasonality aspect can be very important when small return periods are being considered. If the simplest Poisson
assumption is made then a magnitude $Q_T$ of T year return period will appear once on average among every $\lambda T$ flood
peaks in the series, $\lambda$ being the Poisson parameter, the mean number of events per year exceeding $q_0$. In other words the
exceedance probability of $Q_T$ in the population of flood peaks which exceed $q_0$ is $1/(\lambda T)$. If this conditional
distribution has distribution function $F(q \mid q \ge q_0)$ then the Q - T relation is defined by
$$1 - F(Q_T \mid Q_T \ge q_0) = \frac{1}{\lambda T} \quad \text{or} \quad F(Q_T \mid Q_T \ge q_0) = 1 - \frac{1}{\lambda T} \tag{1.10}$$
In particular if F( ) is assumed, as frequently is the case, to be exponential with standard deviation $\beta$,
$$F(q \mid q \ge q_0) = 1 - \exp[-(q - q_0)/\beta] \tag{1.11}$$
then (1.10) gives:
$$Q_T = q_0 + \beta \ln(\lambda T) \tag{1.12}$$
1.5.3 RETURN PERIOD AND PROBABILITY
In the basic definition of return period T in equation (1.1) there is no mention of probability. The types of
models described here have been developed to estimate the Q - T relation from an observed hydrograph and in doing so
the concept of the distribution of a population of flood peaks has been introduced and hence a link between return period
and probability has been developed. This is a consequence of the model and not of the definition of return period.
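The way a model turns a return period into a probability, and then into a magnitude, can be seen in the exponential PD case of Section 1.5.2. The sketch below converts T to the conditional exceedance probability $1/(\lambda T)$ and to $Q_T$; the function names and the values of the threshold, scale and rate are hypothetical.

```python
import math

def pd_conditional_exceedance(lam, T):
    """Exceedance probability 1/(lam*T) of Q_T among peaks over the threshold."""
    return 1.0 / (lam * T)

def pd_exponential_quantile(q0, beta, lam, T):
    """Q_T for exponential exceedances: threshold q0, scale beta, rate lam."""
    return q0 + beta * math.log(lam * T)

q0, beta, lam = 50.0, 20.0, 3.0   # hypothetical values
for T in (2, 10, 100):
    q_t = pd_exponential_quantile(q0, beta, lam, T)
    # Consistency check: the exponential tail at q_t must equal 1/(lam*T).
    tail = math.exp(-(q_t - q0) / beta)
    print(T, round(q_t, 1), round(tail, 5), round(pd_conditional_exceedance(lam, T), 5))
```

The last two printed columns agree, which is simply the defining relation of the model restated; a different model for the peaks would attach a different magnitude to the same T.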
1.6 General considerations
A primary assumption of hydrological frequency analysis is that the series of events being considered have
random magnitudes which are mutually independent and identically distributed (iid). Thus existence of serial correlation
or trend in the series would invalidate the use of such analysis. The result of any frequency analysis is an estimated Q - T
relationship. While modern practice attempts to quantify the error in such relationships it must still be borne in mind
that some estimates involve considerable statistical extrapolation. Some authors, particularly Klemes (1986, 1987),
draw attention to the accompanying inherent dangers.
Flood frequency analysis is not an end in itself but provides vital information for the design of many hydraulic
structures (culverts, bridges, reservoir spillways, road embankments, flood control levees) and for risk assessment in
flood plain use and insurance. As mentioned in Section 1.4, the primary variable for a particular application may
be one or more of peak, duration or volume. In applying frequency analysis in any problem area the analyst must,
using his knowledge and familiarity with the region, take into account the uses to which the frequency curve will be
put, the type, amount and quality of hydrological data available, the nature of the hydrological regime and in particular
the causative factors of floods and any special factors, natural or otherwise, controlling them.
In preparing guidelines for general use in a particular region the responsible hydrologists will, in addition to
technical hydrological problems, have to take into account the institutional framework of such guidelines and in
particular any legal requirements, including those suggested by case law, to be met by such guidelines. In addition the
degree of expertise available among potential users will have a bearing on whether such guidelines are presented as strict
codes of practice to be rigidly applied or otherwise. Such guidelines usually try to provide procedures for basic checking
of data, treatment of low and high outliers, estimation of missing peaks, inclusion of historical flood data, detailed
methods for ungauged catchments and some comment on the relationship between design floods obtained from statistical
analysis of floods on the one hand and from rainfall runoff models on the other.
In this report a degree of familiarity with the concepts and practice of frequency analysis on the part of the
reader is assumed. The report is not envisaged primarily as an instruction manual nor as an introduction to the practice
of frequency analysis with all the attendant considerations mentioned in the previous two paragraphs. What it does aim
to do is to outline rationally ideas which are fundamental to the technical questions of frequency analysis and to collate
and synthesize recent relevant research on these questions. Some conclusions emerge which are not presented as binding
truths but rather as points worthy of consideration by those responsible for flood frequency analysis work.
CHAPTER 2
STATISTICAL PROPERTIES OF OBSERVED FLOOD SERIES
2.1 Distribution
The distribution of flood magnitudes in the PD series tends to be abruptly truncated at some threshold while
that of the AM series always has some values to the left of the mode. These latter values reflect the presence of annual
maxima from years having small flood peaks (see Figure 2.1).
Figure 2.1 Contrasting PD and AM histogram shapes: (a) typical PD series histogram and idealized probability density
function; (b) typical AM series histogram and idealized probability density function. Both panels plot f(q) against q.
While many theoretical forms of probability distributions provide a satisfactory fit to such histograms within
the observed range of peak values, such distributions may differ in the shape of their tails and this shows up acutely at
large T on Q - T diagrams (see Figure 3.1 later). The distinction between different shapes of Q - T relations is of great
practical significance if cost effective recommendations are to be made by flood hydrologists. The histogram alone,
based usually on 50 or fewer observations, is unable to aid the task of discriminating between distribution tails.
Therefore, collective evidence about distributional behaviour of floods on many rivers needs to be studied. Such
behaviour is reflected in the higher dimensionless moments Cv, Cs and Ck and in the extreme values of samples. Such
quantities are now discussed while factors affecting choice of distribution are discussed in Chapter 3 and actual choice of
distribution is discussed in Chapter 6.
2.2 Sample Statistics
Conventional moments and their dimensionless ratios Cv, Cs and Ck are well known; estimation of them
from samples is discussed in Appendix 2 and their typical values in the flood frequency context are given below in
section 2.2.1. Probability weighted moments (PWM) are newer (Greenwood et al., 1979) and are superior to ordinary
moments in helping to identify distributions, estimate their parameters and test hypotheses about values of such
parameters. PWMs are defined in Appendix 4 and use of dimensionless PWM ratios, known as L-moments (Hosking,
1986), to help identify distributions is discussed briefly in Appendix 3.
2.2.1 MOMENTS
The mean of these series reflects the size of the catchment from which the data come. The mean of the AM
series $\bar{Q}$ (= mean annual flood) is roughly proportional to $A^{0.75}$ where A is catchment area (Hazen, 1932; Nash and
Shaw, 1965).
The standard deviation of these series is generally proportional to catchment size except that along a single
river it sometimes decreases with distance downstream. The coefficient of variation, Cv, describes the standard deviation
as a proportion of the mean. In the majority of AM series its estimated value lies between 0.3 and 0.8 but values
outside this range can and do occur (see NERC, 1975, p. 122 and Matalas et al., 1975). Unusually low values of Cv,
as small as 0.1, occur in equatorial regions of high rainfall whose flood producing mechanism is fairly uniform from
year to year, such as Papua New Guinea (Atkins, 1982). Low Cv values also occur in relatively impermeable, high
rainfall catchments in temperate zones (NERC, 1975, pp. 182-183, Cv = 0.15). Large values, Cv > 1, occur in some
AM series which display a well-behaved heterogeneity but most cases of large Cv are due to the presence of one or more
outliers. In some AM series the largest value may be more than twice as large as the second largest while in others the
smallest value may appear to be inordinately small.
In general, the value of skewness, Cs, estimated from AM series is positive, ranging from zero to 5.0 or
more, with the majority of values lying in the range 0.5 to 3.0 (see NERC, 1975, p. 122, Matalas et al., 1975 and
Rossi et al., 1984). Values of Cv and Cs tend to be positively correlated since the same features cause high or low
values of each. Since hydrological practice sometimes makes use of logarithms of data, Z = log Q, it is worth noting
that Cs in log space (LS) of AM series is frequently negative (Landwehr et al., 1978).
Values of kurtosis, Ck, in AM series vary from 2 to 8, which are regional average values reported by Landwehr
et al. (1978). Individual values may vary widely outside these limits. In extensive random sampling from distributions
commonly used in hydrological frequency analysis Wallis et al. (1974) noticed that no matter how many samples were
generated, even from extremely skewed parent populations, the value of skewness obtained from such samples is
bounded above. Kirby (1974) confirmed theoretically that both Cv and Cs in random samples are bounded and that the
bounds are a function of sample size N alone, thus:
$$C_v < (N - 1)^{0.5} \tag{2.1}$$
$$|C_s| < \frac{N - 2}{(N - 1)^{0.5}} \tag{2.2}$$
where Cv and Cs are based on uncorrected moments as in equations (A2.1) to (A2.3) of Appendix 2.
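The size of these bounds, and the kind of sample that approaches them, can be seen in a short calculation. The sketch below assumes that the uncorrected moments are the divide-by-N central moments (equations A2.1 to A2.3 are not reproduced here); the function names are hypothetical. The limiting sample of one positive value among N - 1 zeros sits on the bounds, so any real non-negative flood series falls below them.

```python
import math

def moment_ratios(x):
    """Cv and Cs from uncorrected (divide-by-N) sample moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return math.sqrt(m2) / mean, m3 / m2 ** 1.5

def kirby_bounds(n):
    """Right-hand sides of equations 2.1 and 2.2 for a sample of size n."""
    return math.sqrt(n - 1), (n - 2) / math.sqrt(n - 1)

n = 15
# Limiting non-negative sample: one positive value and n - 1 zeros.
extreme = [0.0] * (n - 1) + [1.0]
print(moment_ratios(extreme), kirby_bounds(n))
```

For N = 15, a record length used in several of the studies cited in this chapter, sample skewness cannot exceed about 3.5 whatever the parent distribution.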
Kurtosis is also bounded by a function of sample size, not evaluated by Kirby (1974). Landwehr et al. (1978,
Table 2) found that average regional values of Cv, Cs and Ck, measured from AM data, increase with record length
while Hazen (1932) also found that skewness of AM data tended to increase with record length and suggested a correction
factor (1 + 8.5/N) by which to multiply small sample values of Cs. This factor was later found to be useful for
samples from mildly skewed parent populations (Wallis et al., 1974 and Appendix 2 of this report).
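As a small illustration of how quickly Hazen's factor decays with record length, the sketch below tabulates (1 + 8.5/N) for a few values of N. The function name is hypothetical, and the factor is applied here only as a multiplier in the way the text describes.

```python
def hazen_corrected_skewness(cs_sample, n):
    """Sample skewness multiplied by Hazen's factor (1 + 8.5/N)."""
    return cs_sample * (1.0 + 8.5 / n)

for n in (10, 20, 50, 100):
    # Second column is the factor itself; third is a sample Cs of 1.0 after correction.
    print(n, round(1.0 + 8.5 / n, 3), round(hazen_corrected_skewness(1.0, n), 3))
```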
Because of the bounds which exist on Cv, Cs and Ck, values of these quantities measured from random
samples are biased downwards and tables of bias factors are given by Wallis et al. (1974). The magnitudes of these biases
are outlined in Appendix 2. If AM series are in fact random samples from an unknown distribution the values of Cv,
Cs and Ck obtained from them are biased estimates of the population values. This bias is important in making
inferences from the sample moments, for instance by use of moment ratio diagrams, about the form of the population
distribution. It can be concluded that the populations from which AM series are assumed to be random samples have
values of Cv, Cs and Ck in excess of regional average values of those statistics calculated by the usual formulae.
Corrections for such bias should be made if method of moments estimation is used. Appendix 2 gives some details but
completely general rules cannot be given.
2.2.2 REGIONAL DISTRIBUTION OF Cv, Cs AND Ck
While regional mean and standard deviation values of Cv, Cs and Ck have been reported the distribution of
these quantities within a region has rarely been examined. Such knowledge may be useful in the inference problem of
determining the form of distribution for AM series. Rossi et al. (1984) used the observed sampling distribution of
skewness obtained from 39 Italian AM series of average length 40 years as a criterion to be reproduced by similarly
sized data sets randomly drawn from a two component extreme value (TCEV) distribution. Based on the comparison of
observed and Monte Carlo derived sampling distributions they eliminated both EV1 and log normal distributions as
candidates for modelling Italian AM series.
Njenga (1985) found that the sampling distributions of Cv, Cs and Ck calculated from 64 AM series of length
15 years each in Ireland are remarkably well behaved and are very similar to, but not exactly equal to, the mean sampling
distribution of corresponding statistics of data sets randomly drawn from GEV parent distributions with shape parameter
k in the range 0.0 to -0.1 (Figure 2.2).
Thus, while the large sampling standard error of a skewness value calculated from an individual AM series
makes it impossible to use such a value on its own with confidence (pointed out by Slade, 1936) it is quite possible that
regional distributions of Cv, Cs and Ck may contain information of value in inferring characteristics of parent
distributions from which the AM data may have been drawn.
Figure 2.2 Sampling distribution of observed AM skewness in Ireland (64 stations, 15 years each)
compared with trace of expected value of corresponding quantities obtained on assumption
of GEV parent populations with k = 0.0 and -0.1. (After Njenga, 1985). The plot shows ranked skewness against the
EV1 standardised variate y. Legend: 64 observed ranked skewness values g(i), i = 1, ..., 64; smoothed traces of the
average of 100 simulations of g(i) obtained from random samples from a GEV parent with k = 0.0 and likewise with k = -0.1.
2.2.3 DISTRIBUTION OF STANDARDIZED LARGEST FLOOD PEAK
If $Q_N$ is the largest peak in an AM series of length N let
$$y_N = (Q_N - a)/b \tag{2.3}$$
be a standardized largest value, where a and b are measures of location and scale respectively. Values of $y_N$ calculated
from several records of equal length N from a region define an observed sampling distribution of $y_N$ which could
usefully be compared with Monte Carlo derived distributions of $y_N$ (for similarly sized data sets) obtained from a variety
of distributional assumptions. Such comparisons would be of assistance in identifying parent distributions whose
sample extremes behave similarly to observed AM extremes. The values (a, b) could be distribution free, e.g. mean and
standard deviation, or the estimated parameters of a particular distributional form; for example EV1 parameters, estimated
by ML, were used by Rossi et al. (1984) in outlier tests but could also be used in the more general manner outlined
above. As an example the sampling distribution of $y_N = Q_{max}/\bar{Q}$ from 64 AM records of size 15 in SE England is
shown in Figure 2.3.
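The observed side of such a comparison is simple to compute. The sketch below uses the distribution-free choice a = 0 and b = sample mean, so that the standardized largest value is the ratio of the largest peak to the mean annual flood; the station names and flows are hypothetical and far shorter than real records.

```python
def standardized_largest(series):
    """Standardized largest value with a = 0 and b = sample mean, i.e. Qmax / Qbar."""
    return max(series) / (sum(series) / len(series))

# Hypothetical AM records (m3/s), for illustration only.
records = {
    "station_A": [120.0, 95.0, 210.0, 150.0, 88.0],
    "station_B": [40.0, 55.0, 38.0, 47.0, 120.0],
    "station_C": [300.0, 280.0, 310.0, 295.0, 315.0],
}
# Ranked regional values, ready to be plotted against a reduced variate.
ranked = sorted(standardized_largest(q) for q in records.values())
print([round(y, 3) for y in ranked])
```

The ranked values would then be set against the corresponding ranked values averaged over many simulated regions of the same size drawn from each candidate parent distribution.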
Figure 2.3 Observed regional distribution of $y_N = Q_{max}/\bar{Q}$ compared with (smoothed)
expectations of corresponding ranked values from 64 samples of size 15 from GEV
parent distribution with $\mu$ = 100.0, $\alpha$ = 0.3 and k = -0.20. The plot shows the observed and GEV simulated values
(64 stations each with 15 years record) against the EV1 variate y.
2.2.4 CONDITION OF SEPARATION OF SKEWNESS
Matalas et al. (1975) have examined the regional mean $\bar{C}_s$ and standard deviation $\sigma_{C_s}$ of Cs for subsets of
length 10, 20 and 30 years of recorded AM series. They found for each subset length that a plot of $\sigma_{C_s}$ versus $\bar{C}_s$
prepared from historical data from 14 regions in U.S. was not in agreement with corresponding plots obtained from
similarly sized random samples drawn from distributions commonly used in hydrological frequency analysis. This
disagreement was termed "the condition of separation", which is illustrated in Figure 2.4. This condition manifests
itself on Figure 2.2 by the empirical distribution of skewness having a steeper slope than the corresponding distribution
of expected values. This result suggests that no conventional distribution could have generated the observed flow values.
However the Wakeby (Houghton, 1978) and TCEV (Rossi et al., 1984) distributions, with certain parameter
combinations, yield points which overlap the regional flood data points of Figure 2.4.
Figure 2.4 Illustration of condition of separation. Legend: regional flood data; samples from known distributions.
2.3 Departures from independence
It is usually assumed that all the peak magnitudes in the AM series are mutually independent in the statistical
sense. This assumption is usually justified. Beard (1974) concluded that AM series of 300 gauging stations were not
substantially autocorrelated. In the study of over 30 long AM series in Britain the use of 3 different tests for persistence
did not reveal any appreciable dependence in AM series (NERC, 1975). There may however be some element of serial
persistence displayed by extremely large rivers.
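A first screening for persistence in an AM series is the lag-one serial correlation coefficient. The sketch below is a generic check and is not one of the three tests used in the British study, which are not specified here; the limit of plus or minus 1.96 divided by the square root of N is a rough large-sample approximation, and the function names are hypothetical.

```python
import math

def lag1_serial_correlation(x):
    """Lag-one serial correlation coefficient r1 of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def looks_independent(x, z=1.96):
    """Rough check: |r1| within z / sqrt(N), a large-sample approximation."""
    return abs(lag1_serial_correlation(x)) <= z / math.sqrt(len(x))

# A deliberately persistent (alternating) hypothetical series fails the check.
alternating = [1.0, 2.0] * 10
print(round(lag1_serial_correlation(alternating), 3), looks_independent(alternating))
```

With the short records typical of flood series the limits are wide, so a pass is weak evidence of independence rather than proof of it.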
Many authors have expressed the fear that successive elements in a PD series may be correlated and this has
given rise to doubts about the applicability of the method. For this reason, arbitrary rules are adopted for deciding
whether to include certain peaks in the series or not. For example it may be required that two peaks must be separated
in time by more than 3 tp, where tp is an average time to peak for the catchment (NERC, 1975, Vol. 4, p. 14) or the two
peaks must be separated by more than (5 days + ln A), where A is catchment area in square miles (Beard, 1974), if both
are to be included. A further condition might be that flow between successive peaks must fall to less than half, two
thirds or three quarters of the first or smaller peak flow. More recently USWRC (1981, p. 8) has declined to recommend
"specific guidelines" for defining flood events to be included in a PD series.
Studies carried out on PD series in Britain (NERC, 1975, also Cunnane, 1979) indicate that the magnitudes of
successive PD series peaks, selected in accordance with the above rules, are not correlated and may be regarded as
statistically independent but that the inter-event times are not independent random variables. That is, persistence in the
PD series, if it exists, should be looked for in the process giving rise to the peaks rather than among the peak
magnitudes. (See also Ashkar and Rousselle, 1983 a. and b.).
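Selection rules of the kind described in this section can be sketched as a single pass through the candidate peaks. The version below combines a minimum separation of 3 tp with a requirement that the intervening flow falls below two thirds of the smaller peak. It rests on several assumptions that are not part of any cited rule: when two peaks are judged dependent the larger is kept, independence is then not re-checked against earlier peaks, and the function name and data are hypothetical.

```python
def select_independent_peaks(peaks, troughs, tp, trough_ratio=2.0 / 3.0):
    """Select PD peaks under a separation rule and a trough rule.

    peaks   : list of (time, flow) sorted by time
    troughs : troughs[i] is the minimum flow between peaks[i] and peaks[i + 1]
    tp      : average time to peak of the catchment, in the units of time
    """
    if not peaks:
        return []
    selected = [peaks[0]]
    low_since_selected = float("inf")   # lowest flow since the last kept peak
    for (t, q), trough in zip(peaks[1:], troughs):
        low_since_selected = min(low_since_selected, trough)
        t_prev, q_prev = selected[-1]
        far_enough = (t - t_prev) > 3.0 * tp
        dropped_enough = low_since_selected < trough_ratio * min(q_prev, q)
        if far_enough and dropped_enough:
            selected.append((t, q))          # independent: keep both peaks
            low_since_selected = float("inf")
        elif q > q_prev:
            selected[-1] = (t, q)            # dependent: keep the larger peak
            low_since_selected = float("inf")
    return selected

# Hypothetical candidate peaks (time in hours, flow in m3/s) and troughs.
peaks = [(0, 100.0), (5, 80.0), (40, 120.0), (44, 150.0), (100, 90.0)]
troughs = [70.0, 30.0, 110.0, 20.0]
print(select_independent_peaks(peaks, troughs, tp=4.0))
```

In this example the peak at hour 5 is rejected as too close to the first, and the peaks at hours 40 and 44 are treated as one event represented by the larger.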
Because it is difficult to prove conclusively that serial persistence is absent in either PD or AM series a
number of simulation studies have been conducted on the effect this may have on parameter estimation and standard error
of estimate. See for instance Landwehr et al. (1979a), Srikanthan and McMahon (1981), Tasker (1983) for AM series and
Tavares and da Silva (1983), Rosbjerg (1984) for PD series. In general, when serial persistence is present, the
assumption of serial independence leads to more biased quantile estimation and larger standard error of quantile estimate
than when serial persistence is absent and the correct model form is assumed.
In addition to serial persistence studies NERC (1975) also reports on split sample tests and trend analysis of
AM series. Significant trend was found in a small number of British AM series. It is not possible to make a general
statement about non-stationarity of AM or PD series because there can be circumstances in which definite catchment
changes can bring about a trend (see Reich, 1985), even though this may be difficult to detect or measure. Naturally, if
trend or cyclic change does exist in a flood series it would have to be removed before a conventional flood frequency
analysis could be attempted. It is recommended that tests for trend be performed, as for instance in NERC (1975,
Chapter 2), to check that the usual assumptions of frequency analysis are valid.
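One widely used rank-based check for monotonic trend is the Mann-Kendall test. It is offered here only as an example of such a test, not as the procedure of NERC (1975), which is not described in this section. The sketch uses the normal approximation without a correction for ties, and the function name is hypothetical.

```python
import math

def mann_kendall(x):
    """Mann-Kendall S statistic and normal-approximation Z (no tie correction)."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (x[j] > x[i]) - (x[j] < x[i])   # +1, 0 or -1 for each pair
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# A strictly increasing hypothetical series gives the largest possible S.
print(mann_kendall([float(v) for v in range(1, 11)]))
```

A value of |Z| above about 1.96 would suggest a trend at the 5 per cent level under the usual two-sided normal approximation.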
CHAPTER 3
THE MODELLING PROBLEM
3.1 Introduction
The flood frequency modelling problem relates to:
(a) choice of model type (AM, PD, or other),
(b) choice of distribution to be used in the chosen model (e.g. choice of distribution for AM or PD series),
(c) choice of method of parameter and quantile estimation,
(d) choice of scheme for joint use of at-site data, when available, and regional data.
Aspects (a) and (b) are discussed in this chapter while (c) and (d) are discussed more fully in Chapters 4 and 5.
Finally, the choice of flood frequency estimation methods is discussed in Chapter 6.
It should be noted here that two separate aspects of such choice are important. These are the descriptive and
predictive properties of the chosen method. The descriptive property relates to the requirement that the chosen
distribution shape resembles the observed sample distribution of floods and that random samples drawn from the chosen
model distribution have statistical properties similar to those of real flood series described in Chapter 2. The
predictive property relates to the requirement that quantile estimates are robust with small bias and standard error.
3.2 Choice of model type
The relative merits of different model types were discussed briefly in Chapter 1, Section 3. Where very low
return period (T < 2) floods are being considered, such as in flooding of agricultural land, it should be remembered that
the AM model causes a distortion of the Q - T relation because $T_{AM} \ne T$. Use of the PD model obviates this problem
in such cases. If AM data are the only ones conveniently available then the Q - T relation so obtained has to be
modified, by use of Langbein's relation between T and $T_{AM}$, equation 1.3, or some empirically derived local counterpart
of it (Beard, 1974; Beran and Nozdryn-Plotnicki, 1977; Takeuchi, 1984).
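The size of the distortion at low return periods can be seen from Langbein's relation. Equation 1.3 is not reproduced in this part of the report, so the sketch below assumes it has the usual form, $1/T_{AM} = 1 - \exp(-1/T)$; the function names are hypothetical.

```python
import math

def t_am_from_t(T):
    """AM return period from the true return period (assumed Langbein form)."""
    return 1.0 / (1.0 - math.exp(-1.0 / T))

def t_from_t_am(t_am):
    """True return period from the AM return period (inverse of the above)."""
    return -1.0 / math.log(1.0 - 1.0 / t_am)

for T in (0.5, 1, 2, 5, 10, 50):
    print(T, round(t_am_from_t(T), 2))
```

Under this form the two return periods differ markedly for T below about 2 years and differ by roughly half a year for large T, which is consistent with the statement that the distinction matters mainly at low return periods.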
In most other flood estimation problems quantiles of high return period are sought and the difference between
T and $T_{AM}$ is of little concern. What is then of interest is accuracy (lack of bias) and efficiency (inverse of sampling
standard error) inherent in the method used. Thus it is valid to ask whether any of the AM, PD or TS models displays any
marked superiority over the others. This question has not yet been fully investigated; what has been done was reviewed
earlier in Section 1.3(b).
3.3 Population and Distributions
The choice of distribution problem, especially in the AM series case, has attracted considerable interest
(Gumbel (1941), Moran (1957), Benson (1968), Jenkinson (1969), Beard (1974), NERC (1975), Houghton (1977a),
Boughton (1980), Rossi et al. (1984), Ahmad et al. (1988)). Until the mid 1970's there was a tendency to treat this
aspect in isolation from choice of method of parameter and quantile estimation and from the choice of scheme for joint
use of at-site and regional data. This separation cannot be justified as all three aspects interact when the hydrology of a
region or country is being considered (Fiering, 1967). The question of choice of distribution for PD series has, on the
other hand, received considerably less attention.
3.3.1 DISTRIBUTIONS FOR AM SERIES
Choice of distribution for AM series has received widespread attention. For instance, Benson (1968) described
a study which concentrated on finding a distributional form which would describe well the observed sample distribution
of ten long records of AM floods. NERC (1975) also devoted considerable attention to the same problem. Surprisingly,
however, WMO (1984) reports (see Appendix 6) that:
"In many countries the selection of an AM distribution is actually not made in any objective manner and that
the choice of distribution is argued in a general manner, as follows:
The (chosen) distribution is
- widely accepted,
- simple,
- convenient to apply,
- consistent, flexible or robust (low sensibility to outliers),
- theoretically well based,
- documented in the Guide (WMO, 1983) and elsewhere.
No special method of parameter estimation is preferred and the graphical method is as frequently or even more
used as any other method."
Apart from goodness of fit type tests, information about distribution type should be inferable from the
dimensionless moments Cv, Cs and Ck measured from AM data. Such inference, attempted by Wu and Goodridge
(1976) and by McMahon and Srikanthan (1981), is unsatisfactory unless allowance is made for the bias which is known
to exist in the estimates of Cv, Cs and Ck (see Chapter 2, Section 2, and Appendices 2 and 3). Improvements on
such inferences may be possible through more powerful goodness of fit tests such as Anderson-Darling and modified
Anderson-Darling EDF tests (Ahmad et al., 1988b) and L-moment ratio diagrams based on PWMs (Hosking, 1986,
1988).
3.3.1.1 Robustness
In the past (e.g. Benson (1968), NERC (1975)), it was generally assumed that once a distributional choice was
made according to a goodness of fit criterion a satisfactory basis for flood estimation was established. Such an
approach assumes that a single, as yet unidentified, underlying distributional form is adequate for modelling AM flood
peaks. Even if this assumption were true it has to be accepted that the true underlying distributional form cannot be
identified with certainty at the present time, either on a single-site or regional basis. This fact should be taken into
account when selecting a distribution. In doing so a distribution and an associated method of parameter estimation
(denoted D/E for "distribution and estimation procedure") must be sought which is robust with respect to extreme upper
quantile estimation, over a reasonable range of distributions, random samples from which have statistical characteristics
similar to observed AM flood data. Such a reasonable range of distributions will be termed "flood-like", following
Landwehr et al. (1980).
A robust D/E procedure, in this context, is relatively insensitive to small changes in the distributional
assumptions which it assumes are true. For instance, let $\hat{Q}_T(A)$ and $\hat{Q}_T(B)$ be estimates of $Q_T$ obtained by assuming
D/E procedures "A" and "B" respectively. If D/E procedure "A" is robust, then $\hat{Q}_T(A)$ should be a good estimate of $Q_T$
regardless of whether the population from which the sample has been drawn is "A" distributed, "B" distributed or
"otherwise" distributed. On the other hand if D/E procedure "B" is non-robust then $\hat{Q}_T(B)$ might be a reasonable
estimate of $Q_T$ when the population from which the sample is drawn is "B" distributed but it may be greatly in error
when the population is "A" distributed. The criteria for assessing these estimates are bias, b, and root mean square
error, rmse, where
$$b = E(\hat{Q}_T - Q_T) \tag{3.1}$$
$$\mathrm{rmse} = \left[E(\hat{Q}_T - Q_T)^2\right]^{1/2} \tag{3.2}$$
Such assessments cannot be conducted on observed flood data because the true value of QT is unknown. They
can only be conducted in controlled simulation experiments using random samples drawn from known
statistical "flood-like" distributions (see for instance Hosking et al. (1985a)). Before promoting a particular D/E
procedure for use in flood frequency analysis, it is essential to confirm that it yields quantile estimates with relatively
small bias and rmse over a reasonable range of "flood-like" distributions, since the true underlying distributional form is
unknown. While this requirement is certainly necessary it may not alone be sufficient.
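The kind of controlled simulation experiment described above can be sketched as follows. This is only an illustration under stated assumptions, not a procedure taken from the report: the parent is a GEV distribution with hypothetical parameters (u = 100, alpha = 30), the D/E procedure under test is EV1 fitted by moments, and the sample size, number of replications and function names are all arbitrary choices.

```python
import numpy as np

EULER = 0.5772156649  # Euler's constant, used in the EV1 moment relations

def gev_quantile(F, u, alpha, k):
    """GEV quantile x(F) = u + (alpha/k) * (1 - (-ln F)**k); EV1 when k = 0."""
    if abs(k) < 1e-9:
        return u - alpha * np.log(-np.log(F))
    return u + alpha / k * (1.0 - (-np.log(F)) ** k)

def ev1_mom_quantile(sample, T):
    """D/E procedure under test: EV1 fitted by moments, evaluated at return period T."""
    alpha = np.std(sample, ddof=1) * np.sqrt(6.0) / np.pi
    u = np.mean(sample) - EULER * alpha
    return u - alpha * np.log(-np.log(1.0 - 1.0 / T))

def relative_bias_rmse(k_true, T=100, n=30, reps=2000, seed=1):
    """Relative bias and rmse of the EV1/moments estimate when the parent is GEV(k_true)."""
    rng = np.random.default_rng(seed)
    q_true = gev_quantile(1.0 - 1.0 / T, 100.0, 30.0, k_true)  # hypothetical parent
    est = np.empty(reps)
    for r in range(reps):
        # Inverse-transform sampling from the known parent distribution
        sample = gev_quantile(rng.uniform(size=n), 100.0, 30.0, k_true)
        est[r] = ev1_mom_quantile(sample, T)
    bias = est.mean() - q_true
    rmse = np.sqrt(np.mean((est - q_true) ** 2))
    return bias / q_true, rmse / q_true

for k in (0.0, -0.15):
    b, r = relative_bias_rmse(k)
    print(f"parent GEV k={k:+.2f}: relative bias={b:+.3f}, relative rmse={r:.3f}")
```

Repeating the loop over several parent distributions and several candidate D/E procedures gives the table of bias and rmse values on which a robustness comparison would rest.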
3.3.1.2 Single or multi-component AM distributions
In most applications the model consists of a single distribution. However, there are cases where a model
which recognizes different physical flood-producing mechanisms might have to be considered. If physical circumstances
warrant it, the observed AM flood series may be considered as the maxima of samples from two distinct sub-populations
(USWRC, 1981; Waylen and Woo, 1982). If two sub-populations are assumed then two sets of parameters must be
estimated. In addition, some such models (Singh and Sinclair, 1972) contain a mixture parameter which must also be
estimated. These more general models with extra parameters should only be adopted if there is a clear
physical distinction between the two types of events, as for instance between floods caused by summer cloud bursts and
spring snowmelt (Waylen and Woo, 1982) or between floods caused by hurricane and non-hurricane events (Fiering and
Jackson, 1971, p.80). Rossi et al. (1984) have suggested a two component extreme value distribution on the basis that it
can satisfactorily represent the regional spread of observed skewness and standardized largest values of observed AM
series.
3.3.1.3 Candidate AM distributions
In either case, many candidate distributions have been suggested including:

Distribution              Abbreviation   Reference
Log-normal                (LN)           (Hazen, 1914)
Pearson type 3            (P3)           (Foster, 1924)
Extreme value type 1      (EV1)          (Gumbel, 1941)
Extreme value type 2      (EV2)          (Gumbel, 1941)
Extreme value type 3      (EV3)          (Jenkinson, 1969)
Gamma                                    (Moran, 1957)
Log-Pearson type 3        (LP3)          (US Water Resources Council, 1967, 1976, 1977, 1981)
General extreme value     (GEV)          (Jenkinson, 1955, 1959)
Weibull                                  (Wu and Goodridge, 1976)
Wakeby                    (WAK)          (Houghton, 1978a)
Boughton                                 (Boughton, 1980)
Two component EV          (TCEV)         (Rossi et al., 1984)
Log-logistic              (LLG)          (Ahmad et al., 1988)
Generalized logistic      (GLG)          (Ahmad, 1988)
The mathematical forms of the distributions mentioned are given in Table 3.1 while the magnitude-return
period relation corresponding to some of these distributions for a fixed value of mean and variance is shown in Figure
3.1 on an EV1 reduced variate base y. It should be noted that for T > 5, yT ≈ ln(T - 0.5). It can be seen that the magnitude
of the EV1 variate itself increases linearly with y (≈ ln T) while some others increase nonlinearly. Variates with large
positive skewness increase more rapidly than EV1 while those with negative skewness increase less rapidly. The latter
tend towards a definite upper limit.
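The approximation for the EV1 reduced variate quoted above can be checked numerically. The short sketch below assumes the usual definition y = -ln(-ln F) with F = 1 - 1/T; the function name is illustrative only.

```python
import math

def ev1_reduced_variate(T):
    """Exact EV1 reduced variate y_T = -ln(-ln(1 - 1/T)) for return period T."""
    return -math.log(-math.log(1.0 - 1.0 / T))

# Compare the exact value with the approximation ln(T - 0.5)
for T in (2, 5, 10, 50, 100, 1000):
    exact = ev1_reduced_variate(T)
    approx = math.log(T - 0.5)
    print(f"T={T:5d}  exact={exact:.4f}  ln(T-0.5)={approx:.4f}  diff={approx - exact:+.4f}")
```

The difference shrinks quickly as T grows, which is why the approximation is stated only for T greater than about 5.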
Various distributions may also be distinguished in another way, namely by the relationships which exist
between Cv and Cs and between Cs and Ck for each distributional form. A selection of these are shown on moment
ratio diagrams in Figure 3.2. The relations shown apply to population parameter values. The relations appropriate to
small sample expectations of these statistics plot below those of Figure 3.2 (see Appendix 3). Because of the known
bias in small sample estimates of moments, it is anticipated that a curve joining plotted points representing the
expected value of skewness Cs and expected value of kurtosis Ck in small samples should plot below the curve
representing the population (Cs, Ck) relations in Figure 3.2.
Some of these distributions were proposed initially because of their abilities to model different shapes of
histogram or perhaps simply because they had not been used already. However, some of them have been recommended
on the basis of deductive reasoning. For instance, the extreme value family of distributions was proposed by Gumbel
(1941), while Chow (1954), many years after its initial use for AM series, suggested theoretical reasons for use of the
log-normal distribution. Both these proposals show that there is some deductive basis for these distributions.
3.3.1.4 Theoretical arguments
Gumbel postulated that because the maximum flow in a year is the maximum of daily values it should be
distributed as an extreme value variate. However, the 365 flow values are neither identically nor independently
distributed. If, therefore, the annual maxima have an extreme value distribution it is for reasons other than those stated
by Gumbel. Nevertheless, there are local maxima within the year and some M of these may be mutually independent,
even if not identically distributed. If this M is sufficiently "large" in the context of extreme value theory, then perhaps
the maximum may have an EV distribution. From this it is difficult to see a really secure argument in favour of an EV
distribution for flood series (see also Lamberti and Pilati, 1985).
A more important problem in the application of EV distributions to floods is that of choosing between Types
1, 2 and 3. Theory offers no help in this regard where the need is greatest. The difficulty and importance of
distinguishing between EV1 and EV2 or between EV1 and EV3 is just as great as that of distinguishing between EV1 (or
EV2 or EV3) and LP3, for instance. In itself, belief in the grounds on which the EV family of distributions is
recommended does not lead to a solution of the choice of distribution problem because great differences exist between
the members of the family itself in the manner in which Q varies with T. Tests proposed by van Montfort (1970) and
by Hosking et al. (1985b) to test for EV1 against alternatives of EV2 and GEV respectively are mentioned later in
Chapter 5. The EV3 distribution possesses a finite upper bound and because of this it appeals to those who believe that
the flood magnitudes on a catchment must have an upper bound. However, when such an estimated upper bound results
by chance from estimating the parameters from a single small or medium sized sample, it can sometimes be
unrealistically small, being only a few percent greater than the largest flood in the record.
Of the other distributions, only the log-normal has had any theoretical support elicited for it but then only
after 40 years of prior use in hydrology. Chow (1954) stated that if the annual maximum flood could be considered to be
the product of a large number of random effects then it would be log-normally distributed, because the logarithm of the
variate could be considered to be the sum of a large number of random effects and would therefore be normally
distributed by the central limit theorem. However, to be valid as a deductive theory these effects would have to be
identifiable. Failing this the distribution can only be supported by empirical data.
Thus, theoretical arguments cannot, per se, identify a best choice of distribution for floods. Given the
empirical evidence that most flood series are positively skewed, theoretical knowledge of distributions' properties can
serve to eliminate symmetrical distributions such as normal or rectangular. However, this use of theoretical knowledge
is not to be confused with knowledge claimed to be due to reasoning about the genesis of floods. In effect, at the time
of writing, empirical suitability plays a much larger role in distribution choice than a priori reasoning.
3.3.1.5 Implication of an upper bound on floods
Because of the striking difference between the major groups of distributions in the growth of flood magnitude
with return period, the choice of distribution to be used in engineering design has serious economic implications. If
there is a statistical upper bound, Qmax, to flood magnitude at any river site then it would be uneconomical to size large
structures for flood magnitudes derived on the assumption of an unbounded distribution unless Qmax >> QT for
reasonably large T. Conversely, if the distribution of flood magnitudes is statistically unbounded above then the
consequences of designing a large structure for flood magnitudes obtained on the assumption of a bounded distribution
could be calamitous.
It is worth noting that the motivation for Boughton's (1980) distribution was that in Eastern Australia the
largest recorded flows in some rivers show log-probability plot behaviour which suggests a tailing off towards some
upper bound. This is particularly true when the standardized variable $K = (Z - \bar{Z})/s_Z$ is plotted as ordinate,
where Z = log Q. That is, the 3 or 4 largest recorded values on many such rivers do not differ greatly from one another
as happens in other rivers. This is manifested by a preponderance of negative values of skewness in log space. The
remarkable feature of such largest floods is that in magnitude they can be significant proportions of the PMP flood.
This led Boughton to believe that a distribution which can accommodate an upper bound should be formulated for use
with such series.
It might be expected that some guidance on the question of whether or not an upper bound exists to flood magnitude
might be obtained by recourse to physical examination of the precipitation process which causes floods. While
the concept of a maximum possible rainfall amount, from the viewpoint of the limit of the capabilities of the physical
processes causing it, is very well publicised and documented (WMO, 1969, 1986), it is not universally accepted by
hydrologists (see for instance Yevjevich (1968) and Wallis (1980)). In this context also, Alexander et al. (1969a, b)
attempted to derive the distributional form which describes the series of annual maximum floods
from consideration of the distribution of the rainfall magnitudes involved. However, having examined many aspects of
the problem the authors were unable to offer solid guidance on the question of choice of distribution for floods. Reich
(1970), using data from 26 Pennsylvanian catchments, found that "no usable relationship could be found between the
extreme value statistics of rainfall and floods." On the other hand, French practice assumes that "the scale parameter of
the distribution of extreme flood volume (and hence indirectly that of the flood peak) can be extrapolated using the
distribution of rainfall volume" (Guillot, 1973), using the method known as GRADEX.
Figure 3.1 Magnitude - return period relations for selected distributions
[Figure not reproduced. Two panels plot magnitude (0 to 300) against the EV1 reduced variate y (-2 to 7), with a
corresponding return period scale T from 1.01 to 1000. All curves have mean = 100 and Cv = 0.3. Upper panel: EV2,
TCEV and WAK with Cs = 2.5 and Ck ≈ 19, 21 and 25 respectively, shown with EV1 and EV3 (GEV k = -0.15 for EV2,
k = +0.15 for EV3). Lower panel: P3, LN3 and LP3 with Cs = 2.5, LN2 with Cs = 0.93 and EV1 with Cs = 1.14.]
Table 3.1 Mathematical definitions of distributions used for AM series
(For each distribution: probability density function f(x) or distribution function F(x), followed by the variate
and parameter ranges.)

Extreme Value Type 1 (Gumbel or EV1)
    $F(x) = \exp\{-e^{-(x-u)/\alpha}\}$
    $-\infty \le x \le \infty$; $\alpha > 0$

General Extreme Value (GEV)
    $F(x) = \exp\{-[1 - k(x-u)/\alpha]^{1/k}\}$
    $u + \alpha/k \le x \le \infty$ if $k < 0$; $-\infty < x \le u + \alpha/k$ if $k > 0$; $\alpha > 0$

Extreme Value Type 2 (EV2)
    [expression illegible in source]
    $\varepsilon \le x$; $0 \le \varepsilon < u$; $k > 0$

Log-Gumbel
    $F(x) = \exp\{-e^{-(\log x - b)/a}\}$
    $-\infty \le \log x \le \infty$; $a > 0$

Pearson Type 3
    $f(x) = \dfrac{[(x-m)/a]^{b-1}}{|a|\,\Gamma(b)} \exp\{-(x-m)/a\}$
    $m \le x$ if $a > 0$; $x \le m$ if $a < 0$

Gamma
    $f(x) = \dfrac{(x/a)^{b-1}}{|a|\,\Gamma(b)} \exp(-x/a)$
    $0 \le x$ if $a > 0$; $x \le 0$ if $a < 0$
    i.e. Pearson Type 3 with m = 0

Exponential
    $f(x) = \dfrac{1}{a} \exp\{-(x-m)/a\}$
    $m \le x$
    i.e. Pearson Type 3 with b = 1 and Weibull with b = 1

Weibull
    $f(x) = \dfrac{b}{|a|} [(x-m)/a]^{b-1} \exp\{-[(x-m)/a]^{b}\}$
    $m \le x$ if $a > 0$; $x \le m$ if $a < 0$

2 Parameter Lognormal (LN2)
    $f(x) = \dfrac{1}{\sqrt{2\pi}\,a\,x} \exp\{-\tfrac{1}{2}[(\log x - b)/a]^2\}$
    $0 < x$

3 Parameter Lognormal (LN3)
    $f(x) = \dfrac{1}{\sqrt{2\pi}\,a\,(x-m)} \exp\{-\tfrac{1}{2}[(\log(x-m) - b)/a]^2\}$
    $m < x$

2 Component Extreme Value (TCEV)
    $F(x) = \exp(-\Lambda_1 e^{-x/\theta_1} - \Lambda_2 e^{-x/\theta_2})$
    $x > 0$; $\theta_1 > 0$; $\theta_2 > 0$

Wakeby (WAK)
    $x = m + a[1 - (1 - F(x))^{b}] - c[1 - (1 - F(x))^{-d}]$
    See Appendix 4, Table A4.4.

Log Pearson Type 3 (LP3)
    If x is LP3 and z = log x,
    $f(x) = \dfrac{[(z-c)/a]^{b-1}}{x\,|a|\,\Gamma(b)} \exp\{-(z-c)/a\}$
    $c \le z \le \infty$ ($e^{c} \le x \le \infty$) if $a > 0$; $-\infty \le z \le c$ ($0 \le x \le e^{c}$) if $a < 0$

Boughton (1980)
    If z = log x then
    $\dfrac{z - \mu_z}{\sigma_z} = K = A + \dfrac{C}{\ln[-\ln(F)] - A}$
    $-\infty \le K \le A$

Log-logistic (LL) (Ahmad et al., 1988)
    $F(x) = \{1 + [(x-a)/b]^{p}\}^{-1}$, where the exponent $p$ is illegible in the source
    $x > a$; $c > 0$; $b > 0$

Generalised logistic (GL) (Ahmad, 1988)
    $F(x) = \{1 + [1 - \gamma(x-\alpha)/\beta]^{1/\gamma}\}^{-1}$, $\gamma \ne 0$
    $F(x) = \{1 + \exp[-(x-\alpha)/\beta]\}^{-1}$, $\gamma = 0$
    $\gamma < 0$: $\alpha + \beta/\gamma \le x < \infty$; $\gamma = 0$: $-\infty < x < \infty$; $\gamma > 0$: $-\infty < x \le \alpha + \beta/\gamma$
Figure 3.2 Moment-ratio diagrams for selected distributions
[Figure not reproduced. The diagrams relate Cv to Cs and skewness to Ck for the selected distributions; the normal
distribution (Cs = 0) and the exponential distribution are marked.]
3.3.1.6 Role of skewness in distribution choice
It must be acknowledged that a distribution for annual maximum floods cannot be chosen solely on the basis
of a priori theoretical arguments. The characteristics of observed flood data must be determined in a suitable fashion and
taken into account when a distribution is being chosen. Some distributions can be excluded if it is known that random
samples from them do not have characteristics in common with observed flood data. Slack et al. (1975) have shown, in
the context of quantile estimation, that quantile estimates with small expected opportunity design loss (which is a
function of quantile estimate bias and rmse) are obtained even when the assumed form of model distribution is not
identical to the population distribution, provided that the model distribution is selected from among distributions which
have approximately the same skewness as the population. In the case of a three parameter distribution this would
involve placing a constraint on the shape parameter. This is one reason for the need to be able to make correct
inferences about the AM population skewness. Rossi et al. (1984) actually rejected EV1 and LN2 distributions for
Italian conditions because of their inconsistency with observed skewness of the data.
3.3.1.7 Parsimony
A further question is whether the flood series of all the rivers in a particular region, country or continent
should or could be considered to be distributed according to a common distributional form. While this hypothesis may
not be provable, the results of random sampling experiments (Wallis et al., 1974; Matalas et al., 1975) indicate such
diversity of samples from a common parent that this hypothesis might well be difficult to reject. Its acceptance would
concur with the general scientific preference for parsimony of model types.
3.3.2 DISTRIBUTIONS FOR PD SERIES
The modelling problem in the partial duration series is also one of choice of distribution coupled with a choice
of the number of peaks, M = λN, to include in the series. While increasing M increases the amount of information in
the series it sometimes makes the problem of choice of distribution more difficult and it may also affect the
assumptions made about model structure as explained below.
An exponential distribution of magnitudes is moderately satisfactory in series of length M < 2N. Increasing
the size of the series to M > 3N or more often introduces a striking departure from exponential behaviour at the lower
end and results in a series that is not always easy to model. If the threshold q0 is too low the distribution displays a
mode at some value greater than q0, as illustrated in Figure 3.3(a). This may be more difficult to model than a
distribution whose mode is at q0. In such cases, the statistical algebra of truncated distributions may be complex. The
exponential distribution is exceptional in this regard and the invariance of its algebraic form with change in threshold
level helps to account for its popularity in this application.
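The threshold invariance of the exponential distribution mentioned above can be illustrated by simulation. The sketch below is not from the report: the base threshold (50), the exponential mean (20) and the function name are hypothetical, and the check is simply that peaks retained above a higher threshold still have the same mean excess over that threshold.

```python
import numpy as np

def mean_excess(peaks, threshold):
    """Mean of (peak - threshold) over the peaks above the threshold, and their count."""
    kept = peaks[peaks > threshold]
    return float(kept.mean() - threshold), int(kept.size)

rng = np.random.default_rng(0)
q0, beta = 50.0, 20.0                      # hypothetical base threshold and mean excess
peaks = q0 + rng.exponential(beta, size=200_000)

# Raising the threshold discards peaks but leaves the mean excess close to beta
for q in (50.0, 70.0, 90.0):
    m, count = mean_excess(peaks, q)
    print(f"threshold={q:5.1f}  peaks kept={count:6d}  mean excess={m:6.2f}")
```

For most other distributions the mean excess changes with the threshold, so the truncated form has to be re-derived each time the threshold is moved.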
Figure 3.3 Effect of threshold level on type of distribution required to model the PD series.
[Figure not reproduced. Two sketches of the density f(q) against q, each with threshold q0 marked.
(a) Low value of threshold. Truncated distribution with mode > q0. This can be difficult to model.
(b) Medium to high value of threshold. Truncated distribution with mode at q0. This may be modelled more easily.]
In most published papers on analysis of partial duration flood series the exponential distribution plays a
prominent part but other distributions have also been used. These distributions (log-normal, Pearson 3, log-Pearson 3,
Pearson 4, Polya, gamma, sinepower, geometric, Goodrich and extreme value type 1) are listed in Table X of WMO
Operational Hydrology Report No. 15 (Sevruk and Geiger, 1981) together with references to publications in which they
appear. Many of these refer to precipitation series. The generalized Pareto distribution (GPD) (van Montfort and
Witter, 1985) can also be adapted for use in this context and could be a more realistic model for PD magnitudes than
the simple exponential. This would be consistent with the use of the GEV distribution for AM series.
3.3.2.1 Clusters of peaks in PD series
If the number of peaks, M, included in the series is high then the corresponding threshold, q0, is necessarily
low. In some years in these circumstances flood peaks exceeding q0 appear to occur in clusters, which gives rise to the
fear that the successive flood peaks are not statistically independent and/or that the rate of occurrence of the peaks, as
distinct from the magnitudes, is controlled at times by some form of persistent process. Cunnane (1979), having
applied arbitrary but consistent rules to exclude some flood peaks, found no correlation among peak magnitudes but
concluded that a persistence mechanism may exist in the process which gives rise to the flood peaks. Ashkar and
Rousselle (1983a) discuss the effect of choice of threshold level q0 on the validity of the PD model assumptions and
recommend that q0 be chosen sufficiently large to ensure that flood peaks exceeding q0 occur in a
Poisson fashion. Ashkar and Rousselle (1983b) examine the effect of placing restrictions on the selection, for the PD
series, of flood events from the observed historical series to exclude peaks which might be feared to be inter-correlated.
They reiterate the advantages of choosing a threshold q0 which allows the valid use of the Poisson assumption but where
circumstances do not allow such a choice, they refer to the relatively new stochastic trigger type model (Cervantes, 1981;
Kavvas, 1982a, b; Cervantes et al., 1983).
Apart from these restricted comparisons no further objective comparisons based on bias and efficiency of QT
estimates have been published and the TS model has never been considered in this way.
CHAPTER 4
METHODS OF QUANTILE ESTIMATION
4.1 Introduction
Many methods of quantile estimation have been suggested in the past, some of which are tailored for particular
circumstances. As indicated in Section 4.2, estimation methods depend on data availability and on the amount of
regional pooling of data which is to be allowed. Section 4.3 outlines a number of basic distinctions between different
types of methods while section 4.4 outlines the main types of schemes used in practice for at-site, at-site/regional and
regional only (ungauged catchment) cases. Section 4.5 discusses design flood specification and draws attention to the
question of expected probability correction. In view of the increased use of regional analysis, the value of which is
being continually demonstrated, the question of regional homogeneity of flood statistical behaviour is of great
importance and this is discussed briefly in Section 4.6. Some aspects of flood estimation in arid zones, by no means a
solved problem, are discussed in Section 4.7. While the PD model has some advantages over the AM model, much of
what follows relates to the AM model because it is by far the more widely used, mainly because the AM series is more
easily available.
4.2 Dependence on data availability
The method adopted for estimation of QT in any situation depends on the amount and type of hydrological data
available at the site in question. Two broad categories of data availability are:
(a) at-site hydrological data are available, along with data at some other sites in the region;
(b) at-site hydrological data are not available but data at some other sites in the region are available.
From the data input point of view, three categories of quantile estimation schemes may be identified:
(a) use of at-site data alone;
(b) joint use of at-site and regional data;
(c) use of regional data alone in the absence of at-site data.
These three methods are amplified in Section 4.3.
In the absence of at-site and regional flow data recourse would have to be had to estimating flood quantiles
from rainfall statistics and a rainfall-runoff model by methods which are not the subject of this report.
4.3 Some basic distinctions and concepts
4.3.1 DISTRIBUTIONAL AND PARAMETRIC METHODS (OR DISTRIBUTION-FREE AND NON-PARAMETRIC METHODS)
The probability that the ith largest among N past AM flood events will be exceeded during N' future AM
floods may be estimated, using "ball and urn" type probability models, without making any assumption about the
distribution of flood magnitudes (Thomas, 1949; Gumbel, 1958, Chapter 2.2). Such a technique is called
distribution-free. Its drawback is that it can be applied only to the N observed variate values and neither intermediate
values nor values outside the observed range can have their probabilities estimated. Hence quantiles of arbitrary return
periods cannot be estimated directly by "ball and urn" distribution-free methods. The use of modern non-parametric
methods, suitable for use with large samples, has been demonstrated hydrologically by Adamowski (1985). Such a method
can obviate some of the disadvantages of the "ball and urn" methods at the expense of computational effort and some
complexity.
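A "ball and urn" calculation of the kind referred to above can be sketched as follows. The formula coded here is one standard distribution-free result for the number of exceedances; it is an assumption that this is the form intended by the references cited, and the function and argument names are illustrative only.

```python
from math import comb

def exceedance_pmf(n_past, m, n_future, x):
    """P(exactly x of n_future values exceed the m-th largest of n_past values),
    assuming only that all values are independent draws from one continuous population."""
    return (comb(n_future, x) * m * comb(n_past, m)
            / ((n_past + n_future) * comb(n_past + n_future - 1, m + x - 1)))

def prob_exceeded(n_past, m, n_future):
    """P(the m-th largest past value is exceeded at least once in n_future values)."""
    return 1.0 - exceedance_pmf(n_past, m, n_future, 0)

# Chance that the largest flood in a 30-year record is exceeded in the next 10, 30 and 100 years
for n_future in (10, 30, 100):
    print(n_future, round(prob_exceeded(30, 1, n_future), 3))
```

For the largest value (m = 1) the result reduces to N'/(N + N'), which makes plain that no statement can be made about magnitudes lying between or beyond the observed values.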
If AM data are represented on a cumulative histogram or on a probability plot an eye-guided curve may be
drawn through the plot which thus defines, in a non-algebraic manner, a relation between magnitude Q and probability
(and hence T). Such a non-algebraic relation defines the distribution of Q in a non-parametric manner. If such a relation
could be described algebraically by an expression which depends on only a few constants or parameter values (e.g.
equation 1.6), it would be called a parametric description of the distribution of Q.
In general, distribution-free methods are less efficient (see section 4.3.2) than distributional methods. They are
by nature more robust than distributional methods but this in itself is insufficient to recommend them. The statistical
nature of flood series (Chapter 2, Section 2) is sufficiently well-known to make a reasonable, if not perfect, choice of a
distribution, whose use would yield more efficient quantile estimates than distribution-free methods. Similarly,
parametric description of distributions leads to quantile estimators which are objective and whose sampling properties
are capable of being more readily investigated. They are generally more efficient and hence they are preferred to
non-parametric descriptions of flood distributions.
4.3.2 BIAS, STANDARD ERROR AND EFFICIENCY
Objective quantile estimation methods are based on methods devised for use with truly random samples from
stationary populations. Such random samples have the characteristic that different samples, when treated in the same
way, generally yield numerically different values of quantile estimates. Subsamples of long AM series also yield
different values of quantile estimates and their variation is similar to what would be expected to occur among estimates
obtained from truly random samples, a finding which lends some validity to the assumption that AM series can be
treated by random sample methods. (See for instance NERC 1975, 1.2.5).
If $\hat{Q}_T$ is the value of quantile estimate obtained by a particular distribution and estimation (D/E) procedure and
$Q_T$ is the population value, define bias, b, standard error, se, and root mean square error, rmse, by
$$b = E(\hat{Q}_T) - Q_T \tag{4.1}$$
$$\mathrm{var}(\hat{Q}_T) = E[\hat{Q}_T - E(\hat{Q}_T)]^2 \tag{4.2}$$
$$\mathrm{se}(\hat{Q}_T) = [\mathrm{var}(\hat{Q}_T)]^{1/2} \tag{4.3}$$
$$\mathrm{rmse} = [E(\hat{Q}_T - Q_T)^2]^{1/2} \tag{4.4}$$
from which
$$(\mathrm{rmse})^2 = (\mathrm{se})^2 + b^2 \tag{4.5}$$
One D/E procedure is more efficient than another if it has smaller sampling variance (= se^2) than the other.
Efficiency of a D/E procedure is a property which is inversely proportional to sampling variance.
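The relation (4.5) between rmse, standard error and bias follows by adding and subtracting the expected value of the estimate inside the square. The short derivation below is a sketch in which $\hat{Q}_T$ denotes the estimate and $Q_T$ the population value; the cross term vanishes because $E[\hat{Q}_T - E(\hat{Q}_T)] = 0$.

```latex
\begin{aligned}
(\mathrm{rmse})^2 &= E\big[(\hat{Q}_T - Q_T)^2\big] \\
  &= E\big[\big(\hat{Q}_T - E(\hat{Q}_T) + E(\hat{Q}_T) - Q_T\big)^2\big] \\
  &= E\big[\big(\hat{Q}_T - E(\hat{Q}_T)\big)^2\big]
     + 2\,b\,E\big[\hat{Q}_T - E(\hat{Q}_T)\big] + b^2 \\
  &= (\mathrm{se})^2 + b^2 .
\end{aligned}
```

It follows that a more efficient procedure (smaller se) does not necessarily have the smaller rmse if it is also the more biased.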
4.3.2.1 Confidence intervals
A 90% confidence interval for a quantile, based on a single estimate $\hat{Q}_T$, is usually evaluated as
$$\hat{Q}_T \pm 1.645\,\mathrm{se}(\hat{Q}_T) \tag{4.6}$$
on the assumption that the $\hat{Q}_T$ statistic is normally distributed (e.g. Hosking et al., 1985a, p.95 and Fig.5). Hebson
and Cunnane (1987) found that $\hat{Q}_T$ estimates have a skewed distribution when obtained from small samples by at-site
estimation methods but that they are very close to being normally distributed when obtained by combined
at-site/regional methods. Guidelines for flood frequency analysis in the United States (USWRC, 1981) contain methods
for computing confidence intervals for skewed distributions (specifically LP3) using a non-central t-distribution, while
Stedinger (1983) gives a good account of confidence intervals in the hydrological context.
A confidence interval obtained in this way needs to be interpreted with care. The 90% confidence interval thus
defined would include or straddle the true value QT in 90% of samples in repeated sampling. It does not mean that there
is 90% probability that QT lies in the interval, defined in equation (4.6), calculated from a particular sample. The latter
type of statement would be valid only in a Bayesian context.
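The repeated-sampling interpretation can be made concrete with a small simulation. The sketch below makes several simplifying assumptions that are not in the report: the interval is taken as estimate ± 1.645 se, and the sample mean of normally distributed data stands in for the quantile estimator so that the normality assumption holds almost exactly. All names and numerical values are hypothetical.

```python
import numpy as np

Z90 = 1.645  # standard normal deviate giving a two-sided 90% interval

def normal_interval(estimate, se, z=Z90):
    """Interval estimate +/- z * se, valid if the estimator is normally distributed."""
    return estimate - z * se, estimate + z * se

def coverage(true_value=100.0, sd=30.0, n=50, reps=4000, seed=0):
    """Fraction of repeated samples whose interval straddles the true value."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = rng.normal(true_value, sd, size=n)
        lo, hi = normal_interval(x.mean(), x.std(ddof=1) / np.sqrt(n))
        hits += int(lo < true_value < hi)
    return hits / reps

print("fraction of intervals containing the true value:", coverage())
```

Each individual interval either contains the true value or it does not; the 90% figure describes the long-run proportion over many samples.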
4.3.3 BAYESIAN AND NON-BAYESIAN METHODS OF ESTIMATION
In non-Bayesian estimation, parameters and quantiles of a population are considered to be fixed but unknown.
Their values are estimated from random sample data. In Bayesian estimation, parameters and quantiles are considered not
as unknown constants but rather as unknown random variables, the distributions of which are modified by random
sample data in the estimation procedure. In Bayesian methods information about parameters and quantiles from separate
sources, such as regional and at-site data, is combined in a logical framework. The distribution of each parameter, and
that of QT also, is modified by each new piece of information.
Initial knowledge, if any, is expressed as a prior distribution and sample information is incorporated in the
form of a likelihood function which depends on the distributional form chosen for the population. The prior
distribution and sample likelihood are combined via Bayes' Theorem to give a posterior distribution of the parameters
and of the quantity required, QT.
If there is no initial knowledge on the parameters, they are assigned a uniform prior distribution. This is
known as the non-informative prior distribution. An informative prior can be obtained from regionally-based equations
relating the parameters to characteristics of the catchment such as area, slope, soil cover and an index of climate such as
mean annual rainfall, (Cunnane and Nash, 1971; Wood and Rodriguez-Iturbe, 1975; Kuczera, 1982; Kuczera,1983).
(If upper and lower confidence bounds, QU and QL, are obtained for QT from its Bayesian
posterior distribution then we may speak of the probability of the event [QL < QT < QU]
if we understand the probability to be a subjective probability, whereas we cannot make
such a direct statement in the non-Bayesian case.)
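The combination of a prior distribution with a sample likelihood can be sketched numerically on a grid. Everything specific in the example below is an assumption made for illustration: the population is taken as EV1 with a known scale (alpha = 25), only the location u is uncertain, the informative prior on u is normal with hypothetical mean 120 and standard deviation 40 (as might come from a regional equation), and the AM values are invented.

```python
import numpy as np

def ev1_log_likelihood(u_grid, alpha, data):
    """Log-likelihood of EV1(u, alpha) for each candidate u, with alpha held fixed."""
    z = (data[:, None] - u_grid[None, :]) / alpha
    return np.sum(-np.log(alpha) - z - np.exp(-z), axis=0)

def grid_posterior(u_grid, alpha, data, prior_mean, prior_sd):
    """Posterior density of u on a regular grid: prior times likelihood, normalised."""
    log_prior = -0.5 * ((u_grid - prior_mean) / prior_sd) ** 2
    log_post = log_prior + ev1_log_likelihood(u_grid, alpha, data)
    dens = np.exp(log_post - log_post.max())      # subtract max to avoid underflow
    du = u_grid[1] - u_grid[0]
    return dens / (dens.sum() * du)

# Hypothetical AM series and assumed known scale parameter
am = np.array([95., 132., 88., 160., 110., 105., 142., 98.,
               121., 175., 101., 116., 90., 134., 108.])
alpha = 25.0
u_grid = np.linspace(0.0, 300.0, 3001)
post = grid_posterior(u_grid, alpha, am, prior_mean=120.0, prior_sd=40.0)

# 90% posterior bounds on u, mapped to Q100 = u + alpha * y100 (alpha assumed known)
du = u_grid[1] - u_grid[0]
cdf = np.cumsum(post) * du
u_lo = u_grid[np.searchsorted(cdf, 0.05)]
u_hi = u_grid[np.searchsorted(cdf, 0.95)]
y100 = -np.log(-np.log(1.0 - 1.0 / 100.0))
print("posterior bounds QL, QU for Q100:", u_lo + alpha * y100, u_hi + alpha * y100)
```

Because the bounds are read from a posterior distribution, a probability statement about QT lying between them is meaningful in the subjective sense described above.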
The Bayesian estimator is also of direct use for decision-making (Wood and Rodriguez-Iturbe, 1975). Inference
about a hydrologic quantity such as QT is not an end in itself but one of the steps taken towards an engineering and
economic decision such as the height of a levee, depth of a channel or width of a spillway. Bayesian inference can be
combined with decision theory in a far more satisfactory manner than can traditional frequency-based inferences. In this
approach model uncertainty can also be taken into account (Wood, 1974).
4.3.4 DATA DISPLAY
Ranked sample data are usually displayed on a probability plot having a normal, Gumbel (EV1) or exponential
base. Unbiased plotting positions should be used (Lieblein, 1953; Cunnane, 1978) because the traditional Weibull
formula, i/(N+1), leads on average to data from a "straight line" population having an elongated S appearance on the
probability plot. The resulting bias in graphical quantile estimates was noted by Benson (1960) in his classic sampling
experiment, but such bias was not attributed by him to the plotting position bias.
The unbiased plotting formulae are:
For normal paper (Blom, 1958):
$$F_i = \frac{i - 3/8}{N + 1/4} \tag{4.7a}$$
For Gumbel and exponential paper (Gringorten, 1963):
$$F_i = \frac{i - 0.44}{N + 0.12} \tag{4.7b}$$
where Fi is the plotting probability, N is sample size and i is rank with i = 1 indicating the smallest sample member.
Unbiased plotting formulae are distribution-specific. If a single distribution-free formula is required then
$$F_i = \frac{i - 0.4}{N + 0.2} \tag{4.8a}$$
might be a reasonable compromise for mildly skewed data (Cunnane, 1978) while Hazen's formula (Hazen, 1914)
$$F_i = \frac{i - 0.5}{N} \tag{4.8b}$$
would be more suitable for more skewed data.
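The plotting formulae above all share the form (i - a)/(N + b), so they can be compared with one small helper. The sketch below uses hypothetical names; the constants are those of the Weibull formula and equations (4.7a), (4.7b), (4.8a) and (4.8b), with i = 1 for the smallest sample member.

```python
def plotting_position(i, n, a, b):
    """Plotting probability F_i = (i - a) / (n + b) for rank i (1 = smallest) of n."""
    return (i - a) / (n + b)

# (a, b) constants for each formula
FORMULAE = {
    "Weibull": (0.0, 1.0),
    "Blom (4.7a)": (0.375, 0.25),
    "Gringorten (4.7b)": (0.44, 0.12),
    "Compromise (4.8a)": (0.4, 0.2),
    "Hazen (4.8b)": (0.5, 0.0),
}

# Return period implied for the largest member of a 20-year sample
n = 20
for name, (a, b) in FORMULAE.items():
    F = plotting_position(n, n, a, b)
    print(f"{name:18s} F={F:.4f}  T={1.0 / (1.0 - F):6.1f}")
```

The formulae differ most at the extremes of the sample, which is where the choice matters for graphical quantile estimation.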
Unbiased plotting positions have been considered by Ii et al. (1984) for the P3 distribution and by Arnell
et al. (1986) for the GEV distribution. Each of these specifies the plotting formula in terms of i, N and constants which
vary with the respective shape parameters of the distributions and which are available in tabular form. In-na (1988)
developed a single formula, in which skewness is explicitly included, for unbiased plotting positions for P3 samples
which, after a little rearrangement, can be expressed as
$$F_i = \frac{i - 0.53 + 0.3\,C_s}{N + 0.05 + 0.3\,C_s} \tag{4.9}$$
a result also published in Nguyen et al. (1988). In these cases specially graduated graph paper can be constructed for
specific values of the shape parameter on which data from a distribution having that value of shape parameter would tend
to plot as a straight line. Zhang (1982) and Hirsch (1985) have considered plotting formulae in the context of inclusion
of historical flood data in estimation schemes.
Probability plots can also be prepared on ordinary graph paper if the plotting positions are expressed as the
expected values of the reduced (standardized) variate order statistics $E(y_{(i)})$ (Cunnane, 1978). An approximation to these
is
$$y_i = y(F_i) \approx E(y_{(i)}) \tag{4.10}$$
where y(F) is the inverse form of the cumulative distribution function of the appropriate reduced variate, and Fi is
obtained via equations (4.7) or (4.8); for example,
$$y_i = -\ln\{-\ln[(i - 0.44)/(N + 0.12)]\} \tag{4.11}$$
is an approximation to $E(y_{(i)})$ for EV1 which is almost exact at i = N. For the exponential case the exact expression for
$E(y_{(i)})$ has been given by Sukhatme (1937),
$$E(y_{(i)}) = \sum_{j=1}^{i} \frac{1}{N + 1 - j} \tag{4.12}$$
4.4 Quantile estimation schemes
4.4.1 USE OF AT-SITE DATA ALONE
AM series data are considered as a random sample from a population of flood values whose distribution can be
described by a probability density function which depends on just a few parameters. Traditionally one of the two or
three parameter distributions of Table 3.1 is selected. The three parameters are usually related to location, scale and shape of
the distribution and are closely analogous to the distribution's mean, standard deviation and coefficient of skewness.
Having decided on a distribution, parameter estimation may be made by a non-Bayesian or by a Bayesian method. The
latter is generally used in the regional context only (see 4.4.2.2 below).
4.4.1.1
Non-Bayesian methods
These include moments (MOM), maximum likelihood (ML), least squares (LS), probability weighted moments (PWM) (Greenwood et al., 1979) and sextiles (Jenkinson, 1969).
MOM, although easy to apply, does not use all the sample information in an exhaustive manner and is not as
efficient as ML estimation, especially in three parameter distributions (Matalas and Wallis, 1973).
The ML method is regarded as being best because it is most efficient. That is, the sampling variance of the estimated parameters, and hence of QT, is asymptotically smaller than by any other estimation method. ML estimates are frequently biased but corrections for bias can be found (Fiorentino and Gabriele, 1984; Hosking, 1985). More seriously, ML estimates may not always be feasible in small samples from three parameter distributions (Matalas and Wallis, 1973; Hosking et al., 1985b). The application of ML estimation is no longer unattractive from a numerical point of view because of the widespread use of programmable calculators and the increased number of micro and personal computers now available.
The location and scale parameters of a distribution can be obtained by least squares regression of the ranked sample values on ordered reduced variate values known as plotting positions. The use of generalised least squares for this purpose was pointed out by Lloyd (1952) while Chow (1954) used ordinary least squares and Gumbel (1958) introduced a modified least squares procedure. The resulting scale parameter, and hence QT, will be biased upwards if the Weibull plotting position, i/(N+1), is used (Lowery and Nash, 1970) but need not be biased if suitable plotting positions are adopted (Lieblein, 1953; Cunnane, 1978).
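The least squares approach can be sketched as follows for the EV1 case, assuming ordinary least squares and the Gringorten-based reduced variates of equation (4.11). The function names are hypothetical, and the generalised and modified procedures of Lloyd (1952) and Gumbel (1958) are not reproduced here.

```python
import math


def gumbel_reduced_variates(n):
    """Approximate E(y_(i)) for EV1 from equation (4.11), for ranks i = 1..n."""
    return [-math.log(-math.log((i - 0.44) / (n + 0.12))) for i in range(1, n + 1)]


def fit_ev1_least_squares(sample):
    """Ordinary least squares of ranked flows on reduced variates.

    Returns (u, alpha), the location and scale of the line Q = u + alpha * y.
    """
    q = sorted(sample)
    n = len(q)
    y = gumbel_reduced_variates(n)
    y_mean = sum(y) / n
    q_mean = sum(q) / n
    sxy = sum((yi - y_mean) * (qi - q_mean) for yi, qi in zip(y, q))
    syy = sum((yi - y_mean) ** 2 for yi in y)
    alpha = sxy / syy
    u = q_mean - alpha * y_mean
    return u, alpha


def ev1_quantile(u, alpha, T):
    """Q_T = u + alpha * y_T, with y_T = -ln(-ln(1 - 1/T))."""
    return u + alpha * (-math.log(-math.log(1.0 - 1.0 / T)))
```

Replacing the constants 0.44 and 0.12 by 0 and 1 gives the Weibull position i/(N+1), which allows the upward bias in the scale parameter mentioned above to be examined by simulation.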
The PWM method, developed by Greenwood et al. (1979), calculates linear functions M(0), M(1), M(2) of the sample data (see Appendix 4) and these quantities are equated to theoretically derived expressions in the distribution's parameters in a manner analogous to the use of ordinary moments. In the EV1 and exponential cases the resulting quantile estimates are linear functions of the sample data, a property shared by least squares estimates in these cases. The PWM method has good statistical properties (Landwehr et al., 1979; Hosking et al., 1985b). It was originally used, however, only with distributions whose distribution function F(q) is explicitly expressible in inverse form q = q(F), thus ruling out normal, Pearson type 3 and their log-distributions. However Song and Ding (1988) have derived an algorithm for PWM estimation of P3 parameters and the use of PWM estimation for TCEV has been investigated (Beran et al., 1986). A rigorous definitive account of PWMs is given by Hosking (1986).
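A minimal sketch of PWM estimation for EV1 is given below. It assumes the convention b_r = estimate of E[Q F(Q)^r] with the usual unbiased rank weights; the M(k) quantities of Appendix 4 may be defined differently (for instance with (1 - F)^k), so the correspondence should be checked before use. The function names are illustrative only.

```python
import math

EULER_GAMMA = 0.5772156649015329


def sample_pwm(sample, r):
    """Unbiased estimate b_r of E[Q * F(Q)**r] from the ranked sample (needs n > r)."""
    q = sorted(sample)
    n = len(q)
    total = 0.0
    for i, qi in enumerate(q, start=1):
        w = 1.0
        for k in range(1, r + 1):
            w *= (i - k) / (n - k)   # weight is zero for ranks i <= r
        total += w * qi
    return total / n


def fit_ev1_pwm(sample):
    """EV1 location u and scale alpha from b_0 and b_1.

    Uses E[Q] = u + gamma * alpha and 2 * b_1 - b_0 = alpha * ln 2.
    """
    b0 = sample_pwm(sample, 0)
    b1 = sample_pwm(sample, 1)
    alpha = (2.0 * b1 - b0) / math.log(2.0)
    u = b0 - EULER_GAMMA * alpha
    return u, alpha
```

Both estimates are linear in the ranked sample values, which is the property noted in the text for the EV1 case.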
If parameters of a log-distribution are being estimated by moments there is more than one possible variation, viz. (a) the moments of the data may be equated to the corresponding expressions in terms of parameters in the flow or real domain, (b) the moments of the logarithms of the data may be equated to expressions in the parameters in the log domain, or (c) one parameter may be estimated in the real domain and one or more parameters may be estimated in the log domain. The latter is referred to as the method of mixed moments (see for instance Phien and Hira (1983)).
4.4.1.2
Historical Data
Methods have also been developed, with the help of existing techniques for censored samples (see for instance Kendall and Stuart, 1961), to incorporate some types of historical information or to allow data to be used which have been truncated owing to an inadequate streamflow recorder. Such techniques have been reported by Benson (1950), Leese (1973), USWRC (1981) and Condie and Lee (1983), and have recently been examined critically by Hosking and Wallis (1986a, 1986b) and by Stedinger and Cohn (1985, 1986). Chinese practice (Chen et al., 1975; Hua, 1985) is based on graphical estimation using plotting positions whose derivation has been thoroughly investigated by Zhang (1982).
4.4.1.3
Robustness
Another topic which has rightly attracted the attention of hydrologists (Houghton, 1977; Kuczera, 1982b) during the last decade is the robustness of statistical estimators. Broadly speaking an estimator is robust if it estimates QT "sensibly well" even if the assumptions used in the estimates are slightly wrong or if the data are ill-behaved because of outliers or measurement errors. Robustness has been more thoroughly discussed in Section 3.3.1.1.
4.4.2
JOINT USE OF AT-SITE AND REGIONAL DATA.
This method, of which there are many variations, assumes that AM populations at several sites in a region are similar with respect to statistical characteristics which are not dependent on catchment size. This homogeneity assumption is a gross oversimplification but it is a very convenient and effective one. In most circumstances it is advisable to combine site specific and regional information.
If only a small sample of AM data is available at a site, one could not hope to estimate the entire Q-T relationship from it. At the very most no more than two parameters of the AM distribution would be estimated from the sample while the form of the distribution to be fitted would have to be chosen in the light of regional experience. Indeed, in view of the very large variations that occur in Cv estimated from small samples drawn from a parent population, the estimation of a second parameter from small samples is of doubtful validity.
In addition, if a three parameter distribution is adopted the third parameter, skewness or a function of it, cannot be estimated from the small sample because of the high sampling variance involved. Hence an average regional value of skewness must be used. Such a practice is currently prevalent in the United States of America (USWRC, 1981) based on a map of "generalized" log-skew. This practice is not condoned however by all hydrologists in that country and is strongly opposed by some (see Wallis, 1980), on the grounds of serious statistical shortcomings (Landwehr et al., 1978). In the U.K. NERC (1975) recommended that if only a short record were available, Q̄ be estimated from it and that a regional multiplier QT/Q̄ be used in conjunction with it. This of course not only assumes a constant regional skewness but also a constant regional coefficient of variation.
Thus in general, one or two parameters of the flood population at the required site are obtained from the at-site record and the remainder of the required information is obtained by some regional averaging procedure.
4.4.2.1
Index flood procedure
The variate Q is divided by Q̄ and the resulting variate X = Q/Q̄ is assumed to have the same form of distribution at every site. In this context Q̄ is the index flood. The parameters of the distribution of X are obtained from the combined regional data sets and Q̄ for any application is obtained from at-site data. The quantile QT is then estimated as
$$\hat{Q}_T = \bar{Q} \, X_T \qquad (4.13)$$
The parameters of the X distribution may be obtained by
(a) Regional averaging of dimensionless at-site quantile estimates QT/Q̄ (Dalrymple, 1960).
(b) Regional averaging of dimensionless moments such as Cv (Nash and Shaw, 1965) and/or Cs (USWRC, 1976, 1977, 1981).
(c) Regional averaging of dimensionless at-site order statistics X(i) = Q(i)/Q̄ and fitting a distribution to these either graphically or numerically by a regression-probability plot approach or by probability weighted moments. A variation of this, adapted for use with records of unequal length, was described by NERC (1975, 1.2.6). The X(i) quantities were also used by Houghton (1978) to estimate a regional "righteous" Wakeby distribution from 46 US flood records of length 60 years.
(d) Regional averaging of dimensionless PWMs, M(1)/M(0), M(2)/M(0), M(3)/M(0), as outlined by Wallis (1980). (Note that M(0) = Q̄ = sample mean.) This method has been found to be easy to apply and is robust and efficient. An example of its use is given in Appendix 5.
(e) Pooling all x = Q/Q̄ values (xij = Qij/Q̄j, i = 1, 2, ..., Nj; j = 1, 2, ..., M) and treating them as a single sample from a population for parameter estimation purposes. This could be called a "station year" approach.
In (a) to (d) statistics are averaged over sites without invoking the space-time equivalence principle encompassed in the station year approach (e). Hence (a) to (d) avoid most undesirable consequences of interstation dependence or correlation. Procedure (d) is demonstrated by way of numerical example in Appendix 5.
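The following sketch illustrates procedure (d) combined with equation (4.13) under simplifying assumptions that are not taken from the report: the regional distribution of X is taken to be EV1 (Wallis (1980) worked with higher order PWMs), the PWM convention is b_r = estimate of E[Q F^r], and at-site ratios are averaged with record-length weights. All function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def pwm_b0_b1(sample):
    """Sample mean b_0 and first PWM b_1 (unbiased rank weights)."""
    q = sorted(sample)
    n = len(q)
    b0 = sum(q) / n
    b1 = sum((i - 1) / (n - 1) * qi for i, qi in enumerate(q, start=1)) / n
    return b0, b1


def regional_growth_curve(records):
    """EV1 parameters (u_x, alpha_x) of X = Q / Qbar from several AM records.

    The dimensionless ratio b_1 / b_0 is averaged over sites, weighted by
    record length (an assumption), and X is constrained to have mean 1.
    """
    total_years = sum(len(r) for r in records)
    t1 = 0.0
    for r in records:
        b0, b1 = pwm_b0_b1(r)
        t1 += len(r) * (b1 / b0)
    t1 /= total_years
    alpha_x = (2.0 * t1 - 1.0) / math.log(2.0)
    u_x = 1.0 - EULER_GAMMA * alpha_x
    return u_x, alpha_x


def index_flood_quantile(site_mean, u_x, alpha_x, T):
    """Equation (4.13): at-site mean times the regional growth factor X_T."""
    y_T = -math.log(-math.log(1.0 - 1.0 / T))
    return site_mean * (u_x + alpha_x * y_T)
```

Because only dimensionless ratios are averaged, two sites whose records differ by a constant multiple contribute identical information to the growth curve.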
4.4.2.2
Bayesian Procedures
The use of Bayes' Theorem for combining prior and sample flood information was introduced by Bernier (1967). Cunnane and Nash (1971) showed how it could be used to combine regional estimates of Q̄ and Cv obtained from catchment characteristics, using a bivariate log-normal distribution for Q̄ and Cv, and site data assumed to be EV1 distributed to give a posterior distribution for QT. This method involves considerable amounts of numerical integration.
Wood (1974) discusses the manner in which uncertainty about model type can be taken into account by use of
Bayes' Theorem while Wood and Rodriguez-Iturbe (1975) give a specific example of Bayesian analysis which uses
regional and site hydrological data, flood cost and damage functions to determine the optimum size of flood protection
works. They assume that floods are log-normally distributed and use conjugate distribution theory, rather than extensive
numerical methods, to obtain the posterior distribution of the log-normal distribution parameters.
Kuczera (1982) has given a very thorough description of a general framework in which regional and site
specific information can be combined in Bayesian analysis to give a posterior distribution of a flood quantity, together
with a study of risk. This he describes as the empirical Bayes (EB) approach. Kuczera (1983) extends this to the study
of the effect of spatial correlation and sampling uncertainty on the above EB procedure.
It should be noted that while application of a Bayesian method involves more algebraic development than a
conventional treatment, the final estimate is capable of being interpreted as a linear combination of a prior (regional) and
a sample (site record) estimate. This can be exactly so in the case of prior and sample distributions both being normal.
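The linear-combination interpretation can be illustrated for the normal case. The sketch below assumes a normal prior (regional) estimate and a normal at-site estimate with known variances, so that the posterior mean is a precision-weighted average; the function name and arguments are illustrative only.

```python
def combine_normal(prior_mean, prior_var, site_mean, site_var):
    """Posterior mean and variance for a normal prior and a normal sample estimate.

    site_var is the variance of the at-site estimate itself (for a sample
    mean, the population variance divided by the record length).
    Returns (posterior mean, posterior variance, weight given to the prior).
    """
    prior_precision = 1.0 / prior_var
    site_precision = 1.0 / site_var
    w = prior_precision / (prior_precision + site_precision)
    post_mean = w * prior_mean + (1.0 - w) * site_mean
    post_var = 1.0 / (prior_precision + site_precision)
    return post_mean, post_var, w
```

As the record lengthens, `site_var` falls, the weight `w` on the regional prior tends to zero and the posterior estimate approaches the at-site estimate.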
4.4.2.3
Two component extreme value (TCEV) procedure.
This procedure (Rossi et al., 1984) is a regional flood frequency procedure which uses a 4-parameter distribution. It is intended for use in conditions where some floods are considered to be caused by a different physical mechanism which tends to produce outliers, i.e. abnormally large floods which are apparently not consistent with the remainder of the series. Ideally, such events should be recognisable by their physical causes, in which case two of the parameters would be estimated from "ordinary" flood events and the other two would be estimated from the "extraordinary" flood events. If there are insufficient of the latter available at any site then these two latter parameter values would have to be estimated from regionally pooled and standardised "extraordinary" values.
In the case of Italian data however Rossi et al. (1984) were unable to identify, by cause, floods of the two types even though they were able to establish that outliers exist in their data set. They thus resorted to estimation of parameters without decomposition of the data into two types. Because of the small number of extraordinary events at any site it was also necessary to adopt a regional approach, which entails some sort of standardisation method to remove the effect of differences in catchment sizes and other characteristics.
The TCEV distribution can be written as
$$F_Q(q) = \exp\left[-\Lambda_1 e^{-q/\theta_1} - \Lambda_2 e^{-q/\theta_2}\right] \qquad (4.14a)$$
$$= \exp\left[-e^{-(q - \varepsilon_1)/\theta_1}\right] \cdot \exp\left[-e^{-(q - \varepsilon_2)/\theta_2}\right] \qquad (4.14b)$$
where ε1 = θ1 ln Λ1 and ε2 = θ2 ln Λ2. This represents the product of two EV1 distribution functions which indicates that Q is the maximum of two independent EV1 variates. The observed data at each site are standardised by
$$y' = \frac{q - \hat{\varepsilon}_1}{\hat{\theta}_1} \qquad (4.15)$$
where (ε̂1, θ̂1) are ML estimates of EV1 parameters obtained from the trimmed AM sample following omission of outliers previously detected in a specified way. Y' is distributed in a similar way to Q,
$$F_{Y'}(y') = \exp\left[-\Lambda'_1 e^{-y'/\theta'_1} - \Lambda'_2 e^{-y'/\theta'_2}\right] \qquad (4.16)$$
with modified parameters,
(4.17)
The jth sample of M stations in a region provides Nj values of y', viz. [y'ij, i = 1, 2, ..., Nj]. The station year assumption is then invoked and L = N1 + N2 + ... + Nj + ... + NM values of Y' are assumed to form a single random sample from a TCEV distribution. The parameter values are estimated by the method of maximum likelihood, using techniques originating with Hasselblad (1969). Having estimated regional parameters for y', regional quantiles y'T can be obtained from which at-site estimates can be obtained by inverting equation (4.15)
$$\hat{Q}_{T,j} = \hat{\varepsilon}_{1j} + \hat{\theta}_{1j} \, y'_T \qquad (4.18)$$
where (ε̂1j, θ̂1j) are values of (ε̂1, θ̂1) obtained from the trimmed data set at site j.
The above method has been applied to 39 Italian AM series totalling 1525 station years by Rossi et al. (1984), and by Beran et al. (1986) to 57 British AM series totalling 2334 station years. In both cases random samples from the fitted regional distribution gave skewness values which were as variable as the skewness of the observed data, thus accounting for the condition of separation of skewness of Matalas et al. (1975).
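The TCEV distribution function of equation (4.14a) has no closed-form inverse, so quantiles have to be found numerically. The sketch below evaluates the distribution function and inverts it by bisection; it assumes the four parameters are already known and does not attempt the ML estimation described above. The function names are illustrative.

```python
import math


def tcev_cdf(q, lam1, theta1, lam2, theta2):
    """Equation (4.14a): F(q) = exp[-lam1 exp(-q/theta1) - lam2 exp(-q/theta2)]."""
    return math.exp(-lam1 * math.exp(-q / theta1) - lam2 * math.exp(-q / theta2))


def tcev_quantile(T, lam1, theta1, lam2, theta2, tol=1e-9):
    """Solve F(q) = 1 - 1/T for q >= 0 by bisection."""
    target = 1.0 - 1.0 / T
    if target < tcev_cdf(0.0, lam1, theta1, lam2, theta2):
        raise ValueError("return period too small: quantile is not positive")
    lo, hi = 0.0, max(theta1, theta2)
    # Expand the bracket until it contains the required quantile.
    while tcev_cdf(hi, lam1, theta1, lam2, theta2) < target:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if tcev_cdf(mid, lam1, theta1, lam2, theta2) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Setting `lam2 = 0` removes the second component, in which case the result should agree with the EV1 quantile θ1 [ln Λ1 - ln(-ln(1 - 1/T))].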
4.4.3
USE OF REGIONAL DATA ALONE
When no site data are available there are currently three possible procedures. The first of these uses the index flood approach in the following three steps and was formalized by Dalrymple (1960).
(a) Establish a dimensionless XT - T relation, where XT = QT/Q̄, by one of the methods of Section 4.4.2.1;
(b) Estimate Q̄ from a regionally calibrated relation between Q̄ and physically measurable catchment characteristics (Benson, 1962; Nash and Shaw, 1965; Thomas and Benson, 1970; NERC, 1975; Stedinger and Tasker, 1985, 1986); and
(c) Use equation (4.13) to obtain Q̂T.
The form of relation adopted for estimating Q̄ depends on the amount of physical and climatic data available. A considerable range of numerically expressed catchment characteristics have been described by Benson (1962) and a wide range of logarithmic regression equations for both high and low flows have been given by Thomas and Benson (1970).
A second approach, used by Nash and Shaw (1966) and also by the US Corps of Engineers, is to summarise each observed streamflow record in a region by the mean Q̄ and coefficient of variation Cv of its annual maximum flood series and to derive separate relations, by logarithmic regression, between these two quantities and catchment characteristics. Using the catchment characteristic values of an ungauged catchment, estimates of Q̄ and Cv are obtained. An assumption is then made that the AM population is distributed according to some two parameter distribution such as EV1, gamma or LN2 and the distribution is fitted by the first two moments obtained via Q̄ and Cv. However while a statistically significant relation between Cv and catchment characteristics may be obtained, the resulting equation usually has a prohibitively large standard error.
A third approach is to estimate QT separately at each site in the region, by fitting a distribution to the AM data for that site, for a selection of T values such as 2, 5, 10, 25, 50 and 100 years. For each T a logarithmic regression relation is established between Q̂T and the catchment characteristics. Then when Q̂T is required for an ungauged catchment, the values of the relevant catchment characteristics are inserted in the derived equation and Q̂T is evaluated. This approach has the disadvantage that many sets of parameters have to be estimated. There is considerable sampling error in each relation derived and the ratio of Q̂T1 to Q̂T2 may not be realistic. It could occur that Q̂T1/Q̂T2 > 1 even though T1 < T2. However this latter condition rarely occurs in practice (W.O. Thomas Jr., 1988, Pers. Comm.) and can usually be avoided by using the same catchment characteristics for each value of T. This third method is widely used in the U.S.A., especially by the U.S. Geological Survey (Thomas, 1987; Tasker, 1987). The paper by Thomas (1987) contains a list of more than 60 current reports for individual states in the U.S.A. which give equations relating QT to catchment characteristics, for instance Eychaner (1984), Simmons and Carpenter (1978), Bridges (1982) to mention but a few; a list of 13 reports detailing methods of estimating peak floods from channel geometry, for instance Hedman and Riggs (1978) and Osterkamp (1982); and a list of 26 reports which include methods for urbanized catchments, for instance Olin and Bingham (1982) and Conger (1986).
4.5
Specification of design flood quantile
Stedinger (1983b) draws attention to the fact that two formulations exist for the design flood quantile. The first or traditional formulation is to seek the best sample or regional estimate of
$$Q_T = \mu + \sigma K_T \qquad (4.19)$$
where μ and σ are population mean and standard deviation and KT is a frequency factor (Chow, 1951) dependent on the form of the distribution and on T. If μ̂ and σ̂ are statistical estimates of μ and σ then the traditional estimate of QT is
$$\hat{Q}_T = \hat{\mu} + \hat{\sigma} K_T \qquad (4.20)$$
The second formulation takes into account Beard's (1960) observation that the expected exceedance probability, over many equally sized samples, is not 1/T as is required. In the case of Q being a normal variate Beard (1960) showed that the expression
$$\hat{Q}_T = \hat{\mu} + \hat{\sigma} \, [1 + (1/N)]^{1/2} \, t_{N-1} \qquad (4.21)$$
does have the required exceedance probability 1/T when averaged over many samples. Here N is sample size and tN-1 is a
Student t variate with N-l degrees of freedom. The use of equation (4.21) instead of equation (4.20) is known as Beard's
expected probability correction. Stedinger (1983b) also shows that equation (4.21) is equivalent to the result obtained
by Bayesian analysis using a non-informative prior distribution for the parameters. Moran (1957) also derived equation
(4.21) for the design value by considering the joint distribution of the observed sample mean and standard deviation and
a future, as yet unobserved, value from the population.
This concept, well developed for the normal distribution case, should not be lost sight of when other
distributions are being used. It is automatically taken care of in a Bayesian estimation framework.
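The difference between equations (4.20) and (4.21) can be made concrete for the normal case. The sketch below uses only the Python standard library; since that library has no Student t quantile, one is obtained here by Simpson integration of the t density and bisection, which is an implementation choice made for this illustration and not a method from the report. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Student t density with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(t <= x) for x >= 0, by Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    s = t_pdf(0.0, df) + t_pdf(x, df)
    for k in range(1, steps):
        s += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + s * h / 3


def t_quantile(p, df):
    """Quantile of Student t for p > 0.5, by bisection."""
    if not 0.5 < p < 1:
        raise ValueError("p must lie in (0.5, 1)")
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def traditional_estimate(mean, sd, T):
    """Equation (4.20) with the normal frequency factor K_T."""
    return mean + sd * NormalDist().inv_cdf(1 - 1 / T)


def beard_estimate(mean, sd, n, T):
    """Equation (4.21): expected probability estimate for a normal variate."""
    return mean + sd * math.sqrt(1 + 1 / n) * t_quantile(1 - 1 / T, n - 1)
```

For short records the expected probability estimate exceeds the traditional one, and the two converge as N increases.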
4.6
Regional Homogeneity
Regional flood estimation methods are based on the premise that a standardised flood variate, such as X = Q/E(Q), has the same distribution at every site in the chosen region. In particular Cv(x) and Cs(x) are considered to be constant across the region. Serious departures from such assumptions could lead to biased flood estimates at some sites. Those catchments whose Cv and Cs values happen to coincide with the regional mean values would fortuitously not suffer such bias. Nevertheless if the degree of heterogeneity present is not too great its negative effect may be more than compensated for by the larger sample of sites contributing to parameter estimates. Thus XT estimated from M sites, which are slightly heterogeneous, may be more reliable than XT estimated from a smaller number, say M/3, of more homogeneous sites, especially if flow records are short.
At least five categories of questions arise in this context (Cunnane, 1987):
(a) is flood frequency behaviour of any one of M sites in a region, with AM records available, inconsistent with that of the remainder of the group?
(b) are geographically defined regions better or worse than regions obtained by partitioning the catchment characteristic data space?
(c) how can a large group of catchments be divided into homogeneous sub-groups or regions?
(d) how can an ungauged catchment be allocated to one of a number of pre-selected homogeneous regions?
(e) what degree of departure from regional homogeneity can be tolerated in a flood quantile estimation procedure?
The present state of knowledge about these questions is based on the work of Langbein (1947), Benson (1962b), Cole (1966), Biswas and Fleming (1966), De Courcey (1972), White (1975), Mosley (1981), Beable and McKercher (1982), Tasker (1982), Farhan (1984), Acreman and Sinclair (1984, 1986), Wiltshire (1985, 1986a,b,c) and Wiltshire and Beran (1986). The more recent of these have been surveyed in Cunnane (1987).
Geographical regions are convenient and may divide a country into disparate regions by chance, because of variation of soils, climate and topography with latitude and longitude, but in general the geographical proximity of two catchments is no guarantee that they are similar from the flood frequency point of view. Wiltshire (1986a,b,c) shows that geographical regions in Britain do not display as much internal homogeneity nor as much heterogeneity between regions as regions derived by other methods proposed by him. Acreman and Sinclair (1986) draw similar conclusions using Scottish data.
In seeking to divide a country into regions which are internally homogeneous but mutually heterogeneous, Wiltshire (1985) divides all his catchments into two or more groups by partitioning on one or more catchment characteristics. Thus each step consists of a trial set of regions and the internal homogeneity and mutual heterogeneity of these regions is numerically expressed in terms of a flow statistic such as Cv. The process is repeated by altering the partitioning points in catchment characteristic space until an acceptable set of regions has been identified. This is clearly a computationally intensive procedure.
Acreman and Sinclair (1986), on the other hand, seek homogeneous regions by using a clustering algorithm
which allocates catchments to regions by identifying clusters of catchments in catchment characteristic data space. The
regions thus identified are tested for dissimilarity by the use of a likelihood ratio test on the assumption that data are
GEV distributed.
At this time it is not possible to define regions without some possibility of misallocation of catchments to incorrect regions. If an ungauged or poorly gauged catchment is to be assigned to one of several previously defined regions then a new problem arises unless the regions are defined geographically. Multivariate discriminant analysis, suggested by Mosley (1981), has been used by Wiltshire (1986a,b,c) for British catchments. In this method a probability of membership of region k, Pk, can be calculated for any catchment whose catchment characteristic vector is given, using coefficients determined during the allocation of gauged catchments to regions. The standardized quantile estimate for that catchment is then given by
$$\hat{x}_T = \sum_k P_k \, x_{Tk} \qquad (4.22)$$
where summation is over all regions and xTk is the kth region estimate of XT = QT/Q̄.
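The probability-weighted estimate just described amounts to a weighted average of the regional growth factors. A minimal sketch, with a hypothetical function name and assuming the membership probabilities have already been obtained from the discriminant analysis:

```python
def weighted_regional_quantile(membership_probs, regional_x_T):
    """Sum over regions of P_k times the kth region estimate of X_T."""
    if len(membership_probs) != len(regional_x_T):
        raise ValueError("one probability is needed per region")
    if abs(sum(membership_probs) - 1.0) > 1e-6:
        raise ValueError("membership probabilities must sum to 1")
    return sum(p * x for p, x in zip(membership_probs, regional_x_T))
```

A catchment allocated with certainty to one region (P_k = 1) simply receives that region's estimate.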
Finally it may be stated briefly here that in regions of relatively low Cv (< 0.6 say) a small degree of heterogeneity does not negate the benefits of using an at-site/regional estimating method. However in extremely heterogeneous regions (Cv > 0.6 and coefficient of variation of Cv > 0.2) use of a single regional estimate of XT = QT/Q̄ will lead to severe positive or negative bias for some catchments (Lettenmaier and Potter, 1985). On the other hand, at-site estimates from short records in high Cv regions have such large standard error as to be almost worthless.
4.7
Flood estimation in arid and semi-arid zones
Arid and semi-arid zones contain some streams which may have no flow for periods of time which sometimes extend to a number of years. Thus the usual AM model assumptions are not true and statistical problems arise. (Measurement problems are also considerable in such zones.) According to Yevjevich (1979) arid zone precipitation series have similar statistical properties to truncated humid zone series, in which case the sequence of all peaks in an arid zone river might be expected to look like a PD series from a humid zone river. Thus PD models are very appropriate for arid zone hydrology when used with a regional standardization and parameter estimating scheme.
In the absence of readily available PD data, AM models have to be adapted to take into account years having one or more zero flood values. A distribution may be fitted to the non-zero values and for selected flood values, Q, the conditional exceedance probability P'(Q) is calculated from the fitted distribution. The conditional probability is converted to unconditional probability by multiplying by the probability of a non-zero flood year
$$P(Q) = P'(Q) \cdot (m/N) \qquad (4.23)$$
where P(Q) is the unconditional probability in the annual series and m/N is the probability of a non-zero flood year, m being the number of non-zero flood years in N years of record. A more complex version of this type of adjustment, based on the work of Jennings and Benson (1969), is recommended by USWRC (1981). The above type of treatment is more correct and reliable (Beard, 1974) than replacing zero values by some arbitrary non zero value such as 1%Q.
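The adjustment of equation (4.23) and its use in reverse can be sketched as follows. The second function, which gives the conditional exceedance probability needed to obtain a T-year flood, is a direct rearrangement of (4.23) added here for illustration; both function names are hypothetical.

```python
def unconditional_exceedance(p_conditional, m, n):
    """Equation (4.23): P(Q) = P'(Q) * m / N, with m non-zero years out of N."""
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= n")
    return p_conditional * m / n


def conditional_target(T, m, n):
    """Conditional exceedance probability P'(Q) corresponding to return period T."""
    p = (1.0 / T) * n / m
    if p >= 1.0:
        raise ValueError("return period too short for this fraction of zero years")
    return p
```

For instance, with 30 non-zero years in a 40-year record, the 100-year flood is the value exceeded with conditional probability `conditional_target(100, 30, 40)` in the distribution fitted to the non-zero values.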
Dalin (1986) has pointed out that AM series in arid zone streams in the Middle East may consist of three categories of data
(a) zero or drought values, 10 - 20% of values
(b) normal flood values, 60 - 80% of values
(c) extreme flood values, 10 - 20% of values
These percentages depend on the region and degree of aridity, and the percentage of zero values may be as high as 50% in Southwestern United States (Thomas, 1988, Pers. Comm.). When plotted on probability paper these indicate three regimes, the upper two of which are reminiscent of the TCEV model of Rossi et al. (1984) and of Potter's (1958) "upper and lower frequency" curves. Dalin demonstrates how the upper "extreme" values may be standardized by a statistic calculated from the "normal" flood values, of which there are a greater number. The standardized "extreme" values are then pooled regionally and have a simple exponential distribution fitted to them. This may also be viewed as an index flood method applied to truncated data.
Floods in arid zones arise mainly from intense convective thunderstorms of very limited areal extent which thus affect catchments randomly with little spatial pattern or coherence. Some minor floods occur as a result of other lower intensity rainfalls, leading to very high values of Cv for arid zone flood series (see McMahon (1979), Wallis (1982)). Long records are essential for such circumstances but in their absence some form of regionalization technique such as those of Section 4.4.2 above, or an adaptation of Dalin's (1986) approach, or a regionally calibrated PD model is essential.
CHAPTER 5
PROPERTIES OF QUANTILE ESTIMATORS
5.1
Sources of error of estimation
The estimated value, Q̂T, may differ from the true value QT because of
(a) Inability of model chosen (AM or PD) to reproduce the population Q-T relation;
(b) Incorrect choice of distribution to describe the population within the chosen model type;
(c) Bias in the estimating procedure (if this is known to exist, a correction can be made for it);
(d) Sampling error due to the fact that parameters are estimated from a finite sample;
(e) The available record (sample) may not be a truly random sample from the required population. No control can be exercised over this even though tests can be made to test the reasonableness of the assumption.
In this chapter sources (a) and (e) will not be considered further. It is inevitable in all flood estimation schemes that sources (b), (c) and (d) contribute.
5.2
Properties of at-site quantile estimators
Investigations of these properties consider quantile estimates obtained from random samples under one of the following assumptions:
(a) the estimating method is based on knowledge of the form of the parent distribution;
(b) the estimating method is based deliberately on the assumption that the data have come from a distribution different in form from the true parent distribution. This enables the robustness of the estimating procedure to be examined.
In either case the results are expressible in terms of bias, standard error and root mean square error. In case (a), the required expressions are obtainable analytically for some methods of estimation while for others recourse must be had to Monte Carlo simulation methods. In case (b) analytical methods are too intractable and simulation methods are always necessary. Several investigations of the above types have been published and Table 5.1 lists some of the relevant references.
In general, standard error of estimate increases with T, population Cv and Cs values and is inversely proportional to the square root of sample size:
$$se(\hat{Q}_T) = \frac{\sigma \, \phi(T, C_s)}{N^{1/2}} \qquad (5.1)$$
where σ is population standard deviation and φ( ) is a function which depends on the form of the parent distribution and on the method of parameter estimation. φ( ) would also depend on the form of the estimating distribution if different from the parent.
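Where analytical expressions are not available, the bias, standard error and root mean square error of a quantile estimator can be approximated by Monte Carlo simulation. The sketch below does this for case (a) with an EV1 parent and moments estimation; the parent parameter values, number of replicates and function names are arbitrary choices made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def ev1_sample(rng, u, alpha, n):
    """n independent EV1 variates by inversion of F(q) = exp(-exp(-(q - u)/alpha))."""
    out = []
    while len(out) < n:
        p = rng.random()
        if p > 0.0:                      # guard against log(0)
            out.append(u - alpha * math.log(-math.log(p)))
    return out


def fit_ev1_moments(sample):
    """EV1 (u, alpha) from the sample mean and standard deviation."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    alpha = math.sqrt(6.0 * var) / math.pi
    return mean - EULER_GAMMA * alpha, alpha


def ev1_quantile(u, alpha, T):
    return u - alpha * math.log(-math.log(1.0 - 1.0 / T))


def simulate(n, T, u=100.0, alpha=30.0, reps=2000, seed=1):
    """Return (bias, standard error, rmse) of the moments estimate of Q_T."""
    rng = random.Random(seed)
    true_q = ev1_quantile(u, alpha, T)
    est = []
    for _ in range(reps):
        u_hat, a_hat = fit_ev1_moments(ev1_sample(rng, u, alpha, n))
        est.append(ev1_quantile(u_hat, a_hat, T))
    mean_est = sum(est) / reps
    bias = mean_est - true_q
    se = math.sqrt(sum((e - mean_est) ** 2 for e in est) / (reps - 1))
    rmse = math.sqrt(sum((e - true_q) ** 2 for e in est) / reps)
    return bias, se, rmse
```

Running `simulate` for several values of N and T gives an empirical picture of how the standard error grows with T and shrinks with record length. A robustness study of type (b) would follow the same pattern with the samples drawn from a different parent.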
Tables 5.2 (a), (b) and (c) show a limited selection of standard error values adapted or taken from published sources. Table 5.2 (a) shows in the simple case of a two parameter distribution (EV1) how se(Q̂T) varies with N, T and parent Cv. For simplicity Table 5.2 (a) is based on ordinary moments estimation. Slightly different values would be obtained using other methods of estimation (Lowery and Nash, 1970; Landwehr et al., 1980; Fiorentino and Gabriele, 1984; Hosking, 1985). Table 5.2 (b) gives selected results from Hosking et al. (1985b) for GEV populations and shows how se(Q̂T) varies with N, T and method of parameter estimation for a fixed value of parent skewness.
CHAPrER5
Model Distribution
Extreme Value
Type I
Lognormal
Pearson Type 3
Log Pearson 3
General Extreme
Value (GEV) and
LogEVI
Wakeby
TCEV
•
••
35
PROPERTIES OF QUANTILE ESTIMATORS
At-site
Regional
•
•
•
••
•
••
••
•
•
•
••
••
••
•
•
•
•
•
••
•
•
••
••
•
§
§§
§§
§§
Reference
Kaczmarek (1957)
Nash and Amorocho (1966)
Lowery and Nash (1970)
Landwehr et al. (1979a)
Greis and Wood (1981)
Fiorentino and Gabrielle (1984)
Lettenmaier ~ (1985, 1987)
Lettemnaier and Potter (1985)
Kaczmarek (1957)
Sangal and Biswas (1970)
Burges l:Ul. (1975)
Kuczera (1982b)
Stedinger (1980)
Lettenmaier and Potter (1985)
Matalas and Wallis (1973)
BobCe (1973)
§§
Condie (1977)
Nozdryn-Plotnicki and Watt (1979)
Hoshi and Burges (1981)
Phien and Hsu (1984)
Wallis and Wood (1985)
§§
§
§§
Jenkinson (1969)
Hosking et al. (1985a)
Hosking et al. (1985b)
Wallis and Wood (1985)
Lettenmaier et al. (1985, 1987)
Arnell and Gabrielle (1985, 1988)
§
§§
§§
§§
Landwehr et al. (1979b,c)
Wallis (1980)
Hosking et al. (1985a)
Wallis and Wood (1985)
Arnell and Gabrielle (1985, 1988)
§§
§§
Arnell and Gabrielle (1985, 1988)
Arnell and Beran (1987)
•  Parent and assumed model distributions are the same.
•• Also tests model distribution under different parent distribution assumption(s) (robustness test).
§ and §§ for regional cases correspond to • and •• in at-site estimation.
Table 5.1. Selected references to investigations into sampling properties of quantile estimates.
Table 5.2 (c) gives selected results from Kuczera (1982a) in which a robustness study was carried out using a variety of
methods to estimate quantiles of a Wakeby parent distribution. These results show that the 2-parameter estimators, LN2
and EV1, have smaller rmse values than the 3- and 4-parameter estimators, especially in small samples. A considerable
proportion of the rmse is due to bias in the 2-parameter estimators, while that proportion is almost negligible in the
3- and 4-parameter estimators. In other words, the more flexible 3- and 4-parameter estimators have negligible bias but
very large sampling error, while the less flexible 2-parameter estimators have considerable bias but much smaller
standard error. Similar conclusions were drawn by Lettenmaier et al. (1987) in relation to EV1 and GEV quantile
estimators, the latter having negligible bias and very large standard errors while the former exhibit tight confidence
intervals but with considerable bias when the population is GEV (k < 0). This type of result is displayed schematically
in Figure 5.1 (a) and (c).
[Figure 5.1: four schematic panels, each plotting bias/Q_T and [bias ± 1.96 se(Q̂_T)]/Q_T against return period T
(logarithmic axis).
Fixed skewness models (usually with 2 parameters): (a) model Cs < parent Cs, negative bias; (b) model Cs > parent Cs,
positive bias.
Variable skewness models (usually with more than 2 parameters): (c) at-site use, small bias, large se;
(d) regional (for x_T) + at-site, low bias, low se.]
Figure 5.1. Qualitative outline of simulation results about quantile estimates.
(a) and (b) show the effect of choosing a wrong distribution having fixed but incorrect skewness.
(c) and (d) show the effect of using a flexible distribution, i.e. low bias and large standard error (= se) when
used in at-site mode, but very much reduced se as well as low bias when used in at-site/regional mode.
                         se(Q̂_T) and [se(Q̂_T)/Q_T]% for sample size
  Cv     T      Q_T      N = 10         N = 50          N = 100
  0.3    10    139.3     19.8 [14]      8.7 [6.0]       6.3 [4.5]
  0.3   100    194.2     37.3 [19]     16.7 [8.5]      11.8 [6.1]
  0.6    10    178.6     39.6 [22]     17.7 [10.0]     12.5 [7.0]
  0.6   100    288.4     74.6 [26]     33.3 [11.5]     23.6 [8.2]
  0.9    10    217.9     59.5 [27]     26.6 [12.2]     18.8 [8.6]
  0.9   100    382.6    111.8 [29]     50.0 [13.1]     35.4 [9.3]

(a) Standard error of Q̂_T estimated by EV1/MOM in samples of size 10, 50 and 100 from EV1
populations, with μ = 100 and Cv = 0.3, 0.6 and 0.9.

                               se(Q̂_T) and [se(Q̂_T)/Q_T]%
  T      Q_T    Sample size    PWM           ML            JS
  10     2.84       15         0.97 [34]     1.42 [50]     0.97 [34]
                    50         0.54 [19]     0.62 [22]     0.54 [19]
                   100         0.40 [14]     0.43 [15]     0.40 [14]
  100    7.55       15         4.15 [55]     Not quoted    4.23 [56]
                    50         2.49 [33]     3.02 [40]     2.49 [33]
                   100         1.81 [24]     1.89 [25]     1.81 [24]

(b) Standard error of Q̂_T estimates by GEV/PWM, GEV/ML and GEV/JS from GEV samples.
Parent population parameters u = 0, α = 1, k = -0.2 (μ = 1.16, Cv = 0.31, Cs = 3.54).
(From Hosking et al., 1985b, Table 7). (JS = Jenkinson's (1969) method of sextiles).

                    Sample size 15                        Sample size 30
  Estimator         rmse    % of Q_T   % due to bias      rmse    % of Q_T   % due to bias
  LN2/ML            0.78      26           26             0.06      20           35
  LP3               1.20      40            2             0.82      28            3
  EV1/ML            0.75      26           64             0.66      22           76
  log EV1/MOM       1.44      49           25             1.18      40           44
  WAK-4/PWM         1.03      35            1             0.76      26            1

(c) Root mean square error of Q̂_T estimates for T = 100 by five estimators in samples of sizes 15 and 30
from Wakeby parent population with μ = 100, Cv = 0.52, Cs = 2.42 and Q_100 = 2.936.
(Adapted from Tables 1, 3, 4 of Kuczera, 1982b)

Table 5.2. Standard error of selected quantile estimators showing dependence on sample size and on some
population parameters.
5.3
Properties of at-site/regional quantile estimators
While at-site/regional estimators such as index flood methods (Dalrymple, 1960; NERC, 1975; Beable and
McKercher, 1982) have been widely used, systematic investigation of their sampling properties was not
undertaken until this decade (Wallis, 1980; Greis and Wood, 1981; Kuczera, 1982b; Hosking et al., 1985a; Wallis and
Wood, 1985; Lettenmaier et al., 1987; Lettenmaier and Potter, 1985; Arnell and Gabrielle, 1985, 1988). The
circumstances investigated in these tests can be seen in Table 5.1. Lettenmaier (1985) gives a lucid account of results
available up to mid-1985.
All such investigations have been conducted by simulation methods. Such tests generally proceed along the
following lines.
(a) Select a flood-like parent distribution;
(b) Select the estimators to be tested, i.e. model distributions and methods of parameter estimation;
(c) Select a hypothetical region, i.e. number of stations, M, and length of record at each station;
(d) Select parent distribution parameters so as to produce regions with required Cv and Cs, with or
    without regional homogeneity;
(e) For each selected parent:
    (i) Generate a region-full of data,
    (ii) For each estimator calculate Q̂_T for each site, for different selected values of T,
    (iii) Repeat (i) and (ii) a large number (usually at least 1000) of times,
    (iv) For each estimator and return period calculate bias, se and rmse of Q̂_T;
(f) Compare the estimators on the basis of the results of (e)(iv).
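The steps listed above can be sketched in a few lines for the simplest possible case. The sketch below is an
illustration only and makes several assumptions not taken from the report: a single site rather than a region, an
EV1 parent with mean 100 and Cv = 0.3, two 2-parameter estimators (EV1 fitted by moments, and LN2 fitted by
moments of the logarithms), and hypothetical function names throughout.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649


def ev1_factor(T):
    """EV1 (Gumbel) frequency factor for return period T."""
    return -(math.sqrt(6.0) / math.pi) * (
        EULER_GAMMA + math.log(math.log(T / (T - 1.0)))
    )


def draw_ev1(mean, sd, n, rng):
    """Step (e)(i): draw n annual maxima from an EV1 parent."""
    alpha = sd * math.sqrt(6.0) / math.pi
    u = mean - EULER_GAMMA * alpha
    sample = []
    while len(sample) < n:
        p = rng.random()
        if p > 0.0:  # guard against log(0)
            sample.append(u - alpha * math.log(-math.log(p)))
    return sample


def ev1_mom(sample, T):
    """EV1 quantile estimate by ordinary moments."""
    return statistics.mean(sample) + ev1_factor(T) * statistics.stdev(sample)


def ln2_log_moments(sample, T):
    """LN2 quantile estimate from the mean and sd of the logarithms."""
    logs = [math.log(x) for x in sample]
    z = statistics.NormalDist().inv_cdf(1.0 - 1.0 / T)
    return math.exp(statistics.mean(logs) + z * statistics.stdev(logs))


def run_experiment(mean=100.0, cv=0.3, n=50, T=100, reps=2000, seed=1):
    rng = random.Random(seed)
    true_q = mean * (1.0 + cv * ev1_factor(T))        # step (a): parent quantile
    estimators = {"EV1/MOM": ev1_mom, "LN2": ln2_log_moments}  # step (b)
    estimates = {name: [] for name in estimators}
    for _ in range(reps):                             # step (e)(iii)
        sample = draw_ev1(mean, mean * cv, n, rng)    # step (e)(i)
        for name, fit in estimators.items():          # step (e)(ii)
            estimates[name].append(fit(sample, T))
    results = {}
    for name, q in estimates.items():                 # step (e)(iv)
        bias = statistics.mean(q) - true_q
        se = statistics.pstdev(q)
        rmse = math.sqrt(sum((v - true_q) ** 2 for v in q) / len(q))
        results[name] = {"bias": bias, "se": se, "rmse": rmse}
    return true_q, results


if __name__ == "__main__":
    true_q, results = run_experiment()
    print("true Q_T:", round(true_q, 1))
    for name, r in results.items():                   # step (f)
        print(name, {k: round(v, 2) for k, v in r.items()})
```

Because the LN2 estimator is applied to data from an EV1 parent, its results illustrate the kind of robustness
test described in the text, whereas the EV1/MOM results correspond to the case where parent and model coincide.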
The selection of steps (a) to (d) determines the range of applicability of the results. A wide variety of parent
distributions have been used in such tests. Wakeby, GEV and TCEV parents are considered to be hydrologically most
challenging when testing the robustness of both 2-parameter and more complex models. The robustness of a particular
model is tested by assessing its performance with data drawn from a different parent distribution.
Such experiments yield a large volume of results which are not easy to assimilate at a glance. The general
nature of results of selected experimental types is shown in Figure 5.1. Figure 5.1 (a) shows the type of result obtained
when a 2-parameter fixed skew model is used to estimate quantiles when data are drawn from a population with
higher skewness than that implied by the model. This usually results in negative bias, whether the model is used in
at-site or at-site/regional mode. When used in at-site mode it has much smaller se than a typical 3-parameter model,
Figure 5.1 (c), while the latter has much less bias. Corresponding but opposite remarks apply when the model skewness
exceeds that of the parent, Figure 5.1 (b).
Thus 2-parameter models have high efficiency but are biased, while in contrast 3-parameter models can be
unbiased but quite inefficient. The 2-parameter model bias can be damaging, while the 3-parameter inefficiency makes
3-parameter models unattractive for at-site use alone. In general the 3-, 4- and 5-parameter models used in regional
index flood mode with PWM estimation lead to results typified by Figure 5.1 (d) and outperform all other models in
respect of bias and especially efficiency (small se). It should be noted that ML estimation is usable only in regional
estimation schemes which make the station year assumption that all standardized flood values in a region can be
considered to be a single random sample. This assumption has not always been regarded as valid and hence station year
methods are not often used.
In his review of estimation methods Lettenmaier (1985) concludes: "regionalisation is the most viable way of
improving flood quantile estimation. The performance of regional PWM (index flood type) estimators for GEV and
WAK distributions, in particular, are so superior to the currently used institutional methods that no viable argument for
the continuation of the current practice is evident. Particularly where the flexibility of using a 3-parameter distribution
is required, the reduction in variability of flood quantile estimators achieved by proper regionalization is so large that
at-site estimators should not be seriously considered. The lone exception will be where no physically defensible region
can be identified. In such cases, it will be necessary to use a 2-parameter distribution, such as EV1 [or LN2], and
accept large estimation bias, or a 3-parameter distribution such as GEV [or LP3] and accept large estimation
variability."
An example of the reduction of se(Q̂_T) with increase in region size for homogeneous regions is given in Table
5.3. Since Lettenmaier's (1985) review, the use of the TCEV distribution as a flood parent and the use of the Rossi et al.
(1984) TCEV regionalization procedure have been investigated by Arnell and Gabrielle (1985, 1988) in studies in which
GEV and WAK parents and regional estimating methods were also used. It was found that the WAK/PWM at-site/regional
quantile estimating procedure performed very well over all parent distributions. GEV/PWM performed well with GEV
parents but noticeably less well with TCEV and WAK parents, a lack of flexibility in GEV seen also in the results of
Hosking et al. (1985a). The TCEV regional model had downward bias and relatively small se when the parent was
GEV, but had small bias and large se, particularly for large T, with TCEV and WAK parents. However, the TCEV
model may be realistic in arid zones as well as in two-flood-season humid regions and should not be lightly dismissed.
It deserves further testing in such conditions.
                         [se(Q̂_T)/Q_T]% for quantile return period
    L      M      N      T = 20      T = 100     T = 1000
    30      1     30       8.8        22.9        55.1
   150      5     30       4.1         9.9        20.3
   196     14     14       3.3         7.7        15.0
   300     10     30       3.0         7.2        14.4
   400     20     20       2.4         5.7        11.2
   600     20     30       2.1         4.9         9.8
   900     30     30       1.7         4.1         8.1

Table 5.3. Reduction in se(Q̂_T) with increase in region size using at-site/regional GEV/PWM estimating
procedure. Parent population is GEV (10, 4, -0.177) at all sites, i.e. a homogeneous region with
Cv = 0.53, Cs = 3.0. First row is for at-site GEV/PWM, others are for at-site/regional GEV/PWM
estimation. (M = No. of stations, N = No. of years at each station, L = M.N).
(Adapted from Hosking et al., 1985a, Table 3).
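The at-site/regional GEV/PWM procedure referred to in Table 5.3 can be sketched as follows. The sketch is a
simplified illustration under stated assumptions, not the exact procedure of the cited studies: unbiased sample
PWMs are used, site PWMs are standardized by the site mean and averaged with record-length weights, and the GEV
parameters are obtained from the widely used approximation k ≈ 7.8590c + 2.9554c². All function names are
hypothetical, and k = 0 (the EV1 case) is not handled.

```python
import math


def sample_pwms(record):
    """Unbiased sample probability weighted moments b0, b1, b2."""
    xs = sorted(record)
    n = len(xs)
    b0 = sum(xs) / n
    b1 = sum(j * v for j, v in enumerate(xs)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * v for j, v in enumerate(xs)) / (n * (n - 1) * (n - 2))
    return b0, b1, b2


def gev_from_pwms(b0, b1, b2):
    """GEV parameters (u, alpha, k) from PWMs; assumes k != 0."""
    c = (2.0 * b1 - b0) / (3.0 * b2 - b0) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    g = math.gamma(1.0 + k)
    alpha = (2.0 * b1 - b0) * k / (g * (1.0 - 2.0 ** (-k)))
    u = b0 + alpha * (g - 1.0) / k
    return u, alpha, k


def gev_quantile(u, alpha, k, T):
    """GEV quantile for return period T (non-exceedance probability 1 - 1/T)."""
    return u + alpha * (1.0 - (-math.log(1.0 - 1.0 / T)) ** k) / k


def regional_growth_factor(records, T):
    """x_T from record-length-weighted averages of standardized PWMs."""
    total = sum(len(r) for r in records)
    t1 = t2 = 0.0
    for r in records:
        b0, b1, b2 = sample_pwms(r)
        t1 += len(r) * b1 / b0
        t2 += len(r) * b2 / b0
    u, alpha, k = gev_from_pwms(1.0, t1 / total, t2 / total)
    return gev_quantile(u, alpha, k, T)


def index_flood_quantile(site_record, records, T):
    """At-site/regional estimate: site mean times regional growth factor."""
    site_mean = sum(site_record) / len(site_record)
    return site_mean * regional_growth_factor(records, T)
```

Only the site mean is estimated from the individual record; the shape of the growth curve is estimated from all
M·N station-years, which is the source of the reduction in standard error shown in the table.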
5.3.1
EFFECT OF REGIONAL HETEROGENEITY
Even where flood statistical behaviour is not regionally homogeneous, Lettenmaier and Potter (1985) and
Lettenmaier et al. (1987) have shown that at-site/regional procedures are still preferable to at-site procedures unless
the degree of heterogeneity is very large.
5.3.2
PERFORMANCE OF LP3 QUANTILE ESTIMATION METHODS
Following recommendations of a US Interagency Committee, LP3 was recommended for use by US Federal
Agencies (USWRC, 1967). The basis of the choice was goodness of fit to 10 long natural streamflow records. Beard
(1974) examined the ability of LP3 and of other models to predict rates of occurrence, in the second half of a flood
record, of quantiles estimated by each model from the first half of the record. A correction for expected probability
(Beard, 1960) was employed. On the basis of the results from 300 records, he concluded that only LP3 with regional
skew values, and LN2, were able to predict future frequencies without bias and that the former method was preferable.
The quantile estimating ability of LP3 has been examined for at-site use by Kuczera (1982b), who found that
not only are EV1 and LN2 more efficient quantile estimators than LP3 but 4-parameter WAK is also more efficient (see
Table 5.2 (c)). Wallis and Wood (1985) tested the regional LP3 quantile estimator as specified by USWRC (1981) and
found that when LP3 is the parent, the LP3 estimator is less precise than regional GEV/PWM or regional WAK/PWM.
The same conclusion holds, to a greater degree, when other parent distributions are considered, indicating that LP3 is
also lacking in robustness.
Table 5.4 shows examples of how GEV/PWM and WAK/PWM usually give smaller se(Q̂_T) than the USWRC
(1981) recommended regional LP3 procedure, especially where site population skewness is negative.
  Selected    Parent     LP3 with regional skew    GEV/PWM at-site/regional    WAK/PWM at-site/regional
  site no.    Cs(z)      bias       se(Q̂_T)       bias       se(Q̂_T)         bias       se(Q̂_T)
      9       -0.660      1.033     1.301          0.010     0.186            -0.096     0.200
     18       -0.349      0.435     0.611          0.032     0.161            -0.134     0.201
      1       -0.064      0.098     0.363          0.215     0.289             0.987     0.206
     14        0.216     -0.058     0.258          0.152     0.223             0.031     0.162
     16        0.441     -0.31      0.247          0.361     0.397             0.218     0.272

Table 5.4. Comparison of three regional estimating techniques for estimating the 500-year return period
flood. Data generated as LP3 for a 20-site heterogeneous region with -0.66 < Cs(z) < 0.44, where
Cs(z) = skewness in log space. Tabulated quantities are fractions of true quantile values at each site.
(Selected from Wallis and Wood, 1985, Table 3).
The conclusions drawn by Wallis and Wood (1985) have been contested by both Beard (1987) and Landwehr et
al. (1987). Beard (1987) disagrees that the 6-step strategy outlined above for testing quantile estimating methods is
appropriate and would promote the split sample test (Beard, 1974; USWRC, 1981) in its place. He also advocates
dealing with flows in the log-domain, rather than in the Q-domain, in order to reduce skewness. In their reply Wallis
and Wood (1987) disagree that the split sample technique is appropriate and state, "it is insufficient for identifying the
underlying distribution of floods", and that hence the modern emphasis is on selecting robust procedures. They also
rebut the suggestion that the log transform, by reducing skewness, improves quantile estimation, since in some cases in
the form of LP3 it can lead to some "spurious quantile estimates" which can be avoided by the use of PWM estimation.
Landwehr et al. (1987) express their disagreement with the Wallis and Wood (1985) findings more extensively.
They assert that it has not been "shown, that in expectation, that there will be a gain using WAK/PWM, or at least no
loss, for any region of interest that may be constructed by any criteria". They also claim that Wallis and Wood (1985)
base their claims for WAK/PWM over LP3/WRC on the use of average regional bias, which averages out individual
occurrences of under- and over-estimation. They state that if average absolute bias were used as a criterion instead,
LP3 in one form or other (depending on the way regional information is used with it) would in many cases appear less
biased than WAK/PWM. In making this statement (p. 1208), however, the standard error of estimate is temporarily
ignored. Landwehr et al. (1987) also introduce two variants of LP3 regional techniques, not considered by Wallis
and Wood (1985) nor included in USWRC (1981), and show that these compare favourably with (in Region 1A) and
better than (in Region 1B) WAK/PWM in two new hypothetical regions not considered by Wallis and Wood (1985).
Thus Landwehr et al. (1987) went considerably outside the scope of the original paper and introduced
previously unpublished variants of LP3 regional techniques in order to challenge the Wallis and Wood (1985) findings
and perhaps to defend the use of LP3 as an institutionally recommended tool in the U.S.A. In fact W.O. Thomas, Jr.
(1988, Personal Communication) claims that the LP3/regional skew method tested by Wallis and Wood (1985) is "not a
true regional method", presumably because only skewness, and not other statistics, is averaged regionally in it. It is
however the variant which was advocated for institutional use by USWRC (1981) and it would seem that it is this
recommended variant that was intended to be subjected to test by Wallis and Wood (1985).
The defence of LP3-based techniques put forward by Landwehr et al. (1987) does not diminish the
fact that LP3 can lead to low upper bounds on flood magnitude. Use of LP3 must always be accompanied by some
contingency technique for recognizing and dealing with low outliers.
5.4
Properties of regional-only quantile estimators
Statistical flood estimation for ungauged catchments depends on regression relationships between some or
all of Q̄, Cv, Q_T, ... and catchment characteristics, as explained in Chapter 4.3.3. When the index flood approach
is used, Q̂_T = Q̄ · x_T (equation 4.13), the x_T component contributes much less to error in Q̂_T than Q̄ does when Q̄ is
obtained from a regression relationship.
Nash and Shaw (1965) showed that se(Q̄) ≈ sd(Q) = standard deviation of population. In other words Q̄ from
catchment characteristics is as good as having a sample of annual maxima of size one. Their result was based on data
from 57 catchments. NERC (1975) using data from 530 catchments found no improvement on se(Q̄). Increasing the
number of catchments increases the precision with which parameters of the regression relation are established but it does
not change the nature of the problem because the scatter of residuals about the regression relation is still inherently the
same.
NERC (1975) did find that restricting the calibration to include only data from longer records with high quality
measurement, resulted in a small improvement in se(Q). Stedinger and Tasker (1985, 1986) show that a more
comprehensive treatment of the residuals, using either generalised least squares or weighted least squares calibration,
leads to improved prediction from catchment characteristics, especially where record length varies from site to site.
These methods also enable very short records to be used more profitably.
Hebson and Cunnane (1987) have attempted to investigate, by simulation, ungauged catchment index flood
quantile estimates. They tried to simulate data which were compatible with real catchments and found that values of Q̄
obtained from a simple regression relation were considerably more variable than estimates obtained from a single year
of record, which does not augur well for the use of such relations.
As stated in Chapter 4.4.3 the relationship between Cv and catchment characteristics is in some regions
statistically significant but it usually has a very large standard error. If a regional estimate of Cv is to be used for an
ungauged catchment, then the regional mean value should be used. This is a compromise which may lead to under- or
over-estimation in some catchments.
The use of regression relationships between Q̂_T and catchment characteristics, which, as indicated in Section
4.4.3, is widespread in the U.S.A., leads to levels of accuracy for different values of T which depend not only on the
number of sites M used in the calibration but also on the length of record, N, at each site. Tasker and Moss (1979, Fig.
3), using a simple model, Q_T = aA^b, show that se(Q̂_T) can be greatly reduced by increasing M if T is small, but if T
is large N has to be increased in order to reduce se(Q̂_T). Their Figure 3 also shows the limit of accuracy available with
such a simple model. The observed se(Q̂_T) for T = 2, 10, 50 and 100 years are 90, 79, 85 and 88% of Q_T respectively.
Increasing M to 50 and N to 35 will reduce these percentages to about 55% for T = 50 and 60% for T = 100 but they
cannot be further reduced without a more complex model, which in turn may need more data for accurate estimation
because of the increased number of parameters to be estimated. W.O. Thomas, Jr. (1988, Pers. Comm.) has indicated
that for estimation of the 100-year flood the regression equations currently used in the U.S.A. have an accuracy
equivalent to that obtainable from between 8 and 12 years of record.
5.5
Effect of spatial and temporal interdependence of flood magnitudes on quantile estimates
If a great degree of spatial dependence exists between neighbouring flood records then new information is not
necessarily included when new stations are added to a regional estimating scheme. However inclusion of interdependent
data does not invalidate the calculation of mean values of selected statistics over a region even though such means are
not as precisely determined as from an equal amount of independent data. Thus methods such as Dalrymple's (1960) or
average standardized PWMs (Wallis,1980) are not damagingly affected by interdependence, (see Hosking and Wallis,
1985). Methods which depend on the station year assumption that all standardized flood values in a region form a single
random sample may be affected to some degree by such dependence. One method of dealing with such dependence has
been suggested by Acreman and Hosking (1986).
Temporal interdependence can be measured, perhaps imperfectly, by serial correlation coefficients. The amount
likely to exist in a flood series is unlikely to affect greatly the reliability of quantile estimates (see Landwehr et al.,
1979a; Srikanthan and McMahon, 1981b).
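As a minimal illustration of such a measure, the lag-one serial correlation coefficient of an annual maximum
series can be computed as below. The estimator shown (lagged cross-products about the overall mean, divided by the
total sum of squares) is one common definition and is an assumption here; the report does not specify a formula.

```python
def lag1_serial_correlation(series):
    """Lag-one serial correlation coefficient of a time-ordered series."""
    n = len(series)
    mean = sum(series) / n
    # Sum of products of successive deviations from the overall mean.
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    # Total sum of squared deviations.
    den = sum((x - mean) ** 2 for x in series)
    return num / den
```

Values near zero are consistent with the usual assumption that annual maxima are serially independent.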
5.6
Confidence intervals
If quantile estimates Q̂_T are normally distributed, an interval of [Q̂_T ± 1.96 se(Q̂_T)] defines a 95% confidence
interval. Estimates obtained by at-site/regional index flood methods tend to be normally distributed (Hosking et al.,
1985a; Hebson and Cunnane, 1987) but estimates obtained by at-site methods tend to be highly skewed and hence the
symmetrical confidence interval [Q̂_T ± 1.96 se(Q̂_T)] is misleading, being much too short in the upper portion. For a
detailed account of confidence intervals see Stedinger (1983).
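For illustration, the symmetrical interval can be evaluated with the values of Table 5.2 (a) for T = 100 and
Cv = 0.3, using the tabulated Q_T as a stand-in for an estimate Q̂_T (an assumption made purely for the example).
The function name is hypothetical.

```python
def symmetric_interval(q_hat, se, z=1.96):
    """Symmetrical interval q_hat +/- z * se (95% if q_hat is normally distributed)."""
    half_width = z * se
    return q_hat - half_width, q_hat + half_width


if __name__ == "__main__":
    # se(Q_T) = 37.3 for N = 10 and 11.8 for N = 100 (Table 5.2 (a), Cv = 0.3, T = 100).
    for n, se in ((10, 37.3), (100, 11.8)):
        low, high = symmetric_interval(194.2, se)
        print(n, round(low, 1), round(high, 1))
```

As the text notes, for at-site estimates such an interval understates the uncertainty on the upper side, so it
should be read as indicative only.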
5.7
Comparison of different techniques of parameter estimation
Traditionally, parameter estimation methods have been compared under the headings of bias, efficiency,
sufficiency and consistency (Kendall and Stuart, 1961, Vol. II, Chapter 17) although only the first two of these receive
much consideration by hydrologists. In general ML methods are known to be most efficient in asymptotically large
samples. While there is no guarantee that ML is most efficient in small samples it frequently is so.
Apart from efficiency, feasibility is another consideration. For instance Matalas and Wallis (1973) reported
that ML estimates were impossible to obtain for some LP3 samples using P3/ML estimation methods. On a different
level ML estimation has not been applied, or may be impossible to apply, to WAK parameter estimation. Likewise
PWM estimation has not been applied to normal, LN or LP3 parameter estimation but Song and Ding (1988) have
shown how it can be applied for P3 estimation even though the P3 distribution is not expressible in inverse form.
Many of the studies listed in Table 5.1 have concentrated on comparing several methods of estimating
parameters and quantiles from particular distributions on the assumption that the form of parent population is known.
While these studies provide valuable insight on estimation methods the most important fact to remember is that if an
index flood type at-site/regional method is adopted for use a sufficiently flexible multiparameter distribution can be
chosen whereas at most a 2-parameter distribution should be used on an at-site basis. The use of regional data with the
flexible distribution will ensure that the parameters are well estimated and hence the benefits of flexibility are not marred
by excessive standard errors.
CHAPTER 6
METHODS OF CHOOSING BETWEEN DISTRIBUTIONS
6.1.
Introduction
Several techniques have been used in the past for evaluating the suitability of different distributions for AM
series. An attempt is made to classify these in Table 6.1. Two main categories can be identified in the use of these
techniques:
(a) Tests of descriptive ability
    (i) Seek from among known distributions that one which fits observed AM data best, judged according
        to one or more of the criteria I to VI given in Table 6.1 (a). Generally speaking this philosophy
        prevailed prior to the mid 1970's and cannot be regarded as having been successful on its own.
    (ii) Examine the statistical behaviour, especially the sampling distribution of Cv, Cs and standardized
        largest sample values, of candidate distributions to determine whether they are capable of producing
        random samples having the same statistical characteristics as observed AM series, Category VII in
        Table 6.1 (a).
(b) Tests of predictive ability
    Test how well a candidate distribution can estimate the Q - T relationship or the frequency of future
    events when the population distribution is not identical to that of the candidate distribution, Categories
    VIII and IX of Table 6.1 (b). Different methods of parameter estimation must also be included in this part
    of the test.
In this Chapter the traditional methods and recent approaches to the problem are discussed.
6.2
Influence of Outliers
Some hydrologists are inclined to believe that many existing statistical flood frequency estimation methods
underestimate the frequency of occurrence of very large floods.
There is some difficulty in formulating a test of such a belief. Since outliers, defined in some suitable
objective way, are rare and do not appear in each observed hydrological record, a test of any hypothesis about their
frequency of occurrence must be done on a regional basis. Random samples drawn from statistical populations can
produce outliers in hydrologically sized samples. It is relatively simple to count the frequency of outliers occurring in
sets of randomly generated samples from some population in order to use them as a standard of comparison but it is less
easy to define a set of outliers in a real hydrological region. Typically Nj (j = 1, 2, ..., M) years of record are available
at M gauging sites in a region (some on the same stream perhaps) and occasional historical records of unusually large
floods are available for some of these M sites, as well as for some additional sites which are not gauged on a regular
basis. It is difficult to establish a procedure for obtaining, from such a data base, a count of how many outliers of
different degrees of severity have actually occurred in the region, which may be compared in an unbiased manner with
the results of a simulation study.
With such a fair procedure for comparison one could then test whether existing or proposed flood frequency
models are capable of generating random data sets, similar in size and record lengths to observed hydrometric data sets,
which have the same outlier producing capability as nature. It may be that some models (candidate distributions) might
display outlier related behaviour relative to the observed data analogous to the condition of separation for skewness.
Such a condition might point to the need for a thick-tailed distribution.
The amount of influence which outliers should be allowed to have in distribution selection and parameter
estimation needs to be considered. Outliers can be excluded from the estimation procedure only if it is certain that AM
floods can be adequately modelled by a single known distributional form. (The distribution and estimation procedure in
question need not match the parent (i.e. nature) precisely so long as it is robust). In such a case, the Q̂_T estimate
obtained from the truncated sample may be closer to the true value than that obtained from the entire sample. Even if
retained, outliers have only a small effect if an efficient method of parameter estimation (ML or PWM) is used.
CATEGORY   PROCEDURE
I          Graphical
           Histogram - visual inspection
           Probability plot - visual inspection
           Probability plot including confidence interval type "control" bands (Gumbel, 1941)
II         Goodness of fit tests
           Chi-Square
           Kolmogorov-Smirnov
           Anderson-Darling
III        Tests based on skewness
           (e.g. is Cs = 1.14?). Visual tests based on Moment-Ratio diagrams.
IV         Numerical indices of agreement
           calculated from probability plot as in USWRC (Benson, 1968) and NERC (1975).
V          Test of distributional hypothesis against specified alternative
           e.g. EV1 versus EV2 (van Montfort, 1970)
                EV1 versus GEV (Hosking et al., 1985b)
VI         Regional pooling of data
           and applying I - IV to select a single regional distribution
VII        Behaviour analysis
           Test by simulation study or theoretical analysis whether candidate distribution
           can give rise to random samples having the same general statistical properties
           as observed flood data series, e.g. incidence of outliers, variability of calculated skewness.
           Matalas et al. (1975), Rossi et al. (1985), Ahmad et al. (1988a)
Table 6.1
Categories of procedures used to select a flood distribution.
(a) Tests of descriptive ability.
CATEGORY   PROCEDURE
VIII       Split sample test
           Distributions are fitted to the first half of each record and expected
           numbers of exceedances of specified magnitudes in the second half of the record
           are compared with the observed number (Beard, 1974).
IX         Robustness
           Test whether a distribution and method of parameter estimation, considered
           jointly, are insensitive to departures from assumptions made in their use,
           e.g. Kuczera (1982b), Hosking et al. (1985),
           Wallis and Wood (1985), Arnell and Fiorentino (1988).
Table 6.1
Categories of procedures used to select a flood distribution.
(b) Tests of predictive ability.
On the other hand, if it is regarded as true that AM floods come from two very different sub-populations, then
the outliers must be retained if the sample is to be regarded as random and unbiased. Even then, because there are so
few of them, such outliers can be used only in a regional estimation procedure, perhaps to estimate the parameters of the
upper end of the compound distribution. The Rossi et al. (1984) regional estimation procedure explicitly extracts all
outliers prior to estimating the at-site standardizing statistics and then includes the standardized outliers in the final
stage of the estimation procedure.
6.3
Traditional Methods
Until the 1970's the suitability of any particular distribution for flood frequency analysis was often judged on
the basis of physical inspection of the data on a probability plot. On such a plot the sample values of the hydrological
record appear as a series of plotted points while the estimated distribution of a particular form, whose suitability is
being examined, would appear as a line or curve. The form of distribution whose line or curve showed best agreement
with the plotted points would then be chosen. Gumbel (1941), in demonstrating the use of EV distributions, made use
of confidence intervals about the line or curve of the fitted distribution on the probability plot in order to help judge or
demonstrate the suitability of these distributions for annual maximum series data.
It is now understood however, both from theoretical statistics and from split sample tests on hydrological data,
that any single record of data from a given distribution can display a plotted behaviour which is quite different from the
parent Q-T relationship. Thus a sample from a straight line population could display marked curvature on a probability
plot while on the other hand a sample from a population whose parent Q - T relationship is curved, might by chance
plot remarkably close to a straight line on a probability plot. Therefore there is a distinct possibility of error when
choosing a distribution on the basis of inspection of a probability plot. This is not the fault of the plot, per se, but
stems from the highly uncertain nature of the problem. Other methods of choosing between distributions suffer
analogous weaknesses even though the procedures may be less subjectively defined.
Certain goodness of fit statistics may be used as the basis of a test of the hypothesis that a given sample of
data may be regarded as having been drawn randomly from a distribution of specified form. Examples of these are the
Chi-square and Kolmogorov-Smirnov statistics. Such tests can reject, or by default accept, the null hypothesis that a
sample of data have come from a stated distribution. (If the parameters of the distribution named in the hypothesis are
estimated from the sample itself, as is usually the case, this is taken into account when counting degrees of freedom, i.e.
in determining the distribution of the test statistic). These tests can be used to test distributions separately to determine
whether the data are in accord with the distributions or not.
Such tests are not, however, discriminatory tests for choosing between one distribution and another. Further,
since each theoretical distributional form being tested against any sample usually has its parameter values estimated
from the sample, it follows that the several candidate distributions are constrained to be similar, at least in mean and
variance. Because of this inbuilt forced similarity, a test of the hypothesis that data have come from EV1 when the
alternative hypothesis is that they have come from a lognormal distribution with equal mean and variance has
necessarily very low statistical power, or equivalently the test has a very high probability of a Type II error of inference,
i.e. that the real differences will not be detected.
In Britain, NERC (1975) published results of using Chi-Square goodness of fit and Kolmogorov-Smirnov
goodness of fit tests in determining the suitability of different distributions for annual maximum flood series. These
results are outlined in Chapter 7 and in general must be regarded as inconclusive because of their lack of statistical
power when choosing between alternative distributions.
Van Montfort (1970) proposed a specific likelihood ratio test for discriminating between EV1 and EV2
distributions although this test is also of limited power in the circumstances since both distributions under test are
constrained to be very similar because their parameters are estimated from the same sample. Otten and van Montfort
(1978) discuss the statistical power of such tests. Hosking et al. (1985b) give an easily used test of an EV1 hypothesis
against a GEV alternative. This test is detailed in Appendix 4.
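
The following is a minimal sketch of a test of this kind, not a transcription of Appendix 4. It assumes the commonly quoted PWM form: the GEV shape parameter k is estimated from the unbiased sample PWMs b0, b1, b2 through the approximation k = 7.8590c + 2.9554c^2, with c = (2b1 - b0)/(3b2 - b0) - ln2/ln3, and Z = k (N/0.5633)^(1/2) is referred to the standard Normal distribution. The function names and the check sample are illustrative assumptions.

```python
import math

def sample_pwms(data):
    """Unbiased sample estimates b0, b1, b2 of the probability weighted moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum((i / (n - 1)) * x[i] for i in range(n)) / n
    b2 = sum((i * (i - 1) / ((n - 1) * (n - 2))) * x[i] for i in range(n)) / n
    return b0, b1, b2

def gev_shape_pwm(data):
    """Approximate PWM estimate of the GEV shape parameter k (k = 0 is EV1)."""
    b0, b1, b2 = sample_pwms(data)
    c = (2 * b1 - b0) / (3 * b2 - b0) - math.log(2) / math.log(3)
    return 7.8590 * c + 2.9554 * c * c

def ev1_vs_gev_z(data):
    """Test statistic Z; under the EV1 hypothesis it is roughly N(0, 1)."""
    return gev_shape_pwm(data) * math.sqrt(len(data) / 0.5633)

# Illustrative check: exact EV1 quantiles at evenly spaced probabilities
# should give k close to zero and a small Z.
n = 1000
gumbel_like = [-math.log(-math.log((i + 0.5) / n)) for i in range(n)]
k_check = gev_shape_pwm(gumbel_like)
z_check = ev1_vs_gev_z(gumbel_like)
```

In this parameterisation a significantly positive Z points towards EV3 (bounded above) and a significantly negative Z towards EV2; |Z| > 1.96 would reject EV1 at about the 5% level in a two-sided test.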
Other tests based on the agreement between ranked sample magnitudes and estimated or fitted magnitudes, as
viewed on probability paper, have been used by U.S. Water Resources Council (Benson 1968) and NERC (1975) and
the results of using these are discussed in Chapter 7. It should be noted that such goodness of fit tests based on
probability plots suffer from a major weakness in that they take no account of the fact that the natural sampling
variation of the largest elements in a sample is far greater than that for the middle ranking values. Another point to
note is that such tests almost inevitably pick out the three parameter distribution before the two parameter ones. This
is not a sufficient reason for accepting three parameter distributions and rejecting two parameter ones. If the true
population were EVI such tests would show more favour to the GEV, with its third parameter, than to the parent
itself. The same would hold if the true population were lognormal; such tests would show more favour to LP3 (of
which it is a special case) than to the parent distribution.
Another test is based on the relations between Cv and Cs and between Cs and Ck in a distribution of particular
mathematical form. On moment ratio diagrams, first used by Karl Pearson for demonstrating his system of frequency
curves, showing the Cv - Cs and the Cs - Ck relationships respectively, each distributional form can be represented by a
point, a curve or region. The corresponding quantities calculated from an observed record of flows define a single point
on each diagram. This plotted point may identify a distribution, and it is to be hoped that a given sample identifies the
same distribution on both diagrams.
The drawback of this approach is that the lengths of record available in hydrology do not allow the positions
of the plotted points to be obtained accurately and hence there is the possibility of error in the diagnosis. This
possibility of error is usually not reduced by plotting the results from several catchments on the same diagram. This
often results in a wide scatter of points and it is not possible to determine precisely how much of this scatter is due to
sampling error and how much is due to differences of parent distributions between catchments. Some properties of
moment ratio diagrams are discussed in Appendix 3 from which the weaknesses of this technique can be deduced.
Split sample tests (Beard, 1974) are predictive ability tests which compare the expected number of exceedances
of specified magnitudes, as estimated from the first half of a record, with what has been observed in the second half.
The distribution which produces the best agreement is accepted as best. Beard (1974) concluded that only LP3 (with
Regional Skew estimates) and LN2 distributions were satisfactory when tested on data of 300 rivers in U.S.A.
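
A split sample check of this kind can be sketched as follows. This is an illustration only: the EV1 distribution fitted by moments stands in for whichever distribution is under test, the record is hypothetical, and the function names are assumptions rather than Beard's procedure.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649

def fit_ev1_moments(sample):
    """Return EV1 location u and scale alpha fitted by the method of moments."""
    alpha = statistics.stdev(sample) * math.sqrt(6) / math.pi
    u = statistics.mean(sample) - EULER_GAMMA * alpha
    return u, alpha

def ev1_quantile(u, alpha, T):
    """Flood magnitude with return period T years under EV1."""
    return u - alpha * math.log(-math.log(1 - 1 / T))

def split_sample_check(record, T):
    """Fit on the first half, then count exceedances of Q_T in the second half."""
    half = len(record) // 2
    first, second = record[:half], record[half:]
    u, alpha = fit_ev1_moments(first)
    q_t = ev1_quantile(u, alpha, T)
    observed = sum(1 for q in second if q > q_t)
    expected = len(second) / T  # expected number of exceedances
    return q_t, observed, expected

# Hypothetical annual maximum series (m3/s), for illustration only.
record = [212, 340, 198, 455, 290, 310, 265, 520, 230, 305,
          280, 610, 240, 330, 295, 410, 225, 360, 270, 300]
q_t, observed, expected = split_sample_check(record, T=5)
```

Repeating the comparison of observed and expected counts over many stations and several T values, for each candidate distribution, gives the kind of evidence on which a split sample judgement rests.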
6.3.1
GOODNESS OF FIT AS THE SOLE CRITERION
The disadvantage of using the wrong form of distribution for flood series is that of over- and under-design of
hydraulic structures. Overdesign involves unnecessary construction or other flood plain costs while underdesign may
result in excessive future damages. Such a disadvantage would seem obvious in any individual case. Nevertheless
Matalas and Wallis (1972) arrived at an unexpected conclusion when they examined, by simulation methods, strategies
for minimising the overdesign, underdesign and total costs associated with the T = 50 year reservoir design flood. They
found that use of the Gaussian (Normal) distribution for flood series can, in certain circumstances, minimise expected
overdesign costs. This is surprising since it is generally acknowledged that the Gaussian distribution is not a good
descriptor of observed flood series. The important word here is "expected" and undoubtedly individual cases of extreme
over- and underdesign could occur with such a choice of distribution. However, the study alerts us to the fact that
goodness of fit alone need not be the sole criterion in choosing a distribution for flood series when many different
projects are being jointly considered on a nationwide basis.
The Matalas and Wallis (1972) finding however is based on the parsimonious attitude that a single form of
distribution should be used for all applications in a given region. Different results might well be obtained if a
non-parsimonious attitude were adopted whereby the best fitting distribution for each simulated sample is used instead of a
single distribution for all samples.
The procedure of checking the goodness of fit of candidate distributions to AM series by traditional methods
has not led to a unique satisfactory choice of distribution for any region. The power of such tests is necessarily very
low since fitted candidate distributions are constrained, by the fitting procedure, to be similar to one another for each site
before the test is undertaken.
6.4
Recent Approaches
Recent approaches correspond to categories VII and IX in Table 6.1. This work has appeared in print as
Matalas and Wallis (1972, 1973), Wallis et al. (1974), Matalas et al. (1975), Landwehr et al. (1978, 1979), Wallis
(1980) in behaviour analysis category VII of Table 6.1(a) and as Houghton (1977, 1978 a,b), Wallis (1980), Greis and
Wood (1981), Kuczera (1982 b), Rossi et al. (1984), Wallis and Wood (1985), Hosking et al. (1985a) and Arnell and
Gabrielle (1986) in the robustness category IX of Table 6.1(b). These publications are very detailed but an outline of
the principal findings is attempted here.
6.4.1
BEHAVIOUR ANALYSIS
Wallis et al. (1974) examined the sampling properties of random samples from distributions which have been
used for flood frequency analysis. They found that sample estimates of Cv and Cs, which are sometimes used to make
inferences about the parent population, are biased downwards and Kirby (1974) confirmed that these sample quantities
are subject to upper bounds which are functions of sample size (see equations (2.1) and (2.2)). From this it must be
concluded that parent flood distributions have higher Cv and Cs (and probably also Ck) values than are indicated by
conventional methods of calculation used heretofore.
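
The existence of such bounds can be checked numerically. The sketch below assumes the commonly quoted forms, Cv <= (N-1)^(1/2) and |Cs| <= (N-2)/(N-1)^(1/2), which apply when the sample moments are computed with divisor N; whether equations (2.1) and (2.2) use exactly this convention is an assumption here. The most extreme possible sample, with a single non-zero value, attains both bounds.

```python
import math

def cv_and_cs(sample):
    """Sample Cv and Cs using moments with divisor N (assumed convention)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    return math.sqrt(m2) / mean, m3 / m2 ** 1.5

def kirby_bounds(n):
    """Upper bounds on sample Cv and Cs for a sample of size n."""
    return math.sqrt(n - 1), (n - 2) / math.sqrt(n - 1)

# Nine zero values and one non-zero value: the most skewed sample of size 10.
extreme = [0.0] * 9 + [1.0]
cv, cs = cv_and_cs(extreme)
cv_max, cs_max = kirby_bounds(len(extreme))
```

For N = 10 no sample, however it arises, can show Cs above 8/3, so a parent with larger skewness cannot be recognised from its sample skewness at that record length.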
Matalas et al. (1975) discovered the condition of separation for skewness whereby the variability of skewness
is greater among samples of hydrological AM series than among equal sized samples drawn randomly from parent
candidate distributions which are commonly used as flood frequency distributions including LN, P3, Weibull and EV1.
Landwehr et al. (1978) showed that LP3 also does not explain the condition of separation nor does the GEV (Cunnane,
1984). Houghton (1978a) however, investigated the Wakeby distribution, which has been suggested by H.A. Thomas,
and found that for certain combinations of parameters random samples from it do not display the condition of separation.
In conclusion all previously tried distributions failed to produce sample skewness which behaved similarly to observed
AM skewness. Only the Wakeby distribution did so. Landwehr et al. (1978, p.90?) found a combination of Wakeby
parameters which agreed well with Cv and Cs values of their regional AM data.
Landwehr et al. (1978) considered the effect of kurtosis on the condition of separation and concluded that large
kurtosis is a necessary but not sufficient property of any distribution required to explain the separation condition.
Previously Matalas et al. (1975) and Wallis et al. (1977) had found that the condition of separation could not be due to
(i) the relatively small number of historical sequences available for analysis, (ii) autocorrelation or (iii) cross-correlation,
but could be accounted for by spatial mixing (heterogeneity) of Cs values within a region and by non-stationarity in Cs.
Therefore, the condition of separation has to be given serious consideration.
Next, Landwehr et al. (1978) considered how accurately or reliably inferences could be made about QT from
studies of z = log Q, the intention, among other things, being to examine the utility of resorting to log-distributions
(LN and LP3) with parameter estimation by moments in log space (LS) as distinct from in the Q - domain or real space
(RS). They showed that the condition of separation also exists in LS and that neither LN nor LP3 distribution explain
it there.
They note that Cs(Q) derived from AM series is mostly positive and becomes more so as sample size, N,
increases while in LS, Cs(Z) is mostly negative and becomes more so as N increases. If an LP3 distribution's
parameters are estimated by moments in LS and Cs(Z) < 0 then the fitted distribution will have an upper bound in RS.
Such a fitted distribution is unlikely to have a positive Cs(Q) in RS and this is a contradiction of the generally observed
condition of Cs(Q) > 0 and even more contradictory when we consider that Cs(Q) is probably under-estimated anyway.
This is one argument against LP3 as usually fitted, i.e. by moments in LS. However W.O. Thomas Jr. (1988, Pers.
Comm.) quoting Gilroy (1972) points out that "the upper bound for the LP3 will be at least four standard deviations in
log units above the mean if the skew is > - 0.9" a condition that is met by most applications. On the other hand,
Monte Carlo experiments with the Wakeby distribution indicate that both Cs(Q) > 0 and Cs(Z) < 0 are realisable in
that distribution subject to certain restrictions on the parameters.
Further Monte Carlo experiments were conducted by Landwehr et al. (1978) under a series of six different
assumed parent regional hydrologies, as defined by AM flood distribution, with a view to studying how effective
regional maps of skewness in LS are in conveying information about skewness in RS. They concluded that a skewness
value in RS could not be uniquely inferred from the value in LS, as distinctly different Cs(Q) contours could give rise
to identical Cs(Z) contours and vice versa. The transfer from LS to RS depends on the form of the parent distribution.
Since, in truth, the latter is unknown for AM series, they conclude that "the construction and use of regional skew
maps are most likely to be counter productive." The magnitude of this effect on quantile estimates was not quantified.
Rossi et al. (1984) tested the suitability of EV1, LN and PEV hypotheses on the basis of their ability to
reproduce observed regional skewness of Italian AM series (39 series of average length 40 years each). They found that
each of these hypotheses conflicted with the observed data but that TCEV could generate samples with the same regional
distribution of skewness as the observed data, and hence accounted for the condition of separation of skewness. Beran et
al. (1986) showed that the TCEV distribution could account for the variability of skewness of British data.
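
The simulated side of such a comparison can be sketched as below. The code draws a region of 39 series of 40 years each from an EV1 parent (the region size follows the Italian data set described above; the EV1 parent, the seed and the function names are illustrative assumptions) and returns the regional mean and standard deviation of sample skewness. The condition of separation is said to exist when the observed regional standard deviation of skewness is larger than simulated values of this kind.

```python
import math
import random

def sample_skewness(values):
    """Sample coefficient of skewness using moments with divisor N."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def ev1_variate(rng):
    """One standard EV1 (Gumbel) variate by inversion, avoiding log(0)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))

def regional_skewness_spread(n_sites=39, n_years=40, seed=7):
    """Mean and standard deviation of sample skewness over a simulated region."""
    rng = random.Random(seed)
    skews = [sample_skewness([ev1_variate(rng) for _ in range(n_years)])
             for _ in range(n_sites)]
    mean = sum(skews) / n_sites
    sd = math.sqrt(sum((g - mean) ** 2 for g in skews) / (n_sites - 1))
    return mean, sd

mean_skew, sd_skew = regional_skewness_spread()
```

In practice the simulation would be repeated many times with different seeds, and for each candidate parent, so that the observed pair of regional mean and standard deviation can be placed against the simulated cloud.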
Recently Ahmad et al. (1988) have shown that the log-logistic LLG is able to model flood series for Scottish
rivers from the point of view of reproducing the observed regional distribution of skewness as well as scoring better
than GEV, LN3 and P3 in at-site and regional goodness of fit tests. In reparameterised form LLG is a special case of
the generalised logistic distribution, GLG (Hosking 1986a, Ahmad, 1988).
In conclusion, behaviour analysis shows that none of the commonly used distributions produce samples which
behave in the same manner as AM flood series of equal size, on the assumption that a given hydrological region is
homogeneous in skewness and time (stationary). Only the Wakeby, TCEV and GLG (or LLG) distributions emerge as
possible candidates, under these assumptions, from this examination. The Wakeby distribution has the disadvantage of
having five parameters which may seem too many to estimate from a hydrological AM sample. Parameter estimates
obtained via PWM (Greenwood et al., 1979; Landwehr et al., 1980) may not always be feasible in small samples but
estimates of Wakeby parameters from regionally averaged PWM's (Wallis, 1980) nearly always exist. It is not suggested
here that these multiparameter distributions be used in an at-site mode but rather in an at-site/regional mode.
Wakeby fitting failures seem to be most prevalent with data straddling zero, with extremely high Cv data or
with samples that are so thin-tailed that no legal Wakeby distribution exists (Wallis, 1984 personal communication).
These conditions rarely occur among sets of hydrological or meteorological maxima, and hence failure to fit a Wakeby
distribution has so far proved to be an academic point rather than a practical hydrological problem. Robustness tests
(next section) can be used to determine if other distributions can be used to approximate well the Q - T relationship of a
Wakeby distribution. Indications are that the GEV distribution is a moderately good substitute.
6.4.2
ROBUSTNESS
A procedure for estimating QT is robust if it yields estimates of QT which are good (low bias, high efficiency)
even if the procedure is based on an assumption which is not true. A procedure is not robust if it yields poor estimates
of QT when the procedure's assumptions depart even slightly from what is true. Since we do not know the distribution
of AM floods in nature it behoves us to seek out and find a distribution and an estimating procedure which together are
robust when dealing with distributions which give random samples which have a flood-like behaviour. It should be
pointed out that split sample tests based on historical AM flood records are inadequate for testing the robustness of any
distribution and estimation (D / E) procedure.
A suitable method of testing a D / E procedure involves simulating random samples from a parent distribution
in which the Q - T relationship is known exactly (Hosking et al., 1985 a). To be authentic, in this context, the parent
distribution must produce random samples which are flood like in their behaviour. Such a parent would be a Wakeby,
TCEV, GLG or possibly GEV distribution, with suitable parameter values. Then the D / E under test is applied to
each sample and the estimate $\hat{Q}_T$ is obtained from each sample for a selection of T values. This is repeated for M samples (M large)
and analogs of equations (4.1) to (4.5) are used to calculate bias and rmse from the M values of $\hat{Q}_T$:
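Mean:      $\bar{Q}_T = \frac{1}{M} \sum_{i=1}^{M} \hat{Q}_{T,i}$                                                    (6.1)
St. dev.:  $s_{\hat{Q}_T} = \left[ \frac{1}{M} \sum_{i=1}^{M} (\hat{Q}_{T,i} - \bar{Q}_T)^2 \right]^{1/2}$             (6.2)
Bias:      $b_T = \bar{Q}_T - Q_T$                                                                                   (6.3)
Rmse:      $r_T = \left[ \frac{1}{M} \sum_{i=1}^{M} (\hat{Q}_{T,i} - Q_T)^2 \right]^{1/2}$                             (6.4)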
In these expressions $Q_T$ is the known population value. The sampling distribution of $\hat{Q}_T$ is also examined and
frequently this can be approximated by a Normal distribution so that 5% and 95% quantiles of the sampling
distribution, denoted lower and upper confidence levels, LCL and UCL, can be obtained as:
LCL = $\bar{Q}_T - 1.645 \, s_{\hat{Q}_T}$                                                                           (6.5)
UCL = $\bar{Q}_T + 1.645 \, s_{\hat{Q}_T}$                                                                           (6.6)
where $\bar{Q}_T$ and $s_{\hat{Q}_T}$ are the mean (6.1) and standard deviation (6.2) of the M estimates.
All these quantities can be made non-dimensional by dividing by the population value QT. This practice is
usually followed to enable intercomparison of D / E procedures. Results are tabulated or presented on diagrams such as
Figure 5.1.
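
A small Monte Carlo experiment of this type might be sketched as follows. Everything specific here is an assumption made for illustration: the parent is a GEV with arbitrary parameter values, the D / E procedure under test is EV1 fitted by moments, and the standard deviation and rmse use divisor M. All results are returned in non-dimensional form, i.e. divided by the population value QT.

```python
import math
import random
import statistics

def gev_quantile(F, xi, alpha, k):
    """GEV quantile x(F) = xi + alpha * (1 - (-ln F)**k) / k, for k != 0."""
    return xi + alpha * (1 - (-math.log(F)) ** k) / k

def ev1_moment_estimate(sample, T):
    """D / E under test: EV1 fitted by moments, returning the estimate of Q_T."""
    alpha = statistics.stdev(sample) * math.sqrt(6) / math.pi
    u = statistics.mean(sample) - 0.5772156649 * alpha
    return u - alpha * math.log(-math.log(1 - 1 / T))

def uniform_open(rng):
    """Uniform variate strictly inside (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u

def robustness_experiment(M=2000, N=30, T=100, xi=100.0, alpha=30.0, k=-0.1, seed=1):
    rng = random.Random(seed)
    q_true = gev_quantile(1 - 1 / T, xi, alpha, k)  # known population value
    est = []
    for _ in range(M):
        sample = [gev_quantile(uniform_open(rng), xi, alpha, k) for _ in range(N)]
        est.append(ev1_moment_estimate(sample, T))
    mean = sum(est) / M
    sd = math.sqrt(sum((q - mean) ** 2 for q in est) / M)
    bias = mean - q_true
    rmse = math.sqrt(sum((q - q_true) ** 2 for q in est) / M)
    # Normal approximation to the 5% and 95% points of the sampling distribution.
    lcl, ucl = mean - 1.645 * sd, mean + 1.645 * sd
    return {name: value / q_true for name, value in
            [("mean", mean), ("sd", sd), ("bias", bias), ("rmse", rmse),
             ("lcl", lcl), ("ucl", ucl)]}

result = robustness_experiment()
```

With divisor M in both quantities, rmse squared equals bias squared plus variance exactly, which is a convenient check on the bookkeeping of such an experiment.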
The performance of the D / E procedure under test is then seen through the magnitude of the bias and the
spread of the 90% confidence interval LCL to UCL. Different D / E procedures may be tested in this way and their
biases and confidence interval widths compared. The D / E procedure giving the smallest bias or narrowest confidence
band may then be chosen under the assumption of the given parent distribution. The test must then be repeated with
data assumed to have come from different forms of parent distribution. Obviously a large amount of computer time is
required to undertake such tests and the programme of work has to be very carefully planned and the computer
experiments have to be properly designed to avoid obtaining useless results.
The sources of studies of quantile estimate robustness performed to date are summarised in Table 6.2. Some
studies refer to single sample (at-site) estimating procedures while others refer to at-site/regional procedures only. Some
allow inferences to be made about both types of procedure. The following general points can be made from the results of
such tests.
(1) At - site / regional methods are better than at-site methods even in the presence of a modest amount of
heterogeneity.
(2) Two parameter models have substantially smaller standard error and rmse than three parameter models.
Two parameter models usually are biased, if the model skewness is invariant and much smaller than the
population skewness.
(3) Three parameter models, while yielding relatively unbiased estimates have such high standard error when
used as at - site estimators as to make them very unattractive for that purpose.
(4) Regional index flood procedures outperform existing regional Bayesian procedures.
(5) PWM based regional index flood procedures are most efficient and least biased and are easy to apply.
(6) The at-site / regional WAK / PWM procedure is uniformly the best quantile estimator when considered
over all relevant studies of Table 6.2. This is so notwithstanding the extra parameters involved in the
WAK distribution. The regional aspect of this procedure must be stressed. It would not in general be
prudent to apply WAK on an at-site basis.
(7) The at - site / regional GEV / PWM procedure is good with GEV parents but is not as robust as WAK /
PWM, because it is not as flexible. With non-GEV data it suffers more from bias than from standard
error.
(8) The TCEV regionalisation procedure yields relatively unbiased but very variable results. There are also
some data sets to which TCEV cannot be fitted. It also has to be based on large data sets.
(9) The LP3/regional skew procedure is less efficient than either at - site / regional WAK / PWM or
GEV / PWM procedures even when the population is LP3 distributed. It performs particularly poorly for
sites with negative skewness and when the parent population is not LP3. Wallis and Wood (1985) go so
far as to say that the LP3/WRC procedure "should not be used as a basis for engineering design because
significantly more accurate estimates can be obtained by other currently available statistical procedures".
Landwehr et al. (1987) have contested these conclusions and claim that a previously unpublished variant
of a regional index flood method using LP3 is as effective as regional WAK / PWM. This was discussed
in Chapter 5.3.
(10) Fully numerical variants, indicated as FSR, of the NERC (1975) regional estimation method perform
poorly, with very large standard error especially when record lengths are short (Hosking et al., 1985 a).
Such implementations as have been computerised have differed sufficiently one from another to
materially affect their relative performances. For instance, Arnell and Gabrielle (1985) found that FSR
performance was not as bad as indicated by Hosking et al. (1985 a), the differences being due to
differences in the exact steps followed in the numerical algorithm.
6.5
Summary
Goodness of fit tests applied to records of AM floods individually and conventional tests of hypotheses are not
conclusive when seeking a flood distribution. They can serve to reject some distributions but are not necessarily good
discriminators between accepted ones.
Behaviour analysis indicates that real AM flood data samples behave differently from random samples drawn
from the parent distributions conventionally used in flood frequency analysis. The Wakeby, TCEV and GLG
distributions can bridge the gap between theoretical and observed AM flood data.
Robustness studies indicate that quantile estimates using 2 parameter distributions suffer more from bias than
those based on multiparameter ones. The latter suffer from large standard error if used in at-site mode but not in
regional mode. The at - site / regional WAK / PWM appears to be efficient and robust. Early studies (Wallis and
Wood, 1985) indicated that the opposite was true of the LP3 based method recommended by USWRC (1981) but this
finding has been contested by Landwehr et al. (1987) albeit with a different variant of LP3 regional estimation. The
GEV / PWM regional procedure is not quite as robust as the corresponding WAK / PWM. The TCEV regional
method can model real flood data behaviour well although its quantile estimating ability is not as good as WAK /
PWM.
AT-SITE COMPARISONS

Study                              (i) Parents               (ii) Estimators
Landwehr et al. (1980)             6 Wakeby                  EV1, LN3, WAK
Kuczera (1982)                     GEV                       EV1, GEV
Lettenmaier and Potter (1985)      4 Wakeby                  N, LN2, LP3, EV1, log EV1, WAK
Lettenmaier et al. (1985)          EV1, LN2, P3              EV1, LN2
Wallis and Wood (1985)             LP3, GEV, WAK             LP3, GEV

REGIONAL COMPARISONS

Study                              Parent                    Estimator
Kuczera (1982)                     LN x 5, WAK               LN/LEB
Hosking et al. (1985a)             GEV, WAK, WEIB            FSR/LS, GEV/PWM, WAK/PWM, WEIB/PWM
Lettenmaier et al. (1985)          EV1, GEV                  EV1/PWM, GEV/PWM, WAK/PWM, EV2/PWM, EV3/PWM
Wallis and Wood (1985)             GEV, WAK, LP3             GEV/PWM, WAK/PWM, LP3/PWM-RS
Lettenmaier and Potter (1985)      EV1, LN2, P3              EV1, LN2/LEB, EV1/LEB
Arnell and Gabrielle (1985, 1988)  TCEV, WAK, GEV            TCEV/ML, WAK/PWM, GEV/PWM
Landwehr et al. (1987)             LP3                       LP3/MOM-RS

Table 6.2 Sources of quantile estimating robustness studies results.
CHAPTER 7
DISTRIBUTIONS PREVIOUSLY CHOSEN OR RECOMMENDED FOR NATIONAL USE
7.1
WMO Survey
In 1983 WMO conducted a survey of current practices of selected countries with regard to use of distribution
types for frequency analysis on extremes of precipitation and floods. Replies to the questionnaire were received from 55
agencies in 27 countries. These replies were examined and summarised in a report WMO (1984) which is attached as
Appendix 6. Table 7.1 tabulates results from Appendix 6, Table III, about the most commonly used flood distributions
among the agencies and countries surveyed.
The EVI distribution was the most commonly used distribution of all followed closely by lognormal and LP3
with P3 a little less frequently used. If reported 2 and 3 parameter gamma uses are combined with P3 uses the resulting
total resembles that for LP3. If reported uses of EV2 and EV3 are pooled with GEV uses it can be seen that EV
distributions, exclusive of EVl, had a large degree of usage even though not recommended as standard by a large number
of agencies or countries. It should be noted that there is considerable overlap between some columns in Table 7.1
because many agencies reported use of more than one distribution and some countries reported use of more than one
distribution as standard. Table 4 of Appendix 6 shows that almost half of all agencies use the Weibull plotting position
for data display and/or calculation while one third use either of Blom, Hazen or Gringorten. The remainder use variants
which are very similar to these, all being special cases of T = (N+1-2a)/(i-a) with 0.25 <= a <= 0.5.
As pointed out in Chapter 4.3.4 the Weibull formula is biased and should not be used for flood data. Any one
of the others is considerably better, with the Gringorten formula preferred for EVI (Gumbel) paper and the Blom
formula preferred for Normal probability paper.
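
The general formula is easily evaluated. In the sketch below the values of the constant a attached to each named formula are the usual textbook ones (Weibull 0, Blom 0.375, Gringorten 0.44, Hazen 0.5) and are stated here as assumptions, since the survey text gives only the range of a; rank i = 1 denotes the largest flood.

```python
def return_period(i, n, a):
    """Return period T = (N + 1 - 2a) / (i - a) for rank i (1 = largest) of N."""
    return (n + 1 - 2 * a) / (i - a)

# Commonly quoted plotting position constants (assumed values).
PLOTTING_CONSTANTS = {"Weibull": 0.0, "Blom": 0.375, "Gringorten": 0.44, "Hazen": 0.5}

# Return period assigned to the largest flood in a 50 year record.
largest_in_50 = {name: return_period(1, 50, a) for name, a in PLOTTING_CONSTANTS.items()}
```

For the largest of 50 annual maxima the assigned return period ranges from 51 years (Weibull) to 100 years (Hazen), which shows how strongly the choice of formula affects the position of the highest plotted points.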
The WMO survey points out that choice of distribution is made in many countries with the help of goodness
of fit tests and tests of hypotheses against specific alternatives. In Chapter 6 of this report it was argued that such
procedures are unable to lead to a unique choice of distribution and may not necessarily lead to distributions which are
robust as flood quantile estimators.
                                              EV1   EV2   EV3   GEV   LN    P3    Gamma 2/3   LP3   Exponential
                                                                                                    (PD Models)
No. Agencies in which it is used               28    11     8     7    27    17       6        22       5
No. Countries in which it is used              16     9     8     2    18    12       6        13       4
No. Agencies in which it is used as standard   16     3     0     5    16     9       3        17       3
No. Countries in which it is used as
standard in one or more Agencies               10     3     0     1     8     7       3         7       2

Table 7.1 Summary of frequency and extent of use of flood distributions.
(Distributions with one or fewer reported uses not included)
(Source: Table 3, Appendix 6. Total No. Agencies in survey = 54. Total No. Countries = 28).
The survey, which is based on at-site estimation, also revealed that "no special method of parameter
estimation is preferred and the graphical method is as frequently, or even more, used as any other method". This should
no longer be the case as there is sufficient evidence to show that joint at-site/regional procedures, based on PWMs, are
better than most other flood quantile estimation methods.
7.2
SELECTED CASES
Summaries of selected investigations into the form of distribution for AM floods are given below in order of
their relative popularity.
7.2.1
UNITED STATES OF AMERICA
The U.S. Water Resources Council established a Work Group on Flood Frequency Methods in 1966. The
Group's first two years work was comprehensively reported upon by Benson (1968). The group decided that several
methods of flood frequency analysis in common use among Federal agencies would be applied to a group of 10 long
term records of annual maximum flows at selected stations in the U.S.. The catchments chosen represent a wide range
of climate, hydrological conditions and size of catchment. Records with obvious outliers were avoided initially. One of
the records was 97 years long, the other nine varying between 40 and 62 years.
The following distributions were fitted to the 10 records:

Distribution              Number of different programs used
2-parameter gamma         2
Gumbel                    2
Log Gumbel                2
Log-Normal                4
Log-Pearson Type 3        3
Hazen method              1
Total                     14
The first five distributions were fitted by the programs of more than one agency and in all 14 sets of
computations were used. From each computation the flood estimates of return period T = 2, 5, 10, 25, 50 and 100
years were obtained. For each return period there were 14 values for each station.
The calculated estimates of QT for each T value listed above were compared with "data values" obtained for
each station from a probability plot of that station's data. The ranked flood data at a station were assigned probability
plotting positions i / (N+1), where i is rank and N is record length, and plotted accordingly on extreme value
logarithmic graph paper. Values of QT, the "data values", were obtained by linear interpolation between the two
adjacent peaks which bracketed the specified probability or return period.
These "data values" were then used as the basis against which the 14 computed values of QT, for each T, could
be compared. The deviations for each return period were expressed as
$(\hat{Q}_T - Q_D) / Q_D$, where $\hat{Q}_T$ is the estimate of QT
obtained by one of the 14 distribution-fitting computer programs and QD is the corresponding interpolated "data value".
These standardized deviations were listed separately by method and averages over the 10 stations were obtained for each
return period and method. These average values of deviations "were an important consideration in deciding between
methods".
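
The construction of a "data value" and of the standardized deviation can be sketched as follows. The scale on which the Work Group interpolated is not stated above, so linear interpolation in exceedance probability is assumed here purely for illustration; the flows and function names are hypothetical.

```python
def data_value(flows, T):
    """Interpolate Q_T between the two ranked peaks whose plotting positions
    i / (N + 1) bracket the exceedance probability 1 / T."""
    ranked = sorted(flows, reverse=True)  # rank 1 is the largest flood
    n = len(ranked)
    p_target = 1.0 / T
    probs = [(i + 1) / (n + 1) for i in range(n)]
    if not probs[0] <= p_target <= probs[-1]:
        raise ValueError("return period lies outside the range of the record")
    for j in range(n - 1):
        if probs[j] <= p_target <= probs[j + 1]:
            w = (p_target - probs[j]) / (probs[j + 1] - probs[j])
            return ranked[j] + w * (ranked[j + 1] - ranked[j])
    raise ValueError("no bracketing pair of peaks found")

def relative_deviation(q_hat, q_data):
    """Standardized deviation (Q_hat - Q_D) / Q_D of a fitted estimate."""
    return (q_hat - q_data) / q_data

# Hypothetical record of nine annual maxima.
flows = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]
q2 = data_value(flows, 2)    # exceedance probability 0.5
q4 = data_value(flows, 4)    # exceedance probability 0.25
dev = relative_deviation(82.5, q4)
```

Note that with positions i / (N+1) a record of N years can only supply "data values" up to T = N + 1, so for the higher return periods the comparison rests on very few of the largest observed peaks.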
In coming to a conclusion and in making a recommendation it was noted that "no single method of testing the
computed results against the original data was acceptable to all those on the Work Group". Further "the statistical
consultants employed by the group had indicated that no unique procedure could be specified as correct for any one
method of flood-frequency analysis" and they "could not offer a mathematically rigorous method" for selecting a best
method. Consequently, the group decided that if a unique choice could not be made on statistical grounds alone then a
choice would nevertheless have to be made for compelling administrative reasons.
Guided by the average deviation results described above, the group decided to recommend that the LP3
distribution (with LN as a special case) be adopted as a base method for analysing flood-flow frequencies (for Federal
Agencies). Allowance was made for the use of other distributions also provided there were sufficiently justifiable
reasons. The group also recommended that their choice of a base method should not freeze hydrological practice into a
set pattern but that research into flood frequency methods should continue.
The U.S. Water Resources Council accepted the group's recommendation on the LP3 distribution and has
retained it in each of its subsequent recommendations on methods of determining flow frequencies (1967,1976,1977,
1981). This choice has been supported by Beard (1974). Implementation of the method involves computing the mean,
standard deviation and skewness of logarithms of the data series. This can be referred to as estimation by moments in
log space (LS) as distinct from other possible moment estimators (Phien and Hsu, 1984). A general map of skewness
has been prepared by the Council for the entire United States of America. This allows a value of skewness to be
obtained for a site at which insufficiently long records are available for a reliable estimate of skewness to be obtained
from the data.
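
Estimation by moments in log space can be sketched as below. The sketch is not the Council's published procedure: it assumes base 10 logarithms, the usual bias-adjusted skewness coefficient, and the Wilson-Hilferty approximation to the Pearson Type 3 frequency factor in place of tabulated values. The optional skew argument is a hypothetical hook through which a value read from a skewness map could replace the station value.

```python
import math
from statistics import NormalDist

def log_space_moments(flows):
    """Mean, standard deviation and skewness of the base 10 logarithms."""
    z = [math.log10(q) for q in flows]
    n = len(z)
    mean = sum(z) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in z) / (n - 1))
    skew = n * sum((v - mean) ** 3 for v in z) / ((n - 1) * (n - 2) * sd ** 3)
    return mean, sd, skew

def frequency_factor(skew, T):
    """Wilson-Hilferty approximation to the Pearson Type 3 frequency factor."""
    z = NormalDist().inv_cdf(1 - 1 / T)
    if abs(skew) < 1e-6:
        return z  # zero skew: LP3 reduces to the lognormal case
    g6 = skew / 6
    return (2 / skew) * ((1 + g6 * z - g6 ** 2) ** 3 - 1)

def lp3_quantile(flows, T, skew=None):
    """LP3 estimate of Q_T; pass skew to override the station skewness."""
    mean, sd, station_skew = log_space_moments(flows)
    g = station_skew if skew is None else skew
    return 10 ** (mean + frequency_factor(g, T) * sd)

# With zero skew the 2 year flood is the geometric mean of the record.
q2_lognormal = lp3_quantile([100.0, 1000.0, 10000.0], 2, skew=0.0)
```

Setting the skewness to zero recovers the lognormal special case mentioned above, which gives a simple check on the arithmetic.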
The "Guidelines for Determining Flood Flow Frequency" (USWRC, 1976, 1977, 1981) also discusses issues
other than the choice of distribution notably (i) the detection and treatment of outliers, (ii) treatment of series containing
floods caused by vastly different types of precipitation events and (iii) treatment of years with zero flood values for rivers
in arid climates.
7.2.2
UNITED KINGDOM
A Floods Study Team was set up in England in 1970 to study all aspects of flood hydrology with a view to
recommending methods of flood estimation to the engineering profession. The team consisted of full-time professional
staff working at the Institute of Hydrology and the Meteorological Office. All the extant recorded flow data were collected
from the several gauging authorities. All usable flood series extracted from these records, over 500 in number, were
published. Major studies in rainfall frequency estimation, flood frequency estimation, rainfall-runoff modelling and
flood routing were carried out. The results of the studies were published in five volumes by the Natural Environment
Research Council (NERC, 1975). The Flood Studies Report was reviewed by Burges (1979).
Methods of flood estimation considered were the use of the annual maximum series model and the partial
duration series model principally, with an exploratory study of a time series method using a shot noise model.
Estimation problems encountered in series with missing and/or censored peaks were also dealt with. Among these
methods, the relationships developed for flood estimation for ungauged catchments were based on the annual maximum
series model.
A detailed study of the form of distribution of annual maximum floods was carried out. Only six records in
excess of fifty years length were available, so it was decided to include records of thirty years or longer, of which twenty
nine were available, into this aspect of the study. Six additional records, although shorter than thirty years, were
included from the Republic of Ireland. A variety of statistical tests were employed in trying to discriminate between
distributions but not all tests were applied to all distributions.
7.2.2.1. χ² goodness of fit index
This index is not renowned for high power in the statistical sense and is not very useful. It was applied
individually to thirty-eight records with each of three distributions (EV1, LN2, GEV). The number of stations at which
it rejected each distribution varied greatly with the significance level of the test.
Significance level: 0.10, 0.05, 0.01
Distribution: Gumbel, Lognormal, GEV
Rejection counts: 17, 7, 14, 11, 15, 7, 3, 2, 2
Table 7.2. Number of times out of 38 records that stated distributions were rejected by the χ² goodness
of fit test.
7.2.2.2 The Kolmogorov-Smirnov (K-S) goodness of fit test
This test was applied to the above three distributions as well as to the two parameter Gamma distribution. In
contrast to the χ² test, this test rejected a distribution in only a few cases. It should be noted that the applicability of
this test is not restricted in any way by sample size.
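As an illustration of the quantity on which the K-S test is based, the following minimal sketch computes the maximum distance between the empirical distribution function of a sample and a candidate distribution function. The Gumbel parameters and the data are hypothetical, and the comparison with tabulated critical values is left out.

```python
import math

def gumbel_cdf(x, loc, scale):
    """EV1 (Gumbel) distribution function."""
    return math.exp(-math.exp(-(x - loc) / scale))

def ks_statistic(sample, cdf):
    """Largest gap between the empirical step function and cdf, checked on both sides of each step."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical annual maxima and hypothetical fitted parameters.
annual_maxima = [312.0, 458.0, 275.0, 690.0, 401.0, 523.0, 350.0, 842.0, 298.0, 476.0]
d_max = ks_statistic(annual_maxima, lambda x: gumbel_cdf(x, 380.0, 140.0))
```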
Significance level: 0.10, 0.05, 0.01
Gumbel, Gamma: 1, 1, 0, 0, 0, 2
Lognormal: 1, 0, 0
GEV: 0, 0, 0
Table 7.3. Number of times out of 38 records that stated distributions were rejected by the
Kolmogorov-Smirnov goodness of fit test.
7.2.2.3 Goodness of fit test based on probability plot
The distribution and the record to be compared are represented by a line or curve and a set of plotted points
respectively on the same diagram or probability plot. For each distribution and sample being compared the quantities
$$d_i = \frac{Q(i) - d(i)}{\bar{Q}}, \qquad i = 1, 2, \ldots, N \qquad (7.1)$$
were computed, where N is the record length, Q(i) is the observed ith smallest value, d(i) is the computed variate value on
the line or curve at the ith plotting position and $\bar{Q}$ is the mean of the observed series.
A set of d values $d_1, d_2, \ldots, d_N$ can be summarised either by the mean absolute value $|\bar{d}|$ or by the root mean
square deviation, rms(d). As a result, the conclusion drawn from any one comparison (of fitted distribution and observed
record) may depend on which method of summarising the d values is used. Another source of diversity lies in the
manner in which d(i) is defined. It depends on the plotting probability formula used. The traditional Weibull formula
$F_i = i/(N + 1)$ is biased (Cunnane, 1978) but was included because of its widespread use. The Hazen plotting position,
$F_i = (i - 0.5)/N$, was also used, as was the plotting position based on the expected value of the reduced variate order
statistic. The latter varies with distribution type but corresponds to the Gringorten formula (i - 0.44)/(N + 0.12) for
the Gumbel distribution and to the Blom formula (i - 3/8)/(N + 1/4) for the Normal distribution.
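The deviations of equation (7.1) and their two summaries can be sketched as below. This is a minimal sketch under stated assumptions: the fitted distribution is taken to be EV1 with parameters estimated by ordinary moments, the data are hypothetical, and the general plotting position form (i - a)/(N + b) is used so that the Weibull (a = 0, b = 1), Hazen (a = 0.5, b = 0) and Gringorten (a = 0.44, b = 0.12) formulae can be compared.

```python
import math

def plotting_positions(n, a, b):
    """F_i = (i - a) / (n + b) for i = 1..n."""
    return [(i - a) / (n + b) for i in range(1, n + 1)]

def gumbel_fit_moments(sample):
    """EV1 location and scale by ordinary moments (assumed estimator for this sketch)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    scale = sd * math.sqrt(6.0) / math.pi
    return mean - 0.5772156649 * scale, scale

def gumbel_quantile(f, loc, scale):
    return loc - scale * math.log(-math.log(f))

def deviations(sample, a, b):
    """d_i = (observed ith smallest value - fitted value at ith plotting position) / sample mean."""
    xs = sorted(sample)
    n = len(xs)
    q_bar = sum(xs) / n
    loc, scale = gumbel_fit_moments(xs)
    fitted = [gumbel_quantile(f, loc, scale) for f in plotting_positions(n, a, b)]
    return [(x - q) / q_bar for x, q in zip(xs, fitted)]

def mean_abs(d):
    return sum(abs(v) for v in d) / len(d)

def rms(d):
    return math.sqrt(sum(v * v for v in d) / len(d))

# Hypothetical annual maxima; compare summaries under two plotting positions.
sample = [312.0, 458.0, 275.0, 690.0, 401.0, 523.0, 350.0, 842.0, 298.0, 476.0]
d_weibull = deviations(sample, 0.0, 1.0)
d_gringorten = deviations(sample, 0.44, 0.12)
summary = {"weibull": (mean_abs(d_weibull), rms(d_weibull)),
           "gringorten": (mean_abs(d_gringorten), rms(d_gringorten))}
```

Because rms(d) weights large deviations more heavily than the mean absolute value does, the two summaries can rank the same set of fitted distributions differently.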
For each record (35 in number) and each distribution (7 in number) values of $|\bar{d}|$ were obtained using each of
the three plotting position formulae mentioned. In addition rms(d) was obtained using the plotting position based on
the expected value of the order statistics. This gave three tables, each containing 7 x 35 values of $|\bar{d}|$, and one table of
7 x 35 rms(d) values. The distributions used were EV1, G2, LN2, GEV, P3, LP3 and log-gamma. For each record, in
each table, each distribution was assigned a rank between 1 and 7, rank 1 for the best fitting distribution (low $|\bar{d}|$) and
rank 7 for the worst fitting one. For each distribution the ranks were summed over the 35 stations and these totals of
ranks were used as the basis of comparison.
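The ranking scheme just described can be sketched as follows. The goodness of fit values are hypothetical, only three distributions and two records are shown, and ties are not given any special treatment in this sketch.

```python
def rank_totals(scores):
    """scores: one dict per record mapping distribution name -> fit measure (lower is better).
    Returns the sum over records of each distribution's rank (1 = best fit)."""
    totals = {}
    for per_record in scores:
        ordered = sorted(per_record, key=per_record.get)
        for rank, name in enumerate(ordered, start=1):
            totals[name] = totals.get(name, 0) + rank
    return totals

# Hypothetical mean absolute deviations for two records.
scores = [
    {"EV1": 0.10, "LN2": 0.08, "GEV": 0.05},
    {"EV1": 0.07, "LN2": 0.09, "GEV": 0.06},
]
totals = rank_totals(scores)
best = min(totals, key=totals.get)
```

The distribution with the smallest total of ranks is the one preferred on this basis.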
In the first three tables the results were the same showing no dependence on plotting position. The three
parameter distributions showed the best fit in the following order: LP3, GEV, P3 while LN2 was the best fitting of the
two parameter distributions. However, when the goodness of fit was expressed by rms(d) (the fourth table), P3 jumped
to first place followed by LP3 and GEV. Thus the measure of average d value influences the relative merits of the
several distributions in a serious way since LP3 and P3 distributions are so different in the manner with which
magnitude varies with return period T at large values of T.
In summary the results showed that:
Three parameter distributions fitted more closely than two parameter ones.
LN2 fitted more closely than any other two parameter distribution.
On the basis of mean absolute deviation $|\bar{d}|$ the order was LP3 better than GEV better than P3 and this
result was independent of plotting position formula and method of parameter estimation (moments or
maximum likelihood).
On the basis of root mean square deviation, rms(d), the order was P3 better than LP3 better than GEV.
7.2.2.4 Method used by U.S. Water Resources Council
The method used by the group on flow frequency methods, described earlier in 7.2.1 above, was applied also to
the UK but only to the data of six gauging stations. The same seven distributions as were used in the previous section
of the study were used. It was found that, when using the Weibull plotting formula as in the USWRC study, the GEV
distribution gave the best fit at low values of T while LP3 gave the closest fit at high values of T. When this type of
study was repeated for the six stations but using the Gringorten plotting formula (related to expected value of reduced
variate order statistics) P3 again showed the closest fit, followed by GEV, followed by LP3. Thus again it was found
that the order in which the distributions would be chosen as best depends on a small change in procedure.
7.2.2.5. Recommended distribution
Following the studies described above the U.K. Flood Studies Report (NERC, 1975) recommended the GEV
distribution because:
It performed consistently well, if not best, in the statistical goodness of fit tests. The superiority of either P3
or LP3 distributions depends on which of two arbitrary steps in the testing procedure was adopted.
It described very well the empirically derived distributions of the standardised variable $Q/\bar{Q}$ which were
derived for each of 10 regions by pooling the data of each region (NERC 1975, Vol. I, 2.6).
It is in accord with the rainfall probability distributions derived independently from a vast amount of
meteorological data (NERC 1975, Vol. II).
Estimates of its parameters by moments, sextiles or maximum likelihood are as easy to obtain as the
corresponding estimates in other three parameter distributions. [Note that PWM estimation for GEV (Hosking et al.,
1985b) is both simple and efficient].
It has some theoretical attraction.
It was recommended that if only a small sample, N < 25, were available, the EV1 distribution be fitted to it
because it is a special case of GEV. This was recommended for the sake of consistency even though the lognormal
performed better than it in the goodness of fit tests. If the true distribution were GEV then the EV1 would certainly be
able to approximate it well over a limited range of return period without having to suffer the disadvantage of having to
estimate a third parameter from the small sample.
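The sense in which EV1 is a special case of GEV can be illustrated with the quantile function. The sketch below assumes the parameterisation x(F) = u + α[1 - (-ln F)^k]/k, with location u, scale α and shape k (positive k giving an upper bound); these symbols are assumptions of the sketch and may differ from the notation used elsewhere in this report. As k tends to zero the expression tends to the EV1 form u - α ln(-ln F).

```python
import math

def gev_quantile(F, u, alpha, k):
    """Variate value with non-exceedance probability F; k = 0 gives the EV1 (Gumbel) case."""
    y = -math.log(F)
    if abs(k) < 1e-9:
        return u - alpha * math.log(y)
    return u + alpha * (1.0 - y ** k) / k

# Hypothetical parameters: the 100-year value for a near-zero shape
# is almost identical to the EV1 value with the same u and alpha.
u, alpha = 100.0, 30.0
x_ev1 = gev_quantile(0.99, u, alpha, 0.0)
x_gev = gev_quantile(0.99, u, alpha, 1e-7)
```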
It should be noted that the type of goodness of fit tests related to probability plots and described above suffer
from a major weakness in that they take no account of the fact that the natural sampling variation of the largest
elements in a sample is far greater than that for the middle ranking values. Another point to note is that such tests
almost inevitably pick out the three parameter distributions before the two parameter ones. This is not a sufficient
reason for accepting three parameter distributions and rejecting two parameter ones. If the true population were EV1
such tests would show more favour to the GEV, with its third parameter, than to the parent itself. The same would
hold if the true population were LN; such tests would show more favour to LP3 than to the parent distribution.
7.2.3 AUSTRALIA
In 1977 the Institution of Engineers of Australia adopted the LP3 distribution for flood frequency analysis in
Australia (IEA, 1977). The suitability of this distribution for describing Australian annual maximum series data has
been studied by McMahon and Srikanthan (1981). They made use of the moment ratio diagrams Cs versus Cv and $\beta_1$
versus $\beta_2$ where Cv and Cs are coefficients of variation and skewness respectively, both corrected to some extent for
small sample bias, and $\beta_1 = C_s^2$ and $\beta_2$ is the kurtosis. On these diagrams the LP3 distribution appears as a curve
whose coordinates vary with the value of the shape parameter, b, in the log domain. There is a separate curve for each
value of scale parameter, a, considered. (See Table 3.1 for notation). On these diagrams also the Normal, exponential
and EV1 distributions can be represented by single points while Gamma, Weibull and LN distributions are represented
by single lines or curves.
Sample values of Cv, Cs = $\sqrt{\beta_1}$ and $\beta_2$ were computed for 172 series of annual maximum floods and each
series was represented by a point on both the Cs - Cv and the $\beta_1$ - $\beta_2$ diagram. The authors state that "in the $\beta_1$ - $\beta_2$
diagram, it is observed that most of the points plot to the right of the Gamma, lognormal, Weibull, Normal, Gumbel
and exponential distributions". In the Cs - Cv diagram these distributions do not cover more than half the points.
However the LP3 distribution is found to cover satisfactorily the data points both in the $\beta_1$ - $\beta_2$ and Cs - Cv diagrams.
This simple analysis suggests that at least for these 172 streams LP3 is the only suitable general distribution for flood
frequency analysis. In particular it is noted that the two parameter lognormal distribution is generally unsatisfactory".
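The sample quantities plotted on such moment ratio diagrams can be sketched as follows. This minimal sketch uses plain (uncorrected) moment estimators, whereas the study described above applied some small sample bias correction; the function name is hypothetical. The reference points listed for the Normal, EV1 and exponential distributions are standard theoretical values.

```python
import math

def moment_ratios(sample):
    """Uncorrected sample Cv, Cs, beta1 = Cs**2 and beta2 (kurtosis)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    cs = m3 / m2 ** 1.5
    return {"Cv": math.sqrt(m2) / mean, "Cs": cs, "beta1": cs ** 2, "beta2": m4 / m2 ** 2}

# Distributions that plot as single points on a beta1 - beta2 diagram: (Cs, beta2).
REFERENCE_POINTS = {"Normal": (0.0, 3.0), "EV1": (1.1396, 5.4), "exponential": (2.0, 9.0)}

ratios = moment_ratios([1.0, 2.0, 3.0, 4.0, 5.0])
```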
The consistency of the plotted points between the two types of diagrams was checked as follows. For each
stream the value of LP3 scale parameter a was determined from Cv and Cs. Using this scale parameter value and the
observed $\beta_1 = C_s^2$ value, a value of $\beta_2$ was obtained from the $\beta_1$ - $\beta_2$ diagram. A plot of this value of $\beta_2$ against the
observed value of $\beta_2$ for each stream was then prepared. If the plotted points fell on the 45° line it would imply full
consistency between the two types of diagrams for the LP3 distribution in the light of the observed data. The majority
of the points plotted on one side of the line however.
The sample sizes on which the above analysis is based are not given, although full details are available in
McMahon (1979). It is inevitable that the observed values of Cv, Cs and PZ suffer from sampling effects and possibly
also from bias. If a correction is made for bias the sampling errors remain and even if these are randomly distributed
about the respective true values, their scatter may make a judgement between distributions less certain. The authors
conclude "Nevertheless the analysis as it stands suggests, at least for these Australian streams, that the LP3 distribution
is a satisfactory base method for analysing flood flow frequencies".
In conclusion also the authors questioned the wisdom of setting the skewness to zero when using the LP3
distribution for flood estimation problems in those cases where the observed value of sample skewness, in the log
domain, is not statistically different from zero. The shortcomings of moment ratio diagrams for the purposes of
selecting a distribution are discussed in Appendix 3.
7.2.4 STATE OF CALIFORNIA (USA)
Wu and Goodridge (1974) report on studies made on the distributional form of rainfall and runoff series in
California. These were annual series of:
Short duration rainfall of from 5 minutes to 12 hours duration from 73 stations with 20 or more years of
record;
Long duration rainfall of from one to 60 days duration from 53 stations with 70 or more years record;
Annual peak runoff and runoff volumes for durations ranging from 1 to 364 days from 90 records of 20 or
more years.
For each type of series for each station values of skewness Cs = $\sqrt{\beta_1}$ and kurtosis excess Ck = $\beta_2$ - 3 were
calculated and weighted average values of Cs and Ck for each type of series were obtained. The pair of values (Cs, Ck)
for each type of series was then plotted on Cs - Ck moment ratio diagrams on which specific points, lines or regions
represent the theoretical relation between Cs and Ck of the several candidate distributions of interest. Separate moment
ratio diagrams were used for (i) short duration rainfall, (ii) long duration rainfall and (iii) runoff data.
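A weighted averaging step of this kind might look as follows. The weighting scheme used in the original study is not stated here, so this sketch assumes weights equal to the number of years of record; the station values are hypothetical.

```python
def kurtosis_excess(beta2):
    """Ck = beta2 - 3, zero for the Normal distribution."""
    return beta2 - 3.0

def weighted_average(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical station statistics: (Cs, beta2, years of record).
stations = [(0.9, 4.1, 25), (1.4, 6.0, 40), (1.1, 4.8, 31)]
years = [n for _, _, n in stations]
cs_avg = weighted_average([cs for cs, _, _ in stations], years)
ck_avg = weighted_average([kurtosis_excess(b2) for _, b2, _ in stations], years)
```

The resulting pair (cs_avg, ck_avg) is the single point that would be plotted on the Cs - Ck diagram for that type of series.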
Their conclusions are (a) that the P3 distribution is the best overall model for precipitation series of durations
up to 30 days, with the Weibull distribution best for longer durations and (b) that the Weibull distribution is the best
overall model for all of the runoff series. It should be noted that the curves representing the P3 and Weibull
distributions on the Cs - Ck diagrams lie quite close to one another.
The Wu and Goodridge (1974) study is not comparable with that of McMahon and Srikanthan (1981) even
though both are based on the use of moment ratio diagrams. The former does not consider the LP3 distribution while
the latter considers only that distribution.
In another study Cruff and Rantz (1965) compared the use of six probability distributions for flood series in
coastal regions of California and recommended use of the P3 distribution as a result.
7.2.5 ITALY
Cicioni et al. (1973) examined the suitability of five distributions on 108 sets of flood data and concluded that
LN2 was the best as a whole. The other distributions tested were LN3, G2, P3 and EV1. However Rossi et al. (1984)
found EV1, LN2 and GEV unsuitable for Italian flood data and recommended TCEV instead on the grounds of its ability
to model regional distribution of skewness.
7.2.6 CANADA
Spence (1973) compared the use of Normal, LN2, EV1 and log EV1 distributions to AM series for Canadian
Prairie catchments and concluded that the LN2 distribution was the most suitable overall one.
CHAPTER 8
CONCLUDING REMARKS
8.1 Types of model
Statistical methods of flood frequency estimation in current use employ AM or PD series type models in
which a series of flood magnitudes is assumed to behave like a random sample of independent identically distributed
variates. Most research work and published work deal with peak flows as the variable of interest even though the
variables flood volume and flood duration are of practical interest in many applications. In the majority of research
projects attention has been confined to the AM model.
8.2 The modelling problem
The main modelling problem is the selection of the probability distribution for the flood magnitudes coupled
with the choice of estimation procedure.
8.3 Descriptive ability of distributions
Random samples from many of the distributions traditionally used for frequency analysis do not display the
same behaviour of skewness as do observed regional AM data sets. Exceptions are the relatively recently introduced
"thick tailed" distributions such as WAK, GLG and TCEV. These are also sufficiently flexible to provide a good fit to
observed data.
Many of the traditionally used three-parameter distributions (P3, LP3, GEV or EV2, Weibull) are sufficiently
flexible to provide a moderately good fit to observed data but they do give rise to the condition of separation of
skewness.
Of the two widely used two-parameter distributions, EV1 and LN2, the latter can show a reasonable fit to a
wider variety of observed data than can the former. The EV1 sometimes fits observed data well in humid climates in
which floods do not vary greatly from year to year (low Cv). These two distributions also give rise to the condition of
separation of skewness.
8.4 Predictive ability and robustness
As well as considering descriptive ability the choice of a D/E procedure must take into account the predictive
abilities of such procedures. In view of the lack of absolute knowledge of the correct form of distribution of floods the
property of robustness is very important in this context. This depends both on the distribution chosen and method of
parameter estimation.
8.5 Parameter estimation
Parameter estimation by ML has optimal properties when the sample(s) on which it is used actually are drawn
from the distribution assumed in the procedure. If the sample is from a different distribution the optimal properties are
by no means guaranteed. Since it is distribution-specific it may not be robust.
Parameter estimation by ordinary moments, while very popular among hydrologists, is known to be biased
and inefficient especially with three-parameter distributions. The exact corrections for bias are not easy to summarize in
simple formulae.
Parameter estimation by PWM, which is relatively new, is as easy to apply as ordinary moments, is usually
unbiased and is almost as efficient as ML. Indeed in small samples PWM may be as efficient as ML. With a suitable
choice of distribution PWM estimation also contributes to robustness and is attractive from that point of view.
Another attraction of the PWM method is that it can be easily used in regional estimation schemes.
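The ease of application of PWM estimation can be illustrated with a minimal sketch. It assumes the unbiased sample estimators of the probability weighted moments β_r = E[X F(X)^r], the EV1 relations α = (2b1 - b0)/ln 2 and u = b0 - 0.5772α, and the approximate GEV shape estimator of Hosking et al. (1985b) with positive k denoting an upper bound. Function names are hypothetical, and the GEV routine is not guarded against k = 0.

```python
import math

def sample_pwms(sample):
    """Unbiased estimates b0, b1, b2 from the ordered sample x(1) <= ... <= x(n)."""
    xs = sorted(sample)
    n = len(xs)
    b0 = sum(xs) / n
    b1 = sum((j - 1) * x for j, x in enumerate(xs, start=1)) / (n * (n - 1))
    b2 = sum((j - 1) * (j - 2) * x for j, x in enumerate(xs, start=1)) / (n * (n - 1) * (n - 2))
    return b0, b1, b2

def ev1_from_pwms(b0, b1):
    """EV1 location u and scale alpha."""
    alpha = (2.0 * b1 - b0) / math.log(2.0)
    return b0 - 0.5772156649 * alpha, alpha

def gev_from_pwms(b0, b1, b2):
    """GEV location u, scale alpha and shape k (approximate shape estimator)."""
    c = (2.0 * b1 - b0) / (3.0 * b2 - b0) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    g = math.gamma(1.0 + k)
    alpha = (2.0 * b1 - b0) * k / (g * (1.0 - 2.0 ** (-k)))
    return b0 + alpha * (g - 1.0) / k, alpha, k

b0, b1, b2 = sample_pwms([1.0, 2.0, 3.0, 4.0])
u_ev1, alpha_ev1 = ev1_from_pwms(b0, b1)
u_gev, alpha_gev, k_gev = gev_from_pwms(b0, b1, b2)
```

In a regional scheme the same functions could be applied to regionally averaged standardised PWMs instead of at-site values.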
Graphical estimation even in regional index flood types of scheme leads to very variable estimates which are
not objective and should not be used since efficient, objective methods are available. Unbiased plotting positions
(Chapter 4.3.4) should be used for data display. Weibull plotting positions lead to bias in quantile estimates if used in
either graphical or least squares estimation schemes.
8.6 At-site and at-site/regional estimation
A choice must be made between flood estimates based on (a) at-site data alone or (b) at-site plus regional data.
(a) Flood estimates may be based on at-site data alone if:
    (i) the at-site record is exceptionally long;
    (ii) there are no regional data available;
    (iii) the region is very heterogeneous i.e. Cv of Cv > 0.4.
It must be accepted that a single at-site record can provide limited quality estimates over a limited range of
return periods. If a two-parameter distribution is used the standard error of estimate will be smaller than if a
three-parameter distribution is used but the bias will (probably) be larger. On the other hand, use of a
three-parameter distribution can be accompanied by such large standard error as to make the estimate of very little
value. This is the limitation of use of at-site data alone. It follows of course that it is not suggested here that
multiparameter distributions such as WAK or TCEV be used with single records;
(b) Flood estimates may profitably be based on joint use of at-site and regional data, providing a reasonably
homogeneous flood region can be identified. In this context a homogeneous region is a collection of
catchments whose flood statistics are homogeneous. It does not imply that all catchments in it are in a
confined geographical area. The advantage of joint use of at-site and homogeneous regional data is that there
is sufficient information in the combined data set to enable a multiparameter distribution to be used reliably.
Thus a distribution which does not cause the condition of separation of skewness can be used.
In these circumstances, the WAK distribution needs to be considered. It does not cause the condition of
separation of skewness, i.e. it satisfies the descriptive requirements of a flood frequency model, its
parameters can be reliably estimated from regionally averaged standardized PWMs and it has been impressive
in all the robustness tests published thus far. It is stressed again that it is not being suggested here that WAK
be used in at-site mode.
Of course, all three-parameter distributions, when used in this context, will give more reliable results
than when used in at-site mode alone even though they may cause the condition of separation of skewness i.e.
they may not satisfy all the descriptive requirements of a flood frequency model.
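The heterogeneity criterion "Cv of Cv > 0.4" can be checked with a short calculation. The sketch below assumes that Cv is computed at each site from its own record and that the coefficient of variation of those at-site values is then taken across sites, using the n - 1 divisor throughout; the records and function names are hypothetical.

```python
import math

def coefficient_of_variation(values):
    """Standard deviation (n - 1 divisor) divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return sd / mean

def cv_of_cv(site_records):
    """Coefficient of variation, across sites, of the at-site Cv values."""
    return coefficient_of_variation([coefficient_of_variation(r) for r in site_records])

def is_very_heterogeneous(site_records, limit=0.4):
    return cv_of_cv(site_records) > limit

# Hypothetical three-site region with at-site Cv values of 0.1, 0.5 and 0.9.
region = [[9.0, 10.0, 11.0], [5.0, 10.0, 15.0], [1.0, 10.0, 19.0]]
flag = is_very_heterogeneous(region)
```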
8.7 Arid and semi-arid zones
Floods in semi-arid and arid zones generally have much higher Cv than those of humid zones. Hence longer
records, unfortunately not often available, are required for such zones. Also the lower AM flood values may be more of
a distraction than value to the estimation scheme. Serious consideration should be given to censoring the lower AM
values. Since this would leave very few flood values at each station it is imperative to use regional estimation methods
in such circumstances, subject to the proviso that Cv of Cv does not exceed 0.4. Many of these "high Cv hydrologies"
are in the developing countries.
8.8 Regional homogeneity
Research is continuing to establish how to distinguish between catchments within a country which have
different $Q_T/\bar{Q}$ versus T relations (see Chapters 4.6 and 5.3.1). Some catchment types may have steeper curves (larger
Cv) than others and this difference may be a function of catchment characteristics other than area alone. Since greater
quantile accuracy can be achieved by grouping catchments into homogeneous groups, efforts should be made in any
flood estimation scheme to check on regional homogeneity. A small amount of regional heterogeneity is tolerable and
in such cases regional flood estimation schemes still perform better than at-site ones.
8.9 Necessity for flow gauging
In the absence of at-site data a $\bar{Q}$ versus catchment characteristics relation may be used to obtain $\bar{Q}$. It is
worth stressing that a gauging station should be installed at any site as soon as it becomes clear that flood estimates
will be required there, as a small amount of site data greatly improves the precision of the $\bar{Q}$ estimate which can then
be used with a regionally based estimate of $Q_T/\bar{Q}$.
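The way an at-site estimate of the mean annual flood is combined with a regional growth factor can be sketched as below. The EV1 form of the regional growth curve, its parameter values and the mean annual flood are all hypothetical; the sketch only illustrates the arithmetic Q_T = (mean annual flood) x (regional Q_T / mean annual flood).

```python
import math

def ev1_growth_factor(T, u, alpha):
    """Regional growth factor for return period T from an EV1 curve fitted to standardised floods."""
    y = -math.log(-math.log(1.0 - 1.0 / T))  # Gumbel reduced variate
    return u + alpha * y

def design_flood(q_bar, growth_factor):
    """T-year flood as mean annual flood times regional growth factor."""
    return q_bar * growth_factor

# Hypothetical values: growth curve scaled so that the standardised mean is 1.
alpha_reg = 0.31
u_reg = 1.0 - 0.5772156649 * alpha_reg
q_bar = 120.0  # mean annual flood from a short at-site record, m3/s
q50 = design_flood(q_bar, ev1_growth_factor(50, u_reg, alpha_reg))
```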
8.10 Interpretation and use of flood frequency estimates
The concept of flood frequency analysis as an aid in decision making is a useful one. However it is a
technique which may easily be mis-used. Most design flood estimates involve statistical extrapolation of some kind,
the dangers of which, referred to in Chapter 1, Section 6, should be borne in mind.
REFERENCES
Acreman, M.C., and Sinclair, C.D., 1986: Classification of drainage basins according to their physical characteristics;
An application for flood frequency analysis in Scotland. J. Hydrol., 84, 365-380. (Also pres. at Brit.
Hydrol. Soc., Newcastle upon Tyne, July, 1984).
Acreman, M.C., and Hosking, J.R.M., 1986: Estimating regional flood frequency curves for Scotland in the presence
of correlation. Unpubl. Rept., Inst. of Hydrology, Wallingford, U.K.
Adamowski, K., 1985: Nonparametric kernel estimation of flood frequencies. Water Resour. Res., 21(11), 1585-1590.
Aitchison, J., and Brown, J.A.C., 1957: The log-normal distribution, with special reference to its uses in economics.
Cambridge University Press, New York.
Alexander, G.N., Karoly, A. and Susts, A.B., 1969: Equivalent distributions with application to rainfall as an upper
bound to flood distributions. J. Hydrol., 9, 322-344.
Alexander, G.N., Karoly, A. and Susts, A.B., 1969: Equivalent distributions with application to rainfall as an upper
bound to flood distributions (continued, parts 3 and 4). J. Hydrol., 9, 345-373.
Ahmad, M.I., 1988: Application of statistical methods to flood frequency analysis. Thesis pres. in fulfillment of
Ph.D. degree, Univ. of St. Andrews, Scotland, 169 pages.
Ahmad, M.I., Sinclair, C.D. and Spurr, B.D., 1988 (b): Assessment of flood frequency models using EDF statistics.
Water Resour. Res., 24(8), 1323-1328.
Ahmad, M.I., Sinclair, C.D. and Werrity, A., 1988: Log-logistic flood frequency analysis. J. Hydrol., 98, 205-224.
Anderson, T.W. and Darling, D.A., 1954: A test of goodness of fit. JASA, 49, 765-769.
Arnell, N.W. and Gabrielle, S., 1985: Regional flood frequency analysis with the two-component extreme value
distribution: An assessment using computer simulation experiments. Workshop on combined efficiency of
direct and indirect estimations for point and regional flood prediction, Perugia, Italy, December.
Arnell, N.W., Beran, M.A., and Hosking, J.R.M., 1986: Unbiased plotting positions for the general extreme value
distribution. J. Hydrol., 86, 59-69.
Arnell, N.W. and Gabrielle, S., 1988: The performance of the two component extreme value distribution in regional
flood frequency analysis. Water Resour. Res., 24(6), 879-887.
Ashkar, F. and Rousselle, J.: A multivariate statistical analysis of flood magnitude, duration and volume. In V.P.
Singh (Ed.) "Statistical analysis of Rainfall and Runoff", pp 651-668, Water Resources Publ., Colorado,
1982.
Ashkar, F. and Rousselle, J., 1983a: Some remarks on the truncation used in partial duration flood series models.
Water Resour. Res., 19(2), 477-480.
Ashkar, F. and Rousselle, J., 1983b: The effect of certain restrictions imposed on the inter-arrival times of floods.
Water Resour. Res., 19(2), 481-485.
Atkins, G.P., 1980: Regional flood frequency analysis in Papua New Guinea. Ph.D. thesis, Univ. Technology, Lae,
Papua New Guinea. 473 pages.
Bardsley, K.E., 1977: A test for distinguishing between extreme value distributions. J. Hydrol., 34(3/4), 371-381.
Beable, M.E. and McKerchar, A.I., 1982: Regional flood estimation in New Zealand. New Zealand National Water and
Soil Cons. Org., Water and Soil Tech. Publ. No. 20, 132pp.
Beard, L.R., 1960: Probability estimates based on small normal distribution samples. Jour. Geophys. Res., 65(7),
2143-2148.
Beard, L.R., 1962: Statistical methods in hydrology. Hydrologic Engineering Centre, Corps of Engineers, Davis, Ca.
Beard, L.R., 1974: Flood flow frequency techniques. Center for Research in Water Resources, The University of Texas
at Austin, 28 pages + Tables + Appendices.
Beard, L.R., 1987: Discussion of "Relative Accuracy of Log Pearson III Procedures" by Wallis and Wood (1985).
ASCE, J. Hydraul. Engng., 113(a), 1205-1206.
Benson, M.A., 1950: Use of historical data in flood frequency analysis. Trans. Am. Geophys. Union, 31(3), 419-424.
Benson, M.A., 1960: Characteristics of frequency curves based on a theoretical 1000 year record. U.S. Geol. Surv.,
Water Supply Paper 1543-A.
Benson, M.A., 1962a: Evolution of methods for evaluating the occurrence of floods. U.S. Geol. Surv., Water Supply
Paper 1550-A.
Benson, M.A., 1962b: Factors influencing the occurrence of floods in a humid region of diverse terrain. U.S. Geol.
Surv., Water Supply Paper No. 1580-B, 64p.
Benson, M.A., 1968: Uniform flood frequency estimating methods for federal agencies. Water Resour. Res., 4(5), 891-908.
Beran, M.A. and Nozdryn-Plotnicki, M.J., 1977: Estimation of low return period floods. Hydrol. Sci. Bull., 22, 275-282.
Beran, M.A., Hosking, J.R.M. and Arnell, N., 1986: Comment on "Two-component extreme value distribution for
flood frequency analysis" by Rossi et al. (1984). Water Resour. Res., 22(2), 263-266.
Bernier, J., 1967a: Sur le theorie du renouvellement et son application en hydrologie. Electricite de France, HYD,
67(10).
Bernier, J., 1967b: Methodes Bayesiennes en hydrologie statistique. Proc. Intl. Hydrol. Symp., Fort Collins,
Colorado, U.S.A., 459-470.
Biswas, A.K. and Fleming, G., 1966: Floods in Scotland: Magnitude and frequency. Water and Water Engg., 246-252.
Blom, G., 1958: Statistical estimates and transformed Beta variables. John Wiley, New York, pp. 68-75 and 143-146.
Bobee, B., 1973: Sample error of T-year events computed by fitting a Pearson Type 3 distribution. Water Resour.
Res., 9(5), 1264-1270.
Bobee, B. and Robitaille, R., 1975: Correction and bias in the estimation of the coefficient of skewness. Water
Resour. Res., 11(6), 851-854.
Borgman, L.E., 1963: Risk criteria. Jour. Waterways and Harb. Div., ASCE, 89, 1-35.
Boughton, W.C., 1980: A frequency distribution for annual floods. Water Resour. Res., 16(2), 347-354.
Bridges, W.C., 1982: Technique for estimating magnitude and frequency of floods on natural flow streams in Florida.
US Geol. Surv. Water Resour. Invest. 82-4012.
Burges, S.J., 1979: Review of NERC's flood studies report 1960 (q.v.) EOS, Amer. Geophys. Union, 60(46), 788-790.
Burges, S.J., Lettenmaier, D.P. and Bates, C.L., 1975: Properties of the three-parameter log-normal distribution.
Water Resour. Res., 11(2), 229-235.
Cervantes, J.E., 1981: A trigger type cluster model for flood analysis. Ph.D. thesis, Purdue Univ., West Lafayette,
Indiana, U.S.A.
Cervantes, J.E., Kavvas, M.L. and Delleur, J.W., 1983: Cluster model for flood analysis. Water Resour. Res., 19(1),
209-224.
Charbeneau, R.J., 1978: Comparison of the two- and three-parameter log normal distributions used in streamflow
synthesis. Water Resour. Res., 14(1), 149-150.
Chen Jia-Qi, Ye Yong-yi and Tan Wei, 1975: The important role of historical flood data in the estimation of spillway
design floods. Seientia Sinica, 18(5),669 - 680.
Chow, V.T., 1950: Discussion of "Annual Floods and the Partial Duration Series" by W.B. Langbein. Trans. Am.
Geophys. Union, 31(6), 939 - 941.
Chow, V.T., 1951: A general formula for hydrologic frequency analysis. Trans. Amer. Geophys. Union, 32, 231- 237.
Chow, V.T., 1953: Frequency analysis of hydrologic data with special application to rainfall intensities. Univ. Illinois
Eng. Expt. St., Bul!. 414.
Chow. V.T., 1954: The log probability law and its engineering applications. Proc. ASCE, 80(536), I - 25.
Cieioni, G., Giuliano G. and Spaziani, F.M., 1973: Best fitting of probability functions to a set of data for flood
studies. Proc. Second Int!. Symp. on Hydrology - floods and droughts. Water Resour. Pub!., Fort Collins,
Colorado pp. 304 - 314.
Cole, G., 1966: An application of the regional analysis of flood flows. Institution of Civil Engineers, London,
Symposium on river flood hydrology (1965), 39-57.
Condie, R., 1977: The log-Pearson type 3 distribution: The T-year event and its asymptotic standard error by
maximum likelihood theory. Water Resour. Res., 13(6), 987-991.
Commons, W., 1986: Sampling behaviour of coefficients of skewness and kurtosis in the context of moment ratio
diagrams. Unpubl. Report, Dept. Engng. Hydrol., Univ. Coll., Galway, Ireland. 6 pages.
Condie, R. and Lee, K.A., 1983: Flood frequency analysis with historic information. J. Hydrol., 58(1/2), 47-62.
Conger, D.H., 1986: Estimating magnitude and frequency of floods for ungauged urban streams in Wisconsin.
US Geol. Surv. Water Resour. Invest. Rept 86-4005.
Correia, F.N.: Multivariate partial duration series in flood risk analysis. In V.P. Singh (Ed.) "Hydrologic Frequency
Modelling", 541-554, D. Reidel Publ. Co., 1987.
Cruff, R.W. and Rantz, S.E., 1965: A comparison of methods used in flood frequency studies for coastal basins in
California. U.S. Geol. Surv. Water Supply Paper 1580-E.
Cunnane, C. and Nash, J.E., 1971: Bayesian estimation of frequency of hydrologic events. Proc. Warsaw Symp. on
Math. Models in Hydrology. (IAHS Sci. Publ. 100, pp 47-55, 1974).
Cunnane, C., 1973: A particular comparison of annual maximum and partial duration series methods of flood frequency
prediction. J. Hydrol., 18, 257-271, with ERRATA 19, p377.
Cunnane, C., 1978: Unbiassed plotting positions - a review. J. Hydrol., 37(3/4), 205 - 222.
Cunnane, C., 1979: A note on the Poisson assumption in partial duration series models. Water Resour. Res., 15(2),
489 - 494.
Cunnane, C., 1984: Condition of separation of skewness of random samples from the general extreme value
distribution. Unpubl. Rep., Dept. Eng. Hydrol., Univ. Coll., Galway, 7pp.
Cunnane, C., 1987: Review of statistical models for flood frequency estimation. Paper pres. at Int. Symp. on flood
frequency and risk analyses, Baton Rouge, La. Publ. in Singh, V.P. (Ed.), Hydrologic frequency modelling,
49 - 95, Reidel Publ. Co., Dordrecht.
Cureton, E.E., 1968: Unbiased estimation of the standard deviation. Amer. Stat. 22(1), p.22.
Dalin, J.S., 1986: Statistical analyses of mixed populations. Paper pres. at Int. Sympos. on Flood frequency analysis
and risk analyses, Baton Rouge, La., 20 MS pages.
Dalrymple, T., 1960: Flood frequency methods. U.S. Geol. Survey, Water Supply Paper 1543 A, pp 11 - 51,
Washington.
De Coursey, D.G., 1972: Objective regionalisation by peak flow rates. Proceedings Second International Symposium
in Hydrology, Fort Collins, Colorado, pp 385 - 405.
Eagleson, P.S., 1972: Dynamics of flood frequency. Water Resources Research, 8(4), 878-898.
Eychaner, J.H., 1984: Estimation of magnitude and frequency of floods in Pima County, Arizona with comparisons of
alternative methods. Water Resour. Invest. Rep. 84-4142, U.S. Geol. Surv., Tucson, Arizona.
Farhan, Y.I., 1984: Regionalisation of surface water catchments on east bank of Jordan. 25th Intnl. Geog. Congress
Pre-Sympos. No. 30 on "Problems in regional hydrology", Univ. of Freiburg, Fed. Rep. Germany.
Fiering, M.B., 1969: Streamflow Synthesis. Macmillan, London (Chapt. 3).
Fiering, M.B. and Jackson, B.B., 1971: Synthetic streamflows. Amer. Geophys. Union, Water Resour. Monograph
1, 98 pp.
Fiorentino, M. and Gabriele, S., 1984: A correction for the bias of maximum likelihood estimators of Gumbel
parameters. J. Hydrol., 73(1/2), 39 - 50.
Foster, H.A., 1924: Theoretical frequency curves and their application to engineering problems. Trans. ASCE, 87,
142-173.
Fuller, W.E., 1914: Flood Flows. Trans. Am. Soc. Civ. Engrs., 77, 564 - 617.
Greenwood, J.A., Landwehr, J.M., Matalas, N.C. and Wallis, J.R., 1979: Probability weighted moments: Definition
and relation to parameters of distributions expressible in inverse form. Water Resour. Res., 15(5), 1049-1054.
Greis, N.P. and Wood, E.F., 1981: Regional flood frequency estimation and network design. Water Resour. Res.,
17(4), 1167 - 1177.
Gringorten, I.I., 1963: A plotting rule for extreme probability paper. J. Geophys. Res., 68(3), 813 - 814.
Guillot, P., 1973: Application of the method of GRADEX. Proc. 2nd Int. Sympos. in Hydrol., "Floods and
droughts", pp 44 - 49, Water Resour. Public., Fort Collins, Colorado.
Gumbel, E.J., 1941: The return period of flood flows. Ann. Math. Statist., 12(2), 163 - 190.
Gumbel, E.J., 1958: Statistics of extremes. Columbia Univ. Press, 375 pp.
Hall, M.J. and O'Connell, P.E., 1972: Time series analysis of mean daily river flows. Water and Water Engng., 76,
125-133.
Hasselblad, V., 1969: Estimation of finite mixtures of distributions from the exponential family. J. Am. Stat. Assoc.,
64, 1459 - 1471.
Hazen, A., 1914: Discussion on "Flood flows" by W.E. Fuller. Trans. ASCE., 77, 626-632.
Hazen, A., 1932: Flood flows. John Wiley, New York, 199 pp.
Hebson, C.S. and Wood, E.F., 1982: A derived flood frequency distribution using Horton order ratios. Water Resour.
Res., 18(5), 1509 - 1518.
Hebson, C.S. and Cunnane, C., 1986: Assessment of use of at-site and regional flood data for flood frequency
estimation. Paper presented at Int. Sympos. on Flood Frequency and Risk Analyses, Baton Rouge. Publ. in
Singh, V.P. (Ed.), 1987, Hydrologic Frequency Modelling, 433 - 448, Reidel Publ. Co., Dordrecht.
Hedman, E.R. and Osterkamp, W.R., 1982: Streamflow characteristics related to channel geometry of streams in
Western United States. US Geol. Surv. Water Supply Paper 2193, 17p.
Hirsch, R.M.: Probability plotting position formulas for flood records with historical information. Paper pres. at
US-China bilateral symposium on the analysis of extraordinary flood events, Nanjing, Oct. 1985. Publ. in
J. Hydrol., 96(1-4), 185 - 199, 1987.
Hoshi, K. and Burges, S.J., 1981: Sampling properties of parameter estimates for the log-Pearson Type 3 distribution
using moments in real space. J. Hydrol., 53, 305-316.
Hosking, J.R.M., 1984: Testing the general extreme value distribution hypothesis. Biometrika.
Hosking, J.R.M., 1985: Comment on "A correction for the bias of maximum likelihood estimates of Gumbel
parameters", by Fiorentino and Gabriele. J. Hydrol., 78(3/4), 393-396.
Hosking, J.R.M., 1986a: The theory of probability weighted moments. IBM Math. Res. Rep., RC12210, Yorktown
Heights, New York, 160p.
Hosking, J.R.M., 1986b: The Wakeby distribution. IBM Math. Res. Rep., RC12302, Yorktown Heights, New York,
21p.
Hosking, J.R.M. and Wallis, J.R., 1984: Palaeoflood hydrology and flood frequency analysis. EOS, 65(45), p890
(Pres. at Amer. Geophys. Union Fall Meeting, San Francisco, Dec. 1984).
Hosking, J.R.M. and Wallis, J.R., 1985: The effect of inter-site dependence on regional flood frequency analysis.
EOS, 66(46), p906 (Pres. at Amer. Geophys. Union Fall Meeting, San Francisco, Dec. 1985).
Hosking, J.R.M. and Wallis, J.R., 1986a: Palaeoflood hydrology and flood frequency Analysis. Water Resour. Res.,
22(4),543 - 550.
Hosking, J.R.M. and Wallis, J.R., 1986b: The value of historical data in flood frequency analysis. Water Resour.
Res., 22(11), 1606 - 1612.
Hosking, J.R.M., Wallis, J.R. and Wood, E.F., 1985 (a): An appraisal of the regional flood frequency procedure in the
UK Flood Studies Report. Hydrol. Sci. J., 30(1), 85-109.
Hosking, J.R.M., Wallis, J.R. and Wood, E.F., 1985 (b): Estimation of the generalised extreme value distribution by
the method of probability weighted moments. Technometrics, 27(3), 251 - 261.
Houghton, J.C., 1977: Robust estimation of the frequency of extreme events in a flood frequency context. Ph.D.
Dissertation, Div. of Appl. Sci., Harvard Univ., Cambridge, Mass.
Houghton, J.C., 1978 (a): Birth of a parent: The Wakeby distribution for modelling flood flows. Water Resour. Res.,
14(6), 1105 - 1109.
Houghton, J.C., 1978 (b): The incomplete means estimation procedure applied to flood frequency analysis. Water
Resour. Res., 14(6), 1111 - 1115.
Hua, Shi-Qian, 1985: A general survey of flood frequency analysis in China. Paper pres. at US-China bilateral
Symposium on the analysis of extraordinary flood events, Nanjing. Publ. in J. Hydrol., 96(1-4), 15 - 24,
1987.
Hydrologic Engineering Center (HEC), 1975: Hydrologic frequency analysis. Hydrologic Engineering Methods for
Water Resour. Development, (US contrib. to IHD), Vol. 3, Corps of Engineers, Davis, Calif., HEC-IHD-0300, 16 pages.
In-na, Nophadal, 1988: A new plotting formula for Pearson Type 3 distribution. M.Sc. Thesis Subm. to Dept.
Engng. Hydrol., Univ. Coll., Galway, Ireland, 49 pages.
Institution of Engineers, Australia, 1977: Australian rainfall and runoff - flood analysis and design. I.E.A., 149 pp.
Jenkinson, A.F., 1955: The frequency distribution of the annual maximum (or minimum) values of meteorological
elements. Quart. J. Roy. Meteor. Soc., 81, 158 - 171.
Jenkinson, A.F., 1969: Statistics of extremes. In: Estimation of maximum floods. WMO No. 233, TP 126, (Tech.
Note No. 98), 183 - 228.
Jennings, M.E. and Benson, M.A., 1969: Frequency curves for annual flood series with some zero events or
incomplete data. Water Resour. Res., 5(1), 276 - 280.
Ji Xue-wu, Ding Jing, H.W. Shen and Salas J.D., 1984: Plotting positions for Pearson type 3 Distribution. J. Hydrol.,
(74), 1 - 29.
Kaczmarek, Z., 1957: Efficiency of the estimation of floods with a given return period. Proc. Toronto Sympos., IAHS
Public. No. 45, 145 - 159.
Kavvas, M.L., 1982 (a): Stochastic trigger model for flood peaks, 1. Development of the model. Water Resour. Res.,
18(2), 383 - 398.
Kavvas, M.L., 1982 (b): Stochastic trigger model for flood peaks, 2. Application of the model to the flood peaks of
Goksu Karahaali. Water Resour. Res., 18(2), 399 - 411.
Kendall, M.G. and Stuart, A., 1961: The advanced theory of statistics, Griffin, London, Vol. 2, 522 - 527.
Kirby, W., 1974: Algebraic boundedness of sample statistics. Water Resour. Res., 10(2), 220 - 222.
Klemes, V., 1986: Dilettantism in Hydrology:- Transition or Destiny? Water Resour. Res., 22(9), 177S - 188S.
Klemes, V., 1987: Hydrological and engineering relevance of flood frequency analysis. Paper presented at Int. Sympos.
on Flood Frequency and Risk Analyses, Baton Rouge. Publ. in Singh, V.P. (Ed.), 1987, Hydrologic
Frequency Modelling, 1 - 18, Reidel Publ. Co., Dordrecht.
Kottegoda, N.T., 1984: Investigation of outliers in annual maximum flow series. J. Hydrol., 72, pp. 105 - 137.
Kuczera, G., 1982a: Combining site-specific and regional information: An empirical Bayes approach. Water Resour.
Res., 18(2),306 - 314.
Kuczera, G., 1982b: Robust flood frequency models. Water Resour. Res., 18(2),315 - 324.
Kuczera, G., 1983: Effect of sampling uncertainty and spatial correlation on an empirical Bayes procedure for
combining site and regional information. J. Hydrol., 65(4), 373 - 398.
Lall, U. and Beard, L.R., 1982: Estimation of Pearson type 3 moments. Water Resour. Res., 18(5), 1563 - 1569.
Lamberti, P. and Pilati, S., 1985: Probability distributions of annual maxima of seasonal hydrological variables.
Hydrol. Sci. Jour., 30(1), 111 - 136.
Landwehr, J.M., Matalas, N.C. and Wallis, J.R., 1978: Some comparisons of flood statistics in real and log space.
Water Res. Res. 14(5),902 - 920, 1978; and CORRECTION in 15(6), p1672.
Landwehr, J.M., Matalas, N.C. and Wallis, J.R., 1979 (a): Probability weighted moments compared with some
traditional techniques in estimating Gumbel parameters and quantiles. Water Resour. Res. 15(5), 1055 - 1064.
Landwehr, J.M., Matalas, N.C. and Wallis, J.R., 1979 (b): Estimation of parameters and quantiles of Wakeby
distribution. Water Resour. Res., 15(6), 1361 - 1379.
Landwehr, J.M., Matalas, N.C. and Wallis, J.R., 1980: Quantile estimation with more or less flood-like distributions.
Water Resour. Res., 16(3), 547 - 555.
Landwehr, J.M., Tasker, G.D. and Jarrett, R.D., 1987: Discussion of "Relative Accuracy of log-Pearson III procedures"
by Wallis and Wood (1985). ASCE, J. Hydraul. Engng., 113(9), 1206 - 1210.
Langbein, W.B. and others, 1947: Topographic characteristics of drainage basins. U.S. Geol. Surv. Prof. Paper 968-C,
pp 125 - 157.
Langbein, W.B., 1949: Annual floods and the partial duration flood series. Trans. Am. Geophys. Union, 30, 879 - 881.
Leese, M.N., 1973: Use of censored data in the estimation of Gumbel distribution parameters for annual maximum
flood series. Water Resour. Res., 9(6), 1534 - 1542.
Lettenmaier, D.P, 1985: Regionalization in flood frequency analysis:- Is it the answer? Paper pres. at US-China
Bilateral Symposium on the analysis of extraordinary flood events, Nanjing.
Lettenmaier, D.P. and Potter, K.W., 1985: Testing flood frequency estimation methods using a regional flood
generating model. Water Resour. Res., 21(12), 1903 - 1914.
Lettenmaier, D.P., Wallis, J.R. and Wood, E.F., 1985: Note on the comparative robustness of estimates of extreme
flood quantiles. Pres. at American Geophysical Union, Spring Meeting, Baltimore.
Lettenmaier, D.P., Wallis, J.R. and Wood, E.F., 1987: Effect of heterogeneity on flood frequency estimation. Water
Resour. Res., 23(2), 313 -323.
Lieblein, J., 1953: A new method of analysing extreme value data. U.S. Nat. Adv. Comm. Aeronaut., Tech. Note
3053, 88 pages.
Lloyd, E.H., 1952: Least squares estimation of location and scale parameters using order statistics. Biometrika, 39, 88
- 95.
Lowery, M.D. and Nash, J.E., 1970: A comparison of methods of fitting the double exponential distribution. J.
Hydrol., 10(3), 259 - 275.
Matalas, N.C. and Wallis, J.R., 1972: An approach to formulating strategies for flood frequency analysis. Proc. Int.
Symp. on Uncertainties in Hydrologic and Water Resource Systems, Tucson, Arizona, 940 - 961.
Matalas, N.C. and Wallis, J.R., 1973: Eureka! It fits a Pearson type 3 distribution. Water Resour. Res., 9(2), 281-289.
Matalas, N.C., Wallis, J.R. and Slack, J.R., 1975: Regional skew in search of a parent. Water Resour. Res., 11(6),
815 - 826.
McMahon, T.A., 1979 (a): Hydrologic characteristics of Australian streams. Monash Univ., Clayton, Vict., Civ. Eng.
Res. Rep. No. 3/79, 79 pp.
McMahon, T.A., 1979 (b): Hydrologic characteristics of arid zones. Proc. Canberra Sympos., "The hydrology of areas
of low precipitation", IAHS Publ. No. 128, pp 105 - 124.
McMahon, T.A. and Srikanthan, R., 1981 (a): Log-Pearson type 3 distribution - is it applicable to flood frequency
analysis of Australian streams? J. Hydrol., 52, 139 - 147.
McMahon, T.A. and Srikanthan, R., 1981 (b): Log-Pearson type 3 distribution - effect of dependence, distribution
parameters and sample size on peak annual flood estimates. J. Hydrol., 52, 149 - 159.
Moran, P.A.P., 1957: The statistical treatment of flood flows. Trans. AGU, 38(4), 519 - 523.
Mosley, M.P., 1981: Delimitation of New Zealand hydrological regions. J. Hydrol., 49, 173 - 192.
Nash, J.E. and Amorocho, J., 1966: The accuracy of the prediction of floods of high return period. Water Resour. Res.,
2(2), 191 - 198.
Nash J.E. and Amorocho, J., 1967: Note on "The accuracy of the prediction of floods of high return period." Water
Resour. Res. (Letters), 3(2), p635.
Nash, J.E. and Shaw, B.L., 1966: Flood frequency as a function of catchment characteristics. Proc. Sympos. on River
Flood Hydrology (1965), Inst. Civ. Engnrs., London, 115 - 136.
NERC, 1975: Flood Studies Report. Nat. Environ. Res. Council, London, Vols. 1 - 5, 1100 pp.
Nguyen, V., In-na, N. and Bobee, B.: A new plotting position formula for Pearson Type 3 distribution. Subm. for
publication, ASCE J. Hydraul. Engng., 20pp, 1988.
Njenga, M., 1985: Simulation applied to the inference problem of the underlying distribution of hydrologic random
variables. Thesis subm. in partial fulfillment of M.Sc., Dept. Eng. Hydrol., Univ. Coll., Galway, 150pp.
Nozdryn-Plotnicki, M.J. and Watt, W.E., 1979: Assessment of fitting techniques for the log-Pearson type 3
distribution using Monte Carlo simulation. Water Resour. Res., 15(3), 714-718.
O'Donnell, T., Hall, M.J. and O'Connell, P.E., 1972: Some applications of stochastic hydrological models. Int.
Symp. on Modelling Techniques in Water Resources Systems, (Editor A.K. Biswas) Vol. 1, pp 227 - 239,
Publ. by Environment Canada, Ottawa.
Olin, D.A. and Bingham, R.H., 1982: Synthesised flood frequency of urban streams in Alabama. US Geol. Survey
Water Resour. Invest. 82-683.
Otten, A. and van Montfort, M.A.J., 1978: The power of two tests on the type of distribution of extremes. J. Hydrol.,
37, 195 - 199.
Phien, R.N. and Hira, M.A., 1983: Log-Pearson type 3 distribution: parameter estimation. J. Hydrol., 64, 25 - 37.
Phien, R.N. and Lung-Cheng Hsu, 1984: Variance of the T-year event in the log-Pearson type 3 distribution. J.
Hydrol., 77, 141 - 158.
Potter, W.D., 1958: Upper and lower frequency curves for peak rates of runoff. Trans. Amer. Geophys. Union, 39(1),
pp 100-105.
Quimpo, R.G., 1967: Stochastic model of daily flow sequences. Hydrology Paper No. 18, Colorado State Univ.
Reich, B.M., 1970: Flood series compared to rainfall extremes. Water Resour. Res., 6(6), 1655 - 1667.
Reich, B.M., 1977: Lysenkoism in U.S. flood determinations. Special session on flood frequency methods, Amer.
Geophys. Union, San Francisco, 13 pages.
Reich, B.M., 1985: Flood probability effects from progressive fluvial erosion. Paper presented at US-China Bilateral
Sympos. on the analysis of extraordinary flood events, Nanjing, China.
Riggs, H.C., 1978: Streamflow characteristics from channel size. A.S.C.E., J. Hydraul. Div., 104, HY1, 87-96.
Rosbjerg, D., 1984: Estimation in partial duration series with independent and dependent peak values. J. Hydrol., 76,
183 - 195.
Rossi, F., Fiorentino, M. and Versace, P., 1984: Two component extreme value distribution for flood frequency
analysis. Water Resour. Res., 20(7), 847 - 856.
Rossi, F., Fiorentino, M. and Versace, P., 1986: Reply to comment by Beran et al. (1986). Water Resour. Res., 22(2),
267 - 269.
Sangal, B.P. and Biswas, A.K., 1970: The three-parameter log-normal distribution and its application in hydrology.
Water Resour. Res., 6(2), 505 - 515.
Shane, R.M. and Lynn, W.R., 1964: Mathematical model for flood risk evaluation. J. Hydraul. Div., ASCE, 90, 1 - 20.
Sinclair, C.D. and Ahmad, M.J., 1987: Modified Anderson-Darling test. Techn. Rept., Dept. Math. Sci., Univ. of St.
Andrews.
Simmons, R.R. and Carpenter, D.H., 1978: Technique for estimating the magnitude and frequency of floods in
Delaware. US Geol. Surv. Water Resour. Invest., Open File Rept 78-93.
Singh, K.P. and Sinclair, R.A.: Two distribution method for flood frequency analysis. J. Hydraul. Div., ASCE, HY1,
98, 29 - 44.
Singh, V.P. and Aminian Hossein, 1986: An empirical relation between volume and peak of direct runoff. Water
Resour. Bull., 22(5), 725 - 730.
Singh, V.P. (Editor). "Regional Flood Frequency Analysis" Proc. Int. Sympos. on Flood Frequency and Risk
Analyses, Baton Rouge, Louisiana, May 1986. Publ. by Reidel Publ. Co., Dordrecht, 400 pp. 1987.
Slack, J.R., Wallis, J.R. and Matalas, N.C., 1975: On the value of information to flood frequency analysis. Water
Resour. Res., 11(5), 629-647.
Slade, I.J., 1936: The reliability of statistical methods in the determination of flood frequencies. U.S. Geol. Surv.,
Water Supply Paper 771, pp 421 - 432.
Song Dedun and Ding Jing, 1988: Estimation of Pearson type 3 parameters by method of probability weighted
moments. J. Hydrol., 101, 47 - 61.
Spence, E.S., 1973: Theoretical frequency distributions for the analysis of plains streamflow. Can. J. Earth Sciences,
10, 130 - 139.
Stedinger, J.R., 1980: Fitting log-normal distributions to hydrologic data. Water Resour. Res., 16(3) 481-490.
Stedinger, J.R., 1983 (a): Design events with specified flood risk. Water Resour. Res., 19(2),511 - 522.
Stedinger, J.R., 1983 (b): Estimating a regional flood frequency distribution. Water Resour. Res., 19(2),503 - 510.
Stedinger, J.R., 1983 (c): Confidence intervals for design events. ASCE, J. Hydraul. Engng., 109(1), pp 13 - 27.
Stedinger, J.R. and Tasker, G.D., 1985: Regional hydrological analysis: 1. Ordinary, weighted, and generalised least
squares compared. Water Resour. Res., 21(9), 1421 - 1432.
Stedinger, J.R. and Tasker, G.D., 1986: Regional hydrological analysis: 2. Model error estimators, estimation of
sigma and log-Pearson type 3 distributions. Water Resour. Res., 22(10), 1487 - 1499.
Sukhatme, P.V., 1937: Tests of significance for the Chi square population with two degrees of freedom. Annals of
Eugenics, 8, 52 - 56.
Taesombut, V. and Yevjevich, V., 1978: Use of partial flood series for estimating distribution of maximum annual
flood peak. Hydrol. Paper No. 97, Colorado State University, Fort Collins, 71 pp.
Takeuchi, K., 1984: Annual maximum series and partial duration series - evaluation of Langbein's formula and Chow's
discussion. J. Hydrol., 68, 275 - 284.
Tasker, G.D., 1982: Comparing methods of hydrologic regionalization. Water Resour. Bull., 18(6), 965 - 970.
Tasker, G.D., 1983: Effective record length for the T-year event. J. Hydrol., 64, 39 - 47.
Tasker, G.D.: Regional Analysis of Flood Frequencies. In Singh V.P. (Ed.), "Regional Flood Frequency Analysis",
Reidel Publ. Co., Dordrecht, pp 1-9, 1987.
Tasker, G.D. and Moss, M.E., 1979: Analysis of Arizona flood data network for regional information, Water Resour.
Res., 15(6), 1791 - 1796.
Tavares, L.V. and Da Silva, J.E., 1983: Partial duration series method revisited. J. Hydrol., 64, 1 - 24.
Thomas, H.A., 1949: Frequency of minor floods. Jour. Boston Soc. Civ. Engrs., 35, p. 425.
Thomas, D.M. and Benson M.A., 1970: Generalization of streamflow characteristics from drainage-basin
characteristics. U.S.G.S. Water-Supply Paper No. 1975.
Thomas, W.O., Jr., 1985: A uniform technique for flood frequency analysis. ASCE, J. Water Resour. Plann. and
Manag., 111(3) 321 - 327.
Thomas, W.O., Jr., 1987: Techniques used by U.S. Geological Survey in estimating the magnitude and frequency of
floods. Proc. 18th Binghamton Geomorphol. Sympos., Publ. by Unwin and Hyman, London, pp 267-288.
Todorovic, P., 1970: On some problems involving a random number of random variables. Annals Math. Statist.,
41(3), 1059 - 1063.
Todorovic, P. and Zelenhasic, E., 1970: A stochastic model for flood analysis. Water Resour. Res., 6(6),1641 - 1648.
Todorovic, P. and Rousselle, J., 1971: Some problems of flood analysis. Water Resour. Res., 7(5), 1144 - 1150.
Todorovic, P., 1978: Stochastic Models of Floods. Water Resour. Res., 14(2), 345-356.
U.S.W.R.C., 1967: A uniform technique for determining flood flow frequencies. Bulletin 15, Hydrology Committee,
Water Resources Council, Washington.
U.S.W.R.C., 1976: Guidelines for determining flood flow frequency. Bulletin 17, Hydrology Committee, Water
Resources Council, Washington. (Also revised versions, Bulletin 17A, 1977 and Bulletin 17B, 1981).
Van Montfort, J.A.J., 1970: On testing that the distribution of extremes is of type I when type 2 is the alternative. J.
Hydrol., 11(4), 421 - 427.
Van Montfort, M.A.J. and Gomes, M.I., 1985: Statistical choice of extremal models for complete and censored data.
J. Hydrol., 77, 77 - 87.
Wallis, J.R., 1980: Risk and uncertainties in the evaluation of flood events for the design of hydrologic structures.
Keynote address at "Seminar on Extreme Hydrological Events - Floods and Droughts", Erice, Italy, 33 pp.
Wallis, J.R., 1982: Hydrological problems associated with oil shale development. In Environmental Systems
Analysis and Management. Ed. S. Rinaldi, North Holland Publ. Co., pp 85 - 102.
Wallis, J.R. and Wood, E.F., 1985: Relative accuracy of log-Pearson III procedures. A.S.C.E. Journal of Hydraulic
Engineering, 111(7), 1043 - 1056. (See also discussion and reply, 113(9), 1205 - 1214, 1987).
Wallis, J.R. and Wood, E.F., 1987: Reply to discussion, by Beard (1987) and by Landwehr et al. (1987) of "Relative
accuracy of log-Pearson III procedures" by Wallis and Wood (1985). ASCE, J. Hydraul. Engng., 113(9), 1210
- 1214.
Wallis, J.R., Matalas N.C. and Slack J.R., 1974: Just a moment! Water Resour. Res., 10(2),211 - 219.
Wallis, J.R., Matalas, N.C. and Slack, J.R., 1977: Apparent regional skew. Water Resour. Res., 13(1), 159 - 182.
Waylen, P. and Woo, Ming-ko, 1982: Prediction of annual floods generated by mixed processes. Water Resour. Res.,
18(4), 1283 - 1286.
White, E.L., 1975: Factor analysis of drainage basin properties: classification of flood behaviour in terms of basin
geomorphology. Water Resources Bulletin, 11(4),676 - 687.
Wiltshire, S.W., 1985: Grouping basins for regional flood frequency analysis. Hydrol. Sci. Jour., 30(1), 151 - 159.
Wiltshire, S.W., 1986 (a): Identification of homogeneous regions for flood frequency analysis. J. Hydrol., 84, 287 - 302.
Wiltshire, S.W., 1986 (b): Regional flood frequency analysis I: Homogeneity statistics. Hydrol. Sci. Jour., 31(3), 321
- 333.
Wiltshire, S.W., 1986 (c): Regional flood frequency analysis II: Multivariate classification of drainage basins in
Britain. Hydrol. Sci. Jour., 31(3), 335 - 346.
Wiltshire, S. and Beran, M., 1986: Multivariate techniques for the identification of homogeneous flood frequency
regions. Pres. at Internat. Sympos. on flood frequency and risk analyses, Baton Rouge, USA, 17p, 1986.
Publ. in Singh, V.P. (Ed.), 1987, Hydrologic frequency modelling, Reidel Publ. Co., Dordrecht, pp 133-146.
Wood, E.F., 1974: A Bayesian approach to analysing uncertainty among stochastic models. Research Report 74 - 16,
Int. Inst. for Appl. Syst. Analy., Laxenburg, Austria, 19 pp.
Wood, E.F., 1976: An analysis of the effects of parameter uncertainty in deterministic hydrologic models. Water
Resources Research, 12(5), 925-935.
Wood, E.F. and Rodriguez-Iturbe, I., 1975: Bayesian inference and decision making for extreme hydrologic events.
Water Resour. Res., 11(4),533 - 542.
World Meteorological Organization, 1969: Estimation of maximum floods. World Meteorol. Organization, WMO-No.
233, TP 126, (Tech. Note No. 98), pp 183 - 228.
World Meteorological Organization, 1983: Guide to Hydrological Practices, Volume II. World Meteorol. Organization,
WMO-No. 168, p5 - 26.
World Meteorological Organization, 1981: Selection of distribution types for extremes of precipitation, by B. Sevruk
and H. Geiger. World Meteorol. Organization, WMO-No. 560, OH Rep. No. 15, 64 pp.
World Meteorological Organization, 1986: Manual on estimation of probable maximum precipitation. World
Meteorol. Organization, WMO-No. 332, OH Report 1, Revised edition.
Wu, B. and Goodridge, J.D., 1976: Selection of frequency distributions for hydrologic frequency analysis. Dept. of
Water Resources, State of California, Sacramento, 85 pages.
Yevjevich, V., 1968: Misconceptions in hydrology and their consequences. Water Resour. Res., 4 (2), 225 - 232.
Yevjevich, V., 1979: Extraction of full information on flood peaks in arid areas. Proc. Canberra Sympos., "The
hydrology of areas of low precipitation", IAHS Publ. No. 128, pp 223 - 234.
Yevjevich, V. and Taesombut, V., 1978: Information on flood peaks in daily flow series. Proc. Int. Symp. on Risk
and Reliability in Water Resources, Univ. of Waterloo, Ontario, Canada, pp. 451 - 470.
Yevjevich, V. and Obeysekaera, J.B.T., 1984: Estimation of skewness of hydrologic variables. Water Resour. Res.,
20(7), 935 - 943.
Zhang, Y., 1982: Plotting positions of annual flood extremes considering extraordinary values. Water Resour. Res.,
18(4),859-864.
APPENDIX 1
VOLUME FLOODS
Introduction
As indicated in Chapter 1, the vast majority of published work regarding flood distribution choice and
performance relates to instantaneous peak flows or to peak one day mean flows. However, instantaneous peak flow
information (or that on one day flow) alone is insufficient for certain very important types of work. Knowledge of the
instantaneous flood peak - return period relationship enables calculation of the probability that a bridge, flood plain or
building may be inundated or that a levee may be overtopped. However, it provides no information on the volume of
water which may flow over a levee and which may have to be pumped back into the river at a later stage. The amount
and frequency of such pumping would need to be estimated for economic analysis of an existing or proposed scheme.
There are two possible categories of volume - frequency problems. These relate to (a) volumes over a threshold flow and
(b) total volumes of flow.
Figure A1.1 Definition sketches for volume flood variables (flow Q against time t, showing the threshold flow Q0 and the duration D).
(a) VOLUMES OVER A THRESHOLD.
The data series $S_1, S_2, S_3, \ldots$ involved in such a study would consist of volumes of flows exceeding $Q_0$,
where $Q_0$ is the existing within-bank or within-levee flow or a proposed design channel flow. While analyses of such
series have undoubtedly been undertaken, reports of them have not been widely published. Hence no immediate advice
about distributions for such series is available. However, because of the Central Limit Theorem of statistics, the variable
S should be less skewed than instantaneous flood peak Q.
The theoretical derivation of distribution of flood volumes in this context has been discussed by Todorovic
(1978), Ashkar and Rousselle (1982) and Correia (1987).
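As a minimal sketch of how such a series of volumes over a threshold might be abstracted from a flow record, the following Python function sums the flow in excess of a threshold for each separate exceedance event. The function name, the rectangle-rule integration and the daily time step are illustrative assumptions, not a procedure given in this report.

```python
def threshold_volumes(flows, q0, dt_seconds=86400.0):
    """Return the volume (m3) above threshold q0 for each exceedance event.

    flows: equally spaced flow values in m3/s (daily means assumed here)
    q0: threshold flow in m3/s, e.g. a within-bank or design channel flow
    """
    volumes = []
    current = 0.0
    in_event = False
    for q in flows:
        if q > q0:
            # Rectangle rule: excess flow multiplied by the time step.
            current += (q - q0) * dt_seconds
            in_event = True
        elif in_event:
            # Flow has dropped back to or below the threshold: close the event.
            volumes.append(current)
            current = 0.0
            in_event = False
    if in_event:
        volumes.append(current)
    return volumes


# Hypothetical daily mean flows (m3/s) containing two separate exceedances of 10 m3/s.
daily = [5, 12, 20, 9, 4, 15, 11, 6]
print(threshold_volumes(daily, q0=10.0))
```

Each returned value corresponds to one member of the series S1, S2, S3, ... for the chosen threshold.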
(b) TOTAL FLOW VOLUMES.
Information on total flow volume - return period relationships V(D) - T, for a series of different durations D,
is a normal requirement for spillway assessment of any reservoir and is a necessity in designing a flood control reservoir
or in assessing its operating rules. Let {$V_i(D)$, i = 1, 2, ..., N} be a series of annual maximum flow volumes in N
years of record relating to duration D. Typically such series would be abstracted for D = 1, 3, 7 or 10, 30, 90 and 365
days (Hydrologic Engineering Center (HEC), 1975). Typically $V_i(D)$ would be expressed as an average flow so that
"peak flows and volumes can be readily compared and coordinated" (HEC, 1975). That is, if $V_i^*(D)$ is the actual
maximum volume of flow for duration D in m³ in year i, $V_i(D)$ is taken as $V_i^*(D) / (86400 D)$ m³/s.
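The abstraction of one value of the annual maximum D-day series can be sketched as follows, assuming a list of daily mean flows for a single year. The function names are hypothetical, and events that straddle the boundary between two years are ignored for simplicity.

```python
def max_d_day_mean_flow(daily_flows, d):
    """Largest D-day average flow (m3/s) within one year of daily mean flows."""
    if d < 1 or d > len(daily_flows):
        raise ValueError("duration must lie between 1 and the record length")
    window = sum(daily_flows[:d])  # running sum over the first d days
    best = window
    for k in range(d, len(daily_flows)):
        # Slide the window forward by one day.
        window += daily_flows[k] - daily_flows[k - d]
        best = max(best, window)
    return best / d


def volume_to_mean_flow(volume_m3, d):
    """Express a D-day volume in m3 as an average flow in m3/s."""
    return volume_m3 / (86400.0 * d)


# Hypothetical short record: the largest 3-day window is 30 + 50 + 20.
year = [10, 30, 50, 20, 10]
print(max_d_day_mean_flow(year, 3))
print(volume_to_mean_flow(100 * 86400.0, 3))
```

Both calls give the same average flow, because the second simply converts the corresponding 3-day volume back to m³/s.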
As indicated above, because of the Central Limit Theorem of statistics, the skewness of the variable V(D)
should be less than that of instantaneous peak Q and one would expect for sufficiently large D that V(D) would have a
Normal (Gaussian) distribution. HEC (1975) report typical skewness values of log V(D) for the United States of America
ranging from zero for instantaneous peaks (LN distribution) to -0.23 for D = 10, -0.37 for D = 90 and -0.40 for D =
365 days. These values are recommended for use in the USA in the absence of locally derived regional values. These,
along with the first two moments in the log domain obtained from the at-site data, are then used with the LP3
assumption (in a manner analogous to the USWRC (1977) recommendation for instantaneous peak flows) to calculate
V(D) - T relationships. Note that the negative skewness values imply upper bounds for V(D) at large T.
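A minimal sketch of such an LP3 calculation is given below. It combines the at-site mean and standard deviation of the logarithms with an externally supplied skewness. The frequency factor is obtained from the Wilson-Hilferty approximation, which is a substitution made here for brevity and is not the tabulated procedure of HEC or USWRC. The function name and the sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def lp3_quantile(values, skew, return_period):
    """Approximate T-year quantile under a log-Pearson type 3 assumption.

    values: at-site annual maxima (positive), e.g. V(D) in m3/s
    skew: skewness of the logarithms, taken from a regional source
    return_period: T in years (T > 1)
    """
    logs = [math.log10(v) for v in values]
    m = mean(logs)
    s = stdev(logs)  # sample standard deviation with N - 1 denominator
    z = NormalDist().inv_cdf(1.0 - 1.0 / return_period)
    if abs(skew) < 1e-9:
        k = z  # zero skew reduces to the log-normal case
    else:
        # Wilson-Hilferty approximation to the Pearson type 3 frequency factor.
        k = (2.0 / skew) * ((1.0 + skew * z / 6.0 - skew ** 2 / 36.0) ** 3 - 1.0)
    return 10.0 ** (m + k * s)


# Hypothetical 10-day annual maxima with the skewness quoted above for D = 10.
sample = [120.0, 95.0, 210.0, 150.0, 80.0, 175.0, 130.0, 160.0]
print(lp3_quantile(sample, skew=-0.23, return_period=100))
```

With a negative skewness the estimated quantile at large T falls below the corresponding log-normal value, consistent with the upper bound noted above.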
Statistical modelling of some aspects of the relationships between flood peak, duration and volume has also
been discussed by Todorovic (1978) while Singh and Hossein (1986) have examined empirical relationships between
peak and flood volume.
In view of the superiority of index flood methods over the LP3 / regional skew procedure it is recommended
that V(D) quantiles be also estimated by at-site / regional index flood methods. No precise guidelines can be offered
about the choice of parent distribution for standardized flood volumes at this time. However, those distributions which
have been found robust for instantaneous flood peaks should prove satisfactory for volumes also.
In any regional study of flood volumes the several standardised V(D) - T relationships for D = 1, 3, 7, ...
days should be examined for the possibility of finding one or more relationships between the V(D) - T parameters and
the duration D. This would allow for more efficient joint estimation of all parameters. For instance NERC (1975, I,
p354) found that the dependence of $\bar{Q}(D)$ on D could be expressed as
$$\bar{Q}(D) = \frac{\bar{Q}_i}{(1 + BD)^n}, \qquad 1 \le D \le 10 \text{ days}, \qquad \text{(A1.1)}$$
a form of relationship often used for rainfall intensity statistics. This was suggested by comparison with the rainfall
intensity duration relationship derived in the meteorological studies of NERC (1975, Volume 2, Chapter 3). Here $\bar{Q}_i$
is the mean of the AM instantaneous flood peak series while $\bar{Q}(D)$ is the mean of the AM flood volumes series (expressed
in m³/s). Unfortunately the estimated parameters B and n interact with one another and as a result were found to be poorly
correlated with catchment characteristics. In addition NERC (1975, I, pp 358-361) concluded that Cv(D), although
displaying a small reduction in value with increase in D, could be regarded as constant for all durations up to 10 days.
This effectively means a common growth curve $x_T$ - T could be adopted for all durations, where $x_T = Q(D)_T / \bar{Q}(D)$.
This degree of simplification however may not apply for D considerably larger than 10 days nor may it extend reliably
to other regions.
APPENDIX 2
ESTIMATES OF POPULATION MOMENTS AND THEIR BIASES
Let x_1, x_2, ..., x_N denote values in a random sample from some X population. Define crude sample moments as
m_r = \sum (x_i - \bar{x})^r / N        (A2.1)
where \bar{x} is the sample mean and summation is from i = 1 to i = N, and let
s_N = m_2^{1/2}        (A2.2)
In general m_r is a downwards biassed estimate of \mu_r, the population rth central moment, and
g_N = m_3 / m_2^{3/2}        (A2.3)
is a downwards biassed estimate of skewness, Cs. These biases (Wallis et al., 1974) are larger, in small samples, than
can be corrected by replacing the denominator N of the sample moments by (N - 1) or some other simple expression in
N. For instance
\hat{m}_2 = \sum (x_i - \bar{x})^2 / (N - 1)        (A2.4)
is an unbiassed estimate of the second central moment \mu_2 but its square root is not unbiassed for the standard
deviation, \sigma (Cureton, 1968). Similarly
\hat{m}_3 = N \sum (x_i - \bar{x})^3 / [(N - 1)(N - 2)]        (A2.5)
is considered an unbiassed estimator for \mu_3 in samples from a Normal distribution but
\hat{g} = \hat{m}_3 / \hat{m}_2^{3/2}        (A2.6)
is a biassed estimator for Cs. The expression
\hat{m}_4 = N^2 \sum (x_i - \bar{x})^4 / [(N - 1)(N - 2)(N - 3)]        (A2.7)
has been proposed as an unbiassed estimator for \mu_4 in samples from a Normal distribution but
\hat{h} = \hat{m}_4 / \hat{m}_2^2        (A2.8)
is a biassed estimator for kurtosis, Ck. Use of N alone as denominator in equation (A2.1) leads to biassed estimators
for \sigma, Cs and Ck but use of the denominators (N - 1), (N - 1)(N - 2)/N and (N - 1)(N - 2)(N - 3)/N^2 for \hat{m}_2, \hat{m}_3
and \hat{m}_4 respectively is not sufficient to eliminate bias in Cs and Ck estimates. The amounts of bias in s_N and g_N of
equations (A2.2) and (A2.3) as estimators for \sigma and Cs respectively were shown by Wallis et al. (1974) to be functions
of three things:
(i) sample size
(ii) skewness of parent population
(iii) form of parent population distribntion
This dependence is shown in Table A2.1 in which a small selection of bias ratios, \alpha(s_N) and \alpha(g_N), are quoted
from Wallis et al. (1974, Tables 3 and 4). The bias ratios are defined by
\alpha(s_N) = (population \sigma) / (Mean of 100 000 values of s_N)
\alpha(g_N) = (population Cs) / (Mean of 100 000 values of g_N)
There is only weak dependence between \alpha(s_N) and the form of the population distribution for population Cs <
5, but \alpha(g_N) is quite dependent on the form of population distribution especially for Cs > 3 and it is very dependent on
the actual value of population Cs. Wallis et al. (1974) discovered that the sampling distribution of skewness obtained
by simulation seemed to have an upper bound which was never exceeded no matter how many simulations were
performed. Kirby (1974) investigated the theoretical possibility of such an upper bound on Cv and gN and proved
theoretically that such bounds exist, thus:
0 \le Cv \le (N - 1)^{1/2}        (A2.9)
| g_N | \le (N - 2) / (N - 1)^{1/2}        (A2.10)
These are distribution-free results and hold for any set of N values, randomly selected or otherwise. To see
this consider samples of size 10 containing 9 values of unity and large 10th values e.g. 10, 10^2, 10^3, ..., 10^6 and note
that however large the 10th value becomes, gN never exceeds the bound of equation (A2.10).
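The check suggested above can be sketched in a few lines of Python. This is only an illustration: the function name `sample_cv_and_skew` is hypothetical, and s_N and g_N are assumed to be the crude (divisor N) statistics of equations (A2.2) and (A2.3). For this particular configuration (N - 1 equal values and one different value) the computed gN sits at the bound of equation (A2.10), to rounding, while Cv approaches its bound from below as the 10th value grows.

```python
import math

def sample_cv_and_skew(values):
    """Return (Cv, g_N) using crude moments with divisor N (assumed definitions)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    s_n = math.sqrt(m2)
    return s_n / mean, m3 / m2 ** 1.5

N = 10
cv_bound = math.sqrt(N - 1)             # bound of equation (A2.9)
g_bound = (N - 2) / math.sqrt(N - 1)    # bound of equation (A2.10)

results = []
for big in (10.0, 1e2, 1e3, 1e4, 1e5, 1e6):
    sample = [1.0] * (N - 1) + [big]    # nine values of unity and one large value
    cv, g = sample_cv_and_skew(sample)
    results.append((big, cv, g))
    print(f"10th value {big:>9.0f}: Cv = {cv:.4f} (bound {cv_bound:.4f}), "
          f"gN = {g:.4f} (bound {g_bound:.4f})")
```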
Hazen (1932) noticed that skewness calculated from AM series increased with sample size N and suggested
that sample values of skewness be multiplied by (1 + 8.5/N) to correct for this bias. The effectiveness of this factor in
conjunction with equation (A2.6) for estimating Cs as
G* = \hat{g} (1 + 8.5/N)        (A2.11)
where \hat{g} is defined by equation (A2.6), was investigated by Wallis et al. (1974) who computed
\alpha*(G) = (population Cs) / (Mean of 100 000 values of G*)        (A2.12)
and from comparison of \alpha*(G) and \alpha(g_N) they concluded that G* is an approximately unbiassed estimator for Cs for a
small range of Cs values, say 0.5 < Cs < 2 for the lognormal distribution. It is also a good estimator for the EV1
distribution.
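The skewness estimators discussed in this appendix can be compared on a single sample with a short sketch such as the one below. It assumes that g_N is the ratio m_3 / m_2^{3/2} of crude moments and that \hat{g} is the corresponding ratio of the moments of equations (A2.4) and (A2.5); the function name and the five-value sample are purely illustrative.

```python
def skewness_estimates(x):
    """Return crude g_N, adjusted g_hat and Hazen-corrected G* for a sample."""
    n = len(x)
    xbar = sum(x) / n
    d2 = sum((v - xbar) ** 2 for v in x)
    d3 = sum((v - xbar) ** 3 for v in x)
    m2, m3 = d2 / n, d3 / n                    # crude moments, equation (A2.1)
    g_n = m3 / m2 ** 1.5                       # assumed form of equation (A2.3)
    m2_hat = d2 / (n - 1)                      # equation (A2.4)
    m3_hat = n * d3 / ((n - 1) * (n - 2))      # equation (A2.5)
    g_hat = m3_hat / m2_hat ** 1.5             # assumed form of equation (A2.6)
    g_star = g_hat * (1.0 + 8.5 / n)           # Hazen factor, equation (A2.11)
    return {"g_N": g_n, "g_hat": g_hat, "G_star": g_star}

# Hypothetical sample, for illustration only.
example = skewness_estimates([1.0, 2.0, 3.0, 4.0, 10.0])
print(example)
```

With so few values the three estimates differ widely, which is the small-sample behaviour the bias ratios of Table A2.1 quantify.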
From the above it can be concluded that the populations, from which AM series are assumed to be random
samples, have values of Cv, Cs and Ck which are in excess of the average regional value calculated by the usual
formulae.
The Wallis et al. (1974) findings, based on Monte Carlo simulation work, have serious implications for
methods of flood estimation which depend on the use of sample moments. One such method is the US WRC (1967,
1977, 1981) recommended use of the log Pearson Type 3 distribution with parameter estimation by moments in the
log-domain. Bobée and Robitaille (1975) suggested a skewness estimator for the Pearson Type 3 distribution
\hat{C}_s = \bar{G}_s [ (1 + 6.51/N + 20.2/N^2) + (1.48/N + 6.77/N^2) \bar{G}_s^2 ]        (A2.13)
where \bar{G}_s is the average regional skew. Lall and Beard (1982) pointed out that Bobée and Robitaille's estimator is based
on Wallis et al.'s (1974) investigation into the expected value of gN given Cs while it is used to estimate Cs given gN.
They state that "In the light of the bounded nature of sample skew and its skewed distribution it is suspected that there
is a lack of a one-to-one correspondence between the expected value of C s given gN and the expected value of gN given
Cs". In conclusion they point out the need to take the magnitude of the Kirby (1974) bound on skewness into account
and they state that the nature of the (required) bias correction factor for skewness is probably markedly different from
what is currently used.
Yevjevich and Obeysekera (1984) also consider methods of skewness estimation, especially from gamma and
lognormally distributed samples. They point out that estimates of population skewness obtained by distributional
methods of parameter estimation (e.g. ML with form of population known) show marked improvement in bias and
efficiency over distribution-free methods (e.g. use of sample moments with distribution-free correction factors).
Table A2.1: Selection of Bias Factor Values for Sample Standard Deviation and Skewness.
\alpha = (population parameter value) / (Mean of sample statistic value over 100 000 random samples).
(After Wallis et al., 1974).

                Population    \alpha(s_N)                  \alpha(g_N)
Distribution    Skewness Cs   N=10    N=50    N=90      N=10     N=50    N=90
Normal             0.00       1.084   1.016   1.009     Not quoted
Gumbel             1.14       1.108   1.021   1.012      2.172   1.217   1.123
Lognormal          0.25       1.085   1.016   1.009      1.903   1.141   1.076
                   1.14       1.104   1.020   1.011      2.161   1.221   1.139
                   5.00       1.276   1.083   1.054      4.234   2.064   1.757
                  15.00       1.581   1.226   1.164     10.239   4.654   3.788
Pareto             3.00       1.191   1.047   1.027      2.744   1.484   1.316
                   5.00       1.265   1.079   1.050      4.202   2.118   1.813
                  15.00       1.392   1.147   1.102     11.784   5.613   4.659
Pearson 3          0.25       1.084   1.015   1.008      1.868   1.129   1.066
                   1.14       1.104   1.020   1.011      1.972   1.174   1.096
                   5.00       1.390   1.094   1.055      2.735   1.499   1.323
Weibull            0.25       1.080   1.014   1.008      1.863   1.125   1.068
                   1.14       1.100   1.018   1.010      1.819   1.141   1.079
                   5.00       1.327   1.086   1.053      3.325   1.722   1.490
                  15.00       1.861   1.304   1.209      7.990   3.706   3.008

A further set of values (2.194, 1.207, 1.113 for N = 10, 50, 90) appears in the table without a recoverable row label.
APPENDIX 3
MOMENT RATIO DIAGRAMS.
Conventional moment ratio diagrams
A distribution function expressed in parametric form has a small number of parameters (1, 2, 3 or 4 usually).
All moments are expressible as functions of these and moments of order higher than the number of parameters are
necessarily expressible as functions of lower order moments. A moment ratio diagram is a graph of one such moment
as a unique function of a lower order one. Populations corresponding to different applications can be represented on
such a diagram by use of dimensionless moments such as Cv, Cs and Ck. Figure 3.2 in Chapter 3 is an example of Ck
versus Cs relations for lognormal, Weibull, GEV and Pearson Type 3 distributions. EV1 and exponential distributions
plot as points on this diagram while the Cs - Ck points for Wakeby distributions plot neither as a single point nor as a
curve but cover an area on this diagram. This Wakeby area encloses most of the curves shown except that Weibull and
Gamma distributions having Cs greater than approximately 2 and LN distributions having Cs greater than approximately
3.5 fall below and outside the valid Wakeby area.
Plotting sample values of (Cs, Ck) or sample averages (\bar{C}_s, \bar{C}_k) from a region-full of data on such diagrams
ought to help diagnosis of the correct distributional form for floods. However, the bias corrections required to be applied
to such values prior to plotting depend on sample size, parent skewness and parent distributional form (Appendix 2). If
parent skewness < 2 then eqn (A2.11) might reasonably be used to remove bias in Cs but no such simple adjustment is
readily known for Ck. If parent skewness > 2 but the parent distributional form is unknown the choice of bias
correction to be made to Cs and Ck is even less certain since the bias corrections for given sample size are very
dependent on distributional form when parent skewness > 2 (Wallis et al., 1974). This is true especially of the realistic
flood-like distributions such as WAK, GEV and TCEV.
Apart from bias the standard error of such (Cs, Ck) or even of regional mean values (\bar{C}_s, \bar{C}_k) is very large.
For instance if M = 25 stations and N = 30 years, then se(Cs) ≈ 0.15 before bias correction (≈ 0.30 after bias correction)
when the parent is GEV with skewness = 2.5. This is of the same order of magnitude as the horizontal width (i.e. in the Cs
direction) of the band enclosing 4 distributions on Figure 3.2. When se(Ck) is taken into account simultaneously it
must be appreciated that it is virtually impossible to select a distribution unambiguously using such a moment ratio
diagram.
Commons (1986) has investigated some sampling properties of skewness - kurtosis diagrams by Monte Carlo
methods using a GEV parent distribution. GEV was chosen because of its flood-like properties and convenience of use.
The location and scale parameters were arbitrarily set to 0 and 1 respectively while k values corresponding to a range of
skewness given in Table A3.1 were used.
Because of the known bias in small sample estimates of moments it is anticipated that a curve joining plotted
points representing the expected value of skewness Cs and expected value of kurtosis Ck in small samples should plot
below the curve representing the population (Cs, Ck) relation. Such a curve would depend on sample size N and one
would expect a family of such curves to exist for different N values.
A selection of (Ck, Cs) relations obtained by Commons (1986) for N = 10(20)90 are shown on Fig A3.1a
while Fig A3.1b shows the same information in the form of path lines taken by (Cs, Ck) points from a common
parent, fixed k, as a function of sample size N. These curves are obtained from M = 100 000 samples of each size. Figs
A3.1c and d show corresponding plots obtained from M = 100 samples. These are much less well defined indicating
that (Cs, Ck) values obtained from 100 samples or fewer would be poor indicators of the true GEV parent even if the
"correct" biased plots of Figures A3.1a and b were available. Since hydrological regions rarely have more than 100
gauging sites in them, the hydrologist seeking to verify the suitability of a certain distribution has to contend with the
uncertainty caused by erratic (random) effects in Figures A3.1c and d. It goes without saying that such plots would be
even less useful for discriminating between distribution types on a diagram such as Figure 3.2, Chapter 3.
[Figure A3.1: four panels, (a) to (d), plotted against SKEWNESS on the horizontal axis; panels (a) and (b) are marked M = 100 000, panels (c) and (d) M = 100; the population curve is marked "Popn."]
Figure A3.1: Plots of (Cs, Ck) obtained by simulation from GEV parent population, as a function of sample size,
(a) and (c), and of parent shape parameter k, (b) and (d). M is the number of simulations on which
the diagrams are based.
L - Moment ratio diagrams
The L - moment diagram is a new superior diagnostic tool based on the relatively recently introduced
probability weighted moments, PWMs, defined in Appendix 4. Hosking (1986a) has defined L - moments, which are
linear combinations of PWMs, as follows:
\lambda_1 = M_{100}        (A3.1)
\lambda_2 = 2 M_{110} - M_{100}        (A3.2)
\lambda_3 = 6 M_{120} - 6 M_{110} + M_{100}        (A3.3)
\lambda_4 = 20 M_{130} - 30 M_{120} + 12 M_{110} - M_{100}        (A3.4)
k
11
0.278
0.201
0.133
0.084
0.025
6.849
9.532
14.528
23.451
80.451
(J
Cs
1.001
1.051
1.109
1.164
1.243
0.00
0.25
0.50
0.707
1.00
2,717
2.876
3.258
3.764
4.734
5.390
0.000
0.5772
1.283
1.14
-0.041
-0.109
-0.177
-0.216
-0.240
-0.288
0.620
0.697
0.787
0.845
0.884
0.970
1.358
1.515
1.734
1.899
2.022
2.336
1.414
2.00
3.00
4.00
5.00
10.00
TableA3.1:
Ck
6.923
11.920
23.397
76.200
297.442
--
Population mean, standard deviation, skewness and kurtosis
values of GEV(0, 1, k) populations used by Commons (1986).
Skewness values fixed as in Wallis et al. (1974).
\lambda_2 is a measure of distribution scale or spread and LCv = \lambda_2 / \lambda_1 is analogous to ordinary Cv. The
dimensionless L - moments
\tau_3 = \lambda_3 / \lambda_2        (A3.5)
\tau_4 = \lambda_4 / \lambda_2        (A3.6)
can be considered as L - skewness and L - kurtosis respectively, each lying in the range (-1, +1). These quantities can
also be defined in terms of the quantities M_{10k} in view of the relations in equations A4.2 and A4.3 of Appendix 4.
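As a rough illustration of these definitions, the sketch below forms the first four L - moments and the L - moment ratios from a set of M_{1j0} values. The function name is hypothetical, the ratios are assumed to be \lambda_3 / \lambda_2 and \lambda_4 / \lambda_2, and the input numbers are the plotting-position ("biassed") PWM estimates quoted for the River Trent example in Table A4.2 of Appendix 4.

```python
def l_moments_from_pwm(m100, m110, m120, m130):
    """L-moments from PWMs M_1j0 (j = 0..3), following equations (A3.1)-(A3.4)."""
    l1 = m100
    l2 = 2.0 * m110 - m100
    l3 = 6.0 * m120 - 6.0 * m110 + m100
    l4 = 20.0 * m130 - 30.0 * m120 + 12.0 * m110 - m100
    return {
        "l1": l1, "l2": l2, "l3": l3, "l4": l4,
        "L_Cv": l2 / l1,     # analogous to ordinary Cv
        "tau3": l3 / l2,     # L-skewness (assumed ratio form)
        "tau4": l4 / l2,     # L-kurtosis (assumed ratio form)
    }

# PWM estimates for the River Trent example (Table A4.2, part (b)).
trent = l_moments_from_pwm(514.7810, 321.8682, 238.0372, 190.3035)
for name, value in trent.items():
    print(f"{name:5s} = {value:10.4f}")
```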
Hosking (1988, pp 3 - 5) advocates the use of t3 - t4 plots (t3 and t4 being sample estimates of \tau_3 and \tau_4
respectively) for the purposes of identifying the underlying distribution from which samples were drawn. He
demonstrates that (Cs, Ck) values from several samples drawn from distinct distributions show very considerable
overlap when plotted on a (Cs, Ck) moment ratio diagram, see Figure A3.2. On the other hand the corresponding (t3,
t4) values show considerably less overlap and hence could be expected to help identify underlying parent distributions
with greater reliability than is possible with the conventional moment ratio diagram. The t3 - t4 diagram also has the
advantage of being based on unbiassed sample quantities in contrast to Cs - Ck quantities which have to be corrected for
bias, a requirement which has its own difficulties as explained above and in Appendix 2.
Hosking (1988) demonstrated this effect with samples from a GEV and two Weibull distributions. The same
effect can be seen in Figure A3.2 for samples from GEV and Pearson Type 3 populations, each having the same value
of Cv and Cs. It is clear that in this case the L - moments show better ability to discriminate between the distributions.
This ability is dependent on parent skewness however, increasing as skewness increases and decreasing as it
decreases.
[Figure A3.2: two scatter plots of GEV and P3 samples, one against SKEWNESS and one against L - SKEWNESS.]
Figure A3.2: Plots of sample skewness versus sample kurtosis, and of sample L - skewness versus sample
L - kurtosis, for 50 samples of size 100 simulated from GEV and P3 parents each having \mu = 100,
Cv = 0.5 and Cs = 1.90. The L - moments show better ability to discriminate between the two.
APPENDIX 4
PARAMETER ESTIMATION BY PROBABILITY WEIGHTED MOMENTS (PWM).
A4.1 Introduction
Probability weighted moments were introduced by Greenwood et al. (1979) and further analysed by Hosking
(1986). They are useful in deriving expressions for the parameters of distributions whose inverse forms x = x(F) can be
explicitly defined. In particular they allow parameter estimates to be obtained for distributions which are defined only in
inverse form, such as the Wakeby distribution which was introduced as a general flood frequency distribution model by
Houghton (1978). Note that EV1 and GEV distributions can be written in both forms F = F(x) and x = x(F) and hence
their parameters can also be estimated by PWM. PWMs are defined (Greenwood et al., 1979) by
M_{ijk} = E[ x^i F^j (1 - F)^k ] = \int_0^1 [x(F)]^i F^j (1 - F)^k dF        (A4.1)
where i, j, k are real numbers. If j = k = 0 and i is a non-negative integer then M_{i00} represents the conventional moment
of order i about the origin.
The special cases of i = 1 and either j = 0 or k = 0, M_{1j0} and M_{10k}, are linear in x and are of sufficient
generality for parameter estimation. M_{1j0} and M_{10k} are interdependent and related as follows (Greenwood et al., 1979,
Eqn 4)
M_{10k} = \sum_{j=0}^{k} \binom{k}{j} (-1)^j M_{1j0}        (A4.2)
M_{1j0} = \sum_{k=0}^{j} \binom{j}{k} (-1)^k M_{10k}        (A4.3)
The form M_{10k} is used by Greenwood et al. (1979) for parameter estimation in the Weibull, Generalised Lambda,
Logistic and Wakeby distributions. The latter form (M_{1j0}) is used by them for EV1 and Kappa distributions and is
also adopted by Hosking et al. (1985a,b) in the GEV case. Table A4.1 gives formulae for M_{1j0} for the EV1 and GEV
cases and M_{10k} for the Wakeby case.
A4.2 Principles of parameter estimation by PWM.
These may be summarised as:
(i) For the chosen distribution find expressions for either M_{1j0} or M_{10k}, as convenient, in terms of the
unknown parameters (Table A4.1).
(ii) Calculate sample estimates \hat{M}_{1j0} (or \hat{M}_{10k}) of the PWMs used in (i), (see Section A4.3
below).
(iii) Equate the sample estimates of (ii) to the expressions in (i) for as many values of j (or k) as
there are unknown parameters.
(iv) Solve the algebraic equations in (iii) for the unknown parameters (see Section A4.4 below).
These are precisely the same type of steps as are involved in estimation by ordinary moments.
Table A4.1: Inverse form and PWM formulae for 3 common flood frequency distributions.

EV1
  Inverse form:  x = u + \alpha [ -\ln(-\ln F) ],  where F = Pr(X \le x)
  PWM:           M_{1j0} = u / (1 + j) + \alpha [ \ln(1 + j) + \epsilon ] / (1 + j)        Ref. (*)
                 where \epsilon = Euler's constant = 0.57721...
GEV
  Inverse form:  x = u + \alpha [ 1 - (-\ln F)^k ] / k
  PWM:           M_{1j0} = (1 / (1 + j)) { u + \alpha [ 1 - (1 + j)^{-k} \Gamma(1 + k) ] / k }        Ref. (**)
Wakeby (original form)
  Inverse form:  x = e - A (1 - F)^B + C (1 - F)^{-D}        (+)
            or   x = m + a [ 1 - (1 - F)^b ] - c [ 1 - (1 - F)^{-d} ]        (++)
  PWM:           M_{10k} = (m + a - c) / (1 + k) - a / (1 + k + b) + c / (1 + k - d)        Ref. (*)
Wakeby (reparameterised form)
  Inverse form:  x = \xi + (\alpha / \beta) [ 1 - (1 - F)^{\beta} ] - (\gamma / \delta) [ 1 - (1 - F)^{-\delta} ]        (+++)
  PWM:           M_{10k} = (1 / (1 + k)) [ \xi + \alpha / (1 + k + \beta) + \gamma / (1 + k - \delta) ]        Ref. (***)

Both Wakeby forms refer to the same distribution. The second form allows for easier checking on
the validity of estimated parameters.
(*   From Greenwood et al., 1979, Table 2)
(**  From Hosking et al., 1985b, eqn 9)
(*** From Hosking 1986b, eqns 2, 4 and 5)
(+   Houghton's (1978a) form)
(++  Greenwood et al. (1979) form)
(+++ Hosking's (1986b) reparameterised form; \xi = m, \beta = b, \alpha = ab, \gamma = cd, \delta = d)
A4.3 Sample Estimates of PWMs
Unbiassed sample estimates of M_{1j0} and M_{10k} can be expressed (Hosking, 1986b, p 17; Landwehr et al.,
1979, eqn. 16) as:
\hat{M}_{1j0} = (1/N) \sum_{i=1}^{N} [ \binom{i-1}{j} / \binom{N-1}{j} ] x_{(i)} ,    j = 0, 1, ..., N-1        (A4.4)
\hat{M}_{10k} = (1/N) \sum_{i=1}^{N} [ \binom{N-i}{k} / \binom{N-1}{k} ] x_{(i)} ,    k = 0, 1, ..., N-1        (A4.5)
where x_{(i)}, i = 1, 2, ..., N are the ordered N sample values, with i = 1 denoting the smallest value, and the quantities in
square brackets represent F^j and (1 - F)^k type quantities or weights respectively.
An alternative "plotting position" type weighting function
F_{(i)} = (i - 0.35) / N        (A4.6)
leading to
\hat{M}_{1j0} = N^{-1} \sum_{i=1}^{N} F_{(i)}^j x_{(i)}        (A4.7)
\hat{M}_{10k} = N^{-1} \sum_{i=1}^{N} [1 - F_{(i)}]^k x_{(i)}        (A4.8)
has been recommended by Landwehr et al. (1979b,c), Wallis (1980) and Hosking et al. (1985a,b). This latter form is
also recommended here for hydrological purposes.
Note that \hat{M}_{1j0} for j = 0 and \hat{M}_{10k} for k = 0 are identical to the sample mean. Also, the relations (A4.2) and
(A4.3) hold for \hat{M}_{1j0} and \hat{M}_{10k} calculated from (A4.4) and (A4.5) and also separately for \hat{M}_{1j0} and \hat{M}_{10k}
calculated from (A4.7) and (A4.8).
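Both kinds of sample estimate can be sketched as follows. The unbiassed weights are assumed to be the binomial-coefficient ratios \binom{i-1}{j} / \binom{N-1}{j} and \binom{N-i}{k} / \binom{N-1}{k}, which is the usual form of these estimators; the function names and the short sample are hypothetical.

```python
from math import comb

def pwm_unbiased(x, order, kind="j"):
    """Unbiassed estimate of M_1j0 (kind="j") or M_10k (kind="k"); assumed binomial weights."""
    xs = sorted(x)                       # x_(1) is the smallest value
    n = len(xs)
    total = 0.0
    for i, value in enumerate(xs, start=1):
        if kind == "j":
            weight = comb(i - 1, order) / comb(n - 1, order)
        else:
            weight = comb(n - i, order) / comb(n - 1, order)
        total += weight * value
    return total / n

def pwm_plotting_position(x, order, kind="j", a=0.35):
    """Plotting-position estimate using F_(i) = (i - a)/N, equations (A4.6)-(A4.8)."""
    xs = sorted(x)
    n = len(xs)
    total = 0.0
    for i, value in enumerate(xs, start=1):
        f = (i - a) / n
        weight = f ** order if kind == "j" else (1.0 - f) ** order
        total += weight * value
    return total / n

# Hypothetical sample, for illustration only.
sample = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
for estimator in (pwm_unbiased, pwm_plotting_position):
    row = [estimator(sample, r, "j") for r in range(3)]
    print(estimator.__name__, [round(v, 4) for v in row])
```

In either scheme the order-zero estimate is the sample mean, and the estimates satisfy the linear relations (A4.2) and (A4.3) within each scheme.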
A4.3.1 Example calculations of PWMs.
Annual maximum peak floods for River Trent at Nottingham, England for 1884 - 1933 (N = 50 years) are
used for illustration. These are taken from NERC (1975, Vol. IV, pp 266-288). The sample values are given in Table
A4.2 while unbiassed PWM estimates obtained by equations (A4.4) and (A4.5) and biassed PWM estimates obtained by
equations (A4.7) and (A4.8) are also shown in that table. The two sets of PWM estimates differ by only very small
amounts but they lead to different parameter values and quantile estimates in most estimating schemes. The biased
PWM estimates are favoured for purposes of quantile estimation in hydrology as indicated earlier.
Calculation of \hat{M}_{1j0} values by equation (A4.7), the biased but favoured method, is illustrated in outline form
in Table A4.3. Calculation of \hat{M}_{10k} values proceeds in an analogous manner with F_{(i)}^j values replaced by [1 - F_{(i)}]^k.
Alternatively, \hat{M}_{10k} could be obtained from \hat{M}_{1j0} values using equation (A4.2) thus:
\hat{M}_{101} = \hat{M}_{100} - \hat{M}_{110}        (A4.8a)
\hat{M}_{102} = \hat{M}_{100} - 2 \hat{M}_{110} + \hat{M}_{120}        (A4.8b)
\hat{M}_{103} = \hat{M}_{100} - 3 \hat{M}_{110} + 3 \hat{M}_{120} - \hat{M}_{130}        (A4.8c)
\hat{M}_{104} = \hat{M}_{100} - 4 \hat{M}_{110} + 6 \hat{M}_{120} - 4 \hat{M}_{130} + \hat{M}_{140}        (A4.8d)
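The conversion of equations (A4.8a) to (A4.8d) is simply the binomial sum of equation (A4.2), and can be sketched as below. The function name is hypothetical; the input values are the plotting-position estimates \hat{M}_{1j0} quoted in Table A4.2, part (b).

```python
from math import comb

def m10k_from_m1j0(m1j0):
    """Apply equation (A4.2): M_10k = sum_j C(k, j) (-1)^j M_1j0, for k = 0..len-1."""
    return [
        sum(comb(k, j) * (-1) ** j * m1j0[j] for j in range(k + 1))
        for k in range(len(m1j0))
    ]

# Plotting-position estimates for the River Trent example (Table A4.2, part (b)).
trent_m1j0 = [514.7810, 321.8682, 238.0372, 190.3035, 159.2062]
trent_m10k = m10k_from_m1j0(trent_m1j0)
print([round(v, 4) for v in trent_m10k])
```

Because equation (A4.3) has the same form, applying the same function to the M_{10k} values returns the M_{1j0} values again.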
Sample data in chronological order.
950.90
195.99
368.46
688.81
658.34
270.53
787.46
118.88
565.16
399.00
186.34
276.32
493.12
474.52
599.95
866.78
801.53
410.09
393.52
336.79
822.93
324.94
221.49
643.43
410.09
526.68
618.38
720.14
446.86
512.16
482.22
966.72
908.31
393.52
505.76
537.65
594.59
681.11
505.76
945.48
370.79
337.73
360.41
720.14
704.37
221.20
526.68
200.19
480.67
206.16
Ranked sample data.
118.88   186.34   195.99   200.19   206.16   221.20   221.49   270.53   276.32   324.94
336.79   337.73   360.41   368.46   370.79   393.52   393.52   399.00   410.09   410.09
446.86   474.52   480.67   482.22   493.12   505.76   505.76   512.16   526.68   526.68
537.65   565.16   594.59   599.95   618.38   643.43   658.34   681.11   688.81   704.37
720.14   720.14   787.46   801.53   822.93   866.78   908.31   945.48   950.90   966.72
Estimates of probability weighted moments
                     j=k=0      j=k=1      j=k=2      j=k=3      j=k=4
(a) Unbiassed.
    \hat{M}_{1j0}    514.7810   321.6082   237.5128   189.5726   158.3115
    \hat{M}_{10k}    514.7810   193.1728   109.0774    72.9222    53.4462
(b) Biassed.
    \hat{M}_{1j0}    514.7810   321.8682   238.0372   190.3035   159.2062
    \hat{M}_{10k}    514.7810   192.9128   109.0817    72.9844    53.5233
Table A4.2: Example data of River Trent at Trent Bridge and sample PWM values, (1884 - 1933, N = 50).
The PWM values are calculated to four decimal places to avoid possible inconsistencies when they are used for
parameter estimation purposes. In Wakeby parameter estimation some linear combinations of PWMs occur whose
coefficients are large and which alternate in sign. Insufficient accuracy due to deliberate rounding of PWMs leads to
quite substantial changes in the results or leads to invalid Wakeby parameters, see Table A4.4 and Section A4.4.3
below. Rounding errors also affect GEV parameter estimates from PWMs. Hence it is recommended that PWM
values are not rounded off during hand-calculations.
A4.4 Parameter estimating equations or algorithms
For information on a computer program for Wakeby parameter estimation see Section A4.5 of this
Appendix.
A4.4.1 EV1 CASE
The equations of Table A4.1 are simple to handle and yield
\hat{\alpha} = (\hat{M}_{100} - 2 \hat{M}_{101}) / \ln 2        (A4.9)
Rank   F(i) =         Flow       x(i)F(i)     x(i)F(i)^2   x(i)F(i)^3   x(i)F(i)^4
 i     (i-0.35)/50    x(i)
  1    0.0130         118.88       1.5454       0.0201     2.612E-4     3.395E-6
  2    0.0330         186.34       6.1492       0.2029     6.697E-3     2.210E-4
  3    0.0530         195.99      10.3875       0.5505     0.0292       1.546E-3
  4    0.0730         200.19      14.6139       1.0668     0.0779       5.685E-3
  5    0.0930         206.16      19.1729       1.7831     0.1658       1.542E-2
 ...
 46    0.9130         866.78     791.3701     722.5209   659.6616     602.2711
 47    0.9330         908.31     847.4532     790.6739   737.6987     688.2729
 48    0.9530         945.48     901.0424     858.6934   818.3348     779.8731
 49    0.9730         950.90     925.2257     900.2246   875.9380     852.2877
 50    0.9930         966.72     959.9530     953.2333   946.5607     939.9347
Sum                 25739.05   16093.410    11901.860    9515.175     7960.310
Mean = Sum/50         514.78     321.8682     238.0372    190.3035     159.2062
                 = \hat{M}_{100}  = \hat{M}_{110}  = \hat{M}_{120}  = \hat{M}_{130}  = \hat{M}_{140}
Table A4.3: Outline illustration of calculation of \hat{M}_{1j0} values by Eqn. (A4.7).
or, in terms of the \hat{M}_{1j0} values,
\hat{\alpha} = (2 \hat{M}_{110} - \hat{M}_{100}) / \ln 2        (A4.10)
and
\hat{u} = \hat{M}_{100} - \epsilon \hat{\alpha}        (A4.11)
Inserting the sample PWM values from part (b) of Table A4.2 gives
\hat{\alpha} = [514.7810 - 2(192.9128)] / 0.69315 = 186.04
\hat{u} = 514.7810 - (0.5772)(186.04) = 407.40.
The estimated 50 year return period flood is
\hat{x}_{50} = \hat{u} + \hat{\alpha} ( -\ln( -\ln(1 - 1/50) ) )
A4.4.2 GEV CASE.
The equations involved are not immediately soluble but Hosking et al. (1985a,b) give a simple, accurate
algorithm:
\hat{k} = 7.8590 C + 2.9554 C^2        (A4.12)
where
C = (2 \hat{M}_{110} - \hat{M}_{100}) / (3 \hat{M}_{120} - \hat{M}_{100}) - \log 2 / \log 3        (A4.12a)
\hat{\alpha} = (2 \hat{M}_{110} - \hat{M}_{100}) \hat{k} / { \Gamma(1 + \hat{k}) [ 1 - 2^{-\hat{k}} ] }        (A4.13)
\hat{u} = \hat{M}_{100} + \hat{\alpha} [ \Gamma(1 + \hat{k}) - 1 ] / \hat{k}        (A4.14)
Inserting the sample PWM values from part (b) of Table A4.2 gives
C = [2(321.8682) - 514.7810] / [3(238.0372) - 514.7810] - log 2 / log 3
  = 0.6469423 - 0.630929 = 0.016005
\hat{k} = 0.125
\Gamma(1 + \hat{k}) = 0.941743
\hat{\alpha} = 206.23
\hat{u} = 418.67
The estimated 50 year return period flood corresponds to F = 0.98 and
\hat{x}_{50} = \hat{u} + (\hat{\alpha} / \hat{k}) { 1 - (-\ln 0.98)^{\hat{k}} } = 1055.49 m3/s.
In the calculation of C, \hat{k} and \Gamma(1 + \hat{k}) considerable care should be taken to avoid rounding errors in order to
obtain consistent results. The two components of C are of the same order of magnitude so the less significant digits
determine its value.
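The EV1 and GEV estimating equations, (A4.10) to (A4.14), can be sketched in Python as below, using the plotting-position PWMs of Table A4.2, part (b). The function names are hypothetical and the code carries full floating-point precision throughout, so its results are close to, but not digit-for-digit the same as, the rounded hand-calculated values quoted above.

```python
import math

EULER = 0.5772156649015329  # Euler's constant

def ev1_from_pwm(m100, m110):
    """EV1 parameters from equations (A4.10) and (A4.11)."""
    alpha = (2.0 * m110 - m100) / math.log(2.0)
    u = m100 - EULER * alpha
    return u, alpha

def ev1_quantile(u, alpha, t):
    """EV1 flood of return period t years (F = 1 - 1/t)."""
    return u + alpha * (-math.log(-math.log(1.0 - 1.0 / t)))

def gev_from_pwm(m100, m110, m120):
    """GEV parameters from equations (A4.12) to (A4.14)."""
    c = (2.0 * m110 - m100) / (3.0 * m120 - m100) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    gam = math.gamma(1.0 + k)
    alpha = (2.0 * m110 - m100) * k / (gam * (1.0 - 2.0 ** (-k)))
    u = m100 + alpha * (gam - 1.0) / k
    return u, alpha, k

def gev_quantile(u, alpha, k, t):
    """GEV flood of return period t years (F = 1 - 1/t); requires k != 0."""
    return u + (alpha / k) * (1.0 - (-math.log(1.0 - 1.0 / t)) ** k)

m100, m110, m120 = 514.7810, 321.8682, 238.0372   # Table A4.2, part (b)
ev1_u, ev1_alpha = ev1_from_pwm(m100, m110)
gev_u, gev_alpha, gev_k = gev_from_pwm(m100, m110, m120)
ev1_x50 = ev1_quantile(ev1_u, ev1_alpha, 50)
gev_x50 = gev_quantile(gev_u, gev_alpha, gev_k, 50)
print(f"EV1: u = {ev1_u:.2f}, alpha = {ev1_alpha:.2f}, x50 = {ev1_x50:.1f}")
print(f"GEV: u = {gev_u:.2f}, alpha = {gev_alpha:.2f}, k = {gev_k:.4f}, x50 = {gev_x50:.1f}")
```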
A4.4.2.1 TEST OF HYPOTHESIS THAT k = 0.
A statistical test of the hypothesis that an observed random sample has come from an EV1 distribution when
EV2 is the alternative has been given by van Montfort (1970) while Hosking (1984) has compared a number of
methods of testing whether the shape parameter, k, is zero in the GEV distribution (i.e. the EV1 hypothesis with GEV
as alternative). The most powerful is a likelihood ratio test while a simple test based on the probability weighted
moment (PWM) estimate of k is almost as powerful. This latter test is also given by Hosking et al. (1985b). In it the
PWM estimate of k is taken to be distributed as N[0, 0.5635/N] so that the test consists of comparing the statistic Z =
\hat{k} (N/0.5635)^{1/2} with the critical values of the standardised Normal variate.
In this case N = 50, \hat{k} = 0.125 and Z = 0.125 (50/0.5635)^{1/2} = 1.177 which is not significantly large at the
5% significance level. Hence the hypothesis that k = 0 is not rejected.
A \hat{k} value of 0.125 corresponds to a GEV population skewness value of approximately 0.5 which is in
disagreement with the general observation that flood populations are more positively skewed. However the
disagreement is only an apparent one in view of the fact that \hat{k} is not significantly different from zero (EV1 skewness = 1.14) or
even from -0.05 (GEV skewness = 1.53). The above calculations with a different 50 year River Trent data set (1917-
1969, years 1955-57 missing) give \hat{k} = -0.01.
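The test statistic is easily reproduced. In the sketch below the function name is hypothetical, and the use of a two-sided critical value of 1.96 at the 5% level is an assumption made for illustration; the text does not state whether a one-sided or two-sided comparison is intended.

```python
import math

def gev_shape_z(k_hat, n):
    """Z = k_hat * (N / 0.5635)^(1/2), approximately standard Normal when k = 0."""
    return k_hat * math.sqrt(n / 0.5635)

z_trent = gev_shape_z(0.125, 50)
CRITICAL_5_PERCENT = 1.96               # assumed two-sided 5% critical value
reject_ev1 = abs(z_trent) > CRITICAL_5_PERCENT
print(f"Z = {z_trent:.3f}, reject k = 0: {reject_ev1}")
```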
A4.4.3 WAKEBY CASE
Explicit solutions for the parameters in terms of PWMs, M_{10k}, have been given by Greenwood et al. (1979)
for both the 4-parameter (m = 0) and 5-parameter (m \ne 0) Wakeby distributions. These are given here first while
Hosking's (1986) solution for the re-parameterised form is given later, in Section A4.4.3.5.
A4.4.3.1 Greenwood et al. (1979) estimates.
Let \hat{M}_{(k)} = \hat{M}_{10k} for k = 0, 1, 2, 3, 4.
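(i) Case m = 0 (4 parameter Wakeby)
1. Calculate N_1, N_2, N_3 as
N_1 = -27 \hat{M}_{(2)} + 16 \hat{M}_{(1)} - \hat{M}_{(0)}        (A4.15)
N_2 = -9 \hat{M}_{(2)} + 8 \hat{M}_{(1)} - \hat{M}_{(0)}        (A4.16)
N_3 = -3 \hat{M}_{(2)} + 4 \hat{M}_{(1)} - \hat{M}_{(0)}        (A4.17)
2. Calculate C_1, C_2 and C_3 as
C_1 = -64 \hat{M}_{(3)} + 54 \hat{M}_{(2)} - 8 \hat{M}_{(1)}        (A4.18)
C_2 = -16 \hat{M}_{(3)} + 18 \hat{M}_{(2)} - 4 \hat{M}_{(1)}        (A4.19)
C_3 = -4 \hat{M}_{(3)} + 6 \hat{M}_{(2)} - 2 \hat{M}_{(1)}        (A4.20)
3. Calculate \hat{b}
\hat{b} = [ (N_3 C_1 - N_1 C_3) \pm H ] / [ 2 (N_2 C_3 - N_3 C_2) ]        (A4.21)
where
H = [ (N_3 C_1 - N_1 C_3)^2 - 4 (N_1 C_2 - N_2 C_1)(N_2 C_3 - N_3 C_2) ]^{1/2}.
If b_1 and b_2 are the two values of b in (A4.21) let
\hat{b} = max(b_1, b_2)        (A4.21a)
4. Calculate \hat{d}
\hat{d} = (N_1 + \hat{b} N_2) / (N_2 + \hat{b} N_3)        (A4.22)
5. Next calculate
{0} = (1 + \hat{b})(1 - \hat{d}) \hat{M}_{(0)}        (A4.23)
{1} = 2 (2 + \hat{b})(2 - \hat{d}) \hat{M}_{(1)}        (A4.24)
Then estimate a and c
6. \hat{a} = [ (\hat{b} + 1)(\hat{b} + 2) / ( \hat{b} (\hat{b} + \hat{d}) ) ] [ {1} / (2 + \hat{b}) - {0} / (1 + \hat{b}) ]        (A4.25)
and
7. \hat{c} = [ (1 - \hat{d})(2 - \hat{d}) / ( \hat{d} (\hat{b} + \hat{d}) ) ] [ -{1} / (2 - \hat{d}) + {0} / (1 - \hat{d}) ]        (A4.26)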
Applying these formulae to the example data of Section A4.3.1 gives
N_1 = -373.3834        N_2 = 46.7857        N_3 = -70.3751
C_1 = -161.9436        C_2 = 12.0352        C_3 = -11.6363
b_1 = 22.86140         b_2 = 0.44569
\hat{b} = 22.8614      \hat{d} = -0.4457
{0} = 17757.94898      {1} = 23459.43566
\hat{a} = 230.8253     \hat{c} = -952.4527.
(The C values quoted here are one half of those given by equations (A4.18) to (A4.20); a common factor applied to
C_1, C_2 and C_3 leaves \hat{b} and \hat{d} unchanged.)
Note that \hat{b} > -1, \hat{d} < 1 and (\hat{a}\hat{b} + \hat{c}\hat{d}) > 0 as required for valid solutions, see "Checks on parameter validity",
Section A4.4.3.2, below.
Then
\hat{x}_T = x(F = 1 - 1/T) = \hat{a} (1 - (1 - F)^{\hat{b}}) - \hat{c} (1 - (1 - F)^{-\hat{d}})
The 50 year return period flood corresponds to F = 0.98,
\hat{x}_{50} = x(0.98) = 230.8253 [1 - (0.02)^{22.8614}] - (-952.4527) [1 - (0.02)^{-(-0.4457)}]
= 230.8253 + (952.4527) [0.825108]
= 1016.702 m3/s, say 1020 m3/s.
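The 4-parameter steps can be sketched as below. This is a reading of the algorithm, not a definitive implementation: the coefficient signs in the N and C expressions and the choice of the larger root for b are assumptions consistent with the worked example, the function names are hypothetical, and the input PWMs are the four-decimal values of Table A4.2, part (b), so the results agree with the quoted values only to within the rounding sensitivity discussed in Section A4.4.3.3.

```python
import math

def wakeby4_from_pwm(m0, m1, m2, m3):
    """4-parameter (m = 0) Wakeby estimates (a, b, c, d) from M_10k, k = 0..3."""
    n1 = -27 * m2 + 16 * m1 - m0
    n2 = -9 * m2 + 8 * m1 - m0
    n3 = -3 * m2 + 4 * m1 - m0
    c1 = -64 * m3 + 54 * m2 - 8 * m1
    c2 = -16 * m3 + 18 * m2 - 4 * m1
    c3 = -4 * m3 + 6 * m2 - 2 * m1
    s = n3 * c1 - n1 * c3
    den = n2 * c3 - n3 * c2
    disc = s * s - 4.0 * (n1 * c2 - n2 * c1) * den
    if disc < 0.0:
        raise ValueError("no real solution for b")
    h = math.sqrt(disc)
    b = max((s + h) / (2.0 * den), (s - h) / (2.0 * den))   # larger root (assumed)
    d = (n1 + b * n2) / (n2 + b * n3)
    k0 = (1 + b) * (1 - d) * m0                             # {0}
    k1 = 2 * (2 + b) * (2 - d) * m1                         # {1}
    a = (b + 1) * (b + 2) / (b * (b + d)) * (k1 / (2 + b) - k0 / (1 + b))
    c = (1 - d) * (2 - d) / (d * (b + d)) * (-k1 / (2 - d) + k0 / (1 - d))
    return a, b, c, d

def wakeby_quantile(f, a, b, c, d, m=0.0):
    """x(F) = m + a[1 - (1 - F)^b] - c[1 - (1 - F)^(-d)]."""
    return m + a * (1.0 - (1.0 - f) ** b) - c * (1.0 - (1.0 - f) ** (-d))

a4, b4, c4, d4 = wakeby4_from_pwm(514.7810, 192.9128, 109.0817, 72.9844)
x50_wak4 = wakeby_quantile(0.98, a4, b4, c4, d4)
valid4 = b4 > -1 and d4 < 1 and (a4 * b4 + c4 * d4) > 0
print(f"a = {a4:.3f}, b = {b4:.4f}, c = {c4:.3f}, d = {d4:.5f}")
print(f"x50 = {x50_wak4:.1f} m3/s, valid = {valid4}")
```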
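(ii) Case m \ne 0 (5 parameter Wakeby).
1. Calculate N_1, N_2 and N_3 as
N_1 = 64 \hat{M}_{(3)} - 81 \hat{M}_{(2)} + 24 \hat{M}_{(1)} - \hat{M}_{(0)}        (A4.27)
N_2 = 16 \hat{M}_{(3)} - 27 \hat{M}_{(2)} + 12 \hat{M}_{(1)} - \hat{M}_{(0)}        (A4.28)
N_3 = 4 \hat{M}_{(3)} - 9 \hat{M}_{(2)} + 6 \hat{M}_{(1)} - \hat{M}_{(0)}        (A4.29)
2. Calculate C_1, C_2 and C_3 as
C_1 = 125 \hat{M}_{(4)} - 192 \hat{M}_{(3)} + 81 \hat{M}_{(2)} - 8 \hat{M}_{(1)}        (A4.30)
C_2 = 25 \hat{M}_{(4)} - 48 \hat{M}_{(3)} + 27 \hat{M}_{(2)} - 4 \hat{M}_{(1)}        (A4.31)
C_3 = 5 \hat{M}_{(4)} - 12 \hat{M}_{(3)} + 9 \hat{M}_{(2)} - 2 \hat{M}_{(1)}        (A4.32)
3. Calculate \hat{b}
\hat{b} = [ (N_3 C_1 - N_1 C_3) \pm H ] / [ 2 (N_2 C_3 - N_3 C_2) ]        (A4.33)
where
H = [ (N_3 C_1 - N_1 C_3)^2 - 4 (N_1 C_2 - N_2 C_1)(N_2 C_3 - N_3 C_2) ]^{1/2}.
This gives two values of b, say b_1 and b_2. Let
\hat{b} = max(b_1, b_2)        (A4.34)
4. Calculate \hat{d}
\hat{d} = (N_1 + \hat{b} N_2) / (N_2 + \hat{b} N_3)        (A4.35)
5. Calculate
{0} = (1 + \hat{b})(1 - \hat{d}) \hat{M}_{(0)}        (A4.36)
{1} = 2 (2 + \hat{b})(2 - \hat{d}) \hat{M}_{(1)}        (A4.37)
{2} = 3 (3 + \hat{b})(3 - \hat{d}) \hat{M}_{(2)}        (A4.38)
{3} = 4 (4 + \hat{b})(4 - \hat{d}) \hat{M}_{(3)}        (A4.39)
Then estimate m, a and c
6. \hat{m} = [ {3} - {2} - {1} + {0} ] / 4        (A4.40)
7. \hat{a} = [ (\hat{b} + 1)(\hat{b} + 2) / ( \hat{b} (\hat{b} + \hat{d}) ) ] [ {1} / (2 + \hat{b}) - {0} / (1 + \hat{b}) - \hat{m} ]        (A4.41)
8. \hat{c} = [ (1 - \hat{d})(2 - \hat{d}) / ( \hat{d} (\hat{b} + \hat{d}) ) ] [ -{1} / (2 - \hat{d}) + {0} / (1 - \hat{d}) + \hat{m} ]        (A4.42)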
Applying these formulae to the example data of Section A4.3.1 gives
N_1 = -49.4962         N_2 = 22.7152        N_3 = -47.1025
C_1 = -30.2650         C_2 = 8.3893         C_3 = -12.2857
b_1 = 6.69148          b_2 = 0.35047
\hat{b} = 6.69148      \hat{d} = -0.35047
{0} = 5347.0944        {1} = 7882.0622      {2} = 10625.9922      {3} = 13578.8845
\hat{m} = 104.4811     \hat{a} = 168.8859   \hat{c} = -1014.8514
and the estimated 50 year flood is
\hat{x}_{50} = x(0.98) = 104.4811 + 168.8859 [1 - (0.02)^{6.69150}] - (-1014.8514) [1 - (0.02)^{-(-0.35047)}]
= 104.4811 + 168.8859 + 757.2706
= 1030.6376, say 1030 m3/s.
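The 5-parameter steps can be sketched in the same way. As before this is only a sketch under stated assumptions: the N and C coefficients follow the pattern implied by the worked example, the larger root is taken for b, the function names are hypothetical, and the four-decimal PWMs of Table A4.2, part (b), are used as input. The individual parameters are noticeably sensitive to the last digits of the PWMs, whereas the 50 year quantile is much more stable.

```python
import math

def wakeby5_from_pwm(m0, m1, m2, m3, m4):
    """5-parameter Wakeby estimates (m, a, b, c, d) from M_10k, k = 0..4."""
    n1 = 64 * m3 - 81 * m2 + 24 * m1 - m0
    n2 = 16 * m3 - 27 * m2 + 12 * m1 - m0
    n3 = 4 * m3 - 9 * m2 + 6 * m1 - m0
    c1 = 125 * m4 - 192 * m3 + 81 * m2 - 8 * m1
    c2 = 25 * m4 - 48 * m3 + 27 * m2 - 4 * m1
    c3 = 5 * m4 - 12 * m3 + 9 * m2 - 2 * m1
    s = n3 * c1 - n1 * c3
    den = n2 * c3 - n3 * c2
    disc = s * s - 4.0 * (n1 * c2 - n2 * c1) * den
    if disc < 0.0:
        raise ValueError("no real solution for b")
    h = math.sqrt(disc)
    b = max((s + h) / (2.0 * den), (s - h) / (2.0 * den))   # larger root (assumed)
    d = (n1 + b * n2) / (n2 + b * n3)
    pwms = (m0, m1, m2, m3)
    # {k} = (k + 1)(k + 1 + b)(k + 1 - d) M(k) for k = 0..3
    br = [(k + 1) * (k + 1 + b) * (k + 1 - d) * pwms[k] for k in range(4)]
    m = (br[3] - br[2] - br[1] + br[0]) / 4.0
    a = (b + 1) * (b + 2) / (b * (b + d)) * (br[1] / (2 + b) - br[0] / (1 + b) - m)
    c = (1 - d) * (2 - d) / (d * (b + d)) * (-br[1] / (2 - d) + br[0] / (1 - d) + m)
    return m, a, b, c, d

def wakeby5_quantile(f, m, a, b, c, d):
    """x(F) = m + a[1 - (1 - F)^b] - c[1 - (1 - F)^(-d)]."""
    return m + a * (1.0 - (1.0 - f) ** b) - c * (1.0 - (1.0 - f) ** (-d))

m5, a5, b5, c5, d5 = wakeby5_from_pwm(514.7810, 192.9128, 109.0817, 72.9844, 53.5233)
x50_wak5 = wakeby5_quantile(0.98, m5, a5, b5, c5, d5)
exists_mean = b5 > -1 and d5 < 1        # condition (A4.43)
print(f"m = {m5:.2f}, a = {a5:.2f}, b = {b5:.4f}, c = {c5:.2f}, d = {d5:.5f}")
print(f"x50 = {x50_wak5:.1f} m3/s, mean exists = {exists_mean}")
```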
A4.4.3.2 Checks on parameter validity.
Not all arbitrary combinations of values of a, b, c, d yield an x-F relation corresponding to a valid
distribution, i.e. one whose F value lies in the range 0 to 1 and one for which the PWMs exist. Valid and invalid
parameter combinations were given by Landwehr et al. (1978, with corrections, 1979) and were reproduced by Wallis
(1980). These are given here in Table A4.4. In addition to these conditions it is necessary that
b > -1    and    d < 1        (A4.43)
for the Wakeby mean (= M_{(0)}) to exist (Landwehr et al., 1978, p 914). If these conditions are not also satisfied by \hat{b} and
\hat{d} then one of the assumptions on which the algorithm is based, namely existence of M_{(0)}, is invalid and the algorithm
must be considered to have failed.
The conditions of Table A4.4 have been expressed more simply for the reparameterised form by Hosking
(1986b) and these are given in Table A4.4 both in Hosking's notation and also in terms of a, b, c, d. The first three
conditions are necessary to ensure the uniqueness of x and F, i.e. that no two distinct parameter sets yield the same x(F),
while the last two, in conjunction with the first three, ensure that x(F) does define a proper probability distribution.
If an invalid parameter combination arises from the above calculation steps, Landwehr et al. (1979b,c) and
Wallis (1980) suggest selecting a maximum allowable value of \hat{b}, say 50, and calculating the remaining parameter
values conditional on that \hat{b} value. If the resulting estimates yield another invalid combination then \hat{b} is decreased by a
preselected amount \Delta b and another trial is made. These iterations are continued until a valid parameter set is found or
else until \hat{b} becomes less than a preselected minimum allowable value, say \hat{b} = 1, in which case the algorithm is
considered to fail to provide any valid parameter estimates.
Failure may occur with PWMs from individual small samples from thin tailed distributions (Wallis, 1984,
Pers. comm.). It rarely if ever occurs with PWM values in the form of regionally averaged standardised PWMs used in
the recommended at-site / regional WAK / PWM flood frequency procedure whose details are given in Appendix 5 and
whose properties were discussed in Chapter 5.
Imposing a large value on \hat{b} minimises the contribution of (1 - F)^{\hat{b}} to the value of x(F), except for very small
values of F, i.e. in the non-flood end of the distribution. In such a case (\hat{m} + \hat{a}) combine to form a single location
parameter and the Wakeby distribution is effectively reduced to a 3 parameter distribution, equivalent in form to a
generalised Pareto distribution.
The above procedure whereby 4 of the parameters are estimated, conditional on a fairly arbitrarily assigned
large value of \hat{b}, and the consequent degeneration of Wakeby into Pareto does not appeal to Hosking (1986b) who states
that "the eventual parameter estimates are effectively on the boundary of the parameter space and the sampling properties
of the estimates are difficult to assess. It seems preferable to accept that failure of the estimation procedure is an
indication that the Wakeby distribution is not a suitable choice to fit to the data".
A4.4.3.3
Numerical Accuracy.
The number of decimal places used above may seem excessive for hydrological calculation. However, once the hydrological data have been specified, what remains is statistical calculation, and calculations based on linear combinations of PWMs as used above are quite sensitive to rounding errors. It is quite in order to round off hydrological data to one or zero decimal places according to the appropriate level of accuracy attributed to the flood data on hydrological or hydrometric grounds. Such rounding should be distinguished from arbitrary rounding within a set of statistical calculations. Finally the calculated $\hat{x}_T$ value should be rounded at the end of calculations to avoid any illusions of hydrological accuracy. In fact in the two Wakeby examples shown, the calculated $\hat{x}_T$ are not significantly different from 1000 m³/s.
A4.4.3.4
Effect of rounding.
It is useful to note that if the $\hat{M}_{10k}$ values used in the example are rounded off to (515.0, 193.0, 109.0, 73.0 and 53.5), i.e. a maximum of 0.07% round off, then the 5 parameter Wakeby calculations lead to a value of $\hat{d}$ = 10.76 which exceeds 1.0 and hence is invalid, see eqn. A4.43. This occurs partly because the internal relations between the several $\hat{M}_{10k}$ values have been slightly but significantly disturbed, to the extent that the rounded $\hat{M}_{10k}$ values are incapable of describing a 5-parameter Wakeby distribution.
A4.4.3.5
Estimation of Wakeby parameters in reparameterised form.
This form, due to Hosking (1986b), is given in the last entry of Table A4.1. In this case Hosking (1986b) gives the following estimating algorithm for the 5 parameter Wakeby.
Let $a_k = \hat{M}_{10k}$ = sample estimate of $M_{10k}$ obtained by one of the methods of Section A4.3.
1.
Calculate
$$N_1 = a_0 - 24a_1 + 81a_2 - 64a_3 \qquad \text{(A4.44)}$$
Sign of Parameter
Valid distribution?
(i)
a
b
c
d
Yes
+
+
+
+
+
+
+
+
+
+
+
+
X
-
+
+
+
-
-
+
-
-
+
-
-
+
+
-
1:
2:
3:
4:
X
X
+
+.
-
+
+
-
+
+
-
+
+
-
2
X
X
3
4
+
+
+
+
5
-
1
X
X
-
-
No
1
'I-
-
Maybe
X
X
Valid if (ab + cd) > 0.
Valid if (ab + cd) > 0 anda> c and b"; Idl.
Valid if (ab + cd) >0 and a> c and b" Idl.
Valid if (ab + cd) > 0 and either |b| < |d| or c > a when |b| = |d|.
Valid if (ab + cd) > 0 and either |b| > |d| or c > a when |b| = |d|.
5:
In addition to the above it is necessary for the mean to exist, i.e. b > -1 and d < 1 (Eqn. A4.43).
(ii)
Condition   Hosking's notation                                           Greenwood et al. notation
1           $\beta + \delta > 0$ or $\beta = \gamma = \delta = 0$        b + d > 0 or b = cd = d = 0
2           if $\alpha = 0$ then $\beta = 0$                             if ab = 0 then b = 0
3           if $\gamma = 0$ then $\delta = 0$                            if cd = 0 then d = 0
4           $\gamma \ge 0$                                               cd ≥ 0
5           $\alpha + \gamma \ge 0$                                      ab + cd ≥ 0
Table A4.4
Necessary conditions to be satisfied by Wakeby distribution parameters.
(i) from Landwehr et al. (1978, corrected 1979),
(ii) from Hosking (1986b).
$$N_2 = a_0 - 12a_1 + 27a_2 - 16a_3 \qquad \text{(A4.45)}$$
$$N_3 = a_0 - 6a_1 + 9a_2 - 4a_3 \qquad \text{(A4.46)}$$
and
2.
$$C_1 = 4(m-3)(m-4)a_1 - 27(m-2)(m-4)a_2 + 32(m-2)(m-3)a_3 - m^3 a_{m-1} \qquad \text{(A4.47)}$$
$$C_2 = 2(m-3)(m-4)a_1 - 9(m-2)(m-4)a_2 + 8(m-2)(m-3)a_3 - m^2 a_{m-1} \qquad \text{(A4.48)}$$
$$C_3 = (m-3)(m-4)a_1 - 3(m-2)(m-4)a_2 + 2(m-2)(m-3)a_3 - m\,a_{m-1} \qquad \text{(A4.49)}$$
where m takes the value m = 5 or m = N (i.e. sample size) in the 5 parameter Wakeby case. (m is not to be confused with the parameter m in the Greenwood et al. (1979) version of the first form in Table A4.1.) Using m = 5 corresponds to equating the sample and theoretical values of the first 5 PWMs as in Greenwood et al. (1979). Using m = N in effect makes the estimated lower bound parameter equal to $x_{(1)}$, the smallest sample value. In this example, let m = 5.
Hence
$$C_1 = 8a_1 - 81a_2 + 192a_3 - 125a_4 \qquad \text{(A4.50)}$$
$$C_2 = 4a_1 - 27a_2 + 48a_3 - 25a_4 \qquad \text{(A4.51)}$$
$$C_3 = 2a_1 - 9a_2 + 12a_3 - 5a_4 \qquad \text{(A4.52)}$$
Note that the N and C values used here are equal in magnitude but opposite in sign to those used in equations (A4.27 - A4.32). These changes in sign do not affect the following parameter estimates.
3.
$\hat{\beta}$ and $-\hat{\delta}$ are the roots of the quadratic equation
$$(N_2C_3 - N_3C_2)z^2 + (N_1C_3 - N_3C_1)z + (N_1C_2 - N_2C_1) = 0 \qquad \text{(A4.53)}$$
i.e.
$$z_1, z_2 = \frac{-(N_1C_3 - N_3C_1) \pm \sqrt{H}}{2(N_2C_3 - N_3C_2)} \qquad \text{(A4.54)}$$
where
$$H = (N_1C_3 - N_3C_1)^2 - 4(N_2C_3 - N_3C_2)(N_1C_2 - N_2C_1)$$
$$\hat{\beta} = \max(z_1, z_2) \qquad \text{(A4.55)}$$
$$\hat{\delta} = -\min(z_1, z_2) \qquad \text{(A4.56)}$$
4.
$$\hat{\xi} = \left\{(a_0 - 8a_1 - 27a_2 + 64a_3) + (\hat{\beta} - \hat{\delta})(a_0 - 4a_1 - 9a_2 + 16a_3) - \hat{\beta}\hat{\delta}(a_0 - 2a_1 - 3a_2 + 4a_3)\right\} / 4 \qquad \text{(A4.57)}$$
5.
$$\hat{\alpha} = (1 + \hat{\beta})(2 + \hat{\beta})\left[-\hat{\xi} - (1 - \hat{\delta})a_0 + 2(2 - \hat{\delta})a_1\right] / (\hat{\beta} + \hat{\delta}) \qquad \text{(A4.58)}$$
6.
$$\hat{\gamma} = (1 - \hat{\delta})(2 - \hat{\delta})\left[\hat{\xi} + (1 + \hat{\beta})a_0 - 2(2 + \hat{\beta})a_1\right] / (\hat{\beta} + \hat{\delta}) \qquad \text{(A4.59)}$$
Applying these formulae to the example data of Section (A4.3.1) gives
$N_1$ = -49.4962
$N_2$ = 22.7152
$N_3$ = -47.1025
$C_1$ = -30.2650
$C_2$ = 8.3893
$C_3$ = -12.2857
$z_1$ = 6.69149
$z_2$ = 0.35047
$\hat{\beta}$ = 6.69149
$\hat{\delta}$ = -0.35047
$\hat{\xi}$ = 104.4811
$\hat{\alpha}$ = 1130.0976
$\hat{\gamma}$ = 355.6754
For F = 0.98, T = 50
$$\hat{x}_{50} = 104.4811 + \frac{1130.0976}{6.69149}\left[1 - (0.02)^{6.69149}\right] - \frac{355.6754}{-0.3505}\left[1 - (0.02)^{-(-0.3505)}\right]$$
$$= 104.4811 + 168.8858 + 1014.7658\,(0.74619)$$
$$= 1030.57, \text{ say } 1030 \text{ m}^3/\text{s}.$$
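As a cross-check on the numbers above, the following Python sketch solves the quadratic for the two roots from the quoted N and C values and then evaluates the 50-year quantile from the quoted parameter estimates. The function names are illustrative only and do not come from the report; the quantile function assumes the reparameterised form $x(F) = \xi + (\alpha/\beta)[1-(1-F)^{\beta}] - (\gamma/\delta)[1-(1-F)^{-\delta}]$ with non-zero $\beta$ and $\delta$.

```python
import math


def wakeby_shape_from_nc(n1, n2, n3, c1, c2, c3):
    """Return (beta, delta): larger root, and minus the smaller root."""
    qa = n2 * c3 - n3 * c2          # coefficient of z^2
    qb = n1 * c3 - n3 * c1          # coefficient of z
    qc = n1 * c2 - n2 * c1          # constant term
    h = qb * qb - 4.0 * qa * qc     # discriminant H
    if h < 0.0:
        raise ValueError("quadratic has no real roots")
    root = math.sqrt(h)
    z1 = (-qb + root) / (2.0 * qa)
    z2 = (-qb - root) / (2.0 * qa)
    return max(z1, z2), -min(z1, z2)


def wakeby_quantile(F, xi, alpha, beta, gamma, delta):
    """Quantile x(F) in the reparameterised form (beta, delta non-zero)."""
    u = 1.0 - F
    return (xi + (alpha / beta) * (1.0 - u ** beta)
            - (gamma / delta) * (1.0 - u ** (-delta)))


beta_hat, delta_hat = wakeby_shape_from_nc(
    -49.4962, 22.7152, -47.1025, -30.2650, 8.3893, -12.2857)
x50 = wakeby_quantile(0.98, xi=104.4811, alpha=1130.0976,
                      beta=6.69149, gamma=355.6754, delta=-0.35047)
```

Because the quadratic coefficients are products of one N and one C value, changing the sign of all N values, or of all C values, leaves the roots unchanged.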
Checks on parameter validity.
In the re-parameterised form of Table A4.2, the checks on parameter validity, which ensure that x(F) is a valid inverse distribution function and that no two distinct sets of parameters yield the same value of x(F), are simpler than those required for the first form. These conditions (Hosking 1986) are given in Table A4.4 and are satisfied by the parameter values calculated above.
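A minimal Python sketch of these checks is given below. It encodes the five conditions attributed to Hosking (1986b) in Table A4.4 ($\beta + \delta > 0$ or $\beta = \gamma = \delta = 0$; $\alpha = 0$ implies $\beta = 0$; $\gamma = 0$ implies $\delta = 0$; $\gamma \ge 0$; $\alpha + \gamma \ge 0$) together with the condition for the mean to exist ($\beta > -1$, $\delta < 1$). The function names are illustrative and not taken from the report.

```python
def wakeby_is_valid(alpha, beta, gamma, delta):
    """Check the five validity conditions in the reparameterised notation."""
    cond1 = (beta + delta > 0) or (beta == 0 and gamma == 0 and delta == 0)
    cond2 = (alpha != 0) or (beta == 0)      # if alpha = 0 then beta = 0
    cond3 = (gamma != 0) or (delta == 0)     # if gamma = 0 then delta = 0
    cond4 = gamma >= 0
    cond5 = alpha + gamma >= 0
    return cond1 and cond2 and cond3 and cond4 and cond5


def wakeby_mean_exists(beta, delta):
    """The mean exists only if beta > -1 and delta < 1."""
    return beta > -1 and delta < 1


# Estimates from the worked example above.
example_ok = wakeby_is_valid(1130.0976, 6.69149, 355.6754, -0.35047)
example_mean = wakeby_mean_exists(6.69149, -0.35047)
```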
A4.4.3.6
Sampling Properties of Wakeby parameter and quantile estimates.
Landwehr et al. (1979 a,b) investigated these properties by simulation methods. They found that parameter estimates from sample sizes prevalent in hydrology are extremely variable and biased and hence may bear little resemblance to the true values. In such samples there is also a possibility that no valid distribution is described by the estimated parameters. On the other hand, estimated quantiles display less erratic behaviour, although in hydrologically sized samples they are still quite variable. However, Wakeby quantiles of an index flood scaled distribution, estimated from regionally averaged PWMs, are practically unbiased and have acceptably small standard errors, regardless of the true form of the parent distribution. This property of robustness is discussed in more detail in Chapter 5. Also, failure of the estimating algorithm to produce valid parameter combinations is unlikely in the regional estimation context.
A4.5
Computer program for parameter estimation
A computer program is available for performing the calculations involved in the numerical examples of this Appendix. It is anticipated that this program will be available through the WMO - HOMS programme.
APPENDIX 5
NUMERICAL EXAMPLES
Introduction
Numerical examples are included here to illustrate some of the at-site / regional techniques mentioned in the text. Such examples serve only to illustrate the calculation steps involved. They do not in themselves show how appropriate or how efficient such methods are for estimating flood quantiles, $Q_T$. Only appropriate statistical tests and analyses, carried out algebraically or by simulation methods, can answer such questions. Such studies were discussed in Chapter 5.
DATA
A medium sized data set of M = 20 stations, each having N = 20 years of AM flood record, from East Central England is used for illustrative purposes. The AM series and their probability weighted moment (PWM) values are given in Table A5.1. Some AM floods in this region are caused by frontal rain (winter or summer), some by convective rainstorms (mostly summer) and some by snowmelt (winter/spring). Regional average values of statistics obtained from a larger data set indicate that population regional average values of Cv and Cs are of the order of 0.50 and 3.0 respectively.
Index flood method based on regionally averaged standardised probability weighted moments
This method was initially proposed by Wallis (1980) in conjunction with the Wakeby distribution. That procedure is denoted WAK / PWM in the text. The same method was used with EV1 and GEV distributions by Greis and Wood (1981) and by Hosking et al. (1985a) respectively. Basically the variate Q is replaced by X = Q / E(Q), which is assumed to have a common distribution at all sites in the region. The scaling quantity, in this case E(Q), is called an index flood because for a given catchment it characterises the flood magnitudes for that catchment and it reflects the effect of the very important controlling factor of catchment area. A graphical implementation of this type of technique was introduced by Dalrymple (1960).
The steps involved are as follows. The PWMs of this X distribution are first estimated, steps (1) to (3) below, in a distribution free manner. These are then used to estimate the parameters and quantiles of a distribution chosen to represent the X distribution, steps (4) to (6) below. These X quantile estimates are then scaled (i.e. multiplied) by the estimated mean annual flood $\bar{Q}$, i.e. an estimate of E(Q), for any site at which flood quantile estimates are desired (step 7).
Step 1.
For each AM record compute $\hat{M}_{10k}$, $k = 0, 1, 2, \ldots, v$, using equation (A4.8). Here $v = 1$ for EV1, $v = 2$ for GEV, $v = 3$ for 4-parameter Wakeby and $v = 4$ for 5-parameter Wakeby. $\hat{M}_{100}$ is the mean of the series.
Step 2.
For each record compute $\hat{m}_k = \hat{M}_{10k} / \hat{M}_{100}$, $k = 0, 1, 2, \ldots, v$. These are standardised PWM values. The $\hat{m}_k$ values for the present example are tabulated in Table A5.1. Note that $\hat{m}_0 = 1.0$.
Step 3.
For each k, calculate weighted average values of $\hat{m}_k$ over M records as follows
$$m_k = \sum_{j=1}^{M} (\hat{m}_k)_j \,[N_j / L] \qquad \text{(A5.1)}$$
where
$$L = \sum_{j=1}^{M} N_j \qquad \text{(A5.2)}$$
These $(m_0 = 1, m_1, m_2, \ldots, m_v)$ values are now considered as sample estimates of PWMs for the regional standardised population whose variate X = Q / E(Q) has mean of unity. In the present example the weights $[N_j / L]$ are all equal and $m_k$ is simply the arithmetic mean of $(\hat{m}_k)_1, (\hat{m}_k)_2, \ldots, (\hat{m}_k)_{20}$.
The $m_k$ values for the present example are:
(1.0, 0.36217, 0.20679, 0.13976, 0.10336).
Step 4.
Choose a distributional form for the X population.
Step 5.
Apply methods of PWM parameter estimation described in Appendix 4 to estimate the X population parameters from $(1, m_1, m_2, \ldots)$.
Step 6.
Calculate estimates of quantiles, $x_T$, of the estimated X population. These are assumed to apply to every site in the region.
Step 7.
The at-site quantile estimate $\hat{Q}_T$ is obtained as
$$\hat{Q}_T = \bar{Q}\,\hat{x}_T \qquad \text{(A5.3)}$$
where $\bar{Q}$ is the at-site mean annual flood obtained preferably from observed data or, in their absence, from a regionally calibrated relation between $\bar{Q}$ and catchment characteristics.
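Steps 1 to 3 can be sketched in Python as follows. Equation (A4.8) is not reproduced in this Appendix, so the estimator used here is an assumption: a plotting-position PWM estimate with $F_i = (i - 0.35)/N$, chosen because it reproduces the tabulated $\hat{M}_{100}$ and $\hat{M}_{101}$ of site number 10. The function names are illustrative, and the data are the site 10 AM series and the twenty $\hat{m}_1$ values from Table A5.1.

```python
def sample_pwms(flows, v=4, offset=0.35):
    """Step 1: estimate M_10k, k = 0..v, with F_i = (i - offset)/N (assumed)."""
    x = sorted(flows)                      # ascending order statistics
    n = len(x)
    pwms = []
    for k in range(v + 1):
        total = sum((1.0 - (i - offset) / n) ** k * xi
                    for i, xi in enumerate(x, start=1))
        pwms.append(total / n)
    return pwms


def standardise(pwms):
    """Step 2: divide every PWM by the series mean M_100."""
    return [p / pwms[0] for p in pwms]


def regional_average(site_values, record_lengths):
    """Step 3: record-length weighted average, equations (A5.1)-(A5.2)."""
    total_years = sum(record_lengths)
    n_moments = len(site_values[0])
    return [sum(m[k] * n for m, n in zip(site_values, record_lengths))
            / total_years for k in range(n_moments)]


site10 = [31.51, 24.68, 22.90, 3.47, 25.28, 14.01, 24.80, 6.00, 10.10, 18.17,
          22.31, 17.56, 14.01, 13.33, 21.26, 10.21, 11.08, 17.56, 26.49, 12.30]
M10 = sample_pwms(site10)
m10 = standardise(M10)

m1_sites = [0.397, 0.392, 0.336, 0.378, 0.334, 0.391, 0.322, 0.398, 0.368,
            0.373, 0.312, 0.355, 0.363, 0.402, 0.369, 0.293, 0.365, 0.371,
            0.326, 0.399]
m_bar = regional_average([[1.0, m1] for m1 in m1_sites], [20] * 20)
```

With equal record lengths the weights reduce to 1/M, so `m_bar[1]` is the plain mean of the tabulated values and agrees with 0.36217 to the precision of the three-decimal table entries.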
Example (a). Application of GEV distribution.
Since the GEV estimating equations (A4.12) - (A4.14) are expressed in terms of $\hat{M}_{1j0}$ rather than $\hat{M}_{10k}$, convert the $m_k$ values obtained above to the corresponding standardised $M_{1j0}$ values using equations (A4.3). Thus, for j = 2,
$$1.0 - 2(0.36217) + 0.20679 = 0.48245.$$
Note that the exact same values would have been obtained if $\hat{M}_{1j0}$ values had been used in steps 1 to 5 above. Now (1.0, 0.63783, 0.48245) are used to estimate GEV values of $(u, \alpha, k)$ using equations (A4.12), (A4.12a), (A4.13) and (A4.14) to give
$$\hat{k} = -0.1164, \qquad \hat{u} = 0.7508, \qquad \hat{\alpha} = 0.3529.$$
The quantile of the X distribution is
$$\hat{x}_T = \hat{u} + \frac{\hat{\alpha}}{\hat{k}}\left[1 - \left\{-\ln\left(1 - 1/T\right)\right\}^{\hat{k}}\right], \quad \hat{k} \ne 0 \qquad \text{(A5.4)}$$
which for T = 50 gives $\hat{x}_{50} = 2.4934$.
Thus for site number 10, whose mean annual flood estimate is $\bar{Q}$ = 17.352 m³/s,
$$\hat{Q}_{50} = 17.352\,(2.4934) = 43.265, \text{ say } 43 \text{ m}^3/\text{s}.$$
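The GEV quantile and the index flood scaling of Example (a) can be reproduced with the short Python sketch below. The function name is illustrative; the parameter values are the rounded estimates quoted in the example, so the result agrees with the quoted quantile only to about three decimal places. The k = 0 branch uses the Gumbel (EV1) limit of the GEV, which is standard but is not discussed in this example.

```python
import math


def gev_quantile(T, u, alpha, k):
    """GEV quantile for return period T (annual maximum series)."""
    y = -math.log(1.0 - 1.0 / T)           # -ln F with F = 1 - 1/T
    if k == 0:
        return u - alpha * math.log(y)     # EV1 limit as k -> 0
    return u + (alpha / k) * (1.0 - y ** k)


x50_gev = gev_quantile(50, u=0.7508, alpha=0.3529, k=-0.1164)
q50_gev = 17.352 * x50_gev                 # scale by the site 10 mean annual flood
```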
Example (b). Application of 5-parameter Wakeby distribution.
Use $(1.0, m_1, m_2, m_3, m_4)$ = [1.0, 0.36217, 0.20679, 0.13976, 0.10336] for [$\hat{M}_{10k}$, k = 0, ..., 4] in the estimating procedure described in Appendix 4, pp A4.9 - A4.10.
This leads to the following estimates in the Greenwood et al. (1979) notation
$\hat{m}$ = 0.2442
$\hat{a}$ = 0.4598
$\hat{b}$ = 4.1904
$\hat{c}$ = 1.3874
$\hat{d}$ = 0.2170.
The corresponding values in Hosking's parameterisation are $\hat{\xi}$ = 0.2442, $\hat{\alpha}$ = 1.9267, $\hat{\beta}$ = 4.1904, $\hat{\gamma}$ = 0.3011 and $\hat{\delta}$ = 0.2170. Each of these sets of parameter estimates satisfies the conditions, expressed in Table A4.4, for the definition of a valid Wakeby distribution.
Then from
$$\hat{x}(F) = \hat{m} + \hat{a}\left[1 - (1 - F)^{\hat{b}}\right] - \hat{c}\left[1 - (1 - F)^{-\hat{d}}\right] \qquad \text{(A5.5)}$$
the T = 50 year quantile estimate is
$$\hat{x}_{50} = 0.2442 + 0.4598\left[1 - (0.02)^{4.1904}\right] - 1.3874\left[1 - (0.02)^{-0.2170}\right]$$
$$= 0.2442 + 0.4598 + 1.8551$$
$$= 2.5591$$
Thus for site number 10,
$$\hat{Q}_{50} = (17.352)(2.5591) = 44.405, \text{ say } 44.4 \text{ m}^3/\text{s}.$$
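The arithmetic of Example (b) can be checked with the following Python sketch, which evaluates the Wakeby quantile in the Greenwood et al. notation and converts the estimates to Hosking's parameterisation. The function names are illustrative, and the conversion $\alpha = ab$, $\gamma = cd$ (with $\xi = m$, $\beta = b$, $\delta = d$) is inferred from the correspondence ab, cd shown in Table A4.4.

```python
def wakeby_quantile_greenwood(F, m, a, b, c, d):
    """x(F) = m + a[1 - (1-F)^b] - c[1 - (1-F)^(-d)]."""
    u = 1.0 - F
    return m + a * (1.0 - u ** b) - c * (1.0 - u ** (-d))


def greenwood_to_hosking(m, a, b, c, d):
    """Return (xi, alpha, beta, gamma, delta) with alpha = a*b, gamma = c*d."""
    return m, a * b, b, c * d, d


params = (0.2442, 0.4598, 4.1904, 1.3874, 0.2170)
x50_wak = wakeby_quantile_greenwood(0.98, *params)
q50_wak = 17.352 * x50_wak                  # site 10 mean annual flood
xi, alpha, beta, gamma, delta = greenwood_to_hosking(*params)
```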
STATION     28002    28010    28070    28804    30001    32002    32003    32004    32006    32007
YEARS     1940-59  1935-55  1950-69  1944-66  1960-79  1960-79  1960-79  1950-69  1960-79  1940-59
AM DATA     27.36   172.75     2.34   560.82    28.88     6.94     6.90    23.73    11.88    31.51
(m³/s)      26.60   137.07     4.13  1006.53     8.63     2.55     3.53    14.56     4.59    24.68
            17.45   164.50     4.16  1107.33     7.93     5.77     5.85    13.11     8.88    22.90
            22.34   181.17    10.12   495.39    12.63     6.26     7.01    10.18    14.05     3.47
            31.34   240.28     6.18   624.08     5.38     2.41     2.63    26.32    10.70    25.28
            34.51   356.96     4.81   476.74    18.32     5.02     8.18    10.77    13.36    14.01
            41.53   105.13     3.01   624.08    22.93     8.37    10.52    16.61    14.29    24.80
            25.48   140.85    27.81   458.53    11.61     7.26     9.74    30.03    13.82     6.00
            27.36    63.13     4.07   374.00    25.90     8.74    10.45    24.05    14.84    10.10
            20.72   140.85     4.29   279.08    14.72     5.00     7.84    16.61    15.13    18.17
            23.70   306.74     4.55   747.15    14.11     6.29     6.04    18.22    19.53    22.31
            25.48   176.94     3.26   547.16     9.71     2.68     2.69     9.53    18.00    17.56
            10.57   137.07     5.33   732.82     7.82     6.52    10.48    17.46    19.16    14.01
            11.70   148.55     7.20   809.99     7.46     2.38     4.18    16.73     6.03    13.33
            24.69   115.35     6.10   265.71    34.30     7.82    17.01     5.91     7.36    21.26
            17.64   140.85     6.25   295.86     5.16     2.56     0.70    13.67     1.84    10.21
            13.65   111.90     2.91   356.04    37.54     7.73    16.06    16.94     7.53    11.08
            14.16   148.55     5.23   337.56    28.06     7.47    17.39    13.87     7.43    17.56
            13.14   203.02     5.51   731.25    25.86     7.77    18.58    15.77     7.36    26.49
            24.32   133.33     4.81   385.12    27.99     7.37    20.50    14.97     6.99    12.30
M100       22.687  166.250    6.104  560.762   17.747    5.846    9.314   16.452   11.139   17.352
M101        8.997   65.179    2.049  211.720    5.922    2.288    2.996    6.542    4.103    6.477
M102        5.279   39.280    1.207  122.770    3.128    1.290    1.534    3.884    2.276    3.609
M103        3.616   27.546    0.838   84.275    2.020    0.852    0.953    2.683    1.498    2.375
M104        2.701   20.879    0.632   63.336    1.460    0.618    0.658    2.008    1.083    1.714
m0          1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
m1          0.397    0.392    0.336    0.378    0.334    0.391    0.322    0.398    0.368    0.373
m2          0.233    0.236    0.198    0.219    0.176    0.221    0.165    0.236    0.204    0.208
m3          0.159    0.166    0.137    0.150    0.114    0.146    0.102    0.163    0.134    0.137
m4          0.119    0.126    0.104    0.113    0.082    0.106    0.071    0.122    0.097    0.099
Table A5.1 : Annual maximum flood data from East Central England for regional flood frequency estimation example.
STATION     32008    32010    33006    33009    34001    34002    34003    36001    36008    37001
YEARS     1945-66  1940-59  1956-77  1951-70  1958-77  1958-77  1959-78  1955-74  1961-80  1950-69
AM DATA      3.66    89.09     6.47    78.60    10.42    13.73     5.60    25.20    12.77    28.03
(m³/s)      29.56    65.45     9.69    80.90     9.90    10.44    10.07    20.39    15.78    16.99
             3.53    75.28     9.41    65.03    11.13    12.90     5.18    40.07    21.72    26.61
             3.92    21.45     8.73   152.72     7.14     5.28     2.62    40.07     5.98    21.23
             7.09    58.30     8.85    77.74     6.00     5.00     3.66    40.07    13.09    17.55
            12.87    38.61     5.63    77.74     7.89     6.12     2.72    23.22    11.87    14.72
             6.06   255.00     4.74    82.90     6.63     3.37     8.23    24.92    85.00    15.57
             4.13    28.67     5.60   147.16    17.98    13.02     6.42    33.76    22.30    31.14
             8.84    48.65     4.36    77.98     7.22     6.92     9.35    38.91    17.78    22.93
            15.80    62.68    10.39   145.34    21.80    61.92     6.38    23.22    20.38    20.95
             3.14    71.51     5.40    85.85    13.08    10.98     5.19    20.39    19.83    36.80
             7.72    54.15     6.60    78.10     8.74     8.89     9.02    18.69    10.88    25.48
             9.45    53.86     9.82    95.35    14.21    12.58     8.11    99.12     9.97    13.99
            26.72    47.83     7.59    36.98    14.21     9.89     3.43    34.92    31.20    38.64
             7.33    80.57     9.25    81.63     2.88     2.61     2.88    29.88     8.50     9.48
            15.31    48.39     9.12    89.16     7.10     3.81     9.99    28.67    18.52    24.52
             4.12    46.43     2.89   183.06    15.23    11.31     2.40    31.01    38.00    19.96
            10.43    68.28    13.30   128.92     5.01     4.13    10.65     4.07    34.34    34.11
            14.84    98.14    28.03    83.77     7.71     9.07     3.72    31.06    21.11    31.14
            11.66    62.53     6.46    88.74     8.04     7.31     6.04    29.38    22.96    20.95
M100       10.309   68.744    8.617   96.884   10.116   10.964    6.083   31.851   22.099   23.540
M101        3.219   24.413    3.129   38.929    3.737    3.209    2.221   11.815    7.213    9.385
M102        1.712   14.283    1.824   23.717    2.152    1.746    1.236    6.886    4.039    5.545
M103        1.118    9.792    1.249   16.803    1.468    1.144    0.824    4.679    2.704    3.831
M104        0.814    7.291    0.932   12.841    1.094    0.828    0.607    3.448    1.985    2.879
m0          1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
m1          0.312    0.355    0.363    0.402    0.369    0.293    0.365    0.371    0.326    0.399
m2          0.166    0.208    0.212    0.245    0.213    0.159    0.203    0.216    0.183    0.236
m3          0.108    0.142    0.145    0.173    0.145    0.104    0.135    0.147    0.122    0.163
m4          0.079    0.106    0.108    0.133    0.108    0.076    0.100    0.108    0.090    0.122
Table A5.1 (continued)
APPENDIX 6
WMO SURVEY ON DISTRIBUTION TYPES CURRENTLY IN USE FOR FREQUENCY
ANALYSIS OF EXTREMES OF FLOODS BY HYDROLOGICAL AND OTHER SERVICES
Prepared for the WMO Secretariat by B. Sevruk and H. Geiger, Institute of Geography, Federal Institute of
Technology, Zurich, Switzerland.
Introduction
The report contains an analysis of replies to a questionnaire sent to selected members of the WMO
Commission for Hydrology. Fifty-five agencies from 28 countries took part in the survey. The information and results
are summarized in five tables as follows:
Table I     List of countries and agencies
Table II    Distribution types currently in use for extremes of precipitation
Table III   Distribution types currently in use for floods
Table IV    Plotting positions currently in use for extremes of precipitation and floods
Table V     The most frequently used distribution types for precipitation and floods
Each country was also requested to indicate specific references of publications, manuals or textbooks
containing descriptions of the national / agency standard or recommended distribution type used operationally. These
references are also included in this Appendix.
Selection of distribution types
In most countries there is no "standard" or adopted distribution, but some statistical procedure is recommended in a general manner and applied. Usually a number of different distributions are applied to data, as shown in Tables II and III, and the choice of a particular distribution is made empirically and/or by comparison using statistical tests. The "best fit" distribution is eventually used.
A frequently used statistical test is that of Kolmogorov-Smirnov, but the χ² and Van Montfort tests [extreme value type 1 (EV1) and extreme value type 2 (EV2) distributions] as well as many other tests (Anderson-Darling; Brunet-Moret; Cramer-Von Mises; Kritski-Menkel, and some subjective tests) are also applied. However, in many countries the selection of an AM distribution is actually not made in any objective manner; the choice of distribution is argued in a general manner, as follows:
The distribution is
- widely, most or generally accepted
- simple, easy, quick or convenient to apply
- consistent, flexible or robust (low sensitivity to outliers)
- theoretically well based
- documented in the WMO Guide or elsewhere.
No special method of parameter estimation is preferred, and the graphical method is used as frequently as, or even more frequently than, any other method.
Currently used distribution types and plotting positions
The most frequently used or recommended frequency distribution types for extremes of both precipitation and floods are the EV1 and the log-normal distributions. Almost one half of all agencies use these distribution types. This agrees well with the conclusions in the WMO report by Sevruk and Geiger (1981) which reviewed the literature on the extremes of precipitation. The EV2, the Pearson 3 (P3) and the log-Pearson 3 (LP3) distributions are also in use but mainly for floods, where they account for one third of all the cases. Among the plotting positions, the Weibull formula is by far the most favoured (1/2 of agencies), followed by the Gringorten, Hazen and Blom formulas (1/3 of all agencies).
TABLE I
List of countries and agencies
No.
COUNlRY
AGENCY
ADDRESS
ABREVIATION
REFERENCE
1
Australia
Queensland Water
Resources Commission
GPO Box 2454,
Brisbane QLD 400I
AA 1
2
M.W.S. & D.B.
PO Box A53, Sydney
South, Australia 2000
AA2
3
N.S.W. Public Works
Department Office
Phillip St., Sydney
2000
AA3
29
4
Engineering & Water
Supply Depanment
GPO Box 1751,
Adelaide 5001
AA4
24,29,37,47
5
Bureau of Meteorology
GPO Box 1289K,
Melbourne VIC., 3001
AA5
37,48,49
6
"
AA6
36,48,49
·AA7
29,31,32
6, 7, 8, 12, 14, 34
7
State Rivers & Water
Supply Commission of
Victoria,
8
Hydro-Electric Comm.
GPO Box 355D, Hobart,
Tasmania, 7001
AA8
29
9
Water Resources
Branch Public Works
Department
2 Havelock Street,
West Perth,
Western Australia
AA9
29
10
Water Division, Dept.
of Transpon & Works
Darwin,
Northern Territory
AA 10
. 590 Orrong Road,
ArrnadaIe,
Victoria 3143
29,37,47
11
Auslria
Hydrographical
Service
Marxergasse 2,
A-1030 Vienna
All
20,40
12
Belgium
Section d'Hydrologie,
Institut Royal
Meteorologique de
Belgique
3, ave. Circulaire,
1180 Brus,els
BE I
45
13
Brazil
Departamento Nacional
deObrasde
Saneamento-Ministerio
do Interior
Rua Debrct, 23-99
Andar-Rio de Janeiro
RJ Brazil
BR I
14
Bulgaria
Institute d'Hydrologie
et de meteorologic
Blvd. Lenine, No.66
Sofia No.
BU I
55,56,57
15
Canada
Inland Water
Directorate
Environment Canada
CA I
Ottawa, Ontario,KIA OE7
22
TABLE I (continued)
No.
COUNlRY
AGENCY
ADDRESS
ABREVIATION
REFERENCE
16
Costa Rica
Instituto Costaricense
de Electricidad
Apartado 10032,
San Jose
CR I
17
Cyprus
DejJartment of Water
Development
Nicosia
CY I
18
Czechoslovakia
Czech Hydrometeonr
logical Institute'
Slovak Hydrometeorological Institute
15129 Praha 5,
Smichov, Holeckova 8,
833 15 BratislavaKolilia, Jeseniova 17
CZl'
19
Finland
The National Board of
Waters,"Hydrological
Office
P,O. Box 250
SF-OOIO! Helsinki 10
FII
20
Fnmce
Service Hydrologique
de I'Orstom
7~ 74ROlited'Aulnay
91140 Bondy
FRI
12
·21
German
Democratic
Republic
Meteorologischer
Dienst der DDR,
Forschungsinstitut
fllr Hydrometeorologie
DDR-1020 Berlin
GDI
41
Institut fllr
Wasserwirtschafi
DDR-ll90 Berlin
GD2
20
14, 27, 35, 52
22
Germany,
Federal
Republic
Institut fUr
Hydrologie und
Wasserwirtschafi,
Universitllt Karlsruhe "
KaiserSlrasse 12,
7500 Karlsruhe I
GF I
29
23
Hong Kong
Royal Observatory
Nathan Road~ KoWloon
HKI
25
24
Hungary
ResearchCenlre for
Water Resources
Development
PO Box 27,
H-1453 Budapest
HUI
16
25
Netherlands
Rijkswaterstaat' Directie PO Box 20907,
Waterhuishouding en
,·2500 EX The Hague
Waterbeweging
NE 1
3, .:is
Royal Netherlands
Meteorological
Institute (KNMI)
PO Box 201,
3730 AE De Bilt
NE2
9, 10
26
27
New Zealand
Water Resource Survey
DSIR
P.O. Box 12 - 043
Wellington
NZI
"5, 15, 50
28
Norway
The Norwegian
Meteorological Inst.
Box 320, Blindem
Oslo 3
NOI
38
TABLE I (continued)
No.
COUNTRY
AGENCY
ADDRESS
ABREVIATION
29
Poland
Institute of
Meteorology & Water
Management
61 , Podlesna str.,
01 - 673 Warsaw,
PO I
11,30
30
Romania
Institute de Meteorologie
et Hydrologie
Soseaua BucurestiPloiesti, Nr. 97Bucuresti-Romanie
RO I
19
31
Sweden
Swedish Meteorological
& Hydrologicallnst.
Box 923,
S - 601 19 Norrk5ping
SE I
32
Switzerland
InstilUt federal de
recherches foresticres
CH-8903 Birmensdorf
SW I
33
Service hydrologique
nationale
Case postale 2742
3001 Berne
SW2
34
Service federal des
routes et des digues
Case postale 2743
3001 Bcrne
SW3
Hydrology Division
Royal Irrigation Dept.
811 Samsen Road
Bangkok 10300
TH I
36
Electricity Generating
Authority of Thailand
Rama 6 Bridge
Nonthaburi, Bangkok
TH2
32
37
Hydrology Section
National Energy
Administration
Ban Phibuntham,
Kasatsuk Bridge,
Rama I Road
Bangkok 10500
TH3
36
35
Thailand
REFERENCE
54
21
38
Turkey
General Directorate
of State Hydraulic Works
Yilcetepe, Ankara
TU I
4,23
39
United
Kingdom
Wessex Water
Authority
Wessex House
Passage Street
Bristol BS20JQ
UK I
39
40
North West Water
Authority
Dawson House,
Liverpool Road,
Warrington WA5 3LW
UK2
39
41
Thames Waler Authority
Vastern Road
Reading RG I 8DB
UK3
39
42
Southern Water
Authority
Guildbourne House
Worthing, BNll ILD
UK4
39
43
Anglian Water
Authority
Ambury Road,
Huntingdon
PEl86NZ
UK5
39
TABLE I (continued)
AGENCY
ADDRESS
ABREVIATION
Institute of Hydrology
Wallingford
OXlOSBB
UK6
2S.49
Greater LondilU Council
Dept. of Public
Health Engineering,
River Branch
South Block,
County Hall,
London SEI 7PB
UK7
39
46
Norihumbrian Water
Authority
Gosforlh,
Newcastle upon Tyne
NE33PX
UK8
1,2, IS
41'
Sevem"TtentWater
Authority
2297 CoventryRoad
Birmingham,
B263PU
UK9
17.39
4S'
WelshWater
Authority
Cambrian Way.
Brecon, Powys,
LD37HP
UK 10
39'
49
South West Water
Authority
Matford Lane,Exeter
EX24QX, Devon
UKlI
39
No.
COUN1RY
44
45
UnitedKingdom
(Continued)
REFERENCE
50;
Uruguay
Direccion Nacional
de Meteorologia '
Castilla de Correa
No; 64
UR I
53
51
USA
NOAA,National
Weather Service
S060 13th St.
Silver Spring MD
US 1
26.51
52
US Army Corps of
Engineers. HQDA
(DAEN"CWH)
Washington DC 20314
US 2
51
53
US Geological
Survey
415 National Center
Reston Virginia 22092
US 3
51
National Council for
Scientific Research
& Meteorological Dept.
POBox 30200, Lusaka
Zambia
ZA 1
54
zambia
TABLE II
Distribution types currently in use for extremes of precipitation
(') indicates distribution recommimded or used as "standard" distribution
DISTRIBUTION
TYPE
COUNTRY AND AGENCY
TOTAL %
TOTAL' %
NO. OF
COUNTRIES
Extreme Value (EVI)
AAI, BEl', CRI', CZ',Fll',
FRI, GD, GFI', HKI', NE2',
NZI', NOl', POI', SEI, SWI,
THI', TH2', TUI', UK2', UK4',
UKS, UK7', UK11', URI', USI'
2S
30
19
41
20
Extreme value (EV2)
AAI, BEl', HI, FRI, SWI,
TUI', UKS, USI'
8
10
3
6
8
Extreme value (EV3)
AAI, GD', SEI, UKS
4
S
I
2
3
General extreme value (GEV) FRI, NE2', UKl", UK3', UKS', UKl!'
6
7
S
11
3
Pearson 3 (P3)
AAI, BUI, FRI, POI', ROl',
TH3, TUI'
7
9
3
6
7
Gamma 2 or 3 parameters
BUI, CYI', CZ', SEI
4
S
2
4
4
Exponential
GFI', SWI
2
2
I
2
2
Log-Pearson 3 (LP3)
AAI, AA3, FRI, TR3', TVI'
S
6
2
4
4
Normal
SEI, ZAI'
2
2
I
2
2
Normal- power-transformed
AAI
I
I
0
0
I
Log-Normal
AAl', AAS', AA7', AA8', BEl,
BUI', FRI,HUI, POI', SEI,
TH3,TUI, , UKS
13
16
7
IS
10
General symmetrical
distribution
AAI
I
I
0
0
I
Boughton
AAI
I
I
0
I
I
Wakeby
AAI
I
0
0
I
Loi des fuils
FRI
I
I
0
0
I
Krilski-Menkel
ROI'
I
I
I
2
I
I
I
I
I
I
I
2
2
I
I
84
100%
47
100%
Empirical methods
Log-log
Hazen
Total
BRI'
TRI'
TABLE III
Distribution types currently iu use for floods
(*) indicates recommended or used as "standard" distribution
DIS1RIBUTION
TYPE
COUN1RY AND AGENCY
TOTAL %
TOTAL* %
NUMBER OF
COUNlRIES
Extreme value (EVI)
AAI, AAZ, AA3*, All*,
BEl, CAl, CRI', CYI',
FII*, FRI, GD, GFI, HKI',
NZI*, SEI, TRI', THZ',
TH3*, TVI*, UKI, UKZ*,
UK3*, UK4*, UK5*,
UK7*, UK9, UKI9*, UK))*
ZS
ZO.O
IS
Z5.0
16
Extreme value (EVZ)
AAI, AAZ, AA3*, BEl, FII,FRI,
GEl, NZI, TVI*,UK5, URI*
))
7.7
3
40
9
Extreme value (EV3)
AAI, CAl, CRI, CZ, FII,
SI, TH3, UK5
S
5.6
0
0.0
S
Geneml extreme
value (GEV)
FRI, UKI*, UKZ*, UK3*
UK5*, UK9, UK))*
7
4.9
5
7.0
Z
Pearson 3 (P3)
AAI, AAIO, All', CZ',
BU!, FRI, HU!, POI',
ROl*, SWZ, SW3, THZ',
TH3, TVI*, UKI', UKIO', UK))
l7
11.9
8
IZ.O
IZ
Gamma with Z or 3
parameters
BEl, BU!, CRI', CZ',
GFI', SEI
6
4.Z
3
4.0
6
Exponential
BEl, NEI', SW3, UK7',UKS*
5
3.5
3
4.0
4
Log-Pearson 3
(LP3)
AAI*, AA3', AA4',AA6', AA7*,
AAS' ,AA9', AA10', All',
CAl, CRI', CZ', FRI,
GFI', NZI, SWZ, TH3',
TUl', UK)), USI', USZ', US3'
ZZ
15.4
17
Z3.0
13
Normal
SEI
I
0.7
0
0.0
I
Nonnalpower transformed
AAI'
I
0.7
0
0.0
I
Normal-Box-Cox
transfonned
AA4
I
0.7
I
1.0
I
Log-Normal
AAI, AAZ, AA3', AM, AA7',
AA9', BEl, BUl', CAl, CRI',
CYI, CZ', FIl, FRI, GFI,
SEI, SWZ, SW3, THZ', TH3,
TVI', UK5, UK7', UK9,
UKIO', UK)), ZAI'
27
18.9
11
15.0
16
TABLE III (continued)
DIS1RIBUTION
TYPE
COUNTRY AND AGENCY
TOTAL %
TOTAL' %
NUMBER OF
COUNTRIES
General symmetrical
distribution
AAI
I
0.7
0
0.0
I
Boughton distribution
AAI
1
0.7
0
0.0
I
Wakeby Distribution
AAI
I
0.7
0
0.0
1
Loi des foits
FRI
1
0.7
0
0.0
1
Kritski-Menkel
ROI'
I
0.7
I
1.0
1
Beta
FRI
0.7
0
0.0
1
0.7
1
I
1
0.6
1
I
1
Empirical methods
Hazen
TIll'
Schreiber-Noblis
All'
Total
1
143
72
TABLE IV
Plotting positions currently in use for extremes of precipitation and floods
(n = sample size, m = rank in descending order)
PLOTTING      FORMULA                        COUNTRY AND AGENCY                    TOTAL   (%)   NUMBER OF
POSITION                                                                                         COUNTRIES
Weibull       T = (n + 1) / m                AA1, AA2, AA3, AA6, AA7,                29     47      16
                                             AA8, AA9, AA11, AU1, BE1,
                                             CR1, FI1, GF1, HK1, NZ1,
                                             PO1, RO1, SW1, SW2, TH1,
                                             TH2, TU1, UK2, UK4, UK11,
                                             US1, US2, US3, ZA1
Gringorten    T = (n + 0.12) / (m - 0.44)    NE2, NZ1, UK2, UK3, UK4,                 8     13       3
                                             UK5, UK9, UK11
Hazen         T = n / (m - 0.5)              AA2, BR1, CZ, FR1, TH1,                  7     11       6
                                             UK9, UK11
Blom          T = (n + 0.25) / (m - 0.375)   AA7, UK2, UK5, UK8, UK9,                 6     10       2
                                             UK11
Cunnane       T = (n + 0.2) / (m - 0.4)      AA1, AA10, CA1, GF1                      4      7       3
California    T = n / m                      AA2, AU1, BR1                            3      5       3
Chegodayev    T = (n + 0.4) / (m - 0.3)      NE1, PO1                                 2      3       2
Adamowski     T = (n + 0.5) / (m - 0.24)     CA1                                      1      2       1
              T = [(n + 1) - a] / (m - a)    AA4                                      1      2       1
Total                                                                                61    100
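The named plotting position formulas of Table IV all have the form T = (n + A) / (m - B). A small Python sketch, with an illustrative dictionary and function name that are not part of the survey, shows how the return period assigned to the largest flood in a 20-year record differs between formulas.

```python
# (A, B) pairs for T = (n + A) / (m - B), taken from Table IV.
PLOTTING_POSITIONS = {
    "Weibull": (1.0, 0.0),
    "Gringorten": (0.12, 0.44),
    "Hazen": (0.0, 0.5),
    "Blom": (0.25, 0.375),
    "Cunnane": (0.2, 0.4),
    "California": (0.0, 0.0),
    "Chegodayev": (0.4, 0.3),
    "Adamowski": (0.5, 0.24),
}


def return_period(name, n, m):
    """Return period T for rank m (descending order) in a sample of size n."""
    add, sub = PLOTTING_POSITIONS[name]
    return (n + add) / (m - sub)


# Largest of 20 annual maxima (m = 1) under each formula.
largest_of_20 = {name: return_period(name, 20, 1) for name in PLOTTING_POSITIONS}
```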
TABLE V
The most frequently used distribution types for precipitation and floods
(*) indicates recommended or used as "standard" distribution
DISTRIBUTION TYPE                PRECIPITATION        FLOODS
                                   %      %*          %      %*
Extreme value 1 (EV1)             30      41          20      27
Log-normal (LN)                   16      15          19      15
Extreme value 2 (EV2)             10       6           8       4
General extreme value (GEV)        7      11           5       7
Pearson 3 (P3)                     9       6          12      12
Log-Pearson 3 (LP3)                6       4          15      23
Extreme value 3 (EV3)              5       2           6       0
Gamma                              5       4           4       6
Others                            12      11
References containing descriptions of distribution types used by national hydrological agencies
1
Archer, D.R., 1981: Seasonality of flooding and assessment of seasonal flood risk. Proc. Inst. Civ. Eng., Pt 2, 70.
2
Archer, D.R., 1981: A catchment approach to flood estimation. J. Inst. Water Eng. and Sci., 35 (3).
3
Barlow, R.E. et aI., 1972: Statistical inference under order restrictions. Wiley, New York.
4
Bayazit, M.: Statistical methods in hydrology. Ankara.
5
Beable, M.E. and McKerchar, A.I., 1982: Regional flood estimation in New Zealand. Tech. Pub. No. 20, Water
& Soil Div., Ministry of Works and Development, Wellington.
6
Bobee, B. and Boucher, P., 1981: Adjustment des distributions Pearson type 3, gamma, log-Pearson type 3 ei
10g:gamma. Rapport Scientifique No. 105, INRS-Eau, Universite du Quebec.
7
Boughton, W.C., 1980: A frequency distribution for annual floods. Water Resour. Res., 16 (2), pp. 347-354.
8
Box, G.E.P. and Tiao, G.C., 1973: Bayesian inference in statistical analysis. Addison-Wesley, Reading, pp 156-160.
9
Buishand, T.A. and Velds, C.A., 1980: Klimaat van Nederland, Neerslag en Verdamping. Staatsuitgeverij, Den
Haag, pp 206.
10
Buishand, T.A., 1983: Uitzonderlijk hoge neerslaghoeveelheden en de theorie van de extreme waarden.
Cultuurtechnisch Tijdschrift, 23, 9-20
11
Byczkowski, A., 1972: Hydrological background for the design of the land reclamation structures. Extremal
flows. PWRiL, Warsaw.
12
Cahiers de I'Orstom, Serie d'Hydrologie, 6 (3), 1969; 10 (2), 1973; 11 (4), 1974; 12 (2), 1975; 15 (3), 1978.
13
Chander, S., Spolia, S.K. and KUMAR, 1978: Flood frequency analysis by power transformation. J. Hydraul.
Div., A.S.C.E., 104 (HYll), 1495-1504.
14
Chow, V.T. (ed), 1964: Handbook of applied hydrology. McGraw-Hill Book Company, pp 8-25 to 8-26.
15
Coulter, J.D. and Hessell, J.W.D., 1980: The frequency of high intensity rainfall in New Zealand, Part II, point estimates. NZ Met. Ser.
16
Csoma, J. and Szigyarto, Z., 1975: A matematikai statisztika alkalmazasa a hidrol6giaban. Vizgazdlilkodesi
Tudomanyos KutatO Intezet, Budapest.
17
Cunnane, C., 1978: Unbiased plotting positions - a review. J. Hydrol., 37 (3/4), pp 205-222.
18
Dalrymple, T., 1960: Flood frequency analysis. Manual of Hydrology. Pt. 3, US Geol. Survey.
19
Diaconu, C. and Mociornita, D., Instructions techniques pour la determination des particularites des cmes de
calcul (tMoriques) sur les rivieres, Bucarest.
20
DVWK, 1976: Empfehlung zur Berechnung der Hochwasserwahrscheinlichkeit. H. 101, DVWK-Reglen ZUf
Wasserwirtschaft. Hamburg, Paul Parey Verlag.
21
Eidgenoessisches amt fUr Strassen nnd Flussbau, 1974: Die grossten bis zum Jahre 1969 beobachteten
Abflussmengen von schweizerischen Gewassem, Bern.
22
Environment Canada Manuals
Flood Damage Reduction Program, Flood Frequency Analysis (FDRPFFA)
Flood Frequency Analysis with Low Outliers (LOWOUT)
Flood Frequency Analysis with Historic Information (ISTORIC)
Inland Waters Directorate, Environment Canada, Ottawa, Ontario, K1A 0E7
23
Erkek, C.: Statistical methods in flood estimations. Ankara.
24
Gnanadesikan, R., 1977: Methods for statistical data analysis of multi-variate observations. J. Wiley & Sons.
25
Gumbel, E.J., 1954: Statistical theory of extreme values and some practical applications. US Nat. Bur. Stds.
Appl. Math. Ser. 33.
26
Gumbel, E.J., 1958: Statistics of Extremes. Columbia University Press, New York, pp 375.
27
Haan, C.T., 1977: Statistical methods in hydrology. The Iowa State University Press.
28
Hamlin, M.J. and Wright, C.E., 1978: The effect of drought on the river systems. Pap. to Royal Soc., pp 73-93, London.
29
Institution of Engineers, 1977: Australian rainfall and runoff-flood analysis and design. 2nd ed., Australia.
30
Kaczmarek, Z., 1970: Statistical methods in hydrology and meteorology. WKiL, Warsaw.
31
Karoly, A. and Alexander, G.N., 1960: Analysis of storm rainfall in Victoria.
32
Kite, G.W., 1977: Frequency and risk analysis in hydrology. Water Resour. Publ., Fort Collins, Colorado
80522, U.S.A.
33
Koppittke, R., Steward, B. and Tickle, K., 1976: Frequency analysis of flood data in Queensland. Paper
presented at Institution of Eng. Austr. Hydrol. Symp., Sydney.
34
Landwehr, J.M., Matalas, N.C. and Wallis, J.R., 1979: Estimation of parameters and quantiles of Wakeby
distribution 1 and 2. Water Resour. Res., 15 (6), pp 1361-1379.
35
Linsley, R.K., Kohler, M.A. and Paulhus, J.L.H.: Applied hydrology. McGraw-Hill, New York.
36
Maniak, U., De Haar, U., Hofius, K., Johannsen, H.H., Liebscher, H., Schroeder, R., Schultz, G., Wohr, F. and
Zayc, R., 1971: Theoretische Hydrologie, Heft 1, Stochastische Verfahren. Deutsche
Forschungsgemeinschaft, Bonn.
37
McMahon, T.A. and Srikanthan, R., 1981: LP III distribution - is it applicable to flood frequency analysis of
Australian streams? J. Hydrol., 52 (1/2), pp 139-147.
38
Nemec, J., 1972: Engineering Hydrology. McGraw-Hill Publishing Company Limited, England.
39
NERC, 1975: Flood studies report. Nat. Env. Res. Council, London.
40
Nobilis, F., 1981: Zur Berechnung der n-Jährlichkeit von Hochwässern, und zur Interpretation von
Konfidenzintervallen. Mitteilungsblatt des Hydrographischen Dienstes in Österreich, Nr. 49, S. 44-59, Wien.
41
QWRC, 1982: An empirical study of flood frequency data. Queensland Water Resour. Comm., Surface Water
Branch, Internal Rep. No. 000716. PR, Febr.
42
QWRC, 1983: Plotting positions for Queensland flood frequency data. Queensland Water Resour. Comm.,
Surface Water Branch, Internal Rep. No. 000717, Jan.
43
Reich, T.: Zur Häufigkeitsverteilung extremer Tagessummen der Niederschlagshöhe. Meteorol. Z. (in press).
44
Sevruk, B. and Geiger, H., 1981: Selection of distribution types for extremes of precipitation. World Meteorol.
Org., Operational Hydrol. Rep., 15, WMO-No. 560, pp 64.
45
Sneyers, R., 1975: Sur l'analyse statistique des series d'observations. WMO-Tech. Note No. 143, pp 189.
46
Srikanthan, R. and McMahon, T.A., 1981: Log Pearson III distribution - effect of dependence, distribution
parameters and sample size on peak annual flood estimates. J. Hydrol., 52 (1/2), pp 149-159.
47
Srikanthan, R. and McMahon, T.A., 1981: Log Pearson III distribution - an empirically-derived plotting
position. J. Hydrol., 52 (1/2), pp 161-163.
48
Stephens, M.A., 1974: EDF Statistics for goodness of fit and some comparisons. J. Am. Stat. Ass., Vol. 69,
pp 730-737.
49
Tabony, R.G., 1977: The variability of long duration rainfall over Great Britain. Meteorol. Off. Sci. Pap. No.
37.
50
Tomlinson, A.I., 1980: The frequency of high intensity rainfalls in New Zealand. Part I. Tech. Pub. No. 19,
Water & Soil Div. Ministry of Works and Development, Wellington, pp 36 + 4 maps.
51
U.S. Interagency Committee on Water Data, 1982: Guidelines for determining flood flow frequency:
Hydrology Subcommittee Bulletin 17-B (with editorial corrections, March 1982), U.S. Geol. Survey, Office
of Water Data Coordination, Reston, Virginia 22092.
52
Viessman, W.R. et al., 1972: Introduction to hydrology. Intext Educational Publishers, New York.
53
WMO, 1981: Hydrological Operational Multi-purpose Subprogramme (HOMS) Reference Manual. World
Meteorol. Org., 1st Ed.
54
Zeller, J., Geiger, H. and Roethlisberger, G., 1976-1981: Starkniederschläge des schweizerischen Alpen- und
Alpenrandgebietes. Band 1-5, Eidg. Anstalt für das forstliche Versuchswesen, Birmensdorf, Schweiz.
55
Marinov, I. et al., 1979 and 1980: Manuel hydrologique, Vol. I & II, Hydrologuitchen naratchnik, "Tecnica",
Sofia.
56
Guerassimov, S., 1980: Manuel de determination des crues en Bulgarie, Blvd. Lennine 66, Sofia.
57
Guerassimov, S., 1976: Determination de la distribution empirique precise a composition des components
statistiques, Hydrologie et meteorologie, No. 2, 1976, Sofia.