
Project Plan Task 8 and VERSUS2 Installation Problems
Anatoly Myravyev and Anastasia Bundel, Hydrometcenter of Russia
March 2010

Task 8: Statistical features like confidence intervals and the Bootstrap method

Formal definition of confidence intervals (CIs):
• Estimation of an unknown value θ, which defines the distribution P of a random sample X from the population 𝒫 = {P}.
• If for a given α > 0 there exist random variables θ⁻ = θ⁻(α, X) and θ⁺ = θ⁺(α, X) such that P(θ⁻ < θ < θ⁺) ≥ 1 – α, then the interval (θ⁻, θ⁺) is called the confidence interval for θ of level 1 – α.
• The interval is random; the unknown value θ that it contains is not random.

The statistical problem lies in the construction of CIs
• Cases with a known probability distribution function (pdf) of the population: parametric CIs
• Cases where the pdf is not known: non-parametric CIs

Parametric CIs
• The normal distribution assumption is the most frequent. The underlying sample must be an iid sample (independent and identically distributed).
• Pluses:
  – Easy and not computationally intensive
• Minuses:
  – Cannot be used for scores with non-normal distributions (proportions, odds ratio, correlation coefficients, …) without some normalization, or else require complicated calculation formulas

Non-parametric CIs
• Construction of artificial datasets from a given collection of real data by resampling the observations.
• Pluses:
  – Highly adaptable to different testing situations because no assumptions regarding an underlying theoretical distribution of the data are required
  – Computational ease
• Minuses:
  – The assumptions for sample statistics must not be overlooked: representativeness, iid

Bootstrapping
• Operates by constructing the artificial data using sampling with replacement from the original data (Efron 1979, Wasserman 2006)
• Highly elaborated computational technique (R-project)
• The most common and popular resampling method in verification (Wilks 1995)

Different bootstrap methods – how to construct CIs from the samples obtained
• Percentile CIs (used at present in the MET package)
• Bias-corrected and accelerated CIs (BCa)
• Normal approximation CIs
• Basic bootstrap CIs
• Bootstrap-t CIs
• Approximate bootstrap confidence intervals (ABC), etc.
A compromise between their accuracy and computational burden must be made.

Implementation of CIs using R package boot
• boot is one of the packages required by the R verification package
• The intention is to introduce commands analogous to the MySQL v_index table in a form like
• index_booted <- boot(index(fcs, obs), 1000)
• index_ci <- boot.ci(index_booted, conf = c(0.95, 0.99), type = c("perc", "bca"))

Conclusions
• The accuracy of statistical scores depends, among other things, on the following:
  – Sampling uncertainty
  – Validity of assumptions about representativeness and iid of the sample
  – Observational uncertainty
  – Uncertainty in the physical processes (Gilleland, 2008)
• Bayesian prediction intervals?
• Different α can be used (e.g. CIs of level 0.95, 0.99, even 0.70, etc.) depending on the scope of analysis

Conclusions (2)
• Since it is unclear which CI construction method is “most precise”, we should try several procedures on the real forecast (frc) and observation (obs) data available. Both parametric and non-parametric statistics are legitimate (MET experience!)
• The decision making (what is good, what is bad) should be performed on a multi-criteria basis

Problems with VERSUS2 functioning in the Hydrometcenter of Russia

Problems with VERSUS2 functioning
• Installation in the RedHat environment completes without errors
• The new data leave traces in the MySQL tables, and the test (Pirmin-) files are acquired
• However, the data information gets lost in the vicinity of the Data Availability tab (Model? Date Intervals?...)
• A tutorial variant of the package with valid obs and frc data is urgently needed

Thank you for your attention!
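
Backup: a minimal sketch of how percentile and BCa intervals could be obtained with the boot package, along the lines of the commands proposed under "Implementation of CIs using R package boot". The RMSE score, the synthetic fcs/obs values, and the names pairs, rmse, rmse_booted and rmse_ci are assumptions made for illustration only; they are not part of VERSUS2 or of the v_index table. boot() expects the data plus a statistic of the form function(data, indices), so the score is passed as a function and re-evaluated on each resample.

```r
library(boot)  # 'boot' is a recommended package shipped with standard R installations

set.seed(42)   # make the resampling reproducible

# Synthetic forecast/observation pairs (made up for illustration only)
n     <- 200
obs   <- rnorm(n, mean = 10, sd = 3)
fcs   <- obs + rnorm(n, mean = 0.5, sd = 2)
pairs <- data.frame(fcs = fcs, obs = obs)

# Hypothetical score: root-mean-square error of the forecasts.
# boot() calls the statistic with the data and a vector of resampled row indices,
# so forecast/observation pairs are always resampled together.
rmse <- function(d, i) {
  sqrt(mean((d$fcs[i] - d$obs[i])^2))
}

# 1000 bootstrap replicates, drawn with replacement from the original pairs
rmse_booted <- boot(data = pairs, statistic = rmse, R = 1000)

# Percentile and BCa intervals at the 0.95 and 0.99 levels
rmse_ci <- boot.ci(rmse_booted, conf = c(0.95, 0.99), type = c("perc", "bca"))
print(rmse_ci)
```

Resampling whole rows keeps each forecast matched to its observation, but it still assumes the pairs are iid; serially correlated verification samples would need a different resampling scheme.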
