Searching the “True” value: Central tendency indicators and experimental data
CRLISS
Alessio Pitidis, Department of Environment and Primary Prevention, ISS

Standards for statistical treatment of Proficiency Tests data
Examples: ISO/IEC Guide 43-1 (1996), CD 17043 (2008); ISO 13528 (2005)
Determining assigned value: known values, certified reference values, reference values, consensus between experts, consensus between participants.
Determining standard deviation for proficiency assessment: prescribed value, by perception, from a general model, from data in a round of a PT scheme.
Performance score: difference, percent difference, percentile or ranks, z-scores, En-numbers.

Reaching Consensus
If you do not have a gold standard (reference value), you need to reach consensus.
Preferred method in the guidelines and standards: Algorithm A (H15).
Other methods with a sound statistical basis are also allowed.

The core of Algorithm A
The H15 algorithm is a Huber estimator, built on the sum
$$\sum_i \rho(u_i), \qquad u_i = \frac{x_i - \mu}{c \cdot \sigma}$$
$$\rho(u) = \tfrac{1}{2} u^2 \quad \text{if } |u| \le k$$
$$\rho(u) = k|u| - \tfrac{1}{2} k^2 \quad \text{if } |u| > k$$

Huber estimators
$$\varphi = \min \sum_i \rho(u_i)$$
k → ∞ ⇒ mean
k → 0 ⇒ median
[Figure: φ against u, with −k and k marked on the u axis]
It is essentially a weighted mean, obtained by introducing a tuning constant that controls the weight given to the outliers.

The concept of trimmed mean
The weighted mean is conceptually very similar to the trimmed mean: a weighted mean in which the extreme values are assigned weight zero.
The trimmed mean is insensitive to small numbers of gross errors and works well with heavy-tailed distributions close to the normal. The mean fails the first, the median fails the second (Royal Society of Chemistry).
Robust statistics work well with error distributions heavier-tailed than the normal (RSC).

3 Definitions of robustness
A robust measure is insensitive to:
1. The presence of outliers
2. Grouping and rounding of observations
3. Deviation from basic assumptions (Box), in particular the distributional hypothesis

The curve of influence
Definition:
$$IC(x; F, T) = \lim_{\varepsilon \to 0} \frac{T[(1-\varepsilon)F + \varepsilon\,\delta_x] - T(F)}{\varepsilon}$$
Where:
T(F) = parameter estimator
δ_x = probability distribution that assigns unit mass to the point x on the real line
F = probability distribution to which the estimator T is applied

Mean: T(F) = µ
[Figure: IC(x) against x for the mean, with µ marked on the x axis]
The influence of an infinitesimal contamination on the mean increases linearly with the difference x − µ, in both directions.

The curve of influence
Median: T(F) = F^{-1}(1/2)
[Figure: IC(x) against x for the median, with F^{-1}(1/2) marked on the x axis]
The median has a bounded, monotone influence curve. An infinitesimal contamination at a point x has a constant effect on the median, but the curve has a discontinuity at the central value, reflecting a local instability at that point.

α-trimmed mean:
$$0 \le \frac{\alpha}{2} \le \frac{1}{2}, \qquad T(F) = \frac{1}{1-\alpha} \int_{\alpha/2}^{1-\alpha/2} F^{-1}(t)\,dt$$

Influence on α-trimmed mean
[Figure: IC(x) against x for the α-trimmed mean, with F^{-1}(α/2) and F^{-1}(1 − α/2) marked on the x axis]
The α-trimmed mean combines the properties of the mean and the median estimators: the change in the trimmed mean is bounded when the contamination point falls in the central part of the sample, and no further change occurs when it falls in the tails. An outlier lies in the tails, so its exact value does not influence the trimmed mean.

Central tendency indicators robustness
The trimmed mean and the median behave in the same way in the tails, so their robustness to the presence of extreme values in the tails is equivalent.
The advantage of the trimmed mean over the median is that it maintains its sensitivity to variations in the central part of the distribution.
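
The Huber estimator and the α-trimmed mean discussed in these slides can be illustrated with a minimal Python sketch. It is not the full Algorithm A of the standards: the scale is held fixed at 1.4826 × MAD, the constant c is taken as 1, and k defaults to 1.5, all of which are assumptions made here for illustration. The function names `huber_location` and `trimmed_mean` are hypothetical.

```python
import statistics


def huber_location(xs, k=1.5, tol=1e-9, max_iter=100):
    """Huber M-estimate of location with a fixed robust scale.

    Assumptions (not taken from the slides): c = 1, scale = 1.4826 * MAD,
    and the scale is not updated during the iteration.
    """
    mu = statistics.median(xs)
    sigma = 1.4826 * statistics.median(abs(x - mu) for x in xs)
    if sigma == 0:
        return mu  # more than half of the data are identical
    for _ in range(max_iter):
        lo, hi = mu - k * sigma, mu + k * sigma
        # Pull values outside [lo, hi] back to the boundary, then average.
        new_mu = statistics.fmean(min(max(x, lo), hi) for x in xs)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


def trimmed_mean(xs, alpha):
    """alpha-trimmed mean: drop a fraction alpha/2 of the data from each tail."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1)")
    s = sorted(xs)
    g = int(len(s) * alpha / 2)  # number of observations dropped per tail
    return statistics.fmean(s[g:len(s) - g])


# Illustrative data with one gross error (made up for this example).
data = [9.8, 9.9, 10.0, 10.0, 10.1, 10.2, 10.3, 25.0]
print("mean        ", statistics.fmean(data))
print("median      ", statistics.median(data))
print("huber       ", huber_location(data))
print("trimmed 25% ", trimmed_mean(data, 0.25))
```

With a very large k the clipping never acts and the result is the ordinary mean, which matches the limiting case k → ∞ on the slide.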
Centers of r order: robustness
The center of order r is the value c that solves
$$\min_{c} \sum_i |x_i - c|^{r}\, n_i \qquad \text{for } r \ge 0$$
where n_i is the frequency of the value x_i.

Order   Center            Robustness
r       Powers mean       very low
2       Arithmetic mean   low
1       Median            medium
0       Mode              high

Relationship among centers
For a slightly asymmetric unimodal distribution we have the empirical relationship:
Mean − Mode ≈ 3 (Mean − Median)
For a symmetrical unimodal distribution: Mean = Median = Mode
[Figure: positions of mode, median and mean in asymmetric and symmetric unimodal distributions]

The problem of inference
Whenever we use a single statistic to estimate a parameter, we refer to the estimate as a point estimate for that parameter.
When we use a statistic to estimate a parameter, the verb used is "to infer." We infer the population parameter from the sample statistic.
Some population parameters cannot be inferred from the statistic.
The population size N cannot be inferred from the sample size n.
The population minimum, maximum, and range cannot be inferred from the sample minimum, maximum, and range. Populations are more likely to have single outliers than a smaller random sample.
The population mode and median usually cannot be inferred from a smaller random sample. There are special circumstances under which a sample mode and median might be a good estimate of a population mode and median.

Inferring a parameter
(X = cannot be inferred)

Sample                        Population
Sample size n            X    Population size N
Sample mode              X    Population mode
Sample median            X    Population median
Sample mean                   Population mean
Sample stdev                  Population stdev
Sample distrib. shape         Population distrib. shape

Deviation from the Hypotheses
Parameter, standard error, sample distribution (Box robustness):

Mean
Standard error: $\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}} = \sigma / \sqrt{N}$
Sample distribution: this holds for small and large samples. For N ≥ 30 the sampling distribution of the mean is normal whatever the population distribution.

Proportion (median) (mode)
Standard error: $\sigma_p = \sqrt{p(1-p)/N}$
Sample distribution: the same observations made for the mean apply. Median = 50%.

Median
Standard error: $\sigma_{me} = \sigma \sqrt{\pi / (2N)} = 1.2533\, \sigma / \sqrt{N}$
Sample distribution: for N ≥ 30 the sampling distribution of the median is normal only if the population is normal.

Using the median in small samples
Use as a proportion: the median is the 50th percentile of a frequency distribution.
You can rely on the properties of the binomial distribution and calculate exact-probability confidence intervals without needing the hypothesis of normality.
$$P(\hat{p} \le p_1) = \sum_{x=0}^{n p_1} \binom{n}{x} p^x (1-p)^{n-x} \le \frac{\alpha}{2}$$
$$P(\hat{p} \ge p_2) = \sum_{x=n p_2}^{n} \binom{n}{x} p^x (1-p)^{n-x} \le \frac{\alpha}{2}$$
$$P(p_1 \le \hat{p} \le p_2) \ge 1 - \alpha$$
Powering the sample: if you cannot increase the number of observations, you can do so computationally, i.e. by using bootstrap techniques.

Basic concepts of bootstrap
Observed sample, K = 10:
X1, X2, X3, X4, X5, X6, X7, X8, X9, X10
Extract a random sample of K elements from the observed one, allowing data to be repeated in the random sample. In other words, each time you draw a value you put it back into the sample you are extracting from. For example, from the original sample you might generate:
X1, X3, X3, X4, X5, X6, X6, X6, X7, X10
Now you have the numbers to calculate your central parameter and its stdev.
$K^K$ possible arrangements with repetition (possible samples)
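
The two small-sample approaches above, the exact binomial interval for the median and the bootstrap, can be sketched in Python as follows. This is only an illustration under stated assumptions: the interval is expressed through order statistics of the sample with p = 0.5, which is one common way of applying the binomial argument, and the function names, the number of resamples and the seed are hypothetical choices, not values from the slides.

```python
import math
import random
import statistics


def median_ci_exact(xs, alpha=0.05):
    """Distribution-free interval for the median from Binomial(n, 0.5).

    Returns the order statistics (x_(j), x_(n-j+1)), where j is the largest
    rank such that P(B <= j - 1) <= alpha / 2 for B ~ Binomial(n, 0.5).
    """
    s = sorted(xs)
    n = len(s)
    cdf = 0.0
    j = 0
    for x in range(n):
        cdf += math.comb(n, x) * 0.5 ** n
        if cdf > alpha / 2:
            break
        j = x + 1
    if j == 0:
        raise ValueError("sample too small for the requested confidence level")
    return s[j - 1], s[n - j]


def bootstrap_median(xs, n_resamples=2000, seed=0):
    """Bootstrap the median: resample K values with replacement, many times.

    Returns the mean and the standard deviation of the resampled medians.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    k = len(xs)
    medians = [statistics.median(rng.choices(xs, k=k)) for _ in range(n_resamples)]
    return statistics.fmean(medians), statistics.stdev(medians)


# Illustrative sample of K = 10 observations (made up for this example).
sample = [4.1, 4.4, 4.6, 4.7, 4.9, 5.0, 5.2, 5.3, 5.8, 7.9]
print("exact interval for the median:", median_ci_exact(sample))
print("bootstrap mean and stdev of the median:", bootstrap_median(sample))
```

For K = 10 the number of possible resamples, K^K, is far too large to enumerate, which is why the sketch draws a random subset of them.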