Searching the “True” value: Central tendency indicators and experimental data
CRLISS
Alessio Pitidis
Department of Environment and Primary Prevention, ISS
Standards for statistical treatment of Proficiency Tests data
Examples: ISO/IEC Guide 43-1 (1996), CD 17043 (2008); ISO 13528 (2005)
Determining the assigned value: known values, certified reference values, reference values, consensus between experts, consensus between participants.
Determining the standard deviation for proficiency assessment: prescribed value, by perception, from a general model, from data in a round of a PT scheme.
Performance score: difference, percent difference, percentile or ranks, z-scores, En numbers.
Reaching Consensus
If you do not have a Gold Standard (reference value), you need to reach consensus.
Preferred method in the Guidelines and Standards: Algorithm A (H15)
Other methods with a sound statistical basis are also allowed.
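The core of Algorithm A
The H15 algorithm is a Huber estimator: it chooses µ so as to minimise
$$\sum_i \rho(u_i), \qquad u_i = \frac{x_i - \mu}{c \cdot \sigma}$$
where
$$\rho(u) = \begin{cases} \tfrac{1}{2}u^2 & \text{if } |u| \le k \\ k|u| - \tfrac{1}{2}k^2 & \text{if } |u| > k \end{cases}$$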
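The ρ function above can be minimised by iteratively reweighting the observations. The following is only a minimal sketch, not the full Algorithm A of the standards: it assumes the scale c·σ is held fixed at 1.4826 times the median absolute deviation and uses k = 1.5 as tuning constant. The function name `huber_location` and both of these choices are assumptions made for illustration.

```python
from statistics import median


def huber_location(values, k=1.5, tol=1e-9, max_iter=200):
    """Huber estimate of location with a fixed, MAD-based scale (illustrative)."""
    xs = list(values)
    mu = median(xs)  # robust starting point
    # Assumption: the scale c*sigma is replaced by 1.4826 * MAD and kept fixed.
    scale = 1.4826 * median([abs(x - mu) for x in xs])
    if scale == 0:
        return mu  # more than half of the data coincide with the median
    for _ in range(max_iter):
        weights = []
        for x in xs:
            u = abs(x - mu) / scale
            # Weight 1 in the quadratic zone (|u| <= k), k/|u| in the linear zone.
            weights.append(1.0 if u <= k else k / u)
        new_mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


data = [9.8, 10.0, 10.1, 10.2, 9.9, 10.0, 25.0]  # one gross error
print(huber_location(data), sum(data) / len(data))
```

With a large k every weight equals 1 and the result is the arithmetic mean; as k shrinks, observations far from the centre receive smaller and smaller weights.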
Huber estimators
φ = min ρ(u)
k → ∞ ⇒ mean
k = 0 ⇒ median
[Figure: φ plotted against u, with break points at -k and k]
It is substantially a weighted mean, obtained by introducing a “tuning constant” that makes it possible to down-weight the outliers.
The concept of trimmed mean
The weighted mean is conceptually very similar to the trimmed mean: a trimmed mean is a weighted mean where you assign weight zero to the extreme values.
The trimmed mean is insensitive to small numbers of gross errors and works well with heavy-tailed distributions close to the normal.
The mean fails the first requirement, the median fails the second (Royal Society of Chemistry).
Robust statistics work well with error distributions heavier-tailed than the normal (RSC).
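The idea of a trimmed mean as a weighted mean with zero weights on the extremes can be sketched as follows. The function name `trimmed_mean` is hypothetical, and the convention that a fraction α/2 is discarded in each tail is an assumption chosen to match the α-trimmed mean used later in these slides.

```python
def trimmed_mean(values, alpha=0.2):
    """Weighted mean with weight zero on a fraction alpha/2 of each tail."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1)")
    xs = sorted(values)
    n = len(xs)
    g = int(n * alpha / 2)  # number of observations zero-weighted in each tail
    weights = [0.0] * g + [1.0] * (n - 2 * g) + [0.0] * g
    return sum(w * x for w, x in zip(weights, xs)) / sum(weights)


data = [9.8, 9.9, 10.0, 10.0, 10.1, 10.2, 25.0, 10.0, 9.9, 10.1]
print(trimmed_mean(data, alpha=0.2))  # 9.8 and 25.0 get weight zero
print(trimmed_mean(data, alpha=0.0))  # ordinary arithmetic mean
```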
3 Definitions of robustness
A robust measure is insensitive to:
1. The presence of outliers
2. Grouping and rounding of observations
3. Deviation from basic assumptions (Box), in particular the distributional hypotheses
The curve of influence
Definition:
$$IC(x; F, T) = \lim_{\varepsilon \to 0} \frac{T[(1-\varepsilon)F + \varepsilon\,\delta_x] - T(F)}{\varepsilon}$$
Where:
T(F) = parameter estimator
δx = probability distribution that assigns unit mass to the point x on the real line
F = probability distribution at which the estimator T is evaluated
Mean: T(F) = µ
[Figure: IC(x) plotted against x, a straight line crossing zero at x = µ]
The influence of an infinitesimal contamination on the mean increases linearly with the difference x − µ, in both directions.
The curve of influence
Median: T(F) = F^{-1}(1/2)
[Figure: IC(x) plotted against x, a step function with its jump at x = F^{-1}(1/2)]
The median has a bounded and monotone curve of influence: an infinitesimal contamination at a point x has a constant effect on the median. The curve has a point of discontinuity at the central value, reflecting a local instability at that point.
α-trimmed mean:
$$0 \le \frac{\alpha}{2} \le \frac{1}{2}$$
$$T(F) = \frac{1}{1-\alpha} \int_{\alpha/2}^{1-\alpha/2} F^{-1}(t)\,dt$$
Influence on α-trimmed mean
[Figure: IC(x) plotted against x, with break points at F^{-1}(α/2) and F^{-1}(1 − α/2)]
The α-trimmed mean combines the properties of the mean and the median estimators: its influence varies with the contamination point, as for the mean, when the point falls in the central part of the sample, and stops varying, as for the median, when the point falls in the tails. An outlier lies in the tails, so its actual value does not affect the trimmed mean.
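A finite-sample analogue of the curve of influence is the sensitivity curve: add one observation x to a fixed sample and rescale the change in the estimate. The sketch below compares the mean, the median and the α-trimmed mean. The helper names `sensitivity_curve` and `trimmed_mean`, the (n + 1) scaling and the reference sample are all assumptions made for illustration.

```python
from statistics import mean, median


def trimmed_mean(values, alpha=0.2):
    """Mean after discarding a fraction alpha/2 of the observations in each tail."""
    xs = sorted(values)
    g = int(len(xs) * alpha / 2)
    kept = xs[g:len(xs) - g]
    return sum(kept) / len(kept)


def sensitivity_curve(estimator, sample, x):
    """Rescaled change in the estimate when the single point x is added."""
    n = len(sample)
    return (n + 1) * (estimator(list(sample) + [x]) - estimator(list(sample)))


sample = [i / 10 for i in range(-10, 11)]  # 21 points from -1.0 to 1.0
for x in (0.5, 5.0, 100.0, 1000.0):
    print(
        x,
        sensitivity_curve(mean, sample, x),
        sensitivity_curve(median, sample, x),
        sensitivity_curve(trimmed_mean, sample, x),
    )
```

For the mean the value keeps growing with x, while for the median and the trimmed mean it no longer changes once x is far out in the tail.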
Central tendency indicators robustness
The trimmed mean shares with the median the same behavior in the tails, so their robustness with respect to the presence of extreme values in the tails is equivalent.
The advantage of the trimmed mean over the median is that it maintains its sensitivity to variations in the central part of the distribution.
Centers of order r: robustness
The center of order r is the value c that solves
$$\min_c \sum_i |x_i - c|^r \quad \text{for } r \ge 0$$
Order r      Center             Robustness
r (r > 2)    Power mean         very low
2            Arithmetic mean    low
1            Median             medium
0            Mode               high
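The link between the order r and the resulting center can be checked numerically. The sketch below is a brute-force grid search, not an efficient method; the function name `center_of_order` and the grid resolution are assumptions, and r = 0 is left out because it needs the convention 0^0 = 0, which Python's power operator does not follow.

```python
def center_of_order(values, r, steps=2000):
    """Grid-search the value c minimising sum(|x - c|**r) over the data range."""
    if r <= 0:
        raise ValueError("this sketch only handles r > 0")
    lo, hi = min(values), max(values)
    best_c, best_loss = lo, float("inf")
    for i in range(steps + 1):
        c = lo + (hi - lo) * i / steps
        loss = sum(abs(x - c) ** r for x in values)
        if loss < best_loss:
            best_c, best_loss = c, loss
    return best_c


data = [1, 2, 3, 4, 10]
print(center_of_order(data, r=2))  # close to the arithmetic mean, 4
print(center_of_order(data, r=1))  # close to the median, 3
```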
Relationship among centers
For unimodal, slightly asymmetric distributions we have the empirical relationship:
Mean − Mode = 3 (Mean − Median)
[Figure: two skewed distributions with the positions of Mode, Median and Mean marked]
For unimodal symmetrical distributions:
Mean = Median = Mode
The problem of inference
Whenever we use a single statistic to estimate a parameter
we refer to the estimate as a point estimate for that parameter.
When we use a statistic to estimate a parameter, the verb
used is "to infer." We infer the population parameter from the
sample statistic.
Some population parameters cannot be inferred from the
statistic. The population size N cannot be inferred from the
sample size n. The population minimum, maximum, and range
cannot be inferred from the sample minimum,
maximum, and range. Populations are more likely to have
single outliers than a smaller random sample.
The population mode and median usually cannot be inferred
from a smaller random sample. There are special
circumstances under which a sample mode and median might
be a good estimate of a population mode
and median.
Inferring a parameter
Sample                        Population
Sample size n            X    Population size N
Sample mode              X    Population mode
Sample median            X    Population median
Sample mean              →    Population mean
Sample stdev             →    Population stdev
Sample distrib. shape    →    Population distrib. shape
(X = the population parameter usually cannot be inferred from the sample statistic)
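Deviation from the Hypotheses
For each parameter: standard error, then sample distribution and Box robustness.

Parameter: Mean
Standard error: $\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}$
Sample distribution / Box robustness: true for small and large samples. For N >= 30 the sampling distribution of the mean is normal whatever the population distribution.

Parameter: Proportion (median) (mode)
Standard error: $\sigma_p = \sqrt{\frac{p(1-p)}{N}}$
Sample distribution / Box robustness: the same observations are valid as for the mean. Median = 50%.

Parameter: Median
Standard error: $\sigma_{me} = \sigma\sqrt{\frac{\pi}{2N}} = \frac{1.2533\,\sigma}{\sqrt{N}}$
Sample distribution / Box robustness: for N >= 30 the sampling distribution of the median is normal only if the population is normal.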
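The three standard errors above translate directly into code, and the factor 1.2533 can be checked by simulation. This is a hedged sketch: the function names are hypothetical, and the simulation assumes a standard normal population, a fixed seed and a modest number of replicates, so the simulated ratio is only approximate.

```python
import math
import random
from statistics import mean, median, stdev


def se_mean(sigma, n):
    return sigma / math.sqrt(n)


def se_proportion(p, n):
    return math.sqrt(p * (1 - p) / n)


def se_median_normal(sigma, n):
    # Valid for a normal population: sigma * sqrt(pi / (2 n)) = 1.2533 sigma / sqrt(n)
    return sigma * math.sqrt(math.pi / (2 * n))


def simulated_ratio(n=25, reps=2000, seed=0):
    """Ratio between the spread of sample medians and the spread of sample means."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(mean(sample))
        medians.append(median(sample))
    return stdev(medians) / stdev(means)


print(se_median_normal(1.0, 30) / se_mean(1.0, 30))  # sqrt(pi / 2)
print(simulated_ratio())  # should be in the neighbourhood of 1.25
```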
Using the median in small samples
Use as a proportion:
The median is the 50th percentile of a frequency distribution.
You can rely on the properties of the binomial distribution and calculate exact-probability confidence intervals, without needing the hypothesis of normality.
$$P(\hat{p} \le p_1) = \sum_{x=0}^{n p_1} \binom{n}{x} p^x (1-p)^{n-x} \le \frac{\alpha}{2}$$
$$P(p_1 \le \hat{p} \le p_2) \ge 1 - \alpha$$
$$P(\hat{p} \ge p_2) = \sum_{x=n p_2}^{n} \binom{n}{x} p^x (1-p)^{n-x} \le \frac{\alpha}{2}$$
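For the median, p = 1/2, and the binomial tail sums above lead to a confidence interval whose limits are two order statistics of the sample. The following is a sketch of that standard distribution-free construction; the function name `median_ci` and the choice of symmetric tails are assumptions made for illustration.

```python
from math import comb


def median_ci(sample, alpha=0.05):
    """Exact binomial confidence interval for the median from order statistics."""
    xs = sorted(sample)
    n = len(xs)
    cum = 0.0  # P(B <= d - 1) for B ~ Binomial(n, 1/2)
    d = 0      # rank (1-based) of the lower limit
    for k in range(n + 1):
        cum_next = cum + comb(n, k) * 0.5 ** n
        if cum_next > alpha / 2:
            break
        cum = cum_next
        d = k + 1
    if d == 0:
        raise ValueError("sample too small for the requested confidence level")
    coverage = 1.0 - 2.0 * cum  # exact probability, at least 1 - alpha
    return xs[d - 1], xs[n - d], coverage


low, high, coverage = median_ci([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(low, high, coverage)
```

With n = 10 and α = 0.05 the interval runs from the 2nd to the 9th ordered observation, and its exact coverage is 1 − 22/1024, about 97.9%.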
Power the sample:
If you cannot increase the number of observations, you can do it computationally, i.e. by using bootstrap techniques.
Basic concepts of bootstrap
Observed sample, K = 10:
X1, X2, X3, X4, X5, X6, X7, X8, X9, X10
Extract a random sample of K elements from the observed one, allowing data to be repeated in the random sample. In other words, each time you put the extracted datum back into the sample you are extracting from. For example, from the original sample you generate the sample:
X1, X3, X3, X4, X5, X6, X6, X6, X7, X10
Now you have the numbers to calculate your central parameter M and its stdev.
There are K^K possible arrangements with repetition (possible samples), i.e. 10^10 for K = 10.
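The resampling scheme described above can be sketched in a few lines. The function name `bootstrap`, the number of resamples, the seed and the example data are all assumptions made for illustration; the statistic shown is the median, but any central parameter can be passed in.

```python
import random
from statistics import median, stdev


def bootstrap(sample, statistic=median, n_resamples=2000, seed=0):
    """Return the statistic computed on resamples drawn with replacement."""
    rng = random.Random(seed)
    data = list(sample)
    k = len(data)
    replicates = []
    for _ in range(n_resamples):
        # K draws with replacement: the same observation may appear repeatedly.
        resample = [rng.choice(data) for _ in range(k)]
        replicates.append(statistic(resample))
    return replicates


observed = [9.8, 9.9, 10.0, 10.0, 10.1, 10.2, 10.4, 10.0, 9.9, 10.1]  # K = 10
replicates = bootstrap(observed)
print(median(observed))   # central parameter on the observed sample
print(stdev(replicates))  # bootstrap estimate of its standard deviation
```

Only a random subset of the K^K possible samples is drawn here, which is the usual compromise when enumerating all of them is impractical.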