Normal distribution
also Gaussian distribution
[Figure: two panels plotted for x from -3 to 3; panel titles: y=normal(x,0,1) and p=1-inormal(x,]
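Probability density
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
For this density it holds that
$$\int_{-\infty}^{\infty} x\, f(x)\,dx = \mu, \qquad \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = \sigma^2$$
The mean value of the distribution is μ.
The variance of the distribution is σ².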
There is a certain inconsistency: μ is the symbol for a mean in general
as well as the specific parameter of the normal distribution (which is
also its mean), and the same holds for σ².
The key role of the normal distribution in statistics arises from the
central limit theorem. It says that the mean of a “very big”
random sample is a random variable with an almost normal
distribution, even if the distribution of the population differs from
the normal one.
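A small simulation can illustrate the theorem. This is a sketch under stated assumptions: the exponential population, the sample size of 50, the 2000 replicates and the seed are arbitrary choices for the example. Skewness is used as a rough indicator of non-normality (it is 0 for a normal distribution and 2 for an exponential one).

```python
import random
import statistics

def skewness(values):
    """Third central moment divided by the cube of the standard deviation."""
    m = statistics.fmean(values)
    m2 = statistics.fmean([(v - m) ** 2 for v in values])
    m3 = statistics.fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5

rng = random.Random(0)  # fixed seed so the run is reproducible

# Strongly right-skewed population: exponential with mean 1 (assumption).
population_draws = [rng.expovariate(1.0) for _ in range(20000)]

# Means of 2000 random samples, each of size 50.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(50)])
    for _ in range(2000)
]

print("skewness of single observations:", skewness(population_draws))
print("skewness of sample means:       ", skewness(sample_means))
```

The sample means should be far less skewed than the single observations, and their average should stay close to the population mean of 1.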
By “definition”, a variable with a normal distribution can take
values anywhere from -∞ to +∞ with nonzero probability density.
Biological variables usually do not have a normal distribution, but
they can often be “reasonably” approximated by a normal
distribution.
Skewness and kurtosis
i-th moment: the mean value of X^i
i-th central moment, κ_i: the mean value of (X - μ)^i
So the mean value is the first general moment
The first central moment is 0 by definition
The variance is the second central moment
Skewness is characterized by the third central moment
Kurtosis is characterized by the fourth central moment
Skewness
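Positive skewness: many small negative
deviations from the mean are balanced by a
smaller number of large positive deviations:
3, 3, 3, 4, 7
μ = 4
Sum of cubed deviations: (3-4)³ + (3-4)³ + (3-4)³ + (4-4)³ + (7-4)³
= (-1) + (-1) + (-1) + 0 + 27 = 24
κ₃ = 24/5 = 4.8 (the mean of the cubed deviations)
κ₃ is expressed in the third power of the measurement
units.
γ₁ = κ₃/σ³ is dimensionless and describes only the shape.
[Figure: positively skewed density with the median and the mean marked]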
Skewness
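Negatively skewed distribution: many small positive
deviations from the mean are balanced by a smaller
number of large negative deviations: 5, 5, 5, 1, 4 (μ = 4)
Sum of cubed deviations: (5-4)³ + (5-4)³ + (5-4)³ + (4-4)³ + (1-4)³
= 1 + 1 + 1 + 0 + (-27) = -24
κ₃ = -24/5 = -4.8 (the mean of the cubed deviations)
[Figure: negatively skewed density with the mean and the median marked]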
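The two worked examples can be reproduced in a few lines. In this sketch the moments are treated as population moments (division by n), following the definition of a central moment as a mean value. The function names are illustrative.

```python
def central_moment(values, order):
    """Mean value of (X - mean)**order, treating the values as the whole population."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness_coefficient(values):
    """Dimensionless skewness: third central moment divided by sigma cubed."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

right_skewed = [3, 3, 3, 4, 7]
left_skewed = [5, 5, 5, 1, 4]

print(central_moment(right_skewed, 3))     # sum of cubed deviations 24, divided by 5
print(skewness_coefficient(right_skewed))  # positive
print(central_moment(left_skewed, 3))      # sum of cubed deviations -24, divided by 5
print(skewness_coefficient(left_skewed))   # negative, same magnitude
```

Both data sets have the same variance (2.4), so the two skewness coefficients differ only in sign.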
Kurtosis: based on the 4th central moment, κ₄
The normal distribution is mesokurtic: κ₄/σ⁴ = 3
The measure of kurtosis used is γ₂ = κ₄/σ⁴ - 3
normal (mesokurtic): γ₂ = 0
leptokurtic: γ₂ > 0
platykurtic: γ₂ < 0
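As a hedged illustration, the kurtosis measure (fourth central moment over σ⁴, minus 3) can be computed for two small made-up data sets. The data values and the function name are assumptions for the example; moments are again taken as population moments (division by n).

```python
def excess_kurtosis(values):
    """Fourth central moment divided by the squared variance, minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3

flat = [1, 2, 3, 4, 5]            # evenly spread values, no pronounced peak
peaked = [0, 4, 4, 4, 4, 4, 8]    # most values at the centre, two far outliers

print(excess_kurtosis(flat))    # negative: platykurtic
print(excess_kurtosis(peaked))  # positive: leptokurtic
```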
Standardized normal distribution
If the variable X has a normal distribution with parameters μ, σ², then after the transformation
$$Z_i = \frac{X_i - \mu}{\sigma}$$
the variable Z has a normal distribution with mean 0 and variance 1 (so the standard deviation is also 1). This type
of distribution is called the standardized normal distribution.
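A minimal sketch of the transformation, applied to a small illustrative data set (an assumption for the example). Here μ and σ are computed from the values themselves, treated as a whole population. The transformation fixes the mean at 0 and the variance at 1 but does not change the shape, so Z is normal only if X was.

```python
import math

def standardize(values):
    """Return z = (x - mu) / sigma for each value, with mu and sigma taken from the values."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]

z = standardize([3, 3, 3, 4, 7])

z_mean = sum(z) / len(z)
z_variance = sum((v - z_mean) ** 2 for v in z) / len(z)
print(z_mean, z_variance)  # close to 0 and 1
```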
“Checking” of normality: graphic
Plot a cumulative histogram of counts on a scale of probabilities.
[Figure panels: Normal; Left skewed; Right skewed; Platykurtic; Leptokurtic; Mixture of two types of normal distribution]
Fig. A: Examples of probability densities (frequency distributions) together with
their cumulative distributions, plotted on the y-axis on a normal probability scale.
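The idea of a normal probability scale can be sketched numerically: cumulative proportions are converted to quantiles of the standard normal distribution, so that normally distributed data fall on a straight line. The plotting positions (i + 0.5)/n, the use of a correlation coefficient as a measure of straightness, and the two artificial data sets are assumptions made for this example.

```python
import math
from statistics import NormalDist

def probability_plot_points(values):
    """Pair each ordered value with the standard normal quantile of its cumulative proportion."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    return [(x, std_normal.inv_cdf((i + 0.5) / n)) for i, x in enumerate(ordered)]

def pearson_r(points):
    """Correlation between the two coordinates; 1 means the points lie on a rising straight line."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Artificial data: exact normal quantiles, and the same quantiles exponentiated
# to produce a right-skewed set.
quantiles = [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
normal_like = [10 + 2 * q for q in quantiles]
right_skewed = [math.exp(q) for q in quantiles]

r_normal = pearson_r(probability_plot_points(normal_like))
r_skewed = pearson_r(probability_plot_points(right_skewed))
print(r_normal, r_skewed)  # the skewed set bends away from the straight line
```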
Normality checking: I compute
skewness and kurtosis and
compare them with the values expected
for a normal distribution.
Most biological data have a positively skewed distribution.
That is why computing skewness gives us a fairly strong test and
at the same time tells us how much the data differ from normality.
Normality testing: χ² test
I compute the mean and variance from the data and compare the sample
with a normal distribution that has the same mean
and variance as my data.
Then, with the help of the χ² test, I compare the observed numbers of cases in size
classes with the frequencies expected
under the normal distribution:
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
The classic problem: I must decide the width of the categories
(the column width in the histogram).
Number of degrees of freedom = k - 1 - 2 (2 parameters are estimated from the data)
Variable: SR30X30_, Distribution: Normal
Chi-Square test = 2.09870, df = 2 (adjusted), p = 0.35017
[Figure: histogram; x-axis: Category (upper limits), 10 to 29; y-axis: No. of observations, 0 to 7]
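The procedure can be sketched as follows. This is not the software that produced the output above: the simulated data, the class boundaries and the function names are assumptions for the example. With k = 5 classes the test has k - 1 - 2 = 2 degrees of freedom, and for exactly 2 degrees of freedom the upper tail of the χ² distribution has the closed form exp(-χ²/2), which avoids any external library. The same formula reproduces the reported p-value from the reported statistic.

```python
import bisect
import math
import random
from statistics import NormalDist, fmean, stdev

def chi_square_normality(data, edges):
    """Chi-square statistic and df for a normal fit whose mean and SD come from the data.

    `edges` are the inner class boundaries; the two outer classes are open-ended.
    """
    n = len(data)
    fitted = NormalDist(fmean(data), stdev(data))
    k = len(edges) + 1
    observed = [0] * k
    for x in data:
        observed[bisect.bisect_right(edges, x)] += 1
    cdf = [0.0] + [fitted.cdf(e) for e in edges] + [1.0]
    expected = [n * (cdf[i + 1] - cdf[i]) for i in range(k)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, k - 1 - 2, observed, expected

def chi2_upper_tail_df2(chi2):
    """P(chi-square with 2 degrees of freedom > chi2); valid for df = 2 only."""
    return math.exp(-chi2 / 2)

rng = random.Random(42)
data = [rng.gauss(20, 4) for _ in range(60)]   # hypothetical sample

chi2, df, observed, expected = chi_square_normality(data, edges=[16, 19, 22, 25])
p = chi2_upper_tail_df2(chi2)
print(observed, [round(e, 1) for e in expected], chi2, df, p)

# Check against the output shown above: statistic 2.09870 with df = 2.
reported_p = chi2_upper_tail_df2(2.09870)
print(reported_p)  # agrees with the reported p = 0.35017 to about four decimals
```

Changing `edges` changes both the statistic and the degrees of freedom, which is the class-width problem mentioned above. With a different number of classes the closed-form tail no longer applies.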
Journal editors demand such
a test, but
• (almost) no biological data have a normal
distribution, so
• if I have a large data set, the test is strong and I
reject the null hypothesis of normality (even
if the deviation from normality is small)
• if I have only a few data points, the test is desperately
weak, and even for data with a large deviation
from normality I cannot reject the null
hypothesis
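The dependence on sample size can be illustrated by simulation. Everything specific in this sketch is an assumption made for the example: the moderately right-skewed gamma population (shape 4), the sample sizes 20 and 2000, the 200 replicates, the seed, and the use of five equally probable classes, which gives 2 degrees of freedom and the closed-form tail exp(-χ²/2).

```python
import bisect
import math
import random
from statistics import NormalDist, fmean, stdev

K = 5  # five equiprobable classes, so df = K - 1 - 2 = 2

def chi2_normality_p(data):
    """p-value of a chi-square normality test with K equiprobable classes (df = 2 only)."""
    fitted = NormalDist(fmean(data), stdev(data))
    edges = [fitted.inv_cdf(i / K) for i in range(1, K)]
    observed = [0] * K
    for x in data:
        observed[bisect.bisect_right(edges, x)] += 1
    expected = len(data) / K
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return math.exp(-chi2 / 2)

def rejection_rate(n, replicates=200, alpha=0.05, seed=1):
    """Share of simulated non-normal samples of size n for which normality is rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(replicates):
        sample = [rng.gammavariate(4.0, 1.0) for _ in range(n)]
        if chi2_normality_p(sample) < alpha:
            rejected += 1
    return rejected / replicates

power_small = rejection_rate(20)
power_large = rejection_rate(2000)
print(power_small, power_large)
```

The population is the same in both runs. Only the sample size changes, yet normality should be rejected in nearly every large sample and in only a small share of the small ones.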