Introduction to probability
Stat 134
Fall 2005
Berkeley
Lectures prepared by:
Elchanan Mossel
Yelena Shvets
Follows Jim Pitman’s book: Probability, Section 5.3
The normalization of the normal
Recall: N(0,1) has density $f(x) = C e^{-\frac{1}{2}x^2}$
Question: what is the value of C?
Answer:
•We will calculate the value of C using independent X, Y ~ N(0,1).
•(X,Y) have joint density
$f(x,y) = C^2 e^{-\frac{1}{2}(x^2 + y^2)}$
• And, since f is a probability density:
$\iint f(x,y)\,dx\,dy = 1$
Rotational Invariance
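Note: The joint density $f(x,y) = C^2 e^{-\frac{1}{2}(x^2 + y^2)}$ is rotationally invariant: the height depends only on the radial distance from (0,0) and not on the angle.
Let: $R = \sqrt{X^2 + Y^2}$
[Figure: the point (X,Y) in the plane at radial distance R from the origin.]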
Rotational invariance
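•Note that $R \in (r, r+dr)$ if (X,Y) is in the annulus A(r,r+dr) of circumference $2\pi r$ and area $2\pi r\,dr$.
• In A(r,r+dr) we have: $f(x,y) \approx C^2 e^{-\frac{1}{2}r^2}$. Hence:
$P(R \in (r, r+dr)) \approx 2\pi r\, C^2 e^{-\frac{1}{2}r^2}\,dr$
•Therefore the density of R is:
$f_R(r) = 2\pi C^2\, r\, e^{-\frac{1}{2}r^2}, \quad r > 0$
•So:
$1 = \int_0^\infty 2\pi C^2\, r\, e^{-\frac{1}{2}r^2}\,dr = 2\pi C^2, \quad\text{hence}\quad C = \frac{1}{\sqrt{2\pi}}$
[Figure: the annulus between radii r and r + dr in the (x,y) plane.]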
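The normalization can be sanity-checked numerically: C must equal one over the integral of $e^{-x^2/2}$ over the real line, which should agree with $1/\sqrt{2\pi}$. The following is a minimal sketch, assuming a plain trapezoid rule on [-10, 10] (the tails beyond that range are negligible); the function name `integrate_gaussian` is introduced here only for illustration.

```python
import math

def integrate_gaussian(lo=-10.0, hi=10.0, steps=20_000):
    """Trapezoid-rule estimate of the integral of exp(-x^2/2) over [lo, hi]."""
    h = (hi - lo) / steps
    # Endpoints carry weight 1/2 in the trapezoid rule.
    total = 0.5 * (math.exp(-0.5 * lo * lo) + math.exp(-0.5 * hi * hi))
    for k in range(1, steps):
        x = lo + k * h
        total += math.exp(-0.5 * x * x)
    return total * h

integral = integrate_gaussian()
C = 1.0 / integral  # normalizing constant, chosen so that C * integral = 1
print(C, 1.0 / math.sqrt(2.0 * math.pi))
```

The two printed numbers should agree to many decimal places.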
The Variance of N(0,1)
•The probability distribution of R is called the Rayleigh distribution. It has the density
$f_R(r) = r\, e^{-\frac{1}{2}r^2}, \quad r > 0$
[Figure: plot of the Rayleigh distribution for r from 0 to 4.]
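•By the change of variables formula $S = R^2 \sim \text{Exp}(1/2)$:
$f_S(s) = f_R(\sqrt{s}) \cdot \frac{1}{2\sqrt{s}} = \frac{1}{2} e^{-s/2}, \quad s > 0$
• Therefore the variance of N(0,1) is given by:
$E(X^2) = \frac{1}{2}\,E(X^2 + Y^2) = \frac{1}{2}\,E(R^2) = \frac{1}{2} \cdot 2 = 1$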
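A quick simulation can illustrate both facts: the squared radial distance of two independent N(0,1) coordinates should average close to 2 (the mean of Exp(1/2)), and the second moment of a single coordinate should be close to 1. This is only a sketch; the seed and sample size are arbitrary choices made for this illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible
n = 200_000     # arbitrary sample size

sum_r2 = 0.0
sum_x2 = 0.0
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    y = random.gauss(0.0, 1.0)
    sum_r2 += x * x + y * y  # R^2 = X^2 + Y^2
    sum_x2 += x * x

mean_r2 = sum_r2 / n  # should be near E(R^2) = 2
var_x = sum_x2 / n    # E(X^2) equals Var(X) because E(X) = 0; should be near 1
print(mean_r2, var_x)
```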
Radial Distance
•A dart is thrown at a circular target by an expert.
•The point of contact is distributed over the target so that
approximately 50% of the shots are in the bull’s eye.
•Assume that the x and y-coordinates of the hits measured
from the center, are distributed as (X,Y), where X,Y are
independent N(0,1).
Questions:
• What’s the radius of the bull’s eye?
• What’s the % of the shots that
land within the radius twice that of
the bull’s eye?
• What’s the average distance of the
shot from the center?
Radial Distance
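• What’s the radius r of the bull’s eye?
• The hitting distance R has Rayleigh distribution. Therefore:
$P(R \le r) = 1 - e^{-\frac{1}{2}r^2} = \frac{1}{2}, \quad\text{so}\quad r = \sqrt{2 \ln 2} \approx 1.18$
[Figure: the target with bull’s eye A of radius r; P(A) ≈ 0.5.]
• What’s the % of the shots that land within the radius twice that of the bull’s eye?
$P(R \le 2r) = 1 - e^{-\frac{1}{2}(2r)^2} = 1 - \left(e^{-\frac{1}{2}r^2}\right)^4 = 1 - \left(\tfrac{1}{2}\right)^4 = \tfrac{15}{16} \approx 94\%$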
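The two dart questions can be worked through in a few lines, assuming the Rayleigh distribution function $P(R \le r) = 1 - e^{-r^2/2}$ (obtained by integrating the density $r e^{-r^2/2}$). The names `rayleigh_cdf`, `bull_radius` and `within_double` are illustrative only.

```python
import math

def rayleigh_cdf(r):
    """P(R <= r) for the Rayleigh distribution with density r * exp(-r^2 / 2)."""
    return 1.0 - math.exp(-0.5 * r * r)

# Bull's eye radius: solve 1 - exp(-r^2 / 2) = 1/2 for r.
bull_radius = math.sqrt(2.0 * math.log(2.0))

# Fraction of shots landing within twice that radius.
within_double = rayleigh_cdf(2.0 * bull_radius)

print(round(bull_radius, 3), round(within_double, 4))
```

The radius comes out near 1.18 and the fraction near 15/16, so roughly 94% of shots land within twice the bull’s eye radius.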
Radial Distance
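• What’s the approximate average distance of the shot from the center?
•The average is given by:
$E(R) = \int_0^\infty r^2 e^{-\frac{1}{2}r^2}\,dr = \frac{1}{2}\int_{-\infty}^{\infty} x^2 e^{-\frac{1}{2}x^2}\,dx$ (by symmetry) $= \frac{\sqrt{2\pi}}{2}\,E(X^2) = \sqrt{\pi/2} \approx 1.25$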
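The average distance can also be estimated by direct numerical integration of $r \cdot r e^{-r^2/2}$ and compared with $\sqrt{\pi/2}$. This is a rough sketch, assuming a midpoint rule truncated at r = 12, where the integrand is negligible; `mean_rayleigh` is a hypothetical helper name.

```python
import math

def mean_rayleigh(upper=12.0, steps=12_000):
    """Midpoint-rule estimate of E(R), the integral of r * (r * exp(-r^2/2)) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += r * r * math.exp(-0.5 * r * r)
    return total * h

estimate = mean_rayleigh()
exact = math.sqrt(math.pi / 2.0)
print(estimate, exact)
```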
Linear Combinations of
Independent Normal Variables
Suppose that X, Y ~ N(0,1) and independent.
Question:
What is the distribution of Z = aX + bY ?
Solution:
•Assume first that $a^2 + b^2 = 1$.
•Then there is an angle $\theta$ such that
$Z = \cos\theta\, X + \sin\theta\, Y$.
Linear Combinations of
Independent Normal Variables
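•$Z = \cos\theta\, X + \sin\theta\, Y$.
•By rotational symmetry:
$P(x < Z < x + \Delta x) = P(x < X < x + \Delta x)$
•So: Z ~ N(0,1).
[Figure: the X and Y axes with a second axis rotated by the angle $\theta$; the strip between x and x + $\Delta x$ is marked on each.]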
Linear Combinations of
Independent Normal Variables
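If Z = aX + bY, where a and b are arbitrary (not both zero), we can define a new variable:
$Z' = \frac{Z}{\sqrt{a^2+b^2}} = \frac{a}{\sqrt{a^2+b^2}}\,X + \frac{b}{\sqrt{a^2+b^2}}\,Y$
So $Z' \sim N(0,1)$ and $Z \sim N(0,\ a^2 + b^2)$.
If $X \sim N(\mu, \sigma^2)$ and $Y \sim N(\lambda, \tau^2)$ are independent, then
$X + Y = (\mu + \lambda) + \sigma\,\frac{X-\mu}{\sigma} + \tau\,\frac{Y-\lambda}{\tau}$, where $\frac{X-\mu}{\sigma}$ and $\frac{Y-\lambda}{\tau}$ are independent N(0,1).
So $X + Y \sim N(\lambda + \mu,\ \sigma^2 + \tau^2)$.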
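A simulation can illustrate the variance of a linear combination of independent N(0,1) variables. The sketch below uses hypothetical coefficients a = 3 and b = 4, so the sample variance of Z = aX + bY should come out near a² + b² = 25 and the sample mean near 0; the seed and sample size are arbitrary.

```python
import random

random.seed(1)   # fixed seed for reproducibility
a, b = 3.0, 4.0  # hypothetical coefficients; a^2 + b^2 = 25
n = 100_000

# Draw Z = aX + bY with X, Y independent N(0,1).
z = [a * random.gauss(0.0, 1.0) + b * random.gauss(0.0, 1.0) for _ in range(n)]

sample_mean = sum(z) / n
sample_var = sum((v - sample_mean) ** 2 for v in z) / n
print(sample_mean, sample_var)
```

This checks only the first two moments; it does not by itself show that Z is normal, which is what the rotational symmetry argument provides.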
N independent Normal Variables
Claim:
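If $X_1, \dots, X_N$ are independent $N(\mu_i, \sigma_i^2)$ variables then
$Z = X_1 + X_2 + \dots + X_N \sim N(\mu_1 + \dots + \mu_N,\ \sigma_1^2 + \dots + \sigma_N^2)$.
Proof: By induction. Base case is trivial: $Z_1 = X_1 \sim N(\mu_1, \sigma_1^2)$.
Assuming the claim for N-1 variables we get
$Z_{N-1} \sim N(\mu_1 + \dots + \mu_{N-1},\ \sigma_1^2 + \dots + \sigma_{N-1}^2)$.
•Now: $Z_N = Z_{N-1} + X_N$, where $X_N$ and $Z_{N-1}$ are independent Normal variables. So by the previous result:
$Z_N \sim N(\mu_1 + \dots + \mu_N,\ \sigma_1^2 + \dots + \sigma_N^2)$.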
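The induction translates directly into a running accumulation: each step adds one more independent normal variable, and both the mean and the variance add. The helper `sum_of_normals` and the example parameters are hypothetical and serve only to illustrate the bookkeeping.

```python
def sum_of_normals(params):
    """Given (mean, variance) pairs of independent normal variables,
    return the (mean, variance) of their sum."""
    total_mean = 0.0
    total_var = 0.0
    for mean, var in params:  # one induction step per variable
        total_mean += mean
        total_var += var
    return total_mean, total_var

# Hypothetical example: N(1, 4) + N(-2, 9) + N(0.5, 1)
print(sum_of_normals([(1.0, 4.0), (-2.0, 9.0), (0.5, 1.0)]))
```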
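χ-square Distribution
•Claim: The joint density of n independent N(0,1) variables is:
$f(x_1, \dots, x_n) = (2\pi)^{-n/2}\, e^{-\frac{1}{2}(x_1^2 + \dots + x_n^2)}$
Note: The density is spherically symmetric: it depends only on the radial distance:
$R = \sqrt{X_1^2 + \dots + X_n^2}$
•Claim: The density of R is:
$f_R(r) = c_n (2\pi)^{-n/2}\, r^{n-1} e^{-\frac{1}{2}r^2}, \quad r > 0$
This follows from the fact that a shell of radius r and thickness dr in n dimensions has volume $c_n r^{n-1}\,dr$, where $c_n$ denotes the surface area of a unit sphere.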
χ-square Distribution
•Claim: The distribution of $R^2$ satisfies:
$R^2 \sim \text{Gamma}(n/2,\ 1/2)$, with shape n/2 and rate 1/2.
This distribution is also called the χ-square distribution with n degrees of freedom.
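The sum of squares of n independent N(0,1) variables can be simulated to check two standard facts about the χ-square distribution with n degrees of freedom: its mean is n and its variance is 2n. The sketch below assumes n = 5; the seed and the number of trials are arbitrary choices.

```python
import random

random.seed(2)   # fixed seed for reproducibility
n_dim = 5        # hypothetical number of degrees of freedom
trials = 50_000

samples = []
for _ in range(trials):
    # R^2 is the sum of squares of n_dim independent N(0,1) draws.
    r2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n_dim))
    samples.append(r2)

mean_r2 = sum(samples) / trials
var_r2 = sum((s - mean_r2) ** 2 for s in samples) / trials
print(mean_r2, var_r2)  # expected to be near 5 and 10
```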
Applications of χ-square Distribution
Claim:
•Consider an experiment that is repeated independently n times, where the ith outcome has the probability $p_i$ for $1 \le i \le m$.
•Let $N_i$ = # of outcomes of the ith type ($N_1 + \dots + N_m = n$).
•Then for large n:
$R_{m-1}^2 = \sum_{i=1}^m \frac{(N_i - n p_i)^2}{n p_i}$
has approximately a χ-square distribution with m-1 degrees of freedom.
Example: $P_b = 6/20$; $P_i = 4/20$; $P_c = 10/20$.
10 draws with replacement
$N_b = 3$; $N_i = 1$; $N_c = 6$.
$R_2^2 = (3-3)^2/3 + (1-2)^2/2 + (6-5)^2/5 = 1/2 + 1/5 = 0.7$
χ-square Distribution
$P_b = 6/20$; $P_i = 4/20$; $P_c = 10/20$.
10 draws with replacement
$N_b = 3$; $N_i = 1$; $N_c = 6$.
$\chi^2 = (3-3)^2/3 + (1-2)^2/2 + (6-5)^2/5 = 1/2 + 1/5 = 0.7$
Note: The claim allows us to “test” to what extent an outcome is consistent with an a priori guess about the actual probabilities.
Upper-tail areas of the χ-square distribution (rows: degrees of freedom, columns: area to the right):
df\area  0.995    0.99     0.975    0.95     0.9      0.75     0.5      0.25     0.1      0.05     0.025    0.01     0.005
1        0.00004  0.00016  0.00098  0.00393  0.01579  0.10153  0.45494  1.3233   2.70554  3.84146  5.02389  6.6329   7.87944
2        0.01003  0.0201   0.05064  0.10259  0.21072  0.57536  1.38629  2.77259  4.60517  5.99146  7.37776  9.21034  10.59663
$\chi^2 = 0.7$ with 2 degrees of freedom falls between the 0.75 and 0.5 columns, so the probability of observing a statistic of this size or larger is about 70%, and the sample is consistent with the box.
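The box example can be reproduced in code. The sketch below computes the statistic from the observed counts and the guessed probabilities, then evaluates the upper-tail probability for 2 degrees of freedom, using the fact that a χ-square variable with 2 degrees of freedom is Exp(1/2), so its tail is $e^{-x/2}$. The function names are illustrative, and the closed-form tail applies only to 2 degrees of freedom.

```python
import math

def chi_square_statistic(observed, probs):
    """Sum over categories of (N_i - n*p_i)^2 / (n*p_i)."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))

def chi2_tail_2df(x):
    """P(chi-square with 2 degrees of freedom > x), the Exp(1/2) tail."""
    return math.exp(-0.5 * x)

stat = chi_square_statistic([3, 1, 6], [6 / 20, 4 / 20, 10 / 20])
p_value = chi2_tail_2df(stat)
print(round(stat, 3), round(p_value, 3))
```

The statistic comes out at 0.7 and the tail probability near 0.70, which sits between the 0.75 and 0.5 columns of the table row for 2 degrees of freedom.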
χ-square Example
We have a sample of male and female college students and we record what type of shoes they are wearing. We would like to test the hypothesis that men and women are not different in their shoe habits, so we set the expected number in each category to be the average of the two observed values.
                  Sandals  Sneakers  Leather shoes  Boots  Other  Totals
Male observed     6        17        13             9      5      50
Male expected     9.5      11        10             12.5   7
Female observed   13       5         7              16     9      50
Female expected   9.5      11        10             12.5   7
Total             19       22        20             25     14     100
χ-square Example
M/Sandals: $(6 - 9.5)^2/9.5 = 1.289$
F/Sandals: $(13 - 9.5)^2/9.5 = 1.289$
M/Sneakers: $(17 - 11)^2/11 = 3.273$
F/Sneakers: $(5 - 11)^2/11 = 3.273$
M/L. Shoes: $(13 - 10)^2/10 = 0.900$
F/L. Shoes: $(7 - 10)^2/10 = 0.900$
M/Boots: $(9 - 12.5)^2/12.5 = 0.980$
F/Boots: $(16 - 12.5)^2/12.5 = 0.980$
M/Other: $(5 - 7)^2/7 = 0.571$
F/Other: $(9 - 7)^2/7 = 0.571$
(Again, because of our balanced male/female sample, our
row totals were the same, so the male and female
observed-expected frequency differences were identical.
This is usually not the case.)
The total chi-square value for Table 1 is 14.026, and the number of degrees of freedom is 4. This gives a p-value of about 0.007, which is below 0.01.
This allows us to reject the null hypothesis.
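The shoe example can be recomputed from the observed counts. The sketch below forms the expected counts as the average of the male and female counts, sums the ten contributions, and evaluates the upper tail for 4 degrees of freedom with the closed form $e^{-x/2}(1 + x/2)$, which is the Gamma(2, 1/2) tail. Variable and function names are illustrative; summing unrounded terms gives about 14.027 rather than the 14.026 obtained from the rounded values.

```python
import math

male = [6, 17, 13, 9, 5]     # sandals, sneakers, leather shoes, boots, other
female = [13, 5, 7, 16, 9]

# Expected count per category: the average of the two observed values
# (reasonable here because both samples have the same size, 50).
expected = [(m + f) / 2 for m, f in zip(male, female)]

# Add the male and female contributions for every shoe category.
stat = sum((m - e) ** 2 / e + (f - e) ** 2 / e
           for m, f, e in zip(male, female, expected))

def chi2_tail_4df(x):
    """P(chi-square with 4 degrees of freedom > x) = exp(-x/2) * (1 + x/2)."""
    return math.exp(-0.5 * x) * (1.0 + 0.5 * x)

p_value = chi2_tail_4df(stat)
print(round(stat, 3), round(p_value, 4))
```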