Statistics 406 Homework 5
Due Monday, October 25
1. Suppose X is a random variable with sample space 0, 1/100, 2/100, . . . , 1, and
$P(X = k/100) = c\sqrt{k}$ for a constant c. What is the value of c? What is the sample space of
$Y = (X - 1/2)^2$? Write a program to calculate the distribution of Y.
Warning: Do not use the approach given on page 11 of the notes here. Due to numerical roundoff you will get the wrong sample space. Work out the sample space
of Y mathematically, then work out the formula for the probability distribution of Y
mathematically (in terms of c), then write a program to evaluate the formula.
Solution:
The normalizing constant is given by the following Octave code.
c = 1/sum(sqrt(0:100));
The sample space of Y contains 51 points:
$0,\ 1/10^4,\ 4/10^4,\ 9/10^4,\ \ldots,\ 50^2/10^4$
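The probability formula evaluated by the code can be derived by pairing the two X values that map to the same Y value. The following is a short sketch of that derivation, writing X = j/100 and k = |j − 50| (the index names j and k are introduced here for illustration only):

```latex
\begin{align*}
c &= \Big(\sum_{j=0}^{100} \sqrt{j}\Big)^{-1}, \\
Y &= \Big(\frac{j}{100} - \frac{1}{2}\Big)^2 = \frac{(j-50)^2}{10^4}, \\
P\big(Y = 0\big) &= P\big(X = \tfrac{50}{100}\big) = c\sqrt{50}, \\
P\big(Y = \tfrac{k^2}{10^4}\big)
  &= P\big(X = \tfrac{50-k}{100}\big) + P\big(X = \tfrac{50+k}{100}\big)
   = c\big(\sqrt{50-k} + \sqrt{50+k}\big), \quad k = 1, \ldots, 50.
\end{align*}
```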
Here is the code to construct the distribution table:
%% Zero is a special case.
T(1,:) = [0, c*sqrt(50)];
%% All the other points follow the same pattern.
for k=1:50
    T(k+1,:) = [k^2/10^4, c*(sqrt(50-k) + sqrt(50+k))];
end
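As a sanity check on the table above, the probabilities should sum to 1. The sketch below rebuilds the table in vectorized form and, for comparison only, counts the distinct values produced by the floating-point route that the warning advises against. The variable names `p`, `total`, and `naive_count` are illustrative and not part of the original solution.

```octave
%% Normalizing constant for P(X = k/100) = c*sqrt(k), k = 0..100.
c = 1 / sum(sqrt(0:100));

%% Build the table from integers: Y = k^2/10^4 with k = |j - 50| for X = j/100.
k = (0:50)';
p = c * (sqrt(50 - k) + sqrt(50 + k));
p(1) = c * sqrt(50);            %% k = 0 arises only from X = 1/2
T = [k.^2 / 10^4, p];

%% The probabilities should sum to 1 up to roundoff.
total = sum(T(:,2));

%% For comparison only: squaring in floating point can split values that are
%% mathematically equal, so this count may exceed 51.
naive_count = numel(unique(((0:100) / 100 - 1/2) .^ 2));
```

Working with the integer k and dividing by 10^4 only at the end keeps the 51 sample points exact, which is the reason for deriving the formula before coding it.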
2. Consider the following two joint distributions:
(a) Suppose that X is generated uniformly on 1, 2, . . . , 100, and then Y is generated
uniformly from the set of all distinct divisors of X (including 1 and X).
(b) Suppose that X and Y are generated uniformly from the set of all pairs x, y such
that x is an integer between 1 and 100, and y is a divisor of x.
For each of the two joint distributions, write a program to calculate the marginal
distribution and expected value of Y , the conditional distribution of X given Y = 2,
and the conditional mean of X given Y = 2.
Are the joint distributions for (a) and/or (b) uniform? Are the conditional distributions
for (a) and/or (b) uniform?
Solution:
%% The joint distribution of X and Y.
J = zeros(100,100);
%% Cycle through the X sample space.
for i=1:100
    %% Find the divisors of i: the values d in 1..100 with rem(i,d) == 0.
    D = rem(i, [1:100]);
    ID = find(D == 0);
    %% Use (1) for part a, (0) for part b.
    if (1)
        J(i,ID) = 1 / (100*length(ID));
    else
        J(i,ID) = 1;
    end
end
%% Standardize the probabilities. This does nothing in part a.
J = J / sum(sum(J));
%% Marginal Y distribution.
PY = sum(J)';
%% Marginal Y mean.
EY = dot(PY, [1:100]);
%% Conditional distribution of X given Y=2.
PXY2 = J(:,2) / sum(J(:,2));
%% Conditional expectation of X given Y=2.
EXY2 = dot(PXY2, [1:100]);
In part b, both the joint distribution and the conditional distribution of X given
Y = 2 are uniform on their supports. In part a, neither is uniform.
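The uniformity claims can be checked numerically. The following is a minimal sketch, not the original solution: it builds both joint tables at once and tests whether the nonzero probabilities are all equal. The names `A`, `Ja`, `Jb`, `ca`, `cb`, and `is_uniform`, and the tolerance 1e-12, are assumptions made for this illustration.

```octave
%% Divisibility indicator: A(x,y) = 1 when y divides x, for x, y in 1..100.
[Yg, Xg] = meshgrid(1:100, 1:100);
A = double(rem(Xg, Yg) == 0);

%% Part (a): X uniform on 1..100, then Y uniform over the divisors of X.
Ja = A ./ (100 * sum(A, 2));
%% Part (b): every pair (x, y) with y dividing x is equally likely.
Jb = A / sum(A(:));

%% Conditional distributions of X given Y = 2.
ca = Ja(:,2) / sum(Ja(:,2));
cb = Jb(:,2) / sum(Jb(:,2));

%% Uniform on its support: all nonzero probabilities equal (up to roundoff).
is_uniform = @(p) (max(p(p > 0)) - min(p(p > 0))) < 1e-12;

uniform_joint = [is_uniform(Ja(:)), is_uniform(Jb(:))];   %% [part a, part b]
uniform_cond  = [is_uniform(ca), is_uniform(cb)];         %% [part a, part b]
```

In part a the probability of a pair (x, y) is 1/(100 d(x)), where d(x) is the number of divisors of x, so it varies with x; this is why the check fails for part a in both the joint and the conditional case.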
3. Suppose that X and Y are generated independently and uniformly on 1, 2, . . . , 100.
Write a program to calculate P(X − Y = a | X + Y = b) for all possible values of a and b.
Use the results of your program to (i) calculate E(X − Y | X + Y = b) for all possible
values of b, and (ii) determine whether X − Y and X + Y are independent.
Solution:
%% The joint distribution of X-Y and X+Y.
D = zeros(199,200);
%% Cycle through the X,Y sample space.
for x=1:100
    for y=1:100
        %% The row where probabilities for x-y are stored.
        a = x-y+100;
        %% The column where probabilities for x+y are stored.
        b = x+y;
        %% Increment by the probability that X=x and Y=y.
        D(a,b) = D(a,b) + 1/10000;
    end
end
%% P(X-Y | X+Y). Column 1 is NaN (0/0) because X+Y=1 is impossible.
C = D ./ (ones(size(D,1),1)*sum(D));
%% EYX(k) is E(X-Y | X+Y=k) for k=2,...,200; EYX(1) is NaN.
EYX = [-99:99] * C;
X − Y and X + Y are not independent, since the columns of C are not all equal.
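An equivalent way to check independence is to compare the joint table with the product of its marginals; under independence the two would agree everywhere. The sketch below rebuilds the joint table so that it runs on its own. The names `PA`, `PB`, and `max_gap` are illustrative.

```octave
%% Joint table of (X-Y, X+Y) for X, Y independent and uniform on 1..100.
D = zeros(199, 200);
for x = 1:100
    for y = 1:100
        D(x-y+100, x+y) = D(x-y+100, x+y) + 1/10000;
    end
end

%% Marginal distributions of X-Y (column vector) and X+Y (row vector).
PA = sum(D, 2);
PB = sum(D, 1);

%% Under independence, D would equal the outer product PA*PB.
max_gap = max(max(abs(D - PA * PB)));
```

A value of `max_gap` well above roundoff level indicates dependence. For instance, the cell for X − Y = 0 and X + Y = 200 has joint probability 1/10000, while the product of its marginals is (100/10000)(1/10000).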
I didn’t ask for the following, but you should understand it:
The conditional mean function E(X − Y | X + Y) is constant. We learned that
E(X − Y | X + Y) is always constant when X − Y and X + Y are independent. But the converse
statement is not true: E(X − Y | X + Y) may be constant even when X − Y and X + Y
are dependent. This is an example of that situation.
Although the mean of X − Y given X + Y is the same for all values of X + Y, other
properties of the conditional distribution do depend on the value of X + Y. For example,
when X + Y = 200, we must have X = Y = 100, so the conditional distribution of X − Y
contains only the point 0, with no variance. But when X + Y = 100, the (X, Y) values can be
(50, 50), (49, 51), (51, 49), and many more points. The corresponding X − Y values
are 0, −2, 2, . . . . These values average out to zero, hence the conditional mean is still
zero. But there is a lot of variation around the mean.
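The contrast between a constant conditional mean and a changing conditional spread can be seen by computing the conditional variance alongside the conditional mean. This is a sketch under the same setup as the solution; restricting to the attainable sums 2, . . . , 200 and the names `bvals`, `cond_mean`, and `cond_var` are choices made here for illustration.

```octave
%% Joint table of (X-Y, X+Y) for X, Y independent and uniform on 1..100.
D = zeros(199, 200);
for x = 1:100
    for y = 1:100
        D(x-y+100, x+y) = D(x-y+100, x+y) + 1/10000;
    end
end

%% Keep only attainable sums b = 2..200 so that no column total is zero.
bvals = 2:200;
Db = D(:, bvals);
C = Db ./ (ones(size(Db,1),1) * sum(Db));

%% Conditional mean and variance of X-Y given X+Y = b, one entry per b.
avals = -99:99;
cond_mean = avals * C;
cond_var  = (avals .^ 2) * C - cond_mean .^ 2;
```

Entry j of `cond_mean` and `cond_var` corresponds to X + Y = j + 1, so `cond_var(199)` is the variance given X + Y = 200 and `cond_var(99)` is the variance given X + Y = 100.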