Statistics 406 Homework 5
Due Monday, October 25

1. Suppose $X$ is a random variable with sample space $0, 1/100, 2/100, \ldots, 1$, and $P(X = k/100) = c\sqrt{k}$ for a constant $c$. What is the value of $c$? What is the sample space of $Y = (X - 1/2)^2$? Write a program to calculate the distribution of $Y$.

Warning: Do not use the approach given on page 11 of the notes here. Due to numerical roundoff you will get the wrong sample space. Work out the sample space of $Y$ mathematically, then work out the formula for the probability distribution of $Y$ mathematically (in terms of $c$), then write a program to evaluate the formula.

Solution: The normalizing constant is given by the following Octave code.

    c = 1/sum(sqrt(0:100));

The sample space of $Y$ contains 51 points:

$$0,\ 1/10^4,\ 4/10^4,\ 9/10^4,\ \ldots,\ 50^2/10^4$$

Writing $X = k/100$ gives $Y = (k - 50)^2/10^4$, so $Y = 0$ only when $k = 50$, while each nonzero value $j^2/10^4$ arises from both $k = 50 - j$ and $k = 50 + j$. Here is the code to construct the distribution table:

    %% Column 1 holds the value of Y, column 2 its probability.
    T = zeros(51,2);

    %% Zero is a special case.
    T(1,:) = [0, c*sqrt(50)];

    %% All the other points follow the same pattern.
    for k=1:50
      T(k+1,:) = [k^2/10^4, c*(sqrt(50-k) + sqrt(50+k))];
    end

2. Consider the following two joint distributions:

(a) Suppose that $X$ is generated uniformly on $1, 2, \ldots, 100$, and then $Y$ is generated uniformly from the set of all distinct divisors of $X$ (including 1 and $X$).

(b) Suppose that $X$ and $Y$ are generated uniformly from the set of all pairs $x, y$ such that $x$ is an integer between 1 and 100, and $y$ is a divisor of $x$.

For each of the two joint distributions, write a program to calculate the marginal distribution and expected value of $Y$, the conditional distribution of $X$ given $Y = 2$, and the conditional mean of $X$ given $Y = 2$. Are the joint distributions for (a) and/or (b) uniform? Are the conditional distributions for (a) and/or (b) uniform?

Solution:

    %% The joint distribution of X and Y.
    J = zeros(100,100);

    %% Cycle through the X sample space.
    for i=1:100
      D = rem(i, [1:100]);
      ID = find(D == 0);

      %% Use (1) for part a, (0) for part b.
      if (1)
        J(i,ID) = 1 / (100*length(ID));
      else
        J(i,ID) = 1;
      end
    endfor

    %% Standardize the probabilities. This does nothing in part a.
    J = J / sum(sum(J));

    %% Marginal Y distribution.
    PY = sum(J)';

    %% Marginal Y mean.
    EY = dot(PY, [1:100]);

    %% Conditional distribution of X given Y=2.
    PXY2 = J(:,2) / sum(J(:,2));

    %% Conditional expectation of X given Y=2.
    EXY2 = dot(PXY2, [1:100]);

The joint distribution and the conditional distribution of $X$ given $Y = 2$ are uniform for part b, but not for part a.

3. Suppose that $X$ and $Y$ are generated independently on $1, 2, \ldots, 100$. Write a program to calculate $P(X - Y = a \mid X + Y = b)$ for all possible values of $a$ and $b$. Use the results of your program to (i) calculate $E(X - Y \mid X + Y = b)$ for all possible values of $b$, and (ii) determine whether $X - Y$ and $X + Y$ are independent.

Solution:

    %% The joint distribution of X-Y and X+Y.
    D = zeros(199,200);

    %% Cycle through the X,Y sample space.
    for x=1:100
      for y=1:100

        %% The row where probabilities for x-y are stored.
        a = x-y+100;

        %% The column where probabilities for x+y are stored.
        b = x+y;

        %% Increment by the probability that X=x and Y=y.
        D(a,b) = D(a,b) + 1/10000;

      end
    end

    %% P(X-Y | X+Y)
    %% Column 1 (X+Y=1) is impossible, so its entries are NaN (0/0).
    C = D ./ (ones(size(D,1),1)*sum(D));

    %% EYX(k) is E(X-Y | X+Y=k) for k=2..200; EYX(1) is NaN.
    EYX = [-99:99] * C;

$X - Y$ and $X + Y$ are not independent, since the columns of C are not all equal.

I didn't ask for the following, but you should understand it: The conditional mean function $E(X - Y \mid X + Y)$ is constant. We learned that $E(X - Y \mid X + Y)$ is always constant when $X - Y$ and $X + Y$ are independent. But the converse statement is not true: $E(X - Y \mid X + Y)$ may be constant even when $X - Y$ and $X + Y$ are dependent. This is an example of that situation.

Although the mean of $X - Y$ given $X + Y$ is the same for all values of $X + Y$, other properties of the conditional distribution do depend on the value of $X + Y$. For example, when $X + Y = 200$, we must have $X = Y = 100$, so the distribution of $X - Y$ contains only the point 0, with no variance. But when $X + Y = 100$, the $(X, Y)$ values can be $(50, 50)$, $(49, 51)$, $(51, 49)$, and many more points. The corresponding $X - Y$ values are $0, -2, 2, \ldots$. These values average out to zero, hence the conditional mean is still zero. But there is a lot of variation around the mean.
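
The last point can be checked numerically. The following is a minimal Octave sketch, not part of the original solution, that computes the conditional variance of X - Y given X + Y = b alongside the conditional mean. It assumes X and Y are independent and uniform on 1, ..., 100 (the assumption behind the 1/10000 increment in the Problem 3 code). The names M, V, n, and V0 are introduced here for illustration only. The comparison value (n^2 - 1)/3 is the variance of n equally likely points spaced 2 apart, where n is the number of (x, y) pairs with x + y = b.

```octave
%% Joint distribution of (X-Y, X+Y), ASSUMING X and Y are independent
%% and uniform on 1..100.
D = zeros(199,200);
for x=1:100
  for y=1:100
    D(x-y+100, x+y) = D(x-y+100, x+y) + 1/10000;
  end
end

%% Keep only the reachable sums b = 2..200, which avoids 0/0 for b = 1.
b = 2:200;
C = D(:,b) ./ (ones(199,1)*sum(D(:,b)));

%% Conditional mean and variance of X-Y for each value of X+Y.
a = -99:99;
M = a * C;
V = (a.^2) * C - M.^2;

%% Number of (x,y) pairs with x+y = b, and the closed-form variance
%% of n equally likely values spaced 2 apart (illustrative check).
n = min(b-1, 201-b);
V0 = (n.^2 - 1) / 3;

printf("largest |conditional mean|: %g\n", max(abs(M)));
printf("conditional variance at b=100: %g\n", V(b==100));
printf("conditional variance at b=200: %g\n", V(b==200));
printf("largest |V - V0|: %g\n", max(abs(V - V0)));
```

The conditional means in M are all zero up to roundoff, while V ranges from 0 at b = 2 and b = 200 up to its largest value at b = 101. At b = 100 there are n = 99 pairs, so the variance is 9800/3, roughly 3266.67. A conditional variance that changes with b is one concrete way to see that X - Y and X + Y are dependent even though the conditional mean is constant.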