Real Analysis
I. Qualifying Exam 2006
1) Argue that all countable sets have Lebesgue measure zero. Give an example
of a set with continuum cardinality whose Lebesgue measure is zero.
2) State and prove the Monotone Convergence Theorem.
3) Let $f$ be a bounded real-valued function on $[a, b]$. Show that if $f$ is
Riemann integrable, then $f$ is Lebesgue integrable and
$\int_a^b f(x)\,dx = \int_{[a,b]} f\,dm$.
4) State the Lebesgue-Radon-Nikodym theorem. What does it mean that
the Lebesgue integral is absolutely continuous?
5) Show that if $f$ is absolutely continuous on $[a, b]$, then $f \in BV([a, b])$.
Ph.D. Qualifying Examination in Probability
April 10, 2006
Instructions:
(a) There are three problems. Submit all three for full credit.
(b) Start each problem on a new page, indicate the problem number, and number
the pages.
(c) Indicate any concept or theorem used, and show how it is being applied.
1. Suppose that $X_n \xrightarrow{P} X$ and that $|X_n| \le Y$, all $n$, with $Y \in L_1$.
(a) Show that $X \in L_1$.
[Hint. First show that $P(|X| > Y + \varepsilon) = 0$ for any $\varepsilon > 0$. Use this to obtain
$P(|X| > Y) = 0$ and then complete the proof.]
(b) Show that $X_n \xrightarrow{L_1} X$.
[Hint. Assume that the claimed convergence does not hold and apply a subsequence
argument along with the Lebesgue Dominated Convergence Theorem to obtain the
desired convergence on a subsubsequence, a contradiction.]
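2. Let $X_n$, $n \ge 1$, be i.i.d. random variables with $P(X_n = 1) = P(X_n = -1) = 1/2$, and put
$S_n = \sum_{i=1}^{n} X_i$, $n \ge 1$.
(a) Show that $S_n/n \to 0$ a.c.
(b) Show that $S_n/(\sqrt{n}\,\log n) \to 0$ a.c.
[Hint. You may use the fact that $\sum_{n \ge 2} (n \log^2 n)^{-1} < \infty$.]
(c) Show that $S_n/\sqrt{n}$ does not converge to 0 a.c.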
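3. Let $\{X_{n,j},\ j \ge 1,\ n \ge 1\}$ be an array of random variables on a probability
space $(\Omega, \mathcal{F}, P)$ such that for any $\varepsilon > 0$
$\lim_{n \to \infty} \sum_{j=1}^{\infty} P(|X_{n,j}| > \varepsilon) = 0$.
(a) Show that $\sup_{j \ge 1} |X_{n,j}| \xrightarrow{P} 0$ as $n \to \infty$.
(b) Show that $\limsup_{n \to \infty} \sup_{j \ge 1} |X_{n,j}|$ is a tail function.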
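As an informal numerical companion to Problem 2 (not a proof, and not part of the exam), the following minimal sketch simulates the partial sums of i.i.d. steps that equal +1 or -1 with probability 1/2 each and prints the sum under two scalings, n and sqrt(n). The function name simulate_walk, the seed, and the sample size are illustrative assumptions.

```python
import math
import random


def simulate_walk(n_steps, seed=0):
    """Return S_n, the sum of n_steps i.i.d. +1/-1 steps (probability 1/2 each)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    s = 0
    for _ in range(n_steps):
        s += 1 if rng.random() < 0.5 else -1
    return s


n = 100_000                       # illustrative sample size (an assumption)
s_n = simulate_walk(n)
ratio_n = s_n / n                 # law-of-large-numbers scaling
ratio_sqrt = s_n / math.sqrt(n)   # central-limit scaling
print(ratio_n, ratio_sqrt)
```

With a large n the first ratio is typically close to zero, while the second stays of order one. A single simulated path only suggests this behaviour; the exam asks for an argument.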
Ph.D. Qualifying Examination: Spring 2006
Statistical Inference
Instructions
o Number of questions
o Show all work and justify all steps to get full credit.
o Simplify your answers as much as possible.
1. (20 points) Consider a population $X \sim N(\mu, \sigma^2)$ with both
parameters unknown. Let $X_1, \ldots, X_n$
denote a random sample from this population. For a specified constant $x_0$,
define the probability
$p = \Pr(X \le x_0) = \Phi\big((x_0 - \mu)/\sigma\big)$,
where $\Phi(\cdot)$ is the cumulative distribution function of a
N(0,1) distribution. Derive a closed-form level $\alpha$ test for the hypotheses
$H_0: p \le p_0$ versus $H_1: p > p_0$,
where $0 < p_0 < 1$ is a specified probability.
(Hint: Use the fact that if $Z \sim N(0,1)$, $Y \sim \chi^2_k$, and the two are independent,
then $T_k(\Delta) = (Z + \Delta)/\sqrt{Y/k}$ follows a noncentral $t_k$-distribution with noncentrality
parameter $\Delta$.)
2. Consider two independent random variables X and Y where $X \sim$ Exponential($\lambda$) and
$Y \sim$ Exponential($\mu$). It is not possible to obtain direct observations on X and Y.
Instead, we observe the random variables Z and W, where $Z = \min\{X, Y\}$ and
$W = 1$ if $Z = X$; $W = 0$ if $Z = Y$.
(a) (10 points) Find the joint distribution of Z and W. Show that Z and W are
independent.
(b) (10 points) Suppose $(Z = z, W = 0)$ is observed. Find the maximum likelihood
estimator of $\lambda$ and $\mu$.
3. Suppose that X is a single observation from a $N(\theta, \theta)$, $\theta > 0$ distribution. Use
the following steps to construct the uniformly minimum variance unbiased estimator
(UMVUE) for $\theta$ based on X.
(a) (7 points) Show that $T = |X|$ is complete and sufficient for $\theta$.
(b) (7 points) Show that the density of T is
$g(t) = (2\pi\theta)^{-1/2} \exp\{-\tfrac{1}{2\theta}(t - \theta)^2\}\{1 + \exp(-2t)\}$ if $t > 0$;
$g(t) = 0$ otherwise.
(c) (7 points) Show that for $t > 0$,
$\Pr(X = t \mid T = t) = \exp(t)/\{\exp(-t) + \exp(t)\}$.
(d) (7 points) Show that $E(X \mid T = t) = t \tanh(t)$, $t > 0$, where
$\tanh(t) = \{\exp(t) - \exp(-t)\}/\{\exp(t) + \exp(-t)\}$.
(e) (7 points) Argue that $|X| \tanh(|X|)$ is the UMVUE for $\theta$.
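The unbiasedness behind part (e) of Problem 3 can be checked numerically. The sketch below assumes the model is X ~ N(theta, theta), that is, mean and variance both equal to theta, and estimates the mean of |X| tanh(|X|) by Monte Carlo. The value theta = 2, the seed, the number of draws, and the function names are illustrative assumptions; the check does not replace the completeness argument the problem asks for.

```python
import math
import random


def umvue_candidate(x):
    """Evaluate |x| * tanh(|x|), the estimator named in part (e)."""
    t = abs(x)
    return t * math.tanh(t)


def monte_carlo_mean(theta, n_draws=200_000, seed=1):
    """Average the candidate estimator over draws from N(theta, theta)."""
    rng = random.Random(seed)
    sd = math.sqrt(theta)  # assumed model: the variance equals the mean theta
    total = 0.0
    for _ in range(n_draws):
        total += umvue_candidate(rng.gauss(theta, sd))
    return total / n_draws


theta = 2.0  # arbitrary illustrative value, not taken from the exam
estimate = monte_carlo_mean(theta)
print(theta, estimate)
```

If the printed average is close to theta for several choices of theta, that is consistent with the estimator being unbiased.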
Ph.D. Qualifying Examination: Spring 2006
Linear Models
April 12, 2006
Instructions
o Number of questions = 2, total points = 35.
o Show your work and justify all steps to get full credit.
Problems
1. (20 points) Consider observations with correlated errors $Y = (Y_1, Y_2, \ldots, Y_n)$ where
$Y_i = \mu + e_i$ $(1 \le i \le n)$.
Suppose that the variance of each error is $\sigma^2$ and the correlation coefficients are
$C(e_i, e_j) = \rho$ $(i \ne j)$.
(a) (5 points) Show that the correlation matrix is positive-definite if and only if
$\rho > -1/(n-1)$.
(b) (5 points) Let $\bar{Y}$ and $S^2$ denote the sample mean and sample variance based on
the Y-values. Show that $\bar{Y}$ and $S^2$ are independent, even if $\rho \ne 0$.
(c) (10 points) Prove that for $0 < \rho < 1$ the general (weighted) least squares estimate
for $\mu$ is the same as the ordinary least squares estimate.
2. (15 points) A linear model with uncorrelated errors has the form
$Y_i = \beta_0 + \beta_1 U_i + \beta_2 V_i + e_i$ $(1 \le i \le n)$
where $n = 4k$ for some integer $k \ge 1$. The columns, U and V, are defined as follows.
$U_i = 1$ $(1 \le i \le 2k)$; $U_i = -1$ $(2k+1 \le i \le 4k)$
$V_i = 1$ $(1 \le i \le k;\ 2k+1 \le i \le 3k)$
$V_i = -1$ $(k+1 \le i \le 2k;\ 3k+1 \le i \le 4k)$
(a) (10 points) Find the variance-covariance matrix for the estimated vector $\hat{\beta} = (\hat{\beta}_1, \hat{\beta}_2)$.
(b) (5 points) Derive the constrained least squares estimate of $\beta_1$ as the value of $\beta_2$ is
specified.
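The structure of the design in Problem 2 can be inspected directly. The sketch below builds the intercept, U and V columns for n = 4k and computes the Gram matrix X'X. The helper names and the choice k = 3 are illustrative assumptions. It only exhibits the design and is not a full solution to either part.

```python
def design_columns(k):
    """Build the intercept, U and V columns for n = 4k observations."""
    n = 4 * k
    ones = [1] * n
    # U is +1 on the first half of the observations and -1 on the second half.
    u = [1 if i <= 2 * k else -1 for i in range(1, n + 1)]
    # V is +1 on the first and third quarters and -1 on the second and fourth.
    v = [1 if (i <= k or 2 * k + 1 <= i <= 3 * k) else -1
         for i in range(1, n + 1)]
    return ones, u, v


def gram_matrix(columns):
    """Return X'X as a nested list, where the columns of X are given."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]


k = 3  # illustrative value; any integer k >= 1 works
xtx = gram_matrix(design_columns(k))
print(xtx)  # [[12, 0, 0], [0, 12, 0], [0, 0, 12]]
```

The zero off-diagonal entries show that the three columns are mutually orthogonal, so X'X is n times the identity. If the errors additionally share a common variance (an assumption not stated in the problem), this orthogonality is what simplifies the covariance matrix of the least squares estimates.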
Ph.D. Qualifying Examination in Statistical Methods
April 14, 2006
Instructions
- The Exam consists of two computer projects.
- Load the data sets from http://www.utdallas.edu/-mbaron/Qual06
- Conduct the necessary data analysis using software of your choice.
- Submit a report, written or typed, hard copy or e-mail. If you choose to e-mail the report,
send it to both [email protected] and [email protected].
- In the report, describe every step of your analysis: method, reasons, and results. For example:
(1) Test significance of variable XXX. Use SAS, PROC ... with option ... The F test gives a
p-value of 0.0003. Therefore, ... ...
(2) Verify assumptions of the test. Use ... ... The variable ... violates assumption ... because ...
Therefore, ... ...
- Attach your computer programs
and only relevant parts of the output. Do not attach the
parts of output that were not used to answer questions.
PROJECT I
"Real Estate Sales, Data Set 1"
The city tax assessor was interested in predicting residential home sales prices in a mid-western
city as a function of various characteristics of the home and surrounding property. Data on 522
transactions were obtained for home sales during the year 2002. The 13 variables are:
Column  Variable
1       Identification number
2       Sales price of residence in dollars
3       Area of residence in sq. feet
4       Number of bedrooms
5       Number of bathrooms
6       Air conditioning: equals 1 if present, 0 if absent
7       Garage size: number of cars the garage can hold
8       Swimming pool: equals 1 if present, 0 if absent
9       Year the property was built
10      Quality of construction: high, medium, or low. In the data set,
        1 - high
        2 - medium
        3 - low
11      Style: a qualitative indicator of architectural style
12      Lot size in sq. feet
13      Adjacency to a highway: equals 1 if there is a highway in the neighborhood, 0 if not
1. Develop the best subset model for predicting sales price. As part of this analysis, verify
assumptions being used, check for outliers and influential observations. Apply remedial measures if necessary.
Explain your choice of a model. Assess your model's ability to predict sales prices.
2. Create a new binary response variable Y "high quality construction". Let Y = 1 if quality
(variable 10) equals 1, Y = 0 otherwise. Find out if "high quality" can be adequately predicted
from the other variables in the data set. Develop the best prediction rule determining whether
the quality of construction is high or not.
PROJECT II
"Kidney failure hospitalization", Data Set 2
Kidney failure patients are commonly treated on dialysis machines that filter toxic substances
from the blood. The appropriate dose for effective treatment depends, among other things, on
duration of treatment and weight gain between treatments as a result of fluid buildup. To study
effects of these two factors on the number of days hospitalized during a year, a random sample of
10 patients per group was obtained. Treatment duration (factor A) was categorized into 2 groups:
short duration (dialysis time under 4 hours) and long duration (4 hours or more). Average weight
gain between treatments (factor B) during the year was categorized into 3 groups: slight, moderate,
and substantial. The following variables are recorded
Column  Variable
1       Number of days hospitalized
2       Factor A "treatment duration": equals 1 for LONG, equals 2 for SHORT
3       Factor B "weight gain": equals 1 for SLIGHT, 2 for MODERATE, 3 for SUBSTANTIAL
4       Patient ID number, 1 to 10, in each group
1. Obtain the fitted values and residuals for a two-factor ANOVA model with a transformed
response
Y' = log(Y + 1).
Are there any departures from this model? Does the normality assumption appear to be
reasonable here?
2. Prepare an estimated treatment means plot and the analysis of variance table. Does your
graph suggest that any factor effects are present? Does any one source account for most of
the total variability? Test whether the two factors interact. State an upper bound on the
family level of significance for the conducted tests.
3. Estimate the treatment mean $\mu_{22}$ and the difference $D = \mu_{23} - \mu_{21}$ with 95% confidence
intervals. Interpret your interval estimates.
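The sums of squares behind the analysis of variance table in Project II can be sketched without statistical software. The example below is a minimal illustration for a balanced 2 x 3 layout with 10 observations per cell, applied to the transformed response Y' = log(Y + 1). The counts are synthetic placeholders generated at random, NOT the exam data set. The natural logarithm, the seed, and the function name two_way_anova_ss are assumptions made for this illustration.

```python
import math
import random


def two_way_anova_ss(cells):
    """Sums of squares for a balanced two-factor layout.

    cells[i][j] holds the responses for level i of factor A and level j of
    factor B; every cell must contain the same number of observations.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    cell_mean = [[sum(c) / n for c in row] for row in cells]
    a_mean = [sum(row) / b for row in cell_mean]
    b_mean = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(a_mean) / a
    ss_a = b * n * sum((m - grand) ** 2 for m in a_mean)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_mean)
    ss_ab = n * sum((cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_e = sum((y - cell_mean[i][j]) ** 2
               for i in range(a) for j in range(b) for y in cells[i][j])
    ss_t = sum((y - grand) ** 2 for row in cells for c in row for y in c)
    return {"A": ss_a, "B": ss_b, "AB": ss_ab, "Error": ss_e, "Total": ss_t}


# Synthetic placeholder counts (NOT the exam data): 2 x 3 cells, 10 per cell.
rng = random.Random(2006)
days = [[[rng.randint(0, 20) for _ in range(10)] for _ in range(3)]
        for _ in range(2)]
# Transformed response Y' = log(Y + 1); the natural log is an assumption.
transformed = [[[math.log(y + 1) for y in cell] for cell in row] for row in days]
ss = two_way_anova_ss(transformed)
print(ss)
```

In a balanced design the four component sums of squares add up to the total, which gives a quick check on any ANOVA table produced by the software used for the report. Residuals for the normality check are the observations minus their cell means.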