Homework Problems
1. (5pts) What is the level of measurement (nominal, ordinal, interval or
ratio) for each of the following variables?
(a) The number of cups of coffee sold at Starbucks each Sunday during
2008.
(b) The courses offered by the department of statistics at NCCU (National Chengchi University) during 2006, such as statistics, calculus,
probability, etc.
(c) The monthly average temperatures at Taipei over the past 21 years.
(d) The ranking of NCCU in 2009 according to the QS World University
Rankings.
2. (5pts) Suppose that we have a sample of scores with sample mean 80 and
sample standard deviation 1. At least what percentage of the scores are
between 76 and 84?
3. (5pts) Suppose that we have a sample of scores with sample mean 80 and
sample standard deviation 1. Find a range that covers at least 70% of the
scores using Chebyshev’s theorem.
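Problem 3 (and, presumably, Problem 2) rests on Chebyshev's theorem. The following is a minimal R sketch of the two directions in which the theorem is used: from a multiple k of the standard deviation to a guaranteed proportion, and from a target proportion back to an interval. The function names and the example values are hypothetical and are not taken from the problems.

```r
# Chebyshev's theorem: for any k > 1, at least 1 - 1/k^2 of the values
# lie within k standard deviations of the mean.
chebyshev_coverage <- function(k) {
  stopifnot(k > 1)
  1 - 1 / k^2
}

# Smallest k whose guaranteed coverage is at least `coverage`.
chebyshev_k <- function(coverage) {
  stopifnot(coverage > 0, coverage < 1)
  1 / sqrt(1 - coverage)
}

# Interval m +/- k * s that is guaranteed to contain at least `coverage`.
chebyshev_interval <- function(m, s, coverage) {
  k <- chebyshev_k(coverage)
  c(lower = m - k * s, upper = m + k * s)
}

# Illustrative values only.
chebyshev_coverage(3)             # guaranteed proportion within 3 sd
chebyshev_interval(50, 2, 0.75)   # k = 2 here, so the interval is 46 to 54
```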
4. (5pts) Suppose that we made a list of 8 singers and asked a group of 6000
people about their favorite singers in our list. As a result, we have a sample
of 6000 numbers, where each number is one of 1, 2, ..., 8, corresponding
to one of the 8 singers. Which of the following ways for summarizing this
sample are meaningful?
(a) Reporting the sample mean.
(b) Reporting the sample variance.
(c) Reporting the sample median.
(d) Reporting the mode of the sample.
(e) Reporting the frequencies of 1’s, 2’s, ..., and 8’s in the sample.
5. (5pts) For a sample of size 1000 with minimum 0, maximum 1 and sample
standard deviation 0.3, determine the number of classes for drawing a
histogram using Scott’s rule.
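As a rough sketch of how Scott's rule turns summary statistics into a number of classes, the R function below uses the class width h = 3.5 s n^(-1/3), which is the constant used by R's own `grDevices::nclass.scott`. This is an assumption: the course handout may state the rule with a slightly different constant (3.49 is also common). The function name and the example values are hypothetical.

```r
# Scott's rule (one common form):
#   class width        h = const * s * n^(-1/3)
#   number of classes  ceiling((max - min) / h)
# const = 3.5 matches grDevices::nclass.scott; some texts use 3.49.
scott_classes <- function(n, s, min_value, max_value, const = 3.5) {
  h <- const * s * n^(-1/3)
  ceiling((max_value - min_value) / h)
}

# Illustrative values only: n = 125, s = 2, data ranging from 0 to 10.
# Here h = 3.5 * 2 / 5 = 1.4, and 10 / 1.4 is about 7.14, so 8 classes.
scott_classes(125, 2, 0, 10)
```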
6. (5pts) During the past 34 years, Virgin Atlantic Airways experienced no
air accidents in 23 of those years. Estimate the probability that Virgin Atlantic Airways experiences no air accidents in one year. Which concept of probability
(classical or empirical probability) is used to make the estimate?
7. (5pts) Suppose that we have a box of 1000 items, and 5 of them are defective. Suppose that we randomly select two items one at a time without
replacement (each item is drawn singly and is not put back). Let A be the event
that the first item is defective and B be the event that the second item is
defective. Find P (A|B) and P (A). Are A and B independent? Hint: use
Bayes’ theorem to find P (A|B).
8. (5pts) Determine whether X is a discrete random variable in each of the
following cases.
(a) X is the total number of cars sold in Taiwan in the next month.
(b) X is the average weight of all the babies born in Taiwan next year.
(c) X is the amount of time you need to wait for service when you visit
the post office the next time.
9. (5pts) Suppose that we have a box of 1000 items, and 5 of them are defective. Suppose that we randomly select two items one at a time without
replacement. For i = 1, 2, let
\[
X_i = \begin{cases} 1 & \text{if the $i$-th selected item is defective;} \\ 0 & \text{if the $i$-th selected item is not defective.} \end{cases}
\]
(a) Are $X_1$ and $X_2$ independent?
(b) Do $X_1$ and $X_2$ have the same PMF?
10. (5pts) Suppose that we have a box of 1000 items, and 5 of them are
defective. Suppose that we randomly select two items one at a time with
replacement. For i = 1, 2, let
\[
X_i = \begin{cases} 1 & \text{if the $i$-th selected item is defective;} \\ 0 & \text{if the $i$-th selected item is not defective.} \end{cases}
\]
Find the PMF for $X_1$ and the PMF for $X_2$.
11. (5pts) Suppose that X is a discrete random variable with PMF $p_X$, where
\[
p_X(x) = \begin{cases} 0.2 & \text{if } x = 0; \\ 0.5 & \text{if } x = 2; \\ 0.3 & \text{if } x = 4; \\ 0 & \text{otherwise.} \end{cases}
\]
Find $P(1.5 < X < 4.5)$.
12. (5pts) Suppose that X is a discrete random variable with PMF $p_X$, where
\[
p_X(x) = \begin{cases} 0.1 & \text{if } x = -1; \\ 0.3 & \text{if } x = 1; \\ 0.6 & \text{if } x = 2; \\ 0 & \text{otherwise.} \end{cases}
\]
Find the mean and variance of X.
13. (5pts) Consider the X in Problem 12. Let $Y = X^2$.
(a) Find the PMF of Y.
(b) Find the mean of Y.
Remark. Compare the $E(X^2)$ obtained from Part (b) with the $\mathrm{Var}(X)$
from Problem 12. You should be able to see that $E(X^2) = \mathrm{Var}(X) + (E(X))^2$.
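The computations in Problems 11 to 13 (mean, variance, the PMF of a transformed variable, and the identity in the remark) can be sketched in R as below. The PMF used here is hypothetical and deliberately different from the one in Problem 12. It is chosen so that two values of x map to the same value of x squared.

```r
# A hypothetical PMF (values and probabilities are illustrative only).
x <- c(-1, 1, 3)
p <- c(0.25, 0.25, 0.5)
stopifnot(isTRUE(all.equal(sum(p), 1)))   # probabilities must sum to 1

mean_x <- sum(x * p)                  # E(X)
var_x  <- sum((x - mean_x)^2 * p)     # Var(X) from the definition
ex2    <- sum(x^2 * p)                # E(X^2)

# PMF of Y = X^2: add up the probabilities of all x that give the same y.
pmf_y  <- tapply(p, x^2, sum)
mean_y <- sum(as.numeric(names(pmf_y)) * pmf_y)   # E(Y) from the PMF of Y

c(EX = mean_x, VarX = var_x, EX2 = ex2, EY = mean_y)
all.equal(ex2, var_x + mean_x^2)      # checks E(X^2) = Var(X) + (E(X))^2
```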
14. (5pts) For a random variable X whose E(X) exists, we have
\[
E(a + bX) = a + bE(X) \quad \text{for all constants } a, b. \tag{1}
\]
Use (1) to show that $\mathrm{Var}(X + a) = \mathrm{Var}(X)$ and $\mathrm{Var}(bX) = b^2 \mathrm{Var}(X)$
for all constants a, b, assuming $E(X^2)$ and $E(X)$ exist.
• Remark. This result can be expressed more briefly as $\mathrm{Var}(a + bX) = b^2 \mathrm{Var}(X)$.
15. (5pts) Suppose that X and Y are two random variables with the same
PMF, and P (X = 1) = p = 1 − P (X = 0), where 0 < p < 1. Show that if
P (Y = 1|X = 0) ≠ P (Y = 1|X = 1), then X and Y are not independent.
16. (5pts) Suppose that X is a random variable with mean 80 and variance 1.
Find a range A such that P (X ∈ A) ≥ 0.7 using Chebyshev’s theorem.
17. (5pts) Suppose that $X_1, \ldots, X_n$ are IID random variables with $E(X_1) = \mu$
and $\mathrm{Var}(X_1) = \sigma^2$. Let $\bar{X} = (X_1 + \cdots + X_n)/n$. Show that
\[
E\left( \sum_{i=1}^{n} (X_i - \bar{X})^2 \right) = (n - 1)\sigma^2.
\]
• Hint: express $\sum_{i=1}^{n} (X_i - \bar{X})^2$ using $\sum_{i=1}^{n} X_i^2$ and $(\bar{X})^2$, and then
compute the expectations of the two terms using the fact that $E(X^2) = \mathrm{Var}(X) + (EX)^2$ and Equations (6) and (7) in the handout “Mean,
variance and standard deviation”.
• Remark. This result shows that the expectation of the sample variance of a random sample is the variance of the population distribution.
18. (5pts) Ms Li is the manager of the human resource department of a company. Based on her experience, she estimates that the probability that a
new employee will quit in one year is 0.025. The company hired 40 people
last month. What is the probability that 2 of the 40 new employees will
quit in one year?
19. (5pts) Suppose that a class of 15 students are given 4 movie tickets. To be
fair, the students would like to choose 4 ticket winners among themselves
randomly. Suppose that 10 of the 15 students are males and the rest are
females. Find the probability that exactly 3 of 4 ticket winners are males.
20. (5pts) Suppose that Ms Yu goes fishing every weekend, and she catches 3
fish in 2 hours on average. Suppose that she plans to spend 2 hours
fishing this weekend. Let X be the number of fish that she will catch
this weekend. Propose a distribution for X and find the probability that
X ≥ 1 using the proposed distribution.
21. (5pts) In Problem 18, find an approximate value for the probability that 2
of the 40 new employees will quit in one year using a Poisson distribution.
You may use R or other devices such as a calculator to evaluate $e^{-\mu}$ for a
given $\mu$.
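Problems 18 and 21 contrast an exact binomial probability with its Poisson approximation. A minimal R sketch of that comparison follows, using the base functions `dbinom` and `dpois`. The values of n, p and k are illustrative only and are not those of Problem 18.

```r
# Illustrative values only: n trials, success probability p, count k.
n <- 100
p <- 0.01
k <- 3

# Exact Binomial(n, p) probability of exactly k successes.
exact <- dbinom(k, size = n, prob = p)

# Poisson approximation with mu = n * p.
mu <- n * p
pois_approx <- dpois(k, lambda = mu)

# The same Poisson probability written out as e^(-mu) * mu^k / k!.
by_hand <- exp(-mu) * mu^k / factorial(k)

c(exact = exact, pois_approx = pois_approx, by_hand = by_hand)
```

The approximation is usually considered reasonable when n is large and p is small, which is the situation in this sketch.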
22. (5pts) Show that
\[
\sum_{x=0}^{\infty} x(x - 1) e^{-\mu} \frac{\mu^x}{x!} = \mu^2
\]
using the fact that
\[
\sum_{x=0}^{\infty} e^{-\mu} \frac{\mu^x}{x!} = 1.
\]
Remark. This result shows that if $X \sim \mathrm{Poisson}(\mu)$, then $E(X(X - 1)) = \mu^2$.
23. (5pts) Suppose that X has a PDF $f_X$, where
\[
f_X(x) = \begin{cases} x & \text{if } 0 \le x < 1; \\ 1 & \text{if } 1 \le x \le 1.5; \\ 0 & \text{otherwise.} \end{cases}
\]
Find the following probabilities.
(a) P (0 < X < 1).
(b) P (0 < X < 1.1).
(c) P (X > 1.5).
(d) P (X < 0).
(e) P (X = 0.5).
24. (5pts) In Problem 23, we have fX (1) > fX (0.5). Does this imply that
P (X = 1) > P (X = 0.5)? Justify your answer.
25. (5pts) For the X and $f_X$ in Problem 23, derive a formula for
\[
\frac{P(1 - h < X < 1 + h)}{P(0.5 - h < X < 0.5 + h)}
\]
for 0 < h < 0.5. Find an h in (0, 0.5) such that
\[
\left| \frac{P(1 - h < X < 1 + h)}{P(0.5 - h < X < 0.5 + h)} - \frac{f_X(1)}{f_X(0.5)} \right| < 0.01.
\]
Hint: choose a small h.
26. (5pts) Suppose that X ∼ U (1, 3). Find the following quantities.
(a) P (X < 1).
(b) P (X = 2).
(c) P (2 < X < 4).
(d) E(X).
(e) V ar(X).
You may use the following R outputs to solve this problem.
• Running
h <- function(x){ x/2 }
integrate(h, 1, 3)
gives the R output
2 with absolute error < 2.2e-14
• Running
h <- function(x){ x^2/2 }
integrate(h, 1, 3)
gives the R output
4.333333 with absolute error < 4.8e-14
27. (5pts) Suppose that X ∼ N (0, 1). Find the following probabilities using
the table “Normal probabilities” (available at the course web site).
(a) P (−0.5 < X < 1.51).
(b) P (0.5 < X < 1.51).
(c) P (X > 1.51).
(d) P (X > −0.5).
(e) P (X < 0.5).
28. (5pts) Suppose that $X \sim N(1, 3^2)$. Find $P(2.5 < X < 5.53)$.
29. (5pts) Suppose that X is a random variable with mean 1 and standard
deviation 2, and Y = 1 − 2X.
(a) Find the mean and variance of Y .
(b) Give two values a and b such that P (a ≤ Y ≤ b) ≥ 0.75 using
Chebyshev’s Theorem.
(c) Suppose that the distribution of X is normal. For the a and b found
in Part (b), what is P (a ≤ Y ≤ b)?
30. (5pts) Suppose that Ms Yu goes fishing every weekend, and she catches 4
fish in 2 hours on average. Suppose that the number of fish she catches
in the first t hours forms a Poisson process. Find the probability that the
next time Ms Yu goes fishing, she does not catch any fish in the first
0.5 hour and catches one fish in the first hour.
31. (5pts) In Problem 30, let X be the waiting time until the first fish is caught
and let Y be the number of fish caught in the first 0.5 hour.
(a) What is the distribution of X? Find P (X > 0.5).
(b) What is the distribution of Y ? Find P (Y = 0).
32. (5pts) Suppose that a production line produces pens with defective rate
0.02. Let X be the number of defective pens in the next 1000 pens made
on the production line. Approximate P (X ≤ 22) using a normal probability.
33. (5pts) Consider the population of rents for one-bedroom apartments near
NCCU. Suppose that the population distribution can be approximated
well by a normal distribution with standard deviation of NT$4,000 per
month. Suppose that we have a random sample of 30 rents from the
population, and the sample mean is NT$11,000 per month. Find an approximate 95% confidence interval for the population mean rent.
34. (5pts) Consider the population of rents for one-bedroom apartments near
NCCU. Suppose that the population distribution can be approximated
well by a normal distribution with unknown mean and variance. Suppose
that we have a random sample of 30 rents from the population, and the
sample mean and sample standard deviation are NT$11,000 per month
and NT$4,000 per month. Find an approximate 90% confidence interval
for the population mean rent.
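Problems 33 and 34 differ in whether the population standard deviation is known. The R sketch below shows the two standard interval forms: a z-interval with known sigma and a t-interval with the sample standard deviation. Whether the course expects a t quantile or a normal quantile in Problem 34 is not stated here, so the t-interval should be read as one reasonable assumption. The function names and the example values are hypothetical.

```r
# z-interval: the population standard deviation sigma is known.
z_interval <- function(xbar, sigma, n, level) {
  zq <- qnorm(1 - (1 - level) / 2)
  xbar + c(-1, 1) * zq * sigma / sqrt(n)
}

# t-interval: sigma is unknown and replaced by the sample standard
# deviation s; the quantile comes from a t distribution with n - 1 df.
t_interval <- function(xbar, s, n, level) {
  tq <- qt(1 - (1 - level) / 2, df = n - 1)
  xbar + c(-1, 1) * tq * s / sqrt(n)
}

# Illustrative values only.
z_interval(xbar = 50, sigma = 10, n = 25, level = 0.95)
t_interval(xbar = 50, s = 10, n = 25, level = 0.95)
```

For the same inputs the t-interval is wider than the z-interval, which reflects the extra uncertainty from estimating the standard deviation.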
35. (5pts) A mobile phone manufacturer would like to estimate the defective
rate for one of their new products. Suppose that a random sample of 40
items is taken and 1 of them is defective. Construct an approximate
95% confidence interval for the defective rate.
36. (5pts) Consider the population of rents for one-bedroom apartments near
NCCU. Suppose that the population distribution can be approximated
well by a normal distribution with standard deviation of NT$4,000 per
month. Suppose that we want to construct an approximate 90% confidence
interval for the population mean rent with maximum allowable margin of
error NT$4,000 per month. What is the required sample size?
37. (5pts) A mobile phone manufacturer would like to construct a 90% approximate confidence interval for the defective rate for one of their new
products with maximum allowable margin of error 0.005. What is the
required sample size?
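Problems 36 and 37 both ask for the smallest sample size that keeps the margin of error below a bound. A minimal R sketch of the usual formulas follows. For a proportion with no prior estimate, it assumes the conservative choice p = 0.5, which may or may not be what the course handout prescribes. The function names and the example values are hypothetical.

```r
# Sample size for estimating a mean with known sigma:
#   z * sigma / sqrt(n) <= margin   gives   n >= (z * sigma / margin)^2.
n_for_mean <- function(sigma, margin, level) {
  zq <- qnorm(1 - (1 - level) / 2)
  ceiling((zq * sigma / margin)^2)
}

# Sample size for estimating a proportion:
#   z * sqrt(p (1 - p) / n) <= margin.
# With no prior guess, p = 0.5 maximizes p * (1 - p), so it gives the
# largest (most conservative) n. This default is an assumption.
n_for_proportion <- function(margin, level, p = 0.5) {
  zq <- qnorm(1 - (1 - level) / 2)
  ceiling(zq^2 * p * (1 - p) / margin^2)
}

# Illustrative values only.
n_for_mean(sigma = 15, margin = 3, level = 0.95)
n_for_proportion(margin = 0.03, level = 0.95)
```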
38. (5pts) A new drug has been developed to lower systolic blood pressure.
Suppose that after using the new drug, a typical patient’s systolic blood
pressure is lowered by µ (mmHg) on average. Suppose that after using a
conventional drug, a typical patient’s systolic blood pressure is lowered by
µ0 (mmHg) on average. We would like to know whether the new drug is
more effective than the conventional drug. Suppose that we want to control the probability of falsely claiming the new drug is more effective than
the conventional drug. Which statement should be the null hypothesis,
µ ≥ µ0 or µ < µ0 ?
39. (5pts) Consider Problem 36. Suppose that we have a random sample of 30
rents from the population, and the sample mean is NT$11,000 per month.
(a) Can we conclude that the population mean rent is more than NT$10,000
per month at the 0.01 significance level?
(b) Can we conclude that the population mean rent is less than NT$18,000
per month at the 0.05 significance level?
(c) Can we conclude that the population mean rent is different from
NT$14,000 per month at the 0.01 significance level?
40. (5pts) Consider Problem 34.
(a) Can we conclude that the population mean rent is more than NT$10,000
per month at the 0.01 significance level?
(b) Can we conclude that the population mean rent is less than NT$18,000
per month at the 0.05 significance level?
(c) Can we conclude that the population mean rent is different from
NT$14,000 per month at the 0.01 significance level?
41. (5pts) Consider Problem 36. Suppose that we have a random sample of
30 rents from the population, and the sample mean is NT$11,000 per
month. For each a in {0.02, 0.04, 0.06, 0.08, 0.1}, can we conclude that the
population mean rent is more than NT$10,000 per month at level a? Note
that you only need to compute one number to solve this problem.
42. (5pts) Consider Problem 36. Suppose that we have a random sample of 30
rents from the population, and the sample mean is NT$11,000 per month.
(a) How confident can we be to conclude that the population mean rent
is less than NT$18,000 per month?
(b) How confident can we be to conclude that the population mean rent
is different from NT$14,000 per month?
Note: we can be very confident to conclude H1 if we have very strong
evidence against H0 .
43. (5pts) Suppose that $(X_1, \ldots, X_n)$ is a random sample from $N(\mu, \sigma^2)$,
where $\sigma$ is known. Suppose that $0 < p < a < 1$. Show that
\[
\left( \bar{X} - z_{a-p} \frac{\sigma}{\sqrt{n}},\ \bar{X} + z_p \frac{\sigma}{\sqrt{n}} \right)
\]
is a $(1 - a)$ confidence interval for $\mu$.
44. The R output after running
qnorm(0.915);qnorm(0.925);qnorm(0.935);qnorm(0.945);qnorm(0.950)
is
[1] 1.372204
[1] 1.439531
[1] 1.514102
[1] 1.598193
[1] 1.644854
and the R output after running
qnorm(0.955);qnorm(0.965);qnorm(0.975);qnorm(0.985)
is
[1] 1.695398
[1] 1.811911
[1] 1.959964
[1] 2.170090
Recall that running qnorm(1-p) in R gives $z_p$, so the above output gives
$z_p$’s for different p’s. Suppose that $(X_1, \ldots, X_n)$ is a random sample from
$N(\mu, \sigma^2)$, where $\sigma$ is known. Give five different 90% confidence intervals
for $\mu$ using the above $z_p$’s and the result in Problem 43. Which one of the
five confidence intervals has the shortest length?
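The interval in Problem 43 splits the total error probability a unevenly between the two tails. As a numerical sanity check, and not a substitute for the proof, the R sketch below computes the coverage and the length of such an interval for several splits. It uses a = 0.05 rather than the 90% level of Problem 44, and the function names are hypothetical.

```r
# Upper-tail quantile z_p, defined by P(Z > z_p) = p for Z ~ N(0, 1).
z_upper <- function(p) qnorm(1 - p)

# mu lies in (Xbar - z_{a-p} sigma/sqrt(n), Xbar + z_p sigma/sqrt(n))
# exactly when -z_p < sqrt(n) (Xbar - mu) / sigma < z_{a-p}.
coverage <- function(a, p) {
  pnorm(z_upper(a - p)) - pnorm(-z_upper(p))
}

# Length of the interval, in units of sigma / sqrt(n).
length_factor <- function(a, p) z_upper(a - p) + z_upper(p)

# Illustrative values only: a = 0.05 with several splits p.
a <- 0.05
p <- c(0.01, 0.02, 0.025, 0.03, 0.04)
coverage(a, p)        # every entry equals 1 - a up to rounding
length_factor(a, p)   # compare the entries to see how p affects the length
```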
45. (5pts) A mobile phone manufacturer would like to learn about the defective rate for one of their new products. Suppose that a sample of 1000 is
taken and 5 out of the 1000 items are defective.
(a) Can we conclude that the defective rate is more than 0.01 at the 0.05
significance level?
(b) Can we conclude that the defective rate is less than 0.01 at the 0.05
significance level?
(c) Can we conclude that the defective rate is different from 0.006 at the
0.01 significance level?
46. (5pts) Suppose that $(X_1, \ldots, X_n)$ is a random sample from $N(\mu, 2^2)$, where
n = 100. Consider the testing problem
\[
H_0 : \mu \le 0 \quad \text{vs.} \quad H_1 : \mu > 0
\]
and a test that rejects $H_0$ at level 0.05 if
\[
\sqrt{n}\,\bar{X}/2 > z_{0.05}.
\]
Let $p(\mu)$ be the probability that the test rejects $H_0$. Arrange
p(−0.2), p(−0.1), p(0), p(0.1) and p(0.2) from smallest to largest and express your answer in
the form
\[
p(a_1) < p(a_2) < p(a_3) < p(a_4) < p(a_5),
\]
where $\{a_1, \ldots, a_5\} = \{-0.2, -0.1, 0, 0.1, 0.2\}$. You can justify your conclusion either by computing the five probabilities directly and sorting them, or by giving a proof that derives their order without computing them.
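The rejection probability in Problem 46 can be written in closed form with the normal CDF. The R sketch below shows one way to evaluate it, assuming a one-sided z-test of the same shape as the one in the problem. The function name, sample size, standard deviation and grid of means are hypothetical and differ from those in the problem.

```r
# Probability that the test rejects H0: mu <= 0, where the test rejects
# when sqrt(n) * Xbar / sigma > z_alpha.
# Under mean mu, sqrt(n) * Xbar / sigma ~ N(sqrt(n) * mu / sigma, 1).
reject_prob <- function(mu, n, sigma, alpha) {
  z_alpha <- qnorm(1 - alpha)
  1 - pnorm(z_alpha - sqrt(n) * mu / sigma)
}

# Illustrative values only.
mu_grid <- c(-0.5, 0, 0.5, 1)
reject_prob(mu_grid, n = 25, sigma = 1, alpha = 0.05)
```

At mu = 0 the function returns the level alpha, which is a quick check that the rejection rule has been coded correctly.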
47. (5pts) Consider Problem 34. Can we conclude that the standard deviation
for the rent population distribution is greater than NT$3,200 at levels 0.01,
0.05 and 0.1 respectively? Use the table “Quantiles for χ² distributions”
on the course web site to find the k_{a,df} values.