PSTAT 120C Probability and Statistics - Week 7
Fang-I Chu, Varvara Kulikova
University of California, Santa Barbara
May 21, 2012
Topics for review
The runs test
  Nature and definition
  Hint for #1 in hw5
Multinomial distribution
  Nature and definition
  Conditional distribution
  Discussion/hints for #2 in hw5
χ² goodness of fit tests
  Nature and definition of the χ² test statistic
  Discussion/hints for #3, #4 in hw5
The runs test
Nature:
Tests whether a set of outcomes represents an independent sequence or whether there is some association within the sequence.
A test for randomness is another way to phrase this test of independence (known as the runs test).
Definition
A run is a maximal subsequence of like elements.
A very small number of runs (positive association) or a very large number of runs (negative association) in a sequence indicates non-randomness.
Test
H0: the occurrences are independent vs. Ha: there is some association
R, the number of runs in the sequence, is the test statistic
Rejection region: R ≤ a or R ≥ b
Level of the test: α = P(R ≤ a) + P(R ≥ b)
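The definition of a run can be sketched in a few lines of Python. This is only an illustration: the helper name `count_runs` and the two toy sequences are assumptions made here, not part of the course material.

```python
from itertools import groupby

def count_runs(sequence):
    """Return R, the number of maximal blocks of identical consecutive elements."""
    # groupby yields one group per maximal block of equal neighbours.
    return sum(1 for _ in groupby(sequence))

# Hypothetical toy sequences (not from the homework).
clumped = list("AAAABBBB")      # positive association: very few runs
alternating = list("ABABABAB")  # negative association: very many runs

print(count_runs(clumped), count_runs(alternating))
```

Both extremes are evidence against independence, which is why the rejection region is two-sided.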
#1 in hw5
#1
Many basketball players feel that they get into a groove where they
shoot better. Here is the sequence of attempts to make a basket
by a player:
make make miss miss miss miss miss miss
make make make make miss make make make
make miss miss
#1 continued...
#1
(a) If we assume that these outcomes are all independent, then give a 95% confidence interval for p, the probability of making a basket.
Hints
Use the formula for the confidence interval for p:
$\left(\hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\right)$ (set α = 0.05)
The player made 10 of 19 shots, so p̂ = 10/19 ≈ 0.53, i.e. P(miss) ≈ P(make) ≈ 0.5
Given n = 19, we can compute $\dfrac{\hat{p}(1-\hat{p})}{n} \approx \dfrac{0.5 \cdot 0.5}{19}$
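As a rough numerical check of the hint, the interval can be evaluated directly. This sketch assumes 10 makes in 19 attempts (counted from the sequence in the problem) and uses the exact p̂ = 10/19 rather than the rounded 0.5; the function name `wald_interval` is introduced here for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, alpha=0.05):
    """Normal-approximation confidence interval for a binomial proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{alpha/2}, about 1.96
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Assumed counts: 10 makes out of 19 attempts.
low, high = wald_interval(10, 19)
print(round(low, 3), round(high, 3))
```

With so few shots the interval is wide, so the normal approximation should be read as a rough guide only.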
#1 continued...
#1
(b) Use a runs test to test our assumption that the outcomes are independent. Calculate a P-value and determine whether it is significant at the α = 0.1 level.
Hint:
(1) Number of makes/misses (n1/n2): n1 = 10, n2 = 9
(2) Count the number of runs: number of runs = 6
(3) Look up the p-value in the table on page 870: using the information from (1) and (2), we have P(R ≤ 6) = 0.029 and P(R ≥ 15) = P(R > 14) = 1 − P(R ≤ 14) = 1 − 0.974.
(4) p-value = P(R ≤ 6) + P(R ≥ 15); we can then draw a conclusion at the α = 0.1 level.
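The table lookups in the hint can be cross-checked with the standard exact null distribution of the number of runs. The sketch below is an assumption-laden illustration, not the course's method: the encoding of the shots as `M`/`X` and the helper `runs_pmf` are introduced here, and the upper cutoff 15 is taken from the hint.

```python
from itertools import groupby
from math import comb

shots = "MMXXXXXXMMMMXMMMMXX"        # M = make, X = miss, transcribed from #1
n1, n2 = shots.count("M"), shots.count("X")
r_obs = sum(1 for _ in groupby(shots))  # observed number of runs

def runs_pmf(r, n1, n2):
    """P(R = r) under H0, given n1 elements of one kind and n2 of the other."""
    if r < 2 or r > n1 + n2:
        return 0.0
    k, odd = divmod(r, 2)
    if not odd:   # r = 2k: k runs of each kind, either kind may come first
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    else:         # r = 2k + 1: one kind has k + 1 runs, the other has k
        ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
    return ways / comb(n1 + n2, n1)

lower = sum(runs_pmf(r, n1, n2) for r in range(2, r_obs + 1))   # P(R <= 6)
upper = sum(runs_pmf(r, n1, n2) for r in range(15, n1 + n2 + 1))  # P(R >= 15)
p_value = lower + upper
print(n1, n2, r_obs, round(lower, 3), round(upper, 3), round(p_value, 3))
```

The two tail probabilities agree with the table values 0.029 and 1 − 0.974 to three decimals.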
Multinomial distribution
Nature:
The experiment consists of n independent identical trials.
The outcome of each trial falls into one of k classes.
The probability that the outcome of a trial falls into class i is $p_i$, where i = 1, 2, ..., k; note $\sum_{i=1}^{k} p_i = 1$.
$Y_1, \dots, Y_k$ denote random variables, with $Y_i$ the number of trials whose outcome falls in class i; note $\sum_{i=1}^{k} Y_i = n$.
Form:
$p(y_1, y_2, \dots, y_k) = \dfrac{n!}{y_1!\, y_2! \cdots y_k!}\; p_1^{y_1} p_2^{y_2} \cdots p_k^{y_k}$
Remark:
The binomial distribution is a special case of the multinomial distribution with k = 2.
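A small sketch of the multinomial formula, with a check of the k = 2 remark. The function name `multinomial_pmf` and the numbers used in the example are hypothetical, chosen only to exercise the formula.

```python
from math import comb, factorial, isclose, prod

def multinomial_pmf(counts, probs):
    """n! / (y1! ... yk!) * p1^y1 * ... * pk^yk for counts y and probabilities p."""
    coef = factorial(sum(counts))
    for y in counts:
        coef //= factorial(y)          # exact integer division at every step
    return coef * prod(p ** y for p, y in zip(probs, counts))

# Hypothetical example with k = 2: should match the binomial(10, 0.3) pmf at 3.
two_class = multinomial_pmf([3, 7], [0.3, 0.7])
binomial = comb(10, 3) * 0.3 ** 3 * 0.7 ** 7
print(isclose(two_class, binomial))
```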
Conditional distribution
Intuition: the distribution of a random variable given some other relevant condition.
Example: the distribution of X1, X2, X3 given S = X1 + X2.
Form: $P(X_1 = x_1, X_2 = x_2, X_3 = x_3 \mid S = s) = \dfrac{P(X_1 = x_1, X_2 = x_2, X_3 = x_3, S = s)}{P(S = s)}$
Special case: conditioning a multinomial distribution on some of its partial sums gives independent binomials.
Discussion/hints for #2
#2
Suppose that X1, X2, X3, X4 are multinomially distributed with n = 16 and
$p_1 = ab$    (1)
$p_2 = (1-a)b$    (2)
$p_3 = a(1-b)$    (3)
$p_4 = (1-a)(1-b)$    (4)
for some a and b between 0 and 1. Let S = X1 + X2.
#2 continued...
#2(a)
Find P{R = 10}
Hint:
Known: if we add two of the counts from a multinomial, the result is still multinomial.
$R = X_1 + X_3 \sim \text{binomial}(n, p_1 + p_3)$
$p_1 + p_3 = ab + a(1-b) = a$
Write out the marginal distribution of $R \sim \text{binomial}(n = 16, a)$.
P(R = 10) can then be easily obtained.
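Written out under the hint's conclusion that R is binomial with n = 16 and success probability a, the marginal pmf and the requested probability take the following form. The index r is introduced here only for illustration.

```latex
P(R = r) = \binom{16}{r} a^{r} (1-a)^{16-r}, \qquad r = 0, 1, \dots, 16,
\qquad\text{so}\qquad
P(R = 10) = \binom{16}{10} a^{10} (1-a)^{6} = 8008\, a^{10} (1-a)^{6}.
```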
#2 (b)
Find P{X1 = 6 | S = 12}
Hint:
(1) By the definition of conditional probability: $P\{X_1 = x \mid S = s\} = \dfrac{P\{X_1 = x, S = s\}}{P\{S = s\}}$
(2) The numerator is $P\{X_1 = 6, S = X_1 + X_2 = 12\} = P\{X_1 = 6, X_2 = 6, X_3 + X_4 = 4\} = \dfrac{n!}{x_1!\, x_2!\, (x_3 + x_4)!}\; p_1^{x_1} p_2^{x_2} (p_3 + p_4)^{x_3 + x_4}$
(3) The denominator is similar to (a): $P(X_1 + X_2 = 12) = \dfrac{16!}{12!\, 4!}\, (p_1 + p_2)^{12} (p_3 + p_4)^{4}$
(4) The result is $P\{X_1 = 6 \mid S = 12\} = \dbinom{12}{6} a^{6} (1-a)^{6}$.
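The cancellation in steps (2) to (4) can be checked numerically. The values a = 0.3 and b = 0.6 below are arbitrary assumptions picked for the check; the problem leaves a and b unspecified.

```python
from math import comb, factorial, isclose

a, b = 0.3, 0.6   # hypothetical values, only for a numerical check
p1, p2, p3, p4 = a * b, (1 - a) * b, a * (1 - b), (1 - a) * (1 - b)

# Step (2): P{X1 = 6, X2 = 6, X3 + X4 = 4}, a three-class multinomial term.
coef = factorial(16) // (factorial(6) * factorial(6) * factorial(4))
numerator = coef * p1 ** 6 * p2 ** 6 * (p3 + p4) ** 4

# Step (3): P{X1 + X2 = 12}, a binomial term.
denominator = comb(16, 12) * (p1 + p2) ** 12 * (p3 + p4) ** 4

conditional = numerator / denominator
closed_form = comb(12, 6) * a ** 6 * (1 - a) ** 6   # step (4)
print(isclose(conditional, closed_form))
```

The factor (p3 + p4)^4 cancels, and p1/(p1 + p2) = a, which is why b drops out of the answer.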
#2 continued...
#2(c)
Suppose we want to condition on the event {S = 12, R = 9}. What are the possible values that X1 can take?
Solution: Possible outcomes
X1  X2  X3  X4
5   7   4   0
6   6   3   1
7   5   2   2
8   4   1   3
9   3   0   4
#2(d)
Find P{X1 = 0 | S = 12, R = 10}.
Solution: Following the same reasoning as the table in part (c), we have P{X1 = 0 | S = 12, R = 10} = 0 (why?)
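The table in (c) and the zero in (d) both come from requiring every count to be non-negative. A brute-force sketch, assuming R = X1 + X3 as in the hints for part (a); the helper name `feasible_x1` is hypothetical.

```python
def feasible_x1(n, s, r):
    """Values of X1 compatible with X1 + X2 = s, X1 + X3 = r and total count n."""
    values = []
    for x1 in range(n + 1):
        x2, x3 = s - x1, r - x1
        x4 = n - x1 - x2 - x3
        if min(x2, x3, x4) >= 0:     # every cell count must be non-negative
            values.append(x1)
    return values

print(feasible_x1(16, 12, 9))    # part (c)
print(feasible_x1(16, 12, 10))   # part (d): is 0 in this list?
```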
χ² goodness of fit tests
The χ² test
Goodness of fit test: tests how well the data fit the suggested distribution.
Definition and test nature:
H0: $p_1 = p_1^*, \dots, p_k = p_k^*$ vs. Ha: some other set of probabilities
Test statistic: $X^2 = \sum \dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
Expected value under H0: $E(X_i) = n p_i^*$
The $X^2$ test statistic has approximately a χ² distribution with k − 1 df.
Degrees of freedom: depend on the number of estimates made for p (one further df is lost for each parameter estimated from the data).
Hints for #3 in hw5
#3
A survey of 1275 residents in Isla Vista asked whether or not they wanted to build a new train station. 408 of the residents said that they did want to build the station, but 375 said that they did not. 345 people said they had no opinion and 147 refused to respond.
(a) Calculate a 95% confidence interval for the percentage who want to build the station.
Hint:
Use the formula for the confidence interval for p:
$\left(\hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\right)$ (set α = 0.05; same formula as in 1(a))
P(supportive of the station) = p̂ = 408/1275 = 0.32
The resulting 95% confidence interval is approximately (0.294, 0.346).
#3 continued...
#3
(b) Calculate a 95% confidence interval for the difference between the proportion who want to build the station and the proportion who do not want to build the station. Estimate the standard error using the sample proportions.
Hints
Formula: $\mathrm{Var}(X_1 - X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) - 2\,\mathrm{Cov}(X_1, X_2) = n p_1(1-p_1) + n p_2(1-p_2) + 2 n p_1 p_2$
Given n = 1275, compute $\hat{p}_1 = \dfrac{408}{1275}$ and $\hat{p}_2 = \dfrac{375}{1275}$.
Write out the variance for the difference in proportions as
$\mathrm{Var}(\hat{p}_1 - \hat{p}_2) = \dfrac{\mathrm{Var}(X_1 - X_2)}{n^2} = \dfrac{\hat{p}_1(1-\hat{p}_1)}{n} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n} + \dfrac{2\hat{p}_1\hat{p}_2}{n}$
Use the formula for the confidence interval for $p_1 - p_2$:
$\left(\hat{p}_1 - \hat{p}_2 - z_{0.025}\sqrt{\mathrm{Var}(\hat{p}_1 - \hat{p}_2)},\ \hat{p}_1 - \hat{p}_2 + z_{0.025}\sqrt{\mathrm{Var}(\hat{p}_1 - \hat{p}_2)}\right)$
The resulting 95% CI is (0.026 − 0.043, 0.026 + 0.043).
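The arithmetic behind the interval can be reproduced as follows. This is a sketch of the hinted calculation under the multinomial covariance Cov(X1, X2) = −n p1 p2; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

n = 1275
p1_hat, p2_hat = 408 / n, 375 / n   # "want" and "do not want" proportions

# Estimated Var(p1_hat - p2_hat); the covariance term adds because Cov < 0.
var_diff = (p1_hat * (1 - p1_hat) + p2_hat * (1 - p2_hat)
            + 2 * p1_hat * p2_hat) / n

z = NormalDist().inv_cdf(0.975)      # z_{0.025}, about 1.96
diff = p1_hat - p2_hat
half_width = z * sqrt(var_diff)
interval = (diff - half_width, diff + half_width)
print(round(diff, 3), round(half_width, 3))
```

Treating the two proportions as independent would drop the 2·p̂1·p̂2/n term and give an interval that is too narrow.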
#4 in hw5
#4
A gambler wants to test whether the die she has is fair so she rolls
it 300 times and records the outcomes as
Roll   1   2   3   4   5   6
Count  54  71  42  45  51  37
meaning she rolled a 1 on the die 54 times, etc.
#4 in hw5
#4
(a) A fair die has probability 1/6 for each of the six outcomes. Use a chi-squared test to test whether or not the probabilities are all equal.
Hints/Solution outline
H0: $p_i = 1/6$ for i = 1, ..., 6
Ha: at least one $p_i \neq p_j$, where i, j = 1, ..., 6, i ≠ j
The total number of rolls is 300. The expected number of times each of the six outcomes appears is 300 × 1/6 = 50.
#4 (a) continued...
Re-edited contingency table:
Roll   Obs count   Expected count   (O−E)²/E
1      54          50               0.32
2      71          50               8.82
3      42          50               1.28
4      45          50               0.5
5      51          50               0.02
6      37          50               3.38
χ² (total)                          ?
Degrees of freedom = 6 − 1 = 5 (why?)
Draw a conclusion by comparing the χ² statistic (obtained from the table) to $\chi^2_{5,\,0.05}$.
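The table's last column and its total can be computed directly. In this sketch the critical value 11.07 is the tabulated upper 5% point of the χ² distribution with 5 df, hard-coded here rather than computed; the variable names are illustrative.

```python
observed = [54, 71, 42, 45, 51, 37]
n = sum(observed)                      # 300 rolls
expected = [n / 6] * 6                 # 50 per face under H0

# One (O - E)^2 / E term per face, then the total X^2 statistic.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = sum(terms)

critical = 11.07                       # chi-square table, 5 df, alpha = 0.05
reject = x2 > critical
print([round(t, 2) for t in terms], round(x2, 2), reject)
```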
#4 in hw5
#4
(b) An oblong die will have P(1) = P(6), P(2) = P(5), and P(3) = P(4), but all of the probabilities will not necessarily be the same. Use a chi-squared test to test whether the data is consistent with an oblong die.
Hints/Solution outline
H0: P(1) = P(6), P(2) = P(5), and P(3) = P(4)
Ha: at least one equality in the null is not true.
Under H0, the estimates of the above probabilities are computed as
$\hat{p}_1 = \hat{p}_6 = \dfrac{54 + 37}{300(2)}$
$\hat{p}_2 = \hat{p}_5 = \dfrac{71 + 51}{300(2)}$
$\hat{p}_3 = \hat{p}_4 = \dfrac{42 + 45}{300(2)}$
#4 (b) continued...
Re-edited contingency table:
Roll   Obs count   Expected count       (O−E)²/E
1      54          300 × p̂1 = 45.5     ?
2      71          300 × p̂2 = 61       ?
3      42          300 × p̂3 = 43.5     ?
4      45          300 × p̂4 = 43.5     ?
5      51          300 × p̂5 = 61       ?
6      37          300 × p̂6 = 45.5     ?
χ² (total)                              6.56
Degrees of freedom = 6 − 1 − 2 = 3 (why?)
We lose 1 df because the total is fixed at 300, then 2 more df because we estimated 2 parameters (the three paired probabilities are tied together by the sum-to-one constraint, so only two are free).
Draw a conclusion by comparing the χ² statistic (obtained from the table) to $\chi^2_{3,\,0.05}$.
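The pooled expected counts and the total 6.56 can be reproduced with a short sketch. The critical value 7.815 is the tabulated upper 5% point of the χ² distribution with 3 df, hard-coded here; the pairing list and variable names are assumptions of this illustration.

```python
observed = [54, 71, 42, 45, 51, 37]
pairs = [(0, 5), (1, 4), (2, 3)]       # list indices of faces (1,6), (2,5), (3,4)

# Under H0 each pair shares one probability, estimated by pooling the pair,
# so the expected count for each face is the average of the pair's counts.
expected = [0.0] * 6
for i, j in pairs:
    expected[i] = expected[j] = (observed[i] + observed[j]) / 2

x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical = 7.815                       # chi-square table, 3 df, alpha = 0.05
reject = x2 > critical
print(expected, round(x2, 2), reject)
```

Note that the same data lead to different conclusions under the fair-die and oblong-die null hypotheses, because the oblong model is fitted to the data before it is tested.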