PSTAT 120C Probability and Statistics - Week 7
Fang-I Chu, Varvara Kulikova
University of California, Santa Barbara
May 21, 2012

Topics for review

- The runs test
  - Nature and definition
  - Hint for #1 in hw5
- Multinomial distribution
  - Nature and definition
  - Conditional distribution
  - Discussion/hints of #2 in hw5
- $\chi^2$ goodness of fit tests
  - Nature and definition of the $\chi^2$ test statistic
  - Discussion/hints of #3, #4 in hw5

The runs test

Nature:
- The test asks whether a set of outcomes represents an independent sequence or whether there is some association within the sequence.
- A test for randomness is another way to phrase a test of such independence (this is known as the runs test).

Definition:
- A run is a maximal subsequence of like elements.
- A very small number of runs (positive association) or a very large number of runs (negative association) in a sequence indicates non-randomness.

Test:
- $H_0$: each occurrence is independent vs. $H_a$: there is some association.
- The test statistic is $R$, the number of runs in the sequence.
- Rejection region: $R \le a$ or $R \ge b$.
- Level of the test: $\alpha = P(R \le a) + P(R \ge b)$.

#1 in hw5

#1 Many basketball players feel that they get into a groove where they shoot better.
Here is the sequence of attempts to make a basket by a player:

make make miss miss miss miss miss miss make make make make miss make make make make miss miss

#1 continued...

#1 (a) If we assume that these outcomes are all independent, then give a 95% confidence interval for p, the probability of making a basket.

Hints:
- Use the confidence interval formula for p with $\alpha = 0.05$:
  $\left(\hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\right)$
- The sequence has 10 makes in n = 19 attempts, so $\hat{p} = 10/19 \approx 0.53$; this is close to $P(\text{miss}) = P(\text{make}) = 0.5$.
- Given n = 19, we can compute $\hat{p}(1-\hat{p})/n \approx 0.5 \cdot 0.5 / 19$.

#1 continued...

#1 (b) Use a runs test to test our assumption that the outcomes are independent. Calculate a P-value and determine whether it is significant at the $\alpha = 0.1$ level.

Hint:
(1) Number of makes and misses ($n_1$, $n_2$): $n_1 = 10$, $n_2 = 9$.
(2) Count the number of runs: the number of runs is 6.
(3) Look up the probabilities in the table on page 870: using (1) and (2), we have $P(R \le 6) = 0.029$ and $P(R \ge 15) = P(R > 14) = 1 - P(R \le 14) = 1 - 0.974$.
(4) The p-value is $P(R \le 6) + P(R \ge 15)$; compare it with $\alpha = 0.1$ to draw the conclusion.
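
The hints for #1 can be checked numerically. The sketch below is not part of the original slides: it assumes the standard combinatorial formula for the null distribution of the number of runs, uses z = 1.96 as an approximation to $z_{0.025}$, and takes the upper cutoff 15 from step (3). The function names `wald_ci`, `count_runs`, and `runs_pmf` are illustrative only.

```python
from math import comb, sqrt

def wald_ci(successes, n, z=1.96):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def count_runs(seq):
    """Count maximal blocks of identical adjacent elements."""
    return sum(1 for i, x in enumerate(seq) if i == 0 or x != seq[i - 1])

def runs_pmf(r, n1, n2):
    """P(R = r) under H0, given n1 outcomes of one kind and n2 of the other."""
    k, odd = divmod(r, 2)
    if k < 1:
        return 0.0
    if odd:
        # r = 2k + 1: one kind forms k + 1 runs, the other kind forms k runs
        ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
    else:
        # r = 2k: each kind forms k runs, and either kind may come first
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    return ways / comb(n1 + n2, n1)

shots = ("make make miss miss miss miss miss miss make make make make "
         "miss make make make make miss miss").split()
n1, n2 = shots.count("make"), shots.count("miss")
r_obs = count_runs(shots)

ci = wald_ci(n1, n1 + n2)  # part (a)
lower_tail = sum(runs_pmf(r, n1, n2) for r in range(2, r_obs + 1))
upper_tail = sum(runs_pmf(r, n1, n2) for r in range(15, n1 + n2 + 1))
p_value = lower_tail + upper_tail  # part (b), two-sided
```

Under these assumptions the two tail probabilities come out near 0.029 and 0.026, in line with the table values quoted in step (3), so `p_value` can be compared directly with $\alpha = 0.1$.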

Multinomial distribution

Nature:
- The experiment consists of n independent, identical trials.
- The outcome of each trial falls into one of k classes.
- The probability that the outcome of a trial falls into class i is $p_i$, where $i = 1, 2, \dots, k$; note that $\sum_{i=1}^{k} p_i = 1$.
- $Y_1, \dots, Y_k$ denote random variables, with $Y_i$ the number of trials for which the outcome falls in class i; note that $\sum_{i=1}^{k} Y_i = n$.

Form:
$p(y_1, y_2, \dots, y_k) = \frac{n!}{y_1!\,y_2! \cdots y_k!}\, p_1^{y_1} p_2^{y_2} \cdots p_k^{y_k}$

Remark: the binomial distribution is the special case of the multinomial distribution with k = 2.

Conditional distribution

- Intuition: the distribution of a random variable given some other relevant condition.
- Example: the distribution of $X_1, X_2, X_3$ given $S = X_1 + X_2$.
- Form: $P(X_1 = x_1, X_2 = x_2, X_3 = x_3 \mid S = s) = \frac{P(X_1 = x_1, X_2 = x_2, X_3 = x_3, S = s)}{P(S = s)}$
- Special case: a multinomial distribution conditioned on some of its sums gives independent binomials.

Discussion/hints for #2

#2 Suppose that $X_1, X_2, X_3, X_4$ are multinomially distributed with n = 16 and

$p_1 = ab$  (1)
$p_2 = (1-a)b$  (2)
$p_3 = a(1-b)$  (3)
$p_4 = (1-a)(1-b)$  (4)

for some a and b between 0 and 1. Let $S = X_1 + X_2$.

#2 continued...

#2 (a) Find $P\{R = 10\}$.

Hint:
- Known: if we add two of the counts from a multinomial, the result is still multinomial.
- $R = X_1 + X_3 \sim \text{binomial}(n, p_1 + p_3)$.
- $p_1 + p_3 = ab + a(1-b) = a$.
- Write out the marginal distribution $R \sim \text{binomial}(n = 16, a)$.
- $P(R = 10)$ can then be obtained easily.

#2 (b) Find $P\{X_1 = 6 \mid S = 12\}$.

Hint:
(1) By the definition of conditional probability, $P\{X_1 = x \mid S = s\} = \frac{P\{X_1 = x, S = s\}}{P(S = s)}$.
(2) The numerator is $P\{X_1 = 6, S = X_1 + X_2 = 12\} = P\{X_1 = 6, X_2 = 6, X_3 + X_4 = 4\} = \frac{n!}{x_1!\,x_2!\,(x_3+x_4)!}\, p_1^{x_1} p_2^{x_2} (p_3+p_4)^{x_3+x_4}$.
(3) The denominator is found as in (a): $P(X_1 + X_2 = 12) = \frac{16!}{12!\,4!}(p_1+p_2)^{12}(p_3+p_4)^{4}$.
(4) Since $p_1/(p_1+p_2) = a$ and $p_2/(p_1+p_2) = 1-a$, the result is $P\{X_1 = 6 \mid S = 12\} = \binom{12}{6} a^6 (1-a)^6$.

#2 continued...

#2 (c) Suppose we want to condition on the event {S = 12, R = 9}. What are the possible values that $X_1$ can take?

Solution: the possible outcomes are

X1  X2  X3  X4
 5   7   4   0
 6   6   3   1
 7   5   2   2
 8   4   1   3
 9   3   0   4

#2 (d) Find $P\{X_1 = 0 \mid S = 12, R = 10\}$.

Solution: arguing as for the table in part (c), we have $P\{X_1 = 0 \mid S = 12, R = 10\} = 0$ (why?).
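
The closed forms in #2 can be sanity-checked by brute-force enumeration of the multinomial outcomes. This is only a sketch: the values a = 0.3 and b = 0.6 are hypothetical (the problem leaves a and b unspecified), and the helper names `multinomial_pmf`, `outcomes`, and `prob` are made up for illustration.

```python
from math import comb, factorial

def multinomial_pmf(counts, probs):
    """n! / (y1! ... yk!) * p1^y1 * ... * pk^yk."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact at every step
    value = float(coef)
    for c, p in zip(counts, probs):
        value *= p ** c
    return value

a, b, n = 0.3, 0.6, 16  # a and b are hypothetical example values
probs = [a * b, (1 - a) * b, a * (1 - b), (1 - a) * (1 - b)]

def outcomes(n):
    """All (x1, x2, x3, x4) with non-negative entries summing to n."""
    for x1 in range(n + 1):
        for x2 in range(n + 1 - x1):
            for x3 in range(n + 1 - x1 - x2):
                yield (x1, x2, x3, n - x1 - x2 - x3)

def prob(event):
    """Probability of an event given as a predicate on (x1, x2, x3, x4)."""
    return sum(multinomial_pmf(x, probs) for x in outcomes(n) if event(x))

# (a) R = X1 + X3 should be binomial(16, a)
p_r10 = prob(lambda x: x[0] + x[2] == 10)

# (b) X1 given S = X1 + X2 = 12 should be binomial(12, a)
cond_b = (prob(lambda x: x[0] == 6 and x[0] + x[1] == 12)
          / prob(lambda x: x[0] + x[1] == 12))

# (c) values of X1 compatible with S = 12 and R = 9
support_c = sorted({x[0] for x in outcomes(n)
                    if x[0] + x[1] == 12 and x[0] + x[2] == 9})

# (d) values of X1 compatible with S = 12 and R = 10
support_d = sorted({x[0] for x in outcomes(n)
                    if x[0] + x[1] == 12 and x[0] + x[2] == 10})
```

Since `support_d` lists every value of $X_1$ that is possible given {S = 12, R = 10}, checking whether 0 appears in it answers the "why?" in part (d).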

$\chi^2$ goodness of fit tests

The $\chi^2$ test
- Goodness of fit test: tests how well the data fit the suggested distribution.

Definition and test nature:
- $H_0: p_1 = p_1^*, \dots, p_k = p_k^*$ vs. $H_a$: some other set of probabilities.
- Test statistic: $X^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$.
- Expected value: $E(X_i) = n p_i^*$ under $H_0$.
- The $X^2$ test statistic has approximately a $\chi^2$ distribution with k − 1 df.
- Degrees of freedom: one further df is lost for each parameter of p that has to be estimated from the data.

Hints for #3 in hw5

#3 A survey of 1275 residents in Isla Vista asked whether or not they wanted to build a new train station.
408 of the residents said that they did want to build the station, but 375 said that they did not. 345 people said they had no opinion and 147 refused to respond.

(a) Calculate a 95% confidence interval for the percentage who want to build the station.

Hint:
- Use the confidence interval formula for p with $\alpha = 0.05$ (the same formula as in 1(a)):
  $\left(\hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\right)$
- $P(\text{supportive about station}) = \hat{p} = 408/1275 = 0.32$.
- The resulting 95% confidence interval is approximately (0.29, 0.35).

#3 continued...

#3 (b) Calculate a 95% confidence interval for the difference between the proportion who want to build the station and the proportion who do not want to build the station. Estimate the standard error using the sample proportions.

Hints:
- Formula: $\mathrm{Var}(X_1 - X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) - 2\,\mathrm{Cov}(X_1, X_2) = np_1(1-p_1) + np_2(1-p_2) + 2np_1p_2$, because $\mathrm{Cov}(X_1, X_2) = -np_1p_2$ for multinomial counts.
- Given n = 1275, compute $\hat{p}_1 = 408/1275$ and $\hat{p}_2 = 375/1275$.
- Write out the variance of the difference in proportions as
  $\mathrm{Var}(\hat{p}_1 - \hat{p}_2) = \frac{\mathrm{Var}(X_1 - X_2)}{n^2} = \frac{\hat{p}_1(1-\hat{p}_1)}{n} + \frac{\hat{p}_2(1-\hat{p}_2)}{n} + \frac{2\hat{p}_1\hat{p}_2}{n}$
- Use the confidence interval formula for $p_1 - p_2$:
  $\left(\hat{p}_1 - \hat{p}_2 - z_{0.025}\sqrt{\mathrm{Var}(\hat{p}_1 - \hat{p}_2)},\ \hat{p}_1 - \hat{p}_2 + z_{0.025}\sqrt{\mathrm{Var}(\hat{p}_1 - \hat{p}_2)}\right)$
- The resulting 95% CI is (0.026 − 0.043, 0.026 + 0.043).

#4 in hw5

#4 A gambler wants to test whether the die she has is fair, so she rolls it 300 times and records the outcomes as

Roll    1   2   3   4   5   6
Count  54  71  42  45  51  37

meaning she rolled a 1 on the die 54 times, etc.

#4 (a) A fair die has probability 1/6 for each of the six outcomes. Use a chi-squared test to test whether or not the probabilities are all equal.

Hints/Solution outline:
- $H_0: p_i = 1/6$ for all i vs. $H_a$: at least one $p_i \ne p_j$, where $i, j = 1, \dots, 6$, $i \ne j$.
- The total number of rolls is 300. The expected number of times each of the six outcomes appears is $300 \times \frac{1}{6} = 50$.

#4 (a) continued...
Reedited contingency table:

Roll   Observed count   Expected count   $(O-E)^2/E$
1      54               50               0.32
2      71               50               8.82
3      42               50               1.28
4      45               50               0.50
5      51               50               0.02
6      37               50               3.38
                                         $\chi^2$ = ?

- Degrees of freedom = 6 − 1 = 5 (why?).
- Draw the conclusion by comparing the $\chi^2$ statistic (obtained from the table above) to $\chi^2_{5,0.05}$.

#4 in hw5

#4 (b) An oblong die will have P(1) = P(6), P(2) = P(5), and P(3) = P(4), but the probabilities will not necessarily all be the same. Use a chi-squared test to test whether the data are consistent with an oblong die.

Hints/Solution outline:
- $H_0$: P(1) = P(6), P(2) = P(5), and P(3) = P(4) vs. $H_a$: at least one equality in the null is not true.
- Under $H_0$, the estimates of the above probabilities are computed as
  $\hat{p}_1 = \hat{p}_6 = \frac{54 + 37}{300 \cdot 2}$
  $\hat{p}_2 = \hat{p}_5 = \frac{71 + 51}{300 \cdot 2}$
  $\hat{p}_3 = \hat{p}_4 = \frac{42 + 45}{300 \cdot 2}$

#4 (b) continued...

Reedited contingency table:

Roll   Observed count   Expected count                  $(O-E)^2/E$
1      54               $300 \times \hat{p}_1 = 45.5$   ?
2      71               $300 \times \hat{p}_2 = 61$     ?
3      42               $300 \times \hat{p}_3 = 43.5$   ?
4      45               $300 \times \hat{p}_4 = 43.5$   ?
5      51               $300 \times \hat{p}_5 = 61$     ?
6      37               $300 \times \hat{p}_6 = 45.5$   ?
                                                        $\chi^2$ = 6.56

- Degrees of freedom = 6 − 1 − 2 = 3 (why?). We lose 1 df because the total is fixed at 300, and then 2 more df because we estimated 2 free parameters.
- Draw the conclusion by comparing the $\chi^2$ statistic (obtained from the table above) to $\chi^2_{3,0.05}$.
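
Both tables in #4 can be reproduced with a few lines of code. The following is a sketch rather than the course's own solution: the helper `chi_square` is an illustrative name, and the critical values 11.070 and 7.815 are the usual upper 5% points of the $\chi^2$ distribution with 5 and 3 df, copied from a standard table instead of being computed.

```python
def chi_square(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [54, 71, 42, 45, 51, 37]  # counts for rolls 1..6
n = sum(observed)                    # 300 rolls in total

# (a) Fair die: every face has expected count n / 6
fair_expected = [n / 6] * 6
x2_fair = chi_square(observed, fair_expected)

# (b) Oblong die: pool opposite faces (1,6), (2,5), (3,4), so each face
# in a pair gets half of the pair's combined count as its expected count
oblong_expected = [0.0] * 6
for i, j in [(0, 5), (1, 4), (2, 3)]:
    pooled = (observed[i] + observed[j]) / 2
    oblong_expected[i] = oblong_expected[j] = pooled
x2_oblong = chi_square(observed, oblong_expected)

# Upper 5% critical values taken from a standard chi-square table
CRIT_DF5 = 11.070  # df = 6 - 1
CRIT_DF3 = 7.815   # df = 6 - 1 - 2

reject_fair = x2_fair > CRIT_DF5
reject_oblong = x2_oblong > CRIT_DF3
```

With these inputs `x2_oblong` agrees with the 6.56 in the second table, and `x2_fair` fills in the "?" total of the first table, so each statistic can be set against its critical value as the last hint in each part suggests.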