College Prep. Stats.
Important Information for Final Exam
Name: _____________________________________
Chapter 6:
When being asked to find a probability, area, or percentage…
Normalcdf for a Standard Normal Distribution (µ = 0 and σ = 1)
P(z > a): shade to the right; means z is greater than some value a; normalcdf(a, 99999, 0, 1)
P(z < a): shade to the left; means z is less than some value a; normalcdf(–99999, a, 0, 1)
P(a < z < b): shade in between; means z is between 2 values; normalcdf(a, b, 0, 1)
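The three commands above can be double-checked without a graphing calculator; here is a sketch in Python using the standard library's statistics.NormalDist (the bounds a = –1 and b = 1 are made-up example values):

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal distribution

a, b = -1.0, 1.0  # made-up example bounds

p_right = 1 - Z.cdf(b)           # P(z > 1)      -> normalcdf(1, 99999, 0, 1)
p_left = Z.cdf(b)                # P(z < 1)      -> normalcdf(-99999, 1, 0, 1)
p_between = Z.cdf(b) - Z.cdf(a)  # P(-1 < z < 1) -> normalcdf(-1, 1, 0, 1)
```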
When being asked to find a z score, in other words, you HAVE the probability, area, or percentage…
invNorm for a Standard Normal Distribution (µ = 0 and σ = 1)
The area j to the left of z: z = invNorm(j, 0, 1)
The area q to the right of z: z = invNorm(1 – q, 0, 1)
The area r in between –z and z: z = invNorm((1 + r)/2, 0, 1)
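The invNorm commands can be sketched the same way with inv_cdf; the areas here (j = 0.93, q = 0.07, r = 0.86) are made-up examples, chosen so all three commands give the same z score:

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal distribution

j, q, r = 0.93, 0.07, 0.86  # made-up example areas

z_left = Z.inv_cdf(j)           # area j to the left     -> invNorm(0.93, 0, 1)
z_right = Z.inv_cdf(1 - q)      # area q to the right    -> invNorm(1 - 0.07, 0, 1)
z_mid = Z.inv_cdf((1 + r) / 2)  # area r between -z and z -> invNorm((1 + 0.86)/2, 0, 1)
# all three are about 1.48 here, since 0.93 = 1 - 0.07 = (1 + 0.86)/2
```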
When being asked to find a probability, area, or percentage…
Normalcdf for a Non-Standard Normal Distribution
P(x > a): shade to the right; means greater than some value; normalcdf(a, 99999, µ, σ)
P(x < a): shade to the left; means less than some value; normalcdf(–99999, a, µ, σ)
P(a < x < b): shade in between; means between 2 values; normalcdf(a, b, µ, σ)
When being asked to find an x value, in other words, you HAVE the probability, area, or percentage…
invNorm for a Non-Standard Normal Distribution
The area j to the left of x: x = invNorm(j, µ, σ)
The area q to the right of x: x = invNorm(1 – q, µ, σ)
The area r in between (symmetric about µ): x = invNorm((1 + r)/2, µ, σ)
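A quick sketch of the non-standard case, with made-up parameters (µ = 100, σ = 15 is an assumed example scale, not from the course):

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=15)  # made-up non-standard normal distribution

p_above = 1 - X.cdf(115)  # P(x > 115)                  -> normalcdf(115, 99999, 100, 15)
x_cut = X.inv_cdf(0.90)   # x with area 0.90 to its left -> invNorm(0.90, 100, 15), about 119.2
```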
When being asked to find a probability, area, or percentage for a sample mean…
Normalcdf for a Non-Standard Normal Distribution (use standard deviation σ/√n)
P(x̄ > a): shade to the right; means greater than some value; normalcdf(a, 99999, µ, σ/√n)
P(x̄ < a): shade to the left; means less than some value; normalcdf(–99999, a, µ, σ/√n)
P(a < x̄ < b): shade in between; means between 2 values; normalcdf(a, b, µ, σ/√n)
Notation: the expression zα denotes the z score with an area of α to its right.
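The sample-mean commands only change the standard deviation to σ/√n. A sketch with made-up values (µ = 100, σ = 15, n = 36):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 15, 36                       # made-up example values
Xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))  # sigma/sqrt(n) = 2.5

p = 1 - Xbar.cdf(105)  # P(x-bar > 105) -> normalcdf(105, 99999, 100, 15/sqrt(36))
```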
Chapter 7
When working with Proportions …
Critical Values (z*) for a Population Proportion p …
When finding a critical value, use the following calculator command: invNorm(area to the left, 0, 1)
Example: If the given confidence level is 86%, then α = 1 – 0.86 = 0.14, so α/2 = 0.07. To find the correct critical value, find the
area to the left (confidence level + α/2) = 0.86 + 0.07 = 0.93. Use the following computation in your technology: invNorm(0.93, 0, 1)
Margin of Error for Proportions …
E = z*·√(p̂q̂/n)
E = margin of error
zα/2 = z* = critical value
x = number of successes
n = sample size
p̂ = proportion of successes = x/n
q̂ = proportion of failures = 1 – p̂
Confidence Interval for Estimating a Population Proportion p …
p̂ – E < p < p̂ + E   OR   (p̂ – E, p̂ + E)
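Putting the pieces above together, a worked sketch in Python with made-up data (x = 60 successes in n = 100 trials, at the 86% confidence level from the example):

```python
from math import sqrt
from statistics import NormalDist

x, n, conf = 60, 100, 0.86  # made-up sample data and confidence level
p_hat = x / n               # proportion of successes
q_hat = 1 - p_hat           # proportion of failures

alpha = 1 - conf
z_star = NormalDist().inv_cdf(conf + alpha / 2)  # invNorm(0.93, 0, 1), about 1.48

E = z_star * sqrt(p_hat * q_hat / n)  # margin of error
interval = (p_hat - E, p_hat + E)     # (p-hat - E, p-hat + E)
```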
When working with means (σ is Known) …
Critical Values for a Population Mean µ when σ is Known …
When finding a critical value, use the following calculator command: invNorm(area to the left, 0, 1)
Example: If the given confidence level is 86%, then α = 1 – 0.86 = 0.14, so α/2 = 0.07. To find the correct critical value, find the
area to the left (confidence level + α/2) = 0.86 + 0.07 = 0.93. Use the following computation in your technology: invNorm(0.93, 0, 1)
Margin of Error for Means (with σ Known) …
E = z*·(σ/√n)
E = margin of error
zα/2 = z* = critical value
µ = population mean
σ = population standard deviation
x̄ = sample mean
n = sample size
Confidence Interval for Estimating a Population Mean (with σ Known) …
x̄ – E < µ < x̄ + E   OR   (x̄ – E, x̄ + E)
When working with means (σ is Not Known) …
degrees of freedom = n – 1 for the student t distribution
Critical Values (t*) for a Population Mean µ when σ is Not Known …
When finding a critical value, use the following calculator command: invT(area to the left, df)
Example: If the given confidence level is 86%, with a sample size of 28, the degrees of freedom will be n – 1, so df = 27.
α = 1 – 0.86 = 0.14, therefore, α/2 = 0.07.
To find the correct critical value, find the area to the left (Confidence level + α/2) = 0.86 + 0.07 = 0.93.
Use the following computation in your technology: invT(0.93, 27)
Margin of Error E for Estimate of µ (With σ Not Known) …
E t*
s
n
E  margin of error
t*  critical value
s = sample standard deviation
n  sample size
x  sample mean
  population mean
Confidence Interval for Estimating a Population Mean µ (with σ Not Known) …
x̄ – E < µ < x̄ + E   OR   (x̄ – E, x̄ + E)
Choosing the Appropriate Distribution …
How do we know when to use zα/2 or tα/2 (z* or t*)?
*If you are working with a categorical variable (estimating a population proportion, p) always use zα/2 (z*).
*If you are working with a quantitative variable (estimating a population mean, µ) and you DO know σ, use zα/2 (z*).
*If you are working with a quantitative variable (estimating a population mean, µ) and you DO NOT know σ, use tα/2 (t*).
**Remember that the population distribution must be normal or n must be large for quantitative variables.**
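As a worked sketch of the t-interval case, with made-up data (n = 28, x̄ = 50, s = 8, 86% confidence; the critical value invT(0.93, 27) ≈ 1.52 is taken as given from the calculator command above):

```python
from math import sqrt

n, x_bar, s = 28, 50.0, 8.0  # made-up sample summary statistics
t_star = 1.52                # invT(0.93, 27) from the calculator, with df = n - 1 = 27

E = t_star * s / sqrt(n)           # margin of error, about 2.30
interval = (x_bar - E, x_bar + E)  # (x-bar - E, x-bar + E)
```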
Chapter 8
Hypotheses for Proportions …
Null Hypothesis: H0
The null hypothesis is a statement that the value of a population parameter (such as proportion or mean) is equal to some claimed
value.
H0: p = some decimal
Alternative Hypothesis: H1
The alternative hypothesis is the statement that the parameter has a value that somehow differs from the null hypothesis.
The symbolic form of the alternative hypothesis must use one of these symbols: ≠, <, >.
H1: p < some decimal
H1: p > some decimal
H1: p ≠ some decimal
Test Statistic for Proportions …
z = (p̂ – p) / √(pq/n)
Notation
n = number of trials
p̂ = x/n (sample proportion)
p = population proportion (used in the null hypothesis)
q = 1 – p
P-Value for Proportions …
For right-tailed tests: P(z > test statistic)
*To find this probability in your calculator, type: normalcdf(z test statistic, 99999, 0, 1)
For left-tailed tests: P(z < test statistic)
*To find this probability in your calculator, type: normalcdf(–99999, z test statistic, 0, 1)
***Don’t forget if your test is two-sided, double your P-value.***
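A sketch of the whole one-proportion test in Python, with made-up data (right-tailed test of H0: p = 0.5 with x = 60, n = 100):

```python
from math import sqrt
from statistics import NormalDist

x, n, p0 = 60, 100, 0.5  # made-up data and claimed proportion
p_hat = x / n
q0 = 1 - p0

z = (p_hat - p0) / sqrt(p0 * q0 / n)  # test statistic, exactly 2.0 here
p_value = 1 - NormalDist().cdf(z)     # right-tailed: P(z > 2.0), about 0.023
# for a two-sided test, double the P-value
```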
Conclusions for Proportions …
Using the significance level α:
If P-value ≤ α, reject H0. There is enough evidence to suggest that the … (alternative hypothesis in context).
If P-value > α, fail to reject H0. There is not enough evidence to suggest that the … (alternative hypothesis in context).
Hypotheses for Means …
Null Hypothesis: H0
The null hypothesis is a statement that the value of a population parameter (such as proportion or mean) is equal to some claimed
value.
H0: µ = some number
Alternative Hypothesis: H1
The alternative hypothesis is the statement that the parameter has a value that somehow differs from the null hypothesis.
The symbolic form of the alternative hypothesis must use one of these symbols: ≠, <, >.
H1: µ < some number
H1: µ > some number
H1: µ ≠ some number
Test Statistic for Means (with σ Not Known) …
t = (x̄ – µ) / (s/√n)
Notation
n = sample size
x̄ = sample mean
µ = population mean
s = sample standard deviation
P-Value for Means …
Use the tcdf feature on your calculator, with degrees of freedom (df) = n – 1.
For right-tailed tests: P(t > test statistic)
*To find this probability in your calculator, type: tcdf(t test statistic, 99999, df)
For left-tailed tests: P(t < test statistic)
*To find this probability in your calculator, type: tcdf(–99999, t test statistic, df)
***Don’t forget if your test is two-sided, double your P-value.***
Conclusions for Means …
Using the significance level α:
If P-value ≤ α, reject H0. There is enough evidence to suggest that the … (alternative hypothesis in context).
If P-value > α, fail to reject H0. There is not enough evidence to suggest that the … (alternative hypothesis in context).
Chapter 9
Hypotheses for Two Proportions …
Null Hypothesis: H0
The null hypothesis is a statement that the value of a population parameter (such as proportion or mean) is equal to some claimed
value.
H0: p1 = p2
Alternative Hypothesis: H1
The alternative hypothesis is the statement that the parameter has a value that somehow differs from the null hypothesis.
The symbolic form of the alternative hypothesis must use one of these symbols: ≠, <, >.
H1: p1 < p2
H1: p1 > p2
H1: p1 ≠ p2
Pooled estimate …
x x
p 1 2
n1  n2
Notation
For population 1, we let:
x1 = number of successes in the sample
For population 2, we let:
x2 = number of successes in the sample
n1 = size of the sample
n2 = size of the sample
Test Statistic for Two Proportions …
z = (p̂1 – p̂2) / √( p̄q̄/n1 + p̄q̄/n2 ), (ZPROP Program)
Notation
For population 1, we let: p̂1 = x1/n1 (the sample proportion)
For population 2, we let: p̂2 = x2/n2 (the sample proportion)
p̄ = pooled estimate
q̄ = 1 – p̄
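A sketch of the pooled test statistic with made-up samples (x1 = 40 of n1 = 100, and x2 = 30 of n2 = 100):

```python
from math import sqrt

x1, n1 = 40, 100  # made-up sample 1
x2, n2 = 30, 100  # made-up sample 2
p1_hat, p2_hat = x1 / n1, x2 / n2

p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate = 0.35
q_bar = 1 - p_bar

z = (p1_hat - p2_hat) / sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
# z is about 1.48
```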
Critical value for Two Proportions …
invNorm(area to the left, 0, 1)
*left tailed test, α is in the left tail, z* = –critical value, (invNorm(area in the left tail, 0, 1))
*right tailed test, α is in the right tail, z* = +critical value, (invNorm(1 – area in the right tail, 0, 1))
*two tailed test, α is divided equally between the two tails, z* = ±critical value, (invNorm(area in the left tail, 0, 1))
P-values for Two Proportions …
Remember to find probabilities use the normalcdf feature in your calculator:
For right-tailed tests: P(z > test statistic)
*To find this probability in your calculator, type: normalcdf(z test statistic, 99999, 0, 1)
For left-tailed tests: P(z < test statistic)
*To find this probability in your calculator, type: normalcdf(–99999, z test statistic, 0, 1)
***Don’t forget if your test is two-sided, double your P-value.***
Conclusions for Two Proportions …
Using the significance level α:
If P-value ≤ α, reject H0. There is enough evidence to suggest that the … (alternative hypothesis in context).
If P-value > α, fail to reject H0. There is not enough evidence to suggest that the … (alternative hypothesis in context).
Margin of Error for Two Proportions …
E = |z*|·√( p̂1q̂1/n1 + p̂2q̂2/n2 ), (EPROP program)
Notation
For population 1, we let: x1 = number of successes in the sample; n1 = size of the sample
p̂1 = x1/n1 (the sample proportion); q̂1 = 1 – p̂1
For population 2, we let: x2 = number of successes in the sample; n2 = size of the sample
p̂2 = x2/n2 (the sample proportion); q̂2 = 1 – p̂2
z* is the POSITIVE critical value!!
Confidence Interval Estimate For Two Proportions …
( (p̂1 – p̂2) – E, (p̂1 – p̂2) + E )
Hypotheses for Two Means σ1 and σ2 Unknown …
Null Hypothesis: H0
The null hypothesis is a statement that the value of a population parameter (such as proportion or mean) is equal to some claimed
value.
H0: µ1 = µ2
Alternative Hypothesis: H1
The alternative hypothesis is the statement that the parameter has a value that somehow differs from the null hypothesis.
The symbolic form of the alternative hypothesis must use one of these symbols: ≠, <, >.
H1: µ1 < µ2
H1: µ1 > µ2
H1: µ1 ≠ µ2
Critical value for Two Means σ1 and σ2 Unknown …
Critical values: Use invT(area to the left, df) with degrees of freedom, df = the smaller of n1 – 1 and n2 – 1.
*left tailed test, α is in the left tail, t* = –critical value, (invT(area in the left tail, df))
*right tailed test, α is in the right tail, t* = +critical value (invT(1 – area in the right tail, df))
*two tailed test, α is divided equally between the two tails, t* = ±critical value, (invT(area in the left tail, df))
Test Statistic for Two Means σ1 and σ2 Unknown …
t = (x̄1 – x̄2) / √( s1²/n1 + s2²/n2 ), (TMEAN program)
Notation
For population 1, we let: x̄1 = sample mean; s1 = sample standard deviation; n1 = size of the first sample
For population 2, we let: x̄2 = sample mean; s2 = sample standard deviation; n2 = size of the second sample
P-values for Two Means σ1 and σ2 Unknown …
Use the tcdf feature on your calculator, with degrees of freedom, df = the smaller of n1 – 1 and n2 – 1.
For right-tailed tests: P(t > test statistic)
*To find this probability in your calculator, type: tcdf(t test statistic, 99999, df)
For left-tailed tests: P(t < test statistic)
*To find this probability in your calculator, type: tcdf(–99999, t test statistic, df)
***Don’t forget if your test is two-sided, double your P-value.***
Conclusions for Two Means σ1 and σ2 Unknown …
Using the significance level α:
If P-value ≤ α, reject H0. There is enough evidence to suggest that the … (alternative hypothesis in context).
If P-value > α, fail to reject H0. There is not enough evidence to suggest that the … (alternative hypothesis in context).
Margin of Error for Two Means σ1 and σ2 Unknown …
E = |t*|·√( s1²/n1 + s2²/n2 ), (EMEAN Program)
Notation
For population 1, we let: s1 = sample standard deviation; n1 = size of the sample
For population 2, we let: s2 = sample standard deviation; n2 = size of the sample
t* is the POSITIVE critical value!!
Confidence Interval Estimate For Two Means σ1 and σ2 Unknown …
( (x̄1 – x̄2) – E, (x̄1 – x̄2) + E )
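A combined sketch for two means, with made-up summaries (x̄1 = 10, s1 = 2, n1 = 15; x̄2 = 8, s2 = 3, n2 = 12; the 95% critical value invT(0.975, 11) ≈ 2.20 is taken as given from the calculator):

```python
from math import sqrt

x1_bar, s1, n1 = 10.0, 2.0, 15  # made-up sample 1 summary
x2_bar, s2, n2 = 8.0, 3.0, 12   # made-up sample 2 summary

se = sqrt(s1**2 / n1 + s2**2 / n2)  # shared denominator of t and E
t = (x1_bar - x2_bar) / se          # test statistic, about 1.98
df = min(n1 - 1, n2 - 1)            # smaller of n1 - 1 and n2 - 1 -> 11

t_star = 2.201                      # invT(0.975, 11), for a 95% interval
E = t_star * se                     # margin of error
interval = ((x1_bar - x2_bar) - E, (x1_bar - x2_bar) + E)
```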
Hypotheses for Matched Pairs …
Null Hypothesis: H0
The null hypothesis is a statement that the value of a population parameter (such as proportion or mean) is equal to some claimed
value.
H0: µd = 0
Alternative Hypothesis: H1
The alternative hypothesis is the statement that the parameter has a value that somehow differs from the null hypothesis.
The symbolic form of the alternative hypothesis must use one of these symbols: ≠, <, >.
H1: µd < 0
H1: µd > 0
H1: µd ≠ 0
Critical value for Matched Pairs …
Critical Values: Use the invT(area to the left, df), with degrees of freedom (df) = n – 1.
*left tailed test, α is in the left tail, t* = –critical value, (invT(area in the left tail, df))
*right tailed test, α is in the right tail, t* = +critical value (invT(1 – area in the right tail, df))
*two tailed test, α is divided equally between the two tails, t* = ±critical value, (invT(area in the left tail, df))
Test Statistic for Matched Pairs …
t = d̄ / (sd/√n), where degrees of freedom = n – 1
Notation for Dependent Samples
d̄ = mean value of the differences d for the paired sample data (get from calc.)
sd = standard deviation of the differences d for the paired sample data (get from calc.)
n = number of pairs of data.
P-values for Matched Pairs …
P-values: Use the tcdf feature on your calculator, with degrees of freedom (df) = n – 1.
For right-tailed tests: P(t > test statistic)
*To find this probability in your calculator, type: tcdf(t test statistic, 99999, df)
For left-tailed tests: P(t < test statistic)
*To find this probability in your calculator, type: tcdf(–99999, t test statistic, df)
***Don’t forget if your test is two-sided, double your P-value.***
Conclusions for Matched Pairs …
Using the significance level α:
If P-value ≤ α, reject H0. There is enough evidence to suggest that the … (alternative hypothesis in context).
If P-value > α, fail to reject H0. There is not enough evidence to suggest that the … (alternative hypothesis in context).
Margin of Error for Matched Pairs …
E = |t*|·(sd/√n)
Notation
sd = standard deviation of the differences d for the paired sample data (get from calc.)
n = number of pairs of data.
t* is the POSITIVE critical value!!
Confidence Interval Estimate For Matched Pairs …
( d̄ – E, d̄ + E )
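A sketch of the matched-pairs statistic with made-up differences:

```python
from math import sqrt
from statistics import mean, stdev

d = [2, -1, 3, 0, 2]  # made-up paired differences (e.g. after - before)
n = len(d)            # number of pairs

d_bar = mean(d)       # mean of the differences = 1.2
s_d = stdev(d)        # sample standard deviation of the differences

t = d_bar / (s_d / sqrt(n))  # test statistic, about 1.63, with df = n - 1 = 4
```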
Chapter 11
Goodness-of-Fit Test Notation
O represents the observed frequency of an outcome.
E represents the expected frequency of an outcome.
k represents the number of different categories or outcomes.
n represents the total number of trials.
Hypotheses for Goodness of Fit when all the probabilities are EQUAL …
H0: p1 = p2 = … = pk = probability (to find this probability, divide 1 by k)
H1: At least one of these probabilities is incorrect.
Expected Frequencies for Goodness-of-Fit when all the probabilities are EQUAL …
E = n/k (the sum of all observed frequencies divided by the number of categories)
Critical Values for Goodness-of-Fit when all the probabilities are EQUAL …
1. Found in Table A–4 using k – 1 degrees of freedom, where k = number of categories.
2. Goodness-of-Fit hypothesis tests are always right-tailed.
P-values for Goodness-of-Fit when all the probabilities are EQUAL …
On the calculator:
χ2cdf(test statistic, 99999, df)
Test Statistic for Goodness-of-Fit when all the probabilities are EQUAL …
2  
(O  E ) 2
E
Hypotheses for Goodness of Fit when all the probabilities are NOT EQUAL …
H0: p1 = probability a; p2 = probability b, …, pk = probability n
H1: At least one of these probabilities is incorrect.
Expected Frequencies for Goodness-of-Fit when all the probabilities are NOT EQUAL …
E = np (each expected frequency is found by multiplying the sum of all observed frequencies by the probability for the category)
Critical Values for Goodness-of-Fit when all the probabilities are NOT EQUAL …
1. Found in Table A–4 using k – 1 degrees of freedom, where k = number of categories.
2. Goodness-of-Fit hypothesis tests are always right-tailed.
Test Statistic for Goodness-of-Fit when all the probabilities are NOT EQUAL …
2  
(O  E ) 2
E
P-values for Goodness-of-Fit when all the probabilities are NOT EQUAL …
On the calculator:
χ2cdf(test statistic, 99999, df)
Conclusions for Goodness-of-Fit Test when all the probabilities are EQUAL and when the probabilities are NOT EQUAL …
If your p-value is less than α (significance level):
Reject H0. There is enough evidence to suggest that at least one of these probabilities is incorrect.
If your p-value is greater than α (significance level):
Fail to reject H0. There is not enough evidence to suggest that at least one of these probabilities is incorrect.
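A sketch of the equal-probabilities case with made-up counts (k = 4 categories, n = 100 observations):

```python
observed = [25, 30, 20, 25]  # made-up observed frequencies
n = sum(observed)            # total number of trials = 100
k = len(observed)            # number of categories = 4
E = n / k                    # expected frequency = 25 for every category

chi_sq = sum((O - E) ** 2 / E for O in observed)  # test statistic = 2.0
df = k - 1                                        # 3; P-value from chi-square cdf(chi_sq, 99999, df)
```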
Chi-Square Test of Independence and Chi-Square Test of Homogeneity Notation
O represents the observed frequency in a cell of a contingency table.
E represents the expected frequency in a cell, found by assuming that the row and column variables are independent.
r represents the number of rows in a contingency table (not including labels).
c represents the number of columns in a contingency table (not including labels).
Chi-Square Test of Independence: comes from a single random sample.
Hypotheses for Chi-Square Test of Independence …
H0: The row and column variables are independent.
H1: The row and column variables are dependent.
Test Statistic for Chi-Square Test of Independence …
2  
(O  E ) 2
E
where O is the observed frequency in a cell and E is the expected frequency found by evaluating
E
(row total)(column total)
(table total)
P-values for Chi-Square Test of Independence …
P-values are typically provided by computer software, or a range of P-values can be found from Table A–4.
On the calculator:
χ2cdf(test statistic, 99999, df)
Conclusions for Chi-Square Test of Independence …
If your p-value is less than α (significance level):
Reject H0. There is enough evidence to suggest that the two variables (in context) are not independent.
If your p-value is greater than α (significance level):
Fail to reject H0. There is not enough evidence to suggest that the two variables (in context) are not independent.
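A sketch of the expected-count and test-statistic computation with a made-up 2×2 table. For df = 1 only, the P-value can even be found from the standard normal, since a χ² with 1 degree of freedom is the square of a z score (this shortcut does not extend to larger tables):

```python
from math import sqrt
from statistics import NormalDist

table = [[20, 30],  # made-up 2x2 contingency table
         [30, 20]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(table):
    for j, O in enumerate(row):
        E = row_totals[i] * col_totals[j] / total  # (row total)(column total)/(table total)
        chi_sq += (O - E) ** 2 / E                 # running sum of (O - E)^2 / E

df = (len(table) - 1) * (len(table[0]) - 1)        # (r - 1)(c - 1) = 1

# df = 1 only: P(chi-square > chi_sq) = 2 * P(Z > sqrt(chi_sq))
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi_sq)))  # about 0.046
```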
Chi-Square Test of Homogeneity: comes from more than one sample or an experiment.
Hypotheses for Chi-Square Test of Homogeneity …
H0: The different populations have the same proportion of some characteristics. (in context)
H1: The proportions are different.
Test Statistic for Chi-Square Test of Homogeneity …
2  
(O  E ) 2
E
where O is the observed frequency in a cell and E is the expected frequency found by evaluating
E
(row total)(column total)
(table total)
P-values for Chi-Square Test of Homogeneity …
P-values are typically provided by computer software, or a range of P-values can be found from Table A–4.
On the calculator:
χ2cdf(test statistic, 99999, df)
Conclusions for Chi-Square Test of Homogeneity …
If your p-value is less than α (significance level):
Reject H0. There is enough evidence to suggest that the proportions are different.
If your p-value is greater than α (significance level):
Fail to reject H0. There is not enough evidence to suggest that the proportions are different.