Advanced Placement Statistics
Calculator Instructions
Exploratory Data Analysis
Five-number summary
Stat / Calc / 1-Var-Stat
Histogram
1-list: just set the Stat Plot 3rd icon type
Xlist: L1
Freq: 1 (Make sure Alpha key is off)
Zoom Stat
2-lists: just set the Stat Plot 3rd icon type
Xlist: L1
Freq: L2
Calculator Window
Xmin = 1 smaller than the smallest data value
Xmax = 1 bigger than the biggest data value
Xscl = range / number of bars (integer)
Ymin = -.5
Ymax = the number of times the mode appears
Yscl = 1
PRESS GRAPH after the window has been set up.
Normal Probability Plot
Stat Plot 6th icon type
Data Axis: X
Mark: choose one
Zoom Stat
Normal Distribution
Probability between two values:
normalcdf(lower, upper, mean, std dev)
normalcdf(lower, upper): standard normal, same as Table A
invNorm(prop): z-score from Table A
invNorm(prop, mean, std dev): the value x with area prop to its left
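For checking these commands away from the calculator, a rough Python equivalent may help. This is only a sketch: it assumes Python 3.8 or later and the standard library's statistics.NormalDist. The helper names normalcdf and inv_norm and the N(100, 15) example are made up for illustration and are not part of the calculator.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the N(mean, sd) curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

def inv_norm(prop, mean=0.0, sd=1.0):
    """Value with area `prop` to its left under the N(mean, sd) curve."""
    return NormalDist(mean, sd).inv_cdf(prop)

# Hypothetical example: values that follow N(100, 15)
print(round(normalcdf(85, 115, 100, 15), 4))  # area within one std dev of the mean
print(round(inv_norm(0.975), 2))              # z-score with 97.5% of the area to its left
print(round(inv_norm(0.975, 100, 15), 1))     # the same cutoff on the N(100, 15) scale
```

With no mean and std dev supplied, both helpers fall back to the standard normal curve, which mirrors the two-argument and one-argument calculator forms.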
Shading the Normal Distribution
DRAW (2nd / PRGM) / ClrDraw / ENTER
DISTR (2nd / VARS) / DRAW /
ShadeNorm(lower, upper, mean, std dev)
Window Set up
Xmin = 4 std dev to the left of mean
Xmax = 4 std dev to the right of mean
Xscl = range / 8
Ymin = -.5
Ymax = about .8 (you might have to adjust)
Yscl = about .01 (you might have to adjust)
Binomial Distributions
DISTR / A:
binompdf( n, p, outcome )
Example:
Probability that in 5 tries there are 2
successes when p is .3
binompdf( 5, .3, 2 )
binomcdf( n, p, from 0 to outcome )
Example:
Probability that in 5 tries there are at most
2 successes when p is .3
binomcdf( 5, .3, 2 )
Example:
Probability that in 5 tries there are at least 2
successes when p is .3
1 - binomcdf( 5, .3, 1 )
Example:
Probability that in 5 tries there are at least 1
success when p is .3
P(At least one): 1 – P(none)
1 - binomcdf( 5, .3, 0 )
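The four examples above can be reproduced off the calculator. The sketch below assumes Python 3.8 or later (for math.comb); the Python functions binompdf and binomcdf are illustrative stand-ins named after the calculator commands, not the calculator's own code.

```python
from math import comb

def binompdf(n, p, k):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomcdf(n, p, k):
    """P(X <= k): add up the probabilities of 0, 1, ..., k successes."""
    return sum(binompdf(n, p, i) for i in range(k + 1))

print(round(binompdf(5, 0.3, 2), 5))      # exactly 2 successes: 0.3087
print(round(binomcdf(5, 0.3, 2), 5))      # at most 2 successes: 0.83692
print(round(1 - binomcdf(5, 0.3, 1), 5))  # at least 2 successes: 0.47178
print(round(1 - binomcdf(5, 0.3, 0), 5))  # at least 1 success: 0.83193
```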
t-Distributions
Probability between two values:
tcdf(lower, upper, degrees of freedom)
Shading the t-Distribution
DRAW (2nd / PRGM) / ClrDraw / ENTER
DISTR (2nd / VARS) / DRAW /
Shade_t(lower, upper, df)

Window Set up
Xmin = -3
Xmax = 3
Xscl =1
Ymin = -.1
Ymax =0.4
Yscl = 0.1
χ² Distribution
Probability between two values:
χ²cdf(lower, upper, df)
Shading the χ²-Distribution
Shadeχ²(lower, upper, df)
Window Set up
Xmin = 0
Xmax = 14
Xscl =1
Ymin = -.1
Ymax =0.3
Yscl = 0.1
Goodness of Fit
L1 : observed
L2 : expected
List / Math / 5: sum((L1 - L2)² / L2)
Two-way Tables
Enter observed counts in Matrix [A] (MATRIX: 2nd / x⁻¹)
Stat / Tests / χ²-Test
Expected counts will be placed in matrix [B]
Scatterplots, Correlation and Regression
Correlation coefficient:
CATALOG ( 2nd zero) / find
DiagnosticsOn / ENTER / ENTER
Regression Line:
Stat / Calc / 8.LinReg(a+bx) L1, L2, Y1
To get Y1: VARS / Y-VARS / Function
Zoom Stat
Residuals Plot:
L1: explanatory
L2: response
Y1: regression line
Y1(L1) → L3 (Predicted)
L2 – L3 → L4 (Observed – Predicted = Residual)
Plot L1 vs. L4
Zoom Stat
Calculator use for Regression Line and Inferences
Standard error of the slope:
$SE_b = \dfrac{s}{\sqrt{\sum (x - \bar{x})^2}}$
$\sqrt{\sum (x - \bar{x})^2}$ comes from Stat/1-vars on the explanatory list: $S_x \sqrt{n - 1}$
Standard error about the line:
$s = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}}$
$\sum (y - \hat{y})^2$ comes from Stat/1-vars on the residuals list. Use the $\sum x^2$ term.
Confidence interval and Hypothesis testing for Regression
Slope: $b \pm t^* SE_b$
Stat/Test/LinRegTInterval and/or Stat/Test/LinRegTTest
Inference Procedures with Normal Distributions
Standard Deviation is known
One-sample mean Confidence Interval:
* Stat / Tests / 7: ZInterval
* Stats, sample mean and population std dev given
* Data for L1
* C-Level
* Calculate
One-sample mean Test of Significance:
* Stat / Tests / 1: Z-Test
* Stats, mean and std dev given
* Data for L1
* Calculate
Two-sample means Confidence Interval:
* Stat / Tests / 9: 2-SampZInt
* Stats, sample mean and population std dev given
* Data for L1 and L2
* C-Level
* Calculate
Two-sample means Test of Significance:
* Stat / Tests / 3: 2-SampZTest
* Stats, sample mean and population std dev given
* Data for L1 and L2
* Calculate
Inference Procedures with t-Distributions
Standard Deviation is unknown
One-sample mean Confidence Interval:
* Stat / Tests / 8: TInterval
* Stats, sample mean and std dev given
* Data for L1
* C-Level
* Calculate
One-sample mean Test of Significance:
* Stat / Tests / 2: T-Test
* Stats, mean and std dev given
* Data for L1
* Calculate
Two-sample means Confidence Interval:
* Stat / Tests / 0: 2-SampTInt
* Stats, sample mean and std dev given
* Data for L1 and L2
* C-Level
* Calculate
Two-sample means Test of Significance:
* Stat / Tests / 4: 2-SampTTest
* Stats, mean and std dev given
* Data for L1 and L2
* Calculate
Simulations
seq(expression, x, from, to)
randInt(from, to, how many)
Example
seq(x^2, x, 1, 100): the first 100 squares
randInt(1, 6, 120): rolling a die 120 times
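The seq and randInt examples (the first 100 squares, and rolling a die 120 times) have close analogues in Python. The sketch below uses only the standard library; the seed value 1 is an arbitrary choice made here so that a run can be repeated, and the variable names are illustrative.

```python
import random

# Like seq(x^2, x, 1, 100): the first 100 squares
squares = [x**2 for x in range(1, 101)]

# Like randInt(1, 6, 120): rolling a die 120 times
random.seed(1)  # arbitrary seed, only so the simulation is repeatable
rolls = [random.randint(1, 6) for _ in range(120)]

# Tally how often each face came up
counts = {face: rolls.count(face) for face in range(1, 7)}

print(squares[:5], squares[-1])
print(counts)
```

Each face should come up roughly 20 times, but the tallies will vary from seed to seed, which is the point of the simulation.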
One-sample proportion Confidence Interval:
* Stat / Tests / A: 1-PropZInt
* Count and sample size
* C-Level
* Calculate
Two-sample proportion Confidence Interval:
* Stat / Tests / B: 2-PropZInt
* Counts and sample sizes
* C-Level
* Calculate
One-sample proportion Test of Significance:
* Stat / Tests / 5: 1-PropZTest
* Count and sample size
* Calculate
Two-sample proportion Test of Significance:
* Stat / Tests / 6: 2-PropZTest
* Counts and sample sizes
* Calculate
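Formulas and Conditions
Random Variables
Expected value
$\mu_X = \sum x_i p_i$
Rules for means (a and b are constants)
$\mu_{a+bX} = a + b\mu_X$
$\mu_{X+Y} = \mu_X + \mu_Y$
Variances
$\sigma^2_X = \sum (x_i - \mu_X)^2 p_i$
Rules for variances (a and b are constants)
$\sigma^2_{a+bX} = b^2 \sigma^2_X$
When X and Y are independent random variables:
$\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y$
$\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y$
General Rules for variances ($\rho$ is the correlation between X and Y):
$\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y + 2\rho\sigma_X\sigma_Y$
$\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y - 2\rho\sigma_X\sigma_Y$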
Binomial Distribution
Setting
1. Either success or failure
2. Fixed number of observations
3. n independent observations
4. Same probability of success
Binomial Coefficient
$\binom{n}{k} = \dfrac{n!}{(n-k)!\,k!}$
Binomial Probability
$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
Binomial Distribution
Mean
$\mu_X = np$
Standard Deviation
$\sigma_X = \sqrt{np(1-p)}$
Normal Approximation for Binomial Distributions
When $np \ge 10$ and $n(1-p) \ge 10$, the binomial distribution X is approximately normal, $N(np, \sqrt{np(1-p)})$
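To see how close the approximation is, the exact binomial probability can be compared with the normal one. The sketch below is only an illustration: the values n = 100, p = 0.3 and the cutoff 25 are hypothetical, the function name normal_approx_ok is made up here, and no continuity correction is applied.

```python
from math import comb, sqrt
from statistics import NormalDist

def normal_approx_ok(n, p):
    """Rule of thumb from the text: np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

n, p, k = 100, 0.3, 25  # hypothetical values

# Exact P(X <= k) from the binomial probability formula
exact = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Approximate P(X <= k) using N(np, sqrt(np(1 - p)))
approx = NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(k)

print(normal_approx_ok(n, p))
print(round(exact, 4), round(approx, 4))
```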
Sampling Distribution of a Sample Proportion
Mean
$\mu_{\hat{p}} = p$
$\hat{p}$ is an unbiased estimator of p
Standard Deviation
$\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$
$\sigma_{\hat{p}}$ gets smaller as n increases
Conditions:
Only when the population is at least 10 times as large as the sample. This formula does not apply when the sample is a large part of the population.
It can be approximated with a normal distribution, $N\left(p, \sqrt{\dfrac{p(1-p)}{n}}\right)$,
Conditions:
when $np \ge 10$ and $n(1-p) \ge 10$
The normal approximation improves as the sample size n increases.
For fixed sample size n, the normal approximation is most accurate when p is close to $\frac{1}{2}$ and least accurate when p is near 0 or 1.
Sampling Distribution of a Sample Mean
$\bar{x}$ is an unbiased estimator of the population mean $\mu$
Mean
$\mu_{\bar{x}} = \mu$
Standard Deviation
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Conditions:
This formula can only be used when the population is at least 10 times as large as the sample.
If the sample is an SRS from a population that has the normal distribution with mean $\mu$ and standard deviation $\sigma$, then the sample mean $\bar{x}$ has the normal distribution $N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$ with mean $\mu$ and standard deviation $\dfrac{\sigma}{\sqrt{n}}$.
The Central Limit Theorem
If an SRS of size n is drawn from any population whatsoever with mean $\mu$ and standard deviation $\sigma$ and n is large, then the sampling distribution of the sample mean $\bar{x}$ is close to the normal distribution $N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$ with mean $\mu$ and standard deviation $\dfrac{\sigma}{\sqrt{n}}$.
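A small simulation can make the Central Limit Theorem concrete. The sketch below is an illustration under assumptions chosen here, not part of the handout: the population is exponential with rate 1 (strongly skewed, with mean 1 and standard deviation 1), each sample has n = 40, and 2000 samples are drawn with an arbitrary seed.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)    # arbitrary seed so the run can be repeated
n = 40            # size of each sample (hypothetical)
mu = sigma = 1.0  # an exponential population with rate 1 has mean 1 and std dev 1

# Draw 2000 samples of size n from the skewed population and keep each sample mean.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(2000)
]

print(round(mean(sample_means), 3))   # should be close to mu
print(round(stdev(sample_means), 3))  # should be close to sigma / sqrt(n)
print(round(sigma / sqrt(n), 3))
```

A histogram of sample_means would look roughly normal even though the population itself is far from normal.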
Normal Density equation
$y = \dfrac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
2nd/VARS/DRAW
ShadeNorm(0,0,mean,stD)
Set window to be around the mean
Confidence Interval:
Estimate ± Multiplier × Standard Error
Multiplier × Standard Error = margin of error
Test Statistic:
$\dfrac{\text{Estimate} - \text{Hypothesized value}}{\text{Standard Error}}$
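Both templates, estimate ± multiplier × standard error and (estimate − hypothesized value) / standard error, translate directly into code. The sketch below is a minimal illustration; the function names and the numbers (sample mean 52, known σ = 10, n = 25, hypothesized mean 50, 95% confidence) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(estimate, multiplier, standard_error):
    """Estimate plus or minus the margin of error (multiplier x standard error)."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin

def standardized_statistic(estimate, hypothesized, standard_error):
    """(Estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / standard_error

# Hypothetical one-sample z setting: x-bar = 52, sigma = 10, n = 25, H0: mu = 50
se = 10 / sqrt(25)                    # sigma / sqrt(n) = 2.0
z_star = NormalDist().inv_cdf(0.975)  # multiplier for 95% confidence
low, high = confidence_interval(52, z_star, se)
z = standardized_statistic(52, 50, se)

print(round(low, 2), round(high, 2))
print(z)
```

Only the estimate, the multiplier, and the standard error change from one procedure to the next; the two templates stay the same.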
Parameter of Interest: $\mu$ = mean, $\sigma$ = st dev
Known variance
Estimate: $\bar{x}$
Hypothesis: $H_0: \mu = \mu_0$
Conditions:
* data are from SRS
* sampling distribution of $\bar{x}$ approx normal
Multiplier & Test with DF:
z-interval: $\bar{x} \pm z^* SE$
z-test: $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Standard Error (SE):
$SE = \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Parameter of Interest: $\mu$
Unknown variance, one sample
Estimate: $\bar{x}$
Hypothesis: $H_0: \mu = \mu_0$
Conditions:
* data are from SRS (Very Important)
* sampling distribution of $\bar{x}$ approx normal for n < 15
* t procedures can be used for $n \ge 15$ if no outliers or strong skewness are present
* In case of skewness, t procedures can be used as long as $n \ge 40$
Multiplier & Test with DF:
t-interval: $\bar{x} \pm t^* SE$
t-test with (n-1) df: $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
Standard Error (SE):
$SE = \hat{\sigma}_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
Parameter of Interest: $\mu_1 - \mu_2$
Unknown variances, two sample
Estimate: $\bar{x}_1 - \bar{x}_2$
Hypothesis: $H_0: \mu_1 = \mu_2$
Conditions:
* SRS's from two distinct populations
* Independent samples
* Both populations are normally distributed
* Means and std dev are unknown
When the sizes of the two samples are equal and the two populations being compared have distributions with similar shapes, probability values from the t table are quite accurate for a broad range of distributions when the samples are as small as n1=n2=5. When the two population distributions have different shapes, larger samples are needed.
Multiplier & Test with DF:
t-interval: $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
t-test: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
df = smaller of $n_1 - 1$ and $n_2 - 1$
Both procedures err on the safe side: higher P-values and lower confidence than are actually true.
Two-sample t procedures are more robust than the one-sample t method.
Standard Error (SE):
$SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
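Parameter of Interest: p
Estimate: $\hat{p} = \dfrac{x}{n}$
Hypothesis: $H_0: p = p_0$
Conditions:
* data are from SRS
* pop at least 10 times as large as the sample
* for a confidence interval, $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$
* for a test, $np_0 \ge 10$ and $n(1 - p_0) \ge 10$
Multiplier & Test with DF:
Approx z-interval: $\hat{p} \pm z^* SE$
Approx z-test: $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$
Standard Error (SE):
$SE = \hat{\sigma}_{\hat{p}} = \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$ for confidence intervals
$SE = \sigma_{\hat{p}} = \sqrt{\dfrac{p_0(1 - p_0)}{n}}$ for hypothesis testing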
Sample size given a margin of error
$n = \left(\dfrac{z^*}{m}\right)^2 p^*(1 - p^*)$
Where m is the margin of error and $p^*$ is .5
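As a worked example of the sample-size formula, the sketch below computes n for a chosen margin of error. The function name sample_size and the example margins are illustrative; the result is rounded up because a sample cannot contain a fraction of an individual.

```python
from math import ceil
from statistics import NormalDist

def sample_size(margin, confidence=0.95, p_star=0.5):
    """n = (z*/m)^2 * p*(1 - p*), rounded up to the next whole number."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star / margin) ** 2 * p_star * (1 - p_star))

print(sample_size(0.03))  # 95% confidence, margin of error 0.03: 1068
print(sample_size(0.05))  # 95% confidence, margin of error 0.05: 385
```

Using p* = .5 gives the largest possible value of p*(1 − p*), so the resulting n is large enough whatever the true proportion turns out to be.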
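Parameter of Interest: $p_1 - p_2$
Estimate: $\hat{p}_1 - \hat{p}_2$
Confidence Intervals
$(\hat{p}_1 - \hat{p}_2) \pm z^* SE$
$SE = \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
Conditions:
* The populations are at least 10 times as large as the samples
* $n_1\hat{p}_1 \ge 5$, $n_1(1 - \hat{p}_1) \ge 5$, $n_2\hat{p}_2 \ge 5$, $n_2(1 - \hat{p}_2) \ge 5$
Hypothesis Testing
$H_0: p_1 = p_2$
$\hat{p} = \dfrac{\text{total successes in both samples}}{\text{total observations in both samples}} = \dfrac{X_1 + X_2}{n_1 + n_2}$
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Conditions:
* The populations are at least 10 times as large as the samples
* $n_1\hat{p} \ge 5$, $n_1(1 - \hat{p}) \ge 5$, $n_2\hat{p} \ge 5$, $n_2(1 - \hat{p}) \ge 5$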
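The two-proportion interval and test can be sketched in Python as follows. This is a minimal illustration, not the calculator's implementation: the function names and the counts (60 of 100 versus 45 of 100) are hypothetical, and the P-value is computed for a two-sided alternative, which is an assumption made here.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """(p1-hat - p2-hat) +/- z* SE, with the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

def two_prop_z_test(x1, n1, x2, n2):
    """z statistic with the pooled p-hat, plus a two-sided P-value (assumed alternative)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 60 successes out of 100 and 45 successes out of 100
low, high = two_prop_z_interval(60, 100, 45, 100)
z, p_value = two_prop_z_test(60, 100, 45, 100)
print(round(low, 4), round(high, 4))
print(round(z, 3), round(p_value, 4))
```

The interval uses the two separate sample proportions in its standard error, while the test pools them because the null hypothesis says the two proportions are equal.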
Here $\hat{p}_1 - \hat{p}_2 = \dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}$.
Multiple proportions
Goodness of Fit Test
H0: the actual population proportions are equal to the hypothesized proportions
Ha: the actual population proportions are different from the hypothesized proportions
n: number of outcome categories
df: n-1
The $\chi^2$ number is defined as
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
Where the E counts are calculated with the proportions from the H0 hypothesis.
Conditions:
* All individual expected counts are at least 1 and no more than 20% of the expected counts are less than 5.
* SRS
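The goodness of fit calculation, χ² as the sum of (O − E)²/E with df = n − 1, can be sketched in Python. The observed counts below are invented for illustration (120 rolls of a die tested against equal proportions), and the function names are not from the handout.

```python
def chi_square_gof(observed, proportions):
    """Return the chi-square statistic, df, and the expected counts under H0."""
    total = sum(observed)
    expected = [total * p for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df, expected

def conditions_met(expected):
    """All expected counts at least 1, and no more than 20% of them below 5."""
    small = sum(1 for e in expected if e < 5)
    return min(expected) >= 1 and small <= 0.2 * len(expected)

observed = [22, 17, 25, 18, 21, 17]  # hypothetical counts for faces 1 to 6
statistic, df, expected = chi_square_gof(observed, [1 / 6] * 6)

print(round(statistic, 4), df)  # 2.6 with 5 degrees of freedom
print(conditions_met(expected))
```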
Two Way Tables
Test for Homogeneity of populations
H0: p1 = p2 = p3 = …
Ha: not all proportions are equal
Conditions:
* All individual expected counts are at least 1 and no more than 20% of the expected counts are less than 5.
* Multiple SRS's
The $\chi^2$ number is defined as
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
Where $E = \dfrac{\text{rowTotal} \times \text{columnTotal}}{\text{TableTotal}}$
Test of Association/Independence
H0: there is no relationship between two categorical variables
Ha: there is a relationship between two categorical variables
Conditions:
* All individual expected counts are at least 1 and no more than 20% of the expected counts are less than 5.
* A single SRS
The $\chi^2$ number is defined as
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
Where $E = \dfrac{\text{rowTotal} \times \text{columnTotal}}{\text{TableTotal}}$
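For a two-way table, the expected counts and the χ² statistic can be computed as in the sketch below. The 2 × 2 table of counts is hypothetical and the function names are illustrative. The degrees of freedom, (rows − 1)(columns − 1), are the standard value for a two-way table and are not stated in the text above.

```python
def expected_counts(table):
    """E = rowTotal x columnTotal / TableTotal for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_two_way(table):
    """Sum of (O - E)^2 / E over all cells, with df = (rows - 1)(columns - 1)."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

observed = [[30, 20],
            [20, 30]]  # hypothetical observed counts

print(expected_counts(observed))     # every expected count is 25.0
print(chi_square_two_way(observed))  # (4.0, 1)
```

The same calculation serves both the homogeneity test and the independence test; only the hypotheses and the sampling design differ.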
Linear Regression
Correlation Formula
$r = \dfrac{1}{n - 1} \sum \left(\dfrac{x_i - \bar{x}}{s_x}\right)\left(\dfrac{y_i - \bar{y}}{s_y}\right)$
Least Square Regression Line
$\hat{y} = a + bx$
with slope $b = r\dfrac{s_y}{s_x}$
and intercept $a = \bar{y} - b\bar{x}$
Residuals
residual $= y - \hat{y}$
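The correlation, slope, intercept, and residual formulas can be checked on a small data set. The sketch below uses only the standard library; the five data points are made up for illustration, and the function name least_squares is not from the handout.

```python
from statistics import mean, stdev

def least_squares(x, y):
    """Return r, intercept a, slope b, and the residuals y - y-hat."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations
    r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
            for xi, yi in zip(x, y)) / (n - 1)
    b = r * s_y / s_x      # slope
    a = y_bar - b * x_bar  # intercept
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return r, a, b, residuals

x = [1, 2, 3, 4, 5]  # hypothetical explanatory values
y = [2, 4, 5, 4, 5]  # hypothetical response values
r, a, b, residuals = least_squares(x, y)

print(round(r, 4), round(a, 4), round(b, 4))  # 0.7746 2.2 0.6
print([round(e, 2) for e in residuals])
```

The residuals from a least-squares line always add up to zero apart from rounding, which is a quick check on the arithmetic.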
Inference for Regression
Model
Conditions:
* Observations are independent
* True linear relationship
* The standard deviation of the
response about the true lines is
the same everywhere
* The response varies normally about
the true regression line.
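Confidence Interval for the slope of the line:
$b \pm t^* SE_b$, df = n-2
$SE_b = \dfrac{s}{\sqrt{\sum (x - \bar{x})^2}}$
$\sqrt{\sum (x - \bar{x})^2}$ comes from Stat/1-vars on the explanatory list: $S_x \sqrt{n - 1}$
Standard error about the line
$s = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}}$
$\sum (y - \hat{y})^2$ comes from Stat/1-vars on the residuals list. Use the $\sum x^2$ term.
Confidence Interval for the mean response $\mu_y$
$\hat{y} \pm t^* SE_{\hat{\mu}}$, df = n-2
$SE_{\hat{\mu}} = s\sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$
Prediction Interval for $\hat{y}$
$\hat{y} \pm t^* SE_{\hat{y}}$
$SE_{\hat{y}} = s\sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$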
Transforming Relationships
Algebra
Defining and using Logarithms
$\log_b x = y$ if and only if $b^y = x$
Properties of Logarithms
log(AB) = log A + log B
log(A/B) = log A – log B
$\log X^p = p \log X$
log(response) = a + k log(explanatory)
$10^{\log(\text{response})} = 10^{a + \log(\text{explanatory}^k)}$
Response = $10^a \times \text{explanatory}^k$
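The log transformation can be tried out on data that follow a power law. The sketch below fits a least-squares line to (log explanatory, log response) and then undoes the logarithm, giving response = 10^a × explanatory^k. The data are invented so that response = 3 × explanatory² exactly, and the function names are illustrative.

```python
from math import log10

def fit_power_model(explanatory, response):
    """Least-squares line for log(response) = a + k log(explanatory)."""
    lx = [log10(v) for v in explanatory]
    ly = [log10(v) for v in response]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    k = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = my - k * mx
    return a, k

def predict(a, k, x):
    """Back-transform: response = 10^a * explanatory^k."""
    return 10 ** a * x ** k

xs = [1, 2, 4, 8]       # hypothetical explanatory values
ys = [3, 12, 48, 192]   # hypothetical responses, equal to 3 * x^2
a, k = fit_power_model(xs, ys)

print(round(k, 4), round(10 ** a, 4))  # power 2 and multiplier 3, up to rounding
print(round(predict(a, k, 5), 4))      # close to 3 * 5^2 = 75
```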