Advanced Placement Statistics Calculator Instructions

Exploratory Data Analysis

Five-number summary
  Stat / Calc / 1-Var Stats

Histogram
  1 list: set the Stat Plot, 3rd icon type; Xlist: L1; Freq: 1 (make sure the Alpha key is off); Zoom Stat
  2 lists: set the Stat Plot, 3rd icon type; Xlist: L1; Freq: L2
  Press GRAPH after the window has been set up.
  Calculator Window
    Xmin = 1 smaller than the smallest data value
    Xmax = 1 bigger than the largest data value
    Xscl = range / number of bars (integer)
    Ymin = -.5
    Ymax = the number of times the mode appears
    Yscl = 1

Normal Probability Plot
  Stat Plot, 6th icon type; Data Axis: X; Mark: choose one; Zoom Stat

Normal Distribution
  Probability between two values: normalcdf(lower, upper, mean, std dev)
  normalcdf(lower, upper): standard normal probability, as in Table A
  invNorm(prop): z-score from Table A
  invNorm(prop, mean, std dev): the value of x with that proportion to its left

Shading the Normal Distribution
  DRAW (2nd / PRGM) / ClrDraw / ENTER
  Window set up
    Xmin = 4 std dev to the left of the mean
    Xmax = 4 std dev to the right of the mean
    Xscl = range / 8
    Ymin = -.5
    Ymax = about .8 (you might have to adjust)
    Yscl = about .01 (you might have to adjust)
  DISTR (2nd / VARS) / DRAW / ShadeNorm(lower, upper, mean, std dev)

Binomial Distributions
  DISTR / A: binompdf(n, p, outcome)
    Example: probability that in 5 tries there are exactly 2 successes when p is .3
      binompdf(5, .3, 2)
  binomcdf(n, p, outcome): cumulative probability from 0 up to the outcome
    Example: probability that in 5 tries there are at most 2 successes when p is .3
      binomcdf(5, .3, 2)
    Example: probability that in 5 tries there are at least 2 successes when p is .3
      1 - binomcdf(5, .3, 1)
    Example: probability that in 5 tries there is at least 1 success when p is .3
      P(at least one) = 1 - P(none)
      1 - binomcdf(5, .3, 0)

t-Distributions
  Probability between two values: tcdf(lower, upper, degrees of freedom)
  Shading the t-Distribution
    DRAW (2nd / PRGM) / ClrDraw / ENTER
    DISTR (2nd / VARS) / DRAW / Shade_t(lower, upper, df)
  Window set up
    Xmin = -3, Xmax = 3, Xscl = 1
    Ymin = -.1, Ymax = 0.4, Yscl = 0.1

χ² Distribution
  Probability between two values: χ²cdf(lower, upper, df)
  Shading the χ²-Distribution: Shadeχ²(lower, upper, df)
  Window set up
    Xmin = 0, Xmax = 14, Xscl = 1
    Ymin = -.1, Ymax = 0.3, Yscl = 0.1

Goodness of Fit
  L1: observed
  L2: expected
  List / Math / 5: sum( , used as sum((L1 - L2)²/L2) to get the χ² statistic

Two-way Tables
  Enter the observed counts in Matrix [A] (2nd / x^-1)
  Expected counts will be placed in matrix [B]
  Stat / Tests / χ²-Test

Scatterplots, Correlation and Regression
  Correlation coefficient: CATALOG (2nd / 0) / find DiagnosticOn / ENTER / ENTER
  Regression line: Stat / Calc / 8: LinReg(a+bx) L1, L2, Y1
    To get Y1: VARS / Y-VARS / Function
    Zoom Stat
  Residuals plot
    L1: explanatory
    L2: response
    Y1: regression line
    Y1(L1) → L3 (predicted)
    L2 - L3 → L4 (observed - predicted = residual)
    Plot L1 vs. L4
    Zoom Stat

Calculator use for Regression Line and Inferences
  Standard error of the slope: $SE_b = \dfrac{s}{\sqrt{\sum (x - \bar{x})^2}}$
    $\sum (x - \bar{x})^2$ comes from Stat / 1-Var Stats on the explanatory list: $S_x^2 (n - 1)$
  Standard error about the line: $s = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}}$
    $\sum (y - \hat{y})^2$ comes from Stat / 1-Var Stats on the residuals list. Use the $\sum x^2$ term.
  Confidence interval and hypothesis testing for the regression slope: $b \pm t^* SE_b$
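The slope standard error above can be cross-checked away from the calculator. The following is a minimal Python sketch, not the handout's own method: the helper name `slope_standard_error` and the five data points are made up for illustration, and it uses only the formulas stated above (s computed from the residuals with n - 2, and the sum of squared x-deviations, which equals Sx²(n - 1)).

```python
import math
import statistics


def slope_standard_error(x, y):
    """Return (a, b, s, se_b) for the least-squares line y-hat = a + b*x.

    Hypothetical helper for illustration; x is the explanatory list (L1)
    and y is the response list (L2).
    """
    n = len(x)
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)
    # Sum of squared deviations of x; same value as Sx^2 * (n - 1).
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                # slope
    a = y_bar - b * x_bar        # intercept
    # Residuals = observed - predicted (the L4 list in the residual plot steps).
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Standard error about the line, with n - 2 degrees of freedom.
    s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    se_b = s / math.sqrt(sxx)    # standard error of the slope
    return a, b, s, se_b


# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b, s, se_b = slope_standard_error(x, y)
print(f"y-hat = {a:.3f} + {b:.3f}x, s = {s:.4f}, SE_b = {se_b:.4f}")
```

The interval b ± t* SE_b still needs t* for n - 2 degrees of freedom, which this sketch leaves to the calculator or a t table.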
  Calculator procedures for the regression slope: Stat / Tests / LinRegTInt and/or Stat / Tests / LinRegTTest

Simulations
  seq(expression, x, from, to)
  randInt(from, to, how many)
  Examples
    seq(x², x, 1, 100): the first 100 squares
    randInt(1, 6, 120): rolling a die 120 times

Inference Procedures with Normal Distributions (standard deviation is known)
  One-sample mean, confidence interval: Stat / Tests / 7: ZInterval
    Stats: sample mean and population std dev given; Data: from L1; C-Level; Calculate
  One-sample mean, test of significance: Stat / Tests / 1: Z-Test
    Stats: mean and std dev given; Data: from L1; Calculate
  Two-sample means, confidence interval: Stat / Tests / 9: 2-SampZInt
    Stats: sample means and population std devs given; Data: from L1 and L2; C-Level; Calculate
  Two-sample means, test of significance: Stat / Tests / 3: 2-SampZTest
    Stats: sample means and population std devs given; Data: from L1 and L2; Calculate

Inference Procedures with t-Distributions (standard deviation is unknown)
  One-sample mean, confidence interval: Stat / Tests / 8: TInterval
    Stats: sample mean and std dev given; Data: from L1; C-Level; Calculate
  One-sample mean, test of significance: Stat / Tests / 2: T-Test
    Stats: mean and std dev given; Data: from L1; Calculate
  Two-sample means, confidence interval: Stat / Tests / 0: 2-SampTInt
    Stats: sample means and std devs given; Data: from L1 and L2; C-Level; Calculate
  Two-sample means, test of significance: Stat / Tests / 4: 2-SampTTest
    Stats: means and std devs given; Data: from L1 and L2; Calculate

Inference Procedures for Proportions
  One-sample proportion, confidence interval: Stat / Tests / A: 1-PropZInt
    Counts and sample size; C-Level; Calculate
  One-sample proportion, test of significance: Stat / Tests / 5: 1-PropZTest
    Counts and sample size; Calculate
  Two-sample proportions, confidence interval: Stat / Tests / B: 2-PropZInt
    Counts and sample sizes; C-Level; Calculate
  Two-sample proportions, test of significance: Stat / Tests / 6: 2-PropZTest
    Counts and sample sizes; Calculate

Formulas and Conditions

Random Variables
  Expected value: $\mu_X = \sum x_i p_i$
  Rules for means (a and b are constants)
    $\mu_{a+bX} = a + b\mu_X$
    $\mu_{X+Y} = \mu_X + \mu_Y$
  Variance: $\sigma_X^2 = \sum (x_i - \mu_X)^2 p_i$
  Rules for variances
    $\sigma_{a+bX}^2 = b^2 \sigma_X^2$
    If X and Y are independent random variables:
      $\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$
      $\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2$
    General rules, where $\rho$ is the correlation between X and Y:
      $\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y$
      $\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y$

Binomial Distribution
  Setting
    1. Either success or failure
    2. Fixed number of observations
    3. n independent observations
    4. Same probability of success
  Binomial coefficient: $\binom{n}{k} = \dfrac{n!}{(n-k)!\,k!}$
  Binomial probability: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
  Mean: $\mu_X = np$
  Standard deviation: $\sigma_X = \sqrt{np(1-p)}$

Normal Approximation for Binomial Distributions
  When $np \ge 10$ and $n(1-p) \ge 10$, the binomial distribution of X is approximately normal, $N(np, \sqrt{np(1-p)})$.

Sampling Distribution of a Sample Proportion
  Mean: $\mu_{\hat{p}} = p$, so $\hat{p}$ is an unbiased estimator of p.
  Standard deviation: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$, so $\sigma_{\hat{p}}$ gets smaller as n increases.
    Condition: use this formula only when the population is at least 10 times as large as the sample. It does not apply when the sample is a large part of the population.
  The distribution can be approximated with a normal distribution, $N\left(p, \sqrt{\dfrac{p(1-p)}{n}}\right)$.
    Conditions: $np \ge 10$ and $n(1-p) \ge 10$.
  The normal approximation improves as the sample size n increases. For a fixed sample size n, the normal approximation is most accurate when p is close to 1/2 and least accurate when p is near 0 or 1.

Sampling Distribution of a Sample Mean
  Mean: $\mu_{\bar{x}} = \mu$, so $\bar{x}$ is an unbiased estimator of the population mean $\mu$.
  Standard deviation: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
    Condition: this formula can only be used when the population is at least 10 times as large as the sample.
  If the sample is an SRS from a population that has the normal distribution with mean $\mu$ and standard deviation $\sigma$, then the sample mean $\bar{x}$ has the normal distribution $N(\mu, \sigma/\sqrt{n})$, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

The Central Limit Theorem
  If an SRS of size n is drawn from any population whatsoever with mean $\mu$ and standard deviation $\sigma$, and n is large, then the sampling distribution of the sample mean $\bar{x}$ is close to the normal distribution $N(\mu, \sigma/\sqrt{n})$, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Normal Density Equation
  $y = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
  2nd / VARS / DRAW / ShadeNorm(0, 0, mean, std dev)
  Set the window to be around the mean.

General Form of Inference Procedures
  Confidence interval: Estimate ± Multiplier × Standard Error
    Multiplier × Standard Error = margin of error
  Test statistic: (Estimate - Hypothesized value) / Standard Error

Inference for a Mean, Known Variance
  Parameter of interest: $\mu$ (mean), with $\sigma$ (std dev) known
  Estimate: $\bar{x}$
  Hypothesis: $H_0: \mu = \mu_0$
  Conditions
    Data are from an SRS
    Sampling distribution of $\bar{x}$ is approximately normal
  z-interval: $\bar{x} \pm z^* SE$
  z-test: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
  Standard error: $SE = \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

Inference for a Mean, Unknown Variance (one-sample t)
  Parameter of interest: $\mu$
  Estimate: $\bar{x}$
  Hypothesis: $H_0: \mu = \mu_0$
  Conditions
    Data are from an SRS (very important)
    For n < 15, the data should be approximately normal
    For n ≥ 15, t procedures can be used if no outliers or strong skewness are present
    In case of skewness, t procedures can be used as long as n ≥ 40
  t-interval: $\bar{x} \pm t^* SE$
  t-test with (n - 1) df: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
  Standard error: $SE = \hat{\sigma}_{\bar{x}} = \dfrac{s}{\sqrt{n}}$

Inference for Two Means, Unknown Variances (two-sample t)
  Parameter of interest: $\mu_1 - \mu_2$
  Estimate: $\bar{x}_1 - \bar{x}_2$
  Hypothesis: $H_0: \mu_1 = \mu_2$
  Conditions
    SRSs from two distinct populations
    Independent samples
    Both populations are normally distributed
    Means and std devs are unknown
    When the sizes of the two samples are equal and the two populations being compared have distributions with similar shapes, probability values from the t table are quite accurate for a broad range of distributions, even when the samples are as small as n1 = n2 = 5. When the two population distributions have different shapes, larger samples are needed.
  Standard error: $SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
  Two-sample t-interval: $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
  Two-sample t-test: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
  df = smaller of n1 - 1 and n2 - 1
  Both procedures err on the safe side: they report higher P-values and lower confidence than are actually true.
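The two-sample t formulas can be worked through numerically. This is a small Python sketch under stated assumptions: the function name `two_sample_t` and the summary statistics are made up for illustration, and the degrees of freedom use the conservative "smaller of n1 - 1 and n2 - 1" rule given above rather than the calculator's own df computation.

```python
import math


def two_sample_t(mean1, s1, n1, mean2, s2, n2, null_diff=0.0):
    """Return (se, t, df) for a two-sample t procedure.

    Hypothetical helper: takes summary statistics (the "Stats" input
    style) and uses the conservative degrees of freedom.
    """
    # Standard error of the difference in sample means.
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    # Test statistic: (estimate - hypothesized difference) / standard error.
    t = ((mean1 - mean2) - null_diff) / se
    # Conservative df: smaller of n1 - 1 and n2 - 1.
    df = min(n1 - 1, n2 - 1)
    return se, t, df


# Made-up summary statistics, for illustration only.
se, t, df = two_sample_t(mean1=10.0, s1=3.0, n1=9, mean2=8.0, s2=4.0, n2=16)
print(f"SE = {se:.4f}, t = {t:.4f}, df = {df}")
```

The P-value for the resulting t and df would then come from tcdf(lower, upper, df) on the calculator or from a t table.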
  Two-sample t procedures are more robust than the one-sample t methods.

Inference for a Proportion
  Parameter of interest: p
  Estimate: $\hat{p} = \dfrac{x}{n}$
  Hypothesis: $H_0: p = p_0$
  Conditions
    Data are from an SRS
    Population is at least 10 times as large as the sample
    For a confidence interval: $n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$
    For a test: $np_0 \ge 10$ and $n(1-p_0) \ge 10$
  Approximate z-interval: $\hat{p} \pm z^* SE$
  Approximate z-test: $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$
  Standard error
    For confidence intervals: $SE = \hat{\sigma}_{\hat{p}} = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
    For hypothesis testing: $SE = \sigma_{\hat{p}} = \sqrt{\dfrac{p_0(1-p_0)}{n}}$
  Sample size given a margin of error m: $n = \left(\dfrac{z^*}{m}\right)^2 p^*(1-p^*)$, where p* is .5

Inference for Two Proportions
  Parameter of interest: $p_1 - p_2$
  Estimate: $\hat{p}_1 - \hat{p}_2 = \dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}$
  Hypothesis: $H_0: p_1 = p_2$
  Confidence intervals
    $(\hat{p}_1 - \hat{p}_2) \pm z^* SE$
    $SE = \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
    Conditions: the populations are at least 10 times as large as the samples; $n_1\hat{p}_1 \ge 5$, $n_1(1-\hat{p}_1) \ge 5$, $n_2\hat{p}_2 \ge 5$, $n_2(1-\hat{p}_2) \ge 5$
  Hypothesis testing
    Pooled proportion: $\hat{p} = \dfrac{\text{total successes in both samples}}{\text{total observations in both samples}} = \dfrac{X_1 + X_2}{n_1 + n_2}$
    $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
    Conditions: the populations are at least 10 times as large as the samples; $n_1\hat{p} \ge 5$, $n_1(1-\hat{p}) \ge 5$, $n_2\hat{p} \ge 5$, $n_2(1-\hat{p}) \ge 5$

Goodness of Fit Test (multiple proportions)
  H0: the actual population proportions are equal to the hypothesized proportions
  Ha: the actual population proportions are different from the hypothesized proportions
  n: number of outcome categories
  df: n - 1
  The χ² statistic is defined as $\chi^2 = \sum \dfrac{(O - E)^2}{E}$, where the E counts are calculated with the proportions from the H0 hypothesis.
  Conditions
    All individual expected counts are at least 1 and no more than 20% of the expected counts are less than 5
    SRS

Test for Homogeneity of Populations (two-way tables)
  H0: p1 = p2 = p3 = …
  Ha: not all proportions are equal
  The χ² statistic is defined as $\chi^2 = \sum \dfrac{(O - E)^2}{E}$, where $E = \dfrac{\text{row total} \times \text{column total}}{\text{table total}}$.
  Conditions
    All individual expected counts are at least 1 and no more than 20% of the expected counts are less than 5
    Multiple SRSs

Test of Association/Independence
  H0: there is no relationship between the two categorical variables
  Ha: there is a relationship between the two categorical variables
  The χ² statistic is defined as $\chi^2 = \sum \dfrac{(O - E)^2}{E}$, where $E = \dfrac{\text{row total} \times \text{column total}}{\text{table total}}$.
  Conditions
    All individual expected counts are at least 1 and no more than 20% of the expected counts are less than 5
    A single SRS

Linear Regression
  Correlation formula: $r = \dfrac{1}{n-1} \sum \left(\dfrac{x_i - \bar{x}}{s_x}\right)\left(\dfrac{y_i - \bar{y}}{s_y}\right)$
  Least-squares regression line: $\hat{y} = a + bx$, with slope $b = r\,\dfrac{s_y}{s_x}$ and intercept $a = \bar{y} - b\bar{x}$
  Residuals: residual $= y - \hat{y}$

Inference for the Regression Model
  Conditions
    * Observations are independent
    * True linear relationship
    * The standard deviation of the response about the true line is the same everywhere
    * The response varies normally about the true regression line
  Confidence interval for the slope of the line: $b \pm t^* SE_b$, df = n - 2
    $SE_b = \dfrac{s}{\sqrt{\sum (x - \bar{x})^2}}$
    $\sum (x - \bar{x})^2$ comes from Stat / 1-Var Stats on the explanatory list: $S_x^2 (n - 1)$
  Standard error about the line: $s = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}}$
    $\sum (y - \hat{y})^2$ comes from Stat / 1-Var Stats on the residuals list. Use the $\sum x^2$ term.
  Confidence interval for the mean response at x*: $\hat{y} \pm t^* SE_{\hat{\mu}}$, df = n - 2
    $SE_{\hat{\mu}} = s \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$
  Prediction interval at x*: $\hat{y} \pm t^* SE_{\hat{y}}$
    $SE_{\hat{y}} = s \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$

Transforming Relationships
  Algebra: defining and using logarithms
    $\log_b x = y$ if and only if $b^y = x$
  Properties of logarithms
    log(AB) = log A + log B
    log(A/B) = log A - log B
    log X^p = p log X
  Power model
    log(response) = a + k log(explanatory)
    $10^{\log(\text{response})} = 10^{a + \log(\text{explanatory}^k)}$
    response = $10^a \times \text{explanatory}^k$
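The log transformation can be tried out on a small data set. The sketch below is illustrative only: the function names `fit_power_model` and `predict` and the data (chosen to follow response = 3 × explanatory² exactly) are assumptions, not part of the handout. It fits log(response) = a + k log(explanatory) by least squares on the base-10 logs, then undoes the logs to get response = 10^a × explanatory^k.

```python
import math


def fit_power_model(explanatory, response):
    """Fit log10(response) = a + k*log10(explanatory) by least squares.

    Hypothetical helper; all data values must be positive so the
    logarithms exist.
    """
    log_x = [math.log10(v) for v in explanatory]
    log_y = [math.log10(v) for v in response]
    n = len(log_x)
    x_bar = sum(log_x) / n
    y_bar = sum(log_y) / n
    sxx = sum((u - x_bar) ** 2 for u in log_x)
    sxy = sum((u - x_bar) * (v - y_bar) for u, v in zip(log_x, log_y))
    k = sxy / sxx            # slope on the log-log scale = the power
    a = y_bar - k * x_bar    # intercept on the log-log scale
    return a, k


def predict(a, k, x):
    """Undo the logarithms: response = 10^a * explanatory^k."""
    return 10 ** a * x ** k


# Made-up data that follow response = 3 * explanatory^2 exactly.
explanatory = [1, 2, 4, 8]
response = [3, 12, 48, 192]
a, k = fit_power_model(explanatory, response)
print(f"response = {10 ** a:.3f} * explanatory^{k:.3f}")
```

On the calculator, the equivalent route would be to store log(L1) and log(L2) in new lists and run LinReg(a+bx) on those lists.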