Statistics formulae list - Singapore A Level Notes

Chapter 9: Permutations and Combinations

1. Counting Principles
• Addition principle: Let X and Y be 2 mutually exclusive events. If X can occur in x different ways and Y can occur in y different ways, then X or Y can occur in $(x + y)$ ways.
• Multiplication principle: Let P and Q be two actions. If P can occur in p different ways and Q can occur in q different ways, then P and Q can occur in succession in $(p \times q)$ ways.

2. Permutations
• Order is important.
• Number of permutations of n distinct objects taken all at a time without replacement: $n! = n(n-1)(n-2)\cdots(3)(2)(1)$, for $n \in \mathbb{Z}^+$.
• Number of permutations of n distinct objects taken r at a time without replacement: ${}^nP_r = \dfrac{n!}{(n-r)!}$, where $0 \le r \le n$.
• Number of permutations of n distinct objects taken r at a time with replacement: $n^r$, where $0 \le r \le n$.
• Number of permutations of n objects taken all at a time, of which $n_1$ are of the 1st type, $n_2$ are of the 2nd type, …, $n_k$ are of the kth type, where $n_1 + n_2 + \cdots + n_k = n$: $\dfrac{n!}{n_1!\,n_2!\cdots n_k!}$.
• Number of permutations of n distinct objects in a circle: $\dfrac{n!}{n} = (n-1)!$.

3. Combinations
• Order is not important.
• ${}^nP_r = {}^nC_r \times r!$
• Number of combinations of n distinct objects taken r at a time without replacement: ${}^nC_r = \dfrac{{}^nP_r}{r!} = \dfrac{n!}{r!\,(n-r)!}$.

Chapter S1: Probability

1. Properties involving Sets
In the following, $S$ denotes the sample space and $\emptyset$ the empty set.
• $n(\emptyset) \le n(A) \le n(S)$
• $n(\emptyset) = 0$
• If A is a subset of B, then $n(A) \le n(B)$
• $n(A) + n(A') = n(S)$
• $n(A \cup B) = n(A) + n(B) - n(A \cap B)$
• $n(A) = n(A \cap B) + n(A \cap B')$
• $n(B) = n(B \cap A) + n(B \cap A')$

2. Calculating Probability
• For a finite sample space $S$ in which all the sample points are equally likely to occur, the probability of an event E is $P(E) = \dfrac{n(E)}{n(S)}$, where $n(E)$ denotes the number of elements of the set E.
• Properties
o $0 \le P(A) \le 1$
o $P(\emptyset) = 0$ and $P(S) = 1$
o If A is a subset of B, then $P(A) \le P(B)$
o $P(A) + P(A') = 1$
o $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ [Addition principle]
o $P(A) = P(A \cap B) + P(A \cap B')$
o $P(B) = P(B \cap A) + P(B \cap A')$
o $P(A \cap B) = P(A) \times P(B \mid A)$ [Multiplication principle]
• Probability distribution. Example: the probability distribution of T is as follows:

| Total score | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| Probability | 0.2 | 0.6 | 0.1 | 0.1 |

3. Mutually Exclusive Events
• 2 events are mutually exclusive if either one event or the other can occur, but not both.
• $P(A \cap B) = 0$
• $P(A \cup B) = P(A) + P(B)$

4. Conditional Probability
• $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$

5. Independent Events
• The occurrence / non-occurrence of one event has no influence on the occurrence / non-occurrence of the other.
• $P(A \mid B) = P(A)$ or $P(B \mid A) = P(B)$
• $P(A \cap B) = P(A) \times P(B)$

6. Probability Tree
Example tree (first branch A or A', second branch B or B'):
• $P(A) = \frac{1}{3}$, followed by $P(B \mid A) = \frac{1}{4}$ and $P(B' \mid A) = \frac{3}{4}$
• $P(A') = \frac{2}{3}$, followed by $P(B \mid A') = \frac{1}{5}$ and $P(B' \mid A') = \frac{4}{5}$
From the tree:
• $P(A \cap B) = \frac{1}{3} \times \frac{1}{4} = \frac{1}{12}$
• $P(B \mid A) = \frac{1}{4}$
• $P(B) = P(B \cap A) + P(B \cap A') = \frac{1}{3} \times \frac{1}{4} + \frac{2}{3} \times \frac{1}{5} = \frac{13}{60}$
• $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Chapter S2: Binomial and Poisson Distributions

1. Binomial Distribution
• Consists of n independent trials.
• The outcome of each trial is either success or failure.
• The probability of success for each trial, p, remains constant.
• $X \sim B(n, p)$
• $E(X) = np$, $\mathrm{Var}(X) = np(1-p)$ [IN MF15]
• $P(X = x) = \dbinom{n}{x} p^x (1-p)^{n-x}$ for x = 0, 1, 2, …, n, where $\dbinom{n}{x} = {}^nC_x = \dfrac{n!}{(n-x)!\,x!}$ [IN MF15]

2. Poisson Distribution
• Events occur randomly in a given interval of time or space.
• Events are independent of each other.
• Events occur uniformly: the average rate of occurrence is a constant.
• Events occur singly: the probability of two or more events occurring within a very short interval is negligible.
• $X \sim \mathrm{Po}(\lambda)$
• $P(X = x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$ for x = 0, 1, 2, … [IN MF15]
• $E(X) = \lambda$, $\mathrm{Var}(X) = \lambda$ [IN MF15]
• If $X \sim \mathrm{Po}(\lambda_1)$ and $Y \sim \mathrm{Po}(\lambda_2)$, then $X + Y \sim \mathrm{Po}(\lambda_1 + \lambda_2)$, provided X and Y are independent.
• Poisson approximation to Binomial: If $X \sim B(n, p)$ and n is large (n > 50) and p is small such that np < 5, then $X \sim \mathrm{Po}(np)$ approximately.

Chapter S3: Normal Distribution

1. Properties of Expectation
• $E(a) = a$
• $E(aX) = aE(X)$
• $E(aX + b) = aE(X) + b$
• $E(aX \pm bY) = aE(X) \pm bE(Y)$

2. Properties of Variance
• $\mathrm{Var}(a) = 0$
• $\mathrm{Var}(aX) = a^2 \mathrm{Var}(X)$
• $\mathrm{Var}(aX + b) = a^2 \mathrm{Var}(X)$
• $\mathrm{Var}(aX \pm bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y)$, for independent X and Y

3. Normal Distribution
• $X \sim N(\mu, \sigma^2)$
• $E(X) = \mu$, $\mathrm{Var}(X) = \sigma^2$
• Standardisation: $P(X < x) = P\left(Z < \dfrac{x - \mu}{\sigma}\right)$, where $Z \sim N(0, 1)$
• Linear combinations, for independent $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$:
o $X \pm Y \sim N(\mu_1 \pm \mu_2,\ \sigma_1^2 + \sigma_2^2)$
o $aX + b \sim N(a\mu_1 + b,\ a^2\sigma_1^2)$
o $aX \pm bY \sim N(a\mu_1 \pm b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$
• Normal approximation to Binomial: If $X \sim B(n, p)$ and n is sufficiently large such that np > 5 and n(1-p) > 5, then $X \sim N(np, np(1-p))$ approximately, and $P(X = x) \approx P(x - 0.5 < X < x + 0.5)$ by continuity correction.
• Normal approximation to Poisson: If $X \sim \mathrm{Po}(\lambda)$ and $\lambda > 10$, then $X \sim N(\lambda, \lambda)$ approximately, and $P(X = x) \approx P(x - 0.5 < X < x + 0.5)$ by continuity correction.

Chapter S4A: Sampling Methods

Methods of Sampling

Simple Random Sampling (random)
• Advantages: Simple to implement.
• Disadvantages: Time consuming for very large populations to locate each participant. A sampling frame is needed, which may be difficult to obtain. A sampling unit cannot be replaced if not available.
• Example: Assign every student a number and then use a list of random numbers to select a sample of 10 students from the school.

Systematic Sampling (random)
• Advantages: Even spread over the population.
• Disadvantages: May be biased when the members of the population have a cyclic pattern. A sampling unit cannot be replaced if not available.
• Example: Arrange the names of the 130 children in alphabetical order. As k = 130/10 = 13, randomly select one name from the first 13 children. If the rth name is chosen, every 13th name from the rth name on the list will be included in the sample, i.e. the rth, (r + 13)th, (r + 2(13))th, …, (r + 9(13))th names.

Stratified Sampling (random)
• Advantages: Likely to give a good representation of the population.
• Disadvantages: A sampling frame is needed. Strata may not be clearly defined. A sampling unit cannot be replaced if not available.
• Example: The proportion of 200 staff in different age groups is as follows: 38% under 40… To obtain a sample of 40 staff, draw random samples from the age groups with sample sizes in the same proportion as the size of each age group, as follows: 15 (38% of 40) under 40…

Quota Sampling (non-random)
• Advantages: A sampling unit can be easily replaced if unavailable.
• Disadvantages: Biased, as the researcher may select people who are easier to survey.
• Example: Divide the population into age groups, e.g. 18-27… Take a fixed number from each age group so that the total number of people is 80 (the sample size). Choose people non-randomly, searching the population until the required number for each stratum has been found.

Random: each member of a population has an equal chance of being selected (must show figures to prove).

Chapter S4B: Distribution of Sample Mean and Central Limit Theorem

1. Estimates of Population Parameters
• Unbiased estimate of population mean: $\bar{x} = \dfrac{1}{n} \sum x$
• Unbiased estimate of population variance: $s^2 = \dfrac{1}{n-1} \sum (x - \bar{x})^2 = \dfrac{1}{n-1} \left[ \sum x^2 - \dfrac{(\sum x)^2}{n} \right]$ [IN MF15]
• $y = ax + b \Rightarrow s_y^2 = a^2 s_x^2$

2. Distribution of Sample Mean
• If $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$ and $Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$.
• Central Limit Theorem: If the population is non-normal, $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$ approximately if the sample size is large.

Chapter S5: Hypothesis Testing

1. Terminology
• Level of significance: the probability of rejecting the null hypothesis given that the null hypothesis is true.
o Example: the probability of concluding that the resistance is not 10 ohms when it is 10 ohms is 0.05.
• p-value: the smallest level of significance at which the null hypothesis can be rejected.

2. Framework
• State the null hypothesis and alternative hypothesis.
• Write down the test type and level of significance.
• Assume the null hypothesis is true: under H0, … and determine the distribution.
• Decide on the test statistic: using a z/t-test…
• Calculate the p-value.
• Conclusion
o Since the p-value = … > … we do not reject the null hypothesis and conclude that there is insufficient evidence, at … level of significance, that the population mean is…
o The reverse is true for p-value = … < …

3. Test Statistics
In the tables below, $s^2$ is an unbiased estimate of $\sigma^2$.

Population Normal

| | $\sigma^2$ known | $\sigma^2$ unknown |
|---|---|---|
| Large sample | $\bar{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right)$; z-test | $\bar{X} \sim N\left(\mu_0, \frac{s^2}{n}\right)$; z-test |
| Small sample | $\bar{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right)$; z-test | $T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \sim t(n-1)$; t-test |

Population Not Normal

| | $\sigma^2$ known | $\sigma^2$ unknown |
|---|---|---|
| Large sample | $\bar{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right)$ approximately by Central Limit Theorem; z-test | $\bar{X} \sim N\left(\mu_0, \frac{s^2}{n}\right)$ approximately by Central Limit Theorem; z-test |
| Small sample | Assume X follows a normal distribution; $\bar{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right)$; z-test | Assume X follows a normal distribution; $T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \sim t(n-1)$; t-test |

Chapter S6: Correlation and Regression

1. Product Moment Correlation Coefficient
• $r = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$
• Strong linear correlation: |r| > 0.9
• Must look at the scatter diagram for more accurate inferences.

2. Linear Regression
• x: independent, y: dependent → line of y on x
o $y - \bar{y} = b(x - \bar{x})$ where $b = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ [IN MF15]
• y: independent, x: dependent → line of x on y
o $x - \bar{x} = d(y - \bar{y})$ where $d = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2}$

3. Interpretation of Data
• $y = a + bx$
o a is the estimated percentage of trains that run on time when there is no investment.
o b is the expected increase in percentage for every increase of \$1000 in investment.
• Correlation does not imply causality.
o A large product moment correlation coefficient only indicates the existence of a linear relationship; x and y could both be caused by another factor z, or there may be a more complex relation.
• When asked if there is a strong linear relation, look at whether there are outliers in the scatter plot.
• Interpolation is more accurate than extrapolation.

4. Linearisation of Data
• To find the best model
o Calculate r.
o Given the data, determine whether there is a direct / inverse relationship between the variables and see whether your model fits this relationship. This can quickly eliminate options, so fewer values of r need to be calculated.
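
The formulas for r and for the regression line of y on x in Chapter S6, together with the "calculate r" step used when linearising data, can be sketched in Python. This is only a minimal illustration and is not part of the original notes: the function names `pmcc` and `line_y_on_x`, the three candidate models, and the data (generated from y = 3 + 2/x) are all hypothetical choices made for the example.

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient r of the paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def line_y_on_x(xs, ys):
    """Return (a, b) for the regression line of y on x, y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    # The line passes through (x_bar, y_bar): y - y_bar = b(x - x_bar).
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data, generated from y = 3 + 2/x purely for illustration.
xs = [1, 2, 4, 5]
ys = [3 + 2 / x for x in xs]

# Candidate linearised models: regress y on a transformed x.
candidates = {
    "y on x": xs,
    "y on 1/x": [1 / x for x in xs],
    "y on ln x": [math.log(x) for x in xs],
}

# Calculate r for each candidate and keep the one with |r| closest to 1.
r_values = {name: pmcc(ts, ys) for name, ts in candidates.items()}
best = max(r_values, key=lambda name: abs(r_values[name]))
a, b = line_y_on_x(candidates[best], ys)

for name, r in r_values.items():
    print(f"{name}: r = {r:.4f}")
print(f"best model: {best}, y = {a:.3f} + {b:.3f} * t")
```

Because y decreases as x increases here, any model that predicts a direct (increasing) relationship could be ruled out before calculating r at all, which is the shortcut the notes describe.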
