Chapter 2 Descriptive Statistics
Sample mean: \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i

Sample variance: s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = \frac{(x_1-\bar{x})^2 + (x_2-\bar{x})^2 + \cdots + (x_n-\bar{x})^2}{n-1}

Sample standard deviation: s = \sqrt{s^2}

Calculating the sample variance (computational formula for s^2):
s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]

Empirical Rule: For a normally distributed population, this rule tells us that 68.26 percent, 95.44 percent, and 99.73 percent of the population measurements are within one, two, and three standard deviations, respectively, of the population mean.

Chebyshev's theorem: A theorem that (for any population) allows us to find an interval that contains a specified percentage of the individual measurements in the population.

z score: z = \frac{x - \text{mean}}{\text{standard deviation}}

Coefficient of variation: \text{Coefficient of variation} = \frac{\text{Standard deviation}}{\text{Mean}} \times 100

pth percentile: For a set of measurements arranged in increasing order, a value such that p percent of the measurements fall at or below the value, and (100 − p) percent of the measurements fall at or above the value.

Weighted mean: \bar{x} = \frac{\sum w_i x_i}{\sum w_i}

Sample mean for grouped data: \bar{x} = \frac{\sum f_i M_i}{n}

Sample variance for grouped data: s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n-1}

Population mean for grouped data: \mu = \frac{\sum f_i M_i}{N}

Population variance for grouped data: \sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}
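As a quick check on the two equivalent sample-variance formulas above, here is a minimal Python sketch (the function names are my own, chosen for illustration):

```python
def sample_variance_definitional(xs):
    """s^2 = sum((x_i - xbar)^2) / (n - 1), the definitional form."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_variance_computational(xs):
    """Computational shortcut: (sum(x^2) - (sum x)^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

data = [10, 12, 14, 16, 18]
s2_def = sample_variance_definitional(data)
s2_comp = sample_variance_computational(data)
print(s2_def, s2_comp)  # both give 10.0 for this data
```

Both forms agree term for term; the computational form only avoids computing every deviation from the mean.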
Chapter 3 Probability
Computing the probability of an event:
P(\text{event}) = \frac{\text{Number of sample space outcomes that correspond to the event}}{\text{Total number of sample space outcomes}}

The rule of complements: P(\bar{A}) = 1 - P(A)

The addition rule: P(A \cup B) = P(A) + P(B) - P(A \cap B)

Mutually exclusive events: P(A \cap B) = 0

The addition rule for two mutually exclusive events: P(A \cup B) = P(A) + P(B)

The addition rule for N mutually exclusive events:
P(A_1 \cup A_2 \cup \cdots \cup A_N) = P(A_1) + P(A_2) + \cdots + P(A_N)

Conditional probability: P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad P(B \mid A) = \frac{P(A \cap B)}{P(A)}

The general multiplication rule: P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)

Independent events: P(A \mid B) = P(A) \text{ and } P(B \mid A) = P(B)

The multiplication rule for two independent events: P(A \cap B) = P(A)\,P(B)

The multiplication rule for N independent events:
P(A_1 \cap A_2 \cap \cdots \cap A_N) = P(A_1)\,P(A_2)\cdots P(A_N)

Bayes' theorem: P(S_i \mid E) = \frac{P(S_i \cap E)}{P(E)} = \frac{P(S_i)\,P(E \mid S_i)}{P(E)}
Chapter 4 Discrete Random Variables
Properties of a discrete probability distribution P(x): P(x) \ge 0 \text{ and } \sum_{\text{all } x} P(x) = 1

The mean, or expected value, of a discrete random variable: \mu_x = \sum_{\text{all } x} x \, P(x)

The variance and standard deviation of a discrete random variable:
\sigma_x^2 = \sum_{\text{all } x} (x - \mu_x)^2 \, P(x), \quad \sigma_x = \sqrt{\sigma_x^2}

The binomial distribution: P(X = x) = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}

The mean, variance, and standard deviation of a binomial random variable:
\mu_x = np, \quad \sigma_x^2 = npq, \quad \sigma_x = \sqrt{npq}

The Poisson distribution: P(x) = \frac{e^{-\mu}\mu^x}{x!}

The mean, variance, and standard deviation of a Poisson random variable:
\mu_x = \mu, \quad \sigma_x^2 = \mu, \quad \sigma_x = \sqrt{\mu}

The hypergeometric distribution: P(x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}

The mean and variance of a hypergeometric random variable:
\mu_x = n\left(\frac{r}{N}\right), \quad \sigma_x^2 = n\left(\frac{r}{N}\right)\left(1 - \frac{r}{N}\right)\left(\frac{N-n}{N-1}\right)
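The binomial and Poisson formulas above are easy to verify numerically: summing x P(x) and (x − μ)² P(x) over all x should reproduce np and npq. A short sketch with made-up parameters (n = 10, p = 0.3):

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    """P(x) = e^(-mu) mu^x / x!."""
    return exp(-mu) * mu**x / factorial(x)

n, p = 10, 0.3
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mean)**2 * binomial_pmf(x, n, p) for x in range(n + 1))
print(mean, var)  # mean = np = 3.0, var = npq = 2.1 (up to rounding)
```

The same check works for the Poisson pmf, whose mean and variance both equal μ.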
Chapter 5 Continuous Random Variables
Properties of a continuous probability distribution: f(x) \ge 0, and the total area under the curve f(x) equals 1.

The uniform distribution:
f(x) = \begin{cases} \dfrac{1}{d-c} & \text{for } c \le x \le d \\ 0 & \text{otherwise} \end{cases}
\quad \text{with } \mu_x = \frac{c+d}{2} \text{ and } \sigma_x = \frac{d-c}{\sqrt{12}}

The normal probability distribution: f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}

z values: z = \frac{x - \mu}{\sigma}

The standard normal distribution: the normal distribution with \mu = 0 and \sigma = 1.

Normal approximation to the binomial distribution: Consider a binomial random variable x where n is the number of trials and p is the probability of success. If np \ge 5 and n(1 - p) \ge 5, then x is approximately normal with mean \mu = np and standard deviation \sigma = \sqrt{npq}. To standardize (with the continuity correction), use z = \frac{(x - 0.5) - \mu}{\sigma} or z = \frac{(x + 0.5) - \mu}{\sigma}.

The exponential distribution:
f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \ge 0 \\ 0 & \text{otherwise} \end{cases}
\quad \text{and } P(x \le a) = 1 - e^{-\lambda a}

Mean and standard deviation of an exponential distribution: \mu_x = \frac{1}{\lambda} \text{ and } \sigma_x = \frac{1}{\lambda}
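The normal approximation with continuity correction can be sketched using only the standard library (the example numbers, n = 100 and p = 0.5, are my own):

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Approximate P(x <= 55) for a binomial with n = 100, p = 0.5.
n, p = 100, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))
z = (55 + 0.5 - mu) / sigma  # continuity correction: use x + 0.5 for P(x <= 55)
approx = normal_cdf(z)
print(approx)  # roughly 0.864
```

Here np = n(1 − p) = 50 ≥ 5, so the approximation condition stated above is comfortably met.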
Chapter 6 Sampling Distributions
Sampling distribution of the sample mean: If x has mean \mu and standard deviation \sigma, then \bar{x} has mean \mu_{\bar{x}} = \mu and standard deviation \sigma_{\bar{x}} = \sigma/\sqrt{n}. In addition, if x follows a normal distribution, then \bar{x} also follows a normal distribution.

Standard deviation of the sampling distribution of the sample mean: \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

Central limit theorem: If the sample size n is sufficiently large (at least 30), then \bar{x} will follow an approximately normal distribution with mean \mu_{\bar{x}} = \mu and standard deviation \sigma_{\bar{x}} = \sigma/\sqrt{n}.

Sampling distribution of the sample proportion: If np \ge 5 and n(1 - p) \ge 5, then \hat{p} is approximately normal with mean \mu_{\hat{p}} = p and standard deviation \sigma_{\hat{p}} = \sqrt{p(1-p)/n}.

Standard deviation of the sampling distribution of the sample proportion: \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}
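The two standard-error formulas above are one-liners; a small sketch showing how each shrinks as n grows (the numbers are arbitrary):

```python
from math import sqrt

def stderr_mean(sigma, n):
    """sigma_xbar = sigma / sqrt(n)."""
    return sigma / sqrt(n)

def stderr_prop(p, n):
    """sigma_phat = sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

print(stderr_mean(10, 25))    # 10 / 5 = 2.0
print(stderr_prop(0.5, 100))  # sqrt(0.25 / 100) = 0.05
```

Quadrupling the sample size halves either standard error, which is the practical content of the sqrt(n) in the denominator.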
Chapter 7 Hypothesis Testing
Hypothesis testing steps: 1. State the null and alternative hypotheses. 2. Specify the level of significance. 3. Select the test statistic. 4. Find the critical value (or compute the p-value). 5. Compare the value of the test statistic to the critical value (or the p-value to the level of significance) and decide whether to reject H0.

Hypothesis test about a population mean (σ known): z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}

Large-sample hypothesis test about a population proportion: z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}

Sampling distribution of \bar{x}_1 - \bar{x}_2 (independent random samples): \bar{x}_1 - \bar{x}_2 has mean \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 and standard deviation \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}

Hypothesis test about a difference in population means (σ1 and σ2 known): z_0 = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}

Large-sample hypothesis test about a difference in population proportions where p1 = p2:
z_0 = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \quad \hat{p} = \frac{\text{total number of successes in both samples}}{\text{total number of trials in both samples}}

Large-sample hypothesis test about a difference in population proportions where p1 ≠ p2:
z_0 = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}

Calculating the probability of a Type II error: z^* = \frac{\mu_0 - \mu_a}{\sigma/\sqrt{n}}

Sample-size determination to achieve specified values of α and β: n = \frac{(z^* + z_\beta)^2\,\sigma^2}{(\mu_0 - \mu_a)^2}
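The five testing steps above can be sketched for the σ-known mean test; the sample values are invented for illustration:

```python
from math import erf, sqrt

def z_test_mean(xbar, mu0, sigma, n):
    """z0 = (xbar - mu0) / (sigma / sqrt(n)); returns (z0, two-sided p-value)."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    phi = 0.5 * (1 + erf(abs(z0) / sqrt(2)))  # standard normal CDF at |z0|
    return z0, 2 * (1 - phi)

# Hypothetical: test H0: mu = 50 vs Ha: mu != 50 with xbar = 52, sigma = 8, n = 64.
z0, pval = z_test_mean(xbar=52, mu0=50, sigma=8, n=64)
print(z0, pval)  # z0 = 2.0; p-value about 0.046, so reject H0 at alpha = 0.05
```

Comparing the p-value to α (step 5) gives the same decision as comparing z0 to the critical value.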
Chapter 8 Comparing Population Means and Variances Using t Tests and F Ratios
t test about μ: t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}

t test about μ1 − μ2 when σ1² = σ2²:
t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \quad s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}, \quad df = n_1 + n_2 - 2

t test about μ1 − μ2 when σ1² ≠ σ2²:
t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \quad df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}

Hypothesis test about μd: t_0 = \frac{\bar{d} - D_0}{s_d/\sqrt{n}}

Sampling distribution of s1²/s2² (independent random samples): If \sigma_1^2 = \sigma_2^2, then \frac{s_1^2}{s_2^2} has an F distribution with df1 = n1 − 1 and df2 = n2 − 1.
Hypothesis test about the equality of σ1² and σ2²: For H_a: \sigma_1^2 > \sigma_2^2, F = \frac{s_1^2}{s_2^2} and H_0 is rejected if F exceeds the critical value F_\alpha. For H_a: \sigma_1^2 \ne \sigma_2^2, F = \frac{s_1^2}{s_2^2} and H_0 is rejected if F exceeds the critical value F_{\alpha/2}.
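The pooled two-sample t statistic above can be computed directly from raw data; the two small samples below are made up for illustration:

```python
from math import sqrt

def pooled_t(x1, x2, d0=0.0):
    """t0 = ((xbar1 - xbar2) - D0) / sqrt(sp^2 (1/n1 + 1/n2)); returns (t0, df)."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    s1sq = sum((v - m1) ** 2 for v in x1) / (n1 - 1)
    s2sq = sum((v - m2) ** 2 for v in x2) / (n2 - 1)
    # Pooled variance weights each sample variance by its degrees of freedom.
    sp2 = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)
    t0 = (m1 - m2 - d0) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t0, n1 + n2 - 2

t0, df = pooled_t([5, 7, 9], [4, 6, 8])
print(t0, df)  # df = n1 + n2 - 2 = 4
```

The statistic would then be compared with a t critical value on df = n1 + n2 − 2 degrees of freedom.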
Chapter 9 Confidence Intervals
z-based confidence interval for a population mean μ with σ known: \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}

t-based confidence interval for a population mean μ with σ unknown: \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}, \quad df = n - 1

Sample size when estimating μ: n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2

Large-sample confidence interval for a population proportion p: \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}

Sample size when estimating p: n = p(1-p)\left(\frac{z_{\alpha/2}}{E}\right)^2

t-based confidence interval for μ1 − μ2 when σ1² = σ2²:
(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \quad s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}, \quad df = n_1 + n_2 - 2

t-based confidence interval for μ1 − μ2 when σ1² ≠ σ2²:
(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}, \quad df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}

Large-sample confidence interval for a difference in population proportions:
(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
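The large-sample interval for a proportion is a one-line calculation; a minimal sketch with invented counts (40 successes in 100 trials) and the usual z = 1.96 for 95 percent confidence:

```python
from math import sqrt

def prop_ci(successes, n, z=1.96):
    """Large-sample CI: phat +/- z * sqrt(phat (1 - phat) / n)."""
    phat = successes / n
    half = z * sqrt(phat * (1 - phat) / n)
    return phat - half, phat + half

lo, hi = prop_ci(40, 100)
print(lo, hi)  # interval centered at phat = 0.4
```

The interval is symmetric about p̂, so its midpoint recovers the point estimate exactly.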
Chapter 10 Experimental Design and Analysis of Variance
One-way ANOVA sums of squares:
SSB = \sum_{i=1}^{p} n_i(\bar{x}_i - \bar{x})^2, \quad SSE = \sum_{j=1}^{n_1}(x_{1j} - \bar{x}_1)^2 + \sum_{j=1}^{n_2}(x_{2j} - \bar{x}_2)^2 + \cdots + \sum_{j=1}^{n_p}(x_{pj} - \bar{x}_p)^2

The sum of squares total (SST) is SST = SSB + SSE

The between-groups mean square (MSB) is MSB = \frac{SSB}{p-1}

The mean square error (MSE) is MSE = \frac{SSE}{n-p}

One-way ANOVA F test: F = \frac{MSB}{MSE} = \frac{SSB/(p-1)}{SSE/(n-p)}

Estimation in one-way ANOVA: individual 100(1 − α)% confidence interval for μi − μh:
(\bar{x}_i - \bar{x}_h) \pm t_{\alpha/2}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_h}\right)}, \quad df = n - p

Estimation in one-way ANOVA: Tukey simultaneous 100(1 − α)% confidence interval for μi − μh:
(\bar{x}_i - \bar{x}_h) \pm q\sqrt{\frac{MSE}{2}\left(\frac{1}{n_i} + \frac{1}{n_h}\right)}, with q corresponding to p and n − p.

Estimation in one-way ANOVA: individual 100(1 − α)% confidence interval for μi:
\bar{x}_i \pm t_{\alpha/2}\sqrt{\frac{MSE}{n_i}}, \quad df = n - p

Randomized block sums of squares:
SSB = b\sum_{i=1}^{p}(\bar{x}_i - \bar{x})^2, \quad SSBL = p\sum_{j=1}^{b}(\bar{x}_j - \bar{x})^2, \quad SST = \sum_{i=1}^{p}\sum_{j=1}^{b}(x_{ij} - \bar{x})^2, \quad SSE = SST - SSB - SSBL

Estimation in a randomized block experiment: individual 100(1 − α)% confidence interval for μi − μh:
(\bar{x}_i - \bar{x}_h) \pm t_{\alpha/2}\, s\sqrt{\frac{2}{b}}, \quad df = (p-1)(b-1), \quad s = \sqrt{MSE}

Estimation in a randomized block experiment: Tukey simultaneous 100(1 − α)% confidence interval for μi − μh:
(\bar{x}_i - \bar{x}_h) \pm q\,\frac{s}{\sqrt{b}}, with q corresponding to p and (p − 1)(b − 1).

Two-way ANOVA sums of squares:
SS(1) = bm\sum_{i=1}^{a}(\bar{x}_i - \bar{x})^2, \quad SS(2) = am\sum_{j=1}^{b}(\bar{x}_j - \bar{x})^2, \quad SS(\text{int}) = m\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{x}_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2,
SST = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{m}(x_{ij,k} - \bar{x})^2, \quad SSE = SST - SS(1) - SS(2) - SS(\text{int})

Estimation in two-way ANOVA: individual 100(1 − α)% confidence interval for μi − μi′:
(\bar{x}_i - \bar{x}_{i'}) \pm t_{\alpha/2}\sqrt{MSE\left(\frac{2}{bm}\right)}, \quad df = ab(m-1)

Estimation in two-way ANOVA: Tukey simultaneous 100(1 − α)% confidence interval for factor 1, μi − μi′:
(\bar{x}_i - \bar{x}_{i'}) \pm q\sqrt{MSE\left(\frac{1}{bm}\right)}, with q corresponding to a and ab(m − 1).

Estimation in two-way ANOVA: Tukey simultaneous 100(1 − α)% confidence interval for factor 2, μj − μj′:
(\bar{x}_j - \bar{x}_{j'}) \pm q\sqrt{MSE\left(\frac{1}{am}\right)}, with q corresponding to b and ab(m − 1).

Estimation in two-way ANOVA: individual 100(1 − α)% confidence interval for μij:
\bar{x}_{ij} \pm t_{\alpha/2}\sqrt{\frac{MSE}{m}}, \quad df = ab(m-1)
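The one-way ANOVA sums of squares and F statistic can be checked by hand on a tiny made-up data set of p = 3 groups:

```python
def one_way_anova(groups):
    """Returns (SSB, SSE, F) following the one-way ANOVA formulas above."""
    n = sum(len(g) for g in groups)          # total observations
    p = len(groups)                          # number of groups
    grand = sum(sum(g) for g in groups) / n  # grand mean
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    f = (ssb / (p - 1)) / (sse / (n - p))    # MSB / MSE
    return ssb, sse, f

ssb, sse, f = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(ssb, sse, f)  # SSB = 42, SSE = 6, F = 21
```

Here SST = SSB + SSE = 48, confirming the partition of the total sum of squares.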
Chapter 11 Correlation Coefficient and Simple Linear Regression Analysis
Least squares point estimates of β0 and β1:
b_1 = \frac{SS_{xy}}{SS_{xx}}, where
SS_{xy} = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}
and SS_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}
b_0 = \bar{y} - b_1\bar{x}

The predicted value of yi: \hat{y}_i = b_0 + b_1 x_i

Point estimate of a mean value of y at x = x0: \hat{y} = b_0 + b_1 x_0

Point prediction of an individual value of y at x = x0: \hat{y} = b_0 + b_1 x_0
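The least squares formulas above translate directly into code; the toy data below lie exactly on y = 1 + 2x, so the estimates are easy to verify by eye:

```python
def least_squares(xs, ys):
    """Returns (b0, b1) from SSxy / SSxx and b0 = ybar - b1 * xbar."""
    n = len(xs)
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    b1 = ss_xy / ss_xx
    b0 = sum(ys) / n - b1 * sum(xs) / n
    return b0, b1

b0, b1 = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # b0 = 1.0, b1 = 2.0
```

With the estimates in hand, both the mean-value estimate and the individual prediction at x0 are b0 + b1 x0; they differ only in the width of the interval placed around them.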
Chapter 12 Multiple Regression
Chapter 13 Nonparametric Methods
Sign test for a population median: If H_a: M_d < M_0, then S = the number of sample measurements less than M_0. If H_a: M_d > M_0, then S = the number of sample measurements greater than M_0.

Large-sample sign test: z = \frac{(S - 0.5) - 0.5n}{0.5\sqrt{n}}

Wilcoxon rank sum test (T = the sum of the ranks of the observations in sample 1, where n_1 \le n_2):
If H_a: D_1 is shifted to the right of D_2, then reject H_0 if T \ge T_U.
If H_a: D_1 is shifted to the left of D_2, then reject H_0 if T \le T_L.
If H_a: D_1 is shifted to the right or left of D_2, then reject H_0 if T \ge T_U or T \le T_L.

Wilcoxon rank sum test (large-sample approximation):
z = \frac{T - \mu_T}{\sigma_T}, \quad \mu_T = \frac{n_1(n_1 + n_2 + 1)}{2}, \quad \sigma_T = \sqrt{\frac{n_1 n_2(n_1 + n_2 + 1)}{12}}

Wilcoxon signed ranks test:
T^- = sum of the ranks associated with the negative paired differences
T^+ = sum of the ranks associated with the positive paired differences
If H_a: D_1 is shifted to the right of D_2, then reject H_0 if T = T^- \le T_0.
If H_a: D_1 is shifted to the left of D_2, then reject H_0 if T = T^+ \le T_0.
If H_a: D_1 is shifted to the right or left of D_2, then reject H_0 if T = the smaller of T^+ and T^- is \le T_0.

Wilcoxon signed ranks test (large-sample approximation):
z = \frac{T - \mu_T}{\sigma_T}, \quad \mu_T = \frac{n(n+1)}{4}, \quad \sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}}

Kruskal-Wallis H statistic: H = \frac{12}{n(n+1)}\sum_{i=1}^{p}\frac{T_i^2}{n_i} - 3(n+1), where T_i is the sum of the ranks in sample i.

Spearman's rank correlation coefficient: r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}

Spearman's rank correlation test: t = \frac{r_s\sqrt{n-2}}{\sqrt{1 - r_s^2}}
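Spearman's coefficient above is simple to compute when the ranks are untied; the two extreme cases below (perfectly agreeing and perfectly reversed rankings) make good sanity checks:

```python
def spearman_rs(x_ranks, y_ranks):
    """r_s = 1 - 6 * sum(d_i^2) / (n (n^2 - 1)), valid for untied ranks."""
    n = len(x_ranks)
    d2 = sum((a - b) ** 2 for a, b in zip(x_ranks, y_ranks))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

print(spearman_rs([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))  # 1.0
print(spearman_rs([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))  # -1.0
```

Identical rankings give rs = 1 and fully reversed rankings give rs = −1, the two endpoints of its range.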
Chapter 14 Chi-Square Tests
Goodness of fit test for multinomial probabilities: \chi^2 = \sum_{i=1}^{k}\frac{(f_i - E_i)^2}{E_i}

Test for homogeneity: \chi^2 = \sum_{i=1}^{k}\frac{(f_i - E_i)^2}{E_i}

Goodness of fit test for a normal distribution: \chi^2 = \sum_{i=1}^{k}\frac{(f_i - E_i)^2}{E_i}

Chi-square test for independence: \chi^2 = \sum_{\text{all cells}}\frac{(f_{ij} - E_{ij})^2}{E_{ij}}
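All four chi-square tests above use the same statistic, a sum of (observed − expected)² / expected; a minimal sketch with a made-up die-fairness example:

```python
def chi_square_stat(observed, expected):
    """chi^2 = sum((f_i - E_i)^2 / E_i)."""
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))

# Hypothetical: 60 die rolls, so each face has expected count 10 under fairness.
obs = [8, 12, 9, 11, 10, 10]
chi2 = chi_square_stat(obs, [10] * 6)
print(chi2)  # 1.0
```

The statistic would then be compared against a chi-square critical value with k − 1 = 5 degrees of freedom for this goodness-of-fit setting.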
Chapter 15 Decision Theory
Maximin criterion: Find the worst possible payoff for each alternative and then choose the alternative that yields the maximum worst possible payoff.

Maximax criterion: Find the best possible payoff for each alternative and then choose the alternative that yields the maximum best possible payoff.

Expected monetary value criterion: Choose the alternative with the largest expected payoff.

Expected value of perfect information: EVPI = expected payoff under certainty − expected payoff under risk

Expected value of sample information: EVSI = EPS − EPNS (expected payoff with sampling minus expected payoff with no sampling)

Expected net gain of sampling: ENGS = EVSI − cost of sampling
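The three choice criteria above can be sketched over a small payoff table; the alternatives, states, and payoffs below are entirely hypothetical:

```python
def maximin(payoffs):
    """Pick the alternative whose worst-case payoff is largest."""
    return max(payoffs, key=lambda a: min(payoffs[a]))

def maximax(payoffs):
    """Pick the alternative whose best-case payoff is largest."""
    return max(payoffs, key=lambda a: max(payoffs[a]))

def emv(payoffs, probs):
    """Pick the alternative with the largest expected payoff."""
    return max(payoffs, key=lambda a: sum(p * v for p, v in zip(probs, payoffs[a])))

# Rows are alternatives; columns are two states of nature.
table = {"small": [50, 60], "large": [-20, 120]}
print(maximin(table))           # "small": best worst case (50 vs -20)
print(maximax(table))           # "large": best best case (120 vs 60)
print(emv(table, [0.5, 0.5]))   # "small": EMV 55 vs 50
```

The example shows how the three criteria can disagree: the pessimistic and expected-value rules pick one alternative while the optimistic rule picks the other.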
Chapter 16 Time Series Forecasting
No trend: y_t = \beta_0 + \varepsilon_t

Linear trend: y_t = \beta_0 + \beta_1 t + \varepsilon_t

Quadratic trend: y_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \varepsilon_t

Modelling constant seasonal variation by using dummy variables: For a time series with k seasons, define k − 1 dummy variables in a multiple regression model (e.g. for quarterly data, define three dummy variables).

Multiplicative decomposition method: Y_t = TR_t \times SN_t \times CL_t \times IR_t

Simple exponential smoothing: \ell_t = \alpha y_t + (1 - \alpha)\ell_{t-1}

Double exponential smoothing: apply the smoothing recursion a second time, to the output of simple exponential smoothing, so that a linear trend can be tracked.

Mean absolute deviation (MAD): MAD = \frac{\sum |y_t - \hat{y}_t|}{n}

Mean squared deviation (MSD): MSD = \frac{\sum (y_t - \hat{y}_t)^2}{n}

Percentage error (PE): PE = \frac{y_t - \hat{y}_t}{y_t} \times 100

Mean absolute percentage error (MAPE): MAPE = \frac{\sum_{t=1}^{n} |PE_t|}{n}

A simple index: \frac{y_t}{y_0} \times 100

An aggregate price index: \left(\frac{\sum p_t}{\sum p_0}\right) \times 100

A Laspeyres index is \frac{\sum p_t q_0}{\sum p_0 q_0} \times 100

A Paasche index is \frac{\sum p_t q_t}{\sum p_0 q_t} \times 100
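The Laspeyres and Paasche indexes differ only in which period's quantities weight the prices; a minimal sketch over a hypothetical two-good basket:

```python
def laspeyres(p0, pt, q0):
    """Base-period quantities q0 weight both price vectors."""
    return sum(a * b for a, b in zip(pt, q0)) / sum(a * b for a, b in zip(p0, q0)) * 100

def paasche(p0, pt, qt):
    """Current-period quantities qt weight both price vectors."""
    return sum(a * b for a, b in zip(pt, qt)) / sum(a * b for a, b in zip(p0, qt)) * 100

# Hypothetical basket: base prices/quantities and current prices/quantities.
p0, pt = [2.0, 5.0], [3.0, 6.0]
q0, qt = [10, 4], [8, 5]
print(laspeyres(p0, pt, q0))  # 135.0
print(paasche(p0, pt, qt))    # about 131.7
```

The Laspeyres index here exceeds the Paasche index, reflecting that consumers in the example shifted quantities away from the good whose price rose most.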