Quantitative Business Analysis
Ken Choie
http://dasan.sejong.ac.kr/~kchoie/

Q: How do you make decisions under uncertainty?
A: Maximize the expected value!

Part I: Probability
1. Random variables
   Possible outcomes; probabilities
2. The expected value and variance of a random variable
   Common probability distributions
3. Summary of observations/data
   Central tendency; dispersion; skewness; kurtosis

Part II: Statistics (sample data and inferences)
1. Estimation of population parameters
   The central limit theorem
   Point estimation
   Interval estimation (confidence intervals)
2. Hypothesis testing (testing hypotheses on population parameters)
   Concerning the mean
   Concerning the variance
3. Regression

An overview:
Possibility: e.g., winning a lottery.
Probability: possible outcomes (events), each with an attached likelihood (a probability).
A random variable: e.g., the number of heads in 10 coin tosses.
   A random variable has many possible outcomes (a frequency distribution);
   each possible outcome has an associated probability;
   the outcome-plus-probability combination is the probability distribution.

Probability theory:
   Concerned with the parameters of the probability distribution of a random variable: the expected value and the variance.
   The probability distribution of a random variable may be discrete or continuous.
   The parameters of the probability distribution: central tendency, dispersion, skewness, kurtosis.

Statistics:
   A population with its parameters (the underlying probability distribution); a sample (with its statistics) taken from the population.
   1. How do you estimate the unknown parameter values of a random variable?
      Infer the population parameter values from samples (sample statistics).
      Estimation of a parameter value: a point estimate or an interval estimate (a confidence interval).
   2. How do you verify/test hypotheses on the unknown parameter values of a random variable?
      Use the sample statistics to evaluate hypotheses on the population parameters: a null hypothesis vs. an alternative hypothesis.
   3. Can you specify/quantify the relationship between random variables?
      Do a regression analysis.

Part I: Probability Theory

Possible outcomes (events) of a random variable — methods of enumeration:
   The multiplication principle: n · m
   Permutation (n objects, r positions): P(n,r) = n! / (n − r)!
   Combination (a distinguishable permutation): C(n,r) = n! / [(n − r)! r!]
   The odds in favor of an event are 2 to 1 => the probability of the event is 2/3.

Algebra of sets:
   Union: A ∪ B;  intersection: A ∩ B;  Venn diagrams.
   The complement: P(A′) = 1 − P(A)
   P(A ∪ B) = P(A) + P(B) − P(A ∩ B), when A and B are not mutually exclusive.

Sequential events:
   Independent events: A and B are independent if and only if P(A ∩ B) = P(A) P(B).
   Conditional probability: P(A|B) = P(A ∩ B) / P(B), so P(A ∩ B) = P(B) P(A|B).
   Bayes' formula (a posterior probability conditional on the prior probabilities):
      P(Bj|A) = P(Bj) P(A|Bj) / ∑i P(Bi) P(A|Bi)

The mean and variance of a random variable:
   Let X denote a random variable with outcome set R, and let f(x) denote the probability distribution function of X.
   The expected value of X is a weighted mean of X, where the weights are the probabilities f(x); this (weighted) mean of X is denoted μ.
      E[X] = μ = the mean
   If X is a discrete random variable: E[X] = ∑R x f(x)
   In general, E[g(X)] = ∑R g(x) f(x)
   If c is a constant, E(c) = c.
   If c is a constant and g is a function, E[c g(X)] = c E[g(X)].
   In particular, if g(X) = (X − μ)²:
      E[(X − μ)²] = ∑R (x − μ)² f(x) = σ² = the variance of X;
      σ = the standard deviation of X.

Expected value and variance of a random variable, in short:
   E(X) = ∑R x f(x) = ∑ P(Xi) · Xi : the probability-weighted average
   Var(X) = E[Xi − E(Xi)]² = ∑R f(x) · [Xi − E(Xi)]² = ∑ P(Xi) · [Xi − E(Xi)]² : the prob.
weighted dispersion.

Example 1:
   Probability:           .10    .60    .30
   Rate of return (%):    −10     12     15
   Let the rate of return be X. Then
      E(X) = .10·(−10) + .60·12 + .30·15 = 10.7
      Var(X) = .10·(−10 − 10.7)² + .60·(12 − 10.7)² + .30·(15 − 10.7)² = 49.41

Discrete probability distribution functions:

The Bernoulli trial: let X be a random variable with X(success) = 1 and X(failure) = 0, so that
   f(x) = P^x (1 − P)^(1−x), x = 0, 1
   E[X] = ∑R x f(x) = ∑R x P^x (1 − P)^(1−x) = 0·(1 − P) + 1·P = P
   σ² = ∑R (x − μ)² f(x) = ∑R (x − P)² P^x (1 − P)^(1−x) = P(1 − P)

The binomial distribution: let Y be a random variable whose value is the number of successes in n Bernoulli trials.
   f(y) = C(n, y) P^y (1 − P)^(n−y), y = 0, 1, …, n
   μ = nP;  σ² = nP(1 − P)

Continuous probability distribution functions:
   The uniform distribution
   The exponential distribution
   The normal distribution
   The standard normal distribution, N(0,1): if X is N(μ, σ²), then Z = (X − μ)/σ is N(0,1).

Summary of probability distribution functions — central tendency, dispersion, skewness, kurtosis:
   The differences among the mean, median, and mode.
   The connection between the mean and dispersion: the coefficient of variation, the Sharpe ratio.
   The relationship between the central tendency and skewness: positively skewed => median < mean.
   The normal distribution vs. other distributions:
      Kurtosis = 3: a normal distribution.
      Leptokurtic (kurtosis > 3): fatter at the extremes, with a higher peak in the middle.
      Platykurtic (kurtosis < 3): pressed down in the middle.

The relationship between two random variables, X and Y:
   The mean of X:      E(X) = ∑R x f(x) = μx
   The variance of X:  Var(X) = E[(X − μx)²] = ∑R (x − μx)² f(x) = σx²
   The mean of Y:      E(Y) = ∑R y f(y) = μy
   The variance of Y:  Var(Y) = E[(Y − μy)²] = ∑R (y − μy)² f(y) = σy²
   The joint probability distribution function, f(X,Y), and its graphic representation of the relationship.
   The covariance: Cov(X,Y) = E[(X − μx)(Y − μy)] = ∑R (x − μx)(y − μy) f(x,y) = σxy
   The correlation coefficient: ρxy = Cov(X,Y) / (σx σy)

Example 2: The joint probability of returns for securities A and B.
                        Return on B = 30%   Return on B = 15%
   Return on A = 20%          .6                   0
   Return on A = 10%           0                  .4
   Let the rate of return on A be X, and the rate of return on B be Y.
   E(X) = 0.6·20% + 0.4·10% = 16% = the expected rate of return on A
   E(Y) = 0.6·30% + 0.4·15% = 24% = the expected rate of return on B
   Var(X) = 0.6·(20 − 16)² + 0.4·(10 − 16)² = 24
   Var(Y) = 0.6·(30 − 24)² + 0.4·(15 − 24)² = 54
   Cov(X,Y) = 0.6·(20 − 16)(30 − 24) + 0.4·(10 − 16)(15 − 24) = 0.6·24 + 0.4·54 = 36
   ρxy = Cov(X,Y)/(σx σy) = 36/√(24·54) = 1: perfectly positively correlated.

Example 3: The joint probability of returns for securities A and B.
                        Return on B = 30%   Return on B = 15%
   Return on A = 20%          .4                  .2
   Return on A = 10%          .1                  .3
   Let the rate of return on A be X, and the rate of return on B be Y.
   E(X) = 0.6·20% + 0.4·10% = 16% = the expected rate of return on A
   E(Y) = 0.5·30% + 0.5·15% = 22.5% = the expected rate of return on B
   Var(X) = 0.6·(20 − 16)² + 0.4·(10 − 16)² = 24
   Var(Y) = 0.5·(30 − 22.5)² + 0.5·(15 − 22.5)² = 56.25
   Cov(X,Y) = 0.4·(20 − 16)(30 − 22.5) + 0.1·(10 − 16)(30 − 22.5)
            + 0.2·(20 − 16)(15 − 22.5) + 0.3·(10 − 16)(15 − 22.5) = 15
   ρxy = Cov(X,Y)/(σx σy) = 15/√(24·56.25) ≈ 0.41: less positively correlated than in Example 2.

Example 4: The expected return and variance of a portfolio.
Portfolio P consists of the assets A and B, whose weights in the portfolio are 55% and 45%, respectively. The return on the portfolio, K, is a linear combination of the two random variables X and Y.
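As a sanity check, the moment calculations of Examples 2 and 3 can be reproduced with a short script (a minimal Python sketch; the `joint_moments` helper is mine, not part of the course material):

```python
def joint_moments(cells):
    """Moments of a discrete joint pdf given as (probability, x, y) cells."""
    ex = sum(p * x for p, x, _ in cells)           # E(X)
    ey = sum(p * y for p, _, y in cells)           # E(Y)
    var_x = sum(p * (x - ex) ** 2 for p, x, _ in cells)
    var_y = sum(p * (y - ey) ** 2 for p, _, y in cells)
    cov = sum(p * (x - ex) * (y - ey) for p, x, y in cells)
    corr = cov / (var_x ** 0.5 * var_y ** 0.5)     # correlation coefficient
    return ex, ey, var_x, var_y, cov, corr

# Example 2: probability mass only on the (20, 30) and (10, 15) return pairs.
ex2 = joint_moments([(0.6, 20, 30), (0.4, 10, 15)])

# Example 3: probability mass spread over all four return pairs.
ex3 = joint_moments([(0.4, 20, 30), (0.1, 10, 30),
                     (0.2, 20, 15), (0.3, 10, 15)])
```

Example 2 gives E(X) = 16, Cov(X,Y) = 36, and a correlation of exactly 1; Example 3 gives Cov(X,Y) = 15 and a correlation of about 0.41.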
K = 0.55X + 0.45Y
Hence,
   E(K) = E(0.55X + 0.45Y) = 0.55 E(X) + 0.45 E(Y) = 0.55·16% + 0.45·24% = 19.6%
   Var(K) = E[K − E(K)]²
          = E[(0.55X + 0.45Y) − E(0.55X + 0.45Y)]²
          = E[{0.55X − E(0.55X)} + {0.45Y − E(0.45Y)}]²
          = E[{0.55X − E(0.55X)}² + 2·{0.55X − E(0.55X)}·{0.45Y − E(0.45Y)} + {0.45Y − E(0.45Y)}²]
          = 0.55²·Var(X) + 2·0.55·0.45·Cov(X,Y) + 0.45²·Var(Y)
          = 0.55²·24 + 2·0.55·0.45·36 + 0.45²·54 = 36.015

In general, let Z be a linear combination of two random variables X and Y:
   E(Z) = E(aX + bY) = a E(X) + b E(Y)
   Var(Z) = Var(aX + bY) = a² Var(X) + 2ab Cov(X,Y) + b² Var(Y)

Recapitulation of the Theory of Probability:
   Probability = (the number of favorable outcomes) / (the number of possible outcomes).
   To count the possible outcomes, study combinations and permutations.
   Independent probability vs. conditional probability.
   A random variable may take on a particular value within a range of outcomes; which particular value the random variable takes on has an associated probability; the relationship between the values of the random variable and their associated probabilities is called the probability distribution (or the probability distribution function).

The probability distribution of a random variable X:
   A measure of its central tendency:
      E(X) = ∑R x f(x) = ∑ P(Xi) · Xi : the probability-weighted average
   A measure of the dispersion of outcomes:
      Var(X) = E[Xi − E(Xi)]² = ∑R f(x)·[Xi − E(Xi)]² = ∑ P(Xi)·[Xi − E(Xi)]² : the probability-weighted dispersion
   Measures of the relationship between the mean and the variance of X:
      The coefficient of variation: σ/μ
      The Sharpe ratio: (E(X) − Rf)/σ

Two or more random variables — the relationship between X and Y:
   The means: E(X) = ∑R x f(x) = μx;  E(Y) = ∑R y f(y) = μy
   The variances: Var(X) = E[(X − μx)²] = σx²;  Var(Y) = E[(Y − μy)²] = σy²
   The joint probability distribution function, f(X,Y), and its graphic representation of the relationship.
   The covariance: Cov(X,Y) = E[(X − μx)(Y − μy)] = σxy
   The correlation coefficient: ρxy = Cov(X,Y)/(σx σy)
   A linear combination Z = aX + bY:
      E(Z) = a E(X) + b E(Y)
      Var(Z) = a² Var(X) + 2ab Cov(X,Y) + b² Var(Y)

Part II: Statistics

Probability theory: for a random variable X with probability distribution f(x), we can determine the central tendency, E(X), and the dispersion, Var(X). (These parameters of f(x) are called "population parameters".)
The issue: in reality, the probability distribution of X, f(x), is unknown; hence the mean and the variance of X are also unknown. If we need to know the central tendency and the dispersion of the random variable X, what can we do?
The solution: use sample data to estimate the parameter values of f(x).
Statistics: if you take many samples, the samples themselves have a certain probability distribution with its own central tendency and dispersion (the parameters of the sample distribution are called "sample parameters"). Using the characteristics of a sample (the "sample statistics") to infer information about the "population parameters" of f(x) is the subject of statistics.
Sampling (to save time and money):
   Minimize the sampling error: simple random sampling vs.
stratified random sampling.
   Time-series data and cross-sectional data.
   Sampling biases: selection (survivorship, time-period).
   Data mining: a search for patterns; the pattern found may be specific to the period, lacking an economic story.

The central limit theorem:
   If X̄ is the mean of a random sample of n observations on a random variable X with an unknown distribution, X ~ (μ, σ²), then
      X̄ ~ N(μ, σ²/n),
   where the standard deviation (standard error) of the sample mean is σX̄ = σ/√n.
   Let W = (X̄ − μ)/(σ/√n). Then, as n → ∞, W ~ N(0,1): W has the standard normal distribution.

Chebyshev's inequality: if Y ~ (μ, σ²), then P[|Y − μ| > kσ] < 1/k².

The point estimation of the population mean:
   The sample mean is the BLUE: unbiased, efficient (smallest variance), and consistent (as n gets larger);
   it is also the MLE (the parameter value of the population pdf most likely to have produced the sample).

The confidence interval (i.e., an interval estimation) for the population mean:
   An interval estimate centered on the sample mean: one can say with a ___% degree of confidence that the population mean, μ, lies within the interval.
   If X ~ (μ, σ²), then W = (X̄ − μ)/(σ/√n) ~ N(0,1) as n → ∞. Hence,
      CI => X̄ ± z·σ/√n
   If X ~ (μ, unknown), then Q = (X̄ − μ)/(s/√(n−1))
      ~ the t-distribution, when the sample size n is small;
      ~ approximately N(0,1), when n is large.
      CI => X̄ ± t·s/√(n−1)
   The greater the sample size (or the degrees of freedom), the narrower the confidence interval.
   The greater the reliability factor (i.e., the degree of confidence), the wider the confidence interval.
   Given the range of the confidence interval and the degree of confidence, one can compute the sample standard deviation of the test statistic.

Example 5: Ten analysts have given the following fiscal-year earnings forecasts for a stock.

   Forecast (Xi)   Number of analysts (ni)
   1.40            1
   1.43            1
   1.44            3
   1.45            2
   1.47            1
   1.48            1
   1.50            1

   What are the mean forecast and the standard deviation of forecasts? Provide a 95% confidence interval for the population mean of the forecast.

   Xi      ni    Xi·ni    (Xi − X̄)   (Xi − X̄)²   (Xi − X̄)²·ni
   1.40    1     1.40     −0.05      0.0025      0.0025
   1.43    1     1.43     −0.02      0.0004      0.0004
   1.44    3     4.32     −0.01      0.0001      0.0003
   1.45    2     2.90      0.00      0.0000      0.0000
   1.47    1     1.47      0.02      0.0004      0.0004
   1.48    1     1.48      0.03      0.0009      0.0009
   1.50    1     1.50      0.05      0.0025      0.0025
   Sum     10    14.50                           0.0070

   X̄ = 14.50/10 = 1.45
   s² = ∑(Xi − X̄)²·ni / (n − 1) = 0.0070/9 = 0.0007778
   s = √0.0007778 = 0.02789
   The confidence interval: X̄ ± t·s/√n = 1.45 ± 2.262·0.02789/√10 ≈ 1.45 ± 0.02

The variance of X vs. the variance of X̄:
   X and X̄ each has its own probability distribution function.
   X ~ (μ, σ²), with σ² = ∑(Xi − μ)²/n; we estimate σ² using the sample data: s² = ∑(Xi − X̄)²/(n − 1).
   X̄ ~ N(μ, σ²/n); we estimate σ²/n by substituting s² for σ²: s²/n.

The confidence interval for a variable over time:
   If the return r(0,t) has a normal pdf and is i.i.d., then E[r(0,t)] = μ·t and Var[r(0,t)] = σ²·t
   => an envelope-shaped confidence interval;
   => can be used to calculate the probability of exhausting the (futures) margin.

Hypothesis testing: how to use sample information to test a hypothesis about the population parameters, i.e., how to infer conclusions about the population parameters from sample statistics.
   Hypotheses: the null hypothesis (H0) vs. the alternative hypothesis (Ha); one-tailed vs. two-tailed tests.
   Test statistic: a quantity computed from a sample.
   Type I and type II errors:

                          True state of the situation
      Decision            H0 true          H0 false
      Do not reject H0:   correct          type II error
      Reject H0:          type I error     correct

   The null hypothesis must be constructed in such a manner that the cost (i.e., penalty) of not rejecting H0 is low.
   The power of a test: the probability of correctly rejecting the null hypothesis.
   The p-value: the smallest level of significance at which the null hypothesis can be rejected.
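The arithmetic of Example 5 can be checked with a minimal Python sketch (variable names are my own; 2.262 is the two-tailed t critical value for 9 degrees of freedom at the 5% level):

```python
import math

# The ten EPS forecasts of Example 5 (1.44 appears three times, 1.45 twice).
forecasts = [1.40, 1.43, 1.44, 1.44, 1.44, 1.45, 1.45, 1.47, 1.48, 1.50]
n = len(forecasts)

mean = sum(forecasts) / n                           # sample mean, 1.45
s2 = sum((x - mean) ** 2 for x in forecasts) / (n - 1)
s = math.sqrt(s2)                                   # sample std dev, ~0.0279

# 95% confidence interval, X-bar +/- t * s / sqrt(n).
half_width = 2.262 * s / math.sqrt(n)
ci = (mean - half_width, mean + half_width)
```

This reproduces the interval of roughly 1.45 ± 0.02, i.e., about (1.43, 1.47).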
The process:
   Determine the hypotheses;
   -> determine the distribution of the test statistic;
   -> determine the decision rule:
      a one-tailed vs. a two-tailed test (depending on the alternative hypothesis);
      the level of significance;
      the type I error (rejecting the null in error) and the type II error (not rejecting the null in error);
   -> take a sample and compute the value of the test statistic.
   Reject the null hypothesis:
      if the sample statistic is greater than the critical value;
      if the p-value of the sample statistic (the area beyond the test statistic; the smallest level of significance at which one can reject the null hypothesis) is smaller than the significance level;
      if the confidence interval for the test statistic does not contain the hypothesized value.

Concerning the mean (the test statistic has the t-distribution):

1. Tests concerning a single mean:
   The test statistic is a mirror image of the confidence interval for the population mean:
      Test statistic = (X̄ − μ0) / (s/√n)
   If the t-statistic value is too large, reject the null hypothesis.

Example 6: Investment analysts often use earnings-per-share (EPS) forecasts.

   Performance in forecasting quarterly EPS:
                # of forecasts   Mean forecast error   Std. dev. of error
   Analyst A    61               $0.05                 $0.10
   Analyst B    121              $0.02                 $0.12

   Question: Are the analysts' forecasting qualities good?
   Answer: the null hypothesis, H0, is that each analyst's mean forecasting error (= predicted − actual), μ0, equals zero.
   H0: μ0 = 0;  Ha: μ0 ≠ 0
   The test statistic: (X̄ − μ0)/(s/√n)
   For analyst A, with 60 = 61 − 1 degrees of freedom (use the t-distribution), at the 0.05 significance level, reject H0 if t > 2.000:
      test statistic = (0.05 − 0)/(0.10/√61) = 0.05/0.0129 = 3.873
   Reject the null hypothesis (i.e., analyst A's forecasts tend to be too high).
   For analyst B, with 120 = 121 − 1 degrees of freedom (use the z-distribution), at the 0.05 significance level, reject H0 if z > 1.96:
      test statistic = (0.02 − 0)/(0.12/√121) = 0.02/0.0109 = 1.8349
   Do not reject the null hypothesis (i.e., analyst B's forecast errors tend to be close to zero).
   The p-value of the sample statistic = 0.5 − 0.4671 = 0.0329; at the 0.05 significance level each tail is 0.025 => do not reject the null hypothesis.

2. Tests concerning differences between means (independent samples, equal population variances):
      Test statistic = [(X̄1 − X̄2) − (μ1 − μ2)] / √(Sp²/n1 + Sp²/n2)

Example 7: (the same quarterly EPS forecasting data as in Example 6)

                # of forecasts   Mean forecast error   Std. dev. of error
   Analyst A    61               $0.05                 $0.10
   Analyst B    121              $0.02                 $0.12

   Question: Is analyst A's forecasting quality worse than B's?
   Answer: the null hypothesis is that analyst A's forecasting error is no worse than B's.
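The pooled two-sample test worked out below can be sketched numerically (a hedged Python sketch under the equal-variance assumption; the variable names are my own):

```python
import math

# Example 7: pooled-variance two-sample test on the analysts' mean
# forecast errors (n, mean error, std dev from the table above).
n1, mean1, s1 = 61, 0.05, 0.10     # analyst A
n2, mean2, s2 = 121, 0.02, 0.12    # analyst B

# Pooled estimate of the common variance.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

# Test statistic for H0: mu_A - mu_B <= 0.
t = (mean1 - mean2) / math.sqrt(sp2 / n1 + sp2 / n2)

reject = t > 1.645   # one-tailed z critical value at the 5% level
```

Carrying full precision, the statistic comes out to about 1.68, just above the 1.645 cutoff.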
H0: μA − μB ≤ 0;  Ha: μA − μB > 0
   The test statistic (assuming that the forecast errors of both analysts are normally distributed and that the samples are independent and drawn from populations with equal variance):
      test statistic = [(X̄A − X̄B) − (μA − μB)] / √(Sp²/n1 + Sp²/n2)
   The degrees of freedom are 180 = (61 − 1) + (121 − 1); use the z-distribution. At the 0.05 significance level, reject H0 if z > 1.645.
   The pooled estimate of variance:
      Sp² = [(n1 − 1)S1² + (n2 − 1)S2²] / (n1 + n2 − 2)
          = [(61 − 1)·0.10² + (121 − 1)·0.12²] / (61 + 121 − 2)
          = (0.6 + 1.728)/180 = 0.0129
   Test statistic = [(0.05 − 0.02) − 0] / √(0.0129/61 + 0.0129/121) = 0.03/0.0179 ≈ 1.68
   Reject the null hypothesis (i.e., the forecast error of analyst A is greater than that of analyst B).
   The p-value of the sample statistic = 0.5 − 0.4535 = 0.0465; a one-tailed test at the 0.05 significance level => reject the null hypothesis.

3. Tests concerning mean differences (the data consist of paired observations; the random variable is the difference):
      Test statistic = (d̄ − μd) / Sd̄

Example 8: The monthly returns on the S&P 500 and small-cap stocks, January 1960 – December 1999 (480 months).

                S&P 500 returns (%)   Small-cap stocks (%)   Differences (%)
   Mean         1.0542                1.3117                 −0.258
   Std. dev.    4.2185                5.9570                  3.752

   Question: Is there any difference between the mean returns on the S&P 500 and on small-cap stocks?
   Answer: H0: μd = 0;  Ha: μd ≠ 0
   The degrees of freedom = 480 − 1; use the z-distribution.
      Test statistic = (d̄ − μd)/Sd̄ = (−0.258 − 0)/(3.752/√480) = −1.5065
   At the 0.05 significance level, we do not reject the null hypothesis (i.e., the mean difference between the two indexes in the period was 0).
   The p-value of the sample statistic = 0.5 − 0.4345 = 0.0655.

Concerning the variance:

1.
Tests concerning a single variance (a normally distributed population):
   The test statistic (n − 1)S²/σ0² ~ χ² has the chi-square distribution with (n − 1) degrees of freedom.

Example 9: Suppose that the variance of annual returns on your portfolio during the last ten years is 225 and the variance of annual returns on the benchmark in the same period is 400 (both in %²).
   Question: Is the underlying variance of returns on your portfolio less than that of the benchmark?
   Answer: H0: σ² ≥ 400;  Ha: σ² < 400
   The test: use the χ²-distribution with 10 − 1 = 9 degrees of freedom; at the 0.05 significance level the critical χ² value is 3.33, i.e., we will reject the null hypothesis if the test statistic is less than 3.33.
      Test statistic = (n − 1)S²/σ0² = 9·225/400 = 5.06
   We do not reject the null hypothesis.

2. Tests concerning the equality of two variances (normally distributed populations):
   The test statistic S1²/S2² ~ F has the F distribution with (n1 − 1) and (n2 − 1) degrees of freedom.

Example 10: Determine whether the variance of returns on the S&P 500 changed.

   Time period       n     Monthly return (%)   Variance
   Before Oct 1987   120   1.416                22.367
   After Oct 1987    120   1.436                15.795

   Question: Did the variance change subsequent to the October 1987 market crash?
   Answer: H0: σb² = σa²;  Ha: σb² ≠ σa²
   The test: use the F-distribution with 120 − 1 degrees of freedom in both the numerator and the denominator; a two-tailed test. At the 0.02 significance level the critical F(120,120) value is 1.53, i.e., we will not reject the null hypothesis if the test statistic is less than 1.53.
      Test statistic = Sb²/Sa² = 22.367/15.795 = 1.416
   We do not reject the null hypothesis.

   Question: Did the variance become smaller subsequent to the October 1987 market crash?
   Answer: H0: σb² ≤ σa²;  Ha: σb² > σa²
   The test: use the F-distribution with 120 − 1 degrees of freedom in both the numerator and the denominator; a one-tailed test. At the 0.01 significance level the critical F(120,120) value is 1.53, i.e., we will not reject the null hypothesis if the test statistic is less than 1.53.
      Test statistic = Sb²/Sa² = 22.367/15.795 = 1.416
   We do not reject the null hypothesis.

A summary of statistics (estimation methods and hypothesis tests on population parameters):
   Estimation of a population parameter:
      Small sample size (less than 120): use the Student t-distribution.
      Large sample size (greater than 120): use the standard normal z-distribution.
   Hypothesis tests on a population parameter:
      H0: parameter value = something;  Ha: parameter value ≠ something — a two-tailed test.
      H0: parameter value ≥ something;  Ha: parameter value < something — a one-tailed test.
   Tests on μ: either the t-distribution or the z-distribution.
   Tests on σ²: either the χ²-distribution or the F-distribution.

Probability theory vs. statistics:
   Both are concerned with a random variable X and its probability distribution function.
   Probability theory: determine the probability distribution of X; given the probability distribution function, f(X), determine the parameters of the distribution, such as the central tendency (e.g., the mean) and the dispersion (e.g., the variance).
   Statistics: infer the parameters (such as the mean and the variance) of the probability distribution function from a sample.

Non-parametric inference:
   1. A test that is not concerned with a parameter; e.g., is the sample random or not? (the runs test)
   2. A test that makes minimal assumptions about the distribution of the population:
      convert observation values into ranks or signs (+ or −).

The linear relationship between two variables (the correlation between them): Corr.
Coefficient = Cov[R(x), R(y)] / [std dev R(x) · std dev R(y)]
The Spearman rank correlation coefficient: 1 − 6 ∑di² / [n(n² − 1)]

Regression: a linear relationship between two variables => a linear relationship between two economic variables.
   Suppose Yi = a + b·Xi + ϵi, so that E(Yi) = a + b·Xi, where Yi ~ N[E(Yi), σ²].
   Now consider (Yi − Ȳ) = β(Xi − X̄) + θi.
   Visualize a graphic relationship: e.g., the expected weight given a height.
   The estimator for β that minimizes the variance of [Yi − E(Yi)] is:
      β̂ = Cov(Xi, Yi) / Var(Xi)
   Hence, the estimate of Yi is Ŷi, where
      Ŷ = (Ȳ − β̂·X̄) + β̂·Xi
   And the estimate of σ² is σ̂², where
      σ̂² = (1/n) ∑[Yi − Ŷi]²

Example 11: The following table shows the average price of gasoline per liter for each year and the annual sales (in $mm).

   Year (i)   Average price (Xi)   Sales (Yi)
   1          $0.40                $20
   2          0.36                 25
   3          0.42                 16
   4          0.31                 30
   5          0.33                 35
   6          0.34                 30

   1. Calculate the (sample) mean and standard deviation of X and of Y.
   2. Calculate the (sample) covariance between X and Y.
   3. Calculate the (sample) correlation coefficient between X and Y.
   4. What is the linear relationship between X and Y?
   5. If the average price is raised to $0.45, what is the expected annual sales?

   Answer:

   Year (i)   Xi     Yi    (Xi − X̄)²   (Yi − Ȳ)²   (Xi − X̄)(Yi − Ȳ)
   1          0.40   20    0.0016      36           −0.24
   2          0.36   25    0.0000      1             0.00
   3          0.42   16    0.0036      100          −0.60
   4          0.31   30    0.0025      16           −0.20
   5          0.33   35    0.0009      81           −0.27
   6          0.34   30    0.0004      16           −0.08
   Sum        2.16   156   0.0090      250          −1.39

   1. For X: X̄ = ∑Xi/n = 2.16/6 = 0.36;  Sx = √[∑(Xi − X̄)²/(n − 1)] = √(0.009/5) = 0.042426
      For Y: Ȳ = ∑Yi/n = 156/6 = 26;  Sy = √[∑(Yi − Ȳ)²/(n − 1)] = √(250/5) = 7.0711
   2. Cov(X,Y) = ∑(Xi − X̄)(Yi − Ȳ)/(n − 1) = −1.39/5 = −0.278
   3. Correlation coefficient = Cov(X,Y)/(Sx·Sy) = −0.278/(0.042426·7.0711) = −0.927
   4. β̂ = Cov(Xi, Yi)/Var(Xi) = −0.278/(0.042426)² = −154.44
      (Ȳ − β̂·X̄) = 26 − (−154.44)·0.36 = 81.6
      Hence, Ŷi = 81.6 − 154.44·Xi
   5. From Ŷ = (Ȳ − β̂·X̄) + β̂·Xi:
      Ŷ = 81.6 − 154.44·(0.45) = 12.10
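The least-squares fit of Example 11 can be reproduced with a short script (a sketch using the sample covariance/variance formula from the text; the variable names are my own):

```python
# Example 11: simple least-squares fit of annual sales on gasoline price.
prices = [0.40, 0.36, 0.42, 0.31, 0.33, 0.34]
sales = [20, 25, 16, 30, 35, 30]
n = len(prices)

mean_x = sum(prices) / n                            # 0.36
mean_y = sum(sales) / n                             # 26
cov_xy = sum((x - mean_x) * (y - mean_y)
             for x, y in zip(prices, sales)) / (n - 1)   # -0.278
var_x = sum((x - mean_x) ** 2 for x in prices) / (n - 1)

beta = cov_xy / var_x                               # slope, about -154.4
alpha = mean_y - beta * mean_x                      # intercept, about 81.6
predicted = alpha + beta * 0.45                     # sales at $0.45, about 12.1
```

The slope of roughly −154.4 and predicted sales of about $12.1mm match the hand calculation above.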