DR. HUSSIN ABDULLAH
SCHOOL OF ECONOMICS, FINANCE AND BANKING, UUM COB

DEMAND ESTIMATION

After studying this chapter, you should be able to:
1. Discuss how the firm's managers use information about the demand for its product to determine correctly its profit-maximizing rate of output and price, or whether to produce a particular product at all.
2. Discuss how demand responds when consumer income increases or decreases as a result of an economic expansion or contraction.
3. Specify the components of a regression model that can be used to estimate a demand equation.
4. Interpret the regression results (i.e., explain the quantitative impact that changes in the determinants have on the quantity demanded).
5. Explain the meaning of R².
6. Evaluate the statistical significance of the regression coefficients using the t-test and the statistical significance of R² using the F-test.

Introduction:
• An important contributor to firm risk arises from sudden shifts in demand for the product or service.
• Demand estimation serves two managerial objectives: (1) it provides the insights necessary for effective management of demand, and (2) it aids in forecasting sales and revenues.

The theory

SIMPLE LINEAR REGRESSION

Relationships, among other things, may serve as a basis for estimation and prediction.
• Simple prediction: taking the observed values of X to estimate or predict the corresponding Y values.
• Regression analysis uses simple and multiple predictors to predict Y from X values.

Correlation and regression are closely related: beneath many correlation problems lies a regression analysis that could provide further insight into the relationship of Y with X.

The Basic Model

A straight line is the simplest and most common way to model the relationship between two continuous variables. The regression coefficients are the intercept and the slope.
• Slope (β1): the change in Y for a 1-unit change in X.
– This is the ratio of the change (∆) in the rise of the line relative to the run, or travel, along the X axis.
• Intercept (β0): the other regression coefficient. It is the value of the linear function where it crosses the Y axis, that is, the estimate of Y when X is zero.

Concept Application

Unfortunately, one rarely comes across a data set composed of four paired values, a perfect correlation, and an easily drawn line. A model based on such data is deterministic in that for any value of X, there is only one possible corresponding value of Y. A probabilistic model also uses a linear function but adds an error term. The error term is the deviation of the observed values of Y from the regression line for a particular value of X.

Method of Least Squares

The method of least squares is a procedure for finding the regression line that keeps the errors of estimate to a minimum. When we predict the value of Y for each Xi, the difference between the actual Yi and the predicted Y is the error. Each error is squared and the squares are summed; the least-squares line is the one that makes this sum as small as possible.

Residuals

A residual is the difference between the observed Y value and the value of Y on the regression line. When standardized, residuals are comparable to Z scores, with a mean of 0 and a standard deviation of 1. It is important to apply other diagnostics to verify that the regression assumptions (normality, linearity, equality of variance, and independence of error) are met.

Predictions

Prediction and confidence bands are bow-tie-shaped intervals around the regression line. The intervals can be widened or narrowed by changing the confidence level.

Testing for Goodness of Fit

Goodness of fit is a measure of how well the regression model is able to predict Y. The most important test in bivariate linear regression is whether the slope, β1, is equal to zero. Zero slopes result from various conditions:
• Y is completely unrelated to X, and no systematic pattern is evident.
• There are constant values of Y for every value of X.
• The data are related but represented by a nonlinear function.

The t-Test

To test whether β1 = 0, we use a two-tailed test.

The F Test

The F test has an overall role for the model in multiple regression. See the F test example for an illustration.

Coefficient of Determination

In predicting the values of Y without any knowledge of X, our best estimate would be the mean of Y. Each predicted value that does not fall on the mean of Y contributes to an error of estimate.

Multiple Regression

Multiple regression: a statistical tool used to develop a self-weighting estimating equation that predicts values for a dependent variable from the values of independent variables.

Multiple regression is used as a descriptive tool in three types of situations:
• It is often used to develop a self-weighting estimating equation by which to predict values for a criterion variable (DV) from the values of several predictor variables (IVs).
• A descriptive application of multiple regression calls for controlling for confounding variables to better evaluate the contribution of other variables.
• Multiple regression can also be used to test and explain causal theories. This approach is referred to as path analysis, which describes, through regression, an entire structure of linkages advanced by a causal theory.

Multiple regression is also used as an inference tool to test hypotheses and to estimate population values.

Method

Multiple regression is an extension of the bivariate linear regression discussed in Chapter 19.

Dummy variables: nominal variables converted for use in multivariate statistics.

Regression coefficients are stated either in raw score units (the actual X values) or as standardized coefficients (regression coefficients in standardized form [mean = 0], used to determine the comparative impact of variables that come from different scales).
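The least-squares fit, the goodness-of-fit statistics, and the difference between a raw and a standardized coefficient can be illustrated with a short sketch for the bivariate case. The five data points are hypothetical and chosen only for illustration; the function name simple_ols is likewise an assumption, not something taken from the chapter or from a statistical package.

```python
from math import sqrt

def simple_ols(x, y):
    """Least-squares fit of y = b0 + b1*x with basic goodness-of-fit statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)        # variation in X
    syy = sum((yi - mean_y) ** 2 for yi in y)        # total variation in Y
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    b1 = sxy / sxx                                   # slope
    b0 = mean_y - b1 * mean_x                        # intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)             # the sum least squares minimizes

    mse = sse / (n - 2)                              # residual mean square
    se_b1 = sqrt(mse / sxx)                          # standard error of the slope
    return {
        "intercept": b0,
        "slope": b1,
        "r2": 1 - sse / syy,                         # coefficient of determination
        "t_slope": b1 / se_b1,                       # t statistic for H0: slope = 0
        "f": (syy - sse) / mse,                      # F statistic (one predictor)
        "beta": b1 * sqrt(sxx / syy),                # standardized slope (beta weight)
    }

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(x, y)
for name, value in fit.items():
    print(f"{name:>9}: {value:.4f}")
```

With a single predictor, the F statistic equals the square of the slope's t statistic, and the squared standardized slope equals r², so the sketch can be used to check those relationships by hand.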
When regression coefficients are standardized, they are called beta weights (β): standardized regression coefficients whose size reflects the level of influence X exerts on Y. Their values indicate the relative importance of the associated X variables, particularly when the predictors are unrelated.

Example

Most statistical packages provide various methods for selecting variables for the equation.
• Forward selection: sequentially adds to a regression model the variable that results in the largest R² increase.
• Backward elimination: sequentially removes from a regression model the variable that changes R² the least.
• Stepwise selection: a method for sequentially adding or removing variables from a regression model to optimize R².

• Collinearity: when two independent variables are highly correlated.
• Multicollinearity: when more than two independent variables are highly correlated.
• Both of the above can have damaging effects on multiple regression.
• Another difficulty with regression occurs when researchers fail to evaluate the equation with data beyond those used originally to calculate it.
• A solution to this problem can be the holdout sample (the portion of the sample excluded, for later validity testing, when the estimating equation is first computed).

Based on the formula (see chapter), the coefficient of determination is one minus the ratio of the line of best fit's error to the error incurred by using the mean of Y alone. One purpose of testing is to discover whether the regression equation is a more effective predictive device than the mean of the dependent variable.

The coefficient of determination is symbolized by r². It has several purposes. As an index of fit, it is interpreted as the total proportion of variance in Y explained by X. As a measure of linear relationship, it tells us how well the regression line fits the data.
It is also an important indicator of the predictive accuracy of the equation. Typically, we would like to have an r² that explains 80 percent or more of the variation.

Important Concepts:

• The Individual Demand Curve shows the greatest quantity of a good that a consumer is willing to buy at each price, holding other influences constant.
• The Market Demand Curve is the horizontal sum of the individual demand curves.
• The Demand Function includes all variables that influence the quantity demanded:

Q = f(P, Ps, Pc, Y, N, W, PE)
+ + ? + ? +

where:
P is the price of the good
Ps is the price of substitute goods
Pc is the price of related goods
Y is income
N is population
W is wealth
PE is the expected future price

Downward Slope to the Demand Curve

• Reasons that price and quantity are negatively related include:
» income effect: as the price of a good declines, the consumer can purchase more of all goods, since his or her real income has increased.
» substitution effect: as the price declines, the good becomes relatively cheaper. A rational consumer maximizes satisfaction by reorganizing consumption until the marginal utility per dollar is equal across all goods.

Sign of the Estimated Regression Coefficients

A good regression model should be based on good economic theory. The theory should indicate what sign each estimated coefficient must take. For example, the coefficient for the price variable in a demand equation should have a negative sign; that is, when price increases, the quantity demanded decreases. The income variable should have a positive sign for a normal good. If the signs of the estimated coefficients do not agree with the theory, the validity of the model should be questioned.

How to do Demand Estimation?

In estimating the demand for a particular good or service, the process will be:

First step: Determine all the factors that might influence this demand (i.e., the formation of the demand model).
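Part of forming the demand model is writing down, before estimation, the sign that theory predicts for each coefficient. A minimal sketch of that check follows; the variable names, the helper check_signs, and the numerical estimates are all hypothetical and serve only to illustrate the idea.

```python
# Signs predicted by demand theory: +1 for positive, -1 for negative.
EXPECTED_SIGNS = {
    "price": -1,             # own price: quantity demanded falls as price rises
    "income": +1,            # normal good: demand rises with income
    "price_substitute": +1,  # dearer substitute raises demand for this good
    "price_complement": -1,  # dearer complement lowers demand for this good
}

def check_signs(estimates, expected=EXPECTED_SIGNS):
    """Return the names of coefficients whose estimated sign contradicts theory."""
    wrong = []
    for name, value in estimates.items():
        sign = expected.get(name)
        if sign is None or value == 0:
            continue  # no prediction for this variable, or no sign to compare
        if (value > 0) != (sign > 0):
            wrong.append(name)
    return wrong

# Hypothetical estimates, for illustration only.
hypothetical = {"price": -0.09, "income": 0.14, "price_complement": 0.05}
print(check_signs(hypothetical))  # ['price_complement']
```

A non-empty result does not prove the model wrong, but it flags coefficients whose validity should be questioned before the equation is used.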
Example: Suppose we wanted to estimate the demand for pizza by university students in Malaysia. What variables would most likely affect their demand for pizza? Remember demand theory? We could start to answer this question by using price and all the nonprice determinants, such as income, prices of related goods, tastes and preferences, future expectations, and number of buyers. But it is not always possible or appropriate to include all these variables in a particular demand estimation. Why? Because of the availability of data and the cost of generating new data.

The two types of data used in regression analysis are cross-sectional and time series. For the purpose of illustration, let us assume we have obtained cross-sectional data on university students (public and private universities in Malaysia) by conducting a survey of thirty randomly selected university campuses during a particular month.

Second step: Data Collection

Suppose we have gathered the following information for each campus from this survey:
(1) average number of slices (quantity) consumed per month by students,
(2) average price of a slice of pizza in places selling pizza,
(3) annual income (PTPN and FAMA),
(4) average price of a soft drink sold in the pizza places, and
(5) location of the campus (urban versus rural).

The data obtained from our hypothetical survey are presented in Table 1.

Table 1.
Sample data: The demand for pizza

No.   Price_P   Income_Y   PCom_Pc   Loc_X4   QuantityDD
 1    100.00    14.00      100.00    1.00     10
 2    100.00    16.00       95.00    1.00     12
 3     90.00     8.00      110.00    1.00     13
 4     95.00     7.00       90.00    1.00     14
 5    110.00    11.00      100.00    0.00      9
 6    125.00     5.00      100.00    0.00      8
 7    125.00    12.00      125.00    1.00      4
 8    150.00    10.00      150.00    0.00      3
 9     80.00    18.00      100.00    1.00     15
10     80.00    12.00       90.00    1.00     12
11     90.00     6.00       80.00    1.00     13
12    100.00     5.00       75.00    1.00     14
13    100.00    12.00      100.00    1.00     12
14    110.00    10.00      125.00    0.00     10
15    125.00    14.00      130.00    0.00     10
16    110.00    15.00       80.00    1.00     12
17    150.00    16.00       90.00    0.00     11
18    100.00    12.00       95.00    1.00     12
19    150.00    12.00      100.00    0.00     10
20    150.00    10.00       90.00    0.00      8
21    150.00    13.00       95.00    0.00      9
22    125.00    15.00      100.00    1.00     10
23    125.00    16.00       95.00    1.00     11
24    100.00    17.00      100.00    0.00     12
25     75.00    10.00      100.00    1.00     13
26    100.00    12.00      110.00    1.00     10
27    110.00     6.00      125.00    0.00      9
28    125.00    10.00       90.00    0.00      8
29    150.00     8.00       80.00    0.00      8
30    150.00    10.00       95.00    0.00      8

Third step: Data Analysis

To estimate the demand for pizza, we employed the regression function contained in SPSS.

The result

Regression - Demand Estimation: Simple Regression Analysis

Variables Entered/Removed(b)
Model   Variables Entered                                       Variables Removed   Method
1       Location_X4, Tuition_X2, Pri_Cross_X3, Price_X1(a)      .                   Enter
a All requested variables entered.
b Dependent Variable: Quantity_Y

Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .846(a)   .717       .671                1.64048
a Predictors: (Constant), Location_X4, Tuition_X2, Pri_Cross_X3, Price_X1

ANOVA(b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   170.087           4   42.522        15.801   .000(a)
Residual      67.279          25    2.691
Total        237.367          29
a Predictors: (Constant), Location_X4, Tuition_X2, Pri_Cross_X3, Price_X1
b Dependent Variable: Quantity_Y

Coefficients(a)
               Unstandardized Coefficients   Standardized Coefficients
Model 1        B         Std. Error          Beta                        t        Sig.
(Constant)     26.667    3.278                                           8.135    .000
Price_X1       -.088     .018                -.733                       -4.858   .000
Tuition_X2     .138      .087                .174                        1.595    .123
Pri_Cross_X3   -.076     .019                -.438                       -3.948   .001
Location_X4    -.544     .885                -.097                       -.615    .544
a Dependent Variable: Quantity_Y

Fourth Step: Test the Validity of the Independent Variables

A good forecasting model may not satisfy all the statistical tests and uphold all the underlying assumptions. Researchers can visually examine the table of model statistics to determine whether the following criteria are met. The criteria vary with the number of independent variables as well as with the number of observations. The following rule of thumb is based on a model including three independent variables, thirty data points, and the 95% confidence level:

R²: The value of R² falls between 0 and 1. The higher the value, the better the independent variables used in the model jointly explain the variation in the dependent variable. Users should be content when the value of R² is greater than 0.9.

F-test: The calculated F-value should be greater than 3. If not, the estimated model does not represent a good causal relationship between the dependent and independent variables. This is a test of the overall soundness of a model.

t-test: The calculated t-values for the regression coefficients should be greater than 2 in absolute terms. The t-value measures the significance of an individual regression coefficient.

Standard Error of Regression: The smaller the standard error of regression, the better the accuracy of forecasts.

Applied to the output above: R² = .717 falls short of the 0.9 guideline, F = 15.801 exceeds 3, and the t-values for Price_X1 (-4.858) and Pri_Cross_X3 (-3.948) exceed 2 in absolute terms, while those for Tuition_X2 (1.595) and Location_X4 (-.615) do not.

Fifth Step: Data Interpretation

Find the point price elasticity, the point income elasticity, and the point cross-price elasticity at P = 100, Y = 14, Pc = 110, and in an urban area (Loc = 1) if the demand function were estimated to be:

QD = 26.67 - 0.088·P + 0.138·Y - 0.076·Pc - 0.544·Loc

(In the SPSS output these are the coefficients of Price_X1, Tuition_X2, Pri_Cross_X3, and Location_X4, respectively.)

Is the demand for this product (pizza) elastic or inelastic? Is it a luxury or a necessity? Does this product have a close substitute or complement? Find the point elasticities of demand.

Let us assume the explanatory variables have the following values:
Price of Pizza (P) = 100 (i.e., RM1.00)
Income (Y) = 14 (i.e., RM14,000)
Price of Soft Drink (Pc) = 110 (i.e., RM1.10)
Location (Loc) = Urban Area = 1

Answer
• First find the quantity at these prices and income:
QD = 26.67 - 0.088·(100) + 0.138·(14) - 0.076·(110) - 0.544·(1) = 10.898
• Price elasticity: ED = (∂Q/∂P)(P/Q) = (-0.088)(100/10.898) = -0.81, which is inelastic, since its absolute value is less than 1.
• Income elasticity: EY = (∂Q/∂Y)(Y/Q) = (0.138)(14/10.898) = +0.177, which indicates a normal good, but a necessity, since the elasticity lies between 0 and 1.
• Cross-price elasticity: EAB = (∂QA/∂PB)(PB/QA) = (-0.076)(110/10.898) = -0.767, which indicates that the soft drink is a complement.

Sixth Step: Conclusions

Combined Effect of Demand Elasticities

Example: The firm can use these elasticities to forecast the demand for its product (coffee) next year.
• Firm XYZ has a price elasticity of -2 for coffee.
• Firm XYZ has an income elasticity of 1.5.
• The cross-price elasticity is +0.50.
• Most managers find that prices and income change every year. The combined effect of several changes is additive:
%ΔQ = ED(%ΔP) + EY(%ΔY) + EX(%ΔPR)
» where P is price, Y is income, and PR is the price of a related good.
• If you know the price, income, and cross-price elasticities, then you can forecast the percentage change in quantity.

Q: What will happen to the quantity sold if you raise price 3%, income rises 2%, and the price of substitute goods rises 1%?
A:
» %ΔQ = ED • %ΔP + EY • %ΔY + EX • %ΔPR
» = -2 • 3% + 1.5 • 2% + 0.50 • 1%
» = -6% + 3% + 0.5%
» %ΔQ = -2.5%. We expect sales to decline.

Q: Will total revenue for your product rise or fall?
A: Total revenue will rise slightly (about +0.5%), as the price goes up 3% and the quantity of coffee sold falls 2.5%.
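The arithmetic in the fourth to sixth steps can be checked with a short script. The sketch below is not the author's procedure: it takes the sums of squares from the ANOVA table and the coefficients from the SPSS Coefficients table (including the price coefficient of -0.088, which is the value that reproduces Q = 10.898), and all function and variable names are assumptions made for illustration.

```python
from math import sqrt

# Fourth step: recompute the summary statistics from the ANOVA table.
SSR, SSE = 170.087, 67.279        # regression and residual sums of squares
N, K = 30, 4                      # observations, independent variables

r2 = SSR / (SSR + SSE)
adj_r2 = 1 - (1 - r2) * (N - 1) / (N - K - 1)
mse = SSE / (N - K - 1)           # residual mean square
f_stat = (SSR / K) / mse
see = sqrt(mse)                   # standard error of the estimate

# Fifth step: point elasticities of the linear demand function.
INTERCEPT = 26.67
COEF = {"P": -0.088, "Y": 0.138, "Pc": -0.076, "Loc": -0.544}
POINT = {"P": 100, "Y": 14, "Pc": 110, "Loc": 1}

def quantity(point):
    """Quantity demanded predicted by the linear demand function."""
    return INTERCEPT + sum(COEF[name] * value for name, value in point.items())

def point_elasticity(name, point):
    """(dQ/dX) * (X/Q) evaluated at `point` for the variable `name`."""
    return COEF[name] * point[name] / quantity(point)

q = quantity(POINT)
e_price = point_elasticity("P", POINT)
e_income = point_elasticity("Y", POINT)
e_cross = point_elasticity("Pc", POINT)

# Sixth step: combined (additive) effect of several percentage changes.
def pct_change_in_quantity(elasticities, pct_changes):
    """Sum of elasticity times percentage change over all determinants."""
    return sum(elasticities[key] * pct_changes[key] for key in elasticities)

COFFEE_E = {"price": -2.0, "income": 1.5, "related": 0.5}
CHANGES = {"price": 3.0, "income": 2.0, "related": 1.0}   # in percent

dq = pct_change_in_quantity(COFFEE_E, CHANGES)
# Exact revenue change: (1 + %dP) * (1 + %dQ) - 1, expressed in percent.
revenue_change = ((1 + CHANGES["price"] / 100) * (1 + dq / 100) - 1) * 100

print(f"R2 = {r2:.3f}, adjusted R2 = {adj_r2:.3f}, F = {f_stat:.3f}, SEE = {see:.5f}")
print(f"Q = {q:.3f}, E_price = {e_price:.3f}, E_income = {e_income:.3f}, E_cross = {e_cross:.3f}")
print(f"%dQ = {dq:.1f}%, revenue change = {revenue_change:.3f}%")
```

The exact revenue change works out to a little over +0.4%, which agrees with the "about +0.5%" obtained by simply adding the price and quantity changes.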
Assessment of Model Performance

In business forecasting, a response variable is often driven by many other variables. A good forecasting model does not have to include all of the relevant variables. Once a model attains its optimal performance, including additional variables simply complicates the task of forecasting without adding anything to the accuracy. If two models yield the same forecast accuracy, the one that contains fewer variables should be chosen.

PROBLEMS

QUESTION 1

Husin Sdn. Bhd. is the maker of a high-quality Tongkat Ali. A linear regression model used to estimate the demand function for Husin Tongkat Ali yielded the following results:

QD = 10,425 - 2,910 PX + 0.028 A + 11,100 POP
              (2.88)     (7)       (3.13)
R² = 0.81     S.E.E. = 3.4

where
QD = quantity of Husin Tongkat Ali demanded
PX = price of Husin Tongkat Ali
A = Husin Sdn. Bhd. advertising expenditure (RM)
POP = percentage of the Malaysian population over 21 years of age

(i) Determine the point price elasticity for prices of RM5 and RM10, when A = RM1,000,000 and POP = .05. (10 marks)
(ii) Determine the point advertising elasticity at an advertising level of RM2,000,000, if price remains at RM5 and POP = .05. (5 marks)
(iii) The t-statistic value for each coefficient is given in parentheses. If you know that the demand function was estimated using 25 observations, can you reject at the 95-percent confidence level the hypothesis that there is no relationship between each of the independent variables and QD? (5 marks)
(iv) What steps are usually involved in the estimation of a demand equation by regression analysis? (5 marks)

QUESTION 2

Vivian Maju Segar Sdn. Bhd. (VMS) has hired you as a consultant to analyze the demand for its line of telecommunications devices in 35 different market areas.
The available data set includes observations on the number of thousands of units sold by VMS per month (QX), the price per unit charged by VMS (PX), the average unit price of competing brands (PZ), monthly advertising expenditures by VMS (A), and average gross sales (in $1,000) of businesses in the market area (I). The result of a regression analysis (with t-ratios in parentheses) is given below.

QX = 300   - 6 PX   + 2 PZ   + 0.04 A   + 0.01 I
     (3.0)   (3.33)   (2.5)    (1.33)     (2.5)
R² = 0.91     S.E.E. = 3.6

(a) Evaluate the statistical significance of the equation as a whole and of each of its coefficients. (5 marks)
(b) The average values of the independent variables in the data set used to estimate the equation are PX = $195, PZ = $225, A = $11,000, and I = $200,000. Calculate a point estimate of VMS's average sales and a 95% interval estimate of sales based on these values. (10 marks)
(c) What steps are usually involved in the estimation of a demand equation by regression analysis? (10 marks)