Business Forecasting with Accompanying Excel-Based ForecastX™ Software
McGraw-Hill/Irwin Copyright © 2002 by The McGraw-Hill Companies, Inc. All rights reserved.

The Forecast Process
• The forecast process involves selecting one or more forecasting techniques depending on the type of data available.
• The type of data is determined by evaluating the data for trend, seasonal, and cyclical components.
• In this chapter, we will evaluate a number of data series to see which time-series components exist in each.
• The success of the forecast process depends on the effectiveness of the communication between the managers who use forecasts and the individuals who develop them.
• It is also important for managers to have some familiarity with the methods used in developing the forecast.

The Steps in the Forecasting Process
1. Specify objectives
2. Determine what to forecast
3. Identify time dimensions
4. Consider the quantity and the type of data
5. Select a forecasting method (see Table 2.1)
6. Evaluate its accuracy
7. Prepare and present the forecasts
8. Track the forecasts

Model Evaluation
• In evaluating forecasting models, it is very important to distinguish between fit and accuracy.
• Fit refers to in-sample model performance, whereas accuracy refers to out-of-sample performance.
• In many cases, models that perform well in sample perform very poorly out of sample.
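The distinction between fit and accuracy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic, the polynomial trend models are chosen for convenience, and the function names (`rmse`, `fit_and_accuracy`) are hypothetical rather than part of ForecastX. The last few observations are withheld from estimation and used only to score the forecasts.

```python
import numpy as np

def rmse(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def fit_and_accuracy(series, holdout, degree):
    """Fit a polynomial trend on all but the last `holdout` points.

    Returns (in-sample RMSE, out-of-sample RMSE).
    """
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y), dtype=float)
    t_fit, y_fit = t[:-holdout], y[:-holdout]
    t_out, y_out = t[-holdout:], y[-holdout:]
    coeffs = np.polyfit(t_fit, y_fit, degree)
    fit_rmse = rmse(y_fit, np.polyval(coeffs, t_fit))   # fit
    out_rmse = rmse(y_out, np.polyval(coeffs, t_out))   # accuracy
    return fit_rmse, out_rmse

# Synthetic data (an assumption): linear trend plus seeded random noise.
rng = np.random.default_rng(0)
series = 100 + 2.0 * np.arange(24) + rng.normal(0, 5, size=24)

for degree in (1, 6):
    fit_rmse, out_rmse = fit_and_accuracy(series, holdout=4, degree=degree)
    print(f"degree {degree}: fit RMSE = {fit_rmse:.2f}, "
          f"out-of-sample RMSE = {out_rmse:.2f}")
```

A higher-degree polynomial can never fit the estimation sample worse than a lower-degree one, so comparing fit alone would always favor the more flexible model. The out-of-sample figure is the one to compare.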
Model Evaluation (continued)
• Since forecast accuracy is always the first priority, emphasis should be placed on out-of-sample RMSE (root-mean-squared error) rather than on model fit.
• This is usually accomplished by using a holdout period: a period at the end of the sample for which forecasts made from earlier periods can be used to assess the accuracy of a given model.
• In summary, fit refers to how well the model works with past data, whereas accuracy relates to how well the model works in the forecast horizon (see pg. 52).

Data Patterns
• The data used most often in forecasting are time series, which include a variety of patterns. The best way to observe these patterns is to plot the data over time.
Trend: A long-term change (positive or negative) in the level of the data. When there is neither a positive nor a negative trend, the data are considered stationary.
Seasonal pattern: Seasonality occurs when a regular variation in the level of the data repeats itself at the same time each year, month, or week.
Cyclical pattern: Cyclical fluctuations are related to business cycles. Compared with seasonal fluctuations, they are of longer duration and less regular.
Irregular component: Irregular fluctuations occur randomly.

Gross Domestic Product (GDP) Data Set
A time-series plot of real GDP on a quarterly basis is given in Fig. 2.1. What data patterns can be observed in this time series?
1. Long-term positive trend
2. Cyclical fluctuation
So, GDP is nonstationary and has a cyclical component.

Figure 2-1

Private Housing Starts (PHS) Data Set
• PHS data are plotted in Fig. 2.2 from 1980 to 2000, on a quarterly basis. What data patterns can be observed in this time series?
1. Upward trend
2. Cyclical movements
3. Seasonal pattern
• The cyclical nature of the data is more obvious when PHSSA (deseasonalized PHS data) is compared with the trend.

Figure 2-2

Leo Burnett Advertising Agency (LBB) Data
LBB data are shown in Fig. 2.3 on an annual basis from 1950 to 1995. What data patterns can be observed in this time series?
1. Upward nonlinear trend
Note that there is no need to consider seasonality since these are annual data.

Figure 2-3

Data Patterns and Model Selection
Model selection depends on data patterns. Now, let us select a model by using the information in Table 2.1.
For GDP, which has a trend and a cycle but no seasonality:
• Holt's exponential smoothing
• Linear regression trend
• Causal regression
• Time-series decomposition
For PHS data, which have a trend, seasonality, and a cycle:
• Winters' exponential smoothing
• Linear regression with seasonal adjustment
• Causal regression
• Time-series decomposition
For LBB data, which have a nonlinear trend, nonlinear and causal regression are appropriate.
Table 2-1

A Statistical Review
Descriptive statistics:
1. Measures of central tendency: mean, median, and mode
2. Measures of dispersion: range, variance, standard deviation, and coefficient of variation
• The mean is the arithmetic average of all the numbers in the data, the median splits the data into two equal parts, and the mode is the value that occurs most frequently (see pg. 57).
• The range is the difference between the smallest and the greatest value.
• The standard deviation is the square root of the average squared difference between each observation and the mean. Note that the sum of the unsquared differences around the mean is equal to zero.
• The variance is the square of the standard deviation.
• The coefficient of variation, defined as the standard deviation of the observations divided by the mean, provides a measure of relative variation, whereas the standard deviation provides a measure of absolute variation.
• Using all of these descriptive statistics together gives a better idea about the data than using just the mean.

Table 2-2
Table 2-3

A Statistical Review (continued)
These sales data are plotted over time in Fig. 2.4. What data patterns can be observed in this time series? The sales data are stationary; there is no trend.
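The descriptive statistics reviewed above can be computed with Python's standard `statistics` module. The sketch below is illustrative only: the sales figures are made up (the values of Tables 2-2 and 2-3 are not reproduced here), the helper name `describe` is hypothetical, and the sample (n - 1) formulas are assumed for the variance and standard deviation.

```python
import statistics

def describe(values):
    """Return the descriptive statistics named in the review above."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # square of std_dev
        "std_dev": std_dev,
        "coef_of_variation": std_dev / mean,
        # Unsquared deviations around the mean always sum to zero.
        "sum_of_deviations": sum(v - mean for v in values),
    }

# Hypothetical sales figures with one deliberately large value.
sales = [3, 5, 5, 6, 7, 8, 29]
for name, value in describe(sales).items():
    print(f"{name}: {value:.3f}")
```

In this made-up sample the single large value lifts the mean above both the median and the mode, which is why the mean alone can give a misleading picture of the data.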
• The mean is above the other measures of central tendency because of one large value in the data. This large value pulls up the mean but has little or no effect on the median or mode.

Figure 2-4

Normal Distribution
• The normal distribution for a continuous random variable is defined by the mean and the variance of the variable.
• As seen in Fig. 2.5, all normal distributions are symmetrical around the mean (i.e., the mean is equal to the median).
μ ± 1σ includes about 68% of the area
μ ± 2σ includes about 95% of the area
μ ± 3σ includes about 99.7% of the area

Standard Normal Distribution (Z-distribution)
• To ease the calculation of the area under the normal curve, a normally distributed variable is transformed into a standard normal variable:
Z = (X - μ) / σ
(Z measures the number of standard deviations by which X differs from the mean.)
If Z > 0, then X lies to the right of the mean.
If Z < 0, then X lies to the left of the mean.

Figure 2-5

Example
Suppose the sales for a product are represented by a normal distribution with a mean of 50 and a standard deviation of 10.
What percent of sales would be between 40 and 65?
P(40 < X < 65)
Z1 = (40 - 50) / 10 = -1
Z2 = (65 - 50) / 10 = 1.5
P(-1 < Z < 1.5) = 0.3413 + 0.4332 = 0.7745
What percent of sales would be greater than 30?
P(X > 30) = P(Z > -2) = 0.4772 + 0.5 = 0.9772
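The table lookups in the example above can be cross-checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative and not taken from the text.

```python
from statistics import NormalDist

# Sales distribution from the example: mean 50, standard deviation 10.
sales = NormalDist(mu=50, sigma=10)

# Standardized values, Z = (X - mu) / sigma.
z1 = (40 - sales.mean) / sales.stdev   # -1.0
z2 = (65 - sales.mean) / sales.stdev   #  1.5

# P(40 < X < 65): area under the curve between the two values.
p_between = sales.cdf(65) - sales.cdf(40)

# P(X > 30): area to the right of 30.
p_above = 1 - sales.cdf(30)

print(f"Z1 = {z1}, Z2 = {z2}")
print(f"P(40 < X < 65) = {p_between:.4f}")
print(f"P(X > 30)      = {p_above:.4f}")
```

Both probabilities agree with the table-based answers to four decimal places. The same results follow from the standard normal distribution, `NormalDist()`, evaluated at `z1` and `z2`.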
Table 2-4
Figure 2-6

The Sampling Distribution of the Mean
• If a random sample of n observations is taken from a normal population with mean μ and variance σ², then each observation of the random sample will have the same normal distribution as the population being sampled.
• If we are sampling from a population with an unknown distribution, the sampling distribution of the sample mean X̄ will still be approximately normal, with mean μ and variance σ²/n, provided the sample size is large (Central Limit Theorem).
• The normal approximation for X̄ will be good if n ≥ 30 (a common rule of thumb for applying the Central Limit Theorem).
• If n < 30, the approximation is good only if the population is not too different from a normal distribution.
• If the population is known to be normal, the sampling distribution of X̄ will follow a normal distribution exactly, no matter how small the sample.

Example
What is the probability of selecting a sample of 100 observations with a mean greater than 300 when the true population mean is 288 and the population standard deviation is 60?
Z = (300 - 288) / (60 / √100) = 12 / 6 = 2
P(Z > 2) = 0.5 - 0.4772 = 0.0228

The Student's t-Distribution
• The Student's t-distribution is used when the population variance is not known or when the sample size is small. Since the t-distribution depends on the number of degrees of freedom (df), there are many t-distributions (see Table 2.5). What is df?
• As the sample size gets very large, the Student's t-distribution approaches the normal distribution.
• Go through the examples given on pg. 67 to understand how to read Table 2.5.

Table 2-5

Statistical Inference
• We usually draw a sample from a population and then, by calculating
-- the sample mean and
-- a confidence interval around the sample mean,
we make some inference about the whole population.
Interpret the confidence level using the figure on pg. 69 and the example given on pg. 70.

Hypothesis Testing
Issues to deal with are:
1. Setting up the null and the alternative hypotheses (one-tailed vs. two-tailed tests, see pg. 70 and 71)
2. Choosing the confidence level
3. Determining the level of significance for one-tailed and two-tailed tests
Probability of a Type I error = significance level = 1 - confidence level
Consider also the effect of sample size on decreasing Type I and Type II errors (examples on pg. 73).

Table 2-6

Correlation
Correlation measures the degree of linear association between X and Y (-1 ≤ r ≤ 1) (see Fig. 2.7).
In this course, we will use the Pearson product-moment correlation (see pg. 75).
We can perform a hypothesis test to check for the existence of a linear association between two variables (see examples on pg. 76).
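The correlation coefficient and its significance test can be sketched as follows. Several things here are assumptions rather than content from the slides: the paired observations are hypothetical, the function names are illustrative, and the test uses the standard statistic t = r√(n - 2) / √(1 - r²) with n - 2 degrees of freedom, which may be presented differently on the referenced textbook pages.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t_statistic(r, n):
    """t statistic for H0: no linear association (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
t_stat = correlation_t_statistic(r, len(x))

# Two-tailed 5% critical value for 3 degrees of freedom, read from a t table.
critical_t = 3.182

print(f"r = {r:.4f}, t = {t_stat:.4f}")
print("reject H0" if abs(t_stat) > critical_t else "do not reject H0")
```

With only five observations, even a fairly large r does not clear the critical value, which illustrates why sample size matters when judging whether an association is statistically significant.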
Figure 2-7
Figure 2-7 (continued)

Correlograms
Correlograms help to measure the correlation between successive observations over time. The autocorrelation at lag k, r_k, can be measured using the formula on pg. 78.
If the time series is stationary, r_k will approach zero rapidly as k increases.
If there is a trend, r_k will approach zero slowly.
If there is seasonality in the data, the value of r_k will be significantly different from zero at k = 4 for quarterly data or k = 12 for monthly data.
A k-period plot of autocorrelations is called an autocorrelation function (ACF) or a correlogram.
We can perform a hypothesis test to check whether the autocorrelation at lag k is significantly different from zero. We reject the null hypothesis if:
|r_k| > 2/√n

Autocorrelation Structure of Real GDP
From Figure 2.8, it is clear that GDP has a fairly strong positive trend (see hypothesis test results on pg. 79).
In order to use a forecasting method that requires stationary data, we need to transform the GDP data (see Figure 2.9 on pg. 80) to a stationary series (see Figure 2.9 on pg. 81).
To transform, the first differences of the GDP data are calculated as follows:
DGDP_t = GDP_t - GDP_{t-1}
Fig. 2.10 shows the autocorrelation structure of the DGDP data.

Figures 2-8 to 2-14 (including Figure 2-11 (continued)); Table 2-4 (continued); Table 2-5 (continued); Table 2-7
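The correlogram test and the first-difference transformation described above can be sketched in plain Python. The following is an illustration under stated assumptions: the series is a synthetic random walk with drift standing in for a trending series such as GDP, the function names are hypothetical, and `autocorrelation` uses the standard sample formula (the sum of lagged cross-products of deviations divided by the sum of squared deviations), since the textbook formula on pg. 78 is not reproduced in these slides.

```python
import math
import random

def autocorrelation(series, lag):
    """Sample autocorrelation r_k of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom

def first_difference(series):
    """Return y_t - y_(t-1) for t = 1, ..., n - 1."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

def is_significant(r_k, n):
    """Rule of thumb from the slides: reject H0 if |r_k| > 2 / sqrt(n)."""
    return abs(r_k) > 2 / math.sqrt(n)

# Synthetic trending series (an assumption): random walk with drift.
rng = random.Random(42)
series = [100.0]
for _ in range(39):
    series.append(series[-1] + 5 + rng.gauss(0, 3))

differenced = first_difference(series)

for name, data in (("level", series), ("first difference", differenced)):
    n = len(data)
    print(f"{name} (n = {n}, threshold = {2 / math.sqrt(n):.3f})")
    for k in range(1, 5):
        r_k = autocorrelation(data, k)
        flag = "significant" if is_significant(r_k, n) else "not significant"
        print(f"  r_{k} = {r_k:+.3f}  {flag}")
```

For the level series the autocorrelations start high and decline slowly, the signature of a trend. After differencing they are typically much closer to zero, although any single lag can still exceed the threshold by chance.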