Download Decision Technology - Villanova University

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

History of statistics wikipedia , lookup

Gibbs sampling wikipedia , lookup

Transcript
Decision Technology
Modeling, Software and Applications
Matthew J. Liberatore
Robert L. Nydick
John Wiley & Sons, Inc.
Financial Simulation
Using @Risk
OVERVIEW OF @RISK
@Risk is a spreadsheet add-in that has two advantages.
1.
2.
@Risk provides easy access to many probability
distributions.
Add-ins make it easier to develop simulation
models than Excel alone.
Overview of @Risk features:
 Many built-in input probability distributions.
 Output cells can be specified. @Risk maintains
statistics for these cells across replications.
 The Risksimtable command allows the user to run
the simulation several times for different values of
input parameters.
OVERVIEW OF @RISK
The development of a simulation model is a two-step process:
1.
Build the model, that is, the logic that transforms input
(e.g., demand uncertainty) into output (e.g., profit).
2.
@Risk automatically replicates the model with many
random input values, reporting output as requested.
Simply put, @Risk takes any spreadsheet (e.g., income
statement, cash flow statement) and specifies that some of
the cells are uncertain. @Risk then repeats the simulation
many times and computes statistics across replications.
@Risk is really an advanced sensitivity analysis since it
studies how changes to key input parameters impacts
output.
Although we use @Risk, a similar add-in is Crystal Ball.
FITTING A PROBABILITY DISTRIBUTION
Historical data are needed to estimate probability
distributions and parameters for uncertain
parameters.
Suppose we believe that the probability distribution of
demand can be estimated using the past data that
was gathered and appears in the demand.xls file.
We can use the “Fit Distributions to Data” button in
@Risk to analyze the data. After selecting this
button, enter the following information: Excel Data
Range, Type of Data (Sampled Values), No
Filtering Options, Continuous Domain, and OK.
FITTING A PROBABILITY DISTRIBUTION
@Risk will analyze the data and use goodness of fit
statistical tests to rank order probability
distributions and their parameters for the data
entered.
In our example, we see that the best fit probability
distribution is Normal with a mean of 99.414 and a
standard deviation of 11.059.
To attempt to validate this feature, the data in
demand.xls was generated from a normal
distribution with a mean of 100 and a standard
deviation of 10.
FITTING A PROBABILITY DISTRIBUTION
The difference between 99.414 and 100 and 11.059 and
10 is due to the small sample size. The first
numbers in each pair are the sample estimates while
the second numbers are the population values.
If this were actual data, we would use the normal
distribution with mean of 99.414 and standard
deviation of 11.059 as our demand probability
distribution within an @Risk model.
We are now ready to create our first @Risk model.
CALENDAR SIMULATION
Calendars are sold as a fundraiser activity.
It costs $7.50 to purchase each calendar and they are
sold for $10.00.
Unsold calendars can be returned for $2.50 each.
The number of calendars demanded is assumed to
follow a triangular distribution with minimum,
most likely, and maximum values of 100, 175, and
300, respectively.
How many calendars should be ordered?
CALENDAR SIMULATION
We begin by simulating 1000 replications for an order
size of 200. The following values and formulas are
needed. See Calendar template.xls.
B4: 7.50, B5: 10.00, B6: 2.50
E4-E6: 100, 175, 300
B9: 200
B13: =ROUND(RISKTRIANG(E4,E5,E6),0)
C13: =B5*Min(B9,B13)
D13: =B4*B9
E13: =B6*MAX(B9-B13,0)
F13: =C13+E13-D13
B16-B19: =riskmin(F13), =riskmax(F13),
=riskmean(F13), =riskstddev(F13)
CALENDAR SIMULATION
Cells B4 – B6 contain the data for this problem.
Cells C13 – F13 contain the formulas for
Revenue, Salvage Value, Costs, and Profit.
The formula in B13 computes the random
demand based on the values given in E4:E6.
B9 specifies that 200 calendars will be ordered.
B16:B19 provide statistical results of the output.
CALENDAR SIMULATION
With cell F13 highlighted, choose the Add Output button to
make this the output cell.
Click on the Simulation Settings button and select 1000
iterations and 1 simulation in the Iterations tab.
In the sampling tab of the Simulation Settings button, select
Latin Hypercube, Standard Recalc (Monte Carlo),
Fixed = 1, All, check Save as Default, and OK.
Click on the Report Settings button and select: Show
Interactive @Risk Results Window, Generate Excel
Reports Selected Below, Simulation Summary, Output
Graphs, Active Workbook, Metafile, check Save as
Default, and OK.
What is Sampling?

Sampling is the process by which values are randomly
drawn from the selected distribution.
 In @Risk, during each iteration of the simulation, one
observation is chosen from the input distribution.
 As the number of iterations increases, the sample of
observations more closely resembles the input
distribution.
 When running a simulation, it is important that all areas
of the input distribution get sampled, especially the low
probability (high uncertainty) areas. If not, uncertainty
will seem less than it actually is.
The Concept of “Efficiency”
 Statisticians
have developed different ways to
sample (or draw) from distributions.
If we could do an infinite number of iterations in our
simulation, these methods would produce equal results.
 However, since we use a finite number of iterations,
sampling methods do not produce equivalent results.

 A sampling method
is considered more efficient
than another if it approximates a distribution
with fewer iterations.
 Two popular sampling methods:
Monte Carlo Simulation
 Latin Hypercube

Monte Carlo Simulation
 Monte
Carlo simulation draws samples from the
full range of the distribution on each draw.
Is an entirely random sampling technique.
 Requires a large number of iterations to adequately
approximate the input distribution.
 Why? Most observations drawn are closer to the mean.
Creates clustering. The tails (areas of high uncertainty)
are usually underrepresented in the sampling.

Latin Hypercube
 Latin
Hypercube samples from all parts of the
distribution, reducing clustering.
Not entirely random (is a “stratified” sampling method)
 Latin Hypercube divides a distribution into intervals
(strata) of equal probability and randomly draws from
each interval.
 Insures that all portions of the distribution are sampled,
including the tails.

 Latin
Hypercube sampling is more efficient
than Monte Carlo sampling:

Requires fewer iterations.
Example:

Suppose we sample 8 times from a normal distribution.
With Monte Carlo sampling we might get:
Notice that the tails (high
uncertainty areas) are not
adequately represented.
This results in
underestimating risk
Source: Modeling the Future: The Full Monte, the Latin Hypercube and Other Curiosities
by Glenn Kautt, CFP, and Fred Wieland, Ph.D., FPA Journal
Example continued
 With
Latin Hypercube, we would get:
Notice that even with only 8
observations, the tails are
much more adequately
represented. This results in a
truer representation of risk.
The area in each strata is
equal but the width of each
strata varies.
Source: Modeling the Future: The Full Monte, the Latin Hypercube and Other Curiosities
by Glenn Kautt, CFP, and Fred Wieland, Ph.D., FPA Journal
CALENDAR SIMULATION
To run the simulation, select the Start Simulation
button.
@Risk will create a Results window.
Choose the Detailed Statistics Window to
display additional output.
Average profit for 1000 trials when 200
calendars are ordered is $337.4975 with a
standard deviation of $189.0535.
CALENDAR SIMULATION
Scroll to the bottom of the Detailed Statistics Window
to enter a target value or target percentage. For
example, if 0 is entered in the target #1 value cell,
7.5% is returned. This means that 7.5% of the 1000
profits were 0 or negative.
Enter 12% in the target #2 percentage cell to see that
12% of the profit values were less than or equal to
$65.
To view the full results – data, demand, and profit
values from all 1000 replications – select the
Insert/Data option in the Results Window.
CALENDAR SIMULATION
To display a graph of profit, click on the Profit item in
the left pane of the Results Window and then select
the Insert/Graph/Histogram item.
We see a spike at a profit of $500. This occurs
whenever a demand greater than 200 is generated
resulting in a profit of $500.
CALENDAR SIMULATION
95% confidence interval:
Mean profit +/- 2(Mean Standard Error),
where, Mean Standard Error = Standard
deviation/SQRT(Iterations)
When 200 calendars are ordered: Mean profit =
337.4975.
Mean Standard Error = 189.0535/SQRT(1000) = 5.98.
95% Confidence Interval = 337.4975-2(5.98) = 325.54
and 337.4975+2(5.98) = 349.45.
This means that we are 95% confident that the true
population mean of profit is between 325.54 and
349.45.
CALENDAR SIMULATION
If we want to narrow the range of the confidence
interval but keep the confidence level fixed
at 95%, the number of replications must be
increased. To accomplish this, we use the
following formula:
n = [16*Estimated standard deviation^2]/L^2
Where L is the width of the confidence interval.
On the previous slide we saw that L = 23.91.
CALENDAR SIMULATION
If we want L to be 10, then:
n = [16*189.0535^2]/10^2 = 5718.60
This implies that we need 5719 iterations to get a
95% confidence interval range of 10.
MULTIPLE ORDER QUANTITIES
The objective of this problem is to choose an order
quantity that maximizes profit. We could continue
to rerun the simulation for different order
quantities.
It would be much better if we could have @Risk
automatically evaluate several order quantities on
the same set of random demand values. The
Risksimtable command accomplishes this.
Enter “Possible order quantities” in cell D8 and the
desired order quantities in cells D9 – H9: 150, 175,
200, 225, and 250.
In cell B9 enter: =Risksimtable(D9:H9).
MULTIPLE ORDER QUANTITIES
In the Simulation Settings dialog box, select 5 as the
number of simulations since 5 different order
quantity values need to be evaluated.
After running the simulations, @Risk shows the results
for all 5 order quantities in the Detailed Summary
of the Results Window.
The order quantity of 175 results in the largest mean
profit; however, we may want to sacrifice a small
amount of profit (367.2025 to 354.165) to improve
the standard deviation (121.8619 down to
59.00198).
RISK OPTIMIZER
We have now evaluated 5 different order quantities but
we still have no guarantee that there isn’t an order
quantity that generates even more profit.
Risk Optimizer will accomplish this. Go back to
Calendar.xls and also launch Risk Optimizer.
Next select the Risk Optimizer Settings button:
 We want to find the Maximum of the Mean of cell
$F$13 (Profit).
 Next, select: By Adjusting the Cells and Add:
Adjust the Cells: Min 100, Cell Range is $B$9,
Max is 300, check Integer Values Only, then Add,
and OK.
RISK OPTIMIZER
Select Options and choose the following:
 Population Size: 50
 Log Simulation Data
 Random Number Seed is Random
 Use Same Seed Each Sim
 Check Change in Last 100 Valid Sims is Less Than
0.01%.
 Stop on Actual Convergence
 Tolerance Auto
 OK
These setting determine stopping rules for the
optimization search.
RISK OPTIMIZER
Next choose OK, then the Start Optimization button, and OK.
Risk Optimizer then plays a king of the hill type of search.
Depending on the level of complexity of the problem, this could
take a VERY long time to stop. You can always select the
Stop button to identify the best solution found so far.
Eventually, Risk Optimizer will report the best order size and
corresponding profit that it found. Note that we have no
guarantees that this is in fact the true optimal solution since
it was identified using simulation.
In addition, if the same problem is run multiple times, there is a
good chance that the solution will vary somewhat.
ADDITIONAL UNCERTAINTY
 As
in the previous example, we need to
place an order for next year’s calendar.
 See Additional Uncertainty template.xls.
 We continue to assume that the calendars
will sell for $10 (B6) and demand at this
price is triangularly distributed with
minimum, most likely, and maximum
values of 100, 175, and 300, respectively
(E5:E7).
 However, there are now two other sources
of uncertainty.
ADDITIONAL UNCERTAINTY
 The
maximum number of calendars that can
be supplied follows a triangular distribution
with values of 125, 200, and 250 (E10:E12).
 The supplier charges $7.50 per calendar
(B4) if he can supply the entire order.
Otherwise, he will charge only $7.25 per
calendar (B5).
 Unsold calendars cannot be returned for a
refund. Instead, they will be put on sale for
$5 a piece after February 1 (B7).
ADDITIONAL UNCERTAINTY
 At
$5, we believe the demand for leftover
calendars is triangularly distributed with
parameters of 0, 50, and 75 (F5:F7).
 Any calendars still left over after March 1,
will be thrown away.
 We plan to order 200 calendars and want to
use simulation to analyze the resulting
profit.
ADDITIONAL UNCERTAINTY
 As
before, we first need to develop the
model.
 Then we can run the simulation with @Risk
and examine the results.
 The model itself requires a bit more logic
than the previous models.
ADDITIONAL UNCERTAINTY
 The
model can be developed as follows:
 Random
inputs. There are three random inputs:
the most the supplier can supply, the customer
demand when the selling price is $10, and the
customer demand for sale-price calendars.
Generate these values in A16, D16 and G16 as:
=ROUND(RiskTriang(E10,E11,E12),0),
=ROUND(RiskTriang(E5,E6, E7),0) and
=ROUND(RiskTriang(F5,F6, F7),0).
 Random potential demand is generated at the sale
price even though there might not be any
calendars left to put on sale.
ADDITIONAL UNCERTAINTY
 Actual
supply. The number of calendars
supplied is the smaller of the number ordered
and the maximum the supplier is able to supply.
Calculate this value in cell B16 as:
=MIN(A16,B10).
 Order cost. The reduced price of $7.25 is
charged if the supplier cannot supply the entire
order. Otherwise, $7.50 per calendar is paid.
Calculate the total order cost in cell C16 as:
=IF(A16>=B10,B4,B5)*B16
ADDITIONAL UNCERTAINTY
 Other
quantities. Calculate the revenue from
regular-price sales in cell E16 with the formula
=B6*MIN(B16,D16).
 Calculate the number left over after regularprice sales in cell F16 with the formula
=MAX(B16-D16,0).
 Calculate revenue from sale-price sales in cell
H16 with the formula =B7*MIN(F16,G16).
 Calculate profit and designate it as an output
cell for @Risk in cell I16 with the formula
=RISKOUTPUT( )+E16+H16-C16.
ADDITIONAL UNCERTAINTY
 Next
specify the simulation settings, specify
the report settings and run the simulation.
 When there are several input cells, @Risk
generates a value from each of them
independently and calculates the
corresponding profit on each iteration.
 The results indicate an average profit of
$396.29, a 5th percentile of $55.75, a 95th
percentile of $528, and a distribution of
profits that is again skewed to the left.
ADDITIONAL UNCERTAINTY
 We
now demonstrate a feature of @Risk
that is particularly useful when there are
several random input cells.
 This feature lets us see which of these
inputs is most related to, or correlated with,
an output cell.
 To perform this analysis, select the
Insert/Graph/Tornado Graph menu item
from the @Risk Results window.
ADDITIONAL UNCERTAINTY
 In
the dialog box, select Profit as the output
variable and the Correlation Sensitivity
button.
 These results show graphically and
numerically how each of the random inputs
correlates with profit – the higher the
correlation, the stronger the relationship
between that input and profit.
 We see that the regular-price demand has by
far the strongest effect on profit (0.793
correlation).
ADDITIONAL UNCERTAINTY
 The
other two inputs, maximum supply
(0.048 correlation) and sale-price demand
(0.034 correlation), are not as important
because they are nearly unrelated to profit.
 If a random input is highly correlated with
an output, then it might be worth the time
and cost to learn more about this input and
possibly reduce the amount of uncertainty
involving it.
CORRELATED VARIABLES
 All
previous problems have had random
numbers be probabilistically independent.
This means that if a random value is much
larger than its mean, the other random
values are unaffected.
 Sometimes values are correlated. If they are
positively correlated, then a large number
for one value will tend to produce a large
number for a second value. While
negatively correlated values tend to move in
opposite directions.
CORRELATED VARIABLES
 Suppose
that there are two different
calendars that are sold but their demand is
negatively correlated. This means that if a
customer buys one calendar they are
unlikely to buy another one. Assume a
correlation of -0.90.
 The other parameters of this problem are the
same as Calendar.xls.
 Compare the profit levels for correlation
values of -0.90, 0, and 0.90.
CORRELATED VARIABLES
 The
RISKCORRMAT (correlation matrix)
function is needed.
 A correlation matrix has 1’s along the main
diagonal since a variable is perfectly
correlated with itself. The correlation values
appear in the other parts of the matrix and
the matrix is symmetric.
 See Correlated Demand template.xls.
 We would like to run this model for 3
different correlation values that appear in
cells I9:K9 as -.90, 0, and 0.90.
CORRELATED VARIABLES
 The
correlation matrix is entered in cells
J5:K6 as 1, =RISKSIMTABLE(I9:K9), =J6,
and 1.
 The RISKSIMTABLE command will allow
@Risk to run the simulation for the three
different correlation values that appear in
cells I9:K9.
 We assume that the company orders 200
calendars of each type (cells B9 and B10).
 The data from Calendar.xls are entered into
cells B4:B6 and E4:E6.
CORRELATED VARIABLES
 The
formulas for Revenue, Cost, Refund,
and Profit from Calendar.xls are entered in
row 14 for product 1 and then copied to row
15 for product 2.
 All values for the two products are summed
in row 16.
 To finish the model we must randomly
generate correlated demands for the two
products in cells B14 and B15.
CORRELATED VARIABLES
 The
demand for product 1 is entered in cell
B14 as:
=ROUND(RISKTRIANG(E4,E5,E6,RISKCORRMAT(J5:K6,1)),0)
 The
demand for product 2 is entered in cell
B15 as:
=ROUND(RISKTRIANG(E4,E5,E6,RISKCORRMAT(J5:K6,2)),0)
 The
first argument of RISKCORRMAT is the
correlation matrix range. The second is an
index of the variable (1 for product 1 and 2 for
product 2).
 Correlated demand values for the two products
will now be generated.
CORRELATED VARIABLES
 Next
specify the simulation settings (1000 runs
and number of simulations 3, one for each
correlation value).
 In the Detailed Statistics window you will see
Profit results for the 3 runs. The mean values
are equal at 674.995.
CORRELATED VARIABLES
 This
may be surprising but can be explained
because @Risk uses the same random
numbers for each run but “shuffles” them in
different orders to get the correct correlations.
 The means are unaffected since this is like
saying the average of 30, 26, and 48 is the
same as the average of 48, 30, and 26.
CORRELATED VARIABLES
 Notice,
however, that the standard deviations
for the three runs are different (157.7571,
262.881, and 365.5104).
 This means that the variation in profit
increases as the correlation goes from negative
to zero to positive.
CORRELATED VARIABLES
 When
demands are negatively correlated, high
demands for one product tend to cancel low
demands for another product making extreme
profit values less likely.
 When demands are positively correlated, high
and low demands tend to go together making
extreme profits more likely.
 This is why investors are warned to diversify
their portfolio to reduce risk.
 Use Risk Optimizer to find the ideal number of
calendars of each type to order.