A Beginner’s Guide to Simulation Studies
Barbara Foster, Ph.D.
Department of Academic Computing Services
UT Southwestern Medical Center
Dallas, TX
Why Do a Simulation Study?
To determine patterns or results from hypothetical situations that may be applied to real situations.
Overview
• Defining the question
• Design the study
• Program the design
• Run the program
– to generate data
– to obtain and accumulate statistic(s) of interest
– to analyze accumulated information
• Draw conclusions
Defining the Question
The Central Limit Theorem in the field of statistics states that, regardless of the underlying pattern of the data (provided its variance is finite),
if a number of samples of size n are randomly selected from a specified population, and
if the average is calculated for each of the samples,
then the distribution of the sample averages approaches a bell shape as the sample size n gets larger.
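Stated a little more formally (this is the standard textbook form of the theorem rather than anything shown on the slides; the symbols μ and σ are notation introduced here): if the observations X are independent draws from a population with mean μ and finite variance σ² > 0, then

```latex
\bar{X} = \frac{\sum X}{n},
\qquad
\sqrt{n}\,\frac{\bar{X} - \mu}{\sigma} \;\xrightarrow{\;d\;}\; N(0,1)
\quad \text{as } n \to \infty .
```

In practical terms, for large n the sample averages are approximately normal with mean μ and standard deviation σ/√n.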
Defining the Question
How large does the sample size have to be for the bell-shaped curve to occur?
Designing the Study
• What is the statistic of interest?
the average (also known as the mean)
Example of data from a random sample of size 5: 2, 4, 3, 5, 4
$$\bar{X} = \frac{\sum X}{n} = \frac{2 + 4 + 3 + 5 + 4}{5} = \frac{18}{5} = 3.6$$
Designing the Study
• What sample sizes are of interest?
5, 10, 25, 50, 100
Designing the Study
• What shapes (also known as distributions) will be sampled from?
Distribution            Reason for Selection
Normal (bell shaped)    Well-studied/Ideal
Uniform                 Constant frequency
Exponential             Extreme values
Gamma                   Non-symmetric
Designing the Study
[Figures: four frequency histograms (vertical axis: Frequency) illustrating the Normal, Uniform, Exponential, and Gamma distributions]
Program the Design
• Start with the most basic unit
• When the basic unit is working correctly, then add the repeats
Program the Design
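• Start with the simplest unit:
/* Draw one random sample of size SXSZ from a standard normal distribution. */
DATA NUMBERS;
  SXSZ=5;
  DO A=1 TO SXSZ;
    X=RANNOR(0);   /* seed 0: the seed is taken from the system clock */
    OUTPUT;        /* inside the loop, so every draw becomes an observation */
  END;
RUN;
/* Compute the sample mean and store it as AVG in data set MEANS. */
PROC MEANS DATA=NUMBERS MEAN NOPRINT;
  VAR X;
  OUTPUT OUT=MEANS MEAN=AVG / AUTOLABEL;
RUN;
/* Tag the mean with the sample size that produced it. */
DATA M;
  SET MEANS;
  SXSZ=5;          /* must match SXSZ in DATA NUMBERS */
  KEEP SXSZ AVG;
RUN;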
Program the Design
• Consolidate results of several runs:
LIBNAME CLT 'C:\PRESENTATIONS\SCSUG';
%MACRO CLT;
  %DO REPEATS=1 %TO 5;
    /* One sample of size 5 per repetition. */
    DATA NUMBERS;
      SXSZ=5;
      DO A=1 TO SXSZ; X=RANNOR(0); OUTPUT; END;
    RUN;
    PROC MEANS DATA=NUMBERS MEAN NOPRINT;
      VAR X;
      OUTPUT OUT=MEANS MEAN=AVG / AUTOLABEL;
    RUN;
    DATA M;
      SET MEANS;
      SXSZ=5;        /* must match SXSZ in DATA NUMBERS */
      KEEP SXSZ AVG;
    RUN;
    /* Accumulate this repetition's mean in the permanent data set. */
    PROC APPEND BASE=CLT.AVG_DATA DATA=M;
    RUN;
  %END;
%MEND CLT;
%CLT;
PROC PRINT DATA=CLT.AVG_DATA;
RUN;
Program the Design
• Add the graphing portion
LIBNAME CLT 'C:\PRESENTATIONS\SCSUG';
%MACRO CLT;
  %DO REPEATS=1 %TO 5;
    /* One sample of size 5 per repetition. */
    DATA NUMBERS;
      SXSZ=5;
      DO A=1 TO SXSZ; X=RANNOR(0); OUTPUT; END;
    RUN;
    PROC MEANS DATA=NUMBERS MEAN NOPRINT;
      VAR X;
      OUTPUT OUT=MEANS MEAN=AVG / AUTOLABEL;
    RUN;
    DATA M;
      SET MEANS;
      SXSZ=5;        /* must match SXSZ in DATA NUMBERS */
      KEEP SXSZ AVG;
    RUN;
    /* Accumulate this repetition's mean in the permanent data set. */
    PROC APPEND BASE=CLT.AVG_DATA DATA=M;
    RUN;
  %END;
  /* After all repetitions, chart the accumulated sample means. */
  PROC CHART DATA=CLT.AVG_DATA;
    VBAR AVG;
  RUN;
%MEND CLT;
%CLT;
Program the Design
• The basic unit is completed. Now add the multiple desired sample sizes.
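One possible way to add the sample sizes is sketched here. It swaps the macro loop for nested DO loops inside a single DATA step, which is an alternative technique and not the approach shown on the slides. The repetition count of 1000 and the data set names SAMPLES and AVG_BY_SIZE are assumptions made for illustration.

```sas
/* Sketch: draw 1000 samples (assumed count) at each sample size of interest. */
DATA SAMPLES;
  DO SXSZ = 5, 10, 25, 50, 100;   /* sample sizes from the design */
    DO REPEATS = 1 TO 1000;       /* assumed number of samples per size */
      DO A = 1 TO SXSZ;
        X = RANNOR(0);            /* standard normal draw, clock-based seed */
        OUTPUT;
      END;
    END;
  END;
RUN;

/* One output row per (SXSZ, REPEATS) combination: the mean of that sample. */
PROC MEANS DATA=SAMPLES NOPRINT NWAY;
  CLASS SXSZ REPEATS;
  VAR X;
  OUTPUT OUT=AVG_BY_SIZE MEAN=AVG;
RUN;

/* One chart of the sample means for each sample size. */
PROC CHART DATA=AVG_BY_SIZE;
  BY SXSZ;
  VBAR AVG;
RUN;
```

Because SXSZ is carried along as a CLASS variable, each mean stays labeled with the sample size that produced it, so no separate labeling step is needed.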
Program the Design
• Now add the desired distributions.
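A minimal sketch of how the four distributions might be added, again using nested DO loops in one DATA step instead of the macro loop. The slides do not give distribution parameters, so the choices here are assumptions: RANUNI(0) is uniform on (0,1), RANEXP(0) is exponential with mean 1, and RANGAM(0, 2) is a gamma with shape 2. The repetition count of 1000 and the data set names are also assumed.

```sas
/* Sketch: sample from each distribution at each sample size. */
DATA SAMPLES;
  LENGTH DIST $ 11;               /* long enough for 'EXPONENTIAL' */
  DO DIST = 'NORMAL', 'UNIFORM', 'EXPONENTIAL', 'GAMMA';
    DO SXSZ = 5, 10, 25, 50, 100;
      DO REPEATS = 1 TO 1000;     /* assumed number of samples */
        DO A = 1 TO SXSZ;
          SELECT (DIST);
            WHEN ('NORMAL')      X = RANNOR(0);
            WHEN ('UNIFORM')     X = RANUNI(0);
            WHEN ('EXPONENTIAL') X = RANEXP(0);
            OTHERWISE            X = RANGAM(0, 2);  /* assumed shape = 2 */
          END;
          OUTPUT;
        END;
      END;
    END;
  END;
RUN;

/* One mean per sample, labeled by distribution and sample size. */
PROC MEANS DATA=SAMPLES NOPRINT NWAY;
  CLASS DIST SXSZ REPEATS;
  VAR X;
  OUTPUT OUT=AVG_BY_DIST MEAN=AVG;
RUN;

/* One chart of sample means per distribution and sample size. */
PROC CHART DATA=AVG_BY_DIST;
  BY DIST SXSZ;
  VBAR AVG;
RUN;
```

The NWAY output of PROC MEANS is ordered by the CLASS variables, which is why the BY statement in PROC CHART works without a separate sort.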
Run the Program
• The time it takes to run the program depends on
– the size of the samples
– the number of repetitions (samples)
– the number of distributions selected
– the requested statistics and/or graphs
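As a rough gauge of the first three factors, the number of random values the program must generate can be counted directly. The symbols D (number of distributions), R (repetitions per sample size) and n_k (the sample sizes) are notation introduced here. The figures assume the four distributions and five sample sizes from the design, with each combination repeated 5 times as in the example macro:

```latex
N_{\text{draws}} = D \times R \times \sum_{k} n_k
  = 4 \times 5 \times (5 + 10 + 25 + 50 + 100)
  = 4 \times 5 \times 190
  = 3800
```

The count grows in proportion to R, so an assumed 1,000 repetitions would require 760,000 draws before any statistics or graphs are produced.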
Draw Conclusions
• From which distributions do the sample means form a normal distribution with small sample sizes?
• From which distributions do the sample means require a large sample size to form a normal distribution?
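One way these questions might be approached numerically, offered as a sketch and not as the method used in the presentation, is to summarize the shape of the sample means with skewness and kurtosis, both of which are near 0 in SAS output when the means are close to normally distributed. The distribution parameters, the repetition count of 1000, and the data set names AVGS and SHAPE are assumptions.

```sas
/* Sketch: generate the sample means directly, one row per sample. */
DATA AVGS;
  LENGTH DIST $ 11;
  DO DIST = 'NORMAL', 'UNIFORM', 'EXPONENTIAL', 'GAMMA';
    DO SXSZ = 5, 10, 25, 50, 100;
      DO REPEATS = 1 TO 1000;     /* assumed number of samples */
        TOTAL = 0;
        DO A = 1 TO SXSZ;
          SELECT (DIST);
            WHEN ('NORMAL')      X = RANNOR(0);
            WHEN ('UNIFORM')     X = RANUNI(0);
            WHEN ('EXPONENTIAL') X = RANEXP(0);
            OTHERWISE            X = RANGAM(0, 2);  /* assumed shape = 2 */
          END;
          TOTAL = TOTAL + X;
        END;
        AVG = TOTAL / SXSZ;       /* mean of this sample */
        OUTPUT;
      END;
    END;
  END;
  KEEP DIST SXSZ AVG;
RUN;

/* Skewness and kurtosis of the sample means for each combination. */
PROC MEANS DATA=AVGS NOPRINT NWAY;
  CLASS DIST SXSZ;
  VAR AVG;
  OUTPUT OUT=SHAPE SKEWNESS=SKEW KURTOSIS=KURT;
RUN;

PROC PRINT DATA=SHAPE NOOBS;
  VAR DIST SXSZ SKEW KURT;
RUN;
```

Reading down the printed table for each distribution shows how quickly SKEW and KURT shrink toward 0 as SXSZ increases, which complements the visual impression from the charts.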
Recommended Reading:
• SAS for Monte Carlo Studies: A Guide for Quantitative Researchers (2002)
Xitao Fan, Ákos Felsővályi, Stephen A. Sivo, Sean C. Keenan