```
Describing Data
Wahyu Wibowo
Central Tendency
A measure of central tendency is a value used to characterize the center of a set of values.
- Useful for quantifying the middle, or central location, of a variable.
- Central location for quantitative data: the mode, the median, and the mean.
- Central location for qualitative data: the mode.

Dispersion Parameter
The boxplot provides an indication of the spread of the values.
- The field of statistics has developed parameters to describe this spread, or dispersion, using a single measure.

- Interquartile range
- Range
- Median absolute deviation
- Standard deviation and variance

Coefficient of Variation
If two variables are measured in different units, their standard deviations cannot be compared directly as measures of dispersion.
- The coefficient of variation (CV) can be used to compare dispersions measured in different units.
- It is equal to the quotient of the standard deviation and the absolute value of the mean.

Skewness
Skewness is a measure of distribution asymmetry.
- Yule & Pearson express the difference between the mean and the median as a degree of deviation from symmetry:

\mathrm{Skew} = \frac{3(\bar{x} - \mathrm{med})}{s}

Values larger than 0 indicate a right-skewed distribution, values less than 0 indicate a left-skewed distribution, and a value of 0 indicates a symmetric distribution.

Kurtosis
Kurtosis helps determine which form of distribution is present.
- Defined as the fourth central moment divided by the fourth power of the standard deviation.

Distribution of data

The distribution describes how the different values are spread around the central location.

Boxplot
Box plots provide a succinct summary of the overall frequency distribution of a variable.
- Six values are usually displayed: the lowest value, the lower quartile (Q1), the median (Q2), the upper quartile (Q3), the highest value, and the mean.

Dotplot
Used to assess and compare distributions by plotting the values along a number line.
- Dotplots are especially useful for comparing distributions.
- The x-axis of a dotplot is divided into many small intervals, or bins. Data values falling within each bin are represented by dots.

Stem and Leaf
Used to examine the shape and spread of sample data.
- The display has three columns:
  o The leaves (right)
  o The stem (middle)
  o Counts (left)

Histograms

The purpose of a histogram is to
graphically summarize the distribution of a
univariate data set.
The histogram graphically shows the
following:
1. center (i.e., the location) of the data;
2. spread (i.e., the scale) of the data;
3. skewness of the data;
4. presence of outliers; and
5. presence of multiple modes in the data

References:
- Exploratory Data Analysis in Business and Economics.
- Toit, S.H.C., Steyn, A.G.W., Stumpf, R.H., Graphical Exploratory Data Analysis, Springer-Verlag.

```
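
As a rough illustration of the measures listed on the slides (central tendency, dispersion, coefficient of variation, Yule-Pearson skewness, and kurtosis), the following Python sketch computes them with the standard library. The function name `describe`, the sample data, the use of the sample (n - 1) standard deviation, and the quartile convention of `statistics.quantiles` are assumptions made for the example; the slides do not specify them.

```python
import statistics as st


def describe(values):
    """Compute the summary measures named on the slides for a list of numbers."""
    n = len(values)
    mean = st.mean(values)
    median = st.median(values)
    s = st.stdev(values)  # sample standard deviation (divides by n - 1)

    # Quartiles; statistics.quantiles uses its default "exclusive" method here.
    q1, _, q3 = st.quantiles(values, n=4)

    # Median absolute deviation: the median distance from the median.
    mad = st.median([abs(v - median) for v in values])

    # Second and fourth central moments (dividing by n), used for kurtosis.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n

    return {
        "mean": mean,
        "median": median,
        "mode": st.mode(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "mad": mad,
        "variance": st.variance(values),
        "sd": s,
        "cv": s / abs(mean),              # undefined when the mean is 0
        "skew": 3 * (mean - median) / s,  # Yule-Pearson skewness
        "kurtosis": m4 / m2 ** 2,         # standardized fourth central moment
    }


sample = [2, 4, 4, 5, 7, 9, 11]  # made-up data for illustration
for name, value in describe(sample).items():
    print(f"{name:>8}: {value}")
```

Other software may use a different quartile rule or divide by n instead of n - 1, so the interquartile range, standard deviation, and the measures derived from them can differ slightly from this sketch.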

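The stem-and-leaf and dotplot slides both describe grouping values before drawing them, which can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the helper names `stem_and_leaf` and `bin_counts` are hypothetical, the data are made up, the values are non-negative integers split into a tens stem and a units leaf, and the count column holds plain per-row counts (statistical packages may show cumulative counts instead).

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return rows of (count, stem, leaves) for non-negative integers."""
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)  # stem = tens, leaf = units digit
    # Stems with no values are skipped in this simplified version.
    return [(len(leaves), stem, "".join(map(str, leaves)))
            for stem, leaves in sorted(rows.items())]


def bin_counts(values, width):
    """Count the values in each bin, as a dotplot or histogram would."""
    counts = defaultdict(int)
    for v in values:
        counts[(v // width) * width] += 1  # key = left edge of the bin
    return dict(sorted(counts.items()))


data = [12, 15, 21, 23, 23, 28, 34, 35, 47]  # made-up data for illustration

# Three columns as on the slide: counts (left), stem (middle), leaves (right).
for count, stem, leaves in stem_and_leaf(data):
    print(f"{count:>2} {stem:>2} | {leaves}")

# One dot per value in each bin of width 10.
for left_edge, count in bin_counts(data, 10).items():
    print(f"{left_edge:>3} {'.' * count}")
```

Narrower bins show more detail but make the shape noisier, so the bin width is a choice to experiment with rather than a fixed rule.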