Analyzing and Interpreting Data

Describing Data
Wahyu Wibowo
Central Tendency
• The value used to characterize the center of the set of values a variable contains.
• Useful to quantify the middle or central location of a variable.
• The central location (quantitative data): the mode, the median, and the mean.
• The central location (qualitative data): the mode.
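The three measures of central location can be computed with Python's standard `statistics` module. The sketch below is only an illustration: the sample lists `values` and `colors` are hypothetical and do not come from the slides.

```python
from statistics import mean, median, multimode

# Hypothetical samples, chosen only for illustration.
values = [2, 3, 3, 5, 7, 10]               # quantitative data
colors = ["red", "blue", "red", "green"]   # qualitative data

center = {
    "mode": multimode(values),   # most frequent value(s), returned as a list
    "median": median(values),    # middle value of the sorted data
    "mean": mean(values),        # arithmetic average
}
print(center)

# For qualitative data only the most frequent category is meaningful here.
print(multimode(colors))
```

`multimode` is used instead of `mode` so that samples with several equally frequent values return all of them.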
Dispersion Parameter
• The boxplot provides an indication of the value spread around the median.
• The field of statistics has developed parameters to describe this spread, or dispersion, using a single measure:
• interquartile range
• range
• median absolute deviation
• standard deviation and variance
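Each of the dispersion parameters listed above reduces the spread to a single number. A minimal sketch, assuming a hypothetical sample `values`, the "inclusive" quartile method, and the sample (n - 1) versions of variance and standard deviation; other conventions give slightly different numbers.

```python
from statistics import median, quantiles, stdev, variance

values = [2, 4, 4, 5, 7, 9, 11]  # hypothetical sample

# Range: distance between the largest and smallest value.
value_range = max(values) - min(values)

# Interquartile range: distance between the upper and lower quartile.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: median distance from the median.
med = median(values)
mad = median([abs(v - med) for v in values])

# Sample variance and standard deviation (divide by n - 1).
var = variance(values)
sd = stdev(values)

print(value_range, iqr, mad, var, sd)
```

The interquartile range and the median absolute deviation are built on the median, so they react less strongly to extreme values than the range or the standard deviation.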
Coefficient of Variation
• If two variables are measured in different units, the values of the standard deviation cannot be used to compare their dispersion.
• The CV can be used to compare dispersions measured in different units.
• It is equal to the quotient of the standard deviation and the absolute value of the mean.
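Because the CV divides the standard deviation by the absolute value of the mean, the unit cancels out. The sketch below illustrates this with hypothetical height data expressed in centimetres and in metres; the function name `coefficient_of_variation` and the use of the sample standard deviation are assumptions made for the example.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Sample standard deviation divided by the absolute value of the mean."""
    m = mean(data)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(data) / abs(m)

# The same hypothetical heights in two different units.
heights_cm = [160, 170, 180]
heights_m = [1.60, 1.70, 1.80]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(cv_cm, cv_m)  # the two values agree up to floating-point rounding
```

The standard deviations differ by a factor of 100 (10 cm versus 0.1 m), while the two CV values are the same, which is what makes the CV usable across units.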
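• Skewness is a measure of distribution asymmetry.
• Yule & Pearson express the difference between median and mean as a degree of deviation from symmetry:

\[
\text{Skew} = \frac{3\,(\bar{x} - x_{\text{med}})}{s}
\]

where \(\bar{x}\) is the mean, \(x_{\text{med}}\) the median, and \(s\) the standard deviation.
• Values larger than 0 indicate a right-skewed distribution, values less than 0 indicate a left-skewed distribution, and values equal to 0 indicate a symmetric distribution.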
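The sign rule can be checked numerically with the median-based skewness measure, three times the difference between mean and median divided by the standard deviation. This is a minimal sketch: the function name `pearson_skew`, the choice of the sample standard deviation, and the three samples are assumptions for illustration.

```python
from statistics import mean, median, stdev

def pearson_skew(data):
    """Median-based skewness: 3 * (mean - median) / standard deviation."""
    s = stdev(data)
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return 3 * (mean(data) - median(data)) / s

right_skewed = [1, 2, 2, 3, 12]           # long tail toward large values
left_skewed = [-x for x in right_skewed]  # mirror image: tail toward small values
symmetric = [1, 2, 3, 4, 5]

for name, sample in [("right", right_skewed),
                     ("left", left_skewed),
                     ("symmetric", symmetric)]:
    print(name, pearson_skew(sample))
```

In the right-skewed sample the single large value pulls the mean above the median, which makes the measure positive; mirroring the data flips the sign.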
• Kurtosis is used to help determine which form of distribution (peaked or flat) is present.
• Defined as the fourth central moment, standardized by the fourth power of the standard deviation.
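A moment-based kurtosis can be sketched in a few lines. The version below is an assumption about the convention: it uses population moments (dividing by n) and returns the fourth central moment divided by the squared second central moment, without subtracting 3. The samples are hypothetical.

```python
from statistics import fmean

def kurtosis(data):
    """Fourth central moment divided by the squared second central moment."""
    m = fmean(data)
    m2 = fmean([(x - m) ** 2 for x in data])  # second central moment (variance)
    m4 = fmean([(x - m) ** 4 for x in data])  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2

flat = [1, 2, 3, 4, 5]                      # evenly spread values
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]    # sharp center with heavy tails

print(kurtosis(flat), kurtosis(peaked))
```

Under this convention a normal distribution has a kurtosis of 3, so values above 3 point to a more peaked, heavier-tailed form and values below 3 to a flatter one.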
Distribution of data
• How the different values are distributed around the central location.
• Box plots provide a succinct summary of the overall frequency distribution of a variable.
• Six values are usually displayed: the lowest value, the lower quartile (Q1), the median (Q2), the upper quartile (Q3), the highest value, and the mean.
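The six values a box plot displays can be computed before anything is drawn. A minimal sketch, assuming the "inclusive" quartile method of Python's `statistics.quantiles` (plotting tools may use a different quartile rule); the function name `box_plot_summary` and the sample are hypothetical.

```python
from statistics import mean, quantiles

def box_plot_summary(data):
    """Return the six values usually shown in a box plot."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "lowest": min(data),
        "Q1": q1,
        "median": q2,
        "Q3": q3,
        "highest": max(data),
        "mean": mean(data),
    }

summary = box_plot_summary([2, 4, 4, 5, 7, 9, 11])  # hypothetical sample
for name, value in summary.items():
    print(f"{name:>8}: {value}")
```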
• Dotplots are used to assess and compare distributions by plotting the values along a number line.
• Dotplots are especially useful for comparing distributions.
• The x-axis for a dotplot is divided into many small intervals, or bins. Data values falling within each bin are represented by dots.
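The binning idea behind a dotplot can be shown with a small text rendering. This is a sketch, not a plotting routine: the function name `dotplot_rows`, the bin width of 1, the letter `o` as the dot symbol, and the sample are all assumptions.

```python
from collections import Counter

def dotplot_rows(data, bin_width=1):
    """Stack one 'o' per value above the bin it falls into."""
    bins = Counter(int(x // bin_width) for x in data)  # values per bin
    lowest_bin, highest_bin = min(bins), max(bins)
    height = max(bins.values())
    rows = []
    # Build the picture from the top row down to the axis.
    for level in range(height, 0, -1):
        row = "".join(
            "o" if bins[b] >= level else " "
            for b in range(lowest_bin, highest_bin + 1)
        )
        rows.append(row.rstrip())
    return rows

rows = dotplot_rows([1, 2, 2, 3, 3, 3, 5])  # hypothetical sample
print("\n".join(rows))
```

Each column is one bin on the number line; the empty column between the bins for 3 and 5 shows a gap in the data.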
Stem and Leaf
• Used to examine the shape and spread of sample data.
• The display has three columns:
o The leaves (right)
o The stem (middle)
o Counts (left)
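The three-column layout can be reproduced in text form. The sketch below assumes non-negative integers, uses the tens digit as the stem and the units digit as the leaf, and prints a plain per-row count in the left column; statistical packages may show cumulative counts there instead. The function name `stem_and_leaf` and the sample are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return display rows as 'count  stem | leaves' for non-negative integers."""
    leaves_by_stem = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)  # tens digit(s) and units digit
        leaves_by_stem[stem].append(leaf)
    return [
        f"{len(leaves):>3} {stem:>3} | {''.join(map(str, leaves))}"
        for stem, leaves in sorted(leaves_by_stem.items())
    ]

display = stem_and_leaf([12, 15, 21, 23, 23, 30])  # hypothetical sample
print("\n".join(display))
```

Stems without any data are skipped in this sketch, so gaps in the data do not appear as empty rows.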
• The purpose of a histogram is to graphically summarize the distribution of a univariate data set.
• The histogram graphically shows the following:
1. center (i.e., the location) of the data;
2. spread (i.e., the scale) of the data;
3. skewness of the data;
4. presence of outliers; and
5. presence of multiple modes in the data.
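A histogram is built from the counts of values falling into equal-width bins. The sketch below computes those counts and prints a text version; the function name `histogram_counts`, the bin width, the starting point, and the sample are assumptions for illustration.

```python
def histogram_counts(data, bin_width, start=0):
    """Map the left edge of each occupied bin to the number of values in it."""
    counts = {}
    for x in data:
        left_edge = start + ((x - start) // bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))

data = [1, 2, 2, 3, 3, 3, 4, 4, 12]  # hypothetical sample with one extreme value
counts = histogram_counts(data, bin_width=2)

for left_edge, n in counts.items():
    print(f"[{left_edge:>2}, {left_edge + 2:>2}) {'#' * n}")
```

In this sample the tallest bar marks the center, the bars around it show the spread, and the isolated bar at 12 is the kind of feature that suggests an outlier.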
References:
• Exploratory Data Analysis in Business and
• Toit, S.H.C., Steyn, A.G.W., Stumpf, R.H., Graphical Exploratory Data Analysis,