Data I

Data
Handbook Chapter 4 & 5
Data
A series of readings that represents a
natural population parameter
 It provides information about the
population itself

Organizing Data

Important prelude to describing and
interpreting data
Charting Data

Tables
– Organized by rows and columns
        Column 1   Column 2   Column 3
Row 1
Row 2
Row 3
Charting Data

Graphs
– Organized by horizontal (abscissa) and vertical
(ordinate) axes
Charting Data

Graphs
– Proper legend
– Properly labeled axes
Graphs

Multiple graphs used for comparing data should
map the same variables on the ordinate and
abscissa and use the same scale for each graph.
Describing data
Descriptions of data indirectly describe
actual population parameters
 Describing the data distribution is a first
step in this process

Data Distributions
Pattern of frequency
 Frequency is how often a particular
value or set of values occurs in a data
set
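
As a minimal sketch of the idea of frequency, the snippet below counts how often each value occurs and prints a simple text histogram. The tree heights are hypothetical values chosen for illustration; they are not taken from the handbook.

```python
from collections import Counter

# Hypothetical tree heights in metres (illustrative values only).
heights = [4, 5, 5, 6, 6, 6, 7, 7, 8]

# Frequency: how often each value occurs in the data set.
frequency = Counter(heights)

# Print a simple text histogram, one row per distinct value.
for value in sorted(frequency):
    print(f"{value:>2} m | {'#' * frequency[value]}")
```

The longest row marks the most frequent value, which is the peak of the distribution.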

Histogram Frequency Graph
Types of distributions
Uniform
 Unimodal
 Bimodal
 Normal
 Skewed

Uniform

The distribution has an equal frequency
(number of occurrences) of each value
or category of values
Uniform Distribution of Tree
Heights
Uniform Frequency Graph
This is not a uniform distribution.
Unimodal
The distribution has an unequal
frequency (number of occurrences) of
each value or category of values
 The distribution has distinct central
values that have a greater frequency
than the others

Unimodal Distribution of Tree
Heights
Unimodal Frequency Graph
Skewed
The distribution has distinct central
values that have a greater frequency
than the others
 The less frequent values are not evenly
distributed on either side of the high
point

Skewed Distribution of Tree
Heights
Skewed Frequency Graph
Bimodal
The distribution has two distinct values
or sets of values that have greater
frequencies than the others
 These values are separated from one
another by less frequent values
 Often indicative of two populations

Bimodal Distribution of Tree
Heights
Bimodal Frequency Graph
Two Populations of Trees
Normal
Frequencies are equally spread out on
either side of a central high point
 Bell shaped
 Most frequent type of distribution

Normal Frequency Graph
Interpreting Data
Descriptive statistics are used to
summarize data
 Several descriptive statistics are used to
describe two important aspects of data
distributions:

– Central Tendency
– Dispersion
Central Tendency
Most data are spread out around a
central high point
 The central values are the ones that
occur most often and are thus important to
report

Measures of Central Tendency

Three common measurements
– Mean
»Average value
– Median
»Center value
– Mode
»Most frequent value
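
The three measures can be sketched with Python's standard `statistics` module. Both data sets below are hypothetical: one is symmetric, so the three measures agree, and one has a long upper tail, which pulls the mean away from the median and mode.

```python
import statistics

# Hypothetical data sets (illustrative values only).
symmetric = [4, 5, 5, 6, 6, 6, 7, 7, 8]
skewed = [4, 5, 5, 5, 6, 7, 12]

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    mean_value = statistics.mean(data)      # average value
    median_value = statistics.median(data)  # center value of the sorted data
    mode_value = statistics.mode(data)      # most frequent value
    print(f"{name}: mean={mean_value:.2f} median={median_value} mode={mode_value}")
```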
Mean

“Typical Value”
$$\text{Mean} = \frac{1}{N} \sum_{i=1}^{N} X_i$$
Normal Distribution and
Central Measures

In a perfectly normal distribution the
mean, median and mode are all the
same
Perfectly Normal Distribution
Dispersion
The distribution of values that occur less
often
 The spread of the data around the
central values is important to report
 Dispersion is about the degree of
clustering of the data

Measures of Dispersion

Two common measurements
– Range
»Distance between the lowest and
highest values
– Standard Deviation
»Typical (root-mean-square) deviation from the mean
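
To illustrate why dispersion is worth reporting alongside the central values, the sketch below compares two hypothetical data sets that share the same mean but differ in how tightly the values cluster. The numbers are made up for illustration.

```python
import statistics

# Two hypothetical data sets with the same mean (10) but different spread.
tight = [9, 10, 10, 10, 11]
wide = [2, 6, 10, 14, 18]

def data_range(values):
    """Distance between the lowest and highest values."""
    return max(values) - min(values)

for name, data in (("tight", tight), ("wide", wide)):
    print(
        f"{name}: mean={statistics.mean(data)} "
        f"range={data_range(data)} "
        f"SD={statistics.stdev(data):.2f}"  # sample SD (N - 1 in the denominator)
    )
```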
Calculate Mean and Range
Ranges
Calculating Standard Deviation
Normal Distribution and
Dispersion
68.27% of values fall within one
standard deviation on either side of the
mean
 95.45% of values fall within two
standard deviations on either side of the
mean
 99.73% of values fall within three
standard deviations on either side of the
mean
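
These percentages can be checked numerically. The sketch below uses the error function from Python's `math` module, relying on the standard identity that the fraction of a normal distribution within k standard deviations of the mean is erf(k / sqrt(2)). The helper name `fraction_within` is an assumption made for this example, and the printed values may differ from a rounded table in the last digit.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k) * 100:.2f}%")
```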

Normal Distribution and
Standard Deviation
Graph of
Dispersion
Accuracy & Precision
Accuracy
Error

Error = Accuracy of a particular data point
relative to an accepted value

$$\text{Absolute Error} = |\text{Accepted} - \text{Data}|$$

$$\text{Percent Error} = \frac{|\text{Accepted} - \text{Data}|}{\text{Accepted}} \times 100$$
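
A minimal sketch of the two error calculations follows. The function names and the example numbers (an accepted value of 9.81 and a measured value of 9.60) are assumptions chosen for illustration, and the percent error assumes the accepted value is positive and nonzero.

```python
def absolute_error(accepted, data):
    """Absolute difference between the accepted value and a data point."""
    return abs(accepted - data)

def percent_error(accepted, data):
    """Absolute error as a percentage of the accepted value (assumed > 0)."""
    return absolute_error(accepted, data) / accepted * 100

# Hypothetical example values.
accepted_value = 9.81
measured_value = 9.60

print(f"absolute error: {absolute_error(accepted_value, measured_value):.2f}")
print(f"percent error:  {percent_error(accepted_value, measured_value):.2f}%")
```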
Precision
Precision is a measure of how
consistent the data within a data set are
relative to each other
 One measure of precision of a data set
is the standard deviation SD provided
that m (the mean) is the accepted value
 m + SD

Calculation of SD of a Data Set
$$SD = \left[ \frac{\sum_{i=1}^{N} (m - X_i)^2}{N - 1} \right]^{1/2}$$
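
The calculation can be followed step by step in code. This is a sketch under the assumption that the formula is the sample standard deviation with N - 1 in the denominator; the readings are hypothetical, and the result is compared against `statistics.stdev` from the standard library as a sanity check.

```python
import math
import statistics

def standard_deviation(values):
    """Sample standard deviation: square root of the summed squared
    deviations from the mean, divided by N - 1."""
    n = len(values)
    m = sum(values) / n                               # the mean
    squared_deviations = sum((m - x) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

# Hypothetical readings (illustrative values only).
readings = [2, 4, 4, 4, 5, 5, 7, 9]

print(f"manual SD:  {standard_deviation(readings):.4f}")
print(f"library SD: {statistics.stdev(readings):.4f}")
```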