Basic statistics
Week 10 Lecture 1
Meanings of statistics

Two meanings of statistics
  Statistics as a group of computational procedures that allow us to find meaning in numerical data
  Statistics as the value (number) you get by performing one of those procedures on a sample
Population parameters and sample statistics
The symbol employed for designating the factor:

Factor               Population parameter   Sample statistic
Mean                 μ                      x̄ or M
Standard deviation   σ                      S or SD
Number or total      N                      n
Thursday, May 20, 2004
ISYS3015 Analytic methods for IS professionals
School of IT, University of Sydney
Functions of statistics

Descriptive statistics
  Describe what the data look like
Inferential statistics
  Draw inferences about a large population from a sample
Descriptive statistics

Points of central tendency: the central point around which the data revolve.

Mode
  The category or observation that appears most frequently in the distribution
  The only appropriate measure of central tendency for nominal variables
Median
  The midpoint of a distribution
  Frequently used to describe the central tendency of ordinal variables
Mean
  Arithmetic average of the values within a data set
  $M = \frac{\sum x_i}{n}$
  Appropriate for interval and ratio variables
Descriptive statistics (cont)

Example: high school student Joe's daily grade in February

          Monday   Tuesday   Wednesday   Thursday   Friday
Week 1    92       69        91          70         90
Week 2    89       72        87          73         86
Week 3    85       75        84          76         83
Week 4    83       77        81          78         79
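As a rough illustration, the three points of central tendency can be computed for Joe's 20 grades with Python's standard `statistics` module. This is only a sketch; the variable names (`grades`, `joe_mode`, and so on) are chosen here for illustration and do not come from the lecture.

```python
from statistics import mean, median, mode

# Joe's daily grades, copied from the table (Week 1 to Week 4, Monday to Friday).
grades = [
    92, 69, 91, 70, 90,
    89, 72, 87, 73, 86,
    85, 75, 84, 76, 83,
    83, 77, 81, 78, 79,
]

joe_mode = mode(grades)      # most frequent value
joe_median = median(grades)  # midpoint of the sorted values
joe_mean = mean(grades)      # arithmetic average, M = sum(x_i) / n

print("mode:", joe_mode)
print("median:", joe_median)
print("mean:", joe_mean)
```

With an even number of observations, `median` averages the two middle values, so the median need not be one of the observed grades.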
Descriptive statistics (cont)

Measures of variation: dispersion or deviation

Range
Average deviation
  $AD = \frac{\sum |x - M|}{n}$
Standard deviation
  The standard measure of variability in most statistical operations
  $s = \sqrt{\frac{\sum (x - M)^2}{n}}$
Variance
  The standard deviation squared
  $s^2 = \frac{\sum (x - M)^2}{n}$
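The measures of variation can be checked against Joe's grades with a short sketch. It uses the divide-by-n formulas given on this slide; the last line also computes the divide-by-(n - 1) variant, a common alternative that is not on the slide, because that variant reproduces the Std. Dev of 7.11 reported on the histogram slide. Variable names are illustrative only.

```python
from math import sqrt

# Joe's daily grades in February (20 school days).
grades = [
    92, 69, 91, 70, 90,
    89, 72, 87, 73, 86,
    85, 75, 84, 76, 83,
    83, 77, 81, 78, 79,
]

n = len(grades)
M = sum(grades) / n  # mean

# Range: distance between the largest and smallest value.
value_range = max(grades) - min(grades)

# Average deviation: mean absolute distance from the mean.
average_deviation = sum(abs(x - M) for x in grades) / n

# Variance and standard deviation, dividing by n as on the slide.
variance = sum((x - M) ** 2 for x in grades) / n
std_dev = sqrt(variance)

# Assumption: dividing by n - 1 instead (not shown on the slide).
sample_std_dev = sqrt(sum((x - M) ** 2 for x in grades) / (n - 1))

print(value_range, average_deviation, variance)
print(round(std_dev, 2), round(sample_std_dev, 2))
```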
Shape of the distribution
The frequency of values from different ranges of the variable
Use a histogram to visually inspect the shape

[Figure: histogram of Joe's daily score. x-axis: Joe's daily score, in bins 65.0 - 70.0, 70.0 - 75.0, 75.0 - 80.0, 80.0 - 85.0, 85.0 - 90.0, 90.0 - 95.0; y-axis: Frequency (0 to 6). Std. Dev = 7.11, Mean = 81.0, N = 20.00]
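The frequencies behind such a histogram can be tallied with a few lines of Python. This is a sketch under one assumption the slide does not state: each bin is treated as half-open, so a grade of 70 is counted in 70 - 75 rather than 65 - 70. The plotting software used for the slide may have drawn its boundaries differently.

```python
# Joe's daily grades in February (20 school days).
grades = [
    92, 69, 91, 70, 90,
    89, 72, 87, 73, 86,
    85, 75, 84, 76, 83,
    83, 77, 81, 78, 79,
]

bin_width = 5
first_bin = 65

# Assumption: bins are [65, 70), [70, 75), ..., [90, 95).
counts = {start: 0 for start in range(first_bin, 95, bin_width)}
for g in grades:
    start = first_bin + ((g - first_bin) // bin_width) * bin_width
    counts[start] += 1

# Print a simple text histogram, one row per bin.
for start, count in counts.items():
    print(f"{start}-{start + bin_width}: {'#' * count} ({count})")
```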
Shape of the distribution: normal distribution




Many characteristics of human populations follow a normal distribution
Horizontally symmetrical and bell shaped
Most of the scores in a normal distribution tend to occur near the center, while more extreme scores on either side of the center become increasingly rare
The mean, median, and mode of the normal distribution are the same
Features of normal distribution

Predictable percentages of the population lie within any given portion of the curve
  About 68% of the population lies within 1 standard deviation of the mean
  About 95.45% of the cases lie within 2 standard deviations of the mean
  95% of the cases fall within 1.96 standard deviations of the mean
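These percentages can be reproduced numerically. For a normal distribution, the fraction of cases within k standard deviations of the mean equals erf(k / sqrt(2)), which the sketch below evaluates with Python's `math.erf`. The helper name `coverage` is made up for this example.

```python
from math import erf, sqrt


def coverage(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    return erf(k / sqrt(2))


for k in (1, 1.96, 2):
    print(f"within {k} standard deviations: {coverage(k):.2%}")
```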
The family of normal curves



Mean determines where the midpoint of the distribution falls
Standard deviation changes the shape of the distribution without affecting the midpoint
Standard normal distribution
  Mean: 0
  Standard deviation: 1
Measuring relative performance

Example
  A student, John, obtained 60 out of 100 in a math exam and 50 out of 100 in an English exam
  To judge his relative performance we need to know, for each exam:
    Mean score?
    Standard deviation?
Standard scores

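Z-score
  Measures the distance, in standard deviation units, of any value in a distribution from the mean
  $Z = \frac{x - \mu}{\sigma}$, where $\mu$ is the mean and $\sigma$ the standard deviation of the distribution
John's standard scores
  $Z_{math} = (60 - 55)/10 = 0.5$
  $Z_{English} = (50 - 45)/5 = 1$
  Although his raw English mark is lower, John's English score lies further above its mean in standard deviation units, so his relative performance is better in English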
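The same calculation can be written as a small function. The means (55 and 45) and standard deviations (10 and 5) are the ones used in the slide's arithmetic; the function name `z_score` is an illustrative choice.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd


# John's marks, with the exam means and standard deviations from the slide.
z_math = z_score(60, mean=55, sd=10)
z_english = z_score(50, mean=45, sd=5)

print("math:", z_math)
print("English:", z_english)
```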
Create index by Z scores in survey research


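Triangulation of measures
We have measures of income and years of education, and we want to combine them to form a socio-economic index
  Annual incomes vary from 5,000 to 500,000
  Years of education vary from 0 to 20
Because the two measures are on very different scales, they are converted to Z scores before being combined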
Computing an index score using Z score
Given population values

                     Income    Years education
Mean                 65,000    11
Standard Deviation   22,000    4

Suppose 2 individuals

     Income    Years education
A    64,000    16
B    86,000    9

Case A
  Income: (64,000-65,000)/22,000 = -0.05
  Education: (16-11)/4 = 1.25
  Socio-economic index score: -0.05 + 1.25 = 1.20

Case B
  Income: (86,000-65,000)/22,000 = 0.95
  Education: (9-11)/4 = -0.50
  Socio-economic index score: 0.95 - 0.50 = 0.45
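The index calculation can be sketched in Python as the sum of each person's Z scores, using the population values and the two cases from the slide. The dictionary layout and the names `population`, `cases`, and `index_score` are assumptions made for this example.

```python
# Population mean and standard deviation for each measure (from the slide).
population = {
    "income": (65_000, 22_000),
    "education": (11, 4),
}

# The two individuals from the slide.
cases = {
    "A": {"income": 64_000, "education": 16},
    "B": {"income": 86_000, "education": 9},
}


def z_score(x: float, mean: float, sd: float) -> float:
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd


def index_score(case: dict) -> float:
    """Socio-economic index: the sum of the Z scores of all measures."""
    return sum(z_score(case[name], mean, sd) for name, (mean, sd) in population.items())


for label, case in cases.items():
    print(f"Case {label}: {index_score(case):.2f}")
```

Summing the Z scores weights income and education equally; a different weighting would be a separate design decision that the slide does not discuss.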
Correlation: measure of relationship


A measure of the relation between two or more variables
Statistic used: correlation coefficient
  Between -1 and 1
Direction of relationship
  The sign of the correlation coefficient
Strength of relationship
  The magnitude (absolute value) of the correlation coefficient
Pearson r correlation





Simple linear correlation
The measurement scales used should be at least interval scales
A scattergram (scatter plot) provides visual inspection of linear correlation
Excel can calculate correlation
Correlation does not indicate causation
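For readers who want to see what the correlation coefficient computes, here is a minimal sketch of Pearson r in plain Python, using the usual definition: the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the five data points are hypothetical and serve only to exercise the code; they are not data from the lecture.

```python
from math import sqrt


def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

A positive r indicates that y tends to rise with x; values closer to 1 or -1 indicate a stronger linear relationship.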
Example

We collected data about the salaries (y) and years of experience (x) for a sample of 50 auditors.
Is there any relationship between salary and years of experience?