26134 Business Statistics
[email protected]
Week 3 Tutorial
Descriptive Statistics
Key concepts in this tutorial are listed below
1. Measures of Central Tendency
2. Measures of Dispersion/Spread
Maths Study Centre CB04.03.331 Open 12pm – 5pm Semester Weekdays
https://www.uts.edu.au/future-students/science/student-experience/maths-study-centre
www.khanacademy.org
www.mahritaharahap.wordpress.com/teaching-areas
In statistics we usually want to analyse a population, but collecting data for the
whole population is usually impractical, too expensive, or impossible. That is why we
collect samples from the population (sampling) and use the sample statistics to make
inferences about the population parameters (inference) with a stated level of
confidence (confidence level).
A population is a collection of all possible individuals, objects, or measurements of
interest. A sample is a subset of the population of interest.
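The difference between a population parameter and a sample statistic can be sketched in Python. This is only an illustration under stated assumptions: the population is made up (hypothetical salaries in $000 generated with a fixed random seed), and the names `population` and `sample` are introduced for this example rather than taken from the tutorial.

```python
import random
import statistics

# ASSUMPTION: a made-up population of 10,000 salaries (in $000), not course data.
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Parameter: a numerical summary of the whole population.
population_mean = statistics.fmean(population)

# Statistic: the same summary computed from a sample (a subset of the population).
sample = rng.sample(population, 100)
sample_mean = statistics.fmean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean. That gap is why an inference is reported together with a confidence level.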
Four Levels of Measurement of data

DATA
  QUALITATIVE/CATEGORICAL (can be grouped)
    NOMINAL: cannot be arranged in any particular order
      BEST MEASURE OF CENTRAL TENDENCY: MODE
      BEST MEASURE OF SPREAD: NONE
    ORDINAL: can be arranged in order
      BEST MEASURE OF CENTRAL TENDENCY: MEDIAN, MODE
      BEST MEASURE OF SPREAD: IQR
  QUANTITATIVE/NUMERICAL (can be measured)
    INTERVAL: zero is arbitrary
      BEST MEASURE OF CENTRAL TENDENCY: MEAN, MEDIAN, MODE
      BEST MEASURE OF SPREAD: RANGE, VARIANCE, S.D. AND IQR
    RATIO: zero is not arbitrary and means none
      BEST MEASURE OF CENTRAL TENDENCY: MEAN, MEDIAN, MODE
      BEST MEASURE OF SPREAD: RANGE, VARIANCE, S.D. AND IQR
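As a rough summary of the chart above, the recommended measures can be kept in a small lookup table. The dictionary `RECOMMENDED` and the function `recommended_measures` are hypothetical names introduced for this sketch; the contents simply restate the chart.

```python
# Recommended measures for each level of measurement, restating the chart above.
NUMERIC_SPREAD = ["range", "variance", "standard deviation", "IQR"]

RECOMMENDED = {
    "nominal": {"central": ["mode"], "spread": []},
    "ordinal": {"central": ["median", "mode"], "spread": ["IQR"]},
    "interval": {"central": ["mean", "median", "mode"], "spread": NUMERIC_SPREAD},
    "ratio": {"central": ["mean", "median", "mode"], "spread": NUMERIC_SPREAD},
}


def recommended_measures(level):
    """Return the recommended measures for a level of measurement (case-insensitive)."""
    return RECOMMENDED[level.lower()]


print(recommended_measures("ordinal"))
```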
Measures of Central Tendency:
The mean is the average of a set of numbers.
• Applicable for interval and ratio data
• Affected by each value in the data set, including extreme values
The median is the middle value of a set of numbers after they have been arranged in order.
• Applicable for ordinal, interval, and ratio data
• Unaffected by extremely large and extremely small values
The mode is the most frequently occurring value in a data set.
• Applicable to all levels of data measurement (nominal, ordinal, interval, and ratio)
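The three measures above can be tried out with Python's standard `statistics` module. The numbers and colours below are hypothetical values chosen for illustration, not data from the tutorial questions; the second list replaces the largest value with an extreme one to show which measures react.

```python
import statistics

# ASSUMPTION: hypothetical values, for illustration only.
typical = [2, 3, 3, 4, 5]
extreme = [2, 3, 3, 4, 500]  # same data with the largest value made extreme

# The mean uses every value, so the extreme value pulls it upwards.
print(statistics.mean(typical), statistics.mean(extreme))      # 3.4 102.4

# The median only depends on the middle of the ordered data, so it stays put.
print(statistics.median(typical), statistics.median(extreme))  # 3 3

# The mode is the most frequent value; it also works for nominal data.
print(statistics.mode(typical))                                # 3
print(statistics.mode(["black", "red", "black", "blue"]))      # black
```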
Measures of Dispersion/Spread:
• RANGE is the difference between the largest and the smallest values in a set of data.
• VARIANCE is the average of the squared deviations from the arithmetic mean (for sample data, the sum of squared deviations is divided by n - 1).
• STANDARD DEVIATION is the square root of the variance. Variance and standard deviation indicate the variability of the measurements.
• COEFFICIENT OF VARIATION (CV) is the ratio of the standard deviation to the mean, expressed as a percentage. It is used to compare the variability of data sets with different means.
• INTER-QUARTILE RANGE (IQR) is the range of values between the first and third quartiles, i.e. IQR = Q3 - Q1. It is a useful measure for ordinal data.
• Z-SCORE is the number of standard deviations a value (x) lies above or below the mean of a set of numbers. It is used for normally distributed data; the normal distribution is covered in the lecture on probability distributions.
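The measures listed above can be sketched with Python's standard `statistics` module. The data set is hypothetical and chosen so the arithmetic is easy to follow by hand. Two assumptions are worth flagging: the sample formulas (dividing by n - 1) are used, and the quartiles come from the module's default "exclusive" method, which may differ slightly from the quartile convention used in the course.

```python
import statistics

# ASSUMPTION: a hypothetical sample, for illustration only.
data = [4, 6, 8, 10, 12]

mean = statistics.mean(data)                   # 8
data_range = max(data) - min(data)             # 12 - 4 = 8

# Sample variance: sum of squared deviations divided by n - 1 (40 / 4 = 10).
variance = statistics.variance(data)
std_dev = statistics.stdev(data)               # square root of the variance

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv_percent = std_dev / mean * 100

# Quartiles (default method="exclusive"); IQR = Q3 - Q1.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Z-score of the value x = 12: how many standard deviations it lies above the mean.
z_score = (12 - mean) / std_dev

print(f"range={data_range}, variance={variance}, sd={std_dev:.2f}")
print(f"CV={cv_percent:.2f}%, IQR={iqr}, z(12)={z_score:.2f}")
```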
a). nominal
b). mode
c). N/A (NOTE: none of the measures of dispersion discussed in this course is meaningful for a nominal variable)
d). The mode is 11, which corresponds to black, so the most preferred colour for a laptop bag is black.

a). ordinal
b). mode and median
c). IQR is the most appropriate measure of dispersion for ordinal data.
d). Median = mode = 11 and IQR = 2.

a). interval
b). mode (= 11), median (= 11) and mean (= 10.3)
c). Range (= 4), Variance (= 1.9), Standard deviation (= 1.4) and IQR (= 2)
d). In general, men tend to have a shoe size between 10 and 11, but sizes vary from 8 to 12, with a standard deviation of 1.38 about the mean of 10.3 and an IQR of 2 around the median.

a). Mean = 50
b). Median = 50
c). Mode = 40 and 60
d). Range = 20
e). Variance = 100
f). Standard Deviation = 10
Both the mean and the median are 50, which means that, in general, the fresh graduate
salary in small companies is about $50,000. The data suggest a bimodal distribution,
with $40,000 and $60,000 as the two most frequent salaries. The difference between
the highest and the lowest salaries is $20,000, with a standard deviation of $10,000
about the average salary.
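The salary answers can be checked against a small data set. The original data are not reproduced in this transcript, so the list below is an assumption: one hypothetical set of salaries (in $000) that happens to give the same answers when the sample (n - 1) formulas are used.

```python
import statistics

# ASSUMPTION: hypothetical salaries in $000; the tutorial's actual data are not shown.
salaries = [40, 40, 50, 60, 60]

mean = statistics.mean(salaries)                # 50
median = statistics.median(salaries)            # 50
modes = statistics.multimode(salaries)          # [40, 60] -> bimodal
salary_range = max(salaries) - min(salaries)    # 60 - 40 = 20
variance = statistics.variance(salaries)        # 400 / (5 - 1) = 100
std_dev = statistics.stdev(salaries)            # sqrt(100) = 10

print(mean, median, modes, salary_range, variance, std_dev)
```

`statistics.multimode` is used instead of `statistics.mode` because it returns every value tied for most frequent, which is what reveals the two modes.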
The most appropriate measure of dispersion for comparing Sydney and Perth is the
coefficient of variation. Since Sydney’s CV (9.78%) is lower than Perth’s CV (13.31%),
we can conclude that Sydney has the lower dispersion/spread.
In Sydney: the median and the mode of the level of satisfaction are both 3.
In Perth: the median is 4 and the mode is 5.
Overall, Perth has a higher customer satisfaction level than Sydney.