Fundamentals of Business Statistics: Marking Scheme
Fundamentals of Business Statistics: C103
NOV 2010 Proposed Model Answers/Marking Scheme
Question One
a) Multiple choice questions:
   i.   d.  (2 marks)
   ii.  d.  (2 marks)
   iii. c.  (2 marks)
   iv.  b.  (2 marks)
   v.   a.  (2 marks)
10 marks
b) True or False
   i.   True  (1 mark)
   ii.  False: A data set can be unimodal (one mode), bimodal (two modes) or, generally, multimodal (many modes), e.g. the data set 3, 4, 8, 9, 19, 19, 23, 25 has two modes.  (2 marks)
   iii. True  (1 mark)
   iv.  True  (1 mark)
5 marks
Total = 15 marks
Question Two
a) A sampling frame is a listing of all the elements/objects/individuals of the population of interest in a study. This study uses a voters' register as its sampling frame.
3 marks
© IOBM2010
b) Advantages and disadvantages of sampling methods:

Simple random sampling
  Advantages:
  • There is no individual bias in selecting elements
  • Sampling variation can be estimated mathematically
  Disadvantages:
  • Selection process tiresome for larger populations
  • Some wards may be overrepresented
  • There may be non-response of selected units

Stratified random sampling (by wards)
  Advantages:
  • Electoral roll can be used as sampling frame since it distinguishes between wards
  • All wards would be represented, i.e. the sample would be more representative of the population
  • Sampling variation for each ward can be estimated mathematically
  • Less tedious to select because each ward contains a manageable number of units
  Disadvantages:
  • Preliminary calculation of sample sizes in each ward necessary
  • Calculation of overall sampling error is less straightforward
  • Non-response may occur

Quota sampling (quotas for each ward)
  Advantages:
  • Quick, as a limited area would be covered
  • Different wards all represented adequately
  Disadvantages:
  • No estimate of sampling variation can be made
  • Results may be biased through the choice of individuals to be approached, and through their willingness or not to reply
c) The voters register will not be an up-to-date list of residents, due to deaths,
migration into or away from Blantyre, or from one ward to another. Again,
structural developments e.g. building, roads, etc may have affected the structure
of wards, the numbers in them and their economic characteristics.
4 marks
Total = 15 marks
Question Three
a) Four variables from analysis: Gender of author, type/classification of topical
issue, number of articles and date.
4 marks
b) (i) Distribution of topical issues by female reporters:
Distribution of Articles by Female Reporters by Type of Topical Issue
Type of Issue | Number of articles
Politics | 18
Gender-based violence | 17
Education | 12
Health | 5
Labour | 5
Other | 43
3 marks
(ii)
4 marks
c) For a simple data set, pie charts are easier to construct than bar charts, and they show proportions of the total rather than the actual numbers of topical issues. A bar chart would be better in b) for making exact comparisons between the subject matters, although a pie chart would be more practical for highlighting the proportion in the "Other" grouping.
4 marks
Total = 15 marks
Question Four
a)
The percentages in each category might have been rounded to the nearest whole number, so rounding errors may have occurred.
1 mark
b)
Advantages of pie charts (2 marks):
• They show the size of each category in relation to the whole
• They are visually appealing
Advantages of bar charts (2 marks):
• They are easy to draw without calculation, using only a ruler
• It is easy to compare the frequency of each category directly on a scale
4 marks
c)
Constructing pie charts
Call Type | Cost (% of total cost) | Number (% of total number of calls made) | Angular Measure (Cost), degrees | Angular Measure (Number), degrees
Daytime | 61 | 50 | 222 | 182
Off peak hours & Weekends | 13 | 33 | 47 | 120
Mobile | 17 | 7 | 62 | 25
02 numbers | 6 | 8 | 22 | 29
All others | 2 | 1 | 7 | 4
(1 mark for each angular-measure column)
2 marks
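The angular measures can be checked by scaling each percentage to a 360° circle. The following is a minimal Python sketch, assuming (as the tabulated angles suggest) that each percentage is divided by its column total of 99 rather than by 100, since the rounded percentages do not sum to 100. The function name `pie_angles` is illustrative and not part of the marking scheme.

```python
def pie_angles(percentages):
    """Scale category percentages to whole-degree pie-chart angles."""
    total = sum(percentages)  # 99 for both columns here, because of rounding
    return [round(360 * p / total) for p in percentages]

# Row order: Daytime, Off peak hours & Weekends, Mobile, 02 numbers, All others
cost_pct = [61, 13, 17, 6, 2]
number_pct = [50, 33, 7, 8, 1]

cost_angles = pie_angles(cost_pct)
number_angles = pie_angles(number_pct)
print(cost_angles)
print(number_angles)
```

Dividing by the actual column total guarantees that the sectors close the circle even when the published percentages carry rounding error.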
Pie Charts:
3 marks
3 marks
d)
Comments might include:
• Most calls are made during the daytime.
• Daytime calls account for almost two-thirds of the total cost/bill.
• Off peak & weekend calls account for exactly one-third of the calls but a much smaller proportion of the total cost/bill.
• Daytime and mobile calls are relatively more expensive than off peak and weekend calls.
• etc.
2 marks
Total = 15 marks
SECTION B
Question Six
a) Estimating measures by calculation:
i.
Mode
Modal class: 50 – 70
\[ \text{Mode} = L + (U - L)\,\frac{f_0 - f_1}{2f_0 - f_1 - f_2} \]
\[ = 50 + 20 \times \frac{1560 - 1200}{2(1560) - 1200 - 1420} \]   (1 mark)
\[ = 50 + 20 \times \frac{360}{500} = 64.4 \]   (1 mark)
= K 64,400.00   (1 mark)
3 marks
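The interpolation for the mode can be verified numerically. This is a minimal Python sketch; `grouped_mode` is an illustrative name, with `f0` the modal-class frequency and `f1`, `f2` the frequencies of the classes immediately before and after it, as in the formula above.

```python
def grouped_mode(lower, width, f0, f1, f2):
    """Interpolated mode: lower + width * (f0 - f1) / (2*f0 - f1 - f2)."""
    return lower + width * (f0 - f1) / (2 * f0 - f1 - f2)

# Modal class 50 - 70 (K'000): f0 = 1560, preceding f1 = 1200, following f2 = 1420
mode = grouped_mode(lower=50, width=20, f0=1560, f1=1200, f2=1420)
print(f"{mode:.1f}")  # prints 64.4, i.e. K 64,400
```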
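ii.
Median
Salary (K,000) | f | F
Below 20 | 620 | 620
20 but below 50 | 1200 | 1820
50 but below 70 | 1560 | 3380
70 but below 90 | 1420 | 4800
90 but below 120 | 1190 | 5990
120 but below 200 | 700 | 6690
200 but below 300 | 350 | 7040
300 but below 500 | 150 | 7190
500 or more | 10 | 7200
(Cumulative frequencies F: 2 marks)
\[ \text{Median position} = \frac{1 + \sum f}{2} = \frac{7201}{2} = 3600.5 \]   (1 mark)
∴ Median class is 70 – 90   (1 mark)
With F the cumulative frequency below the median class and f the frequency of the median class itself:
\[ \text{Median} = L + C \times \frac{\frac{\sum f}{2} - F}{f} = 70 + 20 \times \frac{\frac{7200}{2} - 3380}{1420} \]   (1 mark)
\[ = 70 + 20 \times 0.1549 = 73.0986 \]
≈ K 73,100.00   (1 mark)
6 marks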
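iii.
Upper and lower quartiles:
\[ \text{Lower quartile position} = \frac{1 + \sum f}{4} = \frac{7201}{4} = 1800.25 \]   (1 mark)
\[ \text{Upper quartile position} = 3 \times \frac{1 + \sum f}{4} = 3 \times \frac{7201}{4} = 5400.75 \]   (1 mark)
The lower quartile lies in the class 20 – 50 (f = 1200) and the upper quartile in the class 90 – 120 (f = 1190):
\[ \therefore Q_1 = 20 + 30 \times \frac{\frac{7200}{4} - 620}{1200} = 49.5 = \text{K } 49{,}500.00 \]   (2 marks)
\[ \therefore Q_3 = 90 + 30 \times \frac{\frac{3 \times 7200}{4} - 4800}{1190} = 105.13 \approx \text{K } 105{,}100.00 \]   (2 marks)
6 marks
iv.
Quartile deviation:
\[ qd = \frac{Q_3 - Q_1}{2} = \frac{105{,}100 - 49{,}500}{2} \]   (1 mark)
= K 27,800.00   (1 mark)
2 marks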
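The median and both quartiles use the same interpolation rule, so one routine can check all three. The sketch below is a hedged illustration: `grouped_quantile` is a hypothetical helper, a lower boundary of 0 is assumed for the "Below 20" class, and the open-ended top class is given an infinite upper boundary (no quantile computed here falls in it). The divisor is always the frequency of the class that contains the target position.

```python
from itertools import accumulate

def grouped_quantile(bounds, freqs, p):
    """Interpolate the p-th quantile (0 < p < 1) from a grouped frequency table.

    bounds[i] is the (lower, upper) boundary pair of class i. The target
    position is p * n, matching the n/2, n/4 and 3n/4 terms in the formulas.
    """
    n = sum(freqs)
    target = p * n
    cumulative = list(accumulate(freqs))
    for (lo, hi), f, cum in zip(bounds, freqs, cumulative):
        if cum >= target:
            below = cum - f  # cumulative frequency before this class
            return lo + (hi - lo) * (target - below) / f
    raise ValueError("p must lie strictly between 0 and 1")

# Salary classes in K'000; 0 as the lowest boundary is an assumption.
bounds = [(0, 20), (20, 50), (50, 70), (70, 90), (90, 120),
          (120, 200), (200, 300), (300, 500), (500, float("inf"))]
freqs = [620, 1200, 1560, 1420, 1190, 700, 350, 150, 10]

q1 = grouped_quantile(bounds, freqs, 0.25)
median = grouped_quantile(bounds, freqs, 0.50)
q3 = grouped_quantile(bounds, freqs, 0.75)
quartile_deviation = (q3 - q1) / 2
print(round(q1, 2), round(median, 2), round(q3, 2), round(quartile_deviation, 2))
```

Letting the loop locate the class from the cumulative frequencies avoids pairing a class boundary with a neighbouring class's frequency by hand.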
b) The median and quartiles would be preferred to the mean and standard
deviation because the mean and standard deviation would be affected (i.e.
overblown/pushed up) by the “extreme” salaries in the open-ended top salary
class. Again we would have to make subjective assumptions about the upper
boundary of the topmost salary class in order to calculate the mean and
standard deviation.
3 marks
Total = 20 marks
Question Seven
a) Any two of:
Range is the difference between the largest and smallest values in the data set. (1 mark)
Advantage (1 mark): It is very easy to calculate.
Disadvantage (1 mark): It is very sensitive to extreme values/outliers, since it depends only on the largest and smallest values.

Inter-Quartile Range (IQR) is the difference in value between the upper quartile (Q3) and the lower quartile (Q1), i.e. the range of the middle 50% of the observations/data values. (Candidates can also define the semi-interquartile range, SIQR.) (1 mark)
Advantage (1 mark): Once the data are ordered, the quartiles are easy to locate, and the IQR is not sensitive to extreme data values/outliers.
Disadvantage (1 mark): It is difficult to develop theory for these measures (IQR and SIQR), and it is not easy to estimate the IQR for grouped data.

Variance is the average of the squared deviations from the mean; Standard Deviation, its square root, measures the typical deviation from the mean. (1 mark)
Advantage (1 mark): It uses all the data values in its calculation. The measure has good mathematical theory, and so it is widely used.
Disadvantage (1 mark): It can be considerably affected by extreme data values or outliers, so it is a good measure only when the data are fairly symmetrical (e.g. normally distributed).
6 marks
b)
Standard deviation:
\[ s = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} \]   (1 mark)
\[ = \sqrt{\frac{30710404}{100} - \left(\frac{47118}{100}\right)^2} \]   (1 mark)
\[ = \sqrt{85{,}093.4476} \]   (1 mark)
= 291.71 minutes   (1 mark)
4 marks
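The arithmetic can be reproduced directly from the two sums. This is a minimal Python sketch of the computational formula used above (dividing by n, not n − 1); the function name `std_from_sums` is illustrative.

```python
import math

def std_from_sums(sum_x, sum_x2, n):
    """Standard deviation via sqrt(sum_x2 / n - (sum_x / n) ** 2)."""
    variance = sum_x2 / n - (sum_x / n) ** 2
    return math.sqrt(variance)

# Sums taken from the marking scheme: n = 100 recruits
s_raw = std_from_sums(sum_x=47118, sum_x2=30710404, n=100)
print(round(s_raw, 2))  # minutes
```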
i.
Frequency tables
Time taken (min) | Tally | Number of recruits
0 – 199 | //// //// //// //// /// | 23
200 – 399 | //// //// //// //// /// | 23
400 – 599 | //// //// //// //// | 19
600 – 799 | //// //// //// / | 16
800 – 999 | //// //// //// //// | 19
(Marks by column: 1, 2, 1)
4 marks
ii. Standard deviation,
\[ s = \sqrt{\frac{\sum fx^2}{n} - \left(\frac{\sum fx}{n}\right)^2}, \qquad n = \sum f \]
Table of sums
x | f | fx | fx²
99.5 | 23 | 2288.5 | 227705.8
299.5 | 23 | 6888.5 | 2063106.0
499.5 | 19 | 9490.5 | 4740505.0
699.5 | 16 | 11192.0 | 7828804.0
899.5 | 19 | 17090.5 | 15372905.0
Σ | 100 | 46,950 | 30,233,025
(1 mark for each of the three totals: 3 marks)
\[ s = \sqrt{\frac{30{,}233{,}025}{100} - \left(\frac{46{,}950}{100}\right)^2} \]
\[ = \sqrt{81{,}900} \]   (1 mark)
= 286.18 minutes   (1 mark)
5 marks
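The table of sums and the resulting estimate can be rebuilt from the class mid-points and frequencies alone. A minimal Python sketch follows; the variable names are illustrative, and the mid-points are those listed in the table of sums.

```python
import math

midpoints = [99.5, 299.5, 499.5, 699.5, 899.5]  # class mid-points x
freqs = [23, 23, 19, 16, 19]                    # number of recruits f

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

variance = sum_fx2 / n - (sum_fx / n) ** 2
s_grouped = math.sqrt(variance)
print(n, sum_fx, sum_fx2, variance, round(s_grouped, 2))
```

Because every observation is replaced by its class mid-point, the result is an estimate of the standard deviation of the ungrouped data rather than its exact value.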
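iii. The two standard deviations differ because the grouped calculation treats every observation in a class as if it lay at the class mid-point, whereas the first calculation uses the actual data values. In fact, the grouped-data result (286.18 minutes) is just an estimate of the raw-data result (291.71 minutes).
1 mark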
Total = 20 marks
Question Eight
a) (i)
(ii)
Has a true zero
can give meaningful ratios
(iii) Can be ordered or ranked
b) Qualitative measurements/variables can be categorized as nominal or ordinal.
Nominal measurements/variables cannot be put into a logical order and have no true zero. Examples include nationality, gender, make of car, marital status… (2 marks)
Ordinal measurements/variables can be arranged in a logical order, but they also have no true zero. Examples include a scoring scale for an opinion (strongly agree … disagree), shirt size (small, medium, large, x-large …), product rating (excellent, very good, good, poor)… (2 marks)
4 marks
c) Quantitative measurements/variables can be classified as discrete or continuous.
Discrete variables assume a finite (or countable) number of distinct values and are usually obtained by counting. Examples include number of customers in a queue, number of vehicles, number of employees, number of children in a family… (2 marks)
Continuous variables can assume an infinite number of values within an interval and are usually obtained by measuring. Examples include length of a computer, a student's height, time taken to serve a customer… (2 marks)
4 marks
d)
i.
Frequency distribution
Salary (K,000) | Number of Acc. Assistants, f
10 – 14 | 3
15 – 19 | 3
20 – 24 | 6
25 – 29 | 6
30 – 34 | 8
35 – 39 | 6
40 – 44 | 10
45 – 49 | 6
50 – 54 | 3
55 – 59 | 2
(Marks by column: 1, 2)
3 marks
ii. Pearson's first coefficient of skewness:
\[ SK_1 = \frac{\bar{x} - Mo}{s} \]
Sums: \( n = 53 \), \( \sum x^2 = 71{,}049 \), \( \sum x = 1839 \)   (1 mark)
\[ \bar{x} = \frac{\sum x}{n} = \frac{1839}{53} = 34.70 \]   (1 mark)
\[ s = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} = \sqrt{\frac{71049}{53} - \left(\frac{1839}{53}\right)^2} = \sqrt{136.5881} = 11.69 \]   (2 marks)
Mode is 22   (1 mark)
\[ \therefore SK_1 = \frac{34.7 - 22}{11.69} = 1.09 \]   (1 mark)
The distribution of raw salaries is strongly positively skewed.   (1 mark)
7 marks
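The coefficient can be recomputed from the quoted sums and mode. This is a minimal Python sketch; `pearson_first_skewness` is an illustrative name, and the standard deviation uses the same divide-by-n formula as the working above.

```python
import math

def pearson_first_skewness(mean, mode, std):
    """Pearson's first coefficient of skewness: (mean - mode) / std."""
    return (mean - mode) / std

# Values quoted in the marking scheme
n, sum_x, sum_x2, mode = 53, 1839, 71049, 22

mean = sum_x / n
std = math.sqrt(sum_x2 / n - mean ** 2)
sk1 = pearson_first_skewness(mean, mode, std)
print(round(mean, 2), round(std, 2), round(sk1, 2))
```

A positive value means the mean exceeds the mode, which is the pattern expected when a few high salaries stretch the right-hand tail.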
Total = 20 marks
Question Nine
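a) Construct frequency distribution
Class Boundaries | f | x | r.f. | r.F
9.5 – 14.5 | 12 | 12 | 0.20 | 0.20
14.5 – 19.5 | 24 | 17 | 0.40 | 0.60
19.5 – 24.5 | 15 | 22 | 0.25 | 0.85
24.5 – 29.5 | 6 | 27 | 0.10 | 0.95
29.5 – 34.5 | 3 | 32 | 0.05 | 1.00
(x = class mid-point, r.f. = relative frequency, r.F = cumulative relative frequency)
Marks awarded: 2M, 2M, 1M, 2M
7 marks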
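The mid-point, relative-frequency and cumulative relative-frequency columns all follow mechanically from the class boundaries and frequencies. A minimal Python sketch, with illustrative variable names:

```python
from itertools import accumulate

boundaries = [(9.5, 14.5), (14.5, 19.5), (19.5, 24.5), (24.5, 29.5), (29.5, 34.5)]
freqs = [12, 24, 15, 6, 3]

n = sum(freqs)
midpoints = [(lo + hi) / 2 for lo, hi in boundaries]
rel_freq = [f / n for f in freqs]
# Accumulate the integer counts first, then divide, to avoid float drift
cum_rel_freq = [c / n for c in accumulate(freqs)]

for (lo, hi), f, x, rf, crf in zip(boundaries, freqs, midpoints, rel_freq, cum_rel_freq):
    print(f"{lo} - {hi}  f={f}  x={x}  r.f.={rf:.2f}  r.F={crf:.2f}")
```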
b)
\[ \text{Mode, } Mo = l + C\,\frac{f - f_0}{2f - f_0 - f_1} \]
Modal class: 14.5 – 19.5
\[ \therefore Mo = 14.5 + 5 \times \frac{24 - 12}{2 \times 24 - 12 - 15} \]   (1 mark)
= 14.5 + 2.8571   (1 mark)
= 17.36 days   (1 mark)
3 marks
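Inter-quartile range, \( IQR = Q_3 - Q_1 \)
Class Boundaries | f | F
9.5 – 14.5 | 12 | 12
14.5 – 19.5 | 24 | 36
19.5 – 24.5 | 15 | 51
24.5 – 29.5 | 6 | 57
29.5 – 34.5 | 3 | 60
(Cumulative frequencies F: 1 mark)
\[ Q_1 \text{ position} = \frac{n + 1}{4} = \frac{60 + 1}{4} = 15.25\text{th item} \]
\[ Q_1 = 14.5 + 5 \times \frac{\frac{60}{4} - 12}{24} = 15.125 \]
\[ Q_3 \text{ position} = 3 \times \frac{n + 1}{4} = 3 \times \frac{61}{4} = 45.75\text{th item} \]
\[ Q_3 = 19.5 + 5 \times \frac{\frac{3 \times 60}{4} - 36}{15} = 22.5 \]
\[ \therefore IQR = 22.5 - 15.125 = 7.375 \]
IQR ≈ 7.38 days
6 marks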
Variance
Sums: \( \sum f = 60 \), \( \sum fx = 1140 \), \( \sum fx^2 = 23{,}370 \)
\[ s^2 = \frac{\sum fx^2}{n} - \left(\frac{\sum fx}{n}\right)^2 = \frac{23370}{60} - \left(\frac{1140}{60}\right)^2 \]
= 28.5 days²
4 marks
Total = 20 marks