MAS1403
Quantitative Methods for
Business Management
Semester 1
Dr. Daniel Henderson
School of Mathematics & Statistics
MAS1403: Quantitative Methods for Business Management
2016/17
Lecturer: Dr. Daniel Henderson, Room 2.21 Herschel Building.
Email: [email protected]
www.mas.ncl.ac.uk/∼ndah6/teaching/MAS1403/
Lectures:
Mondays at 12pm
In the Curtis Auditorium, Herschel Building
Tutorials:
One per week
There are 6 groups – check the module webpage to see which tutorial to attend.
Practicals:
Occasionally
Check the full schedule overleaf for dates. These will take place instead of the tutorials.
Drop-in:
Mon 1-2pm, Wed 1-2pm
Optional “office hours” where I will be available in my office for any help with the work.
Lecture notes and handouts
You will be provided with a booklet containing lecture notes and tutorial exercises.
You should bring your booklet to every class!
There will often be gaps in the lecture notes for you to complete during the lecture, so make sure you’ve got them with you!
All lecture notes, slides and solutions to tutorial exercises will be available to download from the course website (see above). There
should be a link to this website from within Blackboard. Some additional handouts may only be available in lectures and tutorials.
You will notice that my lecture slides are colour-coded: Green for announcements, Blue for “listen and learn” and Red for “write”!
Assessment
Assessment for this course is via examination (60% at end of Semester 2), assignments (10% each semester) and computer-based
assessments (10% each semester). Ordinarily, if you fail this module you cannot proceed to Stage 2 of your degree!
Exam:
May/June 2017
A two-hour, open-book, computer-based exam based on the whole course: answer all questions.
Assignments:
Dec 2016, May 2017
About three big questions in each, some of which will use your own personal datasets and
some of which will require you to use the computer package Minitab.
CBAs:
Throughout the year
Three CBAs in each Semester. Available in “practice mode” for one week and then “exam
mode” the next week. Some multiple choice questions, but mainly data response/calculations.
Every student will get a different set of questions from a bank of hundreds!
Must be done in your own time.
Late Work Policy:
It is not possible to extend submission deadlines for coursework in this module and no late work can be accepted. For details of the
policy (including procedures in the event of illness etc.) please look at the School web site:
http://www.ncl.ac.uk/maths/students/resources/late-missed/
Other Stuff
Email:
Check your University email every day – announcements about the course will be made regularly!
Calculator:
There is no way around it: you must have a scientific calculator for this course, and it must be on the University’s
approved list! I recommend the Casio fx-85GT PLUS (about £10). You can get advice on how to use the Statistics
mode of your calculator in tutorials, and some video presentations on use of the calculator will be available from the
module webpage. You should bring your calculator to every class. You will be stuck without one!
MAS1403 - Provisional Schedule for Semester 1
Week 1 (week commencing 3/10/16)
Topic 1: Data collection, display and summaries
Day  Date           Time     Session    Venue
Mon  3rd October    12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  6th October    10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  6th October    11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  6th October    12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  7th October    9 - 10   Tutorial   Percy Building, G.13
Fri  7th October    10 - 11  Tutorial   Percy Building, G.13
Fri  7th October    11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
Week 2 (week commencing 10/10/16)
Day  Date           Time     Session    Venue
Mon  10th October   12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  13th October   10 - 11  Practical  Armstrong Building, 2.96 (PC)
Thu  13th October   11 - 12  Practical  King George VI Building, Lawn cluster
Thu  13th October   12 - 1   Practical  King George VI Building, Lawn cluster
Fri  14th October   9 - 10   Practical  Herschel Building, Blue Zone - Herschel cluster
Fri  14th October   10 - 11  Practical  Armstrong Building, 2.96 (PC)
Fri  14th October   11 - 12  Practical  King George VI Building, Lawn cluster
Week 3 (week commencing 17/10/16)
CBA1 opens in “practice mode”
Day  Date           Time     Session    Venue
Mon  17th October   12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  20th October   10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  20th October   11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  20th October   12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  21st October   9 - 10   Tutorial   Percy Building, G.13
Fri  21st October   10 - 11  Tutorial   Percy Building, G.13
Fri  21st October   11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
Week 4 (week commencing 24/10/16)
Topic 2: Probability and decision making
CBA1 opens in “assessed mode” – deadline: midnight Friday 28th October
Day  Date           Time     Session    Venue
Mon  24th October   12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  27th October   10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  27th October   11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  27th October   12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  28th October   9 - 10   Tutorial   Percy Building, G.13
Fri  28th October   10 - 11  Tutorial   Percy Building, G.13
Fri  28th October   11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
Week 5 (week commencing 31/10/16)
Day  Date           Time     Session    Venue
Mon  31st October   12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  3rd November   10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  3rd November   11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  3rd November   12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  4th November   9 - 10   Tutorial   Percy Building, G.13
Fri  4th November   10 - 11  Tutorial   Percy Building, G.13
Fri  4th November   11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
Week 6 (week commencing 7/11/16)
CBA2 opens in “practice mode”
Day  Date           Time     Session    Venue
Mon  7th November   12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  10th November  10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  10th November  11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  10th November  12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  11th November  9 - 10   Tutorial   Percy Building, G.13
Fri  11th November  10 - 11  Tutorial   Percy Building, G.13
Fri  11th November  11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
Week 7 (week commencing 14/11/16)
Topic 3: Probability models
CBA2 opens in “assessed mode” – deadline: midnight Friday 18th November
Assignment 1 available
Day  Date           Time     Session    Venue
Mon  14th November  12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  17th November  10 - 11  Practical  Armstrong Building, 2.96 (PC)
Thu  17th November  11 - 12  Practical  King George VI Building, Lawn cluster
Thu  17th November  12 - 1   Practical  King George VI Building, Lawn cluster
Fri  18th November  9 - 10   Practical  Herschel Building, Blue Zone - Herschel cluster
Fri  18th November  10 - 11  Practical  Armstrong Building, 2.96 (PC)
Fri  18th November  11 - 12  Practical  King George VI Building, Lawn cluster
Week 8 (week commencing 21/11/16)
Day  Date           Time     Session    Venue
Mon  21st November  12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  24th November  10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  24th November  11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  24th November  12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  25th November  9 - 10   Tutorial   Percy Building, G.13
Fri  25th November  10 - 11  Tutorial   Percy Building, G.13
Fri  25th November  11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
Week 9 (week commencing 28/11/16)
Day  Date           Time     Session    Venue
Mon  28th November  12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  1st December   10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  1st December   11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  1st December   12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  2nd December   9 - 10   Tutorial   Percy Building, G.13
Fri  2nd December   10 - 11  Tutorial   Percy Building, G.13
Fri  2nd December   11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
Week 10 (week commencing 5/12/16)
CBA3 opens in “practice mode” and “assessed mode”
Day  Date           Time     Session    Venue
Mon  5th December   12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  8th December   10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  8th December   11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  8th December   12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  9th December   9 - 10   Tutorial   Percy Building, G.13
Fri  9th December   10 - 11  Tutorial   Percy Building, G.13
Fri  9th December   11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
Week 11 (week commencing 12/12/16)
Assignment 1 deadline: 4pm, Thursday 15th December
CBA3 deadline: midnight, Friday 16th December
Day  Date           Time     Session    Venue
Mon  12th December  12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  15th December  10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  15th December  11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  15th December  12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  16th December  9 - 10   Tutorial   Percy Building, G.13
Fri  16th December  10 - 11  Tutorial   Percy Building, G.13
Fri  16th December  11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
Christmas vacation!
Week 12 (week commencing 9/1/17) – Revision week
Day  Date           Time     Session    Venue
Mon  9th January    12 - 1   Lecture    Herschel Building, Curtis Auditorium
Thu  12th January   10 - 11  Tutorial   Herschel Building, Lecture Theatre 3
Thu  12th January   11 - 12  Tutorial   King George VI Building, Lecture Theatre 6
Thu  12th January   12 - 1   Tutorial   King George VI Building, Lecture Theatre 1
Fri  13th January   9 - 10   Tutorial   Percy Building, G.13
Fri  13th January   10 - 11  Tutorial   Percy Building, G.13
Fri  13th January   11 - 12  Tutorial   Herschel Building, Lecture Theatre 3
MAS1403 Quantitative Methods for Business Management
1 Collecting and presenting data
1.1 Definitions
The quantities measured in a study are called random variables and a particular outcome is
called an observation. A collection of observations is the data. The collection of all possible
outcomes is the population.
We can rarely observe the whole population. Instead, we observe some subset of it, called
the sample. The difficulty is in obtaining a representative sample.
Data/random variables are of different types:
• Qualitative (i.e. non-numerical)
– Categorical
∗ Outcomes take values from a set of categories, e.g. mode of transport to Uni:
{car, metro, bus, walk, other}.
• Quantitative (i.e. numerical)
– Discrete
∗ Things that are countable, e.g. number of people taking this module.
∗ Ordinal, e.g. response to questionnaire; 1 (strongly disagree) to 5 (strongly
agree)
– Continuous
∗ Things that we measure rather than count, e.g. height, weight, time.
Example 1
Identify the type of data described in each of the following examples:
(a) The time between emails arriving in your inbox is recorded.
(b) An opinion poll was taken asking people what is their favourite chocolate bar.
(c) The number of students attending a MAS1403 tutorial is recorded.
1.2 Sampling techniques
We typically aim for the sample to be representative of the population. The larger the sample
size the more precise information we have about the population.
There are three main types of sampling: random, quasi-random, non-random.
• Simple random sampling (random)
– Each element in the population is equally likely to be drawn into the sample.
– All elements are “put in a hat” and the sample is drawn from the “hat” at random.
– Advantages – easy to implement; each element has an equal chance of being selected.
– Disadvantages – often don’t have a complete list of the population; not all elements
might be equally accessible; it is possible, purely by chance, to pick an unrepresentative sample.
• Stratified sampling (random)
– We take a simple random sample from each “stratum”, or group, within the population.
The sample sizes are usually proportional to the population sizes.
– Advantages – sampling within each stratum ensures that that stratum is properly
represented in the sample; simple random sampling within each stratum has the
advantages listed under simple random sampling above.
– Disadvantages – need information on the size and composition of each group; as
with simple random sampling, we need a list of all elements within each stratum.
• Systematic sampling (quasi-random)
– The first element from the population is selected at random, and then every kth item
is chosen after this. This type of sampling is often used in a production line setting.
– Advantages – its simplicity! – and so it’s easy to implement.
– Disadvantages – not completely random; if there is a pattern in the production process it is easy to obtain a biased sample; only really suited to structured populations.
• Judgemental sampling (non-random)
– The person interested in obtaining the data decides who should be surveyed; for
example, the head of a service department might suggest particular clients to survey
based on his judgement, and they might be people who he thinks will give him the
responses he wants!
– Advantages – very focussed and aimed at the target population.
– Disadvantages – relies on the judgement of the person conducting the questionnaire/survey, and so cannot be guaranteed to be representative; is prone to bias.
• Accessibility sampling (non-random)
– Here, the most easily accessible elements are sampled.
– Advantages – easy to implement.
– Disadvantages – prone to bias.
• Quota sampling (non-random)
– Similar to stratified sampling, but uses judgemental sampling within each stratum instead of random sampling. We sample within each stratum until our quotas have been
reached.
– Advantages – results can be very accurate as this technique is very targeted.
– Disadvantages – the identification of appropriate quotas can be problematic; this
sampling technique relies heavily on the judgement of the interviewer.
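The random and quasi-random schemes above can be sketched in a few lines of Python. Python is not part of this module, so treat this purely as an illustration: the customer list, the region names "North" and "South", the sample sizes and the seed are all made up for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n distinct elements, each equally likely ("names in a hat")."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start among the first k items, then take every k-th item."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, total_n, rng):
    """Simple random sample within each stratum, sized in proportion to the stratum."""
    population_size = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        n = round(total_n * len(items) / population_size)
        sample[name] = rng.sample(items, n)
    return sample

rng = random.Random(1403)  # fixed seed so the sketch is repeatable

# Hypothetical population: 120 customers in one region and 60 in another.
customers = {
    "North": [f"N{i}" for i in range(1, 121)],
    "South": [f"S{i}" for i in range(1, 61)],
}
everyone = customers["North"] + customers["South"]

srs = simple_random_sample(everyone, 30, rng)
systematic = systematic_sample(everyone, 10, rng)
stratified = stratified_sample(customers, 30, rng)

print(len(srs), len(systematic), {k: len(v) for k, v in stratified.items()})
```

Note that the stratified sample always contains customers from both regions in a 2:1 ratio, whereas the simple random sample could, purely by chance, over-represent one region.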
Example 2
(a) A toy company, Toys 4 U, is to be inspected for the quality and safety of the toys it produces.
The inspection team takes a sample of toys from the production line by choosing the first
toy at random, and then selecting every 100th toy thereafter. What form of sampling are the
team using?
(b) Another inspection team is to investigate the quality of the smartphone covers made by a
local factory. In a typical working day the factory produces 100 covers for the new i-Phone
and 200 covers for the latest Samsung phone. Suggest a suitable form of sampling to check
the quality of the smartphone covers produced.
Solution
1.3 Frequency tables
Once we have collected our data, often the first stage of any analysis is to present them in a
simple and easily understood way. Tables are perhaps the simplest means of presenting data.
The way we construct the table depends on the type of data.
Example: discrete data
The following table shows the raw data for car sales at a new car showroom over a two week
period in July.
Date        Cars Sold    Date         Cars Sold
1st July    9            8th July     10
2nd July    8            9th July     5
3rd July    6            10th July    8
4th July    7            11th July    4
5th July    7            12th July    6
6th July    10           13th July    8
7th July    11           14th July    9
Presenting these data in a relative frequency table by number of days on which different numbers
of cars were sold, we get the following table:
Cars Sold    Tally    Frequency    Relative Frequency %



Totals
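If you want to check a hand-completed table, the tallying can be done by computer. The following is a minimal Python sketch (Python is an assumption here; the module itself uses Minitab) that counts the car sales data above and prints frequencies and percentage relative frequencies.

```python
from collections import Counter

# Cars sold on each day, 1st to 14th July, copied from the raw data table.
cars_sold = [9, 8, 6, 7, 7, 10, 11, 10, 5, 8, 4, 6, 8, 9]

counts = Counter(cars_sold)   # value -> number of days with that many sales
n_days = len(cars_sold)

print(f"{'Cars Sold':>9}  {'Frequency':>9}  {'Relative Frequency %':>20}")
for value in sorted(counts):
    freq = counts[value]
    print(f"{value:>9}  {freq:>9}  {100 * freq / n_days:>20.2f}")
print(f"{'Totals':>9}  {n_days:>9}  {100.0:>20.2f}")
```

The relative frequency for each row is simply the frequency divided by the total number of days, multiplied by 100.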
Example: continuous data
The following data set represents the service time in seconds for callers to a credit card call
centre.
196.3 199.7 206.7 203.8 203.1
200.8 201.3 205.6 181.6 201.7
180.2 193.3 188.2 199.9 204.7
We can present these data in a relative frequency table as follows:
Class Interval      Tally     Frequency   Relative Frequency %
180 ≤ time < 185    ||        2           13.33
185 ≤ time < 190    |         1           6.67
190 ≤ time < 195    |         1           6.67
195 ≤ time < 200    |||       3           20.00
200 ≤ time < 205    |||| |    6           40.00
205 ≤ time < 210    ||        2           13.33
Totals                        15          100
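For continuous data the only extra step is deciding which class interval each observation falls into. A short Python sketch of that step, assuming equal-width classes starting at 180 with width 5 as in the table above (the variable names are just for illustration):

```python
# Service times in seconds, copied from the call centre example.
times = [196.3, 199.7, 206.7, 203.8, 203.1,
         200.8, 201.3, 205.6, 181.6, 201.7,
         180.2, 193.3, 188.2, 199.9, 204.7]

lower, width, n_classes = 180, 5, 6

freqs = [0] * n_classes
for t in times:
    index = int((t - lower) // width)   # class with lower end <= t < upper end
    freqs[index] += 1

rel_freqs = [round(100 * f / len(times), 2) for f in freqs]

for i, (f, r) in enumerate(zip(freqs, rel_freqs)):
    start = lower + i * width
    print(f"{start} <= time < {start + width}: {f:>2}  {r:>6.2f}%")
```

Because each interval includes its lower end-point but not its upper one, an observation of exactly 185 would be counted in the second class, not the first.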
1.4 Exercises
1. Identify the type of data described in each of the following examples:
(a) An opinion poll was taken asking people which party they would vote for in a general
election.
(b) In a steel production process the temperature of the molten steel is measured and recorded
every 60 seconds.
(c) A market researcher stops you in Northumberland Street and asks you to rate between 1
(disagree strongly) and 5 (agree strongly) your response to opinions presented to you.
(d) The hourly number of units produced by a beer bottling plant is recorded.
2. A credit card company wants to investigate the spending habits of its customers. From its
lists, the first customer is selected at random; thereafter, every 30th customer is selected.
(a) Is this an example of simple random sampling, stratified sampling, systematic sampling,
or judgemental sampling?
(b) Is this form of sampling random, quasi-random or non-random?
3. The number of telephone calls made by 20 students in a day is shown below.
3 5 1 0 0 2 1 0 3 1
4 3 2 0 1 1 1 2 0 4
Put these data into a relative frequency table.
4. The following data are the recorded length (in seconds) of 25 mobile phone calls made by
one student.
281.4 312.7 270.7 304.1 305.4
293.4 327.7 293.9 320.7 317.9
306.5 311.5 310.9 283.6 289.5
286.6 314.8 346.4 337.5 286.9
298.4 303.3 304.6 259.6 300.5
Complete the following percentage relative frequency table for these data.
Class Interval      Tally     Frequency   Relative Frequency %
250 ≤ time < 270
270 ≤ time < 290
290 ≤ time < 310
310 ≤ time < 330
330 ≤ time < 350
Totals                        25          100
2 Graphical methods for presenting data
Once we have collected our data, often the best way to summarise this data is through an appropriate graph. Graphs are more eye–catching than tables, and give us an “at–a–glance” picture
of the main features of our data: its distribution, location, spread, outliers etc.
2.1 Stem–and–leaf plots
Example 1
The observations below are the recorded time it takes to get through to an operator at a telephone
call centre (in seconds).
54 45 30 56 50 67
55 51 47 53 29 39
65 54 44 38 49 45
39 42 44 61 51 54
72 65 58 50 50 62
Represent the data in a stem-and-leaf plot.
Stem
Leaf
n=
stem unit =
leaf unit =
Some notes on stem–and–leaf plots.
– Always show the stem units and the leaf units.
– The stem unit will usually be either 10 or 1; the corresponding unit for the leaves is
usually 1 and 0.1.
– Order the leaves from smallest to largest.
– If you have observations recorded to 2 d.p., always round down, e.g. 2.97 would become
2.9 rather than 3.0.
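A stem-and-leaf plot is easy to check by computer. The sketch below, in Python (an assumption; it is not software used in this module), splits each whole-number observation into a stem (tens) and a leaf (units) and orders the leaves, following the notes above. It uses the call centre waiting times from Example 1.

```python
from collections import defaultdict

def stem_and_leaf(values, stem_unit=10):
    """Return {stem: ordered leaves} for whole-number data with leaf unit 1."""
    plot = defaultdict(list)
    for v in sorted(values):            # sorting first keeps the leaves in order
        stem, leaf = divmod(v, stem_unit)
        plot[stem].append(leaf)
    return dict(plot)

waiting_times = [54, 45, 30, 56, 50, 67, 55, 51, 47, 53, 29, 39, 65, 54, 44,
                 38, 49, 45, 39, 42, 44, 61, 51, 54, 72, 65, 58, 50, 50, 62]

plot = stem_and_leaf(waiting_times)
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
print(f"n = {len(waiting_times)}, stem unit = 10, leaf unit = 1")
```

For data recorded to decimal places you would first truncate (round down) to the leaf unit, as described in the last note.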
2.2 Bar charts
Bar charts are a commonly used and clear way of presenting categorical data or any ungrouped discrete data.
Example 2
The following frequency table represents the modes of transport used daily by 30 students to
get to university.
Mode     Frequency
Car      10
Walk     7
Bike     4
Bus      4
Metro    4
Train    1
Total    30
This gives the following bar chart:
[Bar chart: Frequency on the vertical axis; one bar for each mode of transport (Car, Walk, Bike, Bus, Metro, Train).]
This bar chart clearly shows that the most popular mode of transport is the car and the least
popular is the train (in our small sample).
2.3 Histograms
Histograms can be thought of as “bar charts for continuous data”. First construct a grouped
frequency table then draw a bar for each class interval. Important point: unlike bar charts, there
are no gaps between the bars in a histogram.
Example 3
The following frequency table summarises the service times (in seconds) at a telephone call
centre.
Service time        Frequency   Relative Frequency (%)
175 ≤ time < 180    1           2
180 ≤ time < 185    3           6
185 ≤ time < 190    3           6
190 ≤ time < 195    6           12
195 ≤ time < 200    10          20
200 ≤ time < 205    12          24
205 ≤ time < 210    8           16
210 ≤ time < 215    3           6
215 ≤ time < 220    3           6
220 ≤ time < 225    1           2
Totals              50          100
The histogram for these data is:
[Two histograms of service time, with Time (s) from 175 to 225 on the horizontal axis: one with Frequency on the vertical axis, the other with Relative frequency (%).]
We can also plot relative frequency (%) on the vertical axis: this gives a percentage relative
frequency histogram. These are useful for comparing datasets of different sizes.
2.4 Relative frequency polygons
The relative frequency polygon is exactly the same as the relative frequency histogram, but
instead of having bars we join the mid–points of the top of each bar with a straight line. These
are useful for illustrating the relative differences between two or more groups.
Example 4
Consider the following data on gross weekly income (in £) collected from two sites in Newcastle.
Weekly Income (£)       West Road (%)   Jesmond Road (%)
0 ≤ income < 100        9.3             0.0
100 ≤ income < 200      26.2            0.0
200 ≤ income < 300      21.3            4.5
300 ≤ income < 400      17.3            16.0
400 ≤ income < 500      11.3            29.7
500 ≤ income < 600      6.0             22.9
600 ≤ income < 700      4.0             17.7
700 ≤ income < 800      3.3             4.6
800 ≤ income < 900      1.3             2.3
900 ≤ income < 1000     0.0             2.3
The following plot shows percentage relative frequency polygons for the two groups.
Example comments: The distribution of incomes on West Road is skewed towards lower values, whilst those on Jesmond Road are more symmetric. The graph clearly shows that income
in the Jesmond Road area is higher than that in the West Road area. The spread of incomes is
roughly the same in the two areas. There are no obvious outliers.
2.5 Cumulative frequency polygons
These are very useful for comparing datasets.
– Construct a percentage relative frequency table for your data.
– Add a “cumulative” column by adding up the percentages as you go along.
– Plot the upper end–point of each class interval against the cumulative value.
Example 5
The following plot contains the cumulative frequency polygons for the income data at both the
West Road and Jesmond Road sites.
It clearly shows the line for Jesmond Road is shifted to the right of that for West Road. This tells
us that the surveyed incomes are higher on Jesmond Road. We can compare the percentages of
people earning different income levels between the two sites quickly and easily.
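The “cumulative” column is just a running total of the percentages. As an illustration, here is a small Python sketch (Python is an assumption, not a module requirement) that builds the cumulative percentages for the two income distributions tabulated in Section 2.4; these are the values you would plot against the upper end-points.

```python
from itertools import accumulate

upper_end_points = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
west_road = [9.3, 26.2, 21.3, 17.3, 11.3, 6.0, 4.0, 3.3, 1.3, 0.0]
jesmond_road = [0.0, 0.0, 4.5, 16.0, 29.7, 22.9, 17.7, 4.6, 2.3, 2.3]

# Running totals, rounded to 1 d.p. to hide floating-point noise.
west_cumulative = [round(c, 1) for c in accumulate(west_road)]
jesmond_cumulative = [round(c, 1) for c in accumulate(jesmond_road)]

for upper, w, j in zip(upper_end_points, west_cumulative, jesmond_cumulative):
    print(f"income < {upper:>4}: West Road {w:>5.1f}%  Jesmond Road {j:>5.1f}%")
```

Reading across one row gives exactly the kind of comparison described above: for instance, the row for £500 shows the percentage of people at each site earning less than £500 per week.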
2.6 Scatter plots
Scatter plots are used to plot two variables which you believe might be related, for example,
advertising expenditure and sales.
Example 6
The following data represents monthly output and total costs at a factory.
Total costs (£)    Monthly output (units)
10,300             2,400
12,000             3,900
12,000             3,100
13,500             4,500
12,200             4,100
14,200             5,400
10,800             1,100
18,200             7,800
16,200             7,200
19,500             9,500
17,100             6,400
19,200             8,300
For scatter plots, we comment on whether there is a linear association between the two variables. If so, is this positive (“uphill”) or negative (“downhill”)? Is the association strong, or
maybe moderate or weak?
The plot above shows a clear positive, roughly linear, relationship between the two variables:
the more units made, the more it costs in total.
2.7 Time Series Plots
Data collected over time can be plotted by using a scatter plot, but with time as the (horizontal)
x-axis, and where the points are connected by lines: a time series plot.
Example 7
Consider the following data on the number of computers sold (in thousands) by quarter (January-March, April-June, July-September, October-December) at a large warehouse outlet, starting in
quarter 1 of 2000.
        Q1       Q2       Q3       Q4
2000    86.7     94.9     94.2     106.5
2001    105.9    102.4    103.1    115.2
2002    113.7    108.0    113.5    132.9
2003    126.3    119.4    128.9    142.3
2004    136.4    124.6    127.9
The time series plot is:
For time series plots, look out for trend and seasonal cycles in the data. Also look out for any
outliers.
The above plot clearly shows us two things: firstly, that there is an upwards trend to the data
(sales increase over time), and secondly that there is some regular variation around this trend
(sales are usually higher in quarters 1 and 4 than in quarters 2 and 3).
2.8 Exercises
1. The following table shows the weight (in kilograms) of 50 sacks of potatoes leaving a farm
shop (the data have been ordered from smallest to largest).
8.1   8.2   8.5   8.7   8.8
8.9   9.2   9.3   9.3   9.4
9.5   9.5   9.6   9.6   9.6
9.7   9.7   9.9   9.9   10.0
10.0  10.0  10.0  10.0  10.1
10.2  10.2  10.2  10.3  10.3
10.4  10.4  10.4  10.5  10.6
10.6  10.6  10.6  10.6  10.7
10.8  10.9  11.0  11.2  11.3
11.3  11.3  11.5  11.6  12.8
Display these data in a stem and leaf plot. State clearly both the stem and the leaf units.
Comment on the distribution of the data.
2. Which is more suitable for representing the data from Question 1 (above), a bar chart or a
histogram? Justify your answer.
3. A small clothes shop has records of daily sales both before and after a local radio advertising campaign. Relative frequency polygons of the sales data are shown below.
[Figure: Relative frequency polygons of sales (before and after). Vertical axis: Rel. freq. (%); horizontal axis: Daily sales (£); one line labelled Before and one labelled After.]
Comment, with justification, on the success, or otherwise, of the advertising campaign.
3 Numerical summaries for data
Numerical summaries are numbers which summarise the main features of your data. You should
use both a measure of location and a measure of spread to summarise your dataset.
3.1 Measures of location
A measure of location is a value which is “typical” of the observations in our sample.
1. The mean
The sample mean is the “average” of our data: the total divided by the sample size. It’s given
by the formula
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i ,$$
which, put more simply, means “add them up and divide by how many you’ve got”.
Example 1
Suppose we ask 7 Stage 2 Business Management students how many units of alcohol they drank
last week and get: 16, 52, 0, 6, 10, 0, 21. The sample mean alcohol consumption of these n = 7
students is x̄ = (16 + 52 + 0 + 6 + 10 + 0 + 21)/7 = 105/7 = 15 units per week.
If your data are given in the form of a frequency table, then you “multiply each observation by
its frequency, add these numbers together and then divide by how many you’ve got”. If you
have a grouped frequency table, then you don’t know the value of each observation and so just
use the midpoint of the class interval.
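The frequency-table recipe can be sketched in Python as follows (Python is an assumption, not part of the module). The first table uses frequencies counted from the raw car sales data in Section 1.3; the second uses the grouped call centre table from the same section, with each class represented by its midpoint.

```python
# Ungrouped table: value -> frequency (cars sold per day over 14 days).
frequency_table = {4: 1, 5: 1, 6: 2, 7: 2, 8: 3, 9: 2, 10: 2, 11: 1}

n = sum(frequency_table.values())
total = sum(value * freq for value, freq in frequency_table.items())
mean = total / n
print(f"n = {n}, total = {total}, mean = {mean:.2f} cars per day")

# Grouped table: (lower end, upper end) -> frequency (service times in seconds).
grouped_table = {(180, 185): 2, (185, 190): 1, (190, 195): 1,
                 (195, 200): 3, (200, 205): 6, (205, 210): 2}

grouped_n = sum(grouped_table.values())
grouped_total = sum((lo + hi) / 2 * freq for (lo, hi), freq in grouped_table.items())
grouped_mean = grouped_total / grouped_n
print(f"grouped mean = {grouped_mean:.2f} seconds (approximate)")
```

The grouped mean is only an approximation, because every observation in a class is treated as if it sat exactly at the midpoint.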
2. The median
This is just the observation “in the middle”, when the data are put into order from smallest to
largest:
$$\text{median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ smallest observation}.$$
Example 2
Ordering the student alcohol data from the previous example gives 0, 0, 6, 10, 16, 21, 52.
Clearly the middle value is 10, so the median is 10 units per week.
Example 3
Suppose we also asked four Stage 2 Marketing and Management students how many units of
alcohol they drank last week, and got: 21, 0, 12, 14. Calculate the median.
Solution
The median is often used if the dataset has an asymmetric profile, since it is not distorted by
extreme observations (“outliers”).
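The position rule for the median can be written as a short Python function (a sketch only; Python is not used in this module). When (n + 1)/2 is not a whole number, the function assumes the usual convention of taking the value half-way between the two middle observations.

```python
def median(values):
    """Median as the ((n + 1) / 2)-th smallest observation."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2            # 1-based position; ends in .5 when n is even
    lower = ordered[int(position) - 1]
    if position.is_integer():
        return lower
    upper = ordered[int(position)]    # next observation up
    return (lower + upper) / 2        # half-way between the two middle values

business_management = [16, 52, 0, 6, 10, 0, 21]   # Example 2 data
marketing_management = [21, 0, 12, 14]            # Example 3 data

print(median(business_management))
print(median(marketing_management))
```

Python's built-in `statistics.median` follows the same convention, so it can be used as a cross-check.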
3. The mode
The mode is simply the most frequently occurring observation. For example, consider the
following data: 2, 2, 2, 3, 3, 4, 5. The mode is 2 as it occurs most often. The modal class is
easily obtained from a grouped frequency table or a histogram; it’s the class with the highest
frequency.
3.2 Measures of spread
A measure of spread quantifies how “spread out” (or how “variable”) our data are.
1. The range
Range = largest value − smallest value. For example, the range of the data: 2, 2, 2, 3, 3, 4, 5 is
5 − 2 = 3.
• Advantage: very simple to calculate.
• Disadvantages: sensitive to extreme observations; only suitable for comparing (roughly)
equally sized samples.
2. The inter-quartile range (IQR)
The IQR measures the range of the middle half of the data, and so is less affected by extreme
observations. It is given by Q3 − Q1, where
$$Q_1 = \frac{(n+1)}{4}\text{th smallest observation} \quad \text{(“lower quartile”)}$$
$$Q_3 = \frac{3(n+1)}{4}\text{th smallest observation} \quad \text{(“upper quartile”)}.$$
Example 4
Calculate the inter-quartile range for the following data.
8.7, 9.0, 9.0, 9.2, 9.3, 9.3, 9.5, 9.6, 9.6, 9.6, 9.7, 9.7, 9.9, 10.3, 10.4, 10.5, 10.7, 10.8
Solution
n = 18, so the position of Q1 is (18 + 1)/4 = 4.75, therefore
Q1 = 9.2 + 0.75 × (9.3 − 9.2) = 9.2 + 0.075 = 9.275.
Similarly, the position of Q3 is 3 × (18 + 1)/4 = 14.25, therefore
Q3 = 10.3 + 0.25 × (10.4 − 10.3) = 10.3 + 0.025 = 10.325.
And so
IQR = Q3 − Q1 = 10.325 − 9.275 = 1.05.
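The interpolation used in this solution can be sketched in Python (an assumption; the module itself relies on calculators and Minitab). The helper name `nth_smallest` is made up for the example, and the sketch assumes the sample is large enough that both quartile positions lie between 1 and n.

```python
def nth_smallest(ordered, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    whole = int(position)
    fraction = position - whole
    value = ordered[whole - 1]
    if fraction > 0:
        # Move the given fraction of the way towards the next observation.
        value += fraction * (ordered[whole] - value)
    return value

def quartiles(values):
    """Lower and upper quartiles using positions (n + 1)/4 and 3(n + 1)/4."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = nth_smallest(ordered, (n + 1) / 4)
    q3 = nth_smallest(ordered, 3 * (n + 1) / 4)
    return q1, q3

data = [8.7, 9.0, 9.0, 9.2, 9.3, 9.3, 9.5, 9.6, 9.6,
        9.6, 9.7, 9.7, 9.9, 10.3, 10.4, 10.5, 10.7, 10.8]

q1, q3 = quartiles(data)
iqr = q3 - q1
print(f"Q1 = {q1:.3f}, Q3 = {q3:.3f}, IQR = {iqr:.2f}")
```

Be aware that statistical software may use slightly different quartile rules, so computer output will not always match this hand calculation exactly.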
3. The variance and standard deviation
The sample variance is the standard measure of spread used in statistics. It can be thought of as
“the average squared deviation from the mean”, and is given by
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 .$$
The following formula is easier for calculations
$$s^2 = \frac{1}{n-1} \left\{ \sum_{i=1}^{n} x_i^2 - (n \times \bar{x}^2) \right\} .$$
In practice most people simply use the Statistics mode on their calculator (mode SD or Stat).
The sample standard deviation is just the square root of the variance, and is often preferred as
it is in the “original units of the data”.
Example 5
Consider again the data on the number of units of alcohol consumed by a sample of 7 students
last week: 16, 52, 0, 6, 10, 0, 21. Calculate the sample variance and the sample standard
deviation.
Solution
We have already calculated the sample mean as x̄ = 15. Now
$$\sum x^2 = 16^2 + 52^2 + 0^2 + 6^2 + 10^2 + 0^2 + 21^2 = 3537$$
$$n(\bar{x})^2 = 7 \times 15^2 = 1575$$
and so the sample variance is
$$s^2 = \frac{1}{7-1}(3537 - 1575) = \frac{1962}{6} = 327$$
and the sample standard deviation is
$$s = \sqrt{s^2} = \sqrt{327} = 18.08 \text{ units per week}.$$
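As a check on this calculation, the two variance formulas can be evaluated side by side. The Python sketch below (Python is an assumption, not a module requirement) uses the alcohol data from Example 5 and also compares against the standard library's `statistics.variance`, which uses the same n − 1 divisor.

```python
import math
import statistics

units = [16, 52, 0, 6, 10, 0, 21]
n = len(units)
mean = sum(units) / n

# Definition: sum of squared deviations from the mean, divided by n - 1.
variance_definition = sum((x - mean) ** 2 for x in units) / (n - 1)

# Calculation formula: (sum of squares - n * mean^2), divided by n - 1.
variance_shortcut = (sum(x ** 2 for x in units) - n * mean ** 2) / (n - 1)

std_dev = math.sqrt(variance_shortcut)

print(variance_definition, variance_shortcut, statistics.variance(units))
print(f"s = {std_dev:.2f} units per week")
```

Both formulas give the same answer; the second is usually quicker by hand because it avoids working out each deviation from the mean.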
3.3 Box plots
Box plots (or “box and whisker” plots) are another graphical method for displaying data.
Example 6
Suppose that, from our data, we obtain the following summary statistics:
Minimum   Lower Quartile (Q1)   Median (Q2)   Upper Quartile (Q3)   Maximum
10        40                    43            45                    50
A box plot is constructed as follows.
Box plots are particularly useful for highlighting differences between groups.
Example 7
The plot clearly shows that, although there is overlap between the three sets of data, the first and second
datasets contain roughly similar responses and these are quite different from those in the
third set. Note that the asterisks (*) at the ends of the whiskers are the way Minitab highlights
outlying values.
3.4 Exercises
1. Recall the following data from Exercise 1 in Chapter 2 on the weight (in kg) of 50 sacks of
potatoes leaving a farm shop.
8.1   8.2   8.5   8.7   8.8
8.9   9.2   9.3   9.3   9.4
9.5   9.5   9.6   9.6   9.6
9.7   9.7   9.9   9.9   10.0
10.0  10.0  10.0  10.0  10.1
10.2  10.2  10.2  10.3  10.3
10.4  10.4  10.4  10.5  10.6
10.6  10.6  10.6  10.6  10.7
10.8  10.9  11.0  11.2  11.3
11.3  11.3  11.5  11.6  12.8
(a) Calculate the mean of the data.
(b) Calculate the median of the data.
(c) Calculate the range of the data.
(d) Calculate the inter–quartile range.
(e) Calculate the sample standard deviation.
(f) Draw a box plot for these data and comment on it.
(g) Put the data in a grouped frequency table.
(h) Find the modal class.
2. Chloe collected the following data on the weight, in grams, of “large” chocolate chip cookies
produced by Millie’s Cookie Company.
27.1 22.4 26.5 23.4 25.6 26.3 51.3 24.9 26.0 25.4
To summarise, Chloe was going to calculate the mean and standard deviation for this sample. However, her friend Mark warned her that the mean and standard deviation might be
inappropriate measures of location and spread for these data.
(a) Do you agree with Mark? If so, why?
(b) Calculate measures of location and spread that you feel are more suitable.
3. An internet marketing firm was interested in the amount of time customers spend on their
website. They recorded the lengths of visits to the website for a sample of 100 customers
and whether the customer was male or female. The standard deviations of the lengths of
visits were 12.2 seconds for males and 18.5 seconds for females. Which group has the more
variable visit lengths, based on this sample, males or females?
4 Introduction to Probability
4.1 Definitions
An experiment is an activity where we do not know for certain what will happen, but we will
observe what happens. An outcome is one of the possible things that can happen. The sample
space is the set of all possible outcomes. An event is a set of outcomes.
All probabilities are measured on a scale ranging from zero to one, and can be expressed as
fractions, decimal numbers or percentages.
Notation: P (A) represents the probability of the event A, e.g. P (Rain tomorrow). P (Ā) is the
probability that A does not occur (“not A”).
The collection of all possible outcomes, that is the sample space, has a probability of 1. Two
events are said to be mutually exclusive if both cannot occur simultaneously. Two events are
said to be independent if the occurrence of one does not affect the probability of the other
occurring.
Example 1
Do you think the following pairs of events are independent?
• A: Molly plays table tennis, and B: Molly is good at maths
• C: Henry gets over 60 in MAS1403, and D: Henry gets under 40 in MAS1403
4.2 Measuring probability
1. Classical interpretation
Used when all possible outcomes are “equally likely”. In general, calculations follow from the
formula
$$P(\text{Event}) = \frac{\text{Total number of outcomes in which event occurs}}{\text{Total number of possible outcomes}}.$$
2. Frequentist interpretation
When the outcomes of an experiment are not equally likely, we can perform the same experiment a large number of times and observe the outcome. The probability of an event can be
estimated using the following formula:
$$P(\text{Event}) = \frac{\text{Number of times an event occurs}}{\text{Total number of times experiment performed}}.$$
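The frequentist formula can be tried out by simulation. The sketch below is only an illustration: the "experiment" is a hypothetical biased spinner whose true success probability (0.3) is an assumption built into the code, and the function names are invented for this example. The relative frequency should settle close to the true value as the number of repetitions grows.

```python
import random

def estimate_probability(experiment, n_trials, seed=1):
    """Relative frequency of 'success' over n_trials repetitions."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    successes = sum(1 for _ in range(n_trials) if experiment(rng))
    return successes / n_trials

def biased_spinner(rng):
    # Hypothetical experiment: 'success' occurs with true probability 0.3.
    return rng.random() < 0.3

estimate = estimate_probability(biased_spinner, 100_000)
print(round(estimate, 3))   # should be close to 0.3
```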
3. Subjective interpretation
Probabilities are formulated subjectively using an individual’s (sometimes expert) opinion.
(Useful when the experiment can’t be repeated.) For example, when we board an aeroplane,
we judge the probability of it crashing to be sufficiently small that we are happy to undertake
the journey.
4.3 Examples
1. Chicken King is a fast–food chain with 700 outlets in the UK. The geographic location of its
restaurants is tabulated below:
                               Region
Population          NE    SE    SW    NW   Total
Under 10,000        35    42    21    70     168
10,000–100,000      70   105    84    35     294
Over 100,000       175    28    35     0     238
Total              280   175   140   105     700
A health and safety organisation selects a restaurant at random for a hygiene inspection.
Assuming that each restaurant is equally likely to be selected, calculate the following probabilities.
(a) P (NE restaurant chosen),
(b) P (Restaurant chosen from a city with a population over 100,000),
(c) P (SW and city with a population under 10,000).
Solution
2. The spinner shown below is spun once.
Assuming each sector on the board is the same size, calculate the following probabilities.
(a) P (lands on a red shape) =
(b) P (lands on a triangle) =
(c) P (lands on a 4-sided shape) =
3. On the probability scale, how likely do you think it is that Newcastle United will be promoted
this season? Which approach to probability would you use to estimate this?
4.4 The addition rule
The addition rule describes the probability of any of two or more events occurring. The addition
rule for two events A and B is
P (A or B) = P (A) + P (B) − P (A and B).
This describes the probability of either event A or event B happening.
Example 2
Prospective interns at internet startup BlueFox face two aptitude tests. If 35 percent of applicants
pass the first test, 25 percent pass the second test, and 15 percent pass both tests, what percentage
of applicants pass at least one test?
Solution
We are told P (pass 1st test) = 0.35, P (pass 2nd test) = 0.25 and P (pass 1st and 2nd test) =
0.15. Therefore using the addition law
P (pass at least one test) = P (pass 1st or 2nd test)
= P (pass 1st test) + P (pass 2nd test) − P (pass 1st and 2nd test)
= 0.35 + 0.25 − 0.15
= 0.45.
So 45% of the applicants pass at least one of the tests.
Note: if events A and B are mutually exclusive then P (A and B) = 0 since A and B can’t
occur together. Therefore,
P (A or B) = P (A) + P (B).
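The calculation in Example 2 can be checked with a few lines of code. This is a minimal sketch; the function name `prob_a_or_b` is illustrative, and setting the third argument to zero gives the mutually exclusive case noted above.

```python
def prob_a_or_b(p_a, p_b, p_a_and_b=0.0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

# BlueFox aptitude tests from Example 2.
p_at_least_one = prob_a_or_b(0.35, 0.25, 0.15)
print(round(p_at_least_one, 2))   # 0.45
```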
4.5 Exercises
1. Do you think the following pairs of events are independent or dependent? Explain.
(a) E: An individual has a high IQ
F : An individual is accepted for a University place
(b) E1 : An individual has a large outstanding credit card debt
E2 : An individual is allowed to extend his bank overdraft
2. The following data refer to a class of 18 students. Suppose that we will choose one student
at random from this class.
Student         Height   Weight   Shoe
Number    Sex    (m)      (kg)    Size
   1       M     1.91      70     11.0
   2       F     1.73      89      6.5
   3       M     1.73      73      7.0
   4       M     1.63      54      8.0
   5       F     1.73      58      6.5
   6       M     1.70      60      8.0
   7       M     1.82      76     10.0
   8       M     1.67      54      7.5
   9       F     1.55      47      4.0
  10       M     1.78      76      8.5
  11       M     1.88      64      9.0
  12       M     1.88      83      9.0
  13       M     1.70      55      8.0
  14       M     1.76      57      8.0
  15       M     1.78      60      8.0
  16       F     1.52      45      3.5
  17       M     1.80      67      7.5
  18       M     1.92      83     12.0
Find the probabilities for the following events.
(a) The student is female.
(b) The student’s weight is greater than 70kg.
(c) The student’s weight is greater than 70kg and the student’s shoe-size is greater than 8.
(d) The student’s weight is greater than 70kg or the student’s shoe-size is greater than 8.
3. The regional manager of supermarket Freshco is interested in predicting sales patterns of
breakfast cereal. If 85% of Freshco customers buy branded cereals (e.g. Kellogg’s etc), 60%
of customers buy Freshco’s own-brand cereals, and 50% of customers buy both branded
and Freshco’s own-brand cereal, what percentage of Freshco customers do not buy breakfast
cereal?
5 Conditional probability
5.1 The multiplication rule
The multiplication rule describes the probability of two (or more) events occurring. The probability of two events A and B both occurring is
P (A and B) = P (A) × P (B|A),
where P (B|A) is the conditional probability of B given that A has already happened.
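To see the multiplication rule at work, here is a small sketch using a made-up example that is not part of these notes: a bag containing 3 red and 2 blue marbles, from which two marbles are drawn without replacement. Event A is "first marble red" and event B is "second marble red". Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Hypothetical bag of marbles (illustrative numbers only).
red, blue = 3, 2
total = red + blue

p_first_red = Fraction(red, total)                        # P(A)
p_second_red_given_first = Fraction(red - 1, total - 1)   # P(B|A): one red already removed

# Multiplication rule: P(A and B) = P(A) x P(B|A)
p_both_red = p_first_red * p_second_red_given_first
print(p_both_red)   # 3/10
```

Dividing `p_both_red` by `p_first_red` recovers the conditional probability, which is the rule read in the other direction.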
Example 1
A small company has 10 employees: 4 male and 6 female. You, as the manager, select two
employees at random to attend a training session. What is the probability that you select two
female employees?
Solution
Re-arranging the above expression for the multiplication rule gives a formula for calculating a
conditional probability:
$$P(B|A) = \frac{P(A \text{ and } B)}{P(A)}.$$
Example 2
Recall that prospective interns at internet startup BlueFox face two aptitude tests. If 35 percent
of applicants pass the first test, 25 percent pass the second test, and 15 percent pass both tests,
what percentage of applicants pass the second test given that they passed the first test?
Solution
Independent events: two events A and B are independent if P (B|A) = P (B), in which case
P (A and B) = P (A) × P (B).
Example 3
Are the outcomes of the two aptitude tests at internet startup BlueFox independent? Justify your
answer.
Solution
Example 4
Employees at a Marketing firm are classified by age and sex as follows:
          under 30   30 to 50   over 50   Total
Male        0.275      0.125     0.025
Female      0.325      0.175     0.075
So, for example, 27.5% of employees are Male and under 30 years of age.
From this table, calculate
(a) P (Male)
(b) P (30 to 50)
(c) P (Male|30 to 50)
(d) P (30 to 50|Male)
(e) Are the events “Male” and “30 to 50” independent?
(f) P (Male)
Solution
5.2 Tree diagrams
Tree diagrams (or probability trees) are simple, clear ways of presenting probabilistic information.
Example 5
Suppose we have a biased coin, with P (Head) = 0.75. Then the following tree diagram displays
all outcomes, along with their associated probabilities, for two consecutive flips of the coin:
First flip    Second flip    Outcome    Probability
H (0.75)      H (0.75)       HH         0.75 × 0.75 = 0.5625
H (0.75)      T (0.25)       HT         0.75 × 0.25 = 0.1875
T (0.25)      H (0.75)       TH         0.25 × 0.75 = 0.1875
T (0.25)      T (0.25)       TT         0.25 × 0.25 = 0.0625
Important: multiply probabilities along branches (multiplication rule); the probabilities at the
ends of the branches should add up to 1.
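The branch calculations in Example 5 can be reproduced by listing every path through the tree and multiplying along it. The sketch below assumes, as in the example, that the two flips are independent; the dictionary name `paths` is illustrative.

```python
from itertools import product

p = {"H": 0.75, "T": 0.25}   # biased coin from Example 5

# Multiply probabilities along each branch of the tree (two flips).
paths = {first + second: p[first] * p[second]
         for first, second in product("HT", repeat=2)}

for outcome, prob in paths.items():
    print(outcome, prob)     # HH 0.5625, HT 0.1875, TH 0.1875, TT 0.0625

total = sum(paths.values())
print(total)                 # 1.0
```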
Example 6
A small company has 10 employees: 4 male and 6 female. You, as the manager, select two
employees at random to attend a training session. What is the probability that you select one
male and one female employee?
Solution
Example 7
Joe has a Business Management exam on Thursday morning. On Wednesday night he is free to
choose one (and only one) of the following activities: (a) go to the cinema, (b) go to the pub,
(c) stay home and watch TV, (d) stay home and study. The probabilities that he elects these
alternatives are 0.14, 0.45, 0.25 and 0.16, respectively. His conditional probabilities of passing
the exam given (a), (b), (c) and (d) are 0.4, 0.05, 0.5 and 0.8 respectively. Find
(i) the probability that Joe goes to the pub and passes his exam;
(ii) the probability that Joe passes his exam;
(iii) the probability that Joe went to the pub, given that he passed his exam.
Solution
Use the space provided below to construct a tree diagram for this example.
(i) P (Joe goes to Pub and passes exam) =
(ii) P (Joe passes exam) =
(iii) P (Joe went to Pub | Joe passed exam) =
5.3 Exercises
1. An on-line retailer conducts a survey of 200 customers and obtains the following results.
                      Age
          Under 30   30 to 45   Over 45
Male         60         20         40
Female       40         30         10
A customer is selected at random.
(a) What is the probability that the customer is male and aged 30 to 45?
(b) Given that this customer is aged 30 to 45, what is the probability that they are male?
(c) Given that this customer is female, what is the probability that they are 45 or under?
(d) Now suppose that two customers are selected at random. What is the probability that
both are Male?
2. If Vinny goes to the cinema, there is a 60% chance he will then also go to the bar afterwards.
However, if he doesn’t go to the cinema, this reduces to just 30%. On Friday night, Vinny
decides to go to the cinema only if his friend Julia also goes. Vinny has no idea about Julia’s
intentions this Friday and so is just as likely to go to the cinema as he is to not go. Let C
be the event that Vinny goes to the cinema, and B the event that Vinny goes to the bar, this
Friday. Using a probability tree diagram, or otherwise, find
(a) P (C)
(b) P (C̄)
(c) P (B̄|C)
(d) P (B̄|C̄)
(e) P (C and B)
(f) P (B)
6 Decision–making using probability
6.1 Expected Monetary Value
The Expected Monetary Value (EMV) of a single event is simply the probability of that event
multiplied by its monetary value.
Example 1
Suppose you win £5 if you pull an ace from a pack of cards. The EMV would be
$$\text{EMV}(\text{Ace}) = P(\text{Ace}) \times \text{MonetaryValue}(\text{Ace}) = \frac{4}{52} \times 5 \approx 0.38.$$
Your expected return would be 38 pence; if you repeated this bet a large number of times, you
would come out, on average, 38 pence better off per bet. Therefore you would want to pay no
more than 38p for such a bet.
In general, for more complicated problems involving several options,
$$\text{EMV} = \sum \{P(\text{Event}) \times \text{Monetary value of Event}\}$$
where the sum is over all possible events. We choose the option with the largest EMV.
Example 2
Synaptec is a small technology company with a new product that they wish to launch on to the
market. It has three options:
• a direct approach, launching onto the domestic market through traditional channels;
• launching only on the internet;
• licensing the product to a larger company in return for a licence fee, paid
irrespective of the success of the product.
Initial market research suggests that demand for the product can be classed into three categories:
high, medium or low, and these categories will occur with probabilities 0.2, 0.35 and 0.45.
Likely profits (in £K) to be earned under each option are
            High   Medium   Low
Direct       100       55   -25
Internet      46       25    15
Licence       20       20    20
How should the company launch the product?
The EMV of each option can be calculated as follows:
EMV (Direct) = (0.2 × 100) + (0.35 × 55) + (0.45 × (−25)) = £28K
EMV (Internet) = (0.2 × 46) + (0.35 × 25) + (0.45 × 15) = £24.7K
EMV (Licence) = (0.2 × 20) + (0.35 × 20) + (0.45 × 20) = £20K.
On the basis of expected monetary value, the best choice is the Direct approach as this maximises EMV.
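The three EMV calculations above follow the same pattern, so they can be written as one small function. This is a sketch using the figures from Example 2; the names `emv`, `probabilities` and `profits` are illustrative.

```python
# Demand probabilities and profits (in £K) from Example 2.
probabilities = {"High": 0.2, "Medium": 0.35, "Low": 0.45}
profits = {
    "Direct":   {"High": 100, "Medium": 55, "Low": -25},
    "Internet": {"High": 46,  "Medium": 25, "Low": 15},
    "Licence":  {"High": 20,  "Medium": 20, "Low": 20},
}

def emv(payoffs, probs):
    """Sum of probability x monetary value over all events."""
    return sum(probs[event] * payoffs[event] for event in probs)

emvs = {option: emv(payoffs, probabilities) for option, payoffs in profits.items()}
for option, value in emvs.items():
    print(option, round(value, 2))    # Direct 28.0, Internet 24.7, Licence 20.0

best = max(emvs, key=emvs.get)        # option with the largest EMV
print(best)                           # Direct
```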
6.2 Decision trees
When we include a decision in a tree diagram (see Chapter 5) we use a rectangular node, called
a decision node to represent the decision. The diagram is then called a decision tree.
Example 3
The decision tree for the last example (Example 2) would look like this:
Decision node: how to launch the product
  Direct (EMV = +28)
    H (0.2)    100
    M (0.35)    55
    L (0.45)   -25
  Internet (EMV = +24.7)
    H (0.2)     46
    M (0.35)    25
    L (0.45)    15
  Licence (EMV = +20)
    H (0.2)     20
    M (0.35)    20
    L (0.45)    20
Key points:
• There are no probabilities at a decision node but we evaluate the expected monetary values
of the options.
• In a decision tree the first node (on the left) is always a decision node.
• There may also be other decision nodes.
• If there is another decision node then we evaluate the options there and choose the best one
(based on EMV), and the expected monetary value of this option becomes the expected
monetary value of the branch leading to the decision node.
• We work “backwards” through the tree (from right to left), evaluating EMVs and making
decisions at each decision node.
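The "work backwards" procedure in the key points can be sketched as a short recursive function. The representation of the tree as nested dictionaries, and the helper names `payoff`, `chance`, `decision` and `rollback`, are assumptions made for this illustration; the data are those of Example 2, which has a single decision node.

```python
def payoff(value):
    return {"type": "payoff", "value": value}

def chance(branches):
    """branches: list of (probability, child node) pairs."""
    return {"type": "chance", "branches": branches}

def decision(options):
    """options: dict mapping option name to child node."""
    return {"type": "decision", "options": options}

def rollback(node):
    """Return the EMV of a node, working from right to left."""
    if node["type"] == "payoff":
        return node["value"]
    if node["type"] == "chance":
        return sum(p * rollback(child) for p, child in node["branches"])
    # Decision node: evaluate every option and keep the one with the largest EMV.
    values = {name: rollback(child) for name, child in node["options"].items()}
    node["best"] = max(values, key=values.get)
    return values[node["best"]]

demand = [0.2, 0.35, 0.45]   # P(High), P(Medium), P(Low) from Example 2
profits = {"Direct": [100, 55, -25], "Internet": [46, 25, 15], "Licence": [20, 20, 20]}
tree = decision({name: chance([(p, payoff(v)) for p, v in zip(demand, outcomes)])
                 for name, outcomes in profits.items()})

root_emv = rollback(tree)
print(tree["best"], round(root_emv, 1))   # Direct 28.0
```

Because `rollback` calls itself on child nodes, a tree with further decision nodes nested inside chance branches would be handled in the same way.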
Example 4
Charlotte Watson, the manager of a small sales company, has the opportunity to buy a fixed
quantity of a new type of Android tablet which she can then offer for sale to clients.
The decision to buy the product and offer it for sale would involve a fixed cost of £200,000. The
number of tablets that will be sold is uncertain, but Charlotte judges that:
• Sales will be “poor” with probability 0.2; this will result in an income of £100,000.
• Sales will be “moderate” with probability 0.5; this will result in an income of £220,000.
• Sales will be “good” with probability 0.3; this will result in an income of £350,000.
For an additional fixed cost of £30,000, market research can be conducted to aid the decision–
making process. The outcome of the market research can be either positive or negative, with
probabilities 0.58 and 0.42, respectively. Knowing the outcome of the market research changes
the probabilities for the main sales project as follows:
                    Main sales probabilities
Market research     Poor   Moderate   Good
Positive            0.15     0.45     0.4
Negative            0.6      0.35     0.05
Charlotte has various options:
• Buy the tablets, without market research.
• Pay for the market research.
• Do nothing.
If she pays for the market research then, depending on the outcome, she can:
• Buy the tablets.
• Do nothing.
(a) Draw a decision tree for this problem.
(b) Use expected monetary value to determine the optimal course of action for Charlotte.
6.3 Exercises
1. Picoplex Technologies have developed a new manufacturing process which they believe will
revolutionise the smartphone industry. They are, however, uncertain how they should go
about exploiting this advance.
Initial indications of the likely success of marketing the process are 55%, 30%, 15% for
“high success”, “medium success” and “probable failure”, respectively. The company has
three options; they can go ahead and develop the technology themselves, licence it or sell
the rights to it. The financial outcomes (in £ millions) for each option are given in the table
below.
            “high success”   “medium success”   “failure”
Develop           80                40             –100
Licence           40                30                0
Sell              25                25               25
(a) Draw a decision tree to represent the company’s problem.
(b) Calculate the Expected Monetary Value for all possible decisions the company may take
and hence determine the optimal decision for the company.
2. The manager of a small business has the opportunity to buy a fixed quantity of a new product
and offer it for sale for a limited time.
The decision to buy the product and offer it for sale would involve a fixed cost of £150,000.
The amount that would be sold is uncertain but the manager judges that:
• There is a probability of 0.3 that sales will be “poor” with an income of £80,000.
• There is a probability of 0.5 that sales will be “medium” with an income of £160,000.
• There is a probability of 0.2 that sales will be “good” with an income of £240,000.
For an additional fixed cost of £20,000, the product can be sold for a trial period before a
final decision is made. No income is made from this trial. The result of the trial will be
“poor” with probability 0.33, “medium” with probability 0.40 or “good” with probability
0.27. Knowing the outcome of the trial changes the probabilities for the main sales project:
                  Main sales probabilities
Trial outcome     Poor   Medium   Good
Poor              0.7     0.2     0.1
Medium            0.2     0.6     0.2
Good              0.1     0.2     0.7
The manager also has the option to do nothing.
(a) Draw a decision tree for this problem.
(b) Use expected monetary value to determine the optimal course of action for this business.
7 Discrete probability models
7.1 Probability distributions
The probability distribution of a discrete random variable X is the list of all possible values
X can take and the probabilities associated with them.
Example 1
If the random variable X is the outcome of a roll of a fair six-sided die then the probability
distribution for X is:
r            1     2     3     4     5     6    Sum
P (X = r)   1/6   1/6   1/6   1/6   1/6   1/6    1
Key point: For a discrete random variable the probabilities of each possible value sum up to 1.
7.2 The binomial distribution
Suppose the following statements hold:
• There are a fixed number of trials or experiments (n).
• There are only two possible outcomes for each trial (‘success’ or ‘failure’).
• There is a constant probability of ‘success’, p.
• The outcome of each trial is independent of any other trial.
Then the number of successes, X, follows a binomial distribution.
Example 2
Which of the following scenarios could be adequately modelled by a binomial distribution?
• The number of sixes on 3 rolls of a fair six-sided die.
• The number of students who pass MAS1403 this year.
7.2.1 Calculating probabilities
If X follows a binomial distribution we write X ∼ Bin(n, p), and
$$P(X = r) = {}^{n}C_{r} \times p^{r} \times (1 - p)^{n - r}, \qquad r = 0, 1, \ldots, n.$$
Here, $^{n}C_{r}$ is the number of ways of getting r successes out of n trials, and is given by
$$^{n}C_{r} = \frac{n!}{r!\,(n - r)!},$$
where r! = 1 × 2 × 3 × · · · × (r − 1) × r is known as “r factorial”. Important: most scientific
calculators have an nCr button!
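For readers who prefer a computer to a calculator, the binomial formula translates directly into code. This is a minimal sketch using Python's built-in `math.comb` for the nCr term; the function name `binomial_pmf` and the values n = 5, p = 0.5 are chosen purely for illustration.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p): nCr x p^r x (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Illustrative values: 5 trials, each with success probability 0.5.
print(binomial_pmf(2, 5, 0.5))                       # 0.3125

# The probabilities over r = 0, 1, ..., n sum to 1.
total = sum(binomial_pmf(r, 5, 0.5) for r in range(6))
print(total)                                         # 1.0
```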
Example 3
What is the probability of getting 2 sixes from three rolls of a fair six-sided die?
Solution
Example 4
If X ∼ Bin(10, 0.2) calculate:
(a) P (X = 2)
(b) P (X ≤ 2)
(c) P (X < 3)
(d) P (X > 1)
Solution
7.2.2 Mean and variance
If X ∼ Bin(n, p), then its mean (or “expected value”) and variance are
E[X] = n × p
and
Var(X) = n × p × (1 − p).
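These two formulas can be checked against the definition of a mean and variance by summing over the whole probability distribution. The sketch below does this for illustrative values n = 5 and p = 0.5 (not taken from the examples in these notes).

```python
from math import comb, sqrt

n, p = 5, 0.5   # illustrative values only

# Full probability distribution P(X = r), r = 0, ..., n.
pmf = [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

# Mean and variance computed directly from the distribution.
mean = sum(r * pr for r, pr in enumerate(pmf))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))

print(mean, n * p)                   # 2.5 2.5
print(variance, n * p * (1 - p))     # 1.25 1.25
print(round(sqrt(variance), 3))      # 1.118 (the standard deviation)
```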
Example 5
If X ∼ Bin(10, 0.2) calculate:
(a) E[X]
(b) Var(X)
(c) SD(X)
Solution
Example 6
A salesperson has a 50% chance of making a sale on a customer visit and she arranges 6 visits
in a day.
(a) Assuming sales at each visit are independent, suggest an appropriate distribution for the
number of sales she makes in a day.
(b) Calculate her expected number of sales.
Solution
7.3 Exercises
1. Consider the following probability distribution for the discrete random variable X. One of
the values is missing.
r            -2    -1     0     1     2
P (X = r)   0.1   0.2     ?   0.3   0.2
What is the missing value, P (X = 0)?
2. Let X be the number of sixes rolled on four rolls of a fair six-sided die.
(a) Calculate the probability distribution of X, i.e. the values P (X = r) for r = 0, 1, 2, 3, 4.
(b) Calculate P (X ≤ 2).
(c) Calculate P (X > 2).
(d) Calculate the mean and variance of X.
(e) What is the most likely number of sixes from four rolls of the die?
8 More discrete probability models
8.1 The Poisson distribution
Suppose the following hold:
• Events occur independently, at a constant rate (λ);
• There is no natural upper limit to the number of events.
Then the number of events, X, occurring in a given interval, has a Poisson distribution with
parameter λ.
Example 1
Which of the following random variables could be modelled by a Poisson distribution? Suggest an alternative if the Poisson distribution is not appropriate, and state the values of any
parameters.
(a) Calls are received at a call centre at a constant rate of 3 per minute on average. Let X be
the number of calls received in a 1 minute period.
(b) An operator at a tele-sales marketing firm has 20 calls to make in an hour. History suggests
that calls will be answered 55% of the time. Let Y be the number of answered calls in an
hour.
(c) Newcastle United score goals at a constant rate of 2.4 in 90 minutes, on average. Let Z be
the number of goals scored in 45 minutes.
Solution
8.1.1 Probabilities, means and variances
If X follows a Poisson distribution we write X ∼ Po(λ), and
$$P(X = r) = \frac{\lambda^{r} e^{-\lambda}}{r!}, \qquad r = 0, 1, \ldots$$
If X ∼ Po(λ), then its mean and variance are
$$E[X] = \lambda \quad \text{and} \quad \mathrm{Var}(X) = \lambda.$$
[Approximation to binomial: If X ∼ Bin(n, p) with n large, p small and both np and n(1 − p) > 5
then X is approximately Po(np).]
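The Poisson probability formula is just as easy to code as the binomial one. In this sketch the function name `poisson_pmf` and the rate λ = 2 are illustrative choices, not values from the notes.

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam): lam^r x e^(-lam) / r!"""
    return lam**r * exp(-lam) / factorial(r)

# Illustrative rate: lam = 2 events per interval.
print(round(poisson_pmf(0, 2), 4))   # 0.1353
print(round(poisson_pmf(3, 2), 4))   # 0.1804

# There is no upper limit on r, but the probabilities still sum to 1;
# the terms beyond r = 50 are negligibly small here.
total = sum(poisson_pmf(r, 2) for r in range(51))
```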
Example 2
If X ∼ Po(5) calculate:
(a) P (X = 4)
(b) P (X ≤ 1)
(c) P (X > 0)
(d) E[X]
(e) Var(X)
(f) SD(X)
Solution
Example 3
A new Mercedes–Benz car franchise forecasts that it will sell around three of its most expensive
models each day.
(a) What probability distribution might be reasonable to use to model the number of cars sold
each day?
(b) What is the expected number and standard deviation of the number of cars sold each day?
(c) What is the probability that 3 cars are sold on a particular day?
(d) What is the probability that no cars are sold on a particular day?
(e) What is the probability that at least one car is sold on a particular day?
(f) Sales will be monitored over the next seven days and the sales team at the franchise will
receive a warning if they make no sales on at least 1 of the 7 days. What is the probability
that they receive a warning?
Solution
8.2 Exercises (on Chapters 7 & 8)
1. Which of the following random variables could be modelled with a binomial distribution and
which could be modelled with a Poisson distribution? In each case state the value(s) of the
parameter(s) of the distribution.
(a) A salesperson has a 30% chance of making a sale on a customer visit. She arranges 10
visits in a day. Let X be the number of sales she makes in a day.
(b) Calls to the British Passport Office in Durham occur at a rate of 7 per hour on average.
Let Y be the number of calls at the passport office in a 1 hour period.
(c) History suggests that 10% of eggs from a family-run farm are bad. Let Z be the number
of bad eggs in a box of a dozen (i.e. 12) eggs.
2. An operator at a call centre has 8 calls to make in an hour. History suggests that they will be
answered 40% of the time. Let X be the number of answered calls in an hour.
(a) What probability distribution does X have?
(b) What is the mean and standard deviation of X?
(c) Calculate the probability of getting a response exactly 7 times.
(d) Calculate the probability of getting fewer than 2 responses.
3. Calls are received at a telephone exchange at an average rate of 4 per minute. Let Y be the
number of calls received in one minute.
(a) What probability distribution does Y have?
(b) What is the mean and standard deviation of Y ?
(c) Calculate the probability that there are 6 calls in one minute.
(d) Calculate the probability that there are no more than 2 calls in a minute.
(e) Calculate the probability that there are more than 2 calls in a minute.
9 Continuous probability models
9.1 The Normal distribution
The Normal distribution is possibly the best–known and most–used continuous probability
distribution: you will use it a lot in Semester 2 of MAS1403. Its probability density function
(pdf) has a symmetrical “bell shaped” profile:
[Figure: the bell-shaped Normal pdf, f (x) against x, with the x-axis marked at µ − 4σ, µ − 2σ, µ, µ + 2σ and µ + 4σ.]
We can think of the pdf as a smoothed percentage relative frequency histogram: the area under
the curve is 1.
The Normal distribution has two parameters: the mean, µ, and the standard deviation, σ.
[Figure: three panels plotting Density against x: “Normal pdf with mean 30 and sd 10”; “Normal pdfs with means 10, 30, 50 and sd 10”; “Normal pdfs with mean 30 and sds 5, 10, 15”.]
If a random variable X has a Normal distribution with mean µ and variance σ², then we write
X ∼ N(µ, σ²).
9.1.1 The standard Normal distribution
The standard Normal distribution, usually denoted by
Z ∼ N(0, 1),
has a mean of zero and a variance of 1, and we have tables of probabilities for this particular
Normal distribution; see the Probability Tables for the Standard Normal Distribution.
Example 1
Find the following probabilities when Z ∼ N(0, 1).
(a) P (Z ≤ −1.46)
(b) P (Z ≤ 0.01)
(c) P (Z > 1.5)
(d) P (−1.2 < Z ≤ 1.5)
(e) P (Z < 1.5)
(f) P (Z = z)
Solution
9.1.2 Probabilities from any Normal distribution
Any Normally distributed random variable X ∼ N(µ, σ²) can be transformed into the standard
Normal distribution using the formula:
$$Z = \frac{X - \mu}{\sigma},$$
therefore
$$P(X \le x) = P\left(Z \le \frac{x - \mu}{\sigma}\right),$$
which can be looked up in tables.
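As an alternative to the tables, the standardisation step can be sketched in code. The standard Normal probability P(Z ≤ z) is computed here from the error function `math.erf`, which is a standard identity rather than anything specific to these notes; the function names and the values µ = 50, σ = 10 are illustrative.

```python
from math import erf, sqrt

def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2): standardise, then use N(0, 1)."""
    z = (x - mu) / sigma
    return standard_normal_cdf(z)

# Illustrative values: X ~ N(50, 10^2), so x = 60 corresponds to z = 1.
print(round(normal_cdf(60, 50, 10), 4))   # 0.8413
print(standard_normal_cdf(0))             # 0.5
```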
Example 2
If X ∼ N(10, 2²) calculate P (X ≤ 8).
Solution
Example 3
Suppose X is the IQ of a randomly selected 18–19 year old and that X follows a normal
distribution with mean µ = 100 and standard deviation σ = 15. Thus, we have:
X ∼ N(100, 15²).
Find the following probabilities.
(a) The probability that an 18–19 year old has an IQ less than 110.
(b) The probability that an 18–19 year old has an IQ greater than 110.
(c) The probability that an 18–19 year old has an IQ greater than 125.
(d) The probability that an 18–19 year old has an IQ between 95 and 115.
Solutions
9.2 Exercises
1. A company promises delivery within 20 working days of receipt of order. However, in
reality, they deliver according to a normal distribution with a mean of 16 days and a standard
deviation of 2.5 days.
(a) What proportion of customers receive their order late?
(b) What proportion of customers receive their orders between 10 and 15 days of placing
their order?
(c) A new order processing system promises to reduce the standard deviation of delivery
times to 1.5 days. If this system is used, what proportion of customers will receive their
deliveries within 20 days?
2. A drinks machine is regulated by its manufacturer so that it dispenses an average of 200ml
per cup. However, the machine is not particularly accurate and actually dispenses an amount
that has a normal distribution with standard deviation 15ml.
(a) What percentage of cups contain below the minimum permissible volume of 170ml?
(b) What percentage of cups contain over 225ml?
(c) What percentage of cups contain between 175ml and 225ml?
(d) How many cups would you expect to overflow if 240ml cups are used for the next 10000
drinks?
10 More continuous probability distributions
10.1 The normal distribution: using tables in reverse
Suppose we are told that P (Z ≤ z) = 0.95. What is the value of z?
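From the Probability Tables for the Standard Normal Distribution, we can see that
P (Z ≤ 1.64) = 0.9495 and P (Z ≤ 1.65) = 0.9505.
Since 0.95 lies halfway between these two probabilities, we take z halfway between 1.64 and 1.65.
Therefore, z = 1.645.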
Now suppose that X ∼ N(100, 15²), as in the IQ example from Chapter 9. Below what IQ are
95% of the population?
We know that P (Z ≤ 1.645) = 0.95 and z = (x − µ)/σ so
$$1.645 = \frac{x - \mu}{\sigma} = \frac{x - 100}{15},$$
therefore
$$x = 1.645 \times 15 + 100 \simeq 124.7.$$
In other words, 95% of IQs are less than about 125.
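The same "tables in reverse" calculation can be checked with Python's standard library, whose `statistics.NormalDist` class provides an inverse cumulative distribution function. This is offered as a cross-check on the table method, not as a replacement for it.

```python
from statistics import NormalDist

# Value of z with P(Z <= z) = 0.95 for the standard Normal distribution.
z = NormalDist().inv_cdf(0.95)

# Undo the standardisation for the IQ example: x = z * sigma + mu.
x = z * 15 + 100

print(round(z, 3), round(x, 1))   # 1.645 124.7
```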
10.2 The uniform distribution
The uniform distribution is the simplest continuous distribution. As the name suggests, it
describes a variable for which all possible outcomes are equally likely.
If the random variable X follows a uniform distribution, we write
X ∼ U(a, b).
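Probabilities can be calculated using the formula
$$P(X \le x) = \begin{cases} 0 & \text{for } x < a \\ \dfrac{x - a}{b - a} & \text{for } a \le x \le b \\ 1 & \text{for } x > b, \end{cases}$$
and the mean and variance are given by
$$E[X] = \frac{a + b}{2}, \qquad \mathrm{Var}(X) = \frac{(b - a)^2}{12}.$$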
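The three cases of the uniform probability formula map directly onto three branches in code. In this sketch the function names and the interval U(0, 10) are illustrative choices only.

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for X ~ U(a, b)."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_variance(a, b):
    return (b - a) ** 2 / 12

# Illustrative interval: X ~ U(0, 10).
print(uniform_cdf(2.5, 0, 10))                                   # 0.25
print(uniform_mean(0, 10), round(uniform_variance(0, 10), 2))    # 5.0 8.33

# Probability of falling between two values: difference of two cdf values.
p_between = uniform_cdf(7, 0, 10) - uniform_cdf(4, 0, 10)
```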
10.3 The exponential distribution
The exponential distribution is another common distribution that is used to describe continuous random variables. It is often used to model lifetimes of products and times between
“random” events such as arrivals of customers in a queueing system or arrivals of orders. The
distribution has one parameter, λ. If our random variable X follows an exponential distribution,
then we say
X ∼ Exp(λ).
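Probabilities can be calculated using
$$P(X \le x) = \begin{cases} 1 - e^{-\lambda x} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0, \end{cases}$$
and the mean and variance are given by
$$E[X] = \frac{1}{\lambda}, \qquad \mathrm{Var}(X) = \frac{1}{\lambda^2}.$$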
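The exponential formulas can be coded in the same way. The rate λ = 0.5 below is an illustrative value, and the function names are invented for this sketch.

```python
from math import exp

def exponential_cdf(x, lam):
    """P(X <= x) for X ~ Exp(lam)."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

def exponential_mean(lam):
    return 1 / lam

def exponential_variance(lam):
    return 1 / lam**2

# Illustrative rate: lam = 0.5.
print(round(exponential_cdf(2, 0.5), 4))                   # 0.6321
print(exponential_mean(0.5), exponential_variance(0.5))    # 2.0 4.0

# P(X > x) is the complement of P(X <= x).
p_greater = 1 - exponential_cdf(2, 0.5)
```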
10.3.1 Poisson process
The exponential distribution and the Poisson distribution are related through the notion of events
occurring randomly in time (at a constant average rate, λ). This is known as a Poisson process.
Consider a series of randomly occurring events such as calls at a call centre. The times of calls
might look like
0    × ×    1         2    ××    3    ×    4    ×    5
There are two ways of viewing these data. One is as the number of calls in each minute (here 2,
0, 2, 1 and 1) and the other is as the times between successive calls. For the Poisson process,
• the number of calls in each one minute interval has a Poisson distribution with parameter λ, and
• the time between successive calls has an exponential distribution with parameter λ.
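The link between the two views can be explored by simulation. The sketch below generates exponential gaps between calls using `random.expovariate` and then counts the calls falling in each one-minute interval; the rate of 3 calls per minute, the seed and the function name are assumptions made for the illustration. If the two descriptions agree, the counts should have mean and variance both close to λ.

```python
import random

def simulate_counts(rate, n_minutes, seed=1):
    """Count events per minute when gaps between events are Exp(rate)."""
    rng = random.Random(seed)         # fixed seed so the run is repeatable
    counts = [0] * n_minutes
    t = rng.expovariate(rate)         # time of the first call
    while t < n_minutes:
        counts[int(t)] += 1           # int(t) is the minute the call falls in
        t += rng.expovariate(rate)    # add the next exponential gap
    return counts

counts = simulate_counts(rate=3, n_minutes=10_000)
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)

# For a Poisson distribution the mean and variance are both lambda (3 here).
print(round(mean_count, 2), round(var_count, 2))
```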
10.4 Exercises
1. An express coach is due to arrive in Newcastle from London at 11pm. However, in practice,
it is equally likely to arrive anywhere between 15 minutes early to 45 minutes late, depending
on traffic conditions. Let the random variable X denote the amount of time (in minutes) that
the coach is delayed.
(a) Calculate the mean of the delay time.
(b) What is the probability that the coach is less than 5 minutes late?
(c) What is the probability that the coach is more than 20 minutes late?
(d) What is the probability that the coach arrives between 10.55 and 11.20pm?
(e) What is the probability that the coach arrives before 11pm?
2. The time (in minutes) between requests to a network server can be modelled by an exponential distribution with rate parameter λ = 2.5.
(a) What is the expected time between requests?
(b) What is the probability that the time between requests is less than 1 minute and 30
seconds?
(c) What is the probability that the time between requests is greater than 1 minute?
(d) What is the probability that the time between requests is between 1 minute and 1 minute
and 30 seconds?
(e) What is the probability that the time between requests is between 30 seconds and 50
seconds?
Probability Tables for the Standard Normal Distribution
The table contains values of P (Z ≤ z), where Z ∼ N(0, 1).
   z   -0.09   -0.08   -0.07   -0.06   -0.05   -0.04   -0.03   -0.02   -0.01    0.00
-2.9  0.0014  0.0014  0.0015  0.0015  0.0016  0.0016  0.0017  0.0018  0.0018  0.0019
-2.8  0.0019  0.0020  0.0021  0.0021  0.0022  0.0023  0.0023  0.0024  0.0025  0.0026
-2.7  0.0026  0.0027  0.0028  0.0029  0.0030  0.0031  0.0032  0.0033  0.0034  0.0035
-2.6  0.0036  0.0037  0.0038  0.0039  0.0040  0.0041  0.0043  0.0044  0.0045  0.0047
-2.5  0.0048  0.0049  0.0051  0.0052  0.0054  0.0055  0.0057  0.0059  0.0060  0.0062
-2.4  0.0064  0.0066  0.0068  0.0069  0.0071  0.0073  0.0075  0.0078  0.0080  0.0082
-2.3  0.0084  0.0087  0.0089  0.0091  0.0094  0.0096  0.0099  0.0102  0.0104  0.0107
-2.2  0.0110  0.0113  0.0116  0.0119  0.0122  0.0125  0.0129  0.0132  0.0136  0.0139
-2.1  0.0143  0.0146  0.0150  0.0154  0.0158  0.0162  0.0166  0.0170  0.0174  0.0179
-2.0  0.0183  0.0188  0.0192  0.0197  0.0202  0.0207  0.0212  0.0217  0.0222  0.0228
-1.9  0.0233  0.0239  0.0244  0.0250  0.0256  0.0262  0.0268  0.0274  0.0281  0.0287
-1.8  0.0294  0.0301  0.0307  0.0314  0.0322  0.0329  0.0336  0.0344  0.0351  0.0359
-1.7  0.0367  0.0375  0.0384  0.0392  0.0401  0.0409  0.0418  0.0427  0.0436  0.0446
-1.6  0.0455  0.0465  0.0475  0.0485  0.0495  0.0505  0.0516  0.0526  0.0537  0.0548
-1.5  0.0559  0.0571  0.0582  0.0594  0.0606  0.0618  0.0630  0.0643  0.0655  0.0668
-1.4  0.0681  0.0694  0.0708  0.0721  0.0735  0.0749  0.0764  0.0778  0.0793  0.0808
-1.3  0.0823  0.0838  0.0853  0.0869  0.0885  0.0901  0.0918  0.0934  0.0951  0.0968
-1.2  0.0985  0.1003  0.1020  0.1038  0.1056  0.1075  0.1093  0.1112  0.1131  0.1151
-1.1  0.1170  0.1190  0.1210  0.1230  0.1251  0.1271  0.1292  0.1314  0.1335  0.1357
-1.0  0.1379  0.1401  0.1423  0.1446  0.1469  0.1492  0.1515  0.1539  0.1562  0.1587
-0.9  0.1611  0.1635  0.1660  0.1685  0.1711  0.1736  0.1762  0.1788  0.1814  0.1841
-0.8  0.1867  0.1894  0.1922  0.1949  0.1977  0.2005  0.2033  0.2061  0.2090  0.2119
-0.7  0.2148  0.2177  0.2206  0.2236  0.2266  0.2296  0.2327  0.2358  0.2389  0.2420
-0.6  0.2451  0.2483  0.2514  0.2546  0.2578  0.2611  0.2643  0.2676  0.2709  0.2743
-0.5  0.2776  0.2810  0.2843  0.2877  0.2912  0.2946  0.2981  0.3015  0.3050  0.3085
-0.4  0.3121  0.3156  0.3192  0.3228  0.3264  0.3300  0.3336  0.3372  0.3409  0.3446
-0.3  0.3483  0.3520  0.3557  0.3594  0.3632  0.3669  0.3707  0.3745  0.3783  0.3821
-0.2  0.3859  0.3897  0.3936  0.3974  0.4013  0.4052  0.4090  0.4129  0.4168  0.4207
-0.1  0.4247  0.4286  0.4325  0.4364  0.4404  0.4443  0.4483  0.4522  0.4562  0.4602
 0.0  0.4641  0.4681  0.4721  0.4761  0.4801  0.4840  0.4880  0.4920  0.4960  0.5000
   z    0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
 0.0  0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
 0.1  0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
 0.2  0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
 0.3  0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
 0.4  0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
 0.5  0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
 0.6  0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
 0.7  0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
 0.8  0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
 0.9  0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
 1.0  0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
 1.1  0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
 1.2  0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
 1.3  0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
 1.4  0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
 1.5  0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
 1.6  0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
 1.7  0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
 1.8  0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
 1.9  0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
 2.0  0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
 2.1  0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
 2.2  0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
 2.3  0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
 2.4  0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
 2.5  0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
 2.6  0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
 2.7  0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
 2.8  0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
 2.9  0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
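Tabulated values of P(Z ≤ z) can also be checked numerically. The sketch below is one possible way to do this in Python, using the standard identity between the standard normal distribution function and the error function. The function name `standard_normal_cdf` is an assumption made for the example, and this is not necessarily how the table was produced.

```python
import math


def standard_normal_cdf(z):
    """Return P(Z <= z) for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Compare with table entries: the row gives z to one decimal place and
# the column gives the second decimal place.
for z in (-1.00, 0.50, 1.96):
    print(z, round(standard_normal_cdf(z), 4))
```

For example, z = 1.96 corresponds to row 1.9 and column 0.06 of the table.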