Outcomes, Counting, Countability, Measures and Probability
OPRE 7310 Lecture Notes by Metin Çakanyıldırım
Compiled at 17:34 on Friday 23rd September, 2016
1 Why Probability?
We typically have limited information about experiments, so we cannot know their outcomes with certainty. More
information can be collected, if doing so is profitable, to reduce the uncertainty. But some amount of uncertainty
always remains, as information collection is costly and can be inaccurate or even impossible. So we are
often bound to work with probability models.
Example: Instead of forecasting the number of probability books sold at the UTD bookstore tomorrow, we could
ask everybody (students, faculty, staff, residents of Plano and Richardson) whether they plan to buy a book tomorrow. Surveying potential customers in this manner is always possible. But surveys are costly and inaccurate.
⋄
Before setting up probability models, we observe experiments and their outcomes in real-life to have
a sense of what is likely to happen. Deriving inferences from observations is the field of Statistics. These
inferences about the likelihood of outcomes become the ingredients of probability models that are designed
to mimic the real-life experiments. Probability models can later be used to make decisions to manage the
real-life contexts.
Example: Some real-life experiments that are worthy of probability models are subatomic particle collisions,
genetic breeding, weather forecasting, financial securities and queues. ⋄
2 An Event – A Collection of the Outcomes of an Experiment
The outcome of an experiment may be uncertain before the experiment happens. That is, the outcome may not be
determined with (sufficient) certainty ex ante. Here the word experiment has a broad meaning that covers
more than laboratory experiments or on-site experiments. It covers any action or activity whose outcomes
are of interest. This broader meaning is illustrated with the next example.
Example: As an experiment, we can consider an update of Generally Accepted Accounting Principles
(GAAP) issued by Financial Accounting Standards Board (FASB.org). Suppose that the board is investigating an update of reporting requirements for startups (more formally, Development Stage Entities). The
board can decide to keep (K) the status quo, increase (I) the reporting requirements or decrease (D) them.
Although accounting professionals can assess the likelihood of each of the outcomes K, I and D, they cannot
be certain whether the board’s discussion will lead to I or D, or K, so the outcomes of the update experiment
are uncertain. ⋄
Sufficiency of certainty depends on the intended use of the associated probability model. A room thermostat may be assumed to be showing the room temperature with sufficient certainty for the purpose of
controlling the temperature with an air conditioner. The same thermostat may have insufficient certainty for
controlling the speed of a heat-releasing chemical reaction. When the uncertainty is deemed to be excessive,
it can be reduced, say by employing a more accurate thermostat. Or a probabilistic model can be designed
to incorporate the uncertainty, say by controlling the average speed of the reaction.
Example: Outcomes of a dice rolling experiment are 1, 2, 3, 4, 5 and 6. For a fair dice, each outcome is
(sufficiently) uncertain. ⋄
Each outcome of an experiment can be denoted generically by ω or indexed as ωi for specificity. Often
these outcomes are minimal outcomes that cannot be or are not preferred to be separated into several other
outcomes. Such minimal outcomes can be called elementary outcomes. Two elementary outcomes cannot occur
at once, so elementary outcomes are mutually exclusive. Then elementary outcomes can be collected to obtain
a set of outcomes Ω, that is generally called the sample space.
Example: For the experiment of updating the accounting principles, ω1 = K, ω2 = I, ω3 = D and Ω =
{K } ∪ { I } ∪ { D } = {K, I, D }. ⋄
Example: For the experiment of rolling a dice, ωi = i for i = 1, . . . , 6 and Ω = {1, 2, 3, 4, 5, 6}. ⋄
An event is a collection of the outcomes of an experiment. So, an event is a subset of the sample space, i.e.,
a non-empty event A satisfies ω ∈ A ⊆ Ω for some ω. An event can also be empty, denoted A = ∅. Although we are
not interested in impossible events in practice, the consideration of ∅ is useful for the theoretical construction of
probability models.
Example: For the experiment of updating the accounting principles, the event of not increasing the reporting
requirements can be denoted by {K, D }. ⋄
Example: For the experiment of rolling a dice, the event of an even outcome is {2, 4, 6} and the event of no
outcome is ∅. ⋄
Example: Consider the collision of two hydrogen atoms on a plane. One of the atoms is stationary at the
origin and is hit by another moving from left to right. After the collision, the atom moving from left to
right can move to the 1st, 2nd, 3rd or 4th quadrant. The sample space for the movement of this atom is
{1, 2, 3, 4} and the event that it bounces back (moves from right to left after the collision) is {2, 3}. ⋄
Since an event corresponds to a set in the sample space, we can apply set operations on events. In
particular, for two events A and B, we can speak of intersection, union, set difference. If the intersection of
two events is empty, they are called disjoint: If A ∩ B = ∅, then A and B are disjoint.
Example: In an experiment of rolling a dice twice, we can consider the sum and the multiplication of the
numbers in the first and second rolls. Let A be the event that the multiplication is odd; B be the event that the
sum is odd; C be the event that both multiplication and sum are odd; D be the event that both multiplication
and sum are even.
The outcomes in A have both the first and the second number odd, while the outcomes in B have one odd
number and one even number. Hence no outcome can be in both events A and B, which turn out to be disjoint
events: A ∩ B = ∅ = C. To be in D, an outcome must have both numbers even. In each outcome, either both
numbers are odd (so the multiplication is odd and the sum is even =⇒ A); or one is odd while the other is even
(so the multiplication is even and the sum is odd =⇒ B); or both numbers are even (so the multiplication and the sum are
even =⇒ D). Hence, A ∪ B ∪ D = Ω. To convince yourself further, you can enumerate each outcome and
see whether it falls in A or B or D by completing Table 1. ⋄
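The partition argument above can also be checked with a short Python sketch; the variable names are ours, not part of the notes.

```python
from itertools import product

# Sample space for two dice rolls: all ordered pairs (i, j) with i, j in 1..6.
omega = set(product(range(1, 7), repeat=2))

A = {(i, j) for (i, j) in omega if i * j % 2 == 1}    # multiplication odd
B = {(i, j) for (i, j) in omega if (i + j) % 2 == 1}  # sum odd
C = A & B                                             # multiplication and sum both odd
D = {(i, j) for (i, j) in omega
     if i * j % 2 == 0 and (i + j) % 2 == 0}          # both even

print(len(A), len(B), len(D))  # 9 18 9
print(C == set())              # True: A and B are disjoint, so C is empty
print(A | B | D == omega)      # True: A, B, D cover the sample space
```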
3 Counting Countable Outcomes
When outcomes are countable, they can be finite or infinite.
Table 1: Sample space Ω for two dice rolls is composed of pairs (i, j) for i, j = 1 . . . 6. Each cell below shows
(multiplication = ij, sum = i + j) and the associated event; the remaining cells and events are left for you to complete.

                                    Second Roll
               1          2          3          4          5          6
First    1  (1,2), A   (2,3), B   (3,4), A   (4,5), B   (5,6), A   (6,7), B
Roll     2  (2,3), B   (4,4), D   (6,5), B   (8,6), D   (10,7), B  (12,8), D
         3  (3,4),     (6,5),     (9,6),     (12,7),    (15,8),    (18,9),
         4  (4,5),     (8,6),     (12,7),    (16,8),    (20,9),    (24,10),
         5
         6
3.1 Finite Outcomes
3.1.1 Multiplication Principle
We can manually count the outcomes of an experiment if the outcomes are finite. In the experiment of updating the accounting principles, the experiment of rolling a dice and the experiment of rolling a dice twice,
the numbers of outcomes are respectively 3, 6 and 36. These numbers are found by manually counting the
outcomes. Sometimes instead of manually counting, we use the multiplication principle of counting illustrated
in the next example.
Example: An online dress retailer carries 3 styles of lady dresses: Night dress, Corporate dress and Sporty
dress. Each style has 20 cuts, 8 sizes and 5 different colors. A stock keeping unit (sku) for the online retailer
is defined by the dress style, cut, size and color as these four characteristics fully describe the dress item and
are used to buy dresses from the suppliers. The number of skus for this retailer is 3 × 20 × 8 × 5. ⋄
When the outcome of an experiment is defined by K characteristics that are independent of each other,
we can use the multiplication principle. We start by enumerating the number of ways characteristic k can
materialize and denote it by nk . Then the number of outcomes is n1 n2 . . . nK . For the example of the online
retailer above, nstyle = 3, ncut = 20, nsize = 8, ncolor = 5 for the set of characteristics {style, cut, size, color }.
Another way to denote these is to set 1 := style, 2 := cut, 3 := size, 4 := color, so K = 4 and n1 = 3, n2 = 20,
n3 = 8, n4 = 5. If the characteristics are not all independent of each other, we can still use the multiplication
principle with some adjustments.
Example: After a market research study, the online dress retailer decides to customize its offerings. It offers
22 cuts of Night dresses, 18 cuts of Corporate dresses and 34 cuts of Sporty dresses. Night dresses need to
fit more closely so they have 10 sizes, while Corporate and Sporty dresses have respectively 8 and 6 sizes.
The number of skus becomes (22 × 10 + 18 × 8 + 34 × 6) × 5 = 2,840. In this case, the color is independent of the other
characteristics. Within each style, the cut and the size characteristics are independent. ⋄
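The adjusted multiplication principle in this example can be sketched in a few lines of Python; the dictionary layout below is our own illustration, not the retailer's data model.

```python
# For each style: (number of cuts, number of sizes). Cut and size are
# independent within a style; the 5 colors apply to every style.
styles = {"Night": (22, 10), "Corporate": (18, 8), "Sporty": (34, 6)}
n_colors = 5

# Multiplication principle within each style, addition across styles.
n_skus = sum(cuts * sizes for cuts, sizes in styles.values()) * n_colors
print(n_skus)  # (22*10 + 18*8 + 34*6) * 5 = 2840
```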
3.1.2 Permutations
There are other ways of counting the outcomes of experiments. Counting permutations is one of them.
Example: The online retailer intends to show each dress in 5 different colors side by side on its web site
so that customers can easily compare the colors and buy the one(s) they like. The primary five colors are
White (W), Black (B), Blue (L), Red (R) and Yellow (Y). The intention is that the customer picks the dress in each
color and brings it into a cell in a 5 × 1 table on the screen and compares the colors. To make this process
efficient, the online retailer asks the web designer to restrict customers so that they can pick each color exactly
once. Some example outcomes are [W, B, L, R, Y ], [ B, L, R, Y, W ], [ L, R, Y, W, B], [ R, Y, W, B, L] , [Y, W, B, L, R].
The number of ways 5 colors can be put in the 5 × 1 table without repeating the colors is the number of
permutations of colors. There are 5 color choices for the first cell, 4 color choices for the second, 3 choices for
the third, 2 choices for the fourth, only 1 choice for the last. Using the principle of multiplication, 5 colors
can have 5 × 4 × 3 × 2 × 1 permutations. ⋄
In general, n distinct objects can have n! := n × (n − 1) × · · · × 2 × 1 permutations of length n. If the
permutation length is k ≤ n, then the number of such permutations is P_k^n := n × (n − 1) × · · · × (n − k + 1),
which is a multiplication of exactly k terms. Said differently, P_k^n is the number of permutations of k objects out of
n distinct objects. P_k^n is referred to as k-permutations-of-n.
As in the online retailer’s color example, sometimes objects are virtual and can be repeated (picked up)
as many times as necessary. This is often referred to as sampling with repetition.
Example: Despite the online retailer’s specification, the web designer cannot restrict the customers to pick
a color only once. For example, W can be picked up for the first and second cells in the 5 × 1 table. Then
the colors are sampled by the customers with repetition and we cannot speak of permutations. Rather we
can ask the number of ways 5 colors can be put in the table with repetition. There are 5 color choices for
the first cell, 5 color choices for the second, 5 choices for the third, 5 choices for the fourth and 5 choices for
the last. Once more using the principle of multiplication, 5 colors can be placed in 5 cells with repetition in
5 × 5 × 5 × 5 × 5 ways. ⋄
In general, n distinct objects can be put in k boxes with repetition in n^k ways. Repetitions increase the number
of ways objects can be organized: n^k > P_k^n (for k ≥ 2).
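The counts for the color examples above can be verified with Python's standard library; math.perm and math.factorial compute exactly the quantities just defined.

```python
import math

# Placements of the 5 colors, with and without repetition.
n_perms_all = math.factorial(5)  # 5! = 120: permutations of all 5 colors
p_5_3 = math.perm(5, 3)          # P_3^5 = 5*4*3 = 60: length-3, no repetition
strings_5_3 = 5 ** 3             # 5^3 = 125: length-3 strings with repetition

print(n_perms_all, p_5_3, strings_5_3)  # 120 60 125
print(strings_5_3 > p_5_3)              # True: repetition increases the count
```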
3.1.3 Combinations
A key question in counting the outcomes is whether the sequence of objects in an outcome distinguishes it
from another outcome that includes exactly the same objects. Consider [W, B, L, R, Y] vs. [B, L, R, Y, W] of the
colors; we treated these two sequences as different above and wrote them as vectors by using square
brackets. If the comparison of colors depends on the sequence of colors, the sequence matters and customers
perceive [W, B, L, R, Y] as different from [B, L, R, Y, W]. If the sequence does not matter, both [W, B, L, R, Y] and
[B, L, R, Y, W] have the same colors and can be mapped to the set {W, B, L, R, Y}. Such a mapping is many-to-one because many sequences boil down to the same set of objects.
Example: Suppose that the web designer has created a 3 × 1 table and can restrict customers to put at most
one color in each cell. If the sequence matters and sampling is without repetition, the number of ways of placing 5
colors in 3 cells is P_3^5 = 5 × 4 × 3 = 60. Now consider the colors B, R, Y and the sequences that can be generated
only from these three colors: [B, R, Y], [B, Y, R], [R, B, Y], [R, Y, B], [Y, B, R], [Y, R, B]. It is easy to see that 3
colors can make 6 = 3! sequences, so the mapping from the set of sequences to the set of items is (3!)-to-1. In
other words, when we start treating different sequences with the same items as the same sets, the number of
outcomes based on sequences should drop by a factor of 6 to obtain the number of outcomes based on sets.
If the sequence does not matter and sampling is without repetition, the number of ways of placing 5 colors in 3 cells
is P_3^5/3! = 5 × 4 × 3/6 = 10. ⋄
From the above, the number of sequences of length k that can be made without repetition from n items is
P_k^n. When we consider different sequences with the same items as the same set, the number of sets becomes
a 1/k! fraction of the number of sequences. Hence, the number of subsets including exactly k items out of n distinct items
is C_k^n := P_k^n/k!. C_k^n is referred to as k-choose-from-n. Picking subsets from sets is called making combinations.
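The identity C_k^n = P_k^n/k! can be checked directly; the names below are ours.

```python
import math

# Each k-element set corresponds to k! sequences, so divide the sequence
# count by k! to get the set count.
n, k = 5, 3
sequences = math.perm(n, k)             # 60 ordered placements without repetition
sets_ = sequences // math.factorial(k)  # collapse each group of k! sequences
print(sets_)                            # 10
print(sets_ == math.comb(n, k))         # True: C_k^n = P_k^n / k!
```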
Example: In what is called a combination lock (often with 4 digits), there are several concentric dials, each
with digits {0, 1, 2, . . . , 9}. The lock unlocks when all dials show previously chosen digits in the correct order.
These previously chosen digits and their order act like a password for the lock. For the lock, the sequence
of digits matters, e.g., 1234 is different from 4321. So this sort of lock should be called a permutation lock as
opposed to a combination lock. ⋄
Example: The OM area has 20 Ph.D. and 200 Master of Science students. The faculty considers inviting 2 Ph.D. and
4 master's students to a curriculum meeting. How many ways are there to choose 2 Ph.D. and 4 master's students? There are C_2^20 ways to choose the Ph.D. students and C_4^200 ways to choose the master's students, so the number
of ways is C_2^20 · C_4^200. ⋄
The number of combinations of k objects taken out of n objects is C_k^n. When we pick k objects,
we make up two subsets: objects picked and objects unpicked. What happens if we are to make up r
subsets out of n objects such that subset i has k_i objects and ∑_{i=1}^r k_i = n? The number of ways the r subsets can be
made up is C_{k_1,k_2,...,k_r}^n = n!/(k_1! k_2! . . . k_r!).
Example: 11 Ph.D. students are to be assigned to 4 professors, Ganesh, Shun-Chen, Anyan and Metin, so
that Ganesh and Anyan have 4 students each, Shun-Chen has 2 students and Metin has 1 student.
How many assignments are possible? We are splitting the students into 4 subsets with n_G = n_A = 4, n_S = 2 and
n_M = 1, so that n_G + n_A + n_S + n_M = 11. The number of ways is 11!/(4!4!2!1!). ⋄
Example: How many distinct permutations can be obtained from the letters of Mississippi? Mississippi has
11 letters, 4 Is, 4 Ss, 2 Ps and 1 M, so the number of permutations is 11!/(4!4!2!1!). ⋄
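The multinomial coefficient in the two examples above can be computed with a small helper; the function name is our own.

```python
import math

def multinomial(n, ks):
    # n! / (k1! k2! ... kr!), assuming the ks sum to n.
    assert sum(ks) == n
    out = math.factorial(n)
    for k in ks:
        out //= math.factorial(k)
    return out

# Student assignments and Mississippi permutations share the same count.
print(multinomial(11, [4, 4, 2, 1]))  # 11!/(4!4!2!1!) = 34650
```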
While discussing combinations above, we have referred to the number of subsets, whose elements cannot
repeat. So the combination discussion above pertains only to sampling without repetition. If repetition is
allowed in a collection, the collection is called a multiset. Each set is a multiset, so the multiset notion is
a generalization of the set notion. In a multiset, the total number of elements, including repetitions, is the
cardinality of the multiset and the number of times an element appears is the multiplicity of that element.
Example: Suppose that we are to pick from the colors B, R, Y to create multisets of cardinality 3. ⟨B, R, Y⟩ is the unique
multiset whose elements do not repeat, so it is also the set {B, R, Y}. As repetition is allowed in a multiset, some
elements can be used twice or thrice while the others are not used at all. If Y is not used, we can still construct
multisets ⟨ B, R, R⟩, ⟨ B, B, R⟩, ⟨ B, B, B⟩ and ⟨ R, R, R⟩. If Y must be used, we can construct some other multisets
⟨ B, Y, Y ⟩, ⟨ R, Y, Y ⟩, ⟨Y, R, R⟩, ⟨Y, B, B⟩ and ⟨Y, Y, Y ⟩. The multiplicity of B in ⟨ B, R, R⟩ is x B = 1, while the
multiplicity of B in ⟨ B, B, R⟩ is x B = 2. ⋄
Using n distinct elements, how many multisets with cardinality k can we construct? Each multiset is
uniquely identified by its multiplicities {x_1, x_2, . . . , x_n}. Since we set the cardinality equal to k, we need to
insist on x_1 + x_2 + · · · + x_n = k. Also each multiplicity must be a natural number (non-negative integer), i.e.,
x_i ∈ N for i = 1 . . . n. The number of multisets with cardinality k is the number of solutions to

X := {x_1 + x_2 + · · · + x_n = k, x_i ∈ N for i = 1 . . . n}.
To find the number of solutions to X, we first consider a seemingly different problem. Suppose that we
have n + k − 1 objects, each denoted by "+":

    +     +     +    ......       +            +
   1st   2nd   3rd   ......   (n+k−2)nd   (n+k−1)st

We encircle n − 1 of these + objects to obtain exactly n segments, each made up of some or no + objects. If the
(j−1)st and jth encircled +s are next to each other, then the jth segment has no + and x_j = 0. In general, x_j is the
number of +s in the jth segment:

    + ... +      ⊕       + ... +      ⊕     ......     ⊕       + ... +
   1st  x_1th  (1st     1st  x_2th  (2nd            ((n−1)st  1st  x_nth
               circle)              circle)          circle)

where the indexing of the + objects restarts from 1 after each encircled +. By using n − 1 circles, we obtain
n segments; segment j has x_j elements, and the sum of the x_j s must be n + k − 1 minus n − 1, as we start with
n + k − 1 objects and encircle n − 1 of them. Hence, x_1 + x_2 + · · · + x_n = k. Each solution to X has a
corresponding way of encircling +s, and vice versa. So the number of solutions to X is the number of ways
we can encircle n − 1 objects out of n + k − 1 objects, which is C_{n−1}^{n+k−1} = C_k^{n+k−1}.
Example: How many multisets with cardinality k = 3 can be assembled from the n = 3 colors B, R, Y? Plugging
in the numbers, the answer turns out to be C_3^5 = 10. For this small problem, we can list all of these multisets:
⟨B, R, Y⟩, ⟨B, R, R⟩, ⟨B, B, R⟩, ⟨B, B, B⟩, ⟨R, R, R⟩, ⟨B, Y, Y⟩, ⟨R, Y, Y⟩, ⟨Y, R, R⟩, ⟨Y, B, B⟩ and ⟨Y, Y, Y⟩. If we
add the colors Blue (L) and White (W), we have n = 5 colors to create more multisets with cardinality k = 3. The
number of such multisets is C_3^7 = 35. ⋄
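The stars-and-bars count can be cross-checked by enumerating the multisets directly; itertools.combinations_with_replacement generates exactly the multisets of a given cardinality.

```python
import math
from itertools import combinations_with_replacement

# Count multisets of cardinality k = 3 directly and via the formula C_k^(n+k-1).
counts = []
for colors in (["B", "R", "Y"], ["B", "R", "Y", "L", "W"]):
    n, k = len(colors), 3
    direct = len(list(combinations_with_replacement(colors, k)))
    formula = math.comb(n + k - 1, k)
    counts.append((direct, formula))
print(counts)  # [(10, 10), (35, 35)]
```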
The next table summarizes our discussion on finite outcomes.

Table 2: Number of ways of constructing cardinality-k permutations, strings, sets or multisets from n distinct
objects.

                          Sequence matters?
                     Yes                      No
Repeat?   No    P_k^n permutations       C_k^n sets
          Yes   n^k strings              C_k^{n+k−1} multisets
3.2 Infinite Outcomes
Infinite outcomes can be generated from an experiment that can potentially be repeated infinitely many
times. Therefore, the experiment itself should be repeatable.
Example: Throwing a coin, rolling a dice and calling a call center are experiments that can be repeated. Each
time they are performed, they generate outcomes: {Head, Tail } for throwing a coin, {1, 2, 3, 4, 5, 6} for rolling
a dice, {Busy, Available} for calling a call center. If we perform these experiments independently m times,
the outcomes become { H, T }m , {1, 2, 3, 4, 5, 6}m , { B, A}m . Here superscript m denotes the Cartesian product
applied m times, e.g., { H, T }2 := { H, T } × { H, T }. ⋄
(Nearly-)Infinite outcomes need to be considered when we repeat an experiment until something (un)desirable happens. If we are waiting for heads in a coin tossing experiment and recording the outcomes, we
can see arbitrarily long sequences TT . . . TTH. We can throw two dices simultaneously and wait until their
sum turns out to be 7; then again arbitrarily long sequences of sums can be observed: 6, 8, 12, 9, 9, 4, 3, . . . ,
8, 10, 8, 11, 2, 7. Or a hacker can attempt to find a password of length 4 made out of the digits {0, 1, . . . , 9} and
keep attempting different 4-digit strings: 2012, 7634, 1803, etc. The sample space for the
k-long strings (passwords) made with n objects has n^k elements. If the hacker is attempting random
strings to find the password, he may have to do this infinitely many times. If he is enumerating all the
strings and testing each one by one, he needs to do this at most 10,000 times.
A set is countable if its elements can be put in a one-to-one correspondence with (a subset of) the natural numbers. The sample space
for the experiment of waiting for heads has infinitely many elements but it is countable. This sample space has H,
TH, TTH, TTTH, and so on. Associating the number of T's in an outcome with exactly that natural
number (including 0), we can see that the sample space is countable. The sample space for rolling two dices
until obtaining the sum of 7 is also countable. The appendix has more on the countability of sets and shows that
rational numbers are countable while real numbers are not.
Example: In the experiment of throwing a coin m times, let x_i = 1 if the ith throw turns out to be a head;
otherwise, x_i = 0. After the mth throw, we can compute the frequency of heads X(m) := (∑_{i=1}^m x_i)/m.
We have 0 ≤ X(m) ≤ 1. It is easy to see that the sample space for the frequency of heads is Ω_X(m) := {0 =
0/m, 1/m, 2/m, . . . , m/m = 1}. Ω_X(m) is a countable set, as any subset of the rational numbers is countable. Can
we then say that the sample space for this experiment is the interval [0, 1]? Asked differently, as we increase
m, does Ω_X(m) contain every element of [0, 1]? Note that [0, 1] is an interval over the real numbers, which include both rational and irrational
numbers, so this interval is not countable. If the answer were yes, we could take
an irrational number, say √2/2 ∈ [0, 1], and this number would have to be in Ω_X(m). But the elements of Ω_X(m) are
only rational numbers. Hence, the answer is no: Ω_X(m) does not become [0, 1] for any m. ⋄
The last example shows that repeating an experiment many times does not make its sample space uncountable. On the other hand, a single experiment without any repetition can yield an uncountable sample
space. Because of this, the case of an uncountable sample space deserves a separate discussion.
4 Uncountable Outcomes
Outcomes that take values over a continuum are uncountable. Formally speaking, such outcomes are in an
interval of the real numbers ℜ. This brings up a philosophical question: what quantity in nature, if any,
takes truly continuous values? That is, is there a quantity which must be measured in continuous amounts?
Many attempts to find such a quantity turn out to be futile once we consider enough details.
Example: The amount of oxygen molecules in a room can be said to be a certain number of liters. This
number can be reported by an environmental engineer as if it is continuous, hence taking values in ℜ. But
a chemist may attempt to count the number of oxygen molecules and report only a natural number from
N . The amount of energy obtained from splitting a radioactive isotope can also be reported to be in ℜ by
a nuclear reactor operator while it can also be argued to be in N by a quantum physicist. You can continue
this exercise and see if you can find an amount that requires continuous measurements. You can consider
the number of shipments made by Amazon, ratio of shipments made to Texas; number of patients arriving
at a hospital, ratio of underage patients arriving at that hospital, etc. ⋄
It appears that nature has quantities that can be measured by rational numbers rather than real numbers, i.e., nature does not require continuous measures. The next question is whether we create continuous measures in the basic sciences or social sciences. One of the social sciences that deals with the measurement and
reporting of activities is accounting. Accounting does not seem to create amounts that require continuous
measures.
Example: The monetary values reported by accounting systems are numbers with at most two decimal digits, so these numbers are rational. Accounting systems also compute Key Performance Indicators (KPIs) by
taking a ratio of two rational numbers. For example, the Return on Investment (ROI) of an investment is the annual
return made by the investment divided by the amount of the investment. Since both the numerator and the
denominator are rational in these KPI computations, the ratio is also rational. ⋄
We can also consider the prices from the standpoint of Finance, demand from the standpoint of Marketing, personnel characteristics from the standpoint of Organizational Behavior, and conclude that we can
use only natural or rational numbers in our analyses. However, you should also realize that many models
in these disciplines are based on variables that take continuous values from an interval of real numbers. It
appears that when we switch from observing what is happening to the analysis of what will happen, we
tend to use continuous values. The reason behind this can be speculated to be the ease of analysis. We can
take the point of view that continuous values are invented by the analysts for the purpose of ease of analysis.
Example: Time is one of the oldest human inventions and is often considered to take continuous
values. This is the reason why time is an uncountable noun in English when it refers to an amount, as in the sentence: "I spend too much time trying to understand the difference between countable and uncountable outcomes".
It can be mathematically more convenient to build models that take time as a continuous value. ⋄
When an outcome (a variable) takes countable values, we call it a discrete variable; otherwise it is a continuous variable. Note that discrete outcomes can be finite or infinite; the distinction between discrete vs.
continuous is based on countability. Discrete variables can approximate continuous variables fairly well. For example, every real number can be approximated by a rational number at any desired accuracy. This is known
as the fact that rational numbers are everywhere dense in real numbers.
Example: Is Ω_X(m) := {0/m, 1/m, 2/m, . . . , m/m} dense in the rational numbers in the interval [0, 1]? For a
desired accuracy level ϵ and an arbitrary rational q ∈ [0, 1], we can fix m such that min_{ω∈Ω_X(m)} |ω − q| ≤ ϵ.
As a matter of fact, any m ≥ 1/ϵ suffices for every rational q, since then the grid spacing 1/m is at most ϵ. Therefore, Ω_X(m) is dense in the rational numbers
in the interval [0, 1]. ⋄
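The density argument can be checked numerically with exact rational arithmetic; the particular ϵ and q below are illustrative choices of ours.

```python
import math
from fractions import Fraction

# For accuracy eps, the grid {0, 1/m, ..., m/m} with m >= 1/eps has spacing
# 1/m <= eps, so every rational q in [0, 1] has a grid point within eps.
eps = Fraction(1, 100)
m = math.ceil(1 / eps)
grid = [Fraction(i, m) for i in range(m + 1)]

q = Fraction(22, 70)  # an arbitrary rational in [0, 1]
dist = min(abs(w - q) for w in grid)
print(dist <= eps)  # True
```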
Since discrete and continuous variables approximate each other well, we can justify using continuous
variables instead of discrete ones when the underlying quantity is in fact discrete. Continuous variables can
also be appealing because they are easier to communicate and fit well with the practice of defining variables
over a range.
Example: Demand forecasters in practice often talk about ranges for the demand values. They say the demand is going to be between a and b, or over the range [a, b] of real numbers, although the demand is actually
a natural number in this range. They also say that the Texan demand is a certain percentage of the national
demand and this percentage takes values in [a, b] for 0 ≤ a ≤ b ≤ 1, although the percentage is actually a
rational number in this range. ⋄
When dealing with uncountable outcomes (continuous variables), we often come across sample spaces
of the form Ω = {ω : a ≤ ω ≤ b} = [ a, b]. When there are m continuous variables, we may have Ω =
[ a1 , b1 ] × · · · × [ am , bm ]. The same variable can be continuous over a range and be discrete afterwards. Such a
mixture can indicate an assumption, a need to focus on some particular observations or methodology used
in data collection.
Example: An employee can quit a job within the first year of starting or afterwards. If the quitting happens in
the first year, it is reported in terms of fractions of a year; otherwise, it is reported as multiples of a year. The
historical tenure data then belongs to [0, 1] ∪ {2, 3, 4 . . . }. Note that the data for the first year is more accurate
than the other years. Such increased accuracy within the first year may be required by the human resources
department to accurately understand what triggers premature quitting. Hence, the employee tenure is both
continuous (uncountable) over [0, 1] and discrete (countable) over {2, 3, . . . }. ⋄
5 Probability Measure
Up to now, we have defined experiments, events and sample spaces. Most of probability theory is
about computing the probability of an event; the probability of event A is denoted by P(A). Viewing P as
a mapping from subsets of the sample space Ω to the nonnegative real numbers ℜ+, we can say that P measures the size
of a set A ⊆ Ω. Although we focus on probability measures in this section, the concept of a measure is more
general. A general measure µ, defined on subsets of Ω and taking values in [0, ∞), satisfies µ(∅) = 0 and the countable additivity condition that
µ(∪_{i=1}^∞ A_i) = ∑_{i=1}^∞ µ(A_i) for each sequence of disjoint sets A_1, A_2, . . . . A probability measure in addition
is required to satisfy P(Ω) = 1. This section addresses the issue of measuring a set first from a countable
sample space and then from an uncountable sample space.
5.1 Countable Sample Spaces
Countable sample spaces can be finite. Then we can list P({ω }) for each ω ∈ Ω.
Example: For the experiment of tossing a fair coin, P({ H }) = 1/2 and P({ T }) = 1/2. For the experiment of
rolling a fair dice, P({i }) = 1/6 for i = 1 . . . 6. ⋄
When no ambiguity happens, as in the above example, we can drop curly brackets and write P(ω ) as
opposed to P({ω}), e.g., P(H) = 1/2. The probability of event A, or the probability measure of set A, is

P(A) = ∑_{ω∈A} P(ω).

When the sample space is finite, so is A, and the sum above is a sum of finitely many terms. Then the probability of
an event can be found by summing up the probabilities of the outcomes making up the event.
Example: What is the probability that sum of the rolls on two dices is 7? The numbers on rolls can be
considered as pairs. To sum up to 7, these pairs must be (1,6), (2,5), (3,4), (4,3), (5,2), (6,1). There are six
elementary outcomes summing up to 7 out of 36 elementary outcomes. Hence, P(Sum of the numbers is 7)
= P((1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)) = 6/36. ⋄
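The six favorable pairs and the probability 6/36 can be recovered by enumeration; the variable names below are ours.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely ordered pairs and keep those summing to 7.
omega = list(product(range(1, 7), repeat=2))
sum7 = [(i, j) for (i, j) in omega if i + j == 7]
p_sum7 = Fraction(len(sum7), len(omega))
print(sum7)    # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(p_sum7)  # 1/6
```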
Example: A box contains 4 balls: 2 Black and 2 White. Two balls are removed from the box without replacement. Let the sample space be ordered pairs indicating the ball colors. Then Ω = {bb, wb, bw, ww}.
We can check that P(bb) = P(ww) = (2/4)(1/3) = 1/6 and P(wb) = P(bw) = (2/4)(2/3) = 2/6. Let us
define events A, B, C as A = {a white ball is chosen}, B = {a black ball is chosen} and C = {two chosen
balls are of different color}. P( A) = 1 − P(bb) = 1 − 1/6 = 5/6, P( B) = 1 − P(ww) = 1 − 1/6 = 5/6 and
P(C ) = P(wb) + P(bw) = 4/6. ⋄
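The same kind of enumeration confirms these probabilities. Labeling the otherwise identical balls makes all 12 ordered draws equally likely (an illustrative Python sketch):

```python
from itertools import permutations

# Label the balls so that draws are distinguishable; each ordered draw
# of two balls without replacement is then equally likely (12 in total).
balls = ['b1', 'b2', 'w1', 'w2']
draws = list(permutations(balls, 2))

def prob(event):
    """Probability of an event given as a predicate on ordered draws."""
    return sum(1 for d in draws if event(d)) / len(draws)

p_A = prob(lambda d: 'w' in d[0] + d[1])  # a white ball is chosen
p_B = prob(lambda d: 'b' in d[0] + d[1])  # a black ball is chosen
p_C = prob(lambda d: d[0][0] != d[1][0])  # the two colors differ
print(p_A, p_B, p_C)  # 5/6, 5/6, 4/6
```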
For an event A ⊆ Ω, P(A) can have frequentist and behavioral interpretations. P(A) can be thought of as
the relative frequency of event A known from the history of observing the experiment in the past. P(A) can
also be interpreted as the fair price of a bet that pays $1 if event A happens and $0 otherwise. For example,
if A = {Head first, Tail afterwards} in tossing a coin twice, the fair price is $0.25.
Countable sample spaces can be infinite, but even then we always have

Ω = ∪_{i=1}^∞ {ω_i} and so A = ∪_{i=1}^∞ ({ω_i} ∩ A).

The countable additivity of the probability measure immediately implies

P(A) = ∑_{i=1}^∞ P({ω_i} ∩ A) = ∑_{ω_i ∈ A} P(ω_i).
When there are finitely many outcomes, we can express the probability of each outcome explicitly. That is, we can
write P(ω) for every ω ∈ Ω. When there are infinitely but countably many outcomes, we can still write an expression
for each P(ω).
Example: A fair coin is tossed until H appears. The sample space has outcomes such as H, TH, TTH, TTTH,
. . . . In general, an outcome has k = 1, 2, 3, . . . tosses, the first k − 1 of them T and the last one H. Let An be
the event of stopping in at most n tosses.
A1 = {H} and P(A1) = P(H) = 1/2.
A2 = {H, TH} and P(A2) = P(H) + P(TH) = 1/2 + 1/4 = 3/4.
A3 = {H, TH, TTH} and P(A3) = P(H) + P(TH) + P(TTH) = 1/2 + 1/4 + 1/8 = 7/8.
Let Bn be the event of requiring n + 1 or more tosses until H appears. Clearly, B0 = Ω. And B1 =
{TH, TTH, . . . , T . . . TH, . . . },

P(B1) = P(TH, TTH, . . . , T . . . TH, . . . ) = ∑_{k=1}^∞ P(First k tosses are T and the (k + 1)st is H)
      = ∑_{k=1}^∞ (1/2)^k (1/2) = (1/4) ∑_{k=0}^∞ (1/2)^k = (1/4)(1/(1 − 1/2)) = 1/2.

Also

P(B2) = P(TTH, TTTH, . . . , TT . . . TH, . . . ) = ∑_{k=2}^∞ P(First k tosses are T and the (k + 1)st is H)
      = ∑_{k=2}^∞ (1/2)^k (1/2) = (1/8) ∑_{k=0}^∞ (1/2)^k = (1/8)(1/(1 − 1/2)) = 1/4.

And in general

P(Bn) = ∑_{k=n}^∞ P(First k tosses are T and the (k + 1)st is H)
      = ∑_{k=n}^∞ (1/2)^k (1/2) = (1/2)^{n+1} ∑_{k=0}^∞ (1/2)^k = (1/2)^{n+1} (1/(1 − 1/2)) = (1/2)^n.
The probability of requiring at least n + 1 tosses until H appears is P(Bn) = (1/2)^n. Once more, we have
written each P(ω) in the sums above.
It is also worth noting that A1 ⊂ A2 ⊂ · · · ⊂ An ⊂ · · · and B1 ⊃ B2 ⊃ · · · ⊃ Bn ⊃ · · · . {An} is an increasing sequence
of sets and its limit is lim_{n→∞} An = Ω, so we can set A∞ = Ω. {Bn} is a decreasing sequence of sets and
its limit is lim_{n→∞} Bn = ∅, so we can set B∞ = ∅. You can check that

A∞ = ∪_{n=1}^∞ An and B∞ = ∩_{n=1}^∞ Bn.
Furthermore, An ∪ Bn = Ω and An , Bn are disjoint for each n = 1, 2, . . . . ⋄
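The formulas P(An) = 1 − (1/2)^n and P(Bn) = (1/2)^n can be checked numerically by summing outcome probabilities (an illustrative Python sketch; the infinite sum for Bn is truncated):

```python
def p_outcome(k):
    """Probability of the outcome with k - 1 tails followed by a head."""
    return (1 / 2) ** k

def p_A(n):
    """P(A_n): the first head appears within the first n tosses."""
    return sum(p_outcome(k) for k in range(1, n + 1))

def p_B(n, terms=200):
    """P(B_n): n + 1 or more tosses are needed (truncated infinite sum)."""
    return sum(p_outcome(k) for k in range(n + 1, n + 1 + terms))

print(p_A(3))  # 7/8
print(p_B(2))  # close to 1/4
```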
5.2 Uncountable Sample Spaces
For an uncountable Ω, the tactic of writing each P(ω) runs into a difficulty. For example, we cannot list the
outcomes that make up the uncountable sample space [0, 1]; we do not know where to start and where to go next in
such a list. If the outcomes in [0, 1] are equally likely and we attach a positive probability to each outcome, the
sum of the probabilities will exceed 1. If we attach positive probability to only countably many outcomes, then
the sample space essentially becomes countable. Unless otherwise stated, [a, b] always denotes an interval of
real numbers.
If we cannot attach a probability to each outcome of an uncountable sample space, what can we do? We
can attach probabilities to some subsets of Ω such that we can compute probabilities for all the
other sets of interest. These subsets do not have to be elementary outcomes; they can be sets including more
than one outcome. In other words, we would like to measure every subset of Ω. Disappointingly, not every set is
measurable with every measure.
Example: The Lebesgue measure cannot measure a specially constructed set. When its domain is restricted to
[0, 1], the Lebesgue measure defined as µl(A) := b − a for an interval A = [a, b] is a probability measure. To
construct the special set E, we group two real numbers x, y in the same class if their difference
x − y is rational. For example, the real number √2 is grouped in the same class with 1 + √2, 2.1 + √2, −0.9 + √2,
etc. The real number √3 is grouped in the same class with 1 + √3, 2.1 + √3, −0.9 + √3, etc. Membership in
a class can be considered as a relation, which turns out to be reflexive, symmetric and transitive. Hence,
this relation yields equivalence classes defined over the real numbers. Each equivalence class can be called Er
where r is a real number and Er = r + {Rational Numbers} for r ∈ ℜ. Two equivalence classes are either
identical or disjoint: Er1 ∩ Er2 = ∅ when r1 − r2 is irrational. The new set E is assembled by picking a single
element from Er ∩ [0, 1] for each distinct class Er.
The resulting set E is an uncountable subset of [0, 1]. Attempts to obtain µl(E) result in contradictions with
either the countable additivity or the nonnegativity of µl. The details of this contradiction are outside our scope but
can be found on pp. 27–28 of Cohn (2013). Attaching significant importance to this construction, Cohn provides it as Theorem 1.4.9. Gelbaum and Olmsted (2003) discuss this issue in §8.11, titled A nonmeasurable
set. ⋄
5.2.1 Sigma-field
When not every set is measurable, the alternative is to measure a collection of sets. We let F denote a
collection of events (subsets of Ω) and ask what properties F needs to satisfy to be useful in the
probability context. Intuitively, we want to be able to consider unions and intersections of events, and assess
their probabilities or assign probabilities to them. To formalize this, we want to attach a probability to A ∪ B
if we have done so for A and B. So we would like to include A ∪ B in F if A, B ∈ F. Then we can assess the
probability of the event that either A or B happens. Is including the unions in the collection F sufficient?
It is not for the purpose of assessing the probability of A ∩ B, the event that both A and B happen.
An indirect way to include A ∩ B in F is to require that both the union A ∪ B and the complement Ac are
in F if A, B ∈ F. This is because A ∩ B = (Ac ∪ Bc)c ∈ F if A, B ∈ F. Requiring A ∪ B ∈ F and Ac ∈ F
makes F closed under set difference operations: the difference A \ B = A ∩ Bc ∈ F and the symmetric difference
A△B = (A \ B) ∪ (B \ A) ∈ F.
If we stop here and require A ∪ B ∈ F, Ac ∈ F and Ω ∈ F, then the collection F is called a field. By
using A ∪ B ∈ F several times in an induction argument, we can also obtain that finite unions
are in F: ∪_{i=1}^n Ai ∈ F if Ai ∈ F for i = 1, . . . , n. Unfortunately, this does not suffice for our purpose of
being able to consider the probability associated with infinitely many events (say, the probability of getting H on
an odd toss). Hence, we require the stronger condition that countable unions are in F when
constructing probability models: ∪_{i=1}^∞ Ai ∈ F if Ai ∈ F for i = 1, 2, . . . . If a field is closed under countable
unions, it is called a sigma-field, denoted by σ-field. In summary, we obtain the following three conditions
that define a σ-field. F is a σ-field if it
i) includes the sample space: Ω ∈ F;
ii) is closed under complements: Ac ∈ F if A ∈ F;
iii) is closed under countable unions: ∪_{i=1}^∞ Ai ∈ F if Ai ∈ F for i = 1, 2, . . . .
Note that ii) and iii) imply that a σ-field is closed under countable intersections. If Ω is finite, then any
field over Ω is also a σ-field.
Example: For Ω = {1, 2, 3, 4}, one of the σ-fields is F = {∅, Ω, {1, 2}, {3, 4}}. Another σ-field is F =
{∅, Ω, {1, 3}, {2, 4}}. For Ω = N , F = {∅, Ω, odd natural numbers, even natural numbers} is a σ-field. ⋄
Sometimes a given collection of subsets of Ω is not a σ-field, but it can be turned into a σ-field by adding
more subsets to it. Addition of subsets to the collection may continue until all of the subsets of Ω are included.
The collection of all subsets of Ω is the largest σ-field over Ω.
Example: For Ω = {1, 2, 3, 4}, {∅, Ω, {1, 2}, {2}, {3, 4}} is not a σ-field because it does not include {2}c or the
union {2} ∪ {3, 4}. We can add these to the collection to obtain {∅, Ω, {1, 2}, {2}, {3, 4}, {1, 3, 4}, {2, 3, 4}},
which is not a σ-field because it does not include the complement {2, 3, 4}c . We can add this to the collection to obtain {∅, Ω, {1, 2}, {2}, {3, 4}, {1, 3, 4}, {2, 3, 4}, {1}}, which is a σ-field. We can say that the σ-field
{∅, Ω, {1, 2}, {2}, {3, 4}, {1, 3, 4}, {2, 3, 4}, {1}} is generated by {∅, Ω, {1, 2}, {2}, {3, 4}}. ⋄
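For a finite Ω this closure procedure can be automated: repeatedly add complements and pairwise unions until nothing new appears. A Python sketch (illustrative; the function name is ours, and finite unions suffice here because Ω is finite):

```python
from itertools import combinations

def generate_sigma_field(omega, collection):
    """Close a collection of subsets of a finite omega under complements
    and unions, yielding the sigma-field generated by the collection."""
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(s) for s in collection}
    while True:
        new = {omega - a for a in sets}                    # complements
        new |= {a | b for a, b in combinations(sets, 2)}   # pairwise unions
        if new <= sets:        # nothing new: closure reached
            return sets
        sets |= new

field = generate_sigma_field({1, 2, 3, 4}, [{1, 2}, {2}, {3, 4}])
print(len(field))  # 8 sets, matching the generated sigma-field above
```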
The σ-field generated by a collection C is the smallest σ-field that includes C. We write σ(C) to refer to the
σ-field generated by C. By definition, σ(C) = ∩{F : F is a σ-field including C}.
The examples above are based on finite sample spaces, but we can also define σ-fields over uncountable sample
spaces. One of the most used σ-fields is the Borel field R over ℜ. The Borel field R is generated by the open intervals
in ℜ:
R = σ({( a, b) : −∞ < a ≤ b < ∞}).
So the Borel field contains all the open intervals, their countable unions as well as their complements. By using
[a, b] = ∩_{n=1}^∞ (a − 1/n, b + 1/n), we can see that the closed intervals of ℜ also generate the Borel field.
Example: Each rational number q can be written as a countable intersection of open intervals: {q} =
∩_{n=1}^∞ (q − 1/n, q + 1/n). So each singleton containing a rational is in the Borel field, as is their countable
union, which is the set of rational numbers. Since the set of rational numbers is in the Borel field, so is its
complement – the set of irrational numbers. ⋄
Pairing the sample space Ω with a σ-field F defined over it, we obtain (Ω, F), which is called a measurable
space. Measurable spaces are used to define measurable functions. A function ξ : Ω → ℜ is called an F-measurable function if {ω ∈ Ω : a ≤ ξ(ω) ≤ b} ∈ F for each a, b ∈ ℜ.
Example: Over Ω = {1, 2, 3, 4}, consider the σ-field F = {∅, Ω, {1}, {2}, {1, 2}, {3, 4}, {1, 3, 4}, {2, 3, 4}}.
Let us check if ξ1(ω) = ω for ω ∈ Ω is F-measurable. Since {ω ∈ Ω : 3 ≤ ξ1(ω) ≤ 3} = {3} ∉ F,
ξ1 is not measurable. Let us check if ξ2(ω) = ω for ω ∈ {1, 2, 3} and ξ2(4) = 3 is F-measurable. This
time {ω ∈ Ω : 3 ≤ ξ2(ω) ≤ 3} = {3, 4} ∈ F. Moreover, {ω ∈ Ω : 1 ≤ ξ2(ω) ≤ 1} = {1} ∈ F,
{ω ∈ Ω : 2 ≤ ξ2(ω) ≤ 2} = {2} ∈ F and {ω ∈ Ω : 4 ≤ ξ2(ω) ≤ 4} = ∅ ∈ F. Furthermore,
{ω ∈ Ω : 1 ≤ ξ2(ω) ≤ 2} = {1, 2} ∈ F, {ω ∈ Ω : 2 ≤ ξ2(ω) ≤ 3} = {2, 3, 4} ∈ F and
{ω ∈ Ω : 1 ≤ ξ2(ω) ≤ 3} = {1, 2, 3, 4} ∈ F. So ξ2 is F-measurable. ⋄
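On a finite Ω, F-measurability can be checked mechanically: only the finitely many values of ξ matter, so it suffices to test intervals [a, b] with endpoints among those values. An illustrative Python sketch (the function is ours, written for this finite example):

```python
def is_measurable(xi, omega, field):
    """Check that {w in omega : a <= xi(w) <= b} is in the field for all
    real a <= b; on a finite omega only intervals with endpoints among
    the values of xi need to be tested."""
    field = {frozenset(s) for s in field}
    values = sorted({xi(w) for w in omega})
    for a in values:
        for b in values:
            if a <= b:
                preimage = frozenset(w for w in omega if a <= xi(w) <= b)
                if preimage not in field:
                    return False
    return True

omega = {1, 2, 3, 4}
F = [set(), {1, 2, 3, 4}, {1}, {2}, {1, 2}, {3, 4}, {1, 3, 4}, {2, 3, 4}]
def xi1(w): return w
def xi2(w): return w if w != 4 else 3
print(is_measurable(xi1, omega, F))  # False: {3} is not in F
print(is_measurable(xi2, omega, F))  # True
```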
5.2.2 Probability Space: Sample Space, Sigma-field, Probability Measure
To obtain a probability space from the measurable space (Ω, F), we need to define a probability measure P
such that it
i) measures the sets in F: P : F → [0, 1], i.e., for every A ∈ F there exists a real number P(A) ∈ [0, 1];
ii) is countably additive: P(∪_{i=1}^∞ Ai) = ∑_{i=1}^∞ P(Ai) for disjoint A1, A2, . . . ;
iii) assigns probability one to the sample space, which always happens: P(Ω) = 1.
These three properties can also be called the axioms of probability. It is easy to justify them when P(·) is interpreted as a frequency. Note that ii) applies only to countable collections of sets Ai; it does not necessarily apply
when the collection is uncountable. A probability space is a triplet (Ω, F, P) made from the sample space Ω,
the σ-field F and the probability measure P.
Example: Consider the Borel field R[1, 11] defined over the interval Ω = [1, 11] and the function f :
R[1, 11] → [0, 1] defined as f([a, b]) = (b − a)/10. For each A ∈ R[1, 11], we partition it into open and closed
intervals as follows:

A = (∪_{i=1}^∞ [a_i^1, b_i^1]) ∪ (∪_{i=1}^∞ (a_i^2, b_i^2]) ∪ (∪_{i=1}^∞ [a_i^3, b_i^3)) ∪ (∪_{i=1}^∞ (a_i^4, b_i^4)).

Such a countable partition is possible because the Borel field includes only countable unions. Then f(A) =
f((∪_{i=1}^∞ [a_i^1, b_i^1]) ∪ (∪_{i=1}^∞ (a_i^2, b_i^2]) ∪ (∪_{i=1}^∞ [a_i^3, b_i^3)) ∪ (∪_{i=1}^∞ (a_i^4, b_i^4))). Now we can check that f satisfies conditions i), ii) and iii) to be a
probability measure and makes ([1, 11], R[1, 11], f) a probability space. ⋄
In ℜ with the Borel field R, the length of an interval is called the Lebesgue measure. This can be extended
to higher dimensions: the Lebesgue measure becomes the area of a set in ℜ² and the volume of a set in
ℜ³. If we take a rock with a volume of 1 liter and break it into smaller pieces, then the total volume of the
pieces in any collection is the sum of the volumes of the pieces in that collection. This sounds
quite trivial, and so should its analog ii) in probability theory.
Why countable additivity as opposed to finite additivity? Let us consider the experiment of tossing a fair
coin until the head shows up. We want to compute the probability that the number of tosses, say k, is an odd
number. Since k is an odd number, we can write it as k = 2n − 1 for n = 1, 2, . . . . If k = 1, then the outcome
is H, n = 1, with probability 1/2. If k = 3, the outcome is TTH, n = 2, with probability (1/2)^3. For
a generic n, the outcome, say ωn, has 2(n − 1) Ts and 1 H with probability (1/2)^{2n−1}. Now we need to
compute P(∪_{n=1}^∞ {ωn}), which becomes ∑_{n=1}^∞ P(ωn) by countable additivity. If we do not have countable
additivity but just finite additivity, we cannot write P(∪_{n=1}^∞ {ωn}) = ∑_{n=1}^∞ P(ωn). Justifying this equality via
countable additivity, what remains is to evaluate

∑_{n=1}^∞ P(ωn) = ∑_{n=1}^∞ (1/2)^{2n−1} = (1/2) ∑_{n=0}^∞ (1/4)^n = (1/2)(4/3) = 2/3.

As an exercise you can also compute the probability of an even number of tosses. As this
example illustrates, finite additivity can be insufficient.
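Truncating the infinite sums gives a quick numerical confirmation (an illustrative Python sketch):

```python
# Truncated sums for the probability that the first head appears on an
# odd-numbered toss (k = 2n - 1) and on an even-numbered toss (k = 2n).
p_odd = sum((1 / 2) ** (2 * n - 1) for n in range(1, 60))
p_even = sum((1 / 2) ** (2 * n) for n in range(1, 60))
print(p_odd)   # close to 2/3
print(p_even)  # close to 1/3
```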
Why countable additivity as opposed to uncountable additivity? Uncountable additivity would read
P(∪_{ω∈A} {ω}) = ∑_{ω∈A} P(ω) for an uncountable set A ⊆ Ω. To test this equality, consider the probability
space (Ω = [0, 1], R[0, 1], µl) where R[0, 1] is the Borel field on [0, 1] and µl is the Lebesgue probability
measure P([a, b]) = b − a for 0 ≤ a ≤ b ≤ 1. So P(ω) = P([ω, ω]) = 0 for 0 ≤ ω ≤ 1. The left-hand
side of the assumed uncountable additivity equation yields P(∪_{ω∈A}{ω}) = 1 for A = Ω. The right-hand
side of the same equation yields ∑_{ω∈A} P(ω) = ∑_{ω∈A} 0 = 0. Assuming uncountable additivity thus leads to the
contradiction 1 = 0. Hence, uncountable additivity cannot be required.
6 Exercises
1. 2 cards are selected from a deck of 52 cards.
a) How many ways are there to select 2 cards?
b) How many ways are there to choose the cards so that one of them is an ace and the other is a king,
queen or jack?
2. How many signals – each consisting of 9 flags hung in a line – can be made from a set of 4 white flags,
3 red flags and 2 blue flags if all flags of the same color are identical? We can think of two ways to
answer this question.
a) There are 9! different orderings of 9 distinct flags. Since the white flags are identical, we must divide
9! by 4!. Apply the same logic for the red and blue flags.
b) There are 9 positions on a signal, 4 of these must be assigned to white, 3 to red and 2 to blue. In other
words, we are making up 3 subsets, one for white, one for blue and one for red where kW = 4, k R = 3
and k B = 2 while kW + k R + k B = 9.
3. Consider a set of balls, 5 of which are red and 3 of which are yellow. Assume that all of the red balls
and all of the yellow balls are indistinguishable. How many ways are there to line up the balls so that
no two yellow balls are next to each other?
4. 4 musicians make up a chamber orchestra to play cello, violin, flute and piano.
a) If each musician can play all of the four instruments, how many orchestral arrangements are possible?
ANSWER 4!
b) If each musician can play all of the four instruments except for one who can play only 2 instruments,
how many orchestral arrangements are possible?
ANSWER 2 · 3!
5. The UT Dallas WalMart Supply Chain case competition team of 8 people is to return from Arkansas to
Dallas with two cars. If each of the two cars can take at most 5 people, in how many ways can the team
members be distributed between these two cars?
6. How many ways are there to distribute a deck of 52 cards to 13 players so that each player has exactly
4 cards and each of these 4 cards comes from a different suit (spades, hearts, diamonds, clubs)?
7. Given natural numbers {1, 2, . . . , n}, let π be a permutation of them: π (i ) = j means that number i is
in position j. Let Π be the set of all permutations.
a) Suppose that n = 4 and consider the permutation 2 1 3 4, what is the associated π (1), π (2), π (3), π (4)?
b) Consider n = 4. How many permutations are there with the property π (1) ̸= 1? How many permutations are there with the property π (1) = 1 and π (2) ̸= 2?
c) Define the set Πk of permutations as follows
Πk = {π : π (i ) = i for 1 ≤ i ≤ k and π (k + 1) ̸= k + 1} for 0 ≤ k ≤ n − 1
and Πn = {π : π(i) = i for 1 ≤ i ≤ n}. Check to see if {Πk}_{k=0}^n partitions the set Π: i) Πk ∩ Πm = ∅
for 0 ≤ k < m ≤ n and ii) ∪_{k=0}^n Πk = Π.
d) Use the parts above to prove

n! = ∑_{i=0}^{n−1} i · (i!) + 1.
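Before proving the identity, it can be checked numerically for small n (an illustrative Python sketch):

```python
from math import factorial

# Check n! = sum_{i=0}^{n-1} i * i! + 1 for small n.
for n in range(1, 10):
    lhs = factorial(n)
    rhs = sum(i * factorial(i) for i in range(n)) + 1
    assert lhs == rhs, (n, lhs, rhs)
print("identity holds for n = 1, ..., 9")
```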
8. In how many ways can one place seven indistinguishable balls in four distinct boxes with no box left
empty?
9. How many non-negative integer solutions are there to x1 + x2 + · · · + xn = b for integer b ≥ 0?
Express the number of non-negative integer solutions to x1 + x2 + · · · + xn ≤ b in terms of n and b. This
will give you an idea about the cardinality of feasible sets in integer programs.
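The count for the equality case is given by the well-known stars-and-bars formula C(b + n − 1, n − 1), which can be verified by brute force for small n and b (an illustrative Python sketch):

```python
from itertools import product
from math import comb

def count_solutions(n, b):
    """Brute-force count of nonnegative integer solutions to
    x1 + ... + xn = b."""
    return sum(1 for x in product(range(b + 1), repeat=n) if sum(x) == b)

# Compare with the stars-and-bars formula C(b + n - 1, n - 1).
for n in range(1, 5):
    for b in range(6):
        assert count_solutions(n, b) == comb(b + n - 1, n - 1)
print("brute force matches C(b + n - 1, n - 1)")
```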
10. How many positive integer solutions are there for x1 + x2 + x3 = 4?
11. a) How many different paths a rook (which moves only horizontally and vertically) can move from the
southwest corner of a chessboard to the northeast corner without ever moving to the west or south?
We are interested in paths not in the specific moves of the rook. Thus, we can assume that the rook
makes 14 moves: 7 to the East and 7 to the North.
b) How many of the paths contain four or more consecutive eastward moves?
ANSWER a) Each path is a permutation of 7 Es and 7 Ns. The number of paths is 14!/(7! 7!).
b) Let i be the starting position of the string of 4 or more consecutive Es in any permutation of 7
Es and 7 Ns. If i = 1, the first 4 moves are Es and the remaining 10 moves can be any arrangement of
3 Es and 7 Ns, so there are 10!/(7! 3!) = 120 paths with i = 1. If i = 2, . . . , 11, the (i − 1)st position must be
N; otherwise the starting position of the string is not i but i − 1 or earlier. Since the (i − 1)st position is N and
positions i, . . . , i + 3 are Es, there remain 6 Ns and 3 Es, which can be permuted in 9!/(6! 3!) = 84 ways.
The total number of paths is 10!/(7! 3!) + 10 · 9!/(6! 3!) = 120 + 840 = 960.
An incorrect approach is to treat NEEEEENNNNNNEE as two different sequences, obtained as
N-(4E)-E-NNNNNN-EE and as N-E-(4E)-NNNNNN-EE. This double counting happens when you consider
4 Es as a unit and the remaining 3 Es and 7 Ns separately to conclude that the answer is 11!/(1! 3! 7!) = 1320.
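The answer 960 can also be confirmed by brute force over all 14!/(7! 7!) = 3432 paths (an illustrative Python sketch):

```python
from itertools import combinations
from math import comb

# Build every path of 7 E moves and 7 N moves by choosing the E positions.
paths = []
for east in combinations(range(14), 7):
    paths.append(''.join('E' if i in east else 'N' for i in range(14)))

assert len(paths) == comb(14, 7)  # 3432 paths in total
hits = sum(1 for p in paths if 'EEEE' in p)
print(hits)  # 960 paths contain a run of four or more consecutive Es
```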
12. A board (table) has M + 1 columns and N + 1 rows. A piece is located at cell (1, 1) and will move to
cell ( M + 1, N + 1) either by moving up 1 cell or by moving right 1 cell.
a) How many moves are necessary to go from (1, 1) to ( M + 1, N + 1)?
b) How many distinct paths exist from (1, 1) to ( M + 1, N + 1)?
c) How many distinct non-decreasing integer-valued functions can be defined over the domain of integers { a, a + 1, . . . , a + M } and range of integers {b, b + 1, . . . , b + N } such that the functions go through
( a, b) and ( a + M, b + N )?
d) How many distinct non-decreasing integer-valued functions can be defined over the domain of integers { a, a + 1, . . . , a + M } and range of integers {b, b + 1, . . . , b + N } such that the functions go through
the point ( a, b) and between the points ( a + M, b) and ( a + M, b + N )?
Appendix: Countability of Rationals and Uncountability of Reals 1
Here are two questions to consider. Are there the same number of integers as natural numbers? Are there
the same number of rational numbers as natural numbers? The idea is that there can be infinite sets that do
not have the same size. To make sense of that statement, we have to know what it means to say that two sets
have the same size.
This really goes back to our ideas of what it means to count the elements of a set. When I look out of my
window to a field of sheep, I count the sheep by matching each sheep to a number: 1, 2, 3, 4, . . . . And if I
count the books in the bookcase, I do the same thing: I match each book to a number: 1, 2, 3, 4, . . . . I would
say there are 10 sheep (or books) if I can match each sheep (book) to a number from 1 to 10 in such a way that
each number gets used, and no two sheep get the same number. We say that the set of sheep in this field has
the same size as the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, because we can match sheep to numbers. When would we
say there are more than 10 sheep? If we match the sheep to the numbers 1 to 10 and still have some sheep left
over, there must be more than 10 sheep. If we match the sheep to the numbers 1 to 10 and still have numbers
left over, there must be fewer than 10 sheep.
We use much the same idea with infinite sets. An infinite set is one that cannot be matched to any finite
set. We use the natural numbers as our counting set, and try to match numbers to the naturals. If we can
do that, we say that the set is countable; if we cannot, we say that the set is uncountable. There is a slight
ambiguity about whether a finite set is countable. For our purposes, finite sets are countable.
Here is another way to think about countability. A countable set can be listed: we write the element
matched to 1 first, the element matched to 2 second, and so on. And if we are given a list, we can get a
matching (the first element gets matched to 1, the second to 2, and so on). Sometimes that can be a useful
way of thinking about things.
Let us try to think of some examples of countable sets. The natural numbers are countable, because I can
match each natural n to itself. Let us try to think of some more interesting examples!
Are the integers countable? Can we write them in a list? We might write them as . . . , -3, -2, -1, 0, 1, 2, 3,
. . . , but that does not count because the list does not have a first element. So we need another strategy. We
need to be sure that we list everything and that we do not list anything twice. Here is one possibility: 0, 1,
-1, 2, -2, 3, -3, . . . . We “start in the middle and work outwards”. So yes, the integers are countable.
Are the rationals countable? Can we list the rationals? Yes. The rational numbers are countable.
Let us see how to prove this. One way of thinking about our proof that the integers are countable is that
we wrote them in a line (. . . , -3, -2, -1, 0, 1, 2, 3, . . . ) and then drew a path that took us through them all. You
can see this by drawing it for yourself.
If we could do something similar for the rationals, that would be great. But it is not quite as obvious how
to write them in a line. In fact, thinking about it for a bit it seems more natural to write them in a grid:
        p = 1    p = 2    p = 3    p = 4    p = 5    ...
q = 1    1/1      2/1      3/1      4/1      5/1     ...
q = 2    1/2      2/2      3/2      4/2      5/2     ...
q = 3    1/3      2/3      3/3      4/3      5/3     ...
q = 4    1/4      2/4      3/4      4/4      5/4     ...
q = 5    1/5      2/5      3/5      4/5      5/5     ...
...      ...      ...      ...      ...      ...     ...,
where the rational p/q is written in the pth column and qth row.
Now, how can we plot a path through these? We can imagine working through each diagonal in turn.
So we might get something like 1/1, 2/1, 1/2, 1/3, 2/2, 3/1, 4/1, 3/2, 2/3, 1/4, 1/5, . . . . This is not quite
allowed, because we have counted some rationals twice (e.g., 2/2 = 1/1). But that is easily fixed: we say
that we will follow this path, simply missing out anything we have seen before. This gives us a listing of the
positive rationals.

¹ Based on posts on http://theoremoftheweek.wordpress.com
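The diagonal walk, with repeats skipped, can be sketched as a generator (illustrative Python; it walks each diagonal in one fixed direction rather than snaking, which still lists everything, and Fraction handles the reduction that makes 2/2 equal 1/1):

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """List the positive rationals by walking the p/q grid diagonal by
    diagonal (p + q constant), skipping values already produced."""
    seen = set()
    for s in count(2):               # each diagonal has p + q = s
        for p in range(1, s):
            r = Fraction(p, s - p)   # reduced automatically, so 2/2 == 1/1
            if r not in seen:
                seen.add(r)
                yield r

print(list(islice(positive_rationals(), 8)))
# starts Fraction(1, 1), Fraction(1, 2), Fraction(2, 1), Fraction(1, 3), ...
```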
The real numbers are uncountable. This means there is no way of listing the real numbers. So our aim is to
prove that it is impossible to write the real numbers in a list. How could we possibly do that? We are going
to suppose that it is possible to list the real numbers. Then we will somehow derive a contradiction from
that, which will mean our original supposition must have been wrong.
So, we are supposing for the moment that we can list the real numbers. In that case, we can certainly
list the real numbers in [0, 1]. Let us imagine that we have done this, and we have written them all out
in order, using their decimal expansions. Slightly annoyingly, some numbers have two expansions, since
0.999 . . . = 1, but let us say we always write the finite version rather than the one ending in infinitely
many 9s. So they look something like
0.a1,1 a1,2 a1,3 a1,4 a1,5 . . .
0.a2,1 a2,2 a2,3 a2,4 a2,5 . . .
0.a3,1 a3,2 a3,3 a3,4 a3,5 . . .
0.a4,1 a4,2 a4,3 a4,4 a4,5 . . .
0.a5,1 a5,2 a5,3 a5,4 a5,5 . . .
Here ai,j is the jth decimal of the ith real number.
To derive a contradiction, we are going to build another real number between 0 and 1, one that is not on
our list. Since our list was supposed to contain all such real numbers, that will be a contradiction, and we
will be done. So let us think about how to build another real number between 0 and 1 in such a way that we
can be sure it is not on our list. Let us say this new number will be 0.b1 b2 b3 b4 b5 . . . , where we are about to
define the digits bi .
We want to make sure that our new number is not the same as the first number on our list. So let us do
that by making sure they have different numbers in the first decimal place. Say if a1,1 = 3 then b1 = 7 and
otherwise b1 = 3. I really mean: define b1 to be any digit apart from a1,1, but I want to make sure that we do
not get a number that ends in infinitely many 9s, because of the irritating fact that 0.999 . . . = 1, so I
want to make sure we never choose b1 to be 9.
Now we want to make sure that our new number is not the same as the second number in our list. We
can do this by making sure that the second digit of our new number is not the same as the second digit of the
second number. So let us put b2 = 7 if a2,2 = 3 and b2 = 3 otherwise. And so on. At each stage, we make sure
that our new number is not the same as the nth number on the list, by making sure that bn is not the same as
an,n . And that defines our new real number, one that is definitely not on our list because we built it that way.
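The construction can be imitated on any finite initial segment of the list (an illustrative Python sketch, with the listed real numbers given as decimal strings):

```python
def diagonal_number(listed):
    """Build 0.b1b2... so that digit i differs from the i-th decimal of
    the i-th listed number; 9 is never used, avoiding 0.999... = 1."""
    digits = []
    for i, x in enumerate(listed):
        a_ii = x[2 + i]                       # i-th decimal digit of number i
        digits.append('7' if a_ii == '3' else '3')
    return '0.' + ''.join(digits)

listed = ['0.33333', '0.12345', '0.71828', '0.50000', '0.99999']
b = diagonal_number(listed)
print(b)  # differs from the i-th number in the i-th decimal place
```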
If we apply the above argument to prove that the rational numbers are uncountable, where would the argument break? Hint: rational numbers have repeating decimals while real numbers can have nonrepeating
decimals.
References
D.L. Cohn. 2013. Measure Theory. 2nd edition published by Birkhäuser.
B.R. Gelbaum and J.M.H. Olmsted. 2003. Counterexamples in Analysis. Published by Dover in 2003 and
based on the 2nd edition published by Holden Day in 1965.