COURSE MATERIAL
ON
CSC 223
DISCRETE STRUCTURE
PRODUCED By
FADARE OLUWASEUN GBENGA
1.0 Module 1: Introduction to set theory
1.1. Learning Outcomes: After completing this module, the students should be able to:
(1) Understand the concepts of discrete mathematics.
(2) Describe, understand discrete objects
(3) Understand the basic concepts of set theory
(4) Understand the concepts of set notations and set operations
1.2
Discrete mathematics is the study of mathematical structures that are fundamentally
discrete rather than continuous. In contrast to real numbers that have the property of varying
"smoothly", the objects studied in discrete mathematics – such as integers, graphs, and
statements in logic – do not vary smoothly in this way, but have distinct, separated values.
Discrete mathematics is mathematics that deals with discrete objects. Discrete objects are those
which are separated from (not connected to/distinct from) each other. Integers (aka whole
numbers), rational numbers (ones that can be expressed as the quotient of two integers),
automobiles, houses, people etc. are all discrete objects. On the other hand real numbers which
include irrational as well as rational numbers are not discrete. As you know between any two
different real numbers there is another real number different from either of them. So they are
packed without any gaps and cannot be separated from their immediate neighbors. In that sense
they are not discrete. In this course we will be concerned with objects such as integers,
propositions, sets, relations and functions, which are all discrete. We are going to learn concepts
associated with them, their properties, and relationships among them among others.
1.3
Introduction to Set Theory
The concept of set is fundamental to mathematics and computer science. Everything
mathematical starts with sets. For example, relationships between two objects are represented as
a set of ordered pairs of objects; the concept of ordered pair is defined using sets. Natural
numbers, which are the basis of other numbers, are also defined using sets. The concept of
function, being a special type of relation, is based on sets, and graphs and digraphs consisting of
lines and points are described as an ordered pair of sets.
1.3.1 Representation of Set
A set can be described in a number of different ways. The simplest is to list all of its members,
if that is possible. For example, {1, 2, 3} is the set of the three numbers 1, 2, and 3. { indicates the
beginning of the set, and } its end. Every object between them, separated by commas, is a member
of the set. Thus {{1, 2}, {{3}, 2}, {1}} is the set of the elements {1, 2}, {{3}, 2} and {1}.
A set can also be described by listing the properties that its members must satisfy.
For example, {x | 1 ≤ x ≤ 2 and x is a real number} represents the set of real numbers between 1
and 2, and {x | x is the square of an integer and x ≤ 100} represents the set {0, 1, 4, 9, 16, 25, 36,
49, 64, 81, 100}.
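Describing a set by a property corresponds closely to a set comprehension in a programming language. The following is a minimal Python sketch of the set of integer squares listed above; the search range of -10 to 10 is an assumption chosen only because it is wide enough to reach every square up to 100.

```python
# Set-builder notation written as a set comprehension.
# The range -10..10 is an illustrative assumption: it covers every integer
# whose square is at most 100.
squares = {n * n for n in range(-10, 11) if n * n <= 100}

print(sorted(squares))
```

Negative and positive integers give the same square, and the set keeps only one copy of each value.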
A set is one of the most fundamental objects in mathematics. A set is an unordered collection of
distinct objects. The objects in a set are called the elements, or members, of the set. A set is said
to contain its elements. A set can be defined by simply listing its members inside curly braces,
for example, the set {2, 4, 17, 23}. To denote membership we use the symbol ∈, as in 4 ∈ {2, 4, 17,
23}. Non-membership is denoted by ∉, as in 5 ∉ {2, 4, 17, 23}.
Definition (Equality of sets): Two sets are equal if and only if they have the same elements.
More formally, for any sets A and B, A = B if and only if ∀x [x ∈ A ↔ x ∈ B].
Thus, for example, {1, 2, 3} = {3, 2, 1}, that is, the order of elements does not matter, and {1, 2, 3}
= {3, 2, 1, 1}, that is, duplications do not make any difference for sets.
Definition (Subset): A set A is a subset of a set B if and only if everything in A is also in B.
More formally, for any sets A and B, A is a subset of B, denoted by A ⊆ B, if and only
if ∀x [x ∈ A → x ∈ B].
If A ⊆ B and A ≠ B, then A is said to be a proper subset of B and it is denoted by A ⊂ B.
We will encounter the following sets and notations throughout the course:
• ∅ = { }, the empty set.
• ℕ = {0, 1, 2, 3, …}, the non-negative integers.
• ℕ⁺ = {1, 2, 3, …}, the positive integers.
• ℤ = {…, -2, -1, 0, 1, 2, …}, the integers.
• ℚ = {q | q = a/b, a, b ∈ ℤ, b ≠ 0}, the rational numbers.
• ℚ⁺ = {q | q ∈ ℚ, q > 0}, the positive rationals.
• ℝ, the real numbers.
• ℝ⁺, the positive reals.
(Set equality). Two sets S and T are equal, written as S = T, if S and T contain exactly the same
elements, i.e., for every x, x ∈ S ↔ x ∈ T.
(Subsets). A set S is a subset of a set T, written as S ⊆ T, if every element in S is also in T, i.e.,
for every x, x ∈ S → x ∈ T. Set S is a strict subset of T, written as S ⊂ T, if S ⊆ T and there exists some
element x ∈ T such that x ∉ S.
Example 1.7.
• {1, 2} ⊆ {1, 2, 3}.
• {1, 2} ⊂ {1, 2, 3}.
• {1, 2, 3} ⊆ {1, 2, 3}.
• {1, 2, 3} ⊄ {1, 2, 3}.
• For any set S, ∅ ⊆ S.
• For every set S ≠ ∅, ∅ ⊂ S.
• S ⊆ T and T ⊆ S if and only if S = T.
1.4
Definition (Set operations). Given sets S and T, we define the following operations:
o Power set. P(S) is the set of all subsets of S.
o Cartesian product. S × T = {(s, t) | s ∈ S, t ∈ T}.
o Union. S ∪ T = {x | x ∈ S or x ∈ T}, the set of elements in S or T.
o Intersection. S ∩ T = {x | x ∈ S, x ∈ T}, the set of elements in S and T.
o Difference. S − T = {x | x ∈ S, x ∉ T}, the set of elements in S but not T.
o Complement. S̄ = {x | x ∉ S}, the set of elements not in S. This is only meaningful when
we have an implicit universe U of objects, i.e., S̄ = {x | x ∈ U, x ∉ S}.
Example: Let S = {1, 2, 3}, T = {3, 4}, V = {a, b}. Then:
• P(T) = {∅, {3}, {4}, {3, 4}}.
• S × V = {(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)}.
• S ∪ T = {1, 2, 3, 4}.
• S ∩ T = {3}.
• S − T = {1, 2}.
• If we are dealing with the set of all integers, S̄ = {…, -2, -1, 0, 4, 5, …}.
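The operations in this example can be tried out directly with Python's built-in set type. The sketch below is only an illustration: the helper name power_set is hypothetical, and subsets are stored as frozenset values because Python sets cannot contain ordinary (mutable) sets.

```python
from itertools import chain, combinations, product

S = {1, 2, 3}
T = {3, 4}
V = {"a", "b"}

def power_set(s):
    """Return the set of all subsets of s, each stored as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}

print(power_set(T))               # the four subsets of T
print(set(product(S, V)))         # Cartesian product S x V
print(S | T)                      # union
print(S & T)                      # intersection
print(S - T)                      # difference
print({1, 2, 3} == {3, 2, 1, 1})  # order and duplicates do not matter
```

The complement is left out because it needs an explicit universe U; with a finite U it would simply be U - S.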
1.5 Self-Assessment Questions
1. Give five examples of discrete objects.
2. “Everything mathematical starts with sets.” Justify this statement.
3. Use set notation to explain equality of two sets A and T.
4. Let S = {1, 6, 3}, T = {3, 5}, V = {a, b}. Find the following: (i) P(T) (ii) S T (iii) S × V
(iv) S − T (v) P(V)
2.0 Module 2: Introduction to Relation
2.1. Learning Outcomes: After completing this module, the students should be able to:
(1) Understand the concepts of relation
(2) Describe, understand the concepts of ordered pair of two elements
(3) Understand the basic concepts of Cartesian product
(4) Understand the concepts of binary relation
2.2
Relation
Here we define relations formally, starting with binary relations. A relation in everyday life
shows an association of objects of one set with objects of other sets (or the same set), such as John
owns a red Mustang, Jim has a green Miata, etc. The essence of a relation is these associations. A
collection of these individual associations is a relation, such as the ownership relation between
people and automobiles. To represent these individual associations, a set of "related" objects,
such as John and a red Mustang, can be used. However, simple sets such as {John, a red
Mustang} are not sufficient here. The order of the objects must also be taken into account,
because John owns a red Mustang but the red Mustang does not own John, and simple sets do
not deal with order. Thus sets with an order on their members are needed to describe a relation.
Here the concept of ordered pair and, more generally, that of ordered n-tuple are defined
first. A relation is then defined as a set of ordered pairs or ordered n-tuples.
2.3
Ordered pair
An ordered pair is a pair of objects with an order associated with them. If the objects are
represented by x and y, then we write the ordered pair as <x, y>, with x first and y second. In general <x, y> is
different from <y, x>. The ordered pair of two elements x and y, also written (x, y), has the following
property:
(x, y) = (u, v) if and only if x = u and y = v
This property can be extended to define an ordered n-tuple as the ordered counterpart of a
set with n elements.
Given two sets A and B, their Cartesian product A × B is the set of all ordered pairs (x, y) such that x
∈ A and y ∈ B: A × B = {(x, y) : x ∈ A, y ∈ B}.
Two ordered pairs <a, b> and <c, d> are equal if and only if a = c and b = d. For example, the ordered pair <1, 2> is
not equal to the ordered pair <2, 1>.
2.4
Definition (equality of ordered pairs):
Two ordered pairs <a, b> and <c, d> are equal if and only if a = c and b = d.
For example, if the ordered pair <a, b> is equal to <1, 2>, then a = 1, and b = 2. <1, 2> is not
equal to the ordered pair <2, 1>.
2.5
Definition (binary relation):
A binary relation from a set A to a set B is a set of ordered pairs <a, b> where a is an element
of A and b is an element of B. If the Cartesian product of A and B is denoted by A × B, then a
binary relation from A to B is a subset of the Cartesian product A × B.
When an ordered pair <a, b> is in a relation R, we write a R b, or <a, b> ∈ R. It means that
element a is related to element b in relation R. When A = B, we call a relation from A to B a
(binary) relation on A.
Examples:
If A = {1, 2, 3} and B = {4, 5}, then {<1, 4>, <2, 5>, <3, 5>}, for example, is a binary relation
from A to B. However, {<1, 1>, <1, 4>, <3, 5>} is not a binary relation from A to B because 1 is
not in B.
Example 1: Let A = {1, 2, 3} and B = {a, b}. Then
A × B = {<1, a>, <1, b>, <2, a>, <2, b>, <3, a>, <3, b>}.
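Since a binary relation from A to B is just a subset of the Cartesian product, the two examples above (with A = {1, 2, 3} and B = {4, 5}) can be checked mechanically. This is a small sketch in which Python tuples stand in for ordered pairs; the function name is_relation is hypothetical.

```python
from itertools import product

A = {1, 2, 3}
B = {4, 5}

def is_relation(pairs, first_set, second_set):
    """True when every pair lies in the Cartesian product first_set x second_set."""
    return set(pairs) <= set(product(first_set, second_set))

print(is_relation({(1, 4), (2, 5), (3, 5)}, A, B))  # every pair is in A x B
print(is_relation({(1, 1), (1, 4), (3, 5)}, A, B))  # (1, 1) is not, since 1 is not in B
```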
2.6 Self-Assessment Questions
(1) Prove that the set of positive rational numbers ℚ⁺ is countable.
(2) Use set notation to define the following set operations on sets F and M: (i) power set (ii)
Cartesian product (iii) union (iv) intersection (v) difference (vi) complement
(3) Prove that for all sets S and T, S = (S ∩ T) ∪ (S − T).
(4) Define an ordered pair of two elements S and H.
(5) Explain the principles of binary relations.
(6) Let A, B, and C be sets. Prove or disprove: If, for all x, x ∈ A → (x ∈ B → x ∈ C), then
A ∩ B ⊆ C.
(7) Let A, B, and C be sets. 1. Prove or disprove: If A ∈ B, and B ⊆ C, then A ⊆ C. 2. Prove or
disprove: If A ⊆ B, and B ⊆ C, then A ⊆ C.
(8) Let A ⊆ B. 1. Prove or disprove: There exists an injection f : A → B. 2. Prove or disprove:
There exists a surjection g : B → A.
3.0 Module 3: Function and Relation
3.1. Learning Outcomes: After completing this module, the students should be able to:
(1) Understand the concepts of a function, a relation and their relationship
(2) Describe, understand the concepts of graph relation
(3) Understand the concepts of partial order, total order and strict order
3.2
Function and Relation
Relations: A relation on sets S and T is a subset of S × T. A relation on a single set S is a
subset of S × S.
Example: “Taller-than” is a relation on people; (A, B) ∈ “Taller-than” if person A is taller than
person B. “≥” is a relation on ℝ; “≥” = {(x, y) | x, y ∈ ℝ, x ≥ y}.
(Reflexivity, symmetry, and transitivity). A relation R on a set S is:
• Reflexive if (x, x) ∈ R for all x ∈ S.
• Symmetric if whenever (x, y) ∈ R, (y, x) ∈ R.
• Transitive if whenever (x, y), (y, z) ∈ R, then (x, z) ∈ R.
Example:
• “≤” is reflexive, but “<” is not.
• “sibling-of” is symmetric, but “≤” and “sister-of” are not.
• “sibling-of”, “≤”, and “<” are all transitive, but “parent-of” is not (“ancestor-of” is transitive,
however).
3.3 (Graph of relations). The graph of a relation R over S is a directed graph with nodes
corresponding to elements of S. There is an edge from node x to y if and only if (x, y) ∈ R. Let R
be a relation over S.
• R is reflexive iff its graph has a self-loop on every node.
• R is symmetric iff in its graph, every edge goes both ways.
• R is transitive iff in its graph, for any three nodes x, y and z such that there is an edge
from x to y and from y to z, there exists an edge from x to z.
• More naturally, R is transitive iff in its graph, whenever there is a path from node x to
node y, there is also a direct edge from x to y.
(Transitive closure). The transitive closure of a relation R is the least (i.e., smallest) transitive
relation R* such that R ⊆ R*.
Let R = {(1, 2), (2, 3), (1, 4)} be a relation (say on the set ℤ). Then (1, 3) ∈ R* (since (1, 2), (2, 3) ∈ R),
but (2, 4) ∉ R*.
A relation R is transitive iff R = R*.
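For a finite relation, the three properties and the transitive closure can be computed by brute force. The sketch below stores a relation as a Python set of tuples and builds the closure by repeatedly adding the pairs that transitivity forces; this is one straightforward approach chosen for readability, not an efficient algorithm, and the function names are illustrative.

```python
def is_reflexive(R, S):
    return all((x, x) in R for x in S)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

def transitive_closure(R):
    """Add the pairs forced by transitivity until nothing new appears."""
    closure = set(R)
    while True:
        new_pairs = {(x, w) for (x, y) in closure for (z, w) in closure if y == z}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

R = {(1, 2), (2, 3), (1, 4)}
R_star = transitive_closure(R)

print((1, 3) in R_star)   # forced by (1, 2) and (2, 3)
print((2, 4) in R_star)   # nothing forces this pair
print(is_transitive(R), is_transitive(R_star))
```

Running it on the relation R from the text reproduces the two membership claims made above, and shows that R itself is not transitive while R* is.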
Given a set A, a relation on A is some property that is either true or false for any ordered pair (x,
y) ∈ A². For example, “greater than” is a relation on ℤ, denoted by >. It is true for the pair (3, 2),
but false for the pairs (2, 2) and (2, 3). More generally, given sets A and B, a relation between
A and B is a subset of A × B. By this definition, a relation R is simply a specification of which
pairs are related by R, that is, which pairs the relation R is true for. For the relation > on the set
{1, 2, 3},
> = {(2, 1), (3, 1), (3, 2)}
This notation might look weird because we do not often regard the symbol “>” as a meaningful
entity in itself. It is, at least from the vantage point of the foundations of mathematics: this
symbol is a particular relation. The common usage of the symbol “>” (as in 3 > 2) is an instance
of a useful notational convention: for a relation R, (a, b) ∈ R can also be written as aRb.
Thus, in the above example, (2, 1) ∈ > can be written as 2 > 1.
A relation R on a set A is called
• reflexive if for all a ∈ A, aRa.
• symmetric if for all a, b ∈ A, aRb implies bRa.
• antisymmetric if for all a, b ∈ A, aRb and bRa implies a = b.
• transitive if for all a, b, c ∈ A, aRb and bRc implies aRc.
For example, the relation = on the set ℤ is precisely the set {(n, n) : n ∈ ℤ} and the relation ≤ on ℝ
is the set {(x, x + |y|) : x, y ∈ ℝ}. A relation that is reflexive, symmetric, and transitive is called
an equivalence relation; the relation = is the most familiar example. The congruence relation ≡n on
the set ℤ is reflexive, symmetric, and transitive, thus it is also an equivalence relation. The
similarity relation on the set of triangles in the plane is another example.
3.4
Order relations: A relation that is reflexive, antisymmetric, and transitive is called a
partial order. The relations ≤, ≥, and | on the set ℕ, as well as the relation ⊆ on the powerset 2^A of
any set A, are familiar partial orders. Note that a pair of elements can be incomparable with
respect to a partial order. For example, | is a partial order on ℕ, but 2 ∤ 3 and 3 ∤ 2. A set A with a
partial order on A is called a partially ordered set, or, more commonly, a poset.
A relation R on a set A is a total order if it is a partial order and satisfies the following additional
condition:
• For all a, b ∈ A, either aRb or bRa (or both).
For example, the relations ≥ and ≤ are total orders on ℝ, but | is not a total order on ℕ. Finally, a
strict order on A is a relation R that satisfies the following two conditions:
• For all a, b, c ∈ A, aRb and bRc implies aRc. (Transitivity.)
• Given a, b ∈ A, exactly one of the following holds (and not the other two): aRb, bRa, a = b.
The familiar < and > relations (on ℝ, say) are examples of strict orders.
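The difference between a partial and a total order can be seen on a small finite set. The following sketch tests the defining properties by brute force for divisibility and for ≤ on {1, …, 6}; the choice of this six-element set and the function names are assumptions made purely for illustration.

```python
def is_partial_order(R, A):
    reflexive = all((a, a) in R for a in A)
    antisymmetric = all(a == b for (a, b) in R if (b, a) in R)
    transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
    return reflexive and antisymmetric and transitive

def is_total_order(R, A):
    comparable = all((a, b) in R or (b, a) in R for a in A for b in A)
    return is_partial_order(R, A) and comparable

A = range(1, 7)
divides = {(a, b) for a in A for b in A if b % a == 0}   # a | b
leq = {(a, b) for a in A for b in A if a <= b}           # a <= b

print(is_partial_order(divides, A), is_total_order(divides, A))
print(is_partial_order(leq, A), is_total_order(leq, A))
```

Divisibility passes the partial-order test but fails totality, because pairs such as 2 and 3 are incomparable, while ≤ passes both.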
3.5
Function
A function f : S → T is a "mapping" from elements in set S to elements in set T. Formally, f is a
relation on S and T such that for each s ∈ S, there exists a unique t ∈ T such that (s, t) ∈ f. S is
the domain of f, and T is the range of f. {y | y = f(x) for some x ∈ S} is the image of f.
We often think of a function as being characterized by an algebraic formula, e.g., y = 3x − 2
characterizes the function f(x) = 3x − 2. Not all formulas characterize a function, e.g. x² + y² = 1
is a relation (a circle) that is not a function (no unique y for each x). Some functions are also not
easily characterized by an algebraic expression, e.g., the function mapping past dates to recorded
weather.
(Injection). f : S → T is injective (one-to-one) if for every t ∈ T, there exists at most one s ∈ S such that
f(s) = t. Equivalently, f is injective if whenever s ≠ s′, we have f(s) ≠ f(s′).
Example:
• f : ℕ → ℕ, f(x) = 2x is injective.
• f : ℝ⁺ → ℝ⁺, f(x) = x² is injective.
• f : ℝ → ℝ, f(x) = x² is not injective since (−x)² = x².
(Surjection). f : S → T is surjective (onto) if the image of f equals its range. Equivalently, for
every t ∈ T, there exists some s ∈ S such that f(s) = t.
Example:
• f : ℕ → ℕ, f(x) = 2x is not surjective.
• f : ℝ⁺ → ℝ⁺, f(x) = x² is surjective.
• f : ℝ → ℝ, f(x) = x² is not surjective since negative reals don't have real square roots.
(Bijection). f : S → T is bijective, or a one-to-one correspondence, if it is injective and surjective.
(Inverse relation). Given a function f : S → T, the inverse relation f⁻¹ on T and S is defined by (t,
s) ∈ f⁻¹ if and only if f(s) = t.
If f is bijective, then f⁻¹ is a function (unique inverse for each t). Similarly, if f is injective, then
f⁻¹ is also a function if we restrict the domain of f⁻¹ to be the image of f. Often an easy way to
show that a function is one-to-one is to exhibit such an inverse mapping. In both these cases,
f⁻¹(f(x)) = x.
Given two sets A and B, a function f : A ⟶B is a subset of A × B such that
(a) If x ∈ A, there exists y ∈ B such that (x, y) ∈ f.
(b) If (x, y) ∈ f and (x, z) ∈ f then y = z.
A function is sometimes called a map or mapping. The set A in the above definition is the
domain and B is the codomain of f. A function f : A ⟶ B is effectively a special kind of relation
between A and B, which relates every x ∈ A to exactly one element of B. That element is
denoted by f(x).
For a function f : A ⟶B, the set f(A) = {f(x) : x ∈ A} is called the range of f. The range is a
subset of the codomain but may be different from it. If f(A) = B then we say that f is onto. More
precisely, a function f : A ⟶B is a surjection (or surjective), or onto if each element of B is of
the form f(x) for at least one x ∈ A.
A function f : A ⟶B is an injection (or injective), or one-to-one if for all x, y ∈ A, f(x) = f(y)
implies x = y. Put differently, f : A ⟶B is one-to-one if each element of B is of the form f(x) for
at most one x ∈ A.
A function f : A ⟶B is a bijection (or bijective), or a one-to-one correspondence if it is both
one-to-one and onto. Alternatively, f : A ⟶B is a bijection if each element of B is of the form
f(x) for exactly one x ∈ A.
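On finite sets these definitions can be tested directly. In the sketch below a function is stored, as in the definition above, as a set of pairs (x, y); the example sets and the names f, g, is_function, is_injective and is_surjective are all illustrative assumptions.

```python
def is_function(f, A, B):
    """Conditions (a) and (b): every x in A has exactly one image, and it lies in B."""
    pairs_in_product = all(x in A and y in B for (x, y) in f)
    one_image_each = all(sum(1 for (u, _) in f if u == x) == 1 for x in A)
    return pairs_in_product and one_image_each

def is_injective(f):
    images = [y for (_, y) in f]
    return len(images) == len(set(images))

def is_surjective(f, B):
    return {y for (_, y) in f} == set(B)

A = {1, 2, 3}
B = {"a", "b", "c"}
f = {(1, "a"), (2, "b"), (3, "c")}   # a bijection
g = {(1, "a"), (2, "a"), (3, "b")}   # neither one-to-one nor onto

for name, h in (("f", f), ("g", g)):
    print(name, is_function(h, A, B), is_injective(h), is_surjective(h, B))
```

Here g sends both 1 and 2 to "a", so it is not injective, and nothing maps to "c", so it is not surjective.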
3.6
Set Cardinality of a Function
Bijections are very useful for showing that two sets have the same number of elements. If f : S →
T is a bijection and S and T are finite sets, then |S| = |T|. In fact, we will extend this definition to
infinite sets as well. (Set cardinality). Let S and T be two potentially infinite sets. S and T have
the same cardinality, written as |S| = |T|, if there exists a bijection f : S → T (equivalently, if there
exists a bijection f⁻¹ : T → S). T has cardinality larger than or equal to S, written as |S| ≤ |T|, if there exists an
injection g : S → T (equivalently, if there exists a surjection g′ : T → S).
Definition: A set S is countable if it is finite or has the same cardinality as ℕ⁺. Equivalently, S is
countable if |S| ≤ |ℕ⁺|.
Example
• {1, 2, 3} is countable because it is finite.
• ℕ is countable because it has the same cardinality as ℕ⁺; consider f : ℕ⁺ → ℕ, f(x) = x − 1.
• The set of positive even numbers, S = {2, 4, …}, is countable; consider f : ℕ⁺ → S, f(x) = 2x.
Theorem. The set of positive rational numbers ℚ⁺ is countable.
Proof. ℚ⁺ is clearly not finite, so we need a way to count ℚ⁺. Note that double counting, triple
counting, even counting some element infinitely many times is okay, as long as we eventually
count all of ℚ⁺. I.e., we implicitly construct a surjection f : ℕ⁺ → ℚ⁺.
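The proof text breaks off here, so the enumeration the author had in mind is not shown. As an illustration only, one standard way to count ℚ⁺ is to walk through the fractions a/b diagonal by diagonal, in order of increasing a + b. The Python sketch below does this with a generator; it repeats values such as 2/2 = 1, which is fine because only a surjection is needed. The function name positive_rationals is hypothetical.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield a/b along the diagonals a + b = 2, 3, 4, ...; repeated values are allowed."""
    for total in count(2):
        for a in range(1, total):
            yield Fraction(a, total - a)

first = list(islice(positive_rationals(), 10))
print(first)
```

Every positive rational a/b appears on the diagonal a + b after finitely many steps, so each one is eventually counted.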
3.7 Self-Assessment Questions
(1) Give two examples of a relation in real life, or of its application in real life.
(2) Use set notation to explain reflexive relations, symmetric relations, transitive relations and
transitive closure.
(3) Define injective and surjective functions.
(4) Define a partial order, a total order and a strict order.
(5) Use set notation to define a countable set S, and give five examples of countable sets
of numbers.
(6) Let S ⊆ N, and for any x, y ∈ N, define x ≤ y if and only if there exists z ∈ S such that x + z =
y. Show that if ≤ is a partial order, then (a) 0 is in S and (b) for any x, y in S, x + y is in S.
(7) Let S be a set with n elements. Recall that a relation R is symmetric if xRy implies yRx,
antisymmetric if xRy and yRx implies x = y, reflexive if xRx for all x, and irreflexive if ¬(xRx) for
all x.
1. How many relations on S are symmetric, antisymmetric, and reflexive?
2. How many relations on S are symmetric, antisymmetric, and irreflexive?
3. How many relations on S are symmetric and antisymmetric?
4.0 Module 4: LOGIC
4.1. Learning Outcomes: After completing this module, the students should be able to:
(1) Understand the concepts of propositions and predicates
(2) Describe, understand the concepts negations and logical connectives
(3) Understand the concepts of tautologies and logical inference
4.2
Introduction to Logic
Perhaps the most distinguishing characteristic of mathematics is its reliance on logic. Explicit
training in mathematical logic is essential to a mature understanding of mathematics. Familiarity
with the concepts of logic is also a prerequisite to studying a number of central areas of computer
science, including databases, compilers, and complexity theory.
4.2.1
Propositions and predicates
A proposition is a statement that is either true or false. For example, “It will rain tomorrow” and
“It will not rain tomorrow” are propositions, but “It will probably rain tomorrow” is not, pending
a more precise definition of “probably”.
A predicate is a statement that contains a variable, such that for any specific value of the variable
the statement is a proposition. Usually the allowed values for the variable will come from a
specific set, sometimes called the universe of the variable, which will be either explicitly
mentioned or clear from context. A simple example of a predicate is x ≥ 2 for x ∈ ℝ. A predicate
may have more than one variable, in which case we speak of predicates in two variables, three
variables, and so on, denoted as Q(x, y), S(x, y, z), etc.
4.3
Quantifiers
Given a predicate P(x) that is defined for all elements in a set A, we can reason about whether
P(x) is true for all x ∈ A, or if it’s at least true for some x ∈ A. We can state propositions to this
effect using the universal quantifier ∀ and the existential quantifier ∃.
• ∀x ∈ A : P(x) is true if and only if P(x) is true for all x ∈ A. This proposition can be read “For all x
∈ A, P(x).”
• ∃x ∈ A : P(x) is true if and only if P(x) is true for at least one x ∈ A. This proposition can be read
“There exists x ∈ A such that P(x).”
It is crucial to remember that the meaning of a statement may change if the existential and
universal quantifiers are exchanged. For example, ∀m ∈ ℤ ∃n ∈ ℤ : m > n means “Every integer is
strictly greater than some integer,” which is true, whereas ∃m ∈ ℤ ∀n ∈ ℤ : m > n means “There is an integer
strictly greater than all integers,” which is false.
4.4
Negations
Given a proposition P, the negation of P is the proposition “P is false”. It is true if P is false, and
false if P is true. The negation of P is denoted by ¬P, read as “not P.” If we know the meaning of
P, such as when P stands for “It will rain tomorrow,” the proposition ¬P can be stated more
naturally than “not P,” as in “It will not rain tomorrow.” The truth-value of ¬P can be
represented by the following truth table:
P       ¬P
true    false
false   true
We see that the statements Q and ¬¬Q have the same truth values. In this case we say that the
two statements are equivalent, and write Q ⇔ ¬¬Q. If A ⇔ B we can freely use B in the place of A,
or A instead of B in our logical derivations. Negation gets really interesting when the negated
proposition is quantified. Then we can assert that
¬∀x ∈ A : P(x) ⇔ ∃x ∈ A : ¬P(x)
¬∃x ∈ A : P(x) ⇔ ∀x ∈ A : ¬P(x)
These can be interpreted as the claim that if P(x) is not true for all x ∈ A then it is false for some
x ∈ A and vice versa, and the claim that if P(x) is not false for any x ∈ A then it is true for all x ∈
A and vice versa. What this means, in particular, is that if we want to disprove a statement that
asserts something for all x ∈ A, it is sufficient to demonstrate one such x for which the statement
does not hold. On the other hand, if we need to disprove a statement that asserts the existence of
an x ∈ A with a certain property, we actually need to show that for all such x this property
does not hold.
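Over a finite set, the universal and existential quantifiers correspond to Python's built-in all and any, so the two negation rules can be checked on an example. The universe A below and the predicate x ≥ 2 are chosen only for illustration.

```python
A = range(-5, 6)          # a small finite universe, chosen only for illustration

def P(x):
    return x >= 2         # the example predicate x >= 2

# not (for all x: P(x))   versus   exists x: not P(x)
lhs1 = not all(P(x) for x in A)
rhs1 = any(not P(x) for x in A)

# not (exists x: P(x))    versus   for all x: not P(x)
lhs2 = not any(P(x) for x in A)
rhs2 = all(not P(x) for x in A)

print(lhs1 == rhs1, lhs2 == rhs2)
```

A single counterexample such as x = 0 is enough to make the first pair true, which mirrors the remark above about disproving a "for all" statement.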
4.5
Logical connectives
The symbol ¬ is an example of a connective. Other connectives combine two propositions (or
predicates) into one. The most common are ∧, ∨, ⊕, →, and ↔. P ∧ Q is read as “P and Q”; P ∨ Q as “P or Q”; P
⊕ Q as “P xor Q”; P → Q as “P implies Q” or “if P then Q”; and P ↔ Q as “P if and only if Q”. The
truth-value of these compound propositions (sometimes called sentences) depends on the truth
values of P and Q (which are said to be the terms of these sentences), in a way that is made
precise in the truth table below:
P   Q   P ∧ Q   P ∨ Q   P ⊕ Q   P → Q   P ↔ Q
T   T     T       T       F       T       T
T   F     F       T       T       F       F
F   T     F       T       T       T       F
F   F     F       F       F       T       T
One interesting thing about the above table is that the proposition P → Q is false only when P is
true and Q is false. This is what we would expect: if P is true but Q is false then, clearly, P does
not imply Q. The important thing to remember is that if P is false, then P → Q is true regardless
of the truth value of Q.
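The truth table for the five binary connectives can be regenerated mechanically, which is a useful check when a table has been copied by hand. This is a small Python sketch; writing implication as (not p) or q and the biconditional as p == q follows the standard definitions, and the function name row is hypothetical.

```python
from itertools import product

def row(p, q):
    """Values of P and Q, P or Q, P xor Q, P implies Q, P iff Q, in that order."""
    return (p and q, p or q, p != q, (not p) or q, p == q)

print("P     Q     and   or    xor   impl  iff")
for p, q in product([True, False], repeat=2):
    cells = [p, q, *row(p, q)]
    print(" ".join(f"{str(c):<5}" for c in cells))
```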
4.6 Tautologies and logical inference
A sentence that is true regardless of the values of its terms is called a tautology, while a
statement that is always false is a contradiction. Another terminology says that tautologies are
valid statements and contradictions are unsatisfiable statements. All other statements are said to
be satisfiable, meaning they can be either true or false. Easy examples of a tautology and a
contradiction are provided by P ∨ ¬P and P ∧ ¬P, as demonstrated by the following truth table:
P   P ∨ ¬P   P ∧ ¬P
T     T        F
F     T        F
Note that by our definition of logical equivalence, all tautologies are equivalent. It is sometimes
useful to keep a “special” proposition T that is always true, and a proposition F that is always false.
Thus any tautology is equivalent to T and any contradiction is equivalent to F.
Here is another tautology: (P ∧ Q) → P:
P   Q   P ∧ Q   (P ∧ Q) → P
T   T     T          T
T   F     F          T
F   T     F          T
F   F     F          T
The statement (P ∧ Q) → P is read “P and Q implies P”. The fact that this is a tautology means that
the implication is always true. Namely, if we know the truth of P ∧ Q, we can legitimately
conclude the truth of P. In such cases the symbol ⇒ is used, and we can write (P ∧ Q) ⇒ P. There is a
crucial difference between (P ∧ Q) → P and (P ∧ Q) ⇒ P. The former is a single statement, while the
latter indicates a relationship between two statements. Such a relationship is called an inference
rule.
A tautology of the form A ↔ B can be converted into the equivalence A ⇔ B, which can be
regarded as two inference rules, A ⇒ B and B ⇒ A. A particularly important inference rule is
called modus ponens, and says that if we know that P and P → Q are both true, we can conclude
that Q is true. It follows from the tautology (P ∧ (P → Q)) → Q.
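Whether a sentence is a tautology can be checked by trying every combination of truth values for its terms. The sketch below does this for the tautologies mentioned in this section, including the one behind modus ponens; the helper names implies and is_tautology are illustrative.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def is_tautology(formula, arity):
    """Check a formula (a Python function of booleans) on every truth assignment."""
    return all(formula(*values) for values in product([True, False], repeat=arity))

print(is_tautology(lambda p: p or not p, 1))                          # P or not P
print(is_tautology(lambda p, q: implies(p and q, p), 2))              # (P and Q) -> P
print(is_tautology(lambda p, q: implies(p and implies(p, q), q), 2))  # modus ponens
print(is_tautology(lambda p, q: implies(p, q), 2))                    # fails when P is true, Q false
```

The last line is included as a contrast: P → Q on its own is false for P true and Q false, so it is not a tautology.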
4.7
Logical statements: Logical statements may be built up from the following notation:
• symbols (s, t, etc.) standing for statements (these will be called variables),
• the symbol ∧, standing for “and,”
• the symbol ∨, standing for “or,”
• the symbol ⊕ standing for “exclusive or,”
• the symbol ¬, standing for “not,”
• the symbol ⇒, standing for “implies,” and
• the symbol ⇔ , standing for “if and only if.”
The operators ∧, ∨, ⊕, ⇒, ⇔, and ¬ are called logical connectives. The operators ⇒ and ⇔ are
called conditional connectives.
Equivalence of logical statements. We say that two symbolic compound statements are
equivalent if they are true in exactly the same cases.
Distributive Law. The statements w ∧ (u ∨ v) and (w ∧ u) ∨ (w ∧ v) are equivalent.
DeMorgan’s Laws. DeMorgan’s Laws say that ¬(p ∨ q) is equivalent to ¬p ∧ ¬q, and that
¬(p ∧ q) is equivalent to ¬p ∨ ¬q.
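Because two statements are equivalent when they are true in exactly the same cases, equivalence can be verified by comparing them on every assignment of truth values. The following sketch checks the distributive law and both of DeMorgan's Laws this way; the function name equivalent is hypothetical.

```python
from itertools import product

def equivalent(f, g, arity):
    """Two statements are equivalent when they agree on every truth assignment."""
    return all(f(*v) == g(*v) for v in product([True, False], repeat=arity))

distributive = equivalent(lambda w, u, v: w and (u or v),
                          lambda w, u, v: (w and u) or (w and v), 3)
de_morgan_or = equivalent(lambda p, q: not (p or q),
                          lambda p, q: (not p) and (not q), 2)
de_morgan_and = equivalent(lambda p, q: not (p and q),
                           lambda p, q: (not p) or (not q), 2)

print(distributive, de_morgan_or, de_morgan_and)
```

The same helper can be used to check answers to several of the self-assessment questions that ask for an equivalence.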
4.8 Self-Assessment Questions
1. Give truth tables for the following expressions:
(a) (s ∨ t) ∧ (¬s ∨ t) ∧ (s ∨ ¬t)
(b) (s ⇒ t) ∧ (t ⇒ u)
(c) (s ∨ t ∨ u) ∧ (s ∨ ¬t ∨ u)
2. Show that the statements s ⇒ t and ¬s ∨ t are equivalent.
3. Prove the DeMorgan law which states ¬ (p ∧ q) = ¬p ∨ ¬q.
4. Show that p ⊕ q is equivalent to (p ∧ ¬q) ∨ (¬p ∧ q).
5. Use a truth table to show that (s∨t)∧(u∨v) is equivalent to (s∧u)∨(s∧v)∨(t∧u)∨(t∧v).
6.Use DeMorgan’s Law, the distributive law, to show that ¬((s ∨ t) ∧(s ∨ ¬t)) is equivalent to ¬s.
7. Prove that the statements ¬∀x ∈ U (p(x)) and ∃x ∈ U (¬p(x)) are equivalent.
8. Prove that the statements ¬∃x ∈ U (q(x)) and ∀x ∈ U (¬q(x)) are equivalent
9. Compute the truth table for the statement [(p ∧ q) ∨ r] ⇒
10. Show how each of the following propositions can be simplified using equivalences to a
single operation applied directly to P and Q. 1. ¬ (P → ¬Q). 2. ¬ ((P ∧ ¬Q) ∨ (¬P ∧ Q)).
11. Consider the formula (P ∧ ¬Q) ⇒ (¬P ∨ Q), where P and Q are atoms.
(a) Is the formula valid? Justify your answer.
(b) Is the formula satisfiable? Justify your answer.
5.0 Module 5: Boolean Algebra and Lattice
5.1 Learning Outcomes: After completing this module, the students should be able to:
(1) Understand the concepts of Boolean Algebra
(2) Describe, understand the concepts of homomorphism between Boolean algebras
(3) Define the isomorphism that exists between Boolean algebras
(4) Understand the concepts of Lattices
5.2
Boolean Algebra and Lattice
In abstract algebra, a Boolean algebra or Boolean lattice is a complemented distributive lattice.
This type of algebraic structure captures essential properties of both set operations and logic
operations. A Boolean algebra can be seen as a generalization of a power set algebra or a field of
sets, or its elements can be viewed as generalized truth values. It is also a special case of a De
Morgan algebra and a Kleene algebra.
A Boolean algebra is a six-tuple consisting of a set A, equipped with two binary operations ∧
(called "meet" or "and"), ∨ (called "join" or "or"), a unary operation ¬ (called "complement" or
"not") and two elements 0 and 1 (called "bottom" and "top", or "least" and "greatest" element,
also denoted by the symbols ⊥ and ⊤, respectively), such that for all elements a, b and c of A, the
following axioms hold:
a ∨ (b ∨ c) = (a ∨ b) ∨ c          a ∧ (b ∧ c) = (a ∧ b) ∧ c          associativity
a ∨ b = b ∨ a                      a ∧ b = b ∧ a                      commutativity
a ∨ (a ∧ b) = a                    a ∧ (a ∨ b) = a                    absorption
a ∨ 0 = a                          a ∧ 1 = a                          identity
a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c)    a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c)    distributivity
a ∨ ¬a = 1                         a ∧ ¬a = 0                         complements
A Boolean algebra with only one element is called a trivial Boolean algebra or a degenerate
Boolean algebra.
It follows from the last three pairs of axioms above (identity, distributivity and complements), or
from the absorption axiom, that a = b ∧ a if and only if a ∨ b = b.
The relation ≤ defined by a ≤ b if these equivalent conditions hold, is a partial order with least
element 0 and greatest element 1. The meet a ∧ b and the join a ∨ b of two elements coincide
with their infimum and supremum, respectively, with respect to ≤. The power set (set of all
subsets) of any given nonempty set S forms a Boolean algebra, an algebra of sets, with the two
operations ∨ := ∪ (union) and ∧ := ∩ (intersection). The smallest element 0 is the empty set and
the largest element 1 is the set S itself.
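The claim that a power set forms a Boolean algebra can be verified exhaustively for a small set. The sketch below, a brute-force check rather than a proof, takes S = {1, 2, 3} (an arbitrary choice), uses union for ∨, intersection for ∧ and complement relative to S for ¬, and tests every axiom listed above on all triples of subsets.

```python
from itertools import chain, combinations, product

S = frozenset({1, 2, 3})                      # a small example set (an assumption)
elements = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(S), r) for r in range(len(S) + 1))]
ZERO, ONE = frozenset(), S                    # least and greatest elements

def complement(a):
    return S - a

def axioms_hold():
    for a, b, c in product(elements, repeat=3):
        checks = [
            a | (b | c) == (a | b) | c, a & (b & c) == (a & b) & c,   # associativity
            a | b == b | a, a & b == b & a,                           # commutativity
            a | (a & b) == a, a & (a | b) == a,                       # absorption
            a | ZERO == a, a & ONE == a,                              # identity
            a | (b & c) == (a | b) & (a | c),                         # distributivity
            a & (b | c) == (a & b) | (a & c),
            a | complement(a) == ONE, a & complement(a) == ZERO,      # complements
        ]
        if not all(checks):
            return False
    return True

print(len(elements), axioms_hold())
```

In this algebra the order a ≤ b defined by a ∧ b = a is exactly the subset relation, which is why the empty set is the least element and S the greatest.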
A homomorphism between two Boolean algebras A and B is a function f : A → B such that
for all a, b in A:
f(a ∨ b) = f(a) ∨ f(b),
f(a ∧ b) = f(a) ∧ f(b),
f(0) = 0,
f(1) = 1.
It then follows that f(¬a) = ¬f(a) for all a in A. The class of all Boolean algebras, together with
this notion of morphism, forms a full subcategory of the category of lattices.
A Boolean algebra is an abstract collection of “truth values" that obey the usual rules of logic. To
understand this, we first need to clarify some issues about logic. In propositional logic (or
zeroth-order logic) we have a collection of propositional variables p, q, r, … which represent statements
that could be true or false. We can combine them using the operations ⋀ (and), ⋁ (or), and ¬
(not) to form formulas. For example, ¬ (p ⋀ q) ⋁ r is a sentence which means “either p and q are
not both true or r is true". We also have special propositions 1 and 0, which are always true and
always false, respectively. We also write p ⟹ q as an abbreviation for “¬p ⋁ q” and p ⟺ q for
(p ⟹ q) ⋀ (q ⟹ p). These mean that p implies q and p is true iff q is true, respectively. The
basic question of interest in propositional logic is which formulas are necessarily true. Such
formulas are called tautologies. For example, p ⋁ ¬p is true, whether p is true or false. Similarly,
((p ⟹ q) ⋀ p) ⟹ q is true regardless of the truth values of p and q. In general, given any
formula, we can algorithmically check whether it is a tautology by plugging in all possible
combinations of truth values for the variables. We will use a slightly different notion of tautology
than the usual one. If A and B are formulas, we say the equation A = B is a tautology if A is true
exactly when B is true. Equivalently, A ⇔ B is a tautology in the sense of the previous
paragraph. Conversely, if A is a tautology in the earlier sense, then A = 1 is a tautology in the
new sense.
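The remark that a tautology can be checked algorithmically, by plugging in all combinations of truth values, can be sketched as follows. The function names implies and is_tautology are assumptions chosen for this illustration; formulas are represented as ordinary Python functions of Boolean arguments.

```python
from itertools import product


def implies(p, q):
    """p => q, used as an abbreviation for (not p) or q."""
    return (not p) or q


def is_tautology(formula, num_vars):
    """Return True if formula is true under every assignment of truth values."""
    return all(formula(*values)
               for values in product([False, True], repeat=num_vars))


# p or not p
excluded_middle = lambda p: p or (not p)
# ((p => q) and p) => q
modus_ponens = lambda p, q: implies(implies(p, q) and p, q)
# p => q alone is not a tautology (it fails for p true, q false)
plain_implication = lambda p, q: implies(p, q)

print(is_tautology(excluded_middle, 1))    # True
print(is_tautology(modus_ponens, 2))       # True
print(is_tautology(plain_implication, 2))  # False
```

With n variables the check evaluates the formula 2^n times, which is fine for the small formulas in the exercises but grows quickly.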
A Boolean algebra is a set B together with operations ¬ : B → B, ⋀ : B × B → B, and ⋁ : B × B
→ B, and special elements 0 ∈ B and 1 ∈ B, which satisfies the following properties for all a, b,
c ∈ B:
1. a ⋀ 1 = a ⋁ 0 = a
2. a ⋀ ¬a = 0, a ⋁ ¬a = 1
3. a ⋀ a = a ⋁ a = a
4. ¬(a ⋀ b) = ¬a ⋁ ¬b, ¬(a ⋁ b) = ¬a ⋀ ¬b
5. a ⋀ b = b ⋀ a, a ⋁ b = b ⋁ a
6. (a ⋀ b) ⋀ c = a ⋀ (b ⋀ c), (a ⋁ b) ⋁ c = a ⋁ (b ⋁ c)
7. a ⋀ (b ⋁ c) = (a ⋀ b) ⋁ (a ⋀ c), a ⋁ (b ⋀ c) = (a ⋁ b) ⋀ (a ⋁ c)
Let B and C be Boolean algebras. Then a homomorphism f : B → C is a map that preserves all
the structure of Boolean algebras:
f(0) = 0, f(1) = 1, f(a ⋁ b) = f(a) ⋁ f(b), f(a ⋀ b) = f(a) ⋀ f(b), and f(¬a) = ¬f(a).
If f is also a bijection, we say f is an isomorphism.
Let B be a Boolean algebra and a, b ∈ B. Then we say a ≤ b if a ∧ b = a. This is equivalent to a ⋁
b = b. Indeed, if a ⋀ b = a, then a ∨ b = (a ∧ b) ⋁ b = b.
The relation ≤ is a partial order on any Boolean algebra. Proof:
We have a ≤ a since a ⋀ a = a. If a ≤ b and b ≤ a, then a = a ∧ b = b.
Finally, if a ≤ b and b ≤c then a ∧ c = (a ∧ b) ∧ c = a ∧ (b ∧ c) = a ∧b = a so a ≤ c.
Let P be a partially ordered set. Then an element 1 ∈ P is the top if p ≤ 1 for all p ∈ P. An
element 0 ∈ P is the bottom if 0 ≤ p for all p ∈ P.
5.3 LATTICES
A poset L is a lattice if every pair of elements x, y has
(i) a least upper bound x ⋁ y (called join), and (ii) a greatest lower bound x ⋀ y (called meet);
that is,
z ≥ x ⋁ y ⟺ z ≥ x and z ≥ y
z ≤ x ⋀ y ⟺ z ≤ x and z ≤ y.
An element z of a lattice L is called join irreducible if z ≠ z1 ⋁ z2 for z1, z2 < z.
Observe that the meet and join operations are associative; for meet,
w ≤ (x ⋀ y) ⋀ z ⟺ w ≤ z and w ≤ x ⋀ y ⟺ w ≤ x, y, z
⟺ w ≤ x and w ≤ y ⋀ z ⟺ w ≤ x ⋀ (y ⋀ z)
and similarly for join,
w ≥ (x ⋁ y) ⋁ z ⟺ w ≥ z and w ≥ x ⋁ y ⟺ w ≥ x, y, z
⟺ w ≥ x and w ≥ y ⋁ z ⟺ w ≥ x ⋁ (y ⋁ z).
If P is a finite poset such that
(i) every pair x, y ∈ P has a greatest lower bound, and (ii) P has a unique maximal element 1 (a top element),
then P is a lattice.
Questions
1. Define a Boolean algebra.
2. Prove that a Boolean algebra obeys the rules of logic.
3. If Z and M are Boolean algebras, define a homomorphism f : Z → M between Z and M.
4. If Z and M are Boolean algebras, define an isomorphism f : Z → M between Z and M.
5. Define a Boolean lattice.
6. Compute the truth table for the statement [(p ∧ q) ∨ r] ⇒ (¬q). Show your work.
7. Show that each of the following propositions is a tautology using a truth table; each of
your truth tables should include columns for all sub-expressions of the proposition.
1. (¬P → P) ↔ P.
2. P ∨ (Q → ¬(P ↔ Q)).
3. (P ∨ Q) ↔ (Q ∨ (P ↔ (Q → R))).
6.0 Module 6 Introduction to Graph
6.1 Learning Outcomes: After completing this module, the students should be able to:
(1) Understand basic concepts of a graph
(2) Describe, understand the attributes of a graph
(3) Define undirected graph & graph isomorphism
(4) Understand the basic component of a graph of Lattices
6.2 GRAPHS
A graph G is an ordered pair (V, E), where V is a set and E is a set of two-element subsets of V.
That is, E ⊆ {{x, y} : x, y ∈ V, x ≠ y}.
Elements of V are the vertices (sometimes called nodes) of the graph and elements of E are the
edges. If e = {x, y} ∈ E we say that x and y are adjacent in the graph G, that y is a neighbor of x
in G and vice versa, and that the edge e is incident to x and y.
What are graphs good for? Graphs are perhaps the most pervasive abstraction in computer
science. It is hard to appreciate their tremendous usefulness at first, because the concept itself is
so elementary. This appreciation comes through uncovering the deep and fascinating theory of
graphs and its applications. Graphs are used to model and study transportation networks, such as
the network of highways, the London Underground, the worldwide airline network, or the
European railway network; the ‘connectivity’ properties of such networks are of great interest.
Graphs can also be used to model the World Wide Web, with edges corresponding to hyperlinks;
Google uses sophisticated ideas from graph theory to assign a PageRank to every vertex of this
graph as a function of the graph’s global properties.
6.3 Common graphs
A number of families of graphs are so common that they have special names that are worth
remembering:
Cliques. A graph on n vertices where every pair of vertices is connected is called a clique (or n-clique) and is denoted by Kn. Formally, Kn = (V,E), where V = {1, 2, . . . , n} and E = {{i, j} : 1 ≤
i < j ≤ n}. The number of edges in Kn is n(n − 1)/2.
Paths. A path on n vertices, denoted by Pn, is the graph Pn = (V,E), where V = {1, 2, . . . , n} and
E = {{i, i + 1} : 1 ≤ i ≤ n − 1}. The number of edges in Pn is n − 1. The vertices 1 and n are
called the endpoints of Pn.
Cycles. A cycle on n ≥ 3 vertices is the graph Cn = (V,E), where V = {1, 2, . . . , n}
and E = {{i, i + 1} : 1 ≤ i ≤ n − 1} ∪ {{1, n}}. The number of edges in Cn is n.
A directed graph G is a pair (V,E) where V is a set of vertices (or nodes), and E ⊆ V × V is a set
of edges. An undirected graph additionally has the property that (u, v) ∈ E if and only if (v,u) ∈
E. In directed graphs, edge (u, v) (starting from node u, ending at node v) is different from edge
(v, u).
6.4 Graph isomorphism. If the above definition of a cycle is followed to the letter, a graph
is a cycle only if its vertices are natural numbers. When do we consider two graphs to be the
same? If the number of vertices or edges differs, then clearly the graphs are different. Therefore
let us focus on the case when two graphs have the same number of vertices and edges. Consider:
G1 = (V1 = {a, b, c}, E1 = {(a, b), (b, c)})
G2 = (V2 = {1, 2, 3}, E2 = {(1, 2), (2, 3)})
The only difference between G1 and G2 is the names of the vertices; they are clearly the same
graph! On the other hand, the graphs
H1 = (V1 ={a,b,c}, E1 = {(a, b), (b, a)})
H2 = (V2 = {a,b,c}, E2 = {(a, b), (b, c)})
are clearly different (e.g., in H1, there is a node without any incoming or outgoing edges.)
Two graphs G1 = (V1, E1) and G2 = (V2, E2) are isomorphic if there exists a bijection f : V1 →
V2 such that (u, v) ∈ E1 if and only if (f(u), f(v)) ∈ E2. The bijection f is called the isomorphism
from G1 to G2, and we use the notation G2 = f(G1).
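For very small graphs, the definition of isomorphism can be tested directly by trying every bijection between the vertex sets. The following is a minimal brute-force sketch, assuming directed edges stored as ordered pairs; the function name are_isomorphic is an assumption made for this illustration, and the approach is only practical for a handful of vertices since it tries up to n! bijections.

```python
from itertools import permutations


def are_isomorphic(v1, e1, v2, e2):
    """Brute-force test: is there a bijection f with (u, v) in E1 iff (f(u), f(v)) in E2?"""
    v1, v2 = list(v1), list(v2)
    e1, e2 = set(e1), set(e2)
    if len(v1) != len(v2) or len(e1) != len(e2):
        return False
    for image in permutations(v2):
        f = dict(zip(v1, image))  # candidate bijection V1 -> V2
        if {(f[u], f[v]) for (u, v) in e1} == e2:
            return True
    return False


# Same graph up to renaming of vertices (a -> 1, b -> 2, c -> 3)
print(are_isomorphic("abc", [("a", "b"), ("b", "c")],
                     [1, 2, 3], [(1, 2), (2, 3)]))               # True

# Same vertex and edge counts, but different structure
print(are_isomorphic("abc", [("a", "b"), ("b", "a")],
                     "abc", [("a", "b"), ("b", "c")]))           # False
```

The second pair fails because the first graph contains a node with no incoming or outgoing edges, while the second does not.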
Size. The number of edges of a graph is called its size. The size of an n-vertex graph is at most
n(n − 1)/2, achieved by the n-clique.
6.5 A directed graph G = (V,E) is strongly connected if there exists a path from any node u
to any node v. It is called weakly connected if there exists a path from any node u to any node v in
the underlying undirected graph: the graph G′ = (V, E′) where each edge (u, v) ∈ E in G induces
an undirected edge in G′ (i.e. (u, v), (v, u) ∈ E′).
When a graph is not connected (or strongly connected), we can decompose the graph into smaller
connected components.
Given a graph G = (V, E), a subgraph of G is simply a graph G′ = (V′, E′) with V′ ⊆ V and E′ ⊆
(V′ × V′) ⋂ E; we denote subgraphs using G′ ⊆ G.
6.6 A connected component of graph G = (V, E) is a maximal connected subgraph, i.e., it is
a subgraph H ⊆ G that is connected, and any larger subgraph H′ (satisfying H′ ≠ H, H ⊆ H′ ⊆ G)
must be disconnected. We may similarly define a strongly connected component as a maximal
strongly connected subgraph.
A fundamental property of graphs is connectivity: whether the graph can be divided into two or
more pieces with no edges between them. Often it makes sense to talk about this in terms of
reachability, or whether you can get from one vertex to another along some path.
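Reachability gives a direct way to compute the connected components of an undirected graph: start from an unvisited vertex and collect everything reachable from it. The sketch below uses breadth-first search; the function name connected_components and the example graph are assumptions made for this illustration.

```python
from collections import deque


def connected_components(vertices, edges):
    """Return the vertex sets of the connected components of an undirected graph."""
    adjacent = {v: set() for v in vertices}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        # Breadth-first search collects every vertex reachable from start.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacent[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components


# Hypothetical example: a path 1-2-3, an edge 4-5, and an isolated vertex 6
parts = connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)])
print(parts)  # [{1, 2, 3}, {4, 5}, {6}]
```

The graph is connected exactly when this function returns a single component.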
• A path of length n in a graph is the image of a homomorphism from Pn
A tree is defined to be an acyclic connected graph. There are several equivalent
characterizations. A graph is a tree if and only if there is exactly one simple path between any
two distinct vertices. A graph G is defined to be connected if and only if there is at least one
simple path between any two distinct vertices. A spanning tree of a nonempty connected graph
G is a subgraph of G that includes all vertices and is a tree (i.e., is connected and acyclic). Every
nonempty connected graph has a spanning tree.
A digraph is simply a directed graph. A digraph allows travel between nodes only in the
direction indicated by the arrowed path lines, which are called directed edges. A weighted graph
is a digraph which has values attached to the directed edges. These values represent the cost of
travelling from one node to the next.
Questions:
1. Describe five real-world applications of graphs.
2. Define a clique and describe its application to graphs.
3. Define a cycle and describe its application to graphs.
4. What is an undirected graph?
5. Consider the relation R on A = {1, 2, 3, 4} given by 1R2, 2R3, 3R3, 3R4, 4R3. Draw the
digraph of R and compute its connectivity relation R∞. Draw the digraph of R∞. (Hint:
Try to determine R∞ by inspection, not by computing with formulas.)
6. Outline the conditions that must be satisfied for two graphs T and B to be isomorphic.
7. Define a bijection between N and N × N (the ordered pairs (0,0), (0,1), (1,2), . . . of natural
numbers).
8. Define a bijection between N and Z.
9. What do you understand by a directed graph G = (V,E) being strongly connected, and by its being
weakly connected?
10. Define a subgraph G1 of a graph G = (V, E) that is a maximal connected subgraph.
11. What is a tree? How does a graph become a tree?
12. Outline the difference between a directed graph and a weighted graph.
7.0 Module 7 Introduction to Graph
7.1 Learning Outcomes: After completing this module, the students should be able to:
(1) Understand basic concepts of typical matrices
(2) Understand how adjacency matrix can be computed from a given matrix
(3) Define undirected graph & graph isomorphism
(4) Understand the basic concepts of Boolean matrix
7.2 MATRICES
In mathematics, a matrix (plural matrices) is a rectangular array of numbers, symbols, or
expressions, arranged in rows and columns. The individual items in a matrix are called its
elements or entries. An example of a matrix with 2 rows and 3 columns is. A matrix is a
rectangular array of numbers or other mathematical objects, for which operations such as
addition and multiplication are defined
An integer matrix is a matrix whose entries are all integers. Examples include binary matrices,
the zero matrix, the unit matrix, and the adjacency matrices used in graph theory, amongst many
others. Integer matrices find frequent application in combinatorics.
The size of a matrix is defined by the number of rows and columns that it contains. A matrix
with m rows and n columns is called an m × n matrix or m-by-n matrix, while m and n are called
its dimensions. For example, the matrix A above is a 3 × 2 matrix.
Matrices which have a single row are called row vectors, and those which have a single column
are called column vectors. A matrix which has the same number of rows and columns is called a
square matrix. A matrix with an infinite number of rows or columns (or both) is called an infinite
matrix. In some contexts, such as computer algebra programs, it is useful to consider a matrix
with no rows or no columns, called an empty matrix.
There are a number of basic operations that can be applied to modify matrices, called matrix
addition, scalar multiplication, transposition, matrix multiplication, row operations, and
submatrix.
7.3 Adjacency Matrix
A matrix is an array of numbers. In what follows, a matrix is denoted by an upper-case alphabet
in boldface (e.g., A), and its (i,j) th element (the element at the i th row and j th column) is denoted
by the corresponding lower-case alphabet with subscripts ij (e.g., aij).
A matrix is square if its number of rows equals the number of columns. A square matrix is said to be
diagonal if its off-diagonal elements (i.e., aij, i ≠ j) are all zeros and at least one of its diagonal
elements is non-zero, i.e., aii ≠ 0 for some i = 1, . . . , n. A diagonal matrix whose diagonal
elements are all ones is an identity matrix.
A matrix can be used to study a graph. A graph is a set of points (called vertices or nodes) and a
set of lines called edges connecting some pairs of vertices. Two vertices connected by an edge
are said to be adjacent. Note that in a graph:
(i) two vertices A and B may be connected by 2 distinct edges;
(ii) a vertex need not be connected to any other vertex;
(iii) a vertex may be connected to itself.
The adjacency matrix for a graph with n vertices is an n × n matrix whose (i,j) entry is 1 if the ith
vertex and jth vertex are connected and 0 if they are not. If in the above figure A is vertex 1, B is
vertex 2, etc., then the adjacency matrix for this graph is
0 1 0 0 0 0
1 0 1 0 1 0
0 1 0 0 0 1
0 0 0 0 0 0
0 1 0 0 0 1
0 0 1 0 1 1
Adjacency matrix: We may number the vertices v1 to vn and represent the edges in an n-by-n
matrix A. The entry aij in row i and column j of the matrix is 1 if and only if there is an edge from vi to vj.
If the graph is undirected, then aij = aji and the matrix A is symmetric about the diagonal; in this
case we can just store the upper right triangle of the matrix.
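As an illustration, an adjacency matrix can be built from a list of edges. The sketch below is an assumption-laden example rather than a prescribed method: the function names adjacency_matrix and is_symmetric are invented here, vertices are numbered from 1, and the edge list was read off the 6 × 6 matrix shown earlier (including the self-loop at vertex 6).

```python
def adjacency_matrix(n, edges, directed=False):
    """Build the n x n zero-one adjacency matrix; vertices are numbered 1..n."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = 1
        if not directed:
            a[j - 1][i - 1] = 1  # an undirected edge is stored in both directions
    return a


def is_symmetric(a):
    """Check a[i][j] == a[j][i] for all i, j."""
    n = len(a)
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(n))


# Edges read off the 6 x 6 example matrix; (6, 6) is a self-loop
edges = [(1, 2), (2, 3), (2, 5), (3, 6), (5, 6), (6, 6)]
A = adjacency_matrix(6, edges)
for row in A:
    print("".join(str(x) for x in row))
print(is_symmetric(A))  # True
```

Row 4 of the result is all zeros, which corresponds to a vertex that is not connected to any other vertex.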
7.4 Boolean Matrix
A matrix is called a Boolean matrix if its entries are elements of a Boolean algebra, typically the
binary digits 0 and 1. Given two Boolean matrices A and B of compatible sizes, we define the Boolean
product of A and B as the matrix whose (i,j)th entry is ⋁k (aik ⋀ bkj), and we define the Boolean sum
of A and B as the matrix whose (i,j)th entry is aij ⋁ bij.
7.5 Boolean Arithmetic. If a and b are binary digits (0 or 1), then
a ∧ b = 1 if a = b = 1, and a ∧ b = 0 otherwise;
a ∨ b = 0 if a = b = 0, and a ∨ b = 1 otherwise.
Let A and B be n × m matrices.
1. The meet of A and B: A ∧ B = [aij ∧ bij]
2. The join of A and B: A ∨ B = [aij ∨ bij]
Let A = [aij] be m × k and B = [bij] be k × n. The Boolean product of A and B, A ʘ B, is the m ×
n matrix C = [cij] defined by
cij = (ai1 ∧ b1j) ∨ (ai2 ∧ b2j) ∨ (ai3 ∧ b3j) ∨ · · · ∨ (aik ∧ bkj).
Boolean operations on zero-one matrices are completely analogous to the standard operations,
except we use the Boolean operators ∧ and ∨ on the binary digits instead of ordinary
multiplication and addition, respectively.
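The meet, join and Boolean product of zero-one matrices can be sketched in a few lines. The function names meet, join and boolean_product and the example matrices X, Y, A and B below are assumptions chosen for this illustration; matrices are stored as lists of rows containing 0s and 1s.

```python
def meet(a, b):
    """Entrywise AND of two zero-one matrices of the same size."""
    return [[x & y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def join(a, b):
    """Entrywise OR of two zero-one matrices of the same size."""
    return [[x | y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def boolean_product(a, b):
    """Boolean product of an m x k matrix a and a k x n matrix b.

    c[i][j] = (a[i][0] AND b[0][j]) OR ... OR (a[i][k-1] AND b[k-1][j]).
    """
    k = len(b)
    return [[int(any(a[i][t] and b[t][j] for t in range(k)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical example matrices
X = [[1, 0, 1], [0, 1, 0]]
Y = [[0, 1, 0], [1, 1, 0]]
print(join(X, Y))  # [[1, 1, 1], [1, 1, 0]]
print(meet(X, Y))  # [[0, 0, 0], [0, 1, 0]]

A = [[1, 0], [0, 1], [1, 0]]   # 3 x 2
B = [[1, 1, 0], [0, 1, 1]]     # 2 x 3
print(boolean_product(A, B))   # [[1, 1, 0], [0, 1, 1], [1, 1, 0]]
```

The product has the same shape as the ordinary matrix product; only the arithmetic on the entries changes.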
Questions
1. Draw a graph in which two vertices A and B are connected by 2 distinct edges.
2. Draw a graph in which some vertex is not connected to any other vertex.
3. Draw a graph in which some vertex is connected to itself.
4. Define an integer matrix and give two examples.
5. Define a real matrix and give two examples.
6. Mention five applications of matrices.
7. What do you understand by adjacent vertices and an adjacency matrix?
8. How does a given matrix become a Boolean matrix?
9. What do you understand by a Boolean matrix?
10. Let G be a group and ≤ a partial order on the elements of G such that for all x, y in G, x ≤ xy.
How many elements does G have?
11. Consider the matrices
A=
101
110
,B=
011
100
101
Compute A ʘ B, AB, B ʘ A and A ∧ B.
Assume A is the matrix of a relation. Draw the corresponding digraph.
7.0 Module 7 Introduction to Counting
7.1 Learning Outcomes: After completing this module, the students should be able to:
(1) Understand basic concepts of counting discrete objects
7.2 APPLICATION TO COUNTING
Combinatorics is the branch of Mathematics in which methods to solve counting problems are
studied. Counting is the process of creating a bijection between a set we want to count and some
set whose size we already know. The subject of enumerative combinatorics is counting.
Generally, there is some set A and we wish to calculate the size |A| of A. Here are some sample
problems:
• How many ways are there to seat n couples at a round table, such that each couple sits
together?
• How many ways are there to express a positive integer n as a sum of positive integers?
There are a number of basic principles that we can use to solve such problems.
7.3 The sum principle: Consider n sets Ai, for 1 ≤ i ≤ n, that are pairwise disjoint, namely
Ai ∩ Aj = ∅ for all i ≠ j. Then
|A1 ∪ A2 ∪ · · · ∪ An| = |A1| + |A2| + · · · + |An|.
For example, if there are n ways to pick an object from the first pile and m ways to
pick an object from the second pile, there are n + m ways to pick an object altogether.
7.4 The product principle: If we need to do n things one after the other, and there are c1
ways to do the first, c2 ways to do the second, and so on, the number of possible courses of
action is c1 · c2 · · · cn. For example, the number of possible three-letter words in which a letter appears at most
once that can be constructed using the English alphabet is 26 · 25 · 24: there are 26 possibilities
for the first letter, then 25 possibilities for the second, and finally 24 possibilities for the third.
7.5 Subtraction Rule (Inclusion-Exclusion for two sets)
For any finite sets A and B (not necessarily disjoint), |A ∪ B| = |A| + |B| − |A ∩ B|
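The subtraction rule can be checked by brute force on a small case. The sketch below is an illustration only: it takes A to be the bit strings of length 8 that start with 1 and B those that end with 00, enumerates all 256 strings, and compares both sides of the formula.

```python
from itertools import product

# All 2^8 bit strings of length 8
bit_strings = ["".join(bits) for bits in product("01", repeat=8)]

A = {s for s in bit_strings if s.startswith("1")}  # start with a 1 bit
B = {s for s in bit_strings if s.endswith("00")}   # end with the two bits 00

lhs = len(A | B)                     # |A ∪ B| counted directly
rhs = len(A) + len(B) - len(A & B)   # |A| + |B| − |A ∩ B|

print(len(A), len(B), len(A & B))  # 128 64 32
print(lhs, rhs)                    # 160 160
```

Subtracting |A ∩ B| corrects for the strings that were counted twice, once in A and once in B.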
7.6 The Pigeonhole Principle
For any positive integer k, if k + 1 objects (pigeons) are placed in k boxes (pigeonholes), then at
least one box contains two or more objects.
Pigeonhole Principle (rephrased more formally)
If a function f : A → B maps a finite set A with |A| = k + 1 to a finite set B, with |B| = k, then f is
not one-to-one.
(Recall: a function f: A → B is called one-to-one if ∀ a1, a2 ∈ A, if a1 ≠ a2 then f(a1) ≠ f(a2).)
The principle says that if we place k + 1 or more pigeons into k pigeon holes, then at least one
pigeon hole contains 2 or more pigeons.
For example, in a group of 367 people, at least two people must have the same birthday (since
there are a total of 366 possible birthdays). More generally, we have
(Pigeonhole Principle). If we place n (or more) pigeons into k pigeon holes, then at least one pigeon hole
contains ⌈n/k⌉ or more pigeons. In a group of 800 people, there are at least ⌈800/366⌉ = 3 people
with the same birthday.
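The bound in the general form of the principle is the ceiling of n/k. A small sketch, assuming 366 possible birthdays as in the example; the function names guaranteed_occupancy and fullest_hole are invented for this illustration, and the round-robin assignment is just one sample placement.

```python
from collections import Counter


def guaranteed_occupancy(n, k):
    """Ceiling of n / k: some hole must contain at least this many pigeons."""
    return -(-n // k)  # integer ceiling division, avoids floating point


def fullest_hole(n, k):
    """Place pigeon i in hole i mod k (the most even spread) and return the largest count."""
    counts = Counter(i % k for i in range(n))
    return max(counts.values())


print(guaranteed_occupancy(367, 366))  # 2
print(guaranteed_occupancy(800, 366))  # 3
print(fullest_hole(800, 366))          # 3
```

Even the most even spread of 800 people over 366 birthdays leaves some birthday shared by 3 people, which matches the bound.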
Permutations
A permutation of a set S is an ordered arrangement of the elements of S. In other words, it is a
sequence containing every element of S exactly once.
7.7 Combinations
An r-combination of a set S is an unordered collection of r elements of S. In other words, it is
simply a subset of S of size r.
Formula for the number of r-combinations
Let C(n, r) denote the number of r-combinations of an n-element set. Another notation for C(n,
r) is nCr.
These are called binomial coefficients, and are read as “n choose r”.
For all integers n ≥ 1, and all integers r such that 0 ≤ r ≤ n:
C(n, r) = n! / (r! (n − r)!) = (n · (n − 1) · · · (n − r + 1)) / r!
For all integers n ≥ 1, and all integers r, 1 ≤ r ≤ n:
=
Summarizing, for counting the number of possibilities to draw k copies from a set of n objects,
there are four possible results, depending on the context:
• If every copy is drawn from the full set, so the drawn copies are put back, and drawings in a
different order are considered to be different, then the number of possibilities is n^k.
• If every copy is drawn from the remaining set, and no copies are put back, and drawings in a
different order are considered to be different, then the number of possibilities is
n(n − 1)(n − 2) · · · (n − k + 1) = n!/(n − k)! = k! · C(n, k).
• If every copy is drawn from the remaining set, and no copies are put back, and drawings in a
different order are considered to be the same, then the number of possibilities is C(n, k).
• If every copy is drawn from the full set, so the drawn copies are put back, and drawings in a
different order are considered to be the same, then the number of possibilities is C(n + k − 1, k).
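The four cases can be written as four small functions. This is a sketch under the assumption that Python 3.8 or later is available, since it relies on math.comb and math.perm from the standard library; the function names are invented for this illustration.

```python
from math import comb, perm


def ordered_with_replacement(n, k):
    """Copies are put back, order matters: n^k."""
    return n ** k


def ordered_without_replacement(n, k):
    """No copies put back, order matters: n!/(n - k)!."""
    return perm(n, k)


def unordered_without_replacement(n, k):
    """No copies put back, order ignored: C(n, k)."""
    return comb(n, k)


def unordered_with_replacement(n, k):
    """Copies are put back, order ignored: C(n + k - 1, k)."""
    return comb(n + k - 1, k)


print(ordered_with_replacement(2, 7))        # 128 bit strings of length seven
print(ordered_without_replacement(26, 3))    # 15600 = 26 * 25 * 24
print(unordered_without_replacement(52, 5))  # 2598960 five-card hands
print(unordered_with_replacement(3, 11))     # 78
```

The second and third cases differ exactly by the factor k!, the number of orders in which the same k objects can be drawn.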
7.8 Questions:
{1} How many bit strings of length seven are there?
Solution: Since each bit is either 0 or 1, applying the product rule, the answer is 2^7 = 128.
{2} How many different car license plates can be made if each plate contains a sequence of three
uppercase English letters followed by three digits?
Solution: 26 · 26 · 26 · 10 · 10 · 10 = 17,576,000
(3) Suppose variable names in a programming language can be either a single uppercase letter or
an uppercase letter followed by a digit. Find the number of possible variable names.
Solution: Use the sum and product rules: 26 + 26 · 10 = 286.
{4} How many bit strings of length 8 either start with a 1 bit or end with the two bits 00?
Solution:
Number of bit strings of length 8 that start with 1: 2^7 = 128.
Number of bit strings of length 8 that end with 00: 2^6 = 64.
Number of bit strings of length 8 that start with 1 and end with 00: 2^5 = 32.
Applying the subtraction rule, the number is 128 + 64 − 32 = 160.
{5} At least two students registered for this course will receive exactly the same final exam
mark. Why?
Reason: There are at least 102 students registered for CSC 223 (suppose the actual number is
145), so there are at least 102 objects. Final exam marks are integers in the range 0-100 (so, exactly 101
boxes). By the pigeonhole principle, at least one mark is shared by two or more students.
{6} Consider the set S = {1, 2, 3}. The sequence (3, 1, 2) is one permutation of S.
There are 6 different permutations of S. They are: (1, 2, 3) , (1, 3, 2) , (2, 1, 3) , (2, 3, 1) , (3, 1, 2)
, (3, 2, 1)
{7} How many ways can a committee of 3 faculty members and 2 students be selected from 7
faculty members and 8 students?
Answer: Task T1: to choose 3 faculty members from 7, there are 7C3 = (7 · 6 · 5)/(3 · 2 · 1) = 35 ways.
Task T2: to choose 2 students from 8, there are 8C2 = (8 · 7)/(2 · 1) = 28 ways.
Altogether, there are 35 · 28 = 980 ways of choosing this committee.
{8} Two fair six-sided dice are rolled and the sum s of the numbers coming up is recorded. What
is the probability that s ≥ 10? Show your work.
Answer: The sample space
A = {(1, 1), (1, 2), . . . , (6, 6)}, and |A| = 36. The event that the sum is ≥ 10 is
E = {(4, 6), (5, 5), (5, 6), (6, 4), (6, 5), (6, 6)}.
So the probability is
|E| / |A| = 6 /36 = 1/ 6
{9} Two cards are dealt in succession from a standard shuffled 52-card deck. What is the
number of possible 2-card hands?
The number of such hands is 52C2 = (52 · 51)/(2 · 1) = 1326.
{10} A 6-sided die is rolled twice. What is the probability that the sum of the two rolls is exactly
8?
Answer: The sample space for two rolls of a die is A = {(1, 1), (1, 2), . . . , (6, 5), (6, 6)},
and |A| = 6^2 = 36. The event given by the sum of the two rolls being
8 is given by E = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)},
and so |E| = 5. So the probability is |E| / |A| = 5 /36
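Both dice answers can be confirmed by listing the 36 outcomes and counting. A minimal sketch, assuming fair dice so that every outcome is equally likely; the helper name probability is invented for this illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space for two rolls of a six-sided die: 36 ordered pairs
sample_space = list(product(range(1, 7), repeat=2))


def probability(event):
    """|E| / |A| for the event given as a true/false test on outcomes."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))


p_sum_is_8 = probability(lambda outcome: sum(outcome) == 8)
p_sum_at_least_10 = probability(lambda outcome: sum(outcome) >= 10)

print(p_sum_is_8)         # 5/36
print(p_sum_at_least_10)  # 1/6
```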
(11) Consider the relation R on A = {1, 2, 3, 4} given by 1R2, 2R3, 3R3, 3R4, 4R3.
Draw the digraph of R and compute its connectivity relation R∞. Draw
the digraph of R∞. (Hint: Try to determine R∞ by inspection, not by computing with formulas.)
(12) How many different 5-card poker hands can be dealt from a deck of 52 cards?
{13} How many different 47-card poker hands can be dealt from a deck of 52 cards?
Solutions:
(12) C(52, 5) = 52! / (5! · 47!) = (52 · 51 · 50 · 49 · 48)/(5 · 4 · 3 · 2 · 1) = 2,598,960
{13} C(52, 47) = 52! / (47! · 5!) = (52 · 51 · 50 · 49 · 48)/(5 · 4 · 3 · 2 · 1) = 2,598,960
{14} A chess club with 20 members must elect a board consisting of a chairman, a secretary, and
a treasurer.
(a) What is the number of ways to form a board (from members of the club) if the three positions
must be occupied by different persons?
(b) What is the number of ways if it is permitted that a person occupies several positions?
(c) What is the number of ways if the rule is that every person occupies at most 2 positions?
(15) How many different solutions in non-negative integers x1, x2, and x3, does the following
equation have?
x1 + x2 + x3 = 11
Solution: We have to place 11 “pebbles” into three different “bins”, x1, x2, and x3.
This is equivalent to choosing an 11-combination with repetition from a set of size 3, so
the answer is
C(3 + 11 − 1, 11) = C(13, 11) = C(13, 2) = (13 · 12)/(2 · 1) = 78.
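The count of 78 can be cross-checked by enumerating every candidate triple. The sketch below is a brute-force illustration; the function name count_solutions is an assumption, and math.comb requires Python 3.8 or later.

```python
from itertools import product
from math import comb


def count_solutions(total, bins):
    """Count non-negative integer solutions of x1 + ... + x_bins = total by enumeration."""
    return sum(1
               for xs in product(range(total + 1), repeat=bins)
               if sum(xs) == total)


brute_force = count_solutions(11, 3)
formula = comb(3 + 11 - 1, 11)  # combinations with repetition

print(brute_force, formula)  # 78 78
```

Enumeration examines 12^3 = 1728 triples here, so it is only a sanity check; the formula is what scales.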
(17) A group of k students sit in a row of n seats. The students can choose whatever seats they
wish, provided: (a) from left to right, they are seated in alphabetical order; and (b) each student
has an empty seat immediately to his or her right. For example, with 3 students A, B, and C and
7 seats, there are exactly 4 ways to seat the students: A-B-C--, A-B--C-, A--B-C-, and -A-B-C-.
Give a formula that gives the number of ways to seat k students in n seats according to the rules
given above.
(18) Consider the following game: A player starts with a score of 0. On each turn, the player rolls
two dice, each of which is equally likely to come up 1, 2, 3, 4, 5, or 6. They then take the product
xy of the two numbers on the dice. If the product is greater than 20, the game ends. Otherwise,
they add the product to their score and take a new turn. The player’s score at the end of the game
is thus the sum of the products of the dice for all turns
before the first turn on which they get a product greater than 20. {1} What is the probability that
the player’s score at the end of the game is zero? {2} What is the expectation of the player’s
score at the end of the game?
8.0 Module 8 Introduction to Counting
8.1 Learning Outcomes: After completing this module, the students should be able to:
(1) Understand principles of discrete probability generating function
8.2 DISCRETE PROBABILITY GENERATING FUNCTION
The probability generating function is a powerful technique for studying the law of finite sums of
independent discrete random variables taking non-negative integer values. Generating functions are a
useful and current tool in practical mathematics, in particular in discrete
mathematics and, in
the case of probability generating functions, in distributional convergence results.
8.3 Probability Generating Functions
The generating function associated with a sequence a0, a1, a2, a3, ... is a formal series
f(x) = a0 + a1x + a2x^2 + a3x^3 + ...
A random variable X that assumes integer values with probabilities P(X = n) = pn is fully
specified by the sequence p0, p1, p2, p3, ... The corresponding generating function
f(x) = p0 + p1x + p2x^2 + p3x^3 + ...
is commonly referred to as a probability generating function. Each term is a power of x with a
coefficient; the exponent points to a value that the random value may take; the coefficient
indicates the probability of the random variable taking the value in the exponent.
Examples
Tossing a coin. Let's write 1 for the "heads" event and 0 for "tails". The generating function of the
experiment that consists of a single toss of a coin is then f(x) = (1/2) + (1/2)x. One possible
interpretation is that, in a single toss of a coin, the probability of having 0 heads is 1/2; the
probability of having 1 head is also 1/2.
Rolling a die. The generating function for the experiment of rolling a die once is
f(x) = (1/6)x + (1/6)x^2 + (1/6)x^3 + (1/6)x^4 + (1/6)x^5 + (1/6)x^6.
Tossing two coins. The sample space for a toss of two coins consists of four possible
outcomes: {TT, TH, HT, HH}, or, if we use the convention of denoting the events T and H as 0
and 1, {00, 01, 10, 11}. This notation is convenient if we are interested
in the number of times "heads" showed up in two tosses: it is the sum of the digits, which is 0 in the
first event, 1 = 0 + 1 in the next two, and 2 in the last event, 11. It follows that the probabilities of
having 0, 1, or 2 heads in two coin tosses are 1/4, 2/4, and 1/4, respectively. The probability
generating function for the random number of heads in two throws is defined as
f(x) = (1/4)1 + (2/4)x + (1/4)x^2.
Observe that the generating function of two coin tosses equals the square of the generating
function associated with a single toss:
(1/4)1 + (2/4)x + (1/4)x^2 = [(1/2) + (1/2)x]^2.
The possible outcomes for three coins are {000, 001, 010, 011, 100, 101, 110, 111}. If we are
only concerned with the number of heads shown in three throws, then the sample space is {0, 1,
2, 3}, with probabilities 1/8, 3/8, 3/8, 1/8. These coefficients also come from the binomial theorem
for [(1/2) + (1/2)x]^3. This is a general rule: the generating function for the number of heads
shown in N coin tosses equals [(1/2) + (1/2)x]^N = 2^(−N)(1 + x)^N.
Even more generally, if f(x) and g(x) are the probability generating functions of
two independent random variables X and Y, then the generating function,
corresponding to the sum X + Y equals the product f(x)g(x)!
Rolling two dice. For a single die, f(x) = (1/6)x + (1/6)x^2 + (1/6)x^3 + (1/6)x^4 + (1/6)x^5 + (1/6)x^6.
For two dice, we have
f(x)^2 = [(1/6)x + (1/6)x^2 + (1/6)x^3 + (1/6)x^4 + (1/6)x^5 + (1/6)x^6]^2
= 6^(−2)[x^2 + 2x^3 + 3x^4 + 4x^5 + 5x^6 + 6x^7 + 5x^8 + 4x^9 + 3x^10 + 2x^11 + x^12],
which tells us that, for example, there are 5 ways to get 6 in two throws. Indeed,
6 = 1 + 5 = 2 + 4 = 3 + 3 = 4 + 2 = 5 + 1.
This happens with the probability of 5/36.
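Multiplying generating functions is just polynomial multiplication, which is easy to carry out on coefficient lists. In the sketch below a generating function is stored as a list whose entry at index n is the coefficient of x^n; this representation and the function name multiply are assumptions made for the illustration.

```python
from fractions import Fraction


def multiply(f, g):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    h = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b  # x^i * x^j contributes to x^(i+j)
    return h


# One die: coefficients of x^0 .. x^6 (x^0 has probability 0)
die = [Fraction(0)] + [Fraction(1, 6)] * 6
two_dice = multiply(die, die)
print(two_dice[6])  # 5/36, the probability that two dice sum to 6

# One coin: (1/2) + (1/2)x; three coins give 1/8, 3/8, 3/8, 1/8
coin = [Fraction(1, 2), Fraction(1, 2)]
three_coins = multiply(multiply(coin, coin), coin)
print([str(p) for p in three_coins])  # ['1/8', '3/8', '3/8', '1/8']
```

The coefficient of x^n in the product collects every way of splitting n between the two independent variables, which is why the product rule for sums holds.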
In probability theory, the probability-generating function of a discrete random variable is
a power series representation (the generating function) of the probability mass function of the
random variable. Probability-generating functions are often employed for their succinct
description of the sequence of probabilities Pr(X = i), and to make available the well-developed
theory of power series with non-negative coefficients.
If X is a discrete random variable taking values on some subset of the non-negative integers, {0, 1,
...}, then the probability-generating function of X is defined as:
G(z) = E(z^X) = p(0) + p(1)z + p(2)z^2 + ...
where p is the probability mass function of X. Note that the equivalent notation GX is
sometimes used to distinguish between the probability-generating functions of several
random variables.
A generating function of a random variable (rv) is an expected value of a certain
transformation of the variable. All generating functions have some very important properties.
The most important property is that under mild conditions, the generating function
completely determines its distribution. The second important property is that the moments of
the random variable can be determined from the derivatives of the generating function. This
property is useful because often obtaining moments from the generating function is easier
than computing the moments directly from their definitions.
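For a probability generating function G of a variable X with finitely many values, the standard identities are E(X) = G′(1) and Var(X) = G″(1) + G′(1) − G′(1)^2. The sketch below applies them to a single die; the coefficient-list representation (index = exponent) and the function names derivative, evaluate and mean_and_variance are assumptions made for this illustration.

```python
from fractions import Fraction


def derivative(coeffs):
    """Differentiate a polynomial given as a coefficient list (index = exponent)."""
    return [n * c for n, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(c * x ** n for n, c in enumerate(coeffs))


def mean_and_variance(pgf):
    """Mean and variance from G'(1) and G''(1)."""
    first = evaluate(derivative(pgf), 1)               # G'(1) = E(X)
    second = evaluate(derivative(derivative(pgf)), 1)  # G''(1) = E(X(X - 1))
    return first, second + first - first ** 2


# One fair die: probability 1/6 on each of x^1 .. x^6
die = [Fraction(0)] + [Fraction(1, 6)] * 6
mean, variance = mean_and_variance(die)
print(mean, variance)  # 7/2 35/12
```

The same two derivatives give the mean and variance of any distribution on finitely many non-negative integers once its coefficients are known.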
8.4 Discrete and continuous distributions: A random variable is a function X, whose value
is uncertain and depends on some random event. The space or range of X is the set S of
possible values of X. A random variable X is said to be discrete if this set has a finite or
countable infinite number of distinct values (i.e. can be listed as a sequence.). The random
variable X is said to have a continuous distribution if all values are possible in some real
interval. Often, there are functions that assign probabilities to all events in a sample space.
These functions are called probability mass functions if the events are discretely distributed,
or probability density functions if the events are continuously distributed. All the possible
value of a random variable and their associated probability values constitute the probability
distribution of the random variable. The discrete probability distributions are specified by the
list of possible values and the probabilities attached to those values, and the continuous
distributions are specified by probability density functions. The distribution of a random
variable X can be also described by the cumulative distribution function F(x) = P(X ≤ x).
For a discrete random variable X with a probability mass function p(x) = P(X = x),
we have 0 ≤ p(x) ≤ 1 for all x and ∑ p(x) = 1. The probability mass function or the
probability density function of a random variable X contains all the information that one ever
need about this variable.
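The two conditions on a probability mass function, and the cumulative distribution function
of a discrete random variable, can be checked mechanically. The following is a minimal
sketch; the two-coin example and the function names is_valid_pmf and cdf are assumptions
introduced here for illustration.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Check 0 <= p(x) <= 1 for every x and that the probabilities sum to 1."""
    return all(0 <= p <= 1 for p in pmf.values()) and sum(pmf.values()) == 1

def cdf(pmf, x):
    """Cumulative distribution function F(x) = P(X <= x) for a discrete pmf."""
    return sum(p for value, p in pmf.items() if value <= x)

# Illustrative pmf: number of heads in two tosses of a fair coin
two_coin_pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

print(is_valid_pmf(two_coin_pmf))   # True
print(cdf(two_coin_pmf, 1))         # 3/4
```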
8.5 The sequence of moments of a random variable
We know that the mean μ = E(X) and variance $\sigma^2 = E[(X - E(X))^2] = E(X^2) - (E(X))^2$ of a
random variable enter into the fundamental limit theorems of probability, as well as into all sorts
of practical calculations. These important attributes of a random variable contain important
information about the distribution function of that variable. But the mean and variance do not
contain all the available information about the density function of a random variable. Besides the
two numerical descriptive quantities μ and σ that locate the center and describe the spread of the
values of a random variable, we define a set of numerical descriptive quantities, called moments,
which in many cases uniquely determine the probability distribution of a random variable.
We have a sequence of moments $\mu_k = E(X^k)$, k = 1, 2, . . ., associated with a random variable X.
In many cases this sequence determines the probability distribution of X. However, the moments of
X may not exist. In terms of these moments, the mean μ and variance σ2 of X are given simply by
$\mu = \mu_1$ and $\sigma^2 = \mu_2 - \mu_1^2$. The higher moments have a more obscure meaning as k grows.
The moments give a lot of useful information about the distribution of X. Knowledge of the
first two moments of X gives us its mean and variance, and in many cases (for example, when X has
finite range) knowledge of all the moments of X determines its probability function completely. In
such cases, different distributions cannot have identical moments. That is what makes moments
important. It therefore seems that it should always be possible to calculate the expected value or
mean of X, E(X) = μ, the variance V(X) = σ2, or higher-order moments of X from its probability
density function, or to calculate the distribution of, say, the sum of two independent random
variables X and Y whose distributions are known. In practice, it turns out that these calculations
are often very difficult.
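Let X be a random variable taking values in a countable set of real numbers {x_k}, and for all k
let $p_k = P(X = x_k)$. Then the probability generating function [PGF] of X, denoted by $\psi_X$, is
$$\psi_X(t) = E(t^X) = \sum_k p_k\, t^{x_k}$$
for all t in the set on which it is well defined, that is,
$\{t > 0 : \sum_k p_k\, t^{x_k} < \infty\}$.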
We consider two examples. In the first example the random variable takes rational positive and
negative values. In the second example the random variable takes irrational values. The discrete
random variable X1 defined below, which takes rational values, appears naturally in the context of
fair marking of multiple choice questions.
The random variable X1 takes the following values:
(1) -1 with the probability of 1/16
(2) -2/3 with the probability of 3/16
(3) -1/3 with the probability of 3/16
(4) 0 with the probability of 1/8
(5) 1/3 with the probability of 3/16
(6) 2/3 with the probability of 3/16
(7) 1 with the probability of 1/16
The PGF of X1 is given by
$$\psi_{X_1}(t) = \frac{1}{16}t^{-1} + \frac{3}{16}t^{-2/3} + \frac{3}{16}t^{-1/3} + \frac{2}{16} + \frac{3}{16}t^{1/3} + \frac{3}{16}t^{2/3} + \frac{1}{16}t$$
For a deeper understanding of fair marking of an exam with a set of, say, ten multiple choice
questions, it is important to know the distribution of the sum of ten independent copies of
X1, which we denote by Y. We know that $\psi_Y(t) = (\psi_{X_1}(t))^{10}$.
For a second example, consider a random variable X2 taking some irrational values, defined as
follows:
(1) −3/4 with probability 0.34
(2) π with probability 0.33
(3) 2π with probability 0.33
The PGF of X2 is given by
$$\psi_{X_2}(t) = 0.34\, t^{-3/4} + 0.33\, t^{\pi} + 0.33\, t^{2\pi}$$
As above, we are interested in the law of Z, the sum of ten independent, identically distributed
copies of X2. We know that
$$\psi_Z(t) = \left(0.34\, t^{-3/4} + 0.33\, t^{\pi} + 0.33\, t^{2\pi}\right)^{10}$$
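The tenth power of the PGF of X1 from the first example can be expanded mechanically. The
sketch below stores a PGF as a dictionary mapping each exponent (a value of the random
variable) to its coefficient (the probability of that value). Multiplying two PGFs then adds
exponents and multiplies coefficients, which is the same as finding the distribution of a sum
of independent random variables. The function names multiply_pgfs and pgf_power are
hypothetical, and the approach is one possible way to do the expansion, not a method given
in the text.

```python
from collections import defaultdict
from fractions import Fraction

# pmf of X1 as listed in the text: value -> probability
x1_pmf = {
    Fraction(-1): Fraction(1, 16),
    Fraction(-2, 3): Fraction(3, 16),
    Fraction(-1, 3): Fraction(3, 16),
    Fraction(0): Fraction(2, 16),
    Fraction(1, 3): Fraction(3, 16),
    Fraction(2, 3): Fraction(3, 16),
    Fraction(1): Fraction(1, 16),
}

def multiply_pgfs(pgf_a, pgf_b):
    """Multiply two PGFs stored as {exponent: coefficient} dictionaries."""
    product = defaultdict(Fraction)  # missing exponents start at coefficient 0
    for xa, pa in pgf_a.items():
        for xb, pb in pgf_b.items():
            product[xa + xb] += pa * pb  # t**xa * t**xb = t**(xa + xb)
    return dict(product)

def pgf_power(pgf, n):
    """Raise a PGF to the n-th power by repeated multiplication."""
    result = {Fraction(0): Fraction(1)}  # the constant 1 = t**0
    for _ in range(n):
        result = multiply_pgfs(result, pgf)
    return result

# Distribution of Y, the sum of ten independent copies of X1
y_pmf = pgf_power(x1_pmf, 10)

print(sum(y_pmf.values()))        # 1
print(min(y_pmf), max(y_pmf))     # -10 10
```

The resulting dictionary y_pmf holds the probability of each possible total mark, from −10 to
10 in steps of 1/3. The same idea applies to X2 if the exponents are stored in a form that can
be added exactly, for example as pairs counting the multiples of −3/4 and of π.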
8.6 Generating Functions for Discrete Distributions
We have seen that the mean and variance of a random variable contain important information
about the random variable, or, more precisely, about the distribution function of that variable.
Now we shall see that the mean and variance do not contain all the available information about
the density function of a random variable.
For instance, suppose X and Y are random variables with the distributions

value   1     2     3     4     5     6
PX      0     1/4   1/2   0     0     1/4
PY      1/4   0     0     1/2   1/4   0

Then with these choices, we have E(X) = E(Y) = 7/2 and V(X) = V(Y) = 9/4, and yet certainly PX
and PY are quite different density functions.
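The claim can be checked directly. The sketch below assumes that PX assigns the
probabilities 0, 1/4, 1/2, 0, 0, 1/4 and PY the probabilities 1/4, 0, 0, 1/2, 1/4, 0 to the
values 1, 2, . . . , 6; the function names mean and variance are introduced here for
illustration.

```python
from fractions import Fraction

# Distributions of X and Y on the values 1..6 (value -> probability)
px = {1: Fraction(0), 2: Fraction(1, 4), 3: Fraction(1, 2),
      4: Fraction(0), 5: Fraction(0), 6: Fraction(1, 4)}
py = {1: Fraction(1, 4), 2: Fraction(0), 3: Fraction(0),
      4: Fraction(1, 2), 5: Fraction(1, 4), 6: Fraction(0)}

def mean(pmf):
    """E(X) = sum of x * p(x) over all values x."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X) = E((X - E(X))**2)."""
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

print(mean(px), mean(py))          # 7/2 7/2
print(variance(px), variance(py))  # 9/4 9/4
print(px == py)                    # False
```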
This raises a question: If X is a random variable with range {x1, x2, . . .} of at most countable
size, and distribution function p = pX, and if we know its mean µ = E(X) and its variance
$\sigma^2 = V(X)$, then what else do we need to know to determine p completely?
A nice answer to this question, at least in the case that X has finite range, can be given in terms
of the moments of X, which are numbers defined as follows:
$$\mu_k = k\text{th moment of } X = E(X^k) = \sum_{j} x_j^k\, p(x_j),$$
provided the sum converges. Here $p(x_j) = P(X = x_j)$.
In terms of these moments, the mean µ and variance σ2 of X are given simply by $\mu = \mu_1$ and
$\sigma^2 = \mu_2 - \mu_1^2$, so that a knowledge of the first two moments of X gives us its mean and
variance. But a knowledge of all the moments of X determines its distribution function p completely.
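To make this concrete, the sketch below computes the first three moments of two
distributions on the values 1, 2, . . . , 6: one assigning probabilities 1/4, 1/2, 1/4 to the
values 2, 3, 6 and the other assigning 1/4, 1/2, 1/4 to the values 1, 4, 5 (zero-probability
values are omitted). The first two moments agree, so the means and variances agree, but the
third moments differ. The function name moment is an assumption introduced here.

```python
from fractions import Fraction

def moment(pmf, k):
    """k-th moment mu_k = E(X**k) = sum over j of x_j**k * p(x_j)."""
    return sum(x**k * p for x, p in pmf.items())

# Two distributions with the same mean and variance (value -> probability)
px = {2: Fraction(1, 4), 3: Fraction(1, 2), 6: Fraction(1, 4)}
py = {1: Fraction(1, 4), 4: Fraction(1, 2), 5: Fraction(1, 4)}

for k in (1, 2, 3):
    print(k, moment(px, k), moment(py, k))
# 1 7/2 7/2
# 2 29/2 29/2
# 3 139/2 127/2

# Variance from the first two moments: sigma^2 = mu_2 - mu_1**2
print(moment(px, 2) - moment(px, 1) ** 2)   # 9/4
```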
To see how this comes about, we introduce a new variable t, and define a function g(t) as follows:
$$g(t) = E(e^{tX}) = \sum_{j} e^{t x_j}\, p(x_j).$$
We call g(t) the moment generating function for X, and think of it as a convenient bookkeeping
device for describing the moments of X. Indeed, if we differentiate g(t) n times and then set t = 0,
we get µn:
$$\left.\frac{d^n}{dt^n}\, g(t)\right|_{t=0} = \sum_{j} x_j^n\, p(x_j) = \mu_n.$$
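A numerical check of this property is sketched below for a fair six-sided die, an example
chosen here for illustration. Since the n-th derivative of e^{tx} with respect to t is
x^n e^{tx}, the n-th derivative of g can be written term by term, and setting t = 0 leaves the
n-th moment. A central finite difference is included as an independent check of the first
derivative. The function names are hypothetical.

```python
import math

# pmf of a fair six-sided die (illustrative choice), using floats
die_pmf = {k: 1 / 6 for k in range(1, 7)}

def mgf(pmf, t):
    """Moment generating function g(t) = E(e^{tX}) = sum_j e^{t x_j} p(x_j)."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

def mgf_derivative(pmf, n, t=0.0):
    """n-th derivative of g at t, term by term: sum_j x_j**n e^{t x_j} p(x_j)."""
    return sum(x**n * math.exp(t * x) * p for x, p in pmf.items())

def moment(pmf, k):
    """k-th moment computed directly from its definition."""
    return sum(x**k * p for x, p in pmf.items())

# The derivatives at t = 0 agree with the moments
for n in (1, 2, 3):
    print(n, mgf_derivative(die_pmf, n), moment(die_pmf, n))

# Independent check of g'(0) using a central finite difference
h = 1e-5
numeric_first = (mgf(die_pmf, h) - mgf(die_pmf, -h)) / (2 * h)
print(numeric_first)   # approximately 3.5
```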
Questions:
(1) Suppose X has range {1, 2, 3, . . . , n} and pX(j) = 1/n for 1 ≤ j ≤ n
(uniform distribution). Calculate its g(t).
(2) Suppose now that X has range {0, 1, 2, 3, . . . , n} and $p_X(j) = \binom{n}{j} p^j q^{n-j}$ for
0 ≤ j ≤ n (binomial distribution). Calculate its g(t).
(3) Suppose X has range {1, 2, 3, . . .} and $p_X(j) = q^{j-1} p$ for all j (geometric distribution).
Calculate its g(t).
(4) Find the generating functions, both ordinary h(z) and moment g(t), for the
following discrete probability distributions.
(a) The distribution describing a fair coin.
(b) The distribution describing a fair die.
(c) The distribution describing a die that always comes up 3.
(d) The uniform distribution on the set {n, n + 1, n + 2, . . . , n + k}.
(e) The binomial distribution on {n, n + 1, n + 2, . . . , n + k}.
(f) The geometric distribution on {0, 1, 2, . . .} with $p(j) = 2/3^{j+1}$.
(5)