Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Regular Languages
Sequential Machine Theory
Prof. K. J. Hintz
Department of Electrical and Computer Engineering
Lecture 3
Comments, additions and modifications
by Marek Perkowski
Languages
• Informal Languages
– English
– Body
– Bureaucratize
• Formal Languages
– Rule-based
– Elements are decidable
– No deeper understanding required
Formal Language
• All the Rules of the Language Are Explicitly
Stated in Terms of the Allowed Strings of
Symbols, e.g.,
– Programming languages, e.g., C, Lisp, Ada
– Military communications (formal “informal” L)
– Digital network protocols
A to ...
Alphabet: a finite set of symbols, aka I,
–
–
–
–
Roman:
Binary:
Greek:
Cyrillic:
{ a, b, c, ... , z }
{ 0, 1 }
{ a, c, e, g, i, k, l, ... }
{ Ж, Й, Њ, С, Р, ... }
String
String, word: a finite ordered sequence of
symbols from the alphabet, usually written
with no intervening punctuation
– x1 = “ t h e “
– x2 = “ 0 1 0 1 1 0 “
– x3 = “ c i g a e
“
– x4 = “ Ж Й С Р “
a
String
• Reverse of String
– The sequence of symbols written backwards
x1r " e h t "
• Reverse of Concatenation
– Strings themselves must be reversed
x y R y R x R
String
• Length or Size of String
– The number of symbols
x1 3
x2 6
x3 9
x4 4
x3 x4 13
Strings
• Null String, Empty String, e,
– A string of length or size zero
– The symbols e or , meant to denote the null
string, are not allowed to be part of the
language
Substring
• A String, v, Is a Substring of a String, w, iff
There Are Strings x and y Such That
–
–
–
–
w=xvy
x is called the prefix
y is called the suffix
x and/or y could be
Kleene Closure
• Set of All Strings, *, I*
– Order IS important
– Not the same as P , the powerset of the
alphabet, since order is NOT important in the
powerset
Concatenation Operator
• If x, y I*, then the concatenation of x and
y is written as
–z=x
– e.g., if
y
• x = “Red”
|x|= 3
• y = “skins”
|y|= 5
• z = x y = “Redskins“ | z | = 8
Concatenation Operator
• Concatenation of Any String With the Null
String Results in the Original String
–xe=ex=x
–x=x=x
• Concatenation is Associative
– x = abc
y=def
z= ghi
–(xy)z=x(yz)
– abcdefghi = abcdefghi
Language
• Language, L: Any Subset of the Set of All
Strings of an Alphabet
L *
L I*
I*
L1
L2
Classes of Languages
• Enumerated Languages
– Defined by a List of All Words in the Language
• Le = { “quidditch”, “nimbus 2000”, }
• not very interesting
• Rule-based
– Defined by Properties or a Set of Rules
L r w I * : w has the property P
Rule Based Languages
• A Test to Determine Whether a String Is a
Member of a Language
• A Means of Constructing Strings That Are
in the Language
– Must be able to construct ALL strings in the
language
– Must be able to construct ONLY strings in the
language
Rule-Based Language Example
Let I = { a, b }
• A Language That Consists of All Two Letter
Strings
– L = { aa, ab, ba, bb }
– is not an element of the language
Empty Language
• Null Language, Empty Language, : The
Language With No Words in It
– Not the same as
– can be made into a language with words
L
– A language consisting only of is still a
language
Kleene Star
If L I * is a language, then
• L* Is the Set of All Strings Obtained by
Concatenating Zero or More Strings of L.
• Concatenation of Zero Strings Is
• Concatenation of One String Is the String
Itself
• L+ = L* - { }
Kleene Closure Example
• L = { 0, 1}
L* = {
,
0, 00, 000, ... , 0*,
1, 11, 111, ... , 1*,
01, 001, 0001, ... , 0*1,
... }
Kleene Closure Examples
• L = { ab, f }
L* = { ,
ab, abf, fab, ffab, ffabf, ... }
• *
={ }
• if
L
={ }
then
L*
={ }
Kleene Closure Examples
Let I = { a, b }
• L = Language ( ( ab )* )
{, ab, abab, ababab, ... }
which is not the same as
• L = Language ( a* b* )
{, a, b, ab, aab, abb, ... }
The language of all strings of a’s and b’s in
which the a’s, if any, come before the b’s
Recursive Language Definition
• Variation of Rule-Based
• Three-step Process
1. Specify some basic elements of the set
2. Specify the rules for forming new elements
from old elements of the set
3. Specify that elements not in 1 or 2 above are
NOT elements of the set
Recursive Example
• Two Equivalent Recursive Definitions of
Rational Numbers
– Rational #1
1. Rat_1 = { -, ... -3, -2, -1, 1, 2, 3, ... , }
2. if p, q Rat_1, then p/q Rat_1
3. the only rational numbers are those generated by 1
and 2 above.
Recursive Example
– Rational #2
1. Rat_2 = { -1, +1 }
2. if p, q Rat_2, p,q != 0, then (p+q)/p Rat_2
3. the only rational numbers are those generated by 1
and 2 above.
e.g.,
n 1 1
11
2 1
2
3
n
1
1
1
generates all integers
Interest in Recursive Definitions
• Allows Us to Prove Some Statements About
What Is Computable.
• Leads to Proof by Induction
Principle of Mathematical
Induction*
Let A Be a Subset of the Natural Numbers
• 0 A, and
• for each natural number, n,
– if
– implies
– then
* Lewis & Papadimitriou, pg. 24
{ 0, 1, ..., n } A ,
(n + 1) A
A=N
Mathematical Induction
• In practice, mathematical induction is used
to prove assertions of the form
For all natural numbers, n,
property P is true
Mathematical Induction Practice
To prove statements of the form
A = { n : P is true of n }, three steps
1. Basis Step: show that 0 A,
i.e., P is true of n = 0
2. Induction Hypothesis: assume that for some
arbitrary, but fixed n > 0, P holds for each
natural number 0, 1, ... , n
Mathematical Induction Practice
3. Induction Step: use the induction hypothesis
(that P is true of n) to show that P is true of (n
+ 1)
• By the Induction Principle, Then A=N and
Hence, P Is True of Every Natural Number.
Induction Example*
Show that for any n 0,
n 2 n
1 2 n
2
1. Basis Step
02 0
0
2
0 0
true for n 0
* Lewis & Papadimitriou, pg. 25
Induction Example
2. Induction Hypothesis
Assume that for some n 0,
m2 m
1 2 m
2
when m n
Induction Example
3. Induction Step
1 2 n n 1 1 2 n n 1
n2 n
where 1 2 n is replaced by
from the
2
induction hypothesis
n2 n
n 1
2
n2 n 2n 2
2
Induction Example
n 2 2n 1 n 1
2
n 1 n 1
2
which shows that the hypothesis is true since if it was true
for n 0, then it must be true for any n 1
2
Another Induction Example
• Define EVEN as
1. 0 is in EVEN
2. if x EVEN then so is x + 2
3. The only elements of EVEN are those
produced by 1 & 2 above.
• Prove by induction that all of elements of
EVEN end in either 0, 2, 4, 6, or 8.
Induction Example (cont)
Proof
1. Basis Step
0 EVEN by definition, therefore the property is
true of the zero’th step since 0 { 0, 2, 4, 6, 8
}
2. Induction Hypothesis
Assume that the last digit of
(m+2) { 0, 2, 4, 6, 8 } for 0 < m < n
Induction Example
3. Induction Step
n
0
1
2
3
4
EVEN n
EVEN
ends in {0,2,4,6,8}
2
...
by step 2
4
n
2n + 2
6
n+1
(2n+2)+2
8
n+1
2(n+1)+2
0+2=2, 2+2=4,4+2=6
10
6+2=8, 8+2=0 {0,2,4,6,8}
Regular Expressions
• Shorthand Notation for Concisely
Expressing Languages
• Defined Recursively
• Lead to a Definition of Regular Languages
• Provide Finite Representation of Possibly
Infinite Languages
• Lead to Lexical Analyzers
Regular Expressions Notation
Language a
a
ab
a, b
a*
a *
a
a
with operator precedence being
highest
*
Kleene Star
Concatenation
lowest
or +
Set Union
Regular Expressions Over I
• and are regular expressions
• a is a regular expression for each a I
• If r and s are regular expressions, then so
are r s, r s, and r*
• No other sequences of symbols are regular
expressions
Regular Expressions Alternative
1. L( ) = { }
L( a ) = { a }
If p and q are regular expressions, then
2. L( pq )
= L( p ) L( q )
3. L( p q )
= L( p ) L( q )
4. L( p* )
= L( p )*
Regular Expressions Example
What is L3 ( ( a b )* a ) ?
L 3 L a b * L a
2
L a b * a
1
L a b * a
4
L a L b * a
3
a b * a
1, 1
a, b * a
definition
w a, b *: w ends with a
Regular Expressions
• Boolean OR Distributes over Concatenation
L language a bc c * b
L language ac * b bcc * b
– which is the language of all strings beginning
with a, ending with b, and having none or more
c’s in the middle, and,
– all strings beginning and ending with b and
having at least one c in the middle
Regular Expressions
• The Boolean OR Operator Can Distribute
When It Is Inside a Kleene Starred
Expression, but Only in Certain Ways
L language a bc * b
a bc a bc a bc b
a * b bc * b
ab bcb *
Regular Expressions
• Useful String
– ( a + b )* = the set of all strings of a and b of
any length
– L = Language ( ( a + b )* )
– { , ab, abab, abaab, abbaab, babba, bbb, ... }
Regular Languages
• If L I* is finite, then L is regular.
• If L1 and L2 are regular, so are
– L3= L1 L2
– L4= L1 L2 = {x1 x2 | x1 L1 , x2 L2 }
• If L is regular, then so is L*, where * is the
Kleene Star
Regular Languages
• If L Is a Finite Language, Then L Can Be
Defined by a Regular Expression.
• The Converse Is Not True. That Is, Not All
Regular Expressions Represent Finite
Languages.
• L = Language( ( a + b )* ) Is Infinite Yet
Regular