Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany Overview • • • • • • • Deterministic finite automata Regular languages Nondeterministic finite automata Closure operations Regular expressions Nonregular languages The pumping lemma Finite Automata • An intuitive example : supermarket door controller • Figures 1,2,3 • Probabilistic counterparts exist – Markov chains, bayesian nets, etc. – Not in this course A finite automaton • Figure 1.4 • Formally A finite automaton is a 5-tuple (Q, , , q0 , F ) Startstate :q1 1.Q is a finite set of states 2. is a finite set, the alphabet 3. : Q Q is the transition function 4. qo Q is the start state Acceptstate :q2 5. F Q is the set of accept states States :q1 , q2 , q3 Transitions Output : accept or reject Describe M 1 Q {q1 , q2 , q3} {0,1} defined by A is the language of machine M we write L( M ) A A {w | w contains at least one 1 and an even number of 0s follow the last 1 } q1 0 q1 1 q2 q2 q3 q2 q3 q2 q2 q1 start state F {q2 } Other examples • 7,8,9 Another example A generalisation : Ai is the language of all strings where the sum of the numbers is a multiple of i except that the sum is reset to 0 whenever the symbol reset appears Automaton Bi = 1.Qi {q0 ,..., qi } 2. ={0,1,2, reset } 3. (q j , 0) q j (q j ,1) qk where k ( j 1) mod i (q j , 2) qk where k ( j 2) mod i (q j , reset ) q0 4. qo Q is start and accept state Formal definition of computation Let M be a finite automaton (Q, , , q0 , F ) Let w w1....wn be a string over M accepts w if a sequence of states r0 ,..., rn exists in Q such that 1.r0 q0 2. (ri , wi 1 ) ri 1 for all i 0,..., n 1 3.rn F M recognizes language A if A {w | M accepts w} A language is regular if some finite automaton recognizes it. Designing finite automata • Design automaton for language consisting of binary strings with an odd number of 1s • Design first states • Then transitions • Accept and reject states • Fig. 12 Another example • Design an automaton to recognize the language of binary strings containing the string 001 as substring • Fig 13 The regular operations Let A and B be languages We define : Union :A B {x | x A or x B} Concatenation :A B {xy | x A and y B} Star :A* {x1 ,..., xn | n 0 and each xi A} note: always A* Example A {good , bad } B {boy, girl} Regular languages are closed under … A set S is closed under an operation o if applying o on elements of S yields elements of S . Example: multiplication on natural numbers Counterexample :division of natural numbers Theorem 1.12 The class of the regular languages is closed under the union operation. In other words, if A1 and A2 are regular languages, so is A1 A2 • Proof M (Q, , , q, F ) constructed from M 1 (Q1 , 1 , 1 , q1 , F1 ) and M 2 (Q2 , 2 , 2 , q2 , F2 ) Define 1. Q {(r1 , r2 ) | r1 Q1 and r2 Q2 } 2. 1 2 3. ((r1 , r2 ), a) (1 (r1 , a), 2 ( r2 , a)) 4.q (q1 , q2 ) 5.F {(r1 , r2 ) | r1 F1 or r2 F2 } Theorem 1.13 The class of the regular languages is closed under the concatenation operation. In other words, if A1 and A2 are regular languages, so is A1 A2 Non deterministic finite automata • Deterministic – One successor state – transitions not allowed • Non deterministic – Several successor states possible – transitions possible • Figure 14 Deterministic versus non deterministic computation • Figure 15 Example Every NFA has an equivalent DFA • Figures 17-18 Another NFA Nondeterministic finite automaton A nondeterministic finite automaton is a 5-tuple (Q, , , q0 , F ) 1.Q is a finite set of states 2. is a finite set, the alphabet 3. : Q P(Q) is the transition function 4. qo Q is the start state 5. F Q is the set of accept states includes P(Q) the powerset of Q Example • Example 18 Formal definition of computation Let M be a nondeterministic finite automaton (Q, , , q0 , F ) Let w w1....wn be a string over M accepts w if a sequence of states r0 ,..., rn exists in Q such that 1.r0 q0 2.ri 1 (ri , wi 1 ) for all i 0,..., n 1 3.rn F Equivalence NFA and DFA Two machines are equivalent if they recognize the same language Theorem 1.19 Every nondeterministic finite automaton has an equivalent finite automaton Corollary 1.20 A language is regular if and only if some nondeterministic finite automaton recognizes it. Proof An example Closure under the regular operations Theorem 1.12/1.22 The class of the regular languages is closed under the union operation. In other words, if A1 and A2 are regular languages, so is A1 A2 Theorem 1.23 The class of the regular languages is closed under the concatenation operation. Theorem 1.24 The class of the regular languages is closed under the star operation. Proof idea Theorem 1.12/1.22 The class of the regular languages is closed under the union operation. In other words, if A1 and A2 are regular languages, so is A1 A2 • INSERT FIG 1.24 • PROOF P 60 ! Proof idea Theorem 1.23 The class of the regular languages is closed under the concatenation operation. Proof idea Theorem 1.24 The class of the regular languages is closed under the star operation. Regular expressions R R R R Applications • Design of compilers {, , }( DD* DD*.D D*.DD* ) where D {0,...,9} • awk, grep, vi … in unix (search for strings) • Perl programming language • Bioinformatics – So called motifs (patterns occurring in sequences, e.g. proteins) Theorem1.28 A language is regular if and only if some regular expression describes it Proof through : Lemma1.29 If a language is described by some regular expression, then it is regular Lemma 1.32 If a language is regular, then it is described by some regular expression Lemma1.29 If a language is described by some regular expression, then it is regular Lemma 1.32 If a language is regular, then it is described by some regular expression • Two steps – DFA into GNFA (generalized nondeterministic finite automaton) – GNFA into regular expression GNFAs • Two states q and r are connected in both directions • Exception : – Start state – Accept state – One direction only • Labels are regular expressions A generalized nondeterministic finite automaton is a 5-tuple (Q, , , qstart , qaccept ) 1.Q is a finite set of states Formally 2. is a finite set, the alphabet 3. : (Q {qaccept }) (Q {qstart }) R is the transition function 4. qstart Q is the start state 5. qaccept Q the accept state A GNFA accepts w w1...wk where each wi * if a sequence of states r0 ,..., rn exists in Q such that 1.r0 qstart 2.rk qaccept 3. for all i 0,..., n 1, we have that wi L( Ri ) where Ri (ri 1 , ri ) Convert DFA into GNFA Add new start state, with arrow to old start state Add new accept state, with arrows from old accept states If any arrows have multiple labels a and b, replace by a b Add arrows with label between states where necessary Convert GNFA into regular expression Induction Proof Claim 1.34 For any GNFA G, Convert(G ) is equivalent to G • Induction proof Basis k 2; straightforward Induction Assume claim holds for k 1 states Suppose Nonregular Languages • Finite Automata have a finite memory • Are the following languages regular ? B {0n1n | n 0} C {w | w has an equal number of 0s and 1s} D {w | w has an equal number of occurences of 01 and 10} • Mathematical proof necessary The pumping lemma The pumping lemma If A is regular, then there is a number p, for which if s is any string in A of length at least p then s may be divided into three pieces s xyz such that 1. for each i 0, xy i z A 2. | y | 0 3. | xy | p Note y Proof Idea Nonregular languages B {0 1 | n 0} n n Choose s 0 p1p 1. string y consists only of 0s 2. string y consists only of 1s 3. string y consists of both 0s and 1s C {w | w has an equal number of 0s and 1s} Choose s 0 p1p Would seem possible without condition 3! However, condition 3 of lemma states | xy | p Thus y consists of 0s only Then xyyz C Choice of s crucial. Consider s (01) p Alternative proof : B is nonregular If C were regular, then C 0*1* B regular Regular languages closed under intersection F {ww | w {0,1}*} Choose s 0 p10 p1 Condition 3 of lemma states | xy | p Thus y consists of 0s only Then xyyz F 0 p 0 p would not work ! D {1 | n 0} n2 Choose s 1p Perfect squares : 0,1,4,9,16,25,36,49,... 2 Consider xy i z and xy i 1 z Prove that for large i: xy i z and xy i 1 z cannot both be squares which should be true according to pumping lemma Let m n 2 | xy i z | Then (n 1) 2 n 2 2n 1 2 m 1 Choose | y | 2 m 1 2 | xy i z | 1 by setting i p 4 Indeed, observe | y || s | p 2 p4 <2 p 4 1 <2 | xy i z | 1 E {w | 0i 1 j where i j} p 1 p Choose s 0 1 Condition 3 of lemma states | xy | p Thus y consists of 0s only Then xy 0 z F Summary • • • • • • • Deterministic finite automata Regular languages Nondeterministic finite automata Closure operations Regular expressions Nonregular languages The pumping lemma