					An Algebraic Approach to Equivalence
Among Context-Free Linear Grammars:
An Introductory Paper
Mark DeArman
Math 320
12/11/2003
Introduction
Linguistics is the scientific study of language. Key to the scientific study of any
subject is the ability of the scientists to represent their findings in the concrete language
of mathematics as opposed to simple empirical results. In this paper, I will introduce the
mathematical axioms and structures used to model language and grammar in a way such
that mathematical analysis is possible. Three sections divide the content logically so that each
section builds upon the last. The only prerequisite knowledge assumed is a basic
understanding of abstract algebra, topology and set theory.
The first section describes the algebraic structures that are building blocks for a Formal
Context-Free Linear Grammar. The purpose of this section is to refresh prerequisite
knowledge and show applications of those structures to language modeling.
The second section explains the axioms used to define a context-free linear grammar
and shows examples of the flexibility of the structure. The purpose of this section is to
show in detail how a CFG is constructed from the building blocks of section one. The
section concludes with a discussion of transformations between grammars, which are
important structures for sentence formation or even translation work in natural language
computing.
The final section describes how two CFGs can be analyzed using topological
techniques to determine their similarity. The purpose of this section is to show further
application of the previous two sections' content. Though the final proof given in this
section is incomplete, it is included to promote further study in this area, again applicable
to translation work.
Section I : Algebraic Structures
Key to visualizing and solving a problem in mathematics is a deep knowledge of how the
structure of that problem is set up. In this section, I will explain the use of the various algebraic
structures that contribute to the formalization of context-free grammars.
The basic building block of natural language is an alphabet. We define an alphabet as a finite
set of distinguishable elements called characters. For example, we can define an alphabet of six
characters as a set $\Sigma$ such that $|\Sigma| = 6$. It is important to note that $\Sigma$ has no underlying
structure and is simply an unordered set of elements (Hockett, 55.)
A semigroup is a non-empty set of elements closed under an associative binary operation. Let $S$ be a
semigroup $(S, \cdot)$. Then for any $x, y, z \in S$, $(x \cdot y) \cdot z = x \cdot (y \cdot z)$ will hold. In terms of our
alphabet we are interested in a more specific type of semigroup called a free monoid (Hockett, 52.)
A free monoid is a semigroup with an identity $e$, the null string, whose associative operation is
concatenation. Let $F$ be a free monoid; then $F(\Sigma)$ is the free monoid over the alphabet $\Sigma$, whose
elements are all the finite strings over $\Sigma$, such that $F(\Sigma) = \{ x \mid \exists\, i \ge 0\ [\, x \in \Sigma^i \,] \}$.
Since $F(\Sigma)$ contains all the finite strings over $\Sigma$, its order is infinite. For example, $F(\Sigma)$ must
contain all the individual characters of $\Sigma$ along with all the combinations of their concatenation.
If $\Sigma$ is simply the null set, then $F(\Sigma)$ contains only the null string. Let $\Sigma' \subset \Sigma$ be a proper
subset and let $\Sigma$ continue to be as defined above. Then it follows that $F(\Sigma') \subset F(\Sigma)$, since
$F(\Sigma)$ contains an infinite number of elements which differ from those of $F(\Sigma')$ (Hockett, 56.)
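The properties above can be checked on a finite slice of a free monoid. The following is only a sketch: the six-character alphabet `SIGMA`, the sub-alphabet `SIGMA_SUB`, and the helper `strings_up_to` are hypothetical names chosen for illustration, and strings are cut off at a maximum length because the full monoid is infinite.

```python
from itertools import product

def strings_up_to(alphabet, max_len):
    """Finite slice of the free monoid over `alphabet`: all strings of length <= max_len."""
    letters = sorted(alphabet)
    return {"".join(p) for n in range(max_len + 1) for p in product(letters, repeat=n)}

# Hypothetical six-character alphabet and a proper sub-alphabet (assumptions).
SIGMA = set("abcdef")
SIGMA_SUB = set("abc")
NULL = ""  # the identity element: the null string

full = strings_up_to(SIGMA, 2)
sub = strings_up_to(SIGMA_SUB, 2)

# Identity and associativity of concatenation on a few sample strings.
x, y, z = "ab", "c", "fa"
assert NULL + x == x == x + NULL
assert (x + y) + z == x + (y + z)

print(len(full), len(sub))      # sizes of the two slices
print(sub < full)               # proper-subset test on the slices
print(strings_up_to(set(), 3))  # the null alphabet yields only the null string
```

Raising `max_len` makes both slices grow without bound, which is the finite shadow of the statement that the monoid has infinite order whenever the alphabet is non-empty.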
It is important to note from the previous example that, since $\Sigma' \subset \Sigma$, there must be some
isomorphism $f : F(\Sigma') \to F(\Sigma)$ of $F(\Sigma')$ onto its image. If we let $H : F(\Sigma') \to F(\Sigma)$ be the
inclusion group generated by $f$, then $H$ is a simple example of a language generated by the free
monoid $F(\Sigma)$ (Spanier, 2.)
Obviously, such a simple isomorphism does nothing to characterize a natural language. Our goal
in mathematical linguistics is to find isomorphisms $f_0, f_1, \ldots, f_n$ which generate a group $H$ such
that strings of $H$ mimic natural language. The collection of functions $f$ will be developed in the
next section into a definition of formal context-free grammars as linear generative grammars
(Hockett, 58.)
Section II : Formal Context-Free Grammars
A formal grammar is an absolute description of a group $H \subseteq F(\Sigma)$, where $\Sigma$ is some
finite alphabet. There is a wide variety of formal grammars, but this paper addresses
only a certain class of the larger category: those H that are linearly
generated and context free. Section III will address the implications and advantages of
linear generation, though the choice of this name will be evident after the definition of their
structure. Context-sensitive grammars, as opposed to the ones developed in this paper, are
groups $H \subseteq F(\Sigma)$ such that H is closed under a binary ordering operator. No further
discussion will be devoted to them (Charnial and Wilks, 55.)
Definition: Context-Free Linear Generative Grammar [1]
A linear generative grammar is a system G(A, I, T, R) characterized by the following postulates:
o A is a finite alphabet.
o I is a unique initial character of A.
o T is a proper subset of A – { I } called the terminal subalphabet.
o The characters N = A – T are called the non-terminal or auxiliary subalphabet.
o R is a non-null finite set of rule isomorphisms $\{ R_i \}_m$. Each rule is a function whose domain is the free monoid
F(A) and whose image is some subset thereof. If $s$ is any string over A then $R(s)$ is unique over A.
Postulate P1.
For every rule R, $R(\varnothing) = \varnothing$.
A non-null string over the terminal subalphabet T is a terminal string.
Postulate P2.
If $s$ is a terminal string and R is any rule, then $R(s) = \varnothing$.
Let $S = \{ (R_i)_n \mid n \ge 1 \}$, where each $R_i$ is some rule of R. For a given string s over A let:
$s_1 = R_1(s), \ldots, s_n = R_n(s_{n-1})$. Then S, like R, is a function whose domain is F(A) and whose image is a subset
thereof. If, for this sequence, there exists some string s over A for which $S(s) = s_n$ is non-null, then we say the
sequence is a rule row. Thus if $S(s) \neq \varnothing$ then we can say S is a rule which takes s as an instring and generates
S(s) as an outstring.
Postulate P3.
Given any rule row S, and any string s over A acceptable as an instring to S, then $S(s) \neq s$.
A rule R such that $R(I) \neq \varnothing$ is an initial rule. If $R \in S$ then S is an initial rule row and S(I) depends on the
choice of $S \subseteq R$. If S(I) generates a terminal string, then S is called a rule chain.
Postulate P4.
Every rule of R appears on at least one chain.
From P3, circuit formation is prohibited because no S can generate itself.
Note that R can contain any number of duplicate rules R.
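The postulates can be read as executable checks. The sketch below is not Hockett's formalism verbatim: it assumes that a rule can be stored as a lookup table from instrings to outstrings, that the empty Python string stands in for the null string, and it uses a hypothetical two-rule toy grammar (I to X, X to y) purely for illustration. The names `make_rule`, `apply_row`, and `check_p1` through `check_p3` are invented here.

```python
NULL = ""  # stands in for the null string

def make_rule(table):
    """A rule as a total function on strings: listed instrings map to their
    outstrings, every other string maps to the null string."""
    return lambda s: table.get(s, NULL)

def apply_row(row, s):
    """Apply rules R1, ..., Rn in order: s1 = R1(s), ..., sn = Rn(s_{n-1})."""
    for rule in row:
        s = rule(s)
    return s

def check_p1(rules):
    """P1: every rule sends the null string to the null string."""
    return all(rule(NULL) == NULL for rule in rules)

def check_p2(rules, terminal_strings):
    """P2: every rule sends each sampled terminal string to the null string."""
    return all(rule(t) == NULL for rule in rules for t in terminal_strings)

def check_p3(row, instrings):
    """P3: on every acceptable instring, the row's outstring differs from the instring."""
    return all(apply_row(row, s) != s
               for s in instrings if apply_row(row, s) != NULL)

# Hypothetical toy grammar: A = {I, X, y}, T = {y}.
RULES = [make_rule({"I": "X"}), make_rule({"X": "y"})]

print(check_p1(RULES), check_p2(RULES, ["y", "yy"]), check_p3(RULES, ["I", "X", "y"]))
print(apply_row(RULES, "I"))
```

P2 can only be sampled here, since there are infinitely many terminal strings; the table representation guarantees it whenever no terminal string appears as a key.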
From the definition above, a trivial grammar G can be constructed as an example to find
$H(G) \subseteq F(A)$ such that $0 < |H(G)| = n < \infty$. Granted, this trivial case in no way
resembles a natural language, but it shows all the steps necessary to find some H(G) which
does.
Example One: [2]
Let A = { I, a } and let T = { a }. Let R contain the rule
\[ R(s) = \begin{cases} a & \text{if } s = I \\ \varnothing & \text{otherwise.} \end{cases} \]
The alphabet A is a finite set of order two, T is the terminal alphabet of order one, and R
contains one rule, which transforms the initial string I into a. For all $s \in F(A)$, there is
only one acceptable instring, $s = I$ (with $I \to a$), thus G generates $H(G) \subset F(T)$ of order one
containing only the string “a”. Though this example is simplistic, it helps to illustrate the
basic procedure. The next example expands upon Example One and shows the generation
of a language H(G) of a more complex structure, more applicable to the discussion of
Section III.
[1] This definition is taken from Hockett, 59-61 and contains only slight modifications to better fit the needs of this paper.
[2] This example was derived from Hockett, 62 and contains slight modifications to better fit the needs of the paper.
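Example One is small enough to enumerate by machine. The sketch below assumes the empty Python string plays the role of the null string and walks outward from I for a bounded number of rule applications; `rule_1` and `language` are hypothetical helper names, and the step bound is an assumption needed only to keep the search finite.

```python
NULL = ""  # stands in for the null string

def rule_1(s):
    """The single rule of Example One: I goes to a, everything else to null."""
    return "a" if s == "I" else NULL

def language(rules, initial, terminals, max_steps):
    """Collect the terminal strings reachable from `initial` by rule rows
    of length at most `max_steps`."""
    found = set()
    frontier = {initial}
    for _ in range(max_steps):
        next_frontier = set()
        for s in frontier:
            for rule in rules:
                out = rule(s)
                if out == NULL:
                    continue  # s is not an acceptable instring for this rule
                if all(ch in terminals for ch in out):
                    found.add(out)          # a terminal string: the chain ends here
                else:
                    next_frontier.add(out)  # still contains non-terminals
        frontier = next_frontier
    return found

print(language([rule_1], "I", {"a"}, max_steps=3))
```

Terminal strings are not expanded further, which mirrors P2: every rule maps a terminal string to the null string.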
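Example Two: [3]
Let A = { I, b, B, l, L, p, P } and let T = { b, l, p }.
Let R be the set containing the following rules:
\[ R_1(s) = \begin{cases} BL & \text{if } s = I \\ \varnothing & \text{otherwise} \end{cases} \]
\[ R_2(s) = \begin{cases} bL & \text{if } s = BL \\ \varnothing & \text{otherwise} \end{cases} \]
\[ R_3(s) = \begin{cases} blP & \text{if } s = bL \\ \varnothing & \text{otherwise} \end{cases} \]
\[ R_4(s) = \begin{cases} blp & \text{if } s = blP \\ \varnothing & \text{otherwise} \end{cases} \]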
Let G(A, I, T, R) be a grammar and let C(I) = s be some permutation of S(I) which
generates the terminal string for the system of rules.
Visually, a certain C(I) for this G would be structured as follows:
          I
R1 :      B L
R2 :      b L
R3 :      b l P
R4 :      b l p
Note that the ordering of the rules of R is not necessary since only permutations of S
which generate valid outstrings are acceptable for a choice of C(I). Observe that the
structure of C gives rise to a tree topology (Blackett, 165.)
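The remark about rule ordering can be checked mechanically. The sketch below assumes a whole-string reading of the four rules, read off the derivation I, BL, bL, blP, blp shown for this example, and treats the empty Python string as the null string. It tries every permutation of the rules and keeps those that end in a terminal string; `make_rule`, `run`, and `rule_chains` are hypothetical names.

```python
from itertools import permutations

NULL = ""  # stands in for the null string

def make_rule(instring, outstring):
    """A rule that rewrites exactly one instring and sends everything else to null."""
    return lambda s: outstring if s == instring else NULL

# Assumed whole-string reading of R1..R4, taken from the derivation in the text.
RULES = {
    "R1": make_rule("I", "BL"),
    "R2": make_rule("BL", "bL"),
    "R3": make_rule("bL", "blP"),
    "R4": make_rule("blP", "blp"),
}
TERMINALS = set("blp")

def run(order, start="I"):
    """Return every intermediate string produced by applying the named rules in order."""
    steps = [start]
    for name in order:
        steps.append(RULES[name](steps[-1]))
    return steps

def rule_chains():
    """All orderings of the four rules whose final outstring is a terminal string."""
    chains = []
    for order in permutations(RULES):
        final = run(order)[-1]
        if final != NULL and set(final) <= TERMINALS:
            chains.append(order)
    return chains

print(rule_chains())
print(run(("R1", "R2", "R3", "R4")))
```

Under this reading only one of the 24 orderings survives, so the set R can be stored unordered and the acceptable choice of C(I) recovered by search.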
Let T0 be a topological network formed by the ordering of C(I) for the above permutation
of S. Then T0 follows, where each level of the tree represents a transformation by Rn
which either fixes elements or replaces them based on a given rule.
            I
           / \
R1        B   L
          |   |
R2        b   L
          |  / \
R3        b l   P
          | |   |
R4        b l   p
[3] This example was based on work throughout Hockett.
Since T0 is a topological space which is a subspace of F(A), we should be able to
choose a transformation c which will transform outstrings of G into outstrings of G’ for a
given C and C’.
o Let A be the English alphabet in upper and lower case. Let $A' = A \cup \{$ ‘(’, ‘)’ $\}$.
o Let $T = A - \{ x \in A \mid x \text{ is upper case} \}$.
o Define a function Coll(T, A) that takes a tree T and replaces all non-terminal characters
with unique non-terminal characters from A, maintaining ordering at the
vertices by use of ‘(’ and ‘)’. Define a second function Flat(T, T) which takes all
terminal characters of T and replaces them with x0 … xn.
o From Example Two, let c1 be a transformation equal to
Flat[Coll(T0, A), T] = I(A x1 B(C x2 D x3)).
o Let c2 = I(K(G x3 B x1) L(M(j) E x2)) be the collapse of a tree T1 generated by G’
and C’(I).
o Let c be a function defined as $c : c_1 \to c_2$, which maps $x_0 \mapsto x_0, \ldots, x_n \mapsto x_n$.
Then any outstring s in $C(I) \subset H(G)$ can be transformed into an outstring s’ of $C'(I) \subset H(G')$.
Thus we can find morphisms over F(A) which can transform between rules of R
and R’.
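One way to read the map c is as slot matching between the two collapsed trees: bind each placeholder of c1 to a character of the outstring, then read the leaves of c2 in order. The sketch below is an interpretation, not a construction given in the paper. It assumes the two templates are encoded by hand as nested `(label, children)` tuples, and the names `C1`, `C2`, `leaves`, and `transform` are hypothetical.

```python
# Hand-encoded templates (an assumption): a leaf is a plain string,
# an internal node is a (label, [children]) pair.
C1 = ("I", [("A", ["x1"]),
            ("B", [("C", ["x2"]), ("D", ["x3"])])])
C2 = ("I", [("K", [("G", ["x3"]), ("B", ["x1"])]),
            ("L", [("M", ["j"]), ("E", ["x2"])])])

def leaves(node):
    """Return the leaves of a template, left to right."""
    if isinstance(node, str):
        return [node]
    _label, children = node
    found = []
    for child in children:
        found.extend(leaves(child))
    return found

def transform(outstring, source, target):
    """Bind the placeholders of `source` to the characters of `outstring`,
    then substitute them into the leaves of `target`. Leaves of `target`
    that are not placeholders of `source` (such as j) are kept as-is."""
    slots = leaves(source)
    if len(slots) != len(outstring):
        raise ValueError("outstring does not fit the source template")
    binding = dict(zip(slots, outstring))
    return [binding.get(leaf, leaf) for leaf in leaves(target)]

print(transform(list("blp"), C1, C2))
```

For the outstring b l p of Example Two, the placeholders bind as x1 = b, x2 = l, x3 = p, and the result follows the leaf order x3, x1, j, x2 of c2.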
Now we must make the transition from modeling random strings of characters to
modeling sentences. In order to do this, we must define a bijection $\sigma : L \to A$ which
maps all the words and grammar parts from a lexicon L into an alphabet A. Then we can
choose rules in R that appropriately model sentence structure. As before, some rule sets
may yield $|H(G)| = \infty$ while others may yield finite languages. Most notably, any
compound sentences will yield a language of infinite order (Kuroda, 174.) When dealing
with languages of infinite order, the algebraic techniques described above tell us nothing
about an entire language. In this scenario, the techniques of the next section will help.
With the algebraic tools defined in this section, there is nothing but computation
necessary to develop an H(G) which is isomorphic with a given natural language. It follows
from this that two distinct natural languages could be modeled as H and H’. Having two
distinct models of natural languages leads to the question of whether it is even possible to find
transformations between them that work as a translation agent. As a first step, Section III
will discuss mathematical tools that we have to investigate the similarity between two
phrase structures of infinite order.
Section III : Topology Applied to Formal Context-Free Grammar Structures
A linear generative grammar defines the structure of the H(G) subgroup. The structure is
defined as the semigroup consisting of all $s = C(I) \subset S(x)_{x \in F(A)}$, which can be finite or infinite. If
instead we think of H(G) as defined in the following way,
$H(G) = \{ S(s) \supset C_n(I) \neq \varnothing \mid 0 < n \le m,\ s \in F(A) \}$,
with m paths through S(R), then we can construct a topological analog, much as was done
for Example Two.
Topology allows us to talk about one thing being near another within a space. The topological space
for our tree structures is the free monoid over the alphabet under the influence of the grammar
structure. If a homeomorphism can be found between two topological spaces that maps the
neighborhoods of one space near enough to the other's, then the two spaces can be considered
structurally equivalent (Kuroda, 175.)
We define our topological space as follows [4]:
o Let G be a grammar structure, G(A, I, T, R).
o Let F(A) be the free monoid over the alphabet.
o Let H(G) be the language induced by G and let H have order $\infty$.
Then (H(G), F(A)) meets the axioms for a topological space.
1. $\varnothing \in F(A)$ and $H(G) \in F(A)$.
2. For all $U_1, U_2 \in F(A)$, $U_1 \cap U_2 \in F(A)$.
3. If $X \subset F(A)$, then $\bigcup X \in F(A)$.
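Read in the usual way, the three axioms say that the empty set and the whole space are open and that the open sets are closed under intersection and union. Since H(G) is infinite here, they cannot be verified by enumeration, but the following sketch checks them on a small finite stand-in. The function `is_topology` and the three-point example are hypothetical and are not drawn from the grammar itself.

```python
from itertools import combinations

def is_topology(points, opens):
    """Check the open-set axioms for a family `opens` of subsets of a finite set `points`."""
    space = frozenset(points)
    family = {frozenset(u) for u in opens}
    # Every member must be a subset of the space.
    if not all(u <= space for u in family):
        return False
    # Axiom 1: the empty set and the whole space are open.
    if frozenset() not in family or space not in family:
        return False
    # Axioms 2 and 3: for a finite family, closure under pairwise
    # intersections and unions implies closure under all of them.
    for u, v in combinations(family, 2):
        if u & v not in family or u | v not in family:
            return False
    return True

POINTS = {"a", "b", "c"}
GOOD = [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]
BAD = [set(), {"a"}, {"b"}, {"a", "b", "c"}]  # the union {a, b} is missing

print(is_topology(POINTS, GOOD), is_topology(POINTS, BAD))
```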
The base for the space with respect to a given $x \in H(G)$ is
$B(x) = \{ S(x) \supset C_n(I) \neq \varnothing \mid 0 < n \le m \}$. Then the neighborhood system for the space is
defined as $\{ B(x) \}_{x \in H(G)}$, and meets the axioms for a neighborhood system.
1. For every $x \in H(G)$, $B(x) \neq \varnothing$, and for every $C \in B(x)$, $x \in C$.
2. If $x \in C \in B(y)$, then there exists a $V \in B(x)$ such that $V \subset C$.
3. For any $U_1, U_2 \in B(x)$ there exists a $U \in B(x)$ such that $U \subset U_1 \cap U_2$.
With these preconditions, we can proceed to analyze nearness or structural similarity of two
spaces.
Let K and K’ be languages generated by discrete grammars G and G’ respectively.
Let T = (K, F(A)) and T’ = (K’, F(B)) be topological spaces which satisfy the axioms above.
Let P and P’ be arbitrary finite sets such that $P \subset K$ and $P' \subset K'$.
For $t \in T$, prune t so much as $t \in P$ and denote this new structure as $t_p$.
For $t' \in T'$, prune t’ so much as $t' \in P'$ and denote this new structure as $t'_p$.
Define new topological spaces $t_p = (P, K)$ and $t'_p = (P', K')$ which satisfy all the axioms above.
If $t_p \cong t'_p$ then t’ is near t relative to the pruning set.
Since $P \subset K$ for a given K, it is also a subset of B(x), the neighborhood system.
Let f0 and f1 be mappings such that $f_0, f_1 : (P, K) \to (P', K')$ relative to P.
Thus f0 is homotopic to f1 relative to P if there exists a mapping
$Z : (K, P) \times I \to (K', P')$, where I is the unit interval (Spanier, 23.)
The larger the sets P and P’, the more likely the homotopy is to exist and be continuous
from K into K’ because $P \subset B(x)$ (Kuroda, 183.)
[4] Engelking, 13.
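For reference, the conditions usually required of such a mapping Z can be written out. This is the standard textbook definition of a homotopy relative to a subset, stated in the notation of this section rather than derived from the grammar structure, and it assumes that f0 and f1 agree on P.

```latex
\[
Z : K \times I \to K', \qquad I = [0,1], \qquad Z \text{ continuous},
\]
\[
Z(x, 0) = f_0(x), \qquad Z(x, 1) = f_1(x) \qquad \text{for all } x \in K,
\]
\[
Z(p, t) = f_0(p) = f_1(p) \in P' \qquad \text{for all } p \in P,\ t \in I .
\]
```

The last line is what "relative to P" adds: the deformation from f0 to f1 is not allowed to move the strings of the pruning set.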
The results given in this section are by no means complete. The information is presented
here as a starting point for further research into the topology of various classes of phrase
structures.
Conclusion
This paper is the culmination of at least a year of research and has done nothing but
open further doors that await exploration by the author.
I have tried to give a brief overview of basic algebraic structures and their application to
linguistics. In the third section, the explanation may have gone off the deep end, and it is
obvious that more research needs to concentrate in this area.
In a final section, which I would have included space permitting, I would have liked to
investigate some of the more geometric representations of phrase structure applying
results from Geometric Topology and graph theory to obtain further analysis. This seems
the most promising area of study, since the algebraic homotopy relations do not yield to
easy or meaningful solutions. Matrix representations of phrase structure vertex and edge
equations might be easy to solve, as vector processing gets faster and faster on modern
computers.
My conclusion is, of course, as stated earlier: more research needs to be done before
any meaningful results can be derived.
o Hockett, Charles F. 1967. Language, Mathematics, and Linguistics. Mouton and Co, Paris.
o Spanier, Edwin H. 1996. Algebraic Topology. McGraw-Hill, New York.
o Kuroda, S. Y. 1987. “A Topological Approach to Structural Equivalences of Formal
Languages.” Mathematics of Language. University of California, San Diego.
o Engelking, Ryszard. 1989. General Topology. Heldermann Verlag, Berlin.