Download BKLR1

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

DNA vaccination wikipedia , lookup

Anti-nuclear antibody wikipedia , lookup

Innate immune system wikipedia , lookup

Immune system wikipedia , lookup

Hygiene hypothesis wikipedia , lookup

Immunocontraception wikipedia , lookup

Antibody wikipedia , lookup

Molecular mimicry wikipedia , lookup

Adaptive immune system wikipedia , lookup

Duffy antigen system wikipedia , lookup

Immunomics wikipedia , lookup

ELISA wikipedia , lookup

Psychoneuroimmunology wikipedia , lookup

Cancer immunotherapy wikipedia , lookup

Monoclonal antibody wikipedia , lookup

Immunosuppressive drug wikipedia , lookup

Polyclonal B cell response wikipedia , lookup

Transcript
Immune-based transformations 1
Symeon Bozapalidis2, Antonis Kalampakas3,
Olympia Louskou-Bozapalidou, George Rahonis4
Department of Mathematics
Aristotle University of Thessaloniki
54006 Thessaloniki
Greece
Abstract
We introduce two models of transducers, the former admitting as input
words, and the later trees. With these machines, we try to describe
formally, some of the most important operations of the human immune
system. Moreover, the transducers will serve as models of intrusion
detection systems.
1. Introduction
The known computer security systems are based on the idea that using an appropriate
name and a password recognize each user. But the software technology progress
admits nowadays to the computer system intruders to use suitable programs in order
“to break” the security laws. On the other hand, the virus detection problem is very
crucial, since all the anti-virus programs can recognize only known viruses, but they
are not effective for unknown ones. This practically means that any anti-virus
program cannot detect a new virus, until the program will be updated so that it can
recognize it. Clearly, it would be very suitable, if a computer security system could
recognize any foreign intruder as well as if it could detect any new virus. But this is
exactly the ability of the human’s (or more general the vertebrates’) immune system.
Indeed the human’s immune system is an intelligent protective mechanism that can
recognize any foreign intruder called antigen even if it had not met this antigen
before. Moreover, it kills this antigen producing appropriate antibodies and learns to
detect the same antigen in a more fast way the next time it will meet it in the future.
This happens since the immune system produces large amounts of special antibodies
for the foreign intruder, the killers and the memory antibodies. A point which shows
how sophisticated is the immune system is that the antibodies and the antigens have
the same structural units which are twenty amino-acids. Thus the human’s security
system in fact has the ability to discriminate between the self and the non-self cells.
The last years the computer scientists try to apply the operating strategy of the
human’s immune system to the computer security systems. Many models have been
constructed but still they do not seem suitable enough to be applied in practice. The
reader will find much of this work from the group of Prof. St. Forrest but also other
references at [2]. Still in this very promising research topic there is not a theory
background that could lead to good applications. In this paper, we try a first approach
1
Supported by the EC Project IST-2000-26016
Email: [email protected]
3
Email: [email protected]
4
Email: [email protected] (Corresponding author)
2
to this problem from the viewpoint of Theoretical Computer Science. We construct
two transducers in order to simulate the most important operations of the human’s
immune system. The first one works on strings, which regarded to be the antibodies,
the antigens, and the bound complexes, which are, bound antigens and antibodies.
This binding operation between an antibody and an antigen takes place at certain sites
of the two types of cells. These sites are called binding sites. The transducer has the
ability to recognize if there is any binding site in a given input (a cell or a data in
computer terminology) and to delete the corresponding to antigen part of the site.
Moreover, it learns, that is, it obtains a memory to delete the same part in the
remaining string.
The other model regards the bounded complexes as trees, so it is a tree transducer. If
the transducer recognizes a binding site then starts to produce the corresponding killer
and memory nodes of the input tree. Thus the output is a non-infected forest with
large amounts of killers and memory nodes in its trees. We show that such a
transducer can easily be simulated by an alphabetic tree transducer [6], and thus the
transformation induced by such a machine is alphabetic [1]. By this result interesting
properties are obtained for this new object.
Tarakanov and Dasgupta have done another approach to immune-based systems in
[8], using the notions of formal proteins and networks of bindings.
We assume the reader to be familiar with notions from classical formal language
theory [7] and tree language theory [3,4]. We describe only some notations that will
be used in the sequel.
An alphabet is a finite set Σ, whereas Σ* denotes the free monoid generated by Σ, that
is the set of all words over Σ. If Σ,Γ are two alphabets then ΣΓ={σγ / σΣ, γΓ}, and
Σ*Γ*={wu / wΣ*, uΓ*}.
A finite ranked alphabet is a set Σ such that to each σΣ, we associate a natural
number denoted by rank(σ) and called the rank of σ. A letter might have more than
one ranks. The set of all trees TΣ over the ranked alphabet Σ, is defined inductively:
- σ TΣ for σΣ0,
- if t1,,,tnTΣ and σΣn, then σ(t1,,,tn)TΣ.
A forest or tree language is a set F TΣ.
2. Immune-based transducers
2.1. A detector transducer on words
We consider the set Σ of amino acids that is a set of twenty letters. Then each protein
is a word over this alphabet. As it happens in nature, we assume that two proteins (e.g.
one antibody and one antigen) can bind at specific sites called determinants or
binding sites according to the binding specificity theory [5].
For simplicity sake, we consider the binding sites to be a subset A of ΣΓ where Γ is a
copy alphabet of Σ. Γ will stand for the alphabet of structural units of the antigens. Let
A={r1,r2,…,rn}, with ri=σiγi, i[n] ([n]={1,2,…,n}). For each ri=σiγi, we write
left(ri)= σi and right(ri ) =γi. Then left(A)={σ1,σ2,…,σn}, and right(A)={γ1,γ2,…,γn}. We
say that a word w(ΣΓ)* is a bound complex string if w=u1uu2, with u1,u2(ΣΓ)*
and uA. Consider the finite set
2
Q  {q0 , q 1 ,..., q n }  {q ri1 ... ri p / rik  ri ,  ,   [i p ], i p  [n]} 
{q ri1 ... ri p  /   left ( A), rik  ri ,  ,   [i p ], i p  [n]}.
Definition 1. A detector transducer is a five-tuple D=(Σ,Γ,Q,q0,R), with input and
output alphabet the set ΣΓ. Q is the finite set of states and q0Q is the initial state. R
is the finite set of rules, defined by:
 q0 ( )  q0
 q0 ( )  q
 q ( )  qr
 q ( )   q0
 qr ( )  qr
 q r ( )  q r
 q r ( )   q r
if
if
if
if
if
if
     and   left ( A)
  left ( A)
r    A
 ' A
  left ( A)
  left ( A)
if  ' A
 q r ( )  q rr 
if r     A, r  r 
 qri1 ... rik ( )  qri1 ... rik
if   left ( A),    j , j  [rik ], for each ik  [n]
 q ri1 ... rik ( )  q ri1 ... rik
if    j , j  [rik ], for each ik  [n]
 qri1 ... rik ( )  qri1 ... rik 
if   left ( A), for each ik  [n]
 qri1 ... rik  ( )   qri1 ... rik
if  ' A, ik  [n] , for each ik  [n]
 qri1 ... rik  ( )  qri1 ... rik rik 1
if rik 1    A, rik 1  r j , j  [ik ], for each ik  [n]
 qri1 ... rik  ( )  qri1 ... rik
if r j    A, j  [ik ] , for each ik  [n] .
Intuitively, the detector takes as input a word (a data) over the alphabet ΣΓ and
starts the process at the initial state q0. When the detector processes a letter α which is
not a left letter of any binding site, i.e. αleft(Α), then leaves the state unchanged. If it
processes a symbol σ such that σ=left(r), rA, then it goes to the state qσ. If the next
symbol is γ such that r=σγ, then deletes γ and transfers the control of the system to the
state qr. From this time the detector has learnt to recognize the symbol γ and it deletes
each appearance of this letter in the rest of the word. If the letter after σ was  , such
that  ' A, then the transducer should return to the state q0, that is no bound complex
should had been recognized. Then, the transducer processes the next letter, which if it
is a left letter σi of A, then it transfers the control to q r  . If the next letter is γι, such
that σiγi = r   A, then the transducer deletes the antigen symbol γi and goes to the
state qrr . Thus, each time the detector recognizes a binding site ri=σiγi, deletes (kills)
the antigen symbol γi. Moreover, it learns (that is it obtains a memory) to do the same,
each time the same antigen symbol appears in the rest of the sequence, even if this
antigen symbol does not belong to a binding site. The learning ability is described
from the movement to a state, which contains as index the corresponding symbol ri.
3
It is clear that the above model does not contain empty moves and so it is real-time.
With this transducer, we simulate the self and non-self recognition process of the
vertebrates' immune system [5].
Alternatively, we can construct the detector transducer in such way, that the binding
sites are words from the set Σ*Γ*. In this case we can simulate the maturity of the
antibodies after the detection of an antigen. Moreover, such a model can be equipped
with a control so that it can recognize a binding site under certain conditions, for
example, if we consider the determinants with a weight. In this case, we assume that
binding happens only if the binding energy is suitable (e.g. less than a threshold) [5].
With this condition we can simulate the fault tolerance that is needed in a virusdetecting system. It is known [5] that the human’s immune system is also fault
tolerant in order to avoid autoimmunity. The investigation of such models is the scope
of our future research.
2.2. A detector transducer on trees
We consider as above the same alphabets and binding sites. Now, the proteins are
trees over the alphabet ΣΓ{α} where α is a new symbol of rank 2 and the symbols
from ΣΓ play the role of leaves. We define a detector tree transducer, to be a seventuple D=(ΣΓ,Q,Σm,Σk,d,q0,R), where Σm, Σk are two copies of Σ and stand for the set
of memory and killer cells, respectively. The inputs are trees with leaves in ΣΓ. If the
detector processes a binding site, then it deletes (kills) the antigen and produces with
the help of the symbol d (which simulates the ability of the human immune system to
produce the antibodies), the killer and memory cells corresponding to this binding
site. These memory cells have the specificity to bind with the same antigen in the
future, that is the detector learns to recognize the foreign intruders.
From the biochemistry point of view, with this model we try to simulate the clonal
selection ability of the immune system [5].
Let us describe the above in a more formal way. We denote by Σ0 the set of all amino
acids and let Γ0 be a copy of it, serving as the set of structural units of the antigens.
Consider now the finite ranked alphabets Σ and Γ, with Σ=Σ2=Σ0 and Γ=Γ2=Γ0. A set
A Σ0Γ0 is the set of binding sites. We define the ranked alphabet ΣΓ such that
(ΣΓ)0=(ΣΓ) and (ΣΓ)2=(ΣΓ)0{α}.
An element of TΣΓ is called a bound complex tree if it contains at least one symbol
from A.
Proposition 2. The set of all bound complexes is a recognizable forest.
Proof. We can use standard constructions from tree automata theory. Indeed, a
bottom-up tree automaton can recognize the set of all bound complexes [3,4].
Definition 3. A detector tree transducer is a scheme T=(ΣΓ{α},Σm,ΣkQ,d,q0,R),
with input alphabet ΣΓ{α}, output alphabet ΣΣmΣk where Q is the finite set of
states defined as Q  {q0 }  {q1r , q 2r / r   ,   A} , q0 is the initial state, and d is a
new symbol. R is the finite set of rules:
 q 0 ( ( x1 , x 2 )   (q0 ( x1 ), q 0 ( x 2 ))
 q0 (  ,  )  
if   ,   A
 q0 (  ,  )   (q1r (d ), q 2r (d ))
if r   ,   A
 q1r (d )   m (q1r (d ), q1r (d ))
4
σm Σm
 q1r (d )   m
 q (d )   k (q (d ), q (d ))
σk Σk
 q 2r (d )   k
σk Σk.
r
2
r
2
r
2
The detector tree transducer works in the following way: it admits as input a tree,
where all its nodes with rank 2, are labeled by the symbol α, and its leaves are
symbols in ΣΓ. The transducer leaves unchanged the symbol α. If a leaf that does not
belong into A is being processed, then the detector produces the left part of the leaf. If
the leaf in process is an element r=<σ,γ> in A, then the output of the transducer is the
tree  (q1r (d ), q 2r (d )) . From this time the transducer obtains the ability to produce
memory symbols of σ, with the help of the node q1r (d ) , and killer symbols of σ, with
the help of the node q 2r (d ) .
With this model, we simulate the ability of the human immune system to produce
killer and memory cells, after the recognition of an antigen.
The tree transformation induced by a detector tree transducer is defined classically
[3,4,6] and is denoted by D.
We denote by DETT the class of the tree relations induced by all detector tree
transducers.
It is not difficult to see that for a detector tree transducer D, there exists an alphabetic
tree transducer T [6], such that such that D=T.Thus
Proposition 4. It holds DETTAT where AT denotes the class of alphabetic relations.
By the above proposition and the corresponding results in [6], we obtain the next:
Corollary 5. The domain and the range of each detector tree transducer are
recognizable forests.
Corollary 6. The detector tree transducers preserve the recognizable, algebraic, one
counter, restricted one counter and generalized synchronized forests.
Conclusion. With the two detector transducers, we try a first approach to model the
human’s immune system from the Theoretical Computer Science point of view. The
machines are real-time, that is they do not contain empty moves which means that
they do not get into infinite useless loops. Thus they can serve as models for computer
security systems. Of course in this case where we need smaller alphabets e.g. the
binary alphabet, the constructions get simpler.
In a forthcoming paper, we compare the computational power of these models with
those constructed by Tarakanov and Dasgupta [8], and we try to apply them as a
theoretical basis to the work of St. Forrest and others [2].
3. References
1) Bozapalidis S., Alphabetic tree relations, TCS 99, 1992, 177-211.
2) Forrest S., http://www.cs.unm.edu/~forrest/abstracts.htm.
5
3) Gecseg F. and Steinby M., Tree Automata, Akademiai Kiado, Budapest, 1984.
4) Gecseg F. and Steinby M., Tree languages, in: Rozenberg G., Salomaa A. (Eds.),
Handbook of Formal Languages, vol. III, Springer, Berlin, pp. 1-68.
5) Lehninger A., Principles of Biochemistry, Worth Publishers, Inc. 1982.
6) Rahonis G., Alpabetic and synchronized tree transducers, TCS 255, 2001, 377399.
7) Salomaa A., Formal Languages, Academic Press, New York, 1973.
8) Tarakanov A. and Dasgupta D., A formal model of an artificial immune system,
Biosystems, 55(1-3), 2000, 151-158.
6