Download the proteome as a molecular language

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Two-hybrid screening wikipedia , lookup

Protein–protein interaction wikipedia , lookup

Transcript
THE PROTEOME AS A MOLECULAR LANGUAGE ('PROTEINESE').
Sungchul Ji, Department of Pharmacology and Toxicology, Rutgers
University, Piscataway, N.J. 08855. [email protected]
The purpose of this poster is to characterize and analyze the striking
similarities that exist between the proteome viewed as a molecular
language and the human language. Perhaps this is not surprising if one
accepts the thesis advocated in 1997 [2-4] that living cells use a
molecular language whose characteristics are similar to those of the
human language.
According to the cell language theory, cell functions are supported by
a hierarchy of molecular languages implemented by DNA, RNA, proteins,
and metabolites (e.g., ions, pyruvate, lactate, glucose, inorganic
phosphate, ATP, cAMP, etc.). In addition, we can recognize a fifth
language used primarily for intercellular communication through
exchange of diffusible ions and molecules such as hormones, cytokines,
prostaglandins, and morphogens. For convenience, we may adopt the
following names for these
languages:
1.
2.
3.
4.
5.
DNA language
RNA language
Protein language
Metabolite language
Intercellular language
=
=
=
=
=
'DNese'
'RNese'
'proteinese'
'metabolese'
'intercellese'
Thus, it may be stated that cellese consists of 5 different sublanguages, just as the computer language can be said to comprise a set
of sublanguages – i) digital logic language, ii) microarchitecture
language, iii) instruction set architecture language, iv) operating
system machine language, v) assembly language, and vi) problem-oriented
language [5].
The cell language theory [2 -4] postulates that there exists a set of
principles and rules obeyed by both cellese and humansese on the one
hand and by all the sub-languages of cellese on the other. These
principles
include:
i) the principle of syntagmatic and paradigmatic relations [2], and
ii) the principle of triple articulation (see Table 3 on p. 17 in
[4]).
Detailed discussions on these principles will not be given here since
they are avaialbe in the cited references and not directly related to
the comparison between proteinese and humanese, the focal point of this
poster.
The structure of 'proteinese' can be divided into 5 levels as shown in
Figure 1. The fiveness of Figure 1 and the fiveness of the number of
the sublanguages constituting cellese are not causally related (as far
as I can tell) and hence must be viewed as accidental. The first four
levels of organization can be identified with the well-known primary,
secondary,tertiary and quaternary structures of proteins. The fifth
structure is rather new (see for example [1]), having emerged only
recently due to the development of high-throughput experimental
techniques such as DNA arrays, enabling the study of genome-wide
behaviors of biopolymers.
Protein Networks
/\
||
Protein Complexes
/\
||
Folded Polypeptides
/\
||
Functional Domains
/\
||
Amino Acids
Figure 1.
(5)
(4)
(3)
(2)
(1)
The hierarchical organization of protein
language ('proteinese').
The 5-level structural organization of proteinese shown in Figure 1 is
compared with the 5-level structural organization of humanese in Table
1 (see the first three columns).
Table 1.
A structural and functional comparison of
protein language (proteinese) and human
language (humanese).
_____________________________________________________________
Level of
Organization
Proteinese
Humanese
Function
_____________________________________________________________
Primary
Amino acids
Phonemes
Units of Communication
_____________________________________________________________
Secondary
Functional
Morphemes Units of Meaning
Domains
_____________________________________________________________
Tertiary
Folded
Words
Units of Denotation
Polypeptides
_____________________________________________________________
Quaternary
Protein
Sentences Units of Judgments
Complexes
_____________________________________________________________
Quintic
Protein
Texts
Units of Reasoning
Networks
_____________________________________________________________
If the comparison in Table 1 is valid, we may infer the possible
functions of the different structures of proteins from the
corresponding functions of human language (see the last column). Thus,
the functional role of protein complexes is to make a judgment (e.g.,
to turn on or off gene expression), and that of a protein-protein
interaction network is to reason or compute, namely, to transform
inputs into outputs following a set of rules selected by biological
evolution.
The content of Table 1 can be viewed as a set of predictions made by
the cell language theory as applied to proteinese. It is hoped that
this poster will attract the interest of bioinformatics experts who can
test some of these predictions based on protein structural and
functional data available on the Internet.
References:
[1] Ganeur, J., Krause, R., Bouwmeester, T., and Casari, G. (2004).
Modular decomposition of protein-protein interaction networks. Genome
Biology 5:R57.
[2] Ji, S. (1997). Isomorphism between Cell and Human Languages:
Molecular Biological, Bioinformatic and Linguistic Implications.
BioSystems 44:17-39.
[3]
42.
Ji, S. (2002).
Microsemiotics of DNA.
Semiotica 138 (1/4): 15-
[4] Ji, S. (2005). Semiotics of Life: A Unified Theory of Molecular
Machines, Living Cells, the Mind, Peircean Signs, and the Universe
based on the Principle of Information-Energy Complementarity. Reports,
the Research Group on Mathematical Linguistics (Martin-Vide, C., ed.),
Rovira i Virgili University, Tarragona, Spain, pp. 1-183. The text
available at http://www.grlmc.com, under Publictions.
[5] Tanenbaum, A. S. (2003). Structured Computer Organization.
Edition. Prentice-Hall of India, New Delhi. P. 5.
Fourth