Council for Innovative Research
www.cirworld.com
International Journal of Computers & Technology
Volume 3 No. 3, Nov-Dec, 2012
Conceptual Translation as a part of Gene Expression
Sukhjit Singh Sehra
Assistant Professor
Guru Nanak Dev Engineering College, Ludhiana, Punjab
Sumeet Kaur Sehra
Assistant Professor
Guru Nanak Dev Engineering College, Ludhiana, Punjab
ABSTRACT
The major problem faced by biologists and researchers is that huge amounts of raw data are available, but the means to use this data effectively are lacking. DNA is the main constituent of the genetic material of a living organism. The information stored in DNA is used to make a more transient, single-stranded polynucleotide called RNA (ribonucleic acid). The process of making an RNA copy of a gene is called transcription and is accomplished through the enzymatic activity of an RNA polymerase. There is a one-to-one correspondence between the nucleotides used to make RNA (G, A, U, C) and the nucleotide sequence in DNA (G, A, T, C respectively). The next process converts the RNA transcript into mRNA: the introns are skipped, so the resulting mRNA contains exons only. The process of converting the information in the nucleotide sequence of mRNA into the amino acid sequence that makes up a protein is called translation. In the present work, the conceptual translation of gene sequences into the corresponding amino acid sequences is implemented. The mRNA to protein sequence conversion is done by dividing the mRNA sequence into groups of three nucleotides called codons. These codons are replaced by the corresponding amino acids, which gives the resultant protein. All the possible amino acid sequences are displayed.
Keywords
DNA, RNA, amino acid, codons.
1. THE GENETIC MATERIAL
DNA is the main constituent of the genetic material within a body. DNA is transcribed into RNA, which is then translated into protein.
1.1 DNA (Deoxyribonucleic acid)
DNA (deoxyribonucleic acid) is the genetic material. This is a profoundly powerful statement to molecular biologists. It is the information stored in DNA that allows the organization of inanimate molecules into functioning living cells and organisms that are able to regulate their internal chemical composition, growth, and reproduction. As a direct result, it is also what allows us to inherit our mother’s curly hair, our father’s blue eyes, and even our uncle’s too-large nose. The various units that govern those characteristics at the genetic level, be it chemical composition or nose size, are called genes [4][6].
1.2 RNA (Ribonucleic acid)
RNA is a nucleic acid polymer consisting of nucleotide monomers that plays several important roles in the processes that translate genetic information from deoxyribonucleic acid (DNA) into protein products: RNA acts as a messenger between DNA and the protein synthesis complexes known as ribosomes, forms vital portions of ribosomes, and acts as an essential carrier molecule for amino acids to be used in protein synthesis. RNA is very similar to DNA, but differs in a few important structural details: RNA is typically single stranded, while DNA is typically double stranded. Also, RNA nucleotides contain ribose sugars while DNA contains deoxyribose, and RNA uses predominantly uracil instead of the thymine present in DNA. RNA is transcribed from DNA by enzymes called RNA polymerases and further processed by other enzymes [7]. Different kinds of RNA serve as the template for translation of genes into proteins, transfer amino acids to the ribosome to form proteins, and translate the transcript into proteins [4][6].
1.2.1 Messenger RNA (mRNA)
Messenger RNA is RNA that carries information from DNA to the ribosomes, the sites of protein synthesis in the cell. In eukaryotic cells, once mRNA has been transcribed from DNA, it is "processed" before being exported from the nucleus into the cytoplasm, where it is bound to ribosomes and translated into its corresponding protein form with the help of tRNA. In prokaryotic cells, which are not partitioned into nucleus and cytoplasm compartments, mRNA can bind to ribosomes while it is being transcribed from DNA [2]. After a certain amount of time the message degrades into its component nucleotides, usually with the assistance of ribonucleases [4][6].
1.2.2 Transfer RNA (tRNA)
Transfer RNA is a small RNA chain of about 74-95 nucleotides that transfers a specific amino acid to a growing polypeptide chain at the ribosomal site of protein synthesis during translation. It has a site for amino acid attachment and an anticodon region for codon recognition that binds to a specific sequence on the messenger RNA chain through hydrogen bonding. It is a type of non-coding RNA [4][6].
2. THE CENTRAL DOGMA
The sequence of nucleotides in a DNA molecule can have important information content for a cell. It is actually proteins that do the work of altering the cell’s chemistry by acting as biological catalysts called enzymes. The process by which information is extracted from the nucleotide sequence of a gene and then used to make a protein is essentially the same for all living things on earth and is described by the grandly named central dogma of molecular biology.
The Central Dogma of molecular biology relates DNA, RNA, and proteins. Briefly put, the Central Dogma makes the following claims.

• The amino acid sequence of a protein provides an adequate “blueprint” for the protein’s production.

• Protein blueprints are encoded in DNA in the chromosomes. The encoded blueprint for a single protein is called a gene.

• A dividing cell passes on the blueprints to its daughter cells by making copies of its DNA in a process called replication.

• The blueprints are transmitted from the chromosomes to the protein factories in the cell in the form of RNA. The process of copying the DNA into RNA is called transcription.

• The RNA blueprints are read and used to assemble proteins from amino acids in a process known as translation.

Figure 1: Central Dogma of Molecular Biology
3. CHOICE OF SEQUENCE FORMAT
There are four basic types of molecules involved in life: (1) small molecules, (2) proteins, (3) DNA and (4) RNA. Proteins, DNA and RNA are known collectively as biological macromolecules. DNA is the main information carrier molecule in a cell. DNA may be single or double stranded. A single-stranded DNA molecule, also called a polynucleotide, is a chain of small molecules called nucleotides. There are four different nucleotides grouped into two types: the purines, adenine and guanine, and the pyrimidines, cytosine and thymine. They are usually referred to as bases (in fact, the bases are the only distinguishing element between different nucleotides) and are denoted by their initial letters, A, C, G and T. This sequence is stored in text files. A file holds a very long sequence (a chain of different combinations of A, T, C and G). The main interest of a biotechnologist is to search the sequence for similarity with other sequences.
At the level of the sequence text, the only difference between DNA and RNA is that in RNA thymine (T) is replaced with uracil (U). There are different formats that are used to organize this sequence data. The types of data formats [5] are given below:

• Plain Text

• FASTA Format

• GenBank

• Genetics Computer Group Format (GCG)

In the present problem, Plain Text format is chosen, which looks like the following:
CTATGACTTGATTGCGACTGATATTGACAAGAATTCA
TAAATTAAGTGAAACTAAACGAACCTCTTATAATTTC
GTTTAAATTTAAAATTGTGAAAAATTAATCTAAAAT
4. SOLUTION METHODOLOGY
Amino acids are the units that are strung together to make proteins. The function of a protein is intimately dependent on the order in which its amino acids are linked by ribosomes during translation. Twenty different amino acids are used in protein synthesis.
Figure 2: Conceptual Translation
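Section 3 chooses the Plain Text format and notes that RNA text differs from DNA text only in T versus U. A minimal Python sketch of preparing such input for conceptual translation might look like the following. It assumes that the text file holds the coding (sense) strand, so that conceptual transcription reduces to a T-to-U substitution; the names `clean_sequence` and `transcribe` are illustrative and do not come from the paper.

```python
import re

VALID_DNA = set("ACGT")

def clean_sequence(text):
    """Remove whitespace and line breaks, upper-case, and validate the symbols."""
    seq = re.sub(r"\s+", "", text).upper()
    unknown = set(seq) - VALID_DNA
    if unknown:
        raise ValueError(f"unexpected symbols in DNA sequence: {sorted(unknown)}")
    return seq

def transcribe(dna):
    """Conceptual transcription; assumes `dna` is the coding strand (T -> U)."""
    return dna.replace("T", "U")

# First two lines of the plain-text example from Section 3.
plain_text = """
CTATGACTTGATTGCGACTGATATTGACAAGAATTCA
TAAATTAAGTGAAACTAAACGAACCTCTTATAATTTC
"""
mrna = transcribe(clean_sequence(plain_text))
print(mrna[:12])
```

Validation is done before transcription so that stray characters (for example a header line from another format) are reported instead of being silently carried into the translation step.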
Standard three-letter abbreviations are used for each of the 20 most commonly used amino acids. Each amino acid can be assigned to one of essentially four different categories: non-polar, polar, positively charged and negatively charged. A single change within a triplet codon is usually not sufficient to cause a codon to code for an amino acid in a different group. The genetic code is therefore remarkably robust and minimizes the extent to which mistakes in the nucleotide sequence of genes can change the function of the protein [1][6].
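The four categories can be held in a small lookup structure. The sketch below uses one common textbook grouping, which is an assumption: the paper does not list the members of each category, and other texts place residues such as Gly, Cys, Tyr or His differently. The names `AMINO_ACID_GROUPS` and `same_group` are illustrative.

```python
# One common textbook grouping (assumed here, not taken from the paper).
AMINO_ACID_GROUPS = {
    "non-polar": ["Gly", "Ala", "Val", "Leu", "Ile", "Met", "Phe", "Trp", "Pro"],
    "polar": ["Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"],
    "positively charged": ["Lys", "Arg", "His"],
    "negatively charged": ["Asp", "Glu"],
}

# Invert the grouping so a category can be looked up by abbreviation.
GROUP_OF = {
    aa: group
    for group, members in AMINO_ACID_GROUPS.items()
    for aa in members
}

def same_group(aa1, aa2):
    """True when two three-letter abbreviations fall in the same category."""
    return GROUP_OF[aa1] == GROUP_OF[aa2]

print(len(GROUP_OF))
print(same_group("Leu", "Ile"), same_group("Asp", "Lys"))
```

A check such as `same_group` is what one would apply to the amino acids before and after a single-nucleotide change to see whether the substitution stays within a category.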
The mRNA to protein sequence is computed according to the flow chart shown in Fig. 2. This process is also called conceptual translation. The mRNA sequence is accepted in plain text format. Only four nucleotides are available to specify 20 amino acids: single nucleotides give 4 possibilities and pairs give 16, while triplets give 64. As a result, it is necessary for ribosomes to use a triplet code to translate the information in DNA and RNA into the amino acid sequences of proteins. Each group of three nucleotides (a codon) in an mRNA copy of the coding portion of a gene corresponds to a specific amino acid.
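Grouping an mRNA string into codons is a short slicing operation. The following is a minimal sketch; the function name `split_codons` and its `frame` offset parameter are illustrative choices, and trailing nucleotides that do not fill a complete codon are simply dropped.

```python
def split_codons(mrna, frame=0):
    """Group an mRNA string into complete codons, starting at offset `frame`."""
    # Discard the one or two trailing nucleotides that cannot form a codon.
    usable_end = len(mrna) - (len(mrna) - frame) % 3
    return [mrna[i:i + 3] for i in range(frame, usable_end, 3)]

print(split_codons("AUGGCCUAAGG"))           # ['AUG', 'GCC', 'UAA']
print(split_codons("AUGGCCUAAGG", frame=1))  # ['UGG', 'CCU', 'AAG']
```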
Translation by ribosomes starts at translation initiation sites on the mRNA copy of a gene and proceeds until a stop codon is encountered. Three codons of the genetic code are reserved as stop codons, and one triplet codon is always used as a start codon. The codon AUG, which codes for the amino acid methionine, is used as the start codon and is looked up in the existing database like any other codon. Accurate translation can occur only when ribosomes examine codons in the phase, or reading frame, that is established by a gene’s start codon. An alteration of a gene’s reading frame changes every amino acid coded downstream of the alteration. The results are shown according to the flow chart: the input mRNA sequence is matched against the existing amino acid database, giving the amino acids together with the starting and end points in the mRNA sequence, to produce the protein sequence.
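The start and stop rules can be sketched as a scan over the mRNA string. This is a simplified illustration, not the paper's flow chart: it takes the first AUG it finds as the start codon, uses the three stop codons of the standard genetic code (UAA, UAG, UGA), and reports 0-based positions. The name `find_orf` is hypothetical.

```python
START = "AUG"
STOPS = {"UAA", "UAG", "UGA"}  # stop codons of the standard genetic code

def find_orf(mrna):
    """Return (start, end, codons) for the frame set by the first AUG.

    `start` is the index of the A in AUG; `end` is the index just past the
    stop codon, or past the last complete codon if no stop is found.
    Returns None when the sequence contains no AUG.
    """
    start = mrna.find(START)
    if start == -1:
        return None
    codons = []
    pos = start
    while pos + 3 <= len(mrna):
        codon = mrna[pos:pos + 3]
        pos += 3
        if codon in STOPS:
            break
        codons.append(codon)
    return start, pos, codons

print(find_orf("GGAUGGCCAAAUAGCC"))  # (2, 14, ['AUG', 'GCC', 'AAA'])
```

Because the scan advances three nucleotides at a time from the start codon, a stop triplet that appears out of frame is ignored, which is the reading-frame behaviour described above.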
5. RESULTS AND DISCUSSION
In the process of translation, the mRNA is grouped into combinations of three nucleotides called codons. The amino acid chain is obtained by matching the codons of the input sequence against the amino acid database given in Table 1. For the input mRNA sequence, three amino acid sequences are possible, as shown in Table 2. The different outputs arise depending on how the sequence is grouped into triplets, that is, on the reading frame.
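One plausible reading of the three outputs is a translation in each of the three forward reading frames. The sketch below follows that reading and uses the standard genetic code, which is assumed to be what the amino acid database contains; the paper's own table is not reproduced here. The names `translate_frame` and `three_frames` are illustrative, and input symbols other than A, C, G and U would raise a `KeyError`.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code listed in UCAG order; '*' marks the stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter abbreviations, as used in the paper's output.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}

def translate_frame(mrna, frame):
    """Translate every complete codon of `mrna` starting at offset `frame`."""
    codons = [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]
    return "-".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)

def three_frames(mrna):
    """Return the translations for reading frames 0, 1 and 2."""
    return [translate_frame(mrna, frame) for frame in range(3)]

for frame, protein in enumerate(three_frames("AUGGCCAAAUAG")):
    print(frame, protein)
```

For the short input above, frame 0 gives Met-Ala-Lys-Stop, frame 1 gives Trp-Pro-Asn and frame 2 gives Gly-Gln-Ile, which shows how a shift of a single nucleotide changes every amino acid in the output.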
Table 1: Codons for amino acids
Table 2: Output amino acid sequence
6. REFERENCES
[1] Berman, H.M., Westbrook, J., Feng, Z., et al. (2000), “The protein data bank”, Nucleic Acids Res., vol. 28, pp. 235–242.
[2] Brunak, S., Engelbrecht, J. and Knudsen, S. (1991), “Prediction of human mRNA donor and acceptor sites from the DNA sequence”, Journal of Molecular Biology, vol. 220, pp. 49–65.
[3] Huang, D.S. and Zhu, Y.P. (2005), “Improving protein secondary structure prediction by using the residue conformational classes”, Elsevier, vol. 26, pp. 2346–2352.
[4] Krane, D. and Raymer, M. (2003), “Fundamental Concepts of Bioinformatics”, Pearson Education, New Delhi, pp. 1–314.
[5] Malhi, M.S. (2003), “Development of Data Mining model for bioinformatics system”, M.Tech Thesis, PAU, Ludhiana, pp. 1–76.
[6] Rastogi, S.C. and Mendiratta, N. (2004), “Bioinformatics Methods and Applications”, Prentice Hall of India, New Delhi, pp. 1–194.
[7] Segal, E. and Yelensky, R. (2003), “Genome-Wide Discovery of Transcriptional Modules from DNA Sequence and Gene Expression”, Bioinformatics, vol. 19, pp. 273–282.
www.ijctonline.com