Bronze Gene Prediction Instructions and Worksheet
Save this worksheet to your desktop and complete it on the computer!
Complete this worksheet in MS Word on your computer. If you have this document in print,
open it online at http://www.dnai.org/media/bioinformatics/genefinding/bzgeneprediction_ws.doc.
If you opened this document in an Internet browser, click File, click Save as, and save it to a
directory on your C- or A-drive. Then close the browser, open the document in MS Word, and
follow the instructions to answer the questions. In doing so, you will discover where in the
sequence the bz gene is located, its structure and location in the maize genome, as well as the
3D structure of the bz protein product. Along the way you will become familiar with
bioinformatics routines such as locating and extracting information and sequences for
genes, genomes, and proteins from databases.
Try to find the gene in the DNA by determining the open reading frames (ORFs) it contains

Assuming the bronze gene could be an ORF gene (a gene encoded by a single uninterrupted ORF),
try to find it by identifying and analyzing the ORFs in the DNA sequence.
o Open this worksheet on your computer, save it, and open it in MS Word.
o Go to http://www.bioservers.org.
o Find SEQUENCE SERVER, click ENTER.
o Click MANAGE GROUPS.
o Find Sequence sources, click Classes, then Public.
o Find Jumping Genes Across Kingdoms, check the box to the left, click OK.
o Click the title for the first entry and set it to corn, purple endosperm; wt.
o Click Open, highlight and copy the entire sequence. Click Done.
o Open Gene Boy at http://www.dnai.org/geneboy.
o In the Sequences panel click Your Sequence.
o Paste the sequence into the central window.
o Optional: replace the header Your Sequence with a name of your choosing (e.g., corn bronze gene).
o Click Save Sequence.
o How long is the sequence? _____________ bp
o In the Operations panel click Find Genes, then ORFs.
o Click Reverse.
o Record the ORFs indicated by Gene Boy in the table below and determine the
length of the amino acid sequence each could potentially encode.
ORF      RF    From – To    Length [bp]    Protein length [aa]
ORF 1    1     247-834      588 bp         195 aa
ORF 2    _     _            _              _
ORF 3    _     _            _              _
ORF 4    _     _            _              _
ORF 5    _     _            _              _
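If you want to see what an ORF search does behind the scenes, the following is a minimal sketch in Python. It is not Gene Boy's actual algorithm: it simply assumes that an ORF runs from the first ATG in a reading frame to the next in-frame stop codon, and it reports coordinates the same way as the ORF 1 row above (1-based, stop codon included). The function names and the min_len cutoff are illustrative choices.

```python
# Stop codons of the standard genetic code (see Attachment 3).
STOPS = {"TAA", "TAG", "TGA"}

def reverse_complement(dna):
    """Return the opposite strand, read 5' -> 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_orfs(dna, min_len=100):
    """Return (frame, start, end, protein_length) for each ATG...stop ORF.

    Coordinates are 1-based and inclusive, and 'end' includes the stop codon.
    Only the strand passed in is scanned; pass reverse_complement(dna) to
    scan the other strand.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG since the last stop codon
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                length = i + 3 - start
                if length >= min_len:
                    # The stop codon does not encode an amino acid.
                    orfs.append((frame + 1, start + 1, i + 3, length // 3 - 1))
                start = None
    return orfs

# Toy sequence (made up for illustration): ATG + 4 codons + TAA in frame 3.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
print(find_orfs(toy, min_len=9))  # [(3, 3, 20, 5)]
```

To try it on the bronze sequence, paste the DNA copied from Sequence Server into a string and call find_orfs on it and on its reverse complement.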

The protein sequencing lab provides you with the amino acid sequence for the protein product of
the bronze gene (see Attachment 1).
o How many amino acids long is it? _____________________aa_
o How many nucleotides are needed to encode a protein of this length? _______nt_
o Could this protein be encoded by any of the ORFs determined above? _ yes/no _
o What do you think might be going on? At what point may we have made a wrong
assumption?
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
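The arithmetic behind these questions can be written as a short sketch. It assumes three nucleotides per amino acid plus one stop codon, and ORF coordinates that include the stop codon, which is how the ORF 1 row in the table above adds up. The function names are made up for this example.

```python
def nucleotides_needed(aa_length, include_stop=True):
    """Coding nucleotides for a protein: 3 per amino acid, plus a stop codon."""
    return 3 * aa_length + (3 if include_stop else 0)

def protein_length_from_orf(start, end):
    """Amino acids encoded by an ORF whose 1-based, inclusive coordinates
    include the stop codon."""
    length_bp = end - start + 1
    return length_bp // 3 - 1

# Check against the ORF 1 row of the table: 247-834, 588 bp, 195 aa.
print(nucleotides_needed(195))            # 588
print(protein_length_from_orf(247, 834))  # 195
```

Use the same two functions with the protein length you counted in Attachment 1 and with the other ORFs you recorded.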
Confirm the potential of the DNA sequence to encode the BZ protein by using the DNA to
search DNA databases for similar sequences
(This search can be conducted by using Gene Boy, Sequence Server, or any Internet site that
provides access to a Blast search.)
Go back to Gene Boy, click Clear, click your sequence.
Under Operations, click WWW Tools, click ORF.
Find Redraw, change the number next to it from 100 to 300, click Redraw.
Compare the ORFs indicated with the results you recorded in the table above.
Click on an ORF and submit the deduced amino acid sequence to a blastp search by
clicking blast.
Record the Request-id: ____________________________________________
Click Format.
The E Value is the most meaningful indicator for the quality of a hit; the lower the E
Value, the better the hit. Usually, E Values of less than 0.1 indicate meaningful hits. (For
further explanations click the link to Blast FAQ in the upper part of the NCBI Blast result
page.)
Read the titles listed for acceptable search hits and determine the nature of the gene.
Record the gi-number for an entry you wish to examine in more detail: ______________
Click the gi-link.
What protein does the GenBank entry contain? _________________________________
How long is it? __________________________________________________________
Do any of the ORFs listed in the table above encode a protein of this length? yes/no
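The rule of thumb for E Values given above can be expressed as a small filter. This is only a sketch: the hit titles and E Values below are made-up placeholders, not real Blast output, and the 0.1 cutoff is the one suggested in this worksheet.

```python
def meaningful_hits(hits, max_evalue=0.1):
    """Keep hits whose E Value is below the cutoff, best (lowest) first."""
    kept = [hit for hit in hits if hit["evalue"] < max_evalue]
    return sorted(kept, key=lambda hit: hit["evalue"])

# Hypothetical placeholder hits; real titles and E Values come from
# your own Blast result page.
example_hits = [
    {"title": "placeholder hit A", "evalue": 3e-50},
    {"title": "placeholder hit B", "evalue": 2.5},
    {"title": "placeholder hit C", "evalue": 0.004},
]

for hit in meaningful_hits(example_hits):
    print(hit["title"], hit["evalue"])
```

Hit B is dropped because its E Value is above the cutoff; the remaining hits are listed with the lowest E Value first.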
Determine the model for the gene using protein evidence
The BZ protein has been sequenced (Attachment 1), and so has the DNA (Sequence Server,
Attachment 2). Attachment 2 also provides a translation of this DNA sequence in all three
forward reading frames (deduced amino acid sequences generated using the electronic DNA
sequence translation tool at
http://www.dnalc.org/bioinformatics/2003/2003_dnalc_nucleotide_analyzer.htm#translator).
In the translated sequences in Attachment 2, find and highlight the amino acid stretches that
are contained in the bz protein sequence provided in Attachment 1, and determine the coding
portion of the DNA.

To identify the bz gene in the DNA sequence, highlight the nucleotide stretches
that correspond to the highlighted amino acid stretches. If necessary, consult the genetic
code table in Attachment 3.
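The matching of amino acid stretches to nucleotide stretches can also be sketched in Python. The sketch below assumes the standard genetic code from Attachment 3 and exact matches only; the function names are made up for this example, and the toy sequence is not taken from the bronze gene.

```python
from itertools import product

# Standard genetic code in TCAG order (same content as Attachment 3; * = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, frame):
    """Translate dna in forward frame 0, 1, or 2; unknown codons become X."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def locate_peptide(dna, peptide):
    """Return (frame, nt_start, nt_end) for each exact match of peptide.

    Frames are numbered +1 to +3 as in Attachment 2; nucleotide
    coordinates are 1-based and inclusive.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        pos = protein.find(peptide)
        while pos != -1:
            nt_start = frame + 3 * pos + 1
            hits.append((frame + 1, nt_start, nt_start + 3 * len(peptide) - 1))
            pos = protein.find(peptide, pos + 1)
    return hits

# Toy example: the peptide MAP is encoded in frame +3, nucleotides 3-11.
print(locate_peptide("GGATGGCGCCCTAA", "MAP"))  # [(3, 3, 11)]
```

A stretch of the protein that spans two coding regions will not be found as one piece, so search with short stretches and note where matches switch from one reading frame to another.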

Discuss the structure of the gene:
o What is the structure of the bronze gene? ________________________________
o Describe the gene model for the bz gene:
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
o Concatenate the coding sequences. How long is the resulting sequence? Would it
be able to encode a protein of the right length? ___________________________
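Concatenating coding sequences can be sketched as follows. The exon coordinates and the toy sequence are invented for illustration and are not the bronze gene model; the sketch assumes 1-based, inclusive coordinates and a coding sequence that ends with a stop codon.

```python
def concatenate_cds(dna, exons):
    """Join the exon stretches of dna, given as 1-based inclusive (start, end)."""
    return "".join(dna[start - 1:end] for start, end in sorted(exons))

def encoded_protein_length(cds):
    """Amino acids encoded by a complete CDS that ends with a stop codon."""
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return len(cds) // 3 - 1

# Toy gene: two exons separated by a 12-nucleotide intron (made-up sequence).
toy_gene = "ATGGCG" + "GTAAGTTTTCAG" + "CCCTAA"
toy_exons = [(1, 6), (19, 24)]

cds = concatenate_cds(toy_gene, toy_exons)
print(cds, len(cds), encoded_protein_length(cds))  # ATGGCGCCCTAA 12 3
```

If the concatenated sequence is not a multiple of 3 in length, at least one exon boundary is probably off by a nucleotide or two.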

Use the Internet sites at http://wwwmgs.bionet.nsc.ru/mgs/programs/bdna/tata_bdna.html
and http://rulai.cshl.org/tools/polyadq/polyadq_form.html to predict TATA boxes and
polyA signals, respectively.
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
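For a rough first look before using the prediction sites, you can scan the sequence for the textbook consensus motifs. This is a naive sketch and not the method used by the two sites above, which apply more elaborate models: it assumes the consensus TATA(A/T)A(A/T) for the TATA box and AATAAA for the polyA signal, and plant genes often deviate from both.

```python
import re

# Assumed textbook consensus motifs; real signals are more variable.
MOTIFS = {
    "TATA box": "TATA[AT]A[AT]",
    "polyA signal": "AATAAA",
}

def scan_motifs(dna):
    """Return (motif name, 1-based start position) for every match,
    including overlapping ones, sorted by position."""
    dna = dna.upper()
    found = []
    for name, pattern in MOTIFS.items():
        # The lookahead makes matches zero-width so overlaps are reported.
        for match in re.finditer(f"(?={pattern})", dna):
            found.append((name, match.start() + 1))
    return sorted(found, key=lambda item: item[1])

# Toy sequence made up for illustration.
print(scan_motifs("GGCTATAAATGGCCCCAATAAAGG"))
# [('TATA box', 4), ('polyA signal', 17)]
```

A match only marks a candidate; whether it is a real signal depends on where it lies relative to the coding sequence.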

Finally, run the sequence through the two gene prediction programs listed in Gene Boy
under WWW Tools > Gene Prediction.
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________

Discuss the results by comparing them with the annotation for the gene at:
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=22361
_________________________________________________________________
_________________________________________________________________
Discuss characteristics of spliced genes
… by deleting from the table below all wrong answers:

                                      Exons             Introns
Begin with start codon                _True / False_    _True / False_
End with stop codon                   _True / False_    _True / False_
Nucleotide number is multiple of 3    _True / False_    _True / False_
Contain coding sequence (CDS)         _True / False_    _True / False_
Contain stop codons                   _True / False_    _True / False_
CDS can change reading frame          _True / False_    _True / False_
Determine the location of the gene in the maize genome
Click Map Viewer.
Click Zea mays.
Click Blast search plant genome.
Enter the sequence into the search window, click Blast.
Record the Request Id: _______________________________
Click Format.
Click Genome View.
How many chromosomes does maize have? ____ What chromosome is the gene on? ___
To view the gene in its environment click the number underneath the chromosome.
Zoom into the chromosome until the gene model for this gene becomes discernible.
Attachment 1:
Zea mays bronze gene product; 471 amino acids
---------+---------+---------+---------+---------+---------+
MAPADGESSPPPHVAVVAFPFSSHAAVLLSIARALAAAAAPSGATLSFLSTASSLAQLRK 60
---------+---------+---------+---------+---------+---------+
ASSASAGHGLPGNLRFVEVPDGAPAAEETVPVPRQMQLFMEAAEAGGVKAWLEAARAAAG 120
---------+---------+---------+---------+---------+---------+
GARVTCVVGDAFVWPAADAAASAGAPWVPVWTAASCALLAHIRTDALREDVGDQAANRVD 180
---------+---------+---------+---------+---------+---------+
GLLISHPGLASYRVRDLPDGVVSGDFNYVINLLVHRMGQCLPRSAAAVALNTFPGLDPPD 240
---------+---------+---------+---------+---------+---------+
VTAALAEILPNCVPFGPYHLLLAEDDADTAAPADPHGCLAWLGRQPARGVAYVSFGTVAC 300
---------+---------+---------+---------+---------+---------+
PRPDELRELAAGLEDSGAPFLWSLREDSWPHLPPGFLDRAAGTGSGLVVPWAPQVAVLRH 360
---------+---------+---------+---------+---------+---------+
PSVGAFVTHAGWASVLEGLSSGVPMACRPFFGDQRMNARSVAHVWGFGAAFEGAMTSAGV 420
---------+---------+---------+---------+---------+
ATAVEELLRGEEGARMRARAKELQALVAEAFGPGGECRKNFDRFVEIVCRA 471
Attachment 2: bronze gene, Zea mays, 2221 nucleotides
1--------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-
DNA: GGTCCCCAAACTCCACGGCACCAACAGCTAAGCCCGATGCGCTGCGTGCGCGGCGATCCAACCGCCGGCTCACCTAAAAATTTCGGCACGTCTAACTGCGAC 102
+1: G P Q T P R H Q Q L S P M R C V R G D P T A G S P K N F G T S N C D
+2: V P K L H G T N S * A R C A A C A A I Q P P A H L K I S A R L T A T
+3: S P N S T A P T A K P D A L R A R R S N R R L T * K F R H V * L R L

103----+---------+---------+---------+---------+---------+---------+---------+---------+---------+---
DNA: TGGCAGGTGCGCACGCGTGGTCGCGCGGAATAAAGCGGACACGTTGCGCCCCCAGCGAAGCCCGCACGCATCGCATTCGCATCGCATCGCAGGTCGCATCCG 204
+1: W Q V R T R G R A E * S G H V A P P A K P A R I A F A S H R R S H P
+2: G R C A R V V A R N K A D T L R P Q R S P H A S H S H R I A G R I R
+3: A G A H A W S R G I K R T R C A P S E A R T H R I R I A S Q V A S D

205--+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
DNA: ACGCTAGCGGCTAGCCTAGCCGAACAGCCTGAGCGCGCGAAGATGGCGCCCGCCGACGGCGAGTCCTCCCCGCCGCCGCACGTGGCCGTGGTCGCCTTCCCG 306
+1: T L A A S L A E Q P E R A K M A P A D G E S S P P P H V A V V A F P
+2: R * R L A * P N S L S A R R W R P P T A S P P R R R T W P W S P S R
+3: A S G * P S R T A * A R E D G A R R R R V L P A A A R G R G R L P V

307+---------+---------+---------+---------+---------+---------+---------+---------+---------+-------
DNA: TTCAGCTCCCACGCGGCGGTGCTGCTCTCCATCGCGCGCGCCCTGGCTGCCGCCGCGGCGCCGTCCGGGGCCACGCTCTCGTTCCTCTCCACCGCGTCCTCC 408
+1: F S S H A A V L L S I A R A L A A A A A P S G A T L S F L S T A S S
+2: S A P T R R C C S P S R A P W L P P R R R P G P R S R S S P P R P P
+3: Q L P R G G A A L H R A R P G C R R G A V R G H A L V P L H R V L P

409--------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
DNA: CTCGCGCAGCTCCGCAAGGCCAGCAGCGCCTCCGCCGGGCACGGGCTCCCGGGGAACCTGCGCTTCGTCGAGGTACCGGACGGCGCGCCCGCGGCCGAGGAG 510
+1: L A Q L R K A S S A S A G H G L P G N L R F V E V P D G A P A A E E
+2: S R S S A R P A A P P P G T G S R G T C A S S R Y R T A R P R P R R
+3: R A A P Q G Q Q R L R R A R A P G E P A L R R G T G R R A R G R G D

511------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-
DNA: ACCGTGCCGGTGCCGCGGCAGATGCAGCTGTTCATGGAGGCCGCGGAGGCCGGCGGGGTGAAGGCCTGGCTGGAGGCGGCCCGCGCCGCGGCGGGCGGCGCC 612
+1: T V P V P R Q M Q L F M E A A E A G G V K A W L E A A R A A A G G A
+2: P C R C R G R C S C S W R P R R P A G * R P G W R R P A P R R A A P
+3: R A G A A A D A A V H G G R G G R R G E G L A G G G P R R G G R R Q

613----+---------+---------+---------+---------+---------+---------+---------+---------+---------+---
DNA: AGGGTGACCTGCGTGGTGGGCGACGCGTTCGTGTGGCCGGCGGCGGACGCGGCCGCCTCCGCGGGGGCGCCGTGGGTGCCGGTGTGGACGGCCGCGTCGTGC 714
+1: R V T C V V G D A F V W P A A D A A A S A G A P W V P V W T A A S C
+2: G * P A W W A T R S C G R R R T R P P P R G R R G C R C G R P R R A
+3: G D L R G G R R V R V A G G G R G R L R G G A V G A G V D G R V V R

715--+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
DNA: GCGCTCCTGGCGCACATCCGCACCGACGCGCTCCGGGAGGACGTTGGCGACCAGGGTGCGTTGGATTCTACTACTACTACTTCTCTCCCTTCCTTGTCCCTT 816
+1: A L L A H I R T D A L R E D V G D Q G A L D S T T T T S L P S L S L
+2: R S W R T S A P T R S G R T L A T R V R W I L L L L L L S L P C P F
+3: A P G A H P H R R A P G G R W R P G C V G F Y Y Y Y F S P F L V P S

817+---------+---------+---------+---------+---------+---------+---------+---------+---------+-------
DNA: CATTGCGCGCGGGTTTGATGATCGAATGGCTGTTGCATTTCCATCGTTCGCAGCAGCAAACAGGGTGGACGGGCTACTGATCTCCCACCCGGGCCTCGCCAG 918
+1: H C A R V * * S N G C C I S I V R S S K Q G G R A T D L P P G P R Q
+2: I A R G F D D R M A V A F P S F A A A N R V D G L L I S H P G L A S
+3: L R A G L M I E W L L H F H R S Q Q Q T G W T G Y * S P T R A S P A

919--------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
DNA: CTACCGCGTCCGTGACCTCCCAGACGGCGTCGTCTCCGGCGACTTCAACTACGTCATCAACCTCCTCGTCCACCGCATGGGGCAGTGCCTCCCGCGCTCTGC 1020
+1: L P R P * P P R R R R L R R L Q L R H Q P P R P P H G A V P P A L C
+2: Y R V R D L P D G V V S G D F N Y V I N L L V H R M G Q C L P R S A
+3: T A S V T S Q T A S S P A T S T T S S T S S S T A W G S A S R A L P

1021-----+---------+---------+---------+---------+---------+---------+---------+---------+---------+-
DNA: CGCCGCCGTGGCACTCAACACGTTCCCAGGCCTGGACCCGCCCGACGTCACCGCGGCGCTCGCGGAGATCCTGCCCAACTGCGTCCCGTTCGGCCCCTACCA 1122
+1: R R R G T Q H V P R P G P A R R H R G A R G D P A Q L R P V R P L P
+2: A A V A L N T F P G L D P P D V T A A L A E I L P N C V P F G P Y H
+3: P P W H S T R S Q A W T R P T S P R R S R R S C P T A S R S A P T T

1123---+---------+---------+---------+---------+---------+---------+---------+---------+---------+---
DNA: CCTCCTCCTCGCCGAGGACGACGCCGACACCGCCGCACCAGCCGACCCGCACGGCTGCCTCGCCTGGCTGGGCCGCCAACCCGCGCGCGGCGTCGCGTACGT 1224
+1: P P P R R G R R R H R R T S R P A R L P R L A G P P T R A R R R V R
+2: L L L A E D D A D T A A P A D P H G C L A W L G R Q P A R G V A Y V
+3: S S S P R T T P T P P H Q P T R T A A S P G W A A N P R A A S R T S

1225-+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
DNA: CAGCTTCGGCACGGTGGCGTGCCCGCGGCCCGACGAGCTCCGCGAGCTGGCGGCCGGGCTGGAGGACTCGGGCGCGCCGTTCCTGTGGTCGCTGCGCGAGGA 1326
+1: Q L R H G G V P A A R R A P R A G G R A G G L G R A V P V V A A R G
+2: S F G T V A C P R P D E L R E L A A G L E D S G A P F L W S L R E D
+3: A S A R W R A R G P T S S A S W R P G W R T R A R R S C G R C A R T

1327---------+---------+---------+---------+---------+---------+---------+---------+---------+-------
DNA: CTCGTGGCCGCACCTCCCGCCGGGTTTCCTGGACCGCGCCGCGGGCACCGGGTCCGGGCTCGTGGTGCCCTGGGCGCCGCAGGTGGCCGTGCTGCGCCACCC 1428
+1: L V A A P P A G F P G P R R G H R V R A R G A L G A A G G R A A P P
+2: S W P H L P P G F L D R A A G T G S G L V V P W A P Q V A V L R H P
+3: R G R T S R R V S W T A P R A P G P G S W C P G R R R W P C C A T L

1429-------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
DNA: TTCCGTGGGCGCGTTCGTGACGCACGCCGGGTGGGCGTCGGTGCTGGAGGGCTTGTCCAGCGGGGTGCCCATGGCGTGCCGCCCCTTCTTCGGCGACCAGCG 1530
+1: F R G R V R D A R R V G V G A G G L V Q R G A H G V P P L L R R P A
+2: S V G A F V T H A G W A S V L E G L S S G V P M A C R P F F G D Q R
+3: P W A R S * R T P G G R R C W R A C P A G C P W R A A P S S A T S G

1531-----+---------+---------+---------+---------+---------+---------+---------+---------+---------+-
DNA: GATGAACGCGCGGTCCGTGGCGCACGTGTGGGGGTTCGGCGCCGCGTTCGAGGGCGCTATGACGAGCGCCGGAGTGGCCACGGCCGTGGAGGAGCTGCTGCG 1632
+1: D E R A V R G A R V G V R R R V R G R Y D E R R S G H G R G G A A A
+2: M N A R S V A H V W G F G A A F E G A M T S A G V A T A V E E L L R
+3: * T R G P W R T C G G S A P R S R A L * R A P E W P R P W R S C C A

1633---+---------+---------+---------+---------+---------+---------+---------+---------+---------+---
DNA: CGGGGAGGAAGGGGCGCGGATGAGGGCAAGGGCCAAGGAGCTGCAGGCCTTGGTGGCCGAGGCGTTCGGGCCAGGCGGTGAGTGCAGGAAGAACTTCGACAG 1734
+1: R G G R G A D E G K G Q G A A G L G G R G V R A R R * V Q E E L R Q
+2: G E E G A R M R A R A K E L Q A L V A E A F G P G G E C R K N F D R
+3: G R K G R G * G Q G P R S C R P W W P R R S G Q A V S A G R T S T G

1735-+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
DNA: GTTCGTCGAGATAGTCTGTCGCGCGTGAAAGGTCGTCTTGCTGTTCAGAGGTTTTACCAACAGAAGAACATAATGAATTGGATGGCATGCTACGTCGTATTC 1836
+1: V R R D S L S R V K G R L A V Q R F Y Q Q K N I M N W M A C Y V V F
+2: F V E I V C R A * K V V L L F R G F T N R R T * * I G W H A T S Y S
+3: S S R * S V A R E R S S C C S E V L P T E E H N E L D G M L R R I L

1837---------+---------+---------+---------+---------+---------+---------+---------+---------+-------
DNA: TCTTTTTTTGTTGATCCCTGAGTTGATACATTTTGTACTTGATACATGAGTTGCAGCAGCAGCAGCAACAGCCTTCTGTACCTTGGCTTTGGATCTGTATTC 1938
+1: S F F V D P * V D T F C T * Y M S C S S S S N S L L Y L G F G S V F
+2: L F L L I P E L I H F V L D T * V A A A A A T A F C T L A L D L Y S
+3: F F C * S L S * Y I L Y L I H E L Q Q Q Q Q Q P S V P W L W I C I L

1939-------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
DNA: TTGTCACCAGTTATCTGAAAGCATCAATAACCTTCTGTCTTCTAGCAGTTGCCTCTCCAGATTGCCAAAATAGCATTTATTATAAGGTCTTATGCAATGTTT 2040
+1: L S P V I * K H Q * P S V F * Q L P L Q I A K I A F I I R S Y A M F
+2: C H Q L S E S I N N L L S S S S C L S R L P K * H L L * G L M Q C F
+3: V T S Y L K A S I T F C L L A V A S P D C Q N S I Y Y K V L C N V F

2041-----+---------+---------+---------+---------+---------+---------+---------+---------+---------+-
DNA: TCAGATTGTTCCGATTAAATCTACGATTAGCATTTTAGCCCAGCAGTCCAGCCCATTGAAGGCTTATTCAGTTATTTTTAATCCATATAAATCAAAAAAGAT 2142
+1: S D C S D * I Y D * H F S P A V Q P I E G L F S Y F * S I * I K K D
+2: Q I V P I K S T I S I L A Q Q S S P L K A Y S V I F N P Y K S K K I
+3: R L F R L N L R L A F * P S S P A H * R L I Q L F L I H I N Q K R L

2143---+---------+---------+---------+---------+---------+---------+---------+
DNA: TGATATAGATTAGAAAATATTTTAGTTTACTAGGAATTAAAACCCCTCAATTTTTCTTAATCCATATAAATTGTGGCAG 2221
+1: * Y R L E N I L V Y * E L K P L N F S * S I * I V A
+2: D I D * K I F * F T R N * N P S I F L N P Y K L W Q
+3: I * I R K Y F S L L G I K T P Q F F L I H I N C G
Attachment 3: Genetic Code (from http://psyche.uthct.edu/shaun/SBlack/geneticd.html)
                               Second Position of Codon
First                                                                         Third
Position   T               C               A               G               Position
T          TTT Phe [F]     TCT Ser [S]     TAT Tyr [Y]     TGT Cys [C]     T
           TTC Phe [F]     TCC Ser [S]     TAC Tyr [Y]     TGC Cys [C]     C
           TTA Leu [L]     TCA Ser [S]     TAA Ter [end]   TGA Ter [end]   A
           TTG Leu [L]     TCG Ser [S]     TAG Ter [end]   TGG Trp [W]     G
C          CTT Leu [L]     CCT Pro [P]     CAT His [H]     CGT Arg [R]     T
           CTC Leu [L]     CCC Pro [P]     CAC His [H]     CGC Arg [R]     C
           CTA Leu [L]     CCA Pro [P]     CAA Gln [Q]     CGA Arg [R]     A
           CTG Leu [L]     CCG Pro [P]     CAG Gln [Q]     CGG Arg [R]     G
A          ATT Ile [I]     ACT Thr [T]     AAT Asn [N]     AGT Ser [S]     T
           ATC Ile [I]     ACC Thr [T]     AAC Asn [N]     AGC Ser [S]     C
           ATA Ile [I]     ACA Thr [T]     AAA Lys [K]     AGA Arg [R]     A
           ATG Met [M]     ACG Thr [T]     AAG Lys [K]     AGG Arg [R]     G
G          GTT Val [V]     GCT Ala [A]     GAT Asp [D]     GGT Gly [G]     T
           GTC Val [V]     GCC Ala [A]     GAC Asp [D]     GGC Gly [G]     C
           GTA Val [V]     GCA Ala [A]     GAA Glu [E]     GGA Gly [G]     A
           GTG Val [V]     GCG Ala [A]     GAG Glu [E]     GGG Gly [G]     G
An explanation of the Genetic Code: DNA is a two-stranded molecule. Each strand is a polynucleotide
composed of A (adenosine), T (thymidine), C (cytidine), and G (guanosine) residues polymerized by
"dehydration" synthesis in linear chains with specific sequences. Each strand has polarity, such that the
5'-hydroxyl (or 5'-phospho) group of the first nucleotide begins the strand and the 3'-hydroxyl group of the final
nucleotide ends the strand; accordingly, we say that this strand runs 5' to 3' ("Five prime to three prime") . It is
also essential to know that the two strands of DNA run antiparallel such that one strand runs 5' -> 3' while the
other one runs 3' -> 5'. At each nucleotide residue along the double-stranded DNA molecule, the nucleotides
are complementary. That is, A forms two hydrogen-bonds with T; C forms three hydrogen bonds with G. In
most cases the two-stranded, antiparallel, complementary DNA molecule folds to form a helical structure
which resembles a spiral staircase. This is the reason why DNA has been referred to as the "Double Helix".
One strand of DNA holds the information that codes for various genes; this strand is often called the template
strand or antisense strand (containing anticodons). The other, and complementary, strand is called the coding
strand or sense strand (containing codons). Since mRNA is made from the template strand, it has the same
information as the coding strand. The table above refers to triplet nucleotide codons along the sequence of the
coding or sense strand of DNA as it runs 5' -> 3'; the code for the mRNA would be identical but for the fact
that RNA contains U (uridine) rather than T.
An example of two complementary strands of DNA would be:
(5' -> 3') ATGGAATTCTCGCTC (Coding, sense strand)
(3' <- 5') TACCTTAAGAGCGAG (Template, antisense strand)
(5' -> 3') AUGGAAUUCUCGCUC (mRNA made from Template strand)
Since amino acid residues of proteins are specified as triplet codons, the protein sequence made from the
above example would be Met-Glu-Phe-Ser-Leu... (MEFSL...).
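The example above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the standard genetic code from the table in this attachment; the function names are chosen for this example only.

```python
from itertools import product

# Standard genetic code in TCAG order (same content as the table above; * = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def template_strand(coding):
    """Complementary strand written 3' <- 5', aligned base by base under the coding strand."""
    return coding.translate(str.maketrans("ACGT", "TGCA"))

def transcribe(coding):
    """mRNA has the sequence of the coding strand with U in place of T."""
    return coding.replace("T", "U")

def translate(coding):
    """Read the coding strand 5' -> 3' in triplets, starting at the first base."""
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, len(coding) - 2, 3))

coding = "ATGGAATTCTCGCTC"
print(template_strand(coding))  # TACCTTAAGAGCGAG
print(transcribe(coding))       # AUGGAAUUCUCGCUC
print(translate(coding))        # MEFSL
```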
Practically, codons are "decoded" by transfer RNAs (tRNA) which interact with a ribosome-bound messenger
RNA (mRNA) containing the coding sequence. There are 64 possible codons. 61 of them specify amino acids and
are recognized by the anticodon loop of a tRNA carrying a bound amino acyl residue (because one tRNA can often
read more than one codon, cells need fewer than 61 different tRNAs); the appropriate "charged" tRNA binds to
the respective next codon in the mRNA and the ribosome catalyzes the transfer of the amino acid from the tRNA
to the growing (nascent) protein/polypeptide chain. The remaining 3 codons are used for "punctuation"; that
is, they signal the termination (the end) of the growing polypeptide chain and are recognized by protein
release factors rather than by tRNAs.
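The numbers 64, 61, and 3 in the paragraph above can be checked by enumerating all triplets. The sketch assumes the standard genetic code as listed in the table in this attachment; the variable names are illustrative.

```python
from itertools import product

# Standard genetic code in TCAG order (same content as the table above; * = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Codons that specify an amino acid versus codons that end translation.
sense_codons = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]
stop_codons = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")

print(len(CODON_TABLE), len(sense_codons), stop_codons)
# 64 61 ['TAA', 'TAG', 'TGA']
```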
Lastly, the Genetic Code in the table above has also been called "The Universal Genetic Code". It is known as
"universal" because nearly all known organisms use it to translate the information in DNA and mRNA into
protein. The universality of the genetic code encompasses animals (including humans), plants, fungi, archaea,
bacteria, and viruses. However, all rules have their exceptions, and such is the case with the Genetic Code;
small variations in the code exist in mitochondria and certain microbes. Nonetheless, it should be emphasized
that these variances represent only a small fraction of known cases, and that the Genetic Code applies quite
broadly, including to the great majority of known nuclear genes.