Download Bz gene identification

Bronze Gene Prediction Instructions and Worksheet

Save this worksheet to your desktop and complete it in MS Word on your computer. If you have this document in print, open it online at http://www.dnai.org/media/bioinformatics/genefinding/bzgeneprediction_ws.doc. If you opened this document in an Internet browser, click File, click Save as, and save it to a directory on your C- or A-drive. Then close the browser, open the document in MS Word, and follow the instructions to answer the questions.

In doing so, you will discover where in the sequence the bz gene is located, its structure and location in the maize genome, and the 3D structure of the bz protein product. Along the way you will become familiar with bioinformatics routines such as locating and extracting information and sequences for genes, genomes, and proteins from databases.

Try to find the gene in the DNA by determining the Open Reading Frames (ORFs) it contains

• Assuming the bronze gene could be an ORF gene, try to find it by identifying and analyzing the ORFs in the DNA sequence.
o Open this worksheet on your computer, save it, and open it in MS Word.
o Go to http://www.bioservers.org.
o Find SEQUENCE SERVER, click ENTER.
o Click MANAGE GROUPS.
o Find Sequence sources, click Classes, then Public.
o Find Jumping Genes Across Kingdoms, check the box to the left, click OK.
o Click the title for the first entry and set it to corn, purple endosperm; wt.
o Click Open, highlight and copy the entire sequence.
o Click Done.
o Open Gene Boy at http://www.dnai.org/geneboy.
o In the Sequences panel click Your Sequence.
o Paste the sequence into the central window.
o Optional: replace the header Your Sequence with a name of your choosing (e.g. corn bronze gene).
o Click Save Sequence.
o How long is the sequence? _____________ bp
o In the Operations panel click Find Genes, then ORFs.
o Click Reverse.
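For orientation, the kind of scan an ORF tool performs can be sketched in a few lines of Python. This is only an illustration and not Gene Boy's actual algorithm: it assumes that an ORF runs from an ATG to the next in-frame stop codon, and the function names `find_orfs`, `reverse_complement`, and `orf_lengths` as well as the `min_len` cutoff are hypothetical choices made for this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_len=100):
    """Scan the three forward reading frames for ATG...stop ORFs.

    Returns (start, end, frame) tuples with 1-based, inclusive positions;
    the stop codon is counted as part of the ORF.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3                    # position just past the stop codon
                if end - start >= min_len:
                    orfs.append((start + 1, end, frame + 1))
                start = None                   # look for the next ATG
    return orfs

def reverse_complement(dna):
    """Return the opposite strand, read 5' -> 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def orf_lengths(start, end):
    """Length in bp and number of encoded amino acids (stop codon excluded)."""
    bp = end - start + 1
    return bp, bp // 3 - 1

print(orf_lengths(247, 834))
```

To scan the opposite strand as well, the same function could be applied to `reverse_complement(dna)`; positions would then count from the other end of the sequence. The call `orf_lengths(247, 834)` reproduces the example row of the ORF table in this worksheet: 588 bp is 196 codons, one of which is the stop codon, leaving 195 amino acids.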
o Record the ORFs indicated by Gene Boy in the table below and determine the length of the amino acid sequence each could potentially encode.

ORF      RF    From – To    Length [bp]    Protein length [aa]
ORF 1    1     247-834      588 bp         195 aa
ORF 2    _     _            _              _
ORF 3    _     _            _              _
ORF 4    _     _            _              _
ORF 5    _     _            _              _

• The protein sequencing lab provides you with the amino acid sequence of the protein product of the bronze gene (see Attachment 1).
o How many amino acids long is it? _____________________ aa
o How many nucleotides are needed to encode a protein of this length? _______ nt
o Could this protein be encoded by any of the ORFs determined above? _ yes/no _
o What do you think might be going on? At what point may we have made a wrong assumption?
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________

Confirm the potential of the DNA sequence to encode the BZ protein by using the DNA to search DNA databases for similar sequences

(This search can be conducted by using Gene Boy, Sequence Server, or any Internet site that provides access to a Blast search.)

• Go back to Gene Boy, click Clear, click your sequence.
• Under Operations, click WWW Tools, click ORF.
• Find Redraw, change the number next to it from 100 to 300, click Redraw.
• Compare the ORFs indicated with the results you recorded in the table above.
• Click on an ORF and submit the deduced amino acid sequence to a blastp search by clicking blast.
• Record the Request-id: ____________________________________________
• Click Format.
• The E Value is the most meaningful indicator for the quality of a hit; the lower the E Value, the better the hit. Usually, E Values of less than 0.1 indicate meaningful hits.
(For further explanations, click the link to Blast FAQ in the upper part of the NCBI Blast result page.)
• Read the titles listed for acceptable search hits and determine the nature of the gene.
• Record the gi-number for an entry you wish to examine in more detail: ______________
• Click the gi-link.
• What protein does the GenBank entry contain? _________________________________
• How long is it? __________________________________________________________
• Do any of the ORFs listed in the table above encode a protein of this length? yes/no

Determine the model for the gene using protein evidence

The BZ protein has been sequenced (Attachment 1) and so has the DNA (Sequence Server, Attachment 2). Attachment 2 also provides a translation of this DNA sequence in three reading frames (deduced amino acid sequences generated using the electronic DNA sequence translation tool at http://www.dnalc.org/bioinformatics/2003/2003_dnalc_nucleotide_analyzer.htm#translator).

• Detect within the deduced amino acid sequences in Attachment 2 the amino acid sequence for the bz protein product provided in Attachment 1. Find in the translated sequences the amino acid stretches that are contained in the protein sequence, highlight them, and determine the coding portion of the DNA.
• To identify the bz gene in the DNA sequence, highlight the nucleotide stretches that correspond to the highlighted amino acid stretches. If necessary, consult the genetic code table in Attachment 3.
• Discuss the structure of the gene:
o What is the structure of the bronze gene? ________________________________
o Describe the gene model for the bz gene:
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
o Concatenate the coding sequences. How long is the resulting sequence? Would it be able to encode a protein of the right length?
___________________________
• Use the Internet sites at http://wwwmgs.bionet.nsc.ru/mgs/programs/bdna/tata_bdna.html and http://rulai.cshl.org/tools/polyadq/polyadq_form.html for the prediction of TATA boxes and polyA signals, respectively.
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
• Finally, run the sequence through the two gene prediction programs listed in Gene Boy under WWW Tools > Gene Prediction.
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
• Discuss the results by comparing them with the annotation for the gene at: http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=22361
_________________________________________________________________
_________________________________________________________________

Discuss characteristics of spliced genes

… by deleting from the table below all wrong answers:

                                      Exons             Introns
Begin with start codon                _True / False_    _True / False_
End with stop codon                   _True / False_    _True / False_
Nucleotide number is multiple of 3    _True / False_    _True / False_
Contain coding sequence (CDS)         _True / False_    _True / False_
Contain stop codons                   _True / False_    _True / False_
CDS can change reading frame          _True / False_    _True / False_

Determine the location of the gene in the maize genome

• Click Map Viewer.
• Click Zea mays.
• Click Blast search plant genome.
• Enter the sequence into the search window, click Blast.
• Record the Request Id: _______________________________
• Click Format.
• Click Genome View.
• How many chromosomes does maize have? ____ What chromosome is the gene on? ___
• To view the gene in its environment click the number underneath the chromosome.
• Zoom into the chromosome until the gene model for this gene becomes discernable.

Attachment 1: Zea mays bronze gene product; 471 amino acids

---------+---------+---------+---------+---------+---------+
MAPADGESSPPPHVAVVAFPFSSHAAVLLSIARALAAAAAPSGATLSFLSTASSLAQLRK 60
---------+---------+---------+---------+---------+---------+
ASSASAGHGLPGNLRFVEVPDGAPAAEETVPVPRQMQLFMEAAEAGGVKAWLEAARAAAG 120
---------+---------+---------+---------+---------+---------+
GARVTCVVGDAFVWPAADAAASAGAPWVPVWTAASCALLAHIRTDALREDVGDQAANRVD 180
---------+---------+---------+---------+---------+---------+
GLLISHPGLASYRVRDLPDGVVSGDFNYVINLLVHRMGQCLPRSAAAVALNTFPGLDPPD 240
---------+---------+---------+---------+---------+---------+
VTAALAEILPNCVPFGPYHLLLAEDDADTAAPADPHGCLAWLGRQPARGVAYVSFGTVAC 300
---------+---------+---------+---------+---------+---------+
PRPDELRELAAGLEDSGAPFLWSLREDSWPHLPPGFLDRAAGTGSGLVVPWAPQVAVLRH 360
---------+---------+---------+---------+---------+---------+
PSVGAFVTHAGWASVLEGLSSGVPMACRPFFGDQRMNARSVAHVWGFGAAFEGAMTSAGV 420
---------+---------+---------+---------+---------+
ATAVEELLRGEEGARMRARAKELQALVAEAFGPGGECRKNFDRFVEIVCRA 471

Attachment 2: bronze gene, Zea mays, 2221 nucleotides

Each block lists the nucleotide positions it covers, the DNA sequence, and its translation in reading frames +1, +2, and +3 (* marks a stop codon).

1-102
DNA: GGTCCCCAAACTCCACGGCACCAACAGCTAAGCCCGATGCGCTGCGTGCGCGGCGATCCAACCGCCGGCTCACCTAAAAATTTCGGCACGTCTAACTGCGAC
+1: G P Q T P R H Q Q L S P M R C V R G D P T A G S P K N F G T S N C D
+2: V P K L H G T N S * A R C A A C A A I Q P P A H L K I S A R L T A T
+3: S P N S T A P T A K P D A L R A R R S N R R L T * K F R H V * L R L

103-204
DNA: TGGCAGGTGCGCACGCGTGGTCGCGCGGAATAAAGCGGACACGTTGCGCCCCCAGCGAAGCCCGCACGCATCGCATTCGCATCGCATCGCAGGTCGCATCCG
+1: W Q V R T R G R A E * S G H V A P P A K P A R I A F A S H R R S H P
+2: G R C A R V V A R N K A D T L R P Q R S P H A S H S H R I A G R I R
+3: A G A H A W S R G I K R T R C A P S E A R T H R I R I A S Q V A S D

205-306
DNA: ACGCTAGCGGCTAGCCTAGCCGAACAGCCTGAGCGCGCGAAGATGGCGCCCGCCGACGGCGAGTCCTCCCCGCCGCCGCACGTGGCCGTGGTCGCCTTCCCG
+1: T L A A S L A E Q P E R A K M A P A D G E S S P P P H V A V V A F P
+2: R * R L A * P N S L S A R R W R P P T A S P P R R R T W P W S P S R
+3: A S G * P S R T A * A R E D G A R R R R V L P A A A R G R G R L P V

307-408
DNA: TTCAGCTCCCACGCGGCGGTGCTGCTCTCCATCGCGCGCGCCCTGGCTGCCGCCGCGGCGCCGTCCGGGGCCACGCTCTCGTTCCTCTCCACCGCGTCCTCC
+1: F S S H A A V L L S I A R A L A A A A A P S G A T L S F L S T A S S
+2: S A P T R R C C S P S R A P W L P P R R R P G P R S R S S P P R P P
+3: Q L P R G G A A L H R A R P G C R R G A V R G H A L V P L H R V L P

409-510
DNA: CTCGCGCAGCTCCGCAAGGCCAGCAGCGCCTCCGCCGGGCACGGGCTCCCGGGGAACCTGCGCTTCGTCGAGGTACCGGACGGCGCGCCCGCGGCCGAGGAG
+1: L A Q L R K A S S A S A G H G L P G N L R F V E V P D G A P A A E E
+2: S R S S A R P A A P P P G T G S R G T C A S S R Y R T A R P R P R R
+3: R A A P Q G Q Q R L R R A R A P G E P A L R R G T G R R A R G R G D

511-612
DNA: ACCGTGCCGGTGCCGCGGCAGATGCAGCTGTTCATGGAGGCCGCGGAGGCCGGCGGGGTGAAGGCCTGGCTGGAGGCGGCCCGCGCCGCGGCGGGCGGCGCC
+1: T V P V P R Q M Q L F M E A A E A G G V K A W L E A A R A A A G G A
+2: P C R C R G R C S C S W R P R R P A G * R P G W R R P A P R R A A P
+3: R A G A A A D A A V H G G R G G R R G E G L A G G G P R R G G R R Q

613-714
DNA: AGGGTGACCTGCGTGGTGGGCGACGCGTTCGTGTGGCCGGCGGCGGACGCGGCCGCCTCCGCGGGGGCGCCGTGGGTGCCGGTGTGGACGGCCGCGTCGTGC
+1: R V T C V V G D A F V W P A A D A A A S A G A P W V P V W T A A S C
+2: G * P A W W A T R S C G R R R T R P P P R G R R G C R C G R P R R A
+3: G D L R G G R R V R V A G G G R G R L R G G A V G A G V D G R V V R

715-816
DNA: GCGCTCCTGGCGCACATCCGCACCGACGCGCTCCGGGAGGACGTTGGCGACCAGGGTGCGTTGGATTCTACTACTACTACTTCTCTCCCTTCCTTGTCCCTT
+1: A L L A H I R T D A L R E D V G D Q G A L D S T T T T S L P S L S L
+2: R S W R T S A P T R S G R T L A T R V R W I L L L L L L S L P C P F
+3: A P G A H P H R R A P G G R W R P G C V G F Y Y Y Y F S P F L V P S

817-918
DNA: CATTGCGCGCGGGTTTGATGATCGAATGGCTGTTGCATTTCCATCGTTCGCAGCAGCAAACAGGGTGGACGGGCTACTGATCTCCCACCCGGGCCTCGCCAG
+1: H C A R V * * S N G C C I S I V R S S K Q G G R A T D L P P G P R Q
+2: I A R G F D D R M A V A F P S F A A A N R V D G L L I S H P G L A S
+3: L R A G L M I E W L L H F H R S Q Q Q T G W T G Y * S P T R A S P A

919-1020
DNA: CTACCGCGTCCGTGACCTCCCAGACGGCGTCGTCTCCGGCGACTTCAACTACGTCATCAACCTCCTCGTCCACCGCATGGGGCAGTGCCTCCCGCGCTCTGC
+1: L P R P * P P R R R R L R R L Q L R H Q P P R P P H G A V P P A L C
+2: Y R V R D L P D G V V S G D F N Y V I N L L V H R M G Q C L P R S A
+3: T A S V T S Q T A S S P A T S T T S S T S S S T A W G S A S R A L P

1021-1122
DNA: CGCCGCCGTGGCACTCAACACGTTCCCAGGCCTGGACCCGCCCGACGTCACCGCGGCGCTCGCGGAGATCCTGCCCAACTGCGTCCCGTTCGGCCCCTACCA
+1: R R R G T Q H V P R P G P A R R H R G A R G D P A Q L R P V R P L P
+2: A A V A L N T F P G L D P P D V T A A L A E I L P N C V P F G P Y H
+3: P P W H S T R S Q A W T R P T S P R R S R R S C P T A S R S A P T T

1123-1224
DNA: CCTCCTCCTCGCCGAGGACGACGCCGACACCGCCGCACCAGCCGACCCGCACGGCTGCCTCGCCTGGCTGGGCCGCCAACCCGCGCGCGGCGTCGCGTACGT
+1: P P P R R G R R R H R R T S R P A R L P R L A G P P T R A R R R V R
+2: L L L A E D D A D T A A P A D P H G C L A W L G R Q P A R G V A Y V
+3: S S S P R T T P T P P H Q P T R T A A S P G W A A N P R A A S R T S

1225-1326
DNA: CAGCTTCGGCACGGTGGCGTGCCCGCGGCCCGACGAGCTCCGCGAGCTGGCGGCCGGGCTGGAGGACTCGGGCGCGCCGTTCCTGTGGTCGCTGCGCGAGGA
+1: Q L R H G G V P A A R R A P R A G G R A G G L G R A V P V V A A R G
+2: S F G T V A C P R P D E L R E L A A G L E D S G A P F L W S L R E D
+3: A S A R W R A R G P T S S A S W R P G W R T R A R R S C G R C A R T

1327-1428
DNA: CTCGTGGCCGCACCTCCCGCCGGGTTTCCTGGACCGCGCCGCGGGCACCGGGTCCGGGCTCGTGGTGCCCTGGGCGCCGCAGGTGGCCGTGCTGCGCCACCC
+1: L V A A P P A G F P G P R R G H R V R A R G A L G A A G G R A A P P
+2: S W P H L P P G F L D R A A G T G S G L V V P W A P Q V A V L R H P
+3: R G R T S R R V S W T A P R A P G P G S W C P G R R R W P C C A T L

1429-1530
DNA: TTCCGTGGGCGCGTTCGTGACGCACGCCGGGTGGGCGTCGGTGCTGGAGGGCTTGTCCAGCGGGGTGCCCATGGCGTGCCGCCCCTTCTTCGGCGACCAGCG
+1: F R G R V R D A R R V G V G A G G L V Q R G A H G V P P L L R R P A
+2: S V G A F V T H A G W A S V L E G L S S G V P M A C R P F F G D Q R
+3: P W A R S * R T P G G R R C W R A C P A G C P W R A A P S S A T S G

1531-1632
DNA: GATGAACGCGCGGTCCGTGGCGCACGTGTGGGGGTTCGGCGCCGCGTTCGAGGGCGCTATGACGAGCGCCGGAGTGGCCACGGCCGTGGAGGAGCTGCTGCG
+1: D E R A V R G A R V G V R R R V R G R Y D E R R S G H G R G G A A A
+2: M N A R S V A H V W G F G A A F E G A M T S A G V A T A V E E L L R
+3: * T R G P W R T C G G S A P R S R A L * R A P E W P R P W R S C C A

1633-1734
DNA: CGGGGAGGAAGGGGCGCGGATGAGGGCAAGGGCCAAGGAGCTGCAGGCCTTGGTGGCCGAGGCGTTCGGGCCAGGCGGTGAGTGCAGGAAGAACTTCGACAG
+1: R G G R G A D E G K G Q G A A G L G G R G V R A R R * V Q E E L R Q
+2: G E E G A R M R A R A K E L Q A L V A E A F G P G G E C R K N F D R
+3: G R K G R G * G Q G P R S C R P W W P R R S G Q A V S A G R T S T G

1735-1836
DNA: GTTCGTCGAGATAGTCTGTCGCGCGTGAAAGGTCGTCTTGCTGTTCAGAGGTTTTACCAACAGAAGAACATAATGAATTGGATGGCATGCTACGTCGTATTC
+1: V R R D S L S R V K G R L A V Q R F Y Q Q K N I M N W M A C Y V V F
+2: F V E I V C R A * K V V L L F R G F T N R R T * * I G W H A T S Y S
+3: S S R * S V A R E R S S C C S E V L P T E E H N E L D G M L R R I L

1837-1938
DNA: TCTTTTTTTGTTGATCCCTGAGTTGATACATTTTGTACTTGATACATGAGTTGCAGCAGCAGCAGCAACAGCCTTCTGTACCTTGGCTTTGGATCTGTATTC
+1: S F F V D P * V D T F C T * Y M S C S S S S N S L L Y L G F G S V F
+2: L F L L I P E L I H F V L D T * V A A A A A T A F C T L A L D L Y S
+3: F F C * S L S * Y I L Y L I H E L Q Q Q Q Q Q P S V P W L W I C I L

1939-2040
DNA: TTGTCACCAGTTATCTGAAAGCATCAATAACCTTCTGTCTTCTAGCAGTTGCCTCTCCAGATTGCCAAAATAGCATTTATTATAAGGTCTTATGCAATGTTT
+1: L S P V I * K H Q * P S V F * Q L P L Q I A K I A F I I R S Y A M F
+2: C H Q L S E S I N N L L S S S S C L S R L P K * H L L * G L M Q C F
+3: V T S Y L K A S I T F C L L A V A S P D C Q N S I Y Y K V L C N V F

2041-2142
DNA: TCAGATTGTTCCGATTAAATCTACGATTAGCATTTTAGCCCAGCAGTCCAGCCCATTGAAGGCTTATTCAGTTATTTTTAATCCATATAAATCAAAAAAGAT
+1: S D C S D * I Y D * H F S P A V Q P I E G L F S Y F * S I * I K K D
+2: Q I V P I K S T I S I L A Q Q S S P L K A Y S V I F N P Y K S K K I
+3: R L F R L N L R L A F * P S S P A H * R L I Q L F L I H I N Q K R L

2143-2221
DNA: TGATATAGATTAGAAAATATTTTAGTTTACTAGGAATTAAAACCCCTCAATTTTTCTTAATCCATATAAATTGTGGCAG
+1: * Y R L E N I L V Y * E L K P L N F S * S I * I V A
+2: D I D * K I F * F T R N * N P S I F L N P Y K L W Q
+3: I * I R K Y F S L L G I K T P Q F F L I H I N C G

Attachment 3: Genetic Code (from http://psyche.uthct.edu/shaun/SBlack/geneticd.html)

First                       Second Position of Codon                        Third
Position   T               C               A               G               Position
T          TTT Phe [F]     TCT Ser [S]     TAT Tyr [Y]     TGT Cys [C]     T
           TTC Phe [F]     TCC Ser [S]     TAC Tyr [Y]     TGC Cys [C]     C
           TTA Leu [L]     TCA Ser [S]     TAA Ter [end]   TGA Ter [end]   A
           TTG Leu [L]     TCG Ser [S]     TAG Ter [end]   TGG Trp [W]     G
C          CTT Leu [L]     CCT Pro [P]     CAT His [H]     CGT Arg [R]     T
           CTC Leu [L]     CCC Pro [P]     CAC His [H]     CGC Arg [R]     C
           CTA Leu [L]     CCA Pro [P]     CAA Gln [Q]     CGA Arg [R]     A
           CTG Leu [L]     CCG Pro [P]     CAG Gln [Q]     CGG Arg [R]     G
A          ATT Ile [I]     ACT Thr [T]     AAT Asn [N]     AGT Ser [S]     T
           ATC Ile [I]     ACC Thr [T]     AAC Asn [N]     AGC Ser [S]     C
           ATA Ile [I]     ACA Thr [T]     AAA Lys [K]     AGA Arg [R]     A
           ATG Met [M]     ACG Thr [T]     AAG Lys [K]     AGG Arg [R]     G
G          GTT Val [V]     GCT Ala [A]     GAT Asp [D]     GGT Gly [G]     T
           GTC Val [V]     GCC Ala [A]     GAC Asp [D]     GGC Gly [G]     C
           GTA Val [V]     GCA Ala [A]     GAA Glu [E]     GGA Gly [G]     A
           GTG Val [V]     GCG Ala [A]     GAG Glu [E]     GGG Gly [G]     G

An explanation of the Genetic Code:

DNA is a two-stranded molecule. Each strand is a polynucleotide composed of A (adenosine), T (thymidine), C (cytidine), and G (guanosine) residues polymerized by "dehydration" synthesis in linear chains with specific sequences. Each strand has polarity, such that the 5'-hydroxyl (or 5'-phospho) group of the first nucleotide begins the strand and the 3'-hydroxyl group of the final nucleotide ends the strand; accordingly, we say that this strand runs 5' to 3' ("Five prime to three prime"). It is also essential to know that the two strands of DNA run antiparallel such that one strand runs 5' -> 3' while the other one runs 3' -> 5'. At each nucleotide residue along the double-stranded DNA molecule, the nucleotides are complementary. That is, A forms two hydrogen bonds with T; C forms three hydrogen bonds with G. In most cases the two-stranded, antiparallel, complementary DNA molecule folds to form a helical structure which resembles a spiral staircase. This is the reason why DNA has been referred to as the "Double Helix".

One strand of DNA holds the information that codes for various genes; this strand is often called the template strand or antisense strand (containing anticodons). The other, and complementary, strand is called the coding strand or sense strand (containing codons). Since mRNA is made from the template strand, it has the same information as the coding strand. The table above refers to triplet nucleotide codons along the sequence of the coding or sense strand of DNA as it runs 5' -> 3'; the code for the mRNA would be identical but for the fact that RNA contains U (uridine) rather than T.
An example of two complementary strands of DNA would be:

(5' -> 3') ATGGAATTCTCGCTC (Coding, sense strand)
(3' <- 5') TACCTTAAGAGCGAG (Template, antisense strand)
(5' -> 3') AUGGAAUUCUCGCUC (mRNA made from Template strand)

Since amino acid residues of proteins are specified as triplet codons, the protein sequence made from the above example would be Met-Glu-Phe-Ser-Leu... (MEFSL...).

Practically, codons are "decoded" by transfer RNAs (tRNA) which interact with a ribosome-bound messenger RNA (mRNA) containing the coding sequence. There are 64 possible codons. 61 of these specify an amino acid and are recognized by the anticodon loop of a tRNA that carries the matching aminoacyl residue; the appropriate "charged" tRNA binds to the respective next codon in the mRNA and the ribosome catalyzes the transfer of the amino acid from the tRNA to the growing (nascent) protein/polypeptide chain. The remaining 3 codons are used for "punctuation"; that is, they signal the termination (the end) of the growing polypeptide chain.

Lastly, the Genetic Code in the table above has also been called "The Universal Genetic Code". It is known as "universal" because it is used by nearly all known organisms. The universality of the genetic code encompasses animals (including humans), plants, fungi, archaea, bacteria, and viruses. However, all rules have their exceptions, and such is the case with the Genetic Code; small variations in the code exist in mitochondria and certain microbes. Nonetheless, it should be emphasized that these variances represent only a small fraction of known cases, and that the Genetic Code applies quite broadly, including to the great majority of known nuclear genes.
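The worked example above (coding strand, template strand, mRNA, protein) can be reproduced with a short Python sketch. It is an illustration only: the codon table is built from the standard code shown in Attachment 3, and the helper names `template_strand`, `transcribe`, and `translate` are hypothetical names chosen for this sketch.

```python
# Build the standard codon table in the same T, C, A, G order as Attachment 3.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO_ACIDS
    )
}

def template_strand(coding):
    """Base-by-base complement of the coding strand (written 3' -> 5')."""
    return coding.upper().translate(str.maketrans("ACGT", "TGCA"))

def transcribe(coding):
    """mRNA has the same sequence as the coding strand, with U instead of T."""
    return coding.upper().replace("T", "U")

def translate(coding):
    """Translate the coding strand codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(coding) - 2, 3):
        aa = CODON_TABLE[coding[i:i + 3].upper()]
        if aa == "*":          # TAA, TAG, TGA end the polypeptide
            break
        protein.append(aa)
    return "".join(protein)

coding = "ATGGAATTCTCGCTC"
print(template_strand(coding))  # TACCTTAAGAGCGAG
print(transcribe(coding))       # AUGGAAUUCUCGCUC
print(translate(coding))        # MEFSL
```

Translating from a different starting offset (for example `translate(coding[1:])`) gives the other reading frames, which is how the +1, +2, and +3 rows in Attachment 2 relate to the same DNA.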
