CS 6990
Bioinformatics
Fall 2004
Dr. Susan Bridges
Department of Computer Science and Engineering
Bioinformatics
DNA, RNA, and Protein
Macromolecule   Repeating Unit                                          Role
DNA             Deoxyribonucleotides (A,C,G,T)                          Genome
RNA             Ribonucleotides (A,C,G,U)                               Genome, messenger, gene product
Protein         Amino acids (A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y)   Gene product
Central Dogma of Molecular Biology
DNA -> DNA (replication)
DNA -> RNA (transcription)
RNA -> Protein (translation)
RNA -> DNA (reverse transcription)
DNA
Base Pairs (bp)
Purines: A, G
Pyrimidines: C, T
Base Pairs in Detail
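String Representation
[Figure: double-stranded DNA written as two antiparallel strings with 5’ and 3’ end labels; the one strand that survives in this transcript reads TACTGAGGC.]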
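Because each base pairs with exactly one partner (A with T, C with G), one strand as a string determines the other. The following is a minimal sketch of that idea, not part of the original slides; the function names are illustrative, and it assumes an unambiguous upper- or lower-case A/C/G/T input.

```python
# Watson-Crick pairing for DNA: A with T, C with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(strand: str) -> str:
    """Partner strand, base by base, in the same left-to-right order."""
    return strand.upper().translate(COMPLEMENT)

def reverse_complement(strand: str) -> str:
    """Partner strand rewritten so that it also reads 5' to 3'."""
    return complement(strand)[::-1]

print(complement("TACTGAGGC"))          # ATGACTCCG
print(reverse_complement("TACTGAGGC"))  # GCCTCAGTA
```

The reversal is needed because the two strands run in opposite directions: the partner of a strand written 5’ to 3’ comes out 3’ to 5’ unless it is flipped.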
How RNA differs from DNA
1. Sugar is ribose rather than deoxyribose
2. Thymine (T) replaced by uracil (U)
3. Does not typically form a double helix
4. Performs many functions
Nucleotide Codes
Code  Meaning
A     Adenine
G     Guanine
C     Cytosine
T     Thymine
U     Uracil
R     Purine (A or G)
Y     Pyrimidine (C or T)
N     Any nucleotide
W     Weak (A or T)
S     Strong (G or C)
M     Amino (A or C)
K     Keto (G or T)
B     Not A (G or C or T)
H     Not G (A or C or T)
D     Not C (A or G or T)
V     Not T (A or G or C)
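The ambiguity codes are often used to write sequence patterns. As a rough sketch (not from the original slides), the table can be turned into a regular expression over concrete bases; the names IUPAC, iupac_to_regex, and matches are illustrative choices, and the pattern TAYR is a made-up example.

```python
import re

# The nucleotide codes from the table above, as sets of concrete bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "AG", "Y": "CT", "N": "ACGT",
    "W": "AT", "S": "GC", "M": "AC", "K": "GT",
    "B": "CGT", "H": "ACT", "D": "AGT", "V": "ACG",
}

def iupac_to_regex(pattern: str) -> "re.Pattern":
    """Compile a string of nucleotide codes into a regex over concrete bases."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC[code]  # raises KeyError for a symbol outside the table
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))

def matches(pattern: str, sequence: str) -> bool:
    """True if the whole sequence is one instance of the pattern."""
    return iupac_to_regex(pattern).fullmatch(sequence.upper()) is not None

print(matches("TAYR", "TACG"))  # True: C is a pyrimidine, G is a purine
print(matches("TAYR", "TAGG"))  # False: G is not a pyrimidine
```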
Biology Terms
• Prokaryotes:
– “Primitive” organisms that do not have a nuclear membrane
– Includes the bacteria and the archaea
• Eukaryotes:
– “Higher” organisms in which the genetic material is localized in the nucleus of the cells.
– Includes plants, animals, fungi, and protozoa (e.g. corn, humans, yeast)
Protein
• The most “active” molecules in organisms are proteins
– Structural
– Enzymes
• Proteins are polymers of amino acids: a long string of amino acid residues
• 20 amino acids (+ a few unusual ones that occur occasionally)
Protein Backbone
• Backbone
• N-terminus (N-terminal) (amino group)
• C-terminus (C-terminal) (carboxyl group)
Genes and the Genetic Code
• Each chromosome is a long chain of DNA
• Certain sequences on the chromosome contain the code for a protein. These are called genes.
• A gene is composed of a sequence of codons
– A codon is a nucleotide triplet (3 base sequence)
– The first triplet in a gene is a special codon called the start codon (usually AUG)
– The gene ends with a stop codon (UAA, UAG, or UGA in the standard code).
• The genetic code assigns each 3-letter codon to an amino acid (or to a stop signal).
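Reading a message codon by codon can be sketched in a few lines. This is an illustrative example rather than anything from the original slides: it assumes the standard genetic code (other codes exist), builds the 64-entry table from the conventional U, C, A, G ordering, and uses one-letter amino acid codes. The names CODON_TABLE and translate are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional U, C, A, G ordering.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first in-frame stop."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# AUG CCG UUA GAC CGU UAG ... -> Met Pro Leu Asp Arg, then stop
print(translate("AUGCCGUUAGACCGUUAGCGGACC"))  # MPLDR
```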
Features of the Genetic Code
• Written in linear form in terms of sequences of bases.
• Each “word” in the code is a sequence of 3 bases.
• The code is degenerate: most amino acids can be specified by more than one codon.
• The code contains start and stop signals but no internal punctuation (commaless).
• The code is non-overlapping (codons are read in a single reading frame).
• The code is nearly universal.
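Degeneracy can be checked by counting how many of the 64 codons map to each amino acid. The sketch below is illustrative and assumes the standard genetic code, written in the conventional U, C, A, G ordering with "*" for stop; the variable names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons that specify each amino acid (or stop).
codons_per_symbol = Counter(CODON_TABLE.values())
single_codon = sorted(aa for aa, n in codons_per_symbol.items() if n == 1)

print(len(CODON_TABLE))        # 64 possible triplets
print(codons_per_symbol["L"])  # 6 codons for leucine
print(codons_per_symbol["*"])  # 3 stop codons
print(single_codon)            # ['M', 'W']: only Met and Trp have a single codon
```

With 64 triplets and only 20 amino acids plus stop, most amino acids necessarily have several codons.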
Analogy
Acoapzzcordkathedogatetheratpercliosidklancocoaiem
ifuzzclqzthecatandthehatareredpercopoqpooijcc9a8cjkal;c
ackcccjasoeuejlschjw8eicnxkdoaoejknthecrivhejelpauenvy
pzznccmqthecowranforthedogandthecatandthedogateper
cxqoicqickvperyerlcaperkcaeiakd
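In the analogy, readable words are buried in noise; in a sequence, the counterpart is a stretch that begins with a start codon and runs in frame to a stop codon. The following is a simplified sketch of that scan, not a method from the slides: the function name find_orfs, the min_codons parameter, and the test sequence are all hypothetical, and real gene finding involves much more than this.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def find_orfs(mrna: str, min_codons: int = 2) -> list:
    """Return sorted (start, end) index pairs for AUG...stop stretches.

    Each of the three reading frames is scanned separately; `end` is the
    index just past the stop codon, so mrna[start:end] includes it.
    """
    mrna = mrna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(mrna):
            if mrna[i:i + 3] == "AUG":
                for j in range(i + 3, len(mrna) - 2, 3):
                    if mrna[j:j + 3] in STOP_CODONS:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3))
                        i = j  # continue scanning after this stop codon
                        break
            i += 3
    return sorted(orfs)

# Made-up sequence: two junk bases, one open reading frame, two junk bases.
sequence = "CC" + "AUGCCGUUAGACCGUUAG" + "GA"
print(find_orfs(sequence))  # [(2, 20)]
print(sequence[2:20])       # AUGCCGUUAGACCGUUAG
```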
Messenger RNA has Copy of Message from DNA
Flow of Genetic Information
Gene: DNA template strand (antisense)   TAC GGC CAA
        | transcription
Messenger RNA (mRNA)                    AUG CCG GUU
        | translation on ribosomes
Protein                                 met pro val
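The two steps of this flow can be traced in code. This is a minimal sketch under stated assumptions, not part of the original slides: the template strand is taken to be TACGGCCAA, the codon dictionary is only an excerpt of the standard genetic code (in which CCG codes for proline), and the function names are illustrative.

```python
# Template-strand base -> mRNA base. In RNA, A on the template pairs with U.
TEMPLATE_TO_MRNA = str.maketrans("ACGT", "UGCA")

# Excerpt of the standard genetic code: only the codons this example needs.
CODON_EXCERPT = {"AUG": "met", "CCG": "pro", "GUU": "val"}

def transcribe(template: str) -> str:
    """mRNA complementary to a template strand, position by position."""
    return template.upper().translate(TEMPLATE_TO_MRNA)

def translate(mrna: str) -> list:
    """Amino acid for each consecutive, non-overlapping codon."""
    return [CODON_EXCERPT[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)]

mrna = transcribe("TACGGCCAA")
print(mrna)             # AUGCCGGUU
print(translate(mrna))  # ['met', 'pro', 'val']
```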
More terminology
• Promoter: a sequence sometimes used to recognize where transcription of a gene starts
• DNA has 2 strands
– Coding strand (sense): looks like mRNA
– Template strand (anticoding or antisense): transcribed
• The template strand is “read” from the 3’ end to the 5’ end to make mRNA
• mRNA is built from 5’ to 3’
• Upstream: before the start of the gene
• Downstream: after the end of the gene
RNA synthesis
2 complementary DNA strands
Coding strand    5’ ATGCCGTTAGACCGTTAGCGGACC
Template strand  3’ TACGGCAATCTGGCAATCGCCTGG
RNA              5’ AUGCCGUUAGACCGUUAGCGGACC
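The relationship among the three sequences can be verified directly. The sketch below is illustrative (the variable names are not from the slides) and relies on the template strand being written 3’ to 5’ so that it lines up base for base with the coding strand.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")   # DNA base -> paired DNA base
TEMPLATE_TO_RNA = str.maketrans("ACGT", "UGCA")  # template base -> RNA base

coding = "ATGCCGTTAGACCGTTAGCGGACC"    # written 5' to 3'
template = "TACGGCAATCTGGCAATCGCCTGG"  # written 3' to 5', aligned with coding

# RNA is synthesized as the complement of the template strand ...
rna_from_template = template.translate(TEMPLATE_TO_RNA)
# ... which is why it looks like the coding strand with U in place of T.
rna_from_coding = coding.replace("T", "U")

print(coding.translate(DNA_COMPLEMENT) == template)  # True
print(rna_from_template)                             # AUGCCGUUAGACCGUUAGCGGACC
print(rna_from_template == rna_from_coding)          # True
```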
Most genes in eukaryotes are divided into coding sections (exons) and noncoding sections (introns).
Introns are spliced out of the mRNA. Only exons are used to build the protein.
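As a string operation, splicing keeps the exon segments and joins them in order. This is a toy sketch, not from the original slides: the transcript, the exon coordinates, and the function name splice are all hypothetical, and the intron is shown in lower case only for readability.

```python
def splice(pre_mrna: str, exons: list) -> str:
    """Join exon segments, given in order as (start, end) half-open index pairs."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical transcript: exon 1, one intron (lower case), exon 2.
pre_mrna = "AUGCCG" + "guaaguuucag" + "UUAGACCGUUAG"
exons = [(0, 6), (17, 29)]  # assumed coordinates of the two exons

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)  # AUGCCGUUAGACCGUUAG
```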
Web References
• Access Excellence Graphics Gallery,
http://www.accessexcellence.org/AB/GG/
• http://bioinfo.mbb.yale.edu/course/classes/c2/ppframe.htm
• http://cmgm.stanford.edu/biochem218/01Representation.html
• http://www.math.tau.ac.il/~rshamir/algmb/01/algmb01.html
• http://www.hgmp.mrc.ac.uk/GenomeWeb/docs-bioinformatics.html