Chapter 1 Basic concepts of Molecular Biology
Outline
To present basic concepts of molecular biology
•Biological background
•Literature on computational molecular biology
To point out some (but not all) of the most notable exceptions to the
general rules
--nothing is 100% valid in molecular biology
Three molecules we will study
DNA
A string over alphabet {A,C,G,T}
RNA
Primary structure – a string over alphabet {A,C,G,U}
Secondary and tertiary structures
Protein
Primary structure – a string over alphabet
{A,R,N,D,C,Q,E,G,H,I,L,K,M,F,P,S,T,W,Y,V}
Secondary and tertiary structures
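Treating the three molecules as strings over fixed alphabets can be sketched in a few lines of Python. This is only an illustration: the alphabets are the ones listed above, while the function name `classify_sequence` is an assumption made for this example.

```python
# Alphabets as listed on the slide; the names below are illustrative only.
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")
PROTEIN_ALPHABET = set("ARNDCQEGHILKMFPSTWYV")


def classify_sequence(seq):
    """Return the molecule types whose alphabet contains every symbol of seq."""
    symbols = set(seq.upper())
    kinds = []
    for name, alphabet in (("DNA", DNA_ALPHABET),
                           ("RNA", RNA_ALPHABET),
                           ("protein", PROTEIN_ALPHABET)):
        if symbols <= alphabet:  # subset test: all symbols are allowed
            kinds.append(name)
    return kinds


print(classify_sequence("ACGU"))  # only the RNA alphabet contains U
print(classify_sequence("MKWV"))  # only the protein alphabet fits
```

Note that A, C, G and T are also amino acid letters, so a string such as "ACGT" is a valid string over both the DNA and the protein alphabet; the alphabet alone does not always identify the molecule.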
Central dogma of molecular biology
DNA makes RNA makes protein
DNA: stores genetic information
regular double helix structure
building blocks: 4 nucleotides A, C, G, and T (Adenine, Cytosine, Guanine,
Thymine)
RNA: intermediate for protein synthesis (messenger RNA); catalytic and
regulatory function (non-coding RNA)
building blocks: 4 nucleotides A, C, G, and U (U = Uracil) and some rare
other nucleotides
Protein: catalytic and regulatory function ("enzymes")
building blocks: 20 amino acids + 1 rare amino acid
1.1 Life
Living and nonliving things
Difference
• Living: active (able to move, reproduce, grow, eat)
1. They act the way they do because of a complex array of
chemical reactions that occur inside them
2. They constantly exchange matter and energy with their
surroundings
• Dead: in equilibrium with its surroundings
• Exceptions: seeds and viruses can be inactive but not dead
What they have in common
• Composed of the same atoms, and conforming to the same
physical and chemical rules
1.1 Life
Proteins and nucleic acids
main actors in the chemistry of life: molecules
Proteins
• 1) responsible for what a living being is and does in
physical sense;
• 2) Russell Doolittle: “we are our proteins”
nucleic acids
• 1) encode the information necessary to produce
proteins;
• 2) responsible for passing this “recipe” along to
subsequent generations
molecular biology research
understanding the structure and function of proteins
and nucleic acids, the molecules that are the fundamental objects of
this course
1.2 proteins
Proteins: most substances in our bodies;
many kinds, many functions
Enzymes: proteins that act as catalysts of chemical reactions
A protein: a chain of simpler molecules called
amino acids (AA)
20 different AAs
exception: a few nonstandard amino acids
How is a protein made?
Ribosome: the cell structure that produces
proteins by assembling amino acids one by
one
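Examples of amino acids: alanine and threonine
Every amino acid has:
• one central carbon atom, the alpha carbon (Cα)
• an amino group (NH2)
• a hydrogen atom (H)
• a carboxy group (COOH)
• a side chain, which distinguishes one amino acid from another,
e.g. 1) glycine: one hydrogen atom; 2) tryptophan: two carbon rings
Residue: what remains of an amino acid once it is joined into a protein chain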
Protein and Amino Acids
A typical protein has about 300 residues.
A protein sequence
>gi|7228451|dbj|BAA92411.1| EST AU055734(S20025) corresponds to a region …
MCSYIRYDTPKLFTHVTKTPPKNQVSNSINDVGSRRATDRSVASCSSEKSVGTMSVKNASSISFEDIEKSISNWKIPKVN
IKEIYHVDTDIHKVLTLNLQTSGYELELGSENISVTYRVYYKAMTTLAPCAKHYTPKGLTTLLQTNPNNRCTTPKTLKWD
EITLPEKWVLSQAVEPKSMDQSEVESLIETPDGDVEITFASKQKAFLQSRPSVSLDSRPRTKPQNVVYATYEDNSDEPSI
SDFDINVIELDVGFVIAIEEDEFEIDKDLLKKELRLQKNRPKMKRYFERVDEPFRLKIRELWHKEMREQRKNIFFFDWYE
SSQVRHFEEFFKGKNMMKKEQKSEAEDLTVIKKVSTEWETTSGNKSSSSQSVSPMFVPTIDPNIKLGKQKAFGPAISEEL
VSELALKLNNLKVNKNINEISDNEKYDMVNKIFKPSTLTSTTRNYYPRPTYADLQFEEMPQIQNMTYYNGKEIVEWNLDG
FTEYQIFTLCHQMIMYANACIANGNKEREAANMIVIGFSGQLKGWWNNYLNETQRQEILCAVKRDDQGRPLPDRDGNGNP
TELKEGFHMEEKDEPIQEDDQVVGTIQKYTKQKWYAEVMYRFIDGSYFQHITLIDSGADVNCIREDEILDQLVQTKREQV
VNSIYLHDNSFPKSMDLPDQKITEKRAKLQDIPHHEERLLDYREKKSRDGQDKLPMEVEQSMATNKNTKILLRAWLLST
A protein sequence may have from a few hundred to several
thousand amino acids.
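The sequence above is in FASTA layout: a header line starting with ">" followed by lines of residue letters. A minimal sketch of reading such a record and counting its residues might look as follows. The function name `parse_fasta` and the toy record are assumptions made for illustration; the toy record is not a real database entry.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical toy record, used only to exercise the parser.
example = """>toy_protein illustrative record
MCSYIRYDTP
KLFTHVTK
"""
records = parse_fasta(example)
header, sequence = records[0]
print(header, len(sequence))
```

The number of residues is simply the length of the concatenated sequence string.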
The basic chemical structure of an amino
acid. Carbon atoms are black,
oxygen is dark grey, nitrogen light grey,
and hydrogen white.
Backbone
A polypeptide chain. The Ri side chains identify the component amino acids.
Atoms inside each quadrilateral are on the same plane, which can rotate according
to angles φ and ψ.
1.2 proteins
Primary structure: a linear sequence of residues joined by peptide
bonds
Proteins fold in three dimensions
• secondary: interactions between backbone atoms;
“local” structures such as helices; side chains move
• tertiary: packing of secondary structures
• quaternary: packing of several different
protein chains
Protein structure
1.2 proteins
Structure prediction
Secondary structure prediction
--exact folding: to specify all φ-ψ pairs in a protein
Three-dimensional structure prediction
--determining the folding of a protein is one of the main research areas in
molecular biology because:
• 1) shape is related to function
• 2) being built from 20 kinds of amino acids, the resulting 3D structure can be very complex
• 3) there is no simple and accurate method for determining 3D structure
--goal: predict a molecule’s structure from its primary structure
Shape determines function
a folded protein has an irregular shape, which lets it bind to some other specific
molecules
the kinds of molecules a protein can bind to depend on its shape
1.2 protein binding
1.3 NUCLEIC ACIDS / 1.3.1 DNA
Two kinds of nucleic acids in living organisms
RNA: ribonucleic acid
DNA: deoxyribonucleic acid
Strand and Orientation
DNA: a molecule made of a double chain (two strands) of simpler
molecules
each strand repeats the same basic unit: a sugar, 2’-deoxyribose,
attached to a phosphate residue
-> the sugar molecule has 5 carbon atoms, labeled 1’ to 5’
the backbone: links the 3’ carbon of one unit to the 5’ carbon of the next unit
Orientation: 5’->3’ is canonical; used in (1) technical papers, (2) books,
(3) sequence database files
DNA: deoxyribose sugar
Ribose
2’-deoxyribose
1.3.1 DNA
Bases
Base: attached to the 1’ carbon atom of the sugar; 4 kinds
• Pairing: A-T; C-G
• A, G: purines; C, T: pyrimidines
Complementarity of organic bases
1.3.1 DNA
Complement; complementary bases; bp
The two strands form a double helix, a structure discovered by James Watson
and Francis Crick in 1953
Watson-Crick base pairs: A pairs with T; C pairs with G
bp (base pair): unit of length; e.g. a piece of DNA 100,000 bp long is 100 kb
The strands are antiparallel; reverse complementation; replication
5’ … TACTGAA … 3’
3’ … ATGACTT … 5’
Given one DNA chain “AGACGT”, can we get the complementary
chain?
1.3.1 DNA
Reverse complementation: the operation that infers
the sequence of one strand given the other:
s → s’ (reversed) → s bar (complemented): AGACGT → TGCAGA → ACGTCT (always written 5’->3’)
Replication: it is precisely this mechanism that
allows DNA in a cell to replicate, therefore
allowing an organism that starts its life as
one cell to grow into billions of other cells,
each one carrying copies of the DNA
molecules from the original cell
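Reverse complementation is easy to express on strings. The sketch below assumes the sequence contains only the four letters A, C, G, T in upper case; the function names are chosen for this example.

```python
# Watson-Crick pairing for DNA: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Complement each base, keeping the order (result reads 3'->5')."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Sequence of the opposite strand, written 5'->3'."""
    return complement(strand)[::-1]


print(complement("TACTGAA"))         # the paired strand, read 3'->5'
print(reverse_complement("AGACGT"))  # the opposite strand, read 5'->3'
```

Applying `reverse_complement` twice returns the original strand, which reflects the fact that each strand fully determines the other.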
1.3.2 RNA
Sugar: ribose, not 2’-deoxyribose
Uracil (U) instead of thymine (T)
does not form a double helix
exception: RNA-DNA hybrid helices
parts of an RNA molecule may bind to other parts of the
same molecule by complementarity
the 3-dimensional structure of RNA is far more varied
than that of DNA
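Binding within a single RNA molecule by complementarity can be illustrated with a small check on two segments of the same string. This is a simplified sketch: it only considers the A-U and C-G pairs, ignores every other factor that influences real RNA folding, and the function name and the toy sequence are assumptions made for this example.

```python
# Pairing rules for RNA used in this sketch: A with U, C with G.
RNA_PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}


def can_pair(left, right):
    """True if two segments, both written 5'->3', pair in antiparallel fashion."""
    if len(left) != len(right):
        return False
    # The first base of `left` faces the last base of `right`, and so on.
    return all(RNA_PAIR.get(a) == b for a, b in zip(left, reversed(right)))


# Toy molecule: a 5' arm, an unpaired loop, and a 3' arm.
rna = "GGGCAAAAGCCC"
arm_5, loop, arm_3 = rna[:4], rna[4:8], rna[8:]
print(can_pair(arm_5, arm_3))  # the two arms can fold back on each other
print(can_pair(arm_5, loop))   # the loop cannot pair with the 5' arm
```

When the two arms pair, the molecule folds into a stem with a loop at the end, one of the local shapes that make RNA structure so varied.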
Function
DNA: performs essentially one function, encoding information
RNA: performs different functions; there are different kinds
of RNAs
RNA structure
The three-dimensional structure of RNA is far more varied than that of DNA.
tRNA
2D representation (typical tRNA clover-leaf)
1.4 the mechanisms of molecular genetics
DNA: “the blueprint of life”
- information necessary to build each protein or RNA found
in an organism is encoded in DNA molecules
outline
(1) describe the encoding
(2) protein synthesis: how a protein is built from the
information in DNA
(3) how information (genetic information) in DNA
is passed along from a parent to its offspring
1.4.1 genes and the genetic code
Chromosome and gene
Chromosome: a very long DNA molecule
• Each cell of an organism has a few chromosomes
gene
• Definition: a contiguous stretch of DNA that contains the
information necessary to build a protein or an RNA
molecule
• Lengths vary; a human gene can be on the order of 10,000 bp
• Certain cell mechanisms are capable of recognizing in the
DNA the precise points at which a gene starts and at which
it ends
codon and genetic code
codon
• Definition: a triplet of consecutive nucleotides
• Role: each codon in a gene “specifies” one AA of the protein
1.4.1 genes and the genetic code
genetic code
• Definition: the table that gives the
correspondence between each possible triplet and an
AA
• Why the table uses RNA bases: RNA molecules
provide the link between DNA and actual protein
synthesis
1. 64 possible nucleotide triplets correspond to
only 20 AAs, so several triplets can specify the same AA
2. STOP: 3 codons signal the end of the gene
3. Exceptions
Genetic code
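The counts quoted for the genetic code (64 triplets, 20 amino acids, 3 STOP codons) can be checked by building the table as a dictionary. The sketch below assumes the standard genetic code written with RNA bases; the compact 64-letter string lists the amino acid for each codon in the order UUU, UUC, UUA, UUG, UCU, and so on, with "*" standing for STOP. Variable names are chosen for this example.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, first base varying slowest.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")

CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

counts = Counter(CODON_TABLE.values())
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
amino_acids = set(counts) - {"*"}

print(len(CODON_TABLE))  # number of possible triplets
print(len(amino_acids))  # number of distinct amino acids
print(stop_codons)       # codons that specify STOP
print(counts["L"])       # how many codons specify leucine
```

Because 64 codons map onto 20 amino acids plus STOP, most amino acids are specified by more than one codon.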
1.4.2 Transcription, translation, and protein synthesis
Transcription: the process by which messenger RNA (mRNA) is produced
Promoter: a region before each gene
in DNA; it serves as an indication to the cellular
mechanism that a gene is ahead
mRNA: a copy of the gene, with exactly the same
sequence as one of the strands of the gene but
with U substituted for T
Introns: parts of a gene that are not used in
protein synthesis; they are spliced out of the mRNA
-> the shortened mRNA, made of exons plus regulatory regions, leaves the nucleus
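Transcription and intron removal can be sketched as two string operations. This is a toy model under stated assumptions: the gene strand given is the one whose sequence the mRNA copies, the intron positions are supplied by hand as 0-based half-open intervals (in the cell they are recognized by the splicing machinery), and the gene sequence itself is invented for this example.

```python
def transcribe(gene_strand):
    """Copy the gene strand into mRNA by substituting U for T."""
    return gene_strand.upper().replace("T", "U")


def splice(pre_mrna, introns):
    """Remove introns given as (start, end) intervals; keep and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # last exon
    return "".join(exons)


# Hypothetical gene: exon "ATGGCC", intron "GTAAGTTTTCAG", exon "TGGTAA".
gene = "ATGGCC" + "GTAAGTTTTCAG" + "TGGTAA"
pre_mrna = transcribe(gene)
mature_mrna = splice(pre_mrna, [(6, 18)])
print(pre_mrna)
print(mature_mrna)
```

The spliced result is shorter than the gene, which is why the mRNA that leaves the nucleus carries only the exons.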
1.4.2 Transcription, translation, and protein synthesis
Translation: the process in which mRNA is translated
into protein
tRNA:
• Makes the connection between a codon and a specific AA
• Number of tRNA types: varies among species; not 64
• Some codons are not represented; some tRNAs can
bind to more than one codon
Three bases of tRNA
To bind amino acids
The structure of tRNA
1.4.2 Transcription, translation, and protein synthesis
Process
• As the mRNA passes through the ribosome, a tRNA binds to the current codon
• Its attached AA falls in place just next to the previous
AA in the protein chain being formed
• An enzyme catalyzes the addition of the AA to the protein chain, releasing
it from the tRNA
• Synthesis ends when a STOP codon appears, since no
tRNA matches it
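The steps above can be sketched as a loop that reads the mRNA one codon at a time and stops at the first STOP codon. This is an illustration only: it assumes the standard genetic code, starts reading at the first base of the string, and does not model the ribosome or the tRNAs themselves. The function name and the example mRNA are chosen for this sketch.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon ("*" = STOP), in the order
# UUU, UUC, UUA, UUG, UCU, ... with the first base varying slowest.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate an mRNA string codon by codon until a STOP codon is met."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # STOP codon: no amino acid is added, synthesis ends
        protein.append(amino_acid)  # chain grows by one residue
    return "".join(protein)


print(translate("AUGGCCUGGUAA"))
```

Bases left over after the last complete codon are ignored, and anything after the first STOP codon is never read.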
See movie
Protein synthesis
1.4.2 Transcription, translation, and protein synthesis
The Central Dogma of Molecular Biology
replication
DNA
transcription
RNA
translation
Protein
Exception – retroviruses
genotype
phenotype