Chapter 1 Basic concepts of Molecular Biology
Outline
To present basic concepts of molecular biology
•Biological background
•Literature on computational molecular biology
To point out some, but not all, of the most notable exceptions to general rules
-- nothing is 100% valid in molecular biology
Three molecules we will study
DNA
• A string over alphabet {A,C,G,T}
RNA
• Primary structure – a string over alphabet {A,C,G,U}
• Secondary and tertiary structures
Protein
• Primary structure – a string over alphabet {A,R,N,D,C,Q,E,G,H,I,L,K,M,F,P,S,T,W,Y,V}
• Secondary and tertiary structures
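Because each primary structure is a string over a fixed alphabet, a sequence can be checked against the three alphabets directly. The sketch below is only an illustration; the function name classify is an assumption, not part of the course material. Note that a short DNA string can also be a valid protein string, since A, C, G and T are amino acid letters too.

```python
# Alphabets for the three kinds of molecules, as listed on the slide.
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")
PROTEIN_ALPHABET = set("ARNDCQEGHILKMFPSTWYV")


def classify(sequence):
    """Return the names of every alphabet the sequence fits (hypothetical helper)."""
    symbols = set(sequence.upper())
    candidates = (
        ("DNA", DNA_ALPHABET),
        ("RNA", RNA_ALPHABET),
        ("protein", PROTEIN_ALPHABET),
    )
    # A sequence fits an alphabet when its symbols are a subset of it.
    return [name for name, alphabet in candidates if symbols <= alphabet]


print(classify("AGACGT"))      # fits DNA, and also the protein alphabet
print(classify("AGACGU"))      # contains U, so only RNA fits
print(classify("MCSYIRYDTP"))  # only the protein alphabet fits
```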
Central dogma of molecular biology
DNA makes RNA makes protein
DNA: stores genetic information
• regular double helix structure
• building blocks: 4 nucleotides A, C, G, and T (Adenine, Cytosine, Guanine, Thymine)
RNA: intermediate for protein synthesis (messenger RNA); catalytic and regulatory function (non-coding RNA)
• building blocks: 4 nucleotides A, C, G, and U (U = Uracil) and some rare other nucleotides
Protein: catalytic and regulatory function ("enzymes")
• building blocks: 20 amino acids + 1 rare aa
1.1 Life
• Living and nonliving things
• Difference
  - Living: active (able to move, reproduce, grow, eat)
    1. They act the way they do because of a complex array of chemical reactions that occur inside them
    2. They constantly exchange matter and energy with their surroundings
  - Dead: in equilibrium with its surroundings
  - Exceptions: seeds and viruses, which can be inactive but not dead
• Commonality
  - Composed of the same atoms, and conforming to the same physical and chemical rules
• Proteins and nucleic acids
  - The main actors in the chemistry of life are molecules
• Proteins
  - 1) responsible for what a living being is and does in a physical sense
  - 2) Russell Doolittle: “we are our proteins”
• Nucleic acids
  - 1) encode the information necessary to produce proteins
  - 2) responsible for passing this “recipe” along to subsequent generations
• Molecular biology research
  - Understanding the structure and function of proteins and nucleic acids, the molecules that are the fundamental objects of this course
1.2 proteins
• Proteins: most substances in our bodies are proteins; there are many kinds, with many functions
• Enzymes: proteins whose role is to catalyze chemical reactions
• A protein is a chain of simpler molecules called amino acids (AA)
  - 20 different AAs
  - Exception: a few nonstandard amino acids
• How is a protein made?
  - Ribosome: a cell structure that produces proteins by assembling amino acids one by one
• Examples of amino acids: alanine and threonine. Every amino acid has:
  - one central carbon atom, the alpha carbon (Cα)
  - an amino group (NH2)
  - a hydrogen atom (H)
  - a carboxy group (COOH)
  - a side chain, e.g. 1) glycine: one hydrogen atom; 2) tryptophan: two carbon rings
• Residue: what remains of an amino acid once it has been joined into the protein chain
Protein and Amino Acids
A typical protein has about 300 residues.
A protein sequence
>gi|7228451|dbj|BAA92411.1| EST AU055734(S20025) corresponds to a region …
MCSYIRYDTPKLFTHVTKTPPKNQVSNSINDVGSRRATDRSVASCSSEKSVGTMSVKNASSISFEDIEKSISNWKIPKVN
IKEIYHVDTDIHKVLTLNLQTSGYELELGSENISVTYRVYYKAMTTLAPCAKHYTPKGLTTLLQTNPNNRCTTPKTLKWD
EITLPEKWVLSQAVEPKSMDQSEVESLIETPDGDVEITFASKQKAFLQSRPSVSLDSRPRTKPQNVVYATYEDNSDEPSI
SDFDINVIELDVGFVIAIEEDEFEIDKDLLKKELRLQKNRPKMKRYFERVDEPFRLKIRELWHKEMREQRKNIFFFDWYE
SSQVRHFEEFFKGKNMMKKEQKSEAEDLTVIKKVSTEWETTSGNKSSSSQSVSPMFVPTIDPNIKLGKQKAFGPAISEEL
VSELALKLNNLKVNKNINEISDNEKYDMVNKIFKPSTLTSTTRNYYPRPTYADLQFEEMPQIQNMTYYNGKEIVEWNLDG
FTEYQIFTLCHQMIMYANACIANGNKEREAANMIVIGFSGQLKGWWNNYLNETQRQEILCAVKRDDQGRPLPDRDGNGNP
TELKEGFHMEEKDEPIQEDDQVVGTIQKYTKQKWYAEVMYRFIDGSYFQHITLIDSGADVNCIREDEILDQLVQTKREQV
VNSIYLHDNSFPKSMDLPDQKITEKRAKLQDIPHHEERLLDYREKKSRDGQDKLPMEVEQSMATNKNTKILLRAWLLST
A protein sequence may have from a few hundred to several thousand amino acids.
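A record like the one above (a header line starting with ">" followed by lines of residues) can be read with a few lines of code. This is a minimal sketch assuming a single-record input; parse_fasta is a hypothetical helper name, and the sample record holds only the first 20 residues of the sequence shown above, under a made-up header.

```python
from collections import Counter


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA-style string."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a header line starting with '>'")
    # Sequence lines are simply concatenated; line breaks carry no meaning.
    return lines[0][1:], "".join(lines[1:])


# Illustrative record: header text is an assumption, residues come from the slide.
record = """>example first 20 residues
MCSYIRYDTP
KLFTHVTKTP
"""

header, seq = parse_fasta(record)
composition = Counter(seq)  # how often each amino acid letter occurs

print(header)
print(len(seq), "residues")
print(composition.most_common(3))
```

Applied to the full record, len(seq) gives the number of residues, which can be compared with the typical figure of about 300 quoted earlier.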
The basic chemical structure of an amino acid. Carbon atoms are black, oxygen is dark grey, nitrogen light grey, and hydrogen white.
Backbone
A polypeptide chain. The R_i side chains identify the component amino acids. Atoms inside each quadrilateral are on the same plane, which can rotate according to angles φ and ψ.
• Primary structure: a linear sequence of residues joined by peptide bonds
• Folding in three dimensions
  - Secondary structure: interactions between backbone atoms; “local” structure, e.g. helices; side chains move
  - Tertiary structure: packing of secondary structures
  - Quaternary structure: packing of different proteins together
Protein structure
• Structure prediction
  - Secondary structure prediction
  - Exact folding: specifying all φ-ψ pairs in a protein
  - Three-dimensional structure prediction
  - Determining the folding of a protein is one of the main research areas in molecular biology because:
    1) shape is related to function
    2) 20 amino acids → 3D structure
    3) there is no simple, accurate method for determining 3D structure
  - Goal: predict a molecule’s structure from its primary structure
• Shape determines function
  - A folded protein has an irregular shape, which lets it bind to some other specific molecules
  - The kinds of molecules a protein can bind to depend on its shape
1.2 protein binding
1.3 NUCLEIC ACIDS
1.3.1 DNA
• Two kinds of nucleic acids in living organisms
  - RNA: ribonucleic acid
  - DNA: deoxyribonucleic acid
• Strand and orientation
  - DNA: a molecule made of a chain of simpler molecules; it is a double chain (two strands)
  - Each strand repeats the same basic unit: a sugar, 2’-deoxyribose, attached to a phosphate residue. The sugar molecule has 5 carbon atoms, labeled 1’ to 5’
  - The backbone links the 3’ carbon of one unit to the 5’ carbon of the next unit
  - Orientation: 5’→3’ is canonical, used in (1) technical papers, (2) books, (3) sequence database files
Figure: ribose and 2’-deoxyribose (DNA uses the deoxyribose sugar)
• Bases
  - A base is attached to the 1’ carbon atom of each sugar; there are 4 kinds: A, C, G, T
  - Pairing: A-T; C-G
  - A, G: purines; C, T: pyrimidines
Complementarity of organic bases
• Complement; complementary bases; bp
  - The two strands form a helical structure, discovered by James Watson and Francis Crick in 1953
  - Watson-Crick base pairs: A pairs with T; C pairs with G
  - bp (base pair): unit of length; e.g. a piece of DNA 100,000 bp long is 100 kb
• Antiparallel strands; reverse complementation; replication
5’ … TACTGAA … 3’
3’ … ATGACTT … 5’
Given one DNA chain “AGACGT”, can we get the complementary chain?
• Reverse complementation: the operation that infers the sequence of one strand given the other: s → s’ (reversed) → s̄ (complemented), e.g. AGACGT → TGCAGA → ACGTCT (always written 5’→3’)
• Replication: it is precisely this mechanism that allows DNA in a cell to replicate, therefore allowing an organism that starts its life as one cell to grow into billions of other cells, each one carrying copies of the DNA molecules from the original cell
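Reverse complementation is a two-step string operation: reverse the strand, then replace each base by its Watson-Crick partner. The sketch below uses the examples from the slides; the function names are illustrative choices, and only the four standard bases are handled.

```python
# Watson-Crick pairs: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Base-by-base complement, read in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Sequence of the opposite strand, written 5'->3'."""
    return complement(strand[::-1])


# The antiparallel pair from the slide: 5' TACTGAA 3' over 3' ATGACTT 5'.
print(complement("TACTGAA"))

# The exercise from the slide: the strand complementary to AGACGT, 5'->3'.
print(reverse_complement("AGACGT"))
```

Applying reverse_complement twice returns the original strand, which is a quick sanity check for any implementation.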
1.3.2 RNA
• Sugar: ribose, not 2’-deoxyribose
• Uracil (U) in place of thymine (T)
• Does not form a double helix
• RNA-DNA hybrid helices can occur
• Parts of an RNA molecule may bind to other parts of the same molecule by complementarity
• The 3-dimensional structure of RNA is far more varied than that of DNA
• Function
  - DNA: performs one function, encoding information
  - RNA: performs different functions; there are different kinds of RNAs
Figure: RNA structure; tRNA in 2D representation (typical tRNA clover-leaf)
1.4 the mechanisms of molecular genetics
DNA: “the blueprint of life”
- The information necessary to build each protein or RNA found in an organism is encoded in DNA molecules

Outline
• (1) describe the encoding
• (2) protein synthesis: how a protein is built from the information in DNA
• (3) how the genetic information in DNA is passed along from a parent to its offspring

1.4.1 genes and the genetic code
Chromosome and gene
• Chromosome: a very long DNA molecule
  - Each cell of an organism has a few chromosomes
• Gene
  - Definition: a contiguous stretch of DNA that contains the information necessary to build a protein or an RNA molecule
  - Lengths vary; roughly 10,000 bp in humans
  - Certain cell mechanisms are capable of recognizing in the DNA the precise points at which a gene starts and at which it ends
Codon and genetic code
• Codon
  - Definition: a triplet of nucleotides
  - Role: a gene in DNA uses one codon to “specify” each AA
• Genetic code
  - Definition: the table that gives the correspondence between each possible triplet and each AA
  - Why the table is written with RNA bases: RNA molecules provide the link between DNA and actual protein synthesis
    1. 64 (= 4 × 4 × 4) possible nucleotide triplets correspond to only 20 AAs, so several triplets can specify the same AA
    2. STOP: 3 codons (UAA, UAG, UGA in the standard code)
    3. Exceptions: some organisms use a slightly modified code
Genetic code
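The table can be built compactly in code. The sketch below assumes the standard genetic code, written as a 64-letter string in the conventional U, C, A, G ordering with "*" marking STOP; the variable names are illustrative. It shows the counts mentioned above: 64 triplets, 20 amino acids, 3 STOP codons.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code; position i corresponds to the i-th triplet in
# UUU, UUC, UUA, UUG, UCU, ... order. "*" marks a STOP codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

GENETIC_CODE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in GENETIC_CODE.items() if aa == "*")
codons_per_aa = Counter(aa for aa in GENETIC_CODE.values() if aa != "*")

print(len(GENETIC_CODE), "triplets")
print(len(codons_per_aa), "amino acids")
print("STOP codons:", stop_codons)
print("codons for leucine (L):", codons_per_aa["L"])
print("codons for tryptophan (W):", codons_per_aa["W"])
```

The counter makes the redundancy visible: some amino acids are specified by six different triplets, others by only one.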
1.4.2 Transcription, translation, and protein synthesis

Transcription: the process in which messenger RNA (mRNA) is produced
• Promoter: a region before each gene in DNA; it serves as an indication to the cellular mechanism that a gene is ahead
• mRNA: a copy of the gene, with exactly the same sequence as one of the strands of the gene but substituting U for T
• Introns: parts of a gene not used in protein synthesis; they are spliced out of the mRNA, and the shortened mRNA leaves the nucleus with the exons plus regulatory regions
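At the string level, transcription and splicing can be sketched as a substitution followed by cutting out the intron ranges. The toy gene below is hypothetical and far shorter than any real gene, the exon coordinates are supplied by hand, and the function names are assumptions; in a cell, the splice sites are recognized by cellular machinery rather than given as indices.

```python
def transcribe(coding_strand):
    """Copy the coding strand into RNA by substituting U for T."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna, exons):
    """Keep only the exon ranges, given as ordered (start, end) half-open indices."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical toy gene: exon, intron, exon.
exon1, intron, exon2 = "ATGGCC", "GTAAGTTTTCAG", "GAATAA"
gene = exon1 + intron + exon2
exon_ranges = [
    (0, len(exon1)),
    (len(exon1) + len(intron), len(gene)),
]

pre_mrna = transcribe(gene)
mature_mrna = splice(pre_mrna, exon_ranges)

print(pre_mrna)
print(mature_mrna)
```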

Translation: the process in which mRNA is translated into protein
• tRNA:
  - Makes the connection between a codon and a specific AA
  - Number of tRNAs: varies among species; not 64
  - Some codons are not represented; some tRNAs can bind to more than one codon
Figure: the structure of tRNA (labels: three bases of tRNA; site to bind amino acids)
• Process
  - As the mRNA passes through the ribosome, a tRNA binds to its matching codon
  - Its attached AA falls into place just next to the previous AA in the protein chain being formed
  - An enzyme catalyzes the addition of the AA to the protein chain, releasing it from the tRNA
  - Synthesis ends when a STOP codon appears, since no tRNA is associated with it
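The process can be imitated by reading an mRNA string three bases at a time and stopping at the first STOP codon. This is a simplified sketch assuming the standard genetic code and a reading frame that starts at the first base; it looks codons up in a table directly and does not model tRNAs or the ribosome. The name translate and the example mRNA are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UUU, UUC, UUA, UUG, UCU, ... order; "*" marks STOP.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read codons left to right and stop at the first STOP codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = GENETIC_CODE[mrna[i:i + 3]]
        if amino_acid == "*":  # no amino acid corresponds to STOP
            break
        protein.append(amino_acid)  # chain grows by one residue per codon
    return "".join(protein)


print(translate("AUGGCCGAAUAA"))  # codons: AUG GCC GAA UAA
```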
Protein synthesis
The Central Dogma of Molecular Biology
DNA (replication) → transcription → RNA → translation → Protein
Genotype → phenotype
Exception: retroviruses