Molecular Biology and Biological Chemistry
The Fundamentals of Bioinformatics
Chapter 1
Introduction
• The Scale Spectrum
• The Genetic Material
• Gene Structure and Information Content
• Protein Structure and Function
• The Nature of Chemical Bonds
• Molecular Biology Tools
• Genomic Information Content
The Scale Spectrum
• Nano
– Genes, proteins, genetic networks
• Micro
– Organ physiology, pharmacokinetics
• Macro
– Whole body, multi-organism
DNA structure.
DNA: Deoxyribonucleic Acid
History:
• 1868 Miescher – discovered nuclein
• 1944 Avery – experimental evidence that DNA is the constituent of genes
• 1953 Watson & Crick – double-helical nature of DNA
“We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This
structure has novel features which are of considerable biological interest.”
• 1980 X-ray structure of more than a full turn of DNA.
The Genetic Material
• Genes:
– the basis of inheritance
– A specific sequence of nucleotides (nt)
• Nucleotide bases
– 4 types: Guanine (G), Adenine (A), Thymine (T), & Cytosine (C)
– Nucleotides differ only in their ‘nitrogenous base’
– Alphabet of the ‘Language of Genes’
Five types of bases.
Base Pairings
• DNA is highly redundant
– Strands are complementary
– Permits replication
• Base pairings are stable and robust
– Only G-C or A-T combinations possible
Complementarity of nucleotide bases for the double-stranded helical structure.
Double helical structure of DNA.
Antiparallel Nature of DNA
• 5’ end of one strand matches the 3’ end of the other
If one strand is
5’-GTATCC-3’
Then other is
3’-CATAGG-5’
Most processes go from 5’ to 3’, so write as:
5’-GGATAC-3’
• Strands are reverse complements
• 5’ is ‘upstream’, and 3’ is ‘downstream’
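The reverse-complement rule above can be sketched in a few lines of Python. The function name `reverse_complement` is an illustrative choice, not something from the slides, and the sketch assumes an uppercase sequence containing only A, C, G and T.

```python
# Complement lookup for the four DNA bases (A-T and G-C pairings).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3'.

    Assumes `strand` is given 5' to 3' and uses only A, C, G, T.
    """
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return strand.translate(COMPLEMENT)[::-1]


print(reverse_complement("GTATCC"))  # GGATAC, as in the slide example
```

Applying the function twice returns the original strand, which is a quick sanity check on any implementation.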
The Genome
• Full complement of Genes
• Set of chromosomes
– DNA chains
The Central Dogma
[Diagram labels: RNA polymerase (DNA to RNA), ribosomes (RNA to protein)]
• DNA makes RNA makes Protein
– General not universal
• Enzymes
– Proteins that make things happen, but are not used up
– Named X-ase (the name ends in ‘-ase’)
The Central Dogma (2)
• Transcription
– RNA construction mediated by RNA-polymerase
– One-to-one correspondence with DNA
• G, C, A, and U (Uracil)
• Translation
– Conversion of a nucleotide sequence into an amino acid sequence
– Ribosomes: complex structures of RNA & protein
– Ribosomes mediate protein synthesis
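As a rough illustration of the one-to-one correspondence in transcription, the sketch below builds an mRNA string in two equivalent ways. Both function names are hypothetical, and the example strands are the ones from the antiparallel slide; real transcription involves much more than a letter substitution.

```python
# RNA pairing partners for DNA template bases (A pairs with U in RNA).
RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def transcribe(coding_strand: str) -> str:
    """mRNA has the coding-strand sequence with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def transcribe_from_template(template_3_to_5: str) -> str:
    """Build mRNA (5' to 3') by pairing against a template read 3' to 5'."""
    return template_3_to_5.upper().translate(RNA_COMPLEMENT)


# Coding strand 5'-GTATCC-3' and its template 3'-CATAGG-5' give the same mRNA.
print(transcribe("GTATCC"))                # GUAUCC
print(transcribe_from_template("CATAGG"))  # GUAUCC
```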
The Central Dogma (3)
Gene Structure and Information Content
• Information formatting and interpretation are very important
– Alphabet and punctuation
• Same ‘language’ used for both:
– Prokaryotes (bacteria)
– Eukaryotes (more complex life forms)
Promoter Sequences
• Gene Expression
– The process of using information in DNA to make an RNA molecule and then a corresponding protein
• Expressing the right quantity of protein is essential for survival
• Two crucial distinctions
– Which part of the genome is the start of a gene
– Which genes code for proteins needed at a particular time
• Responsibility falls to RNA polymerase
Promoter sequences (2)
• Can’t look for a single nucleotide
– 1 in 4 chance of appearing at random
– General probability of a sequence of n nts = (1/4)^n
• Prokaryotes: 13 nt promoter sequences
– 1 in 70 million chance of random appearance
– Genome a few million nts long
– Made up of: 1 nt at the start site, 6 nts that are 10 nts upstream & 6 that are 35 nts upstream
• Eukaryotic genomes are several orders of magnitude bigger
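The "1 in 70 million" figure follows directly from (1/4)^n with n = 13. The sketch below checks the arithmetic under the slide's implicit assumption that all four bases are equally likely and independent. The genome length of 4 million nts is an assumed round number standing in for "a few million", and the estimate ignores the second strand.

```python
def chance_of_random_match(n: int) -> float:
    """Probability that one specific n-nt sequence appears at a given position,
    assuming equally likely, independent bases."""
    return 0.25 ** n


PROMOTER_LENGTH = 13          # nts, from the slide
GENOME_LENGTH = 4_000_000     # nts, assumed value for "a few million"

print(f"1 in {4 ** PROMOTER_LENGTH:,}")  # 1 in 67,108,864

# Expected number of chance matches across the genome (one strand only).
expected_matches = GENOME_LENGTH * chance_of_random_match(PROMOTER_LENGTH)
print(expected_matches < 1)  # True
```

Because the expected number of chance matches is well below one, a 13 nt signal is specific enough for a genome of this size; a much larger genome would need a longer signal.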
Promoter Sequences (3)
• Two types of Genes:
1. Structural
– Cell structure or metabolism
2. Regulatory
– Production control
– Positive regulation
– Negative regulation
The Genetic Code
• Need a way to robustly translate from DNA to protein
– 4 nt alphabet
– 20 amino acid (aa) alphabet
– Mismatch
• Codon (triplet code)
– 1 & 2 nts give < 20 combinations (4 and 16); 3 nts give 64
– Each aa coded by a codon
– Degeneracy: more than 1 codon per aa = robustness
– Stop codon: full stop
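A minimal sketch of the triplet code in Python follows. It builds the standard genetic code from its conventional compact form (codons ordered with bases U, C, A, G; "*" marks a stop) and translates an mRNA string until the first stop codon. The function name `translate` and the example sequence are illustrative, and the sketch assumes the input is already in the correct reading frame.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter amino acid symbols, codons ordered
# UUU, UUC, UUA, UUG, UCU, ...; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Code words available with 1, 2 and 3 nucleotides: only 3 covers 20 aa.
print([4 ** k for k in (1, 2, 3)])  # [4, 16, 64]

# Degeneracy: number of codons per amino acid (and per stop signal).
codons_per_aa = Counter(CODON_TABLE.values())
print(codons_per_aa["L"], codons_per_aa["M"], codons_per_aa["*"])  # 6 1 3

print(translate("AUGGGAUACUAA"))  # MGY
```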
The Genetic Code
Open Reading Frames (ORFs)
• Start codon: AUG (and methionine)
• Reading frame
– Established by start codon
– Necessary for accurate translation
– Mistakes lead to wrong proteins (& premature stops)
• Open Reading Frame
– Inordinately long reading frame with no stop codon
– Proteins 100s of aa long
– Random stop: about 1 in 20 codons (3 of 64)
– Distinguishing feature of prokaryotes and eukaryotes.
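To see why a long stop-free reading frame is a useful signal, the sketch below computes how unlikely it is for random sequence to run 100 codons without a stop, and then scans a DNA string for its longest ATG-to-stop stretch. It is a simplified illustration: the function names are hypothetical, only the three forward frames are scanned, and bases are assumed equally likely.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def no_stop_probability(n_codons: int) -> float:
    """Chance that n random codons contain no stop (61 of 64 codons are not stops)."""
    return (61 / 64) ** n_codons


def longest_orf(dna: str) -> str:
    """Longest ATG...stop stretch over the three forward reading frames."""
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a reading frame
            elif start is not None and codon in STOP_CODONS:
                orf = dna[start:i + 3]         # include the stop codon
                if len(orf) > len(best):
                    best = orf
                start = None                   # look for the next start
    return best


print(no_stop_probability(100) < 0.01)   # True
print(longest_orf("CCATGAAATGGTAACC"))   # ATGAAATGGTAA
```

A fuller search would also scan the reverse complement, since genes can lie on either strand.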
Introns and Exons
• Messenger RNA - perfect copy of DNA
• Introns: locally uninformative sequences in mRNA
• Exons: locally informative sequences in mRNA
• Splicing: removal of introns, rejoining exons
• Spliceosomes: enzymes that do splicing
– GT-AG rule (potentially too common)
– Checks 6 extra nts
– Allows subtle nuances
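Splicing with a GT-AG check can be sketched as below. This is a toy model, not how a spliceosome finds introns: the intron coordinates are supplied by hand, the function name `splice` and the example sequence are made up for illustration, and DNA letters are used so that the rule reads GT...AG. A given dinucleotide such as GT turns up by chance about once every 16 positions, which is why the rule alone is "potentially too common" and extra nucleotides are checked.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) half-open index pairs and join the exons.

    Raises ValueError if an intron does not start with GT and end with AG.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} breaks the GT-AG rule")
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Hypothetical transcript: exon ATGGCC, intron GTAAGTTTTCAG, exon GATTAA.
transcript = "ATGGCC" + "GTAAGTTTTCAG" + "GATTAA"
print(splice(transcript, [(6, 18)]))  # ATGGCCGATTAA
```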
Introns and Exons (2)
Protein Structure and Function
• Proteins are the molecular machinery that performs most work in cells
• Vast array of tasks
– Structure, catalysis, transportation, signalling, metabolism …
• Highly complex compounds
– Primary, secondary, tertiary, quaternary structure.
Primary & Secondary Structure
• Primary structure = the linear sequence of amino acids comprising a protein:
AGVGTVPMTAYGNDIQYYGQVT…
• Secondary structure
– Regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every protein structure known: the α-helix and the β-sheet
– The location and direction of these periodic, repeating structures is known as the secondary structure of the protein
Planarity of the peptide bond
Psi (ψ) – the angle of rotation about the Cα–C bond.
Phi (φ) – the angle of rotation about the N–Cα bond.
The planar bond angles and bond
lengths are fixed.
Phi and psi
• φ = ψ = 180° is the extended conformation
• φ: Cα to N–H
• ψ: C=O to Cα
The alpha helix
φ ≈ ψ ≈ −60°
Properties of the alpha helix
• φ ≈ ψ ≈ −60°
• Hydrogen bonds between C=O of residue n and NH of residue n+4
• 3.6 residues/turn
• 1.5 Å/residue rise
• 100°/residue turn
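The three helix numbers above are linked by simple arithmetic, sketched below. The constant and function names are illustrative, and the 20-residue example is an arbitrary choice.

```python
RESIDUES_PER_TURN = 3.6     # from the slide
RISE_PER_RESIDUE = 1.5      # Å per residue, from the slide

# Rotation per residue: one full turn (360°) shared among 3.6 residues.
turn_per_residue = 360 / RESIDUES_PER_TURN      # about 100°

# Pitch: distance travelled along the helix axis in one full turn.
pitch = RESIDUES_PER_TURN * RISE_PER_RESIDUE    # about 5.4 Å


def helix_length(n_residues: int) -> float:
    """Approximate length in Å of an ideal alpha helix with n residues."""
    return n_residues * RISE_PER_RESIDUE


print(round(turn_per_residue), round(pitch, 1), helix_length(20))  # 100 5.4 30.0
```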
The beta strand (& sheet)
φ ≈ −135°
ψ ≈ +135°
Properties of beta sheets
• Formed of stretches of 5-10 residues in extended conformation
• Pleated – each Cα a bit above or below the previous
• Parallel/antiparallel, contiguous/non-contiguous
Parallel and anti-parallel β-sheets
Anti-parallel
Parallel
• Anti-parallel is slightly energetically favored
Molecular Biology Tools
• Restriction enzyme digests
• Gel electrophoresis
• Blotting and hybridization
• Cloning
• Polymerase chain reaction
• DNA sequencing
Genomic Information Content
• C-value paradox
– No correlation between organism complexity
and DNA size
• Reassociation Kinetics
– Denaturing/renaturing
– Cot (C0t) equation: t0.5, the time at which half the DNA has reassociated
– Junk DNA
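The slides name the Cot equation without writing it out. A common textbook form, assuming ideal second-order reassociation kinetics, is C/C0 = 1 / (1 + k·C0·t), where C is the concentration of DNA still single-stranded, C0 the starting concentration, t the time and k a rate constant. The sketch below uses that assumed form; the function names and the value of k are hypothetical.

```python
def fraction_single_stranded(c0t: float, k: float) -> float:
    """C/C0 after reassociation to a given C0*t value, for rate constant k.

    Assumes ideal second-order kinetics: C/C0 = 1 / (1 + k * C0 * t).
    """
    return 1.0 / (1.0 + k * c0t)


def c0t_half(k: float) -> float:
    """C0*t value at which half of the DNA has reassociated (C/C0 = 0.5)."""
    return 1.0 / k


k = 2.0  # hypothetical rate constant, for illustration only
print(fraction_single_stranded(c0t_half(k), k))  # 0.5
```

In this model, sequences present in many copies reassociate faster (larger k, smaller C0·t at the half point) than single-copy sequences, which is how reassociation curves expose repetitive DNA.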
… & Finally
“There are only 10 types of people in the
world: those that understand binary and
those that do not”
Pete Smith (or Anon)