Information flow within the cell
a primer…
Zoi Lygerou
Medical School, Patras University
Greece
All living organisms today
descend from the same cell,
which appeared on Earth some
4 billion years ago
© 2000 by Geoffrey M. Cooper
All living organisms today are related
The tree of life…
© 2002 by Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walter
All living organisms are made up of cells
Prokaryotic cell
1-10 μm
Eukaryotic cell
10-100 μm
There are hundreds of different cell types within the human body,
each with a unique structure and function
Blood cells
Neuron
Sperm cell and oocyte
Fibroblasts
Cardiac myocyte
There are millions of different cells on earth…
All cells are made up of the same building blocks
What is the main structural and functional component of cells?
Proteins are the main structural and functional
components of the cell
Collagen
Haemoglobin
Proteins are chains of amino acids
NH2
COOH
Proteins are built from 20 different amino acids
© 2000 by Geoffrey M. Cooper
The unique properties of the amino-acid side
chains give each protein its structure and function
NH2
COOH
Proteins are folded into a specific conformation
compatible with a specific function
The amino acid sequence contains all the
information for protein structure and function
Out of the trillions of amino acid combinations
possible, proteins have the sequence which leads to
a stable structure suitable for a specific function
So, how is the amino-acid composition of proteins
defined, preserved and passed down to future
generations?
Each cell contains all the information required to
build and maintain the complete organism (to
synthesize all the proteins needed in every cell
type)
This information must be stored safely but be
accessible for decoding
Every time a cell divides, an accurate and full copy
must be made and correctly segregated to the
daughter cells
The storage molecule must be stable, with high
information content, easily copied and read
From Nature 171: 740–741 1953
Rosalind Franklin
Francis Crick
Maurice Wilkins
James D. Watson
1962 Nobel Prize in Physiology or Medicine
The DNA structure permits access to the primary sequence
How can the 4 DNA letters code for 20 amino acids?
The 4 DNA letters are read in triplets
4³ combinations = 64
Codons
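The counting argument can be checked with a short sketch. It assumes nothing beyond the four DNA letters and the 20 amino acids mentioned above; the names `BASES`, `words` and `codons` are illustrative only. Pairs of letters give only 16 combinations, too few for 20 amino acids, whereas triplets give 64.

```python
from itertools import product

BASES = "TCAG"          # the four DNA letters
N_AMINO_ACIDS = 20

# Number of distinct "words" of a given length that 4 letters can spell.
for length in (1, 2, 3):
    words = 4 ** length
    print(length, words, words >= N_AMINO_ACIDS)
# 1 4 False
# 2 16 False
# 3 64 True

# Enumerate every triplet explicitly: each one is a codon.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))      # 64
print(codons[:4])       # ['TTT', 'TTC', 'TTA', 'TTG']
```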
5’GATGTTCATCGTAATCGTAGCTAACATATCA3’
3’CTACAAGTAGCATTAGCATCGATTGTATAGT5’
GAT GTT CAT CGT AAT CGT AGC TAA CAT ATC A
What is the look-up table?
The Genetic Code
Marshall W. Nirenberg
Robert W. Holley
H. Gobind Khorana
Nobel Prize in Physiology or Medicine 1968
Second position of codon
First                                                                   Third
position  T              C              A              G              position
T         TTT Phe [F]    TCT Ser [S]    TAT Tyr [Y]    TGT Cys [C]    T
T         TTC Phe [F]    TCC Ser [S]    TAC Tyr [Y]    TGC Cys [C]    C
T         TTA Leu [L]    TCA Ser [S]    TAA Ter [end]  TGA Ter [end]  A
T         TTG Leu [L]    TCG Ser [S]    TAG Ter [end]  TGG Trp [W]    G
C         CTT Leu [L]    CCT Pro [P]    CAT His [H]    CGT Arg [R]    T
C         CTC Leu [L]    CCC Pro [P]    CAC His [H]    CGC Arg [R]    C
C         CTA Leu [L]    CCA Pro [P]    CAA Gln [Q]    CGA Arg [R]    A
C         CTG Leu [L]    CCG Pro [P]    CAG Gln [Q]    CGG Arg [R]    G
A         ATT Ile [I]    ACT Thr [T]    AAT Asn [N]    AGT Ser [S]    T
A         ATC Ile [I]    ACC Thr [T]    AAC Asn [N]    AGC Ser [S]    C
A         ATA Ile [I]    ACA Thr [T]    AAA Lys [K]    AGA Arg [R]    A
A         ATG Met [M]    ACG Thr [T]    AAG Lys [K]    AGG Arg [R]    G
G         GTT Val [V]    GCT Ala [A]    GAT Asp [D]    GGT Gly [G]    T
G         GTC Val [V]    GCC Ala [A]    GAC Asp [D]    GGC Gly [G]    C
G         GTA Val [V]    GCA Ala [A]    GAA Glu [E]    GGA Gly [G]    A
G         GTG Val [V]    GCG Ala [A]    GAG Glu [E]    GGG Gly [G]    G
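As a minimal sketch, the look-up table can be written as a dictionary and used to decode the example sequence from the codon slide. The compact 64-letter string below lists the standard genetic code in the order T, C, A, G for each codon position, with `*` standing for Ter [end]; the names `CODON_TABLE` and `translate` are illustrative, not part of the original slides.

```python
from itertools import product

BASES = "TCAG"
# One letter per codon, in the order TTT, TTC, TTA, TTG, TCT, ... GGG.
# '*' marks the three stop codons (TAA, TAG, TGA).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Read dna in consecutive triplets from its first letter.

    Trailing letters that do not fill a complete codon are ignored.
    """
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)
    )

print(CODON_TABLE["ATG"])   # M
print(translate("GATGTTCATCGTAATCGTAGCTAACATATCA"))   # DVHRNRS*HI
```

The second output matches the triplets GAT GTT CAT CGT AAT CGT AGC TAA CAT ATC shown earlier; the final lone A is left unread.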
The 4 DNA letters are read in triplets
5’GATGTTCATCGTAATCGTAGCTAACATATCAAATTGA3’
3’CTACAAGTAGCATTAGCATCGATTGTATAGTTTAACT5’
Forward frames
Frame 1
G ATG TTC ATC GTA ATC GTA GCT AAC ATA TCA AAT TGA
Met F I V I V A N I S N Stop
Frame 2
GAT GTT CAT CGT AAT CGT AGC TAA CAT ATC AAA TTG A
D V H R N R S Stop H I K L
Frame 3
GA TGT TCA TCG TAA TCG TAG CTA ACA TAT CAA ATT GA
C S S Stop S Stop L T Y Q I
Reverse frames
Frame 4
TCA ATT TGA TAT GTT AGC TAC GAT TAC GAT GAA CAT C
S I Stop Y V S Y D Y D E H
Frame 5
T CAA TTT GAT ATG TTA GCT ACG ATT ACG ATG AAC ATC
Q F D Met L A T I T Met N I
Frame 6
TC AAT TTG ATA TGT TAG CTA CGA TTA CGA TGA ACA TC
N L I C Stop L R L R Stop T
Open Reading Frame (ORF)
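The six reading frames can be reproduced with a short sketch. It assumes the standard genetic code and treats an ORF simply as a stretch that starts with Met and ends at the first stop codon, which is a simplification. The names `six_frames` and `find_orfs` are illustrative, and frames are labelled by strand and offset rather than by the slide's frame numbers.

```python
import re
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete triplets only; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)
    )

def six_frames(dna):
    """Map (strand, offset) to the translation of that reading frame."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames

def find_orfs(protein):
    """Simplified ORF: Met, then any residues, then the first stop."""
    return re.findall(r"M[^*]*\*", protein)

dna = "GATGTTCATCGTAATCGTAGCTAACATATCAAATTGA"
for frame, protein in six_frames(dna).items():
    print(frame, protein, find_orfs(protein))
# ('+', 0) DVHRNRS*HIKL []
# ('+', 1) MFIVIVANISN* ['MFIVIVANISN*']
# ('+', 2) CSS*S*LTYQI []
# ('-', 0) SI*YVSYDYDEH []
# ('-', 1) QFDMLATITMNI []
# ('-', 2) NLIC*LRLR*T []
```

Only the forward frame starting at the second letter runs from Met to a stop codon without interruption, so under this simplified definition it is the only open reading frame in the example.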
From DNA to RNA
DNA is copied into RNA, before it is decoded to protein
Why?
The central dogma
DNA
Transcription
RNA
Translation
NH2
COOH
Protein
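The transcription step can be sketched in a few lines. This is an illustration under two assumptions not stated on the slides: the top strand of the example sequence is taken as the coding strand, and the bottom strand (written 3' to 5') as the template that is copied. The function names are hypothetical.

```python
# Base pairing used when RNA is built on a DNA template (U replaces T).
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe_from_template(template_3to5):
    """Build the mRNA (5'->3') by pairing along the template strand (3'->5')."""
    return "".join(RNA_PAIR[base] for base in template_3to5)

def transcribe_from_coding(coding_5to3):
    """Shortcut: the mRNA reads like the coding strand with U in place of T."""
    return coding_5to3.replace("T", "U")

coding = "GATGTTCATCGTAATCGTAGCTAACATATCAAATTGA"     # 5'->3'
template = "CTACAAGTAGCATTAGCATCGATTGTATAGTTTAACT"   # 3'->5'

mrna = transcribe_from_template(template)
print(mrna)                                     # GAUGUUCAUCGUAAUCGUAGCUAACAUAUCAAAUUGA
print(mrna == transcribe_from_coding(coding))   # True
```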
Genes are the functional units of the genetic
material: a part of the genome which codes for a
product with a specific function
Regulatory
region
Transcribed
region
DNA
RNA
NH2
COOH
Protein
The central dogma
Replication
DNA
Transcription
RNA
Translation
NH2
COOH
Protein
An accurate copy of the genetic information must be made every time a cell divides
Replication
Bacteria
Single replication initiation point: origin of replication
Eukarya
Hundreds of origins scattered throughout the genome
Replication only at a specific phase of the life of the cell (cell cycle) – S phase
Need for accurate spatio-temporal regulation
Information flow within the cell – a family business…
Arthur Kornberg
Nobel Prize 1959
Replication
Roger Kornberg
Nobel Prize 2006
Transcription
A few complications…
In eukaryotes, the genetic information is split…
Genes contain parts coding for function (exons) interrupted by
non-coding parts (introns)
Regulatory
region
Transcribed
region
DNA
Promoter
Primary
transcript
Splicing
NH2
Mature
mRNA
COOH
Protein
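Splicing can be pictured as cutting the introns out of the primary transcript and joining the exons. The sketch below uses a made-up toy transcript and made-up exon coordinates purely for illustration; the function name `splice` is hypothetical, and real splice sites are recognised by the cell's splicing machinery, not read from a coordinate list.

```python
def splice(primary_transcript, exons):
    """Join the exon segments of a primary transcript.

    exons is a list of (start, end) index pairs, end not included,
    given in the order the exons appear.
    """
    return "".join(primary_transcript[start:end] for start, end in exons)

# Toy primary transcript: exon 1 + intron + exon 2 (invented sequence).
exon1 = "AUGGCC"
intron = "GUAAGUUUUCAG"   # toy intron, starting with GU and ending with AG
exon2 = "UUUUAA"
primary = exon1 + intron + exon2

exons = [(0, 6), (18, 24)]       # positions of exon 1 and exon 2
mature_mrna = splice(primary, exons)
print(mature_mrna)               # AUGGCCUUUUAA
```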
A few more complications…
In addition to genes, there is a hell of a lot more in a
genome…
especially in the human genome…
Less than 30% of our genome contains genes
(introns and exons)
Less than 2% of our genome
codes for protein
Identifying genes is not
straightforward
If 1 bp were 1 mm…
Our genome would be 3200 km
There would be one gene every 300 m
Every gene would be 30 m
mRNA would be 1 m
Adapted from Molecular Biology of the Cell, Alberts et al
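The scale analogy is simple unit conversion, sketched below. The genome size of about 3.2 billion base pairs is an assumption added here (it is what the 3200 km figure implies), and the helper names are illustrative.

```python
GENOME_BP = 3_200_000_000   # assumed human genome size in base pairs
MM_PER_KM = 1_000_000
MM_PER_M = 1_000

# At 1 mm per base pair, the genome length in km:
genome_km = GENOME_BP // MM_PER_KM
print(genome_km)            # 3200

def metres_to_bp(metres):
    """Convert a length on the 1 bp = 1 mm scale back to base pairs."""
    return metres * MM_PER_M

print(metres_to_bp(300))    # 300000 bp between genes
print(metres_to_bp(30))     # 30000 bp per gene
print(metres_to_bp(1))      # 1000 bp per mRNA
```

On this scale a gene spans roughly 30,000 bp while its mRNA is only about 1,000 bp long, which illustrates how much of a gene is removed as introns.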
Genes are not evenly distributed along the human
genome…
Gene-dense “urban centers” alternate with gene-poor “deserts”
The sequence composition of gene-rich regions,
gene-poor regions and their boundaries
differs significantly
So what about all the rest?
50% are repetitive sequences
Junk DNA ???
Or
Shapers of the genome ???
A few more complications…
DNA is very long…
The DNA in each human cell is 1 m long
How do you fit a 1 m long thread within a sphere
10 μm in diameter?
…so that you do not tangle it up and are able to
separate it every time the cell divides?
…and so that each part of it can be accessed for
transcription?
DNA is folded together with proteins into chromatin
Active regions are less compact (euchromatin)
Inactive regions are more compact (heterochromatin)
Chromatin is further compacted just prior to cell
division to permit separation without entanglement:
chromosomes
Regulation of gene expression (and inheritance of a
cell's character) largely involves regulation of
chromatin structure