Fundamental Features of
Eukaryotic Gene
Eukaryotes VS Prokaryotes
Key differences between eukaryotic and prokaryotic
genomes:
• Eukaryotic genomes are larger.
• Eukaryotic genomes have more regulatory
sequences.
• Much of eukaryotic DNA is non-coding.
• Eukaryotes have multiple chromosomes.
• In eukaryotes, translation and transcription are
physically separated, which allows many points of
regulation before translation begins.
Eukaryotes VS Prokaryotes
Some eukaryotic genes that have no
homologs in prokaryotes:
• Genes encoding histones
• Genes encoding cyclin-dependent
kinases that control cell division
• Genes encoding proteins involved in
processing of mRNA
Eukaryotes VS Prokaryotes
Gene characteristics not found in
prokaryotes:
• Eukaryotic genes contain non-coding
internal sequences (introns).
• Eukaryotic genes form gene families, groups of
structurally and functionally related
genes.
Eukaryotic Genes
Eukaryotic genes have a promoter, to
which RNA polymerase binds, and a
terminator sequence that signals the end of
transcription.
The terminator sequence comes after the
stop codon.
The stop codon is transcribed into mRNA and
signals the end of translation at the
ribosome.
Eukaryotic mRNA Splicing
• Researchers had expected the sequences at the upstream (5’) and
downstream (3’) ends of an intron to be complementary.
• However, this idea was soon discounted.
• The sequences at the 5’ and 3’ ends are not complementary, yet
they are not random.
• The sequences at the 5’ ends of introns are similar to each
other, as are the sequences at the 3’ ends.
Eukaryotic mRNA Splicing
• Almost all introns begin with GU and end with AG.
• The branch site, a sequence within the intron located 20-50
nucleotides upstream of the 3’ splice site, is essential for splicing.
• The consensus sequence of the branch site is
YNYURAC. The A is the key nucleotide in the
sequence. Mutations of this branch site A prevent
splicing.
• Splicing is carried out by a complex of proteins
and RNA called the spliceosome.
• Small nuclear RNAs (snRNAs) U1, U2, U4, U5,
and U6 associate with proteins to form small
nuclear ribonucleoprotein particles (snRNPs).
• The U1 snRNP is the first component to bind, at the 5’
splice site, followed by the U2 snRNP binding to the
branch site.
• The other three snRNPs bring the 5’ and 3’ splice
sites together. The U6 snRNP then catalyzes removal of
the intron and joining of the exons.
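The splice signals described above (GU at the 5’ end, AG at the 3’ end, and a YNYURAC branch site 20-50 nucleotides upstream of the 3’ end) can be checked with a short script. This is a minimal sketch, not a real splice-site predictor: the function name `check_intron`, the way distance is measured, and the toy intron in the usage note are assumptions made for illustration.

```python
import re

# IUPAC codes in the branch-site consensus YNYURAC:
# Y = pyrimidine (C or U), R = purine (A or G), N = any nucleotide.
IUPAC = {"Y": "[CU]", "R": "[AG]", "N": "[ACGU]", "U": "U", "A": "A", "C": "C"}
BRANCH_SITE = "".join(IUPAC[c] for c in "YNYURAC")


def check_intron(intron, window=(20, 50)):
    """Return (follows_GU_AG_rule, branch_A_positions) for an intron.

    The intron is read 5'->3'; DNA input is converted to RNA.
    Positions are 0-based indices of the branch-point A. Distance is
    counted from that A to the 3' end of the intron (an assumption).
    """
    intron = intron.upper().replace("T", "U")
    gu_ag = intron.startswith("GU") and intron.endswith("AG")
    branch_points = []
    # A lookahead lets overlapping candidate sites be reported too.
    for m in re.finditer("(?=(%s))" % BRANCH_SITE, intron):
        a_pos = m.start() + 5          # the A is the 6th base of YNYURAC
        distance = len(intron) - a_pos
        if window[0] <= distance <= window[1]:
            branch_points.append(a_pos)
    return gu_ag, branch_points
```

For a toy intron such as `"GUAAGU" + "C" * 30 + "UACUAAC" + "U" * 25 + "CAG"`, the function finds one branch-point A; changing that A to G leaves no candidate, which mirrors the observation that branch-site A mutations prevent splicing.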
Eukaryotic mRNA Splicing
Video
porpax.bio.miami.edu/.../150/gene/mol_gen.htm
Alternative mRNA Splicing
• Many primary gene transcripts can be spliced in
different ways to produce distinct RNA
molecules that each encode a different protein.
• Alternative splicing often produces two forms of
the same protein that are needed at different
stages of development or in different cell types.
• Immunoglobulins of the IgM class exist as either
a membrane bound protein displayed on the cell
surface (B cell) or as a soluble protein secreted
into the blood (plasma cell).
Alternative mRNA Splicing
• Alternative splicing can be extremely
complicated. For example, the gene encoding
α-tropomyosin contains 14 exons.
• Different combinations of exons are used to form
mature tropomyosin mRNAs in different cell
types.
• The overall structure of each tropomyosin
protein is similar, but the cell-type-specific amino
acids may function as binding sites for different
proteins.
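The idea that one primary transcript yields different mature mRNAs through different exon combinations can be sketched in a few lines. Everything here is hypothetical: the four exon sequences, the cell-type names, and the exon choices are invented for illustration and are not the real tropomyosin exons.

```python
# Hypothetical toy gene with four exons (sequences invented for illustration).
EXONS = {1: "AUGGCU", 2: "GAAUUC", 3: "CCGGGA", 4: "UUUUAA"}

# Hypothetical cell-type-specific exon choices.
ISOFORMS = {"muscle": [1, 2, 4], "non_muscle": [1, 3, 4]}


def splice(exons, use):
    """Join the chosen exons, in genomic order, into a mature mRNA."""
    return "".join(exons[i] for i in sorted(use))


def isoform_mrnas(exons=EXONS, isoforms=ISOFORMS):
    """Map each cell type to the mature mRNA built from its exon set."""
    return {cell: splice(exons, use) for cell, use in isoforms.items()}
```

Both isoforms share exons 1 and 4 and differ only in the middle exon, which parallels the point that the overall protein structure stays similar while a cell-type-specific segment changes.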
Alternative mRNA Splicing
• The complexity of an organism is not reflected in
the number of its genes.
• Analysis of the human genome identified 22,000
genes (about the same as Caenorhabditis elegans and
fewer than Arabidopsis thaliana), whereas the number
of gene transcripts is 35,845.
• The difference between these two numbers is
accounted for by alternative splicing.
• One-half of the transcripts are non-coding RNAs.
Errors in Splicing Cause Disease
• At least 15% of human genetic disorders are found
to result from splicing defects.
• Mutations in consensus splicing sequences may lead
to exon skipping, deletion of part of an exon, or
inclusion of sequence that should not be part of
the mature mRNA.
• Mutations in the splice sites of the β-globin gene
disrupt splicing and cause β-thalassemia.
• Mutations in exonic splicing enhancers (ESEs) can
have profound effects; even a silent mutation can
disrupt an ESE and cause a splicing problem.
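A silent mutation that still damages an ESE can be illustrated with a short check. This is a sketch under stated assumptions: the motif `GAAGAA` is used only as an illustrative purine-rich stand-in for an ESE, and the 12-nucleotide exon fragment is invented. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

ESE_MOTIF = "GAAGAA"  # illustrative motif only, an assumption


def translate(dna):
    """Translate a DNA coding sequence in frame 0 (trailing bases ignored)."""
    end = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, end, 3))


def silent_but_disrupts_ese(wild, mutant, motif=ESE_MOTIF):
    """True if the protein is unchanged but the motif is lost."""
    same_protein = translate(wild) == translate(mutant)
    motif_lost = motif in wild and motif not in mutant
    return same_protein and motif_lost
```

With the invented fragment `ATGGAAGAACTG`, changing the first GAA codon to GAG still encodes glutamate, so the protein is unchanged, yet the `GAAGAA` motif no longer appears in the exon.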
Functions of Introns
• Introns do not encode proteins, yet they are maintained
in eukaryotic genomes. Why?
• It is suggested that introns might facilitate
genetic variation (the substrate for natural
selection) if exons encode functional domains of
proteins and introns promote recombination of
exons (exon shuffling).
• The low-density-lipoprotein receptor (LDL-R)
appears to be made up of bits of other proteins
stitched together to make a new protein.
Function of Introns
Evidence of exon shuffling in the LDL-R gene
Functions of Introns
• Almost all of the genes for small nucleolar RNAs
(snoRNAs) that are involved in ribosomal
RNA maturation in vertebrates are found within the
introns of genes that code for proteins.
• After splicing and excision of the introns, nucleotides
are removed from the 5’ and 3’ ends of the introns to
produce functional snoRNAs.
• The human U22 snoRNA is in an intron of the U22
host gene (UHG) that does not appear to code for a
protein.
• It is the intronic sequences of UHG that are “useful”.
Genes Are Within Other Genes
• Genes can be found within the introns of other
genes.
• In Drosophila, an unrelated gene for a pupal cuticle
protein was found embedded within one of the
introns of the GART gene, which codes for an enzyme
important for the biosynthesis of purines.
• Intron 22 of the human factor VIII gene is very
large (32 kb) and harbors two genes, F8A and
F8B.
Genes Are Within Other Genes
• The tumor suppressor genes p16INK4a and p14ARF (p19ARF in mouse) are
encoded by a single locus in the human genome.
• The two are read in different reading frames, resulting in completely
unrelated proteins.
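How a shift in reading frame yields an unrelated protein from the same DNA can be shown with a small translation routine. This is a sketch only: the nine-nucleotide sequence in the usage note is invented and is not taken from the INK4a/ARF locus. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate `dna` starting at offset `frame` (0, 1, or 2).

    Incomplete codons at the end are ignored; '*' marks a stop codon.
    """
    seq = dna[frame:]
    end = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, end, 3))
```

Calling `translate("ATGGCCAAG", 0)` and `translate("ATGGCCAAG", 1)` reads the same nucleotides as different codons, so the two amino acid sequences share nothing.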
Repetitive DNA Sequences
• Most genes are present only once per haploid
genome. However, the genes for histones,
tRNAs and rRNAs are present many times
within the genomes and are often clustered
together.
• Moderately repetitive sequences (genes):
code for tRNAs and rRNAs
– These molecules are needed in large
quantities; the genome has multiple copies
of the sequence.
Repetitive DNA Sequences
• Four different rRNAs
– 18S, 5.8S, and 28S are transcribed as a single
precursor molecule. Humans have 280
copies of the sequence on five different
chromosomes.
– 5S is the fourth rRNA (S = Svedberg unit).
Figure 14.3 A Moderately Repetitive Sequence Codes for rRNA
Repetitive DNA Sequences
• The histone gene family consists of five major
genes (H1, H2A, H2B, H3, and H4). In
Drosophila, these 5 genes occur in a cluster of
about 5000-6000 bp, and each cluster is
tandemly repeated 100 to 1000 times.
• In higher eukaryotes, a cluster of histone
genes exists at only 10-40 copies per genome.
Repetitive DNA Sequences
• Eukaryote genomes have two types of
highly repetitive sequences that do not code
for proteins
– Minisatellites: 15–100 bp long, repeated 20-50
times.
– Microsatellites: 2-6 nucleotides long, present
in tandem arrays of five to about 30
copies.
• Number of copies varies among
individuals—provides molecular markers.
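Because the copy number of a tandem repeat differs among individuals, counting the copies at one locus is enough to tell alleles apart. The following is a minimal sketch: the function name, the CA repeat unit, and the flanking sequences in the usage note are assumptions for illustration.

```python
import re


def longest_tandem_array(seq, unit):
    """Return the largest number of back-to-back copies of `unit` in `seq`."""
    runs = re.findall("(?:%s)+" % re.escape(unit), seq)
    return max((len(run) // len(unit) for run in runs), default=0)
```

Two hypothetical individuals carrying `"GGT" + "CA" * 8 + "TTG"` and `"GGT" + "CA" * 12 + "TTG"` at the same locus would be scored as 8 and 12 copies, which is the kind of difference used as a molecular marker.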
Repetitive DNA Sequences
• Other moderately repetitive sequences, the transposons, can move
from place to place in the genome.
• Transposons make up 40 percent of the human genome but
only 3-10 percent in other sequenced eukaryotes.
• Transposons are not tandemly repeated but, rather,
exist as isolated elements that may be present in
many thousands of copies per genome.
• SINEs (short interspersed elements) range in length
from 130 to 300 bp and make up 15 percent of human DNA. One,
Alu, is present in a million copies.
Gene Duplication and Divergence
• Eukaryotic genomes increase in size and complexity
through gene duplication and subsequent sequence
divergence.
• Duplication of a gene allows the duplicated copy to undergo
mutation without selection, because the other copy
supplies the protein needed for cell function. This
process is called genetic drift.
• The evolutionary significance of genetic drift is that
mutations may lead to a protein acquiring new
functions, such as an enzyme acting on a different substrate
or in a different cell type.
Gene Duplication and Divergence
• Arabidopsis has 7 genes related to the
terpene synthases, three of which are closely related.
• Two genes, 25820 and 25830, are identical and the
third gene, 25810, is 80% identical to these two
genes.
• The 25810 is expressed exclusively in roots and does
not synthesize one of the terpenes made by 25820
and 25830.
• An ancestral gene underwent duplication; one gene
diverged in expression and function to give rise to
25810, whereas the other underwent a second
duplication to produce 25820 and 25830.
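Statements such as "80% identical" come from comparing aligned sequences position by position. This is a minimal sketch that assumes the two sequences are already aligned and of equal length; the short sequences in the usage note are invented and are not the Arabidopsis genes.

```python
def percent_identity(a, b):
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)
```

Two identical copies score 100, while a diverged copy that differs at 2 of 10 positions, for example `ACGTACGTAC` versus `ACGTACGTGG`, scores 80.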
Gene Duplication and Divergence
• Genetic drift in duplicated genes may be desirable
because it generates genetic diversity.
• However, in some situations it needs to be
counteracted.
• Mutations may be corrected in two ways: by
elimination and by gene conversion.
• For example, unequal crossing-over can lead to the
accumulation of extra copies of tandemly repeated
genes. Individuals carrying these extra copies may
be at a selective disadvantage and be eliminated
from the population.
Gene Duplication and Divergence
• Gene duplication and divergence can lead to pseudogenes.
• Pseudogenes are DNA sequences that are highly related to a functional gene
but contain inserted or deleted sequences or other
mutations that prevent the production of a functional protein.
• A processed pseudogene arises by reverse transcription
of an mRNA molecule into DNA.
Gene Regulation in Eukaryotes
– Eukaryotes have three different RNA polymerase
complexes, which makes possible three independently
regulated families of promoters.
– The eukaryotic RNA polymerases require more
transcription factors to initiate RNA synthesis.
– These transcription factors play many different roles
including recognizing other transcription factors, removing
chromatin proteins that block polymerase access, and
unwinding the DNA at the promoter.
– Control of eukaryotic gene expression is more complex
than in prokaryotes.
Transcriptional Factors
• A eukaryotic promoter consists of a TATA box, a CAAT box, and a
GC box that lie at about -25, -75, and -90, respectively.
• Each eukaryotic structural gene has its own set of
response elements.
• In addition to DNA-protein interactions, protein-protein
associations are important for regulating eukaryotic
transcription.
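The promoter layout above can be expressed as positions relative to the transcription start site, where +1 is the first transcribed base and -1 is the base just before it. The sketch below rests on assumptions: the motif strings are simplified consensus sequences, the function name is hypothetical, and only the first occurrence of each motif is reported.

```python
# Simplified consensus strings, used here as assumptions for illustration.
MOTIFS = {"GC box": "GGGCGG", "CAAT box": "CCAAT", "TATA box": "TATAAA"}


def find_elements(seq, tss_index, motifs=MOTIFS):
    """Return {element: position of its first base relative to the TSS}.

    `tss_index` is the 0-based index of the +1 base. There is no
    position 0: upstream bases are negative, the TSS itself is +1.
    """
    found = {}
    for name, motif in motifs.items():
        i = seq.find(motif)
        if i == -1:
            continue  # element absent from this sequence
        found[name] = i - tss_index if i < tss_index else i - tss_index + 1
    return found
```

On a synthetic sequence with the three motifs placed 90, 75, and 25 bases upstream of the start site, the function reports -90, -75, and -25.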
Transcriptional Factors
– Formation of an RNA polymerase II
transcription initiation
complex at a TATA box.
– TFIID binds to the TATA box; other
transcription factors and RNA
polymerase II then bind in sequence
to form the complex that is
responsible for initiating
transcription.
Transcriptional Factors
– Transcription factors must bind DNA and interact with
RNA polymerase.
– These two functions, DNA binding and activation, are
carried out by different surfaces of the protein (separate
domains).
– Domain-swap experiments strongly suggested that the
DNA-binding region is separate from the
transcriptional activating region.
Transcriptional Factors
– A transcription factor can
activate different targets
depending on what other
transcription factors are
present.
– Various combinations of
transcription factors are
required for transcription.
Eukaryotic Enhancers
– DNA sequences upstream and downstream of a gene such as
even-skipped are termed enhancers; they can enhance
transcription from a promoter.
– Enhancers contain DNA sequences that are recognized
and bound by transcription factors.
– The transcription factors then promote transcription by
recruiting RNA polymerase and the associated factors
necessary for transcription.
Eukaryotic Enhancers
– Enhancers increase gene expression independently of
their position relative to the gene.
– Enhancers act over long distances, as far as 1,000,000
bases away from the promoter.
– Enhancers can act at great distances by being brought
into close proximity to a promoter by looping of the DNA
strand.
– Insulator sequences can block the action of enhancers.
– Flanking a transgene with insulator sequences frequently
leads to higher levels of transgene transcription.
Eukaryotic Enhancers
biology.kenyon.edu/.../Chap10/Chap10.html
DNA Packaging and Gene Expression
– DNA molecules are tightly wrapped around histone proteins to form
structures called nucleosomes.
– The string of nucleosomes is further looped and wrapped into a
compact structure called chromatin.
– “Active” chromatin (e.g., that making rRNA) lacks the normal
appearance of chromatin.
– Actively transcribed DNA, which loops out from the chromosome and
contains either naked DNA or single nucleosomes, is more accessible to
nucleases such as restriction enzymes than tightly condensed inactive
chromatin.
Protein Secretion Pathways
– Type II secretion
system in Gram-negative
bacteria.
– It is a Sec-dependent
pathway.
– The type II secretion
pathway consists of a
Gsp complex spanning
the periplasmic space and
forming a channel
through the outer
membrane.
Protein Secretion Pathways
– Type III secretion
system in Gram-negative
bacteria.
– It is made up of about
20 different proteins
that form a
continuous channel
through the inner
and outer
membranes.