The Nature of a GENE;
Genomic/Evolutionary Context
Gene Duplication
Susquehanna MAGNET School for Medicine and
Health Sciences
October 21, 2013
Professor Michael Chorney
Learning Objectives
Explain the nature of unequal crossing-over and the
consequences related to gene family expansion
Explain the ramifications of copying genes, both the positive
and negative
Describe the nature of a pseudogene and the process by
which it deteriorates
Discuss the rate of base change in the genome, considering
such concepts as purifying selection, synonymous
versus nonsynonymous substitution (dS, dN), neutrality,
etc. Formulate some personal thoughts, as there is no
right or wrong here.
Relay what is meant by the THEORY of evolution.
Consider a gene, which is functional…………
gene
It has a promoter, generates an open reading frame
after splicing (maintained consensus sequences for
splicing, of course), has 5’ and 3’ untranslated exons,
has a polyadenylation site at its 3’ end, etc.
Consider a gene, which is functional…………but
which succumbs to base changes: these are a
function of a variety of enzymatic and chemical
processes that may be consistent over millions of
years (polymerase error, oxidation, radiation, etc.)
Some of the mutations are silent and can be tolerated; some are missense but conservative and
tolerated reasonably well; some are missense and
nonconservative; some are nonsense and some
shift the reading frame: these would appear to be under
Darwinian selection, and……
Purifying selection, which serves as a gatekeeper by
eliminating harmful mutations within important
DNA sequences
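The categories of base change listed above can be told apart mechanically by translating the codon before and after the change. The sketch below is illustrative only: the function name and the example codons are chosen here, not taken from the lecture, and it assumes the standard genetic code and a reference codon that is not itself a stop codon. Frameshifts and the conservative/nonconservative distinction are not covered.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_point_mutation(ref_codon, alt_codon):
    """Label a single-codon change as silent, missense or nonsense.

    Assumes ref_codon is a sense codon (hypothetical helper for illustration).
    """
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "silent"      # same amino acid: tolerated
    if alt_aa == "*":
        return "nonsense"    # premature termination codon
    return "missense"        # different amino acid


# Example codon pairs chosen for illustration.
for ref, alt in [("CTT", "CTC"), ("GAG", "GTG"), ("TGG", "TGA")]:
    print(ref, "->", alt, classify_point_mutation(ref, alt))
```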
Some view the bulk of mutations and polymorphisms
as being generally neutral, in that one allele is as
good, or bad, as the next: this is neutrality (check it out)
Somewhere in between may be a more realistic view
of the nature of genomic change
Within critical genes, conservation is maintained….
Within unimportant sequences, can a truer rate
of change be ascertained? The rate of base change
is called the molecular or evolutionary clock.
It can tell the relative distance by which two organisms
are separated, but not an exact time; this is where
the fossil record is important
The best sort of clock may be based on the frequency
of third base position changes that leave the amino acid
unchanged, called the synonymous substitution rate……….
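One way to see why the third base position is singled out is to ask, for each codon position, what fraction of all possible single-base changes leave the amino acid unchanged. The sketch below does this by brute force over the standard genetic code. It is a counting exercise under that assumption, not a dS/dN estimator, and the function name is invented for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_fraction_by_position():
    """Fraction of single-base changes that are synonymous, per codon position.

    Only sense codons are used as starting points (hypothetical helper).
    """
    fractions = []
    for pos in range(3):
        same = total = 0
        for codon, aa in CODON_TABLE.items():
            if aa == "*":
                continue  # skip stop codons as starting points
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                same += CODON_TABLE[mutant] == aa
        fractions.append(same / total)
    return fractions


first, second, third = synonymous_fraction_by_position()
print(f"position 1: {first:.2f}  position 2: {second:.2f}  position 3: {third:.2f}")
```

Most third-position changes turn out to be silent while almost none at the first two positions are, which is why third-position changes are the least constrained by purifying selection.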
It is my intention to have you think about DNA changes
and the composition of the genome—it is ever changing,
apparently slowly based on our perspective of time
One instance where evolution can speed up is through
gene duplication, via UNEQUAL CROSSING-OVER
[Figure: meiosis, 2X to 1X. The homologues segregate into gamete 1 or gamete 2; or, via crossing-over, into gamete 3 and gamete 4 (the recombinants)]
Let us magnify the chromosome and look at two
homologous chromosomes containing alleles
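[Figure: unequal crossing-over in a gamete progenitor gives gamete 1 and gamete 2; one product is lost, the other selected]
After duplication, one copy remains a gene under purifying selection
The other copy is a gene free to succumb to an accelerated
rate of mutation, the first step being loss
of CpG-binding (protective) proteins
This creates a new niche for the mutated
'paralog' (copies related by duplication within a species), as opposed to an 'ortholog'
(copies related by descent between species); both are homologs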
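Unequal crossing-over can be sketched with chromosomes written as lists of gene labels. This is a toy model with invented gene names (A, B, C on one homologue; a, b, c on the other) and a hypothetical helper function. When the break points line up, both recombinants keep one copy of each gene. When they are misaligned, one recombinant carries a duplication and the other a deletion.

```python
def crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the segments after break_a on chrom_a and break_b on chrom_b.

    Equal break points model ordinary crossing-over; unequal break points
    model misaligned pairing (toy model, for illustration only).
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


homolog_1 = ["A", "B", "C"]   # invented gene labels
homolog_2 = ["a", "b", "c"]

equal = crossover(homolog_1, homolog_2, 2, 2)
unequal = crossover(homolog_1, homolog_2, 2, 1)

print("equal:  ", equal)     # each product has one copy of every gene
print("unequal:", unequal)   # one product has two B-type genes, the other none
```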
Gene conversion: DNA polymerase switches
templates during copying of DNA in
gametogenesis, guided by the similarity between the genes; the sequence incorporated is small
The donor template
is unaltered; the
recipient becomes
a composite
(intrachromosomal)
Consider olfaction genes
400 functional genes
600 pseudogenes
and
Transplantation Antigen=Major Histocompatibility
Complex=Human Leukocyte Antigen
and
globin genes
See:
http://en.wikipedia.org/wiki/List_of_gene_families
Cytogenetics
Giemsa-stained metaphase spread of human chromosomes from
one cell, the most condensed form of DNA within the cell, seen at
MITOSIS
Vocabulary
metacentric
sub-metacentric
acro(telo)centric
centromere
p arm
q arm
banding
heterochromatin
euchromatin
telomeres
autosome
Each chromosome is a linear strand of
helical ds-DNA with capped ends called
telomeres
Information flow
Sense strand
A GENE
TRANSCRIPT
Figure 6-2 Molecular Biology of the Cell (© Garland Science 2008)
Anti-sense
strand
Like copying the leading strand
Figure 6-21 Molecular Biology of the Cell (© Garland Science 2008)
RNA polymerase
uses U in place of T (why?) in
RNA, and ribose in place of
the deoxy sugar
Deaminated C = U
Figure 6-4 Molecular Biology of the Cell (© Garland Science 2008)
DNA, the Puzzle, Review
Only a small amount (percentage) of human DNA contains information that
is ostensibly converted into proteins: these sequences are associated with
genes. The proteins coded for by genes do biochemical work and regulate
cell division, generate energy, respond to the environment, provide
immunity to invasive DNA sequences (infection), etc.
What (and where) is this information we keep hearing so
much about?
For starters,
It resides in the bases; particular triplet base
combinations which comprise the exons and
provide information called codons
You should have been exposed to the list of codons
last week in the case study, next slide
There are 64 codons: 61 of them specify the twenty amino
acids (a.a.'s), with multiple codons existing for most of
the a.a.'s, called degeneracy. The remaining three codons
are called termination codons, more later
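The codon counts can be checked directly by building the table and tallying it. The sketch below assumes the standard genetic code, written as a compact string with codons in TCAG order; the variable names are chosen here for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per amino acid (the degeneracy), stop codons excluded.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

print("total codons:", len(CODON_TABLE))
print("sense codons:", sum(degeneracy.values()))
print("stop codons: ", stop_codons)
for aa, n in sorted(degeneracy.items(), key=lambda item: item[1]):
    print(aa, n)
```

Only Met (M) and Trp (W) have a single codon; Leu, Ser and Arg have six each.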
CODONS, what do you notice?
Figure 6-50 Molecular Biology of the Cell (© Garland Science 2008)
The question for
molecular biologists:
What distinguishes
a gene (1-2% of DNA)
from the remaining
DNA (98%)?
This has posed a
problem for some
time; now that it is
being solved, the
question becomes,
what does the 'gene'
do?
Figure 4-7 Molecular Biology of the Cell (© Garland Science 2008)
Can you see, at
a quick glance, a
gene in the
sequence at the
left?
The yellow highlighted
bases signify the
beta globin gene!!!
Genes are subject to the following:
1. They must be recognized by a polymerase, that
is, an RNA polymerase that carries out the gene copying
called TRANSCRIPTION (compare DNA polymerase)
2. The collective DNA sequence that summons forth
RNA polymerase is called a PROMOTER
3. The information copied into RNA downstream of
the promoter must be readable
(CODING SEQUENCE); i.e. no stop codons until
the naturally determined end of translation
4. There has to be a place after the coding sequence
that signals the end of transcription, which is distinct from the
end of translation
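Requirement 3 can be written as a simple test on a spliced coding sequence: it starts with ATG, its length is a whole number of codons, and the only stop codon is the last one. The function name and the short test sequences below are invented for illustration and are not real genes.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(cds):
    """Return True if cds reads from ATG to a single, final stop codon.

    cds is a DNA coding sequence (sense strand), already spliced.
    """
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    has_final_stop = codons[-1] in STOP_CODONS
    has_early_stop = any(c in STOP_CODONS for c in codons[:-1])
    return has_final_stop and not has_early_stop


print(is_open_reading_frame("ATGGCCAAGGTGCTGTAA"))  # readable from start to stop
print(is_open_reading_frame("ATGGCCTAGGTGCTGTAA"))  # premature TAG in frame
```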
The eukaryotic gene’s general features and processing characteristics
[Schematic, 5’ to 3’: promoter (p) | exon (ATG) | AGGT…A…AGG | exon | AGGT…A…AGG | exon | AGGT…A…AGG | exon (STOP) | 3’UTR with AATAAA]
The gene is controlled by a promoter (p), which is not simple: there are
general transcription factors and more gene-specific ones that may bind
outside of the promoter proper, within the gene, within the 3’ end of the gene
or even far 5’ and/or 3’ of the gene itself. They open the DNA and expose sites
The gene is structured in ‘staccato,’ with coding sequence (exons) interrupted by
noncoding intervening sequences, called introns; the coding sequence begins with the ATG
Met codon and ends with one of three translational termination codons
(TAA, TAG, TGA)
Termination of transcription occurs in the 3’ untranslated region (3’UTR), which
possesses termination signals and an RNA domain which drives 3’ processing, the
AATAAA polyadenylation signal
Exon-intron borders possess sequences which aid in splicing, AG/GT……A……AG/G,
along with small nuclear RNAs forming the spliceosome
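The exon/intron layout can be illustrated by splicing a toy gene. In the sketch below the sequences, the coordinates and the `splice` helper are all invented for illustration. The only rule it applies is the one stated above: each intron begins with GT and ends with AG. Real splice-site recognition involves more than these two dinucleotides.

```python
def splice(gene, exon_spans):
    """Join the exons of gene, given (start, end) spans in 5' to 3' order.

    Raises ValueError if an intron between two exons does not begin
    with GT and end with AG (toy check, for illustration only).
    """
    for (_, intron_start), (intron_end, _) in zip(exon_spans, exon_spans[1:]):
        intron = gene[intron_start:intron_end]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            raise ValueError(f"intron {intron!r} lacks GT...AG borders")
    return "".join(gene[start:end] for start, end in exon_spans)


# Invented toy sequences: exon 1 ends in AG, exon 2 begins with G.
EXON_1 = "ATGGCCAAG"
INTRON = "GTAAGTCTAACTTTCAG"
EXON_2 = "GTGCTGTAA"

gene = EXON_1 + INTRON + EXON_2
spans = [(0, len(EXON_1)), (len(EXON_1) + len(INTRON), len(gene))]

mrna = splice(gene, spans)
print(mrna)  # the two exons joined, reading ATG ... TAA in frame
```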
[Schematic: 5’UTR | exon 1 | exon 2 | exon 3 | exon 4 | 3’UTR]
CpG islands: clusters of the otherwise under-represented CpG dinucleotide, found at the 5’ end of many eukaryotic genes
[Schematic: the same gene layout (5’, p, exons from AUG to STOP, AGGT…A…AGG splice sequences, AATAAA, 3’UTR), with a methylase (CH3ase) acting on [CG] at the 5’ end]
Keeping DNA euchromatic also rests upon factors that bind to the C’s and G’s of the CpG ‘islands’ and
protect them from cytosine methylases, best known for their role in
imprinting
Let’s try a poor analogy, constrained by the English
language and a dearth of three-letter words, but
here goes….
Find the three-letter (codon)-containing ‘exons’ that
make a kind of sensible phrase (names included). This is comparable to an open reading frame
Word
DNA
…..Wlsjeutlsjimsatouttutyecmdsisladksltkald
Thedayforeeeuslkeiandseveeubhismomand
ttugosocunntewherebudtedandtueislsiecn
Tisnggotallsixeooaltaxlekqzztiellforthebigbadsum
rrrrrrrrrrrrteidas………
Answer: jimsatoutthedayforhismomandbudted
andgotallsixforthebigbadsum……….
jim sat out the day for his mom and bud ted
and got all six for the big bad sum……….
What happens if I delete the s in ‘his’?
Jim sat out the day for him oma ndb udt eda
ndg ota lls ixf ort heb igb ads um……….
FRAMESHIFT—the OPEN READING FRAME
IS GONE
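The same deletion can be reproduced in a few lines: remove one letter from the spliced phrase and re-read it three letters at a time. The helper name is invented for illustration; the phrase is the one from the analogy.

```python
def read_in_triplets(text):
    """Split text into consecutive three-letter 'codons' from the first letter."""
    return [text[i:i + 3] for i in range(0, len(text), 3)]


phrase = "jimsatoutthedayforhismomandbudtedandgotallsixforthebigbadsum"

# Delete the single letter 's' of 'his'.
s_index = phrase.index("his") + 2
mutated = phrase[:s_index] + phrase[s_index + 1:]

print(" ".join(read_in_triplets(phrase)))
print(" ".join(read_in_triplets(mutated)))
```

Every word upstream of the deletion is read as before; every triplet downstream is shifted by one letter and loses its meaning.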
RNA
Figure 6-51 Molecular Biology of the Cell (© Garland Science 2008)
CODING
SEQUENCE
IS CONSERVED
SEQUENCE
ACROSS
SPECIES
LEPTIN
GENE
ALIGNMENT
Figure 4-76 Molecular Biology of the Cell (© Garland Science 2008)
THERE IS GREATER EVOLUTIONARY PRESSURE
TO CONSERVE CODING SEQUENCE (EXONS) THAN
INTRON SEQUENCES
Figure 4-78 Molecular Biology of the Cell (© Garland Science 2008)
DNA, the puzzle.2
Humans have approximately 23,000 genes (down from the 80-140k prediction)
Genes are dispersed along the chromosomes in what appears to be a random
fashion, although many gene clusters exist which seem to aid coordinate
expression: globin, histone, immunoglobulin, MHC, etc.
Some chromosomes are richer in genes than others, although
chromosome size roughly correlates with gene number
A gene’s location is termed its locus, as we have touched upon
Genes vary in size, from beginning to end,
and in their number of exons, whose tally following splicing
must equal an open reading
frame, or ORF
Exon size varies, but averages about 200 base pairs (based on my
knowledge of the Ig superfamily members); their translated sequences often
equate to ‘domains,’ units of primary amino acid sequence that perform a function
The average protein is 45 kDa (taking 110 Da as the mass of an average amino acid, about 410 residues); the
average size of a spliced gene (mRNA) is 1.5 kb; therefore, 23,000 genes amount to roughly 35 Mb of
spliced sequence, on the order of 1% of the approximately 3,200 Mb human genome
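The arithmetic behind this estimate can be laid out step by step. The protein mass, amino acid mass, mRNA size and gene number are the figures given in the lecture. The haploid genome size of about 3.2 billion base pairs is an assumption added here, not a number from the slide, and the result scales inversely with it.

```python
AVERAGE_PROTEIN_DA = 45_000      # average protein mass, from the slide
AVERAGE_RESIDUE_DA = 110         # average amino acid mass, from the slide
AVERAGE_MRNA_BP = 1_500          # average spliced gene (mRNA), from the slide
GENE_COUNT = 23_000              # approximate human gene number, from the slide
GENOME_BP = 3_200_000_000        # ASSUMPTION: haploid human genome size

residues = AVERAGE_PROTEIN_DA / AVERAGE_RESIDUE_DA   # amino acids per protein
coding_bp_per_gene = residues * 3                    # three bases per codon

coding_fraction = GENE_COUNT * coding_bp_per_gene / GENOME_BP
mrna_fraction = GENE_COUNT * AVERAGE_MRNA_BP / GENOME_BP

print(f"residues per average protein: {residues:.0f}")
print(f"coding bases per gene:        {coding_bp_per_gene:.0f}")
print(f"coding fraction of genome:    {coding_fraction:.2%}")
print(f"spliced mRNA fraction:        {mrna_fraction:.2%}")
```

Under these inputs both fractions come out at about 1%, of the same order as the 1-2% quoted earlier for genes as a share of DNA.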
http://www.cshlp.org/ghg5_all/section/gene.shtm
BIG GENES