Lecture - 9
Genomics - 3
GEB 406
Course Instructor: Sheikh Ahmad Shah
Semester: Summer 2016
Review of a Few Terms
Homolog:
A gene related to a second gene by descent from a common
ancestral DNA sequence.
Ortholog:
Orthologs are genes in different species that evolved from a
common ancestral gene by speciation.
Paralog:
Paralogs are genes related by duplication within a genome.
Understanding Evolution through Genomics
Genome sequences offer a better understanding of evolution, because
organisms alive today carry in their genomes information that has been
shaped by evolution. The nucleus, mitochondria, and chloroplasts most
probably came into existence through “endosymbiotic” events. Chloroplast
and mitochondrial genomes look very much like those of Eubacteria, so
the debate about their origin had already been settled. The nuclear
genome, however, is very different from those genomes.
To address this, a group of four Japanese scientists from Takao
Shinozawa's lab at Gumma University in Kiryu, Japan, used whole-genome
sequence comparisons to determine the origin of the nucleus.
They used the BLAST program to determine the number of
orthologous yeast genes in six Archaea and nine Eubacteria genomes.
The first group of yeast genes they tested were those involved in
"nuclear organization," and they compared the number of orthologs
of yeast genes (i.e., the hit number) in each prokaryotic genome.
Archaea genomes were found to contain more orthologs than
Eubacteria genomes. The next two groups of genes were involved in
metabolism and energy production. For those groups, yeast orthologs
were more numerous in Eubacteria genomes.
The group thus concluded that nuclear organization genes originated
in Archaea, not in Eubacteria.
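The "hit number" comparison described above can be sketched in a few lines of Python. This is only an illustration, not the group's actual pipeline: the E-value cutoff, the function names, and the toy hit records are all assumptions, and a BLAST hit below a cutoff is only a rough stand-in for a true ortholog.

```python
from collections import defaultdict

def hit_numbers(hits, max_evalue=1e-5):
    """Count, per genome, the distinct query genes with at least one hit.

    `hits` is an iterable of (query_gene, genome, evalue) tuples.
    The 1e-5 cutoff is an assumed threshold, not one given in the lecture.
    """
    genes_per_genome = defaultdict(set)
    for gene, genome, evalue in hits:
        if evalue <= max_evalue:
            genes_per_genome[genome].add(gene)
    return {genome: len(genes) for genome, genes in genes_per_genome.items()}

def mean_hits_by_domain(counts, domain_of):
    """Average the hit numbers over the genomes of each domain.

    Genomes listed in `domain_of` that have no hits count as zero.
    """
    per_domain = defaultdict(list)
    for genome, domain in domain_of.items():
        per_domain[domain].append(counts.get(genome, 0))
    return {domain: sum(v) / len(v) for domain, v in per_domain.items()}

# Toy, made-up records (hypothetical names; not data from the study).
toy_hits = [
    ("geneA", "archaeon_1", 1e-20),
    ("geneA", "archaeon_1", 1e-8),   # second hit for the same gene, counted once
    ("geneB", "archaeon_1", 1e-12),
    ("geneA", "bacterium_1", 1e-7),
    ("geneB", "bacterium_1", 0.5),   # too weak, ignored
]
toy_domains = {"archaeon_1": "Archaea", "bacterium_1": "Eubacteria"}

counts = hit_numbers(toy_hits)
means = mean_hits_by_domain(counts, toy_domains)
print(counts)
print(means)
```

Running the same comparison separately for each functional group of yeast genes (nuclear organization, metabolism, energy production) would give one pair of averages per group, which is the kind of contrast the study relied on.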
The aphid-Buchnera relationship is another example, and it gives good
insight into an intermediate stage of eukaryotic genome
evolution. Aphids are unusual insects that do not excrete
any nitrogenous waste. All other organisms excrete nitrogenous
waste, so aphids are an exception to nature’s rule.
It was then found that most aphid species have 60-80 large cells in
their abdomens, called bacteriocytes, in which Buchnera bacteria are
found. The bacteria are transmitted from the mother to the next
generation of aphids. Aphid and Buchnera depend on each other,
a relationship that was established about 225 million years
ago.
The genome of Buchnera was sequenced. Its total genomic
content was about 650 kb, which made it the second smallest genome
sequenced at that time. This suggests that Buchnera might have lost
some of its original DNA.
By analyzing the genome, scientists found 583 ORFs. Using BLAST,
500 of these ORFs were matched to similar genes in E. coli, which
has 4288 genes in total. Only four genes were found to be unique to
the Buchnera genome. The researchers concluded that Buchnera
evolved from a close relative of modern E. coli, although Buchnera
has lost 75% of its genome. This phenomenon provides a good
example of the steps that occur during endosymbiosis.
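The counts quoted above can be turned into proportions with a little arithmetic. The sketch below uses only the numbers given in the lecture; the variable names are illustrative. Note that the gene-count ratio against modern E. coli is a different measure from the 75% loss estimate, which refers to Buchnera's own ancestral genome.

```python
# Counts taken from the lecture text.
buchnera_orfs = 583
orfs_matched_to_ecoli = 500
ecoli_genes = 4288

# Share of Buchnera ORFs that have a similar gene in E. coli.
matched_fraction = orfs_matched_to_ecoli / buchnera_orfs

# Buchnera ORF count relative to the modern E. coli gene count.
gene_count_ratio = buchnera_orfs / ecoli_genes

print(f"ORFs with an E. coli match: {matched_fraction:.1%}")
print(f"Buchnera ORFs / E. coli genes: {gene_count_ratio:.1%}")
```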
Aphids cannot produce all of the amino acids they need. Buchnera
provides the aphids with the necessary amino acids. In return, Buchnera
enjoys all the other cellular facilities of the aphid. Buchnera thus lost
many genes during its mutualistic relationship with aphids, retaining
only the genes that are necessary for that relationship.
Buchnera could not survive on its own, as it lacks
many DNA repair enzymes, cell wall synthesizing enzymes, and
phospholipid synthesizing enzymes. But Buchnera has its own ATP
synthesizing mechanisms. Thus, the aphid-Buchnera relationship looks
like an intermediate stage of the cell-mitochondrion relationship.
Another good example of a transitional state in the development of
the eukaryotic nucleus is provided by a bacterium,
Mycobacterium leprae. Its complete genome sequence was finished in
February 2001. Earlier, in 1998, the genome of M.
tuberculosis had been sequenced.
Researchers found that only 50% of the M. leprae (leprosy) genome is
protein-coding, encoding 1604 proteins, whereas 91% of the M.
tuberculosis genome is coding, with 3959 proteins. M. leprae also has
1000 pseudogenes, whereas pseudogenes are scarce in M.
tuberculosis. It appears that the M. leprae genome has lost over 2000
functional genes and over 1 million base pairs of DNA.
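The "over 2000 functional genes" estimate can be checked against the protein counts quoted above. This is a rough sketch that assumes the M. tuberculosis protein count is a reasonable stand-in for the ancestral gene content, which is an assumption made here for illustration and not something the lecture states.

```python
# Protein counts taken from the lecture text.
tuberculosis_proteins = 3959
leprae_proteins = 1604

# Difference in protein-coding genes between the two genomes.
gene_difference = tuberculosis_proteins - leprae_proteins

# Fraction of the M. tuberculosis count that M. leprae still encodes.
retained_fraction = leprae_proteins / tuberculosis_proteins

print(f"Fewer protein-coding genes in M. leprae: {gene_difference}")
print(f"Retained relative to M. tuberculosis: {retained_fraction:.1%}")
```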
The M. leprae genome lacks many genes compared with that of M.
tuberculosis. It contains a gene (ProS) that is not homologous to the
M. tuberculosis version but is instead homologous to the ProS of
another bacterium, Borrelia. M. leprae may once have had the original
ProS, which can still be found in M. tuberculosis, and may later have
obtained a ProS gene from a Borrelia ancestor and retained that copy.
M. leprae is also barely able to synthesize its membrane lipids,
but it has a unique lipid-synthesizing gene that
compensates for the loss of more common lipids. M. leprae
derives its energy by metabolizing lipids, while lacking many
genes involved in carbon metabolism.
Using Genomic Sequence to Make New Vaccine
Meningitis and sepsis involve inflammation of the outer lining of
the brain (meningitis) or infection of the blood (sepsis). If left
untreated, they can rapidly progress to cause permanent nerve damage
or even death. The principal cause is a bacterium called “Neisseria
meningitidis”. There are five serotypes, and among those, serotype
B was particularly dangerous because no good vaccine was available
for this strain. The reasons were:
1) The outer capsule of serotype B is made of a polysaccharide that
is very similar to sugars found on human cells.
2) The outer surface of serotype B carries a protein that elicits a
strong immune response, but this protein is highly variable among
different strains of serotype B.
The genome of N. meningitidis was sequenced by a group that included
TIGR. Using ORF-finding software and comparing the genome with
those of other bacteria, they identified 570 ORFs, out of 2158 in
total, that were predicted to encode surface-associated or
secreted proteins.
After these 570 ORFs were overexpressed in E. coli, 350 different
proteins were obtained. The researchers then applied several
approaches to find which proteins can elicit an immune response and
found 7 proteins that were positive in all the assays. These
proteins were termed GNAs (Genome-derived Neisseria
Antigens).
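The screening steps above form a funnel, and the share of candidates surviving each step can be computed from the numbers in the text. The sketch below is only a summary of those counts; the stage labels and the `funnel` helper are illustrative names, not part of the original study.

```python
# Stage counts taken from the lecture text; labels are paraphrased.
stages = [
    ("ORFs in the genome", 2158),
    ("predicted surface or secreted proteins", 570),
    ("proteins obtained in E. coli", 350),
    ("positive in all immune assays", 7),
]

def funnel(stages):
    """Return (label, count, fraction kept from the previous stage) per step."""
    rows = []
    for (_, previous), (label, count) in zip(stages, stages[1:]):
        rows.append((label, count, count / previous))
    return rows

rows = funnel(stages)
for label, count, kept in rows:
    print(f"{label}: {count} ({kept:.1%} of previous stage)")
```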
The degree of conservation of those candidate proteins was then
determined by comparing them across 31 other Neisseria strains of
different serotypes. By combining the immunological
results with the percentage of conservation among the strains,
two proteins were finally selected as targets against which a vaccine
might be effective.
Thus, the investigators started with whole-genome sequence
information and selected two highly promising proteins to be used
as vaccine candidates against N. meningitidis serotype B, which had
resisted previous attempts.
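One simple way to express "percentage of conservation" is the percent identity between a candidate protein and its aligned counterpart in each other strain, averaged over the strains. The lecture does not say how conservation was actually measured, so the sketch below is an assumed approach; the function names and the short toy sequences are hypothetical and are not real Neisseria proteins.

```python
def percent_identity(a, b):
    """Percent of aligned positions where both sequences carry the same residue.

    Both inputs must already be aligned to the same length; '-' marks a gap
    and a gap never counts as a match.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def mean_conservation(reference, others):
    """Average percent identity of a reference sequence against other strains."""
    return sum(percent_identity(reference, s) for s in others) / len(others)

# Toy aligned sequences, invented for illustration only.
toy_reference = "MKTLLAG"
toy_strains = ["MKTLLAG", "MKTLIAG", "MRTLLAG"]

conservation = mean_conservation(toy_reference, toy_strains)
print(f"Mean conservation: {conservation:.1f}%")
```

A candidate with a strong immune response but low conservation would protect against few strains, which is why both measures were combined in the selection.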
Using Genomic Sequence to Make New Antibiotics
While working with the E. coli genome, scientists found critical
enzymes in the biosynthetic pathway of lipopolysaccharide (LPS).
Merck, a giant pharmaceutical company, has millions
of compounds in its chemical library. Such libraries dramatically
reduce the number of substances that need experimental activity
assessment.
A high-throughput screening procedure was used to identify
candidate drugs that could block LPS synthesis, and one compound
was found to be the best candidate. When the drug was tested in
infected mice, it gave positive results. This is how a new kind of
antibiotic was developed using genomic information.
Using Genomic Sequence to Make Personalized Medicine
One famous example is AZT, a prodrug that must be activated
inside the human body. Cytochrome P450 is the enzyme that
metabolizes the prodrug. According to the activity of
cytochrome P450, people fall into three classes: typical
metabolizers, poor metabolizers, and ultra-rapid metabolizers.
Drug doses are selected for “average people” with typical
metabolizing rates. The dose selected for average people may not
work for poor or ultra-rapid metabolizers. Genomic research was
done to address this problem.
Cytochrome P450 enzymes are encoded by separate genes, two of which
are called 2C19 and 2D6. In exon 4 of the 2D6 gene, an SNP was found
that affects protein production. People with defective 2D6 cannot
activate many popular prodrugs. There is also an SNP in exon 5 of the
2C19 gene that inactivates the protein. Only 2-3% of
Caucasians carry that SNP and are therefore poor 2C19
metabolizers, but about 23% of Asian people carry it. So a drug
designed for the Caucasian population may not work in almost 1
in 4 Asian patients. Several measures are now being taken to
determine population-specific recommended dosages for such
drugs.
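The population difference above is easier to see as expected patient counts. The sketch below takes the lecture's frequencies at face value and treats every carrier as a poor metabolizer, which is a simplification; the cohort size of 1000 and the function name are hypothetical.

```python
def expected_poor_metabolizers(cohort_size, carrier_fraction):
    """Expected number of poor metabolizers in a cohort of the given size."""
    return cohort_size * carrier_fraction

cohort = 1000  # hypothetical cohort size, chosen only for illustration

# Frequencies quoted in the lecture text.
asian = expected_poor_metabolizers(cohort, 0.23)
caucasian_low = expected_poor_metabolizers(cohort, 0.02)
caucasian_high = expected_poor_metabolizers(cohort, 0.03)

# "Almost 1 in 4": how many patients per expected carrier at 23%.
one_in = 1 / 0.23

print(f"Asian patients affected per {cohort}: about {asian:.0f}")
print(f"Caucasian patients affected per {cohort}: about "
      f"{caucasian_low:.0f} to {caucasian_high:.0f}")
print(f"A 23% frequency is about 1 in {one_in:.1f}")
```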