Lecture - 9 Genomics - 3
GEB 406
Course Instructor: Sheikh Ahmad Shah
Semester: Summer 2016

Review of Few Terms

Homolog: A gene related to a second gene by descent from a common ancestral DNA sequence.
Ortholog: Orthologs are genes in different species that evolved from a common ancestral gene by speciation.
Paralog: Paralogs are genes related by duplication within a genome.

Understanding Evolution through Genomics

Genome sequences give us a better understanding of evolution. Organisms alive today carry in their genomes information that has been shaped by evolution. Organelles such as the nucleus, mitochondria, and chloroplasts most probably came into existence through endosymbiotic events. Chloroplast and mitochondrial genomes look very much like eubacterial genomes, so the debate over their origin has already been settled. But the nuclear genome is very different from other genomes. With that in mind, a group of four Japanese scientists from Takao Shinozawa's lab at Gunma University in Kiryu, Japan, used whole-genome sequence comparisons to determine the origin of the nucleus.

They used the BLAST program to count the orthologs of yeast genes in six archaeal and nine eubacterial genomes. The first group of yeast genes they tested were those involved in "nuclear organization," and they compared the number of orthologs to these yeast genes (i.e., the hit number) in each prokaryotic genome. Archaeal genomes were found to contain more orthologs than eubacterial genomes. The next two groups of genes were involved in metabolism and energy production. For these groups, yeast orthologs were more numerous in eubacterial genomes. The group thus concluded that nuclear organization genes originated in Archaea, not in Eubacteria.

Another example, the aphid-Buchnera relationship, gives us good insight into an intermediate stage of eukaryotic genome evolution.
The aphid is a special type of insect that does not excrete any nitrogenous waste. All other organisms excrete nitrogenous waste, so the aphid is an exception to nature's rule. It was later found that most aphid species have 60-80 large cells in their abdomens, called bacteriocytes, in which Buchnera bacteria live. The bacteria are transmitted from the mother to the next generation of aphids. The aphid and Buchnera depend on each other, a relationship that was established about 225 million years ago.

The genome of Buchnera was sequenced. Its total genomic content was about 650 kb, which made it the second-smallest genome sequenced at that time. This indicates that Buchnera might have lost some of its original DNA. By analyzing the genome, scientists found 583 ORFs. Using BLAST, 500 ORFs were assigned to similar genes from E. coli, which has 4288 genes in total. Only four genes were found to be unique to the Buchnera genome. The researchers concluded that Buchnera evolved from a close relative of modern E. coli, although Buchnera has lost 75% of its genome.

Aphids cannot produce all the amino acids, so Buchnera provides the ones they lack. In return, Buchnera enjoys all the other cellular facilities of the aphid. Buchnera thus lost many genes during its mutual relationship with aphids, retaining only the genes that are necessary for that relationship. This phenomenon provides a great example of the steps that occur during endosymbiosis. Buchnera would not survive on its own, as it lacks many enzymes for DNA repair, cell wall synthesis, and phospholipid synthesis. But Buchnera does have its own ATP-synthesizing machinery. Thus, the aphid-Buchnera relationship looks like an intermediate of the cell-mitochondrion relationship.
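
The ORF counts quoted above for Buchnera can be turned into simple proportions. The following is only a minimal sketch using the numbers from this lecture; the variable names are illustrative, and the "other" category (ORFs neither assigned to E. coli nor reported as unique) is an assumption obtained by subtraction, since the lecture does not describe those ORFs.

```python
# Figures quoted in the lecture for the Buchnera genome analysis.
BUCHNERA_ORFS = 583        # ORFs found in the Buchnera genome
MATCHED_TO_ECOLI = 500     # ORFs assigned to similar E. coli genes by BLAST
UNIQUE_TO_BUCHNERA = 4     # genes reported as unique to Buchnera
ECOLI_GENES = 4288         # total genes in E. coli


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of Buchnera ORFs with a similar gene in E. coli.
matched_pct = percent(MATCHED_TO_ECOLI, BUCHNERA_ORFS)

# Share of Buchnera ORFs reported as unique.
unique_pct = percent(UNIQUE_TO_BUCHNERA, BUCHNERA_ORFS)

# Assumption: whatever is left over is neither matched nor unique.
other_orfs = BUCHNERA_ORFS - MATCHED_TO_ECOLI - UNIQUE_TO_BUCHNERA

# Buchnera ORF count relative to the modern E. coli gene count.
orf_ratio_pct = percent(BUCHNERA_ORFS, ECOLI_GENES)

print(f"Matched to E. coli: {matched_pct:.1f}% of Buchnera ORFs")
print(f"Unique to Buchnera: {unique_pct:.1f}% of Buchnera ORFs")
print(f"Other ORFs (by subtraction): {other_orfs}")
print(f"Buchnera ORFs as a share of E. coli genes: {orf_ratio_pct:.1f}%")
```

The last ratio compares Buchnera with modern E. coli rather than with the unknown ancestral genome, so it should not be read as a direct estimate of how much DNA was lost.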

Another good example of a transitional state in the development of the eukaryotic nucleus is provided by a bacterium, Mycobacterium leprae. Its complete genome sequence was finished in February 2001. The genome of M. tuberculosis had been sequenced earlier, in 1998. Only 50% of the M. leprae genome is protein-coding, encoding 1604 proteins, whereas 91% of the M. tuberculosis genome is protein-coding, with 3959 proteins. There were also 1000 pseudogenes in M. leprae, whereas pseudogenes were scarce in M. tuberculosis. It appears that the M. leprae genome has lost over 2000 functional genes and over 1 million base pairs of DNA.

The M. leprae genome lacks many genes compared to M. tuberculosis. It contains a gene (ProS) that is not homologous to its M. tuberculosis counterpart but is instead homologous to the gene in another bacterium, Borrelia. M. leprae may once have had the original ProS, which can still be found in M. tuberculosis, but it may also have obtained a ProS gene from a Borrelia ancestor and finally retained that copy. In addition, M. leprae is barely able to synthesize its membrane lipids, but it carries a unique lipid-synthesizing gene that compensates for the loss of the more common lipids. M. leprae derives its energy by metabolizing lipids and lacks many genes involved in carbon metabolism.

Using Genomic Sequence to Make New Vaccine

Meningitis involves inflammation of the outer lining of the brain, and sepsis involves the blood. If left untreated, they can rapidly progress to cause permanent nerve damage or even death. The principal cause is a bacterium called Neisseria meningitidis. There are five serotypes, and among those, serotype B was especially dangerous because no good vaccine was available for it. The reasons were:
1) The outer capsule of serotype B is made of a polysaccharide that is very similar to sugars found on human cells.
2) The outer surface of serotype B contains another protein that provokes a strong immune response, but the protein is highly variable among different strains of serotype B.

The genome of N. meningitidis was sequenced by a group involving The Institute for Genomic Research (TIGR). Using ORF-finding software and comparing the genome with those of other bacteria, they predicted that 570 of the 2158 total ORFs were associated with surface proteins and secreted proteins. After these 570 ORFs were overexpressed in E. coli, 350 different proteins were obtained. The researchers then applied several approaches to find which proteins could elicit an immune response and found 7 proteins that were positive in all the assays. These proteins were termed GNAs (Genome-derived Neisseria Antigens).

Next, the degree of conservation of those candidate proteins was determined by comparing them across 31 other Neisseria strains of different serotypes. By combining the immunological results with the percentages of conservation among the strains, two proteins were finally selected as targets against which vaccine production might succeed. Thus, the investigators started with whole-genome sequence information and selected two highly promising proteins to be used as a vaccine against N. meningitidis serotype B, which had resisted previous attempts.

Using Genomic Sequence to Make New Antibiotics

While working with the E. coli genome, scientists found critical enzymes in the biosynthetic pathway of lipopolysaccharide (LPS). Merck, a giant pharmaceutical company, has millions of compounds in its chemical library. Such libraries dramatically reduce the number of substances that must be assessed experimentally for activity. After a high-throughput screening procedure to identify candidate drugs that could block LPS synthesis, one compound emerged as the best candidate.
When the drug was tested in infected mice, it gave positive results. This is how a new kind of antibiotic was developed using genomic information.

Using Genomic Sequence to Make Personalized Medicine

One famous example is AZT, a prodrug that must be activated inside the human body. Cytochrome P450 is an enzyme family that metabolizes many prodrugs. According to the activity of their cytochrome P450 enzymes, people fall into three classes: typical metabolizers, poor metabolizers, and ultra-rapid metabolizers. Drug doses are selected for "average people" with typical metabolizing rates, so the selected dose may not work for poor or ultra-rapid metabolizers. Genomic research was done to solve this problem.

The relevant cytochrome P450 enzymes are encoded by two separate genes, called 2C19 and 2D6. In exon 4 of the 2D6 gene, an SNP was found that affects protein production. People with defective 2D6 cannot activate many popular prodrugs. There is also an SNP in exon 5 of the 2C19 gene that inactivates the protein. Only 2-3% of Caucasians carry this variant and are therefore poor 2C19 metabolizers, but about 23% of Asian people carry it. So a drug designed for the Caucasian population may not work for almost one in four Asian patients. Several measures are now being taken to determine population-specific recommended dosages for these drugs.
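
To make the population difference concrete, the frequencies quoted above can be converted into expected patient counts. This is a minimal sketch, not a dosing tool: the function names are illustrative, the Caucasian value takes the upper end of the 2-3% range, and it assumes, as the lecture does, that carrying the 2C19 variant means being a poor metabolizer.

```python
# Poor 2C19 metabolizer frequencies as quoted in the lecture.
# Assumption: 0.03 is the upper end of the 2-3% range for Caucasians.
POOR_2C19_FREQUENCY = {
    "Caucasian": 0.03,
    "Asian": 0.23,
}


def expected_poor_metabolizers(cohort_size, frequency):
    """Expected number of poor metabolizers in a cohort of the given size."""
    return cohort_size * frequency


def one_in_n(frequency):
    """Express a frequency as '1 in N' by returning N."""
    return 1.0 / frequency


for population, frequency in POOR_2C19_FREQUENCY.items():
    expected = expected_poor_metabolizers(1000, frequency)
    print(
        f"{population}: about {expected:.0f} poor metabolizers per 1000 patients "
        f"(roughly 1 in {one_in_n(frequency):.1f})"
    )
```

A dose tuned on a cohort where poor metabolizers are rare gives little information about how the drug behaves in a population where they are common, which is the motivation for population-specific dosage recommendations.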