What have we learned from Unicellular Genomes?

Propionibacterium acnes
• Responsible for acne; its genome was sequenced in 2004.
• It lives on human skin in sebaceous follicles and feeds on sebum, which stimulates an inflammatory immune response.
• Can we understand pimples?

Anatomy of acne

Propionibacterium acnes genome
• Sequenced by three different groups.
– 32 190 sequencing reactions
– 8.7-fold coverage of the 2 560 265 bp genome
– Error rate of 0.0001
– The genome consists of a single circular chromosome with no additional plasmids.
– Annotation of 2333 putative genes allowed reconstruction of its metabolism.

Propionibacterium acnes genome
• 12% encode RNA products (rRNA and tRNA).
• 1578 genes (68%) are orthologous to genes in other organisms, and 20% match nothing known.

GC skewing
• A non-uniform distribution of guanine and cytosine bases between the two strands of DNA.
– The origin of replication has the lowest GC skew (even distribution).
– The terminus of replication has higher GC skew.

Horizontal Transfer
• Genes appeared in the genome through an unknown mechanism.
• To find alien genes, scan the genome with a sliding window for segments that have an abnormal GC content (either higher or lower than the species average) and evaluate the codon bias.
– Codon bias: which codon is used more often than the other codons for a particular amino acid.

Transcriptional Phase Variation
• Variation in the number of Gs is used to produce transcriptional variation.
• Initiation of transcription depends on the number of consecutive guanines on a particular strand at a critical location upstream of the coding region.
• Runs of repeated bases are difficult to replicate accurately, which affects transcriptional efficiency.

Which genes cause pimples?
• Metabolic reconstruction:
• Can grow anaerobically and aerobically.
• Has many enzymes to degrade lipids, esters and amino acids.
• P. acnes digestive enzymes have an LPXTG motif that targets proteins to the cell wall; these enzymes chew away on your cells.
– The cell-wall sorting signal LPXTG is responsible for covalently anchoring proteins to the cell-wall peptidoglycan.
– LPXTG is the target for cleavage and covalent coupling to the peptidoglycan by enzymes called sortases.

Which genes cause pimples?
• The cell exterior is decorated with hyaluronate lyase, which destroys the extracellular matrix binding your skin cells together and thus facilitates further tissue invasion and digestion.

LPxTG Database: Sortase substrates
http://bamics3.cmbi.kun.nl/cgi-bin/jos/sortase_substrates/index.py

Stimulation of immune response
• The genome encodes five CAMP (Christie, Atkins, Munch-Petersen) factors. CAMP factors are secreted proteins that bind to antibodies (IgG and IgM) and can form pores in eukaryotic cell membranes.
• Lysis of our cells triggers an immune response.

CAMP factors
• Proteins from BACTERIA and FUNGI that are soluble enough to be secreted, target ERYTHROCYTES, and insert into the membrane to form beta-barrel pores. Biosynthesis may be regulated by HEMOLYSIN FACTORS.

Quorum Sensing
• Many bacteria have evolved the ability to condition culture medium by secreting low-molecular-weight signaling pheromones in association with growth phase to control expression of specific genes, a process termed quorum sensing.
– Bioluminescence
– Antibiotic biosynthesis
– Pathogenicity
– Plasmid conjugal transfer

Quorum Sensing
• LuxS produces the precursor of autoinducer-2 (AI-2), 4,5-dihydroxy-2,3-pentanedione (DPD), while converting S-ribosylhomocysteine to homocysteine.

Are All Bacteria Living in Us Bad for Us?
• An average adult body is composed of about 10 trillion human cells.
• Every milliliter of your large intestine's content is estimated to contain 10 billion microbes, and our intestines contain about 1 L.
• About 500 to 1000 different species live in an adult's intestines.

Bacteroides thetaiotaomicron
• 31 million bases
• Assembly of 867 contigs with many gaps.
• Finished assembly by PCR.
• 67 938 sequencing runs assembled into a single 6 260 361 bp circular contig.
• Annotated 4779 predicted ORFs: 58% are orthologs of known function, 18% are orthologs of proteins with no known function, and 24% have no recognizable sequence similarity.

COGs
• Clusters of orthologous groups (COGs) are functional categories of genes.
• They are a phylogenetic classification of proteins encoded in complete genomes.
– Transcription
– Energy production, etc.
http://www.ncbi.nlm.nih.gov/COG/

Eukaryotic Clusters
(Figure labels: ADH, CDH)

Bacteroides thetaiotaomicron
• It can metabolize sugars.
• 170 genes for polysaccharide metabolism; paralogs of 23 genes.
• E. coli has only 8 of them.
• It can also import sugars into its own cytoplasm.
– Two genes, SusC and SusD, are represented by 163 paralogs.

Transposable Elements
• 63 TEs contain ORFs (open reading frames) that help spread tetracycline and erythromycin resistance between individual cells and between species in the microbiota of the gut.

Coding Capacity
• Gene density for B. thetaiotaomicron is 89%.
• The average gene size is 1170 bp, the largest among bacteria.
– M. genitalium: 1100 bp
– H. pylori: 1000 bp
– E. coli: 950 bp

Can Microbial Genomes Become Dependent upon Human Genes?
• Mycoplasma genitalium has the second smallest bacterial genome of a self-replicating species (580 070 bp).
• A team at TIGR (The Institute for Genomic Research)
– 5 people in 8 weeks assembled 8472 high-quality sequencing reactions.
– Overall GC content is 32%.
– GC skew reveals the origin of replication at the DnaA and DnaN genes.
• Genes to the right of the origin are transcribed from the plus strand.
• Genes to the left of the origin are transcribed from the minus strand.
• tRNA and rRNA genes have higher GC content: 52% and 44%.

Genome Map
• 470 ORFs; 88% coding capacity; the average gene is 1040 bp.
• Retained genes for energy metabolism, fatty acid and phospholipid (PL) metabolism, replication, transcription, and protein transport.
• Lost DNA for functions it no longer needs.
– Amino acid (aa) synthesis
– Cofactors
– Cell envelope
– Regulatory factors

Synteny
• When a series of genes is conserved in order and orientation between two or more species, the genes are described as syntenic.
– M. genitalium and H. influenzae have similar gene orders with respect to two clusters of ribosomal proteins.

Minimum Number of Genes
• Synthetic biology: to synthesize de novo (from scratch) a functioning genome with as few genes as possible.
– Bacillus subtilis: 190 genes
– M. genitalium: 260 genes

Bacteria vs. Viruses
• The smallest cellular genome belongs to an archaeon, N. equitans (490 kb).
• HIV: 9200 nt
• SARS: 29 797 nt
• Lambda: 48 502 nt
• Acanthamoeba polyphaga mimivirus: infects amoebae
– dsDNA, 1 181 404 bp linear chromosome with 1262 ORFs

Mimivirus Genome
• 28% GC
• 90% coding capacity
• Codon usage is biased toward codons lacking G or C; the codons least common in the amoeba are the ones it uses least.
• It has proteins used for translation, post-translational modification, and DNA repair, which sounds more like a eukaryote.
– Encodes topoisomerases
– Has a self-splicing intron

Is Mimivirus Alive?
• Mimivirus is most closely associated with Eukaryota.
• Infectious after 1 year of incubation at 4 °C.
• Survived 48 hours of desiccation, and 1% survived 55 °C.
• Mimivirus can participate in all major steps of translation.
– A life form?
– A highly modified virus?

Malaria
• 3 billion people in tropical and subtropical climates are affected.
• Malaria is caused by eukaryotic parasites of the genus Plasmodium.
• 2.7 million people die each year.

Plasmodium
• Plasmodium falciparum is the most lethal form and is transmitted to humans by the Anopheles mosquito.
– An infected mosquito bites; the parasite leaves the salivary glands, moves to the liver, and infects hepatocytes. Parasites mature in hepatocytes and hatch out into RBCs.
– A new parasite emerges from an RBC by bursting it, releasing progeny and metabolic waste, which causes fever followed by chills.
– A few cells differentiate into gametes that move through the blood and can be ingested by a new mosquito, where the gametes form zygotes, undergo meiosis, and migrate to the salivary glands.

Infection of RBCs
• RBC: 6 microns; Plasmodium: 1.2 microns.
• Plasmodium evades the immune system by sticking to RBCs and entering them.
• The apicoplast is an organelle derived from a remnant internalized alga; it retains a small genome needed for Plasmodium survival.

Plasmodium Genome
• Three genomes
– Nuclear: chromosomes were separated by pulsed-field gel electrophoresis before random fragmentation and cloning; 22 853 764 bp with 5268 ORFs; 19.4% GC; 52.6% coding capacity; average gene length 2283 bp.
– Mitochondrial: 5967 bp, encodes 3 proteins
– Apicoplast: 29 422 bp, encodes 30 proteins

Plasmodium is a eukaryote
• 54% of its genes contain one or more introns, which average 13.5% GC (exons have a higher GC%).
• 60% of ORFs have no known function.

rRNA genes
• In many species rRNA genes appear in linear clusters.
• In Plasmodium, the distribution of rRNA genes varies and their expression is host-specific: some are expressed in humans, and the other set is active in the mosquito.

Centromeres and telomeres
• Centromeres are AT-rich (97%) and contain short tandem repeats.
• Telomeres have repeated sequences that vary in length; some genes located near telomeres have been duplicated many times, so these genes have paralogs.
– The highly variable (polymorphic) gene families var, rif and stevor may add variation to the extracellular surface of Plasmodium.

Hydropathy plot
http://expasy.org/cgi-bin/protscale.pl

Plasmodium
• 31% of the encoded polypeptides are predicted to be integral membrane proteins.
– 1% cell-to-cell adhesion
– 4% evasion of the immune system

Apicoplast
• Derived from a chloroplast.
• Synthesizes fatty acids, isoprenoids, and heme groups.
• 10% of all proteins support apicoplast DNA replication and repair, transcription, translation, post-translational glycosylation, etc.
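
The hydropathy plot mentioned above, used to predict integral membrane proteins, can be sketched as a sliding-window average of per-residue hydrophobicity. This is a minimal illustration, not the ExPASy ProtScale implementation: it assumes the Kyte-Doolittle scale, and the function name `hydropathy_profile` and the default window of 9 residues are choices made here for the example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Return the mean hydropathy of every window of `window` residues.

    `protein` is a one-letter amino acid string. The result has
    len(protein) - window + 1 values, one per window start position.
    """
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]  # KeyError on unknown residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical peptide: a hydrophobic stretch flanked by charged residues.
    peptide = "KKRDE" + "LLIVALLIVAVLLIA" + "DERKK"
    for position, score in enumerate(hydropathy_profile(peptide), start=1):
        print(position, round(score, 2))
```

Sustained stretches of high positive values in the profile suggest hydrophobic segments that could span a membrane; the threshold and window length used to call such segments vary between tools and are left out of this sketch.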

Food
• Plasmodium feeds on hemoglobin, which it digests in a food vacuole.
• It has no genes for amino acid synthesis and stores neither trehalose (the storage sugar in yeast) nor glycogen: it 'lives in the moment'.

Is there a model eukaryote genome?
• Yeast

Yeast Genome
• Published in October 1996
• 12 068 kb genome of 16 chromosomes
• 6272 ORFs
– 38.3% GC with a coding capacity of 70.3%
– GC content in eukaryotes is generally higher in the coding portions.
– Coding capacity is much lower than in bacteria.
• Yeast has a gene every 2 kb.
• Worm has a gene every 6 kb.
• Humans have a gene every 30 kb.

Genome structure
• S. cerevisiae experienced genome duplication events.
• Chromosomes V and X, IV and II, and III and XIV have paralogous regions.
– The duplicated region on chr III contains four genes, one of which is citrate synthase (cit2).
• Cit2 (chr III) is targeted to the peroxisome and Cit1 (chr XIV) to the mitochondrion.
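
The gene-spacing and coding-capacity figures quoted for the yeast and Plasmodium genomes in this section can be checked with simple arithmetic. The sketch below is only illustrative: the helper names are hypothetical, and it assumes that "coding capacity" means the fraction of the genome covered by ORFs, so that mean gene length is roughly genome size times coding capacity divided by the number of ORFs.

```python
def kb_per_gene(genome_bp, n_genes):
    """Average spacing between gene starts, in kilobases."""
    return genome_bp / n_genes / 1000


def mean_gene_length_bp(genome_bp, n_genes, coding_fraction):
    """Mean gene length implied by the coding capacity.

    Assumes coding_fraction is the share of the genome covered by genes.
    """
    return genome_bp * coding_fraction / n_genes


if __name__ == "__main__":
    # Figures taken from the slides: (genome size in bp, ORFs, coding capacity).
    genomes = {
        "S. cerevisiae": (12_068_000, 6272, 0.703),
        "P. falciparum (nuclear)": (22_853_764, 5268, 0.526),
    }
    for name, (size_bp, orfs, coding) in genomes.items():
        spacing = kb_per_gene(size_bp, orfs)
        length = mean_gene_length_bp(size_bp, orfs, coding)
        print(f"{name}: one gene every {spacing:.2f} kb, "
              f"implied mean gene length {length:.0f} bp")
```

For yeast this gives a little under 2 kb per gene, in line with the "gene every 2 kb" figure, and for the Plasmodium nuclear genome the implied mean gene length comes out close to the quoted 2283 bp.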