HPI meeting: The chimp and us, 25 April 2006

ID _PANTR
ID _PANPA

Complete chimp (PANTR) genome publication: Nature, Sept 2005
- Genome derived from one individual, ‘Clint’ (male from West Africa)
- Inter- vs intra-species (polymorphism) differences!
- Individual human genome variation: 1 bp/1000
- Individual chimp genome variation: 1 bp/250 (estimate, Varki (2000))

Cheeta has been recognized by the Guinness Book of World Records as the world's oldest chimp. Chimps rarely live past the age of 40 in the wild, but can reach 60 in captivity.

- The chimpanzee genome was sequenced to approximately four-fold coverage (error rate < 10^-4)
- WGS sequencing approach (-> problem for the assembly of regions with segmental duplications): ~22.5 million sequence reads to assemble; 2 assembly approaches (PCAP* and ARACHNE)
- In one* of the 2 approaches, contigs were assembled using the human genome as a guide, so they are "humanized" in their construction: some sequences, such as insertions, deletions and gene duplications, may not be accurately represented by the current chimpanzee assembly.

Chromosomal fusion in the human lineage?
- NCBI has adopted the new chimpanzee chromosome naming system proposed by McConkey, 2004
- The UCSC Genome Browser currently uses the original chimpanzee chromosome naming system.

Humanness
- Bipedalism
- Large cranial capacity (brain size)
- Advanced brain development (language capability)
- A long generation time
- and some other ‘biomedical’ differences...
  Chimps express the apoE4 allele
  Chimps: no acne; rhinitis but no asthma; no rheumatoid arthritis
  Olson et al., (2002)

- The last common ancestor of humans and chimps is believed to have walked on 4 legs.
- The oldest fossils that resemble bipedal humans are 6 to 7 million years old.
- DNA sequence analyses suggest the 2 lineages separated about 5.4 million years ago.
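The four-fold coverage and ~22.5 million reads quoted for the chimpanzee assembly can be related with the usual back-of-envelope formula, coverage = reads × read length / genome size. The sketch below is only an illustration: the genome size is a round placeholder that does not come from the slides, the function names are hypothetical, and the "uncovered fraction" uses the idealized Lander-Waterman (Poisson) assumption that reads fall uniformly at random, which real shotgun data only approximate.

```python
from math import exp


def implied_read_length(coverage, genome_size_bp, n_reads):
    """Mean read length L implied by coverage c = n_reads * L / genome_size."""
    return coverage * genome_size_bp / n_reads


def uncovered_fraction(coverage):
    """Expected fraction of bases hit by no read, under a Poisson model."""
    return exp(-coverage)


COVERAGE = 4.0          # "approximately four-fold", from the slide
N_READS = 22.5e6        # "~22.5 million sequence reads", from the slide
GENOME_SIZE_BP = 3.0e9  # ASSUMPTION: round placeholder, not a slide value

print(round(implied_read_length(COVERAGE, GENOME_SIZE_BP, N_READS)), "bp per read")
print(f"{uncovered_fraction(COVERAGE):.1%} of bases expected to be uncovered")
```

Under these assumptions, four-fold coverage leaves roughly 2% of bases unsampled, which is one reason a low-coverage assembly guided by the human sequence may misrepresent insertions, deletions and duplications.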
Short time since the human-chimp split: it is likely that a few mutations of large effect are responsible for part of the differences.

Comparative genomic analysis
- Human vs mouse, chick...: focus on similarities
- Human vs chimp: focus on differences
Hypothesis to account for the evolution of humanness traits

Quantifying the sequence divergence
- Single-nucleotide substitutions: 1.23% (1.78% for chromosome Y; 0.8% in protein-coding regions)
- Indels: ~1.5%
- Transposable elements: 3%
- Recent duplication of DNA segments: 2.7%
- ~35 million nucleotide differences
- ~5 million indels
- Many chromosomal rearrangements
- Human: 3.4 × 10^9 bp; Chimp: 3.6 × 10^9 bp

~35 million nucleotide differences
‘Since we apparently diverged from a common ancestor 6 million years ago, that is roughly 6 mutations per year that get fixed within the genome (or 3 per year if you divide them equally amongst the 2 branching species). Given a conservative estimate of average generational time of 10 years, this means that 30 new mutations had to be fixed within the population every generation. The current human mutation rate is around 3 or 4 mutations per organism.’
http://www.uncommondescent.com/index.php/archives/875

At the genome level
1) Structural variations

Chromosomal fusion in the human lineage?

A genome-wide survey of structural variation between human and chimpanzee (Newman et al., (2005))
- Approach: mapping chimp fosmids against the human reference sequence and identifying discordant regions by size and orientation
- Limitations: the human genome is not complete; the chimp genome = 1 individual (inter-/intraspecific differences!)
- Identification of 651 regions of putative structural variation between the human genome assembly and the single chimp individual (293 chimp deletions, 184 chimp insertions and 174 inversions/duplicative transpositions).
- Chromosome Y is the most rearranged chromosome between human and chimp (repetitive regions!)
- They identified 245 (RefSeq) genes that may be affected by the structural differences between chimp and human (drug detoxification, receptors, reproduction) (Newman et al., (2005))

At the genome level
1) Structural variations
2) Segmental duplication

Segmental duplication (impact: 2.7%)
Longer than 20 kilobases (-> 300 kb), greater than 94% sequence identity
- 33% of human duplicated segments are human specific
- 17% of chimp duplications are chimp specific
Half of the genes in the human-specific duplicated regions exhibit significant differences in gene expression relative to chimp and are most often upregulated.
Cheng et al., Nature (2005)

About 300 regions were identified where the human genome showed a significant increase in copy number when compared to chimp.
‘Only’ 92 regions where the chimp genome showed an increase in copy number compared to human (but with a higher rate of duplication).
Cheng et al., Nature (2005)

Example: 4 human regions represented ~400 x in the chimp genome (99.2% identity)
Cheng et al., Nature (2005)

At the genome level
1) Structural variations
2) Segmental duplication
3) Interspersed/transposable repeats

- The human genome is composed of ~45% interspersed elements, including:
  Long interspersed elements (LINEs); these encode a reverse transcriptase
  Short interspersed elements (SINEs); these include Alu repeats
- The human genome contains about 1,000,000 Alu elements.
- Found only in primates.

Interspersed/transposable element insertions (impact: 3%)
- Endogenous mutagens which can alter genes, promote genomic rearrangements...
- May help to drive the speciation of organisms
- Particular interest in recently mobilized transposons
- The transposons that inserted into the human or chimp genome during the past 6 million years would be expected to be present in only one of the 2 genomes.
~11,000 ‘recent’ transposon copies are differentially present in human/chimp: 73% found in human and 27% found in chimp

Interspersed/transposable element insertions
Endogenous retrovirus
Mills et al., Am. J. Hum. Genet., 78:671-679, 2006

Interspersed/transposable element insertions
- Alu, L1 and SVA insertions accounted for > 95% of the insertions in both species
  SVA: composite element (1.5-2.5 kb) (2 Alu, a tandem repeat and a region derived from HERV-K)
- Human and chimp have amplified different subfamilies of these elements. Humans have supported higher levels of transposition than chimps during the past several million years (but... this is not the case for the baboon, which shows an activity 1.6-fold higher than human -> general decline in Alu activity in chimp)

BLAT: human DNA vs chimp DNA (AJ271736, Xq pseudoautosomal)

Interspersed/transposable element insertions
- 34% of the insertions were located within known genes during the evolution of human and chimp

Interspersed/transposable element insertions - conclusions
- The original set of transposons in the common ancestor of human and chimp behaved differently during the subsequent evolution of the 2 organisms
- Humans received at least 4,800 additional transposon insertions compared to chimps -> the impact of transposon mutagenesis is likely to have been greatest in human during the past several million years.
- Human and chimp have amplified different subfamilies of these elements.
- Factors such as differences in population size may also have influenced the pattern of transposon insertion.
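The transposon figures above can be cross-checked with simple arithmetic. The sketch below uses only the rounded slide values (~11,000 copies, 73%/27% split, 34% within known genes); the function name is hypothetical and the results are approximations, not the exact counts reported by Mills et al.

```python
def split_insertions(total, human_fraction):
    """Split a total insertion count into human, chimp and human excess."""
    human = round(total * human_fraction)
    chimp = total - human
    return human, chimp, human - chimp


TOTAL_RECENT = 11_000      # "~11,000 'recent' transposon copies" (rounded)
HUMAN_FRACTION = 0.73      # "73% found in human"
IN_GENES_FRACTION = 0.34   # "34% of the insertions were located within known genes"

human, chimp, excess = split_insertions(TOTAL_RECENT, HUMAN_FRACTION)
in_genes = round(TOTAL_RECENT * IN_GENES_FRACTION)

print(f"human: {human}, chimp: {chimp}, human excess: {excess}")
print(f"insertions within known genes: {in_genes}")
```

With the rounded inputs the human excess comes out at about 5,000 insertions, the same order as the "at least 4,800" stated in the conclusions; the small gap is what one would expect from rounding the total and the percentages.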
At the sequence level (coding sequence level)
Nucleotide divergence: 1.23%
14-22% of these differences are due to polymorphism -> fixed divergence rate = ~1.06%
Chromosome X: ~0.94%
Chromosome Y: ~1.9%
Higher mutation rate in the male compared with the female germ line (higher number of cell divisions (5- to 6-fold))

At the gene level: 13,454 pairs of orthologous genes
(507 Swiss-Prot, 1134 TrEMBL: 1641) (NCBI: 3111)
- 29% are 100% identical
- 5% with in-frame indels (mainly in repetitive regions)

A classical measure of the overall evolutionary constraint on a gene
KA: non-synonymous substitution rate in coding sequence
KB or KS: synonymous substitution rate in coding sequence
KI: substitution rate in non-coding sequence
KA/KB << 1: typical of most proteins, where change is detrimental (negative selection)
KA/KB > 1: for the rare proteins that are under positive selection

About 500 genes with KA/KB > 1
Most of the genes with KA/KB > 1 are not involved in processes related to supposed humanness.
Genes with the highest KA/KB ratios are mostly related to host-pathogen interaction, immunity and reproduction (pattern also found in other mammals; cf. Valeria’s work on human/mouse orthologs).
In fact, genes related to brain function and neuronal activities show lower-than-average KA/KB ratios
- Neural genes, as a group, have a much lower average KA/KB ratio than genes expressed outside of the brain.
Hypothesis: only a small subset of genes may be the target of positive selection: not visible in this type of study. (Hill, Walsh (2005))

Example 1: FOXP2
- gene relevant for the human ability to develop language
- among the 5% most conserved proteins
CC   -!- DISEASE: Defects in FOXP2 are the cause of speech-language disorder 1 (SPCH1) [MIM:602081]; also known as autosomal dominant speech and language disorder with orofacial dyspraxia.
Affected individuals have a severe impairment in the selection and sequencing of fine orofacial movements, which are necessary for articulation. They also show deficits in several facets of language processing (such as the ability to break up words into their constituent phonemes) and grammatical skills.
- Extremely conserved among mammals
- Acquired 2 aa changes in the human lineage (T303N and N325S), including one potential functional phosphorylation site (N325S)
- Estimate: fixation of these mutations occurred during the last 200,000 years of human history, concomitant with or subsequent to the emergence of anatomically modern humans.
Enard et al., Nature (2002)

BUT:
- no aa substitutions are shared between song-learning birds, vocal-learning whales, dolphins and bats, and humans...
AND...
- during times of song plasticity, FoxP2 is upregulated in a striatal region essential for song learning.
- selection acted on large non-coding regulatory regions of FoxP2???
- duplication of the chromosomal region (27 genes including FoxP2) may be another cause of speech and language disturbance???

Less-is-more hypothesis
Loss-of-function changes (lack of body hair, preservation of juvenile traits, expansion of the cranium) could be caused by non-synonymous substitutions, indels, loss of coding regions and deletions of entire genes.
-> 53 human genes with disruptive indels in the coding regions (compared to chimp)

Well-documented examples of human-specific pseudogenization
- MYH16, CMAH, CASP12, ELN, T2R62P (bitter taste receptor), MBL1
- Microcephalin (MCPH1)
Challenge: dating the event!

MYH16
Myosin gene mutation (MYH16) correlates with anatomical changes in the human lineage.
Inactivated by a frameshifting mutation after the lineages leading to humans and chimpanzees diverged (~2.4 Myr).
The gene is transcribed (-> the coding sequence deletion was not preceded by a mutation in a transcriptional control domain).
Expressed only in masticatory muscles in other mammals.
Loss of this protein isoform is associated with marked size reductions in individual muscle fibres and entire masticatory muscles.
Nature 428, 415-418 (2004)

Phylogenetic reconstruction for all human sarcomeric myosin genes (heavy chain), showing early divergence of MYH16 from the others.
Nature 428, 415-418 (2004)

Aligned DNA sequences for MYH16 exon 18 representing seven non-human primate species and six geographically dispersed human populations, revealing the effect of the frameshift on reading frame and deduced amino acid sequence. Note stop codon at position 72-74.
Nature 428, 415-418 (2004)

The findings on the age of the inactivating mutation in the MYH16 gene raise the intriguing possibility that the decrement in masticatory muscle size removed an evolutionary constraint on encephalization, as suggested by the anatomy of the muscle attachments relative to the sutures -> marked increase in cranial capacity.
Nature 428, 415-418 (2004)

But: human encephalization -> obstetric constraints associated with pelvic dimensions for bipedality
Importance of the genes that control the development of brain size in mammals: ASPM, MCPH1, CASP3...
- Have undergone accelerated rates of protein evolution
- Strong positive selection at several loci
McCollum et al., (2006), J. of Human Evolution, 50, 232-236

MCPH locus (microcephalin): MCPH1 -> MCPH6
MCPH5/ASPM locus (abnormal spindle microcephaly)
- High KA/KB ratio
- Patients with loss-of-function in microcephalin have cranial capacities about 4 SD below the mean at birth and ~1/3 of the size as adults.
- May control the proliferation and/or differentiation of neuroblasts during neurogenesis.
- Continues its trend of adaptive evolution
- Ex: ASPM acquires an advantageous aa change every 350,000 years.
Evans et al., Nature 2005
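The KA/KB ratio quoted for ASPM and used throughout the gene-level slides can be illustrated with a small calculation. The sketch below is a deliberately restricted variant of the Nei-Gojobori counting approach, not the method used in the cited studies: it assumes two codon-aligned sequences without gaps, rejects codons that differ at more than one position, counts changes to stop codons as non-synonymous, and applies a Jukes-Cantor correction. The function names and the toy sequences are made up for illustration. KS in the code corresponds to KB on the slides.

```python
from itertools import product
from math import log

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_sites(codon):
    """Number of synonymous sites (0 to 3) in one codon."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    total += 1.0 / 3.0
    return total


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    return -0.75 * log(1.0 - 4.0 * p / 3.0)


def ka_ks(seq_a, seq_b):
    """Return site counts, difference counts, KA and KS for two sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be codon-aligned and of equal length")
    syn_sites = syn_diffs = nonsyn_diffs = 0.0
    for i in range(0, len(seq_a), 3):
        c1, c2 = seq_a[i:i + 3], seq_b[i:i + 3]
        # Average the synonymous-site count of the two codons.
        syn_sites += (synonymous_sites(c1) + synonymous_sites(c2)) / 2.0
        n_diff = sum(x != y for x, y in zip(c1, c2))
        if n_diff > 1:
            raise ValueError("codons differing at >1 position are not handled")
        if n_diff == 1:
            if CODON_TABLE[c1] == CODON_TABLE[c2]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    nonsyn_sites = len(seq_a) - syn_sites
    return {
        "S": syn_sites, "N": nonsyn_sites,
        "Sd": syn_diffs, "Nd": nonsyn_diffs,
        "KA": jukes_cantor(nonsyn_diffs / nonsyn_sites),
        "KS": jukes_cantor(syn_diffs / syn_sites),
    }


# Toy example (invented sequences): one synonymous change (CTT -> CTC, Leu)
# and one non-synonymous change (AAT -> AGT, Asn -> Ser).
result = ka_ks("CTTAATGGT", "CTCAGTGGT")
print("KA/KS =", result["KA"] / result["KS"])
```

On the toy pair the ratio is below 1, the pattern the slides describe as typical of most proteins. A sequence made only of codons without synonymous sites (for example ATG or TGG), or a pair with no synonymous differences, would need extra handling that this sketch omits.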
Pseudogenization: CMAH
Alu-mediated sequence replacement -> inactivation of the enzyme CMP-N-acetylneuraminic acid hydroxylase in human
This mutation occurred after our last common ancestor with bonobos and chimpanzees, and before the origin of present-day humans (~2.8 mya)
-> susceptibility or resistance to certain microbial pathogens (host receptors).
Chou et al., 2005

Pseudogenization: CASP12
Functional gene in all mammals except human.
Mediator of apoptosis in response to perturbed calcium homeostasis
-> loss of this gene in mice increases resistance to amyloid-induced neuronal apoptosis (-> Alzheimer in human?)
-> loss of this gene also seems to confer resistance to severe sepsis

Last but not least...
Nature (2005)

Comparative analysis of cancer genes in the human and chimpanzee genomes
- The incidence of cancer in non-human primates is very low.
- All examined human cancer genes (n=333) are present in chimpanzee, contain intact open reading frames and show a high degree of conservation between both species (99.38%)
  BLAT of P53_HUMAN vs chimp. Ex: Pro-72 is polymorphic only in human
- Sequencing of the BRCA1 gene has shown an 8 kb deletion in the chimpanzee sequence that prematurely truncates the co-regulated NBR2 gene.
Puente et al., (2006)

Transcriptome evolution (and epigenetic events)
Changes in gene usage may be a primary contributor to the differences in chimp and human brains.
10% of all genes expressed in the brain differ in their expression levels between humans and chimps... but no causative connection found...
Several studies, but none with convincing results
Heissig et al., 2005
Khaitovich et al., 2005

Swiss-Prot annotation
Existence of orthologs
DEF7_PANTR
MISCELLANEOUS: The human orthologous protein seems not to exist, its coding region does not have a start codon.
Pseudogenization (when documented)
T2R64_PANPA
MISCELLANEOUS: The human and chimpanzee orthologous proteins do not exist, their genes are pseudogenes.

That’s all folks!