Human Genetic Variation
Javad Tavakkoly Bazzaz MD, PhD
[email protected]

Motivation to study human genetic variation
- Intellectual interest: the evolution of our species and its history.
- Medical importance: there is a genetic component to many diseases, especially the more common complex disorders such as diabetes, cancer, cardiovascular disease, and neurodegenerative disease.
- Pharmaceutical: genetics will determine an individual's response to a drug.

Convergent Evolution
Species from different evolutionary branches may come to resemble one another if they live in very similar environments.
Examples:
1. Ostrich (Africa) and Emu (Australia)
2. Sidewinder (Mojave Desert) and Horned Viper (Middle East Desert)

Why is phenotypic variation not as important?
Phenotypic variation is the result of:
- Genotypic variation
- Environmental variation
- Other effects
  - Such as maternal or paternal effects
Not completely heritable!

Potential Topics:
I. Evolution of gene expression
II. Evolution of the transcriptional regulatory code
III. Evolution of regulatory networks
IV. Evolution of duplicate genes
V. Emergence of new genes
VI. Protein-protein interactions
VII. Genetic variation within and between species
VIII. Effects of recombination and gene conversion on GC content and substitution rate

Evolution of Gene Expression
Examine how gene expression has diverged between species (or between duplicate genes). A study may examine only changes in expression but not changes in regulatory sequences. There have been many studies on this topic.

Emergence of new genes

The Paradox of Variation
Evolution requires variation, but natural selection eliminates variation.

Hardy-Weinberg Principle
The concept that the shuffling of genes that occurs during sexual reproduction cannot, by itself, change the overall genetic makeup of a population.

Hardy-Weinberg Principle
This principle will be maintained in nature only if all five of the following conditions are met:
1. Very large population
2. Isolation from other populations
3. No net mutations
4. Random mating
5. No natural selection
Remember: if these conditions are met, the population is at equilibrium. This means "No Change" or "No Evolution".

HWE: 5 violations
So, five ways in which populations CAN evolve!
- Mutation
- Nonrandom mating
- Migration (gene flow)
- Small population sizes (genetic drift)
- Natural selection

Hardy-Weinberg Principle
p^2 + 2pq + q^2 = 1
where p and q are the frequencies of the two alleles at a locus (p + q = 1).
Describes a population which is:
- large
- panmictic (no group structures or mating restrictions)
- not undergoing natural selection
- not subject to gene flow from other populations

Random mating
Under random mating, the chance of any individual in a population mating is exactly the same as for any other individual in the population.
- Generally, hard to find in nature
- But can be approximated in many large populations over short periods of time

Mechanisms that change gene or genotype frequencies
- natural selection
- genetic drift
- bottlenecks and founder effects
- inbreeding
- assortative mating

Non-random mating
Violations of random mating lead to changes in genotypic frequencies, not allele frequencies.
But they can lead to changes in effective population size…

Non-random mating
Reduction in the effective population size leaves a door open for the effects of… Genetic Drift!
- Drift is the effect of chance occurrences on the genotype frequencies of a population
- Greater effect in small populations than in large populations

Genetic Drift
Change in the gene pool of a small population due to chance.

Genetic Drift
Founder effect:
- What happens when a population is started with only a few individuals from a larger population
- The chances of being represented in the new population are related to the frequency in the original population
- BUT chance plays a role, as DRIFT
Bottlenecks:
- Instead of founding a new population, the whole population is drastically reduced in size (e.g. by a tornado or hurricane)
- Chances of being represented in the population are based on frequencies before the catastrophe

Genetic bottlenecks and founder effects: catastrophe or colonization event

Bottleneck Effect
Genetic drift (reduction of alleles in a population) resulting from a disaster that drastically reduces population size.
Examples:
1. Earthquakes
2. Volcanoes

Founder Effect
Genetic drift resulting from the colonization of a new location by a small number of individuals. Results in random change of the gene pool.
Example: islands (first Darwin finch)

Migration = Gene Flow
Definition: the migration of individuals (and their alleles) from one population to another.
Gene flow is the flow of alleles between populations.

Mechanisms that maintain genetic diversity
- Heterozygote advantage
- Frequency-dependent selection
- Gene flow from other populations
- Genotype-environment interactions

Heterozygote advantage
Cystic fibrosis
- Nonfunctional chloride channels
  - Normal chloride channels funnel water out of the cell
- Thick, dry mucus in lungs and intestines
- Mutation is 52,000 years old
Possible heterozygote advantage
- Heterozygotes have half as many working chloride channels
- They lose half as much water during a bout with cholera or other infections

Sources of genetic variation
Reassortment of genetic material (during meiosis)
- Chromosomal reassortment: a human has 23 pairs of chromosomes; one of each pair is inherited from the father and the other from the mother.
- Mutation: errors in DNA copying. This may result in SNPs, or larger portions of DNA may be duplicated or copied incorrectly.
- Genetic recombination: shuffling of segments between partner chromosomes of a pair.
Figure source: Molecular Biology of the Cell, Alberts et al., Garland Publishing 2002 (Fig. 20-8)
Other types of genetic variation…

Single Nucleotide Polymorphisms (SNPs)
- Major source of genetic variation.
- An estimated 7 million SNPs occur at frequencies of at least 5% in the human population; approximately 11 million occur at frequencies of at least 1%.
- Can we determine the associations between these variants and diseases?

Genetic variation
The human genome has approximately 10 million polymorphisms, i.e. genetic variants that occur at a frequency of about 1% or more in the population. Many of these polymorphisms are SNPs, single nucleotide polymorphisms. These polymorphisms contribute to our individuality, and also influence our susceptibility to various diseases.

Mutation
- Mutation is the source of genetic variation!
- No other source for entirely new alleles

Rates of mutation
Vary widely across:
- Species
- Genes
- Loci (plural of locus)
- Environments

Rates of mutation
Measured by phenotypic effects in humans:
- Rate of 10^-6 to 10^-5 per gamete per generation
Total number of genes?
- Estimates range from about 30,000 to over 100,000!
- Nearly everyone is a mutant!

Rates of mutation
Mutation rate of the HIV-AIDS virus:
- One error every 10^4 to 10^5 base pairs
Size of the HIV-AIDS genome:
- About 10^4 to 10^5 base pairs
So, about one mutation per replication!

Rates of mutation
- Rates of mutation are generally high
- This leads to a high load of deleterious (harmful) mutations
- Sex may be a way to eliminate or reduce the load of deleterious mutations!

Types of mutations
Point mutations
- Base-pair substitutions
- Caused by chance errors during synthesis or repair of DNA
- Lead to new alleles (may or may not change phenotypes)

Types of mutations
Gene duplication
- Result of unequal crossing over during meiosis
- Leads to redundant genes
  - Which may mutate freely
  - And may thus gain new functions

Types of mutations
Chromosome duplication
- Caused by errors in meiosis (mitosis in plants)
- Common in plants
  - Leads to polyploidy
  - Can lead to new species of plants
    - Due to inability to interbreed

Effects of mutations
Relatively speaking…

How can mutations lead to big changes?
- Accumulation of many small mutations, each with a small effect
- Accumulation of several mutations, each with a large effect
- One large mutation with a large effect
- Mutation in a regulatory sequence (affects regulation of development)

Effects of mutations, relatively speaking:
- Most mutations have little effect
- Many are actually harmful
- Few are beneficial

What Does It Mean?
Any permanent heritable change in the genome is called a mutation.
Importance:
- Source of genetic polymorphism
- Driving evolution
- Cause of many human disorders
Sources of mutations:
- Endogenous mutations: the greatest source
- Exogenous causes: environmental mutagens
Classification:
- Many different bases for classification

Causes of Mutations
Endogenous causes: most common
- Depurination: about 5000 adenines or guanines are lost per day in each nucleated human cell
- Deamination: the net effect is a C → T transition
  - Around 100 cytosines deaminate per day per human cell to produce uracil
- Reactive oxygen: attacks purine and pyrimidine rings
- DNA replication errors: nucleotides mismatched due to incorrect proofreading
- Mistakes in recombination & …..
Exogenous causes:
- Chemical mutagens
- Physical mutagens

Definitions and Terminology
Evolution:
- The change in genotype frequencies over time
Microevolution
- Changes within populations or species in gene frequencies and distributions of traits
Macroevolution
- Higher-level changes, e.g. generation of new species or higher-level classification

Microevolution
A change in a population's gene pool over a succession of generations. Evolutionary changes in species over relatively brief periods of geological time.

Macroevolution
Any evolutionary change above the level of species; the origin of taxonomic groups higher than the species level.
Macroevolution's subject matter includes the origins and fates of major novelties such as tetrapod limbs and insect wings, and the impact of continental drift and other physical processes on the evolutionary process.
With its unique time perspective, paleontology has a central role to play in this area: the fossil record provides a direct, empirical window onto large-scale evolutionary patterns, and thus is invaluable both as a document of macroevolutionary phenomena and as a natural laboratory for the framing and testing of macroevolutionary hypotheses.

[Figure: Depurination and deamination (molecular mechanisms in genetic changes)]

[Figure: Exogenous mutagens (exogenous causes of DNA damage)]

Classification of Mutations
Mutations are classified on different bases:
- On the basis of affected cells
  - Somatic mutation
  - Germline mutation
- By the level of change in the genome
  - Genome mutation
  - Chromosome mutation
  - Gene mutation

Classification of Mutations
An important classification is on the basis of the effect of the mutation on gene function:
- Gain of function mutations
  - Mutational homogeneity
- Loss of function mutations
  - Genetic heterogeneity

Gene Mutations
Gene mutations can be grouped into different classes according to the effect on the DNA sequence:
- Base substitutions (mainly single base)
  - Silent
  - Nonsense
  - Missense
- Deletions
- Insertions
Every one of the above classes may be categorized as:
- Simple mutations: involve just a single DNA sequence
- Sequence exchange mutations: involve exchanges between two allelic or nonallelic sequences

Simple Mutations
Main cause of simple mutations: errors in DNA replication and repair
- How big is this threat?
- Frequency of uncorrected replication errors: 10^-9 to 10^-11 per incorporated nucleotide
- Human genome: 6 x 10^9 nucleotides
- Adult human: 10^14 cells, around 10^16 to 10^17 cell divisions in a lifetime
- So: 50 billion mutations during this class
- Average human gene: 1.65 kb, so:
- Average mutation frequency per gene per cell division: 1.65 x 10^-6 to 1.65 x 10^-8
- During a human lifetime (10^16 mitoses) each gene collects about 10^8 to 10^10 mutations

Facts About Simple Mutations
- Frequency of base substitutions is nonrandom according to substitution class
- Frequency of mutations in noncoding DNA is higher than in coding DNA
- The location of base substitutions in coding DNA is nonrandom
- Amino acids have different degrees of mutability
- Substitution rates vary between different genes and between different gene components
- Substitution rates can vary in different chromosomal regions and in different lineages

Facts About Simple Mutations
Frequency of base substitutions is nonrandom according to substitution class
- Comparing transition vs transversion rates
- Statistically, transversions are expected to be twice as frequent as transitions
- Comparison of orthologs (genes in different species that share a common origin) showed an unexpectedly high transition rate in mammalian genomes

Transition vs Transversion
Comparison of 337 pairs of human and rodent orthologs by Collins and Jukes, 1994 showed:
- 1.4/1 ratio of transitions to transversions for substitutions with no amino acid change
- 2/1 ratio for substitutions with an amino acid change
Why is that?
- High frequency of C → T transitions at CpG dinucleotides (CpG is a mutation hotspot, with a rate 8.5 times higher than the average dinucleotide mutation rate)
- Sequence-dependent proofreading activities of the DNA polymerase

Base Substitutions in Coding DNA: Degenerate Base Positions
The location of base substitutions in coding DNA is nonrandom.
Base positions in amino acid-specifying codons are grouped into three classes:
- Nondegenerate sites: 65%
- Fourfold degenerate sites: 16%
- Twofold degenerate sites: 19%
Amino acid mutability depends on the physiologic effect of the amino acid (mainly its side chain):
- Cysteine: less mutable
- Serine and threonine: higher mutability

[Figure: Relative amino acid mutability]

Substitution Rates in Different Genes
- Substitution rates vary considerably between different genes
- Variation is much greater at non-silent positions

Substitution Rate & Gene Components
Substitution rates vary considerably between different gene components:
- Overall genome-wide sequence identity between orthologs is estimated at around 69.1%
- Coding regions are the most conserved (85% sequence identity between orthologs)
- Introns: 68.6%
- 5' UTR: 75.9%
- 3' UTR: 74.7%
- Promoters (200 bp upstream): 73.9%
- Downstream 200 bp: 70.9%

Substitution Rate & Chromosomes
Substitution rates vary among different chromosomes:
- X chromosome: lowest substitution rate (number of germ cell divisions in males and females)
- Chromosome 19: highest (see below)
- The mitochondrial genome has a much higher substitution rate than the nuclear genome

Substitution Rate/Chromosomal Regions
- Substitution rates vary among different chromosomal regions
- There is a correlation between substitution rate, recombination rate, and SNP density

Substitution Rate/Different Lineage
Comparison between mouse and human:
- The substitution rate is much higher in mouse (2.2 x 10^-9 in human vs 4.5 x 10^-9 in mouse)
- Both species showed a net loss of nucleotides, which is at least twice as high in mouse

Genetic Mechanisms Causing Sequence Exchange Between Repeats

Slipped Strand Mispairing
- Slipped strand mispairing can cause deletions or insertions at microsatellite loci (short tandem repeats)
- It can occur during replication (replication slippage or polymerase slippage) or in nonreplicating DNA

Creation of Fusion (Hybrid) Genes
Homologous equal crossover between alleles on nonsister chromatids can generate novel fusion (hybrid) genes.

[Figure: Ins/Del in UEC and UESCE]

Unequal Crossover
- Homologous recombination: recombination (crossover) between identical or very similar DNA sequences; it usually involves breakage of nonsister chromatids and rejoining of the fragments
- Unequal crossover (UEC): nonallelic homologous recombination in which the crossover takes place between nonallelic sequences on nonsister chromatids
- Sister chromatid exchange: an analogous type of sequence exchange involving breakage of individual sister chromatids and rejoining of the fragments
- Unequal sister chromatid exchange (UESCE)? …..

Consequences of UEC and UESCE
- UEC and UESCE occur predominantly in regions with tandem repeats of moderate to large size and with high homology between the repeats
- UEC and UESCE can cause insertions and deletions
- UEC and UESCE can also occur by mispairing between repeats that are separated by a considerable amount of intervening sequence (short interspersed repeats, e.g. Alu repeats), causing tandem gene duplication
- UEC in a tandem repeat array can result in sequence homogenization

[Figure: Tandem gene duplication in UEC/UESCE]

[Figure: Sequence homogenization in UEC]

Nonreciprocal Sequence Exchange
- Nonreciprocal sequence exchange is the transfer of DNA sequence between a pair of nonallelic DNA sequences (interlocus gene conversion) or allelic sequences (interallelic gene conversion); it is a directional sequence exchange
- The donor strand remains unchanged
- The acceptor strand changes
- One possible mechanism is the formation of a heteroduplex between the donor strand and the acceptor strand, followed by conversion of the acceptor strand by the mismatch repair system

[Figure: Nonreciprocal sequence exchange, interallelic vs interlocus]

[Figure: Mechanism of NSE]

Splicing Mutations
These are mutations that change the splicing pattern of mRNA.
Mechanisms:
- Alteration of conserved splice signals
- Activation of cryptic splice sites
Consequences:
- Intron retention
- Exon skipping
- Extension or shortening of an exon

[Figure: Splice mutations]

The mtDNA, A Mutation Hotspot
- High rate of mitochondrial disorders (unexpected)
- There is a high rate of mutation in mtDNA
Reasons:
- 93% of mtDNA is coding DNA
- No protection by histones
- Lack of an adequate DNA-repair mechanism
- Many rounds of replication
- The bottleneck effect may also fix mutations in mtDNA
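
Worked check of two calculations from these notes
Two quantitative statements made earlier, the Hardy-Weinberg relation p^2 + 2pq + q^2 = 1 and the per-gene mutation estimate under Simple Mutations, can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function names are hypothetical, the allele frequency p = 0.7 is an assumed illustration value, and the mutation figures (1.65 kb gene, 10^-9 to 10^-11 errors per nucleotide, 10^16 cell divisions) are the ones quoted in the notes.

```python
def hardy_weinberg(p):
    """Return expected genotype frequencies (AA, Aa, aa) for allele frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency p must lie between 0 and 1")
    q = 1.0 - p  # two alleles only, so p + q = 1
    return p * p, 2.0 * p * q, q * q


def allele_frequency(freq_AA, freq_Aa, freq_aa):
    """Frequency of allele A among the gametes of a population with these genotypes."""
    total = freq_AA + freq_Aa + freq_aa
    # Homozygotes contribute only A; heterozygotes contribute A half of the time.
    return (freq_AA + 0.5 * freq_Aa) / total


def per_gene_mutation_frequency(gene_length_nt, error_rate_per_nt):
    """Expected number of mutations in one gene per cell division."""
    return gene_length_nt * error_rate_per_nt


if __name__ == "__main__":
    # Assumed illustration value (not from the notes): p = 0.7.
    genotypes = hardy_weinberg(0.7)
    print("genotype frequencies:", genotypes, "sum:", sum(genotypes))
    # Random mating alone leaves the allele frequency unchanged.
    print("allele frequency after random mating:", allele_frequency(*genotypes))

    # Figures quoted in the notes: 1650-nucleotide gene, 10^16 lifetime divisions.
    for error_rate in (1e-9, 1e-11):
        per_division = per_gene_mutation_frequency(1650, error_rate)
        print("error rate", error_rate,
              "per division:", per_division,
              "per lifetime:", per_division * 1e16)
```

The genotype frequencies sum to 1 for any valid p, and the allele frequency recovered from them equals the starting p, which is the "no change" equilibrium described under the Hardy-Weinberg Principle. For the mutation estimate, 1650 nucleotides times 10^-9 to 10^-11 errors per nucleotide gives 1.65 x 10^-6 to 1.65 x 10^-8 per gene per division, and multiplying by 10^16 divisions gives roughly 10^8 to 10^10 per gene, the range quoted in the notes.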