Prof. Kamakaka's Lecture 14 Notes
Studying differences/similarities in individuals

Methods used to study differences between individuals:
• RFLP
• SNP
• DNA repeats

Ind1 ATTGTATTGATTTATAGCGCGCGCGCTAGTTGACTGG
Ind2 ATTGTATTGATTTATAGCGCGCTAGTTGACTGG
Ind3 ATTCTATTGATTTATAGCGCGCGCGCTAGTTGACTGG

Genetic polymorphism
•Genetic Polymorphism: A difference in DNA sequence among individuals, groups, or populations.
Genetic mutations are a kind of genetic polymorphism.

Genetic variation:
• Single nucleotide polymorphism (point mutation)
• Repeat heterogeneity

Ind1 ATTGTATTGATTTATAGCGCGCGCGCTAGTTGACTGG
Ind2 ATTGTATTGATTTATAGCGCGCTAGTTGACTGG
Ind3 ATTCTATTGATTTATAGCGCGCGCGCTAGTTGACTGG

Repeats and DNA fingerprint
Variation between people: a small DNA change, such as a single nucleotide polymorphism (SNP) in a target site. RFLPs and point mutations are proof of variation at the DNA level.
Satellite sequences: a short sequence of DNA repeated many times.
[Figure: interspersed versus tandem repeats on Chr1 and Chr2]

Repeats
Satellite (large) (100->1000 bp long): centromere/telomere
Micro-satellite: extremely small repeats (2-5 bp), e.g. GCGCGCGC, AGCAGCAGCAGC
Mini-satellite: larger repeats (6-100 bp long)
Repeats can be dispersed or in tandem.
[Figure: EcoRI (E) sites flanking dispersed and tandem repeats on Chr1 and Chr2]

Mini Satellite Repeats and Blots
Mini-satellite sequences: a short sequence (20-100 bp long) of DNA repeated many times.
Take genomic DNA, digest with EcoRI, and probe a Southern blot with the repeat probe.
[Figure: EcoRI (E) map of tandem repeats on Chr1 and Chr2, with the corresponding DNA gel and blot next to a marker lane (M)]

Repeat expansion/contraction
Tandem repeats expand and contract during recombination.
Mistakes in pairing lead to changes in tandem repeat numbers.
These can be detected by Southern blotting because, as the number of repeats expands at a specific site, the restriction fragment at that site expands in size.
Alleles of a mini-satellite differ in the number of repeats, from one repeat to many repeats (varying in length from 0.5 to 20 kb).
[Figure: Chr1 of Individual 1 and Individual 2 with different repeat numbers between EcoRI (E) sites, and the corresponding blot]
There are on average between 2 and 10 alleles (repeat numbers) per mini-satellite locus.

[Figure: blot of Ind1, Ind2 and Ind3. Homozygous or heterozygous?]

Micro-satellite and PCR
Microsatellite repeat expansion and contraction is investigated using PCR and gels instead of gels and Southern blots.

AGCGTCAGCGCGCGCTTATTGA
TCGCAGTCGCGCGCGAATAACT
Primers: AGCGTCA and AATAACT
22 bp PCR fragment

AGCGTCAGCGCGCGCGCGCGCTTATTGA
TCGCAGTCGCGCGCGCGCGCGAATAACT
Primers: AGCGTCA and AATAACT
28 bp PCR fragment

[Figure: GC and TA repeat alleles at two loci in two individuals]

Genotype     Locus1            Locus2
Individual1  allele5, allele2  allele2, allele1
Individual2  allele4, allele3  allele3, allele2

Microsatellite DNA fingerprint
In this example, 2 totally different microsatellites (1) and (2), located on the short arm of chromosome 6, have been amplified by the polymerase chain reaction (PCR). The PCR products are labeled with a blue or green fluorescent marker and run on a gel, with each lane showing the genetic profile of a different individual. Each individual has a different genetic profile because each person has a different number of microsatellite repeats, and the number of repeats gives rise to bands of different sizes after PCR.

       Locus1            Locus2
Indi1  allele2, allele4  allele6, allele6
Indi2  allele2, allele6  allele1, allele4
Indi3  allele5, allele6  allele4, allele5

Crossover in mispaired duplicated segments
What happens if you get a crossover after mis-pairing in meiosis I?
[Figure: homologs A B C D E and a b c d e carrying a duplicated B C segment, mispaired, and the products of a crossover within the mispaired segment]

FBI and Microsatellite
The FBI uses a set of 13 different microsatellite markers in forensic analysis. 13 sets of specific PCR primers are used to determine the allele present in the test sample for each marker. The markers used, the number of alleles at each marker, and the probability of obtaining a random match for each marker are shown. How often would you expect an individual to be mis-identified if all 13 markers are analyzed?

Locus  No. of alleles  Probability of random match
A      11              0.112
B      19              0.036
C      7               0.081
D      7               0.195
E      10              0.062
F      10              0.075
G      10              0.158
H      11              0.065
I      10              0.067
J      8               0.085
K      8               0.089
L      15              0.028
M      20              0.039

P = 0.112 x 0.036 x 0.081 x 0.195 x ... = 1.7 x 10^-15

Variation between people: a small DNA change, such as a single nucleotide polymorphism (SNP) in a target site. RFLPs and SNPs are proof of variation at the DNA level.
Satellite sequences: a short sequence of DNA repeated many times.
Microsatellites are 2-5 bp repeats, repeated in tandem 15-100 times in a row.
Minisatellites are 6-100 bp repeats in tandem (0.5 to 20 kb long).

Class  Size       No. of loci  Method
Micro  ~200 bp    200,000      PCR
Mini   0.2-20 kb  30,000       Southern blot
SNP    1 bp       100 million  PCR/microarray

DNA Fingerprint Analysis
Mr. Chan's family consists of mom, dad and four kids. The parents have one daughter and one son together, another daughter is from the mother's previous marriage, and the other son is adopted. Here are the DNA analysis results:
[Figure: DNA fingerprints of the parents and the four children]
1. Which child is adopted? Child4
2. Which child is from the mother's previous marriage? Child2
3. Who are the own children of Mr and Mrs Chan?
Child1 and Child3

Individuals
Methods used to study differences between individuals:
• DNA repeats
• RFLP
• SNP

Very small deletion (of a restriction site)
[Figure: EcoRI (E) and H restriction maps of the wild-type and deletion chromosomes carrying genes R, C, X and A, and the EcoRI digests of each run beside a marker lane]

RFLP analysis
RFLP = restriction fragment length polymorphism.
Refers to variation (presence or absence) in restriction sites between individuals, because of mutations in restriction sites.
These are extremely useful and valuable for geneticists (and lawyers).
On average two individuals (humans) vary at 1 bp in every 300-1000 bp.
The human genome is 3 x 10^9 bp.
This means that they will differ in more than 3 million bp!
By chance these changes will CREATE or DESTROY the recognition sites for restriction enzymes.

RFLP
Let's generate an EcoRI map for the region in one individual:
GAATTC ---3 kb--- GAATTC ---4 kb--- GAATTC
The same region of a second individual may appear as:
GAATTC ---------7 kb (GAGTTC)--------- GAATTC
The nucleotides are not deleted, just changed.
Normal: GAATTC
Mutant: GAGTTC
[Figure: EcoRI digests of individuals 1 and 2 run beside a marker lane]

RFLP
The internal EcoRI site is missing in the second individual.
For X1 the sequence at this site is
GAATTC
CTTAAG
This is the sequence recognized by EcoRI.
The equivalent site in the X2 individual is different:
GAGTTC
CTCAAG
This sequence IS NOT recognized by EcoRI and is therefore not cut.
Now if we examine a large number of humans at this site we may find that 25% possess the EcoRI site and 75% lack this site. We can say that a restriction fragment length polymorphism exists in this region.
These polymorphisms usually do not have any phenotypic consequences: they are silent mutations that do not alter the protein sequence because of redundancy in codon usage, or because they lie in introns or non-genic regions.

RFLP
RFLPs are identified by Southern blots.
In the region of the human X chromosome, two forms of the X chromosome are segregating in the population.
[Figure: restriction maps of X1 and X2 showing BamHI (B) and EcoRI (R) sites and the positions of Probe1 and Probe2. X1 carries an EcoRI site that splits the 8 kb EcoRI fragment of X2 into 5 kb and 3 kb fragments.]
Digest DNA with EcoRI or BamHI and probe with Probe1 or Probe2. What do we get?

RFLP in individuals
If we used Probe1 for Southern blots with a BamHI digest, what would be the results for X1/X1, X1/X2 and X2/X2 individuals? (Band sizes in kb.)

BamHI   X1/X1  X1/X2  X2/X2
Probe1  18     18     18

If we used Probe1 for Southern blots with an EcoRI digest, what would be the results for X1/X1, X1/X2 and X2/X2 individuals?

EcoRI   X1/X1  X1/X2    X2/X2
Probe1  5, 3   8, 5, 3  8
Probe2  9.5    9.5      9.5

RFLP
RFLPs are found by trial and error.
They require an appropriate probe and an appropriate enzyme.
They are very valuable because they can be used just like any other genetic marker to map genes.
They are employed in recombination analysis (mapping) in the same way as conventional morphological allele markers.
The presence of a specific restriction site at a specific locus on one chromosome and its absence at the same locus on another chromosome can be viewed as two allelic forms of a gene.
The phenotype in this case is a Southern blot rather than white eye/red eye.
[Figure: EcoRI (R) maps of X1 and X2 with the positions of Probe1 and Probe2, and the Southern blot patterns of an X1/X1 individual and an X2/X2 individual]

RFLPs can be followed in pedigrees
[Figure: pedigree with alleles A and C (genotypes CA, CC, AA, CA) scored for the EcoRI site R3 (GAATTC present, R3+; GAGTTC, R3-)]
The mutation is recessive and most likely resides at or near R3-.

RFLPs can be followed in pedigrees
[Figure: pedigree with alleles A, B and C (genotypes CA, CB, AA, BA) scored for the EcoRI sites R2 and R3]
The mutation is recessive and most likely resides at or near R3.

Using RFLPs to map human disease genes
[Figure: Southern blots of a pedigree after EcoRI digestion, probed with Probe1, Probe2 and Probe3 (RFLPs 1, 2 and 3)]
Which RFLP pattern segregates specifically with all of the diseased individuals BUT NOT WITH NORMAL INDIVIDUALS?
1, 2 or 3?
Which band segregates with the phenotype, top or bottom?
Using DNA probes for different RFLPs, you can screen individuals for an RFLP pattern that shows co-inheritance with the disease.
Conclusion: the actual mutation resides at or near the RFLP3 bottom band (dominant?).

Mapping
To map two genes, you perform crosses and measure the recombination frequency between the two genes IN A HETEROZYGOUS INDIVIDUAL.
Genes W and B are responsible for wing and bristle development.
[Map: Centromere, W, B, Telomere]
To find the map distance between these two genes we need ALLELIC variants at each locus:
W = wings, w = no wings
B = bristles, b = no bristles
To measure the distance between these two genes, you do a TEST CROSS: a double heterozygote is crossed to the double homozygous recessive.

WB/wb female (wings, bristles)  x  wb/wb male (no wings, no bristles)
----W--------B----                 ----w--------b----
----w--------b----                 ----w--------b----

Mapping
WB/wb female (wings, bristles) x wb/wb male (no wings, no bristles)

Female gamete  Genotype (male gamete wb)  Phenotype              Number
WB             WB/wb                      Wings, bristles        51
wb             wb/wb                      No wings, no bristles  43
Wb             Wb/wb                      Wings, no bristles     3
wB             wB/wb                      No wings, bristles     4

Map distance = # recombinants / total progeny = 7/101, or about 7%, i.e. 7 M.U.

Mapping
To find the map distance between genes, multiple alleles are required to measure the appearance of recombinants.
We know the distance between W and B by the classical method because multiple alleles exist at each locus (W & w, B & b). It is 7 MU.
We know the distance between B and R by the classical method as 20 MU, by using heterozygotes for B and R in a cross and looking for recombinants.
[Map: Centromere, W, B, C, R, Telomere; W to B = 7 MU, B to R = 20 MU]
Now suppose you find a new gene C. You could map this gene with respect to genes W, B and R using classical methods. However, what if it is difficult to study the function of this new gene (the phenotype is difficult to see with the naked eye)?
If the researcher identifies an RFLP in this gene, you can map the gene mutation by simply following the RFLP.
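The map-distance arithmetic from the W/B test cross can be sketched in a few lines of Python. This is only an illustration using the progeny counts given on the slide; the function name `map_distance` and the dictionary layout are assumptions made for the example, not part of the lecture.

```python
def map_distance(progeny_counts, recombinant_classes):
    """Return the map distance in map units (percent recombinant progeny)."""
    total = sum(progeny_counts.values())
    recombinants = sum(progeny_counts[g] for g in recombinant_classes)
    return 100.0 * recombinants / total

# Progeny counts keyed by the female gamete (the male always contributes wb).
counts = {"WB": 51, "wb": 43, "Wb": 3, "wB": 4}

# Wb and wB are the recombinant classes; WB and wb are parental.
distance = map_distance(counts, ["Wb", "wB"])
print(f"{distance:.1f} map units")
```

The exact value is 7/101 of the progeny, which the slide rounds to 7 M.U.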
Mapping
Prior to RFLP analysis, only a few classical gene markers existed (approximately 200 in humans).
Now over 7000 RFLPs have been mapped in the human genome.
Newly inherited disorders are now mapped by determining whether they are linked to previously identified RFLPs.
[Map: Centromere, W or w, B or b, R or r, Telomere; W to B = 7 MU, B to R = 20 MU]
[Map: Centromere, RFLP1 (Probe1: 7 kb or 4 kb), RFLP2 (Probe2: 1 kb or 2 kb), RFLP3 (Probe3: 3 kb or 9 kb), Telomere]

RFLP Mapping
Both the normal and mutant alleles of gene B (B and b) are sequenced and we find:
B: GAATTC (EcoRI site present; EcoRI fragments of 3 kb and 2 kb)
b: AAATTC (EcoRI site lost; a single 5 kb EcoRI fragment)
The mutation disrupts the sequence and alters an EcoRI site!
If DNA is isolated from B/B, B/b and b/b individuals, cut with EcoRI and probed in a Southern blot, the pattern that we will obtain is:
[Figure: Southern blot of B/B (bristle), B/b (bristle) and b/b (no bristle) individuals]

Mapping
To find the map distance between two genes we need ALLELIC variants at each locus.
Therefore in the cross (WB/wb x wb/wb), the genotype at the B locus can be distinguished either by the presence and absence of bristles OR by a Southern blot.

WB/wb female: wings, bristles. Southern blot: 5 kb and 2 kb bands
wb/wb male: no wings, no bristles. Southern blot: 5 kb band

You can use RFLPs instead of genes as markers along a chromosome.
Just like genes, RFLPs mark specific positions on chromosomes and can be used for mapping.

Mapping
Female gamete  Genotype (male gamete wb)  Phenotype  RFLP        Number  Class
WB             WB/wb                      Wings      5 kb, 2 kb  51      Parental
wb             wb/wb                      No wings   5 kb, 5 kb  43      Parental
Wb             Wb/wb                      Wings      5 kb, 5 kb  3       Recombinant
wB             wB/wb                      No wings   5 kb, 2 kb  4       Recombinant

Map distance = # recombinants / total progeny = 7/101, or about 7%, i.e. 7 M.U.

Individuals
Methods used to study differences between individuals:
• RFLP
• SNP
• DNA repeats

Genetic polymorphism
•Genetic Polymorphism: A difference in DNA sequence among individuals, groups, or populations.
•Genetic Mutation: A change in the nucleotide sequence of a DNA molecule.
Genetic mutations are a subset of genetic polymorphism.

Genetic variation:
• Single nucleotide polymorphism (point mutation)
• Repeat heterogeneity

SNP
•A Single Nucleotide Polymorphism is a source of variance in a genome.
•A SNP ("snip") is a single base change in DNA.
•SNPs are the simplest form and most common source of genetic polymorphism in the human genome (90% of all human DNA polymorphisms).
•There are two types of nucleotide base substitutions resulting in SNPs:
–Transition: substitution between purines (A, G) or between pyrimidines (C, T). Transitions constitute two thirds of all SNPs.
–Transversion: substitution between a purine and a pyrimidine.
While a single base can change to any of the other three bases, most SNPs have only two alleles (the original base and one alternative).

SNPs - Single Nucleotide Polymorphisms
-----------------------ACGGCTAA
-----------------------ATGGCTAA
Instead of using restriction enzymes, these are found by direct sequencing/PCR.
They are extremely useful for mapping.

Markers             Number
Classical Mendelian ~200
RFLPs               7000
SNPs                1.4 x 10^6

SNPs occur every 300-1000 bp along the 3 billion bp long human genome.
Many SNPs have no effect on cell function.
Note: RFLPs are a subclass of SNPs.

SNPs
Humans are genetically >99 per cent identical: it is the tiny percentage that is different.
Much of our genetic variation is caused by single-nucleotide differences in our DNA: these are called single nucleotide polymorphisms, or SNPs. As a result, each of us has a unique genotype that typically differs in about three million nucleotides from every other person.
SNPs occur about once every 300-1000 base pairs in the genome, and the frequency of a particular polymorphism tends to remain stable in the population.
Because only about 3 to 5 percent of a person's DNA sequence codes for the production of proteins, most SNPs are found outside of "coding sequences".

How did SNPs arise?
P   ----ACTGACTGAC----CCTTACGTTG----TACTACGCAT----
F1  ----ACTGACTGAC----CCTTACGTTG----TACTACGCAT----
F2a ----ACGGACTGAC----CCTTACGTTG----TACTACGCAT----

P   ----ACTGACTGAC----CCTTACGTTG----TACTACGCAT----
F1  ----ACTGACTGAC----CCTTACGTTG----TACTAGGCAT----
F2b ----ACTGACTGAC----CCATACGTTG----TACTAGGCAT----

Compare the two F2 progeny:
Haplotype1 (F2a) = SNP allele1 ----ACGGACTGAC----CCTTACGTTG----TACTACGCAT----
Haplotype2 (F2b) = SNP allele2 ----ACTGACTGAC----CCATACGTTG----TACTAGGCAT----

Each of the 10^13 cells in the human body receives approximately a thousand DNA lesions per day (Lindahl and Barnes 2000).
When these mutations are not repaired they are fixed in the genome of that particular cell.
If a mutation is fixed in germ cells that go on to be fertilized and form an embryo, it will be propagated to progeny.
It was calculated that there are ~70 new changes in each diploid human genome.

SNPs, RFLPs, point mutations
[Figure: a series of EcoRI sites (GAATTC) in which variant sites GAGTTC and GACTTC are labeled as RFLP, SNP and point mutation]

Coding Region SNPs
•Types of coding region SNPs
–Synonymous: the substitution causes no amino acid change to the protein it produces. This is also called a silent mutation.
–Non-Synonymous: the substitution results in an alteration of the encoded amino acid. A missense mutation changes the protein by causing a change of codon. A nonsense mutation results in a misplaced termination.
–More than half of all coding sequence SNPs result in non-synonymous codon changes.

Intergenic SNPs
Researchers have found that most SNPs are not responsible for a disease state because they are intergenic SNPs.
Instead, they serve as biological markers for pinpointing a disease on the human genome map, because they are usually located near a gene found to be associated with a certain disease.
Scientists have long known that diseases caused by single genes and inherited according to the laws of Mendel are actually rare. Most common diseases, like diabetes, are caused by multiple genes.
Finding all of these genes is a difficult task.
Recently, there has been focus on the idea that all of the genes involved can be traced by using SNPs.
By comparing the SNP patterns in affected and non-affected individuals (patients with diabetes and healthy controls, for example), scientists can catalog ALL of the DNA sequence variations in affected vs. unaffected individuals to identify mutations that underlie susceptibility to diabetes.

PCR
If a region of DNA has already been sequenced in one individual, the sequence information can be used to isolate and amplify that sequence from the DNA of other individuals in a population.
Individuals with mutations in p53 are at risk for colon cancer.
To determine if an individual had such a mutation, prior to PCR one would have to clone the gene from the individual of interest (construct a genomic library, screen the library, isolate the clone and sequence the gene).
With PCR, the gene can be isolated directly from DNA isolated from that individual.
No lengthy cloning procedure is necessary.
Only small amounts of genomic DNA are required.
30 rounds of amplification can give you >10^9 copies of a gene.

Template:
5'AAAGATCGGGGGGGGGGGGGGGTCGATCTA3'
3'TTTCTAGCCCCCCCCCCCCCCCAGCTAGAT5'

PRIMER1: 5'AAAGATC3'
PRIMER2: 3'AGCTAGAT5'

WT, primers annealed to the separated strands:
5'AAAGATCGGGGGGGGGGGGGGGTCGATCTA3'
                      3'AGCTAGAT5'
5'AAAGATC3'
3'TTTCTAGCCCCCCCCCCCCCCCAGCTAGAT5'

Products:
5'AAAGATCGGGGGGGGGGGGGGGTCGATCTA3'
3'TTTCTAGCCCCCCCCCCCCCCCAGCTAGAT5'
[Figure: agarose gel showing the PCR product]

SNPs and Primers - ASO hybridization
Individual 1 GACTCCTGAGGAGAAGTG
Individual 2 GACTCCTGTGGAGAAGTG
[Figure: oligos hybridized to DNA from individuals 1 and 2; raising the temperature removes mismatched oligos]
DNA from individuals 1 and 2 is tested under CONDITIONS that only allow perfect matches of oligos to anneal to the genomic DNA.
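The perfect-match rule behind ASO hybridization can be illustrated with a small Python sketch. It is a deliberately simplified model, assuming that under stringent conditions an oligo anneals only where it is exactly complementary to the genomic strand; real hybridization depends on temperature, salt and probe length. The helper names `reverse_complement` and `anneals_perfectly` are hypothetical, and the two 18-base sequences are the ones shown on the slide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def anneals_perfectly(oligo, genomic_strand):
    """Simplified stringency model: the oligo binds only with zero mismatches."""
    return reverse_complement(oligo) in genomic_strand

genomic = {
    "individual_1": "GACTCCTGAGGAGAAGTG",
    "individual_2": "GACTCCTGTGGAGAAGTG",
}

# One allele-specific oligo per allele, each complementary to one sequence.
oligos = {name: reverse_complement(seq) for name, seq in genomic.items()}

results = {
    (person, oligo_name): anneals_perfectly(oligo, strand)
    for person, strand in genomic.items()
    for oligo_name, oligo in oligos.items()
}

for (person, oligo_name), bound in sorted(results.items()):
    print(f"DNA of {person} + oligo for {oligo_name}: {'anneals' if bound else 'no signal'}")
```

Each oligo gives a signal only with the DNA it was designed against, which is how a single-base difference between the two individuals is read out.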
Mutants cannot be amplified

PRIMER1: 5'AAAGATC3'
PRIMER2: 3'AGCTAGAT5'

WT template (both primers match perfectly):
5'AAAGATCGGGGGGGGGGGGGGGTCGATCTA3'
3'TTTCTAGCCCCCCCCCCCCCCCAGCTAGAT5'

Mutant template (PRIMER2 no longer matches its target perfectly):
5'AAAGATCGGGGGGGGGGGGGGGTCGATTTA3'
                      3'AGCTAGAT5'

[Figure: agarose gel comparing the WT and mutant PCR reactions]

PCR allows only one SNP to be tested at a time

SNP   1    2    3  4  5  6  7  8  9
Ind1  aAa  cCc  A  G  G  T  A  T  C
Ind2  aTa  cGc  A  T  G  T  G  T  G

Oligonucleotide chips contain thousands of short DNA sequences immobilised at different positions.
Such chips can be used to discriminate between alternative bases at the site of a SNP.
Chips allow many SNPs to be analyzed in parallel.
Short DNA sequences on the chip represent all possible variations at a polymorphic site; labeled genomic DNA from an individual will only stick if there is an exact match. The base is identified by the location of the fluorescent signal.

Microarrays and SNPs
SNP1 A  GACTCCTGAGGAGAAGTG
SNP1 B  GACTCCTGTGGAGAAGTG
SNP2 A  GGGGGGGGGGGGGGGGGG
SNP2 B  GGGGGGGGCGGGGGGGGG
Design oligonucleotides complementary to each polymorphism.
These oligos are arrayed on a slide.
Each spot corresponds to a polymorphism.
Isolate genomic DNA from an individual.
Label the genomic DNA and hybridize it to the array.

Oligo probes on slide:
SNP1  GACTCCTGAGGAGAAGTG
      GACTCCTGTGGAGAAGTG
SNP2  GGGGGGGGGGGGGGGGGG
      GGGGGGGGCGGGGGGGGG

Take genomic DNA from an individual.
Fragment the DNA.
Label the DNA with fluorescent dye.

Microarray slide
[Figure: the SNP1 and SNP2 probe spots hybridized with labeled DNA from three individuals]

             SNP1  SNP2
Individual1  TT    GG
Individual2  AA    CC
Individual3  AT    GG

Genotype and Haplotype
In the most basic sense, a haplotype is a "haploid genotype".
Haplotype: a particular pattern of SNPs (or alleles) found on a single chromosome in a single individual.
The DNA sequence of any two people is 99 percent identical.
Sets of nearby SNPs on the same chromosome are inherited in blocks. Therefore, while blocks may contain a large number of SNPs, a few SNPs are enough to uniquely identify the haplotype of that block.
The HapMap is a map of these specific SNPs. SNPs that identify the haplotypes are called tag SNPs.
This makes genome scan approaches to finding regions with genes that affect diseases much more efficient and comprehensive.
Haplotyping: involves grouping individuals by haplotypes, or particular patterns of sequential SNPs, on a single chromosome.
There are thought to be a small number of haplotype patterns for each chromosome.
Microarrays or PCR are used to accomplish haplotyping.

Haplotype and SNPs
Each individual has a characteristic pattern of SNPs.
SNPs occur every 300-1000 bp apart.
There are over a million SNPs in each individual.
When we generate a SNP map for an individual we DO NOT check every single SNP in that individual's DNA.
SNPs are transmitted as blocks (separated by recombination hot spots), so there is no point analyzing SNPs that go together.

SNP   1    2    3  4  5  6  7  8  9
Ind1  aAa  cCc  A  G  G  T  A  T  C
Ind2  aGa  cGc  A  T  G  T  G  T  G

SNPs in red (lowercase here) were not studied. Only the 9 black SNPs were studied.

SNP mapping is used to narrow down the known physical location of mutations to a single gene.
The human genome sequence provided us with the list of many of the parts to make a human.
The HapMap provides us with indicators which we can focus on in looking for genes involved in common disease.
By using HapMap data to compare the SNP patterns of people affected by a disease with those of unaffected people, researchers can survey the whole genome and identify genetic contributions to common diseases more efficiently than has been possible without this genome-wide map of variation: the HapMap Project has simplified the search for gene variants.

A recessive disease pedigree
[Figure: pedigree of a recessive disease]

Mapping recessive disease genes with DNA markers
SNP markers are mapped evenly across the genome.
The markers are polymorphic.
By looking at the SNP pattern of a particular grandchild, we can tell which grandparent contributed a certain part of its DNA.
If we knew that grandparent carried the disease, we could say that part of the DNA might be responsible for the disease.

[Figure: 9 SNP positions studied along the chromosome. SNPs in red were not studied; only the 9 black SNPs were studied.]
4 different alleles at each locus:
Position 1 can be A or C or G or T
Position 2 can be A or C or G or T
Position 3 ...
Grandparent  Chromosomes (positions 1-9)
Adam         A-A-A-A-A-A-A-A-A
             A-A-A-A-A-A-A-A-A
Carla        C-C-C-C-C-C-C-C-C
             C-C-C-C-C-C-C-C-C
Gary         G-G-G-G-G-G-G-G-G
             G-G-G-G-G-G-G-G-G
Tracy        T-T-T-T-T-T-T-T-T
             T-T-T-T-T-T-T-T-T

Mapping recessive disease genes with SNP markers

Position       1 2 3 4 5 6 7 8 9
Grandparent A  A-A-A-A-A-A-A-A-A
               A-A-A-A-A-A-A-A-A
Grandparent C  C-C-C-C-C-C-C-C-C
               C-C-C-C-C-C-C-C-C
Dad            A-A-A-A-A-A-A-A-A
               C-C-C-C-C-C-C-C-C
Grandparent G  G-G-G-G-G-G-G-G-G
               G-G-G-G-G-G-G-G-G
Grandparent T  T-T-T-T-T-T-T-T-T
               T-T-T-T-T-T-T-T-T
Mom            G-G-G-G-G-G-G-G-G
               T-T-T-T-T-T-T-T-T
Offspring1     A-A-A-C-C-A-A-C-C
               G-G-G-G-T-T-T-T-G
Offspring2     C-C-A-A-C-A-C-A-A
               G-G-G-G-T-T-T-G-G
Offspring3     A-A-A-A-A-C-C-C-C
               T-T-G-G-G-G-T-T-T
Offspring4     C-C-C-C-C-C-A-A-A
               G-G-T-T-T-T-T-T-T

Grandparents A and T and offspring 1 and 4 have the disease.
We would look at the markers and see that ONLY at position 7 do offspring 1 and 4 have the DNA from grandparents A and T.
It is therefore likely that the disease gene will be somewhere near marker 7.