Today’s Lecture
Genetic mapping studies: two approaches
• Classical linkage map/genome-wide association study
• Physical map
Cloning and isolating genes the old-fashioned way using positional cloning
• Search for the cystic fibrosis gene

Next Lecture
Modern genome sequencing
• Shotgun sequencing an entire genome
• Sequencing the human genome
Functional/comparative genomics
• RNA-Seq: sequencing the transcriptome

Classical genetic linkage & association studies:
Genetic linkage mapping involves determining the statistical association of specific traits with genetic markers on chromosomes using pedigrees and crosses.
Genome-wide association studies = GWAS
Use recombination frequencies to determine relative distances between markers on a chromosome + statistical association with the trait or disease of interest.
Humans require 24 different maps, one for each of the 22 autosomes and one each for the X and Y chromosomes.
Linked genes = on the same chromosome or genome
The unit measured for each linkage map is the recombination frequency = # recombinants/total progeny
Reported as map units (mu) or centiMorgans (cM), which are distinct from physical distances.

Different types of markers used in genetic mapping:
1. Genes can be used as genetic markers, but they are not ideal choices because they occur infrequently (ca. every 100 kb in humans).
2. Greater marker density is usually required.
3. Major types of markers that have been used:
   1. RFLPs = substitutions at a restriction site
   2. Microsatellites (STRs) = short tandem repeats
   3. STSs = sequence tagged sites
   4. Single nucleotide polymorphisms (SNPs)

Genome-wide mapping:
High-density genetic mapping was revolutionized in the 1980s by the discovery of abundant polymorphic genetic markers like microsatellites.
Research teams collaborated and added to a common database.
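The recombination-frequency definition given earlier (# recombinants / total progeny, reported in map units or cM) can be sketched as a short calculation. This is a minimal illustration, not part of the lecture: the function names and the progeny counts are hypothetical, and the simple "1% recombination = 1 cM" conversion is only a reasonable approximation for closely spaced markers, because double crossovers cause larger distances to be underestimated.

```python
def recombination_frequency(recombinants: int, total_progeny: int) -> float:
    """Fraction of progeny that are recombinant for two markers."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if not 0 <= recombinants <= total_progeny:
        raise ValueError("recombinants must be between 0 and total_progeny")
    return recombinants / total_progeny


def map_distance_cm(recombinants: int, total_progeny: int) -> float:
    """Approximate genetic distance in centiMorgans (map units).

    Assumes the linear rule 1% recombinants = 1 cM, which is only
    a good approximation for short distances.
    """
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if not 0 <= recombinants <= total_progeny:
        raise ValueError("recombinants must be between 0 and total_progeny")
    return 100 * recombinants / total_progeny


if __name__ == "__main__":
    # Hypothetical cross: 120 recombinant offspring out of 1,000 scored.
    rf = recombination_frequency(120, 1000)
    cm = map_distance_cm(120, 1000)
    print(f"recombination frequency = {rf:.2f}, distance = {cm:.1f} cM")
```

Note that the result is a genetic distance, not a physical one: two markers 12 cM apart may be separated by very different numbers of base pairs depending on the local recombination rate.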
By 1994, the human genetic map had localized:
5,264 microsatellites to 2,335 chromosome loci (average density of one marker every 599 kb).
In the process, thousands of sequence tagged sites (STSs) were identified.
STS = a couple hundred base pairs of known sequence

Figure: High-density genetic map of 5,264 microsatellites localized to each of 23 chromosomes.

From genome-wide mapping to genome sequencing…
For many species with small genomes, such a map would have provided enough landmarks to begin sequencing the entire genome with a conventional map-and-sequence approach (PCR + sequencing).
The human map still lacked resolution; large stretches of uncharted DNA remained.
Average distance between markers was 600 kb.
Physical mapping was required to assist with the sequencing.
Physical map = map of physically identifiable regions of genomic DNA constructed without recombination analysis.
Time and effort could be minimized by targeting sequencing efforts to a specific chromosome (or smaller regions).

Two types of physical maps useful for sequencing a genome:
1. Low resolution: cytogenetic/FISH maps
• Stained chromosomes produce banding patterns composed of bands that average 6 Mb.
• Regions are designated by their position relative to the centromere.
  “q” = long arm
  “p” = short arm
  Numbered from the centromere starting with “1”
• Genes and other sequences are localized to chromosome maps with probes, using a technique called fluorescence in situ hybridization (FISH).
• Various types of radioactive probes and stains also can be used to mark specific regions of chromosomes.
• Provides a physical map of the overall structure of each chromosome/region.
http://www.mun.ca/biology/scarr/FISH_chromosome_painting.htm
http://www.euchromatin.org/E09.htm

Two types of physical maps useful for sequencing a genome:
2. High resolution: YAC/BAC clone contig maps
• Mechanically shear or partially digest genomic DNA with restriction enzymes and clone large (200-500 kb) overlapping fragments into YACs or BACs.
• An entire genome or single chromosome can be represented in a YAC or BAC clone library (depends on starting point).
• Overlapping YAC/BAC clones can be assembled into a scaffold without sequencing by DNA fingerprinting using markers like microsatellites.
• BAC vectors with a capacity of 300 kb and the ability to replicate in E. coli have become popular for genome sequencing (now routinely sequenced using the shotgun approach).

Fig. 10.1, 2nd edition: YAC contig physical map assembled by microsatellite mapping (combination YACs + microsatellite mapping).

Cloning, isolating, and sequencing genes:
Locating a gene is easy if the gene product (protein) is identified.
1. Create a cDNA library using an expression vector.
2. Probe with antibodies that bind the gene product.
3. Isolate and sequence positive clones.
If the gene product is unknown, locating and sequencing a gene is more difficult.
1. Identify a marker (microsatellite, RFLP, SNP) that:
   1. Shows a strong statistical association with the disease phenotype in test crosses or a genome-wide association study (GWAS).
   2. Is physically linked to the gene on the same chromosome.
2. Use the linkage map + physical map and a technique called positional cloning + chromosome walking to home in on the gene and actually sequence it.
e.g., cloning and discovery of the cystic fibrosis (CF) gene.

Positional cloning: identification of the cystic fibrosis (CF) gene:
Most common lethal genetic disease in the U.S. (~1 in 2,000).
First human gene identified by positional cloning.
Required 4 years and the work of many laboratories.

Overview of cystic fibrosis:
CF results from a defect in a protein that regulates the movement of salt and water in and out of cells.
Causes thick mucus secretions in the lungs, pancreas, and intestines.
Causes lung disease and organ failure; patients experience chronic bacterial infections.
Life expectancy is about 40 years.

First steps to identifying the CF gene by positional cloning:
1. Many hundreds of individuals with CF pedigrees were screened with a large number of RFLPs.
2. A single recurring RFLP showed weak linkage (statistical association) to the cystic fibrosis trait.
3. The CF gene was next localized to chromosome 7 using a labeled RFLP probe and in situ hybridization to condensed chromosomes.
4. All other known RFLPs from chromosome 7 were simultaneously screened for linkage to CF.
5. Two more linked RFLPs were discovered in a 500,000 bp subregion (31-32) of the long arm of chromosome 7 (7q31-q32).
6. The data indicated that the CF locus is within a 500,000 bp region of chromosome 7.

Steps to identifying the CF gene (cont.):
1. The section (500 kb) of chromosome 7 containing the CF gene was cut, cloned, and mapped using a technique called chromosome walking.
   1. The end of a cloned sequence is used as a probe to find adjacent overlapping fragments in a genomic library.
   2. Clones that overlap are mapped with RFLPs to determine the extent of overlap.
   3. A new labeled probe designed for the second clone is used to screen the library once again.
   4. Repeat…
The technique does not work well with highly repetitive sequence that is scattered throughout the genome.
The length of each step in the chromosome walk is limited by the size of inserts in the library and the size of the overlap.

Fig. 9.10, 2nd edition: Illustration showing how chromosome walking was used to identify a candidate gene for a disease like cystic fibrosis.

A technique called chromosome jumping also was used:
1. Use partial restriction digestion to cut a large section of chromosomal DNA into large overlapping fragments.
2. Circularize fragments with DNA ligase, bringing ends of DNAs that previously were distant close together.
3. Cut the circles with a restriction enzyme yet again to release the junction region (ends are now inverted).
4. Clone junction regions to form a jumping library.
5. Subclone a small fragment of DNA and use it as a probe to find the next junction fragment occurring in the library (same technique as chromosome walking).
6. Repeat… and/or start chromosome walking.
7. Chromosome jumping reaches the target gene faster than walking.
8. A similar technique called “mate pair” is used in today’s next-generation sequencing.

Figure: Chromosome jumping / preparation of a next-generation mate-pair library: http://www.investigativegenetics.com

Summary of the search for the CF gene:
1. 7 chromosome jumps were made for CF.
2. Chromosome walks were made from each jump site to identify overlapping clones.
3. Clones spanning a total of 500 kb eventually were characterized.
4. Next, cloned DNA was used as a probe against DNA from other species using a restriction digest + Southern blot.
   *Genes are more conserved than non-coding sequences, so similar sequences should be found in other species.
5. Five subclones (or candidates) hybridized with DNA from other organisms.
6. Two of the subclones were ruled out by linkage analysis, and a third was a pseudogene (gene-like sequence lacking expression signals).
7. The remaining two clones were hybridized with mRNA on a Northern blot to test whether their sequences are transcribed.
8. One more candidate was eliminated, and the 5th candidate was sequenced…

Characteristics of the CF gene:
1. cDNA (mature mRNA of same size) is 6,500 bp.
2. Genomic DNA: the CF gene spans 250 kb and contains 24 exons.
3. 68% of Caucasians with cystic fibrosis show a 3-bp deletion that results in the loss of phenylalanine (Phe).
4. Sixty other mutations have been described.

Fig. 4.13: CFTR structure (Cystic Fibrosis Transmembrane Conductance Regulator protein).
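To illustrate why a 3-bp deletion removes a single amino acid rather than scrambling the rest of the protein, the sketch below deletes three bases from a short coding stretch and translates both versions. This is an illustration only, not part of the lecture: the 15-base sequence is assumed to be the commonly cited wild-type stretch around the deleted phenylalanine codon, the codon table covers only the codons used here, and all function names are hypothetical.

```python
# Partial codon table: only the codons that appear in this example.
CODON_TABLE = {
    "ATC": "Ile",
    "ATT": "Ile",
    "TTT": "Phe",
    "GGT": "Gly",
    "GTT": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate an in-frame DNA string into a list of amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def delete_bases(dna: str, start: int, length: int) -> str:
    """Return dna with `length` bases removed starting at 0-based `start`."""
    return dna[:start] + dna[start + length:]


# Assumed wild-type stretch: Ile Ile Phe Gly Val
wild_type = "ATCATCTTTGGTGTT"

# Remove three bases (CTT) that span the end of one Ile codon
# and the start of the Phe codon.
mutant = delete_bases(wild_type, start=5, length=3)

if __name__ == "__main__":
    print("wild type:", wild_type, translate(wild_type))
    print("mutant:   ", mutant, translate(mutant))
```

Because the deletion is a multiple of three bases, the reading frame downstream is preserved: the codons after the deletion still translate as before, and only the phenylalanine is missing from the mutant protein.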