* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download mnw2yr_lec17_2004
Medical genetics wikipedia , lookup
Cell-free fetal DNA wikipedia , lookup
Gene therapy wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Polycomb Group Proteins and Cancer wikipedia , lookup
Oncogenomics wikipedia , lookup
Behavioural genetics wikipedia , lookup
Cre-Lox recombination wikipedia , lookup
Epigenetics of human development wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Point mutation wikipedia , lookup
Segmental Duplication on the Human Y Chromosome wikipedia , lookup
Gene desert wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Quantitative trait locus wikipedia , lookup
Metagenomics wikipedia , lookup
Skewed X-inactivation wikipedia , lookup
Transposable element wikipedia , lookup
Genealogical DNA test wikipedia , lookup
Gene expression programming wikipedia , lookup
Genomic imprinting wikipedia , lookup
Genetic engineering wikipedia , lookup
No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup
Y chromosome wikipedia , lookup
Copy-number variation wikipedia , lookup
Minimal genome wikipedia , lookup
Molecular Inversion Probe wikipedia , lookup
Neocentromere wikipedia , lookup
Whole genome sequencing wikipedia , lookup
Non-coding DNA wikipedia , lookup
Microsatellite wikipedia , lookup
History of genetic engineering wikipedia , lookup
Helitron (biology) wikipedia , lookup
X-inactivation wikipedia , lookup
Designer baby wikipedia , lookup
Human genetic variation wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Pathogenomics wikipedia , lookup
Microevolution wikipedia , lookup
Genomic library wikipedia , lookup
Human Genome Project wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Human genome wikipedia , lookup
Genome editing wikipedia , lookup
SNP genotyping wikipedia , lookup
Genome evolution wikipedia , lookup
Genome (book) wikipedia , lookup
Genomics An introduction Aims of genomics I Establishing integrated databases – being far from merely a storage Linking genomic and expressed gene sequences cDNA Aims of genomics II Describing every gene: • function/expression data/relationships/phenotype • 3-d structure and features (introns/exons, domains, repeats) • similarities to other genes Characterize sequence diversity in population Genomics can be: Structural – where it is? Functional – what it does? – DNA microarrays: Comparative – finding important fragments Mapping genomes Past – Genetic maps Distance between simple markers expressed in units of recombination – Cytological maps Stained chromosomes, observable under microscope Present – Physical maps Distance between nucleotides expressed in bases – Comparative map Corresponding genes detection; Regulatory sequence detection; Genome sizes Organism DNA length Genes Mycoplasma genitalium 0.5 Mb 470 Deinococcus radiodurans 3 Mb in 410 copies! 3 200 Escherichia coli 4.5 Mb 4 400 Saccharomyces 12 Mb 6 200 Caenorhabditis elegans 97 Mb 22 000 Drosophila melanogaster 120 Mb 18 000 Homo sapiens 3200 Mb 32 000 cerevisiae Genetic differences among humans Goals – Genetic diseases – Identifying criminals Methods – Genetic markers (fingerprints) and DNA sequence. Repeats: • Microsatellites (repeats of 1-12 nucleotides) • Minisatellites (> 12) – Other types of variation • Genome rearrangements • Single nucleotide mutations Microsatellites and disease Huntington’s disease – Huntingtin gene of unknown (!) function – Repeats #: 6-35: normal; 36-120: disease • Friedrich ataxia disease – GAA repeat in non-coding (intron) region – Repeats #: 7-34: normal; 35 up: disease – Repeat expansion reduces expression of frataxin gene SNP - Single Nucleotide Polymorphism Definition – SNP and phenotype Occurrence in genome – Rarity of most SNPs (agrees with neutral molecular evolutionary theory) – SNPs in human population: Inter-genic regions Coding regions Every 1400bp Every 1430bp • High variance in genome! Detection of SNPs: Hybridization Sickle cell anemia Sickle looks like this: SNP on Beta Globin gene, which is recessive: • 2 faulty copies: red blood cells change shape under stress anemia • 1 faulty copy: red blood cells change shape under heavy stress – but gives resistance to malaria parasite SNPs and haplotypes Passengers and their evolutionary vehicles SNP - Phase inference In the data from sequencing the genome the origin of SNP is scrambled G G ...CT AC GT... T A Possibility 1 CTGACGGT... ... CTTACAGT... ... chromosome ... chromosome ... Possibility 2 CTGACAGT... CTTACGGT... Which SNPs are on the same chromosome (are in phase)? SNP – phase inference Determining the parent of origin for each SNP G C ...CT AC GT... A G C A CT AC GT... T A ... G G CT AC GT... T A ... In this case: GG TA Phase inference – the reason why many SNPs sequencing is done for child and two parents. Linkage Disequilibrium, intro How hard is it to break a chromosome An allele/trait/SNP A and a are on the same position in genome (locus), thus on a single chromosome an individual can have either of them – but not both – fA - frequency of occurrences of trait A in population – fa = 1- fA – fB, fb = 1 - fB are frequency occurrences of B and b Probabilities of occurences of both traits on the same chromosome: A B fAB A b fAb a B faB fab a b LD and genomic recombination Linkage Disequilibrium, calculation When these alleles are not correlated we expect them to occur together by chance alone: fAB = fA fB fAb = fA fb faB = fa fB fab = fa fb But if A and B are occurring together more often (disequilibrium state), we can write fAB = fA fB + D fAb = fA fb - D faB = fa fB - D fab = fa fb + D where D is called the measure of disequlibrium Of course from definitions above we have D = fAB - fA fB How can we use it? Phase inference tells us how SNPs are organized on chromosome Linkage disequilibrium measures the correlation between SNPs Back to SNPs Daly et al (2001), Figure 1 Haplotypes - vehicles for SNPs Daly et al (2001) were able to infer offspring haplotypes largely from parents. They say that “it became evident that the region could be largely decomposed into discrete haplotype blocks, each with a striking lack of diversity“ The haplotype blocks: – Up to 100kb – 5 or more SNPs For example, this block shows just two distinct haplotypes accounting for 95% of the observed chromosomes Haplotypes on the genome fragment a) b) c) Observed haplotypes with dotted lines wherever probability of switching to another line is > 2% Percent of explanation by haplotypes Contribution of specific haplotypes Another genetic test Does haplotypes exist? - Each row represents an SNP - Blue dot = major yellow = minor - Each column represents a single chromosome - The 147 SNPs are divided into 18 blocks defined by black lines. - The expanded box on the right is an SNP block of 26 SNPs over 19kb of genomic DNA. The 4 most common of 7 different haplotypes include 80% of the chromosomes, and can be distinguished with 2 SNPs How much SNPs we can ignore? …and still predict haplotypes with high accuracy? Literature Gibson, Muse „A Primer of Genome Science” N Patil et al . Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21 Science 294 2001:1719-1723. M J Daly et al . High-resolution haplotype structure in the human genome Nat. Genet. 29 2001: 229-232.