Introduction to Molecular Biology

G-C and A-T pairing.
A & G = purines; C & T = pyrimidines.

Important terms:
• Nucleotide pair = base pair (bp)
• 1,000 base pairs = 1 kilobase pair (kb)
• 1,000,000 base pairs = 1 megabase pair (Mb)
• 1,000,000,000 base pairs = 1 gigabase pair (Gb)

Double-stranded DNA is peeled apart to replicate DNA
• The 2 daughter molecules are identical to each other and exact duplicates of the original (assuming error-free replication).
• One chromosome is one long, twisted, dramatically compacted DNA molecule.
• The average length of a human chromosome is about 130 million bp.

Genes are defined segments of DNA
• The information content of the DNA molecule consists of the order of bases (A, C, G, and T) along the length of the molecule.

Nucleic Acids: DNA vs. RNA
• RNA is quite similar to DNA, but usually single-stranded.
• Both are nucleic acids.
• In RNA, “U” replaces “T”.

Important Concepts
• DNA and RNA have polarity: each strand has a 5’ and a 3’ end. (The 2 strands of DNA are antiparallel.)
• The common convention is to list only one strand of DNA, in a 5’ to 3’ direction:
5’ AGTCGTAGTCGTAGTCGTAGTCTG 3’
(3’ TCAGCATCAGCATCAGCATCAGAC 5’)

How Genes are Expressed: the Central Dogma
• Transcription = RNA synthesis
• Translation = Protein synthesis

Eukaryotic transcription operates ‘gene by gene’.
• For a given gene, only one strand of DNA serves as the template (the antisense, or noncoding, strand); the other strand (the sense, or coding, strand) is not transcribed for that gene.
• Transcription produces an RNA ‘copy’ of a gene (DNA): the mRNA has the same sequence as the sense strand, with U in place of T.

Important Term:
• Transcription = RNA synthesis
• Quiz question: how does the sequence of the mRNA compare to the sequence of the noncoding strand of DNA?

mRNAs are translated in the cytoplasm
• Three consecutive bases in the mRNA form one codon.
• No exceptions: the genetic code is a triplet code.
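The base-pairing, strand-polarity, and codon ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (complement, reverse_complement, transcribe, split_codons) are made up for this example and do not come from the slides or from any library. The input is the example strand listed above.

```python
# Minimal sketch; all function names here are hypothetical helpers.
COMPLEMENT = str.maketrans("ACGT", "TGCA")  # A-T and G-C pairing


def complement(dna):
    """Return the paired bases, position by position (reads 3' to 5')."""
    return dna.upper().translate(COMPLEMENT)


def reverse_complement(dna):
    """Return the opposite strand written in the usual 5' to 3' direction."""
    return complement(dna)[::-1]


def transcribe(coding_strand):
    """mRNA matches the coding (sense) strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def split_codons(rna):
    """Split a sequence into consecutive triplets, dropping any leftover bases."""
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


if __name__ == "__main__":
    top = "AGTCGTAGTCGTAGTCGTAGTCTG"      # 5' to 3', from the slide
    print(complement(top))                # TCAGCATCAGCATCAGCATCAGAC (3' to 5')
    print(reverse_complement(top))        # CAGACTACGACTACGACTACGACT (5' to 3')
    print(split_codons(transcribe(top)))  # eight codons, starting with AGU
```

Note that complement() reproduces the second strand exactly as written on the slide (3’ to 5’), while reverse_complement() gives the same strand in the 5’ to 3’ convention.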
tRNAs are the ‘bilingual’ molecules
• The genetic code is the codon-amino acid conversion table.
http://academy.d20.co.edu/kadets/lundberg/DNA_animations/protein.mov

The immediate product of translation is the primary protein structure
• The primary sequence dictates the secondary and tertiary structure of the protein.

Important Term:
• Translation = Protein synthesis

There are 2 basic types of genes:
• Protein-coding genes: (DNA → mRNA → protein)
• RNA-specifying genes: (DNA → tRNA) (DNA → rRNA) (DNA → small RNA)

Genetic information, stored in DNA, is conveyed as proteins
Protein sequences are also represented linearly.
• Each of the 20 amino acids can be represented by a 3-letter code: Ser Tyr Met Glu His
• In bioinformatics, each of the 20 amino acids is commonly represented by a 1-letter code: MDETSGHLKPWECVGH . . .

Genetic information, stored in DNA, is conveyed as proteins
• In sickle-cell anemia, one nucleotide change is responsible for the one amino acid change.
• Sickle-cell anemia is caused by one amino acid change.
• A single base-pair mutation is often the cause of a human genetic disease.

How to find a gene?*
• One way is to search for an open reading frame (ORF).
• An ORF is a sequence of codons in DNA that starts with a Start codon, ends with a Stop codon, and has no other Stop codons inside.
* = inexact science

Each strand can be read in 3 possible reading frames.

5’ atgcccaagctgaatagcgtagaggggttttcatcatttgagtaa 3’

1   atg ccc aag ctg aat agc gta gag ggg ttt tca tca ttt gag taa
     M   P   K   L   N   S   V   E   G   F   S   S   F   E   *
2   tgc cca agc tga ata gcg tag agg ggt ttt cat cat ttg agt
     C   P   S   *   I   A   *   R   G   F   H   H   L   S
3   gcc caa gct gaa tag cgt aga ggg gtt ttc atc att tga gta
     A   Q   A   E   *   R   R   G   V   F   I   I   *   V

Eukaryotic Genomes
• Finding a gene is much more difficult in eukaryotic genomes than in prokaryotic genomes. WHY??

Prokaryotic (bacterial) genomes:
• Are much smaller than eukaryotic genomes
  E. coli = 4,639,221 bp (4.6 Mb)
  Human = ~3,300 Mb
• Contain a small amount of noncoding DNA
  E. coli = ~11%
  Human = >95%

Eukaryotic transcripts (mRNA) are processed and leave the nucleus
• Exon = Genetic code
• Intron = Non-essential DNA??
• The mechanism of splicing is not well understood.

Alternative splice sites generate various protein isoforms (HGP estimate = 35%)

Variable mutation rate?
• Most mutations in introns and intergenic DNA are (apparently) harmless.
• Consequently, intron and intergenic DNA sequences diverge much more quickly than exons.

Bacterial cells are different:
• Prokaryotic cells: no splicing (i.e., no split genes).
• Eukaryotic cells: intronless genes are rare (avg. # of introns in HG is 3-7, highest # is 234); the dystrophin gene is > 2.4 Mb.

How to confirm the identification of a gene?
• Possible answer: identify the gene by identifying its promoter. Promoters are DNA regions that control when genes are activated.

[Figure: promoter]

• Exons encode the information that determines what product will be produced.
• Promoters encode the information that determines when the protein will be produced.

Nucleotides of a particular gene are often numbered:
• Demonstration of a consensus sequence.

How to find a gene?
• Look for a substantial ORF and associated ‘features’.
• Two nucleic acids that are exact complements of each other will hybridize.
• Two nucleic acids that are mostly complementary (some mismatches) will hybridize under the right conditions.

Recombinant DNA techniques?
• Many popular tools of recDNA rely on the principle of DNA hybridization.
• In large mixes of DNA molecules, complementary sequences will pair.

Hybridization ‘in silico’
• Algorithms have been written that will compare two nucleic acid sequences.
• Two similar DNA sequences (they would hybridize in solution) are said ‘to match’ when software determines that they are of significant similarity.

Protein-protein similarity searches?
• Many algorithms have been designed to compare strings of amino acids (single-letter amino acid code) and find those of a defined degree of similarity.

Significance of sequence similarity
• DNA similarity suggests:
  • Similar function
  • Similar structure
  • Evolutionary relationship

The End
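Two ideas from the slides, reading a strand in three frames to look for an ORF and comparing sequences ‘in silico’, can be illustrated with a short Python sketch. Everything named here (SLIDE_SEQ, translate_frame, find_orfs, percent_identity) is a hypothetical helper written for this example. The codon table is the standard genetic code. The identity score is a deliberately naive, ungapped comparison of equal-length strings; it is not the alignment method used by real similarity-search tools, and the second peptide in the demo is an invented one-residue variant of the slide’s example.

```python
import re
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Example sequence from the reading-frame slide (coding strand, 5' to 3').
SLIDE_SEQ = "atgcccaagctgaatagcgtagaggggttttcatcatttgagtaa"


def translate_frame(dna, frame):
    """Translate one reading frame (0, 1, or 2); '*' marks a Stop codon."""
    dna = dna.upper()
    end = len(dna) - (len(dna) - frame) % 3  # ignore trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(frame, end, 3)
    )


def find_orfs(protein):
    """Return stretches that start with M (Start) and end at the first Stop."""
    return re.findall(r"M[^*]*\*", protein)


def percent_identity(seq_a, seq_b):
    """Naive ungapped identity between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this toy comparison needs equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    for frame in range(3):
        protein = translate_frame(SLIDE_SEQ, frame)
        print(frame + 1, protein, find_orfs(protein))
    # Invented variant with a single K -> R substitution.
    print(percent_identity("MDETSGHLKPWECVGH", "MDETSGHLRPWECVGH"))  # 93.75
```

Only frame 1 of the slide sequence yields an ORF under this definition; frames 2 and 3 hit Stop codons early and contain no Start codon. Covering the opposite strand as well would mean running the same three frames on its reverse complement.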