Bioinformatics Algorithms (Fundamental Algorithms, module 2)
Strings and Sequences in Biology
Zsuzsanna Lipták
Masters in Medical Bioinformatics, academic year 2016/17, spring term

Strings in molecular biology

Strings are finite sequences over an alphabet Σ (also called sequences).
• DNA (characters: nucleotides): Σ = {A,C,G,T}
• RNA (characters: nucleotides): Σ = {A,C,G,U}
• proteins (characters: amino acids): Σ = {A,C,D,E,F,...,W,Y}
• many other problems in molecular biology can be modelled by strings (e.g. gene order, SNPs, haplotypes, ...)

The central dogma of molecular biology

[Figure; source: Wonderwikikids.com]

DNA: nucleotides

    5’ ...AACAGTACCATGCTAGGTCAATCGA... 3’
    3’ ...TTGTCATGGTACGATCCAGTTAGCT... 5’

• 4 characters: A C G T: adenine, cytosine, guanine, thymine (bases, nucleotides)
• orientation (read from 5’ to 3’ end)
• length measured in bp (base pairs)
• double stranded, the two strands are antiparallel
• A - T and C - G complementary (Watson-Crick pairs)
• reverse complement: (ACCTG)^rc = CAGGT
• during transcription, one strand is copied into mRNA (messenger RNA), except all T’s are replaced by U’s
• the strand which is identical to the mRNA is called coding strand
• the other strand (the one which is used for the transcription) is called template strand
• Both strands can be used as coding strands (for different genes).
• Some DNA strings are circular: bacterial DNA, mitochondrial DNA.

RNA: nucleotides

• like DNA, except:
• 4 characters: A C U G: adenine, cytosine, uracil, guanine (U instead of T)
• RNA is single-stranded
• builds double stranded hybrids with DNA
• RNA folds upon itself (makes complex 3-dim structures), using the Watson-Crick pairs and other bonds (RNA folding)

Protein: Amino acids

There are 20 common amino acids (aa’s); two systems of abbreviations are used: 3-letter-code and 1-letter-code. We usually use the 1-letter-code.

    amino acid      3-letter  1-letter
    alanine         Ala       A
    arginine        Arg       R
    asparagine      Asn       N
    aspartic acid   Asp       D
    cysteine        Cys       C
    glutamine       Gln       Q
    glutamic acid   Glu       E
    glycine         Gly       G
    histidine       His       H
    isoleucine      Ile       I
    leucine         Leu       L
    lysine          Lys       K
    methionine      Met       M
    phenylalanine   Phe       F
    proline         Pro       P
    serine          Ser       S
    threonine       Thr       T
    tryptophan      Trp       W
    tyrosine        Tyr       Y
    valine          Val       V

The genetic code

[Figure; source: Wikimedia commons]

• standard genetic code (some organisms use a different one)
• 3 different reading frames for translation: The DNA sequence
    5’ ...TATTCGAATCGGC... 3’
  can be translated in 3 different ways, leading to different aa sequences.
• degeneracy of the genetic code: 64 codons but only 20 aa’s plus stop codon
• silent mutations: if third position mutates, this often does not alter the aa

Exercise: Translate this DNA sequence according to the 3 different reading frames:

    5’ ...TATTCGAATCGGC... 3’
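The string operations named in these slides (reverse complement, transcription, and translation in the three reading frames) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: the function names `reverse_complement`, `transcribe` and `translate_frames` are hypothetical, the input is assumed to be an uppercase string over {A,C,G,T}, stop codons are written as `*`, and the codon table is the standard genetic code (some organisms use a different one).

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# "*" marks a stop codon. (Assumption: standard code only.)
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Watson-Crick pairs: A - T and C - G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse, so the result reads 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """mRNA is identical to the coding strand, with every T replaced by U."""
    return coding_strand.replace("T", "U")


def translate_frames(dna: str) -> list[str]:
    """Translate the given strand in its 3 reading frames (offsets 0, 1, 2).

    Trailing bases that do not fill a complete codon are ignored.
    """
    frames = []
    for offset in range(3):
        codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
        frames.append("".join(CODON_TABLE[c] for c in codons))
    return frames


if __name__ == "__main__":
    print(reverse_complement("ACCTG"))  # CAGGT, as on the DNA slide
    print(transcribe("ACCTG"))
```

Calling `translate_frames("TATTCGAATCGGC")` returns one amino acid string per reading frame, which gives a way to check a hand-worked answer to the exercise. Translating the reverse complement as well would cover the opposite strand, since both strands can serve as coding strands.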