Bioinformatics: Why Can’t It Tell Us Everything?

Bioinformatics: What are our Data Sets?
• Interested in information flow within cells
• Currently, the key information is mostly a matter of biological macromolecules
• Eventually, information of interest will also include flow of nutrients, energy, and impact of small molecules on macromolecular function

Bioinformatics: What are our Questions?
• What is in there?
• What does it do?
• How similar is it to something else?
• How does it fold?
• Where does it go in a cell?
• What does it interact with?
• How is it regulated?
• Level of confidence?

Bioinformatics: Logical Reasoning Behind Data Sets
* Function of organism is determined by function of its cells
* Function of cells is determined by chemical reactions that take place within them
* Chemical reactions occur or not according to presence and activity of enzymes
* Enzymes are proteins
* Proteins are determined by genes
* Therefore, genes determine organismal function
Genomics, Proteomics

Central Dogma: Flow of Information

Central Dogma: DNA as the Blueprint for Life?

Central Dogma
DNA → RNA → Protein
Genes & proteins are different molecular languages, but they are colinear

DNA
• Basic unit (alphabet): nucleotide (base)
• Only 4: A, T, G, and C
• Double-stranded: A<>T and G<>C
  5’..AGCTGCATGCTAGCTGACGTCA….3’
  3’..TCGACGTACGATCGACTGCAGT….5’
• “Words” (genes) to encode proteins, RNA
• Double helical

DNA: Structure Connected to Information
[Image: DNA Tower in Perth, AUS]

DNA: Replication & Transcription as Algorithms
• With rare exceptions, all DNA is replicated
• Crucial tool is the ability to go from one strand to another
• Transcription uses the same base-pairing rules with U instead of T, but occurs in packets

Transcription = DNA to RNA
Where to Start is a Big Question

Protein
Alphabet: amino acids
There are 20 amino acids
Met Cys Ser Leu Ala Ala Val

Proteins: Number of Possible 100-mer Peptides?
• 20 possible residues at each position
• For 2-mers, 20 possible at position 1 and 20 possible at position 2, so $20 \times 20 = 20^{2} = 400$
• Same logic for 100-mers: $20^{100} = 2^{100} \times 10^{100} = (2^{10})^{10} \times 10^{100} \approx (10^{3})^{10} \times 10^{100} = 10^{130}$

Proteins: Folding Starts Local
alpha-helix, beta-pleated sheet

Proteins: Folding Goes Global

Proteins: Predictive Protein Folding as Holy Grail

Protein
Alphabet: amino acids
There are 20 amino acids
Encoded by codons (triplets of nucleotides)
ATG TGC AGC CTA GCT GCC GTC
Met Cys Ser Leu Ala Ala Val

Genetic Code Found on Earth: How Does It Work?
5’-UCGACCAUGGUUGACCAUUGAUUACCACG-3’

Genetic Code
• Triplet
• Nonoverlapping
• Comma-less
• Redundant

Bioinformatics: Mining a Mountain of Data
Where are the putative genes?
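
As a rough illustration of how the genetic code slides connect to the question of finding putative genes, the sketch below reads the example mRNA in nonoverlapping triplets, starting at the first AUG and stopping at the first stop codon. It assumes the standard genetic code and only scans one strand from the first start codon; real gene finding is considerably more involved. The names `CODON_TABLE`, `transcribe`, and `translate_first_orf` are illustrative and not from the slides.

```python
from itertools import product

# Standard genetic code, listed with bases in U, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Turn a DNA coding-strand sequence into mRNA by writing U for T."""
    return coding_strand.upper().replace("T", "U")


def translate_first_orf(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon.

    Returns one-letter amino acid codes, or '' if there is no AUG.
    Translation also ends if the sequence runs out before a stop codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Step three bases at a time: triplet, nonoverlapping, comma-less.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    # mRNA from the "How Does It Work?" slide.
    print(translate_first_orf("UCGACCAUGGUUGACCAUUGAUUACCACG"))  # MVDH
    # DNA example from the protein alphabet slide.
    print(translate_first_orf(transcribe("ATGTGCAGCCTAGCTGCCGTC")))  # MCSLAAV
```

For the slide's mRNA, the reading frame set by the first AUG gives Met-Val-Asp-His and then hits the stop codon UGA. The redundancy of the code shows up in the table itself: 64 codons map onto only 20 amino acids plus stop.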