Why teach a course in bioinformatics?
What is bioinformatics?

Answer: It depends on who you ask. Various definitions:
• The science of using information to understand biology.
• The science that uses computational approaches to answer biological questions. (A subtle distinction: bioinformatics is a subset of the larger field of computational biology.)
• Organizing and analyzing complex data resulting from molecular and biochemical techniques.

What skills should a bioinformatician have?
• You should have a fairly deep background in molecular biology.
• You must understand the central dogma of molecular biology.
• You should have experience with programming.
• You should be comfortable working in a command-line computing environment.

Why teach a course in bioinformatics?

Part of the answer: Bioinformaticians are needed.

From Science ‘next wave’:
• Imagine a job fair with 50 high-tech companies competing to recruit one of the handful of properly qualified scientists who bothered to show up. Sounds like a pie-in-the-sky dream, doesn't it? But according to Victor Markovitz, vice president of bioinformatics systems at Gene Logic Inc., this actually happened at a recent biotech fair. And it is more or less typical of the prevailing global job market in bioinformatics and computational biology, where there are many more headhunters than heads.

Why now?

Easy answer:
* The astronomical growth of GenBank and the Protein Data Bank!

More complex answer: The way biology is done is changing.
* One of the goals of this course: to illustrate the ways biology is changing.

Biology is scaling up.
• Genetics labs don’t do things one gene at a time anymore.
• Genetics labs use a ‘genomic approach’.

* These types of large-scale projects required, more than anything, a change in mindset:
1. Focus isn't everything.
2. Do things smarter and save work.
3. Think big!

Collaborative projects require centralized databases and systematic methods for sharing data.
• A good example is the C. elegans genome project.
Knowledge is built by constructing relations between different kinds of data.
• SMD (Stanford Microarray Database) stores raw and normalized data from microarray experiments. The data for a given gene is linked to a mass of genetic information, including an expression history for that entity, a description of the associated protein, chromosomal location, etc.

Data is a resource that can be mined.
• Beyond the initial project, data is still a valuable resource. Results from numerous research projects that might themselves be of minimal significance can often be put together to make generalizations or observations that could be quite significant.

Why teach a course in bioinformatics?

Part of the answer: Biologists who understand bioinformatics are needed.

Introduction to Molecular Biology

Overview
• The DNA-based Genome
• Biology’s Central Dogma
• Genotype to Phenotype

A & G = Purines
C & T = Pyrimidines
G-C and A-T pairing: each base pair joins one purine with one pyrimidine.

Double-stranded DNA is peeled apart to replicate DNA
• The 2 daughter molecules are identical to each other and exact duplicates of the original (assuming error-free replication).
• One chromosome is one long, twisted, dramatically compacted DNA molecule.
• The average length of a human chromosome is 130 million bp.

Genes are defined segments of DNA
• The information content of the DNA molecule consists of the order of bases (A, C, G, and T) along the length of the molecule.

How Genes are Expressed: the Central Dogma
• RNA is quite similar to DNA, but usually single-stranded. In RNA, “U” replaces “T”.
• Transcription = RNA synthesis
• Translation = Protein synthesis

Eukaryotic transcription operates ‘gene by gene’.
• For a given gene, only one strand of the DNA is read as the template (the antisense strand). The RNA produced is therefore a copy of the other strand (the sense strand), with U in place of T; the sense strand itself is not transcribed for that gene.
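The base-pairing and transcription rules above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation. It assumes the common convention in which the sense strand has the same sequence as the mRNA and the opposite (antisense) strand is the template. The function names and the 12-base example sequence are hypothetical.

```python
# Watson-Crick pairing in DNA: A pairs with T, G pairs with C.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the opposite strand of a DNA sequence, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def transcribe_from_sense(sense: str) -> str:
    """mRNA has the same sequence as the sense strand, with U replacing T."""
    return sense.upper().replace("T", "U")


def transcribe_from_template(template: str) -> str:
    """mRNA is complementary to the template (antisense) strand.

    The template is given 5'->3'; the mRNA is returned 5'->3'.
    """
    return transcribe_from_sense(reverse_complement(template))


# Hypothetical 12-base sense strand, used only for illustration.
sense = "ATGGTGCACCTG"
template = reverse_complement(sense)
mrna = transcribe_from_template(template)

print("sense:   ", sense)
print("template:", template)
print("mRNA:    ", mrna)
```

Both routes give the same mRNA: copying the sense strand with U for T, or building the complement of the template strand.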
Transcription produces an RNA ‘copy’ of a gene (DNA)
• [Animation]

Eukaryotic transcripts (mRNA) are processed and leave the nucleus.
The mRNAs are translated in the cytoplasm.
Three consecutive bases in the mRNA form one codon.
No exceptions: the genetic code is a triplet code.
tRNAs are the ‘bilingual’ molecules.
The genetic code is the codon-amino acid conversion table.
Two amino acids are joined by a peptide bond.
http://academy.d20.co.edu/kadets/lundberg/DNA_animations/protein.mov

The immediate product of translation is the primary protein structure.
The primary sequence dictates the secondary and tertiary structure of the protein.
Genetic information, stored in DNA, is conveyed as proteins.
A mutation in the DNA may alter the primary sequence of the corresponding protein.
Alteration of the primary sequence of the polypeptide may alter the secondary and tertiary structure of the protein.
The altered protein may not function properly.
Sickle-cell anemia is caused by one amino acid change.
One nucleotide change is responsible for the one amino acid change.
A single base-pair mutation is often the cause of a human genetic disease.

Mid-1970s: the discovery of ‘split genes’.
Split genes are the norm in eukaryotic organisms.
Exon = Genetic code
Intron = Non-essential DNA??
The mechanism of splicing is not well understood.

Molecular evolution is the study of organismal relationships, based in part on the comparison of conserved exon sequences.
Comparisons of intron sequences are rare. Why?
• Most mutations in introns are (apparently) harmless.
• Consequently, intron sequences diverge much more quickly than exon sequences.
• Prokaryotic cells: no splicing (i.e., no split genes).
• Eukaryotic cells: intronless genes are rare.

Promoters are DNA regions that control when genes are activated.
Exons encode the information that determines what product will be produced.
Promoters encode the information that determines when the protein will be produced.
• Demonstration of a consensus sequence.
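The translation and point-mutation slides above can be made concrete with a short Python sketch. It builds the standard codon table, translates a coding sequence codon by codon, and then changes a single base. The DNA fragment is illustrative only and is not claimed to be an exact gene sequence. The GAG to GTG change (glutamic acid to valine) is the substitution classically associated with sickle-cell anemia. The function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate a coding (sense-strand) DNA sequence into one-letter amino acids."""
    mrna = coding_dna.upper().replace("T", "U")  # transcription: U replaces T
    protein = []
    # Read the message three bases at a time; ignore a trailing partial codon.
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Illustrative fragment (an assumption, not an exact gene sequence).
normal = "ATGGTGCACCTGACTCCTGAGGAGAAG"
# Single-base change: the middle A of the first GAG codon becomes T (GAG -> GTG).
mutant = normal[:19] + "T" + normal[20:]

print(translate(normal))  # MVHLTPEEK
print(translate(mutant))  # MVHLTPVEK
```

One changed nucleotide produces exactly one changed amino acid (E to V), which is the pattern the sickle-cell slides describe.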
The End

The IUPAC-IUB symbols for nucleotide nomenclature are shown below:

Symbol  Meaning
G       Guanine
A       Adenine
C       Cytosine
T       Thymine
U       Uracil
R       Purine (A or G)
Y       Pyrimidine (C or T)
M       A or C
K       G or T
S       G or C
W       A or T
H       A or C or T
B       G or T or C
V       G or C or A
D       G or T or A
N       G, A, T, or C

List of Amino Acids and Their Abbreviations

Amino acid      3-letter code   1-letter code

Nonpolar (hydrophobic)
glycine         Gly             G
alanine         Ala             A
valine          Val             V
leucine         Leu             L
isoleucine      Ile             I
methionine      Met             M
phenylalanine   Phe             F
tryptophan      Trp             W
proline         Pro             P

Polar (hydrophilic)
serine          Ser             S
threonine       Thr             T
cysteine        Cys             C
tyrosine        Tyr             Y
asparagine      Asn             N
glutamine       Gln             Q

Electrically charged (negative and hydrophilic)
aspartic acid   Asp             D
glutamic acid   Glu             E

Electrically charged (positive and hydrophilic)
lysine          Lys             K
arginine        Arg             R
histidine       His             H

Others
X = unknown
* = STOP
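The IUPAC-IUB nucleotide symbols listed in this appendix are commonly used to write consensus sequences, such as those found in promoters. The sketch below shows one possible way to turn such a motif into a regular expression and search a DNA string with it. The helper names, the motif `TATAWAW`, and the test sequence are all hypothetical and chosen only for illustration. Treating U as T is a simplifying assumption for matching against DNA.

```python
import re

# IUPAC-IUB nucleotide symbols mapped to the DNA bases they stand for.
# Assumption: U is treated as T so that RNA-style motifs can match DNA.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def iupac_to_regex(motif: str) -> str:
    """Convert an IUPAC motif into a regular-expression pattern string."""
    parts = []
    for symbol in motif.upper():
        bases = IUPAC[symbol]
        # Ambiguous symbols become a character class such as [AT].
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def find_motif(motif: str, dna: str) -> list[int]:
    """Return 0-based start positions of every match, including overlapping ones."""
    # The lookahead lets the regex engine report overlapping occurrences.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [match.start() for match in pattern.finditer(dna.upper())]


# Hypothetical motif and sequence, for illustration only.
motif = "TATAWAW"
dna = "GGCTATAAAAGGCTATATATCC"

print(iupac_to_regex(motif))   # TATA[AT]A[AT]
print(find_motif(motif, dna))  # [3, 13]
```

A character class per ambiguous symbol keeps the mapping easy to check against the symbol table by eye.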