BINF 5445/4445
– Welcome!
– Please let me know if you would like to discuss a particular topic. If so, I will probably be able to schedule it.

BINF 5445/4445
– This week’s material:
  • Course info and syllabus
  • Overview of the field

Defining Bioinformatics (A1)
– To begin… write down your best guess!

Defining Bioinformatics II
– We might define 3 components:
  • Making databases of biological data
  • Manipulation of biological data(bases) with statistics and algorithms
  • Using software to work with biological databases using statistics and algorithms
– This does not mean other characterizations are incorrect

Defining Bioinformatics III
– Let’s distinguish bioinformatics from…
  • Computational Biology
  • Biology
  • Biotechnology
  • Databases
  • Information Technology
  • Computer/Information Science

Defining Bioinformatics IV
– What is
  • a bioinformatician?
  • a bioinformaticist?
– biostar.stackexchange.com/questions/1184/bioinformaticist-vs-bioinformatician says:
  • “-ist: ...member of a profession...”
  • “-ician: ...person skilled with a field...”
– What do you think?

Defining Bioinformatics V
– What is
  • a bioinformatician?
  • a bioinformaticist?
– Google has the following #s of hits:
  • Bioinformaticist: 9,700 (5/28/10); 43,700 but later in the day 182,000 (8/22/11); 26,400 (8/20/13)
  • Bioinformatician: 49,500 (5/28/10); 436,000 (8/22/11); 299,000 (8/20/13)

The Web and Bioinformatics
– Without the Web, bioinformatics would truly be a shadow of what it is!
– There are lots of network architectures
– The “internet” one is the winning WAN (WAN?)
– The internet communicates by the TCP/IP protocol
– The Web is built on the internet
– What’s a protocol?
– Like “Understanding Bioinformatics” is built on English
– The Web’s protocol is http (http?)
– http is built using TCP/IP
– Analogy: you and telephones (A2)

The Web: Infrastructure for Bioinformatics
– The Web is full of name servers
  • They are on-line databases… (of what?)
  • They contain lists of names and their IP addresses
  • IP addresses are numerical: 195.172.6.15
  • Names are, for example, www.ualr.edu
  • Names are only a convenience!
– The Web is full of Web browsers
  • These are the clients we know and love
  • A Web browser uses http to get hypermedia files

Search Engines
– Early Web did not have search engines
– There were “jump pages” I used to use
– Then there were keyword-based engines
– Then linkage-based (crowd sourced) engines
– Customizing engines …
– There are now also specialized engines (A3)
  • For biological literature, sequences, etc.
  • Can you name one?
– Any on-line database is a search engine!

The Sequencing Problem
– Sequence, v., to determine the order of the elementary units of a protein, or a nucleic acid (B1)
– You can sequence DNA, RNA, & protein
– Let’s start with DNA

DNA Sequencing “Principles”
– Vocabulary
– Approach is called chain termination sequencing
– Also called dideoxy sequencing
  • Why chain? Termination? Sequencing? Dideoxy?
– Also called Sanger method
  • Sanger?

Dideoxy Sequencing II
– Key terms are
  • Single-stranded DNA template
  • Primer
  • Deoxyribonucleoside triphosphates: dATP, dCTP, dGTP, dTTP
  • Dideoxyribonucleoside triphosphates: ddATP, ddCTP, ddGTP, ddTTP
  • Electrophoresis

A Question to Consider
– See Figure 1, p. 11, Westhead et al.
– How it works:
– Given a single DNA strand of 800 bp or less
– Make lots and lots of copies of it
– Try to build complementary strands.
  • Use dATP, dCTP, dGTP, & _____
  • DNA polymerase
– Halt the building at random places < 800
  • Use ddATP, ddCTP, _____, & _____

A Question to Consider
– Given a single DNA strand of 800 bp or less
– Make lots and lots of copies of it
– Try to build complementary strands.
  • Use dATP, dCTP, dGTP, & _____
  • DNA polymerase
– Halt the building at random places < 800
– Mix batches together, “read” order via PAGE
– Do it in 4 batches
  • Batch one: use ddATP, tag it with fluorescent “A”
  • Batch two: use ddCTP, tag it with fluorescent “C”
  • Batch three and four: use ____, tag it with ____
– PAGE: polyacrylamide gel electrophoresis
– See Figure 1 again

A Question to Consider
– See Figures 1 and 2, p. 11, Westhead et al.
– Is Figure 2 the result of the rectangle part of Figure 1?
– Is Fig. 2 the template sequence?
– Example of a real demo of Figure 1.

Combine the 800-bp pieces
– Not necessarily 800 bp
– Shotgun sequencing:
  • Break up long DNAs into pieces
  • Pieces from DNA molecule 1 overlap with pieces from DNA molecule 2
  • Sequence the pieces
  • Look for overlaps
  • String it all together

Cost and Future Trends
– Source: The Singularity is Near, by Ray Kurzweil, p. 73
– NIH wants Human genome for $100k by 2009: (http://www.wired.com/wiredscience/2008/07/british-institu/).
– So when will it be $100?

How to Sequence a Genome
– Shotgun sequencing
  • Break DNA molecule randomly into pieces
  • Sequence the pieces
  • Look for overlaps to assemble the full sequence
  • Why are there overlaps?
– Clone contig sequencing…

Clone Contig Sequencing
– “Subclones” DNA fragments
– Does it in a “rational manner”, “systematically” (Westhead et al., p. 12)
– Builds up the full sequence result
– Is shotgun sequencing irrational? Unsystematic? Explain…

(Supplementary Slides)

RNA Sequencing
– More variations in the bases than for DNA
– This makes sequencing more challenging than for DNA

Protein Sequencing
– Proteins are not built of nucleotides
  • They are made from ________?
– Like with RNA, the elementary units can have various modifications
  • “modified residues or other types of …modification…such as cleavage [and] disulfide bonds”
– Uses mass spectrometry (MS)
  • What is the basic idea of MS?

Quality control
– What would be the result of poor quality?
– Some vocabulary you might like to find out about: clone, contigs, repeats… enjoy the HW!
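
Supplementary sketch: shotgun assembly. The shotgun steps listed earlier (break into pieces, sequence the pieces, look for overlaps, string it all together) can be tried out in a few lines of Python. This is only a minimal greedy sketch under stated assumptions, not how a real assembler works: the function names `overlap` and `greedy_assemble`, the minimum overlap of 3 bases, and the toy reads are all made up for illustration, and the reads are assumed to be error-free and from the same strand.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two pieces with the largest overlap.

    Pieces that never overlap anything are left as separate contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical toy reads; each one shares 4 bases with the next.
reads = ["ATGCGTAC", "GTACCTGA", "CTGATTAG"]
print(greedy_assemble(reads))
```

Something to try for the HW vocabulary: give it reads that contain a repeated stretch and see whether the greedy choice still strings the pieces together in the right order.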
Single Pass Sequencing
– Quality is a problem

Kinds of DNA
– Genomic DNA
– mtDNA
– Coding DNA
– Noncoding DNA
– cDNA
– Recombinant DNA
– Which is found in the chromosomes?
– Which is not found in the cell of interest?
– Which has more junk? Less junk?
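
Supplementary sketch: dideoxy sequencing. The four-batch procedure from the “A Question to Consider” slides (build complementary strands, halt at random places, sort the fragments by length) can be simulated roughly as below. This is a toy model under loud assumptions: the names `terminated_fragments` and `read_sequence`, the 200 copies, the 0.3 stopping probability, and the example template are all invented for illustration, and the model ignores the primer, strand direction, and read errors.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def terminated_fragments(template, dd_base, copies=200, p_stop=0.3, rng=None):
    """Fragment lengths produced in the batch that contains one dideoxy base.

    Each copy is extended base by base; whenever the base to be added
    matches `dd_base`, the chain terminates there with probability `p_stop`.
    """
    rng = rng or random.Random(0)  # fixed seed so the toy run is repeatable
    lengths = set()
    for _ in range(copies):
        for pos, base in enumerate(template, start=1):
            # The growing strand receives the complement of the template base.
            if COMPLEMENT[base] == dd_base and rng.random() < p_stop:
                lengths.add(pos)  # dideoxy base incorporated: chain ends here
                break
    return lengths


def read_sequence(template, **kwargs):
    """Pool the four batches and read the tags in order of fragment length.

    Sorting by length stands in for the gel (PAGE) step.
    """
    tag_by_length = {}
    for dd_base in "ACGT":
        for length in terminated_fragments(template, dd_base, **kwargs):
            tag_by_length[length] = dd_base
    return "".join(tag_by_length[n] for n in sorted(tag_by_length))


template = "TACGGATCCATG"  # hypothetical 12-base template
print(read_sequence(template))
```

With enough copies, every position is hit at least once, so the read should spell out the strand complementary to the template. Compare the printed result with the template and ask the same question as for Figure 2: is this the template sequence?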