BINF 5445/4445
Welcome!

Please let me know if you would like to discuss a particular topic
– If so, I will probably be able to schedule it
This week’s material:
Course info and syllabus
Overview of the field
Defining Bioinformatics

(A1)
To begin… write down your best guess!
Defining Bioinformatics II

We might define 3 components:
– Making databases of biological data
– Manipulation of biological data(bases) with statistics and algorithms
– Using software to work with biological databases using statistics and algorithms
This does not mean other characterizations are incorrect
Defining Bioinformatics III

Let’s distinguish bioinformatics from…
– Computational Biology
– Biology
– Biotechnology
– Databases
– Information Technology
– Computer/Information Science
Defining Bioinformatics IV

What is
– a bioinformatician?
– a bioinformaticist?
biostar.stackexchange.com/questions/1184/bioinformaticist-vs-bioinformatician says:
– “-ist: ...member of a profession...”
– “-ician: ...person skilled with a field...”
What do you think?
Defining Bioinformatics V

What is
– a bioinformatician?
– a bioinformaticist?
Google has the following #s of hits:
– Bioinformaticist
  • 9,700 (5/28/10); 43,700 but later in the day 182,000 (8/22/11); 26,400 (8/20/13)
– Bioinformatician
  • 49,500 (5/28/10); 436,000 (8/22/11); 299,000 (8/20/13)
The Web and Bioinformatics

Without the Web, bioinformatics would truly be a shadow of what it is!
There are lots of network architectures
The “internet” one is the winning WAN (WAN?)
The internet communicates by the TCP/IP protocol
– What’s a protocol?
The Web is built on the internet
– Like “Understanding Bioinformatics” is built on English
The Web’s protocol is http (http?)
– http is built using TCP/IP
– Analogy: you and telephones
(A2)
The Web: Infrastructure for Bioinformatics

The Web is full of name servers
– They are on-line databases… (of what?)
– They contain lists of names and their IP addresses
  • IP addresses are numerical: 195.172.6.15
  • Names are, for example, www.ualr.edu
– Names are only a convenience!
The Web is full of Web browsers
– These are the clients we know and love
– A Web browser uses http to get hypermedia files
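To make the two ideas on this slide concrete, here is a minimal sketch in Python. It does not contact any real name server or Web site. The lookup table is hard-coded, the pairing of www.ualr.edu with 195.172.6.15 simply reuses the two examples above and is not claimed to be the real address, and the function names `resolve` and `build_get_request` are made up for illustration.

```python
import ipaddress

# Hypothetical, hard-coded table standing in for a real name server.
# The name/address pairing is illustrative only.
NAME_TABLE = {
    "www.ualr.edu": "195.172.6.15",
}


def resolve(name):
    """Look up a host name and return its numerical IP address as text."""
    return NAME_TABLE[name.lower()]


def build_get_request(host, path="/"):
    """Compose the text of a minimal HTTP/1.1 GET request for one file."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


address = resolve("www.ualr.edu")
# An IPv4 address is one 32-bit number; the dotted form is for people.
as_integer = int(ipaddress.ip_address(address))
request_text = build_get_request("www.ualr.edu")

print(address, as_integer)
print(request_text)
```

A real browser would ask an actual name server for the address, open a TCP/IP connection to it, and send text much like `request_text` over that connection. The name appears only in the lookup and in the `Host:` line, which is one way to see why names are a convenience.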
Search Engines

Early Web did not have search engines
– There were “jump pages” I used to use
– Then there were keyword-based engines
– Then linkage-based (crowdsourced) engines
– Customizing engines …
There are now also specialized engines
– For biological literature, sequences, etc.
– Can you name one?
– Any on-line database is a search engine!
(A3)
The Sequencing Problem

Sequence, v., to determine the order of the elementary units of
– a protein, or
– a nucleic acid
You can sequence DNA, RNA, & protein
Let’s start with DNA
(B1)
DNA Sequencing “Principles”

Vocabulary
Approach is called
– chain termination sequencing
  • Why chain? Termination? Sequencing?
– Also called dideoxy sequencing
  • Dideoxy?
– Also called Sanger method
  • Sanger?
Dideoxy Sequencing II

Key terms are
– Single-stranded DNA template
– Primer
– Deoxyribonucleoside triphosphates
  • dATP, dCTP, dGTP, dTTP
– Dideoxyribonucleoside triphosphates
  • ddATP, ddCTP, ddGTP, ddTTP
– Electrophoresis
A Question to Consider

See Figure 1, p. 11, Westhead et al.
How it works:
– Given a single DNA strand of 800 bp or less
– Make lots and lots of copies of it
– Try to build complementary strands. Use
  • dATP, dCTP, dGTP, & _____
  • DNA polymerase
– Halt the building at random places < 800
  • Use ddATP, ddCTP, _____, & _____
A Question to Consider

– Given a single DNA strand of 800 bp or less
– Make lots and lots of copies of it
– Try to build complementary strands. Use
  • dATP, dCTP, dGTP, & _____
  • DNA polymerase
– Halt the building at random places < 800
  • Do it in 4 batches
  • Batch one: use ddATP, tag it with fluorescent “A”
  • Batch two: use ddCTP, tag it with fluorescent “C”
  • Batch three and four: use ____, tag it with ____
– Mix batches together, “read” order via PAGE
  • PAGE: polyacrylamide gel electrophoresis
  • See Figure 1 again
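The batch procedure above can be imitated with a small simulation. This is only a sketch under simplifying assumptions: strand direction and the primer are ignored, termination is modeled as a fixed random chance (`dd_fraction`) at each matching base, and the gel is replaced by sorting fragment lengths. The template sequence and all function names are made up for illustration.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_strand(template):
    """Return the base-by-base complement of the template."""
    return "".join(COMPLEMENT[base] for base in template)


def run_batch(new_strand, dd_base, copies, dd_fraction, rng):
    """Build many copies; a copy may stop at any occurrence of dd_base.

    Returns the set of fragment lengths that were terminated in this batch.
    """
    lengths = set()
    for _ in range(copies):
        for position, base in enumerate(new_strand, start=1):
            # An incorporated dideoxy nucleotide ends the chain here.
            if base == dd_base and rng.random() < dd_fraction:
                lengths.add(position)
                break
    return lengths


def sequence_by_termination(template, copies=2000, dd_fraction=0.2, seed=0):
    """Run four batches (one per dideoxy base) and mix the tagged fragments."""
    rng = random.Random(seed)
    new_strand = complement_strand(template)
    tagged = {}  # fragment length -> fluorescent tag ("A", "C", "G" or "T")
    for dd_base in "ACGT":
        for length in run_batch(new_strand, dd_base, copies, dd_fraction, rng):
            tagged[length] = dd_base
    return tagged


def read_gel(tagged):
    """Electrophoresis orders fragments by length; read tags shortest first."""
    return "".join(tagged[length] for length in sorted(tagged))


template = "ATGCCGTAAGCT"  # made-up example sequence
print(read_gel(sequence_by_termination(template)))
```

With enough copies, every position ends up as the last base of some fragment, so reading the tags from shortest to longest fragment spells out the strand that was built. Compare the printed read with `template` when thinking about the questions on the next slide.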
A Question to Consider
See Figures 1 and 2, p. 11, Westhead et al.
– Is Figure 2 the result of the rectangle part of Figure 1?
– Is Fig. 2 the template sequence?

Example of a real demo of Figure 1.
Combine the 800-bp pieces

Not necessarily 800 bp
Shotgun sequencing:
– Break up long DNAs into pieces
– Pieces from DNA molecule 1 overlap with pieces from DNA molecule 2
– Sequence the pieces
– Look for overlaps
– String it all together
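The last two steps, "look for overlaps" and "string it all together", can be sketched as a greedy merge. This is a toy illustration, not how production assemblers work: it assumes error-free reads from one strand, and the function names, the `min_overlap` parameter, and the example reads are all made up.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of pieces with the largest overlap."""
    pieces = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(pieces) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(pieces):
            for j, right in enumerate(pieces):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; keep separate pieces
        merged = pieces[best_i] + pieces[best_j][best_size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)
    return pieces


# Three made-up reads taken from the made-up sequence ATGCGTACGTTAGC.
reads = ["GTACGTTA", "CGTTAGC", "ATGCGTAC"]
print(greedy_assemble(reads))
```

If some reads share no overlap, the function returns more than one piece instead of a single sequence. Repeated stretches of DNA can also fool a greedy merge like this one, since two reads may overlap by chance without being neighbors.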
Cost and Future Trends
Source: The Singularity is Near, by Ray Kurzweil, p. 73
NIH wants Human genome for $100k by 2009:
(http://www.wired.com/wiredscience/2008/07/british-institu/). So when will it be $100?
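One way to approach the "$100" question is simple arithmetic: pick a halving time for the cost and count how many halvings separate the two prices. The sketch below does only that. The halving times passed in are hypothetical inputs chosen for illustration, not measured figures, and the function name `years_until` is made up.

```python
import math


def years_until(target_cost, start_cost, halving_time_years):
    """Years for a cost to fall from start_cost to target_cost,
    assuming it halves once every halving_time_years."""
    halvings = math.log2(start_cost / target_cost)
    return halvings * halving_time_years


# Going from $100k to $100 is a factor of 1000, about 10 halvings.
for halving_time in (1.0, 2.0):  # hypothetical halving times, in years
    print(halving_time, round(years_until(100, 100_000, halving_time), 1))
```

The answer scales directly with the assumed halving time, so the interesting part of the question is which halving time the cost curve actually supports.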
How to Sequence a Genome

Shotgun sequencing
– Break DNA molecule randomly into pieces
– Sequence the pieces
– Look for overlaps to assemble the full sequence
– Why are there overlaps?
Clone contig sequencing…
Clone Contig Sequencing

“Subclones” DNA fragments
– Does it in a “rational manner”, “systematically” (Westhead et al., p. 12)
– Builds up the full sequence result
Is shotgun sequencing irrational? Unsystematic? Explain…
(Supplementary Slides)
RNA Sequencing

More variations in the base pairs than for DNA
– This makes sequencing more challenging than for DNA
Protein Sequencing

Proteins are not built of nucleotides
– They are made from ________?
Like with RNA, the elementary units can have various modifications
– “modified residues or other types of …modification…such as cleavage [and] disulfide bonds”
Uses mass spectrometry (MS)
– What is the basic idea of MS?
Quality control

What would be the result of poor quality?
Some vocabulary you might like to find out about: clone, contigs, repeats… enjoy the HW!
Single Pass Sequencing

Quality is a problem
Kinds of DNA

– Genomic DNA
– mtDNA
– Coding DNA
– Noncoding DNA
– cDNA
– Recombinant DNA
Which is found in the chromosomes?
Which is not found in the cell of interest?
Which has more junk? Less junk?