Download IntroBio520 - Nematode bioinformatics. Analysis tools and data

Bioinformatics
BIO520/INF520
Jim Lund
Assigned reading:
Ch1 & 2
Bioinformatics
Bioinformatics applies principles of information
science (derived from applied math, computer science,
and statistics) to make the vast, diverse, and complex
life sciences data more understandable and useful. It
automates simple but repetitive types of analysis.
Computational biology uses mathematical and
computational approaches to address theoretical and
experimental questions in biology.
BIO520 Topics
• Navigating biological databases.
• Sequence alignment.
• Proteins
– 3D structure visualization, prediction, motif analysis.
• DNA sequence annotation.
– Gene finding in prokaryotes and eukaryotes.
• RNA structure.
• Phylogenetic inference.
• Genome/transcriptome/proteome
– Function & Analyses.
Molecular information-DNA
• Raw bacterial DNA sequence
– Coding or not?
– Parse into genes?
– Find regulatory sequences?
– PCR primers, vector engineering?
– 4 bases: ACGT
• 1kb for a gene
• Mb for a genome
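The "Coding or not?" and "Parse into genes?" questions above can be illustrated with a minimal open reading frame (ORF) scan. This is only a sketch under stated assumptions, not a real gene finder: the function name `find_orfs` and the `min_codons` threshold are hypothetical, only the forward strand is scanned, and ATG is assumed to be the only start codon.

```python
# Standard stop codons on the DNA coding strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ATG..stop ORFs on the forward strand.

    start/end are 0-based, end is exclusive and includes the stop codon.
    min_codons counts the codons from ATG up to (not including) the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs

if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATTTGGGTAACC"):
        print(f"frame {frame}: {start}-{end}")
```

Real bacterial gene finding also scans the reverse strand, allows alternative start codons, and scores candidates statistically; the sketch only shows why a raw string of A, C, G, and T has to be parsed before it says anything about genes.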
[Figure: Growth of Genbank (1982-2009). Axes: Sequences (millions) and Base pairs (billions), by year. Source: http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html]
Protein Structure
Prediction
Proteomics
1978-1998
MALDI-TOF?
ESI-MS?
Metabolic Networks
KEGG, 1998
Regulatory Networks
KEGG
Bioinformatics-what is it?
Acquisition, curation, and analysis of
biological data
Hypothesis
Bioinformatic Data-1978 to 2008
• DNA sequence
• Gene expression
• Protein expression
• Protein Structure
• Genome mapping
• Metabolic networks
• Regulatory networks
• Trait mapping
• Gene function analysis
• Scientific literature
Goals of the HGP, 1998-2003
• Reference Human Genome Sequence
– Draft 2001, Finished in 2003
• Improved Sequence Technology
– $0.25 per finished base
• Human Genome Sequence Variation
• Technology for Functional Genomics
• Comparative Genomics
– Finish Mouse by 2005 (well ahead here)
• ELSI
Genome sequences highlight the finiteness of the set of sequences!
What remains to be done?
• Comparative Genomics
• Description of mRNAs, proteins (identity and structure)
• Functional analysis
• Detailed understanding of development, regulation, variation
The Gene for…
Other Reasons to Care
Affymetrix
Genentech
Biologist User Training
• Internet sites
– Range from high quality to unreliable.
• Unread documentation
• Popular program sites with NO documentation
– "Perhaps one day I will get around to writing some documentation" (Help from a WWW service, hit several hundred times per day!)
Dramatic Changes in Information Science
• Information Storage
– Digital: text, numbers, images
• Computerized Data Analysis
• Automated Data Analysis
• Information Distribution
– Internet, cloud, etc.
Moore’s Law
Intel Corporation
Computer Science and Bioinformatics
• Operating Systems
• Programming
• Algorithms
– New problems keep turning up!
• Data structure/databases
• Interfaces
• Search and visualization
BIO520 Nuts and Bolts
• Syllabus & Schedule
• Textbook
• Exams (2 + final)
• Grading:
– 12 labs: 10 pts
– Exams: 50 pts
– Final: 50 pts
• Labs on Fridays in Young B-35
– Internet
– Program documentation
http://elegans.uky.edu/520
Textbooks
Required textbook:
• Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum
Supplemental reading (don’t buy):
• Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, 3rd Ed.
– Baxevanis and Ouellette
Biology background material:
– Genes IX (Lewin)
– Cell Biology (Watson et al., Darnell et al.)
– NCBI Bookshelf (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books&itool=toolbar)
Computer Resources
• http://elegans.uky.edu/520
• Locally installed programs:
– Cn3D, Clustal, TreeView, Chime
• Web-based tools:
– Databases
– Software programs
Biological Principles
Evolution by natural selection
DNA->RNA->Protein
StructureFunction