Summer Bioinformatics Workshop 2008
Introduction to Bioinformatics
Chi-Cheng Lin, Ph.D., Professor
Department of Computer Science
Winona State University – Rochester
[email protected]
Outline
• What is Bioinformatics
• The Human Genome Project
• Applications of Bioinformatics
• References
Acknowledgement: The presentation includes adaptations from DOE’s “Human Genome
Project and Beyond Primer” and Dr. Yan Asmann’s (Mayo Clinic) lecture notes
Bioinformatics
• Living things have the ability to store,
utilize, and pass on information
• Bioinformatics strives to
– determine what information is biologically
important
– decipher how it is used to precisely control the
chemical environment within living organisms
What is Bioinformatics
• The collaboration of
Biology and Informatics
• Originally referred to the use of
computational tools to organize and
analyze genetic and protein sequence
data (first coined by Dr. Hwa Lim in 1988)
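To make "computational tools to organize and analyze sequence data" concrete, here is a minimal sketch of three elementary DNA sequence operations. The function names and the example sequence are made up for illustration and are not part of the original slides.

```python
from collections import Counter

# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def base_counts(seq: str) -> Counter:
    """Count how often each base occurs in a DNA sequence."""
    return Counter(seq.upper())


def gc_content(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite DNA strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    example = "ATGCGCTA"  # hypothetical sequence, for illustration only
    print(base_counts(example))
    print(gc_content(example))
    print(reverse_complement(example))
```

Real analyses typically rely on established libraries and databases; this sketch only shows the kind of operation involved.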
NCBI’s Definition of Bioinformatics
• NCBI (National Center for Biotechnology
Information, http://www.ncbi.nlm.nih.gov/)
– “Bioinformatics is the field of science in
which biology, computer science, and
information technology merge to form a single
discipline.”
– “The ultimate goal of the field is to enable the
discovery of new biological insights as well as
to create a global perspective from which
unifying principles in biology can be
discerned.”
Human Genome Project
Human Genome Project
• Goals include
– Identify genes in human DNA
– Determine sequence making up human DNA
– Store this information in databases
– Improve tools for data analysis
– Etc.
• Milestone
– April 2003: HGP sequencing is completed and
project is declared finished two years ahead
of schedule
Interesting Numbers characterizing the
Human Genome
• 3 billion:
– The number of chemical nucleotide bases (A, C, G,
and T) contained in the haploid human genome
• 3 million:
– The number of locations where single-base DNA
differences occur in the human genome
• 2.4 million:
– The number of bases comprising the largest known
human gene (the average gene comprises 3000
bases)
• 30,000:
– The total number of genes estimated (much lower
than previous estimates of 80,000 to 140,000)
Interesting Numbers characterizing the
Human Genome
• 99.9%
– Fraction of nucleotide bases that are exactly the
same in all people
• 50%
– Fraction of discovered genes for which function is
unknown
• 2%
– Fraction of genome that codes for proteins (the rest:
“junk”(?) DNA)
• 9%, 11%, 26%, 28%, 45%, 83%, 89%, and 95%
– The percentage of genes E. coli, rice, roundworm,
yeast, fruit fly, zebrafish, mouse, and chimpanzee
share with humans, respectively.
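The figures quoted on these slides can be cross-checked with a little arithmetic. The sketch below uses only the rounded numbers given above (3 billion bases, 3 million single-base difference sites, 2% protein-coding). The constant and function names are illustrative assumptions.

```python
# Rounded figures quoted on the slides (estimates, not exact measurements).
GENOME_BASES = 3_000_000_000   # bases in the haploid human genome
SNP_SITES = 3_000_000          # locations with single-base differences
CODING_FRACTION = 0.02         # fraction of the genome that codes for proteins


def variable_fraction(snp_sites: int, genome_bases: int) -> float:
    """Fraction of genome positions at which single-base differences occur."""
    return snp_sites / genome_bases


def identical_percent(snp_sites: int, genome_bases: int) -> float:
    """Percentage of positions that are the same in all people."""
    return 100.0 * (1.0 - variable_fraction(snp_sites, genome_bases))


def coding_bases(genome_bases: int, coding_fraction: float) -> float:
    """Approximate number of bases that code for proteins."""
    return genome_bases * coding_fraction


if __name__ == "__main__":
    print(f"Variable positions: {variable_fraction(SNP_SITES, GENOME_BASES):.1%}")
    print(f"Identical positions: {identical_percent(SNP_SITES, GENOME_BASES):.1f}%")
    print(f"Protein-coding bases: {coding_bases(GENOME_BASES, CODING_FRACTION):,.0f}")
```

Three million variable sites out of three billion bases is 0.1%, which is consistent with the 99.9% figure, and 2% of three billion bases is roughly 60 million protein-coding bases.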
How does the human genome stack up?
Organism                            Genome Size (Bases)   Estimated Genes
Human (Homo sapiens)                3 billion             30,000
Laboratory mouse (M. musculus)      2.6 billion           30,000
Mustard weed (A. thaliana)          100 million           25,000
Roundworm (C. elegans)              97 million            19,000
Fruit fly (D. melanogaster)         137 million           13,000
Yeast (S. cerevisiae)               12.1 million          6,000
Bacterium (E. coli)                 4.6 million           3,200
Human immunodeficiency virus (HIV)  9700                  9
Humans share most of the same protein families
with worms, flies, and plants!
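One way to read the comparison is to divide genome size by estimated gene count, which gives a rough "bases per gene" figure for each organism. The sketch below does this with the rounded values from the slide. The dictionary layout and function name are illustrative assumptions, and the result says nothing about actual gene lengths.

```python
# (genome size in bases, estimated gene count), rounded values from the slide.
GENOMES = {
    "Human": (3_000_000_000, 30_000),
    "Laboratory mouse": (2_600_000_000, 30_000),
    "Mustard weed": (100_000_000, 25_000),
    "Roundworm": (97_000_000, 19_000),
    "Fruit fly": (137_000_000, 13_000),
    "Yeast": (12_100_000, 6_000),
    "E. coli": (4_600_000, 3_200),
    "HIV": (9_700, 9),
}


def bases_per_gene(genome_size: int, gene_count: int) -> float:
    """Average number of genome bases per estimated gene."""
    return genome_size / gene_count


if __name__ == "__main__":
    # List organisms from the most gene-dense genome to the least.
    ranked = sorted(GENOMES.items(), key=lambda item: bases_per_gene(*item[1]))
    for name, (size, genes) in ranked:
        print(f"{name:18s} {bases_per_gene(size, genes):>12,.0f} bases per gene")
```

With these numbers the human genome has about 100,000 bases per estimated gene, while E. coli has fewer than 1,500, which illustrates how much of the human genome lies outside genes.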
Anticipated Benefits of Genome Research
• Molecular medicine
• Microbial genomics
• Bioarchaeology
• Anthropology
• Evolution
• Human Migration
• DNA identification (forensics)
• Agriculture, livestock breeding, and
bioprocessing
ELSI: Ethical, Legal, and Social Issues
• Privacy and confidentiality of genetic information
• Fairness in the use of genetic information
• Psychological impact, stigmatization, and discrimination
• Reproductive issues
• Clinical issues
• Uncertainties associated with gene tests for
susceptibilities and complex conditions
• Fairness in access to advanced genomic technologies
• Conceptual and philosophical implications
• Health and environmental issues
• Commercialization of products
Mike Thompson, Detroit, Michigan -- from The Detroit Free Press
Source: http://cagle.msnbc.com/news/gene/gene5.asp
Future Challenges:
What We Still Don’t Know
• Gene prediction and discovery
– location, function, structure, regulation, etc.
• Single-base DNA variations among individuals
– Correlation with health and disease
– Disease-susceptibility prediction
• Genes involved in complex traits and multigene disorders
• Protein conservation (structure and function)
• Proteomes (total protein content and function) in organisms
• Systems biology
– Coordination of gene expression and protein synthesis
– Interaction of proteins in complex molecular machines
– Microbial consortia useful for environmental restoration
• Developmental genetics and genomics
• Evolutionary conservation among organisms
• And many more …
Tackle Future Challenges:
Bioinformatics
• High volume of data to store, compute, and
analyze
• Huge amount of information to retrieve, interpret,
and visualize
• Complex system to study, model, and simulate
THAT’S WHY BIOINFORMATICS IS INDISPENSABLE!!
Genomics Studies
• Genomics
– Study of the whole genome
– Sequencing and annotating genomes
• Comparative genomics
– Comparison and characterization of genomes from different
species to identify genes and their functions and to investigate
evolutionary history
• Functional genomics
– Understanding the function of genes and other parts of the
genome
• Structural genomics
– Determining the 3D structure of all proteins
• Pharmacogenomics
– Study of how an individual's genetic inheritance affects the
body's response to drugs
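As a toy illustration of sequence comparison, the sketch below reports single-base differences and percent identity between two sequences. It assumes the sequences are already aligned and of equal length. Real comparative genomics relies on dedicated alignment tools, and the function names and example sequences here are hypothetical.

```python
def single_base_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List (position, base_a, base_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which the two sequences agree."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = single_base_differences(seq_a, seq_b)
    return 100.0 * (len(seq_a) - len(differences)) / len(seq_a)


if __name__ == "__main__":
    # Made-up sequences that differ at a single position (index 4).
    reference = "ACGTACGTAC"
    sample = "ACGTTCGTAC"
    print(single_base_differences(reference, sample))
    print(percent_identity(reference, sample))
```

Positions are zero-based, so the single difference in the example is reported at index 4.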
Genome Sequencing
Drew Sheneman, New Jersey -- The Newark Star Ledger
Source: http://cagle.msnbc.com/news/gene/gene14.asp
Human Migration Patterns using DNA Sequences
Medicine and the New Genetics
Gene Testing, Pharmacogenomics, Gene Therapy
• Anticipated benefits:
– Improved diagnosis of disease
– Earlier detection of genetic predispositions
to disease
– Pharmacogenomics:
• Genetic testing before prescribing drugs
• Dose-selection based on genetic variations
• Drugs tailor-made to each patient
However, the application of pharmacogenomics in medical
practice is still quite limited today because genetic
information from large populations is lacking.
References
• NCBI (National Center for Biotechnology
Information) homepage
http://www.ncbi.nlm.nih.gov/
• NCBI Science Primer
http://www.ncbi.nlm.nih.gov/About/primer/
• Human Genome Project Information
http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml
(esp. link to the Education module)
• The Human Genome Project and Beyond Primer
http://www.ornl.gov/sci/techresources/Human_Genome/publicat/primer2001/primer.ppt