BIOINFORMATICS
Ayesha Masrur Khan
Spring 2013
Lec-2
Bioinformatics
A complete understanding of the term
The National Center for Biotechnology Information (NCBI,
2001) defines bioinformatics as:
"Bioinformatics is the field of science in which biology,
computer science, and information technology merge into a
single discipline. There are three important sub-disciplines
within bioinformatics: the development of new algorithms and
statistics with which to assess relationships among members
of large data sets; the analysis and interpretation of various
types of data including nucleotide and amino acid sequences,
protein domains, and protein structures; and the development
and implementation of tools that enable efficient access and
management of different types of information."
Bioinformatics-Aim
• It is not just “informatics”
• Bioinformatics is the field of science in which biology, computer
science, mathematics and information technology merge into a
single discipline. The ultimate goal of the field is to enable the
discovery of new biological insights as well as to create a global
perspective from which unifying principles in biology can be
discerned.
• We want to be able to understand the words in a sequence sentence
that form a particular protein structure, and one day to be able to
write sentences (design proteins) of our own.
• Furthermore, this new knowledge could have profound impacts on
fields as varied as human health, agriculture, the environment,
energy and biotechnology.
Bioinformaticists, Bioinformaticians &
Bioinformatics scientists
A Bioinformaticist versus a Bioinformatician (1999):
Bioinformatics has become such a mainstay of genomics, proteomics,
and all other *omics (such as phenomics) that many information
technology companies have entered the business or are
considering entering it, creating a convergence of IT (information
technology) and BT (biotechnology).
• A bioinformaticist is an expert who not only knows how to use
bioinformatics tools, but also knows how to write interfaces for
effective use of the tools.
• A bioinformatician, on the other hand, is a trained individual
who only knows how to use bioinformatics tools, without a deeper
understanding of them.
Bioinformaticists, Bioinformaticians &
Bioinformatics scientists
There are bioinformaticists interested in the
theory behind the manipulation of biological
data, and there are bioinformatics scientists
concerned with the data itself and its
biological implications.
Challenges facing the bioinformatics
community
Mass of Data
- Need to provide easy and reliable access to this
data
- This data itself is meaningless before analysis and
the sheer volume present makes it impossible for
even a trained biologist to begin to interpret it
manually
- Incisive computer tools must be developed to
allow the extraction of meaningful biological
information
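As a minimal sketch of what such a tool might look like at the smallest scale, the Python below reads sequence records in FASTA format and reports the GC content of each one. The record names and sequences are made up for illustration, and `read_fasta` and `gc_content` are hypothetical helper names, not part of any particular library.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def gc_content(seq):
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Made-up records, for illustration only.
EXAMPLE = """>seq1 hypothetical record
ATGCGC
GGCC
>seq2 hypothetical record
ATATATAT
"""

summary = {
    header.split()[0]: round(gc_content(seq), 2)
    for header, seq in read_fasta(StringIO(EXAMPLE))
}
print(summary)
```

The same loop works unchanged on a file with millions of records, which is the point of the slide: the summary is produced by the tool, not by manual inspection.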
Earliest Efforts in Bioinformatics
Bioinformatics can be traced back over a century to Gregor Mendel,
known as the Father of Genetics, and his genetic record keeping.
He cross-fertilized flowers of different colors within the same
species and kept careful records of the colors of the flowers he
crossed and the color(s) of the flowers they produced.
Mendel illustrated that the inheritance of traits could be more
easily explained if it was controlled by factors passed down
from generation to generation.
Terms that need to be understood
• Homology: denotes a divergent relationship between
sequences, i.e. descent from a common ancestor. It is
absolute: sequences either are homologous or are not.
• Analogy: similarity based on similar folds or similar
catalytic residues; it can denote either a divergent
or a convergent relationship.
• Orthology: homologous proteins that perform the same
function in different species (they arise by speciation).
• Paralogy: homologous proteins that perform different but
related functions within one organism (they arise by
gene duplication).
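The orthology and paralogy definitions above can be read as a simple decision rule. The sketch below is a toy, assuming that homology is already known and recorded as a shared `family` label; real orthology assignment relies on phylogenetic analysis, not on labels. The `Protein` class, its fields, and `relationship` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    name: str
    species: str
    family: str    # assumed: same family means homologous
    function: str  # assumed: a curated function label


def relationship(a, b):
    """Classify a protein pair using the simplified slide definitions."""
    if a.family != b.family:
        return "not homologous"
    if a.function == b.function and a.species != b.species:
        return "orthologs"   # same function, different species
    return "paralogs"        # homologous, but different or duplicated role


human_hba = Protein("HBA", "human", "globin", "alpha-globin")
human_hbb = Protein("HBB", "human", "globin", "beta-globin")
mouse_hba = Protein("Hba", "mouse", "globin", "alpha-globin")

print(relationship(human_hba, mouse_hba))  # orthologs
print(relationship(human_hba, human_hbb))  # paralogs
```

Note that the rule also labels homologs with different functions in different species as paralogs, which goes slightly beyond the within-one-organism wording of the definition above.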
Origin of bioinformatic/biological
databases
The first bioinformatic/biological databases were constructed a
few years after the first protein sequences became available.
• The first protein sequence reported was that of bovine insulin in
1956, consisting of 51 residues.
• Nearly a decade later, the first nucleic acid sequence was
reported, that of yeast alanine tRNA with 77 bases.
• Just a year later, Dayhoff gathered all the available sequence data
to create the first bioinformatic database.
• The Protein Data Bank followed in 1972 with a collection of ten
X-ray crystallographic protein structures.
• The SWISSPROT protein sequence database began in 1987.
Types of data available
Enormous amounts of data available publicly
– DNA/RNA sequence
– SNPs
– protein sequence
– protein structure
– protein function
– organism‐specific databases
– genomes
– gene expression
– biomolecular interactions
– molecular pathways
– scientific literature
– disease information
Three Central biological processes around which
bioinformatics tools must be developed:
DNA sequence determines protein sequence
Protein sequence determines protein structure
Protein structure determines protein function
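Only the first of these three steps reduces to a simple lookup. As a rough sketch, the Python below translates a DNA coding sequence into a protein sequence using the standard genetic code. The function name `translate` and the example sequence are assumptions made for illustration; the other two steps (sequence to structure, structure to function) are not covered and have no comparably simple rule.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence, stopping at the first stop codon.

    Reads from the first base in frame; a trailing incomplete codon is
    ignored. Raises KeyError for characters other than A, C, G, T.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up coding sequence: ATG TTT GGG TGA
print(translate("ATGTTTGGGTGA"))  # MFG
```

The output uses one-letter amino acid codes (M = methionine, F = phenylalanine, G = glycine).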