Download slides

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

DNA wikipedia , lookup

Zinc finger nuclease wikipedia , lookup

SNP genotyping wikipedia , lookup

Biomolecular engineering wikipedia , lookup

Genetic engineering wikipedia , lookup

Biology wikipedia , lookup

DNA vaccination wikipedia , lookup

Molecular cloning wikipedia , lookup

Life wikipedia , lookup

DNA-encoded chemical library wikipedia , lookup

History of molecular biology wikipedia , lookup

Cre-Lox recombination wikipedia , lookup

Non-coding DNA wikipedia , lookup

Introduction to genetics wikipedia , lookup

Vectors in gene therapy wikipedia , lookup

Molecular paleontology wikipedia , lookup

Genetics wikipedia , lookup

History of genetic engineering wikipedia , lookup

Transcript
CSE 6406: Bioinformatics Algorithms
Course Outline
http://203.208.166.84/masudhasan/cse6406_April_08.html
Why Bioinformatics ?
•
•
•
•
New topic
Huge scope in research, higher studies, and job market
Expected to dominate in future research in computer science
Scope for diversive research fields:
•
•
•
•
•
•
AI
Algorithms
Database
Data mining
Software
System biology
• Other hot topics of current time:
• Wireless communication, Ad-hoc networks, Sensor networks,
Database …
Our Strategy
• We would say “Algorithms Motivated by/Initiated from/Applied in
Bioinformatics”
• Main focus on algorithms, not on biology. We will study from
algorithm point of view, not from biological point of view
• We will quickly formulate the biological to an algorithmic problem
• We will meet many of our old friends in algorithms:
•
Exhaustive search, Greedy algorithms, Dynamic programming, Divide
and conquer, Graph algorithms, Trees, etc.
• How much biology do I need to know ?
•
•
Content from the text would suffice
Should not be nothing more than school biology
Brief Introduction
[1]: Ch 3, [2]: Ch 13
What is Life made of?
What is Life made of?
• All living things are made of Cells
• What is Inside the cell: DNA, RNA, Proteins
Cells
• Fundamental working units of every living system
• Every organism is composed of one of two types of cells:
•
•
prokaryotic cells or
eukaryotic cells
Life begins with Cell
Overview of organizations of life
•
•
•
•
Nucleus = library
Chromosomes = bookshelves
Genes = books
Almost every cell in an organism contains the same libraries
and the same sets of books.
• Books represent all the information (DNA) that every cell in
the body needs so it can grow and carry out its various
functions.
More Terminology
• The genome is an organism’s complete set of DNA.
• a bacteria contains about 600,000 DNA base pairs
• human and mouse genomes have some 3 billion.
• Human genome has 24 distinct chromosomes.
• Each chromosome contains many genes.
• Gene
• basic physical and functional units of heredity.
• specific sequences of DNA bases that encode instructions on
how to make proteins.
• Proteins
• Make up the cellular structure
• large, complex molecules made up of smaller subunits called
amino acids.
All Life depends on 3 critical molecules
• DNA
• Hold information on how cell works
• RNA
• Act to transfer short pieces of information to different parts of cell
• Provide templates to synthesize into protein
• Protein
• Form enzymes that send signals to other cells and regulate gene
activity
• Form body’s major components (e.g. hair, skin, etc.)
DNA: The Code of Life
• The structure and the four genomic letters code for all living
organisms
• Adenine, Guanine, Thymine, and Cytosine which pair A-T and C-G
on complimentary strands.
DNA, continued
• DNA has a double helix
structure which
composed of
• sugar molecule
• phosphate group
• and a base (A,C,G,T)
DNA, RNA, and the Flow of
Information
Replication
Transcription
Translation
DNA the Genetics Makeup
DNA: The Basis of Life
• Humans have about 3 billion base
pairs.
• How do you package it into a cell?
• How does the cell know where in
the highly packed DNA where to
start transcription?
• DNA size does not mean more
complex
• Complexity of DNA
• Eukaryotic genomes consist of
variable amounts of DNA
• Single Copy or Unique DNA
• Highly Repetitive DNA
RNA
• RNA is similar to DNA chemically. It is usually only a single
strand. T(hyamine) is replaced by U(racil)
• Some forms of RNA can form secondary structures by “pairing
up” with itself. This can have change its properties dramatically.
• DNA and RNA can pair with each other.
Protein Folding
• Proteins are not linear structures,
though they are built that way
• Proteins tend to fold into the
lowest free energy conformation.
• Its structure determines its function
DNA Operations
• Copying DNA
• Cutting and Pasting DNA
• Measuring DNA Length
• DNA sequencing
• Probing DNA
Genetic Variation
• Despite the wide range of physical variation, genetic variation
between individuals is quite small.
• Out of 3 billion nucleotides, only roughly 3 million base pairs
(0.1%) are different between individual genomes of humans.
• Although there is a finite number of possible variations, the
number is so high (43,000,000) that we can assume no two
individual people have the same genome.
• What is the cause of this genetic variation?
Sources of Genetic Variation
• Mutations are rare errors in the DNA replication process that
occur at random.
• Recombination is the shuffling of genes that occurs through
sexual mating and is the main source of genetic variation.
• Others…..
Molecular evolution can be visualized
with phylogenetic tree.
Turnip and Cabbage
• Cabbages and turnips share a common ancestor
Genetic Similarities Between
Turnip and Cabbage
• In 1980s, scientists discovered evolutionary change in plants
by comparing mitochondrial genomes of the cabbage and
turnip
• 99% similarity between genes
• These more or less identical gene sequence surprisingly
differed in gene order
Important discovery
DNA Reversal
5’ A T G C C T G T A C T A 3’
3’ T A C G G A C A T G A T 5’
Break
and
Invert
5’ A T G T A C A G G C T A 3’
3’ T A C A T G T C C G A T 5’
Algorithmic Problem
• Given two strings (of same set of characters) find a sequence
of reversals of substrings that will transform one to other
• Biologist are interested in shortest such sequence
• Which makes the algorithm more challenging, and it is one of
the most studied problem in algorithmic bioinformatics !!!
Thanks