Bioinformatics
Lecturer: Antinisca Di Marco
Tutor: Francesco Gallo
E–mails: [email protected]
For appointments
• Di Marco: Tuesday 3:30-4:30 p.m.,
Friday 10:00-11:00 a.m.
• Please request an appointment via email
Syllabus
Topics
• Biological definitions and concepts (DNA, RNA,
microRNA, central dogma, mutations, …)
• Main biological online databases. Data extraction from
such systems.
• Alignment algorithms and substitution matrices. Suffix
trees.
• Phylogenetic analysis. Algorithms to build phylogenetic
trees. Testing and accuracy.
• Computational models for biological systems.
Petri nets and hidden Markov models.
• Technologies and languages: Python, Biopython and
Neo4j
Laboratory
Material
• Selected chapters from two books
• Online database manuals (items 1-2 of the Syllabus)
• Selected scientific papers (items 5-6 of the Syllabus)
Exam
Project: a practical project, developed together with
biotechnologists, that uses the concepts introduced
in the course.
Oral exam: it includes the project discussion and
the presentation of a scientific paper selected
by the lecturer.
Mid-term exam
Lecture Times
Thursday 11:00-13:00,
Friday 11:00-13:00
Course Web Page
www.di.univaq.it/teaching/bioinfo15-16
Introduction to Bioinformatics
Bioinformatics
Bioinformatics is a multidisciplinary research field
whose objective is to understand biological
phenomena and mechanisms.
Disciplines involved: biology, biochemistry, computer
science and statistics.
It devises and develops:
• Systems to collect and retrieve biological data.
• Mathematical and statistical techniques and
methods for biological data analysis.
• Computational techniques for the management
and analysis of biological data.
Bioinformatics
In particular, we focus on the molecular mechanisms
(molecular biology) that constitute the basis of life.
We focus on the evolution of genetic material
(molecular evolution).
We study the evolutionary process of DNA, RNA and
proteins.
Information transmission -> error -> DNA sequence
mutation
• Mutation in a single individual.
• Mutation fixed in the whole population or in a part of it.
Bioinformatics
Typically, in bioinformatics the study and
analysis of biological data is comparative and carried
out on nucleotide or amino acid sequences.
Phylogenetic analysis aims to study the
history of evolution
by building the phylogenetic tree that specifies the most
probable evolutionary history among species.
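As a rough illustration of what "comparative" analysis on sequences can mean, the sketch below computes the proportion of differing positions (often called the p-distance) between pre-aligned sequences of equal length. This is only one simple distance measure, chosen here for illustration and not necessarily the one used in the course; the species names and sequences are made up.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


# Hypothetical aligned sequences, invented for this example.
sequences = {
    "species_1": "ACGTACGT",
    "species_2": "ACGTACGA",
    "species_3": "ACGAACGA",
}

# Pairwise distances: the typical input of distance-based tree building.
distances = {
    (name_a, name_b): p_distance(sequences[name_a], sequences[name_b])
    for name_a, name_b in combinations(sequences, 2)
}

for pair, distance in distances.items():
    print(pair, distance)
```

A distance-based tree-building method would start from a table like `distances`; smaller values suggest more closely related sequences.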
Different studies
Bioinformatics
Algorithms for the management and
analysis of biological data (mainly
molecular and DNA data), e.g., sequence
analysis and phylogenetic analysis.
Systems Biology
Computational (formal) models to
describe and analyse biological
phenomena and to predict specific
aspects of the modelled phenomena.
Computer graphics techniques to graphically
represent the behavior of biological phenomena or to
identify specific situations (diagnosis).
Biomedical applications
Molecular Biology Basics
Cell
• Cells are considered the basic units of life, in
part because they come in discrete and
easily recognizable packages.
• A cell is composed of a set of molecules
separated from the environment by a membrane.
• It has a metabolism.
• It reproduces itself.
• Cells are eukaryotic or prokaryotic.
• Viruses: non-cellular organisms.
Eukaryotic Cells
• Organisms composed of one or more cells that have a
well-differentiated nucleus, which contains the majority of
the cellular DNA and is enclosed in a porous envelope
formed by two membranes.
• The DNA is therefore retained in a compartment
separated from the rest of the contents of the
cell, namely the cytoplasm, in which most of the
reactions of cell metabolism take place.
Prokaryotic Cell
• Prokaryotic cells are cells that lack a well-defined nucleus
and are bounded by the cell membrane.
• Prokaryotic cells, unlike eukaryotic ones, do not
possess organelles, except for ribosomes, and have a very
simple internal structure.
• Since there is no nucleus, the DNA is
scattered in the cytoplasm.
• The cellular genome is simpler
and consists of a
single circular DNA molecule,
in addition to any autonomous
replicons. There is no
nuclear membrane.
• The cytoplasm contains the DNA and the ribosomes.
Bio-molecules
• Macromolecules:
– DNA
– RNA
– Proteins
These molecules define the structural and
functional characteristics of cells.
– Saccharides
– Lipids
Mainly energy metabolism
• Metabolites (small molecules)
DNA
• Deoxyribonucleic acid (DNA) is a nucleic acid that
contains the genetic information necessary for the
biosynthesis of RNA and proteins (essential molecules for the
development and proper functioning of most living
organisms).
• From a chemical standpoint, DNA is an organic polymer
made of nucleotides carrying one of four bases: adenine (A),
thymine (T), guanine (G), cytosine (C).
• The process of genetic translation (protein synthesis) is
possible only in the presence of an intermediate RNA
molecule, which is generated by the process known as
transcription.
RNA
• Polymers made of nucleotides carrying one of four bases:
adenine (A), uracil (U), guanine (G), cytosine (C)
• Usually single-stranded
• RNA molecules are synthesized through a process
known as transcription, in which a strand of DNA is
copied into the corresponding strand of RNA.
• There are three common types of RNA in all cellular
organisms:
– mRNA (messenger RNA), which contains the information for the
synthesis of proteins;
– rRNA (ribosomal RNA), which is part of the structure of the
ribosome;
– tRNA (transfer RNA), needed for translation at the ribosomes.
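Transcription as described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of the real enzymatic process: it assumes plain uppercase or lowercase strings over A, C, G, T, both written 5'→3', and the function names are invented for this example.

```python
# Base pairing used when building RNA against a DNA template strand.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_coding_strand(coding_dna: str) -> str:
    """RNA has the same sequence as the coding strand, with U instead of T."""
    return coding_dna.upper().replace("T", "U")


def transcribe_template_strand(template_dna: str) -> str:
    """Build the RNA complementary to a template strand written 5'->3'.

    The template is read in the 3'->5' direction, so it is reversed first
    and each base is then replaced by its RNA complement.
    """
    return "".join(
        DNA_TO_RNA_COMPLEMENT[base] for base in reversed(template_dna.upper())
    )


coding = "ATGGCC"      # hypothetical coding strand
template = "GGCCAT"    # its reverse complement, i.e. the template strand

print(transcribe_coding_strand(coding))      # AUGGCC
print(transcribe_template_strand(template))  # AUGGCC
```

Both calls give the same RNA because the two input strands are complementary to each other.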
Proteins
• Proteins are large biological molecules formed by one
or more chains of amino acids (polymers built from 20
different amino acids).
• The amino acid sequence of the polypeptide chain is the
primary structure of the protein.
• Protein functions: metabolism, energy, transcription, protein
synthesis, transport, communication, cell cycle, ...
Central Dogma
DNA→RNA→Protein
Central Dogma
• Transcription and translation are the two
main processes linking gene to protein
• Genes provide the instructions for making
specific proteins.
• The bridge between DNA and protein synthesis
is RNA.
• RNA is chemically similar to DNA, except that it
contains ribose as its sugar and substitutes the
nitrogenous base uracil for thymine.
– An RNA molecule almost always consists of a single
strand.
DNA→RNA→Protein
• DNA is TRANSCRIBED to
messenger RNA (mRNA)
• mRNA carries the message to
the ribosome
• mRNA is TRANSLATED to an
amino acid chain, which makes
up proteins; transfer RNA (tRNA)
delivers the amino acids
• In DNA or RNA, the four nucleotide
monomers act like the letters of the alphabet
to communicate information.
• The specific sequence of hundreds or
thousands of nucleotides in each gene
carries the information for the primary
structure of a protein (the linear order of the
20 possible amino acids)
• To get from DNA, written in one chemical
language, to protein, written in another,
requires two major stages, transcription and
translation.
• During transcription, a DNA strand
provides a template for the synthesis of a
complementary RNA strand.
– This process is used to synthesize any type of RNA
from a DNA template.
• Transcription of a gene produces a messenger
RNA (mRNA) molecule.
– mRNA carries the message from the nucleus to the
ribosomes
• During translation, the information contained in
the order of nucleotides in mRNA is used to
determine the amino acid sequence of a
polypeptide.
– Translation occurs at ribosomes.
• To summarize, genes program protein
synthesis via messenger RNA.
• The molecular chain of command in a cell is
DNA → RNA → protein.
This is referred to as the Central Dogma of
molecular biology.
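The translation step can be illustrated with a small Python sketch. It assumes the standard genetic code, an mRNA string that is already in the correct reading frame, and a hypothetical function name `translate`; a real analysis would more likely rely on a library such as Biopython.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an in-frame mRNA into one-letter amino acid codes.

    Reading starts at position 0 and stops at the first stop codon
    or when fewer than three nucleotides remain.
    """
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCCUAA"))  # MA
```

Here `AUG` codes for methionine (M), `GCC` for alanine (A), and `UAA` is a stop codon, so translation ends after two amino acids.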
Mutations: alterations
of the information encoded
in DNA
• Substitutions: change of a single base
• Insertions: addition of nucleotides
• Deletions: removal of nucleotides
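The three kinds of mutation listed above can be sketched as operations on a sequence string. This is a minimal illustration with 0-based positions; the function names and the example sequence are hypothetical.

```python
def substitute(sequence: str, position: int, new_base: str) -> str:
    """Substitution: replace the single base at `position`."""
    return sequence[:position] + new_base + sequence[position + 1:]


def insert(sequence: str, position: int, new_bases: str) -> str:
    """Insertion: add nucleotides before the base at `position`."""
    return sequence[:position] + new_bases + sequence[position:]


def delete(sequence: str, position: int, length: int = 1) -> str:
    """Deletion: remove `length` nucleotides starting at `position`."""
    return sequence[:position] + sequence[position + length:]


original = "ATGGCC"  # made-up example sequence

print(substitute(original, 3, "A"))  # ATGACC
print(insert(original, 3, "TT"))     # ATGTTGCC
print(delete(original, 3, 2))        # ATGC
```

A substitution keeps the sequence length unchanged, while insertions and deletions change it.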
Mutations in coding
sequences
• Substitutions
– synonymous: do not change the amino acid
– missense: change an amino acid into a different one
– nonsense: change an amino acid codon into a stop codon
• Insertions/Deletions
– Reading frame preserved (multiples of three)
– Frameshift
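The classification above can be expressed as a short Python sketch. It assumes the standard genetic code, DNA codons taken from the coding strand, and a reference codon that is not itself a stop codon; the function names are invented for this example.

```python
from itertools import product

# Standard genetic code for DNA codons (bases in the order T, C, A, G).
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(reference_codon: str, mutated_codon: str) -> str:
    """Classify a substitution by its effect on the encoded amino acid.

    Assumes `reference_codon` encodes an amino acid (is not a stop codon).
    """
    reference_aa = CODON_TABLE[reference_codon]
    mutated_aa = CODON_TABLE[mutated_codon]
    if mutated_aa == reference_aa:
        return "synonymous"
    if mutated_aa == "*":
        return "nonsense"
    return "missense"


def classify_indel(length: int) -> str:
    """An insertion or deletion keeps the reading frame only if its
    length is a multiple of three."""
    return "in-frame" if length % 3 == 0 else "frameshift"


print(classify_substitution("GCT", "GCC"))  # synonymous (Ala -> Ala)
print(classify_substitution("GCT", "GAT"))  # missense (Ala -> Asp)
print(classify_substitution("TAT", "TAA"))  # nonsense (Tyr -> stop)
print(classify_indel(3))                    # in-frame
print(classify_indel(2))                    # frameshift
```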
Evolution over time