Basic Concepts of Molecular Biology
*from a bioinformatics point of view...
A reminder for those trained in the exact sciences: nothing is 100% true.
Contents:
1. Life
2. Proteins
3. Nucleic Acids
4. Molecular Genetics (mechanisms).
5. How the Genome is studied
6. Sequence Databases
1. Life
Living organism: through a complex array of chemical reactions, it constantly
exchanges matter and energy with its surroundings and never comes to a standstill.
The main actors in the chemistry of life (biochemistry) are molecules
called proteins ("we are our proteins") and nucleic acids.
Molecular biology research: basically devoted to the understanding of the
structure and functions of proteins and nucleic acids.
2. Proteins
Most substances in our bodies are proteins:
- structural proteins: act as tissue building blocks
- enzymes: act as catalysts of chemical reactions
A protein is a chain of amino acids. Each amino acid has a central carbon
(the alpha carbon, Cα) to which are attached:
- a hydrogen atom H
- an amino group NH2
- a carboxy group COOH
- a side chain particular to each amino acid
Proteins (cont.1)
In a protein, amino acids are joined by peptide bonds: the C belonging to
the COOH of amino acid Ai bonds to the N of the NH2 of amino acid Ai+1, and a
water molecule is liberated.
What we really find inside a polypeptide chain is a residue of the original
amino acid. Thus, we generally speak of the number of residues in a protein
(typically from 300 to 5000).
The repetition of -N-Cα-CO- blocks forms the backbone. The convention is that
a polypeptide chain begins at the amino group (N-terminal) and ends at
the carboxy group (C-terminal).
Proteins (cont.2)
Structures:
The primary structure is the sequence of the residues.
The secondary structure takes into account local interactions between
atoms of the backbone (α-helix, β-sheet, loops).
The tertiary structure expresses the folding in 3D.
The quaternary structure is a group of different protein chains packed together.
Proteins (cont.3)
Predicting how a protein folds is one of the main research areas in molecular
biology (the "Grail"). The values of all pairs of angles φ (rotation around the
bond between the N atom and the Cα atom) and ψ (rotation around the bond between
the Cα atom and the carboxy C atom) for every residue would give the exact
structure of the backbone. This is a very difficult problem.
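To make the angles φ and ψ concrete: each one is a torsion (dihedral) angle defined by four consecutive backbone atoms, C(i-1), N, Cα, C for φ and N, Cα, C, N(i+1) for ψ. The sketch below computes such an angle from 3D coordinates. The function name `dihedral` and the sample points are illustrative assumptions; they are not taken from a real protein structure.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees around the bond p1-p2, from four 3D points."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = sub(b0, tuple(dot(b0, b1) * x for x in b1))
    w = sub(b2, tuple(dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(dot(cross(b1, v), w), dot(v, w)))

# Made-up coordinates: the outer atoms lie on the same side (cis arrangement).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
```

For φ one would pass the coordinates of C(i-1), N, Cα, C of a residue, and for ψ those of N, Cα, C, N(i+1).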
The three-dimensional form of a protein is related to its function. A
folded protein has varied nooks and bulges that let it bind to other molecules,
to form groups or to exchange atoms.
Proteins are produced in a cell structure called the ribosome, where the amino
acids are assembled one by one following the information carried by an
important molecule called messenger ribonucleic acid (mRNA).
3. Nucleic Acids
Living organisms contain two kinds of nucleic acids:
- ribonucleic acid (RNA)
- deoxyribonucleic acid (DNA)
DNA
- double chain with two strands.
- backbone: formed by sugar molecules (2'-deoxyribose), each attached to a
phosphate residue.
- orientation: the carbon atoms of the sugar are labeled 1' to 5'. The basic
bond of the backbone is: 3' carbon - phosphate residue - 5' carbon. By
convention, a strand begins at the 5' end and finishes at the 3' end.
DNA (cont.1)
- To each 1' carbon is attached a base: adenine A, guanine G (they are
purines), cytosine C, thymine T (they are pyrimidines).
- A nucleotide is the unit sugar + phosphate + base.
- An oligonucleotide is a DNA molecule having a few (tens of)
nucleotides.
DNA (cont.2)
- DNA molecules are double strands which are tied together in a helix
structure (James Watson and Francis Crick, 1953).
- A (resp. C) is the complement of T (resp. G). Unit of length: bp (base
pair).
- The two strands are antiparallel. One can be deduced from the other by
reverse complementation.
Example: s = AGACGT,
s' = TGCAGA (reverse)
s̄ = ACGTCT (reverse complement)
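Reverse complementation is easy to check by program. The following minimal sketch reverses a strand and swaps each base for its complement; the names `COMPLEMENT` and `reverse_complement` are chosen here for illustration.

```python
# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Return the opposite strand of a DNA sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

s = "AGACGT"
print(s[::-1])                # reverse: TGCAGA
print(reverse_complement(s))  # reverse complement: ACGTCT
```

Applying the function twice gives back the original strand, which is a quick sanity check on any implementation.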
RNA
Differences with DNA
- Sugar is ribose instead of 2'-deoxyribose.
- Instead of T, one finds U (uracil), which also binds with adenine.
- RNA generally does not form a double helix. It may have a far more varied
three-dimensional structure.
- There are different kinds of RNA, which perform different functions.
4. The Mechanisms of Molecular Genetics
Genes and the genetic code
- Chromosome: long DNA molecule containing genes, the coding parts
that code for proteins.
- Each amino acid is specified by a codon, a triplet of nucleotides. The
correspondence between each triplet (using RNA) and each amino acid is
given by the genetic code:
- There are 64 possible triplets, but only 20 amino acids.
- Several codons can code for one amino acid (e.g. AAG and AAA for
lysine).
- Three STOP codons (UAA, UAG, UGA) are used to signal the end of a gene.
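The counts above can be verified by building the standard genetic code as a lookup table. This is a sketch: the 64-letter string lists the one-letter amino acid codes in the usual U, C, A, G codon order, with `*` standing for STOP, and the variable names are illustrative choices.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code: one letter per codon, codons enumerated in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...). "*" marks a STOP codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in GENETIC_CODE.items() if aa == "*")
amino_acids = set(GENETIC_CODE.values()) - {"*"}

print(len(GENETIC_CODE), len(amino_acids))       # 64 codons, 20 amino acids
print(GENETIC_CODE["AAA"], GENETIC_CODE["AAG"])  # both K (lysine)
print(stop_codons)                               # ['UAA', 'UAG', 'UGA']
```

Since 61 coding triplets map onto 20 amino acids, most amino acids are necessarily specified by more than one codon.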
Transcription
Produces RNA from DNA by means of the RNA polymerase: mRNA
(messenger RNA) from a gene, or rRNA (ribosomal RNA) or tRNA
(transfer RNA).
- the RNA polymerase recognizes the beginning of a gene (or of a gene
cluster) thanks to a promoter (the TATA box is the best known of them),
which is situated upstream (before the START codon AUG). Termination
is less well understood (polyadenylation).
- the template strand is the one that is transcribed (mRNA is composed
by binding together ribonucleotides complementary to this strand).
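As a minimal sketch of the base-pairing rule used during transcription, the function below builds the mRNA complementary to a template strand. It ignores promoters and termination entirely; the name `transcribe` and the sample template are hypothetical.

```python
# DNA template base -> RNA base paired with it (U replaces T in RNA).
PAIRING = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_5to3):
    """Return the mRNA (5'->3') complementary to a DNA template given 5'->3'."""
    # The template is read 3'->5', so walk it backwards.
    return "".join(PAIRING[base] for base in reversed(template_5to3))

template = "TTACTTCAT"       # hypothetical template strand, written 5'->3'
print(transcribe(template))  # AUGAAGUAA
```

The result has the same sequence as the other (coding) DNA strand, ATGAAGTAA, with U in place of T.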
Transcription (cont.1)
- for eukaryotes (organisms whose cells have a nucleus), the mechanism is
more complex than for prokaryotes (cells without a nucleus, like bacteria).
Genes can contain alternating parts, called exons and introns (introns are
transcribed but not translated). Splicing (which removes introns from the
primary transcript) is done in the nucleus and delivers (outside the nucleus)
the mRNA. Alternative splicing (the same DNA can give rise to two or more
different mRNAs by choosing introns and exons in a different way) may
also occur.
- One distinguishes genomic DNA (the gene as found in the chromosome)
from complementary DNA (cDNA, a sequence consisting of exons only).
cDNA can be produced from mRNA by reverse transcription (an EST,
Expressed Sequence Tag, is a short fragment of such a cDNA).
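Splicing can be pictured as keeping the exon segments of the primary transcript and joining them in order. The sketch below does exactly that on a made-up transcript; the sequence, the exon coordinates and the function name `splice` are all hypothetical, and real splice-site recognition is far more involved.

```python
def splice(primary_transcript, exons):
    """Join exon segments given as (start, end) pairs, 0-based, end-exclusive."""
    return "".join(primary_transcript[start:end] for start, end in exons)

# Hypothetical primary transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "AAAGGU" + "guaugucccuag" + "UUCUAA"
exons = [(0, 6), (18, 24), (36, 42)]

print(splice(pre_mrna, exons))                 # AUGGCCAAAGGUUUCUAA
print(splice(pre_mrna, [exons[0], exons[2]]))  # exon 2 skipped: AUGGCCUUCUAA
```

The second call illustrates alternative splicing: the same primary transcript yields a different mRNA when a different set of exons is retained.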
Translation
Produces a protein from an mRNA by using a ribosome, which makes use
of tRNAs to bring the amino acid corresponding to each codon.
- initiated (in bacteria) when one of the rRNAs of the ribosome binds to a
particular sequence in the mRNA (more or less the Shine-Dalgarno sequence
AGGAGG), located a few bases upstream of the start codon.
- the first codon to be translated is AUG.
- there are not as many tRNAs as there are codons. Their number varies
among species (about 40 for the bacterium E. coli).
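A simplified model of translation reads the mRNA codon by codon, starting at the first AUG and stopping at the first in-frame STOP codon. The sketch below assumes the standard genetic code and ignores ribosome binding, tRNAs and everything else about the real machinery; `translate` and the sample mRNA are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG codon order; "*" marks a STOP codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(mrna):
    """Translate from the first AUG up to the first in-frame STOP codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("GGAUGAAAAAGUAAUU"))  # MKK (Met-Lys-Lys)
```

The output uses one-letter amino acid codes, so the protein always begins with M (methionine), the amino acid encoded by AUG.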
Junk DNA and Reading Frame
Junk DNA: intergenic regions between coding parts. Prokaryotes have
little of it, eukaryotes much more (more than 90% for the human genome).
Reading frame: one of the three possible ways of grouping the bases of a
sequence into codons.
Example: TAATCGAATGGC has the three following frames:
[TAA, TCG, AAT, ...], [AAT, CGA, ...], [ATC, GAA, ...]
- 6 reading frames have to be considered if one wants to translate a DNA
sequence into a (supposed) protein, because of the two strands.
Open Reading Frame (ORF) in a DNA sequence: a subsequence
beginning at a start codon, having an integral number of codons, none of
which is a STOP codon.
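The reading frames and the ORF definition can be sketched in a few lines. The helper names (`codons`, `six_frames`, `open_reading_frames`) are illustrative assumptions, and the ORF scan is deliberately naive: it looks at one strand only and reports each stretch from an ATG up to the next in-frame STOP codon or the end of the sequence.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def codons(seq, offset):
    """Complete codons of seq when reading starts at position offset (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

def six_frames(seq):
    """Three frames of the given strand, then three of its reverse complement."""
    return [codons(s, k) for s in (seq, reverse_complement(seq)) for k in range(3)]

def open_reading_frames(seq):
    """ORFs on the given strand: from ATG up to (not including) an in-frame STOP."""
    orfs = []
    for offset in range(3):
        frame = codons(seq, offset)
        i = 0
        while i < len(frame):
            if frame[i] == "ATG":
                j = i
                while j < len(frame) and frame[j] not in STOPS:
                    j += 1
                orfs.append("".join(frame[i:j]))
                i = j
            else:
                i += 1
    return orfs

for frame in six_frames("TAATCGAATGGC"):
    print(frame)
print(open_reading_frames("ATGAAATAGCC"))  # ['ATGAAA']
```

A full ORF finder would also scan the reverse complement and usually impose a minimum length, since short ORFs occur by chance.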
Chromosomes
Genome: complete set of chromosomes of an organism.
Prokaryotes usually have one chromosome (sometimes circular).
In eukaryotes, chromosomes appear in pairs (23 pairs for humans; the cells
containing them are called diploid). The two chromosomes of a pair are
said to be homologous. Different versions of the same gene on the two
chromosomes are alleles.
Cells which carry only one member of each pair are haploid (these are the
cells used in sexual reproduction); they are formed through the process of meiosis.
- Not all genes are expressed by a specific cell.
Genome size of "important" species
Species                              Chromosomes   Genome size (bp, approx.)
Bacteriophage λ (virus)              1             5 × 10^4
Escherichia coli                     1             5 × 10^6
Saccharomyces cerevisiae (yeast)     32            1 × 10^7
Caenorhabditis elegans (worm)        12            1 × 10^8
Drosophila melanogaster (fruit fly)  8             2 × 10^8
Homo sapiens (human)                 46            3 × 10^9