BINF6201/8201
Basics of Molecular Biology
08-26-2016
Linear structure of nucleic acids
Ø Nucleic acids are polymers of nucleotides
Ø Nucleic acids: deoxyribonucleic acids (DNA) and ribonucleic acids (RNA)
Ø Nucleotide = phosphate + nucleoside
Ø Nucleoside = ribose or deoxyribose + base
Ø Bases:
Purines: Adenine (A), Guanine (G)
Pyrimidines: Thymine (T), Uracil (U, in RNA), Cytosine (C)
Mono-, di-, and tri-phosphate nucleotides
       Base           Nucleoside        Nucleotides (mono-, di-, tri-phosphate)
DNA    Adenine (A)    Deoxyadenosine    dAMP, dADP, dATP
       Guanine (G)    Deoxyguanosine    dGMP, dGDP, dGTP
       Cytosine (C)   Deoxycytidine     dCMP, dCDP, dCTP
       Thymine (T)    Deoxythymidine    dTMP, dTDP, dTTP
RNA    Adenine (A)    Adenosine         AMP, ADP, ATP
       Guanine (G)    Guanosine         GMP, GDP, GTP
       Cytosine (C)   Cytidine          CMP, CDP, CTP
       Uracil (U)     Uridine           UMP, UDP, UTP
The pairing rule of the bases in nucleic acids: A-T/U and G-C.
Ø A-T/U pairing forms two hydrogen bonds (a weaker pair).
Ø G-C pairing forms three hydrogen bonds (a stronger pair).
Ø Therefore, G-C pairing is more stable than A-T/U pairing.
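The pairing rule can be turned into a small calculation. The sketch below is only illustrative: it assumes a perfectly paired duplex and counts two hydrogen bonds per A-T/U pair and three per G-C pair. The function names are made up for this example.

```python
# Hydrogen bonds contributed by each base when it is paired with its partner:
# A-T/U pairs form 2 bonds, G-C pairs form 3.
H_BONDS = {"A": 2, "T": 2, "U": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand):
    """Total hydrogen bonds if `strand` is fully paired with its complement."""
    strand = strand.upper()
    unknown = set(strand) - set(H_BONDS)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    return sum(H_BONDS[base] for base in strand)


def gc_fraction(strand):
    """Fraction of bases that are G or C (0.0 for an empty strand)."""
    strand = strand.upper()
    if not strand:
        return 0.0
    return (strand.count("G") + strand.count("C")) / len(strand)


print(count_hydrogen_bonds("ATAT"))  # 8
print(count_hydrogen_bonds("GCGC"))  # 12
print(gc_fraction("ATGC"))           # 0.5
```

For two duplexes of the same length, the one with the higher GC fraction has more hydrogen bonds, which matches the statement that G-C pairing is more stable.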
The double helical structure of DNA
Ø Two complementary DNA strands run antiparallel and are coiled around each other, forming a double helix.
Ø There are two grooves on the surface of the double helix: the major groove and the minor groove.
Ø Regulatory molecules bind to DNA in these grooves, changing the structure and function of DNA.
Ø Cytosine residues in some regions of DNA can be modified by methylation, thereby changing the functional state of those regions.
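Because the two strands are complementary and antiparallel, the sequence of one strand read 5’ to 3’ is the reverse complement of the other. A minimal sketch, assuming an unambiguous A/C/G/T alphabet (the function name is chosen for this example):

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


top = "ATGGCC"
bottom = reverse_complement(top)
print(bottom)                      # GGCCAT
print(reverse_complement(bottom))  # ATGGCC
```

Applying the function twice returns the original strand, as expected for a double helix.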
Higher level structures of DNA
Ø In eukaryotic cells, DNA molecules are highly compacted by wrapping around a histone protein core, forming nucleosomes.
Ø The histone core is made up of 2 copies of each of the four histone proteins (H2A, H2B, H3 and H4).
Ø Nucleosomes are further coiled to form supercoils.
Ø The N-terminal tails of histones can be modified by methylation (on lysine or arginine) or acetylation (on lysine) to control the open or closed state of chromatin, and thus its function.
Figure: 3D structure of a nucleosome
Structure of RNAs
Ø RNAs are usually single-stranded.
Ø However, complementary parts of an RNA molecule can form local double-stranded structures, leaving loops in the non-complementary regions.
Figure: structure of a tRNA-Ala, with the 5’ and 3’ ends labeled
Ø There are at least four major functional types of RNAs:
1. mRNA
2. tRNA
3. rRNA
4. Small regulatory RNAs, e.g., microRNA (miRNA) and small interfering RNA (siRNA)
Protein structure
Ø Proteins are polymers of amino acids linked by peptide bonds;
Ø There are twenty standard amino acids found in proteins;
Ø Amino acids differ in their side chains (R groups);
Ø The linear sequence of amino acids in a protein is called its primary structure.
Classification of amino acids according to the
structure of side chains
Higher level structures of proteins
Ø Secondary structure: the local, regular conformations adopted by segments of the polypeptide chain, mainly α-helices and β-sheets.
Figure panels: α-helix; β-sheet
Ø Tertiary structure: the way a whole polypeptide chain folds into a specific 3D structure that supports a specific function.
Ø Prediction of the 3D structure of a protein from its sequence is a challenging problem in computational biology.
The Central Dogma of Molecular Biology
Ø Genetic information is stored in DNA and passed from DNA to RNA to protein.
DNA → DNA (replication)
DNA → mRNA (transcription)
mRNA → Protein (translation)
RNA → DNA (reverse transcription)
What is a gene ?
Ø A gene is a segment of DNA that contains the information necessary to make a functional RNA or polypeptide molecule.
Ø According to this definition, a gene includes the transcribed sequence and the non-transcribed regulatory sequences that control the transcription and translation of the gene product.
Ø Genes can be classified as protein-coding genes and RNA-specifying genes.
Ø In bioinformatics and computational biology, a gene often refers to the DNA sequence that specifies the sequence of a protein (open reading frame, ORF) or an RNA molecule, and its regulatory sequences are treated separately.
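To make the ORF notion concrete, here is a deliberately naive sketch that scans the three forward reading frames for an ATG followed by an in-frame stop codon. It is an illustration only, not a gene predictor: it ignores the reverse strand, alternative start codons, and introns, and the function name and `min_codons` parameter are assumptions of this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ATG...stop ORF on the given strand.

    `start` is 0-based, `end` is exclusive and includes the stop codon.
    `min_codons` counts codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens an ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


print(find_orfs("CCATGAAATGGTAACC"))  # [(2, 14, 'ATGAAATGGTAA')]
```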
Structure of genes in prokaryotes
Ø Adjacent genes of the same orientation in prokaryotes can be transcribed together as a single unit, called an operon.
Ø A typical operon contains the following elements:
1. Open reading frames;
2. Upstream regulatory elements;
3. A downstream transcriptional terminator.
Figure labels: upstream regulatory region with a TF binding site (-300); promoter region (-35, -10); TSS (+1); ribosome binding site; terminator
Ø Prediction of genes (ORFs) in prokaryotes has reached high accuracy using machine learning algorithms.
Structure of genes in eukaryotes
Ø Due to the complexity of gene structures in eukaryotes, accurate prediction of genes in these organisms is still a challenging problem.
DNA replication
Ø Chromosomes are replicated before each cell division;
Ø DNA replication is semi-conservative: each of the two newly
synthesized DNA molecules contains an original strand of DNA and
a newly synthesized complementary strand;
Ø The leading strand is synthesized continuously, while the lagging
strand is produced in fragments (Okazaki fragments), which are
later joined;
Ø Major enzymes involved:
1. Primase for the synthesis of RNA primers;
2. DNA polymerase III for extension;
3. DNA polymerase I for the excision
of primers and filling the gaps;
4. Ligase for joining fragments.
Ø Although both polymerase III and I have the capability of proofreading, incorrectly paired bases can still be incorporated, which is a
major source of mutations.
Transcription
Ø Transcription is catalyzed by RNA polymerase using one of the DNA strands as the template (the template strand, or non-coding strand);
Ø The opposite strand is called the non-template strand, or coding strand, because it has the same sequence as the transcribed RNA with each T replaced by a U.
Coding strand
Non-coding strand
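The relationship between the two strands and the RNA can be sketched as follows. Both functions are illustrative and assume sequences written 5’ to 3’ over A/C/G/T. Starting from either strand should give the same RNA.

```python
# RNA base that pairs with each DNA template base.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_from_coding(coding):
    """RNA has the coding-strand sequence with every T replaced by U."""
    return coding.upper().replace("T", "U")


def transcribe_from_template(template):
    """RNA is complementary and antiparallel to the template strand."""
    return "".join(TEMPLATE_TO_RNA[base] for base in reversed(template.upper()))


coding = "ATGGCC"
template = "GGCCAT"  # reverse complement of the coding strand
print(transcribe_from_coding(coding))      # AUGGCC
print(transcribe_from_template(template))  # AUGGCC
```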
Transcription
Ø Transcription is controlled by the interaction of trans-acting elements
called transcription factors (TFs) and cis-acting elements of DNA.
Ø Prediction of cis-acting elements or TF binding sites is a challenging
problem in computational biology.
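One common starting point for this problem (not necessarily the method intended in the lecture) is a position weight matrix (PWM) built from known binding sites and slid along a sequence. In the sketch below the aligned sites are made-up toy data, and the pseudocount and uniform background frequency are arbitrary assumptions.

```python
import math

# Hypothetical aligned binding sites (toy data for illustration only).
SITES = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds score per base and position, relative to a uniform background."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def best_hit(dna, pwm):
    """Return (score, position) of the highest-scoring window in `dna`."""
    width = len(pwm)
    windows = range(len(dna) - width + 1)
    return max((sum(pwm[j][dna[i + j]] for j in range(width)), i) for i in windows)


pwm = build_pwm(SITES)
score, position = best_hit("GGCTATAATGCC", pwm)
print(position, round(score, 2))
```

Real binding sites are short and degenerate, so a scan like this produces many false positives on genome-scale sequences, which is part of why the problem is hard.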
Regulation of transcription in prokaryotes
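Figure labels: TF binding site (-300) bound by TF1 and TF2; promoter region (-35, -10) bound by RNA polymerase subunits (α, α, β, β, σ); TSS (+1); ribosome binding site; terminator. Transcription yields an RNA with a 5’ UTR and a 3’ UTR.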
RNA processing in eukaryotes
Ø A “cap” is added to the 5’ end, consisting of a methylated guanosine
and cap-binding proteins
Ø A string of about 200 adenosines is added to the 3’ end. This poly-A
tail is bound by poly-A binding proteins.
Ø Splicing: introns are cut out, and exons are linked.
•  Exons can be joined in different combinations, generating different mRNAs (alternative
splicing), so a gene can code for many proteins.
•  Splicing can be mediated by the spliceosome or by the RNA itself (self-splicing).
•  Prediction of alternative splicing sites is a challenging problem in computational
biology.
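Splicing and exon skipping can be illustrated with string slicing. The pre-mRNA below is a made-up toy sequence, and the exon coordinates (0-based, end-exclusive) are given by hand instead of being predicted.

```python
def splice(pre_mrna, exons):
    """Join the exon slices in order; everything between them (introns) is dropped."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy transcript: three exons separated by two introns.
exon1, intron1 = "AUGGCC", "GUAAGUUUUCAG"
exon2, intron2 = "AAAUUU", "GUACGUCCCCAG"
exon3 = "GGGUAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3

exons = [(0, 6), (18, 24), (36, 42)]
full = splice(pre_mrna, exons)                    # all three exons
skipped = splice(pre_mrna, [exons[0], exons[2]])  # exon 2 skipped
print(full)     # AUGGCCAAAUUUGGGUAA
print(skipped)  # AUGGCCGGGUAA
```

The two outputs are different mRNAs from the same gene, which is the sense in which alternative splicing lets one gene code for several proteins.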
Translation
Ø Translation starts with the association of the ribosome with the ribosome binding site in the mRNA molecule, and the following components are involved:
1. Ribosome: consisting of a small and a large subunit, each composed of a few rRNA molecules and dozens of protein molecules.
2. tRNA: carrying a specific amino acid, and recognizing a codon using its anticodon through base pairing.
3. Aminoacyl-tRNA synthetase: attaching an amino acid to its tRNA.
Transcription and translation are two highly
coupled processes in prokaryotes
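Figure labels: TF binding site (-300) bound by TF1 and TF2; promoter region (-35, -10) bound by RNA polymerase subunits (α, α, β, β, σ); TSS (+1); ribosome binding site; terminator. Transcription yields an RNA with a 5’ UTR and a 3’ UTR; ribosomes bind the ribosome binding site on the RNA and translate it into proteins.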
Standard genetic codons
Ø There are 61 sense codons and 3 nonsense (stop) codons;
Ø The code is degenerate: most amino acids are specified by more than one codon;
Ø Some codons for the same amino acid are used more frequently than others, a phenomenon called codon bias;
Ø Mutations in the 1st and 2nd nucleotides of a codon often change the amino acid, while a mutation in the 3rd nucleotide often does not, so the 3rd position is called the wobble base.
http://www.nature.com/scitable/content/The-genetic-code-consists-of-64-codons-42614
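The codon counts above can be checked with a short sketch. It builds the standard genetic code from the usual compact 64-letter string in TCAG order (as used for NCBI translation table 1), then counts sense and stop codons and translates a toy ORF. The `translate` helper is an illustrative name, and it simply stops at the first stop codon.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Amino acids for codons TTT, TTC, TTA, TTG, TCT, ... ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate a DNA or RNA ORF codon by codon, stopping at the first stop codon."""
    orf = orf.upper().replace("U", "T")
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


stops = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
print(len(CODON_TABLE) - len(stops), stops)  # 61 ['TAA', 'TAG', 'TGA']
print(degeneracy["L"], degeneracy["M"])      # 6 1
print(translate("ATGGCCAAATAA"))             # MAK
```

The four alanine codons GCT, GCC, GCA and GCG differ only at the 3rd position, which is a small example of the wobble effect.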