Download Chapter 2 - rci.rutgers.edu

Genomics basics
Genes
- Historical view: a hereditary mechanism that carries information from parent to child.
- Family members tend to exhibit similar traits: skin color, diabetes, and so on.
- Monarch butterflies take 4 generations to complete a migration cycle.
- Not a perfect copying process (e.g., eye color).
- Genes: discrete hereditary units, each providing a clear and unambiguous set of
instructions for producing some property of the organism.
Genome
- The complete set of genes in an organism: its master blueprint.
- Contains all the hereditary instructions for building, operating, and maintaining
the organism, and for passing life in like form on to the next generation of that
organism.
- Twentieth century: research and discoveries over the first half of the century
gave genes a chemical (molecular) existence and culminated in the pivotal
realization that genes are made of deoxyribonucleic acid (DNA, for short).
DNA (Deoxyribonucleic Acid)
- Two long strands wound tightly around each other in a spiral structure.
- Double helix: a twisted ladder whose sides are made of sugar and phosphate and
whose rungs are made of bases.
- Nucleotides: each strand of the DNA molecule is a linear arrangement of
repeating similar units called nucleotides.
- Nucleotide = a sugar (deoxyribose), a phosphate group, and a nitrogenous base.
- Bases: Adenine, Thymine, Guanine, Cytosine (A, T, G, C).
- Watson-Crick base pairing rules: the bases on one strand are paired with the
bases on the other strand according to the complementary base pairing rules.
- Adenine pairs with Thymine; Guanine pairs with Cytosine.
- Base pairs: they form the coplanar rungs of the ladder and are held together by
weak hydrogen bonds.
- DNA is chemically inert and is a stable carrier of genetic information.
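The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the reverse complement relies on the fact (not stated in these notes) that the two strands of the helix run in opposite directions.

```python
# Watson-Crick pairing rules from the notes: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIR[base] for base in strand.upper())


def reverse_complement(strand):
    """Return the partner strand read in its own direction.

    Assumption: the strands are antiparallel, so the partner strand
    is the complement read backwards.
    """
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Because each base has exactly one partner, applying `complement` twice returns the original strand, which is what makes one strand a full specification of the other.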
DNA replication:
- Complementary sequences make self-replication easy, so information is passed on
from generation to generation.
(i) The DNA molecule unwinds, allowing the strands to separate.
(ii) Each strand directs the synthesis of a brand new complementary strand.
(iii) Two descendant DNA molecules result, each made of one old strand and one
new strand, and each an exact copy of the original molecule.
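A minimal sketch of the three replication steps, assuming a DNA molecule is represented as a pair of position-aligned strings (a representation chosen here for illustration, ignoring strand direction):

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Base-by-base complement of a DNA strand."""
    return "".join(PAIR[base] for base in strand)


def replicate(molecule):
    """Return the two descendant molecules of a (strand, partner) pair."""
    old_top, old_bottom = molecule           # (i) the strands separate
    new_for_top = complement(old_top)        # (ii) each old strand directs
    new_for_bottom = complement(old_bottom)  #      a new complementary strand
    # (iii) each descendant keeps one old strand and gains one new strand
    return (old_top, new_for_top), (new_for_bottom, old_bottom)


parent = ("ATGC", "TACG")
first, second = replicate(parent)
print(first == parent, second == parent)  # True True
```

Both descendants carry the same sequence as the parent, even though half of each molecule is newly synthesized.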
DNA sequence: Mechanism that specifies the exact genetic instructions required
to create the traits of a particular organism.
Genes: specific contiguous subsequences of the DNA sequence.
A gene's A-T-G-C sequence is the code required for constructing a protein.
Proteins: Giant complex molecules made of chains of amino acids:
- The building blocks and the workhorses of life
- Proteins also regulate most of life’s day-to-day functions.
- The DNA replication process is mediated by enzymes, proteins whose job is to
catalyze biochemical reactions.
Gene expression
- Cells are the fundamental units of all living organisms, both structurally and
functionally.
- Nucleus: contains the organism’s DNA and is enclosed by the nuclear membrane.
- Cytoplasm: surrounds the nucleus. The plasma membrane encloses the cell.
- In the plasma membrane: molecules that act as channels and pumps, controlling
what moves in and out of the cell.
- Gene expression: the set of protein-coding instructions in the DNA sequence of
a gene resembles a computer program, which must be compiled and executed in
order for anything to happen.
- A gene must be expressed in order for anything to happen.
- A gene is expressed by transferring its coded information into proteins that
dwell in the cytoplasm, a process called gene expression.
- The central dogma of molecular biology:
DNA → (transcription) → mRNA → (translation) → protein
- Messenger ribonucleic acid (mRNA) resembles a single strand of DNA.
- In mRNA, uracil (U) replaces thymine.
- When a gene is expressed, the DNA double helix splits open along its length.
- Transcription: One strand of the open helix remains inactive, while the other
strand acts as a template against which a complementary strand of mRNA forms.
- The mRNA strand then separates from the DNA strand and transports out of the
nucleus, across the nuclear membrane, and into the cellular cytoplasm.
- There it serves as the template for protein synthesis, with consecutive (nonoverlapping) triplets of bases (called codons) acting as a code to specify the
particular amino acids that make up an individual protein.
- Translation: The sequence of bases along the mRNA is thus converted into a
string of amino acids that constitutes the protein molecule for which it codes.
- Apart from the stop codons, each possible triplet of mRNA bases codes for a
specific amino acid, one of the twenty amino acids that make up proteins.
- GCC codes for alanine, CAC for histidine, AUC for isoleucine, and GAG for
glutamic acid.
- There are 4^3 = 64 possible triplets, but only twenty possible amino acids.
- This means that there is room for redundancy: for example, GCU, GCC, GCA,
and GCG all code for alanine. This redundancy is a safeguard against small errors.
- Start codon, AUG, is the triplet of mRNA bases that signals the initiation of a
sequence that is to be translated.
- Stop codon: a triplet of mRNA bases, either UGA, UAG, or UAA, that signals the
end of the sequence to be translated.
- Open reading frame (ORF, for short): a stretch of sequence that runs from a start
codon to a stop codon. All sequence information of coding interest lies in ORFs
(but not every ORF codes for a gene).
- Expressed sequence tag (EST): a unique short subsequence (only a few
hundred base pairs long), generated from the DNA sequence of a gene, that acts
as a “tag” or “marker” for the gene.
- Not all cells express the same genes, which is why different cells do different things.
- Within the same cell, different genes will be expressed at different times, at
different levels, in response to different stimuli.
- A few exceptions: housekeeping genes, which maintain basic cell functions.
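Transcription and translation, as described above, can be sketched as two small functions. This is an illustration under stated assumptions: the template strand is read left to right with strand direction ignored, the one-letter amino acid codes (A = alanine, H = histidine, I = isoleucine, E = glutamic acid, M = methionine, `*` = stop) are a standard shorthand not used in these notes, and the example template sequence is invented.

```python
from itertools import product

# Standard genetic code, listed with codon bases in the order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# DNA template base -> complementary mRNA base (U replaces T in mRNA).
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand):
    """Build the mRNA that is complementary to a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in template_strand)


def translate(mrna):
    """Translate from the first start codon (AUG) up to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    # Read consecutive, non-overlapping triplets (codons).
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UGA, UAG or UAA
            break
        protein.append(amino_acid)
    return "".join(protein)


template = "TACCGGGTGTAGCTCATT"  # invented example sequence
mrna = transcribe(template)
print(mrna)             # AUGGCCCACAUCGAGUAA
print(translate(mrna))  # MAHIE
print(len(CODON_TABLE), len(set(AMINO) - {"*"}))  # 64 20
```

The last line reflects the redundancy noted above: 64 codons map onto only 20 amino acids plus the stop signal.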
Hybridization assays
- Hybrid DNA: single-stranded DNA molecules of complementary sequence have a
tendency to bind together. The hybrid may consist of two DNA strands or of one
DNA strand and one mRNA strand.
- Hybridization assays: a probe consisting of a homogeneous sample of
single-stranded DNA molecules, whose sequence is known, is prepared and labeled
with a reporter chemical, usually a radioactive or fluorescent substance. An
immobilized target, usually a heterogeneous mixture of single-stranded DNA
molecules of unknown composition, is challenged by the probe. As the probe will
hybridize only to sequences complementary to its sequence, DNA sequences in
the target that are complementary to the probe DNA sequence can be identified
by the presence of reporter molecules.
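The logic of a hybridization assay can be mimicked in code as a search for sites complementary to the probe. The sketch below is an idealization: it assumes exact, full-length matching (real hybridization tolerates some mismatch), and the function names and the two-molecule target mixture are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Sequence that pairs with `strand`, read in its own direction."""
    return "".join(PAIR[base] for base in reversed(strand))


def hybridization_sites(target, probe):
    """Return every start position in `target` where `probe` could bind."""
    site = reverse_complement(probe)
    hits = []
    pos = target.find(site)
    while pos != -1:
        hits.append(pos)
        pos = target.find(site, pos + 1)  # allow overlapping sites
    return hits


# A toy "heterogeneous mixture" of target molecules (invented sequences).
mixture = {"molecule_1": "GGATCCATGCAAGCAT", "molecule_2": "TTTTGGGGCCCC"}
probe = "GCAT"
for name, sequence in mixture.items():
    print(name, hybridization_sites(sequence, probe))
```

Molecules that return a non-empty list are the ones that would carry the reporter signal in the assay.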
Blotting techniques.
- Southern blotting: the target DNA is separated by electrophoresis (see below)
and transferred onto a filter, where it is exposed to the probe.
- Northern blotting: a variant in which the target is mRNA instead of DNA. As
mRNA is the intermediary molecule in gene expression, Northern blotting provides
a means of studying the expression patterns of specific genes. DNA microarrays
can be regarded as a massively parallel version of Northern blotting.
- In situ hybridization: denatured DNA (DNA in which the two strands are unwound
and separated) is kept in place in the cell and is then challenged with mRNA or
DNA extracted from another source and labeled with a reporter chemical, usually
a fluorescent substance. By retaining the DNA in the cell, the specific
chromosome containing the DNA sequence of interest can be identified by
observing, under a microscope, the location of the fluorescence.
Other relevant assays
- Electrophoresis: uses an electric field to separate large molecules, such as DNA
and proteins, from a mixture of similar molecules.
- Cloning: produces multiple, exact copies of a single gene or other segment of
DNA to obtain enough material for further study. Collections of clones are kept in
clone libraries.
- Polymerase chain reaction (PCR): a procedure for amplifying virtually any
fragment of DNA.
(i) Denaturing: the two strands of DNA are unwound and separated by heating.
(ii) Annealing: primers, short strands of single-stranded DNA that match the
sequences at either end of the target DNA, are bound to their complementary
bases on the now single-stranded DNA.
(iii) Extension: a polymerase, an enzyme whose job is to copy genetic material,
starts from the primer, reads a template strand and matches it with free
complementary bases. This produces two descendant DNA strands.
- Cycling through these three steps generates many copies of the target DNA.
- The amount of DNA grows exponentially as it doubles with every cycle.
- Reverse transcription: a procedure for reversing, in a laboratory, the process
of transcription. It is accomplished by isolating mRNA and using it as a template to
synthesize a complementary DNA (cDNA, for short) strand, so-called because its
sequence is complementary to the original mRNA sequence. This process utilizes
the enzyme reverse transcriptase. The resultant single-stranded cDNA molecule is
considerably shorter than the parent DNA sequence, as it will have only its coding
exon sequences; the noncoding intron sequences would have been excised
during the formation of the original mRNA.
- The process of translation cannot be reversed.
- Reverse transcription polymerase chain reaction (RT-PCR): the cDNA
generated by reverse transcription is amplified by PCR.
- RT-PCR can be used to detect and quantify target mRNA sequences, and
thereby to provide information regarding gene expression.
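The exponential growth of PCR product mentioned above (the amount of DNA doubles with every cycle) can be written as copies = initial × 2^cycles. The sketch below adds a hypothetical `efficiency` parameter, not part of these notes, to show how imperfect doubling would be modeled; with the default value of 1.0 it reduces to exact doubling.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Expected number of target copies after a number of PCR cycles.

    efficiency is an assumed per-cycle success fraction (hypothetical
    parameter): 1.0 means every template is copied in every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


for cycles in (1, 10, 30):
    print(cycles, int(pcr_copies(1, cycles)))
# 1 2
# 10 1024
# 30 1073741824
```

Starting from a single molecule, thirty ideal cycles already yield about a billion copies, which is why PCR can amplify a target from a very small sample.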
The Human Genome
- DNA in the human genome is made up of roughly three billion base pairs.
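- The DNA is organized into 46 molecules, each packaged in a threadlike cellular
structure called a chromosome.
- Chromosomes come in pairs, one from the father and one from the mother. The sex
chromosomes are the exception: the X and Y of a male do not match.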
- Chromosomes range in length from about 50 million bp to 250 million bp. Each
chromosome contains many genes. Early estimates put the total number of genes in
the human genome at around 35,000-40,000; later estimates are closer to 20,000
protein-coding genes.
- Genes vary widely in length, from a few hundred bp to many thousands of bp.
- Only a tiny percentage of human DNA consists of exons, the coding sequences.
- Introns: sequences within genes that have no coding function.
- In between genes are other noncoding regions whose functions remain largely obscure.
- Every single human being has almost exactly the same genome (99.9% identical).
- Allele: each alternate version of a gene. Genes vary slightly from person to
person. Every person carries two alleles of each gene.
- Homozygous (person): both alleles are the same for that gene.
- Heterozygous (person): the two alleles differ. Only one of the alleles (the
dominant allele) may be expressed.
Example: blood type (A, B, AB or O). The ABO gene, which controls the blood
group, has three alleles, designated A, B and O. Alleles A and B, which code for
proteins A and B respectively, are co-dominant; allele O is recessive.
Type A = AA or AO
Type B = BB or BO
Type AB = A and B
Type O = O and O
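The ABO example is a small genotype-to-phenotype lookup, which can be sketched as follows (the function name is made up for this illustration):

```python
def abo_blood_type(allele_1, allele_2):
    """Map a pair of ABO alleles (genotype) to a blood type (phenotype)."""
    alleles = {allele_1, allele_2}
    if not alleles <= {"A", "B", "O"}:
        raise ValueError("alleles must each be 'A', 'B' or 'O'")
    if alleles == {"A", "B"}:
        return "AB"  # A and B are co-dominant: both are expressed
    if "A" in alleles:
        return "A"   # AA or AO: A dominates O
    if "B" in alleles:
        return "B"   # BB or BO: B dominates O
    return "O"       # only OO gives type O


for genotype in [("A", "A"), ("A", "O"), ("B", "O"), ("A", "B"), ("O", "O")]:
    print(genotype, "->", abo_blood_type(*genotype))
```

Six distinct genotypes collapse onto four blood types, so the phenotype alone does not always reveal which alleles a person carries.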
Genome variations and their consequences
- mutations and polymorphisms: alterations in a DNA sequence
- substitution: one base being replaced by another
- deletion: one or more bases being removed
- insertion: one or more bases being added
- inversion: a small subsequence of bases being removed and then reinserted in the
opposite direction
- translocation: a small subsequence of bases being removed and then reinserted
in a different place.
- Genome variations may be inherited or acquired.
- An inherited variation could be passed on to the next generation of that
organism.
- Acquired variations are mutations that occur spontaneously during DNA
replication or are caused by an external environmental factor such as exposure
to a toxic substance.
- Hereditary gene mutation: a mutation that affects a reproductive cell and can
therefore be passed on.
- Mutation versus polymorphism: a variation present in less than 1 percent of
people is called a mutation; one present in at least 1 percent of the population
is called a polymorphism.
- wild-type allele: common and properly functioning version of a gene
- mutant allele: version with a mutation
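The five kinds of alteration listed above can be illustrated as string operations on a single strand. This is a simplified sketch: function names, positions (0-based) and the example sequence are invented, and the inversion simply reverses the segment as the notes describe it, whereas on a real double-stranded molecule the inverted segment reads as its reverse complement on a given strand.

```python
def substitute(seq, pos, base):
    """Replace the base at `pos` with `base`."""
    return seq[:pos] + base + seq[pos + 1:]


def delete(seq, start, length):
    """Remove `length` bases beginning at `start`."""
    return seq[:start] + seq[start + length:]


def insert(seq, pos, bases):
    """Add `bases` before position `pos`."""
    return seq[:pos] + bases + seq[pos:]


def invert(seq, start, length):
    """Remove a segment and reinsert it in the opposite direction."""
    segment = seq[start:start + length]
    return seq[:start] + segment[::-1] + seq[start + length:]


def translocate(seq, start, length, new_pos):
    """Remove a segment and reinsert it at `new_pos` of the remaining sequence."""
    segment = seq[start:start + length]
    rest = seq[:start] + seq[start + length:]
    return rest[:new_pos] + segment + rest[new_pos:]


wild_type = "ATGCCGTA"  # invented example sequence
print(substitute(wild_type, 2, "A"))     # ATACCGTA
print(delete(wild_type, 3, 2))           # ATGGTA
print(insert(wild_type, 3, "TT"))        # ATGTTCCGTA
print(invert(wild_type, 1, 3))           # ACGTCGTA
print(translocate(wild_type, 0, 3, 5))   # CCGTAATG
```

Note that deletions and insertions change the length of the sequence, so unless the change is a multiple of three bases it shifts every downstream codon.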
Robustness of the genetic code
- Many genome variations do not produce any noticeable effects because (i) they
fall outside the genome’s coding regions, (ii) the genetic code contains
redundancies, or (iii) cells have repair mechanisms for certain types of damaged DNA.
- A small percentage of genome variations do produce noticeable effects, some
deleterious, some beneficial. This is the genetic basis of biological diversity and
the evolutionary process.
- Polymorphisms that produce noticeable effects are subject to the natural
selection process.
- Not so simple: people with blood type O are more susceptible to peptic ulcers
and cholera, yet the trait did not die out.
- Mutations in the p53 gene, which in its wild type codes for a protein that
suppresses abnormal cell proliferation, are associated with cancer.
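The redundancy argument above can be checked directly against the codon table: for a given codon, count how many of its nine possible single-base substitutions leave the encoded amino acid unchanged. The sketch below uses the standard genetic code with one-letter amino acid codes (a shorthand not used in these notes); the function name is made up for this illustration.

```python
from itertools import product

# Standard genetic code, listed with codon bases in the order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_neighbors(codon):
    """Count single-base substitutions that leave the amino acid unchanged."""
    original = CODON_TABLE[codon]
    count = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a substitution
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == original:
                count += 1
    return count


print(synonymous_neighbors("GCC"))  # 3: any change at the third base still gives alanine
print(synonymous_neighbors("AUG"))  # 0: the start codon has no silent substitutions
```

Codons such as GCC tolerate any change at the third position, while codons with no synonyms, such as AUG, are changed by every substitution.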
Single nucleotide polymorphisms (SNPs)
- Example: blood type. The difference between alleles A and O: O is A with a
single G base deleted.
- Example: the apolipoprotein E gene (ApoE, for short) and Alzheimer’s disease.
- ApoE has three alleles (called E2, E3 and E4), each of which differs from the
others by one or two SNPs.
- Having the E4 allele => a greater risk of Alzheimer’s.
- Having the E2 allele => a lesser risk of developing Alzheimer’s.
- SNPs are not randomly scattered along a chromosome.
- They occur in groups, called haplotypes, that are inherited together.
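Given two aligned versions of the same stretch of DNA, single-base differences can be located with a position-by-position comparison. This is a toy sketch: it assumes the sequences are already aligned and of equal length, and both the function name and the example alleles are invented (they are not real ApoE or ABO sequences).

```python
def single_base_differences(seq_a, seq_b):
    """List (position, base_in_a, base_in_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


allele_1 = "ATGCCGTA"  # invented example
allele_2 = "ATGTCGTA"  # same sequence with one substituted base
print(single_base_differences(allele_1, allele_2))  # [(3, 'C', 'T')]
```

A difference found this way only counts as a polymorphism if it is common enough in the population; a comparison of two individuals cannot establish that on its own.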
Genotype-phenotype relationships
- Association between ApoE and Alzheimer’s disease.
- Genotype refers to the genetic makeup of an individual.
- Phenotype refers to the outward characteristics of the individual.
Genomics
- Structural genomics. Establish genome sequences for organisms, particularly
humans.
- Functional genomics, which, as its name implies, is the study of the functions of
genes.
- Genes do not function in isolation. Instead, genes (and proteins) operate
collectively in pathways, as coordinated sequences of genetic and molecular
activities.
- Pathways underlie all cellular processes. Constant interplay between genes,
proteins and external factors
- Microarrays: allow the simultaneous study of large numbers of molecular entities,
giving a snapshot of their activity.
- Web page with biological pathway examples: http://emp.mcs.anl.gov/
- Questions in functional genomics :
Which genes are expressed in which tissues?
How is the expression of a gene affected by extracellular influences?
Which genes are expressed during the development of an organism?
How does gene expression change during development and differentiation?
What is the effect of misregulated expression of a gene?
What patterns of gene expression cause a disease or lead to disease
progression?
What patterns of gene expression influence response to treatment?
The role of genomics in medical research and pharmaceutical research
- Genotype-phenotype relationships.
- Risk of diseases.
- How genotype affects drug response.
- Hope for a more personalized medicine.
Proteins:
Bioinformatics activities
- Creation and maintenance of databases:
GenBank (a database that contains public DNA and protein sequence data),
SWISS-PROT (a protein sequence database),
PDB (a database of three-dimensional biological macromolecular structure data).
- Analysis of sequence information: specialized tools (e.g., BLAST) are used to
search, view and analyze the data in these databases.
- Prediction of three-dimensional structure.
- Expression analysis, using statistical and data mining tools.
- Modeling dynamic life processes: developing ways of putting together the
information gathered from all the diverse areas of research in order to understand
fundamental life processes.
Supplementary reading
Clark, D. and L. Russell (1997). Molecular Biology Made Simple and Fun, Vienna,
IL: Cache River Press.
Alberts, B., D. Bray, J. Lewis, M. Raff, K. Roberts and J. Watson (1994).
Molecular Biology of the Cell, New York, NY: Addison-Wesley.
Strachan, T. and A.P. Read (1999). Human Molecular Genetics, second edition,
New York, NY: John Wiley.
Vingron, M. (2001). Bioinformatics needs to adopt statistical thinking.
Bioinformatics, 17, 389-390.
Gonick, L. and M. Wheelis (1991). A Cartoon Guide to Genetics. New York, NY:
Harper Collins.