CHAPTER 18 LECTURE SLIDES
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Whole Genome Sequencing
• The ultimate physical map is the base-pair
sequence of the entire genome
• Automation of this process increased the
rate of sequence generation
• Genome sequencing is one case in which
technology drove the science, rather than
the other way around
• Sequencers provide accurate sequences
for DNA segments up to 800 bp long
• To reduce errors, 5–10 copies of a
genome are sequenced and compared
• Vectors used to clone large pieces of DNA
– Yeast artificial chromosomes (YACs)
– Bacterial artificial chromosomes (BACs)
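The idea of sequencing several copies and comparing them can be sketched as a per-position majority vote. This is only a minimal illustration: it assumes the reads are already aligned and of equal length, and the function name `consensus` and the sample reads are hypothetical, not part of the lecture.

```python
from collections import Counter

def consensus(reads):
    """Majority-vote consensus across equal-length, pre-aligned reads."""
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must be aligned to the same length")
    # For each aligned column, keep the base seen most often.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))

# Five hypothetical reads covering the same 8 bp region (5-fold coverage).
reads = [
    "ATGCCGTA",
    "ATGCCGTA",
    "ATGACGTA",  # this read carries a single sequencing error (C read as A)
    "ATGCCGTA",
    "ATGCCGTA",
]
print(consensus(reads))
```

With five copies, the single erroneous base is outvoted by the four correct reads at that position.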
The Human Genome Project
• Originated in 1990 by the International Human
Genome Sequencing Consortium
– Goal of this publicly funded effort was to use a clone-by-clone approach to sequence the human genome
• Craig Venter formed a private company and
entered the “race” in May, 1998
– Using shotgun-sequencing
• In 2001, both groups published a draft sequence
• Gaps in sequence still being filled
• Still being revised
The Human Genome Project
In 2004, the “finished” sequence was published
as the reference sequence (REF-SEQ) in
databases
-3.2 gigabasepairs
-1 Gb = 1 billion basepairs
-Contains a 400-fold reduction in gaps
-Error rate = 1 per 100,000 bases
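As a rough worked example, the genome size and error rate quoted above imply how many incorrect bases may remain in the finished sequence. The calculation below uses only the two figures from the slide and assumes errors are spread evenly; the variable names are illustrative.

```python
GENOME_SIZE_BP = 3_200_000_000   # 3.2 Gb, as stated on the slide
BASES_PER_ERROR = 100_000        # error rate of 1 per 100,000 bases

# Expected number of erroneous bases if errors are spread evenly.
expected_errors = GENOME_SIZE_BP // BASES_PER_ERROR
print(f"Expected erroneous bases: {expected_errors:,}")
```

Under that assumption, the finished sequence would still contain on the order of tens of thousands of incorrect bases.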
Characterizing Genomes
The Human Genome Project found fewer
genes than expected
-Initial estimate was 100,000 genes
-Number now appears to be about 25,000!
In general, eukaryotic genomes are larger and
have more genes than those of prokaryotes
-However, the complexity of an organism is
not necessarily related to its gene number
Finding Genes
Genes are identified by open reading frames
-An ORF begins with a start codon and
contains no stop codon for a distance long
enough to encode a protein
Sequence annotation
-The addition of information, such as ORFs,
to the basic sequence information
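The ORF definition above (a start codon followed by a long enough run with no stop codon) can be sketched as a simple scan. This is a simplified illustration, not how production gene finders work: it assumes the standard start codon ATG and stop codons TAA, TAG, and TGA, looks only at the three forward-strand reading frames, and uses a tiny hypothetical `min_codons` threshold so the toy sequence produces a hit.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs for forward-strand ORFs.

    start is the index of the ATG; end is exclusive and includes the stop codon.
    ORFs that run off the end of the sequence without a stop are ignored.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        open_start = None  # index of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if open_start is None:
                if codon == START:
                    open_start = i
            elif codon in STOPS:
                n_codons = (i - open_start) // 3  # codons before the stop
                if n_codons >= min_codons:
                    orfs.append((open_start, i + 3))
                open_start = None
    return orfs

# Hypothetical toy sequence: ATG AAA CCC GGG TAA flanked by two bases each side.
seq = "CCATGAAACCCGGGTAACC"
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])
```

Recording the coordinates of each ORF alongside the raw sequence is a small instance of the annotation step described above.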
Noncoding DNA in Eukaryotes
Each cell in our bodies has about 6 feet of
DNA stuffed into it
-However, less than one inch is devoted to
genes!
Six major types of noncoding human DNA
have been described
Noncoding DNA in Eukaryotes
Noncoding DNA within genes
-Protein-encoding exons are embedded
within much larger noncoding introns
Structural DNA
-Called constitutive heterochromatin
-Localized to centromeres and telomeres
Simple sequence repeats (SSRs)
-One- to six-nucleotide sequences repeated
thousands of times
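A simple sequence repeat can be located with a regular expression that looks for a 1 to 6 nucleotide unit followed by further copies of itself. The sketch below is only illustrative: the function name `find_ssrs`, the toy sequence, and the small `min_repeats` threshold are assumptions chosen so the example fits on a slide, whereas real SSRs may be repeated thousands of times.

```python
import re

def find_ssrs(seq, min_repeats=5):
    """Return (start, unit, copies) for tandem repeats of a 1-6 nt unit."""
    # Group 1 is the shortest candidate unit; the backreference \1 demands
    # at least (min_repeats - 1) further back-to-back copies of that unit.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits

# Hypothetical sequence containing six tandem copies of the dinucleotide CA.
seq = "GG" + "CA" * 6 + "TT"
print(find_ssrs(seq))
```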
Noncoding DNA in Eukaryotes
Segmental duplications
-Consist of 10,000 to 300,000 bp that have
duplicated and moved
Pseudogenes
-Inactive genes
Noncoding DNA in Eukaryotes
Transposable elements (transposons)
-Mobile genetic elements
-Four types:
-Long interspersed elements (LINEs)
-Short interspersed elements (SINEs)
-Long terminal repeats (LTRs)
-Dead transposons
Expressed Sequence Tags
ESTs can identify genes that are expressed
-They are generated by sequencing the
ends of randomly selected cDNAs
-But how can 25,000 human genes encode
three to four times as many proteins?
-Alternative splicing yields different
proteins with different functions
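One way to see how a single gene can yield several proteins is to enumerate the transcripts produced when some exons may be either kept or skipped. The sketch below models only exon skipping, which is just one form of alternative splicing; the exon names and the function `isoforms` are hypothetical.

```python
from itertools import combinations

def isoforms(exons, optional):
    """List the exon combinations obtained by keeping or skipping optional exons."""
    skippable = [e for e in exons if e in optional]
    results = []
    for k in range(len(skippable) + 1):
        for skipped in combinations(skippable, k):
            # Exon order is preserved; only the skipped exons are dropped.
            results.append([e for e in exons if e not in skipped])
    return results

# Hypothetical gene with four exons, two of which can be skipped.
exons = ["E1", "E2", "E3", "E4"]
variants = isoforms(exons, optional={"E2", "E3"})
for v in variants:
    print("-".join(v))
```

In this toy model, two optional exons already give four distinct transcripts from one gene.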
Alternative Splicing
Variation in the Human Genome
Single-nucleotide polymorphisms (SNPs)
are sites where individuals differ by only one
nucleotide
-Must be found in at least 1% of population
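The 1% criterion can be illustrated by counting alleles at a single site across a sample. This sketch assumes the threshold applies to the frequency of the less common allele in the sample; the function names and the sample counts are made up for illustration.

```python
from collections import Counter

def minor_allele_frequency(alleles):
    """Frequency of the second most common allele at one site (0.0 if none)."""
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    return counts.most_common(2)[1][1] / total

def is_snp(alleles, threshold=0.01):
    """True if the less common allele reaches the 1% threshold."""
    return minor_allele_frequency(alleles) >= threshold

# Hypothetical samples of 1,000 sequenced chromosomes at one site.
common_variant = ["A"] * 985 + ["G"] * 15   # G at 1.5%
rare_variant = ["A"] * 999 + ["G"] * 1      # G at 0.1%
print(is_snp(common_variant), is_snp(rare_variant))
```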
Genomics
Functional genomics is the study of the
function of genes and their products
DNA microarrays (“gene chips”) enable
the analysis of gene expression at the
whole-genome level
-DNA fragments are deposited on a slide
-Probed with labeled mRNA from
different sources
-Active/inactive genes are identified
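The comparison step can be sketched as a log ratio of the signal each spot gives with mRNA from two sources. This is a simplified illustration: it assumes positive, already normalized intensities, and the two-fold cutoff, the gene names, and the function `classify_spots` are hypothetical rather than taken from the lecture.

```python
import math

def classify_spots(intensities, fold_cutoff=2.0):
    """Label each gene by comparing its signal in source A versus source B.

    intensities maps gene name -> (signal_a, signal_b), both positive.
    """
    cutoff = math.log2(fold_cutoff)
    calls = {}
    for gene, (signal_a, signal_b) in intensities.items():
        log_ratio = math.log2(signal_a / signal_b)
        if log_ratio >= cutoff:
            calls[gene] = "higher in A"
        elif log_ratio <= -cutoff:
            calls[gene] = "higher in B"
        else:
            calls[gene] = "similar"
    return calls

# Hypothetical spot intensities for three genes.
spots = {
    "gene1": (800.0, 100.0),
    "gene2": (120.0, 110.0),
    "gene3": (50.0, 400.0),
}
print(classify_spots(spots))
```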
Proteomics
Proteomics is the study of the proteome
-All the proteins encoded by the genome
The transcriptome consists of all the RNA
that is present in a cell or tissue
Applications of Genomics
Genome science is also a source of ethical
challenges and dilemmas
-Gene patents
-Should the sequence/use of genes be
freely available or can it be patented?
-Privacy concerns
-Could one be discriminated against
because their SNP profile indicates
susceptibility to a disease?