Download Human Genome and Human Genome Project

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Genomic imprinting wikipedia , lookup

Human–animal hybrid wikipedia , lookup

Adeno-associated virus wikipedia , lookup

Cre-Lox recombination wikipedia , lookup

Polyploid wikipedia , lookup

Zinc finger nuclease wikipedia , lookup

Mutation wikipedia , lookup

Extrachromosomal DNA wikipedia , lookup

Point mutation wikipedia , lookup

Vectors in gene therapy wikipedia , lookup

Copy-number variation wikipedia , lookup

Therapeutic gene modulation wikipedia , lookup

Segmental Duplication on the Human Y Chromosome wikipedia , lookup

RNA-Seq wikipedia , lookup

Mitochondrial DNA wikipedia , lookup

Human genetic variation wikipedia , lookup

Gene wikipedia , lookup

Genetic engineering wikipedia , lookup

Microevolution wikipedia , lookup

Oncogenomics wikipedia , lookup

Transposable element wikipedia , lookup

NUMT wikipedia , lookup

Metagenomics wikipedia , lookup

No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup

Genome (book) wikipedia , lookup

Designer baby wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

ENCODE wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Public health genomics wikipedia , lookup

Helitron (biology) wikipedia , lookup

Pathogenomics wikipedia , lookup

History of genetic engineering wikipedia , lookup

Non-coding DNA wikipedia , lookup

Minimal genome wikipedia , lookup

Genomic library wikipedia , lookup

Whole genome sequencing wikipedia , lookup

Human genome wikipedia , lookup

Genome editing wikipedia , lookup

Genomics wikipedia , lookup

Genome evolution wikipedia , lookup

Human Genome Project wikipedia , lookup

Transcript
Human Genome
and
Human Genome Project
Louxin Zhang
A Primer to Genomics
• Cells are the fundamental working units of every living
systems.
• DNA is made of 4 nucleotide bases. The DNA sequence
is the particular side-by-side arrangement of bases along
the DNA strand. This order spells out the exact
instructions required to create a particular organism.
• The genome is an organism’s complete set of DNA.
Except for mature red blood cells, all human cells
contains a complete genome arranged in 24 distinct
chromosomes.
A Primer to Genomics
• Each chromosome contains many genes,
the basic physical and functional units of heredity.
Genes are specific sequences of bases that encode
instructions on how to make proteins.
• Proteins perform most life functions and even make up
the majority of cellular structures.
Proteins are large, complex molecules made up of
smaller subunits called amino acids.
A protein folds up into specific three-dimensional
structure that define their particular functions in the cell.
Human Genome Project:
Background
• HGP arose from two key insights in the
early 1980s.
1. The ability to take global views of genomes
could greatly accelerate biomedical research, by
allowing researchers to attack problems in a
comprehensive fashion.
2. The creation of such global views would
requires a communal effort in infrastructure
research.
Human Genome Project
Background
• Key projects helped to crystallize the insights,
including
i) The sequencing of the some bacterial and
animal viruses, as well as the human
mitochondrion between 1977 and 1982.
ii) The development of (random) shotgun
sequencing of long DNA fragments for highthroughput gene discovery, later dubbed with
expressed sequence tags(ETSs) and
assembling computer programs.
How does the human genome
stack up?
Organism
Human (Homo sapiens)
Genome Size
(Bases)
3 billion
Estimated
Genes
30,000
Laboratory mouse (M. musculus)
2.6 billion
30,000
Mustard weed (A. thaliana)
100 million
25,000
Roundworm (C. elegans)
97 million
19,000
Fruit fly (D. melanogaster)
137 million
13,000
Yeast (S. cerevisiae)
12.1 million
6,000
Bacterium (E. coli)
4.6 million
3,200
Human immunodeficiency virus (HIV)
9700
9
Human Genome Project
Goals
The idea of sequencing the entire human genome was first
proposed in discussions at scientific meetings from 1984 to
1986.
And a broader programme was recommended in a report by
NRC, USA in 1998:
• Sequencing the human genome: creation of genetic, physical
and sequence maps of the human genome.
• Parallel efforts in key model organisms.
• The development of technology in support of these objectives
• Research in the ethical, legal, and social issues raised by the
programme.
Human Genome Project
Milestones:
■ 1990:
Project initiated as joint effort of U.S.
Department of Energy and the National Institutes of
Health
■ June 2000: Completion of a working draft of the
entire human genome
■ February 2001: Analyses of the working draft are
published
■ April 2003: HGP sequencing is completed and
Project is declared finished two years ahead of
schedule
U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
What does the draft human
genome sequence tell us?
By the Numbers
• The
human genome contains 3 billion chemical nucleotide
bases (A, C, T, and G).
• The average gene consists of 3000 bases, but sizes vary
greatly, with the largest known human gene being dystrophin at
2.4 million bases.
• The total number of genes is estimated at around 30,000-much lower than previous estimates of 80,000 to 140,000.
• Almost all (99.9%) nucleotide bases are exactly the same in
all people.
• The functions are unknown for over 50% of discovered genes.
U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
What does the draft human genome
sequence tell us
How It's Arranged
• The
human genome's
gene-dense "urban centers"
are in nucleotides G and C.
• In contrast, the gene-poor
"deserts" are rich in the DNA
nucleotides A and T.
What does the draft human genome
sequence tell us
• Genes appear to be
concentrated in random
areas along the genome,
with vast expanses of
noncoding DNA
between.
•
Chromosome 1 has the
most genes (2968), and
the Y chromosome has
the fewest (231).
What does the draft human
genome sequence tell us?
The Wheat from the Chaff
• Less
than 2% of the genome codes for proteins.
• Repeated sequences that do not code for proteins ("junk
DNA") make up at least 50% of the human genome.
Repetitive sequences shed light on chromosome structure
and dynamics. Over time, these repeats reshape the genome
by rearranging it, creating entirely new genes, and modifying
and reshuffling existing genes.
What does the draft human genome
sequence tell us
The repeats fall into five classes:
i) transposon-derived repeats, known as interspersed
repeats.
ii) inactive retroposed copies of cellular genes, known as
processed pseudogenes.
Nonfunctional copies of the exon sequences of an active gene and
thought to arise by integration into chromosomes of a natural cDNA
sequence generated by reverse transcription.
iii) repeats of short k-mers such as (A)n, (CA)n, (AAT)n.
Since they show a high degree of length polymorphisms in the
human population, (CA)n repeat have been used as genetic marker
in genetic mapping.
What does the draft human genome
sequence tell us
iv) segmental duplications, consisting of blocks of 10-300
kb that have been copied from one region of the genome
into another region.
Such duplications appears often in pericentromeres and
subtelomeres of chromosomes.
Recurrent structural rearrangements in duplication regions give
rise to contiguous gene syndromes.
v) tandemly repeated sequences, usually at centromere,
telomers, the short arms of acrocentric chromosomes
and ribosomal gene clusters. These regions are underrepresented in the draft genome sequence.
What does the draft human
genome sequence tell us?
How the Human Compares with Other Organisms
• Unlike the human's seemingly random distribution of gene-rich areas,
many other organisms' genomes are more uniform, with genes evenly
spaced throughout.
• Humans have on average three times as many kinds of proteins as the
fly or worm because of mRNA transcript "alternative splicing" and chemical
modifications to the proteins.
• Humans share most of the same protein families with worms, flies, and
plants; but the number of gene family members has expanded in humans,
especially in proteins involved in development and immunity.
• The human genome has a much greater portion (50%) of repeat
sequences than the mustard weed (11%), the worm (7%), and the fly (3%).
U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
What does the draft human
genome sequence tell us?
Variations and Mutations
• Scientists have identified about 3 million locations where single-base DNA differences
(SNPs) occur in humans. This information promises to revolutionize the processes of
finding chromosomal locations for disease-associated sequences and tracing human
history.
• The ratio of germline (sperm or egg cell) mutations is 2:1 in males vs females.
Researchers point to several reasons for the higher mutation rate in the male germline,
including the greater number of cell divisions required for sperm formation than for eggs.
U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
Future Challenges:
What We Still Don’t Know
• Gene number, exact locations, and functions
• Noncoding DNA types, amount, distribution, information
content, and functions
• Functional genomics
• Evolutionary conservation among organisms
• Proteomes (total protein content and function) in organisms
• Correlation of SNPs (single-base DNA variations among
individuals) with health and disease
• Genes involved in complex traits and multigene diseases
U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
Anticipated Benefits of
Genome Research
Molecular Medicine
• improve diagnosis of disease
• create drugs based on molecular information
• design “custom drugs” (pharmacogenomics) based on individual genetic profiles
Microbial Genomics
• rapidly detect and treat pathogens (disease-causing microbes) in clinical practice
• protect citizenry from biological and chemical warfare
U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
Anticipated Benefits of
Genome Research-cont.
Risk Assessment
• evaluate the health risks faced by individuals who may be exposed to radiation
(including low levels in industrial areas) and to cancer-causing chemicals and toxins
Bioarchaeology, Anthropology, Evolution, and Human Migration
• study evolution through germline mutations in lineages
• study migration of different population groups based on maternal inheritance
• study mutations on the Y chromosome to trace lineage and migration of males
U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
Anticipated Benefits of
Genome Research-cont.
DNA Identification (Forensics)
• identify potential suspects whose DNA may match evidence left at
crime scenes
• exonerate persons wrongly accused of crimes
• identify crime and catastrophe victims
• establish paternity and other family relationships
U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
Anticipated Benefits of
Genome Research-cont.
Agriculture, Livestock Breeding, and Bioprocessing
• grow disease-, insect-, and drought-resistant crops
• breed healthier, more productive, disease-resistant farm animals
• grow more nutritious produce
• develop biopesticides
• incorporate edible vaccines incorporated into food products
• develop new environmental cleanup uses for plants like tobacco
U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
Sequencing Strategy:
Hierarchical shotgun sequencing