ENCODE 2012
• The Human Genome Project sequenced “the
human genome”
• “the human genome” that we have labeled as
such doesn’t actually exist
• What we call the human genome sequence is
really just a reference
• Furthermore, the current reference genome
sequence is haploid
Whose genome did Celera sequence?
Supposedly:
• African-American
• Asian-Chinese
• Hispanic-Mexican
• Caucasian
• Caucasian
Actually:
• Celera’s genome is Craig Venter’s
Science v. 291, pp. 1304-1351
• Every time an individual cell divides, new
mutations arise; no two cells even within any
individual have the identical sequence.
ENCODE
• The Encyclopedia of DNA Elements (ENCODE)
is a public research consortium initiated by
the US National Human Genome Research
Institute (NHGRI) in September 2003.
• The goal is to find all functional elements in
the human genome.
• All data generated in the course of the project
will be released “rapidly” into public
databases.
• Pilot phase – 2003-2007 – method evaluation – 1% of genome
• Production phase 2007-2012
– September 2012 – 30 papers published
– 442 scientists
– 31 labs
– 147 different types of cells with 24 types of experiments
– 1,642 experiments
– Data released
• Identification and quantification of RNA
species in cells and subcellular compartments
• Mapping of noncoding and protein-coding
genes
• Delineation of chromatin and DNA
accessibility
• Mapping of histone modifications and
transcription factor-binding sites
• Measurement of DNA methylation
Credits: Darryl Leja (NHGRI), Ian Dunham (EBI)
What did they find?
• Controversy!
• Assigned biochemical functions to over 80% of
the genome.
Junk DNA or no?
What is a biochemical function?
“a reproducible biochemical signature”
“millions of switches”
• The vast majority (80.4%) of the human genome participates in at
least one biochemical RNA- and/or chromatin-associated event in at
least one cell type.
• Primate-specific elements as well as elements without detectable
mammalian constraint show, in aggregate, evidence of negative
selection; thus, some of them are expected to be functional.
• Classifying the genome into seven chromatin states indicates an
initial set of 399,124 regions with enhancer-like features and 70,292
regions with promoter-like features, as well as hundreds of
thousands of quiescent regions.
• It is possible to correlate quantitatively RNA sequence production
and processing with both chromatin marks and transcription factor
binding at promoters, indicating that promoter functionality can
explain most of the variation in RNA expression.
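A figure such as "80.4% of the genome" is, at heart, a coverage calculation: take every interval that shows some biochemical signal, merge the overlaps, and divide the covered bases by the genome length. The sketch below shows that arithmetic on made-up data. The function names, the toy intervals and the toy genome length are assumptions for illustration only; this is not the ENCODE pipeline and the numbers are not ENCODE data.

```python
from typing import Iterable, List, Tuple

# Half-open interval [start, end) on a single toy chromosome.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(intervals: Iterable[Interval], genome_length: int) -> float:
    """Fraction of bases that fall inside at least one interval."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_length


# Toy data (hypothetical): three signal regions on a 1,000-base "genome".
toy_length = 1000
toy_signals = [(0, 300), (250, 500), (700, 900)]

merged_signals = merge_intervals(toy_signals)
fraction = covered_fraction(toy_signals, toy_length)
print(merged_signals, fraction)
```

Merging first matters because a base hit by several assays should be counted once; "at least one event in at least one cell type" is a union, not a sum.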
• Many non-coding variants in individual genome
sequences lie in ENCODE-annotated functional
regions; this number is at least as large as those
that lie in protein-coding genes.
• Single nucleotide polymorphisms (SNPs)
associated with disease by GWAS are enriched
within non-coding functional elements, with a
majority residing in or near ENCODE-defined
regions that are outside of protein-coding genes.
In many cases, the disease phenotypes can be
associated with a specific cell type or
transcription factor.
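The comparison in these two bullets (variants in annotated non-coding regions versus variants in protein-coding genes) comes down to counting positions that fall inside intervals. A minimal sketch of that counting step follows, assuming each region set is non-overlapping and uses half-open coordinates. All names, regions and variant positions here are invented for illustration and are not ENCODE or GWAS data.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

# Half-open interval [start, end) on a single toy chromosome.
Interval = Tuple[int, int]


def count_positions_in_regions(positions: Sequence[int],
                               regions: Sequence[Interval]) -> int:
    """Count positions inside any region (regions assumed non-overlapping)."""
    ordered: List[Interval] = sorted(regions)
    starts = [start for start, _ in ordered]
    hits = 0
    for pos in positions:
        # Index of the last region whose start is <= pos, if there is one.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < ordered[i][1]:
            hits += 1
    return hits


# Toy data (hypothetical): one coding region, two annotated non-coding regions.
toy_coding = [(100, 200)]
toy_noncoding_annotated = [(300, 400), (600, 650)]
toy_variants = [120, 310, 350, 610, 800]

coding_hits = count_positions_in_regions(toy_variants, toy_coding)
noncoding_hits = count_positions_in_regions(toy_variants, toy_noncoding_annotated)
print(coding_hits, noncoding_hits)
```

A real enrichment claim would also need a background expectation, for example the same count for matched random positions; the sketch covers only the overlap count itself.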
Changing how we view a gene?
• Genes should be defined by transcripts.
• Transcripts are the basic unit that’s affected by
mutation and selection.
• A “gene” then becomes a collection of
transcripts, united by some common factor.
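One way to picture the transcript-centred definition is as a data structure: transcripts are the primary records, and a "gene" is just the group of transcripts that share some label. The sketch below is only an illustration of that idea. The class names, the `shared_factor` field and the toy transcripts are hypothetical; what the "common factor" should be is left open on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Transcript:
    transcript_id: str
    exons: Tuple[Tuple[int, int], ...]  # half-open (start, end) pairs
    shared_factor: str                  # hypothetical grouping label


@dataclass
class Gene:
    shared_factor: str
    transcripts: List[Transcript] = field(default_factory=list)


def group_into_genes(transcripts: Iterable[Transcript]) -> Dict[str, Gene]:
    """Build genes as collections of transcripts sharing the same label."""
    genes: Dict[str, Gene] = {}
    for transcript in transcripts:
        gene = genes.setdefault(transcript.shared_factor,
                                Gene(transcript.shared_factor))
        gene.transcripts.append(transcript)
    return genes


# Toy data (hypothetical): three transcripts, two grouping labels.
toy_transcripts = [
    Transcript("tx1", ((100, 200), (300, 400)), "locusA"),
    Transcript("tx2", ((100, 200), (500, 600)), "locusA"),
    Transcript("tx3", ((900, 950),), "locusB"),
]

genes = group_into_genes(toy_transcripts)
for label, gene in genes.items():
    print(label, [t.transcript_id for t in gene.transcripts])
```

Changing the grouping label changes which transcripts count as "one gene", which is the point of the slide: the gene is derived from the transcripts, not the other way round.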
• Another related challenge is understanding the genome’s three-dimensional shape. Far from being arranged in a line, chromosomes
are folded in fantastically complicated fractal patterns, and these
topographies appear to shape network interaction.
• “Every gene is surrounded by an ocean of regulatory elements.
They’re everywhere. There are only 25,000 genes, and probably
more than 1 million regulatory elements,” said Job Dekker, a
molecular biophysicist at the University of Massachusetts Medical
School who worked on ENCODE’s structural descriptions of the
genome.
• He continued, “It’s not just one gene touching one regulator. It can
touch and interact with a whole collection of them. It must involve
a very complicated three-dimensional structure. At this scale,
chromosome topography turns out to be incredibly dynamic,
complex and cell type-specific.”
• http://selab.janelia.org/people/eddys/blog/?p=683
• http://arstechnica.com/staff/2012/09/most-of-what-you-read-was-wrong-how-press-releases-rewrote-scientific-history/
• http://blogs.discovermagazine.com/notrocketscience/2012/09/05/encode-the-rough-guide-to-the-human-genome/#ENCODEgene
• http://www.nature.com/news/encode-the-human-encyclopaedia-1.11312
• http://www.nature.com/nature/journal/v489/n7414/full/nature11247.html