Whole genome assembly from next generation sequencing
data using restriction and nicking enzymes in optical
mapping and proximity-based ligation strategies
High-throughput sequencing methods have revolutionized genomic analysis by
producing millions of sequence reads from an organism’s DNA at an ever-decreasing
cost. However, a number of obstacles challenge our ability to generate contiguous
chromosome-sized assemblies from the typically short sequence reads obtained.
These include large regions of repetitive DNA, paralogous gene families and
interspersed retrotransposable elements, which together often comprise up to 70% of
an organism’s genome (1). A number of long read technologies, such as PacBio RS II
sequencing, successfully traverse many of these repetitive elements, but are
associated with higher costs, and even these improved sequencing methods often fall
short of complete chromosome assembly.
Alternative innovative strategies are overcoming the challenge of generating long
contiguous genomic assemblies from short sequence reads. Two broad methodologies
are: i) proximity ligation-based sequencing and ii) optical mapping. These techniques
are highly dependent on the use of restriction enzymes or nicking endonucleases to
either cleave or label DNA at specific sites, with an optimal frequency for downstream
analysis.
Proximity-based ligation
Proximity-based ligation coupled with massively parallel sequencing is exemplified by the Hi-C method (2), which probes the three-dimensional architecture of whole genomes by identifying higher-order chromatin interactions. In the Hi-C method, cells are treated with the
crosslinking reagent formaldehyde; DNA is then digested with a restriction enzyme that leaves a 5′-overhang. The overhang is filled-in using a
dNTP mix that includes a biotinylated nucleotide triphosphate. The resulting blunt-end fragments are ligated under dilute conditions, which
favor ligation events between the crosslinked DNA fragments. The resulting DNA sample contains ligation products consisting of fragments
that were originally in close spatial proximity in the nucleus, marked with biotin at the fragment junctions. A Hi-C library is created by
shearing the DNA and selecting the biotin-containing fragments with streptavidin beads. The number of read pairs between intrachromosomal
regions is a decreasing function of the distance between them.
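The distance dependence of read-pair counts can be illustrated with a short Python sketch. The read-pair coordinates are invented toy values, and the function name and bin size are assumptions chosen for illustration; this is not taken from any Hi-C analysis pipeline.

```python
from collections import Counter

def contacts_by_distance(read_pairs, bin_size):
    """Count intrachromosomal read pairs per bin of genomic separation.

    read_pairs: iterable of (position_a, position_b) on the same chromosome.
    Returns {bin_index: count}, where bin_index = separation // bin_size.
    """
    counts = Counter()
    for pos_a, pos_b in read_pairs:
        counts[abs(pos_a - pos_b) // bin_size] += 1
    return dict(sorted(counts.items()))

# Toy read pairs (hypothetical coordinates in bp).
pairs = [
    (1_000, 6_000), (2_000, 9_000), (500, 4_500),   # close together
    (10_000, 62_000), (3_000, 58_000),              # tens of kb apart
    (1_000, 251_000),                               # far apart
]
decay = contacts_by_distance(pairs, bin_size=50_000)
print(decay)
```

In real data the counts fall off with separation in a similar way, which is what lets read-pair frequency serve as a proxy for distance when ordering contigs.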
Inherent limitations in chromatin capture methods, such as Hi-C, are the requirement for living cells and the enrichment of undesired
interchromosomal associations, such as those attributable to in vivo telomere clusters. The related “Chicago” method, developed by Dovetail
Genomics, overcomes these limitations by capturing linkages from high molecular weight DNA mixed with reconstituted chromatin in vitro
(3). In both the Hi-C and Chicago proximity ligation-based approaches, the choice of restriction endonuclease is critical for linking information
(distance in kb) provided by the read pairs. Frequently, parallel libraries are prepared using distinct restriction enzymes to provide linking
information differing by several hundred kilobases (3). With approximately 300 restriction endonucleases available, it is easy to select
enzymes that provide the desired fragment sizes.
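How enzyme choice sets fragment size can be sketched with a rough in silico digest in Python. The sketch assumes a linear sequence, palindromic recognition sites and complete digestion. The function name `digest` and the random test sequence are hypothetical. For a random sequence with equal base frequencies, an n-bp site is expected about once every 4^n bp, so roughly every 256 bp for a 4-base cutter and every 4,096 bp for a 6-base cutter. The sites shown are those of MboI (^GATC) and HindIII (A^AGCTT).

```python
import random

def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset is the position of the cut within the site on the top strand
    (0 for ^GATC, 1 for A^AGCTT). Assumes a palindromic site, so searching
    one strand is sufficient.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Drop zero-length pieces that arise when a cut falls at the very end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

# Hypothetical random "genome"; real genomes are not random.
random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(200_000))

four_cutter = digest(genome, "GATC", 0)     # 4-base site
six_cutter = digest(genome, "AAGCTT", 1)    # 6-base site

for name, frags in (("GATC", four_cutter), ("AAGCTT", six_cutter)):
    mean_len = round(sum(frags) / len(frags))
    print(name, len(frags), "fragments, mean", mean_len, "bp")
```

Because base composition and repeats skew site frequency in real genomes, observed fragment sizes can differ considerably from the 4^n expectation, which is one reason to test candidate enzymes against a draft sequence.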
Optical mapping
Optical mapping encompasses various techniques for fluorescent imaging of linearly extended DNA molecules to document sequence-specific patterns across large genomic regions (4). Although optical mapping techniques can also address questions concerning specific
genomic loci, epigenetic modification, DNA binding protein distribution, and genomic structural variation (5), the traditional and most
widespread application is in the production of ordered, high-resolution genome-wide pattern maps. Large individual DNA molecules are
typically immobilized on a charged surface or held in solution in an extended state using nanochannels or extension flow devices. The DNA
is then digested with an appropriate restriction enzyme. Cleaved molecules retract at the cut sites to leave a gap. The DNA fragments are
labeled with a fluorescent dye, then visualized by microscopy. The fluorescent intensity of each fragment correlates with the fragment size.
Frequently, optical maps are produced with different restriction enzymes to build a consensus genome map. Argus, an automated optical
mapping system, has been developed by OpGen Inc.
A variation of optical mapping that employs nicking endonucleases to hydrolyze just one strand of DNA has been developed by BioNano
Genomics. Long DNA molecules are electrokinetically driven into nanochannels where they are held in an extended state. The DNA is then
nicked, labeled at the nick site by incorporation of fluorescently labeled nucleotides, and ligated with Taq DNA ligase. This approach has
been successfully applied to complex genomes including the human genome (6,7). The increasing commercial availability of nicking
endonucleases will likely promote wide acceptance of this technology in the genomic community.
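The label pattern that a nicking enzyme would produce on a known sequence can be predicted in silico. The Python sketch below is a minimal illustration, assuming a nicking enzyme with an asymmetric recognition motif, so that labels arise wherever the motif occurs on either strand. The motif GCTCTTC is the recognition sequence of Nt.BspQI; the function names and the test sequence are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def label_positions(sequence, motif):
    """Return sorted 0-based start positions of `motif` on either strand.

    Positions are reported in top-strand coordinates. A motif on the bottom
    strand appears as its reverse complement on the top strand.
    """
    positions = set()
    for pattern in {motif, reverse_complement(motif)}:
        start = sequence.find(pattern)
        while start != -1:
            positions.add(start)
            start = sequence.find(pattern, start + 1)
    return sorted(positions)

def label_spacings(positions):
    """Distances between consecutive labels, the quantity seen on the map."""
    return [b - a for a, b in zip(positions, positions[1:])]

# Hypothetical test sequence with one site on each strand.
test_seq = "AAGCTCTTCAAAAGAAGAGCTT"
sites = label_positions(test_seq, "GCTCTTC")
print(sites, label_spacings(sites))
```

The sketch reports motif start positions rather than the exact nicked phosphodiester bond; at the resolution of optical imaging, an offset of a few bases is negligible.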
When paired with next generation sequencing (NGS), optical mapping offers a powerful solution to the time-consuming and costly processes
of genome assembly and gap closure. Chromosome-sized optical maps provide a scaffold onto which sequence contigs can be oriented and
aligned by overlaying in silico restriction digest or nick site patterns of the contigs onto the maps (8). Interfacing NGS with optical mapping
facilitates de novo sequencing and assembly of large mammalian genomes in the absence of any reference genome (9). As with proximity-based ligation methods, the great diversity of restriction enzyme specificities enables optimization of the cut-site frequency. This, in turn,
maximizes the alignment of sequence contigs to the optical map, and therefore the extent of genome assembly.
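The idea of overlaying a contig's in silico pattern on an optical map can be sketched in Python as a simple sliding-window comparison of fragment lengths. This is a deliberately simplified illustration, not the algorithm used by any particular mapping software: it assumes no missing or extra cut sites, uses only the contig's interior fragments, and applies an arbitrary 10% sizing tolerance. The function name and all fragment sizes are hypothetical.

```python
def place_contig(map_fragments, contig_fragments, tolerance=0.1):
    """Find the best position and orientation of a contig on an optical map.

    Both inputs are ordered lists of fragment lengths (e.g. in kb).
    Returns (map_index, orientation, mean_relative_error) for the best
    window in which every fragment agrees within `tolerance`, or None.
    """
    n = len(contig_fragments)
    if n == 0:
        return None
    best = None
    # Try the contig as given ("+") and reversed ("-") to recover orientation.
    for orientation, pattern in (("+", contig_fragments),
                                 ("-", contig_fragments[::-1])):
        for i in range(len(map_fragments) - n + 1):
            window = map_fragments[i:i + n]
            errors = [abs(m - c) / c for m, c in zip(window, pattern)]
            if max(errors) <= tolerance:
                score = sum(errors) / n
                if best is None or score < best[2]:
                    best = (i, orientation, score)
    return best

# Hypothetical fragment sizes in kb.
optical_map = [12.1, 30.5, 8.2, 45.0, 19.7, 27.3, 5.9]
contig_pattern = [19.5, 44.0, 8.0]   # in silico digest of one contig

placement = place_contig(optical_map, contig_pattern)
print(placement)
```

Here the contig matches only when reversed, so the sketch recovers both its position and its orientation on the map. Practical aligners typically use dynamic programming with scoring that tolerates sizing error and missed or spurious sites.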
1. de Koning, A.P., Gu, W., Castoe, T.A., Batzer, M.A. and Pollock, D.D. (2011) Repetitive elements may comprise over two-thirds of the
human genome. PLoS Genet 7: e1002384. PMID: 22144907
2. Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., et al. (2009) Comprehensive mapping of long-range
interactions reveals folding principles of the human genome. Science 326: 289-293. PMID: 19815776
3. Putnam, N.H., O'Connell, B.L., Stites, J.C., Rice, B.J., Blanchette, M., et al. (2016) Chromosome-scale shotgun assembly using an in
vitro method for long-range linkage. Genome Res. 26: 342-350. PMID: 26848124
4. Dorfman, K.D., King, S.B., Olson, D.W., Thomas, J.D. and Tree, D.R. (2013) Beyond gel electrophoresis: microfluidic separations,
fluorescence burst analysis, and DNA stretching. Chem. Rev. 113: 2584-2667. PMID: 23140825
5. Levy-Sakin, M. and Ebenstein, Y. (2013) Beyond sequencing: optical mapping of DNA in the age of nanotechnology and nanoscopy.
Curr. Opin. Biotechnol. 24: 690-698. PMID: 23428595
6. Mostovoy, Y., Levy-Sakin, M., Lam, J., Lam, E.T., Hastie, A.R., et al. (2016) A hybrid approach for de novo human genome sequence
assembly and phasing. Nat. Methods 13: 587-590. PMID: 27159086
7. Xiao, S., Li, J., Ma, F., Fang, L., Xu, S., et al. (2015) Rapid construction of genome map for large yellow croaker (Larimichthys crocea)
by the whole-genome mapping in BioNano Genomics Irys system. BMC Genomics 16: 670. PMID: 26336087
8. Nagarajan, N., Cook, C., Di Bonaventura, M., Ge, H., Richards, A., et al. (2010) Finishing genomes with limited resources: lessons from
an ensemble of microbial genomes. BMC Genomics 11: 242. PMID: 20398345
9. Dong, Y., Xie, M., Jiang, Y., Xiao, N., Du, X., et al. (2013) Sequencing and automated whole-genome optical mapping of the genome of a
domestic goat (Capra hircus). Nat. Biotechnol. 31: 135-141. PMID: 23263233