BRIEFINGS IN FUNCTIONAL GENOMICS. VOL 13. NO 1. 28–38
doi:10.1093/bfgp/elt031

Gene regulatory elements of the cardiac conduction system

Karel van Duijvenboden, Jan M. Ruijter and Vincent M. Christoffels

Advance Access publication date 22 August 2013

Abstract
The coordinated contraction of the heart relies on the generation and conduction of the electrical impulse. Aberrations of the function of the cardiac conduction system have been associated with various arrhythmogenic disorders and increased risk of sudden cardiac death. The genetics underlying conduction system function have been investigated using functional studies and genome-wide association studies. Both methods point towards the involvement of ion channel genes and the transcription factors that govern their activity. A large fraction of disease- and trait-associated sequence variants lie within non-coding sequences, enriched with epigenetic marks indicative of regulatory DNA. Although sequence conservation as a result of functional constraint has been a useful property to identify transcriptional enhancers, this identification process has been advanced through the development of techniques such as ChIP-seq and chromatin conformation capture technologies. The role of variation in gene regulatory elements in the cardiac conduction system has recently been demonstrated by studies on enhancers of SCN5A/SCN10A and TBX5. In both studies, a region harbouring a functionally implicated single-nucleotide polymorphism was shown to drive reproducible cardiac expression in a reporter gene assay. Furthermore, the risk allele abrogated enhancer function in both cases. Functional studies on regulatory DNA will likely receive a boost through recent developments in genome modification technologies.
Keywords: cardiac conduction system; arrhythmias; enhancers; genetic variation; epigenetic variation; TBX transcription factors

Corresponding author. Vincent M. Christoffels, Department of Anatomy, Embryology & Physiology, Academic Medical Centre, L2-108, Meibergdreef 15, 1105 AZ Amsterdam, the Netherlands. Tel.: +31 20 5667821; Fax: +31 20 6976177; E-mail: [email protected]

Karel van Duijvenboden studied molecular and cellular biology and is currently a PhD student in the Christoffels lab in the Academic Medical Center, Amsterdam. His research focuses on the development of in silico methods for the identification of tissue-specific enhancers and analysis of cardiac gene regulatory networks.
Jan M. Ruijter trained as medical biologist and worked in endocrinology, neurobiology, biostatistics, ophthalmology and embryology. He currently heads a research group studying cardiac gene expression and development with molecular, image analysis and 3D-reconstruction techniques.
Vincent M. Christoffels is head of the Department of Anatomy, Embryology and Physiology of the Academic Medical Center, University of Amsterdam. He studies transcriptional regulatory mechanisms of development, focusing on the regulation of heart and conduction system development.

© The Author 2013. Published by Oxford University Press. All rights reserved. For permissions, please email: [email protected]

THE FUNCTION OF THE CARDIAC CONDUCTION SYSTEM
Sudden cardiac death (SCD) from ventricular fibrillation, often in the context of coronary heart disease, is a leading cause of death [1, 2]. The occurrence of arrhythmia in general, including sinus node dysfunction, atrioventricular (AV) block and atrial fibrillation, represents a major worldwide health problem [3]. The majority of such abnormal heart rhythms develop with increasing age or in a setting of comorbidity, though a significant portion is present at birth as a consequence of congenital heart defects, the most prevalent type of birth defect [4]. The coordinated contractions of the atria and ventricles require the rhythmic generation of a depolarizing wave of electricity by the cardiac pacemaker and the coordinated propagation of this wave by the cardiac conduction system. This process is capacitated by localized cardiomyocyte specialization in the different components of the heart. The compartment-specific differences in the expression of ion channel and gap junction genes define the specific electrophysiological characteristics of the distinct cell populations that are involved in the generation of the action potential and propagation of the impulse [5–7] (Figure 1).

[Figure 1 labels: ECG segments PR, QRS and QT (repolarization); conduction system components SAN, AVN, AVB, RBB, LBB and PVCS; genes listed in the boxes: CAV1, SCN10A, HAND1, SCN5A, MEIS1, TBX20, TBX5, NKX2-5, CACNA1D, KCNQ1, SOX5, TBX3, DKK1, KCNE1, CDKN1A, NOS1AP, NF1A.]
Figure 1: Schematic representation of the surface ECG and components of the cardiac conduction system. The cardiac conduction system consists of the sino-atrial node (SAN), atrioventricular node (AVN), atrioventricular bundle (AVB), right bundle branch (RBB), left bundle branch (LBB) and the peripheral ventricular conduction system (PVCS). The association between the cardiac conduction system components and the ECG parameters is shown. GWAS identified genes near loci associated with variation in PR interval, QRS complex or QT interval. The boxes list a selection of these genes.

ECG REFLECTS CARDIAC CONDUCTION SYSTEM FUNCTION
Variables derived from the electrocardiogram (ECG) are indicative of the function of the cardiac conduction system. They can describe a range of intermediate phenotypes of arrhythmogenic disorders and SCD risk [8]. The measurements routinely obtained with ECG include heart rate, PR interval, QRS duration and QT interval.
A strong correlation between elevated heart rate and cardiovascular morbidity and mortality has been established [9]. The PR interval reflects conduction through the AV node. The QRS complex records the depolarization of the ventricles through the Purkinje system and ventricular myocardium. Deviations in these measures of electrical activation have been associated with increased risk of potentially lethal arrhythmias [10, 11]. The QT interval reflects myocardial repolarization. Both tails of the QT interval distribution are linked to high risk of SCD. Numerous rare mutations in ion channel genes give rise to known Mendelian long- and short-QT syndromes [2, 12].

THE POWER AND LIMITATIONS OF GWAS
In genetic epidemiology, genome-wide association studies (GWAS) represent a research technique to examine many common genetic variants in different individuals to see if any variant is associated with a trait, such as a parameter derived from the ECG, or a major disease. These studies typically compare the genomic sequences of case and control groups to determine which single-nucleotide polymorphisms (SNPs) are more frequent in people with the trait. GWAS can also be used to study continuous traits that are attributable to polygenic effects. The relative contribution of a sequence variant to a complex varying phenotype, such as blood pressure, is then estimated using quantitative trait locus (QTL) mapping, wherein statistical methods are applied to differentiate between marker genotype groups. To date, a multitude of GWAS have identified statistical associations between common sequence variants and ECG variables [13–19]. Strikingly, though not unexpectedly, the loci identified in these GWAS are enriched with ion channel genes and cardiac transcription factors (TFs) that govern their activity. These TFs are also known to be critical for the development of the heart and its conduction system (e.g. [20, 21]).
These sequence variants, which are statistically associated with electrophysiological properties of the heart, may thus identify the genetic components of a gene regulatory network of TFs and target genes controlling the function of the conduction system. However, although the GWAS data greatly help in pointing out the suspect areas of the genome, it is often challenging to move from these statistical associations to knowledge of the underlying biological mechanism that explains why a genomic interval correlates with a complex physiological trait.

Further complicating the functional annotation of the identified genetic variants is the fact that GWAS identify tag SNPs for regions in linkage disequilibrium (LD). In population genetics, a combination of alleles is in LD when their observed haplotype frequencies deviate from the frequencies expected under random genetic cross-over. When alleles are in complete LD, QTL segregation is impossible as the allele will always be in the same marker genotype group in each GWAS comparison. This hampers the identification of the precise variant in each region that causes the statistical association with the trait. In some cases the associated risk allele resides in a protein-coding region, causing a different amino acid to be encoded. However, such clear-cut causative biological mechanisms are a rare find. It has been reported that between 88% and 93% of disease- and trait-associated variants emerging from GWAS lie within non-coding sequences [22, 23]. Since these variants are reported to be concentrated in NFkB binding sites [24] and deoxyribonuclease I (DNaseI) hypersensitive sites [23], both markers indicative of regulatory DNA, these high percentages do not merely reflect the increased polymorphism of non-coding DNA. Instead, such findings suggest a pervasive involvement of variation in regulatory DNA in common human disease.
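
The LD definition given above can be made concrete with the standard coefficients D and r², which this review does not spell out. The following is a minimal sketch in Python, assuming phased haplotypes coded as 0/1 at two biallelic loci; the function name and the toy data in the usage note are illustrative assumptions and do not come from any of the cited studies.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r2) for two biallelic loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) pairs,
    each allele coded 0 or 1 (hypothetical input format).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")

    # Allele frequencies of the '1' allele at each locus.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n

    # Observed frequency of the haplotype carrying '1' at both loci.
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # D: observed haplotype frequency minus the frequency expected
    # if the two loci were inherited independently.
    d = p_ab - p_a * p_b

    # r2 rescales D to the range 0..1; undefined if a locus is monomorphic.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2
```

With 50 haplotypes (1, 1) and 50 haplotypes (0, 0), the two loci are in complete LD and r² equals 1, which is the situation in which a tag SNP cannot be distinguished from the causal variant. With all four haplotypes equally frequent, D and r² are both 0.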
TRANSCRIPTIONAL REGULATION BY CIS-REGULATORY ELEMENTS
The stage- and tissue-specific requirements of the genes responsible for the generation and functioning of the cardiac conduction system, require complex spatiotemporal gene expression patterns in these tissues. In metazoans, an important part of gene regulation occurs at the transcriptional level. The docking of RNA polymerase to the promoters of genes to initiate transcription is mediated by the interactions between TFs and cis-regulatory DNA sequences. TFs can be cell-lineage-specific, or activated exclusively in the presence of external signals, like hormones. Enhancers are a class of cis-regulatory elements that promote gene expression but can be located in intergenic regions, introns and exons, tens to hundreds of kilobases from their target genes (e.g. [25]). The bulk of DNA is compacted into chromatin, but enhancers must be accessible to proteins and are, therefore, localized in euchromatin regions with exposed DNA. The regions of the genome harbouring enhancers may require appropriate stimuli to become accessible. To that end, histones, the proteins that constitute nucleosomes, can be modified at different residues by chromatin remodelling enzymes. A wide assortment of histone modifications allows for dynamic repositioning of nucleosomes [26, 27]. For example, the acetylation of lysine 27 at histone 3 (H3K27ac) has been linked to active enhancers [28]. The acetyltransferase p300 is a catalyst for histone acetylation and given this role, it has been a popular tool in enhancer identification [29, 30]. The end result of enhancer activation by nucleosome repositioning, acetylation, TF binding and the docking of RNA polymerase is gene transcription. Mounting evidence supports a mechanism where enhancers directly contact their target gene promoters and initiate gene expression by looping out of intervening chromatin [31–33], reviewed in [34]. Enhancers typically span a few hundred base pairs (bp) and are composed of clusters of TF-specific binding sites, which facilitate interaction with both trans-activating and -repressive factors. Complex gene expression patterns in different tissues and time points can arise through biological integration of these different cues and physically supplying them to their targeted gene promoters.

ENHANCER IDENTIFICATION THROUGH CONSERVATION
It has been hypothesized that enhancers, being important biological sequences, are conserved between species due to functional constraints. This assumption has been validated by a number of studies that have been successful in identifying functional sequences solely using inter-species genomic comparisons [35–38]. For example, a study by Visel et al. [38] found that half (115/231) of the ultraconserved regions they tested drove reproducible reporter gene expression in various tissues of the developing mouse embryo. In contrast to these findings, approaches using epigenomic cues have illustrated how poorly conserved enhancers can be in certain tissues [39–41]. Using extreme conservation as the guideline, Blow et al. [39] were largely unsuccessful in identifying cardiac enhancers. However, when they searched for cardiac enhancers using p300 binding sites from embryonic day 11.5 (e11.5) mouse embryos, they reported a success rate of 68% (89/130). Furthermore, the cardiac p300 binding sites they uncovered showed lower evolutionary conservation than their counterparts found in brain and limb tissue.
Low success rates of evolutionary approaches in enhancer discovery can be partly explained by limitations of the assays used, in terms of selected time points of development, but intuitively it makes sense that the small size of TF binding sites (6–12 bp) results in highly localized and, therefore, limited sequence constraints. On the other hand, it could be argued that enhancers consisting of multiple TF binding sites, arranged in modular fashion within large clusters, would be under considerable selection pressure.

CHIP-SEQ IS A POWERFUL TOOL TO LOCATE ENHANCERS
Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become the gold standard approach to map protein–DNA interactions in vivo on a genome-wide scale [42–44]. The ChIP-seq approach is independent of sequence conservation and directly targets the epigenetic marks from the tissues of study. The result of ChIP-seq analysis is the quantified occurrence of DNA fragments, which reflect the genomic occupancy by the factor through direct binding or indirect binding via complex formation. Such quantitative maps of potential cis-regulatory binding elements are useful for enhancer identification, as proven through the use of enhancer reporter vectors in vitro [45] and in vivo [29]. ChIP-seq on tissue-specific TFs provides useful marks for finding enhancers that are active in the tissue of choice [43, 46–48]. Other commonly used marks for identifying putative enhancers include p300, RNA polymerase 2 (Pol2), H3K27ac and H3 monomethylated at K4 (H3K4me1). Collective efforts, such as the ENCODE project, have generated genome-wide maps of histone modifications, TF binding and DNaseI hypersensitivity in a variety of cell lines and tissues [49–51]. Each of these datasets can potentially be used to identify regulatory elements in the non-coding genome.
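
How several such datasets might be combined can be illustrated with a simple interval-overlap count. The sketch below is a hypothetical illustration in Python, not a method taken from the cited studies: it assumes each dataset is available as a list of (chromosome, start, end) peaks with half-open coordinates, and it keeps candidate regions that are supported by at least a chosen number of marks. The function names, the `min_marks` threshold and all coordinates are assumptions made for the example.

```python
def supporting_marks(candidate, tracks):
    """Return the names of the tracks with a peak overlapping the candidate.

    candidate: (chrom, start, end), half-open coordinates.
    tracks: dict mapping a mark name (e.g. "p300") to a list of peaks,
            each peak also given as (chrom, start, end).
    """
    chrom, start, end = candidate
    supported = []
    for name, peaks in tracks.items():
        # Two half-open intervals overlap if each starts before the other ends.
        if any(c == chrom and s < end and start < e for c, s, e in peaks):
            supported.append(name)
    return supported


def rank_candidates(candidates, tracks, min_marks=2):
    """Keep candidates supported by at least min_marks tracks, best first."""
    scored = [(cand, supporting_marks(cand, tracks)) for cand in candidates]
    kept = [(cand, marks) for cand, marks in scored if len(marks) >= min_marks]
    # sorted() is stable, so ties keep their input order.
    return sorted(kept, key=lambda item: len(item[1]), reverse=True)
```

A call such as `rank_candidates(regions, {"p300": p300_peaks, "H3K27ac": h3k27ac_peaks})` would return only the regions covered by both marks. A plain count treats every mark as equally informative, which is a simplification; it is meant only to show the overlap logic.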
4C-SEQ TECHNOLOGY DEMONSTRATES REGULATORY POTENTIAL OF GENOMIC REGIONS
The development of chromatin conformation capture (3C) technology and its genome-wide derivatives have enabled the unbiased study of the spatial organization of DNA into chromatin. The strategy of 3C technology relies on digestion and religation of fixed chromatin in cells [52]. The subsequent quantification of ligation junctions allows insights into DNA contact frequencies. Using this technology, a chromatin loop can be demonstrated to exist if two distal sites on the same chromosome form more ligation junctions with each other than with intervening sequences. Circular chromatin conformation capture (4C) extended 3C technology to enable the more unbiased screening of the genome from a single viewpoint [33, 53]. In 4C, the ligated 3C template is processed with a second round of DNA digestion and ligation to create small DNA circles. An inverse PCR using viewpoint-specific primers then amplifies all sequences that contacted the chromosome at the chosen viewpoint. An even higher resolution and larger dynamic range can be achieved when 4C is coupled with high-throughput DNA sequencing (4C-seq) [54]. It is important to realize that spatial proximity of a genomic region with a promoter only conveys the possibility of an interaction between the two; alternatively, it could indicate the location of a poised enhancer or simply be the result of non-functional stochastic DNA looping. However, the demonstration of spatial proximity of candidate enhancer and promoter regions using 4C-seq methodology constitutes a very powerful observation, given the fact that enhancer function requires such proximity to the target promoter for direct contact.
GENETIC VARIATION FUNCTIONALLY AFFECTS SCN5A/SCN10A ENHANCER
A series of GWAS indicated that genetic variants at loci harbouring genes for cardiac TFs (including NKX2-5, TBX3 and TBX5) and ion channels (most notably SCN5A and SCN10A) modulate cardiac impulse conduction and increase the risk of arrhythmia [13–19]. These three TFs are essential for heart development and function of the cardiac conduction system (reviewed in [20, 55, 56]). Several of the genetic variants are located in non-coding regions, leading van den Boogaard et al. [48] to hypothesize that these variants affect the function of enhancers controlling the expression pattern of the implicated nearby genes. In this study, two T-box-regulated enhancers were identified in the Scn5a/Scn10a locus [48]. After performing ChIP-seq on the cardiac TFs TBX3/TBX5, NKX2-5, GATA4 and the enhancer-associated factor p300, overlap analysis (Figure 2A) revealed that a conserved region in Scn10a contained multiple TF binding sites and the SNP rs6801957 implicated by GWAS. This SNP is directly positioned under a TBX3 ChIP-seq peak and the G–A substitution it represents is located in a highly conserved portion of the T-box binding consensus sequence (Figure 2C). When exposing the implicated sequence to TBX5 and TBX3 using in vitro experiments, the reported responses were consistent with the function of TBX5 as an activator [57] and TBX3 as a transcriptional repressor [58, 59]. The observed endogenous expression patterns of Scn5a and Scn10a indeed correlate positively and negatively with those of Tbx5 and Tbx3, respectively. In vivo enhancer activity of the implicated sequence was demonstrated when the human and mouse orthologous fragments were cloned into an Hsp68-LacZ reporter vector. The LacZ expression pattern driven by the regulatory fragment closely resembled that of both endogenous Scn5a and Scn10a, matching the future ventricular conduction system components (Figure 2C).
In contrast, the human fragment cloned with the risk allele showed decreased enhancer-driven reporter activity in zebrafish and the loss of T-box-mediated response in vitro, confirming that rs6801957 variation affects enhancer function by perturbing a T-box site. Individuals carrying the risk allele of this SNP may express reduced levels of SCN5A/SCN10A (it has not been defined which promoters are controlled by this enhancer), suggesting a likely mechanism to explain the observed association with the increased risk of arrhythmia. Further data demonstrating T-box-mediated regulation of the Scn5a locus were provided by Arnolds et al. [60]. This study shows that an enhancer 15 kb downstream of Scn5a loses the capacity to drive cardiac LacZ expression upon mutating three conserved T-box binding sites. The dosage of a TF can be of critical importance in development (e.g. Tbx5 [61] and Tbx3 [62]). Such dosage sensitivity is conferred through the presence of multiple binding sites for a single TF in the regulatory DNA of the target gene. Such modular enhancer function allows for the fine-tuning of localized gene expression that is necessary for conduction system function. Consistent with this paradigm is the observation that the effect of the rs6801957 variant on conduction times was very small, implying that multiple genetic variants together are required to result in a cumulative, significant effect on sodium channel gene expression. It is likely that other genetic variants in non-coding DNA identified by GWAS will influence the cardiac conduction system in a similar way. Their cumulative effect may then lead to disease.

GENETIC VARIATION IN TBX5 ENHANCER LEADS TO CONGENITAL HEART DISEASE
More evidence for the role of regulatory variation in cardiac function and disease came from a study by Smemo et al. [63]. Several coding mutations in TBX5 are known to lead to Holt–Oram syndrome [64].
Holt–Oram syndrome patients invariably show malformations in the upper limbs and have an increased risk of congenital heart defects. The incomplete penetrance of the cardiac phenotype suggests that additional triggers are necessary. Due to the reported dosage-sensitive function of TBX5, Smemo et al. [63] screened the TBX5 locus for developmental enhancers. The 700 kb gene desert that encompasses TBX5 was narrowed down for regulatory potential using 3 LacZ recombined mouse bacterial artificial chromosomes (BACs), 2 of which showed cardiac expression consistent with the endogenous TBX5 expression profile. Cues from evolutionary conservation and from cardiac ChIP-seq datasets (generated by the Pu laboratory [47]) were used to further pinpoint individual enhancers within the genomic window indicated by the BACs. Smemo et al. [63] selected and screened 19 candidate enhancers of which 18 overlap with genomic areas that show high conservation between chicken and mouse (Figure 2B). Using Hsp68-LacZ reporter vectors, six elements were shown to drive reproducible cardiac expression. This implies that candidate approaches relying on evolutionary conservation can still prove successful to identify cardiac enhancers, despite the reported lower degree of evolutionary conservation of this sub-type of enhancers [39].

[Figure 2 panel labels: genome browser tracks for tested elements, conserved elements with chicken, DNaseI, p300, NKX2-5, TBX3 and GATA4; mouse Scn5a/Scn10a enhancer F1-2 (major); human SCN5A/SCN10A enhancer F1-2 (major); % hearts expressing GFP for rs6801957 G and G>A; human TBX5 enhancer 9 (major) and human TBX5 enhancer 9 (minor, G>T).]
Figure 2: Overview of the regulatory elements discussed in this review. (A, B) UCSC genome browser overviews of the Scn5a/Scn10a locus studied in [48] (A) and the Tbx5 locus studied in [61] (B). In both panels, the upper track shows the tested elements, where arrow heads indicate confirmed regulatory elements; black regions did not drive detectable expression in the assays used. Functional assays of the elements denoted with a C and D are shown in panels (C) and (D), respectively. The track containing genomic regions corresponding with high evolutionary conservation between mouse and chicken was obtained from the UCSC genome browser database. The DNaseI track depicts the results of a DNaseI hypersensitivity assay; the p300 track depicts the results of a p300 ChIP-seq. Both were performed on adult heart and made available by the ENCODE consortium [51]. The NKX2-5, TBX3 and GATA4 tracks depict the results of ChIP-seq with antibodies for these TFs performed on adult mouse heart by van den Boogaard et al. [48]. (C, D) Enhancer activity in vivo using reporter constructs. (C) The C element [panel (A), top track] is shown to drive reproducible expression in the interventricular septum for both the human and mouse orthologue. The element contains the rs6801957 SNP, the major allele is highly conserved in evolution. The minor allele showed diminished enhancer activity in a zebrafish GFP-reporter assay. Pictures courtesy of M. van den Boogaard. ra, right atrium; la, left atrium; rv, right ventricle; lv, left ventricle; ivs, interventricular septum. (D) The Tbx5 enhancer D [panel (B), top track] is shown to drive reporter gene expression at E11.5 in ventricular myocardium, consistent with endogenous Tbx5. A patient with a VSD possessed an SNP at an evolutionary conserved position in this enhancer. The element containing the minor sequence variant showed abrogated cardiac enhancer function. Pictures adapted from Smemo et al. [63], copyright owned by Oxford University Press.
Three enhancers showed patterns largely consistent with cardiac TBX5 expression (Figure 2D), albeit that, interestingly, the left-ventricular-restricted pattern of the endogenous gene was not recapitulated. The genomic regions of the three enhancers were sequenced in a patient cohort consisting of Brazilian children born with isolated congenital heart defects. One patient diagnosed with a non-syndromic ventricular septal defect (VSD) possessed a homozygous G–T substitution in an enhancer driving expression in the ventricular septum. This may impact on cardiac conduction as the AV bundle is established by Tbx5 and Tbx3 in the ventricular septum [60, 65, 66]. After excluding protein-coding mutations in this patient as the cause for the VSD, this low frequency variant (0.3% in the Brazilian population) was tested for enhancer function in vivo. Whereas the wild-type allele showed reproducible cardiac expression in 7 out of 8 transgenic founder mice, the variant allele showed cardiac expression in 1 out of 11 cases only. Moreover, the latter expression was weak and in a different compartment, the atrial free wall. These results were confirmed with a zebrafish assay. Taken together, these findings demonstrate that the low frequency variant abrogates the correct cardiac expression of this enhancer, likely explaining its link to congenital malformations. Furthermore, the modular function of this enhancer was demonstrated. The patient with the enhancer variant did not show any other Holt–Oram syndrome symptoms, indicating that the non-cardiac repertoire of TBX5 function was unaffected, which is consistent with the observation that this enhancer exclusively drives cardiac expression in reporter assays.

INTEGRATIVE METHODS TO LOCATE ENHANCERS
As the role of regulatory elements in common human disease is becoming more established, methods to reliably identify such enhancers are in high demand.
Through ChIP-seq, tens of thousands of TF binding sites and dynamic histone modifications for a wealth of factors and tissues are found. How many of the biochemical events we observe in a ChIP-seq experiment actually contribute to gene regulation remains an open question. In the absence of a single mark that identifies all active enhancers in a given tissue, the challenge to reliably identify these elements is best undertaken through efficient integration of the various cues available. The majority of ChIP-seq peak calling algorithms, including MACS [67], CisGenome [68], PeakSeq [69], SPP [70] and Sole-Search [71], use so-called input controls to correct for sampling bias in the data. Input controls are generated by performing ChIP-seq without an antibody and help identify regions where read density is high despite the absence of biological signal. The use of input controls can be questioned, as protein–antibody interactions have a pronounced effect on the bias. In input controls such interactions are absent, leading to a different bias that cannot be used for proper modelling of the background of an antibody-driven ChIP-seq assay [72]. Consequently, the use of input controls can lead to regions being incorrectly labelled as false positives, so that genuine peaks are ignored. The incorrect elimination of such regions would decrease the power of any integrative method, which would ultimately rely on the overlap of a wealth of different cues. Although integrative analysis on data generated by ChIP-seq has been applied [47, 73], a systematic framework enabling such analysis is still lacking.

NEW GENOME EDITING TECHNOLOGIES TO FURTHER UNRAVEL THE FUNCTIONS OF REGULATORY DNA
Perhaps an even harder challenge than completing the annotation of all enhancers in the genome is finding out which ones actually affect fitness. Until recently, methods to interrogate enhancer function were very limited, costly and time consuming.
However, powerful new approaches for functional genomic applications have been developed. With artificial transcription activator-like effector nucleases (TALENs), single-stranded DNA oligonucleotides can be used to precisely modify sequences at predefined locations in the genome through the induction of double-strand breaks and subsequent repair through non-homologous end joining (NHEJ) or homology-directed repair [74, 75]. Bacteria and archaea resist invading viruses and plasmids through an RNA-based immune system using clustered regularly interspaced short palindromic repeat (CRISPR) and CRISPR-associated (Cas) proteins [76]. Recently, this CRISPR/Cas system has been adapted for efficient multiplexed genome editing [77, 78]. With these possibilities to efficiently target the genome in a direct way, new avenues have been opened to generate insights in enhancer-mediated phenotypes. These developments will also facilitate the functional assessment of other types of regulatory elements, such as insulators and repressors, which are traditionally hard to interrogate with functional studies.

CONCLUSIONS
The development and deployment of the most recent technologies and approaches, such as GWAS, ChIP-seq, 3C and 4C, TALEN and CRISPR, is continuously advancing the identification and functional characterization of regulatory DNA elements (summarized in Figure 3). This has led to improved understanding of the role that regulatory DNA plays in transcriptional regulation and ultimately in the establishment of phenotypes that could predispose to disease. In line with these developments, genetic variations in non-coding DNA have been found that influence cardiac conduction, and the first enhancers have now been identified that mechanistically link these variations to the functioning of the cardiac conduction system. Such studies are establishing a model for understanding the molecular pathology of cardiac conduction system disease.

[Figure 3 labels: enhancer identification (sequence constraint, 3C & 4C technology, ChIP-seq, DNaseI hypersensitivity, GWAS, integration); enhancer function (reporter gene assays: cell culture transfection assay, transient transgenic embryos; high throughput functional analysis; genome modification: TALEN, CRISPR/Cas9); mRNA sequencing; phenotypic read-outs.]
Figure 3: Overview of the enhancer identification and characterization methods described in this review. Non-coding regions of the genome can be screened for the presence of regulatory regions using a combination of GWAS data, evolutionary conservation, ChIP-seq, 3C and 4C technologies and DNaseI hypersensitivity assays. Functional characterization of (putative) enhancers can be performed using in vitro luciferase assays and in vivo reporter gene assays. With the development of flexible new genome modification techniques, such as TALEN and CRISPR, regulatory regions can be altered directly, greatly facilitating functional studies on regulatory DNA. Picture of transgenic embryo adapted from the VISTA Enhancer Browser (http://enhancer.lbl.gov/).

Key Points
Functioning of the cardiac conduction system relies on compartment-specific ion channel and gap junction gene expression.
Genetic variation in gene regulatory elements can influence conduction system function through the aberration of fine-tuned cardiac transcription factor networks.
Cataloguing the location and function of all relevant gene regulatory elements is an ongoing effort and is advanced by continuing technological developments.

Acknowledgement
We thank Malou van den Boogaard for kindly providing the images of Figure 2C.

FUNDING
KvD is supported by a special fellowship of the Academic Medical Centre, Amsterdam.

References