Theory of phase segregation in DNA assemblies containing two different base-pair sequence types

Dominic J (O’) Lee¹, Aaron Wynveen² and Alexei A Kornyshev¹

¹ Department of Chemistry, Imperial College London, SW7 2AZ, London, UK
² School of Physics and Astronomy, University of Minnesota, Minneapolis, MN 55455, USA

E-mail: [email protected], [email protected] and [email protected]

New J. Phys. 19 (2017) 015014, https://doi.org/10.1088/1367-2630/aa5482 (http://iopscience.iop.org/1367-2630/19/1/015014)
PAPER, OPEN ACCESS
Received 26 September 2016; revised 28 November 2016; accepted for publication 19 December 2016; published 30 January 2017
© 2017 IOP Publishing Ltd and Deutsche Physikalische Gesellschaft

Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Supplementary material for this article is available online.

Keywords: DNA, homology recognition, phase segregation, statistical mechanics, DNA assemblies

Abstract

Spontaneous pairing of homologous DNA sequences, a challenging subject in molecular biophysics often referred to as ‘homology recognition’, has been observed in vitro for several DNA systems. One of these experiments involved liquid crystalline quasi-columnar phases formed by a mixture of two kinds of double stranded DNA oligomer. Both oligomer types were of the same length and identical stoichiometric base-pair composition, but the base-pairs followed a different order. Phase segregation of the two DNA types was observed in the experiments, with the formation of boundaries between domains rich in molecules of one type (order) of base pair sequence. We formulate here a modified ‘X–Y model’ for phase segregation in such assemblies, obtain approximate solutions of the model, compare analytical results to Monte Carlo simulations, and rationalise past experimental observations. This study, furthermore, reveals the factors that affect the degree of segregation.
Such information could be used in planning new versions of similar segregation experiments, needed for deepening our understanding of forces that might be involved, e.g., in gene–gene recognition.

1. Introduction

In this article we develop a statistical mechanical model of DNA assemblies containing mixtures of fragments of two different base-pair sequences that are allowed to phase segregate. This model may, for example, describe phase segregation of DNA in cholesteric spherulites. Spherulites are liquid crystal structures formed in DNA-noncondensing electrolytic solutions upon applying mild osmotic stress, which helps to overcome the net repulsion between the DNA molecules.

The current study was initially motivated by experiments [1] in which partial segregation of DNA molecules according to their base-pair sequence texts (i.e., the order of their nucleic bases) was observed in spherulites containing a mixture of two DNA species with different sequence texts. These experiments were performed by mixing equal amounts of 294 bp-long fragments of synthetic dsDNA with two different sequences in 0.5 M NaCl under mild osmotic stress. Each of the two types of sequences contained the same numbers of GC and AT base pairs, but the base pairs were shuffled randomly into two different orderings. As osmotic pressure was applied slowly, causing the molecules to condense into spherulites, the two sequence types simultaneously segregated, which was observed using confocal microscopy. The fragments were observed to be in a cholesteric arrangement with a large pitch (a small tilt angle between cholesteric layers), and so the arrangement of molecules could be described as being, roughly, in a columnar phase, in which DNA fragments are arranged parallel to each other.

To detect segregation between the two species in these experiments [1], each species was labelled by chromophores of a different colour (red or green).
In order not to overload the samples with chromophores, only 5% of DNA fragments were thus labelled. This fraction was nevertheless sufficient to see a signature of homology segregation, in which DNA fragments with the same base-pair ordering preferentially condensed in domains enriched in that sequence type. This was indicated by the appearance of red and green patches in the observed images of each spherulite. The degree of colour segregation was significant enough that it could not be explained simply by random statistics. Conversely, control experiments with dsDNA fragments of only one species, half labelled with one colour of chromophore and half with the other, demonstrated random mixing of the molecules labelled with the two chromophore types. These controls ruled out segregation effects due to the type of chromophore and, hence, showed a difference in base-pair sequence to be the primary driving force behind the segregation.

Such evidence of homology recognition in a protein-free environment has been widely discussed in the scientific press [2] as being an innate consequence of the structure of DNA. Signatures of homology recognition were also reported earlier in gel electrophoresis studies [3]; and some time later, strong indications of the same effect were obtained by single molecule manipulation techniques [4] and AFM studies for DNA on nucleosomes [5]. Further corroborative evidence of homology recognition in a protein-free environment was presented in a very recent study, in which preferential interaction of long homologous tracts, in parallel orientation within single molecules, was observed [6] in magnetic bead distance-force experiments.

The existence of such interaction forces that distinguish between similar and different molecules might be biologically important.
Indeed, the pairing of chromosomes of similar sequences is an essential step in many biological processes [7–9]. It has been speculated that this initial pairing might be some innate general characteristic of DNA [9], resulting, again, from the contrasting forces between DNA with differences in sequence.

Prior to the work of [1], it was shown that the phenomenon of phase segregation between different DNA sequences might be theoretically possible. Indeed, an explanation of the results of the osmotic stress experiments presented in [1] relies on the idea that the overall interaction energy between each pair of fragments, although net repulsive, should contain a helix-specific attractive component. The degree of attraction may be modelled quantitatively through the Kornyshev–Leikin (KL) theory [10, 11]. In the theory, the attractive component is found to be stronger between homologous molecules than between two non-homologous fragments upon considering the non-ideal helix structure of DNA molecules [12, 13]. This difference, referred to as the ‘recognition energy’, was first calculated in [14]. The theory of homology recognition was then developed with a higher level of sophistication in a series of subsequent works (for review see [10, 15], and for a recent study [16] with citations to later articles contained therein).

The recognition energy can amount to several $k_B T$ per persistence length, and it increases with the length of the molecules. However, in the segregation experiments on a liquid-crystal configuration [1], the employed molecules had to be sufficiently short (not much longer than the dsDNA bending thermal persistence length) to form well-ordered liquid crystalline phases. The molecules used in [1] were roughly twice the persistence length and, according to theory, segregation effects should have been weak. This was indeed what was observed: the segregation was weak, but clearly distinguishable.
Later works have suggested that the preferred conformation of DNA molecules with the same sequence will be parallel to each other (i.e. parallel alignment with base pair sequences running in the same direction) but not antiparallel (parallel alignment but base pair sequences running in opposite directions), as this recognition depends on the homologous sequences being in face-to-face register, with base pairs of the same type on the two molecules being directly across from each other [16].

Alternatively, preferential pairing between identical DNA sequences might be, in some part, due to interactions that depend locally on the type of base pair content [6, 17, 18]. Here, two like base pairs have a lower interaction energy than two unlike ones. One such mechanism, mediated by counter-ions being localised by particular base pairs [19], has been discussed in [6]. An alternative mechanism utilising secondary hydrogen bonding was proposed in [17]. Sequence specific variations in van der Waals interactions might also be responsible [18]. Such mechanisms would also be influenced by the base-pair specific pattern of DNA distortions and are manifestly helix-structure-dependent forces. In this work, however, we assume that the global pairing mechanism due to helix distortions (discussed in the previous paragraph) is the root cause of segregation.

In this paper, we lay the groundwork for a model that may rationalise the degree of segregation under different conditions. In the main text we develop a statistical mechanical description taking into account key details of the putative helix specific forces in DNA–DNA interactions; in the supplementary material we append a simpler phenomenological model, which we will refer to in the main text for comparison.
In fact, the final form of the partition function that we use in our calculations (see the end of next section) is independent of a specific pairing mechanism, up to the choice of model parameters, as it simply relies on the fact that there are helix-structure-dependent forces.

Figure 1. Schematic of the cross-sections of two DNA molecules in parallel alignment at the midpoint of their long axes. The diagram shows the azimuthal orientations, $\phi_1$ and $\phi_2$, of the two molecules, the relative positions of both the major and minor grooves, as well as the phosphate strands. Also shown are counter-ions adsorbed at the molecular surface. The arrows that bisect the minor grooves define these azimuthal angles, which we also refer to as the molecules’ ‘spins’.

2. Model

2.1. General considerations

Below, we formulate the principles of a theory that could describe phase segregation between two types of DNA fragments, each with differently ordered base pair texts, in liquid crystalline columnar assemblies. Our theory takes into account thermal fluctuation modes corresponding to rotations of the molecules about their long axes. These modes are important if the helix structure of the molecules matters in DNA–DNA interactions. Indeed, for interactions that depend on helix structure, the pair interaction between molecules depends on the relative azimuthal orientation of the two molecules about their long axes [10]. Thus, such fluctuations are restricted by helix dependent interactions.
Any pair potential (for two molecules labelled 1 and 2) that depends on the continuous helical symmetry of the molecule, for molecules aligned parallel to each other, may be expanded as a Fourier series in $\phi_1 - \phi_2$:

$$V_{1,2}(R) = L \sum_{n=0}^{\infty} a_n(R) \cos\left(n(\phi_1 - \phi_2)\right), \tag{2.1}$$

where $\phi_1$ and $\phi_2$ are the (‘spin’) angles that both helices (at the midpoint of the long axes of the molecules) make to an axis that lies in a plane perpendicular to the long axes of both molecules (see figure 1). We use the term ‘spin’ to emphasise an isomorphism (or analogy) between this approach for modelling interactions in columnar DNA assemblies and a kind of 2D X–Y model of magnetism (modified to account for an extra cosine term). In the latter, the angle describes the orientation of real, magnetic spins. In equation (2.1), $R$ is the inter-axial separation between the two molecules. Also, in writing equation (2.1), the interaction has been assumed to scale linearly with the length of the molecules, $L$. This supposes that the two molecules are fixed in complete parallel juxtaposition with each other. In a 2D assembly of $N$ molecules, there are $N$ such spin angles $\phi_m$ describing these degrees of freedom and $N(N-1)/2$ pair potentials.

Our model describes a single layer of DNA molecules, in columnar alignment, whose molecular centres are assumed to be fixed to a 2D regular lattice structure, in a plane perpendicular to the molecular centre lines, as in [20]. As in [20], however, we allow the lattice to distort from a hexagonal to a rhombic structure to minimise the free energy in response to the DNA–DNA interactions. Also as in [20], as well as summing up the pair potentials for the lattice, we truncate the sum in equation (2.1) at $n = 2$. Thus, our theory involves a modified version of an X–Y magnetic model, originally studied in [20]. Reference [20] dealt solely with columnar aggregates comprising only one type of DNA fragment, i.e., all of them having the same sequence of base pairs.
However, here, with the intention of describing the experiments of [1], we extend this model to the case where there are DNA molecules with two different sequence texts.

For there to be phase segregation between molecules with different sequences, there must be a more attractive interaction between like (homologous) molecules than between different (non-homologous) ones, as was suggested by previous works [14–16, 21]. Here, the difference in the interaction between homologous and non-homologous molecules results from the helical distortions of the DNA double helix, specifically in the twist and rise between adjacent base-pairs. The pattern of distortions is dependent on base pair text [22]. Molecules with the same base pair sequence share the same distortions, and therefore can maintain an energetically preferential charge alignment along their full length, whereas non-homologous molecules, with different sequences and, therefore, different distortions, cannot maintain this alignment. These patterns of distortions influence the positions of the negatively charged helical phosphate groups along the molecule as well as the positions of the groove centres, where positively charged counter-ions may localise. Therefore, homologous pairs may possess a lower interaction energy than their non-homologous counterparts. This was shown to be the case with the mean-field electrostatic KL theory [14, 15].

To model helical distortions of the molecules, modifying the model considered previously [20], we replace the constant azimuthal (or, again, ‘spin’) angle $\phi_m$ with $\phi_m(z)$ (for a molecule at site $m$ in the assembly), since the ‘spin’ angle depends on the position $z$ along the long axis of the molecule due to the distortions of its helical twist. (A discussion and justification for this can be found in [10, 14, 15].) Hence, the function $\phi_m(z)$ depends on the base pair sequence of the molecule.
The form for the interaction energy is then obtained by replacing $L$ in equation (2.1) with an integral over $z$ along the length of the molecule. As there are DNA molecules of two different sequences, and thus two different distortion patterns, the azimuthal orientation of a molecule at the lattice site $m$ is labelled either as $\phi_m^A(z)$ or $\phi_m^B(z)$, for the molecule types A and B. An additional degree of freedom thus arises, since we allow the assembly to exchange molecules of type A and B with an external reservoir, and thus allow the relative proportions of the two types of molecule in the assembly to vary. Summing up all the pair potentials, we can write the total interaction energy for the assembly (including all pair interactions) as

$$E_{int} = \frac{1}{2}\int_{-L/2}^{L/2} dz \sum_{m \neq n}\left[a_0(|\mathbf{R}_m - \mathbf{R}_n|) - a_1(|\mathbf{R}_m - \mathbf{R}_n|)\cos\left(\phi_m^{\alpha}(z) - \phi_n^{\beta}(z)\right) + a_2(|\mathbf{R}_m - \mathbf{R}_n|)\cos\left(2\phi_m^{\alpha}(z) - 2\phi_n^{\beta}(z)\right)\right]. \tag{2.2}$$

As mentioned, in this expression we have truncated the sum over helical harmonics (equation (2.1)) at $n = 2$ (as in [20]). The position vectors $\mathbf{R}_m$ and $\mathbf{R}_n$ are restricted to the fixed lattice sites $m$ and $n$. Here, the indices $\alpha$ and $\beta$ can each be set to either A or B, depending on which of the two species of molecules occupies the sites $m$ and $n$, respectively. Now, the azimuthal angles $\phi_m^{\alpha}(z)$ (and $\phi_n^{\beta}(z)$) can be expressed as

$$\phi_m^{\alpha}(z) = \phi_m^{(0)} + \frac{1}{h}\int_0^z \delta\Omega_{\alpha}(z')\,dz', \tag{2.3}$$

where $h$ (3.4 Å) is the average spacing between two adjacent base-pairs along a molecule. Here $\phi_m^{(0)}$ is the azimuthal angle that the DNA double helix associated with the lattice site $m$ makes, at its molecular centre, with a single axis (defined throughout the assembly); the function $\delta\Omega_{\alpha}(z)$ represents the pattern of distortions away from an ideal helix, which, again, is dependent on the base pair sequence. For an ideal helix $\delta\Omega_{\alpha}(z) = 0$.
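For ideal helices ($\delta\Omega_{\alpha} = 0$) the integrand of equation (2.2) no longer depends on $z$, so each pair contributes $L\,[a_0 - a_1\cos\Delta\phi + a_2\cos 2\Delta\phi]$ with $\Delta\phi = \phi_m^{(0)} - \phi_n^{(0)}$. The following is a minimal sketch of how the preferred relative angle of a single, isolated pair follows from this expression. The function names and the numerical coefficients are illustrative assumptions (they are not values from the KL model), positive $a_1$ and non-negative $a_2$ are assumed, and the frustration present in the lattice problem treated in the text is not captured.

```python
import math


def pair_energy_per_length(dphi, a0, a1, a2):
    """Integrand of (2.2) for one pair of ideal helices (delta Omega = 0)."""
    return a0 - a1 * math.cos(dphi) + a2 * math.cos(2.0 * dphi)


def optimal_relative_angle(a1, a2):
    """Relative angle in [0, pi] minimising the pair energy (a1 > 0, a2 >= 0 assumed).

    The derivative is sin(dphi) * (a1 - 4*a2*cos(dphi)), so the minimum is at
    dphi = 0 when a1 >= 4*a2 and at cos(dphi) = a1 / (4*a2) otherwise.
    """
    if a1 >= 4.0 * a2:
        return 0.0
    return math.acos(a1 / (4.0 * a2))


def brute_force_angle(a1, a2, n_grid=20001):
    """Grid search over [0, pi], used only to cross-check the closed form."""
    grid = [math.pi * i / (n_grid - 1) for i in range(n_grid)]
    return min(grid, key=lambda x: pair_energy_per_length(x, 0.0, a1, a2))


if __name__ == "__main__":
    # Placeholder coefficient pairs (a1, a2), chosen only for illustration.
    for a1, a2 in [(1.0, 0.1), (1.0, 0.5), (1.0, 2.0)]:
        print(a1, a2, optimal_relative_angle(a1, a2), brute_force_angle(a1, a2))
```

For an isolated pair the crossover between aligned and tilted ‘spins’ therefore sits at $a_2/a_1 = 1/4$; the corresponding boundary in the lattice need not coincide with this value.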
If two neighbouring molecules possess random sequences with respect to each other, i.e., are non-homologous, the two functions $\delta\Omega_{\alpha}(z)$ are uncorrelated over large length scales along the DNA molecules, and therefore obey Gaussian statistics:

$$\begin{aligned}
\langle \delta\Omega_A(z) \rangle_{\Omega} &= \langle \delta\Omega_B(z) \rangle_{\Omega} = 0, \\
\langle \delta\Omega_A(z)\,\delta\Omega_A(z') \rangle_{\Omega} &= \langle \delta\Omega_B(z)\,\delta\Omega_B(z') \rangle_{\Omega} = \frac{h^2}{\lambda_c^{(0)}}\,\delta(z - z'), \\
\langle \delta\Omega_A(z)\,\delta\Omega_B(z') \rangle_{\Omega} &= \langle \delta\Omega_B(z)\,\delta\Omega_A(z') \rangle_{\Omega} = 0.
\end{aligned} \tag{2.4}$$

In equation (2.4), the subscript $\Omega$ of the averaging brackets refers to ensemble averaging over all possible base pair realisations. The parameter $\lambda_c^{(0)} = (1/\lambda_{\Omega,\Omega} + 1/\lambda_{h,h} - 2/\lambda_{\Omega,h})^{-1}$ is the (intrinsic) helical coherence length [22] of DNA molecules, where $\lambda_{\Omega,\Omega}$ and $\lambda_{h,h}$ are the contributions due to variations from the mean values of, respectively, twist angles and base pair rises (vertical distances) between adjacent base-pairs along any DNA molecule. The correlation length $\lambda_{\Omega,h}$ arises from a correction due to correlations between the rise and the twist angle. One can represent $\lambda_{\Omega,\Omega} = h/\langle\delta\Omega^2\rangle$ and $\lambda_{h,h} = h/(g^2\langle\delta h^2\rangle)$ (here $g = 2\pi/H$, where $H$ is the average helical pitch of DNA), where $\langle\delta\Omega^2\rangle$ and $\langle\delta h^2\rangle$ are the effective mean-square deviations of base pair twist and rise, between adjacent base pairs, away from their average values. Thus, $\lambda_c^{(0)}$ is a measure of the level of distortion away from the ideal, undistorted, helix structure. For an ideal double helix, $\lambda_c^{(0)} = \infty$.

In general, then, the difference in the interaction energies between homologous and non-homologous molecular pairs only becomes pronounced when the molecules are longer than $\lambda_c^{(0)}$. This is effectively the length over which two non-homologous sequences can stay commensurate with each other (strands on one molecule aligned with grooves on the opposite molecule), if torsional flexibility is not allowed for, and so maintain helix structure dependent interactions with each other. Hereafter, we use the value $\lambda_c^{(0)} \approx 150$ Å, which is within the range estimated in [22].
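The meaning of $\lambda_c^{(0)}$ as a register-loss length can be checked numerically. Combining equation (2.3) for two non-homologous molecules with the statistics of equation (2.4), the accumulated angular mismatch at a distance $z$ from the molecular centre is Gaussian with variance $2|z|/\lambda_c^{(0)}$, so that $\langle\cos(\text{mismatch})\rangle_{\Omega} = \exp(-|z|/\lambda_c^{(0)})$. The sketch below samples this average with a simple random walk over base-pair steps. The discretisation (one independent Gaussian twist deviation of variance $h/\lambda_c^{(0)}$ per step), the sample sizes and all function names are assumptions made for illustration.

```python
import math
import random

H_RISE = 3.4       # Å, average rise per base pair (h in the text)
LAMBDA_C = 150.0   # Å, helical coherence length adopted in the text


def mean_cos_mismatch(n_steps, n_samples=4000, seed=0):
    """Monte Carlo estimate of <cos(mismatch)> after n_steps base-pair steps.

    Each molecule accumulates independent Gaussian twist deviations with
    variance H_RISE / LAMBDA_C per step (a discretised reading of (2.4)).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(H_RISE / LAMBDA_C)
    total = 0.0
    for _ in range(n_samples):
        mismatch = 0.0
        for _ in range(n_steps):
            mismatch += rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma)
        total += math.cos(mismatch)
    return total / n_samples


def predicted_mean_cos(n_steps):
    """Closed form exp(-z / lambda_c) with z = n_steps * h."""
    return math.exp(-n_steps * H_RISE / LAMBDA_C)


if __name__ == "__main__":
    for steps in (10, 44, 100):  # 44 steps is roughly one coherence length
        print(steps, mean_cos_mismatch(steps, n_samples=2000), predicted_mean_cos(steps))
```

Homologous pairs carry identical distortions, so their mismatch vanishes identically and the cosine stays at 1 for all $z$.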
Note that $\phi_m^{\alpha}(z)$ also depends on thermal torsional fluctuations of the molecules. For short molecules this simply renormalizes $\lambda_c^{(0)}$ [10]. However, this can be neglected, as it is a relatively small correction. Another, less trivial effect is torsional adaptation, where the helical structure of the molecules adjusts to facilitate better helix dependent interactions, due to the elasticity of the molecule [23, 24]. The effect of this on the pair interaction was studied in [16, 25–27]. Including these effects in molecular assemblies is mathematically and computationally involved for finite length molecules [28]. However, here we may, in the first approximation, neglect this effect by considering molecules that are not much longer than the thermal persistence length of DNA, like those used in the liquid-crystalline experiments [1].

Using equation (2.3), it is possible to recast equation (2.2) in a more practical form:

$$\begin{aligned}
E_{int} = \frac{1}{2}\int_{-L/2}^{L/2} dz \sum_{m \neq n}\Big[ & a_0(|\mathbf{R}_m - \mathbf{R}_n|) - a_1(|\mathbf{R}_m - \mathbf{R}_n|)\cos\Big(\phi_m^{(0)} - \phi_n^{(0)} + \frac{s_m}{2}(1 - s_m s_n)\,\Delta\phi_{\Omega}(z)\Big) \\
& + a_2(|\mathbf{R}_m - \mathbf{R}_n|)\cos\Big(2\phi_m^{(0)} - 2\phi_n^{(0)} + s_m(1 - s_m s_n)\,\Delta\phi_{\Omega}(z)\Big)\Big],
\end{aligned} \tag{2.5}$$

where

$$\Delta\phi_{\Omega}(z) = \frac{1}{h}\int_0^z \left(\delta\Omega_A(z') - \delta\Omega_B(z')\right)dz'. \tag{2.6}$$

In equation (2.5), the freedom of choice between type A or B molecules has been characterised in terms of a spin variable $s_m = \pm 1$; $s_m = 1$ corresponds to type A and $s_m = -1$ to type B. Thus, the partition function (summing over both the $s_m$ and $\phi_m^{(0)}$ degrees of freedom) can be defined as

$$Z = \prod_{m=1}^{N}\sum_{s_m = -1, 1}\int_0^{2\pi} d\phi_m^{(0)} \exp\left(-\frac{E_{int}\{s_m, \phi_m^{(0)}\}}{k_B T}\right). \tag{2.7}$$

Potentially, for two specific base pair sequences, one should be able to determine the distortion patterns and calculate the partition function for those two sequences. However, our task is to calculate a more general, ensemble-averaged free energy, using the Gaussian statistics described previously. This average is defined as

$$\langle F \rangle_{\Omega} = \langle -k_B T \ln Z \rangle_{\Omega}. \tag{2.8}$$

To calculate equation (2.8) precisely would be rather laborious and computationally expensive, involving detailed simulations that model specific patterns of helical distortions of many possible base-pair realisations, which then must be averaged. It might be possible to apply some kind of replica trick [29] to equation (2.8) to facilitate performing the ensemble average. However, this is not without its problems: firstly, the ensemble average still cannot be performed analytically for this type of model, and secondly, the analytic continuation of the number of replicas, $n$, to zero is not a well-defined limit, so particular care must be taken.

Instead, we utilise an approximation in which each pair potential for different helical distortion realizations is replaced with an ensemble-averaged one. This is effectively replacing quenched disorder with annealed disorder. This approximation automatically arises within the framework of a variational mean-field approximation, where we neglect correlations in $s_m$. When the molecules are completely phase segregated or completely mixed those correlations are unlikely to be important; however, close to the transition this assumption may break down. The details of the variational approximation are fully presented in the supplemental material, whereas a more heuristic derivation of it, which relies on assuming annealed disorder, is presented in the main text. Thus, we write

$$\langle F \rangle_{\Omega} \approx -k_B T \ln Z_{eff}, \tag{2.9}$$

where

$$Z_{eff} = \prod_{m=1}^{N}\sum_{s_m = -1, 1}\int_0^{2\pi} d\phi_m^{(0)} \exp\left(-\frac{\langle E_{int}\{s_m, \phi_m^{(0)}\}\rangle_{\Omega}}{k_B T}\right). \tag{2.10}$$

The approximate free energy described by equations (2.9) and (2.10) is also computed in MC simulations (for details see section 3 below). One may compute the leading order correction (and higher order corrections) to equation (2.9) in both simulations and the variational approximation, from considering the full average, equation (2.8).
The method used to find these corrections, which relies on an alternative to the replica trick, is again discussed in the supplemental material. The average $\langle E_{int}\{s_m, \phi_m^{(0)}\}\rangle_{\Omega}$ is simple to obtain:

$$\begin{aligned}
\langle E_{int} \rangle_{\Omega} = \frac{L}{2}\sum_{m \neq n}\Big[ & a_0(|\mathbf{R}_m - \mathbf{R}_n|) - a_1(|\mathbf{R}_m - \mathbf{R}_n|)\Big(1 - \frac{(1 - s_m s_n)}{2}\,\Psi\Big(\frac{L}{2\lambda_c^{(0)}}\Big)\Big)\cos\left(\phi_m^{(0)} - \phi_n^{(0)}\right) \\
& + a_2(|\mathbf{R}_m - \mathbf{R}_n|)\Big(1 - \frac{(1 - s_m s_n)}{2}\,\Psi\Big(\frac{2L}{\lambda_c^{(0)}}\Big)\Big)\cos\left(2\phi_m^{(0)} - 2\phi_n^{(0)}\right)\Big],
\end{aligned} \tag{2.11}$$

where the function $\Psi$ is defined as

$$\Psi(x) = 1 - \frac{1}{x}\left(1 - \exp(-x)\right). \tag{2.12}$$

The structure of equation (2.11), without the specific form for $\Psi$, is quite generic; thus its form may also be applicable to other candidate mechanisms for homologous pairing. To consider solely a local pairing mechanism (as described in the introduction and also discussed in [6]) one could simply set $\Psi = 1$ in equation (2.11). Independently of this, as is often the case for statistical mechanical studies of specific systems, the overall model described by equations (2.9)–(2.11) may be of more general interest, as it is a modified X–Y spin model describing particles with two types of magnetic interactions.

Figure 2. Diagram illustrating the various possible states in which the molecules can be azimuthally ordered with respect to their ‘spin’ angles $\phi_m^{(0)}$. In each panel the arrows represent average spin directions (see figure 1). The three possible states shown are: (a) the F state, where all the average spins face the same way; (b) the P state where, around a triangular plaquette, the spins all point in different directions, with the average differences between two spin pairs being the same (more information on this state can be found in [20]); and (c) the AF state, where the molecules form rows: spins within a row point in the same direction, and this direction alternates from row to row.

2.2. Possible phases and transitions in the model

The model potentially allows for many different states.
First of all, we may have states of different azimuthal ordering of the molecules about their long axes. If the ratio $a_2(R)/a_1(R)$ is sufficiently small between nearest neighbours, we expect a state with $\langle\phi_m^{(0)} - \phi_n^{(0)}\rangle = 0$, where, on average, all the azimuthal angles (‘spins’) $\phi_m^{(0)}$ possess the same value. We call this state the ‘ferromagnetic’ (F) state. But when the ratio $a_2(R)/a_1(R)$ becomes sufficiently large, states for which $\langle\phi_m^{(0)} - \phi_n^{(0)}\rangle = 0$ no longer holds everywhere may be energetically preferential. Two such examples are an ‘antiferromagnetic’ (AF) rhombic state and an antiferromagnetic ‘Potts’ (P) state. In the simplest of these, the AF state, the molecules in the 2D lattice form rows or 1D layers. Along one particular lattice direction (within the row), the spins of nearest neighbour molecules are parallel, $\langle\phi_m^{(0)} - \phi_n^{(0)}\rangle = 0$; however, the relative angle between the spins of neighbouring molecules along other directions is non-zero, $\langle\phi_m^{(0)} - \phi_n^{(0)}\rangle = \psi$, where $\psi$ takes a finite value (see figure 2). The P state has a slightly more complicated structure in terms of $\phi_m^{(0)}$, which is illustrated in figure 2. If we allow for rhombic lattice distortions, the AF state generally has a lower energy than the P state for the values of parameters explored here. However, the free energies of these states are quite similar [20]. (If we fix the lattice as being triangular (hexagonal), the P state will dominate at large values of $a_2(R)$, since the AF state is unstable without lattice distortions [20].)

Generally, we are interested in determining how phase segregation (separation of non-homologous molecules) depends on different molecular parameters. Segregation is driven by the size of $a_1(R)\,\Psi(L/2\lambda_c^{(0)})$ and $a_2(R)\,\Psi(2L/\lambda_c^{(0)})$ for nearest neighbours in the lattice.
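To give a feel for the two $\Psi$ factors that weight $a_1$ and $a_2$, the sketch below evaluates equation (2.12) for fragments of the length used in the experiments of [1]. Taking $L$ as 294 base pairs times $h = 3.4$ Å is an assumption made here for illustration, as are the function and constant names; $\lambda_c^{(0)} = 150$ Å is the value adopted in the text.

```python
import math

H_RISE = 3.4        # Å per base pair (h in the text)
LAMBDA_C = 150.0    # Å, helical coherence length adopted in the text
N_BP = 294          # fragment length in base pairs, as in the experiments of [1]


def psi(x):
    """Equation (2.12): Psi(x) = 1 - (1 - exp(-x)) / x, with Psi(0) = 0 by continuity."""
    if x == 0.0:
        return 0.0
    # expm1(-x) = exp(-x) - 1, which keeps precision for small x.
    return 1.0 + math.expm1(-x) / x


def segregation_factors(length, lambda_c=LAMBDA_C):
    """Return (Psi(L / (2 lambda_c)), Psi(2 L / lambda_c)), the weights of a1 and a2."""
    return psi(length / (2.0 * lambda_c)), psi(2.0 * length / lambda_c)


if __name__ == "__main__":
    length = N_BP * H_RISE  # assumed contour length in Å
    print(segregation_factors(length))
    # Psi grows monotonically from 0 (short molecules) towards 1 (long molecules).
    for multiple in (0.1, 1.0, 10.0, 100.0):
        print(multiple, segregation_factors(multiple * LAMBDA_C))
```

Because the second-harmonic factor has the larger argument, it approaches its limiting value of 1 at shorter lengths than the first-harmonic factor does.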
When both $a_1(R)$ and $a_2(R)$ increase, so that the interaction energy between molecules dominates over mixing entropy, we might expect a phase almost pure in one type of molecule (complete segregation) as opposed to a mixed state containing significant numbers of both types. However, these two interaction terms may also compete against each other and, therefore, lead to non-monotonic effects in the boundary between the mixed and segregated phases. The degree of phase segregation is also controlled by the function $\Psi$, which characterises the difference between the interactions of homologous and non-homologous molecules. Formally, we may expect that the largest degree of segregation occurs for very long molecules (as $L$ goes to infinity). Here $\Psi$ takes on its largest value, meaning that the difference in the interaction between homologous pairs and non-homologous pairs is maximised.

Note, however, that when the molecules are sufficiently long, the twisting elasticity (not included in the current model) would reduce this effect, allowing non-homologous molecules (again, molecules with different sequence text) to twist about their long axes in such a way as to reduce their distortional mismatch, and hence their interaction energy [26]. On the other hand, for such long molecules, whose lengths are on the order of several bending persistence lengths, the liquid crystalline structures generated in the experiments of [1] to observe phase segregation cannot be formed. Hence, we neglect this effect in our study, as we wish to describe segregation effects only in (approximately) columnar liquid-crystalline mesophases of DNA.

2.3. Parameters for the KL model of DNA–DNA interactions

In our numerical calculations, the interaction coefficients $a_0(R)$, $a_1(R)$, and $a_2(R)$ are determined from the KL model for helix structure specific DNA–DNA interactions [10, 11], which is based on a Debye–Bjerrum model (utilising mean-field electrostatics where ion-ion correlations are neglected). It entails the electrostatic interactions arising from the helical charge patterns on dsDNA. Here, the double helix is described by two phosphate backbones separated by two grooves: the smaller, minor groove and the larger, major groove (see figure 1). Each phosphate bears one negative charge, thus tracing out a double-helical charge distribution. Adsorbed counterions (cations) are assumed to be localised within these grooves, and therefore are considered as fixed charges on the molecular surface. These form helical positive charge distributions running along the grooves. All other ions in the system that are not bound to DNA provide linear Debye screening of these helical charge distributions. Within the framework of this model, the interaction terms $a_0(R)$, $a_1(R)$, and $a_2(R)$ are given by

$$a_0(R) = \frac{2 l_B}{l_e^2}\frac{(1 - \theta)^2 K_0(\kappa_D R)}{(a\kappa_D K_1(\kappa_D a))^2} + \frac{2 l_B}{l_e^2}\sum_{n}\frac{[\xi_n(\theta, f_1, f_2)]^2\,\Omega_n(\kappa_n R, \kappa_n a)}{(a\kappa_n K_n'(\kappa_n a))^2}, \tag{2.13}$$

$$a_1(R) = \frac{4 l_B\,[\xi_1(\theta, f_1, f_2)]^2}{l_e^2}\frac{K_0(\kappa_1 R)}{(a\kappa_1 K_1'(\kappa_1 a))^2}, \tag{2.14}$$

$$a_2(R) = \frac{4 l_B\,[\xi_2(\theta, f_1, f_2)]^2}{l_e^2}\frac{K_0(\kappa_2 R)}{(a\kappa_2 K_2'(\kappa_2 a))^2}, \tag{2.15}$$

where

$$\Omega_n(\kappa_n R, \kappa_n a) = -\sum_{j} K_{n-j}(\kappa_n R)\,K_{n-j}(\kappa_n R)\,\frac{I_j'(\kappa_n a)}{K_j'(\kappa_n a)}, \tag{2.16}$$

and

$$\xi_n(\theta, f_1, f_2) = \theta f_1 + \theta f_2(-1)^n - \cos(n\tilde{\phi}_s), \qquad \kappa_n = \sqrt{\kappa_D^2 + n^2 g^2}. \tag{2.17}$$

Here, $I_n(x)$ and $K_n(x)$ are modified Bessel functions of the first and second kind, respectively, and $I_n'(x)$ and $K_n'(x)$ are their derivatives with respect to their arguments.
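As a rough numerical illustration of the two harmonic coefficients, the sketch below evaluates $\kappa_n$, $\xi_n$ and the $a_1(R)$, $a_2(R)$ expressions using only the Python standard library, with the parameter values quoted in the text for the model ($l_B \approx 7$ Å, $l_e \approx 1.7$ Å, $\kappa_D^{-1} = 7$ Å, $H = 33.8$ Å, $a \approx 11.2$ Å, $\tilde{\phi}_s \approx 0.4\pi$). Several things here are assumptions rather than part of the paper: the values of $\theta$, $f_1$ and $f_2$ are placeholders, the result is read as being in units of $k_B T$ per Å, and the Bessel functions are obtained from the standard integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$ by Simpson's rule instead of a library routine.

```python
import math

L_B = 7.0                    # Å, Bjerrum length
L_E = 1.7                    # Å, mean axial distance per phosphate unit charge
KAPPA_D = 1.0 / 7.0          # 1/Å, inverse Debye screening length
PITCH = 33.8                 # Å, average helical pitch H
G = 2.0 * math.pi / PITCH    # 1/Å
A_DNA = 11.2                 # Å, effective electrostatic radius
PHI_S = 0.4 * math.pi        # angular half-width of the minor groove


def bessel_k(nu, x, t_max=12.0, n=2400):
    """K_nu(x) from its integral representation, by Simpson's rule (x >~ 0.01)."""
    step = t_max / n
    total = 0.0
    for i in range(n + 1):
        t = i * step
        weight = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * step / 3.0


def bessel_k_prime(n, x):
    """Derivative K'_n(x) = -(K_{n-1}(x) + K_{n+1}(x)) / 2."""
    return -0.5 * (bessel_k(n - 1, x) + bessel_k(n + 1, x))


def kappa(n):
    """kappa_n = sqrt(kappa_D^2 + n^2 g^2)."""
    return math.sqrt(KAPPA_D ** 2 + (n * G) ** 2)


def xi(n, theta, f1, f2):
    """xi_n = theta*f1 + theta*f2*(-1)^n - cos(n * phi_s)."""
    return theta * f1 + theta * f2 * (-1) ** n - math.cos(n * PHI_S)


def a_n(n, R, theta, f1, f2):
    """Harmonic coefficient a_n(R) for n = 1 or 2 (assumed units: k_B T per Å)."""
    k = kappa(n)
    denom = (A_DNA * k * bessel_k_prime(n, k * A_DNA)) ** 2
    return 4.0 * L_B / L_E ** 2 * xi(n, theta, f1, f2) ** 2 * bessel_k(0, k * R) / denom


if __name__ == "__main__":
    theta, f1, f2 = 0.8, 0.4, 0.6   # placeholder values, not taken from the paper
    for R in (25.0, 30.0, 35.0):    # interaxial separations in Å, chosen for illustration
        print(R, a_n(1, R, theta, f1, f2), a_n(2, R, theta, f1, f2))
```

Since $\kappa_2 > \kappa_1$, the second harmonic decays faster with $R$ than the first, so the ratio $a_2/a_1$ depends on both the interaxial separation and the counterion distribution.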
The parameters of the model are: $l_B$, the Bjerrum length ($l_B \approx 7$ Å); $l_e$, the mean distance between phosphate charges per unit charge ($l_e \approx 1.7$ Å); $\kappa_D$, the inverse Debye screening length (in all calculations we use the value $\kappa_D^{-1} = 7$ Å); $g = 2\pi/H$, where $H$ is the average helical pitch of DNA ($H = 33.8$ Å); $a$, the effective electrostatic radius of DNA ($a \approx 11.2$ Å); $\tilde{\phi}_s$, the angular half-width of the minor groove ($\tilde{\phi}_s \approx 0.4\pi$); $\theta$, the degree of counterion condensation (or binding) near the DNA surface (if the charge of the phosphates is 100% compensated by adsorbed counterions, $\theta = 1$); $f_1$, the proportion of condensed (bound) ions localised in the minor groove; and $f_2$, the proportion of condensed (bound) ions localised in the major groove.

We will also investigate the more generic behaviour of the phase diagram of our system, for a triangular lattice, in terms of $a_1$ and $a_2$, without their actual $R$-dependence being specified. Concentrating solely on these interaction terms yields results that are independent of the underlying interaction model, based on the symmetry arguments presented above.

3. Methods

We calculate the free energy of these systems in order to determine when phase segregation occurs via two means: (i) a variational mean-field approximation and (ii) Monte Carlo simulations. We compare the results of the two methods to test their accuracy.

3.1. Variational approximation

In the variational approximation, when mixing between the two species occurs, we may use a ‘jellium’-like, mean-field approximation. Here, we suppose that we can write the following mean-field ansatz (for the ‘ferromagnetic’ state) for the probability that a lattice site $m$ is occupied by either type A or B:

$$P_m^{\alpha} = (1 - c)\,\delta_{\alpha, A} + c\,\delta_{\alpha, B}. \tag{3.1}$$

Here, $c = \langle N_B\rangle/N$ is effectively an order parameter that characterises the degree of phase segregation, where $N_B$ is the number of lattice sites that are occupied by molecules of type B.
In the model the total number of lattice sites $N$ is fixed, and each site is occupied by either an A- or a B-molecule, so that $N_A + N_B = N$ is always constant ($N_A$ is the number of lattice sites occupied by molecules of type A). Hence, we have $(1 - c) = \langle N_A \rangle / N$. Thus, when $c = 0$ the system contains purely molecules of type A, whereas $c = 1$ corresponds to only type B, and when $c = 1/2$ there is 1:1 mixing of the two species. The above ansatz is shown in the supplemental material to be equivalent to a more rigorous variational approximation that uses an effective ‘magnetisation’. However, equation (3.1) is more physically transparent than this more sophisticated method, so we present it in the main text. In the variational approximation, we may write an approximate free energy

$$ F_T = -k_B T \ln Z_{T,0} + \langle \langle \langle E_{\mathrm{int}} \rangle_{\Omega} \rangle_{\alpha} - E_T \rangle_T, \tag{3.2} $$

where

$$ Z_{T,0} = \prod_{m=1}^{N} \int_{-\infty}^{\infty} d\phi_m^{(0)} \exp\left( -\frac{E_T\{\phi_m^{(0)}\}}{k_B T} \right), \tag{3.3} $$

and

$$ \langle \langle \langle E_{\mathrm{int}} \rangle_{\Omega} \rangle_{\alpha} - E_T \rangle_T = \frac{1}{Z_{T,0}} \prod_{m=1}^{N} \int_{-\infty}^{\infty} d\phi_m^{(0)} \left( \langle \langle E_{\mathrm{int}}\{\phi_m^{(0)}\} \rangle_{\Omega} \rangle_{\alpha} - E_T\{\phi_m^{(0)}\} \right) \exp\left( -\frac{E_T\{\phi_m^{(0)}\}}{k_B T} \right). \tag{3.4} $$

The variational approximation is chosen to have a Gaussian (or harmonic) trial function, for which the finite range of values of $\phi_m^{(0)}$ is neglected in both equations (3.3) and (3.4) and the limits of integration are extended to infinity. Here, the subscript $\alpha$ on one of the averaging brackets represents averaging over the effective probability given by equation (3.1). The trial function $E_T\{\phi_m^{(0)}\}$ contains variational parameters that are chosen to minimise $F_T$, which gives the most accurate approximation to the full free energy (equation (2.9)). To illustrate the approximation method, we consider the F state in detail. The trial energy function describing this state is given by

$$ E_T = \frac{J}{2} k_B T \sum_{\langle m, n \rangle} \left( \phi_m^{(0)} - \phi_n^{(0)} \right)^2, \tag{3.5} $$

where the angular brackets on the summation indices indicate that only nearest-neighbour interactions have been considered.
Here $J$, an effective spring constant for the relative rotations $\phi_m^{(0)} - \phi_n^{(0)}$ of nearest neighbours, is chosen to minimise equation (3.2); the minimised value is the closest approximation to the exact free energy available within this trial function. The variational approximation can be extended to include next-to-nearest neighbours and so on. The variational approximation yields, for the F state (considering only nearest neighbours),

$$ \frac{F_T}{N} = 3 a_0(R) L - 3 a_1(R) L \exp\left( -\frac{1}{6J} \right) \left[ 1 - 2c(1-c)\, Y\!\left( \frac{L}{2\lambda_c^{(0)}} \right) \right] + 3 a_2(R) L \exp\left( -\frac{2}{3J} \right) \left[ 1 - 2c(1-c)\, Y\!\left( \frac{2L}{\lambda_c^{(0)}} \right) \right] + \frac{k_B T}{2} \ln J + k_B T \hat{C}_{\mathrm{hex}}. \tag{3.6} $$

When we use equation (3.1), in order to optimise $c$ we must also include a mixing entropy to obtain the total free energy

$$ F = F_T - T S_{\mathrm{mix}}, \tag{3.7} $$

where

$$ S_{\mathrm{mix}} = -k_B N \left[ (1 - c) \ln(1 - c) + c \ln c \right]. \tag{3.8} $$

The terms in $F_T$ favour segregation of the two molecular types, while the mixing entropy $S_{\mathrm{mix}}$ opposes it. The optimal $c$ balances these two effects. In the more rigorous variational approximation, this additional entropy term is automatically included in the free energy and does not need to be added by hand (see supplemental material). Further details, including how to formulate the variational approximation for the AF state (using a trial function similar to equation (3.5)), are given in the supplemental material.

For the P state, we have not extended the mean-field approximation considered in [20] to include mixing between the two molecule types. Because of the reduced lattice symmetry of the P state, the approximation is already very complicated for molecules of solely one type of base-pair sequence (or ideal helices). Additionally, it does not work very well, owing to thermal topological and domain-wall type excitations that were not incorporated [20]. Thus, MC simulations are required to probe the free energies for a fixed triangular lattice in which this P state occurs.
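The balance between the $c$-dependent terms of equation (3.6) and the mixing entropy of equation (3.8) can be made explicit with a short sketch. It assumes that $J$ is held fixed while $c$ is varied (in the full calculation $J$ is minimised as well) and that the two values of the function $Y$ are supplied as plain numbers. Collecting the terms proportional to $2c(1-c)$ into $W = 3a_1 L e^{-1/(6J)} Y_1 - 3a_2 L e^{-2/(3J)} Y_2$ and setting $\partial F / \partial c = 0$ gives, with $m = |1 - 2c|$, the self-consistency condition $m = \tanh(W m / k_B T)$. The function names below are hypothetical.

```python
import math


def effective_coupling(a1_l, a2_l, j, y1, y2):
    """W / k_B T, the coefficient of 2 c (1 - c) in equation (3.6).

    a1_l and a2_l are a_1(R) L and a_2(R) L in units of k_B T; y1 and y2 are the
    values of Y in the first and second square brackets (supplied by the caller).
    """
    first = 3.0 * a1_l * math.exp(-1.0 / (6.0 * j)) * y1
    second = 3.0 * a2_l * math.exp(-2.0 / (3.0 * j)) * y2
    return first - second


def segregation_order(w, iterations=200):
    """Non-trivial solution m = |1 - 2c| of m = tanh(w m), found by bisection."""
    if w <= 1.0:
        return 0.0                  # only the evenly mixed solution exists
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if math.tanh(w * mid) > mid:
            lo = mid                # root lies above mid
        else:
            hi = mid                # root lies at or below mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for w in (0.8, 1.05, math.log(3.0), 2.0):
        print(f"W / k_B T = {w:.3f}  ->  |1 - 2c| = {segregation_order(w):.3f}")
```

Under these assumptions the evenly mixed state is the only solution for $W \le k_B T$, and the value $|1 - 2c| = 0.5$ is reached at $W / k_B T = \ln 3 \approx 1.10$.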
In appendix A of the supplemental material we present a simpler model, for ease of understanding, which captures some key features of the analysis. Most importantly, it describes the competition between the mixing entropy and the difference in interaction energy between homologous and non-homologous molecules. The similarities to the analysis used in the main text, as well as the disadvantages of this simpler model, are discussed in the supplemental material.

3.2. MC simulations

In the computational analysis, before applying the Monte Carlo algorithm, the initial state of a system at a given average nearest-neighbour distance was determined by performing a lattice sum of nearest-neighbour interactions. Here, the ground state energy was minimised upon varying (i) the spins of neighbouring molecules, (ii) the type of pair interaction, being either of the homologous or non-homologous form, and (iii) the rhombic angle $\omega$, which measures the distortion of the rhombic lattice away from a triangular one (for which $\omega = 60^\circ$). The Monte Carlo simulations were carried out on a 24×24 site rhombic or triangular lattice with periodic boundary conditions for molecules of two different types. Here, only pair interactions (given by the terms in the summation of equation (2.11)) between nearest neighbours were considered. Three types of Monte Carlo move were considered: (i) altering the azimuthal angle $\phi_m^{(0)}$ of a molecule at a lattice site, (ii) changing the type (sequence) of a molecule (for instance, from type A to B) at a lattice site, or (iii) changing the rhombic angle $\omega$ of the lattice. Individual simulation runs were performed for systems of molecules with different lengths, densities, charges, and charge distributions. Furthermore, more generic simulations were carried out varying the interaction terms $a_1$ and $a_2$.
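A minimal sketch of such a simulation, restricted to move types (i) and (ii) on a fixed triangular lattice, might look as follows. It is not the authors' code. The pair energy is an assumed stand-in for the terms of equation (2.11): a first harmonic favouring alignment and a second harmonic opposing it, each weakened by a hypothetical factor `1 - y` for non-homologous pairs. Energies are in units of $k_B T$, move (iii) is omitted, and the acceptance rule is the standard Metropolis criterion.

```python
import math
import random

# Offsets to the six nearest neighbours of a site on a triangular lattice.
OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1))


def pair_energy(phi_m, phi_n, same_type, a1, a2, y1, y2):
    """Assumed nearest-neighbour energy in units of k_B T (hypothetical form)."""
    w1 = 1.0 if same_type else 1.0 - y1   # unlike pairs: weaker first harmonic
    w2 = 1.0 if same_type else 1.0 - y2   # unlike pairs: weaker second harmonic
    d = phi_m - phi_n
    return -a1 * w1 * math.cos(d) + a2 * w2 * math.cos(2.0 * d)


def site_energy(phi, kind, i, j, params):
    """Sum of the six bond energies that involve site (i, j), periodic boundaries."""
    n = len(phi)
    total = 0.0
    for di, dj in OFFSETS:
        p, q = (i + di) % n, (j + dj) % n
        total += pair_energy(phi[i][j], phi[p][q], kind[i][j] == kind[p][q], *params)
    return total


def metropolis_accept(delta_e, rng):
    """Accept downhill moves always, uphill moves with probability exp(-dE)."""
    return delta_e <= 0.0 or rng.random() < math.exp(-delta_e)


def sweep(phi, kind, params, rng, max_turn=0.5):
    """One sweep: n * n attempted moves, each a rotation (i) or a type change (ii)."""
    n = len(phi)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        old_phi, old_kind = phi[i][j], kind[i][j]
        e_old = site_energy(phi, kind, i, j, params)
        if rng.random() < 0.5:
            phi[i][j] = old_phi + rng.uniform(-max_turn, max_turn)
        else:
            kind[i][j] = "B" if old_kind == "A" else "A"
        delta_e = site_energy(phi, kind, i, j, params) - e_old
        if not metropolis_accept(delta_e, rng):
            phi[i][j], kind[i][j] = old_phi, old_kind   # reject: restore the site


def segregation(kind):
    """|N_A - N_B| / N for the current configuration."""
    flat = [k for row in kind for k in row]
    return abs(flat.count("A") - flat.count("B")) / len(flat)


def run(n=6, sweeps=200, params=(1.0, 0.1, 0.5, 0.5), seed=1):
    """Average |N_A - N_B| / N over the second half of a short run."""
    rng = random.Random(seed)
    phi = [[0.0] * n for _ in range(n)]
    kind = [[rng.choice("AB") for _ in range(n)] for _ in range(n)]
    samples = []
    for s in range(sweeps):
        sweep(phi, kind, params, rng)
        if s >= sweeps // 2:             # discard the first half as equilibration
            samples.append(segregation(kind))
    return sum(samples) / len(samples)


if __name__ == "__main__":
    print(run())
```

The tuple `params` holds `(a1, a2, y1, y2)`. The small lattice and short run are chosen only to keep the sketch quick; they are far smaller than the 24×24 lattice used in the simulations described in the text.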
According to the Metropolis algorithm, MC moves were accepted with a probability of $\exp(-\Delta E / k_B T)$ if the energy of the new configuration was greater than that of the original configuration by $\Delta E$, and with a probability of 1 if the energy of the new configuration was less than that of the old configuration. Systems were allowed to equilibrate for $10^5$ MC steps before statistical averages of the states were calculated.

4. Results and discussion

4.1. Phase segregation and consistency between the two methods

In our results, we have determined the position of a mixing transition between a phase-segregated lattice state, with predominantly one type of molecule in the lattice, and a state in which the two types of molecule are evenly mixed. This has been obtained for a range of interaction parameters for the KL model of DNA–DNA interactions. The transition line between these two states is defined as the locus where $\langle |N_A - N_B| \rangle / N = 0.5$. In determining the position of this mixing transition, both methods of calculation, the analytical variational approximation and the Monte Carlo simulations, agree well (see figure 3), indicating the reliability of these two approaches. However, neither method is without drawbacks. In the variational calculation, the system is treated at the mean-field level, so we cannot expect it to be accurate at the transition point, since inhomogeneities and correlations in the spatial distribution of the two molecule types on the lattice are likely to become important here. On the other hand, the MC simulations were carried out on a relatively small lattice of molecules, and so finite-size effects may influence the phase segregation behaviour in these simulations. For example, one clear artefact of a finite-sized lattice is that $\langle |N_A - N_B| \rangle / N$ remains significantly above zero for values of $R_{\mathrm{ave}}$, the average interaxial distance between molecules, well above the transition.
In fact, for the simulation results above the transition, the value of $\langle |N_A - N_B| \rangle / N$ remains constant as $R_{\mathrm{ave}}$ is increased. This constant value, arising from random ordering in a finite system, is calculated to be $\sqrt{2/(\pi N)}$, which, for the lattice size used in the simulations, has a value of 0.033, consistent with what was found in the MC simulations.

Figure 3. Phase segregation curves comparing the two methods of calculation. Plotted is $\langle |N_A - N_B| \rangle / N$, the fractional difference of species (sequence) populations within the lattice. When $\langle |N_A - N_B| \rangle / N = 1$, complete phase segregation between the two types of molecule has occurred, corresponding to a lattice of only one type of sequence. A value of zero corresponds to perfectly even mixing, with the same populations of both species spread over the lattice sites. Results are shown for molecules with a charge compensation $\theta = 0.7$. Here, (a) and (c) are for a molecular length $L = 340$ Å, whereas (b) and (d) have been calculated for molecules of length $L = 680$ Å. In both (a) and (b) a molecular groove charge distribution of $f_1 = 0.35$ and $f_2 = 0.65$ has been used, and in (c) and (d) the molecular charge distribution was given by $f_1 = 0.5$ and $f_2 = 0.5$, i.e., the same amount of bound charge was apportioned to the minor and major grooves. Simulation results are provided as points (connected with dashed lines for clarity), whereas analytical results are shown as solid lines. Note that, for the parameters in (c), the sharp segregation transition occurs at the location of the F–AF state transition (see figure 4), whereas the segregation transition for the other cases occurs within the ‘ferromagnetic’ regime (figure 4). Note that for the variational calculation $\langle |N_A - N_B| \rangle / N = |1 - 2c|$.

As shown in figure 3, near the transition the curves for the simulations are much less steep than those of the variational calculation.
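The finite-size baseline $\sqrt{2/(\pi N)}$ can be checked with a few lines of Python. The sketch below assumes, as an idealisation of the fully mixed regime, that each of the $N$ sites is independently of type A or B with probability 1/2; it then compares the exact binomial expectation of $|N_A - N_B|/N$ with the large-$N$ estimate. The function names are hypothetical.

```python
import math


def random_mixing_baseline(n_sites):
    """Exact <|N_A - N_B|> / N for independent sites, each A or B with probability 1/2."""
    total = 0.0
    for n_b in range(n_sites + 1):
        # log of the binomial probability of finding n_b sites of type B
        log_p = (math.lgamma(n_sites + 1) - math.lgamma(n_b + 1)
                 - math.lgamma(n_sites - n_b + 1) - n_sites * math.log(2.0))
        total += abs(n_sites - 2 * n_b) * math.exp(log_p)
    return total / n_sites


def gaussian_estimate(n_sites):
    """Large-N estimate sqrt(2 / (pi N)) quoted in the text."""
    return math.sqrt(2.0 / (math.pi * n_sites))


if __name__ == "__main__":
    n = 24 * 24   # lattice size used in the simulations
    print(f"exact binomial: {random_mixing_baseline(n):.4f}")
    print(f"sqrt(2/(pi N)): {gaussian_estimate(n):.4f}")
```

For the 24×24 lattice both expressions round to 0.033, the plateau value quoted for the simulations.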
Finite-size effects are well known to smear transitions, which is precisely what is observed in the finite-sized MC simulations. We presume that the exact results for $\langle |N_A - N_B| \rangle / N$ lie somewhere between these two curves. This mixing transition, as shown, depends on the molecule length and on the distribution of charge on the molecular surface, as provided in the interaction model. The effect of increasing the length is to push the transition to larger values of $R_{\mathrm{ave}}$, which favours the segregated state over a larger density range in these DNA assemblies. This result is not surprising. According to our model, the difference between the average interaction energies of homologous and non-homologous DNA increases as the molecular length increases, and this increasing difference widens the density range over which phase segregation occurs. Of the two ion distributions shown in figure 3, we find that the distribution with the larger value of $f_1$ (a greater fraction of ions localised in the minor groove) yields transitions at smaller interaxial distances $R_{\mathrm{ave}}$. As shown in figure 5, this is a general trend, which we discuss below.

4.2. The AF–F transition

Besides the transition between the mixed and segregated phases of molecules with different sequences, there is also an ‘antiferromagnetic’–‘ferromagnetic’ (AF–F) transition, as discussed in section 2.2. This second transition changes the ordering of the azimuthal orientations of the centres of the DNA helices about their long axes. At large values of $R_{\mathrm{ave}}$, the lowest energy state is the F state, in which $\langle \phi_m^{(0)} - \phi_n^{(0)} \rangle = 0$ for all nearest neighbours. As $R_{\mathrm{ave}}$ decreases, $a_2$ (in equation (2.11)), which favours $\langle \phi_m^{(0)} - \phi_n^{(0)} \rangle \neq 0$, grows faster relative to $a_1$, so that the AF rhombic lattice (or the P state for molecules fixed to a triangular lattice) becomes the lowest energy state. Again, for the AF state, the molecules are azimuthally ordered in rows.
The average value of the azimuthal angle difference between neighbouring molecules is $\langle \phi_m^{(0)} - \phi_n^{(0)} \rangle = 0$ within a single row of molecules, while along the other lattice directions, between nearest neighbours in different rows, $\langle \phi_m^{(0)} - \phi_n^{(0)} \rangle = \psi$, where $\psi$ is some non-zero value. (Again, for details, see section 2.2.)

Figure 4. The averaged ‘spin–spin’ correlation between molecules at neighbouring lattice sites. Panels (a)–(d) correspond to the same parameter values used to generate the corresponding plots in figure 3. An illustration of the rhombic lattice unit cell (in the top left panel) is used to identify the two curves (red and green) shown in each of the panels as being between different nearest neighbours. The F–AF transition occurs at the value of $R_{\mathrm{ave}}$, the average lattice separation, where the two curves of the average difference between the azimuthal orientations (spins) of neighbouring molecules diverge. For large average molecular separations (low densities), the F state is preferred; whereas, at small separations, as the antiferromagnetic term ($a_2(R)$) in the interaction grows faster than the ferromagnetic term ($a_1(R)$) with diminishing $R_{\mathrm{ave}}$, the AF state is preferred. For increasing values of $R_{\mathrm{ave}}$, in the ‘ferromagnetic’ regime, the spin interaction between molecules grows weaker, and so the spin correlation weakens, i.e., $\langle \cos(\phi_m^{(0)} - \phi_n^{(0)}) \rangle \to 0$. Again, as shown in panel (c), the segregation transition occurs at the same interaxial spacing as the F–AF transition, whereas for all other cases the segregation transition occurs within the F state (see figure 3).

To probe this transition, as well as to measure the amount of azimuthal fluctuations in the lattice, we consider the ‘spin–spin’ correlation function $\langle \cos(\phi_m^{(0)} - \phi_n^{(0)}) \rangle$ between nearest neighbours. As fluctuations increase, the value of $\langle \cos(\phi_m^{(0)} - \phi_n^{(0)}) \rangle$ decreases.
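The link between fluctuations and this correlation function can be made quantitative in the harmonic picture of section 3.1. The following short derivation is a sketch that assumes Gaussian fluctuations of $\Delta = \phi_m^{(0)} - \phi_n^{(0)}$ about zero, as in the trial function of equation (3.5); the MC values need not obey it exactly. For a zero-mean Gaussian variable with variance $\sigma^2 = \langle \Delta^2 \rangle$,

$$ \langle \cos(n\Delta) \rangle = \exp\left( -\frac{n^2 \sigma^2}{2} \right). $$

On the triangular lattice there are $3N$ nearest-neighbour bonds, and equipartition applied to equation (3.5) gives $\langle E_T \rangle \approx N k_B T / 2$, so that

$$ \frac{J}{2} k_B T \cdot 3N \sigma^2 \approx \frac{N k_B T}{2} \quad \Rightarrow \quad \sigma^2 = \frac{1}{3J}, $$

$$ \langle \cos \Delta \rangle = \exp\left( -\frac{1}{6J} \right), \qquad \langle \cos 2\Delta \rangle = \exp\left( -\frac{2}{3J} \right). $$

These are the exponential factors that multiply $a_1$ and $a_2$ in equation (3.6): a softer effective spring constant $J$ means larger fluctuations and a smaller correlation.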
Indeed, for large interaxial spacings between molecules, we expect both $a_1$ and $a_2$ in equation (2.11) to decrease exponentially with increasing $R_{\mathrm{ave}}$ (see equations (2.14) and (2.15)), leading to a weaker interaction between the molecules and, hence, to a decrease in the ‘spin–spin’ correlation function at large $R_{\mathrm{ave}}$, as shown in figure 4. In the AF state, between the rows of molecules with different ‘spins’ (see figure 2), we can write

$$ \langle \cos(\phi_m^{(0)} - \phi_n^{(0)}) \rangle = \cos\psi \, \langle \cos(\phi_m'^{(0)} - \phi_n'^{(0)}) \rangle. \tag{4.1} $$

Here, $\phi_m^{(0)} - \phi_n^{(0)} = \psi + \phi_m'^{(0)} - \phi_n'^{(0)}$, and the values $\phi_m'^{(0)} - \phi_n'^{(0)}$ fluctuate about zero, i.e. $\langle \phi_m'^{(0)} - \phi_n'^{(0)} \rangle = 0$. The quantity $\langle \cos(\phi_m'^{(0)} - \phi_n'^{(0)}) \rangle$ is reduced by thermal fluctuations. In equation (4.1), as $\psi$ increases, $\langle \cos(\phi_m^{(0)} - \phi_n^{(0)}) \rangle$ must also decrease. Thus, in the AF state, we expect $\langle \cos(\phi_m^{(0)} - \phi_n^{(0)}) \rangle$ between rows with different ‘spins’ to be smaller than its value along the rows. Because of this difference, the F–AF transition in figure 4 takes place where the red and green curves of $\langle \cos(\phi_m^{(0)} - \phi_n^{(0)}) \rangle$, for nearest neighbours in the two lattice directions, diverge as functions of $R_{\mathrm{ave}}$.

By comparing figures 3 and 4 we may judge the position of the mixing transition relative to the F–AF transition. For the parameter values $f_1 = 0.35$ and $f_2 = 0.65$, the mixing transition occurs well within the F state of the molecules. For $f_1 = 0.5$ and $f_2 = 0.5$ (bound counterions evenly distributed between the minor and major grooves), the mixing transition still occurs well within the F state for the longer molecules ($L = 680$ Å), but for the shorter molecules ($L = 340$ Å) it occurs at the same interaxial molecular spacing as the AF–F transition.
Increasing the ratio $f_1/f_2$, i.e., the fraction of bound ions in the minor groove relative to that in the major groove, pushes the AF–F transition to larger interaxial spacing $R_{\mathrm{ave}}$. It does so because it increases the ratio $a_2(R)/a_1(R)$ that controls the transition: it is the relative proportion of the two interaction terms that dictates the strength of the antiferromagnetic aspect of the interaction. Furthermore, increasing the ratio of $a_2(R)$ to $a_1(R)$ essentially reduces the strength of the interaction (in the F state), since these terms enter the interaction with opposite signs. Therefore, the mixing transition moves down to smaller interaxial spacing as the $a_2(R)/a_1(R)$ ratio grows, since thermal fluctuations grow, leading to increased mixing of the different molecular species. The position of the F–AF transition is insensitive to $L$. However, increasing the length, $L$, of the molecules does significantly reduce the degree of $\phi_m^{(0)}$ fluctuations, as can be seen in figure 4.

Figure 5. Average interaxial spacing of the segregation transition, $R_C$, as a function of interaction model parameters. In these plots, the value $L = 1000$ Å was used. The transition point $R_C$ is defined as the value of $R_{\mathrm{ave}}$ where $\langle |N_A - N_B| \rangle / N = 0.5$. Shown are (a) plots of $R_C$ against the charge compensation $\theta$ for two different groove charge distributions and (b) plots of $R_C$ against the proportion of ions in the minor groove, $f_1$ (with $f_2 = 1 - f_1$), for two different values of the charge compensation.

The plots shown in figure 4 are found from the MC simulations, where we observe a smooth transition between the segregated and mixed states. The variational calculation yields a discontinuous AF–F transition (see [20]), which occurs roughly where the curves in figure 4 show a more dramatic weakening in the spin correlation between the molecules.
The main discrepancies result, again, from the finite-size effects of the simulations or from the mean-field elements of the variational theory breaking down very close to the point of transition.

4.3. The dependence of segregation on KL-model parameters

To explore further the effects of changing the counterion distribution close to the surface of the molecules, we specifically analyse results for molecules of a particular (average) length, $L = L^*$, where $L^* = 1000$ Å or 294 base pairs. This was the length of the molecules used in the segregation experiments of [1]. For molecules of this length, we find the location of the mixing transition, defined as $R_C = R_{\mathrm{ave}}$ where $\langle |N_A - N_B| \rangle / N = 0.5$, upon varying $\theta$ (the counterion charge compensation of the DNA molecule) and $f_1$ and $f_2$ (again, the values that denote how bound or adsorbed counterions are apportioned between the minor and major grooves). These terms were varied because the effective DNA charge and the distribution of counterions were found to depend on the species and concentrations of ions used in experiments. In these results we suppose that $f_1 + f_2 = 1$, which should be more or less the case for ions that bind strongly to the grooves. For the experiments of [1], $f_1 + f_2$ is not precisely known, but it is likely to be smaller than 1 considering the monovalent salt solutions used in these experiments. However, a DNA charge compensation value of $\theta \approx 0.3$–$0.4$ could be predicted for the experiments by fitting the data of DNA osmotic pressure experiments performed under similar conditions [30, 31]. Finally, these fits revealed that the results were rather insensitive to the value of $f_1 + f_2$ for these smaller charge compensations $\theta$. Later we consider a set of more or less realistic values of the model parameters to describe those experiments.
In figure 5(a), results are shown for systems in which the DNA charge compensation, $\theta$, is varied for two sets of groove charge distributions, namely $f_1 = 0.5$, $f_2 = 0.5$ and $f_1 = 0.35$, $f_2 = 0.65$. In the case where $f_1 = 0.35$, $f_2 = 0.65$, the interaxial spacing of the mixing transition, $R_C$, increases strongly with $\theta$. The helical electrostatic coefficient, $a_1$, of the interaction increases as more counterions are localised to the helical grooves ($\theta$ becoming larger). Therefore, at a given average separation $R_{\mathrm{ave}}$, this increase results in stronger preferential interactions between identical molecules, which dominate the thermal fluctuations at separations below the mixing transition. It is this that pushes the mixing transition to a larger interaxial spacing $R_C$. However, when the bound counterion charge is more evenly distributed between the grooves, as for the values $f_1 = 0.5$, $f_2 = 0.5$, the effects of varying $\theta$ are weakened by the increase in the ratio $a_2(R)/a_1(R)$.

Figure 6. A contour plot showing the degree of mixing between molecules of two different sequences, molecule types A and B, upon varying the relative strength of the coefficients of the helical interaction between molecules, as well as their overall strength with respect to thermal energies. The coloured contours give the mixing order parameter $\langle |N_A - N_B| \rangle / N$, where ‘1’ corresponds to fully segregated and ‘0’ to fully mixed. Also included is a red ring, which corresponds to a plausible range of parameter values for the experiments of [1] (see main text). Results are shown for simulations with molecules of length $L^* = 1000$ Å.
Again, since $a_2(R)$ enters the interaction with the opposite sign to $a_1(R)$, the larger value of $a_2(R)$ effectively weakens the interaction and, hence, the impetus for segregation, so that with increasing $\theta$ the system remains mixed until the assemblies are at larger densities (smaller values of $R_{\mathrm{ave}}$).

We can also fix $\theta$ and vary $f_1$, holding $f_2 = 1 - f_1$, such that counterions are bound either to the minor or to the major groove. In figure 5(b) results are shown for two fixed values of the charge compensation, $\theta = 0.5$ and $\theta = 0.7$. In both cases, we find that the value of $R_C$ depends dramatically on the value of $f_1$. Again, as more counterion charge is localised in the minor groove ($f_1$ is increased), the ratio $a_2(R)/a_1(R)$ increases, effectively reducing the difference in interaction energies between homologous and non-homologous pairs. This hinders phase segregation, such that, again, $R_C$, the location of the segregation transition, moves to smaller intermolecular spacing. In figure 5(b), for the larger value, $\theta = 0.7$, we find that the mixing transition occurs at greater interaxial separations than for the smaller charge compensation, $\theta = 0.5$, for most values of $f_1$ considered, because the overall interaction is weaker at the smaller charge compensation. However, this difference becomes smaller as $f_1$ is increased, and reverses at values above $f_1 \approx 0.5$. This is again governed by the increasing ratio $a_2(R)/a_1(R)$, as was explained above for figure 5(a).

4.4. The $a_1$–$a_2$ phase diagram for a fixed triangular lattice

As well as focusing on the KL model of interaction, we also show results that are independent of this model, though still relying on the global mechanism of homology recognition through helix distortions. Here we show the degree of phase segregation as a function of the parameter values $a_1$ and $a_2$ for helix-structure-specific interactions. Results are shown, again, for molecules with the length used in the experimental segregation studies [1].
Figure 6 illustrates some features discussed in the previous sections. Firstly, as the helix-structure-dependent interaction increases (i.e., $a_1$ increases), which occurs with increased aggregate density or charge localisation within the DNA grooves, the relative difference in the interaction energies between homologous and non-homologous pairs increases, favouring segregation. Secondly, as more counterion charge is apportioned to the minor groove, increasing the value of the antiferromagnetic $a_2$ term relative to the ferromagnetic $a_1$ term, the effective helical interaction decreases. Hence, the difference of the interaction energies, again, between homologous and non-homologous molecular pairs decreases, suppressing phase segregation so that it occurs at even larger values of $a_1$ (and thus, at larger aggregate densities).

Interestingly, as the ratio $a_2/a_1$ is increased further, this trend reverses slightly (see the behaviour of the ‘segregation line’ on the right side of figure 6). Here, the effective helical interaction increases again and, hence, so does the difference of the interaction energies between homologous and non-homologous molecules. This effect may be linked to the fact that the ground state here is now the P state. This azimuthal ordering is obscured by fluctuations; however, it may still contribute to the free energy, although on average it is not the dominant state: the mixing transition, for this molecular length, always lies on average in the F state. For this reason the opposite trend in figure 6 is very weak. Indeed, we would expect for the P state that increasing $a_2$ would increase the effective strength of interactions. We find that phase transitions between the topologically disordered and ordered spin P states [20] lie in the phase-segregated region for molecules of the chosen length $L^*$. The phase diagram for such spin transitions is described in [20].
The effect of reducing the length would be to bring this spin transition closer to the phase segregation line and eventually to cross it, which might lead to interesting effects, as observed for the shorter molecules in figures 3 and 4. On the other hand, phase segregation for $L = L^*$ was already weak under the conditions explored in [1].

In the phase segregation transition plot, figure 6, we also show the approximate region in the $a_1$–$a_2$ parameter space that may correspond to the segregation experiments of [1], using the KL model of DNA–DNA interaction. To determine the possible range of values of these parameters, fits were performed on the results of osmotic pressure experiments on DNA aggregates [30, 31] under experimental conditions similar to those of [1]. With these fits, estimates of $\theta \approx 0.3$–$0.4$ could be deduced for the level of charge compensation of DNA by adsorbed counterions. Appropriate values for $f_1$ and $f_2$, the fractions of counterions adsorbed in the minor and major grooves, respectively, could not be deduced as easily from the osmotic pressure experiments. However, with the relatively small value of the counterion charge compensation estimated from these fits, the influence of the distribution of counterion adsorption, provided by $f_1$ and $f_2$, on the interaction terms $a_1$ and $a_2$ is rather small. Thus, to limit the number of parameters, we assumed a random counterion distribution along the DNA molecule, which corresponds to $f_1$ and $f_2$ both being 0. A larger influence is exerted on the interaction terms by the separation between neighbouring molecules, since these terms decrease exponentially with molecular separation. The molecular separations were not measured in those experiments.
However, since the aggregates in [1] were found to be in a mild cholesteric state, we could surmise that the average interaxial separation between neighbouring molecules was roughly 3–4 nm, which corresponds roughly to a surface-to-surface separation of 1–2 nm. With the estimates for the charge compensation $\theta$ and the average molecular interaxial separation (along with the values $f_1 = f_2 = 0$), we can use equations (2.14) and (2.15) to determine the possible range of values for the interaction terms $a_1$ and $a_2$ that correspond to the conditions of the segregation experiments [1]. Mainly because of the large uncertainty in the distances between neighbouring molecules, there was in turn a large uncertainty in these values, with $a_1 L^*$ varying anywhere from 0.4 to 1.4 $k_B T$, and $a_2$ varying from 0.05 to 0.2 $a_1$. This experimental region, drawn in figure 6, straddles the segregation transition boundary of the simulations. It encompasses the mixing ratio range $\langle |N_A - N_B| \rangle / N \approx 0.1$–$0.5$ found in the experiments by quantifying the degree of red-green colour separation. Of course, on the simulation side, there is still a degree of uncertainty in the precise location as well as the width of the mixing transition since, again, finite-size effects in MC simulations tend to broaden the transition. And, on the experimental side, we have not accounted for smearing of the transition due to non-equilibrium effects during the formation of these DNA mesophases.

5. Concluding remarks

In this article we have used variational approximations and computer simulations to establish the nature of segregation of DNA molecular fragments in a model for phase segregation of two DNA types.
We have shown how this segregation depends on a variety of molecular parameters that characterise the difference in the interaction energies between fragments of homologous DNA molecules (sequences with the same base pairs) and non-homologous molecular pairs (which have different, totally uncorrelated sequences). Here, to follow one specific line of enquiry, we have assumed that the difference of energies relies exclusively on the base-pair sequence-dependent patterns of helical distortions away from an ideal helix. This is not to say that mechanisms relying on local base-pair-dependent interactions [6, 17, 18] could not play a role. The extent of such a role has yet to be ascertained. However, it should be relatively easy to adapt this general statistical model to take account of such mechanisms, once analytical models of interaction have been proposed.

As well as phase segregation, our model also predicts different states in which the centres of the molecules can be azimuthally ordered, which can be described in the statistical mechanical language of magnetic systems (the helical interaction between DNA pairs depends on the relative azimuthal orientation, or ‘spin’, of the molecules about their long axes). Indeed, it is quite possible that our model may also have application in the study of magnetic systems, when two types of magnetic species are considered. Of particular interest to us was the molecular separation at which the mixing transition between DNA molecules (between the two types of sequences) occurs relative to that of the magnetic-like transitions of the molecular spin states. Granted, these ‘spin-ordered’ DNA states may be difficult to detect experimentally unless synchrotron x-ray diffraction experiments are invoked. On the other hand, such changes in the azimuthal ordering could affect the phase segregation of the two molecular types under certain conditions.
Generally, though, for realistic model parameters, we found that this mixing transition occurs when the azimuthal orientations of the DNA molecules are aligned on average, i.e., in the language of magnetic spins, they are in a ‘ferromagnetic state’. However, we discovered non-trivial effects of the helical interaction on the location of the mixing transition. Namely, the $a_2$ helical harmonic, which favours the antiferromagnetic state in our helical interaction model, may reduce the effective strength of helical interactions. This reduces the difference of the interaction energies between homologous and non-homologous molecular pairs (the ‘molecular recognition energy’), which, in turn, increases the likelihood of mixing of molecules of different sequences. Also, we found that if the ‘antiferromagnetic’ term, $a_2$, could be made strong enough, this effect could be weakly reversed.

We have compared the predictions of this model, using the KL theory of interaction, with the observations of the experimental study that first reported segregation [1]. Using parameter values that fit the osmotic pressure experiments of [30, 31] and values for the intermolecular densities of these DNA assemblies, as suggested by the formation of a mild cholesteric state in the experiments of [1], we roughly determined the corresponding interaction parameters of the experiments, so that the experimental region could be mapped onto the predicted phase diagram for mixing/segregation. This mapping revealed a large range for the mixing order parameter $\langle |N_A - N_B| \rangle / N$ predicted by the theory, which completely encompasses the range of values, $\langle |N_A - N_B| \rangle / N \approx 0.1$–$0.5$, found in the experiments. This indicates that the model is indeed plausible for these weakly cholesteric condensed DNA spherulite systems, although the large uncertainty in determining the interaction terms for the experiments limits comparisons to theory.
Unfortunately, again, in those experiments, the exact average value of the lattice spacing R for the spherulites was not measured using x-ray diffraction or some other appropriate method, nor was the osmotic stress varied by changing the PEG concentration. Thus, we cannot better fit our model to experiments, or vice versa. Therefore, new experiments are clearly needed to determine the dependence of the degree of phase segregation on R in the spherulites.
We envision that other relatively simple experiments could also be undertaken to study how varying other molecular parameters of DNA would influence segregation. For example, we would expect that increasing the length of the molecules would also increase the degree of segregation. Thus, different lengths of DNA fragments that form a liquid crystal phase should be investigated. Furthermore, we might also expect that changing the background electrolyte (type or concentration) in these experiments, which would alter the charge environment of the DNA, could also impact segregation by changing the distribution of adsorbed (or bound) ions about the DNA. Lastly, one might investigate to what degree changing the base-pair sequence would affect the phase segregation, and whether this correlates with expected patterns of helical distortions. These proposed systematic experimental investigations should allow us to better understand the interactions involved in homology recognition. Indeed, our first goal would be to try to fit the parameters of this current model to such data, before considering other possible candidate mechanisms of homology recognition [6, 17, 18]. Notably, such studies would be far-reaching, as the nature of such forces may be important in initiating critical biological processes such as homologous recombination.
Acknowledgments
AAK is grateful to the Japan Society for the Promotion of Science (JSPS) for a short-term award and to Prof K Yoshikawa, his host at Doshisha University, where this work was started.
Discussions with him and Professor Takashi Ohyama are greatly appreciated. AW acknowledges the Minnesota Supercomputing Institute (MSI) at the University of Minnesota for providing resources that contributed to the research results reported within this paper. URL: http://msi.umn.edu.
References
[1] Baldwin G, Brooks N, Robson R, Goldar A, Wynveen A, Leikin S, Seddon J and Kornyshev A A 2008 Duplex DNA recognize sequence homology in protein free environment J. Phys. Chem. B 112 1060
[2] 2008 Spooky attraction of DNA from a distance New Sci. 15
    2008 Paired pairs Nature 451 609
    2008 DNA’s self regard Science 319 879
    2008 Seeking recognition Biopolymers 89 4
    Choi C Q 2008 Double helix double up Sci. Am. 298 30
    Falaschi A 2008 Commentary: Similia similibus: pairing of homologous chromosomes driven by the physicochemical properties of DNA HFSP J. 2 257
[3] Inoue S, Sugiyama S, Travers A A and Ohyama T 2007 Self-assembly of double stranded DNA molecules at nanomolar concentrations Biochemistry 46 164
[4] Danilowicz C, Lee C H, Kim K, Hatch K, Coljee V W, Kleckner N and Prentiss M 2009 Single molecule detection of direct, homologous DNA/DNA pairing Proc. Natl Acad. Sci. USA 106 19824
[5] Nishikawa J and Ohyama T 2013 Selective association between nucleosome with identical DNA sequences Nucleic Acids Res. 41 1544
[6] Lee D J (O’), Danilowicz C, Rochester C, Kornyshev A A and Prentiss M 2016 Evidence of protein-free homology recognition in magnetic bead force–extension experiments Proc. R. Soc. A 472 20160186
[7] Cha R S, Weiner B M, Keeney S, Dekker J and Kleckner N 2000 Progression of meiotic DNA replication is regulated by interchromosomal interaction proteins, negatively by Spo11p and positively by Rec8p Genes Dev. 14 493
[8] Zickler D 2006 From early homologue recognition to synaptonemal complex formation Chromosoma 115 158
[9] Barzel A and Kupiec M 2008 Finding a match: How do homologous sequences get together for recombination? Nat. Rev. Genet. 9 27
[10] Kornyshev A A, Lee D J, Leikin S and Wynveen A 2007 Structure and interactions of biological helices Rev. Mod. Phys. 79 943
[11] Kornyshev A A and Leikin S 1999 Electrostatic zipper motif for DNA aggregation Phys. Rev. Lett. 82 4138
[12] Dickerson R E and Drew H R J 1981 Structure of a B-DNA dodecamer: II influence of base sequence on helix structure J. Mol. Biol. 149 761
[13] Dickerson R E 1992 DNA structure from A to Z Methods Enzymol. 211 67
[14] Kornyshev A and Leikin S 2001 Sequence recognition in the pairing of DNA duplexes Phys. Rev. Lett. 86 366
[15] Kornyshev A A 2010 Physics of DNA: unravelling hidden abilities encoded in the structure of ‘the most important molecule’ Phys. Chem. Chem. Phys. 39 12352
[16] Lee D J, Wynveen A, Albrecht T and Kornyshev A A 2015 Which way up? Recognition of homologous DNA segments in parallel and antiparallel alignment J. Chem. Phys. 142 045101
[17] Mazur A K 2016 Homologous pairing between long DNA double helices Phys. Rev. Lett. 116 158101
[18] Lu B, Naji A and Podgornik R 2015 Molecular recognition by van der Waals interaction between polymers with sequence-specific polarizabilities J. Chem. Phys. 142 214904
[19] Lavery R R, Maddocks J H, Pasi M and Zakrzewska K 2014 Analyzing ion distributions around DNA Nucleic Acids Res. 42 8138
[20] Wynveen A, Lee D J and Kornyshev A A 2005 Statistical mechanics of columnar DNA assemblies Eur. Phys. J. E 16 303
[21] Kornyshev A A and Wynveen A 2009 The homology recognition well as an innate property of DNA structure Proc. Natl Acad. Sci. USA 106 4683
[22] Wynveen A, Lee D J, Kornyshev A A and Leikin S 2008 Helical coherence of DNA in crystals and solution Nucleic Acids Res. 36 5540
[23] Lipfert J, Kerssemakers J W J, Jager T and Dekker N H 2010 Magnetic torque tweezers: measuring torsional stiffness in DNA and RecA-DNA filaments Nat. Methods 7 977
[24] Smith S B, Cui Y J and Bustamante C 1996 Overstretching B-DNA: the elastic response of individual double-stranded and single-strand DNA molecules Science 271 5250
[25] Kornyshev A A and Wynveen A 2004 Nonlinear effects in torsional adjustment of interacting DNA Phys. Rev. E 69 041905
[26] Cherstvy A G, Kornyshev A A and Leikin S 2004 Torsional deformation of double helix in interaction and aggregation of DNA J. Phys. Chem. B 108 6508
[27] Lee D J, Wynveen A and Kornyshev A A 2004 DNA–DNA interaction beyond the ground state Phys. Rev. E 70 051913
[28] Lee D J and Wynveen A 2006 Torsional fluctuations in columnar DNA assemblies J. Phys.: Condens. Matter 18 787
[29] Mezard M, Parisi G and Virasoro M 1987 Spin Glass Theory and Beyond (Singapore: World Scientific)
[30] Rau D C and Parsegian V A 1992 Direct measurement of the intermolecular forces between counterion-condensed DNA double helices. Evidence for long range attractive hydration forces Biophys. J. 61 246
[31] Rau D C and Parsegian V A 1992 Direct measurement of temperature-dependent solvation forces between DNA double helices Biophys. J. 61 260