Inst. Phys. Conf. Ser. No 173 - Satellite colloquium
Paper presented at 24th Int. Coll. Group Theoretical Methods in Physics, Paris, France,
15-20 July 2002
©2003 IOP Publishing Ltd
Extracting structural and dynamical informations from
wavelet-based analysis of DNA sequences
A Arnéodo(1)†, B Audit(2), C Vaillant(1), Y d'Aubenton-Carafa(3), and
C Thermes(3)
(1) Centre de Recherche Paul Pascal, avenue Schweitzer, 33600 Pessac, France.
(2) Computational Genomics Group, EMBL-European Bioinformatics Institute, Wellcome
Trust Genome Campus, Cambridge CB10 1SD, UK.
(3) Centre de Génétique Moléculaire du CNRS, Allée de la Terrasse, 91198 Gif-sur-Yvette,
France
Abstract. The packaging of eucaryotic genomic DNA involves its wrapping around
the histone proteins [1], followed by successive foldings into higher-order structured
nucleoprotein complexes [2]. The bending properties of DNA play an essential role in
these compaction processes [3, 4]. This hierarchically organized pathway is likely to be
reflected in the fractal behavior of DNA bending signals in eucaryotic genomes, but the
challenge is to extract this structural information by a clever reading of the DNA
sequences. We show that when an adapted mathematical tool, the "wavelet transform
microscope" [5, 6], is used to explore the fluctuations of bending profiles, one reveals a
characteristic scale of 100-200 bp that separates two different regimes of (long-range)
power-law correlations (PLC) that are common to eucaryotic as well as eubacterial and
archaeal genomes. The same analysis of the DNA text yields strikingly similar results to
those obtained with bending profiles, and this for all three kingdoms. In the small-scale
regime, PLC are observed in eucaryotic genomes, in nuclear replicating DNA viruses and in
archaeal genomes, which contrasts with their total absence in the genomes of eubacteria
and their viruses, thus indicating that small-scale PLC are likely to be related to the
mechanisms underlying the wrapping of DNA around histone proteins. These results, together
with the observation of PLC between particular sequence motifs known to participate in the
formation of nucleosomes (e.g. AA dinucleotides), show that the 10-200 bp PLC provide a
very efficient diagnostic of the nucleosomal structure, and this in coding as well as in
noncoding regions [7, 8]. We discuss possible interpretations of these PLC in terms of the
physical mechanisms that might govern the positioning and dynamics of the nucleosomes
along the DNA chain through cooperative processes [8]. We further speculate that the
large-scale PLC are the signature of the higher-order structure and dynamics of chromatin.
The availability of fully sequenced genomes offers the possibility to study the scale-invariance
properties of DNA sequences on a wide range of scales extending from tens to thousands
of nucleotides. Scale-invariance measurements make it possible to reveal particular
correlation structures between distant nucleotides or groups of nucleotides. During the past
few years, there has been intense discussion about the existence, the nature and the origin of
long-range correlations in genomic sequences [9, 10, 11, 12]. Although it is now well accepted
that long-range correlations do exist in DNA sequences [6, 11, 13], their biological interpretation
is still debated [9, 10, 11, 12, 13, 14, 15, 16, 17]. Most of the models proposed so far
are based on genome plasticity and are supported by the reported absence of power-law
correlations (PLC) in coding DNA sequences [5, 6, 13, 18]. In a previous work [17],
from a systematic analysis of human exons, CDSs and introns, we found that PLC
are not only present in non-coding sequences but also in coding regions, somehow hidden
in their inner codon structure. Here we report the results of a recent study [7, 8] that
† Present address: Ecole Normale Supérieure de Lyon, 46, allée d'Italie, F-69364 Lyon Cedex 07, France
Figure 1. Cumulative bending profiles for a human DNA fragment (chromosome 21, positions
192 kb to 200 kb). The abscissa is the position n (kbp) on the sequence; the curves are cumulative
representations of the Pnuc (black) and DNase (grey) codings (to facilitate the
comparison, the mean drift of the curves has been eliminated) (from [8]).
demonstrate that the long-range correlations observed in DNA sequences are more likely the
signature of the hierarchical structural organization of chromatin. In contrast to previous
interpretations, we propose some understanding of these correlations as a necessity for
chromosome condensation-decondensation processes in relation with DNA replication, gene
expression and cell division [8].
A major problem of fractal analysis applied to DNA sequences is that these display a
mosaic structure characterized by "patches" resulting from compositional biases
with an excess of one type of nucleotide. When mapping DNA sequences to numerical
sequences using the "DNA walk" representation, these patches appear as trends in the DNA
walk landscapes that are likely to break the scale invariance [9, 10, 11, 12, 13, 18]. In previous
works [5, 6], we have emphasized the wavelet transform (WT) as a well-suited technique to
overcome this difficulty. By considering analyzing wavelets that make the "WT microscope"
blind to low-frequency trends, any bias in the DNA walk can be removed and the existence
of PLC associated with specific scale invariance properties can be revealed accurately. When
exploring sequences selected from the human genome, we have found that the fluctuations
in the patchy landscapes of both coding and non-coding DNA walks are monofractal with
Gaussian statistics in the small-scale range, which justifies the use of a single exponent H,
usually called the Hurst or roughness exponent [5, 6]. H values larger than the uncorrelated
random walk value H = 1/2 correspond to the existence of long-range correlations that
we will refer to as "persistence". To estimate this exponent, we just have to investigate the
behavior across scales (a) of the root mean square (r.m.s.) fluctuations of wavelet coefficients:
$$\sigma(a) \sim a^{H}. \tag{1}$$
Wavelet coefficients actually reflect the local variation (over size a) of the concentration of
nucleotides. Persistence (H > 1/2) therefore means that these concentrations fluctuate more
smoothly (over short distances) than for uncorrelated sequences, but at the same time with a
larger amplitude (over large distances) around the mean value [8].
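As an illustration of relation (1), the following Python sketch estimates H from the slope of log σ(a) versus log a, using a sampled Mexican hat wavelet. It is a minimal sketch, not the authors' implementation: the function names, the 1/a normalisation, the truncation of the wavelet at five scale units and the choice of scales are all assumptions made here for illustration.

```python
import numpy as np

def mexican_hat(a, half_width=5):
    """Mexican hat wavelet sampled at integer positions and dilated to scale a.

    half_width (in units of a) is an illustrative truncation choice.
    """
    t = np.arange(-half_width * a, half_width * a + 1) / a
    psi = (1.0 - t**2) * np.exp(-t**2 / 2.0)
    return psi - psi.mean()  # enforce an exactly zero sum (blind to constants)

def wavelet_rms(walk, scales):
    """r.m.s. of the wavelet coefficients of `walk` at each scale a."""
    walk = np.asarray(walk, dtype=float)
    sigma = []
    for a in scales:
        # Assumed 1/a normalisation, for which sigma(a) ~ a^H.
        coeffs = np.convolve(walk, mexican_hat(a) / a, mode="valid")
        sigma.append(np.sqrt(np.mean(coeffs**2)))
    return np.array(sigma)

def hurst_exponent(walk, scales):
    """Slope of log10 sigma(a) versus log10 a, i.e. the exponent H of Eq. (1)."""
    sigma = wavelet_rms(walk, scales)
    slope, _intercept = np.polyfit(np.log10(scales), np.log10(sigma), 1)
    return slope

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Uncorrelated random walk: the expected exponent is close to H = 1/2.
    walk = np.cumsum(rng.choice([-1.0, 1.0], size=2**16))
    print(hurst_exponent(walk, [8, 16, 32, 64, 128]))
```

Because the sampled wavelet is symmetric with zero sum, a constant offset or a linear trend in the walk contributes essentially nothing to the coefficients, which is the "blindness to low-frequency trends" invoked above.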
Here we summarize the results of a comparative analysis of the persistence properties
for both DNA texts and DNA bending profiles of various eucaryotic, eubacterial and archaeal
genomes [7, 8]. To study the DNA texts, we construct "DNA walks" according to the
binary coding method extensively used by Voss [11]; this method decomposes the nucleotide
sequence into four sequences corresponding to A, C, T or G (coding with 1 at the nucleotide
position and 0 at other positions). To construct DNA bending profiles that account for the
fluctuations of the local double helix curvature, we use the trinucleotide model proposed in
[19] (here called Pnuc), which was deduced from experimentally determined nucleosome
positioning. To test that our results do not simply result from a recoding of the DNA
text, we also use the trinucleotide coding table defined in [20] (here called DNase), which
is based on the sensitivity of DNA fragments to DNase I and which more likely codes for the
local flexibility properties of DNA. As illustrated in Fig. 1, it is worth noting that the differences
between the two tables are clearly sufficient to produce, at least at first sight, significantly
different DNA bending profiles.
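The two codings described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the DNA walk is taken here to be the cumulative sum of the binary indicator sequence, the function names are hypothetical, and `toy_table` contains made-up numbers, not values from the Pnuc [19] or DNase [20] tables.

```python
import numpy as np

def binary_coding(sequence, nucleotide):
    """Indicator sequence: 1 where `nucleotide` occurs, 0 at other positions."""
    return np.array([1.0 if base == nucleotide else 0.0
                     for base in sequence.upper()])

def dna_walk(sequence, nucleotide):
    """Cumulative sum of the indicator sequence (assumed definition of the walk)."""
    return np.cumsum(binary_coding(sequence, nucleotide))

def bending_profile(sequence, table, default=0.0):
    """One value per overlapping trinucleotide, looked up in `table`."""
    seq = sequence.upper()
    return np.array([table.get(seq[i:i + 3], default)
                     for i in range(len(seq) - 2)])

# HYPOTHETICAL values for illustration only; a real table assigns a value
# to every trinucleotide.
toy_table = {"AAA": 0.1, "AAT": 0.7, "ATT": 0.7, "TTT": 0.1}

if __name__ == "__main__":
    print(dna_walk("AACA", "A"))                  # walk for the "A" coding
    print(bending_profile("AAATTT", toy_table))   # four overlapping windows
```

A sequence of length N gives four walks of length N (one per nucleotide) and a bending profile of length N - 2, since each profile value is attached to a window of three bases.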
The first completely sequenced eucaryotic genome, Saccharomyces cerevisiae (S.c.),
provides an opportunity to perform a comparative wavelet analysis of the scaling properties
displayed by each chromosome. When looking at the global estimate of σ(a) over the DNA
walks corresponding to "A" in each of the 16 yeast chromosomes shown in Fig. 2(a), one
sees that all present superimposable behavior, with notably the same characteristic scale that
separates two different scaling regimes. At small scales, 20 ≲ a ≲ 200 (expressed in
nucleotide units), PLC are observed as characterized by H = 0.59 ± 0.02, a mean value which
is significantly larger than 1/2. At large scales, 200 ≲ a, stronger PLC with H = 0.82 ± 0.01
become dominant, with a cutoff around 10000 bp (a number that is by no means accurate) above
which uncorrelated behavior is observed. A similar wavelet analysis of the bending profiles of
the yeast chromosomes obtained when using the Pnuc coding table (Fig. 2(a)) reveals striking
similarities with the curves resulting from the DNA walk analysis, in both the small-scale and
the large-scale regimes. These observations are not simply due to a "recoding" of the DNA
sequences since, when using the DNase coding table, one notices a significant weakening of
the H exponent observed in the large-scale regime (H ≈ 0.6). The existence of these two
scaling regimes is confirmed in Fig. 3(a), where the probability density functions (pdfs) of
wavelet coefficient values of the yeast Pnuc bending profiles computed at different scales are
shown to collapse on a single curve, as predicted by the self-similarity relationship [5, 6]
$$a^{H}\rho_a(a^{H}T) = \rho(T), \tag{2}$$
provided one uses the scaling exponent value H = 0.60 in the scale range 10 ≲ a ≲ 100
and H = 0.75 in the scale range 200 ≲ a ≲ 1000. In the small-scale regime, the pdfs are
very well approximated by Gaussian distributions. In the large-scale regime, the pdfs have
stretched exponential-like tails. The fact that the self-similarity relationship (2) is satisfied
in both regimes corroborates the monofractal nature of the roughness fluctuations of the
yeast bending profiles. Similar quantitative results are obtained for the corresponding DNA
texts [7, 8]. Let us emphasize that we have also examined a number of eucaryotic DNA
sequences from different organisms (human, rodent, avian, plant and insect) and that we have
observed the same characteristic features as those obtained in Fig. 2(a) for S.c. (see Fig. 2(c)
for the human chromosome 21 and Table 1 of [8]). Note that the characteristic scale found for
higher eucaryotes, a* ≈ 100-140 bp, is slightly smaller than for S.c., and that the cross-over
between the two PLC regimes is remarkably robust for the four "A", "C", "G" and "T" DNA
walks. In a work in progress, we are investigating the possible departure from Gaussian
statistics in the small-scale regime when increasing the (G + C) content of the considered
DNA sequence.
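The connection between the collapse relation (2) and the scaling law (1) can be made explicit with a short calculation. This is a sketch, assuming that the second moment of the rescaled density ρ exists; the symbol σ_ρ is introduced here for illustration only and does not appear in the original text.

```latex
\begin{align*}
\sigma^{2}(a) &= \int T^{2}\,\rho_a(T)\,dT \\
              &= \int \left(a^{H}u\right)^{2}\rho_a\!\left(a^{H}u\right)a^{H}\,du
                 \qquad \text{(substituting } T = a^{H}u\text{)} \\
              &= a^{2H}\int u^{2}\,\rho(u)\,du
                 \qquad \text{(using } a^{H}\rho_a(a^{H}u) = \rho(u)\text{)} \\
              &= a^{2H}\,\sigma_\rho^{2},
\end{align*}
\qquad\text{so that}\qquad \sigma(a) = \sigma_\rho\,a^{H}.
```

In other words, if the pdfs at different scales collapse onto a single curve with exponent H, the r.m.s. of the wavelet coefficients necessarily follows the power law (1) with the same H over that scale range.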
The striking overall similarity of the results obtained with these different eucaryotic
genomes prompted us to also examine the scale invariance properties of bacterial genomes [7,
8]. Fig. 2(b) reports the results obtained for Escherichia coli, which are quite
typical of what we have observed with other eubacterial genomes. Again, there exists a
well-defined characteristic scale a* ≈ 200 bp that delimits a transition to very strong PLC
with H = 0.80 ± 0.05 at large scales. Let us point out that as for S.c. (Fig. 2(a)), if
[Figure 2: four panels, (a) S. cerevisiae, (b) E. coli, (c) Human chr. 21 and (d); abscissa a.]
Figure 2. Global estimate of the r.m.s. of WT coefficients: log10 σ(a) - 0.6 log10 a is plotted
versus log10 a; the dashed lines corresponding to uncorrelated (H = 1/2) and strongly
correlated (H = 0.80) regimes are drawn to guide the eye. (A horizontal line in
this logarithmic representation corresponds to H = 0.6.) The analyzing wavelet is
the "Mexican hat" wavelet [6]. (a) S. cerevisiae: "A" DNA walks of the 16 S. cerevisiae
chromosomes (--) and of the corresponding bending profiles obtained with the Pnuc (○)
and DNase (△) coding tables when averaged over the 16 chromosomes. (b) Escherichia coli:
"A" (grey --), "T" (grey - - -), "G" (black --) and "C" (black - - -) DNA walks and
the corresponding Pnuc (○) and DNase (△) bending profiles. (c) Human chromosome 21: "A",
"C", "G" and "T" DNA walks and corresponding Pnuc and DNase bending profiles; same
symbols as in (b). (d) Human chromosome 21: comparative analysis of DNA walks for all
adenines (--), adenines part of a dinucleotide AA (- ..) and isolated adenines not part of a
dinucleotide AA (---) (from [7]).
one uses the DNase table for human (Fig. 2(c)) and E. coli (Fig. 2(b)) sequences, one no
longer observes the strong PLC obtained with the Pnuc table. Fig. 3(b) reports the
wavelet coefficient pdfs of the E. coli Pnuc bending profile, which corroborate the existence
of a cross-over scale between two different monofractal scaling regimes characterized by
H = 0.50 ± 0.02 and H = 0.80 ± 0.05 respectively. In order to examine whether these
properties actually extend homogeneously over the whole genomes, σ(a) was calculated over
a window of width l = 2000, sliding along the bending profiles. The results reported in
Fig. 4 for yeast, a human contig and E. coli confirm the existence of a characteristic scale
a* ≈ 100-200 bp which seems to be robust all along the corresponding DNA molecules,
and this for all investigated genomes in the three kingdoms. Note that analogous results are
[Figure 3: two panels, (a) S. cerevisiae and (b) E. coli; abscissa T.]
Figure 3. Probability distribution functions of wavelet coefficient values of Pnuc bending
profiles. The analyzing wavelet is the Mexican hat [6]. (a) Saccharomyces cerevisiae:
log2(a^H ρ_a(a^H T)) vs T for the set of scales a = 12 (△), 24 (□), 48 (○), 192 (▲), 384 (■),
and 768 (●) in nucleotide units; H = 0.60 (H = 0.75) in the small (large) scale regime. (b)
Escherichia coli: same as in (a) but with H = 0.50 (H = 0.80) in the small (large) scale
regime (from [7]).
obtained for the four mononucleotide DNA walks as for the bending profiles [7,8].
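The windowed estimate σ(a, x) used to test homogeneity along a genome can be sketched as follows. This is a simplified illustration, not the authors' code: it uses non-overlapping windows of width l = 2000 rather than a continuously sliding window, pads the signal with its edge values to limit boundary effects, and all function names are hypothetical. The compensation exponent 2/3 is the one quoted in the caption of Fig. 4.

```python
import numpy as np

def wavelet_coefficients(signal, a, half_width=5):
    """Mexican hat wavelet coefficients of `signal` at scale a (same length)."""
    pad = half_width * a
    t = np.arange(-pad, pad + 1) / a
    psi = (1.0 - t**2) * np.exp(-t**2 / 2.0)
    psi -= psi.mean()  # exactly zero sum, so constant stretches give zero
    padded = np.pad(np.asarray(signal, dtype=float), pad, mode="edge")
    # Assumed 1/a normalisation; "valid" on the padded signal keeps len(signal).
    return np.convolve(padded, psi / a, mode="valid")

def local_sigma(signal, scales, window=2000):
    """r.m.s. of the coefficients per (scale, window): array (n_scales, n_windows)."""
    n_windows = len(signal) // window
    sigma = np.empty((len(scales), n_windows))
    for i, a in enumerate(scales):
        c = wavelet_coefficients(signal, a)[: n_windows * window]
        sigma[i] = np.sqrt(np.mean(c.reshape(n_windows, window) ** 2, axis=1))
    return sigma

def compensated_log_sigma(sigma, scales, exponent=2.0 / 3.0):
    """log10 sigma(a, x) - exponent * log10 a, the quantity grey-coded in Fig. 4."""
    log_a = np.log10(np.asarray(scales, dtype=float))[:, None]
    return np.log10(sigma) - exponent * log_a

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    walk = np.cumsum(rng.choice([-1.0, 1.0], size=20000))
    scales = [8, 16, 32]
    print(compensated_log_sigma(local_sigma(walk, scales), scales).shape)
```

In such a map, a characteristic scale that is robust along the sequence shows up as a feature located at the same row (scale) in every column (window).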
There exists, however, an important difference between eucaryotic and eubacterial
genomes: no PLC are observed for the latter in the small-scale regime, where uncorrelated
Brownian motion-like behavior with H = 1/2 is observed (Figs. 2(b) and 3(b)). As discussed
in previous works [5, 6, 9, 10, 13, 18], separate analyses of coding and non-coding eucaryotic
DNA walks actually show that introns display PLC (with a mean H value of 0.60 ± 0.02)
in the small-scale regime, while exons have no such correlations. At this point, it may seem
that PLC are inherent to non-coding sequences only, but that is not the case. As shown in
Fig. 5 for Archaeoglobus fulgidus, the wavelet investigation of five archaeal genomes (which
are mostly coding) also reveals the presence of small-scale PLC as observed in eucaryotic
genomes, although somewhat less pronounced [8]. Note that the strong large-scale PLC are
present in all eubacterial, archaebacterial and eucaryotic genomes.
What mechanism or phenomenon might explain the small-scale PLC in eucaryotic
genomes? Their total absence in eubacterial genomes raises the possibility that they could
be related to certain nucleotide arrangements in the 150 bp long DNA regions which are
wrapped around histone proteins to form the eucaryotic nucleosome [1, 2]. Indeed, eubacterial
genomic DNA is associated with histone-like proteins (e.g. HU), but no nucleosome-type
structure has been detected in these organisms [21]. Along this line, the observation of
small-scale PLC in archaeal genomes is consistent with the presence in archaebacteria of structures
similar to the eucaryotic nucleosomes [22]. This analysis has also been extended to viral
genomes. Small-scale PLC are clearly detected in most eucaryotic viral double-strand DNA
genomes, as shown for Epstein-Barr virus in Fig. 5. This further supports the hypothesis of
nucleosome-based PLC, since nucleosomes are present on double-strand DNA viruses [23].
The Poxviridae, which are the only animal DNA viruses replicating in the cytoplasm of their
host cells, code for a eubacterial type of histone-like protein [24], and no PLC are found in
this scale range, as shown in Fig. 5 for Melanoplus sanguinipes virus [8]. This observation
is consistent with our hypothesis and suggests that the genomic DNA of these viruses is
[Figure 4: three space-scale panels (a), (b) and (c), the last labelled Escherichia coli; abscissa x, ordinate a.]
Figure 4. Space-scale wavelet-like representation (x and a are expressed in nucleotide
units) of the local estimate of the r.m.s. σ(a, x) of the WT coefficients of the "A" DNA
walk. σ(a) is computed over a window of width l = 2000, sliding along the first 10^6 bp
of the yeast chromosome IV (a), a human contig (b) and of the Escherichia coli genome (c).
log10 σ(a) - 2/3 log10 a is coded using 128 grey levels from black (min) to white (max). The
horizontal white dashed line marks the scale a* = 200 bp where some minimum is observed
consistently along the entire genomes as a separation between two different monofractal
scaling regimes (see text). Note that in the human contig, the actual characteristic scale seems
closer to a* = 150 bp.
subject to packaging processes different from those of the other animal viruses. Other classes
of virus genomes, like the single and double-strand RNA viruses (with the exception of the
retroviruses), are very unlikely to be associated with nucleosomes. In all cases except retroviruses, we
observe a total absence of small-scale PLC, as shown in Fig. 5(c). In the case of retroviruses, it
is known that the integrated viral DNA is associated with nucleosomes in the cell nucleus [25];
we clearly confirm in Fig. 5(c) the presence of small-scale PLC (H ≈ 0.57 ± 0.02). Finally,
bacteriophage genomes do not present any small-scale PLC (Fig. 5 for T4 bacteriophage and
[8]), as already observed for their eubacterial hosts. This wavelet-based fractal analysis of
viral and cellular genomes of all three kingdoms supports the view that small-scale PLC are a
signature of nucleosomal DNA [7, 8].
To further investigate this PLC nucleosomal diagnostic, we ask whether particular
dinucleotides which are known to participate in the positioning and formation of
nucleosomes [26] (e.g. AA dinucleotides) would carry PLC specifically associated with
eucaryotic genomes. This can be examined by analyzing different DNA
walks generated with (i) all adenines, (ii) only adenines that are part of a dinucleotide AA
and (iii) isolated adenines that are not part of a dinucleotide AA. The analysis of human
[Figure 5: three panels (a), (b) and (c), with panel titles Viruses and Archaebacteria; abscissa a.]
Figure 5. Global estimate of the r.m.s. of WT coefficients of (a) "A" DNA walks and (b), (c)
bending profiles. The various symbols correspond to the following genomes: Archaeoglobus
fulgidus (squares), Epstein-Barr virus (dots), Melanoplus sanguinipes entomopoxvirus
(circles), T4 bacteriophage (stars), average over 21 single-strand RNA viruses (triangles) and
17 retroviruses (black triangles). Same representation as in Fig. 2.
(Fig. 2(d)) and eucaryotic (Table 1 in [8]) sequences shows that the "isolated A" DNA walk exhibits a
clear weakening of the PLC properties at small scale, while the "AA" DNA walk accounts
for a major part of the observed PLC on the "A" DNA walk, which confirms the nucleosomal
signature of small-scale PLC. Note that this observation is an additional illustration of the fact
that recoding does not trivially conserve the correlation law [8].
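The three adenine codings compared above can be written down explicitly. The sketch below is an illustration only: it assumes that an adenine counts as "part of a dinucleotide AA" when at least one of its two neighbours is also an adenine, which is one plausible reading of the text, and the function name is hypothetical.

```python
import numpy as np

def adenine_codings(sequence):
    """Return indicator sequences for (all A, A inside an AA, isolated A)."""
    seq = sequence.upper()
    n = len(seq)
    all_a = np.array([1.0 if base == "A" else 0.0 for base in seq])
    in_aa = np.zeros(n)
    for i in range(n):
        if seq[i] != "A":
            continue
        left = i > 0 and seq[i - 1] == "A"
        right = i + 1 < n and seq[i + 1] == "A"
        if left or right:  # assumed criterion for belonging to an AA
            in_aa[i] = 1.0
    isolated = all_a - in_aa  # adenines with no adenine neighbour
    return all_a, in_aa, isolated

if __name__ == "__main__":
    all_a, in_aa, isolated = adenine_codings("AACAGAAA")
    # The corresponding DNA walks are the cumulative sums of these sequences.
    print(np.cumsum(all_a), np.cumsum(in_aa), np.cumsum(isolated))
```

By construction the "AA" and "isolated A" indicator sequences add up to the "A" sequence, so any difference in their scaling behavior reflects how the correlations are shared between the two subsets rather than a change in the underlying text.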
Several studies have established the presence in genomic sequences of DNA motifs
related to bending properties. A 10.2 base periodicity has been observed using either
Fourier [27] or correlation function [28] analysis, specifically in eucaryotic genomes, where
it has been interpreted in relation to nucleosomal structures. However, there is a fundamental
difference between this nucleosome diagnostic based on periodicity and our analysis based
on scale invariance properties, which strongly suggests that the mechanisms underlying the
nucleosomal structure of eucaryotic genomes are multi-scale phenomena that actually involve
the whole set of scales in the 1-200 bp range. In this respect, periodicity (which concerns
5% of sequences that present affinities for the histone octamer that are significantly larger than
average [29]) and scale invariance (which concerns 95% of bulk genomic DNA sequences that
have an affinity for the histone octamer similar to that of random sequences [29]) should not
be considered as opposed to each other but rather as complementary. In [8], we have proposed
the following dynamical understanding of the observed small-scale PLC. In contrast to the
tight histone binding obtained with an adequate periodic distribution of bending sites, PLC
would facilitate the positioning of the histone core throughout a major part of the genome.
If one considers the translational positioning of nucleosomes as a mechanism of diffusion
along the DNA chain, and if we assume that, once the histone core is bound to DNA, the
distribution of bending sites has a direct consequence on this diffusion process, then the PLC
are likely to allow nucleosome mobility along DNA to proceed with an average displacement
(after a given number of elementary steps) larger than with uncorrelated sequences. The
persistent nature of the scale-invariant spatial organization of bending sites would be selected
in order to favour the overall dynamics of compaction of nucleosomes by enabling them to
explore larger segments of DNA. In other words, nucleosomes would require less energy
for a similar amplitude of displacement. Persistence therefore offers some understanding
of the modest free energy of nucleosome formation observed for most DNA sequences,
which also facilitates the translational mobility and thus the propensity of nucleosomes to
be dynamical structures. Such properties could then favour an optimal compromise between
DNA compaction and accessibility constraints. These hypotheses constitute new directions
for the study of the effects of the small-scale PLC on the structural, mechanical and dynamical
properties of DNA in chromatin.
The interpretation of the large-scale PLC observed in the DNA bending profiles as well
as in the DNA walks is an open problem. As suggested in [29], the signals involved in
nucleosome formation may act collectively over large distances to promote the packing of nucleosomal
arrays (10 nm filament) into high-order chromatin structures (30 nm fiber) [1, 2, 30]. Since
DNA bending sites are key elements for nucleosomal structures, the detailed investigation
of large-scale PLC (in the 200-5000 bp range) observed in eucaryotic bending profiles
should shed light on the compaction mechanisms at work in the hierarchical formation and
dynamics of chromatin. An important clue provided by our studies is that similar
long-distance correlations in bending profiles are also observed in eubacteria (Figs. 2(b) and 3(b))
and archaebacteria (Fig. 5) [8]. Actually, all chromosomes are subject to
condensation-decondensation processes (in relation with DNA replication, gene expression, ...) which
might result in common dynamical and structural properties. A deep understanding of the
large-scale PLC and their interpretation in terms of these constraints remain challenging
questions requiring further investigation.
This research was supported by the GIF GREG (project "Motifs dans les Séquences")
and by the Ministère de l'Education Nationale, de l'Enseignement Supérieur, de la Recherche
et de l'Insertion Professionnelle ACC-SV (project "Génétique et Environnement") and the
Action BioInformatique (CNRS, 2000). BA acknowledges the support from the European
Community through a Marie Curie Fellowship (contract: HPMF-CT-2001-01321).
References
[1] Luger K, Mäder AW, Richmond RK, Sargent DF, and Richmond TJ 1997, Nature 389 (6648) 251-260
[2] van Holde K 1989, Chromatin (New York: Springer)
[3] Drew HR and Travers AA 1985, J. Mol. Biol. 186 773-790
[4] Yao J, Lowary PT, and Widom J 1990, Proc. Natl. Acad. Sci. USA 87 7603-7607
[5] Arneodo A, Bacry E, Graves PV, and Muzy J-F 1995, Phys. Rev. Lett. 74 3293-3296
[6] Arneodo A, d'Aubenton-Carafa Y, Bacry E, Graves PV, Muzy J-F, and Thermes C 1996, Physica D 96 291-320
[7] Audit B, Thermes C, Vaillant C, d'Aubenton-Carafa Y, Muzy J-F, and Arneodo A 2001, Phys. Rev. Lett. 86 2471-2474
[8] Audit B, Vaillant C, Arneodo A, d'Aubenton-Carafa Y, and Thermes C 2002, J. Mol. Biol. 316 903-918
[9] Stanley HE, Buldyrev SV, Goldberger AL, Havlin S, Ossadnik SM, Peng C-K, and Simons M 1993, Fractals 1 (3) 283-301
[10] Li W, Marr TG, and Kaneko K 1994, Physica D 75 392-416
[11] Voss RF 1994, Fractals 2 (1) 1-6
[12] Karlin S and Brendel V 1993, Science 259 677-679
[13] Buldyrev SV, Goldberger AL, Havlin S, Mantegna RN, Matsa ME, Peng C-K, Simons M, and Stanley HE 1995, Phys. Rev. E 51 5084-5091
[14] Li W 1992, Int. J. Bifurc. Chaos 2 (1) 137-154
[15] Buldyrev SV, Goldberger AL, Havlin S, Stanley HE, Stanley MHR, and Simons M 1993, Biophys. J. 65 2673-2679
[16] Herzel H, Trifonov EN, Weiss O, and Grosse I 1998, Physica A 249 449
[17] Arneodo A, d'Aubenton-Carafa Y, Audit B, Bacry E, Muzy J-F, and Thermes C 1998, Eur. Phys. J. B 1 259-263
[18] Peng C-K, Buldyrev SV, Goldberger AL, Havlin S, Sciortino F, Simons M, and Stanley HE 1992, Nature 356 168-170
[19] Goodsell DS and Dickerson RE 1994, Nucl. Acids Res. 22 5497-5503
[20] Brukner I, Sanchez R, Suck D, and Pongor S 1995, J. Biomol. Struct. Dynam. 13 309-317
[21] Murphy LD and Zimmerman SB 1997, J. Struct. Biol. 119 336-346
[22] Reeve JN, Sandman K, and Daniels CJ 1997, Cell 89 999-1002
[23] Challberg MD and Kelly TJ 1989, Annu. Rev. Biochem. 58 671-717
[24] Borca MV, Irusta PM, Kutish GF, Carillo C, Afonso CL, Burrage AT, Neilan JG, and Rock DL 1996, Arch. Virol. 141 301-313
[25] Stanfield-Oakley SA and Griffith JD 1996, J. Mol. Biol. 256 503-516
[26] Thastrom A, Lowary PT, Widlund HR, Cao H, Kubista M, and Widom J 1999, J. Mol. Biol. 288 213-219
[27] Widom J 1996, J. Mol. Biol. 259 579-588
[28] Herzel H, Weiss O, and Trifonov EN 1999, Bioinformatics 15 187-193
[29] Lowary PT and Widom J 1997, Proc. Natl. Acad. Sci. USA 94 1183-1188
[30] Polach KJ and Widom J 1995, J. Mol. Biol. 254 130-149