Nucleic Acids Research Advance Access originally published online on August 30, 2008
Nucleic Acids Research 2008 36(17):5610-5622; doi:10.1093/nar/gkn543
Nucleic Acids Research, 2008, Vol. 36, No. 17 5610-5622
© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
A thermodynamic overview of naturally occurring intramolecular DNA quadruplexes
Niti Kumar and
Souvik Maiti*
Proteomics and Structural Biology Unit, Institute of Genomics and Integrative Biology, CSIR, Mall Road, Delhi 110 007, India
*To whom correspondence should be addressed. Tel: +91 11 2766 6156; Fax: +91 11 2766 7471; Email: souvik{at}igib.res.in
Received February 29, 2008. Revised August 7, 2008. Accepted August 8, 2008.
ABSTRACT
Loop length and its composition are important for the structural
and functional versatility of quadruplexes. To date, studies
on the loops have mainly concerned model sequences rather
than naturally occurring quadruplex sequences, which have diverse
loop lengths and compositions. Herein, we have characterized
36 quadruplex-forming sequences from the promoter regions of
various proto-oncogenes using CD, UV and native gel electrophoresis.
We examined folding topologies and determined the thermodynamic
profile for quadruplexes varying in total loop length (5–18
bases) and composition. We found that naturally occurring quadruplexes
have variable thermodynamic stabilities (ΔG37) ranging from –1.7
to –15.6 kcal/mol. Overall, our results suggest that both
loop length and its composition affect quadruplex structure
and thermodynamics, thus making it difficult to draw generalized
correlations between loop length and thermodynamic stability.
Additionally, we compared the thermodynamic stability of quadruplexes
and their respective duplexes to understand quadruplex–duplex
competition. Our findings invoke a discussion on whether biological
function is associated with quadruplexes with lower thermodynamic
stability which undergo facile formation and disruption, or
with quadruplexes with high thermodynamic stability.
INTRODUCTION
In nature, guanine-rich sequences are found in important regions
such as telomeres, centromeres, immunoglobulin switch regions,
mutational hot spots and promoter elements in the human genome,
and have the potential to form four-stranded structures which
are involved in regulatory roles (
1–6). These remarkably
diverse four stranded structures contain planar arrays of four
guanines, paired by intra- or intermolecular Hoogsteen bonds.
The lengths of the G-runs and the loops separating
them both contribute to overall topology (
7–10). Recently
a genome wide search has identified 376 000 potential quadruplexes
in the functionally important regions of genes (
11,
12). These
findings suggest that the potential for quadruplex formation
could contribute to the stability (or instability) of specific
classes of genes or reflect mechanisms for global regulation
of gene expression (
13). These unusual structures are also involved
in molecular recognition and may play intricate regulatory roles
at the RNA level (
14–16). Conserved elements with the
potential to form polymorphic G-quadruplex structures in the
first intron of human genes suggest that these elements may
act as structural targets for regulation of transcription, especially
RNA processing and translation (
17). The possible existence
and roles of G-quadruplexes
in vivo have been corroborated by
the detection of several proteins such as helicases and nucleases
that bind specifically to G-DNA with very high affinity (
18–24).
The growing awareness of the biological importance of quadruplexes has kindled interest in developing synthetic compounds that can bind selectively to these structures in order to exploit quadruplexes as a therapeutic target (25,26). Interestingly, the molecular recognition and function of these quadruplexes are influenced by loop length and composition. Recent computational approaches have found sequences that can fold into G-quadruplex structures to be widely dispersed in promoters throughout the human genome (27). The prevalence of putative quadruplex-forming sequences depends on the lengths of the guanine tracts and the distribution of the loops. The total loop length and its composition may determine the functionality of the quadruplexes and may thereby modulate their roles. These observations have fueled interest in understanding the role of loop length and composition in the thermodynamic stability, topology and molecular recognition of quadruplexes (28–34). The influence of loop length on the structure adopted by model sequences which form intramolecular quadruplexes has been investigated using a combination of experimental and molecular modeling techniques (28). Both approaches suggest a preference for antiparallel over parallel conformations for longer loops and vice versa for shorter loops. The available literature suggests a strong influence of loop length on quadruplex stability, with quadruplexes with shorter loops displaying higher thermodynamic stability. Additionally, loop composition also affects quadruplex thermodynamic stability; a single T-to-A loop substitution in an intramolecular quadruplex with G3 tracts separated by single-base loops has been shown to lower the stability by 8°C (33). Although the thermodynamic fate of quadruplexes with different loop lengths and compositions has been addressed using model sequences, the data set available from such studies is limited. 
Moreover, it is difficult to anticipate whether the results obtained for quadruplexes formed by model sequences can be extrapolated to naturally occurring quadruplex-forming sequences which possess heterogeneous loop length and composition. Hence, thermodynamic characterization of a considerable number of naturally occurring quadruplex sequences is desired. To address these important issues, we have investigated quadruplex-forming sequences in the promoter regions of various proto-oncogenes, examined their folding topologies and determined the thermodynamic profile for structures which have variable loop length and composition. Our data show that it is difficult to find a simple correlation between loop length and the thermodynamic stability of quadruplexes, in contrast to the correlation documented for model sequences. Additionally, we have compared the relative stability of duplex and quadruplex structures to comprehend the predominance of either of these structures at equilibrium (35–41).
MATERIALS AND METHODS
Sequence selection
Sequences 2 kb upstream of the Ensembl genes were downloaded
via the UCSC Genome Browser (hg18) using the Table Browser option. We
employed the computational search algorithm Quadfinder (42) to
identify potential quadruplex-forming sequences. The Ensembl
IDs obtained were subsequently matched with the Ensembl IDs
of genes which have been correlated to genomic instability in
human malignancies (
13). The sequences with three G-quartets
and total loop length of 5–18 bases were selected for
biophysical investigation.
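The selection criterion above (four runs of three guanines with a total loop length of 5–18 bases) can be illustrated with a short pattern search. This is a minimal sketch, not the Quadfinder algorithm: the function name `find_motifs`, the per-loop limit of 1–16 bases and the test sequences are hypothetical, and each start position reports only the shortest-loop assignment the regular expression finds.

```python
import re

# Hypothetical motif: four runs of three consecutive guanines separated by
# three loops of 1-16 bases each (any base, so loops may themselves contain G).
# The lookahead lets candidate motifs that overlap be reported separately.
_MOTIF = re.compile(
    r"(?=(G{3}([ACGT]{1,16}?)G{3}([ACGT]{1,16}?)G{3}([ACGT]{1,16}?)G{3}))"
)


def find_motifs(seq, min_total=5, max_total=18):
    """Return (start, (loop1, loop2, loop3)) for each putative motif.

    Only motifs whose total loop length lies in [min_total, max_total]
    are kept, mirroring the 5-18 base window used in the text.
    """
    hits = []
    for match in _MOTIF.finditer(seq.upper()):
        loops = tuple(len(match.group(i)) for i in (2, 3, 4))
        if min_total <= sum(loops) <= max_total:
            hits.append((match.start(), loops))
    return hits


# Synthetic sequences (not taken from the article) with loop lengths
# 2/2/6 and 2/6/2, i.e. the same total loop length distributed differently.
print(find_motifs("GGGTAGGGTAGGGTCATCAGGG"))
print(find_motifs("GGGTAGGGTCATCAGGGTAGGG"))
```

Because loops are allowed to contain guanines, the same sequence can often be partitioned into G-tracts and loops in more than one way; the sketch reports only one partition per start position.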
HPLC purified oligonucleotides were procured from SBS Genetech, China. The concentrations of these oligonucleotides were calculated by extrapolation of tabulated values of the monomer bases and dimers at 25°C using procedures reported earlier (43,44).
CD spectra were recorded on a Jasco spectropolarimeter (model 715, Japan) equipped with a thermoelectrically controlled cell holder and a cuvette with a path length of 1 cm. The sequences were heated at 95°C for 5 min followed by programmed cooling (0.15°C/min) in 10 mM sodium cacodylate buffer, pH 7.4, 100 mM KCl. CD spectra of the quadruplexes (10 µM) were recorded between 220 and 325 nm at 25°C, and each reported spectrum is the average of three scans.
UV melting experiments
The UV experiments were performed using a Cary 100 (Varian) UV-vis spectrophotometer. Both heating and cooling profiles of the quadruplexes (2 µM) were monitored at 295 nm (45) at a rate of 0.15°C/min in 10 mM sodium cacodylate buffer, pH 7.4, 100 mM KCl. Cooling and heating profiles were obtained for 80 quadruplex-forming sequences. Sequences that did not give a characteristic quadruplex melting transition or reversible cooling and heating profiles were not considered for further analysis. We also performed concentration-dependent UV experiments (2–100 µM) and filtered out the sequences whose melting temperature was not concentration independent, in order to differentiate between intra- and intermolecular systems (64). Thus, out of 80 sequences, we selected 62 that gave a characteristic quadruplex melting profile with reversible cooling and heating curves; the 18 excluded sequences are provided in SI Table 2. Out of these 62 sequences, we selected 40 that displayed a concentration-independent Tm for thermodynamic analysis (Tables 2–4). The absorbance profile recorded at 295 nm was analyzed by a nonlinear least square curve fitting method. This method accounts for the contributions of the pre- and post-transition baselines, and the thermodynamic data were obtained using equations described previously (46,47). The analysis was done using Mathematica 5.1 and Origin 7.0.
| (1) |
| (2) |
| (3) |
| (4) |
| (5) |
Au and Al are linear equations describing the upper and lower
baselines, respectively, where bu and bl are the fitted
intercepts of the upper and lower baselines and mu and ml
are the respective slopes. Keq is the equilibrium constant
for the unstructured–structured transition of an intramolecular
system and α is the folded fraction. A(T) is the dependent
variable and is the experimentally determined absorbance at
each temperature (T). Using these equations, the van't Hoff
enthalpy (ΔHvH) and entropy (ΔSvH) were calculated, and Tm
was calculated from the peak value of the first derivative of
the fitted curve. The absorbance profiles of these sequences
were also subjected to Monte Carlo methods for the determination
of the error of the fitted parameters ΔHvH and ΔSvH. The analysis
was done using the built-in method in Prism 5.0 as described
previously (46).
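For orientation, a standard two-state van't Hoff model that is consistent with the symbol definitions above can be written as follows. This is a sketch under assumptions and not necessarily the exact published form: the folded state is taken to follow the upper baseline (quadruplex absorbance at 295 nm decreases on unfolding), R is the gas constant, T is in kelvin, and α denotes the folded fraction.

```latex
\begin{align}
A_u(T) &= b_u + m_u T \\
A_l(T) &= b_l + m_l T \\
K_{eq}(T) &= \exp\!\left(-\frac{\Delta H_{vH}}{RT} + \frac{\Delta S_{vH}}{R}\right) \\
\alpha(T) &= \frac{K_{eq}(T)}{1 + K_{eq}(T)} \\
A(T) &= \alpha(T)\,A_u(T) + \bigl(1 - \alpha(T)\bigr)\,A_l(T)
\end{align}
```

With this sign convention, folding has negative enthalpy and entropy changes, and the folded fraction equals one half at the temperature where the enthalpy change equals T times the entropy change.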
Free energy (ΔGvH) was calculated at 37°C using the equation ΔGvH = ΔHvH – TΔSvH, assuming that ΔHvH and ΔSvH are temperature independent. To obtain the thermodynamic parameters for the duplexes formed by these sequences, we used the online tool HYTHER (ozone2.chem.wayne.edu), which employs the nearest-neighbor method (48,49) for prediction of nucleic acid hybridization thermodynamics, using experimental conditions of 2 µM strand concentration and 100 mM monovalent cations.
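The free-energy relation can be checked against the enthalpy and entropy values that the Results section quotes for VEGF (Q1), WNT 3 (Q2) and WNT 5A (Q3). The sketch below is illustrative only; the function name `delta_g` is hypothetical, and it assumes, as the text does, that enthalpy and entropy do not depend on temperature.

```python
def delta_g(dh_kcal_per_mol, ds_cal_per_mol_k, temp_c=37.0):
    """Free energy (kcal/mol) from enthalpy (kcal/mol) and entropy (cal/mol/K).

    Assumes temperature-independent enthalpy and entropy.
    """
    temp_k = temp_c + 273.15
    # Entropy is given in cal/mol/K, so convert the T*dS term to kcal/mol.
    return dh_kcal_per_mol - temp_k * ds_cal_per_mol_k / 1000.0


# Enthalpy and entropy values quoted in the article's Results section.
reported = {
    "VEGF (Q1)": (-50.6, -142.1),
    "WNT 3 (Q2)": (-101.5, -277.0),
    "WNT 5A (Q3)": (-44.0, -130.0),
}

for name, (dh, ds) in reported.items():
    print(f"{name}: dG37 = {delta_g(dh, ds):.1f} kcal/mol")
```

The three results round to –6.5, –15.6 and –3.7 kcal/mol, which matches the stabilities the article reports for these sequences.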
Differential scanning calorimetric (DSC) experiments
DSC experiments were performed to measure the heat required for the unstructured–structured transition at a strand concentration of 50 µM. The experiments were performed in a VP-DSC differential scanning calorimeter from Microcal (Northampton, MA) in 10 mM sodium cacodylate buffer, pH 7.4, 100 mM KCl with a scanning rate of 0.15°C/min. Repeated buffer versus buffer scans were carried out in the temperature range of 110–25°C to obtain an appropriate and reproducible baseline, which was subtracted from the sample scan. The subtracted sample scan was normalized for the oligonucleotide concentration. The Origin 7.0 analysis package was used to integrate the resulting curve to obtain the molar folding enthalpy (ΔHcal), and Tm was determined from the midpoint of the transition.
Nondenaturing gel electrophoresis
Out of 62 quadruplex-forming sequences, 40 sequences adopted predominantly one conformation as indicated by the CD experiments. These 40 sequences were annealed at a rate of 0.15°C/min in 10 mM sodium cacodylate buffer, 100 mM KCl, pH 7.4. The samples were electrophoresed on a 20% polyacrylamide nondenaturing gel run in 1 x TBE, pH 7.4, 100 mM KCl buffer. The gel was run at 4°C at a constant voltage of 100 V. The oligonucleotide concentration was 15 µM. After electrophoresis, the gel was stained with ethidium bromide and visualized using a BioRad Gel Doc XR.
RESULTS AND DISCUSSION
This study aims to elucidate the thermodynamic profiles of G-quadruplexes
formed by naturally occurring sequences present in the promoter
regions of various proto-oncogenes. These sequences were sorted
using Quadfinder (42), which employs a consensus uni-molecular
G-quadruplex sequence motif of the form
G_x L_y1 G_x L_y2 G_x L_y3 G_x, where x denotes the length of the
G-stretch and L_y1, L_y2 and L_y3 denote the lengths
of loops 1, 2 and 3, respectively. The selected sequences belong
to different regions of proto-oncogenes, which have been correlated
to genomic instability in human malignancies by Eddy and co-workers
(
13). The sequences investigated in our study have three G-quartets,
as they are the shortest G-tracts that form quadruplexes with
reasonable stability, for which the influence of total loop
length (5–18 bases) on the quadruplex stability can be
evaluated. Our dataset includes 62 quadruplex-forming sequences
present in the promoter regions of various proto-oncogenes located
at different positions with respect to transcription start site
(TSS) (SI
Table 1). Based on the loop length, we classified
the sequences with total loop length ranging from 5 to 18 bases
(tabulated in
Table 1). We initially examined the folding topologies
of these sequences through CD spectroscopy. Though CD does not
provide direct evidence for the structural features of quadruplexes,
the spectra obtained provide valuable information about the
structural characteristics of quadruplexes and agree with the
topology obtained by direct methods such as NMR (
50,
51). The
characteristic CD signals arise from G-G stacking between G-quartets,
the strength of which mainly depends on the conformation of
guanine bases around the glycosidic bond (
syn or
anti). The
difference in the orientation of glycosidic torsion conformation
gives rise to parallel and antiparallel structures (
8–10).
The CD signature comprising a positive peak at 262 nm and
a negative peak at 240 nm typically indicates a parallel conformation
and signature with a positive peak at 295 nm and a negative
peak at 238 nm indicates an antiparallel conformation. However,
the presence of both these signatures indicates mixed conformation.
The CD signature obtained for all the sequences has been tabulated
in SI
Table 1 and the spectra are provided in
Supplementary Data.
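The signature rules described above (a positive band near 262 nm for parallel, a positive band near 295 nm for antiparallel, both for mixed) can be expressed as a simple classifier. This is a rough sketch, not the authors' procedure: the function name `classify_cd`, the 20% relative threshold, the 5 nm window and the synthetic Gaussian spectra are all hypothetical, and the negative bands near 240 and 238 nm are not checked.

```python
import math


def classify_cd(spectrum, rel_threshold=0.2, window=5.0):
    """Classify a CD spectrum given as (wavelength_nm, ellipticity) pairs.

    A positive band counts as present when the largest ellipticity within
    `window` nm of its nominal position exceeds `rel_threshold` times the
    largest positive ellipticity in the spectrum (both values hypothetical).
    """
    top = max(e for _, e in spectrum)
    if top <= 0:
        return "undetermined"

    def has_band(center_nm):
        near = [e for w, e in spectrum if abs(w - center_nm) <= window]
        return bool(near) and max(near) > rel_threshold * top

    parallel = has_band(262.0)
    antiparallel = has_band(295.0)
    if parallel and antiparallel:
        return "mixed"
    if parallel:
        return "parallel"
    if antiparallel:
        return "antiparallel"
    return "undetermined"


def gaussian(w, center, height, width):
    """Gaussian band used only to build synthetic test spectra."""
    return height * math.exp(-((w - center) ** 2) / (2.0 * width ** 2))


wavelengths = range(220, 326)  # 220-325 nm, the range recorded in the study
parallel_like = [(w, gaussian(w, 262, 10, 8) - gaussian(w, 240, 4, 6))
                 for w in wavelengths]
antiparallel_like = [(w, gaussian(w, 295, 10, 6) - gaussian(w, 238, 4, 6))
                     for w in wavelengths]
mixed_like = [(w, e + gaussian(w, 295, 5, 6)) for w, e in parallel_like]

for label, spec in [("parallel-like", parallel_like),
                    ("antiparallel-like", antiparallel_like),
                    ("mixed-like", mixed_like)]:
    print(label, "->", classify_cd(spec))
```

A fixed threshold cannot distinguish a genuinely mixed fold from a predominantly parallel population with a small antiparallel shoulder, so in practice the cutoff would need to be chosen against reference spectra.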
The information provided in SI
Table 1 shows that out of the
62 sequences investigated in this study, two sequences adopted
predominantly antiparallel conformation, 38 sequences adopted
predominantly parallel conformation and the remaining 22 sequences
adopted both parallel and antiparallel signatures. The characteristic
CD spectra of a few sequences are shown in
Figure 1. The CD
spectra provided in
Figure 1 show that c-
MYC (Q21), c-
KIT (Q30),
WNT 3 (Q2),
AKT 2 (Q34),
MYB (Q35) have a predominantly parallel
population and a small antiparallel population, whereas
VEGF (Q1),
PDGFB (Q4) and
PIM 1 (Q9) adopt only the parallel topology.
Our results for the CD spectra of c-
MYC (Q21), c-
KIT (Q30),
VEGF (Q1) and
PDGFB (Q4) are in agreement with the topology reported
in literature (
6,
7,
50–52). An interesting observation
was made from the CD spectra of the quadruplex-forming sequences
from the
HCK protooncogene, Q24 and Q25 (
Supplementary Data,
SI
Table 1), which possess the same total loop length of 10
bases and loop composition, but display variation in base distribution
among the three loops. In Q24, the number of bases in loops
L1, L2 and L3 is 2, 2 and 6, respectively, while in Q25, these
are 2, 6 and 2, respectively. This minor difference in loop
distribution results in a remarkable difference in the CD spectra.
Q24 adopts a predominant parallel conformation, whereas Q25
adopts a mixed conformation (
Supplementary Data). Detailed inspection
of the CD spectra of all these sequences shows that sequences
with shorter loop lengths such as,
VEGF (Q1),
WNT 3 (Q2),
WNT5A (Q3),
PDGFB (Q4) adopt mainly parallel structures, while longer
loops such as,
RALB (Q59),
EGFR (Q60),
VAV 1 (Q61),
FGF 6 (Q62)
adopt mixed structures comprising both parallel and antiparallel
characteristics. A few quadruplexes with longer loops such as,
FLI (Q48),
THRA (Q52),
KRAS (Q57) and
THPO (Q58) also adopted
a parallel conformation. For the given data set, we observed
that only two sequences,
FGR (Q49) and
FES (Q53), adopted an
antiparallel fold. It is worth mentioning that for a few sequences
the assignment of bases to the loops is arbitrary, and it
is also likely that sequences that have been assigned longer
loops could also fold to adopt topologies with short loop lengths.
Examination of the CD spectra of all the naturally occurring
sequences studied here shows a predominance of the parallel
topology, hinting toward a bias for the parallel fold over the
antiparallel form as a recognition motif in biology.
Out of the 62 sequences, 40 sequences displayed predominantly
one type of conformation, (either parallel or antiparallel)
and the remaining 22 sequences displayed mixed conformations.
We performed native gel electrophoresis on these 40 sequences
to confirm the existence of one type of conformation as observed
in CD study. Our gel study showed that these sequences adopt
a single conformation with the absence of higher order structures
(
Supplementary Data). We observed that most of the quadruplexes
move faster than the oligo dT25 marker. However, a few quadruplex-forming
sequences, namely Q6, Q7, Q47 and Q48, have mobility almost
similar to that of the oligo dT25 marker. Note that the migration of
the oligo dTn marker does not necessarily correspond to that of the
single strand (53); this marker was chosen to provide an internal
migration standard. The anomalous behavior observed for the Q6 and
Q7 quadruplexes (18 mer) and for Q47 and Q48 (26 mer) raises
doubt concerning the molecularity of the structure. However,
concentration dependent UV melting study for these sequences
shows that these sequences adopt a unimolecular structure. A
possible explanation for this anomaly is that quadruplexes with
different loop lengths adopt varying topologies which give rise
to different electrophoretic mobilities. Since the loop residues
influence whether the quadruplex structure is compact or extended,
a less compact structure may display a lower mobility.
Further investigations to determine the thermodynamic profile by van't Hoff analysis were carried out only with the 40 sequences which displayed predominantly one type of conformation. We also performed UV melting of the remaining 22 sequences, which showed a mixed population. The data for these sequences are presented in Supplementary Data and can be used as a resource for future investigations; however, these 22 sequences have been excluded from the present analysis. UV melting of the quadruplexes showed a characteristic hypochromic absorbance profile at 295 nm with reversible heating and cooling curves, indicating that the system is in thermodynamic equilibrium. We performed concentration-dependent UV melting on all 40 sequences and observed that the melting temperatures were concentration independent (data not shown), which indicates that they form intramolecular structures. Representative UV melting profiles of a few of these quadruplexes are presented in Figure 2; the melting profiles of the rest of the sequences are provided in Supplementary Data. In Figure 2, the upper panel displays melting profiles of quadruplexes from the promoters of c-MYC (Q21), c-KIT (Q30), VEGF (Q1) and PDGFB (Q4), which have melting temperatures (Tm) of 75.0, 71.6, 80.8 and 83.0°C, respectively. We identified new quadruplex-forming sequences which demonstrated remarkable stability; these are presented in the lower panel of Figure 2. These sequences belong to the promoter regions of AKT 2 (Q34), MYB (Q35), PIM 1 (Q9) and WNT 3 (Q2), which exhibit Tm values of 87.0, 88.0, 82.8 and 86.0°C, respectively. We found several examples where sequences with the same loop length display different Tm values. For example, VEGF (Q1), WNT 3 (Q2) and WNT 5A (Q3) have total loop lengths of 5 bases and Tm values of 80.8, 86.0 and 65.0°C, respectively. It can also be seen that structures with different loop lengths display similar melting temperatures. 
For example, JUN B (Q30), MAF (Q42), FES (Q53) and FGF 6 (Q62) have total loop lengths of 11, 13, 14 and 18 bases, respectively, but all exhibit a Tm of 70.0°C. The melting profiles obtained for sequences Q6, Q7, Q16 and Q17 showed very broad transitions suggestive of multiple quadruplex conformations. However, the UV melting data for these sequences were concentration independent, thereby indicating intramolecular quadruplex formation. Moreover, the CD and gel studies also showed that these sequences each adopt a single conformation. The multiple transitions in the UV melting profiles of these structures suggest the presence of conformations which have similar CD spectra and similar mobility but different melting temperatures. These sequences were not considered for further analysis.

Figure 2. UV cooling profile of quadruplexes monitored at 295 nm along with nonlinear least square curve fitting (in red). The experiment was performed in 10 mM sodium cacodylate buffer, pH 7.4, 100 mM KCl. The cooling rate was 0.15°C/min. The thermodynamic parameters obtained by van't Hoff analysis are provided in the inset.
The sequences investigated in this study contain additional
Gs at the 3'-or 5'-ends, which create ambiguity concerning which
Gs are in the loops and which are in the G-stack. This issue
is usually avoided in model sequences, where the loops are homogenous
in composition and length. Since the promoter regions of proto-oncogenes
are enriched in guanines, naturally occurring quadruplex-forming
sequences often contain GGG sequences in the loops. For the
final analysis using CD, gel and concentration-dependent UV
melting studies, we therefore selected only 36 sequences, which
predominantly showed one type of conformation and evaluated
their thermodynamic parameters using van't Hoff analysis. The
thermodynamic parameters obtained from nonlinear least square
curve fittings are tabulated in Tables 2–4. The raw data
were also subjected to a Monte Carlo method, which involves
fitting 1000 simulated data sets by the nonlinear least square curve
equation, and these results are tabulated in SI Tables 3–5.
The thermodynamic parameters obtained by the Monte Carlo method
are in agreement with the data obtained from the direct fit
of the raw data using the nonlinear least square equation. Our data
show that quadruplexes with the same loop length have different
enthalpy (ΔH) and entropy (ΔS) values. For example,
VEGF (Q1),
WNT 3 (Q2)
and
WNT 5A (Q3) have total loop length of 5 bases and the enthalpy
associated with these structures is –50.6, –101.5
and –44.0 kcal/mol, respectively. The corresponding entropy
associated with these structures is –142.1, –277.0
and –130.0 cal/mol/K, respectively. We also observed that
quadruplexes with different loop lengths can have similar enthalpy
and entropy values. For instance, the quadruplex structure formed
by
WNT 3 (Q2) and
MYEOV (Q46), with total loop lengths of 5
and 13 bases, respectively, possess remarkably high enthalpy
(–101.5 and –109.0 kcal/mol, respectively) and unfavorable
entropy values (–277.0 and –330.0 cal/mol/K, respectively).
We also observed that sequences with different loop lengths
and melting temperatures exhibit similar range of enthalpy and
entropy values, as seen in case of
FOS (Q8),
IGF 2 (Q13),
FBXW 7 (Q18), c-
MYC (Q21) and c-
KIT (Q30). Sequences with similar
melting temperatures displayed different enthalpy and entropy
values, as observed in case of
VEGF (Q1) and
JUN B (Q5). Due
to these striking differences, it is difficult to draw generalized
reasons for the variable thermal stability observed between
these sequences.
Table 5. Comparison of thermodynamic data obtained from DSC and UV study performed in 10 mM sodium cacodylate buffer, pH 7.4, 100 mM KCl
Next we performed differential scanning calorimetry (DSC) to
obtain model-independent values which were compared with the
thermodynamic data obtained from UV study. The DSC experiments
were performed for few sequences which were selected randomly
to obtain the van't Hoff and calorimetric enthalpy associated
with the unstructured–structured transition. The unstructured–structured
transition may be associated with nonzero heat capacity changes.
These heat capacity changes arise from the change in solvent
exposure for nonpolar and polar groups. These heat capacity
changes are determined by the difference between pre- and post-transition
baseline of the DSC curves. The net effect of solvent exposure
during unstructured–structured transition can sometimes
be relatively small due to simultaneous exposure or burial of
polar and nonpolar groups. Precise nonzero heat capacity changes
(ΔCp) associated with quadruplex melting can be obtained at very
high oligonucleotide concentrations (200 µM). However, quadruplex
melting experiments performed at high concentrations lead to
DNA precipitation at high temperatures. At low concentrations
(50 µM) this aggregation problem can be avoided, but determination
of ΔCp was not possible due to the low signal. Due to this limitation,
we performed the analysis with the assumption of zero heat capacity
changes. The assumption of negligible heat capacity change associated
with unstructured–structured transition is very common
in this field, as the process yields heat capacity changes within
experimental errors (
54,
55). The shape of the curve provides
the van't Hoff enthalpy and the area under the curve provides
calorimetric enthalpy (
Figure 3). We observed that the ratio of
van't Hoff to calorimetric enthalpy is nearly equal to 1,
indicating that the structured–unstructured transition follows
a two-state process. The thermodynamic parameters obtained from
DSC experiments correlated well with the parameters obtained
from the UV study, further indicating that the model used for the
analysis in the UV study is correct (Table 5). Recently, using
isothermal titration calorimetry (ITC), heat capacity changes associated
with quadruplex formation in a solution containing excess of
salt were measured (
56). ITC measurements differ from thermal
denaturation such as DSC and UV melting. The results obtained
from the former technique are valid at temperature selected
for the experiment, which is usually lower than
Tm, while, the
results obtained from the latter technique are reliable at or
near the
Tm of the unstructured–structured transition.
In the ITC experiments, the quadruplex formation is measured
under a high salt concentration where the oligonucleotide is
subjected to changes in salt conditions; while the thermal melting
experiments do not involve such changes in salt conditions.
Therefore disagreement in the heat capacity changes obtained
through ITC and DSC experiments is quite likely.
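The two-state check described above, in which the area under the DSC curve gives the calorimetric enthalpy and the curve shape gives the van't Hoff enthalpy, can be sketched numerically. The example below is an illustration on synthetic data and not the authors' analysis: it assumes zero heat capacity change, a monomolecular transition, the common shape relation ΔHvH = 4RTm²Cp,max/ΔHcal, and a Tm read from the peak position; the function names and the 50 kcal/mol, 350 K test curve are hypothetical.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol K)


def two_state_cp(temp_k, dh_unfold, tm_k):
    """Excess heat capacity (kcal/mol/K) of a monomolecular two-state
    transition with unfolding enthalpy dh_unfold and midpoint tm_k."""
    k = math.exp(-dh_unfold / R * (1.0 / temp_k - 1.0 / tm_k))
    return dh_unfold ** 2 * k / (R * temp_k ** 2 * (1.0 + k) ** 2)


def enthalpies_from_dsc(temps, cps):
    """Return (dh_cal, dh_vh, tm_peak) from a baseline-subtracted DSC curve.

    dh_cal is the trapezoidal area under the curve; dh_vh uses the shape
    relation 4*R*Tm^2*Cp_max/dh_cal, which holds for a two-state
    monomolecular transition with no heat capacity change.
    """
    dh_cal = sum(
        0.5 * (cps[i] + cps[i + 1]) * (temps[i + 1] - temps[i])
        for i in range(len(temps) - 1)
    )
    i_max = max(range(len(cps)), key=cps.__getitem__)
    tm_peak, cp_max = temps[i_max], cps[i_max]
    dh_vh = 4.0 * R * tm_peak ** 2 * cp_max / dh_cal
    return dh_cal, dh_vh, tm_peak


# Synthetic curve: 280-420 K in 0.1 K steps.
temps = [280.0 + 0.1 * i for i in range(1401)]
cps = [two_state_cp(t, 50.0, 350.0) for t in temps]
dh_cal, dh_vh, tm_peak = enthalpies_from_dsc(temps, cps)
print(f"dH_cal = {dh_cal:.1f} kcal/mol, dH_vH = {dh_vh:.1f} kcal/mol, "
      f"ratio = {dh_vh / dh_cal:.2f}")
```

For a curve generated from a two-state model the ratio comes out close to 1; a ratio well below 1 in real data would point to intermediates or multiple conformations.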
Examination of Tables 2–4 shows that the enthalpy and entropy
associated with quadruplex formation for these 36 sequences
vary from –17.7 to –109.0 kcal/mol and –48.0
to –330.0 cal/mol/K, respectively. This indicates that
quadruplex formation is associated with favorable enthalpy and
unfavorable entropy changes. During quadruplex formation the
enthalpic stabilization is compensated by an entropic cost due
to ordering or stacking of bases in the loop region. It should
be noted that sequences with G-tracts longer than three
guanines can potentially adopt multiple structures, and the
thermodynamic parameters obtained for these sequences may contain
large errors. The presence of multiple conformations with similar
stabilities will give shallow melting profiles with apparently
low values for enthalpy. Due to the limited studies on thermodynamic
stability of the naturally occurring quadruplex-forming sequences,
we can only compare our data with literature values reported
for model sequences. It has been reported that quadruplexes
formed by model sequences with loop length varying from T2 to
T5 have enthalpy values ranging from –44.0 to –31.0
kcal/mol (
35). Another study on stability of intramolecular
quadruplexes formed by DNA sequences containing four G3 tracts
separated by either single T or T4 loops, showed higher stability
with the single T loops and the arrangement of different length
loops has little effect on thermodynamic stability of quadruplex
(
31). Comparing our data with literature reports for model sequences
with total loop length of 7 bases, we observe that enthalpy
ranges from –17.7 to –53.0 kcal/mol for quadruplexes
formed by naturally occurring sequences, in contrast to values
reported for model sequences, which range from –26.8 to
–33.7 kcal/mol (34). Further, for longer loops, we observe
that our enthalpy values are slightly lower than those reported
for model sequences, thereby hinting toward multiple conformations
adopted by naturally occurring sequences (
31,
32,
34,
35).
Both enthalpy and entropy in turn influence ΔG, which is an indicator of thermodynamic stability. It is well documented in the literature that an increase in loop length results in a decrease in quadruplex stability; however, this observation is reported mainly for model sequences (29–34). A recent study using model sequences has shown that total loop length
5 has a major effect on quadruplex stability, and the addition of a single nucleotide significantly affects the thermodynamic stability (34). Another study using model sequences with long central loops (6–9 bases) and two side loops composed of a single base has shown that pyrimidines provide more stability than adenines and that the composition of the central loop can minimize the destabilizing effect of a long central loop on quadruplex stability (29). This is in agreement with another study which demonstrated that, for intramolecular quadruplexes with G3 tracts and single base loops, a T-to-A loop substitution reduced stability by 8°C (33). Intriguingly, the thermodynamic data obtained for quadruplexes formed by naturally occurring sequences having different loop lengths show that there is no clear correlation of loop length with thermodynamic stability. Figure 4 depicts the variability in thermodynamic stability with respect to loop length. Quadruplexes with a total loop length of 5 formed by WNT 5A (Q3), VEGF (Q1) and WNT 3 (Q2) have thermodynamic stabilities ranging from –3.7 to –15.6 kcal/mol. Similarly, wide variations in thermodynamic stability (–1.8 to –9.3 kcal/mol) were observed among quadruplexes formed by sequences with a total loop length of 7. Likewise, no significant correlation between loop length and thermodynamic stability was observed for other data sets with longer loops. The discrepancy between the thermodynamic behavior of quadruplexes formed by model sequences and by genomic sequences with varied loop length can be attributed to loop composition. For the model sequences, the loop length and composition are usually homogeneous. However, the naturally occurring sequences show random occurrences of purines and/or pyrimidines in the loops between the G stretches, which substantially influence the quadruplex thermodynamic stabilities. 
For instance, PIM1 (Q9) and FGF 3 (Q10), which have total loop lengths of 7 and similar loop composition, display remarkably different thermodynamic stabilities of –9.3 and –1.8 kcal/mol, respectively. Apart from total loop length and loop composition, the number of bases in the central loop is critical for quadruplex stability (32–35). We observed that for smaller loops, quadruplexes with a single base central loop are much more stable and that stability decreases as the number of bases in the central loop increases, as observed for WNT 3 (Q2) and WNT 5A (Q3), with thermodynamic stabilities of –15.6 and –3.7 kcal/mol, respectively. Our results show that the majority (32) of the quadruplex-forming sequences of proto-oncogenes have ΔG37 values approximately within the range of –1.8 to –8.6 kcal/mol. Only five genes from our data set have high ΔG37 values (more negative than –9.0 kcal/mol). These are WNT 3 (Q2), MYB (Q35), AKT 2 (Q34), PDGFB (Q4) and PIM 1 (Q9), which form quadruplex structures having 1–3 bases in the central loop, with ΔG37 values of –15.6, –10.4, –9.6, –9.6 and –9.3 kcal/mol, respectively. Potential quadruplex-forming sequences of the promoter regions of c-MYC, c-KIT, VEGF and PDGFB have served as paradigms for quadruplex-mediated gene expression regulation (6,7,51). The structures adopted by the promoter regions of c-MYC (Q21), c-KIT (Q30), VEGF (Q1) and PDGFB (Q4) have thermodynamic stabilities of –3.2, –3.6, –6.5 and –9.6 kcal/mol, respectively.
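As a rough illustration of why loop length alone is a poor predictor, the ΔG37 values quoted above can be grouped by total loop length and their spread computed. The snippet below is only a sketch: it uses the handful of values quoted in this section, not the full data set of Tables 2–4, and the names `QUOTED_DG37` and `spread_by_loop_length` are introduced here purely for illustration.

```python
from collections import defaultdict

# (gene, total loop length, dG37 in kcal/mol) as quoted in the text above.
# This is NOT the full data set; Tables 2-4 hold the complete values.
QUOTED_DG37 = [
    ("WNT 5A (Q3)", 5, -3.7),
    ("VEGF (Q1)", 5, -6.5),
    ("WNT 3 (Q2)", 5, -15.6),
    ("PIM1 (Q9)", 7, -9.3),
    ("FGF 3 (Q10)", 7, -1.8),
]


def spread_by_loop_length(records):
    """Return {loop_length: (most_stable, least_stable, spread)} in kcal/mol."""
    groups = defaultdict(list)
    for _gene, loop_length, dg in records:
        groups[loop_length].append(dg)
    summary = {}
    for loop_length, values in groups.items():
        most_stable = min(values)   # most negative dG37
        least_stable = max(values)  # least negative dG37
        summary[loop_length] = (
            most_stable,
            least_stable,
            round(least_stable - most_stable, 1),
        )
    return summary


if __name__ == "__main__":
    for length, (lo, hi, spread) in sorted(spread_by_loop_length(QUOTED_DG37).items()):
        print(f"total loop length {length}: {lo} to {hi} kcal/mol (spread {spread})")
```

For these quoted values, sequences sharing the same total loop length differ by several kcal/mol, which is the variability that Figure 4 depicts for the full data set.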
Emerging evidence from bioinformatic data analysis indicates
a wide distribution of quadruplex-forming sequences in the human
genome with an average incidence of 1 quadruplex in 10 000 bases
(27). Our study shows that the sequences in the promoter regions
of proto-oncogenes adopt quadruplex structures which exhibit
variable thermodynamic stability ranging from –1.7 to
–15.6 kcal/mol. These findings raise the question of
whether biological function is associated with quadruplex structures
of lower thermodynamic stability, which undergo facile
formation and disruption, or is accomplished by quadruplex
structures with high thermodynamic stability. Our attempt to
elucidate the thermodynamic profile of the quadruplexes formed
by naturally occurring sequences also invites a systematic effort
to integrate the thermodynamics and biological relevance of
quadruplexes, which would allow a better understanding of
quadruplex-mediated regulatory mechanisms. In the human genome, a number of sites
with potential quadruplex-forming structures have been estimated
and some of these potential sites for quadruplex formation have
been correlated with gene function (13). Quadruplex formation
at specific human promoter regions has suggested the notion
that quadruplexes might serve as potential regulatory motifs.
However, these speculations have only been supported by analysis
of either synthetic oligonucleotides or supercoiled DNA. To
comprehend how quadruplex formation contributes to gene regulation,
it is essential to take into account the contribution of the
competing Watson–Crick duplex structure. For quadruplex
formation, G-rich regions must be released from the duplex DNA,
which occurs under transient denaturation during replication,
transcription and recombination. The interconversion of quadruplex
and duplex DNA is dependent on the relative stability of these
competing secondary structures. In the current study we have
examined the role of loop length in the thermodynamic stability
of quadruplexes formed by naturally occurring sequences in human
proto-oncogene promoters. However, it is necessary to compare
the thermodynamic stability of quadruplexes with that of their
respective duplexes to understand duplex–quadruplex interconversion.
It is difficult to obtain the thermodynamic parameters of
duplex formation between these sequences and their respective
complementary strands by UV melting studies, as the melting profile
includes contributions from both duplex and quadruplex. We therefore
obtained the thermodynamic profiles for the duplexes through
a nearest neighbor (NN) method (48,49). This method assumes
that the stability of a given base pair depends on the identity
and orientation of the neighboring base pairs. Over the past
decades, numerous studies have used this method to calculate the
thermodynamic parameters of a given duplex under specified
experimental conditions. It is well established that the NN method allows
precise evaluation of the thermal stability and thermodynamic parameters
of duplexes, and that its predictions agree with
experimental data. Hyther is a tool that allows calculation
of nucleic acid hybridization thermodynamics using the NN method.
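To make the nearest-neighbor idea concrete, here is a minimal Python sketch that sums dinucleotide free-energy increments along a duplex. It is not Hyther and does not reproduce the values in Tables 2–4. The increments are the unified ΔG37 parameters at 1 M NaCl as commonly tabulated from ref. 48 and should be checked against that source. No salt or strand-concentration correction is applied, and the names `NN_DG37` and `duplex_dg37` are illustrative.

```python
# Unified nearest-neighbor dG37 increments (kcal/mol, 1 M NaCl), keyed by the
# 5'->3' dinucleotide on one strand. Values as commonly tabulated from ref. 48;
# verify against the original table before relying on them.
NN_DG37 = {
    "AA": -1.00, "AT": -0.88, "TA": -0.58, "CA": -1.45, "GT": -1.44,
    "CT": -1.28, "GA": -1.30, "CG": -2.17, "GC": -2.24, "GG": -1.84,
}
INIT_TERMINAL_GC = 0.98  # initiation penalty for a terminal G.C pair
INIT_TERMINAL_AT = 1.03  # initiation penalty for a terminal A.T pair
SYMMETRY = 0.43          # correction for self-complementary duplexes

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def duplex_dg37(seq):
    """Estimate dG37 (kcal/mol) for `seq` paired with its perfect complement."""
    seq = seq.upper()
    if len(seq) < 2 or set(seq) - set("ACGT"):
        raise ValueError("sequence must contain at least two of A, C, G, T")
    total = 0.0
    # One initiation term per duplex end, chosen by the terminal base pair.
    for end in (seq[0], seq[-1]):
        total += INIT_TERMINAL_GC if end in "GC" else INIT_TERMINAL_AT
    # One increment per overlapping dinucleotide step.
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        if step not in NN_DG37:
            # The same stack read from the opposite strand (e.g. TT is AA/TT).
            step = reverse_complement(step)
        total += NN_DG37[step]
    if seq == reverse_complement(seq):
        total += SYMMETRY
    return round(total, 2)


if __name__ == "__main__":
    print(duplex_dg37("CGTTGA"))
```

With these increments, `duplex_dg37("CGTTGA")` evaluates to –5.35 kcal/mol (0.98 + 1.03 – 2.17 – 1.44 – 1.00 – 1.45 – 1.30). A dedicated tool additionally corrects for the salt and strand concentrations of the experiment, which this sketch deliberately omits.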
Using Hyther, we calculated the thermodynamic profiles of the duplexes
formed by the G-rich sequences with their respective complementary
strands at the strand concentration and buffer conditions
used in this study (Tables 2–4). The important parameter that dictates the
predominance of either the duplex or quadruplex is the relative
free energy difference, ΔΔG37, between the duplex and quadruplex
forms. It is noteworthy that in all cases the ΔΔG37 values are
negative, indicating that the duplex is the predominant structure in
the competition (Tables 2–4). We also observed an increase
in duplex stability upon increasing the loop length (Tables 2–4).
The relative free energy difference, ΔΔG37, between the duplex
and quadruplex structures becomes more negative (–8.1 kcal/mol to –35.1
kcal/mol) upon increasing the loop length. The greater the negative
magnitude of ΔΔG37, the higher the predominance of the duplex at
equilibrium. As shown in Figure 5, only the quadruplexes formed
by WNT 3 (Q2), PDGFB (Q4), VEGF (Q1) and WNT 5A (Q3) demonstrate
less negative ΔΔG37 values of –8.1 kcal/mol, –12.1
kcal/mol, –13.3 kcal/mol and –16.0 kcal/mol, respectively.
These less negative ΔΔG37 values indicate that, although the duplex is
more stable than the quadruplex, the contribution from the competing
quadruplex population is high.
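The arithmetic behind the relative free energy difference can be sketched as follows. The sign convention used here, ΔΔG37 = ΔG37(duplex) – ΔG37(quadruplex), is an assumption chosen so that negative values favor the duplex, as the text describes. The duplex values it returns are back-calculated from the numbers quoted in this section and are not taken from Tables 2–4, which remain the authoritative source. The names `QUOTED` and `implied_duplex_dg` are illustrative.

```python
# (dG37 of the quadruplex, ddG37) in kcal/mol, as quoted in this section.
QUOTED = {
    "WNT 3 (Q2)": (-15.6, -8.1),
    "PDGFB (Q4)": (-9.6, -12.1),
    "VEGF (Q1)": (-6.5, -13.3),
    "WNT 5A (Q3)": (-3.7, -16.0),
}


def relative_free_energy(dg_duplex, dg_quadruplex):
    """ddG37 under the assumed convention: duplex minus quadruplex."""
    return round(dg_duplex - dg_quadruplex, 1)


def implied_duplex_dg(dg_quadruplex, ddg):
    """Duplex dG37 implied by a quoted quadruplex dG37 and ddG37."""
    return round(dg_quadruplex + ddg, 1)


if __name__ == "__main__":
    for gene, (dg_q, ddg) in QUOTED.items():
        dg_d = implied_duplex_dg(dg_q, ddg)
        # Round trip: recomputing ddG37 from the implied duplex value
        # must give back the quoted number.
        assert relative_free_energy(dg_d, dg_q) == ddg
        print(f"{gene}: quadruplex {dg_q}, implied duplex {dg_d}, ddG37 {ddg}")
```

Under this convention the four implied duplex values differ by only a few kcal/mol, so the less negative ΔΔG37 of WNT 3 (Q2) mostly reflects its unusually stable quadruplex.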
The genomic DNA predominantly exists in double-stranded conformation;
however, under specialized conditions guanine stretches fold
spontaneously into quadruplex structures. These structures have
the potential to serve as functional elements, and identifying
their structural and functional characteristics is essential
for understanding their role in gene regulation. The literature cites
examples of opposing roles of quadruplexes in biology. A repressor role
of quadruplexes in the KRAS and c-MYC promoter regions has been
demonstrated using reporter assays and the quadruplex-interacting
ligand TMPyP4 (18). In contrast, a stimulatory role of quadruplexes has
been shown for chicken beta globin genes and human insulin genes
through biochemical and biophysical analysis (18). These contrasting
roles observed for quadruplexes can be explained on the basis
of positional regulomics, in which the position of the regulatory
motif that binds a transcription factor influences its activator
or repressor activity in different tissues (57). Recently, a
dual role as a transcriptional repressor and activator has been
demonstrated for a GGA element in the quadruplex-forming region
of the c-MYB promoter (58). Analysis of a genome-wide study
has shown that the presence of potential quadruplex-forming
motifs in the transcriptional regulatory region (–500
to +500 bp) is associated with significant enrichment of RNA
polymerase II at the transcriptional regulatory region (59).
These findings taken together lead to the hypothesis that quadruplex
structures may be common regulatory elements involved in transcriptional
regulation. The transient separation of duplex DNA during transcription
increases the opportunity for quadruplex formation. These
structural motifs are kinetically trapped species which dissociate
slowly and hold the DNA in an open structure, thereby rendering
the template available for a higher rate of transcription. Apart
from the G-rich strand, the C-rich complementary strand also
has the potential to form a secondary structure called the i-motif.
This structure consists of intercalated hemiprotonated C:C+
base pairs (60,61). Although these structures are stable at
acidic pH, they may also form at near physiological pH, as in
the case of c-MYC (61). Lately, evidence of proteins interacting
with i-motif structures has aroused interest in the biological role
of i-motifs (62,63). Therefore, during transcription or ion or
pH fluctuations, both the G- and C-rich strands may adopt competing
secondary structures which dissociate slowly and keep the DNA
in an open conformation or act as molecular recognition motifs
for transcription factors.
CONCLUSION
Though loop length has a marked influence on the thermodynamic
stability of quadruplexes, the variation in loop composition
makes it difficult to find a significant correlation between
loop length and quadruplex stability. Apart from loop length
and composition, loop symmetry also affects the thermodynamic stability
of quadruplexes. Furthermore, the results obtained in this study
can be discussed in light of the natural scenario, where both
quadruplex and duplex structures are likely to coexist and undergo
transitions to execute their respective biological roles. The
relative stability of the duplex and quadruplex structures would
dictate the predominance of either of these structures at equilibrium.
Our results affirm that an increase in loop length pushes the equilibrium
toward the duplex, which outcompetes quadruplex formation.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
FUNDING
Council of Scientific and Industrial Research grant (N.K.).
Council of Scientific and Industrial Research grant (S.M.).
Funding for open access charge: Council of Scientific &
Industrial Research (CSIR), India.
Conflict of interest statement. None declared.
ACKNOWLEDGEMENT
The authors thank Dr Vinod Scaria for his help in using Quadfinder.
The authors also thank Dr Anjan A. Sen from the Centre of Theoretical
Physics, Jamia Millia Islamia University, for help in using
Mathematica 5.1.
REFERENCES
- Blackburn EH. Telomeres: no end in sight. Cell (1994) 77:621–623.
- Sen D, Gilbert W. Formation of parallel four-stranded complexes by guanine-rich motifs in DNA and its implications for meiosis. Nature (1988) 334:364–366.
- Simonsson T, Pecinka P, Kubista M. DNA tetraplex formation in the control region of c-myc. Nucleic Acids Res. (1998) 26:1167–1172.
- Simonsson T, Henriksson M. c-myc Suppression in Burkitt's lymphoma cells. Biochem. Biophys. Res. Commun. (2002) 290:11–15.
- Jain AS, Grand CL, Bearss DJ, Hurley LH. Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc. Natl Acad. Sci. USA (2002) 99:11593–11598.
- Sun D, Guo K, Rusche JJ, Hurley LH. Facilitation of a structural transition in the polypurine/polypyrimidine tract within the proximal promoter region of the human VEGF gene by the presence of potassium and G-quadruplex-interactive agents. Nucleic Acids Res. (2005) 33:6070–6080.
- Saxena S, Bansal A, Kukreti S. Structural polymorphism exhibited by a homopurine.homopyrimidine sequence found at the right end of human c-jun proto-oncogene. Arch. Biochem. Biophys. (2008) 471:95–108.
- Williamson JR. G-quartet structures in telomeric DNA. Annu. Rev. Biophys. Biomol. Struct. (1994) 23:703–730.
- Phan AT, Kuryavyi V, Patel DJ. DNA architecture: from G to Z. Curr. Opin. Struct. Biol. (2006) 16:288–298.
- Simonsson T. G-quadruplex DNA structures – variations on a theme. Biol. Chem. (2001) 382:621–628.
- Huppert JL, Balasubramanian S. Prevalence of quadruplexes in the human genome. Nucleic Acids Res. (2005) 33:2908–2916.
- Todd AK, Johnston M, Neidle S. Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res. (2005) 33:2901–2907.
- Eddy J, Maizels N. Gene function correlates with potential for G4 DNA formation in the human genome. Nucleic Acids Res. (2006) 34:3887–3896.
- Wieland M, Hartig JS. RNA quadruplex-based modulation of gene expression. Chem. Biol. (2007) 14:757–763.
- Kumari S, Bugaut A, Huppert JL, Balasubramanian S. An RNA G-quadruplex in the 5' UTR of the NRAS proto-oncogene modulates translation. Nat. Chem. Biol. (2007) 4:218–221.
- Arora A, Dutkiewicz M, Scaria V, Hariharan M, Maiti S, Kurreck J. Inhibition of translation in living eukaryotic cells by an RNA G-quadruplex motif. RNA (2008) 14:1290–1296.
- Eddy J, Maizels N. Conserved elements with potential to form polymorphic G-quadruplex structures in the first intron of human genes. Nucleic Acids Res. (2008) 36:1321–1333.
- Fry M. Tetraplex DNA and its interacting proteins. Front. Biosci. (2007) 12:4336–4351.
- Giraldo R, Suzuki M, Chapman L, Rhodes D. Promotion of parallel DNA quadruplexes by a yeast telomere binding protein: a circular dichroism study. Proc. Natl Acad. Sci. USA (1994) 91:7658–7662.
- Muniyappa K, Anuradha S, Byers B. Yeast meiosis-specific protein Hop1 binds to G4 DNA and promotes its formation. Mol. Cell. Biol. (2000) 20:1361–1369.
- Ghosal G, Muniyappa K. Saccharomyces cerevisiae Mre11 is a high-affinity G4 DNA-binding protein and a G-rich DNA-specific endonuclease: implications for replication of telomeric DNA. Nucleic Acids Res. (2005) 33:4692–4703.
- Liu Z, Gilbert W. The yeast KEM1 gene encodes a nuclease specific for G4 tetraplex DNA: implication of in vivo functions for this novel DNA structure. Cell (1994) 77:1083–1092.
- Sun H, Yabuki A, Maizels N. A human nuclease specific for G4 DNA. Proc. Natl Acad. Sci. USA (2001) 98:12444–12449.
- Zaug AJ, Podell ER, Cech TR. Human POT1 disrupts telomeric G-quadruplexes allowing telomerase extension in vitro. Proc. Natl Acad. Sci. USA (2005) 102:10864–10869.
- Hurley LH, Wheelhouse RT, Sun D, Kerwin SM, Salazar M, Fedoroff OY, Han FX, Han H, Izbicka E, Von Hoff DD. G-quadruplexes as targets for drug design. Pharmacol. Ther. (2002) 85:141–158.
- Izbicka E, Wheelhouse RT, Raymond E, Davidson KK, Lawrence RA, Sun D, Windle BE, Hurley LH, Von Hoff DD. Effects of cationic porphyrins as G-quadruplex interactive agents in human tumor cells. Cancer Res. (1999) 59:639–644.
- Huppert JL, Balasubramanian S. G-quadruplexes in promoters throughout the human genome. Nucleic Acids Res. (2007) 35:406–413.
- Hazel P, Huppert J, Balasubramanian S, Neidle S. Loop-length-dependent folding of G-quadruplexes. J. Am. Chem. Soc. (2004) 126:16405–16415.
- Guédin A, De Cian A, Gros J, Lacroix L, Mergny JL. Sequence effects in single-base loops for quadruplexes. Biochimie (2008) 90:686–696.
- Smirnov I, Shafer RH. Effect of loop sequence and size on DNA aptamer stability. Biochemistry (2000) 39:1462–1468.
- Rachwal PA, Findlow IS, Werner JM, Brown T, Fox KR. Intramolecular DNA quadruplexes with different arrangements of short and long loops. Nucleic Acids Res. (2007) 35:4214–4222.
- Rachwal PA, Brown T, Fox KR. Effect of G-tract length on the topology and stability of intramolecular DNA quadruplexes. Biochemistry (2007) 46:3036–3044.
- Rachwal PA, Brown T, Fox KR. Sequence effects of single base loops in intramolecular quadruplex DNA. FEBS Lett. (2007) 44:1657–1660.
- Bugaut A, Balasubramanian S. A sequence-independent study of the influence of short loop lengths on the stability and topology of intramolecular DNA G-quadruplexes. Biochemistry (2008) 47:689–697.
- Kumar N, Sahoo B, Varun KAS, Maiti S, Maiti S. Effect of loop length variation on quadruplex-Watson Crick duplex competition. Nucleic Acids Res. (2008) 36:4433–4442.
- Miura T, Thomas GJ. Structural polymorphism of telomere DNA: interquadruplex and duplex–quadruplex conversions probed by Raman spectroscopy. Biochemistry (1994) 33:7848–7856.
- Risitano A, Fox KR. Stability of intramolecular DNA quadruplexes: comparison with DNA duplexes. Biochemistry (2003) 42:6507–6513.
- Kumar N, Maiti S. Quadruplex to Watson–Crick duplex transition of the thrombin binding aptamer: a fluorescence resonance energy transfer study. Biochem. Biophys. Res. Commun. (2004) 319:759–767.
- Kumar N, Maiti S. The effect of osmolytes and small molecule on quadruplex–WC duplex equilibrium: a fluorescence resonance energy transfer study. Nucleic Acids Res. (2005) 33:6723–6732.
- Kumar N, Maiti S. Role of locked nucleic acid modified complementary strand in quadruplex/Watson–Crick duplex equilibrium. J. Phys. Chem. B. (2007) 111:12328–12337.
- Li W, Miyoshi D, Nakano S, Sugimoto N. Competition involving G-quadruplex DNA and its complement. Biochemistry (2003) 42:11736–11744.
- Scaria V, Hariharan M, Arora A, Maiti S. Quadfinder: server for identification and analysis of quadruplex-forming motifs in nucleotide sequences. Nucleic Acids Res. (2006) 34:W683–W685.
- Cantor CR, Warshaw MM, Shapiro H. Oligonucleotide interactions. Circular dichroism studies of the conformation of deoxyoligonucleotides. Biopolymers (1970) 9:1059–1077.
- Marky LA, Blumenfeld KS, Kozlowski S, Breslauer KJ. Salt-dependent conformational transitions in the self-complementary deoxydodecanucleotide d(CGCAATTCGCG): evidence for hairpin formation. Biopolymers (1983) 9:1247–1257.
- Mergny JL, Phan AT, Lacroix L. Following G-quartet formation by UV spectroscopy. FEBS Lett. (1998) 435:74–78.
- Bishop GR, Ren J, Polander BC, Jeanfreau BD, Trent JO, Chaires JB. Energetic basis of molecular recognition in a DNA aptamer. Biophys. Chem. (2007) 126:165–175.
- McTigue PM, Peterson RJ, Kahn JD. Sequence-dependent thermodynamic parameters for locked nucleic acid (LNA)-DNA duplex formation. Biochemistry (2004) 43:5388–5405.
- SantaLucia J Jr. A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl Acad. Sci. USA (1998) 95:1460–1465.
- Peyret N, Seneviratne PA, Allawi HT, SantaLucia J Jr. Nearest-neighbor thermodynamics and NMR of DNA sequences with internal A·A, C·C, G·G, and T·T mismatches. Biochemistry (1999) 38:3468–3477.
- Phan AT, Kuryavyi V, Burge S, Neidle S, Patel DJ. Structure of an unprecedented G-quadruplex scaffold in the human c-kit promoter. J. Am. Chem. Soc. (2007) 129:4386–4392.
- Qin Y, Rezler EM, Gokhale V, Sun D, Hurley LH. Characterization of the G-quadruplexes in the duplex nuclease hypersensitive element of the PDGF-A promoter and modulation of PDGF-A promoter activity by TMPyP4. Nucleic Acids Res. (2007) 35:7698–7713.
- Phan AT, Modi YS, Patel DJ. Propeller-type parallel-stranded G-quadruplexes in the human c-myc promoter. J. Am. Chem. Soc. (2004) 126:8710–8716.
- Kejnovska I, Kypr J, Vorlickova M. Oligo(dT) is not a correct native PAGE marker for single-stranded DNA. Biochem. Biophys. Res. Commun. (2007) 353:776–779.
- Olsen CM, Gmeiner WH, Marky LA. Unfolding of G-quadruplexes: energetic, and ion and water contributions of G-quartet stacking. J. Phys. Chem. B. (2006) 110:6962–6969.
- Lee HT, Olsen CM, Waters L, Sukup H, Marky LA. Thermodynamic contributions of the reactions of DNA intramolecular structures with their complementary strands. Biochimie (2008) 90:1052–1063.
- Majhi PR, Qi J, Tang C.-F, Shafer RH. Heat capacity associated with guanine quadruplex formation: an isothermal titration calorimetry study. Biopolymers (2008) 89:302–309.
- Tharakaraman K, Bodenreider O, Landsman D, Spouge JL, Mariño-Ramírez L. The biological function of some human transcription factor binding motifs varies with position relative to the transcription start site. Nucleic