Basic Operating Procedure
1) Evaluate the "Initialize" notebook for the case you're interested in; then close it
and open the corresponding "Output" notebook.
2) Add your own experimental data for comparison with theoretical data if desired.
3) Set the "Input Parameters" to the desired values.
4) Evaluate the "Output" notebook.
5) Repeat 3) and 4) as desired.
Since there is some redundancy in the various output parameters that are plotted
and tabulated, you may want to unlock and make inactive or delete cells for output
you're not interested in. Nearly all the essential calculations are performed in one
cell immediately under the first "Calculate..." heading, between the input parameters
and the first plot (generally the Scatchard plot), so this cell must be re-evaluated
each time the input parameters are changed. Once this is done, the various other plots
may generally be performed in any order. However, additional calculations must be
performed before 3-D or 2-D cluster length distributions are plotted; these
calculations are performed in the cell immediately under the "Calculate..." heading
just before the first 3-D or 2-D cluster length plot.
Introduction
These Mathematicaª notebooks deal with a simple model for the non-sequence-
selective binding of small molecules (ligands) such as proteins to long linear
polymers such as DNA. They calculate and plot data for the binding of ligands to an
infinite linear lattice for the following cases:
a) symmetric ligands of one type ("Sym");
b) symmetric ligands of two types binding to an isotropic lattice ("2Lig" and
"2L-Ex&Th");
c) asymmetric ligands of one type binding to an isotropic lattice ("Iso");
d) asymmetric ligands of one type binding to an anisotropic lattice ("Ani").
In the last case ("Ani") two binding modes are allowed, i.e., different numbers of
lattice residues may be covered when the ligand binds in different orientations. Case
a is equivalent to the case analyzed by McGhee & von Hippel in their well-known paper
(J. Mol. Biol. vol. 86, 469-489 [1974]. Erratum vol. 103, 679 [1976]). Cases b-d are
more general. Schematic depictions of binding situations associated with the various
cases, as well as a derivation of the equations relevant to case a, are contained in
the "schema&derivation" notebook.
For each case there is an initialization notebook that contains the relevant sets of
equations, and an output notebook that calculates and plots the Scatchard plot, the
neighbor-effect parameters, and the conditional probabilities. These quantities are
calculated as functions of the ratio of bound ligands to lattice residues (r) except in
the case of one of the sets of programs for two ligands ("2L-Ex&Th"), in which the
concentration of ligand 1 is the abscissa. In addition, for all cases except the two
ligand case ("2Lig" and "2L-Ex&Th"), the output notebooks calculate the 2D (at a given
r) and 3D (for the whole range of lattice saturation) cluster length distributions. The
method used is based on the treatment given in Wolfe & Meehan, J. Mol. Biol. vol. 223,
1063-1087 (1992). It is important to note that except for the case of symmetric
ligands of one type ("Sym"), the programs will not give correct output under all
circumstances. IT IS THE RESPONSIBILITY OF THE USER TO EVALUATE THE
CORRECTNESS OF HIS OR HER OUTPUT. This can be done in a straightforward manner
(see below).
In addition, all the programs except one of the sets of programs for the two types of
ligand case ("2Lig") allow comparison of user-supplied experimental data with the
calculated theoretical data.
Using the Programs
To use the notebooks, first open the initialization notebook for the desired case and
evaluate the "initialization cells" using the "Kernel-> Evaluation-> Evaluate
Initialization" command. This needs to be done only once per Mathematica session.
Then, close the initialization notebook, open the corresponding output notebook, and
specify the desired "input variables" values. Integer values for n are expected but are
generally not required. The programs for asymmetric ligands on an anisotropic lattice
("Ani") with two binding modes (n1 not equal to n2), and for type 2 or type 3
isotherms for two kinds of symmetric ligands on an isotropic lattice ("2Lig") may
generate errors at saturation for non-integral values of n. The variable "points" gives
the number of points at equally spaced degrees of lattice occupancy above r=0 to be
used in the plot. Because parameter values frequently change rapidly when the lattice
is nearly saturated, I have added several "extra" points just before rSat (the maximum
value of r at lattice saturation), using magenta text in the cell under the first
"Calculate..." heading.
The output notebook can be executed either in its entirety (with the "Kernel->
Evaluation-> Evaluate Notebook" command) or in sections (by selecting the cells and
hitting the "shift-return" keys or the "enter" key). If the program is being executed in
sections, the Scatchard plot must generally be calculated and plotted first (this can
be done by selecting and executing the group of cells in the right-most bracket
starting at the "input variables" cell). The only exception to this rule is in the case of
the calculation of cluster length distributions for symmetric ligands of one type
("Sym"), which can be done without any preliminary calculation of a Scatchard plot.
Here, it is only necessary to execute the input variables cell before performing the 3D
or 2D cluster length calculations.
The Scatchard plot for an asymmetric ligand with a single binding mode (n1 = n2)
binding to an isotropic ("Iso") or anisotropic ("Ani") lattice is accompanied by a
"reference" symmetric ligand Scatchard plot having the same intercepts and initial
slope (in gray). Since the asymmetric ligand Scatchard plot (in black) is plotted after
the reference plot, the reference plot will be invisible if the two plots are identical
or nearly so. The reference plot can be eliminated, if desired, by setting the Boolean
variable "referencePlot" to "False" (in the first line of the first calculation cell).
After the Scatchard plot has been calculated, one or more of the neighbor-effect
parameters, conditional probabilities, or cluster-length distributions can be plotted
in any order. The only restriction is that additional calculations are needed for both
the 2D and 3D cluster length distributions, and these must be performed (by the cell
just below the respective "Calculate..." heading) before the data can be plotted. The
particular degree of saturation for a 2D cluster length distribution is specified by
"rCluster2D", which equals r/rSat. 2D cluster length distributions can be calculated
repeatedly for different r values after the Scatchard plot has been calculated once.
The number of equally spaced values of r at which the cluster distribution will be
calculated for the 3D plot is specified by "points3D". The amount of time needed for
the cluster length distribution calculation increases exponentially with the length of
cluster being considered (you set the maximum length for the calculations with
"maxLength2D" or "maxLength3D"). However, data is printed as it is calculated, so it
won't be lost if you abort.
When the computer eventually runs out of memory, it is necessary to quit
Mathematica and start over. NOTE: The "Ani" set of programs have the highest
memory requirement. If this program crashes after a small number of calculations,
try increasing Mathematica's memory allocation. It is also best to quit Mathematica
and start over when switching from one case to another.
Most of the cells in the programs are locked, so they would have to be unlocked (using
the "Cell-> Cell Properties-> Cell Editable" command) to make any changes. Also,
many of the cells are closed. In order to view their contents, one of two procedures
must be followed. If the cell is a subgroup (hidden behind another cell with a
downward pointing arrow to the right), it can be viewed using the "Cell-> Cell
Grouping-> Open All Subgroups" command. If it is not a subgroup, unlock it and open it
using the "Cell-> Cell Properties-> Cell Open" command. The output from the programs
is presented both graphically and numerically. The numeric output may in most cases
be eliminated by putting a semicolon at the end of the line in the cell just before the
corresponding graph.
A meaningless equation ("scroll = 1") has been added to the end of a number of the text
cells in the output notebooks. This allows all of a notebook's results to be displayed
by automatic scrolling when the entire notebook is being evaluated (the "Kernel->
Evaluation-> Evaluate Notebook" command). Otherwise, the automatic scrolling
usually stops at an inactive cell (at least in the Macintosh version of Mathematica).
Comparing Experimental and Theoretical Data
All three sets of programs for the single type of ligand case ("Sym", "Iso", and "Ani")
allow comparison of the calculated theoretical Scatchard plot with an experimental
Scatchard plot. In order to perform this comparison, the experimental r values must
be listed in the array rExpArray, and the corresponding experimental r/[free ligand]
values must be listed in the array experimentalrLf. The Boolean variable
experimentalData must also be set to True. The programs will then plot the
experimental and theoretical curves together (in gray and black, respectively), using
the experimental r values for the theoretical data points. Several plots of the
residuals (experimental - theoretical) will also be displayed, and a correlation
coefficient (R) will be calculated. (If the correlation is very poor, the resulting value
of R will be a complex number.) If the initial value of experimentalrLf is negative,
the experimental r/[free ligand] values will be disregarded. However, as long as
experimentalData is True, the r values in rExpArray will still be used for the
theoretical calculation.
The "2L-Ex&Th" notebooks for the binding of two kinds of symmetric ligands to an
isotropic lattice are also intended for comparison of experimental data with the
theoretical binding curve (the "2Lig" programs do not allow this). The calculations are
performed for an abscissa array of ligand 1 concentrations (not for an array of r
values of ligand 1, as in the other programs), and do not reach lattice saturation. The
experimental data needs to be given as an array of concentrations of ligand 1
("concArray ", representing either total concentrations or free concentrations,
depending on whether type is set to 1 or 2, respectively), and an array of r values
("experimentalR2 ", representing bound ligand 2 concentration divided by the
polymer concentration in terms of monomer units). The manner in which the two data
sets are compared is the same as described in the preceding paragraph. If the initial
value of experimentalR2 is negative, the experimental r2 values will be
disregarded. However, as long as experimentalData is True, the [ligand 1] values in
concArray will still be used for the theoretical calculation.
Recognizing Incorrect Output
Problems may occasionally arise with execution of these programs, signaled by an
unexpected discontinuity or plateau in one or more of the plots, a message that
FindRoot has encountered a single Jacobian, or a message that "Newton's method has
failed to converge to the prescribed accuracy" after the specified iteration limit.
The discontinuity or plateau may indicate that FindRoot has jumped to an incorrect
solution of the equations being solved for. This type of problem may not be evident in
the Scatchard plot if it occurs at a point where the curve is close to the x-axis, but it
will generally be obvious in the plots of one or more of the neighbor-effect
parameters or q (particularly in the latter case). These problems usually arise when
the values of one or more of the neighbor-effect parameters (or q) are changing
rapidly as a function of r, and the step size used for r in the iteration is too large (i.e.,
the value of "points" is too small).
If meaningless or physically impossible values are generated for one or more of the
neighbor-effect parameters, q, or conditional probabilities, a flag will be set and
warning messages will accompany the output. In addition, data which has been
determined to be incorrect will not be displayed. HOWEVER, THE ABSENCE OF A
WARNING MESSAGE IS NO GUARANTEE THAT THE OUTPUT IS CORRECT. Incorrect values
may still be in the physically possible range. If the results of any calculation are in
doubt, a simple and reliable way to test the correctness of the output is to change
(e.g. double or quadruple) the number of points in the plot. If the initial output was
incorrect, the numbers above the r value where the mistake occurred will change in
the second calculation.
Avoiding Incorrect Output
The problem of FindRoot jumping to the wrong root can generally be overcome by one
of three methods: 1) by increasing the number of points in the plot; 2) by choosing less
extreme values of the input variables; 3) by changing the form of the equations being
solved. The third method should be used when the first doesn't work and you don't
want to use the second. To use this approach you need to note which set of equations
are being solved when the problem arises (given by the value of "ne"). Then go back
into the initialization notebook and find the equations in question. In many cases
there will be a standard version of these equations, given in an initialization cell, and
one or more "alternate" versions, in cells which are not initialization cells. The
version of the equations in the initialization cell is the one that normally executes
most rapidly; however, one of the slower versions may be less susceptible to jumping
to the wrong root in a particular situation. Select and execute a cell containing one of
the alternate versions and retry the calculation. This approach cannot be used for
errors that occur at saturation, because alternate forms of the equations used for this
calculation are not available.
An example of a situation where calculating the correct output is difficult is the
anisotropic lattice case with k1 = k2 = n1 = n2 = 1, w11 = 10, w22 = 0, w12 = 0.01,
w21 = 0.011. The ligands in this case initially form tightly bound head-to-head pairs,
but must rearrange into a much weaker tail-to-head arrangement close to saturation.
Mistakes are often made in this calculation without setting a flag. The error is
obvious when inspecting the plot for q (and the plots for some of the b-b conditional
probabilities). The second alternative set of ne1 equations works much better for
this case than the primary version.
Program Bugs and Limitations
A discontinuity that appears between the last and next-to-last points in a plot that
does not respond to the above-noted measures may represent a program bug [most
likely in the anisotropic lattice case ("Ani") with n1 not equal to n2]. If you believe
you have encountered a bug, please report it to the programmer (Alan R. Wolfe, Dept. of
Biopharmaceutical Sciences, University of California, San Francisco, CA 94143-0446;
e-mail address: alwolfe@bigfoot.com). Remember that the parameter values will not
always be correctly calculated at saturation if you use nonintegral values of n.
=======================
End of Read Me
=======================