Basic Operating Procedure

1) Evaluate the "Initialize" notebook for the case you're interested in; then close it 
    and open the corresponding "Output" notebook.  
2) Add your own experimental data for comparison with theoretical data if desired.  
3) Set the "Input Parameters" to the desired values.  
4) Evaluate the "Output" notebook.  
5) Repeat 3) and 4) as desired.  

Since there is some redundancy in the various output parameters that are plotted 
and tabulated, you may want to unlock and make inactive or delete cells for output 
you're not interested in.  Nearly all the essential calculations are performed in one 
cell immediately under the first "Calculate..." heading, between the input parameters 
and the first plot (generally the Scatchard plot), so this cell must be re-evaluated 
each time the input parameters are changed.  Once this is done, the various other plots 
may generally be performed in any order.  However, additional calculations must be 
performed before 3-D or 2-D cluster length distributions are plotted; these 
calculations are performed in the cell immediately under the "Calculate..." heading 
just before the first 3-D or 2-D cluster length plot.  


Introduction

These Mathematicaª notebooks deal with a simple model for the non-sequence- 
selective binding of small molecules (ligands) such as proteins to long linear 
polymers such as DNA.  They calculate and plot data for the binding of ligands to an 
infinite linear lattice for the following cases:
     a) symmetric ligands of one type ("Sym"); 
     b) symmetric ligands of two types binding to an isotropic lattice ("2Lig" and 
         "2L-Ex&Th"); 
     c) asymmetric ligands of one type binding to an isotropic lattice ("Iso"); 
     d) asymmetric ligands of one type binding to an anisotropic lattice ("Ani").  
In the last case ("Ani") two binding modes are allowed, i.e., different numbers of 
lattice residues may be covered when the ligand binds in different orientations.  Case 
a is equivalent to the case analyzed by McGhee & von Hippel in their well-known paper 
(J. Mol. Biol. vol. 86, 469-489 [1974].  Erratum vol. 103, 679 [1976]).  Cases b-d are 
more general.  Schematic depictions of binding situations associated with the various 
cases, as well as a derivation of the equations relevant to case a, are contained in 
the "schema&derivation" notebook.  

For each case there is an initialization notebook that contains the relevant sets of 
equations, and an output notebook that calculates and plots the Scatchard plot, the 
neighbor-effect parameters, and the conditional probabilities.  These quantities are 
calculated as functions of the ratio of bound ligands to lattice residues (r) except in 
the case of one of the sets of programs for two ligands ("2L-Ex&Th"), in which the 
concentration of ligand 1 is the abscissa.  In addition, for all cases except the two 
ligand case ("2Lig" and "2L-Ex&Th"), the output notebooks calculate the 2D (at a given 
r) and 3D (for the whole range of lattice saturation) cluster length distributions.  The 
method used is based on the treatment given in Wolfe & Meehan, J. Mol. Biol. vol. 223, 
1063-1087 (1992).  It is important to note that except for the case of symmetric 
ligands of one type ("Sym"), the programs will not give correct output under all 
circumstances.   IT IS THE RESPONSIBILITY OF THE USER TO EVALUATE THE 
CORRECTNESS OF HIS OR HER OUTPUT.  This can be done in a straightforward manner 
(see below).  

In addition, all the programs except one of the sets of programs for the two types of 
ligand case ("2Lig") allow comparison of user-supplied experimental data with the 
calculated theoretical data.  


Using the Programs 

To use the notebooks, first open the initialization notebook for the desired case and 
evaluate the "initialization cells" using the "Kernel-> Evaluation-> Evaluate 
Initialization"  command.  This needs to be done only once per Mathematica session.  
Then, close the initialization notebook, open the corresponding output notebook, and 
specify the desired "input variables" values.  Integer values for n are expected but are 
generally not required.  The programs for asymmetric ligands on an anisotropic lattice 
("Ani") with two binding  modes (n1 not equal to n2), and for type 2 or type 3 
isotherms for two kinds of symmetric ligands on an isotropic lattice ("2Lig") may 
generate errors at saturation for non-integral values of n.  The variable "points" gives 
the number of points at equally spaced degrees of lattice occupancy above r=0 to be 
used in the plot.  Because parameter values frequently change rapidly when the lattice 
is nearly saturated, I have added several "extra" points just before rSat (the maximum 
value of r at lattice saturation), using magenta text in the cell under the first 
"Calculate..." heading.  

The output notebook can be executed either in its entirety (with the "Kernel-> 
Evaluation-> Evaluate Notebook" command) or in sections (by selecting the cells and 
hitting the "shift-return" keys or the "enter" key).  If the program is being executed in 
sections, the Scatchard plot must generally be calculated and plotted first (this can 
be done by selecting and executing the group of cells in the right-most bracket 
starting at the "input variables" cell).  The only exception to this rule is in the case of 
the calculation of cluster length distributions for symmetric ligands of one type 
("Sym"), which can be done without any preliminary calculation of a Scatchard plot.  
Here, it is only necessary to execute the input variables cell before performing the 3D 
or 2D cluster length calculations.  

The Scatchard plot for an asymmetric ligand with a single binding mode (n1 = n2) 
binding to an isotropic ("Iso") or anisotropic ("Ani") lattice is accompanied by a 
"reference" symmetric ligand Scatchard plot having the same intercepts and initial 
slope (in gray).  Since the asymmetric ligand Scatchard plot (in black) is plotted after 
the reference plot, the reference plot will be invisible if the two plots are identical 
or nearly so.  The reference plot can be eliminated, if desired, by setting the Boolean 
variable "referencePlot" to "False" (in the first line of the first calculation cell).  

After the Scatchard plot has been calculated, one or more of the neighbor-effect 
parameters, conditional probabilities, or cluster-length distributions can be plotted 
in any order.  The only restriction is that additional calculations are needed for both 
the 2D and 3D cluster length distributions, and these must be performed (by the cell 
just below the respective "Calculate..." heading) before the data can be plotted.  The 
particular degree of saturation for a 2D cluster length distribution is specified by 
"rCluster2D", which equals r/rSat.  2D cluster length distributions can be calculated 
repeatedly for different r values after the Scatchard plot has been calculated once.  
The number of equally spaced values of r at which the cluster distribution will be 
calculated for the 3D plot is specified by "points3D".  The amount of time needed for 
the cluster length distribution calculation increases exponentially with the length of 
cluster being considered (you set the maximum length for the calculations with 
"maxLength2D" or "maxLength3D").  However, data is printed as it is calculated, so it 
won't be lost if you abort.

When the computer eventually runs out of memory, it is necessary to quit 
Mathematica and start over.  NOTE:  The "Ani" set of programs have the highest 
memory requirement.  If this program crashes after a small number of calculations, 
try increasing Mathematica's memory allocation.  It is also best to quit Mathematica 
and start over when switching from one case to another.  

Most of the cells in the programs are locked, so they would have to be unlocked (using 
the "Cell-> Cell Properties-> Cell Editable" command) to make any changes.  Also, 
many of the cells are closed.  In order to view their contents, one of two procedures 
must be followed.  If the cell is a subgroup (hidden behind another cell with a 
downward pointing arrow to the right), it can be viewed using the "Cell-> Cell 
Grouping-> Open All Subgroups" command.  If it is not a subgroup, unlock it and open it 
using the "Cell-> Cell Properties-> Cell Open" command.  The output from the programs 
is presented both graphically and numerically.  The numeric output may in most cases 
be eliminated by putting a semicolon at the end of the line in the cell just before the 
corresponding graph.  

A meaningless equation ("scroll = 1") has been added to the end of a number of the text 
cells in the output notebooks.  This allows all of a notebook's results to be displayed 
by automatic scrolling when the entire notebook is being evaluated (the "Kernel-> 
Evaluation-> Evaluate Notebook" command).  Otherwise, the automatic scrolling 
usually stops at an inactive cell (at least in the Macintosh version of Mathematica).  


Comparing Experimental and Theoretical Data

All three sets of programs for the single type of ligand case ("Sym", "Iso", and "Ani") 
allow comparison of the calculated theoretical Scatchard plot with an experimental 
Scatchard plot.  In order to perform this comparison, the experimental r values must 
be listed in the array rExpArray, and the corresponding experimental r/[free ligand] 
values must be listed in the array experimentalrLf.  The Boolean variable 
experimentalData must also be set to True.  The programs will then plot the 
experimental and theoretical curves together (in gray and black, respectively), using 
the experimental r values for the theoretical data points.  Several plots of the 
residuals (experimental - theoretical) will also be displayed, and a correlation 
coefficient (R) will be calculated.  (If the correlation is very poor, the resulting value 
of R will be a complex number.)  If the initial value of experimentalrLf is negative, 
the experimental  r/[free ligand] values will be disregarded.  However, as long as 
experimentalData is True, the r values in rExpArray will still be used for the 
theoretical calculation.  

The "2L-Ex&Th" notebooks for the binding of two kinds of symmetric ligands to an 
isotropic lattice are also intended for comparison of experimental data with the 
theoretical binding curve (the "2Lig" programs do not allow this).  The calculations are 
performed for an abscissa array of ligand 1 concentrations (not for an array of r 
values of ligand 1, as in the other programs), and do not reach lattice saturation.  The 
experimental data needs to be given as an array of concentrations of ligand 1 
("concArray ", representing either total concentrations or free concentrations, 
depending on whether type is set to 1 or 2, respectively), and an array of r values 
("experimentalR2 ", representing bound ligand 2 concentration divided by the 
polymer concentration in terms of monomer units).  The manner in which the two data 
sets are compared is the same as described in the preceding paragraph.  If the initial 
value of experimentalR2 is negative, the experimental  r2 values will be 
disregarded.  However, as long as experimentalData is True, the [ligand 1] values in 
concArray  will still be used for the theoretical calculation.  


Recognizing Incorrect Output

Problems may occasionally arise with execution of these programs, signaled by an 
unexpected discontinuity or plateau in one or more of the plots, a message that 
FindRoot has encountered a single Jacobian, or a message that "Newton's method has 
failed to converge to the prescribed accuracy" after the specified iteration limit.  
The discontinuity or plateau may indicate that FindRoot has jumped to an incorrect 
solution of the equations being solved for.  This type of problem may not be evident in 
the Scatchard plot if it occurs at a point where the curve is close to the x-axis, but it 
will generally be obvious in the plots of one or more of the neighbor-effect 
parameters or q (particularly in the latter case).  These problems usually arise when 
the values of one or more of the neighbor-effect parameters (or q) are changing 
rapidly as a function of r, and the step size used for r in the iteration is too large (i.e., 
the value of "points" is too small).

If meaningless or physically impossible values are generated for one or more of the 
neighbor-effect parameters, q, or conditional probabilities, a flag will be set and 
warning messages will accompany the output.  In addition, data which has been 
determined to be incorrect will not be displayed.  HOWEVER, THE ABSENCE OF A 
WARNING MESSAGE IS NO GUARANTEE THAT THE OUTPUT IS CORRECT.  Incorrect values 
may still be in the physically possible range.  If the results of any calculation are in 
doubt, a simple and reliable way to test the correctness of the output is to change 
(e.g. double or quadruple) the number of points in the plot.  If the initial output was 
incorrect, the numbers above the r value where the mistake occurred will change in 
the second calculation.


Avoiding Incorrect Output

The problem of FindRoot jumping to the wrong root can generally be overcome by one 
of three methods: 1) by increasing the number of points in the plot; 2) by choosing less 
extreme values of the input variables; 3) by changing the form of the equations being 
solved.  The third method should be used when the first doesn't work and you don't 
want to use the second.  To use this approach you need to note which set of equations 
are being solved when the problem arises (given by the value of "ne").  Then go back 
into the initialization notebook and find the equations in question.  In many cases 
there will be a standard version of these equations, given in an initialization cell, and 
one or more "alternate" versions, in cells which are not initialization cells.  The 
version of the equations in the initialization cell is the one that normally executes 
most rapidly; however, one of the slower versions may be less susceptible to jumping 
to the wrong root in a particular situation.  Select and execute a cell containing one of 
the alternate versions and retry the calculation.  This approach cannot be used for 
errors that occur at saturation, because alternate forms of the equations used for this 
calculation are not available.  

An example of a situation where calculating the correct output is difficult is the 
anisotropic lattice case with k1 = k2 = n1 = n2 = 1, w11 = 10, w22 = 0, w12 = 0.01, 
w21 = 0.011.  The ligands in this case initially form tightly bound head-to-head pairs, 
but must rearrange into a much weaker tail-to-head arrangement close to saturation.  
Mistakes are often made in this calculation without setting a flag.  The error is 
obvious when inspecting the plot for q (and the plots for some of the b-b conditional 
probabilities).  The second alternative set of ne1 equations works much better for 
this case than the primary version.  


Program Bugs and Limitations 

A discontinuity that appears between the last and next-to-last points in a plot that 
does not respond to the above-noted measures may represent a program bug [most 
likely in the anisotropic lattice case ("Ani") with n1 not equal to n2].  If you believe 
you have encountered a bug, please report it to the programmer (Alan R. Wolfe, Dept. of 
Biopharmaceutical Sciences, University of California, San Francisco, CA 94143-0446; 
e-mail address:  alwolfe@bigfoot.com).  Remember that the parameter values will not 
always be correctly calculated at saturation if you use nonintegral values of n.  

=======================
End of Read Me
=======================