Basic Operating Procedure 1) Evaluate the "Initialize" notebook for the case you're interested in; then close it and open the corresponding "Output" notebook. 2) Add your own experimental data for comparison with theoretical data if desired. 3) Set the "Input Parameters" to the desired values. 4) Evaluate the "Output" notebook. 5) Repeat 3) and 4) as desired. Since there is some redundancy in the various output parameters that are plotted and tabulated, you may want to unlock and make inactive or delete cells for output you're not interested in. Nearly all the essential calculations are performed in one cell immediately under the first "Calculate..." heading, between the input parameters and the first plot (generally the Scatchard plot), so this cell must be re-evaluated each time the input parameters are changed. Once this is done, the various other plots may generally be performed in any order. However, additional calculations must be performed before 3-D or 2-D cluster length distributions are plotted; these calculations are performed in the cell immediately under the "Calculate..." heading just before the first 3-D or 2-D cluster length plot. Introduction These Mathematicaª notebooks deal with a simple model for the non-sequence- selective binding of small molecules (ligands) such as proteins to long linear polymers such as DNA. They calculate and plot data for the binding of ligands to an infinite linear lattice for the following cases: a) symmetric ligands of one type ("Sym"); b) symmetric ligands of two types binding to an isotropic lattice ("2Lig" and "2L-Ex&Th"); c) asymmetric ligands of one type binding to an isotropic lattice ("Iso"); d) asymmetric ligands of one type binding to an anisotropic lattice ("Ani"). In the last case ("Ani") two binding modes are allowed, i.e., different numbers of lattice residues may be covered when the ligand binds in different orientations. Case a is equivalent to the case analyzed by McGhee & von Hippel in their well-known paper (J. Mol. Biol. vol. 86, 469-489 [1974]. Erratum vol. 103, 679 [1976]). Cases b-d are more general. Schematic depictions of binding situations associated with the various cases, as well as a derivation of the equations relevant to case a, are contained in the "schema&derivation" notebook. For each case there is an initialization notebook that contains the relevant sets of equations, and an output notebook that calculates and plots the Scatchard plot, the neighbor-effect parameters, and the conditional probabilities. These quantities are calculated as functions of the ratio of bound ligands to lattice residues (r) except in the case of one of the sets of programs for two ligands ("2L-Ex&Th"), in which the concentration of ligand 1 is the abscissa. In addition, for all cases except the two ligand case ("2Lig" and "2L-Ex&Th"), the output notebooks calculate the 2D (at a given r) and 3D (for the whole range of lattice saturation) cluster length distributions. The method used is based on the treatment given in Wolfe & Meehan, J. Mol. Biol. vol. 223, 1063-1087 (1992). It is important to note that except for the case of symmetric ligands of one type ("Sym"), the programs will not give correct output under all circumstances. IT IS THE RESPONSIBILITY OF THE USER TO EVALUATE THE CORRECTNESS OF HIS OR HER OUTPUT. This can be done in a straightforward manner (see below). In addition, all the programs except one of the sets of programs for the two types of ligand case ("2Lig") allow comparison of user-supplied experimental data with the calculated theoretical data. Using the Programs To use the notebooks, first open the initialization notebook for the desired case and evaluate the "initialization cells" using the "Kernel-> Evaluation-> Evaluate Initialization" command. This needs to be done only once per Mathematica session. Then, close the initialization notebook, open the corresponding output notebook, and specify the desired "input variables" values. Integer values for n are expected but are generally not required. The programs for asymmetric ligands on an anisotropic lattice ("Ani") with two binding modes (n1 not equal to n2), and for type 2 or type 3 isotherms for two kinds of symmetric ligands on an isotropic lattice ("2Lig") may generate errors at saturation for non-integral values of n. The variable "points" gives the number of points at equally spaced degrees of lattice occupancy above r=0 to be used in the plot. Because parameter values frequently change rapidly when the lattice is nearly saturated, I have added several "extra" points just before rSat (the maximum value of r at lattice saturation), using magenta text in the cell under the first "Calculate..." heading. The output notebook can be executed either in its entirety (with the "Kernel-> Evaluation-> Evaluate Notebook" command) or in sections (by selecting the cells and hitting the "shift-return" keys or the "enter" key). If the program is being executed in sections, the Scatchard plot must generally be calculated and plotted first (this can be done by selecting and executing the group of cells in the right-most bracket starting at the "input variables" cell). The only exception to this rule is in the case of the calculation of cluster length distributions for symmetric ligands of one type ("Sym"), which can be done without any preliminary calculation of a Scatchard plot. Here, it is only necessary to execute the input variables cell before performing the 3D or 2D cluster length calculations. The Scatchard plot for an asymmetric ligand with a single binding mode (n1 = n2) binding to an isotropic ("Iso") or anisotropic ("Ani") lattice is accompanied by a "reference" symmetric ligand Scatchard plot having the same intercepts and initial slope (in gray). Since the asymmetric ligand Scatchard plot (in black) is plotted after the reference plot, the reference plot will be invisible if the two plots are identical or nearly so. The reference plot can be eliminated, if desired, by setting the Boolean variable "referencePlot" to "False" (in the first line of the first calculation cell). After the Scatchard plot has been calculated, one or more of the neighbor-effect parameters, conditional probabilities, or cluster-length distributions can be plotted in any order. The only restriction is that additional calculations are needed for both the 2D and 3D cluster length distributions, and these must be performed (by the cell just below the respective "Calculate..." heading) before the data can be plotted. The particular degree of saturation for a 2D cluster length distribution is specified by "rCluster2D", which equals r/rSat. 2D cluster length distributions can be calculated repeatedly for different r values after the Scatchard plot has been calculated once. The number of equally spaced values of r at which the cluster distribution will be calculated for the 3D plot is specified by "points3D". The amount of time needed for the cluster length distribution calculation increases exponentially with the length of cluster being considered (you set the maximum length for the calculations with "maxLength2D" or "maxLength3D"). However, data is printed as it is calculated, so it won't be lost if you abort. When the computer eventually runs out of memory, it is necessary to quit Mathematica and start over. NOTE: The "Ani" set of programs have the highest memory requirement. If this program crashes after a small number of calculations, try increasing Mathematica's memory allocation. It is also best to quit Mathematica and start over when switching from one case to another. Most of the cells in the programs are locked, so they would have to be unlocked (using the "Cell-> Cell Properties-> Cell Editable" command) to make any changes. Also, many of the cells are closed. In order to view their contents, one of two procedures must be followed. If the cell is a subgroup (hidden behind another cell with a downward pointing arrow to the right), it can be viewed using the "Cell-> Cell Grouping-> Open All Subgroups" command. If it is not a subgroup, unlock it and open it using the "Cell-> Cell Properties-> Cell Open" command. The output from the programs is presented both graphically and numerically. The numeric output may in most cases be eliminated by putting a semicolon at the end of the line in the cell just before the corresponding graph. A meaningless equation ("scroll = 1") has been added to the end of a number of the text cells in the output notebooks. This allows all of a notebook's results to be displayed by automatic scrolling when the entire notebook is being evaluated (the "Kernel-> Evaluation-> Evaluate Notebook" command). Otherwise, the automatic scrolling usually stops at an inactive cell (at least in the Macintosh version of Mathematica). Comparing Experimental and Theoretical Data All three sets of programs for the single type of ligand case ("Sym", "Iso", and "Ani") allow comparison of the calculated theoretical Scatchard plot with an experimental Scatchard plot. In order to perform this comparison, the experimental r values must be listed in the array rExpArray, and the corresponding experimental r/[free ligand] values must be listed in the array experimentalrLf. The Boolean variable experimentalData must also be set to True. The programs will then plot the experimental and theoretical curves together (in gray and black, respectively), using the experimental r values for the theoretical data points. Several plots of the residuals (experimental - theoretical) will also be displayed, and a correlation coefficient (R) will be calculated. (If the correlation is very poor, the resulting value of R will be a complex number.) If the initial value of experimentalrLf is negative, the experimental r/[free ligand] values will be disregarded. However, as long as experimentalData is True, the r values in rExpArray will still be used for the theoretical calculation. The "2L-Ex&Th" notebooks for the binding of two kinds of symmetric ligands to an isotropic lattice are also intended for comparison of experimental data with the theoretical binding curve (the "2Lig" programs do not allow this). The calculations are performed for an abscissa array of ligand 1 concentrations (not for an array of r values of ligand 1, as in the other programs), and do not reach lattice saturation. The experimental data needs to be given as an array of concentrations of ligand 1 ("concArray ", representing either total concentrations or free concentrations, depending on whether type is set to 1 or 2, respectively), and an array of r values ("experimentalR2 ", representing bound ligand 2 concentration divided by the polymer concentration in terms of monomer units). The manner in which the two data sets are compared is the same as described in the preceding paragraph. If the initial value of experimentalR2 is negative, the experimental r2 values will be disregarded. However, as long as experimentalData is True, the [ligand 1] values in concArray will still be used for the theoretical calculation. Recognizing Incorrect Output Problems may occasionally arise with execution of these programs, signaled by an unexpected discontinuity or plateau in one or more of the plots, a message that FindRoot has encountered a single Jacobian, or a message that "Newton's method has failed to converge to the prescribed accuracy" after the specified iteration limit. The discontinuity or plateau may indicate that FindRoot has jumped to an incorrect solution of the equations being solved for. This type of problem may not be evident in the Scatchard plot if it occurs at a point where the curve is close to the x-axis, but it will generally be obvious in the plots of one or more of the neighbor-effect parameters or q (particularly in the latter case). These problems usually arise when the values of one or more of the neighbor-effect parameters (or q) are changing rapidly as a function of r, and the step size used for r in the iteration is too large (i.e., the value of "points" is too small). If meaningless or physically impossible values are generated for one or more of the neighbor-effect parameters, q, or conditional probabilities, a flag will be set and warning messages will accompany the output. In addition, data which has been determined to be incorrect will not be displayed. HOWEVER, THE ABSENCE OF A WARNING MESSAGE IS NO GUARANTEE THAT THE OUTPUT IS CORRECT. Incorrect values may still be in the physically possible range. If the results of any calculation are in doubt, a simple and reliable way to test the correctness of the output is to change (e.g. double or quadruple) the number of points in the plot. If the initial output was incorrect, the numbers above the r value where the mistake occurred will change in the second calculation. Avoiding Incorrect Output The problem of FindRoot jumping to the wrong root can generally be overcome by one of three methods: 1) by increasing the number of points in the plot; 2) by choosing less extreme values of the input variables; 3) by changing the form of the equations being solved. The third method should be used when the first doesn't work and you don't want to use the second. To use this approach you need to note which set of equations are being solved when the problem arises (given by the value of "ne"). Then go back into the initialization notebook and find the equations in question. In many cases there will be a standard version of these equations, given in an initialization cell, and one or more "alternate" versions, in cells which are not initialization cells. The version of the equations in the initialization cell is the one that normally executes most rapidly; however, one of the slower versions may be less susceptible to jumping to the wrong root in a particular situation. Select and execute a cell containing one of the alternate versions and retry the calculation. This approach cannot be used for errors that occur at saturation, because alternate forms of the equations used for this calculation are not available. An example of a situation where calculating the correct output is difficult is the anisotropic lattice case with k1 = k2 = n1 = n2 = 1, w11 = 10, w22 = 0, w12 = 0.01, w21 = 0.011. The ligands in this case initially form tightly bound head-to-head pairs, but must rearrange into a much weaker tail-to-head arrangement close to saturation. Mistakes are often made in this calculation without setting a flag. The error is obvious when inspecting the plot for q (and the plots for some of the b-b conditional probabilities). The second alternative set of ne1 equations works much better for this case than the primary version. Program Bugs and Limitations A discontinuity that appears between the last and next-to-last points in a plot that does not respond to the above-noted measures may represent a program bug [most likely in the anisotropic lattice case ("Ani") with n1 not equal to n2]. If you believe you have encountered a bug, please report it to the programmer (Alan R. Wolfe, Dept. of Biopharmaceutical Sciences, University of California, San Francisco, CA 94143-0446; e-mail address: alwolfe@bigfoot.com). Remember that the parameter values will not always be correctly calculated at saturation if you use nonintegral values of n. ======================= End of Read Me =======================