A simple analytical formula for the free-energy of ligand-receptor mediated interactions

Recently 1, we presented a general theory for calculating the strength and properties of colloidal interactions mediated by ligand-receptor bonds (such as those that bind DNA-coated colloids). In this communication, we derive a surprisingly simple analytical form for the interaction free energy, which was previously obtainable only via a costly numerical thermodynamic integration. As a result, the computational effort to obtain potentials of interaction is significantly reduced. Moreover, we can gain insight from this analytic expression for the free energy in limiting cases. In particular, the connection of our general theory to other previous specialised approaches is now made transparent. This important simplification will significantly broaden the scope of our theory.

We consider a general system of many linkers, such as a solution of colloids coated with DNA strands that are capped with reactive sticky ends. At any given time, each linker $i$ can bind at most one other distinct linker $j$, with a free energy change $\Delta G_{ij}$ that depends on the polymer statistics of the linkers (e.g., length, flexibility and grafting position). In many cases, including those of experimental relevance, the probability that linker $i$ is unbound is approximately independent of whether or not any other linker is also unbound. Here, we show that in this limit, the free energy of interaction of the system is given by

$$\beta F_{\mathrm{att}} = \sum_i \ln p_i + \sum_{i<j} p_{ij}, \tag{1}$$

where $p_i$ is the probability that linker $i$ is unbound and $p_{ij}$ is the probability that linkers $i$ and $j$ form a bond. Previously 1, we showed that these quantities are given by the unique physical solution to the following set of self-consistent equations:

$$p_i + \sum_j p_{ij} = 1, \tag{2}$$

$$p_{ij} = p_i\, p_j\, e^{-\beta \Delta G_{ij}}. \tag{3}$$

In what follows, we first motivate the free energy expression in Eq. (1) through a calculation that closely resembles that for the mixing entropy of solutions and gases. This free energy is minimised for the values of $\{p_i\}$ and $\{p_{ij}\}$ that solve the self-consistent conditions in Eqs. (2) and (3). We then show that the free energy in Eq. (1) is identical to that obtained through the costly numerical thermodynamic integration previously proposed. In the discussion section, we compare the performance of Eq. (1) with that of the thermodynamic integration, establish the explicit connection between our Eq. (1) and a previous treatment of DNA-mediated colloid interactions 2, and state the analogous result to Eq. (1) for the mean-field system of plates discussed in Ref. 1.
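As an illustration of how the self-consistent conditions can be solved and the closed-form free energy evaluated, the following is a minimal Python sketch. It assumes that the conditions read $p_i + \sum_j p_{ij} = 1$ and $p_{ij} = p_i p_j e^{-\beta\Delta G_{ij}}$, and that the free energy is $\beta F_{\mathrm{att}} = \sum_i \ln p_i + \sum_{i<j} p_{ij}$. The function names, the damped fixed-point iteration and the two-linker test system are choices made for this example only, not part of the original work.

```python
import math

def solve_unbound(beta_dG, tol=1e-12, max_iter=100_000):
    """Return the unbound probabilities p_i for a symmetric matrix of bond
    free energies beta_dG[i][j] (in units of kT).

    Pairs that cannot bind, including i == j, are marked with math.inf.
    The update p_i <- 1 / (1 + sum_j w_ij p_j) is mixed 50/50 with the
    previous value; this damping is an assumption of the sketch, chosen to
    avoid oscillations of the plain iteration.
    """
    n = len(beta_dG)
    w = [[math.exp(-g) for g in row] for row in beta_dG]  # Boltzmann weights
    p = [1.0] * n                                         # start with no bonds
    for _ in range(max_iter):
        new_p = [0.5 * p[i] + 0.5 / (1.0 + sum(w[i][j] * p[j] for j in range(n)))
                 for i in range(n)]
        change = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if change < tol:
            return p
    raise RuntimeError("fixed-point iteration did not converge")

def free_energy(beta_dG):
    """beta * F_att = sum_i ln p_i + sum_{i<j} p_ij (assumed closed form)."""
    p = solve_unbound(beta_dG)
    n = len(p)
    bonds = sum(p[i] * p[j] * math.exp(-beta_dG[i][j])
                for i in range(n) for j in range(i + 1, n))
    return sum(math.log(x) for x in p) + bonds

# Hypothetical two-linker system with beta*dG = 0, so p solves p + p**2 = 1.
INF = math.inf
two_linkers = [[INF, 0.0], [0.0, INF]]
print(solve_unbound(two_linkers), free_energy(two_linkers))
```

Plain fixed-point iteration slows down for very strong bonds, where the unbound probabilities become small; a Newton solver would be a natural replacement in that regime.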
Figure 1. Two different realisations of an ensemble of two copies. Linkers are depicted as straight lines, and bonds are shown as filled circles. Although the numbers of bonds formed in each ensemble, $\{N_{ij}\}$, are equal, the number of copies where both $i$ and $j$ are unbound, $N_{-i,-j}$, differs.

I. DERIVATION OF THE MAIN RESULT
We consider here an ensemble of $N$ independent copies of the real system. In each copy, a different set of bonds forms between the linkers (see Figure 1). Let $N_i$ be the number of copies where linker $i$ is unbound, and let $N_{ij}$ be the number of copies where $i$ and $j$ are bound to each other. Similarly, let $N_{-i,-j}$ be the number of copies where both $i$ and $j$ are unbound. These quantities are not independent: in every copy, each linker $i$ is either unbound or bound to exactly one other linker, so

$$N_i + \sum_j N_{ij} = N.$$

The fractions of copies where $i$ is unbound ($f_i$), where $i$ and $j$ are bound to each other ($f_{ij}$), or where they are both unbound ($f_{-i,-j}$) follow immediately:

$$f_i = \frac{N_i}{N}, \qquad f_{ij} = \frac{N_{ij}}{N}, \qquad f_{-i,-j} = \frac{N_{-i,-j}}{N}.$$

Let $Z(\{N_{ij}\})$ be the partition function of an ensemble under the constraint that each pair of linkers $i$ and $j$ is bound to each other in exactly $N_{ij}$ copies. A closed-form expression for $Z(\{N_{ij}\})$ can be constructed recursively, by adding bonds one at a time. For a given set $\{N_{ij}\}$, we need to work out how $Z(\{N_{ij}\})$ changes upon adding one more i-j bond, which we do as follows. We call the set of realisations of the ensemble with $\{N_{ij}\}$ bonds the old ensemble, and the set of realisations with one more i-j bond the new ensemble. In the old ensemble, there are $N_{-i,-j}$ copies where an i-j bond can be added.
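In doing this, all the realisations of the new ensemble are generated, but not uniquely. For example, given two realisations, one with an i-j bond in copy X but not in Y or Z, and one with an i-j bond in Y but not in X or Z, the same final realisation can be obtained by adding an i-j bond to Y in the former and to X in the latter. Conversely, in the new ensemble, we can generate a realisation of the old ensemble in $N_{ij}+1$ ways by removing one of the i-j bonds. For example, the old realisation with an i-j bond in copy X but not Y or Z can be obtained by deleting the i-j bond from a new realisation with an i-j bond in X and Y but not Z, or from one with an i-j bond in X and Z but not Y. Since the number of ways of going from the old to the new ensemble is equal to the number of ways of going from the new to the old ensemble, and each added bond carries a Boltzmann factor $e^{-\beta \Delta G_{ij}}$, it follows that

$$(N_{ij}+1)\, Z(\ldots, N_{ij}+1, \ldots) = N_{-i,-j}\, e^{-\beta \Delta G_{ij}}\, Z(\ldots, N_{ij}, \ldots).$$

The value of $N_{-i,-j}$ depends not just on the values of $\{N_{ij}\}$ but on the details of how those bonds are distributed between copies (see Figure 1). To remove this complication, we approximate the probability of $j$ being unbound as independent of whether or not $i$ is also unbound. Hence,

$$N_{-i,-j} \approx N f_i f_j = \frac{N_i N_j}{N}. \tag{9}$$

Neatly, this approximation allows us to treat $N_{-i,-j}$ as a function of only $\{N_{ij}\}$. From the discussion above, we obtain an expression for the increase in $Z(\{N_{ij}\})$ upon adding one i-j bond to the system:

$$\frac{Z(\ldots, N_{ij}+1, \ldots)}{Z(\ldots, N_{ij}, \ldots)} = \frac{N_i N_j}{N\,(N_{ij}+1)}\, e^{-\beta \Delta G_{ij}}. \tag{10}$$

This recursion relation, and the fact that $Z = 1$ when no bonds form, allows us to write an approximate closed-form expression for $Z(\{N_{ij}\})$, namely

$$Z(\{N_{ij}\}) = \prod_i \frac{N!}{N_i!} \prod_{i<j} \frac{e^{-\beta \Delta G_{ij} N_{ij}}}{N_{ij}!\, N^{N_{ij}}}. \tag{11}$$

Using Stirling's approximation and Eq. (11), the free energy per copy $\beta f^*_{\mathrm{att}} = -(1/N) \ln Z(\{N_{ij}\})$ is then given by

$$\beta f^*_{\mathrm{att}} = \sum_i \left( f_i \ln f_i - f_i + 1 \right) + \sum_{i<j} \left( f_{ij} \ln f_{ij} - f_{ij} + \beta \Delta G_{ij}\, f_{ij} \right), \tag{12}$$

with $f_i = 1 - \sum_j f_{ij}$. Treating $\{f_{ij}\}$ as continuous in the range [0, 1], the overall free energy per copy of the ensemble, $F^*_{\mathrm{att}}$, follows from a saddle-point approximation:

$$\beta F^*_{\mathrm{att}} = \min_{\{f_{ij}\}} \beta f^*_{\mathrm{att}}(\{f_{ij}\}). \tag{13}$$

Setting $\partial f^*_{\mathrm{att}} / \partial f_{ij} = 0$ shows that the minimum is attained at

$$f_i + \sum_j f_{ij} = 1, \tag{14}$$

$$f_{ij} = f_i\, f_j\, e^{-\beta \Delta G_{ij}}, \tag{15}$$

which have the same form as the self-consistent conditions in Eqs. (2) and (3).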
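The recursion and its closed-form solution can be checked numerically. The sketch below assumes the closed form $Z = \prod_i (N!/N_i!) \prod_{i<j} e^{-\beta\Delta G_{ij} N_{ij}} / (N_{ij}!\, N^{N_{ij}})$, the ratio $Z_{\mathrm{new}}/Z_{\mathrm{old}} = N_i N_j e^{-\beta\Delta G_{ij}} / [N (N_{ij}+1)]$ for adding one i-j bond, and the Stirling limit $\beta f^*_{\mathrm{att}} = \sum_i (f_i \ln f_i - f_i + 1) + \sum_{i<j} (f_{ij} \ln f_{ij} - f_{ij} + \beta\Delta G_{ij} f_{ij})$. These forms are reconstructed from the derivation above; the three-linker numbers and all function names are hypothetical.

```python
import math

def unbound_counts(N, counts, n_linkers):
    """N_i = N - sum_j N_ij for every linker i."""
    unbound = [N] * n_linkers
    for (i, j), n_ij in counts.items():
        unbound[i] -= n_ij
        unbound[j] -= n_ij
    return unbound

def log_Z(N, counts, beta_dG, n_linkers):
    """ln Z({N_ij}) from the assumed closed form (Z = 1 when no bonds form)."""
    total = sum(math.lgamma(N + 1) - math.lgamma(n_i + 1)
                for n_i in unbound_counts(N, counts, n_linkers))
    for pair, n_ij in counts.items():
        total -= beta_dG[pair] * n_ij + math.lgamma(n_ij + 1) + n_ij * math.log(N)
    return total

def stirling_free_energy(N, counts, beta_dG, n_linkers):
    """Large-N form of -ln(Z)/N written with the fractions f_i and f_ij."""
    total = 0.0
    for n_i in unbound_counts(N, counts, n_linkers):
        f_i = n_i / N
        total += f_i * math.log(f_i) - f_i + 1.0
    for pair, n_ij in counts.items():
        f_ij = n_ij / N
        total += f_ij * math.log(f_ij) - f_ij + beta_dG[pair] * f_ij
    return total

# Hypothetical example: three linkers, N copies, arbitrary bond counts.
N = 100_000
beta_dG = {(0, 1): -1.0, (1, 2): -2.0, (0, 2): 0.5}
counts = {(0, 1): 20_000, (1, 2): 30_000, (0, 2): 10_000}

# Adding one 0-1 bond must change ln Z by the logarithm of the recursion ratio.
n0, n1, _ = unbound_counts(N, counts, 3)
more = dict(counts)
more[(0, 1)] += 1
lhs = log_Z(N, more, beta_dG, 3) - log_Z(N, counts, beta_dG, 3)
rhs = math.log(n0 * n1 / (N * (counts[(0, 1)] + 1))) - beta_dG[(0, 1)]
print(lhs, rhs)

# The exact -ln(Z)/N approaches the Stirling expression as N grows.
print(-log_Z(N, counts, beta_dG, 3) / N, stirling_free_energy(N, counts, beta_dG, 3))
```

The first pair of numbers should agree to floating-point accuracy, while the second pair differs by terms of order $(\ln N)/N$ that Stirling's approximation neglects.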

A. Connection to thermodynamic integration
The free energy $F^*_{\mathrm{att}}$, defined in Eq. (13), is equal to the free energy of the real system to the extent that the approximation in Eq. (9) is valid. Since this is the same approximation that we used previously 1 to calculate the free energy in terms of a thermodynamic integral, $F_{\mathrm{att}}$, it is reasonable to suppose that $F^*_{\mathrm{att}}$ and $F_{\mathrm{att}}$ are equal.
We now show this explicitly.
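In our original paper, we calculated the exact attractive free energy for the real system of linkers using thermodynamic integration. Specifically, we replaced $\beta \Delta G_{ij}$ by $\beta \Delta G_{ij} + \lambda$, whereupon the probabilities $\{p_i\}$ and $\{p_{ij}\}$ become functions of $\lambda$. We then integrated the appropriate free energy derivative over the range $0 \le \lambda < \infty$, and obtained

$$\beta F_{\mathrm{att}} = -\sum_{i<j} \int_0^\infty \mathrm{d}\lambda\; p_{ij}(\lambda). \tag{16}$$

The same replacement of $\beta \Delta G_{ij}$ with $\beta \Delta G_{ij} + \lambda$ can be made in the ensemble of $N$ copies. In that case, using Eqs. (12) and (13), we find that

$$\frac{\mathrm{d}\,\beta F^*_{\mathrm{att}}}{\mathrm{d}\lambda} = \sum_{i<j} \frac{\partial\, \beta f^*_{\mathrm{att}}}{\partial f_{ij}} \frac{\mathrm{d} f_{ij}}{\mathrm{d}\lambda} + \frac{\partial\, \beta f^*_{\mathrm{att}}}{\partial \lambda}.$$

The first term vanishes owing to Eqs. (14) and (15), and the second term follows immediately from Eq. (12): it equals $\sum_{i<j} f_{ij}$, evaluated at the minimum, where $f_{ij} = p_{ij}(\lambda)$. We then have

$$\frac{\mathrm{d}\,\beta F^*_{\mathrm{att}}}{\mathrm{d}\lambda} = \sum_{i<j} p_{ij}(\lambda) = \frac{\mathrm{d}\,\beta F_{\mathrm{att}}}{\mathrm{d}\lambda}.$$

Moreover, both $F^*_{\mathrm{att}}$ and $F_{\mathrm{att}}$ are zero when $\lambda$ is infinite, so the two quantities are equal for all $\lambda$.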
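The equality of the two routes can also be verified numerically on a small system. The sketch below assumes that the thermodynamic integral takes the form $\beta F_{\mathrm{att}} = -\int_0^\infty \mathrm{d}\lambda \sum_{i<j} p_{ij}(\lambda)$ and that the closed form is $\sum_i \ln p_i + \sum_{i<j} p_{ij}$, with $p_i$ solving $p_i = 1/(1 + \sum_j p_j e^{-\beta\Delta G_{ij}})$. The three-linker weights, the cutoff `lam_max`, the Simpson rule and the damped iteration are illustrative choices, not taken from the original work.

```python
import math

def solve_p(w, p0=None, tol=1e-13, max_iter=100_000):
    """Damped fixed-point solution of p_i = 1 / (1 + sum_j w_ij p_j)."""
    n = len(w)
    p = list(p0) if p0 is not None else [1.0] * n
    for _ in range(max_iter):
        new_p = [0.5 * p[i] + 0.5 / (1.0 + sum(w[i][j] * p[j] for j in range(n)))
                 for i in range(n)]
        change = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if change < tol:
            return p
    raise RuntimeError("fixed-point iteration did not converge")

def mean_bonds(w, p):
    """sum_{i<j} p_ij with p_ij = p_i p_j w_ij."""
    n = len(w)
    return sum(p[i] * p[j] * w[i][j] for i in range(n) for j in range(i + 1, n))

def free_energy_closed_form(w):
    """beta * F_att = sum_i ln p_i + sum_{i<j} p_ij."""
    p = solve_p(w)
    return sum(math.log(x) for x in p) + mean_bonds(w, p)

def free_energy_integration(w, lam_max=30.0, n_steps=1500):
    """-integral of sum_{i<j} p_ij(lambda) over 0..lam_max (Simpson's rule).

    Shifting beta*dG by lambda multiplies every weight by exp(-lambda).
    n_steps must be even; the tail beyond lam_max is exponentially small.
    """
    h = lam_max / n_steps
    p, total = None, 0.0
    for k in range(n_steps + 1):
        scale = math.exp(-k * h)
        w_lam = [[x * scale for x in row] for row in w]
        p = solve_p(w_lam, p0=p)  # warm start from the previous lambda
        coeff = 1 if k in (0, n_steps) else (4 if k % 2 else 2)
        total += coeff * mean_bonds(w_lam, p)
    return -total * h / 3.0

# Hypothetical three-linker system: w_ij = exp(-beta*dG_ij), zero on the diagonal.
g01, g02, g12 = -1.5, 0.0, -0.5
w = [[0.0, math.exp(-g01), math.exp(-g02)],
     [math.exp(-g01), 0.0, math.exp(-g12)],
     [math.exp(-g02), math.exp(-g12), 0.0]]
print(free_energy_closed_form(w), free_energy_integration(w))
```

The integration route solves the self-consistent equations once per quadrature point, whereas the closed form needs a single solve, which is the origin of the speedup discussed in the text.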
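The previous result, together with Eqs. (12)-(15), yields the following closed-form expression for $F_{\mathrm{att}}$:

$$\beta F_{\mathrm{att}} = \sum_i \left( p_i \ln p_i - p_i + 1 \right) + \sum_{i<j} \left( p_{ij} \ln p_{ij} - p_{ij} + \beta \Delta G_{ij}\, p_{ij} \right). \tag{19}$$

This expression is, in fact, equivalent to the much more compact Eq. (1). Concretely, Eq. (3) gives $\ln p_{ij} + \beta \Delta G_{ij} = \ln p_i + \ln p_j$, and Eq. (2) gives $\sum_j p_{ij} = 1 - p_i$, so that

$$\sum_{i<j} p_{ij} \left( \ln p_{ij} + \beta \Delta G_{ij} \right) = \sum_i (1 - p_i) \ln p_i, \qquad \sum_i (1 - p_i) = 2 \sum_{i<j} p_{ij},$$

and therefore

$$\beta F_{\mathrm{att}} = \sum_i \left( \ln p_i + 1 - p_i \right) - \sum_{i<j} p_{ij} = \sum_i \ln p_i + \sum_{i<j} p_{ij}.$$

II. DISCUSSION
Figure 2 reports the typical computational speedups obtained by using Eq. (1) versus our original thermodynamic integral, Eq. (16), for systems of $M$ linkers. The speedup is higher for larger $M$ and for stronger bonds because each evaluation of the thermodynamic integrand involves solving a system of $M$ equations, and the size of the integration domain scales linearly with the bond strength. Experimentally relevant systems typically involve hundreds to tens of thousands of strands, a regime that can now be treated exactly with Eq. (1).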
In the limit of weak bonds, where $p_{ij}$ is close to 0, Eq. (2) gives $\ln p_i = \ln\left(1 - \sum_j p_{ij}\right) \approx -\sum_j p_{ij}$, and we find that

$$\beta F_{\mathrm{att}} \approx -\sum_{i<j} p_{ij}.$$

This approximate result has been widely used by previous authors 2-5 under the name of the "Poisson approximation" or the "weak binding regime". However, in experiments with micron-sized DNA-coated colloids, this approximation can be significantly inaccurate 6. At the nanoscale, where high bond strengths are commonly used, the Poisson approximation is expected to break down (see Fig. 3 for comparison). Eq. (1) is instead quantitatively accurate for bonds of any strength 1.
Eqs. (1), (2) and (3) also make explicit the connection between our theory and previous treatments. For example, Dreyfus et al. 2 model the attraction between two DNA-coated spheres by first estimating the maximum number $N_p$ of linkers on each sphere that could form a bond with a linker on the second sphere, and then assuming that each such linker can independently bind any of $k$ linkers with an average free energy $\Delta F_{\mathrm{tether}}$. For later convenience, we define a small expansion parameter $x$ as

$$x = k \exp(-\beta \Delta F_{\mathrm{tether}}).$$

The partition function of the system is

$$Z = (1 + x)^{N_p},$$

from which the free energy $F_{\mathrm{att}}$ follows, 7

$$\beta F_{\mathrm{att}} = -N_p \ln(1 + x) \approx -N_p x + \tfrac{1}{2} N_p x^2 + \ldots$$

In the present framework, which does not treat the linkers as binding independently, every linker has the same probability $p$ of being unbound, given by the solution to

$$p + x p^2 = 1.$$

The free energy then follows from Eqs. (1) and (2):

$$\beta F_{\mathrm{att}} = N_p \left( 2 \ln p + x p^2 \right) \approx -N_p x + N_p x^2 + \ldots$$

Thus, our theory recovers the results of Dreyfus et al. in the weak binding regime. However, there is significant disagreement already at second order in $x$, where linkers begin to compete for binding partners.

Finally, using the same procedure as in our original paper, we can directly write an attractive free energy density for a pair of plates, treated at a more approximate, spatial mean-field level. In the notation of Ref. 1, [equation missing]. This result also follows from a large-area limit of Eq. (19) with random grafting points 1,8.

We expect that the simplification provided in this communication will boost the use of our model for calculating interaction free energies in general ligand-receptor mediated systems.

Figure 3. Free energy per linker in the "Poisson approximation" and the present model. Higher values of $x = k \exp(-\beta \Delta F_{\mathrm{tether}})$ lead to higher bonding probabilities, either because bonds are stronger or because linkers have more binding partners. The two models agree in the "weak binding regime" ($x \ll 1$), but disagree when correlations between neighbouring strands become significant.
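The comparison with the independent-binding picture can be made concrete with a few lines of Python. The sketch assumes that the independent-linker free energy per $N_p$ is $-\ln(1+x)$, and that in the present model every linker is unbound with probability $p$ solving $p + x p^2 = 1$, giving $2 \ln p + x p^2$ per $N_p$. Both expressions are reconstructions consistent with the text rather than quotations, and the function names are hypothetical.

```python
import math

def unbound_probability(x):
    """Positive root of p + x*p**2 = 1, written in a form that is stable as x -> 0."""
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * x))

def f_present(x):
    """beta * F_att / N_p in the present model: 2 ln p + x p^2."""
    p = unbound_probability(x)
    return 2.0 * math.log(p) + x * p * p

def f_independent(x):
    """beta * F_att / N_p for independently binding linkers: -ln(1 + x)."""
    return -math.log1p(x)

# Both models start as -x; they first differ at order x^2 (x^2 versus x^2/2).
for x in (1e-3, 1e-1, 1.0, 10.0):
    print(x, f_present(x), f_independent(x))
```

For small $x$ the difference between the two free energies approaches $x^2/2$ per $N_p$, while at large $x$ the present model is markedly less attractive because each linker competes with others for binding partners.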