Connecting density fluctuations and Kirkwood-Buff integrals for finite-size systems

Kirkwood-Buff integrals (KBI) connect the microscopic structure and thermodynamic properties of liquid solutions. KBI are defined in the grand canonical ensemble and evaluated assuming the thermodynamic limit (TL). In order to reconcile analytical and numerical approaches, finite-size KBI have been proposed in the literature, resulting in two strategies to obtain their TL values from computer simulations. (i) The spatial block-analysis method in which the simulation box is divided into subdomains of volume $V$ to compute fluctuations of the number of particles. (ii) A direct integration method where a corrected radial distribution function and a kernel that accounts for the geometry of the integration subvolumes are combined to obtain KBI as a function of $V$. In this work, we propose a method that connects both strategies into a single framework. We start from the definition of finite-size KBI, including the integration subdomain and an asymptotic correction to the radial distribution function, and solve them in Fourier space where periodic boundary conditions are trivially introduced. The limit $q\to 0$, equivalent to the value of the KBI in the TL, is obtained via the spatial block-analysis method. When compared to the latter, our approach gives nearly identical results for all values of $V$. Moreover, all finite-size effect contributions (ensemble, finite-integration domains and periodic boundary conditions) are easily identifiable in the calculation. This feature allows us to analyse finite-size effects independently and extrapolate the results of a single simulation to different box sizes. To validate our approach, we investigate prototypical systems, including SPC/E water and aqueous urea mixtures.

KBI are strictly defined in the grand canonical ensemble. Moreover, in practice, it is usual to take the thermodynamic limit (TL) to reduce their calculation to spherically symmetric real-space integrals of the radial distribution functions. In computer simulations, the TL is approximated by introducing periodic boundary conditions (PBC) for a system with a fixed number of particles N 0 . Accordingly, finite subvolumes V with * corteshu@mpip-mainz.mpg.de an average number of particles N are used to compute fluctuations of the number of particles and radial distribution functions [35,36]. Periodicity, different thermodynamic ensembles and finite integration domains introduce artefacts in the resulting KBI.
Finite-size KBI have been proposed in the literature to bridge the existing gap between analytical expressions and numerical studies. The critical assumption is that fluctuations of the number of particles in subvolumes V inside the simulation box are equal to integrals of the corresponding closed-system radial distribution functions [37][38][39]. This equality provides two routes to obtain KBI in the TL. The first one, i.e. the spatial block analysis method (SBA), is based on calculating fluctuations of the number of particles in subdomains of volume V . By using arguments from thermodynamics of small systems [35], linear scaling relations are defined to extrapolate KBI in the TL [40,41]. The second possibility is to correct the radial distribution functions for the differences in the thermodynamic ensemble, then integrate them using a kernel that takes into account finite-size domains [42,43]. Naturally, the limit V → ∞ gives the KBI in the TL.
Indeed, the two approaches are connected. In the limit V > V ζ , with V ζ the volume defined by the correlation length of the system ζ, the integration of the radial distribution functions give an expression equivalent to the result obtained from linear scaling of fluctuations of the number of particles, including ensemble and finite integration domain effects [44]. Nevertheless, the local solvation structure information, provided by the short-range part of the RDF, is lost in this case. Moreover, the effect of periodic boundary conditions is not included in the final result.
In this work, we propose a method that connects the spatial block analysis method to the direct integration of exact, finite-size KBI. We evaluate the large r limit by introducing an asymptotic correction to the RDF. By defining the geometry of the subdomain, we write and solve KBI in Fourier space where the periodicity of the cell can also be incorporated, following the procedure proposed in Ref. 45. We compute the q → 0 limit by using the spatial block analysis method. We thus obtain KBI as a function of the volume of the subdomain and find excellent agreement with fluctuations of the number of particles for SPC/E water and aqueous urea mixtures for all values of V . The method is accurate, and its implementation is straightforward. It simply requires performing a spatial block analysis and calculating onedimensional integrals of partial structure factors. The paper is organised as follows: In Section II we introduce the method, and in Section III the computational details. We present the main results in Section IV and conclude in Section V. For a multicomponent fluid of species i, j, contained in a volume V = L 3 , in thermal and chemical equilibrium with an infinite reservoir of particles, the Kirkwood-Buff integral (KBI) is defined as [1] where N i is the number of particles of the i-species and the bracket · · · denotes a thermal average, δ ij is the Kronecker delta and g ij is the pair correlation function defined in the grand canonical ensemble with r = r 2 −r 1 . In computer simulations, we usually investigate systems with fixed number of particles N 0 with volume V 0 = L 3 0 . Building on similar results for the Ornstein-Zernike equation [45], we define the finite-size KBI as where the average · · · ≡ · · · V,V0 now explicitly depends on the subdomain and total volumes, V and V 0 , respectively (See Fig. 1). Here we focus on the integral term, that contains the radial distribution function of the closed system, g ij (r; V 0 ), and a step function R(r) that defines the integration subdomain: it is one inside and zero outside the volume V . By defining Eq.
(2), we connect explicitly density fluctuations and the integral of the pair correlation function for any subdomain V .
In the following, we focus on integrating the r.h.s. of Eq.
(2). That is To include the correction due to ensemble effects, we use the approximation proposed in Ref. 44 based on the asymptotic limit g ij (r → ∞; thus, the finite-size KBI becomes where the second term on the r.h.s. contains the correction due to ensemble effects [3,4,44,46] and This expression can be easily written in Fourier space whereh ij is the Fourier transform of h ij . An additional advantage of integrating in reciprocal space is that periodic boundary conditions can be considered explicitly [45,47]. It is enough to rewriteh ij (k) such that periodic copies of the system are included via a phase factor. That is, we include the complete contribution of the periodic boundary conditions into Eq. (8) as where [45] h PBC with s nx,ny,nz = (n x L 0 , n y L 0 , n z L 0 ) a vector specifying the system's periodic images such that n x,y,z takes integer values. In the following, we find that the choice n x = n y = n z = 1 is sufficient to compute Eq. (9) accurately.
We assume a homogeneous fluid such thath Hence, in practice, we use the relation betweenh ij (k) and the partial structure factors S ij [48,49], namely The partial structure factors are computed as Consequently, the problem reduces to evaluate a single integral of the partial structure factors given by Eq. (9).
In principle, Eq. (6) now includes all the finite size effects present in the simulation (finite boundary, periodicity of the box and ensemble). Before entering into applications, there are still two issues demanding our immediate attention. The first is that the asymptotic correction in Eq. (6) requires the value of G ∞ ij . The second concerns the evaluation of lim k→0 S ij (k), that reduces, again, to evaluate G ∞ ij . Indeed, we have To obtain G ∞ ij we recall that, in the limit V ζ < V < V 0 (grand canonical ensemble), Eq. (7) can be approxi- [37,38,40,42] where α ij is a constant. By including this approximation into Eq. (6), we recover the spatial block analysis (SBA) result consistent with the result reported in Ref. [44] By evaluating density fluctuations for volumes V ≤ V 0 , as defined by the left hand side of Eq.
To summarise, the present method to evaluate KBI for finite systems requires information readily accessible from the simulation trajectory: density fluctuations for subvolumes V ≤ V 0 and partial structure factors. Additional corrections to the RDF or finite-domain integration kernels are not required. Moreover, periodic boundary effects are trivially included in the calculation.

III. COMPUTATIONAL DETAILS
To validate our approach, we first focus on liquid SPC/E water [50]. Molecular dynamics simulations have been carried out with GROMACS 4.5.1 [51] for systems containing 1000 and 8000 water molecules. We started with systems of initial density ≈ 26 waters/nm 3 (≈ 776 kg/m 3 ) that were optimised using steepest descent minimisation (50000 steps are sufficient). An equilibration run of 3.5 ns was carried out in the NPT ensemble at 1 bar. Next, we alternated 3.5 ns (time step = 1 fs) constant pressure (NPT) at P=1 bar and constant volume (NVT) simulations at T=300 K. For NPT simulations we used the Berendsen barostat [52], and for NVT simulations temperature was enforced by a velocity rescaling thermostat [53]. We continued with this protocol until we verified that in the NPT ensemble the density is 33.5 waters/nm 3 (1000 kg/m 3 ) and that in the NVT simulation pressure fluctuates around the 1 bar value. The last NVT trajectory obtained after this sequence of NPT-NVT equilibration runs was used for the spatial block analysis and for the calculation of the structure factor.
To test the method with a multicomponent case, we have re-used our simulation trajectories of aqueous urea solution [4,32] using the Kirkwood-Buff derived force field [54] and SPC/E water [50] in GROMACS 4.5.1 [51] with a relatively small size of the simulation box (L ∼ 8 nm). We have considered four more molar concentrations for a total of seven molar concentrations: 2.00, 3.06, 3.90, 5.07, 6.03, 7.10 and 8.03 M. Hence, we have alternated 3.5 ns (time step = 1 fs) constant pressure (NPT) at P=1 bar and constant volume (NVT) simulations at T=300 K. For NPT simulations we used a Berendsen barostat [52] to control the pressure, and for NVT simulations a velocity rescaling thermostat [53] to enforce the target temperature. We continued with this protocol (38 NPT-NVT cycles) until we verified that in the NVT simulation pressure fluctuates around 1 bar. Also in this case, the last NVT trajectory obtained after this NPT-NVT equilibration sequence was used for the spatial block analysis and for the calculation of the partial structure factors.

A. Single component liquid: SPC/E water
For the single-component liquid, we focus on the Ornstein-Zernicke integral equation for finite size systems [37,[55][56][57]. For a closed system with fixed number of particles N 0 and volume V 0 , including PBC. Similar to Eq. 2, we define [45,46] and in this case we use the asymptotic correction to the RDF proposed in Refs 56 and 58 by neglecting O(1/N 2 0 ) contributions. χ ∞ T = ρk B T κ T , κ T being the isothermal compressibility of the bulk system. To solve the integral on the r.h.s. of Eq. 15, we use the same procedure as described in Section II. In cases where V ζ < V < V 0 , we obtain the equivalent spatial block analysis expression [3, 4] with α a constant. First, we compute density fluctuations as defined on the l.h.s. of Eq. 15 for both N 0 = 1000 and 8000 systems. By defining λ = (V /V 0 ) 1/3 , we plot λχ T as a function of λ (Fig. 2). We extrapolate χ ∞ T from the curve's slope in the region λ < 0.3 for the system with N 0 = 8000 water molecules. The choice of this linear region is motivated by the fact that λχ T as obtained from Eq. 15 has the maximum at λ max = 0.63. Thus, we estimated λ = 0.3 as the value where the curve starts deviating significantly from a straight line. We can also use λ = 0.3 to choose an appropriate size of the system for the spatial block analysis. By assuming that the correlation length of water is ζ = 1.5 nm, we define V 1/3 ζ = 1.5 × (4π/3) 1/3 nm. The simulation box with N = 8000 water molecules has V 1/3 0 = 6.2 nm, thus λ = (V ζ /V 0 ) 1/3 = 0.39. This value is larger than λ = 0.3, still, it is sufficient to obtain a value of χ ∞ T = 0.062, in good agreement with the results reported in Ref. [4]. In practice, to select the size of the system one can start by estimating the correlation length from the radial distribution function and evaluating the linear size of the box such that λ ≈ 0.3. This criteria can also be applied to binary mixtures.
We use the results of this linear fit to plot the SBA results, Eq. (17). Also for this system, we compute the structure factor, and correct for the lim k→0 by using the relation We thus compute an integral equivalent to Eq. (8) to obtain χ T (V ; V 0 ). The results for both systems are also presented in Fig. 2. It is apparent that the agreement between density fluctuations and the integral method presented here is excellent. In contrast to the spatial block analysis result (Eq. (17)), oscillations of λχ T at low values of λ, related to the local liquid structure, and at large values of λ, due to the periodicity of the simulation box, are consistently reproduced with our method. This is particularly interesting for the system with N 0 = 1000 water molecules, where oscillations are more pronounced. In this case, our integral method uses information from the system with N 0 = 8000 water molecules. The small box behaviour is reproduced artificially via the periodic images in Eq. (10).
We now focus on the different finite-size effects present in the system with N 0 = 1000 water molecules. In Fig. 3 we present four possibilities of evaluating the r.h.s. of Eq. (15). (i) For the closed system, i.e. including the correction χ ∞ T V /V 0 , with PBC, we observe that χ T (λ = 1) = 0, as expected. (ii) The closed system without PBC gives a limit χ T (λ = 1) = ρα/V

B. Binary mixture: aqueous urea solution
We perform a similar analysis for the aqueous urea mixture case. First, we compute fluctuations of the number of particles as defined on the l.h.s. of Eq. (2). As for the single component case, we define λ = (V /V 0 ) 3 and plot λG ij as a function of λ. We carried out this study for all concentrations. However, we only present the results for the case 8M in Fig. 4 (dashed lines). Using the information from the linear region λ < 0.3, we extrapolate G ∞ ij and obtain α ij . In this case, we get G ∞ uu = −0.0867 ,G ∞ uw = −0.0639 and G ∞ ww = −0.0083 nm 3 with uu, uw and ww corresponding to urea-urea, urea-water and water-water components, respectively. These values well reproduce derivatives of activity coefficients reported experimentally [4,44] as well as excess chemical potentials trends with concentration obtained with different computational methods [59,60]. We insert these values in Eq. (14), i.e. SBA, and plot this result as well (solid grey lines). Finally, we use the finite KBI introduced here (Eq. (6)) and use the Fourier integral, Eq. (9) to compute G ij (V ). We present both results with (solid lines) and without (dash-dotted lines) the correction to the ensemble finite-size effects, G ∞ ij λ 3 . In this case as well, the results of our method accurately reproduce density fluctuations in the whole range 0 < λ < 1, including both, local structure (λ 1) and periodic boundary (λ ≈ 1) features. As anticipated, it is also apparent that the SBA result does not reproduce these limiting cases. Nevertheless, in the limit λ = 1 the results from fluctuations, SBA and our integration (Closed, PBC) converge to −δ ij /ρ i , the expected result of the KBI for a closed system [2]. As previously stated, we can separate finite-size contributions by focusing on the corresponding terms in Eqs (6) and (9). In particular, for an open system (Open, PBC), i.e. lim V0→∞ , we verify that the KBI converge to G ∞ ij when λ = 1. We examine this in more detail in Fig. 5 where the normalised KBI for urea-urea, λG uu , is presented. In addition to the limiting cases discussed above, we also consider an open system without periodic boundary conditions (dotted line). It is apparent in the region λ ≈ 1 (inner panel) that G uu for an open system with PBC converges to the value in the thermodynamic limit G ∞ uu , whereas for the open system without PBC, G uu is slightly larger than G ∞ uu value by a factor α uu /V 1/3 0 , as expected from the SBA expression, Eq. (14). As in the single component case, this result emphasises that PBC enforce the correct behaviour at the boundary of closed and open liquid mixtures. Finally, we also present the normalised running integral λG R ij (dashed red line) using (with R > ζ), an expression frequently used in the literature. The major differences with the results presented in this work resulting from various finite-size effects highlight the apparent limitations of using such an expression to calculate KBI.

V. CONCLUDING REMARKS
Finite Kirkwood-Buff integrals (KBI) enable us to sample the thermodynamic limit of liquid mixtures via relatively small computer simulations. The definition of finite KBI balance fluctuations of the number of particles in subdomains within the simulation box and integrals of the corresponding RDF. In this work, we underline this equality by reproducing density fluctuations as a function of the linear size of the subdomain via a simple integration strategy. In particular, we introduce a method to evaluate KBI via integrals of the partial structure factors in reciprocal space. A significant advantage of our approach corresponds to the direct inclusion of finite integration domains and PBC contributions. Consequently, we can now identify and remove finite-size effects such that grand canonical and thermodynamic limit results become readily available from finite-size computer simulations. Moreover, we show that this scheme enables us to extrapolate our results to different sizes of the simulation box simply by modifying the periodicity factor in the integration procedure. The simplicity of the method is apparent since it only requires fluctuations of the number of particles calculated for different subdomains sizes and the partial structure factors. We foresee immediate applications in situations where PBC play a pivotal role, namely, the recently introduced KBI for crystalline materials [33,34].   6) with Gij(V ) given by the Fourier integral Eq. (9) with (solid) and without (dash-dotted lines) the correction to ensemble effects given by G ∞ ij λ 3 . The solid black lines correspond to the KBI in the TL, G ∞ ij . The dashed grey lines indicate the asymptotic limit for the closed system, −δij/ρi.  6) with Guu(V ) given by the Fourier integral Eq. (9) with (solid) and without (dash-dotted lines) the correction to ensemble effects given by G ∞ uu λ 3 , both with PBC. Open, no PBC -finite KBI without the correction for the thermodynamic ensemble, and without PBC. The solid black line correspond to the KBI in the TL, G ∞ uu . The dashed grey line indicate the asymptotic limit for the closed system, −1/ρu. The red dashed line corresponds to the running integral Eq. (19). (Inset) Detail of the convergence to the TL.

ACKNOWLEDGMENTS
We are grateful to Kurt Kremer, Debashish Mukherji and Pietro Ballone for insightful discussions. We also thank Atreyee Banerjee for her critical reading of the manuscript. R.C.-H. gratefully acknowledges funding from SFB-TRR146 of the German Research Foundation (DFG). Simulations have been performed on the THINC cluster at the Max Planck Institute for Polymer Research and on the COBRA cluster of the Max Planck Computing and Data Facility.