Accelerating equilibrium isotope effect calculations: II. Stochastic implementation of direct estimators

Path integral calculations of equilibrium isotope effects and isotopic fractionation are expensive due to the presence of path integral discretization errors, statistical errors, and thermodynamic integration errors. Whereas the discretization errors can be reduced by high-order factorization of the path integral and statistical errors by using centroid virial estimators, two recent papers proposed alternative ways to completely remove the thermodynamic integration errors: Cheng and Ceriotti [J. Chem. Phys. 141, 244112 (2015)] employed a variant of free-energy perturbation called"direct estimators,"while Karandashev and Van\'{\i}\v{c}ek [J. Chem. Phys. 143, 194104 (2017)] combined the thermodynamic integration with a stochastic change of mass and piecewise-linear umbrella biasing potential. Here we combine the former approach with the stochastic change of mass in order to decrease its statistical errors when applied to larger isotope effects, and perform a thorough comparison of different methods by computing isotope effects first on a harmonic model, and then on methane and methanium, where we evaluate all isotope effects of the form $\mathrm{CH}_{\mathrm{4-x}}\mathrm{D}_{\mathrm{x}}/\mathrm{CH}_{4}$ and $\mathrm{CH}_{\mathrm{5-x}}\mathrm{D}^{+}_{\mathrm{x}}/\mathrm{CH}^{+}_{5}$, respectively. We discuss thoroughly the reasons for a surprising behavior of the original method of direct estimators, which performed well for a much larger range of isotope effects than what had been expected previously.

Path integral calculations of equilibrium isotope effects and isotopic fractionation are expensive due to the presence of path integral discretization errors, statistical errors, and thermodynamic integration errors.Whereas the discretization errors can be reduced by high-order factorization of the path integral and statistical errors by using centroid virial estimators, two recent papers proposed alternative ways to completely remove the thermodynamic integration errors: Cheng and Ceriotti [J.Chem.Phys.141, 244112 (2015)] employed a variant of free-energy perturbation called "direct estimators", while Karandashev and Vaníček  [J.Chem.Phys.143, 194104 (2017)] combined the thermodynamic integration with a stochastic change of mass and piecewise-linear umbrella biasing potential.Here we combine the former approach with the stochastic change of mass in order to decrease its statistical errors when applied to larger isotope effects, and perform a thorough comparison of different methods by computing isotope effects first on a harmonic model, and then on methane and methanium, where we evaluate all isotope effects of the form CH 4−x D x /CH 4 and CH 5−x D + x /CH + 5 , respectively.We discuss thoroughly the reasons for a surprising behavior of the original method of direct estimators, which performed well for a much larger range of isotope effects than what had been expected previously, as well as some implications of our work for the more general problem of free energy difference calculations.

I. INTRODUCTION
Equilibrium isotope effect and a closely related concept of isotope fractionation [1][2][3] belong among the most useful experimental tools for uncovering the influence of nuclear quantum effects on molecular properties. 4,5The equilibrium (or thermodynamic) isotope effect measures the effect of isotopic substitution on the equilibrium constant of a chemical reaction and is defined as the ratio of equilibrium constants, where A and B are two isotopologues of the reactant.The equilibrium constant can, in general, be evaluated as the ratio of the product and reactant partition functions (K = Q prod /Q react ), and therefore every equilibrium isotope effect can be computed as a product of several "elementary" isotope effects (IEs), given by the ratio of partition functions corresponding to different isotopologues.In the following, we will discuss different approaches for evaluating these elementary partition function ratios and call them "isotope effects" for short.
The standard textbook approach for computing the isotope effect assumes separability of rotations and vibrations, rigid rotor approximation for the rotations and a) Electronic mail: konstantin.karandashev@alumni.epfl.chb) Electronic mail: jiri.vanicek@epfl.chharmonic approximation for the vibrations, 5,6 but we will focus our discussion on a more rigorous method that avoids all three approximations and treats the problem exactly.2][13][14][15] The main drawback of this approach is that the "mass integral" is evaluated by discretizing the mass and, as a result, introduces a certain integration error.Although several elegant tricks reduce this integration error significantly, 1,2 it can never be removed completely if the integral is evaluated deterministically.This disadvantage of thermodynamic integration led to the introduction of several strategies that avoid the integration error altogether; these can be classified into two main groups: The first group avoids discretizing the mass integral by allowing the mass to take a continuous range of values during the simulation; 15,16 this approach is an example of a more general λ-dynamics method [17][18][19] for calculating free energy differences.2][23] The second group is based on the free energy perturbation, 1,15,24,25 which can be derived by rewriting the ratio (2) using the Zwanzig formula; the approach was proposed in Ref. 15, with more convenient estimators introduced in Ref. 3. Note, however, that in the more general case of free energy difference calculations one can come across a problem that cannot be solved efficiently with either of these two general strategies.For these complicated cases a third option is to run an adiabatic free energy dynamics simulation 26 and then calculate the free energy difference from the ratio of the resulting probability distributions. 27,28Because methods based on finding ratios of probability densities may run into difficulties discussed, for example, in Ref. 29, we do not consider them here.
In Ref. 16, we introduced a method following the first philosophy for eliminating the integration error-in particular, we showed that an effective Monte Carlo procedure for changing the mass reduces the thermodynamic integration error and that a special mass-dependent biasing potential renders the integration error exactly zero.Here, we explore the second strategy, namely the free energy perturbation, or "direct estimator" approach; 3,15 while direct estimators work best for isotope effects close to unity, they can be applied to larger isotope effects as well by rewriting Eq. ( 2) as a product of several smaller isotope effects, 15,30 which, incidentally, makes the resulting "stepwise" method reminiscent of thermodynamic integration.First, we investigate ways to optimize the choice of the smaller isotope effects and the way they are evaluated in order to decrease statistical error of the calculated isotope effect.Second, since the statistical convergence of thermodynamic integration can be improved by changing the mass stochastically during the simulation, it seems natural to combine this stepwise approach with the procedure for changing mass introduced in Ref. 16; indeed, testing performance of the resulting combined method on large isotope effects was the main goal of this paper.Such a combination with direct estimators is only possible for a mass sampling procedure that allows finite steps with respect to mass, making the Monte Carlo procedure of Ref. 16 suitable for the task, but disqualifying standard λ-dynamics algorithms based on molecular dynamics.Lastly, it is interesting to consider mass-scaled direct estimators used in this work as a targeted free energy perturbation 31 method, that is a method that uses a physically motivated coordinate mapping (in this case transforming to and from mass-scaled normal modes) to facilitate free energy perturbation calculations.Since all methods discussed in this work rely in some way on the mass-scaled normal mode transformation, their comparison can be rephrased as finding the most efficient way to use a targeted free energy perturbation transformation for a free energy difference calculation.This point will be elaborated further in Sec.IV.
To assess the numerical performance of the proposed methodology, we apply it to evaluate isotope effects in an eight-dimensional harmonic model and in fulldimensional methane (CH 4 ) molecule and methanium (CH + 5 ) cation.Methane was chosen because CH 4 + D 2 exchange is an important benchmark reaction for studying catalysis of hydrogen exchange over metals 32 and metal oxides, 33 and because the polydeuterated species CH 4−x D x are formed in abundance during the catalyzed reaction.We therefore demonstrate how our new Monte Carlo procedure allows computing not only the large CD 4 /CH 4 isotope effects, but also all CH 4−x D x /CH 4 isotope effects (x = 1, . . ., 4) within a single simulation.5][36] As a consequence, the equilibrium properties of methanium, unlike those of methane, cannot be reliably estimated with the harmonic approximation.We again evaluate all isotope effects of the form CH 5−x D + x /CH + 5 (x = 1, . . ., 5).

II. THEORY
In this section, we explain how stochastic change of mass 16 can be combined with the method of direct estimators 3 in order to accelerate isotope effect calculations.In particular we show that for large isotope effects, the stochastic implementation allows reducing the statistical error while keeping the integration error zero.We start with a brief overview of the path integral formalism, thermodynamic integration, and stochastic implementation of the thermodynamic integration.For more details, see Ref. 16, whose notation is followed here.

A. Path integral representation of the partition function
To evaluate the isotope effect (2) with path integrals, one first needs a path integral representation of the partition function Q = Tr exp(−β Ĥ).For a molecular system consisting of N atoms with masses m i (i = 1, . . ., N ) moving in D (= 3) spatial dimensions, the partition function can be expressed 7,8 as the limit Q = lim P →∞ Q P , where P is the number of imaginary time slices (the Trotter number) and is the discretized path integral representation of Q.The vector r contains all N DP coordinates of all N atoms in all P slices of the extended configuration space; in particular, r := r (1) , . . ., r (P ) , where r (s) , s = 1, . . ., P , is a vector containing N D coordinates of all atoms in slice s.The statistical weight ρ(r) of each path integral configuration is given by with the prefactor and with an effective potential energy of the classical ring polymer given by where r (s) i denotes the D components of r (s) corresponding to atom i, and V is the potential energy of the original system.Since the path employed to represent the partition function is a closed path, we define r (0) := r (P ) .
Note that Q P is a classical partition function of a ring polymer, a system in the extended configuration space with N DP classical degrees of freedom exposed to the effective potential Φ(r).For P = 1 the quantum path integral expression (3) reduces to the classical partition function.

B. Thermodynamic integration with respect to mass
A convenient way to evaluate the isotope effect (2) is based on thermodynamic integration 10 with respect to mass. 11The isotope change is assumed to be continuous and parametrized by a dimensionless parameter λ ∈ [0, 1], where λ = 0 corresponds to isotopologue A and λ = 1 to isotopologue B. The simplest possible choice for the mass interpolating function is a linear interpolation, [11][12][13] but faster convergence is often achieved by interpolating the inverse square roots of the masses, 1 1 which is therefore the interpolation we will use in the numerical examples below.(One can show rigorously that this interpolation is the optimal one in harmonic systems in the low temperature limit, but has a good behavior at high temperature and in various other systems as well.)If Q(λ) denotes the partition function of a fictitious system with interpolated masses m i (λ), then the isotope effect (2) can be expressed as an exponential of the "thermodynamic integral" where F (λ) is the free energy corresponding to the isotope change.While it is difficult to evaluate either with a path integral Monte Carlo method, the logarithmic derivative d ln Q P (λ)/dλ = [dQ P (λ)/dλ]/Q P (λ) = −βdF P (λ)/dλ is a normalized quantity and therefore can be computed easily with the Metropolis algorithm with sampling weight ρ (λ) (r) corresponding to the fictitious system with masses m i (λ): λ) .
Here we used a general notation for a path integral average of an observable A, given by averaging the estimator A est over an ensemble with weight ρ (λ) .An estimator for a quantity A is not unique; different choices of the estimator result in different statistical behavior.As the analogous centroid virial estimator for energy 37,38 (obtained by differentiating with respect to inverse temperature β), the centroid virial estimator for the isotope change 12 (where the differentiation is with respect to λ) has a statistical error that is independent of P .This estimator, used in all our numerical examples, is given by where is the centroid coordinate and ∇ i is the gradient with respect to the coordinates of particle i.While in path integral molecular dynamics gradients of V are available and the centroid virial estimator is "free," in path integral Monte Carlo implementations only the potential energy itself is required for sampling, and in order to avoid an unnecessary evaluation of forces, one may evaluate the centroid virial estimators by a single finite difference differentiation with respect to λ. 12,39 To summarize, the calculation of the isotope effect requires running simulations at different values of λ and then numerically evaluating the integral in Eq. (8).

C. Stochastic thermodynamic integration
Due to the discretization of the thermodynamic integral, the method of thermodynamic integration necessarily introduces an integration error.While Ceriotti and Markland 1 reduced the integration error by optimizing the interpolation functions m i (λ), and thus obtained Eq. ( 7), Maršálek and Tuckerman 2 introduced higherorder derivatives of Q(λ) with respect to λ.][19][20] More precisely, the λ-interval and the isotope effect is then calculated as (11)   where • • • Ij denotes an average over all path integral configurations as well as over all λ ∈ I j .The simultaneous stochastic evaluation of the thermodynamic and path integrals permits using a much larger number of integration steps (i.e., J), rendering the integration error negligible; if the simulation is, in addition, subject to a special umbrella biasing potential that is a piecewise linear function of λ, we have proven that the integration error becomes exactly zero. 16Our main goal here is evaluating whether the stochastic change of mass can be combined effectively with another method for evaluating the isotope effect, namely the method of direct estimators, which we review now.

D. Free energy perturbation and direct estimators for equilibrium isotope effects
Free energy perturbation 24,40 is an alternative strategy which allows to calculate isotope effects without introducing an integration error. 3,15The approach consists in rewriting Eq. ( 2) as where Z 0,1 th := ρ (1) (r)/ρ (0) (r) is the "thermodynamic direct estimator," 3,15 numerically evaluated as However, a much lower statistical error is obtained by using an alternative, mass-scaled direct estimator, 3 where the scaled coordinates r (s) λ ,λ are defined as The mass-scaled direct estimator can be derived by expressing Q(1) and Q(0) in Eq. ( 12) in terms of massscaled normal mode coordinates u (see Appendix A of Ref. 16 for details).This leads to a direct estimator ρ(1) (u)/ρ (0) (u), which becomes Eq. ( 14) upon transformation back to standard Cartesian coordinates r.In contrast to the thermodynamic integration, the method of direct estimators does not introduce an integration error; the direct estimators, however, are not suitable for calculating large isotope effects, for which both Z 0,1 th and Z 0,1 sc exhibit large statistical errors because one always uses a sampling weight ρ (0) (r) even though the natural weight changes from ρ (0) (r) to ρ (1) (r).Although isotope effect can be evaluated by running the simulation either with ρ (0) (r) or ρ (1) (r), Cheng and Ceriotti noted 3 that running the simulation at the heavier isotope leads to smaller statistical errors of Z sc ; in Appendix A we prove this property analytically for harmonic systems.In the process we also encounter cases in which direct estimators exhibit infinitely large root mean square errors, making the calculation impossible to converge; however, as shown in Appendix B, such situations cannot occur if one runs the simulation at the lower-mass isotopologue while averaging Z th or, for convex or bound potentials, at the higher-mass isotopologue while averaging Z sc .

E. Stepwise implementation of the direct estimators
A simple way to bypass the issue of large statistical errors of direct estimators for large isotope effects is performing the calculation stepwise by factoring the large isotope effect into several smaller isotope effects between virtual isotopologues. 15,30For that purpose, one can, as for thermodynamic integration, introduce a set of J + 1 intermediate values λ j (j = 0, . . ., J) such that λ 0 = 0, λ J = 1, and write For J large enough, Q(λ j )/Q(λ j−1 ) will be sufficiently close to unity, and hence can be evaluated with direct estimators with a reasonably small statistical error.It will prove useful to write Eq. ( 16), expressed in terms of direct estimator averages, in a more general manner as by using an arbitrary reference value λ j from the interval I j = [λ j−1 , λ j ] as the λ-value of the sampling weight used in the jth factor of the isotope effect; in a broader context of free energy perturbations this approach is known as "double-wide" sampling. 41,42From now on we will refer to the stepwise evaluation of the isotope effect via Eq.( 17) simply as the method of "direct estimators." Choices of λ j and λ j that minimize the statistical error are discussed in Appendix C; in general it appears that λ j = j/J (j = 0, . . ., J) and λ j = (j − 1/2)/J (j = 1, . . ., J) are quite close to the optimal values if inverse square root of mass interpolation ( 7) is used, and therefore were used throughout this work.
F. Combining direct estimators with the stochastic change of mass In the original use of direct estimators, 3 the largest isotope effect was rather small, and hence it was possible to evaluate several isotope effects in a single simulation.In situations where the isotope effect is large, one should use the generalized, stepwise expression (16) to compute it, in order to avoid large statistical errors.In the previous subsection we assumed that each of the factors contributing to the product ( 16) is obtained from a separate simulation, as in Eq. (17).
Here, we propose, instead, to run a single simulation which will explore all λ j values at once in a way similar to the stochastic thermodynamic integration with respect to mass from Ref. 16 and described briefly in Subsection II C.There are several reasons why this should be advantageous.
The first, most obvious reason, is the same as for stochastic thermodynamic integration from Ref. 16: given fixed computational resources, it is much easier to reach ergodicity, and therefore converge a single Monte Carlo simulation than J separate simulations; in other words, it is computationally less expensive to obtain a converged isotope effect if all the factors in the righthand side of Eq. ( 17) are obtained from a single simulation rather than from J separate simulations.
The second reason is similar to the motivation for the original method of direct estimators, 3 in which a single Monte Carlo simulation suffices to evaluate several different isotope effects; calculating all factors appearing in Eq. ( 17) in a single simulation will indeed enable this at least for "sequential" isotope effects, such as x /CH + 5 for x = 1, . . ., 5 that will be presented below.
Last but not least, the stochastic change of λ can lead to a decrease of statistical error of the calculated isotope effect.Consider J sets of correlated samples obtained from J independent simulations using the method of direct estimators and run at different values of λj ; reshuffling the samples between simulations should make samples inside each simulation less correlated between each other, thus lowering statistical error of averages obtained from them.The proposed method can be regarded as such a "shuffling," and in this respect it resembles the parallel tempering or replica exchange Markov chain Monte Carlo techniques, [21][22][23] but for the latter approaches the value of J that can be used in practice will depend on the number of simulation replicas one can run simultaneously.
With this motivation in mind, it is easy to see that to change λ value between different λ j values one can use the same procedure as that described in Ref. 16, the only difference being that the trial λ value should be restricted to a discrete set of values {λ j }, j = 1, . . ., J; some more tedious details are left for Appendix D. The overall isotope effect is again obtained from Eq. (17).
Combining stochastic change of mass with the thermodynamic integration or stepwise direct estimators also makes it possible to calculate several isotope effects at once more efficiently.This result is discussed properly in Supplementary Material using deuterization of methane and methanium as examples.

III. NUMERICAL EXAMPLES
To test the stepwise and stochastic implementations of the direct estimators, we applied them to a model 8-dimensional harmonic system and to several isotopologues of methane and methanium.In addition we compared the results of the new approaches with results of the thermodynamic integration with either deterministic or stochastic change of mass, and with the original method of direct estimators. 3From now on, for brevity we will refer to the five methods as follows: thermodynamic integration with respect to mass (Subsection II B) will be simply referred to as "thermodynamic integration" (TI), thermodynamic integration with stochastic change of mass (Subsection II C) as "stochastic thermodynamic integration" (STI), Cheng and Ceriotti's original method of direct estimators (Subsection II D) as "original direct estimators" (ODE), stepwise application of direct estimators (Subsection II E) as "direct estimators" (DE), and stepwise application of direct estimators with stochastic change of mass (Subsection II F) as "stochastic direct estimators" (SDE).

A. Computational details
The calculations presented in this work were done with parameters mostly identical to those used in Ref. 16.The DE and ODE calculations were done with the same number of different Monte Carlo steps and frequency of estimators' evaluation as TI, while the SDE calculations employed the same number of different Monte Carlo steps and frequency of estimators' evaluation as STI.In SDE calculations, we additionally modified simple λ-moves used in STI calculations according to a prescription detailed in Appendix D. The way the λ interval was divided into subintervals was also the same as in Sec.III of Ref. 16.
As discussed in Sec.II B, results obtained with TI contain integration error, which was estimated by comparing the calculated isotope effects with the exact analytical 43 values for a harmonic system with a finite Trotter number P and with the result of SDE for methane.(Recall that ODE, DE, and SDE have no integration error by definition, while the absence of integration error in STI was proven in Ref. 16.) Statistical errors were evaluated with the "block-averaging" method 44 for correlated samples the same way as in Sec.III of Ref. 16.Last but not least, the path integral discretization error for the harmonic system was estimated from a known analytical expression for finite P , 43 while for the CD 4 /CH 4 and CD + 5 /CH + 5 isotope effects it was estimated using a novel procedure presented in Appendix E.

B. Isotope effects in a harmonic model
Numerical results are presented in Fig. 1 and correspond to an 8-dimensional harmonic system with parameters identical to those used in Ref. 16. Panel (a) of Fig. 1 shows that analytical values of the isotope effect (at a finite value of P ) are reproduced accurately by all five methods for several values of β ω 0 , confirming, in particular, that the proposed stepwise and stochastic implementations of the direct estimators are correct.
Panel (b) displays the dependence of the thermodynamic integration error on temperature.(Recall that the integration error is zero for STI, ODE, DE, and SDE by construction).The figure is a reminder of the fact that, despite the improved mass interpolation scheme (7), TI is the only of the five presented methods that exhibits a significant integration error, especially noticeable at higher temperatures since the mass interpolating function Eq. ( 7) was designed to be most effective in the deep quantum regime.
Panel (c) of Fig. 1 compares statistical errors of the five methods considered.The first evident trend is that at lower temperatures ODE exhibit a larger statistical error compared to the other methods; it illustrates the need to use the stepwise (DE) or stochastic stepwise (SDE) variants instead in such cases.Secondly, similar statistical errors are exhibited by TI and DE, as well as by STI and SDE.This can be explained by noticing that in the limit of large J TI becomes equivalent to DE, while STI becomes equivalent to SDE; to see this one can recall that [dF (λ)/dλ] est is related to the derivative of Z λ ,λ sc with respect to λ and compare Eq. ( 17), Eq. ( 11), and the midpoint rule used for thermodynamic integration in this work [also see Eq. (32) of Ref .16].It is therefore reasonable to expect that for large values of J or small isotope effects statistical errors of TI and DE or STI and SDE will be quite close; here J = 8 is apparently already sufficient to enforce this tendency over a wide temperature range.
Panel (d) of Fig. 1 displays integration errors of TI as a function of the number J of λ intervals.Clearly, for TI, the integration error approaches zero only in the J → ∞ limit, which is, exceptionally, attainable in this particular case since the Monte Carlo procedure produces uncorrelated samples.Note, however, that in most practical calculations, this is impossible and the limit cannot be reached.
Finally, panel (e) of Fig. 1 shows the dependence of statistical errors on J, demonstrating that the statistical errors of the four methods approach their limiting values for J → ∞.As already mentioned earlier, similar statistical errors for TI and DE, or for STI and SDE in the J → ∞ limit are expected.Here, however, the tendency is already observed at a surprisingly small value of J = 1 for TI and DE.For STI and SDE it is observed after J = 64.

Computational details
As mentioned above, calculation parameters for DE, SDE, and ODE are identical to those from Ref. 16, with a few modifications described in Subsection III A; the TI and STI results were taken from Ref. 16.Here we only add that the computational time spent on TI, STI, SDE, DE, and ODE calculations was approximately the same.
For each temperature we also ran calculations at λ = 0 and λ = 1 to estimate the path integral discretization error with the method explained in Appendix E which uses estimator W 2 for Q 2P /Q P .These simulations were 10 7 Monte Carlo steps long, 14% were whole-chain moves and 86% were multi-slice moves performed on one sixth of the chain with the staging algorithm 45,46 (this guaranteed that approximately the same computer time was spent on either of the two types of moves).W 2 was evaluated only after each ten Monte Carlo steps to decrease the calculation cost.

Results and discussion
The results of the calculations of the CD 4 /CH 4 isotope effect are presented in Fig. 2, with TI and STI results from Ref. 16 plotted for comparison.Panel (a) shows that the isotope effects calculated with the five different methods described in Sec.II agree, confirming that SDE were implemented correctly, even though TI does exhibit a small integration error [see panel (b)].Panel (c) of Fig. 2 shows that STI, SDE, TI, and DE exhibit quite similar statistical errors, while the ODE, quite surprisingly, exhibit statistical errors quite close to stepwise approaches for all but the lowest temperature.Several reasons why ODE seem to perform well for a rather large range of isotope effect values are analyzed in Appendix B.
For reference, the SDE, DE, and ODE values plotted in Fig. 2 are listed in Table I together with their discretization errors estimated by the method described in Appendix E. Note that the discretization error only depends on P and not on the method used for the isotope effect calculation, and that our method for estimating this discretization error exhibits favorable statistical behavior.As mentioned in Ref. 16, benefits of stochastically changing mass become most apparent when computational resources are limited.We therefore compared isotope effects obtained with TI, DE, STI, SDE, and ODE using simulations of increasing length, starting with very short, and hence nonergodic simulations.The results of these calculations, performed according to a prescription detailed in Ref. 16, are plotted in Fig. 3.The general tendency is the same as was observed in Ref. 16: approaches that require several simulations to evaluate the isotope effect (TI and DE) require more Monte Carlo steps in total to achieve converged results than those requiring only one simulation (STI, SDE, and ODE).This is due to a certain time needed for a simulation to adequately explore the r space; while TI and DE need to take the corresponding minimal number of Monte Carlo steps for J simulations, STI, SDE, and ODE need to do it only for one simulation.For STI and SDE this benefit is slightly counteracted by introducing an extra dimension λ to be explored by the simulation, but we have found our Monte Carlo procedure tends to explore this dimension much faster than r in realistic simulations; a model example, where this is not the case, is provided in Sec.II of Supplementary Material.

D. Deuteration of methanium
In this subsection we evaluate the CD + 5 /CH + 5 isotope effect using the same techniques as for methane in Subsec.III C and Ref. 16, including TI, DE, STI, SDE, ODE, and harmonic approximation; the only difference of the calculation parameters was that J = 5 was employed for mass discretization.Potential energy surface from Ref. 47 was used during the simulations.
The resulting isotope effects are plotted in Fig. 4 along with their root mean square errors and integration error for TI.As seen in panel (a), the four path integral methods that do not exhibit integration error (DE, SDE, STI, and ODE) agree with each other up to statistical error, while TI exhibits a noticeable, but rather small, integration error plotted in panel (b).Panel (c) shows root mean square errors of the isotope effect obtained with different methods.As was the case for methane, TI, DE, STI, and SDE exhibit similar statistical errors in the entire temperature range.9][50] Note that the harmonic approximation was obtained by expanding the potential energy about one of the 120 equivalent potential energy minima of CH + 5 .For reference, all calculated isotope effects along with the estimated discretization errors are presented in Table II.
As in Subsec.III C, we compared isotope effects ob- tained with TI, DE, STI, SDE, and ODE using simulations of increasing length, starting with very short, and hence nonergodic simulations.The results of these calculations, performed according to a prescription detailed in Ref. 16, are plotted in Fig. 5.The general tendency is the same as was observed in Ref. 16 and in the case of methane in Subsec.III C: approaches that require several simulations to evaluate the isotope effect (TI and DE) require more Monte Carlo steps in total to achieve converged results than those requiring only one simulation (STI, SDE, and ODE).
Table I.Values of the CD4/CH4 isotope effect (IE) obtained with direct estimators (DE), "stochastic direct estimators" (SDE), and "original direct estimators" (ODE) (for other values seen in Fig. 2, see Ref. 16).The proposed methodology is SDE.The discretization error is defined as IEP /IE − 1 and is estimated with the procedure described in Appendix E.

IV. CONCLUSION
In this paper, we continued our program of accelerating the calculations of equilibrium isotope effects.Starting from a basic idea of thermodynamic integration with respect to mass, 11 one can 1) accelerate statistical convergence by using centroid virial estimators, which, in the case of path integral Monte Carlo avoid evaluating forces by employing a finite-difference derivative trick, 12,13 2) accelerate convergence to the quantum limit by employing higher-order factorizations of the path integral, 51 and 3) reduce or remove the error of thermodynamic integration, which was the focus of the present work.
To remove the thermodynamic integration error, we investigated optimal ways to use direct estimators 3,15 for larger isotope effects.First, we introduced a near optimal way to discretize the larger isotope effect into smaller ones, and showed that the statistical error of the resulting method was quite close to the corresponding error exhibited by thermodynamic integration.Second, we showed how statistical convergence of the method could be improved by combining it with the stochastic change of mass.We referred to the resulting method as the "stochastic direct estimators." We also observed that simple, original application of the mass-scaled direct estimators proposed in Ref. 3 performed well for isotope effects of surprisingly large magnitudes.These results, combined with our discussion of the method's convergence and divergence in Appendix B, indicate that the applicability of the direct estimators could be broader than what had been reported before.In the end, the approach of Ref. 3 can be quite efficient for isotope effects much larger than unity, especially if the system in question allows to capitalize on atom interchangeability.Yet, for the largest isotope effects using stochastic direct estimators is still preferable, as indicated by our results for the deuterization of methane and methanium at lower temperatures.
The way we used mass-scaled normal modes to define mass-scaled λ-moves can be extended to any mapping used in targeted free energy perturbation 31 provided the mapping is non-local.In this context our results compare using the same physically-motivated mapping either to perform a free energy perturbation calculation (original direct estimators) or to define an analogue of our Monte Carlo procedure and combine it with thermodynamic integration or stepwise free energy perturbation.The quality of the mapping affects statistical errors of the first approach through dispersion of the free energy perturbation estimator; in the second approach estimator properties are less affected, but an inefficient procedure for sampling λ can increase correlation length of the samples.Our results for methane and methanium at lower temperatures illustrate how the second approach is less affected by the quality of the mapping.
Finally, let us mention that stochastic direct estimators can be combined with the Takahashi-Imada or Suzuki fourth-order factorizations [52][53][54][55] of the Boltzmann operator, which would allow lowering the path integral discretization error of the computed isotope effect for a given Trotter number P , and hence a faster convergence to the quantum limit.
(B6) Combining Eqs.(B2), (B5), and (B6), then summing the result over s leads us to B7) proving that in this case Z 0,1 sc is bound in a way that does not depend on P Appendix C: Optimal choice of mass discretization and reference masses for the stepwise application of direct estimators In this appendix we discuss choices of λ j and λ j that minimize the root mean square error of the direct esti-mator expression (17).In other words, we discuss the optional choice of the partition of the interval [0, 1] into subintervals [λ j−1 , λ j ] and the optimal choice of the reference masses (λ j ) in each of the subintervals [λ j−1 , λ j ].
The first part of the problem is the choice of λ that minimizes the statistical error of the ratio Z λ,1 sc (λ) / Z λ,0 sc (λ) .Estimating the root mean square error of this ratio analytically is problematic as the averages are obtained from the same Monte Carlo trajectory, and hence are correlated.Even if one could neglect the correlation between the two averages, the resulting estimate for an optimal λ value would be too complicated to be useful.Yet, we can suggest two rules of thumb based on an approximate behavior in the deep quantum regime because the most quantum degrees of freedom typically contribute the most to the statistical error.[Equation (A13) can be shown to imply this.]For simplicity let us consider a one-dimensional harmonic oscillator with force constant k, mass changed from m(0) to m(1) > m(0), and m(λ) given by the inverse square root interpolation formula (7).Assuming the averages of Z λ,1 sc and Z λ,0 sc to be approximately independent and to have the limiting behavior of Eq. (A3) at low temperatures allows us to write We also assume that Z λ,1 sc does not exhibit a divergent [as described in Appendix A], since it could only appear in this problem for very exotic mass differences [m(1) > 2m(0)] and a too sparse discretization into λ subintervals.In the β → ∞ limit, differentiating either term in (C1) with respect to λ and dividing the result by βdm(λ)/dλ leads to an expression whose magnitude in the β → ∞ limit will be mainly determined by the argument of the exponential.The minimum corresponds to derivatives of the two terms having the same magnitude to cancel out, which implies the arguments of the two exponentials being approximately equal.Straightforward algebra indicates the argument of the first exponent is larger than in the second one at λ = 1/2, while the inequality is reversed at λ = 1, implying that the minimum lies in the interval [1/2, 1], which is consistent with the optimal λ value found in test calculations we have performed with several harmonic systems.We have not observed a significant difference between statistical errors obtained for λ = 1/2 or λ = 1 in a wide range of situations; in the limit of large J, however, picking λ = 1/2 becomes numerically identical to performing thermodynamic integration with the midpoint rule, while choosing λ = 1 would correspond to the right hand rule, which should lead to worse performance.λ = 1/2 was therefore the choice used in this work.
To illustrate our estimate for the optimal λ, in Fig. 6 we plot root mean square error of Z λ,1 (λ) / Z λ,0 (λ) as a function of λ for the same harmonic system as in Subsection III B with β ω 0 = 32.Evidently, the optimal λ is in the interval [1/2, 1] and is quite close to 1/2.
We will now discuss the best way to partition a large isotope effect into smaller isotope effects, each of which can be evaluated with the original direct estimators.Assuming that the midpoint was used for an evaluation of an intermediate isotope effect Q(λ )/Q(λ ) one can obtain the following approximate upper bound for the relative statistical error in the low temperature limit In this approximation the problem of minimizing statistical error of a "stepwise" direct estimator isotope effect calculation is equivalent to minimizing a sum of positive terms whose product is fixed.Since the solution of the latter problem requires all terms of the sum being equal, the resulting rule of thumb is to pick the intermediate isotope effects approximately equal in magnitude.In the low temperature limit, and with mass interpolation (7), this is equivalent to having the intermediate isotope effects correspond to λ subintervals of equal size.To check this estimate we ran some test calculations for the same system as in Subsection III B with β ω 0 = 32 using direct estimators and J = 2 with different values of λ 1 .The resulting root mean square errors are plotted in Fig. 7; for this rather quantum harmonic oscillator root mean square error is indeed minimized if λ 1 is close to 1/2.factorization replace V with an effective potential V eff that, unlike V , depends on P .As for the Suzuki-Chin factorization, the matter is further complicated by the fact that the weight of this effective potential depends on the bead s at which it is evaluated.

Figure 1 .
Figure1.Isotope effect calculations in an eight-dimensional harmonic model from Ref.16.Results of thermodynamic integration (TI), stochastic thermodynamic integration (STI), stepwise direct estimators (DE), stochastic direct estimators (SDE), and original direct estimator (ODE) approach from Ref. 3 are compared with exact analytical values (for the same finite Trotter number P ).The proposed method is SDE, while DE already provide some improvement over ODE.Panels (a)-(c) show the temperature dependence of (a) the isotope effect, (b) its integration errors for TI, and (c) its statistical root mean square errors (RMSEs).Panels (d)-(e) display the dependence of integration errors (for TI) and RMSEs on the number J of integration subintervals at a temperature given by β ω0 = 32.

Figure 2 .
Figure 2. CD4/CH4 isotope effect (IE) computed with several methods.Panels (a)-(c) show the temperature dependence of (a) the isotope effect, (b) its integration errors for thermodynamic integration, and (c) its statistical root mean square errors (RMSEs).

Figure 3 .
Figure 3. Impact of nonergodicity appearing in shorter calculations of the CD4/CH4 isotope effect (IE) at T = 200 K. Panel (a) presents the convergence of the IE as a function of the number of potential energy evaluations needed for calculations with different simulation lengths, while panel (b) shows the corresponding error of the IE (in logarithmic scale) relative to a converged stochastic direct estimators (SDE) result.The horizontal line in panel (a) labeled "SDE (converged)" is the converged SDE result ln(IESDE) = 19.785from Fig. 2 and Table I; the same value was used as a reference in panel (b).

Figure 5 .
Figure 5. Impact of nonergodicity appearing in shorter calculations of the CD + 5 /CH + 5 isotope effect (IE) at T = 200 K.For details, see the caption of Figure 3.The horizontal line in panel (a) labeled "SDE (converged)" is the converged SDE result ln(IESDE) = 22.674 from Fig. 4 and TableII; the same value was used as a reference in panel (b).

Figure 8 .
Figure 8.Comparison of exact analytical values of the integral discretization error of (a) the partition function Q and (b) isotope effect with their estimates [Eqs.(E2) and (E9)], which were evaluated either analytically or numerically using the estimator (E7).The figure shows the dependence of the discretization error on the Trotter number P in a one-dimensional harmonic oscillator with β ω = 8, and the isotope effect corresponds to the doubling of the mass.

Table II .
Values of the CD + 5 /CH + 5 isotope effect (IE) obtained with several different methods.For details, see the caption of TableI.The values of P for a given temperature were the same as those for methane listed in TableI.
2,51,56,57 SUPPLEMENTARY MATERIAL Section I of Supplementary Material describes an alternative procedure, in which several isotope effects are evaluated simultaneously.Examples include four isotope effects of the form CH 4−x D x /CH 4 (x= 1, . . ., 4) and five isotope effects CH 5−x D + x /CH + 5 (x = 1, . . ., 5).Section II analyzes effects of nonergodicity on isotope effect calculations in the harmonic model.