Interaction Picture Density Matrix Quantum Monte Carlo

The recently developed density matrix quantum Monte Carlo (DMQMC) algorithm stochastically samples the N -body thermal density matrix and hence provides access to exact properties of many-particle quantum systems at arbitrary temperatures. We demonstrate that moving to the interaction picture provides substantial benefits when applying DMQMC to interacting fermions. In this first study, we focus on a system of much recent interest: the uniform electron gas in the warm dense regime. The basis set incompleteness error at finite temperature is investigated and extrapolated via a simple Monte Carlo sampling procedure. Finally, we provide benchmark calculations for a four-electron system, comparing our results to previous work where possible.


I. INTRODUCTION
The overwhelming majority of electronic structure studies of matter have been conducted at zero temperature. This state of affairs has been justified as typically one is interested in the low-energy properties of condensed matter systems or the room-temperature properties of chemical systems. Due to recent experimental advances, however, there has been renewed interest in the thermodynamic properties of electron plasmas such as those found on the pathways to inertial confinement fusion 1 , in the interiors of Jupiter and other gas giants 2 , at the surfaces of solids after laser irradiation 3 , and in plasmonic catalysis 4 .
Of fundamental importance to the theoretical understanding of these systems is the uniform electron gas (UEG), which has been pivotal to the development of modern quantum mechanical approaches to the lowtemperature chemistry and physics of molecules and solids 5,6 . At finite temperatures the UEG can be described in terms of the density parameter, r s , and the degeneracy temperature, Θ = T /T F , where T F is the Fermi temperature. When both r s and Θ are of order one the system is said to be in the warm dense regime, with quantum, thermal and interaction effects all being important. It is in this region where analytical methods, such as Ref. 7, are least useful and computational approaches such as quantum Monte Carlo are most important.
In a pioneering study, Brown et al. 8 provided the first accurate data for the UEG in the warm dense regime using the restricted path integral Monte Carlo (RPIMC) method 9 , from which the first accurate parameterizations of finite-temperature density functionals have been produced 10,11 . Recent configuration path integral Monte Carlo (CPIMC) results 12 have called into question the validity of the restricted path approximation used in Ref. 8, with significant disagreement between the two methods at high densities and low temperatures. Simulations using a third method such as DMQMC would help to resolve this discrepancy 13 .
The exact thermodynamic properties of the UEG can be determined from the (unnormalized) N -particle density matrixρ where β = 1/k B T . A direct evaluation of Eq. (1) requires all eigenvalues and eigenstates ofĤ to be known and is an impossible task in all but the simplest of systems. The infinite basis set limit can be approached by only including determinants that can be constructed using a finite basis set of M plane waves, reducing the problem to the diagonalization of an M N × M N matrix. Even this problem is only tractable for very small M and N .
Recently, Booth et al. have shown, through the development of the full configuration interaction QMC (FCIQMC) method, that full configuration interaction (FCI) quality results can be obtained at zero temperature with no prior knowledge of the nodal structure of the wavefunction 14 . FCIQMC also often offers a substantially reduced memory cost compared to conventional FCI calculations. This has most dramatically been seen in the case of the UEG, where a space of O(10 108 ) Slater determinants was successfully sampled using the initiator adaptation of FCIQMC 15,16 . Subsequently, three of us used these ideas to develop the finite-temperature analogue of FCIQMC: density matrix quantum Monte Carlo (DMQMC). This was applied to the Heisenberg model as a proof of concept 17 , but has not previously been used to study more realistic systems.
Here we show how DMQMC can be applied to fermionic systems, starting with the UEG, thus opening the door to providing accurate, unbiased thermodynamic arXiv:1506.03057v2 [cond-mat.str-el] 14 Oct 2015 results for problems of chemical interest. We note that CPIMC 18 and Krylov-projected FCIQMC 19 will likely be complementary approaches to both DMQMC and PIMC in the treatment of real systems.
In Section II we outline the DMQMC method and show how moving to the interaction picture provides substantial improvements in statistical accuracy when treating weakly-correlated systems. In Section III we discuss basis-set extrapolation at finite temperatures in detail. In Section IV we present benchmark results for a four-electron system across the relevant parameter space, comparing, where possible, to previous results. Finally, in Section V, we outline the limitations and future prospects of DMQMC.

II. DENSITY MATRIX QUANTUM MONTE CARLO
In this section we briefly outline the DMQMC algorithm; a more complete description is available in Ref. 17. DMQMC is applicable to any Hamiltonian but here we focus on the specific example of the UEG. We then explain how to sample the density matrix in the interaction picture, show that this overcomes sampling issues found when treating weakly-correlated systems, and introduce a simple Monte Carlo scheme for sampling noninteracting density matrices in the canonical ensemble. Hartree atomic units are used throughout.

A. Theory
The unnormalized density matrix in Eq. (1) obeys the symmetrized Bloch equation which can be solved using a simple Euler update scheme: In DMQMC, Eq. (3) is solved stochastically by evolving a population of signed 'psi-particles', or 'psips', in a discrete operator space made of tensor products of Slater determinants. To this end, we rewrite Eq. (3) in matrix form: where ρ ij = D i |ρ|D j , |D i is a Slater determinant in the finite but large basis set, S is a variable shift introduced to control the psip population 14,20 , and the last line defines the update matrix T ij = −(H ij − Sδ ij ).
2. Psips on the density matrix element ρ ij clone/die, whereby their population is increased or decreased, with probability p d (ij) = ∆β 2 |T ii + T jj |. The population is increased if sign(T ii + T jj ) × sign(ρ ij ) > 0 and decreased otherwise.
Additionally, we annihilate psips of opposite sign on the same density matrix element to improve the efficiency of the algorithm and overcome the Fermion sign problem 14,21 . We note that, unlike PIMC methods where the quality of averages depends on the average sign of the sampled paths 8,18 , in FCIQMC and DMQMC, we require a system specific and basis set dependent critical psip population to obtain correct low temperature and ground state estimates 14,[21][22][23] .
The simplest starting point for a simulation is at β = 0, where the density matrix is the identity and can be sampled by occupying diagonal density matrix elements with uniform probability. A simulation then consists of propagating the initial distribution of psips with the rules described above to a desired value of β. Estimates for thermodynamic quantities can be found by averaging over many such simulations, a single one of which we call a 'β-loop'.
In Fig. 1(a) we see that a direct application of DMQMC to the dense UEG can result in estimates for the internal or total energy that are too high in the intermediate temperature range. This can be understood by noting that, at r s = 1, the ground state of a few-electron UEG system is well described by a single (Hartree-Fock) determinant, |D 0 . The probability of initially selecting this determinant, however, is M N −1 , which rapidly approaches zero as M increases. If, by chance, the Hartree-Fock determinant or another low-energy determinant is sampled at β = 0, the population of psips arising from that low-energy determinant will dominate the simulation, but most simulations miss the low-energy part of the Hilbert space altogether. As shown in Fig. 1(b), this sampling problem reduces as the number of β-loops (or the population of psips per β-loop) increases, thus increasing the chance of sampling the low-energy space; however, this approach soon becomes impractical. Fig. 1(a) shows that, by moving to the interaction picture, we can effectively solve this sampling issue and regain FCI-quality thermodynamic averages.

B. Moving to the Interaction Picture
There are two sampling issues present when treating real systems; the distribution of weight in the density matrix changes rapidly as a function of β, and important determinants are rarely present in our initial configurations. Feynman originally pointed out that if we can writeĤ =Ĥ 0 +V , whereV is small compared toĤ 0 , then the quantity e βĤ 0ρ will be a slowly-varying function of β 24 . This does not solve the issue of selecting important determinants, so we define an auxiliary matrix which has the propertieŝ From Eq. (7) above we see that, by working with the operatorf , we can start the simulation from e −βĤ 0 instead of the identity. For most weakly-correlated systems this should provide a good first approximation to the distribution of weight in the fully interacting density matrix. Differentiating Eq. (6) with respect to τ we find where we have used the usual definition of an operator in the interaction picture: In practice the exponential factors appearing in Eq. (10) are time-consuming to evaluate and we prefer to work with Eq. (9). Since we chooseĤ 0 to be diagonal in our determinantal basis set of Hartree-Fock eigenstates, the only modification to the original DMQMC algorithm is This scheme, which we dub interaction-picture DMQMC (IP-DMQMC), has the added benefit that there is typically no death along the diagonal as long as H 0 ii ≥ H ii . This overcomes a related issue in DMQMC simulations of large systems, whereby the weight of walkers on the diagonal decays nearly to zero with β; this was previously remedied with the use of importance sampling 17 .
Eq. (8) shows that IP-DMQMC only samples the correct distribution at τ = β so that, unlike in DMQMC, where the whole temperature range is sampled in one simulation, separate simulations are required for each β value. As in DMQMC, estimates for observables require averaging over multiple independent simulations.
Referring back to Fig. 1 we see that working in the interaction picture effectively eliminates this sampling issue, with the correct total energy being reproduced using small numbers of psips and β-loops.

C. Sampling the initial condition
The choice ofĤ 0 is somewhat arbitrary, but it should allow for an efficient sampling off (τ = 0) and this is most easily achieved ifĤ 0 is non-interacting. In principle, any initial density matrix can be sampled using the Metropolis algorithm 25 , but we have found this approach problematic due to the long equilibration times required at low temperatures and in large basis sets. An alternative method, which is free from such issues, is to use knowledge of the grand canonical density matrix corresponding toĤ 0 and sample this in such a way that the desired, canonical, distribution is reached.
Consider the grand canonical density matrix iĉi is some non-interacting Hamiltonian whose single-particle eigenvalues ε i are known,N is the number operator and µ is the chemical potential, which can be determined from the implicit relationship where we have identified the usual Fermi-factor, p i . The grand canonical partition function, Z GC , can be written as (15) where {n N i } denotes a set of occupation numbers such that i n i = N and n i ∈ {0, 1} for fermions.
The probability of selecting a particular set {n N i } is However, we wish to generate determinants in the canonical ensemble where the correct probability is and We see that Thus, by independently occupying orbitals with probability p i and then discarding those configurations with N = N , we attain the correct proportionality factor Z GC /Z C . Only about one in √ N of the configurations sampled has the right value of N , but the sampling process is so fast that very little computer time is required for any system size in the reach of many-body simulation methods. The chemical potential can be obtained by numerically inverting Eq. (13) in the appropriate finite basis set, a procedure we carry out using Scipy 26 . A demonstration of the above procedure is given in Fig. 2, where we see that Ĥ is indeed a slowly varying function of τ and that the correct estimate is reproduced at τ = β.
Finally, we note that any diagonal density matrix can be obtained by reweighting the configurations which result from the above sampling procedure as where E new and E old are the new and old total energies of a given configuration {n N i }, respectively.

III. BASIS SET EXTRAPOLATION
To treat the UEG using DMQMC we need to work in a finite basis set of M plane waves; thermodynamic quantities will therefore need to be extrapolated to the M → ∞ limit. At T = 0 it has been found 16 obtained using an M −1 extrapolation, although in principle this relationship only holds for an unpolarized system; we have found that, for polarized systems, extrapolating with M −5/3 results in a better fit 30 , consistent with similar observations by other authors 31,32 . Based on the analysis presented here, Schoof et al. 12 have used the M −5/3 extrapolation with CPIMC. In addition, we find that the convergence of the total energy is strongly dependent on temperature.
At T > 0 there is a competition between the convergence of the kinetic and potential energies with M . To investigate this further we focus on a two-electron spinpolarized system, which can be solved exactly using diagonalization in large basis sets. In Fig. 3 we see this competition between energy scales: the total energy initially increases rapidly with basis-set size before appearing to saturate. As the size of the basis set is further increased, a slight reduction in the total energy is observed, with the residual error apparently proportional to M −5/3 (see inset of Fig. 3). In this regime the convergence of the total energy is dominated by exchange and correlation effects.
The initial increase of the total energy with respect to M at non-zero temperatures can be understood by looking at the non-interacting total energy as a function of basis-set size, which is most easily analyzed in the grand canonical ensemble. The non-interacting basis-set error is where k c is the plane-wave cutoff. For ε c = 1 2 k 2 c 1, this can be approximated as where we have used p k ≈ e −β(ε k −µ) for ε k µ. Hence if ε c β −1 so that (ε c + x) 3/2 ≈ ε 3/2 c everywhere e −βx is significant. It then follows that the leading-order correction is From Eq. (25) we see that the kinetic energy begins to converge exponentially once where V is the simulation-cell volume. In practice, we find that for large Θ the kinetic energy and hence the total energy converge quite slowly. This is an issue for DMQMC simulations as the cost of a calculation increases with basis-set size.
We can mitigate some of these issues by instead extrapolating the temperature-dependent correlation energy, E c (β, M ) = E(β, M ) − E HF (β, M ). The infinite basis-set total energy can then be reconstructed as E(β, M = ∞) = E HF (β, M = ∞) + E c (β, M = ∞). We calculate the 'Hartree-Fock' energy, E HF = Ĥ HF , using the density matrix 33 where E HF i = D i |Ĥ|D i and the sum runs over all determinants in the basis set. E HF (β) can be found using the sampling procedure outlined in Section II C, i.e., where w(i) = e −β(E HF i −E 0 i ) and p(i) = Z −1 0 e −βE 0 i . Thus, by generating determinants as described in Section II C and reweighting them using w(i), we can instead samplê ρ HF and, as a result, estimate E HF as desired. In Fig. 4 we show the convergence of E HF (β, M ) as a function of basis set for a four-electron, spin-polarized system at r s = 1 and Θ = 4. Note the large basis-set sizes required to converge the total energy to within statistical error bars. Fig. 4 also shows various other 'non-interacting' or 'mean-field' energy estimates as functions of M . Any of these could in principle be subtracted from E(β, M ) to define a correlation energy, but the quantity defined by subtracting E HF (β, M ) extrapolates most smoothly to the infinite M limit. The non-interacting grand canonical energy, Ĥ 0 GC , is significantly larger than the canonical estimates. Fig. 5 shows how E c (β, M ) depends on M at a number of different temperatures. For small basis sets E c shows a power-law decay with M , but this ceases for large enough M and the energies begin to increase again. We believe that the increase is due to kinetic effects that are not captured in the non-interacting expression we subtract. The value of M at which the correlation energy begins to increase again corresponds to the onset of the powerlaw convergence of the total energy observed in Fig. 3. This non-variational behavior of the internal energy with respect to M is not surprising as, at finite temperatures, it is the free energy that satisfies a variational principle.
We can estimate E c (β, M = ∞) by taking the value calculated with the largest basis set after the minimum is reached. This will in general over estimate E c (it will be too negative) but the remaining discrepancy is typically smaller than the stochastic error bar. The systematic errors left after extrapolating E c (β, M ) in this manner are well within chemical accuracy (∼ k B T ) and can be orders of magnitude smaller than those introduced by a direct extrapolation of the total energy.

Computational Methods
All calculations discussed in this paper were performed using the HANDE code 34 . Unless otherwise stated, the QMC calculations used real amplitudes [35][36][37] to sample the density matrix, which improves stochastic efficiency compared to integer weights. The full data set is available in the supplementary material 38 .

A. Four electron uniform electron gas
Using the procedures outlined above we are now in a position to provide exact benchmarks for the UEG in small simulation cells across the relevant parameter space. In this first study we focus on the four-electron spin-polarized system, which is the smallest non-trivial system and one for which there already exist benchmark calculations 29 . All energies contain the Madelung contribution 39 where appropriate. As a first step we compare our four-electron IP-DMQMC results to FCI results in small basis sets and see perfect agreement across the whole temperature range (Fig. 6).
We have extended these results to basis sets far beyond the reach of conventional full diagonalization procedures; the largest space sampled here contains approximately 10 22 density matrix elements. We used the initialization procedure outlined in Section II C and the free-electron Hamiltonian forĤ 0 for r s ≤ 1; for r s > 1 we found it advantageous to use Hartree-Fock density matrix defined in Eq. (26). The calculations were initialized with 10 3 -10 7 psips and the results averaged over 100-5000 simulations, each using a different random number seed. Time steps ∆τ ranging from 0.01/E F to 0.001/E F were used, with a smaller time-step required at lower r s ; the values chosen were small enough that we could resolve no time-step error within the statistical errors. Each (r s , Θ, M ) calculation was typically run for 2 hours on 48 cores with a total computational cost of approximately 80000 core hours. The separate calculations of E HF required 9000 core hours.
In Figs. 7 and 8 we show the convergence of the IP-DMQMC results with basis set at low and high temperatures, respectively. We note that the behavior found in the two-electron system is also found in the four-electron system; in particular the non-trivial dependence of the correlation energy on M is reproduced. We find that a direct extrapolation of the total energy with respect to M is best for Θ ≤ 0.25 as it is here where kinetic effects are minimal and there is a clear trend in the total energy for the basis set sizes considered. The procedure outlined in Section III is best suited for temperatures above this, becoming increasingly useful for Θ ≥ 2 as more highly excited states become accessible, requiring prohibitively large basis sets for a direct extrapolation to be possible. In between these too regimes both methods produce statistically identical results 38 . Fig. 9 summarizes our results and shows perfect agreement with the available CPIMC data from Ref. 29. Further results at higher temperatures and other r s values are available in tabular form in the supplementary material and again agree with the available CPIMC results. the infinite-basis-set limit carried out using a least-squares fit as implemented in Scipy 26 . At this low temperature, a direct extrapolation of the total energy works better than the correlation-energy extrapolation technique discussed in Section III.

V. DISCUSSION AND CONCLUSIONS
In this paper we have demonstrated how DMQMC can be applied to realistic systems. By moving to the interaction picture we have removed sampling issues found when treating weakly-correlated systems with large basis sets.
We have examined in detail the convergence of the total and correlation energies with respect to basis-set size M and temperature using a system accessible to FCI calculations. We found that, in general, these quantities exhibit a non-trivial dependence on M attributable to the competing energy scales present. By developing a simple Monte Carlo sampling scheme, we showed that it is possible to reduce the error in extrapolating these quantities to the complete basis-set limit by at least an order of magnitude at high temperatures. We believe this analysis and developments will be useful when treating molecular systems at finite temperatures. In addition, our approach to calculating the 'Hartree-Fock' energy should a useful first approximation when providing accurate benchmark calculations for systems away from the thermodynamic limit and in analyzing single-particle finite-size effects for the UEG at non-zero temperatures.
Using these developments we have reproduced the fourelectron CPIMC benchmarks of Ref. 29 and provided results at higher temperatures. We hope that these smallsystem results will aid the analysis of the apparent discrepancies between other QMC methods for larger system sizes 40 and serve as benchmarks for other QMC methods based in configuration space.
Whilst the results presented here are for much smaller systems than those accessible by RPMIC and CPIMC, DMQMC provides access to exact finite-temperature data for a given basis set. The main limitation on the system size is the critical population (determined by the plateau height 17,21 ) required to sample the density matrix. There are several grounds for optimism. The sign problem is much weaker at higher temperatures, implying that larger systems will be accessible albeit over a restricted temperature range. Further, in our previous study 17 we found that the plateau height in DMQMC, which is a measure of the strength of the sign problem, was roughly the square of that in FCIQMC, which would suggest a rather limited utility of our method. Remarkably, however, we have found that, for the UEG at various r s values and for some other models, the plateau height is only a small multiple of the FCIQMC plateau height. For example, for N = 4, r s = 5 and M = 1045, the FCIQMC plateau height is roughly 8000 psips in comparison to 90000 in IP-DMQMC at Θ = 0.0625. Finally, there is evidence that methodological developments will increase the scope of systems that can be treated with DMQMC. Given that the initiator approximation 15 enabled FCIQMC to be applied successfully to large UEG systems across a range of densities 16,27 , we are confident that, using similar approximations and developments in importance sampling, DMQMC will become a competitive method in treating degenerate Fermi systems.