RPA natural orbitals and their application to post-Hartree-Fock electronic structure methods

We present a method to approximate post-Hartree-Fock correlation energies by using approximate natural orbitals obtained by the random phase approximation (RPA). We demonstrate the method by applying it to the helium atom, the hydrogen and fluorine molecule, and to diamond as an example of a periodic system. For these benchmark systems we show that RPA natural orbitals converge the MP2 correlation energy rapidly. Additionally we calculated FCI energies for He and H$_2$ , which are in excellent agreement with literature and experimental values. We conclude that the proposed method may serve as a compromise to reach good approximations to correlation energies at moderate computational cost and we expect the method to be especially useful for theoretical studies on surface chemistry by providing an efficient basis to correlated wave function based methods.


I. INTRODUCTION
In ab initio quantum chemistry and computational materials physics there exists a well know trade off between accuracy and computational cost. Solving the many electron Schrödinger equation is an exponentially hard problem with respect to the number of electrons and therefore solving it exactly is out of scope for most practically relevant systems. Therefore approximations with varying degrees of accuracy are employed. While mean field methods like Hartree-Fock (HF) and density functional theory (DFT) possess computationally favorable scaling of N 3 with the number of electrons, they sometimes lack accuracy and fail to describe certain processes. For example describing the dissociation of simple molecules like H 2 already poses a challenge for such methods. The properties that are not well described by the HF approximation are attributed to so called electron correlation and there exists a wide variety of electronic structure methods all of which attempt to take the correlation effects into account in an approximate manner. Most of them have in common that one can increase the accuracy systematically at the expense of computational resources, e.g. Møller-Plesset perturbation theory (MPn), coupled cluster (CCSD,CCSDT) and configuration interaction (CI) methods.
DFT and HF calculations can provide a useful single electron basis as a starting point for correlated methods. However the correlated methods have in common that they usually involve virtual orbitals, i.e. eigenstates of the single-electron Hamiltonian from the underlying HF or DFT calculation that are unoccupied in the respective underlying calculation. To obtain the energy exactly (within the approximation of the respective correlated method) a complete basis set of single particle states is required. The convergence behavior with respect to the number of single particle states depends of course on the a) Electronic mail: georg.kresse@univie.ac.at specific set of orbitals employed. Therefore one can in principle obtain results that are converged up to machine precision with a finite number of virtual states and ideally reduce the computational costs significantly by using a very small number of appropriate single particle orbitals.
The essence of the above statement is that, by choosing an appropriate set of single particle orbitals, one can extract the relevant information resting in the full many body wave function from a small number of single particle wave functions. The question remains how to choose an appropriate set. One possible answer is to use natural orbitals as they turned out to capture much of the relevant information required to converge correlation energies quickly 1,2 .
In this article, we will show how we can approximate natural orbitals by employing the random phase approximation (RPA) and how to utilize the approximate natural orbitals to obtain accurate correlation energies at moderate computational cost. We will demonstrate the method by showing the convergence behavior of the MP2 energy with respect to the number of RPA natural orbitals (RPANOs) for the following prototypical systems: the helium atom, the hydrogen molecule at various internuclear distances, the fluorine molecule at equilibrium distance and diamond as an example for a periodic system. In the case of He and H 2 we will also show results for FCI calculations using a RPANO basis and will furthermore compare the exact natural orbitals from FCI with the RPANOs.

II. THEORY
In second quantization notation, the one particle reduced density matrix (1-RDM) is with the usual field operatorψ. Natural orbitals {ϕ i } are just the eigenstates of the 1-RDM: The 1-RDM is the central object of reduced density matrix functional theory (RDMFT) 3 . In comparison with conventional DFT, RDMFT has the advantage that besides the Hartree energy, also the kinetic and the exchange energy are obtained exactly. Finding an efficient way to approximate the 1-RDM as well as natural orbitals can therefore provide an interesting pathway for RDMFT 4 . Though we did not pursuit it, we still want to note that the method described hereafter for approximating the 1-RDM within the RPA might also be interesting for research in the field of RDMFT.
We will now try to establish a connection between the 1-RDM and the RPA by using a Green's function formalism. The 1-RDM is obtained from the Green's function in the limit of small negative time: which can be easily seen from the definition of the Green's function whereT is the Wick time-ordering operator. In a system with N electrons, the independent particle Green's function G 0 corresponds to a ground state with a single Slater determinant |φ 1 · · · φ N and the 1-RDM reduces to The natural orbitals are then just the occupied independent particle orbitals {φ i }. This would be the situation for a DFT or HF calculation. To improve on the independent particle 1-RDM and to include "information" on the unoccupied virtual space, the Dyson equation for the Green's function can be used to write a perturbation series for the 1-RDM as a functional of the self-energy Σ and the the independent particle Green's function G 0 : In the imaginary time/frequency domain, the first order term n (1) is explicitly obtained as 5 In the method shown below we will use this first order correction to the density matrix. Furthermore, we will approximate the self-energy using the RPA so that we will ultimately obtain an approximate set of natural orbitals, the RPANOs, as eigenfunctions of the first order RPA density matrix

III. METHOD
Assuming that with only a few RPA natural orbitals one can span the relevant subspace of the one electron wave functions for advanced methods like MP2 or CI, we aim to accurately approximate electronic ground state energies of these methods using a small number of virtual states. Thereby we are hopefully able to reduce the computational cost in these calculations significantly. In the following subsection, we will give a step by step recipe for obtaining RPA natural orbitals in order to calculate approximate post-Hartree-Fock ground state energies (e.g. MP2, CC and CI) with a reduced number of virtual states.

A. RPA natural orbitals
In the first step one calculates the electronic ground state at mean field level (HF or DFT) in order to obtain a well converged set of one electron wave functions for the occupied space. These occupied orbitals fix the respective mean field Hamiltonian for the subsequent steps. In the second step "all" unoccupied orbitals are calculated by diagonalizing the one particle Hamiltonian in the entire underlying basis set. "All" means that the number of eigenfunctions of the Hamiltonian (i.e. orbitals) that are calculated is equal to the number of basis functions used. Specifically for a plane wave basis with a cut-off energy E cut this means that the sum of occupied orbitals N occ and virtuals N virt is equal to the number of plane waves below the cut-off, or in other words equal to the size of the underlying plane-wave basis set: where k is the Bloch wave vector.
In the next step, the independent particle Green's function G 0 is calculated from all orbitals to subsequently calculate the independent particle polarizability χ 0 , the RPA screened interaction W RPA , as well as the RPA self energy Σ RPA6-8 . Note that Σ RPA is related to the GW -approximation as the self energy in the first self-consistency cycle 5,9 Σ RPA (τ) = G 0 (τ)W 0 (τ).
Having the self energy at hand, one can use Eq. (9) to calculate the RPA density matrix. The density matrix n = n 0 + n RPA is than diagonalized in the subspace of the virtual states, i.e. the occupied states from the mean field calculation are retained. The occupied orbitals together with the eigenstates obtained from that subspace diagonalization form the RPANOs basis set. Keeping the occupied states has practical reasons. In this way, the RPANOs reproduce the underlying HF or DFT ground state energy exactly, while at the same time the virtual states are rearranged to yield a more compact set for higher hierarchy methods when the orbital set is truncated. RPANOs obtained by this approach therefore provide a versatile basis for different applications.
The RPANOs are sorted by descending occupation number, i.e. eigenvalue with respect to the density matrix, and to obtain a truncated basis with N basis functions, one simply uses the first N RPANOs according to this order. In some cases, it is now necessary to diagonalize the mean-field Hamiltonian in the subspace spanned by the truncated RPANOs basis. For example, to calculate the MP2 energy with 256 RPANOs, the HF Hamiltonian is diagonalized in the subspace spanned by the first 256 RPANOs. The subspace HF orbitals from that diagonalization span the same space as the truncated RPANOs and allow at the same time to use the Brillouin theorem as required by most MP2 implementations.
The computational cost for all the above calculations scales cubically in system size [6][7][8] , allowing the method to be used for relatively big systems (≈ 100 atoms) with moderate computing resources (≈ 100 CPUs).

B. FCI
The full configuration interaction method (FCI) solves the non-relativistic Schrödinger equation within a given basis set exactly. Therefore it provides a useful benchmark for other correlation-consistent methods. Its computational cost scales exponentially with the number of orbitals and the number of electrons, limiting its range of application. In order to obtain the FCI solution, one starts with the HF orbitals and constructs all possible Slater determinants with N occ occupied orbitals. By separating each Slater determinant in its spin up and spin down part, and by using Slater-Condon rules, it is possible to efficiently evaluate contractions of the form HC, where C is the vector of Slater determinant weights in the wave function. Following the method of Handy and Knowles 10 and using a Davidson-Liu algorithm to iteratively diagonalize the matrix H 11 , the FCI solution can be obtained. Using Eq. (1), the FCI 1-RDM can be easily calculated once the FCI wave function is known. Finally, the diagonalization of this density matrix gives the FCI natural orbitals.

IV. APPLICATION
All calculations regarding the RPANOs were performed using VASP [12][13][14] . To obtain the RPANOs we tried two different approaches, namely (restricted) HF as well as PBE as underlying single determinant method. However, since the observations for these two approaches were by and large very similar, we will focus on presenting the results that we obtained from using the HF orbitals as the underlying approach for the RPA calculations. For large scale applications on many atom systems, one would most likely though revert to the more efficient DFT approximation. The RPANO calculations are based on the RPA implementation described in previous work [6][7][8] . For the systems investigated in this work, 8-12 points on the imaginary time/frequency axis were sufficient. For the MP2 calculations in this work, we used the standard available in VASP 15,16 . The FCI calculations were performed with a development version, briefly described above. Details can be found in the master thesis of Sukurma 17 . Where not stated otherwise, the shown figures were obtained from calculations with a plane wave cut-off of 750 eV. Note that the orbitals shown below are approximations and do not fulfil the exact cusp condition due to the use of a finite sized plane wave basis set [compare Eq. (10)].
A. Helium atom The first benchmark system that we studied is the helium atom. The helium atom is an ideal benchmark system for many body approaches for various reasons. First, unlike other model systems (e.g. Jellium) it is a real many body system that can be studied experimentally. Though it is a relatively simple system, its theoretical description faces already prototypical many-body challenges and it is not possible to write down a closed form for its exact solution. Furthermore a numerically exact solution is available, providing a perfect ground to compare to the results of approximate many body approaches 18 .
The set-up was a 10Å × 10Å × 10Å unit cell with periodic boundary conditions. We used a plane-wave basis with different cut-offs up to 750 eV and a 1/r potential.
First, we examined the convergence behavior of the MP2 correlation energy with respect to the RPANOs and compared it with that for canonical HF-orbitals. The results are shown in Fig. 1. While the correlation energy using canonical HF orbitals converges very slowly and would require almost the full basis set (for this set-up with a 750 eV plane-wave cut-off, these are 4.7 · 10 4 orbitals), with only 128 RPANOs the correlation energy converged to < 1 meV above the exact value. Note that in this case, converged results means converged for a specific fixed plane wave basis set size. To obtain accurate MP2 correlation energies one would also need to check convergence with respect to the plane wave cut-off. However to show the qualitative difference between RPANOs and HF orbitals it is sufficient to inspect the behavior at a fixed cut-off, where the error in the MP2 energy related to the cut-off is much smaller than the difference between the MP2 energy at a few hundred RPANOs and the same number of canonical HF orbitals.
We repeated the procedure for FCI correlation energies using 10 to 60 orbitals, see Fig. 2. As was seen for the MP2 correlation energy, canonical HF-orbitals converge the energy very slowly. Again, to reach highly accurate results, one would also need to go to higher plane wave cut-offs, but for a qualitative comparison the chosen cut-off suffices for the same reasons as at the MP2 level. As a matter of fact, the ground state energy of an FCI calculation with a specific basis is always an upper bound for the real ground state energy. One can conclude from this, that the RPANOs are much better in spanning the relevant space of correlated single particle states than the HF-orbitals.
In order to obtain a more accurate approximation to the total energy of the helium atom, E tot = E HF + E cor , we first used a plane wave cut-off extrapolation for the FCI correlation and the HF energy 19,20 . The correlation energy was extrapolated from FCI values at plane wave cut-offs of 474 eV, 621 eV  and 750 eV. The HF energy was extrapolated separately from plane wave cut-offs up to 6000 eV in a 16Å × 16Å × 16Å unit cell in order to reach more accurate predictions. This procedure is justified, because the correlation energy is less than 1.5% of the HF energy. For the plane wave extrapolation of the energies we used a linear regression with respect to the inverse number of plane waves, N −1 PW , which is itself propor- This value is the best approximation we can achieve for a fixed number of truncated RPANOs. However, the value for the cut-off extrapolated correlation energy still depends on the number of RPANOs used in the FCI calculation. Therefore, to obtain a highly accurate corre- lation energy, we performed an orbital-set extrapolation on top of the cut-off extrapolation. To achieve this, we calculated cutoff extrapolated FCI correlation energies as described above with different numbers of RPANOs (10,20,...,60). These cutoff extrapolated values were extrapolated by another linear regression, this time with respect to the inverse number of RPANOs used in the respective FCI calculations, By doing so, we obtained an approximate value for the FCI total energy of −79.019 eV, which differs by less than 1 meV from the literature value from the basis set extrapolation with Gaussian cc-pVxZ basis sets 18 . The extrapolated values for the two methods are therefore well within the error bars of each other. Another exact value from Hylleraas-like calculations is −79.014 eV 21 ; our estimation is around 5 meV bellow that. These results are summarized in Fig. 3. Finally, we compared the shape of the natural orbitals along an axes through the nucleus of the atom. In Figs. 4-6 we show the 1s and 2s orbitals of helium for the different methods we employed. While the occupied 1s orbital is well described in the mean field methods, the 2s orbitals for the different methods vary significantly. This is not very surprising as the HF/DFT states are not computed as eigenstates of the density matrix, but stem from a mean field Hamiltonian. In fact, by construction, the virtual HF/DFT states are degenerate eigenstates to the HF/DFT density matrix with eigenvalue 0. This is easily seen by Eq. (2) and (5), since the mean field single particle states are orthogonal to each other. Furthermore, from Fig. 6 we can deduce that the 2s FCI natural orbital is already converged with as little as 10 RPANOs, but far from converged with 60 HF orbitals. This substantiates the results from Figs. 1 and 2 that canonical HF orbitals calculated from a plane wave basis are impractical for correlated calculations, which is cured by introducing RPANOs.

B. Hydrogen molecule
The second benchmark system investigated is the hydrogen molecule. Though H 2 is the simplest molecule, it's exact quantum mechanical description already shows prototypical challenges. Specifically the dissociation of the two hydrogen atoms is challenging from a theoretical point of view. In the dissociation limit an accurate wave function based method should yield a wave function corresponding to two separate hydrogen atoms. The restricted HF method, however, yields an inadequate dissociation limit with an energy substantially above the correct limit due to a flawed mixture of single electron states. Conversely, the spin-unrestricted HF (URHF) method yields a solution with an up electron on one site and a down electron on the other site, which is a spin contaminated singlet. The true singlet would have an equal expectation value for up and down electrons on both sites and is a mixture of (at least) two Slater determinants. The underlying problem is that a single determinant description is insufficient to describe the dissociation process. Similar problems exist in DFT 22 . Therefore the hydrogen molecule is, despite being seemingly simple, an interesting benchmark system for correlated electronic structure methods.
The computational set-up was again a 10Å × 10Å × 10Å unit cell with periodic boundary conditions. We used a planewave basis with different cut-offs up to 750 eV. The internuclear distance was varied between the equilibrium distance of 0.74 Å and 5.0 Å.
As for the helium atom, we examined the convergence behaviors of the MP2 as well as the FCI correlation energy with respect to the RPANOs and compared them with convergence for HF-orbitals. The results of the calculations are shown in Figs. 7 and 8. Similarly to the helium atom, the correlation energies converge very slowly with an increasing number of HF orbitals. Using RPANOs, we only need a few hundred basis functions to obtain converged results for the MP2-energy. The statements on the cut-off convergence in the previous section on the helium atom apply here as well.
For the FCI total energies, a cut-off extrapolation was done in the same fashion as for the helium atom according to Eq. (12) -from cut-offs up to 6000 eV for HF, and from cut-offs of 474 eV, 621 and 750 eV for the FCI correlation energy. Additionally we also performed an orbital set extrapolation with respect to the number of RPANOs, compare Eq. (13). In order to compare our results with experimental energies, we calculated the dissociation energy by subtracting twice the URHF energy for the isolated atom obtained with the same procedure (cut-off extrapolation), from the extrapolated FCI total energy at equilibrium distance (0.74 Å). An appropriate experimental value to compare our ab-initio Born-Oppenheimer approximation result with, was obtained by adding the experimental vibrational zero point energy (E ZPE ) 23 to the experimental dissociation energy D 0 24 , Our extrapolated FCI binding energy (−D e ) is −4.749 eV, which is 1 meV below the corresponding experimental value of −4.748 eV. Additionally we performed FCI calculations with the Gaussian pp-VxZ basis sets, with x = 2, 3, 4, and extrapolated in order to obtain an estimation for the basis set limit of the H 2 binding energy in the Born-Oppenheimer approximation. The extrapolation was done along the lines of Ref. 18 , i.e. for the correlation energy we extrapolated according to while for the HF energy we fitted The extrapolated energy from the pp-VxZ basis sets was −4.755 eV, 6 meV below the value for RPANOs and 7 meV below the experimental value. These results are summarized in Fig. 9.  Again we also inspected the shape of the orbitals, this time along the bond-axis of the hydrogen molecule. In the equilibrium position, the first NO (i.e. the occupied 1σ state) is well approximated by the mean field methods 1σ orbital, see Fig. 10. But the second natural orbital is already very different from DFT and even more so from the HF 1σ * , see Fig. 11. This has the same reasons as for the He 2s orbital.
In Fig. 11, one can also see that the RPANO does not match the FCINO perfectly. But using just 10 RPANOs for the FCI calculation, the FCI natural orbital is already converged, see Fig. 12. This is a hint that, at least in this case, the RPANOs span the relevant space efficiently. In contrast, the HF orbitals are not even close to convergence with 60 orbitals.
Similar observations can be made for other cases, with varying internuclear distances and orbital levels as well as basis set sizes and underlying mean field methods. We have in-  vestigated a variety of cases, and we have picked a few illustrative examples in an effort to summarize our findings. One of these examples is the third orbital (2σ ) at an internuclear distance of 1.588 Å (= 3a 0 ). This example is interesting since in this case the RPANO noticeably differs from the FCINO, see Fig. 13. However, inspecting Fig. 14, one can see again that only very few RPANOs are necessary to describe the third FCINO correctly, even though the third RPANO does not match well with the third FCINO. Furthermore once again, the canonical HF-orbitals poorly describe the FCINO.
Finally, we show the orbitals at dissociation (5 Å). This time we compare the first two orbitals again. At this distance, the occupied orbitals vary more between the employed methods than they did at equilibrium distance, compare Fig. 10 and  15. By construction the occupied RPANO is equal to the occupied HF orbitals. It is noteworthy that in this case, the PBE orbitals and hence the PBE electron density, is much closer to the FCINO than the HF orbital is.

C. Fluorine molecule
In the two electron systems H 2 and He, the direct and exchange contribution to the MP2 correlation energy are related by a simple factor of -2, i.e. E (2) x . As Σ RPA consists of an infinite series of bubble diagrams (χ 0 ), it intrinsically contains the direct part of the MP2 self energy Σ (2) d , but also accordingly models the exchange part Σ (2) x . For these two reasons, we expected the RPANOs to efficiently span the relevant subspace for MP2 calculations in H 2 and He. However, for a multi electron system like the fluorine molecule, the simple relation between direct and exchange part of the MP2 energy is no longer valid and it is therefore not clear from the outset that RPANOs will converge the MP2 energy quickly. Thus we investigated the convergence behavior for the direct (MP2D) and exchange (MP2X) contribution to the MP2 correlation energy separately and compared again the performance of RPANOs with HF orbitals. For these calculation, we employed the PAW method 25,26 and used again a 10Å × 10Å × 10Å unit cell and a plane wave cut-off of 750 eV. We observed that for fluorine, the exchange contribution to the MP2 correlation energy converges rapidly and in the same manner as the direct contribution, see Fig. 17. This is a strong indication that the proposed method, using RPANOs to construct orbitals for more expansive correlated methods, is straightforwardly applicable to complex, multi electron systems.

D. Diamond
The fourth benchmark system, diamond, was chosen to demonstrate the method on a simple periodic multi electron system. Fig. 18 shows that the convergence behavior of the MP2 energy is slightly improved by the use of RPANOs, but the effect is much smaller than in the molecular systems.  Since we use the primitive cell, the number of plane waves in the full basis set is much smaller than it was for the molecules with large vacuum regions between the periodic images -in this case ≈ 400 for diamond vs. ≈ 47000 for H 2 . In this case, between 200-300 RPANOs were sufficient to reach convergence, similar to the numbers for the hydrogen molecule. Thus the improvement being small is not caused by an underperformance of RPANOs, but rather due to a comparably small number of orbitals in the full basis set.

V. CONCLUSION
We have tested the efficiency of RPANOs for MP2 and FCI calculations and found that for atoms and molecules they can drastically reduce the number of orbitals that are necessary to reach converged correlation energies in comparison to canonical HF or PBE orbitals. In the present implementation, the calculation of RPANOs scales only cubically with system size. The compute cost of canonical MP2 or FCI calculations increases with the fifth power (MP2) or exponentially (CI) with respect to the number of orbitals. Thus the compute cost for the preceding compression of the space of orbitals will generally be small compared to the savings gained in the final accurate correlated calculations.
An important finding of the present work is that RPANOs yield a set of orbitals which allows to converge the correlation energy rapidly, even in cases, where the RPA itself is most likely not very accurate. Typical examples for such situations are bond breaking and bond making, in this study exemplified for the case of bond dissociation of H 2 . Specifically, upon dissociation, we found cases where the first few RPANOs deviate significantly from the natural orbitals determined from the FCI correlated density matrix. Regardless of this difficulty, when the correlation energy is expanded in an increasing set of RPANOs, rapid convergence with the number of RPANOs is observed even in these "difficult" cases.
The present method is especially useful for systems comprised of large vacuum regions where the basis set size becomes unmanageable when plane waves are used (we are talking of ten thousands of plane wave orbitals). In fact, some of us have already applied a preliminary implementation of the algorithm presented here to calculate RPA natural orbitals and subsequently solve the BSE equations in this smaller basis. In this way, we were able to determine accurate quasiparticle energies in the GWΓ approximation for molecules 27 . Apart from molecular systems, open structures (zeolites) and surface science studies with large vacuum regions are likely to be an interesting and promising field of application for the RPANO method. Recall that coupled cluster methods scale at least with the fourth order of the number of virtual orbitals. If one is capable to compress the number of virtual orbitals by say a factor 2-3, speed ups of one to two orders of magnitude can be expected. We also note that the RPA is expected to be already fairly accurate for the prediction of adsorption energies 28,29 , however, close to transition states we expect that methods beyond the RPA will be required. Our results for the bond dissociation in H 2 gives hope that the RPANOs will work also well in such challenging situations. Overall, the present implementation will greatly boost the applicability of plane wave basis sets allowing them to compete with the now omnipresent local Gaussian basis sets used in quantum chemistry.
Last not least, natural orbitals in periodic systems can be used as a stepping stone for "strongly" correlated calculations, such as dynamical mean field theory (DMFT) or density matrix renormalization group (DMRG) theory. There are already examples in literature indicating that such an approach might be at least competitive with the usual approach of maximally localized Wannier functions [30][31][32] , since natural orbitals with an occupancy far from zero or one, are likely to contribute most to the correlation energy. Thus, combining perturbative methods for weakly correlated orbitals with an accurate correlation method for the strongly correlated orbitals is an important future development to be pursued. 20