Surface science using coupled cluster theory via local Wannier functions and in-RPA-embedding: the case of water on graphitic carbon nitride

A first-principles study of the adsorption of a single water molecule on a layer of graphitic carbon nitride employing an embedding approach is presented. The embedding approach involves an algorithm to obtain localized Wannier orbitals of various types expanded in a plane-wave basis and intrinsic atomic orbital projectors. The localized occupied orbitals are employed in combination with unoccupied natural orbitals to perform many-electron perturbation theory calculations of local fragments. The fragments are comprised of a set of localized orbitals close to the adsorbed water molecule. Although the surface model contains more than 100 atoms in the simulation cell, the employed fragments are small enough to allow for calculations using high-level theories up to the coupled cluster ansatz with single, double and perturbative triple particle-hole excitation operators (CCSD(T)). To correct for the missing long-range correlation energy contributions to the adsorption energy, we embed CCSD(T) theory into the direct random phase approximation, yielding rapidly convergent adsorption energies with respect to the fragment size. Convergence of computed binding energies with respect to the virtual orbital basis set is achieved employing a number of recently developed techniques. Moreover, we discuss fragment size convergence for a range of approximate many-electron perturbation theories. The obtained benchmark results are compared to a number of density functional calculations.

However, despite the fact that molecular adsorption can often be understood as a relatively local process, it is challenging to take full advantage of locality and realize a computational model that requires a quantum mechanical description of only a small number of particles in the vicinity of the adsorption site without compromising accuracy. In this work we apply a recently presented quantum mechanical embedding approach based on a planewave basis set implementation to a prototypical molecular adsorption problem. We obtain highly accurate adsorption energies on the level of quantum chemical manyelectron theories up to coupled cluster with single, double and perturbative triple particlehole excitation operators * tobias.schaefer@tuwien.ac.at (CCSD(T)).
Here, we investigate the case of the adsorption of a single water molecule on graphitic carbon nitride (gC 3 N 4 ).
Graphitic carbon nitride is a promising compound for catalysis and energy storage [14].
It exhibits a high thermal stability and is easily processable from abundant materials. In addition, its band gap is suciently large to overcome the endothermic character of the watersplitting reaction pathway [5], which makes it interesting for applications in photocatalysis. This has also motivated theoretical studies of molecular adsorption processes [6,7]. An initial step in such a reaction pathway is the adsorption of a single water molecule on the surface.
Theoretical studies can provide estimates of the adsorption energy but have so far been limited to calculations on the level of DFT. This can partly be attributed to the fact that the two dimensional surface is corrugated and needs to be modeled using relatively big supercells, making it dicult to apply computationally more expensive but accurate methods, which are typically limited to unit cells containing only a few ten atoms. Therefore it can be assumed that embedding approaches, as described in the present work, help to remove limitations imposed by the simulation cell.
The theoretical study of the adsorption process of a single water molecule on a two dimensional material unveils strengths and weaknesses of many widely-used elec- ited to a few ten atoms only [810]. Here, we advance our recently published embedding strategy [11] which enables a systematic path to approach the accuracy of CCSD(T) for regional phenomena in large systems of more than thousand electrons or periodic supercells with edge lengths of more than 20Å.
Regional embedding techniques for rst-principles methods, in general, form an active research eld [1221].
This strategy was likewise adapted for many-electron correlation methods applied to molecules as well as periodic systems [11,2236]. However, regional phenomena in extended (periodic) environments exhibit long-range correlation eects, such as lattice relaxations or electron dispersion, which often exceed the size of a local fragment.   [40,41], as well as the many-electron correlation at the second-order Møller-Plesset perturbation theory (MP2) [42,43] and RPA [37] level of theory. The basis set is dened by the single cuto parameter ENCUT = 500 eV and leads to about 183 000 plane-wave basis vectors which corresponds to more than 600 basis vectors per occupied band. vasp makes use of the projector augmented wave (PAW) method [44] in combination with frozen core potentials, see Tab. I This decomposition is exact since E M F↔R is implicitly dened by the above equation, while all other quantities are explicitly dened. The contribution E M F↔R can be interpreted as the inter-fragment correlation. Embedding the high-level method M 1 into the low-level method M 2 (in short M 1 :M 2 ) can then intuitively be dened by Provided the low-level method M 2 can be applied to all electrons in the entire supercell, the presented embedding scheme accounts for all many-electron correlation eects, but at mixed levels of theory.
The actual restriction of a correlation method to a local fragment of the density is by no means trivial nor unique.
The following six steps explain our approach which involves an ecient compression of the unoccupied space and allows for a systematic convergence of the correlation energy with respect to the size of the fragment F.
The expected Brillouin zone integration [50] is neglected here, since we currently aim for large supercells using a Γ-point-only sampling. The unitary matrix u ij is determined by the optimization (maximization/minimization) of a cost functional L which ultimately denes the localized Wannier orbitals. For instance, the PM localization [51] is dened by maximizing the atomic partial charges, where p ≥ 2 and P A = µ∈A |µ µ| is a projector onto atom centered functions |µ at atom A. In the case of spin-polarization, a separate cost functional is considered for each spin. While our implementation allows the optimization of any cost functional L which is polynomial in u ij (including also other optimizations beside orbital localization), we choose the PM localization cost functional and its variants as a specic example. In particular, we consider the IBOs which are obtained by choosing intrinsic atomic orbitals (IAOs) |µ IAO for the projector P A in the construction of the PM cost functional [52].
The algorithm to maximize/minimize the considered cost functional is essentially an iterative unitary matrix constrained optimization (UMCO) procedure. Starting from a stochastic unitary matrix U = (u ij ) each iteration step propagates the unitary matrix according to The Riemannian derivative G = ΓU † − U Γ † is anti-Hermitian, G † = −G, and is calculated from the Euclidean derivative Γ = ∂L/∂U * . A conjugate-gradient technique based on the Polak-Ribière-Polyak update factor is applied in order to speed-up the convergence. We refer the reader to Ref. [46] for details. The iterations are considered as converged, once the norm obeys G, G = tr(GG † )/2 < δ where we choose δ = 10 −10 .
For the case of the PM functional we nd The optimal step size µ obeys ∂ ∂µ L[exp(±µG)U ] = 0 and is estimated by a polynomial line search approximation as described in Ref. [46]. A positive sign +µ is used for the maximization (e.g. for the PM functional), while −µ corresponds to a minimization of L.
The plane-wave basis and the specics of the PAW method have to be respected only in the preparatory calculation of the overlaps (scalar product) between χ i |µ to compute the action of the atomic projectors P A . The calculation of these overlaps are parallelized and has previously been implemented for an interface between vaspand wannier90 in Ref. [53]. In contrast, the iter-  7) is not obvious and can be achieved using the following equivalent formulation, The atomic overlap matrix (X A ) iµ = χ i |µ with µ centered on atom A only and its unitary rotated variant Y A = X † A U was used. For each atom, matrix multiplications have to be performed, resulting in a complexity of is the number of atom centered functions per atom.
As atom centered functions we choose IAOs, |µ → |µ IAO , which can be obtained from atomic functions |f µ by a simple projection procedure, where we follow the formulation of Janowski with the overlap matrix S µν = f µ |f ν . The subsequent orthogonalization is denoted as orth. The orbitals decorated with a tilde, | χ i , can be understood as the oc- (9) constructs the necessary augmentation, such that the IAOs |µ IAO form an exact basis of the occupied space as described by the complete basis. This is true as long as no single occupied orbital is orthogonal to the atomic functions, ν f ν |χ i = 0, ∀i. Additionally, if the density spillage of the tilted orbitals is not too large, |µ IAO exhibits almost the same spatial shape as the atomic function |f µ . In this case, the IAOs can be considered as a minimal but exact atom centered and local basis for the occupied space. Furthermore, the atomic partial charge in the PM cost functional (5) are stable against dierent choices of atomic functions |f µ if the IAOs are used for the projectors P A , as shown in Ref. [52]. The resulting Wannier orbitals are thus the IBOs. The exponential decay of the gradient norm G, G using the presented UMCO algorithm applied to water on gC 3 N 4 is shown in  The virtual orbitals are computed using the following two-step procedure as already outlined in Ref. [11]. We rst compute the RPA natural orbitals (RPANOs) for the full supercell prior to the localization procedure of the occupied orbitals [41]. We truncate the RPANOs to a subset constructed from 60 natural orbitals per occupied spatial orbital and re-canonicalize the Fock matrix in this subspace. We refer to the corresponding total number of unoccupied orbitals at this stage by N pre where p, q, r and s refer to (occupied or virtual) orbital indices. F refers to auxiliary basis functions that have been obtained using a singular value decomposition outlined in Ref. [45]. limit is performed. The extrapolation procedure was previously described in Ref. [42]. Here, the extrapolation is performed for each electron pair correlation energy contribution to yield MP2 ij . Throughout this work we refer to CBS limit estimates when referring to the MP2 energy.
The CCSD correlation energies are computed using 20 natural orbitals per occupied orbital for each fragment.
To correct for the remaining basis set incompleteness error (BSIE) of CCSD correlation energies we employ a cor-  We now turn to the results obtained using the embedding approach and their convergence with respect to the number of occupied orbitals in the studied fragments. In the following we discuss the performance of our embedding scheme for the coupled cluster methods. As mentioned above, we believe that embedding a high-level into a low-level correlation method allows to remedy two central issues of local fragments in periodic environments:   and RPA of about 100 meV prohibits a consistent picture.
In passing we note that we perform RPA calculations using a HF reference, which is in contrast to the more  Here we show an analog to Fig. 4 of the manuscript for the case of equivalent (i.e. non-relaxed) atomic positions in the layer in the supercells with and without water. To save computing time, a smaller plane-wave basis (ENCUT = 300 eV) and less natural orbitals (40 instead of 60) where chosen for the correlation energy calculations.