Octopus, a computational framework for exploring light-driven phenomena and quantum dynamics in extended and finite systems

Over the last few years, extraordinary advances in experimental and theoretical tools have allowed us to monitor and control matter at short time and atomic scales with a high degree of precision. An appealing and challenging route toward engineering materials with tailored properties is to find ways to design or selectively manipulate materials, especially at the quantum level. To this end, having a state-of-the-art ab initio computer simulation tool that enables a reliable and accurate simulation of light-induced changes in the physical and chemical properties of complex systems is of utmost importance. The first principles real-space-based Octopus project was born with that idea in mind, i.e., to provide a unique framework that allows us to describe non-equilibrium phenomena in molecular complexes, low dimensional materials, and extended systems by accounting for electronic, ionic, and photon quantum mechanical effects within a generalized time-dependent density functional theory. This article aims to present the new features that have been implemented over the last few years, including technical developments related to performance and massive parallelism. We also describe the major theoretical developments to address ultrafast light-driven processes, such as the new theoretical framework of quantum electrodynamics density-functional formalism for the description of novel light–matter hybrid states. Those advances, and others being released soon as part of the Octopus package, will allow the scientific community to simulate and characterize spatial and time-resolved spectroscopies, ultrafast phenomena in molecules and materials, and new emergent states of matter (quantum electrodynamical-materials).


I. INTRODUCTION
It is a general challenge in the electronic structure community to develop accurate and efficient methods for modeling materials of ever increasing complexity in order to predict their properties.In this respect, time-dependent density functional theory (TDDFT) and related methods have become a natural choice for modeling systems in and out of their equilibrium.
During the last few years, novel directions of research emerged in the fields of chemistry, physics, and materials science that required the development of novel simulation tools needed to face the new challenges posed by those experimental advances and emerging phenomena.One can cite some examples among these new fields of research: the strong coupling between light and matter (including materials embedded in cavities), new states of matter (hidden phases and topological solids and molecules), and strong field dynamics in periodic systems.Whereas the strong-field dynamics of atoms and molecules is now well understood, strong-field dynamics in solids is an active field of research.Real-time (rt) TDDFT 1,2 represents a natural tool to study highly non-linear phenomena in solids and low-dimensional materials, and the development of efficient numerical methods to perform real-time TDDFT in such periodic or semi-periodic systems is crucial to explore this new phenomenon (including the quantum nature of light and phonons).Indeed, it allows us to describe highly nonlinear processes without having to resort to perturbation theory on top of equilibrium DFT calculations.
8][19] These new features include the implementation of new levels of self-consistent and microscopic couplings of light and matter, the treatment of solvent effects through the polarizable continuum model (PCM), the implementation of various methods to treat van der Waals (vdW) interactions, new methods to calculate magnons, conductivities, and photoelectron spectroscopy from real-time TDDFT, and the calculation of orbital magneto-optical responses.Advances in numerical algorithms and methods, such as new propagators and the use of iterative eigensolvers in the context of reduced density matrix (RDM) functional theory, are also discussed.Finally, recent improvements in the treatment of periodic systems, as well as more technical code improvements, are also presented.For a more detailed explanation about how to use these newly introduced features in practice, we refer the readers to the Octopus webpage, 20 as new tutorials and examples are regularly being added to it.This paper is organized as follows: First, we present in Secs.II-XII the new implementations of physical theories and algorithms that allow us to deal with non-equilibrium phenomena in materials and nanostructures.This is followed by Secs.XIII-XVI dealing with technical developments that are fundamental in improving the code performance and the stability of the algorithms.Finally, we draw our conclusions in Sec.XVII.Unless otherwise stated, atomic units are used throughout the paper.

II. COUPLED MAXWELL-KOHN-SHAM EQUATIONS
In most cases when light-matter interactions are considered, a decoupling of light and matter is performed at the outset.Either the electromagnetic fields are prescribed and then the properties of the matter subsystem are determined, as frequently found in, e.g., quantum chemistry or solid-state physics, or the properties of matter are prescribed and then the properties of the photon subsystem are determined, as done in, e.g., quantum optics or photonics. 12,21he microscopic interaction of light and matter in Octopus has followed this decoupling strategy and has been treated so far only in the forward-coupling direction.In this approximation, an external classical laser pulse or kick is prescribed and the response of the system is computed by evolving the Kohn-Sham orbitals. 22he back-action of the matter subsystem on the laser pulse, the subsequent effect of this modified pulse on the matter, and so on are ignored.This forward-coupling approximation is highly accurate when the total current generated in the system is comparably small, such as in atoms or small molecules.Small refers here to currents that induce transverse electric fields that are about an order of magnitude smaller than the longitudinal electric field that originates from the Hartree potential of the system.This has been exploited in Octopus and other codes, and many different types of spectroscopies were computed successfully in the past using this approach.
In contrast, in classical electromagnetic modeling, the opposite view is taken.Here, the material properties are prescribed and the resulting electromagnetic fields are computed.In practice, the material properties are routinely approximated by a local continuum model or by the dielectric function of the system (such as the Debye, Lorentz, or Drude models) and then Maxwell's equations are solved for linear dielectric media arranged in appropriate geometries. 23t is clear that both perspectives, a focus on matter dynamics alone or a focus on electromagnetic field dynamics alone, break down when the total induced currents become large and when electromagnetic near-field effects on the scale of the material system are not negligible anymore.Prime examples for such cases are nanoplasmonic systems, surface plasmon-polaritons, or tip-enhanced spectroscopies.In these cases, the back-action of the material response on the system itself has to be taken into account, leading to screening and retardation effects.The proper theoretical framework, which encompasses all these effects, is quantum electrodynamics.Starting with a generalized Pauli-Fierz field theory for the combined system of electrons, nuclei, and photons, we have recently derived different levels of self-consistent and microscopic couplings of light and matter.In the classical limit, this results in a coupled set of Ehrenfest-Maxwell-Pauli-Kohn-Sham equations. 21To implement these equations, we added a Maxwell solver to the Octopus code, which we couple self-consistently to the dynamics of the electrons and nuclei.In the following, we briefly summarize the basic ingredients for this implementation and show an example of self-consistent lightmatter interactions for a nanoplasmonic system.Further details of the implementation and nano-optical applications can be found in Ref. 21.
Since over the years, Octopus has been optimized heavily to solve time-dependent Schrödinger and Kohn-Sham equations, we have exploited the fact that Maxwell's equations can be formulated in Schrödinger form 24 to benefit from the efficient time-evolution in the code.This reformulation is based on the Riemann-Silberstein vector, 25 which is a combination of the electric E(r, t) and magnetic field B(r, t), The sign of the imaginary part of the Riemann-Silberstein vector corresponds to different helicities.The reformulation of Maxwell's equations in Schrödinger form is purely algebraic and starts out with the microscopic Maxwell's equations where E and B are the classical electric and magnetic fields, respectively, ρ and J are the charge and current densities, ϵ 0 and μ 0 are the vacuum permittivity and permeability, respectively, and c = (ϵ 0 μ 0 ) −1/2 is the speed of light.Using the Riemann-Silberstein vector, the electric and magnetic Gauss laws may now be combined, and likewise, the Faraday and Ampere laws can be combined into one evolution equation for the Riemann-Silberstein vector, Here, S = (Sx, Sy, Sz) denotes a vector of spin one matrices which are analogous to the Pauli matrices and show the spin-one character of the photon.Having cast Maxwell's equations as an inhomogeneous Schrödinger equation, it is now straightforward to use the time-evolution algorithms in Octopus to time-evolve the Riemann-Silberstein vector.The only difference to the matter propagation is that we are now dealing with the "Maxwell Hamiltonian" which acts upon six orbitals of the Riemann-Silberstein vector corresponding to the three components of the electric and magnetic field vectors.As in the matter case in Octopus, the discretization of the gradient in the Maxwell Hamiltonian is performed with finite-difference stencils, and the domain parallelization of Octopus can be used seamlessly for the Maxwell case as well.The difference to finite-difference time-domain (FDTD) codes based on the Yee algorithm is that we do not employ two shifted grids for the electric and magnetic fields, 23 but rather a single grid for the Riemann-Silberstein vector.This simplifies the coupling to matter and allows us to use higher-order finite-difference discretizations for the gradient.Since the spatial discretization is connected to the temporal discretization through the Courant condition, this, in turn, allows us to use larger time steps.Furthermore, from our experience, a unified grid also improves the stability compared to FDTD.Instead of using the constitutive relations, we couple Maxwell's equations directly to the microscopic current density of the matter subsystem, consisting of the usual paramagnetic current term, the diamagnetic current term, and the magnetization current term.For the coupling of the electromagnetic fields to the matter subsystem, The Journal of Chemical Physics ARTICLE scitation.org/journal/jcpwe are relying on the Power-Zienau-Woolley transformation, 26,27 which leads to a multipole expansion.We have implemented the first two orders of this expansion: the dipole approximation in lowest order and electrical-quadrupole and magnetic-dipole coupling in the next order.In addition, we are currently working on implementing the full minimal coupling with a full position dependent vector potential.
The time evolution of the Kohn-Sham orbitals and the Maxwell fields is performed side-by-side and the two subsystems are coupled self-consistently in each time step, as illustrated in Fig. 1.To propagate different subsystems with different Hamiltonians and different sets of orbitals simultaneously, we implemented a multi-system framework in Octopus (for more details, refer Subsection XVI C).
Similar to the matter propagation, also in the Maxwell case, outgoing waves that reach the boundary of the simulation box have to be absorbed to avoid artificial reflections and backscattering.Our first attempt was to use also the mask functions that are used for the matter propagation in Octopus.However, in the electromagnetic case, the mask absorption of outgoing waves turned out to be not efficient enough so that we implemented a perfectly matched layer (PML) 23 for the Maxwell propagation.
When considering incoming electromagnetic fields with optical wavelengths, the coupling to atomistic or nano-scale systems leads to a multi-scale problem.The optical wavelength of the radiation is in this case much larger than the de Broglie wavelength of the matter.Likewise, the electromagnetic waves are traveling with the speed of light, which requires sub-attosecond time steps.We have, therefore, implemented different multi-scale couplings in space and time.For example, the Maxwell simulation box can be on the same scale as the matter box.In this case, the electromagnetic waves are represented as incoming analytical time-dependent boundary conditions and are propagated numerically inside the simulation box.Alternatively, the Maxwell simulation box can be much larger than the matter box to fully encompass laser pulses with optical wavelengths.In this case, prolongations and interpolations have to be used similarly to multi-grid methods.Since the electronic and nuclear motion is much slower than the time-evolution of the electromagnetic waves, we also implemented a multi-scale approach for the real-time propagation.The Riemann-Silberstein vector is propagated with frozen electronic current from the last point of interaction for many intermediate time steps before a coupling to the matter subsystem takes place.The number of intermediate steps is a convergence parameter and depends on the physical situation at hand.
Since we now include the description of classical electromagnetic fields explicitly in our real-time simulations, we have directly access to the outgoing electromagnetic radiation.This allows us to define electromagnetic detectors at the box boundaries, which accumulate the outgoing electromagnetic waves.We implemented such electromagnetic detectors in Octopus, and this allows us to run simulations in close analogy with experiments and to directly observe the outgoing radiation.For example, it is no longer needed to Fourier transform the time signal of the matter dipole to get optical spectra, but we rather have access to the spectrum directly on the Maxwell grid.
As an example of a coupled Ehrenfest-Maxwell-Pauli-Kohn-Sham propagation with Octopus, we selected a nano-optical application.We consider in this example two almost spherical sodium nanoparticles with 297 sodium atoms each, which are arranged in a dimer configuration.This system interacts with an incoming laser pulse that excites either the internal dipole or quadrupole plasmon motion of the dimer.In Fig. 2, we show the resulting electromagnetic field enhancements for different levels of light-matter coupling.In panels (a) and (b), we show the temporal profile of the incoming laser pulse and the resulting current density at the center point between the two nanoparticles.The field enhancement can be seen in panels (c)-(e).Including a self-consistent back reaction in the light-matter coupling, the field enhancement is reduced at the center point between the two nanoparticles, as shown in panel (c), while far away from the dimer, as shown in panels (d) and (e), the field enhancement is larger than in the forward coupled case.Furthermore, frequency shifts can also be observed, which are more pronounced in the near-field.We found that the field enhancement is also sensitive to the coupling terms of the multipole expansion, which are included, and that the quantitative difference of switching from local-density approximation (LDA) to the Perdew-Burke-Ernzerhof (PBE) functional for the exchange-correlation functional is in this case smaller than including the back reaction in the light-matter coupling (cf.Ref. 21).
To conclude, with our new efficient implementation for coupled Ehrenfest-Maxwell-Pauli-Kohn-Sham equations in Octopus, we have now a very versatile tool that allows us to compute fully selfconsistent forward-backward light-matter coupling in real-time and real-space for a vast set of applications and in close analogy to experiments.As the coupling to Maxwell's equations of motion (EOM) represents the classical limit of the light-matter interaction, this development leads to the classical limit of QEDFT.

III. STRONG ELECTRON-PHOTON INTERACTIONS IN REAL SPACE: QUANTUM-ELECTRODYNAMICAL DENSITY-FUNCTIONAL THEORY
The nascent field of strong light-matter interaction expanded over the past decades from small atomic structures to chemistry 3 and materials science. 28This development necessitates predictive firstprinciples methods capable to describe light and matter on the same footing.We introduced for the first time, a time-dependent density functional theory for quantum electrodynamics (QEDFT) 4,10 to treat ab initio weak and strong light-matter interactions and its applications to chemistry and materials, which provides a unique framework to explore, predict, and control new states of matter out of equilibrium.This generalization of the time-dependent density-functional method allows us, for the first time, to explore the effects of dressing electronic states with photons while retaining the electronic properties of real materials.
A general non-relativistic Hamiltonian for light-matter systems treating N interacting electrons coupled to Np photonic modes in the case of the so-called length-gauge, and after employing the long-wavelength (dipole) approximation, 13 reads as follows: Here, the first two terms on the right-hand side correspond to the usual electronic many-body Hamiltonian, while the last term describes the photon modes, which is characterized for each photon mode α by its elongation qα, frequency ωα, and electron-photon coupling strength vector λα that includes the polarization of the photon mode and introduces the coupling to the total dipole ∑ N k=1 r k of the electronic system.For a more detailed discussion of the different regimes of weak/strong/ultra-strong light-matter coupling, we refer to Ref. 6.The external variable for the photon system is the time-dependent current j (α) ext (t).QEDFT 4,10 is structurally similar to time-dependent densityfunctional theory in that it is based on a one-to-one correspondence between internal and external variables.If now photons are also considered, the set of internal variables has to be expanded.In the

ARTICLE
scitation.org/journal/jcpframework of Eq. ( 8), the internal variables become the density n(r, t) and the mode-resolved contributions to the electric displacement field qα(t).By introducing and exploiting the bijective mapping of these internal and the external variables [vext(r, t) and jext(t)], the auxiliary Kohn-Sham system is described by the electronic Kohn-Sham equations as well as Maxwell's equations, 5 leading to no exchange-correlation contribution in the photon subsystem [j xc (t) = 0].This reformulation subsumes the "quantumness" of the light-matter interaction solely into the local exchange-correlation potential that now features a component due to the electron-photon interactions, in addition to the part due to electron-electron interaction, i.e., vxcσ(r, t) = v ee xcσ (r, t) + v ep xcσ (r, t). 4,10,12Extending the coupled Maxwell-Kohn-Sham equations to consider quantum photons is, thus, condensed into the calculation of the local exchange-correlation potential.
In practice, QEDFT requires the construction of an additional exchange-correlation potential that describes the electron-photon interaction.First attempts to construct this potential were based on many-body perturbation theory via the Sham-Schlüter equation 29 within the exact exchange (EXX) approximation 13,30 by utilizing the optimized-effective potential (OEP) method. 31

A. The optimized effective potential
The framework of the optimized effective potential (OEP) equation for electron-photon systems has been introduced in Ref. 30 for the static and time-dependent regimes.Solving the timedependent OEP equation is unfortunately computationally challenging.Applications to electronic systems [32][33][34] have been realized so far only for small and low-dimensional systems while their utilization for interacting electron-photon systems remained restricted to models. 30,35,36This section provides a brief introduction to the implementation that allows us to solve the computationally more tractable static OEP equation.
As introduced in Ref. 30, the OEP photon energy depends on both the occupied and unoccupied electronic orbitals.Alternatively, the OEP photon energy can also be formulated using occupied orbitals and orbital shifts only. 13The full energy expression is given by where E (ee) xc describes the electronic exchange-correlation energy and E (α) x describes the exchange energy due to the interaction of the electrons with the photon mode α.Avoiding unoccupied orbitals is computationally much more favorable for larger systems and the photonic induced exchange energy can be correspondingly expressed as a sum over the Nσ occupied orbitals of spin channel σ, iσ,α | dα|φiσ⟩ + c.c., (10)   where ωα describes the αth mode of the electromagnetic field and dα = λα⋅r describes the dipole operator and the electron-photon coupling strength.We can now reformulate the problem in terms of two electron-photon orbital shifts.The Kohn-Sham orbitals φiσ contribute to both electron-photon orbital shifts Φ iσ,α and Φ (2) iσ,α that can be calculated using the Sternheimer equations. 13The first electronphoton orbital shift can be obtained explicitly by the solution of a linear Sternheimer equation with the matrix element d (α) ijσ = ⟨φiσ| dα|φjσ⟩.In addition, the second electron-photon orbital shift Φ (2) iσ,α (r) can be defined explicitly as follows: From Eq. ( 10), we can now deduce the potential using After these reformulations, we find for the final OEP equation including electron-electron effects as well as electron-photon effects where the inhomogeneity Λiσ(r) is given by iσ,α ⟩φ * iσ (r)φiσ(r)].
In Eq. ( 14), we defined a third orbital shift, the exchange-correlation orbital shift, which will be used to obtain the corresponding exchange-correlation potential and is defined along the lines of the orbital shift usually used in OEP calculations. 31We can also obtain ψ * iσ (r) using a Sternheimer equation where M * iσ (r) now consist of the electron-photon orbital shifts and the Kohn-Sham orbitals, as described in Ref. 13. Accordingly, we define this quantity as where the effects of the electron-electron interaction are included in the quantity uxiσ(r).For instance, in exchange-only calculations, this quantity is defined as uxiσ δφ iσ (r) , where E (ee) x is the usual Fock exchange energy.
Equation (15) has to be solved self-consistently with Eq. (11).By this reformulation, we replaced the problem of calculating the For practical implementation, we reformulate the OEP equation in the following form, as it is commonly done to construct the electronic OEP: 31 and update the potential with The quantity Sσ(r) becomes a measure for convergence, since it is vanishing at the solution point [compare Eqs. ( 17) and ( 14)].For the function c(r), we have different possibilities, such as using a constant or using the inverse of the electron density, as used in Ref. 37.
Other methods are also possible, such as the Barzilai-Borwein (BB) method. 38We found stable algorithms when using a constant and the Barzilai-Borwein method.While we will show the computational feasibility of this approach in the following, often a simplified solution is beneficial as a starting point for the self-consistency procedure.Such a simplified approximation can be deduced by reformulating Eq. ( 15) into the following equivalent form: and subsequently assuming ( ĥsσ(r) − ϵiσ)ψ * iσ (r) = 0 to start the iterative process.0][41] By multiplying Eq. ( 19) by |φjσ(r)| 2 and integrating over the spatial coordinates, we arrive at a linear equation that, in turn, can be solved for the approximate vxσ.This approximation scheme has proven to often deliver sufficiently accurate results for electronic structure calculations with significantly lower computational effort and reliable stability.As this approximation can be seen as a diagonal approximation to the response function, it unavoidably fails in accurately describing polarization features.In the context of light-matter correlated ground-states, this leads to a slight unbalance when approximating components including photonic excitations [introduced by Φ (1) ] and self-polarization interaction [introduced by Φ (2) ]. 13,30,42,43 This results in a violation of translational invariance and introduces an artificial dependence on the permanent dipole.When performing KLI calculations including light-matter interaction, we thus suggest moving the set of coordinates into the electronic center-of-charge instead of the center-ofmass frame.To reduce the effect of this dependence during a selfconsistent calculation, the optional input parameter KLIpt_coc has been introduced in the code.When activated, this option defines the dipole operator dα with respect to the electronic center-of-charge, thus improving the stability of the algorithm.
Finally, in Fig. 3, we show the capabilities of the new implementation, where we calculate two sodium dimers in the weak coupling regime under light-matter coupling.Figure 3(a) shows the electron density, and Fig. 3(b) shows the comparison of OEP and KLI results.As shown in Ref. 13, in the weak-coupling regime, the KLI is close to the OEP result.In Fig. 3(c), we show the convergence behavior when using a constant in Eq. ( 18) and when using the Barzilai-Borwein method. 38or a more detailed analysis on how the OEP equation performs also in the strong and the ultra-strong coupling limit, we refer to Refs. 13 and 30, where we find accurate results for few-level systems and asymmetric external potentials up to high electron-photon coupling strength.

FIG. 3.
Ground-state density of two sodium dimers affected by the vacuum-field of a cavity: In (a), we show the electron density, in (b), we show the difference of the electron density inside and outside the cavity for the OEP and the KLI approximations (see main text for details).We find for the exchange energy E Extensions of QEDFT to the regime of vibrational strong coupling, 44 the linear-response regime, 45 and multitrajectory methods that capture quantum fluctuations 46 are currently work in progress and will further strengthen the capabilities of the Octopus code for the real-space description of strongly coupled light-matter systems.To describe the effects in the ultra-strong coupling regime, one can use an alternative method that is presented in Sec.IV.

IV. DRESSED REDUCED DENSITY MATRIX FUNCTIONAL THEORY FOR ULTRA-STRONGLY COUPLED LIGHT-MATTER SYSTEMS
The accurate description of the (ultra-)strong coupling regime of light-matter systems is a formidable task.In many cases, the known functionals for QEDFT (see Sec. III) are inaccurate, and for complex electronic systems, typical few-level approximations become unreliable. 42,47In this section, we present the Octopus implementation of an alternative real-space ab initio method for coupled light-matter systems.Dressed Reduced Density Matrix Functional Theory (dressed RDMFT) 48 extends standard electronic RDMFT to coupled light-matter systems similarly to how QEDFT extends DFT.First tests on simple model systems suggest that the dressed RDMFT remains accurate from the weak to the ultra-strong coupling regime.A proper introduction of the theory, examples, and convergence studies can be found in Ref. 49.This approach allows for the description of an interacting Nelectron system coupled to one photonic mode within the dipole approximation.The respective Hamiltonian is given in Eq. ( 8) of Sec.III.Note that we set jext = 0 throughout this section and that the ground state Ψ of Eq. ( 8) depends on 4N + 1 coordinates, i.e., Ψ = Ψ(r 1 , σ 1 , . .., rN, σN, p), where {σi} denotes the spin degrees of freedom and p is the elongation of the photon mode.
Within the dressed RDMFT, the original Hamiltonian (8) of N electrons in d dimensions and one mode is replaced by an extended auxiliary Hamiltonian of N dressed fermions in d + 1 dimensions with coordinates z = (r, q) ∈ R d+1 .This auxiliary Hamiltonian reads and gives access to the same physics (see Refs. 49, Sec. 4).For a d = 1 matter subsystem, the operators introduced in Eq. ( 20) read as follows: the dressed Laplacian and the dressed interaction kernel The ground state Ψ ′ (z To apply RDMFT theory on the auxiliary system, we have to replace the total energy functional of electronic RDMFT, given in Ref. 19, with the newly introduced quantities of the dressed system, i.e., the auxiliary Hamiltonian of Eq. ( 20) (approximately) evaluated by the dressed 1RDM γ ′ of Eq. (23).By that, common approximations for the two-body energy expression in terms of the 1RDM can be directly transferred from electronic theory to the dressed system. 50he minimization is performed like in the electronic case and is based on the RDMFT implementation of Octopus, though the convergence of the dressed system requires a more complicated protocol that can be found in the supplementary material of Ref. 49.
The current implementation in Octopus approximates the conditions under which the dressed 1RDM corresponds to a wave function by ensuring the fermionic ensemble N-representability conditions. 51However, the auxiliary wave function also exhibits another exchange symmetry with respect to the auxiliary coordinates, which is currently neglected.For the practical validity of this approach, the reader is referred to Sec. 5 and the supplementary material of Ref.
and the soft Coulomb interaction In Fig. 4, we show the total energy and the total photon number of the dressed RDMFT, the dressed HF, and the exact manybody calculations 53 for different coupling strengths.We see that for small couplings, both observables are captured well by the dressed RDMFT and dressed HF.With an increase in the coupling strength, both approximations fail to capture the strongly increasing photon number.For the total energy instead, the dressed RDMFT remains very close to the exact result, whereas the deviations to the dressed HF increase with increasing coupling strength.This shows the potential of the dressed RDMFT to describe correlated electron systems that are strongly coupled to a cavity mode.In the future, we plan to investigate better approximations to the polaritonic Nrepresentability conditions that also account for the symmetry of the many-body wave function with respect to the exchange of photon coordinates.

V. TOWARD DYNAMICS OF STRONGLY CORRELATED SYSTEMS: THE TDDFT+U FUNCTIONAL
It is well known that the standard local and semilocal functionals of DFT tend to over-delocalize the electrons, as usual approximations are based on the homogeneous electron gas.This leads to several failures of DFT for materials in which the localization of electrons plays a critical role in dictating the system's properties.This is, for instance, the case for transition metal oxides.5][56][57] In essence, the DFT+U method aims at a better description of the local electronelectron interaction, which is achieved by adding the mean-field Hubbard model on a chosen localized subspace to the DFT total energy.The double counting of electron interaction in this localized subspace is then removed.The DFT+U total energy functional reads where E (ee) is the usual electron-electron interaction energy and E dc accounts for the double counting of the electron-electron interaction already present in E DFT .The analytical expression of this double-counting term is not known in the general case, a general problem to all +U methods.Several approximated forms have been proposed along the years. 58,59In the Octopus code, we implemented the most commonly used double-counting terms: the fully localized limit (FLL) 60 and the around-mean field (AMF) double-counting terms. 61They read as 59 and respectively, where Nσ = ∑ m n σ mm and N = N ↑ + N ↓ .The Eee and E dc energies depend on the density matrix of a localized orbital basis set {ϕI,m}, which are the localized orbitals attached to the atom I.In the following, we refer to the elements of the density matrix of the localized basis as occupation matrices and we denote them as {n I,σ mm ′ }.In the rotational-invariant form of DFT+U proposed by Dudarev et al., 60 and for the FLL double-counting term, we obtain the EU energy to be added to the DFT total energy, which only depends on an effective Hubbard U parameter where I is an atom index, σ is the spin index, and n, l, and m refer to the principal, azimuthal, and angular quantum numbers, respectively.In the case of a periodic system, the occupation matrices n I,n,l,σ mm ′ are given by where w k is the k-point weight and f σ nk is the occupation of the Bloch state |ψ σ n,k ⟩.Here, |ϕI,n, l,m ⟩ are the localized orbitals that form the basis used to describe the electron localization.Details of the implementation can be found in Ref. 62.We recently extended our original implementation to be able to construct a localized subspace from localized states, such as Wannier states, 63 and to treat the intersite interaction. 64n its usual formulation, the DFT+U method is an empirical method in which the effective U is a parameter of the calculation.However, it recently became possible to evaluate U and J fully ab initio and self-consistently using the ACBN0 functional. 65We also implemented this method in Octopus and extended it to the time-dependent case 62 to investigate strongly correlated materials out-of equilibrium.We showed that the absorption spectra of transition metal oxides, such as NiO or MnO, are well reproduced by our TDDFT+U simulations. 62igure 5 shows the calculated time profile of the effective Hubbard U eff = U − J for the 3d orbitals of Ni for light-driven NiO.The top panel exhibits the time profile of the driving vector potential.This shows that strongly driven correlated materials cannot be described by simply assuming that the effective electronic parameters (here, the effective Hubbard U) remain constant out of equilibrium.Moreover, the possibility to tune these effective electronic parameters offers opportunities for light-driven phase transitions, such as light-induced magnetic Weyl semimetals. 66e finally note that TDDFT+U is not the only method that can be used to describe correlated materials such as transition metal oxides.An alternative approach to this problem would be to use hybrid functionals.However, such types of calculations in complex systems, potentially involving strong spin-orbit coupling interactions, are usually numerically very expensive in the context of real-time real-space TDDFT.

VI. VAN DER WAALS INTERACTIONS
The van der Waals (vdW) interactions arise from correlations between electrons and are, in principle, described by the exact energy functional through the correlation energy functional Ec[n(r)].However, the vdW interactions are inherently non-local and long-range and, by construction, cannot be described by usual local and semilocal functionals. 68Therefore, much work has been devoted to finding consistent ways to enhance available functionals to correctly describe the necessary correlations yielding vdW forces.
Within a pure DFT approach, the inclusion of vdW interactions should be done through the correlation functional.To this end, a family of non-local density functionals has been proposed. 68he so-called vdW density functionals (vdW-DF) are derived ab initio from the screened response of the homogeneous electron gas. 68Another route to include vdW interactions in DFT calculations is by adding explicit corrections to the energy and forces based on atomic parameterizations.The most well-known method of this type is probably the vdW-D3 scheme of dispersion corrections from Grimme. 69Although computationally more favorable than the vdW-DF method, the major disadvantage of this type of approach, particularly from the perspective of time-dependent simulations, is that it fails to correctly describe the effects that depend on the electron dynamics.A trivial example would be a lone electron traveling through space.Another scheme based on explicit corrections to the energy and forces was proposed by Tkatchenko and Scheffler (vdW-TS). 70This scheme has the advantage of retaining much of the low computational cost of the vdW-D3 scheme while making the atomic parameterization dependent on the electronic density.Recently, these three schemes (vdW-DF, vdW-TS, and vdW-D3) were implemented in the Octopus code to deal with vdW interactions in isolated and periodic systems.
In the context of real-time TDDFT, it is important to note that these various levels of the description of vdW interactions have different implications.Indeed, in the vdW-D3 method, the energy and forces do not depend on the electronic density and will not change in time, unless atoms are moved, as already mentioned above.This should be a good approximation only when the time-dependent density remains close enough to the ground-state density.On the contrary, the expressions for the vdW-TS and the vdW-DF methods have a direct dependence on the electronic density and should thus yield a better description of vdW energies and forces out of equilibrium.However, we note that the use of the vdW-TS and vdW-DF methods within real-time TDDFT must be done using the adiabatic approximation and will, therefore, behave poorly when this approximation breaks down.

A. vdW-DF
Octopus supports vdW-DF functionals 68 through the libvdwxc library. 71The vdW-DF functionals are expressed as a sum, of the LDA correlation energy 72 and a fully non-local correlation term.The latter is the integral over a kernel function ϕ(q 0 , q 0 ′ , r), where q 0 (r) depends on the local density and its gradient.An explicit evaluation of this six-dimensional integral is very expensive and scales as the volume squared, O(N 2 ).Roman-Pérez and Soler proposed an efficient method, which approximates it as a sum of threedimensional integrals. 73The method works by expressing the integrand as a convolution using a limited set of helper functions and then applying the Fourier convolution theorem for a scaling of O(N log N) and a much lower prefactor.This has since become the standard method for evaluating vdW-DF functionals.
As implemented in Octopus, the density is redistributed from its normal uniform grid of arbitrary shape onto a uniform 3D grid forming a cube or parallelepiped, which is suitable for 3D Fourier transforms.The library libvdwxc then evaluates the non-local energy and contributions to the potential, relying on the FFTW library 74 for efficient parallel Fourier transforms.After calculating the energy and potential, the potential is redistributed back to the original form.

B. vdW-TS
Since the vdW-TS approach depends on the density, the effect of the van der Waals interaction can be observed in properties other than the forces.In particular, we expect to observe an effect in the excited states of systems that interact through vdW forces.We use the hydrogen fluoride dimer as a simple model system for a proof-ofconcept application of the modular implementation of the vdW-TS functional correction on TDDFT calculations.The dimer geometry is setup as shown in Fig. 6.The hydrogen fluoride monomers are placed in an anti-parallel fashion, each one with its main symmetry axis oriented along the y-axis.The hydrogen fluoride bond in each monomer is 0.92 Å long, and the molecules are separated by 2.8 Å along the z-axis.At this distance, the van der Waals interaction between the monomers is the strongest according to the vdW-TS model.We choose this dimer model because it is a conveniently small system in which the effects of including the dispersion correction can be shown at an affordable computational cost.
To calculate absorption, we excite the system with an infinitesimal electric-field pulse and then propagate the time-dependent Kohn-Sham (TDKS) equations for 30.38535 ̵ h/eV.The singlet dipole spectrum is evaluated from the time-dependent dipole moment.The strength of the perturbation is set to 0.01 Å −1 , and the polarization is in the z-axis.The time evolution is carried out using the enforced time-reversal symmetry propagator with (default) time steps of 0.033 52 ̵ h/eV.The results (Fig. 6) show a small van der Waals-induced bathochromic-like (red) shift in the optical spectrum of the hydrogen fluoride dimer calculated with the LDA.This example opens the door for a new series of applications in supra-molecular chemistry, structural biology, and polymer science, which incorporate van der Waals effects on real-time electron dynamics.

C. vdW-D3
Octopus also supports the DFT-D3 van der Waals correction. 69his correction depends only on the atomic positions and does not depend on the atomic density.It is implemented in Octopus by linking to the DFT-D3 library provided by the authors.As we have validated Octopus results with the reference data provided with the library, the results are guaranteed to be consistent with other codes that implement this correction.
The only modification we did to the library was to move the very large set of coefficients from a source file to a stand alone text file that is parsed only when necessary.This speeds up compilation and reduces the size of the binaries.The modified library is distributed with Octopus, so users need not compile it separately.

VII. POLARIZABLE CONTINUUM MODEL
The Polarizable Continuum Model (PCM) 81 comprises a family of implicit-solvent approaches to tackle quantum-mechanical calculations of molecules in solution.The PCM assumes that (i) the solvent is a continuum and infinite dielectric medium characterized by a frequency-dependent dielectric function and (ii) a void cavity of appropriate shape and size encapsulates the solute molecule, separating it from the solvent by a sharp interface.The numerical implementation of the PCM relies on the Apparent Surface Charge (ASC) approach and the Boundary Element Method (BEM). 82In this framework, the solvent polarization response induced by the molecule's charge density is modeled by a reaction potential defined by a set of point charges q = {q 1 , q 2 , . .., q T } that spread over a tessellated cavity surface consisting of T finite surface elements or tesserae. 83The reaction potential is the object through which the complex dielectric environment is accounted for. 84The implementation of the PCM in Octopus rests on the Integral Equation Formalism (IEF). 85The key IEF-PCM equation for computing the polarization charges is where q and V are column vectors of size T storing the induced polarization charges and the molecule's electrostatic potential at the tesserae representative points, respectively.Q is the T × T PCM response matrix, which depends on the geometry of the cavity and the dielectric function of the solvent. 81 realistic description of the molecular cavity is key to capture accurate electronic and optical properties of molecules in solution.In principle, the cavity should (i) exclude the solvent, (ii) comprise most of the solute electronic density, and (iii) conform with the molecular shape.Octopus uses the GEPOL algorithm, 86 which builds up the van der Waals cavity from the union of interlocking spheres with element-specific radii centered at each atom position (by default, no spheres are built around hydrogen atoms).Within GEPOL, the BEM tessellation of the solute-solvent interface is done starting from a 60-or 240-face circumscribed polyhedron per sphere, selecting only exposed tesserae and properly reshaping those that are partially exposed.The BEM tessellation is a surface grid constructed independently from the real-space threedimensional grid used in Octopus to represent both the Kohn-Sham electronic Hamiltonian and molecular orbitals.The mismatch between the two discrete representations might cause numerical problems arising from the Coulomb singularities whenever tesserae and grid points are close to each other.The implementation of the PCM in Octopus regularizes such singularities by using normalized spherical Gaussian functions to smooth the discretized polarization charges q. 84 Having briefly described the implementation, we now look at specific and relevant cases for chemical reactions, namely, solvation energies and ground-state stabilization.The most evident effect of the presence of a solvent is to change the total energy of a system compared to its value in vacuum.In the framework of DFT, the Kohn-Sham Hamiltonian of the solvated molecule contains the solvent reaction potential, which is a functional of the electronic and nuclear densities of the solute molecule. 84The latter implies that the Kohn-Sham and IEF-PCM equations become coupled and the polarization charges qe = QV Hartree [ρ e ], induced by the molecule's The Journal of Chemical Physics ARTICLE scitation.org/journal/jcpelectronic density, have to be computed at each Kohn-Sham iteration until convergence is reached.The resulting electronic density can be used to compute the electrostatic contribution to the solvation free energy where G and E denote the free and total energy functionals of the molecule in solvent and in vacuum, respectively.The current PCM implementation in Octopus 84 has been thoroughly tested on a benchmark set of organic molecules and compared with analogous calculations performed with the quantum chemistry package GAMESS. 87As an example of the solvent stabilization effect in the ground state, we studied the nitrobenzene molecule in water (static dielectric constant set to ϵs = 80).
In Fig. 7, we show the convergence of the solvation free energy as a function of the Kohn-Sham iteration.The solvation free energy is stabilized already after 10 Kohn-Sham iterations out of the 19 required to optimize the molecular orbitals of the solvated molecule.We have also verified that the numerical error inherent to the discretization of the solute cavity surface is very small.For example, the total polarization charge Q PCM = ∑ T i=1 q e i at each Kohn-Sham iteration deviates by only 0.03% from the actual number of valence electrons in the nitrobenzene molecule (the relation between q and the solute charge is determined by Gauss's theorem 81 ).However, the magnitude of this error will depend, in general, on the size and the geometry of the molecule.Now we move forward and describe the extension of the previous PCM method to the time domain using real-time PCM and non-equilibrium solvation.The ground state PCM is unable to capture the complex dynamical interactions between the solvent and the solute when the latter is in an excited state.Fortunately, the PCM admits a generalization to account for solvation dynamics. 89he time-dependent PCM (TD-PCM) implementation in Octopus comes in three flavors of increasing complexity, all coupled with real-time TDDFT calculations of the solute molecules.The first one, called equilibrium TD-PCM, 84 assumes that the solvent is fast enough to instantaneously equilibrate the solute charge density fluctuations and to polarize accordingly.This approach is physically sound for weakly polar solvents, having similar values for the static and dynamic dielectric constants.The second is a nonequilibrium approach called inertial TD-PCM and amounts to partitioning the solvent response in a fast (dynamic) and a slow (inertial) part. 90aster degrees of freedom respond instantaneously to the changes of the applied potential (either of molecular or external in origin), whereas slower degrees of freedom remain "frozen" and in equilibrium with the initial value of the field.This approximation works well when the solvent relaxation times are large enough with respect to the electronic excitations in the solute molecule (e.g., within the picosecond scale).The third TD-PCM approach, called equation of motion (EOM) TD-PCM, considers the full history-dependent evolution of the solvent polarization through a set of equations of motion for the polarization charges. 91Nonequilibrium polarization effects of this sort originate from the frequency-dependent dielectric response of the solvent, encoding the fact that it takes a different non-negligible time to adjust to fast or slow electrostatic perturbations.Solvation dynamics affect strongly and nontrivially the absorption spectrum of molecules, especially for fast and polar solvents, by inducing solvatochromic shifts of the peaks in the UV-visible absorption spectrum and modifying their relative amplitudes.The details about the implementation of all of these schemes and a detailed discussion of their effects can be found in Refs.84 and 92.
Here, we show the differences among the TD-PCM flavors described above for the case of the nitrobenzene molecule in aqueous solution, excited with an electrical dipole perturbation in the x direction.We take the dynamic dielectric constant of water, entering in the inertial and EOM TD-PCMs, as ϵ d = 1.786.Figure 8 shows the photo-absorption spectrum of the gas-phase and solvated Kohn-Sham equations were solved in real-space as implemented in Octopus using the GGA-PBE approximation 88 to the exchange-correlation energy.The simulation box was built using spheres of radius 5 Å and an uniform spacing of 0.19 Å between grid points.The radii of the spheres used to build the cavity surface of the solute molecule are 2.4 Å for carbon, 1.8 Å for oxygen, and 1.9 Å for nitrogen.molecule.We can see that, with respect to the absorption in vacuo, all of the TD-PCM methods produce shifts of the features (in this case, toward lower excitation energies).The shifts are not rigid overall but depend on the excited state and on the specific solvation scheme (equilibrium, inertial, and EOM).Within Debye's model, the equilibrium and inertial TD-PCMs are limiting cases for the dynamics when the relaxation time is zero and infinite, respectively.The EOM TD-PCM, with a finite relaxation time, interpolates between these limits.Water has a large relaxation time of 3.37 ps, and therefore, the absorption spectrum is almost coincident for the EOM and the inertial TD-PCMs, as shown in Fig. 8. Whenever the solvent is as slow as water, there is not much gain in selecting the EOM over inertial TD-PCM and the latter is the method of choice in terms of computational performance.Instead, for faster solvents, the EOM results depart from those of the inertial algorithm, approaching those of the equilibrium TD-PCM.In Fig. 8, we considered an effective solvent having the same static and dynamic dielectric constants as water, but with a 1 fs relaxation time.The EOM TD-PCM for such a model solvent produces almost the same excitation energy shift equilibrium TD-PCM and peak intensities in between the inertial and the equilibrium TD-PCMs, as expected.

The Journal of Chemical Physics
A real-time representation of the molecular dipole coupled with the PCM surface charges is found in the supplementary material, SM2, which also highlights how the different TD-PCM approaches affect the time-evolution of the dipole.
When studying the evolution of a solvated molecule under the effect of an external time-dependent electromagnetic field, a separate treatment is required to take into account that there are two solvent polarization contributions interacting with the solute molecule, namely, the reaction and the cavity fields. 93The reaction field comes from the polarization induced by the solute molecule itself, while the cavity field arises as a polarization induced by the external electric field.Both reaction and cavity field effects can be accounted for in static and real-time quantum-mechanical calculations of molecules within the PCM and TD-PCM simulations in Octopus. 92ll TD-PCM calculations shown here were performed using both reaction-and cavity-field effects, although the redshift of the peaks in each TD-PCM scheme is mainly a feature of the reaction-field, as ascertained in test calculations (not shown) and as expected. 94Cavity field effects impact directly on the peak intensities by making the absorption more favorable depending on how large the effective local field acting on the molecule is allowed to be by solvent dielectric properties and the geometry of the cavity (normally, reassembling the molecular shape).Still, cavity field effects can have a non-trivial influence in the absorption spectrum shape when considering non-equilibrium solvation dynamics (EOM TD-PCM) by changing the relative peak intensities.This effect can be seen in the inset of Fig. 8, where the normalized cavity field factor (CFF)-the ratio between absorption cross section with and without cavity field effects-is plotted against the excitation energy for the EOM TD-PCM simulations with water and the aforementioned faster water-like solvent.The large difference in absorption peak intensities between the equilibrium and the rest of the TD-PCMs in Fig. 8 is also related to the modification of the probing electromagnetic field inside the dielectric.The photo-absorption cross section is, by definition, the ratio between the absorbed and the incoming power of light.In our case, both increase with the dielectric constant: the larger the dielectric constant of a medium, the larger the CFF, but also a larger dielectric constant implies a smaller phase velocity of light, therefore increasing the power of the traveling electromagnetic wave.The impact of the latter effect (∝ refractive index = √ (|ϵ(ω)| + Rϵ(ω))/2) is stronger than that of the former [∝CFF, e.g., for a spherical cavity 93 ∝ | 3ϵ(ω) 2ϵ(ω)+1 | 2 ] for a large enough dielectric constant such as the one corresponding to the equilibrium TD-PCM for water (ϵs = 80).
In conclusion, Octopus is now capable of including an implicit dielectric continuum model as an environment for quantum mechanical calculations of excited states in the time-domain.The different versions of the TD-PCM scheme implemented allow us to select the most suitable to capture the relevant physics, accounting for a full range of different relaxation and response times.

VIII. MAGNONS FROM REAL-TIME TDDFT
][97][98][99][100] Here, we are interested in transverse magnetic excitations, specifically magnons, which are long wavelength collective excitations with a typical energy of tens to hundreds of millielectron volts.We recently developed an alternative to the linear-response TDDFT formulation 98,99,101 to compute the spin susceptibilities of magnetic systems based on real-time TDDFT. 102Our approach follows the work of Bertsch et al. for optical excitations.In the original work of Bertsch et al., the system is perturbed by a sudden change in the vector potential, which induces a charge-and current-density response.To investigate magnons, we employ a "transverse magnetic kick," which induces magnetic fluctuations in the system.To be more precise, our perturbation corresponds to an infinitely short where q is the momentum of the spin wave we are exciting, B is the strength of the perturbation, and σ ± = σx ± iσy are Pauli matrices.
In this expression, the z axis is taken to be along the direction of the magnetization of the system before excitation.In case the system has a preferred magnetization direction due to the presence of spin-orbit coupling, usually referred as an "easy axis," we perform a kick in the transverse direction with respect to this "easy axis" of the system.
The subsequent time evolution of the spin magnetization m(r, t), governed by the time-dependent Schrödinger equation, is then computed and analyzed in Fourier space.If we perturb our system from its ground state and we assume linear response, we have directly that where χ+−(q; ω) is the spin susceptibility we want to extract.A similar expression is obtained for χ−+(q; ω).
In order to access finite momenta, one typically has to employ large supercells to perform the dynamics, which can be computationally very expensive when a few meV energy resolution is needed.Fortunately, there is a way to circumvent the construction of supercells by using the so-called generalized Bloch theorem (GBT).The GBT has been introduced by Sandradskii 103 for the calculation of ground-state spin waves and requires the implementation of specific boundary conditions.We implemented the GBT for the real-time calculation of magnons, taking advantage that Octopus is a real-space finite difference code, for which we can easily specify any type of boundary conditions.However, it is important to note that this applies only in the absence of spin-orbit coupling.The boundary condition, described below, depends on the momentum q of the perturbation and acts differently depending on whether a state was originally "up" or "down" with respect to the unperturbed magnetization.This is determined by the sign of ⟨Φn k |Sz|Φn k ⟩ just before the perturbation for each spinor state |Φn k ⟩.If we label these states α and β, the boundary condition reads We checked that performing the simulation using the GBT or the supercell approach leads to the same results up to numerical precision.We tested our approach for cubic Ni, Fe, and Co, which have been widely studied using linear response TDDFT, and we found that our results are in very good agreement with previous work, 96 thus validating our implementation.Figure 9 shows the results obtained for bulk nickel using the adiabatic local-density approximation.We used here a real-space grid spacing of 0.27 bohr, norm-conserving pseudopotentials, and we employed a 16 × 16 × 16 k-point grid shifted 4 times in order to resolve momenta q that are multiples of 2π/(16a), where a is the lattice parameter of Ni, which we took to be 3.436 Å.In order to obtain the linear response, we took B = 0.02 and we propagated for 435.4 fs using a time step of 1.81 as.To reduce the numerical burden, symmetries were employed.
Let us now comment on the interest of the proposed approach.So far, we only used this new method for investigating weak magnetic kicks, where we recover the results from linear response theory.However, our approach does not rely on the assumptions of small perturbations and can be used to investigate nonlinear phenomena induced by a strong magnetic field as well as out-ofequilibrium situations where the system is kicked from an excited state.6][107][108] Moreover, we can investigate directly the coupling with other degrees of freedom, such as phonons or photons, without any new theory or code development.Using the real-space method, we benefit from the favorable scaling of the time propagation, which is linear in the number of states, whereas sum-over-state approaches usually scale quadratically with the number of states.Finally, our approach offers the great advantage of not requiring the use of an exchangecorrelation kernel, as only the exchange-correlation potential is needed to perform a time propagation.This is very interesting in order to test new functionals and theory levels for which deriving the expression of the exchange-correlation kernel can become complicated.These different aspects will be investigated in future works.

IX. ORBITAL MAGNETO-OPTICAL RESPONSE OF SOLIDS AND MOLECULES FROM A STERNHEIMER APPROACH
In this section, we review the implemented routines for magneto-optical phenomena, which arise from the loss of symmetry between left and right circularly polarized light in the presence of a magnetic field. 109,110While magneto-optical spectra can be straightforwardly computed for molecules, [111][112][113][114][115] the theory for solids has been developed only recently. 116][125] To treat uniform magnetic fields in periodic systems, we use the approach based on perturbation theory for the one-particle density matrix. 116,120,122Such an approach allows us to work under purely periodic boundary conditions and to take into account automatically gauge invariance.A periodic and gauge-invariant counterpart Õ is distinguished for any operator O = Or 1 r 2 defined for two points r 1 and r 2 in real space where A is the vector potential associated with external electric (E = −c −1 ∂/∂tA, where c is the speed of light) and magnetic (B = ∇ × A) fields and the integral is taken along the straight line between points r 2 and r 1 .
The time-dependent Liouville equation for the gauge-invariant counterpart ρ of the one-particle density matrix to the first order in E and B takes the form 116 Here, the commutator and anticommutator of operators O (1) and O (2) are denoted by [O (1) , O (2) ] and {O (1) , O (2) }, respectively, the velocity operator V = −i[r, H] is computed taking into account all non-local contributions to the Hamiltonian (e.g., non-local pseudopotentials), and the Hamiltonian is represented as H = H 0 + δ H, where the difference δ H between the gauge-invariant counterpart H of the Hamiltonian and unperturbed Hamiltonian H 0 is related to the local-field effects.Unlike the singular position operator r, the commutator [r, ρ] of the position operator with the periodic function ρ is well defined in Eq. (38) and corresponds to the derivative with respect to the wave vector, i∂ k ρk , in reciprocal space.Differentiating Eq. (38), one finds the derivatives of the density matrix ρ(P) = ∂ρ/∂P with respect to external perturbations P (E, B, etc.).
For TDDFT calculations in Octopus, ρ is the Kohn-Sham density matrix.The nth order derivative ρ(P) describing the joint response to the perturbations P = P 1 P 2 . . .Pn is divided into four blocks within and between the occupied (V) and unoccupied subspaces (C): ρ(P) VV = Pv ρ(P) Pv, ρ(P) CC = Pcρ (P) Pc, ρ(P) VC = Pv ρ(P) Pc, and ρ(P) CV = Pcρ (P) Pv, where Pv = ρ (0) and Pc = 1 − Pv are the projectors onto the occupied and unoccupied bands.In accordance with the density matrix perturbation theory, 126 to get the elements ρ(P) CV , Eq. ( 38) is projected onto unperturbed Kohn-Sham wavefunctions |ψ  [ρ (n−1) , . . ., ρ (0) , n (P) ]|ψ Here, the operator on the left-hand side is given by Lv k (Ω) = Ω + H 0 −ϵv k , where Ω is the frequency considered and ϵv k is the energy of the unperturbed state |ψ (0) vk ⟩.The operator R comes from the right-hand side of Eq. ( 38) and is determined by the derivatives of the density matrix of the previous orders.If the local-field effects are taken into account, the right-hand side R also depends on the derivative of the electron density n (P) (r 1 ) = ρ (P) (r 1 , r 2 )δ(r 1 − r 2 ).In this case, Eq. ( 39) needs to be solved self-consistently.The calculations, in practice, work with the periodic parts of the wavefunctions, |u (0) vk ⟩.The commutator [r, ρ] corresponding to i∂ k ρk in the reciprocal space is computed within the k ⋅ p theory. 19,124,125Equation ( 39) is solved using the efficient Sternheimer approach, 19,[123][124][125] where the function |η (P) vk (Ω)⟩ that fits into Eq.( 39) is found iteratively at each frequency Ω.To avoid divergences at resonances, a small but finite imaginary frequency iδ is added to the frequency Ω 0 of the external perturbation so that Ω = Ω 0 + iδ.
Once the solution of Eq. ( 39) is known, the elements ρ(P) CV are obtained as The elements of ρ(P) between the occupied and unoccupied subspaces are found using the relation ρ(P) VC (Ω) = (ρ (P) CV (−Ω * )) * , and for that, Eq. ( 39) is also solved for the frequency −Ω * .
To find the elements within the occupied ρ(P) VV and unoccupied ρ(P) CC subspaces, the idempotency condition ρ = ρρ is used.In terms of the periodic counterpart ρ of the density matrix and to the first order in the magnetic field, it is written as 120,122 The contribution ανμ,γ to the polarizability in the presence of the magnetic field (ανμ = α 0νμ + ανμ,γBγ) is finally obtained from the current response as where indices ν, μ, and γ are used to denote components of the vectors V, E, and B. The dielectric tensor in the presence of the magnetic field is computed as ϵνμ = δνμ + 4πανμ/V, where V is the volume of the unit cell.Note that according to the "2n + 1" theorem, 127,128 there is no need to calculate explicitly the secondorder derivative ρ (E μ B γ ) .Instead, ανμ,γ is expressed through the firstorder derivatives to the perturbations P = Eμ, Bγ and a supplementary perturbation corresponding to a vector potential P = Aν at frequency −Ω. 116o test the formalism for solids, we applied it to bulk silicon and the corresponding results are shown in Fig. 10.The full details of the calculations can be found in Ref. 116.It is seen in Fig. 10(b) that even without accounting for excitonic effects, the calculated spectra Re/Im ϵxy for the transverse component of the dielectric tensor are already qualitatively similar to the experimental curves 110 at the direct absorption edge.To model excitonic effects, we used the model from Ref. 129 The blue shift by 0.7 eV is applied to take into account the GW correction to the bandgap. 130,131The experimental data for ϵxx 132 and ϵxy 110 (scaled by 1/2) are indicated by symbols.
accounting for excitonic effects show a better agreement with the experimental results [Figs.10(a) and 10(b)].Although the magnitudes of the peaks in the magneto-optical spectra are about a factor of two smaller than in the magneto-optical measurements, 110 they can be corrected by reducing the linewidth δ assumed in the calculations.
In the limit of a large supercell, this formalism becomes equivalent to the simpler standard formulation for finite systems, which we have also implemented for reference.In this case, the Liouville equation for the density matrix is written in the Coulomb gauge as where d = −r is the electric dipole moment, m = −r × V/2c is the orbital magnetic dipole moment, and δH describes local-field effects.The change in the polarizability ανμ,γBγ in the presence of the magnetic field is calculated from the dipole response As in the periodic case, due to the "2n + 1" theorem, 127,128 the explicit calculation of the second-order derivative ρ (E μ B γ ) is avoided.Instead, a supplementary electric field at frequency −Ω, E ′ ν , is introduced. 116In the case of real wavefunctions, |η )⟩) * , and there is no need to solve additionally the Liouville equation for this supplementary electric field.
The efficiency of the present scheme to compute magnetooptics is comparable to standard linear-response calculations of simple optical polarizability in the absence of the magnetic field. 116hen local-field effects are included self-consistently, the calculations of magneto-optical spectra for solids take only twice as long as those of polarizability.For finite systems, the computational effort is the same as for the simple optical polarizability.

X. TIME-DEPENDENT ANGULAR RESOLVED PHOTOELECTRON SPECTROSCOPY
The real-space representation of the dynamics of the electronic structure allows for a seamless and straightforward description of dynamical processes outside the material, i.e., processes where electrons are excited into vacuum.Beyond simply describing the ionization process, Octopus has routines implemented that compute photoelectron spectroscopy in different flavors.Photoelectron spectroscopy is particularly ubiquitous for the characterization of the electronic structure in solids because it provides a direct observable of the energy and momentum distribution of electronic states, known as the bandstructure.In contrast to other electronic structure methods that compute quantities linked to the photoelectron spectrum, most commonly the single-quasiparticle spectral function obtained through the GW approximation to the many-body selfenergy, the approach described here does not consider unit cells of bulk materials but instead computes directly the energy and momentum resolved ionization probability. 107,133Besides the study of the electronic structure of solids, photoelectron spectra of atoms and molecules are of equal interest and the implementation in Octopus is capable to describe accurately all such systems on an equal footing.
The most detailed quantity available in the experiments is the momentum-resolved photoelectron probability P(p), i.e., the probability to detect an electron with a given momentum p.In some cases, the experimental setup offers the possibility to measure the spin polarization along a given axis.The formalism we are going to outline can easily accommodate a non-collinear spin structure and, therefore, calculate spin-resolved quantities, but for the sake of simplicity, in the following, we are going to restrict ourselves to closed shell systems with collinear spins, i.e., where all the orbitals are doubly occupied with electrons having opposite spins.The reader interested in the most general case can find a detailed description in Ref. 134.From P(p), one can obtain other derived quantities by simple manipulation.For instance, the energy-resolved spectrum P(E), used to identify the occupied energy levels, can be obtained with a change of variable using the free electron energy dispersion relation E = p 2 /2.The angle-resolved photoelectron spectrum (ARPES), normally employed to measure the quasiparticle bandstructure, is simply obtained by taking the energy resolved spectrum as a function of the electron momentum parallel to the surface P(p ∥ , E).
The t-SURFF method was first proposed by Tao and Scrinzi 135 for one-electron systems and later extended to many electrons with TDDFT for periodic 134 and non-periodic systems. 136It is based on the assumption that the Kohn-Sham Hamiltonian describing the full experimental process, i.e., including the ionization and detection, can be decomposed into the sum of two Hamiltonians acting into complementary spatial regions, inner and outer, and that can be approximated in different ways.In the inner region surrounding the system, the electron dynamics is governed by the interacting Kohn-Sham Hamiltonian ĤKS (r, t), while in the outer region, electrons are free from the Coulomb tails of the parent system and behave as independent particles driven by an external field and, therefore, are described by the Volkov Hamiltonian ĤV (r, t) = 1/2(−i∇ − A(t)/c) 2 .We express the field with a time dependent vector potential potential A(t) in the dipole approximation, i.e., by discarding the spatial dependence of the field A(r, t) ≈ A(t).The

ARTICLE scitation.org/journal/jcp
advantage of this approach is provided by the fact that the timedependent Schrödinger equation associated with the Volkov Hamiltonian can be solved analytically and that the solutions, the Volkov waves, e ip⋅r e iΦ(p,t) ( with are eigenstates of the momentum operator.We can, therefore, expand the Kohn-Sham orbitals φi as a superposition of detector states in the form of Volkov waves φi(r, t) = ∫ dp bi(p, t)χp(r, t) and obtain the photoelectron probability in terms of the expansion coefficients by summing up the contribution of all the orbitals: Using the continuity equation, t-SURFF allows us to express the coefficients bi, and thus P(p), as a time integral of the photocurrent flux through the surface S separating the domains of the two Hamiltonians.More specifically, with the single-particle current density operator matrix element given by The flexibility offered by the definition of the boundary surface S allows us to easily adapt t-SURFF to periodic and non-periodic systems.For periodic systems, we choose S as a plane parallel to the material surface, while for the non-periodic case, the most natural choice is a sphere, as shown in Figs.11(a) and 11(b).
For periodic systems, one can apply the Bloch theorem to the Volkov Hamiltonian and further reduce Eq. ( 48) to an expression where only the periodic part of the Bloch waves is employed. 134he present computation of the photoelectron spectrum relies on the condition that the Volkov Hamiltonian can be used to describe electrons in a region of space and entails the following approximations: (i) the ionized electrons become non-interacting at some distance from the system and (ii) they are free with respect to the tail of the Coulomb potential of the system, i.e., it has to be possible to neglect it at some distance.Both conditions are controlled by the placement of the analyzing surface S for the evaluation of the flux, and in practice, this has to be converged by placing it at

ARTICLE
scitation.org/journal/jcpsuccessively larger distances from the system.This method needs an explicit description of the vacuum around the probed system, which for solids requires an explicit construction of the electronic structure of the surface layer, and if bulk properties are probed, one needs to converge a slab of material.Compared to unit cell calculations of solids, this puts the present method at a disadvantage.However, when aiming at simulating specific experiments, it natively includes signals coming from the surface layers and can thus capture more of the experimental reality of the spectroscopic process compared to standard perturbative many-body approaches that are directly performed in the bulk unit cell.
In practical calculations, in order to avoid spurious reflection of electrons from the boundaries of the simulation box, one has to employ absorbing boundaries, and depending on the condition of their transparency, this can add up to the size of the simulation box. 137Other than the increased size of the computational box, this method is not computationally costly, as it requires the evaluation of the gradient operator on surface points S only and straightforwardly supports the existing k-point, state, and mesh parallelizations.
Pump-probe configurations can be simulated naturally since there is no restriction on the functional form of the vector potential A(t), and therefore, we can accommodate any linear combination of pulses.As an example, in Fig. 11(c), we show the result of a simulated pump-probe time-resolved ARPES measurement of monolayer hexagonal boron nitride (hBN) driven by a field resonant with the gap at the K-point in the Brillouin zone.As it is apparent from the simulation, the excited state population transfer to the conduction band is well observed in the resulting ARPES spectrum.
The method is not limited to simple photoelectron spectroscopy but can be used to simulate complex experimental techniques, such as the reconstruction of attosecond beating by interference of two-photon transitions (RABBITT). 138Furthermore, due to the versatility of the real-time description in Octopus, the underlying excitation does not need to be originating from an optical pulse (i.e., an external vector potential), and since the ions can move according to TDDFT-Ehrenfest dynamics, 139 one can also simulate features coming from the ionic or lattice motion, specifically electronphonon coupling signature in ARPES. 140Pump-probe simulations are not limited to the dynamical process of excitations but can also be employed to study steady-state modifications of driven electronic states. 141This opens the possibility of studying Floquet physics from an ab initio perspective and to simulate directly the effect that a periodic force would have on the dressed electronic structure-an important aspect to underline, given the growing interest in the field of Floquet engineering 142 and Floquet analysis. 143inally, t-SURFF is particularly suited to simulating strong field ionization processes such as in laser-induced photoelectron diffraction (LIED) experiments, where the ionization takes place by direct tunneling into the continuum and the field is so strong as to drive electrons in trajectories recolliding with the parent system.In fact, since in Eq. ( 48), we accumulate the flux through S over the time of the propagation, with t-SURFF, electrons can seamlessly flow back and forth through the surface without producing any artifact in the final spectrum.As an example, in Fig. 11(d), we present the photoelectron angular distribution of C 60 ionized by a strong IR pulse capable of inducing rescattering dynamics, which then gets imprinted in the photoelectron spectrum as characteristic rings centered in the value of the vector potential at the instant of rescattering. 144In this regime, the result of simulations obtained with Octopus are in excellent agreement with the experiments. 145,146

XI. ELECTRIC AND THERMAL CONDUCTIVITIES
Electrical conductivity in real materials can be described to a first approximation by Ohm's law, which is a linear relationship between the applied electric field and the current density generated in response.The standard method to study electrical and thermal conductivity is to carry out a DFT calculation at equilibrium and apply Kubo-Greenwood linear-response theory to evaluate the conductivities.Although this method has been used successfully in the past to calculate the transport properties, 147,148 the application is limited to systems in the linear response regime.It is known that in many cases, especially in the presence of strong external fields, Ohm's law is no longer valid and the system exhibits non-linear behavior.In order to capture this complex behavior in materials, one must go beyond simple, linear approximations and describe electron interactions in a non-trivial manner.
One promising route to go beyond linear response is to use density functional methods to study directly thermoelectric transport.This topic has received less attention but has recently been implemented in Octopus and applied to liquid aluminum. 149For detailed descriptions and derivations of non-equilibrium thermoelectric phenomena using density based methods, we refer the reader to Ref. 150.
In Octopus, we are able to calculate the current density and heat current density at each step during a time-dependent simulation.A current is induced in this study by applying an electric field of the form E(t) = E0δ(t), where E0 describes the magnitude and direction of the field.The electric field is induced through a time-dependent vector potential [E(t) = −c −1 ∂tA(t), where c is the speed of light] in the Hamiltonian.This vector potential satisfies the periodic boundary conditions of an extended system.It should be noted that there is no limitation on the form of the applied electric field in general.We are able to evaluate the macroscopic current density as where Ω is the volume of the unit cell.The energy current density, Ĵh (r, t), is expressed as where Ĵt (r) is the kinetic energy contribution given as and φ = φ(r, t) are understood to be the time-dependent states.
The output of the time-dependent simulation can be further analyzed to yield the frequency-dependent conductivity.The electrical (or thermal) conductivity σ can be found by Fourier transforming the corresponding current as In general, it is possible to study the conductivity of an extended system in the presence of an applied electric field.Here, we ARTICLE scitation.org/journal/jcpFIG.12. (Left) Time-dependent current density for one ionic snapshot of a supercell of 128 atoms of hydrogen in the phase at 1400 K and 400 GPa with an initial electric field E 0 of 0.1 a.u.The blue line represents the current density as given in Eq. ( 50), and the red line is the heat current evaluated from Eq. ( 52).Both currents are decaying to zero.(Right) The Fourier transform of the respective currents gives the frequencydependent conductivities.
illustrate the use of this method on hydrogen at 1400 K and 400 GPa.At this temperature and pressure, hydrogen is in its liquid metallic phase.A molecular dynamics simulation was carried out with VASP 151 to produce several ionic snapshots of the system.For each ionic snapshot, we performed a TDDFT simulation with a time step of 0.05 a.u.(0.001 21 fs) for a total time of 3 fs.The initial electric field E 0 was 0.1 a.u.The calculation was carried out on a 3 × 3 × 3 k-point grid to ensure that the current density decays to zero.
In Fig. 12, we plot the time-dependent current density and heat current density as a function of time after an initial electric field is applied at time zero.Both currents are shown to decay to zero by the end of the simulation.It should be noted that the heat current corresponds to the kinetic energy contribution given in Eq. ( 52).In general, this heat current is related to the Peltier coefficient, since it describes the heat generated from the response to an electric current.The remaining contributions in Eq. (51) are not yet implemented in Octopus.Both Ju and J f arise from electron-electron interactions and Jv is the potential energy.In future work, it would be interesting to evaluate these remaining contributions to determine if they are large or can reasonably be ignored for some systems.The electrical conductivity and the heat conductivity can be evaluated using Eq. ( 53).These frequency-dependent conductivities are shown in Fig. 12 (right).One may also calculate the DC conductivity by taking ω = 0.This suite of tools implemented in Octopus will allow for the study of time-dependent thermoelectric phenomena using TDDFT.This approach to study current and conductivity is more general and widely applicable than the standard Kubo-Greenwood approach.While Kubo-Greenwood is a linear response theory, this method can be applied to study materials where non-linear conduction effects can appear.In a recent study, this TDDFT method was used to illustrate non-linear conductivity effects in liquid aluminum, 149 which would not be possible by applying standard linear response theory.In the future, we plan to extend these tools by implementing a thermal vector potential that can induce a heat current in an extended system during a time-dependent simulation.

XII. LOCAL DOMAIN CONTRIBUTION TO PHYSICAL OBSERVABLES
Usually, we are interested in studying systems consisting of different atoms, molecules, regions, or domains.In such cases, we might want to understand the different contributions from different parts of the system toward a specific observable.Based on the Hohenberg-Kohn theorems for the DFT, 152 and its extension for TDDFT, the Runge-Gross theorem, 153 we can state that any physical observable of the system is a functional of the electronic density, either static (ground state) or time dependent.In other words, the expectation value of an operator Ô can be expressed as a functional of the electronic density Let us now consider the case where we write the total electronic density as a sum of N densities and no further constraints are imposed on n or ni.Such partitioning of the density is at the basis of subsystem DFT, which allows us to divide a system into several Kohn-Sham subsystems that interact with each other in a theoretically well justified manner (see Ref. 154  and references therein).However, in our case, we are only interested in this kind of partition for post-processing purposes where we assign a density ni to a specific part of the system in order to identify how it contributes to a given observable.We are, thus, interested in operators for which the following condition is true: Such operators are said to be additive.One example of an additive operator that is particularly relevant in the context of TDDFT is the time-dependent dipole as this is the relevant observable for obtaining the optical absorption cross section of finite systems.The range of operators that fulfill Eq. ( 56) can be further expanded if we now consider non-overlapping densities, that is, densities ni that are non-zero only in a given domain Vi and that the domains do not overlap.We refer to this type of partitioning as a local domain partitioning.In such cases, any (semi-)local operator that depends (semi-)locally on the density should fulfill Eq. (56).In this context, an example of a relevant observable is the exchangecorrelation energy within the LDA (or GGA) approximation The Journal of Chemical Physics ARTICLE scitation.org/journal/jcp where e LDA xc is the exchange-correlation energy per unit particle.Currently, several different options can be used in Octopus to define specific regions of the simulation box.These options include simple geometric shapes, such as spheres centered on particular points of space, atom-dependent domains, such as a union of spheres centered on the atoms, or the definition of Bader volumes.The latter option follows the Quantum Theory of Atoms in Molecules (QTAIM) 155 and associates a density region with a given atom through a density gradient path such that the boundaries of each volume are defined as the surfaces through which the charge density gradient has a zero flux.Note that some of these options allow for the user to specify overlapping regions.In this case, the overlap must be taken into account when analyzing the results and extra care is required when comparing results from different domains.
Let us exemplify the applicability of the local domain partitioning by decomposing the optical response for a coupled chromophore system.Similar to Frozen Density Embedding real-time TDDFT (FDE-rt-TDDFT), 156 we decompose the total optical spectrum as a sum of the local response within each domain where α is the dynamic polarizability.In addition, from each local dynamic polarizability tensor, we can compute for each fragment the corresponding cross section.This is justified by the fact that the relevant operator to calculate α is the dipole operator, and as shown above, this is an additive operator that fulfills Eq. (56).
As an example, we performed a series of optical spectrum simulations using real-time TDDFT for a benzene-fulvene dimer with different π-stacking separation ranging from 4 Å to 8 Å.We use the standard PBE exchange-correlation functional 157,158 and the optimized norm-conserving Vanderbilt PBE pseudopotential (sg15) set. 159The real-space grid is defined as a parallelepiped box with length 16 Å, 17 Å, and 20 Å in the three Cartesian axes.The spacing between points is set to be 0.13 Å.
For each intermolecular separation, we carry out a single ground-state calculation and three time-propagations.For each time-propagation, a dipolar electric perturbation is applied along one of the Cartesian axes. 125We let the perturbed Kohn-Sham states evolve for a propagation time of T = 24 ̵ h/eV (15.8 fs) with a resolution of 0.26 eV (2π/T).The ground-state electron density is fragmented following the Bader atomic decomposition, and each molecular domain is defined as the sum of these atomic volumes.Then, the corresponding dipole operator is applied over each defined domain.Finally, by Fourier transform of the local time-dependent dipole moment, the local polarizability tensor is recovered.
13 shows the spectral decomposition of the benzenefulvene system as a function of the intermolecular separation.The full system at large intermolecular separation (8 Å) shows two major peaks located at the same frequencies as for the isolated molecules.As the intermolecular distance becomes smaller, the global spectrum changes due to electrostatic effects induced by the electronic cloud FIG. 13. (Top) Schematic representation of the benzene-fulvene dimer with π-stacking along the z axis and a intermolecular distance of 4 Å.From left to right, the electronic density for the global system for an isosurface of 0.006 Å −3 and the QTAIM local density for the fulvene and benzene molecules, respectively.Visualized with USCF Chimera software. 160(Bottom) Photo-absorption cross section obtained from, from left to right, the full system density, the fulvene local density, and the benzene local density.

ARTICLE scitation.org/journal/jcp
of the neighboring molecule.We can see that the peak located at around 5 eV suffers a red-shift, while the stronger peak between 6.5 eV and 7 eV reduces its intensity, giving rise to an oscillator strength transfer to excited states at higher energy.The local domain analysis reveals that the major change is caused by the reduction of the oscillator strength of the fulvene peak located at around 6.5 eV.These results are in agreement with the FDE-rt-TDDFT calculations of Ref. 156, further validating our strategy.In addition, the local domain methodology does not require any previous fragmentation, selection, or localization of the basis set, allowing us also to treat very large systems, such as biological molecules like the major light harvesting complex LHC-II. 161Recently, we have shown that this technique also allows for exciton coupling calculations when combined with a new formulation of the transition densities from real-time TDDFT. 162

XIII. NEW PROPAGATORS FOR REAL-TIME TDDFT
At the core of most Octopus calculations lies the need to integrate the time-dependent Kohn-Sham (TDKS) equations.These are a set of non-linear equations, given the dependence of the Kohn-Sham Hamiltonian on the electronic density.Upon the discretization of the electronic Hilbert space, they take the generic form of a system of first-order ordinary differential equations (ODEs) where t 0 is the initial time, φ is an array containing all the Kohn-Sham orbitals, {φ N m } (φ 0 is its initial value), and f is a vector function (f = (f 1 , . .., f N )) given by the action of the Kohn-Sham Hamiltonian Ĥ[n(t), t], fi(φ(t), t) = −i Ĥ[n(t)t]φi(t) , (i = 1, . . ., N).
Note that (i) if the nuclei are also to be propagated, the state should be supplemented with their position and momenta, and the equations with the corresponding nuclear equations of motion.(ii) Likewise, the formalism for solids includes a polarization vector field 22 that must also be included in the system definition and propagation.(iii) Strictly speaking, the TDKS equations are not ordinary but "delay differential equations" (i.e., equations for which the derivative of the unknown function at a certain time depends on the function values at previous times), due to the dependence of the exact exchange and correlation potential functional on the past densities.This is ignored in the adiabatic approximation, which is almost always assumed.A myriad of numerical propagators for ODEs are available, all of them theoretically applicable to the TDKS equations.Ideally, however, one should choose a propagator that respects all the mathematical properties of the equations that describe the problem at hand, for example, the preservation of the norm of the orbitals.In the case of the TDKS equations in the adiabatic approximation, another such property is symplecticity (a fact that is demonstrated in Ref. 163).This property is usually stated in terms of the conservation of the volume of the flow of the system of ODEs in the phase space, although the precise definition is as follows: Any system of ODEs can be viewed as a "flow," a differentiable map g : R p → R p that transforms the state y(t 0 ) at some point in time t 0 into the state at time t, y(t) = g(y(t 0 )).For systems described with complex variables such as ours, since it can be split into a real and an imaginary part, p is even: g : R 2N → R 2N .A map with an even number of variables such as this one is defined to be symplectic if and only if where I is the unit matrix of dimension N, and x ∈ R 2N (conventionally, the first N variables of x are called the "coordinates" and the second half are the "momenta," but no physical meaning should be assumed for them in this purely mathematical definition).
A numerical propagator is also a differentiable map, which relates the solution y(t) to y(t + Δt).As such, it may respect or not the symplecticity-and other properties-of the original flow.
The symplecticity also means that the system has to be "Hamiltonian": one may find a set of coordinates q ∈ R N , p ∈ R N (in this case, the real and imaginary parts of the orbitals coefficients) and some scalar function H(q, p) (the Hamiltonian expectation value), which permits us to rewrite the system into the well-known form of Hamilton's equations The relevance of the symplectic numerical propagators stems from the fact that they present some features, such as a better conservation of the energy at long times, where the value of the energy ends up oscillating around the true value instead of diverging.For a more detailed discussion about symplecticity on numerical propagators, see Ref. 164.
The Octopus code has several propagator options, many of them already described in Ref. 165, such as the Crank-Nicolson, standard Runge-Kutta, exponential midpoint rule (EMR) and variations (e.g., the "enforced time-reversal symmetry" scheme), and split operator techniques.In a recent paper, 163 some of the present authors have studied propagators of different families that had been scarcely (or not at all) tested for the TDKS equations: multistep, exponential Runge-Kutta, and commutator-free Magnus (CFM) expansions.After considering the accuracy, stability, and the performance of the propagators, the CFM techniques were identified as suitable schemes for TDDFT problems and implemented in Octopus.In this section, we make a brief description.
Developed in Ref. 166 for linear non-autonomous systems, the CFM expansion offers an alternative to the "standard" Magnus expansion, which requires expensive application of nested commutators of the Hamiltonian with itself at different times.In essence, CFM expansions Γ(t + Δt, t) consist of substituting the propagator Û(t + Δt, t) by a product of exponentials The m linear operators Di are either the Hamiltonian at different times within the interval [t, t + Δt] or parts of it.

ARTICLE scitation.org/journal/jcp
We implemented in Octopus an order four (q = 4) version, given, for example, in Eq. ( 43) of Ref. 166, which we call hereafter CFM4.This propagator requires two exponentials (m = 2), where The alternative that we implemented, however, is to approximate Ĥ[ti] via an extrapolation from the Hamiltonian at various previous time steps (in practice, it is the Hartree, exchange, and correlation parts that must be extrapolated).The resulting method is then explicit, i.e., no linear or non-linear algebraic equations need to be solved.The fourth order accuracy is preserved as long as the extrapolation is also done at order four.
As an example, Fig. 14 shows the performance of this CFM4 method against the well-known exponential midpoint rule (EMR).The benchmark system consists of a benzene molecule.It is subject to an instantaneous perturbation at the beginning of the propagation, and then, it evolves freely for some fixed interval of time.The propagations are performed at varying values of Δt.The plot displays the cost of the propagation vs the accuracy achieved in each case.As we can see, the CFM4 outperforms the EMR for all values of the error examined, although the differences become more obvious when higher accuracy is demanded.The key difference between the two methods is the fourth-order accuracy of the CFM4 scheme-at the cost of requiring two exponentials, instead of just one, while the EMR is only second-order accurate.This is reflected in the different slopes of the curves as the error becomes smaller, i.e., as Δt → 0.

XIV. CONJUGATE GRADIENT IMPLEMENTATION IN RDMFT
The RDMFT implementation in Octopus has been described in detail in a previous paper. 19In this section, we review briefly the existing implementation, providing some new insights, and introduce a recently implemented method to solve the RDMFT equations, which is better suited to the real-space grids used in the code.
The optimization of the natural orbitals in RDMFT is subject to an orthonormalization constraint for the orbitals.One minimizes the functional where E total and M denote the total energy and the number of natural orbitals ψj included in the calculation, respectively.The parameters λ jk are the Lagrange multipliers, which ensure the orthonormality of the natural orbitals at the solution point.We note that, for an RDMFT calculation, the number of natural orbitals has to be larger than the number of electrons in the system (core electrons that are included in the pseudopotential do not count).The exact number M is system-dependent and should be treated as an additional parameter with respect to which a convergence study should be carried out, just like it is done for the basis set or the grid parameters.Typically, the optimization is performed using the so-called Piris method, 167 which was the method previously implemented in Octopus.Within this approach, one uses the orthonormality constraint of the natural orbitals, which implies that a certain matrix constructed from the Lagrange multipliers λ jk is diagonal at the solution point.As an immediate consequence of the Piris method, the natural orbitals at the solution point are linear combinations of the orbitals used as a starting point for the minimization.In other words, the initial orbitals serve as a basis.Consequently, the necessary matrix elements for different energy contributions can be calculated for the basis functions (initial orbitals) before the iterative optimization of the natural orbitals and their occupation numbers is started.In addition, the optimization of the occupation numbers can be turned off completely, resulting in a Hartree-Fock calculation on the basis of the initial orbitals.
For the existing implementation in Octopus of the Piris method, the initial orbitals are taken to be the solutions obtained with a different level of theory, such as independent particles or DFT.In order to better understand the effect of the choice of basis, we tested the following choices: (i) independent particles, density functional theory within (ii) the local density approximation (LDA) or (iii) the exact exchange (EXX) approximation, and (iv) the Hartree-Fock approximation.In all cases, we have to ensure that the number The Journal of Chemical Physics ARTICLE scitation.org/journal/jcp of unoccupied states in the calculation is sufficient to cover all the natural orbitals that will obtain significant occupation in the following RDMFT calculation.The results for the convergence of the total energy of a one-dimensional (1D) hydrogen molecule using the Müller functional 168 are given in Fig. 15.The calculations were performed on a 1D grid extending from −12.0 bohr to 12.0 bohrs, with a grid spacing of 0.03 bohr.The nuclear potential for the 1D hydrogen molecule reads with d = 1.628 bohr, which corresponds to the equilibrium geometry.
The electron-electron interaction in one dimension is described by the soft-Coulomb interaction Since Octopus performs all calculations on a finite grid, we typically obtain a finite number of bound states in the calculations for the basis set and any additional orbitals that extend over the whole grid, i.e., they are unbound and, therefore, delocalized.However, all natural orbitals with non-zero occupation, because they decay with the ionization potential of the system, 170 are localized on the system.Hence, the extended basis states will only contribute with very small coefficients, if at all, and their inclusion in the basis set does not lead to a significant improvement of the results.In the past, this problem was addressed by performing an additional step to localize the initial states before starting the RDMFT calculation.However, further investigations showed that this only improves the results for a small number of natural orbitals in the calculation.Testing the convergence with respect to the number of natural orbitals then showed that the additional localization step slows down the convergence FIG. 15.Total energy for the RDMFT calculation for one-dimensional H 2 169 using the Piris method with different basis sets and using the conjugate gradient implementation.The inset shows a zoomed-in view of the area where convergence is reached.We employ the Müller functional 168 for all calculations.
with respect to the number of basis functions.The fastest convergence and lowest total energies are obtained by using the results from an independent particle calculation, as those yield the largest number of localized orbitals from all the different basis sets that were tested, as shown in Fig. 15.As the additional localization step proved to be unnecessary and even hinders convergence, it has been removed from the new version of the code.
Since the natural way of representing quantities in Octopus is directly on the real-space grid, and to circumvent the limitations with the quality of the basis sets available for the Piris method, we decided to implement a conjugate gradient optimization of the natural orbitals.This implementation follows the procedure for DFT explained in Ref. 171, which we adapted for RDMFT.This procedure allows us to take advantage of the full flexibility of a realspace grid and provides a systematic way of improving the results by enlarging the grid and reducing the spacing between grid points.The conjugate gradient algorithm requires a set of initial orbitals to start the self-consistent calculation; however, at convergence, the results are independent of that starting point.Therefore, while the calculation using the Piris method requires a set of initial states that serve as the basis, the conjugate gradient algorithm can be used starting from an initial set of random states.In our tests of the conjugate gradient implementation, the quality of the initial states only had an influence on the number of iterations necessary for the convergence, but not on the final result.We suggest to use the orbitals obtained from an independent particle calculation as initial states since they can be obtained for a small numerical cost and simultaneously can serve as a basis set in the Piris implementation.
Since we are not solving an eigenvalue equation to obtain the natural orbitals, they are not automatically orthogonal.As mentioned above, the orthogonality is taken into account via a constraint.Compared to Ref. 171, the non-diagonal Lagrange multipliers λ jk lead to two modifications in the conjugate gradient procedure.First, the steepest-descent direction [cf.Eq. (5.10) of Ref. 171] reads here where Second, the one parameter of the line-minimization [cf.Eq. (5.26) of Ref. 171] is changed to where Θ parameterizes the descend in the direction of the gradient and the single-particle Hamiltonian acting on a state |ϕ⟩ is defined as For details of the notation, we refer the reader to Ref. 171.
The convergence study with respect to the number of natural orbitals for the conjugate gradient algorithm is also included in The Journal of Chemical Physics ARTICLE scitation.org/journal/jcpFig. 15.As one can see, a smaller number of natural orbitals needs to be included in the calculation than for the basis set implementations with the Piris method.As the number of natural orbitals equals the number of basis functions in these calculations, this is mostly due to the fact that the available basis sets are of rather poor quality.In addition, the converged total energy for the conjugate gradient algorithm is slightly lower than for all the basis sets using the Piris method (see the inset of Fig. 15), which shows that the conjugate gradient algorithm exploits the full flexibility of the grid implementation, allowing for contributions to the natural orbitals that are not covered by any of the basis sets.We have also verified that the converged result of the conjugate gradient method is indeed independent of the choice of initial state for the Müller functional 168 employed in our calculations.The Müller functional is known to be convex when all infinitely many natural orbitals are included. 172This property is most likely not shared by all available RDMFT functionals.In addition, the number of natural orbitals in any practical calculation is always finite.Consequently, in practice, one needs to test the convergence for the different starting points, as the appearance of local minima cannot be excluded.

XV. PERIODIC SYSTEMS AND SYMMETRIES
Electronic structure in periodic systems is usually described using plane waves, but real-space grids have been shown to be a viable alternative when performing DFT and TDDFT calculations. 22,173Unfortunately, the discretization introduced by the realspace grid often breaks the direct connection between the physical system and the basis set, as symmetries and translations are, in most of the cases, incompatible with the discretized grid.However, realspace grids offer many advantages, such as natural mixed periodicboundary conditions for semi-periodic systems, and the calculation of the exchange and correlation term of DFT is straightforwardly obtained on the grid.
One of the main challenges for treating periodic systems in real space is the generation of the real-space grid and the corresponding weights for the finite differences.This becomes relevant when the grid is generated along the primitive axes of the Wigner-Seitz cell (primitive cell) of a solid, where the generating axes are usually nonorthogonal.
In Octopus, the grid points are generated along the primitive axes, and the calculation of the finite-difference weights follows the implementation described in Ref. 174.Once the grid is generated, and the weights for the finite differences (gradient and Laplacian) are obtained, it is necessary to deal with the discretization of the reciprocal space.Indeed, for periodic systems, the full crystal is replaced by the primitive cell of the crystal in real-space, thanks to the Bloch theorem, but is then complemented by the Brillouin zone, which also needs to be sampled.This grid, usually called k-point grid, is common to any type of basis sets, as long as one decides to reduce the crystal to its primitive cell.
In order to reduce the numerical effort associated with the description of periodic systems, we make use of the symmetries to reduce the Brillouin zone to its irreducible zone, which can drastically reduce the number of k-points.The space group of the crystal, as well as its symmetries, are obtained, thanks to the spglib library. 175In the present implementation, we restrict the symmetries to the symmorphic symmetries (inversion, rotations, and mirror planes), leaving for later the so-called non-symmorphic symmetries, i.e., symmetries involving a fractional translation, as they are not compatible with arbitrary real-space grids.
In order to assess the validity of our implementation, we used the so-called delta-factor test. 176Using the Schlipf-Gygi ONCVPSP 2015 pseudopotential set, we obtained the value of 1.50 meV/atom, which is very close to the value obtained by other codes using the same pseudopotential set.
When investigating the electron dynamics driven by a laser field or any type of symmetry-breaking perturbation (vector potentials, kicks with a finite momentum, strain, etc.), some of the original symmetries are lost.To deal with that, Octopus finds the small group of symmetries corresponding to the original space-group of the solid, retaining only the symmetries that leave invariant the perturbation direction.This defines the symmetries that are used for a time-dependent calculation.
One important aspect when using symmetries is the symmetrization (in real-space) of the charge and current densities, as well as other observables, such as kinetic energy density.In complement to the reduction of the Brillouin zone to its irreducible zone, we implemented a real-space symmetrization.We found that performing the real-space symmetrization is very important to achieve good and stable numerical results when taking into account symmetries.
For a comprehensive description of periodic systems, the ion dynamics has to be considered in addition to the electron dynamics.For isolated systems, it is often described with the TDDFT-Ehrenfest dynamics method, 17,177 where the ions obey the Newton equation with force fields computed by TDDFT.In contrast, for periodic systems, the ion dynamics has to be described with two sets of equations: one is the Newton equation for ions on the reduced coordinates in the primitive cell and the other is the equation of motion for the primitive cell itself, as the ionic coordinates of periodic systems are described by the combination of the reduced coordinates with the lattice vectors of the primitive cell.The lattice dynamics is often treated with the Parrinello-Rahman method, 178,179 and the key ingredient of the equation of motion is the stress tensor.Therefore, to realize the ab initio simulations for electron-ion-lattice dynamics based on TDDFT, the calculation of the stress-tensor has been implemented based on Ref. 180, and the implementation of the lattice dynamics is under way.

XVI. ADDITIONAL TECHNICAL CODE IMPROVEMENTS
In order to make all the new developments and applications described in the above sections possible, the code needs to be accurate, efficient, and reliable.In addition, when a code reaches the size and complexity of Octopus (currently more than 200 000 lines of source code), the amount of time required for maintenance and for adding new features becomes considerable.Therefore, continuous efforts at optimizing, validating, and improving the source code quality are needed.These efforts are essential, but often do not get the attention they deserve.In this section, we present some noteworthy developments to improve the code reliability and efficiency, to make the project easier to maintain, and to tackle new computer architecture challenges.

A. Web application for analyzing regression tests
In a complex code such as Octopus, every new development may have unintended side effects, possibly introducing errors to already existing features.To avoid such regressions, every change to the code is required to pass a suite of tests before being accepted.This suite is executed automatically using Buildbot 181 and is integrated with the continuous integration framework of the gitlab platform where Octopus is hosted for several years now.The testsuite is executed with 30 different toolchains spanning a variety of architectures, compilers, and Message-Passing Interface (MPI) libraries.It contains currently about 180 tests that execute Octopus roughly 740 times and that contain about 12 000 comparisons to reference values.The overall coverage of the tests is determined using the codecov.ioservice and lies at about 71% at the moment.
Although the testsuite is efficient in avoiding regressions, analyzing why a test failed has been difficult so far due to the large amount of data generated.Thus, we implemented an interactive analysis and visualization of the testsuite results as a web application available at https://octopus-code.org/testsuite/.This application makes it easier to understand why a test failed: Is it a problem with just one toolchain, e.g., a particular architecture, or is it simply a larger numerical variation of the results?Moreover, it facilitates updating the tests and improving the testsuite itself.
The application was implemented using the web framework Django coupled to a Postgres database.The testsuite results are automatically uploaded by the Buildbot service as soon as they are available.The application allows us to analyze single toolchain runs and single comparison matches and to compare all toolchain runs for a single commit in git.It provides histograms to judge if there are outliers or if it is a broad distribution; moreover, this allows us to determine, e.g., if there is a difference between MPI and non-MPI toolchains.This application has already helped us to identify bugs causing regressions and will continue to be useful to understand failed tests, to update them, and to improve the testsuite.

B. Improving ground-state calculations
To improve the reliability of ground-state calculations with Octopus, the default eigensolver used inside the self-consistent field (SCF) cycle was improved and different real-space preconditioners for the eigensolvers were evaluated.
The default eigensolver, a conjugate-gradient algorithm, has been improved over the previous implementation by now following closely Ref. 171, which greatly improves the reliability of the solver.The updated implementation differs from Ref. 171 in only one point: by default, the current band is not orthogonalized against all bands, instead it is only orthogonalized against previously computed bands with lower energies.According to our tests, in Octopus, this leads to a faster convergence for most cases.
Preconditioners for eigensolvers are an integral part of SCF calculations because they greatly accelerate convergence.They achieve this by applying an approximate inverse of the Hamiltonian in each iteration that leaves the solution invariant and brings the system closer to the real solution.As Octopus uses a real-space grid (as opposed to plane waves like in Ref. 171), different preconditioners are needed.A range of preconditioners has been compared for a wide variety of systems, comprising molecules, semiconductors, metallic systems, and surfaces to test convergence for very different cases.
The default preconditioner in Octopus is a low-pass filter obtained by adding the neighboring values in each dimension to the current value at a grid point, weighted by a certain factor α, This preconditioner was first described by Saad et al. 182 with α = 0.5.
It was proven to be the most effective preconditioner in our comparison because it is quite cheap to apply and it, nevertheless, decreases the number of SCF iterations noticeably.It can also be understood as two weighted Jacobi iterations (up to a prefactor) to solve for the inverse of the kinetic term of the Hamiltonian (i.e., 0.5Δψ ′ = ψ).
From the convergence radius of the Jacobi iterations, we can conclude that the allowed values for α are between 0.5 and 1.From theoretical considerations of damping of different spatial wavelengths during Jacobi iterations (see Secs. 4.1.3and 4.2 of Ref. 183), the ideal value should be 0.75 to most effectively damp high spatial frequencies.In practice, we find that this depends on the system.Moreover, increasing the number of Jacobi iterations does not make the preconditioner more effective.In addition, using a single Jacobi iteration (i.e., dividing by the diagonal of the Laplacian) does not speed up the convergence significantly.
A multigrid method similar to the one implemented in GPAW 184 also uses Jacobi iterations to solve for the kinetic Hamiltonian but employs different grids to reach a faster decrease in the error.Although this method reduces the number of SCF iterations, it is computationally more expensive because the Laplacian is applied several times for each iteration and is, thus, less effective than the filter preconditioner.
A preconditioner specifically targeting real-space methods was proposed by Seitsonen et al. 185 It uses the ratio between the difference of the energy and the potential to the kinetic energy [their Eq. ( 3)] with a preconditioning function [their Eq. ( 4)] originally from Ref. 171.However, this preconditioner did not speed up convergence significantly when used in Octopus.
Another way to get an approximate inverse of the Hamiltonian is to solve the Poisson equation associated with the kinetic part of the Hamiltonian.This reduces the number of iterations needed to converge the SCF calculation, but it increases the total calculation time, as each iteration is much more costly.
In summary, we find that for our real-space code, the filter preconditioner is most effective in reducing the total time needed to compute ground states because it reduces the total number of iterations without being computationally too expensive.
Nonetheless, sometimes certain states cannot be fully converged using a preconditioner, especially for calculations with many unoccupied states.In this case, restarting without preconditioner is needed to obtain a complete SCF convergence.The reasons for this are still under investigation.

C. Novel multi-system framework
In the way Octopus was originally designed, the entire code was structured around the idea that there was only one system (albeit of arbitrary size and complexity) with an associated Hamiltonian.

ARTICLE scitation.org/journal/jcp
This was typically a combined system of electrons and nuclei, where the former were described using DFT and the latter were treated classically.All possible sorts of interactions of this system with some external source (e.g., external electromagnetic fields, solvents described with the PCM, etc.) were then added in an ad hoc fashion.Although this approach worked well for many features and applications, its limitations became evident with several recent developments, in particular, the coupled Maxwell-Kohn-Sham equations described in Sec.II, where the Kohn-Sham orbitals need to be time-propagated alongside the Maxwell fields.This has prompted us to start a major refactoring of the code, where the full physical system is treated as several subsystems interacting with each other.Such subsystems can be electrons, nuclei, and Maxwell fields.While this framework was mainly motivated by the Maxwell-TDDFT coupling, it has been implemented in a very general fashion so that all existing features can be converted to it and that, in the future, many other developments that require the coupling of several subsystems can be based on it.

D. Memory layout
A new memory layout for storing the orbitals was introduced in Ref. 186, where all states are stored in a number of smaller batches with the innermost index being the state index instead of the realspace grid index.Thus, for all states in a batch, exactly the same operations can be executed when looping over the grid while accessing memory contiguously.This allows efficient parallelization on graphical processing units (GPUs), where all threads in a warp need to execute the same instructions, as well as vectorization on central processing units (CPUs), where one instruction can operate on several data points.To fully utilize these instructions, the kernel for computing finite differences (e.g., the Laplacian) has been specialized to explicitly use SSE, AVX, or AVX512 instructions.Now, more parts of the code have been ported to use this new layout to increase their performance, and whenever possible, the code will use this layout by default.

E. GPUs
The first GPU implementation was described in Ref. 186.This was based on OpenCL and was limited to a small selection of numerical algorithms and calculation modes.These included some of the most commonly used features of the code or algorithms that were particularly suited for GPU porting, such as the residual minimization method-direct inversion in the iterative subspace (RMM-DIIS) eigensolver for ground-state calculations or the enforced time-reversal-symmetry propagator for timedependent calculations.Since then, the GPU implementation has been expanded in several different ways.Most notably, it now supports CUDA through an additional compatibility layer.The number of supported features and algorithms has also been increasing steadily such that most time-dependent calculations can now be run efficiently on GPUs.The implementation was also expanded to support multiple devices per host.Using the packed storage format described above, Octopus is able to store the states fully in the GPU memory, provided the memory is large enough, thus reducing memory transfers to a minimum.Recently, this was improved further by removing frequent allocations and deallocations of temporary variables on the GPU, now using a custom memory management for those.This proved very effective in improving the scaling to several GPUs and also to several nodes with GPUs.
We show two examples of time-dependent runs in Fig. 16.For the first example (left panel), β-cyclodextrine was used as an input system and the simulation was run on the GPU island of the COBRA supercomputer at the Max Planck Computing and Data Facility.Each node in this island consists of two 20-core Intel Xeon 6148 Gold sockets (Skylake architecture) with two Nvidia Volta 100 GPUs (PCIe); the nodes are connected with an Omnipath fabric (100 Gbit/s).When executed on a full node with GPUs, the average time step for the example is a factor of 4.8 faster than on one full CPU node, i.e., the time to solution is reduced by a factor of 4.8.This also means that the GPU version consumes less energy: although one GPU node draws more power than a CPU node (∼950 W vs 450 W), the faster execution time reduces the overall energy to solution by a factor of 2.3 when running on a GPU node.When scaling to 2, 4, and 8 nodes, the speed-up is 1.26, 2.14, and 3.37, respectively, and the corresponding parallel efficiency is 63%, 54%, and 42%.Although the scaling is not yet perfect, it has improved considerably and also hints at the need for improving inter-node communication.For the second example (Fig. 16, right panel), a time-dependent simulation of a part of a chlorophyll complex was executed on a Supermicro server with two 8-core Intel Xeon 6134 Gold CPUs (Skylake architecture) and eight Nvidia Volta 100 GPUs interconnected with NVLink.Here, the speed-up of the average time step of the simulation when using 2, 4, and 8 GPUs is 1.95, 3.70, and 5.97, which corresponds to parallel efficiencies of 97%, 93%, and 75%, respectively.These efficiencies are much better than for the multi-node example, probably because data are only communicated within one node.For both examples, all states were stored in the GPU memory, and the parallelization was achieved by distributing the states only.Distributing the grid (i.e., parallelization over domain) is not as effective, because it leads to more frequent communication, and thus has a larger overhead.

ARTICLE scitation.org/journal/jcp
Our current and future efforts regarding the GPU implementation are focused on two aspects.First, we are planning to enhance the existing implementation by improving the communication, possibly by overlapping computation and communication, and by optimizing the existing kernels.Second, we want to port more algorithms that are currently only implemented for the CPU version, especially for spin-dependent calculations and ground-state runs.This is a long-term effort, and the ultimate goal is to have as many features and algorithms as possible running efficiently on GPUs.

XVII. CONCLUSIONS
It has been almost 20 years since the development of Octopus started.During this period, the code has matured and expanded to cover an ever-growing range of methods, theories, applications, and systems.It has also adapted continuously to the available computer architectures and computing paradigms, allowing researchers to tackle increasingly challenging problems in electronic structure theory.
Although many of the code capabilities have been routinely used by many groups around the world, mainly to study electronic excited state properties and dynamics, which still remain, to a large extent, the core of Octopus, we believe the main reason for the code's success lies elsewhere.From the beginning, the code has been designed to take full advantage of the flexibility and versatility offered by the use of real-space grids and provides developers with a framework to easily implement and test new ideas and methods that can be later on adopted by other codes (as it has already been the case with quite a few features previously developed within Octopus).This is demonstrated by the number of new theoretical methodologies and frameworks presented in this paper to deal with non-equilibrium phenomena of complex systems and to address the combined dynamics of electrons, phonons, and photons that go beyond what can be found in other electronic structure codes.This includes several novel approaches to treat the coupling of the electronic systems to the photons, such as the coupled Maxwell-Kohn-Sham equations, the OEP approach to the electron-photon coupling, and the dressed RDMFT.Other examples include the description of magnons in real-time and the orbital magneto-optical response in solids and molecules using the Sternheimer approach.In the case of magnons, supercells need to be employed, and the scalability of real-space grid methods make this approach very promising for future applications in more complex correlated magnetic systems.We expect that many of these methods and approaches will be integrated into other electronic-structure codes and will become standard tools in the near future.
The flexibility of the code is also demonstrated by the variety of systems that it can efficiently treat, as the applications described in this paper include molecules, nanoparticles, model systems, solvents, solids, and monolayers.Particularly noteworthy is the efficient treatment of periodic systems, which traditionally have been described using plane-wave basis-sets, and this has been achieved without loss of accuracy, as shown by our results for the delta-factor test.In the end, it is our hope that combining this flexibility with a growing range of methodologies will provide researchers with the necessary tools to study new challenging phenomena, such as novel correlated materials, out of equilibrium physics, or coupled electron-boson systems.
We also discussed recent improvements in performance and scalability, with particular emphasis on the GPU support.These developments are crucial in the view of the new challenges that electronic-structure applications are facing with the upcoming exaflop supercomputers.
Finally, several other implementations are in the pipeline, such as Floquet and cavity QED materials engineering, multitrajectory methods to deal with the nonadiabatic electron-ion dynamics, treatement of open quantum dissipative systems, and spectroscopies with entangled photons.All of these should be available to the users in the next years, so we invite you to stay tuned to the Octopus webpage at https://octopus-code.org/.

SUPPLEMENTARY MATERIAL
The supplementary material contains the following: (i) Movie SM1-The nitrobenzene molecule is surrounded by the solvation cavity on which the PCM charges rest.PCM charges are modified in an iterative scheme coupled to the KS problem to get the molecular ground state in equilibrium with the solvent (here, water).The size and color of the spheres reflect the PCM charge value at each point of the surface.The animation is the 3D version of Fig. 7 in the main text.(ii) Movie SM2-Time dependent evolution of the molecular dipole of the nitrobenzene molecule in water subject to an electrical dipole perturbation in the x direction applied at the initial time.(Left) The PCM surface charges (represented as in movie SM1) and the molecular dipole (red arrow) are animated according to the equilibrium method (see main text for details).The white arrow is the molecular dipole in vacuo.(Right) The molecular dipole magnitude (top panel) and xy in-plane angle (bottom panel) calculated with the different PCM methods implemented in Octopus.Observe how the presence of the solvent enhances the intrinsic molecular dipole magnitude and produces a noticeable dephasing with respect to the in vacuo case already after few fs.This dephasing reflects into the solvatochromic shift reported in Fig. 8 of the main text.

FIG. 5 .
FIG. 5. Self-consistent dynamics of Hubbard U for the Ni 3d orbitals (bottom panel) for pump intensities as indicated.The top panel represents the time-dependent vector potential, and the vertical dashed lines indicate the extrema of the vector potential, i.e., the minima of the driving electric field.Results are from Ref. 67.

FIG. 6 .
FIG. 6. Absorption cross section spectrum of the hydrogen fluoride dimer with and without van der Waals effects.A zoomed-in view of the absorption feature at around 9.3 eV illustrates a small red shift when van der Waals effects are considered.Both hydrogen fluoride molecules are placed on the same plane at a separation of 2.8 Å.

FIG. 7 .
FIG. 7. (Left panel) Nitrobenzene molecule in water (ϵs = 80) surrounded by the ground state PCM charges distributed on the solvation cavity.The actual distribution of the charges in 3D is given in the supplementary material, SM1.(Right panel) Electrostatic contribution to the solvation free energy computed at each Kohn-Sham iteration.Kohn-Sham equations were solved in real-space as implemented in Octopus using the GGA-PBE approximation88 to the exchange-correlation energy.The simulation box was built using spheres of radius 5 Å and an uniform spacing of 0.19 Å between grid points.The radii of the spheres used to build the cavity surface of the solute molecule are 2.4 Å for carbon, 1.8 Å for oxygen, and 1.9 Å for nitrogen.

FIG. 8 .
FIG. 8.Absorption spectrum of nitrobenzene in vacuo and in water (ϵs = 80, ϵ d = 1.786) for all the TD-PCM schemes.The inset shows the normalized cavity field factor (CFF) vs the excitation energy for the two EOM TD-PCMs with solvent relaxation times of 3.37 ps (slow) and 1 fs (fast) corresponding to CFFmax = 1.27 and 2.55, respectively.Real-time TDDFT simulations of 20 fs with a time step of 1.7 × 10 −3 fs were performed to obtain the absorption spectra.An electric dipole perturbation within the linear regime and oriented along x was applied as an initial perturbation.The rest of the computational details are shared with the ground-state DFT calculations leading to Fig.7.

FIG. 9 .
FIG. 9. Calculated magnon dispersion: spin susceptibility of bulk Ni along the ΓX direction, displayed in logarithmic scale.Numerical details are given in the main text.The experimental data are taken from Ref. 104.

FIG. 10 .
FIG.10.Calculated components (a) ϵxx and (b) ϵxy of the dielectric tensor of silicon for different frequencies of light Ω 0 (in eV) and the magnetic field of 1 T along the z axis (with and without account of excitonic effects).The linewidth δ = 0.1 eV is used.The blue shift by 0.7 eV is applied to take into account the GW correction to the bandgap.130,131The experimental data for ϵxx132 and ϵxy 110 (scaled by 1/2) are indicated by symbols.

FIG. 11 .
FIG. 11.Calculated photoelectron spectrum with t-SURFF, as implemented in Octopus.136The geometries employed to calculate the flux of the photoelectron current are depicted in (a) for periodic and (b) for non-periodic systems.(c) Pump-probe ARPES spectrum of monolayer h-BN driven by a laser field resonant with the gap at K, adapted from Ref. 134.(d) Photoelectron angular distribution obtained by strong-field ionization of C 60 with an IR field polarized along x, adapted from Ref. 136.All the electrons rescattering at the same time in one period of the field end up with final momenta forming a ring (dashed) centered at the value of A(tr ) at the rescattering time, tr .The two arrows represent the graphical decomposition of the final momentum of the photoelectron in the vector potential at the moment of rescattering (horizontal arrow) plus the rescattering momentum.

FIG. 14 .
FIG. 14. Wall-time computational cost of the method (in seconds) against the error in the wave function [defined as Error(T, Δt) = √ ∑ m ||φm(T) − φ exact m 2 , where the φ exact m are the values of the Kohn-Sham orbitals computed using the explicit fourth-order Runge-Kutta integrator with a very small time step] for the EMR and CFM4 propagators.

FIG. 16 .
FIG. 16.Scaling plots for the GPU implementation of Octopus.The left panel shows speed-up for a machine with two 20-core Intel Xeon 6148 Gold sockets (Skylake architecture) and two V100 GPUs (PCIe) per node.The right panel shows speed-up for a Supermicro server with 8 V100 GPUs (NVLink).

The Journal of Chemical Physics ARTICLE scitation.org/journal/jcp OEP
equation using all unoccupied states by a problem of solving Np+1 Sternheimer equations that only involve the occupied orbitals.In this way, the formulation of the problem becomes similar to the one of Ref. 37 for electrons only.

The Journal of Chemical Physics ARTICLE scitation.org/journal/jcp
49.
Ĥ[t 1 ] and Ĥ[t 2 ] are the Hamiltonians at times t 1 and t 2 , which are, in fact, unknown because they depend on the Kohn-Sham states through the density: we are dealing with a non-linear problem and the CFM expansions were, in fact, developed for linear systems.We have various options to extend them for our non-linear problem: for example, one could define Ĥ[t 1 ] and Ĥ[t 2 ] as interpolated Hamiltonians from Ĥ[t] and Ĥ[t + Δt] in which case we end up with an implicit equation for φ(t + Δt) that we would have to solve at a substantial cost.