DFTB+, a software package for efficient approximate density functional theory based atomistic simulations

DFTB+ is a versatile community developed open source software package offering fast and efficient methods for carrying out atomistic quantum mechanical simulations. By implementing various methods approximating density functional theory (DFT), such as the density functional based tight binding (DFTB) and the extended tight binding method, it enables simulations of large systems and long timescales with reasonable accuracy while being considerably faster for typical simulations than the respective ab initio methods. Based on the DFTB framework, it additionally offers approximated versions of various DFT extensions including hybrid functionals, time dependent formalism for treating excited systems, electron transport using non-equilibrium Green’s functions, and many more. DFTB+ can be used as a user-friendly standalone application in addition to being embedded into other software packages as a library or acting as a calculation-server accessed by socket communication. We give an overview of the recently developed capabilities of the DFTB+ code, demonstrating with a few use case examples, discuss the strengths and weaknesses of the various features, and also discuss on-going developments and possible future perspectives.


I. INTRODUCTION
Density Functional Theory (DFT) 1,2 dominates the landscape of electronic structure methods, being the usual go-to technique to model large, chemically complex systems at good accuracy.For larger systems and time scales, force-field models instead dominate materials and chemical modeling.Between these is the domain of semi-empirical methods, derived from approximations to Hartree-Fock or DFT based methods.Within this space, density functional based tight binding (DFTB) [3][4][5] effectively offers a reduced complexity DFT method, being derived from a simplification of Kohn-Sham DFT to a tight binding form. 6his paper describes the DFTB+ code, 7 an open source implementation, which aims at collecting the developments of this family of methods and making them generally available to the chemical, materials, and condensed matter communities.This article describes extensions to this code since its original release in 2007, 8 there being a lack of a more recent overview of its features and underlying theory.

II. DFTB+ FEATURES A. The core DFTB-model
The basic DFTB-equations are presented below.They can be easily generalized for periodic cases (k-points) as well as for other boundary conditions, as implemented in DFTB+.All equations throughout this paper are given in atomic units with Hartree as the energy unit.

Expansion of the total energy
The DFTB models are derived from Kohn-Sham (KS) DFT 2 by expansion of the total energy functional.Starting from a properly chosen reference density ρ 0 (e.g., superposition of neutral atomic densities), the ground state density is then represented by this reference, as perturbed by density fluctuations: ρ(r) = ρ 0 (r) + δρ(r).The total energy expression then expands the energy functional in a Taylor series up to third order, with with XC being the exchange correlation energy and potential.Several DFTB models have been implemented, starting from the first order non-self-consistent DFTB1 3,4 [originally called DFTB or non-SCC DFTB], the second order DFTB2 (originally called SCC-DFTB), 5 and the more recent extension to third order, DFTB3. 9-12

DFTB1
The first order DFTB1 method is based on three major approximations: (i) it takes only E 0 [ρ 0 ] and E 1 [ρ 0 , δρ] from Eq. (2) into account, (ii) it is based on a valence-only minimal basis set (ϕμ) within a linear combination of atomic orbitals (LCAO) ansatz, The Journal of Chemical Physics ARTICLE scitation.org/journal/jcpfor the orbitals ψi, and (iii) it applies a two-center approximation to the hamiltonian operator Ĥ[ρ 0 ].
a. Minimal atomic basis set.The atomic orbital basis set ϕμ is explicitly computed from DFT by solving the atomic Kohn-Sham equations with an additional (usually harmonic) confining potential, This leads to slightly compressed atomic-like orbitals for describing the density in bonding situations.The actual values for r 0 are usually given in the publications describing the specific parameterization.The operator Ĥ[ρ 0 ] also depends on the superposition of atomic densities, ρA (or potentials, V eff A ) of neutral atoms, {A}, in the geometry being modeled.This density is usually determined from the same atomic KS equations using a slightly different confinement radius, r d 0 .
b. DFTB matrix elements.The hamiltonian can be represented in an LCAO basis as where the neglect of the three center terms and pseudo-potential contributions 6 lead to a representation, which can be easily computed by evaluating the Kohn-Sham equations for dimers.These matrix elements are computed once as a function of inter-atomic distance for all element pairs.The Slater-Koster 13 combination rules are applied for the actual orientation of these "dimers" within a molecule or solid.
c. Total energy.E 0 [ρ 0 ] depends only on the reference density, so is universal in the sense that it does not specifically depend on the chemical environment (which would determine any charge transfer (CT), δρ, occurring).It can, therefore, be determined for a "reference system" and then applied to other environments.This is the key to transferability of the parameters.In DFTB, E 0 [ρ 0 ] is approximated as a sum of pair potentials called repulsive energy terms, (see Ref. 14), which are either determined by comparison with DFT calculations 4 or fitted to empirical data. 15Forces are calculated with the Hellmann-Feynman theorem and derivatives of the repulsive energy.

DFTB2 and DFTB3
To approximate the E 2 and E 3 terms in Eq. ( 2), the density fluctuations are written as a superposition of atomic contributions, taken to be exponentially decaying spherically symmetric charge densities By neglecting the XC-contributions for the moment, the second order integral E 2 leads to an analytical function, γAB, with energy, 5 E 2 (τA, τB, RAB) = 1 2 ∑ AB(≠A) γAB(τA, τB, RAB)ΔqAΔqB.
The energy depends on the Mulliken charges {qA} (where the atomic charge fluctuation, ΔqA = qA − ZA, is with respect to the neutral atom), which are, in turn, dependent on the molecular orbital coefficients, cμi.Thus, the resulting equations have to be solved selfconsistently.At large distances, γAB approaches 1/RAB, while at short distances, it represents electron-electron interactions within one atom.For the limit RAB → 0, one finds τA = 16 5 UA, i.e., the so-called Hubbard parameter UA (twice the chemical hardness) is inversely proportional to the width of the atomic charge density τA.This relation is intuitive in that more diffuse atoms (or anions) have a smaller chemical hardness.For DFTB, the chemical hardness is computed from DFT, not fitted.
The third order terms describe the change of the chemical hardness of an atom and are also computed from DFT.A function ΓAB results as the derivative of the γ-function with respect to charge, and the DFTB3 total energy is then given by The third order terms become important when local densities deviate significantly from the reference, i.e., ΔqA is large.Apart from including the third order terms, DFTB3 also modifies γAB for the interactions between hydrogen and first row elements, 9 where the deviation from the relation between the charge width and the chemical hardness, as formulated above, is most pronounced.The resulting DFTB3 hamiltonian takes the form where Sμν is the overlap matrix between orbitals ϕμ and ϕν, and γ h is the modified DFTB2 interaction.

Spin
Analogous to DFTB2, expanding the energy with respect to spin fluctuations [16][17][18] leads to the spin-polarized expressions for DFTB.By introducing the magnetization density m(r) = ρ ↑ (r) − ρ ↓ (r) as difference of the densities of spin-up and spin-down electrons and its corresponding fluctuations [δm(r)] around the spin-unpolarized reference state [|m(r)| = 0], a spin dependent term is added to the spin-independent E 2 of Eq. ( 2), The Journal of Chemical Physics

ARTICLE scitation.org/journal/jcp
where a local or semi-local E xc has been assumed.Identifying the spin density fluctuations with up-and downspin Mulliken charge differences, Δp Al , for angular momentum shell l at atom A, and approximating the second derivative of E xc [ρ, m] as an atomic constant W All ′ (similar to the Hubbard UA) lead to an on-site energy contribution This term in Eq. ( 14) is to be added to Eq. ( 8).It captures the spin-polarization contribution to the total energy and couples different atomic angular momentum shells via magnetic interaction.
The W All ′ are usually an order of magnitude less than the UA and are multiplied with a (typically) small Δp Al ; hence, inclusion of spinpolarization via Eq.( 14) gives only a small energy contribution.
If there is a net imbalance of up-and down-spin electrons in the system, the occupation of electronic states alone carries most of the effect of the unpaired electron(s) without including Eq. ( 14).The use of Mulliken charges leads to an additional hamiltonian contribution 17 to the (now) shell resolved form of Eq. ( 10), where the spin up (down) hamiltonian has this term added (subtracted).
Expanding further to local (not global) up and down spin populations via Pauli spinors gives the non-collinear spin model. 19quation ( 14) becomes and the wave-function generalizes to two component spinors.The hamiltonian contributions take the form where σi is the Pauli matrix for spin component i(=x, y, z) and H σ i is constructed from the ith spin component of Δ⃗ p.This spinblock two component hamiltonian then also enables spin-orbit coupling 19,20 to be included in DFTB+.The spin-block hamiltonian addition is where ξ Al is the spin orbit coupling constant for shell l of atom A with L ± and Lz being the angular momentum operators for atomic shells.

Limitations of the core DFTB-model
DFTB is an approximate method, and as such shows limitations, which can be traced back to the different approximations applied.However, the fitting of Eq. ( 6) can compensate for some of the inaccuracies.Since until now, only bonding contributions are addressed by the two-center nature of the repulsive potentials, bond-lengths, bond-stretch frequencies, and bond-energies can be targeted (properties such as bond angles or dihedral angles cannot be influenced by repulsive pair parameterization).This is the reason why DFTB performs better than a fixed minimal basis DFT method, which would be only of limited use in most of the applications.In some cases, DFTB can even perform better than double-zeta (DZ) DFT using generalized gradient approximation (GGA) functionals, as shown, e.g., in Ref. 12.This accuracy definitely can be traced back to the parameterization.
a. Integral approximations.There are some approximations in DFTB that cannot be compensated by parameterization, effecting, e.g., bond angles and dihedrals, which on average show an accuracy slightly less than DFT/DZ.Furthermore, the integral approximation leads to an imbalanced description of bonds with different bond order.For example, C-O single, double, and triple bonds have to be covered by a single repulsive potential, which shows only a limited transferability over the three bonding situations.This is the reason why both good atomization energies and vibrational frequencies cannot be covered with a single fit. 12Hence, in that work, two parameterizations were proposed, one for obtaining accurate energies and one for the vibrational frequencies.Similarly, description of different crystal phases with the same chemical composition but with very different coordination numbers can be challenging.Recent examples show, 21,22 however, that it is possible to reach a reasonable accuracy if special care is taken during the parameterization process.
b. Minimal basis set.The minimal basis set used has several clear limitations, which show up in the overall DFTB performance: First, for a good description of hydrogen in different bonding situations, relatively diffuse wave functions have to be chosen.For this atomic wave-function, however, the H 2 atomization energy is in error, which is dealt with by an ad hoc solution, again providing a special repulsive parameter set. 12Furthermore, nitrogen hybridization and proton affinities require at least the inclusion of d-orbitals in the basis set: this again can be compensated by a special parameter set, which has to be applied under certain conditions. 12A similar problem occurs for highly coordinated phosphorus containing species. 23The minimal basis can also become problematic when describing the high lying (conduction band) states in solids.For example, silicon needs d-orbitals in order to describe the conduction band minimum properly.The valence band, on the other hand, can be reasonably described with an sp-only basis.
c. Basis set confinement.As a result of the orbital confinement, Pauli repulsion forces are underestimated, which leads to DFTB non-bonding interactions being on average too short by 10%-15%.This has been investigated in detail for liquid water, where a different repulsive potential has been suggested. 24A related problem concerns molecular polarizabilities, which are underestimated using a minimal basis set.Approaches to correct for this shortcoming have been summarized recently in Ref. 25.The too-confined range of basis functions also impairs the calculation of electron-transfer couplings.Here, unconfined basis sets have to be used. 26Similarly, it can be challenging to find a good compromise for the basis confinement when describing 2D-layered materials.As the interlayer distances are significantly longer than the intra-layer ones, the binding between the layers often becomes weaker compared to DFT. d.DFT inherited weaknesses.DFTB is derived from DFT and uses standard DFT functionals, which also come with some wellknown limitations.There, several strategies applied within DFT are also viable for DFTB, as discussed below in more detail.

B. Density matrix functionals
The typical behavior of the SCC-DFTB ground state resembles local-density approximation (LDA) or GGA, 27 i.e., a mean-field (MF) electronic structure method with associated self-interaction errors and, for some systems, qualitatively incorrect ground states.This is in contrast to non-SCC DFTB, which gives the correct linearity of total energy and step-wise chemical potentials 28 for fractionally charged systems.However, non-SCC can also produce MF-DFT limits, such as in the case of dimer dissociation 29,30 due to self-interaction errors in the underlying atomic DFT potentials.
DFTB+ now also supports long-range corrected hybrid functionals for exchange and correlation.With respect to conventional local/semi-local functionals, these are known to provide a better description of wave function localization and significantly reduce self-interaction. 31In the longer term, DFTB+ will continue to develop post-DFT based methods with the aim of making large (≳1000 atom) correlated systems tractable via methods with correlated self-energies or wave-functions.

Onsite corrections
DFTB2 neglects on-site hamiltonian integrals of the type (μν|μν), where ϕμ and ϕν are two different atomic orbitals of the same atom [both Eq. ( 5) and the use of Mulliken charges give onsite elements only for δμν = 1].A generalized dual population 32 can be introduced as where Q A,l μν is a population matrix for shell l of atom A and the diagonal of each block represents the conventional Mulliken charges for orbitals in the lth shell.Based on this population, all fluctuations of the atomic parts of the density matrix from the reference can be included, not only the diagonal (charge) elements.These must then be treated self-consistently during the calculation.This generalization leads, for example, to an improved description of hydrogen bonds in neutral, protonated, and hydroxide water clusters as well as other water-containing complexes. 33he onsite-corrected DFTB requires additional atomic parameters; these are not tunable but computed numerically using DFT (see Ref. 34 for details of their evaluation).The onsite parameter for some chemical elements can be found in the DFTB+ manual.The calculation requires convergence in the dual density populations.This is a somewhat heavier convergence criterion than just charge convergence, and thus, the computational time is moderately affected.

DFTB+U and mean-field correlation corrections
For correlated materials such as NiO, a popular correction choice in DFT is the LDA+U family of methods, 35 which add a contribution to the energy of the specified local orbitals obtained from the Hubbard model.The rotationally invariant 36 form of LDA+U can be written in terms of several choices of local projections of the density matrix. 32Likewise, the double-counting between the Hubbard-model and the density functional mean-field functional take several limiting cases. 37In DFTB+, the fully localized limit of this functional was implemented early in the code's history 27 using the populations of Eq. (19).Originally applied for rare-earth systems, 38 DFTB+U gives excellent agreement with GGA+U. 39A closely related correction, pseudo-SIC, 40 where the local part of the self-interaction is removed, modifying only the occupied orbitals, is also available.These approximations lower the energies of occupied atomic orbitals within the specified atomic shells with the aim of removing self-interaction or more accurately representing selfenergy.However, as with its use in DFT, this approximation suffers from three main drawbacks.First, the form of the correction depends on the choice of double counting removal. 41The correlation is also mean-field in nature; hence, all equally filled orbitals within a shell receive the same correction, and therefore, cases not well described by a single determinant are not systematically improved.Finally, the choice of the U (and J) values is not necessarily obvious, with a number of different empirical, linear response, and self-consistent choices possible.Specific to DFTB, 42 the U values may also require co-optimization with the repulsive parameters, in particular, for systems where the electronic structure is geometrically sensitive.

Long-range corrected hybrid functionals
a. Single determinant formulation.To correct longer range errors, the electron-electron interactions can be split into short and long range components based on a single parameter ω, The short range contribution is treated in a local or semi-local density functional approximation, while the long range term gives rise to a Hartree-Fock-like exchange term in the hamiltonian. 31The necessary adaptions for the DFTB method (termed LC-DFTB) were introduced in Refs.43 and 44.Note that quite generally for DFTB+, the exchange-correlation functional is effectively chosen by loading the appropriate Slater-Koster files created for the desired level of theory.This also holds for LC-DFTB, where different values for the range-separation parameter, ω, lead to different Slater-Koster files.The database at www.dftb.orgcurrently hosts the ob2 set 45 for the elements O, N, C, and H with ω = 0.3 a −1 0 .LC-DFTB calculations can also be performed for spin-polarized systems, enabling evaluation of triplet excited states and their corresponding relaxed geometries.It also paves the way for a rational determination and tuning 31 of the range-separation parameter ω, which amounts to total energy evaluations for neutral and singly ionized species.Note that the required atomic spin constants are functional specific.The spin parameters for the ob2 Slater-Koster set are available in the manual.
b. Spin restricted ensemble references.Instead of single determinants, the spin-restricted ensemble-referenced Kohn-Sham (REKS) method and its state-interaction state-averaged variant (SI-SA-REKS or SSR) [46][47][48][49][50][51] based on ensemble density functional theory are now available in DFTB+.SSR can describe electronic states with multi-reference character and can accurately calculate excitation energies between them (see Sec. II C 2).The SSR method is formulated in the context of the LC-DFTB method (LC-DFTB/SSR) 52 since a long-range corrected functional is crucial to correctly describe the electronic structure particularly for the excited states (see Ref. 52  for details of the formalism).Spin-polarization parameters are also required to describe open-shell microstates.It was observed that LC-DFTB/SSR sometimes gives different stability of the open-shell singlet microstates from the conventional SSR results, depending on excitation characters.In such a case, a simple scaling of atomic spin constants is helpful to account for correct excitation energies (see Ref. 52 for the required scaling of spin constants).The LC-DFTB/SSR method can be extended in the future by using larger active spaces or with additional corrections such as the onsite or DFTB3 terms.

Non-covalent interactions
In large systems, non-covalent interactions (van der Waals/vdW forces) between molecules and between individual parts of structures become of key importance.The computational performance of DFTB makes these systems accessible, but large errors are observed for these weaker interactions.Being derived from (semi-)local density-functional theory, DFTB naturally shares the shortcomings of these approximations.This includes the lack of long-range electron correlation that translates to underestimated or missing London dispersion.An accurate account of vdW forces is essential in order to reliably describe a wide range of systems in biology, chemistry, and materials science.DFTB has already been successfully combined with a range of different correction schemes [53][54][55][56][57][58] to account for these weaker interactions, but here we outline some newer methods available in DFTB+.
a. H5 correction for hydrogen bonds.The H5 correction 59 addresses the issue of hydrogen bonding at the level of the electronic structure.For DFTB2 and DFTB3, interaction energies of H-bonds are severely underestimated for two main reasons: most importantly, the monopole approximation does not allow on-atom polarization; even if this limitation is lifted, the use of minimal basis does not allow polarization of hydrogen.In the H5 correction, the gamma function (Sec.II A 3) is multiplied by an empirical term enhancing the interactions at hydrogen bonding distances between hydrogen atoms and electronegative elements (N, O, and S).The H5 correction is applied within the SCC cycle, thus including many-body effects (the source of the important cooperativity in H-bond networks).The H5 correction was developed for DFTB3 with the 3OB parameters and a specific version of the DFT-D3 60,61 dispersion correction.Note that this D3 correction also includes an additional term augmenting hydrogen-hydrogen repulsion at short range (necessary for an accurate description of aliphatic hydrocarbons 62,63 ).
b. DFT-D4 dispersion correction.The D4 model 64,65 is now available in DFTB+ as a dispersion correction.Like D3, pairwise C AB 6 dispersion coefficients are obtained from a Casimir-Polder integration of effective atomic polarizabilities α eff A/B (iu), The influence of the chemical environment is captured by using a range of reference surroundings, weighted by a coordination number.D4 improves on its predecessor by also including a charge scaling based on atomic partial charges determined as either Mulliken 64 or classical electronegativity equilibration. 65Especially for metalcontaining systems, the introduced charge dependence improves thermochemical properties. 66Large improvements can also be observed for solid-state polarizabilities of inorganic salts. 67For a full discussion on the methodology behind D4, we refer the reader to Ref. 65, and the implementation details are presented in Ref. 67.The damping parameters for several Slater-Koster sets are provided in the supplementary material.
To investigate the performance of the DFTB-D4 parameterizations, we evaluate the association energies for the S30L benchmark set. 68,69DFTB-D4 is compared to DFTB3(3ob)-D3(BJ), 54 GFN1-xTB, 70 and GFN2-xTB; 71 additionally, we include the dispersion corrected SCAN 72 functional in comparison to DFT.The deviation from the reference values is shown in Fig. 1.For the mio parameterization, complexes 4, 15, and 16 were excluded due to missing Slater-Koster parameters.The direct comparison of DFTB3(3ob)-D3(BJ) with a MAD of 7.1 kcal/mol to the respective D4 corrected method with a MAD of 6.5 kcal/mol shows a significant improvement over its predecessor.The DFTB2(mio)-D4 gives an improved description with a MAD of 4.5 kcal/mol, which is better than GFN1-xTB with a MAD of 5.5 kcal/mol.The best performance is reached with GFN2-xTB due to the anisotropic electrostatics c.Tkatchenko-Scheffler (TS) dispersion.The Tkatchenko-Scheffler (TS) 73 correction includes vdW interactions as Londontype atom-pairwise C 6 /R 6 -potentials with damping at short interatomic separations, where the electronic structure method already captures electron correlation.The suggested damping parameters for the mio and 3ob parameter sets are listed in the supplementary material.In the TS approach, all vdW parameters including the static atomic dipole polarizability, α, and C 6 -dispersion coefficients depend on the local electronic structure and the chemical environment. 73High-accuracy in vacuo reference values (labeled by vac) are rescaled via In the case of DFT, x is approximated based on the Hirshfeld atomic volumes. 74When combined with DFTB, a fast yet accurate alternative has been proposed, 58 which does not require evaluating a real-space representation of the electron density.Instead, the ratio between atom-in-molecule and in vacuo net atomic electron populations [i.e., tr(ρ) A /ZA] is used to define x.
d. Many-body dispersion (MBD).Going beyond pairwise interactions, many-body dispersion (MBD) 75,76 accounts for manyatom interactions in a dipolar approximation up to infinite order in perturbation theory.This is achieved by describing the system as a set of coupled polarizable dipoles 75 with rescaled in vacuo reference polarizabilities [as in Eq. ( 22)].At short-ranges, this model switches, via a Fermi-like function with a range of β, to the local atomic environment as accounted for by solving a Dyson-like selfconsistent screening equation. 76β represents a measure for the range of dynamic correlation captured by the underlying electronic structure method, so it depends on the density functional or DFTB parameterization.The recommended β-values for the mio and 3ob parameter sets are listed in the supplementary material.
Figure 2 and Ref. 58 demonstrate that DFTB and MBD represent a promising framework to accurately study long-range correlation forces and emergent behavior at larger length-and timescales.Recently, the DFTB+MBD approach has allowed the study of organic molecular crystals 55 and solvated biomolecules, revealing the complex implications of many-body vdW forces for proteins and their interaction with aqueous environments. 82Further improvements of TS and MBD, including a better description of charge transfer effects 83 and variational self-consistency, 84 may also be incorporated into DFTB in the future.Both methods are formulated independently of the underlying electronic-structure methods.As a result, DFTB+ outsources the evaluation of the MBD and TS interactions to Libmbd, 85 an external open-source library.

Time dependent DFTB with Casida formalism
Electronic excited states are accessible in DFTB+ through time dependent DFTB methods (see Ref. 86 for a review and detailed discussion of this formalism).In a linear response treatment in the frequency domain, excitation energies are obtained by solving an eigenvalue problem known as Casida or RPA (random phase approximation) equations.Compared to first-principles time dependent DFT, the computational scaling can be reduced in DFTB from N 6 to N 3 .This is due to the Mulliken approximation for two-electron integrals, 87 which uses transition charges q pqσ A , for transitions from the Kohn-Sham orbital pσ to qσ.
For fixed geometry, DFTB+ provides a user defined number of low lying excitation energies, oscillator strengths, and orbital participations.In another mode of operation, the code computes excited state charges, eigenvectors of the Casida equation, and energy gradients for a specific state of interest, which can be combined with

FIG. 2. Mean absolute errors (MAEs)
and mean absolute relative errors (MAREs) in inter-molecular interaction energies of bare DFTB and with different van der Waals models in comparison to high-level reference data.][81] The Journal of Chemical Physics ARTICLE scitation.org/journal/jcpMD or geometry relaxation.For spin-unpolarized calculations, the response matrix is block diagonal for the singlet and triplet channels to speed up the computation.DFTB+ allows for the computation of the excited state properties of systems with general fractional occupation of the KS orbitals.This is useful, for example, for the simulations of metals and semi-metals at a finite temperature.For a detailed discussion on spin-polarization and fractional occupation within time dependent (TD) DFTB, see Ref. 34.The onsite correction, discussed in Sec.II B 1, is also possible for excited state calculations and was shown to lead to marked improvements. 34ue to their improved treatment of charge-transfer transitions, range-separated functionals are also relevant in the context of excited states.DFTB+ implements the time dependent long range corrected (TD-LC) DFTB method, as described in Ref. 88.Compared to the conventional TD-DFTB, the lower symmetry of the response matrix leads to a non-Hermitian eigenvalue problem, which we solve by the algorithm of Stratmann and co-workers. 89Somewhat surprisingly, it turns out that TD-LC-DFTB calculations are, in practice, not significantly slower than TD-DFTB calculations (see Ref. 88 for a deeper discussion).Gradients can also be calculated with TD-LC-DFTB, making it possible to perform geometry optimizations and MD simulations in singlet excited states.
Note that energetically high lying states and Rydberg excitations are clearly outside of the scope of TD(-LC)-DFTB since their description generally requires very diffuse basis sets.Apart from this class, the photochemically more relevant set of low energy valence excitations are predicted with similar accuracy to first principles TD-DFT, as several benchmarks indicate. 34,90,91As mentioned above, charge-transfer excitations can now be also treated using TD-LC-DFTB. 88

SSR and excitations
Currently, the SSR method implemented in DFTB+ is formulated for active spaces including two electrons in two fractionally occupied orbitals [i.e., SSR(2,2)], which is suitable for a singlet ground state and the lowest singlet excited state as well as a doubly excited state. 52In addition, since the SSR method is based on an ensemble representation and includes the electronic correlation, it can give correct state-interactions among nearly degenerate electronic states.Thus, the SSR approach is useful to investigate conical intersections.The LC-DFTB/SSR method with scaled spin constants can accurately describe the ground and excited states including π/π * or n/π * character, undergoing bond cleavage/bond formation reactions as well as the conical intersections where the conventional (TD)DFTB fails to obtain the electronic properties.Analytic energy gradients as well as non-adiabatic couplings are also available.

Time-independent excited states from ΔDFTB
The linear response approach to excited-state properties in DFTB is efficient and powerful, but there exist circumstances where a more direct route to the excited states is desirable.For example, the excited-state properties obtained from linear response theory require an additional order of derivatives relative to the ground state.As noted in Sec.II C 1, linear-response TD-DFTB (like its parent method TD-DFT) 92 should invoke range-separation to achieve a qualitatively correct picture of charge-transfer excitations and related long-range phenomena. 88 an alternative to the time-dependent linear-response approach, it is possible to variationally optimize certain electronically excited states directly.The ΔDFTB method, modeled on the Δ-self-consistent-field (ΔSCF) approach to excited states in DFT, 93,94 involves solving the SCC-DFTB equations subject to an orbital occupation constraint that forces the adoption of a non-aufbau electronic configuration consistent with the target excited state.This method is implemented for the lowest-lying singlet excited state of closedshell molecules in DFTB+. 95The converged, non-aufbau SCC-DFTB determinant is a spin-contaminated or "mixed" spin state, but the excitation energy can be approximately spin-purified through the Ziegler sum rule, which extracts the energy of a pure singlet from the energies of the mixed state and the triplet ground state.
A significant advantage of the ΔDFTB approach is that excitedstate gradients and hessians are quite straightforward to compute, both mathematically and in terms of computational cost, relative to linear response approaches.Benchmarks of ΔDFTB excitedstate geometries and Stokes shifts 95 demonstrate the suitability of the method for simulating excited-state energetics and dynamics of common organic chromophores along the S 1 potential energy surface.

Real-time propagation of electrons and Ehrenfest dynamics
It is often desirable to study time dependent properties outside the linear response regime, e.g., under strong external fields.The numerical propagation of the electronic states enables the simulation of such phenomena, and its coupling to the nuclear dynamics in a semi-classical level can be included to the lowest order within the Ehrenfest method.Purely electronic (frozen-nuclei) dynamics as well as Ehrenfest dynamics are included in DFTB+.We solve the equation of motion of the reduced density matrix ρ given by the Liouville-von Neumann equation with D being the non-adiabatic coupling matrix Dμν = ṘB ⋅ ∇BSμν and ṘB being the velocity of atom B. The on-site blocks can be calculated taking the RB → 0 limit, although neglecting those does not introduce significant changes to the dynamics. 96nitary evolution of ρ with no change in its eigenvalues would require D † = −D, which is normally not the case.Therefore, nuclear dynamics can induce electronic transitions leading to thermalization. 97Unitary evolution is recovered when all nuclear velocities are equal (frozen-nuclei dynamics) and the second term in Eq. ( 24) vanishes.
The force in the Ehrenfest-dynamics can be expressed as 96,98 where E(t) is the external electric field.In the present implementation, the velocity dependent terms have been neglected, and they would vanish for a complete basis 96 and are necessary for momentum, but not for energy conservation. 98When the system is driven The Journal of Chemical Physics ARTICLE scitation.org/journal/jcpexternally by an electric field, a dipole coupling term is added in the time-dependent hamiltonian in Eq. (24).Some applications that have been enabled by the speedup over time-dependent DFT are the simulation of the plasmon-driven breathing-mode excitation in silver nanoparticles of 1-2 nm in diameter 99 and the simulation of transient absorption pump-probe spectra in molecules. 100,101henever a time propagation approach is used for the calculation of absorption spectra in the linear regime, this method is equivalent to calculations using the Casida formalism and shares its strengths and limitations.Specific pitfalls of the time dependent approach come into play whenever simulating the response to intense external fields.In these cases, the poor description of highly lying excited states due to the use of a minimal basis set would likely be inaccurate if these states are populated during the dynamics.

pp-RPA
An approximate particle-particle RPA scheme, the so-called pp-DFTB, 88 is now implemented in DFTB+.Particle-particle RPA, based on the pairing matrix fluctuation formalism, has been shown to be an efficient approach for the accurate description of double and charge-transfer (CT) excitations involving the highest occupied molecular orbital (HOMO) (see Ref. 102 for details).In Ref. 88, we compare against TD-LC-DFTB for CT excitation energies of donoracceptor complexes.TD-LC-DFTB has the advantage transitions do not necessarily have to involve the HOMO of the system.Alternatively, pp-DFTB does not require parameter tuning and is efficient for the lowest lying excitations.
Although one of the strengths of the original pp-RPA formulation lies on the accurate description of Rydberg excitations, our approximate formalism based on DFTB fails to describe these kinds of transitions, as explained in Sec.II C 1.

Coupled perturbed responses
DFTB+ supports several types of response calculations for second-order derivatives.The general form of the response evaluation is via standard perturbation theory, where the sums for the states that U mixes together may be over all states or only the virtual space (parallel gauge) depending on application.U is anti-symmetric (anti-Hermitian) or has no symmetry depending on whether the derivative of the overlap matrix is non-zero.
In the case of systems with degenerate levels, a unitary transformation, Z, that diagonalizes the block of P associated with that manifold can be applied to the states; note that this sub-block is always symmetric (Hermitian), leading to orthogonality between states in the perturbation operation, For fractionally occupied levels, the derivatives of the occupations for q = 0 perturbations (where the change in the Fermi energy should be included) are then evaluated. 103ime dependent perturbations at an energy of ̵ hω can be written as (1 Here, the small constant η prevents divergence exactly at excitation poles. Derivatives with respect to external electric fields and potentials are included (giving polarizabilities and dipole excitation energies), with respect to atom positions (at q = 0, providing Born charges and electronic derivatives for the hessian) and with respect to k in periodic systems (effective masses and also the Berry connection via ⟨u| ∂u /∂k⟩).In the longer term, perturbation with respect to magnetic fields, boundary conditions (elastic tensors), and alternative approaches (Sternheimer equations for q ≠ 0, and also lower computationally scaling density matrix perturbation theory) are planned.

D. Non-equilibrium Green's function based electron transport
Electron transport in the steady-state regime is described in DFTB+ within a non-equilibrium Green's function (NEGF) method, 104,105 as implemented in the code-independent libNEGF 106 library.The density matrix is evaluated in terms of the electronelectron correlation matrix G < , 105 Open boundary conditions are included in terms of electron baths with an arbitrary spectrum and chemical potential, allowing for a seamless description of charge injection from electrodes with an applied bias.The density matrix is then used to evaluate a real-space electron density distribution, which is coupled self-consistently with a Poisson solver.We perform a full band integration of Eq. ( 36), utilizing a complex contour integral to reduce the number of integration points. 104This allows for an implicit description of dielectric properties, which is crucial for an accurate modeling of ultra-scaled electron devices. 107,108After self-consistency is achieved, the total ARTICLE scitation.org/journal/jcpcurrent flowing in the system is calculated with the Landauer/Caroli formula for the non-interacting case or with the Meir-Wingreen formula for the interacting case. 105A detailed description of the numerical algorithms and self-consistent coupling is presented in Ref. 109.
Here, we summarize the main features that might differentiate DFTB+ from other nano-device simulation packages: (i) support for N ⩾ 1 electrodes (enabling structures from surfaces and semi-infinite wires to multiple terminal geometries), (ii) O(L) memory and time scaling (where L is the system length) via a block-iterative algorithm, (iii) a real space Poisson solver with support for gates and dielectric regions, and (iv) evaluation of local currents.Being a parameterized tight binding method, its usage is bounded by the availability of good parameters for the system under investigation.][112] In Fig. 3, we show a non-SCC calculation example of transmission in linear response for a multi-terminal device.The simulated system is a cross-junction between two (10,10) Carbon nanotubes (CNTs).One CNT is tilted by 60 ○ with respect to the second, and the transmission is calculated by displacing one CNT along the axis of the other.The transmission follows, as expected, a periodic pattern in accordance with the lattice repeat of 0.25 nm along the axis of the CNT.
][119][120] Overall, DFTB-NEGF shares many similarities with DFT based implementations, and it also inherits some shortcomings the less experienced users should be aware of.For example, the open boundary treatment demands that external and non-equilibrium potentials are screened at the boundaries. 105Therefore, the simulated system should be large enough compared to the screening length.This condition is easily achieved with bulk metallic electrodes, but it can be difficult with low dimensional systems that exhibit poor screening.When this condition is not fulfilled, unphysical discontinuities in the potential may be obtained.In addition, compared to band structure calculations, NEGF tends to converge with more difficulty. 121Aside from these common challenges, it is important that for DFTB-NEGF calculations, any set of parameters should be evaluated by verifying at the least band structure properties in the energy range of interest.DFTB parameters fitted to reproduce total energies and forces might be excellent in those applications but lack the necessary accuracy in the band structure.Depending on the degree of accuracy required, an ad hoc fitting for transport calculations could also be necessary, for example, in the case of silicon. 122

E. Extended Lagrangian Born-Oppenheimer dynamics
The Extended Lagrangian Born-Oppenheimer molecular dynamics (XLBOMD) framework allows 123,124 molecular dynamics on the Born-Oppenheimer surface with only one hamiltonian diagonalization per time step without the need for self-consistency cycles.The basic idea is based on a backward error analysis, i.e., instead of calculating approximate forces through an expensive non-linear iterative optimization procedure for an underlying exact potential energy surface, XL-BOMD calculates exact forces for an approximate "shadow" potential energy surface, U(R, n).This is approximated from a constrained minimization of a linearized Kohn-Sham energy functional. 124,125The functional is linearized around an approximate ground state density, n.This density is included as a dynamical field variable driven by an extended harmonic oscillator centered on an approximate ground state, q[n], which is given by the minimization of the linearized Kohn-Sham functional.The harmonic well is defined in terms of a metric tensor, T = K T K, where the kernel K is assumed to be the inverse Jacobian of the residual function, q[n] − n. 124 The equations of motion are given by Here, MI are the atomic masses, RI are the nuclear coordinates, ω is the frequency of the harmonic oscillator, q[n] are the net Mulliken charge vectors (from an optimized linearized energy expression), and n is the extended dynamical variable that is set to the optimized ground state net Mulliken charge vector in the initial time step.The details of the DFTB + implementation are given in Ref. 126.
We currently approximate the kernel by a scaled identity matrix, For many problems, this is a sufficiently accurate approximation.However, for the most challenging problems including simulations of reactive chemical systems or metals, the scaled delta function is not a sufficiently stable approximation.Improved approximations have been developed 124 and will be implemented in the DFTB+ program in the near future.

F. Objective geometries
Objective structures 127 (OSs) describe geometries consisting of a set of identical cells, where the corresponding atoms in different cells can be mapped onto each other by orthogonal transformation(s).Both finite and infinite OSs are possible.Currently, we describe structures [127][128][129] possessing Cn rotational symmetry and a Cm ⊗ T screw axis, where n ∈ N * and m ∈ R + , with N atoms in the reference objective cell ({Xi}) and {ζ, ξ} ∈ N, where −∞ < ζ < ∞ and 0 < ξ < n.Exploiting the objective boundary conditions (OBCs) can introduce substantial computational savings, for example, irrational values of m lead to structures with a small OS cell, but an infinitely long one dimensional periodic boundary condition (PBC), i.e., intractable purely as a T operation.OBCs generalize symmetry-adapted Bloch sums for orbitals.
As with molecular and periodic structures, 8 most expressions in DFTB+ can be performed in real space via the boundary-condition agnostic and sparse representation of matrices in real space, the solution of the hamiltonian only requires dense matrices and k-points.
For the long-range Coulombic and dispersion interactions in DFTB, we also require lattice sums that are generalized to these boundary conditions. 130urther examples can be found in Refs.131-133, but here we demonstrate the bending of a BN bi-layer.Figure 4 shows a doublewalled tubular OS with a curvature of 1/R (from the tube radius) that represents the bent bi-layer.Bending along the a (b) direction of the sheet is an "armchair" ("zig-zag") tube with a Cn proper axis, described as an eight atom objective cell in which we select T = a (T = b), with no tube twist.The bi-layer bends as a plate, with the outer wall stretching and the inner wall compressing along their circumferential directions; its energy change is interpreted as bending strain (E bend ).It is important to note that the corresponding curvature is not an imposed constraint, but a result of the calculation: R is the average tube radius.A wider range of OSs will be made available in later DFTB+ releases, along with adapted electrostatic evaluation for these structures.

G. Extended tight binding hamiltonian
The extended tight binding (xTB) methods were primarily designed for the fast calculation of structures and non-covalent interaction energies for finite systems with a few thousand atoms.The main parameterizations, GFNn-xTB, target molecular geometries, frequencies, and non-covalent interactions follow mostly a global and element-specific parameter only strategy.The historically first parameterization, GFN1-xTB, covers all elements up to Z = 86 and is now supported in DFTB+.Its successor, GFN2-xTB, 71 will also be made available in the future.
We briefly outline the xTB methods; for a more detailed discussion and comparison to other methods, we refer to Refs.70 and 71.The xTB core hamiltonian is constructed in a partially polarized STO-nG basis set with diagonal terms made flexible by adding a dependence on the local chemical environment according to a coordination number (CN), similar to that used in DFT-D3, 60 The off-diagonal terms are approximated as an average of the diagonal terms proportional to the overlap between the corresponding basis functions.Both GFN1-xTB and GFN2-xTB include density fluctuation up to third order diagonal terms, while the distance dependence of the Coulomb interaction within the isotropic second order term is described by a generalized form of the Mataga-Nishimoto-Ohno-Klopman [134][135][136] expression.In GFN2-xTB, the expansion of the second order density fluctuations goes beyond the usual isotropic The Journal of Chemical Physics ARTICLE scitation.org/journal/jcpenergy terms and includes interactions up to R −3 , i.e., charge-dipole, dipole-dipole, and charge-quadrupole interactions, which significantly improves the description of inter-molecular interactions, such as halogen bonds and hydrogen bonds, without the need to include force-field-like corrections as in DFTB or GFN1-xTB.It is planned to implement full multipole electrostatics with Ewald summation in DFTB+ to enable GFN2-xTB and other generalized DFTB models. 137FN1-xTB and GFN2-xTB have been extensively tested for their target properties, 71 and further studies regarding structures for lanthanoid complexes 138 and transition metal complexes 66 have shown xTB methods to be robust for all its parameterized elements.Errors in these methods are very systematic, which can be used to devise correction schemes for off-target properties such as reaction enthalpies. 139 DFTB parameterization 1. Parameterization workflow a. Electronic parameters.The electronic parameterization for DFTB involves two principal steps.First, the compressed atomic densities and the atomic basis functions have to be determined (a one-center problem), followed by the calculation of the hamiltonian and overlap elements at various distances (a two-center problem).The compressed densities and wave-functions come from solving a Kohn-Sham-problem for a single atom with an additional confinement potential (usually a power function), as shown in Eq. ( 4).One may use different compression radii (and make separate calculations) to obtain the compressed density and the compressed atomic wave-functions for a given atom.The atomic calculations are currently carried out with a code implementing the Hartree-Fock theory based atomic problem 140,141 extended with the possibility of including DFT exchange-correlation potentials via the libxc library 142 and scalar relativistic effects via the zero-order relativistic approximation (ZORA).143 The resulting densities and atomic wave-functions are stored on a grid.The two-center integration tool reads those grid-based quantities and calculates the hamiltonian and overlap two-center integrals for various distances using the Becke-method.144 b.Repulsive parameters.Once the electronic parameters for certain species have been determined, the first three terms of Eq. ( 9) can be calculated for any systems composed of those species.The missing fourth term, the repulsive energy, is composed of pairwise contributions, V rep AB , between all possible atomic pairs of A and B in the system [see Eq. ( 6)].During the parameterization process, one aims to determine repulsive potentials between the atomic species as a function of the distance between the atoms RAB so that V rep AB = f sp(A),sp(B) (RAB), where sp(X) refers to the species of atom X.In contrast to the electronic parameters, which are determined by species-specific parameters only, the repulsive functions must be defined for each combination of species pairs separately.They are usually determined by minimizing the difference between the reference (usually ab initio) total energies and the DFTB total energies for a given set of atomic geometries.If one uses only one (or a few simple) reference system(s), the optimal repulsive function can be determined manually, while for more complex scenarios, usually semi-automatic approaches 15,21,[145][146][147] are used.

Outlook
In recent years, machine learning has been utilized with DFTB+, usually to enhance the generation and description of the repulsive potentials [148][149][150][151][152] or try to improve on electronic parameters. 151,153Related Δ-machine learning 154 methodologies based on neural network corrections for DFTB energies and forces have been also reported recently. 155,156We are currently in the process of developing a new unified machine-learning framework, which for a target system allows optimal adaption of both the electronic and the repulsive contributions.Given the predicted DFTB model, one would still have to solve it in order to obtain the system properties.On the other hand, changing external conditions (temperature, electric field, and applied bias) would not require additional training in this approach, and long range effects (e.g., metallic states) could also be described easily.

III. TECHNICAL ASPECTS OF THE DFTB+ PACKAGE A. Parallel scaling
In large-scale simulations, the solution of the DFTB hamiltonian to obtain the density matrix eventually becomes prohibitively expensive, scaling cubically with the size of the system being simulated.The diagonalization infrastructure in DFTB+ has undergone a major upgrade, including distributed parallelism and GPU accelerated solutions to address this cost.If instead the density matrix is directly obtained from the hamiltonian, circumventing diagonalization, then linear or quadratic scaling can now be obtained, depending on the chosen method.DFTB+ will continue to benefit from developments in these advanced solvers as we move into the era of exascale computing.

The ELSI interface and supported solvers
ELSI 157 features a unified software interface that simplifies the use of various high-performance eigensolvers (ELPA 158 EigenExa, 159 SLEPc, 160 and MAGMA 161 ) and density matrix solvers (libOMM, 162 PEXSI, 163 and NTPoly 164 ).We convert the sparse DFTB+ H and S structures 8 into either standard 2D block-cyclic distributed dense matrices or sparse 1D block distributed matrices compatible with the ELSI interface.All k-points and spin channels are then solved in parallel.
6][167] Figure 5 demonstrates two examples: non-self-consistent-charge, spin-non-polarized, Γ-point calculations for a C 64000 nanotube (CNT) and a Si 6750 supercell, with 25 600 and 27 000 basis functions, respectively.Figure 5(c) shows the time to build the density matrix for the CNT model with three solvers, the pdsyevr eigensolver in the MKL implementation of ScaLAPACK, the ELPA2 eigensolver, and the PEXSI density matrix solver.Here, both the MKL's version of pdsyevr eigensolver and the ELPA2 eigensolver adopt a two-stage tri-diagonalization algorithm. 158,168,169In terms of performance, ELPA2 and MKL pdsyevr are similar, while both are outperformed by the PEXSI solver by more than an order of magnitude.The PEXSI 163 method directly constructs the density matrix from the hamiltonian and overlap matrices with a computational complexity of O(N (d+1)/2 ) for d = 1 . . .3D systems.This reduced scaling property stems from sparse linear algebra, not the existence of an energy gap.Therefore, for any low-dimensional system, regardless of the electronic structure, PEXSI can be used as a powerful alternative to diagonalization.A similar comparison of solver performance for the silicon supercell model is shown in Fig. 5(d), where the NTPoly density matrix solver shows greater performance than the MKL pdsyevr and ELPA2 eigensolvers.Around its massively parallel sparse matrix multiplication routine, NTPoly implements various linear scaling density matrix purification methods, including the 2nd order trace-resetting purification method (TRS2) 170 used here.While PEXSI is not particularly suited for 3D systems, NTPoly offers an alternative as long as the system has a non-trivial energy B. Order-N scaling with the SP2 solver The SP2 (second-order recursive spectral projection expansion), 170 which is valid at zero electronic temperature when 1/kBT → ∞, recursively expands a Heaviside step function to project the (occupied) density matrix, where H is the hamiltonian transformed into an orthogonalized basis, given by the congruence transformation, H = Z T HZ.Each iteration of the SP2 Fermi-operator expansion consists of a generalized matrix-matrix multiplication that can be performed using thresholded sparse matrix algebra.In this way, the computational complexity in each iteration can be reduced to O(N) for sufficiently large sparse matrices.Note that we cannot expect linear scaling complexity for metals, since the inter-atomic elements of the density matrix decay algebraically instead of exponentially. 171The spectral projection functions in the SP2 expansion can be chosen to correct Tr(ρ) such that the step is formed automatically around the chemical potential separating the occupied from the unoccupied states. 170btaining the congruence matrix, Z, introduces a potential O(N 3 ) bottleneck.To avoid this, the sparsity of S can be exploited and the Z matrix can be obtained recursively with linear scaling complexity applying the "ZSP method" developed in Refs.172 and 173.
Several versions of the SP2 algorithm can be found in the PROGRESS library, 174 which uses the Basic Matrix Library (BML) 175,176 for the thresholded sparse matrix-matrix operations.The matrix data structure is based on the ELLPACK-R sparse matrix format, which allows efficient shared memory parallelism on a single node. 177The DFTB+ code was modified to use the LANL PROGRESS library and, in particular, the SP2 and ZSP algorithms.In combination with XL-BOMD, this allows efficient energyconserving, molecular dynamics simulations, where the computational cost scales only linearly with the system size.Figure 6 shows the performance of the SP2 algorithm compared to regular diagonalization.

C. GPU computing
Graphics processing unit (GPU) acceleration is implemented in DFTB+.Given the nature of the underlying theory, the timelimiting step in routine calculations corresponds to the diagonalization of the hamiltonian matrix, taking in the order of 90%-95% of FIG. 6. CPU time for the density matrix construction for different varying sizes of water box systems.Regular diagonalization (black curve) was compared to the SP2 method (red curve).A numerical threshold of 10 −5 was used in the sparse matrix-matrix multiplications of the SP2 algorithm.the total running time.The hybrid CPU-GPU implementation in DFTB+ replaces the LAPACK-based eigensolver with a GPU eigensolver based on the divide-and-conquer algorithm as implemented in MAGMA. 178enchmarking of the code shows that at least 5000 basis functions are necessary to exploit the power of the GPUs and to produce an observable speedup with respect to the CPU-only code.For systems spanning a vector space comprised of 70 000 basis functions, speedups of 17× have been observed in a system with 6 NVIDIA ® Tesla ® V100 with respect to the multi-threading CPU-only implementation (see Fig. 7).

IV. INTERFACING DFTB+ WITH OTHER SOFTWARE PACKAGES
DFTB+ can be currently interfaced with other software packages using three different ways of communications: file communication, socket based, or direct connection via the DFTB + API as a library.The first one is very easy to implement but comes with an overhead for the file I/O, while the latter two enable a more efficient coupling at the price of somewhat higher complexity in implementation.

A. File based communication
When using file based communication, the external driver creates necessary input files and starts an individual DFTB+ program for each of the inputs.After DFTB+ has finished, the driver analyses the created output files and extracts the necessary information from those.DFTB+ had been interfaced using file based communication to, among others, the phonopy 179 code for finite difference harmonic and anharmonic phonon calculations and the Atomic Simulation Environment (ASE) package 180 (a set of tools and Python modules for setting up, manipulating, running, visualizing, and analyzing atomistic simulations).

B. Socket interface
The i-PI 181 interface for communication with external driving codes is supported by DFTB+.DFTB+ can then be driven directly instead of using file I/O.The initial input to DFTB+ specifies the boundary conditions, type of calculation, and chemical information for atoms, and the code then waits to be externally contacted.This kind of communication with DFTB+ can be used by, among others, the i-PI universal force engine package 181 and ASE. 180 DFTB+ library, QM/MM simulations 1. Gromacs integration DFTB quantum-chemical models may be utilized as a QM engine in hybrid quantum-mechanical/molecular mechanical (QM/MM) approaches.This allows, for example, efficient simulations of chemical processes taking place in bio-molecular complexes.The DFTB+ library interface has been connected to the Gromacs 182 MM-simulation software package.(The Gromacs part of the integration is contained in a fork of the Gromacs main branch.183 ) At the start of the simulation, the DFTB + input file is read in, and a DFTB calculation environment is created, containing all of the necessary information (parameters), but no atomic coordinates yet.In every step of MD simulation or energy minimization, the calculation of forces starts with a call to the DFTB+ API, passing the coordinates of QM atoms and the values of electrostatic potentials induced by the MM atoms at the positions of the QM atoms.DFTB+ then returns QM forces and QM charges back to Gromacs, where the QM/MM forces are calculated in the QM/MM routines.Gromacs then continues by calculating the MM forces, integration of equations of motion, etc.
Sometimes the electrostatic interactions cannot be represented as an external potential but also depend on the actual values of the QM-charges (i.e., polarizable surroundings).In those cases, a callback function can be passed to DFTB+, which is then invoked at every SCC iteration to update the potential by the driver program whenever the QM charges change.In the DFTB+/Gromacs integration, we use this technique to calculate the QM-QM electrostatic interactions in periodic systems with the highly efficient particle mesh Ewald method 184 implemented in Gromacs.

DL_POLY_4 integration with MPI support
DL_POLY_4 is a general-purpose package for classical molecular dynamics (MD) simulations. 185In conjunction with the recent extension of DFTB+'s API, DL_POLY_4.10supports the use of DFTB+ for self-consistent force calculations in place of empirical force fields for Born-Oppenheimer molecular dynamics.
The interface fully supports passing MPI communicators between the programs, allowing users to run simulations in parallel, across multiple processes.The MPI parallelization schemes of DL_POLY_4 and DFTB+ differ considerably.DL_POLY_4 utilizes domain decomposition to spatially distribute the atoms that comprise the system across multiple processes, whereas DFTB+ distributes the hamiltonian matrix elements using BLACS The DL_POLY_4-DFTB+ interface works by gathering the atoms from each DL_POLY_4 process such that all processes have a complete copy of all the atoms.Coordinates, species types, and the atomic ordering are then passed to DFTB+.The calculated forces are returned to DL_POLY_4, which redistributes them according to its domain decomposition, and the atomic positions are propagated one time step.
Spatial decomposition means that atoms can propagate between processes.Because atoms are gathered sequentially according to their process id (or rank), when atoms propagate between processes, their ordering effectively changes.The DFTB+ API facilitates this and is, therefore, able to support any molecular dynamics software that implements domain decomposition parallelization; however, the total number of atoms (and atom types) must be conserved during the simulation.

D. Meta-dynamics using PLUMED
Molecular dynamics is often plagued by high energy barriers that trap the nuclear ensemble in one or several local minima.7][188] This "timescale" problem is typical for rare-event systems or those in which ergodicity of a particular state is impeded by the local topology of the potential energy surface.A variety of methods have been conceived to circumvent this, including umbrella sampling 189 and meta-dynamics. 190mbrella sampling and meta-dynamics can now be performed using DFTB+ via its interface to the PLUMED plugin. 191,192Using PLUMED, MD trajectories generated in DFTB+ can be analyzed, sampled, and biased in a variety of ways along user-defined collective variables (CVs), enabling accelerated MD simulations and determination of the free energy surface.A CV is a subspace of the full potential energy surface that can be arbitrarily defined to sample atomic dynamics along dimensions/pathways of physicochemical interest.PLUMED also includes bias functions such as the upper and lower wall biases, enabling a constraint of MD configurations to specific areas on the potential energy surface.The utility of the DFTB+/PLUMED interface has been demonstrated on several challenging systems, including malonaldehyde intra-molecular proton transfer (Fig. 8), corannulene bowl inversion, and the diffusion of epoxide groups on graphene. 192

E. DFTB+ in Materials Studio
DFTB+ is included as a module in the commercial modeling and simulation software package, BIOVIA Materials Studio (MS). 193FTB+ runs as an in-process energy server, supplying energies, forces, and stresses to drive the MS in-house simulations tools.Supported tasks include energy calculation, geometry optimization, molecular dynamics, electron transport calculation, mechanical properties, and parameterization.The module also supports calculation and visualization of standard electronic properties, such as band structure, density of states, orbitals, and so on.The DFTB+ module integrates closely with the data model and the Materials Studio Visualizer, allowing the user to construct structures and start calculations quickly, with fully automated creation of the DFTB+ input file.The DFTB+ module is also supported in the MS Mate-rialsScript interface and the Materials Studio Collection for Pipeline Pilot, 194 allowing creation of more complicated workflows. 195,196he DFTB+ parameterization workflow in MS supports fitting of both electronic parameters and repulsive pair potentials using DFT calculations with the DMol3 module 197,198 as a reference.The DFTB+ module includes scripts for validation of parameters in terms of band structure, bond length, bond angles, and so on, as well as visualization for the hamiltonian, overlap matrix elements, and the repulsive pair potentials.The parameterization tools allow extension of existing parameters or incremental development of a parameter set.Parameters developed using the DFTB+ module can, after conversion, be used outside MS.Several default DFTB+ parameter sets, generated using these parameterization tools, are also included.In 2019, MS introduced a new parameter set that includes the Li, C, H, N, O, P, and F elements and is aimed toward Li-ion battery modeling.

F. Outlook
In order to enable flexible general communication with various types of external components (external drivers, QM/MM, and machine learning models), we are in the process of developing a communication library, 199 which allows for data exchange between mixed language (e.g., Fortran and C) components via API-bindings as well as between different processes via socket communications.After engagement with other stakeholders, this will be released as a set of BSD-licensed tools and a library.

V. SOFTWARE ENGINEERING IN DFTB+
This section presents a few aspects of our software development, which may have some interest beyond the DFTB+ software package.

A. Modern Fortran wrappers for MPI and ScaLAPACK functions
Modern scientific modeling packages must be able to run on massive parallel architectures to utilize high performance computing, often using the Message Passing Interface (MPI) framework.While the MPI offers a versatile parallelization framework, its application interface was designed to support C and Fortran 77-like interfaces.This requires the programmer to explicitly pass arguments to the MPI-routines, which should be automatically deduced by the compiler for languages with higher abstraction levels (C++ or Fortran 95 and newer versions).In order to eliminate developer need to pass redundant information (and to reduce associated programming bugs), we have developed modern Fortran wrappers around the MPI-routines.These have been collected in the MPIFX-library, 200 which is an independent software project outside of the DFTB+ software suite, being licensed under the more permissive BSD-license.It enables shorter MPI-calls by automatically deducing data types and data sizes from the call signature.Additionally, several MPI parameters have been made optional using their most commonly used value as a default value.For example, in order to broadcast a real array from the master process to all other process, one would have to make the following MPI-call: call mpi_bcast(array, size(array), MPI_FLOAT, 0, comm, err) while MPIFX-wrappers reduce it to a much shorter and less errorprone line: call mpifx_bcast(comm, array) where comm is an MPIFX derived type containing the MPIcommunicator.The type (MPI_FLOAT) and number of broadcasted items [size(array)] are automatically deduced.The process initiating the broadcasting has been assumed to be process 0 (master process), as this is probably the most common use case but can be customized when needed with an optional parameter.The error argument is optional as well, if it is not passed (as in the example above), the routine would stop the code in the case of any errors.
Likewise, the commonly used parallel linear algebra library ScaLAPACK uses Fortran 77-type interfaces.The open source SCALAPACKFX library 201 offers higher level modern Fortran wrappers around routines used by DFTB+.

B. Fortran meta-programming using Fypp
Although the latest Fortran standard (Fortran 2018) offers many constructs to support modern programming paradigms, it does not allow for generic template based programming.This would avoid substantial code duplication and offer useful metaprogramming capabilities for Fortran programmers.We have developed the Python based pre-processor, Fypp, 202 which offers a workaround for the missing features.Fypp is used during the build process to turn the meta-programming constructs into the standard Fortran code.The Fypp project is independent of the DFTB+ software package and is licensed under the BSD-license, being also used by other scientific software packages, for example, by the CP2K code 203 and both the MPIFX and the SCALAPACKFX libraries.

VI. SUMMARY
DFTB+ is an atomistic quantum mechanical simulation software package allowing fast and efficient simulations of large systems for long timescales.It implements the DFTB-and the xTBmethods and various extensions of those, such as range-separated functionals, multiple methods of excited state calculations, and electron transport simulations.It can be used either as a standalone application or as a library and has been already interfaced to several other simulation packages.DFTB+ is a community developed open source project under the GNU Lesser General Public License, which can be freely used, modified, and extended by everybody.

SUPPLEMENTARY MATERIAL
See the supplementary material for the damping parameters for the D4, the TS, and the MBD dispersion models.

FIG. 1 .
FIG. 1. Performance of different dispersion corrected tight binding methods on the S30L benchmark set, and the values for SCAN-D4 are taken from Ref. 65.

FIG. 3 .
FIG. 3. Transmission across two (10,10) CNTs as a function of the displacement of the top CNT along the axis of the bottom CNT.The two curves represent the transmission resolved between electrode 1 of the bottom CNT, and, respectively, electrodes 2 and 3 of the top CNT (as labeled in the inset).
Figure 4(b) demonstrates linearity with

FIG. 4 .
FIG. 4. (a) OS of a BN bi-layer tube with a B 4 N 4 unit (red and blue atoms).Angular, but not translational, objective images are shown in gray.(b) Bending energy (circles) vs curvature with a linear fit.

FIG. 5 .
FIG. 5. Atomic structures of (a) the carbon nanotube (CNT) model (6400 atoms) and (b) the silicon supercell model (6750 atoms).The length of the actual CNT model is 16 times that of the structure shown in (a).(c) and (d) show the time to compute the density matrix for models (a) and (b), respectively.Calculations are performed on the NewRiver computer.MKL pdsyevr and ELPA2 first compute all the eigenvalues and eigenvectors of the eigensystem of H and S and then build the density matrix.PEXSI and NTPoly directly construct the density matrix from H and S.

FIG. 8 .
FIG. 8. Intra-molecular proton transfer in malonaldehyde at 298 K. Contours show the DFTB3-D3/3ob free energy surface of malonaldehyde obtained using welltempered meta-dynamics, with collective variables d(O 1 -H) and d(O 2 -H).Each point is colored according to its sampling frequency during the meta-dynamics simulation, yellow (blue) indicating high (low) sampling frequency.The DFTB3-D/3ob free energy surface yields a proton transfer barrier of 13.1 ± 0.4 kJ mol −1 .
Wall clock running times for total energy calculations of water clusters (with 6 basis functions/water molecule).The black curve shows timings obtained using the LAPACK compatible ESSL eigensolver on the CPU, and the red/green curves show timings obtained using the MAGMA and the ESSL libraries without/with ESSL-CUDA off-loading.Timings have been recorded on the Summit machine using 42 threads for 42 physical cores.

The Journal of Chemical Physics ARTICLE scitation.org/journal/jcp decomposition
. This does not impose any serious restrictions as DL_POLY_4 and DFTB+ run sequentially, with DFTB+ being called once per MD time step.