Indirect Learning of Interatomic Potentials for Accelerated Materials Simulations

Machine learning (ML) based interatomic potentials are emerging tools for materials simulations but require a trade-off between accuracy and speed. Here we show how one can use one ML potential model to train another: we use an existing, accurate, but more computationally expensive model to generate reference data (locations and labels) for a series of much faster potentials. Without the need for quantum-mechanical reference computations at the secondary stage, extensive reference datasets can be easily generated, and we find that this improves the quality of fast potentials with less flexible functional forms. We apply the technique to disordered silicon, including a simulation of vitrification and polycrystalline grain formation under pressure with a system size of a million atoms. Our work provides conceptual insight into the machine learning of interatomic potential models, and it suggests a route toward accelerated simulations of condensed-phase systems and nanostructured materials.

The properties of materials are governed by the atomic structure and by the forces acting on atoms. Creating accurate computational models for interatomic forces and potential-energy surfaces is therefore a central task in the physical sciences. Today, machine learning (ML) methods are increasingly used to represent quantum-mechanical potential-energy surfaces, typically based on density-functional theory (DFT) ground-truth data and achieving prediction accuracy to within a few meV per atom [1]. This way, simulations of quantum-mechanical quality become accessible that are orders of magnitude cheaper than DFT [2-8]. ML potential models have begun to be applied to challenging problems in physics and related fields: structures and properties of amorphous solids [9-11], phase transformations under extreme conditions [12-14], or surface science and catalysis [15,16]. Recent overviews summarize progress in the field [17-20].
However, ML potentials still face challenges. A key one is that they incur much larger computational cost than efficient empirical counterparts. In addition to developing accurate and fast potentials, there is a similarly important need for validating ML potential models. This is challenging because less physical information, or even none, is encoded into the model a priori, and because there is not always an unambiguous correlation between the numerical error of a potential (on a fixed test set) and its behavior in simulations [21]. The importance of physically motivated tests has been demonstrated, e.g., in a comparative study of carbon interatomic potentials [22]. A recent benchmark study compared different ML potential fitting frameworks and their numerical performance [1]. Specifically for silicon (Si), various tests including random search have been devised [21,23,24].
In the present work, we outline an approach to training ML potentials by indirect learning: we use one model to create reference data for another. The former is accurate and flexible, but more computationally expensive than the latter. We demonstrate that training a potential in this way for disordered Si largely retains the prediction quality of the initial model (both in terms of numerical errors and physically motivated tests), whilst speeding up the simulation substantially. In this way, much larger-scale simulations become accessible; conversely, the cost of a given simulation is reduced to a few percent.
We briefly discuss terminology. The approach herein is related to "knowledge distillation" in ML research [25], which seeks to transfer information from a large "teacher" model to a smaller "student" model, enabling faster predictions. So far, distillation has been applied primarily to classification tasks performed by neural networks, such as image recognition [25]. "Database distillation", synonymous with "dataset meta-learning", was recently described as a way to condense information into smaller training sets for cheaper learning [26], and the term "meta-labeling" has been used to describe the construction of secondary ML models using a primary one [27]. We here think of "distillation" in a rather broad sense: it encapsulates the indirect learning of potential models via any fitting framework, and the generation of entirely new atomic configurations for training an indirectly-learned model, in addition to the energy and force data. In other words, we use the teacher model to generate both the data locations and the data labels.

Methodology.—ML potentials approximate a potential-energy surface (PES) based on reference data, typically from small-scale DFT computations [32]:

    $D_0 = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N} \rightarrow \tilde{y}(\mathbf{x})$,    (1)

where $\mathbf{x}_i$ is a descriptor for the atomic environment [33], $y_i$ is one of N observations of the reference PES, and $\tilde{y}$ denotes the ML model fitted with a given method [34]. We call this method "M1" in the following.
We now show that two fitting methodologies at different points on the "Pareto front" of accuracy and cost [1], namely the Gaussian Approximation Potential (GAP) [3,32] and Moment Tensor Potential (MTP) [6,38] frameworks, can be combined to accelerate simulations of disordered Si. To do so, we train MTPs using an existing general-purpose ML potential for Si [36], referred to as GAP-18, as the reference method. We use the MTP formalism (henceforth denoted M2) to approximate the GAP-18 PES (M1), which is itself approximating the DFT PES (M0). Hence, M2 is indirectly learning quantum-mechanical reference data with M1 as its teacher (Fig. 1). Extending the notation of Eq. (1),

    $D_1 = \{(\mathbf{x}_j, \tilde{y}(\mathbf{x}_j))\}_{j=1}^{N_2} \rightarrow \tilde{\tilde{y}}(\mathbf{x})$,    (2)

where $\tilde{\tilde{y}}$ is an indirectly-learned model, and $N_2 \gg N$. In the following, we use the notation M_L for a directly-learned MTP, and M̃_L for an indirectly-learned one [39].
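The two-stage scheme just described (a flexible teacher fitted to sparse ground-truth data, then a less flexible student fitted to abundant teacher-labeled data) can be illustrated with a deliberately simple, self-contained sketch. This is not the actual GAP/MTP machinery: a 1D toy "PES", a kernel-ridge teacher, and a polynomial student are all illustrative stand-ins, and every function name below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def pes_m0(x):
    """Toy 1D stand-in for the quantum-mechanical PES (M0)."""
    return np.sin(3 * x) + 0.5 * x**2

# Stage 1: fit a flexible "teacher" (M1) to a sparse M0-labeled dataset D0.
x0 = rng.uniform(-2, 2, 40)
y0 = pes_m0(x0)

def rbf_kernel(a, b, ell=0.3):
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * ell**2))

alpha = np.linalg.solve(rbf_kernel(x0, x0) + 1e-6 * np.eye(len(x0)), y0)

def pes_m1(x):
    """Kernel-ridge teacher model, playing the role of the M1 fit."""
    return rbf_kernel(x, x0) @ alpha

# Stage 2: the teacher generates BOTH the locations and the labels of a much
# larger dataset D1 (N2 >> N), to which a cheaper, less flexible "student"
# (M2) is then fitted.
x1 = rng.uniform(-2, 2, 5000)
y1 = pes_m1(x1)
coeffs = np.polyfit(x1, y1, 12)

def pes_m2(x):
    """Polynomial student model, indirectly learned from the teacher."""
    return np.polyval(coeffs, x)

# The student tracks its teacher closely on held-out points.
x_test = np.linspace(-2, 2, 200)
mae = float(np.mean(np.abs(pes_m2(x_test) - pes_m1(x_test))))
print(f"student-vs-teacher MAE: {mae:.4f}")
```

Because the teacher is cheap to evaluate, the student's training set can be made almost arbitrarily large, which is the key practical point of the approach.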
There are two primary advantages to using GAP-18 as an intermediary rather than training directly on DFT-labeled data. Both of these hinge on the massively reduced computational cost of GAP compared to DFT. Firstly, we can build an almost arbitrarily large database of GAP-labeled training data, thus approximating the M1 PES as faithfully as possible within the (flexibility and fitting) limitations of M2. Secondly, we can evaluate the fidelity of the fit via numerical errors on a wide-ranging set of out-of-sample testing data, and also test behavior in large-scale molecular-dynamics (MD) simulations. The practical advantages of the present application of distillation rely on the observation that GAPs may be able to more accurately interpolate across a sparsely sampled PES compared to MTPs, which is not yet fully investigated [1,44]. We expect, however, that the same concept could be applied to any pair of fitting methodologies where a cost-accuracy disparity exists. The insights into database development and physically-guided validation of ML potentials are independent of the particular M1 and M2 methods chosen.
For the present study, we aim to replicate the behavior of GAP-18 under pressure, which we find to be an instructive test case, and which relates to questions of experimental interest [45-47]. To improve interpolation for high-pressure structures, we use a large training database (D1 in Fig. 1) containing 250 structures with 1,000 atoms each, corresponding to compression to 20 GPa (as in Ref. 11). These structures are supplemented with the original GAP-18 database (Fig. 2), which we re-label with GAP-18 energies and forces to make the input data consistent.
Numerical errors.—The accuracy of an ML potential is often characterized by an error versus its reference method [1], typically evaluated on a test set not included during training. In this vein, Fig. 3 shows the mean absolute error on different test sets for a series of M and M̃ models of increasing complexity and cost. On the one hand, we test on random-structure-searching (RSS) configurations which are different from the fitting database; on the other hand, we test on snapshots from small-scale compression MD simulations, directly mirroring the physical situation that is to be "indirectly learned".
For both test sets, Fig. 3 provides encouraging results: comparing their predictions to M0 (DFT), indirectly-learned MTP models tend to at least match, if not improve upon, the corresponding directly-learned (D0-trained) MTPs. Inspecting the force-component errors (lower panels in Fig. 3) indicates that our indirectly-learned MTP models faithfully recover the predictions of M1, suggesting that the existing M1 PES is smooth enough to be learned by a different fitting method. More generally, the error analysis suggests that improvements to M1 can be expected to transfer to subsequent indirectly-learned models.
We emphasize that the tests in Fig. 3 are not designed, nor able, to assess the quality of MTP models in absolute terms. In previous studies, reference databases for MTP fitting have been built using active learning [48,49], and it would now be interesting to include such a gradually improved database in the construction of MTP models for high-pressure disordered Si.
Physically-guided validation.—While numerical errors provide some indication of performance, the correlation between the two is unreliable enough to warrant extensive validation of ML potentials based on physical behavior: for example, assessing whether structural features of a reference simulation are reproduced. Here we use the Smooth Overlap of Atomic Positions (SOAP) kernel [37], which has proven useful for analyzing local structure in amorphous Si [11,50]. The computational speed of ML potentials, and of MTPs in particular, allows us to compare several different indirectly-learned potentials in large-scale (100,000-atom) simulations. For Si, this provides a more stringent test than small-scale structures, allowing us to assess the description of the nucleation of sh crystallites and their intervening grain boundaries, and to quantify this description using SOAP.
Figure 4 benchmarks a series of MTPs, directly-learned (blue; M_L) or indirectly-learned (red; M̃_L), with increasing quality and cost. All compression simulations start from the same low-density amorphous (LDA) structure from Ref. 11 and are run using LAMMPS [52]. In all cases, we use the similarity to sh Si as the quality measure, aiming to quantify how well a given model reproduces the behavior of GAP-18 (M1). Overall, indirectly-learned MTPs perform much more reliably in the compression test than those directly fitted (to D0), with every M̃_L model producing the expected sequence of phases [11], viz. LDA → very-high-density amorphous (VHDA) → polycrystalline (pc) sh. In contrast, only 2 out of the 5 directly-learned models perform acceptably. Such comparisons are only feasible because M1 is efficient enough to produce a reference simulation (albeit with considerable effort), and additionally because M2 is efficient enough to readily produce many such simulations.

FIG. 4. Performance of indirectly-learned potential models of increasing flexibility in 100,000-atom compression simulations of amorphous Si. We report the average SOAP similarity to pressure-adjusted sh Si during compression MD (cf. Fig. S1), benchmarking MTPs (solid) versus GAP-18 (dashed). The difference for each MTP to GAP-18 is plotted underneath the absolute values, with M̃_L offset for clarity. Representative snapshots during the run are displayed, colored according to atomistic SOAP similarity to sh (A-E). Points at which MD simulations fail (for M_14 and M_20) are marked with asterisks.

TABLE I. Quality metrics for 100,000-atom LDA Si models produced by 10^11 K s^-1 quench simulations: the proportion of 3- and 5-fold connected atoms (N3 and N5), the inverse height of the first sharp diffraction peak (H^-1) [51], and the mean (θ̄) and width (∆θ) of the bond-angle distribution.
We finally test our indirectly-learned potentials on a task for which they were not specifically trained (in D1): the production of LDA structural models via simulated quenching from the melt. Well-established quantitative measures are available [53-55] to assess how well our indirectly-learned models reproduce the behavior of GAP-18 (Table I). We run quenching simulations with 100,000 atoms, as in Ref. 11, now comparing two representative levels of MTP fitting. In both cases, the indirectly-learned model outperforms a directly-learned one at the same level. Both M̃ models come close to reproducing the defect count, with that of 3-fold connected atoms being almost identical (0.68%, 0.69%, and 0.70% with M̃_16, M̃_20, and GAP-18, respectively), and that of 5-fold connected ones being somewhat more varied. There is not, however, an unambiguous advantage of M̃_20 over M̃_16: specifically, the inverse height of the first sharp diffraction peak, H^-1, is much better described by the latter. This underlines the subtleties in assessing the behavior of ML potentials.
Application.—To demonstrate its usefulness in practice, we employ our approach to generate million-atom structural models of LDA and pc-sh Si. The nanoscale grain structure of pc-sh invites the use of larger-scale models for more extended studies of the grain-boundary region and of the size and orientation of crystallites. The efficiency of MTPs permits fast simulations on long timescales without having to compromise on system size. Figure 5 shows results of a melt-quench (Fig. 5a) and compression MD (Fig. 5b) simulation of a system of 1M Si atoms, using the M̃_16 model [56].
The LDA structure, visualized in Fig. 5a, has a bond-angle distribution closely similar to that of the 100k-atom GAP-18-generated structure (Fig. 5c). However, a more detailed analysis of the environments of the N = 5 atoms ("coordination defects") is possible with the larger system size, as evident from the much lower scatter in the results shown in Fig. 5c. The histogram bin widths are 0.25° for both system sizes for fair comparison, sufficiently narrow to permit analysis of small deviations away from the ideal tetrahedral angle. The wide distribution of bond angles in N = 5 atoms is consistent with the relatively wide spread in local structure, as characterized in Ref. 50 for much smaller simulation cells.
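The bond-angle distribution underlying panels like Fig. 5c is built from the angles subtended at each atom by pairs of its bonded neighbors. A minimal numpy sketch for a single coordination environment (a hypothetical helper for illustration, not the authors' analysis code):

```python
import numpy as np

def bond_angles(center, neighbors):
    """All angles (in degrees) subtended at `center` by pairs of neighbors."""
    vecs = np.asarray(neighbors, dtype=float) - np.asarray(center, dtype=float)
    vecs /= np.linalg.norm(vecs, axis=1)[:, None]   # unit bond vectors
    angles = []
    for a in range(len(vecs)):
        for b in range(a + 1, len(vecs)):
            cosang = np.clip(vecs[a] @ vecs[b], -1.0, 1.0)
            angles.append(np.degrees(np.arccos(cosang)))
    return np.array(angles)

# Ideal tetrahedral coordination: all six angles equal arccos(-1/3) = 109.47 deg.
tet = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
angles = bond_angles((0, 0, 0), tet)

# Histogram with 0.25-degree bins, as used for the distributions in Fig. 5c.
hist, edges = np.histogram(angles, bins=np.arange(0, 180.25, 0.25))
print(angles)
```

In a full analysis one would loop over all atoms, with neighbors defined by a bond-length cutoff, and accumulate the histogram over the whole cell.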
Upon compression, the expected pressure-induced collapse of the largely fourfold-connected structure into a VHDA phase [11,57] is reproduced by the indirectly-learned potential, the onset occurring at only slightly lower pressure compared to the reference GAP-18 simulation. There are no discernible differences between the volume-pressure curves of 100k-atom and 1M-atom simulations carried out using the same indirectly-learned model (Fig. 5d), making system-size effects in VHDA formation unlikely. However, system-size effects in the nucleation of sh crystallites are clearly still important even in 100k-atom systems. On the top face of the GAP-18 structure in Fig. 5b, a crystallite extends all across the cell and hence is infinite in that direction under periodic boundary conditions. The 1M-atom structure generated by M̃_16 has similar grain sizes, but the larger cell ensures that no grain approaches the length of the cell itself.
The differences in the volume-pressure curves between GAP-18 and M̃_16 (Fig. 5d) are of a similar magnitude to those between GAPs trained using different reference methods in Ref. 11. The crystallization and volume of the pc-sh phase are well reproduced, as is the behavior as characterized using SOAP (Fig. 5e). Figure 5 therefore suggests that our indirectly-learned M̃_16 model can, to a large extent, match the predictions of GAP-18. In this way, more detailed analyses become possible, for example, of the grain structure in pc-sh Si, as well as a closer study of the LDA phase, which is the subject of ongoing work.
In future studies, we envisage using more diverse combinations of fitting frameworks in a similar vein, e.g., neural networks [2] and SNAP [4], and automating this model selection. In addition, we propose to incorporate ensembles of base models, including ones using different levels of theory, to further augment the capabilities of an indirectly-learned model. We hope that such developments will help to make quantum-mechanically accurate materials simulations with millions of atoms more commonplace in the years ahead.

The Smooth Overlap of Atomic Positions (SOAP) descriptor is a fingerprint originally developed in the context of ML potentials [37] and widely employed in GAP fitting [32]; however, SOAP has also been used for classifying and understanding complex structures across a wide range of chemical systems [33, S1-S2]. Briefly, the SOAP formalism encodes the local environment of atom i as a neighbor density ρ(i)(r) by placing a 3D Gaussian function of width σ_at on atom i and on each of its neighbors up to a cutoff. A similarity measure k_SOAP(i, j) between two such environments is given by evaluating the overlap integral of their respective ρ(r) and averaging over all possible rotations R of one of the local environments.
Here we use the configuration-averaged SOAP kernel (cf. Ref. S2) to measure the similarity of MD frames to crystal structures adopted by Si, which provides a physical interpretation of the average structure in a large simulation system. We are interested in the structural changes that occur with increasing pressure beyond the trivial isotropic contraction of bond lengths. To separate this effect from the SOAP similarity, we compare our simulation trajectory not only to ambient-pressure crystals ("fixed" in Fig. S1), but also to crystals with lattice parameters optimized as a function of pressure. More precisely, we optimize the lattice parameters at 1 GPa intervals with DFT and interpolate between them when calculating the similarity, so that the reference crystal is under the same pressure as is measured over each ps of MD simulation time ("scaled" in Fig. S1).
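The configuration-averaged comparison can be sketched schematically in a few lines of numpy. The vectors below are random stand-ins for actual SOAP power spectra (computed in practice with quippy, see Table S1), the function name is illustrative, and the sketch simply averages final descriptor vectors per frame, normalizes, and raises the dot product with the reference to the power ζ = 4 used in our analysis:

```python
import numpy as np

def average_kernel(frame_vectors, ref_vector, zeta=4):
    """Configuration-averaged similarity of an MD frame to a reference crystal:
    per-atom descriptor vectors are averaged over the frame, normalized, and
    compared to the (normalized) reference via a dot product raised to zeta."""
    avg = np.mean(frame_vectors, axis=0)
    avg = avg / np.linalg.norm(avg)
    ref = ref_vector / np.linalg.norm(ref_vector)
    return float(np.dot(avg, ref) ** zeta)

rng = np.random.default_rng(1)
ref = rng.random(16)                        # stand-in for a crystal's SOAP vector
frame_same = np.tile(ref, (100, 1))         # 100 "atoms" identical to the reference
frame_pert = frame_same + 0.05 * rng.standard_normal((100, 16))

k_same = average_kernel(frame_same, ref)    # close to 1 by construction
k_pert = average_kernel(frame_pert, ref)    # slightly below 1 for perturbed atoms
print(k_same, k_pert)
```

The similarity is bounded above by 1 (Cauchy-Schwarz) and decreases smoothly as the frame's average environment departs from the reference, which is what makes it a convenient scalar order parameter for the compression trajectories.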
Figure S1 shows the evolution of the SOAP similarity with pressure during the compression simulation reported in Ref. 11 (carried out for a 100,000-atom system using GAP-18), which we here use as reference, assessing the per-atom similarity to crystalline phases in two different ways: either fixing them to their ambient-pressure structures (dashed lines) or allowing the lattice parameters to vary freely with pressure (solid lines). The atomic environments in the quenched LDA phase at 0 GPa have a rather close resemblance to those in the cubic diamond (dia) structure, with both featuring tetrahedral Si environments. Up to about 12 GPa, the similarity to pressure-adjusted dia remains practically constant.

TABLE S1. General hyperparameters used for SOAP-based structural analysis, carried out using the QUIP/GAP code as interfaced to the quippy Python package (see https://github.com/libAtoms/QUIP). SOAP vectors were configuration-averaged (average = T in quippy) for each MD frame before their transformation to the power spectrum and then to the kernel via a dot product with the reference crystal. The dot product was raised to a power of ζ = 4.

More detailed inspection of the MD trajectory confirms the interpretation that the LDA phase does not change substantially beyond an isotropic contraction until it collapses to a very-high-density amorphous phase (VHDA) around 12 GPa [11], seen as a sudden drop in the dia similarity. The VHDA phase has the closest similarity among the tested crystals to the high-pressure Si-VI phase with space group Cmca, which appears experimentally upon compression to about 40 GPa, after the formation of sh and before hexagonal close-packed (hcp) Si. At about 14 to 16 GPa, the nucleation of sh crystallites begins [11], so the similarity to Cmca decays slightly and the similarity to sh increases. At this point, characterizing inhomogeneous polycrystalline sh (pc-sh) with any single variable becomes fraught, which also explains why the Cmca similarity remains so high (along with β-Sn; see Fig. S2). The interfaces between sh grains, which are more similar to Cmca and β-Sn than they are to sh, occupy a reasonable fraction of the simulation-cell volume and hence contribute appreciably to the average similarity. Nevertheless, the analysis in Fig. S1 summarizes the structural changes during compression in a comprehensive way.

MELT-QUENCH SIMULATIONS
The static structure factor, S(Q), is a particularly useful quantity because it can be observed experimentally, and high-quality reference data for amorphous Si are available (e.g., Refs. 51 and S6). Figure S3 shows the structure factor of an LDA model produced using the variable quench-rate protocol of Ref. 54. There is almost perfect agreement between M̃_16 and GAP-18, which also matches the experimental data from Ref. 51 closely. Smaller-scale simulations suggest that levels 16 and 20 span most of the variation in H^-1 among the MTPs studied here (Fig. S4), but system sizes of the order of 100k atoms are required for repeatable results due to the stochastic nature of the simulations.
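For a finite (non-periodic) cluster of N identical atoms, S(Q) can be estimated directly from the interatomic distances via the Debye scattering equation, S(Q) = 1 + (2/N) Σ_{i<j} sin(Q r_ij)/(Q r_ij). The numpy sketch below is illustrative only; the S(Q) curves in Fig. S3 come from the large periodic simulation cells, not from this cluster formula:

```python
import numpy as np

def structure_factor_debye(positions, q_values):
    """S(Q) for a finite monatomic cluster via the Debye scattering equation."""
    pos = np.asarray(positions, dtype=float)
    n = len(pos)
    # All pairwise distances with i < j (no periodic boundary conditions).
    diffs = pos[:, None, :] - pos[None, :, :]
    rij = np.linalg.norm(diffs, axis=-1)[np.triu_indices(n, k=1)]
    s_q = np.empty(len(q_values))
    for k, q in enumerate(q_values):
        x = q * rij
        s_q[k] = 1.0 + (2.0 / n) * np.sum(np.sinc(x / np.pi))  # sinc(x/pi) = sin(x)/x
    return s_q

rng = np.random.default_rng(2)
cluster = rng.uniform(0.0, 10.0, (50, 3))   # 50 randomly placed "atoms"
q = np.linspace(0.5, 6.0, 20)               # wavevector grid, arbitrary units
s_q = structure_factor_debye(cluster, q)
print(s_q.round(3))
```

A useful sanity check on this form is the Q → 0 limit, where every sinc factor tends to 1 and S(Q) tends to N for a monatomic cluster.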
We limit ourselves to two levels because of the expense associated with the long timescales of the 10^11 K s^-1 melt-quench simulations, but note that M̃_12 and M̃_14 produce regions of diamond-like crystallinity (as do M_12 and M_14); this requires further work in a future study, and we focus on the level 16 and 20 models herein. The strong performance of M̃_16 supports our choice to use it for a large-scale proof-of-concept (Fig. 5 in the main text).

FIG. S4. As Fig. S3, and using directly and indirectly fitted MTPs covering a broader range of lev_max values. Some of the variability in the height of the FSDP can be explained by the stochastic nature of grain formation during cooling in combination with the relatively small system size (the height varies from simulation to simulation for the same MTP and same protocol: e.g., compare M̃_20 in this figure and in Fig. S3). However, the more flexible indirectly-learned MTPs do seem to induce some over-ordering. More detailed studies using indirectly-learned MTPs specifically trained for melt-quench simulations will be the subject of future work.

COMPUTATIONAL DETAILS
Moment tensor potential fitting

MTP models have been applied to a range of research questions, including phase diagrams [44], reaction dynamics of small molecules [S7], and lithium-ion conduction [S8]. Full details of their construction are given in the original work by Shapeev and co-workers [6,38], which we summarize briefly here. The descriptor of a local environment is defined as the moment tensor

    $M_{\mu,\nu}(\mathbf{n}_i) = \sum_j f_\mu(|\mathbf{r}_{ij}|)\, \mathbf{r}_{ij} \otimes \cdots \otimes \mathbf{r}_{ij}$  (ν-fold outer product),    (S1)

where n_i denotes the positions of the i-th atom and all neighbors j up to a cutoff, R_cut, which we set to 5 Å, consistent with that of the descriptor used for GAP-18 [23]. The radial part f_µ depends only on the position vectors r_ij of each j-th atom from the i-th one, and the angular information is encoded by tensors of rank ν produced from the repeated outer product of the r_ij vectors. These M_{µ,ν} are contracted to give the basis functions B_α, which enter the energy

    $E_\mathrm{MTP} = \sum_i \sum_\alpha \xi_\alpha B_\alpha(\mathbf{n}_i)$.    (S2)

Training involves finding the regression coefficients ξ_α that minimize a loss function based on the errors in predicted energies, as well as forces and stresses, which can be expressed in terms of the first and second derivatives of E_MTP with respect to r_ij. The number of basis functions defining the particular functional form of an MTP (and hence its flexibility and computational efficiency) grows exponentially with the important hyperparameter lev_max, where the level ("lev") is defined as

    lev(M_{µ,ν}) = 2 + 4µ + ν,    (S3a)
    lev(M_{µ,ν} : M_{η,λ}) = lev(M_{µ,ν}) + lev(M_{η,λ}),    (S3b)

and ":" represents an appropriate tensor contraction operation. All possible M_{µ,ν}, and contractions of one or more M_{µ,ν}, are included as degrees of freedom in the functional form of an MTP, provided the resulting basis function satisfies lev(B_α) ≤ lev_max. In the present work, we investigated the range lev_max = 12-24. For brevity, we denote lev_max as "L" in the main text. We found that the performance of MTPs was quite sensitive to the relative weights of energies, forces, and stresses in the fit. After some testing, we settled on weights of 10 : 1 : 0.01, respectively. Our chosen force weight is a factor of 10 larger than the commonly used default value, as we found that both directly and indirectly-learned MTPs suffered from unstable MD and poor predictions of the physical behavior under pressure with the default force weight. Full details of all remaining MTP hyperparameter choices are given in Table S2.
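Equation (S3a) makes the growth of the basis with lev_max concrete. The short sketch below (illustrative only) enumerates the basic moment tensors M_{µ,ν} allowed at a given lev_max; the full MTP basis is larger still, since contracted products of several moments also count via Eq. (S3b), subject to lev(B_α) ≤ lev_max:

```python
def moment_levels(lev_max):
    """(mu, nu) pairs of basic moments M_{mu,nu} with 2 + 4*mu + nu <= lev_max."""
    pairs = []
    mu = 0
    while 2 + 4 * mu <= lev_max:
        for nu in range(lev_max - 2 - 4 * mu + 1):
            pairs.append((mu, nu))
        mu += 1
    return pairs

# The count of allowed basic moments grows steadily with lev_max; including
# all contractions (Eq. S3b) makes the total number of basis functions grow
# much faster, which drives the flexibility-cost trade-off in the main text.
for L in (12, 16, 20, 24):
    print(L, len(moment_levels(L)))
```

This is only a counting exercise on the level rule, not an implementation of the tensor algebra itself, which is handled by the MLIP code used for fitting.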
TABLE S2. General hyperparameters used for all MTPs trained in this work. The default radial basis size for each value of lev_max was used throughout, as was the case for any parameters not listed below.

FIG. 1. Indirect learning of interatomic potentials. Left: In established ML potentials, a potential-energy surface is represented by reference computations for a given number of structures, leading to the dataset D0. Fitting with the method M1 yields an ML potential model (blue line). Right: We now generate a much larger set of structures using M1, and label them using the same method (not M0!). This dataset, D1, covers a narrower structural space, but at more points. A second fit is then made with a different ML method, M2.

FIG. 2. Composition of the ML potential fitting databases used in this work. A SOAP/KPCA map (cf. Ref. 35) compares the structural space covered by the GAP-18 database [36], here denoted D0, and the newly created indirect-learning dataset, D1. The distance between points on this map is a 2D embedding of their structural dissimilarity as measured by SOAP [37]. At low pressure, a random starting structure, which rapidly transforms to approximate low-density amorphous (LDA) Si, is met with the good coverage provided by liquid and LDA structures in D0. Equally, the high-pressure region is densely sampled with distorted β-Sn-type and simple hexagonal (sh) structures. However, the D0 coverage is distinctly sparse at intermediate pressures around 10 GPa.

FIG. 3. Numerical errors versus computational cost for directly- (blue) and indirectly-learned (red) potentials. The figure characterizes MTPs with an increasing number of fitting parameters, expressed via their respective compute time per MD step relative to GAP-18 (M1). (a) Mean absolute error for random structures which were not part of the training. The test comprises 100 cells of 1-64 atoms produced using buildcell [40,41] and optimized with GAP-18 at pressures in the range 0-32 GPa. For comparison to M0 (M1), test-set structures were labeled with DFT [42,43] (GAP-18 [23]), respectively; grey lines report the prediction errors of GAP-18 versus DFT for those structures. This test is not intended as a competitive benchmark, but rather to convey the trade-off between flexibility and efficiency in choosing the number of parameters in MTPs via the maximum level, L (here, set to values of 12-24 in increments of 2). See Supplemental Material for details. (b) As before, but now for compression MD snapshots: in this case, the configurations have been included in the indirect-learning database, D1.

FIG. 5. Million-atom MD simulations with an indirectly-learned potential. (a) Structural model of low-density amorphous (LDA) Si, obtained by quenching from the melt at 10^11 K s^-1 to 500 K before relaxation. Color-coding indicates coordination numbers, N. (b) Structural model of a polycrystalline (pc) simple-hexagonal (sh) phase produced by compression to 20 GPa. Close inspection reveals a mixture of thick, amorphous interfaces and coherent symmetric-tilt grain boundaries. (c) Bond-angle distribution function of the LDA structure (cf. panel a) compared to GAP-18. The larger system size permits a detailed analysis of N = 5 coordination defects. (d,e) Pressure-volume and pressure-similarity curves during the compression sequence LDA → VHDA → pc-sh. Data for the previous GAP-18 simulation refer to a 100,000-atom system studied in Ref. 11.
J.D.M. acknowledges funding from the EPSRC Centre for Doctoral Training in Inorganic Chemistry for Future Manufacturing (OxICFM), EP/S023828/1. The authors acknowledge the use of the University of Oxford Advanced Research Computing (ARC) facility in carrying out this work (http://dx.doi.org/10.5281/zenodo.22558), as well as support from the John Fell Oxford University Press (OUP) Research Fund. We thank S. Batzner, D. Holzmüller, S. Hoyer, and A. Shapeev for helpful comments on the manuscript.

Supplemental Material for "Indirect Learning of Interatomic Potentials for Accelerated Materials Simulations"

Joe D. Morrow and Volker L. Deringer*
Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom

SOAP SIMILARITY ANALYSIS FOR PHYSICALLY-GUIDED VALIDATION

FIG. S1. A structural benchmark for high-pressure disordered Si. We evaluate the average SOAP similarity of the compression trajectory of Ref. 11 to selected crystalline structures. Dashed lines are for reference crystals with a fixed lattice optimised at 0 GPa with DFT. Solid lines are similarities to crystals with their lattice parameters optimised as a function of pressure. The structure images show representative snapshots, color-coded according to SOAP similarity to the sh crystalline phase, with lighter colors indicating higher similarity (drawn with data from Ref. 11).

FIG. S2. Extension of Fig. S1 to now include further crystal structures: comparing the average atomic environment during compression MD to reference structures. Dashed lines are for reference crystals with fixed structural parameters as optimized using DFT at 0 GPa. Solid lines indicate SOAP similarities to crystals with their lattice parameters optimized using DFT as a function of pressure. The rhombohedral phase R8 results from decompressing β-Sn, containing 5- and 6-membered rings of tetrahedrally-coordinated Si (see Ref. S3). bct is a hypothetical four-fold coordinated structure predicted to be stable under tensile stress [S4]. The metallic phases commonly labeled with their orthorhombic space groups, Imma and Cmca, appear with increasing pressure respectively between β-Sn and sh and between sh and hexagonally close-packed (hcp) phases. Anzellini et al. [S5] provide an introduction to the numerous Si phases and their nomenclature.
FIG. S5. Mean absolute error of predicted energies versus quantity of compression MD training data of two types: starting separately from LDA structures (dashed lines) and from randomly placed hard spheres (solid lines). Three M̃_16 MTPs with the same settings were trained for each database, and we report their average errors in prediction versus GAP-18 on two independent validation sets: structures derived from melt-quench MD (blue) and compression MD (yellow). The median standard deviation in error across identically-trained MTPs, σ_∆E/|∆E|, is in the region of 10-45%. MAEs improve with the addition of similar data to the training set, and are converged with respect to database size by 240 × 10^3 atomic environments. The errors for melt-quench MD begin to degrade with the increased proportion of compression MD included during training between 120 × 10^3 and 240 × 10^3 environments. Including even more environments leads to a potential that is unstable in MD simulations.
FIG. S3. Variable-rate melt quenches as a benchmark for the behavior of ML potentials for Si. The figure compares D0-trained and indirectly-learned MTPs to M1 (upper panels) and experiment (lower panel); the simulations follow the protocol of Ref. 54. Although the structure factor, S(Q), as predicted by M̃_16 agrees almost perfectly with the GAP-18 result, the much larger FSDP predicted by M̃_20 suggests this may in fact be serendipitous. It is nevertheless encouraging that the other peaks are also well matched by the M̃_16 curve. The increased ordering implied by the higher FSDP of the M̃_20 curve does not obviously manifest as crystalline grains, nor is any peak splitting characteristic of paracrystalline ordering clear in the other diffraction peaks.