The microscopic mechanism of bulk melting of ice

We study the initial stages of homogeneous melting of a hexagonal ice crystal at coexistence and at moderate superheating. Our trajectory-based computer simulation approach provides a comprehensive picture of the events that lead to melting; from the initial accumulation of 5+7 defects, via the formation of L-D and interstitial-vacancy pairs, to the formation of a liquid nucleus. Of the different types of defects that we observe to be involved in melting, a particular kind of 5+7 type defect (type 5) plays a prominent role as it often forms prior to the formation of the initial liquid nucleus and close to the site where the nucleus forms. Hence, like other solids, ice homogeneously melts via the prior accumulation of defects.


I. INTRODUCTION
In this paper we present a computer simulation study of the initial stages of melting of hexagonal water ice (Ice Ih) at ambient pressure and superheating up to 11% above the melting point in a regime where the formation of a liquid nucleus of sufficient size is the rate limiting step. Our analysis of ensembles of trajectories yields a detailed, time-resolved picture of the different dynamical pathways that lead to the formation of such a liquid nucleus and the role that different types of defects play. We find that prior to melting a number of so-called 5+7 defects and larger defect structures in the hydrogen-bond network accumulate in the volume that later becomes the liquid nucleus and that the size and number of these defects becomes larger as the degree of superheating is reduced.
The microscopic mechanisms that lead to melting of crystalline solids are a longstanding subject of solid state theory. In most situations melting originates at the surfaces of a crystal 1 , however, under particular circumstances a solid may melt from within the bulk of a crystal as opposed to its surfaces 2 (so called homogeneous melting). For example, micrometer sized single crystal spheres made of silver melt homogeneously if their surfaces are covered by layers of gold atoms. This shell of atoms suppresses surface melting 3 by having a higher melting point while forming a lattice that is compatible with the silver lattice. Using this method, substantial superheating of the Ag lattice can be achieved that is preempted by surface melting in a conventional setting.
A multitude of theories exist that put increasingly stringent limits on the amount of superheating a crystal can be subjected to before it becomes mechanically unstable [4][5][6][7][8][9][10] . However, in equilibrium all of these instabilities are preempted by thermal melting of the crystal via a nucleation and growth mechanism 11 . In many rea) christoph.dellago@univie.ac.at FIG. 1. Example configuration containing a spherical liquid cluster. For clarity, the potential energy of the configuration has been locally minimized. The configuration has been prepared by selecting the molecules in a sphere of radius 1.5 nm and melting them by heating. Afterwards the configuration is briefly equilibrated at a temperature of 303 K. cent studies preexisting defects have been found to play a major role [11][12][13][14][15][16][17][18][19][20] in melting mechanisms and to significantly influence the stability limits of crystals.
Melting of ice Ih-the ordinary form of ice that can be seen around the liquid nucleus in Fig. 1-is a particularly interesting example of a melting crystal, as its open structure is held together by a network of hydrogen bonds. At temperatures around the ambient pressure melting point, these bonds can rupture and form with relative ease compared to, e.g., covalent bonds, and, arXiv:2107.11808v1 [physics.chem-ph] 25 Jul 2021 hence, a multitude of hydrogen bonding defects occurs in ice under these conditions. While ice usually melts heterogeneously, homogeneous nucleation has been induced by internal heating. This is achieved by exciting the OH stretching mode of water with IR laser light [21][22][23][24][25] . Such experiments strongly suggest that defects play a key role in determining the stability of ice crystals 23 .
Previous computer simulation studies of ice melting support the finding that defects play an important role. Donadio, Raiteri, and Parrinello 14 investigated the free energy landscape of melting at the melting point, finding that so-called 5+7 defects form a minimum in the free energy landscape (see Sec. II for a description of the various defect types found in ice). Furthermore, they observed the formation of large defect structures in the hydrogen bond network that involve on the order of 50 molecules. Mochizuki, Matsumoto, and Ohmine 18 studied spontaneous melting under higher superheating (around 19% above melting temperature) where melting events occur spontaneously in simulations within a timeframe of a few nanoseconds. They identified the formation of separated defect pairs (either interstitial-vacancy or L-D pairs) as the controlling step in the melting process at these conditions and also observed 5+7 defects as part of the melting mechanism.
In this paper we follow a similar approach to the one presented in Ref. 18 where an ensemble of trajectories is generated and the effect of defects is investigated. However, due to the lower degree of superheating the critical step in the nucleation process is the formation of a liquid nucleus of sufficient size. At the temperatures we consider, the rate of formation of such critical nuclei is much lower than the rate of melting observed in Ref. 18. As a result, the required number of melting trajectories cannot be feasibly obtained from MD simulations simply by waiting for spontaneous melting events to occur. In this paper we generate unbiased melting trajectories based on the expectation of a nucleation-growth mechanism near phase coexistence, yielding hundreds of statistically independent samples. With the help of these trajectories we then assemble a detailed, time-resolved picture of the dynamic pathways that lead from a frozen crystal to a liquid nucleus at different degrees of superheating. This analysis yields a comprehensive picture of the roles different defects play in the mechanism of homogeneous melting as a function of temperature.
The remainder of the paper is organized as follows: in Sec. II we introduce the types of defects that we refer to throughout the paper. In Sec. III we lay out our simulation methodology and in Sec. IV we present the results of our simulations. Section V summarizes and discusses our findings.

II. ICE DEFECTS
As part of our analysis we classify different point defect types that can occur in ice. Here we briefly introduce the types of defects we consider and summarize previous results on the role they play in melting. Higher dimensional defects such as dislocations and disclinations do not form spontaneously in the simulations presented in this paper and, hence, we limit this discussion to point defects only. In order to discuss defects within the ice Ih structure it is instructive to first reiterate some of the properties of hexagonal ice 26 : (1) the water molecules (H 2 O) are laid out in a wurtzite structure 27,28 ; (2a) each molecule takes part in four hydrogen bonds to its nearest neighbors; (2b) two of the hydrogen bonds are donated to other molecules. Observations (2a) and (2b) together are known as the Bernal-Fowler ice rules 27,29 . (3) The hydrogen bond network of a defect-free Ice Ih crystal can be decomposed into an array of 6-membered rings. However, the direction of hydrogen bonds in the bond network is not uniquely determined and, consequently, there is no long range proton order in Ice Ih 27,28 .
Breaking rule (1), i.e. displacing a molecule far from its location in the perfect lattice, leads to an interstitialvacancy (I-V) pair. It has been shown that translational diffusion within ice occurs by the movement of whole molecules 28,31 and that the concentration of ionic defects is low compared to molecular defects 26,32 . Hence, we expect that whole molecules also form the majority of I-V defects in the lattice (and not ionic defects). In our simulations we only consider I-V defects where whole molecules are moved out of their lattice position. No ionic defects can occur in our simulations.
Breaking condition (2) while keeping condition (1) intact leads to so-called Bjerrum-or L-D pairs 33 . L and D defects are, respectively, characterized by a missing or an excess proton within a small region of space so that the ice rules are not satisfied and can not be satisfied until a matching defect of the other type is encountered. Nevertheless, L-D pairs are often found bound to each other forming an L+D complex that exhibits a lower potential energy than a separated L-D pair 30 . In this work, we do not distinguish between separated L-D pairs and L+D complexes and call all structures where the ice rules are broken L-D defects.
I-V and L-D defects will be referred to as mobile defects throughout this work because, once a pair of these defects is formed and separated from each other, their movement through the system is relatively facile 34 . Ref. 18 discusses the critical role of mobile defect pairs in melting under high superheating conditions, where the separation of a mobile defect pair is found to drastically lower the free energy barrier that needs to be overcome in order for melting to occur.
Lastly, breaking of rule (3) while keeping rule (2) intact constitutes another class of defects, which (in line with Ref. 14) we will call topological defects. These are defects where the 6-ring structure of the perfect lattice is broken and instead there are other ring combinations present. Note that the ice rules are still fulfilled in these defects and that the molecules that take part are only shifted by small distances from their positions in the perfect lattice. The most prominent of these defects are socalled 5+7 defects, first found in simulations by Tanaka and Mohanty 35 and described in detail by Grishina and Buch 30 . In 5+7 defects two 5-and two 7-membered rings are found neighboring each other. These defects can then be further classified into different types according to the placement of protons around the central bond of the 5+7 defect (see Fig. 2) and according to the crystal plane they are formed in. 5+7 defects are readily observed in equilibrium simulations that use the TIP4P family of water models 35,36 .
Other combinations of ring sizes are also possible: for example we encounter 455778 defects that center around a 4-ring while still satisfying the ice rules (see Fig. 3). Larger defect structures (such as the ones observed in Ref. 14) are frequently observed in melting trajectories and we will subsume all of these structures under the name extended topological defect or E defect. All topological defects have in common that there is no efficient mechanism for these defects to move which is why we will refer to them as immobile defects in the following.
In the next section we present the simulation methodology used to generate trajectories that are then analyzed in terms of the defects that occur at various stages of melting.
FIG. 3. Snapshot of a 455778 defect that is embedded into the basal plane of an Ice Ih crystal. The numbers mark the 4-, 5-, 7-and 8-membered rings in the H-bond network that make up the defect. The white molecules are located in an adjacent, defect free plane.

A. Generating parts of reactive trajectories
Our aim is to investigate the melting transition starting from configurations that contain an Ice Ih crystal with possibly a few 5+7 defects (state A in Fig. 4) and ending in the liquid state (state B). In particular, we are interested in the initial stages of these trajectories up to a state where a liquid nucleus has formed (state S in Fig.  4). Note that we include configurations that contain 5+7 defects into the definition of state A. Such defect states occur readily in equilibrium trajectories under the conditions considered in this paper where the Ice Ih crystal is metastable with respect to the liquid state.
Under these conditions the two states A and B are separated by a free energy barrier so that the melting transition is a rare event. This means that the average waiting time τ AB between preparing a system in an equilibrium frozen state in A and a melting event that leads the system from state A to the liquid state B exceeds the timescale of relaxation in state A, τ A , by orders of magnitude: This so-called separation of timescales guarantees that the way melting events occur does not depend on the details of how frozen configurations are prepared and that, instead, we can think of the two states as being connected by an ensemble of melting trajectories that leave A and end in B without visiting A in the meantime. This ensemble of trajectories is called the transition path ensemble 37,38 (indicated by black arrows in Fig. 4).
To generate the initial parts of melting trajectories we take the following approach: Region B contains configurations that are liquid and region S contains molten nuclei prepared by the procedure laid out in Sec. III A. Our aim is to generate a sample of the early stages of melting trajectories (black curves with arrows). To do so, we integrate trajectories starting from configurations in S until they reach A (red, dashed arrows) and subsequently invert the direction of time. Trajectories that reach B before A are discarded.
FIG. 5. Example of a configuration that contains a slab of liquid that consists of half the molecules in a 720-molecule water system. For clarity the potential energy of the configuration has been locally minimized.
1. Pick a sample of configurations from equilibrium simulations of ice Ih performed at the chosen temperature and pressure.
2. Construct a liquid domain inside this configuration and locally equilibrate the resulting configuration. The resulting ensemble is denoted with S.
3. Run molecular dynamics simulations using a symplectic integration scheme starting from S until state A or B is reached. The resulting trajectories are referred to as backwards trajectories.
4. Invert the time direction of the backwards trajectories that end in A to get a sample of trajectories that lead from A to S.
The use of a symplectic integration scheme guarantees that the sample of trajectories obtained by integrating backwards in time has the same statistics as the ensemble of trajectories that leads from A to S when integrating forward 39 .
Similar to the so-called seeding method [40][41][42][43] we choose the liquid clusters to be spherical in shape. In reality, the shape of the liquid nuclei that form during homogeneous melting is likely not perfectly spherical due to slight differences in the interface tension associated with different crystal planes 44-47 as well as due to the different dynamics of crystal growth along the plane normals [48][49][50][51] . To assess the effect different cluster shapes have on the early stages of melting trajectories, we perform additional simulations at the ice-liquid coexistence temperature that start from configurations with a slab shaped liquid domain such as the one shown in Fig. 5.
We refer to trajectories that are constructed from spherically shaped liquid domains as spherical-geometry trajectories and to the ones constructed from slab shaped liquid domains as slab-geometry trajectories.

B. Simulation details
The simulations presented in this paper are obtained using the TIP4P/Ice water model 52 with a time reversible and symplectic rigid body integration scheme 53,54 as implemented in the LAMMPS simulation package 5556 . Thermo-and barostats are implemented using Nosé-Hoover chains [57][58][59] where the x, y and z directions are independently barostatted to a pressure of 1 bar. Long-range interactions are treated using a particle-mesh Ewald method (PPPM 60,61 ) with an accuracy of 10 −4 and the timestep is set to 1 fs. Snapshots are saved for analysis every 10 ps.

Spherical-geometry trajectories
MD simulations used to generate spherical geometry trajectories are carried out with 2880 molecules in an almost cubic simulation box (see Fig. 1). The initial dimensions of the boxes are 44.9 × 46.7 × 44.0 A 3 in the directions orthogonal to the secondary-prism plane, the prism plane and the basal plane, respectively. To generate initial configurations for the backward trajectories we follow the following procedure: 1. Construct a proton ordered Ice XI lattice and randomly reorder the hydrogen bonds using a Monte Carlo procedure 62 in order to obtain ice Ih. Here we require that the total dipole moment of the configuration is zero at the end of the procedure.
2. Equilibrate these configurations in a 30 ns parallel tempering 63-66 trajectory using replicas starting from a temperature of 258 K up to and including 328 K spaced 5 K apart.
3. Pick a sample of configurations from the parallel tempering simulation at the desired temperature.
4. Pick a random center for the liquid nucleus in each of the configurations and find the molecules within a radius of 15 A.
5. Heat the selected molecules using a thermostat while keeping the molecules outside the sphere fixed until the crystal structure in the selected region breaks down.
6. Equilibrate at the target temperature by first keeping the molten fraction fixed and propagating the molecules in the crystalline phase (for 20 ps) and then keeping the crystalline molecules fixed and propagating the molten molecules (for 25 ps). This procedure hinders the molecules from recrystallizing because the molecules inside and outside of the selected volume can not collectively reorder into a frozen configuration.

Slab-geometry trajectories
Slab-geometry trajectories are generated using 720 molecules in an elongated box with initial dimensions 44.9 × 23.3 × 22.0 A 3 (see Fig. 5 for an example configuration). The volume that contains the liquid is chosen so that the solid-liquid interfaces are parallel to the secondary prism face of the Ice Ih crystal. This is the geometry that has been found to preferentially form in Ref. 67. No initial equilibration is performed because the system has sufficient time to equilibrate before the slab collapses into one of the two competing phases. The resulting trajectories are quite different from the spherical-geometry trajectories at the same temperature of 268 K. The spherical liquid domains shrink rapidly and predictably due to surface tension. For the slab geometry, periodic boundary conditions eliminate contributions from the surfaces where the liquid domain wraps around the simulation box 68,69 . Surface tension therefore does not drive the growth of the slab-shaped liquid domains because a change in the volume of the liquid domain has no effect on the overall liquid-solid surface area.
The temperature of T = 268 K was chosen so that roughly half of the trajectories started with a slab-shaped domain melt and the others freeze, further reducing the thermodynamic force that drives the growth and shrinkage of the liquid domain. This temperature is slightly below the melting temperature at 1 bar of 272.2 K reported by Abascal et al. 52 ; a discrepancy that is likely due to the non-negligible influence of the solid-liquid interfaces in the comparatively small simulation box with 720 molecules. The result are trajectories where the two solid-liquid interfaces that delimit the liquid domain diffuse in the direction of their surface normal until they are close enough that a fluctuation in the shape of the interfaces brings them into contact. The distance between the interfaces at which this contact occurs is small, roughly one layer of 6-rings or 7 A.
A larger simulation cell would yield a larger, and more realistic, contact distance. Increasing this distance considerably, however, comes at great computational cost, for two reasons: first, the contact distance scales logarithmically with the size of the simulation box 70 . Secondly, increasing the size of the simulation box slows down the diffusion of the two interfaces which in turn sharply increases the length of the required trajectories. The system size with interface areas of 23.3×22.0 A 2 was chosen as a compromise between maximizing the contact distance between the two interfaces and minimizing the simulation time required to obtain trajectories.

Gathered data
In total, three sets of trajectories leading from the prepared configurations S to the frozen state A were generated: 108 slab-geometry trajectories at a temperature of T = 268 K, 504 spherical-geometry trajectories at T = 268 K, and 448 spherical-geometry trajectories at T = 303 K.
The temperature of 303 K is 11% superheated relative to the melting point 52 and has been chosen so that the spherical nuclei with a radius of 15 A are slightly subcritical. Out of the 504 trajectories that were run we observed 56 trajectories that melted before they could reach the frozen state A. On average A is reached in 13.6 ns (slab, 268 K), 5.1 ns (spherical, 268 K), and 2.3 ns (spherical, 303 K).
Data on equilibrium properties presented in this paper were obtained from the parallel tempering simulations that were also used to generate initial configuration for the backward simulation runs. The data from different replicas is combined using the weighted histogram analysis method (WHAM 71,72 ) as implemented in the PyEMMA package 73 .
We now invert the direction of time in these generated trajectories, which become examples of the early stages of melting. We also set t = 0 at the crystalline endpoint of each trajectory, so that the system adopts a frozen configuration at time zero and melting commences as time increases. These time conventions will be used throughout the remainder of the paper.

C. Defect detection
To detect defects in the hexagonal ice structure we adapt an array of different techniques previously used to investigate the properties of water and ice. These include, 0 100 after locally minimizing the potential energy of a configuration, an analysis of the hydrogen bond network 18,74 , in particular the ring structures found therein 14 , as well as the analysis of configurations relative to a reference configuration to facilitate the detection of molecular interstitials and vacancies 18 . See App. A for a detailed description of the algorithms used. These techniques yield an analysis like the one shown in Figs. 6 and 7. It includes counts of the number of 4-and 5-membered rings (n 4 (t) and n 5 (t), respectively) that are roughly proportional to the size of the liquid domain. It also indicates whether certain defects have been detected in the configuration. The defect types distinguished are I-V pairs, L-D pairs, and topological defects in the H-bond network: 5+7 defects, 455778 defects, and extended topological defect structures where the ice rules are fulfilled (E defects). The 5+7 defects are further split into the five horizontal types 30 (5+7-1 through 5+7-5) that are formed within the basal plane of the lattice and the vertical type (5+7-V) where the 5+7 defect is formed in a plane orthogonal to the basal plane.

Number of rings
Note that the algorithm used to detect defects of type I-V, L-D, and E does so in an exclusive fashion: if an I-V pair is present, L-D and E defects can not be detected and if an L-D pair is present, E defects can not be detected (see App. A, Fig. 19).
As part of the analysis we determine three times along each melting trajectory: t A , t E , and t m : • t A is the time when the system leaves state A for the last time, i.e. it is the last time where there are only 5+7 or 455778 defects present in the system.
• t E is the last time when no mobile defects are present in the system. In the time between t A and t E , E defects are present in the system; L-D and I-V pairs may also form during this interval, but by definition they must recombine before time t E .
If no E defect occurs along a trajectory, then t A equals t E .
• t m is the time when the extended liquid domain forms. This event is associated with an increase in the rate at which 5-membered rings appear, i.e., the slope of n 5 (t). Because this increase occurs on top of significant background fluctuations in n 5 (t), we look for an increase of n 5 (t) over a timespan of 200 ps that is larger than a given threshold ∆n 5 (see Tab. I).

IV. THE THREE STAGES OF MELTING
Based on the times defined in the previous section we now define three stages of the melting mechanism (cf. Fig.  8 where only immobile defects are present in the system, stage II (t E ≤ t < t m ) where one or more mobile defects have formed and stage III (t ≥ t m ) where an extended liquid nucleus has formed. In the following sections we first discuss the spherical-geometry trajectories. Section IV C then discusses the differences found in slab-geometry trajectories.
In equilibrium as well as in stage I of melting trajectories, the bulk of defects that are present are of type 5+7. A prominent role in melting is played by 5+7-5 defects as can be seen in Fig. 9 where we report the average numbers of defects found at time t A compared to the average numbers found in equilibrium. The difference in defect numbers between these two scenarios is listed in Tab. II.
At both temperatures the overall excess of defects found at time t A is mostly accounted for by the excess of 5+7-5 defects. At T = 268 K the number of 5+7-V defects is also significantly enhanced while the number of 5+7-3 and 5+7-4 defects is slightly reduced. At T = 303 K the average number of defects of type 5+7-V and types 5+7-1 through 5+7-4 are roughly equal to the numbers observed in equilibrium.
At time t A one of two things happens: either an E defect forms (E-channel ) or a mobile defect forms directly (57-channel ). Figure 8 shows the fractions of trajectories that pass through each of these channels. As the temperature decreases from 303 K to 268 K the number of trajectories that involve an E defect increases from 35% to 52%.
In order to assess the role of 5+7 defects in the melting mechanism, we analyze the locations of defects relative to the center of the volume where the liquid nucleus eventually forms (the nucleus volume), as well as relative to  the sites where mobile defects form at time t E . To do so, we define the pair-correlation function between defects of reference type S and defects of target type T , where H ST (r, ∆r) is a histogram of all pairwise distances between defects of type S and type T observed in a set of configurations, ρ T is the equilibrium number density of defects of type T , and c S is the total number of defects of type S observed. V (r, ∆r) is the volume of a spherical shell with inner radius r−∆r/2 and outer radius r+∆r/2, where ∆r is the bin width of the histogram. Note, that g ST (r) is not symmetric in S and T because c S = ρ S . Nevertheless, we expect lim r→∞ g ST (r) = 1. We also define n ST (r) as the average number of defects T within a sphere of radius r centered on a defect of type S, i.e.
g T (r) and n T (r) are the analogous quantities where the reference points are the center of the spherical nucleus volume.  Figure 10 shows g T (r) and n T (r) obtained from configurations observed at time t E (including trajectories that proceed through both the 57-and E-channels). At T = 268 K there is up to a 50-fold excess over equilibrium in the density of 5+7 defects inside the radius of the liquid bubble that forms later on. The most abundant defect type inside this volume is the 5+7-5 defect with an average of 1.2 defects found within the radius of 15 A while the average total number of 5+7 defects within this volume is 2.2. The next most common defect within the nucleus volume are 5+7-V defects.
Under superheating conditions at T = 303 K the excess is slightly less pronounced. Nevertheless, on average we find 1.5 5+7 defects within 15 A of the center of the forming bubble (5+7-5 defects: 0.7).
At T = 268 K we observe a suppression of defects outside the nucleus volume relative to equilibrium. This suppression develops despite the fact that we used equilibrium configurations to seed the simulations, a procedure that enforces that the environment around the liquid domain is initially in equilibrium. The suppression of defect densities around the liquid nucleus indicates that the presence of the nucleus facilitates annealing of existing defects in the ice structure. We expect this effect to subside with increasing distance from the nucleus, however, the simulation box used in our simulations is not large enough to observe this return to average densities. It is important to note here that precisely at coexistence the radius of the critical nucleus is macroscopically large, and that the nucleus radius of 15 A used to seed our simulations was chosen for comparison with simulations performed under superheating. At the physically more realistic temperature of 303 K the nucleus size used to seed simulations is chosen close to the critical nucleus size at this temperature. This allows us to extrapolate these simulation results to larger system sizes. Notably, under these conditions there is no significant suppression of defect densities outside the nucleus volume.

B. Stage II: mobile defects (t E < t ≤ tm)
In this next stage, a mobile defect, i.e., either an L-D or an I-V pair has formed. Which defect type has formed as a function of time relative to t E is shown in Fig. 11 for different temperatures and cluster geometries. Just over half of the configurations (56%) observed at temperature T = 303 K contain an L-D pair, while the other ones contain an I-V pair. The fraction of configurations where an I-V pair is present then rapidly increases, reaching 98% 0.5 ns later. Similar behavior can be observed at T = 268 K regardless of the geometry of the liquid nucleus, giving us confidence that the observed timescale of  We have already shown in the previous section that 5+7 defects tend to form within the eventual volume of the liquid nucleus. Figure 12 assesses the analogous behavior for mobile defects. While at T = 268 K the first mobile defect that leads to melting forms inside the volume of the future liquid bubble 80% percent of the time, at T = 303 K this share has declined to 64%. In both cases, the formation site is correlated with the center of the liquid nucleus (i.e. g(r) is not flat), however, the mobile defects are formed outside the nucleus volume in a significant number of trajectories.
To investigate the role of 5+7 defects in the creation of mobile defects, Fig. 13 shows an analysis of the defect densities around the site where a mobile defect forms. Only trajectories that proceed via the 57-channel are included in this analysis. Note, that due to the sampling frequency of (10 ps) −1 5+7 defects with a life time smaller than 10 ps may not be detected in this analysis.
At both temperatures we find that the density of 5+7 defects is enhanced close to the site where a mobile defect forms and we find the closest 5+7 defect within 6 A in 71% of trajectories at 268 K (303 K: 57%). The average number of defects within 6 A is 0.87 (303 K: 0.70). While at T = 303 K all 5+7 defect types roughly contribute The states shown are detected in a mutually exclusive fashion, i.e. if there exists at least one I+V defect in the system, it is considered to be in the I+V state regardless of the number of L+D defects in the system. Notice the similarity between the data obtained with the slab and the spherical geometry; the probabilites are largely independent of the shape of cluster that is formed as well as of temperature. The probabilities of finding the system in state I+V and L+D by definition vanish at tE and sum to one for t > tE.  equally to the nearby defect population, at T = 268 K there is a preference for 5+7-5 defects.
In trajectories that pass through the E-channel, mobile defects can also form close to the E defect. Hence, due to volume exclusion, the density of 5+7 defects around the site where a mobile defect forms is suppressed relative to the densities found in trajecotories that pass through the 57-channel (see App. C, Fig. 27).
After an I-V defect pair has formed we can track the motion of the two defects through the system. In Fig. 14 we show the average distances between the I and the V defects as a function of the elapsed time since the I-V pair has formed. The average is calculated over all configurations where a single I-V pair is present and each trajectory is included until time t m . Within roughly 0.5 ns the average distance between I and V defects approaches the value expected if one were to randomly place two particles in the simulation box (dotted lines in Fig. 14). In FIG. 14. Average distance between interstitial and vacancy as a function of time after their first creation in melting trajectories. Shown are different simulation box-and cluster geometries and temperatures. The data shown has been calculated considering configurations with a single I-V pair because the assignment of defects into pairs is unambiguous in this case. The dashed horizontal lines indicate the average distance between two randomly chosen points in the simulation box of the respective geometry. Fig. 15 we show the mean-square-displacements, r 2 (t) , of I and V defects during the same timespan. After a subdiffusive regime that lasts roughly 100 ps, r 2 (t) is close to linear indicating that the defects freely diffuse through the system. At 268 K the ratio of the self-diffusion constants of intersitials and vacancies, D I /D V , equals approximately 2 (303 K: 1.75).
Together these two datasets suggest that in the timespan between t E and t m both defect types independently diffuse through the simulation box. Indeed we find that the non-equilibrium pair correlation function between interstitials and vacancies obtained from configurations observed in this timespan is flat for distances larger than 10 A (see App. C, Fig. 28).
To further investigate the role of I and V defects in the formation of a liquid nucleus, Fig. 16 shows the position of the I-V defect pair that is closest to the center of the nucleus volume at time t m . We find that at 268 K in 98% of trajectories there is at least a single I or V defect present inside the volume that later becomes the liquid nucleus. In 82% of trajectories both the closest I and V defect are found in this volume. No significant imbalance between the two types of defects can be detected.
Note that Fig. 16 shows the probabilities of finding the I and V defect closest to the center of the liquid nucleus. Even for an ideal gas, the analogous distribution of the closest gas atom to a given location is not flat but has a maximum at a distance that is determined by the density of the gas. In App. B we calculate the expected shape of this distribution and compare it to the data shown in Fig. 16. We find that there is a strong excess of I and V defects that are close to the liquid nucleus' center 10 −2 10 −1 10 0 t/ns at time t m relative to the expected distances based on uncorrelated density fluctuations. At 303 K the distribution of defect positions at time t m is broader. Here we find that in 90% of trajectories at least one I or V defect is close to the center of the liquid nucleus at time t m . V defects are found outside of the nucleus volume slightly more often than I defects. Note, that the change in slope of n 5 (t) at time t m is less pronounced at T = 303 K than at 268 K, making the determination of t m less precise. The broader distribution of defect positions is at least in part a consequence of this reduced precision. Nevertheless, also here we find a strong excess of I and V defects close to the center of the nucleus volume compared to a uniform distribution of defect positions.

C. Analysis of trajectories with slab shaped clusters
The results obtained for slab-geometry trajectories at T = 268 K are largely similar to the ones obtained for spherical clusters at the same temperature. The transition proceeds via the E-channel in 57±7% of trajectories, via the 57-channel in 42 ± 6% and via the D-channel in 1 ± 1% of trajectories. This is in good agreement with the data obtained using spherical nuclei. Furthermore, also in this case we find that mobile defects are preferentially formed around 5+7 defects and in particular around 5+7-5 defects, as can be seen in Fig. 17. Figure  11 demonstrates that the dynamics of forming I-V pairs after t E are essentially identical to the ones observed with spherical cluster geometries.  Fig. 13 for trajectories that end in a slab-shaped cluster in a 720 molecule system.
A notable difference can be observed in the mobility of interstitial defects (Fig. 15): while all I and V defects exhibit some subdiffusivity at timescales below 100 ps regardless of cluster geometry and temperature, this effect is strongly enhanced in interstitials that form when the liquid slab decays. The same is not true for the corresponding vacancies. An analysis that separates the contributions to the MSD in different lattice directions (see App. C, Fig. 29) also suggests that the movement of in- terstitials is hindered in both the direction orthogonal to the secondary prism plane and the direction orthogonal to the basal plane. An analysis of the positions of interstitial defects in these trajectories shows that they are confined to the volume that later becomes the slab shaped liquid domain. This suggests that interstitials are more likely to not leave the volume of the liquid domain between the time they are formed and t m if the cluster is slab shaped.
We also analyzed the distribution of waiting times between t E and t m (Fig. 18). Here we find a strong dependence of the waiting time on the simulation geometry where (at the same temperature of 268 K) the decay time τ decreases from 4.1 ns in the sphere-geometry simulations to 0.7 ns in the slab-geometry. This reduction in waiting time may in part be a consequence of the pinning of interstitials in the slab volume. However, based on calculations by Le Vot et al. 75 for one-dimensional systems, we would expect a significant dependence of waiting times on system size even in the case of equal diffusivities.

V. DISCUSSION AND OUTLOOK
We analyzed melting trajectories that were obtained using molecular dynamics simulations at coexistence and at 11% superheating. At superheating conditions we observed the following sequence of events: (i) on average 1.5 5+7 defects form within the volume that later becomes the liquid nucleus; 0.7 of these defects are 5+7-5 defects; (ii) in roughly 1/3 of trajectories a larger topological defect structure that fulfills the ice rules (an E defect) forms; (iii) a mobile defect (an L-D or an I-V pair) forms close to either an E if one exists or close to a 5+7 defect; (iv) if an L-D pair has formed, an I-V pair forms within a timescale of 0.5 ns; (v) the I-V defects freely diffuse through the system; (vi) a liquid nucleus forms as a result of the interaction of the mobile defects with the 5+7 defects that are already present in the volume that is later occupied by the critical liquid nucleus.
Previous studies 14,18 have pointed out the role of 5+7 defects and larger defect structures in the melting mechanism of ice (modelled by the TIP4P water model). Our results show that the role of 5+7 defects is two-fold: 5+7 defects are often found close to the site where the first mobile defect is formed and, secondly, 5+7 and E defects create a defective region that is succeptible to the formation of a liquid nucleus when a mobile interacts with it.
As the temperature is reduced this defective region tends to become larger as demonstrated by the increase of melting trajectories that pass through the E-channel (52% at 268 K compared to 35% at 303 K) and the increase in the average number of 5+7 defects found close to the site where the liquid nucleus forms (2.2 compared to 1.5). This finding is consistent with the results of of Mochizuki, Matsumoto, and Ohmine 18 who reported that at 19% superheating 5+7 defects play a role in the formation of mobile defects but no accumulation of these defects occurs prior to melting.
In Ref. 18 it was also shown that the formation of a separated interstitial-vacancy pair is the rate limiting step in the limit of high superheating. Under the conditions investigated in this paper, the rate limiting step is the formation of a liquid nucleus of critical size, however, also in this case interstitials and vacancies play a crucial role in that they cooperate with 5+7 defects to form the initial liquid nucleus.
Prior to the formation of the liquid nucleus the interstitials and vacancies diffuse freely through the simulation box. This has interesting implications when one wants to extrapolate simulations results to the thermodynamic limit. Since interstitials and vacancies are only weakly bound to each other they are also not limited to form close to the accumulation of immobile defects they later interact with to form a liquid nucleus. This is underscored by the finding that even in the strongly confined environment of our simulation boxes we find that a considerable number (36% at 303 K) of mobile defects forms outside of the eventual volume of the liquid nucleus. This suggests that mobile defects may diffuse for considerable distances before they encounter an accumulation of other defects and form a liquid nucleus.
In summary, the results of our simulations further support the observation that interstitials and vacancies play an integral role in the microscopic mechanism of ice melting and establish that, as the degree of superheating becomes smaller, increasingly large immobile defect struc-tures are present within the volume where the initial liquid nucleus forms prior to its formation. Interstitial and vacancy defects then interact with these immobile defects to form an initial liquid nucleus that later grows and melts the ice crystal. This adds ice to the list of solids with a melting mechanism that involves the prior accumulation of defects.

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request. Figure 19 presents the scheme used to detect defects in configurations throughout the paper. We start by minimizing the potential energy of a given configuration by setting the momenta of atoms to zero and annealing the system to a temperature of 1 K using a Langevin thermostat. The cooled configurations are then analyzed using a number of algorithms the details of which can be found below.
In the course of this analysis deviations from a perfect hexagonal ice lattice are detected and, if possible, attributed to known defect structures. The attribution of anomalies in the H-bond network to a defect may reveal other known defect structures and, hence, the analysis is run iteratively until no new defects are found.
The last part of the detection scheme assigns each configuration to a global state based on the criteria shown in Fig. 19.

Detection of the hydrogen bond network and of L-D defects
Hydrogen bonds are detected in annealed configurations using the HB2 criterion proposed in Ref. 74 using a custom implementation for the LAMMPS simulation package. This criterion considers a set of two oxygen atoms and one hydrogen atom to form a hydrogen bond if the O-O distance is smaller than 3.5 A and the O-H-O angle is larger than 140°. This analysis yields a network of HBs from which an undirected graph is constructed using the networkx python package 76 . This package provides a large number of graph analysis functions we use to carry out parts of the following analysis.
The simplest task is the count of the number of HBs each molecule donates (n d ) and accepts (n a ). In the perfect lattice one expects n d = n a = 2 for all molecules. If either of these two criteria is not fulfilled for a given molecule it is considered to be miscoordinated. This can be the result of both, an interstitial-vacancy and an L-D pair. If no I-V pair is found, an L-D pair is considered to be present. Note, that L-D pairs where L and D stay bound to each other form frequently in hexagonal ice and that their separation is associated with an additional free energy barrier 30 . Here we do not distinguish between bound and unbound L-D pairs.

Detection of topological defects in the hydrogen bond network
Defects in the ice Ih structure such as the 5+7 defect can present very similar molecular environments to what can be seen in the perfect lattice. The hydrogen bonds in the region of 5+7 defects fulfill the ice rules and the changes in the bond angles are comparatively small. This makes it hard to detect defects based on the positions of molecules around a given molecule alone. In order to detect such defects, algorithms that are based on detecting the topology of the hydrogen-bond network have been proposed 14,18 . Here we use a similar scheme that analyzes the rings that can be found in the network formed by H-bonds in a hexagonal Ice crystal.
We define a ring in the H-bond network as a path that leads from a molecule to itself, where we move from one molecule to the next following H-bonds. We do not allow the path to cross itself, i.e. each intermediate molecule may only be visited once. In principle, with the use of periodic boundary conditions, a given molecule may be part of an infinite number of rings of arbitrary size. To make our ring definition unique we restrict ourselves to a smaller set of rings: the set of shortest rings around each molecule that contain the molecule and two of its neighbors.
Consider a molecule O that is H-bonded to n hb other molecules as depicted in Fig. 20. In order to construct the shortest rings we iterate through all n hb (n hb − 1)/2 pairs of neighbor molecules A and B. For each of these pairs FIG. 19. Sketch of the algorithm used to detect defects in the ice Ih structure throughout this paper. It consists of two stages: in the first stage the configuration is analyzed and deviations from the pristine Ice Ih structure are marked using different approaches (see the descriptions in this appendix). In the second stage a state is assigned based on the deviations found. n6 and N are the number of 6-rings detected and the number of molecules in the system, respectively. we look for the shortest path between A and B that do not pass through the molecule O. These paths are constructed using Dijsktra's algorithm 77 as implemented in networkx. There may be more than one path of the same length and-in order to make the result reproduciblewe include all rings in the following analysis. Repeating this procedure for all molecules in the crystal yields a set of rings and a list of rings that pass through a given molecule which we use to identify defect types. 5+7 defects 30,34 can be formed both in the cplane ("horizontally") or parallel to the prism-, or the secondary-prism plane. The basic ideas for detecting these defects is demonstrated in the right part of Fig.  20: the donor and the acceptor molecule of the 5+7 defect must be part of at least two 7-rings and one 5-ring in the c-plane alone. If we also consider the rings that pass through molecules in the adjacent plane, donor and acceptor molecules of a horizontal 5+7 defect are part of 8 7-rings, 4 6-rings and 2 5-rings. This ring fingerprint is used to pick out 5+7 defects from a given configuration.
In order to further distinguish the different types of 5+7 defects described in Ref. 30 we also take into account where the other H-bonds of donor and acceptor are pointing (cf. Tab. III). The HBs of each molecule can either both point towards a molecule in the same layer (in-plane, IP) or one of the HBs points towards a molecule in the same plane and the other one to a molecule in an adjacent plane (out-of-plane, OP). Hence, to detect these different types one needs to associate each molecule with a layer of the configuration. Type 3 and Type 5 defects are additionally distinguished by the direction the in-plane H-bonds of donor and acceptor are pointing to. In addition to horizontal 5+7 defects with donoracceptor pairs that are in the same c-plane, also vertical 5+7 defects can be observed. Figure 21 shows two examples of such defects where the donor-acceptor pair lies in the prism-([1010]) and the secondary prism-plane ([1210]). Together these defects are referred to as 5+7-V defects.
The algorithm that assigns each molecule to a layer is sketched in Fig. 22. First, the configuration is sliced along the x-axis into slices that contain two layers of molecules. Next, the configuration is squashed leaving only the z-coordinates of the oxygen atom of each molecule. These z-positions are clustered using the Ag-glomerativeClustering hierarchical clustering algorithm implemented in the scikit-learn python package 7879 . As metric we use the distances between the molecules taking periodic boundary conditions into account. Average linkage is used and we set the number of clusters to be found to the number of c-planes expected (6 in the example shown). This procedure is applied to each slice along the x-direction separately and afterwards the resulting layers are matched across the slices using the center-of-mass of the layers found in each slice to identify which neighboring layers belong together. This slicing procedure is used to accommodate capillary waves that modulate the layer positions as one moves along the x-direction. The result of this procedure is that each molecule is now associated with a number that identifies the layer the molecule is in.
The detection of defect structures is complicated if multiple defects are close to each other. In this case, the ring counts of different defects can influence each other. To be able to detect 5+7 defects that are close to each other, we compiled a list of different ring fingerprints by inspecting a large number of configurations with unknown defect structures.

Detection of interstiatial-vacancy defects
In order to identify interstitials and vacancies in configurations we compare the minimized configuration with a template configuration that contains a perfect ice Ih crystal of the same size. This procedure is a simplified version of the method previously employed by Mochizuki, Matsumoto, and Ohmine 18 where we skip the calculation of the edit distance.
To do so, we first align the configuration that is analyzed (denoted by C) with the template configuration (T ) using the following procedure: 1. Identify the set of molecules in the system, I, whose contribution to the potential energy in the system is lower than −16 kcal mol −1 . These molecules are in ice-like configurations.
2. Align these molecules with their counterparts in the template configuration. To do so, use a global shift of the configuration, s, to minimize the summed mean square displacement wherein X(x C i ; T + s) is the position of the closest neighbor of molecule i in C that can be found in T that has been shifted by s. As the position of a molecule we use the position of the oxygen. This yields an optimized vector s (1) and a value of the MSD function F 3. If F (1) min /N mol is larger than a threshold value f max = 0.03 A 2 , shift the vector s (1) by one layer distance in the z-direction and redo the optimization using this shifted vector as initial condition.

If F
(2) min /N mol is still larger than f max , again, shift the output vector of the previous configuration and redo the optimization. If this optimization does not succeed, the procedure fails.
The reason for this multi-step procedure becomes clear when we examine the landscape F (s; C, T ) shown in Fig.  23 that consists of multiple local and global minima. Figure 24 shows the alternating layers that an hexagonal ice crystal is comprised of. Depending on the initial shift of the template, an optimization of F can lead into the local minimums marked by green and red dots in Fig. 23. These minima correspond to shifts where the template configuration is offset by exactly one layer width along the z-direction, i.e., such that the B-layers of the template is found on top of the A-layers of the configuration.
Fortunately, at the densities observed in the simulations presented here, there are no local minima if s is chosen such that the layers match. Hence, by performing multiple optimizations where s is shifted by one layer in the z-direction after the first optimisation, we find the correct alignment of the two configurations. Only in some cases the third optimization is required because the initial s of the first optimization finds a small local minimum. The optimizations are performed using the BFGS algorithm as implemented in the scipy python package 80 .
Given the optimal alignment of T to C we can now assign each molecule in C to the site in T that is its nearest neighbor and count the number of times that each site in T is found as a nearest site. In a defect free crystal this count, n T i , is equal to one for each site. However, if there are interstitial-vacancy pairs in the system, some of the n T i s are different from one. These sites are marked as template mismatches (TMM).
To avoid false positives that come about due to molecules translating only locally (e.g. to form a 5+7 defect), an additional step is performed where, if a site with an excess molecule and a site with a missing molecule are neighbors of each other (in the sense that they are H-bonded to each other in the template configuration), this pair of template mismatches is ignored.
The TMM that remain are marked as interstitials if n T i > 1 and as vacancies if n T i < 1.

Appendix B: Ideal gas closest distance distribution
In Fig. 16 (Sec. IV) we presented the histograms of the distances of the closest interstitial and the closest vacancy to the center of the liquid domain. It is important to note that even for a homogeneous distribution of defects (i.e. an ideal gas), the distribution of the closest defect is not uniform but instead has a maximum that is determined by the density of the gas and the volume of the simulation box 81 .
Consider the probability of finding the atom closest to a given location at a distance that falls into the interval [r, r + dr] for an ideal gas. Because the atom positions are not correlated with each other in the ideal gas, this probability can be decomposed into three factors: the probability that a specific atom is found at a distance inside [r, r + dr] times the probability that none of the N − 1 other atoms are closer than r + dr, times the number of particles that can be picked. With the total system volume V and the sphere volume v(r) = 4πr 3 /3 the probability density p id can be written as (B1) In the thermodynamic limit (N → ∞, V → ∞ such that N/V = ρ = const.) this can be approximated by To calculate the joint probability that the closest ideal gas-like interstitial is found at a distance in the interval [r I , r I + dr I ] and the closest ideal-gas vacancy in the interval [r V , r V + dr V ] we multiply the two probabilities: Integration over the size of a histogram bin, ∆r, yields the probabilities To compare this expected distribution to the ones obtained from melting trajectories we need to determine the average densities ρ I and ρ V at time t m . To do so we count the total number of defects of type i observed in configurations at time t m , n i , and divide by the number number of trajectories n and the average volume of the simulation box V . shows the landscape used to align a configuration that containes 720 molecules that are arranged in a mostly pristine lattice except for a small number of defects. The black cross indicates the vector s at which the first optimization is started. The blue cross indicates the value of s after the first optimization (s (1) z = −1.14, in between the two levels shown), which corresponds to a shift of the template configuration relative to the optimal alignment along both the prism-and the c-axis. The red cross indicates another local minimum, which corresponds to an alignment where the crystal is shifted by exactly one layer in the z-direction (i.e. layers of type A are on top of type B layers). The green cross marks the value of s after the second optimization which finds the best alignment of the two configurations (s Excess over ideal gas density FIG. 25. Analysis of the locations of I-V defects at the time of formation of a liquid nucleus, tm, using the center of the nucleus volume as reference. Shown are the ratios between the observed joint probability of the closest vacancy and the closest interstitial, P (rV, rI), to the expected density assuming the defects follow ideal gas statistics P id (rV, rI). P (rV, rI) is shown in Fig. 16 in the main part. Figure 25 shows the ratio between the observed probabilities, P (r V , r I ), and the ideal gas probabilities P id (r V , r I ). There is a strong excess of configurations where interstitials and vacancies are close to the center of the liquid nucleus compared to what is expected in an ideal gas.

Appendix C: Data supplement
Here we present a number of additional graphs that were not included in the main part for clarity.
• Figure 26 shows the immobile defect counts found at time t A in spherical-geometry trajectories obtained at 268 K and 303 K.
• Figure 27 shows the defect densities of 5+7 defects around the site where a mobile defect forms for trajecotires that pass through the E-channel. See Figs. 13 and 17 for the corresponding plots that include all trajectories.
• Figure 28 shows the non-equlibrium pair correlation functions for I and V defects obtained in the timespan between t E and t m .
• Figure 29 shows the one-dimensional contributions to the mean square displacements of I and V defects in the same time period but only when there is a single I-V pair present.    28. Pair correlation functions G between interstitial and vacancy defects found in melting trajectories between time tE and tm. Note, that only configurations where a single defect pair is present are included in the analysis and that the trajectories in the underlying ensemble start at the creation of a defect pair. As a consequence, G is not an equilibrium property since finding defect pairs a short distance apart is guaranteed. The distribution has been normalized with the density ρ = V −1 , where V is the box volume resulting in an assumed number density of the defects of V −1 .