Magnetization switching in nanoelements induced by the spin-transfer torque: Study by massively parallel micromagnetic simulations

In this paper we present a detailed numerical study of magnetization switching in shape-anisotropic thin-film nanoelements. These elements are at present of the major interest for the applied solid state magnetism as main components of a new generation of conventional and spintransfer-torque (STT) magnetic random access memory (MRAM) cells. To conduct this study, we have developed a highly efficient method for massively parallel micromagnetic simulations of the magnetization reversal in small-size nanoelements, which allows to fully exploit the large performance gain available on the GPU architecture (usually achievable only for large systems). We apply our method to the spin-torqueinduced magnetization switching in elliptical nanoelements in presence of thermal fluctuations. Being able to compute simultaneously the reversal of up to 1000 such elements, we obtain the dependence of (i) the average switching time and (ii) the distribution density of switching times for individual elements on the element size with a high statistical accuracy. Analysis of these dependencies provides important insights into the physics of magnetization reversal in such systems. Comparison with analogous simulations in the macrospin approximation allows to determine the validity limits of the macrospin model. Our methodology can be applied for the optimization of the MRAM design regarding the information life time and significantly improve the prediction accuracy of write and read error rates of conventional and STT-based MRAM cells. © 2019 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). https://doi.org/10.1063/1.5096264


I. INTRODUCTION
Development of next generations of magnetic memory, including conventional and spin-transfer torque MRAM (see, e.g. Refs. 1-4) requires detailed analysis and understanding of complicated switching processes in MRAM cells. From this analysis we expect, in particular, thorough theoretical recommendations on how to decrease the write and read error rates 3 (WER and RER) of this memory kind to the very low level acceptable for practical applications. To solve this problem (which is among the most important ones for these emerging technologies), we should be able to analyze a very large number of independent switching events in MRAM cells: this is necessary to gain sufficiently accurate statistics concerning the distribution of switching times at finite temperature in dependence on the cell geometry, external field, current strength etc.
This requirement poses increasing demands on the efficiency of full-scale (beyond the macrospin approximation) micromagnetic simulations, which are the most reliable tool for modeling and understanding of such switching processes. Recent development of programming technologies based on the graphic processor units (GPU) has allowed a rapid increase of the micromagnetic software efficiency; for achievable acceleration factors see Ref. 5 and references therein. However, the GPU-based software provides high accelerations (compared to CPU) only for large system sizes, because only in this case the computational power of a GPU (which may have > 10 3 cores and several GBs of RAM) is fully exhausted.
At the same time, physical sizes of MRAM cells should be well below 100 nm to make them competitive with a standard dynamic RAM (DRAM). For this reason, even for a very fine discretization of layers constituting a MRAM stack, the number of discretization ARTICLE scitation.org/journal/adv nodes for its micromagnetic model remains rather moderate: e.g., for a 50 × 80 nm 2 nanoelement discretized using 2 × 2 nm 2 square mesh, we get a system with 25 × 40 nodes only. This circumstance does not allow to achieve significant accelerations by modeling a single MRAM cell with a GPU-based software using standard micromagnetic methods. At the same time, even for very small nanoelements the macrospin approximations (in which many results can be obtained analytically 6 ) is not really valid, 7 so that fast full-scale micromagnetic simulations are highly desirable.
In order to overcome this difficulty and to achieve the maximal acceleration offered by GPU, we suggest a micromagnetic methodology for simultaneous modeling of magnetization reversal (both quasistatic and dynamic) in a large number of nanoelements, where different elements are fully independent -they do not interact with each other and thermal fluctuations on different elements are also not correlated. This way we can model in a single run as many MRAM cells as the given GPU allows (the GPU RAM size is the only limitation), thus achieving the maximal acceleration available on this highly promising computer architecture. We apply our method to the STT-induced magnetization switching in a collection of small-sized nanoelements, revealing important trends in these switching processes when the size of a nanoelement is changed.

A. Full-scale micromagnetic simulations on GPU
Our method is the extension of the standard finite-difference micromagnetic formalism, used in our package. 8 The basic feature of this formalism is the usage of the translationally invariant discretization grid, which allows a simple representation of the exchange interaction and, much more importantly, the application of the fast Fourier transformation (FFT) for the computation of the long-range magnetodipolar interaction. 9 First of all we note that simulation of many small-sized interacting nanoelements is straightforward: it requires only the placement of these elements within the simulation area. However, magnetodipolar interaction between the elements in such a system makes magnetization dynamics in different elements strongly correlated. Hence no reliable conclusions about the reversal statistics for independently switching elements could be drawn from these simulations. We note in passing, that simulations of interacting nanoelements are still necessary for studies of real arrays of closely placed -and thus magnetically interacting -MRAM cells on the chip (a very important practical task). However, the detailed understanding of the switching in a single MRAM cell should be achieved first, being a necessary prerequisite for studying an array of such cells.
Hence the major problem is how to exclude the magnetodipolar interaction between nanoelements in the geometry shown in Fig. 1, fully preserving at the same time this interaction between discretization nodes corresponding to the same nanoelement. Note that this problem does not arise by GPU simulation of a set of nanoelements in the macrospin approximation, 10 because in this approach (i) the magnetodipolar interaction within the same element is included as the shape anisotropy, and (ii) this interaction between FIG. 1. Spatial arrangement of nanoelements to be simulated and the cut-off of the magnetodipolar interaction required to avoid this interaction between different elements (schematic plot below). different elements is absent, because each GPU core simulates 'its own' macrospin, without any information exchange with other cores.
To solve the problem outlined above, we place the nanoelements (which may have different shapes and sizes) on a regular rectangular lattice. The edge-to-edge separation between nanoelements in each lattice direction should be larger than the maximal size of these elements in this direction. An example is shown in Fig. 1: here the maximal nanoelement size in the x-direction is s max x = a, and the lattice constant in this direction is Lx = 2a; s max y = b and Ly = 2b, correspondingly.
The main idea of our method is the introduction of a cut-off of the long-range magnetodipolar interaction at the distances xmax = a in the x-direction and ymax = b in the y-direction (see the bottom graph in Fig. 1). This way we eliminate magnetodipolar interaction between different elements, because the distance between them in each direction is larger than or equal to the corresponding cut-off size. At the same time we fully retain this interaction inside each nanoelement, because its size in each direction is smaller than the cut-off distance. The Fourier transform (FT) of this restricted magnetodipolar interaction kernel allows to use it in a standard way for the discrete convolution with the FT of magnetic moment arrays to obtain the magnetodipolar field. External field, magnetic anisotropy and exchange interaction are treated as usual.
Importantly, our method allows to study ensembles of nanoelements -e.g., MRAM cells -with arbitrary distribution of any elements parameters (like the element size, edge roughness, anisotropy value etc.), because the MicroMagus package, on which the software presented here is based, enables the site-dependent input of all magnetic properties and handling of arbitrary in-plane geometries.
The total effective field computed this way is then employed both in quasistatic simulations -to compute hysteresis loops by energy minimization, and dynamic modeling -by solving the

ARTICLE scitation.org/journal/adv
Landau-Lifshitz-Gilbert (LLG) equation of motion with or without thermal fluctuations. In the latter case, the stochastic LLG equation in our package is solved using a Runge-Kutta-23 or RK-45 methods with the adaptive step size control (relative dynamic accuracy was set to 10 −5 ). Both methods are implemented in our software package MicroMagus 8 and additionally optimized for simulations of stochastic differential equations (SDE). As we have shown in Ref. 11, for micromagnetic models where the magnetic moment magnitude of each discretization cell is kept constant, any numerical method for solving the corresponding SDE converges to its proper (Stratonovich) solution, so that the solution method can be chosen simply according to its efficiency.

B. Macrospin approach: Determination of the macrospin parameters
Quantitative comparison of the two models -full-scale micromagnetic and macrospin -requires an accurate determination of magnetic parameters for the macrospin corresponding to the given flat nanoelement with the elliptical in-plane shape. For this purpose, the expression for the macrospin energy should be established.
In general, one has four standard micromagnetic energy contributions -energy due to an external field, magnetocrystalline anisotropy energy, exchange stiffness energy and (internal) magnetodipolar interaction energy. In the macrospin approach, the energy of a macrospin µ in an external field Hext is expressed -similar to micromagnetics -as the scalar product Eext = −µ ⋅ Hext, where µ = Ms ⋅ V el = Ms ⋅ (πabh)/4 with a and b being the axes of the in-plane ellipse and h -the thickness of the nanoelement correspondingly. The magnetocrystalline anisotropy energy for our system can be neglected. The exchange energy in this approach is absent by definition, as it is assumed that a nanoelement has no internal magnetization structure (to be more precise, the macrospin assumption implies that this -possibly non-trivial, e.g. vortex-like -internal structure does not change during the magnetization reversal, so that the exchange energy -if present -is constant and thus can be omitted).
Thus the only non-trivial energy contribution in the macrospin approximation is the magnetodipolar energy, treated in this model as the shape anisotropy energy E an sh . In a general case, this energy can be expressed via a shape anisotropy tensorN as where m is the unit vector of the macrospin moment (in this notation, the sum of the diagonal components ofN is Nxx + Nyy + Nzz = 1).
The coordinate system where the tensorN is diagonal coincides in our case with Cartesian coordinates used for simulations: x and y axes lie in the nanoelement plane parallel the long and short axes of the ellipse respectively. Hence the anisotropy field Han in this system is Han,α = −2πMs ⋅ Nαmα, where Nα ≡ Nαα and α = x, y, z.
Components of the demagnetization tensorN for flat elliptical nanoelements are often set equal to corresponding values for an ellipsoid with the same 'in-plane' major axes a and b and the third major axis equal to the element thickness h. This approximation can lead to significant errors, as the thickness of a nanoelement under study is homogeneous across the element plane, so that its shape significantly deviates from that of an ellipsoid. For this reason we have used a more accurate method for the determination of Nx, Ny and Nz.
The method is based on comparison of the linear regions of magnetization curves for the macrospin and the elliptical nanoelement, when they are magnetized along one of the 'hard' directions (i.e., along the in-plane y-axis or z-axis perpendicular to the element plane). For the discretized nanoelement, corresponding magnetization curves My(Hy) and Mz(Hz) have been obtained numerically using the quasistatic part of the MicroMagus package (see Fig. 2).
For the macrospin, the M(H)-dependencies can be derived from the total macrospin energy: where we have used the condition m 2 x + m 2 y + m 2 z = 1. In zero field the macrospin moment is directed along the x-axis. If the external field changes, e.g., along the y-direction, then mz = 0 and the equilibrium condition ∂E/∂my = 0 yields the required linear relation between my and Hy: Next we introduce the proportionality constants cy and cz as Hy = cyMsmy and Hz = czMsmz. These constants can be determined from the slope of magnetization curves obtained via full-scale micromagnetic simulations of elliptical elements shown in Fig. 2. Further, according to Eq. (3) and its analogue for the relation between Hz and mz, these constants are related to Nα as Together with the condition Nx + Ny + Nz = 1 we obtain the system of three equations for computing three demagnetizing coefficients Nα for the macrospin corresponding to the given elliptical element. Results of this computation for several element sizes are summarized in Table I.
Note, that the coefficient Nz ('responsible' for the demagnetizing field perpendicular to the element plane) significantly differs from unity, although the element thickness h = 3 nm is much smaller than its lateral sizes a and b; this means that the often used approximation of an extended thin film (for which Hz ≈ − 4πMz, i.e. Nz ≈ 1) is not suitable for the quantitative treatment of nanosized elements under study.
It is also worth mentioning that the problem of finding an ellipsoid equivalent (in the sense of the average demagnetizing factor) to a uniformly magnetized bodies of several simple forms (like a disk, or a cylinder with an elliptical cross-section) was studied in several papers (see, e.g., Ref. 12). However, our goal is different from that pursued in these publications -whereas the authors of this and similar references are mostly calculating the sizes of an equivalent ellipsoids, we are interested only in demagnetizing factors for the equivalent macrospin (in our approach, a macrospin does not have any size assigned to it). For this purpose, our method based on comparison of the linear parts of magnetization curves, is fully sufficient.
We also would like to note, that in principle, one can also study an ensemble of macrospins with the distribution of particle parameters (e.g., their magnetic moments and anisotropy values). However, the establishment of the relation between the variations of these parameters and variations of real-world parameters of nanoelements requires a special study.

III. SPIN-TORQUE-INDUCED SWITCHING: SIZE DEPENDENCE FOR THE CONSTANT CURRENT DENSITY
One of the most important and promising applications of our methodology is the study of switching events in a collection of independent nanoelements (or independent multilayer stacks of nanoelements) at T > 0. These studies are crucial for the development of all kinds of next-generation MRAM. For this reason we have chosen the simulation of the magnetization reversal in single-layer nanoelements induced by the spin-polarized current (SPC) as the first application example.
Using the methodology explained above, we have simulated magnetization switching in a set of 400 elliptical nanoelements, arranged as a rectangular lattice 20 × 20. All elements in one set have the same sizes and magnetic parameters. Simulations of this system allow to draw statistically relevant conclusions about the switching characteristics from the set of magnetization time dependencies of individual elements.
We use magnetic parameters typical for CoFe alloys (M = 1800 G, A = 3 × 10 −6 erg/cm, damping λ = 0.01) and neglect the magnetocrystalline anisotropy because it largely averages out in a polycrystalline film with the typical grain size ∼ 10 nm. We have studied switching processes in elliptical elements with the same short axes b = 40 nm and the thickness h = 3 nm, varying the long axes a (directed along the x-axis) in the interval from 50 to 110 nm.
Magnetization of all nanoelements (initially directed along the positive x-axis direction) was switched by applying the spinpolarized current with the polarization degree P = 0.3 at T = 300 K. Either the current density or the total current through the element was kept constant for all element sizes, i.e. for all values of a; these two simulation protocols obviously lead to qualitatively different dependencies of the switching time on the element size. Magnetization switching was simulated during the physical time up to 200 ns (until all nanoelements were switched).
In this Section we discuss physical results obtained when the current density is the same for all element sizes, namely j = 9 × 10 11 A/m 2 .
After presenting results obtained in the macrospin approximation and using full-scale micromagnetic simulations, we compare switching times for these two approches. We note already here, that macrospin simulations are obviously a few orders of magnitude faster than micromagnetic ones. However, their validity region is limited from above by the critical nanoelement size, for which inhomogeneities of the nanoelement magnetization state become qualitatively important for the adequate description of its magnetization dynamics. Determination of this critical size is mandatory for each new system under study for the choice of the valid numerical model. In addition, thorough understanding of the reasons of the failure of the macrospin approximations in each specific case provide important insights into the physics of magnetic nanosystems.

A. Results for the macrospin approach
We begin with the analysis of results obtained in the macrospin approximation, as for this simpler approach no internal magnetization structure of nanoelements has to be taken into account. For this reason, we can estimate the switching time dependence on the element length a using the same logic as by the derivation of the Arrhenius law in the transition-state theory without SPC effects. 13 This Arrhenius-like approximation also provides a useful starting point for the later comparison between macrospin and full micromagnetic models.
For a macrospin, the energy barrier ∆E ms an is due to the shape anisotropy discussed above and can be calculated as ∆E ms an = K sh V. Here K sh = 2πM 2 s (Ny − Nx) = 2πM 2 s ∆N(a) is the shape anisotropy constant effective by the reversal of the macrospin in the nanoelement plane (xy-plane), V el = πabh/4 is the nanoelement volume, and the difference of demagnetizing factors ∆N(a) is the (slowly) increasing function of the long axis a for the constant short axis b. Hence, in the absence of the spin current the spontaneous thermal fswitching time should (according to the Arrhenius law) grow with ARTICLE scitation.org/journal/adv a as τsw ∼ exp(∆E ms an kT) = exp(κ dem a ⋅ ∆N(a) kT); here we have introduced the constant κ dem = π 2 bhM 2 s 2. However, in our case the energy is constantly supplied to the system by the spin-polarized current. For the constant current density j the supplied power is also proportional to the element length a, as the total current is proportional to the nanoelement area: I = j ⋅ S ell = j ⋅ πab/4. This means that in the steady-state situation (dynamic equilibrium), the average energy of the macrospin lies above the static energy minimum for I = 0, and this energy excess due to SPC is proportional to the current density and the element length: ∆Espc = κspc ⋅ j ⋅ a, where κspc denotes the corresponding proportionality constant.
Hence the effective energy barrier ∆E = ∆E ms an − ∆Espc = a(κ dem ∆N(a) − κspcj) growth with a much slower than in the absence of the current, so that the switching time should increase with a also much slower than implied by the Arrhenius law τsw ∼ exp(∆E ms an kT). To verify this prediction, we have simulated an ensemble of 10 4 macrospins, with demagnetizing coefficients (evaluated as described in the previous Section) corresponding to nanoelements with the long axes a = 50 ÷ 110 nm and other parameters as listed above. For each macrospin the switching time τsw was determined as the time when the dependence mx(t) crosses zero and does not return anymore to positive values (i.e. does not cross the line mx = 0 again); the last condition is important, as shown below.
Thermal fluctuations lead to different magnetization trajectories for different macrospins. However, the resulting distribution of individual switching times ρ(τsw) shown in Fig. 3 is relatively narrow. The average switching time τav (blue symbols in Fig. 4)   FIG. 3. Distribution of the switching time τsw for macrospins with parameters equivalent to nanoelements with the short axis b = 40 nm and varying long axes a for the same current density j = 9 × 10 11 A/m 2 . Each histogram is built by analyzing the switching events for 10 4 independent macrospins. Red lines are fits by the log-normal distribution. depends very weak on the element size, confirming the estimation presented above. This weak dependence also shows that for our parameter choice the size dependencies of the shape anisotropy energy barrier ∆E ms an and the SPC-supplied energy ∆E SPC nearly cancel each other. Further discussion of these results is postponed till the next subsection.

B. Results for full-scale micromagnetic simulations
For these simulations, elements were discretized in-plane by 2.5 × 2.5 nm 2 cells; further mesh refinement did not affect final results within their statistical accuracy.
Typical time dependencies of mx-projections for 40 nanoelements randomly chosen out of the set of 400 trajectories for 50 × 40 nm 2 elements are shown in Fig. 5. All elements demonstrate a behavior typical for the STT-induced switching: amplitude of thermally induced magnetization oscillations increases with time due to the spin-torque effect, until the magnetization switches. Then oscillations rapidly decay, because SPC suppresses the precession of the switched magnetization. For each nanoelement the switching time τsw was determined as explained in the previous subsection for the macrospin.
Analogous to the macrospin case, thermal fluctuations lead to different magnetization trajectories for different nanoelements, but for the full-scale micromagnetic approach the distribution of individual switching times τsw is much broader than for the macrospin model, as shown by error bars in 4. Distribution densities ρ(τsw) obtained by the analysis of switching in a set of 400 identical elements are shown in Fig. 6 for various long axes a = 50 ÷ 110 nm. As expected, both the average switching time τav and the width of the switching times distribution στ increase with the long axis length a. Accurate statistical analysis (see Figs. 4 and 6), made possible by the large number of switching events in our ensembles, reveals several interesting features of the switching process. In particular, our results demonstrate a non-trivial dependence of the average switching time τav on the value of a (Fig. 4, red circles): for 50 < a < 70 nm, τav and the width of the switching time distribution increase relatively fast with a, then τav remains nearly unchanged up to a ≈ 90 nm, and above this size τav and the distribution width increase up to the maximal nanoelement size studied in simulations.
To understand this behavior, we have first performed the analysis of dynamic eigenmodes which are active in magnetic nanoelements of various sizes. For this purpose, we have simulated magnetization fluctuations in thermal equilibrium (using the standard Langevin dynamics) for elements of all sizes studied here. These fluctuations were simulated in the absence of an external field and spin-polarized current following the methodology described in details in our papers. 14,15 Namely, after the initial 'equilibration' of the system (time period after which the average energy does not depend on time) we have recorded time dependencies of the magnetization projections for all discretization cells. Fouriertransform of these dependencies provides the power spectra of magnetization fluctuations for all discretization cells. Summing up these power spectra, we obtain the average fluctuation spectrum for the given element. Peaks of this spectrum give the eigenfrequencies of corresponding dynamic modes, and the site-dependent maps of the fluctuation power at the mode frequency provide spatial distributions of the magnetization oscillation amplitude for each mode.
Using this method, we have determined the modes which are 'active' for nanoelements of all sizes studied here. A mode has been considered as active, if the amplitude of the corresponding peak was not smaller than 5% of the highest spectral peak for the given nanoelement; this highest peak always corresponds to the edge mode with the lowest frequency.
Results of this analysis are presented in Fig. 7, 8 and 9.
In Fig. 7 we show frequency dependencies on the element size for 'active' modes with eigenfrequencies f < 40 GHz, and in Fig. 8 -spatial maps of the modes' amplitude and phase for all mode types shown in the previous figure. Finally, Fig. 9 demonstrates the dependence of the total number of active modes (found in the entire studied frequency range up to 100 GHz) on the element length a.
Detailed discussion of eigenfrequency dependencies on the element size and spatial patterns of active modes is out of scope of this paper (for the thorough discussion of the last issue for a circular element see, e.g., Ref. 16). We note in passing that spatial patterns of eigenmodes are relevant, in particular, for the analysis of the switching process in frames of the 'dynamic mode concept' employed in Ref. 17. For our purposes, we are primarily interested in the dependence of the number of active modes on the element size Nact(a): it can be seen from Fig. 9 that this dependence unambiguously correlates with the corresponding switching time dependence τav(a) for micromagnetic simulations shown in Fig. 4 with red symbols. This increase of the number of active modes can be attributed to the ability of a larger element to 'accomodate' modes with an increasingly complicated spatial structure on a moderate energy cost; this results in a higher number of modes observed for the given temperature in a larger nanoelement.
The increasing number of active modes might be one of the reasons why the switching time growths with the element size for the micromagnetic approach much faster, than for the macrospin approximation. Namely, the energy supplied by SPC is absorbed and dissipated not only by the homogeneous oscillation mode (as in a macrospin), but also by the internal system modes which do not contribute directly to the magnetization reversal. Larger number of such 'active' internal modes could thus lead to a more intensive absorption of the SPC-provided energy and hence -to the increase of the switching time.
Further, we have performed the detailed analysis of magnetization states during the switching -see Fig. 10, where corresponding grey-scale maps of mx-projections are shown. The analysis of these

ARTICLE
scitation.org/journal/adv images reveals that elements with a < 60 nm in most cases switch approximately uniformly (left column in Fig. 10). We note, however, that already for ellipses with a = 60 nm the deviation from the homogeneous magnetization configuration during the reversal process is significant, resulting in a substantial difference between the switching times for the macrospin and micromagnetic approaches (see Fig. 4). This means that the macrospin approximation is unsatisfactory for the description of the reversal process already for these small sizes, although we study a material with the high exchange stiffness.
For larger sizes, magnetization reversal becomes more inhomogeneous. Switching of 70 × 40 nm 2 ellipses always occurs via a series of highly non-collinear magnetization configurations. The process starts within a relatively small 'nucleation' area usually near the edge of a nanoelement, because the magnetodipolar field (which is responsible for the shape anisotropy) is smaller at these locations. Then the reversed area expands towards the inner region of the nanoellipse until the whole nanoelements is switched. Corresponding examples are presented in the second and third columns of Fig. 10 for elements with a = 70 and 90 nm.
For ellipses with a > 90 nm, switching occurs mostly via the nucleation at the narrow ellipse end and the subsequent motion of a domain wall across the nanoelement, as shown in the right column of Fig. 10 for a = 110 nm. In a substantial fraction of switching events, however, the reversal occurs via a nearly simultaneous appearance of two nucleation areas which then merge by their expansion, leading to a complete reversal.
As already mentioned above, the average switching time τav and the width of the switching times distribution ρ(τsw) begin to increase with the ellipse size again starting from a = 90 nm (Fig. 4). Histograms of these distributions shown in Fig. 6 clearly demonstrate that this increase is not due to the increase of the most probable switching time τ prob (which corresponds to the maximum of ρ(τsw)). Indeed, when a increases from 90 to 110 nm, τ prob obtained from the log-normal fit of switching time histograms increases only from 3.85 to 4.50 ns. In this terms, the increase of both τav and the width of ρ(τsw) is due to the appearance of heavy tails by the distribution ρ(τsw). The nature of these tails can be understood by inspection of magnetization time dependencies for particles with largest switching times. Two corresponding examples are shown in Fig. 11, where it can be seen that the magnetization trajectory crosses the line mx = 0 several times before the actual switching happens, but then returns to the initial value mx ≈ +1; these events are marked by arrows in Fig. 11.
In the language of the transition rate theory 13 this means that the 'first passage time' approximation is not valid for these elements: magnetization can cross the 'critical surface' (passing through the saddle point in the energy landscape) several times forth and back, before it finally stays in the attraction basin of the opposite equilibrium magnetization state. These processes become increasingly important for the nanoelements with large sizes, because magnetization trajectories between positive (mx ≈ + 1) and negative (mx ≈ − 1) energy minima become longer and more complicated. Hence the probability to encounter a fluctuation which forces the magnetization to return to the initial minimum increases.
Our findings can in principle be related to results reported in Ref. 18, where the size dependence of the critical switching current Icr for an in-plane magnetized elliptical nanoelement made of CoFeB  Fig. 6). Several events where the magnetization was nearly switched, but then returned to its initial orientation, are marked with arrows.
is simulated neglecting thermal fluctuations. From general considerations one would expect that the size dependence of Icr should at least correlate with the corresponding dependence of the average switching time obtained for the undercritical current for the given temperature. However, results presented in Ref. 18 demonstrate an unexpected oscillating dependence Icr(a) (see, e.g., Fig. 2 in Ref. 18), where a is the long axis of the nanoelement. The authors of Ref. 18 suggest that these oscillation result from the quantization of dynamical eigenmodes which are different for different nanoelement sizes. However, taking into account that the number of active eigenmodes and frequency of the given mode depend monotonically on the element size (see our Fig. 7 and 9), this explanation seems to be unlikely. In addition, critical currents obtained in Ref. 18 do not show any monotonic dependence on the short axis length b or exchange constant A, what also cannot be explained only by the quantization of eigenmodes. Hence, the problem of the critical current dependence on the element size requires further investigation. The only result in Ref. 18 which is in agreement to our data is the finding that the critical current obtained in the macrospin approximation is considerably smaller than the corresponding value from micromagnetic simulations.
In most experimental 19,20 and theoretical 17,21-23 papers the size dependence of the switching process has been studied for nanosized elements and multilayer magnetic tunnel junctions with the perpendicular magnetic anisotropy (PMA). Due to this difference to our system, direct comparison with results presented here is difficult. Still, some relation to our findings can be established.
In particular, tendencies similar to ours concerning the character of switching processes have been observed in Ref. 22 by studying the size dependence of the STT-driven switching for a disk-shaped element with PMA. The authors of Ref. 22 have simulated the optimal switching path for the current (voltage) values exceeding the critical ones, neglecting thermal fluctuations (T = 0). It was found that this optimal path corresponds to the homogeneous rotation for small disks (diameter D ≈ 20 nm), domain wall motion for larger ARTICLE scitation.org/journal/adv disks (26 < D < 60 nm) and reversal nucleation starting in the disk center (D > 90 nm). Start of the nucleation reversal at the disk center -in contrast to the nucleation near the edge for our nanolement -is explained in Ref. 22 by the out-of-plane initial magnetization orientation in their disk: for this orientation the internal magnetodipolar field is demagnetizing, so that the nucleation-like reversal starts in the disk center, where this field has the largest value. This result from Ref. 22 can be put into a more general context by comparing it with simulations performed in Ref. 21. Namely, in Ref. 21 it is shown that for low temperature (T = 5 K, i.e. thermal fluctuations are nearly absent), nucleation-mediated magnetization reversal of a CoFeB nanodisk with PMA starts in the disk center, whereas for T = 300 K such a reversal begins at the disk edge. This difference was explained in Ref. 21 by larger amplitude of thermal magnetization fluctuations at the disk edge, where the exchange interaction with surrounding spins is weaker, because the number of nearest neighbours at the edge is smaller. For a sufficiently high temperature, these fluctuations can lead to the preferred reversal nucleation near the edge. For our systems, simulations are performed for T = 300 K, and the magnetodipolar field hinders the magnetization switching (in contrast to systems with PMA); both circumstances favor the reversal nucleation near the edges of our nanoellipse, as observed in our modeling.
Switching time statistics for a double magnetic tunnel junction (DMTJ) with PMA has been collected using full-scale micromagnetic and macrospin simulations in Ref. 23. Unfortunately, data concerning the number of simulation runs used to collect statistical information are missing in this paper. However, one of qualitatively interesting results presented in Ref. 23 is the finding that the switching time in the macrospin approach is larger than for micromagnetic modeling, especially for smallest values of SPC (in contrast to our results shown in Fig. 4 and in Fig. 12 below). Understanding of the reason for this difference requires a more detailed analysis of the switching process for the system studied in Ref. 23.

IV. SPIN-TORQUE-INDUCED SWITCHING: SIZE DEPENDENCE FOR THE CONSTANT TOTAL CURRENT
For comparison with the constant current density case described in the previous Section, we have also studied the dependence of the switching time on the element size when the total current through the element is kept constant. In this case, the total power supplied to the nanoelement by the spin-polarized current also is independent on the element size. Hence, the difference between the anisotropy energy barrier ∆E ms an and the energy excess ∆Espc due to SPC increases with the length of the nanoelement as ∆E = ∆E ms an − ∆Espc ∼ κ dem a∆N(a). Taking into account that the difference of demagnetizing factors ∆N(a) scales with the long axis a much slower than linearly, we expect that the dependence τav(a) is close to the exponential one at least for the macrospin model.
Simulation results for both the macrospin (blue symbols) and the full-scale micromagnetic (red symbols) approaches are compared in Fig. 12 (note the logarithmic scale of the time axis). Due to much longer switching times, micromagnetic simulations could be carried out only up to the maximal long axis length a = 90 nm.
First of all, our expectation concerning the nearly exponential dependence τav(a) for the macrospin model was confirmed in our simulations: at least up to a = 90 nm the corresponding dependence -shown in Fig. 12 by the blue symbols -is approximately linear in logarithmic coordinates. A significant nonlinearity of the log(τav(a))-dependence observed for a > 90 nm cannot be explained by the additional factor ∆N(a) present in the anisotropy energy barrier, because for large ratios a/b (i.e., for large values of a) the dependence ∆N(a) becomes weaker. Hence, the most probable reason for this non-linearity is the strong dependence of the saddle point curvature on the long axis a for larger a values.
Switching times obtained in the micromagnetic approach are also for this case much higher than in the macrospin approximation; the difference amounts to nearly 100 times for the largest element which could be simulated micromagnetically (a = 90 nm). Similar to the constant current density case considered above, these higher values of the switching times (as compared to macrospin) can be explained by the presence of the large number of dynamic eigenmodes in a discretized nanoelement and a more complicated switching path; both factors become increasingly important for elliptical elements with longer a-axis, thus leading to larger difference with the macrospin values. In addition, we note that the non-linearity of the log(τav(a))-dependence is clearly visible for the micromagnetic model already for a ≥ 70 nm.

V. CONCLUSION
Summarizing, we have suggested a new micromagnetic methodology for simultaneous simulation of a large number of small-size nanoelements. Our methodology is able to fully exploit the acceleration potential of modern GPUs, even when the size of ARTICLE scitation.org/journal/adv a single element is so small, that the simulation of such a single element on GPU would not lead to any significant acceleration. We have applied our method to studies of the magnetization reversal in a large ensemble of elliptical nanoelements under the influence of the spin-polarized current. Detailed analysis of the highly accurate switching statistics (which collection has been made possible by a large number of switching events simulated in one run) has revealed important features of magnetization reversal in small-sized nanostructures. Further development and application of our methodology would be of great interest to nanomagnetism and especially -to a rapidly progressing technological development of conventional and STT-MRAM.
In particular, numerous experimental studies of the average switching time and the Write Error Rate (WER) of STT-MRAM cells in dependence on the cell size, applied pulse voltage, applied pulse duration and temperature (see, e.g., Refs. 3, 7, 24, and 25) often disagree qualitatively with theoretical results obtained in the macrospin approximation (see discussions in Refs. 3 and 7). Full-scale micromagnetic simulations carried out with our method for a complete MRAM cell, could be directly compared to these experiments, resulting in a decisive progress in solving this important problems. Our method could also greatly assist to accelerate the recently suggested method for WER estimation based on the rare event enhancement (see Refs. 26 and 27).