A compact physical model for the simulation of pNML-based architectures

Among emerging technologies, perpendicular Nanomagnetic Logic (pNML) appears very promising because it combines logic and memory in the same device and offers scalability, 3D integration and low power consumption. Recently, Full Adder (FA) structures clocked by a global magnetic field have been experimentally demonstrated, and detailed characterizations of the switching process governing the domain wall (DW) nucleation probability $P_{nuc}$ and time $t_{nuc}$ have been performed. However, the design of pNML architectures represents a crucial point in the study of this technology; this can have a remarkable impact on the reliability of pNML structures. Here, we present a compact model developed in VHDL which enables the simulation of complex pNML architectures while taking critical physical parameters into account. To this end, such parameters have been extracted from the experiments, fitted by the corresponding physical equations and encapsulated into the proposed model. Within the model, magnetic structures are decomposed into a few basic elements (nucleation centers, nanowires, inverters, etc.), each represented by the corresponding physical description. To validate the model, we redesigned an FA and compared our simulation results with the experiment. With this compact model of pNML devices we have envisioned a new methodology which makes it possible to simulate and test the physical behavior of complex architectures at very low computational cost. © 2017 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license.


I. INTRODUCTION
Many emerging technologies 1 are currently under investigation; among them, Nano Magnetic Logic (NML) seems promising thanks to its low power consumption and nano-scale dimensions. 1,2 With this emerging technology it is possible to represent and store binary information according to the magnetization of devices. The elementary cells are rectangular-shaped nanomagnets with typical dimensions of (50 × 100) nm. 3 Perpendicular Nano Magnetic Logic (pNML) is a particular NML implementation characterized by perpendicular magnetic anisotropy (PMA). Unlike in-plane NML, where the magnetization of devices is parallel to the plane, in pNML the magnetization vector is perpendicular to it. Fig. 1.A shows an example of a pNML device; here the direction of the magnetization vector is used to represent binary logic values. Logic architectures can be designed by placing pNML elementary cells side by side. Information is propagated thanks to the coupling field interacting among neighboring elements. The clocking mechanism is realized through an external out-of-plane magnetic field, applied uniformly to the whole circuit (Fig. 1.B).
Signal directionality is forced by irradiating a defined spot with a Focused Ion Beam (FIB). 4 This partially irradiated zone is called the nucleation center, and it is tuned to define the switching properties of pNML devices and to force a specific propagation direction (Fig. 1.C). The most interesting logic element is the Minority Voter (Fig. 1.D). This gate exploits the coupling field among the inputs to set the output magnet to the proper magnetization value according to Fig. 1.E. This function plays a crucial role in the design of pNML logic architectures since it enables the synthesis of elementary logic gates such as NANDs and NORs, obtained by forcing one of the three inputs of the Minority Gate to a specific logic value. The concept of a programmable gate can be extended and redesigned in a 3D environment, exploiting the Logic-in-Memory (LIM) concept 5,6 by placing programmable inputs into separate layers as already.

a) Electronic mail: giovanna.turvani@polito.it
The design of logic circuits represents a key point in the study of emerging technologies. In this paper, we present a compact model able to describe the physical properties of pNML devices and the magnetic interactions among them. This compact model, described in VHDL (VHSIC Hardware Description Language), enables very fast simulations of complex pNML architectures while preserving adequate accuracy. The model, structured and organized through libraries, implements all the theoretical equations presented in Ref. 7. Parameters, physical quantities and constants can be set and modified; this parametrization of the physical equations makes the model flexible. The considered physical parameters are: I) physical geometry, II) material parameters, III) nucleation probability ($P_{nuc}$), IV) nucleation time ($t_{nuc}$), V) Domain Wall (DW) propagation time and VI) the magnetic clocking field. These quantities have been fitted from experimental activities, as already demonstrated in Ref. 7. Our model allows us to determine the impact of physical effects, such as DW nucleation and motion, on the logic computation and how such influences can be handled at an architectural level. 8 This enables the estimation of the critical path of the circuit, the related latency, and the maximum operating frequency, which is limited by the DW propagation time. In Refs. 9 and 10, the authors demonstrated the experimental feasibility of a Full Adder (FA), while in Ref. 7 they analyzed how the nucleation probability ($P_{nuc}$) influences the DW nucleation time ($t_{nuc}$). In particular, the results presented in Ref. 9 show that $P_{nuc}$ can have a remarkable impact on the reliability of pNML-based structures. To validate our model, we carried out different studies on the FA 9 and compared measurements with data obtained through simulations.

II. MODELING
Two main steps are required to switch a pNML cell: the nucleation of the domain wall and the propagation of the reversed magnetization along the whole magnet. In pNML, the nucleation and the propagation of the new domain are supported by a clocking field whose amplitude $H_{clock}$ and duration $t_{clock}$ must be adequate to guarantee error-free operation. The switching process should be completed within the clock field pulse time:

$$t_{clock} \geq t_{nuc} + t_{prop} \qquad (1)$$

where $t_{nuc}$ is the nucleation time and $t_{prop}$ is the time required to propagate the magnetization through the entire domain wall. 7

A. DW propagation
According to the clocking field amplitude $H_{clock}$, the speed of the domain expansion $v_{DW}$ changes, and it is modelled according to the regime it belongs to. 7 The propagation time $t_{prop}$ is related to the velocity of propagation: 7

$$t_{prop} = \frac{l_{mag}}{v_{DW}} \qquad (2)$$

where $l_{mag}$ is the length of the magnet.
The propagation entails the motion of the DW and the expansion of the most favourable magnetization domain. The propagation is characterized by a speed of motion, the DW velocity $v_{DW}$, which depends on the applied field.
Three main regimes can be identified in the propagation of the new state in thin multilayer structures. These regimes are modelled according to the external field $H_z$ applied to the magnetic structure. This field is compared to the intrinsic pinning field $H_{int}$, the field needed for DW depinning at zero temperature: 11
Creep regime: $H_z \ll H_{int}$. The DW depinning depends on the defects and is thermally activated. The motion is called creep motion.
Depinning regime: $H_z \approx H_{int}$. The motion is still thermally activated. It is an intermediate region between the creep and the flow regime.
Flow regime: $H_z \gg H_{int}$. The disorder of the magnetic material becomes irrelevant and the motion is governed by flow dynamics.
The velocity $v_{DW}$, i.e. the speed at which the new domain expands through the structure, changes according to the applied external field, and hence according to the clocking field. In each of the three regimes $v_{DW}$ is described as follows.
Creep regime: the motion is very slow, $v_{DW} \ll 1$ m/s, and it is described by Eq. 3, in which $v_0$ is a numerical prefactor.
Depinning regime: the velocity increases exponentially with the applied field; however, it also depends strongly on the temperature, according to Equation 3.
Flow regime: the velocity depends linearly on the applied field and can be modelled by a linear equation in which $v_0$ is a numerical prefactor and $\mu_w$ is the domain wall mobility.
For pNML with a constant clocking field amplitude, $v_{DW}$ can be considered constant.
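The regime selection and the resulting propagation time can be illustrated with a short Python sketch. It is not the VHDL implementation of the model: the creep expression used here is the textbook thermally activated law, the flow expression is a plain linear law, and both are assumptions, as are the pinning energy `e_pin`, the tolerance `band` used to decide when $H_z \approx H_{int}$, and all function names. Fields may be given in any unit, as long as $H_z$, $H_{int}$ and $\mu_w$ are consistent.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def classify_regime(h_z, h_int, band=0.1):
    """Return 'creep', 'depinning' or 'flow' for the applied field h_z.

    `band` is a hypothetical relative tolerance that defines H_z ~ H_int.
    """
    if h_z < (1.0 - band) * h_int:
        return "creep"
    if h_z > (1.0 + band) * h_int:
        return "flow"
    return "depinning"


def creep_velocity(h_z, h_int, v0, e_pin, temperature):
    """Assumed creep law: v = v0 * exp(-(E_pin / kT) * (H_int / H_z)**(1/4))."""
    exponent = (e_pin / (K_B * temperature)) * (h_int / h_z) ** 0.25
    return v0 * math.exp(-exponent)


def flow_velocity(h_z, v0, mu_w):
    """Assumed linear flow law: v = v0 + mu_w * H_z."""
    return v0 + mu_w * h_z


def propagation_time(l_mag, v_dw):
    """t_prop = l_mag / v_DW: magnet length divided by the DW velocity."""
    if v_dw <= 0.0:
        raise ValueError("v_dw must be positive")
    return l_mag / v_dw
```

Since $v_{DW}$ is constant for a fixed clocking field amplitude, the velocity only needs to be evaluated once per simulation and can then be reused by `propagation_time` for every magnet.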

B. DW nucleation
For DW nucleation, i.e. the generation of the new magnetic domain, the related energy barrier $E_{barrier}$ has to be overcome. The artificial nucleation centre (ANC), having low anisotropy, supports the nucleation by locally reducing the energy required to reverse the magnetization. The applied perpendicular clocking field $H_{clock}$ further decreases $E_{barrier}$, which is modelled according to the Stoner-Wohlfarth model. 11 In this model the factor $n$ is 1 or 2 according to the rotation of the magnetization reversal, $E_0$ is the energy barrier at zero field and $H_0$ is the coercive field at zero temperature. The nucleation event occurs in the ANC, and $E_{0,ANC}$ is expressed as $E_{0,ANC} = K_{ANC} V_{ANC}$, where $V_{ANC}$ is the volume and $K_{ANC}$ is the anisotropy of the ANC, which can be obtained from the effective anisotropy $K_{eff}$ and the FIB irradiation factor $g_{FIB}$. 12 Each nanomagnet is characterized by a unique anisotropy term, which includes all the different contributions (e.g. uniaxial anisotropy, surface anisotropy, interface anisotropy, etc.). This total anisotropy is called the effective anisotropy $K_{eff}$; 11 it depends on $K_u$, the crystalline anisotropy, $K_s$, the surface anisotropy, $M_s$, the saturation magnetization, and $t_{layer}$, the thickness of the layer. As a consequence, the preferential behaviour of the magnetization depends on the crystal structure, the shape or geometry, and the material surfaces.
The zero-temperature field required to reverse the ANC magnetization and nucleate the DW is therefore given by an expression 11 involving $M_S$, the saturation magnetization of the magnet, which depends on physical parameters of the magnetic structure. 12 Whether DW nucleation occurs or not can be estimated by means of the nucleation probability, described by the Arrhenius model. 7 To a first approximation, it is expressed in terms of the applied field and of the pulse duration. In pNML, the nucleation of the magnet is controlled by the stray fields of the input magnets which surround the ANC. The field required to nucleate the domain wall is either decreased, to support the nucleation, or increased, to prevent it, by the superposition of those coupling fields.
The contributions to this superposition are the coupling fields from the inputs with magnetization $M_i \in \{-1; 1\}$. 7 The resulting probability $P_{nuc}$ to nucleate the ANC with a magnetic field pulse of amplitude $H_{eff}$ and effective pulse time $t_{eff}$ depends on $\tau(H_{eff})$, the inverse of the switching rate, and on $f_0$, the attempt frequency. 7 Two constraints on the nucleation probability 7,11 can be formulated to guarantee the correct behaviour of the circuit.
Eq. 13 guarantees that the nucleation takes at most $t_{nuc}$, while Eq. 14 prevents unwanted nucleation during the whole $t_{clock}$.
The nucleation time can be expressed in terms of the probability to nucleate, so it can be computed from the desired $P_{nuc}$ (Eq. 11).
All the described equations that model the pNML have been verified. 7,11
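The relation between nucleation probability and nucleation time can be sketched in Python as follows. The functional forms are assumptions based on the commonly used Arrhenius description, namely $E_{barrier} = E_0 (1 - H_{eff}/H_0)^n$, $\tau = f_0^{-1} \exp(E_{barrier}/k_B T)$ and $P_{nuc} = 1 - \exp(-t_{eff}/\tau)$; they are not copied from the model libraries, and the function names are illustrative only.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def energy_barrier(h_eff, e0, h0, n=2):
    """Assumed form E_barrier = E0 * (1 - H_eff/H0)**n, zero once H_eff >= H0."""
    if h_eff >= h0:
        return 0.0
    return e0 * (1.0 - h_eff / h0) ** n


def switching_time_constant(h_eff, e0, h0, temperature, f0, n=2):
    """Assumed Arrhenius law: tau = exp(E_barrier / (k_B * T)) / f0."""
    barrier = energy_barrier(h_eff, e0, h0, n)
    return math.exp(barrier / (K_B * temperature)) / f0


def nucleation_probability(t_eff, tau):
    """Assumed form P_nuc = 1 - exp(-t_eff / tau)."""
    return -math.expm1(-t_eff / tau)


def nucleation_time(p_nuc, tau):
    """Inverse of the expression above: t_nuc = -tau * ln(1 - P_nuc)."""
    if not 0.0 < p_nuc < 1.0:
        raise ValueError("p_nuc must lie strictly between 0 and 1")
    return -tau * math.log1p(-p_nuc)
```

Under these assumptions, `nucleation_time` is the inverse of `nucleation_probability`, so a desired $P_{nuc}$ translates directly into the minimum pulse time needed in the worst case. Very large barriers (several hundred $k_B T$) would overflow `math.exp` and need to be handled separately.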

III. ADOPTED METHODOLOGY AND VALIDATION
To describe any general pNML structure, we developed a compact model for the characterization of complex architectures by using the hardware description language VHDL. The aim of the model is to capture all the properties of the single magnet and of the whole domain wall, from the smallest circuit to complex architectures. It describes the switching behaviour of the architecture and fully characterizes the structures according to physical and technological parameters.
Moreover, all the parameters can be set according to the adopted technology, considering both the geometry and the material chosen for the realization of the magnetic layers. Delay parameters such as $t_{nuc}$ and $t_{prop}$ are also considered. The whole model is divided into two parts: the former describes the switching behaviour of the magnetic structures, while the latter contains the description of the technological and physical parameters. All equations mentioned in the paper have been implemented in VHDL by using the numeric and math libraries.
To define the behaviour of the magnetic structures, devices are divided into basic blocks, whose combination realizes any desired structure. To realize a complete device, different blocks are needed. The first is the nucleation centre, which models the ANC. Then a chain of other basic components (e.g. domain wall or corners) can be attached in order to create the complete domain wall, according to the required geometry. The realized structure can be the input of other basic elements, such as the nucleation of another DW or a majority voter gate. The basic components are described in a VHDL package; the behaviour of each block is analysed and characterized according to its functionality. Each elementary component is implemented as a single VHDL entity. In order to compose a device, different basic blocks must be interconnected. Consequently, larger circuits can be divided into sub-circuits connected together.
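The block-based decomposition can be mimicked in a few lines of Python. This is only an illustration of the idea, not a translation of the VHDL package: the class names `Wire` and `Magnet`, their fields and the delay rule (nucleation time plus propagation along the chained segments) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Wire:
    """Hypothetical basic block: a nanowire segment of a given length (m)."""
    length: float


@dataclass
class Magnet:
    """A nucleation centre followed by a chain of basic segments."""
    t_nuc: float            # nucleation time of the ANC (s)
    segments: List[Wire]    # chained blocks forming the complete structure

    def length(self) -> float:
        """Total length of the chained segments."""
        return sum(segment.length for segment in self.segments)

    def switching_delay(self, v_dw: float) -> float:
        """Nucleation time plus propagation time at constant DW velocity."""
        return self.t_nuc + self.length() / v_dw


# Example: an ANC followed by two segments, with an assumed velocity of 30 m/s.
magnet = Magnet(t_nuc=100e-9, segments=[Wire(1e-6), Wire(2e-6)])
delay = magnet.switching_delay(v_dw=30.0)
```

Keeping each block as its own small object mirrors the one-entity-per-component organisation described above, so that larger circuits are obtained purely by interconnection.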
In addition to the description of the switching behaviour of the elements, the model is refined by adding information and equations describing the physical and technological characteristics of pNML. This physical description is divided into two parts: the first one is a collection of constants which are technology and material dependent and which can be set according to the user's needs. Table I shows the complete list of the adopted technological constants.
In the second part all the equations and the procedures are implemented. These procedures depend strictly both on design parameters, such as the amplitude of the pulse field, and on the current state of the circuit. They concern the computation of the nucleation probability or the evaluation of the domain velocity, and they are employed by the basic elements.
The parametric model can be used to extract the latency, the critical path and the maximum achievable frequency of the implemented structure. In particular, a correct evaluation of the critical path makes it possible to set $t_{clock}$ properly, which is the sum of three contributions: $t_{prop}$, the propagation time of the longest DW; $t_{nuc}$, the nucleation time; and $t_{rise}$, the contribution of the non-ideality of the clock. Indeed, shorter values can lead to an incomplete switching of the domain wall and hence to errors in the computation (Eq. 1); on the other hand, an excessive overestimation of $t_{clock}$ reduces the frequency that the circuit may achieve, since $f_{clock} = 1/(2 t_{clock})$. To better set those parameters, a dummy simulation is performed first. For this simulation, the time parameters are set as follows: $t_{prop}$ is set to a value high enough to guarantee the full propagation of the longest DW in the circuit; $t_{nuc}$ is set to 0 ns, so that the critical path computed during the simulation coincides with the longest propagation time. The longest propagation time and the minimum nucleation time that guarantees, in the worst case, a high nucleation probability (condition 13) are extracted from the simulation. The latter is computed according to Equation 15 by setting the desired minimum nucleation probability. These two constants are then used to set the time parameters for the final simulation.
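The timing set-up described in this paragraph reduces to a few arithmetic steps, sketched here in Python. The helper names are illustrative, and the longest propagation time is computed directly from a list of magnet lengths rather than extracted from a dummy simulation, which is a simplification.

```python
def longest_propagation_time(magnet_lengths, v_dw):
    """Propagation time of the longest magnet at constant DW velocity."""
    return max(magnet_lengths) / v_dw


def clock_period(t_prop, t_nuc, t_rise):
    """t_clock as the sum of propagation, nucleation and rise contributions."""
    return t_prop + t_nuc + t_rise


def clock_frequency(t_clock):
    """f_clock = 1 / (2 * t_clock), as stated in the text."""
    return 1.0 / (2.0 * t_clock)
```

Any increase of $t_{clock}$ beyond the sum returned by `clock_period` is pure margin and lowers the value returned by `clock_frequency`, which is the trade-off discussed above.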

A. Case study: Full Adder
In order to validate our model, we have redesigned the Full Adder structure presented in Ref. 9 (with 200 nm wide [Co$_{0.8\,\mathrm{nm}}$Pt$_{1.0\,\mathrm{nm}}$]$_{\times 4}$) as a case study. From a methodological point of view, the tested circuit is decomposed into basic blocks as depicted in Fig. 2 and described in VHDL by using the model. The switching behaviour has been simulated and is depicted in Fig. 2.C; it coincides with the one observed during experiments. The applied field in the simulation is the same as in Ref. 9, $H_{clock}$ = 560 Oe (Table I). The propagation time of the longest DW, $t_{prop,cr}$, extracted by performing the dummy simulation, is equal to 118 ns. The second parameter extracted from the dummy simulation is the minimum nucleation time. It is equal to 558 ns, and it comes from the worst case, which occurs when the coupling field of the majority voter gate that has to nucleate counts only one contribution (e.g. $I_1 = 0$, $I_2 = 1$, $I_3 = 0$), for a desired nucleation probability of 0.95.
The clock pulse duration ($t_{clock}$) is set to 900 ns, since it is calculated as the sum of $t_{prop,cr}$, $t_{nuc}$ and $t_{rise}$. The latter parameter is approximately 20% of the pulse duration and is set to 180 ns. After these parameters have been set, the final simulation is run and the waveforms are extracted. As shown in Fig. 2.C, the switching behaviour coincides with the experimental results. Here the extracted critical path is equal to 718 ns, while the latency values are reported in Fig. 2.D.
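The figures quoted for the Full Adder can be cross-checked with a short calculation. The sketch below is only a consistency check, assuming that the three contributions are added directly and that $f_{clock} = 1/(2 t_{clock})$ applies to the 900 ns pulse; the variable names are illustrative.

```python
# Values quoted in the text, in nanoseconds.
T_PROP_CR_NS = 118   # propagation time of the longest DW
T_NUC_NS = 558       # worst-case minimum nucleation time
T_CLOCK_NS = 900     # chosen clock pulse duration
RISE_PERCENT = 20    # t_rise as a percentage of the pulse duration

t_rise_ns = T_CLOCK_NS * RISE_PERCENT // 100           # 180 ns
required_ns = T_PROP_CR_NS + T_NUC_NS + t_rise_ns      # 856 ns
margin_ns = T_CLOCK_NS - required_ns                   # 44 ns
f_clock_hz = 1.0 / (2.0 * T_CLOCK_NS * 1e-9)           # about 556 kHz
```

Under these assumptions the three contributions add up to 856 ns, which fits within the 900 ns pulse with a margin of 44 ns, and the corresponding clock frequency is about 556 kHz.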

IV. CONCLUSION
In this paper we have presented a new compact model able to describe the switching process governing pNML structures. Parameters extracted from experiments have been inserted into libraries, while physical equations have been encapsulated into the proposed model. The validation has been accomplished by considering a Full Adder architecture, previously realized experimentally, as a case study. The correctness of its logical behavior has been proved, and timing information has been collected by performing exhaustive tests with varying input patterns. Results show the latency values for different input patterns. Further technological improvements in pNML devices will lead to the scaling of the current sizes; this will increase the propagation speed and, as a consequence, improve clock frequencies. This model makes it possible to investigate the potential of this technology, evaluating the impact of scaled dimensions in terms of timing performance.
FIG. 1. A) pNML cells B) Applied clocking field C) Unidirectional wire made by pNML cells D) Minority gate and E) the related truth table.

FIG. 2. Full Adder with ordered input (different combinations in A and B) and the corresponding implementation decomposed into basic blocks. C) The simulated switching behaviour. It has been compared with the real switching behaviour analysed in Ref. 11. They are coherent. The carry out is computed after one clock cycle, and the sum is evaluated in the next clock cycle. Delays can be noticed. D) Latency results.

TABLE I. Parameters set in the model.