Implementing Bayesian Networks with Embedded Stochastic MRAM

Magnetic tunnel junctions (MTJ's) with low barrier magnets have been used to implement random number generators (RNG's) and it has recently been shown that such an MTJ connected to the drain of a conventional transistor provides a three-terminal tunable RNG or a $p$-bit. In this letter we show how this $p$-bit can be used to build a $p$-circuit that emulates a Bayesian network (BN), such that the correlations in real world variables can be obtained from electrical measurements on the corresponding circuit nodes. The $p$-circuit design proceeds in two steps: the BN is first translated into a behavioral model, called Probabilistic Spin Logic (PSL), defined by dimensionless biasing (h) and interconnection (J) coefficients, which are then translated into electronic circuit elements. As a benchmark example, we mimic a family tree of three generations and show that the genetic relatedness calculated from a SPICE-compatible circuit simulator matches well-known results.

Magnetic tunnel junctions (MTJ's) with low barrier magnets have been used to implement random number generators (RNG's) [1-4] and it has recently been shown that such an MTJ connected to the drain of a conventional transistor provides a three-terminal tunable RNG or a p-bit [5] with applications to optimization [6] and to an enhanced type of Boolean logic that is invertible [7-10]. In this paper we show how this p-bit can be used to build a p-circuit that emulates a Bayesian network (BN) [11] defined in terms of conditional probability tables (CPT) that describe how each child node is influenced by its parent nodes. BN's are widely used to understand causal relationships in real world problems such as forecasting, diagnosis, automated vision, manufacturing control and so on [12]. For deep and complicated networks where each child node has many parent nodes, the computation of the joint probability becomes impractical [13], and different hardware implementations of BN's have been proposed [14-22].
In this letter we present a systematic approach for translating a BN into an electronic circuit such that the stochastic node voltages mimic the real world variables, whose correlations can then be obtained from electrical measurements on the corresponding circuit nodes. The proposed electronic circuit and the hardware building blocks are based on present-day Magnetoresistive Random Access Memory (MRAM) technology whose MTJs are built out of thermally unstable nanomagnets [5] (Stochastic MRAM), obviating the need for the development of a new device.
a) Equally contributing authors. b) Electronic mail: datta@purdue.edu
As a benchmark example, consider a BN (Fig. 1) consisting of three generations of a family, where each child ($C$) inherits half the genes from the father ($F$) and the other half from the mother ($M$), so that $C = 0.5F + 0.5M$, where $C$, $F$ and $M$ can each be viewed as a bipolar random variable taking the values $(-1, +1)$. A well-known concept in genetics is that of relatedness. For example, the relatedness $\langle C_1 \times C_2 \rangle$ of two siblings with the same parents is 50%:
$$\langle C_1 \times C_2 \rangle = 0.25\,\big(\langle F F \rangle + \langle F M \rangle + \langle M F \rangle + \langle M M \rangle\big) = 0.25\,(1 + 0 + 0 + 1) = 0.5$$
On the other hand, two cousins $C_1 = 0.5F_1 + 0.5M_1$ and $C_2 = 0.5F_2 + 0.5M_2$ whose fathers $F_1$, $F_2$ are siblings have a relatedness of only 12.5%:
$$\langle C_1 \times C_2 \rangle = 0.25\,\big(\langle F_1 F_2 \rangle + \langle F_1 M_2 \rangle + \langle M_1 F_2 \rangle + \langle M_1 M_2 \rangle\big) = 0.25\,(0.5 + 0 + 0 + 0) = 0.125$$
Fig. 1b compares the well-known relatedness of different family members (see for example, Ref. 23) with that calculated from a behavioral model which we call probabilistic spin logic, PSL, and from a simulation of the actual circuit using a SPICE-based circuit simulator. The behavioral PSL model represents an intermediate step in the translation of BN's to electronic circuits. It is a network whose nodes are abstract p-bits denoted by $m$ (see Fig. 2), connected to other nodes through dimensionless constants $J$ and biased through dimensionless constants $h$. The p-bit described by Eq. 1a is analogous to a binary stochastic neuron and the interconnection described by Eq. 1b is analogous to a synapse. The PSL model is then translated into a circuit model whose nodes are actual circuit elements denoted by $M$, connected to other nodes through conductances $G$ and biased through voltages $V_{bias}$. Fig. 1b shows that the relatedness from the PSL model (second column) as well as that obtained from the SPICE model (third column) agree well with the standard BN result (first column), thus providing confidence that the circuits obtained following our procedure can be used to study BN's in general.
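The relatedness arithmetic above can be sketched as a short recursion: the relatedness of a person with themselves is 1, two distinct founders are unrelated, and otherwise the younger person is replaced by the average of their two parents. The family tree and the node names below are hypothetical stand-ins for the three-generation example, not the labels used in Fig. 1.

```python
from functools import lru_cache

# Hypothetical three-generation tree: grandparents GF, GM have two sons F1, F2,
# who marry unrelated founders M1, M2 and have children C1, C2 (cousins).
PARENTS = {
    "GF": (), "GM": (), "M1": (), "M2": (),
    "F1": ("GF", "GM"), "F2": ("GF", "GM"),
    "C1": ("F1", "M1"), "C2": ("F2", "M2"),
}

def depth(person):
    """Generation index: 0 for founders, 1 + deepest parent otherwise."""
    parents = PARENTS[person]
    return 0 if not parents else 1 + max(depth(p) for p in parents)

@lru_cache(maxsize=None)
def relatedness(a, b):
    """<a x b> under the rule child = 0.5 * father + 0.5 * mother."""
    if a == b:
        return 1.0
    if depth(a) < depth(b):      # always expand the younger person
        a, b = b, a
    if not PARENTS[a]:           # two distinct founders are uncorrelated
        return 0.0
    return 0.5 * sum(relatedness(p, b) for p in PARENTS[a])

if __name__ == "__main__":
    for pair in [("F1", "F2"), ("C1", "C2"), ("C1", "GF")]:
        print(pair, relatedness(*pair))
```

Expanding only the younger person avoids double counting when one person is an ancestor of the other.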
Genetic relatedness is a textbook concept that provides a good benchmark for a hardware circuit emulator, but the principles presented here can be used to emulate more complicated BN's as well, involving more complex CPT's, as well as more complex nodes with N > 2 parents, reflecting the presence of more than two factors influencing the occurrence of an event.
Fig. 2. Translating nodal information from BN to PSL to circuit: Each node of a BN is described by a conditional probability table (CPT), that of a PSL network is described by dimensionless constants $J$, $h$, and that of a circuit is described by conductances $G$ and voltage $V_{bias}$. The text describes how the CPT is translated to $J$, $h$ and then to $G$, $V_{bias}$ for (a) a zero-parent node, (b) a one-parent node and (c) a two-parent node.

I. PROBABILISTIC SPIN LOGIC: BEHAVIORAL MODEL
PSL is defined by two equations [10] loosely analogous to a neuron and a synapse. The former is a binary stochastic neuron, or what we call a p-bit, whose output $m_i$ is related to its dimensionless input $I_i$ by the relation
$$m_i(t+1) = \mathrm{sgn}\big[\mathrm{rand}(-1,+1) + \tanh I_i(t)\big] \qquad (1a)$$
where $\mathrm{rand}(-1,+1)$ is a random number uniformly distributed between −1 and +1, and $t$ is the normalized time unit. The synapse generates the input $I_i$ from a weighted sum of the states of other p-bits according to the relation
$$I_i(t) = I_0\Big(h_i + \sum_j J_{ij}\, m_j(t)\Big) \qquad (1b)$$
where $h_i$ is the on-site bias and $J_{ij}$ is the weight of the coupling from the $j$-th p-bit to the $i$-th p-bit.
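A minimal behavioral sketch of the two PSL equations may help here. It assumes the usual binary-stochastic-neuron form m_i = sgn[rand(−1,+1) + tanh(I_i)] for the p-bit and I_i = I_0 (h_i + Σ_j J_ij m_j) for the synapse. The function names and the sample count are illustrative choices, not taken from the paper.

```python
import math
import random

def pbit(I, rng):
    """Assumed p-bit rule: m = sgn[rand(-1,+1) + tanh(I)], returned as +1 or -1."""
    return 1 if rng.uniform(-1.0, 1.0) + math.tanh(I) > 0.0 else -1

def synapse(i, m, h, J, I0=1.0):
    """Assumed synapse rule: I_i = I0 * (h_i + sum_j J_ij * m_j)."""
    return I0 * (h[i] + sum(J[i][j] * m[j] for j in range(len(m))))

def mean_output(I, n_samples=200_000, seed=1):
    """Time-averaged output of a single p-bit held at a fixed input I."""
    rng = random.Random(seed)
    return sum(pbit(I, rng) for _ in range(n_samples)) / n_samples

if __name__ == "__main__":
    for I in (-1.0, 0.0, 0.5, 2.0):
        print(f"I = {I:+.1f}: <m> = {mean_output(I):+.3f}, tanh(I) = {math.tanh(I):+.3f}")
```

Because the random term is uniform on (−1, +1), the output is +1 with probability (1 + tanh I)/2, so the average output approaches tanh(I) as the number of samples grows.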

II. FROM BN NODES TO PSL NODES
To relate $I_i$ to the conditional probability $P_i$ for $m_i$ to be 1, we note from Eq. 1a that the average value of $m_i$ is $\tanh(I_i)$ and this must equal $2P_i - 1$. Making use of Eq. 1b we can write
$$h_i + \sum_j J_{ij}\, m_j = \frac{1}{I_0}\tanh^{-1}(2P_i - 1)$$
We use this relation to translate the $P_i$ from the CPT into $J$, $h$ in the PSL model, but the details differ depending on the number of "parents" of node $i$ (Fig. 2).
Nodes with no parents have no connecting weights $J_{ij}$, only a bias $h_i$ which is related to the specified conditional probability $p_i$ by $h_i = (1/I_0)\tanh^{-1}(2p_i - 1)$.
Nodes with one parent have one connecting weight $J_{ij}$ and a bias $h_i$, which can be obtained from the two specified conditional probabilities $q_i$, $r_i$ by writing this relation once for each parent state $m_j = \pm 1$, giving two equations for the two unknowns.
Nodes with two parents have two connecting weights $J_{i1}$, $J_{i2}$ and a bias $h_i$, but there are four equations for these three unknowns. All equations can be satisfied simultaneously only if they are not linearly independent. If they are independent, then an auxiliary node X is introduced so that
$$h_i + J_{i1} m_1 + J_{i2} m_2 + J_{iX} m_X = \frac{1}{I_0}\tanh^{-1}(2P_i - 1) \qquad (4)$$
where $m_X = \tanh(h_X + J_{X1} m_1 + J_{X2} m_2)$, with the parents $m_1$, $m_2$ equal to $(\pm 1, \pm 1)$ as appropriate for the four equations. One possibility is to choose $h_X$, $J_{X1}$, $J_{X2}$ such that $m_X = m_1 \cap m_2$ and then select the four remaining unknowns $h_i$, $J_{i1}$, $J_{i2}$, $J_{iX}$ to satisfy Eqs. 4.
Nodes with $N$ parents have a total of $(N + 1)$ unknowns, but there are $2^N$ equations to satisfy. Depending on the number of linearly independent equations, it is necessary to introduce the appropriate number of auxiliary variables. In this letter we will only present results for the BN in Fig. 1, which includes nodes with a maximum of $N = 2$ parents. Moreover, the CPT for the 2-parent nodes is assumed to be of the form $t = u = 0.5$, $s = 1 - \varepsilon$ and $v = \varepsilon$, $\varepsilon$ being a small number introduced to avoid the singularities associated with the tanh function. With this CPT, no auxiliary node (X) is needed.
Fig. 3. Circuit implementation of building block: The circuit Eqs. 5 can be mapped onto the PSL Eqs. 1 using Eqs. 6 as described in the text. The circuit node $M_i$ is defined to include the transimpedance amplifier along with the p-bit. The details of the embedded MRAM based p-bit are discussed in the text.
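The CPT-to-(J, h) translation described in this section can be sketched as follows. Several things here are assumptions rather than statements from the paper: the two one-parent probabilities are taken as `p_up` (parent = +1) and `p_dn` (parent = −1); the two-parent entries s, t, u, v are taken to correspond to parent states (+1,+1), (+1,−1), (−1,+1), (−1,−1); and ε = 1e-4 is an illustrative value, since the text does not quote one. The two-parent solver projects the four target inputs onto the orthogonal patterns 1, m1, m2 and m1·m2. A nonzero m1·m2 component (`residual`) signals that the four equations are independent and an auxiliary node would be needed.

```python
import math

def target_input(p):
    """tanh^-1(2p - 1): the input that makes a p-bit equal +1 with probability p."""
    return math.atanh(2.0 * p - 1.0)

def zero_parent(p, I0=1.0):
    """Bias h for a node with no parents."""
    return target_input(p) / I0

def one_parent(p_up, p_dn, I0=1.0):
    """Solve h + J = a_up and h - J = a_dn for (h, J)."""
    a_up, a_dn = target_input(p_up) / I0, target_input(p_dn) / I0
    return (a_up + a_dn) / 2.0, (a_up - a_dn) / 2.0

def two_parent(s, t, u, v, I0=1.0):
    """Return (h, J1, J2, residual) for parent states (++, +-, -+, --)."""
    a_pp, a_pm, a_mp, a_mm = (target_input(x) / I0 for x in (s, t, u, v))
    h = (a_pp + a_pm + a_mp + a_mm) / 4.0
    J1 = (a_pp + a_pm - a_mp - a_mm) / 4.0
    J2 = (a_pp - a_pm + a_mp - a_mm) / 4.0
    residual = (a_pp - a_pm - a_mp + a_mm) / 4.0  # part no (h, J1, J2) can fit
    return h, J1, J2, residual

if __name__ == "__main__":
    eps = 1e-4  # assumed value of the small number in the CPT
    print(two_parent(1.0 - eps, 0.5, 0.5, eps))
```

With this assumed ε, the family-tree CPT gives h ≈ 0, a vanishing residual, and J1 = J2 ≈ 2.30, close to the J0 = 2.3026 quoted later in the text. An XOR-like CPT, by contrast, has a large residual.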

III. FROM PSL NODES TO CIRCUIT NODES
To translate the PSL into a circuit we use the embedded MTJ [10] whose output is related to its input by the relation
$$V_{out,i} = \frac{V_{DD}}{2}\,\mathrm{sgn}\Big[\mathrm{rand}(-1,+1) + \tanh\frac{V_{in,i}}{V_0}\Big] \qquad (5a)$$
where $\pm V_{DD}/2$ are the supply voltages, and $V_0$ is a parameter (∼ 50 mV) describing the width of the sigmoidal response. Although $V_0$ is a fitting parameter for the tanh function, it captures the actual sigmoidal response of the MTJ unit quite well. Even if the response deviates slightly from the tanh function due to the skewness of the MTJ response, this will not cause a noticeable difference in the output, since PSL is quite robust against noise [10]. The output voltages are connected back through conductances $G$ with a transimpedance amplifier having a feedback resistor $R_f$, so that (see Fig. 3) the input voltage $V_{in,i}$ of each node is a weighted sum of the bias voltage and the output voltages of the nodes connected to it (Eq. 5b). Eqs. 5 can be mapped onto the PSL Eqs. 1 through the definitions in Eqs. 6, which relate $m_i$, $h_i$ and $J_{ij}$ to the circuit quantities; in particular, comparing Eq. 5a with Eq. 1a gives $m_i = V_{out,i}/(V_{DD}/2)$ and $I_i = V_{in,i}/V_0$.
Fig. 4. Full circuit for the Bayesian network in Fig. 1a. (a) Circuit diagram, (b) Typical stochastic nodal voltages from which nodal correlations can be obtained using an XNOR gate and a long time constant RC circuit. In the present example, the following parameters are used: the RC circuit uses R = 200 kΩ and C = 200 fF, with $R_f$ = 150 kΩ and $I_0 = 1$, and the dimensionless weights $J_{ij} = J_0 = 2.3026$ are used to obtain the conductances $G_{ij}$ from Eq. 6. A simulation time step of 1 ps is used in HSPICE, which combines the self-consistent stochastic LLG with Predictive Technology Models (PTM) [28] as in [5]. All transistors use the 14 nm HP-FinFET node with minimum fin numbers (nfin=1). The XNOR gate is designed as a standard 14 transistor CMOS circuit, inverting an XOR output.
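As a hedged illustration of how dimensionless weights could be turned into conductances, the derivation below assumes an ideal summing transimpedance stage whose net sign is positive, and introduces $G_i$ as a hypothetical symbol for the conductance that connects the bias voltage. Neither assumption is confirmed by the text, so the result should be read as a sketch of the scaling and not as the paper's Eq. 6.

```latex
\begin{align*}
% Assumed normalization of the circuit variables
m_i &= \frac{V_{out,i}}{V_{DD}/2}, \qquad I_i = \frac{V_{in,i}}{V_0} \\
% Assumed ideal summing stage (sign convention is an assumption)
V_{in,i} &= R_f \Big( G_i\, V_{bias,i} + \sum_j G_{ij}\, V_{out,j} \Big) \\
% Divide by V_0 and match term by term with I_i = I_0 (h_i + \sum_j J_{ij} m_j)
I_0\, h_i &= \frac{R_f\, G_i\, V_{bias,i}}{V_0}, \qquad
I_0\, J_{ij} = \frac{R_f\, G_{ij}\, V_{DD}}{2 V_0} \\
% Solve for the coupling conductance
G_{ij} &= \frac{2\, V_0\, I_0\, J_{ij}}{R_f\, V_{DD}}
\end{align*}
```

Under these assumptions the conductance scales linearly with the weight $J_{ij}$ and inversely with $R_f$ and $V_{DD}$, so a larger feedback resistor allows smaller coupling conductances for the same weights.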

IV. SPICE-BASED P-BIT MODEL
In order to design the basic building block for the BN based on Eq. 5a, we follow the p-bit design in Ref. 5, which describes an embedded MTJ structure with a stochastic nanomagnet. We consider the weight logic in Eq. 1b to be implemented using an ideal transimpedance amplifier with resistors [10], though a capacitive network with a more compact implementation could also be used to implement the weighted sum operation [24]. We use the same parameters for the p-bit as in [5]: a circular (stochastic) in-plane nanomagnet with negligible uniaxial anisotropy ($H_K \sim 0$) [25,26], damping coefficient for the nanomagnet $\alpha = 0.01$, saturation magnetization $M_s = 1100$ emu/cc, a free layer diameter of 22 nm and a thickness of 2 nm. A Tunneling Magnetoresistance (TMR) value of 110% is used based on [27]. The MTJ conductance is assumed to be bias-independent and is given by
$$G(t) = G_0\left[1 + m_z(t)\,\frac{\mathrm{TMR}}{2 + \mathrm{TMR}}\right]$$
where $m_z(t)$ is provided to the model by a self-consistent solution of the sLLG (stochastic Landau-Lifshitz-Gilbert equation) solver. The device operation is based on the control of the transistor conductance through the input voltage. The non-linear transistor characteristics with respect to drain, gate and source voltages are captured in simulation by the 14 nm HP-FinFET node from the Predictive Technology Models (PTM) [28]. When the transistor conductance is much greater or less than the MTJ conductance, the output shows little noise, but when the MTJ conductance is matched to the transistor ON resistance around $V_{in,i} = 0$, there are large fluctuations at the output. In Fig. 3, each circular dot in the sigmoid is obtained by averaging a 1 µs response of the stochastic output, and the dashed lines show a tanh fit with $V_0 = 50$ mV. The solid lines are obtained by sweeping the input voltage rail-to-rail in 100 ns and plotting the output voltage against the input voltage.
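The conductance expression G(t) = G0 [1 + m_z(t) TMR/(2 + TMR)] can be checked numerically with a few lines. The sketch below evaluates it at the two extreme magnetization states and at the midpoint. The value of `g0` is an arbitrary placeholder, and reading m_z = +1 as the high-conductance (parallel) state is an assumption about the sign convention.

```python
def mtj_conductance(m_z, g0, tmr):
    """Bias-independent MTJ conductance: G = G0 * [1 + m_z * TMR / (2 + TMR)]."""
    return g0 * (1.0 + m_z * tmr / (2.0 + tmr))

if __name__ == "__main__":
    g0 = 1.0      # placeholder average conductance (arbitrary units)
    tmr = 1.10    # TMR of 110%, as quoted in the text
    g_high = mtj_conductance(+1.0, g0, tmr)
    g_low = mtj_conductance(-1.0, g0, tmr)
    print("G(+1)/G(-1) =", g_high / g_low)
    print("G(0)/G0     =", mtj_conductance(0.0, g0, tmr) / g0)
```

The ratio of the two extreme conductances works out to 1 + TMR, which is consistent with the usual definition of TMR as the fractional resistance change between the two states, and G0 is the conductance at m_z = 0.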
Within the modular SPICE framework, the magnetization dynamics of the circular stochastic nanomagnet is captured by solving the sLLG equation in the macrospin assumption, where $\alpha$ is the damping coefficient, $\gamma$ is the electron gyromagnetic ratio, $N = M_s \mathrm{Vol.}/\mu_B$ is the total number of Bohr magnetons in the magnet volume, $M_s$ is the saturation magnetization, and $\vec{H} = \vec{H}_d + \vec{H}_n$ is the effective field. The effective field includes the out-of-plane ($\hat{x}$ directed) demagnetization field $\vec{H}_d = -4\pi M_s m_x \hat{x}$, as well as the thermally fluctuating magnetic field $\vec{H}_n$ due to three dimensional uncorrelated thermal noise, with zero mean $\langle H_n \rangle = 0$ and variance $\langle H_n^2 \rangle = 2\alpha kT/(|\gamma| M_s \mathrm{Vol.})$ along each direction. $\vec{I}_S = P I_C \hat{z}$ is the spin current along the MTJ fixed layer direction ($\hat{z}$), where $P$ is the polarization of the fixed magnet. The model takes this spin current ($I_S$) incident to the free layer into account and, for the parameters we have used, this current does not cause appreciable pinning of the free layer. A time step $\Delta t = 1$ ps is used for the circuit simulation, which implies a noise bandwidth of $\Delta f = 1$ THz.
Fig. 4a shows the full circuit assembled using the nodes defined in Fig. 3 to mimic the Bayesian network in Fig. 1a. Fig. 4b shows typical nodal voltages obtained from a SPICE simulation, whose correlations can either be calculated in software or measured using an XNOR gate to multiply them as shown and finding the long term average with an RC circuit having a time constant $T$:
$$\langle m_i\, m_j \rangle = \frac{1}{T}\int_0^T m_i(t)\, m_j(t)\, dt$$
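The XNOR-plus-RC measurement can be mimicked in software as a rough sketch. For logic levels encoded as ±1, the XNOR of two signals equals their product, and a first-order low-pass filter plays the role of the RC averager. Everything below is illustrative: the two input streams are synthetic and independent from sample to sample (real p-bit outputs are correlated in time), and the filter time constant is expressed in units of the sample interval, not tied to the R = 200 kΩ, C = 200 fF (RC = 40 ns) values quoted for the circuit.

```python
import random

def xnor_bipolar(a, b):
    """XNOR of two logic levels encoded as +1/-1; equal to the product a*b."""
    return 1 if a == b else -1

def rc_average(samples, tau_samples):
    """Backward-Euler discretization of an RC low-pass; returns the final state."""
    alpha = 1.0 / (tau_samples + 1.0)
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
    return y

def correlated_streams(n, rho, seed=0):
    """Two synthetic +1/-1 streams whose product averages to rho."""
    rng = random.Random(seed)
    a = [rng.choice((-1, 1)) for _ in range(n)]
    b = [x if rng.random() < (1.0 + rho) / 2.0 else -x for x in a]
    return a, b

if __name__ == "__main__":
    a, b = correlated_streams(200_000, rho=0.5, seed=3)
    products = (xnor_bipolar(x, y) for x, y in zip(a, b))
    print("filtered XNOR output:", rc_average(products, tau_samples=20_000))
```

The filter output settles near the true correlation only after several time constants, and its residual ripple shrinks as the time constant grows, which is why a long time constant RC circuit is used.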

V. SPICE-BASED CIRCUIT MODEL
These nodal correlations in the circuit can be used to compute the correlation between causally connected real world variables. For example, the relatedness values of different family members cited in Fig. 1b are obtained in this way.
Note that this is an asynchronous circuit with no clocks of any kind. This is particularly interesting since the corresponding PSL simulations require p-bits to be updated sequentially from parent to child node. In the SPICE circuit simulation there is no central clock to enforce an updating sequence, but our results show that the correlations are in good agreement with the PSL results and with Bayes' theorem. However, such an asynchronous operation works only if the interconnect delays, for example from node FF1 to FF2, are much shorter than the time scale of the nanomagnet fluctuations, as discussed in Reference 8. Since magnetic fluctuations occur at ∼ns time scales, this condition is naturally satisfied. The slight mismatch between Bayes' theorem and the PSL model appears to decrease systematically with increasing sample size (N=1e7 for the examples shown in Fig. 1b), with the full circuit model closely following them, but the updating issue in asynchronous operation deserves further study. We have not considered variations in the thermal barriers or interconnect delays in this paper; these also require further study.
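The sequential parent-to-child update used in the behavioral PSL simulations can be sketched for the smallest interesting case, two siblings sharing both parents. The sketch assumes the p-bit rule m = sgn[rand(−1,+1) + tanh(I)], unbiased parents (h = 0), and children coupled to each parent with the J0 = 2.3026 and I0 = 1 quoted in the text. The function names, seed and sample count are illustrative.

```python
import math
import random

J0 = 2.3026  # dimensionless weight quoted in the text
I0 = 1.0     # input scale quoted in the text

def pbit(I, rng):
    """Assumed p-bit rule: m = sgn[rand(-1,+1) + tanh(I)], returned as +1 or -1."""
    return 1 if rng.uniform(-1.0, 1.0) + math.tanh(I) > 0.0 else -1

def sibling_relatedness(n_samples, seed=0):
    """Estimate <C1 x C2> by updating parents first, then both children."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        father = pbit(0.0, rng)                    # zero-parent nodes, no bias
        mother = pbit(0.0, rng)
        I_child = I0 * (J0 * father + J0 * mother) # synapse with h = 0
        c1 = pbit(I_child, rng)
        c2 = pbit(I_child, rng)
        total += c1 * c2
    return total / n_samples

if __name__ == "__main__":
    print("estimated sibling relatedness:", sibling_relatedness(100_000))
```

When the parents agree, both children follow them almost deterministically; when they disagree, each child is an unbiased coin flip. The estimate should therefore land close to the textbook value of 0.5, with statistical scatter that shrinks as the sample count grows.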

VI. CONCLUSIONS
In summary, we have used SPICE simulations to show that using existing MRAM technology it should be possible to build p-circuits that mimic Bayesian networks such that each stochastic node is represented by a stochastic p-bit. We show that the ensemble-averaged correlations between the actual physical variables can be estimated from the time-averaged correlations between the voltages at the corresponding nodes which are measured electronically with XNOR logic gates and long time constant RC circuits, thus requiring no software-based processing of any kind. Our results could open up a new application space for Embedded MRAM technology with minimal modifications.

VII. ACKNOWLEDGMENT
This work was supported by the National Science Foundation (NSF).