Fan-out and Fan-in properties of superconducting neuromorphic circuits

Neuromorphic computing has the potential to further the success of software-based artificial neural networks (ANNs) by designing hardware from a different perspective. Current research in neuromorphic hardware targets dramatic improvements to ANN performance by increasing energy efficiency and speed of operation, and even seeks to extend the utility of ANNs by natively adding functionality such as spiking operation. One promising neuromorphic hardware platform is based on superconductive electronics, which has the potential to incorporate all of these advantages at the device level, in addition to offering near-lossless communication both within the neuromorphic circuits and between disparate superconductive chips. Here we explore one of the fundamental brain-inspired architecture components, the fan-in and fan-out, as realized in superconductive circuits based on Josephson junctions. From our calculations and WRSPICE simulations we find that the fan-out should be limited only by junction count and circuit size, and we demonstrate results in simulation at a level of 1-to-10,000, similar to that of the human brain. We find that fan-in has more limitations, but a fan-in level on the order of a few hundred to 1 should be achievable based on current technology. We discuss our findings and the critical parameters that set the limits on fan-in and fan-out in the context of superconductive neuromorphic circuits.


I Introduction
In recent years there has been an increase in the study of neuromorphic circuits made from superconducting electronics. 1 2 3 4 5 6 7 8 9 10 11 12 The single flux quantum (SFQ) pulse that is released from a Josephson junction when the applied current briefly exceeds its critical current has a lot in common with the action potential in neurons that is released when the membrane potential exceeds its threshold. Superconducting circuits based on Josephson junctions are being developed to generate pulses, send pulses down output lines, transmit variable feed-forward signals, and sum the signals from different sites in a network; these circuits represent the electrical equivalent of biological somas, axons, synapses and dendrites, respectively. As a result, it is possible to conceive of building spiking neural networks (SNNs) completely from superconducting electronics. SNNs are used for pattern recognition in spatio-temporal data sets and are considered by some to have superior information-processing capabilities to the basic artificial neural networks (ANNs) used in "deep learning." 13 They are also good analogs of the human brain and are used to study its computational capabilities. 14 SNNs made from superconducting electronics have the potential to be both faster and more energy-efficient 15 16 than those made from silicon circuits.
In superconducting digital circuits, the presence or absence of an SFQ pulse during a clock cycle represents the 1 or 0 of digital logic. As a result a large variety of circuits that manipulate and control SFQ pulses are available including flip-flops, logic gates, shift registers, transmission lines, splitters, mergers and comparators. 17 In moving to neuromorphic circuits, much of the effort has been to simply redesign and repurpose these circuits to perform the necessary functions in a neuromorphic 18 environment. Fan-in is the number of inputs that a circuit element can accept, whereas fan-out refers to the number of loads that can be connected to a circuit element's output. One of the weaknesses of SFQ logic is its poor fan-in and fan-out, typically only 2 or 3 at most. 19 Given the extraordinary fan-out of the human brain (~ 10,000), this is a huge concern for the utility of SFQ based neuromorphic circuits.
In this paper we explore the problem of fan-out and fan-in in superconducting neuromorphic circuits. Although there are many similarities to the fan-in and fan-out properties of superconducting digital circuits, there are also important differences, most notably the analog nature of the synapse. We make some basic assumptions about a possible system architecture and show both flux- and current-based methods of fan-in and fan-out. We use both circuit simulations and calculations to show how these approaches might scale and what factors might limit them in pushing to larger networks. With modest assumptions about fabrication parameters like inductances and critical currents, we show how a fan-out and fan-in of order 100 is possible. Although this falls short of the human brain, it brings the technology in line with silicon neuromorphic computing, without loss of the advantages of low power and high speed.
The paper is organized as follows. In Section II we give an overview of the problem, assuming a basic system architecture. In Section III we show two possible ways of fanning out, including splitter trees and a current fan-out. In Section IV we show two methods of fan-in, one with flux and one with current. In Section V we look at the limiting factors for each of these as one scales to larger networks. Finally, in Section VI we conclude and suggest future work.
Figure 1. Schematic of the fan-in and fan-out of a spiking neural network (SNN). 1a) Schematic of an SNN, with the neurons indicated as ovals and the synapses indicated as connecting lines. Neural spike patterns in time are the inputs and outputs of the system. The neurons are divided into two halves: the input half, which sums the inputs, and the output half, which generates a spike when the threshold is exceeded. The weighting occurs along each connecting line and is not shown explicitly. 1b) The number of signals that are summed at the input forms the fan-in (NFI), while the number of output lines that a spike is driven down forms the fan-out (NFO). The input signals are of varying weight, both positive and negative, while the output signals are all the same amplitude and shape.

II Overview
In this section we carefully define the problem of fan-in and fan-out, first in generalized spiking neural networks and then, by assuming a simplified architecture, in a superconducting electronics implementation. Since there is as of yet no agreed-upon architecture for a superconducting neural network, some of our results will change as different architectures are proposed and tested. However, we believe that many of the ideas developed here are general enough that they will translate into future experiments regardless of architecture. Figure 1 shows a general schematic diagram of a spiking neural network (SNN). Indicated are spiking neurons, synapses, and propagation lines that represent axons and dendrites. Neurons are broken into two parts: the input, where signals from many synapses are summed together, and the output, where a spike is emitted when the sum of the inputs exceeds a threshold. The number of signals summed together on the input constitutes the fan-in, while the number of output lines that a spike is propagated down is the fan-out. The fan-in and fan-out are about the same (on average) for a given network, since most signals start in one neuron and end in another, with the exception of the inputs and outputs to the system.
One important distinction needs to be made about the signals from neurons and synapses. The signals that flow out of neurons (fan-out), called the spikes or "action potentials" in a biological setting, are essentially digital: they all have about the same amplitude and shape, which are fixed by the ion channel dynamics inside the soma. For this reason, the SFQ pulses of superconducting digital logic are ideal to serve as action potentials. Meanwhile, the signals that flow out of synapses into neurons (fan-in), the so-called "post-synaptic potentials," are analog in nature: although they typically have a similar shape, their amplitude depends on the strength of the synapse and they can be either positive (excitatory) or negative (inhibitory). 20 In many implementations of a spiking neural network, synaptic signals are represented digitally; for example, IBM's TrueNorth used a 4-bit digital number to represent the amplitude of the synaptic current. 21 However, this represents a quantity which is still inherently analog, and the bit depth that is required to capture this analog behavior is the topic of active investigation. 22 These ideas make the role of fan-in and fan-out circuitry clearer. For fan-in, we need to sum the analog signals from different synapses coming into the soma. For fan-out, we need to take the output signal of the soma and generate identical copies, each to be sent down a separate line to a synapse.
In our superconducting implementation, we assume a simple two-junction neuron similar to the Josephson junction neuron (JJ neuron) already proposed. 1 Although a single junction will generate a spike when driven above threshold, for larger input currents an unwanted offset will develop; this is avoided in a two-junction neuron. The two junctions also give the neuron more biological realism, as each junction can be likened to an ion channel, with the resulting reproduction of several biologically realistic neuron behaviors. 1 The input to the neuron is either a current or a flux, and the threshold is defined when an SFQ pulse is produced from a given input. This can be tuned by adjusting circuit parameters such as a bias current.

III Fan-out
In this section we will discuss two potential methods for fan-out, flux-based and current-based, that can be used in superconducting neuromorphic circuits. Because the fan-out in these circuits is effectively a digital operation, one can start with a fan-out circuit that is a series of nested splitters like those typically used in digital SFQ circuits. 17 Figure 2a shows the schematic of the binary splitter circuit, which will be nested in layers to achieve a larger fan-out. The input to the first layer is an SFQ pulse coming from a "pre-synaptic" JJ neuron and the output at the final splitter layer will be transmitted to a synapse. In the intermediate layers, the input will be from the previous splitter layer and the output will go to the subsequent splitter layer. With a network of such splitters one can fan out to a given power of 2 with an overhead of 3NFO − 3 junctions, where NFO is the final fan-out number. It is worth noting that the junction biases can be run in series, meaning that only two lines would be needed for the fan-out circuit. Figure 2b shows the current pulse at each layer of a binary splitter, as labeled to the right of the trace, that has been nested seven times to achieve a fan-out of 1-to-128. Note that the output is nearly identical for each trace aside from the slight time delay, as desired. The output was taken at a random location within a given layer and corresponds to the current through the inductor after division (L2). In this example we feed an initial DC to SFQ converter with a 25 GHz triangle wave; its output is fed into the input of the first splitter layer at the node marked "in" in Fig. 2a.
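The junction overhead of the splitter tree can be checked with a few lines of Python. This is a minimal sketch, assuming that the 3NFO − 3 overhead quoted above corresponds to three junctions in each of the NFO − 1 binary splitters; the function name `splitter_tree` is hypothetical and introduced only for illustration.

```python
JUNCTIONS_PER_SPLITTER = 3  # assumption: inferred from the 3*N_FO - 3 overhead


def splitter_tree(n_fo):
    """Return (layers, junction_overhead) for a 1-to-n_fo binary splitter tree."""
    if n_fo < 2 or n_fo & (n_fo - 1):
        raise ValueError("n_fo must be a power of two, at least 2")
    layers = n_fo.bit_length() - 1  # log2(n_fo) nested layers
    splitters = n_fo - 1            # a full binary tree has n_fo - 1 internal nodes
    return layers, JUNCTIONS_PER_SPLITTER * splitters


for n in (8, 128, 16384):
    layers, junctions = splitter_tree(n)
    print(f"1-to-{n}: {layers} layers, {junctions} junctions")
```

For the 1-to-128 example this gives seven layers, matching the seven nested stages described for Fig. 2b.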
In addition to the flux-based fan-out shown in Fig. 2, one can also use a current-based fan-out as shown in Fig. 3a. In this fan-out a pre-synaptic current spike is resistively split and then reamplified with an amplifying Josephson transmission line (JTL), which can be seen as the increasing Ic values of the junctions in Fig. 3a. We have run simulations of this scheme at up to a direct division of 1-to-10. This structure is also nestable, meaning that using 1-to-10 splitters one could get to a fan-out of 1-to-1000 in three layers. It should be easier to modify this structure anywhere between a fan-out of 2 and a fan-out of 10 for any single stage, which could reduce the potential overhead if a non-power-of-two fan-out is desired. However, a downside of this design is that when nesting 1-to-10 fan-out splitters, the JTL amplifier required at least 4 stages, meaning that the final junction count overhead is ~4NFO, where NFO is the final fan-out number. More precisely, in the case where the reamplification takes an M-junction JTL, the junction count is

$$M \sum_{i=1}^{k} J^{i} = M\,\frac{J\,(J^{k} - 1)}{J - 1}$$

where k is the number of stages and J is the single-stage fan-out (10 in this example). Figure 3b shows the output of the last-stage junction at each of the three stages in a simulated 1-to-1000 fan-out. We note that the final stage is terminated with a resistor and inductor, which is why its pulse shape is slightly different from the other two stages as they are driving a 10  resistor divide plus the first stage of the JTL re-amplifier. In this example we again feed an initial DC to SFQ converter with a 25 GHz triangle wave; its output is fed into the input node marked in Fig. 3a.

IV Fan-in
In this section we will discuss two methods for fan-in that can be used in neuromorphic SFQ circuits. Unlike the fan-out, we envision the fan-in to be an analog operation. Figure 4a shows the basic schematic for a flux-based fan-in. This circuit is designed for the input on each of the lines (Iin_1 through Iin_NFI) to have already been weighted by a synaptic circuit. With flux-based fan-in, excitatory or inhibitory connections can be made using positive or negative inductive coupling. The summation loop has NFI small mutual inductors, two JJs and one more inductor to adjust the behavior of the loop, as well as a bias current, which can adjust the threshold.
While both fan-out circuits were digital and nested to achieve arbitrary levels of fan-out, the fan-in circuit is inherently analog and therefore cannot be nested in the same way. This means that the scaling of these circuits will be under further constraints, the details of which are discussed below. In the simulations presented here, we take a worst-case scenario of trying to drive the summation loop that is the post-synaptic JJ neuron above threshold with a single incoming SFQ pulse. For small fan-in sizes this is relatively easy, and we show here that even under these "worst-case" conditions we are able to successfully drive a fan-in of 128-to-1. In order to drive the summation loop above threshold with a single input we had to decrease the critical current of the JJs in the post-synaptic JJ neuron. If one wants to extend the network further than a single layer, one then needs to reamplify the signal back to the original level so that it can be fanned out into the next layer of the network. We include this reamplification in our simulated circuits as shown in Fig. 4a, with the amplifying JTL between LSQ_1 and the output. Figure 4b shows the results of the simulation of a 128-to-1 fan-in circuit operating at 25 GHz under the single active input condition. In this simulation, we started with the fan-out circuit described in Fig. 2. However, only 1 of the final 128 junctions was connected back to the input pulses coming from the DC to SFQ input. The rest of the splitter network was terminated to ground on the input to ensure that there were no spikes coming from these JJs while maintaining the impedance of the network to simulate any potential cross-talk effects. In our simulation one full-weight pulse is then defined by the final stage of a JTL amplifier after the fan-out with Ic = 500 µA.
The traces in Fig. 4b show the current pulse through one of the coupling inductors (LSQ_1 in Fig. 4a) in black, the output current of the summation loop through Lout_0 in red, and the output current through Lout_3 after reamplification in blue. Figure 4c shows the phase response of JJSQ_2, exhibiting 2π phase steps for each output pulse of the post-synaptic JJ neuron, which confirms the desired SFQ-type operation.
In addition to the flux-based fan-in just described, we also investigated current-based fan-in. Figure 5a shows the basic schematic for a current-based fan-in. This circuit is designed for the input on each of the lines (Iin_1 through Iin_NFI) to have already been weighted by a synaptic circuit. We again run simulations for the "worst-case" scenario of trying to drive the post-synaptic JJ neuron with a single input pulse. Figure 5b shows the results of simulations of the current-based fan-in of 10-to-1 operating at 25 GHz. The current pulse through LSQ_1 in the summation loop is shown in black, the output current of the summation loop through Lout_0 is shown in red, and the output current through Lout_3 after reamplification is shown in blue. Figure 5c shows the phase evolution of JJSQ_2, which demonstrates both the 2π phase steps that accompany the SFQ pulsing of the post-synaptic JJ neuron and the desired SFQ operation of this part of the circuit. Note that Iin_1 was taken as an output spike from a flux-based fan-out of 1-to-10 and was then amplified with a JTL amplifier with a final stage of 200 µA. All other inputs were grounded so as not to cause any additional input current, but to properly model the loss of current running back into the inactive inputs, which we will refer to as the cross-talk.

V Constraints and limits for large networks
For larger networks, the fan-out methods presented in Section III will be limited only by area, junction count, power, and delay. There is no fundamental limit to how many signals can be fanned out with either the splitter method or the current method. The scaling of the junction count with fan-out for each method is given in Fig. 6. The splitter method seems like the better choice, but depending on other design constraints it is possible that the current method might still be viable.
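The junction-count scaling of the two fan-out methods can be compared numerically. The sketch below assumes 3NFO − 3 junctions for the splitter tree, and for the current method assumes that every branch of every stage carries its own M-junction JTL amplifier; that second assumption is our reading of the ~4NFO overhead quoted in Section III, not a formula taken from Fig. 6. Both function names are hypothetical.

```python
def splitter_junctions(n_fo):
    """Junction overhead of a binary splitter tree with n_fo outputs."""
    return 3 * n_fo - 3


def current_fanout_junctions(j, k, m):
    """Junction overhead of k nested 1-to-j resistive splits.

    Assumes an m-junction JTL amplifier on each of the j**i branches
    of stage i (an illustrative assumption, see the lead-in).
    """
    return m * sum(j ** i for i in range(1, k + 1))


# Splitter tree at the nearest power of two versus three 1-to-10 stages.
print("splitter, 1-to-1024:", splitter_junctions(1024))
print("current,  1-to-1000:", current_fanout_junctions(10, 3, 4))
```

Under these assumptions the current method costs about 4.4 junctions per output against 3 for the splitter tree, which is consistent with the splitter method being the better choice on junction count alone.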
Depending on the architecture chosen for a superconducting SNN one may want to maintain a constant pulse timing. Figure 7a shows a block diagram schematic of binary splitters connected in a way that maintains a constant time delay at a single output line. In this example the timing is accomplished using constant wire path lengths and junction counts for all outputs. If one were to relax the timing requirement or the single output line requirement, then the circuit size could be reduced. In the example shown in Fig. 7a the splitters were nested to three layers, giving a fan-out of 1-to-8. One could also extend this structure arbitrarily deep depending on the desired level of fan-out. In this case, the size of the fan-out circuit will be determined primarily by two factors. The size in the vertical direction (as illustrated in Fig. 7a) is set by the minimum pitch of the JJs, and the size in the horizontal direction is set by the required inductance. Using this nominal two-wiring-layer layout, we estimate that a 1-to-128 fan-out circuit would occupy roughly 175 µm x 200 µm. To arrive at this estimated area, the following assumptions were used: the minimum JJ diameter is 500 nm, and the separation between JJ edges is 1 µm. This yields a 1.5 µm pitch for the JJs in the final output line, which results in a maximum vertical length of ~200 µm for a fan-out of 1-to-128. To estimate the space required in the corresponding horizontal direction we assumed that the inductance of the strip-lines used for wiring can be varied between roughly 0.05 pH/µm and 0.9 pH/µm. Using the structure shown in Fig. 7a with the wiring as the source of inductance we estimate that the horizontal length of a 1-to-128 fan-out circuit would be ~175 µm. Reduction in the horizontal length could be achieved with the use of high-inductance layers or by adding meanders in the wiring, an example of which is shown in Fig. 7a with dashed lines. Reduction in the vertical length could be achieved by reducing the JJ diameter and/or by reducing the space between JJs.
If space and time delay permit, then the binary splitters can be nested much deeper, including to levels similar to the human brain. Figure 7b shows example output current spikes from a fan-out circuit using the same basic parameters as outlined in Fig. 2 but extending the circuit to 14 layers deep, resulting in a fan-out of 1-to-16,384. This 1-to-16,384 fan-out circuit would occupy an area of roughly 350 µm x 25 mm using the same assumptions as for the 1-to-128 estimate. This circuit is quite large, and one would hope that further scaling of the JJ technology will help to reduce this size in the future.
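The area estimates above follow from simple arithmetic, sketched here in Python. The vertical length uses the 1.5 µm JJ pitch stated in the text. The horizontal length per layer (25 µm) is an assumption inferred from the quoted 175 µm for 7 layers and 350 µm for 14 layers; it is not a parameter given by the layout itself, and the function names are hypothetical.

```python
JJ_DIAMETER_UM = 0.5   # minimum JJ diameter quoted in the text
JJ_EDGE_GAP_UM = 1.0   # separation between JJ edges quoted in the text
LAYER_WIDTH_UM = 25.0  # assumption: 175 um / 7 layers, inferred from the estimates


def vertical_length_um(n_fo):
    """Length of the output column: one JJ pitch per output line."""
    return n_fo * (JJ_DIAMETER_UM + JJ_EDGE_GAP_UM)


def horizontal_length_um(layers):
    """Width of the splitter tree, assuming a fixed width per nested layer."""
    return layers * LAYER_WIDTH_UM


for n_fo, layers in ((128, 7), (16384, 14)):
    v = vertical_length_um(n_fo)
    h = horizontal_length_um(layers)
    print(f"1-to-{n_fo}: {h:.0f} um x {v:.0f} um")
```

The computed vertical lengths, 192 µm and about 24.6 mm, round to the ~200 µm and ~25 mm figures quoted in the text.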
There is one additional constraint that must be considered as the fan-out is nested to larger circuits, which is the inclusion of repeater junctions (or JTLs), shown in red in Fig. 7a. These are needed once the impedance of the wire becomes too large for the JJs to drive within the SFQ regime and will minimally increase the junction count. In the 1-to-16,384 fan-out example, the longest wire (following the nominal layout of Fig. 7a) is roughly 12,300 µm. This would require roughly 122 additional junctions spread through the first 5 layers of the fan-out, if we assume wiring with 0.05 pH/µm and a desired inductance of 20 pH. Compared to the total junction count of roughly 3N (49,152) this is a negligible design constraint.

7b) Example output current pulses, with layers indicated on the right, in a 1-to-16,384 fan-out circuit of nested binary splitters (pulses are offset for clarity with the first stage at the bottom and the last stage at the top). The data shown in 7b were from a simulation with the same unit cell parameters used in Fig. 2.
In addition to the flux-based splitter fan-out, one can also nest the current-based fan-out to an arbitrarily large size. The footprint of this type of fan-out will be similar in size to the splitter tree if we are constrained to a vertical line output with uniform time delay wiring. The rough size of a fan-out of 1-to-1000 (three layers deep) with this method is 1500 µm in the vertical direction and 250 µm in the horizontal direction. These dimensions were limited by the 1.5 µm pitch in the vertical direction, and by the 0.9 pH/µm maximum inductance in the horizontal direction.
Next, we discuss the limits to the two methods of fan-in, current and flux, shown earlier in Section IV. Figure 8 shows a generalized schematic of each method where we have tried to simplify each circuit to contain only the essential parts that affect scaling to large NFI. Figure 8a shows the current method of fan-in while Fig. 8b shows the flux method. In both cases, the input to the fan-in coming down each synapse is an SFQ pulse with current Iin, generated by a junction with critical current Ic1 and a total series inductance (Ls + Lcp). In the current method, this current flows into a summing node and then into an output superconducting quantum interference device (SQUID), which has an inductance Lsq and a critical current of Ic2. In the flux method, the current Iin flows through the coupling inductance Lcp, which is magnetically coupled to the inductance Lsq. This causes a flux in the output loop, which has two junctions with critical currents equal to Ic2.
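With these simplified circuits we can easily derive the value of the signal current Isig that flows into the output loop in both cases. In the case of the current method, we use the equation for current division. We first calculate the total parallel inductance L|| from the summing node to ground:

$$\frac{1}{L_{\parallel}} = \frac{1}{L_{sq}} + \frac{1}{L_s + L_{cp}} + \cdots + \frac{1}{L_s + L_{cp}}$$

where the term 1/(Ls + Lcp) appears (NFI − 1) times, once for each of the remaining fan-in paths aside from the one generating the pulse. For now, we have assumed that Lsq is the inductance going to ground for the signal path, ignoring the higher-impedance path through the junctions; we will correct this later. The signal current is then simply Iin(L||/Lsq). This leads to:

$$I_{sig} = I_{in}\,\frac{L_s + L_{cp}}{(L_s + L_{cp}) + (N_{FI} - 1)\,L_{sq}} \tag{2}$$

In the case of the flux method, the input current couples a flux into the output loop through the mutual inductance between Lcp and Lsq, which gives:

$$I_{sig} = \frac{k\sqrt{L_{cp} L_{sq}}}{L_{tot}}\,I_{in} \tag{3}$$

where k is the coupling constant and Ltot is the total inductance of the large summing loop. Later we will relate Ltot to NFI and Lsq.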
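The current-division step for the current method can be checked numerically. The following sketch computes the signal current in two ways: by forming the parallel inductance of Lsq with the (NFI − 1) idle input branches and applying Isig = Iin(L||/Lsq), and from the equivalent closed form. The function names are hypothetical, and the inductance values in the usage lines are only illustrative choices of the same order as those used later in the text.

```python
import math


def parallel_inductance(l_sq, l_s, l_cp, n_fi):
    """Inductance from the summing node to ground: L_sq in parallel
    with (n_fi - 1) idle input branches of inductance (l_s + l_cp)."""
    inverse = 1.0 / l_sq + (n_fi - 1) / (l_s + l_cp)
    return 1.0 / inverse


def signal_current(i_in, l_sq, l_s, l_cp, n_fi):
    """Current division: the share of i_in that flows through L_sq."""
    return i_in * parallel_inductance(l_sq, l_s, l_cp, n_fi) / l_sq


def signal_current_closed_form(i_in, l_sq, l_s, l_cp, n_fi):
    """Same quantity after simplifying the parallel combination."""
    l_in = l_s + l_cp
    return i_in * l_in / (l_in + (n_fi - 1) * l_sq)


# Illustrative values: inductances in pH, input current normalized to 1.
args = dict(i_in=1.0, l_sq=40.0, l_s=0.8, l_cp=1.0, n_fi=16)
print(signal_current(**args), signal_current_closed_form(**args))
```

Because Lsq is much larger than (Ls + Lcp) in this example, most of the input current returns through the idle inputs rather than reaching the output loop, which is the basic reason the current method scales poorly with NFI.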
Looking at Eqs. (2) and (3), we can make a few assumptions about some of the terms. First, we note that the current Iin which appears in Eqs. (2) and (3) is the result of an SFQ pulse, and therefore has a total quantized flux of Φ0 = 2.07 × 10^-15 V s. Since it is coupled through an inductance of (Lcp + Ls), we can write:

$$I_{in} \approx \frac{\Phi_0}{L_{cp} + L_s} \tag{4}$$

In addition, since both the input loop (Ic1, Ls + Lcp) and the output loop (Ic2, Lsq) each need to generate an SFQ pulse, there is a constraint on their inductance parameter βL. Loops capable of generating SFQ pulses need to have a value of βL in the range βL ≈ 1-10. 23 Loops with βL < 1 are nonhysteretic and incapable of generating an SFQ pulse; loops with βL > 10 can store more than one flux quantum and are subject to random phase slips. We define βL1 and βL2 as follows:

$$\beta_{L1} = \frac{2\pi I_{c1}\,(L_s + L_{cp})}{\Phi_0} \tag{5}$$

$$\beta_{L2} = \frac{2\pi I_{c2}\,L_{sq}}{\Phi_0} \tag{6}$$

In our designs both of these parameters ended up being between 1.5 and 4 and close to each other.
(Note that L2 in eq. (6) is defined for the current fan-in case, in the flux case there would be NFI SQUID inductors.) Finally, we refer the signal current to the critical current Ic2, which is the junction that must release an SFQ pulse when the sum of the fan-in signals exceed threshold. Ideally, Isig should be a significant fraction of Ic2, in the neighborhood of at least 20%. Although smaller signal currents (1%-10%) could still potentially cause pulses in the output loop, this would require an external bias current very close to Ic2. In that case fabrication margins will become a significant concern. We thus use 20% as a rough rule of thumb.
We now apply these three assumptions to the equations for the signal and then compare the expressions for the two methods. Looking first at the current, we find:

$$\frac{I_{sig}}{I_{c2}} = \frac{2\pi}{(N_{FI} - 1)\,\beta_{L2} + (I_{c2}/I_{c1})\,\beta_{L1}} \approx \frac{2\pi}{N_{FI}\,\beta_{L2}} \tag{7}$$

where we have assumed that NFI >> 1 and Ic1 >> Ic2 (chosen to maximize the signal; see below), thus ignoring the second term in the denominator. Here βL1 and βL2 are the inductance parameters of Eqs. (5) and (6). For the case of flux, we make the simple assumption that Ltot in equation (3) is equal to NFI Lsq, again assuming that NFI >> 1 and also ignoring the slight modification to the impedance of each of the Lsq inductors due to the coupling back to the input. Then we obtain:

$$\frac{I_{sig}}{I_{c2}} \approx \frac{2\pi k}{N_{FI}\,\beta_{L1}} \left(\frac{I_{c1}}{I_{c2}}\right) \sqrt{\frac{L_{cp}}{L_{sq}}} \tag{8}$$

In our case Lcp = Lsq, so the term in the square root is 1. Note then the similarities between Eqs. (7) and (8). The one major difference is the factor of (Ic1/Ic2) in the flux case. (There is also the factor of k, but this can be of order one.) This allows scaling to a much larger fan-in in the flux case, as one can make this ratio of critical currents (Ic1/Ic2) a factor of 50 or even higher. Thus, the flux method scales to large NFI better than the current method. This is a major result of our paper. Besides the comparison between Eqs. (7) and (8), we will also show this directly with simulations.
The expressions for the signal current derived in Eqs. (2) and (3) along with their simplified forms in (7) and (8) can be verified by circuit simulations. We built a WRSPICE model of the circuits in Fig. 8, choosing parameters that were similar to those in the simulations shown in Section IV. In both the current and the flux case we chose Ic1 = 1000 µA. This allowed us to decrease the size of Lcp and Ls while keeping βL1 roughly constant, via equation (5). Meanwhile, the smaller value of (Lcp + Ls) helped maximize the size of the current Iin, via equation (4). The value of Ic2, on the other hand, was kept as small as possible to account for a small signal current. We used both a value of Ic2 = 20 µA, which is safe from the effects of thermal fluctuations at 4 K, and a value of 6 µA, which is very aggressive. Our values of Ls and Lcp were 0.8 pH and 1 pH in the current case and both equal to 1 pH in the flux case. The value of Lsq is different for the flux and the current case and is described below.
After setting these parameters we varied the fan-in and noted how the signal current changed. Figures 9 and 10 show the cases for current and flux, respectively, where we plot (Isig/Ic2) as a function of NFI. The markers show the results from circuit simulations while the solid and dashed lines show the results from the equations. The dashed line at a value of (Isig/Ic2) = 0.2 represents our 20% criterion described above.
Looking first at the current, we used equation (7) for the fitting, but with the value of Lsq modified to include the extra path to ground in parallel. This path included the additional inductor Lsq and the Josephson inductance of the two junctions with Ic2. This parallel path has an inductance of about 2Lsq, so the total equivalent inductance to ground is then 2/3 Lsq. Thus, we replaced Lsq with 2/3 Lsq to calculate βL2 in equation (6), which is then used in equation (7) to calculate the signal current. Figure 9 shows the results for the current method.
Next, we look at fitting the case for flux fan-in. We use Lsq = 1 pH, trying to keep it as small as possible since there are so many of them. Figure 10 shows the simulated data for the two values of Ic2, 20 µA and 6 µA. The dotted lines show the fit from equation (8), which assumes that the total inductance in the large loop is NFI Lsq. These fit the data well at large NFI, but not at small NFI. As with the current method, we need to account for the other components in the loop, the two junctions and the remaining Lsq inductor; we call this extra inductance Lex, which simply adds to NFI Lsq. In addition, since each of the Lsq inductors couples a small current back to the input loop (the cross-talk current, which will be discussed below), their impedance is reduced below the value of Lsq. The reduced value L′sq is given by Eq. (9), which can be derived from the circuit model of a transformer. Incorporating both of these corrections, we can then rewrite equation (8) as Eq. (10), in which the corrective term appears in square brackets. In Fig. 10 the solid lines show the fits using equation (10), showing better agreement at smaller NFI. At larger values of NFI the corrective term gets close to one, thus the curves become roughly the same. The maximum fan-in for flux is over 100 in the 20 µA case and over 300 in the 6 µA case. Note these are much larger than the fan-in for current, which confirms our conclusion that flux scales better than current.
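As a rough cross-check of the flux fan-in limits, the sketch below evaluates the uncorrected estimate in which an SFQ pulse drives a current Φ0/(Ls + Lcp) through the coupling inductor and the summing loop has total inductance NFI Lsq. The coupling constant k = 0.5 is an assumed value chosen only for illustration, since the text does not state it, and the function names are hypothetical. The corrections for Lex and for the reduced inductance L′sq are deliberately left out.

```python
import math

PHI0 = 2.07e-15  # flux quantum in V*s, as quoted in the text


def flux_signal_ratio(n_fi, i_c2, l_s=1e-12, l_cp=1e-12, l_sq=1e-12, k=0.5):
    """Uncorrected I_sig / I_c2 for flux fan-in (k = 0.5 is an assumption)."""
    i_in = PHI0 / (l_s + l_cp)           # current from one SFQ pulse
    mutual = k * math.sqrt(l_cp * l_sq)  # mutual inductance to the loop
    l_tot = n_fi * l_sq                  # summing loop, large-N_FI limit
    return mutual * i_in / l_tot / i_c2


def max_fan_in(i_c2, threshold=0.2, **kwargs):
    """Largest N_FI that keeps I_sig / I_c2 at or above the threshold.

    The ratio scales as 1 / N_FI, so the limit is ratio(N_FI = 1) / threshold.
    """
    return int(flux_signal_ratio(1, i_c2, **kwargs) / threshold)


for i_c2 in (20e-6, 6e-6):
    print(f"Ic2 = {i_c2 * 1e6:.0f} uA -> max fan-in ~ {max_fan_in(i_c2)}")
```

With the assumed k this gives limits of order 100 for Ic2 = 20 µA and of order 400 for Ic2 = 6 µA, the same order as the simulated limits quoted above; a different k rescales both numbers proportionally.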
Finally, we look at the crosstalk current (Ict) which flows back down the synapses, shown in Fig. 8. For the current method, Ict is given by Eq. (11). Using our numbers for the case where NFI = 16, Lsq = 40 pH and Ic2 = 20 µA, we find a crosstalk current of 33.2 µA using equation (11), which compares well to our simulated value of 32.6 µA. This is actually much larger than the signal current for those parameters (6.4 µA). Note that the crosstalk current scales approximately as (1/NFI), so the inability to increase the fan-in keeps the crosstalk large.
In the flux case, the crosstalk is given by Eq. (12), where LJ1 is the Josephson inductance of the input junction with Ic1 and we have approximated NFI >> 1. Using our numbers for the case where NFI = 128 and Ic2 = 20 µA we find a crosstalk current of 1.3 µA using equation (12), the same as what we find in our simulation and much smaller than in the current method. Here we also find that the crosstalk current scales approximately as (1/NFI), so the ability to go to large NFI helps reduce the crosstalk. The size of the crosstalk for flux can thus be made much smaller than for current, which once again confirms our conclusion that flux is indeed the better choice for fan-in.

VI Discussion
In this paper we have explored the problem of fan-out and fan-in in superconducting neuromorphic circuits. The fan-out is a digital process where the action potential is repeated and copied multiple times. Both flux and current methods give fan-out schemes that accomplish this successfully. Both can be scaled indefinitely to large networks, with a proportional cost in chip area and power dissipation.
Since power dissipation scales with the junction count, with each junction dissipating roughly IcΦ0 of energy per SFQ pulse, a rough estimate of the energy dissipated per pulse by a 1-to-128 flux-based fan-out circuit using the Ic values given above is 44 aJ. Pulse timing can be preserved throughout subsequent layers of fan-out; if the constraint on timing is lifted, then the cost in area can be reduced. The flux method has a better scaling with junction number, but current fan-out might still be used, depending on the application.
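The order of magnitude of this estimate can be reproduced by multiplying the junction count by IcΦ0. In the sketch below the junction count is taken as 3NFO − 3, and because the individual Ic values are not listed in the text, the code instead backs out the mean critical current implied by the quoted 44 aJ. Both function names are hypothetical, and treating all junctions as having one mean Ic is a simplifying assumption.

```python
PHI0 = 2.07e-15  # flux quantum in V*s


def fanout_energy_joules(n_fo, mean_ic):
    """Energy per fanned-out pulse, assuming every one of the
    3*n_fo - 3 junctions switches once and dissipates mean_ic * PHI0."""
    return (3 * n_fo - 3) * mean_ic * PHI0


def implied_mean_ic(n_fo, energy_joules):
    """Mean critical current consistent with a given energy estimate."""
    return energy_joules / ((3 * n_fo - 3) * PHI0)


mean_ic = implied_mean_ic(128, 44e-18)
print(f"implied mean Ic: {mean_ic * 1e6:.1f} uA")
print(f"energy check: {fanout_energy_joules(128, mean_ic) * 1e18:.1f} aJ")
```

The implied mean critical current is a few tens of microamperes, a plausible scale for splitter junctions, which suggests that the 44 aJ figure is an energy per pulse rather than a continuous power.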
Fan-in is an analog process where the weighted synaptic signals are summed and compared to a threshold. Both a current method and flux method were proposed. In scaling to large networks, the flux method appears to be superior to the current method, both in preserving a larger signal current and in keeping the cross-talk to a minimum. For small networks it is possible that the current method could still be used.
In our study we have largely ignored the problem of synaptic weighting; that is covered in other work. 5 However, the architectural assumptions that we have made in this paper are consistent with typical SFQ weighting schemes that have been proposed in the literature. In addition, in our study of the fan-in we have been careful to consider the worst-case scenario for the signal, insisting that a single synaptic current be capable of invoking an action potential. In real networks there will most likely be signals from many synapses simultaneously, increasing the total signal beyond the estimates in this paper.
Looking forward, there are many possible advantages to a future superconducting neuromorphic processor: (i) the ability to use superconducting multi-chip modules (MCMs) 24 to connect many chips together with no cost in dissipation; (ii) the availability 25 of low-power superconducting digital electronics to provide interfacing, multiplexing and readout between analog units; (iii) the ability of the JJ neuron to attain biological realism 1 with only two junctions; and (iv) possible optical interface to truly expand to brain-sized networks. In this work, we demonstrate that both a fan-in and fan-out of order 100 are completely reasonable for existing superconducting fabrication. Using our area estimates, a network of 128 fully-connected neurons would easily fit on a 5 mm x 5 mm chip and draw minimal power. Thus, we conclude that the fan-in and fan-out are not limiting factors for the future of superconducting neuromorphic computing.

Data Availability Statement:
The data that supports the findings of this study are available within the article [and its supplementary material].