A Superconducting Nanowire Binary Shift Register

We present a design for a superconducting nanowire binary shift register, which stores digital states in the form of circulating supercurrents in high-kinetic-inductance loops. Adjacent superconducting loops are connected with nanocryotrons, three terminal electrothermal switches, and fed with an alternating two-phase clock to synchronously transfer the digital state between the loops. A two-loop serial-input shift register was fabricated with thin-film NbN and achieved a bit error rate less than $10^{-4}$, operating at a maximum clock frequency of $83\,\mathrm{MHz}$ and in an out-of-plane magnetic field up to $6\,\mathrm{mT}$. A shift register based on this technology offers an integrated solution for low-power readout of superconducting nanowire single photon detector arrays, and is capable of interfacing directly with room-temperature electronics and operating unshielded in high magnetic field environments.

Superconducting nanowires are promising candidates for cryogenic data processing and storage, particularly for readout of superconducting nanowire single photon detector (SNSPD) arrays. The high kinetic inductance of thin-film superconductors allows them to store data in compact loops, 1 and nanocryotrons (nTrons), three-terminal electrothermal switches, 2 enable the creation of low-power digital logic and memory elements. 3 In addition, superconducting nanowires can operate in harsh environments. NbN is radiation hard, 4 and SNSPDs have been shown to operate under high magnetic fields: in-plane up to 5 T and out-of-plane up to 500 mT. 5 This makes nanowires attractive for applications in which SNSPD readout electronics must withstand strong ambient magnetic fields or radiation, such as high energy physics and space exploration. Furthermore, the shared technology platform with SNSPDs and the ability to drive high-impedance loads 2 are strong motivators for direct integration of nanowire electronics with SNSPD arrays. Dedicated readout electronics are necessary to address the thermal and mechanical challenges of scaling SNSPD imagers beyond 1 kilopixel, 6 and low-power electronic devices that operate in extreme environments and can be fabricated adjacent to superconducting detectors are an attractive alternative to Josephson junction logic and CMOS. Previous work 7,8 has used the high kinetic inductance of superconducting nanowires to make analog delay-line imagers, which offer high pixel counts and preserve the picosecond timing resolution of the SNSPDs. Row-column multiplexing has also been shown to be an effective technique for reducing cable counts, 9 but a more aggressive reduction in cable count will be required for megapixel arrays. Inspired by the operation of a semiconductor CCD, serial readout of SNSPD arrays could be performed by a superconducting nanowire binary shift register.
Serial readout may enable higher count rates than delay-line techniques by shortening dead-time, but more importantly, it simplifies the interface to conventional CMOS readout electronics by removing the need for high resolution, low jitter time-to-digital converters.
In this work, we demonstrate a proof-of-concept for a superconducting nanowire binary shift register, which encodes digital states with dissipationless circulating current in superconducting loops. As shown in Fig. 1, each loop is formed by a kinetic inductor $L_k$ and two nTrons, $U_1$ and $U_2$. The presence of a circulating current flowing through $L_k$ into the gate of $U_2$ encodes a binary "1", and the absence of current represents a "0". The shift register is designed to use circulating currents on the order of 100 µA; therefore, small (µA-scale) fluctuations in loop current (e.g. due to thermally-activated phase slips, which change the stored flux in each loop by $\Phi_0$) are not expected to impact the binary state. A substantial environmental disturbance that makes the film resistive (e.g. $T > T_c$, $H > H_{c2}$) would be necessary to destroy the state stored in the shift register. The state of the shift register is only altered under the application of a clock, when the combination of the circulating current and clock pulse exceeds the critical current density in the nTron channel, causing it to switch from superconducting to resistive and diverting the clock pulse into the next loop. This process forms a new circulating current conditional on the presence of current in the previous loop. A two-phase clock is used to guarantee that the diverted current always has a superconducting path to ground (as shown in Fig. 1c). In contrast to the original nTron design, 2 which acts like an amplifier, the loops are connected with wide-gate nTrons, where the width of the gate constriction is comparable or equal to that of the channel constriction. A wide-gate nTron is crucial for the shift register: because the output of one nTron becomes the input of another, the current levels for the input and output should be equal. The additional readout nTron shown in Fig. 1f uses a standard nTron with a small choke.
It terminates the final loop of the shift register to destroy any circulating current present at the end of each clock cycle. The readout nTron serves two purposes: (1) to reset the final loop of the shift register, and (2) to generate an output voltage signal, which can be sent to off-chip readout electronics or cascaded through a resistor to other nTron logic.
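The transfer rule described above can be illustrated with an idealized behavioral model. The sketch below is not the circuit simulation used in this work; it assumes perfect switching, ignores all analog effects, and uses a hypothetical indexing in which nTron j is clocked on phase j mod 2, writes into loop j, and is gated by loop j-1 (or by the data input for j = 0). The function names `apply_phase` and `shift` are illustrative only.

```python
def apply_phase(loops, phase, data_in=0):
    """Apply one clock phase to an idealized N-loop shift register.

    loops[j] is 1 when loop j holds a circulating current, else 0.
    Assumed indexing: nTron j is clocked on phase j % 2 and is gated by
    loop j-1 (or by data_in when j == 0). Index N is the readout nTron.
    Returns (new_loops, output_bit), where output_bit is None on phases
    that do not clock the readout nTron.
    """
    n = len(loops)
    new = list(loops)
    out = None
    for j in range(phase, n + 1, 2):
        source = data_in if j == 0 else loops[j - 1]
        if j == n:
            out = source      # readout nTron switches only if current is present
        elif source:
            new[j] = 1        # diverted clock forms a new circulating current
        if j > 0:
            new[j - 1] = 0    # the gating loop is left without current
    return new, out


def shift(bits, n_loops=2):
    """Clock a serial bit sequence through the register, one bit per period."""
    loops = [0] * n_loops
    outputs = []
    for b in bits:
        for phase in (0, 1):
            loops, out = apply_phase(loops, phase, data_in=b)
            if out is not None:
                outputs.append(out)
    return outputs


print(shift([1, 0, 1, 1, 0]))  # [0, 1, 0, 1, 1]
```

Under these assumptions, a two-loop register reproduces the input delayed by one clock period, because the readout nTron and the input nTron share a phase.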
The shift register was fabricated on a 16 nm-thick layer of NbN, deposited with an AJA sputtering system onto an Si wafer with 300 nm-thick SiO$_2$ thermal oxide. The circuit geometry was patterned on the NbN layer with electron-beam lithography using ZEP530A resist and CF$_4$ etching. The wide-gate nTron channel constriction widths were designed to be 270 nm (with an equal-sized gate choke), and the readout nTron channel width was designed to be 240 nm, with a gate choke width of 40 nm. Figure 2a shows an electron micrograph of a wide-gate nTron patterned on thin-film NbN. Figure 2b is an electron micrograph of the experimental two-loop shift register circuit, and the equivalent circuit model is shown in Fig. 2c. The loop kinetic inductors were designed to be 100 nH; the estimated inductance came out to 60 nH (30 pH per square) based on a room temperature sheet resistance measurement of 194 Ω per square.

Fig. 1. [...] The corresponding time a in the simulation is indicated in (g). A two-phase clock ($\phi_1$, $\phi_2$) is used to transfer the digital state between adjacent loops; the first phase $\phi_1$ is applied in (b). In (c), the summation of the clock and circulating currents exceeds the switching current of $U_2$'s channel, forming a resistive hotspot and diverting the clock into the loop formed by $U_2$ and $U_3$. The hotspot creates a voltage spike $v_{1\to 2}$ shown in the lower panel of (g) at time c. By the time the clock is turned off in (d), the channel of $U_2$ has healed and a circulating current is present in the loop between $U_2$ and $U_3$. The process continues in (e) when the second clock phase $\phi_2$ is applied. Two clock phases are needed to ensure a zero-resistance path to ground for the diverted clock, for example, the path through $U_3$ as shown in (c). The readout nTron $U_\mathrm{ro}$ in (f) is used to reset the state of the final loop and generate an output voltage conditional on the presence of a circulating current.
The finished chip was wirebonded to a printed circuit board with off-chip current bias and shunt resistors, which was mounted to a custom dip probe 10 and cooled to 4.2 K in a dewar of liquid helium. The 2 kΩ bias resistors were used as approximate current sources to convert an applied voltage to a current through the nanowire. The hotspot resistance of the switching nTron is small compared to 2 kΩ, so the current through the nanowire for a given applied voltage stays roughly constant regardless of the nanowire state. The nTron dimensions, inductor sizes, and resistor values were selected through LTspice simulation, 11 the results of which are shown in Fig. 1g. The bit error rate of the shift register model under high levels of noise (e.g. ±5 % variation in clock amplitude) was used to guide selection of component properties. Eight different shift register circuits were fabricated on a single 1 cm$^2$ chip. Two circuits were tested: the circuit presented in this letter, which used a wide-gate nTron to connect adjacent loops, and a shift register with a different switch geometry. The alternative design used current summation into a single two-terminal constriction as a switch. It performed worse than the design based on the wide-gate nTron, likely due to leakage current that could flow between loops unimpeded regardless of the switch state. The results presented in this letter are from the circuit which used wide-gate nTrons.
The circuit was characterized with clock rates from 10 MHz to 100 MHz and under magnetic fields from ±1 mT to ±6 mT, applied orthogonal to the chip surface by a superconducting magnet mounted on the end of the dip probe. A Keysight PXIe M3202A (arbitrary waveform generator) and M3102A (digitizer) were used to verify correct operation of the shift register over a range of signal amplitudes. This was done by generating multiple 10 kbit-long pseudorandom binary sequences of voltage pulses and measuring the circuit response. The data and clock input signals encoded digital "1"s with low-duty-cycle 2 ns FWHM voltage pulses, as can be seen in the top panel of Fig. 3c. The PXIe chassis controller swept the amplitude of the shift and readout clock pulses and measured the bit error rate in near real time for each set of clock amplitudes by comparing the device output with the 10 kbit input sequence. Each spike of the output waveform was thresholded and digitized, and the result was compared with a copy of the input signal delayed by a clock period; for each instance where the input and digitized output differed, the total error count was incremented. A sample waveform used to calculate the bit error rate is shown in Fig. 3c.
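The error-counting procedure described above can be sketched as follows. This is a minimal illustration, not the acquisition code that ran on the PXIe controller: it assumes the output has already been reduced to one peak amplitude per clock period, uses Python's `random` module as a stand-in for the pseudorandom sequence generator, and the names `prbs` and `bit_error_rate` are hypothetical.

```python
import random


def prbs(n_bits, seed=0):
    """Stand-in pseudorandom bit sequence (not a hardware LFSR)."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(n_bits)]


def bit_error_rate(input_bits, output_peaks, threshold, delay=1):
    """Threshold one output peak per clock period and compare it with the
    input sequence delayed by `delay` clock periods."""
    digitized = [1 if v > threshold else 0 for v in output_peaks]
    errors = 0
    compared = 0
    for k in range(delay, len(input_bits)):
        compared += 1
        if digitized[k] != input_bits[k - delay]:
            errors += 1
    return errors / compared


# Ideal device: output equals the input delayed by one period.
bits = prbs(10_000, seed=1)
ideal_peaks = [0.0] + [0.8 * b for b in bits[:-1]]  # arbitrary amplitude units
print(bit_error_rate(bits, ideal_peaks, threshold=0.4))  # 0.0
```

With a 10 kbit sequence, a run with zero counted errors bounds the bit error rate at roughly $10^{-4}$ per sequence, which is the resolution of a single measurement of this kind.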
The plots in Fig. 3a are bias margin plots, which show the bit error rate as a function of clock pulse amplitude for various clock rates. The dark regions indicate no measured errors for the 10 kbit sequence, and the width of each dark region gives the bias margin, defined as the amount of variation in clock amplitude that is acceptable before the circuit begins to function incorrectly. The device performed correctly up to a maximum clock rate of 83 MHz, with the bias margins steadily shrinking for increasing clock frequency. The bias margins of the shift clock were ±24 % at $f_\mathrm{clk}$ = 10 MHz, but only ±7 % at $f_\mathrm{clk}$ = 83 MHz. Margins for the readout clock shrank even more, from > ±45 % at $f_\mathrm{clk}$ = 10 MHz to ±5 % at $f_\mathrm{clk}$ = 83 MHz. As shown in Fig. 3b, the introduction of a ±1 mT field did not substantially degrade the margins of the shift clock: ±25 % for +1 mT and ±20 % for −1 mT. The readout clock margins were unaffected. However, introduction of a +6 mT field reduced the margins of the shift clock to ±4 %, and a −6 mT field (not shown) prevented the device from working with a bit error rate below $10^{-3}$.
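One way to extract a bias margin from a one-dimensional amplitude sweep is sketched below. It assumes the margin is quoted as a symmetric percentage about the center of the widest contiguous error-free region and that amplitudes are positive and sorted in ascending order; this convention, the function name `bias_margin`, and the sample data are assumptions for illustration and are not taken from the measurement.

```python
def bias_margin(amplitudes, error_counts):
    """Return (center, margin_fraction) for the widest contiguous run of
    zero-error sweep points, or None if every point had errors."""
    best = None
    start = None
    # A trailing non-zero sentinel closes a run that reaches the last point.
    for i, e in enumerate(list(error_counts) + [1]):
        if e == 0 and start is None:
            start = i
        elif e != 0 and start is not None:
            lo, hi = amplitudes[start], amplitudes[i - 1]
            if best is None or hi - lo > best[1] - best[0]:
                best = (lo, hi)
            start = None
    if best is None:
        return None
    lo, hi = best
    center = (lo + hi) / 2
    return center, (hi - lo) / (2 * center)


# Hypothetical sweep: amplitudes in arbitrary units, errors per 10 kbit run.
amps = [60, 70, 80, 90, 100, 110, 120, 130, 140]
errs = [5, 3, 0, 0, 0, 0, 0, 2, 9]
print(bias_margin(amps, errs))  # (100.0, 0.2), i.e. a ±20 % margin
```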
The lower half of each bias margin plot exhibits a downwards slope due to the transfer characteristics of the readout nTron: for a larger gate current, the required channel current to switch the nTron is lower. Therefore, for a larger readout clock, the required loop current (and thus shift clock amplitude) is lower. The abrupt change in bit error rate for readout clock amplitudes below 30 µA occurred because the readout clock was not strong enough to switch the readout nTron. If the final loop current is left circulating, it prevents the middle nTron from switching again when a shift clock is applied. The optimal bias region slopes upwards for high readout clock currents, possibly because of current injection from the readout clock, which would create a reverse circulating current in the final shift register loop. This would require the amplitude of the shift clock to be larger to leave a net-forward circulating current in the final loop that was large enough for the readout nTron to switch when clocked.
As the frequency of the clock increased, the bias margins for the shift clock shrank from both sides, and the maximum acceptable readout clock amplitude decreased dramatically. The L/R time constant to charge a loop with a circulating current depends on the loop kinetic inductance and the total shunt resistance. It is plausible that, for higher clock frequencies, the circulating current does not reach a stable level in the half-period between the two clock phases, thus producing incorrect behavior. Further characterization with various shunt resistor and kinetic inductor sizes should be performed to verify that the decrease in margins is due to this electrical time constant, and not a thermal process or some other unconsidered effect. One possible explanation for the large decrease in the bias margins of the readout clock is slow thermal reset of the readout nTron gate choke. The designed critical current of the choke was only 30 µA, and overdriving the readout clock significantly above that (e.g., 100 µA) would generate a considerable amount of heat. Residual heat from a readout clock with phase $\phi_1$ would suppress the critical current of the channel, potentially causing the readout nTron to switch on phase $\phi_2$ if it had not cooled sufficiently. Shunting the gate with a small resistor could limit the heating of the choke, potentially restoring the bias margin range of the readout clock for high clock frequencies.
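The L/R argument can be made concrete with a first-order estimate of how far the loop current settles within one half-period. The sketch below assumes a simple exponential approach with time constant L/R. The 10 Ω shunt value is purely hypothetical (the shunt resistance is not stated here), so the printed numbers only illustrate the trend and are not a fit to the measured margins.

```python
import math


def settled_fraction(clock_hz, inductance_h, shunt_ohm):
    """Fraction of the final loop current reached after one half-period,
    assuming a single-pole L/R response."""
    tau = inductance_h / shunt_ohm
    half_period = 1 / (2 * clock_hz)
    return 1 - math.exp(-half_period / tau)


L_LOOP = 60e-9   # estimated loop inductance from the text, in henries
R_SHUNT = 10.0   # HYPOTHETICAL shunt resistance in ohms, for illustration only

for f_clk in (10e6, 83e6):
    print(f_clk, settled_fraction(f_clk, L_LOOP, R_SHUNT))
```

Under this assumed shunt value, the loop current is essentially settled at 10 MHz but reaches only about two thirds of its final value at 83 MHz, which is the qualitative behavior the time-constant explanation would predict.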
The observed shift in bias margins of 15 µA/mT due to the external magnetic field (Fig. 3b) agrees with the expected loop current induced by the Meissner effect. However, enhancement of current crowding around constrictions (such as the sharp corners in the nTron channel as can be seen in Fig. 2a) due to the Lorentz force is potentially a more plausible explanation, so further work must be done to understand the mechanism of the external field on the bias margins of the circuit. If the Meissner effect is the dominant mechanism, reducing the size of the loop inductor may help improve resilience against out-of-plane magnetic fields. Instead, if the mechanism is current crowding enhanced by the Lorentz force, then the nTron geometry would need to be modified to mitigate this effect.
The total energy consumption of any cryogenic electronics system will be dominated by the cryocooler, which can consume on the order of 1 kW to supply tens of milliwatts of cooling power at 4 K. 12 Unless the design of the shift register presented in this work is modified, SNSPD arrays using shift register readout are limited to the kilopixel regime by cryostat cooling power. The energy consumption of the shift register is estimated to be 80 fJ per shift operation, and is dominated by the clocking: each of the two clock phases drives 100 µA through 2 kΩ for 2 ns, dissipating about 40 fJ. When the shift register stores a "1", approximately 300 aJ of energy is stored (100 µA in a 60 nH loop). Each shifting operation destroys this circulating current, dissipating the stored energy through the resistive hotspot in the nTron channel. Shift register readout of a 1 kilopixel array clocked at 50 MHz would dissipate about 4 mW. Reducing the clock impedance by a factor of 20, from 2 kΩ to 100 Ω, and the operating current from 100 µA to 10 µA would reduce the power dissipation of the 1 kilopixel array to 2 µW, making a megapixel array feasible from a power perspective.
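The figures in this paragraph follow from $I^2 R t$ per clock phase and $\tfrac{1}{2} L I^2$ for the stored energy. The short sketch below reproduces them, assuming one shift register stage per pixel with every stage clocked each period; the function names are illustrative.

```python
def shift_energy_j(current_a, resistance_ohm, pulse_s, phases=2):
    """Resistive dissipation of one shift operation (all clock phases)."""
    return phases * current_a**2 * resistance_ohm * pulse_s


def stored_energy_j(current_a, inductance_h):
    """Magnetic energy stored in a loop holding a '1'."""
    return 0.5 * inductance_h * current_a**2


def array_power_w(stages, clock_hz, energy_per_shift_j):
    """Average power, assuming every stage is clocked each period."""
    return stages * clock_hz * energy_per_shift_j


e_now = shift_energy_j(100e-6, 2e3, 2e-9)      # about 80 fJ
e_stored = stored_energy_j(100e-6, 60e-9)      # about 300 aJ
p_now = array_power_w(1000, 50e6, e_now)       # about 4 mW
e_scaled = shift_energy_j(10e-6, 100.0, 2e-9)  # about 40 aJ
p_scaled = array_power_w(1000, 50e6, e_scaled) # about 2 µW

print(e_now, e_stored, p_now, e_scaled, p_scaled)
```

The factor-of-2000 improvement comes from the quadratic dependence on current (a factor of 100) multiplied by the linear dependence on resistance (a factor of 20).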
Decreasing the size of the loop inductor will enable faster, more compact shift registers due to a reduced kinetic inductance and therefore a smaller L/R loop current time constant. The speed of the device is fundamentally limited by the hotspot thermal relaxation time: the nTron channel must cool between the two clock phases, otherwise there will not be a superconducting path for the diverted clock if the previous shift register stage switches. For example, as shown in Fig. 1c, $U_3$ must be superconducting during the application of clock $\phi_1$. An nTron fabricated with NbN on SiO$_2$ thermal oxide has achieved a thermally-limited switching speed of 615.4 MHz, with an estimated thermal relaxation time of 130 ps. 13 Based on this, a conservative estimate for the thermal reset time of the shift register is about 1 ns per clock phase, allowing for a 500 MHz two-phase clock. At this clock rate, a 1 megapixel array could be read out on two wires at a frame rate of 1 kHz, for a maximum photon count rate of 1 Gcps. More thermally conductive substrates can speed up thermal relaxation, 14 potentially offering further speed improvements to nanowire logic.
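The readout-rate figures above can be checked with a few lines of arithmetic. The sketch assumes one bit is shifted out per clock period on each output wire, that each of the two clock phases occupies one reset interval, and that each pixel registers at most one photon per frame; the function names are illustrative.

```python
def two_phase_clock_hz(reset_time_s):
    """Clock rate when each of the two phases occupies one reset interval."""
    return 1 / (2 * reset_time_s)


def frame_rate_hz(pixels, clock_hz, wires=1):
    """Frames per second when one bit per wire is shifted out each period."""
    return wires * clock_hz / pixels


def max_count_rate_cps(pixels, frame_hz):
    """Upper bound on counts per second with one photon per pixel per frame."""
    return pixels * frame_hz


f_clk = two_phase_clock_hz(1e-9)                  # about 500 MHz
f_frame = frame_rate_hz(1_000_000, f_clk, wires=2)  # about 1 kHz
print(f_clk, f_frame, max_count_rate_cps(1_000_000, f_frame))  # about 1 Gcps
```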
Due to the small feature size of the nTron constriction, fabrication variations may pose a challenge when drastically reducing feature sizes, especially for shift registers with many nTrons. In order to minimize cable count, the same clock signal must be shared between multiple nTrons for any practical shift register. Therefore, all nTrons will receive the same amplitude clock signal, so if there is substantial variation in the switching current of the nTrons, then some loops may not function correctly for a clock amplitude which works for other loops. The bias margins of each nTron in a large shift register will have roughly the same shape, with variations in the midpoint of the optimal bias region due to edge roughness altering the constriction widths. Film thickness also plays a role, but edge roughness should be the dominant factor in switching current variations. Based on Fig. 3a, the allowable variation in switching current is ±7 % for a clock rate of 83 MHz. This is equivalent to ±18 nm variation in nTron width for the 270 nm-wide nTrons. A nanowire fabrication process using ma-N demonstrated 36 nTrons with a mean gate width of 33.7 nm and standard deviation of 2.4 nm across a 1 cm$^2$ chip area. 15 With ±7 standard deviations of allowable variation in width, a shift register with millions of nTrons should be feasible. However, scaling down to smaller nTron widths may still pose a challenge, as the relative variation in nTron switching current is larger.
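The tolerance argument above can be checked numerically. The sketch below assumes that switching current scales linearly with constriction width and that width errors are independent and Gaussian with the quoted 2.4 nm standard deviation; both are simplifying assumptions, and the function names are illustrative.

```python
import math


def allowed_width_variation_nm(width_nm, margin_fraction):
    """Width tolerance, assuming switching current is proportional to width."""
    return width_nm * margin_fraction


def out_of_margin_probability(n_sigma):
    """Two-sided Gaussian tail probability beyond +/- n_sigma."""
    return math.erfc(n_sigma / math.sqrt(2))


tol_nm = allowed_width_variation_nm(270, 0.07)  # about 19 nm
n_sigma = tol_nm / 2.4                          # between 7 and 8 sigma
p_fail = out_of_margin_probability(7)           # per-nTron failure probability
print(tol_nm, n_sigma, p_fail, 1e6 * p_fail)    # last value: expected failures per 10^6 nTrons
```

Under these assumptions, the expected number of out-of-margin nTrons in a million-device register is far below one, which is consistent with the feasibility claim, although real width distributions may have heavier tails than a Gaussian.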
Because the device we fabricated only accepts serial inputs, it would provide little practical benefit for large SNSPD arrays, as it is incapable of reducing wire count. However, modifications to the circuit design can be incorporated to load data from an entire row of pixels in parallel into the shift register, as shown in Fig. 4. This proposed modification was designed and simulated in LTspice. A simple pixel and destructive-readout memory can be implemented with an inductively-shunted SNSPD and nTron. A second nTron is used to store a current in the shift register when the pixel is read out, conditional on the presence of a current in the pixel inductor. Using this technique, data from all pixels could be loaded simultaneously into the shift register. Since the readout of the pixels is destructive, the bias current through the SNSPD is restored, so the pixels can still detect photons after the pixel data is loaded into the shift register. There is still per-pixel dead time set by the frame rate of the imager, since each pixel can only detect a single photon before it is reset again, but there is no imager-wide dead time like in a delay-line readout approach.

Fig. 4. [...] Each pixel consists of an inductively-shunted SNSPD which is read out with an nTron. When the bias $i_\mathrm{bias}$ is enabled, photon arrivals divert the bias current into the right branch of the pixel. An additional bias current is applied to the $\phi_1$ clock input of each shift register stage. If the SNSPD bias current is diverted, a pulse of current $i_\mathrm{load}$ applied to the gate of nTron $U_1$ will cause it to switch, sending a pulse of current to the gate of nTron $U_2$. The additional bias on the $\phi_1$ input will be diverted, forming a circulating current in the shift register. In (b), photon arrivals create circulating currents in each pixel. After the application of the load pulse $i_\mathrm{load}$, the pixel states are loaded into the shift register and shifted out.
In addition to performing detector readout, the simplicity of a shift register makes it a useful test structure, which could be used to characterize process yield, as has been done in the past with SFQ logic to evaluate yield for Josephson junction processes. 16 More generally, the inherent ability of shift registers to serialize and deserialize data makes them a critical function of any large-scale digital system. A superconducting shift register could help increase the capacity of links between room temperature and superconducting electronics, and with the introduction of digital logic, push even more computing into the fridge and enable larger scale superconducting systems based on nanowires.
The initial stages of this work were sponsored by the Army Research Office (ARO) under Cooperative Agreement Number W911NF-21-2-0041. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. The completion of the data analysis and presentation was funded by the DOE under the National Laboratory LAB 21-2491 Microelectronics grant. The authors would like to thank Kyle Richards and Teja Kothamasu for assistance with setting up and using the Keysight PXIe system. The data that support the findings of this study are available from the corresponding author upon reasonable request. The authors have no conflicts of interest to report.