Scalable and customizable arbitrary waveform generator for superconducting quantum computing

Superconducting quantum processors are manufactured based on a semiconductor process, which makes qubit integration possible. At the same time, this kind of qubit exhibits high performance in fidelity and decoherence time and requires a programmable arbitrary waveform generator (AWG) for its control. This paper presents the implementation of an AWG with a sampling rate of 2 giga-samples per second (GSPS) and 16-bit vertical resolution digital-to-analog converters. The AWGs are designed for a scaled-up usage scenario by integrating them with separate microwave devices onto a single backplane. A special waveform sequence output controller is designed to realize seamless waveform switching and arbitrary waveform generation. The jitter among multiple AWG channels is around 10 ps, and the integral nonlinearity and differential nonlinearity are both about 2 least significant bits. This customizable AWG has been used in several superconducting quantum processors, and the results of multiqubit measurements verify that the AWG is qualified for controlling tens of superconducting qubits.


I. INTRODUCTION
Superconducting qubits exhibit excellent performance in fidelity, decoherence time, and integration, making them one of the most feasible quantum computing schemes. Qubit control and readout can be achieved via commercial 1 giga-sample per second (GSPS) digital-to-analog converters (DACs) and analog-to-digital converters (ADCs) combined with IQ mixers, which perform the microwave up/down conversion. 1,2 As illustrated in Fig. 1, superconducting quantum computing control systems are composed of multiple independent parts, including switching networks, clocks and synchronization, qubit controls, qubit readouts, biases, and hosts. Each control unit is formed by arbitrary waveform generators (AWGs), filters, differential amplifiers, power splitters (PWD), and high-precision DC sources. These components can be used flexibly in accordance with the qubit control requirements. Readout modules include a data acquisition (DAQ) unit in addition to the components found in the control unit. The clock and synchronization module consists of a phase-locked loop (PLL) and multiple signal fan-out units that distribute one input to multiple outputs, providing the clock and trigger signals required for system synchronization.
Generally speaking, the qubits are the key components of the superconducting quantum processor. Each qubit is tunable by Z-bias, and the qubit state can be inferred by the frequency shift of its unique readout resonator. 3 Multiple resonators are coupled onto one readout transmission line. The information of each qubit can be read by one readout circuit rather than directly interacting with the qubits. Typically, the ratio between the channel numbers of the qubit control and the qubit readout is around 4-10. Every qubit needs 3 AWG channels (I, Q, and fast bias) for its control. This means that the proportion between the channel numbers of AWGs and DAQs is about 10 to 1. Hence, a high-speed, high-precision AWG and its synchronization control represent the primary challenge in such a system. The required synchronization accuracy between the channels of superconducting qubits is on the order of picoseconds.
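The channel arithmetic above can be made concrete with a short sketch. It assumes only the three AWG channels per qubit stated in the text and the four output channels per AWG unit described in Sec. II A; the function name `awg_budget` is hypothetical. The results are lower bounds, and a real system may allocate additional channels.

```python
import math

CHANNELS_PER_QUBIT = 3   # I, Q, and fast bias, as stated in the text
CHANNELS_PER_AWG = 4     # one AWG unit provides four output channels (Sec. II A)


def awg_budget(n_qubits: int) -> tuple[int, int]:
    """Return (minimum AWG channels, minimum 4-channel AWG units) for n_qubits."""
    channels = n_qubits * CHANNELS_PER_QUBIT
    units = math.ceil(channels / CHANNELS_PER_AWG)
    return channels, units


for n in (12, 24):
    channels, units = awg_budget(n)
    print(f"{n} qubits: at least {channels} channels, {units} AWG units")
```

For comparison, the experiments reported in Sec. III E use 38 to 40 channels for 12 qubits and 80 channels for 24 qubits, i.e., slightly more than these minimum counts.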
Effective designs for a superconducting qubit chip are constantly being explored, and new control requirements continue to emerge. For example, current research must grapple with rapid calibration of modulation waveforms, specific rapid demodulation algorithms, and rapid feedback control required by fault-tolerant quantum computing. An effective control device of this nature requires a highly customized system.
Several groups, such as those at UCSB, IBM, ETH Zurich, Yale, and BBN Technologies, have recently implemented superconducting quantum computing control platforms. 4-6 Moreover, control systems have been realized in ultralow temperature environments. 7,8 However, commercially available devices cannot satisfy the growing demand for a programmable and compact product. Therefore, in order to realize large-scale production of quantum computers, it is desirable to develop a programmable, compact, and extensible AWG. This paper presents an AWG framework and a waveform output sequence controller with extensible capabilities. To validate the AWG design, we conduct a series of tests on the AWG and use it to control multiple superconducting processors.

II. AWG IMPLEMENTATION
Every superconducting qubit has its own frequency, which is about 4-8 GHz. The decoherence time of most available qubits is about 10-100 μs. In order to properly measure superconducting qubits, an ensemble test is required; in other words, a qubit test is conducted by repeatedly transmitting a set of modulated pulses to the qubit and then measuring the response. AWGs together with IQ mixers significantly improve the flexibility of generating the qubit modulation pulse. With the flexibility of field-programmable gate arrays (FPGAs), the AWGs can be customized to meet the varying requirements of superconducting qubit control, such as quantum feedback and quantum error correction.

A. Hardware
The bandwidth of the AWG is usually from DC to ∼300 MHz. In order to achieve the qubit test with 99.9% fidelity, i.e., an error budget of 1/1000, the AWG's amplitude resolution is required to be at least five times finer than this budget, or 1/5000. As shown in Fig. 2, the AWG unit is composed of a Xilinx FPGA (XCKU040-FFVA156-2-E) and two commercially available high-performance DAC chips (from Analog Devices, Inc.). Such an AWG is capable of providing four arbitrary waveform output channels, each of which can operate at 2 GSPS and provide 16-bit resolution output. We use the JESD204B protocol to transmit the high-speed digital data. Sixteen lanes are used in an AWG unit, where each lane runs at a rate of 10 Gbps, and the total rate is 160 Gbps. Bipolar signals are achieved by passing the outputs through low-pass filters and differential amplifiers (from Analog Devices, Inc.). An RC low-pass filter, consisting of resistors and capacitors, is used to filter out noise at the sampling clock frequency from the waveform generated by the DAC chip. Since the DAC chip is current-type, its output is unipolar. The signal then enters a low-noise differential amplifier circuit to achieve bipolar output with a voltage range of ±1 V. The signal is DC-coupled into the differential amplifier circuit, which enables the adjustment of the output signal's bias voltage. The adjustable bias voltage can be used to address the IQ mixer leakage that is caused by imbalanced IQ inputs. Reliable communication is ensured by gigabit Ethernet running the Transmission Control Protocol/Internet Protocol (TCP/IP). An external input clock, which enables multiple boards to be synchronized by one synchronous source, is fanned out and distributed equally to the FPGA and DACs. Another external input trigger, which is synchronized to the same clock source, can provide accurate synchronous control.
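The link rate and resolution figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 160 Gbps total is the raw sample payload expanded by the 8b/10b line coding that JESD204B uses, with no other framing overhead; that accounting is an assumption, not a statement about the actual link configuration.

```python
import math

SAMPLE_RATE_SPS = 2_000_000_000  # 2 GSPS per channel (from the text)
BITS_PER_SAMPLE = 16             # 16-bit resolution (from the text)
CHANNELS = 4                     # output channels per AWG unit (from the text)
LANES = 16                       # JESD204B lanes per AWG unit (from the text)

# Raw sample payload, then 8b/10b expansion (10 line bits per 8 data bits).
payload_bps = SAMPLE_RATE_SPS * BITS_PER_SAMPLE * CHANNELS
line_bps = payload_bps * 10 // 8
lane_bps = line_bps // LANES

# Bits needed to resolve 1 part in 5000, compared with the 16 bits available.
required_bits = math.ceil(math.log2(5000))

print(f"payload {payload_bps / 1e9:.0f} Gbps, line {line_bps / 1e9:.0f} Gbps, "
      f"per lane {lane_bps / 1e9:.0f} Gbps")
print(f"1/5000 resolution needs {required_bits} bits; the DAC provides {BITS_PER_SAMPLE}")
```

Under these assumptions, the 128 Gbps payload becomes 160 Gbps on the line, or 10 Gbps per lane, and a 1/5000 resolution requires 13 bits, which leaves margin within the 16-bit DAC.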
An extensible AWG array may be achieved by simply integrating multiple AWGs with a synchronization control module.
To reduce the size of the system, we integrate the AWG with individual IQ mixers (Sinolink SIQM0408) and power splitters onto a single backplane and constrain the height to 1U of a standard server rack. As displayed in Fig. 2, two channels from the same DAC chip are used as I/Q signals and are wired to the IQ mixer's corresponding inputs. A common microwave source is connected to the power splitter, and four outputs are separately fanned out to the LO inputs of four IQ mixers. Every complementary channel exiting from the differential amplifier is used for monitoring or fast bias control. packets so that the system states may be monitored by any host in the same LAN. Key waveform output control data are also transmitted through the network and sent to the waveform output control module.

B. FPGA implementation
The waveform output control module operates under a 250 MHz clock, which is derived from the external clock interface. The same clock input allows multiple AWG modules to operate under one clock source. The module's interface with the DAC chip is implemented by the Xilinx JESD204B IP core, which operates at a 10-Gbps rate under subclass 1. The JESD204B subclass 1 protocol supports deterministic delay control, so the time delay of the waveform data from the JESD204B IP core input to the DAC output is computable. In the present design, the deterministic delay of the JESD204B protocol is 191 ns.
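The relation between the 250 MHz control clock, the 2 GSPS output rate, and the 191 ns deterministic delay can be sketched as follows. The per-clock word width is an inference under the assumption that all samples of one channel are produced in the 250 MHz domain, and `issue_time_ns` is a hypothetical helper, not part of the reported firmware.

```python
DAC_RATE_HZ = 2_000_000_000      # 2 GSPS (from the text)
FPGA_CLOCK_HZ = 250_000_000      # waveform output control clock (from the text)
BITS_PER_SAMPLE = 16             # 16-bit samples (from the text)
LATENCY_NS = 191                 # deterministic JESD204B delay (from the text)

# Samples that must be produced per channel in every control clock cycle.
samples_per_clock = DAC_RATE_HZ // FPGA_CLOCK_HZ
bits_per_clock = samples_per_clock * BITS_PER_SAMPLE

# The fixed delay expressed in DAC sample periods.
latency_samples = LATENCY_NS * DAC_RATE_HZ // 1_000_000_000


def issue_time_ns(target_output_ns: int) -> int:
    """Time at which data must enter the JESD204B core to reach the DAC output
    at target_output_ns, given the fixed delay."""
    return target_output_ns - LATENCY_NS


print(samples_per_clock, bits_per_clock, latency_samples)
print(issue_time_ns(1000))
```

Because the delay is fixed, it can be treated as a constant offset when aligning channels, which is what makes trigger-based synchronization across boards tractable.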

C. AWG control
We implement a waveform output sequence controller for each channel. Each controller is composed of three parts: waveform data memory (WDM), sequence data memory (SDM), and a finite state machine (FSM) for reading and controlling. As shown in Fig. 4, the memory data are set by the host computer and provide a direct arbitrary waveform output. According to the flag bits of the sequence data, the FSM reads the waveform data from the WDM and determines the output mode of the waveform data. The output control mode includes the start address, length, trigger source, counter value, etc. Owing to the qubit's decoherence time limit, the length of the control pulse generally does not exceed the qubit's decoherence time. According to the currently reported qubit decoherence times, this value is around 10-100 μs; hence, the storage depth of the WDM is about 50 μs, and the WDM is realized by the BRAM resources in the FPGA. In actual use, a combination of multiple sequence data and multisegment waveforms is required to achieve a longer waveform output. When a longer waveform is required, it can be realized by the on-chip DDR4 memory. In general, the programmable arbitrary waveform generation function is achieved by the FSM together with the WDM and SDM. The SDM can be regarded as an instruction buffer and supports up to 4096 instructions. The currently supported instructions are listed in Table I.

TABLE I.
Instruction type | Function
Direct output | Directly outputs a waveform in the waveform area without any condition; the output length is determined by the address field and the length field in the instruction.
Wait output | Delayed output command; the delay count is determined by the 16 bit counter field, and the maximum delay time is ∼250 μs.
Trigger output | The instruction starts executing on the rising edge of the input trigger. There is also a 16 bit wait counter that runs after the trigger signal arrives.
Qubit state output | The qubit state is divided into 0, 1, 2, and NULL. The waveform corresponding to each state is stored in a different region of the WDM. The input qubit state selects which waveform is output.

FIG. 5. Example flow path for executing seven instructions. First, the "Trigger output" instruction executes at the rising edge of an input trigger. Second, the "Loop begin level 1" instruction executes and sets "Loop count 1." Then, the "Wait output" instruction executes after its wait counter expires. Next, the "Loop begin level 2" instruction executes and sets "Loop count 2." It is followed by a "Qubit state output" instruction, which executes according to the qubit information that it has received. Next is the "Loop end level 2" instruction, which corresponds to the "Loop begin level 2" instruction and counts down "Loop count 2." If "Loop count 2" equals 0, the next instruction is "Loop end level 1"; otherwise, the execution flow jumps to "Loop begin level 2." The last instruction is "Loop end level 1," which counts down "Loop count 1." If "Loop count 1" is not 0, the execution flow jumps to "Loop begin level 1." If "Loop count 1" equals 0, the execution flow ends when the "Total repeat count" equals 0; otherwise, it jumps to the first instruction and counts down the "Total repeat count."
Each instruction independently controls which area of the WDM is output. The "Trigger output" instruction is used to synchronize all channel outputs. Some instructions can set waiting times before execution; the maximum waiting time is 65 μs. The instructions in the SDM can also be regarded as a whole set and be repeatedly executed. The repeat count can be set through a 16 bit loop counter. Figure 5 shows an example flow path for executing seven instructions.
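The memory depth and counter ranges discussed in this section follow from simple arithmetic, sketched below. The tick period of the 16 bit wait counter is not stated explicitly, so two hypothetical values are evaluated: one 250 MHz clock cycle (4 ns) per count and 1 ns per count.

```python
SAMPLE_RATE_HZ = 2_000_000_000   # 2 GSPS (from the text)
BITS_PER_SAMPLE = 16             # 16-bit samples (from the text)
WDM_DEPTH_US = 50                # WDM storage depth in microseconds (from the text)
COUNTER_BITS = 16                # width of the wait counter (from the text)

# Samples and bytes needed to hold 50 us of waveform for one channel.
wdm_samples = SAMPLE_RATE_HZ * WDM_DEPTH_US // 1_000_000
wdm_bytes = wdm_samples * BITS_PER_SAMPLE // 8


def max_wait_us(tick_ns: float) -> float:
    """Longest wait of a COUNTER_BITS-wide counter for an assumed tick period."""
    return (2 ** COUNTER_BITS) * tick_ns / 1000.0


print(wdm_samples, wdm_bytes)    # samples and bytes per channel
print(max_wait_us(4.0))          # assumed tick: one 250 MHz clock cycle
print(max_wait_us(1.0))          # assumed tick: 1 ns
```

A 50 μs depth corresponds to 100 000 samples, or 200 kB, per channel. A 4 ns tick gives a maximum wait of about 262 μs, close to the ∼250 μs listed for the "Wait output" instruction, whereas a 1 ns tick gives about 65.5 μs, close to the 65 μs quoted above.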
When running an arbitrary waveform output, the resources are ready and waiting for the first instruction to be executed. When an instruction begins to run, the FSM prefetches the next instruction to determine and prepare the required resources. Hence, the resources are ready for every instruction to be executed, and the FSM can seamlessly switch to the next instruction without delay. However, a constraint arises where the output length of each waveform area cannot be less than four. As a result, it can be ensured that the prefetching action/stage has sufficient time for the resources of the next instruction to be well prepared.
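The control flow of the sequence controller can be illustrated with a small behavioral model of the seven-instruction example of Fig. 5. This is a sketch under stated assumptions, not the FPGA implementation: the `Instr` class, the opcode strings, and the `run` function are hypothetical names; prefetching and timing are not modeled; and a "loop end" is assumed to jump to the instruction after its matching "loop begin" so that the loop counter is not reinitialized.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str         # "trigger", "wait", "direct", "state", "loop_begin", "loop_end"
    arg: int = 0    # loop count for loop_begin; wait count for wait/trigger
    level: int = 0  # loop nesting level (1 or 2) for loop_begin/loop_end


def run(program, qubit_states, total_repeat=1):
    """Execute the program and return the ordered list of output events."""
    trace = []
    states = iter(qubit_states)
    for _ in range(total_repeat):
        counters, begin_pc = {}, {}
        pc = 0
        while pc < len(program):
            ins = program[pc]
            if ins.op == "loop_begin":
                counters[ins.level] = ins.arg      # set the loop counter
                begin_pc[ins.level] = pc
            elif ins.op == "loop_end":
                counters[ins.level] -= 1           # count down
                if counters[ins.level] > 0:
                    pc = begin_pc[ins.level] + 1   # repeat the loop body
                    continue
            elif ins.op == "state":
                # Select the waveform region according to the received qubit state.
                trace.append(f"state:{next(states)}")
            else:
                trace.append(ins.op)               # trigger, wait, or direct output
            pc += 1
    return trace


program = [
    Instr("trigger"),
    Instr("loop_begin", arg=2, level=1),
    Instr("wait", arg=100),
    Instr("loop_begin", arg=2, level=2),
    Instr("state"),
    Instr("loop_end", level=2),
    Instr("loop_end", level=1),
]
trace = run(program, qubit_states=[0, 1, 2, 0])
print(trace)
```

With both loop counts set to 2, the model emits one trigger event followed by two repetitions of a wait and two state-dependent outputs, which mirrors the nesting described for Fig. 5.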

III. TESTING
A series of tests has been conducted on the present AWG design, including static tests, dynamic tests, and qubit tests. A Keysight 34470A 7½-digit multimeter is used for the integral nonlinearity/differential nonlinearity (INL/DNL) static test. A Keysight N9030B 8.4 GHz bandwidth spectrum analyzer is used in the spurious free dynamic range (SFDR) and phase noise tests. A Keysight DSO-X 6004A 1 GHz bandwidth oscilloscope is used in the synchronization test. Finally, the AWG presented in this paper is used to conduct qubit control on a real qubit system.

A. INL/DNL
The AWG delivers 16 bit vertical resolution, that is, 65 536 codes. The INL/DNL test requires traversing all of the codes. In the straightforward scheme, we set a digital code and use a high-precision multimeter to measure the output voltage. The digital code is then continuously increased until all of the codes are traversed. This test takes about 0.1 s for one code and about 2 h to go through all of the codes for a single channel. To shorten the test, we exploit the fact that the high-precision multimeter supports an external trigger and can store 50 000 samples. The AWG outputs 65 535 steps with a duration of 1 ms for each step, each of which is synchronized to a trigger signal. The trigger signal is connected to the external trigger input of the multimeter. To collect the measured data, the host computer controls the AWG as well as the multimeter and carries out the INL and DNL analysis. With this improved testing scheme, the test time of a single code value is reduced to 1 ms, and the whole test time is reduced to about 1 min. The results of the test demonstrate that the AWG DNL and INL are within 2 least significant bits (LSBs).

B. Phase noise
Figure 6 illustrates the phase noise of one AWG channel at seven distinct frequencies, i.e., 10, 20, 40, 50, 100, 200, and 250 MHz. The noise floor at frequency offsets greater than 10 MHz tends to be consistent. Increasing the signal frequency yields downward movement of the phase noise curve. In particular, the phase noise curve moves ∼6 dB when the frequency is doubled, which is consistent with theory. 9
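For the static test of Sec. III A, the INL and DNL analysis carried out on the host computer can be sketched as follows. The endpoint definition used here is an assumption, since the text does not state whether an endpoint or a best-fit line was used, and the function name `inl_dnl` and the eight-code example data are hypothetical.

```python
def inl_dnl(voltages):
    """Endpoint-fit DNL and INL, both in LSB, from one measured voltage per code."""
    n = len(voltages)
    # Average step size between the first and last code defines 1 LSB.
    lsb = (voltages[-1] - voltages[0]) / (n - 1)
    # DNL: deviation of each code-to-code step from 1 LSB.
    dnl = [(voltages[i + 1] - voltages[i]) / lsb - 1.0 for i in range(n - 1)]
    # INL: deviation of each code from the straight line through the endpoints.
    inl = [(voltages[i] - voltages[0]) / lsb - i for i in range(n)]
    return dnl, inl


# Synthetic 3-bit example (not measured data): code 3 sits half an LSB too high.
example = [0.0, 1.0, 2.0, 3.5, 4.0, 5.0, 6.0, 7.0]
dnl, inl = inl_dnl(example)
print(max(abs(x) for x in dnl), max(abs(x) for x in inl))
```

For the 16 bit AWG, the same function would be applied to the 65 536 voltages collected by the multimeter, and the reported result corresponds to the maximum absolute DNL and INL staying within 2 LSB.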

C. Spurious free dynamic range (SFDR)
In the SFDR measurement, the harmonic noise represents the main spurious noise source, especially when the measurement signal is a single-tone sine wave. If the RF attenuation setting of the spectrum analyzer is not reasonable, a considerable difference between the tested harmonic noise and the actual harmonic noise will be observed. We set the AWG gain to its lowest output setting and set the spectrum analyzer input attenuation to 0 dB. As shown in Fig. 7, the SFDR curve is plotted using 25 frequency points, i.e., from 10 MHz to 250 MHz with a 10 MHz step size. It must be noted that the SFDR test results here include harmonic noise in the range of 10-500 MHz. If harmonic noise is not considered, the SFDR test results are −68.8 dBc in the range of 10-500 MHz and −83.2 dBc at 100 MHz. The test results are consistent with the chip's datasheet.

D. Synchronization
We test the jitter among AWG channels by simultaneously outputting square-wave signals from multiple AWG channels to the oscilloscope and then recording the delays between two AWG channels and their standard deviations. Figure 8 displays the jitter of 40 arbitrary waveform output channels on 10 AWGs. The minimum and maximum standard deviations are 9.22 ps and 10.89 ps, respectively, with a mean value of 9.9 ps. The skew among the channels is ∼100 ps. The large skew is due to the differing signal line delays from the clock signal input to the AWGs and from the AWG outputs to the oscilloscope input. The skew is deterministic and may be adjusted by cable length.
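The distinction between skew (mean delay) and jitter (standard deviation of the delay) can be sketched as follows, together with the cable length that would compensate a given skew. The function names are hypothetical, the example delays are synthetic, and the velocity factor of 0.7 is an assumed typical value for coaxial cable rather than a property of the cables used here.

```python
import statistics

C_MM_PER_PS = 0.299792458  # speed of light in vacuum, in mm per ps


def skew_and_jitter(delays_ps):
    """Return (mean delay, sample standard deviation) of repeated delay measurements."""
    return statistics.mean(delays_ps), statistics.stdev(delays_ps)


def cable_trim_mm(skew_ps, velocity_factor=0.7):
    """Cable length whose propagation delay equals skew_ps (assumed velocity factor)."""
    return skew_ps * C_MM_PER_PS * velocity_factor


# Synthetic delays between two channels, in ps (not measured data).
skew, jitter = skew_and_jitter([90, 100, 110])
print(skew, jitter)
print(round(cable_trim_mm(100), 1))  # length for a 100 ps skew, in mm
```

Under the assumed velocity factor, a skew of 100 ps corresponds to roughly 2 cm of cable, which is why a deterministic skew of this size can be removed by trimming cable lengths.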

E. Qubit test
The AWG exhibits an effective number of bits (ENOB) of 14, an SFDR of −68.8 dBc across 10-500 MHz, and a jitter of about 10 ps among different channels. We further test the AWG's function by measuring the qubit decoherence times (T1 and T2*), which are critical parameters for qubits. 10,11 The schematic diagram of the test system is shown in Fig. 1.
We use the AWG design presented in this paper to test T1 and T2* of a qubit. The T1 test sweeps the Z bias voltage and takes the statistical mean value. As shown in Fig. 9, the T1 test result is ∼12 μs. The color bar is the measured population of the |1⟩ state after a given time on the y-axis, and the x-axis is the Z bias, which tunes the qubit frequency. The qubit tuning frequency range is hundreds of megahertz. As can be seen from Fig. 9, T1 is symmetric about zero bias, i.e., about the maximum qubit frequency. Figure 10 shows the T2* test results produced by the AWG. The T2* measurement results are calculated by fitting the envelope of successive Ramsey experiments; 3 these data are recorded at the qubit's maximum frequency, which is set by the Z bias. The T1 and T2* tests verify that the AWG is qualified for controlling the superconducting quantum processor.
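As an illustration of how a T1 value is extracted from such a measurement, the sketch below fits an exponential decay of the |1⟩ population by a least-squares line through the logarithm of the data. It assumes a decay of the form P(t) = A exp(−t/T1) with no offset, uses synthetic noise-free data, and does not reproduce the fitting procedure actually used for Fig. 9; the function name `fit_t1` is hypothetical.

```python
import math


def fit_t1(times_us, populations):
    """Fit ln(P) versus t with a straight line and return T1 = -1/slope."""
    ys = [math.log(p) for p in populations]
    n = len(times_us)
    mean_t = sum(times_us) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope of ln(P) against t.
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_us, ys))
    den = sum((t - mean_t) ** 2 for t in times_us)
    return -den / num


# Synthetic populations generated with T1 = 12 us (not measured data).
times = [0.0, 5.0, 10.0, 20.0, 40.0]
pops = [math.exp(-t / 12.0) for t in times]
t1 = fit_t1(times, pops)
print(f"fitted T1 = {t1:.3f} us")
```

With real data, readout offsets and noise would usually call for a nonlinear fit with an additional offset parameter; the log-linear form is used here only because it needs no external libraries.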
The qualification of the AWG for qubit control is further characterized by the measurement of qubit gate fidelity. By testing three superconducting quantum processors and extracting the fidelity of the single-qubit gate through randomized benchmarking (RB), we find that the average single-qubit gate fidelity of AWG-controlled qubits is over 0.995. In the genuine 12-qubit entanglement experiment, 12 where 40 AWG channels are used, the average single-qubit gate fidelity of the 12 qubits is 0.998. On another 12-qubit superconducting processor, 38 AWG channels are used to implement strongly correlated quantum walks, 13 and the average single-qubit gate fidelity of the 12 qubits is 0.997. On a 24-qubit superconducting processor, 80 AWG channels are used for control, and the average single-qubit gate fidelity is 0.995. 14 Furthermore, in order to verify that the AWG's synchronization is suitable for a scalable quantum computer, we perform the two-qubit CZ gate fidelity test. In the genuine 12-qubit entanglement experiment, the tested average CZ gate fidelity is 0.939. 12 The above results show that the AWG is qualified in both the single-qubit fidelity test and the two-qubit CZ coupling test. At the same time, the multiqubit experiments also verify the scalability of the AWG in controlling tens of qubits.

FIG. 9. T1 measurement results for the designed 2 GSPS 16-bit AWG. The T1 test sweeps the Z bias voltage and takes the statistical mean value; the color bar is the measured population of the |1⟩ state after a given time on the y-axis, and the x-axis is the Z bias, which tunes the qubit frequency. The qubit tuning frequency range is hundreds of megahertz.

IV. CONCLUSION
We introduce a scalable and highly integrated AWG array for a superconducting quantum computing control system. The AWG consists of two DACs, one FPGA, and a gigabit Ethernet transceiver. The designed AWG has a 2-GSPS sampling rate and 16-bit vertical resolution. With the programmable waveform instruction set, the AWG can output waveforms with deterministic latency. 40 and 80 AWG channels are used to construct the control systems of a 12-qubit and a 24-qubit superconducting quantum processor, respectively. The test results show that the average single-qubit gate control fidelity is over 0.995 and the average two-qubit CZ gate control fidelity is 0.939. With its flexible instruction structure, this AWG is also used in the Twin-Field Quantum Key Distribution (TFQKD) experiment to modulate the source pulse. 15 In summary, this AWG lays the foundation for superconducting quantum processor experiments and opens up possibilities for customizable quantum control circuits.