Photon counting with photon number resolution through superconducting nanowires coupled to a multi-channel TDC in FPGA

The paper presents a system for measuring photon statistics and photon timing in the few-photon regime down to the single-photon level. The measurement system is based on superconducting nanowire single-photon detectors and a time-to-digital converter implemented in a programmable device. The combination of these devices gives the system high performance in terms of resolution and adaptability to the actual experimental conditions. As an application example, we present the measurement of photon statistics for coherent light states. In this measurement, we make use of 8th-order single-photon correlations to reconstruct with high fidelity the statistics of a coherent state with an average photon number up to 4. The processing is performed by means of a Time-to-Digital Converter (TDC) architecture that also hosts an Asynchronous-Correlated-Digital-Counter (ACDC), implemented in a Field Programmable Gate Array (FPGA) device and specifically designed for performance optimization in multi-channel usage.


I. INTRODUCTION
Extremely sensitive light sensing can be performed by observing the transition of a section of a current-biased superconducting nanowire from the superconducting state to the normal resistive state as a result of photon absorption. Devices based on this principle are called superconducting nanowire single-photon detectors (SNSPDs) and find application in several areas of quantum information technology 1 thanks to their ability to detect single photons with near-unity probability 2 over a large wavelength range from UV to infrared. SNSPDs have been used to measure single-photon emission from a wide variety of light sources such as individual dopants in carbon nanotubes, 3 color centers in silicon carbide, 4 and semiconductor quantum dots. 5 In addition to high photon detection efficiency, SNSPDs provide several advantages over other single-photon sensitive devices: extremely low timing jitter, 6 the absence of afterpulsing, and a low dark count rate. These properties make them the ideal detector for applications where efficient detection of weak signals with high time resolution is required. Examples include photon correlations, 7 high-resolution light detection and ranging (Lidar), 8 oxygen singlet detection, 9 optical time domain reflectometry in telecommunication networks, 10 and deep-space optical communication. 11 In this work, the high-resolution timing measurements of detected pulses are performed by means of a Tapped-Delay-Line Time-to-Digital Converter (TDL-TDC) implemented in a Field Programmable Gate Array (FPGA) device. Our architecture differs from other commonly used timing electronics, such as the Time-to-Amplitude Converter (TAC) or the Delay-Locked-Loop Time-to-Digital Converter (DLL-TDC) implemented in an Application-Specific Integrated Circuit (ASIC), in which the FPGA is used for management purposes only. 12 A modern TDL-TDC implemented in an FPGA can achieve performance almost equal to that of ASIC TDC systems; moreover, in terms of cost and design convenience, the FPGA realization is by far preferable owing to much lower development costs and the much higher flexibility of the implemented architecture, thanks to the programmable resources of the device. The main feature of the developed architecture is that it specifically targets multi-channel applications with minimum resource and power overhead. In Table I we summarize the performance achieved by the proposed TDL-TDC architecture in comparison with some state-of-the-art commercially available solutions. 13

II. DESCRIPTION OF THE DETECTOR
The detector has the shape of a meandering NbTiN nanowire with 100 nm width and 50% fill factor. The nanowire detector covers a circular area 16 µm in diameter. The superconductor is positioned on top of a resonant cavity formed by a 135 nm silicon-oxide layer and gold in order to enhance the photon absorption probability at ∼800 nm. 14 This wavelength range is particularly important for quantum experiments since state-of-the-art single-photon sources, both solid-state and based on photon down-conversion, emit in this range. The detector is operated at a temperature of ∼2.5 K and is coupled to a single-mode optical fiber. 15 Figure 1 shows the calibration of one superconducting nanowire detector used in these experiments. The system detection efficiency measured at 878 nm is plotted as a function of the bias current applied to the superconducting sensor. The device reaches saturation at a system detection efficiency of 80.7%. This efficiency is the ratio of the detected photon rate to the photon rate at the cryostat input. It therefore includes the optical losses at the fiber connector located at the cryostat input as well as the transmission losses of the optical fiber that connects the superconducting detectors inside the cryostat to the cryostat input. Figure 1(b) is acquired with a LeCroy Waverunner 640Zi oscilloscope. 16 The electrical jitter of the laser trigger signal ($\sigma_{\mathrm{LASER}}$) was measured separately and contributes less than 6 ps r.m.s., corresponding to 14 ps FWHM. The contribution of the oscilloscope ($\sigma_{\mathrm{SCOPE}}$ below 2 ps r.m.s., that is, below 4.7 ps FWHM) can be neglected. The jitter contribution of the SNSPD ($\sigma_{\mathrm{SNSPD}}$) is below 8 ps r.m.s. and that of the amplification stage ($\sigma_{\mathrm{AMPLI}}$) is less than 14 ps r.m.s., i.e.,

III. EXPERIMENTAL SETUP
The timing circuit performing the measurement consists of a resource-saving 8-channel TDL-TDC 17 and an Asynchronous-Correlated-Digital-Counter (ACDC), both implemented in the Programmable Logic (PL) section of a System-on-Chip (SoC) programmable device, i.e., a Xilinx ZYNQ-7020, 18 which integrates a 28 nm Xilinx FPGA in the PL section and an ARM-based processor in the Processing System (PS) section. The setup is composed of five main parts (Fig. 2): a detection section, an analog stage, a group of comparators for digitizing the output of the analog stage, the SoC that includes the TDL-TDC and the ACDC, and read-out software running on a host PC.
The signal at the output of each detector has a decaying exponential shape with a peak amplitude of about 0.5 mV and a nominal decay time constant of 15 ns (Fig. 5(d)). The analog stage is composed of 8 (one per channel) AC-coupled, low-noise, band-pass double-stage amplifiers with a gain of 50 dB over the 10 MHz-1 GHz bandwidth, which add a timing jitter $\sigma_{\mathrm{AMPLI}}$ of 14 ps r.m.s. to the intrinsic jitter $\sigma_{\mathrm{SNSPD}}$ of the single SNSPD, which is less than 8 ps r.m.s. 6 Each output of the 8 analog amplifiers is converted into a digital pulse by a programmable threshold discriminator based on an ultra-fast, low-jitter comparator (Analog Devices ADCMP605BCPZ 19 ), whose output is a logic level in LVDS format. The timing jitter introduced by each comparator ($\sigma_{\mathrm{CMP}}$) is below 7 ps r.m.s. As stated above, the multi-channel TDL-TDC and the ACDC are implemented in a commercial 28 nm programmable device (Xilinx ZYNQ-7020). The main features of the TDL-TDC architecture (firmware only) are low power consumption (430 mW), a modest use of the resources needed for implementation (30% of the PL section), a reconfigurable number of channels, and high time resolution and precision. The average tap resolution of the TDL-TDC is 8.6 ps and the precision is guaranteed to be below 15 ps r.m.s. per channel ($\sigma_{\mathrm{TDC}}$); moreover, if the threshold comparator is included at the input, the channel precision $\sqrt{\sigma_{\mathrm{TDC}}^2 + \sigma_{\mathrm{CMP}}^2}$ is 17 ps r.m.s. The resolution is constant over the full-scale range. The nominal full-scale range of the TDL-TDC is 10.7 seconds, but it is limited here to 640 ns because the maximum time interval to be measured is the time skew between two different channels, which is less than a few tens of ns.
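The jitter figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the contributions are statistically independent (so that they add in quadrature) and Gaussian (so that FWHM = 2√(2 ln 2) σ); the function and constant names are illustrative and do not come from the firmware or the read-out software.

```python
import math

# Conversion factor between r.m.s. (sigma) and FWHM for a Gaussian distribution.
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))  # about 2.355


def quadrature_sum(*sigmas_ps):
    """Combine independent r.m.s. jitter terms (in ps) in quadrature."""
    return math.sqrt(sum(s * s for s in sigmas_ps))


def rms_to_fwhm(sigma_ps):
    """Convert an r.m.s. jitter to FWHM, assuming a Gaussian shape."""
    return FWHM_PER_SIGMA * sigma_ps


# Upper bounds quoted in the text, in ps r.m.s.
SIGMA_SNSPD = 8.0
SIGMA_AMPLI = 14.0
SIGMA_CMP = 7.0
SIGMA_TDC = 15.0

# TDC channel with its threshold comparator at the input.
tdc_with_comparator = quadrature_sum(SIGMA_TDC, SIGMA_CMP)
# Complete channel: detector, amplifier, comparator, and TDC.
full_channel = quadrature_sum(SIGMA_SNSPD, SIGMA_AMPLI, SIGMA_CMP, SIGMA_TDC)

print(f"TDC + comparator: {tdc_with_comparator:.1f} ps r.m.s.")
print(f"full channel:     {full_channel:.1f} ps r.m.s.")
print(f"6 ps r.m.s. = {rms_to_fwhm(6.0):.1f} ps FWHM")
```

Under these assumptions the TDC plus comparator comes out at about 16.6 ps r.m.s., in line with the 17 ps quoted, and the complete channel at about 23.1 ps r.m.s.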
The ACDC has 8 channels and records, for each acquisition, whether an event is present on each channel. The maximum time resolution of the trigger is 2.5 ns, which is significantly shorter than the minimum spacing between detections in our experiment, set by the temporal distance between laser pulses (i.e., 12.6 ns). One channel is set as the trigger of the acquisition process and starts the photon counting on the other channels within a programmable time window (Fig. 3). First, the SoC device performs the time measurements with the TDL-TDC implemented in the programmable logic; then it sends the measurement data to the host PC via USB. By means of the read-out software hosted on a standard computer, the user selects which channel is the "start" of the acquisition and sets the duration of the programmable time window. The TDL-TDC/ACDC provides data in real time, so the user on the host PC can compute in real time the timing statistics among the 8 channels and, after the acquisition, the corresponding photon statistics. In order to minimize the total jitter, it is worth using an acquisition time shorter than or equal to the repetition period of the laser pulses. Consequently, the procedure involves acquiring the 7 channels after receiving a laser pulse that triggers the start of the measurement, as shown in Fig. 3.
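The windowed counting described above can be illustrated in software. The following is a minimal sketch of the idea only, not the FPGA implementation: it assumes that timestamps are already available as sorted lists, and the function name, the window length, and the sample data are all hypothetical.

```python
from bisect import bisect_left
from collections import Counter


def count_clicks_per_trigger(trigger_times, channel_times, window):
    """For each trigger, count the channels with >= 1 event in [t, t + window).

    trigger_times: sorted list of trigger timestamps.
    channel_times: list of sorted timestamp lists, one per counting channel.
    Returns a Counter mapping number-of-channels-that-fired -> occurrences.
    """
    histogram = Counter()
    for t in trigger_times:
        fired = 0
        for times in channel_times:
            i = bisect_left(times, t)  # first event at or after the trigger
            if i < len(times) and times[i] < t + window:
                fired += 1
        histogram[fired] += 1
    return histogram


# Hypothetical timestamps in ns: three triggers spaced by 10 us.
triggers = [0.0, 10_000.0, 20_000.0]
channels = [
    [3.0, 10_004.0],  # fires inside the windows of triggers 1 and 2
    [2.5],            # fires inside the window of trigger 1 only
    [15_000.0],       # event outside every window
]
hist = count_clicks_per_trigger(triggers, channels, window=12.0)
print(dict(hist))
```

Triggers that are followed by no event on any channel land in the zero bin, which is how zero-photon pulses enter the click histogram.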

IV. DETECTION OF THE PHOTON TIMING IN A WEAK OPTICAL PULSE
As a first experiment, we measure the statistics of the arrival times of the photons from a weak laser pulse, with the purpose of determining the time resolution of our complete system. High time resolution at the single-photon level is of fundamental importance for analyzing fast photon emission dynamics at low illumination fluxes or in the presence of quantum emitters. An application example is the high-contrast measurement of two-photon quantum interference. 5 In lidar and optical time-domain reflectometry, distances are determined by the time-of-flight measurement of optical signals reflected by the target. In this case, the accuracy of the distance measurement is directly determined by the time resolution of the measurement.
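To give a feeling for the link between time resolution and distance accuracy, the sketch below converts a timing uncertainty into a range uncertainty for a round-trip measurement, Δd = c·σ_t/(2n). The function name, the example timing values, and the fiber group index are illustrative assumptions and are not taken from the experiment.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum (exact, SI definition)


def range_resolution_m(sigma_t_s, group_index=1.0):
    """Distance uncertainty for a round-trip time-of-flight measurement.

    The light travels to the target and back, hence the factor 2.
    group_index is 1.0 for free space; a value near 1.47 is a common
    approximation for silica fiber (assumption, not from the text).
    """
    return C_VACUUM_M_PER_S * sigma_t_s / (2.0 * group_index)


# Illustrative timing uncertainties in ps r.m.s.
for sigma_ps in (10.0, 25.0, 50.0):
    d_mm = range_resolution_m(sigma_ps * 1e-12) * 1e3
    print(f"{sigma_ps:5.1f} ps r.m.s. -> {d_mm:.2f} mm")
```

In free space, a timing uncertainty of a few tens of picoseconds therefore corresponds to a range uncertainty of a few millimeters.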
The laser source used in our experiment is a Ti:Sapphire laser with 76 MHz repetition rate, 6 ps pulse width, and 750 nm wavelength. Referring to the layout shown in Figure 2, the TDC measures the temporal distance of the events on each channel from the reference trigger on channel 8, which is given by the internal fast photodiode of the laser. The time interval between any two channels and its temporal statistics can be shown on the host PC via the read-out software.
In order to identify the jitter contributions of the different components, we first bypassed the detector and the analog stage by feeding the comparators directly with the signals of a function generator. In order to maintain the same experimental conditions, the signals are generated with the same temporal shape as the output of the superconducting nanowire detectors and over a range of rates from 50 kHz to 1.8 MHz. Table II gives a synoptic view of the resolutions of all channels measured taking channel 8 (LASER) as reference. For every other choice of trigger channel and for every pair of channels, the cascade of comparators and TDL-TDC maintains the measurement precision 12 of the pair of channels below 25 ps r.m.s., that is, 17 ps r.m.s. for each single channel ($\sqrt{\sigma_{\mathrm{TDC}}^2 + \sigma_{\mathrm{CMP}}^2}$). By measuring the complete detection and timing channel, i.e., including the superconducting nanowire detector and the analog circuitry in the measurement path, we observe that the minimum achievable resolution approximately doubles. Table III sums up the resolutions of all complete channels measured taking channel 8 as reference. The block of comparator and TDC guarantees a timing precision between pairs of channels below 58 ps r.m.s. for input rates varying from tens of kHz to a few MHz and for all possible START-STOP combinations, using any channel as trigger channel. It should be considered that the variance of the measurement $\sigma_{ij}^2$ is the sum of the variances of channels $i$ and $j$, which means that, for two equivalent channels, the resolution of a single channel can be estimated as $\sigma_{ij}/\sqrt{2}$. Next, we consider the estimated baseline fluctuation ($\sigma_{\mathrm{BASE}}$ < 40 ps r.m.s.) due to the stochastic arrival rate of the events and its impact on the resolution. The baseline fluctuation occurs because the photon detector transforms the deterministic 76 MHz rate of the laser into an equivalent lower stochastic rate.
We restore the baseline by applying a high-pass filter at 1 MHz at the input of the TDL-TDC. The precision then reaches ≤30 ps r.m.s. between pairs of channels (Fig. 4), which means 22 ps r.m.s. per channel ($\sigma_{\mathrm{CH}}$). The improvement with respect to the 50 ps of Table II is considerable. Moreover, this result is consistent with the best achievable value given the nominal jitter of the individual components of the system, which is calculated to be around 23.1 ps r.m.s.:
$$\sigma_{\mathrm{CH}} = \sqrt{\sigma_{\mathrm{SNSPD}}^2 + \sigma_{\mathrm{AMPLI}}^2 + \sigma_{\mathrm{CMP}}^2 + \sigma_{\mathrm{TDC}}^2} = \sqrt{8^2 + 14^2 + 7^2 + 15^2}\ \mathrm{ps} \approx 23.1\ \mathrm{ps\ r.m.s.}$$
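The step from a measured START-STOP spread to a per-channel figure can be sketched as follows. This is an illustration only: it assumes that the two channels are independent and contribute equally, and the timestamps are made-up values rather than data from Tables II and III.

```python
import math
import statistics


def pair_precision(start_times, stop_times):
    """R.m.s. spread (sample standard deviation) of the STOP - START intervals."""
    intervals = [stop - start for start, stop in zip(start_times, stop_times)]
    return statistics.stdev(intervals)


def single_channel_precision(sigma_pair):
    """Per-channel precision, assuming two equal and independent channels."""
    return sigma_pair / math.sqrt(2.0)


# Hypothetical START/STOP timestamps in ps (illustrative values only).
start = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
stop = [520.0, 1490.0, 2510.0, 3480.0, 4500.0]

sigma_ij = pair_precision(start, stop)
print(f"pair precision: {sigma_ij:.1f} ps r.m.s.")
print(f"per channel:    {single_channel_precision(sigma_ij):.1f} ps r.m.s.")
print(f"30 ps pair ->   {single_channel_precision(30.0):.1f} ps per channel")
```

With a pair precision of 30 ps r.m.s., the same division by √2 gives about 21 ps r.m.s. per channel, within the 22 ps bound quoted for the single channel.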

V. DETECTION OF THE PHOTON STATISTICS IN A WEAK OPTICAL PULSE
We now present a measurement that simultaneously employs several SNSPD devices connected to the TDL-TDC/ACDC unit, as displayed in Figure 2. Thanks to the ACDC part, we determine the number of photons in weak optical pulses by measuring the photon statistics. Photon number resolution in the few-photon regime is a crucial feature for the development of quantum information technologies such as quantum cryptography. Since secret keys are exchanged by means of single photons or photon pairs, an eavesdropper could obtain information on the secret key if excess photons are transmitted through the link because of imperfect photon sources. In order to maintain quantum protection against potential eavesdroppers, it is necessary to employ a tool for detecting the number of photons used over the communication link and the corresponding photon statistics.
Since the simultaneous absorption of more than one photon in an SNSPD gives the same electrical output as the absorption of one photon, a single SNSPD cannot distinguish the precise number of photons hitting the detector. In our experiments, we overcome this limitation by spatially multiplexing the optical signal onto 7 different SNSPDs installed in a cryostat based on a Gifford-McMahon closed-cycle cryo-cooler and by making use of the multi-channel TDL-TDC/ACDC architecture in the FPGA. By measuring via the TDL-TDC/ACDC the photon arrival time on the 7 detectors relative to the laser trigger, we are able to determine the number of detection events for each laser clock cycle. Through a calibration of the photon detection probability, we reconstruct the input photon statistics from the acquired histogram of coincident detection events up to the 8th order coincidence.
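The relation between the number of incident photons and the number of detectors that click can be written in closed form for an idealized case. The sketch below assumes identical on/off detectors with efficiency eta, uniform splitting, and no dark counts; it is a textbook model, not necessarily the calibration procedure used here, and the function names and the default value eta = 0.8 are illustrative.

```python
from math import comb


def click_probability(k, n, n_det=7, eta=0.8):
    """P(k detectors click | n photons) for n_det identical on/off detectors.

    Assumes uniform splitting, equal efficiency eta, and no dark counts.
    Uses inclusion-exclusion over the detectors allowed to receive photons.
    """
    total = 0.0
    for j in range(k + 1):
        # Per-photon probability of being lost or landing on one of
        # k - j chosen detectors.
        stay_inside = 1.0 - eta + eta * (k - j) / n_det
        total += (-1) ** j * comb(k, j) * stay_inside ** n
    return comb(n_det, k) * total


def click_matrix(n_max, n_det=7, eta=0.8):
    """Rows: click number k = 0..n_det; columns: photon number n = 0..n_max."""
    return [[click_probability(k, n, n_det, eta) for n in range(n_max + 1)]
            for k in range(n_det + 1)]


matrix = click_matrix(n_max=4)
for k, row in enumerate(matrix):
    print(k, " ".join(f"{p:.4f}" for p in row))
```

Each column of the matrix sums to one, and the click histogram expected for a given input distribution is the matrix applied to the vector of photon-number probabilities. Inverting this relation is what a reconstruction has to do, with the real, individually calibrated efficiencies in place of the single eta used here.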
Another way to measure the photon number in a weak optical pulse is to use parallel detectors that output electrical pulses of different heights as a function of the number of detectors transitioning to the normal state. 20-23 This technique requires the detectors to be operated at a much lower current than the critical current, and therefore below the optimum efficiency point. Alternatively, one can apply schemes for time multiplexing of the photons. 24,25 An incident pulse is split into N weaker pulses, which are delayed with respect to each other by a series of fiber delay lines. If the fibers are long enough to delay the photons by more than the detector dead time, this architecture enables measuring photon statistics using a limited number of photon detectors. However, this implementation is not easily scalable to high photon numbers and is not suitable for high repetition rates, i.e., high photon fluxes.
Our approach makes use of a scalable TDL-TDC/ACDC architecture that is connected to several high-efficiency photon detectors. This architecture can be readily scaled up to 32 detection channels, thanks to the parallel architecture of the FPGA, without compromising the performance shown in Table I. The device can make full use of the photon detection and time resolution performance of the superconducting nanowires.
The optical signal is obtained from a laser diode with a wavelength of 878 nm, electrically pulsed at a repetition rate of 100 kHz, with a pulse width of 2 ns. The width of the optical pulse is approximately one order of magnitude smaller than the device dead time, as shown in Figure 5. The dead time is measured as the time constant of the decay of the voltage output pulse, which is due to the recovery of superconductivity in the detector after a detection event. The width of the pulse is kept much shorter than the detector dead time to avoid consecutive detection events from a single detector during one measurement window. The average laser power is measured with a calibrated optical power meter and then attenuated by a tunable optical attenuator to achieve an average of a few photons per pulse. The Poissonian distribution of photons within the weak optical pulses arriving at the beam splitter is shown in Figure 5(a) for average photon numbers from 0.5 to 4.2. The number of sensors used in the experiments sets a limit to the average photon number that the device is able to measure. For the measurement, the laser is connected to an eight-port fiber beam splitter that multiplexes the input to 7 detectors. One port of the splitter is not used, since the electrical signal corresponding to the laser trigger is directly connected to channel 8 of the TDL-TDC. We note that it is essential that one channel of the TDL-TDC/ACDC acquires the laser trigger to open a detection window. The events where no detector clicks after the laser trigger has been acquired are assigned to the 0-photon probability. Each detector, as well as the transmission of each fiber, has been separately calibrated. The efficiency of each detector and the transmission of the fiber splitter have been used, for each input power used in Figure 4(a), to calculate the system response. Thus, the photon statistics at the fiber splitter are reconstructed from the measurement data and the known instrument response.
The reconstructed photon statistics for each input power are shown in Figure 4(b) after the acquisition of 10 000 photon pulses. The photon number inferred by our device is compared in Figure 4(c) with the photon number set with the optical attenuator and calibrated with the optical power meter. The results are shown with a linear fit with R 2 > 0.99. Additionally, we observe that the relative error in the determination of the photon number decreases with increasing number of photons per pulse because of the improving signal-to-noise ratio (i.e., photon count rate versus dark count rate). The average dark count rate (DCR) per channel in this experiment is between 30 and 50 Hz. The results demonstrate that, using superconducting nanowire detectors coupled to a multichannel TDL-TDC/ACDC, we can perform up to 8-fold photon coincidence measurements and measure with the ACDC the photon statistics of coherent states of light arriving at the beam splitter.
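For coherent light, the expected click statistics follow from the fact that a Poissonian beam split among several detectors produces independent Poissonian inputs at each of them. The sketch below uses this property to predict the click distribution and to recover the mean photon number from the fraction of pulses with no click. It is a simplified model that neglects dark counts; the function names, the efficiencies, and the ideal 1/8 splitting are assumptions for illustration, not calibration data from this work.

```python
import math


def coherent_click_distribution(mu, efficiencies, splits):
    """P(k clicks) for a coherent state of mean photon number mu.

    Detector i clicks independently with probability 1 - exp(-mu * eta_i * t_i),
    where eta_i is its efficiency and t_i its share of the input light.
    """
    dist = [1.0]
    for eta, share in zip(efficiencies, splits):
        p = 1.0 - math.exp(-mu * eta * share)  # click probability of this detector
        new = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            new[k] += prob * (1.0 - p)   # this detector stays silent
            new[k + 1] += prob * p       # this detector clicks
        dist = new
    return dist


def mean_photon_number_from_zero_clicks(n_zero, n_pulses, efficiencies, splits):
    """Estimate mu from P(0 clicks) = exp(-mu * sum(eta_i * t_i))."""
    if not 0 < n_zero <= n_pulses:
        raise ValueError("n_zero must be in (0, n_pulses]")
    overall = sum(eta * share for eta, share in zip(efficiencies, splits))
    return -math.log(n_zero / n_pulses) / overall


# Hypothetical setup: 7 detectors, 80% efficiency, ideal 1-to-8 splitter.
etas = [0.8] * 7
shares = [1.0 / 8.0] * 7
dist = coherent_click_distribution(2.0, etas, shares)
mu_hat = mean_photon_number_from_zero_clicks(dist[0] * 10_000, 10_000, etas, shares)
print(f"P(0 clicks) = {dist[0]:.4f}, recovered mu = {mu_hat:.3f}")
```

In a real measurement, n_zero would be the number of triggers with no click out of the acquired pulses, and the statistical uncertainty of the estimate would depend on how many such events are recorded.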

VI. CONCLUSIONS
A setup for measuring photon statistics and photon timing in the few-photon regime down to the single-photon level has been presented. The system is based on SNSPDs and a TDL-TDC/ACDC architecture implemented in an FPGA device. The main target applications lie in several areas of quantum information technology, which benefit from the detection of single photons at high efficiency over a wide wavelength range from UV to infrared, with extremely low timing jitter, absence of afterpulsing, and a low dark count rate.
Timing and counting in an FPGA device allow performance comparable to that of ASIC solutions, with the advantage of enabling flexible implementation of different architectures thanks to the programmable configuration of resources. In the presented system, this feature has been employed to synthesize a reconfigurable multi-channel structure with minimum resource and power overhead. Using an oscilloscope for START-STOP measurements would require the laser reference in order to avoid complex and not always feasible triggering methods. In that case, the ACDC component is not present and consequently the statistics of incoming photons cannot be calculated. Moreover, with an oscilloscope the number of channels measured in parallel is limited, while the FPGA allows the number of channels to be extended while keeping the resolution almost constant. Referred to the single channel, the complete measurement system has a precision below 22 ps r.m.s. and a power consumption of 2.5 mW, and occupies only 2.7% of the selected FPGA device (ZYNQ-7020). The trigger resolution is 2.5 ns.
The measurement system presented in this article could be promptly scaled up to perform timing measurements over 32 channels simultaneously, calculating statistics and detecting time correlations using compact, low-power, and resource-saving hardware.
Lusardi et al., Rev. Sci. Instrum. 88, 035003 (2017)