Roadmap on Material-Function Mapping for Photonic-Electronic Hybrid Neural Networks

Driven by machine-learning tasks, neural networks have demonstrated useful capabilities as nonlinear hypothesis classifiers. The underlying electronic technologies that perform the dot-product multiplication, the summation, and the nonlinear thresholding on the input data, however, are limited by the same capacitive challenges known from electronic integrated circuits. The optical domain, in contrast, provides low-delay interconnectivity suitable for such node-distributed non-Von Neumann architectures, which rely on dense node-to-node communication. Thus, once the neural network's weights are set, the delay of the network is given simply by the time of flight of the photon, which is in the picosecond range for photonic integrated circuits. However, memory functionality for storing the trained weights does not exist in optics, which demands a fresh look at synergies between photonics and electronics in neural networks. Here we provide a roadmap to pave the way for emerging hybridized photonic-electronic neural networks by taking a detailed look at a single node's perceptron and discussing how it can be realized in hybrid photonic-electronic heterogeneous technologies. We show that a set of materials exists that exploits synergies with respect to a number of constraints, including electronic contacts, memory functionality, electro-optic modulation, optical nonlinearity, and device packaging. We find that ITO, in particular, could provide a viable path for both the perceptron weights and the nonlinear activation function, while simultaneously being a foundry-process-near material. Finally, we identify a number of challenges that, if solved, could accelerate the adoption of such heterogeneous integration strategies of emerging memory materials into integrated photonics platforms for real-time responsive neural networks.

In contrast to conventional (Von Neumann) compute architectures that i) utilize centralized processors, ii) rely on logic technologies, and iii) execute stored programs sequentially, emerging non-Von Neumann systems are inherently i) decentralized and therefore ii) require significant communication between the many nodes of the network, yet iii) process information in parallel and iv) are trainable for applications such as machine learning. Electronic implementations of such brain-mimicked systems show significant (3-4 orders of magnitude) 6,7 energy-per-compute reductions, defined by the underlying mathematical interconnection function of NNs, namely multiply-and-accumulate (MAC). Thus, the efficiency and performance in the number of processed MACs per unit time and per unit energy (i.e. MAC/s and MAC/J) are two key metrics to be considered when new hardware technology and emerging materials are explored for NN technology. While electronic NN-based processors have shown improved energy efficiency, the time delay to obtain a classification result from the NN (e.g. of an inference ML task) is not improved compared to regular Von Neumann systems, due to the high RC delay of electronic circuits. In this work we consider the NN to be trained offline, and hence only discuss the performance under inference operation. This approach, however, requires mapping the ML task to the NN hardware, which is not straightforward for photonic networks at the time of writing. Nevertheless, initial approaches that modify graph simulators (e.g. TensorFlow) to include physical effects from the device layer (e.g. impact of noise, cascadability) show higher inference accuracy using electro-optic hybrid approaches for NNs 8 .
With the aforementioned successes of electronic-NN-based ML, the question becomes: why perform NNs in the optical domain? Before we answer this, one can ask more generally what the rationale might be for performing information processing in optics, and in integrated photonics in particular. Here we argue that the main rationale for photonic NN hardware is the short delay, which could be important for applications such as ranging, synthetic aperture radar, automated target recognition, or nonlinear predictive control 9 (Table 1). Once the dot-product, which provides the 'weights' of the photonic perceptron, is set (i.e. trained offline), the delay of the entire system depends only on the time-of-flight of the photon through the NN plus the response time of the electro-optic components of the perceptron 8,10-12 ; several implementation options exist here, and for the discussion of this roadmap we focus on the dot-product photonic weights and the nonlinear (NL) activation function. That is, the entire NN's delay (once trained) can be as short as 100's of picoseconds using integrated photonics and high-speed (10's of GHz) photodetectors. In addition to delay, there are other benefits of using photonics for NNs or information processing, such as parallelism from WDM and vanishing capacitive 'wire' delay (Table 1). However, challenges exist too, such as function-per-wafer-footprint, electro-optic (EO) conversion efficiencies, packaging, and, most relevant for this roadmap paper, the non-existence of photonic memories. It is for this very reason that we turn our attention to a hybridization strategy that combines the best of both worlds (Fig. 1a); that is, using the non-volatility from electronics and the interconnectivity from optics, combined with the compactness of integrated photonics, which also aims to leverage semiconductor process economy of scale through foundry services such as AIM, IMEC or IME.
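The delay argument above (time-of-flight plus electro-optic response) can be illustrated with a short numerical sketch. The path length, group index, and component response time below are hypothetical values chosen only for illustration; they are not taken from this paper.

```python
# Speed of light in vacuum (m/s); exact by SI definition.
C_VACUUM = 299_792_458.0


def time_of_flight_ps(path_length_m: float, group_index: float) -> float:
    """Photon transit time through a waveguide of given length, in picoseconds."""
    return path_length_m * group_index / C_VACUUM * 1e12


def inference_delay_ps(path_length_m: float, group_index: float,
                       eo_response_ps: float) -> float:
    """Total inference delay: time of flight plus electro-optic response time."""
    return time_of_flight_ps(path_length_m, group_index) + eo_response_ps


# Assumed (hypothetical) numbers: 1 cm optical path, group index of 4.0,
# and 30 ps combined response of modulator and photodetector.
tof = time_of_flight_ps(0.01, 4.0)
total = inference_delay_ps(0.01, 4.0, 30.0)
print(f"time of flight: {tof:.1f} ps, total delay: {total:.1f} ps")
```

Under these assumptions the total stays in the range of 100's of picoseconds, in line with the estimate given in the text.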
Thus, our aim in this work is to explore a roadmap of non-volatile materials and approaches that can be integrated with photonics to enable dot-product multiplications and thresholding in the optical domain. While volatile memory options in photonics could be implemented for the weighting function using, for example, electro-optic modulators 13-20 , their high optical material index instability is sub-optimal for the typical longevity of NN weights, which change only when a new training set is obtained (if ever); retraining can be particularly time consuming due to the 'curse-of-dimensionality' 21 . Thus, the community should explore emerging materials that feature favorable photonic and electro-optic performance while keeping chip integration synergies in mind, as discussed below. Neuromorphic photonics and memristor-based systems provide solutions that can deliver particularly high performance in terms of either throughput or energy efficiency in comparison with current purely CMOS-based execution 11,22 . The main advantage of exploiting Photonic Integrated Circuits (PICs) for NN tasks is intrinsic to the wave nature of the signal that carries the information (Fig. 1a). The fundamental operations in NN tasks, multiplications and accumulations (MAC), can be performed without any additional energy because a MAC is achieved by simple interference of phase-shifted electromagnetic waves travelling within waveguides 5,10,11 . Moreover, integrated photonics is naturally predisposed to ultra-high speed, characterized by short latency that is set a) by the time-of-flight of the photon in the chip (a few ps for large chips) and b) to a large extent by its electro-optic components (a few tens of ps), such as detectors and modulators.
Furthermore, by exploiting Wavelength Division Multiplexing (WDM) 5 , PICs can perform selective weighting operations on multiple inputs mapped onto different wavelengths using the same physical channel, supporting higher parallelism and remarkable throughput 5,23 . In addition, engineering light-matter interactions enables advances in efficient modulators, which can reach energy dissipation of 100's of attojoules per operation 24-26 . Nevertheless, the advancement of integrated photonics in neural networks is hindered by major challenges; besides the larger footprint with respect to microelectronics, which could be neglected considering the higher throughput and efficiency, the advancement of PICs at a larger scale is primarily held back by difficulties in the integration with a CMOS interface, local laser sources and packaging (Table 1). This is because the materials employed require dedicated industrial process recipes that have not yet reached sufficient maturity to provide cost-effective, consistent or scalable results. To clarify this point further: PIC-based components and subsystems are commercially sold in the millions, namely for data centers; however, using PICs for NNs is a technology direction that is still in its infancy despite the great progress that has been made 7-9,11,12 . It is also worth mentioning that, except for recent efforts 8 , NNs in integrated photonics still lack a straightforward implementation of an optical nonlinear activation function that mimics the action-potential firing in the neuron, or an analog tuning such as that favored by the computer science community (e.g. the rectified linear unit, ReLU). Also, the absence of a straightforward non-volatile memory realized in photonics that can be written, erased, and read optically still limits the realization of all-photonic chip-scale information processing for NN tasks, although initial design concepts exist 10 .
What we find is that the required power consumption and bandwidth could both be improved when non-volatile memory functions could be integrated directly in the optic domain, which is particularly advantageous in trained networks when the weights are fixed (or changing seldom).
Emerging memory technologies, such as phase-change memory (PCM), conductive bridge random access memory (CBRAM), and resistive switching memory (RRAM), have been demonstrated to be compelling candidates as synaptic devices for weight storage and vector-matrix multiplication (VMM) in purely electronic neuromorphic circuitry. These devices are characterized by exceptional properties such as high footprint scalability, multi-bit storage capability, and long retention-time non-volatility, as well as an overall higher technological maturity compared to integrated photonics. Therefore, a few efforts 27-29 have targeted integrating non-volatile materials with waveguides in order to explore novel ways of tuning their refractive index, either plasmonically (acting on the optical mode) or through phase changes in the crystallinity of the material. From this perspective, here we argue that integrated photonics and memristive hybrid systems (Fig. 1) could present significant improvements over the established digital implementations of VMM tasks using graphics and tensor processing units. However, their use as a replacement for the current technology is still hindered by the abovementioned hurdles, and the state of the community in integrating CMOS-compatible phase-change materials 30 , such as GST (Ge2Sb2Te5), with PICs is still in its infancy. Nevertheless, the integration of the two technologies seems to be a compelling direction to pursue in order to complement their strengths (Fig. 1a). If successful, such hybrid integration would be particularly appealing since it could enable retention of the optical information, digitalization of the output, and the introduction of nonlinearities at the same time, while building on existing process know-how and capital investments already made.
The underlying mechanism for achieving weighting functionality in integrated photonics using memristors relies on specific active materials 28,30-35 either alone (weighting + storing) or in combination with electro-optic modulators (storing), aiming to alter their optical response in a non-volatile reversible way, by means of an activation potential (i.e. thermal, electrical or optical).
Here we review, discuss, and project challenges and opportunities for device types and material choices for these memristor-based electronic-photonic hybrid NNs from the perspective of key metrics such as power dissipation, electrical vs. optical readout, and multi-state tunability, to mention a few, thus laying a cornerstone towards establishing a materials roadmap for photonic NNs. We discuss the advantages and issues related to a hybrid integration of photonics and memristors in a complementary way, highlighting material systems that are shared between the two technologies and that could allow a seamless integration. CMOS compatibility and large-scale integration pose additional material-related constraints, limiting the choice of materials available for electrodes, device layers and isolation.

Figure 1. Hybrid memristive photonic systems towards building dual-technology neural network (NN) architectures. (a) Specific pros and cons that characterize the two technologies, integrated photonics (dark) and memristors (light). (b) The perceptron model representing a single node of a NN can be mapped onto photonic hardware components. Thus, once the NN is trained and the weights are set in the electro-optic weighting modulators, the delay for the inference task is given by the photon's time-of-flight through the photonic integrated circuit (PIC), which can be taken to be 'real-time' (i.e. 1-10's ps) compared to electronic NN solutions.
However, the opto-electronic weights require a constant voltage bias taking a toll on the power consumption. Thus real-time fast and power efficient photonic NNs can be designed by integrating non-volatile memory elements near the photonic weight components. This is synergistic to offline trained 'weights', which are updated infrequently, if ever. (c) Schematic of a back and front end-of-line of hybrid integration for non-volatile weights. In addition to technological performance gains, the material compatibilities with fabrication processes of foundries need to be considered as well.
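As a minimal numerical sketch of the perceptron of Fig. 1(b), the weights can be modelled as modulator transmissions between 0 and 1, the summation as a photodetector adding the weighted optical powers, and the activation as a generic sigmoid. The sigmoid, the bias term, and all numbers are assumptions for illustration and do not describe a specific device from this paper.

```python
import math


def photonic_perceptron(inputs, weights, bias=0.0):
    """Single-node perceptron: weighted sum followed by a nonlinear activation.

    inputs  -- optical input powers (arbitrary units)
    weights -- modulator transmissions, each assumed to lie in [0, 1]
    bias    -- threshold of the activation (hypothetical parameter)
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighting: each input is attenuated by its modulator.
    # Summation: the photodetector adds the weighted powers.
    accumulated = sum(w * x for w, x in zip(weights, inputs))
    # NL activation: a sigmoid stands in for the electro-optic transfer function.
    return 1.0 / (1.0 + math.exp(-(accumulated - bias)))


# Two inputs with fixed (offline-trained) weights.
output = photonic_perceptron([1.0, 0.5], [0.2, 0.4], bias=0.4)
print(output)
```

Once the weights are fixed, evaluating the node involves no weight updates, which is the situation in which non-volatile storage of the weights is attractive.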

INTEGRATED PHOTONICS FOR NEUROMORPHICS
For hybrid memristor-photonics NN systems, the fast and efficient response of the transfer function of an electro-optic modulator is best suited for mimicking the NL activation function of the neuron rather than the weighting, since the network is mostly trained offline and performs only the inference tasks 8,12,36 . In contrast, the ability of the memory element to retain information for a comparably long time can be exploited for efficiently storing the weights and modulating the signal travelling in the waveguide accordingly. In this section (Sec. 2.2), we discuss our recent work on integrated photonics platforms, optical modulators and their integration with CMOS electronics before we further discuss key metrics of state-of-the-art modulators.

Passive Photonic Neural Network Interconnectivity: waveguide platform options
For PICs operating at telecom wavelengths (1550 nm), the main platform for planar light-wave circuits is Silicon on Insulator (SOI). The crystalline silicon layer atop the insulator is used to create optical waveguides (through optical index contrast) and can be extended to include both the passive and active devices used to deliver NN functionality. The SOI chip can be realized by either smart cut or SIMOX processes. The buried insulator enables propagation and strong confinement of infrared light in the silicon layer on the basis of total internal reflection, with low propagation losses (<1 dB/cm) and small bending radii (<0.5 mm), enabling PICs with a compact footprint. This SOI platform also features monolithic electro-optic modulators, i.e. without the addition of any other material, that change silicon's optical refractive index either thermally or electrostatically 37,38 . Silicon's optical (i.e. all-optical) nonlinearities arise at a few tens of mW, a value that could limit the depth (number of layers) of the NN. The integration of CMOS electronics for logic circuitry with SOI, while technologically feasible 39 , is challenging due to technical and economic mismatches; therefore hybrid integration is usually required. Layer stacking and integration with light sources could be feasible with SOI wafer-bonding techniques, such as coupling the evanescent optical mode of a III-V laser to an SOI waveguide. While silicon is transparent at telecom frequencies, there is no conceptual reason why PIC-based NNs could not operate at visible or mid-IR wavelengths. In fact, there are compelling reasons to consider shorter wavelengths: a) the shorter wavelength enables denser PICs, b) the higher bandgap can deliver lower optical losses, saving chip power consumption, and c) the pump-power range before nonlinearities become parasitic is extended 20 , thus enabling NN cascadability.
One such visible-photonics platform material is Si3N4, which has a bandgap at higher energy (0.4 µm). Interestingly, the process recipes for silicon nitride are favorable for large-scale photonic networks; when deposited by LPCVD, Si3N4 is characterized by both high material stability and refractive-index regularity, as well as nanoscale etch resolution and lower surface roughness, which leads to reduced scattering and hence low optical losses per unit length (<10 dB/m) for a 0.5 mm bending radius, which is 10-100x lower than for SOI 40 . In contrast to SOI, due to the fundamental flexibility of the deposition techniques 41 , Si3N4 also allows for easy integration and flexible 3D stacking with either SOI or CMOS platforms 42 . Regarding the integration of other materials into this silicon nitride platform, several options have been successfully embedded, such as a number of metals or colloidal quantum dots. This platform also allows more complex photonic structures to be created, such as distributed feedback reflectors (DFB), and has been shown to deliver PICs with multiple photonic layers. Regarding electro-optic (EO) phase tunability, silicon nitride is even less responsive than silicon and hence is not a suitable material for monolithic weights of photonic NNs. However, given its promising passive properties, it appears to be an ideal material for heterogeneous integration in addition to co-integration with SOI and CMOS logic circuitry.
In order to modulate the light-wave information travelling in the waveguides, efficient electro-optic or absorptive modulators have to be designed and integrated, either monolithically or heterogeneously, in the SOI or Si3N4 platform. Modulators are active photonic components that induce optical absorption (EA, electro-absorption), or change the optical path length or refractive index (EO, electro-refraction) of a material 25 .
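To make the propagation-loss comparison between the two waveguide platforms concrete, the following sketch converts the quoted upper bounds (<1 dB/cm for SOI, <10 dB/m, i.e. <0.1 dB/cm, for Si3N4) into the fraction of optical power that survives a given path. The 5 cm path length is an arbitrary, hypothetical value, and using the bounds as if they were actual loss figures is a simplifying assumption.

```python
def transmitted_fraction(loss_db_per_cm: float, length_cm: float) -> float:
    """Fraction of optical power remaining after propagating length_cm."""
    total_loss_db = loss_db_per_cm * length_cm
    return 10 ** (-total_loss_db / 10)


# Upper bounds quoted in the text, treated here as loss values (assumption).
SOI_LOSS_DB_PER_CM = 1.0    # <1 dB/cm
SIN_LOSS_DB_PER_CM = 0.1    # <10 dB/m converted to dB/cm

PATH_CM = 5.0  # hypothetical accumulated waveguide length through the NN

soi_fraction = transmitted_fraction(SOI_LOSS_DB_PER_CM, PATH_CM)
sin_fraction = transmitted_fraction(SIN_LOSS_DB_PER_CM, PATH_CM)
print(f"SOI:   {soi_fraction:.3f} of the input power remains")
print(f"Si3N4: {sin_fraction:.3f} of the input power remains")
```

Because loss in dB accumulates linearly with length, the remaining power falls exponentially, which is one reason waveguide loss bears on how many layers can be cascaded.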
For neuromorphic applications, the optical modulator is a key component with a significant impact on the overall metrics of the NN. Indeed, EA/EO modulation can perform two of the three functionalities required in a perceptron, namely providing the weights and the NL activation function. The modulator can also be used for online learning by actively adjusting the weights of a to-be-trained NN 11,51 . In contrast, when the network is trained offline, the weighting can be realized completely passively, without energy consumption, using EO or EA modulators that are gated via a non-volatile storage element (Fig. 1b), whereas the NL activation function would mostly still rely on EO or EA modulators in combination with photodiodes. As reported by Miller 52 , while providing a signal modulation depth (extinction ratio, ER, >3 dB) at a sustained speed (>25 Gbit/s), the target power consumption for modulators has to be within a few fJ/MAC (or better) for global on-chip connections to become technologically competitive (i.e. BER < 10^-8). Materials such as silicon 37,38 , lithium niobate (LiNbO3) 53 , germanium 17,54 and hybrid III-V 45,55 are regularly employed as the active component in modulators and have showcased reliable and, to a certain extent, scalable results. However, their use in densely integrated photonic circuits is not a viable approach due to their vastly (~10^6 x) larger footprint compared with electronic switches (e.g. MOSFETs). Considering the overall platform for PIC-based NNs, the most obvious choice would be silicon-based modulators for weights and/or NL activation; however, as displayed in Fig. 2, Si modulators based on carrier injection display low dynamic modulation (i.e. low ER), are particularly lossy if doped (high insertion loss, IL), and are slow (1-10 kHz) if thermally driven.
On the other hand, even though they are characterized by competitive performance, III-V modulators are difficult to integrate into the SOI platform as well as with CMOS circuitry, due to process and material incompatibility, and the wafer sources are costly.
With a view to providing energy reductions when performing NN tasks compared to electrical approaches, while still preserving a modest footprint, several engineering design and material choices have to be made. As a main point, the selection of the active material is strongly impacted by its ability for voltage-efficient optical index tuning for the NL activation function and the weights; ideally, the absolute difference in the transmission of the modulator between its two states (ON/OFF) should be large, which means avoiding modulators with significant background loss and therefore avoiding active materials that present high intrinsic losses in the OFF-state. This alone, however, might not be sufficient to achieve a high enough modulation performance and a low enough energy-per-compute to surpass electronic efficiency; enhanced modulation can therefore be reached by using either quantum-confined systems or sub-diffraction-limited plasmonic structures, with either monolithic or heterogeneous integration of other materials or structures. This results in a low energy consumption of a few attojoules per bit 24,25 , up to the highest speeds 14,20,38,56 , which corresponds to a compounded merit improvement of 10^5 times compared to electronic switches. Hence, many research groups have strived to engineer on-chip modulators beyond the solutions offered by heterogeneous integrated photonic foundries to date. Recent developments in monolithically integrated, CMOS-compatible emerging EO or EA materials, such as Indium Tin Oxide (ITO) 15,57 , graphene 14,18,58,59 , quantum-confined structures 60 and TMDs, integrated into Si photonics with specific device configurations that aim to enhance mode overlap, have allowed energy-efficient 61,62 , compact silicon-photonics-based modulators. The important performance metrics for EOMs include high ER (>3 dB), low IL (<1 dB), modulation speed (>25 GHz), low energy consumption per bit (<10 fJ/bit), and compact footprint area (possibly 3D volume).
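The modulator targets listed above can be checked for a candidate device with a few lines of code. The sketch below uses the standard decibel definitions of extinction ratio and insertion loss; the example transmission values, speed, and energy are hypothetical and are not measurements from this paper.

```python
import math


def extinction_ratio_db(t_on: float, t_off: float) -> float:
    """Extinction ratio in dB from ON and OFF power transmissions (0..1]."""
    return 10 * math.log10(t_on / t_off)


def insertion_loss_db(t_on: float) -> float:
    """Insertion loss in dB: power lost in the transmitting (ON) state."""
    return -10 * math.log10(t_on)


def meets_targets(t_on, t_off, speed_ghz, energy_fj_per_bit) -> bool:
    """Check the targets quoted in the text:
    ER > 3 dB, IL < 1 dB, speed > 25 GHz, energy < 10 fJ/bit."""
    return (extinction_ratio_db(t_on, t_off) > 3.0
            and insertion_loss_db(t_on) < 1.0
            and speed_ghz > 25.0
            and energy_fj_per_bit < 10.0)


# Hypothetical device: 85% ON transmission, 30% OFF transmission.
er = extinction_ratio_db(0.85, 0.30)
il = insertion_loss_db(0.85)
print(f"ER = {er:.2f} dB, IL = {il:.2f} dB")
print(meets_targets(0.85, 0.30, speed_ghz=40.0, energy_fj_per_bit=5.0))
```

A device with a high background loss fails the IL criterion even when its ER is acceptable, which is the reason given above for avoiding intrinsically lossy active materials.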
ITO is one such compelling material for heterogeneous integration in Si, exhibiting a formidable electro-optic effect characterized by a unity-order index change at telecommunication frequencies. ITO carrier-based electro-absorption modulators, which are implemented via capacitive gating 19 , have shown sharp dynamic range, compact footprint and potential for GHz-fast modulation. In recent work we demonstrated a monolithically SOI-integrated ITO electro-optic modulator based on a Mach-Zehnder interferometer (MZI) featuring a high-performance product of half-wave voltage and active device length of VπL = 0.52 V·mm 15 . This device demonstrates a unity-strong index change in the active ITO layer, enabling a 30 micrometer-short π-phase shifter, while purposefully operating ITO in the index-dominated region away from the epsilon-near-zero (ENZ) point, hence reducing optical losses. Moreover, some major electronics manufacturers have recently announced plans to integrate ITO into their foundry processes; hence it can now be termed a CMOS-compatible material, as it can be monolithically integrated in photonic frameworks and directly interfaced with on-chip logic circuitry and memories (Fig. 2). Incidentally, GST is a second candidate that will enter foundry processes soon, while graphene has yet to receive such 'permission', due to limitations in the substrates used for the growth and the inconvenience of the required transfer. In addition to ITO-based modulators 19,59,63 , other active opto-electronic components have also been demonstrated; Kim et al. 64 , for instance, experimentally demonstrated an efficient and potentially high-speed directional coupler based on ITO, which can also be employed for intensity modulation for NN weighting.
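The role of the VπL figure of merit can be illustrated with the textbook transfer function of an ideal, balanced, lossless MZI, T = cos²(Δφ/2) with Δφ = πV/Vπ. This idealized model, the linear phase-voltage relation, and the assumption that the quoted VπL product and the 30 µm length describe the same operating point are simplifications made for illustration only.

```python
import math

V_PI_L_V_MM = 0.52  # half-wave voltage-length product quoted in the text (V*mm)


def half_wave_voltage(length_mm: float) -> float:
    """Voltage for a pi phase shift, assuming V_pi scales as 1/length."""
    return V_PI_L_V_MM / length_mm


def mzi_transmission(voltage: float, v_pi: float) -> float:
    """Ideal balanced, lossless MZI intensity transmission (assumed model)."""
    phase = math.pi * voltage / v_pi  # assumed linear phase-voltage relation
    return math.cos(phase / 2) ** 2


# Hypothetical evaluation for a 30 um (0.03 mm) phase shifter.
v_pi = half_wave_voltage(0.03)
for v in (0.0, v_pi / 2, v_pi):
    print(f"V = {v:6.2f} V -> T = {mzi_transmission(v, v_pi):.3f}")
```

The same transfer curve can serve as an analog weight (set by a static bias) or, in combination with a photodiode, as a smooth nonlinear activation.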
Still at a research level, due to their non-scalable integration, modulators and detectors based on other emerging materials such as graphene 65,66 , transition metal dichalcogenide flakes 66,67 , organic 20 and phase-change 30 materials have been reported, demonstrating, in some cases, striking performance, yet their process maturity is far from foundry standards. As mentioned above, integrated photonics does not have a memory functionality, although recent work on different device configurations has shown optical write/read/reset capabilities, as described in the next section. Next, we discuss details of novel memory devices embedded in a photonic framework and used for storage and weighting functionalities.

NOVEL MEMORY DEVICES FOR PHOTONIC INTEGRATION
In a hybrid photonic system, memristor devices would be used mainly as weights, which do not require high-frequency updating. Therefore, the efficiency of programming them during training (either directly on the photonic hardware or by uploading offline-trained weights) is of importance in terms of chip power budget and compatibility of available voltage ranges (i.e. signal dynamic range). Recent studies 68-70 have highlighted the difficulties of training large-scale memristor networks acting as weights for neuromorphic computing. An alternative approach that alleviates these challenges is to use both long- and short-term memory elements to implement these different temporal weight classes.

Efficient and Long-term Weights: heterogeneous integration of non-volatile memory
There are several long-term non-volatile memory technologies that show promise for integration with photonic systems (Fig. 3). Given their technological maturity, flash memories (i.e. floating gate transistors) have the potential for gating the above-discussed modulators that perform the weighting or NL thresholding (Fig. 3a). However, in a typical configuration for regular memory applications, these transistors use polysilicon for the floating gate. For photonic integration, this option is not very desirable, since the low number of trapped charges can only impose small changes in the modulators' refractive index (low dynamic range). Weight banks or NL activation functions based on such a technology option will have limited neuron bit-density, limiting inference accuracy and NN cascadability. This essentially rules out online-learning options, where these modulators would have to perform a gradient-descent algorithm via backpropagation, which relies on the differentiability of the modulator's transfer function; the steeper the transfer function, the higher the representable bit-density and the higher the training efficiency, i.e. the lower the power. Also, the additional electrical capacitance from poly-depletion increases both the device's RC delay and the energy-per-MAC, similar to arguments in transistor technologies, and hence high-k dielectrics and metal-based floating gates may be a more viable path. Recent work 71 proposed a floating gate utilizing a non-volatile optical switch, where a graphene sheet is used as the floating gate material 72 with a much larger refractive index variation 18 (Fig. 3a). However, electron-based memory storage can be problematic from the perspective of increased optical loss in semiconductor waveguides, reduced non-volatility and limited analog programmability.
A more desirable approach could be integrating memories heterogeneously with photonic waveguide modulators. The benefit of such micrometer-compact device-to-device integration was recently pointed out by D. Miller 52 on the grounds of improved capacitive loading; since every micrometer of metallic wire has a capacitance of 0.2 fF, just a few micrometers of wire would bring the power budget above 1 fJ per operation (i.e. bit or MAC) for the wiring alone (excluding the functional devices) for a VDD of just 1 volt. For such 'tight' integration, phase change memory (PCM) materials could be a viable option, since they can significantly change the refractive index of the waveguide by local amorphisation or crystallization of the material, which produces unity-strong index changes (i.e. 10^3-10^4 times stronger than the free-carrier modulation of silicon, for example). Furthermore, PCMs have an (estimated) state retention time of >10 years and analog programmability up to 3-4 bits during the 'write' transition. A recent experimental demonstration 73 showed photonic synaptic behavior of a chalcogenide GST phase-change film integrated with a Si3N4 waveguide (Fig. 3b). To achieve precise control of weight programming using fixed pulse characteristics, an innovative tapered waveguide structure with multiple discrete PCM islands was used, demonstrating 3-bit (8-level) operation using ~400 pJ for single-pulse weighting. At such an energy, however, this GST-based synapse would not provide any significant advantage with respect to state-of-the-art electronic neuromorphic systems (for comparable performance), thus reducing the photonic advantages, although the short NN delay would still be the main rationale for photonic PCM-hybrid NN systems. Replacement of GST with engineered phase-change materials that have low switching powers is therefore desirable from a materials roadmap perspective. Another intriguing alternative is plasmonic memristive devices that exploit the modulation of electron transmission in optical systems through filamentary switching (Fig. 3c). Memristive devices based on metallic or highly reduced oxide filaments can be used for this purpose, for example.
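The wiring estimate attributed to Miller in this section can be reproduced with a one-line energy model. The sketch below assumes the energy per operation is C·V² (one full charge and discharge of the wire capacitance); a ½·C·V² convention would halve the numbers. Only the 0.2 fF per micrometer and the 1 V supply come from the text; the wire lengths are illustrative.

```python
WIRE_CAP_F_PER_UM = 0.2e-15  # wire capacitance per micrometer quoted in the text


def wire_energy_fj(length_um: float, vdd_volts: float) -> float:
    """Energy per operation spent on wire capacitance alone, in femtojoules.

    Assumes E = C * V^2 (full charge/discharge cycle), an assumed convention.
    """
    capacitance_f = WIRE_CAP_F_PER_UM * length_um
    return capacitance_f * vdd_volts ** 2 * 1e15


for length_um in (1, 5, 10):
    energy = wire_energy_fj(length_um, 1.0)
    print(f"{length_um:>2} um of wire at 1 V: {energy:.2f} fJ")
```

Under this convention about 5 µm of wire already accounts for 1 fJ per operation, consistent with the statement that only a few micrometers of wire exhaust a femtojoule-level budget.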
Plasmonic memristors based on silver filaments and integrated with silicon waveguides have shown hysteretic behavior and optical readout functionality 74 . A poor ON/OFF signal ratio limits higher bit densities (i.e. multi-level switching), but a lower power compared to PCMs highlights the potential for femtojoule operation. On the other hand, once the NN is trained, a power-optimized bit density is only about 3 bits. That is, the inference accuracy improves only marginally for higher bit densities. This is in contrast to online training on the photonic hardware, where a higher bit density reduces the energy cost and training time of the gradient-descent back-propagation algorithm, while resulting in higher accuracy during inference operations 11 . Since memristors are challenged by device-to-device variability and require trains of tens of voltage pulses for precise tuning, short-term memory technologies are needed both to speed up the training and to reduce the energy consumption for on-the-hardware training (offline training, i.e. using GPUs, would be an alternative). Capacitively driven devices can be a useful short-term memory solution for integration with photonic platforms, since footprint is not as significant a concern as in ultra-scaled purely electronic systems. Here, micron-scale capacitors can be fabricated in a stacked configuration atop waveguides, utilizing long-term memristor devices as photonic weights, for example (Fig. 1b). Unlike capacitively challenged arrays of electronic memories (either emerging PCMs or classical SRAMs), electro-optic components such as the NL thresholding modulators discussed above are capacitively stand-alone devices, connected only to their respective drivers (e.g. the photodetector providing the summation). These modulator activation functions can respond with delays shorter than 10's of ps and thus do not slow down the (already fast) optical NN.
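The effect of a limited bit density on stored weights can be sketched by quantizing a continuous weight to 2^bits levels. Uniformly spaced levels and weights normalized to the range 0 to 1 are assumptions made here for simplicity; real multi-level memory states need not be evenly spaced.

```python
def quantize_weight(weight: float, bits: int = 3) -> float:
    """Map a weight in [0, 1] to the nearest of 2**bits uniform levels."""
    steps = 2 ** bits - 1            # 3 bits -> 8 levels -> 7 steps
    clipped = min(max(weight, 0.0), 1.0)
    return round(clipped * steps) / steps


def max_quantization_error(bits: int) -> float:
    """Worst-case rounding error for uniform levels on [0, 1]."""
    return 0.5 / (2 ** bits - 1)


for bits in (3, 4, 6):
    print(f"{bits} bits: {2 ** bits} levels, "
          f"max error {max_quantization_error(bits):.4f}")

print(quantize_weight(0.3, bits=3))  # nearest 3-bit level to 0.3
```

The worst-case error roughly halves with every added bit, which gives a feel for the trade-off between programming effort and weight precision discussed above.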
If longer retention is needed, short-term memory elements can be used to bias the modulator's gate temporarily. Such a complex cell could benefit from the advantages of each technology and minimize the drawbacks regarding training. A novel technology for short-term memory, for instance, is the diffusive memristor 75 ; while designed as a selector device compatible with memristive crossbars, the diffusive memristor has the advantage of a metallic filament that can be programmed using low voltages and whose memory state dissolves rapidly (hundreds of µs to ms) afterwards, forming nanoparticles. This behavior resembles the Ca2+ dynamics in a biological synapse, thus providing the short-term memory capabilities required during training. Experiments on plasmonic metasurfaces using silver nano-filaments have shown short-term memory as well, and a switching voltage of only a few volts was found, similar to that of diffusive memristors 76 .

CHALLENGES IN DESIGNING MEMRISTOR DEVICES FOR PHOTONIC INTEGRATION
While the integration of non-volatile memristive devices in photonic circuits would be an enabling step towards energy-efficient NNs capable of nanosecond-short inference tasks, here we address some of the challenges that should be considered in such a roadmap. A typical issue with novel material-chip integration is limited reproducibility for larger-scale PICs. However, unlike NN approaches from the computer science community, which often follow a strategy of adding more neurons to increase inference accuracy and whose NNs approach millions of neurons, recent results on photonic neuromorphics show that a) hundreds of neurons are sufficient to perform smaller inference tasks as well as electronics 8, and b) the inherent noise of the analog photonic system can be advantageous during training with respect to accuracy. Nonetheless, reproducibility and reliability at the device level are fundamental to ensuring performance guarantees at the circuit and system levels. This problem is related to material engineering, interface control, and optimization of nanofabrication processes. Engineering the material stoichiometry to improve device performance is desirable for a broad range of materials of interest in both the memristor and photonics fields. For example, ITO composition can be explored in a holistic fashion using reactive sputtering 77 to carefully control its electrical and optical properties. Similarly, phase-change materials can be tuned to lower the switching energy and improve the reliability of amorphization/crystallization. For filamentary memristors, filament robustness and controllability are also driven by the choice of materials; for example, the electrode material influences the filament shape through the different free energies of oxide formation or through metal electromigration 78. Such instabilities lead to noise in the NN.
Interestingly, depending on the amount of noise, training the NN with the actual system noise of these photonic analog networks results in higher inference accuracy than performing the training 'signal-clean' digitally (i.e. on GPUs) 79. Indeed, highlighting material systems that are shared between memory and photonic technologies could enable a seamless integration pathway (Fig. 4). Here, CMOS compatibility and large-scale integration pose additional material-related constraints, limiting the choice of materials available for the electrode, device layer, and isolation. For conductive bridge devices, short-term transient memory effects on a metasurface can be obtained at voltages as low as 5 mV, yet the thermal noise floor of 26 meV at room temperature would dictate an SNR < 1 at the back-end photodetector of the photonic NN 80. Long-term memories, however, often require high programming voltages, which reduce the endurance (the number of programmable cycles of the memory) and lead to undesirable electrical shorts as the failure mode, which could become extremely critical at high speed. In fact, if a short-term memory element had an endurance of <10^11 cycles and this memory were used as the charge storage that controls the NL threshold of the photonic NN, then this memory would fail after just 5 seconds when the inference data input is clocked at 20 GHz. This shows that some performance parameters do not translate well across these different applications. However, for usage as NN 'weights', any state-of-the-art memory is already overperforming, given the infrequent updates (whose frequency, naturally, depends on the application). Oxide-based memristors have a more tunable filament, at the expense of increased optical insertion loss. Devices based on electromigration are typically bipolar, requiring both polarities of voltage for programming: one polarity for SET (switching from OFF to ON) and the other polarity for RESET (switching from ON to OFF).
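The back-of-the-envelope numbers in the preceding paragraph can be reproduced with a short Python sketch. It assumes that the memory element consumes one endurance cycle per clock period and takes room temperature as 300 K; the function names are hypothetical and serve only this illustration.

```python
BOLTZMANN_J_PER_K = 1.380649e-23        # Boltzmann constant (exact SI value)
ELEMENTARY_CHARGE_C = 1.602176634e-19   # elementary charge (exact SI value)


def lifetime_seconds(endurance_cycles, clock_hz):
    """Time until the endurance budget is exhausted.

    Assumption: one programming cycle is consumed per clock period.
    """
    return endurance_cycles / clock_hz


def thermal_energy_mev(temperature_k):
    """Thermal energy k_B * T expressed in milli-electronvolts."""
    return BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C * 1e3


# Endurance of 10^11 cycles at a 20 GHz input clock.
print("lifetime [s]  :", lifetime_seconds(1e11, 20e9))

# Thermal energy at an assumed room temperature of 300 K.
print("k_B*T [meV]   :", thermal_energy_mev(300.0))
```

With 10^11 cycles and a 20 GHz clock the first call gives the 5 seconds quoted above, and the second evaluates to approximately 26 meV. A lower update rate of the threshold memory would extend the lifetime proportionally.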
Bipolar programming is challenging for all-optical photonic integration, since it is difficult to realize plasmonic optical programming. A bipolar optical readout would require rectification of the light and thus an optical rectenna; so far, however, optical rectenna technology is still in its infancy and exhibits limited performance 79. On the other hand, phase-change memories are unipolar, so all-optical programmability can be achieved, as shown in recent work on all-optical spike-timing-dependent plasticity (STDP) 73. For long-term memory elements, such neuromorphic technology can interface seamlessly with an optical compute system, thus reducing the overall delay by keeping the signal 'longer' in the optical domain before eventually converting back to electronics via a photodetector. Nevertheless, electro-optic memristors offer advantages due to the integration synergies with circuitry in existing CMOS technology. Therefore, integrated memristor devices exhibiting electrical programmability might be desirable in some integrated photonic systems for offline weight training. These discussion points are qualitatively summarized in Table 2.

Table 2. Qualitative mapping of both the neural network's (NN) functions and the mathematical model of the perceptron onto photonic-electronic hybrid neurons. The memory is most relevant in providing the charge to store the weights to bias the electro-optic modulators performing weighting. However, the same modulator can also be used as the nonlinear activation function.
*Assumes analog amplitude signal control (perceptron). Other options such as spiking neurons could also be explored for higher-order neuron signal-shaping using a multitude of partial differential equations 9.
**If WDM is used.
***Leakage current impacts power consumption of the NN.

CONCLUSIONS
In conclusion, the heterogeneous integration of memristive materials and devices in photonic platforms suggests promising performance advantages for photonic-electronic hybrid artificial neural networks, since these two technologies have complementary strengths. The advantage of photonics, for instance, lies in providing a picosecond-short delay between the various network nodes, hence enabling highly efficient 'interconnectivity' for such non-Von Neumann compute and information processing architectures. However, photons are challenged to store states, which is where memories come in, providing the long-term neural network 'weights' (nonvolatile) and the nonlinear thresholding or activation function (relatively volatile). In combining the 'best of both worlds', we see a viable path forward in densely integrating memristive materials, particularly with the active photonic components that determine the neural network's governing functions and thus its performance. We find that electro-optic modulators are unique candidates to perform both dot-product 'weighting' and 'thresholding', yet vastly different time scales are required for each, while both share the aim of executing the respective functionality with the lowest power consumption. The advantage is strengthened by the fact that the two technologies share a broad range of materials, thus enabling seamless integration and stacking of integrated devices. Indeed, hybrid photonic memristive neuromorphic circuitry could enable systems capable of (sub)nanosecond-fast and energy-efficient inference tasks in trained networks. Moreover, such platforms could be used to realize new neuromorphic architectures, which could rely on hybrid devices based on single-photon detection 80 or modulation.
Finally, a word on the target applications for photonic neural networks: unlike GPUs, which are suitable for big-data and high-throughput tasks, photonic neural networks would be more suitable for those specific tasks that rely on real-time (<microsecond) responses to inference machine learning tasks, such as those found in military applications of ranging, synthetic aperture radar, or automated target recognition.