Hybrid photoacoustic/fluorescence microendoscopy through a multimode fiber using speckle illumination

We present an ultra-thin hybrid imaging system based on an optical multimode fiber (MMF) and an optical fiber hydrophone that combines optical resolution photoacoustic and fluorescence microscopy. To control the illumination at the distal tip of the MMF, a digital micromirror device modulates the amplitude of the optical wavefront which is coupled into the MMF. A set of pre-calibrated speckle illuminations combined with a reconstruction algorithm enables photoacoustic and fluorescence imaging of samples located at the distal tip of the fiber with optical resolution determined by the numerical aperture.


INTRODUCTION
Imaging neural activity deep in the brain of small animals is limited by tissue scattering, and for a long time, it has only been possible to observe activity at the surface of the brain by observing the fluorescence signal of calcium indicators [1]. Fluorescence imaging requires exogenous agents (dyes or genetically modified cells) to produce the contrast required to image. A complementary imaging modality that relies on endogenous contrast agents such as the light absorption is photoacoustic imaging. Photoacoustic imaging is an emerging multi-wave imaging modality that couples light excitation to acoustic detection via the photoacoustic effect (sound generation via light absorption). Lately, photoacoustic microscopy has been employed to obtain images of neural activity [2] with promising results. Both fluorescence and photoacoustic imaging offer unique advantages -while fluorescence imaging offers high sensitivity to specific molecular probes and the ability to look at multiple features simultaneously, photoacoustic imaging provides specific sensitivity to non-radiative optical absorption. Combining the two modalities is therefore attractive and has proved effective for superficial tumor detection [3][4][5][6][7].
For deep brain imaging with diffraction limited optical resolution, endoscopic approaches are required to avoid the loss of resolution due to scattering. At the cost of invasiveness, endoscopic approaches can both deliver light and collect signals from deep regions. In fluorescence brain imaging, endoscopes are commonly based on fiber bundles or graded-index (GRIN) lenses [8] whose cross sections are in the order of a millimeter. Optical-resolution photoacoustic micro-endoscopes with a similar footprint have been proposed using analogous approaches [9][10][11]. However, such cross sections are still larger than what is desirable to minimize neuronal tissue damage when the endoscope is inserted into the brain. Multimode fibers (MMF) are becoming a popular alternative employed to guide light and collect signals thanks to their small footprint as compared to fiber bundles with comparable imaging performances, and thanks to their efficient light collection for fluorescence imaging. Several groups have demonstrated the possibility to perform optical resolution imaging through MMF [12][13][14][15][16][17][18][19][20][21][22][23][24][25], including photoacoustic imaging [26][27][28].
However, coherent light propagating through a MMF is seemingly randomized through fiber mode variations in phase velocity and potential mode-coupling, leading to a speckle pattern at the far end of the fiber (distal tip). Imaging through MMF therefore requires special methods that rely on a pre-calibration of the fiber to determine its input-output relation, namely the transmission matrix (TM) [14][15][16][19][20][21]. Hence, given the fiber TM, we can control the illuminations used to sample the output plane by modifying the wavefront of the input illuminations, and collect a feedback signal such as a fluorescence or photoacoustic emission, to recover the object. A common choice of such controlled illuminations is scanning focal spots. However, this local sampling approach needs scanning of N focal spots to obtain an N pixel object reconstruction. Moreover, generating focal spots at the distal end requires complete calibration of the fiber complex TM. It is worth noticing that the fiber needs to stay mostly unperturbed during the experiment; otherwise the TM changes, reducing the ability to control the intensity distribution at the distal tip [22]. Principle of speckle illumination imaging through a MMF. A set of M pre-calibrated speckle intensity patterns samples the object plane to generate M corresponding S k signals proportional to the overlap between each speckle R k and the object. The integrated signal is detected using a single pixel detector. b) Sketch of the experimental setup used for both photoacoustic imaging (section 3. A) and hybrid imaging (section 3. C).
ing, based on sampling the output plane using the natural output speckle patterns of a stationary MMF. In addition to being readily produced by propagation through MMF, speckle patterns have also been shown to be ideal for compressive sensing [24,29]. This idea was first proposed by Bolshtyansky et al. [30] who simulated the total integrated signal coming from a reflective object illuminated with speckles produced by a MMF and demonstrated coherent imaging. The concept was later demonstrated experimentally through MMF first with reflective samples [17], and more recently with fluorescence beads [31,32]. In photoacoustic imaging, the same approach was also implemented for imaging test samples through a scattering sample [33] and through MMFs [34]. Although this approach still requires a calibration step, it is simpler because it only requires to measure the optical speckle intensities as opposed to wavefront shaping based methods that require speckle field measurements. Additionally the nonlocal sampling has shown better compression ratios in noisy environments than local approaches [35] and is advantageous in imaging sparse samples that are often encountered in biological imaging.
In this work, we show that the above concept can also be extended to implement both photoacoustic and fluorescence endoscopic imaging through a MMF. Combining the pre-recorded speckle illuminations and the corresponding fluorescence and photoacoustic signals from the object at the distal tip of the MMF with reconstruction algorithms, we obtain images of biological test samples in vitro with both modalities with a minimally invasive microendoscope. To our knowledge, this is the first demonstration of an ultra-thin MMF imaging system capable of optical resolution photoacoustic imaging and fluorescence imaging at the same time with micrometer resolution.

METHODS
A. Principle of imaging using speckle illumination Figure 1 a illustrates the principle of the measurements for imaging with speckle illumination. A set of M random patterns is projected on the fiber input (proximal tip), generating different speckle illuminations R k (x, y) at the fiber output (distal tip). This reference set is first measured using a camera before the object is placed at the distal tip, as a calibration step. Each output pat-tern has a different distribution of speckle grains and therefore probes the field-of-view differently from other patterns.
During the measurement step, the object O(x, y) to be reconstructed is illuminated by the same set of M speckle patterns and a feedback signal from the whole field-of-view is captured through the same fiber by a single-pixel detector.In our work, the single-pixel detector is either a fluorescence detector or an acoustic detector. For both photoacoustic and fluorescence measurement, the signal, S k , received by a single pixel detector when the object is illuminated by each speckle pattern is modeled as the overlap integral of the object and the illumination pattern: The intensity fluctuations from speckle pattern to speckle pattern in the object plane translates into fluctuations of the signal S k , thereby encoding sample information at the positions at which the speckle grains overlap with the object, reminiscent of the working principle of a single pixel camera [36].

B. Image reconstruction with a sparsity-constrained optimization
From the set of M measurement values, there are various ways to reconstruct an estimate of the object [33,35], including correlation-based image reconstruction or resolution of a linear problem, possibly incorporating prior information on the object [37][38][39] . A comparison of the results obtained from our experimental measurement with three different approaches is provided in the supplementary information. In our work, the objects of interest are sparse in the real space, and the best results were obtained with a sparsity-constrained solution of a linear problem formulated in a matrix formalism, following the approach introduced for ghost imaging [37]. In this approach, the discretization of equation 1 over the N pixels of the camera yields a linear problem in a matrix form written as In equation 2, S is a M × 1 vector which represents the measurements, R is an M × N matrix of the set of M illumination patterns measured on the camera during the calibration step, and O now represents a discretized version of the object as a N × 1 vector. An estimate of the object O can be computed by solving the following minimization problem [37]: The first term of the cost function ensures that the reconstructed object fits the data while the regularization term with a L1-norm favors sparse objects. γ is a regularization parameter that needs to be tuned depending on the signal-to-noise ratio and the degree of sparsity in the object. Such formulation allows to find an estimate of the object O even when the matrix R is non invertible (either because N > M or because the M speckle patterns may not form a basis due to correlations). The regularization term is needed to account for noise in the measurements and the L1-norm favors sparse estimates. The resolution of equation 3 was performed numerically with a Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), further described in the Supplementary Material.

C. Speckle illumination pattern selection
The quality of image recovery is significantly dependent on two properties of the set of references speckle patterns. First is the sampling efficiency: the ability to perfectly recover an object depends on the completeness of the sampling of the object plane by the output illumination set. An ideal set of illuminations would form a complete basis set which can encode information about every point at the fiber distal tip. In practice, generating such an illumination set is limited by the efficiency of exciting all the modes of the fiber, and the number of available speckle patterns can thus be less than the number of pixel to reconstruct. The second property is compressibility. Since speed is an important factor in imaging through dynamic scattering media, recovering an object using a number of illuminations as small as possible is highly desirable, and made possible by the proposed approached when sparse objects are concerned [37].
To maximize the compression ratio N/M, while minimally affecting sampling efficiency, the correlations between individual illuminations must be minimized so that each of them is able to retrieve unique information about the sample. Those correlations can be quantified by measuring their mutual coherence, defined as where µ ij is the mutual coherence between the i th and j th illuminations. The M × M mutual coherence matrix can be constructed containing the correlation of each speckle with every other speckle. The optimization to select the speckles with reduced correlation involves the following steps: 1. Set the speckle self-correlation terms on the diagonal elements of the coherence matrix, to zero. This way only correlations between different speckles are analyzed.
2. Calculate the norm of each row in the mutual coherence matrix. The row R i with maximum norm signifies that the i th speckle has maximum correlations with all other speckles in the illumination set.
3. Set the i th row and column in the coherence matrix to zero and note the index, i, of the speckle to be deleted from the illumination set. The new coherence matrix corresponds to correlations among M-1 illuminations.
4. Repeat 2 and 3 till the number of non-zero rows in the coherence matrix becomes equal to the size of the desired optimized illumination set.
Note that the coherence matrix is symmetric, which means the above optimization can equivalently be performed on columns instead of rows of the coherence matrix, leading to the same results.

D. Experimental setup
Three types of proof-of-principle experiments were performed in this work, namely photoacoustic microscopy alone, fluorescence microscopy alone, and hybrid photoacoustic/fluorescence microscopy.
A schematic description of the experimental setup used for both photoacoustic microscopy and the hybrid photoacoustic/fluorescence microscopy (see results in sections 3.A and 3.C) is shown on Figure 1 b. The excitation light is provided by a a pulsed laser (Cobolt Tor™series, 3ns pulse duration, 532nm, 7kHz repetition rate). A digital micromirror device (DMD, Vialux ®, model V-7001) is used to modulate the optical wavefront at a repetition rate of 22 KHz. The laser beam is first expanded to match the dimensions of the DMD, and the DMD is then imaged with a 4f-system onto the input facet of a multimode fiber (MMF). A 8-cm long MMF (Thorlabs DCF13) is used to guide light to the object plane and collect fluorescence. The outer diameter of the MMF (without protective cladding) is 125 µm. The 105 µm-diameter multimode core is used to guide the source light, and also collect fluorescence. The fluorescence signal is measured with a photomultiplier tube (PMT, Hamamatsu, model H7422) placed behind a dichroic mirror. For photoacoustic measurements, a fiber-optic hydrophone (FOH, Precision Acoustics ®), with an outer diameter of 125 µm (without protective cladding) is attached parallel to the MMF. Both fibers are held together inside a metallic cannula to make sure they remain attached and avoid relative motion. A few millimeters of the distal tip of both fibers stick out of the metallic cannula (as shown in the inset of Figure 1.b), as an imaging head with a minimal total footprint of about 250 µm.
For the calibration step, a set of M random patterns (M on the order of a few thousands) with a 50% fill factor are projected onto the DMD producing a set of speckle patterns at the distal tip of the MMF. The speckle patterns are measured with a CMOS camera (Basler ® acA1920-155um) that, after a 20x magnification, images a plane located ∼ 50µm away from the MMF distal. After the calibration is performed, the sample is placed in the same imaging plane and the same set of speckle patterns are projected onto the sample. In the sample plane, the image pixel size was 0.36µm.
For each speckle pattern, the setup allows to detect and record either a photoacoustic signal with the FOH or a fluorescence signal with the PMT. Both detectors act as single pixel detectors collecting a signal S k integrating information from the whole field-of-view. The time-resolved voltage signals from the detectors are digitized and recorded with a data acquisition card (Gage Razor Express 1622). Finally, the recorded temporal traces are processed to obtain a scalar value S k for each type of signal (photoacoustic or fluorescence). For photoacoustic measurements, S k is defined as the area under the envelope of the FOH pulsed signal. For fluorescence measurements, S k is defined as the peak value of the PMT pulsed signal. To improve the signalto-noise ratio (SNR) when needed, signals were averaged over repeated measurements for each speckle pattern, as specified further below in Sec. 3.The synchronization of all the parts of the system is controlled by a BNC pulse generator.
The fluorescence results described in section 3.B were obtained with a slightly different setup for practical reasons related to samples and material availability in the two different facilities involved in this work. The corresponding experimental setup is described in detail in [22,23], and differs from the setup above in particular by the fact that a continuous laser was used for the experiments. Additionnaly, an EMCCD camera (Andor iXon+) was used to collect fluorescence from the input/proximal tip of the MMF. The measurement methodology was however strictly identical with both setups.

EXPERIMENTAL RESULTS
Three types of proof-of-principle experiments were performed in our work. In section 3.A we present images obtained by photoacoustic microendoscopy of various types of samples, including in vitro red blood cells. In section 3.B we present images obtained by fluorescence microendoscopy of various types of samples, including in vitro images of fluorescent beads in a slice of mouse brain. We also illustrate how the set of illumination patterns can be chosen to optimize the quality of the reconstructed image. Finally, we show in section 3.C how both imaging modalities can be combined into a unique optical setup to experimentally validate the compressive hybrid imaging of red blood cells and fluorescent particles simultaneously.

A. Photoacoustic imaging
In this part, for each pattern projected at the DMD, the photoacoustic signal was averaged between 100 and 1000 laser pulses to obtain a sufficient SNR. Two types of samples were used to illustrate the photoacoustic imaging capability of the setup shown in Fig. 1 b. As a first, well-controlled test sample, we used an absorbing micro-structure photoplotted on a polymer film (Selba S.A, Versoix, Switzerland), shown on Fig. 2 a ("poweron" logo). The field of view corresponds to a 200x200 pixels scene, corresponding to N = 40, 000. For each pattern projected at the DMD, the photoacoustic signal was averaged over 100 laser pulses to obtain a sufficient SNR. For the reconstructed photoacoustic image shown in Fig. 2 b, only M = 4096 random speckle patterns were used to compute the photoacoustic values S k , and both parts of the "power on" sign are imaged with a very low background signal. It is important to notice that the "granular" appearance is due to the relatively low number of speckle patterns used (about 10% of the total number of reconstructed pixels). This artifact can be reduced by increasing the number of patterns or repeating the experiment with different speckle realizations optimized using the method described in 2 C. To further demonstrate the performance of the system on a more relevant biological sample, we used the same system to obtain photoacoustic images of red blood cells previously washed and deposited on phosphate buffered saline (PBS). Fig. 2 c presents a bright field microscope image of the sample, which shows three red blood cells and a Nile red fluorescent particle (big circle on the left bottom) added to the buffer to demonstrate that the fluorescent dye does not produce any detectable photoacoustic signal. M = 4096 speckle patterns were used to reconstruct the 300x300 pixels photoacoustic image (N = 90, 000) shown in

B. Fluorescence imaging
Two types of samples containing fluorescent beads were imaged with a similar experimental setup described in the supplementary information. In all imaging experiments through the fiber, i.e. when fluorescent light was collected from the proximal tip of the fiber after propagation through the MMF, the raw fluorescence data contained contribution from the autofluorescence of the fiber, more prominently from the fiber cladding. This autofluorescence from the fiber cladding was discarded (by operating in the image acquisition mode of the EMCCD) to selectively detect the fluorescence signal from the sample guided through the fiber core. Additionally, a background signal in the absence of the object was subtracted from all the data before running the reconstruction algorithm. Fig. 3 a shows a reference widefield fluorescent image of 4 µm orange beads from a TetraSpeck Fluorescent Microspheres Sampler kit. This reference fluorescent image was obtained with a CMOS camera directly imaging the fluorescent sources at the output/distal side, by averaging the fluorescence images corresponding to 4000 different speckles patterns. The 192x192 pixels (N = 36864) image from fluorescence collected at the input/proximal side of fiber is shown in Fig. 3 b: this reconstructed image demonstrates that the complex distribution of beads is well-recovered while preserving the boundaries of both individual and clustered beads. M = 10, 000 speckle illuminations were used to reconstruct the image, corresponding to about 25% of the total number of reconstructed pixels.
We also performed imaging of red fluorecent retrobeads (0.05 -0.2µm) from Lumafluor microinjected into the dorsomedial striatum (DMS) of a mouse brain, which was then sliced and mounted on a microscope slide. As for the first sample, Fig. 3 c shows a reference widefield fluorescent image of the sample. Fig. 3 shows the corresponding image reconstructed with our approach, and also clearly demonstrates the recovery of individual clusters of retrobeads in neurons. It can be observed that the resolution of the reconstructed object is dictated by the grain size of the speckle produced by the excitation wavelength, thanks to which the reconstruction image on Fig. 3 d is better resolved than the reference widefield fluorescence image on Fig. 3 c obtained by direct imaging with the fluorescent light. Moreover, the sparsity assumption allows a better z-sectioning by eliminating the out-of-focus features from the reconstruction image seen in the reference widefield fluorescence image on Fig. 3

c.
We further used the measurement values obtained with the TetraSpeck microscospheres to study the influence of the set of speckle patterns used to perform the measurements on the image quality. Fig. 4 a illustrates how chosing M = 3000 speckles patterns influences the quality of the reconstructed image: if the 3000 speckle patterns are chosen randomly out of a larger set of 10000 patterns, the reconstructed image is shown in Fig. 4 a (middle image, "No Optimization"). If the 3000 speckle patterns are chosen such as to minimize cross-correlation between speckle patterns as presented in Sec. 2, a significantly better image can be obtained as illustrated in Fig. 4 a (right image, "After Optimization").
In Fig. 4 b and c, we further illustrate how the size of the speckle pattern set influences the quality of the reconstructed image. We take a set with M = 10, 000 to define a reference reconstructed image, and study the quality of the images reconstructed with optimized subsets of smaller size. Fig. 4 b shows that for a speckle set of size as low as M = 3000, a N = 40, 000 pixel image remains recoverable while maintain-ing a good qualitative resemblance with the reference image, whereas for M = 1500, features of the object are obviously lost. Fig. 4 c provides a more quantitative insight into how the size of the speckle set influences the reconstruction: Fig.4 c plots the relative mean square error as well as the resemblance (estimated by cross-correlation) of the reconstruction images as a function of M with respect to the reference image. Simulations data are obtained by using images reconstructed from signal values computed from equation 2 with R containing each set of experimentally measured speckle patterns and the object O as the image 3 b with M=10,000. Both simulated and experimental curves show a decay in error and increase in resemblance with increasing number of illuminations, as expected. The effect is more pronounced in the case of the experimental data, which is likely due to the influence of experimental noise.

C. Hybrid imaging
To demonstrate the hybrid imaging capability of our system, we finally performed an experiment where we obtain simultaneously the photoacoustic and fluorescence image of the same sample. The system used is shown in Fig. 1 b. We used a diluted solution composed of red blood cells and 11 µm diameter fluorescent particles (Nile red) in PBS. Figure 5 a shows the widefield microscope image of the sample using incoherent illumination through the MMF where two fluorescence particles and one red blood cell can be identified. To obtain both the absorption and fluorescence images, two different illumination configurations had to be set: fluorescence imaging requires a low energy to avoid photo-bleaching of the fluorophores, while photoacoustic imaging requires high energy to have a sufficient signal-to-noise ratio. Therefore, we first performed the fluorescence measurement projecting all the speckle patterns with low energy (∼ 4nJ per pulse at the distal tip) and then the photoacoustic detection by projecting again the same DMD patterns at high energy (∼ 4µJ per pulse at the distal tip). Figure 5 b shows the false-color reconstructed photoacoustic image (red) and fluorescence image (green) from the set of signals corresponding to 2048 speckle patterns. The area calibrated with the CMOS camera is 300x300 pixels, corresponding to 150 µm 2 . The photoacoustic image clearly shows the ability to resolve single red blood cells and it does not have any spurious signal coming from the fluorescence particles as expected. In the fluorescence case, the signal recorded by the PMT only has contributions from the fluorescence particles.

DISCUSSION AND CONCLUSIONS
In conclusion, we have presented an ultra-thin system capable of performing simultaneous florescence and photoacoustic microscopy. The footprint of the probe head is 250µm x 125 µm, which makes it the thinest device capable of hybrid imaging to our knowledge. The reference-free calibration presented here, demonstrating the ability to recover the object using intensityonly measurements, simplifies the optical system compared to other approaches based on learning the transmission matrix of the MMF. Furthermore, we presented a way of optimizing the illumination set for a stationary MMF system, such that maximum information about the object can be recovered using the smallest possible number of illuminations, hence further boosting the imaging speed. Overall, the combination of photoacoustic and fluorescence imaging -two prevalent imaging modalities for in-vivo imaging through biological tissue -put together in a 250 µm thin fiber system makes for a powerful tool that could have a great impact in a range of biomedical applications, and specifically for deep brain neural activity detection. Hybrid photoacoustic/fluorescence microendoscopy through a multimode fiber using speckle illumination: supplementary material

FUNDING INFORMATION
This document provides supplementary information to "Hybrid photoacoustic/fluorescence microendoscopy through a multimode fiber using speckle illumination". We show a comparison between three different image reconstruction algorithms and we explain the details of the experimental setup used for fluorescence imaging.

S1. IMAGE RECONSTRUCTION ALGORITHMS
There are several methods to reconstruct the object as mentioned in Section 2. One approach based on simple correlations, as used in [35], is based on the following estimation of the object: In this approach, pixel-wise cross-correlations between the signals value S k and the calibrated intensity values directly provide an estimate of the object. As introduced in the main text, the object can also be estimated through a linear problem formulation by minimizing a cost-function. One simple possible approach in this case is to solve the following problem by using the Moore-Penrose pseudo-inverse "R −1 " of R and computing O = "R −1 " × S. This methods is however known to be sensitive to measurement noise, and the approach described in the main text was to minimize a cost function with a L1-based regularization term and positivity constrain as min The L1 penalty term is known to favor sparse objects. To solve the above minimization problem, we use Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) [40] a state of the art fast algorithm to find a solution of our optimization problem: The convergence of FISTA is theoretically guaranteed with a convergence rate of O(1/k 2 ). Figure S1 illustrates the performances of the three approaches described above for both fluorescence and photoacoustic imaging. The sparsity-and positivity-constrained reconstruction is cleary superior to the two other approaches, and was therefore used for all the reconstruction presented in the main manuscript.

S2. EXPERIMENTAL SETUP FOR FLUORESCENCE IMAGING
The fluorescence imaging setup used for experiments described in section 3B of the main paper consists of a 532 nm CW Coherent Verdi-G laser which passes through a spatial filter SF and is incident on a TI-DLP Discovery 4100 digital micro-mirror device (DMD). The DMD performs phase modulation on the incident wavefront using off-axis computer generated holography, as described in [22]. The incident wavefront is modulated using 9216 independent input modes and imaged onto the back focal plane of a 20x microscope objective MO1 which then couples it into the fiber. The distal end of the fiber is imaged using a 40x microscope objective MO2 and lens L3 onto a CMOS camera (Hamamatsu Orca Flash 2.8). For calibration, we project patterns from the binary random orthonormal basis set on the objective back focal plane to produce different speckles at the fiber distal end. After calibration, we mount the fluorescent sample on a sample holder SH at the fiber distal end and project the same set of patterns used for calibration. For each projection, we record the corresponding fluorescence signal coming back to the input end of the fiber using an Andor iXon+ Electron multiplying gain CCD (EMCCD). The excitation photons are rejected using a dichroic mirror DM (Chroma ZT532rdc-UF1) and a bandpass filter (Chroma ET590/50m).