Adaptive 3D convolutional neural network-based reconstruction method for 3D coherent diffraction imaging

We present a novel adaptive machine-learning based approach for reconstructing three-dimensional (3D) crystals from coherent diffraction imaging (CDI). We represent the crystals using spherical harmonics (SH) and generate corresponding synthetic diffraction patterns. We utilize 3D convolutional neural networks (CNN) to learn a mapping between 3D diffraction volumes and the SH which describe the boundary of the physical volumes from which they were generated. We use the 3D CNN-predicted SH coefficients as the initial guesses which are then fine tuned using adaptive model independent feedback for improved accuracy.


I. INTRODUCTION
In-situ characterization of detailed 3D views of defects and interfaces and their evolution at the mesoscale (few nm -hundreds of µm) are required to develop microstructure-aware physics-based models and to design advanced materials with tailored properties 1,2 . Coherent Diffraction Imaging (CDI) is a non-destructive X-ray imaging technique providing 3D measurements of sample electron density at nm resolution from which sub nm atomic displacement estimates can be calculated to understand deformation for µm sized non-crystalline specimens [3][4][5][6][7][8] .
CDI has now been applied for a wide range of scientific studies including biology, physics and engineering 9 . CDI has been used to measure the 3D structures of individual viruses 10 and bacteria 11 , for imaging quantum dots 12 , to image red blood cells infected with malaria 13 , for 3D imaging of human chromosomes 14 , for imaging the 3D electron denisty of large ZnO crystals 15 , using only partially coherent light 16 , for measuring 3D lattice distortions due to defect structures in ionimplanted nano-crystals 17 , and for measuring dislocations in polycrystalline samples 18 .
The CDI technique records only the intensity of the complex diffraction pattern originating from the illuminated sample volume, in which all phase information is lost. If the phase information in the diffraction signal could be measured, then a simple inverse Fourier transform would provide the 3D electron density which generated the diffraction pattern. Many iterative numerical methods for achieving phase retrieval have been developed which map measured diffraction patterns to electron density [19][20][21][22][23][24][25][26][27][28][29][30][31] . The existing CDI phase reconstruction methods are sometimes very lengthy processes requiring extensive fine tuning by expert users. Existing algorithms for inverting diffraction signals to produce a real-space image are sometimes brute-force and usually very computationally expensive. Additionally, iterative phase retrieval algorithms require expert knowledge, relying on a wide range of experience and expertise in various fields. Standard methods are sensitive to small variation in diffraction signals and different a) Electronic mail: ascheink@lanl.gov b) Electronic mail: reeju@lanl.gov users may produce inconsistent reconstructions depending on the experience of the user and the choice of initial guesses for the parameters while expert users are able to combine various conventional algorithms such as error reduction, difference map, shrinkwrap and hybrid input-output to be capable of exploiting the available frequency information in a CDI measurement, utilizing robust treatments of measurement signal noise 32 . The challenges of iterative phase retrieval make it a good candidate for utilizing machine learning methods. Although ML methods cannot substitute for traditional algorithms, they do have the potential to help with the speed of obtaining reconstructions by providing an initial guess which is then fine tuned by traditional methods to achieve accurate results.
Machine learning (ML) tools, such as deep neural networks, have recently grown in popularity due to their ability to learn input-output relationships of large complex systems. Neural networks have been used to speed up lattice quantum Monte Carlo simulations 33 , for studying complicated many body systems 34 , for depth prediction in digital holography 35 , and combined with model-independent feedback for adaptively controlling particle accelerator beams 36 .
Recently 2D convolutional neural networks (CNN) have been utilized for speeding up diffraction-based reconstructions. CNNs have been developed to directly map 2D diffraction amplitude measurements to the amplitudes and phases of the 2D objects from which they originated, presenting an approach for orders of magnitude faster 2D amplitude and phase reconstructions for CDI 37 . CNNs have also been recently developed for orders of magnitude faster mapping of electron backscatter diffraction (EBSD) patterns to crystal orientations 38 . In this work we utilize Tensorflow and the automatic differentiation capabilities of the software package, a recent application of automatic differentiation to problem of phase retrieval is given in 39 .

II. SUMMARY OF MAIN RESULTS
A graphical summary of the method proposed in this work is shown in Figure 1. We present an adaptive ML approach to the reconstruction of 3D object with uniform electron densities from synthetic diffraction patterns. Our adaptive ML framework utilizes a combination of a 3D convolutional neu-ral network together with an ensemble of model-independent adaptive feedback agents to reconstruct 3D volumes based only on CDI diffraction measurements. The algorithm uses 3D diffracted intensities as inputs and provides outputs in the form of spherical harmonics which describe the surfaces of the 3D objects with uniform densities that generated the diffracted intensities.

III. MATHEMATICAL BACKGROUND
Ideally, the goal of the the CDI measurements would be to record the complex diffracted scalar wavefield ψ(w) = |ψ(w)| exp [iφ (w)], which is related to the Fourier transform of the electron density, ρ(r) of the sample, where r = (x, y, z) is the sample space and w = (w x , w y , w z ) is reciprocal space coordinates, respectively. If such a measurement could be made, the 3D electron density could be reconstructed by simply performing an inverse Fourier transform. Unfortunately, when a coherent X-ray passes through a material with electron density ρ(r), what is recorded on a detector is the intensity of the diffracted light, given by with all of the phase information lost 4,6 . Reconstructingψ(w) requires lengthy phase retrieval algorithms which are typically carried out after the experiments and performed by expert users.

A. Spherical harmonics shape descriptors
Our approach is to represent the unknown electron density inside a 3D object by a collection of basis vectors in the form of spherical harmonics, which describe the surface that encloses the volume of material of interest. Spherical harmonics are a generalization of the 1D Fourier series for representing functions defined on the unit sphere. For any D > 0, the Hilbert space of real square-integrable functions defined over the interval x ∈ [0, D] is defined as with the inner product of any f , g ∈ L 2 [0, D] defined as Distance between functions in L 2 [0, D] is defined by the metric It is well known from Fourier analysis that any function f (x) ∈ L 2 [0, D] can be approximated arbitrarily closely by a linear combination of the basis functions ϕ c,n (x) = cos 2πnx D , ϕ s,n (x) = sin 2πnx D , n ∈ N.
(5) If a sequence of functions f N are defined as then For any function s(θ , φ ) defined on the surface of the sphere, where θ ∈ [0, π] and φ ∈ [0, 2π] are the spherical coordinates, the function can be approximated arbitrarily accurately with a representation of the form where the coefficients are found by the inner product By approximating in terms of the basis of spherical harmonics, we assume that we can find a star-convex approximation of a surface s(θ , φ ).

IV. ADAPTIVE MACHINE LEARNING FOR PHASE RETRIEVAL
To determine the unknown electron density ρ(r, θ , φ ), we make the assumption that the electron density is non-zero only within some compact set and that the density is uniform within Note that in this proof-of-concept work, we are considering solid objects of uniform density d without internal structures. Our 3D reconstruction approach is to find a set of coefficientŝ a ml up to order l = N, y = (â 00 , . . . ,â ml , . . . ,â NN ), which define a surface that approximates s(θ , φ ) by constructinĝ  which in turn defines an electron densitŷ

Iterative ES
that approximates ρ(r, θ , φ ). In order to find the appropriate spherical harmonics, we calculate the amplitude of the 3D Fourier transform |F (ρ(r, θ , φ ))| which represents the amplitude of a complex scalar diffracted wavefield |ψ(w)| and then compare it to the ground-truth synthetic 3D diffraction pattern.

A. 3D Convolutional Neural Network
Our approach uses a combination of a 3D convolutional neural network together with model-independent adaptive feedback. Convolutional neural networks are very powerful tools that can learn relationships between parameters in complex systems and in this case can directly utilize spatial information to learn 3D features from the 3D amplitude of the Fourier transform. The architecture of the 3D CNN network developed for our problem is shown in Figure (2).
We point out that in mapping 3D Fourier transform intensities to spherical harmonic coefficients, a CNN is only able to predict the even l-valued harmonics Y m 0 ,Y m 2 ,Y m 4 , . . . for the following reason.
Consider two volumes described by surfaces which are per-turbations of a sphere, of the form where l is odd. The odd l-valued harmonics are themselves odd functions because all real spherical harmonics satisfy: Therefore the two surfaces in (16) can be rewritten as which are simply reflections and so the intensities of the Fourier transforms of their volumes ρ ± are indistinguishable because of the lost phase information. When teaching a neural network to map diffraction patterns to coefficients, we end up giving it inputs generated from two different surfaces and volumes, with their corresponding spherical harmonic coefficients as the correct outputs: Because in this case the Fourier intensities are exactly the same after learning over thousands of random data sets, the neural network is confused and at best can predict only the average value of 0 for all of the odd spherical harmonic coefficients. This problem can be confirmed numerically in three ways: 1). If only positive values of odd harmonics are used to generate volumes then the CNN learns how to map diffraction patters to both even and odd harmonics, but this limits its applicability because realistic objects have shapes that are composed of both odd and even harmonic components. 2). If the network is tasked with only identifying the magnitude of the odd harmonics it is able to learn the relationship, but resulting structure predictions must then iterate through all of the possible ± combinations to find those which best match the given Fourier transform intensity to calculate the correct object shape, but the created object's orientation will not necessarily match that of the target. 3). Finally, if the CNN is given the actual 3D electron densities or the 3D Fourier transforms (not just intensities) which contain the phase information as inputs, then it learns to map all spherical harmonics correctly, but this is not helpful for our problem, where the goal is to make predictions solely based on diffracted intensities. This limitation of a neural network approach was also documented in 37 where they found that the neural network would sometimes predict objects that "are twin images of each other, and that they can be obtained from each other through a centrosymmetric inversion and complex conjugate operation. Both images are equivalent solutions to the input diffraction pattern." Because of the limitations described above, the CNN was trained to map 3D diffracted intensities to only the even valued spherical harmonic coefficients that describe the boundaries of the volumes whose Fourier transforms generated those intensities.
Data for training the 3D CNN was generated by sampling coefficients from uniform distributions with ranges: and generating training volumes based on surfaces of the form where the large Y 0 0 (θ , φ ) value ensured that we are working with a well defined surface perturbed by higher order components similar to naturally occurring complex grain shapes.
We generated 500,000 training sets of 49 coefficients, for l = 0, . . . , N = 6, each of which was used to generate a surface and a volume bound by that surface to perform a 3D Fourier transform. The input to the CNN was the intensity of the 3D Fourier transform and the output of the CNN was a 28 dimensional vector which was compared to the 28 even spherical harmonic coefficients Y m l for l ≤ 6: y = (y 1 , . . . , y 28 ) = (â 00 ,â −22 , . . . ,â 22 , . . . ,â 66 ) via the cost function: CNN performance is illustrated in Figures 3 and 4 where the predictive accuracy of 200 unseen test sets is shown along with the lowest and highest prediction accuracy shapes.
In this setup CNN performance is accurate, but can only make predictions for the even spherical harmonic coefficients. Furthermore, for use in experiments, the accuracy of a learned model-based approach such as a CNN may suffer depending on experimental setup changes and may require very lengthy experiment-specific retraining.
In order to predict the even spherical harmonics and also to make these results more robust to a wide range of experimental conditions, the next step of this approach is to use a modelindependent algorithm that adjusts all spherical harmonic coefficients directly based on matching individual 3D Fourier transforms. The model independent approach is also capable of handling very large numbers of coefficients as shown below when tuning up to 225 spherical harmonics simultaneously.

B. Model-independent tuning
For the adaptive part of this work, we utilize a modelindependent extremum seeking (ES) algorithm which has was originally developed for control and optimization of uncertain and time-varying systems by simultaneously tuning large numbers of coupled parameters based only on noisy measurements 40 . This bounded form of ES has been analytically studied with convergence proofs for general nondifferentiable dithers 41 , has been proven to converge to optimal controllers for unknown systems 42 , and has been applied to automatically control charged particle beams in particle accelerators 36 .
The ES method is applicable to n-dimensional dynamic system of the form where y = (y 1 , . . . , y n ) are physical quantities of interest, such as diffraction patterns of electron densities. The p = (p 1 , . . . , p m ) are controlled parameters, such as the spherical harmonics that define the surface of a volume and t is time. The function f may be an unknown function governing the system's dynamics.Ĉ is a measurement of an analytically unknown function C(y,t) that is noise-corrupted by an unknown function of time, n(t), and depends on both the parameter values y and on time due to a time-varying system environment.
In our approach, we compare the intensities of the measured diffraction and generated diffraction wavefields and quantify the difference with the numerical cost function whose mini-mization is our goal: where integration is performed over a volume in reciprocal In experimental applications of such a method, uncertainty would come from the unknown electron densities that we are trying to find and from uncertainties (such as misalignment of components and drifts in X-ray coherence volume, wavelength, and flux) in the experimental setup.
The parameters that we tuned were the Y l m coefficients y = y 1 , . . . , y j , . . . , y (1+N) 2 = (â 00 , . . . ,â ml , . . . ,â NN ) , (23) which define the boundary surface of an unknown volume as in Equation (11) and the function that we are minimizing is C(y) as defined in (28). The ES algorithm perturbs parameters according to the dynamics where ω j = ωr j and r i = r j for i = j. In (24) α is a dithering amplitude which can be increased to escape local minima. Once the dynamics have settled near an equilibrium point of (24), which may be a local minimum of C, each parameter will continue to oscillate about its local optimal value with a magnitude of 2α/ω j . The term k > 0 is a feedback gain.
For ω 1 the dynamics (24) are on average approximated by which tracks the time-varying minimum of the unknown function C(y,t) with respect to y(t) although using only its noisecorrupted measurementĈ as input. The reason behind convergence is that the evolution of the coupled parameters y j is decoupled and made orthogonal relative to the inner product in the L 2 [0,t] Hilbert space as defined in (3) lim ω i ,ω j →∞ cos(ω i t), cos(ω j t) = 0. (26) Details and analytical proofs are available in [40][41][42] . For iterative optimization as done in this work, we replace the continuous time dynamics (24) with their discrete-time approximation and make iterative updates according to y j (n + 1) = y j (n) + ∆ t 2αω j cos ω j ∆ t n + kĈ(n) , (27) which is a finite difference approximation of (24) for ∆ t 1. In order to test the convergence properties of the ES algorithm, we simultaneously measured the following three quantities during convergence for 100 random volumes: The quantity C F (ρ) is a measure of the percent difference between the intensity of the target and reconstructed Fourier transforms. Convergence would mean we have matched the intensity of the Fourier transforms. However, it does not guarantee the correct shape due to missing phase information, and the same 3D object could be rotated or reflected. The quantity C ρ is a measure of the mismatch between volumes which is non-zero when the objects have the same shape, but are of different orientations. Finally, the quantity C ∆ρ is a measure of shape convergence which subtracts the total volumes occupied by the two shapes and therefore will converge to zero when the two shapes are the same even if they have different orientations.
We created 100 random 3D shapes, generated their 3D Fourier transforms, and fed the intensities of those transforms into the 3D CNN. The predictions of the CNN were then used as the starting point for the ES algorithm. Results of ES convergence for 100 random 3D shapes are shown in Figures 5  and 6. Looking at the top images in the second and third columns of Figure 5 it is clear that the CNN-based objects had Fourier transform intensity errors, C F (ρ) of 40% relative to their full spectrum measures as defined in (31) and volume errors, C ∆ρ of approximately 3%. The bottom images of the second and third columns show that by the end of convergence the average intensity error was 8.4% and volumetric error was 0.19%.
In Figure 5, it is evident that on average all of the quantities C, C F (ρ) , C ∆ρ , and C ρ converge towards zero; however C ρ has several large outliers that never converge which implies that the densities being created are of the correct shape, but wrong orientation. The last column of Figure 5 is showing the errors between predictedâ ml and correct a ml values. The green background in Figure 5 highlights the even valued coefficients which we expect to match exactly while the odd components are expected to sometimes not converge due to the ambiguity introduced by the lack of phase information in the Fourier transform's intensity, as discussed above. Overall, the results of Figure 5 confirm that the ES approach is very robust and is able to find the correct object shape with the possibility of an incorrect orientation in space, as expected. Figure 6 shows three examples of exact agreement between 3D test objects and their ES-based reconstructions.

C. Adaptive ML for experimental data
In order to further demonstrate the robustness of the adaptive ML approach, we applied it to an experimentally measured 3D crystal volume that was obtained using high energy diffraction microscopy (HEDM) 43 . HEDM is used for nondestructive measurements of spatially resolved orientation (∼ 1.5 µm and 0.01 • ), grain resolved orientation, and elastic strain tensor ( 10 −3 ) from representative volume elements with hundreds of bulk grains in the measured microstructure (mm 3 ) 44 . HEDM measurements at multiple states of a sample's evolution can be used as inputs to inform and validate crystal plasticity models 45,46 . For a broad overview of HEDM and its many applications the reader is refereed to 44 and the multiple references within.
To test the robustness of our adaptive ML approach to a structure that had never been seen by neither the CNN nor the ES algorithm during its tuning and design, we picked out a single 3D grain from a polycrystalline copper sample which was measured with the HEDM technique at the Advanced Photon Source (APS) 43 . The intensity of the 3D Fourier transform of this volume was fed into the CNN which provided an estimate of the first 28 even a ml coefficients. These were then fed as initial guesses into the ES adaptive feedback algorithm which had the freedom to tune all 225 coefficients of the l ∈ {0, . . . , 14} Y m l (θ , φ ) spherical harmonics in order to match the generated and measured diffraction patterns. The 3D shape and 2D slices of the amplitude and phase of the reconstructed particle results of convergence are shown in Figures 7 and 8.
The HEDM grain is relatively large with ∼60 µm diameter and is therefore too large to be imaged with existing CDI techniques due to light energy and coherence length limitations of existing light sources. Nevertheless, for testing the proposed method, the morphology of the HEDM crystal was interesting in its complexity and was similar to what has been measured by Bragg CDI techniques as applied to quantum dot nanoparticles 9 . Furthermore, advanced light sources such as the planned Linac Coherent Light Source II (LCLS-II) free electron laser (FEL) and the APS Upgrade (APSU) are expected to have increased transverse and longitudinal coherence lengths with techniques such as self seeding 47,48 , to image larger than 1 µm diameter crystals using high-energy CDI combined with HEDM 49 . The top image of the last column shows the errors of the convergedâ ml coefficients with the green band highlighting the even coefficients relative to which there is no ambiguity in the intensity of the Fourier transform. The bottom image of the last column shows the a ml errors together with mean and standard deviation as well as the bounds of the random distributions from which they were generated (blue/dashed).
FIG. 6. Convergence of the ES algorithm for 3 different structures A, B, and C. In this demonstration the algorithm always started with only Y 00 = 0 which created an initial spherical shape as shown on the left. The reconstructed volumes perfectly matched the targets in our 50×50×50 volume representations as seen by the error r(θ , φ ) showing the initial and final differences between the volume bounding surfaces mapped onto the sphere. The last column shows the convergence of all parameter errors as the algorithm is able to find the correct values for all 36 Y lm settings in each case.

V. CONCLUSIONS
In this proof-of-concept work, we demonstrate reconstructions of arbitrary 3D shapes, while assuming no contribution of internal lattice distortions to the diffraction signal. One immediate limitation of this method is that by parameterizing surfaces as single valued functions over the 2D domain (θ , φ ) ∈ [0, π]×[0, 2π], we are limiting ourselves to producing only star convex shapes. Star convex shapes are ones in which  a line can be drawn from the center point to the outer edge without intercepting any other edges and therefore do not include more complex surfaces which are not simply connected, such as a donut-like shape with holes. Generalization of this approach to a larger class of shapes will be the study of future work by utilizing surface parameterization which decomposes a 3D particle surface onto three orthogonal directions [50][51][52] . Furthermore, the method presented here can be readily extended to reconstruct additional phases for crystals with internal structures due to inherent defects and dislocations by several methods including the use of generative convolutional neural networks or by extending the adaptive modelindependent process to include more degrees of freedom. Although the CNN model is trained only on synthetic diffraction data, the adaptive framework will readily account for noise in the experimental data for robust reconstruction of 3D crystals with internal structures, which is a topic of future work.