Measurement of thin film interfacial surface roughness by coherence scanning interferometry

Coherence Scanning Interferometry (CSI), which is also referred to as scanning white light interferometry, is a well-established optical method used to measure the surface roughness and topography with sub-nanometer precision. One of the challenges CSI has faced is extracting the interfacial topographies of a thin film assembly, where the thin film layers are deposited on a substrate, and each interface has its own defined roughness. What makes this analysis difficult is that the peaks of the interference signal are too close to each other to be separately identified. The Helical Complex Field (HCF) function is a topographically defined helix modulated by the electrical field reflectance, originally conceived for the measurement of thin film thickness. In this paper, we verify a new technique, which uses a first order Taylor expansion of the HCF function to determine the interfacial topographies at each pixel, so avoiding a heavy computation. The method is demonstrated on the surfaces of Silicon wafers using deposited Silica and Zirconia oxide thin films as test examples. These measurements show a reasonable agreement with those obtained by conventional CSI measurement of the bare Silicon wafer substrates.
© 2017 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license.


I. INTRODUCTION
Three dimensional inspection of transparent/semi-transparent thin film layers together with roughness measurements of their upper and lower interfacial topographies would be useful to many optical applications such as those involved with optical coatings, semiconductors, photovoltaics (PV) and flat-panel displays. Precise control of surface roughness and thin film thickness is essential to optimize the performance of optically active coatings.
Conventionally, stylus profilometers have been used to measure the step height between a thin film and its substrate and so determine the film thickness, even though the use of a sharp-pointed stylus can be destructive. Spectroscopic ellipsometry has been used for the non-destructive measurement of thin film thickness, with the result typically averaged over an area of the order of a square millimeter. Although stylus profilometry and spectroscopic ellipsometry are well-established techniques, there is a need for a non-destructive method for the measurement of thin film thickness in three dimensions with a higher horizontal resolution.
Optical Coherence Tomography (OCT) 1 has played an important role, particularly in medical applications. OCT reconstructs the tomographic information of biomedical tissues by means of the interference signal provided by a near infrared light source. Debnath et al. have proposed a method of measuring the tissue layer thickness and underlying topography using spectrally resolved OCT. 2 The layer thickness is typically over the micrometer range.
Coherence Scanning Interferometry (CSI), 3 also referred to as Scanning White Light Interferometry (SWLI) 4 or full-field OCT, 5 has been widely used for three dimensional surface topographical measurement. These methods are generally unsuitable for determining the film thickness where there is an interfacial surface roughness (ISR) between deposited films and the substrate. Nevertheless, many studies have been carried out on film thickness determination using CSI. Some methods treat the interference signals in the time domain [6][7][8] when each thin film has a thickness in a certain range (≳1.5 μm), while others make use of the spectral phase or amplitude of the signals in the frequency domain to allow measurement in the thin film range (≲1.5 μm). [9][10][11][12][13][14] The use of CSI methods for thin film thickness measurement based on time domain analysis is limited to film thicknesses ≳1.5 μm. This is because, as the surfaces of the film assembly get closer, the peaks in the interference signal overlap more in the time domain. Even so, this method provides a three dimensional presentation of interfacial surface topography and film thickness without difficulty. In contrast, those measurements which use frequency analysis work well in the thin film regime. For example, methods using the Helical Complex Field (HCF) function 15 are able to accurately measure film thicknesses of less than 100 nm. 16 There is, however, difficulty in using this method to determine the film structure on a pixel-by-pixel basis owing to noise in the signal. For this reason, the conventional HCF method normally computes the thin film thickness by averaging the signals over an area occupied by a few hundred pixels.
Mansfield proposed applying the HCF function to three dimensional pixel-by-pixel thin film thickness determination together with Interfacial Surface Roughness (ISR) 17 and then demonstrated this capability by comparing the resultant top surface of a dielectric-coated glass substrate with AFM measurements. 18 This paper aims to verify the ISR methodology using actual measurement results. Samples with dielectric films of Zirconium dioxide and Silicon dioxide on Silicon wafers with etched pits are tested: the roughness and the pit depth of the buried interfaces are measured and compared with those of the original substrate prior to thin film deposition. Theoretical consideration shows the ISR method works for perturbations of up to ~±10 nm depending on the materials used. We have analyzed pits ranging from −2.5 to −16.4 nm and have found good agreement with the original surfaces measured by CSI before deposition of the dielectric films.

A. Coherence scanning interferometry
In a typical CSI hardware configuration, an LED is often preferred as a light source over a halogen light for its intensity, lower heat production, and appropriate bandwidth for surface inspection. The bandwidth of the light source and the numerical aperture (NA) of the objective lens mainly determine the coherence length. [19][20][21] A white light source such as a halogen light has a shorter coherence length than an LED, which makes detection of the interference fringes on test surfaces less easy. A halogen light is nevertheless used in this method, because the methodology involves a numerical optimization in the frequency domain, and thus a wider bandwidth in the region between 400 and 750 nm leads to a more accurate solution.
The interference signal (interferogram), denoted by I, observed by a photo-detector can be approximated as a combination of a DC part and an oscillatory part with respect to axial (vertical Z) scanning. For example, Fig. 1 shows the interference signals observed from a bare smooth Si substrate and a thin film of 560 nm SiO2 on Si. It is clearly seen that the signal passing through the transparent film, SiO2, is attenuated and distorted due to reflection and absorption within the film structure. Evidently, the peak position in Fig. 1(a) corresponds to the substrate surface height while that of Fig. 1(b) collapses due to the thin film. Signals of this type therefore need a different treatment, such as analysis in the frequency domain.
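The DC-plus-oscillation picture of a bare-surface interferogram can be illustrated with a minimal sketch. The Gaussian coherence envelope, the parameter values, and the function name `synthetic_interferogram` are assumptions made for illustration only; they are not the instrument model used in this paper.

```python
import numpy as np

def synthetic_interferogram(z, height, dc=1.0, visibility=0.8,
                            wavelength=600.0, coherence_length=1200.0):
    """Toy interferogram: DC level plus fringes under a Gaussian envelope.

    All lengths are in nm. The Gaussian envelope and the default values
    are illustrative assumptions, not taken from the paper.
    """
    envelope = np.exp(-((z - height) / coherence_length) ** 2)
    # A change dz in scan position changes the optical path by 2*dz,
    # hence the factor 4*pi/wavelength.
    fringes = np.cos(4.0 * np.pi * (z - height) / wavelength)
    return dc * (1.0 + visibility * envelope * fringes)

# Scan positions sampled every 65 nm, within the 60-70 nm range quoted in the text.
z = np.arange(-40, 41) * 65.0
signal = synthetic_interferogram(z, height=130.0)

# For a bare surface the strongest fringe sits at the surface height.
peak_height = z[np.argmax(signal)]
```

For a film-covered surface the reflections from each interface overlap, so this simple peak search no longer identifies the substrate height, which is the motivation for the frequency-domain treatment.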

B. Helical complex field (HCF) function
A general interferogram consists of a DC component and a symmetric oscillation as shown in Fig. 1(a). The presence of transparent films on a substrate will distort the shape of the interference signal as shown in Fig. 1(b). This distortion is determined by the amplitude reflection coefficients of the thin film assembly.
The HCF theory 12,15,16 can be used to estimate the film thickness based on the distortion, where the positive side-band of the Fourier transform with respect to the frequency $\nu$, denoted by $\mathcal{F}[\cdot]_{SB+}$, of the interference signals from the film assembly $I$ and reference sample $I_{ref}$ are computed. The HCF function is theoretically and experimentally derived. Let $\mathrm{HCF}_d$ be the HCF function derived from an actual measurement and $\mathrm{HCF}_s$ be the one that is theoretically synthesized, respectively, with its film thickness vector $d = \{d_1, \ldots, d_L\}^\top$; then they are expressed as in Eq. (1), 22,23 where the unknown parameter $\Delta z_{HCF}$ satisfies $-2\Delta Z_{step} < \Delta z_{HCF} < 2\Delta Z_{step}$. Here, $\Delta Z_{step}$ is the data-sampling interval of the interference signal, which is normally about 60-70 nm, corresponding to one eighth of the mean effective wavelength (see Fig. 2(a)). The distortion due to the thin film assembly is finally interpreted as the change in phase and amplitude in the frequency domain. Note that normally a known flat surface such as Si or SiC is used as a reference sample, with its field reflectance averaged over the numerical aperture of the objective lens, $r_{ref}$. The symbol $\theta$ also denotes the incident light angle averaged over the numerical aperture (see Appendix A). The set of film thicknesses $d$ is numerically determined by minimizing the least squares error between $\mathrm{HCF}_s$ and $\mathrm{HCF}_d$ with respect to $d$ and $\Delta z_{HCF}$. This procedure is broken down to a curve fitting between the HCF functions with respect to the real and imaginary parts.
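As a rough illustration of how a measured HCF function might be formed from the quantities named above, the sketch below takes the positive side-band of the Fourier transform of the film and reference interferograms and scales their ratio by the reference reflectance. The ratio form, the function names, and the amplitude threshold `min_rel_amplitude` are assumptions for illustration; the exact expression is the one given in Eq. (1) and Refs. 22 and 23. The returned frequency axis is the fringe frequency along the scan axis, and its mapping to $\nu$ is not attempted here.

```python
import numpy as np

def positive_sideband(signal, step):
    """FFT of a mean-removed interferogram, positive frequencies only."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal - signal.mean())
    freqs = np.fft.rfftfreq(signal.size, d=step)
    return freqs[1:], spectrum[1:]  # drop the zero-frequency (DC) bin

def determined_hcf(i_film, i_ref, r_ref, step, min_rel_amplitude=0.05):
    """Assumed form: HCF_d = r_ref * F[I]_SB+ / F[I_ref]_SB+.

    Only bins where the reference spectrum carries appreciable energy
    are kept, to avoid dividing by near-zero values (an assumption).
    """
    freqs, f_film = positive_sideband(i_film, step)
    _, f_ref = positive_sideband(i_ref, step)
    band = np.abs(f_ref) > min_rel_amplitude * np.abs(f_ref).max()
    r_ref = np.broadcast_to(np.asarray(r_ref, dtype=complex), f_ref.shape)
    hcf = r_ref[band] * f_film[band] / f_ref[band]
    return freqs[band], hcf
```

With this form, a test signal that is merely a height-shifted copy of the reference yields an HCF of constant magnitude `r_ref` whose phase winds linearly with frequency, that is, an undistorted helix.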
As is apparent from Figs. 3(b) and 3(c), a single pixel HCF function suffers significantly from noise (dominated by photo-electron noise) in the frequency domain. The globally determined HCF function, denoted by $\mathrm{HCF}_d$, consists of an HCF function generated from an ensemble of pixel intensity sequences. This global function typically corresponds to an area with M pixels, say 200 × 200 as shown in Fig. 2(b), and exhibits a dramatically better signal-to-noise ratio as shown in Fig. 3(a). Following the preceding film thickness measurement method using the HCF function, 12,15,16 the optimal set of film thicknesses $\hat{d}$ is determined by minimizing the error function $J_{HCF} = \int |\mathrm{HCF}_d - \mathrm{HCF}_s(\nu; d)|^2 \, d\nu$ with respect to $d$ and $\Delta z_{HCF}$ under the constraint given after Eq. (1). Note that the physical meaning of $\Delta z_{HCF}$ is the height difference between the test film structure and the reference material, denoted by $z_{ref}$, randomly created by signal data sampling as illustrated in Fig. 2(a).
With this method, computation to determine $\hat{d} + \Delta d$ at all the pixels would require M non-linear optimizations. This would be excessively time-consuming, particularly when the number of pixels in a measurement area is large. Also, the optimization itself may well be compromised because of the significant noise apparent in the frequency domain.

C. Extension of HCF function to interfacial surface roughness measurement
A large number of optimizations would be required even if it were feasible to apply $J_{HCF}$ to each pixel by substituting $\mathrm{HCF}_d$ with the corresponding locally determined HCF function $\mathrm{HCF}_d^{px}$. It is also very difficult to achieve a good fit between the synthesized and the determined HCF functions due to the noise, as illustrated in Fig. 3(b), and thus this may lead to spurious results.
The ISR method provides a solution to this difficulty by effectively avoiding the noise effect in the optimization process. The ISR method uses such a relatively smooth HCF d as in Fig. 3(a) to approximate such a rough HCF d px as in Figs. 3(b) and 3(c) by means of the first order partial derivative of the synthesized HCF function. This approach brings two advantages: first a smooth local HCF function can be established; second a linear least squares error optimization will be conducted rather than a non-linear one.

First order approximation of the synthesized HCF function
Considering small deviations $\Delta d$ from the global film thickness $\hat{d}$ determined by the method discussed in Section II B, a first-order approximation of the local synthesized HCF function $\mathrm{HCF}_s^{px}$ can be expressed by using $\mathrm{HCF}_d$, 17,24 where the set of film thicknesses $d$ is understood hereafter to be $d = \{d_{sub}, d_1, \ldots, d_L\}^\top$, including a perturbation of the substrate $\Delta d_{sub}$ as shown in Fig. 2(a). $G_l$ is defined as a gain of the perturbation in the l-th layer and is analytically provided in advance (see Appendix B). 17 Note that $\hat{d}_{sub}$ cannot be obtained explicitly from $J_{HCF}$, although this will not cause problems because what is essential in the following calculations is $\Delta d_{sub}$. The value of each $\Delta d_l$ should be small enough (≲10 nm) to approximate the partial derivatives. The perturbations in the film thickness $\Delta d$ are determined by minimizing the least-squares error function $J_{px}$, where $\mathrm{HCF}_d^{px}$ represents a locally determined HCF function provided by the corresponding actual measurement. This locally determined HCF function apparently contains too much noise to be used for the non-linear optimization of $J_{HCF}$. Perturbations to be determined by means of $J_{px}$ should be less than ~10 nm to maintain the quality of approximation, depending on the structure of the film assembly.

Linear least-squares optimization
In the actual computation, the variables and functions are treated in a discrete manner such that $\nu = [\nu_1, \nu_2, \ldots, \nu_m]^\top$, and thus the merit function $J_{px}$ is effectively re-written in discrete form. As in Eq. (3), the function $\mathrm{HCF}_s^{px}$ depends linearly on $\Delta d$. Thus, the optimal solution $\Delta\hat{d}$ of the linear least squares error problem in Eq. (3) is explicitly determined in the well-known closed form. 25 This gives the Best Linear Unbiased Estimators (BLUE) for solving over-determined linear problems. It follows that the thin film thickness, or the interfacial surface roughness at a pixel, is finally obtained as $\hat{d} + \Delta\hat{d}$.
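The linear step can be sketched as an ordinary least-squares solve. The sketch assumes a local model of the form global HCF plus the sum over layers of $G_l(\nu)\,\Delta d_l$, with real-valued perturbations and complex gains sampled on the discrete frequency grid; the exact discrete expressions are those of the paper's equations and Ref. 25. The function name `solve_perturbation` and the array layout are illustrative assumptions.

```python
import numpy as np

def solve_perturbation(hcf_local, hcf_global, gains):
    """Least-squares estimate of the thickness perturbations at one pixel.

    hcf_local  : complex array (m,), noisy HCF measured at the pixel
    hcf_global : complex array (m,), smooth HCF of the surrounding area
    gains      : complex array (m, L + 1), column l holds G_l(nu)
    Assumed model: hcf_local ~ hcf_global + gains @ delta_d, delta_d real.
    """
    residual = np.asarray(hcf_local) - np.asarray(hcf_global)
    gains = np.asarray(gains)
    # Stack real and imaginary parts so the unknowns stay real-valued.
    a = np.vstack([gains.real, gains.imag])
    b = np.concatenate([residual.real, residual.imag])
    delta_d, *_ = np.linalg.lstsq(a, b, rcond=None)
    return delta_d
```

Because the gains depend only on the global solution, the stacked matrix can be built once and reused for every pixel, so that only the right-hand side changes from pixel to pixel.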
This optimization avoids such a time-consuming non-linear optimization as J HCF , and thus enables the determination of interfacial surface topographies in realistic timescales.

A. Experimental setup
The CSI instrument, CCI HD (Taylor Hobson Ltd), was used to observe the interference signals. As in the specification sheet, 26 the 4 M pixel camera of the instrument allows noise-robust signal acquisition by averaging the signals over four pixels to obtain one signal with high lateral resolution maintained. This four-pixel unit is henceforth regarded as a single pixel for purposes of the model.
As discussed in Section II A, a light source with a broad bandwidth in the visible region is necessary for this technique. In this experiment, a halogen lamp is used and its nominal characteristic light intensity is shown in Fig. 11(a). For the assumption discussed in Section II A to be satisfied, a ×10 objective lens shown in Table I is used for the data acquisition. 26 A Si optical flat surface is used as a reference sample. Note that any reference material can be used here but its refractive index $N_{ref}$ must be known in advance.

B. Test sample fabrication
The test samples were fabricated by etching a square pit on a Silicon wafer substrate using a 30 kV gallium Focused Ion Beam (FIB/SEM dual beam system). An example is shown in Fig. 4(a). The thin film oxide layers were then deposited using reactive magnetron sputtering with metal targets, an oxygen plasma source, and a pulsed DC power supply. A 20 μm × 20 μm pit was created by FIB etching, and its depth was controlled by timing the FIB etching and prior knowledge of the etching rate. Three types of oxide thin films were deposited on the Silicon wafers using the reactive magnetron sputtering as shown in Table II: producing 514.4 nm for SiO2, 308.6 nm for a second SiO2 thin film, and 338.9 nm for a ZrO2 thin film, which correspond to the Quarter Wavelength Optical Thickness (QWOT), respectively. The refractive index of the etched areas will be slightly modified due to surface amorphisation and Gallium ion implantation. In this study, we have ignored any such changes in the substrate refractive index.

C. Measurement result and analysis
Measurements were obtained from the samples listed in Table II using the CSI system. The measurement data were post-processed in two ways: using the ISR method and using a conventional CSI surface measurement. The latter method detects the peak positions of the interference signals (CCI method 27 ). Comparisons between the results of the two methods were then made to verify the performance of the ISR method. Figures 4(a)-6(c) illustrate the topographies of some of the substrate surfaces. These images imply that the ISR method extracts the patterns more accurately than the conventional CSI method. Also, the surfaces measured by the ISR method appear to be rougher than the original surfaces measured before thin film deposition.
We define S o (x, y) to be the substrate surface topography measured prior to thin film deposition by the CSI instrument, S ISR (x, y) the substrate surface topography measured by the ISR method and S CSI (x, y) that measured using the conventional CSI method, respectively. Note that in this experiment, although both the (buried) substrate and top surface are simultaneously generated, only the substrate surface measurements are considered. This is because, apart from vacuum metallization of a reflector (such as Cr), no available methods were deemed sufficiently accurate to measure the top surface.

Result: Depth of the FIB etched square pit
The depth of the etched pits measured by the ISR method will be compared with those obtained using the conventional CSI method. The depths are computed by comparing the average height of the pits and that of the rest of the measured area. Note that roughness on the surfaces involved is not considered for this evaluation.
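The depth evaluation described here amounts to a difference of two mean heights. A minimal sketch follows, assuming the pit region is supplied as a boolean mask; how the mask is obtained (for example from the known FIB pattern position) is not specified in the text, and the function name `pit_depth` is hypothetical.

```python
import numpy as np

def pit_depth(height_map, pit_mask):
    """Mean height inside the pit minus mean height of the remaining area.

    height_map : 2D array of surface heights (nm)
    pit_mask   : 2D boolean array, True inside the etched pit (assumed given)
    A pit below the surrounding surface gives a negative value.
    """
    height_map = np.asarray(height_map, dtype=float)
    pit_mask = np.asarray(pit_mask, dtype=bool)
    return height_map[pit_mask].mean() - height_map[~pit_mask].mean()
```

Because both terms are area averages, pixel-level roughness largely cancels, which is consistent with roughness not being considered in this evaluation.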
While the conventional CSI method only considers the light reflection from the whole thin film/substrate assembly to measure a three dimensional surface, the ISR method successfully separates the contribution of the substrate from the complete thin film/substrate signal. As a result, the ISR method reproduces the buried surface roughness more faithfully comparing (a) with (b) and (c) in Figs. 4-6. Figures 7(a) and 7(b) show comparisons in the depth of the substrates measured by the ISR and the conventional CSI method. The interferogram from the sample #3 ZrO 2 has its peak corresponding to the top surface of the thin film, and this makes the correlation in the depth of the etched pit less accurate as shown in Fig. 7(b).

Analysis: Correlation of the determined surfaces
The correlation coefficients between $S_o$ and $S_{ISR}$ and between $S_o$ and $S_{CSI}$ are provided in Fig. 8. The correlation coefficient operator Cor is defined here to be the maximum value of the correlation function of two different surfaces as follows:

$$\mathrm{Cor}(S_o, S_{mtd}) = \frac{\max\,(S_o \star S_{mtd})}{\|S_o\| \, \|S_{mtd}\|},$$

where $S_1 \star S_2$ denotes the cross-correlation function between surfaces $S_1$ and $S_2$, and $\|\cdot\|$ implies the L2 norm; mtd represents the method to be used, chosen as either ISR or CSI.
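A minimal sketch of such a coefficient is given below, taking the maximum of the cross-correlation normalized by the L2 norms of the two surfaces. Two choices here are assumptions not stated in the text: the mean height is removed from each surface first, and the cross-correlation is evaluated circularly via the FFT.

```python
import numpy as np

def correlation_coefficient(s_a, s_b):
    """Maximum of the normalized cross-correlation of two height maps.

    Assumptions: mean heights are removed, and the correlation is circular
    (computed with FFTs), so shifts wrap around the image edges.
    Both surfaces must have the same shape and must not be perfectly flat.
    """
    a = np.asarray(s_a, dtype=float)
    b = np.asarray(s_b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    xcorr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    return xcorr.max() / (np.linalg.norm(a) * np.linalg.norm(b))
```

Taking the maximum over all lateral shifts makes the coefficient insensitive to a small misregistration between the two measurements, and by the Cauchy-Schwarz inequality the value cannot exceed 1.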
It is clearly seen that the correlation coefficients decline in an inverse proportion to the depth of the corresponding square pit. This is because the measurements and images obtained using the ISR method contain spurious features which are associated with the signal to noise in the local pixel.
Root Mean Square (RMS) errors between S o and S ISR and between S o and S CSI are shown in Figs. 9(a) and 9(b), respectively. As shown, the RMS errors between S o and S ISR are almost all smaller than 1 nm regardless of the thin film type and the depth of the etched pit, whereas the conventional CSI method results in larger errors as shown in Fig. 9(b). The root cause of the RMS errors in Fig. 9(a) is related to the various sources of noise in the signal. The noise stems from several sources not only including uncertainty in the measurement of optical constants using spectroscopic ellipsometry, but also from noise generated in the CSI system and its surrounding environment. Although the ISR method results in a larger level of RMS error with the #2 SiO 2 sample compared to the #3 ZrO 2 sample as shown in Fig. 9(a), it provides the #2 sample with a higher correlation coefficient for the depth than #3 sample as shown in Fig. 8. This implies that the #2 SiO 2 sample is more susceptible to random noise.
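The RMS error between two surfaces can be sketched as follows. Removing the mean height difference before taking the RMS is an assumption made here, since the absolute height offset between two separate measurements is arbitrary; the text does not state how the surfaces were levelled or registered.

```python
import numpy as np

def rms_error(s_reference, s_measured):
    """RMS of the height difference between two surfaces of equal shape (nm).

    Assumption: the mean offset between the surfaces is removed first.
    """
    diff = np.asarray(s_measured, dtype=float) - np.asarray(s_reference, dtype=float)
    diff = diff - diff.mean()
    return np.sqrt(np.mean(diff ** 2))
```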
The surface roughness determined by the ISR method is expected to increase from the original due to the noise induced by the system, surrounding environment, and uncertainty in the physical constants. A Silicon wafer is used as the substrate in this experiment such that its surface should be regarded to be "flat." The original surface roughness (Sq) of the areas surrounding the pit is 0.63 ± 0.11 nm. As expected, those determined by the ISR method resulted in an increased Sq as shown in Fig. 10. This roughness is also visually apparent in a comparison between the surfaces in Figs. 4(a) and 4(b). As in Fig. 10, the roughness values increase slightly, depending on the type of the film and its thickness.
FIG. 7. … Table II, and the value of the outlier of the sample #3 in (b) is −285 nm. (a) Pit depth correlation by the ISR method and (b) pit depth correlation by the conventional CSI method.

IV. DISCUSSION
As far as determination of the pit depth is concerned, the ISR method works best for #1 SiO2 samples with QWOT = 5, followed by #2 SiO2 with 3 and #3 ZrO2 with 3 as in Fig. 7(a). QWOT is known to be proportional to the number of peaks and valleys found in a spectral reflectance as shown in Fig. 11(b). 28 The HCF function has such spectral features corresponding to its field reflectance. Therefore, an HCF function with a small QWOT might have difficulty in numerical optimization because the smaller number of features of the function results in less accurate curve-fitting. This implies that the value of QWOT has a significant impact on the performance of the HCF based techniques. The ISR method is not an exception because the HCF function introduced in this method is a first order approximation of the original HCF function. The difference in the performance of the ISR method shown in Fig. 7(a) is attributed to the values of the QWOTs, which are effectively 5.27 for #1 samples, 3.09 for #2, and 4.59 for #3 as shown in Table III.
Reflection from the thin film assembly influences the signal to noise ratio under the assumption that the noise level is unchanged throughout the measurements. It follows that the increases in the roughness values, which are discussed in Section III C 3, are thought to vary inversely with the product $g(\nu)$ between the light intensity LI and an averaged reflectance. Let G be the integral of the product $g(\nu)$; this is proportional to the number of signal photoelectrons and can therefore be considered as an indication of the robustness of the measurement to the noise effect. Figures 11(b) and 11(c) show the spectral reflectance of the samples with a mean film thickness and the computations of their $g(\nu)$, respectively. The values computed for each group at its mean film thickness are presented in Table III. As expected, the G value for the #2 SiO2 sample is the smallest among the three groups, and so is the mean value of g, by 10%-20% compared with the other groups. From these, one reasonable explanation for the level of roughness induced on the substrate is the signal to noise ratio associated with the spectral product between the reflection of a film structure and the intensity of the light source. Evidently, an error also arises from the approximation in Eq. (B3), and uncertainties in refractive indices also influence the signal to noise ratio.
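The figure of merit G can be sketched numerically as the integral of the spectral product of light intensity and reflectance. The sketch assumes both spectra are sampled on a common, increasing frequency grid and uses a plain trapezoidal rule; the function and argument names are illustrative, and no values from Table III are reproduced.

```python
import numpy as np

def noise_robustness(nu, light_intensity, reflectance):
    """Return the spectral product g(nu) and its integral G.

    nu              : 1D increasing array of spectral sample points
    light_intensity : normalized source spectrum LI sampled at nu
    reflectance     : averaged reflectance of the film assembly at nu
    Uses the trapezoidal rule; the sampling grid is an assumption.
    """
    nu = np.asarray(nu, dtype=float)
    g = np.asarray(light_intensity, dtype=float) * np.asarray(reflectance, dtype=float)
    big_g = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(nu))
    return g, big_g
```

Comparing G between film assemblies measured with the same source then gives a relative indication of which sample returns fewer signal photoelectrons and is therefore likely to show more noise-induced roughness.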

V. CONCLUSIONS
Application of the conventional CSI techniques to thin film metrologies, such as film thickness and interfacial surface roughness, is limited by the film thickness because the interferogram is generally analysed in the time domain where the peaks of the signal should be separated. The general requirement for the film thickness is over 1.5 μm. The introduction of the HCF function to the thin film regime (from 0.05 to 1.5 μm) enabled the measurement of global film thickness over hundreds of pixels. 16 For thin films of significant optical thickness (such as a SiO2 layer of a few hundred nanometers), the HCF signature is easily sufficient to allow interfacial topographies to be determined on a pixel-by-pixel basis using this approach; as the optical thickness of the film(s) is further reduced, this capability becomes progressively less viable. Nevertheless, the HCF function used in the ISR method, derived from a first order approximation of the HCF function, extends the HCF method's capability to determine the local interfacial or buried surfaces and provide a three dimensional representation. In addition, heavy computation for numerical optimization can be avoided as this method uses a linear approximation of the HCF function.
The Silicon wafer samples having an etched pit depth ranging from 2.5 to 16.4 nm were deposited with Silica and Zirconia oxide. These pits were measured through a thin film layer by using the ISR method and the existing CSI method (CCI). Prior theoretical consideration showed the ISR method held for perturbations of up to ~±10 nm, and the substrate surfaces determined by the ISR method were almost identical to the original surfaces. Together with the experimental results presented in the earlier top surface ISR publication, 18 these results provide substantive experimental evidence for the ISR theory.
The roughness of the substrate surfaces measured by the ISR method tends to be larger than those of the original surfaces. This approach numerically determines the local film thickness including the substrate surface profile by establishing a synthesized HCF function by means of the global HCF counterpart and compares it with a relatively noisy local HCF function. It follows that the smaller the noise, the less is the roughness induced on the surface measurement. Standard noise reduction techniques such as signal accumulation or post-analysis filtering could be applied to minimize this roughness.
The CSI technique is a powerful, well-established tool for the precise measurement of surface topography. The development of the HCF function has already extended its capability for the determination of refractive index. 23 This paper extends its capability further to include the three-dimensional measurement of buried interfaces.

ACKNOWLEDGMENTS
The authors are grateful to RCUK for financial support through the SuperSolar Hub (EPSRC Grant No. EP/J017361/1).
FIG. 11. Evaluation of noise-robustness based on a product between the reflectance and spectral light intensity of the light source. Note that the normalised light intensity in (a) and the reflectance of the mean film thickness by group in (b) are used for the computation in (c). (a) Spectral normalised light intensity of a halogen light source implemented in the CSI instrument, (b) spectral reflectance of the film assemblies with the averaged film thickness, and (c) spectral product between reflectance in (b) and normalised light intensity in (a).