A highly accurate ab initio dipole moment surface for the ground electronic state of water vapour for spectra extending into the ultraviolet

A new global and highly accurate ab initio dipole moment surface (DMS) for water vapour is presented. This DMS is based on a set of 17 628 multi-reference configuration interaction data points that were calculated with the aug-cc-pCV6Z basis set with the Douglas-Kroll-Hess Hamiltonian; tests are performed at several other levels of ab initio theory. This new "CKAPTEN" DMS improves agreement with recent experimental measurements for some infrared bands that previous models predicted poorly, while maintaining or improving on the agreement for all remaining strong lines. For high overtones located in both the visible and the near ultraviolet regions, our predicted intensities all lie within 10% of recent atmospheric observations. A crossing of energy levels in the ν1 fundamental and 2ν2 states is seen to offset transition intensities in the ν1 fundamental band; residual inaccuracies within the potential energy surface used are the cause of this problem.


I. INTRODUCTION
Water vapour can reasonably be regarded as the most important molecule on Earth; it is one of the basic requirements for life to exist, 1 but it is also the major contributor to the greenhouse effect and the largest absorber of incoming sunlight. 2 It has long been suggested that climate change may have significant implications for our hydrological cycle, 3 either suppressing it 4 or possibly enhancing it. 5 Similarly, any changes in atmospheric water vapour concentration will affect our climate.
The key to understanding this complex relationship is our ability to both accurately measure and model H₂¹⁶O spectra and hence water vapour concentrations. Atmospheric observations are usually based on spectroscopic databases such as HITRAN 6 or GEISA. 7 Given the abundance of water vapour and the complexity of its spectrum, water absorption lines can often obscure spectral features due to trace molecules whose monitoring can be important for a variety of atmospheric issues.
The upcoming NASA Tropospheric Emissions: Monitoring of Pollution (TEMPO) mission 8 plus the related European Sentinel 9 and Korean GEMS (geostationary environmental monitoring satellite) 10 missions will probe the Earth's atmosphere in the blue visible and near ultraviolet. As Lampel and co-workers 11,12 have already demonstrated, retrievals for trace species in this region require a dramatically improved understanding of the underlying water absorptions. Furthermore, it has been argued that retrieval of water columns at these short wavelengths has significant advantages 13,14 compared with measurements using longer wavelengths. This argument may have consequences for previous satellite missions that scanned the Earth's atmosphere in the visible and near UV, such as GOME, 15 SCIAMACHY, 16 GOME-2, 17 and OMI. 18 However, precise retrievals rely on the availability of accurate laboratory data. Even in this region, the absorption by water vapour is due to weak, high overtone rotation-vibration transitions, on which data are largely lacking.
Recently, Birk et al. 19 measured new experimental water line intensities in the infrared region and followed this up with a thorough comparison of these new accurate measurements against a computed line list that used variational nuclear motion calculations and the LTP2011 dipole moment surface (DMS) of Lodi et al.; 20 the current release of HITRAN, HITRAN2016, contains large sections based on transition intensities computed using the LTP2011 DMS. 21,22 Birk et al.'s findings are significant: while they found excellent agreement for many transitions, they also identified a number of discrepancies, including computed intensities of the ν₁ fundamental symmetric stretching band that deviated from experiment by between +5% and −13%; such disagreements for the fundamental are well known. 23,24
Recent atmospheric measurements in the visible and near ultraviolet by Lampel et al. 11 detected for the first time a prominent H₂¹⁶O absorption band at 363 nm, which corresponds to the eighth overtone of stretch quanta, (9,0) ± 0. Lampel et al. showed that neglecting this water absorption had a significant effect on the accurate retrieval of important trace molecules such as O₄, HONO, SO₂, and OClO, which show absorption features close to those of water vapour. Lampel et al. also suggested that water vapour absorption at wavelengths shorter than 360 nm may affect the retrieval of BrO and HCHO.
Several ab initio DMSs for water are available. 25-28 Notable among these are the SP2000 surface of Schwenke and Partridge, 24 which was used in the BT2 line list, 29 and a number of surfaces from Lodi et al.: Core-Valence-Relativistic (CVR) 28 and the more recent Lodi-Tennyson-Polyansky LTP2011 and LTP2011S 20 models, which provide a full and a reduced (smaller) fit to the same ab initio dataset.
Lampel et al. 11 used transition intensity calculations based on several of these DMSs to interpret their retrievals, and none of them provided accurate results. The most recent and complete POKAZATEL water line list of Polyansky et al. 30 used the LTP2011S surface and showed good behavior around 363 nm but was too weak, while the earlier CVR surface 28 had an irregular shape but demonstrated stronger absorption than LTP2011S. Lampel et al. concluded that the shape of water absorption in the 340-380 nm range is better replicated by POKAZATEL than by CVR, BT2, or HITEMP. 31
Broadband laboratory measurements by Du et al. 32 of near ultraviolet water cross sections show very significant absorption, which Wilson et al. 33 could not replicate. On the basis of their measurements, Du et al. also make assertions about the importance of short-wavelength absorption by water vapour to the Earth's energy budget. The precise value of the absorption by the vibration-rotation spectrum of water in the near ultraviolet remains an important open question. At present, the maximum transition frequency for H₂¹⁶O within HITRAN2016 is 25 710.8 cm⁻¹, in line with high-resolution laboratory studies where intensity measurements stop in this region. 34,35
The difficulty in creating an ab initio DMS capable of accurately modeling weak absorption in high-energy regions is no surprise. 24,36,37 Possible sources of error are extensive and include the accuracy of the underlying ab initio calculations, the number of data points calculated, the choice of functional form used in the fitting procedure, and the number of parameters used in the fit. The purpose of this work is to create a global DMS for the ground electronic state of the water molecule that improves the accuracy of previous models while simultaneously solving the issues surrounding high-frequency transitions occurring in both the visible and the ultraviolet.

II. COMPUTATIONAL METHODS
All electronic structure computations were carried out with the quantum chemistry package molpro. 38 Dipoles were computed using the finite differences (FD) approach, which necessitated four calculations per point for the dipoles and one other at zero field to obtain the energy at that geometry. Although it is computationally more demanding, experience has shown us that this method provides more accurate dipoles than the alternative of taking an expectation value (XP). 39 It also facilitates the inclusion of the contribution to the dipole of small corrections whose contribution to the energy is treated using perturbation theory.
Dipoles were calculated at the multi-reference configuration interaction (MR-CI) level of theory. To ensure numerical stability in the FD, the energy convergence threshold was set to 5 × 10⁻¹¹ Eₕ for all MR-CI calculations. The relaxed reference Davidson correction (+Q) has been applied to the MR-CI dipoles. Unless otherwise stated, relativistic corrections were obtained with the mass-velocity Darwin one-electron (MVD1) operator available within molpro.
With the exception of valence-only calculations, all ten electrons were fully correlated with both the complete active space self-consistent field (CAS-SCF) and MR-CI methods using an active space consisting of the nine lowest-energy A′ molecular orbitals (MOs) and the two lowest A″ MOs, denoted as (9,2).

A. Dipole accuracy
Lodi et al. 28 selected 14 "key structure" geometries that together offer an insight into the global behavior of the calculation methods; these same 14 points were used for our own investigation. Here and elsewhere, r₁ and r₂ denote the bond lengths between the oxygen atom, centred at the origin, and each of the two hydrogen atoms, with θ representing ∠HOH. First, we investigated the differences between dipoles computed with two different electric field strengths, 5 × 10⁻⁵ a₀ and 3 × 10⁻⁴ a₀, for basis sets aug-cc-pCV(T,Q,5,6)Z and aug-cc-pwCV5Z.
Table I shows that the difference between dipoles computed with the two field strengths is small, of the order of 10⁻⁶ a.u. Dipoles are, on average, slightly larger when computed with the smaller field strength. The two-point central difference formula is valid in the limit of the field strength going to zero; hence, the weaker field is preferred if numerical stability is achieved.
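As a minimal sketch of the finite-field approach described above, the dipole along one axis can be estimated from two energies computed at fields +F and −F using µ ≈ −[E(+F) − E(−F)]/(2F). The model energy function and its coefficients below are hypothetical stand-ins for an electronic structure calculation, not values from this work; they serve only to illustrate why the truncation error shrinks as the field is reduced.

```python
def dipole_central_difference(energy, field):
    """Two-point central difference estimate of mu = -dE/dF at zero field."""
    return -(energy(field) - energy(-field)) / (2.0 * field)


# Hypothetical model (atomic units), standing in for an ab initio energy:
#   E(F) = E0 - mu*F - alpha*F**2/2 - beta*F**3/6
E0, MU, ALPHA, BETA = -76.4, 0.73, 9.8, -20.0  # illustrative values only


def model_energy(field):
    return E0 - MU * field - 0.5 * ALPHA * field**2 - BETA * field**3 / 6.0


def truncation_error(field):
    """Finite-difference dipole minus the exact dipole of the model."""
    return dipole_central_difference(model_energy, field) - MU


if __name__ == "__main__":
    for field in (5e-5, 3e-4):
        print(f"F = {field:.0e}: error = {truncation_error(field):.2e} a.u.")
```

For this model the truncation error is βF²/6, so it falls quadratically with F. In a real calculation the gain from a weaker field is limited by the convergence threshold of the energies, which is consistent with preferring the weaker field only when numerical stability is achieved.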

aug-cc-pwCV5Z
A total of 8921 dipoles have been computed with bond lengths ranging from 1.3141 a₀ to 2.3141 a₀ and 84.35° ≤ θ ≤ 124.35°; the maximum energy calculated with respect to equilibrium is 45 789 cm⁻¹. This surface is referred to as pwCV5Z below.

aug-cc-pCV6Z
This data set includes 2915 configurations with OH bond lengths ranging from 1.3891 a₀ to 2.3141 a₀ and 84.35° ≤ θ ≤ 124.35°. The maximum energy within the sample is 23 625 cm⁻¹. This surface is referred to as MVD1 6Z below.

aug-cc-pV5Z
For this group, we computed an extended grid comprising 21 879 molecular configurations whose energies extend to 46 091 cm⁻¹. OH distances stretch from 1.3141 a₀ to 2.41 a₀ with ∠HOH restricted to 50° ≤ θ ≤ 178°. These calculations were performed using a smaller active space, (7,2), with the CAS-SCF requiring a doubly occupied 1s(O) orbital. Based on the result of Lodi et al., 28 who found that the (MVD1) relativistic and core corrections almost perfectly cancel, our valence-only dipoles exclude any relativistic treatment. The speed of these computations allowed us to reduce the threshold of MR-CI energy convergence to 2 × 10⁻¹¹ Eₕ, which provides a test of the numerical stability of our procedure. This surface is referred to as V5Z below.
TABLE I. Differences in dipole moments (in a.u.) computed for the 14 key structure geometries of Lodi et al. 28 for different electric field strengths using the aug-cc-pCV6Z basis set. Dipole values are shown for the field strength of 5 × 10⁻⁵ a₀, with ∆ defined as µ(5 × 10⁻⁵ a₀) − µ(3 × 10⁻⁴ a₀).

MVD1 and Douglas-Kroll-Hess comparisons
We investigate differences in dipoles computed with the Douglas-Kroll-Hess Hamiltonian to order two (DKH2) and the MVD1 method for the 14 "key structure" geometries; see Table II. DKH2 dipoles are always larger than the respective MVD1 values, with the deviation between the two methods growing for molecular configurations in which one or both bonds are stretched further beyond equilibrium.
These findings are rather unexpected; we cannot identify a previous study that investigated both methods of computing dipole moments in depth. We therefore computed dipoles with both methods for two different data sets: Set 1 with one bond fixed at 1.31 a₀ and the angle held constant at 104.85°, and Set 2 with one bond fixed at 1.6 a₀ and the angle set at 40°. Energies for Set 1 range from 29 633 cm⁻¹ to 40 955 cm⁻¹, with those from Set 2 in the region of 36 727 cm⁻¹ to 43 373 cm⁻¹. Figure 1 plots the difference between the two methods for these two sets. The DKH2 dipoles are consistently larger than those from MVD1.
A decreasing trend in deviation is observed at 2.6 a₀, which corresponds to an approximate molecular configuration energy of 38 100 cm⁻¹. The reason for such behavior is presently unknown.
TABLE II. Differences between DKH2 and MVD1 dipoles (a.u.) for basis sets 6Z and 5Z, calculated by subtracting the MVD1 value from that of DKH2. Shown are the DKH2 dipoles for the aug-cc-pCV6Z basis set. Column headings: 6Z (10 5 ) and 5Z (10 5 ).

DKH2 aug-cc-pCV6Z
It is also well known that accurate determination of a DMS requires a large number of grid points; 24,43 hence, the ab initio calculations presented here use the DKH2 method and encompass geometries with 30° ≤ θ ≤ 178° for a total of 17 628 configurations with OH stretches between 1.3 a₀ and 4 a₀. The maximum energy considered was 45 000 cm⁻¹, and for the linear least squares (LSQ) fitting procedure those points whose energy falls below 32 000 cm⁻¹ were given unit weight, with the remainder given a weight of 1 × 10⁻⁶. This threshold was chosen because points lying below it were not distorted by the introduction of this weighting scheme, while the 3437 ab initio points that lie above it help to provide good stability and to constrain the long-range behavior. This surface is referred to as CKAPTEN below.
In total, all ab initio calculations accumulated over 110 years of central processing unit (CPU) run time. These were completed in OpenMP and MPI (message passing interface) arrangements on the supercomputers Legion and Grace, respectively, each of which forms part of University College London's (UCL) research computing network. On the Legion cluster, Dell C6220 nodes were chosen (hardware subject to small differences as nodes were purchased in sections); 6 cores and 90 GB of local disk space were required per data point, and each core provided 4 GB of memory. For Grace, 2 nodes were utilised, each configured with 2 × 8 core Intel Xeon E5-2630 v3 processors and 120 GB of SSD (solid-state drive) disk space, with all nodes connected through Intel TrueScale QDR Infiniband infrastructure.
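The weighting scheme described above can be sketched as follows. The threshold and the two weights are taken from the text; the weighted rms definition, sqrt(Σ w r² / Σ w), and the sample residuals are assumptions made for illustration and are not necessarily the exact definition used in the fit.

```python
import math

ENERGY_THRESHOLD_CM = 32000.0  # cm^-1, threshold quoted in the text
HIGH_ENERGY_WEIGHT = 1e-6      # weight for points at or above the threshold


def fit_weight(energy_cm):
    """Unit weight below the threshold, 1e-6 otherwise."""
    return 1.0 if energy_cm < ENERGY_THRESHOLD_CM else HIGH_ENERGY_WEIGHT


def weighted_rms(residuals, energies_cm):
    """Assumed definition: sqrt(sum(w * r**2) / sum(w))."""
    weights = [fit_weight(e) for e in energies_cm]
    numerator = sum(w * r * r for w, r in zip(weights, residuals))
    return math.sqrt(numerator / sum(weights))


if __name__ == "__main__":
    # Made-up residuals (a.u.) and energies (cm^-1), for illustration only.
    residuals = [1e-4, 2e-4, 5e-2]
    energies = [1000.0, 20000.0, 40000.0]
    print(f"weighted rms = {weighted_rms(residuals, energies):.2e} a.u.")
```

With these made-up numbers the large residual at 40 000 cm⁻¹ barely affects the weighted rms, which illustrates how high-energy points can constrain the long-range behavior without distorting the low-energy region.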
The CKAPTEN surface alone required approximately 80 CPU years of the 110.

III. DMS FIT
The development of a functional form that can accurately model the DMS over an extended range of geometries was not straightforward; it was only via a lengthy process of trial and error that we found a suitable expression.
In the equilibrium configuration, we take rₑ = 1.8141 a₀ and θₑ = 104.52°, which is close to the experimental value. 44 The parallel dipole component is denoted as µ_x and bisects the angle θ, and the remaining dipole component µ_y is placed perpendicular to µ_x.
Three variables are introduced: ζ₁ = (r₁ + r₂)/2 − rₑ, ζ₂ = (r₂ − r₁), and ζ₃ = θ/θₑ. Each is chosen to approximately represent the behavior of a vibrational mode: ζ₁ reflects the symmetric stretching mode ν₁, ζ₃ reflects the bending mode ν₂, and ζ₂ reflects the anti-symmetric stretching mode ν₃. Each dipole component is represented by an analytic expression in these variables. Given our choice of coordinate system, the underlying symmetry provides several constraints that our functional form must satisfy. The first requires the parallel dipole to be zero at linear geometries, so we introduce a leading (π − θ) factor, which provides freedom in our choice of the ζ₃ variable. Second, under exchange of r₁ and r₂, we require µ_x to be symmetric and µ_y to be anti-symmetric. Hence, the exponent of ζ₂ must be even for µ_x and can include zero due to the presence of the leading (π − θ) factor. Similarly, for µ_y the exponent of ζ₂ must be odd.
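The symmetry constraints above can be checked numerically with a short sketch. The coordinate definitions and equilibrium values follow the text; the monomial terms `mu_x_term` and `mu_y_term` are hypothetical building blocks chosen for illustration and are not the CKAPTEN functional form.

```python
import math

R_E = 1.8141                    # equilibrium bond length in a0 (from the text)
THETA_E = math.radians(104.52)  # equilibrium angle in radians (from the text)


def zeta(r1, r2, theta):
    """Return (zeta1, zeta2, zeta3) for bond lengths in a0 and theta in radians."""
    return ((r1 + r2) / 2.0 - R_E, r2 - r1, theta / THETA_E)


def mu_x_term(r1, r2, theta, i, j, k):
    """Hypothetical parallel-dipole term; the zeta2 exponent j must be even."""
    if j % 2 != 0:
        raise ValueError("zeta2 exponent must be even for mu_x")
    z1, z2, z3 = zeta(r1, r2, theta)
    return (math.pi - theta) * z1**i * z2**j * z3**k


def mu_y_term(r1, r2, theta, i, j, k):
    """Hypothetical perpendicular-dipole term; the zeta2 exponent j must be odd."""
    if j % 2 != 1:
        raise ValueError("zeta2 exponent must be odd for mu_y")
    z1, z2, z3 = zeta(r1, r2, theta)
    return z1**i * z2**j * z3**k


if __name__ == "__main__":
    r1, r2, theta = 1.7, 2.0, 1.9
    print(mu_x_term(r1, r2, theta, 1, 2, 1), mu_x_term(r2, r1, theta, 1, 2, 1))
    print(mu_y_term(r1, r2, theta, 1, 1, 1), mu_y_term(r2, r1, theta, 1, 1, 1))
```

Swapping r₁ and r₂ only changes the sign of ζ₂, so terms with an even ζ₂ exponent are unchanged and terms with an odd exponent change sign, while the (π − θ) factor forces the parallel term to vanish at θ = π.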
A total of 126 and 98 parameters were used to fit the parallel and perpendicular dipole components, respectively, of the aug-cc-pCV6Z CKAPTEN surface, giving weighted root mean square (rms) deviations of 1.5 × 10⁻⁴ a.u. and 1.8 × 10⁻⁴ a.u.
In fitting the parallel dipole component of LTP2011, 20 Lodi et al. obtained a lower residual of 3 × 10⁻⁵ a.u. However, while their use of 200 fitting parameters for "only" 2628 configurations provides a smaller residual, our data set is over six times larger and we used only 128 parameters to ensure stability. For perpendicular dipoles, we obtain a lower rms than the 4 × 10⁻⁴ a.u. reported for LTP2011. 20 We have successfully restricted the residuals to be less than 10⁻³ a.u. for configurations lying below approximately 30 000 cm⁻¹ (Fig. 2). This was achieved by reducing the weight of those geometries whose energy with respect to equilibrium lies above 32 000 cm⁻¹. At 30 000 cm⁻¹, the LTP2011 DMS 20 has a deviation of its parallel and perpendicular dipoles of approximately 5 × 10⁻³ a.u. and 1 × 10⁻² a.u., respectively, which would create problems for bands that depend strongly on the perpendicular dipole contribution.
FIG. 2. Log plot of the absolute value of residuals per ab initio data point as a function of energy for the CKAPTEN dipole moment surface.

IV. CKAPTEN STABILITY
To test the stability of CKAPTEN DMS to our fitting procedure, we remove half the number of ab initio data points and refit this subset using the same functional form.The data points are ordered in increasing energy and every second removed to leave 8814 geometries.
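The subsampling step described above can be sketched as below. The record layout (an energy paired with a dipole value) and the choice of which member of each pair is kept are assumptions; the text only states that the points are ordered by energy and every second one is removed.

```python
import random


def halve_by_energy(points):
    """Order (energy_cm, dipole) records by energy and keep every second one."""
    ordered = sorted(points, key=lambda p: p[0])
    return ordered[::2]  # assumption: keep the 1st, 3rd, 5th, ... record


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for the 17 628 ab initio points (energies in cm^-1).
    grid = [(rng.uniform(0.0, 45000.0), rng.gauss(0.0, 1.0)) for _ in range(17628)]
    subset = halve_by_energy(grid)
    print(len(grid), "->", len(subset))
```

Removing alternate points from the energy-ordered list keeps the subset spread evenly over the full energy range, so the refit tests sensitivity to grid density rather than to a change in coverage.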
Neither rms value shows any substantial change, with both remaining as they were before to within 1 × 10 −5 a.u.Below 10 000 cm −1 , we observe less than 1% change in the theoretical intensities of both models.
For known problematic bands 19 above 10 000 cm −1 , we note a 1.5% change in the (121) band, with the smaller ab initio model predicting stronger line intensities.For the (102) band, we instead see line intensities slightly decrease in magnitude for the smaller model, representing an approximate 1% change.We observe negligible changes in 3ν 1 band intensities.
For the atmospheric data of Lampel et al., 11 notably the bands at 487 nm (511), 471 nm (303), 377 nm, and 363 nm (900), we see the intensities increase by 5% for the 487 nm (511) band but weaken for the remaining bands by 8%, 12%, and 11%, respectively. These high-energy overtones are extremely difficult to predict accurately, and such changes are small compared with the error associated with previous theoretical models.

V. COMPUTATION OF LINE INTENSITIES
Our line intensities are computed with the well-established, exact kinetic energy operator, variational nuclear motion program DVR3D. 45 The intensity of a transition I(ω_if) in units of cm per molecule, evaluated at frequency ω_if, is given by
$$I(\omega_{if}) = \frac{C\,\omega_{if}\,g_i}{Q(T)}\left[\exp\!\left(-\frac{E_i}{kT}\right)-\exp\!\left(-\frac{E_f}{kT}\right)\right]\sum_{\alpha}\left|\langle i|\mu_{\alpha}|f\rangle\right|^{2},$$
where C is a constant of value 4.162 034 × 10⁻¹⁹; E_i and E_f are the energies of the lower and upper states, respectively; k is the Boltzmann constant; g_i is the total degeneracy factor; and ⟨i|µ_α|f⟩ is the matrix element of the α component of the dipole moment between states i and f in Debye. Q(T) is the partition function at temperature T and has been calculated to high accuracy. 46,47 All intensities quoted in this paper assume the natural isotopologue abundance for H₂¹⁶O of 0.997 317, and the deviation between theoretical and experimental line intensities is calculated as 100(Obs./Calc. − 1).
Calculation of the ⟨i|µ_α|f⟩ dipole transition matrix elements requires wavefunctions. Unless otherwise stated, these were taken from those generated for the recent POKAZATEL H₂¹⁶O line list. 30 The POKAZATEL PES extends the highly accurate potential energy surface (PES) due to Bubukina et al. 48 This provides an accuracy of 0.03 cm⁻¹ for energy levels below 25 000 cm⁻¹ and 0.1 cm⁻¹ for those between 25 000 cm⁻¹ and 41 000 cm⁻¹. 30
We note that Lodi and Tennyson 21 analyzed the stability of lines with respect to changes in both the DMS and the PES used in the calculations. Stable lines were those that showed little sensitivity and whose uncertainty in their intensity was largely determined by the uncertainty in the underlying DMS. See Zak et al. 49 for a more detailed analysis of this methodology.
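A minimal sketch of how such an intensity and the quoted percentage deviation might be evaluated is given below. It assumes the standard form I = C ω g_i [exp(−E_i/kT) − exp(−E_f/kT)] Σ|µ|² / Q(T), with energies in cm⁻¹ so that 1/kT becomes c₂/T, where c₂ = hc/k is the second radiation constant. The function names, the input numbers, and the approximate partition function value are illustrative assumptions, not data from this work.

```python
import math

C_INTENSITY = 4.162034e-19  # constant quoted in the text
C2_CM_K = 1.438776877       # hc/k in cm K (second radiation constant)
ABUNDANCE = 0.997317        # natural abundance of H2(16)O quoted in the text


def line_intensity(omega, e_lower, e_upper, g_i, dipole_sq, q_t,
                   temperature=296.0, abundance=ABUNDANCE):
    """Intensity in cm/molecule.

    omega, e_lower, e_upper are in cm^-1; dipole_sq is the summed squared
    transition dipole in Debye^2; q_t is the partition function Q(T).
    """
    boltzmann = (math.exp(-C2_CM_K * e_lower / temperature)
                 - math.exp(-C2_CM_K * e_upper / temperature))
    return abundance * C_INTENSITY * omega * g_i * boltzmann * dipole_sq / q_t


def percent_deviation(observed, calculated):
    """Deviation as defined in the text: 100 * (Obs./Calc. - 1)."""
    return 100.0 * (observed / calculated - 1.0)


if __name__ == "__main__":
    # Illustrative inputs only; Q(296 K) ~ 174.58 is an approximate value.
    calc = line_intensity(omega=3657.0, e_lower=100.0, e_upper=3757.0,
                          g_i=9, dipole_sq=1e-4, q_t=174.58)
    print(f"I = {calc:.3e} cm/molecule")
    print(f"deviation = {percent_deviation(1.05 * calc, calc):.2f} %")
```

Keeping the partition function and degeneracy as explicit arguments makes the dependence on each quantity visible; in practice these would come from the nuclear motion calculation and a tabulated Q(T).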
Below, where reference is made to intensities of the LTP2011 surface, these lines were computed by us with the POKAZATEL PES.

VI. RESULTS
All intensities are presented for the HITRAN canonical temperature of 296 K. Below, we define weak lines as those with an intensity lower than 10⁻²⁴ cm molecule⁻¹. For transition frequencies in the infrared, we calculate all transitions for J ≤ 20. Spectral intensities within the visible and ultraviolet regions are confined to J ≤ 14.
Table III documents an average percentage deviation between theoretical intensities of both LTP2011 20 and CKAPTEN for the infrared bands measured by both Birk et al. 19 and Sironneau and Hodges. 50 For many bands, the agreement is excellent; below, we focus on those bands that can be regarded as problematic.
Figure 3 compares computed ν₁ line intensities with J ≤ 2 between various models and shows a strong dependence on the method used to compute the ab initio DMS. This is unusual behavior given that these are fundamental transitions, and different levels of ab initio calculation should not affect this band so dramatically. However, we do note that previous studies have already highlighted this fundamental as being problematic. 23,24 The vertical shift between the different models is attributed to the underlying electronic structure calculations used to construct the DMS. However, despite significant differences between the DMSs tested, we also observe an identical pattern, for which the PES, common to all, must be responsible.
Figure 4 recasts these data by intensity and shows that there is a clear change in the deviation between theory and experiment for intensities falling slightly below 10⁻²⁴ cm molecule⁻¹. However, we note that LTP2011 agrees well with CKAPTEN for these 181 weak lines. When we exclude these weak lines, Fig. 5 shows that for the remaining 467 strong lines in the ν₁ fundamental, the CKAPTEN DMS improves on LTP2011 for almost all transitions of Birk et al. 19 considered. Lines deemed unstable fall outside of the ±10% deviation shown here and are also excluded; there are 64 unstable theoretical lines. For the transition intensities that LTP2011 underestimates, CKAPTEN fractionally increases these, while it similarly reduces the intensity of those that LTP2011 overestimated, thus highlighting the good stability of CKAPTEN.
[Figure caption fragment: … 20 against the experimental intensities. 19 The uncertainties are experimental.]
CKAPTEN marginally underestimates the same ν 3 fundamental lines that LTP2011 20 slightly overestimates (Fig. 6).This overestimation is likely a direct consequence of the larger rms associated with the perpendicular dipole fitting of Lodi et al. 20 The experiment of Sironneau and Hodges 50 identified a failure by the BT2 line list 29 to accurately model 2ν 3 band intensities, with lines overestimated by an average of 5.3%, with those of Schwenke and Partridge (SP2000) averaging less than 1%.The SP2000 line list was actually computed by Tashkun (Tomsk, Russia) 52 using surfaces due to Schwenke and Partridge. 23,24We also find similar problems with the LTP2011 DMS, except that lines are now underestimated by approximately 5% (see Fig. 7).
We compare transitions from CKAPTEN, Lodi et al., 20 and the Schwenke and Partridge (SP2000) line list 23,24,52 with 35 of the experimental 50 lines in Fig. 7. Of these 35, CKAPTEN is closer to experiment 50 for 23.
The transition frequency using the POKAZATEL PES for (10 7 4) is 7745.5 cm⁻¹, while that for (10 7 3) is 7745.674 cm⁻¹; both are in error by approximately 0.2 cm⁻¹, nearly 7 times the 0.03 cm⁻¹ rms reported in Ref. 30. The error in transition frequency is due to the upper energy levels of POKAZATEL, with both lower states reproduced by POKAZATEL 30 to 3 × 10⁻³ cm⁻¹ with respect to the empirical data in HITRAN2016.
Polyansky et al. 53 identified a crossing of the rotational-vibrational energy levels of (020) and (100), which gave rise to a sharing of intensities between transitions involving these states. The trend in the ν₁ fundamental intensity deviations shown in Fig. 5 displays a very similar structure to the energy level crossings shown in Fig. 8, with a switch in behavior about the crossing region. Given that the LTP2011 DMS 20 produces intensities that follow a similar path to ours despite their use of fewer ab initio data points, a different functional form, and more parameters, the underlying issue probably lies with the common potential energy surface.
FIG. 8. Energy levels of states (100) and (020) with J = K_a plotted as a function of J. The zero energy is taken as that of the J_{J0} energy level within the (000) state, as done by Polyansky et al. 53 Energy levels taken from the MARVEL 54 database.
We tested the latest PES currently available for H₂¹⁶O, denoted PES15k, of Mizus et al., 51 which is of high accuracy but only valid below 15 000 cm⁻¹. Calculations were again constrained to J ≤ 20 and only transitions lying in the infrared were considered. For the 2ν₂ band, we observe that intensities increase in strength and the percentage deviation changes by approximately 0.56%, noting that with the POKAZATEL wavefunctions we were approximately 1.85% too weak. For energy levels (10 7 4) and (10 7 3), which are not accurately reproduced by the POKAZATEL PES, PES15k predicts these to approximately 0.03 cm⁻¹.
However, we see the deviation in the ν 1 fundamental go in the opposite direction by 0.40%, with intensities becoming slightly weaker.This behavior again supports our claim of intensity mixing occurring between transitions that involve states ν 1 and 2ν 2 .
A dramatic improvement is observed in most of the high-energy bands measured by Birk et al. 19 PES15k improves upon the POKAZATEL potential for a large number of low-lying bands.
Included in the supplementary material is a comparison of CKAPTEN and LTP2011 20 with all J = 6-7 experimentally measured intensities of Birk et al. 19

B. Visible and ultraviolet comparisons
Lampel et al. 11 recorded water vapour absorption in the visible and ultraviolet and compared their observations with several theoretical models: the POKAZATEL line list, which used the LTP2011S DMS (a fit with fewer parameters than the larger LTP2011), the CVR DMS, 28 the BT2 29 line list, and HITEMP. 31
For the 450-500 nm window, the strongest line intensities in HITRAN2016 are from the experiment of Tolchenov et al., 55 with the weaker lines taken from the BT2 line list of Barber et al. 29 Tolchenov et al. 55 compared their experimental intensities with several other models and determined BT2 29 to be the most accurate and reliable in this region.
The two strongest bands in Fig. 9 are (511) and (303), at 487 nm and 471 nm, respectively, and their theoretical line strengths depend primarily on the accuracy of the perpendicular dipole fit. The strongest line intensities of Tolchenov et al. 55 also share a similar shape to the absorption features measured by Lampel et al. 11 The POKAZATEL line list fails to accurately model either of the two observed bands in Fig. 9; again, this is likely due to errors in the fitting of the LTP2011S perpendicular dipoles, which has a residual of 5 × 10⁻⁴ a.u. Comparing CKAPTEN intensities with the strong lines of Tolchenov et al. 55 available in HITRAN2016, which have an associated experimental uncertainty of 2%-5%, the percentage deviation for these bands is in the range of 5%-6%, thus showing good agreement.
Lampel et al. 11 measured strong water vapour absorption at 363 nm, the (9,0) ± 0 feature, and determined that it is underestimated in POKAZATEL by a factor of 2.39 ± 0.05 and similarly a weaker band at 377 nm is too small by a factor of 3.1 ± 0.7.
CKAPTEN predicts the (9,0) ± 0 structure to be almost 2.7 times larger than that of POKAZATEL (see Fig. 10), approximately 10% greater than the Lampel et al. 11 observation. Likewise, the magnitude of the 377 nm band is nearly 3.9 times larger than POKAZATEL, slightly outside the error quoted by Lampel et al., 11 but given the large uncertainty of their measurement, this is a satisfactory result.
FIG. 11. Cross sections in the 285-355 nm region for the POKAZATEL line list and CKAPTEN surface generated using the HITRAN application program interface (HAPI), 56 with the scaled observations of Du et al. 32
The cross sections measured by Du et al. 32 have been the subject of much scrutiny 11,33,57 for being orders of magnitude too large. While we also agree that their measurements appear to significantly overestimate the overall absorption in this region, CKAPTEN does, however, predict absorption features that map quite well onto their observation (see Fig. 11), with the exception that Du et al. show a strong absorption near 290 nm. The band at approximately 335 nm is that with 10 quanta of stretch.
For this band, experimental line positions are blue-shifted by 5 nm compared with theory, while the band at 313 nm is red-shifted by an equal amount.Such large errors remove the POKAZATEL PES as an underlying cause; however, we note that the cross sections of Du et al. 32 were in fact measured in 5 nm intervals.The POKAZATEL line list also fails to show any significant absorption in this region.

VII. SUMMARY
This new DMS for water vapour, termed CKAPTEN, is the most accurate available; the calculations required approximately 80 CPU years to complete. It is based on a grid of 17 628 ab initio data points calculated with the Douglas-Kroll-Hess Hamiltonian using the aug-cc-pCV6Z basis set. The computations were performed at the MR-CI level of theory with an active space that included the nine lowest-lying A′ molecular orbitals together with the two lowest-lying A″ molecular orbitals.
Transition intensities in the infrared region that were poorly predicted by previous theoretical models are now closer to experimental measurements, notably for bands (002), (300), (102), and (201). For transitions occurring at lower frequencies, theory already predicts intensities to within 2% of the experimental values and CKAPTEN reproduces these intensities with the same or improved accuracy.
High-energy bands, such as (303) and (511), located in the visible spectrum now show excellent agreement with the strong experimental lines of Tolchenov et al. 55 available in HITRAN2016, with intensity deviations of about 5%-6%. Both bands also share a similar shape to the observations of Lampel et al. 11 This new surface predicts the (9,0) ± 0 feature to within 10% of the atmospheric observation of Lampel et al., 11 which is a dramatic improvement over the most recent models that underestimate some bands by a factor of 2.3. The high accuracy of CKAPTEN in predicting these intensities will hopefully offer a partial solution to the missing absorber problem in our atmosphere, although it could only account for possibly a few percent of extra absorption.
The crossing between vibrational energy levels (100) and (020) is identified as the cause of the long-running problem of computing the intensity of the ν₁ symmetric stretching fundamental to high accuracy. Use of the latest high accuracy water PES 51 leads to improved results for many transitions.
The stability of the DMS fit is investigated through the removal of half the ab initio points.Comparing with accurate experimental measurements in the infrared, we observe no change in intensities that causes deviation to increase by more than 1% for transition frequencies lower than 10 000 cm −1 , up to a maximum change of 1.5% in the (121) band.For observed overtones located in the visible and ultraviolet, we see a maximum change of 12% in the 377 nm band.
Upon testing a new potential surface, we observe intensity changes of approximately 1.35% in the 2ν 1 band, 0.56% in the 2ν 2 band, 1.40% in 3ν 1 , and up to 1.63% in 3ν 3 transitions.Such changes in these low-lying bands indicate that more work is indeed required in the fitting of potential energy surfaces for the water molecule.A Fortran subroutine containing the CKAPTEN DMS is available as supplementary material to this article.
Continued validation of the CKAPTEN surface against both experimental and atmospheric data for highly energetic overtones is planned, with the end result being a new line list providing accurate transition intensities extending from the infrared to the near ultraviolet.

SUPPLEMENTARY MATERIAL
The CKAPTEN DMS is available in the supplementary material.

FIG. 1. Differences between parallel and perpendicular dipole components computed with DKH2 and MVD1 methods using the aug-cc-pCV6Z basis set at two different configuration sets. The difference is given by the DKH2 dipole minus the MVD1 dipole. Positive points correspond to parallel dipole components, while negative ones are perpendicular components. Set 1 is for fixed r₁ = 1.31 a₀ and θ = 104.85°, while Set 2 has r₁ = 1.6 a₀ and θ = 40°.

FIG. 4. Log plot of the deviation from measured intensities 19 for all observed ν 1 lines for the CKAPTEN and LTP2011 20 DMS.The uncertainties shown are experimental.

FIG. 10. Comparison of the POKAZATEL line list with the new CKAPTEN surface for 363 nm and 377 nm absorption features.

TABLE III. […] 20 and average deviation (%) compared with measured transition intensities of Birk et al. 19 and Sironneau and Hodges. 50 The heading gives the PES used (upper row) and DMS used (lower row). The quoted experimental uncertainty is given for comparison.
FIG. 3. Intensity deviation from the experimental data of Birk et al. 19 in the ν₁ fundamental for transitions with J ≤ 2 for different theoretical models, LTP2011 of Lodi et al., 20 and our best for each basis set.