Deep Learning for UV Absorption Spectra with SchNarc: First Steps Towards Transferability in Chemical Compound Space

Machine learning (ML) has been shown to advance the research field of quantum chemistry in almost any possible direction and has recently also entered the excited states to investigate the multifaceted photochemistry of molecules. In this paper, we pursue two goals: i) We show how ML can be used to model permanent dipole moments for excited states and transition dipole moments by adapting the charge model of [Chem. Sci., 2017, 8, 6924-6935], which was originally proposed for the permanent dipole moment vector of the electronic ground state. ii) We investigate the transferability of our excited-state ML models in chemical space, i.e., whether an ML model can predict properties of molecules that it has never been trained on and whether it can learn the different excited states of two molecules simultaneously. To this end, we employ and extend our previously reported SchNarc approach for excited-state ML. We calculate UV absorption spectra from excited-state energies and transition dipole moments as well as electrostatic potentials from latent charges inferred by the ML model of the permanent dipole moment vectors. We train our ML models on CH$_2$NH$_2^+$ and C$_2$H$_4$, while predictions are carried out for these molecules and additionally for CHNH$_2$, CH$_2$NH, and C$_2$H$_5^+$. The results indicate that transferability is possible for the excited states.


I. INTRODUCTION
Photosynthesis, 1,2 the ability of living beings to see, and the photorelaxation of, e.g., DNA and proteins, which protects them from photodamage, [3][4][5] are fascinating examples of the importance of light-matter interactions for our daily lives. Another marvelous aspect is the color of every thing and every being, which is related to the absorption of a part of the incident light's spectrum. In order to get a deeper understanding of these phenomena and to find out whether a molecule can be excited by light, the following questions have to be answered: At which wavelengths can a molecule absorb electromagnetic radiation? How strongly is light absorbed at these wavelengths? Can this absorption be used to identify a molecule?
In order to answer such questions, experiments or quantum chemical calculations are usually carried out. Assuming the resonance condition, i.e., the equivalence of the energy of one or more photons of the incident light with the energy gap between two electronic states, single- or multiphoton excitations can take place to one or more excited states if an oscillating dipole is induced. [6][7][8] The oscillator strength, $f^{\mathrm{osc}}_{ij}$, between electronic states $i$ and $j$ is related to the transition dipole moment, $\mu_{ij}$, of the respective electronic states as well as the energy difference, $\Delta E_{ij}$, between them 9 and is given in a.u. by $f^{\mathrm{osc}}_{ij} = \frac{2}{3} \Delta E_{ij} |\mu_{ij}|^2$. The larger the oscillator strength, the more likely a transition takes place.
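As a minimal illustration of the relation above, the following Python sketch evaluates the oscillator strength from an energy gap and a transition dipole vector. The function name and the numerical values are assumptions chosen for illustration and are not taken from the reference data; all quantities are assumed to be in atomic units.

```python
HARTREE_PER_EV = 1.0 / 27.211386  # conversion factor from eV to Hartree


def oscillator_strength(delta_e_au, mu_au):
    """Return f = (2/3) * dE * |mu|^2 with all inputs in atomic units."""
    mu_squared = sum(component * component for component in mu_au)
    return 2.0 / 3.0 * delta_e_au * mu_squared


# Hypothetical example: a 9 eV gap and a transition dipole of 1 a.u. along z
delta_e = 9.0 * HARTREE_PER_EV
mu = (0.0, 0.0, 1.0)
f_osc = oscillator_strength(delta_e, mu)
```

A state with a vanishing transition dipole moment yields zero oscillator strength in this expression, regardless of the energy gap; such a state is called dark in the discussion of the spectra.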
Corresponding experiments often lack the possibility to distinguish and characterize the different electronic states and rely on theoretical simulations to identify the states and provide detailed insights into their characters. However, these calculations are limited by the high costs of solving the underlying quantum chemical equations. Especially the excited states necessitate highly accurate quantum chemical methods, whose computational costs scale unfavourably with the number of electronic states and atoms considered in the calculations. 10,11 Further, sampling of many different molecular configurations followed by statistical averaging is often required in order to accurately reproduce the shape of experimentally obtained spectra. The large number of calculations needed to obtain accurate results seriously limits the applicability of such simulations.

a) Electronic mail: philipp.marquetand@univie.ac.at
A solution to the aforementioned problems can be obtained with (atomistic) machine learning (ML) models, which have been shown to be extremely powerful for the electronic ground state in providing ML potentials for energies or dipole moments, see e.g. refs 12-31. ML force fields exist 15,18,19,28,[32][33][34][35][36][37][38][39] and the transferability of properties has also been demonstrated. 32,[40][41][42][43][44][45] The main advantage of ML models is that they can sample a huge number of molecular configurations with the accuracy of the underlying quantum chemical calculations at only a fraction of the original costs. 37,46 Recently, the interest in advancing the research field of photochemistry and in tackling the excited states with ML has increased. [47][48][49] The fitting of molecule-specific potential energy surfaces (PESs) and coupling values [50][51][52][53][54][55][56][57][58][59][60][61][62][63][64][65] or dipole moments as single values 56,57,63 has been demonstrated to date, and energy gaps, HOMO-LUMO gaps as well as oscillator strengths have been fitted. 45,[66][67][68][69][70] The newly proposed ML models are mostly based on many configurations of a single molecule. 47,[52][53][54]56,57,[62][63][64][65][71][72][73][74] Only a few ML models treat different molecules in their energetic equilibrium structure, which is mapped to a single output, e.g. the oscillator strength. 75 Yet it is unclear whether a universal ML force field, as it exists for the electronic ground state, is also feasible for the excited states. The description of many molecular systems with one ML model further requires the construction of excited-state properties from atomistic contributions, but most ML models targeting the excited states employ molecule-wise descriptors and some studies suggest molecular descriptors to be superior to atom-wise descriptors for the excited states. 47,48,68
Another limitation of many existing ML models for the computation of absorption spectra is that they fit the oscillator strengths rather than the excited-state energies and transition dipole moments. Fitting the latter properties is beneficial, as they can be used, e.g., for the computation of photodynamics, ML/MM (ML/molecular mechanics) schemes 76 similar to QM/MM (quantum mechanics/MM) schemes, 77 or the investigation of explicit light interaction, 78,79 to name only a few applications.
Transition dipole moments and permanent dipole moments can be computed by applying the dipole moment operator as implemented in many electronic structure programs. Permanent dipole moments can also be constructed from atomic charges using the point charge model (eq. 5). With access to the atomic charges of a molecule, not only can the dipole moment vector be computed, but the charge fluctuations during dynamics or along different reaction coordinates can also be investigated and electrostatic potentials can be computed. 80,81 The atomic charges of the excited states can further be used to construct approximated excited-state force fields 82 or to investigate how the charge distribution changes due to light excitation. Although atomic charges are considered one of the most intuitive chemical concepts, they cannot be obtained directly by solving the Schrödinger equation. 83 Subsequent analysis of the charge distribution in a molecule is highly dependent on the underlying partitioning scheme applied. 84 Dipole models based on ML 17,33,39,[85][86][87][88][89][90][91][92] can provide access to the density or latent partial charges while being based on the underlying electronic structure theory. For instance, the latent charges obtained from the dipole moment ML model reported in ref. 17, which never learns atomic charges directly, show good agreement with common charge models (CHELPG 93 and Hirshfeld 94 ), which are considered to be more reliable than, for example, Mulliken 95 charges. 81 They have been used to plot electrostatic potentials and assess the changes of atomic charges with respect to molecular geometries for the electronic ground state in Refs. 81,96. Electrostatic potentials are further of interest 80 for interpreting noncovalent interactions, 97 for quantitative structure-activity relationships, 98 or for force fields. 99
Unfortunately, the fitting of transition dipole moments in particular is challenging, as the sign of properties resulting from two different electronic states is arbitrary due to the arbitrary phase of the wave function, 56,100 and because rotational covariance has to be preserved for vectorial properties. To the best of our knowledge, only one study exists, 71 in which phase-corrected transition dipole moments were treated in a rotationally covariant way and a single-state fashion 63 with ML. The trained ML models were used to fit model Hamiltonians for subsequent prediction of UV spectra. Yet an ML model that can describe many different PESs, forces, and dipole moment vectors simultaneously for the prediction of UV spectra does not exist.
In this work, we adapt the aforementioned ground-state charge model to describe the permanent dipole moments of the excited states and, in addition, extend it to model the transition dipole moments in a rotationally covariant way. To this end, we use the SchNarc deep learning approach, originally developed for photodynamics simulations, and extend it to additionally enable the computation of UV spectra. The extended SchNarc approach enables a simultaneous modelling of permanent and transition dipole moment vectors of an arbitrary number of electronic states in addition to a manifold of excited-state potentials, forces, and couplings thereof. The methylenimmonium cation, CH$_2$NH$_2^+$, and the isoelectronic molecule ethylene, C$_2$H$_4$, are used as model systems to assess the accuracy of ML-fitted transition dipole moments and latent partial charges by computation of UV spectra and electrostatic potentials.
In addition, we aim to evaluate the possibility of training one ML model on a set of molecular conformations of CH$_2$NH$_2^+$ and C$_2$H$_4$, i.e., a multi-molecule model. The performance of this model is assessed by comparison to the single-molecule ML models. As SchNarc constructs energies and dipole moments from atomic contributions, the transferability of this model toward other molecules not included in the training set is evaluated. Thus, in addition to CH$_2$NH$_2^+$ and C$_2$H$_4$, the molecules CH$_2$NH, CHNH$_2$, and C$_2$H$_5^+$ are described, which are not included in the training set and have never been seen by the ML model.

II. THEORY
Recently, we reported the SchNarc approach for efficient photodynamics simulations with ML-fitted PESs, derivatives, and couplings. 64 In this work, we extend this deep learning model to fit permanent and transition dipole moments in a rotationally covariant way of an arbitrary number of states and pairs of states, respectively.
In order to train an ML model on the different excited-state PESs and properties simultaneously, a training set has to be provided that consists of molecular conformations on the one hand and the corresponding PESs, forces, and excited-state properties on the other hand. The molecular geometries are automatically transformed into molecular descriptors by SchNet; 33 these descriptors are intrinsic to the network architecture, which further relates them to the excited-state properties in an end-to-end fashion. 64,90 The loss function, $L_{\mathrm{SchNarc}}$, which is used to monitor the error on the different properties during training includes permanent and transition dipole moments, summarized in the term $\mu$, in addition to energies, forces, and different types of couplings:

$$L_{\mathrm{SchNarc}} = t_E L_E + t_F L_F + t_\mu L_\mu + t_{\mathrm{SOC}} L_{\mathrm{SOC}} + t_{\mathrm{NAC}} L_{\mathrm{NAC}}$$

The trade-offs between the errors of the different properties are labelled with the letter $t$, and the error of each property with $L$. The subscripts $E$, $F$, $\mu$, SOC, and NAC denote energies, forces, permanent and transition dipole moments, spin-orbit couplings, and nonadiabatic couplings, respectively. The couplings are not accounted for in this work, as spin-orbit couplings arise between states of different spin multiplicity 101 (only singlet states are treated here) and nonadiabatic couplings can be approximated from Hessians of the PESs. 64 While the energies and forces can be monitored using the mean squared error (MSE) between the properties predicted by the ML model (denoted with the superscript "ML") and the quantum chemical reference values (denoted as "QC"), a phase-less loss function has to be applied for coupling values and transition dipole moments unless they are phase corrected. 56 Different variants of such a phase-free training algorithm have been proposed by us, which depend on the type of calculation and can be found in detail in Ref. 64.
SchNarc automatically determines the most suitable phase-free training process, which is in its simplest form the minimum function of the MSEs assuming once a negative and once a positive sign of a coupling value or dipole moment, respectively. The minimum function can be used when only one excited-state property with arbitrary signs is treated, which is the case here:

$$L_\mu = \min\left( \left\| \mu^{\mathrm{QC}} - \mu^{\mathrm{ML}} \right\|^2, \left\| \mu^{\mathrm{QC}} + \mu^{\mathrm{ML}} \right\|^2 \right)$$

The dipole moments are treated as vectorial properties and thus the signs within a vector are conserved. As permanent dipole moment vectors are described together with transition dipole moment vectors, they are also trained in a phase-free manner. As a consequence, they are only defined up to an arbitrary sign, which can lead to permanent dipole moment vectors pointing in the wrong direction. Hence, they have to be adjusted for a reference molecular geometry when making predictions. A more detailed discussion can be found in the supporting information (SI). 64

The model for permanent and transition dipole moments used here is based on the charge model of Ref. 17: Since the ML model is an atomistic one, atomic contributions to the molecular dipole moment can be automatically obtained. These atomic contributions are taken as latent atomic charges, i.e., they have to be multiplied by the distance vector, $r^{\mathrm{CM}}_a$, of each atom, $a$, to the center of mass of the molecule and are then summed up, before feeding the resulting dipole moment into the loss function. In the same way as the permanent dipole moment of the electronic ground state, SchNarc fits the permanent dipole moment of arbitrary states, $\mu_i$, and transition dipole moments, $\mu_{ij}$, between different electronic states according to equations 5 and 6, respectively.

$$\mu_i = \sum_a q_{i,a} \, r^{\mathrm{CM}}_a \qquad (5)$$

$$\mu_{ij} = \sum_a q_{ij,a} \, r^{\mathrm{CM}}_a \qquad (6)$$

Here, $q_{i,a}$ denotes the latent charge of atom $a$ in state $i$ and $q_{ij,a}$ the latent transition charge of atom $a$ between states $i$ and $j$. Note that also here the atomic charges are latent variables and the "atomic transition charges" between two different states used to obtain transition dipole moments do not have a direct physical meaning. However, these charges are the quantities that are used in the predictions. They are then multiplied with $r^{\mathrm{CM}}_a$ and, in this way, allow for rotational covariance of the transition dipole moment vectors.
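The charge model and the simplest phase-free error described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SchNarc implementation: the function names, the three-atom example, and its charges, masses, and coordinates are hypothetical, and the latent charges are passed in as plain numbers instead of being predicted by a network.

```python
import math


def center_of_mass(positions, masses):
    """Mass-weighted mean position of all atoms."""
    total = sum(masses)
    return tuple(
        sum(m * p[k] for m, p in zip(masses, positions)) / total
        for k in range(3)
    )


def dipole_from_charges(charges, positions, masses):
    """mu = sum_a q_a * r_a^CM, with r_a^CM measured from the center of mass."""
    com = center_of_mass(positions, masses)
    return tuple(
        sum(q * (p[k] - com[k]) for q, p in zip(charges, positions))
        for k in range(3)
    )


def phase_free_error(mu_ref, mu_pred):
    """Smaller of the two squared errors obtained for either overall sign."""
    err_plus = sum((r - p) ** 2 for r, p in zip(mu_ref, mu_pred))
    err_minus = sum((r + p) ** 2 for r, p in zip(mu_ref, mu_pred))
    return min(err_plus, err_minus)


def rotate_z(vec, angle):
    """Rotate a 3-vector about the z axis by `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return (c * x - s * y, s * x + c * y, z)


# Hypothetical three-atom example (all values are illustrative only)
charges = [0.4, -0.1, -0.3]
masses = [12.0, 1.0, 14.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (0.0, 1.2, 0.3)]

mu = dipole_from_charges(charges, positions, masses)
rotated_positions = [rotate_z(p, 0.7) for p in positions]
mu_rotated = dipole_from_charges(charges, rotated_positions, masses)
```

Because the scalar charges are rotationally invariant and the vectorial part enters only through the atomic positions, `mu_rotated` coincides with `rotate_z(mu, 0.7)` up to rounding, which is the covariance property discussed above. A prediction with the opposite overall sign gives a vanishing `phase_free_error`, which is why the sign of the permanent dipole moments has to be fixed afterwards for a reference geometry.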

A. Training Sets
The training sets and reference computations of all molecules are based on the multi-reference configuration interaction method accounting for single and double excitations (MR-CISD) out of an active space of 6 electrons in 4 orbitals with the double-zeta basis set aug-cc-pVDZ (augmented correlation-consistent polarized valence double zeta) as implemented in Columbus. 102 The molecules investigated in this study are the methylenimmonium cation (CH$_2$NH$_2^+$), ethylene (C$_2$H$_4$), aminomethylene (CHNH$_2$), methylenimine (CH$_2$NH), and C$_2$H$_5^+$.

a. Training Sets
The training set of the methylenimmonium cation, CH$_2$NH$_2^+$, forms the basis, as this training set already exists and can be taken from Ref. 56. It consists of 4,000 data points of three singlet states, which have been shown to cover the relevant configurational space visited after photoexcitation to the second excited singlet state, S$_2$.
In order to compute an ample training set for ethylene in the most efficient way, the molecular geometries of the available CH$_2$NH$_2^+$ training set are used and the nitrogen atom is replaced by a carbon atom. Of these, 3,969 MR-CISD/aug-cc-pVDZ calculations converged, from which the training set for ethylene is built. No optimizations of state minima or crossing points are carried out, as this would lead to considerably higher computational effort. The same reference method as for the training set of CH$_2$NH$_2^+$ is used in order to allow for merging of the two training sets. Hence, Rydberg states of ethylene are not described; they have also been neglected in some previous studies [103][104][105][106][107][108][109] and are considered to be less relevant in two-state photodynamics. 104,110 As CH$_2$NH$_2^+$ is considered to be a three-state problem with a bright second excited singlet state and C$_2$H$_4$ is referred to as a two-state problem with a bright first excited singlet state, S$_1$, 111-114 these two molecules and their distinct photodynamics are considered to be a perfect testbed for the purpose of this study.
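The element substitution used to generate the ethylene geometries can be illustrated with a short Python sketch. The geometry format (a list of element symbols with Cartesian coordinates), the function name, and the example coordinates are assumptions made for illustration; they do not reproduce the file format of the published training set.

```python
def replace_element(geometry, old="N", new="C"):
    """Return a copy of the geometry in which every atom `old` is relabelled `new`.

    `geometry` is assumed to be a list of (element symbol, (x, y, z)) tuples.
    """
    return [(new if symbol == old else symbol, xyz) for symbol, xyz in geometry]


# Hypothetical planar CH2NH2+ geometry (coordinates are illustrative only)
ch2nh2_cation = [
    ("C", (0.00, 0.00, 0.00)),
    ("N", (0.00, 0.00, 1.28)),
    ("H", (0.00, 0.94, -0.55)),
    ("H", (0.00, -0.94, -0.55)),
    ("H", (0.00, 0.86, 1.82)),
    ("H", (0.00, -0.86, 1.82)),
]

ethylene_start = replace_element(ch2nh2_cation)
```

Since carbon carries one proton less than nitrogen, the total charge of the subsequent quantum chemical calculation changes from +1 to 0 while the number of electrons stays the same.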

B. Absorption Spectra and Electrostatic Potentials
Statistically significant results for the computation of UV/Visible absorption spectra can be obtained by sampling many different molecular conformations. Here, the reference UV/Visible absorption spectra are obtained from excited-state calculations of 500 molecular conformations sampled from a Wigner distribution. 115,116 The same method as for the training set generation is used for every molecule. Except for the equilibrium structures of CH$_2$NH$_2^+$ and C$_2$H$_4$, these 500 data points are not included in the training set. Alternatively, sampling could also be carried out with Born-Oppenheimer MD simulations, but Wigner sampling is considered to be superior for small molecules 117 and is the standard procedure in SHARC. 118 The calculated vertical excitations from every sampled conformation in combination with the corresponding oscillator strengths and a Gaussian broadening yield the UV/Visible spectra. The widths of the Gaussians are specified in Table S1 in the SI. In addition to the molecules on which the ML models are trained, the UV/Visible spectra of CH$_2$NH, CHNH$_2$, and C$_2$H$_5^+$ are computed from 500, 500, and 100 Wigner-sampled conformations, respectively. The molecular structures of these molecules are optimized at the MP2/TZVP level of theory using ORCA. 119 The electrostatic potentials are plotted with Jmol 120 and correspond to the energetically lowest-lying conformation of each molecule. The Hirshfeld charges are obtained from MP2/TZVP calculations, while the Mulliken charges are available in Columbus and are hence obtained from the respective MR-CISD/aug-cc-pVDZ calculations.
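The construction of a spectrum from vertical excitations, oscillator strengths, and Gaussian broadening can be sketched as follows. This is a simplified illustration and not the routine used in SHARC: the function name, the area-normalised Gaussian line shape, and the example energies and oscillator strengths are assumptions, and the resulting intensity is in arbitrary units.

```python
import math


def gaussian_spectrum(energies, strengths, grid, fwhm):
    """Sum of area-normalised Gaussians, one per excitation, weighted by f_osc."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [
        sum(
            f * norm * math.exp(-0.5 * ((e - e0) / sigma) ** 2)
            for e0, f in zip(energies, strengths)
        )
        for e in grid
    ]


# Hypothetical vertical excitation energies (eV) and oscillator strengths,
# standing in for the values collected over all sampled conformations
energies = [8.9, 9.1, 9.4]
strengths = [0.20, 0.25, 0.15]
grid = [6.0 + 0.05 * n for n in range(121)]  # energy axis from 6 to 12 eV
spectrum = gaussian_spectrum(energies, strengths, grid, fwhm=0.5)
```

With only a few sampled conformations, a small FWHM produces isolated peaks that could be mistaken for vibrational structure, so a larger width is needed; with many conformations the width can be reduced.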

C. SchNarc
As a deep learning model, SchNarc is used, which combines the continuous-filter convolutional-layer neural network SchNet 33,90 for excited states and the MD program SHARC (Surface Hopping including ARbitrary Couplings). 78,118,121 As the SchNarc model, originally developed for photodynamics, is described in detail elsewhere, 64 we only briefly describe the technical details and timings of the computations.
As ML is computationally efficient compared to quantum chemistry, more conformations can be sampled and more trajectories can be initiated, while still being computationally less expensive. To exploit this, 20,000 initial conditions are sampled from a Wigner distribution, from which the UV/Visible absorption spectra are computed using the oscillator strengths obtained from ML energy gaps and transition dipole moments in combination with Gaussian broadening. The computation of the three potential energies at 500 and 20,000 initially sampled molecular conformations takes about 9 sec (39 sec) and 6 min (26 min), respectively, on a GeForce GTX 1080 Ti GPU (Xeon E5-2650 v3 CPU) using the largest trained ML model. In contrast, 500 computations of three PESs with MR-CISD/aug-cc-pVDZ take about 17 hours on an Intel Xeon E5-2650 v3 CPU.
SchNarc models are trained on 3,000 data points of CH 2 NH + 2 and C 2 H 4 separately using 200 additional data points for validation during training and the remaining points for the test set. 5 hidden layers and 256 features to describe the atoms within a cut-off region of 5 Å are used to generate the molecular descriptors. The model which is trained on both molecules takes 7,000 data points, 500 data points are used for validation and the rest is held back as a test set. The network architecture comprises 7 hidden layers and 512 features with a cut-off region of 6 Å. The training of the single-molecule SchNarc model takes about 11 hours and of the model trained on both molecules with the larger network architecture about 15 hours on the aforementioned GPU.
The trade-offs for each trained property along with the mean absolute error (MAE) obtained from all 3 states or all possible pairs of states on the test set of each model are given in Table I. The scatter plots of the models are shown in Fig. 1. The largest errors can be estimated from the scatter plots. Especially in critical regions of the PESs, quantum chemical calculations are difficult to converge and can show artifacts and energy jumps in PESs, 56,63 hence the scatter plots should be interpreted with care. The predicted dipole moments obtained with SchNarc are about a factor of 5 more accurate than our previously reported kernel ridge regression models 63 and multilayer feed-forward neural networks, 56,63 which fit dipole moments in a direct way: as single values with kernel ridge regression and as single elements put together in one vector with neural networks.

Table I. Trade-offs used to train energies, forces and dipole moments along with the mean absolute error (MAE) and root mean squared error (RMSE) on the test set for each property. Permanent and transition dipole moments are shown together as they are processed together with SchNarc. The mean over all states and pairs of states is shown. The respective scatter plots are given in Fig. 1.

A. UV/Visible Absorption Spectra
The computed UV/Visible absorption spectra are shown in Fig. 2 with the reference method on the left and the ML predictions on the right. Panels (a) and (b) illustrate the spectra of CH$_2$NH$_2^+$ and C$_2$H$_4$, which are both included in the training set. The filled spectrum with solid lines is obtained from SchNarc models trained solely on CH$_2$NH$_2^+$ or C$_2$H$_4$ (i.e., a single-molecule model) and the dotted lines are obtained from the SchNarc model trained on the combined training set, which includes both molecules (i.e., a multi-molecule model). As can be seen, both models can be used to accurately predict the UV/Visible absorption spectra. Remarkably, the S$_2$ state is correctly predicted to be bright and the S$_1$ state to be dark for CH$_2$NH$_2^+$ in panel (a), whereas the inverse relation is predicted correctly for C$_2$H$_4$ in panel (b). The results indicate that although the transition dipole moments are completely different for the different electronic states in both molecules, SchNarc can accurately capture the absorption behaviour of both molecules. Moreover, the model trained on both molecules is even slightly more accurate than the ML model trained solely on C$_2$H$_4$ for the prediction of the UV/Visible absorption spectrum in panel (b). We did not expect such an outcome because force fields usually become less accurate for specific examples with increasing generality.
Owing to the atom-wise molecular descriptor, which enables a description of different molecules of different sizes, the transferability of SchNarc for the prediction of a manifold of PESs and transition dipole moments throughout chemical compound space can be evaluated. To this end, the UV/Visible spectra of CHNH$_2$ and CH$_2$NH are additionally computed, which are shown in panels (c) and (d), respectively. In order to make sure that these molecules are not included in the training set, an analysis of the maximum bond distances in the training set is carried out. According to unrelaxed dissociation scans of CH$_2$NH$_2^+$, the hydrogen atoms can be considered as dissociated at a bond length of about 2.5 Å. No geometry in the training set has an N-H bond length larger than 2 Å, and eight geometries have a C-H bond length larger than 2 Å, of which only one is larger than 2.5 Å. The same is true for the training set of C$_2$H$_4$ with regard to the C-H bond length. Thus, it can be safely said that the assessment of the performance of SchNarc is not biased by an unusually large number of dissociated configurations in the training set.
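A bond-length screening of this kind can be sketched in Python as follows. The data layout (one list of Cartesian coordinates in Å per geometry), the atom indices, and the toy geometries are hypothetical and serve only to illustrate the counting; they are not taken from the training set.

```python
import math


def longest_bond(geometry, pairs):
    """Largest distance among the given atom index pairs in one geometry."""
    return max(math.dist(geometry[i], geometry[j]) for i, j in pairs)


def count_stretched(geometries, pairs, threshold):
    """Number of geometries with at least one listed bond above the threshold."""
    return sum(1 for g in geometries if longest_bond(g, pairs) > threshold)


# Hypothetical toy data: atom 0 is the heavy atom, atoms 1 and 2 are hydrogens
geometries = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.1, 0.0)],
    [(0.0, 0.0, 0.0), (2.2, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 2.7, 0.0)],
]
bonds = [(0, 1), (0, 2)]

n_over_2 = count_stretched(geometries, bonds, 2.0)
n_over_2_5 = count_stretched(geometries, bonds, 2.5)
```

Applying such a count separately to the N-H and C-H index pairs gives the numbers of stretched geometries per bond type, as quoted in the text.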
As is clearly visible in panels (c) and (d), the energies of the S$_1$ and S$_2$ states are lower compared to the energies of CH$_2$NH$_2^+$ and C$_2$H$_4$ in panels (a) and (b). This trend is predicted correctly with SchNarc for both CH$_2$NH and CHNH$_2$. The bright and dark states are also predicted qualitatively correctly. In panel (d), the S$_1$ state is much darker than the S$_2$ state, whereas the S$_1$ state is brighter in panel (c). Although the spectra of the SchNarc models of the unknown molecules are broadened compared to the quantum chemical spectra, they can be used to obtain a qualitatively correct picture of the UV/Visible light absorption at almost no additional cost. CH$_2$NH and CHNH$_2$ both contain one atom less than the molecules described in the training set. Thus, one might assume that the ML model trained solely on CH$_2$NH$_2^+$ can also be used to predict qualitatively correct UV/Visible absorption spectra, as only atoms have to be removed. However, evaluation of the single-molecule models shows that this model cannot be used to capture the correct absorption behaviour and energy range of the two molecules not included in the training set. The performance of the ML model trained solely on CH$_2$NH$_2^+$ is even comparable to that of the ML model trained solely on C$_2$H$_4$, which would be expected to perform worse.
As already indicated, the molecular structures of the tested molecules, CH$_2$NH and CHNH$_2$, are similar to those of CH$_2$NH$_2^+$ and C$_2$H$_4$. In order to assess the performance of SchNarc for the computation of the UV/Visible absorption spectra of molecules with a different structure, the isoelectronic molecule C$_2$H$_5^+$, which contains one atom more, is additionally chosen. Fig. 3 shows the reference spectrum on the left and the ML-predicted spectrum on the right. The trained SchNarc models cannot be used to predict the UV/Visible absorption spectrum of C$_2$H$_5^+$. While the S$_1$ state is predicted to be dark and the S$_2$ state to be bright, which is in accordance with the reference spectrum, the energy range is off. Possible reasons are the larger system size, the different shape of the molecule, or both. As three hydrogen atoms are bound to a carbon atom in C$_2$H$_5^+$, the structure of this molecule is completely different from the structures in the training set.
The results shown here lead us to conclude that isoelectronic molecules with similar molecular structure can be predicted and that our ML models are to a certain extent transferable throughout chemical compound space also for excited-state PESs and properties thereof. It would be interesting to assess the transferability capacity of ML for the excited states when treating a larger number of molecules. Unfortunately, the high expenses and complexity of multi-reference quantum

B. Electrostatic potentials
The transition dipole moments and energies provide a measure of the quality of the molecular properties that are constructed from atomic contributions with SchNarc. As mentioned above, SchNarc also provides direct access to latent ground-state and excited-state partial charges based solely on the dipole moment data of the underlying electronic structure method. In order to assess, whether the ML model provides meaningful partial charges, the electrostatic potentials obtained from SchNarc are compared to those obtained from Mulliken and Hirshfeld charges. Note that the latter are rarely implemented in quantum chemistry programs for excited states. The results are thus shown only for the electronic ground state in Fig. 4(a).
The first and the second column show the molecules that are included in the training set. Red colors indicate negative charges, while blue colors indicate positive charges. The electrostatic potentials in the first row are obtained from Mulliken charges. As can be seen, the Mulliken scheme places negative charges at the hydrogen atoms and positive charges at the carbon atoms, which is in contrast to the Hirshfeld scheme given in the second row and also in contrast to chemical intuition. The electrostatic potentials obtained from the ML model trained on both molecules are shown in the third row. A similar charge distribution is obtained for ML models trained on a single molecule (see Fig. S2 in the SI). The partial charges obtained from SchNarc are in good agreement with the Hirshfeld charges. Similar agreement, at least qualitatively, can be obtained for CH$_2$NH$_2^+$. As the charge distribution of the electronic ground state is in qualitatively good agreement with the Hirshfeld partitioning scheme, the redistribution for the excited states can be analyzed. In the case of C$_2$H$_4$ in the first column of panel (b), the negative and positive charges do not redistribute considerably in the S$_1$ state, but the distribution is inverted for the S$_2$ state. The positive charge is then located between the carbon atoms. For CH$_2$NH$_2^+$, the positive charge is located at the far end of the nitrogen side of the molecule for the ground state. In the S$_1$ state, the positive charge is still located at the nitrogen but closer to the center of the molecule. In the S$_2$ state, the distribution is similar to the ground state. These distributions give rise to dipole moments that agree with the reference calculations to the given precision (QC/ML: 1.5 a.u. (S$_0$), 1.2 a.u. (S$_1$), 1.5 a.u. (S$_2$); the vectors all point from C towards N).
In addition to the molecules included in the training set, the transferability of SchNarc to predict electrostatic potentials is tested too. Although the ML model has never been trained on CHNH 2 or CH 2 NH, the ground-state electrostatic potentials agree arguably better with the Hirshfeld distributions than the Mulliken ones. This is especially true for CHNH 2 . Comparing the S 0 distribution with the one from S 2 , an inversion of the charge locations is visible, which is also present in C 2 H 4 but not in CH 2 NH + 2 . The last column illustrates the electrostatic potentials of CH 2 NH, where the negative charge is located at the nitrogen atom according to the Hirshfeld partitioning but rather at the adjacent hydrogen according to ML.
All these results indicate that the charge distributions obtained with SchNarc can be used to obtain electrostatic potentials of molecules included in the training set and that transferability is also possible for latent partial charges, at least for isoelectronic molecules.
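How a set of atomic point charges gives rise to an electrostatic potential can be illustrated with a plain Coulomb sum. The sketch below is an assumption-laden illustration and does not reproduce the surface mapping performed by Jmol: the function name and the two-charge example are hypothetical, and atomic units are assumed throughout (charges in e, positions in bohr).

```python
import math


def electrostatic_potential(point, charges, positions):
    """Coulomb potential (a.u.) at `point` generated by a set of point charges."""
    return sum(q / math.dist(point, pos) for q, pos in zip(charges, positions))


# Hypothetical example: a positive and a negative unit charge on the z axis
charges = [1.0, -1.0]
positions = [(0.0, 0.0, -1.0), (0.0, 0.0, 1.0)]

v_perpendicular = electrostatic_potential((1.0, 0.0, 0.0), charges, positions)
v_on_axis = electrostatic_potential((0.0, 0.0, 3.0), charges, positions)
```

Evaluating such a sum on a grid of points around the molecule, once with latent ML charges and once with Hirshfeld or Mulliken charges, yields potentials that can be compared in the manner described above. The potential diverges at the position of a charge, so evaluation points have to be kept away from the nuclei.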

V. SUMMARY AND OUTLOOK
In this work, the SchNarc deep learning approach for photodynamics is extended to describe permanent and transition dipole moments in a rotationally covariant manner and for an arbitrary number of electronic states. The dipole moment vectors can be trained in one ML model in addition to the ground-state energies and forces as well as a manifold of excited-state energies and forces. SchNarc can be used to accurately predict UV/Visible absorption spectra, and the latent partial charges can be used to assess the charge distribution of molecules via electrostatic potentials. As SchNarc is trained not only on the ground state, but also on the excited states, the charge distribution for the excited states can be assessed. As the partial charges for the ground state are in qualitatively good agreement with the Hirshfeld charges and the excited-state molecular dipole moments also agree between ML and the reference, we consider the charges to be equally accurate for the excited states. The latent partial charges are based on highly accurate quantum chemistry and provide direct access to the charge distribution after light excitation.
Due to the atom-wise tailored descriptor, many different molecules can be described in one model, which contain different numbers of atoms. At least when isoelectronic, similarly structured molecules are treated, transferability is confirmed for UV/Visible absorption spectra and partial charges. These properties can be computed with our ML approach at least qualitatively at almost no additional costs. Remarkably, the ML model can treat charged species on the same footing as neutral species.
It would be especially interesting to assess the improvement one can achieve by including many more molecules than just two isoelectronic ones. At the current stage of research, the high complexity and costs of accurate multi-reference quantum chemical methods hamper an ample assessment of the transferability in the excited states. Nevertheless, the trend clearly shows that ML models trained on more molecules are superior to ML models trained on single molecules, even if these molecules exhibit a completely different photochemistry and overall charge.

DATA AVAILABILITY
The training set for CH$_2$NH$_2^+$ is published with Ref. 56 and the training set of C$_2$H$_4$ will be made available as supporting information in the same format as the previous training set, i.e., the one used by SHARC. 118 The SchNarc model is updated and freely available at https://github.com/schnarc/schnarc.

Supporting Information

S1. UV/VISIBLE ABSORPTION SPECTRA
UV/Visible absorption spectra are computed from energy differences and transition dipole moments applying Gaussian broadening. Depending on the number of sampled conformations, a full width at half maximum (FWHM) for the Gaussian convolution of 0.5-1.0 eV is used for the quantum chemical spectra and of 0.3-0.5 eV for the ML-predicted spectra in Fig. 2 in the main text and Fig. S1. The reason is to avoid an unphysical fine structure of the spectra, which would resemble vibrational quantum levels although only electronic degrees of freedom are quantized in the employed approach. The width of the Gaussian used is specified in Table S1. Notably, sampling even more molecular conformations allows the FWHM to be reduced, which has been shown recently to be possible with ML. 66

Table S1. FWHM used for the spectra computed with the quantum chemistry reference method MR-CISD/aug-cc-pVDZ (abbreviated as QC) using 500 molecular configurations for C$_2$H$_4$, CH$_2$NH$_2^+$, CHNH$_2$, and CH$_2$NH, and 100 molecular configurations for C$_2$H$_5^+$. ML-1, ML-2, and ML-3 denote the ML models trained on C$_2$H$_4$, CH$_2$NH$_2^+$, and both molecules, respectively. UV spectra are computed from 20,000 molecular configurations of each molecule.
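The spectrum construction described in Sec. S1 can be illustrated with a minimal sketch. It assumes that each transition contributes an area-normalized Gaussian weighted by the oscillator strength f = (2/3) ΔE |μ|² in atomic units; the function names, the example states, and the energy grid are hypothetical and are not taken from the SchNarc code.

```python
import math

HARTREE_TO_EV = 27.211386  # conversion factor, Hartree -> eV


def oscillator_strength(delta_e_hartree, tdm_au):
    """Oscillator strength f = 2/3 * dE * |mu|^2 (all quantities in atomic units)."""
    return 2.0 / 3.0 * delta_e_hartree * sum(c * c for c in tdm_au)


def gaussian_spectrum(energies_ev, osc_strengths, grid_ev, fwhm_ev):
    """Sum of area-normalized Gaussians, one per transition, evaluated on grid_ev."""
    # Convert the FWHM into the standard deviation of the Gaussian.
    sigma = fwhm_ev / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    spectrum = []
    for e in grid_ev:
        total = 0.0
        for e0, f in zip(energies_ev, osc_strengths):
            total += f * norm * math.exp(-0.5 * ((e - e0) / sigma) ** 2)
        spectrum.append(total)
    return spectrum


# Hypothetical transitions: (excitation energy in Hartree, transition dipole in a.u.)
states = [(0.30, [0.0, 0.0, 0.1]), (0.35, [1.0, 0.0, 0.0])]
energies = [de * HARTREE_TO_EV for de, _ in states]
strengths = [oscillator_strength(de, mu) for de, mu in states]
grid = [6.0 + 0.01 * k for k in range(601)]  # 6 eV to 12 eV
spectrum = gaussian_spectrum(energies, strengths, grid, fwhm_ev=0.3)
```

For a spectrum sampled over many molecular configurations, the transitions of all configurations would be accumulated in the same way and the result divided by the number of configurations.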
The performance of the ML models trained on only one molecule, i.e., C$_2$H$_4$ (left plots) and CH$_2$NH$_2^+$ (right plots) separately, for the computation of UV/Visible spectra of the molecules CHNH$_2$ and CH$_2$NH is compared in Fig. S1(a) and Fig. S1(b), respectively. As can be seen, the ML model trained on CH$_2$NH$_2^+$ predicts for both CHNH$_2$ and CH$_2$NH the first excited singlet state to be darker than the second excited singlet state, which is also the case for CH$_2$NH$_2^+$, whose behaviour the ML model has learned. The spectrum of CHNH$_2$ is not comparable at all to the reference spectrum shown in the main text in Fig. 2(b). For CH$_2$NH, the resulting curves agree qualitatively with the reference spectrum, but the energy gap between the two absorption peaks is larger. The two peaks slightly overlap in the reference spectrum.
In contrast, the ML model trained solely on C$_2$H$_4$ predicts the first excited singlet state to be brighter for CHNH$_2$ (panel (a) left plot), but the opposite behaviour for CHNH$_2$ (panel (b) left plot), which is comparable to the reference spectrum. However, the energy range is not comparable to the reference.
The results here show that an ML model trained solely on one molecular species is not transferable, even though the molecule to be predicted contains a subset of the same atoms (arranged in the same way). The ML models trained on both molecules discussed in the main text, however, show much better transferability, although the two molecules contained in the training set exhibit a different photochemistry. Our assumption that the ML model gets worse with each additional molecule in the training set is refuted.

S2. ELECTROSTATIC POTENTIALS
For the training of dipole moment vectors, the simplest phase-less loss function is used, which is computationally efficient compared to the more accurate loss functions reported in Ref. 64, which are necessary e.g. for photodynamics simulations based on couplings. Here, the minimum function for fitting the permanent dipole moments and transition dipole moments suffices, but as a consequence, the trained properties are only defined up to an arbitrary sign. While this does not influence the transition dipole moments, the signs of the permanent dipole moment vectors for each electronic state have to be adjusted with respect to a reference geometry, e.g., the ground-state equilibrium geometry.

Figure S1. UV spectra of (a) CH$_2$NH and (b) CHNH$_2$ predicted with the ML models trained solely on C$_2$H$_4$ (left plots) and CH$_2$NH$_2^+$ (right plots). The minimum and maximum energies were selected according to Fig. 2 and were extended where necessary in order to enable better comparison between the spectra. A full width at half maximum of 0.3 eV is used for all spectra, except for panel (a) using the C$_2$H$_4$-ML model, where the width is set to 0.75 eV.
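The sign ambiguity introduced by a minimum-based phase-less loss can be illustrated with a minimal sketch. It assumes that the error of a single dipole vector is the smaller of the squared deviations obtained with the predicted vector and with its sign-flipped counterpart; the exact functional form used in SchNarc and Ref. 64 may differ, and both function names are hypothetical.

```python
def phaseless_sq_error(mu_ref, mu_pred):
    """Squared error that is invariant to a global sign flip of the prediction."""
    err_plus = sum((r - p) ** 2 for r, p in zip(mu_ref, mu_pred))
    err_minus = sum((r + p) ** 2 for r, p in zip(mu_ref, mu_pred))
    return min(err_plus, err_minus)


def align_sign(mu_pred, mu_ref):
    """Flip the predicted vector if it is antiparallel to the reference vector.

    Intended to be called once per electronic state at a reference geometry,
    e.g., the ground-state equilibrium geometry.
    """
    dot = sum(p * r for p, r in zip(mu_pred, mu_ref))
    return [-p for p in mu_pred] if dot < 0.0 else list(mu_pred)
```

Because both signs give the same loss, a model trained this way may return either orientation of a permanent dipole moment vector; the sign found by `align_sign` at the reference geometry would then be applied to all other geometries for that state.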
If e.g. reaction scans are executed subsequently, the assigned sign has to be considered in order to obtain the correct direction of the permanent dipole moment vectors along the reaction path. The signs only have to be adjusted for one molecular geometry as the ML outputs are smooth functions by definition. 64 The manual assignment can be circumvented by applying the more accurate phase-less loss function (equations 3 and 4 in Ref. 64). Nevertheless, the comparison of the signs for one molecular geometry is rather inexpensive compared to a much longer training procedure.

As the number of electrons is not encoded in the descriptor and the overall charge of the molecule is not known, the atomic partial charges have to be scaled in order to reproduce the correct molecular charge when the latent partial charges are used, for example, for electrostatic potentials. The scaled charge of atom $a$ for a given electronic state $i$, $\tilde{q}_{i,a}$, reads

$$\tilde{q}_{i,a} = q_{i,a} - \frac{1}{N_a}\left(\sum_{b=1}^{N_a} q_{i,b} - Q\right),$$

with $N_a$ being the number of atoms in a molecule and $Q$ the charge of the molecule.
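A minimal sketch of such a charge adjustment is given below. It assumes that the difference between the sum of the latent charges and the molecular charge is distributed uniformly over all atoms; the function name and the example charges are hypothetical.

```python
def correct_charges(latent_charges, total_charge):
    """Shift all atomic charges equally so that they sum to total_charge."""
    n_atoms = len(latent_charges)
    excess = sum(latent_charges) - total_charge
    return [q - excess / n_atoms for q in latent_charges]


# Hypothetical latent charges of a six-atom cation with overall charge +1
latent = [0.10, -0.35, 0.20, 0.22, 0.31, 0.30]
scaled = correct_charges(latent, total_charge=1.0)
```

The uniform shift leaves all differences between atomic charges unchanged, so the relative charge pattern learned by the model is preserved.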
Electrostatic potentials of C$_2$H$_4$ and CH$_2$NH$_2^+$ computed with the single-molecule ML models fitted on C$_2$H$_4$ and CH$_2$NH$_2^+$, respectively, are given in Fig. S2. Comparison to the electrostatic potentials obtained with Mulliken and Hirshfeld charges in Fig. 4 of the main text demonstrates that the potential of C$_2$H$_4$ (panel (a)) is similar to the one from Hirshfeld charges and thus also in excellent agreement with the model trained on both molecules. The electrostatic potential of CH$_2$NH$_2^+$ computed with the CH$_2$NH$_2^+$ model in panel (b) is also comparable to the one from the ML model trained on both molecules.

Figure S2. Electrostatic potentials predicted for (a) C$_2$H$_4$ and (b) CH$_2$NH$_2^+$ using the latent charges of the respective single-molecule ML model. Reddish colors indicate regions of negative charge, while blue refers to positive charges.
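How an electrostatic potential follows from atomic partial charges can be sketched as a plain Coulomb sum over point charges. This is an assumption made for illustration only (atomic units, bare point charges located at the nuclei); the procedure and software actually used to generate the plotted potentials are not specified here, and the function name is hypothetical.

```python
import math


def electrostatic_potential(point, positions, charges):
    """Potential at `point` from point charges at `positions` (atomic units)."""
    potential = 0.0
    for r_a, q_a in zip(positions, charges):
        distance = math.dist(point, r_a)  # Euclidean distance |r - r_a|
        potential += q_a / distance
    return potential


# Hypothetical two-atom example: opposite charges placed along the x axis
positions = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
charges = [-0.2, 0.2]
v_mid = electrostatic_potential((0.0, 1.0, 0.0), positions, charges)
```

Evaluating this sum on points of a molecular surface and mapping the values to a color scale would give a picture analogous to the reddish (negative) and blue (positive) regions described in the figure caption.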