How proteins modify water dynamics

Much of biology happens at the protein-water interface, so all dynamical processes in this region are of fundamental importance. Local structural fluctuations in the hydration layer can be probed by 17O magnetic relaxation dispersion (MRD), which, at high frequencies, measures the integral of a biaxial rotational time correlation function (TCF)—the integral rotational correlation time. Numerous 17O MRD studies have demonstrated that this correlation time, when averaged over the first hydration shell, is longer than in bulk water by a factor 3–5. This rotational perturbation factor (RPF) has been corroborated by molecular dynamics simulations, which can also reveal the underlying molecular mechanisms. Here, we address several outstanding problems in this area by analyzing an extensive set of molecular dynamics data, including four globular proteins and three water models. The vexed issue of polarity versus topography as the primary determinant of hydration water dynamics is resolved by establishing a protein-invariant exponential dependence of the RPF on a simple confinement index. We conclude that the previously observed correlation of the RPF with surface polarity is a secondary effect of the correlation between polarity and confinement. Water rotation interpolates between a perturbed but bulk-like collective mechanism at low confinement and an exchange-mediated orientational randomization (EMOR) mechanism at high confinement. The EMOR process, which accounts for about half of the RPF, was not recognized in previous simulation studies, where only the early part of the TCF was examined. Based on the analysis of the experimentally relevant TCF over its full time course, we compare simulated and measured RPFs, finding a 30% discrepancy attributable to force field imperfections. We also compute the full 17O MRD profile, including the low-frequency dispersion produced by buried water molecules. Computing a local RPF for each hydration shell, we find that the perturbation decays exponentially with a decay “length” of 0.3 shells and that the second and higher shells account for a mere 3% of the total perturbation measured by 17O MRD. The only long-range effect is a weak water alignment in the electric field produced by an electroneutral protein (not screened by counterions), but this effect is negligibly small for 17O MRD. By contrast, we find that the 17O TCF is significantly more sensitive to the important shortrange perturbations than the other two TCFs examined here. © 2018 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). https://doi.org/10.1063/1.5026861


I. INTRODUCTION
The dynamics of the protein-water interface is of fundamental importance for many biological processes. Accordingly, a wide array of experimental techniques have been deployed in the challenging task to characterize the mobility of water molecules at or near the surface of a protein in aqueous solution. [1][2][3][4][5][6][7][8][9][10] Arguably, the most reliable information has come from 17 O magnetic relaxation dispersion (MRD) experiments, which are unique in selectively probing the motions of single water molecules. [11][12][13][14] Once the MHz frequency dependence had been causally linked to buried water molecules, 15 it became clear that 17 O MRD provides, in addition to information about the buried water molecules, a model-independent global measure of water dynamics in the external hydration shell: the shell-averaged integral rotational correlation time (IRCT) τ R , often expressed as the rotational perturbation factor (RPF), ξ R = τ R /τ bulk R . From 17 O MRD studies of a large number of native proteins, it has been found that ξ R ≈ 3-5 at room temperature. [16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32][33] During the past three decades, experimental studies of protein hydration dynamics have been supplemented by molecular dynamics (MD) simulations.  While the early simulations were often hampered by methodological issues, the present state-of-the-art, with more accurate modeling of interactions and greatly improved statistical sampling, can reliably deliver important insights about protein hydration dynamics. MD simulations have achieved at least semi-quantitative agreement with the RPF measured by 17 O MRD, 51,59,61-63 boosting confidence in the ability of simulations to unravel details well beyond experimental reach.
Among the more significant contributions of simulation studies in this field is the demonstration of a strong dynamical heterogeneity within the hydration shell, where both site residence times 42,[46][47][48][49][50][64][65][66][67][68] and rotational correlation times 42,51,59,61,62 of water molecules in the hydration shell span at least three orders of magnitude. This dynamical heterogeneity was not unexpected; after all, residence times were (experimentally) known to range from picoseconds for water coordinating small molecules, 69 polymers, and unstructured surfaces 5 to microseconds for water molecules deeply buried inside the protein. 28,32,33,70 What has remained less clear is the precise form of the distribution of residence or correlation times and the nature of the molecular factors that govern these distributions.
The dynamical heterogeneity of the hydration shell must clearly reflect the chemical diversity and structural complexity of the protein-water interface, 14,18,44,[71][72][73][74] but attempts to quantify this dependence by simulations, and to elucidate the underlying mechanisms, have produced conflicting results. Some studies have concluded that the hydration dynamics depends strongly on the secondary structure, [75][76][77] but other studies have found no such correlation. 49,50,64,78 At the level of hydration site polarity, several studies have found that the dynamical perturbation increases in the order nonpolar < neutral polar < charged, 41,61,66,79 but some have reported slower dynamics at nonpolar sites than at charged 64 or polar 80 sites, and still other studies have found no clearcut trend at all. 40,43,49 Some authors have claimed that slower water dynamics only occurs in charged sites, 81 especially positively charged sites, 76 while others have found that the negatively charged sites are more perturbing. 59,78 This lack of consensus is partly attributable to differences in how residence times and rotational correlation times are defined and computed and in how water molecules are assigned to sites of different polarities. But there is another more fundamental problem. A putative correlation between hydration dynamics and site polarity is just that-a correlation. Even if the detailed topography of the protein surface were more important than its polarity, water dynamics would appear to depend on polarity if topography is correlated with polarity to some extent. In other words, the polarity dependence might depend on the structural context. One observation pointing in this direction is the experimental fact that nonpolar groups (such as methyl and methylene groups) slow down water rotation more than polar groups (such as amide groups) for small organic solutes, 69 whereas the reverse is true for proteins according to most simulation studies. 41,61,66,79 The importance of surface topography, with water being more perturbed in concave sites, pockets, or clefts than in highly exposed, convex sites, has been noted in several simulation studies, 47,49,59,61,62,67,78,79,82 but a quantitative correlation has not been established, nor has a molecular mechanism been proposed that is physically transparent and directly reflected in the time correlation function (TCF). The first objective of the present study is to provide the missing pieces of this puzzle.
We focus here on the rotational motion of single water molecules, quantified by a rotational time correlation function (TCF). This dynamical variable has several distinct advantages. First, it is arguably the most local measure of structural relaxation in water (involving breaking and reforming of H-bonds). In particular, it is more local than the (intermittent) H-bond TCF, which involves the slow diffusive separation of the two H-bond partners. 83 As we shall show, the rotational TCF is spatially local even if the rotational correlation time is very long. Second, unlike the residence time and the Hbond lifetime, the (integral) rotational correlation time requires no parameters for its definition. Third, if properly defined and computed, the rotational correlation time can be directly compared to experimental data, e.g., from 17 O MRD.
We compute and analyze three different rotational TCFs: two uniaxial TCFs describing the rotation of a moleculefixed vector and one biaxial TCF describing the rotation of a molecule-fixed tensor. The biaxial TCF has only rarely been computed before, 63,84 but this is the TCF that must be used to compare with 17 O MRD results. These TCFs are highly nonexponential, reflecting dynamics on multiple time scales, from sub-picosecond librations to exchange from confined sites on a 0.1-1 ns time scale. 17 O MRD probes dynamics in the frequency domain, so it is the Fourier transform of the TCF-the spectral density function (SDF)-that should be computed if one wants to compare with experiment. However, ignoring a few buried water molecules, the measured quantity is essentially the SDF at zero frequency, which is the integral of the TCF, referred to as the integral rotational correlation time (IRCT) τ R . In most previous simulation studies, a rotational correlation time has been extracted by fitting the TCF to an exponential (or a stretched exponential) at short delay times (typically, at τ < 10 ps). Whether this correlation time is used directly to calculate the RPF or first used to estimate the IRCT, the failure to account for the long-time tail of the TCF has two unfortunate consequences: the RPF is underestimated (typically, by a factor of 2) and the importance of confinement for identifying and understanding the mechanism underlying the heterogeneous hydration dynamics is largely missed.
Here, we compute the three TCFs up to 1 ns for the hydration shell and up to 10 ns for buried water molecules. As our second objective, we then carry out a quantitative comparison with the RPF measured by 17 O MRD for the four proteins simulated here. Such comparisons have been made before, 51,59,[61][62][63] but they have suffered from the aforementioned neglect of the important long-time tail of the TCF as well as from the failure to correct for certain experimental factors. Like previous studies, 51,59,[61][62][63] we obtain semi-quantitative agreement between simulation and experiment, but our more rigorous and accurate comparison allows us to establish a ∼30% discrepancy that we attribute to imperfections in the protein force field. We also show that the biaxial TCF probed by 17 O MRD is considerably more sensitive to dynamical perturbations than the uniaxial TCFs, in contrast to the conclusion of a recent analysis. 63 Our third objective concerns another contentious issue in the field of protein hydration dynamics: the spatial range of the protein-induced perturbation of water dynamics. For example, far-infrared absorption measurements have been interpreted as implying a "long-ranged" perturbation, 85,86 but the validity of this interpretation has been questioned. 87 17 O MRD reports on the total perturbation, so no assumption need to be made about its range. 17 O MRD measurements as a function of protein concentration rule out a long-ranged perturbation, 11 but a quantitative determination of the range is prevented for the same reason as in the far-IR case: it is difficult to exclude protein self-association and to predict its effect on water trapped between associated protein molecules. 87 For simpler model systems, such as reverse micelles, 17 O MRD measurements clearly demonstrate that only the first hydration shell is significantly perturbed. 5,88 For proteins, the most reliable way to establish the perturbation range is arguably by simulations. Several studies with this objective have been reported by Steinhauser's group, but with mixed results. 42,56,63,78 Here, we use hydration shells of monolayer thickness as distance metric to show that, for all four examined proteins, the excess perturbation decays exponentially with a decay "length" of 0.3 shells. Furthermore, we show that the second and higher shells account for just 3% of the total perturbation measured by 17 The outline of this paper is as follows. In Sec. II, we describe the simulation methodology and the analysis of spatially resolved TCFs. In Sec. III, we analyze water rotation in the first hydration shell and its polarity-based and confinementbased subsets. This analysis leads us to propose two different water rotation mechanisms, one of which has been largely overlooked in previous work. In Sec. IV, we examine water rotation in the second and higher shells, thereby establishing the range of the protein-induced dynamical perturbation. Here, we also analyze the effect of water alignment by long-ranged electric fields. In Sec. V, we compute the 17 O MRD profile, including the contribution from buried water molecules. We then present a rigorous analysis of the experimentally accessible information on water rotation in the external hydration layers, leading to the first fully quantitative confrontation of simulation results with 17 O MRD data. Finally, in Sec. VI, we discuss the most important findings of this work, comparing them with previous simulation results, and attempting to resolve the most glaring discrepancies. A substantial amount of additional results are provided in the four appendices of the supplementary material. Because the results from the four proteins and three water models examined here are very similar, most figures include data from only one of the proteins (ubiquitin); the complete results from all seven simulated protein solutions can be found in the supplementary material.

A. MD simulations
Molecular dynamics simulations were performed on four globular proteins in dilute aqueous solution at 300.0 K and 1 bar (Table I): a triple mutant of the immunoglobulin-binding domain B1 of protein G (GB1), 89 bovine pancreatic trypsin inhibitor (BPTI), 90 mammalian ubiquitin without the three C-terminal residues (UBQ), 91 and isoform 2-14 of the β-helical antifreeze protein from Tenebrio molitor (AFP). 92 Ionizable groups were modeled in their predominant state at neutral pH. UBQ and AFP are then electroneutral, whereas GB1 and BPTI carry a net charge of −6 and +6, respectively. To avoid counterions in the solvent, GB1 and BPTI were neutralized by adding a small charge to each protein atom. As a control, we also performed a GB1 simulation (GB1Z) without such charge rescaling, including six sodium ions 93 in the solvent. MD simulations were performed with the AMBER 16 package 94 using a 2 fs integration time step and periodic boundary conditions. The proteins were modeled with the Amber ff14SB force field 95 and were solvated in a truncated octahedral cell with the TIP4P-Ew water model. 96 As a control, we also performed simulations of GB1 with the three-point water models SPCE 97 and TIP3P; 98 these systems are referred to as GB1SPCE and GB1TIP3P, respectively. A cutoff of 16 Å was used for short-range interactions, and Coulomb interactions were treated with the particle mesh Ewald method. After ∼25 ns equilibration, a ∼100 ns NpT trajectory was propagated for each system (Table I), with coordinates saved every 100 fs. The C α root-mean-square deviation (RMSD) indicated no long-term drift or major conformational transition during the analyzed production runs. Further system and simulation details have been reported. 99

Hydration shells and other water subpopulations
To analyze water dynamics with spatial resolution, we define hydration shells and subsets thereof. 99 Water molecules "in contact" with the protein are identified by the condition that the water oxygen is within 5.0 Å of a carbon atom. These water molecules include the (external) first hydration shell as well as water molecules located in cavities or deep surface pockets. The few (3.3-6.4) buried water molecules are only considered in our comparison with experimental MRD data (Sec. V). This simple definition has been shown to produce a monolayered first hydration shell that covers 98.5% of the protein surface. 99 To examine the spatial range of the protein-induced perturbation of water dynamics, we need a measure of the distance from the protein surface. A natural distance metric is provided by the hydration shell index: the number of water monolayers that separates the given water molecule from the protein surface. A water molecule belongs to shell n ≥ 2 if its oxygen atom is within 4.0 Å of at least one water oxygen in shell n − 1. These shells have a high degree of coverage and a uniform thickness of ∼2.8 Å. 99 To investigate the variation of water dynamics over the protein surface, we decompose the first hydration shell in two different ways. First, we use element-specific cutoff distances (Table A1 of the supplementary material) to split the first shell into 57.1% nonpolar and 42.9% polar hydration sites (mean values for the four proteins). The nonpolar subset contains all water oxygens that are within 5 Å of a protein non-H atom, but more than 3.5, 3.3, or 4.5 Å from any protein N, O, or S atom, respectively (Table A1 of the supplementary  material). The polar subset contains the remaining first-shell water molecules (Table A1 of the supplementary material). The polar subset is further decomposed into neutral, positive, and negative hydration sites.
In addition to these polarity-based subsets, we use the carbon coordination number z C , that is, the number of C atoms within 5.0 Å of the water-O atom, to decompose the first shell into 15 subsets with z C = 1-15. The few water molecules with z C ≥ 16 are classified as buried (Table A1 of the supplementary  material), while, by definition, water molecules with z C = 0 do not belong to the first hydration shell. Because z C reflects the degree of confinement or seclusion of the hydration site, we refer to it as the confinement index.

Rotational time correlation functions
We investigate single-molecule water rotation via three different time correlation functions (TCFs). Uniaxial TCFs of rank k = 1 and 2 for the O-H bonds of the water molecule are defined as where P k (z) is the rank-k Legendre polynomial and θ(τ) is the angle by which the O-H bond has rotated in the time interval τ. Whereas 2 H NMR relaxation probes the uniaxial rank-2 TCF C U2 (τ), 17 O NMR relaxation is governed by the biaxial rank-2 TCF where V is the 17 O electric field gradient (EFG) tensor, with asymmetry parameter η = 0.86 and the major principal axis perpendicular to the molecular plane. 84,100,101 This TCF was computed in terms of Cartesian tensor components, as described. 84 In the following, the uniaxial rank-1 and rank-2 TCFs in Eq. (1) and the biaxial rank-2 TCF in Eq. (2) are referred to as U1, U2, and B2, respectively. Spatially resolved TCFs were computed for hydration shells and for subsets of the first shell based on the initial (τ = 0) location of the water oxygen; the subset TCFs are therefore additive. The TCFs were averaged over ∼20 000 time origins (∆t = 5 ps) and over all water molecules initially belonging to the subset. All TCFs were recorded with a time delay resolution ∆τ = 0.1 ps. The maximum delay time τ up to which the TCF was recorded was 1 ns for the first shell and its subsets, 0.1 ns for the higher shells (and bulk water) and 10 ns for buried water molecules.

Integral rotational correlation time
In the following, the normalized TCFs in Eqs. (1) and (2) are collectively denoted by C(τ). In the isotropic systems investigated here, these TCFs decay to zero: C(τ → ∞) = 0. But the TCF decays on multiple time scales, reflecting not only local water rotation but also site exchange, translational diffusion, and protein rotation. Correlation times must therefore be defined, computed, and interpreted with care.
The integral rotational correlation time (IRCT) is defined as To the extent that the TCF can be expressed as a linear combination of exponentials, the IRCT can also be interpreted as the average correlation time. We split the integral into two parts, The dominant head part is computed by numerical integration in the range 0 ≤ τ ≤ τ 2 using a fourth-order trapezoidal rule. The minor tail part is computed by analytical integration in the range τ 2 ≤ τ < ∞, assuming that C(τ) decays exponentially in this range so that The asymptotic parameters C a and τ a are determined by linear regression of ln C(τ) in the interval τ 1 ≤ τ ≤ τ 2 . The regression interval is selected individually for each TCF as the widest and highest (largest τ) range before the first zero-crossing (due to noise) and where a well-defined nonzero slope (the magnitude of which must decrease monotonically with τ) is evident.
Note that J tail (ω 0 ) can be negative if ω 0 τ a is not 1. With the TCF sampled at a resolution of ∆τ = 0.1 ps, J(ω 0 ) cannot be obtained at frequencies higher than ω 0 = π/∆τ ≈ 3 × 10 13 rad s −1 , but this is of no concern since experimental 2 H and 17 O MRD data are limited to ω 0 < 10 9 rad s −1 .
To illustrate the computation of the IRCT, Fig. 1 shows the B2 TCF for the polar first shell of UBQ. The last linear FIG. 1. Biaxial rank-2 (B2) TCF for the polar sites in the first hydration shell of UBQ, recorded up to 1 ns. The exponential tail (dashed) is fitted to the TCF in the interval τ 1 < τ < τ 2 (orange). interval in the semilog plot was identified as [τ 1 , τ 2 ] = [0.6, 1.0] ns. Linear regression to the TCF data in this interval yields the exponential tail of the TCF, indicated by a blue dashed line. Extrapolating this tail beyond τ 2 = 1 ns, we obtain from Eq. (5) the tail contribution τ tail R = 0.4 ps, corresponding to 2.7% of τ R = 14.4 ± 0.2 ps.
Since the trajectory was sampled with a time resolution of ∆τ = 0.1 ps, the damped librational oscillations in the first 100 fs are not captured (Fig. C1a of the supplementary material), but this has a negligible effect on the IRCT. As a rough measure of the libration amplitude, we use C I ≡ C(0.1 ps).
The standard error of the IRCT, and of other quantities derived from the TCF, was computed by dividing the ∼20 000 time origins into ten blocks. 102 Bulk water results for the three water models, used as a baseline for the protein-induced dynamic perturbation, are collected in Appendix C of the supplementary material and the TCFs are shown in Fig. C1 of the supplementary material.

III. FIRST HYDRATION SHELL
We investigate the rotational dynamics of the 350-400 water molecules in the first hydration shell by means of the three rotational TCFs in Eqs. (1) and (2) and the associated IRCT in Eq. (3). To resolve the wide IRCT variation among different hydration sites on the heterogeneous protein surface, we decompose the first shell into subpopulations in two different ways. The polarity decomposition is based on the presence or absence of polar protein atoms in the first coordination shell of the water molecule (Table A1 of the supplementary material), whereas the confinement decomposition is based on the extent to which the water molecule is removed from bulk water, as quantified by the confinement index z C (Sec. II B 1). For each subset s, we quantify the effect of the protein on water rotation by the rotational perturbation factor (RPF), where both IRCTs refer to the same TCF type.

A. Polarity-based decomposition
Polarity-resolved TCFs are shown in Figs. B1-B5 of the supplementary material, while the IRCTs and RPFs derived from these TCFs are collected in Tables A2-A5 of the supplementary material. The RPFs for the first-shell subsets of the four proteins and three water models are also presented graphically in Figs. B6 and B7 of the supplementary material. A summary of these extensive results is provided in Fig. 2, showing the four-protein mean and standard deviation of the RPFs for the first-shell and five subsets, and based on each of the three TCFs. Several conclusions follow.
(2) For all first-shell subsets, the biaxial TCF B2 is more sensitive to the protein-induced perturbation (larger RPF) than the two uniaxial TCFs U1 and U2 ( Fig. 2; Tables A2-A4 of the supplementary material).
FIG. 2. Rotational perturbation factor ξ s R for the first hydration shell and five subsets, based on the U1 (green), U2 (blue), and B2 (red) TCF. Bars and error bars represent the mean and standard deviation, respectively, for the four proteins in TIP4P-Ew water. BPTI was excluded from the neutral mean and AFP from the negative mean (see text).
(3) For all first-shell subsets of GB1, the RPF is nearly the same for the TIP4P-Ew, SPCE, and TIP3P water models ( Fig. B7 of the supplementary material). (4) For the four non-overlapping subsets of the first shell, the protein-averaged RPF ranges from ∼2 (nonpolar) to 7-11 (negative) and the rank order is nonpolar < positive < neutral < negative (Fig. 2). (5) As expected, the rank-2 TCFs are influenced more by librations than the rank-1 TCF: C I (B2) = 0.768, 17% below C I (U1) = 0.922. The subset rank order for C s I is positive ≈ nonpolar < neutral < negative, essentially the same as for the RPF. This qualitative similarity is consistent with the notion that a higher degree of confinement is associated with a smaller libration amplitude (larger C s I ) as well as with a longer survival time (larger RPF); see Secs. III B and III C. (6) The pronounced RPF difference between the positive and negative subsets ( Fig. 2) is likely due to two factors. First, water molecules near the ionic groups of the long and flexible Lys and Arg side-chains tend to be less confined and hence more mobile (Sec. III B). Second, the formal charge is more delocalized for these positive groups than for negative carboxylate groups, so water molecules interact more strongly with the latter. (7) With the exception of two outliers, the RPF for a given subset and TCF does not differ much among the four proteins (Fig. B6

B. Confinement-based decomposition
The polarity-based decomposition of the first shell reveals significant RPF variations among the subsets (Fig. 2), but these variations constitute correlations that are not necessarily caused by the polarity differences per se. For example, the larger RPF for polar as compared to nonpolar sites may have more to do with the higher degree of confinement (on average) in polar sites than with their polarity (Sec. III C).
To explore this notion quantitatively, we assign to each water molecule a confinement index z C , defined as the number of carbon atoms within 5.0 Å of the water oxygen atom. As seen from Fig. 3 and Fig. B8 of the supplementary material, this simple definition does indeed capture the essence of water confinement.
The confinement index distribution P(z C ), that is, the fraction of all N 1 first-shell water molecules that have confinement index z C , is remarkably similar for the four proteins (Fig.  B9 of the supplementary material), suggesting that it is an invariant property of the protein-water interface. For z C ≥ 2, the distribution conforms closely to exponential form, Fig. 4; Fig. B10 of the supplementary material); for every additional carbon within 5 Å, N 1 (z C ) drops by 25% (Table A6 of the supplementary  material).
The TCFs for these subsets exhibit a clear-cut, robust, and monotonic trend: with increasing confinement index z C; the amplitude and decay time of the TCF tail increase (  Tables A7-A9 of  the supplementary material).
For z C ≤ 10, the confinement-resolved RPF ξ R (z C ) increases exponentially, ξ R (z C ) ∼ exp(α z C ) ∼ β z C ( Fig. 6; Figs. B13 and B14 of the supplementary material); for every additional carbon within 5 Å, the RPF increases by 19% (U1), 22% (U2), and 27% (B2) ( Table A10 of the supplementary  material). Among the three TCF types, the biaxial B2 RPF is thus the most sensitive probe of confinement. For a given TCF type, the incremental increase (the β parameter, or the slope FIG. 3. Water molecules in the first shell of UBQ, color-coded according to their confinement index: z C = 1 (dark blue), 5 (light blue), 10-12 (orange), and 13-15 (red). For clarity, only a fraction of the z C = 1 subset is shown.
FIG. 4. Normalized distribution P(z C ) of confinement index z C for the 400 water molecules in the first shell of UBQ (circles) and exponential fit for in a semilog plot) is nearly the same for the four proteins (Fig.  B14 of the supplementary material). For the most confined sites, the RPF increases more strongly and with more proteinspecificity (Figs. B13 and B14 of the supplementary material), as expected since such sites are few; for the four proteins, only 5.5-7.8 water molecules have z C = 13-15 (Table A9 of the  supplementary material).
With increasing confinement index z C , the RPF ξ R (z C ) increases exponentially (Fig. 6), whereas the subpopulation N 1 (z C ) decreases exponentially (Fig. 4). Because the two exponents (α and α ) are similar in magnitude (especially for the B2 TCF), all z C -resolved subpopulations make similar contributions to the shell-averaged RPF ξ R (Fig. 7). But for BPTI and AFP, the most highly confined subsets make a FIG. 5. Biaxial rank-2 (B2) TCF for first-shell subsets with confinement index z C = 1 (dark blue), z C = 5 (light blue), z C = 10 (orange), and z C = 15 (red) for UBQ. In the semilog plot, the TCF is colored gray in the regression interval [τ 1 , τ 2 ] and the fitted exponential TCF tail is dashed. The bulk water TCF is also shown (magenta dashed-dotted).
FIG. 6. Rotational perturbation factor ξ R (z C ) versus confinement index z C for water molecules in the first shell of UBQ. Dotted lines resulted from exponential fits for z C ≤ 10. TCF type: U1 (green), U2 (blue), and B2 (red).
relatively larger contribution to the shell average (Fig. B15 of the supplementary material).
Even though the shell-averaged RPF is made up of comparable contributions from all 15 z C -resolved subsets, the highly confined subsets are small so the few water molecules in such sites contribute a large fraction of ξ R . In UBQ, for example, 1.8 sites with z C = 15 contribute more than 83.7 sites with z C = 1 (Fig. 7, Table A9 of the supplementary material).
The disproportionate contribution from a few highly confined sites to the shell-averaged RDF is illustrated in Fig. B16 of the supplementary material, where we plot the cumulative RPF ξ R,cum (z C ), that is, the population-weighted average RPF for all water molecules with confinement index not larger z C . The strong influence of the small number of highly confined sites is best seen by plotting ξ R,cum (z C ) against the fraction water molecules with confinement index ≤ z C (Fig. B17 of the supplementary material). In this way, we can also calculate the partial RPF for the 50% or 90% "fastest" (and least confined) water molecules in the first shell. From the B2 TCF, we thus find ξ R,cum (50%) = 1.80 ± 0.02 and ξ R,cum (90%) = 2.8 ± 0.2 (mean ± standard deviation for the four proteins) (Table A11 of the supplementary material).
To construct the full RPF distribution f (ξ R ) for the first shell, we need to know the distribution f (ξ R ; z C ) for the hydration sites in each subset so we can compute f (ξ R ) = z C P(z C ) f (ξ R ; z C ). Making the reasonable assumption that FIG. 7. Contributions to the RPF ξ R (B2) from first-shell z C -subsets of GB1 (red) and UBQ (black).
FIG. 8. Normalized distribution f (ξ R ) of the RPF ξ R (B2) in the first shell of GB1 (red), BPTI (blue), UBQ (black), and AFP (green). A power-law fit in the interval 4 ≤ ξ R ≤ 40 is shown (dots) for BPTI and UBQ. f (ξ R ; z C ) is Gaussian, we find that the distribution f (ξ R ) is nearly the same for the four proteins ( Fig. 8; Figs. B18 and B19 of the supplementary material), with a maximum at ξ R = 1.82 ± 0.02 and a power-law tail, f (ξ R ) ∼ ξ −ν R , with exponent ν = 2.2 ± 0.1 (mean ± standard deviation for the four proteins) (Table A6 of the supplementary material).
In our reconstruction of the RPF distribution f (ξ R ), the Gaussian f (ξ R ; z C ) acts as a smoothing function. We assume that σ(z C ) = ξ R (z C ), but other choices for the width of the smoothing function yield nearly the same maximum and power-law exponent. On the other hand, it is clear that the Gaussian form cannot be correct for small z C , where f (ξ R ; z C ) must be skewed to ensure that f (ξ R ; z C ) = 0 for ξ R ≤ 0. Therefore, f (ξ R ) comes out too large below the maximum. In particular, the physically required endpoint f (0) = 0 is not reproduced (Fig. B19 of the supplementary material).

C. Rotation mechanisms
We have seen that the "fastest" water molecules occur in nonpolar sites (Fig. 2) and that the "slowest" water molecules have the largest number of carbon neighbors (Fig. 6). To see why this is not a contradiction, we examine the polar coordination numbers, that is, the number of protein-O, protein-N, or water-O atoms within the respective cutoff distance (Sec. II B 1) of a water molecule with a given confinement index z C (Fig. 9, Tables A12 and A13 of the supplementary material).
As expected, the number of water neighbors decreases monotonically with increasing confinement index. Less obviously, the number of neighboring polar protein atoms increases along with z C . In effect, the confinement index z C measures the extent-or "intimacy"-of the water-protein contact, regardless of whether it involves polar or nonpolar protein atoms. The covariation of polar and nonpolar protein atom neighbors (Fig. 9) reflects a universal property of the protein-water interface: water molecules can penetrate the protein only if (most of) the sacrificed water-water H-bonds are compensated by new water-protein H-bonds. Indeed, the total polar (W+O+N) coordination number for water molecules in the first shell is, on average, just 1% lower than in bulk water. 99 This remarkable invariance is a consequence of the high H-bonding propensity of the water molecule, which also endows bulk water with an exceptionally high cohesive pressure (23 kbar) that keeps water molecules out of small nonpolar cavities [103][104][105][106] and narrow nonpolar pores. 29 To understand how water rotation is slowed down by the protein, we must take into account the polar coordination of the water molecule, not only on average, but also in the transition state for water rotation. Consider first the lowestconfinement subset z C = 1, which makes up 21%-22% of the first shell of the four proteins. In this subpopulation, 86% of the water molecules have no polar protein atom within H-bonding distance (Fig. 10, Fig. B20 of the supplementary material) and therefore belong to the nonpolar subset (Table A1 of the supplementary material). On average, these water molecules coordinate 4.23 other water molecules (Fig. 9, Table A13 of the supplementary material), just 2% below the bulk-water coordination number, 4.32, for the same TIP4P-Ew model. 99 These unconfined water molecules (dark blue in Fig. 3) rotate by a bulk-like mechanism, where the cooperative motion of several water molecules lowers the transition-state barrier by enabling water-water H-bonds to be broken and formed in concert. The vast majority of the water molecules assigned to the nonpolar subset also belong to the low-confinement subset z C = 1-3; both of these subsets account for 57% of the first shell (Tables A1 and A9 of the supplementary material) although they are not identical. The two-fold retardation of water rotation in the nonpolar subset ( Fig. 2) can be attributed to the impediment of the bulk-like cooperative rotation mechanism caused by the asymmetric distribution of water neighbors.
The polar subset, accounting for the remaining 43% of the first shell (Table A1 of the supplementary material), has a much larger, ten-fold retardation (Fig. 2). In contrast to the nonpolar subset, the polar subset comprises hydration sites ranging from low to high confinement. For the low-confinement part of the polar subset, water still rotates by a bulk-like collective mechanism, which is now impeded also by the fixed location and fixed H-bond polarity of the protein polar atom(s) (Fig. 11, top). Highly confined water molecules with z C = 13-15 (red in Fig. 3) account for 1.6%-2.0% of the first shell and have ∼2 polar protein atoms within H-bonding distance (Fig. 9); they therefore belong to the polar subset. If it were not for these H-bonding opportunities, such sites would be devoid of water. Because of the steric exclusion of most water neighbors, these highly confined water molecules rotate by a different mechanism (Fig. 11, bottom).
A bulk water molecule can switch from being H-bond donor to being H-bond acceptor by changing its orientation and it can optimize the H-bond geometry by changing its position. By contrast, for most protein polar atoms, both the H-bond polarity and the position are essentially fixed. As long as a water molecule remains in a confined site, it is therefore orientationally restricted, except for sub-ps librations and, possibly, much slower symmetric C 2 flips. Rotation therefore requires an exchange event, whereby another water molecule enters the confined site and the original one now can rotate with little or no retardation (Fig. 11, bottom). This rotation mechanism has been termed exchange-mediated orientational randomization (EMOR). 28,107 In the EMOR mechanism, the TCF is proportional to the survival probability Q S (τ) of a water molecule in the site. For Poissonian exchange statistics, Q S (τ) = exp(−τ/τ S ) so the asymptotic decay time τ a in Eq. (5) can be identified with the mean survival time τ S . If internal motions are much faster than exchange, the IRCT can be decomposed as τ R = τ internal + S 2 τ S , where S is an orientational order parameter. For a highly confined water molecule, τ S is typically two orders of magnitude longer than τ R for an unconfined water molecule, because the transition state for exchange requires H-bonds to the protein to be broken before new ones can be formed with other water molecules (Fig. 11, bottom).
The description in the previous paragraph applies to a fixed protein. A protein in solution undergoes rotational diffusion, so τ S is replaced by an effective correlation time τ eff = 1/(1/τ S + 1/τ P ), where τ P is the rotational correlation time of the protein. 12 However, even for the small proteins considered here, τ P is several nanoseconds (Sec. V) so τ eff ≈ τ S for water molecules in the first hydration shell. The EMOR mechanism also operates for buried water molecules, but then τ S is typically in the microsecond range 28,32,33,70 since the transition state involves large-scale protein structural fluctuations. 108 For a long-lived buried water molecule in a freely tumbling protein, τ P therefore plays the role of τ S . Figure 11 is highly schematic, but we believe that it captures the essential features underlying the broad τ R (or ξ R ) distribution for water molecules in the first hydration shell of proteins (Fig. 8). This distribution reflects a continuous spectrum of mechanisms, ranging from the bulk-like cooperative mechanism to the EMOR mechanism. Several further observations support this scenario.
If water in the low-confinement nonpolar subset rotates by a bulk-like mechanism, then τ R should have essentially the same dependence on TCF rank as in bulk water: τ bulk R (U1)/τ bulk R (U2) = 2.56 (Table C1 of the supplementary material). This is, indeed, the case: τ non R (U1)/τ non R (U2) = 2.55 for the four proteins (span: 2.41-2.67) (Tables A2 and A3 of the supplementary material). The polar subset has a weaker rank dependence, τ pol R (U1)/τ pol R (U2) = 2.07 (span: 1.87-2.33), because it includes the most highly confined sites. The TCF for these sites should exhibit an initial decay due to internal motions in the confined site (librations and, possibly, C 2 flips), followed by a much slower exponential decay due to water exchange. (For symmetry reasons, the C 2 flip affects the U1 and U2 TCFs, but not the B2 TCF. 12 ) In contrast to the initial decay, the asymptotic decay time should be the same for all three TCFs since, on the time scale of the mean survival time, the water orientation is randomized "instantaneously" after the exchange. This expectation is confirmed in Fig. 12, where the asymptotic decay time, τ a ≈ 0.5 ns, is the same for all three TCFs. The same near-equality of τ a is seen in the three TCFs for the entire polar subset (Fig. B1 of the supplementary material) since the TCF tail is produced by the most confined water molecules.
In conclusion, the strong increase of the amplitude and decay time of the TCF tail with increasing confinement (Fig. 5; Figs. B11 and B12 of the supplementary material) and the consequent increase of the IRCT and RPF ( Fig. 6; Figs. B13 and B14; Tables A7-A9 of the supplementary material) can all be understood in terms of a gradual transition from the collective bulk-like mechanism for low confinement to the EMOR mechanism for high confinement.

A. Range of the dynamic perturbation
To determine the spatial range of the protein-induced perturbation of water rotation, we compute TCFs for the water molecules (initially) belonging to the nth hydration shell (defined in Sec. II B 1). From these shell-resolved TCFs (Figs. B21-B23 of the supplementary material), we obtain the IRCT τ n R (Tables A14-A16 of the supplementary material) and the RPF ξ n R (Table A17 of the supplementary material) for the nth shell. The perturbation δ n R , which decays toward zero with increasing shell index n, is simply the excess RPF: δ n R ≡ ξ n R − 1. Given that the first-shell perturbation is caused by confinement and, to a lesser extent, by water-protein H-bonds and asymmetric water coordination (Sec. III C), we expect a much smaller perturbation in the second and higher shells. This is, indeed, the case: whereas water rotation is slowed down five-fold (as gauged by the B2 TCF) in the first shell, it is only ∼10% slower in the second shell and ∼1% slower in the third shell (Table A17 of the supplementary material). In the first shell, the perturbation varies somewhat among the four FIG. 13. Excess rotational perturbation factor derived from the three TCFs, U1 (green), U2 (blue), and B2 (red), for water molecules in the nth hydration shell. The symbols represent the average for the four proteins and the curves resulted from exponential fits.
proteins because of the disproportionately large contribution from a few highly confined sites (Sec. III B). By contrast, for the second and higher shells, this variation is much smaller (Table A17 of the supplementary material). To obtain a representative view of the spatial decay of the perturbation, we plot in Fig. 13 the four-protein average δ n R versus shell index n. As for the first shell, the perturbation of the higher shells does not depend significantly on the water model (Table A17 of the supplementary material).
The decay over the first three shells, where the perturbation is significantly different from zero, is well described as δ n+1 (Fig. 13). On going from one shell to the next higher one, the perturbation is reduced by an order of magnitude, with α R ranging from 13 (U1) to 40 (B2) ( Table A18 of the supplementary material). Equivalently, the perturbation decay can be described by an exponential function (defined on the positive integers): δ n R ∼ exp(−n/n * R ). The decay "length" n * R , which indicates where the perturbation has decayed to a fraction 1/e of its value in the first shell, ranges from 0.39 ± 0.01 (U1) to 0.27 ± 0.01 (B2). To obtain a physical length, these values can be multiplied by the average shell thickness, 99 2.8 Å, yielding 1.10 ± 0.03 Å (U1) and 0.76 ± 0.01 Å (B2) ( Table A18 of the supplementary material).

B. Electric-field induced water alignment
For most purposes, the short-ranged perturbation of water rotation (Fig. 13) can be modeled as a step function; the first shell is significantly perturbed, while the second and higher shells, to an excellent approximation, can be regarded as bulk water. But this is not the whole story: although of little or no practical importance, a very small part of the perturbation can have a much longer range.
Water molecules are weakly aligned by the protein's electric field, which decays asymptotically as R −3 for the electroneutral proteins studied here (Sec. II A). Such a weak alignment hardly affects the local water dynamics, but it introduces a persistent orientational correlation. Complete randomization of the water molecule's orientation then cannot be accomplished by local water rotation alone, but requires other processes that randomize the orientation (with respect to a labfixed frame) of the aligning field. In a protein solution, these processes are water diffusion around the protein and protein tumbling (to the extent that the field lacks spherical symmetry). These processes are much slower (by 2-4 orders of magnitude) than local water rotation (outside the first shell), so the TCF C(τ) decays on two distinct time scales: picosecond water rotation brings the TCF down to a small plateau value, whereupon nanosecond water diffusion completes the decay toward zero. The IRCT in the presence of an electric field E is then given by 109 τ where S k is a rank-dependent order parameter 109 and τ E is the integral correlation time of the slow isotropic process. At the noise level of our TCF data, the weak long-time tail associated with isotropic averaging of the local electric field can be seen in the rank-1 (U1) TCF up to the sixth shell ( Fig. B21 of the supplementary material). However, even in the second shell, the TCF has decayed to well below 1% of its initial value before the tail is manifested. Therefore, the second term in Eq. (9) does not contribute much to the IRCT, which is within 1% of the bulk value already in the fourth shell (Tables A14 and A17 of the supplementary material). As expected, the TCF tail has a larger amplitude for the GB1Z system ( Fig. B23 of the supplementary material), where the net protein charge produces a stronger field in the first few shells (before it is exponentially attenuated by counterion screening), and the RPF in the first few shells is slightly larger for GB1Z than for GB1 (Table A14 of the supplementary material). By contrast, no tail is seen above the noise level for the rank-2 TCFs (Figs. B22 and B23 of the supplementary material) because the alignment effect is then of higher order: S k ∼ E k (Sec. VI E).
Water molecules in the first shell are also aligned by the electric field and, more importantly, by short-range interactions. But here there is no time-scale separation, so the alignment effect is overshadowed by the much larger, short-range perturbation mechanisms (Sec. III C). For the higher shells, field alignment is the only perturbation.
Our shell-resolved TCFs are based on the initial (τ = 0) shell assignment of the water molecule. During the decay of the TCF, water molecules may leave the shell, leading to loss of resolution. Such shell smearing can only be important if the TCF has a long-time tail and if the water molecule leaves the shell before the TCF has decayed to zero. Shell smearing is therefore not of concern for confined sites, where the water molecule remains in the first shell until its orientation is randomized. Shell smearing is only important for rank-1 TCFs of the second and higher shells, but then the tail is so weak that it hardly affects the IRCT.

V. SIMULATION VERSUS EXPERIMENT
We shall now use our simulation data to compute quantities that are measured in magnetic relaxation dispersion (MRD) experiments. 12,13 Such a comparison is best carried out with 17 O MRD data, which selectively report on water dynamics (whereas 1 H and 2 H MRD data are also affected by labile hydrogens) and which, for symmetry reasons, are unaffected by water C 2 flips, the kinetics of which may not be accurately reproduced by the simulation. 108

A. Magnetic relaxation dispersion profile
The longitudinal water 17 O spin relaxation rate can be expressed as 12 where ω Q is the nuclear quadrupole frequency (including a numerical factor). The effective spectral density function (SDF) J(ω 0 ) is a linear combination of cosine Fourier transforms of the B2 TCF in Eq. (2), where ω 0 = 2πν 0 is the (angular) Larmor frequency, and the kernel K(z) is defined as Since K(ω 0 τ) ≤ 1, it follows that J(ω 0 ) ≤ τ R , the equality holding if all rotational correlation times τ k in the population described by C(τ) obey ω 0 τ k 1. In Tables A2-A5, A7-A9, and A14-A16 of the supplementary material, we quote the effective SDF J(ω * 0 ) at MHz (for all three TCFs), the highest 17 O Larmor frequency accessed in most MRD studies. For the first shell and its subsets, τ R exceeds J(ω * 0 ) by more than 10% only for the most highly confined sites (z C = 13-15).
Although our focus here is on the external hydration shells, we have also computed the three TCFs for the buried water molecules (Appendix D of the supplementary material). Because these TCFs decay on the nanosecond time scale, the statistical error is at least an order of magnitude larger than for the external water molecules. Nevertheless, for all four proteins, the (effective) correlation time computed for the buried water molecules is in semi-quantitative agreement with MRD experiments (Appendix D of the supplementary material).
Having computed the B2 TCF C(τ) for all water molecules in the system, we can obtain the complete MRD profile R 1 (ω 0 ) from Eqs. (10)- (12). Moreover, we can dissect the MRD profile into its contributions from different water subpopulations ( Fig In contrast to buried water molecules and the second and higher shells, the first hydration shell gives rise to a broad dispersion (Fig. 14), extending from ∼100 MHz to ∼100 GHz and reflecting the broad τ R distribution ( Fig. 8). At room temperature, this extended dispersion cannot be measured, since the strongest available NMR magnets only reach an 17 O Larmor frequency of 136 MHz. But the low-frequency half of this dispersion (produced by the more confined water molecules) can be shifted into the experimentally accessible frequency range by performing MRD measurements at lower temperatures. 27

B. Rotational perturbation
When confronting simulation results with experimental data, it is important to compare commensurate quantities. The link to MRD experiments is provided by the generalized excess RPF (for shell n), which reduces to the excess RPF δ n R = ξ n R − 1 in the limit ω 0 = 0 or, more precisely, when ω 0 τ k 1 for all rotational correlation times τ k in the distributions for shell n and bulk water. In the opposite (unattainable) limit, δ n R (ω 0 → ∞) = 1/τ n k / 1/τ bulk k − 1 ≤ 0. A measure of the total perturbation is obtained by multiplying δ n R (ω 0 ) by the occupation number N n of the nth shell and then summing over all shells. Dividing this quantity by the first-shell occupation number N 1 , we obtain the effective first-shell generalized excess RPF, If δ H R (ω 0 ) is assigned to the N 1 water molecules in the first shell while the remaining N W -N 1 water molecules are treated as bulk water, the total perturbation (whatever its range) is reproduced exactly. Since the perturbation is, in fact, very short-ranged (Fig. 13), the effective quantity δ H R (ω 0 ) can, to an excellent approximation, be regarded as the actual excess RPF for the first shell. In other words, the second term in Eq. (14) is negligibly small at all experimentally accessible frequencies ( Fig. 15; Figs. B26 and B27 of the supplementary material), contributing merely ∼3% of δ H R (ω * 0 ) at 81.3 MHz (Tables A19  and A20 of the supplementary material).
By a simple normalization, the MRD profile can be related to the effective first-shell generalized excess RPF, where Eqs. (10) and (13)- (15) were used in the second step. The model-independent quantity δ MRD R (ω 0 ) involves the 17 O longitudinal relaxation rate R 1 (ω 0 ) measured at frequency ω 0 on a protein solution with water/protein mole ratio N W and the (frequency-independent at accessible ω 0 ) relaxation rate R bulk 1 measured on a bulk-water reference sample. The normalization also makes use of the first-shell occupation number N 1 , obtained from a simulation or estimated as N 1 = A S /a H , where A S is the protein SASA and a H = 11.1 Å 2 is the invariant water interfacial area. 99 Normalization by N 1 is a matter of convenience and does not imply that the perturbation is limited to the first shell.
The buried-water contribution in Eq. (15), δ bur R (ω 0 ) , accounts for most of the observed frequency dependence in the MRD profile (Appendix D of the supplementary material), but it makes only a small contribution to δ MRD R (ω * 0 ) at 81.3 MHz (Fig. 14). 27 By performing a model fit to the MRD profile (Appendix D of the supplementary material), 12,13,27 this contribution can be subtracted to obtain δ H R (ω * 0 ). However, to keep our analysis simple and completely model-independent, we ignore this minor complication. In other words, we compare the experimental δ MRD R (ω * 0 ) with the simulated δ H R (ω * 0 ), keeping in mind that δ MRD R (ω * 0 ) also contains a small contribution from buried water molecules.
The best available experimental δ MRD R (ω * 0 ) values for the four proteins are compiled in Table A21 of the supplementary material and compared with simulated δ H R (ω * 0 ) values in Fig. 16 and Table A19 of the supplementary material. Because δ MRD R (ω * 0 ) is derived from direct measurements without any model-dependent interpretation, this comparison primarily tests the accuracy of the simulation results. The semiquantitative agreement between experiment and simulation evident from Fig. 16 clearly supports the simulation data and the conclusions derived from them.
On the other hand, even though the experimental δ MRD R (ω * 0 ) value contains a small buried-water contribution not present in δ H R (ω * 0 ), it is 25%-30% smaller for three of the four proteins. For BPTI and UBQ, this discrepancy is highly significant, the difference being an order of magnitude larger than the uncertainty (Table A19 of the supplementary material). We attribute this discrepancy to the molecular mechanics force field, which evidently yields too slow dynamics for (some of) the water molecules in the first shell. The origin of the discrepancy might be the Amber ff14SB protein model, the TIP4P-Ew water model, or both. But the finding that the TIP4P-Ew, SPCE, and TIP3P models all yield δ H R (ω * 0 ) values that exceed the experimental δ MRD R (ω * 0 ) to similar extent (Table A20 of the supplementary material) suggests that the protein model is the principal source of error. The somewhat larger δ H R (ω * 0 ) value for GB1Z may be caused by minor differences between the GB1Z and GB1 trajectories in the structure, interactions, and fluctuations of a few highly confined hydration sites.
As we did with the first-shell TCF (Sec. III B), we can resolve the generalized excess RPF with respect to confinement index z C , The kernel K(ω 0 τ) in Eq. (11) suppresses contributions from water molecules with correlation times longer than 1/ω 0 . However, except for a few highly confined sites in some of the proteins (see below), all water molecules in the first shell have correlation times (at 300 K) shorter than 1/ω * 0 ≈ 2 ns. The different z C -subsets in Eq. (16) therefore make comparable contributions to δ 1 R (ω * 0 ) ( Fig. B28 of the supplementary material), just as we found at ω 0 = 0 (Fig. 7).
The z C -resolved contributions to the δ 1 R (ω 0 ) dispersion display a monotonic trend from higher dispersion frequency (shorter correlation time) for the least confined (smallest z C ) hydration sites to lower dispersion frequency for the most confined sites (Fig. B29 of the supplementary material). Beyond this general trend, we see considerable variation among the four proteins. For UBQ, the 15 dispersion profiles δ 1 R (ω 0 ; z C ) have all more or less reached the low-frequency plateau at 81.3 MHz, indicating a correlation time <2 ns for all z C -subsets. By contrast, for the most confined sites (z C = 13-15) of AFP, the reference frequency 81.3 MHz is near the middle of the dispersion. The combined mean occupation number of these sites is 7.5 (Table A9 of the supplementary material), likely including most or all of the linear array of six ordered water molecules on the ice-binding surface. 30,92 These highly confined water molecules rotate by the EMOR mechanism, with τ R = τ int + S 2 τ S (Sec. III C). The dispersion centered near 81.3 MHz can therefore be accounted for by a mean survival time, τ S ≈ 1-2 ns, even though the IRCT is much shorter, τ R ≈ 0.1-0.3 ns (Table A9 of the supplementary material).

A. From TCF to correlation time
In simulation studies of water rotation in the protein hydration shell, the results obtained and the conclusions drawn depend critically on how the rotational correlation time is derived from the TCF. In the present study, we have computed and analyzed the TCF over a much wider range of delay times τ than what is customary. For the first shell and its subsets, we recorded the TCF up to τ = 1 ns and we computed the IRCT by numerically integrating the TCF, without approximations or model assumptions, up to a delay time τ 2 , where, typically, the normalized TCF has decayed to ∼10 −3 and statistical noise prevents further direct integration. In addition, we obtained the small contribution to the integral from the interval τ 2 < τ < ∞ by extrapolating the exponential tail established in the interval τ 1 < τ < τ 2 ( Fig. 1).
Most previous simulation studies have not computed the IRCT τ R , as required to compare with MRD data, but have instead determined a rotational correlation time τ r by an exponential fit to a small part of the highly non-exponential TCF. The work from Laage's group is typical in this respect in using a fitting interval of 2-10 ps, 59,61 but others have used fitting intervals as short as 0.25-1.5 ps. 110 Because of the dramatic variation in TCF shape between exposed and confined hydration sites (Fig. 5), it is clear that the τ r value deduced from the slope in the interval 2-10 ps of the semilog plot differs greatly from the IRCT τ R measured by MRD.
Exponential fitting of only the initial (τ < 10 ps) part of the TCF can be justified when modeling data from time-resolved vibrational spectroscopy, where the short excited-state lifetime prevents access to longer times. 80 Unlike MRD, which measures the integral of the TCF, such methods therefore cannot detect the rotational perturbation in confined hydration sites.
To demonstrate quantitatively the importance of considering the full TCF, we have analyzed our TCF data as was done by Laage et al., determining τ r from an exponential fit in the interval 2-10 ps. The resulting RPF ξ r ≡ τ r /τ bulk r for the z C -subsets of the four proteins is included in Tables A7-A9 of the supplementary material and in Fig. B13 of the supplementary material. For the most exposed hydration sites, ξ r is only marginally smaller than ξ R , but for the most confined sites these quantities differ by an order of magnitude. As a result, the analysis of Laage et al. misses much of the confinement effect and, therefore, does not identify the important EMOR rotation mechanism. Because they only consider the initial part of the TCF, Laage et al. found a "quite uniform" RPF distribution over the protein surface with a "completely negligible (<1%)" fraction of strongly retarded (ξ r > 10) water molecules in the first shell. 59 By contrast, we find that 9.6% ± 0.6% of the first-shell water molecules (with z C > 8) have ξ R > 10 and that these water molecules account for 56% ± 8% of the total excess RPF δ 1 R (0) = ξ R − 1 for the first shell. (The quoted mean ± standard deviation for the four proteins refers to the B2 TCF. The corresponding figures for the U2 TCF computed by Laage et al. are 7.1% ± 0.4% and 43% ± 6%.) We note that these calculations pertain to the first hydration shell, exclusive of buried water molecules (Appendix D of the supplementary material). Laage et al. justified their 10 ps upper fitting interval limit by their desire to avoid contributions from water molecules "which have moved to a different environment." 59,61 But such concerns are not warranted, because the most pronounced long-time TCF tails occur precisely because the water molecule does not move, as explained by the EMOR mechanism (Sec. III C).

B. Dynamical heterogeneity
It was appreciated early on that the chemical and structural heterogeneity of the protein surface leads to water dynamics on a wide range of time scales, but it has proved challenging to identify the molecular origins of this dynamical heterogeneity. MRD experiments have provided important clues, such as the weak pH dependence of the RPF 16 and the normal RPF observed from highly charged proteins 31 (arguing against a dominant contribution from charged hydration sites), the larger RPF observed for proteins with a known abundance of highly confined hydration sites, 27,30 and the much smaller RPF for unfolded proteins 20,31 (both observations pointing to the relevance of the EMOR mechanism). By virtue of their superior resolution in time and space, simulations should resolve this issue, but the picture that has emerged so far is perplexing (Sec. I).
As a recent example, Bagchi et al. concluded that only charged amino acid side-chains can give rise to slower than bulk water dynamics, 81 in stark contrast to our results (e.g., Fig. 2). The rotational correlation times reported by Bagchi et al. refer to water molecules that diffuse within a large region (corresponding to 4-5 hydration shells) and the rotational correlation time is derived from multi-exponential fits to the TCF without including the long-time tail of the strongly confined sites. 81 The RPF distribution reported by Bagchi et al. therefore extend only up to ξ r ≈ 3.5 (for the rank-2 TCF), 81 whereas our RPF values extend up to ∼100 ( Fig. 6; Fig. B18 of the supplementary material).
We find that the first-shell RPF distribution f (ξ R ) is nearly the same for all four proteins, which above the maximum is well approximated by a power law ( Fig. 8; Fig. B18 of the supplementary material) with exponent ν = 2.2 ± 0.1 (mean ± standard deviation for the four proteins). Similar results have been deduced from 17 O MRD data analyzed with a power-law distribution: ν = 2.3 for BPTI and ubiquitin and ν = 2.2 for β-lactoglobulin. 27 We regard the power law as a convenient numerical representation of (part of) the RPF distribution, without attaching any physical significance to it.
The first-shell RPF distribution based on the U2 TCF has been reported by Laage et al. for several proteins (including UBQ). 59,61 Their distribution f (ξ r ), derived from τ r values fitted in the 2-10 ps interval (Sec. VI A), does not extend to as large RPF values as our f (ξ R ) distribution, which includes the full retardation effect in confined hydration sites. The two larger proteins studied by Laage et al. contain many buried water molecules, which give rise to an additional large-RPF tail in their f (ξ r ) distribution. 61 By contrast, our f (ξ R ) distribution refers to the first hydration shell, exclusive of buried water molecules. Another difference is that Laage et al. computed the TCF for individual hydration sites, whereas our z C -subsets include from 1.2 (z C = 15 for GB1) to 87 (z C = 1 for AFP) water molecules, thus reducing the statistical error.
Skewed correlation time distributions, such as the lognormal distribution, have long been used to model dielectric 111,112 and nuclear magnetic 113,114 relaxation data, also in studies of protein hydration dynamics. 115 The log-normal distribution was recently revived by Bagchi et al., 81,116 albeit without actually demonstrating that their simulated distributions (spanning a very narrow range, two orders of magnitude less than those presented here) conform to this mathematical form. As seen from Figs. B18 and B19 of the supplementary material, our f (ξ R ) distributions are not well described by the log-normal function.
It has long been suspected 47,49,59,61,62,67,78,79,82 that confinement is a stronger RPF predictor than polarity, but so far this notion has not been established quantitatively. To accomplish this goal, we introduced the confinement index z C , which is not just a proxy for surface topography, but, because it is based on the carbon coordination of water molecules, describes the degree of confinement of sites that are actually occupied by a water molecule. Our results indicate that the nearly exponential confinement distribution P(z C ) is a universal property of the protein-water interface ( Fig. 4; Figs. B9 and B10 of the supplementary material).
The finding that, for all proteins, the RPF increases monotonically and nearly exponentially, with the confinement index ( Fig. 6; Figs. B13 and B14 of the supplementary material) explains several hitherto puzzling observations. (1) Nonpolar sites are perturbed less than polar sites (Fig. 2), whereas the reverse is true for small molecules. 5,69 This is so because all nonpolar sites are exposed, whereas some polar sites are confined. This, in turn, is a consequence of the high cohesive pressure of bulk water: a confined site would not be occupied if it did not contain at least one polar protein atom, or two polar atoms for strongly confined sites (Fig. 9). (2) Positively charged hydration sites are perturbed less than neutral polar sites (Fig. 2). This is so because the charged groups of the long lysine and arginine side-chains tend to be highly exposed, while some of the neutral polar atoms, such as backbone carbonyl oxygens, reside in confined sites.

C. Rotation mechanism
The discovery of a universal and exponential dependence of the RPF on confinement index ( Fig. 6; Figs. B13 and B14 of the supplementary material), as well as the analysis of the asymptotic TCF decay (Fig. 12) and the dependence of the IRCT on TCF rank (Sec. III C), indicates that different water molecules in the hydration shell rotate by different mechanisms on a spectrum between two extremes: a collective bulk-like mechanism and a protein-coupled EMOR mechanism.
Bulk water provides a unique environment, where each water molecule is surrounded by four or five mobile and versatile (donor or acceptor) potential H-bond partners, allowing the water molecule to rotate six orders of magnitude faster than in ice Ih by a collective low-transition-energy path involving the concerted translation and rotation of several water molecules in such a way as to avoid intermediate states where several H-bonds are simultaneously broken. The extended jump model of Laage and Hynes captures important aspects of this process. 117 Water molecules in the first hydration shell of proteins rotate more slowly because the unique bulk-water environment is replaced by an environment with fewer potential H-bond partners, some of which are, in addition, less flexible with respect to position and orientation as well as H-bond polarity. This scenario can be thought of in terms of the mechanisms operating in sites with low and high confinement.
The low-confinement mechanism is essentially the same collective mechanism as in bulk water, but it is impeded for two reasons. First, the presence of the protein on one side necessarily causes the water molecule to have a less symmetric distribution of water neighbors. The asymmetry is the key here; the total number of water neighbors in highly exposed (z C = 1) sites is very nearly the same as in bulk water (Fig. 9). Second, in polar exposed sites, protein O and N atoms cannot participate as efficiently in the collective process as the more mobile water molecules which they replace.
In the high-confinement (EMOR) process, the water molecule, which typically has two polar protein atoms within H-bonding distance while also being confined by up to 15 C atoms within 5 Å (Fig. 9), must break the H-bond(s) with the protein, which can be replaced by new water-water H-bonds only after the water molecule has left the confined site. In other words, water rotation is mediated by exchange (EMOR) and the main decay of the TCF (after librations and, possibly, C 2 flips have produced a partial orientational averaging) is governed by the mean survival time of the water molecule in the site. Unlike the low-confinement process, the EMOR process involves the coupled dynamics of the water molecule and its immediate protein environment. In a highly confined site, the mean survival time is likely to be governed mainly by structural fluctuations in the protein, as is the case for buried water molecules (although the details, such as the formation of a transient aqueduct, 108 are likely to differ). The extended jump model 117 may capture some aspects of the retardation effect in the low-confinement process, but it can hardly describe the EMOR process.

D. Perturbation range
The spatial range of the protein-induced perturbation of water dynamics has long been a contentious issue, primarily because it is not easily determined experimentally. But another reason for the lack of consensus, which also applies to simulations, is the frequent failure to express results in terms of quantities that can be confirmed or refuted by other investigators. The perturbation is clearly non-zero at any finite distance. The issue therefore cannot be settled by the finding that a particular local property at a certain distance from the protein is significantly different from the bulk-water value. What we need to know is the "rate" of decay of the perturbation, that is, the characteristic length scale on which a local property approaches its bulk-water value. There is also a semantic issue: technically, a "long-ranged" perturbation in a threedimensional system decays asymptotically as 1/R 3 or slower, whereas an exponential decay is always "short-ranged", even if the decay length is long.
When addressing this problem by simulations, there are two critical issues to consider. First, if we want to characterize the perturbation at molecular resolution, we must choose a physical property that can be defined at the single-molecule level, even though it may be affected by interactions with the surroundings. The rank-2 rotational TCF (U2 or B2) satisfies this condition; it reports on single-molecule motion which is highly localized since a water molecule outside the first shell does not have time to move much during the ∼2 ps rotational correlation time. Second, we need to measure the distance of the water molecule from the protein surface. A natural distance metric is provided by the hydration shell index. The simple criterion used here to define the second and higher shells produces shells of nearly constant (∼ 2.8 Å) thickness (Sec. II B 1), so a physical decay length can be obtained by multiplying the number of shells by 2.8 Å. In this way, we found that, for all four proteins, the excess rotational perturbation decays exponentially to zero with a decay length of 0.3 shells or 0.8 Å for the rank-2 TCFs and 0.4 shells or 1.1 Å for the U1 TCF ( Fig. 13, Table A18 of the supplementary material).
Previously, the range of the protein-induced perturbation of water rotation has been investigated in two simulation studies by Steinhauser's group. 42,63 The first study 42 reported (for the U2 TCF) a five-fold reduction of the excess RPF from the first to the second hydration shell, whereas we find a 37fold reduction for the same protein and the same TCF (UBQ ,  Table A15 of the supplementary material). This discrepancy is entirely due to the too small first-shell RPF (ξ r = 1.4) resulting from only considering the first 10 ps of the TCF (Sec. IV A); the perturbation for the second shell was nearly the same as what we find (8 versus 9%). The second study 63 reported (for the B2 TCF) a 34-fold reduction of the excess RPF from the first to the second hydration shell, whereas we find a 43-fold reduction for the same protein and the same TCF (UBQ , Table A16 of the supplementary material). Here, the RPF was based on the IRCT (rather than on a fitted exponential decay time), but the integration range was not stated. Presumably, the discrepancy is partly due to a shorter integration range than we used and partly to the different shell definition (with 25% and 70% more water molecules in the first and second shells, respectively, as compared to our monolayer shells).

E. Field-induced alignment
As shown in Sec. IV B, the effect of water alignment by the electric field of the protein is very small (a few percent, versus several hundred percent from the short-range mechanisms in the first shell), albeit detectable in the rank-1 TCF. However, the field alignment effect on the IRCT can be more pronounced for water near surfactant aggregates, where the high surface charge density produces a stronger electric field. For example, Steinhauser's group recently described large alignment effects on the IRCT of water in a reverse micelle. 63 These effects, and their strong dependence on TCF type, can be understood by considering the orientational order parameter S k in Eq. (9).
For a uniaxial rank-k TCF, 109 where P k (z) is the rank-k Legendre polynomial, ϑ is the fixed angle between the water dipole and the vector whose rotation is described by the TCF (in NMR: the principal axis of the nuclear interaction tensor), and β is the angle between the water dipole and the local electric field. Using Eqs. (9) and (17), and describing the dipole-field interaction with an effective dipole moment µ eff W (which, because of dipole-dipole correlations, is larger than the bare water dipole), we find for uniaxial (U) TCFs of ranks 1 and 2 and for the biaxial 17 O B2 TCF (with the principal axis of the EFG tensor perpendicular to the molecular plane), where we have expanded to leading order in the coupling parameter α ≡ µ eff W E/(k B T ). The results obtained by Steinhauser et al. 63 for a reverse micelle are consistent with Eq. (18). For the rank-1 TCF associated with the dipole vector (ϑ = 0), the RPF was found to be very large (ξ r = 14) in the first shell (containing more than half of the 1800 water molecules) and to gradually decrease in the next three shells. By contrast, for the rank-1 TCFs associated with the H-H vector and the normal to the molecular plane, ξ r = 1.2 and 1.5, respectively, with no significant perturbation in the second and higher shells. This striking observation is explained by Eq. (18a), which predicts that the alignment effect vanishes for these vectors (ϑ = π/2). For the four rank-2 TCFs examined by Steinhauser et al., 63 ξ r ≈ 3 in the first shell, decreasing gradually in subsequent shells. These observations are explained by Eqs. (18b) and (18c), showing that the orientation-dependent prefactor is non-zero for all four TCFs, but that the alignment effect is of higher order, and hence smaller (since α < 1), for the rank-2 TCFs.
The explanation of these results proposed by Steinhauser et al. 63 differs essentially from the one given here. They consider the inversion symmetry of the distribution f [cos θ(τ)], where θ(τ) is the angle by which the molecule-fixed vector has rotated in the time interval τ; cf. Eq. (1). However, to exhibit the alignment effect, it is necessary to separate the degrees of freedom associated with local rotation (essentially unaffected by the field) and long-range translational diffusion, which modulates the orientation of the electric field experienced by the water molecule. This separation is accomplished by transforming the angular functions in the TCF from the interaction vector (or tensor) to the laboratory via two intermediate frames, associated with the water dipole and the electric field. 109 In this way, the relevant angles ϑ and β in Eq. (17) appear naturally. By only considering the angle θ(τ), Steinhauser et al. failed to identify translational diffusion as the process responsible for the long-time tail in the TCF and the associated large increase of the RPF, which, in this case, is only an apparent RPF (since it does not describe water rotation).
Based on their reasoning, Steinhauser et al. concluded that the B2 TCF probed by 17 O MRD "does not capture the potentially strong retardation of the water dipole by an electric field exerted by a biological interface" and that this can (partly) explain the "vastly differing findings" from 17 O MRD and dielectric relaxation. 63 While it is true that rank-2 TCFs are much less affected than the dipolar rank-1 TCF by alignment effects, our results (Sec. IV B) as well as the results of Steinhauser et al. 63 show that, in the first hydration shell of proteins, the alignment effect on the rank-1 TCF is dwarfed by the short-range mechanisms discussed in Sec. III C. The fact that the small alignment effect seen in higher shells for the uniaxial TCF is virtually absent in the B2 TCF would seem to be an advantage of 17 O MRD since the alignment effect reports on bulk-water translational diffusion rather than on local water rotation. And for the first shell, where the perturbation is substantial, we have found, in contrast to the conclusion of Steinhauser et al., 63 that the B2 TCF measured by 17 O MRD is a more sensitive probe of the dynamical perturbation than any of the other TCFs (Figs. 2 and 6).
As pointed out before, 118 the same mistaken attribution of electric-field alignment effects on the rank-1 dipolar TCF to slow local water rotation, rather than to long-range translational diffusion, has featured in simulation studies of a diskshaped surfactant micelle (again, with a high surface charge density). 119,120

F. Translation versus rotation
Many simulation studies of protein hydration dynamics have examined the translational diffusion of water molecules, 36,39,40,45,46,65,75,81,[121][122][123][124][125][126][127] often concluding that the diffusion process is "anomalous" because a sub-linear time dependence of the mean-square displacement, (∆r) 2 ∼ t α with α < 1, was observed. In a decoupled continuous-time random walk model, 128 a sub-linear time dependence can arise from a waiting time distribution with a long-time tail of the form ψ(τ) ∼ τ −(1+α) , but the data in Fig. 8 suggest that α = 1.2, inconsistent with sub-diffusive behavior. An exponent α < 1 can also arise from other physical mechanisms, including surface roughness. 65 Apart from this interpretational ambiguity, it is difficult to determine the α exponent (or an effective translational diffusion coefficient) with spatial resolution. A more local dynamical variable, such as the rotational correlation time, arguably provides more direct access to the dynamical heterogeneity of the hydration shell.
Another popular dynamical variable is the mean residence time (MRT) of water molecules in a local site or in an extended region. As compared to the rotational correlation time, the local MRT has the drawback of being sensitive to spatial and temporal cutoff parameters and of not being quantitatively related to any experimental observable. The regional MRT, like the mean-square displacement, is also subject to obstruction effects, which, if not recognized, can lead to erroneous conclusions.
The two studies by Steinhauser's group of water rotation in protein solutions 42,63 agree with the present study in finding that the dynamical perturbation is essentially confined to the first hydration shell. However, two other studies from the same group 56,78 came to a different conclusion. These studies examined the MRT of water molecules in different hydration shells, rather than their rotation. More precisely, the MRT was defined as the time integral of the intermittent shell propagator, that is, the probability that a water molecule initially located in shell n is also found in that shell a time τ later, regardless of its whereabouts in the meantime. The results, for UBQ and two other proteins, showed "a clear variation up to and beyond the fifth shell," 56 which, with our shell definition, corresponds to seven shells. 99 While this was taken to indicate a long-range influence of the protein on water dynamics, 56 another interpretation seems more likely. By virtue of the volume it excludes for diffusing water molecules, the protein enhances the probability that a water molecule will return to its initial shell. Because it involves translational diffusion over long times and distances, this obstruction effect diminishes only gradually with increasing shell index, as observed.
In a similar vein, Bagchi's group 81 recently calculated the MRT for water molecules within 10 Å of any protein atom, corresponding to the first 4-5 hydration shells, as defined here. Here, continuous residence in the region was required, so the MRT is a mean-first-passage time (MFPT). A "retardation factor" was then obtained by dividing this MFPT by the corresponding MFPT calculated for a bulk-water region of similar size and shape. This factor, found to range from 2.3 for the small protein GB1 to 2.6 for the larger myoglobin, was interpreted as an average measure of the proteininduced dynamical perturbation of the water molecules in this large region. 81 A simpler, and more plausible interpretation attributes this "retardation factor" to the obstruction effect of the protein, which, by its mere presence (without perturbing the water dynamics) keeps water molecules in the region for a longer time than if they could leave via both boundaries (as in the bulk reference system). To demonstrate this, we note that, for a spherical protein of radius R and shell thickness L, 129,130 where λ = R/(R + L) and we have taken the diffusion coefficient in the "shell" to be the same as in bulk water. The obstruction factor ξ obs thus ranges from 1 for R = 0 (as required) to 4 for R L. Setting L = 10 Å, as in the simulation study, 81 and estimating R = 12.5 Å for GB1 and 17.5 Å for myoglobin, we obtain from Eq. (19) ξ obs = 2.34 for GB1 and 2.64 for myoglobin, in excellent accord with the simulations. The reported "retardation factor" can thus be reproduced without any retardation at all. This is not surprising because the average diffusion coefficient in a dynamically heterogeneous population is dominated by the "fastest" water molecules, whereas the IRCT is dominated by the "slowest" water molecules (Sec. III B).

G. Simulation versus 17 O MRD
Simulation results on the protein-induced perturbation of water rotation have been compared to experimental results obtained by 17 O MRD on several occasions, 51,59,61-63 always finding semi-quantitative or even quantitative agreement. The comparison presented here is the most rigorous to date. For three of the four proteins, we find that the simulation overestimates the RPF by ∼30%, and this difference increases somewhat if the small contribution to the experimental RPF from buried water molecules is subtracted. (To make the comparison as robust and model-independent as possible, we refrained from making this correction.) As we have seen (Sec. III B), the RPF is heavily influenced by a relatively small number of highly confined hydration sites, where water rotation occurs by the EMOR mechanism so the RPF reflects water exchange rather than water rotation per se. The RPF therefore depends sensitively on protein-water interactions in the transition state for water exchange and on the rate of fluctuations of the confining protein atoms. That the Amber ff14SB force field gets this right to within ∼30% is arguably better than one could reasonably expect.
The present simulation-MRD comparison differs in several respects from previous comparisons. (1) In early MRD work, the effective first-shell excess RPF was normalized by a too small first-shell occupation number N 1 , stemming from a too small interfacial water area 99 a H derived from an early simulation of BPTI. 37 Experimental RPF values published prior to 2008 are thus too large if compared to simulated RPFs normalized with an accurate N 1 value. (2) In previous simulation studies, a large part of the RPF was missed because the longtime tail of the TCF from strongly confined water molecules was not properly accounted for (Sec. VI A). As a result of these two oversights, the simulated RPF was underestimated while the experimental RPF was overestimated, leading to fortuitously good agreement between the two. (3) Only one 63 of the previous comparisons has computed the biaxial B2 TCF actually measured by 17 O MRD. (4) The first hydration shell has been defined in several ways, so the same N 1 value was not used to obtain the simulated and experimental RPF. (5) The (small) effect of buried water molecules was sometimes excluded and sometimes included. (6) Rather than comparing the directly measured (at 81.3 MHz) generalized excess RPF δ r (ω * 0 ), previous comparisons considered the slightly model-dependent zero-frequency limit of this quantity. (7) The (small) contribution from water outside the first shell to the experimental RPF was usually not included in the simulated RPF.
Finally, we note that Steinhauser et al. recently showed that the biaxial B2 TCF can be expressed as a linear combination of the three uniaxial TCFs associated with the three principal axes of the EFG tensor. 63 While this is correct, it is a purely formal decomposition that provides little or no physical insight. First, because one of the weights is always negative (the one corresponding to the EFG principal component of smallest magnitude), this linear combination cannot be regarded as a superposition of rotational modes. Second, there is no reason to suppose that the symmetry of the rotational motion should be reflected in the EFG tensor. In other words: the principal frames of the mobility and EFG tensors are not expected to coincide.

ACKNOWLEDGMENTS
This work was supported by the Swedish Research Council. Computing resources at Lund University Center for Scientific and Technical Computing (LUNARC) were allocated by the Swedish National Infrastructure for Computing (SNIC).

Supplementary material
How proteins modify water dynamics   c In Appendix D, the buried subset is further decomposed into isolated-buried (no W) and connected-buried (≥ 1 W and > 15 C) water molecules.
d The small zwitterionic subset is not considered in this work.
S2 Table A2. Rotational dynamics of first-shell subsets based on the U1 TCF.  Table A3. Rotational dynamics of first-shell subsets based on the U2 TCF.  Table A4. Rotational dynamics of first-shell subsets based on the B2 TCF. S9 Table A6. Results of exponential fit to P (z C ) and power-law fit to f (ξ R (B2)). S10 Table A7. Rotational dynamics in first shell versus z C based on the U1 TCF.   Table A7. Continued.  S12 Table A7. Continued.    Table A8. Rotational dynamics in first shell versus z C based on the U2 TCF.  S15 Table A8. Continued.  S16 Table A8. Continued.    S18   Table A9. Rotational dynamics in first shell versus z C based on the B2 TCF.  S19 Table A9. Continued.  S20 Table A9. Continued.    S22   Table A10. Results of exponential fit to ξ R (z C ) versus z C .  Table A11. Partial RPF for the 50% and 90% "fastest" fraction of the first shell. S23 Table A12. Mean water and polar protein atom coordination numbers for water molecules with given carbon coordination number      Table A13. Protein-averaged mean water and polar protein atom coordination numbers for water molecules with given carbon coordination number z C . a  Table A14. Rotational dynamics versus shell index based on the U1 TCF.    Table A14.      0.697 ± 0.000 0.08 ± 0.05 0.697 ± 0.000 0.08 ± 0.05 0.6884 ± 0.0001 6 − 9 5

S34
b Same GB1 variant as used in the simulation. c Isoform 4-9 differs in 5 residues from isoform 2-14 used in the simulation (and for crystal structure 1EZG).

S37
APPENDIX B: SUPPLEMENTARY FIGURES S38 Figure B1: Rotational TCFs, U1 (green), U2 (blue), and B2 (red), for the polar part of the first shell of GB1, BPTI, UBQ, and AFP. In the semilog plots, the TCF is colored orange in the regression interval [τ 1 , τ 2 ] and the fitted exponential TCF tail is dashed. The U1 TCF for bulk water is also shown (magenta dash-dot). Figure B2: Rotational TCFs, U1 (green), U2 (blue), and B2 (red), for the polar part of the first shell of GB1, GB1Z, GB1SPCE, and GB1TIP3P (expanded τ scale). In the semilog plots, the TCF is colored orange in the regression interval [τ 1 , τ 2 ] and the fitted exponential TCF tail is dashed. The U1 TCF for bulk water is also shown (magenta dash-dot). Figure B3: Rotational U1 TCF for the nonpolar (black), neutral (green), positive (blue), and negative (red) subsets of the first shell of GB1, BPTI, UBQ, and AFP. In the semilog plots, the TCF is colored orange in the regression interval [τ 1 , τ 2 ] and the fitted exponential TCF tail is dashed. The U1 TCF for bulk water is also shown (magenta dash-dot). Figure B4: Rotational B2 TCF for the nonpolar (black), neutral (green), positive (blue), and negative (red) subsets of the first shell of GB1, BPTI, UBQ, and AFP. In the semilog plots, the TCF is colored orange in the regression interval [τ 1 , τ 2 ] and the fitted exponential TCF tail is dashed. The B2 TCF for bulk water is also shown (magenta dash-dot).   Figure B10: Normalized distribution P (z C ) of confinement index z C for water molecules in the first shell of GB1 (red), BPTI (blue), UBQ (black), AFP (green), GB1Z (magenta), GB1SPCE (cyan), and GB1TIP3P (orange). The curves resulted from exponential fits for z C ≥ 2.
S48 Figure B12: Rotational B2 TCF for first-shell subsets with confinement index z C = 1 (black), z C = 5 (green), z C = 10 (blue), and z C = 15 (red) for GB1, BPTI, UBQ, and AFP. In the semilog plots, the TCF is colored orange in the regression interval [τ 1 , τ 2 ] and the fitted exponential TCF tail is dashed. The B2 TCF for bulk water is also shown (magenta dash-dot).
S49 Figure B13: Rotational perturbation factor ξ R (z C ) based on IRCTs (filled circles) and ξ r (z C ) based on exponential fit to TCF in 2-10 ps interval (open symbols) versus confinement index z C for water molecules in the first hydration shell (left-hand y axis). Dotted lines resulted from exponential fits for z C ≤ 10. TCF type: U1 (green), U2 (blue), and B2 (red). The gray bars indicate the fraction P (z C ) of first-shell water molecules with a given z C (right-hand y axis).  Figure B15: Contributions to the first-shell RPF ξ R (B2) from hydration sites with different confinement index z C .  Figure B18: Normalized distribution f (ξ R ) of the RPF ξ R (B2) for the first hydration shell (red solid), with power-law fit in the interval 4 ≤ ξ R ≤ 40 (blue dash) and log-normal fit in the interval 1 ≤ ξ R ≤ 40 (black dash-dot).  Figure B19: Normalized distribution f (ξ R ) of the RPF ξ R (B2) for the first hydration shell (red solid), with power-law fit in the interval 4 ≤ ξ R ≤ 40 (blue dash) and log-normal fit in the interval 1 ≤ ξ R ≤ 40 (black dash-dot). S57 Figure B21: Rotational U1 TCF for the first (red), second (blue), third (green), fourth (black), fifth (cyan), and sixth (orange) hydration shells of GB1, BPTI, UBQ, and AFP. In the semilog plots, the TCF is colored yellow in the regression interval [τ 1 , τ 2 ] and the fitted exponential TCF tail is dashed. The U1 TCF for bulk water is also shown (magenta dash-dot).
S59 Figure B23: Rotational U1 TCF (upper four panels) and B2 TCF (lower four panels) for the first (red), second (blue), third (green), fourth (black), fifth (cyan), and sixth (orange) hydration shells of GB1 and GB1Z. In the semilog plots, the TCF is colored yellow in the regression interval [τ 1 , τ 2 ] and the fitted exponential TCF tail is dashed. The corresponding bulk water TCFs are also shown (magenta dash-dot). S60  Figure B29: Frequency dependence of the contributions to the generalized excess 17 O RPF δ 1 R (ω 0 ) from first-shell water molecules with different confinement index, ranging from z C = 1 (red) to z C = 15 (purple). The vertical line is at 81.3 MHz.

APPENDIX C: BULK WATER
The three rotational TCFs for bulk TIP4P-Ew water are shown in Fig. C1. At long times, the TCF decay is very nearly exponential, especially for the uniaxial TCFs (Fig. C1c). The IRCT obtained by integrating these TCFs is given in Table C1, which also includes results averaged over the bulk-like hydration shells n = 7 − 9 of GB1SPCE and GB1TIP3P. The ratio of the rank-1 and rank-2 uniaxial IRCTs is τ bulk R (U1)/τ bulk R (U2) = 2.558 ± 0.001 for TIP4P-Ew, 2.520 ± 0.001 for SPCE, and 2.628 ± 0.002 for the TIP3P model. As expected, C bulk I is significantly smaller for the rank-2 TCFs than for the rank-1 TCF. 17 O quadrupolar spin relaxation in H 2 O at 300.0 K yields 6 τ bulk R (B2) = 1.60 ps, 2.4 % below the TIP4P-Ew simulation result. 1 H− 17 O dipolar spin relaxation in H 2 O at 300.0 K yields 7-9 τ bulk R (U2) = 1.74 ps, 3.5 % below the TIP4P-Ew simulation result. For SPCE water, the three IRCTs are 7.5% (U1), 6.1% (U2), and 11.9% (B2) smaller than for TIP4P-Ew, and for TIP3P they are smaller by factors 2.5, 2.6, and 2.9, respectively. The TIP4P-Ew water model thus agrees better with experiment than the other two models. The TIP3P model not only has nearly three-fold too fast dynamics, as is well known; 10 it also has substantially larger libration amplitude (smaller C bulk I ) than the other two models. Apart from the IRCT τ bulk R , Table C1 also includes the correlation time τ bulk r obtained by fitting the exponential function C(τ ) = C a exp(−τ /τ bulk r ) to the TCF in the interval τ = 2 − 10 ps (Sec. VI). 0.5655 ± 0.0003 0.62491 ± 0.00007 6 − 9 a Integral rotational correlation time. b Correlation time obtained from an exponential fit to the TCF in the interval 2-10 ps.
c Results averaged over hydration shells n = 7 − 9 of GB1SPCE or GB1TIP3P. In panel (a), the individual data points with 0.1 ps resolution are shown as circles. In panel (c), the TCF is colored orange in the regression interval [τ 1 , τ 2 ] and the fitted exponential TCF tail is dashed.

APPENDIX D: BURIED WATER MOLECULES
For the purpose of the following analysis, we subdivide the buried class into two subpopulations: isolated-buried (iso) water molecules have no water-O atom within 3.3Å, whereas connected-buried (con) water molecules have at least one water-O atom within 3.3Å and z C > 15. The three TCFs and the associated IRCTs were computed for the isolated and connected subsets in the same way as for the external water populations (Sec. II.B).
The results in Table D1 show that the IRCT τ R > 1 ns for the isolated-buried water molecules in BPTI, UBQ, and AFP, as well as for the connected-buried water molecules in BPTI. To determine the longest correlation time in the non-exponential TCF (rather than the IRCT) and to compare it with the rank-k protein tumbling time τ (k) P , we fitted the exponential function C(τ ) = S 2 exp(−τ /τ eff ) to the TCF in the range τ > 0.5 ns (Fig. D1). The considerable statistical noise evident in Fig. D1 suggests that longer MD trajectories are needed to compute these TCFs accurately. Nevertheless, the results in Table D2 show that, in most cases (see below), the correlation times deduced from exponential fits agree to within the rather large standard error with the expected protein tumbling time. The latter was estimated from the Debye-Stokes-Einstein relation and an empirical factor 2 correction, 11 where V P is the apparent partial protein volume 12 and η W is the viscosity at 300 K of TIP4P-Ew, SPCE, or TIP3P water, taken to be 13, 19 and 64%, respectively, below the experimental value, 0.851 cP. A more refined estimate, computed with the molecular hydrodynamics program HYDROPRO 13 using an atomic radius 11 of 3.0Å and TIP4P-Ew water viscosity at 300 K, yields τ (2) P = 3.17 ns for AFP, close to the value, 3.25 ns, obtained from Eq. (D.1). The HYDRPRO calculation also yields a rotational anisotropy D // /D ⊥ = 1.6 for AFP. 5 From experimental 17 O MRD profiles, we can determine the number N int MRD of "internal" water molecules and their effective rotational correlation time τ int MRD . [Solution MRD actually yields the product N int MRD (S int MRD ) 2 , but N int MRD can be determined separately by using immobilized proteins. 14 ] The "internal" subset is operationally defined as those water molecules that give rise to a dispersion (frequency-dependent relaxation rate) in the accessible frequency range (typically, ≤ 81.3 MHz for 17 O), meaning that the effective correlation time must be 1 ns. Since the operational "internal" subset is not necessarily identical to the computational isolated-buried subset, we should not expect quantitative agreement between experimental and simulation results in Tables D1 and D2. Furthermore, some isolated-buried water molecules exchange too slowly ( µs) to contribute to the 17 O relaxation rate; they therefore contribute to N iso 1 but not to N int MRD . On the other hand, if the mean survival time τ S of a buried water molecule is not much longer than the protein tumbling time τ P , then the experimentally determined correlation time is τ int MRD = 1/(1/τ P +1/τ S ). Moreover, the criterion used here to define isolated-buried water molecules (no water neighbor within 3.3Å) S69 does not necessarily capture all water molecules that contribute to the MRD. Finally, the inevitable inaccuracies in the force field can produce significant errors in the local orientational order and exchange kinetics of buried water molecules. 15 With these caveats in mind, we now compare the computational results for isolated (and connected) buried water molecules with MRD-derived results.
For AFP, we find 5.47 isolated-buried water molecules, consistent with the five deeply buried water molecules in the crystal structure, 16 which, because of slow exchange, contribute very little to the room-temperature 17 O MRD profile. 5 Both the absolute values and the rank-1/rank-2 ratio of the correlation times are consistent with protein rotational diffusion. The connected-buried subset, with N con 1 = 0.92, clearly does not include (all of) the linear array of six ordered water molecules on the ice-binding surface, 5, 16 most of which are instead included in the first shell.
For UBQ, we find 1.28 isolated-buried water molecules, consistent with the assignment of the small experimental MRD to a single water molecule. 14, 17 The effective correlation time is significantly shorter than the protein tumbling time, especially for the rank-1 TCF, suggesting a mean survival time of 5 -10 ns. A similar value, 23 ± 3 ns, was deduced from 17 O MRD on immobilized ubiquitin. 14 All BPTI crystal structures have four buried water molecules, 18 one of which (W122) is in slow exchange for 17 O and does not contribute to the MRD profile at room temperature or below. 3 Our criteria identify 1.27 isolated-buried and 5.14 connected-buried water molecules in BPTI. The isolated subset presumably comprises W122 plus W113 in a minor subset of configurations where it is more than 3.3Å away from W112. The effective correlation time τ iso eff for the isolated subset is consistent with the corresponding protein tumbling time, and the ratio of the rank-1 and rank-2 correlation times agrees, within the standard error, with the ratio of 3 expected for rotational diffusion. The connected subset is likely dominated by W110, W111, W112, W113, and W143. The effective correlation time τ iso eff for the connected subset is smaller than the corresponding protein tumbling time, as expected since some of the water molecules in this subset should have mean survival times of order 1 − 10 ns.
For GB1, the IRCT is well below 1 ns (Table D1), consistent with the absence of a ∼ 10 MHz dispersion step in the MRD profile. 1 Therefore, the 0.65 isolated water molecules for GB1 are not likely to be deeply buried.
The order parameters S 2 s in Table D2, derived from the exponential fits in Fig. D1, depend on the chosen fit interval (τ > 0.5 ns) and should not be identified with the experimentally accessible local order parameter, which is more closely related to the quantity C iso a Rank-k protein tumbling time τ   Figure D1: Rotational TCFs, U1 (top), U2 (middle), and B2 (bottom), for the isolated-buried water molecules of BPTI, UBQ, and AFP, and for the connected-buried water molecules of BPTI. The dashed curves were obtained by fitting the two parameters in the function S 2 exp(−τ /τ eff ) to the data for τ > 0.5 ns.