Adaptive Resolution Simulation of an Atomistic Protein in Martini Water


Author e-mail addresses: a) s.j.marrink@rug.nl, b) praprot@cmm.ki.si

Molecular simulations provide insight into the physical basis of the structure, dynamics, and function of biological macromolecules, e.g., proteins [1]. Simulations yield complete trajectories of individual particles and can therefore be used to address specific questions about the properties of biomolecular systems that depend on these details. Water as a solvent profoundly affects structural, dynamical, and functional properties of proteins through both direct and hydrodynamic interactions [2]. A protein in turn influences the local structure and dynamics of water. To adequately describe this delicate interplay we have to resort to atomic resolution in our model. However, due to the large number of atoms, these systems are difficult to tackle by atomistic computer simulations. Moreover, the largest portion of the simulation time is spent on the solvent and not on the protein itself. To simplify the model of the system to the largest extent possible, while at the same time keeping the atomistic detail where it is needed, multiscale models of the solvent have been extensively developed [3–11]. In these hybrid methods, typically molecular dynamics (MD) or a similar approach is used for simulating dynamics on the atomistic scale, whereas the Navier-Stokes equation governs the fluid dynamics on the continuum scale. Alternatively, multiscale schemes have been introduced using particle-based models only, e.g., atomistic and physically simplified coarse-grained (CG) molecular models [12–27]. Among the most advanced methods for the latter kind of simulations is the Adaptive Resolution Scheme (AdResS) [21, 28–32], which allows for concurrent coupling from quantum all the way to continuum length scales of molecular liquids and soft matter.
In this work, we extend the range of applications of the AdResS scheme toward biological macromolecules. As a proof of principle, we simulated the 56-residue protein G, at full atomistic detail, solvated in water molecules that dynamically change resolution between an atomistic representation (a bundled version [33] of the Simple Point Charge (SPC) model [34]) and a mesoscopic one (the MARTINI CG model [35, 36]). The successful interfacing with the widely used MARTINI CG model opens up a range of future applications involving the broad variety of solutes and solvents parameterized for this force field.
We use the new multiresolution water to solvate protein G (PDB entry 1PGB) as an example of a well-studied protein [16]. The protein is always modeled at full atomistic resolution using the GROMOS 53a6 force field [37]. The solvent's level of representation depends on the distance from the protein's center of mass. For short distances we resort to a bundled version of the SPC model using a topology in which four water molecules are kept together using weak restraints [33]. This allows for exchange of the bundled water with the CG MARTINI water model, which represents four water molecules by a single CG bead. At larger distances, the MARTINI water model is employed. The water molecules change their resolution from four molecules to one CG bead and vice versa adaptively according to their current position. A schematic representation of the system is depicted in Fig. 1.

FIG. 1. A schematic cross section of the simulation box with spherical adaptive resolution regions. Two levels of resolution are used for solvent molecules. The high level of resolution is used for solvent molecules within a certain radius from the protein's center of mass. There the solvent is modeled with the bundled water model. The low level of resolution is used for solvent molecules at larger distances from the protein's center of mass, where the solvent is modeled with the MARTINI CG water (one bead represents four water molecules). The high resolution sphere moves with the protein's center of mass. Protein G is thus fully atomistic at all times, but is shown here in cartoon representation (secondary structure) for clarity.
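The four-to-one mapping described above can be illustrated with a minimal sketch, assuming the CG bead is placed at the center of mass of the four bundled water molecules. The function names and the standard atomic masses are assumptions of this illustration, not taken from the simulation code, and periodic images are ignored.

```python
# Standard atomic masses in atomic mass units (an assumption of this sketch).
M_O, M_H = 15.9994, 1.008

def center_of_mass(atoms):
    """atoms: list of (mass, (x, y, z)) tuples; returns the COM as a tuple."""
    total = sum(m for m, _ in atoms)
    return tuple(sum(m * pos[i] for m, pos in atoms) / total for i in range(3))

def bundle_to_bead(waters):
    """Map a bundle of four water molecules onto one CG bead position.

    waters: list of four molecules, each a list of (mass, position) tuples.
    Periodic boundary conditions are not handled here.
    """
    if len(waters) != 4:
        raise ValueError("a bundle holds exactly four water molecules")
    atoms = [atom for molecule in waters for atom in molecule]
    return center_of_mass(atoms)
```

Because all four molecules have the same mass, the bead position reduces to the average of the four molecular centers of mass.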
By using the bundled water as our atomistic water model we avoid theoretical difficulties, both of AdResS [21] and of the coarse-graining of several nonbonded particles (e.g., solvent molecules) into one CG bead [38], that arise when the atomistic molecules can drift apart (on a picosecond timescale in the case of water). It has been shown that the changes introduced by the bundling most strongly affect the self-interactions of the water molecules [33]. This results in a slightly modified water structure and consequently different dynamics. On the other hand, interactions with other molecules stay mostly unchanged, and important properties of SPC water, such as the density and the free energies of hydration of small molecules, are well preserved. Applications of bundled water to biomolecular systems, e.g., a lipid bilayer and a protein, show that stronger bundling may lead to some artifacts, such as increased penetration of water into lipid bilayer interfaces and globular proteins, probably because of increased hydration of ionic species [33]. To avoid artifacts close to certain proteins, e.g., incorrect filling of protein water pockets, one could gradually weaken the springs closer to the protein, to such an extent that the clusters can deform to adapt to any local shape at no energetic cost. However, even with the current bundling the clusters can take many shapes.
The multiscale MD simulations are carried out, as mentioned above, using the AdResS method, where the total force acting on a bundle $\alpha$ is

$$\mathbf{F}_\alpha = \sum_{\beta \neq \alpha} w(|\mathbf{R}_\alpha - \mathbf{R}|)\, w(|\mathbf{R}_\beta - \mathbf{R}|)\, \mathbf{F}^{\mathrm{ex}}_{\alpha\beta} + \sum_{\beta \neq \alpha} \left[ 1 - w(|\mathbf{R}_\alpha - \mathbf{R}|)\, w(|\mathbf{R}_\beta - \mathbf{R}|) \right] \mathbf{F}^{\mathrm{cg}}_{\alpha\beta} - \mathbf{F}^{\mathrm{TD}}(|\mathbf{R}_\alpha - \mathbf{R}|),$$

where $\mathbf{F}^{\mathrm{ex}}_{\alpha\beta}$ and $\mathbf{F}^{\mathrm{cg}}_{\alpha\beta}$ are the forces between bundles $\alpha$ and $\beta$, obtained from the explicit atomistic and CG potentials, respectively. The sigmoidal function $w \in [0, 1]$ is used to smoothly couple the high and low resolution regimes, where $\mathbf{R}_\alpha$, $\mathbf{R}_\beta$, and $\mathbf{R}$ are the centers of mass of bundles $\alpha$ and $\beta$, and of the protein, respectively. The thermodynamic force $\mathbf{F}^{\mathrm{TD}}$ acts on the bundles' centers of mass in the hybrid region and ensures the thermodynamic equilibrium of the system [29, 39]. $\mathbf{F}^{\mathrm{TD}}$ is computed with an iterative procedure as described in the literature and shown in the supplementary material [40].

The atomistic region around the protein has a spherical shape, with the distance between the hybrid domain border and the center of mass of the protein fixed. Hence, the resolution domains follow the random translation of the protein, with the center of the atomistic region coinciding with $\mathbf{R}$ at all times. Our assumption, which is confirmed by the values of the respective diffusion coefficients (see below), is that the protein moves slowly compared to the solvent molecules. This is necessary so that solvent molecules can equilibrate their degrees of freedom adequately when crossing the borders of domains with different resolution. By varying the size of the atomistic region, we gain insight into the extent to which the bulk influences the local hydrogen bond network in the hydration shell. To this end, we investigate three atomistic sphere radii: 3.2, 3.4, and 3.6 nm. In all cases the center of the atomistic sphere moves with the protein's center of mass. As a reference, we use a simulation where the atomistic region extends across the whole simulation box as well as an atomistic simulation of the protein solvated with the SPC water model. Additionally, we have performed simulations of the protein enclosed in an atomistic water droplet in vacuum.
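The pairwise force interpolation of the AdResS scheme can be sketched as follows. This is a minimal illustration only: the text states that w is sigmoidal, and the cos² switching form, the hybrid shell width `d_hy`, and all function names used here are assumptions. The thermodynamic force is omitted.

```python
import math

def weight(r, r_at, d_hy):
    """Resolution weight w in [0, 1] at distance r from the protein COM.

    r_at: radius of the atomistic sphere; d_hy: width of the hybrid shell.
    The cos^2 switch is an assumed functional form, not taken from the text.
    """
    if r <= r_at:
        return 1.0
    if r >= r_at + d_hy:
        return 0.0
    return math.cos(math.pi * (r - r_at) / (2.0 * d_hy)) ** 2

def pair_force(f_ex, f_cg, r_alpha, r_beta, r_at, d_hy):
    """Interpolated force between bundles alpha and beta.

    f_ex, f_cg: explicit and CG force vectors (3-tuples);
    r_alpha, r_beta: distances of the two bundles from the protein COM.
    """
    lam = weight(r_alpha, r_at, d_hy) * weight(r_beta, r_at, d_hy)
    return tuple(lam * fe + (1.0 - lam) * fc for fe, fc in zip(f_ex, f_cg))
```

With both bundles inside the atomistic sphere the product of weights is 1 and only the explicit force survives; as soon as one bundle lies in the CG region the product vanishes and the pair interacts purely through the CG potential.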
We test two models (models 1 and 2 [33]) for bundling of atomistic waters. The two models differ in the harmonic spring force constant $k_s$ and the $C_{12}$ Lennard-Jones parameter between oxygen atoms (model 1: $k_{s1} = 1000$ kJ mol$^{-1}$ nm$^{-2}$ and $C_{12} = 3.25 \times 10^{-6}$ kJ mol$^{-1}$ nm$^{12}$; model 2: $k_{s2} = 4000$ kJ mol$^{-1}$ nm$^{-2}$ and $C_{12} = 3.45 \times 10^{-6}$ kJ mol$^{-1}$ nm$^{12}$). Comparing the results for both models, we found no noticeable differences. Therefore, we show here only the results for model 1 and direct the reader to the supplementary material for model 2 results [40]. The protein is modeled with the GROMOS 53a6 force field [37]. A cutoff of 1.2 nm is used for the nonbonded interactions. Electrostatic interactions are computed using a reaction field method [41] with dielectric constants of 54 [33] and 80 for the bundled and SPC water, respectively. All simulations are performed in a cubic simulation box with an edge length of 10.8 nm using periodic boundary conditions. The temperature is maintained at 300 K in all simulations by the Langevin thermostat with a coupling constant of 25.0 ps$^{-1}$. After equilibration, production runs of 15 ns were computed using a 1 fs time step. All simulations are performed using the ESPResSo++ software package [42].

First, we check that our adaptive resolution simulations using the new multiscale water model correctly reproduce the structure of the solvent and protein. In Fig. 2, we show the solvent density around the center of mass of the protein for the three atomistic region sizes. The domain with a decreased solvent density around the protein is well within the atomistic region in all cases. All-atom bundled and AdResS simulations give comparable results, while the all-atom SPC simulation gives a slightly shifted curve. This is because we calculate the density profile for bundled water using the bundles' centers of mass and not the centers of mass of the individual waters in the bundles. The corresponding oxygen density profiles (the oxygen position nearly coincides with the center of mass of an individual water molecule), in which this shift in the density of water around the protein disappears, are reported in the supplementary material [40]. Note that the solvent density at larger distances from the protein is equal to the bulk density in all adaptive resolution domains.
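A radial density profile of the kind shown in Fig. 2 can be obtained by histogramming the distances of the solvent sites from the protein's center of mass and normalizing by the shell volumes. The following is a minimal single-frame sketch; the function name and arguments are illustrative assumptions, and neither the minimum-image convention nor averaging over frames is included.

```python
import math

def radial_density(positions, center, r_max, n_bins):
    """Number density in spherical shells around `center`.

    positions: iterable of (x, y, z) site coordinates in nm, e.g. the
    bundles' centers of mass or the water oxygens.
    Returns a list of (shell midpoint in nm, density in nm^-3) pairs.
    """
    dr = r_max / n_bins
    counts = [0] * n_bins
    for p in positions:
        r = math.dist(p, center)
        if r < r_max:
            # Guard against floating-point rounding at the outer edge.
            counts[min(int(r / dr), n_bins - 1)] += 1
    profile = []
    for i, n in enumerate(counts):
        shell = 4.0 / 3.0 * math.pi * (((i + 1) * dr) ** 3 - (i * dr) ** 3)
        profile.append(((i + 0.5) * dr, n / shell))
    return profile
```

Passing bundle centers of mass or individual oxygen positions as `positions` corresponds to the two kinds of profile compared in the text.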
To show that the multiscale simulation does not affect the structural properties of the protein, we have computed the root-mean-square deviation (RMSD) and the root-mean-square fluctuations (RMSF) of the backbone atoms with respect to the crystal structure. The results plotted in Fig. 3 are in agreement with previously published results [16] and indicate that the structure of the protein is stable in all simulations. The average RMSD values (in nm) are 0.17 ± 0.05, 0.12 ± 0.03, 0.17 ± 0.02, 0.16 ± 0.01, and 0.19 ± 0.02 for the all-atom SPC, all-atom bundled SPC, and AdResS bundled (with the three atomistic region sizes) solvation, respectively.
We have also assessed the protein stability by computing the protein's radius of gyration, the stability of native contacts and secondary structure, and the solvent accessible surface area (see the supplementary material) [40]. The results confirm that the protein remains stably folded and resides in the atomistic domain in all our simulations. Should the protein simultaneously extend over multiple adaptive resolution domains, one would have to use a different approach, as in Ref. 29.

FIG. 3. RMSD (top) and RMSF with error bars (bottom) of the backbone atoms with respect to the crystal structure as a function of time. We compare the results obtained from the fully atomistic simulations using SPC and bundled waters to AdResS simulations with three atomistic region sizes: spheres of radii 3.2 nm, 3.4 nm, and 3.6 nm, respectively.
To further validate our new approach, we also compare the dynamic properties of the protein and solvent for the atomistic and multiscale water models. The diffusion coefficients are determined from the mean square displacement using a finite size correction [44]. In particular, the computed values (in units of $10^{-9}$ m$^2$ s$^{-1}$) for the protein's diffusion coefficient are 0.13 ± 0.04, 0.08 ± 0.03, and 0.07 ± 0.02 for the all-atom SPC, all-atom bundled, and AdResS bundled (the same for all the atomistic region sizes) solvation, respectively. The obtained coefficients, which match within the error bars, are much smaller than the bundled water diffusion coefficient of 1.8 ± 0.1. This result also justifies our initial assumption that the protein moves slowly compared to the solvent, which underlies moving the center of the atomistic region along with the protein's center of mass. More detailed results on the multiscale water itself will be published elsewhere.
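As an illustration of how a diffusion coefficient follows from the mean square displacement (MSD), the sketch below fits a straight line to MSD(t) and applies the three-dimensional Einstein relation, D = slope / 6. The function name is an assumption of this example, and the finite size correction of Ref. 44 is not included.

```python
def diffusion_coefficient(times, msd):
    """Least-squares slope of MSD(t) divided by 6 (Einstein relation in 3D).

    times in ps and msd in nm^2 give D in nm^2/ps; 1 nm^2/ps = 1e-6 m^2/s.
    Only the linear (diffusive) part of the MSD should be passed in.
    """
    n = len(times)
    t_mean = sum(times) / n
    m_mean = sum(msd) / n
    num = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den / 6.0
```

In these units, the bundled water value of 1.8 × 10⁻⁹ m² s⁻¹ quoted above corresponds to 1.8 × 10⁻³ nm² ps⁻¹.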
In conclusion, we have presented a multiscale simulation of a solvated protein employing a hybrid atomistic/mesoscopic water model. The atomistic domain was limited to a sphere embedding the protein, while the mesoscopic MARTINI water model was used farther away. This leads to a significant speedup in comparison to a fully atomistic simulation. The speedup is due to the coarse-grained model, which reduces the number of solvent particles by a factor of 12 (four three-site water molecules are replaced by one bead) and introduces softer interactions. The actual speedup of AdResS simulations depends on the ratio between the atomistic and coarse-grained domain sizes and can reach up to an order of magnitude when using the same integration time step in the atomistic and CG regions [45]. This is important because the hydrodynamic interactions are long-ranged and we need large systems to avoid finite size effects. Our approach falls in between the multiresolution approaches concurrently coupling atomistic water to a CG water, where one bead represents one water molecule [21], and hybrid methods interfacing an atomistic description with a continuum, e.g., Refs. 10 and 11. Hence, it bridges the hydrodynamics from the atomic to the mesoscopic scale and enables the study of biophysical phenomena that are beyond the scope of either atomistic or mesoscopic simulations. Future work will include extension of this methodology to polarizable CG water models [46, 47] and salt solutions.
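To give a feeling for the ratio between the domain sizes, the following sketch computes the fraction of the cubic simulation box occupied by the atomistic sphere for the three radii used here. It is a rough geometric estimate only: the hybrid shell is not counted, the function name is illustrative, and the fraction is not a direct measure of the speedup.

```python
import math

def atomistic_volume_fraction(radius, box_length):
    """Fraction of a cubic box of edge `box_length` filled by a sphere."""
    return 4.0 / 3.0 * math.pi * radius ** 3 / box_length ** 3

# Radii (nm) and box edge (nm) as given in the text.
fractions = {r: atomistic_volume_fraction(r, 10.8) for r in (3.2, 3.4, 3.6)}
```

For the 10.8 nm box, the atomistic sphere fills roughly 11% to 16% of the volume, so most of the solvent is treated at the CG level.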

FIG. 2. Solvent density around the center of mass of protein G for AdResS simulations with atomistic region radii of 3.2 nm (top), 3.4 nm (middle), and 3.6 nm (bottom). The plots include error bars [43]. The results are compared to the fully atomistic bundled and SPC solvations. The vertical lines denote the boundaries between resolution domains, i.e., the atomistic (AT), hybrid (HY), and coarse-grained (CG) domains.