Wavelet Monte Carlo dynamics: a new algorithm for simulating the hydrodynamics of interacting Brownian particles

We develop a new algorithm for the Brownian dynamics of soft matter systems that evolves time by spatially correlated Monte Carlo moves. The algorithm uses vector wavelets as its basic moves and produces hydrodynamics in the low Reynolds number regime propagated according to the Oseen tensor. When small moves are removed, the correlations closely approximate the Rotne-Prager tensor, itself widely used to correct for deficiencies in Oseen. We also include plane wave moves to provide the longest-range correlations, which we detail for both infinite and periodic systems. The computational cost of the algorithm scales competitively with the number of particles simulated, $N$: as $N\ln N$ in homogeneous systems and as $N$ in dilute systems. In comparisons to established lattice Boltzmann and Brownian dynamics algorithms, the wavelet method was found to be only a factor of order one more expensive than the cheaper lattice Boltzmann algorithm in marginally semi-dilute simulations, while it is significantly faster than both algorithms at large $N$ in dilute simulations. We also validate the algorithm by checking that it reproduces the correct dynamics and equilibrium properties of simple single-polymer systems, as well as by verifying the effect of periodicity on the mobility tensor.


I. INTRODUCTION
Brownian dynamics (BD) algorithms aim to simplify soft matter simulations by replacing the large number of degrees of freedom in the solvent with known hydrodynamic interactions (HIs), which simply need to be calculated between the N particles of interest at each time step. [1] This contrasts with explicit solvent algorithms, which follow solvent molecules at some level of coarse graining, despite not being interested in them directly, so that they mediate viscosity and HIs through local interactions. Such methods include molecular dynamics (MD), [2] dissipative particle dynamics (DPD), [3] lattice Boltzmann (LB) [4-6] and multi-particle collision dynamics (MPCD) [7] algorithms. The computational cost of these methods, i.e. the time taken to evolve a system by some physical amount of time, scales linearly with the total number of particles, including particles in the solvent as well as those of interest. For systems of fixed concentration this leads to the cost scaling as N, [8] though the overhead cost of moving the solvent molecules limits the feasible system size, especially as the systems become more dilute. For a non-periodic system the scaling rises: the important example of a single polymer chain leads to N^{3ν}, with ν the Flory exponent, in order to fit the whole chain inside the simulation box. [6]

Despite reducing the number of degrees of freedom significantly, the cost of conventional BD algorithms, which is dominated by the decomposition of the mobility tensor, limits their ability to simulate large-N systems. Fixman's algorithm is well known to cut the scaling of this decomposition from the naive N^3 [9] down to N^{2.25}, and several methods have since been put forward to reduce the scaling further. [8,10] It has even been reduced to, or near, N ln N in some cases, although these approaches are only valid for bounded systems, [11] or introduce errors to allow more efficient computation via particle mesh Ewald techniques [12,13] or sparse arrays. [14]

In this work we present an entirely different approach to BD, using a Monte Carlo (MC) algorithm to bypass explicit calculations with the mobility tensor altogether. To date, MC methods in the field have been used primarily to study equilibrium properties, since they have not accounted for HIs. [17,18] So-called bridging moves have also been introduced to handle branching polymers, [19,20] and more recently 'event-chain' algorithms have been introduced to handle hard-sphere particles. [21,22]

An important bridge between the MC methods above and our hydrodynamically coupled method below lies in the work of Maggs. [23] Maggs introduced spatially extended correlated MC moves which he tuned to maximise the equilibration rate of a simple fluid system. He did not target HIs per se, but he was led to motion of the same scaling with arbitrary moves. We focus on moves described by wavelets, a class of functions that has seen use in many areas of physics, particularly signal processing, because they form an (over)complete, localised basis that allows temporal changes in a signal to be identified. [24,25] The development of wavelet theory is catalogued in Ref. 26. For this work, their role as basis functions enables us to re-express the mobility tensor in a form readily transferable to a MC simulation, and hence we call the method 'Wavelet Monte Carlo dynamics' (WMCD). We will show that this approach leads to a cost scaling that is at worst N ln N per physical unit of time, with a very competitive prefactor and no assumptions on the system beyond those already in basic BD algorithms.
In section III we begin with sketch calculations to highlight how HIs and the cost scaling arise in WMCD without needing the full details. Section IV addresses the background physics and mathematics required for the method, before section V explains the wavelet method in full. Section VI describes a necessary modification to include occasional plane wave moves, before results of simple validation simulations are given in section VII.

II. NOTE ON SYSTEMS, UNITS AND HARDWARE
For the data given in this paper, though not required by the theory, we have simulated polymeric systems in a good solvent. The polymers are represented by bead-spring chains with finitely extensible non-linear elastic (FENE) springs, and with the Weeks-Chandler-Andersen (WCA) potential acting between all particles at positions r_i:

U_WCA(r) = 4ε[(σ/r)^12 − (σ/r)^6] + ε for r < 2^{1/6}σ, and 0 otherwise;

U_FENE(r) = −(1/2) k_FENE R_0^2 ln(1 − r^2/R_0^2) for r < R_0.

We adopt as length and energy units 2^{1/6}σ and ε, and denote non-dimensionalised quantities with a bar. Therefore, to match the physical systems in Refs. 6 and 8 for benchmarking performance, we use σ̄ = 2^{−1/6}, ε̄ = 1, k̄_FENE = 7 × 2^{2/6} and R̄_0 = 2 × 2^{−1/6}. Matching the hydrodynamic radii of individual particles, a, leads to the minimum wavelet radius (introduced in section V D) λ_min = 0.700, while the thermal energy is k_B T = 1.2.
We define our unit of time, τ, as the time over which completely isolated or non-interacting particles are expected to have diffused by their own radius. By requiring the same viscosity η as used in Refs. 6 and 8, which only appears implicitly in our algorithm, this unit of time is 4.04 times smaller than in those papers. Finally, to match the systems in Ref. 8, whenever a system is called 'semi-dilute' it consists of polymer chains of length N_b = 10 beads and a global bead concentration N/L̄³ = 0.625, where L is the side length of our simulation box.
All data were obtained using the gcc compiler with optimisation -O3, on a single core of an Intel Core 2 Quad Q9400 CPU at 2.66 GHz.

III. METHOD SKETCH AND MOTIVATION
We seek a MC algorithm that produces hydrodynamics in a simple and efficient way. The moves considered are displacements inside spheres of radius λ centred on position b, chosen from probability density functions (PDFs) P_λ and P_{b|λ} respectively. Here we have anticipated some λ-dependence in the distribution for b. The orientation of these moves, p, is unbiased, so that average motion is only induced by non-zero forces weighting the Metropolis test. An example of such a move is depicted in Fig. 1.
A move will only contribute to correlated motion of particles i and j if it encloses both of them, requiring 2λ ≥ r_ij and for b to land in a volume of order λ^d in d dimensions. The contribution from a given move therefore has a piecewise form, where A_w is a displacement amplitude that is related to strain by ε ∝ A_w/λ if, for simplicity in this section, the strain is assumed to be uniform over the move. In that case the estimate for the change in energy of a move containing n particles is ∆U_est ∼ nε², which the Metropolis test constrains to not be large compared to k_B T. This sets A_w ∼ λ/√n. For P_{b|λ} we note that moves with n = 0 contribute no particle motion and are best avoided altogether. Those with n ≥ 1 can be chosen by first picking a particle at random, and then choosing b at random within distance λ. P_{b|λ} is then exactly proportional to n/(N λ^d), since each of the n particles could have led to that centre being chosen.
Finally we pick P_λ ∼ λ^{−d−1}, so that integrating Eq. (4) over radii gives ⟨δr_i · δr_j⟩ ∼ r_ij^{2−d}, matching the required form for hydrodynamics in d dimensions.
We now know the algorithm produces the desired dynamics, so we turn our attention to estimating the computational cost to evolve our system by a physical amount of time. This will be equal to the cost per move divided by the time evolved per move. The time factor comes from the previous result, using Brownian motion to set it proportional to r_ij^{2−d} δt, so that we have a time evolution per move going as δt ∼ 1/N. (6) Meanwhile, the cost per move of radius λ is of order the expected number of particles involved, n(λ), so the increment in the computational cost is given by δC = c P_λ n(λ) dλ, with c being the cost per particle. In the homogeneous case the cost integral diverges logarithmically, leading to C ∼ cN ln(λ_max/λ_min). The factor of N here simply reflects the distribution of computational effort across the system, while the ln N will be seen in section V F to stem from an increase in allowed λ values. It is also shown there that for a fractal system, such as a single polymer chain, the logarithm is no longer present. The method outlined above is therefore very simple, with a favourable cost scaling compared to conventional BD and without the solvent degrees of freedom of explicit solvent methods. The rest of this paper describes the method in detail for d = 3.
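The logarithmic divergence of the homogeneous cost integral can be checked numerically. The sketch below (Python; the values of c, s and the λ bounds are illustrative assumptions, not the paper's parameters) integrates δC = c P_λ n(λ) dλ with P_λ ∝ λ^{−4} normalised on [λ_min, λ_max] and n(λ) = (λ/s)³:

```python
import numpy as np

def cost_per_move(lam_min, lam_max, s=1.0, c=1.0, n_grid=200_000):
    """Integrate dC = c * P_lambda(lam) * n(lam) dlam for a homogeneous
    (d_f = 3) system, with P_lambda ∝ lam^-4 and n(lam) = (lam/s)^3,
    so the integrand reduces to a 1/lam tail."""
    lam = np.linspace(lam_min, lam_max, n_grid)
    norm = 3.0 / (lam_min**-3 - lam_max**-3)   # normalisation of P_lambda
    y = norm * lam**-4 * (lam / s) ** 3
    # explicit trapezoidal rule (portable across NumPy versions)
    return c * float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(lam)))
```

Since the integrand reduces to λ^{−1}, the cost grows by roughly 3 ln 10 each time λ_max increases tenfold, consistent with the ln N factor that appears once λ_max ∼ L ∼ N^{1/3}.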

IV. MOBILITY TENSORS
We focus on the low Reynolds number limit, in which the dynamics of the fluid solvent is governed by the Stokes equations. For simplicity we assume we have spherical particles subject to no applied torques, so the dynamics in the system reduces to interrelating the particle velocities, v_i, to the applied forces F_j, the latter including interparticle forces. In general this is given by v_i = Σ_j G_ij · F_j + δv_i, where G is the mobility tensor and the superposability of the velocity response follows from the linearity of the Stokes equations. The contributions from random thermal fluctuations, δv_i, have covariance set by the fluctuation-dissipation theorem (FDT), which in the Stokes limit reduces to ⟨δv_i(t) ⊗ δv_j(t′)⟩ = 2 k_B T G_ij δ(t − t′). In practice we need to implement displacements across a small but non-zero time interval δt, in terms of which this becomes ⟨δr_i ⊗ δr_j⟩ = 2 k_B T G_ij δt. (9) In general the mobility tensor depends on the entire configuration of particle positions, and here we have to simplify, following other workers in using the leading dilute-limit expressions. Thus for the self terms, with i = j, we take the Stokes form G_ii = I/6πηa, where a is assumed monodisperse for simplicity of exposition. For the cross terms we have G_ij(r_i, r_j) = g(r_i − r_j) when the interference of third particles is ignored, and then to leading order in powers of the separation r = |r_i − r_j| we obtain the Oseen tensor, [27] g_Oseen(r) = (1/8πηr)(I + r̂ ⊗ r̂), corresponding to particles comoving with the bare solvent response; as a result, ∇ · g_Oseen(r) = 0.
As is well known, the Oseen tensor alone does not ensure a mobility matrix with non-negative eigenvalues [28] (crucial for the FDT result), so some modification must always be made at small r to remedy this. It has become standard in the recent literature to use the Rotne-Prager (RP) form. [29] This incorporates the next-leading power of distance, which also turns out to be divergence free: g_RP(r) = (1/8πηr)[(1 + 2a²/3r²) I + (1 − 2a²/r²) r̂ ⊗ r̂] for r ≥ 2a, with a branch for 0 ≤ r ≤ 2a given by g_RP(r) = (1/6πηa)[(1 − 9r/32a) I + (3r/32a) r̂ ⊗ r̂]. In WMCD we implement the Oseen tensor modified by a cut-off in the wavelet spectrum. This is positive definite, and in section V G we show that it can lead to a tensor very close to g_RP. It should be noted that whilst these modifications render the total mobility tensor positive definite, they still underrepresent the reduction in mutual mobility of two particles at close approach, for which r̂ · g · r̂ should match the self term 1/6πηa as r → 2a.

A. Wavelet representation of the Oseen tensor
Our method is an adaptation of the continuous wavelet transform in three dimensions, itself based on an identity for the Dirac delta function, Eq. (13). The choice of 'mother wavelet' shape is surprisingly arbitrary, with real wavelets constrained only by formal requirements, Eq.s (14) and (15), in which the normalising front factor is given in terms of the wavelet Fourier transform w̃(k). For this work it is advantageous to impose additional constraints. The first is to limit the support of w so that w(x) = 0 for |x| > 1. (16) Then it is clear that w((r − b)/λ) is non-zero only over a region of radius, or 'scale', λ centred on b.
Next we modify Eq. (13) in two ways. First we make the wavelet a vector-valued function, w((r − b)/λ, p), with an additional input 'polarisation' variable, p. We require these wavelets to be divergence-free, ∇ · w(x, p) = 0, (17) so that g inherits this property below. It is this constraint of the wavelet being transverse which forces us to supply p. Our vector wavelet can now be thought of as a flow field in the vicinity of b extending over a lengthscale λ.
Our second modification is to change the explicit power of λ in Eq. (13) so that the dimensions match those of the Oseen tensor. This leads us to consider Eq. (18). In Appendix A we show that this is exactly equal to O(r_i − r_j) when λ is integrated from 0 to ∞, with an appropriate choice of the constant N_g. Eq. (18) is more enlightening when expressed as an expectation value. In doing so we introduce a wavelet amplitude A_w, which subsumes all constant factors but may also depend on the wavelet parameters. This dependence is found in section V C, but for now we assume it is known and write Eq. (19), where subscripts on the angle brackets indicate quantities averaged over. These subscripts are left implicit in later results. Finally, comparison between Eq.s (9) and (19) immediately identifies A_w w((r − b)/λ, p) as the displacement δr.

V. DESCRIPTION OF THE WAVELET METHOD
With the expectation value interpretation in Eq. (19), we do not need to compute the integral for each particle at every time step, and can instead sample wavelets with appropriate distributions for λ, b and p. As these moves add up, the correlated motion of particles reproduces that of the Oseen tensor, with thermal noise entering via the stochastic move generation.
The details of how this is implemented are given below and in Appendix D, but the basic structure of the algorithm follows the process:
1: generate wavelet parameters from their distributions;
2: provisionally move particles according to the resulting wavelet;
3: calculate the energy change, ∆U, caused by this move and accept or reject it with the Metropolis probability min(1, e^{−∆U/k_B T}).
The process is then repeated for a desired number of moves.
In the Metropolis test there is no need to include a term for the Jacobian of the move, as per Maggs, [23] because we only consider moves with divergenceless flows.
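A minimal sketch of this loop in Python follows; the energy function, the parameter distributions and the wavelet shape here are placeholders, not the paper's forms:

```python
import numpy as np

rng = np.random.default_rng(0)

def wavelet_displacement(r, b, lam, p, A_w):
    """Stand-in for A_w * w((r - b)/lam, p): the field p × x damped by a
    taper that vanishes outside |x| = 1.  Because p × x is perpendicular
    to the gradient of the radial taper, this field is divergence-free."""
    x = (r - b) / lam
    s2 = np.sum(x * x, axis=-1)
    taper = np.where(s2 < 1.0, (1.0 - s2) ** 2, 0.0)
    return A_w * taper[:, None] * np.cross(p, x)

def mc_step(positions, energy, sample_params, beta=1.0):
    """One move: sample parameters, displace provisionally, then accept
    or reject with the Metropolis probability min(1, exp(-beta * dU))."""
    b, lam, p, A_w = sample_params()
    trial = positions + wavelet_displacement(positions, b, lam, p, A_w)
    dU = energy(trial) - energy(positions)
    accept = dU <= 0.0 or rng.random() < np.exp(-beta * dU)
    return (trial, True) if accept else (positions, False)
```

The divergence-free placeholder field plays the role that Eq. (17) demands of the true tapered wavelet; only the move shape, not the loop structure, would change in a full implementation.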

A. Choice of mother wavelet
The restrictions in Eq.s (14)-(17) still leave a choice for w. In this article we satisfy these conditions by using the form in Eq. (21), for some scalar function φ.
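The displayed form did not survive extraction here, but a standard way to satisfy Eq. (17) with a scalar profile is the curl construction w(x, p) = ∇ × [p φ(|x|)] = (φ′(|x|)/|x|) x × p; we assume that family for the sketch below. It uses an illustrative Gaussian φ (not the paper's compactly supported tapered wavelet) and checks ∇ · w = 0 by central differences:

```python
import numpy as np

def w_curl(x, p):
    """w(x, p) = curl[p * phi(|x|)] with the illustrative profile
    phi(s) = exp(-s^2), for which phi'(s)/s = -2 exp(-s^2), giving
    w = -2 exp(-|x|^2) * (x × p).  Divergence-free by construction."""
    return -2.0 * np.exp(-np.dot(x, x)) * np.cross(x, p)

def divergence(f, x, h=1e-5):
    """Central-difference estimate of div f at the point x."""
    div = 0.0
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        div += (f(x + e)[i] - f(x - e)[i]) / (2.0 * h)
    return div
```

Any radial φ in this construction gives a transverse field, so the remaining freedom is exactly the choice of scalar profile described in the text.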
We also choose to limit ourselves to continuous wavelets so that the strain tensor is finite everywhere. This is required by neither wavelet theory nor the algorithm, but does simplify some of the analysis. Further simplification comes by abbreviating the m-th moment of the square of the Fourier transform of φ as M_m. For the data in this article we have used the 'tapered wavelet', defined in spherical polar coordinates (r, θ, ϕ) with the polarisation vector along the z-axis (θ = 0), together with its associated φ and Fourier transform.

B. Distributions of wavelet orientation and centres
With the mother wavelet chosen, we now need to determine the distributions from which to pick its parameters. For p we take an isotropic distribution, with P_p = 1/4π and wavelet amplitude A_w independent of p.
The PDF for b is more involved, as we want to avoid spending CPU time on moves that contain no particles and hence do not evolve the system of interest. To ensure all wavelets contain at least one particle, we first pick a particle, each with equal probability, and then choose b uniformly inside a sphere of radius λ centred on this particle.
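A sketch of this centre-sampling step (Python; the uniform-in-ball draw via u^{1/3} is the standard trick, everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_centre(positions, lam):
    """Pick a particle uniformly, then pick the wavelet centre b uniformly
    inside the sphere of radius lam around it, guaranteeing the move
    encloses at least one particle (n >= 1)."""
    i = rng.integers(len(positions))
    v = rng.standard_normal(3)
    v /= np.linalg.norm(v)                       # isotropic direction
    radius = lam * rng.random() ** (1.0 / 3.0)   # uniform over the ball's volume
    return positions[i] + radius * v
```

Each of the n particles inside the resulting sphere could have generated the same centre, which is the origin of the factor of n in P_{b|λ} that the bias accounting must handle.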
This approach introduces biases that need to be accounted for. First, the probability of choosing a position inside a volume element is inversely proportional to the volume of the sphere. Similarly, the chance that any given particle is chosen is inversely proportional to N. Lastly, if the resulting wavelet contains n particles, there must have been n possible ways to have chosen it, and hence the probability is increased by this factor. All combined, we have the PDF for b as P_{b|λ} = n(b, λ)/(N · (4/3)πλ³).

C. Choosing the wavelet amplitude

Before the PDF for λ can be determined, we need to know the form of A_w. To find this we note that if we want the distribution of λ to be correctly reflected in the accepted moves, we want P_acc, and hence ∆U, to be as constant as possible over all moves.
An estimate of ∆U can be made from the strain energy associated with a move, where µ is a system-dependent particle interaction energy expected to be of order k_B T in soft-matter systems, leading us to estimate the local shear modulus as 3µn/(4πλ³); ε is the move's strain tensor, and : denotes a double dot product. On the assumption of small displacements we use the infinitesimal strain tensor ε = (A_w/2)[∇w + (∇w)^T]. With no preferred direction, n accounts for all location dependence, so b and p can be left out of the calculation, leaving only the transformation r → r/λ from the mother wavelet. In Appendix B we show how the energy estimate can be written as Eq. (28) for any wavelet of the form in Eq. (21). For this to be constant over moves we must choose A_w as in Eq. (29), with A_0 a dimensionless tuning parameter that scales the maximum rotation angle of the wavelet vector field, and which must therefore be kept small enough for the infinitesimal strain approximation to remain valid. To maintain simplicity in our results we use A_0 as a completely free parameter, but we comment that, via Eq. (28), it is fully defined by Eq. (30), highlighting the contributions from the choice of mother wavelet (M_6), the system being simulated (µ/k_B T), and the energy scale we are really choosing (∆U_est/k_B T).
The interplay is seen in Fig. 2, with both increasing A_0 and increasing system density, i.e. µ, increasing ∆U_est and hence decreasing P_acc. Fig. 2 also reveals that choosing A_w as per Eq. (29) leads to constant P_acc only asymptotically, where it limits to a value P_acc^asym. For small wavelets the strain energy estimate is inaccurate, because seeing a relative displacement from ∇w, rather than from w itself, requires both particles in an interacting pair to be displaced, which is less likely when λ is small. This results in an overestimate of ∆U, as evidenced by the rise in P_acc. Ideally one would like to modify the distribution of λ in attempted moves to compensate for the bias observed in P_acc, but to do so analytically would be complicated. A simpler and system-independent approach is to keep the form of A_w in Eq. (29), which already guarantees P_acc does not drop too low, and use the λ-recycling scheme described at the end of Appendix D.
This scheme is still an imperfect correction for the influence of P_acc on diffusion rates, as it fails to take account of any system dependence on b and p. It is therefore advantageous to reduce the variation observed in P_acc, which can be achieved by lowering A_0, as seen in Fig. 2. Thus increased dynamical fidelity is available in return for the increased computational cost of making smaller displacements per move, and in this respect A_0 can be viewed as the WMCD analogue of the time step size of other algorithms. We also anticipate that switching from the Metropolis to a Glauber test, [30] and further to smart MC, [31,32] will reduce the variation in P_acc.

D. Distribution of wavelet radii
With A_w², P_p and P_{b|λ} identified, we can now determine P_λ from Eq. (18) by requiring consistency with the Oseen tensor. We therefore use the PDF P_λ = N_λ λ^{−4}, normalised between λ_min and λ_max. The effect of using these finite bounds instead of (0, ∞) is discussed in section V G. For now we simply comment that a finite λ_min is desirable, as it regularises the singularity in the Oseen tensor at r → 0 and gives us a particle radius via a = λ_min/λ_a, with λ_a a constant dependent only on the choice of mother wavelet. λ_max is required when considering a simulation in a finite box of side length L, where we do not want wavelets to overlap with their periodic images, due to the additional computational effort involved in applying multiple displacements to individual particles. For a homogeneous system an alternative to the method of section V B can be used, whereby b is chosen uniformly across the box so that P_{b|λ} = 1/L³, independent of λ. In this case it is sufficient to consider the mean ⟨n⟩, using the global rather than local density. This leads to A_w = A_0 √(L³/(Nλ)) and, again, P_λ = N_λ λ^{−4}. The rapid decay of P_λ with either approach means small-radius moves dominate. This is reflected in Fig. 2, with the mean P_acc being significantly higher than the asymptote.
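Sampling from P_λ = N_λ λ^{−4} on [λ_min, λ_max] has a closed-form inverse CDF, so a sketch of the draw is essentially one line (Python; parameter values illustrative):

```python
import numpy as np

def sample_lambda(lam_min, lam_max, u):
    """Inverse-CDF draw from P_lambda = N_lambda * lambda^-4 on
    [lam_min, lam_max].  With u uniform on [0, 1), invert
    F(lam) = (lam_min^-3 - lam^-3) / (lam_min^-3 - lam_max^-3)."""
    a, b = lam_min ** -3, lam_max ** -3
    return (a - u * (a - b)) ** (-1.0 / 3.0)
```

The λ^{−4} decay concentrates the draws near λ_min, matching the observation that small-radius moves dominate the attempted-move distribution.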

E. Time evolution per move
Next we need to know how much physical time passes during a simulation, for which we calculate the expected squared displacement of any given particle in a single move, ⟨δr_i²⟩. Using the same approach as in Appendix B, this simplifies to Eq. (33). The simulated time increment in a single move, δt̄ = δt/τ, is then given by Eq. (34). This is consistent with the claim that A_0, which is the only free parameter if λ_max is taken as large as the simulation allows, is analogous to the choice of time step.

F. Computational cost
We now come to calculating the computational cost. Here we consider a system with fractal dimension d_f, so that the expected number of particles in a move is proportional to (λ/s)^{d_f}, with s the mean separation between near-neighbouring particles.
The cost associated with generating move parameters is small compared to the cost of moving the n particles and calculating each of their energy changes. We can therefore use an average cost per particle per move, c, and multiply this by the expected value of n to find the total cost per move. Dividing this by the time advance per move in Eq. (34), we have the cost per unit dimensionless time. Here the factor ⟨P_acc⟩^{−1} handles the additional λ-dependence coming from our λ-recycling scheme. Its value lies between 1 and 1/P_acc^asym, itself of order unity, and limits to this upper value as λ_max → ∞. Its effect on the estimated scaling is therefore small and we treat it as a constant.

Cost of homogeneous systems
When the system is homogeneous, or in a poor solvent so that d_f = 3, the integral evaluates to a logarithm and s³ = L³/N. The cost then scales as C_homo ∼ cN ln(λ_max/λ_min). For systems with fixed global density and λ_max ∼ L ∼ N^{1/3}, this reduces to C_homo ∼ N ln N per unit of time. Fig. 3 shows cost timings in semi-dilute homogeneous systems, which confirm the N ln N scaling. This is true also of the wavelet-plus-Fourier (w+F) data in this figure, which includes the moves described in section VI and which only add a minor cost. The following discussion therefore applies to pure wavelet and w+F data equally.
Because the ln N factor originates in λ_max, and hence sees only the asymptotic acceptance probability P_acc^asym, the coefficients of the ln N terms in Fig. 3 vary as 1/(A_0² P_acc^asym). Reading P_acc^asym from Fig. 2 predicts a ratio of coefficients of approximately 1.5, agreeing within the error margins of Fig. 3. Also in Fig. 3, the cost for the LB algorithm on identical semi-dilute systems is seen to be a factor of order 1 faster than the WMCD algorithm, with the exact factor depending on both A_0 and N. (The equivalent BD costs are several orders of magnitude larger. [8]) Here we note that a fair comparison would use the same hardware for each algorithm, which was not done here, so the ratio between the costs is only a rough indicator of relative performance. Moreover, our code was far from optimised: particle neighbour lists (the dominant contribution to the cost) were recomputed every move rather than updated only as required, we did not exploit multiple processors, and we compiled with just the standard gcc compiler. Finally, we expect gains relative to LB in semi-dilute systems with longer chains, which are less dense than the systems used in this comparison, so that the LB algorithm's explicit solvent would incur an additional cost.

Cost of fractal systems
When d_f < 3, as is the case for a single polymer chain in good (d_f = 1/0.588) or θ (d_f = 2) solvents, [27] the cost integrates to Eq. (38). This time s is taken as the bond length between beads and is set by the potentials between particles. For θ solvents the λ_max dependence cancels, while it is only significant for small systems in good solvent. In either case we are left with C_frac ∼ N. This is shown in Fig. 4, where the observed scaling is slightly faster than the theoretical linear scaling in N. We suspect the discrepancy is due to our use of the cell index method, [32] which introduces small changes in the cost of identifying moving and interacting particles as the chain length increases. For the same reason, the cost is actually L-dependent, with the optimal L increasing with N. For the data in Fig. 4 we used L = 50 for all data points, and hence the costs given are not optimal.
Again, in comparison with other algorithms the differences in hardware should not be forgotten, but in the dilute regime the wavelet algorithm is seen to be much faster than conventional BD for all but the shortest chains, while the LB algorithm is now more expensive because of its many solvent molecules.

G. Effect of λmin and λmax on the mobility tensor
So far the effects of using the finite bounds λ_min, λ_max have only been stated without proof. To calculate them we first take the Fourier transform of the wavelet representation of the Oseen tensor, as per Appendix A, substitute in the wavelet form of Eq. (21), and impose the finite limits on the λ integral. This results in the Fourier-space tensor of Eq. (39). The inverse FT then gives the tensor simulated by the algorithm in a form that more clearly shows its characteristics. The inverse FT with limits (kλ′, ∞) is found in Eq. (40), where Q = qr/λ′ and Si is the sine integral Si(Q) = ∫₀^Q dt sin(t)/t. The full tensor is then O_w(r; λ_min, λ_max) = O_w(r; λ_min, ∞) − O_w(r; λ_max, ∞), (41) which approximates the Oseen tensor between λ_min and λ_max, with regularisation of the singularity and missing correlations for separations larger than λ_max.
To calculate the regularisation as r_ij → 0, and hence find the particle hydrodynamic radius, we consider the case where λ_max → ∞. This choice anticipates the modification in section VI. We then associate Eq. (40) with the self- and cross-terms of section IV, taking the limits r → 0 and r → ∞ respectively. In the former limit the square brackets in Eq. (40) become (4/3)QI, while in the latter they become (π/2)(I + r̂ ⊗ r̂). Since this yields the expected tensor structures, it is sufficient to equate the scalar factors and solve the two equations for the ratio λ_min/a. Hence, as previously claimed, λ_min is exactly proportional to the particle radius, so long as the moments M_3 and M_4 exist. For the tapered wavelet, λ_a = ((9/8) − ln 2)^{−1} ≈ 2.316. Fig. 5 shows both radial and angular elements of the simulated mobility tensor, measured from correlations in particle displacements, plotted alongside curves derived from Eq. (41). The finite-λ_max data show long-range deviation from r_ij^{−1} behaviour. Intuitively this must happen, as any particles separated by more than 2λ_max can never be in the same move and so cannot have correlated motion. As λ_max increases, the long-range correlations are more accurately supplied, limiting to the correct 1/r behaviour of the λ_max → ∞ data, obtained using the Fourier moves described in the next section.
Note that the correlations are very close to those of the RP tensor across both of its branches. As the RP tensor is already an approximation at small r_ij, modifying the method to explicitly replicate it is not expected to be worthwhile, although we note that it is possible to do so using Faxén's laws. [33]

VI. ADAPTATION TO INCLUDE FOURIER MOVES
We now describe how to modify the algorithm to correct for finite λ_max by adding O′(r; λ_max) back into Eq. (41). This is achieved using essentially the same approach, but with plane waves as our moves instead of wavelets.
Although it did not matter for wavelets, because λ_max ensured periodic images would not overlap, whether we are considering an infinite or a periodic system is now important, and the corresponding tensors will be denoted O_F^∞(r; λ_max) and O_F^P(r; λ_max). Similarly, other boundary-condition-dependent quantities will be distinguished with these superscripts, and equations that apply to either case will leave the superscript off.
We start by writing these tensors in the plane-wave basis, which is none other than the Fourier transform, and which can be read directly from Eq. (39). For the periodic case only the commensurate wavevectors are represented, leading to a discrete sum with k_ℓ = (2π/L)(ℓ_x, ℓ_y, ℓ_z) and all ℓ ∈ Z.
The inverse Fourier transforms can be re-expressed in expectation value form by recognising (I − k̂ ⊗ k̂) = 2⟨ê ⊗ ê⟩_ê, with ê a unit vector perpendicular to k. To re-express e^{ik·r_ij} we note that Õ_w(k; λ_max, ∞) is even in k, so we can replace the complex exponential with 2⟨cos(k · r_i + Φ) cos(k · r_j + Φ)⟩_Φ. In both cases the distributions are uniform, so that P_ê = P_Φ = 1/2π, and we finally obtain the expectation form with A_F filling the same role as A_w did for wavelet moves. We therefore have Fourier moves causing displacements δr = A_F cos(k · r + Φ) ê, as shown diagrammatically in Fig. 6, with the infinite and periodic versions differing only in their distributions over k.
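A sketch of a single Fourier move (Python; the distributions of k and the amplitude A_F are supplied elsewhere in the method, so here they are plain arguments):

```python
import numpy as np

rng = np.random.default_rng(3)

def fourier_move(positions, k, A_F):
    """Displace every particle by A_F * cos(k.r + Phi) * e_hat, with Phi
    a uniform random phase and e_hat a random unit vector perpendicular
    to k, so the flow field is divergence-free (k . e_hat = 0)."""
    phi = rng.uniform(0.0, 2.0 * np.pi)
    t = rng.standard_normal(3)
    e = t - k * (t @ k) / (k @ k)   # project out the component along k
    e /= np.linalg.norm(e)
    return positions + A_F * np.cos(positions @ k + phi)[:, None] * e
```

Unlike a wavelet move, this displaces all N particles at once, which is why Fourier moves must be attempted only rarely.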
The rest of this section follows loosely the preceding wavelet section.

A. Choosing plane wave amplitude
Similarly to section V C, we want A_F chosen such that ∆U is independent of k. Further to this, we want the change in energy to equal that of wavelet moves.
In a Fourier move the whole simulation box sees a displacement, so the move volume is L³ and n = N. Any L-dependence will cancel, so this approach remains valid in the infinite case. The strain energy estimate is therefore matched by setting A_F(k) = 2k^{−1}√(∆U_est/µN), with ∆U_est/µ taking the same value as set by A_0 in Eq. (30). Fig. 7 verifies that this choice does indeed lead to a constant acceptance probability, at least for the small-k modes which dominate the distribution. For these low-frequency moves, comparison with the wavelet asymptotic values, seen again as dashed lines in Fig. 7, confirms that equating Eq.s (28) and (48) correctly equates the actual energy changes in wavelet and Fourier moves.
At large k the wavelength is less than the particle separation, and the gradient of the vector field is no longer a good measure of the relative displacements of nearby particles. Consequently the strain energy is an overestimate, leading to P_acc rising towards 1. Similarly to how we recycle the radius of failed wavelet moves, and as described at the end of Appendix D, we recycle the magnitude of k of Fourier moves to correct for the variation in P_acc.
We now address the divergence of A_F at the k = 0 mode. This mode is left out of the Ewald sum in other methods to enforce zero net force. [34] For our algorithm, the k = 0 mode has no influence on dynamics within the system, and by viewing the system in the centre-of-mass frame we set the displacements associated with this mode to zero. However, because in the periodic case this mode has a non-zero probability of being chosen, as shown in the next section, we deem these moves accepted even though they induce no change in the system. Their automatic acceptance, coming from ∆U = 0, accounts for the slight rise in P_acc in the lowest k bins of Fig. 7(b).

B. Distributions of wavevectors
With the amplitude chosen, we are now able to determine the distribution of wavevectors. For the infinite case these are isotropically distributed, so that P ∞ k = 1/4π, while the radial part picks up a factor of k 2 from d 3 k. For the periodic case, X(k) turns the PDF into a discrete set of probabilities whose normalisation, obtained by summing over all k ℓ , cannot be expressed in a simple way.
For the tapered wavelet, P P k is shown graphically in Fig. 8, where the k −4 decay in the number of high-frequency modes, resulting from wavelets already accounting for the short-range correlations these would be contributing, ensures the normalisation N P k converges. As the markers in this figure show, when λ max takes its maximal value of L/2, all but the k = 0 mode experience this decay, and low-frequency modes therefore dominate the distribution.

C. The probability of making a Fourier move
It now only remains to determine how often to make a Fourier move, for which we compare the magnitudes of the tensors generated by the missing wavelets with λ > λ max and the Fourier moves replacing them, after conversion to expectation-value form. Surprisingly, we find exact equality, i.e.
when we might have expected the normalisations to introduce a relative factor. It turns out that fixing the strain energies to be equal, which can be viewed as an operator on δr i ⊗ δr j , set this factor to 1. This leads us to the remarkable result that, on average, a Fourier move evolves time by the exact amount that a missing large wavelet would have done, regardless of the value of λ max . We therefore need to make a Fourier move with the same probability as picking a wavelet with λ > λ max . To show that this argument is consistent with the previously found distributions, we explicitly calculate δr 2 i for infinite-system Fourier moves. After averaging over space, and once the trivial k, Φ and ê integrals have been performed, the result is identical to Eq. (33) with λ max → ∞ and λ min → λ max , as expected.
In periodic systems the argument is complicated by the estimated energy for wavelets being invalid for λ > L/2, when they overlap with their own images, and hence the relative factor, R say, has not been fixed at 1. Performing the energy calculation directly is intractable for overlapping wavelets, but the value of R can be found indirectly by comparing the introduced normalisation factors. The wavelet factors are unchanged from the infinite case, so it is sufficient to use the ratio of introduced factors between the infinite and periodic Fourier tensors. To calculate the probability of making a Fourier move, we enforce equal time evolution from Fourier and missing-wavelet moves after the same number of wavelet moves with λ < λ max . Finally, we calculate the new value of δt, taking both Fourier and wavelet moves into account. Because τ is defined for an isolated particle, this uses δr 2 i ∞ F in both the periodic and infinite systems. Since this is identical to the missing contribution of λ max < λ < ∞, it is clear we can just take λ max → ∞ in Eq. (34).

D. Computational cost with Fourier moves

Armed with the frequency at which Fourier moves are chosen, we now calculate the computational cost associated with them. Each plane wave evolves all N particles, each with a cost c as in the wavelet case, by δt ∼ N −1 . For Fourier moves alone, then, the cost would scale as N 2 .
Accounting for the presence of wavelet moves, the Fourier contribution to the cost is weighted by P F ∼ λ −3 max ∼ N −1 for a set of constant density systems.The Fourier moves are therefore subdominant to the wavelet cost, and the previous N ln N scaling remains, as observed in Fig. 3. Similarly the fractal scaling of N is unchanged if λ max continues to scale at least as fast as N 1/3 , which is slower than the N ν scaling of the polymer size.
Unlike the Ewald sum in BD simulations, which splits the workload so that the cost of the Fourier-space calculation is comparable to that of the position-space part, 8 there is no requirement for the total cost of WMCD Fourier moves to be comparable to the cost of wavelet moves. Indeed, Figs. 3 and 4 both show the Fourier moves can contribute only a small, if not negligible, fraction of the total cost. λ max can be tuned to give comparable costs, but since Fourier moves replace wavelets on a 1-to-1 basis regardless of the value of λ max , and they always carry the cost of moving every particle, this is not optimal. Rather, it is optimal to take λ max = L/2, or as close as the algorithm's implementation allows.

VII. ALGORITHM VALIDATION

A. Chain relaxation
Relaxation of isolated chains of length N b = 10 beads with initially stretched or compressed bonds is seen in Fig. 9, with the mean-square radius of gyration defined in the standard way, R 2 g = (1/N b ) Σ i ⟨(r i − r CoM ) 2 ⟩. Both stretched and compressed chains relax to the same value as obtained for this system in Ref. 8, confirming the algorithm equilibrates correctly, and therefore that our λ and k recycling schemes have not violated detailed balance. The figure also shows the same physical relaxation time for the different values of A 0 , verifying the A 0 -dependence in Eq. (57).
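The observable tracked in Fig. 9 is the standard mean-square radius of gyration; a minimal sketch for a single configuration (assuming the usual definition, with `r` an array of bead positions; in a simulation this is averaged over configurations):

```python
import numpy as np

def mean_square_radius_of_gyration(r):
    """R_g^2 = (1/N_b) * sum_i (r_i - r_cm)^2 for one configuration.

    r: (N_b, 3) array of bead positions.
    """
    r = np.asarray(r, dtype=float)
    r_cm = r.mean(axis=0)                     # centre of mass (equal masses)
    return np.mean(np.sum((r - r_cm) ** 2, axis=1))
```

For a fully stretched 10-bead chain with unit bonds along x this gives 8.25, the kind of initial value from which the chains in Fig. 9 relax.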

B. Chain diffusivity
Diffusion of the centre of mass of chains of different lengths is shown in Fig. 10(a), where the correct linear relationship with time is seen. Note that we ignore the difference between short- and long-time diffusion in this analysis, as the data are not precise enough to distinguish the few percent between them. 35 Data at that accuracy are left for future work. Fig. 10(b) shows the measured diffusion constants, D = δr 2 CoM /6t, for chains in both good and θ solvents, with the former using the same data as in Fig. 10(a). For the θ solvent the WCA potential was turned off and the FENE potential was replaced with the harmonic bond potential (3/2)k FENE (r ij − r 0 ) 2 , in which r 0 = 0.9691 was used to match the mean bond length in the good solvent. These data show quantitatively that simulations in both types of solvent lead to the expected scaling with N , within the error bounds on the exponent. 27 The absolute values in good solvent also agree with previous work, with the value at N = 32 (star) provided by Ref. 6 lying on the fit of our data. No previous data for θ solvents with these parameters are available for comparison.
The scaling seen in Fig. 10(b) is usually considered asymptotic, requiring long chains to observe. To see why this is not the case with our data, we write the chain diffusivity as per Ref. 36, which splits D into a term expressing the sum of hydrodynamic radii of individual monomers and a term for the contribution from the HIs between the monomers. Using B = 4.036/s, found for Gaussian chains with root-mean-square bond length s, 36 as a rough estimate for our systems, and inputting our values of s ≈ 1 and a ≈ 0.3, we find aB ≈ 1.2. This leads to significant cancellation of the N −1 terms, while aA ≈ 1.1, so the N −ν term dominates and the scaling is expected even at small N . Note that the common practice of neglecting the first term in Eq. (59) would hide this result.
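The near-cancellation can be checked directly with the numbers quoted above (assuming Eq. (59) has the schematic form D ∝ (1 − aB)N −1 + aA N −ν ; only aB is computed here, since B is the coefficient given in the text):

```python
# Rough numerical check of the cancellation discussed above.
# B = 4.036/s for Gaussian chains (Ref. 36); s ~ 1 and a ~ 0.3 as quoted.
s = 1.0
a = 0.3
B = 4.036 / s
print(a * B)  # ~1.21, so the coefficient (1 - aB) of the N^-1 term is small
```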
For polymers in good solvent there are additional 'non-analytic' terms that come from excluded-volume interactions. 36 Nonetheless, the qualitative result is the same when we use the coefficients fitted in Ref. 36. While we do not expect the coefficients used here to apply exactly to our systems, it is the near cancellation of the N −1 terms that we highlight as explaining the observed scaling, and that we do expect to apply.

C. Finite box effect on self-diffusion
The periodic algorithm should lower the diffusivity because of the extra HIs with periodic images. The simplest check for this is to look at the trace of the mobility tensor in the limit r ij → 0, which has been calculated to leading order to vary with box size as 37

Tr µ(L) / Tr µ(∞) = 1 − 2.837 a/L. (60)

By generating correlation curves for δr 2 i with non-interacting particles, similar to Fig. 5, and comparing the unscaled values at r ij → 0, we have confirmed this behaviour as well as the L-independence of the infinite algorithm (see Fig. 11). This figure also confirms that Eq. (57) is indeed appropriate for periodic systems, as the measured diffusivities are inversely proportional to how we scale time. Finally, it provides further verification of Eq. (42), as this determines the value of a and hence the gradient of the lines via Eq. (60).

FIG. 11. Simulated traces of self-diffusion tensors, scaled by the mean trace of infinite systems for each λ min, in simulation boxes with L between 5 and 30. The solid and dot-dashed lines were calculated using Eq. (60). A0 = 0.5 was used for this data.
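For orientation, the leading-order finite-box correction is easy to evaluate; the form 1 − 2.837 a/L is our reading of the leading-order result referenced as Eq. (60), and `periodic_diffusivity_ratio` is a hypothetical helper name:

```python
def periodic_diffusivity_ratio(a, L):
    """Leading-order finite-box correction to the self-diffusion trace,
    D(L)/D(inf) = 1 - 2.837 * a / L (assumed reading of Eq. (60)).

    a: hydrodynamic particle radius; L: cubic box side length.
    """
    return 1.0 - 2.837 * a / L

# With a ~ 0.3 the suppression ranges from ~17% at L = 5 to ~3% at L = 30,
# consistent with the span of box sizes used in Fig. 11.
```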

VIII. SUMMARY AND OUTLOOK
In this paper we have detailed the Wavelet Monte Carlo dynamics algorithm, which uses wavelet and plane-wave moves to produce hydrodynamics without explicit decomposition of the mobility tensor. Our algorithm has been shown to closely approximate the Rotne-Prager tensor by starting from the Oseen tensor and removing small-radius moves, which in turn provides an explicit particle radius. The distribution of additional Fourier moves, meanwhile, can be chosen to simulate either a periodic or an infinite system with the main wavelet part of the algorithm left unchanged.
It has been shown that the computational cost of WMCD scales well to large systems, going as N ln N for homogeneous systems with fixed particle density and as N for fractal systems. Both of these have a very competitive prefactor, making WMCD a promising simulation technique for extending the reach of soft-matter simulations.
The algorithm has been tested to show correct hydrodynamic correlations are produced, while the simulated behaviour of isolated polymer chains has been shown to agree with previous work.
For the case of homogeneous systems, our wavelet algorithm turns out to be fortuitously well balanced across lengthscales according to three distinct criteria. First, we chose the move amplitudes (Eq. (29)) such that their elastic energy is expected to be of order k B T , leading to move acceptance independent of wavelet radius λ. Secondly, we found that this leads to computational cost dλ/λ uniformly distributed over lengthscales, leading to the logarithmic dependence in the overall cost (Eq. (37)). Thirdly, the timescale for the algorithm to build the move correlations across a distance r, which come predominantly from wavelet radii λ ∼ r, turns out to be independent of λ. This last result can be seen from the way the squared move amplitudes have λ-dependence matching the r-dependence of the cross mobility (Eq. (10)), and it means that the timescale to build up long-ranged correlations is of the same order as the timescale for particles to experience a local (small-wavelet) move. It is easily checked that the d-dimensional sketch of the method given in section III leads to all the same coincidences. The case where the coincidences do unravel is when the distribution of particles is fractal, as with dilute polymers and locally in semi-dilute polymers: then the longer wavelet moves are allowed greater amplitude, leading to computational cost dominated by small wavelets. There is a correspondingly longer build-up time for long-ranged correlations, but this is still fast compared to the corresponding polymer relaxation modes.
With the core set out, we plan to develop the algorithm further to include solvent flows. We envision the procedure with this modification to be essentially the same as presented in this article, interspersed with flow-generating moves. So long as the flow moves, however they are implemented, have small enough amplitudes to keep close, interacting particles near equilibrium, even with large-scale separations along the chain moved out of equilibrium, we expect the energy estimates in this article to remain valid. Consequently we do not anticipate any change to how we handle the existing moves.
On the computational side, there is scope to improve the current algorithm in several areas. Foremost, we can improve the accuracy and reduce the variation in move rejection rate by switching from Metropolis to Glauber move acceptance, and also to smart MC, in which moves are biased according to the force conjugate to them. There is also scope for optimisation of choices which already exist; notably, the choice of mother wavelet is yet to be explored. The precise way in which periodic boundary conditions are enforced on the hydrodynamic propagator is another deeper choice where we have followed prior work, but it is a choice: for example, we could have chosen to keep only the wavelet moves.
A range of coding improvements also stand to be made. The computational cost could be reduced by tracking particle neighbour lists rather than re-computing them at each move, and computational speed could be increased by a combination of parallel execution of multiple non-overlapping wavelet moves and parallelised execution of single large wavelet moves.
Choosing p: p is an isotropically distributed unit vector, for which we generate V as above and take p = V/ √ V • V.
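A minimal sketch of this step, assuming V is a vector of independent standard Gaussian components (our reading of "generate V as above"; Gaussian components make the resulting direction exactly isotropic):

```python
import numpy as np

def random_unit_vector(rng):
    """Isotropic unit vector p = V / sqrt(V . V).

    V is drawn with independent standard Gaussian components, whose
    joint distribution is rotationally invariant, so the normalised
    vector is uniform on the unit sphere.
    """
    v = rng.standard_normal(3)
    return v / np.sqrt(v @ v)
```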
Determining A w : With λ and b chosen, we simply count how many particles, or periodic images if appropriate, are centred inside the wavelet to give us the value of n. A w is then determined by Eq. (29). Note that only one image should be counted per particle, as this is all a single image of the wavelet would contain.

Generating a plane wave
Choosing k for an infinite system: With infinite systems we can generate the amplitude and direction separately, with the latter generated in the same way as p for wavelets, i.e. k = V/ √ V • V. The distribution over the amplitude in Eq. (49) depends on the choice of mother wavelet, and its associated cumulative distribution function is not in general analytically invertible. Nonetheless, we apply the inverse-transform method numerically, using a file containing pre-calculated conversion values from rand ∈ (0, 1] to k. Note we interpolate between the necessarily discrete values in the file and exclude rand = 0 because P ∞ k (0) = 0.
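A sketch of the numerical inverse transform with a pre-computed table, using linear interpolation between tabulated values as described (the table-loading step is omitted; `k_table` and `cdf_table` are hypothetical names for the file contents):

```python
import numpy as np

def sample_k_infinite(rng, k_table, cdf_table):
    """Draw |k| by numerical inverse-transform sampling.

    cdf_table must be the monotonically increasing CDF of the
    wavelet-dependent amplitude distribution, evaluated at k_table.
    rand = 0 is excluded because P_k^inf(0) = 0.
    """
    u = rng.uniform()        # uniform on [0, 1); re-draw the measure-zero 0
    while u == 0.0:
        u = rng.uniform()
    # linear interpolation between the discrete table entries
    return np.interp(u, cdf_table, k_table)
```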
Choosing k for a periodic system: To generate discrete wavevectors we split k-space into a discrete cubic region centred on k = 0 with N layers layers, as shown in Fig. 12, and a quasi-continuous region outside. We choose between these regions by comparing rand ∈ (0, 1] to the sum of the probabilities of the modes inside the discrete region. In the discrete region we generate k = (2π/L)(rand x , rand y , rand z ) with each rand ∈ {−N layers , −N layers + 1, −N layers + 2, ..., N layers }.
If rand ∈ (0, 1] satisfies rand ≤ P P k (k ℓ )/P P k (0), where the denominator serves to increase the acceptance rate without altering the relative probabilities, then we use this mode; otherwise we repeat the process until a mode is accepted. For the results in this article we have used N layers = 15, but owing to the rapid k −4 decay in probabilities a smaller value can be used while maintaining an accurate distribution.
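The discrete-region sampling described above amounts to rejection sampling with the k = 0 probability as the envelope (valid because the k −4 decay makes k = 0 the most probable mode); a sketch, with `prob` a hypothetical stand-in for P P k:

```python
import numpy as np

def sample_discrete_mode(rng, prob, n_layers, L):
    """Rejection-sample a discrete wavevector in the cubic region.

    prob(k): unnormalised mode probability P_k^P(k); dividing by its
    value at k = 0 maximises the acceptance rate without altering the
    relative probabilities of the modes.
    """
    p0 = prob(np.zeros(3))
    while True:
        # uniform proposal over the (2*n_layers + 1)^3 lattice modes
        ints = rng.integers(-n_layers, n_layers + 1, size=3)
        k = (2.0 * np.pi / L) * ints
        if rng.uniform() <= prob(k) / p0:
            return k
```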
In the quasi-continuous region we pick modes continuously, as per the infinite case, and then shift them to the nearest discrete mode. This is accurate when P ∞ k is linear over the surrounding Wigner-Seitz cell, as is ensured locally by the Taylor series. Since the distribution now starts at k = (2π/L)(N layers + 0.5) rather than 0, a different conversion table to the one used in the infinite algorithm must be constructed. Finally, to avoid over-representing modes contained in both the discrete and quasi-continuous regions, such as those in the corners of the square in Fig. 12, if one of these modes is landed on in the quasi-continuous region another mode must be selected instead.
Choosing Φ and ê: The plane wave's phase is simply Φ = rand ∈ [0, 2π). Its orientation, chosen uniformly over the directions perpendicular to k, is set by generating another rand ∈ [0, 2π) as an angle about the k-axis. Geometrical calculations then readily give ê from this angle and k.
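One concrete version of the "geometrical calculations": build an orthonormal pair spanning the plane perpendicular to k and rotate within it by the sampled angle (the function name is hypothetical; any construction of a perpendicular basis works equally well):

```python
import numpy as np

def polarisation_vector(k, angle):
    """Unit vector e perpendicular to k, at `angle` about the k-axis.

    Builds an orthonormal pair (u, w) spanning the plane perpendicular
    to k via Gram-Schmidt, then rotates within that plane.
    """
    khat = k / np.linalg.norm(k)
    # any seed vector not (nearly) parallel to khat will do
    seed = np.array([1.0, 0.0, 0.0])
    if abs(khat[0]) > 0.9:
        seed = np.array([0.0, 1.0, 0.0])
    u = seed - (seed @ khat) * khat   # project out the k-component
    u /= np.linalg.norm(u)
    w = np.cross(khat, u)             # completes the orthonormal triad
    return np.cos(angle) * u + np.sin(angle) * w
```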

Parameter recycling and updating time
If a move is rejected, we reuse either λ or k, for wavelets and plane waves respectively, to keep their distributions as required for hydrodynamics. For wavelets and infinite-system Fourier moves this just requires using the same λ or k as the previous move (note that the same move type is required too), while new values for all other parameters are generated as above. For periodic Fourier moves the discrete k requires a different approach: we keep the same amplitude by simply permuting the components, each with uniform probability to land in any of the 3 new positions, and assigning a minus sign to each with probability 0.5.
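The periodic-k recycling step can be sketched directly; it preserves |k| exactly while resampling among the lattice modes of equal magnitude (function name hypothetical):

```python
import numpy as np

def recycle_periodic_k(rng, k):
    """Recycle the magnitude of a rejected periodic Fourier move.

    Randomly permute the three components, then flip each sign with
    probability 1/2; |k| is preserved exactly, as required.
    """
    k_new = rng.permutation(k)               # uniform random permutation
    signs = rng.choice([-1.0, 1.0], size=3)  # independent sign flips
    return signs * k_new
```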
Because the λ and k recycling scheme ensures the correct distribution of accepted moves, we must update time only after an accepted move, when we add δt as given by Eq. (57) to the evolved time.

FIG. 1 .
FIG. 1. Schematic diagram of a wavelet move on a section of polymer. The dashed arrow represents p and the central dot marks b. The particles enclosed in the wavelet move according to the vector A w w, introduced in section IV A, while particles lying outside remain stationary.

FIG. 2 .
FIG. 2. Acceptance probabilities over the spectrum of wavelet radii at different move amplitudes.The dilute data used an isolated polymer chain with N = 2048.The dashed lines are only to emphasize the asymptotic behaviour, while the markers at λ = −1 indicate the average Pacc over all wavelets.

FIG. 3 .
FIG. 3. CPU cost per particle to evolve semi-dilute (homogeneous) systems by a single time unit.The dashed line indicates LB timings from Fig. 8 in Ref. 8, which used identical systems, rescaled to our time unit.Timings for both pure wavelet (w) and periodic wavelet plus Fourier (w+F) algorithms, the latter described in section VI, are shown for comparison.

FIG. 4 .
FIG. 4. CPU cost to evolve systems with an isolated polymer of length N by a single time unit with the pure wavelet (w) and infinite wavelet plus Fourier (w+F) algorithms.The systems considered were identical to those in Fig. 11 in Ref. 6, and the dashed line indicates the BD and LB timings from that plot, rescaled to our time unit.

FIG. 5 .
FIG. 5. Plots of simulated mobility tensor elements normalised by ζ = 6πηa with the value of λmax given in the legend.Theoretical curves from Eq. (41) lie underneath the data and curves for the full RP tensor are also shown (dashed) for comparison.For λmax = 29λmin the data for ζ(grr − g θθ ) are not shown: they overlay the λmax = ∞ data over the whole range.The inset shows ζgrr using logarithmic scales.

FIG. 6 .
FIG. 6. Schematic diagram of how a Fourier move displaces all particles in the system. The plane-wave surface indicates the vector field, which is perpendicular to k and spans the entire system.

FIG. 7 .
FIG. 7. Acceptance probabilities over the spectrum of plane waves at different move amplitudes. (a): Single-chain data using the infinite algorithm. (b): Data for semi-dilute systems with the periodic algorithm. Dashed lines, indicating the asymptotic wavelet values of Pacc for equivalent systems, are identical to the dashed lines in Fig. 2.

FIG. 8 .
FIG. 8. Probability of picking Fourier modes in a periodic system in Eq. (51), normalised by the probability of picking the k = 0 mode.Markers indicate the lowest discrete modes when λmax = L/2, starting at k = 2π/L.

FIG. 9 .
FIG. 9. Relaxation of an isolated polymer with N b = 10 and WCA and FENE potentials, for different A0 and initial bond lengths, s(0).The dashed line shows the value for the same system in Ref. 8.

FIG. 10
FIG. 10. (a): Mean centre of mass displacement squared for isolated chains in a good solvent, scaled to show the theoretical collapse of the data.Every fifth data point has been used from each data set to show this clearly.(b): Measured diffusion constants plotted against chain length for chains with identical mean bond lengths in good and θ solvents.A0 = 0.5 was used for all WMCD data while the star marks the diffusivity measured in good solvent in Ref. 6.

FIG. 12
FIG. 12. 2-dimensional diagram of discrete Fourier modes. The shaded region, here with 2 layers around (0,0) with inner edges marked by dashed lines, contains modes chosen discretely. The circle indicates the inner edge of the quasi-continuous region.