Stochastic effects on the dynamics of an epidemic due to population subdivision

Using a stochastic susceptible–infected–removed meta-population model of disease transmission, we present analytical calculations and numerical simulations dissecting the interplay between stochasticity and the division of a population into mutually independent sub-populations. We show that subdivision activates two stochastic effects—extinction and desynchronization—diminishing the overall impact of the outbreak even when the total population has already left the stochastic regime and the basic reproduction number is not altered by the subdivision. Both effects are quantitatively captured by our theoretical estimates, allowing us to determine their individual contributions to the observed reduction of the peak of the epidemic.

Simple models for the spread of infectious diseases are useful for the quantitative characterization of an epidemic as well as for forecasting future infection numbers and guiding decisionmaking for containment. Different extensions and refined versions of these models have been created to extract various factors that may be critical for the dynamics and prevention of epidemics. Although it is well known that stochastic fluctuations can alter the dynamics as well, they are often neglected at higher infection number levels such that the contact rates and basic reproduction number become the central quantities of interest. In contrast, we investigate a situation in which stochastic effects can quantitatively change the course of an epidemic when infection numbers are large and contact rates remain unaltered. We consider an extended Susceptible-Infected-Removed (SIR) model in which a large population is subdivided into a certain number of sub-populations, each containing only a few infected individuals. For the limiting case of perfect isolation, i.e., when the epidemic evolves independently in each sub-population with no cross-infections, we derive analytical estimates for these stochastic effects that together recapitulate the results of extensive numerical simulations. Our central quantity of interest is the peak total number of simultaneously infected individuals, which we compare between the subdivided population and a single large population with an identical reproduction number.
Our analysis suggests that regional isolation can resurrect certain stochastic effects and thereby contribute to effective containment, regardless of the initial distribution of infected individuals.

I. INTRODUCTION
Generic models such as the Susceptible-Infected-Removed (SIR) model conceived by Kermack and McKendrik 1 are indispensable for characterizing the bulk properties of epidemics and determining the influence of crucial parameters on the dynamics. The contact rate between individuals, which is proportional to the reproduction number R 0 , usually plays a crucial role, as its reduction through containment measures directly slows the spreading of the disease. On a large scale (states or countries), numbers of infections during the height of an epidemic are usually large such that deterministic mean-field descriptions are appropriate. These have been widely used to track the course of epidemics and the effect of interventions, for example, for the current spreading of COVID-19. 2 While many details about the biology and modes of infection of a specific disease are important for its dynamics in detailed models, 3 even basic SIR models have been extended in various conceptual directions. Besides various general topologies of the underlying contact and mobility networks, 4-6 so-called meta-population models Chaos ARTICLE scitation.org/journal/cha have been used to separate the disease dynamics within local environments from its spread between them. 7 It has been shown that it is possible to calculate effective quantities for the whole population, such as reproduction numbers (i.e., a threshold theorem), 8 the final attack ratio, 9 and criteria for persistence 10 in deterministic models of such subdivided populations. Another important deviation from simple bulk behavior arises through stochasticity (see Ref. 11 and references therein). Stochastic versions of extended SIR and related models have been used to calculate corrections to the outbreak threshold, 12 consequences of stochasticity for contact tracing, 13 and other control schemes, 14 to only name a few. Stochastic effects are also observed in agent-based 15 and meta-population models. 16,17 Here, we seek to study the joint effect of subdivision and stochasticity on the overall magnitude of an epidemic for a fixed initial number of infected individuals in the total population. In general, subdivision can be expected to artificially boost fluctuations, as the infection numbers in each sub-population can be small even when the total number of infections in the entire population is large. We would like to quantify the ability of such increased stochasticity to reduce the impact of the epidemic. We deliberately refrain from applying any form of traditional containment in our model, such as further reductions in the contact rate or contact tracing. 18 In particular, we design the subdivision such that the deterministic dynamics of the epidemic in the subdivided population remains unchanged compared to a single large population, as outlined in Sec. II. This allows us to compare the peak number of infected individuals in the entire population for each scenario both analytically and numerically in order to extract the specific effects of stochasticity triggered by subdivision.

A. Reaction system
We consider a population of N individuals with SIR dynamics, 1 with S, I, and R referring to susceptible, infected, and removed individuals, respectively, where removal with per capita rate k happens due to recovery, quarantine, or death. The rate b corresponds to the number of contacts per unit time an individual has with a random other individual in the population, multiplied by the probability that a contact between a susceptible and an infected individual leads to transmission. The total transition rate from S to I per unit time is, therefore, b N S I. The two rates b and k are related to the basic reproduction number R 0 = b/k, which is independent of population size. The deterministic epidemic threshold above which an outbreak occurs is R 0 = 1, and we assume R 0 > 1 throughout this study. The population is subject to the total constraint N = S(t) where we denote the number of individuals in each state by the same letters. For simplicity, all initial conditions assume R(0) = 0 such that they are uniquely defined by N and the number of initially infected I 0 = I(0).
When a population of total size N is split up into N s subpopulations, we simulate N s separate copies of the system (1), with N, S, I and R replaced by N i , S i , I i , and R i , respectively, where the index i refers to the different sub-populations and N i = N/N s . N = N i , S = S i , I = I i , and R = R i refer to the population totals. The initial number of infected individuals is distributed either uniformly or randomly across the N s sub-populations. All numerical results in this study are obtained from stochastic simulations of Eq. (1) using the Gillespie algorithm. 19 To account for the inherent stochasticity of the system, several realizations, i.e., identical simulations with different random number generator seeds, are simulated for each parameter set. We report the number of realizations as well as distributions, averages, and standard deviations across the results as appropriate. Our main figure of interest is the peak number of infected individuals I max or, equivalently, the peak infected fraction of the population γ = I max /N. These could be considered a measure for the impact of the epidemic and the strain on the health care system and public health resources such as the agencies that perform contact tracing and testing.

B. Deterministic behavior
The reaction scheme (1) results in the deterministic mean-field equations, which give rise to two regimes in the dynamics. During the initial regime, I starts off from an initial value I(0) = I 0 , rises exponentially ∼ I 0 e (b−k)t , and saturates to a peak value, where the approximation for the maximum fraction of infected individuals 0 < γ < 1 is valid as long as the entire population is initially susceptible; i.e., S(0) ≈ N. 20 In the secondary regime when the recovery dynamics dominates, I decays to zero exponentially, as the number of susceptibles decreases below the value necessary to sustain spreading. In this deterministic system, a subdivision into N s smaller subpopulations of size N/N s will have no effect since Eq. (2) remains invariant when S, I, R, and N are scaled by the common factor 1/N s . Relative to their individual sub-population sizes N i , the same dynamics are observed in all sub-populations and the dynamics of the population totals S = S i , I = I i , and R = R i are identical to those of a single large population. Therefore, the subdivision is not analogous to cutting links in a contact network but rather a redistribution of them since we assume that the contact rate b remains unchanged. This conservative assumption means that individuals in each sub-population still have the same number of contacts per unit time as they had in the large population despite the smaller number of individuals to choose from. While, in reality, the contact rate b might decrease in such a situation and deterministically reduce R 0 and, therefore, I max , we intentionally keep it constant here to extract the effects of stochasticity.

C. Stochastic behavior
Deterministic behavior only applies if S and I are both large, particularly only after the number of infected people I has risen to appreciable levels. If I is still low, stochastic fluctuations determine whether I will "take off" and develop exponential behavior even if b > k. This effect was already considered shortly after Kermack and McKendrick introduced the original SIR model 21 and is now wellknown. However, in a subdivided population, it can significantly alter the course of the outbreak in the total population if the initial number of infected individuals in a single sub-population is low enough (even if the number is large in the total population). An example for populations of N = 1 000 000 individuals split into N s = 10 sub-populations is shown in Figs. 1(a) and 1(b), along with the expected dynamics of a single large population (red curve). In one example set of ten sub-populations, only three sub-populations (blue, yellow, and green curves) experience a significant outbreak, and they are desynchronized with a broad distribution of individual peak times [ Fig. 1(c)]. Spontaneous extinction and desynchronization lead to an average behavior across 100 simulations with a significantly reduced peak (turquoise curve). Note that, on average, both the undivided large population and the sum of the smaller sub-populations initially exhibit comparable exponential growth in the number of infected individuals [ Fig. 1(b)]. This means that, while extinction in some sub-populations and fluctuations in timing happen early on, their effect is only seen later during the saturation phase.
During the initial phase, we can assume that S ≈ N and that I follows a simple birth-death process with rates b for "birth" and k for "death." We shall use this analogy for derivations throughout this study and in Appendixes A-D. We briefly recapitulate one important result from the theory of branching processes here, namely, that an exponentially growing population that starts from an initial condition of I(0) = 1 has a finite extinction probability of which asymptotically approaches k/b at long times; see the derivation in Appendix A. This means that with probability p ext 1 = k/b, the dynamics never enters the exponentially growing deterministic regime but decays back to zero due to number fluctuations. 22 Therefore, for two independent lineages in the same population, the extinction probability is p ext 2 = (k/b) 2 , and, similarly, p ext n = (k/b) n , as long as the total population is sufficiently large such that the lineages do not interfere with each other. We will use these extinction probabilities and other statistics of the birth-death process to derive analytical approximations for the effects of extinction and desynchronization on the stochastic dynamics.

III. RESULTS
A. Theoretical estimates for isolated sub-populations

Extinction
To obtain an estimate for the effect of extinction and the distribution of infected individuals, we add up the maximum numbers of infected individuals in the sub-populations. Each of these peaks is approximately γ N/N s but only if the infection does not stochastically become extinct during the initial stages. For largepopulation sizes and values of b/k that result in a significant peak, extinction usually happens well before the peak is reached in other sub-populations (see Appendix B) such that these populations do not contribute. Therefore, on average, the contribution of each subpopulation will be I s,max (n) = γ 1 − p ext n N/N s , where n indicates the number of initially infected individuals in the sub-population and p ext n is the probability that they go extinct without entering deterministic growth as discussed above. Therefore, the total peak number of infected individuals in all the sub-populations due to extinction is given by I ext max = n g n I s,max (n), where g n is the number of sub-populations with n initially infected individuals. Note that N s = n g n . Combining the above equations, we obtain The above result manifestly shows that holds. Note that this reduction is exclusively due to extinction, and the simple summation of the individual maxima neglects the possible desynchronization between sub-populations, which we will consider further below. For example, for the ideal case where each sub-population only contains at most one infected individual, we have where g 1 = I 0 is the total number of initially infected individuals in the large population (for this to make sense, N s ≥ I 0 is required).
Since γ corresponds to the case where the population was not split up, the peak number of infected can, therefore, be reduced by increasing the number of sub-populations N s or by bringing b closer to k. Note that this is in addition to a potential decrease in the deterministic peak fraction γ of infected [cf. Eq. (3)] that would result if the subdivision also led to fewer contacts (i.e., a reduced rate b), which we have conservatively assumed not to be the case here.

Desynchronization
The independent summation of maxima in different subpopulations is a conservative estimate since fluctuations can lead to stochastic desynchronization and thus to a further reduction of the peak value. The distribution of peak times in the sub-populations from the previous example is shown in Fig. 1(c). The temporal shift between the different sub-populations can be attributed entirely to stochastic fluctuations in the initial phase of the dynamics. Assuming that this time shift accumulates while the dynamics can still be modeled as a pure birth-death process without saturation effects, we can derive the probability distribution for the deviation from the mean peak time t peak ≡ t peak − t peak as where n is the initial number of infected individuals in the subpopulation andτ = ln γ k/b /(b − k) with γ being the exponential of the Euler constant (see Appendix C for details). Note that n here was only used to incorporate the extinction probability, while the shape of the distribution is based on a single initially infected individual. Nevertheless, this result is in excellent agreement with the measured distribution for randomly distributed infected individuals [see the dashed line in Fig. 1(c)].
We can then use this distribution to obtain a quantitative estimate for the additional peak reduction due to desynchronization. For this purpose, we approximate the deterministic time evolution of I in the vicinity of the peak as I(t) ≈ Nγ exp − 1 2 bkγ (t − t peak ) 2 , which is valid as long as S(t) remains of order ∼ N (see Appendix D); i.e., b/k is not too large. In the limit of many superimposed peaks of this shape, with the variability of t peak given by Eq. (8), the peak is reduced by an additional factor α −1 , The peak number of infected individuals, with both stochastic effects of the confinement taken into account, similarly becomes I con max = Nγ con = I ext max /α. It is interesting to note that this reduction factor is bounded from below by lim R 0 →1 α −1 = 12/(12 + π 2 ) ≈ 0.7407. The desynchronization effect is, therefore, much more limited than the extinction effect.

B. Numerical results
We consider as an example a region with a population of 8 000 000 and 500 infected individuals (I 0 /N ∼ 6 · 10 −5 ) and assume a removal rate of k = 0.14, corresponding to a realistic mean removal time of 1/k ≈ 7 days for the recent epidemic 23 (particularly if symptomatic individuals are quickly removed from the infectious pool through quarantining). Let us further assume that the infectious contact rate is b = 0.2 (> k). This corresponds to a substantial reduction of R 0 from its initial value of 2-2.5 24 through mild measures such as social distancing, although the epidemic would still spread exponentially, with infection numbers doubling about every 12 days.
If this population is allowed to mix homogeneously, the dynamics will evolve according to the deterministic prediction with a peak around 5% infected individuals (blue data in Fig. 2). If instead, the population is split up and the 500 infected people are distributed randomly across the sub-populations, the peak percentage of infected individuals decreases to around 3% (for 100 sub-populations of 80 000 people) or 1% (for 500 sub-populations of 16 000 people) on average (red and yellow, respectively). In all cases, the analytical estimate that only considers the extinction effect, Eq. (6), is only an upper bound for the peak percentage of infected individuals in the total population, while also considering desynchronization according to Eq. (9) yields a good estimate the typical peak values. The peak time distributions for the three different ways of splitting up the population shown in Fig. 2(c) also agree with the analytical estimate of Eq. (8). Note that these distributions are not normalized since a significant fraction of sub-populations experience extinction of the epidemic and, therefore, do not exhibit a peak. There is also a subtle, non-monotonic effect on the termination time of the epidemic [ Fig. 2(d)] whose distribution is broader when the population is split up but does not change position appreciably. Note that the reduction for N s = 500 sub-populations in Fig. 2 is comparable (or even slightly lower) than the case where the 500 infected individuals are not distributed randomly across the sub-populations, but each sub-population contains exactly one infected individual. In this case (see Fig. 3), there are no sub-populations with initially zero infected individuals, implying that the reduction in the peak value compared to the large homogeneous population is strictly due to extinction and desynchronization, which are again well predicted by the analytical estimates.
To examine the validity of our approximations across different parameters, we varied the contact rate b and carried out numerical simulations for values of R 0 ranging between 1.14 and 2. We analyzed the resulting peak magnitudes to extract the individual contributions of extinction and desynchronization, which are in excellent agreement with our predictions of Eqs. (6) and (9), as shown in Fig. 4. The contribution of extinction alone was estimated numerically by summing maxima in different sub-populations, regardless of their timing. Overall, the simulations confirm the relative importance of the extinction effect, whereas the additional reduction by desynchronization plays a smaller role. Figures 4(a) and 4(b) show the case where N s = I 0 = 100, i.e., number of sub-populations and initially infected individuals is the same, and exactly one infected individual is placed in each sub-population. This serves to demonstrate the maximum effect of extinction, whereas in Fig. 4(c), a large share of the peak reduction is due to sub-populations containing no infections, as I 0 = 100 < N s = 500. However, the random distribution of infected individuals for N s = I 0 = 500 [ Fig. 4(d)] leads to a very similar result as in Fig. 4(b), although some of the reduction is due to the initial distribution (i.e., sub-population without any infections). For a high number of sub-populations N s as in Figs. 4(c) and 4(d) (and consequently a smaller sub-population size), deviations from the theory begin to appear toward low values of b very close to k, as the timescale of the extinction process becomes comparable to that of the deterministic SIR dynamics. In this regime, the distinction between an initial stochastic phase approximated by a birth-death process and the onset of saturation effects becomes increasingly blurred, as we show analytically in Appendix B. In particular, this affects the estimation of the extinction contribution (marked by black dots).

IV. DISCUSSION
Reducing the infectious contact rate b or increasing the removal rate k directly leads to a decrease of the deterministic peak fraction of infected, γ . The above analysis shows that, even without changing R 0 = b/k, the isolation of small sub-populations can reduce the overall peak number of infected people in the ideal case of at most one infected individual per sub-population by an additional factor of up to I 0 /N s · (1 − k/b)/α when I 0 /N s < 1. One contribution comes from the communities that have no infections and are now protected (I 0 /N s ), while another contribution comes from the possibility that an infection chain in a local community stochastically ends due to fluctuations (k/b). Stochastic desynchronization (1/α) further reduces the peak by up to about 25% according to Eq. (9). However, as shown by our estimates and confirmed by the numerical simulations, even outside this ideal scenario, a reduction can be achieved, regardless of the distribution of infected individuals across the sub-populations, and the reduction will be larger if b/k is already close to 1. It is also worth noting that, in contrast to reductions in R 0 = b/k, the timescale of the outbreak is not increased. The benefits of subdivision are obvious even from a deterministic standpoint in the case where many regions initially contain no infected individuals-in this case, subdivision prevents spreading of the epidemic to disease-free communities. However, our analysis shows that this advantage persists due to stochastic extinction events and desynchronization even if the sub-populations are so large that many or all of them initially contain infections, as long as I 0 /N s ∼ 1. Of course, increasing N s further is always beneficial due to the above-mentioned deterministic effect, with the trivial limiting case of one group per household (an extremely strict lockdown). In contrast, aiming at I 0 /N s ∼ 1 could still allow for the functioning of local socioeconomic life in fairly large sub-populations if I 0 is not too large when the subdivision happens.
While extinction has been widely considered for SIR-type models 11,21 and has been related to a minimum number of infections necessary to cause a "major" outbreak, 14 we have shown here that, even if the dynamics in the large population is outside the stochastic regime, it is possible to resurrect these effects by artificially subdividing the population. Because of the strong exponential dependence of the extinction probability on n [see Eq. (5)], it is important to note that I 0 denotes the true number of infections, including undetected and/or asymptomatic cases. Another aspect we have neglected here is that of cross-infections: In reality, sub-populations cannot be perfectly isolated; therefore, local extinction might only be temporary, as has been seen in studies of persistence. 10,16 The calculated peak reduction would be observed in the limit of small cross-infection rates. In contrast to extinction, desynchronization does not reveal itself on the level of a single population (except as a difference in timing) and is, therefore, an emergent property of the subdivision scenario, which is likely to persist in the presence of cross-infections. In the framework presented in Sec. II A, these could be included (without changing R 0 ) by allowing a certain fraction ξ of contacts across the entire population and only restricting the remaining fraction 1 − ξ to within each sub-population. We set up such a model in a separate study 25 to investigate a potential realistic containment strategy.
In reality, individuals will not compensate for all avoided contacts outside the local sub-population with contacts within it, as we have conservatively assumed by keeping b constant upon subdivision. Instead, isolation will naturally lead to a reduction in b, akin to cutting links in the spreading network 5 so that the effect of subdivision will be a combination of deterministic reductions in R 0 and the stochastic effects presented here. Subdivision of a population can be complementary to containment measures, such as social distancing and electronic contact tracing, 13,23 which still allow for the functioning of local public life. However, it also does not preclude the activation of more drastic measures in regions beginning to show deterministic exponential behavior. 25  Wilczek, and R. Yahyapour. This research was supported by the Max-Planck-Gesellschaft.

APPENDIX A: THE EXACT SOLUTION OF THE BIRTH-DEATH PROCESS
Consider a population of the infected individuals I that can undergo the following two processes: i.e., each I can give birth to another I with rate b or it can die with rate k, at any time. Ignoring the stochasticity, the average behavior of the system is described by exponential birth and death. The population n(t) can be determined as follows: where we have assumed that the initial size of the population is one.
As this is a one-step process, the probability of finding n copies of I in the sample at time t satisfies the following master equation: The factor of n is needed because the birth or death could happen to anyone. Equation (A3) can be solved by an ansatz of the form P n ∼ f n for n ≥ 1, which together with the initial condition P n (0) = δ n,1 gives us the solution as (A4) The distribution can be used to calculate the first two moments, which reveal more interesting features about the system. First, it is reassuring that the average population size behaves according to the mean-field description above that predicted exponential growth or decay. A quantity of interest is which probes whether number fluctuations follow a characteristic Poisson behavior. In the long time limit, we have which shows that while a decaying population that corresponds to b < k has a Poisson behavior, a growing population corresponding to b > k has giant number fluctuations, which can be Chaos ARTICLE scitation.org/journal/cha characterized via which leads to in the long time limit. In other words, the fluctuations scale with the average population size when b > k and with the square root of the average population size when b < k.
The above solution allows us to calculate the extinction probability of the population P 0 (t), which is an absorbing state. We find which is a very interesting result. When k > b,n → 0 at long times, and we obtain P 0 = 1. It is no surprise that extinction at long times is a certainty when the death rate is larger than the birth rate. However, when k < b,n → ∞ at long times, and we obtain P 0 = k/b, a result that is in contradiction with the prediction of the average behavior of the system, which is exponential growth. Therefore, number fluctuations could completely annihilate an exponentially growing population.

APPENDIX B: TIMESCALE OF THE EXTINCTION PROCESS AND ACCURACY OF MAXIMA DETECTION IN SUB-POPULATIONS
Here, we derive quantitative estimates that allow us to compare the timescale of the extinction process to that of the deterministic peak in the SIR model. This is conceptually interesting in its own right, but it also allows us to meaningfully differentiate between "real" maxima and random transient peaks in the number of infected individuals in sub-populations that experience extinction.
In the pure birth-death process, the fraction of extinction events, 0 ≤ φ x ≤ 1, that have already happened by time t can easily be calculated from Eq. (A11) as This equation can be inverted to yield the time t x by which a fraction φ x of extinction events have happened, On the other hand, we can also estimate the fraction of non-extinct populations, 0 ≤ φ c ≤ 1, that will still be below a cutoff size n c at time t, Evaluating φ c (t x (φ x ), n c ), therefore, yields the fraction of populations still below n c when a fraction φ x of extinction events have already happened. This expression can be inverted to yield the simple relationship giving the number of infected individuals below which a fraction φ c of non-extinct populations will still be at the time when a fraction φ x of populations destined for extinction have already reached the extinct state. In order to estimate the effect of extinction in our numerical simulations (cf. Fig. 4), we detect the maximum number of infected individuals in each sub-population (independent of their timing) and compare the sum of these numbers to our estimate I ext max from the main text. In the sub-populations that experience random extinction of the epidemic, the detected numerical maxima will in reality be transient fluctuations before extinctions. These contribute more and more as R 0 = b/k → 1 when the deterministic peak value Nγ = N 1 − (1 + log R 0 )/R 0 20 decreases and the extinction probability 1/R 0 increases. Using the estimates above, we can exclude these false maxima based on their timing by only considering those maxima for which and simultaneously ensuring that is fulfilled. φ x and φ c play the role of accuracy parameters. The first condition ensures that false maxima are excluded with probability φ x , while the second one ensures that a pure birth-death process would not have reached the deterministic SIR peak by the same time with probability φ c . Note that the latter is a conservative estimate, as growth in the SIR model is significantly slowed before reaching its peak compared to a pure birth-death process. In Fig. 4, we use a value of φ x = φ c = 0.99 to exclude 99% of false maxima and still detect more than 99% of deterministic SIR maxima except for the data points marked as unreliable, for which Eq. (B6) is not fulfilled, and, therefore, the extinction process and the deterministic SIR peak are not clearly separated in time. Conversely, this also means that for all other parameters (i.e., larger R 0 = b/k), extinction usually happens well before the deterministic SIR dynamics reaches its peak. It is worth emphasizing that, in the limit b → k and small populations, the distinction between an initial stochastic phase and a deterministic time course becomes meaningless since γ eventually becomes order ∼1/N and the mean extinction time diverges. At this point, the dynamics throughout will be dominated by random growth of the number of infected individuals and stochastic fluctuations will continue to contribute, even as the number of susceptibles decreases, eventually ending the epidemic (i.e., during and Chaos ARTICLE scitation.org/journal/cha beyond the maximum). In addition, the assumption that there is no depletion of susceptibles in the early phase (and thus the equivalence to a pure birth-death process) breaks down. However, in this study, we are interested in the regime where even sub-populations are still large and while b is sufficiently close to k to yield a significant extinction probability k/b, it is large enough to lead to a significant deterministic outbreak peak. Therefore, we do not investigate this regime.

APPENDIX C: ANALYTICAL APPROXIMATION OF THE RELATIVE PEAK TIME DISTRIBUTION
The fact that the early phase of the dynamics in the SIR model (when S ≈ N and I is small) corresponds to a simple birth-death process also allows us to obtain an analytical estimate for the peak time distributions of the sub-populations. This can be readily adapted from a similar calculation performed on an equivalent problem in evolution, where the dynamics of a small mutant subpopulation with a given selective advantage can likewise be understood as a birth-death branching process, 26 for which the transition from the initial stochastic regime where extinction is still possible to the deterministic regime of exponential growth corresponds to the establishment of the mutation in the population (which precedes fixation).
We obtain an approximation for the establishment time distribution of the disease in a sub-population as where we have corrected for an additional minus sign missing from Ref. 26. The variation in the timing of the later deterministic dynamics is due entirely to fluctuations in this initial stochastic phase. To compare this analytical approximation with our simulation results for the peak time in the main text, we plot the non-normalized, unconditional distribution, which is diminished by a factor [1 − (k/b) n ] [from Eq. (C1)] accounting for the probability of extinction in a population with initially n infected individuals and has its mean shifted to the measured mean peak time t peak . Here, where γ = 1.781 0724 . . . is the exponential of Euler's constant. We note that simply shifting the mean of the distribution is justified because the dynamics is predominantly identical in different sub-populations once they are in the deterministic regime, while only lagging by a random time span τ . This simple argument depends on the assumption that stochastic fluctuations can be ignored before deviations from exponential behavior (i.e., saturation effects) have to be considered for the deterministic dynamics. This is true for the scenarios we consider in the SIR model since our sub-populations still consist of thousands of individuals and we are explicitly focusing on cases where b is not arbitrarily close to k.

APPENDIX D: ESTIMATING THE EFFECT OF SUB-POPULATION DESYNCHRONIZATION
For estimating the peak reduction effect due to desynchronization of the sub-populations, it is convenient to work with the normalized equations for s = S/N and i = I/N, which reaḋ When i reaches its peak γ = i(t peak ), new infections and recovery balance according to Eq. (D1b) and s(t peak ) = k/b. Based on this known value, we use the following ansatz for s: with ε(t peak ) = 0. Since we are interested in the regime where there is a substantial extinction probability k/b, s(t peak ) is also still of order 1. Together with the fact thati(t peak ) = 0 by definition, we expect from Eq. (D1a) that the lowest (linear) order of ε will suffice to describe the dynamics around the peak; i.e., ε(t) ≈ ε 1 · (t − t peak ) (conversely, we expect this approximation to break down when b k). Substituting the ansatz into Eq. (D1a) yields ε 1 = −bγ or With this, we can obtain an approximation for i around the peak. From Eq. (D1b), we know that which can easily be solved. Together with the condition i(t peak ) = γ , we obtain Now that we have an approximation for i(t) near the peak, we can calculate how these time courses add up across individual subpopulations by assuming that they all have the shape (D5), with the peak time t peak stochastically distributed according to Eq. (C2). Definingī where each i(t j peak ; t) represents a time course as in Eq. (D5) with t j peak drawn from the distribution (C2) for each j, we obtain an average superposition of many sub-populations in the limit N s → ∞, i(t) = dt peak P est SIR (t peak +τ ) i(t peak ; t).
Note that [as compared to Eq. (C2)] we use here the normalized distribution, without the diminishing factor due to extinction, in order to extract the reduction strictly due to desynchronization. We have also set t peak to 0 without loss of generality, as a different value would simply shiftī(t) by the corresponding time.

ARTICLE scitation.org/journal/cha
The integral in Eq. (D7) cannot be integrated in a closed form. We, therefore, replace P est SIR by a normal distribution N (0, σ 2 ) with the same variance σ 2 = π 2 /[6(b − k) 2 ]. It is useful to note that, as for the normal distribution, the variance completely determines the shape of the Gumbel distribution in Eq. (C1), which means that the systematic error introduced by this replacement is parameter independent. Finally, we can calculatē with α = 1 + π 2 bkγ 6(b − k) 2 .
The maximum of the resulting time course occurs at t = 0 (due to our arbitrary choice of the mean for t peak ) and isī(0) = γ /α. Since the expected peak value without desynchronization is γ , desynchronization reduces this peak value by a factor of α −1 . According to Eq. (the definition of α above), α itself depends on γ , which in turn is a function of R 0 = b/k. Using the well known approximation γ = 1 − [1 + log(R 0 )]/R 0 , 20 which is valid as long as S ≈ N initially, we rewrite α as While we expect the quantitative estimate to be less accurate toward higher R 0 (see above), we note that the important limits lim R 0 →1 1 α = 12 12 + π 2 ≈ 0.7407 (D11) exist. The first one signifies that there is no peak reduction due to desynchronization for R 0 → ∞, consistent with the disappearance of the stochastic phase at the beginning of the dynamics. The second limit indicates a finite reduction by a factor ≈ 0.7407 toward R 0 = b/k = 1. Since the timescales of both the stochastic fluctuations and the deterministic peak behavior diverge for R 0 → 1 (and are ill-defined for R 0 = 1), this means that they must exhibit identical scaling behavior in order for neither of them to dominate. In between the two extremes, 1/α increases monotonically with R 0 , which implies that the maximum reduction that can be achieved by desynchronization is about 26% and is reached close to R 0 = 1. It is important to note, however, that several assumptions even about the deterministic time course (for example, the value of γ ) break down when R 0 is so close to 1 that γ becomes of order 1/N; therefore, a fully stochastic treatment would be needed to fully capture this regime. This does not limit the validity of the results in the regime we are interested in, i.e., where sub-populations still exhibit clear deterministic outbreaks (or extinction).

DATA AVAILABILITY
The data that support the findings of this study are available within the article.