Physics of cancer propagation: A game theory perspective

This theoretical paper examines, from a game-theoretic perspective, the dynamics of cooperator and cheater cells under metabolic stress and high spatial heterogeneity. Although the ultimate aim of this work is to understand the dynamics of cancer tumor evolution under stress, we use a simple bacterial model to gain fundamental insights into the progression of resistance to drugs under high competition and stress conditions. Copyright 2012 Author(s). This article is distributed under a Creative Commons Attribution 3.0 Unported License.


I. THE HETEROGENEOUS ENVIRONMENT OF A TUMOR
In cancer, there is much more happening than an uncontrolled growth of cells. If a solid cancer tumor were confined to that single phenotype, then a simple surgical removal of the tumor, when possible, would cure the patient. In fact, when the tumor is caught early enough, this is still the most effective way of curing a patient of cancer. Unfortunately, by the time a tumor has become big enough to cause symptoms or become visible, it has usually turned into something far more deadly. A good introduction to the sad history of our realization that cancer is much more subtle than we originally thought can be found in the book "The Emperor of All Maladies: A Biography of Cancer." 1 With such a big, sprawling disease, no one article can encompass the totality of the cancer condition, so here we employ a physicist's perspective of looking for simplicity in complexity, in the hope of finding overriding principles. In particular, we use computational modeling to examine some of the population dynamics of cancer as a multi-cellular community under intense stress and selection pressure that is not homogeneous in space. We ignore time-dependent stress, not because it is unimportant but because we have yet to explore this part of stress-space experimentally.
In our model, we consider cancer's likeness to a community of bacteria, as cells grow, compete for nutrients and develop resistance to toxicity. In this instance, the cancerous cells of a tumor might resemble a mutant strain of bacteria from an initial wild type population. There are several reasons that suggest this is a useful correspondence. The first is that a tumor resembles a bacterial biofilm, a concept which we have developed in a recent article. 2 We noted there that cancer cells rapidly evolve drug resistance through somatic evolution in the body, and in the metastatic phase violate the organism-wide consensus of regulated growth and beneficial communal interactions in order to continue their own selfish growth. We claimed that tumor cells express deliberate evolutionary strategies usually used by prokaryotes under conditions of high stress, such as sustained growth despite DNA damage and toxic local environments. In the context of game theory, selfish growth strategies are attributed to both cancerous cells and stress resistant prokaryotes, in contrast to their neighbors' strategies of regulated growth and homeostasis. We proposed that such similarities are sound enough to motivate the use of bacteria under high stress as a biological model for the evolution of cancer cell drug resistance and tumor development.
To explore this proposal, we have used microfabrication technology to study the effects of spatial stress patterns on bacterial microecologies. 3 The experiments are still ongoing, but they strongly suggest that the bacterial population dynamics emerging from genotoxicity resemble the metapopulations occurring in real tumors in response to genotoxic treatment. The speed of resistance development to antibiotics has become one of the most important issues in human health. The design principles discussed in Ref. 3 should provide a powerful template for exploring the rates at which evolution can occur in very complex fitness landscapes, including the rate at which cancer cells develop resistance to genotoxic chemotherapy. While a great deal of work remains to be done to characterize the dynamics of bacterial evolution, we hope the implications of that work are obvious.
With this experimental background, here we model the population dynamics of a bacterial population in a complex heterogeneous ecology using a cellular automaton framework. The model's primary aim is to aid consideration of how cells interact with each other in a complex stressed environment. For simplicity, we consider two classes of cells in this highly stressed environment: cooperators, which obey general rules of communal survival, and cheaters, which do not obey these rules. Within the context of cancer, the cooperator cells are the highly adapted and differentiated cells that make up the body under normal conditions, while the cheater cells are the cells in a rapidly growing tumor which do not obey these rules. Exactly how the cheater cells became cheaters is not the subject of this paper; rather, we wish to examine the dynamics once this defection has occurred. Before we turn to our spatially resolved population model, we first discuss the original game theory framework from which it is generalized. We claim that when this game-theoretic framework is modified to account for heterogeneous stress patterns, as it is in the spatially resolved model presented here, cooperative outcomes emerge between the two cell types.

II. PREVIOUS WORK ON MIXED POPULATIONS
The essence of our work here is the movement and growth of a mixed population of cells over a complex landscape where the stress is a strong function of position. One of our main points is that spatially resolved models and their "well-mixed" counterparts often produce very distinct outcomes, with diversity and coexistence resulting from spatial models and homogeneous populations resulting from "well-mixed" models. One of the best examples of the effects of spatial resolution can be found in the model presented in Nowak and May's 1992 article. 4 In the Nowak and May model, the simulation initially plants seed cells throughout a lattice and then leaves them to compete with each other in place. Boundaries of population domains move because cells take over each other's territory, not because of any explicit modeling of migration. In this work we explicitly include diffusion of the agents over the lattice in recognition of the mobility of real biological cells.
C. Athena Aktipis has developed agent-based modeling in spatially resolved environments. 5 In Aktipis's work, individual cells can move. 6 In the walk-away strategy she studies, cells leave a region after they determine that neighboring cells fail to produce sufficient public goods. This allows cooperators to gather with other cooperators rather than with selfish agents, so the population can avoid annihilating the cooperative subpopulation. This is a rather more deterministic approach than our essentially Markovian approach, which is based on pure statistics. We have, however, included spatial gradients of externally applied goods here. Further work should explore the combination of Markovian dynamics and spatial gradients with the deeper strategies that Aktipis has developed.

III. A BRIEF INTRODUCTION TO GAME THEORY
The "walk-away" strategy of Aktipis's work, a behavioral rule of "move after partner defects", is a game-theoretic concept. John Nash's work, 7 premised on non-cooperative competition, greatly influenced how economists and evolutionary biologists analyze competition in markets and in Nature, particularly informing the mainstream notion of natural selection. Competition, such as that between the cooperator and cheater cells in a tumor, has historically been pictured as a zero-sum game, where a conflict of interests between two players is perceived to produce a winner and a loser. In this guise, natural selection prefers competitors who enhance their fitness at the expense of other competitors' fitness. While circumstances do appear in Nature that create win-lose competition among individuals, such as two bonobos fighting over a mate, or certain institutions imposing grade deflation policies on students, this narrow view of selection undervalues the evolutionary utility of cooperation that is abundant in Nature, even among competing ecological individuals. A principal axiom in Nash's work is the premise that "each player acts independently, without collaboration or communication with any others". This principle appears self-evident in the canonical game theory scenario of the prisoner's dilemma, where two robbers are captured and imprisoned by the police. Each of the two captives is given the choice to "talk" and thereby betray his partner in crime, or to "be silent". With two strategies available to both players, four combinations are possible and we can construct a 2x2 payoff matrix for each captive.
This is an instance of a symmetric game, where the payoff matrix for player A is the transpose of that for player B. As the Nash axiom of non-collaboration stipulates, neither captive can communicate with his partner. The Nash equilibrium of a game assigns to each player the strategy that maximizes his or her payoff given the strategies of the other players. To calculate the solution, all players are assumed to abide by two key criteria: self-interest (desiring to maximize their payoffs), and rationality (assuming that all other players are also self-interested). The well-known Nash equilibrium solution to the prisoner's dilemma is to "talk" or, in game-theoretic terms, defect. The values given in Table I are somewhat arbitrary, but the defection strategy satisfies the Nash equilibrium for the general case where the payoff matrix satisfies T (temptation) > R (reward) and P (punishment) > S (sucker). This game theory paradigm is often used to model population dynamics in Nature, where an individual's survival and reproductive success depend on the phenotypic interactions with its competitors. 8 While classical game theory operates on the assumption that players are rational and self-interested agents (as the two captives are, presumably), these characteristics are not applicable to animal or bacterial players in the same way. Rather, in its evolutionary context, game theory requires a more subtle interpretation, involving steady-state population dynamics and maximizing fitness. That is to say, each evolutionary player maximizes its fitness not by rational analysis of its competitors, but rather via dynamic convergence to a stable outcome, known as an Evolutionarily Stable Strategy (ESS). An ESS has a larger set of constraints than a Nash equilibrium has, but both are stable solutions that recover from small perturbations.
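The equilibrium argument above can be checked mechanically. The following Python sketch enumerates the four strategy pairs of a symmetric two-strategy game and reports the pairs from which neither player gains by switching alone. The payoff values T = 5, R = 3, P = 1, S = 0 are illustrative assumptions chosen to satisfy T > R and P > S; they are not the values of Table I, and the function names are hypothetical.

```python
from itertools import product

# Illustrative payoffs (assumed, not taken from Table I), with T > R > P > S.
T, R, P, S = 5, 3, 1, 0

STRATEGIES = ("silent", "talk")  # cooperate, defect

# payoff[(mine, theirs)] is the payoff to the player who chooses `mine`.
payoff = {
    ("silent", "silent"): R,
    ("silent", "talk"): S,
    ("talk", "silent"): T,
    ("talk", "talk"): P,
}

def is_nash(a, b):
    """True if neither player can raise their own payoff by switching alone."""
    a_ok = all(payoff[(a, b)] >= payoff[(alt, b)] for alt in STRATEGIES)
    b_ok = all(payoff[(b, a)] >= payoff[(alt, a)] for alt in STRATEGIES)
    return a_ok and b_ok

def nash_equilibria():
    """List every pure-strategy pair that is a Nash equilibrium."""
    return [(a, b) for a, b in product(STRATEGIES, repeat=2) if is_nash(a, b)]

print(nash_equilibria())
```

With these assumed payoffs the only pair that survives the check is mutual defection, even though mutual silence would pay both players more.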

IV. GAME THEORY IN SPATIAL COMPLEXITY
Of particular interest is how a classical game theory model could break down amid spatial complexity. That is, what happens when we account for local variations in a fitness landscape, particularly when strategic outcomes depend on their spatial locations? To understand how game theory can be adapted to spatial stress variations, as is the case in cancer dynamics, we return to our bacterial models, because they afford the greatest chance of experimental tests and may point the way to understanding the vastly more complex phenomena of cancer in an organism such as man.
The phenotypic state of the organism is a potential controller of the rate of evolutionary dynamics. If the initial state of the organism is a wild-type one which is not adapted to stress, substantial mutations must occur and be fixed in the population before the organism can evolve a true resistance to a highly stressful environment and propagate. However, if the strain has a preexisting genotype which gives rise to an initial resistance to stress, it may require fewer mutations for full resistance to evolve. In the case of bacteria, which we study here, it has been shown by Kolter and his colleagues that bacteria held in stationary phase over a period of several days evolve, in spite of the fixed number of bacterial cells in stationary phase, to a mutant which is adapted to the high stress conditions generated by stationary phase. 9 These mutants, called Growth Advantage in Stationary Phase (GASP) mutants, 10 have the ability to divide (slowly) even under high stress conditions.
The constructs of the wild-type and GASP (Growth Advantage in Stationary Phase) E. coli strains that can be used for experiments have been described in previous work. 11 The GASP mutant has an interesting gene rearrangement which leads to the GASP phenotype. The main difference from the wild-type allele is a duplication in the GASP mutant of the rpoS 819 sequence (gcaggggctgaatatcgaa). 12 We have shown earlier in microhabitat work that these GASP mutants co-exist with wild-type bacteria in a synergistic manner 13 and under high stress can grow to high densities. The basic mechanism for this stress adaptation is that the allele confers a strong competitive fitness advantage under basic pH conditions, and during the death phase of stress these GASP mutants are able to adapt using a metabolism that acquires protons from the basic medium environment in order to maintain pH homeostasis.
As we touched on in the introduction, we are interested in the role heterogeneity plays in shaping competition between selfish and cooperative cells. Spatially resolved simulations allow us to model populations in a diverse environment while dynamically tracking competitive and chemical interactions at the local level. This allows us to observe complex dynamics that other game theoretic and numerical models neglect due to their spatial (game theory) and temporal (logistic) constraints. In this work, we concern ourselves with modeling the evolution of two competing strains of wild type and GASP E. coli bacteria. However, we will introduce the model here in as general terms as possible, as it could easily be translated or adjusted to many other scenarios.
Game theory, because of the discrete nature of its decisions, easily lends itself to a scenario where the participants in the game are spatially contained within a cell. 14 "Cell" here refers to a general lattice point in the simulation space, rather than a physical cell. Although these spatial cells model the space that physical cells might occupy, the context should make clear whether "cell" refers to a point in the simulation space or a bacterial cell in the model. Each cell that is occupied by either player has its chemicals consumed and/or excreted based on a set of chemical exchange parameters. (Remember, some cells can be vacant.) A convenient approach to this model is to think of it as a two-player game, not in the sense of there being a winner and a loser, but in the sense of there being a set of rules that both players must follow. Imagine a 3D lattice that is populated by two types of players: red and green. These two players might correspond to two competing cell types, such as wild-type and GASP mutants. Each lattice point (or cell) is occupied by one type or the other, or it is vacant. In addition, each cell has a normalized concentration of two chemicals, which we will denote as f and s. In a particular application, f might be nutrients and s some toxin created by necrosis in the center of a tumor. At any point in time, all the information mentioned so far is contained in three lattice arrays: one for occupancy status, which has three discrete values corresponding to red, green, or vacancy; and two for the concentrations of the two chemicals, which are real values normalized between 0 and 1. Fig. 1 gives a graphical picture of the dynamics of this game.
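As a concrete picture of the three lattice arrays just described, the following Python sketch builds an occupancy array and two concentration arrays as nested lists. The integer encoding of red, green and vacant, the seeding probabilities, the initial concentrations and all function names are assumptions made for illustration; the text does not specify them.

```python
import random

VACANT, RED, GREEN = 0, 1, 2  # assumed integer encoding of occupancy status

def make_array(shape, value):
    """Nested-list 3D array of the given (nx, ny, nz) shape, filled with `value`."""
    nx, ny, nz = shape
    return [[[value for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]

def make_state(shape, p_red=0.05, p_green=0.05, f0=0.1, s0=0.0, seed=0):
    """Build the three lattice arrays: occupancy, f and s.

    Each site is seeded red or green with the given (assumed) probabilities
    and is otherwise left vacant. f and s start uniform, inside [0, 1].
    """
    rng = random.Random(seed)
    nx, ny, nz = shape
    occ = make_array(shape, VACANT)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                u = rng.random()
                if u < p_red:
                    occ[i][j][k] = RED
                elif u < p_red + p_green:
                    occ[i][j][k] = GREEN
    return {"occ": occ, "f": make_array(shape, f0), "s": make_array(shape, s0)}

state = make_state((6, 5, 3))
```

Keeping occupancy and the two concentrations in separate arrays of the same shape means each update phase can read and write only the fields it needs.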
Given an initial population and chemical distribution, the lattice repopulates itself at each time step according to a set of rules organized into three distinct phases: chemical consumption and excretion, probabilistic reproduction, and chemical diffusion. Since each time step involves reproduction, the time scale t of each three-phase iteration roughly corresponds to the time τ of a generation. What makes spatially resolved models useful is that the rules of the three-phase system are based on local rather than global conditions: chemicals are locally consumed and diffused, and competitive interaction and reproduction are based on each cell's nearest neighbors. Hopefully, the three-phase update model we describe will start to crudely resemble cell growth in a tumor environment.

A. Phase I: Chemical uptake and release
Each cell that is occupied by either player has its chemicals consumed and/or excreted based on a set of chemical exchange parameters.
Here g and r represent the green and red occupancies; the parameters φ_g and φ_r are the metabolic consumption rates of f by the green and red players; and the parameters δ_g and δ_r are the corresponding consumption rates of s. Since occupancy of a particular cell is exclusive (i.e., green or red), we use the conditional symbol to indicate that the term is only non-zero if the cell is occupied by the indicated player. For instance, if a particular cell [i, j, k] is occupied by a green player, then f_t = f_{t−1}(1 − φ_g) and s_t = s_{t−1}(1 − δ_g). If a player consumes a chemical, then the corresponding parameter is positive; if the player excretes it, the parameter is negative. It may be the case that a player neither consumes nor excretes one of the chemicals, in which case that parameter is zero. So we can encode the metabolic parameters involved in phase I into a 2x2 matrix whose entries are φ_g, φ_r, δ_g and δ_r.
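The Phase I rule quoted above for an occupied cell can be sketched in Python as follows. Only the multiplicative form f_t = f_{t−1}(1 − φ) and s_t = s_{t−1}(1 − δ) comes from the text; the parameter values, the assignment of consumer and excreter roles, the occupancy encoding, the clipping to [0, 1] and the names are assumptions for illustration.

```python
VACANT, RED, GREEN = 0, 1, 2  # assumed occupancy encoding

# One (phi, delta) pair per player, i.e. the four entries of the 2x2 matrix.
# Positive means consumption, negative means excretion. Values are assumed.
EXCHANGE = {
    GREEN: (0.10, -0.05),  # consumes f, excretes s
    RED: (0.20, 0.05),     # consumes both f and s
}

def clip01(x):
    """Keep a concentration inside the normalized range [0, 1] (assumed)."""
    return min(1.0, max(0.0, x))

def phase_one(occ, f, s, exchange=EXCHANGE):
    """Update f and s in place at every occupied site; vacant sites are untouched."""
    for i, plane in enumerate(occ):
        for j, row in enumerate(plane):
            for k, player in enumerate(row):
                if player == VACANT:
                    continue
                phi, delta = exchange[player]
                f[i][j][k] = clip01(f[i][j][k] * (1.0 - phi))
                s[i][j][k] = clip01(s[i][j][k] * (1.0 - delta))
```

One consequence of the multiplicative form is worth noting: a site whose concentration is exactly zero stays at zero under excretion, so in this sketch an excreted chemical needs a non-zero initial concentration somewhere to build up.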

B. Phase II: Probabilistic reproduction
The occupancy of each cell in the lattice is then updated according to a probabilistic reproduction stage. For a given cell [i, j, k] at generation t, a probability distribution is computed over the three possible occupancy statuses (i.e., green, red or vacant). This distribution is computed from the aggregate of the surrounding neighbors' fitness values, which are computed at each lattice point based on the local chemical concentrations. The fitness parameters ρ and σ characterize the impact of a particular chemical on a player. For instance, if f is a nutrient and s is a toxin to the green player, then ρ_g > 0 and σ_g < 0. We can intuitively think of p_g and p_r as local growth rates or fitnesses. Each cell occupied by a green player has an assigned p_g value and, likewise, p_r is computed for each cell occupied by a red player, while all vacant cells maintain p_g = p_r = 0. Our fitness matrix is likewise encoded into a 2x2 matrix containing these parameters. Since the f and s concentrations change from one generation to the next, the fitnesses p_g and p_r vary dynamically as well as spatially.
The probability distribution for a given cell occupancy is generated by summing together and normalizing all the neighbor fitnesses.
If P_g + P_r > 1 at a lattice site, then the distribution is normalized so that P_g + P_r = 1. If P_g + P_r < 1, then there remains a non-zero probability of vacancy. The next occupancy status of each cell is then randomly generated from its distribution. In this way, the lattice probabilistically leapfrogs from one generation to the next. The reproduction scheme described here is a true-breed model, in which a cell of type A will always produce progeny of type A. 15 If we wish to account for the possibility of stress-induced mutations, we can modify our equations to add a transition probability.
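The reproduction phase can be sketched in Python under several loudly stated assumptions, because the exact fitness and aggregation formulas are not reproduced in the text. The sketch assumes a linear fitness p = ρ f + σ s floored at zero, a neighborhood made of the site itself plus its six nearest neighbors, division of the summed fitness by the neighborhood size, and non-periodic boundaries. The parameter values, encoding and names are hypothetical. Only the renormalization when P_g + P_r > 1, the residual vacancy probability and the true-breed rule are taken from the description above.

```python
import random

VACANT, RED, GREEN = 0, 1, 2  # assumed occupancy encoding

# Assumed (rho, sigma) per player, i.e. the entries of the 2x2 fitness matrix.
FITNESS = {GREEN: (1.0, -0.5), RED: (0.8, 0.2)}

# Assumed neighborhood: the site itself plus its six nearest neighbors.
OFFSETS = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0),
           (0, 0, 1), (0, 0, -1)]

def local_fitness(player, f, s):
    """Assumed linear fitness p = rho*f + sigma*s, floored at 0; 0 if vacant."""
    if player == VACANT:
        return 0.0
    rho, sigma = FITNESS[player]
    return max(0.0, rho * f + sigma * s)

def occupancy_probs(occ, f, s, i, j, k):
    """Return (P_g, P_r) for site (i, j, k) from the fitnesses around it."""
    nx, ny, nz = len(occ), len(occ[0]), len(occ[0][0])
    total = {GREEN: 0.0, RED: 0.0}
    for di, dj, dk in OFFSETS:
        a, b, c = i + di, j + dj, k + dk
        if 0 <= a < nx and 0 <= b < ny and 0 <= c < nz:
            player = occ[a][b][c]
            if player != VACANT:
                total[player] += local_fitness(player, f[a][b][c], s[a][b][c])
    p_g = total[GREEN] / len(OFFSETS)
    p_r = total[RED] / len(OFFSETS)
    if p_g + p_r > 1.0:  # renormalize so that P_g + P_r = 1
        norm = p_g + p_r
        p_g, p_r = p_g / norm, p_r / norm
    return p_g, p_r

def phase_two(occ, f, s, rng):
    """Draw every site's next occupancy; progeny keep the parent type."""
    new = [[[VACANT for _ in row] for row in plane] for plane in occ]
    for i, plane in enumerate(occ):
        for j, row in enumerate(plane):
            for k in range(len(row)):
                p_g, p_r = occupancy_probs(occ, f, s, i, j, k)
                u = rng.random()
                if u < p_g:
                    new[i][j][k] = GREEN
                elif u < p_g + p_r:
                    new[i][j][k] = RED
    return new
```

The new lattice is written to a separate array so that every site is drawn from the same previous generation, which matches the generation-by-generation update described in the text.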

C. Phase III: Diffusion
The last step applies N_D Gaussian convolutions to both chemicals, broadening out their concentrations. Each convolution is applied to both chemicals along each dimension according to the discretized analog of a convolution of q with g, where q(x) is some chemical concentration along an arbitrary axis (x, y or z) and g(x) is a Gaussian, with the integrals evaluated and normalized over the infinite intervals −∞ < x, k < ∞. The standard deviations of the Gaussians applied to f and s are σ_f and σ_s lattice units (i.e., the distance between two adjacent lattice points). In addition, there is a cut-off parameter of σ_C lattice units, which prevents metabolites from flowing out too far and bleeding around the lattice edges. The degree of diffusion affects the steepness of the chemical gradients that emerge in the landscape.
The spatial territory of our lattice is fundamentally defined by three types of region. The swimmable region defines which cells players can occupy; the permeable region defines where chemicals can flow; and reservoir regions are where there is a constant supply of food. We will work with models where the first chemical is a food resource. So in the framework of our model, reservoirs always have f = 1 and s = 0 satisfied at each time step. These reservoir regions of replenishing nutrients are not unlike regenerative tissues in a host organism surrounding a tumor.
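The diffusion step of Phase III can be illustrated with a short Python sketch of a truncated, separable Gaussian convolution. The edge handling (dropping kernel taps that fall outside the lattice and renormalizing the remaining weights) is an assumption, as are all function and parameter names. The sketch also omits the permeable-region mask and the reservoir reset, which a fuller implementation would apply after each pass.

```python
import math

def gaussian_kernel(sigma, cutoff):
    """Normalized discrete Gaussian weights for offsets -cutoff..+cutoff."""
    weights = [math.exp(-0.5 * (d / sigma) ** 2) for d in range(-cutoff, cutoff + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_line(line, kernel):
    """Convolve a 1D list with the kernel, renormalizing at the lattice edges."""
    c = len(kernel) // 2
    out = []
    for x in range(len(line)):
        acc = 0.0
        wsum = 0.0
        for t, w in enumerate(kernel):
            y = x + t - c
            if 0 <= y < len(line):  # ignore taps outside the lattice (assumed)
                acc += w * line[y]
                wsum += w
        out.append(acc / wsum)
    return out

def diffuse(q, sigma, cutoff, n_passes=1):
    """Return a copy of q after n_passes Gaussian convolutions along x, y and z."""
    nx, ny, nz = len(q), len(q[0]), len(q[0][0])
    kernel = gaussian_kernel(sigma, cutoff)
    q = [[row[:] for row in plane] for plane in q]  # work on a copy
    for _ in range(n_passes):
        for j in range(ny):  # convolve along x
            for k in range(nz):
                line = convolve_line([q[i][j][k] for i in range(nx)], kernel)
                for i in range(nx):
                    q[i][j][k] = line[i]
        for i in range(nx):  # convolve along y
            for k in range(nz):
                line = convolve_line([q[i][j][k] for j in range(ny)], kernel)
                for j in range(ny):
                    q[i][j][k] = line[j]
        for i in range(nx):  # convolve along z
            for j in range(ny):
                q[i][j] = convolve_line(q[i][j], kernel)
    return q
```

With this edge treatment a uniform concentration stays uniform, at the cost of not conserving total mass exactly near the boundaries. In this sketch `sigma` would be σ_f or σ_s, `cutoff` plays the role of σ_C, and `n_passes` plays the role of N_D.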

V. SIMULATING THE DEVELOPMENT OF THE UNITED STATES
Just for fun, and to test the ability of this algorithm to simulate a complex heterogeneous environment from this game theory perspective, we attempted to simulate the development of the United States by sprinkling cooperators and defectors across the US landscape. We were curious as to how such a mixture would develop as it colonized a complex area we are familiar with. Most competition occurs in heterogeneous habitats, where cells (or animals) have the opportunity to migrate and form optimal configurations. The tissue environments where cancer tumors can grow, for instance, are diverse in nutrients and textures. To design a micro-environment resembling one we might find in nature, we draw inspiration from a larger scale environment: the United States of America. The reservoir regions correspond to non-ocean bodies of water, such as the Mississippi river and most coastal ports and inlets. The swimmable region corresponds to the land mass, and the permeable region is the union of the two. Just like natural water resources in North America, this heterogeneous setup has a combination of densely packed reservoir regions and dry deserts. The lattice regions are set up by rendering a special three-color image of a USA map and converting it into corresponding lattice arrays of 0's and 1's. Fig. 2 shows the settings of the three regions discussed above. It would not seem to matter that reservoirs are included in the permeable region, since their concentrations are unchanging, but it is worth noting that chemicals can diffuse through thin reservoir layers.
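The conversion from a three-color map to lattice arrays of 0's and 1's can be sketched in Python as follows. A small character grid stands in for the actual map image, which is not reproduced here; the symbols, the toy layout and the function names are hypothetical, and only the region definitions (land is swimmable, non-ocean water is reservoir, permeable is their union) come from the text.

```python
# Hypothetical three-symbol stand-in for the three-color map image:
# '#' = land, 'w' = non-ocean water (reservoir), '.' = ocean or outside.
TOY_MAP = [
    "..####..",
    ".##w###.",
    ".##w##w.",
    "..####..",
]

def region_masks(rows):
    """Return (swimmable, reservoir, permeable) as 2D lists of 0's and 1's."""
    swimmable = [[1 if ch == "#" else 0 for ch in row] for row in rows]
    reservoir = [[1 if ch == "w" else 0 for ch in row] for row in rows]
    # The permeable region is the union of land and reservoir sites.
    permeable = [[a | b for a, b in zip(ra, rb)]
                 for ra, rb in zip(swimmable, reservoir)]
    return swimmable, reservoir, permeable

def extrude(mask, nz):
    """Repeat a 2D mask nz times along z, giving a 3D mask indexed [x][y][z]."""
    return [[[v] * nz for v in row] for row in mask]

swimmable, reservoir, permeable = region_masks(TOY_MAP)
swimmable_3d = extrude(swimmable, 3)
```

Extruding the 2D masks along a short z axis is one simple way to obtain a thin 3D lattice from a flat map; whether the original setup varies the regions with height is not stated, so this is an assumption.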
As described earlier, the lattice is 3D, but the height of the lattice is small compared to the length and width (typical dimensions would be 300x300x11 pixels). The maps show a bird's-eye view, where each pixel we see corresponds to a cell, with the z dimension projected onto the xy plane. This viewpoint allows for useful visualization of population dynamics.
Since fitness is largely dependent on local food concentrations, the selection of reservoir regions is the crucial aspect of the fitness landscape. Phase III allows the reservoirs to diffuse into local areas over time, but we can generally view areas far away from reservoirs as nutrient-deprived stress regions. Of course, the flow of f and s outside reservoirs is dynamic and will likely be shaped by clustering effects.
Altogether, these three phases operating in a three-layer lattice environment comprise a spatially resolved cellular automaton model of two competing strains, whose interaction is mediated by two chemicals. The particular parameters used depend on what pair of players and what environment we are modeling.
We inoculate the swimmable lattice with wild-type and GASP E. coli cells at a low concentration. We sprinkle the permeable region with small initial food concentrations so that populations in desert regions do not immediately die off. We then run this setup for 150 generations to catch any long-term patterns that might form. In order to investigate the effect of competition, we run two separate control simulations where the initial lattices are populated with either only wild-type or only GASP cells. Fig. 3 shows three snapshots of the development of the country at different generation numbers.
As we had seen in previous experiments on competing cooperators and cheaters, 13 it is not true that the cheaters simply out-compete the cooperators and take over the region: remember that the cheaters need the cooperators for detritus, and that the cooperators need the cheaters to remove the toxic waste. There is a yin-yang aspect to this game. We have computed the numbers of cooperators and cheaters versus generation; these are plotted in Fig. 4.
Both competing strains show a distinct fitness advantage over their control counterparts, as indicated by the red and green arrows in Fig. 4. We also notice a roughness in the competitors' growth curves, while the control growth curves appear smooth. The locations of these colonies are unpredictable because the initial seeds are randomly distributed, and only a fraction of those survive and grow into colonies.
Looking closely, we find that this roughness sets in only after about 40 generations, while the initial growth phases of the two strains are relatively smooth. Looking at the simulation montage in Fig. 3 at t = 40, we see that by this time GASP colonies have formed around the Mississippi river and the East coast ports, so there is a high degree of interaction between wild-type and GASP patches. At t = 22, on the other hand, the map is dominated by wild-type cells and the growth curve is smoother and more predictable, like the control curves, which follow gentle oscillations. After about t = 100, the local population patches look more or less the same as they do at t = 150, and the total populations of the two competing strains seem to undergo slight fluctuations together, as if they were in sync. The patch patterns around the reservoirs at t = 150 seem to provide the two competitors with robust populations that are about 80% larger than those of their lonesome control counterparts. We find that the stationary patch configurations that emerge, and that give mutually advantageous outcomes, are difficult to predict.
With heterogeneous colony formations in the low GASP inoculations, the game-theoretic assumption of uniform collisions breaks down, and the concept of a global fitness f(t) fails to adequately describe local interactive dynamics. Furthermore, the population dynamics of our North American landscape topology demonstrate the mutual fitness benefits arising from patch formations around heterogeneous reservoir distributions.

VI. CONCLUSIONS
A global fitness approach clearly falls short in capturing the spatially resolved dynamics observed here. The use of a game-theoretic matrix to assign rewards and punishments within the context of a complex fitness landscape gives rise to a surprising heterogeneity and mutual improvement in fitness for seemingly competitive strains. As is true of the work of Nowak and May 4 and Kerr et al., 16 the nearest neighbor reproduction process is combined with a localized prisoner's dilemma criterion for fitness. That is, fitness was determined locally by neighborhood interactions, rather than globally by a uniform fitness, in a hybridized spatially resolved prisoner's dilemma. Their studies confirm, as we have seen, that the more locally fitnesses are determined, the more cooperative the outcome, while the more globally fitnesses are determined, the more zero-sum the outcome.
In light of the cooperative dynamics observed in experiments on competition in heterogeneous micro-habitats, as well as these spatially resolved results, it seems clear that any game-theoretic approach to modeling bacterial competition requires localized fitness determination.
We can conclude that certain old conceptual frameworks about ecological competition and environmental stress response produce obsolete models, lacking essential characteristics of bacterial growth dynamics. Spatially resolved simulations have provided us with detailed perspectives into the subtler aspects of wild type and growth advantage mutant E. coli. Some of these details reveal novel cooperative patterns and stress gradient responses that their counterpart models fall short of. They moreover provide insights into modifying conventional models to account for these otherwise unanticipated dynamics.
While this group of simulations was inspired by several different lab experiments, the spatially resolved model can in turn motivate new micro-fabricated experiments to test its predictions. One of the most intriguing possibilities is the exchange of genetic material between individuals, a process called horizontal gene transfer. 17 Horizontal gene transfer opens up new ways for the game-theoretic dynamics to change, as the individuals can now change their strategies from cheaters to cooperators. Another aspect, which has been emphasized in our recent paper, 18 is the evolution of new strategies by the players through genetic evolution in response to the stress of the local environment. Hopefully, a combined dialogue between spatial models and lab bacterial experiments will continue to enrich our growing insight into small-scale evolutionary dynamics. In the longer term, the purpose of these simulations and experiments is to understand the emergence of resistance in tumors and how the combination of cancer cells and stromal cells, which act like the cheaters and cooperators in our modeling, can end up with enhanced fitness.