Dynamical modeling and analysis of large cellular regulatory networks

D. Bérenguier, C. Chaouiya, a) P.T. Monteiro, 3 A. Naldi, E. Remy, b) D.Thieffry, c) and L.Tichit 6 Institut de Mathématiques de Luminy, Marseille, France Instituto Gulbenkian de Ciência, Oeiras, Portugal Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento (INESC-ID), Lisboa, Portugal Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland Institut de Biologie de l’Ecole Normale Supérieure, Paris, France Aix-Marseille University, Marseille, France


I. INTRODUCTION
To model biological regulatory networks, we rely on the logical approach initially introduced by Thomas, where the interactions between the components of a network are formalized in terms of discrete variables, functions, and parameters. 10,37,38 This modeling approach has proved effective in its application to a variety of regulatory and signaling networks, from yeast cell cycle control 15 to T lymphocyte differentiation. 27 The logical modeling approach has been implemented into the software GINsim, which enables the definition of logical regulatory graphs and provides a number of original functionalities. These include the construction of synchronous or asynchronous state transition graphs (STGs) that represent model dynamical behaviors, along with algorithms enabling the determination of all logical stable states and the analysis of the roles of regulatory circuits. 9,26 However, when focusing on transient aspects of the dynamics or on the reachability of the attractors from specific initial conditions, we are facing the recurrent combinatorial explosion inherent in these models: the size of the state space grows exponentially with the number of regulatory components involved in the model. This problem is particularly acute in the case of asynchronous, non-deterministic updating mode, which is usually more biologically realistic than the simpler, deterministic synchronous updating mode. 17,37 Here, we present an overview of three main complementary strategies to cope with this combinatorial explosion.
The first strategy consists in reducing the model before performing simulations or other kinds of analysis. 29 The second strategy simplifies the state transition graphs by forcing choices between alternative transitions; this can be achieved by defining priority (a/synchronous) transition classes, which are similar to time-scale based assumptions often used to simplify the dynamical analysis of ordinary differential equation (ODE) models. 14 The third, novel strategy consists in compressing the state transition graph into a novel graph representation, called hierarchical transition graph (HTG), which keeps track of attractors and their basins of attraction, as well as of transient oscillatory properties; here, we further propose an algorithm for the construction of this hierarchical graph. We also show that this method can be used in combination with the two aforementioned approaches to get insights into the dynamics of complex logical regulatory graphs.
In addition, model-checking approaches rely on symbolic representations of the dynamics, exploring only the necessary state space required for the verification of properties expressed as temporal logic formulas.
Section II introduces the basics of the multilevel logical formalism and provides an overview of selected methods enabling the analysis of the dynamics of large logical regulatory networks.
The definition of hierarchical transition graphs is at the core of Section III, referring to the relatively simple example of the bacteriophage lambda core regulatory network. Section IV takes advantage of a recent comprehensive model of the regulatory network controlling T-helper cell differentiation in response to antigen presentation and to a dozen cytokines 27 to illustrate the power of the compression of state transition graphs into hierarchical transition graphs, as well as the insights gained into the corresponding logical dynamical behavior.
Finally, Section V proposes some global conclusions and discusses current challenges and further prospects.

II. LOGICAL MODELING AND ANALYSIS OF REGULATORY GRAPHS
This section introduces the basics of the logical formalism and presents a short overview of existing methods that enhance the dynamical analysis of logical models.

A. Logical regulatory graphs
A logical model is defined by an interaction graph where the nodes denote regulatory components (genes, proteins, etc.) and the arcs denote regulatory interactions. Moreover, a discrete variable is associated with each component, accounting for its level of activity (or expression). Logical functions define the dynamical evolution of the model. Definition 1. A Logical Regulatory Graph R ¼ ðG; KÞ is a graph, where ;n is the set of n regulatory components.
Each component g i is associated with a discrete variable s i in D i ¼ f0; …; max i g. A state is thus defined as a vector s 2 S ¼ P g j 2G D j . • K ¼ fK i g i¼1;…;n is the set of logical functions; K i : S ! D i defines for each state, the target level of g i .
The arcs are deduced from the functions in K; there is a regulatory interaction from g i to g j iff there are two states s and s 0 , differing only by the value of g i , that lead to different values of K i 9s; s 0 2 S s 0 k ¼ s k 8k 6 ¼ i; and s i 6 ¼ s 0 i ; s:t: K j ðsÞ 6 ¼ K j ðs 0 Þ: Figure 1 illustrates this definition with a logical regulatory graph for the bacteriophage lambda switch.
The dynamics of logical models are represented in the form of state transition graphs as defined in Subsection III B.

B. State transition graphs
Definition 2. Given a logical regulatory graph R ¼ ðG; KÞ, its (full) STG, denoted by E ¼ ðS; TÞ, is a directed graph with: • S the state space of R : S ¼ P g j 2G D j , • T : S 2 ! f0;1g the transition function: there is an arc connecting a state s to its successor s 0 whenever Tðs;s 0 Þ ¼ 1.
The transition function is defined according to an updating policy, which indicates the components to be updated in each transition. Here, for sake of brevity, we only consider the asynchronous updating policy (Definition 3). All results could be extended to other updating policies (including mixed (a) synchronous priority classes as presented in Sec. II C 2).
Definition 3. Given a logical regulatory graph R ¼ ðG; KÞ, the transition function defined according to the asynchronous updating policy specifies, for each state s, its successor states (as many as the components called to update in s): 8ðs; s 0 Þ 2 S 2 , Tðs; s 0 Þ ¼ 1; if 9g i 2 G s:t: K i ðsÞ 6 ¼ s i ; À s i j K i ðsÞ À s i and 8g j 6 ¼ g i ; s 0 j ¼ s j 0; otherwise:  35 Left: the interaction graph, with the four components CI, Cro, CII, and N. Right: the logical functions, s denoting the vector ðs CI ; s Cro ; s CII ; s N Þ of the component levels. For legibility, we rewrite each rule in terms of logical variables, e.g., CI denotes an interaction going out CI with a threshold 1, and it is true whenever s CI ! 1, while CI : 2 denotes an interaction going out CI with a threshold 2; it is true whenever s CI ! 2. Here, for each component, we provide the rule(s) leading to a non-zero value of the logical function (meaning that when none of these conditions is fulfilled, the value is 0). For instance, the rule for K CI ðsÞ ¼ 2 is satisfied for 30 states (those such that s Cro ¼ 0 or s CII ¼ 1); for all other states, CI's target value is 0. Note that values 1 of CI and Cro are always transient for this set of rules.
Note that updates are performed stepwise and thus transitions connect neighbouring states (i.e., their Hamming distance is 1).
We are often interested in a sub-graph of the full STG, which is generated considering a (set of) initial state(s). Then, the property of interest relates to the attractors reachable from this (set of) initial condition(s). Figure 2(a) displays the STG obtained for the phage lambda model, starting from a state where all variables are set to zero. Attractors, which denote asymptotical behaviors, are defined in a STG as its terminal strongly connected components (SCCs). Recall that strongly connected components are defined as the maximal strongly connected subgraphs (i.e., there is a path from each node to every other node). 5,13 Given E ¼ðS;TÞ a STG, we introduce further notation: • Scc is the set of the SCCs of E; • 8s; s 0 2 S; s s 0 means that there exists a path from s to s 0 (we consider that a sequence of a unique state forms a path of length 0, hence s s); • 8s 2 S; 8C 2 Scc; s C means that there exists a path from s to any state s 0 2 C; • S is the set of the trivial SCCs (i.e., reduced to a single node) S ¼ ffsg 2 Scc; s 2 Sg; • C is the set of the complex SCCs: C ¼ fC 2 Scc; #C ! 2g (or C ¼ SccnS); • The sign Ã denotes terminal elements of Scc that will be referred to as attractors: C Ã is terminal iff 8s 2 C Ã ; 8s 0 2 S; Tðs; s 0 Þ ¼ 1 ) s 0 2 C Ã . The non-terminal components are transient. In addition, C Ã (resp. S Ã ) denotes the set of the complex attractors (resp. the set of stable states).
Definition 4. Let A Ã be an attractor, we define B A Ã the basin of attraction of A Ã : B A Ã ¼ fs 2 S; s A Ã g. We further define B A Ã , the strict basin of attraction of A Ã , B A Ã ¼ fs 2 B A Ã s:t: 8X Ã 2 ðC Ã [ S Ã ÞnfA Ã g; s 6 2 B X Ã g: Hence, A Ã can be reached from any state in B A Ã or in B A Ã ; no other attractor can be reached from any state in B A Ã .

C. Coping with large dynamics
Given a logical regulatory graph, the associated state space has Q g i 2G jD i j elements (i.e., 2 jGj , in the case of Boolean variables), meaning that its size grows exponentially with the number of regulatory components. Most properties are thus NP-complete, but one can mitigate this problem by lessening the size of the search space.
Here, we briefly review strategies to ease the analysis of large dynamics. A first approach consists in reducing the model, while ensuring the preservation of key properties. Another strategy lessens the number of transitions of the STG (hence simplifying the dynamics) assigning priorities to updating calls, relying on biologically well-founded assumptions. Other methods enable the reduction of the size of a STG, either by compacting it without losing any information, by applying appropriate reductions, or by considering alternate representations. Finally, we end this section with a short discussion on model-checking applied to multilevel logical models.

Model reduction
A first strategy to reduce the complexity of a model is to reduce its size, by removing some components. This is often done manually by the modeler, defining direct interactions even when it is known that the regulatory effects involve intermediate components. Obviously, by lessening the number of components, such reductions lead to smaller state spaces and hence simplified dynamics.
We have proposed to automate such model reductions and characterized their impact on the dynamics. 29 Basically, the reduction of a component amounts to attribute its regulatory role to its own regulators and to modify the logical function driving the behavior of its targets accordingly. The reduction of a self-regulated component is forbidden for it would not fit this rationale and would change the dynamical properties of the model. Indeed, we could prove that, provided this restriction on self-regulated components, the stable states and elementary terminal cycles of a reduced model exactly correspond to those of the original model. Moreover, the reduced model displays at least as many complex attractors, some corresponding to complex attractors of the original model, while others correspond to original transient oscillatory behaviors. In short, the main property of the proposed reduction is that it may suppress some transitions or paths, but never generates new ones.
Considering signaling networks including non-regulated input components, which usually account for external stimuli, Saadatpour et al. recently proposed to reduce input cascades that stabilize under constant input conditions. 32 This reduction has obviously no impact on the number and nature of the attractors, although it might change their reachability. Similarly, one can ignore (pseudo-)output components that have no outgoing interactions or that only regulate (pseudo-)output components. 28 Indeed, such output cascades have no impact, neither on the number and nature of the attractors, nor on their reachability.

Priority classes
Asynchronous state transitions graphs can be sometimes simplified by reducing the number of transitions, using relatively simple temporal assumptions. Indeed, in all states, the asynchronous scheme defines as many transitions as the number of components called to update, thus potentially generating spurious trajectories. A number of these can be ignored by defining priority classes ranking updating calls. 14 When two calls with distinct priority ranks are enabled in a state, the one with the lowest rank is discarded. Updatings belonging to the same class can be treated synchronously or asynchronously. In GINsim, it is thus possible to partition component updatings into distinct classes that implement such a priority scheme. 9,26 Needless to say, priority classes should be biologically well-founded to ensure that discarded trajectories are indeed irrelevant.

Lessening the size of the state transition graph
Several studies have addressed the problem of the combinatorial explosion of the state space of asynchronous transition systems.
Given a STG, an informative view of the dynamics is provided by the graph of its SCCs, where each node accounts for one SCC (possibly keeping the information of the states it encompasses). The resulting graph is a directed acyclic graph, which is often much smaller than the original STG, yet keeping all the reachability information (see Figures 2(a) and 2(b)). Tarjan defined an efficient algorithm to compute the SCCs of a directed graph (linear in the number of nodes and arcs). 34 Tournier and Chaves 39 have already applied SCC decomposition to STGs. However, SCC compaction remains limited in the case of networks with long or numerous regulatory cascades, which give rise to multiple linear (although potentially branching) pathways in the resulting STGs.
Another approach that keeps the whole STG structure applies to models that encompass a significant number of input components. These account for external stimuli (e.g., environmental cues) and the corresponding variables are generally maintained constant. In this case, the STGs corresponding to different combinations of input values are disconnected. Input components may also be considered as "uncontrolled" variables, which are allowed to freely vary at each time step. A natural reduction consists in projecting the state space on the set of internal components and labelling each transition with the values of the input components that enable that transition (for more details, see Ref. 24).
Other strategies, mainly developed by the formal verification community, reduce the state space yet ensuring that truth values of (linear) temporal logic formulas are preserved. This is the case of partial-order reduction methods that basically consist in identifying, for each state, a subset of transitions to explore (hence not exploring all the successors). Alternative (rather similar) definitions of these sets have been proposed, called stubborn, ample, or persistent sets. 12,19 Relying on the Petri net representation of logical regulatory graphs 8 and using Petri net tools (e.g., TINA 6 ), we have recently applied such a partial-order reduction to check reachability properties on a large logical model (encompassing 72 regulatory components). For this specific model, due to the structure of its dynamics, partial-order reduction proved to be poorly effective. However, there is certainly room for improvements of these methods, 16 and further work might identify a class of logical graphs more amenable to this kind of reductions.

Model-checking
During the recent years, formal verification techniques based on model-checking have been successfully applied to the analysis of molecular network models. 4,7,24,25 This approach is directly applicable to the verification of logical regulatory graphs, which constitute a class of finite state systems. In general, experimentally observed biological behaviors can be expressed in terms of temporal logic statements, and model-checking algorithms used to automatically verify if a model satisfies these statements.
When using explicit representations of states and transitions, model-checking may use partial-order reduction to lessen the size of the search space. However, symbolic model-checking relies on implicit representations, scaling better for large models. The choice of the temporal logic depends on the type of property to be checked. 12 Here, we are mainly interested in attractor reachability from a (set of) initial condition(s) as well as in the conditions enabling such trajectories. This supposes a previous characterization of the attractors, among which the stable states can be efficiently identified beforehand. 9 GINsim includes an export converting logical models into NuSMV symbolic descriptions. 24 NuSMV is a symbolic model-checking tool capable of verifying finite state machines against a set of property requirements, expressed as temporal logic formulas. 11 This export supports the definition of priority classes and takes advantage of the reduction over input components evoked in Sec. II C 3, these being specified either as constant or as freely varying variables. Noteworthy, in the case of varying inputs, the notion of stable states needs to be extended: a state may be stable for given values of input components, and not for others. 28 For models with input components, it is thus possible to analyze switches between cellular types (i.e., stable states) and verify the corresponding input component variations (see Sec. IV and Figure 6).

III. HIERARCHICAL TRANSITION GRAPH
This section deals with the definition and properties of a novel, compact hierarchical graph, where a set of states is shrunken into a single node, whenever it forms a Strongly Connected Component (SCC), or a (set of) linear chain(s) leading to the same set of SCCs and attractors. Compared to the SCC graph mentioned above, this graph generally corresponds to a further reduction of the State Transition Graph (STG). Furthermore, the resulting grouping of states greatly eases the interpretation of the structure of the dynamics in terms of basins of attraction.

A. Definitions
Let us first define the application r that associates to each SCC C, the set of SCCs, complex or terminal, that are reachable from C, including C itself if it is complex or terminal Each complex SCC of the STG is contracted to a single node in the HTG. Similarly, a single HTG node accounts for all trivial SCCs sharing the same r-image. Figure 2 provides an illustration of the HTG construction for the lambda phage model.

B. Properties
For two components C; C 0 2 C [ I [ S Ã , the notation C H C 0 indicates the existence of a path from C to C 0 in the HTG. The following property relates paths in the STG to paths in the corresponding HTG. Property 1.

A path connecting any HTG component to a nonirreversible component implies the existence of a path in the corresponding STG
8C 2 C [ I; C 0 2 C; C H C 0 ) s s 0 8s 2 C; 8s 0 2 C 0 :

2.
A path between two states in the STG implies the existence of a path between the HTG components they belong to 8s; s 0 2 S; s s 0 ) C H C 0 ; with s 2 C; s 0 2 C 0 : Proof.
1. Let C H C 0 , with C 2 C [ I and C 0 2 C [ S Ã . Then, C 0 2 rðCÞ : 8s 2 C; s C 0 and the first item of Property 1 is proved by definition. 2. Let s; s 0 2 S s:t: s s 0 , and denote C and C 0 the components of the HTG, such that s 2 C and s 0 2 C 0 .
• If C ¼ C 0 , the statement is obviously true. • If C 6 ¼ C 0 , let ðs ¼ s 1 ; s 2 ; …; s k ¼ s 0 Þ be the path from s to s 0 in the STG: 8i ¼ 1; …; k À 1; s i 2 S and Tðs i ; Hence, following the path s s 0 , we obtain that C H C 0 .
Remark 1. Property 1 does not ensure equivalence of path existence in STG and related HTG. Indeed, in item 1, we have the restriction that C 0 6 2 I: when C H C 0 , with C 0 2 I, given s 2 C, we cannot ensure the existence of a path in the STG from s to a state s 0 2 C 0 .
In Figure 2, we have such a situation, where C H C 0 , with C 0 2 I and 9s 2 C s.t., there is no path in the STG from s to any state in C 0 . Indeed, considering Figure 2, panel C, the irreversible component i#7 contains the state 1000 (see panel B), and the arc from i#7 to i#3 indicates that there exists a state s 2 i#7 and a state s 0 2 i#3 such that s s 0 (e.g., T(1011, 2011) ¼ 1, in panel A or B). However, there is no path from state 1000 to any state of i#3 as illustrated in panel D. Another typical situation for which we have C H C 0 ; C 0 2 I, and no path in the STG from s 2 C to s 0 2 C 0 , may arise when a hierarchical (irreversible) component contains disconnected states.
We propose an algorithm to generate HTGs of logical regulatory graphs, given a (set of) initial condition(s). Described in the supplementary file, 43 this algorithm is based on Tarjan's method 34 and compacts a STG on-the-fly.

C. Basins of attraction
A classical way to study the dynamics is to focus on attractors and their basins of attraction (cf. Definition 4). When using the synchronous dynamics, their computation is facilitated by the fact that all states have at most one successor (for more details, see Ref. 42). But in the case of concurrent behavior, it is computationally much more costly (see Refs. 1, 39, and 41 for the fully asynchronous case).
By construction, the HTG nodes group together states of the STG and thus allow to easily recover the basins of attractions. Indeed, given A Ã an attractor in the STG, and C 2 C [ I [ S Ã a node of the HTG, the states of C are in B A Ã , the basin of attraction of A Ã , iff A Ã 2 rðCÞ. The states of C are in B A Ã , the strict basin of attraction of A Ã , iff rðCÞ \ ðC Ã [ S Ã Þ ¼ fA Ã g. Hence, for all attractors, it is much easier to identify their basins of attraction on the basis of the HTG.

IV. APPLICATION TO Th CELL DIFFERENTIATION
T-helper lymphocytes play a key role in the regulation of the immune response in vertebrate. Various T-helper subtypes (Th1, Th2, Th17, Treg) have been identified over the years, characterized by the expression of specific transcription factors and cytokines, which have a critical influence on the selection of specific immune responses, driving proinflammatory or allergic responses, promoting alternative antibody classes, or yet preventing (auto)immunity by inhibiting the activation and proliferation of other cells.
Several modeling studies have been proposed to shed light on the regulatory network controlling T-helper cell activation and differentiation (see, e.g., Refs. 20, 21, 23, 33, 40 and references therein).
To gain insight into the heterogeneity and the plasticity of late T-helper lineages, we have recently built an integrated logical model of the core regulatory network and main signaling pathways controlling Th cell differentiation 27 ( Figure  3). Encompassing 65 components (including 13 inputs, corresponding to antigen presentation and a dozen different cytokines), this multilevel logical model proved to be too complex to be straightforwardly simulated. This situation motivated the development of the reduction method mentioned above. 27,29 In the case of our T-helper model, we were led to hide 31 components (shown in grey in Figure 3) and thus obtained a more compact model encompassing 34 nodes (including the same 13 inputs).
Using the resulting reduced T-helper model, we have performed a series of simulations to assess the effects of heterogeneous environments on Th cell differentiation. This led us to identify stable states corresponding to canonical Th1, Th2, Th17, and Treg subtypes, but also to hybrid cell types co-expressing combinations of Th1, Th2, Treg, and Th17 markers in an environment-dependent fashion.
Here, we apply the HTG construction to this reduced model in order to demonstrate how the dynamics can be compressed in a meaningful way, emphasizing the structure of the underlying STG, as well as crucial decision points along dynamical pathways. In this respect, we have selected a limited number of simulations leading to STGs of increasing complexity. Figure 4 displays the HTGs obtained when simulating a naive T-helper cell stimulated by an antigen presenting cells in the presence of IL2 alone, or in the presence of pro-Th1, Th2, Treg, or Th17 cytokines. In all cases but the last one, we obtain a unique stable state corresponding to the expected cellular state (activated Th0, Th1, Th2, or Treg). In each of these HTGs, all other states reachable from the initial conditions are grouped together into a single irreversible transient component, encompassing between 25 and 255 states. The label associated with each arc denotes the ultimate elementary transitions going out the HTG node. In contrast, in pro-Th17 conditions, the system can reach two different stable states expressing Th17 transcription factor RORGT, IL10, IL21, and IL23, one expressing also FOXP3, the other expressing IL2. From the arc labels, it follows that the selection between these two stable states depends on the concurrent activation of RORGT and FOXP3. Th17 environment, i.e., in the presence of IL4, IL6, TGFB, and in the absence of IL2. The resulting HTG merely comprises 13 nodes, to be contrasted with the 1146 states of the corresponding STG. Furthermore, the HTG structure emphasizes the progressive commitment of cells when following paths from the root to the leaves (stable states). The states encompassed by other nodes belong to two or more basins of attraction. Note that the system can reach four stable states, more precisely two pairs of activated versus anergic Th2 RORGTþ subtypes. Within each of these pairs, the stable states differ by the expression level of FOXP3. The labels associated with the arcs clearly emphasize the transitions implementing differentiation decisions. As illustrated in Figure 5 Th1), or with IL4 þ IL6 (pro Th2), or with TGFB (pro Treg), or with TGFB and IL6 (pro Th17), from top to bottom and left to right. All other nodes are set to zero at the initial state. In the notation of the logical stable states (prefixed by "ss-"), the node order considered starts with APC, followed by the external input cytokines IFNB, INFG, IL2, IL4, IL6, IL10, IL12, IL15, IL21, IL23, IL27, TGFB, followed by the receptor components IL2R and IL2RA, and then by the cytokines produced by the Th cell considered INFG, IL2, IL4, IL10, IL21, IL23, TGFB, then the transcription and signal transduction factors TBET, GATA3, FOXP3, NFAT, STAT1, STAT3, STAT4, STAT5, STAT6, followed by the proliferation node, followed by RORGT and IL17. Arc labels indicate transitions (regulatory component updates) driving the system out of an HTG node toward another one.
FIG. 5. Compressed representation (HTG) of the asynchronous simulation of Th0 in the presence of APC signaling þ IL4 þ IL6 þ TGFB (combination of pro Th2 and Th17 cytokines, in the absence of IL2). Four stable states can be reached: two pairs of activated versus anergic Th2 RORGTþ cells, differing by the expression of FOXP3. The bottom part shows the HTG obtained for the same initial conditions but using two asynchronous priority classes. In this configuration, transitions involving IL2R, NFAT, or any of the STAT factor are selected against those involving any other component. In contrast with the results obtained without prioritization, only two stable states can be reached, both corresponding to anergic Th2 RORGTþ cells, which differ by the expression of FOXP3. The labels associated with the arcs emphasize the crucial transitions underlying the choice of one or the other differentiation state.
A thorough discussion of the biological significance of these observations would go beyond the scope of this article. However, these examples demonstrate the compression and clarification of asynchronous simulations that can be achieved using the HTG representation.
Finally, Figure 6 displays the reachability analysis between cellular types through the use of model-checking. For this, we considered the environmental conditions defined by specific valuations of the 13 input components (see Figure 5 of the original study, i.e., in Ref. 27) and the stable expression patterns (see Figure 6 in Ref. 27). These stable states correspond to cellular subtypes, which are stable under specific environmental conditions. So, for each input combination, we verify the existence of a direct path between each possible pair of cellular types. More precisely, we check whether there is a fixed valuation of inputs such that there is a direct path between two cellular subtypes, C1 and C2 (without going through other cellular subtypes) and C2 being stable. Using this approach, we could reproduce the results obtained at the cost of extensive simulations in the original study (Figure 7 in Ref. 27). Three main groups are defined over the Th cell subtypes (Th0, Th1, and Th2, see Figure 8 in Ref. 27). We could also verify that Th1 and Th2 subtypes can never switch back to a Th0 one, even when inputs are allowed to vary freely. However, in such case, switches between all cellular subtypes are possible within each group.

V. CONCLUSION AND PROSPECTS
HTGs emphasize relevant transient and asymptotic dynamical properties. We have defined a novel algorithm enabling the compaction of state transition graphs and the generation of HTGs on-the-fly. This approach has been implemented into a development version of the software GINsim, available as a pre-release. 30 We have applied this approach to a comprehensive model for T-helper cell differentiation. Although this model still needs to be further refined and tested, the analysis presented here clearly demonstrates the assets of the HTG representation, which leads to significant graph compression and clearly emphasizes the organization of the state space into attractors and basins of attraction.
Interestingly, applying our algorithm for HTG construction onto a HTG produces a further compacted graph-based representation of the dynamics, where the nodes correspond to basins of attraction.
Should a given dynamics be too large and complex to be effectively compacted using the HTG representation, we can rely on complementary methods presented in this manuscript. These methods aim at reducing the size of the search space, including the model reduction method that preserves key dynamical properties and the definition of transition priority classes, relying on biologically well-founded assumptions. Moreover, we have presented model-checking techniques to analyse reachability properties. Used jointly, these methods enable the dynamical analysis of logical models of unprecedented sizes.
It is worth noting that the HTG structure could be considered in the context of other formal approaches relying on state transition graphs, including Petri nets (see, e.g., Ref. 8 and references therein) and piecewise-linear differential equation (PLDE) models. 3,17 Model-checking techniques also apply to these models, once their dynamics can be represented by Kripke structures. 12 Our model reduction could be applied to PLDE models, but its impact on the dynamics still needs to be clarified.
HTG construction could be optimized and improved, e.g., using parallel algorithms. Although depth-first search algorithms are known to be difficult to parallelize, 31 different methods have been proposed to tackle this problem. 2 Further analysis relying on HTG structures should allow the assessment of finer properties. For instance, some wellestablished rules (Thomas' rules 36 ) assert that differentiation (resp. homeostasis) phenomena lean on the action of a positive (resp. negative) circuit in the regulatory graphs. In practice, circuit functionality analysis often points to combinations of intertwined circuits, which are difficult to analyze. HTGs appear particularly well suited to the dynamical analysis of complex networks endowed with differentiation properties (i.e., presenting multiple alternative stable states, which can be all reached from given initial conditions), as they capture the general organization of the corresponding STGs. Hence, based on the analysis of HTG structures, we should be able to identify the circuits at the core of cell commitment and thereby focus on the genes responsible for irreversible decisions.
An alternative strategy to analyze large regulatory networks takes advantage of their modularity. Recently, we have defined a compositional framework that relies on process algebra to incrementally compose, abstract, and minimize (using the safety equivalence) logical regulatory modules, enabling impressive reductions of the dynamics. 22 However, as proper methods to decompose large networks FIG. 6. Model-checking reachability analysis between cellular types under fixed input conditions using NuSMV. 11 Nodes represent cellular subtypes (see Figure 6 in Ref. 27), whereas arrows represent the existence of a direct path between two cellular types under a specific fixed environmental condition (see Figure 5 in Ref. 27). For simplicity, reachability analysis has been truncated to the Th0 subtypes, discarding the Th1 and Th2 subtypes. into functional modules are still lacking, we have focussed on regulatory networks that encompass several identical modules connected by inter-cellular signaling. In this respect, one could take advantage of HTG structures to come up with relevant model decomposition.