Protein-Polymer Mixtures in the Colloid Limit: Aggregation, Sedimentation and Crystallization

While proteins have been treated as particles with a spherically symmetric interaction, in reality the situation is rather more complex. A simple step towards higher complexity is to treat the proteins as non-spherical particles, and that is the approach we pursue here. We investigate the phase behavior of enhanced green fluorescent protein (eGFP) under the addition of a non-adsorbing polymer, polyethylene glycol (PEG). From small angle x-ray scattering we infer that the eGFP undergoes dimerization and we treat the dimers as spherocylinders with aspect ratio $L/D-1 = 1.05$. Despite the complex nature of the proteins, we find that the phase behaviour is similar to that of hard spherocylinders with ideal polymer depletant, exhibiting aggregation and, in a small region of the phase diagram, crystallization. By comparing our measurements of the onset of aggregation with predictions for hard colloids and ideal polymers [S.V. Savenko and M. Dijkstra, J. Chem. Phys. 124, 234902 (2006) and F. lo Verso et al., Phys. Rev. E 73, 061407 (2006)] we find good agreement, which suggests that the behavior of the eGFP is consistent with that of hard spherocylinders with ideal polymer.


a) Electronic mail: paddy.royall@espci.fr.

I. INTRODUCTION
Protein aggregation and crystallization have important consequences for determining protein structure and function, and for understanding pivotal challenges ranging from condensation diseases 1-3 to the development of new materials. 4 Controlling the assembly of proteins into states in which their functionality can be exploited is crucial to fully realise their potential. Predictions can be made based on phase diagrams, with detailed conditions such as protein concentration, temperature, pH, etc. [5][6][7] Crystallization is one of the most complex and least understood topics in biology. Protein crystallization is fundamental to obtaining protein structure and gaining insight into protein function. However, crystallization protocols are mainly based on trial-and-error assays, with a lack of standardized approaches. In fact, on average only 0.04% of crystallization experiments yield good quality crystals. 8 This is due in part to inherent protein shape and surface complexity as well as the dependence of protein-protein interactions on combinations of pH, temperature and precipitants (salts and polymers). [8][9][10][11][12][13] Recent developments include an improved understanding of the role of clusters in protein crystallization, 14 the effect of polymers in inducing crystallization, [15][16][17] the effects of salts on crystallization [17][18][19] and crystal growth rate, 20 and the role of temperature in the protein phase diagram. 21,22 An emphasis has been placed on the role of entropy in contact-contact interactions in proteins. 23 Perhaps unsurprisingly, fine manipulation of protein interactions is necessary for self-assembly, 24,25 and if this can be successfully achieved and coupled with protein engineering, it is possible to manipulate proteins to enable new paths of self-assembly. 4,26,27

By contrast, the field of Soft Matter often operates at rather larger lengthscales than the supramolecular lengthscale of proteins.
Yet phenomena exhibited by soft materials have been applied to proteins, with some degree of success. 8,[28][29][30] Among these is the concept, inspired by colloidal systems, of effective interactions between the protein molecules that can be altered by other components in the system such as added salts and polymers. [31][32][33][34] In this way a soft matter perspective offers some insights to understand and quantify protein interactions and their equilibrium phase diagrams by simplified models, which might provide a systematic way to improve protein crystallization. 8,[28][29][30][31] Indeed, a parameter that has been used to relate protein phase behavior [34][35][36][37][38] to that of colloidal suspensions [39][40][41] is the second virial coefficient, which can be determined experimentally from osmotic pressure, 37 static light scattering, dynamic light scattering, SAXS or SANS. [42][43][44][45] While proteins often aggregate at high concentration, some do not, and indeed interesting glassy behaviour reminiscent of hard sphere colloids has been seen for concentrated solutions of eye lens α-crystallin, 43 which opens the potential for further analogies with colloidal systems.
Examples of the insights gained from this approach of comparing proteins to colloidal systems include the prediction of enhanced nucleation rates in the vicinity of a (metastable) critical point, 46,47 which have been realized using colloids with a short-ranged attraction, 48,49 as well as gelation 42,50,51 and so-called liquid-liquid phase separation. 52 It is also possible to control the pathway of crystallization by manipulating the interactions. 53 Analogies have also been made between proteins and colloids with so-called mermaid (short-range attraction, long-range repulsion) interactions, through the discovery of finite-sized clusters, [54][55][56][57][58][59][60][61] although the existence of the protein clusters has been questioned. 62 Meanwhile the colloid systems have been shown to exhibit much more complex behavior than was originally supposed, through a fundamental breakdown in spherical symmetry in the electrostatic repulsions. 41,63 Moreover, simple isotropic models applied to other colloidal systems 19,64 often fall short and fail to fully characterize key protein phenomenology, due to the anisotropic shape, non-uniform surface charge and hydrophobic/hydrophilic patterning of proteins. 8,37,65 This anisotropy is responsible for the directional and localized protein interactions that yield non-close-packed crystal structures as well as directed self-assembly. 66 This more complex behavior can be captured and reproduced to some extent via patchy particle models, where the simulated particles include angular surface directionality of attractive short-range interactions. 12,67,68 By changing the number, size and specificity of said patches, the system can be optimised to fully describe the protein behavior. 27,67,[69][70][71][72][73]

In colloidal systems, an effective attraction between the particles can be induced by adding non-adsorbing polymers. 74,75 This can lead to gelation [75][76][77][78] and enhanced crystallization rates. 48,49 Although they are often smaller than colloids, polymers are typically rather larger than proteins, 33,79 leading to the concept of the protein limit, [80][81][82] where the polymers are so much larger than the proteins that the relevant lengthscale is the intra-polymer persistence length, rather than the polymer radius of gyration that is typically considered in the case of colloid-polymer mixtures. However, here we consider a scenario more akin to colloid-polymer mixtures, where the polymer radius of gyration $R_g$ is smaller than or comparable to the protein radius.
Polymer-induced protein precipitation has been investigated via volume exclusion interactions, i.e. depletion (for high molecular weight polymer). 9,83 However, some polymers can also interact with positively charged amino acids (lysine, arginine and histidine), 84 and/or through hydrophobic chemical interactions (for example with -CH$_2$OCH$_2$- groups), 85 both present on the protein surface. Additionally, there can be a preferential formation of hydrogen bonds between the polymer and water, which in turn enhances protein-protein interactions. 86 These scenarios can lead to more complex interactions than the non-adsorbing polymer-protein depletion picture.
Here, we consider mixtures of enhanced Green Fluorescent Protein (eGFP) and poly-ethylene-glycol (PEG). eGFP readily undergoes dimerization 24 such that the proteins resemble short rods. Drawing on the extensive colloid-polymer literature, 75 we treat the proteins as spherocylinders (with dimensions deduced from x-ray scattering), in particular as mixtures of hard spherocylinders and ideal polymer. 87,88 In this way, we consider a model of the protein-polymer mixture where the only level at which the complexity of the system is treated is via a simplified form for the anisotropy of the protein dimers, namely a spherocylinder. We thus treat the protein dimers as hard spherocylinders and the polymers as ideal polymers.
Here we follow the literature on spherocylinder-polymer mixtures 87,88 and express the aspect ratio as L/D − 1, where L is the spherocylinder length and D is the diameter, so that the aspect ratio of spheres is zero. We interpolate the predictions of the polymer fugacity required for polymer-induced demixing between spherocylinders with aspect ratio L/D − 1 = 5 88 and spheres. 89 Remarkably, given the simplicity of the model, we find good agreement for the geometric parameters of our system with L/D − 1 = 1.05.

This paper is organised as follows. In section II we describe the methods of protein preparation and characterization, as well as the estimation of their interactions as spherocylinders. Section III A covers the phase behavior, including aggregation and crystallization, in salt-screened and salt-free mixtures. We then compare the onset of aggregation with theory in Section III B, where the polymer radius of gyration at each molecular weight is fitted by interpolating predictions for hard colloids and ideal polymers. Finally, a discussion of our findings is presented in Section IV.

A. Estimation of Interactions Between Proteins
Our system is governed by two control parameters: the protein concentration and the polymer concentration. In the context of treating the system in the spirit of a colloid-polymer mixture, we consider the proteins as spherocylinders and thus the volume fraction is

$$\phi_{\rm eGFP} = \rho_{\rm eGFP} V_{\rm eGFP}, \qquad V_{\rm eGFP} = \frac{\pi D^3}{6} + \frac{\pi D^2 (L - D)}{4}, \qquad (1)$$

where $\rho_{\rm eGFP}$ is the protein number density and $V_{\rm eGFP}$ is the volume of a spherocylinder of diameter D and end-to-end length L. We determine the diameter D and length L from x-ray scattering and compare our results to literature values (see section II C). Our choice of spherocylinders is motivated by the literature on colloid-polymer mixtures, for which phase diagrams for hard spherocylinders plus ideal polymer are available. 87,88,90 Our second control parameter, polymer concentration, is expressed as the polymer fugacity $z_{\rm pol}$. We make the significant assumption that the polymers can be treated as an ideal gas (of polymers), and then the fugacity is equal to the polymer number density in a reservoir in thermodynamic equilibrium with the system, $z_{\rm pol} = \rho^{\rm res}_{\rm pol}$.
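The conversion from a protein mass concentration to a volume fraction can be sketched as follows. This is a minimal illustration, assuming a spherocylinder made of a cylinder of length L − D capped by two hemispheres, with D = 4 nm and L = 8.2 nm as quoted in the text. The dimer molar mass of 54 kDa is an assumption (roughly twice the mass of a GFP monomer) and is not taken from the text; the function names are illustrative only.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mole

# ASSUMPTION: dimer molar mass, roughly 2 x 27 kDa; not stated in the text.
ASSUMED_DIMER_MOLAR_MASS = 54_000.0  # g/mol


def spherocylinder_volume_nm3(D_nm: float, L_nm: float) -> float:
    """Volume of a spherocylinder of diameter D and end-to-end length L.

    The body is a cylinder of length (L - D) capped by two hemispheres,
    so that L/D - 1 = 0 recovers a sphere of diameter D.
    """
    if L_nm < D_nm:
        raise ValueError("end-to-end length must be at least the diameter")
    caps = math.pi * D_nm**3 / 6.0
    cylinder = math.pi * D_nm**2 * (L_nm - D_nm) / 4.0
    return caps + cylinder


def volume_fraction(conc_mg_per_mL: float,
                    molar_mass_g_per_mol: float = ASSUMED_DIMER_MOLAR_MASS,
                    D_nm: float = 4.0,
                    L_nm: float = 8.2) -> float:
    """Volume fraction = number density x particle volume."""
    # mg/mL equals g/L; one litre is 1e24 nm^3.
    number_density_per_nm3 = (conc_mg_per_mL / molar_mass_g_per_mol
                              * AVOGADRO / 1e24)
    return number_density_per_nm3 * spherocylinder_volume_nm3(D_nm, L_nm)


if __name__ == "__main__":
    print(f"V = {spherocylinder_volume_nm3(4.0, 8.2):.2f} nm^3")
    print(f"phi(500 mg/mL) = {volume_fraction(500.0):.3f}")
```

Under the assumed molar mass, 500 mg/mL maps to a volume fraction of about 0.48; a different molar mass rescales the result proportionally.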
To compare with predictions from theory and computer simulation, we use the fraction of available volume α to estimate the reservoir polymer number density from that in the experimental system $\rho^{\rm exp}_{\rm p}$, viz. $\rho^{\rm exp}_{\rm p} = \alpha\rho^{\rm res}_{\rm p}$. We use the free-volume approximation for α, 90 which depends on γ = L/D, the length-to-diameter ratio of the spherocylinders, and on the polymer-protein size ratio $q = 2R_g/D$, where $R_g$ is the radius of gyration of the polymers. Below we compare the phase behavior we obtain for our system with literature values for spherocylinder-polymer mixtures. 87,88,90

Our proteins carry an electrostatic charge, which we determine below (section II C). To estimate the electrostatic interactions we used a screened Coulomb (Yukawa) potential. Here, it is convenient to treat the proteins as spheres. We shall see below that although they are not spherical, the electrostatic interactions turn out to be so weak that we believe that to a large extent they can be neglected. Therefore we merely estimate their strength with a spherically-symmetric approximation.
The Yukawa potential is characterised by the contact potential and the inverse Debye length κ, which are set by the charge number Z, the Bjerrum length $\lambda_B$, the number density $\rho^{\rm ion}_i$ of the ith ionic species and the valency $Z^{\rm ion}_i$ of the ith ionic species. For the system with no added salt, the ionic strength $I = \sum_i \rho^{\rm ion}_i (Z^{\rm ion}_i)^2$ was evaluated as the sum of the ion contributions from the weak dissociation of 25 mM HEPES (pKa = 7.66) and the counterion contribution, assuming charge neutrality. Thus, by varying the protein concentration we obtained a range of I = 1-4 mM. For the system where 10 mM NaCl was added, we added to the sum the ion contribution from the dissociation of this salt, giving a range of I = 15-18 mM. Further details of the Yukawa potential are listed in Table I. It is important to highlight, as pointed out by Roosen-Runge and collaborators, 91 that in addition to assuming a spherical shape for the proteins, an isotropic distribution of ions on their surfaces is also assumed. This is not the case for eGFP dimers, thus the charges calculated should only be considered as effective charges suitable to describe the phenomena observed in our experiments. However, the magnitude of the charge that we determine is sufficiently small that, within the DLVO treatment we employ, the electrostatic interactions prove to be very weak, so we believe that at the level of this analysis a spherical approximation is reasonable.
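To illustrate how weak the screened electrostatic repulsion is, the following sketch evaluates a Yukawa potential in the standard DLVO form for spheres, $\beta u(r) = \beta\varepsilon\,\exp[-\kappa(r-D)]/(r/D)$ with $\beta\varepsilon = Z^2\lambda_B/[D(1+\kappa D/2)^2]$ and $\kappa^2 = 4\pi\lambda_B\sum_i \rho^{\rm ion}_i (Z^{\rm ion}_i)^2$. This particular form is an assumption chosen to be consistent with the quantities named in the text. The Bjerrum length of 0.7 nm (water at room temperature) is likewise an assumed value, and the function names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mole

# ASSUMPTION: Bjerrum length of water at room temperature.
BJERRUM_NM = 0.7


def inverse_debye_length_per_nm(sum_rho_z2_mM: float,
                                bjerrum_nm: float = BJERRUM_NM) -> float:
    """kappa from kappa^2 = 4 pi lambda_B sum_i rho_i Z_i^2.

    The argument is the sum over ionic species of concentration times
    valency squared, in mM, i.e. the quantity the text calls I (defined
    there without the conventional factor of 1/2).
    """
    # 1 mM = 1e-3 mol/L, and one litre is 1e24 nm^3.
    sum_rho_z2_per_nm3 = sum_rho_z2_mM * 1e-3 * AVOGADRO / 1e24
    return math.sqrt(4.0 * math.pi * bjerrum_nm * sum_rho_z2_per_nm3)


def contact_potential_kT(Z: float, D_nm: float, kappa_per_nm: float,
                         bjerrum_nm: float = BJERRUM_NM) -> float:
    """Yukawa contact potential in units of k_B T (assumed DLVO form)."""
    return Z**2 * bjerrum_nm / (D_nm * (1.0 + kappa_per_nm * D_nm / 2.0) ** 2)


def yukawa_kT(r_nm: float, Z: float, D_nm: float, kappa_per_nm: float,
              bjerrum_nm: float = BJERRUM_NM) -> float:
    """Screened Coulomb repulsion at centre separation r >= D, in k_B T."""
    eps = contact_potential_kT(Z, D_nm, kappa_per_nm, bjerrum_nm)
    return eps * math.exp(-kappa_per_nm * (r_nm - D_nm)) / (r_nm / D_nm)


if __name__ == "__main__":
    kappa = inverse_debye_length_per_nm(15.0)  # salt-screened case
    eps = contact_potential_kT(Z=1.16, D_nm=4.0, kappa_per_nm=kappa)
    print(f"Debye length = {1.0 / kappa:.2f} nm, contact potential = {eps:.3f} kT")
```

With Z = 1.16 and D = 4 nm the contact potential comes out at roughly a tenth of the thermal energy under these assumptions, in line with the statement that the electrostatic interactions are very weak.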

B. Protein Expression and Purification
Cellular Culture for the Expression of eGFP. A miniculture of competent Escherichia coli BL21 (DE3) previously transformed with the DNA plasmid-pET45b(+)-eGFP was prepared by inoculating 100 mL of lysogeny broth (LB) containing the antibiotic carbenicillin (50 µg/mL) with an isolated E. coli colony. The culture was left to grow overnight (16 h) at 37 °C and 180 rpm. 2 mL of this culture was used to inoculate 1 L of LB containing the same antibiotic, which was left to grow under the same conditions. The optical density (OD$_{600\,\rm nm}$) was monitored until a value of 0.5-0.6 was reached. Then, the production of eGFP was induced by adding 1 mM of isopropyl β-D-1-thiogalactopyranoside (IPTG). After 1 h of induction time, the temperature was changed to 28 °C and the culture was incubated overnight. The cell culture was centrifuged at 4500 g for 15 min at 4 °C. The supernatant obtained was discarded and the pellet was resuspended in a lysis buffer (20 mM imidazole, 300 mM NaCl and 50 mM potassium phosphate at pH 8.0) and stored at -20 °C. 92

Purification and concentration of eGFP. Cell pellets were thawed and kept on ice, sonicated for 3 cycles of 30 seconds (Soniprep 150 plus MSE) and centrifuged at 18000 rpm (Sorvall SS34 rotor) at 4 °C for 30 min. The supernatant was recovered, filtered through a 0.22 µm syringe filter (Millipore) and injected onto a Ni-NTA (nickel-nitrilotriacetic acid) agarose column (Qiagen) connected to an ÄKTA START purification system (GE Healthcare), previously equilibrated with the lysis buffer mentioned above. The bound eGFP was washed with the same lysis buffer to remove the remaining unbound proteins, and eGFP was later eluted with a linear gradient (0-100%) of a 500 mM imidazole, 300 mM NaCl, 50 mM potassium phosphate buffer at pH 8.
The recovered proteins were then further purified through size exclusion chromatography to eliminate aggregates and unfolded protein. A single peak corresponding to a single protein size was collected. eGFP was concentrated to ∼3 mL in a 25 mM Tris-Base 150 mM NaCl buffer at pH 7.4. The proteins were applied to a HiLoad Superdex 75 16/600 size exclusion column using ÄKTA START purification system (GE Healthcare) pre-equilibrated with the same buffer. Protein elution was monitored at 280 nm.
Purified eGFP was filtered through a 0.22 µm syringe filter (Millipore) and concentrated using 30 kDa protein concentrators (ThermoFisher Scientific) at 5000 rpm and 4 °C for the time required to reach the desired volume. The protein concentration was determined by measuring the absorbance at $\lambda_{\rm eGFP}$ = 488 nm with a molar extinction coefficient $\varepsilon_{\rm eGFP}$ = 56000 M$^{-1}$ cm$^{-1}$. 93

Sample preparation. From small-angle X-ray scattering, the purified eGFP was found to form dimers with a length of 8.2 nm and a diameter of 4 nm, as shown in Fig. 1. Thus we treated the eGFP dimers as spherocylinders, each consisting of a cylinder capped by two hemispheres of diameter D = 4 nm, with total length L = 8.2 nm. We changed the protein buffer to 25 mM HEPES at pH = 7.4. A separate buffer solution with 0 or 200 mM NaCl was used as a stock solution to adjust the final protein and salt concentrations.
We carried out experiments with two different polymer sizes, in particular, polyethylene glycol (PEG, Polymer Laboratories) with molecular weights $M_w$ of 620 and 2000. The polymer radius of gyration was estimated from polymer scaling with an empirical prefactor, giving $R^{\rm scale}_g$.

For each sample, we first mixed different volumes of the protein stock solution (106.4 mg/mL) with different volumes of the HEPES buffer with and without salt to make up a fixed volume of 5 µL, giving a range of protein concentrations of 0.7-30 mg/mL and a constant NaCl concentration of 10 mM (for the samples with salt). To induce effective attractions between the protein molecules, we added different amounts of PEG by weight at room temperature such that we obtained a polymer concentration between 0 and 0.8 g cm$^{-3}$ (fugacity ∼ 1-50). The samples were thoroughly shaken for 5 seconds with a touch-vortexer, immediately imbibed into capillaries with an inside diameter of 0.5 mm (CM Science), and sealed with optical adhesive (Norland Optical no 81). Within 5 minutes, the different phases obtained were characterised through laser scanning confocal microscopy (Leica SP8) at an excitation wavelength of 488 nm and an emission wavelength of 509 nm.
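The bookkeeping between polymer size, size ratio and number density can be sketched as below. This is a minimal illustration: the size ratio is taken as q = 2R_g/D with D = 4 nm, the value R_g = 1.64 nm for PEG2000 and the size ratio 0.42 for PEG620 are the scaling estimates quoted later in the text, and the function names are illustrative only. The scaling law for R_g itself is not reproduced here.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def size_ratio(R_g_nm: float, D_nm: float = 4.0) -> float:
    """Polymer-protein size ratio q = 2 R_g / D."""
    return 2.0 * R_g_nm / D_nm


def radius_of_gyration_from_q(q: float, D_nm: float = 4.0) -> float:
    """Invert q = 2 R_g / D to recover R_g in nm."""
    return q * D_nm / 2.0


def polymer_number_density_per_nm3(conc_g_per_cm3: float,
                                   M_w_g_per_mol: float) -> float:
    """Number of polymer chains per nm^3 from a mass concentration."""
    # One cm^3 is 1e21 nm^3.
    return conc_g_per_cm3 / M_w_g_per_mol * AVOGADRO / 1e21


if __name__ == "__main__":
    print(f"q(PEG2000) = {size_ratio(1.64):.2f}")
    print(f"R_g(PEG620) = {radius_of_gyration_from_q(0.42):.2f} nm")
    rho = polymer_number_density_per_nm3(0.8, 620.0)
    print(f"rho(PEG620, 0.8 g/cm^3) = {rho:.3f} nm^-3")
```

These are number densities in the experimental system; relating them to the reservoir density that defines the fugacity additionally requires the free-volume fraction α.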

SAXS analysis
To characterise the size and shape of the expressed and purified eGFP, we performed SAXS measurements on 25 µL of 10 mg/mL eGFP in 25 mM Tris-Base 150 mM NaCl buffer at pH 7.4 on a SAXSLAB Ganesha 300XL instrument. Samples were loaded into 1.5 mm borosilicate glass capillaries (Capillary Tube Supplies UK) and sealed with optical adhesive under UV light (Norland 81). The wavevector range was k = 0.006-0.30 Å$^{-1}$. Background corrections were carried out with both an empty cell and a cell with the buffer only. The obtained data were fitted using the SasView version 4.0 software package. 95 The results are shown in Fig. 1a. The scattering intensity I(k) is given by the product of the form factor P(k) and the static structure factor S(k), with a prefactor that depends on $V_{\rm eGFP}$, the volume of a protein dimer (Eq. 1), and on $\Delta\rho_{\rm scat}$, the difference in scattering length density between the proteins and the solvent. [96][97][98] The scattering data were successfully fitted by a cylindrical form factor 99 with a diameter of 4.0 ± 0.02 nm and a length of 8.2 ± 0.08 nm (see full parameters in Appendix A II). These dimensions are consistent with dimers of eGFP as illustrated in Fig. 1b. These results are in agreement with previous work on eGFP, where it was found that the protein exists in dimers. 100

Electrophoretic Mobility Measurement. We performed electrophoretic mobility $\mu_e$ measurements on 1 mL protein solutions of 2 mg/mL at 20 °C in a 10 mM NaCl solution using a Zetasizer Nano ZS (Malvern, UK) at a detector angle of 13° and a 4 mW 633 nm laser beam to determine the charge of eGFP, following Roosen-Runge et al. 91 Care was taken to match the pH to that of the buffer used in the phase diagram determination. The electrophoretic mobility $\mu_e$ of eGFP under an applied external electric field was determined by electrophoretic light scattering (ELS) via phase analysis light scattering (M3-PALS).
From this we obtained the zeta potential ζ for a spherical particle with diameter D using the Henry equation, in which $\varepsilon_r$ and η are the dielectric constant and the viscosity of the solvent, respectively, and $f(\kappa D/2)$ is the Henry function evaluated at κD/2. The surface charge density σ is then obtained from the reduced zeta potential $\tilde{\zeta} = e\zeta/(2k_BT)$. Finally we can obtain the total charge using $Ze = \pi D^2\sigma$, where Z is the charge number and e is the elementary charge. The measured zeta potential was ζ = -7.02 mV, which corresponds to a charge number of Z = 1.16. We list the parameters for the Yukawa potential in Table I.
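The chain from mobility to charge can be sketched as follows, under clearly stated assumptions. Henry's equation is taken in its usual form $\mu_e = 2\varepsilon_r\varepsilon_0\zeta f(\kappa D/2)/(3\eta)$, the Henry function is evaluated with Ohshima's closed-form approximation, and the charge is estimated with the small-potential (Debye-Hückel) result for a sphere, $\sigma = \varepsilon_r\varepsilon_0\zeta(\kappa + 2/D)$. The last step is a linearised stand-in, not the relation of Roosen-Runge et al. used in the text, so it should only be expected to give the right order of magnitude. The solvent parameters (water at 20 °C) and the Debye length in the usage example are assumed values.

```python
import math

E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F/m

# ASSUMPTIONS: water at 20 degrees C.
EPS_R_WATER = 80.1
ETA_WATER = 1.002e-3         # Pa s


def henry_function(kappa_a: float) -> float:
    """Ohshima's approximation to the Henry function f(kappa a).

    Tends to 1 (Hueckel limit) for kappa a -> 0 and to 1.5
    (Smoluchowski limit) for kappa a -> infinity.
    """
    inner = 1.0 + 2.5 / (kappa_a * (1.0 + 2.0 * math.exp(-kappa_a)))
    return 1.0 + 0.5 * inner ** -3


def mobility_from_zeta(zeta_V: float, kappa_a: float,
                       eps_r: float = EPS_R_WATER,
                       eta: float = ETA_WATER) -> float:
    """Henry's equation: mobility in m^2 V^-1 s^-1."""
    return 2.0 * eps_r * EPS0 * zeta_V * henry_function(kappa_a) / (3.0 * eta)


def zeta_from_mobility(mu_e: float, kappa_a: float,
                       eps_r: float = EPS_R_WATER,
                       eta: float = ETA_WATER) -> float:
    """Invert Henry's equation for the zeta potential in volts."""
    return 3.0 * eta * mu_e / (2.0 * eps_r * EPS0 * henry_function(kappa_a))


def charge_number_linearised(zeta_V: float, D_m: float, kappa_per_m: float,
                             eps_r: float = EPS_R_WATER) -> float:
    """Small-potential estimate of Z from Z e = pi D^2 sigma."""
    sigma = eps_r * EPS0 * zeta_V * (kappa_per_m + 2.0 / D_m)
    return math.pi * D_m**2 * sigma / E_CHARGE


if __name__ == "__main__":
    D = 4.0e-9                 # protein diameter from the text
    kappa = 1.0 / 3.0e-9       # ASSUMPTION: Debye length of 3 nm
    Z = charge_number_linearised(-7.02e-3, D, kappa)
    print(f"linearised charge number estimate: {Z:.2f}")
```

The sign of the estimate follows the sign of the measured zeta potential, and its magnitude is of order unity for Debye lengths of a few nanometres.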

III. RESULTS
We divide our results as follows. First, a phase diagram is presented for the eGFP-PEG620 system, showing the fluid-aggregation transition, in section III A. We then increase the polymer molecular weight to obtain a larger size ratio (using PEG2000), investigating the effects of polymer size on the phase boundary. To check for any residual effects of protein charges, the salt-screened and salt-free systems are compared. In section III B, we consider a spherocylinder-sphere system of L/D − 1 = 1.05. The polymer radius of gyration is fitted by interpolating between theoretical and computer simulation predictions. Finally, the protein crystallization, formed through depletion attractions with polymer, is discussed in section IV.

A. Phase Behavior
Salt-screened system. The phase diagram with different states, as determined from confocal microscopy images, for the eGFP-PEG620 (small polymer) system with 10 mM of added NaCl salt is shown in Fig. 2. The phase diagram is presented in the plane of protein volume fraction $\phi_{\rm eGFP}$ and polymer fugacity $z_{\rm pol}$. The phase boundaries are taken as the average between the fluid and aggregated phase points. Note that in depletion systems, aggregation and gelation are identified with the liquid-gas phase boundary. 40,41,77 Thus while these are non-equilibrium states, comparison with equilibrium phase behaviour is nevertheless highly informative. For protein volume fractions lower than those we tested, a dotted line is drawn based on intuition from the literature. 75 The smaller the concentration of proteins, the higher the polymer concentration needed for phase separation. As noted above, the protein volume fraction is estimated by assuming that the eGFP molecules are spherocylinders of aspect ratio L/D − 1 = 1.05. The polymer fugacity is obtained from the polymer number density. The protein dimensions determined from SAXS (section II C) and the estimated polymer size gave a size ratio $q = 2R^{\rm scale}_g/D \sim 0.4$. As a function of polymer concentration, we first encounter protein solutions where the eGFP appears stable and exhibits no observable aggregation, but instead there is a uniform fluorescent intensity, as the protein dimers are far below the resolution of the microscope (Fig. 2a). Upon increasing the polymer concentration, we see aggregation for polymer fugacity $z_{\rm pol} = 30.0 \pm 1.0$ (at a protein volume fraction around 0.002), shown in Fig. 2c. Now the polymer concentration here is rather high; indeed the polymer volume fraction $\phi_{\rm pol} = 4\pi\rho_{\rm pol}R_g^3/3$ is of order unity. We return to this point below in section IV.
As the protein concentration is increased, the polymer concentration required for aggregation decreases. Upon further increase in polymer concentration, protein aggregates form quickly and become large enough that considerable quantities sediment to the bottom of the sample, where a denser sediment builds up (Fig. 2d). This is reminiscent of aggregation and sedimentation behavior in colloidal systems. 101 In a small region of the phase diagram, we encounter protein crystallization, indicated as green diamonds (see region denoted as "X") in Fig. 2. We note that there is some lack of smoothness in the phase boundary. Such fluctuations in phase boundaries are well known in soft matter systems (see, e.g., 102) and we leave this for further investigation.
Protein crystallization has been related to near-critical behavior. 46 Here, although the regime of crystallization occurs near the aggregation line (which, by itself, might link it to criticality 49,103), the protein volume fraction is vastly lower than any critical isochore that would be expected to occur for this system. Indeed, the volume fraction of the critical isochore for spherical colloids plus polymers with size ratio q ∼ 0.4 is estimated to be at least $\phi_c \gtrsim 0.25$, 89 so it is hard to imagine that critical fluctuations are important here. The lengths of the crystallites that we find are in the range of ∼4-80 µm. Fig. 5b shows pure crystal, while Figs. 5a and c show aggregates which we presume to be amorphous.

Salt-free system. To investigate the effect of the (weak) electrostatic interactions between the proteins, we determine the phase behavior in the absence of added salt, shown in Fig. 4. We find a boundary for aggregation estimated at $z_{\rm pol} = 30.9 \pm 1.9$ for a rather lower volume fraction of 0.002, which is almost indistinguishable from the behavior with added salt (Fig. 4) at the same protein volume fraction. This is quite consistent with the soft matter inspired analysis of treating the proteins as hard spherocylinders. However, we do not encounter any crystallization behavior here and return to this in the discussion below.
Effects of PEG molecular weight. So far, we have discussed the system with the smaller polymer (PEG620); we now switch to the larger polymer. We chose PEG2000 here because its size is comparable to that of the protein. We therefore expect normal depletion behaviour, as described by the Asakura-Oosawa model, unlike the protein limit $q \gg 1$. [80][81][82] The phase diagram for the eGFP-PEG2000 system is shown in Fig. 3. The aggregation shows at a much lower fugac-

B. Comparison with theory
In order to make a comparison with theoretical and computer simulation predictions, we interpolate between phase boundaries determined for spheres (L/D − 1 = 0) 89 and spherocylinders of a larger aspect ratio than those we consider here (L/D − 1 = 5) (Fig. 6). 88 It is important to note which phase transition is involved in each case. In the case of sphere-polymer mixtures, upon adding polymer at low colloid volume fraction, the first phase transition that is encountered (for $q \gtrsim 0.3$) is the (colloidal) liquid-gas demixing. 75,90,104,105 In the case of spherocylinders with aspect ratio L/D − 1 = 5 it is fluid-crystal coexistence. 87,88 Nevertheless, for spheres at q ≈ 0.4-0.5 the liquid-gas and fluid-crystal phase boundaries occur at quite similar values of the polymer fugacity and so here we neglect the difference. We are, in any case, unaware of any computation of the phase diagram for our parameters, and note that the free volume theory of Lekkerkerker et al. 104 is not highly accurate for these parameters. 89

We fit the data for spheres 89 and spherocylinders 88 by a power law $z_{\rm pol} = a - bq^{c}$ at a low value of protein volume fraction $\phi_{\rm eGFP} \sim 0.02$ with different q. Here a, b and c are fitted constants. The interpolation is done linearly in the aspect ratio between $z^{(0)}_{\rm pol}$, for spheres, 89 and $z^{(5)}_{\rm pol}$, for spherocylinders (L/D − 1 = 5). 88 Our interpolation is shown in Fig. 6, where we plot the power-law fits to the literature data together with our interpolation. We interpolate to obtain values of q that are consistent with our measured fugacity for demixing, $z_{\rm pol} = 8.63$ (smaller polymer) and $z_{\rm pol} = 1.20$ (larger polymer). We have in addition some uncertainty in determining the size ratio q. As noted above, our estimate for the polymer radius of gyration $R^{\rm scale}_g$ relied on polymer scaling, which may not be accurate for such small polymers. Moreover there are a variety of other assumptions, such as polymer ideality and rigidity, which have been addressed in more refined theoretical treatments. 90,106 We therefore accept some adjustment in our fitted values and take $q_{\rm fit} = 0.59$ for the smaller polymer and $q_{\rm fit} = 0.90$ for the larger polymer, which agree well with our data.
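The fitting procedure described above can be sketched as follows: two power-law boundaries of the form z = a − b q^c are interpolated linearly in the aspect ratio L/D − 1 between 0 and 5, and the size ratio that reproduces a measured demixing fugacity is found by bisection. The fit coefficients in this sketch are hypothetical placeholders chosen only so that the example runs; they are not the fitted values from the literature data, and the function names are likewise illustrative.

```python
from typing import Tuple

Fit = Tuple[float, float, float]  # (a, b, c) in z = a - b * q**c

# HYPOTHETICAL coefficients, for illustration only.
HYPOTHETICAL_FIT_SPHERES: Fit = (20.0, 19.0, 0.5)   # L/D - 1 = 0
HYPOTHETICAL_FIT_RODS: Fit = (10.0, 9.5, 0.4)       # L/D - 1 = 5


def power_law(q: float, fit: Fit) -> float:
    """Demixing fugacity from the power-law fit z = a - b q^c."""
    a, b, c = fit
    return a - b * q**c


def interpolated_fugacity(q: float, fit_spheres: Fit, fit_rods: Fit,
                          aspect: float = 1.05,
                          aspect_rods: float = 5.0) -> float:
    """Linear interpolation in aspect ratio between the two boundaries."""
    w = aspect / aspect_rods
    return (1.0 - w) * power_law(q, fit_spheres) + w * power_law(q, fit_rods)


def solve_size_ratio(z_target: float, fit_spheres: Fit, fit_rods: Fit,
                     q_lo: float = 0.05, q_hi: float = 1.5,
                     tol: float = 1e-10) -> float:
    """Bisection for the q at which the interpolated boundary equals z_target."""
    def f(q: float) -> float:
        return interpolated_fugacity(q, fit_spheres, fit_rods) - z_target

    f_lo, f_hi = f(q_lo), f(q_hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("target fugacity is not bracketed by [q_lo, q_hi]")
    while q_hi - q_lo > tol:
        q_mid = 0.5 * (q_lo + q_hi)
        f_mid = f(q_mid)
        if f_lo * f_mid <= 0.0:
            q_hi = q_mid
        else:
            q_lo, f_lo = q_mid, f_mid
    return 0.5 * (q_lo + q_hi)


if __name__ == "__main__":
    for label, z in (("smaller polymer", 8.63), ("larger polymer", 1.20)):
        q = solve_size_ratio(z, HYPOTHETICAL_FIT_SPHERES, HYPOTHETICAL_FIT_RODS)
        print(f"{label}: z = {z}, q = {q:.3f} (placeholder fits)")
```

Because the coefficients are placeholders, the printed size ratios are not expected to match the fitted values quoted in the text; only the procedure is illustrated. Bisection is used here because the fitted boundary decreases monotonically with q over the bracket.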
For the larger polymer, the fitted polymer radius of gyration $R^{\rm fit}_g$ of 1.80 nm falls close to the value from the empirical equation, $R^{\rm scale}_g = 1.64$ nm (see section II B). It is worth noting that for the smaller polymer (PEG620) there is an increase from the size ratio from scaling, $q_{\rm scale} = 0.42$, to the fitted size ratio, $q_{\rm fit} = 0.59$. Now we consider the assumption that the polymers are ideal, as in the standard AO model: with $R_g = 1.2$ nm, these are very small polymers to treat as ideal. 74 Dijkstra et al. 105 compared additive hard spheres with ideal polymers using thermodynamic perturbation theory, and found that for small q and polymer packing fraction $\phi_p$, the phase separation is very similar in the two models. Here we have q = 0.59, and under these conditions of larger depletants, the behaviour of spherical colloids plus ideal polymer and spherical colloids plus hard sphere depletant is rather different, at least at the level of the effective interactions between the larger spheres. 107 While we cannot rule out that the polymers may exhibit significant deviations from ideality, given that the phase behavior we find is similar to that of hard spherocylinders and ideal polymer, we note that at the level of our analysis the polymers appear more likely to be behaving in a manner similar to a polymer depletant rather than hard spheres.

IV. DISCUSSION
We have seen that the model fluorescent protein-polymer system can, rather surprisingly, be treated in the spirit of a colloid-polymer mixture where the only additional complexity is an approximate treatment of the anisotropy of the protein dimers. This is notable, and a simple depletion picture of hard spherocylinders with non-adsorbing ideal polymers is consistent with our observations. Furthermore, we observe no aggregation for eGFP in the absence of polymer at least up to 500 mg/mL, corresponding to a volume fraction of 0.48. At this volume fraction, the protein solution becomes very viscous, consistent with previous work which found glassy behavior reminiscent of colloidal systems in concentrated eye lens α-crystallin. 43 Furthermore, we found that upon dilution aggregated protein solutions re-dissolved, behavior which is compatible with weak, depletion-driven aggregation.
The crystallization behavior in our system re-emphasises that protein crystals can be produced through addition of polymer, as noted previously. 99 This is significant because the process is apparently immediate, without a fine-tuning of the system. We focus on the low volume fraction regime in this work, and we note that crystals only appeared in a limited region of our phase diagram and then only in the system with smaller polymer and added salt, not in the case of the larger polymer or without added salt. At first sight, it may seem surprising that we find above (section II A) that the electrostatic interactions are very weak in our system, with or without salt. It is important to highlight that the isoelectric point (pI) of the monomeric unit of the eGFP (obtained from its amino acid sequence) is 5.8, 108 which is close to the pH 7 used in the experiments. This might explain the small values found for the surface charge.

FIG. 6. The interpolation from the data presented by Savenko and Dijkstra 88 for L/D − 1 = 5 (purple triangles) and Lo Verso et al. 89 for L/D − 1 = 0 (green circles). The green and purple lines are power-law fits. The blue line is the interpolation, and the pink and black stars mark the matching points for PEG620 and PEG2000, respectively.
We now enquire as to why not adding salt suppresses the crystallization. The observation of crystallization only in a very limited region of polymer concentration (i.e. attraction strength) is consistent with previous work with (spherical) colloid-polymer mixtures, 49,75,109 and has been interpreted in terms of fluctuation-dissipation theorem violation. 110 Additionally, it has been observed that acidic proteins are more likely to crystallize when the pH of the solution is 0-2.5 units above their pI. 111 Our experiments fall within such a range. Thus, only a small amount of salt would be required to overcome small electrostatic repulsions under these favourable conditions. What is perhaps more notable is the limited range of protein concentration in which we see crystallization and the failure of the salt-free system to crystallize. It is quite possible that the small size of the region of the phase diagram in which crystallization occurs is related to more complex behavior than that which we treat here. For example, Fusco et al. showed the importance of contacts in the crystallization behavior of rubredoxin. 67 We speculate that a decrease in the electrostatic repulsions only needs to occur around or in these regions to promote crystal formation, so that only small amounts of salt are required to yield a crystal, in contrast for example with isotropic systems. Finally, salts can also affect the hydrophobic protein-protein interactions by increasing the surface tension. 86 These interactions have been shown to be relevant in the formation of a crystal phase and in protein solubility, 112,113 and cannot be discarded in the present study.
Nevertheless, the crystallization that we observe is compatible with the spherocylinder-polymer phase behavior (L/D − 1 = 5). 87,88 It would be most interesting to determine the phase diagram for hard spherocylinders of aspect ratio L/D − 1 = 1.05 plus polymer, but for now we conclude that our finding of protein crystallization is not inconsistent with some of the literature for hard particle-polymer mixtures. 75,87,88,90,104 The polymer volume fractions at which we find aggregation are rather high, of order unity. It is important to enquire whether one can still apply the concept of polymer-induced depletion under these conditions. Accurate computer simulations in which the polymer chain segments are modeled explicitly predict that, for the polymer fractions that we consider here, only small deviations from ideal Asakura-Oosawa behavior are expected. 114 While we have treated our eGFP as spherocylinders, and that work refers to spherical particles, we are unaware of similar work which pertains to anisotropic particles and thus, in the absence of evidence to the contrary, presume that a simple depletion picture remains reasonably accurate at these polymer concentrations.
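For reference, the ideal Asakura-Oosawa (AO) behavior referred to above has a closed form for two hard spheres in a bath of ideal polymers. The following is a minimal sketch of that standard sphere-sphere expression, not of the spherocylinder model used in this work. The symbol names (`sigma` for the sphere diameter, `q` for the polymer-to-sphere size ratio and `eta_p_r` for the polymer reservoir volume fraction) are our own notational assumptions.

```python
import math

def ao_potential(r, sigma, q, eta_p_r):
    """Asakura-Oosawa pair potential in units of k_B T for two hard spheres.

    r        centre-to-centre distance
    sigma    sphere diameter (same length unit as r)
    q        size ratio sigma_p / sigma, with sigma_p = 2 R_g (assumed notation)
    eta_p_r  polymer volume fraction in the reservoir (assumed notation)
    """
    if r < sigma:
        return math.inf               # hard-core overlap
    cutoff = (1.0 + q) * sigma        # depletion zones stop overlapping here
    if r >= cutoff:
        return 0.0
    x = r / cutoff
    # Overlap volume of the two depletion zones, scaled by the polymer volume
    return -eta_p_r * ((1.0 + q) / q) ** 3 * (1.0 - 1.5 * x + 0.5 * x ** 3)

def ao_contact_value(q, eta_p_r):
    """Well depth at contact (r = sigma), which simplifies to -eta_p_r * (1 + 3 / (2 q))."""
    return -eta_p_r * (1.0 + 1.5 / q)
```

At fixed reservoir volume fraction the well depth at contact grows as q decreases, which is why the strength of the attraction depends sensitively on the assumed polymer size.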
While we have suggested that it is possible to account for the behavior of our system by treating the eGFP as hard spherocylinders in a solution of ideal polymers, we can be confident that the situation in reality is much more complex. In addition to the enhancement of hydrophobic interactions by salt addition discussed above, the amphiphilic nature of PEG means that additional hydrophobic 115 and chemical 85 interactions (via the PEG -CH2OCH2- groups) between PEG and proteins might also contribute to this phenomenon. Furthermore, PEG molecules can also enhance aggregation and crystallization via effective repulsion, since PEG might preferentially form hydrogen bonds with water rather than with the proteins. 86 Finally, we have determined the electrostatic interactions between eGFP dimers to be weak if we consider only the net charge. Of course, this is a very significant approximation. Monomeric eGFP has a number of chargeable groups, e.g. 32 acidic residues, and so a more sophisticated approach which takes this into account may prove valuable. An approach such as that noted above for rubredoxin 67 would be most interesting to pursue here.
In short, further work is needed to explore the metastable region more thoroughly, so that the predictions of depletion theory can be tested against it. Moreover, the properties of the crystals formed at this low protein concentration, and purely by depletion interactions, are certainly worth investigating in future research.

V. CONCLUSION
We studied the phase behavior of a model system of fluorescent proteins and polymers (eGFP-PEG) in the "colloid limit" where the polymer depletant is smaller than or comparable in size to the protein. Fluid-aggregation phase behavior was observed for two polymer sizes (i.e. two polymer-protein size ratios). In addition, protein crystallization occurred in a small region of the phase diagram of the system with added salt (NaCl) and small polymers. At high polymer concentration, protein aggregates were large enough to sediment on the timescale of the experiment and form a sediment whose structure is reminiscent of a gel. In the absence of polymer, solutions of eGFP are stable at least up to a concentration of 500 mg/ml (volume fraction of 0.48). This suggests that the eGFP dimers interact rather weakly and that approximating them as hard particles may be reasonable.
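The correspondence between mass concentration and volume fraction quoted above can be made explicit. The short worked example below assumes that the volume fraction \phi scales linearly with the mass concentration c through an effective specific volume v_eff; this symbol and the linear relation are our assumptions for illustration, back-calculated from the two quoted numbers.

```latex
\[
  \phi = c \, v_{\mathrm{eff}}
  \quad\Longrightarrow\quad
  v_{\mathrm{eff}} = \frac{\phi}{c}
  = \frac{0.48}{0.500\ \mathrm{g\,ml^{-1}}}
  = 0.96\ \mathrm{ml\,g^{-1}} .
\]
```

Under the same assumption, a protein volume fraction of 0.002 corresponds to c = 0.002 / (0.96 ml/g) ≈ 2.1 mg/ml.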
Based on the shape of eGFP dimers as deduced from small angle x-ray scattering, we treat them as hard spherocylinders with aspect ratio L/D − 1 = 1.05. In the case of the small polymer (PEG620), the polymer fugacity at the aggregation boundary, at a protein volume fraction of around 0.002, was almost indistinguishable between the salt-screened system (30.0 ± 1.0) and the salt-free system (30.9 ± 1.9). For the larger polymer (PEG2000), aggregation was found at a polymer fugacity of 1.20. Consistent with DLVO theory for colloids, the effects of electrostatic interactions between the proteins were found to be weak. Intriguingly, in the case of no added salt, and also in the case of no added polymer, we observed no protein crystallization. Due to the uncertainty in the polymer radius of gyration, we interpolated the fugacity for the aggregation phase boundary from existing literature, between L/D − 1 = 0 for sphere-polymer mixtures 89 and L/D − 1 = 5 for spherocylinder-polymer mixtures, 88 and fitted a polymer radius of gyration of 1.1 nm for PEG620. Compared with the empirical estimate of 0.83 nm, this somewhat larger size may be related to some non-ideality in the polymers 90 (we note that polymer scaling theory is expected to break down for such small polymers in any case). The smaller difference for the larger polymer (PEG2000) is consistent with this.
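The interpolation between the two literature boundaries could be sketched as follows. This is a minimal illustration under stated assumptions rather than our actual analysis script: the power-law form follows the fits described for the literature data, but the linear weighting in aspect ratio is an assumption, the function names are hypothetical, and the numerical arrays are arbitrary placeholders, not values taken from Refs. 88 and 89.

```python
import numpy as np

def fit_power_law(q, z):
    """Least-squares fit of z = a * q**b in log-log space; returns (a, b)."""
    b, log_a = np.polyfit(np.log(q), np.log(z), 1)
    return float(np.exp(log_a)), float(b)

def interpolate_fugacity(q, aspect, fit_lo, fit_hi, aspect_lo=0.0, aspect_hi=5.0):
    """Fugacity at size ratio q for a given L/D - 1, between two fitted curves.

    ASSUMPTION: the weighting between the two curves is linear in L/D - 1.
    """
    a_lo, b_lo = fit_lo
    a_hi, b_hi = fit_hi
    z_lo = a_lo * q ** b_lo          # boundary for L/D - 1 = aspect_lo
    z_hi = a_hi * q ** b_hi          # boundary for L/D - 1 = aspect_hi
    w = (aspect - aspect_lo) / (aspect_hi - aspect_lo)
    return (1.0 - w) * z_lo + w * z_hi

# PLACEHOLDER data only: synthetic power laws standing in for the literature points.
q_data = np.array([0.1, 0.2, 0.4, 0.8])
z_spheres = 2.0 * q_data ** -1.5     # stand-in for the L/D - 1 = 0 boundary
z_rods = 0.5 * q_data ** -2.0        # stand-in for the L/D - 1 = 5 boundary

fit_spheres = fit_power_law(q_data, z_spheres)
fit_rods = fit_power_law(q_data, z_rods)

# Interpolated boundary at the eGFP dimer aspect ratio, for one example size ratio.
z_interp = interpolate_fugacity(0.3, 1.05, fit_spheres, fit_rods)
```

Matching such an interpolated curve to the measured fugacity then fixes the size ratio, and hence the effective radius of gyration, which is the step that yields the fitted value quoted above.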
The behavior we observed is consistent with the depletion picture of hard spherocylinders and ideal polymers, but in reality the system is rather more complex. At our level of analysis and observation, we cannot exclude the possibility that other interactions drive the phenomena that we observe, for example hydration effects or hydrophobic or electrostatic "patches". Nevertheless, the fact that, in the absence of polymer, the eGFP solution exhibits no aggregation up to such high concentrations, and that the aggregates re-dissolve upon dilution, gives us some cautious optimism that the behavior we observe may be driven by such simple interactions as the excluded volume effects of polymer-induced depletion.

ACKNOWLEDGMENTS
We would like to thank John Russo and Mike Allen for helpful discussions; Richard Stenner for protein expression and purification; and Angélique Coutable-Pennarun for assistance with zeta potential measurements. This work was financially supported by the Bristol Centre for Functional Nanomaterials (BCFN), the Chinese Scholarship Council (CSC), and Bayer AG. IRdA was funded by the Philip Leverhulme Prize 2018 awarded by the Leverhulme Trust. IRdA, JLRA and CPR were funded by the Leverhulme Trust grant "Unifying Protein Design and Assembly of Soft Matter for New Materials". RC, IRdA and CPR gratefully acknowledge the ERC Grant agreement no. 617266 NANOPRS for financial support, as well as the Engineering and Physical Sciences Research Council (EP/H022333/1). The Ganesha X-ray scattering apparatus used for this research was purchased under EPSRC Grant Atoms to Applications (EP/K035746/1). This work benefitted from the SasView software, originally developed by the DANSE project under NSF award DMR-0520547.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.