Simulation of FRET dyes allows quantitative comparison against experimental data

Fully understanding biomolecular function requires detailed insight into the systems’ structural dynamics. Powerful experimental techniques such as single molecule F¨orster Resonance Energy Transfer (FRET) provide access to such dynamic information yet have to be carefully interpreted. Molecular simulations can complement these experiments but typically face limits in accessing slow time scales and large or unstructured systems. Here, we introduce a coarse-grained simulation technique that tackles these challenges. While requiring only few parameters, we maintain full protein ﬂexibility and include all heavy atoms of proteins, linkers, and dyes. We are able to sufﬁciently reduce computational demands to simulate large or heterogeneous structural dynamics and ensembles on slow time scales found in, e


I. INTRODUCTION
Life at the molecular level is determined by the interplay of structure and dynamics. While a biomolecular structure can be determined by sophisticated experimental techniques such as NMR or X-ray, detailed insight into its dynamics is more difficult to access. A particularly powerful technique is FRET (Förster Resonance Energy Transfer), where specific sites in biomolecules are labeled by fluorescent dyes. FRET can provide time-resolved information about, e.g., protein-folding dynamics, 1 folding intermediates, 2,3 unfolded or disordered proteins, [4][5][6] or conformational transitions. 7 Both planning and interpretation of such experiments can be nontrivial, as FRET does not directly provide quantitative distance information within the protein. Instead, it rather measures the transfer efficiency between the dyes which-besides the distance between the dyes-depends on their orientation and dynamics and hence also on their shapes. A useful complement for such experiments is molecular simulations of dye-labeled biomolecules. Depending on their complexity, additional information up to the full dynamics of all atoms becomes accessible.
To date, many simulation approaches are available. A straightforward approach is predicting the inter-dye distance by estimating the accessible volume of FRET dyes. [8][9][10][11][12] This does, however, not account for dynamics and the mutual dye orientation, reflected by the orientation factor κ 2 . An approach that has been used extensively for unfolded and intrinsically disordered proteins is the interpretation of FRET data in terms of analytical polymer models, 13 but such models are not applicable to folded proteins. Molecular dynamics simulations are computationally considerably more demanding and aim at modeling the system dynamics accurately. In this approach, the dyes are typically parametrized for a specific 0021-9606/2018/148(12)/123321/9 148, 123321-1 © Author(s) 2018 biomolecular force field. 14 Then, simulations on the ns to µs time scale are run and compared to experimentally accessible properties, such as fluorescence anisotropy 15 or mean FRET efficiencies. 14 As one might imagine, dye and linker fluctuations play an important role. 16 Similarly, the specific dye shape and mutual orientation are crucial and are not fully captured by simple accessible volume calculations. 17 Molecular dynamics simulations also allow testing assumptions such as approximating the mutual dye-dye orientation factor κ 2 with 2/3, which is questioned in several studies, 18,19 in particular, for low linker flexibilities. 20 Many simulation protocols have computationally prohibitive demands to reach sufficiently long simulation times for accurate calculations of experimentally accessible properties. 21 In particular, it is challenging to account for slow or large-scale conformational transitions, structurally diverse ensembles such as unfolded proteins, very large systems, or intrinsically disordered proteins.
Here, we want to introduce a new method for simulating the dynamics of FRET-labeled proteins on a coarse-grained description level 22 which maintains full protein flexibility and all heavy atoms of proteins and dyes. This considerably lowers computational costs and allows us to eliminate many challenges regarding the mentioned insufficient sampling of large conformational ensembles. Our model allows us to access and quantitatively analyze these complex dynamic ensembles in atomic detail. The model requires only few and robust parameters to achieve a realistic description of the system. We validate our approach by direct and quantitative comparison against experimental FRET efficiency histograms. As an example, we highlight simulating large and diverse structural ensembles by comparing simulations and measured FRET data of both folded and unfolded proteins. Having established the precision of our model, we then compare our model to descriptions of FRET by the accessible volume approach and analytical polymer models.

A. Förster resonance energy transfer
Förster Resonance Energy Transfer (FRET) 23 is a nonradiative energy transfer between a donor and an acceptor fluorophore. It has a high sensitivity to distances in the nanometer range and is therefore a powerful "spectroscopic ruler". 24 For example, one can label specific residues of proteins with suitable donor and acceptor fluorophores to distinguish different protein conformations and conformational changes directly through distance changes between the dyes.
FRET experiments measure the FRET efficiency E, which is related to the inter-dye distance R DA by 25 The Förster radius R 0 determines the range of measurable distances and is given by 25 where Q D is the fluorescence quantum yield of the donor in the absence of the acceptor, κ 2 is the dipole orientation factor, n is the refractive index of the medium, N A is the Avogadro constant, and J(λ) is the spectral overlap integral of donor emission and acceptor absorption spectra. The orientation factor κ 2 describes the relative orientation of the transition dipole moments for emission of the donor, µ D , and absorption of the acceptor, µ A , and can be calculated via 25 where θ D and θ A are the angles between µ D and µ A and the connecting vector R DA between the two dye centers, respectively. φ denotes the angle between the planes defined by µ A , R DA and µ D , R DA , respectively. Most experimental studies assume fast rotational diffusion of the dyes with respect to the excited state lifetime, resulting in an averaging over all possible orientations. This is referred to as the "isotropic averaging regime," in which case a constant value of κ 2 = 2/3 can be used. 25 The assumption of κ 2 = 2/3 has been questioned in several studies 12,18-20 but can be tested experimentally, e.g., by measuring time-dependent fluorescence anisotropy decays. 14

B. Rotational correlation time
A sample of randomly oriented dyes illuminated by polarized light depolarizes over time as a consequence of rotational diffusion. Observing the time dependent fluorescence anisotropy r(t) leads to a measure of the flexibility and rotational speed of the dyes, the rotational correlation time τ rot . The time-resolved fluorescence anisotropy can be measured experimentally and calculated from simulations using the normalized transition dipole moment vector µ via 26 where r 0 is the fundamental anisotropy and P 2 is the secondorder Legendre polynomial. r(t) allows a direct comparison of experimental observations with simulations. Here, we assume that the absorption and emission dipole moments are collinear for our dyes, so the fundamental anisotropy is given by r 0 = 0.4, close to the experimental value. 27 In the analysis of experimental data, often a doubleexponential decay is required to account for the contributions of rotational motion of both the fluorophore and the protein. 14,26,28 In the systems we investigate here, the proteins are large in comparison to the dyes and the measured dye rotational correlation times are one order of magnitude smaller than the expected rotational correlation time of the proteins. In the simulations, we focus on the rotational motion of the dye in the inertial system of the protein, where we find it sufficient to use only one exponential and fit with

C. Structure based models
For our simulations, we use native Structure Based Models (SBMs), also referred to as Gō-type models. 29 They are based on energy landscape theory and the principle of minimal frustration. [30][31][32][33] SBMs are used successfully to study a wide range of phenomena 34 ranging from, e.g., protein folding, 35,36 misfolding, 37,38 structure prediction, 39 and conformational dynamics 40,41 to large biomolecules such as the ribosome 42 or RNA. 43 SBM simulations show good agreement with experimental results, e.g., they are used to reproduce transition state ensembles and "en-route" intermediates, 44 and folding rates comparable to experimental measurements. 35 By design, they are limited to unfrustrated systems with negligible non-native interactions. [29][30][31][32][33] With their high computational efficiency, SBMs facilitate simulations of several folding and unfolding transitions on regular desktop computers while offering full flexibility for all parts of the protein. The SBM potential employs the native structure of the protein as its ground state but also allows one to model systems with multiple stable conformations by using potentials with multiple minima. 7,40,41,45 We use an all-atom SBM 22 implemented in eSBMTools, 46 which takes into account all heavy atoms of the protein and uses a simplified potential of the form 47 The potential includes harmonic potentials for bonds, angles, and improper dihedral angles and a potential for proper dihedral angles which allows the occupation of isomeric conformations. The minima are set to r 0 , θ 0 , χ 0 , and φ 0 , the values for bond lengths, angles, improper dihedral angles, and proper dihedral angles in the native state, respectively. The contact potential C G (r ij , r ij 0 ) stabilizes the native structure by introducing attractive potentials for atom pairs forming contacts in the native state, whereas an overall repulsive term is included for the excluded volume of all atoms. r ij 0 and r ij are the native and the actual distances of the atom pair (i, j) andσ represents the excluded volume for Pauli repulsion with the valuẽ σ = 2.5 Å. The energetic weights for bonds, angles, improper dihedral angles, and non-native contacts are set to the values K b = 20 000 /nm 2 , K a = 40 /deg, K i = 40 /deg, and K nc = 0.01 , where deg refers to degree and is the reduced energy unit. 48 The energetic weights for the contact potential K c and for the proper dihedral angle potential K d are determined as described in Ref. 48. The Gaussian contact potential is given by 47,49 with σ 2 = (r ij 0 ) 2 /(50 ln 2) for each native contact pair (i, j) determined by the shadow map algorithm. 47 All atoms are handled equally with identical parameters for the excluded volume and a unit mass of m = 1.0.
As SBMs do not have an inherent temperature scale comparable to physical temperature, the temperatures are typically measured in reduced units. Accordingly, the folding temperature in the SBMs is about 1.0, corresponding to around T = 120 in GROMACS units. 48 Here, we report the temperature in GROMACS units.

A. Dye structure
We want to find a minimal and robust set of parameters describing the dyes which is sufficient to replicate and predict experimental data.
SBM simulations require an initial structure of the system. For many dyes, only chemical structures are available. As we want a systematic and consistent workflow, we perform quantum-chemical calculations to get three-dimensional structures of the dyes. Details and the structures of the Alexa Fluor dyes with available chemical structures are given in the supplementary material (see SI 1). Subsequently we add the respective linkers and maleimide which are used to bind the dyes to the protein. In this study, we use two dye pairsthe Alexa Fluor 488 dye with C 5 -linker (AF488) and Alexa Fluor 594 dye with C 5 -linker (AF594), and the Alexa Fluor 546 dye with C 5 -linker (AF546) and Alexa Fluor 647 dye with C 2 -linker (AF647). In general, every dye and linker with an available chemical structure can be utilized accordingly.

B. Dye parameters
For the initial dye parameters, we choose the same parametrization as for the protein in the SBM. We determine the angles and dihedral angles automatically from the bond information given in the chemical structure. A detailed description can be found in the supplementary material (see SI 2). The only interaction between both dyes and also between dyes and protein is the excluded volume repulsion term.
SBMs do not have an inherent temperature scale. To adjust the dye parameters independently of the protein in a consistent way, we set the dye temperature T dye separately from the protein temperature T. In the simulation, we treat dyes and protein as different groups and couple them to separate temperature baths. This has the advantage of uncoupling the behavior of dyes and protein, so we can use the same dye temperature regardless of investigating, e.g., a folded or an unfolded protein, realized by low and high protein temperatures, respectively. We tested the effect of the dye temperature on the protein dynamics and found only a slight increase of the fluctuations of the respective residues the dyes are attached to, whereas the dynamics of the adjacent residues remain unaltered. Due to numerical instabilities, 50 we cannot sufficiently increase the temperature to account for the high dye mobility. Hence, we additionally change the mass of the dye atoms to 0.2, rather than the unit atom mass of 1.0 the SBM assigns to the protein.

C. Merging dyes and protein
The protein structures used for the simulations are taken from the Protein Data Bank (PDB). 51 The three used proteins are the chymotrypsin inhibitor 2 (CI-2, PDB: 2CI2 52 ) as a test system, the tenth type III module of fibronectin ( 10 FNIII, FIG. 1. Structures of 10 FNIII (left) and AF546 and AF647 dyes 55 (middle). The dyes are attached with the respective linkers via a maleimide bound to residues 11 and 86 of 10 FNIII. The merged structure is shown with the assumed transition dipole moments (black arrows) for donor (blue, µ D ) and acceptor (red, µ A ) dyes as well as the distance between the dyes' centers (R DA ) (right). PDB: 1TTG 53 ), and the cold-shock protein from the hyperthermophilic bacterium Thermotoga maritima (CspTm, PDB: 1G6P 54 ).
The Alexa Fluor dyes are typically attached to cysteine residues of proteins via a maleimide group. In our simulations, the respective residue in the protein is replaced by a cysteine and the dye-maleimide structure is attached to the sulfur atom of the cysteine. This is done by attaching the dye preferably orthogonally to the protein surface while avoiding steric clashes between atoms. As an example, 10 FNIII with AF546 and AF647 attached to residues 11 and 86, respectively, is shown in Fig. 1. Details of the procedure can be found in the supplementary material (see SI 2).

D. Simulation protocol
We perform our simulations using GROMACS 56 with the SBM potential [see Eq. (6)] and the Langevin dynamics simulation protocol. 56 As we use CI-2 as a simple test system, we choose a protein temperature of T = 50 below the folding temperature for the simulation of the folded state.
For the proteins CspTm and 10 FNIII, we want to achieve a quantitative comparison against experimental measurements. To identify the SBM temperature for the folded states corresponding to the experimental setup, we perform an initial allatom simulation in the AMBER99 force field: 57 We choose the SBM protein temperature by comparing the native-basin fluctuations of the proteins simulated with the AMBER99 force field at the physiological temperatures used in the experiments [T = 296/295 K ( 10 FNIII/CspTm)] with SBM simulations of varying temperatures. We compare the root mean square fluctuations of the C α -atoms and choose the SBM temperature which results in the least deviation from the AMBER99 simulation, according to Ref. 58. For the results, see Table I. For the unfolded state, we choose a temperature well above the folding temperature so that the protein is unfolded throughout the simulation. We check the influence of choosing different temperatures above the folding temperature, which does not alter the results. Future work will focus on the challenges of simulating conformational transitions (cf. Ref. 59).
To determine appropriate dye temperatures, we perform simulations with different dye temperatures and determine the respective rotational correlation times by using Eqs. (4) and (5). The temperature directly affects the dye mobility and the rotational correlation time τ rot . Higher temperatures lead to smaller τ rot . To obtain a high dye mobility, we choose the temperature for the faster rotating dye at T dye = 250 and for the slower dye such that the ratio of both rotational correlation times from experiments is matched (see Table S1 in the supplementary material). For AF647, we do not have an experimental value of τ rot and use the same temperature as for AF546. Varying the dye temperature slightly does not alter the final results.
We then perform simulations with 10 9 steps and dt = 0.0005 for CspTm and with 2.5 × 10 9 steps and dt = 0.0002 for CI-2 and 10 FNIII. The temperature coupling constant is set to τ T = 0.1. From the simulations, we extract the inter-dye distance R DA (t) and the transition dipole moments of donor µ D (t) and acceptor µ A (t) (see Fig. 1, for details see SI 3 of the supplementary material).
SBMs do not have a time scale directly comparable to the physical time scale. To evaluate simulated trajectories and obtain FRET efficiency histograms, we introduce a time scale based on a comparison of the experimental and theoretical rotational correlation times τ rot of the dyes. We calculate the fluorescence anisotropy decay r(t) from simulations according to Eq. (4) and fit it with Eq. (5). From the comparison, we get a conversion factor to adjust the trajectory time scale.
This time scale is necessary as we want to calculate FRET efficiency histograms directly comparable to experiments. Photon statistics are gained from Monte Carlo photon simulations (as described in Refs. 8, 60, and 61) and fully account for shot-noise. The simulation parameters are given in Table S1 in the supplementary material. We generate . We perform simulations with three different proteins (CspTm, CI-2, and 10 FNIII) and two different dye pairs (AF488 and AF594, AF546 and AF647). Because of the dependence of the Förster radius on the refractive index of the medium, R 0 changes slightly for higher denaturant concentrations in the experiments. Hence, for evaluation of the unfolded state, we use the changed Förster radii (see Table S1 in the supplementary material).

A. Test simulations with CI-2
As an example, we simulate the test system CI-2. The resulting distributions for CI-2 with two different dye pairs, attached at residues 20 and 78, respectively, are shown in Fig. 2. Clearly, the unfolded state has a much broader distance distribution, whereas the folded state is narrower. Comparison of the distances of the C α -atoms with the inter-dye distances shows that the distribution of R DA in the folded state is dominated by dye dynamics. For both dye pairs, the distance distributions are similar. The ideal Förster radius to get bestseparated states would be between the average distances of folded and unfolded states. Considering the Förster radii of the two dye pairs used here, the distance distributions show that the pair AF488/AF594 is better suitable for studying CI-2 experimentally.
The distributions of the orientation factors in the folded and unfolded states are similar, and the mean values κ 2 U are in good agreement with the approximation of κ 2 = 2/3. As expected, in the case of unfolded proteins, the dyes can rotate freely and their orientations are almost unrestricted. However, we observe slightly larger deviations of 2%-5% for the mean values in the folded state κ 2 F , probably due to steric restrictions imposed by the protein.
The resulting FRET efficiency histograms along with the mean efficiency values calculated by Gaussian fits are illustrated in Figs. 2(e) and 2(f). As an alternative, log-normal distributions or a straightforward median can be used, but this does only have negligible effects on the quantitative results. Different hypothetical Förster radii can be tested with this model (see SI 5 of the supplementary material).

B. CspTm with different dye positions
Another advantage of the efficient simulation approach we propose is that different labeling positions can readily be implemented and investigated. As an example, we perform simulations for CspTm with AF488 and AF594 at different labeling positions. The resulting FRET efficiency histograms are shown in Fig. 3. The labeling sites are the residue pairs C2/C68, C68/C2, C11/C68, C68/C11, and C23/C68 for donor/acceptor, 62,63 respectively (for the numbering scheme, see SI 6 of the supplementary material). As expected, the peaks for the unfolded state shift to higher efficiencies with shorter sequence separation. Also, permutations of the dyes with respect to the attachment points have, unsurprisingly, no relevance.
Both results (see Figs. 2 and 3) show the strength of the simulation method as it establishes a way to directly relate distance distributions to FRET efficiency distributions and allows one to vary and test different parameters such as the Förster radius, dye pair, linker length, or labeling sites. This facilitates improvement in planning, interpreting, and validating experimental results.

C. Comparison of simulation and experiment
As a validation of our model, we compare the resulting FRET efficiency histograms against experimental data. FRET efficiency histograms from experiments and simulations for four different systems are shown in Fig. 4. Figure 4 63 The distributions are shown for the folded (blue) and unfolded (orange) states. The experimental FRET efficiency values below 0.0 and above 1.0 are not shown. The distributions are fitted with Gaussians, and the peak positions for folded ( E F ) and unfolded ( E U ) states together with the respective standard deviations σ F and σ U can be found in Table II. the results for 10 FNIII with AF546 and AF647, and Figs. 4(b)-4(d) show the results for CspTm with AF488 and AF594 at different labeling positions.
The experimental data for the respective protein in 0.0 M GdmCl (folded) and in 4.63 M and 7.0 M GdmCl (unfolded) for 10 FNIII and CspTm 63 are shown, respectively. The experimental data are already corrected for background, different quantum yields of donors and acceptors, different detection efficiencies, cross talk, and direct acceptor excitation. From these errors, only the different quantum yields are reflected in our simulation protocol and are already corrected for in the Monte Carlo photon simulations.
The simulated data agree well with the experimental results except in the case of CspTm C11/C68 [see Fig. 4(c)], where the transfer efficiency of the folded state in the simulation is shifted to lower values. In this case, the dyes are attached on opposite sides of the protein, which would explain a lower FRET efficiency compared to the other two labeling schemes. The deviation between simulations and experiments may be caused by residual attractive interactions between the fluorophores and protein surface, an aspect that could be tested by detailed time-resolved fluorescence anisotropy measurements of the different variants.
The mean and standard deviations of the Gaussian fits in Fig. 4 are summarized in Table II. The widths of the FRET efficiency distributions from the simulations are dominated by shot-noise, caused by the limited number of photons collected for each burst.
In experiments, dye photophysics, e.g., donor and acceptor quenching or blinking, additional deviations through the necessity to correct for background, cross talk, and detection efficiencies, lack of site-specific labeling or other chemical heterogeneity, and other experimental artifacts can lead to additional broadening of the FRET efficiency distribution. 25,64 Incomplete labeling or photobleaching can lead to a donoronly peak near zero FRET efficiency which is also not present in the simulations.
For CspTm, the width of the Gaussian from the simulations is only slightly lower than the ones from the experiment, which indicates that the width of the experimental histograms is already close to the shot-noise limit.

D. Comparison to simple models for data analysis
To compare the simulation results with different simple models for data analysis, we plot the inter-dye distance R DA for the folded state and R DA in relation to the C α -distance (the distance between the C α -atoms of the respective residues the dyes are attached to) in the unfolded state in Fig. 5. This is shown for three different protein-dye systems.
For the folded state, we compare the mean and standard deviation from the distance distribution of the simulation (black crosses and error bars) to the respective values of an accessible volume calculation 11,12 (blue). The accessible volume approach calculates all sterically possible dye positions within the linkage length and considers them as equally probable. Then, it calculates the mean distance of the dyes R DA .
Both methods agree very well, with the simulations having a lower standard deviation. This originates from the dynamics in the simulations, which entropically disfavors dye states close to the protein surface, whereas the accessible volume method assumes all states to be equally probable.
For the unfolded state, the accessible volume approach faces the challenge of properly treating an unfolded ensemble of diverse conformations with dye distributions. By contrast, our model can directly simulate the whole ensemble with dyes.
Proteins in the unfolded state can be described as polymer chains. At high temperatures in the unfolded state, the SBM results in an excluded volume polymer chain. 65 To test this, we consider the equation with the spatial distance r between two chain elements, the sequence separation between two elements N, the length scaling exponent ν, and a constant C. We fit the mean C α -distance (R C α ) 2 ij 1/2 between residues i and j, averaged over the trajectory, as a function of the sequence separation N = |i j| with Eq. (8). We obtain values for the fit parameters ν which correspond well with the expected value for an excluded volume chain of ν = 3/5 65 (see SI 7 of the supplementary material). Now we determine an effective segment length N eff = N + L with a "length" L of the dye pair which was already used to interpret experimental FRET measurements. 66,67 We take the dyes' mean square separation R 2 DA 1/2 and assign an N eff from the respective fit. An example fit and details of the calculation are given in the supplementary material (see Fig. S12). It turns out that for the AF488/AF594 dye pair, the length is about L = 11.3 ± 1.3 residues, which is in the same range as found experimentally in Ref. 67. For the AF546/AF647 dye pair, it is 14.15 ± 0.65 residues. With this effective length, a correction factor m = ((N + L)/N) ν for the relation between R 2 DA 1/2 and the distance of the respective C α -atoms R 2 C α 1/2 can be calculated. It is given by m = 1.10 for CspTm (with AF488/AF594), m = 1.14 for CI-2 (with AF546/AF647), and m = 1.11 for 10 FNIII (with AF546/AF647) (for details see SI 7 of the supplementary material).
For the unfolded state, the histogram for the inter-dye distance and the corresponding C α -distance is shown in Fig. 5. It can be seen clearly that the values are not identical (which would be indicated by the dotted lines). By correcting this for the effective chain length, we expect a different relation FIG. 5. Distributions of inter-dye distances R DA and distances of the C α -atoms of the respective residues for CspTm (with AF488 and AF594 at residues 2 and 68, left), 10 FNIII (with AF546 and AF647 at residues 11 and 86, middle), and CI-2 (with AF546 and AF647 at residues 20 and 78, right). For the folded state [(a)-(c)], distance distributions (red), mean distances (crosses), and standard deviation (error bars) from simulations (black) and from accessible volume calculations (blue) are depicted. For the unfolded state [(d)-(f)], the histograms of the inter-dye distances and the corresponding C α -distances are shown. The expected dependency for equality of both values (dotted lines) and for the expectation corrected by an effective segment length of the chain (white crosses) is shown. The histogram count is scaled according to the maximum of each histogram separately.
between R DA and the C α -distance, as indicated by the white crosses, providing a good approximation of the dependence observed in the simulations.
In summary, we achieve our aim of capturing both folded and unfolded states in the same simulation method, paving the way to simulate complex large-scale conformational transitions between multiple states in the future.

V. DISCUSSION
In this work, we introduce a new simulation method for dye-labeled biomolecules with only a few robust parameters. The approach suggested here is most useful in the absence of pronounced non-native interactions or structure formation not included in the structure-based model. We are able to achieve a realistic description of the system as evidenced by direct comparison and quantitative agreement of FRET efficiency histograms with experimental data. This establishes a way to directly complement experimental FRET efficiency distributions with atomically resolved structural ensembles from simulations, which provides novel and detailed insights into biomolecular processes. Even complex scenarios involving large structural ensembles, such as unfolded proteins, intrinsically disordered proteins, conformational transitions, and folding intermediates, can be sufficiently sampled with modest computational resources.
The model can improve planning experimental measurements as one can vary and test different parameters such as the Förster radius, linker length, and labeling sites in silico. On the basis of these simulations, one can then choose the dye combinations which would, e.g., best distinguish the conformational states of interest. Further, we found that the approximation of κ 2 = 2/3 is mostly appropriate for the tested typical systems. Similarly, we could probe estimations of dye distance distributions based on simple models.
Having established the simulation techniques, many new scenarios are now accessible. These include simulating strongly restricted or slow dye motions and very short or inflexible linkers to test their effect on κ 2 . The low computational costs make investigating conformational transitions between multiple stable conformations and other large scale motions accessible. In addition, it can be used to simulate more complex FRET measurements, such as three-color FRET, [68][69][70] where the interpretation of the data is even more challenging.

SUPPLEMENTARY MATERIAL
See supplementary material for additional information about the used methods and the three-dimensional structures of the Alexa Fluor dyes.