Invited Review Article: Measurements of the Newtonian constant of gravitation, G

By many accounts, the Newtonian constant of gravitation G is the fundamental constant that is most difficult to measure accurately. Over the past three decades, more than a dozen precision measurements of this constant have been performed. However, the scatter of the data points is much larger than the uncertainties assigned to each individual measurement, yielding a Birge ratio of about five. Today, G is known with a relative standard uncertainty of $4.7 \times 10^{-5}$, which is several orders of magnitude greater than the relative uncertainties of other fundamental constants. In this article, various methods to measure G are discussed. A large array of different instruments, ranging from the simple torsion balance to the sophisticated atom interferometer, can be used to determine G. Some instruments, such as the torsion balance, can be used in several different ways. The advantages and disadvantages of different instruments as well as different methods are discussed. A narrative arc from the historical beginnings of the different methods to their modern implementation is given. Finally, the article ends with a brief overview of the current state of the art and an outlook.


I. INTRODUCTION
Gravitation is the most tangible of the four fundamental interactions of nature, having been investigated long before the discovery of the electromagnetic, weak, and strong interactions. Based on fundamental studies, mainly those of Galileo Galilei, Newton 1 formulated a description of the gravitational interaction in 1687. This physical law, which still gives, to a certain degree, a correct description of many astronomical and terrestrial observations, contains a proportionality factor known as the Newtonian constant of gravitation, abbreviated in equations by the capital letter G.
Newton's theory is correct for tabletop gravitational experiments in the laboratory but fails to describe the gravitational interaction in strong gravitational fields and at high velocities. For these cases, a relativistic model must be used. General relativity, the currently accepted theory of gravitation, was published in 1915 by Einstein 2 and has been subjected to numerous tests, 3 all of which it has passed. Gravity emerges in general relativity as a property of spacetime. Mass, as well as energy, affects the curvature of spacetime. The gravitational constant, together with the speed of light, describes the extent to which spacetime is contorted for a given mass. This article is concerned only with the limits of gravity where the Newtonian laws are sufficient to describe the system. G was first measured in the laboratory by Henry Cavendish in 1798. Since then, more than 200 experiments have been conducted to precisely determine the value of G, 4 with limited success. In June 2015, the Committee on Data for Science and Technology (CODATA) published a new recommendation for the value of G with a relative standard uncertainty of $4.7 \times 10^{-5}$. For comparison, with today's technology, time intervals can be measured to a relative uncertainty of a few parts in $10^{18}$. The uncertainty of G is based on the measurement precision reported by the experimenters, but it also includes a factor of 6.3 in order to take into account the large spread between the values obtained by different groups. This factor reduces the normalized residuals below two (see Sec. VI). The published values scatter by this factor more than they should based on the reported uncertainties. This gives reason to suspect hidden systematic errors in some of the experiments. An alternative explanation is that although the values are reported correctly, some of the reported uncertainties may be lacking significant contributions.
The uncertainty budgets can include only what experimenters know and not what they do not know. This missing uncertainty is sometimes referred to as a dark uncertainty. 5 We can assume that each experimenter has provided an uncertainty budget calculation for the experiment that is as detailed as possible, and has excluded possible systematic effects. Even if all experiments of the same type agree, a reasonable way to check for systematic errors is to repeat a measurement with another experimental approach. G is in good condition concerning the way in which it is measured.
Many different experiments have been conducted and different, sometimes inconsistent, results have been published. The field is definitely far removed from the intellectual phase locking 6 that has occurred in the past in the context of measuring other fundamental constants of nature. 7 In this review article, different experimental setups that have been used to determine G will be discussed. Where possible, historical background information on precision experiments is provided. Due to the large scope of the topic, details will be largely eschewed in favor of important problems and advances. The list of experiments given here is by no means comprehensive. For a more complete index of measurements, the reader is referred to the work of Gillies, 4 which was updated in 1997. 8 For earlier review articles, see Refs. 9-16.
The measurements of G have a two-way interaction with the state of the art in technology: (1) Advances in technology can benefit measurements of G: for example, the beam balance scientists to measure G and to learn from the many interesting experimental techniques deployed in the field. Although the experiment is difficult, it provides an excellent training ground for future scientists and engineers.
This article is organized into seven sections. Section II gives additional background information on the gravitational constant, including its applications. In Sec. III, general facts about the measurement of G are described, and different methods are briefly introduced. Section IV describes several different measurement principles in greater detail. In Sec. V, considerations regarding large masses (field masses) are examined. After an outlook in Sec. VI, the main conclusions of the article are discussed.

II. BACKGROUND
The measurement of the gravitational constant has a long history. It was the second fundamental physical constant ever measured, preceded only by the speed of light. The numerical value of the speed of light was fixed with zero uncertainty in 1983 in order to define the unit of length, the meter. Of the fundamental constants that can still be measured, G has the longest measurement history. The latest CODATA recommendation 19 assigns a value of $6.674\,08 \times 10^{-11}\ \mathrm{m^{3}\,kg^{-1}\,s^{-2}}$, with a relative measurement uncertainty of $4.7 \times 10^{-5}$, to G (Fig. 1). For comparison, shortly before the fixed value of the speed of light was established, a relative uncertainty of $3.5 \times 10^{-9}$ 20 was assigned to its value. At first glance, it seems difficult to understand that, after more than 200 years and over 200 measurements, the assigned relative uncertainty of G is as high as it is.

A. From the mean density of the Earth to G
The physical quantity discussed here appears in Newton's law of gravitation. This law was formulated by Newton in 1687, 1 and today is often written in the form
$$F = G\,\frac{m_1 m_2}{r^2}, \tag{1}$$
where $m_1$ and $m_2$ denote the masses of two bodies with a distance $r$ between their centers of mass. G plays the role of a constant of proportionality (i.e., it can be considered as a conversion factor). In fact, when Newton wrote the law of gravitation, he did not introduce this proportionality factor because at that time, laws were formulated as ratios rather than as equations. Hence, G was of no significance to Newton. Scientists were more interested in how much a celestial body weighed compared to the Earth or the Sun. Thus, originally, Cavendish did not measure G in his famous experiment but the mean density of the Earth, as his article was titled. 27 He calibrated the density of the field masses with respect to a reference material (water). Then, he measured the attraction of the test masses with respect to these field masses. As a result, he was able to give a ratio between the field masses and the Earth's mass. As the diameter of the Earth was already known, he was able to derive its mean density.
Although the constant of proportionality was not present in Newton's original publication, today, his law is written as an equation, in the form of Eq. (1). It is believed that Siméon Denis Poisson was the first to introduce a constant of proportionality in 1811 (see Ref. 28), although he did not use the character G, which is now usually used for this constant. G appeared, in all likelihood, for the first time in a publication by König and Richarz in the year 1885. 29 Currently, G is often also called "Big G" in order to distinguish it from the acceleration due to gravity, which is denoted by the lower-case letter g and is hence referred to as "Little g."

B. Applications
The current relative standard uncertainty of G is $4.7 \times 10^{-5}$, according to the 2014 least-squares adjustment performed by CODATA. This is a relatively large uncertainty when compared to the uncertainties of other fundamental constants (see Fig. 2). One may ask why the large relative uncertainty of G is not a problem in science or commerce. To answer this question, the fields and applications that require G will be briefly discussed.

1. Metrology-Metrology institutes such as the Physikalisch-Technische Bundesanstalt (PTB) in Germany and the National Institute of Standards and Technology (NIST) in the United States, according to their statutes, have a mandate to measure fundamental constants. The current measurement uncertainty attributed to G illustrates the fact that measuring it is a very difficult task, although many scientists are drawn to the challenge of reducing its uncertainty.
Another possible application of G in metrology can only be realized if its uncertainty is further reduced. In 1899, Max Planck proposed a natural system of units 28,30 that was not reliant upon artifacts. He proposed using the velocity of light, c, Planck's constant, h, and Newton's constant of gravitation, G, to define a fundamental system of units. In this system, the unit of mass, called the Planck mass, is defined as
$$m_\mathrm{P} = \sqrt{\frac{\hbar c}{G}}, \tag{2}$$
where $\hbar = h/2\pi$, with a value of 21.765 μg when expressed in the International System of Units (SI). This system of units was conceived to be valid everywhere in the universe and for all times. For this system of units to be competitive with the existing SI, the uncertainties must be equal to or better than those of the current SI. However, in the SI, mass at the highest level can be measured to a few parts in $10^{9}$, whereas G is only known to 5 parts in $10^{5}$. … Planck length play important roles in astrophysics and particle physics. In the planned revision of the International System of Units, the SI, fundamental constants will be used to define the unit system, although G will not be used.
inverse square law (i.e., the gravitational force is not proportional to 1/r 2 ). It is assumed that there is a hidden "fifth force," one which, like gravitation, is proportional to mass, but whose strength changes below a certain separation. More accurate knowledge of G can help in eliminating some of these theories and developing entirely new concepts. Furthermore, having an accurate number for G can be useful for disproving theories that predict a different number for G.
3. Geophysics and astronomy-Using observational methods, the geocentric constant of gravitation, $GM_\mathrm{Earth}$, can be determined with a relative uncertainty of only $2 \times 10^{-9}$, although the value of $M_\mathrm{Earth}$ can be given only with the uncertainty of G. The determination of the Earth's mean density is also limited by the accuracy with which G is known. Better knowledge of the mean density of the Earth is still desired in geophysics, as current knowledge limits the calibration accuracy of gravity gradiometers. 31 Gravity gradiometers with better calibrations lead to better data in geophysical prospecting.
McQueen 32 pointed out that the uncertainty in the Earth's elasticity parameters (Love numbers) is also limited by the uncertainty in G. In astronomy, G alone is rarely of importance, as most effects are caused by the product $GM$, where M is the mass of a (central) star. Even the most recent measurement of the mass of a star from the relativistic deflection of a star's light was performed at first by means of $GM$. 33 Furthermore, star masses are very often given as a ratio to the solar mass, whose measurement is again obtained from a measured product $GM_\odot$.

III. GENERAL CONSIDERATIONS FOR MEASUREMENTS
In SI units, the quantity G has the unit $\mathrm{m^{3}\,s^{-2}\,kg^{-1}}$. In order to determine G, quantities with the dimensions of length, time, and mass have to be measured. In SI units, the numerical value $\{G\} = 6.674\,08 \times 10^{-11}$ is very small (ten orders of magnitude below unity). The size of the unitless number indicates that the associated forces between objects in a laboratory are small, in terms of the units used in everyday life. This provides a first indication of why it is so difficult to measure G accurately. The gravitational force between two spherical masses of 1 kg at a distance of 1 m is 67 pN, which equals the weight of a mass of about 6.7 ng. For comparison, the mass of a human cell is approximately 1 ng. 34 From this thought experiment, it is clear that at least two large masses have to be brought as close together as possible in order to maximize the signal. This requires a high volumetric mass density, which allows large masses to be brought close to each other.
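The order-of-magnitude estimate above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `gravitational_force` and the nominal local gravity `g_local = 9.81` m s⁻² are assumptions made for illustration, not values taken from the article, and the equivalent mass depends slightly on the value of g that is assumed.

```python
# Newtonian attraction between two spherical (or point-like) masses.
G = 6.67408e-11   # m^3 kg^-1 s^-2, CODATA value quoted in the text
g_local = 9.81    # m s^-2, assumed nominal local gravity (not from the text)


def gravitational_force(m1, m2, r, G=G):
    """Return the attractive force in newtons, F = G * m1 * m2 / r**2."""
    return G * m1 * m2 / r**2


# Two 1 kg spheres whose centers are 1 m apart.
force = gravitational_force(1.0, 1.0, 1.0)

# Mass whose weight in the Earth's field equals that force.
equivalent_mass = force / g_local

print(f"force           = {force * 1e12:.1f} pN")
print(f"equivalent mass = {equivalent_mass * 1e12:.1f} ng")
```

The force scales with the product of the masses and with the inverse square of the separation, which is why dense materials that allow a small center-to-center distance are attractive.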

A. Microgravity environment
Equation (1) indicates two methods of measuring G: either by measuring the (attracting) force between two well-characterized masses or by measuring the acceleration a of one mass towards the other, which is
$$a = G\,\frac{m_1}{r^2}, \tag{3}$$
where $m_1$ denotes the field (generating) mass. Note that Newton's law of gravitation is symmetric for the masses $m_1$ and $m_2$. However, the terms field mass and test mass are used to describe these experiments. The test mass is connected to a sensor, while the position of the field mass is usually changed in regular time intervals in order to modulate the gravitational signal. Although the distinction is made here and in most articles describing the experiments, one should be aware of the inherent symmetry of the gravitational law. A force (or more precisely, torque) measurement was used in the first laboratory experiment to determine G. In 1798, Henry Cavendish used a torsion balance, shown in Fig. 3, to determine G from the gravitational torque acting on a dumbbell suspended from a torsion fiber. The apparatus was designed and built by Rev. John Michell; however, he did not complete the apparatus or perform any experiments on it before his death in 1793. Michell invented the apparatus with the purpose of determining the mean density of the Earth. Rev. F. J. H. Wollaston later forwarded it to Cavendish, who carried out the now-famous measurement. 12 Independently, Charles A. de Coulomb invented a torsion balance in 1777. Gehler 35 pointed out that Coulomb was the first to use fibers to suspend a torsion pendulum, experimenting at first with hairs and silk fibers before employing metal fibers at a later time.
Michell, on the other hand, began by balancing the torsion bob, a thin wire, on a tip, similar to a compass needle, in 1768. Later, he adopted a fiber to suspend a dumbbell. One noteworthy feature of the torsion balance is that it decouples the vertical gravitational force caused by the Earth from the horizontal one caused by the field masses. This is necessary because if the attractive force between two bodies is measured vertically, the signal will be between a million ($10^{6}$) and a trillion ($10^{12}$) times smaller than the background caused by the gravitational attraction due to the Earth. The exact signal-to-background ratio depends on the size and geometry of the mass assembly.
The disadvantageous signal-to-background ratio is illustrated in Fig. 4. A 1 kg mass is suspended from a spring, leading to a 10 cm elongation (to cite one example) due to the gravitational force between the Earth and the mass. A 1 kg field mass is then placed 1 m below the suspended mass, adding an additional gravitational force to the setup. This additional force causes an extra elongation of only 0.67 pm, which is about 1 000 000 times smaller than one wavelength of light. This measurement is thus a very challenging one to perform, considering environmental factors that cause changes in the spring constant and, thus, drift in the mass position. To make matters worse, the local gravitational acceleration g is not constant, but it is a function of time due to the apparent motion of the Sun and the Moon, both of which exert additional forces known as tides. The relative change of g due to the tides has a peak-to-peak amplitude of about $3 \times 10^{-7}$ (i.e., the change in force due to the tides is 15 000 times larger than the force produced by the field mass). The brief example of the spring balance clearly illustrates the merit of the torsion balance. Cavendish's setup maintains both attracting masses at the same vertical height, and the attracting horizontal force is observed. Apart from the horizontal gravity gradients of the Earth's gravity field and the surrounding masses in the laboratory, this force contains only the gravitational attraction due to the field and test masses. Since the torsion wire in Cavendish's setup has a very small restoring torque, small external torques can be measured. In effect, the torsion balance creates a microgravity environment for one degree of freedom, the rotation of the torsion bob about the fiber axis. The lack of (angular) accelerations along this direction allows for long measurement times. Although the torsion wire is a crucial component in this setup, it is also an important source of error, as was shown in Ref. 36 (see Sec. IV A 2 b).
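The spring-balance thought experiment can be checked numerically. The sketch below assumes an ideal Hooke's-law spring and a nominal local gravity of 9.81 m s⁻² (an assumption, not a value from the article); all variable names are illustrative.

```python
# Spring-balance thought experiment: extra elongation caused by a field mass.
G = 6.67408e-11           # m^3 kg^-1 s^-2, value quoted in the text
g_local = 9.81            # m s^-2, assumed nominal local gravity

test_mass = 1.0           # kg, suspended from the spring
field_mass = 1.0          # kg, placed below the test mass
separation = 1.0          # m, center-to-center distance
static_elongation = 0.10  # m, elongation caused by the Earth's attraction

# Hooke's law: the weight of the test mass stretches the spring by 10 cm.
spring_constant = test_mass * g_local / static_elongation   # N/m

# Additional force and elongation due to the field mass.
extra_force = G * test_mass * field_mass / separation**2    # N
extra_elongation = extra_force / spring_constant            # m

# Ratio of the gravitational signal to the background weight.
signal_to_background = extra_force / (test_mass * g_local)

print(f"extra elongation     = {extra_elongation * 1e12:.2f} pm")
print(f"signal-to-background = {signal_to_background:.1e}")
```

The extra elongation comes out well below one picometer, which is the point of the example: a vertical measurement has to resolve a signal that is roughly eleven orders of magnitude smaller than the background.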
A microgravity environment, with accelerations of approximately $10^{-6}\,g$, can also be achieved in an artificial satellite. The field mass and test mass fall on a circular orbit around the Earth, and the relative acceleration between the two is given only by the small gradient of the gravitational field and the gravitational attraction between the two masses. Unlike a torsion balance, such a system can be realized without the disturbing properties of a suspension: the two masses can float within the satellite without a mechanical connection to the satellite. Such a drag-free test mass was recently successfully demonstrated by the LISA (Laser Interferometer Space Antenna) Pathfinder mission. 37 Observing the mutual gravitational attraction in a microgravity environment permits very long integration times since (in principle) the bodies continue falling forever.
However, space experiments are usually prohibitively expensive and require long preparation times, often over several decades. To date, several ideas for space experiments to determine G have been proposed. 38 Only one such proposal, called "Satellite Energy Exchange (SEE)," is still being actively prepared. 39 To avoid costly space experiments, other possibilities that allow free-fall are appealing. Obviously, on Earth, very long integration times cannot be achieved, as the object falls towards the Earth and not around it (as it does in space). The drop tower in Bremen, Germany, for example, allows a drop duration of 9.3 s. 40 In some cases, this may not be long enough to obtain a good measurement of G; unfortunately, the facility's tight schedule prevents many repetitions from taking place in succession (even if this were possible, it would become very expensive over time). Another option is sounding rockets. Experiments on sounding rockets have been considered for testing the equivalence principle, but not for measuring G. 41

B. Free-fall method
In the example above, the additional field mass between the Earth and the suspended mass alters the local acceleration g at the position of the test mass. This idea can be, and has been, used to determine G. Free-fall absolute gravimeters are precision instruments that can measure the local acceleration with relative uncertainties on the order of $10^{-9}$. Gravimeters have a broad application in geodesy and geophysics and have recently been used in fundamental metrology as well, for the planned redefinition of the SI unit of mass, the kilogram, via the Kibble (or watt) balance. 42 In contrast to space experiments, however, only one mass is in free fall. A well-known field mass can be placed close to the trajectory of the test mass. The gravimeter measures the perturbed local gravity. By modeling the perturbation produced by the field mass, G can be determined.
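To get a feeling for the size of the perturbation, the following sketch treats the field mass as a point mass directly below the test mass. The field mass of 500 kg, the distance of 0.5 m, and the nominal gravity of 9.81 m s⁻² are hypothetical values chosen for illustration; a real experiment must integrate over the actual mass distribution and the full drop trajectory.

```python
# Perturbation of local gravity by a nearby field mass (point-mass sketch).
G = 6.67408e-11               # m^3 kg^-1 s^-2, value quoted in the text
g_local = 9.81                # m s^-2, assumed nominal local gravity
gravimeter_resolution = 1e-9  # relative uncertainty of g, order quoted in the text


def point_mass_acceleration(field_mass, distance, G=G):
    """Acceleration towards a point mass, a = G * m / r**2."""
    return G * field_mass / distance**2


field_mass = 500.0   # kg, hypothetical
distance = 0.5       # m, hypothetical distance to the test-mass trajectory

delta_g = point_mass_acceleration(field_mass, distance)   # m s^-2
relative_signal = delta_g / g_local

# If the gravimeter resolution were the only limitation, a single
# determination of delta_g (and hence of G) would carry this relative uncertainty.
relative_uncertainty_G = gravimeter_resolution * g_local / delta_g

print(f"delta_g            = {delta_g:.3e} m/s^2")
print(f"delta_g / g        = {relative_signal:.2e}")
print(f"rel. uncertainty G = {relative_uncertainty_G:.1%}")
```

Under these assumptions the signal is only about one order of magnitude above the gravimeter resolution, which illustrates why large field masses, small distances, and many repeated drops are needed.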

C. Beam balance
Besides the torsion balance and absolute gravimeters, a third instrument can be used to measure G: a beam balance. If a mass of, for example, 1 kg is placed on each side of a beam balance, the balance will stay in equilibrium as long as the local gravity at the positions of each mass is equal. If a heavy field mass is placed close to one of the test masses, the local gravity at this test mass will change and the balance will get out of equilibrium since the forces on both masses are different. By measuring the force difference, we can determine G if the field and test mass are well characterized. To a certain extent, this method is similar to the torsion balance method: The gravitational pull from the Earth is supported by the fulcrum and only differential forces cause an indication of the balance. Newer versions of precision balances use an electromagnetic force compensation; as a result, the mechanical weight on one side is compensated by an electromagnetic force on the other side. In this case, however, the tidal variations of gravity have to be taken into account.

D. Simple pendulum
Finally, a simple pendulum can also be used to determine G. With a pendulum, there are basically two ways of measuring G. The first is to measure the frequency change of the pendulum swing when a field mass is placed below the pendulum bob. This is possible because the swing time, T, depends on gravity, g, as
$$T = 2\pi\sqrt{\frac{l}{g}}, \tag{4}$$
where l is the pendulum length. A second way is to deflect the pendulum bob by placing a field mass horizontally next to it. Both ways are similar to variants of the torsion balance, namely, the time-of-swing method and the simple deflection (Cavendish) method, the difference being that the torsion balance has a higher sensitivity. 43

IV. MEASUREMENT PRINCIPLES
In Sec. III, a brief overview of the different approaches to measure G was given. In Secs. IV A-IV D, the different techniques are described in more detail.

A. Torsion balance
A torsion pendulum is a standard example of a simple harmonic oscillator. It consists of two components: the torsion fiber, which provides a restoring torque, and the pendulum bob, which contributes inertia (see Fig. 5). For a freely swinging torsion pendulum, the energy of the pendulum is stored as potential energy in the fiber or as kinetic energy in the rotating bob. The energy switches between these two types of energy every quarter period.
The design of the torsion bob depends on the measurement goal of the torsion balance. Its shape is optimized to allow it to be coupled to external forces that are often modulated. Torsion balances have many applications other than the determination of G. A review of torsion balances can be found in Ref. 44.
To measure the influence of the external force on the torsion pendulum, a readout is required. Very often, the readout is performed optically with an autocollimator. In an autocollimator, the divergent beams of a light source are collimated into a parallel beam through a lens. The beam is reflected by a mirror mounted on the torsion pendulum and returned to the same lens. The returning beam is focused onto a position-sensitive detector. By using the same lens twice (for collimation and focusing), the system is not only simpler but also more robust regarding optical aberrations. Further details are available in Refs. 45-47.

1. Basic equations-The differential equation of a torsion pendulum is
$$I\ddot{\theta} + \kappa\,(1 + i\phi)\,\theta = N(t), \tag{5}$$
where I is the moment of inertia of the pendulum body, κ is the torsional spring constant, θ is the azimuthal angle, and N(t) is the external torque applied at the pendulum bob. The torsion fiber is considered lossy, and ϕ denotes the loss angle of the torsion spring, which is the inverse of the quality factor, ϕ = 1/Q. The damping described by an imaginary component of the torsion constant is called internal damping. Another possibility for damping is velocity-dependent damping, often called external damping. External damping requires an additional term of $\gamma\dot{\theta}$ on the left-hand side of Eq. (5). Gas pressure damping is an example of external damping. Modern torsion balance experiments are (mostly) limited by internal damping, while the external damping term can be ignored. For experiments at room temperature and in vacuum using metal fibers, quality factors of several thousand are typical. Hence, the loss angles are very small and the equations can be expanded in a Taylor series around ϕ = 0. In the formulas below, we expand up to second order.
The angular frequency of the free oscillation is
$$\omega_o \approx \sqrt{\frac{\kappa}{I}}\left(1 + \frac{\phi^2}{8}\right).$$
The decay time τ of the amplitude is given by
$$\tau \approx \frac{2}{\phi}\sqrt{\frac{I}{\kappa}}.$$
If the external torque is sinusoidal with angular frequency ω, i.e., $N(t) = N_o \exp(-i\omega t)$, the motion of the pendulum at this frequency is given by $\theta(t) = \theta_a \exp(-i\omega t)$ with
$$\theta_a = \frac{N_o}{\kappa\,(1 + i\phi) - I\omega^2}.$$
The response to a torque that is more complicated than a pure spectral tone can be calculated from the response to each Fourier component by synthesizing the responses.
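The driven response can be explored numerically. Substituting a sinusoidal torque into Eq. (5) gives the complex amplitude θ_a = N_o / [κ(1 + iϕ) − Iω²], and the leading-order amplitude decay time of the free oscillation is 2Q/ω_o; both expressions are derived here from Eq. (5) rather than quoted from the article. The pendulum parameters below are hypothetical and serve only as an illustration.

```python
import math


def response_amplitude(torque_amplitude, omega, inertia, kappa, loss_angle):
    """Complex angular amplitude of a pendulum with a lossy (complex) spring.

    Obtained by inserting theta = theta_a * exp(-i*omega*t) into Eq. (5).
    """
    return torque_amplitude / (kappa * (1 + 1j * loss_angle) - inertia * omega**2)


# Hypothetical pendulum parameters, chosen only for illustration.
inertia = 1.0e-4   # kg m^2
kappa = 1.0e-8     # N m / rad
Q = 2000.0         # quality factor
phi = 1.0 / Q      # loss angle
N_o = 1.0e-12      # N m, amplitude of the applied torque

omega_0 = math.sqrt(kappa / inertia)   # rad/s, undamped resonance frequency
period = 2 * math.pi / omega_0         # s

static_amplitude = abs(response_amplitude(N_o, 0.0, inertia, kappa, phi))
resonant_amplitude = abs(response_amplitude(N_o, omega_0, inertia, kappa, phi))

# Leading-order amplitude decay time of the free oscillation.
decay_time = 2.0 * Q / omega_0         # s

print(f"period             = {period:.0f} s")
print(f"static deflection  = {static_amplitude:.2e} rad")
print(f"resonant amplitude = {resonant_amplitude:.2e} rad")
print(f"decay time         = {decay_time:.2e} s")
```

On resonance the amplitude exceeds the static deflection by a factor close to Q, which is the usual signature of a lightly damped oscillator.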
The non-negligible loss angle of the fiber has another important experimental consequence: A loss in a system is connected via the fluctuation-dissipation theorem to the noise in the system. The single-sided power spectral density $S_\tau(f)$ of the torque is given by
$$S_\tau(f) = 4 k_B T R, \tag{9}$$
where $k_B$ is the Boltzmann constant, T is the temperature of the system, and R is the real part of the mechanical impedance, 48 in this case $R = \kappa/(2\pi Q f)$. From the spectral density and the known signal frequency, the time that is required to measure a signal with a given statistical uncertainty can be calculated. In Fig. 6, the power spectral amplitudes of the torque and of the angle readout of a torsion balance are shown.
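The thermal torque noise and the resulting integration time can be estimated as follows. The sketch assumes the standard fluctuation-dissipation form S = 4 k_B T R with R = κ/(2πQf), and it uses the rough estimate that the amplitude of a sinusoidal signal in locally white noise is determined with a variance of S/t after a measurement time t. The fiber parameters, the signal torque, and the target uncertainty are hypothetical.

```python
import math

K_B = 1.380649e-23   # J/K, Boltzmann constant


def torque_noise_psd(frequency, kappa, quality_factor, temperature):
    """Single-sided thermal torque PSD in (N m)^2/Hz for internal damping.

    Uses S = 4 k_B T R with R = kappa / (2 pi Q f).
    """
    resistance = kappa / (2 * math.pi * quality_factor * frequency)
    return 4 * K_B * temperature * resistance


def integration_time(psd, torque_uncertainty):
    """Rough time needed to reach a given torque uncertainty.

    Assumes white noise near the signal frequency and var(amplitude) = S / t.
    """
    return psd / torque_uncertainty**2


# Hypothetical room-temperature metal fiber.
kappa = 1.0e-8            # N m / rad
quality_factor = 2000.0
temperature = 293.0       # K
signal_frequency = 1.0e-3 # Hz

psd = torque_noise_psd(signal_frequency, kappa, quality_factor, temperature)
amplitude_density = math.sqrt(psd)          # N m / sqrt(Hz)

signal_torque = 1.0e-12                     # N m, hypothetical
target = 1.0e-5 * signal_torque             # 10 ppm of the signal
time_needed = integration_time(psd, target) # s

print(f"torque noise     = {amplitude_density:.2e} N m / sqrt(Hz)")
print(f"integration time = {time_needed / 86400:.1f} days")
```

Because the spectral density is proportional to κ/Q, doubling the quality factor at fixed κ halves the required integration time in this simple estimate.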
2. Special fibers-According to Eq. (9), the signal-to-noise ratio can be improved by minimizing the term κ/Q (i.e., the imaginary part of the spring constant). As described in Sec. IV A 2 b, a large quality factor also minimizes a systematic bias in the time-of-swing method. Motivated by these two benefits, researchers in recent years have attempted to increase the quality factors of the torsion fibers in their experiments.
Because tungsten has a high tensile strength, a thin fiber (and thus a small torsion constant) suffices to carry a given load; tungsten fibers were therefore used in many torsion balances. The quality factor of a torsion pendulum supported by a tungsten fiber can reach values of up to several thousand (see Table I). In order to obtain higher quality factors, different approaches are needed. To date, three different approaches have been used: fused silica fibers, metal fibers at cryogenic temperatures, and torsion strips.
The ultimate yield strength of fused silica fibers is $1000\ \mathrm{N\,mm^{-2}}$ or higher. 60,61 By way of comparison, the yield strengths of tungsten fibers are typically above $2000\ \mathrm{N\,mm^{-2}}$. Hence, silica fibers must have a 40 % larger radius to carry the same load as tungsten fibers. A larger radius will increase the torsional stiffness of the fiber, which scales with the fourth power of the radius. Despite this increase in κ, fused silica is still a good option since κ/Q decreases due to its large quality factor. In fused silica, the quality factor is dependent on temperature and frequency. At high frequencies of several thousand hertz, quality factors of $10^{8}$ can be achieved. At frequencies typical for torsion balances (from one to several hundred mHz), quality factors of $10^{6}$ are still possible. Unfortunately, because fused silica is an insulator, a pendulum bob suspended by a fused silica fiber is not electrically grounded. This causes increased noise or even systematic effects due to spurious electrostatic forces created by charges on the bob. This problem can be solved with three approaches: discharging the pendulum bob with ultraviolet light, 62,63 coating the silica fiber with a conductive film, 64 or ensuring sufficient separation between the pendulum body and its surroundings.
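The 40 % figure and its consequence for the stiffness can be made explicit with a short worked estimate. It assumes round fibers of equal length that are both loaded to their yield strength σ, uses the quoted strength ratio of 2000 to 1000, and uses the standard torsion constant of a round fiber, κ = πFr⁴/(2L), where F is the shear modulus; the subscripts are introduced here for illustration.

```latex
\begin{align*}
  \sigma\,\pi r^{2} = M g
    \quad &\Rightarrow \quad
    \frac{r_\mathrm{silica}}{r_\mathrm{W}}
      = \sqrt{\frac{\sigma_\mathrm{W}}{\sigma_\mathrm{silica}}}
      = \sqrt{\frac{2000}{1000}} \approx 1.41, \\
  \kappa = \frac{\pi F r^{4}}{2L}
    \quad &\Rightarrow \quad
    \frac{\kappa_\mathrm{silica}}{\kappa_\mathrm{W}}
      = \frac{F_\mathrm{silica}}{F_\mathrm{W}}
        \left(\frac{r_\mathrm{silica}}{r_\mathrm{W}}\right)^{4}
      = 4\,\frac{F_\mathrm{silica}}{F_\mathrm{W}}.
\end{align*}
```

The geometric penalty is thus a factor of four in κ, before the ratio of the shear moduli is taken into account; it is easily outweighed when the quality factor increases by several orders of magnitude.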
Another way in which the thermal noise can be reduced is by lowering the temperature, as doing so involves two beneficial mechanisms. First, the thermal energy $k_B T$ is lowered, and second, the quality factor increases for most metals with decreasing temperature. For fused silica fibers, the dependence of the quality factor on the temperature is more complicated. A detailed investigation 65 of fibers made from Aluminum 6061 and beryllium copper contains strong evidence of a stick-slip mechanism. A spring with a stick-slip mechanism can be imagined as a series of blocks on a surface connected by springs. 66
Another interesting suspension for a torsion bob is the torsion strip, a fiber with a rectangular cross section. For a strip of width b, thickness t, and length L, the torsion constant is
$$\kappa = \frac{b t^{3} F}{3L} + \frac{M g b^{2}}{12 L}, \tag{10}$$
where F is the material's shear modulus and Mg is the weight of the pendulum bob. 67 Two terms contribute to the restoring torque. The first term in Eq. (10) is analogous to the elastic term that provides the restoring torque in a circular fiber. The second term is a consequence of the fact that, as the ribbon is twisted, the pendulum bob is raised in the gravitational field of the Earth. Since the gravitational force is conservative, the second term is lossless. The imaginary part of the spring constant is given by the imaginary part of the elastic spring, but the real part of the spring constant is given by the sum of the elastic spring and the gravitational spring. Hence, the ratio of the imaginary part to the real part is smaller for the combination of both components than for the elastic part alone. This mechanism allows for an increase in Q. For example, for the strip used in the BIPM experiment, gravity provides about 90 % of the restoring torque. Quality factors on the order of 100 000 were achieved with the BIPM torsion strip. Table I gives an overview of the properties of the fibers used in recent torsion balance experiments to measure G.
It can be seen that the torsion strip has a torsion constant that is four to five orders of magnitude larger than that of traditional round fibers. However, since the quality factor of the torsion strip is about one to two orders of magnitude larger than the quality factors of metal fibers at room temperature, the torque sensitivity of the strip is only reduced by a factor of one hundred. This reduced torque sensitivity is compensated by the fact that the torsion strip can carry a greater load; thus, the gravitational torque can be made larger by several orders of magnitude. Hence, a precise experiment can be built using a torsion strip. Note that this conclusion differs from that reached by Boys in 1889, 68 who argued in favor of the thinnest possible fiber. In the BIPM experiment, the gravitational torque, at $3 \times 10^{-8}$ N m, is several orders of magnitude larger than in other torsion balance experiments. The corresponding angular deflection of the pendulum bob is about 30 arc sec.
a. Static deflection. The static deflection method was used by Cavendish to measure the mean density of the Earth. A static torque on a torsion balance causes a deflection from the equilibrium position. The equilibrium position is usually not known, and two measurements are required. The two measurements are performed with two different field mass arrangements. The field masses can be "near" or "far," or they can be located clockwise or counterclockwise from the torsion fiber. The gravitational torques produced by the field masses on the test masses are denoted $N_n$ and $N_f$ for the two field mass positions. The angular excursions of the pendulum bob are given by
$$\kappa\,[\theta_n - \theta_o(t_n)] = N_n \tag{11}$$
and
$$\kappa\,[\theta_f - \theta_o(t_f)] = N_f. \tag{12}$$
Hence, if $\theta_o(t_n) \approx \theta_o(t_f)$, then
$$\kappa\,(\theta_n - \theta_f) \approx N_n - N_f. \tag{13}$$
Here, $\theta_o$ denotes the unknown equilibrium position of the pendulum. In general, the equilibrium position is a function of time [i.e., $\theta_o = \theta_o(t)$ due to a slow unwinding of the fiber, usually referred to as drift]. The usual data analysis techniques can be used to suppress this drift. 69-71 The torque depends on the gravitational constant and a constant that can be calculated from the known mass distribution (i.e., $N_n = G c_n$ and likewise $N_f = G c_f$, where $c_n$ and $c_f$ have units $\mathrm{kg^{2}\,m^{-1}}$). The unknown torsional stiffness κ is obtained from another measurement. The field masses are usually removed or placed in the far position. A measurement of the pendulum's period $T_o$ and a calculation of the moment of inertia I yield
$$\kappa = \frac{4\pi^{2} I}{T_o^{2}}. \tag{14}$$
In summary, the value of the gravitational constant measured by the static-deflection method is obtained by
$$G = \frac{4\pi^{2} I}{T_o^{2}}\,\frac{\theta_n - \theta_f}{c_n - c_f}. \tag{15}$$
The measurement procedure outlined above relies on several assumptions that are worth considering in more detail:

1. The equilibrium position of the balance $\theta_o$ is usually a function of time. As discussed above, the torsion fibers are most realistically modeled as springs with an imaginary spring constant (loss). The associated thermal noise results in a 1/f behavior of the equilibrium position. Hence, the two equilibrium positions should be measured within a short time of each other. However, another problem arises: When the torque on the torsional oscillator is changed, the oscillator acquires an oscillation amplitude. For example, if the torque on the pendulum is abruptly changed by $N_2 - N_1$, the pendulum gains an amplitude of $(N_2 - N_1)/\kappa$. It is difficult to measure the exact equilibrium position of the pendulum if an oscillation with a large amplitude is present. Several strategies are available to cope with this problem: (a) A torsion pendulum with high damping is used. This eliminates the excitation problem but introduces thermal noise, thus making it difficult to measure the equilibrium position on a feasible time scale. (b) The experiment utilizes switchable damping. The damping is turned on during and shortly after the move and turned off for the measurement of the equilibrium position. Damping can be achieved, for example, by means of active electrostatic feedback. (c) The motion is carefully measured, and the equilibrium position is obtained from a fit of a decaying sine function to the measurements. (d) The motion of the field mass is not abrupt but is optimized so as not to excite the pendulum. One possible motion changes the torque by half of the total step and, half a pendulum period later, by the remaining half. This motion is sometimes referred to as the "crane operator trick."

2. The torsional constant is measured at the oscillation frequency of the pendulum but is applied to the calculation at (almost) zero frequency. However, the spring constant arises from elasticity that is generally a function of frequency, with $\kappa(0) < \kappa(f_o)$. This problem is discussed in greater detail for the time-of-swing method. If the effect of the anelasticity is not taken into account, the measured value of G will be higher than the true value of G.
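Strategy (d) can be checked with a short calculation. The following is a minimal sketch, assuming an ideal undamped torsion pendulum; the period and the size of the equilibrium shift are illustrative values, not taken from any experiment. It compares the residual oscillation amplitude after an abrupt torque change with that after a change applied in two halves separated by half a period.

```python
import math

def evolve(theta, velocity, theta_eq, w0, t):
    """Exact free motion of an undamped pendulum about theta_eq for a time t."""
    x = theta - theta_eq
    c, s = math.cos(w0 * t), math.sin(w0 * t)
    new_x = x * c + (velocity / w0) * s
    new_v = -x * w0 * s + velocity * c
    return new_x + theta_eq, new_v

def residual_amplitude(theta, velocity, theta_eq, w0):
    """Oscillation amplitude about theta_eq for the given state."""
    return math.hypot(theta - theta_eq, velocity / w0)

T0 = 120.0                 # pendulum period in s (assumed)
w0 = 2.0 * math.pi / T0
delta = 1.5e-4             # total equilibrium shift (N2 - N1)/kappa in rad (assumed)

# Abrupt change: the pendulum starts at rest at the old equilibrium.
amp_abrupt = residual_amplitude(0.0, 0.0, delta, w0)

# Two-step change: half the shift now, the other half after T0/2.
theta, velocity = evolve(0.0, 0.0, delta / 2.0, w0, T0 / 2.0)
amp_two_step = residual_amplitude(theta, velocity, delta, w0)

print(f"abrupt: {amp_abrupt:.3e} rad, two-step: {amp_two_step:.3e} rad")
```

In this idealized model the pendulum arrives at the new equilibrium with zero velocity exactly when the second half of the torque step is applied, so no oscillation remains. Damping and finite move times would modify the optimal timing.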
In modern times, the static mode is hardly used. Notable exceptions are two experiments at the BIPM by Quinn and co-workers. 55,72 The group built a single torsion balance that could be operated with three different methods: static deflection, time-of-swing, and electrostatic feedback. In the end, two methods-static deflection and electrostatic feedback-were used to determine G. The first result was published in 2001. Subsequently, the apparatus was completely rebuilt, and the second result was published in 2013. Both results are consistent with each other. Apart from these two experiments, no other precision experiment has used the static deflection in the past thirty years.
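The chain of relations in the static-deflection method (torsion constant from the free period, then G from the difference of the two deflections) can be illustrated with a round-trip calculation. This is a sketch under stated assumptions: the moment of inertia, period, geometry constants, and equilibrium angle are hypothetical numbers chosen only so that the torque is of order $10^{-8}$ N m, and the drift of the equilibrium position between the two states is neglected.

```python
import math

def torsion_constant(inertia, period):
    """kappa = 4 pi^2 I / T_o^2 from the free oscillation period."""
    return 4.0 * math.pi**2 * inertia / period**2

def g_static(inertia, period, theta_n, theta_f, c_n, c_f):
    """G = kappa (theta_n - theta_f) / (c_n - c_f)."""
    kappa = torsion_constant(inertia, period)
    return kappa * (theta_n - theta_f) / (c_n - c_f)

# Hypothetical apparatus parameters (not from any real experiment).
G_TRUE = 6.674e-11          # m^3 kg^-1 s^-2, used to generate synthetic data
inertia = 0.08              # kg m^2
period = 120.0              # s
c_n, c_f = 450.0, -450.0    # kg^2 m^-1, mass-distribution constants
theta_0 = 1.0e-3            # rad, unknown equilibrium position (drift neglected)

# Synthetic deflections for the two field mass positions.
kappa = torsion_constant(inertia, period)
theta_n = theta_0 + G_TRUE * c_n / kappa
theta_f = theta_0 + G_TRUE * c_f / kappa

G_recovered = g_static(inertia, period, theta_n, theta_f, c_n, c_f)
print(f"kappa = {kappa:.3e} N m/rad, G = {G_recovered:.6e}")
```

The unknown equilibrium position cancels in the difference of the two angles, which is why two field mass arrangements are needed.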

b. Time of swing.: In the time-of-swing method, the pendulum's period is measured with the field masses in two different positions. Similar to the static-deflection method, in one state the field masses are nearby, while in the other state the field masses are far away. The gravitational potential adds to the potential of the fiber, thereby increasing the restoring torque. Equation (5) must be modified to add the contribution of the field masses in the "near" and "far" positions to the differential equation. This yields, for the "near" field masses (FMs), $I\ddot{\theta} + (\kappa + \kappa_n + i\kappa\phi)\,\theta = N_n$ (16) and, for the "far" FMs, $I\ddot{\theta} + (\kappa + \kappa_f + i\kappa\phi)\,\theta = N_f$. (17) The spring associated with the gravitational potential is lossless, so the imaginary spring constant in the equations above is derived exclusively from the torsion fiber. Usually, the field masses are placed in such a way that the gravitational torque on the pendulum vanishes, $N_n = N_f = 0$; hence, the equilibrium position of the pendulum remains unchanged between the two states, a notable difference to the static-deflection method. The complex solutions for the squared angular frequencies of Eqs. (16) and (17) are $\omega_n^2 = \frac{\kappa + \kappa_n}{I} + i\,\frac{\kappa\phi}{I}$ (18) and $\omega_f^2 = \frac{\kappa + \kappa_f}{I} + i\,\frac{\kappa\phi}{I}$. (19) Hence, $\omega_n^2 - \omega_f^2 = \frac{\kappa_n - \kappa_f}{I}$. (20) The gravitational spring constant depends on the mass distribution and G (i.e., $\kappa_{n,f} = k_{n,f}\,G$), where, again, the constants have dimensions of kg$^2$ m$^{-1}$. Note that the constants $k_{n,f}$ are the derivatives, with respect to the torsion angle, of the constants $c_{n,f}$ used in the static deflection mode. In summary, $G = \frac{I\,(\omega_n^2 - \omega_f^2)}{k_n - k_f}$. (21) The main advantage of the time-of-swing method is that the primary measurand is a time interval. Besides the metrology of the mass distribution, which is common to all G determinations, only periods need to be measured. The time interval is the physical quantity that can be measured with the highest precision, especially in the age of GPS (Global Positioning System).
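The cancellation of the fiber's torsion constant and of its loss term in the difference of the squared angular frequencies can be made concrete with a small numerical sketch. All parameter values below are hypothetical, and the fiber constant is assumed to be identical in both states, which is exactly the assumption questioned later in connection with the Kuroda effect.

```python
def omega_squared(kappa, kappa_grav, loss_angle, inertia):
    """Complex squared angular frequency: (kappa + kappa_grav + i kappa phi) / I."""
    return complex(kappa + kappa_grav, kappa * loss_angle) / inertia

def g_time_of_swing(inertia, w2_near, w2_far, k_near, k_far):
    """G = I (omega_n^2 - omega_f^2) / (k_n - k_f), using the real part."""
    return inertia * (w2_near - w2_far).real / (k_near - k_far)

# Hypothetical parameters (illustrative only).
G_TRUE = 6.674e-11       # m^3 kg^-1 s^-2, used to generate synthetic data
inertia = 4.0e-5         # kg m^2
kappa = 1.0e-8           # N m/rad, fiber torsion constant
loss_angle = 1.0e-4      # phi, roughly 1/Q of the fiber
k_near, k_far = 0.5, -0.5  # kg^2 m^-1, gravitational spring geometry constants

w2_near = omega_squared(kappa, G_TRUE * k_near, loss_angle, inertia)
w2_far = omega_squared(kappa, G_TRUE * k_far, loss_angle, inertia)

imag_residual = (w2_near - w2_far).imag   # the loss term drops out
G_recovered = g_time_of_swing(inertia, w2_near, w2_far, k_near, k_far)
print(f"G = {G_recovered:.6e}, imaginary residual = {imag_residual:.1e}")
```

Neither the fiber constant nor the loss angle enters the result, provided both are the same at the two oscillation frequencies.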
In contrast to the time-of-swing method, the other three torsion-balance methods require the measurement of an absolute quantity: angle, angular acceleration, or feedback voltage. It is much more difficult to measure these quantities with a relative uncertainty of $10^{-5}$ than it is to measure a time interval with the same relative uncertainty.
The simplicity of measuring a time interval has convinced many experimenters in the past thirty years to use the time-of-swing method to measure G. In 1982, Luther and Towler published a result with a relative standard uncertainty of $64 \times 10^{-6}$. 49,50 For many years, this result served as a kind of "gold standard" among G experiments. Fourteen years later, Karagioz and Izmailov published a new result using the time-of-swing method. The main purpose of their experiment was to search for a violation of the inverse square law by measuring G with field masses at several distances. Since then, they have continued to collect data with their apparatus. In 1997, Luther published another result with the doctoral student Bagley. 52 The catalyst for this work was the discovery of the Kuroda effect (see below). The first measurement by the gravity group at the Huazhong University of Science and Technology (HUST) under the leadership of Luo was published in 1998. 73,74 The group published a second result in 2010 with a relative standard uncertainty of $26 \times 10^{-6}$. 57 More recently, Newman and his collaborators published their results using a cryogenic torsion pendulum. Newman's team achieved a relative standard uncertainty of $19 \times 10^{-6}$. 59
The time-of-swing method first appeared in print in an article by Reich 75 in 1852. Reich attributes the idea to Forbes (see Ref. 11). However, Mackenzie 12 mentions that this method was first proposed by Muncke in 1827 76 and that the mathematical analysis was given by Brandes in 1806. 77 Bouguer's pendulum measurements during his Peru expedition, on the other hand, can be considered the very first time-of-swing experiment (see Sec. IV B).
Forty-four years after Reich's publication, Braun started experiments that were reported in a detailed publication in 1896. 78 Since then, the time-of-swing method has gained popularity with other researchers because of its simplicity. It took over one hundred years before a serious problem with the time-of-swing method was discovered by Kuroda: 36 The implicit assumption of the time-of-swing method is that the torsional constant of the fiber is the same for both states of the field masses. Then, $\kappa$ cancels from Eqs. (16) and (17). However, this assumption cannot hold perfectly since the torsional oscillator swings at different frequencies and the presence of an imaginary spring constant requires a frequency-dependent real part due to the Kramers-Kronig relations.
The theoretical frequency dependence of the spring constant depends on the model that is used for the spring. The simplest model for a spring is a parallel circuit (Fig. 7) of an ideal spring and a Maxwell unit. A Maxwell unit is the series connection of a spring and a dissipative element (dashpot). The ideal spring is characterized by its spring constant $\kappa_{ideal}$. The Maxwell unit can be characterized by the spring constant $\delta\kappa$ of its spring and a time constant $\tau$. If a sinusoidal force is applied to the simplified model, the total spring constant depends on the angular frequency of the force, $\kappa(\omega)$. For $\omega \ll \tau^{-1}$, the resulting spring constant is $\kappa_{res} \approx \kappa_{ideal}$; for $\omega \gg \tau^{-1}$, it is $\kappa_{res} \approx \kappa_{ideal} + \delta\kappa$ (see Fig. 8).
Figure 8 also shows the imaginary part of the spring constant, which is a measure of the loss in the system. For the simplified model discussed here, the imaginary spring constant reaches a maximum of δκ/2 at an angular frequency of ω = τ −1 . For this model, the maximum of the imaginary part of the spring constant is exactly half the change in the real part of the spring constant. Hence, one way to find a material that has a small dependence of the spring constant on the oscillation frequency is to search for a material with small loss or large Q.
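The frequency dependence described for this simplified model can be reproduced in a few lines. The sketch below assumes the standard complex stiffness of a spring in series with a dashpot, $\delta\kappa\, i\omega\tau/(1 + i\omega\tau)$, added to the ideal spring; the function name and the numerical values are illustrative and not taken from the source.

```python
def spring_constant(omega, kappa_ideal, delta_kappa, tau):
    """Complex spring constant of an ideal spring in parallel with one Maxwell unit."""
    x = omega * tau
    maxwell = delta_kappa * (1j * x) / (1.0 + 1j * x)
    return kappa_ideal + maxwell

kappa_ideal = 1.0e-8     # N m/rad (assumed)
delta_kappa = 1.0e-11    # N m/rad, strength of the Maxwell unit (assumed)
tau = 100.0              # s, time constant of the Maxwell unit (assumed)

low = spring_constant(1.0e-6 / tau, kappa_ideal, delta_kappa, tau)
high = spring_constant(1.0e6 / tau, kappa_ideal, delta_kappa, tau)
peak = spring_constant(1.0 / tau, kappa_ideal, delta_kappa, tau)

print(f"low-frequency real part:  {low.real:.6e}")
print(f"high-frequency real part: {high.real:.6e}")
print(f"imaginary part at omega = 1/tau: {peak.imag:.3e}")
```

The real part rises from $\kappa_{ideal}$ to $\kappa_{ideal} + \delta\kappa$, and the imaginary part peaks at $\delta\kappa/2$ for $\omega = \tau^{-1}$, in line with the behavior described for Fig. 8.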
Clearly, a dependence of $\kappa$ on $\omega$ is undesirable, but how does it influence a G measurement using the time-of-swing method? For the G experiment, the higher oscillation frequency occurs with the field masses in the "near" position (i.e., in line with the pendulum at its equilibrium position). In this geometry, the gravitational torque adds to the restoring torque of the fiber, yielding an increased torsional frequency. Hence, Eq. (20) needs to be amended to read $\omega_n^2 - \omega_f^2 = \frac{\kappa_n - \kappa_f + \kappa(\omega_n) - \kappa(\omega_f)}{I}$, (22) where $\kappa(\omega_n) - \kappa(\omega_f) > 0$. The Newtonian constant is now obtained from $G = \frac{I\,(\omega_n^2 - \omega_f^2) - [\kappa(\omega_n) - \kappa(\omega_f)]}{k_n - k_f}$. (23) From the measured frequency difference, a small positive correction term must be subtracted to obtain the value of G. If the experimenter "forgets" to subtract this correction, the reported value of G will be too high.
The model with a single Maxwell unit is very simple and does not describe a real spring. However, the qualitative conclusions drawn from the simple model remain valid. More realistic models assume a distribution of Maxwell units that are in parallel to an ideal spring.
The distribution covers a continuum in values of the time constant and the oscillator strength δκ.
Depending on the model used for the distribution of the Maxwell units, several limits of the bias for the G measurements can be estimated. Kuroda 36 estimated that the relative bias in the G measurement is $< Q^{-1}/\pi$ (i.e., without correction, the result would be relatively higher by this amount). Kuroda's estimate rests on the assumption that the ratio of the imaginary and the real parts of the torsion constant is fixed. Newman 79 set a different limit of $< Q^{-1}/2$ using a continuous distribution of Maxwell units.
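To give a feeling for the size of these limits, the two bounds can be evaluated for a few quality factors. The Q values below are illustrative assumptions, not the values of any specific fiber.

```python
import math

def kuroda_bound(q):
    """Upper limit on the relative bias of G according to Kuroda: 1 / (pi Q)."""
    return 1.0 / (math.pi * q)

def newman_bound(q):
    """Upper limit on the relative bias of G according to Newman: 1 / (2 Q)."""
    return 1.0 / (2.0 * q)

for q in (1.0e3, 1.0e4, 1.0e5):   # assumed quality factors
    print(f"Q = {q:.0e}: Kuroda < {kuroda_bound(q):.2e}, Newman < {newman_bound(q):.2e}")
```

Under these bounds, a quality factor of order $10^5$ or more is needed before the uncorrected bias drops to the level of a few parts in $10^6$.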
This effect is, in addition to the thermal noise arguments discussed above, another reason to use special fibers with high Q (see Sec. IV A). In recent years, several experiments have been carried out using the time-of-swing method. Table II summarizes the key parameters.
An interesting and well-documented time-of-swing measurement was carried out by Newman et al., 59,65,79 who built a cryogenic torsion pendulum to increase the quality factor and thereby decrease the bias introduced by the Kuroda effect. The fiber was suspended from a stage at 2.5 K. At the other end of the fiber, a flat plate was mounted. As is described in Sec. IV A 2 d, the exact dimensions are less crucial, simplifying the test mass metrology. Two field masses shaped like doughnuts were used to generate a very uniform gravitational field, see Sec. V. Since the field masses were at room temperature and the plate was at cryogenic temperatures, the Dewar had to be between the field masses and the pendulum, resulting in a large distance between the two. For this reason, the gravitational signal is very small, about three orders of magnitude smaller than in other time-of-swing methods (see Table II). Another interesting feature of this experiment is that the measurement is performed at various oscillation amplitudes up to slightly more than one complete revolution. Three different fibers were used (see Table I), and detailed studies of each fiber material were carried out. The relative uncertainty of the final result is 19 × 10 −6 .
c. Torque feedback.: In general, one successful strategy used to measure an unknown quantity is to compensate the effect of the unknown on a system by means of an effect that is known or calculable. Specifically, the gravitational torque of a modulated mass arrangement can be compensated by an electrostatic torque. The torsion balance acts as a null detector, while the measured torsion angle is used as the input signal for a control loop that adjusts the voltages on the electrodes to zero the input. Since no torsional excursion occurs, the measurement is not affected by the Kuroda effect discussed in Sec. IV A 2 b. Nevertheless, it is beneficial to employ a fiber with a large quality factor because the power spectral density of the torque cannot be lower than the expression given in Eq. (9).
Evidence suggests that Dicke was the first to operate a torsion balance with electrostatic feedback. 80 An argument frequently made is that the electrostatic interaction is many orders of magnitude larger than the gravitational interaction. For example, the electrostatic force between a proton and an electron in the hydrogen atom is about $10^{39}$ times larger than the gravitational force between the two. One could conclude that there is no way to control the electrostatic feedback well enough to measure the effect of the gravitational force. However, this frequently made comparison is misleading because of the huge charge-to-mass ratios of the electron and the proton. Of the two, the proton has the smaller charge-to-mass ratio, which is about $10^8$ C kg$^{-1}$. By way of comparison, the charge-to-mass ratio of an aluminum ball with a mass of 5 g and a potential of 1 V (typical numbers for an object in a laboratory) is 18 orders of magnitude smaller. With the latter number, and by increasing the distance between the charges, electrostatic feedback seems more realistic.
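The comparison of charge-to-mass ratios can be verified with a short estimate. The sketch below treats the aluminum ball as an isolated conducting sphere with capacitance $4\pi\varepsilon_0 r$ and uses a density of 2700 kg m$^{-3}$; both are assumptions made here for illustration.

```python
import math

E_CHARGE = 1.602176634e-19    # C
M_PROTON = 1.67262192e-27     # kg
EPS0 = 8.8541878128e-12       # F/m
RHO_ALUMINUM = 2700.0         # kg/m^3 (assumed)

proton_ratio = E_CHARGE / M_PROTON           # C/kg

mass = 5.0e-3                                # kg
potential = 1.0                              # V
radius = (3.0 * mass / (4.0 * math.pi * RHO_ALUMINUM)) ** (1.0 / 3.0)
capacitance = 4.0 * math.pi * EPS0 * radius  # isolated sphere (assumed geometry)
ball_ratio = capacitance * potential / mass  # C/kg

orders = math.log10(proton_ratio / ball_ratio)
print(f"proton: {proton_ratio:.2e} C/kg, ball: {ball_ratio:.2e} C/kg, "
      f"difference: {orders:.1f} orders of magnitude")
```

Because the ratio of electrostatic to gravitational force scales with the square of the charge-to-mass ratio, the comparison looks far less lopsided for laboratory objects than for elementary particles.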
The electrostatic energy on a capacitor with capacitance C at voltage V is $E = \frac{1}{2} C V^2$. (24) If one of the electrodes is connected to the torsion balance, the capacitance (and thus, the electrostatic energy) is a function of the torsional angle $\theta$ of the pendulum bob. If the system is not at the angle where the system has minimal energy, a torque toward this direction arises, $N = \frac{1}{2}\,\frac{dC}{d\theta}\,V^2$. (25) The source masses are modulated between two states, producing two torques, $N_n = G c_n$ and $N_f = G c_f$, where $c_{n,f}$ denote constants that are calculated from the mass arrangement of the experiment. For each of the two states, the torques are balanced using the voltages $V_n$ and $V_f$, which can be measured with a precision voltmeter. From the difference, G can be obtained as follows: $G = \frac{1}{2}\,\frac{dC}{d\theta}\,\frac{V_n^2 - V_f^2}{c_n - c_f}$. (26) The capacitance gradient, $dC/d\theta$, is determined in a separate measurement. One strategy used to measure the capacitance gradient is to excite the pendulum to a large torsional motion and to measure the capacitance with a (commercial) capacitance bridge and the angle with an autocollimator. A numerical derivative is then calculated from these measurements.
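The two steps of the electrostatic feedback method, obtaining the capacitance gradient from paired capacitance and angle readings and then converting the two balancing voltages into G, can be sketched as follows. The torque is taken as $\frac{1}{2}(dC/d\theta)V^2$; the gradient is estimated here by a least-squares straight line, which is a simplifying assumption (a real analysis may need a higher-order fit), and all numbers are synthetic.

```python
import math

def capacitance_gradient(angles, capacitances):
    """Least-squares slope dC/dtheta, assuming C is linear in theta over the range."""
    n = len(angles)
    mean_a = sum(angles) / n
    mean_c = sum(capacitances) / n
    num = sum((a - mean_a) * (c - mean_c) for a, c in zip(angles, capacitances))
    den = sum((a - mean_a) ** 2 for a in angles)
    return num / den

def g_feedback(dc_dtheta, v_n, v_f, c_n, c_f):
    """G = (1/2) (dC/dtheta) (V_n^2 - V_f^2) / (c_n - c_f)."""
    return 0.5 * dc_dtheta * (v_n**2 - v_f**2) / (c_n - c_f)

# Synthetic calibration data (hypothetical values).
TRUE_GRADIENT = 5.0e-11                       # F/rad
angles = [i * 1.0e-4 for i in range(-10, 11)]  # rad, autocollimator readings
capacitances = [1.0e-11 + TRUE_GRADIENT * a for a in angles]  # F, bridge readings

# Synthetic balancing voltages for two field mass states.
G_TRUE = 6.674e-11
c_n, c_f = 500.0, 50.0                        # kg^2 m^-1 (assumed)
v_n = math.sqrt(2.0 * G_TRUE * c_n / TRUE_GRADIENT)
v_f = math.sqrt(2.0 * G_TRUE * c_f / TRUE_GRADIENT)

gradient = capacitance_gradient(angles, capacitances)
G_recovered = g_feedback(gradient, v_n, v_f, c_n, c_f)
print(f"dC/dtheta = {gradient:.4e} F/rad, G = {G_recovered:.6e}")
```

Since the angle enters only through the capacitance gradient, a scale error of the angle readout propagates inversely into G in this method.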
Several considerations are important for the operation of a torsion balance in the electrostatic feedback mode:

Electrostatic forces are unidirectional: With one set of electrodes, only a unidirectional torque can be generated. Hence, either two sets of electrodes must be used or the feedback position must be offset from the equilibrium position of the torsional oscillator in such a way that the fiber provides the torque in the other direction.

AC versus DC: Capacitance measurements are usually carried out at audio frequencies. The bridge shown in Ref. 87 is a typical example of a circuit that is used for this measurement. The voltage applied to the electrode can either be alternating current (AC) or direct current (DC). The former is more difficult to measure precisely; precise measurements of AC voltages are usually made using thermal converters that find an equivalent DC voltage, which is eventually traceable to the Josephson effect. For the latter, the frequency dependence of the capacitance must be well understood.

Contact potentials: Applying a voltage to the metal does not always result in the same potential at the electrode surface due to surface and contact potentials, which can be as large as several hundreds of mV. This problem can be solved by reversing the voltage. In one case, the torque is proportional to $(V_s - V_n)^2$, and in the other case, it is proportional to $(V_s + V_p)^2$. Here, $V_p$ and $V_n$ are both positive values. If the torques produced are nominally the same, the surface voltage can be obtained, $V_s = (V_n - V_p)/2$.

Parasitic capacitances: Equation (24) is an approximation that holds if there are only two conducting surfaces and the potential difference between the two surfaces is V. In reality, it is extremely difficult to achieve an electrostatic setup where only two surfaces matter. If there are more than two surfaces, Eq. (24) has to be a sum over all combinations of electrode pairs. Each summand contains the cross capacitance and the squared voltage difference. 88,89 All cross capacitances that depend on the angle of the torsion balance will contribute to the electrostatic torque. Notably, one such component was mistakenly left out by a G experiment. 90,91 This led to a large systematic bias and eventually to the withdrawal of the result.
Electrostatic force measurements are becoming more important due to the impending revision of the International System of Units (SI). In the revised SI, the unit of mass is no longer given via an artifact but can be realized from a fixed value of Planck's constant, h.
For small mass values, electrostatic balances can be used to realize the unit of mass. 92 Here, "Big G" experiments and new technical demands can benefit from each other. Both fields require a solid understanding of the absolute magnitude of electrostatic torques and forces.
The electrostatic feedback method, as well as the static deflection method, relies on an absolutely calibrated angle readout, which is necessary to calculate the capacitance gradient. Quinn et al. found an interesting way to take advantage of the fact that $\theta$ appears in the denominator in Eq. (26) and in the numerator in Eq. (15). When the two methods are combined in one apparatus and the results are averaged, the uncertainties due to the angle calibration are anti-correlated. 86 This negative correlation substantially reduces the effect of the calibration of the autocollimator on the average value. For example, in the 2013 measurement performed by the BIPM group, the relative uncertainty of the angle calibration is $47 \times 10^{-6}$. The total relative standard uncertainty of the average of both measurements is only $25 \times 10^{-6}$, which is achieved by combining the two anti-correlated measurements.
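The benefit of averaging two results in which the angle enters once in the numerator and once in the denominator can be illustrated with a toy model. The sketch assumes that the calibration error is a pure scale factor $(1 + \varepsilon)$ common to both methods and that the two results are averaged with equal weights; both are simplifications made here for illustration.

```python
def relative_bias_of_average(eps):
    """Relative bias of the mean of two results with opposite angle dependence."""
    static_result = 1.0 + eps            # angle in the numerator
    feedback_result = 1.0 / (1.0 + eps)  # angle in the denominator
    return 0.5 * (static_result + feedback_result) - 1.0

eps = 47.0e-6   # relative angle calibration error, taken equal to the quoted uncertainty
bias = relative_bias_of_average(eps)
print(f"single method: {eps:.1e}, average of both: {bias:.1e}")
```

To first order the calibration error cancels, and only a term of order $\varepsilon^2/2$ remains in the average.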
d. Angular-acceleration feedback.: According to available information, the angular-acceleration method was first discussed by Rose et al. 93 in 1969 and was perfected by Gundlach 53 in 2000. For this method, the torsion balance is mounted on a turntable. The principal idea is that the gravitational torque acting on the pendulum is compensated by an inertial torque: The gravitational torques between the source masses and the pendulum bob produce an angular acceleration of the pendulum bob in the direction of the position with the lowest potential energy. A feedback-control loop accelerates the turntable in such a way that the torsion bob does not move with respect to the rotating reference frame. The data to obtain G are extracted from the angular acceleration of the turntable, which (at least for infinite gain of the control loop) is identical to the gravitational angular acceleration acting on the pendulum bob.
The angular-acceleration feedback method has three advantages: (1) The fiber does not twist since the pendulum remains stationary in the rotating reference frame; thus, the experiment is not affected by the Kuroda effect. (2) Measuring the acceleration has the advantage that the mass of the pendulum bob (and even some of its dimensions) is canceled out (see below). In brief, the cancellation is analogous to the cancellation of the mass of a dropping object in a gravitational field: From mg = ma, the masses are canceled out and a = g. In other words, all objects drop with an acceleration of g regardless of their mass (at least in vacuum). While this is true in a homogeneous field, it is more complicated in field geometries that are typical for torsion balances, see below. (3) The measurement required is the angular acceleration of the turntable. The angular acceleration is computed from a time series of angle measurements. Taking the second derivative with respect to time yields the angular acceleration. Similar to the static deflection method, a precise measurement of an angle is needed. Here, however, the experiment can be constructed in such a way that a full circle (2π) can be measured. This provides a self-calibration point and can even be used to characterize non-linearities of the angle encoder.
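Obtaining the angular acceleration from a time series of angle readings can be sketched with a central second difference. This is a minimal illustration on noise-free synthetic data with assumed values; a real analysis would typically fit or filter the data rather than differentiate point by point, since differentiation amplifies noise.

```python
def angular_acceleration(angles, dt):
    """Central second difference of equally spaced angle samples."""
    return [
        (angles[i + 1] - 2.0 * angles[i] + angles[i - 1]) / dt**2
        for i in range(1, len(angles) - 1)
    ]

# Synthetic turntable angle: constant rate plus a small constant acceleration.
dt = 1.0            # s, sampling interval (assumed)
rate = 1.0e-2       # rad/s, mean rotation rate (assumed)
alpha = 1.0e-6      # rad/s^2, angular acceleration to recover (assumed)
times = [i * dt for i in range(20)]
angles = [rate * t + 0.5 * alpha * t**2 for t in times]

estimates = angular_acceleration(angles, dt)
print(f"first estimate: {estimates[0]:.6e} rad/s^2")
```

For a quadratic angle history the second difference is exact apart from rounding, so every estimate reproduces the assumed acceleration.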
To date, this method has produced the measurement of G with the smallest uncertainty. The experiment was carried out by Gundlach and Merkowitz at the University of Washington in 2000. 53 Figure 9 contains a three-dimensional drawing of the experiment. This experiment incorporates several sophisticated techniques in addition to those mentioned above. For example, the outer masses are also mounted on a turntable whose angular velocity is controlled such that the difference in angular velocities between the inner and outer turntables remains constant. The relevant signal is only at the difference frequency, while most parasitic couplings occur on the rotation frequency of the inner turntable.
A dramatic decrease in uncertainty over previous experiments was achieved through an innovative shape of the pendulum bob. Unlike prior experiments that used dumbbells or cylindrical rods as test masses, the duo at the University of Washington used a thin rectangular plate. This geometry makes the experimental result almost independent of the detailed mass distribution. An elegant way of performing the mass integration for an experiment with a rotation axis is via the multipole formalism. 88,94 The torque on a pendulum is given by $N = -4\pi i G \sum_{l=0}^{\infty} \frac{1}{2l+1} \sum_{m=-l}^{l} m\, q_{lm} Q_{lm}\, e^{i m \phi}$, (27) where $\phi$ is the angle between the field mass and test mass assemblies, $q_{lm}$ denotes the pendulum's multipole moments, and $Q_{lm}$ denotes the multipole fields of the source mass arrangement. 94 Well below the resonance frequency of the pendulum, this torque produces an angular acceleration of the pendulum given by $\ddot{\phi} = N/I$. For conventional mass distributions (field masses at a larger radius than test masses), the series in Eq. (27) converges quickly. Thus, the largest term is given by the product of $q_{22}$ and $Q_{22}$.
For a thin plate with mass m, width b, and thickness d, the inner multipole moment is given by 95,96 $q_{22} = \sqrt{\frac{15}{32\pi}}\,\frac{m b^2}{12}$. (28) The moment of inertia of this plate is given by $I = \frac{m b^2}{12}$. (29) Combining Eqs. (27)-(29) and keeping only the leading term yields $\ddot{\phi} = -\frac{16\pi}{5}\, G\, \frac{q_{22}}{I}\, Q_{22} \sin 2\phi = -\frac{16\pi}{5} \sqrt{\frac{15}{32\pi}}\, G\, Q_{22} \sin 2\phi$. (30)
For a two-dimensional plate, the angular acceleration is independent of its dimensions and mass. For a pendulum with finite thickness, the expression on the right-hand side of Eq. (30) has to be multiplied by the correction factor $\frac{b^2 - d^2}{b^2 + d^2}$. (31) The angular acceleration remains independent of the mass of the pendulum bob. Gundlach and Merkowitz chose b = 76 mm and d = 1.5 mm, which minimized higher-order contributions to the series in Eq. (27). With these dimensions, a correction factor of $1 - 7.8 \times 10^{-4}$ is obtained. Since the correction deviates by only a small amount from one, the pendulum's dimensions do not have to be known precisely. The relative standard uncertainty contribution of the test mass metrology was $4 \times 10^{-6}$, for a total relative standard uncertainty of $14 \times 10^{-6}$. By way of comparison, the experiment carried out by Luther and Towler 50 obtained a total relative uncertainty of $64 \times 10^{-6}$. The sum of the contributions of the metrology of the small mass system was $48 \times 10^{-6}$. The knowledge of the mass distribution of the pendulum bob was the dominating component in the uncertainty budget of Luther and Towler. By using a flat plate, Gundlach and Merkowitz managed to reduce this component significantly.
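The quoted correction factor can be checked numerically. The sketch below assumes the textbook results for a homogeneous rectangular plate rotating about a vertical axis through its center, namely that the leading multipole moment scales as $m(b^2 - d^2)$ and the moment of inertia as $m(b^2 + d^2)$, so that their ratio carries a factor $(b^2 - d^2)/(b^2 + d^2)$ relative to an infinitely thin plate. The 1% thickness error used for the sensitivity estimate is an arbitrary illustrative value.

```python
def thickness_correction(width, thickness):
    """Finite-thickness factor (b^2 - d^2) / (b^2 + d^2) for a rectangular plate."""
    return (width**2 - thickness**2) / (width**2 + thickness**2)

b = 76.0    # mm, plate width
d = 1.5     # mm, plate thickness

factor = thickness_correction(b, d)
deviation = 1.0 - factor

# Sensitivity: effect of a 1% error in the thickness (assumed error size).
shift = abs(thickness_correction(b, 1.01 * d) - factor)

print(f"correction factor = 1 - {deviation:.2e}")
print(f"change for a 1% thickness error: {shift:.1e}")
```

Because the factor deviates from one by less than a part in a thousand, even a percent-level error in the thickness changes the result only at the level of about $10^{-5}$.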

B. One pendulum or two pendulums
1. From the clock to the gravimeter-Timekeeping is a very important task in science and technology, as well as in commerce. Accurate navigation, for example, is impossible without accurate clocks. The search for stable clocks therefore goes back almost to the dawn of human civilization itself. In 1657, Christiaan Huygens patented the pendulum clock, which at the time was considered a very stable timekeeper. He also described the motion of the pendulum. Around 1672, Jean Richer noticed on a trip to French Guiana that the oscillation frequency of a seconds pendulum depends on the geographical latitude. The dependence of the local acceleration on the latitude is a consequence of the Earth's rotation: The local acceleration is a sum of the centrifugal acceleration and the gravitational acceleration. At the equator, the local acceleration is reduced by the centrifugal acceleration. This effect is exacerbated by the fact that the Earth, in response to the centrifugal acceleration, has the shape of an oblate spheroid. Hence, the polar radius is smaller than the equatorial radius, increasing the gravitational part of the local acceleration towards the pole. An approximate model for this normal gravity $g_0$ is given by the reference ellipsoid WGS84, 97 $g_0 = 9.7803253359\,\frac{1 + 0.00193185265241\,\sin^2\phi}{\sqrt{1 - 0.00669437999014\,\sin^2\phi}}$ m s$^{-2}$, (32) where $\phi$ denotes the latitude. This formula describes the theoretical local acceleration on an equipotential surface at mean sea level. It includes both gravitational and centrifugal potentials. In the 18th century, a debate took place as to whether the radius of the Earth was greater at the equator or at the poles. To resolve this debate, the King of France, Louis XV, sent two groups of scientists to take measurements. One group traveled to Lapland, near the North Pole, and the second group to Ecuador (then called the Territory of Quito by Spain), close to the equator.
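The latitude dependence of normal gravity, and its effect on a pendulum clock, can be evaluated with the closed-form WGS84 gravity formula. The numerical constants below are the commonly tabulated WGS84 values as recalled here and should be checked against the WGS84 documentation before being relied on; the seconds pendulum is taken as a simple pendulum with a period of 2 s.

```python
import math

GAMMA_EQUATOR = 9.7803253359       # m/s^2, normal gravity at the equator
K_SOMIGLIANA = 0.00193185265241    # dimensionless gravity formula constant
E_SQUARED = 0.00669437999014       # first eccentricity squared of the ellipsoid

def normal_gravity(latitude_deg):
    """Normal gravity on the WGS84 ellipsoid as a function of latitude."""
    s2 = math.sin(math.radians(latitude_deg)) ** 2
    return GAMMA_EQUATOR * (1.0 + K_SOMIGLIANA * s2) / math.sqrt(1.0 - E_SQUARED * s2)

def seconds_pendulum_length(g):
    """Length of a simple pendulum with a 2 s period: l = g / pi^2."""
    return g / math.pi**2

g_equator = normal_gravity(0.0)
g_pole = normal_gravity(90.0)
length_difference = seconds_pendulum_length(g_pole) - seconds_pendulum_length(g_equator)

print(f"g(equator) = {g_equator:.5f} m/s^2, g(pole) = {g_pole:.5f} m/s^2")
print(f"seconds pendulum is {length_difference * 1e3:.2f} mm longer at the pole")
```

The difference of about half a percent in g between equator and pole corresponds to a seconds pendulum that is several millimeters longer at the pole, an effect easily noticed with a good pendulum clock.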
A prominent member of the second expedition was Bouguer, a French scientist and geodesist who, besides the measurements of arcs of the Earth's curvature, conducted two experiments to determine the mean density of the Earth. 98 The first measurement was performed in Quito, which can be thought of as being located on top of a plateau. He measured the period of the pendulum at this location and compared it to the period determined at sea level. The period of a simple pendulum in local gravity g is $T = 2\pi\sqrt{l/g}$, (33) with the pendulum length l. If there were no mass between both heights, the swing rate difference would be that derived from Newton's inverse square law, the so-called free-air correction. The measured gravity, however, was higher. Bouguer explained this difference by assuming the mass between Quito and the sea level was a slab of Earth of equal density. The attraction from this slab can be approximated by $\Delta g = 2\pi G \rho H$, (34) where $\rho$ is the density of the Earth and H is the height of the slab. The correction described by Eq. (34) is still applied in geophysics today and is called the Bouguer correction or Bouguer anomaly. From this measurement, Bouguer could determine the mean density of the Earth. If the mean density of the slab is known (for example, from measurement), then this density can be compared to the mean density of the Earth, similar to what Cavendish did when he compared the mean density of the Earth to that of the field mass. Bouguer estimated the mean density of the Earth as being more than 4 times that of the slab. The standard density of rock in geophysics, as used in the Bouguer correction, is 2.67 g cm$^{-3}$. The mean density of the Earth, however, as derived from the current G value, is about 5.5 g cm$^{-3}$. Bouguer's estimation was obviously too high. 11
In another experiment, he measured the horizontal attraction of Mount Chimborazo, a mountain more than 6000 m high. The principle of the experiment is depicted in Fig. 10.
Here as well, two measurements were necessary. In the first measurement, Bouguer placed the pendulum at the foot of the mountain and measured the direction of the plumb line with respect to the stars. In the second measurement, he placed the pendulum farther away from the mountain, but at the same latitude [normal gravity changes with latitude due to the shape of the Earth; see Eq. (32)], and again measured the direction of the plumb line with respect to the stars. The angular difference was a measure of the pull of the mountain on the pendulum bob, which was compared with a model based on Newton's law of gravitation. 11 Neither of the two experiments gave even an approximate value for the mean density of the Earth. Such a determination is extremely difficult, since local density inhomogeneities can lead to large errors. Several similar experiments were later performed by other scientists (e.g., Maskelyne's experiment at Mount Schehallien, a mountain in Perthshire, Scotland, in 1774 99 ). High accuracies were not expected, as the "field mass" (i.e., the Earth) has a very irregular density distribution. Well-defined field masses are required in addition to good environmental conditions.
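The size of the two effects that Bouguer compared in his pendulum measurement at altitude, the free-air decrease of gravity and the attraction of an infinite slab of rock, can be estimated per meter of height. The sketch uses a spherical Earth with rounded values for g and the radius (assumptions made here), the standard rock density of 2.67 g cm$^{-3}$, and the infinite-slab approximation $2\pi G\rho H$.

```python
import math

G = 6.674e-11          # m^3 kg^-1 s^-2
RHO_ROCK = 2670.0      # kg/m^3, standard crustal density
RHO_MEAN = 5500.0      # kg/m^3, approximate mean density of the Earth
G_SURFACE = 9.81       # m/s^2 (rounded)
R_EARTH = 6.371e6      # m, mean radius (spherical approximation)

free_air_per_m = 2.0 * G_SURFACE / R_EARTH        # decrease of g per meter of height
bouguer_per_m = 2.0 * math.pi * G * RHO_ROCK      # slab attraction per meter of thickness

# For a homogeneous spherical Earth the ratio of the two is (3/4) rho_slab / rho_mean,
# which is how a measured ratio translates into a density ratio.
ratio_numeric = bouguer_per_m / free_air_per_m
ratio_formula = 0.75 * RHO_ROCK / RHO_MEAN

print(f"free-air: {free_air_per_m * 1e5:.4f} mGal/m, slab: {bouguer_per_m * 1e5:.4f} mGal/m")
print(f"ratio: {ratio_numeric:.3f} (numeric), {ratio_formula:.3f} (density formula)")
```

The slab compensates roughly a third of the free-air decrease, so an error in the assumed slab density or shape feeds directly into the inferred mean density of the Earth.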

A modern experiment measuring the time of swing of a simple pendulum-A modern version of measuring G by observing the period of two simple pendulums in vacuum is currently being built at the Politecnico di Torino, Italy, by De Marchi. 100 Two 1 m long pendulums spaced 0.1 m apart measure the change in local acceleration introduced by a field mass arrangement. The field masses are moved such that they slow either one pendulum or the other. The relative frequency difference is modulated with the field mass position. This differential measurement rejects several common mode effects, e.g., changes in the gravitational environment identical to both pendulums due to tides or moving masses in the laboratory. The field mass induced frequency change is of order $10^{-7}$ of the resonance frequency. In order to resolve this frequency change, a very high quality factor is required. De Marchi is aiming to reach a quality factor on the order of $10^8$. Such a high quality factor would allow a measurement with a relative uncertainty of $10^{-5}$.

Modern experiments measuring the static deflection with two simple pendulums-A laboratory version of the Mount Chimborazo experiment was conducted in 2002 by Kleinevoß et al. 101,102 Their setup, which is depicted in Fig. 11, consisted of two pendulums of length l ≈ 2.6 m. The bobs of these pendulums formed a microwave resonator. Two symmetrically arranged field masses, each of which was made of brass and had a mass of about 576 kg, were placed alternately at two distances from the bobs. The microwave frequency measured when the field masses were far (≈ 2.1 m) from the pendulums was taken as a reference, $f_{ref}$. When the field masses were placed closer (≈ 0.6 m) to the pendulums, the frequency, $f_{meas}$, of the resonator changed due to the attraction of the field masses, which made the two mirrors separate more, thus leading to a different cavity length. The gravitational constant was obtained from this frequency shift using an expression in which M denotes the mass of each field mass, $\omega_0$ denotes the fundamental frequency of the pendulums' oscillations, b denotes the cavity length, and $K_r$ and $K_{ref}$ are correction factors for the field mass distributions. The relative combined standard uncertainty they reached was 147 ppm. 102 In 2010, Parks and Faller published the results of a similar experiment, 103,104 which had actually been conducted in 2004. Two main differences to the Wuppertal experiment are noteworthy. The first difference was the use of a laser Fabry-Pérot interferometer (cavity), rather than a microwave cavity. Reducing the wavelength of the electromagnetic waves from centimeters to several hundreds of nanometers improved the resolution of the distance sensing. The second difference was a simultaneous measurement of a second cavity, which was attached to the support of the pendulums. This was done in order to compensate thermal drifts in the setup. Their pendulums had a length of 72 cm, and four identical source masses of 120 kg each were used. The group waited six years before publishing the results, as the value of G they found was about 14 standard deviations lower than the CODATA-2008 value. Before submitting the publication, every detail of the experiment was carefully checked. The final result has a relative standard uncertainty of $21 \times 10^{-6}$.
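For the two-pendulum static-deflection experiments, the order of magnitude of the change in bob separation can be estimated with a very crude model. The sketch treats each field mass as a point mass acting only on the nearer pendulum, interprets the quoted distances as center-to-center separations, and uses the small-angle result that a horizontal acceleration a displaces a pendulum bob by $a\,l/g$. These are all simplifying assumptions introduced here; the real analysis uses the full mass distributions, so the number below is only an order-of-magnitude indication.

```python
G = 6.674e-11        # m^3 kg^-1 s^-2
G_LOCAL = 9.81       # m/s^2 (rounded)

pendulum_length = 2.6        # m
field_mass = 576.0           # kg
r_near, r_far = 0.6, 2.1     # m, assumed center-to-center distances

def point_mass_acceleration(mass, distance):
    """Horizontal acceleration toward a point mass at the given distance."""
    return G * mass / distance**2

def static_displacement(acceleration, length, g):
    """Bob displacement x = a / omega_0^2 with omega_0^2 = g / l."""
    return acceleration * length / g

delta_a = (point_mass_acceleration(field_mass, r_near)
           - point_mass_acceleration(field_mass, r_far))
delta_x = static_displacement(delta_a, pendulum_length, G_LOCAL)
delta_b = 2.0 * delta_x      # both bobs move outward, toward their field mass

print(f"change in acceleration: {delta_a:.2e} m/s^2")
print(f"change in bob separation: {delta_b * 1e9:.0f} nm (point-mass estimate)")
```

A separation change of a few tens of nanometers explains why a resonant cavity, microwave or optical, is used as the distance sensor.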

Historical beam balance measurements-A conventional beam balance compares the clockwise and counterclockwise torques generated by masses on two balance pans on either side of the fulcrum. The beam tilts until the net torque about the central pivot is zero. Traditionally, the center of mass of the beam is below the pivot point and, as the beam pivots, a counteracting torque is generated. The tilt of the beam is a measure of the initial torque difference. Assuming that the arms are of equal length, the beam balance is a device to compare forces. The assumption of equal arm lengths is not necessary if a weighing scheme such as substitution or transposition is implemented.
The weight of a mass is given by the product of the mass value and the local acceleration of gravity, mg. In most cases, balances are used to measure m, since g is assumed to be constant. In measurements of G with beam balances, m is constant and changes in g introduced by modulating a source mass are measured. The readings of beam balances are usually given in grams and must be converted into the units of force by multiplying them by the local acceleration.
The beam of a balance with two identical masses on the balance pans is in equilibrium. Changing the local gravity at the position of one mass causes an excursion of the beam proportional to the weight change. The weight change can either be read off at the balance's pointer or small calibrated masses can be added to restore equilibrium. Local gravity can be changed by placing large field masses below or above the pans. Von Jolly conducted such a measurement in Munich, Germany, in 1878 and 1881, almost a century after Cavendish. 17,105,106 He built a special beam balance-a so-called "double-balance"-with four mass pans (see Fig. 12) that was able to resolve weight differences smaller than 1 μg. 107 Von Jolly was able to measure the Earth's gravity gradient, as the gravity decreases by about 300 μGal m −1 (in gravimetry, the non-SI unit "Gal" is commonly used: 1 Gal = 1 cm s −2 ). Raising a 1 kg mass by 1 m decreases its weight by about 3 μN (which would correspond to a mass of 300 μg). In von Jolly's setup, the upper and lower mass were separated by about 21 m, leading to a relative weight change of 6 × 10 −6 . Von Jolly used this experiment to verify Newton's inverse square law (F ∝ r −2 ). To do so, four identical glass flasks were made, two of which were filled with approximately 5 kg of mercury and two of which were air-filled. Then, all four flasks were sealed. The purpose of the air filled flasks was to suppress effects related to changes in the air buoyancy due to changing air pressure. For the measurement, von Jolly first placed both mercury-filled flasks on the upper pans and the air-filled flasks on the lower pans. Later, he switched the two flasks on one arm of the beam. As he expected, he was able to measure an increase in weight when the mass was moved from the upper to the lower pan. A difference was observed with respect to theoretical calculations; however, this difference was attributed to mass inhomogeneities in the Earth. 
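The orders of magnitude quoted for the gravity gradient can be checked with a few lines of arithmetic. The sketch below uses the gradient and the 21 m separation from the text; the nominal local gravity of 9.81 m s −2 is an assumption made only for this illustration.

```python
# Free-air gravity gradient quoted in the text: about 300 uGal per meter.
GAL = 1e-2                   # 1 Gal = 1 cm s^-2, expressed in m s^-2
GRADIENT = 300e-6 * GAL      # m s^-2 per meter of height
G_LOCAL = 9.81               # m s^-2, assumed nominal local gravity


def weight_change(mass_kg, height_m):
    """Decrease in weight (N) when a mass is raised by height_m."""
    return mass_kg * GRADIENT * height_m


dF = weight_change(1.0, 1.0)           # 1 kg raised by 1 m
dm_equiv = dF / G_LOCAL                # equivalent balance reading in kg
rel_21m = GRADIENT * 21.0 / G_LOCAL    # relative weight change over 21 m

print(f"force change: {dF * 1e6:.1f} uN")
print(f"equivalent mass: {dm_equiv * 1e6:.0f} ug")
print(f"relative change over 21 m: {rel_21m:.1e}")
```

The results reproduce the rounded figures in the text: a few micronewtons per kilogram and meter, and a relative change of a few parts in 10 6 over the height of von Jolly's tower.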
For his G measurement, he placed a field mass, a lead sphere with a diameter of approximately 1 m and a mass of 5775 kg, below one of the lower pans (see Fig. 12). Then, he applied the same measurement strategy as before. 12 With a distance between the centers of mass of the field mass and the test mass of a = 0.57 m, he measured an increase in weight of 5.8 μN, which corresponds to a mass of 589 μg. This difference corresponds to 10 −7 times the mass of the test mass. Von Jolly was able to measure this difference with a relative uncertainty of 1.2 %, a highly significant achievement for the time. He obtained G = 6.465 × 10 −11 m 3 kg −1 s −2 . To be historically accurate, von Jolly did not directly measure G but instead the mean density of the Earth, similar to Cavendish. He compared the density of the lead, ρ Pb , with that of the Earth, ρ Earth ,

$$\rho_{\mathrm{Earth}} = \rho_{\mathrm{Pb}}\,\frac{m}{\Delta m}\,\frac{r}{R}\,\frac{r^{2}}{a^{2}}, \tag{36}$$

where m denotes the mass of the test mass, Δm the measured mass increase, r the radius of the lead sphere, and R the radius of the Earth.
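As a plausibility check, Eq. (36) can be evaluated with the numbers quoted above. The following sketch is not von Jolly's own data reduction: the handbook density of lead, the mean Earth radius, the nominal local gravity, and the test mass of exactly 5 kg are assumptions, and the sphere radius is inferred from the quoted mass and the assumed density.

```python
import math

# Values quoted in the text
M_FIELD = 5775.0     # kg, lead sphere
A = 0.57             # m, center-to-center distance
M_TEST = 5.0         # kg, mercury test mass (approximate)
DM = 589e-9          # kg, measured mass increase

# Assumed values, not taken from the text
RHO_PB = 11340.0     # kg m^-3, handbook density of lead
R_EARTH = 6.371e6    # m, mean Earth radius
G_LOCAL = 9.81       # m s^-2, nominal local gravity

# Sphere radius implied by the mass and the assumed density
r = (3.0 * M_FIELD / (4.0 * math.pi * RHO_PB)) ** (1.0 / 3.0)

# Eq. (36): mean density of the Earth
rho_earth = RHO_PB * (M_TEST / DM) * (r / R_EARTH) * (r / A) ** 2

# For a uniform sphere, g = (4/3) * pi * G * R * rho_Earth
G_from_density = 3.0 * G_LOCAL / (4.0 * math.pi * R_EARTH * rho_earth)

# Direct route: G * M * m / a^2 = dm * g
G_direct = DM * G_LOCAL * A**2 / (M_FIELD * M_TEST)

print(f"rho_Earth ~ {rho_earth:.0f} kg/m^3")
print(f"G ~ {G_from_density:.3e} m^3 kg^-1 s^-2")
```

With these rounded inputs, the result lies within about one percent of the value attributed to von Jolly, and both routes to G agree because they are algebraically identical.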
Hermann von Helmholtz later encouraged König and Richarz, who were later joined by Krigar-Menzel, 29,109 to repeat the measurement with a modified setup in Berlin, where they used approximately 9 m 3 of lead, a remarkable amount. This field mass had a mass of about 100 metric tons. The lead was made available by a nearby cannon foundry. The field mass was a large rectangular block. 17 The British physicist Poynting measured G with a beam balance in 1878. 110 Like von Jolly, he used a spherical lead mass as a field mass, although with a smaller mass (170 kg). None of these experimenters reached the uncertainty that von Jolly reached.

Modern beam balance measurement-The obvious solution was to "bring the lake to the laboratory." Since water has a low density, a huge volume of water would have been necessary to create a sizable signal. For this reason, Kündig et al. resorted to a material that von Jolly had used: mercury. However, this time, mercury was used as a field mass. Mercury has a density of 13.54 g cm −3 , which allows it to be used very effectively as a field mass. Effective, in this context, means that a small volume of the field mass can create a large signal.
The second desirable property of mercury is that it is liquid at room temperature. Liquids have a higher homogeneity in density than solids. The density of solid metals can vary by up to 10 −4 in relative terms, depending on the details of the casting process. The mass distribution of a liquid field mass has (almost) perfect homogeneity. Possible deviations from perfect homogeneity include a linear density gradient due to the hydrostatic pressure gradient and the compressibility of the liquid. The major disadvantage of a liquid field mass is that a tank, which must be taken into account for the mass integration, is needed to contain the liquid. A tank is usually a very complicated structure consisting of several parts that are joined by bolts and use O-rings in grooves to seal the joints. The mass, shape, and location of these elements must be measured, and their effect on the gravitational signal must be calculated, increasing the complexity of the mass integration compared to an experiment with a solid source mass.
In the Zurich experiment, a total of 1 m 3 of mercury was filled into two identical tanks shaped like hollow cylinders (see Fig. 13). It is possible to move the tanks into either of two positions, labeled "together" and "apart." A vacuum tube passed through the inner bore of the tanks. Inside the vacuum tube are two test masses suspended by tungsten wires. Vertically, the test masses are separated by 1.4 m, approximately double the height of one tank. Thus, when the field masses are together, each test mass is located at the end of a field mass cylinder. The gravitational field generated by the cylinder has an extremum at this position (see Sec. V). The "apart" position is designed in such a way that each test mass is again at an extremum of the field. In the together (T) and apart (A) states, the force difference between the upper ($m_u$) and lower ($m_l$) masses is measured as follows:

$$\Delta F_T = \left[m_u\, g(z_u) + F_z(T,u)\right] - \left[m_l\, g(z_l) + F_z(T,l)\right], \tag{37}$$

and

$$\Delta F_A = \left[m_u\, g(z_u) + F_z(A,u)\right] - \left[m_l\, g(z_l) + F_z(A,l)\right], \tag{38}$$

where $F_z(A/T, u/l)$ denotes the vertical force on the upper/lower test mass in the field mass state (apart/together), and $g(z_{u/l})$ denotes the local acceleration of gravity at the position of the upper/lower test mass.
Subtracting the differences from each other eliminates the weights $m_u\, g(z_u)$ and $m_l\, g(z_l)$ and yields the following:

$$\Delta F_T - \Delta F_A = F_z(T,u) - F_z(A,u) - F_z(T,l) + F_z(A,l) = kG, \tag{39}$$

where k denotes a factor containing masses and distances between masses. This second difference, also referred to as the signal, was approximately 8 μN.
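The cancellation of the weights in the second difference can be illustrated numerically. In the sketch below, all masses, gravity values, and field mass forces are hypothetical numbers chosen only to show that the signal of Eq. (39) does not depend on the weights; the function name `delta_f` is likewise an illustration and not part of the original analysis.

```python
def delta_f(m_u, g_u, f_u, m_l, g_l, f_l):
    """Force difference between upper and lower test mass, as in Eqs. (37) and (38)."""
    return (m_u * g_u + f_u) - (m_l * g_l + f_l)


# Hypothetical test masses (kg) and local gravity values (m s^-2)
m_u, m_l = 1.0, 1.0002
g_u, g_l = 9.807000, 9.807004

# Hypothetical vertical field mass forces in newtons, keyed by (state, test mass)
F = {("T", "u"): 2e-6, ("T", "l"): -2e-6, ("A", "u"): -2e-6, ("A", "l"): 2e-6}

dF_T = delta_f(m_u, g_u, F["T", "u"], m_l, g_l, F["T", "l"])
dF_A = delta_f(m_u, g_u, F["A", "u"], m_l, g_l, F["A", "l"])

signal = dF_T - dF_A
expected = F["T", "u"] - F["A", "u"] - F["T", "l"] + F["A", "l"]

print(f"first differences: {dF_T:.9f} N, {dF_A:.9f} N")
print(f"second difference: {signal:.3e} N")
```

The first differences are dominated by the weight mismatch of the two test masses, whereas the second difference contains only the field mass forces.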
The vertical forces on the test masses are measured by a very sensitive mass comparator. A mass exchanger hanging below the mass comparator can connect either mass to the mass comparator. The reading of the mass comparator is in units of mass (kilogram) rather than force (Newton). To obtain the gravitational force, the reading has to be multiplied by the measured value of g.
In contrast to torsion balance experiments, the traceability of the force measurements in beam balance experiments is straightforward: A traceable calibration force can be generated by adding a mass standard with a calibrated mass to the balance pan. The second measurement required is that of g, which is not usually a problem at relative uncertainties of 10 −6 .
The relative uncertainty obtained by the Kündig experiment was 18.3 × 10 −6 . The uncertainty was limited by the statistical uncertainty (i.e., the noise in the weighing and sorption effects on the test masses that are correlated to the motion of the field masses).
The Zurich experiment was dismantled at the end of 2002, and the mercury was sent back to the mine from which it had been leased.

D. Free-fall absolute gravimeters and gradiometers
1. Principle of free-fall gravimeter-An absolute gravimeter is an instrument used to measure the local acceleration due to gravity, g, also commonly called "Little g." A diagram explaining the principle of a classical free-fall absolute gravimeter is shown in Fig. 14(a).
A test mass to which a retroreflector is attached is released in a vacuum chamber. The trajectory of the mass in free fall is traced by means of a laser interferometer. In order to conduct repeated measurements, the test mass sits in an elevator. This elevator is accelerated downwards with more than g. As a result, the test mass hovers inside the elevator and falls freely without being in contact with the elevator until the elevator decelerates in order to catch the test mass again gently. The accelerated motion of the test mass produces a chirped fringe signal, which is detected with a photodiode. From the fringe signal, the trajectory of the free fall can be recovered, as one fringe crossing corresponds to a travel distance of the test mass of half the laser wavelength. By timing the fringe signal values, the time/distance (t i , z i ) pairs are obtained and a least-squares fit of a linear model provides the acceleration, g 0 . The parameters z 0 and v 0 denote the initial position and velocity at the start of the trajectory (i.e., for t = 0). There are two ways to include the Earth's gravity gradient, γ: The first way is to include the gradient (if it is known), as in Eq. (40), in the fit model. Then, the calculated acceleration refers to the start position of the trajectory. The second way is to skip the gradient in the fit model (i.e., by setting γ = 0). In this case, the calculated acceleration refers to a position between the start and the end of the trajectory that depends on the initial velocity, v 0 , of the test mass, the approximated local gravity, g 0 , and the total free-fall time, T (see, e.g., Ref. 119). This position is then called the reference or reported height. 120 Resolutions on the order of 1 part in 10 9 and better are possible. A short overview of relative and absolute gravimeters can be found in Ref. 121.
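The fitting step can be sketched with synthetic data. The example below assumes the simplest case, with the gradient omitted (γ = 0) and no noise, so that the model is z = z 0 + v 0 t + g 0 t 2 /2. The sampling of every 1000th fringe, the initial velocity, and the helper `fit_quadratic` are assumptions made for this illustration and do not describe the processing of any particular instrument.

```python
import math

WAVELENGTH = 633e-9          # m, laser wavelength quoted later in the text
FRINGES_PER_SAMPLE = 1000    # hypothetical: time-stamp every 1000th fringe
G_TRUE = 9.81                # m s^-2, used to generate the synthetic data
V0_TRUE = 0.05               # m s^-1, hypothetical initial velocity


def fit_quadratic(ts, zs):
    """Least-squares fit of z = c0 + c1*t + c2*t**2; returns (c0, c1, c2)."""
    # Normal equations: sum over t^(j+k) on the left, sum over z*t^j on the right
    s = [sum(t ** p for t in ts) for p in range(5)]
    b = [sum(z * t ** p for t, z in zip(ts, zs)) for p in range(3)]
    a = [[s[j + k] for k in range(3)] + [b[j]] for j in range(3)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        piv = max(range(col, 3), key=lambda row: abs(a[row][col]))
        a[col], a[piv] = a[piv], a[col]
        for row in range(3):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [x - f * y for x, y in zip(a[row], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))


# One fringe corresponds to half a wavelength of travel; z is measured downwards.
step = FRINGES_PER_SAMPLE * WAVELENGTH / 2.0
zs = [i * step for i in range(600)]
# Invert z = v0*t + g*t^2/2 to obtain the time stamp of each sampled position
ts = [(-V0_TRUE + math.sqrt(V0_TRUE**2 + 2.0 * G_TRUE * z)) / G_TRUE for z in zs]

z0_fit, v0_fit, half_g = fit_quadratic(ts, zs)
g_fit = 2.0 * half_g
print(f"fitted g0 = {g_fit:.6f} m/s^2, fitted v0 = {v0_fit:.6f} m/s")
```

In a real instrument, the time stamps carry noise and the fit model contains further terms, such as the gradient discussed above; the sketch only shows how the acceleration follows from the quadratic coefficient.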
From a didactic standpoint, free-fall experiments are ideally suited to measure G. After all, Newton was allegedly inspired to derive the law of gravitation after observing the free fall of an apple from a tree. However, free-fall experiments also have a practical advantage over the experiments discussed above. Free-fall experiments do not require a suspension for the test mass(es). Perturbing material properties, which are often not well understood or whose models are only valid to a certain degree, do not apply here. A disadvantage, on the other hand, is the short measurement time of such experiments in an Earth-bound laboratory. In a satellite encircling the Earth, this situation would be different, as mentioned above.
2. G measurement with a free-fall gravimeter-G can be measured with a gravimeter by placing a well-defined field mass close to the gravimeter's test mass. Because the field mass perturbs the local acceleration due to gravity, the gravity measured is g meas = g 0 + g FM (i.e., the local Earth's gravity, g 0 , plus a perturbing term arising from the field mass, g FM ). By applying Newton's law of gravitation, this perturbation can be modeled, and the acceleration measured can be taken to determine the magnitude of G. This experiment was conducted in 1998 by a group led by Faller at Boulder, USA, using a commercial freefall absolute gravimeter FG5. 118 Twelve tungsten alloy cylinders (in two layers) were placed to form a ring. The gravity field of a ring mass shows two extrema where the integrated signal over the test mass trajectory reaches its maximum, see Fig. 14(b). Another advantage of the ring shape is that the field strength variation at these extrema is minimal; as a result, a minimum uncertainty due to positioning errors is given. The relative standard uncertainty obtained was on the order of 1.4 × 10 −3 . This is only approximately five times better than what Cavendish obtained, but the important point is that it represents a completely different measurement approach. It does not need to compensate for Earth's gravity by means of (for example) a torsion wire, an error source that was underestimated for many years. 36 Although there is no drift due to material properties, many other factors are part of this relatively high uncertainty. First, there is a poor signal-to-noise ratio. Although gravimeters can resolve gravity to better than one part in 10 9 , the signal, when compared to g, is only about 1 part in 10 7 . This requires a long integration time. In order to compensate for systematic effects, the experiment was repeated with two different field mass positions. 
In the first mode, the field mass was placed below the test mass; as a result, the gravity measured was higher than g 0 . In the second mode, the field mass was placed above the test mass, thus taking advantage of both extrema of the ring-shaped field strength. This second position reduced the effective gravity acting on the test mass. Figure 15 shows some measurement data of the experiment. The time variations in the data are mainly from the influences of the Moon and the Sun (tides), as well as due to environmental effects (e.g., temperature, air pressure, and hydrological effects). The pure tidal variations are already three times the magnitude of the field mass signal. Moving the 500 kg field mass produces additional scatter on the data.
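The two extrema of a ring-shaped field mass can be illustrated with the on-axis field of an idealized thin ring, for which the added acceleration is G M z/(R 2 + z 2 ) 3/2 and the extrema lie at z = ±R/√2. The sketch below is a simplified model and not the geometry of the actual experiment: the thin-ring approximation and the ring radius are assumptions, and the value of G is a rounded number used only for illustration.

```python
import math

G = 6.674e-11     # m^3 kg^-1 s^-2, rounded value for illustration only
M_RING = 500.0    # kg, field mass quoted in the text
R_RING = 0.20     # m, hypothetical ring radius


def extra_gravity(z):
    """Added downward acceleration on the axis at height z above the ring plane."""
    return G * M_RING * z / (R_RING**2 + z**2) ** 1.5


z_ext = R_RING / math.sqrt(2.0)     # analytic position of the extremum

# Ring below the test mass increases gravity; ring above decreases it.
signal = extra_gravity(z_ext) - extra_gravity(-z_ext)

print(f"extrema at z = +/-{z_ext * 1e3:.1f} mm")
print(f"two-position signal = {signal:.2e} m/s^2")
print(f"relative to g: {signal / 9.81:.1e}")
```

Because the derivative of the field vanishes at the extrema, a small error in the vertical position of the ring changes the signal only to second order, which is the advantage mentioned above.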

Cold atom gravimeters and gradiometers-Dropping a macroscopic retroreflector is not the only way to realize a free-fall gravimeter. In 1991, Kasevich and Chu 122,123 were able to measure g by dropping cooled atoms (i.e., by using atom interferometry 124 ). Here, a cloud of atoms is cooled and all atoms are prepared in the same quantum state, described by their phase ϕ. Then, the atoms are released by briefly turning off the trap. Three laser pulses with a pulse separation T follow. The first pulse splits the cloud into two by increasing the momentum of some of the atoms. This laser pulse is called a π/2 pulse. The second pulse acts on both clouds in such a way that their momenta are interchanged (the so-called π pulse). Finally, the third pulse, which is a second π/2 pulse, recombines both clouds. As both clouds interact at different heights with the laser, where the clouds have different velocities and the laser has a different phase, the states of the atoms are different, depending on the acceleration.
The final recombined cloud is then interrogated by another laser in order to obtain the information on the atom quantum states. The probability P = (1 + cos(ϕ))/2 of how many atoms are still in their initial state is then a measure of g. Here, ϕ represents the total accumulated phase difference. Although the information about the acceleration is extracted from the quantum states of the atoms, the principle has some points in common with the macroscopic setup. It should be noted that atom gravimeters essentially correspond to a measurement of three positions of the atom clouds, or to a measurement of two velocities. During the pioneering years of conventional free-fall gravimeters, only three points were measured, whereas today, data are collected on thousands of positions. 121 These early developments took place before lasers were available. Moreover, the measurement sequence π/2 − π − π/2 corresponds to a Mach-Zehnder laser interferometer type, the interferometer setup that is commonly used in conventional gravimeters. From these commonalities, it follows that perturbations enter both systems in the same way, as explained in Ref. 120. This can be easily seen by comparing the measurement functions, in which T denotes the total free-fall time. For a classical gravimeter, the factor k = 4π/λ, with λ being the wavelength of the laser (usually 633 nm). For an atom gravimeter, k = k eff (the effective Raman wavenumber). In a conventional gravimeter, ϕ i denotes the time-dependent phase difference between both interferometer beams; in an atom gravimeter, it denotes the local Raman phases at the times when the π/2 and π pulses are applied. Also of note is the fact that atom gravimeters can be constructed in two ways: using a simple drop of atom clouds and using a fountain-like setup (i.e., the launch-and-drop method), which is actually the most common method for atom gravimeters. The same is true for conventional gravimeters although, here, the simple free fall is preferred.
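The sensitivity of the three-pulse sequence can be estimated with the widely used leading-order relation ϕ = k eff g T 2 , where T is here the separation between pulses. The sketch below rests on several assumptions that are not taken from the text: rubidium atoms addressed near 780 nm, counter-propagating Raman beams so that k eff ≈ 4π/λ, a pulse separation of 0.1 s, and a phase resolution of 1 mrad.

```python
import math

WAVELENGTH = 780e-9                      # m, assumed Raman laser wavelength
K_EFF = 4.0 * math.pi / WAVELENGTH       # rad/m, counter-propagating beams


def interferometer_phase(g, pulse_separation):
    """Leading-order phase of a pi/2 - pi - pi/2 sequence."""
    return K_EFF * g * pulse_separation**2


def population(phi):
    """Fraction of atoms found in the initial state, P = (1 + cos(phi)) / 2."""
    return 0.5 * (1.0 + math.cos(phi))


T_PULSE = 0.1                            # s, hypothetical pulse separation
phi = interferometer_phase(9.81, T_PULSE)

# Gravity change corresponding to a hypothetical phase resolution of 1 mrad
dg = 1e-3 / (K_EFF * T_PULSE**2)

print(f"k_eff = {K_EFF:.3e} rad/m")
print(f"total phase = {phi:.3e} rad, P = {population(phi):.3f}")
print(f"1 mrad corresponds to dg = {dg:.1e} m/s^2")
```

The quadratic dependence on the pulse separation explains why long free-fall times, for example in fountain configurations, are attractive.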
Atom gravimeters have already been used in several G measurements in different configurations. 125 However, possibly as a consequence of the knowledge gained from classical gravimeters, a single atom gravimeter has never been used for G measurements; instead, two gravimeters have been combined to form a gradiometer. A gradiometer measures spatial gravity differences, for example, between two retroreflectors that fall simultaneously at a vertical separation d in a setup like that depicted in Fig. 14. Since the lower retroreflector is closer to Earth's center of gravity, its acceleration due to gravity will be larger than that of the upper retroreflector. If this differential acceleration ∆g is measured, then the Earth's gravity gradient γ = ∆g/d can be calculated using the known separation d. Since both masses fall simultaneously, environmental changes or tidal signals are common mode effects for both masses and are thus canceled out. Such a gradiometer would give an improved signal-to-noise ratio, as the background signal is no longer g but only its gradient γ, which is on the order of 3 × 10 −7 m −1 relative to g. At this point, G is measured by perturbing the local gravity gradient by adding a well-defined field mass, as in the gravimeter experiment. However, the Earth's gravity gradient can only be measured to a few parts in 10 2 , which limits the achievable accuracy of the experiment. In order to circumvent this problem, an additional differential measurement can be conducted. By repeatedly positioning the field mass at different positions, the differential signal can be considered. An experiment of this type was carried out by Fixler et al., 126 who achieved a relative uncertainty of 5 × 10 −3 , and more recently by Rosi et al., 127 who gave a relative uncertainty estimation of 1.5 × 10 −4 .
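The benefit of differencing two field mass positions can be shown with a toy model. The sketch below treats the field mass as a point mass on the vertical axis and places it near and far below the lower test mass; the geometry, the mass, the tidal offsets, and all function names are hypothetical and serve only to show that the Earth's gradient and common mode signals drop out of the two-position difference.

```python
G = 6.674e-11   # m^3 kg^-1 s^-2, rounded value for illustration only


def point_mass_acc(z_test, z_mass, mass):
    """Downward acceleration of a test mass due to a point mass on the same axis."""
    dz = z_test - z_mass
    return G * mass * dz / abs(dz) ** 3


def gradiometer_reading(d, gamma, tide, z_mass, mass, g0=9.81):
    """g(lower) - g(upper) for test masses at heights 0 and d."""
    def g(z):
        return g0 - gamma * z + tide + point_mass_acc(z, z_mass, mass)
    return g(0.0) - g(d)


def two_position_signal(gamma, tide_near, tide_far):
    """Difference of the readings with the field mass near and far."""
    d, mass = 0.3, 500.0                                          # hypothetical
    near = gradiometer_reading(d, gamma, tide_near, -0.3, mass)   # 0.3 m below
    far = gradiometer_reading(d, gamma, tide_far, -1.0, mass)     # 1.0 m below
    return near - far


s_a = two_position_signal(3.0e-6, 0.0, 0.0)
s_b = two_position_signal(3.1e-6, 1.0e-6, -2.0e-6)   # other gradient and tides
print(f"signal A = {s_a:.4e} m/s^2, signal B = {s_b:.4e} m/s^2")
```

Both calls return the same signal although the assumed gradient and the tidal offsets differ, which is why the limited knowledge of γ does not enter the result directly.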

Differential gravity gradiometer-The gradiometer setup still involves the problem of an unknown gravity gradient; the differential measurement requires a re-positioning of the field mass, which leads to positioning uncertainties. One way of avoiding this problem was proposed in 2014 by Rothleitner and Francis 128 and combines two gradiometers in one, as depicted in Fig. 16. The second (middle) test mass (TM2) is part of both gradiometers. When all three test masses fall simultaneously and the distance between TM1 and TM2 is the same as that between TM2 and TM3, both gradiometers measure the same (Earth's gravity) gradient. When taking the differential signal of both gradiometers into consideration, this gradient is canceled to first order and the resulting instrument can be considered a "null instrument." Such an instrument is highly sensitive to local gravity variations and not influenced by environmental changes since most influences are common mode effects on both gradiometers. This instrument is well suited for measuring local gravity variations and thus for determining G. In principle, not even the positions of the field mass have to be varied, as only the influence of the field mass is measured, in accordance with the nature of the instrument. Most of the effects that have a large uncertainty contribution in gravimeter measurements are common mode here and are canceled out. As a consequence, this setup requires lower stabilities in the laser system and other components, making it less expensive than an absolute gravimeter. Only a few effects, such as the rotation of the test masses, are not common mode. 129 The test mass contains a retroreflector attached to a housing. If the test mass falls freely, there will always be a minute induced rotational velocity. This rotation will be around the center of mass (COM) of the test mass. If the optical center (OC) differs from the COM, then a parasitic acceleration will be measured (the centrifugal acceleration of the test mass). Thus, the test masses must be well balanced in order for the COM to coincide with the OC. 130 An experiment with a conventional free-fall gradiometer, as suggested in Ref. 128, has not yet been realized. However, a first attempt by means of an atom gravimeter configuration was carried out by Rosi et al. in 2015. 131
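The null property of the combined instrument can be illustrated with a toy model in which the Earth's gravity varies linearly with height. The sketch below assumes three equally spaced test masses and a point-like field mass on the vertical axis; the spacing, the mass, and its position are hypothetical, and the function names are introduced only for this illustration.

```python
G = 6.674e-11   # m^3 kg^-1 s^-2, rounded value for illustration only


def double_difference(g_of_z, z1, d):
    """(g1 - g2) - (g2 - g3) for test masses at z1, z1 + d, and z1 + 2d."""
    g1, g2, g3 = g_of_z(z1), g_of_z(z1 + d), g_of_z(z1 + 2.0 * d)
    return (g1 - g2) - (g2 - g3)


def earth_only(z, g0=9.81, gamma=3.0e-6):
    """Earth's gravity with a constant vertical gradient (linear model)."""
    return g0 - gamma * z


def with_field_mass(z, z_mass=-0.3, mass=500.0):
    """Earth's gravity plus a hypothetical point mass below the lowest test mass."""
    dz = z - z_mass
    return earth_only(z) + G * mass * dz / abs(dz) ** 3


null = double_difference(earth_only, 0.0, 0.3)
signal = double_difference(with_field_mass, 0.0, 0.3)
print(f"without field mass: {null:.1e} m/s^2")
print(f"with field mass:    {signal:.3e} m/s^2")
```

Without the field mass, the double difference vanishes up to rounding, so the full reading is attributable to the field mass, provided that the linear model of the background holds.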

V. THE MASSES
Besides length and time, the third dimension that appears in the unit of the gravitational constant is mass. The choice of the material and the shape of the test and field masses are not trivial and depend on many factors. Although mass can be measured very accurately by means of comparators, in "Big G" measurements, the determination of the mass alone is not sufficient. This is due to the fact that the field mass is usually very close to the test mass for a G measurement. As a consequence, we cannot treat the masses as point masses and have to integrate over both mass density distributions from the test and field masses. Therefore, knowledge of the density distribution within the masses is also necessary. Furthermore, highly accurate dimensional measurements are necessary, as well as precise positioning of the test and field masses. Table III shows some examples of field masses that have been used for G measurements. The table also shows the very wide range of weights that have been used for laboratory measurements over the centuries. From Eq. (1), we infer that in order to increase the signal, the masses must be as large as possible. At the same time, the distance between the masses should be as small as possible for the signal decays with the inverse square of the distance. This brings us to the conclusion that the density of the material should be maximized. This is why most of the experiments use highly dense material. Elements such as lead, tungsten, and uranium have been used for their high densities. Some of these materials, however, are either hard to process (tungsten) or are relatively soft (lead) and allow the surface to be easily damaged. Because density distribution is also of critical importance, either the mass should be homogeneous or (as a minimum) the density distribution should be easily measurable. Due to the high density of the materials, X-ray imaging methods cannot be used. 
Therefore, one of the masses used in the experiment is usually examined by means of destructive testing. A better solution is the use of a liquid such as mercury 114 or water, as they have an almost perfectly homogeneous density distribution. However, because mercury is poisonous, it is generally avoided, while the low density of water means that it can only be used in enormous quantities such as a lake 116,134 or the sea. 135 Another important consideration is the shape of the mass. Table III lists four types of shapes: a sphere, a cylinder, a ring, and a block. Spheres have several advantages over other geometrical shapes of the field mass:

1. The sphere is the shape for which the gravitational field can be calculated most easily. The calculation of the gravitational field of other shapes is more involved and sometimes cannot be solved in closed mathematical form.

2. The sphere has the highest symmetry. It can be oriented in many different positions in order to average out density non-uniformity.

3. It is easier to fabricate a large sphere with low form deviations than, for example, a cylinder. Currently, the best spheres in the world are those fabricated for the Avogadro project. These spheres are made from a crystal of almost pure 28 Si silicon. Their form deviations are on the order of 40 nm (and below) for diameters of 93.7 mm 136 (i.e., the relative form deviations are about 4.3 × 10 −7 ), and their surface roughnesses are less than 0.2 nm. 137 The tungsten-sphere field masses with nominal diameters of 101.6 mm that were used by Beams 138 had form deviations of less than 75 nm (relative form deviation of 7.4 × 10 −7 ). 133 For comparison, industrial ball bearings with nominal diameters of up to 100 mm have tolerances in diameter of up to 1 μm (grade 40) (i.e., one part in 10 5 relative form deviation), whereas smaller ball bearings with nominal diameters of up to 12.7 mm have relative form tolerances of 6.3 × 10 −6 (grade 3). 139 In order to estimate the measurement error due to these form deviations, we assume an ellipsoidal shape of the field mass. The relative error can then be calculated (see Ref. 43) from a, b, and c, the three semi-axes of the ellipsoid, and R, the distance between the test mass and the field mass. If the separation R = 100 mm, a = b = 50 mm, and c = a − 0.5 μm (grade 40), then this relative error amounts to about 3 × 10 −6 (i.e., sufficient for a measurement on the order 1 × 10 −5 ).
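The size of this error can be estimated with the leading quadrupole correction (MacCullagh's formula) for a homogeneous ellipsoid. The sketch below assumes that the test mass lies on the c axis, in which case the relative change of the attraction is 3(2c 2 − a 2 − b 2 )/(10R 2 ). Whether this is the exact expression given in Ref. 43 is an assumption; it is used here because it reproduces the number quoted above.

```python
def relative_force_error(a, b, c, distance):
    """Leading-order relative change of the attraction of a homogeneous
    ellipsoid (semi-axes a, b, c) on a test mass located on the c axis,
    compared with a point mass of equal mass (MacCullagh's formula)."""
    return 0.3 * (2.0 * c**2 - a**2 - b**2) / distance**2


# Numbers from the text: R = 100 mm, a = b = 50 mm, c = a - 0.5 um
a = b = 50e-3
c = a - 0.5e-6
err = relative_force_error(a, b, c, 100e-3)

print(f"relative error: {err:.2e}")
```

The sign is negative because the ellipsoid is flattened along the line connecting the two masses; the magnitude agrees with the estimate of about 3 × 10 −6 given in the text.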
Although it is sometimes believed that a cylinder is easier to manufacture than a sphere, this does not hold if small tolerances have to be met, as the form deviations of cylinders are usually higher than those for spheres. As an example, Tino et al. (Florence, Italy) used cylindrical field masses made of 95.3 % W, 3.2 % Ni, and 1.5 % Cu. The relative form deviation, which was measured with a coordinate measuring machine, was on the order of 1 × 10 −5 . 127,140 While the tolerances for a sphere can be specified with one parameter (radius), additional parameters are necessary to specify the tolerances for a cylinder: cylindricity, parallelity (of end planes), flatness (of end planes), and angles. Hence, it is more difficult to manufacture and to quantify the form deviations on a cylinder. It is challenging to keep the aforementioned tolerances below 1 μm for cylinders. 141 A considerable advantage of using a hollow cylinder versus a sphere as a field mass is a larger tolerance for the test mass position, as Faller and Koldewyn 142 pointed out in 1983. At a certain distance above the cylinder along the symmetry axis, the gravitational field of a hollow cylinder has a maximum. At the maximum, the derivative of the gravitational signal with respect to the axial position is zero; thus, the signal to first order is independent of the precise position of the test mass. Chen and Cook 143 made a detailed calculation and showed that there is actually a saddle point (i.e., a minimum and a maximum along the radial and axial directions, respectively). Hence, the signal is also independent to first order of the radial position. Figure 17 depicts a sketch of this saddle point. Kündig et al. also took advantage of this property, as shown in the graphs on the left and the right of Fig. 13.
Such a field mass can also be constructed by arranging a number of cylinders to form a ring, as done in the free-fall experiment by Schwarz et al. 118 The graph in Fig. 14(b) shows these extrema of gravity for this source mass arrangement. The freely falling test mass was then positioned in such a way that its trajectory covered the precise area of the saddle point. Lamporesi et al. 132 used a similar arrangement. Furthermore, as Chen and Cook point out, a cylinder with a diameter equal to its length has almost the same gravitational field characteristics as the sphere. 43 Another restriction is that perturbing forces should be avoided. Masses should therefore have a low magnetic and electric susceptibility and be made of electrically conductive materials in order to avoid electric charges.

VI. SUMMARY AND OUTLOOK
Measuring G accurately is a challenging task. To determine G, a very carefully constructed setup is necessary since the gravitational force is about 38 orders of magnitude smaller than the electromagnetic force and since the gravitational interaction, unlike the electromagnetic interaction, cannot be shielded. A good indicator of the struggle to assign a value to G is given by the spread of the published values. Figure 18 shows the values and uncertainties of fourteen precision determinations of G published in the past 35 years. The smallest reported relative standard uncertainty is 14 × 10 −6 . However, the difference between the largest and the smallest reported results exceeds 500 × 10 −6 , more than 30 times the smallest uncertainty.
In 2014, CODATA used a data set to determine the average value of G that was slightly different from the data set shown in Fig. 18.
The normalized residual of an experiment i is given by (y i − y avg )/σ i , where y i and σ i are the published value and uncertainty of the experiment, respectively. Here, y avg is the average value that can be obtained as a weighted average of the published data. The Birge ratio for the G data set is about five. This means that if every uncertainty is multiplied by a factor of five (i.e., σ i is replaced with 5σ i ), χ 2 would be thirteen, the expectation value of a data set with fourteen measurements. The task group chose to use a slightly different expansion factor of 6.3 instead in order to reduce all normalized residuals below two.
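The relation between the Birge ratio and χ 2 can be made concrete with a short calculation. The sketch below uses the standard definitions of the weighted mean, χ 2 , and the Birge ratio, R B = (χ 2 /(N − 1)) 1/2 . The five data points are synthetic values made up for this illustration and are not the published G results.

```python
import math


def weighted_mean(ys, sigmas):
    """Inverse-variance weighted mean."""
    ws = [1.0 / s**2 for s in sigmas]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)


def chi_squared(ys, sigmas):
    """Sum of the squared normalized residuals about the weighted mean."""
    y_avg = weighted_mean(ys, sigmas)
    return sum(((y - y_avg) / s) ** 2 for y, s in zip(ys, sigmas))


def birge_ratio(ys, sigmas):
    """Square root of chi-squared per degree of freedom (N - 1)."""
    return math.sqrt(chi_squared(ys, sigmas) / (len(ys) - 1))


# Synthetic values in units of 1e-11 m^3 kg^-1 s^-2 (made up for illustration)
ys = [6.6720, 6.6745, 6.6735, 6.6756, 6.6729]
sigmas = [0.0003, 0.0002, 0.0004, 0.0003, 0.0002]

r_b = birge_ratio(ys, sigmas)
scaled = [r_b * s for s in sigmas]   # expand every uncertainty by the Birge ratio

print(f"Birge ratio: {r_b:.2f}")
print(f"chi^2 after scaling: {chi_squared(ys, scaled):.2f} (N - 1 = {len(ys) - 1})")
```

Multiplying all uncertainties by the Birge ratio leaves the weighted mean unchanged and brings χ 2 to its expectation value of N − 1. Reducing every individual normalized residual below a chosen limit, as the task group did, generally requires a somewhat larger factor.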
Why is the scatter of the G data so large? Experimenters have devoted tremendous effort to investigating many possible contributions to the measurement uncertainty, but the complete data set reported is not statistically probable. In principle, there are three possibilities that can explain the observed inconsistency of the data:

1. Some or all of the experiments suffer from an unknown bias. A bias is a systematic effect that shifts the measured result from the true value by a predictable amount. It is normal for experiments to have biases. Usually, the experimenter determines the bias and applies a correction so that the published value no longer has a bias and represents the experimenter's best estimate of the true value. However, a bias may be present in the experiment that the experimenter is not aware of. In this case, the published result will differ from the true value by the size of the bias. For example, prior to 1995, the publication year of Kuroda's article, the time-of-swing experiments suffered from a relative bias on the order of 1/Q because the experimenters were not aware that the anelastic properties of the torsion spring affect the measurement result. After 1995, the experimenters tried to avoid the bias either by using a suspension with a large Q (e.g., Ref. 59) or by estimating the bias and applying a correction (e.g., Ref. 57).

2. Some or all of the experiments underestimate the relative uncertainty of the measurement. Hypothetically, all of the reported values of the measurements may be correct, but the uncertainties reported may be too small. If the true uncertainty were five times larger, the data set would be perfectly consistent (see the Birge ratio above). What could cause the experimenters to under-report the measurement uncertainty? We can say with certainty that this is not the intention of the experimenters, who usually spend a great deal of time and considerable resources in establishing a comprehensive uncertainty budget. In most G measurements, the time spent taking the actual measurement data is often much shorter than the time spent investigating the uncertainties.
The principal problem is that the set of the known systematic and statistical effects is only a subset of all systematic and statistical effects that can perturb an experiment. Hence, regardless of how much effort is spent, the uncertainty budget can never be complete. There will always be an unknown uncertainty, often referred to as a dark uncertainty. 5 The aim of the experimenter is to consider every conceivable effect that may change the result of the measurement in order to keep the dark uncertainty as small as possible (ideally, limiting it to a small fraction of the total reported uncertainty).
Fortunately, however, as more experiments are conducted, more systematic effects are discovered. As our knowledge base of systematic effects grows, the sources of dark uncertainty will diminish and the scatter of future measurements will decrease. Eventually, the recommended value of G will converge to the true value.

3.
The most exciting, yet least probable, explanation is that new, unknown forms of physics can explain the variation in the data. This is a variant of the first point, with the difference being that the first point focuses on technical biases, while this point focuses on a bias that is more fundamental in nature. Over the years, several theories have been published that attempt to explain the variation observed among the different experiments (e.g., Refs. 147 and 148). While attempts to formulate new theories are encouraged, the existing G data set is not the best data set to disprove such new theories. The experiments were all done with the intent of measuring G, and other variables may not have been strictly controlled. In most cases, it would be much better to conduct a dedicated experiment to disprove a certain theory. For example, if the hypothesis is that G depends on time, the best course of action is to build one experiment that is optimized to be precise and stable, but not necessarily accurate, because the true value of G would not be relevant for this test. The time series produced by such an experiment may be able to constrain a time-varying G, especially if all other variables are kept constant. A long data set of G measurements is available from the work of Karagioz. 51
Theories of a time-varying G will necessarily predict a time variation of g, which contains G, unless there is a subtle canceling effect. Many institutes around the world measure g, and unexplained variations of g within a year can be limited to well below one part in $10^9$. 149
An interesting possibility is a deviation from the inverse square law in the range of the laboratory experiments. A simple parameterization of a possible violation of the inverse square law by a fifth force is the Yukawa potential.
In the context here, this can be written as a distance-dependent constant of gravitation,
$$G(r) = G_\infty \left(1 + \alpha e^{-r/\lambda}\right), \qquad (44)$$
where α denotes the strength of a possible new interaction relative to gravity and λ is the typical range of the interaction. For r ≫ λ, $G = G_\infty$, and for r ≪ λ, $G = G_\infty (1 + \alpha)$. For a review of tests of the inverse square law, see Ref. 150. Figure 19 shows the current limits on α as a function of λ. Surprisingly, the exclusion is remarkably weak for the laboratory-scale experiments discussed here, with typical dimensions ranging from several centimeters to a few meters. Only values of $\alpha > 10^{-3}$ are ruled out. While a variation of G at ranges from a few centimeters to a few meters is not ruled out, it seems unlikely. Again, excluding a variation of G over the distance range of interest is best addressed in a dedicated experiment that compares the gravitational attraction on two length scales, rather than by comparing the results of different G experiments.
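The limiting behavior of Eq. (44) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `g_effective`, the approximate value used for $G_\infty$, and the choices α = 10⁻³ and λ = 0.1 m are assumptions made purely for illustration.

```python
import math

# Approximate value of G far from the source, in m^3 kg^-1 s^-2 (assumed for illustration).
G_INF = 6.674e-11


def g_effective(r, alpha, lam, g_inf=G_INF):
    """Distance-dependent G of Eq. (44): G(r) = G_inf * (1 + alpha * exp(-r / lam))."""
    return g_inf * (1.0 + alpha * math.exp(-r / lam))


# Hypothetical Yukawa parameters: strength alpha and range lam (in meters).
alpha = 1e-3
lam = 0.1

# Evaluate at distances far below, equal to, and far above the range lam.
for r in (1e-4, 0.1, 100.0):
    relative_shift = g_effective(r, alpha, lam) / G_INF - 1.0
    print(f"r = {r:8.4f} m  ->  G(r)/G_inf - 1 = {relative_shift:.3e}")
```

For r much smaller than λ, the relative shift approaches α, and for r much larger than λ it vanishes, which is why an experiment comparing the attraction at two length scales is sensitive to such an interaction.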
Besides the technical and scientific issues discussed here and above, two additional facts hamper progress in measurements of G. First, most measurements are performed by small groups; publications with only two authors are not uncommon in this field. Very often, these groups measure G once and then move on to other experiments; the group led by Quinn is a notable exception. This makes it difficult to build institutional memory: nearly every attempt to measure G starts from scratch, and the investigators can learn only from the literature and not from their mentors. Second, experiments are not repeated. A core tenet of the scientific method is for results to be reproducible. Usually, a discovery is made and then verified by a second laboratory. However, no G experiment has ever been repeated with an identical setup. For most researchers, it is more interesting and rewarding to invent a new method of measuring G than to repeat an existing measurement. Nevertheless, repeating an experiment and independently assessing its uncertainty is very important for this field.
In 2014, Quinn, Speake, and Luo invited many experimenters to discuss the situation of the G measurements. 58 Later that year, NIST hosted another workshop to continue the discussion and proposed that a collaboration or consortium of several institutes, preferably national metrology institutes, be formed in order to develop a common experiment. 152 The idea was to make two identical setups so that two institutions could conduct the same experiment. Both were to give the same value for G; if not, the experimenters would have to search for the error until the results agreed within the single standard uncertainties. Although this would certainly not reveal all systematic errors, it would give more confidence in the result. Unfortunately, such a consortium has not yet been founded.
One positive outcome of both meetings was the formation of a "Big G" working group under the auspices of the International Union of Pure and Applied Physics (IUPAP). 153 The purpose of this working group is to assist in resolving the discrepancy present in G measurements. An additional function of the working group could be to provide institutional memory, mentoring, and advice for new experiments.
In addition, the International Committee for Weights and Measures (CIPM) decided in its November 2014 meeting to establish a consortium of national metrology institutes to facilitate new work aimed at resolving the present disagreement among measurements of the Newtonian constant of gravitation. 154
Given the current situation in the measurement of G, it is difficult to see how our knowledge of G can be improved. For example, $\chi^2$ will not decrease when new experiments are added, as it is a sum of squares and can only increase with new data. The Birge ratio can decrease because N − 1 in the denominator increases; however, this will be a slow process. Re-evaluating or repeating experiments that have already been performed, on the other hand, may provide insights into hidden biases or dark uncertainty. NIST has the unique opportunity to repeat the experiment of Quinn et al. 72 with an almost identical setup. By mid-2018, NIST researchers will publish their results and assign a number as well as an uncertainty to their value. The same researchers have also acquired the equipment of Parks and Faller, 103 although there are no immediate plans to repeat this experiment. Securing the equipment to prevent it from being lost is an important first step. Because these two experiments span almost the whole range of all G values, having both of them is an important asset. The relevance of repeating previous experiments after so many years may be called into question, as technology has changed, and an improved setup should be possible. One might liken this situation to ascending Mount Everest using the equipment of Sir Edmund Hillary. However, the lessons learned by seeing a situation from another point of view (i.e., following in the footsteps of others) can prove to be highly instructive. The experience that experimenters have acquired to date using new equipment may help to identify previously unconsidered systematic errors. 151
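The behavior of χ² and the Birge ratio under additional data can be illustrated with a small numerical sketch. It assumes the usual definitions: an inverse-variance weighted mean, χ² as the sum of squared normalized residuals about that mean, and the Birge ratio as the square root of χ²/(N − 1). The function names and all numbers are made up for illustration (arbitrary units) and are not real G measurements.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean of the measurements."""
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def chi_squared(values, sigmas):
    """Sum of squared normalized residuals about the weighted mean."""
    mean = weighted_mean(values, sigmas)
    return sum(((x - mean) / s) ** 2 for x, s in zip(values, sigmas))


def birge_ratio(values, sigmas):
    """Birge ratio, sqrt(chi^2 / (N - 1)), for N measurements."""
    return math.sqrt(chi_squared(values, sigmas) / (len(values) - 1))


# Made-up, mutually inconsistent data set in arbitrary units (not real measurements).
values = [0.0, 10.0, -8.0, 5.0]
sigmas = [1.0, 1.5, 2.0, 1.0]

# Add one hypothetical new result: either exactly at the current mean or far from it.
mean = weighted_mean(values, sigmas)
agreeing = (values + [mean], sigmas + [1.0])
discrepant = (values + [mean + 10.0], sigmas + [1.0])

cases = (
    ("original", (values, sigmas)),
    ("plus agreeing point", agreeing),
    ("plus discrepant point", discrepant),
)
for label, (v, s) in cases:
    print(f"{label:22s} chi2 = {chi_squared(v, s):8.3f}  R_B = {birge_ratio(v, s):6.3f}")
```

Even a new result that coincides exactly with the weighted mean leaves χ² unchanged and lowers the Birge ratio only through the larger N − 1, which is the slow process described above.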

VII. CONCLUSIONS
The measurement of the constant of gravitation, "Big G," is still one of the most challenging of all experiments. Being the second fundamental constant ever measured, it remains the fundamental physical constant with the highest measurement uncertainty. We have given an overview of recent and historic measurements in order to show the different experimental approaches that have been taken over the past 200 years. Furthermore, efforts are under way to uncover the origins of the large discrepancies in recent measurements. The National Metrology Institute of the United States, NIST, is repeating one "Big G" measurement conducted by another group.
It is hoped that the quest for a more accurate "Big G" will not cease, as science has yet to devise a solution to the mystery of why "Big G" measurements do not converge.

Example of how to measure G by means of a spring balance. First, the elongation of the spring due to the Earth's mass, M, is measured. Then, an additional field mass $m_1$ is added, and the change in elongation is measured. If an elongation of, for example, 10 cm results due to the Earth's mass, then the field mass results in a variation of only 0.67 pm.
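The order of magnitude in the spring-balance example can be reproduced with a short estimate. This is a rough sketch under stated assumptions: the caption does not give the field mass or its distance, so a 1 kg point mass placed 1 m directly below the test mass, a local gravity of 9.81 m/s², an approximate value of G, and a linear (Hooke's law) spring are all assumed here for illustration.

```python
# Approximate constants (assumed for illustration).
G = 6.674e-11   # m^3 kg^-1 s^-2
G_LOCAL = 9.81  # m s^-2, local acceleration of gravity


def elongation_change(x_earth, field_mass, distance, g_local=G_LOCAL):
    """Extra elongation of a linear spring caused by a point-like field mass.

    For a Hooke's-law spring the elongation is proportional to the force, so
    the change scales as the ratio of the field-mass acceleration to g.
    """
    a_field = G * field_mass / distance ** 2
    return x_earth * a_field / g_local


# Hypothetical configuration: 10 cm static elongation, 1 kg field mass at 1 m.
delta = elongation_change(x_earth=0.10, field_mass=1.0, distance=1.0)
print(f"change in elongation: {delta * 1e12:.2f} pm")
```

With these assumed values the change comes out at a fraction of a picometer, the same order as the 0.67 pm quoted in the caption; the exact figure depends on the geometry and on the value of g used.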

FIG. 5.
Schematic drawing of a simplified torsion balance. The torsion fiber hangs vertically like a plumb line. The pendulum bob shown here is called the dumbbell. If the torsion balance is not balanced (i.e., $m_1 \neq m_2$), the bob is angled with respect to the horizontal plane such that the center of mass (COM) is below the suspension point. The fiber is only sensitive to torques around its axis, which is vertical.

FIG. 6.
Measured noise and thermal noise of a torsional oscillator with $\kappa = 774\ \mathrm{pN\,m\,rad^{-1}}$. The top plot shows the amplitude spectral density of the torsional angle θ. The bottom plot shows the amplitude spectral density of the torque. Ideally, the signal that is to be measured is placed at the minimum value of the torque noise, here about 3 mHz.

FIG. 8.
The real and imaginary parts of a simple model of a real spring consisting of an ideal spring in parallel to a Maxwell unit (see Fig. 7).

FIG. 9.
Cut-away drawing of the torsion balance used by Gundlach and Merkowitz to determine G. This instrument has measured G with the smallest relative standard deviation to date, 13.6 ×

The principle of the experiment conducted at the University of Zurich. The two gray cylinders (field masses) can be either together (T) or apart (A). Either one of the two test masses is connected to the mass comparator to measure its weight given by $m(g + g_z)$, where g is the local acceleration of gravity at the test mass position and $g_z$ is the additional vertical field produced by the source masses. On either side of the drawing, $g_z$ is shown.

(a) In a free-fall absolute gravimeter, a test mass contains a retroreflector $M_{\mathrm{obj}}$, which is part of a Mach-Zehnder laser interferometer. The test mass is released in vacuum and its free-fall path is traced with respect to an inertially isolated reference retroreflector ($M_{\mathrm{ref}}$). BS denotes beam splitters; M denotes mirrors. The interference signal registered with the detector Det contains the information about the acceleration due to gravity. For repeated measurements, the test mass is lifted up with an elevator. The alternating positions of a field mass are sketched by the two rings. (b) The graph shows the qualitative field strength of the ring-shaped field mass. Two extrema appear. The trajectory of the test mass is adjusted to precisely cover the range of the extrema in order to minimize positioning errors. When the field mass is in the lower (L/Pos 2) position, the measured gravity is higher than the local gravity. When the field mass is positioned above (U/Pos 1) the test mass, the measured gravity is lower than the local Earth's gravity. The theoretical effective gravity from the source mass is obtained by integrating over the field strength covered by the trajectory.

Differential gradiometer principle. Test masses $\mathrm{TM}_1$, $\mathrm{TM}_2$, and $\mathrm{TM}_3$ are in simultaneous free fall. The ring-shaped field masses perturb the local gravity field. Due to the differential character of the setup, only the gravity of the field masses is measured, not the Earth's gravity or its gradient (to first order) (M: mirror, BS: beam splitter, d: distance between upper and lower test mass).