Download 4. Weighty Arguments - The University of Arizona – The Atlas Project
4. Weighty Arguments

4.1 Immovable Spacetime

My argument for the notion of space being really independent of body is founded on the possibility of the material universe being finite and moveable. 'Tis not enough for this learned writer [Leibniz] to reply that he thinks it would not have been wise and reasonable for God to have made the material universe finite and moveable… Neither is it sufficient barely to repeat his assertion that the motion of a finite material universe would be nothing, and (for want of other bodies to compare it with) would produce no discoverable change, unless he could disprove the instance which I gave of a very great change that would happen, viz., that the parts would be sensibly shocked by a sudden acceleration or stopping of the motion of the whole: to which instance, he has not attempted to give any answer.
Samuel Clarke, 1716

Although the words "relativity" and "relational" share a common root, their meanings are quite different. The principle of relativity asserts that for any material particle in any state of motion there exists a system of space and time coordinates in terms of which the particle is instantaneously at rest and inertia is homogeneous and isotropic. Thus the natural (inertial) decomposition of spacetime intervals into temporal and spatial components can be defined only relative to some particular frame of reference. Of course, the absolute spacetime intervals themselves are invariant, so the "relativity" refers only to the analytical decomposition of these intervals. (The physical significance of this particular decomposition is that the quantum phase of any object evolves in proportion to its "natural" temporal coordinate.) In contrast, the principle of relationism asserts that the absolute intervals between material objects fully characterize their extrinsic positional status, without reference to any underlying non-material system of reference which might be called "absolute space".
The traditional debate between proponents of relational and absolute motion (such as Leibniz and Clarke, respectively) is of questionable relevance if continuous fields are accepted as extended physical entities, permeating all of space, because this implies there are no unoccupied locations. In this context every point in the entire spacetime manifold is a vertex of actual relations between physical entities, obscuring the distinction between absolute and relational premises. Moreover, in the context of the general theory of relativity, the metrical properties of spacetime itself constitute a field, i.e., an extended physical entity, which not only acts upon material objects but is also acted upon by them, so the absolute-relational distinction has no clear meaning. However, it remains possible to regard fields as only representations of effects, and to insist on materiality for ontological objects, in which case the absolute-relational question remains both relevant and unresolved. Physicists have always recognized the appeal of a purely relational theory of motion, but every such theory has foundered on the same problem, namely, the physicality of acceleration. For example, one of Newton’s greatest challenges was to account for the fact that the Moon is relationally stationary with respect to the Earth (i.e., the distance between Earth and Moon is roughly unchanging), whereas it ought to be accelerating toward the Earth due to the influence of gravity. What is holding the Moon up? Or, to put the question differently, why is the Moon not accelerating directly toward the Earth in accord with the gravitational force that is presumably being applied to it? 
Newton's brilliant answer was that the Moon is indeed accelerating directly toward the Earth, and with precisely the magnitude of acceleration predicted by his gravity formula, but the Moon is also moving perpendicularly to the Earth-Moon axis, with a velocity $v = \omega R$, where $R$ is the Earth-Moon distance and $\omega$ is the Moon's angular velocity, i.e., roughly $2\pi$ radians/month. If it were not accelerating toward the Earth, the Moon would just wander off tangentially away from the Earth, but the force of gravity is modifying its velocity by adding $GM/R^2$ ft/sec toward the Earth each second, which causes the Moon to turn continually in a roughly circular orbit around the Earth. The centripetal acceleration of an object revolving in a circle is $v^2/R = \omega^2 R$, and so (Newton reasoned) this must equal the gravitational acceleration. Thus we have $\omega^2 R^3 = GM$, which of course is Kepler's third law.

This explanation depends on a strictly non-relational concept of motion. In fact, it might be said that this was the crucial insight of Newtonian dynamics - and it applies no less in the special theory of relativity. For the purposes of dynamical analysis, motion must be referred to an absolute background class of rectilinear inertial coordinate systems, rather than simply to the relations between material bodies, or even classical fields. Thus we cannot infer everything important about an object's state of motion simply from its distances to other objects (at least not to nearby objects). In this sense, both Newtonian and relativistic physics find it necessary to invoke absolute space.
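Newton's balance between centripetal and gravitational acceleration for the Moon can be checked numerically. The sketch below writes ω for the Moon's angular velocity and R for the Earth-Moon distance. The numerical constants are not from the text: they are assumed modern values in SI units. The Moon's own mass and the eccentricity of its orbit are ignored, so agreement is expected only to within a percent or so.

```python
import math

# Assumed modern values (not given in the text), SI units.
GM_EARTH = 3.986004e14               # Earth's gravitational parameter, m^3/s^2
MOON_DISTANCE = 3.844e8              # mean Earth-Moon distance R, m
SIDEREAL_MONTH = 27.321661 * 86400   # one revolution of the Moon, s


def centripetal_acceleration(radius, period):
    """Acceleration omega^2 * R of uniform circular motion."""
    omega = 2 * math.pi / period     # angular velocity, rad/s
    return omega ** 2 * radius


def gravitational_acceleration(gm, radius):
    """Inverse-square acceleration GM / R^2."""
    return gm / radius ** 2


centripetal = centripetal_acceleration(MOON_DISTANCE, SIDEREAL_MONTH)
gravitational = gravitational_acceleration(GM_EARTH, MOON_DISTANCE)
print(f"omega^2 R = {centripetal:.4e} m/s^2")
print(f"GM / R^2  = {gravitational:.4e} m/s^2")
```

The two figures agree to about one percent, which is the content of the relation ω²R³ = GM for a roughly circular orbit.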
But this concept of absolute space presents us with an ontological puzzle, because we can empirically verify the physical equivalence of all uniform states of motion, which suggests that position and velocity have no absolute physical significance, and yet we can also verify that changes in velocity (i.e., accelerations) do have absolute significance, independent of the relations between material bodies (at least in a local sense). If the evident relativity of position and velocity leads us to discard the idea of absolute space, then how are we to understand the apparent absoluteness of acceleration? Some have argued that in order for the change in something to be ontologically real, it is necessary for the thing itself to be real, but of course that's not the case. It's perfectly possible for "the thing itself" to be an artificial conception, whereas the "change" is the ontological entity. For example, the Newtonian concept of the physical world is a set of particles, between which relations exist. The primary ontological entities are the particles, but it's equally possible to imagine that the separations are the "real" entities, and particles are merely abstract entities, i.e., a convenient bookkeeping device for organizing the facts of a set of separations.

This raises some interesting questions, such as whether an unordered multiset of $n(n-1)/2$ separations suffices to uniquely determine a configuration of $n$ points in a space of fixed dimension. It isn't difficult to find examples of multisets of separations that allow for multiple distinct spatial arrangements. For example, given the multiset of ten separations we can construct either of the two five-point configurations shown below.

[Figure: two distinct five-point configurations with the same multiset of ten separations]

For another example, the following three distinct configurations of eight co-planar points each have the same multiset of 28 point-to-point separations:

[Figure: three distinct configurations of eight co-planar points with the same multiset of 28 separations]

In fact, of the 12870 possible arrangements of eight points on a 4x4 grid, there are only 1120 distinct multisets of separations.
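The grid count can be explored by brute force. The following is a minimal sketch, not the author's method: it enumerates every choice of eight points on a 4x4 integer grid, reduces each choice to its unordered multiset of squared separations, and counts how many different multisets occur. The function names are illustrative. Squared distances are used so that all arithmetic stays in exact integers; this does not change which configurations share a multiset.

```python
from collections import Counter
from itertools import combinations


def separation_multiset(points):
    """Sorted tuple of squared pairwise distances, i.e. an unordered multiset."""
    return tuple(sorted(
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in combinations(points, 2)
    ))


def count_multisets(grid_size=4, n_points=8):
    """Return (number of arrangements, number of distinct separation multisets)."""
    grid = [(x, y) for x in range(grid_size) for y in range(grid_size)]
    tally = Counter(
        separation_multiset(config) for config in combinations(grid, n_points)
    )
    return sum(tally.values()), len(tally)


total, distinct = count_multisets()
print("arrangements:", total)
print("distinct multisets of separations:", distinct)
```

The first number is the binomial coefficient C(16, 8). The second can be compared with the figure quoted in the text. Separating the reduction due to rotations, reflections and translations from genuinely distinct configurations would require an additional congruence test, which this sketch does not attempt.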
Much of this reduction is due to rotations and reflections, but not all. Intrinsically distinct configurations of points with the same multiset of distances are not uncommon. They are sometimes called isospectral sets, referring to the spectrum of point-to-point distances. Examples such as these may suggest that unordered separations cannot be the basis of our experience, although we can't rule out, a priori, the possibility that our interpretation of experience is non-unique, and that different states of consciousness might perceive a given physical configuration differently. Even if we reject the possibility of non-unique mapping to our conventional domain of objects, we could still imagine a separation-based ontology by stipulating an ordering for those separations. (One hypothetical form which laws of separation might take is discussed in Section 4.2.) By recognizing the need to specify this ordering, our focus shifts back to a particle-based ontology. As noted previously, according to both Galilean and Einsteinian (special) relativity, position and velocity are relative but acceleration is not. However, it can be argued that the absoluteness of acceleration is incongruous with Galilean spacetime, because if spacetime was Galilean there would be no reason for acceleration to be absolute. This was already alluded to in the discussion of Section 1.8, where the cyclic symmetry of the velocity relations between three Galilean reference systems was noted. In a sense, the relationist Leibniz was correct in asserting that absolute space and time are inconsistent with Galilean relativity, citing the “principle of sufficient reason” in support of this claim. 
If time and space are separate and distinct (which no one had ever disputed) then there would be no observable distinction between accelerated and un-accelerated systems of reference, as revealed by the fact that the concept of a moveable rigid body of arbitrary size is perfectly consistent with the kinematics of Galilean relativity. Samuel Clarke had argued that if all the material in some finite universe was accelerated in tandem, maintaining all the intrinsic relations between the particles, this acceleration would still be physically real, even though no one could observe the acceleration (for lack of anything to compare with it). Leibniz replied:

"Motion does not indeed depend upon being observed, but it does depend upon being possible to be observed. There is no motion when there is no change that can be observed. And when there is no change that can be observed, there is no change at all. The contrary opinion is grounded upon the supposition of a real absolute space, which I have demonstratively confuted by the principle of the want of a sufficient reason of things."

It is quite right that, in the context of Galilean relativity, the acceleration of all the matter of the universe in tandem would be strictly unobservable, so Leibniz has a valid point. However, barring some Machian long-range influence which neither Clarke nor Leibniz seems to have imagined, the same argument implies that inertia should not exist at all. Thus Clarke was correct in pointing out that the very existence of inertia refutes Leibniz’s position. There is indeed an observable distinction between uniform and accelerated motion, i.e., inertia does exist. In summary, Leibniz was correct in (effectively) claiming that the existence of inertia is logically incompatible with the Galilean concept of space and time, whereas Clarke was correct in pointing out that inertia does actually exist.
The only way out of this impasse would have been to discard the one premise that neither of them ever questioned, namely, the Galilean concept of space and time. It was to be another 200 years before a viable alternative to Galilean spacetime was recognized.

As explained in Section 1, the spacetime structures of Galileo and Minkowski are formally identical if the characteristic constant c of the latter is infinite. In that case it follows that arbitrarily large rigid bodies are possible, so it is conceivable for all the material in an arbitrarily large region to accelerate in tandem, maintaining all the same intrinsic spatial relations. However, if c has some finite value, this is no longer the case. Section 2.9 described the kinematic limitation on the size of a spatial region in which objects can be accelerated in tandem. Hence the structure of Minkowski spacetime intrinsically distinguishes uniform motion as the only kind of motion that could be applied in tandem to all objects throughout space. In this context, Leibniz’s principle of sufficient reason can be used to argue that different states of uniform motion should not be regarded as physically different, but it cannot be applied to accelerated motion, because the very kinematics of Minkowski spacetime do not permit the tandem acceleration of objects over arbitrarily large regions.

It seems justifiable to say that the existence of inertia implies the Minkowski character of spacetime. This goes some way towards resolving the epistemological problems that have often been raised against the principle of inertia. To the question “How are we to distinguish the inertial coordinate systems from all possible systems of reference?”, we can answer that the inertial coordinate systems are precisely those in terms of which two objects separated by an arbitrary distance can be accelerated in tandem.
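To give a sense of scale for the kinematic limitation mentioned above, the sketch below assumes that the limit in question is the standard one for rigid (Born-rigid) acceleration: a body whose reference point has proper acceleration a cannot extend farther than c²/a behind that point. Section 2.9 is not reproduced here, so this identification is an assumption, as are the function name and the value used for the light-year.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s (exact by definition of the metre)
LIGHT_YEAR = 9.4607e15           # m, assumed approximate value


def tandem_acceleration_limit(proper_acceleration):
    """Assumed limit c^2 / a on the trailing extent of a rigidly accelerated body, in metres."""
    return SPEED_OF_LIGHT ** 2 / proper_acceleration


# One Earth gravity: the limit is a little under one light-year.
limit_1g = tandem_acceleration_limit(9.81)
print(f"c^2/a at 9.81 m/s^2: {limit_1g:.3e} m = {limit_1g / LIGHT_YEAR:.2f} light-years")

# As c goes to infinity the limit grows without bound, recovering the Galilean case.
```

On this assumption the limit is finite for any nonzero acceleration, and it shrinks as the acceleration grows, whereas uniform motion (a = 0) is unrestricted.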
This doesn’t help to identify inertial coordinate systems in Galilean spacetime, but it fully identifies them in the context of Minkowski spacetime. So, it can be argued that (from an epistemological standpoint) Minkowski spacetime is the only satisfactory framework for the principle of inertia.

Still, there remain some legitimate open issues regarding any (so far) conceived relativistic spacetime. According to both classical and special relativity, the inertial coordinate systems are fully symmetrical, and each one is regarded as physically equivalent (in the absence of matter). In particular, we cannot single out one particular inertial system and claim that it is the "central" frame, because the equivalence class has no center, and all ontological qualities are uniformly distributed over the entire class. Unfortunately, from a purely formal standpoint, a purported uniform distribution over inertial frames is somewhat problematic, because the inertial systems of reference along a single line can only be linearly parameterized in terms of a variable that ranges from $-\infty$ to $+\infty$, such as $q = \log((1+v)/(1-v))$, but if each value of $q$ is to be regarded as equally probable, then we are required to imagine a perfectly uniform density distribution over the real numbers. Mathematically, no such distribution exists. To illustrate, imagine trying to select a number randomly from a uniform distribution of all the real numbers. This is the source of many well-known mathematical conundrums, such as the "High-Low Number" strategy game, whose answer depends on the fact that no perfectly uniform distribution exists over the real numbers (nor even over the integers). In trying to understand whether there was any arbitrary choice in the creation of the physical world, it’s interesting to note that the selection of our particular rest frame cannot have been perfectly arbitrary from a set of pre-existing alternatives.
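The sense in which q = log((1+v)/(1−v)) parameterizes the inertial frames "linearly" can be illustrated numerically: under the relativistic composition of collinear velocities, q simply adds. The sketch below assumes velocities are expressed as fractions of c, as the formula implies. The function names are illustrative.

```python
import math


def q_parameter(v):
    """The text's parameter q = log((1 + v) / (1 - v)), with v in units of c."""
    return math.log((1 + v) / (1 - v))


def compose(u, v):
    """Relativistic composition of two collinear velocities, in units of c."""
    return (u + v) / (1 + u * v)


u, v = 0.5, 0.5
combined = compose(u, v)
print("composed speed:", combined)
print("q(u) + q(v)   :", q_parameter(u) + q_parameter(v))
print("q(composed)   :", q_parameter(combined))

# q is unbounded as v approaches 1, so the frames fill the whole real line in q.
for speed in (0.9, 0.99, 0.999999):
    print(speed, q_parameter(speed))
```

Because q is additive, each boost shifts every frame by the same amount along the q axis, which is why an "equally probable" choice of frame would require a uniform density over the entire real line.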
It might be argued that the impossibility of a choice between indistinguishable inertial reference frames implies that only an absolutist framework is intelligible. However, the identity of indiscernibles led Leibniz and Mach to argue just the opposite, i.e., that the only intelligible way to imagine the existence of objects, all in roughly the same frame of reference within a perfectly symmetrical class of possible reference systems, is to imagine that the objects themselves are in some way responsible for the class, which brings us back to pure relationism.

Alas, as we’ve seen, pure relationism has its own problematic implications. For one, there has traditionally been a close association between relationism and the concept of absolute simultaneity. This is because the “relations” were regarded as purely spatial, and it was necessary to posit a unique instant of time in which to evaluate those spatial relations. To implement a spatial relationist theory in the framework of Minkowski spacetime would evidently require that whatever laws apply to the spatial relations for one particular decomposition of spacetime must also apply to all other decompositions. (A simple example of this is discussed in Section 4.2.) Alternatively, we might say that only invariant quantities should be subject to the relational laws, but this amounts to the same thing as requiring that the laws apply to all decompositions.

One common feature of all purely relational models based on Galilean space and time is their evident non-locality, because (as noted above) there is no way, if we limit ourselves to local observations, to identify the inertial motions of material objects purely from the kinematical relations between them. We're forced to attribute the distinction between inertial and non-inertial motion to some non-material (or non-local) interaction.
This is nicely illustrated by Einstein's thought experiment (based on Newton's famous "spinning pail") involving two nominally identical fluid globes S1 and S2 floating in an empty region of space. One of these globes is set rotating (about their common axis) while the other remains stationary. The rotating globe assumes an oblate shape due to its rotation. If globes are mutually stationary and not rotating, they are both spherical and symmetrical, and we cannot distinguish between them, but if one of the globes is spinning about their common axis, the principle of inertia leads us to expect that the spinning globe will bulge at the "equator" and shrink along its axis of rotation due to the centripetal forces. The "paradox" (for the relationist) is that each globe is spinning with respect to the other, so they must still be regarded as perfectly symmetrical, and yet their shapes are no longer congruent. To what can we attribute the asymmetry? If we look further afield we may notice that the deformed globe is rotating relative to all the distant stars, whereas the spherical globe is not. A little experimentation shows that a globe's deformation is strictly a function of its speed of rotation relative to the distant stars, and presumably this is not a mere coincidence. Newton's explanation for this coincidence was to argue that the local globes and the distant stars all reside in the same absolute space, and it is this space that defines absolute (inertial) motion, and likewise the special relativistic theory invokes an absolutely preferred class of reference frames. Moreover, in the general theory of relativity, when viewed from a specific cosmological perspective, there is always a preferred frame of reference, owing to the global boundary conditions that must be imposed in order to single out a solution. 
This came as a shock to Einstein himself at first, since he was originally thinking (hoping) that the field equations of general relativity represented true relationism, but his conversion began when he received Schwarzschild's exact solution for spherical symmetry, which of course exhibits a preferred coordinate system such that the metric coefficients are independent of time, i.e., the usual Schwarzschild coordinates, which are essentially unique for that particular solution. Likewise for any given solution there is some globally unique system of reference singled out by symmetry or boundary conditions (even for asymptotically flat universes, as Einstein himself showed). For example, in the Friedman "big bang" cosmologies there is a preferred global system of coordinates corresponding to the worldlines with respect to which the cosmic background radiation is isotropic. Of course, this is not a fresh insight. The non-relational global aspects of general relativistic cosmologies have been extensively studied, beginning with Einstein's 1917 paper on the subject, and continuing with Gödel's rotating empty universes, and so on. Such examples make it clear that general relativity is not a relational theory of motion. In other words, general relativity does not correlate all physical effects with the relations between material bodies, but rather with the relations between objects (including fields) and the absolute background metric, which is affected by, but is not determined by, the distribution of objects (except arguably in closed cosmological models). Thus relativity, no less than Newtonian mechanics, relies on spacetime as an absolute entity in itself, exerting influence on fields and material bodies. The extra information contained in the metric of spacetime is typically introduced by means of boundary conditions or "initial values" on a spacelike foliation, sufficient to fix a solution of the field equations. 
In this way relativity very quickly disappointed its early logical-positivist supporters when it became clear that it was not, and never had been, a relational theory of motion, in the sense of Leibniz, Berkeley, or Mach. Initially even Einstein was disturbed by the Schwarzschild and de Sitter solutions (see Section 7.6), which represent complete metrical manifolds with only one material object or none at all (respectively). These examples showed that spacetime in the theory of relativity cannot simply be regarded as the totality of the extrinsic relations between material objects (and non-gravitational fields), but is a primary physical entity of the theory, with its own absolute properties, most notably the metric with its related invariants, at each point. Indeed this was Einstein's eventual answer to Mach's critique of pre-relativity physics. Mach had complained that it was unacceptable for our theories to contain elements (such as spacetime) that act on (i.e., have an effect on) other things, but that are not acted upon by other things. Mach, and the other relationalists before him, naturally expected this to be resolved by eliminating spacetime, i.e., by denying that an entity called "spacetime" acts in any physical way. To Mach's surprise (and unhappiness), the theory of relativity actually did just the opposite - it satisfied Mach's criticism by instead making spacetime a full-fledged element of theory, acted upon by other objects. By so doing, Einstein believed he had responded to Mach's critique, but of course Mach hated it, and said so. Early in his career, Einstein was sympathetic to the idea of relationism, and entertained hopes of banishing absolute space from physics but, like Newton before him, he was forced to abandon this hope in order to produce a theory that satisfactorily represents our observations. The absolute significance of spacetime in the theory of relativity was already obvious from trivial considerations of the special theory. 
The twins paradox is a good illustration of why relativity cannot be a relational theory, because the relation between the twins is perfectly symmetrical, i.e., the spatial distance between them starts at zero, increases to some maximum value, and then decreases back to zero. The distinction between the twins cannot be expressed in terms of their mutual relations to each other, but only in terms of how each of their individual worldlines is embedded in the absolute metrical manifold of spacetime. This becomes even more obvious in the context of general relativity, because we can then have multiple distinct geodesic paths between two given events, with different lapses of proper time, so we cannot even appeal to any difference in "felt" accelerations or local physics of any kind along the two world-paths to account for the asymmetry. Hopes of accounting for this asymmetry by reference to the distant stars, à la Mach, were certainly not fulfilled by general relativity, according to which the metric of spacetime is conditioned by the presence of matter, but only to a very slight degree in most circumstances. From an overall cosmological standpoint we are unable to attribute the basic inertial field to the configuration of mass and energy, and we have no choice but to simply assume a plausible absolute inertial background field, just as in Newtonian physics, in order to actually make predictions and solve problems. This is necessarily a separate and largely independent stipulation from our assumed distribution of matter and energy.

To understand why Galilean relativity is actually more relational than special relativity, note that the unified spacetime manifold with the lightcone structure of Minkowski spacetime is more rigid than a pure Cartesian product of a three-dimensional spatial manifold and an independent one-dimensional temporal manifold.
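Returning briefly to the twins: the point that their mutual separation is symmetrical while their elapsed proper times differ can be made concrete with a small calculation in flat spacetime. This is a minimal sketch using units with c = 1 and an idealized instantaneous turnaround. The particular worldlines and the function name are illustrative choices, not taken from the text.

```python
import math


def proper_time(events):
    """Sum of intervals sqrt(dt^2 - dx^2) along a piecewise-inertial worldline (c = 1)."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        total += math.sqrt(dt * dt - dx * dx)   # requires |dx| <= |dt| (timelike leg)
    return total


# Both worldlines join the same two events, (t, x) = (0, 0) and (10, 0).
stay_at_home = [(0.0, 0.0), (10.0, 0.0)]
traveler = [(0.0, 0.0), (5.0, 4.0), (10.0, 0.0)]   # speed 0.8 outbound, 0.8 inbound

print("stay-at-home proper time:", proper_time(stay_at_home))   # 10.0
print("traveler proper time    :", proper_time(traveler))       # 6.0
```

The distance between the twins rises from zero to 4 and falls back to zero, and that description applies equally to either twin. The difference in elapsed time comes entirely from how each worldline sits in the metric.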
In Galilean spacetime at a spatial point $P_0$ and time $t_0$ there is no restriction at all on the set of spatial points at $t_0 + dt$ that may "spatially coincide with $P_0$" with respect to some valid inertial frame of reference. In other words, an inertial worldline through $P_0$ at time $t_0$ can pass through any point in the entire universe at time $t_0 + dt$ for any positive $dt$. In contrast, the lightcone structure of Minkowski spacetime restricts the future of the point $P_0$ to points inside the future null cone, i.e., $P_0 \pm c\,dt$, and as $dt$ goes to zero, this range goes to zero, imposing a well-defined unique connection from each "infinitesimal" instant to the next, which of course is what the unification of space and time into a single continuum accomplishes.

We referred above to Newtonian spacetime without distinguishing it from what has come to be called Galilean spacetime. This is because Newton's laws are manifestly invariant under Galilean transformations, and in view of this it would seem that Newton should be counted as an advocate of relativistic spacetime. However, in several famous passages of the first Scholium of the Principia Newton seems to reject the very relativity on which his physics is founded, and to insist on distinctly metaphysical conceptions of absolute space and time. He wrote:

"I do not define the words time, space, place, and motion, since they are well known to all. However, I note that people commonly conceive of these quantities solely in terms of the relations between the objects of sense perception, and this is the source of certain preconceptions, for the dispelling of which it is useful to distinguish between absolute and relative, true and apparent, mathematical and common."

It isn't trivial to unpack the intended significance of these statements, especially because Newton has supplied three alternate names for each of the two types of quantities that he wishes us to distinguish.
On one hand we have absolute, true, mathematical quantities, and on the other we have relative, apparent, common quantities. The latter are understood to be founded on our sense perceptions, so the former presumably are not, which seems to imply that they are metaphysical. However, Newton also says that this distinction is useful for dispelling certain prejudices, which suggests that his motives are utilitarian and/or pedagogical rather than to establish an ontology. He continues Absolute, true, and mathematical time, in and of itself and of its own nature flows uniformly (equably), without reference to anything external. By another name it is called duration. Relative, apparent, and common time is any sensible external measure of duration by means of motion. Such measures (for example, an hour, a day, a month, a year) are commonly used instead of true time. Absolute space, in its own nature, without relation to anything external, remains always similar and immovable. Relative space is some movable measure of absolute space, which our senses determine by the positions of bodies... Absolute and relative space are of the same type (species) and magnitude, but are not always numerically the same... Place is a part of space which a body takes up, and is according to the space either absolute or relative. Absolute motion is the translation of a body from one absolute place to another, and relative motion is the translation from one relative place to another. Newton's insistence on the necessity of referring all true motions to "immovable space" has often puzzled historians of science, because his definition of absolute space and time are plainly metaphysical, and it's easy to see that Newton's actual formulation of the laws of physics is invariant under Galilean transformations, and the concept of absolute space plays no role. 
Indeed, each mention of a "state of rest" in the definitions and laws is accompanied by the phrase "or uniform motion in a right line", so the system built on these axioms explicitly does not distinguish between these two concepts. What, then, did Newton mean when he wrote that true motions must be referred to immovable space? The introductory Scholium ends with a promise to explain how the true motions of objects are to be determined, declaring that this was the purpose for which the Principia was composed, so it's all the more surprising when we find that the subject is never even mentioned in Books I or II. Only in the concluding Book III, "The System of the World", does Newton return to this subject, and we finally learn what he means by "immovable space". Although his motto was "I frame no hypotheses" we find, immediately following Proposition X in Book III (in the third edition) the singular hypothesis HYPOTHESIS I: That the centre of the system of the world is immovable. In support of this remarkable assertion, Newton simply says "This is acknowledged by all, although some contend that the earth, others that the sun, is fixed in that centre." In the subsequent proposition XI we finally discover Newton's immovable space. He writes PROPOSITION XI: That the common centre of gravity of the earth, the sun, and all the planets, is immovable. For that centre either is at rest or moves uniformly forwards in a right line; but if that centre moved, the center of the world would move also, against the Hypothesis. This makes it clear that Newton's purpose all along has been not to deny Galilean relativity or the fundamental principle of inertia, but simply to show that a suitable system of reference for determining true inertial motions need not be centered on some material body. This was foreshadowed in the first Scholium when he wrote "it may be that there is no body really at rest, to which the places and motions of others may be referred". 
Furthermore, he notes that many people believed the immovable center of the world was at the center of the Earth, whereas others followed Copernicus in thinking the Sun was the immovable center. Newton evidently (and rightly) regarded it as one of the most significant conclusions of his deliberations that the true inertial center of the world was in neither of those objects, but is instead the center of gravity of the entire solar system. We recall that Galileo found himself in trouble for claiming that the Earth moves, whereas both he and Copernicus believed that the Sun was absolutely stationary. Newton showed that the Sun itself moves, as he continues:

"PROPOSITION XII: That the sun is agitated by a continual motion, but never recedes far from the common centre of gravity of all the planets. For since the quantity of matter in the sun is to the quantity of matter in Jupiter as 1067 to 1, and the distance to Jupiter from the sun is to the semidiameter of the sun in a slightly greater proportion, the common center of gravity of Jupiter and the sun will fall upon a point a little without the surface of the sun."

This was certainly a magnificent discovery, worthy of being called the purpose for which the Principia was composed, and it is clearly what Newton had in mind when he wrote the introductory Scholium promising to reveal how immovable space (i.e., the center of the world) is to be found. In this context we can see that Newton was not claiming the ability to determine absolute rest, but rather the ability to infer from phenomena a state of absolute inertial motion, which he identified with the center of gravity of the solar system. He very conspicuously labels as a Hypothesis (one of only three in the final edition of the Principia) the conventional statement, "acknowledged by all", that the center of the world is immovable.
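Newton's claim in Proposition XII about the Sun and Jupiter is easy to check. For two bodies with mass ratio M/m separated by a distance d, the common center of gravity lies at d/(1 + M/m) from the center of the heavier body. The sketch below uses Newton's mass ratio of 1067 to 1 from the quoted passage. The Sun-Jupiter distance and the solar radius are assumed modern values, not figures from the text.

```python
SUN_TO_JUPITER_MASS_RATIO = 1067.0   # Newton's figure, as quoted in the text
JUPITER_DISTANCE_KM = 7.785e8        # assumed modern mean Sun-Jupiter distance
SUN_RADIUS_KM = 6.957e5              # assumed modern solar radius


def barycenter_offset(distance, mass_ratio):
    """Distance of the two-body center of gravity from the heavier body's center."""
    return distance / (1.0 + mass_ratio)


offset_km = barycenter_offset(JUPITER_DISTANCE_KM, SUN_TO_JUPITER_MASS_RATIO)
ratio_to_radius = offset_km / SUN_RADIUS_KM
print(f"center of gravity lies {offset_km:.0f} km from the Sun's center")
print(f"that is {ratio_to_radius:.3f} solar radii")
```

With these inputs the point falls a few percent outside the solar surface, in line with Newton's "a little without the surface of the sun".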
By these statements he was trying to justify calling the solar system's inertial center the center of the world, while specifically acknowledging that the immovability of this point is conventional, since it could just as well be regarded as moving "uniformly forwards in a right line". The modern confusion over Newton's first Scholium arises from trying to impose an ontological interpretation on a 17th century attempt to isolate the concept of pure inertia, and incidentally to locate the "center of the world". It was essential for Newton to make sure his readers understood that "uniform motion" and "right lines" cannot generally be judged with reference to neighboring bodies (such as the Earth's spinning surface), because those bodies themselves are typically in non-uniform motion. Hence he needed to convey the fact that the seat of inertia is not the Earth's center, or the Sun, or any other material body, but is instead absolute space and time - in precisely the same sense that spacetime is absolute in special relativity. This is distinct from asserting an absolute state of rest, which Newton explicitly recognized as a matter of convention. Indeed, we now know the solar system itself revolves around the center of the galaxy, which itself moves with respect to other galaxies, so under Hypothesis I we must conclude that Proposition XI is strictly false. Nevertheless, the deviations from true inertial motion represented by those stellar and galactic motions are so slight that Newton's "immovable center of the world" is still suitable as the basis of true inertial motion for nearly all purposes. In a more profound sense, the concept of "immoveable space" has been carried over into modern relativity because, as Einstein said, spacetime in general relativity is endowed with physical qualities that enable it to establish the local inertial frames, but "the idea of motion may not be applied to it".
4.2 Inertial and Gravitational Separations

And I am dumb to tell a weather’s wind
How time has ticked a heaven round the stars.
Dylan Thomas, 1934

The special theory of relativity is formulated as a local theory, so its natural focus is on the worldlines of individual particles. In addition, special relativity presupposes a preferred class of worldlines, those representing inertial motion. The idea of a worldline is inherently “absolute” in the sense that it is nominally defined with reference only to a system of space and time coordinates, not to any other objects. This is in contrast to a truly relational theory, which would take the "dual" approach, and regard the separations between particles as the most natural objects of study. In fact, as mentioned in Section 4.1, we could go to the relationist extreme of regarding separations as the primary ontological entities, and considering particles to be merely abstract concepts that we use to psychologically organize and coordinate the separations. The relationist view arguably has the advantage of not presupposing a fixed background or even a definite dimensionality of space, since each “separation” could be considered to represent an independent degree of freedom. Of course, this freedom doesn’t seem to exist in the real world, since we cannot arrange five particles all mutually equidistant from each other. Indeed it appears that the $n(n-1)/2$ separations between n particles can be fully encoded as just 3n real numbers, and moreover that those real numbers vary continuously as the individual particles “move”. This is the justification for the idea of particles moving in a coherent three-dimensional space. Nevertheless, it’s interesting to examine the spatial separations that exist between material particles (as opposed to the space and time coordinates of individual particles), to see if their behavior can be characterized in a simple way.
From this point of view, the idea of "motion" is secondary; we simply regard separations as abstract entities having certain properties that may vary with time. In this context, rather than discussing inertial motion of an individual particle, we consider the spatial separation (as a function of time) between two inertial particles. However, since we don’t presuppose a background of absolute inertial motion, we will refer to the particles as being “co-inertial”, meaning simply that the spatial separation between them behaves like the separation between two particles in absolute inertial motion, regardless of whether the two particles are actually in absolute inertial motion. Is it possible to characterize in a simple way the spatial separations that exist between coinertial particles? Consider, for example, the spatial separation s(t) as a function of time between a stationary particle and a particle moving uniformly in a straight line through space, as depicted in the figure below for the condition when the direction of motion of the moving particle B is perpendicular to the displacement from the stationary particle A. Obviously the separation between objects A and B in this configuration is stationary at this instant, i.e., we have ds/dt = 0, and yet we know from experience that this physical situation is distinct from one in which the two objects are actually stationary with respect to each other’s inertial rest frames. For example, the Moon and Earth are separated by roughly a constant distance, and yet we understand that the Moon is in constant motion perpendicular to its separation from the Earth. It is this transverse motion that counteracts the effect of gravity and keeps the Moon in its orbit. This is another reason that we ordinarily find it necessary to describe motion not in purely relational terms, but in terms of absolutely non-rotating systems of inertial coordinates. 
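The distinction drawn above between a momentarily stationary separation and genuine mutual rest can be made concrete with a short numerical sketch. It assumes a particle B passing a fixed particle A at closest distance d with constant speed v, so that $s(t) = \sqrt{d^2 + v^2 t^2}$; the function names and the rough Earth-Moon figures are illustrative assumptions, not values taken from the text.

```python
import math

def separation(t, d, v):
    """Distance between fixed particle A and particle B, which moves in a
    straight line at speed v and is closest (distance d) at t = 0."""
    return math.sqrt(d * d + (v * t) ** 2)

def separation_rate(t, d, v):
    """First derivative ds/dt of the separation."""
    return v * v * t / separation(t, d, v)

def separation_accel(t, d, v):
    """Second derivative d2s/dt2 of the separation, equal to (v d)^2 / s^3."""
    return (v * d) ** 2 / separation(t, d, v) ** 3

# Rough, assumed Earth-Moon figures in SI units (for illustration only).
D_MOON = 3.844e8  # metres
V_MOON = 1.022e3  # metres per second

if __name__ == "__main__":
    # At closest approach the separation is momentarily stationary...
    print("ds/dt   at t = 0:", separation_rate(0.0, D_MOON, V_MOON))
    # ...but its second derivative is v^2/d, which is not zero.
    print("d2s/dt2 at t = 0:", separation_accel(0.0, D_MOON, V_MOON))
```

At t = 0 the first derivative vanishes while the second derivative equals $v^2/d$, so an unaccelerated transverse motion makes the separation start to grow. Keeping the distance constant, as in the case of the Moon, requires an inward acceleration of just this size.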
Of course, as Mach observed, the apparent existence of “absolute rotation” doesn’t necessarily refute relationism as a viable basis for coordinating events. It could also mean that we must take more relations into account. (For example, the Moon’s motion is always tangential to the Earth, but it is not always tangential to other bodies, so its orbital motion does show up in the totality of binary separations.) Whether or not a workable physics could be developed on a purely relational basis is unclear, but it’s still interesting to examine the class of co-inertial separations as functions of time. It turns out that co-inertial separations are characterized by a condition that is nearly identical to the condition for linear gravitational free-fall, as well as for certain other natural kinds of motion.

The three orthogonal components $\Delta x$, $\Delta y$, and $\Delta z$ of the separation between two particles in unaccelerated motion relative to a common reference frame must be linear functions of time, i.e.,

$$\Delta x = a_x t + b_x, \qquad \Delta y = a_y t + b_y, \qquad \Delta z = a_z t + b_z$$

where the coefficients $a_i$ and $b_i$ are constants. Therefore the magnitude of any "co-inertial separation" is of the form

$$s(t) = \sqrt{A t^2 + B t + C}$$

where

$$A = a_x^2 + a_y^2 + a_z^2, \qquad B = 2\,(a_x b_x + a_y b_y + a_z b_z), \qquad C = b_x^2 + b_y^2 + b_z^2$$

Letting the subscript n denote the nth derivative with respect to time (so that $s_0 = s$), the first two derivatives of s(t) are

$$s_1 = \frac{2At + B}{2 s_0}, \qquad s_2 = \frac{4AC - B^2}{4 s_0^3}$$

The right hand equation shows that $s_2 s_0^3 = k$ for a constant k, and we can differentiate this again and divide the result by $s_0^2$ to show that the separation s(t) between any two particles in relatively unaccelerated (i.e., co-inertial) motion in Galilean spacetime must satisfy the equation

$$s_0 s_3 + 3 s_1 s_2 = 0 \qquad (1)$$

Now we consider the separation that characterizes an isolated non-rotating two-body system in gravitational free-fall. Assume the two bodies are identical particles, each of mass m. According to Newtonian theory the inertial and gravitational constraints are coupled together by the auxiliary quantity called "force" by the following equations

$$F = -\frac{G m^2}{s_0^2}, \qquad F = \frac{m\, s_2}{2}$$

where G is a universal constant.
(Note that each particle's "absolute" acceleration is half of the second derivative of their mutual separation with respect to time.) Equating these two forces gives $s_2 s_0^2 = -2Gm$. Differentiating this again and dividing through by $s_0$, we can characterize non-rotating gravitational free-fall by the purely kinematic equation

$$s_0 s_3 + 2 s_1 s_2 = 0 \qquad (2)$$

The formal similarity between equations (1) and (2) is remarkable, considering that the former describes strictly inertial separations and the latter describes gravitational separations. We can show how the two are related by considering general free motion in a gravitational field. The Newtonian equations of motion (in units with G = 1, where m is the mass of the source of the field) are

$$\frac{d^2 r}{dt^2} - r\omega^2 = -\frac{m}{r^2}, \qquad r\frac{d\omega}{dt} + 2\frac{dr}{dt}\,\omega = 0$$

where r is the magnitude of the distance from the center of the field and $\omega$ is the angular velocity of the particle. If we solve the left hand equation for $\omega$ and differentiate to give $d\omega/dt$, we can substitute these expressions into the right hand equation and re-arrange the terms to give

$$r_0 r_3 + 3 r_1 r_2 + \frac{m}{r_0^2}\, r_1 = 0$$

where again the subscript n denotes the nth derivative with respect to time. This applies (in the Newtonian limit) to arbitrary free paths of test particles in a gravitational field. Obviously if m = 0 this reduces to equation (1), representing free inertial separations, whereas for purely radial motion we have $d^2r/dt^2 = -m/r^2$, and so this reduces to equation (2), representing radial gravitational separation.

Other classes of physical separations also satisfy a differential equation similar to (1) and (2). For example, consider a particle of mass m attached to a rod in such a way that it can slide freely along the rod. If we rotate the rod about some point P then the particle in general will tend to slide outward along the rod away from the center of rotation in accord with the basic equation of motion

$$s_2 = \omega^2 s_0$$

where s is the distance from the center of rotation to the sliding particle, and $\omega$ is the (constant) angular velocity of the rod.
Differentiating and multiplying through by $s_0$ gives

$$s_0 s_3 = \omega^2 s_0 s_1$$

Then since $s_2 = \omega^2 s_0$, we see that s(t) satisfies the equation

$$s_0 s_3 - s_1 s_2 = 0 \qquad (3)$$

So, we have found that arbitrary co-inertial separations, non-rotating gravitational separations, and rotating radial separations are all characterized by a differential equation of the form

$$s_0 s_3 + N s_1 s_2 = 0 \qquad (4)$$

for some constant N (namely N = 3, 2, and −1 respectively). (Among the other solutions of this equation (with N = −1) are the elementary transcendental functions $e^t$, sin(t), and cos(t).) Solving for N, to isolate the arbitrary constant, we have

$$N = -\frac{s_0 s_3}{s_1 s_2}$$

Differentiating this gives the basic equation

$$s_1 s_2\,(s_1 s_3 + s_0 s_4) - s_0 s_3\,(s_2^2 + s_1 s_3) = 0$$

If none of $s_0$, $s_1$, $s_2$, and $s_3$ is zero, we can divide each term by all of these to give the interesting form

$$\frac{s_1}{s_0} + \frac{s_4}{s_3} = \frac{s_2}{s_1} + \frac{s_3}{s_2}$$

This could be seen as an (admittedly very simplistic) “unification” of a variety of physically meaningful spatial separation functions under a single equation. The “symmetry breaking” that leads to the different behavior in different physical situations arises from the choice of N, which appears as a constant of integration.

Incidentally, even though the above has been based on the Galilean spatial separations between objects as a function of Galilean time, the same conditions can be shown to apply to the absolute spacetime intervals between inertial particles as a function of their proper times. Relative to any point on the worldline of one particle, the four components $\Delta t$, $\Delta x$, $\Delta y$, and $\Delta z$ of the absolute interval to any other inertially moving particle are all linear functions of the proper time $\tau$ along the latter particle's worldline. Therefore, the components can be written in the form

$$\Delta t = a_t \tau + b_t, \qquad \Delta x = a_x \tau + b_x, \qquad \Delta y = a_y \tau + b_y, \qquad \Delta z = a_z \tau + b_z$$

where the coefficients $a_i$ and $b_i$ are constants. It follows that the absolute magnitude of any "co-inertial separation" is of the form

$$s(\tau) = \sqrt{A \tau^2 + B \tau + C}$$

where (in units with c = 1)

$$A = a_t^2 - a_x^2 - a_y^2 - a_z^2, \qquad B = 2\,(a_t b_t - a_x b_x - a_y b_y - a_z b_z), \qquad C = b_t^2 - b_x^2 - b_y^2 - b_z^2$$

Thus we have the same formal dependence as before, except now the parameter s represents the absolute spacetime separation.
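The claim that these separations all satisfy $s_0 s_3 + N s_1 s_2 = 0$ for a suitable constant N can be spot-checked numerically. The following is only a sketch: it estimates the derivatives by central finite differences rather than exactly, and the particular coefficients in `galilean`, `rod`, and `minkowski` are arbitrary illustrative choices, not values from the text.

```python
import math

def derivatives(f, t, h=1e-3):
    """Central-difference estimates of s0, s1, s2, s3 (f and its first
    three derivatives) at the point t."""
    s0 = f(t)
    s1 = (f(t + h) - f(t - h)) / (2 * h)
    s2 = (f(t + h) - 2 * s0 + f(t - h)) / h ** 2
    s3 = (f(t + 2 * h) - 2 * f(t + h) + 2 * f(t - h) - f(t - 2 * h)) / (2 * h ** 3)
    return s0, s1, s2, s3

def residual(f, t, n):
    """Value of s0*s3 + n*s1*s2, which should vanish for the right n."""
    s0, s1, s2, s3 = derivatives(f, t)
    return s0 * s3 + n * s1 * s2

def galilean(t):
    # Separation with components (t + 3, 2t, 2 - t): assumed constants.
    return math.sqrt((t + 3) ** 2 + (2 * t) ** 2 + (2 - t) ** 2)

def rod(t, omega=1.3):
    # A solution of s2 = omega^2 * s0 (particle sliding on a rotating rod).
    return math.cosh(omega * t)

def minkowski(tau):
    # Interval components (1.25 tau + 3, 0.75 tau + 1, 1, 0) with c = 1,
    # i.e. a particle moving at 0.6 c, parametrized by its proper time.
    return math.sqrt((1.25 * tau + 3) ** 2 - (0.75 * tau + 1) ** 2 - 1.0)

if __name__ == "__main__":
    cases = [("Galilean co-inertial", galilean, 3), ("rotating rod", rod, -1),
             ("exp", math.exp, -1), ("sin", math.sin, -1),
             ("spacetime interval", minkowski, 3)]
    for name, f, n in cases:
        print(f"{name:22s} N = {n:2d}  residual = {residual(f, 1.0, n):.2e}")
```

Each residual comes out at the level of the finite-difference error, whereas using the wrong N (for example N = 2 for the Galilean case) leaves a residual of order one. The last case is the spacetime interval between a fixed event and a point advancing along an inertial worldline, evaluated as a function of proper time.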
This shows that the absolute separation between any fixed point on one inertial worldline and a point advancing along any other inertial worldline satisfies equation (1), where subscripts denote derivatives with respect to proper time of the advancing point. Naturally the reciprocal relation also holds, and the same is true of the absolute separation between two points, each advancing along arbitrary inertial worldlines, correlated according to their respective proper times.

4.3 Free-Fall Equations

When, therefore, I observe a stone initially at rest falling from an elevated position and continually acquiring new increments of speed, why should I not believe that such increases take place in a manner which is exceedingly simple and rather obvious to everybody?
Galileo Galilei, 1638

The equation of two-body non-rotating radial free-fall in Newtonian theory is formally identical to the one-body radial free-fall solution in Einstein's theory (as is Kepler's third law), provided we identify Newton's radial distance with the Schwarzschild parameter r, and Newton's time with the proper time of the falling particle. Therefore, it's worthwhile to explicitly derive the cycloidal form of this solution. From the Newtonian point of view we can begin with the inverse-square law of gravitation for the radial separation s(t) between two identical non-rotating particles of mass m

$$\ddot{s} = -\frac{2Gm}{s^2}$$

where dots signify derivatives with respect to time. Integrating this over ds from an arbitrary initial separation s(0) to the separation s(t) at some other time t gives

$$\int_{s(0)}^{s(t)} \ddot{s}\, ds = -2Gm \int_{s(0)}^{s(t)} \frac{ds}{s^2}$$

Notice that the left hand integral can be rewritten

$$\int \frac{d\dot{s}}{dt}\, ds = \int \dot{s}\, d\dot{s}$$

Therefore, the previous equation can easily be integrated to give

$$\dot{s}(t)^2 - \dot{s}(0)^2 = 4Gm\left(\frac{1}{s(t)} - \frac{1}{s(0)}\right)$$

which shows that the quantity $\dot{s}^2 - 4Gm/s$ is invariant for all t. Solving the equation for $\dot{s}(t)$, we have

$$\frac{ds}{dt} = \pm\sqrt{\dot{s}(0)^2 + 4Gm\left(\frac{1}{s(t)} - \frac{1}{s(0)}\right)}$$

Rearranging, this gives

$$dt = \pm\frac{ds}{\sqrt{\dot{s}(0)^2 + 4Gm\left(\dfrac{1}{s} - \dfrac{1}{s(0)}\right)}}$$

To simplify the expressions, we put $s_0 = s(0)$, $v_0 = \dot{s}(0)$, and $r = s(t)/s_0$. In these terms, the preceding expression can be written

$$\sqrt{\frac{4Gm}{s_0^3}}\; dt = \pm\sqrt{\frac{r}{1 - Kr}}\; dr, \qquad K = 1 - \frac{s_0 v_0^2}{4Gm}$$

There are two cases to consider.
If K is positive, then the trajectory is bounded, and there is some point on the trajectory (the apogee) at which v = 0. Choosing this point as our time origin t = 0, we have K = 1, and the standard integral gives

$$t\sqrt{\frac{4Gm}{s_0^3}} = \sqrt{r(1-r)} + \operatorname{invcos}\!\left(\sqrt{r}\right) \qquad (1)$$

This equation describes a (scaled) cycloidal relation between t and r, which can be expressed parametrically in terms of a fictitious angle $\theta$ as follows

$$t\sqrt{\frac{4Gm}{s_0^3}} = \frac{\theta + \sin\theta}{2}, \qquad r = \frac{1 + \cos\theta}{2} \qquad (2)$$

To verify that these two equations are equivalent to the preceding equation, we can solve the second for $\theta$ and substitute into the first to give

$$t\sqrt{\frac{4Gm}{s_0^3}} = \frac{1}{2}\sin\!\big(\operatorname{invcos}(2r-1)\big) + \frac{1}{2}\operatorname{invcos}(2r-1)$$

Using the trigonometric identity $\sin(\operatorname{invcos}(x)) = \sqrt{1 - x^2}$ we see that the first term on the right side is $\sqrt{r(1-r)}$. Also, letting $\theta = \operatorname{invcos}(2r-1)$, we can use the trigonometric identity $\cos\theta = 2\cos^2(\theta/2) - 1$ to show that this angle is $\theta = 2\operatorname{invcos}(\sqrt{r})$, so the second term on the right side of (2) is $\operatorname{invcos}(\sqrt{r})$, which completes the demonstration that the cycloid relation given by (2) is equivalent to the free-fall relation (1).

The second case is when K is negative. For this case we can conveniently express the equations in terms of the positive parameter k = -K. The standard integral tells us that, for any two points $s_0$ and $s_1$ on the trajectory, the time interval $\Delta t$ is related to the separations according to

$$\Delta t\,\sqrt{\frac{4Gm}{s_0^3}} = \frac{1}{k^{3/2}}\Big[\Phi(kr) - \Phi(k)\Big], \qquad \Phi(x) = \sqrt{x(1+x)} - \ln\!\left(\sqrt{x} + \sqrt{1+x}\right)$$

where $r = s_1/s_0$. Notice that if we define $S_0 = s_0/k$ and $R = kr$, then this becomes

$$\Delta t\,\sqrt{\frac{4Gm}{S_0^3}} = \Phi(R) - \Phi(k)$$

Thus, if we define the normalized time parameter

$$T = \Delta t\,\sqrt{\frac{4Gm}{S_0^3}} + \Phi(k)$$

then the normalized equation of motion is

$$T = \sqrt{R(1+R)} - \ln\!\left(\sqrt{R} + \sqrt{1+R}\right) \qquad (3)$$

This represents the shape of every non-rotating separation between identical particles of mass m for which k is positive, which means that the absolute value of $v_0$ exceeds $2\sqrt{Gm/s_0}$. These are the unbound radial orbits for which R goes to infinity, as opposed to the case when the absolute value of $v_0$ is less than this threshold, which gives bound radial orbits in the shape of a cycloid in accord with equation (1). It's interesting to note the "removable singularity" of (3) at R = 0. Physically the parameter R is always non-negative by definition, so it abruptly reverses slope at the origin, even though the position may vary monotonically with respect to an external coordinate system.
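As a check on the bound (cycloidal) case, the inverse-square law can be integrated directly and compared with the closed form. The sketch below assumes the normalized variables $r = s/s_0$ and $T = t\sqrt{4Gm/s_0^3}$, with the two particles released from rest at r = 1. In these variables $\ddot{s} = -2Gm/s^2$ becomes $d^2r/dT^2 = -1/(2r^2)$, and the time since release is $T(r) = \sqrt{r(1-r)} + \tfrac{1}{2}\arccos(2r-1)$. The function names and the choice of a fourth-order Runge-Kutta integrator are assumptions made for illustration.

```python
import math

def rk4_free_fall(t_end, steps=2000):
    """Integrate r'' = -1/(2 r^2) from rest at r = 1 up to normalized time
    t_end (which must be less than pi/2, the collision time); returns r."""
    def accel(r):
        return -0.5 / (r * r)

    r, v = 1.0, 0.0
    h = t_end / steps
    for _ in range(steps):
        k1r, k1v = v, accel(r)
        k2r, k2v = v + 0.5 * h * k1v, accel(r + 0.5 * h * k1r)
        k3r, k3v = v + 0.5 * h * k2v, accel(r + 0.5 * h * k2r)
        k4r, k4v = v + h * k3v, accel(r + h * k3r)
        r += h * (k1r + 2 * k2r + 2 * k3r + k4r) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return r

def cycloid_time(r):
    """Closed-form normalized time since release at which the separation is r."""
    return math.sqrt(r * (1 - r)) + 0.5 * math.acos(2 * r - 1)

def cycloid_parametric(theta):
    """Parametric cycloid: returns (T, r) for the fictitious angle theta."""
    return 0.5 * (theta + math.sin(theta)), 0.5 * (1 + math.cos(theta))

if __name__ == "__main__":
    r_num = rk4_free_fall(1.0)
    print("r at T = 1 from numerical integration:", r_num)
    print("T recovered from the closed form     :", cycloid_time(r_num))
    print("parametric point for theta = 0.7     :", cycloid_parametric(0.7))
```

The time recovered from the closed form agrees with the integration time to within the integrator's error, and any parametric point (T, r) with $0 \le \theta \le \pi$ satisfies the same closed form.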
4.4 Force, Curvature, and Uncertainty

The atoms, as their own weight bears them down plumb through the void, at scarce determined times, in scarce determined places, from their course decline a little - call it, so to speak, mere changed trend. For were it not their wont thuswise to swerve, down would they fall, each one, like drops of rain, through the unbottomed void; and then collisions ne'er could be, nor blows among the primal elements; and thus Nature would never have created aught.
Lucretius, 50 BC

The trajectory of radial non-rotating gravitational free-fall can be expressed by the simple differential equation

$$s\ddot{s} + \frac{\dot{s}^2}{2} = k \qquad (1)$$

where k is a constant and dots signify derivatives with respect to time. This equation is valid for both Newtonian gravity and general relativity, provided we identify Newton's time parameter with the free-falling particle's proper time, and Newton's radial distance with the radial Schwarzschild coordinate. Notice that no gravitational constant appears in this equation (k is just a constant of integration determined by the initial conditions), so equation (1) is a purely kinematic description of gravity. Why did Newton not adopt this simple kinematic view? Historically the reasons involved considerations of rotating systems, but the basic problem with the kinematic view is present even with simple non-rotating free-fall. The problem is that equation (1) has an unrealistic "static solution" at $\dot{s} = \ddot{s} = 0$. This condition implies that k = 0, and the separation between the two objects has no proper "trajectory" (i.e., time drops out of the equation), so the equation cannot extrapolate the position forward or backward in time. Of course, this condition can never arise naturally from any non-static condition with $k \neq 0$, but we can imagine that by the imposition of some external force we can arrange to have the two objects initially at rest and not accelerating relative to each other.
Then when the objects are released from the "outside" force we expect them to immediately begin falling toward each other under the influence of their mutual gravitational attraction. This implies that k, and therefore $\ddot{s}$, must immediately assume some non-zero values, but equation (1) gives us no information about these values, because the entire equation identically vanishes at the static solution. To escape from the static solution, Newtonian mechanics splits the kinematic equation of motion into two parts, coupled together by the dynamical concepts of force and mass. Two objects are said to exert (equal and opposite) force on each other proportional to the inverse of the square of the separation between them, and the second derivative of that separation is proportional (per mass) to this force. Thus, the relation between separation and time for two identical particles, each of mass m, is given not by a single kinematic equation but by two simultaneous equations

$$F = -\frac{G m^2}{s^2}, \qquad F = \frac{m\,\ddot{s}}{2}$$

If we combine these two equations by eliminating F, we have

$$\ddot{s} = -\frac{2Gm}{s^2} \qquad (2)$$

which shows that when the two objects are released, the separation instantly acquires the second derivative $-2Gm/s^2$. Once this "initialization" has been accomplished, the subsequent free fall is entirely determined by equation (1), as can be seen by writing (2) as $s^2\ddot{s} = -2Gm$ and differentiating to give

$$s^2\,\dddot{s} + 2 s\,\dot{s}\,\ddot{s} = 0$$

which, assuming the separation is not zero, can be divided by s to give

$$s\,\dddot{s} + 2\,\dot{s}\,\ddot{s} = 0 \qquad (3)$$

the derivative of (1). This shows that, for non-rotating radial free-fall, the coupling parameters F and m are entirely superfluous except that they serve to establish the proper initial condition when the two objects are released from rest. Thus, Newton's dual concepts of force-at-a-distance and the proportionality of acceleration to force serve only (in this context) to enable us to solve for a non-vanishing $\ddot{s}$ as a function of s when $\dot{s} = 0$, which equation (1) obviously cannot do.
Furthermore, the constant G does not appear in (1) or (3), even though they give a complete description of gravitational free-fall except for the singularity at $\dot{s} = 0$. Thus the gravitational constant is also needed only at this singular point, the "static solution" of equation (1), which is the only point at which the dynamical concepts of force and mass are used. Aside from this singular condition, non-rotating radial Newtonian gravity is a purely kinematical phenomenon.

There are several essentially equivalent formulations of the kinematic equation of non-rotating radial gravitational motion, but all lead to an indeterminate condition at the static solution. For example, if we set $k = \kappa/2$ in equation (1) and multiply through by $2\dot{s}$ we have $2s\dot{s}\ddot{s} + \dot{s}^3 = \kappa\dot{s}$. Integrating this over time gives

$$s\,\dot{s}^2 = \kappa s + \mu$$

where $\mu$ is a constant of integration. Dividing by s gives

$$-\frac{\mu}{s} + \dot{s}^2 = \kappa$$

which we recognize as expressing the classical conservation of energy, with the first term representing potential energy and the second term denoting kinetic energy. Taking the derivative of this gives

$$\frac{\mu}{s^2}\,\dot{s} + 2\,\dot{s}\,\ddot{s} = 0 \qquad (4)$$

Notice that in each of the preceding equations the condition $\dot{s} = \ddot{s} = 0$ still represents a solution for any s, even though it is unrealistic. At this point we may be tempted to solve our problem by dividing through equation (4) by $\dot{s}$ to give

$$\ddot{s} = -\frac{\mu}{2 s^2} \qquad (5)$$

which is the Newtonian inverse-square "force" law of gravity. This does indeed determine the second derivative $\ddot{s}$ as a function of s, and thereby provides the information needed to depart from the externally imposed static initial condition. However, notice that the condition which concerns us is precisely when $\dot{s} = 0$, so when we divided equation (4) by $\dot{s}$ we were essentially just eliminating the singular pole arbitrarily by dividing by zero. Thus we can't properly say that the "force-at-a-distance" law (5) follows from equation (1). The removal of the indeterminate singularity actually represents an independent assumption relative to the basic kinematic equation of motion.
Of course, this assumption is perfectly compatible with the equation of motion, as can be seen by solving equation (5) for $\mu/s$ and substituting into the energy equation to give

$$2 s\,\ddot{s} + \dot{s}^2 = \kappa$$

and thus

$$s\,\ddot{s} + \frac{\dot{s}^2}{2} = \frac{\kappa}{2} = k$$

which is the same as equation (1). This compatibility is a necessary consequence of the fact that the equation of motion is totally indeterminate when $\dot{s} = 0$, which is the only condition at which the force law introduces new information not contained in the basic equation of motion.

In view of the above relations, it is not surprising that in the general theory of relativity we find gravity expressed without the concept of force. Einstein avoided the problem of the static solution - without invoking an auxiliary concept such as force - simply by recasting the phenomena in four-dimensional space-time, within which no material object is ever static. Every object, even one "at rest" in space, necessarily has a proper trajectory through spacetime, because it's moving forward in time. Furthermore, if we allow the spacetime manifold to possess intrinsic curvature, it follows that a purely timelike trajectory can "veer off" and acquire space-like components. Of course, this tendency to "veer off" depends on the degree of curvature of the spacetime, which general relativity relates to the mass-energy in the region. One of Einstein's motivations for the general theory was the desire to eliminate arbitrary constants, particularly the gravitational constant G, from the expressions of physical laws, but in the general theory it is still necessary to determine the proportionality between mass and curvature empirically, so the arbitrary gravitational constant remains. In any case, we see that Newtonian mechanics and general relativity give formally identical relations between separation and time for non-rotating free-fall, and the conceptual differences between the two theories can be expressed in terms of the ways in which they escape from or avoid the static condition.
It's interesting to note that the static solution of (1) is unstable in the direction of collapse. Given a positive separation s, the signs of must be {+,}, {+,+}, {,+} or {,} in order to satisfy (1), but considering small perturbations of these derivatives from the state in which they are both zero, it's clear that {+,} is unrealistic, because would not go positive from zero while was going negative from zero. For similar reasons, perturbations leading to {+,+} and {,+} are also excluded. Only the case {,} represents a realistic outcome of a small perturbation from the static solution.

This instability in the direction of collapse suggests another approach to escaping from (or avoiding) the static solution. The exact velocity and position of the two objects cannot be known at the quantum level, so, in a sense, the closest that two bodies can come to a static condition must still allow the equivalent of one quantum of momentum in their relative velocities. It's tempting to imagine that there might be some way of deriving the gravitational constant based on the idea that the initial condition for (1) is determined by the characteristic quantum uncertainty for the separations between massive particles, since, as we've seen, this initial condition fully determines the trajectory of radial gravitational free-fall. Simplistically we could note that, for a particle of mass m, any finite limit L on allowable distances implies two irreducible quantities of energy per unit mass, one being $(h/2L)^2/(2m^2)$, corresponding to the minimum "observable" momentum mv = h/2L (where h is Planck's constant) due to the uncertainty principle, and the other being the minimum gravitational potential energy Gm/L.
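Setting the two energies per unit mass equal, $(h/2L)^2/(2m^2) = Gm/L$, gives $m^3 = h^2/(8GL)$. A short sketch of the arithmetic follows, assuming the choice L = c/H and the constant values used in the text; the proton mass is an added reference value, and the function and constant names are illustrative.

```python
# Constant values as used in the text (SI units).
H_PLANCK = 6.625e-34   # Planck's constant, J s
G_NEWTON = 6.673e-11   # gravitational constant, N m^2 / kg^2
C_LIGHT = 2.998e8      # speed of light, m / s
H_HUBBLE = 2.3e-18     # Hubble expansion constant, 1 / s

# Assumed reference value for comparison (not taken from the text).
PROTON_MASS = 1.6726e-27  # kg

def characteristic_mass(h, g, c, hubble):
    """Mass m for which (h / 2L)^2 / (2 m^2) equals g * m / L with L = c / hubble,
    i.e. the real cube root of h^2 / (8 g L)."""
    length = c / hubble
    return (h * h / (8.0 * g * length)) ** (1.0 / 3.0)

if __name__ == "__main__":
    m = characteristic_mass(H_PLANCK, G_NEWTON, C_LIGHT, H_HUBBLE)
    print("characteristic mass (kg):", m)
    print("proton mass / m         :", PROTON_MASS / m)
```

Because m depends only on the cube root of H, the result is fairly insensitive to the value adopted for the Hubble constant: doubling H raises m by only about 26 percent.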
Identifying these two energies with each other, and setting L equal to the event horizon radius c/H where c is the velocity of light and H is Hubble's expansion constant, we have the relation

$$m^3 = \frac{h^2 H}{8\,G\,c}$$

Inserting the values $h = 6.625 \times 10^{-34}$ J sec, $G = 6.673 \times 10^{-11}$ N m²/kg², $c = 2.998 \times 10^{8}$ m/sec, and $H = 2.3 \times 10^{-18}$ sec⁻¹ gives a value of $1.8477 \times 10^{-28}$ kg for the characteristic mass m, which happens to be about one ninth the mass of a proton. Rough relationships of this kind between the fundamental physical constants have been discussed by Dirac and others, including Leopold Infeld, who wrote in 1949

Let us take as an example Maxwell’s equations and try to find their solution on a cosmological background… In a closed universe the frequency of radiation has a lowest value [corresponding to the maximum possible wavelength]. The spectrum, on its red side, cannot reach frequency zero. We obtain characteristic values for frequencies… a similar situation prevails if we consider Dirac’s equations upon a cosmological background. The solutions in a closed universe are different, not because of the metric, but because of the topology of our universe.

Such ideas are intriguing, but they have yet to be incorporated meaningfully into any successful physical theory.

The above represents a very simplistic sense in which the uncertainty of quantum mechanics and the spacetime curvature of general relativity can be regarded as two alternative conceptual strategies for establishing a consistent gravitational coupling. In a more sophisticated sense, we can find other interesting formal parallels between these two concepts, both of which fundamentally express non-commutativity. Given a system of orthogonal xyz coordinates, let A, B, C denote operations which, when applied to any unit vector emanating from the origin, rotate that vector through a right angle in the positive sense about the x, y, or z axis respectively.
Each of these operations can be represented by a rotation matrix, such that multiplying any vector by that matrix will effectively rotate the vector accordingly. As Hamilton realized in his efforts to find a three-dimensional analog of complex numbers (which represent rotation operators in two dimensions), the multiplication (i.e., composition) of two rotations in space is not commutative. This is easily seen in our example, because if we begin with a vector V emanating from the origin in the positive z direction, and we first apply rotation A and then rotation B, we arrive at a vector pointing in the positive y direction, whereas if we begin with V and apply the rotation B first and then A we arrive at a vector pointing in the negative x direction. Thus the effect of the combined operation AB is different from the effect of the combined operation BA, and so the matrix AB − BA does not vanish. This is in contrast with ordinary scalars and complex numbers, which always satisfy the commutativity relation ab − ba = 0 for every two numbers a, b.

This non-commutativity also appears when dealing with calculus on curved manifolds, which we will discuss in more detail in Section 5. Just to give a preliminary indication of how non-commutative relations arise in this context, suppose we have a vector field $T_\alpha$ defined over a given metrical manifold, and we let $T_{\alpha;\mu\nu}$ denote covariant differentiation of $T_\alpha$ first with respect to the coordinate $x^\mu$ and then with respect to the coordinate $x^\nu$. In a flat manifold the covariant derivative is identical to the partial derivative, which is commutative. In other words, the result of differentiation with respect to two coordinates in succession is independent of the order in which we apply the differentiations. However, in a curved manifold this is not the case. We find that reversing the order of the differentiations yields different results, just as when applying two rotations in succession to a vector.
Specifically, we will find that

$$T_{\alpha;\mu\nu} - T_{\alpha;\nu\mu} = R^{\sigma}{}_{\alpha\mu\nu}\, T_\sigma$$

where $R^{\sigma}{}_{\alpha\mu\nu}$ is the Riemann curvature tensor, to be discussed in detail in Section 5.7. The vanishing of this tensor is the necessary and sufficient condition for the manifold to be metrically flat, i.e., free of intrinsic curvature, so this tensor can be regarded as a measure of the degree of non-commutativity of covariant derivative operators in the manifold.

Non-commutativity also plays a central role in quantum mechanics, where observables such as position and momentum are represented by operators, much like the rotation operators in our previous example, and the possible observed values are eigenvalues of those operators. If we let X and P denote the position and momentum operators, the application of one of these operators to the state vector of a given system results in a new state vector with specific probabilities. This represents a measurement of the respective observable. The effect of a position measurement followed by a momentum measurement can be represented by the combined operator XP, and likewise the effect of a momentum measurement followed by a position measurement can be represented by PX. Again we find that the commutative property does not generally hold. If two observables are compatible, such as the X position and the Y position of a particle, then the operators commute, which means we have XY − YX = 0. However, if two observables are not compatible, such as position and momentum, their operators do not commute. This leads to the important relation

$$XP - PX = i\hbar$$

where $\hbar = h/2\pi$. This non-commutativity in the measurement of observables implies an inherent limit on the precision to which the values of the incompatible observables can be jointly measured. In general it can be shown that if A and B are the operators associated with the physical quantities a and b, and if $\Delta a$ and $\Delta b$ denote the expected root mean squares of the deviations of measured values of a and b from their respective expected values, then

$$\Delta a\,\Delta b \;\ge\; \frac{1}{2}\,\big|\langle AB - BA \rangle\big|$$

This is Heisenberg's uncertainty relation.
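The non-commutativity of spatial rotations described earlier is easy to check with explicit matrices. The sketch below assumes quarter-turn rotations acting on column vectors, with angles measured by the right-hand rule; that sign convention, and all function names, are assumptions of this example. Under the right-hand rule the vector along +z goes to −y (A then B) or to +x (B then A). With the opposite sense of rotation it goes to +y and −x, the directions quoted in the text. Either way the two orderings differ.

```python
import math

def rotation(axis, angle):
    """3x3 matrix rotating column vectors by `angle` about a coordinate
    axis, with the sense of rotation given by the right-hand rule."""
    c, s = math.cos(angle), math.sin(angle)
    matrices = {
        "x": [[1, 0, 0], [0, c, -s], [0, s, c]],
        "y": [[c, 0, s], [0, 1, 0], [-s, 0, c]],
        "z": [[c, -s, 0], [s, c, 0], [0, 0, 1]],
    }
    return matrices[axis]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    """Matrix m acting on the column vector v."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def commutator(a, b):
    """The matrix ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def quarter_turn_results(sign):
    """Images of the +z unit vector under 'A then B' and 'B then A', where
    A and B are quarter turns about x and y in the sense given by `sign`."""
    a = rotation("x", sign * math.pi / 2)
    b = rotation("y", sign * math.pi / 2)
    v = [0.0, 0.0, 1.0]
    a_then_b = apply(b, apply(a, v))
    b_then_a = apply(a, apply(b, v))
    return [round(x) for x in a_then_b], [round(x) for x in b_then_a]

if __name__ == "__main__":
    print("right-hand rule:", quarter_turn_results(+1))
    print("opposite sense :", quarter_turn_results(-1))
    a = rotation("x", math.pi / 2)
    b = rotation("y", math.pi / 2)
    print("AB - BA:", [[round(x) for x in row] for row in commutator(a, b)])
```

The commutator matrix has non-zero entries, which is the matrix statement that the order of the two rotations matters.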
The commutator of two observable operators is invariably a multiple of Planck's constant, so if Planck's constant were zero, all observables would be compatible, i.e., their operators would commute, just as do all classical operators. We might say (with some poetic license) that Planck's constant is a measure of the "curvature" of the manifold of observation. This "curvature" applies only to incompatible observables, although the term "incompatible" is somewhat misleading, because it actually signifies that two observables A, B are conjugates, i.e., transformable into each other by the conjugacy relation $A = UBU^{-1}$ where U is a unitary operator (analogous to a simple rotation operator).

4.5 Conventional Wisdom

This, however, is thought to be a mere strain upon the text, for the words are these: ‘That all true believers break their eggs at the convenient end’, and which end is the convenient end, seems, in my humble opinion, to be left to every man’s conscience…
Jonathan Swift, 1726

It is a matter of empirical fact that the speed of light is invariant in terms of inertial coordinates, and yet the invariance of the speed of light is often said to be a matter of convention - as indeed it is. The empirical fact refers to the speed of light in terms of inertial coordinates, but the decision to define speeds in terms of inertial coordinates is conventional. It’s trivial to define systems of space and time coordinates in terms of which the speed of light is not invariant, but we ordinarily choose to describe events in terms of inertial coordinates, partly because of the invariance of light speed based on those coordinates. Of course, this invariance would be tautological if inertial coordinate systems were simply defined as the systems in terms of which the speed of light is invariant. However, as discussed in Section 1.3, the class of inertial coordinate systems is actually defined in purely mechanical terms, without reference to the propagation of light.
They are the coordinate systems in terms of which mechanical inertia is homogeneous and isotropic (which are the necessary and sufficient conditions for Newton’s three laws of motion to be valid, at least quasi-statically). The empirical invariance of light speed with respect to this class of coordinate systems is a non-trivial empirical fact, but nothing requires us to define “velocity” in terms of inertial coordinate systems. Such systems cannot claim to have any a priori status as the “true” class of coordinates. Despite the undeniable success of the principle of inertia as a basis for organizing our understanding of the processes of nature, it is nevertheless a convention. The conventionalist view can be traced back to Poincare, who wrote in "The Measure of Time" in 1898 ... we have no direct intuition about the equality of two time intervals. The simultaneity of two events or the order of their succession, as well as the equality of two time intervals, must be defined in such a way that the statements of the natural laws be as simple as possible. In the same paper, Poincare described the use of light rays, together with the convention that the speed of light is invariant and the same in all directions, to give an operational meaning to the concept of simultaneity. In his book "Science and Hypothesis" (1902) he summarized his view of time by saying There is no absolute time. When we say that two periods are equal, the statement has no meaning, and can only acquire a meaning by a convention. Poincare's views had a strong influence on the young Einstein, who avidly read "Science and Hypothesis" with his friends in the self-styled "Olympia Academy". Solovine remembered that this book "profoundly impressed us, and left us breathless for weeks on end". 
Indeed we find in Einstein's 1905 paper on special relativity the statement We have not defined a common time for A and B, for the latter cannot be defined at all unless we establish by definition that the time required by light to travel from A to B equals the time it requires to travel from B to A. In a later popular exposition, Einstein tried to make the meaning of this definition more clear by saying That light requires the same time to traverse the path A to M (the midpoint of AB) as for the path B to M is in reality neither a supposition nor a hypothesis about the physical nature of light, but a stipulation which I can make of my own freewill in order to arrive at a definition of simultaneity. Of course, this concept of simultaneity is also embodied in Einstein's second "principle", which asserts the invariance of light speed. Throughout the writings of Poincare, Einstein, and others, we see the invariance of the speed of light referred to as a convention, a definition, a stipulation, a free choice, an assumption, a postulate, and a principle... as well as an empirical fact. There is no conflict between these characterizations, because the convention (definition, stipulation, free choice, principle) that Poincare and Einstein were referring to is nothing other than the decision to use inertial coordinate systems, and once this decision has been made, the invariance of light speed is an empirical fact. As Poincare said in 1898, we naturally choose our coordinate systems "in such a way that the statements of the natural laws are as simple as possible", and this almost invariably means inertial coordinates. It was the great achievement of Galileo, Descartes, Huygens, and Newton to identify the principle of inertia as the basis for resolving and coordinating physical phenomena. Unfortunately this insight is often disguised by the manner in which it is traditionally presented. 
The beginning physics student is typically expected to accept uncritically an intuitive notion of "uniformly moving" time and space coordinate systems, and is then told that Newton's laws of motion happen to be true with respect to those "inertial" systems. It is more meaningful to say that we define inertial coordinate systems as those systems in terms of which Newton's laws of motion are valid. We naturally coordinate events and organize our perceptions in such a way as to maximize symmetry, and for the motion of material objects the most important symmetries are the isotropy of inertia, the conservation of momentum, the law of equal action and re-action, and so on. Newtonian physics is organized entirely upon the principle of inertia, and the basic underlying hypothesis is that for any object in any state of motion there exists a system of coordinates in terms of which the object is instantaneously at rest and inertia is homogeneous and isotropic (implying that Newton's laws of motion are at least quasi-statically valid). The empirical validity of this remarkable hypothesis accounts for all the tremendous success of Newtonian physics. As discussed in Section 1.3, the specification of a particular state of motion, combined with the requirement for inertia to be homogeneous and isotropic, completely determines a system of coordinates (up to insignificant scale factors, rotations, etc.), and such a system is called an inertial system of coordinates. Such coordinate systems can be established unambiguously by purely mechanical means (neglecting the equivalence principle and associated complications in the presence of gravity). The assumption of inertial isotropy with respect to a given state of motion suffices to establish the loci of inertial simultaneity for that state of motion. Poincare and Einstein rightly noted the conventionality of this simultaneity definition because they were not pre-supposing the choice of inertial simultaneity.
In other words, we are not required to use inertial coordinates. We simply choose, of our own free will, to use inertial coordinates - with the corresponding inertial definition of simultaneity - because this renders the statement of physical laws and the descriptions of physical phenomena as simple and perspicuous as possible, by taking advantage of the maximum possible symmetry. In this regard it's important to remember that inertial coordinates are not entirely characterized by the quality of being unaccelerated, i.e., by the requirement that isolated objects move uniformly in a straight line. It's also necessary to require the unique simultaneity convention that renders mechanical inertia isotropic (the same in all spatial directions), which amounts to the stipulation of equal one-way speeds for the propagation of physically identical actions. These comments are fully applicable to the Newtonian concept of space, time, and inertial reference frames. Given two objects in relative motion we can define two systems of inertial coordinates in which the respective objects are at rest, and we can orient these coordinates so the relative motion is purely in the x direction. Let t,x and T,X denote these two systems of inertial coordinates. That such coordinates exist is the main physical hypothesis underlying Galilean physics. An auxiliary hypothesis – one that was not always clearly recognized – concerns the relationship between two such systems of inertial coordinates, given that they exist. Galileo assumed that if the coordinates x,t of an event are known, and if the two inertial coordinate systems are the rest frames of objects moving with a relative speed of v, then the coordinates of that event in terms of the other system (with suitable choice of origins) are $T = t$, $X = x - vt$.
Viewed in the abstract, this is a rather peculiar and asymmetrical assumption, although it is admittedly borne out by experience - at least to the precision of measurement available to Galileo. However, we now know, empirically, that the relation between relatively moving systems of inertial coordinates has the symmetrical form $T = (t - vx)/\gamma$ and $X = (x - vt)/\gamma$, where $\gamma = (1 - v^2)^{1/2}$, when the time and space variables are expressed in the same units such that the constant $3 \times 10^8$ meters/second equals unity. It follows that the one-way (not just the two-way) speed of light is invariant and isotropic with respect to any and every system of inertial coordinates. The empirical content of this statement is simply that the propagation of light is isotropic with respect to the same class of coordinate systems in terms of which mechanical inertia is isotropic. This is consistent with the fact that light itself is an inertial phenomenon, e.g., it conveys momentum. In fact, the inertia of light can be seen as a common thread running through three of the famous papers published by Einstein in 1905. In the paper entitled "On a Heuristic Point of View Concerning the Production and Transformation of Light" Einstein advocated a conception of light as tiny quanta of energy and momentum, somewhat reminiscent of Newton's inertial corpuscles of light. It's clear that Einstein already understood that the conception of light as a classical wave is incomplete. In the paper entitled "Does the Inertia of a Body Depend on its Energy Content?" he explicitly advanced the idea of light as an inertial phenomenon, and of course this was suggested by the fundamental ideas of the special theory of relativity presented in the paper "On the Electrodynamics of Moving Bodies". The Galilean conception of inertial frames assumed that all such frames share a unique foliation of spacetime into "instants". Thus the relation "in the present of" constituted an equivalence relation across all frames of reference.
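The contrast between the Galilean relation and the symmetrical relation can be made concrete with a short numerical sketch. It assumes units with c = 1 and a relative speed v = 0.6 chosen only for illustration, and it follows the convention of this section in which the transformed values are divided by the factor $(1 - v^2)^{1/2}$. The function names are illustrative.

```python
import math

def galilean(t, x, v):
    # Galileo's assumed relation: T = t, X = x - v t
    return t, x - v * t

def lorentz(t, x, v):
    # Symmetrical relation with c = 1; g = (1 - v^2)^(1/2) as in the text,
    # so the transformed values are divided by g.
    g = math.sqrt(1.0 - v * v)
    return (t - v * x) / g, (x - v * t) / g

def speed(transform, v, direction):
    # Coordinate speed, in the second system, of a light pulse that leaves
    # the common origin with speed 1 in the given direction (+1 or -1).
    T, X = transform(1.0, direction * 1.0, v)
    return X / T

def interval(t, x):
    # Squared spacetime interval from the origin, with c = 1.
    return t * t - x * x

v = 0.6  # illustrative relative speed
light_galilean = (speed(galilean, v, +1), speed(galilean, v, -1))
light_lorentz = (speed(lorentz, v, +1), speed(lorentz, v, -1))

# Two events that are simultaneous (t = 0) in the first system.
T_a, _ = lorentz(0.0, 0.0, v)
T_b, _ = lorentz(0.0, 1.0, v)
simultaneity_shift = T_b - T_a

print(light_galilean, light_lorentz, simultaneity_shift)
```

Under the Galilean relation the two one-way light speeds differ, while T = t keeps the two events simultaneous in both systems. Under the symmetrical relation the light speed is 1 in both directions, the interval is preserved, and the two events acquire different values of T, so the two systems no longer share a foliation into instants.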
If A is in the present of B, and B is in the present of C, then A is in the present of C. However, special relativity makes it clear that there are infinitely many distinct loci of inertial simultaneity through any given event, because inertial simultaneity depends on the velocity of the worldline through the event. The inertial coordinate systems do induce a temporal ordering on events, but only a partial one. (See the discussion of total and partial orderings in Section 1.2.) With respect to any given event we can still partition all the other events of spacetime into distinct causal regions, including "past", "present" and "future", but in addition we have the categories "future null" and "past null", and none of these constitute equivalence classes. For example, it is possible for A to be in the present of B, and B to be in the present of C, and yet A is not in the present of C. Being "in the present of" is not a transitive relation. It could be argued that a total unique temporal ordering of events is a more useful organizing principle than the isotropy of inertia, and so we should adopt a class of coordinate systems that provides a total ordering. We can certainly do this, as Einstein himself described in his 1905 paper To be sure, we could content ourselves with evaluating the time of events by stationing an observer with a clock at the origin of the coordinates who assigns to an event to be evaluated the corresponding position of the hands of the clock when a light signal from that event reaches him through empty space. However, we know from experience that such a coordination has the drawback of not being independent of the position of the observer with the clock. The point of this "drawback" is that there is no physically distinguished "origin" on which to base the time coordination of all systems of reference, so from the standpoint of assessing possible causal relations we must still consider the full range of possible "absolute" temporal orderings. 
This yields the same partial ordering of events as does the set of inertial coordinates, so the "total ordering" that we can achieve by imposing a single temporal foliation on all frames of reference is only formal, and not physically meaningful. Nevertheless, we could make this choice, especially if we regard the total temporal ordering of events as a requirement of intelligibility. This seems to have been the view of Lorentz, who wrote in 1913 about the comparative merits of the traditional Galilean and the new Einsteinian conceptions of time It depends to a large extent on the way one is accustomed to think whether one is attracted to one or another interpretation. As far as this lecturer is concerned, he finds a certain satisfaction in the older interpretations, according to which... space and time can be sharply separated, and simultaneity without further specification can be spoken of... one may perhaps appeal to our ability of imagining arbitrarily large velocities. In that way one comes very close to the concept of absolute simultaneity. Of course, the idea of "arbitrarily large velocities" already pre-supposes a concept of absolute simultaneity, so Lorentz's rationale is not especially persuasive, but it expresses the point of view of someone who places great importance on a total temporal ordering, even at the expense of inertial isotropy. Indeed one of Poincare's criticisms of Lorentz's early theory was that it sacrificed Newton's third law of equal action and re-action. (This can be formally salvaged by assigning the unbalanced forces and momentum to an undetectable ether, but the physical significance of a conservation law that references undetectable elements is questionable.) Oddly enough, even Poincare sometimes expressed the opinion that a total temporal ordering would always be useful enough to out-weigh other considerations, and that it would always remain a safe convention. 
The approach taken by Lorentz and most others may be summarized by saying that they sacrificed the physical principles of inertial relativity, isotropy, and homogeneity in order to maintain the assumed Galilean composition law. This approach, although technically serviceable, suffers from a certain inherent lack of conviction, because while asserting the ontological reality of anisotropy in all but one (unknown) frame of reference, it unavoidably requires us to disregard that assertion and arbitrarily assume one particular frame as being "the" rest frame. Poincare and Einstein recognized that in our descriptions of events in spacetime in terms of separate space and time coordinates we're free to select our "basis" of decomposition. This is precisely what one does when converting the description of events from one frame to another using Galilean relativity, but, as noted above, the Galilean composition law yields anisotropic results when applied to actual observations. So it appeared (to most people) that we could no longer maintain isotropy and homogeneity in all inertial frames together with the ability to transform descriptions from one frame to another by simply applying the appropriate basis transformation. But Einstein realized this was too pessimistic, and that the new observations were fully consistent with both isotropy in all inertial frames and with simple basis transformations between frames, provided we adjust our assumption about the effective metrical structure of spacetime. In other words, he brilliantly discerned that Lorentz's anisotropic results totally vanish in the context of a different metrical structure. Even a metrical structure is conventional in a sense, because it relies on our ontological premises. For example, the magnitude of the interval between two events may seem to be one thing but actually be another, due (perhaps) to variations in our means of observation and measurement. 
However, once we have agreed on the physical significance of inertial coordinate systems, the invariance of the quantity $(dt)^2 - (dx)^2 - (dy)^2 - (dz)^2$ also becomes physically significant. This shows the crucial importance of the very first sentence in Section 1 of Einstein's 1905 paper: Let us take a system of co-ordinates in which the equations of Newtonian mechanics hold good. Suitably qualified (as noted in Section 1.3), this immediately establishes not only the convention of simultaneity, but also the means of operationally establishing it, and its physical significance. Any observer in any state of inertial motion can throw two identical particles in opposite directions with equal force (i.e., so there is no net disturbance of the observer's state of motion), and the convention that those two particles have the same speed suffices to fully specify an entire system of space and time coordinates, which we call inertial coordinates. It is then an empirical fact - not a definition, convention, assumption, stipulation, or postulate - that the speed of light is isotropic in terms of inertial coordinates. This obviously doesn't imply that inertial coordinates are "true" in any absolute sense, but the principle of inertia has proven to be immensely powerful for organizing our knowledge of physical events, and for discerning and expressing the apparent chains of causation. If a flash of light emanates from the geometrical midpoint between two spatially separate particles at rest in an inertial frame, the arrival times of the light wave at those two particles are simultaneous in terms of that rest frame’s inertial coordinates. Furthermore, we find empirically that all other physical processes are isotropic with respect to those inertial coordinates, e.g., if a sound wave emanates from the midpoint of a uniform steel beam at rest in an inertial frame, the sound reaches the two ends simultaneously in accord with this definition.
If we adopt any other convention we introduce anisotropies in our descriptions of physical processes, such as sound in a uniform stationary steel beam propagating more rapidly in one direction than in the other. The isotropy of physical phenomena - including the propagation of light - is strictly a convention, but it was not introduced by special relativity, it is one of the fundamental principles which we use to organize our knowledge, and it leads us to choose inertial coordinates for the description of events. On the other hand, the isotropy of multiple distinct physical phenomena in terms of inertial coordinates is not purely conventional, because those coordinates can be defined in terms of just one of those phenomena. The value of this definition is due to the fact that a wide variety of phenomena are (empirically) isotropic with respect to the same class of coordinate systems. Of course, it could be argued that all these phenomena are, in some sense, “the same”. For example, the energy conveyed by electromagnetic waves has momentum, so it is an inertial phenomenon, and therefore it is not surprising that the propagation of such energy is isotropic in terms of inertial coordinates. From this point of view, the value of the definition of inertial coordinates is that it reveals the underlying unity of superficially dissimilar phenomena, e.g., the inertia of energy. This illustrates that our conventions and definitions are not empty, because they represent ways of organizing our knowledge, and the efficiency and clarity of this organization depends on choosing conventions that reflect the unity and symmetries of the phenomena. We could, if we wish, organize our knowledge based on the assumption of a total temporal ordering of events, but then it would be necessary to introduce a whole array of unobservable anisotropic "corrections" to the descriptions of physical phenomena. 
As we’ve seen, the principle of relativity constrains, but does not uniquely determine, the form of the mapping from one system of inertial coordinates to another. In order to fix the observable elements of a spacetime theory with respect to every member of the equivalence class of inertial frames we require one further postulate, such as the invariance of light speed (or the inversion symmetry discussed in Chapter 1.8). However, we should distinguish between the strong and weak forms of the light-speed invariance postulate. The strong form asserts that the one-way speed of light is invariant with respect to the natural space-time basis associated with any inertial state of motion, whereas the weak form asserts only that the round-trip speed of light is invariant. To illustrate the different implications of these two different assumptions, consider an experiment of the type conducted by Michelson and Morley in their efforts to detect a directional variation in the speed of light, due to the motion of the Earth through the aether, with respect to which the absolute speed of light was presumed to be referred. To measure the speed of light along a particular axis they effectively measured the elapsed time at the point of origin for a beam of light to complete a round trip out to a mirror and back. At first we might think that it would be just as easy to measure the one-way speed of light, by simply comparing the time of transmission of a pulse of light from one location to the time of reception at another location, but of course this requires us to have clocks synchronized at two spatially separate locations, whereas it is precisely this synchronization that is at issue. Depending on how we choose to synchronize our separate clocks we can measure a wide range of light speeds.
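The dependence of a one-way measurement on the synchronization can be sketched with a few lines of arithmetic. This is only an illustration under stated assumptions: the remote clock is taken to be offset by a chosen amount relative to the standard synchronization (for which the outbound reading would be distance / C), and the function name and parameters are hypothetical.

```python
C = 299_792_458.0  # metres per second

def measured_speeds(distance, remote_offset):
    # Clock readings for a pulse sent out to a mirror and back.
    # remote_offset is the amount by which the remote clock is set ahead
    # of the standard synchronization (an assumption of this sketch).
    if not -distance / C < remote_offset < distance / C:
        raise ValueError("offset would make one leg take non-positive time")
    sent = 0.0                                   # local clock at emission
    reflected = distance / C + remote_offset     # remote clock at the mirror
    returned = 2 * distance / C                  # local clock at return
    outbound = distance / (reflected - sent)
    inbound = distance / (returned - reflected)
    round_trip = 2 * distance / (returned - sent)
    return outbound, inbound, round_trip

# One light-second of separation, remote clock set half a second ahead.
out, back, loop = measured_speeds(C, 0.5)
print(out / C, back / C, loop / C)
```

The outbound and inbound values change with the offset, while the round-trip value uses only the local clock and is unaffected by it.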
To avoid this ambiguity, we must evaluate the time interval for a transit of light at a single spatial location (in the coordinate system of interest), which requires us to measure a round trip, just as Michelson and Morley did. Incidentally, it might seem that Roemer's method of estimating the speed of light from the variations in the period between eclipses of Jupiter's moons (see Section 3.3) constituted a one-way measurement. Similarly people sometimes imagine that the one-way speed of light could be discerned by (for example) observing, from the center of a circle, pulses of light emitted uniformly by a light source moving at constant speed around the perimeter of the circle. Such methods are indeed capable of detecting certain kinds of anisotropy, but they cannot detect the anisotropy entailed by Lorentz’s ether theory, nor any of the other theories that are observationally indistinguishable from Lorentz’s theory (which itself is indistinguishable from special relativity). In any theory of this class, there is an ambiguity in the definition of a “circle” in motion, because circles contract to ellipses in the direction of motion. Likewise there is ambiguity in the definition of “uniformly-timed” pulses from a light source moving around the perimeter of a moving circle (ellipse). The combined effect of length contraction and time dilation in a Lorentzian theory is to render the anisotropies unobservable. The empirical indistinguishability between the theories in this class implies that there is no unambiguous definition of “the one-way speed of light”. We can measure without ambiguity only the lapses of time for closed-loop paths, and such measurements cannot establish the “open-loop” speed. The ambiguity in the one-way speed remains, because over any closed loop, by definition, the net change in each and every direction is zero. Hence it is possible to consistently interpret all observations based on the assumption of non-isotropic light speed.
Admittedly the resulting laws take on a somewhat convoluted appearance, and contain unobservable parameters, but they can't be ruled out empirically. To illustrate, consider a measurement of the round-trip speed of light, assuming light travels at a constant speed c relative to some absolute medium with respect to which our laboratory is moving with a speed v. Under these assumptions, we would expect a pulse of light to travel with a speed $c+v$ (relative to the lab) in one direction, and $c-v$ in the opposite direction. So, if we send a beam of light over a distance L out to a mirror in the "$c+v$" direction, and it bounces back over the same distance in the "$c-v$" direction, the total elapsed time to complete the round trip of length 2L is $t = \frac{L}{c+v} + \frac{L}{c-v} = \frac{2Lc}{c^2 - v^2}$. Therefore, the average round-trip speed relative to the laboratory would be $\frac{2L}{t} = c \left( 1 - \frac{v^2}{c^2} \right)$. This shows why a round-trip measurement of the speed of light would not be expected to reveal any dependency on the velocity of the laboratory unless the measurement was precise enough to resolve second-order effects in v/c. The ability to detect such small effects was first achieved in the late 19th century with the development of precision interferometry (exploiting the wave-like properties of light). The experiments of Michelson and Morley showed that, despite the movement of the Earth in its orbit around the Sun (to say nothing of the movement of the solar system, and even of the galaxy), there was no $(v/c)^2$ term in the round-trip speed of light. In other words, they found that 2L/t is always equal to c, at least to the accuracy they could measure, which was more than adequate to rule out a second-order deviation. Thus we have a firm empirical basis for asserting that the round-trip speed of light is independent of the motion of the source. This is the weak form of the invariant light speed postulate, but in his 1905 paper Einstein asserted something stronger, namely, that we should adopt the convention of regarding the one-way speed of light as invariant.
This stronger postulate doesn't follow from the results of Michelson and Morley, nor from any other conceivable experiment or observation - but there is also no conceivable observation that could conflict with it. The invariant round-trip speed of light fixes the observable elements of the theory, but it does not uniquely determine the presumed ontological structure, because multiple different interpretations can be made to fit the same set of appearances. The one-way speed of light is necessarily an interpretative element of our experience. To illustrate the ambiguity, notice that we can ensure a null result for the Michelson and Morley experiment while maintaining non-constant light speed, merely by requiring that the speeds of light $v_1$ and $v_2$ in the two opposite directions of travel (out and back) satisfy the relation $\frac{L}{v_1} + \frac{L}{v_2} = \frac{2L}{c}$. In other words, a linear round-trip measurement of light speed will yield the constant c in every direction provided only that the harmonic mean of the one-way speeds in opposite directions always equals c. This is easily accomplished by defining the one-way velocity $v_1$ as a function of direction arbitrarily for all directions in one hemisphere, and then setting the velocities $v_2$ in the opposite directions as $v_2 = c v_1 / (2 v_1 - c)$. However, we also wish to cover more complicated round-trips, rather than just back and forth on a single line. To ensure that a circuit of light around an equilateral triangle with edges of length L yields a round-trip speed of c, the speeds $v_1$, $v_2$, $v_3$ in the three equally spaced directions must satisfy $\frac{L}{v_1} + \frac{L}{v_2} + \frac{L}{v_3} = \frac{3L}{c}$, so again we see that the light speeds must have a harmonic mean of c.
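The round-trip arithmetic of the last few paragraphs can be checked directly. The sketch below uses units with c = 1 and illustrative values (v = 10^-4 is roughly the Earth's orbital speed as a fraction of c); the function names are not from the text. It compares the ether-style speeds c+v and c-v, which leave a second-order deficit, with a pair of one-way speeds chosen to have harmonic mean c, which leave none.

```python
def round_trip_speed(length, v_out, v_back):
    # Average speed over an out-and-back path of total length 2 * length.
    elapsed = length / v_out + length / v_back
    return 2 * length / elapsed

def matching_return_speed(v_out, c=1.0):
    # Return speed v2 = c v1 / (2 v1 - c), which makes the harmonic mean
    # of the two one-way speeds equal to c. Requires v_out > c / 2.
    if v_out <= c / 2:
        raise ValueError("outbound speed must exceed c/2")
    return c * v_out / (2 * v_out - c)

def loop_speed(edge, speeds):
    # Average speed around a polygon with equal edges, one speed per edge.
    return len(speeds) * edge / sum(edge / s for s in speeds)

c, v = 1.0, 1e-4

# Ether-style expectation: deficit of order (v/c)^2.
ether = round_trip_speed(1.0, c + v, c - v)

# Anisotropic one-way speeds with harmonic mean c: no deficit at all.
v1 = 0.75
v2 = matching_return_speed(v1, c)
null_result = round_trip_speed(1.0, v1, v2)

# Triangle circuit: reciprocal speeds 1.2, 0.9, 0.9 sum to 3 / c.
triangle = loop_speed(1.0, [1 / 1.2, 1 / 0.9, 1 / 0.9])

print(c - ether, v2, null_result, triangle)
```

The first case differs from c by about one part in 10^8, which is the size of effect that interferometry had to resolve; the other two cases return c even though the one-way speeds are plainly unequal.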
In general, to ensure that every closed loop of light, regardless of the path, yields the average speed c, it's necessary (and also sufficient) to have light speed $v = C(\theta)$ as a function of angle $\theta$ in a principal plane such that, for any integer $n \ge 2$, $\sum_{j=0}^{n-1} \frac{1}{C(\theta + 2\pi j/n)} = \frac{n}{c}$. In units with c = 1, we need the n terms on the left side to sum to n, so the velocity function must be such that $1/C(\theta) = 1 + f(\theta)$, where the function $f(\theta)$ satisfies $\sum_{j=0}^{n-1} f(\theta + 2\pi j/n) = 0$ for all $\theta$. The canonical example of such a function is simply $f(\theta) = k \cos(\theta)$ for any constant k. Thus if we postulate that the speed of light varies as a function of the angle of travel relative to some primary axis according to the equation

$C(\theta) = \frac{c}{1 + k \cos(\theta)}$    (1)

then we are assured that all closed-loop measurements of the speed of light will yield the constant c, despite the fact that the one-way speed of light is distinctly non-isotropic (for non-zero k). This equation describes an ellipse, and no measurement can disprove the hypothesis that the one-way speed of light actually is (or is not) given by (1). It is, strictly speaking, a matter of convention. If we choose to believe that light has the same speed in all directions, then we assume k = 0, and in order to send a synchronizing signal to two points we would locate ourselves midway between them (i.e., at the location where round trips between ourselves and those two points take the same amount of time). On the other hand, if we choose to believe light travels twice as fast in one direction as in the other, then we would assume k = 1/3, and we would locate ourselves 2/3 of the way between them (i.e., twice as far from one as the other, so round trip times are two to one). The latter case is illustrated in the figure below. Regardless of what value we assume for k (in the range from -1 to +1), we can synchronize all clocks according to our belief, and everything will be perfectly consistent and coherent.
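The claim that the profile (1) leaves every closed-loop measurement unchanged can be tested on an arbitrary polygon. The sketch below takes (1) in the form C(theta) = c / (1 + k cos(theta)), with theta measured from the x axis as the assumed primary axis, and uses a randomly generated seven-sided loop; the function names and the particular values of k are illustrative.

```python
import math
import random

def light_speed(theta, k, c=1.0):
    # Equation (1): C(theta) = c / (1 + k cos(theta)), with -1 < k < 1.
    return c / (1.0 + k * math.cos(theta))

def edges(vertices):
    # Consecutive vertex pairs of a closed polygon, including the closing edge.
    return zip(vertices, vertices[1:] + vertices[:1])

def loop_length(vertices):
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in edges(vertices))

def loop_time(vertices, k, c=1.0):
    # Total travel time around the polygon, using the one-way speed that
    # belongs to the direction of each edge.
    total = 0.0
    for (x0, y0), (x1, y1) in edges(vertices):
        dx, dy = x1 - x0, y1 - y0
        total += math.hypot(dx, dy) / light_speed(math.atan2(dy, dx), k, c)
    return total

rng = random.Random(1)
polygon = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(7)]

for k in (0.0, 1 / 3, -0.8):
    average = loop_length(polygon) / loop_time(polygon, k)
    print(k, average)

# With k = 1/3 the speeds along and against the primary axis differ by 2:1.
ratio = light_speed(math.pi, 1 / 3) / light_speed(0.0, 1 / 3)
```

The reason the average comes out the same for every k is visible in the code: the time for each edge is (d + k dx) / c, and the displacements dx sum to zero around any closed path.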
Of course, in any case it's necessary to account consistently for the lapse of time for information to get from one clock to another, but the lapse of time between any two clocks separated by a distance L can be anything we choose in the range from virtually 0 to 2L/c. The only real constraint is that the speed be an elliptical function of the direction angle. The velocity profile given by (1) is simply the polar equation of an ellipse (or ellipsoid, if revolved about the major axis), with the pole at one focus, the semi-latus rectum equal to c, and eccentricity equal to k. This just projects the ellipse given by cutting the light cone with an oblique plane. Interestingly, there are really two light cones that intersect on this plane, and they are the light cones of the two events whose projections are the two foci of the ellipse - for timelike separated events. Recall that all rays emanating from one focus of an ordinary ellipse and reflecting off the ellipse will re-converge on the other focus, and that this kind of ray optics is time-symmetrical. In this context our projective ellipse is the intersection of two null-cones, i.e., it is the locus of all points in spacetime that are null-separated from both of the "foci events". This was to be expected in view of the time-symmetry of Maxwell's equations (not to mention the relativistic Schrodinger equation), as discussed in Section 9. Our main reason for assuming k = 0 is our preference for symmetry, simplicity, and consistency with inertial isotropy. Within our empirical constraints, k can be interpreted as having any value between -1 and +1, but the principle of sufficient reason suggests that it should not be assigned a non-zero value in the absence of any rational justification. Nevertheless, it remains a convention (albeit a compelling one), but we should be clear about what precisely is – and what is not – conventional.
The invariance of lightspeed is a convention, but the invariance of lightspeed in terms of inertial coordinates is an empirical fact, and this empirical fact is not a formal tautology, because inertial coordinates are determined by the mechanical inertia of material objects, independent of the propagation of light. Recall that Einstein’s 1905 paper states that if a pulse of light is emitted from an unaccelerated clock at time $t_1$, and is reflected off some distant object at time $t_2$, and is received back at the original clock at time $t_3$, then the inertial coordinate synchronization is given by stipulating that $t_2 = t_1 + \tfrac{1}{2}(t_3 - t_1)$. Reichenbach noted that the formally viable simultaneity conventions correspond to the assumption $t_2 = t_1 + \varepsilon (t_3 - t_1)$, where $\varepsilon$ is any constant in the range from 0 to 1. This describes the same class of “elliptical speed” conventions as discussed above, with $\varepsilon = (k+1)/2$ where k ranges from -1 to +1. The corresponding coordinate transformation is a simple time skew, i.e., x’ = x, y’ = y, z’ = z, t’ = t + kx/c. This describes the essence of the Lorentzian “absolutist” interpretation of special relativity. Beginning with the putative absolute rest frame inertial coordinates x,t, Lorentz associates with each state of motion v a system of coordinates x’,t’ related to x,t by a Galilean transformation with parameter v. In other words, x’ = x – vt and t’ = t. He then re-scales the x’,t’ coordinates to account for what he regards as the physical contraction of the lengths of stable objects and the slowing of the durations of stable physical processes, to arrive at the coordinates $x'' = x'/\gamma$ and $t'' = \gamma t'$ where $\gamma = (1 - v^2/c^2)^{1/2}$. These he regards as the proper rest frame coordinates for objects moving with speed v in terms of the absolute frame. There is nothing logically unacceptable about these coordinate systems, but we must realize that they do not constitute inertial coordinate systems in the full sense.
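The character of these Lorentzian rest frame coordinates can be seen by computing light speeds in them. The sketch below is an illustration under stated assumptions: units with c = 1, an illustrative speed v = 0.6, and the re-scaling taken as x'' = x'/g and t'' = g t' with g = (1 - v^2/c^2)^(1/2). The function names are hypothetical.

```python
import math

C = 1.0

def lorentzian_coords(t, x, v):
    # Galilean step x' = x - v t, t' = t, followed by the re-scaling
    # x'' = x'/g, t'' = g t', with g = (1 - v^2/c^2)^(1/2).
    g = math.sqrt(1.0 - (v / C) ** 2)
    x1, t1 = x - v * t, t
    return g * t1, x1 / g

def one_way_speed(v, direction):
    # Speed, in the Lorentzian coordinates, of a light pulse that leaves the
    # origin of the absolute frame in the given direction (+1 or -1).
    t2, x2 = lorentzian_coords(1.0, direction * C, v)
    return abs(x2 / t2)

def reichenbach_epsilon(v_out, v_back):
    # Fraction of the round-trip time spent on the outbound leg.
    t_out, t_back = 1.0 / v_out, 1.0 / v_back
    return t_out / (t_out + t_back)

v = 0.6
forward = one_way_speed(v, +1)     # expected c / (1 + v/c)
backward = one_way_speed(v, -1)    # expected c / (1 - v/c)
round_trip = 2.0 / (1.0 / forward + 1.0 / backward)
epsilon = reichenbach_epsilon(forward, backward)

print(forward, backward, round_trip, epsilon)
```

The one-way speeds are unequal, the round-trip speed is still c, and the resulting epsilon equals (1 + v/c)/2, so these coordinates correspond to an elliptical-speed convention whose parameter k has magnitude v/c.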
Mechanical inertia and the speed of light are not isotropic in terms of such coordinates, precisely because the time foliation (i.e., the simultaneity convention) is skewed relative to the ε = 1/2 convention. If we begin with the inertial rest frame coordinates for the state of motion v (which Lorentz and Einstein agree are related to the putative absolute rest frame coordinates by a Lorentz transformation), and then apply the time skew transformation t’ = t + kx/c with parameter k = v/c, we arrive at these Lorentzian rest frame coordinates. Needless to say, our choice of coordinate systems does not affect the outcome of any physical measurement, except that the outcome will be expressed in different terms. For example, by the Einsteinian convention the speed of light is isotropic in terms of the rest frame coordinates of any material object, whereas by the Lorentzian convention it is not. This difference is simply due to different definitions of “rest frame coordinates”. If we specify inertial coordinate systems (i.e., coordinates in terms of which inertia is isotropic and Newton’s laws are quasi-statically valid) then there is no ambiguity, and both Lorentz and Einstein agree that the speed of light is isotropic in terms of all inertial coordinate systems. In subsequent sections we’ll see that the standard formalism of general relativity provides a convenient means of expressing the relations between spacetime events with respect to a larger class of coordinate systems, so it may appear that inertial references are less significant in the general theory. In fact, Einstein once hoped that the general theory would not rely on the principle of inertia as a primitive element. However, this hope was not fulfilled, and the underlying physical basis of the spacetime manifold in general relativity remains the set of primitive inertial paths (geodesics) through spacetime. 
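The relation between the Lorentzian rest frame coordinates and the inertial ones can be verified directly. The sketch below assumes one spatial dimension, units with c = 1, and the skew written as t’ = t + kx/c; with that sign convention the two constructions agree for k = v/c (the sign would flip if the skew were written with a minus sign). The function names and the variable `root`, standing for the factor (1 − v²/c²)^(1/2), are illustrative assumptions.

```python
import math

C = 1.0  # speed of light, in units where c = 1 (an assumption for convenience)


def lorentzian_coords(x, t, v, c=C):
    """Galilean transformation followed by Lorentz's re-scaling."""
    root = math.sqrt(1.0 - v * v / (c * c))
    x_gal, t_gal = x - v * t, t          # x' = x - vt, t' = t
    return x_gal / root, t_gal * root    # x'' = x'/root, t'' = root*t'


def lorentz_transform(x, t, v, c=C):
    """Standard Lorentz transformation to the inertial rest frame of v."""
    root = math.sqrt(1.0 - v * v / (c * c))
    return (x - v * t) / root, (t - v * x / (c * c)) / root


def time_skew(x, t, k, c=C):
    """Time skew x' = x, t' = t + k*x/c."""
    return x, t + k * x / c


if __name__ == "__main__":
    v = 0.6
    for event in [(1.0, 0.0), (0.0, 1.0), (2.5, -1.5), (-3.0, 4.0)]:
        direct = lorentzian_coords(*event, v)
        via_skew = time_skew(*lorentz_transform(*event, v), k=v / C)
        print(event, direct, via_skew)

    # One-way light speeds in the Lorentzian coordinates (rays through the origin)
    x_fwd, t_fwd = lorentzian_coords(C * 1.0, 1.0, v)
    x_bwd, t_bwd = lorentzian_coords(-C * 1.0, 1.0, v)
    print("forward speed:", x_fwd / t_fwd, "backward speed:", -x_bwd / t_bwd)
```

With these assumptions the forward and backward light speeds in the Lorentzian coordinates come out as c/(1 + v/c) and c/(1 − v/c), the k = v/c member of the elliptical family, whereas in the Lorentz-transformed coordinates both are c.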
Not only do these inertial paths determine the equivalence class of allowable coordinate systems (up to diffeomorphism), it even remains true that at each event we can construct a (local) system of inertial coordinates with respect to which the speed of light is c in all directions. Thus the empirical fact of lightspeed invariance and isotropy with respect to inertial coordinates remains as a primitive component of the theory. The difference is that in the general theory the convention of using inertial coordinates is less prevalent, because in general there is no single global inertial coordinate system, and non-inertial coordinate systems are often more convenient on a curved manifold.

4.6 The Field of All Fields

Classes and concepts may be conceived as real objects, existing independently of our definitions and constructions. It seems to me that the assumption of such objects is quite as legitimate as the assumption of physical bodies, and there is quite as much reason to believe in their existence.
Kurt Gödel, 1944

Where is the boundary between the special and general theories of relativity? It is sometimes said that any invocation of "general covariance" implies general relativity, but just about any theory can be expressed in a generally covariant form, so this doesn't even distinguish between general relativity and Newtonian physics, let alone special relativity. For example, it's perfectly possible to simply transform the special relativistic solution of a rotating platform into some arbitrary accelerated coordinate system, and although the result is ugly, it is no less (or more) valid than when it was expressed in terms of non-accelerating coordinates, because the transformation from one stipulated set of coordinates to another has no physical content. The key word there is "stipulated", because the real difference between the special and general theories is in what they take for granted. 
In a sense, special relativity is analogous to "naive set theory" in mathematics. By this I mean that special relativity is based on certain plausible-sounding premises which actually are quite serviceable for treating a wide class of problems, but which on close examination are susceptible to self-referential antinomies. This is most evident with regard to the assumption of the identifiability of inertial frames. As Einstein remarked, "in the special theory of relativity there is an inherent epistemological defect", namely, that the preferred class of reference frames on which the theory relies is circularly defined. Special relativity asserts that the lapse of proper time between two (timelike-separated) events is greatest along the inertial worldline connecting those two events - a seemingly interesting and useful assertion - but if we ask which of the infinitely many paths connecting those two events is the "inertial" one, we can only answer that it is the one with the greatest lapse of proper time. If we simply accept this uncritically, and are willing to naively rely on the testimony of accelerometers as unambiguous indicators of "inertia", we have a fairly solid basis on which to do physics, and we can certainly work out correct answers to many questions. However, the epistemological defect was worrisome to Einstein, and caused him (in a remarkably short time) to abandon special relativity and global Lorentz invariance as a suitable conceptual framework for the formulation of physics. The naive reliance on accelerometers as unambiguous indicators of global inertia in the context of special relativity is immediately undermined by the equivalence principle, because we're then required to predicate any application of special relativity on the absence (or at least the negligibility) of irreducible gravitational fields, and this condition is simply not verifiable within special relativity itself, because of the circularity in the principle of inertia. 
This circularity genuinely troubled Einstein, and was one of the major motivations (along with the problem of reconciling mass-energy equivalence with the Equivalence Principle) that led him to abandon special relativity. Given the recognized limitations of special relativity, and considering how successfully it was generalized and extended in 1916, we may wonder why it's even necessary to continue carrying along the special theory as a conceptually distinct entity. Will this duality persist indefinitely, or will we eventually just say there is a single theory of relativity (the theory traditionally called general relativity), which subsumes and extends the earlier theory called special relativity? The reluctance to discard the special theory as a separate theory may be due largely to the fact that it represents a simple and widely applicable special case of the general theory, and it's convenient to have a name for this limiting case. (There are, however, many cases in which the holistic approach of the general theory is actually much simpler than the traditional special-theory-plus-general-corrections approach.) Another reason that's sometimes mentioned is the (remote) possibility that Einstein's general relativity is not the "right" generalization/extension of the special theory. For example, if observation were ever to conclusively rule out the existence of gravitational waves (which is admittedly hard to imagine in view of the available binary star data), it might be necessary to seek another framework within which to place the special theory. In this sense, we might regard special relativity as roughly analogous to set theory without the axiom of choice, i.e., a restricted and less ambitious theory that avoids making use of potentially suspect concepts or premises. However, it's hard to say exactly which of the fundamental principles of general relativity is considered to be suspect. 
We've seen that "general covariance" is a property of almost any theory, so that can't be a problem. We might doubt the equivalence principle in one or more of its various flavors, but it happens to be one of the most thoroughly tested principles in physics. It seems most likely that if general relativity fails, it would be because one or more of its "simplicities" is inappropriate. For example, the restriction to 2nd order, or the assumption of Riemannian metrics rather than, say, Finsler metrics, or the naive assumption of R4 topology, or maybe even the basic assumption of a continuum. Still, each of these would also have conceptual implications for the special theory, so these aren't valid reasons for continuing to regard special relativity as a separate theory. Suppose we naively superimpose special relativity on Newtonian physics, and adopt a naive definition of "inertial worldline", such as a worldline with no locally sensible acceleration. On that basis we find that there can be multiple distinct "inertial" worldlines connecting two given events (e.g., intersecting elliptical orbits of different eccentricities), which conflicts with the special relativistic principle of a unique inertial interval between any pair of timelike separated events. To press the antinomy analogy further, we could arrange to have special relativity conclude that each of these worldlines has a lesser lapse of proper time than each of the others. (If the barber shaves everyone who doesn't shave himself, who shaves the barber?) Of course, with special relativity (as with set theory) we can easily block such specific conundrums - once they are pointed out - by imposing one or more restrictions on the definition of "inertial" (or the definition of a "set"), and in so doing we make the theory somewhat less naive, but the experience raises legitimate questions about whether we can be sure we have blocked all possible escapes. 
We shouldn't push the analogy too far, since there are obvious differences between a purely mathematical theory and a physical theory, the latter being exposed to potential conflict with a much wider class of "external" constraints (such as the requirement to possess a consistent mapping to a representation of experience). However, when considering naive set theory's assumption of the existence of sets, and its assertions about how to manipulate and reason with sets, all in the absence of a comprehensive criterion of how to identify what can legitimately be called a set, there is an interesting parallel with special relativity's assumption of the existence of inertial frames and how to reason with them and in them, all in the absence of a comprehensive framework for deciding what does and what does not constitute an inertial frame. It might be argued that relativity is a purely formalistic theory, which simply assumes an inertial frame is specified, without telling how to identify one. Certainly we can completely insulate special relativity from any and all conflict by simply adopting this strategy, i.e., asserting that special relativity avers no mapping at all between its elements and the objects of our experience. However, although this strategy effectively blocks conflict, it also renders the theory quite unfalsifiable and phenomenologically otiose. Even recognizing the distinction between logical inconsistency and empirical falsification, we must also remember that the rules of logic and reason are ultimately grounded in "observations", albeit of a very abstract nature, and mathematical theories no less than physical theories are attempts to formalize "observations". As such, they are comparably subject to upset when they're found to conflict with other observations (e.g., barbers, gravity, etc.). 
It might be argued that we cannot really attribute any antinomies to special relativity, because the cases noted above (multiply intersecting elliptical orbits, etc) arise only from attempting to apply special relativistic reasoning to a class of entities for which it is not suited. However, the same is true of naive set theory, i.e., it works perfectly well when applied to a wide class of sets, but leads to logically impossible conclusions if we attempt to apply it to a class of sets that "act on themselves"... just as gravity is found to act on itself in the general theory. In a real sense, gravity in general relativity is a self-referential phenomenon, as revealed by the non-linearity of the field equations. Notice that our antinomies in the special theory arise only when trying to reason with "self-referential inertial frames", i.e., in the presence of irreducible gravitational fields. The basic point is that although special relativity serves as the local limiting case of the general theory, it is not able to stand alone, because it cannot identify the applicability of its premises, which renders it incapable of yielding definite macroscopic conclusions about the physical world. By placing all the necessary indefinite qualifiers on the scope of applicability, we effectively remove special relativity from the set of physical theories. This just re-affirms the point that any application of special relativity is, strictly speaking, legitimized only within the context of the general theory, which provides the framework for assessing the validity of the application. One can, of course, still practice the special theory from a naive standpoint, and be quite successful at it, just as one can practice naive set theory without running into trouble very often. Naturally none of this implies that special relativity, by itself, is unfalsifiable. 
Indeed it is falsifiable, but only when superimposed on some other framework (such as Newtonian physics) and combined with some auxiliary assumptions about how to identify inertial frames. In fact, the special theory of relativity is not only falsifiable, it is falsified, and was superseded in 1916 by a superior and more comprehensive theory. Nevertheless, strict epistemological scruples don't have a great deal of relevance to the actual day-to-day practice of science. From a more formal standpoint, it's interesting to consider the correspondence between the foundations of set theory and the theories of relativity. The archetypal example of a problematic concept in naive set theory was the notion of the "set of all sets". It soon became apparent to Cantor, Russell, and other mathematicians that this plausible-sounding notion could not consistently be treated as a set in the usual sense. The problem was recognized to be the self-referential nature of the concept. We can compare this to the general theory of relativity, which is compelled by the equivalence principle to represent the metric of spacetime as (so to speak) "the field of all fields". To make this more precise, recall that Newtonian gravity can be represented by a scalar field φ defined over a pre-existing metrical space, whose metric we may denote as g. The vacuum field equation is Lg(φ) = 0 where Lg signifies the Laplacian operator over the space with the fixed metric g. In general relativity the Laplacian is replaced by a more complicated operator Rg which, like the Laplacian, is effectively a differential operator whose components are evaluated on the spacetime with the metric g. However, in general relativity the field on which Rg operates is nothing but the spacetime metric g itself. In other words, the vacuum field equations are Rg(g) = 0. The entity Rg(g) is called the Ricci tensor in differential geometry, usually denoted in covariant form as Rμν. 
This highlights the essentially self-referential nature of the Einstein field equations, as opposed to the Newtonian field equations where the operator and the field being operated on are completely independent entities. It's interesting to compare this situation to schematic representations of Gödel's formalization of arithmetic, leading to his proof of the Incompleteness Theorem. Given a well-defined mapping between single-variable propositional statements and the natural numbers (which Gödel showed is possible, though far from trivial), let Pn(w) denote the nth statement applied to the variable w. Since every possible proposition maps to some natural number, there is a natural number k such that Pk(w) represents the proposition that Pw(w) has no proof. But then what happens if we set the variable w equal to k? We see that Pk(k) represents the proposition that there is no proof of Pk(k), from which it follows that if there is no proof of Pk(k) then Pk(k) is true, whereas if there is a proof of Pk(k) then Pk(k) is false. Hence, assuming our system of arithmetic is self-consistent, so that it doesn't contain proofs of false propositions, we must conclude that Pk(k) is true but unprovable. Obviously the negation of Pk(k) must also be unprovable, assuming our arithmetic is consistent, so the proposition is strictly undecidable within the formal system encoded by our numbering scheme. The analogy between Gödel propositions Pk(k) and the field equations of general relativity Rg(g) = 0 should not be pressed too far, but it does hint at the real and profound subtleties that can arise when we allow self-referential statements. It's interesting that Einstein seems to have been mindful very early of the eventual necessity of such statements, although he deferred it for quite some time. 
Prior to 1905 many physicists were attempting to construct a purely electromagnetic theory of matter based on Maxwell's equations, according to which "the particle would be merely a domain containing an especially high density of field energy". However, in presenting the special theory of relativity Einstein carefully avoided proposing any particular theory as to the ultimate structure of matter, and showed that a purely kinematical interpretation could account for the relation between energy and inertia. He took this approach not because he was uninterested in the nature of matter, but because he recognized immediately that Maxwell's equations did not permit the derivation of the equilibrium of the electricity that constitutes a particle. Only different, nonlinear field equations could possibly accomplish such a thing. But no method existed for discovering such field equations without deteriorating into adventurous arbitrariness. So in 1905 Einstein took the more conservative route and merely(!) redefined the traditional concepts of time and space. A few years later he himself embarked on an adventure leading ultimately in 1915 to the non-linear field equations of general relativity, but even in this he managed to make important progress by sidestepping again the question of the ultimate constituency of matter and light. As he recalled in his Autobiographical Notes: "It seemed hopeless to me at that time to venture the attempt of representing the total field [as opposed to the pure gravitational field] and to ascertain field laws for it. I preferred, therefore, to set up a preliminary formal frame for the representation of the entire physical reality; this was necessary in order to be able to investigate, at least preliminarily, the effectiveness of the basic idea of general relativity." In his later years it seems Einstein had decided he had made all the progress that could be made on this preliminary basis, and set about the attempt to represent the total field. 
He wrote the above comments in 1949, after a quarter-century of fruitless efforts to discover the non-linear equations for the "total field", including electromagnetism and matter, so he knew only too well the risks of deteriorating into adventurous arbitrariness.

4.7 The Inertia of Twins

We have no direct intuition of simultaneity, nor of the equality of two durations. People who believe they possess this intuition are dupes of an illusion... The simultaneity of two events, the order of their succession, and the equality of two durations, are to be so defined that the enunciation of the natural laws may be as simple as possible.
Poincaré, The Value of Science, 1905

The most commonly discussed "paradox" associated with the theory of relativity concerns the differing lapses of proper time along two different paths between two fixed events. This is often expressed in terms of a pair of twins, one moving inertially from event A to event B, and the other moving inertially from event A to an intermediate event M, where he changes his state of motion, and then moves inertially from M to B, where it is found that the total elapsed time of the first twin exceeds that of the second. Much of the popular confusion over this sequence of events is simply due to specious reasoning. For example, if x,t and x',t' denote inertial rest frame coordinates respectively of the first and second twin (on either the outbound or inbound leg of his journey), some people are confused by the elementary fact that if those two coordinate systems are related according to the Lorentz transformation, then the partials (∂t'/∂t)_x and (∂t/∂t')_x' both have the same value. (For example, the unfortunate Herbert Dingle spent his retirement years on a pitiful crusade to convince the scientific community that those two partial derivatives must be the reciprocals of each other, and that therefore special relativity is logically inconsistent.) 
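The elapsed times of the two twins, and the equality of the two partial derivatives, can be illustrated with a short calculation. This is a minimal sketch in one spatial dimension with c = 1; the particular events A, M, B, the boost speeds, and the function names are illustrative assumptions, and the inertial status of the coordinates is simply stipulated.

```python
import math

C = 1.0  # speed of light, in units where c = 1 (an assumption for convenience)


def proper_time(events, c=C):
    """Proper time along a piecewise-inertial path through (x, t) events."""
    total = 0.0
    for (x0, t0), (x1, t1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        total += math.sqrt(dt * dt - dx * dx / (c * c))
    return total


def boost(event, v, c=C):
    """Lorentz transformation of an event (x, t) to a frame moving at v."""
    x, t = event
    gamma = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    return gamma * (x - v * t), gamma * (t - v * x / (c * c))


def dtprime_dt_fixed_x(v, x=1.0, t=2.0, h=1.0):
    """(dt'/dt) at fixed x; exact for any h because the map is linear."""
    return (boost((x, t + h), v)[1] - boost((x, t), v)[1]) / h


def dt_dtprime_fixed_xprime(v, xp=1.0, tp=2.0, h=1.0):
    """(dt/dt') at fixed x', using the inverse transformation (boost by -v)."""
    return (boost((xp, tp + h), -v)[1] - boost((xp, tp), -v)[1]) / h


if __name__ == "__main__":
    A, M, B = (0.0, 0.0), (4.0, 5.0), (0.0, 10.0)  # traveler moves at 0.8c
    print("first twin :", proper_time([A, B]))
    print("second twin:", proper_time([A, M, B]))
    for u in (0.3, -0.5, 0.9):  # same paths described in other inertial frames
        Ab, Mb, Bb = boost(A, u), boost(M, u), boost(B, u)
        print(f"boost {u:+.1f}:", proper_time([Ab, Bb]), proper_time([Ab, Mb, Bb]))
    print("partials at v = 0.8:", dtprime_dt_fixed_x(0.8), dt_dtprime_fixed_xprime(0.8))
```

In this example the first twin's elapsed time is 10 and the second's is 6 in every frame tried, and both partial derivatives equal 1/(1 − v²/c²)^(1/2) rather than being reciprocals of each other, since they are taken with different variables held fixed.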
Other people struggle with the equally elementary algebraic fact that the proper time along any given path between two events is invariant under arbitrary Lorentz transformations. The inability to grasp this has actually led some eccentrics to waste years in a futile effort to prove special relativity inconsistent by finding a Lorentz transformation that does not leave the proper time along some path invariant. Despite the obvious fallacies underlying these popular confusions, and despite the manifest logical consistency of special relativity, it is nevertheless true that the so-called twins paradox, interpreted in a more profound sense, does highlight a fundamental epistemological shortcoming of the principle of inertia, on which both Newtonian mechanics and special relativity are based. Naturally if we simply stipulate that one of the twins is in inertial motion the entire time and the other is not, then the resolution of the "paradox" is trivial, but the stipulation of "inertial motion" for one of the twins begs the very question that motivates the paradox (in its more profound form), namely, how are inertial worldlines distinguished from the set of all possible worldlines? In a sense, the only answer special relativity can give is that the inertial worldline between two events is the one with the greatest lapse of proper time, which is clearly of no help in resolving which of the twins' worldlines is "inertial", because we don't know a priori which twin has the greater lapse of proper time - that's what we're trying to determine! This circularity in the definition of inertia and the inability to justify the privileged position held by inertial worldlines in special relativity were among the problems that led Einstein in the years following 1905 to seek a broader and more coherent context for the laws of physics. The same kind of circular reasoning arises whenever we critically examine the concept of inertia. 
For example, when trying to decide if our region of spacetime is really flat, so that "straight lines" exist, we face the same difficulty. As Einstein said: The weakness of the principle of inertia lies in this, that it involves an argument in a circle: a mass moves without acceleration if it is sufficiently far from other bodies; we know that it is sufficiently far from other bodies only by the fact that it moves without acceleration. We could equally well substitute [has the greatest lapse of proper time] for [is sufficiently far from other bodies]. In either case the point is the same: special relativity postulates the existence of inertial frames and assigns to them a preferred role, but it gives no a priori way of establishing the correct mapping between this concept and anything in reality. This is what Einstein was referring to when he said "In classical mechanics, and no less in the special theory of relativity, there is an inherent epistemological defect...". He illustrates this with a famous thought experiment involving two relatively spinning globes, discussed in Chapter 4.1. (The term "thought experiment" might be regarded as an oxymoron, since the epistemological significance of an experiment is its empirical quality, which a thought experiment obviously doesn't possess. Nevertheless, it's undeniable that scientists have made good use of this technique - along with occasionally making bad use of it.) The puzzling asymmetry of the spinning globes is essentially just another form of the twins paradox, where the twins separate and re-converge (one accelerates away and back while the other remains stationary), and they end up with asymmetric lapses of proper time. How can the asymmetry be explained? In 1916 Einstein thought that The only satisfactory answer must be that the physical system consisting of S1 and S2 reveals within itself no imaginable cause to which the differing behavior of S1 and S2 can be referred. 
The cause must therefore lie outside the system. We have to take it that the general laws of motion...must be such that the mechanical behavior of S1 and S2 is partly conditioned, in quite essential respects, by distant masses which we have not included in the system under consideration. It should be noted that the strongly Machian attitude conveyed by this passage was subsequently tempered for Einstein when he realized that in the general theory of relativity it may be necessary to attribute the "essential conditioning" to boundary conditions rather than distant masses. Nevertheless, this quotation serves to demonstrate how seriously Einstein took the question, which, of course, is as applicable to the twins paradox as it is to the two-globe paradox. The above “weighty argument from the theory of knowledge” was the first reason cited by Einstein (in 1916) for the need to go beyond special relativity in order to arrive at a suitable conceptual framework. The second reason was the apparent impossibility of doing justice, within the context of special relativity, to the equivalence principle relating gravitation and acceleration. The first of these reasons bears most directly on the twins paradox, although the problem of reconciling acceleration with gravity inevitably enters the picture as well, since we can't avoid the issue of gravitation as soon as we contemplate acceleration assuming we accept the equivalence principle. From these considerations it’s clear that special relativity could never have been more than a transitional theory, since it was not comprehensive enough to justify its own conclusions. The question of whether general relativity is required to resolve the twins paradox has long been a subject of spirited debate. 
On one hand, Einstein wrote a paper in 1918 to explain how the general theory accounts for the asymmetric aging of the twins by means of the “gravitational fields” that appear with respect to accelerated coordinates attached to the traveling twin, and Max Born recounted this analysis in a popular book, concluding that "the clock paradox is due to a false application of the special theory of relativity, namely, to a case in which the methods of the general theory should be applied". On the other hand, many people object vigorously to any suggestion that special relativity is inadequate to satisfactorily resolve the twins paradox. Ultimately the answer depends on what sort of satisfaction is being sought, viz., on whether the paradox is being presented as a challenge to the consistency of special relativity (as is Dingle's fallacy) or to the completeness of special relativity. If we're willing to accept uncritically the existence and identifiability of inertial frames, and their preferred status, and if we are willing to exclude any consideration of gravity or the equivalence principle, then we can reduce the twins paradox to a trivial exercise in special relativity. However, if it is the completeness (rather than the consistency) of special relativity that is at issue, then the naive acceptance of inertial frames is precisely what is being challenged. In this context, we can hardly justify the exclusion of gravitation, considering that the very same metrical field which determines the inertial worldlines also represents the gravitational field. Notice that the typical statement of the twins paradox does not stipulate how the galaxies in the universe along with the cosmological boundary conditions that determine the metrical field are dynamically configured relative to the twins. If every galaxy in the universe were “moving” in tandem with the "traveling twin", which (if either) of the twins' reference frames would be considered inertial? 
Obviously special relativity is silent on this point, and even general relativity does not give an unequivocal answer. Weinberg asserts that "inertial frames are determined by the mean cosmic gravitational field, which is in turn determined by the mean mass density of the stars", but the second clause is not necessarily true, because the field equations generally require some additional information (such as boundary conditions) in order to yield definite results. The existence of cosmological models in which the average matter of the universe rotates (a fact proven by Kurt Gödel) shows that even general relativity is incomplete, in the sense that it is subject to global conditions with considerable freedom. General relativity may not even give a unique field for a given (non-spherically symmetric) set of boundary conditions and mass distribution, which is not surprising in view of the possibility of gravitational waves. Thus even if we sharpen the statement of the twins paradox to specify how the twins are moving relative to the rest of the matter in the universe, the theory of relativity still doesn't enable us to say for sure which twin is inertial. Furthermore, once we recognize that the inertial and gravitational field are one and the same, the twins paradox becomes even more acute, because we must then acknowledge that within the theory of relativity it's possible to contrive a situation in which two identical clocks in identical local circumstances (i.e., without comparing their positions to any external reference) can nevertheless exhibit different lapses in proper time between two given events. The simplest example is to place the twins in intersecting orbits, one circular and the other highly elliptical. Each twin is in freefall continuously between their periodic meetings, and yet they experience different lapses of proper time. Thus the difference between the twins is not a consequence of local effects; it is a global effect. 
At any point along those two geodesic paths the local physics is identical, but the paths are embedded differently within the global manifold, and it is the different embedding within the manifold that accounts for the difference in proper length. (The same point can be made by referring to a flat cylindrical spacetime.) This more general form of the twins paradox compels us to abandon the view that physical phenomena are governed solely by locally sensible influences. (Notice, however, that we are forced to this conclusion not by logical contradiction, but only by our philosophical devotion to the principle of sufficient cause, which requires us to assign like physical causes to like physical effects.) Likewise the identification of gravity with local spacetime curvature is untenable, as shown by the fact that a suitable arrangement of gravitating masses can produce an extended region of flat spacetime in which the metrical field is nevertheless accelerating in the global sense, and we surely would not regard such a region as free of gravitation. It is fundamentally misguided to exercise such epistemological concerns within the framework of special relativity, because special relativity was always a provisional theory with recognized epistemological shortcomings. As mentioned above, one of Einstein's two main reasons for abandoning special relativity as a suitable framework for physics was the fact that, no less than Newtonian mechanics, special relativity is based on the unjustified and epistemologically problematical assumption of a preferred class of reference frames, precisely the issue raised by the twins paradox. Today the "special theory" exists only (aside from its historical importance) as a convenient set of widely applicable formulas for important limiting cases of the general theory, but the phenomenological justification for those formulas can only be found in the general theory. 
This is true even if we posit the absence of gravitational effects, because the question at issue is essentially the origin of inertia, i.e., why one worldline is inertial while another is not, and the answer unavoidably involves the origin and significance of the background metric, even in the absence of curvature. The special theory never claimed, and was never intended, to address such questions. The general theory attempts to provide a coherent framework within which to answer such questions, but it's not clear whether the attempt is successful. The only context in which general relativity can give (at least arguably) a complete explanation of inertia is a closed, finite, unbounded cosmology, but the observational evidence doesn't (at present) clearly support this hypothesis, and any alternative cosmology requires some principle(s) outside of general relativity to determine the metrical configuration of the universe. Thus the twins paradox is ultimately about the origin and significance of inertia, and the existence of a definite metrical structure with a preferred class of worldlines (geodesics). In the general theory of relativity, spacetime is not simply the totality of all the relations between material objects. The spacetime metric field is endowed with its own ontological existence, as is clear from the fact that gravity itself is a source of gravity. In a sense, the non-linearity of general relativity is an expression of the ontological existence of spacetime itself. In this context it's not possible to draw the classical distinction between relational and absolute entities, because spatio-temporal relations themselves are active elements of the theory. We should also mention another common objection to the relativistic treatment of the twins, based not on any empirical disagreement, but on linguistic and metaphysical preferences. 
It is pointed out that we can, without logical contradiction, posit the existence of a unique, absolute, and true metaphysical time at every location, and we can account for the differences between the elapsed times on clocks that have followed different paths simply by stipulating that the rate of a clock depends on its absolute state of motion (defined relative to, for instance, the local frame in which the presumably global cosmic background radiation is maximally isotropic). Indeed this was essentially the view advocated by Lorentz. However, as discussed at the end of Section 1.5, postulating a metaphysical “truth” along with whatever physical laws are necessary to account for why the observed facts differ from the postulated “truth” is not generally useful, except as a way of artificially reconciling our experience with any particular metaphysical truth that we might select. The relativistic point of view is based on purely local concepts, such as that of an “ideal clock” corrected for all locally sensible conditions, recommended to us by the empirical fact that all observable aspects of local physical phenomena, including the rates of temporal progression, exhibit the same dependence on their state of inertial motion (which is not a locally sensible condition). This is the physical symmetry presented to us, and we are certainly justified in exploiting this symmetry to simplify and clarify the enunciation of physical laws.

4.8 The Breakdown of Simultaneity

I have yielded: Instruct my daughter how she shall persever, that time and place with this deceit so lawful may prove coherent.
William Shakespeare, 1603

We've seen how the operational time convention enables us to define surfaces of simultaneity with respect to any given inertial frame. However, if we try to apply this procedure to a set of accelerating bodies the concept breaks down. The problem is illustrated in the spacetime diagram shown below. 
This drawing shows a family of worldlines, each having the identical history of velocity as a function of time relative to the inertial coordinates. By sending light beams back and forth to its neighboring worldlines, an observer following path B can determine that he is equidistant from A and C. Likewise an observer on C is equidistant between B and D, and an observer on D is equidistant from C and E. However, due to the change in velocity of these worldlines, an observer on C can not conclude that he is equidistant from A and E. This breakdown of the well-defined locus of simultaneity is unavoidable in accelerating systems, because the operational procedure defining simultaneity involves a non-zero lapse of time for spatially separate objects, so the simultaneity relations change during the performance of the procedure. Of course, the greater the distance between objects, the greater the change in velocity (and simultaneity relations) during the performance of a synchronization procedure. Another illustration of this problem is shown below, where the instantaneous loci of simultaneity of an abruptly accelerated worldline are seen to intersect each other (on the left), so that a given distant event is assigned multiple times of occurrence. Furthermore, events in the region "R" on the right do not properly correspond to any time according to the accelerating worldline's instantaneous inertial time, because at the instant of acceleration his locus of simultaneity jumps abruptly. Obviously any amount of relative "skew" between the planes of simultaneity for a given worldline will result in interference at some distance, producing non-unique time coordinates. 
However, if the velocity of our worldline varies continuously (instead of abruptly), then for some limited region the planes of simultaneity will be advancing forward in time faster than they are "tilting" backwards, so over this limited region we can, if we choose, make use of these planes of simultaneity for the time labels of events. This situation is illustrated below. We can easily determine the approximate limit for unique time labels with this kind of coordinate system by noting that if the velocity changes by an amount $dv$ (that is, by $dv/c$ as a fraction of the speed of light) during a time interval $dt$, then the relative slope of the new plane of simultaneity is $c/dv$, so it intersects the original plane of simultaneity at a distance $dx = (c\,dt)(c/dv) = c^2/(dv/dt)$. Since $a = dv/dt$ is the acceleration, we can estimate that this accelerating system of coordinates is coherent out to distances on the order of $c^2/a$. As an example of the use of accelerating coordinate systems and the breakdown of inertial simultaneity, consider a circular Sagnac device as described in Section 2.7. As we've seen, each point on the rim of the rotating disk can be associated with an instantaneously co-moving inertial coordinate system, each with its own surfaces of simultaneity. However, since each point of the disk is accelerating with respect to each other point, there is no coherent simultaneity (in the inertial sense) shared by any two points. If we analytically continue the local simultaneity from one point to the next around the perimeter, the result is an open helical surface as indicated below: The worldline of a particular point on the rim is shown by the helical curve AB, and the shallower helix represents the analytically continued surface of inertial simultaneity. (It's interesting to compare this construction with Riemann surfaces in complex function analysis.) 
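To give a feel for the size of the coherence range $c^2/a$ estimated earlier in this passage, the following is a minimal sketch in SI units. The function names and the choice of one standard Earth gravity as the sample acceleration are illustrative assumptions, not part of the original text.

```python
C_M_PER_S = 299_792_458.0          # speed of light (exact SI definition)
LIGHT_YEAR_M = 9.4607304725808e15  # one Julian light-year in metres


def crossing_distance(dv, dt):
    """Distance dx = (c dt)(c/dv) at which the plane of simultaneity after a
    velocity change dv (over time dt) intersects the original plane."""
    return (C_M_PER_S * dt) * (C_M_PER_S / dv)


def coherence_distance(a):
    """Order-of-magnitude range c^2/a over which the accelerating
    coordinates assign unique time labels."""
    if a <= 0:
        raise ValueError("acceleration must be positive")
    return C_M_PER_S ** 2 / a


if __name__ == "__main__":
    g = 9.80665  # assumed sample value: one standard Earth gravity, m/s^2
    d = coherence_distance(g)
    print(f"c^2/a = {d:.3e} m = {d / LIGHT_YEAR_M:.2f} light-years")
```

For an acceleration of one Earth gravity the range comes out just under one light-year, which is why the breakdown of simultaneity is imperceptible in everyday accelerated frames.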
Of course, we can dispense with the use of local inertial simultaneity to define our constant-t coordinate surfaces, and simply define an arbitrary system of space and time coordinates in terms of which a rotating disk is stationary (for example), but we then must be careful to correctly account for non-inertial aspects of these accelerating coordinates, particularly with regard to the meanings of spatial lengths. The usual intuitive definition of the spatial length of an object (such as the perimeter of the rim) is the absolute length of a locus of inertially simultaneous points of that object, so it depends on the establishment of a slice of "inertial simultaneity" over the entire rim. If we use inertial coordinates this is easy, but if we use non-inertial coordinates (such as those in which the rotating disk is stationary), then no surface of inertial simultaneity coincides with our surfaces of constant time parameter. In fact, this is essentially the definition of non-inertial coordinates. So, we will obviously be unable to define a coherent locus of inertial simultaneity over the whole disk as a surface of constant time parameter when working with non-inertial coordinates. One consequence of this is the fact that the spatial length of a path becomes dependent on the speed of the path. We are accustomed to this for temporal lengths, i.e., the length of time around the rim might be 30 seconds or 2 hours or 1 nanosecond, etc., depending on how fast we are going relative to the disk, how fast the rim is spinning, in which direction it is spinning, and so on. Likewise the spatial length of a path around the rim (in terms of some particular coordinates) depends on the speed of the path. This shouldn't be surprising, because the decomposition of spacetime into separate spatial and temporal components is not unique, i.e., there are multiple equally self-consistent decompositions. 
Since this is often a source of confusion, it's worthwhile to describe how this works in detail. Let's first establish inertial cylindrical coordinates in 2+1 spacetime, using polar coordinates $(r,\theta)$ for the space (where $\theta$ is the angular coordinate), and $t$ for time. The metric in terms of these inertial coordinates is

$$(d\tau)^2 = (dt)^2 - (dr)^2 - r^2 (d\theta)^2$$

and for any fixed time $t$ the purely spatial metric is

$$(ds)^2 = (dr)^2 + r^2 (d\theta)^2$$

So, to find the "length" of any spacelike curve, such as the perimeter of a spinning disk of radius $r_d$ centered at the origin, we simply integrate $ds$ over this curve at the fixed value of $t$. For a circular disk, $r = r_d$ is constant, so $dr = 0$, and the spatial metric is simply $ds = r_d\,d\theta$, which we integrate from $\theta = 0$ to $2\pi$ to give the length $2\pi r_d$.

Now let's look at this situation in terms of a system of coordinates in which the spinning disk is stationary, i.e., such that a fixed point anywhere on the disk maintains constant spatial coordinates for all values of the temporal coordinate. Taking the most naive and simplistic approach, let's define the new coordinates $T, R, \Theta$ by the relations

$$T = t, \qquad R = r, \qquad \Theta = \theta + \omega t$$

where $\omega$ is a constant, denoting the angular speed of these coordinates with respect to the inertial $t, r, \theta$ coordinates. We also have the differentials

$$dT = dt, \qquad dR = dr, \qquad d\Theta = d\theta + \omega\,dt$$

Substituting these expressions into the metric equation gives

$$(d\tau)^2 = (1 - \omega^2 R^2)(dT)^2 + 2\omega R^2\,dT\,d\Theta - (dR)^2 - R^2 (d\Theta)^2$$

According to these coordinates, a spatial length $S$ must be given by integrating the absolute spacelike differential using the metric along some constant-$T$ surface, i.e., with $dT = 0$, where the metric is

$$(dS)^2 = (dR)^2 + R^2 (d\Theta)^2$$

Again for the perimeter of the disk we get $2\pi R_d = 2\pi r_d$. Notice that our constant-$T$ surfaces are also constant-$t$ surfaces, so this perimeter length agrees with our previous result, and of course it doesn't matter which direction we integrate around the perimeter. 
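The change of coordinates described above can be checked numerically. The sketch below assumes the transformation $T = t$, $R = r$, $\Theta = \theta + \omega t$ (the sign convention is inferred from the relation $dS/ds = 1 + \omega/W$ used later in this section) and units with $c = 1$. All function names are hypothetical helpers introduced for illustration.

```python
import math
import random


def interval_inertial(r, dt, dr, dtheta):
    """(d tau)^2 in inertial polar coordinates (t, r, theta), with c = 1."""
    return dt**2 - dr**2 - r**2 * dtheta**2


def to_rotating(omega, dt, dr, dtheta):
    """Differentials (dT, dR, dTheta) under the assumed Theta = theta + omega*t."""
    return dt, dr, dtheta + omega * dt


def interval_rotating(R, omega, dT, dR, dTheta):
    """(d tau)^2 expressed in the rotating coordinates (T, R, Theta)."""
    return ((1 - omega**2 * R**2) * dT**2
            + 2 * omega * R**2 * dT * dTheta
            - dR**2 - R**2 * dTheta**2)


def max_mismatch(trials=1000, seed=0):
    """Largest difference between the two forms over random displacements."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        r = rng.uniform(0.1, 5.0)
        omega = rng.uniform(-0.5, 0.5)
        dt, dr, dtheta = (rng.uniform(-1.0, 1.0) for _ in range(3))
        dT, dR, dTheta = to_rotating(omega, dt, dr, dtheta)
        diff = interval_inertial(r, dt, dr, dtheta) - interval_rotating(
            r, omega, dT, dR, dTheta)
        worst = max(worst, abs(diff))
    return worst


def perimeter(R_d, n=1000):
    """Sum dS = R_d * dTheta around the rim on a constant-T slice (dT = 0)."""
    dTheta = 2 * math.pi / n
    return sum(R_d * dTheta for _ in range(n))


if __name__ == "__main__":
    print("largest mismatch between the two metrics:", max_mismatch())
    print("perimeter for R_d = 2:", perimeter(2.0))
```

The two expressions for the interval agree to rounding error for arbitrary displacements, and the constant-$T$ perimeter reproduces $2\pi R_d$ independently of $\omega$.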
Incidentally, letting $v = \omega R_d$ denote the velocity of the rim with respect to the original inertial coordinates, the full spacetime metric for the rim ($R = R_d$) in terms of the rotating coordinates is

$$(d\tau)^2 = (1 - v^2)(dT)^2 + 2 v R_d\,dT\,d\Theta - R_d^2 (d\Theta)^2$$

For a point fixed on the rim we have $d\Theta = 0$, and so

$$d\tau = \sqrt{1 - v^2}\;dT$$

which confirms that the lapse of proper time for a point fixed on the rim of the rotating disk is $\sqrt{1 - v^2}$ times the lapse of $T$ (and therefore of $t$).

Now let's send light beams around the perimeter in opposite directions. For lightlike paths we have $d\tau = 0$, so the path of light must satisfy

$$(1 - v^2)(dT)^2 + 2 v R_d\,dT\,d\Theta - R_d^2 (d\Theta)^2 = 0$$

The purely spatial component is $dS = R_d\,d\Theta$, so we can make this substitution and divide both sides by $(dT)^2$ to give

$$(1 - v^2) + 2v\,\frac{dS}{dT} - \left(\frac{dS}{dT}\right)^2 = 0$$

The quantity $dS/dT$ is the "speed of light" in terms of these rotating non-inertial coordinates. Also, from the definitions we have

$$\frac{d\Theta}{dT} = \frac{d\theta}{dt} + \omega$$

where $d\theta/dt$ is the angular velocity of the light at radius $R_d$ with respect to the inertial coordinates, so it equals $\pm 1/R_d$ (noting that $c = 1$ in our units), with the sign depending on whether the light is clockwise or counter-clockwise. Substituting into the previous expression gives

$$\frac{dS}{dT} = R_d \left( \pm\frac{1}{R_d} + \omega \right) = \pm 1 + v$$

which are precisely the two roots of the quadratic condition above. Letting $C$ denote the magnitude of $dS/dT$, i.e., the speed of light with respect to these rotating non-inertial coordinates, we therefore have $C = 1 \pm v$, where again the sign depends on the direction of the light relative to the direction of rotation of the disk.

Does this analysis lead to some kind of paradox? It indicates that the non-inertial "speed of light" with respect to these rotating coordinates is not equal to 1, and in fact the ratio of the speeds in the two directions is $(1+v)/(1-v)$, but of course this doesn't conflict with special relativity, because these are not inertial coordinates (due to their rotation). However, suppose we increase $R_d$ and decrease $\omega$ in proportion so that the rim speed $v$ remains constant. The above formulas still apply for arbitrarily large $R_d$ and small angular speed $\omega$, and yet the speed ratio remains the same, $(1+v)/(1-v)$. 
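As a quick numerical illustration of the coordinate speeds just discussed, the sketch below solves the lightlike condition for $C = dS/dT$, assuming it takes the quadratic form $C^2 - 2vC - (1 - v^2) = 0$ (which follows from the convention $\Theta = \theta + \omega t$, with $c = 1$). The function names are hypothetical.

```python
import math


def light_speeds(v):
    """Both roots of C^2 - 2 v C - (1 - v^2) = 0, i.e. the signed coordinate
    velocities dS/dT of light along the rim in the rotating coordinates."""
    b = -2.0 * v
    c = -(1.0 - v * v)
    root = math.sqrt(b * b - 4.0 * c)  # discriminant equals 4 for every v
    return (-b + root) / 2.0, (-b - root) / 2.0


def speed_ratio(omega, R_d):
    """Ratio of the two coordinate speeds of light for rim speed v = omega*R_d."""
    v = omega * R_d
    if not 0.0 <= v < 1.0:
        raise ValueError("rim speed must satisfy 0 <= v < 1 (units of c)")
    c_plus, c_minus = light_speeds(v)
    return abs(c_plus) / abs(c_minus)


if __name__ == "__main__":
    # Grow the radius and shrink the angular speed while holding v = 0.5.
    for omega, R_d in [(0.5, 1.0), (0.05, 10.0), (0.0005, 1000.0)]:
        print(omega, R_d, speed_ratio(omega, R_d))
```

The roots are $v + 1$ and $v - 1$ for any $v$, so the ratio of magnitudes depends only on the rim speed and stays at $(1+v)/(1-v)$ however large the disk is made.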
Does this conflict with special relativity in the limit as the radius goes to infinity and the angular speed of the rim goes to zero? Clearly not, since we saw in Section 2.7 that if $t_1$ and $t_2$ denote the travel times for light pulses circling the disk in opposite directions, as measured by a clock at a fixed point on the rim, so that $t_2/t_1 = (1+v)/(1-v)$, then we have $t_2/t_1 - 1 = \phi/\pi$, where $\phi$ is the angular travel of the disk during the transit of light (specifically, of the slower pulse, which takes the time $t_2$). In other words, the observed ratio of travel times around the rim always differs from 1 by an amount proportional to the angular travel of the disk during the transit of light. Thus the net acceleration (change of velocity) of the rim observer during the measurement remains in constant proportion to the measured anisotropy of the transit times.

However, even without waiting for the light rays to circle the disk and report their anisotropy, don't the above formulas imply that the speeds of light in the two directions are in the ratio of $(1+v)/(1-v)$ instantaneously with respect to our rotating coordinates, and don't the rotating coordinates approach being inertial coordinates as $R_d$ increases while holding $v$ constant? Yes and no. Both sets of coordinates use the same time $t = T$, but they use different space coordinates, $s$ and $S$. For the perimeter of the disk we have

$$ds = r_d\,d\theta, \qquad dS = R_d\,d\Theta = r_d\,(d\theta + \omega\,dt), \qquad \frac{dS}{ds} = 1 + \frac{\omega}{W}$$

where $W = d\theta/dt$. Thus the ratio $dS/ds$ of spatial distances along a given "path" depends on the angular speed $W$ of the path. Recall that for a signal travelling at $c = 1$ (with respect to the inertial coordinates) around the perimeter we have $W = \pm 1/r_d$, and so

$$\frac{dS}{ds} = 1 \pm \omega r_d = 1 \pm v$$

This is consistent with the velocity ratio

$$\frac{dS/dT}{ds/dt} = \frac{\pm 1 + v}{\pm 1} = 1 \pm v$$

This shows that the "spatial distances" around the perimeter are different in the two directions. But we saw earlier that "the spatial distance" was independent of the direction in which we integrated around the perimeter, even in the rotating coordinate system, so does this indicate an inconsistency? 
No, because, as noted above, the ratio $dS/ds$ along a given path depends on the speed of the path. We have $dS/ds = 1 + \omega/W$, and for the perimeter of the disk with rim speed $v = \omega r_d$ and for a path with speed $V = W r_d$, this gives

$$\frac{dS}{ds} = 1 + \frac{v}{V}$$

If the path is lightlike, we have $V = \pm 1$ and so $dS/ds = 1 \pm v$, whereas when we considered the purely spatial distance around the perimeter we took the "instantaneous" distance, i.e., we took a spacelike path with $V = \infty$, in which case $dS/ds = 1$ in both directions. This explains quantitatively what we mean when we say that we are measuring different things, depending on what spacetime path is having its "spatial length" evaluated. Just as the temporal length of a path around the rim depends on the speed of the path, so too does the spatial length. By the way, notice that if we integrate the spatial component of a path whose velocity $V$ (relative to the original inertial coordinates) has the same magnitude as the rim speed itself, so that $|V| = v$, then obviously we will never move with respect to the disk in one direction, so $dS = 0$ and therefore $dS/ds = 0$, whereas in the other direction we have $dS/ds = 2$. Similarly if $V = 0$ we will never move relative to the original coordinates, i.e., $ds = 0$ and therefore $dS/ds$ is infinite along such a path.
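The special cases just listed can be tabulated with a few lines of code. The following sketch assumes the relation $dS/ds = 1 + v/V$ with signed path speed $V$ and units where $c = 1$. It also checks the Section 2.7 relation $t_2/t_1 - 1 = \phi/\pi$, taking the inertial-frame transit times to be $2\pi R_d/(1 \pm v)$ and $\phi$ to be the disk's angular travel during the slower transit. The function names are hypothetical.

```python
import math


def length_ratio(v, V):
    """dS/ds = 1 + v/V for rim speed v and signed path speed V (c = 1).
    V = math.inf gives the instantaneous (purely spatial) slice."""
    if V == 0:
        return math.inf  # the path never moves in the inertial coordinates
    return 1.0 + v / V


def sagnac_excess(v, R_d=1.0):
    """Return (t2/t1 - 1, phi/pi), where phi is the angular travel of the
    disk during the transit of the slower pulse."""
    t1 = 2 * math.pi * R_d / (1 + v)  # pulse moving against the rotation
    t2 = 2 * math.pi * R_d / (1 - v)  # pulse moving with the rotation
    phi = (v / R_d) * t2
    return t2 / t1 - 1.0, phi / math.pi


if __name__ == "__main__":
    v = 0.2
    for label, V in [("light, one way", 1.0), ("light, other way", -1.0),
                     ("instantaneous slice", math.inf),
                     ("path with V = +v", v), ("path with V = -v", -v),
                     ("path with V = 0", 0.0)]:
        print(f"{label:22s} dS/ds = {length_ratio(v, V)}")
    print("t2/t1 - 1 and phi/pi:", sagnac_excess(v))
```

Treating $V$ as a signed quantity is a design choice that lets one formula cover both directions of travel, along with the limiting cases $V = \infty$ and $V = 0$.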