User:Prokaryotic Caspase Homolog/sandbox 3

Introduction to the mathematics of curved spacetime

The approach to tensors adopted here follows closely an older presentation by Lillian Lieber (1945, 2008),^[1]^{[note 1]}^{[note 2]} which was written to be accessible to anybody with a basic understanding of calculus. Lieber used the coordinate transformation approach to tensor analysis. The modern approach to tensor analysis stresses the geometrical nature of tensors rather than the transformation properties of their components.^[3]^: 77 Because of the coordinate-free nature of the abstract view, it is often considered more physical.^[4]^: 31 However, books on general relativity written in a manner intended to be usable by autodidacts (textbooks as well as semi-popularizations) usually adopt the coordinate transformation approach as requiring less mathematical sophistication on the part of the reader.^[2]^[5] Several textbooks, including that by Adler,^[4] provide side-by-side explanations in terms of both the classic view and the modern abstract view.^{[note 3]}

This non-rigorous introduction to the mathematics of general relativity stops at the vacuum field equations which are valid only in regions of space where the energy-momentum tensor is zero, which is to say, in regions devoid of mass-energy. Nevertheless, a variety of interesting results are possible with this limited approach, including derivation of the Schwarzschild metric and an exploration of some of its consequences.

Describing the shape of space and spacetime

Figure 6–1. Computing ds in different coordinate systems: (a) Cartesian coordinates, (b) polar coordinates, (c) oblique coordinates, (d) spherical coordinates

In the section of this article on the Spacetime interval, the reader has been introduced to the concept of the interval $s^{2}$ and has been told, without detailed explanation, that the properties of this interval serve to characterize the geometric properties of the space (or spacetime) on which the interval has been defined.

For example, in a Euclidean plane, the Pythagorean theorem holds for right triangles drawn in that plane.

s^{2}=x^{2}+y^{2}

(A1)

Conversely, if the distance between two points on a surface is given by

s^{2}=x^{2}+y^{2}

then that surface is necessarily a Euclidean plane.^[1]^{: 113–125}

Failure of the Pythagorean theorem to hold implies that a surface has an intrinsic curvature. The intrinsic curvature of the surface can be ascertained solely from measurements made from within that surface, without external comparisons, and without information that might be obtained by measurements obtained from any higher-dimensional space in which the surface may be embedded. Intrinsic curvature is to be distinguished from extrinsic curvature. If one takes a flat sheet and rolls it into a cylinder, the surface has extrinsic curvature, but the Pythagorean theorem continues to hold for measurements made within the surface, so the surface has no intrinsic curvature. General relativity is concerned only with the intrinsic curvature of spacetime.^[3]^{: 153–154}

In differential calculus, the student learns how to apply the Pythagorean theorem in computing lengths along a curve, as in Fig. 6–1a, where the differential form of the theorem is

ds^{2}=dx^{2}+dy^{2}

(A2)

In most of the forthcoming discussion we will prefer to use generalized coordinates, substituting $x_{1}$ for $x$ and $x_{2}$ for $y,$ i.e.

ds^{2}=dx_{1}^{2}+dx_{2}^{2}

(A3)

The properties of a space do not depend on the coordinate system used to make measurements within that space. What would be the equivalent of (A2) for measurements made in other coordinate systems?

For polar coordinates, as shown in Fig. 6–1b, the relevant expression would be

ds^{2}=dr^{2}+r^{2}d\theta ^{2}

(A4)

where the equivalent expression using generalized coordinates, substituting $x_{1}$ for $r$ and $x_{2}$ for $\theta ,$ is

ds^{2}=dx_{1}^{2}+x_{1}^{2}dx_{2}^{2}\;.

(A5)
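The claim that (A2) and (A4) describe the same flat plane can be checked numerically. The following is a minimal sketch, not part of Lieber's presentation: the function names, the step count, and the test curve (a straight segment from (1, 0) to (0, 1)) are all illustrative assumptions. It sums $ds$ along the same curve once with the Cartesian line element and once with the polar one.

```python
import math

def length_cartesian(x_of_t, y_of_t, steps=2000):
    """Sum ds = sqrt(dx^2 + dy^2), equation (A2), along a curve for t in [0, 1]."""
    total = 0.0
    for i in range(steps):
        t0, t1 = i / steps, (i + 1) / steps
        dx = x_of_t(t1) - x_of_t(t0)
        dy = y_of_t(t1) - y_of_t(t0)
        total += math.sqrt(dx * dx + dy * dy)
    return total

def length_polar(x_of_t, y_of_t, steps=2000):
    """Sum ds = sqrt(dr^2 + r^2 dtheta^2), equation (A4), along the same curve."""
    total = 0.0
    for i in range(steps):
        t0, t1 = i / steps, (i + 1) / steps
        r0 = math.hypot(x_of_t(t0), y_of_t(t0))
        r1 = math.hypot(x_of_t(t1), y_of_t(t1))
        th0 = math.atan2(y_of_t(t0), x_of_t(t0))
        th1 = math.atan2(y_of_t(t1), x_of_t(t1))
        r_mid = 0.5 * (r0 + r1)          # evaluate r at the middle of the step
        dr, dth = r1 - r0, th1 - th0
        total += math.sqrt(dr * dr + (r_mid * dth) ** 2)
    return total

# Illustrative curve: straight segment from (1, 0) to (0, 1), length sqrt(2).
x = lambda t: 1.0 - t
y = lambda t: t
cart = length_cartesian(x, y)
polar = length_polar(x, y)
```

Both sums approximate the same length, which is what one expects if the two expressions for $ds^{2}$ differ only in the choice of coordinates and not in the geometry of the surface.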

For oblique coordinates, as shown in Fig. 6–1c, the law of cosines allows us to write

ds^{2}=dx^{2}+dy^{2}-2dx\,dy\,\cos \alpha

(A6)

and the equivalent expression using generalized coordinates would be

ds^{2}=dx_{1}^{2}+dx_{2}^{2}-2dx_{1}\,dx_{2}\,\cos \alpha \;.

(A7)

What of surfaces with a bona fide intrinsic curvature? In Fig. 6–1d, we illustrate a sphere on which has been drawn the elements of the spherical coordinate system. With the understanding that $r=R\cos \beta ,$ we note that

ds^{2}=r^{2}d\alpha ^{2}+R^{2}d\beta ^{2}

(A8)

and the equivalent expression, replacing $\alpha$ with $x_{1}$ and $\beta$ with $x_{2}$ would be

ds^{2}=r^{2}dx_{1}^{2}+R^{2}dx_{2}^{2}

(A9)

The expression for $ds^{2}$ depends on both the intrinsic properties of the surface and the coordinate system used to describe that surface. Therefore, a cursory examination of $ds^{2}$ will not suffice to determine the characteristics of the surface that we are dealing with. To determine the characteristics of the surface starting from $ds^{2},$ we must determine the curvature tensor.^[1]^{: 113–125}
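Short of computing the curvature tensor, one simple intrinsic signature of curvature can be sketched numerically from (A8) alone: the ratio of a circle's circumference to its radius, both measured within the surface. The sketch below is an illustration under stated assumptions (the function name, the unit sphere, and the choice of latitude $\beta =\pi /4$ are hypothetical, not taken from the source). On a flat plane the ratio would be $2\pi$; on the sphere it comes out smaller.

```python
import math

def circle_on_sphere(R, beta0, steps=1000):
    """Measure a circle of latitude beta0 using only the line element (A8).

    Returns (circumference, intrinsic_radius), where the radius is measured
    along a meridian from the pole (beta = pi/2) down to beta0.
    """
    # Along the circle d(beta) = 0, so ds = r d(alpha) with r = R cos(beta0).
    d_alpha = 2.0 * math.pi / steps
    circumference = sum(R * math.cos(beta0) * d_alpha for _ in range(steps))
    # Along a meridian d(alpha) = 0, so ds = R d(beta).
    d_beta = (math.pi / 2.0 - beta0) / steps
    radius = sum(R * d_beta for _ in range(steps))
    return circumference, radius

C, rho = circle_on_sphere(R=1.0, beta0=math.pi / 4.0)
ratio = C / rho  # would equal 2*pi on a flat plane
```

No measurement outside the surface enters the calculation, which is the sense in which the curvature detected here is intrinsic.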

What are tensors?

In precalculus, one learns about scalars and vectors. Scalars are quantities that have magnitude only, while vectors have both magnitude and direction. Measurements such as temperature and age are scalars, whereas measurements of velocity, momentum, acceleration and force are vectors.

Tensors are a form of mathematical object that have found great use in science and engineering. "Tensor" is an inclusive term that includes scalars and vectors as special cases: A scalar is a tensor of rank zero, while a vector is a tensor of rank one.

Figure 6–2. Tensor of rank two

A familiar engineering use of tensors is in the representation of compressive, tensile, and shear stresses on an object. A pure force (a vector) acting uniformly on an entire object will not cause the object to deform; instead, the object will accelerate uniformly, and the object will not "feel" any effects of the force. It is the differential application of forces on different parts of an object that exerts stress on the object, causing mechanical strain.

In Fig 6–2, consider a small surface element which is being acted upon by the force $AB$. The area and orientation of this surface element are represented by the vector $AG$, which is perpendicular to the surface and whose magnitude represents the area of the surface element. The stress at $A$ depends on both vectors and is a tensor of rank two.^[1]^{: 127–140}

Tensors exist independently of any coordinate system. However, for computational purposes, it is convenient to decompose a tensor into components.

Figure 6–3. Decomposition of tensor components

In Fig 6–3a, a force $F$ acts on a small surface $dS$ where $G$ is the vector that represents the area and orientation of this surface element. In Fig 6–3b, the projections of this surface element $dS_{x},dS_{y},$ and $dS_{z}$ on the $yz,xz,$ and $xy$ planes, respectively, are illustrated. The x, y, and z components of $G$ (not illustrated) represent the areas and orientations of these three projections.

The total effect of the force $F$ on $dS$ can be computed by considering the effect of each of its three components, $f_{x},\,f_{y},$ and $f_{z}$ on each of the three projections $dS_{x},\,dS_{y},\,$ and $dS_{z}.$

The x-component of $F,$ which is $f_{x},$ acts on each of the aforementioned projections, and the "pressure" (force per unit area) from $f_{x}$ acting on each of these projections is designated as $p_{xx},\,p_{xy},\,p_{xz},$ respectively. Since force equals pressure times area, we can write:^[1]^{: 127–140}

f_{x}=p_{xx}dS_{x}\,+\,p_{xy}dS_{y}\,+\,p_{xz}dS_{z}

Likewise, for $f_{y}$ and $f_{z},$ we write

f_{y}=p_{yx}dS_{x}\,+\,p_{yy}dS_{y}\,+\,p_{yz}dS_{z}

f_{z}=p_{zx}dS_{x}\,+\,p_{zy}dS_{y}\,+\,p_{zz}dS_{z}

The total stress $F$ on the surface $dS$ is $F=f_{x}\,+\,f_{y}\,+f_{z},$ so that

{\begin{aligned}F&=p_{xx}dS_{x}\,+\,p_{xy}dS_{y}\,+\,p_{xz}dS_{z}\\&+\,p_{yx}dS_{x}\,+\,p_{yy}dS_{y}\,+\,p_{yz}dS_{z}\\&+\,p_{zx}dS_{x}\,+\,p_{zy}dS_{y}\,+\,p_{zz}dS_{z}\end{aligned}}

(B1)

In three-dimensional space, force (a vector) has three components, but stress (a tensor of rank two) has nine components. A tensor of rank three has 27 components, and so forth: in n-dimensional space, a tensor of rank $k$ has $n^{k}$ components.

In n-dimensional space, the n components of a vector are written in a single row, but the n² components of a tensor of rank two are written in a square array.
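The bookkeeping in the three equations for $f_{x},\,f_{y},\,f_{z}$ can be sketched as a small computation. The numbers in `p` and `dS` below are hypothetical values chosen only for illustration, and the function names are likewise assumptions; the point is that each force component is a sum over the three projected areas, and that the component count grows as a power of the dimension.

```python
def force_from_stress(p, dS):
    """Return [f_x, f_y, f_z] with f_i = sum over j of p[i][j] * dS[j]."""
    return [sum(p[i][j] * dS[j] for j in range(3)) for i in range(3)]

def component_count(n, rank):
    """Number of components of a tensor of the given rank in n dimensions."""
    return n ** rank

# Hypothetical stress components p_xx ... p_zz (illustrative numbers only).
p = [[2, 1, 0],
     [1, 3, 0],
     [0, 0, 4]]
dS = [1, 2, 3]  # projected areas dS_x, dS_y, dS_z in arbitrary units
f = force_from_stress(p, dS)
```

With these sample values, `f` holds the three force components, and `component_count(3, 2)` reproduces the nine stress components mentioned in the text.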

Effect of changes in the coordinate system

Relativity is concerned with finding the physical laws which hold good for all observers, regardless of their viewpoint (coordinate system). In 1905, with special relativity, Einstein considered changes in viewpoint due to differences in uniform relative velocity. In 1916, with general relativity, Einstein generalized the idea to include observers in much more complex relationships with each other. The concept of invariance that Einstein introduced is one of the most fundamental in all of physics. Tensors are objects that are intrinsically invariant under transformation of coordinate systems.^{[note 4]} In the following, we explore the effects of such transformation, beginning with a simple rotation of coordinates.^[1]^{: 141–150}

Figure 6–4. Rotational coordinate transformations

In Fig. 6–4, consider a conventional Cartesian coordinate system in the $xy$ plane. Suppose we transform to a new ${\bar {x}},\,{\bar {y}}$ coordinate system that is obtained from the $x,\,y$ system by rotating the coordinate axes by angle $\theta$ about the origin. If point $A$ has coordinates $x,\,y$ in the first coordinate system, its coordinates in the barred system are given by

{\bar {x}}=x\cos \theta +y\sin \theta

{\bar {y}}=-x\sin \theta +y\cos \theta

The inverse transformation, calculating $x$ and $y$ given ${\bar {x}}$ and ${\bar {y}},$ is readily obtained from this first transformation.
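As a minimal sketch of the rotation and its inverse (the function names are illustrative assumptions), the inverse is obtained by solving the two equations for $x$ and $y$, which amounts to rotating by $-\theta$:

```python
import math

def rotate(x, y, theta):
    """Coordinates of a point in axes rotated by theta about the origin."""
    x_bar = x * math.cos(theta) + y * math.sin(theta)
    y_bar = -x * math.sin(theta) + y * math.cos(theta)
    return x_bar, y_bar

def rotate_inverse(x_bar, y_bar, theta):
    """Recover the original coordinates from the rotated ones."""
    x = x_bar * math.cos(theta) - y_bar * math.sin(theta)
    y = x_bar * math.sin(theta) + y_bar * math.cos(theta)
    return x, y

theta = math.pi / 6.0
x_bar, y_bar = rotate(3.0, 4.0, theta)
x_back, y_back = rotate_inverse(x_bar, y_bar, theta)
```

The round trip returns the starting coordinates, and the distance from the origin, $x^{2}+y^{2},$ is the same in both systems.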

Through a series of steps, we will generalize this notation to encompass other transformations in an arbitrary number of dimensions. The generalized notation will allow an elegantly condensed method of writing the equations that simplifies complex manipulations.^[1]^{: 141–150}

Our first generalization is to rewrite the transformation so that it is no longer tied to a specific form of rotation:

{\bar {x}}=a\cdot x+b\cdot y

{\bar {y}}=c\cdot x+d\cdot y

where $a,\,b,\,c,\,d\,$ are functions of $\theta .\,$ In differential form, we may write the following:

d{\bar {x}}=a\cdot dx+b\cdot dy

d{\bar {y}}=c\cdot dx+d\cdot dy

We further generalize by using $dx^{1}$ and $dx^{2}$ instead of $dx$ and $dy$ , and by using the single letter $a$ with different subscripts instead of four different letters $a,\,b,\,c,\,d.$

We will henceforth mostly be using coordinates distinguished by superscripts rather than subscripts for reasons that will be discussed later. These superscripts are not to be confused with exponentiation:

{\begin{aligned}d{\bar {x}}^{1}=a_{11}dx^{1}+a_{12}dx^{2}\\d{\bar {x}}^{2}=a_{21}dx^{1}+a_{22}dx^{2}\end{aligned}}

(C1)

The subscripted $a$ 's are now understood as representing partial derivatives, with $a_{11}$ being the change in ${\bar {x}}^{1}$ due to a change in $x^{1}$ and so forth.^[1]^: 147

{\begin{aligned}d{\bar {x}}^{1}={\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}dx^{1}+{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}dx^{2}\\d{\bar {x}}^{2}={\frac {\partial {\bar {x}}^{2}}{\partial x^{1}}}dx^{1}+{\frac {\partial {\bar {x}}^{2}}{\partial x^{2}}}dx^{2}\end{aligned}}

(C2)

Notational simplifications

The two equations in (C2) may be rewritten in a single line:

d{\bar {x}}^{\mu }=\sum \limits _{\sigma }{\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\sigma }}}dx^{\sigma }\quad \quad {\begin{pmatrix}\mu =1,2\\\sigma =1,2\end{pmatrix}}

(D1)

The Einstein summation convention enables further abbreviation. Whenever a symbol occurs twice in a single term (e.g. the $\sigma$ in the right-hand member of (D1)), it is understood that a summation is to be made on that subscript (or superscript).^[4]^: 14 Hence, we may rewrite (D1) as follows:

d{\bar {x}}^{\mu }={\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\sigma }}}dx^{\sigma }\quad \quad {\begin{pmatrix}\mu =1,2\\\sigma =1,2\end{pmatrix}}

(D2)

Let $x^{\mu }$ be the coordinates of a point $P$ in a space of dimensionality n. Let $P'$ be a neighboring point having coordinates $x^{\mu }+dx^{\mu }$ as measured in the first frame. The coordinates of $P'$ in the second frame will be ${\bar {x}}^{\mu }+d{\bar {x}}^{\mu }.$ The n quantities $dx^{\mu }$ are understood to be the components of the displacement vector ${\vec {PP'}}$ as measured in the first frame, while $d{\bar {x}}^{\mu }$ are the components of this same displacement vector as measured in the second frame. These are related to the components measured in the first frame by the transformation equation (D2).^[6]^: 89–90

The appearance of equation (D2) may be simplified further as follows: Given that $dx^{1}$ and $dx^{2}$ are the components of $ds$ in the unbarred system, we represent them more briefly by $V^{1}$ and $V^{2}.$ Likewise, given that $d{\bar {x}}^{1}$ and $d{\bar {x}}^{2}$ are the components of $ds$ in the barred system, we represent them more briefly by ${\bar {V}}^{1}$ and ${\bar {V}}^{2}.$

On the right side of (D2), $\mu ,$ which is not repeated, is known as a free index, while the repeated summation indices are known as dummy indices, since they disappear when performing the summation. Unless stated otherwise, any free index shall have the same range as the dummy indices.^[7]^: 2 Hence, in (D2),

{\begin{pmatrix}\mu =1,2\\\sigma =1,2\end{pmatrix}}

may be written as

(n=2).

These superscripts should not be confused with exponents. $V^{2}$ is not the square of $V.$ Rather, these superscripts are used for indexing purposes, the same as subscripts. Superscripts and subscripts are used for distinct purposes which will be explained shortly.

Hence, (D2) may be rewritten as follows:

{\bar {V}}^{\mu }={\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\sigma }}}V^{\sigma }

(D3)

Given a vector $V^{\sigma }$ , whose components are $V^{1}$ and $V^{2}$ in a given coordinate system, (D3) allows computation of its components in a new coordinate system related to the first by the transformation represented in (C1).

Actually, (D2) and (D3) are valid not merely for the transformation represented in (C1), but are valid for any transformation of coordinates (provided that the values of $x^{\sigma }$ and ${\bar {x}}^{\mu }$ are in one-to-one correspondence). In other words, in the transformation represented by

{\bar {x}}^{\mu }=f^{\mu }(x^{\sigma }),

where $f^{\mu }$ are arbitrary functions,^{[note 5]} (D2) and (D3) allow computation of the vector components in the transformed coordinate system.
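To see (D3) at work for a transformation that is not a rotation, the sketch below uses the change from Cartesian to polar coordinates as the function $f^{\mu }.$ This choice, the sample point, the sample vector, and all function names are illustrative assumptions. The matrix of partial derivatives (often called the Jacobian matrix) is written out by hand, and the summation over $\sigma$ is made explicit.

```python
import math

def to_polar(x, y):
    """The sample transformation xbar = f(x): (x, y) -> (r, theta)."""
    return math.hypot(x, y), math.atan2(y, x)

def jacobian_polar(x, y):
    """Matrix of d(xbar^mu)/d(x^sigma) for xbar = (r, theta), x = (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def transform_contravariant(J, V):
    """Equation (D3): Vbar^mu = J[mu][sigma] * V^sigma, summed over sigma."""
    n = len(V)
    return [sum(J[mu][sigma] * V[sigma] for sigma in range(n)) for mu in range(n)]

point = (3.0, 4.0)
V = [1.0, 2.0]  # a small displacement direction (dx, dy) at the point
V_bar = transform_contravariant(jacobian_polar(*point), V)

# Cross-check: displace the point by eps*V and see how (r, theta) change.
eps = 1e-6
r0, th0 = to_polar(*point)
r1, th1 = to_polar(point[0] + eps * V[0], point[1] + eps * V[1])
V_bar_numeric = [(r1 - r0) / eps, (th1 - th0) / eps]
```

The transformed components agree, to within the accuracy of the finite difference, with the actual change in $r$ and $\theta$ produced by the displacement, which is what it means for the coordinate differentials to transform as a contravariant vector.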

Any set of quantities that transforms according to (D3) is, by definition, a vector, or more precisely, a contravariant vector. One should also note that (D3) is extensible to vectors of any number of dimensions. In the curved spacetime of general relativity, one cannot think of vectors as being directed line segments stretching from one point to another. A set of coordinates $x^{n}$ does not form a vector. In the case discussed here, a contravariant vector is the set of coordinate differentials $dx^{n}$ along some given curve.^[4]^: 39

Using this notation, a contravariant tensor of rank two is defined as follows:

{\bar {V}}^{\alpha \beta }={\frac {\partial {\bar {x}}^{\alpha }}{\partial x^{\gamma }}}{\frac {\partial {\bar {x}}^{\beta }}{\partial x^{\delta }}}V^{\gamma \delta }

(D4)

Since $\gamma$ and $\delta$ each occur twice in the term on the right, it is understood that the term represents a sum for $\gamma$ and $\delta$ over their entire ranges. On the other hand, neither $\alpha$ nor $\beta$ occurs twice in any single term. In three-space, $\alpha ,\,\beta ,\,\gamma ,\,\delta$ each range over $1,\,2,\,3,$ so (D4) represents nine equations, each having a sum of nine terms on the right.

For example, given $\alpha =2,\,\beta =3,$ (D4) expands to the following:

{\begin{aligned}{\bar {V}}^{23}&={\frac {\partial {\bar {x}}^{2}}{\partial x^{1}}}{\frac {\partial {\bar {x}}^{3}}{\partial x^{1}}}V^{11}+{\frac {\partial {\bar {x}}^{2}}{\partial x^{1}}}{\frac {\partial {\bar {x}}^{3}}{\partial x^{2}}}V^{12}+{\frac {\partial {\bar {x}}^{2}}{\partial x^{1}}}{\frac {\partial {\bar {x}}^{3}}{\partial x^{3}}}V^{13}\\&+\,{\frac {\partial {\bar {x}}^{2}}{\partial x^{2}}}{\frac {\partial {\bar {x}}^{3}}{\partial x^{1}}}V^{21}+{\frac {\partial {\bar {x}}^{2}}{\partial x^{2}}}{\frac {\partial {\bar {x}}^{3}}{\partial x^{2}}}V^{22}+{\frac {\partial {\bar {x}}^{2}}{\partial x^{2}}}{\frac {\partial {\bar {x}}^{3}}{\partial x^{3}}}V^{23}\\&+\,{\frac {\partial {\bar {x}}^{2}}{\partial x^{3}}}{\frac {\partial {\bar {x}}^{3}}{\partial x^{1}}}V^{31}+{\frac {\partial {\bar {x}}^{2}}{\partial x^{3}}}{\frac {\partial {\bar {x}}^{3}}{\partial x^{2}}}V^{32}+{\frac {\partial {\bar {x}}^{2}}{\partial x^{3}}}{\frac {\partial {\bar {x}}^{3}}{\partial x^{3}}}V^{33}\end{aligned}}

In four-space, (D4) expands to sixteen equations, each having a sum of sixteen terms on the right.

The notation presented here hence offers a concise representation of complex mathematical objects.^[1]^{: 151–159}
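The counting of equations and terms in (D4) can be made concrete by writing the implied sums as explicit loops. This is a sketch under stated assumptions: the function name is hypothetical, and the constant coefficient matrix `J3` and the tensor `V3` are arbitrary sample values rather than anything from the source.

```python
def transform_rank2_contravariant(J, V):
    """Equation (D4): Vbar^{ab} = J[a][c] * J[b][d] * V^{cd}, summed over c, d.

    Returns the transformed components and the total number of terms added.
    """
    n = len(J)
    V_bar = [[0] * n for _ in range(n)]
    terms = 0
    for a in range(n):              # free index alpha
        for b in range(n):          # free index beta
            for c in range(n):      # dummy index gamma
                for d in range(n):  # dummy index delta
                    V_bar[a][b] += J[a][c] * J[b][d] * V[c][d]
                    terms += 1
    return V_bar, terms

# Sample coefficients d(xbar^a)/d(x^c) and a tensor with only V^{12} nonzero.
J3 = [[1, 1, 0],
      [0, 1, 0],
      [0, 0, 2]]
V3 = [[0, 1, 0],
      [0, 0, 0],
      [0, 0, 0]]
V3_bar, terms3 = transform_rank2_contravariant(J3, V3)
```

In three dimensions the loops add 81 terms in total (nine equations of nine terms each); with a four-dimensional `J` the same function adds 256.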

Tensor addition and multiplication

Tensor algebra includes various operations for making new tensors from old tensors. Here we begin with tensor addition, starting with tensors of rank one (vectors) in a plane.^[1]^{: 163–167}

Suppose we have two contravariant vectors in a plane, $A^{\alpha }$ with components $A^{1}$ and $A^{2}$ , and a second such vector, $B^{\alpha }$ with components $B^{1}$ and $B^{2}$ . Let us form another quantity, $C^{\alpha },$ by adding the corresponding components of $A^{\alpha }$ and $B^{\alpha }$ . In other words, $C^{1}=A^{1}+B^{1}$ and $C^{2}=A^{2}+B^{2}$ .

We ask whether the resulting quantity $C^{\alpha }$ is a vector, i.e. does it transform according to (D3)? Since $A^{\alpha }$ and $B^{\alpha }$ are contravariant vectors, we may write:

{\bar {A}}^{\lambda }={\frac {\partial {\bar {x}}^{\lambda }}{\partial x^{\alpha }}}A^{\alpha }

(E1)

{\bar {B}}^{\lambda }={\frac {\partial {\bar {x}}^{\lambda }}{\partial x^{\alpha }}}B^{\alpha }

(E2)

Taking the components one at a time, we may write, for the first components:

{\bar {A}}^{1}={\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}A^{1}+{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}A^{2}

{\bar {B}}^{1}={\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}B^{1}+{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}B^{2}

and likewise for the second components. Summing these, we obtain for the first and second components:

{\begin{aligned}{\bar {A}}^{1}+{\bar {B}}^{1}&={\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}(A^{1}+B^{1})+{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}(A^{2}+B^{2})\\{\bar {A}}^{2}+{\bar {B}}^{2}&={\frac {\partial {\bar {x}}^{2}}{\partial x^{1}}}(A^{1}+B^{1})+{\frac {\partial {\bar {x}}^{2}}{\partial x^{2}}}(A^{2}+B^{2})\end{aligned}}

The above two equations may be rewritten more compactly as

{\bar {A}}^{\lambda }+{\bar {B}}^{\lambda }={\frac {\partial {\bar {x}}^{\lambda }}{\partial x^{\alpha }}}(A^{\alpha }+B^{\alpha })\quad \quad (n=2)

(E3)

or, using $C^{\alpha }$ and ${\bar {C}}^{\lambda }$ to represent the summed components,

{\bar {C}}^{\lambda }={\frac {\partial {\bar {x}}^{\lambda }}{\partial x^{\alpha }}}C^{\alpha }\quad \quad (n=2)

(E4)

Since $C^{\alpha }$ transforms according to (D3), we have established that the sum of two vectors is another vector. The same holds for tensors of higher rank.

Note in particular how (E4) may be obtained by summing (E1) and (E2) as if they were each single equations with a single term on the right, when in reality, each represents multiple equations with multiple terms on the right.

The notational system used here, developed by Ricci and Levi-Civita about 1900, with later enhancements by Einstein, permits complex operations to be performed following a relatively simple algebraic process often termed "index juggling".^[4]^: 44 The notation automatically keeps track of whole sets of equations having many terms in each. We illustrate here with a process of multiplying tensors called "outer multiplication".

If we wish to multiply

{\bar {A}}^{\lambda }={\frac {\partial {\bar {x}}^{\lambda }}{\partial x^{\alpha }}}A^{\alpha }\quad \quad (n=2)

(E5)

by

{\bar {B}}^{\mu }={\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\beta }}}B^{\beta }\quad \quad (n=2)

(E6)

we can immediately write

{\bar {C}}^{\lambda \mu }={\frac {\partial {\bar {x}}^{\lambda }}{\partial x^{\alpha }}}{\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\beta }}}C^{\alpha \beta }\quad \quad (n=2)

(E7)

In outer multiplication, each equation of (E5) is to be multiplied by each equation of (E6), so there would be four multiplications. Written in expanded form, the first equation of (E5), with $\lambda =1,$ and the first equation of (E6), with $\mu =1,$ are, respectively,

{\bar {A}}^{1}={\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}A^{1}+{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}A^{2}\quad

and

\quad {\bar {B}}^{1}={\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}B^{1}+{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}B^{2}

Following ordinary rules of algebra, we obtain, as the product, the following:

{\begin{aligned}{\bar {A}}^{1}{\bar {B}}^{1}&={\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}{\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}A^{1}B^{1}+{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}{\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}A^{2}B^{1}\\&+\,{\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}A^{1}B^{2}+{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}A^{2}B^{2}\end{aligned}}

(E8)

In like fashion, we obtain equations for ${\bar {A}}^{1}{\bar {B}}^{2},\,{\bar {A}}^{2}{\bar {B}}^{1},\,$ and ${\bar {A}}^{2}{\bar {B}}^{2}.$

To reiterate, according to the Einstein summation convention, since $\alpha$ and $\beta$ each occur twice on the right side of (E7), they must each take on all possible values to form a sum. For $\lambda =1,\mu =1,$ the terms sum to yield (E8), except that in (E7) we simplify the appearance by replacing $A^{\alpha }B^{\beta }$ with $C^{\alpha \beta }.$ In a similar fashion, we handle the other possible values of $\lambda$ and $\mu ,$ thus showing that (E7) completely represents the outer product of (E5) and (E6).^[1]^{: 163–167}

From (E7), it is evident that the outer product of two vectors is a tensor of rank two. In general, the product of two tensors of rank m and n is a tensor of rank m + n.^{[note 6]}
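Both results of this section, (E4) for sums and (E7) for outer products, can be checked with exact integer arithmetic. The sketch below assumes a constant coefficient matrix `J` and sample vectors `A` and `B`; these values and all function names are hypothetical choices made for illustration.

```python
def transform_vector(J, V):
    """Equation (D3): Vbar^lam = J[lam][a] * V^a, summed over a."""
    n = len(V)
    return [sum(J[lam][a] * V[a] for a in range(n)) for lam in range(n)]

def outer(A, B):
    """Outer product C^{ab} = A^a * B^b."""
    return [[a * b for b in B] for a in A]

def transform_rank2(J, C):
    """Equation (E7): Cbar^{lam mu} = J[lam][a] * J[mu][b] * C^{ab}."""
    n = len(J)
    idx = range(n)
    return [[sum(J[lam][a] * J[mu][b] * C[a][b] for a in idx for b in idx)
             for mu in idx] for lam in idx]

J = [[2, 1],
     [1, 1]]          # sample values of d(xbar^lam)/d(x^a)
A, B = [1, 2], [3, 4]

A_bar, B_bar = transform_vector(J, A), transform_vector(J, B)

# (E4): transforming the sum equals summing the transformed vectors.
sum_bar = transform_vector(J, [a + b for a, b in zip(A, B)])

# (E7): multiplying after transforming equals transforming the product.
product_of_bars = outer(A_bar, B_bar)
bar_of_product = transform_rank2(J, outer(A, B))
```

The two routes to the barred outer product give identical arrays, which is the content of the statement that $C^{\alpha \beta }=A^{\alpha }B^{\beta }$ transforms as a contravariant tensor of rank two.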

Covariant tensors

Figure 6–5. Covariant vector under a simple coordinate transformation

In Fig. 6–5, consider an object having varying density in different parts of the object. The density at any particular point is a scalar, but the change in density as we go from point to point is a directed quantity, i.e. a vector. If we designate the density at any particular point by $\psi$ , then

{\frac {\partial \psi }{\partial x^{1}}}

and

{\frac {\partial \psi }{\partial x^{2}}}

represent the partial variation of $\psi$ in the $x^{1}$ and $x^{2}$ directions. We will see that the transformation properties of this form of vector are different from those described before.^[1]^{: 167–172}

On top of the original coordinate system in Fig. 6–5, we overlay a changed coordinate system labeled with transformed coordinates. Given the unbarred coordinate components of the vector at point A, we wish to express its barred coordinate components. In other words, we wish to express

{\frac {\partial \psi }{\partial {\bar {x}}^{1}}}\,{\text{and}}\,{\frac {\partial \psi }{\partial {\bar {x}}^{2}}}

in terms of

{\frac {\partial \psi }{\partial x^{1}}}\,{\text{and}}\,{\frac {\partial \psi }{\partial x^{2}}}

The ${\bar {x}}^{1}$ and ${\bar {x}}^{2}$ coordinates of any point in the transformed coordinate system depend on both $x^{1}$ and $x^{2}$ of the nontransformed system. The transformed vector coordinates may be written as

{\frac {\partial \psi }{\partial {\bar {x}}^{1}}}=a_{11}{\frac {\partial \psi }{\partial x^{1}}}+a_{12}{\frac {\partial \psi }{\partial x^{2}}}

{\frac {\partial \psi }{\partial {\bar {x}}^{2}}}=a_{21}{\frac {\partial \psi }{\partial x^{1}}}+a_{22}{\frac {\partial \psi }{\partial x^{2}}}

where $a_{11}$ is the partial change in $x^{1}$ per change in ${\bar {x}}^{1}$ and so forth. Writing the equations out fully,

{\begin{aligned}{\frac {\partial \psi }{\partial {\bar {x}}^{1}}}={\frac {\partial \psi }{\partial x^{1}}}{\frac {\partial x^{1}}{\partial {\bar {x}}^{1}}}+{\frac {\partial \psi }{\partial x^{2}}}{\frac {\partial x^{2}}{\partial {\bar {x}}^{1}}}\\{\frac {\partial \psi }{\partial {\bar {x}}^{2}}}={\frac {\partial \psi }{\partial x^{1}}}{\frac {\partial x^{1}}{\partial {\bar {x}}^{2}}}+{\frac {\partial \psi }{\partial x^{2}}}{\frac {\partial x^{2}}{\partial {\bar {x}}^{2}}}\end{aligned}}

(F1)

As before, the above two equations may be combined using the summation convention:

{\frac {\partial \psi }{\partial {\bar {x}}^{\mu }}}={\frac {\partial \psi }{\partial x^{\sigma }}}{\frac {\partial x^{\sigma }}{\partial {\bar {x}}^{\mu }}}\quad \quad (n=2)

(F2)

Finally, using ${\overline {W}}_{\mu }$ to represent ${\frac {\partial \psi }{\partial {\bar {x}}^{\mu }}}$ and $W_{\sigma }$ to represent ${\frac {\partial \psi }{\partial x^{\sigma }}},$ we write (F2) as follows:

{\overline {W}}_{\mu }={\frac {\partial x^{\sigma }}{\partial {\bar {x}}^{\mu }}}W_{\sigma }\quad \quad (n=2)

(F3)

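The transformation rule for vectors described by (F3) differs from the rule described by (D3): the coefficient on the right in (F3) is the partial derivative of an unbarred coordinate with respect to a barred coordinate, whereas in (D3) a barred coordinate is differentiated with respect to an unbarred one. The coefficients in (F3) are thus the elements of the inverse of the coefficient matrix in (D3); only in one dimension does this reduce to a simple reciprocal. Equation (F3) is the mathematical definition of a covariant vector, i.e. a covariant tensor of rank one. The gradient of a scalar is the prototypical covariant vector.^[4]^: 39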

A covariant tensor of rank two is defined as follows:^[1]^{: 167–172}

{\overline {W}}_{\alpha \beta }={\frac {\partial x^{\gamma }}{\partial {\bar {x}}^{\alpha }}}{\frac {\partial x^{\delta }}{\partial {\bar {x}}^{\beta }}}W_{\gamma \delta }

(F4)

Carefully compare (F3) with (D3), and (F4) with (D4).

Note that the indices of covariant tensors are subscripts, and the bars in the coefficients are in the denominators. In contrast, the indices of contravariant tensors are superscripts, and the bars in the coefficients are in the numerators.
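The covariant rule (F3) can be checked on a concrete gradient. In the sketch below, the scalar field $\psi =x^{2}+y,$ the change to polar coordinates, the sample point, and the function names are all illustrative assumptions. The gradient is computed in Cartesian coordinates, transformed with the coefficients $\partial x^{\sigma }/\partial {\bar {x}}^{\mu },$ and compared with the derivatives obtained by differentiating $\psi$ directly in polar form.

```python
import math

def grad_psi_cartesian(x, y):
    """W_sigma = d(psi)/d(x^sigma) for the sample field psi = x**2 + y."""
    return [2.0 * x, 1.0]

def inverse_jacobian_polar(r, theta):
    """K[sigma][mu] = d(x^sigma)/d(xbar^mu), x = (x, y), xbar = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def transform_covariant(K, W):
    """Equation (F3): Wbar_mu = K[sigma][mu] * W_sigma, summed over sigma."""
    n = len(W)
    return [sum(K[sigma][mu] * W[sigma] for sigma in range(n)) for mu in range(n)]

def grad_psi_polar_direct(r, theta):
    """Derivatives of psi = r^2 cos^2(theta) + r sin(theta) w.r.t. r, theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [2.0 * r * c * c + s,
            -2.0 * r * r * c * s + r * c]

r, theta = 2.0, math.pi / 6.0
x, y = r * math.cos(theta), r * math.sin(theta)
W_bar = transform_covariant(inverse_jacobian_polar(r, theta),
                            grad_psi_cartesian(x, y))
W_direct = grad_psi_polar_direct(r, theta)
```

The two results agree, whereas applying the contravariant rule (D3) to the same components would not reproduce the polar derivatives.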

Mixed tensors

Addition of covariant tensors can be performed in the same manner as contravariant tensors. Likewise, the outer multiplication of two covariant tensors of ranks m and n yields a covariant tensor of rank m + n. For example, the outer product of

{\bar {A}}_{\lambda }={\frac {\partial x^{\alpha }}{\partial {\bar {x}}^{\lambda }}}A_{\alpha }

and

{\bar {B}}_{\mu \nu }={\frac {\partial x^{\beta }}{\partial {\bar {x}}^{\mu }}}{\frac {\partial x^{\gamma }}{\partial {\bar {x}}^{\nu }}}B_{\beta \gamma }

is given by

{\bar {C}}_{\lambda \mu \nu }={\frac {\partial x^{\alpha }}{\partial {\bar {x}}^{\lambda }}}{\frac {\partial x^{\beta }}{\partial {\bar {x}}^{\mu }}}{\frac {\partial x^{\gamma }}{\partial {\bar {x}}^{\nu }}}C_{\alpha \beta \gamma }

On the other hand, outer multiplication of a covariant tensor of rank m by a contravariant tensor of rank n yields a product of rank m + n which has m indices of covariance and n indices of contravariance. For example the outer product of the covariant tensor

{\bar {A}}_{\lambda }={\frac {\partial x^{\alpha }}{\partial {\bar {x}}^{\lambda }}}A_{\alpha }

and the contravariant tensor

{\bar {B}}^{\mu }={\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\beta }}}B^{\beta }

is the mixed tensor^[1]^{: 173–178}

{\bar {C}}_{\lambda }^{\mu }={\frac {\partial x^{\alpha }}{\partial {\bar {x}}^{\lambda }}}{\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\beta }}}C_{\alpha }^{\beta }

(G1)

Contraction

Tensor contraction is a procedure whereby, given a tensor of rank n, one may construct a tensor of rank n − 2.^[1]^{: 178–183}

The general rule to contract a tensor is to set an upper index equal to a lower index and sum, yielding a tensor of reduced rank. For example, one possible contraction of $T_{\lambda \gamma }^{\alpha \beta }$ is $T_{\beta \gamma }^{\alpha \beta }=S_{\gamma }^{\alpha }$ .^[4]^: 44 Given several possible contractions, the one chosen would be dictated by the requirements of the physical problem being addressed.

Consider the mixed tensor:

{\bar {A}}_{\gamma }^{\alpha \beta }={\frac {\partial x^{\nu }}{\partial {\bar {x}}^{\gamma }}}{\frac {\partial {\bar {x}}^{\alpha }}{\partial x^{\lambda }}}{\frac {\partial {\bar {x}}^{\beta }}{\partial x^{\mu }}}A_{\nu }^{\lambda \mu }\quad \quad (n=2)

(H1)

This expression represents eight equations, each having eight terms on the right.

In the above, let us replace $\gamma$ by $\alpha$ , yielding

{\bar {A}}_{\alpha }^{\alpha \beta }={\frac {\partial x^{\nu }}{\partial {\bar {x}}^{\alpha }}}{\frac {\partial {\bar {x}}^{\alpha }}{\partial x^{\lambda }}}{\frac {\partial {\bar {x}}^{\beta }}{\partial x^{\mu }}}A_{\nu }^{\lambda \mu }\quad \quad (n=2)

(H2)

On the left side, the summation convention means that we have two equations rather than eight. Moreover, the left side now has two terms rather than one.

On the right side, since $\alpha$ now appears twice, the summation convention requires a sum over $\alpha$ in addition to the sums over $\nu ,\,\lambda ,$ and $\mu .$ By the chain rule, the sum over $\alpha$ of the product of the first two coefficients is $\partial x^{\nu }/\partial x^{\lambda }.$ Note, however, that the $x\,{\text{'s}}$ are independent variables. Although functional relationships exist between the ${\bar {x}}\,{\text{'s}}$ and the $x\,{\text{'s}}$, no such functional relationships exist among the $x\,{\text{'s}}$ themselves. What this means is that when $\nu \neq \lambda ,$ the terms drop out, since

{\frac {\partial x^{\nu }}{\partial {\bar {x}}^{\alpha }}}{\frac {\partial {\bar {x}}^{\alpha }}{\partial x^{\lambda }}}={\frac {\partial x^{\nu }}{\partial x^{\lambda }}}=0\quad \quad (\lambda \neq \nu )

On the other hand, when $\lambda =\nu$ (with $\lambda$ held fixed and the sum taken over $\alpha$ only), we observe that

{\frac {\partial x^{\nu }}{\partial {\bar {x}}^{\alpha }}}{\frac {\partial {\bar {x}}^{\alpha }}{\partial x^{\lambda }}}={\frac {\partial x^{\lambda }}{\partial {\bar {x}}^{\alpha }}}{\frac {\partial {\bar {x}}^{\alpha }}{\partial x^{\lambda }}}=1\quad \quad (\lambda =\nu )

Equation (H2) therefore becomes

{\bar {A}}_{\alpha }^{\alpha \beta }={\frac {\partial {\bar {x}}^{\beta }}{\partial x^{\mu }}}A_{\lambda }^{\lambda \mu }\quad \quad (n=2)

(H3)

To clarify the meaning of (H3), we expand the individual terms, noting that $\lambda$ and $\mu$ each appear twice on the right side:

{\begin{aligned}{\bar {A}}_{1}^{11}+{\bar {A}}_{2}^{21}&={\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}(A_{1}^{11}+A_{2}^{21})+{\frac {\partial {\bar {x}}^{1}}{\partial x^{2}}}(A_{1}^{12}+A_{2}^{22})\\{\bar {A}}_{1}^{12}+{\bar {A}}_{2}^{22}&={\frac {\partial {\bar {x}}^{2}}{\partial x^{1}}}(A_{1}^{11}+A_{2}^{21})+{\frac {\partial {\bar {x}}^{2}}{\partial x^{2}}}(A_{1}^{12}+A_{2}^{22})\end{aligned}}

In the above expressions, perform the following substitutions and apply the summation convention:

{\bar {C}}^{1}={\bar {A}}_{1}^{11}+{\bar {A}}_{2}^{21}

{\bar {C}}^{2}={\bar {A}}_{1}^{12}+{\bar {A}}_{2}^{22}

C^{1}=A_{1}^{11}+A_{2}^{21}

C^{2}=A_{1}^{12}+A_{2}^{22}

Then (H3) becomes

{\bar {C}}^{\beta }={\frac {\partial {\bar {x}}^{\beta }}{\partial x^{\mu }}}C^{\mu }

(H4)

The starting rank 3 tensor (H1) has been contracted to yield a tensor of rank one.
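The passage from (H1) to (H4) can be verified with exact integers. The sketch below is an illustration under stated assumptions: the coefficient matrix `J`, its inverse `K`, the component values of `A`, and the function names are all hypothetical. It transforms the full rank-three tensor by (H1), contracts the result, and compares that with contracting first and then transforming by the vector rule (H4).

```python
def transform_mixed(J, K, A):
    """(H1): Abar[g][a][b] = K[v][g] * J[a][l] * J[b][m] * A[v][l][m].

    A[v][l][m] stores A_v^{l m}: first slot is the lower index.
    """
    idx = range(len(J))
    return [[[sum(K[v][g] * J[a][l] * J[b][m] * A[v][l][m]
                  for v in idx for l in idx for m in idx)
              for b in idx] for a in idx] for g in idx]

def contract(A):
    """C^m = A_l^{l m}: set the lower index equal to the first upper index."""
    n = len(A)
    return [sum(A[l][l][m] for l in range(n)) for m in range(n)]

def transform_vector(J, V):
    """(H4): Cbar^b = J[b][m] * C^m, summed over m."""
    n = len(V)
    return [sum(J[b][m] * V[m] for m in range(n)) for b in range(n)]

J = [[2, 1], [1, 1]]      # sample d(xbar)/d(x)
K = [[1, -1], [-1, 2]]    # its inverse, d(x)/d(xbar)
A = [[[4 * v + 2 * l + m + 1 for m in range(2)] for l in range(2)]
     for v in range(2)]   # arbitrary components 1..8

C = contract(A)
contract_after_transform = contract(transform_mixed(J, K, A))
transform_after_contract = transform_vector(J, C)
```

The two orders of operation give the same pair of numbers, which is what it means for the contracted quantity to be a tensor of rank one.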

If we multiply two tensors to form an outer product, and this product is a mixed tensor, contracting this mixed tensor results in an inner product. Hence, if the outer product of $A_{\alpha \beta }$ and $B^{\gamma }$ is the mixed tensor $C_{\alpha \beta }^{\gamma }\,$ , replacing $\gamma$ by $\beta$ results in the contracted tensor $D_{\alpha }$ , which is an inner product of $A_{\alpha \beta }$ and $B^{\gamma }$ .^[1]^{: 178–183}

The student will have already encountered inner products in their studies of vector algebra. The square root of the inner product of vector $A$ with itself is the magnitude of the vector $|A|.\,$ If $\theta$ is the angle between two vectors $A$ and $B\,$ then $|A||B|\cos \theta =A\cdot B$ .^[6]^: 28–29

The importance of tensor contraction will be apparent later on when we discuss the vacuum field solution of general relativity.

The problem with "ordinary" differentiation

To be physically meaningful, the result of applying mathematical operations on tensors should be other tensors, since otherwise the operations lack coordinate independence. We have so far shown that addition, outer multiplication, and contraction of tensor variables do, in fact, yield tensors as their result. Ordinary differentiation, however, has issues.^[1]^{: 183–187}^[4]^: 81–85

Suppose we wish to compute the partial derivative of

{\bar {A}}^{\mu }={\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\sigma }}}A^{\sigma }

(I1)

with respect to ${\bar {x}}^{\nu }.$ Applying the product rule,^{[note 7]} we obtain:

{\frac {\partial {\bar {A}}^{\mu }}{\partial {\bar {x}}^{\nu }}}={\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\sigma }}}{\frac {\partial A^{\sigma }}{\partial {\bar {x}}^{\nu }}}+A^{\sigma }{\frac {\partial ^{2}{\bar {x}}^{\mu }}{\partial x^{\sigma }\partial {\bar {x}}^{\nu }}}

(I2)

The result does not match up at all with any of the tensor prototypes that we have thus far identified. This situation, however, can be partially rectified by a change of variables. Note that

{\frac {\partial A^{\sigma }}{\partial {\bar {x}}^{\nu }}}={\frac {\partial A^{\sigma }}{\partial x^{\tau }}}{\frac {\partial x^{\tau }}{\partial {\bar {x}}^{\nu }}}

If we apply this substitution to the first term on the right-hand side of (I2) and rearrange slightly,^{[note 8]} we obtain

{\frac {\partial {\bar {A}}^{\mu }}{\partial {\bar {x}}^{\nu }}}={\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\sigma }}}{\frac {\partial x^{\tau }}{\partial {\bar {x}}^{\nu }}}{\frac {\partial A^{\sigma }}{\partial x^{\tau }}}+{\frac {\partial ^{2}{\bar {x}}^{\mu }}{\partial x^{\sigma }\partial {\bar {x}}^{\nu }}}A^{\sigma }

(I3)

Close comparison of the first (left) term on the right-hand side of (I3) with the tensor prototypes presented thus far shows that it transforms as a mixed tensor of rank two. But the second (right) term presents an issue.

For certain simple transformations, such as the rotation illustrated in Fig. 6–4, the right term vanishes, since the coefficients $\partial {\bar {x}}^{\mu }/\partial x^{\sigma }$ are constants. In such cases, (I3) will represent a tensor. In the general case, however, $\partial {\bar {x}}^{\mu }/\partial x^{\sigma }$ will not be constants, the right term will not vanish, and (I3) will not be a tensor. In general, therefore, ordinary differentiation of tensors does not represent a physically relevant operation.^[1]^{: 183–187}
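
As a concrete illustration of why the right term of (I3) does not vanish in general, consider the change from Cartesian coordinates $(x^{1},x^{2})=(x,y)$ to polar coordinates $({\bar {x}}^{1},{\bar {x}}^{2})=(r,\theta ).$ This worked example is supplied as an illustrative sketch; the choice of coordinates is an assumption made here and is not taken from the cited sources.

```latex
{\frac {\partial {\bar {x}}^{1}}{\partial x^{1}}}={\frac {\partial r}{\partial x}}={\frac {x}{r}}=\cos \theta
\quad \Longrightarrow \quad
{\frac {\partial ^{2}{\bar {x}}^{1}}{\partial x^{1}\partial {\bar {x}}^{2}}}={\frac {\partial }{\partial \theta }}\cos \theta =-\sin \theta \neq 0
```

Because this coefficient varies from point to point, the right term survives. For a rotation through a fixed angle, by contrast, every such coefficient is a constant and its derivative is zero.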

The ordinary derivative of a tensor is a tensor if and only if coordinate changes are restricted to linear transformations.^[7]^: 68

We will shortly describe an operation called covariant differentiation which does always yield a tensor, and which is used in deriving the curvature tensor which plays an important role in general relativity.

The metric tensor

As mentioned before, the expression for $ds^{2}$ is dependent both on the properties of the space(time) in question and on the coordinate system used. It turns out that all of the different expressions for $ds^{2}$ have the common form^[4]^: 33–38

ds^{2}=g_{\mu \nu }dx^{\mu }dx^{\nu }

(J1)

This common form holds for all spaces and spacetimes, regardless of dimensionality.^[1]^{: 187–190}^{[note 9]}

In two dimensions, (J1) may be expanded to

{\begin{aligned}ds^{2}&=g_{11}dx^{1}dx^{1}+g_{12}dx^{1}dx^{2}\\&+\,g_{21}dx^{2}dx^{1}+g_{22}dx^{2}dx^{2}\end{aligned}}

(J2)

For a Euclidean plane in Cartesian coordinates (A2), $g_{11}=1,$ $g_{12}=0,$ $g_{21}=0,$ and $g_{22}=1.$ This leads to $ds^{2}=(dx^{1})^{2}+(dx^{2})^{2}.$

For polar coordinates (A4), $g_{11}=1,$ $g_{12}=0,$ $g_{21}=0,$ and $g_{22}=r^{2}.$

For oblique coordinates (A6), $g_{11}=1,$ $g_{12}=-\cos \alpha ,$ $g_{21}=-\cos \alpha ,$ and $g_{22}=1.$

For spherical coordinates (A8), $g_{11}=r^{2},$ $g_{12}=0,$ $g_{21}=0,$ and $g_{22}=R^{2}.$

Note that for each of the above, $g_{12}$ and $g_{21}$ have the same value.

In general, regardless of the dimensionality, the shape of the space(time), or the coordinate system employed,

g_{\mu \nu }=g_{\nu \mu }.

Any such set of $g\,{\text{'s}}$ forms a covariant tensor of rank two. Demonstrating that the set of $g\,{\text{'s}}$ in (J1) forms a tensor involves an application of the Quotient Theorem:

If the product (inner or outer) of a given quantity with a tensor of any specified type and arbitrary components is itself a tensor, then the given quantity is a tensor.^{[note 10]}

Given the Quotient Theorem, demonstrating that $g_{\mu \nu }$ is a tensor is straightforward: Since $ds^{2}$ is a scalar, it is a tensor of rank zero. The product of $g_{\mu \nu }dx^{\mu }$ and $dx^{\nu }$ on the right-hand side of (J1) is therefore also a tensor of rank zero. But $dx^{\nu }$ is a contravariant tensor of rank one (i.e. a vector), allowing us to deduce that $g_{\mu \nu }dx^{\mu }$ is a covariant tensor of rank one. But $dx^{\mu }$ is also a contravariant vector, demonstrating that $g_{\mu \nu }$ must be a covariant tensor of rank two.

The metric tensor $g_{\mu \nu }$ is the fundamental object of study in general relativity, since it characterizes the geometric properties of spacetime.^[1]^{: 187–190, 312–314}^[5]^: 77–128
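
The statement that different sets of $g\,{\text{'s}}$ describe the same geometry can be illustrated numerically. The Python sketch below evaluates (J1) for one small displacement in the Euclidean plane, once with the polar components and once with the Cartesian components. The sample point, the displacement, and the function name `line_element_squared` are assumptions made for this illustration.

```python
import math


def line_element_squared(g, dx):
    """Evaluate ds^2 = g_{mu nu} dx^mu dx^nu, as in (J1), by explicit summation."""
    n = len(dx)
    return sum(g[mu][nu] * dx[mu] * dx[nu] for mu in range(n) for nu in range(n))


# A point in the plane and a small displacement, both in polar coordinates.
r, theta = 2.0, 0.7
dr, dtheta = 1e-6, 2e-6

# Polar coordinates: g_11 = 1, g_22 = r^2, off-diagonal components zero.
g_polar = [[1.0, 0.0], [0.0, r * r]]
ds2_polar = line_element_squared(g_polar, [dr, dtheta])

# The same two end points expressed in Cartesian coordinates.
x0, y0 = r * math.cos(theta), r * math.sin(theta)
x1 = (r + dr) * math.cos(theta + dtheta)
y1 = (r + dr) * math.sin(theta + dtheta)
g_cartesian = [[1.0, 0.0], [0.0, 1.0]]
ds2_cartesian = line_element_squared(g_cartesian, [x1 - x0, y1 - y0])

# The two values agree up to terms of higher order in the displacement.
print(ds2_polar, ds2_cartesian)
```

The agreement is only approximate for a finite displacement, since (J1) is a statement about infinitesimal separations; shrinking `dr` and `dtheta` shrinks the relative discrepancy.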

Covariant derivatives of tensors

The covariant derivative discussed in this section is the natural generalization of the ordinary derivative, since it is a tensor, and since, in flat Euclidean space with Cartesian coordinates, it reduces to the ordinary derivative.^[4]^: 83 The expression of the covariant derivative introduces two new symbols, (1) the contravariant metric tensor $g^{\mu \nu }$ (with raised indices), and (2) Christoffel's symbol of the second kind $\Gamma _{\mu \nu }^{\lambda }.$ ^[1]^{: 191–200}

For simplicity, we limit ourselves to two dimensions. In this environment, $g_{\mu \nu }$ will have four components, which can be arranged in a matrix:

{\begin{bmatrix}g_{11}&g_{12}\\g_{21}&g_{22}\end{bmatrix}}

Because $g_{12}=g_{21},$ the matrix is unchanged by reflection about its principal diagonal; such a matrix is called a symmetric matrix.

The determinant of this matrix, $|g_{\mu \nu }|,$ is often denoted simply by the letter $g\,.$

The inverse of this matrix is also symmetric, and its components transform as a contravariant tensor of rank two. The tensor represented by this matrix is $g^{\mu \nu }.$ The product of the two matrices is the identity matrix with ones along the diagonal and zeroes elsewhere. In tensor notation (note the summation upon $\lambda$ )

g^{\mu \lambda }g_{\lambda \nu }=\delta _{\nu }^{\mu },\quad

where

\delta _{\nu }^{\mu }

is the Kronecker delta:^[5]^: 97–99

\delta _{ij}\equiv \delta _{j}^{i}\equiv \delta ^{ij}={\begin{cases}0&{\text{if }}i\neq j\\1&{\text{if }}i=j\end{cases}}
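
The relationship between the covariant and contravariant metric tensors can be verified for a specific case. The Python sketch below uses the oblique-coordinate components quoted earlier, with the assumed value $\cos \alpha =1/2$ so that exact rational arithmetic can be used. The helper names `inverse_2x2` and `matmul` are hypothetical and belong only to this sketch.

```python
from fractions import Fraction as Fr

# Covariant metric for oblique axes, with cos(alpha) = 1/2 as an assumed value.
COS_ALPHA = Fr(1, 2)
g_lower = [[Fr(1), -COS_ALPHA], [-COS_ALPHA, Fr(1)]]


def inverse_2x2(m):
    """Invert a 2x2 matrix using the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul(a, b):
    """Multiply two square matrices, summing over the shared index."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


g_upper = inverse_2x2(g_lower)    # components of the contravariant metric
delta = matmul(g_upper, g_lower)  # sum over the shared index gives the Kronecker delta
print(g_upper)
print(delta)
```

The inverse is itself symmetric, and the product reproduces the identity matrix, as the text states.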

Christoffel's symbol of the second kind is given by^{[note 11]}

\Gamma _{\mu \nu }^{\lambda }={\frac {1}{2}}g^{\lambda \alpha }\left({\frac {\partial g_{\mu \alpha }}{\partial x^{\nu }}}+{\frac {\partial g_{\nu \alpha }}{\partial x^{\mu }}}-{\frac {\partial g_{\mu \nu }}{\partial x^{\alpha }}}\right)

(K1)

Derivation of the Christoffel symbols is outside the scope of this simple introduction but may be found in most textbooks, a relatively accessible presentation being that of Grøn and Næss (2011).^[5]^{: 129–158} In two-dimensional space, (K1) would represent eight equations. Remembering to sum over $\alpha ,$ we would have:

{\begin{aligned}\Gamma _{11}^{1}&={\frac {1}{2}}g^{11}\left({\frac {\partial g_{11}}{\partial x^{1}}}+{\frac {\partial g_{11}}{\partial x^{1}}}-{\frac {\partial g_{11}}{\partial x^{1}}}\right)\\&+\,{\frac {1}{2}}g^{12}\left({\frac {\partial g_{12}}{\partial x^{1}}}+{\frac {\partial g_{12}}{\partial x^{1}}}-{\frac {\partial g_{11}}{\partial x^{2}}}\right)\end{aligned}}

and similarly for the remaining seven values of $\Gamma _{\mu \nu }^{\lambda }.$
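
All eight symbols can be evaluated mechanically from (K1). The Python sketch below does so for the Euclidean plane in polar coordinates, where $g_{11}=1$ and $g_{22}=(x^{1})^{2}.$ It approximates the derivatives of the metric by central differences rather than differentiating symbolically, and it assumes a diagonal metric when inverting; both are simplifications of this sketch, and all function names are hypothetical.

```python
def polar_metric(x):
    """Covariant metric of the Euclidean plane in polar coordinates (r, theta)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_diagonal(g):
    """Invert a diagonal metric (sufficient for this sketch)."""
    n = len(g)
    return [[1.0 / g[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def metric_derivatives(metric, x, h=1e-5):
    """Return dg[k][i][j], a central-difference estimate of d g_ij / d x^k."""
    n = len(x)
    dg = []
    for k in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h
        x_minus[k] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    return dg


def christoffel(metric, x):
    """Return gamma[lam][mu][nu] for the symbol of the second kind, following (K1)."""
    idx = range(len(x))
    g_inv = inverse_diagonal(metric(x))
    dg = metric_derivatives(metric, x)
    return [[[0.5 * sum(g_inv[lam][al]
                        * (dg[nu][mu][al] + dg[mu][nu][al] - dg[al][mu][nu])
                        for al in idx)
              for nu in idx] for mu in idx] for lam in idx]


# Evaluate at r = 2, theta = 0.3 (index 0 is r, index 1 is theta).
gamma = christoffel(polar_metric, [2.0, 0.3])
for lam in range(2):
    for mu in range(2):
        for nu in range(2):
            print(lam + 1, mu + 1, nu + 1, round(gamma[lam][mu][nu], 6))
```

At this point the only symbols that differ appreciably from zero are $\Gamma _{22}^{1}=-r$ and $\Gamma _{12}^{2}=\Gamma _{21}^{2}=1/r.$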

If $A_{\sigma }$ is a covariant tensor of rank one,^{[note 12]} its covariant derivative with respect to $x^{\tau }$ is defined as^[8]^: 44

A_{\sigma \tau }={\frac {\partial A_{\sigma }}{\partial x^{\tau }}}-\Gamma _{\sigma \tau }^{\alpha }A_{\alpha }

(K2)

$A_{\sigma \tau }$ is a covariant tensor of rank two.

If $A^{\sigma }$ is a contravariant tensor of rank one, its covariant derivative with respect to $x^{\tau }$ is defined as^[8]^: 45

A_{\tau }^{\sigma }={\frac {\partial A^{\sigma }}{\partial x^{\tau }}}+\Gamma _{\tau \epsilon }^{\sigma }A^{\epsilon }

(K3)

$A_{\tau }^{\sigma }$ is a mixed tensor of rank two.

If $A_{\sigma \tau }$ is a covariant tensor of rank two, its covariant derivative with respect to $x^{\rho }$ is defined as^[8]^: 45

A_{\sigma \tau \rho }={\frac {\partial A_{\sigma \tau }}{\partial x^{\rho }}}-\Gamma _{\sigma \rho }^{\epsilon }A_{\epsilon \tau }-\Gamma _{\tau \rho }^{\epsilon }A_{\sigma \epsilon }

(K4)

and so forth.^[7]^: 71–72

In like fashion, we may obtain the covariant derivatives for tensors of higher ranks. In all cases, covariant differentiation leads to a tensor with one more rank of covariant character than the starting tensor.

In the special case where the $g{\text{'s}}$ are constants, as for instance when using Cartesian coordinates in a flat Euclidean plane, it is evident when looking at the definition of the Christoffel symbol (K1) that the symbols will all have value zero. In this case, (K3) becomes simply

A_{\tau }^{\sigma }={\frac {\partial A^{\sigma }}{\partial x^{\tau }}}

(K5)

In this special case, the covariant derivative is the same as the ordinary derivative.^[1]^{: 191–200}
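
To see (K3) at work when the $g{\text{'s}}$ are not constant, consider the Euclidean plane in polar coordinates $(x^{1},x^{2})=(r,\theta ),$ for which (K1) gives the nonzero symbols $\Gamma _{22}^{1}=-r$ and $\Gamma _{12}^{2}=\Gamma _{21}^{2}=1/r.$ The vector field used below is an assumption chosen for this sketch: it is the constant unit vector along the Cartesian x-axis, re-expressed in polar components. The example is illustrative and is not drawn from the cited sources.

```latex
A^{1}=\cos \theta ,\qquad A^{2}=-{\frac {\sin \theta }{r}}

A_{1}^{1}={\frac {\partial A^{1}}{\partial r}}+\Gamma _{1\epsilon }^{1}A^{\epsilon }=0+0=0

A_{2}^{1}={\frac {\partial A^{1}}{\partial \theta }}+\Gamma _{22}^{1}A^{2}=-\sin \theta +(-r)\left(-{\frac {\sin \theta }{r}}\right)=0

A_{1}^{2}={\frac {\partial A^{2}}{\partial r}}+\Gamma _{12}^{2}A^{2}={\frac {\sin \theta }{r^{2}}}-{\frac {\sin \theta }{r^{2}}}=0

A_{2}^{2}={\frac {\partial A^{2}}{\partial \theta }}+\Gamma _{21}^{2}A^{1}=-{\frac {\cos \theta }{r}}+{\frac {\cos \theta }{r}}=0
```

The ordinary derivatives of the components are not zero, even though the field itself does not change from point to point. The Christoffel terms cancel them exactly, so the covariant derivative vanishes, in agreement with the result that would be obtained in Cartesian coordinates.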

The Riemann–Christoffel curvature tensor

Suppose that z is a function of x and y, for example z = x² + 2xy. The mixed second partial derivative of z with respect to x and y does not depend on the order of differentiation. In other words,

{\frac {\partial ^{2}z}{\partial x\partial y}}={\frac {\partial ^{2}z}{\partial y\partial x}}=2

On the other hand, order does matter in calculation of the second covariant derivative of a tensor due to the presence of Christoffel symbols.^[1]^{: 200–206}

To illustrate, we start by taking the covariant derivative of $A_{\sigma }$ with respect to $x^{\tau }$ :

A_{\sigma \tau }={\frac {\partial A_{\sigma }}{\partial x^{\tau }}}-\Gamma _{\sigma \tau }^{\alpha }A_{\alpha }

(L1)

Follow by taking the second covariant derivative with respect to $x^{\rho }$ :

A_{\sigma \tau \rho }={\frac {\partial A_{\sigma \tau }}{\partial x^{\rho }}}-\Gamma _{\sigma \rho }^{\epsilon }A_{\epsilon \tau }-\Gamma _{\tau \rho }^{\epsilon }A_{\sigma \epsilon }

(L2)

Substituting (L1) into (L2) yields

{\begin{aligned}A_{\sigma \tau \rho }&={\frac {\partial ^{2}A_{\sigma }}{\partial x^{\tau }\partial x^{\rho }}}-\Gamma _{\sigma \tau }^{\alpha }{\frac {\partial A_{\alpha }}{\partial x^{\rho }}}-A_{\alpha }{\frac {\partial \Gamma _{\sigma \tau }^{\alpha }}{\partial x^{\rho }}}\\&-\Gamma _{\sigma \rho }^{\epsilon }{\frac {\partial A_{\epsilon }}{\partial x^{\tau }}}+\Gamma _{\sigma \rho }^{\epsilon }\Gamma _{\epsilon \tau }^{\alpha }A_{\alpha }\\&-\Gamma _{\tau \rho }^{\epsilon }{\frac {\partial A_{\sigma }}{\partial x^{\epsilon }}}+\Gamma _{\tau \rho }^{\epsilon }\Gamma _{\sigma \epsilon }^{\alpha }A_{\alpha }\end{aligned}}

(L3)

Taking the derivatives in reverse order yields

{\begin{aligned}A_{\sigma \rho \tau }&={\frac {\partial ^{2}A_{\sigma }}{\partial x^{\rho }\partial x^{\tau }}}-\Gamma _{\sigma \rho }^{\alpha }{\frac {\partial A_{\alpha }}{\partial x^{\tau }}}-A_{\alpha }{\frac {\partial \Gamma _{\sigma \rho }^{\alpha }}{\partial x^{\tau }}}\\&-\Gamma _{\sigma \tau }^{\epsilon }{\frac {\partial A_{\epsilon }}{\partial x^{\rho }}}+\Gamma _{\sigma \tau }^{\epsilon }\Gamma _{\epsilon \rho }^{\alpha }A_{\alpha }\\&-\Gamma _{\rho \tau }^{\epsilon }{\frac {\partial A_{\sigma }}{\partial x^{\epsilon }}}+\Gamma _{\rho \tau }^{\epsilon }\Gamma _{\sigma \epsilon }^{\alpha }A_{\alpha }\end{aligned}}

(L4)

The first terms of (L3) and (L4) are equal:

{\frac {\partial ^{2}A_{\sigma }}{\partial x^{\tau }\partial x^{\rho }}}={\frac {\partial ^{2}A_{\sigma }}{\partial x^{\rho }\partial x^{\tau }}}

The second term of (L3) and the fourth term of (L4) are equal, since the choice of dummy symbol used for the summation makes no difference:

\Gamma _{\sigma \tau }^{\alpha }{\frac {\partial A_{\alpha }}{\partial x^{\rho }}}=\Gamma _{\sigma \tau }^{\epsilon }{\frac {\partial A_{\epsilon }}{\partial x^{\rho }}}

Likewise, the fourth term of (L3) and the second term of (L4) are equal:

\Gamma _{\sigma \rho }^{\epsilon }{\frac {\partial A_{\epsilon }}{\partial x^{\tau }}}=\Gamma _{\sigma \rho }^{\alpha }{\frac {\partial A_{\alpha }}{\partial x^{\tau }}}

The sixth terms of (L3) and (L4) are equal, since swapping $\tau$ and $\rho$ leaves the value of $\Gamma _{\tau \rho }^{\epsilon }$ unchanged. This is easily seen in the definition of the Christoffel symbol (K1), remembering that $g_{\mu \nu }$ is symmetric. Likewise, the seventh and final terms of (L3) and (L4) are equal.

The third and fifth terms of (L3), however, are not equal to any of the terms of (L4). Subtracting (L4) from (L3) followed by rearrangement, we obtain

A_{\sigma \tau \rho }-A_{\sigma \rho \tau }=\left[{\frac {\partial \Gamma _{\sigma \rho }^{\alpha }}{\partial x^{\tau }}}-{\frac {\partial \Gamma _{\sigma \tau }^{\alpha }}{\partial x^{\rho }}}+\Gamma _{\sigma \rho }^{\epsilon }\Gamma _{\epsilon \tau }^{\alpha }-\Gamma _{\sigma \tau }^{\epsilon }\Gamma _{\epsilon \rho }^{\alpha }\right]A_{\alpha }

(L5)

The difference on the left-hand side of (L5) is a covariant tensor of rank three. On the right-hand side of (L5), we had specified $A_{\alpha }$ as being an arbitrary covariant tensor of rank one. Since the inner product of $A_{\alpha }$ and the quantity in brackets is a covariant tensor of rank three, the Quotient Theorem tells us that the quantity in brackets must be a mixed tensor of rank four. This quantity is the Riemann-Christoffel curvature tensor:^[1]^{: 200–206}

R_{\sigma \tau \rho }^{\alpha }\equiv {\frac {\partial \Gamma _{\sigma \rho }^{\alpha }}{\partial x^{\tau }}}-{\frac {\partial \Gamma _{\sigma \tau }^{\alpha }}{\partial x^{\rho }}}+\Gamma _{\sigma \rho }^{\epsilon }\Gamma _{\epsilon \tau }^{\alpha }-\Gamma _{\sigma \tau }^{\epsilon }\Gamma _{\epsilon \rho }^{\alpha }

(L6)

Properties of the curvature tensor

If the Christoffel symbols on the right side of (L6) are expanded according to their definition in (K1), it is observed that the Riemann-Christoffel curvature tensor is an expression containing first and second derivatives of the $g\,{\text{'s}},$ which are themselves coefficients of (J1), the expression for $ds^{2}.$ ^[1]^{: 206–213}

In two dimensions, each of the indices of the curvature tensor has two possible values, so that $R_{\sigma \tau \rho }^{\alpha }$ has sixteen components. In three-space, the curvature tensor has 3⁴ or 81 components, while in the four dimensions of spacetime, the curvature tensor has 4⁴ or 256 components.

Various symmetries reduce the complexity of this expression. The first to note is that interchanging the $\tau$ and the $\rho$ of this expression merely changes its sign, so that of the sixteen possible combinations of $\tau$ and $\rho$ in four dimensions, only six are independent.^[1]^{: 206–213} This may be seen as follows:

1. Suppose that we have sixteen quantities

a_{\alpha \beta }\;(n=4)

arranged in a matrix:

{\begin{bmatrix}a_{11}&a_{12}&a_{13}&a_{14}\\a_{21}&a_{22}&a_{23}&a_{24}\\a_{31}&a_{32}&a_{33}&a_{34}\\a_{41}&a_{42}&a_{43}&a_{44}\\\end{bmatrix}}

2. If we stipulate that

a_{\alpha \beta }=-a_{\beta \alpha },

then the terms in the principal diagonal are necessarily zero, and the array becomes

{\begin{bmatrix}0&a_{12}&a_{13}&a_{14}\\-a_{12}&0&a_{23}&a_{24}\\-a_{13}&-a_{23}&0&a_{34}\\-a_{14}&-a_{24}&-a_{34}&0\\\end{bmatrix}}

3. The above antisymmetric matrix has only six independent components rather than sixteen. If, on the other hand, we had stipulated that

a_{\alpha \beta }=a_{\beta \alpha },

the resulting symmetric matrix would have ten independent components.

The six independent combinations of $\tau$ and $\rho ,$ combined with the sixteen combinations of $\sigma$ and $\alpha ,$ give at most 96 independent components rather than 256. Further symmetries reduce the total number of independent components from $n^{4}=256$ to ${\tfrac {1}{12}}n^{2}(n^{2}-1)=20.$ ^[2]^: 86^[4]^{: 115–117}
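
The counting formulas above are easy to tabulate for other dimensionalities. The short Python sketch below does so for two, three, and four dimensions; the function names are hypothetical and exist only for this illustration.

```python
def antisymmetric_components(n):
    """Independent components of an n-by-n antisymmetric array: n(n-1)/2."""
    return n * (n - 1) // 2


def symmetric_components(n):
    """Independent components of an n-by-n symmetric array: n(n+1)/2."""
    return n * (n + 1) // 2


def riemann_independent_components(n):
    """Independent components of the curvature tensor: n^2 (n^2 - 1) / 12."""
    return n * n * (n * n - 1) // 12


for n in (2, 3, 4):
    print(n, n ** 4, antisymmetric_components(n),
          symmetric_components(n), riemann_independent_components(n))
```

For $n=4$ this reproduces the figures quoted in the text: 256 components in all, six independent antisymmetric index pairs, ten independent symmetric pairs, and 20 independent components of the curvature tensor.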

We had earlier shown that superficial examination of $ds^{2}$ does not reveal whether a space is flat or not, since the expression is dependent both on the properties of the space(time) in question and on the coordinate system used. The curvature tensor, however, allows us to make such a determination. If we apply $R_{\sigma \tau \rho }^{\alpha }$ to (A3), (A5), and (A7), we find its components are all zero, while if we apply it to (A9), the components are non-zero.

In the case of (A3), which applies to a Euclidean plane using ordinary Cartesian coordinates, the $g{\text{'s}}$ are constants, with $g_{11}=1,\,g_{22}=1$ with the others all zero. Hence the derivatives are all zero, the Christoffel symbols are all zero, and the components of the curvature tensor are all zero.

It would be a useful exercise for the reader to compute $R_{\sigma \tau \rho }^{\alpha }$ for (A5), which applies to a Euclidean plane using polar coordinates. Here, $g_{11}=1,\,g_{12}=g_{21}=0,\,g_{22}=(x^{1})^{2}.$

In summary,

R_{\sigma \tau \rho }^{\alpha }=0

(M1)

is a necessary and sufficient condition for the local space(time) to be flat. This holds regardless of dimensionality and the coordinate system used.^[1]^{: 206–213}
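
The flatness criterion can be tried out numerically. The Python sketch below evaluates (L6) for the Euclidean plane in polar coordinates and for the surface of a sphere. Several things here are assumptions of the sketch: the Christoffel symbols are entered by hand from their standard closed forms rather than derived from (K1), the sphere is described by the usual angular coordinates $(\theta ,\phi )$ with $g_{11}=R^{2},\,g_{22}=R^{2}\sin ^{2}\theta$ (which may differ from the parametrization used for the spherical example earlier in this article), and derivatives are approximated by central differences.

```python
import math


def gamma_polar(x):
    """Christoffel symbols g[a][s][t] for the plane in (r, theta)."""
    r = x[0]
    g = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    g[0][1][1] = -r
    g[1][0][1] = g[1][1][0] = 1.0 / r
    return g


def gamma_sphere(x):
    """Christoffel symbols g[a][s][t] for a sphere in (theta, phi)."""
    th = x[0]
    g = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    g[0][1][1] = -math.sin(th) * math.cos(th)
    g[1][0][1] = g[1][1][0] = math.cos(th) / math.sin(th)
    return g


def riemann(gamma, x, h=1e-5):
    """Return R[a][s][t][p], the curvature tensor of (L6), at the point x."""
    idx = range(len(x))
    G = gamma(x)
    dG = []  # dG[k][a][s][t] approximates the derivative of G[a][s][t] along x^k
    for k in idx:
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h
        x_minus[k] -= h
        G_plus, G_minus = gamma(x_plus), gamma(x_minus)
        dG.append([[[(G_plus[a][s][t] - G_minus[a][s][t]) / (2 * h) for t in idx]
                    for s in idx] for a in idx])
    return [[[[dG[t][a][s][p] - dG[p][a][s][t]
               + sum(G[e][s][p] * G[a][e][t] - G[e][s][t] * G[a][e][p] for e in idx)
               for p in idx] for t in idx] for s in idx] for a in idx]


flat = riemann(gamma_polar, [2.0, 0.3])
curved = riemann(gamma_sphere, [1.0, 0.5])
largest_flat = max(abs(flat[a][s][t][p])
                   for a in range(2) for s in range(2)
                   for t in range(2) for p in range(2))
print(largest_flat)                           # close to zero: the plane is flat
print(curved[0][1][0][1], math.sin(1.0) ** 2) # nonzero on the sphere
```

Every component vanishes (to within rounding) for the plane even though the polar Christoffel symbols are not zero, while on the sphere the component with $\alpha =1,\,\sigma =2,\,\tau =1,\,\rho =2$ equals $\sin ^{2}\theta .$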

The vacuum field solution

In the development of general relativity, Einstein sought a means to relate spacetime curvature to mass and energy. However, the Riemann curvature tensor is of rank four, while the energy-momentum tensor is of rank two. Two tensors that are proportional to each other must be the same rank as well as have the same symmetries. Einstein, therefore, needed to derive a rank two tensor from the Riemann curvature tensor. (The alternative possibility, finding a rank four tensor expression of energy-momentum, makes no physical sense.) Of the three possible contractions of $R_{\sigma \tau \rho }^{\alpha }\,,$ contraction with the first subscript gives zero, while contraction with the second and third subscripts gives the same result but of opposite sign. Therefore, there was only one independent contraction of the curvature tensor that presented itself to Einstein.^[5]^{: 211–224}

Contracting (M1) with the third subscript yields the Ricci tensor, where

G_{11}=R_{111}^{1}+R_{112}^{2}+R_{113}^{3}+R_{114}^{4}=0

G_{12}=R_{121}^{1}+R_{122}^{2}+R_{123}^{3}+R_{124}^{4}=0

and so forth for each of the sixteen possible combinations of $\sigma$ and $\tau ,\,$ ultimately yielding

G_{\sigma \tau }=0\,.

(N1)

Setting $\rho =\alpha$ in the definition (L6) of the curvature tensor and summing over $\alpha ,$ we see that

G_{\sigma \tau }={\frac {\partial }{\partial x^{\tau }}}\Gamma _{\sigma \alpha }^{\alpha }-{\frac {\partial }{\partial x^{\alpha }}}\Gamma _{\sigma \tau }^{\alpha }+\Gamma _{\sigma \alpha }^{\epsilon }\Gamma _{\epsilon \tau }^{\alpha }-\Gamma _{\sigma \tau }^{\epsilon }\Gamma _{\epsilon \alpha }^{\alpha }

(N2)

From the definition of the Christoffel symbol, (N2) is revealed to be an expression containing first and second partial derivatives of the $g\,{\text{'s}}.$ Since $\sigma$ and $\tau$ may each take on four different values, (N2) represents sixteen equations. However, symmetry considerations reduce this to ten equations, of which only six are independent.^[2]^: 89

Einstein proposed that (N1) should represent the vacuum field equations of general relativity, i.e. the equations that should be valid where the mass-energy density is zero.

Einstein's views on the equivalence principle had evolved significantly over the years since he first conceived of the principle in 1907. His early results in applying the equivalence principle, for example his deduction of the existence of gravitational time dilation and his early arguments on the bending of light in a gravitational field, used kinematic and dynamic analysis rather than geometric arguments. Stachel has identified Einstein's analysis of the rigid relativistic rotating disk as being key to the realization that he needed to adopt a geometric interpretation of spacetime, which he had formerly eschewed. (See Einstein's thought experiments: Non-Euclidean geometry and the rotating disk for a discussion of this point.) In later years, Einstein repeatedly stated that consideration of the rapidly rotating disk was of "decisive importance" to him because it showed that a gravitational field causes non-Euclidean arrangements of measuring rods.^[9]

The equivalence principle states that if we are in free fall in a gravitational field, gravity is locally eliminated. Since, locally, we cannot distinguish a gravitational field from an inertial field resulting from uniform acceleration, gravitation should be regarded as an inertial force.^[2]^: 142

By 1912, Einstein had fully embraced the view that the paths of freely moving objects are determined by the geometry of the spacetime through which they travel. Freely moving objects always follow a straight line in their local inertial frames, which is to say, they always follow along the path of timelike geodesics. As indicated earlier in section Basic propositions, evidence of gravitation is observed by variation in the field rather than the field itself, as manifest in the relative accelerations of two separated particles. In Fig. 5-1, two separated particles, free-falling in the gravitational field of the Earth, exhibit tidal accelerations due to local inhomogeneities in the gravitational field such that each particle follows a different path through spacetime. The convergence or divergence of the test particles is described with the aid of the Riemann curvature tensor^[2]^: 142 which is the analog of Newtonian tidal forces.^[4]^: 100

The $g\,{\text{'s}}$ of the spacetime metric serve to quantify the shape of spacetime. In analogy with the field formulation of Newtonian gravitational theory, which we will discuss in the next section, (N1) represents a set of second-order partial differential equations for the potentials as field equations of the theory. These equations, of course, must be tensorial.^[2]^: 142

The equations of (N1) represent the simplest expression which is analogous to the field formulation of Newtonian gravitational theory (in regions of zero mass density). Predictions of this theory match up with the predictions of Newtonian gravitational theory in the low-speed, low-gravitation regime. These equations also predict additional effects that have been fully verified by observation and experiment.^[1]^{: 213–219}

The field formulation of Newtonian gravitation

Newton's law of universal gravitation is inherently non-relativistic. The most familiar expression of the law is in its action-at-a-distance form,

F=-G{\frac {m_{1}m_{2}}{r^{2}}},

(O1)

where $G$ in this case is the gravitational constant (not to be confused with the Ricci tensor), and the force is along a line connecting the two masses. The law requires that the forces between the gravitating bodies be transmitted instantaneously. Newton's law is incompatible with a finite speed of gravity. In 1805, Laplace concluded that the speed of gravitational interactions must be at least 7×10⁶ times the speed of light, otherwise the resulting orbital instabilities should long ago have caused the Earth to plunge into the Sun.^[10]^{[note 13]}

Einstein wanted to construct a theory of gravitation that adhered to relativistic principles. From his own work in 1905, he knew that Maxwell's theory of electromagnetism was consistent with special relativity. He also knew that it was Faraday's development of the field concept that led the way for Maxwell's inherently relativistic theory. Therefore, Einstein was certain that the general theory that he wanted to create would be a field theory rather than an action-at-a-distance theory.^[5]^{: 230–235}

In a field theory, changes in the field are expressed by means of differential equations. The gravitational potential $\phi$ is a function expressing the potential energy of a particle with unit mass in the gravitational field. The potential energy of a particle at position $P$ is the energy required to move the particle from an arbitrary position of zero energy to $P.$ This position of zero energy may be chosen freely. When performing calculations near the surface of the Earth, it is frequently chosen to be sea level. For celestial mechanics calculations, it is usually chosen to be a position infinitely distant in space. The potential's value increases in the upward direction in the gravitational field.^[5]^{: 230–235}

Figure 6–6. Initial steps in deriving a field theory from Newton's law of gravitation

To derive a field theory version of Newton's law, we first rearrange (O1) as follows:^[1]^{: 219–227}

{\frac {F}{m_{2}}}=-G{\frac {m_{1}}{r^{2}}}=a

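On the left side of the equation, $F/m_{2}$ represents the acceleration of $m_{2}$ due to the gravitational field surrounding $m_{1}.$ The negative sign indicates that this acceleration is directed toward $m_{1}.$ Since $Gm_{1}$ is a constant, which we denote by $C,$ we may write the magnitude of the acceleration as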

a={\frac {C}{r^{2}}}

(O2)

Fig. 6–6 shows two axes of a three-dimensional diagram, the third $Z$ axis pointing out of the page towards the reader. Mass $m_{1}$ is at the origin, $m_{2}$ is at $P$ with coordinates $x,\,y,\,z,\,$ and $OP=r.\,$ Acceleration ${\vec {a}}$ is a vector quantity and may be split up into three components, ${\vec {a}}_{x},\,{\vec {a}}_{y},\,{\vec {a}}_{z}.\,$ It is evident that

a_{x}=-a\cdot {\frac {x}{r}},\ a_{y}=-a\cdot {\frac {y}{r}},\ a_{z}=-a\cdot {\frac {z}{r}}

Substituting in the value of $a$ from (O2), we get

a_{x}=-{\frac {Cx}{r^{3}}},\ a_{y}=-{\frac {Cy}{r^{3}}},\ a_{z}=-{\frac {Cz}{r^{3}}}

Taking the partial derivative of $a_{x}$ with respect to $x$ , we obtain

{\frac {\partial a_{x}}{\partial x}}=-Cr^{-3}+3Cxr^{-4}\cdot {\frac {\partial r}{\partial x}}={\frac {-Cr^{3}+3Cxr^{2}\cdot \partial r/\partial x}{r^{6}}}

and likewise for $a_{y}$ and $a_{z}.\,$ But since $r^{2}=x^{2}+y^{2}+z^{2},$

{\frac {\partial r}{\partial x}}={\frac {x}{r}}.

Substituting this into the above equation,

{\frac {\partial a_{x}}{\partial x}}={\frac {-C(r^{2}-3x^{2})}{r^{5}}}

and likewise

{\frac {\partial a_{y}}{\partial y}}={\frac {-C(r^{2}-3y^{2})}{r^{5}}}\quad

and

\quad {\frac {\partial a_{z}}{\partial z}}={\frac {-C(r^{2}-3z^{2})}{r^{5}}}

Adding together the above equations, we obtain

{\frac {\partial a_{x}}{\partial x}}+{\frac {\partial a_{y}}{\partial y}}+{\frac {\partial a_{z}}{\partial z}}=0

(O3)

From the definition of gravitational potential, we may write

a_{x}=-{\frac {\partial \phi }{\partial x}},\;a_{y}=-{\frac {\partial \phi }{\partial y}},\;a_{z}=-{\frac {\partial \phi }{\partial z}}

Substituting into (O3), we obtain

{\frac {\partial ^{2}\phi }{\partial x^{2}}}+{\frac {\partial ^{2}\phi }{\partial y^{2}}}+{\frac {\partial ^{2}\phi }{\partial z^{2}}}=0

(O4)

The above field formulation of Newton's law of gravitation is known as Laplace's equation, valid for regions of zero mass density. It may be written more succinctly using the $\nabla ^{2}$ operator (pronounced "del square"):^{[note 14]}

\nabla ^{2}\phi =0
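
Laplace's equation for the field of a point mass can be checked numerically. The Python sketch below estimates the sum of second partial derivatives in (O4) by central differences at a sample point away from the origin. The normalization $Gm_{1}=1,$ the sample point, the step size, and the function names are assumptions of this sketch. A second function that does not satisfy Laplace's equation is included for contrast.

```python
import math


def point_mass_potential(x, y, z):
    """Potential -1/r of a point mass at the origin, in units where G*m1 = 1."""
    return -1.0 / math.sqrt(x * x + y * y + z * z)


def squared_distance(x, y, z):
    """A comparison function, r^2, which is not a solution of (O4)."""
    return x * x + y * y + z * z


def laplacian(f, x, y, z, h=1e-3):
    """Central-difference estimate of the left-hand side of (O4)."""
    centre = f(x, y, z)
    d2x = (f(x + h, y, z) - 2.0 * centre + f(x - h, y, z)) / (h * h)
    d2y = (f(x, y + h, z) - 2.0 * centre + f(x, y - h, z)) / (h * h)
    d2z = (f(x, y, z + h) - 2.0 * centre + f(x, y, z - h)) / (h * h)
    return d2x + d2y + d2z


vacuum_value = laplacian(point_mass_potential, 1.0, 2.0, 3.0)
contrast_value = laplacian(squared_distance, 1.0, 2.0, 3.0)
print(vacuum_value)    # close to zero, limited by discretization and rounding
print(contrast_value)  # close to 6
```

The individual second derivatives of the potential are not zero; only their sum vanishes, mirroring the cancellation that leads to (O3).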

We observe in (O4) that the field formulation of Newton's law of gravitation is an equation containing second partial derivatives of the gravitational potential. By way of comparison, the vacuum solution of Einstein's field equation (N1) is a set of equations containing nothing higher than the second partial derivatives of the components of the metric tensor. Einstein's field equation expresses the equivalence principle by replacing the concept of a varying gravitational potential originating from action-at-a-distance forces, with the concept of a spacetime varying in shape.^[1]^{: 219–227}

We had noted before that each component of the Ricci tensor $G_{\sigma \tau }$ represents the sum of four components of the Riemann curvature tensor $R_{\sigma \tau \rho }^{\alpha }.$ If the components of the Riemann tensor are all zero, then spacetime is flat and the components of $G_{\sigma \tau }$ will all be zero. However, the converse is not true. If the components of $G_{\sigma \tau }$ are all zero, that does not imply that the components of the Riemann tensor need all be zero.

Even as, in Newtonian theory, $\nabla ^{2}\phi =0$ is the field equation for regions of zero mass density around gravitating bodies, so $G_{\sigma \tau }=0$ is the relativistic field equation for regions of zero mass-energy density around gravitating bodies.^[1]^{: 219–227}

Solving the vacuum field equations

The vacuum field solution of general relativity,

G_{\sigma \tau }=0

comprises six independent equations containing partial derivatives of the components of the metric tensor $g.$ To test these equations, we must use a form of the expression for $ds^{2}$ applicable to the physical situation which we are modeling and which preferably should be in a form convenient for calculation.^[1]^{: 227–237}

Figure 6–7. Spherical coordinates

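The classical tests for general relativity include observations of the anomalous perihelion precession of Mercury, the deflection of light by the Sun, and the gravitational redshift.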

Since the gravitational field of the Sun is very nearly spherically symmetric and decreases with radial distance from the Sun, a form of the expression for $ds^{2}$ which reflects this symmetry would be convenient for computation of anomalous perihelion precession, the deflection of light by the Sun, and the gravitational redshift. We begin by adopting spherical coordinates.^[1]^{: 227–237}

In three-dimensional Euclidean space, the expression for $ds^{2}$ in terms of spherical coordinates is

ds^{2}=dr^{2}+r^{2}d\theta ^{2}+r^{2}\sin ^{2}\theta \cdot d\phi ^{2}

as may be readily derived from $ds^{2}=(dx^{1})^{2}+(dx^{2})^{2}+(dx^{3})^{2}$ with the aid of Fig. 6–7.
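
The derivation can be sketched as follows, assuming that Fig. 6–7 uses the common convention in which $\theta$ is measured from the $z$ axis and $\phi$ is the azimuthal angle in the $x$-$y$ plane. If the figure uses a different convention, the intermediate expressions change but the method is the same.

```latex
x=r\sin \theta \cos \phi ,\qquad y=r\sin \theta \sin \phi ,\qquad z=r\cos \theta

{\begin{aligned}dx&=\sin \theta \cos \phi \,dr+r\cos \theta \cos \phi \,d\theta -r\sin \theta \sin \phi \,d\phi \\dy&=\sin \theta \sin \phi \,dr+r\cos \theta \sin \phi \,d\theta +r\sin \theta \cos \phi \,d\phi \\dz&=\cos \theta \,dr-r\sin \theta \,d\theta \end{aligned}}

dx^{2}+dy^{2}+dz^{2}=dr^{2}+r^{2}d\theta ^{2}+r^{2}\sin ^{2}\theta \cdot d\phi ^{2}
```

On squaring and adding, all of the cross terms in $dr\,d\theta ,$ $dr\,d\phi ,$ and $d\theta \,d\phi$ cancel, and the identity $\sin ^{2}+\cos ^{2}=1$ collapses the remaining coefficients.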

The expression for flat Minkowski spacetime in four dimensions using Cartesian coordinates is

ds^{2}=-dx^{2}-dy^{2}-dz^{2}+c^{2}dt^{2}

which in spherical coordinates would be

ds^{2}=-dr^{2}-r^{2}d\theta ^{2}-r^{2}\sin ^{2}\theta \cdot d\phi ^{2}+c^{2}dt^{2}

However, general relativity involves consideration of curved spacetime. It is reasonable to assume that the expression for curved spacetime using spherical coordinates will have the form

{\begin{aligned}ds^{2}&=-e^{\lambda }dr^{2}-e^{\mu }r^{2}(d\theta ^{2}+\sin ^{2}\theta \cdot d\phi ^{2})+e^{\nu }dt^{2}\\&{\text{or}}\\ds^{2}&=-e^{\lambda }(dx^{1})^{2}-e^{\mu }(x^{1})^{2}((dx^{2})^{2}+\sin ^{2}x^{2}\cdot (dx^{3})^{2})+e^{\nu }(dx^{4})^{2}\end{aligned}}

(P1)

where $x^{1},\,x^{2},\,x^{3},\,x^{4}$ represent, respectively, the spherical coordinates $r,\,\theta ,\,\phi ,\,t,\;$ while $\lambda ,\,\mu ,\,\nu$ will be functions only of $x^{1}\equiv r.\,$ In other words, there will be no directional dependence of these functions, nor will there be any time dependence of these functions.

The requirement for spherical symmetry implies that $ds^{2}$ should not vary when $\theta$ and $\phi$ are varied, so that $\theta$ and $\phi$ only occur in the form $(d\theta ^{2}+\sin ^{2}\theta \cdot d\phi ^{2}).$ ^[2]^{: 184–186}

Furthermore, there are no product terms of the form $dx^{\sigma }dx^{\tau }$ where $\sigma \neq \tau .\;$ If terms like $dr\cdot d\theta ,\,d\theta \cdot d\phi ,\,$ or $dr\cdot d\phi$ existed, then the expression for $ds^{2}$ would be different if we turned in different directions. In particular, the metric needs to be invariant under the reflections $\theta \rightarrow \theta '=\pi -\theta$ and $\phi \rightarrow \phi '=-\phi .\,$ Likewise, since we are considering a static solution, we do not consider use of product terms such as $dr\cdot dt$ and so forth.

This eliminates all of the cross terms of the general expression for $ds^{2}$ presented in (J1). Only the squared terms $dr^{2},\,d\theta ^{2},\,d\phi ^{2},\,dt^{2}$ are used.

Functions $e^{\lambda },\,e^{\mu },\,e^{\nu }$ are inserted into the coefficients of (P1) to allow for the fact that the spacetime is curved. The form of these functions allows them to be adjusted to fit the scenario which we are modeling. Expressing them as exponentials in the generalized formula is a mathematical convention that

ensures that their values are always positive, thus guaranteeing that the signature of the metric (i.e. the excess of plus signs over minus signs) is -2,^[2]^{: 184–186} and
simplifies forthcoming calculations involving differentiation and the natural log.

Equation (P1) can be simplified by transforming coordinates:

e^{\mu }r^{2}\rightarrow {\bar {r}}^{2}

or, using generalized coordinates,

e^{\mu }(x^{1})^{2}\rightarrow ({\bar {x}}^{1})^{2}

By taking ${\bar {x}}^{1}$ as a new coordinate, it is possible to eliminate $e^{\mu }$ entirely. We may even drop the bar notation, since any change in $(dx^{1})^{2}$ resulting from the above substitution can be compensated for by modifying function $\lambda .$ Equation (P1) hence becomes

ds^{2}=-e^{\lambda }dr^{2}

-\;r^{2}(d\theta ^{2}+\sin ^{2}\theta \cdot d\phi ^{2})+e^{\nu }dt^{2}

{\text{or}}

(P2)

ds^{2}=-e^{\lambda }(dx^{1})^{2}-(x^{1})^{2}((dx^{2})^{2}

+\;\sin ^{2}x^{2}\cdot (dx^{3})^{2})+e^{\nu }(dx^{4})^{2}

The task now is to express $e^{\lambda }$ and $e^{\nu }$ as functions of $x^{1}.$ ^[1]^{: 227–237}

The Schwarzschild metric

From (P2), we have the following:

{\begin{aligned}&g_{11}=-e^{\lambda },\;g_{22}=-r^{2},\;g_{33}=-r^{2}\sin ^{2}\theta ,\;g_{44}=e^{\nu }\\&{\text{or}}\\&g_{11}=-e^{\lambda },\;g_{22}=-(x^{1})^{2},\;g_{33}=-(x^{1})^{2}\sin ^{2}x^{2},\;g_{44}=e^{\nu }\end{aligned}}

(Q1)

and $g_{\sigma \tau }=0$ when $\sigma \neq \tau .$

Hence the components of $g_{\mu \nu }$ form a diagonal matrix (i.e. have nonzero elements only along the principal diagonal). The determinant of $g_{\mu \nu }$ will therefore be simply equal to the product of the elements along the principal diagonal. Representing this determinant by the symbol $g,$ we have:

g=-e^{\lambda +\nu }(x^{1})^{4}\sin ^{2}x^{2}

(Q2)
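The claim that the determinant of a diagonal metric is the product of its diagonal elements, and that this product equals (Q2), can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the values chosen for $\lambda ,\,\nu ,\,r,\,\theta$ are arbitrary test numbers rather than a solution of the field equations.

```python
import math
from itertools import permutations

def determinant(matrix):
    """Determinant by the Leibniz permutation formula (adequate for 4x4)."""
    n = len(matrix)
    total = 0.0
    for perm in permutations(range(n)):
        # The sign of the permutation follows from its number of inversions.
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1.0 if inversions % 2 else 1.0
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total

def metric_matrix(r, theta, lam, nu):
    """Covariant metric of (Q1) as a 4x4 diagonal matrix."""
    diag = [-math.exp(lam), -r**2, -(r * math.sin(theta))**2, math.exp(nu)]
    return [[diag[i] if i == j else 0.0 for j in range(4)] for i in range(4)]

def g_from_q2(r, theta, lam, nu):
    """Closed form (Q2): g = -exp(lambda + nu) r^4 sin^2(theta)."""
    return -math.exp(lam + nu) * r**4 * math.sin(theta)**2

# Arbitrary test values (assumptions).
r, theta, lam, nu = 2.0, 1.0, 0.15, -0.1
print(determinant(metric_matrix(r, theta, lam, nu)), g_from_q2(r, theta, lam, nu))
```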

Also in this case,

g^{\sigma \sigma }=1/g_{\sigma \sigma }

(meaning that $g^{11}=1/g_{11},\;g^{22}=1/g_{22}$ and so forth), and

g^{\sigma \tau }=0\quad {\text{when}}\quad \sigma \neq \tau .

The above relationships make it possible to determine the coefficients $e^{\lambda }$ and $e^{\nu }$ of the metric tensor and to establish the form of the Ricci tensor $G_{\sigma \tau }$ , which represents the sixteen equations expressed by Equation (N2). In the following, these sixteen equations will be reduced to ten, then to six in the general solution. The Christoffel symbols in the solution will be categorized, and then each term will be individually addressed, ultimately leading to the Schwarzschild metric.^[1]^{: 237–255}

From sixteen equations to ten

We first show that $G_{\sigma \tau }$ is symmetric, which reduces $G_{\sigma \tau }=0$ to ten equations. Note the expression $\Gamma _{\sigma \alpha }^{\alpha }$ which is the first term on the right-hand side of (N2). From the definition of the Christoffel symbol (see (K1)),

\Gamma _{\sigma \alpha }^{\alpha }={\tfrac {1}{2}}g^{\alpha \epsilon }\left({\frac {\partial g_{\sigma \epsilon }}{\partial x^{\alpha }}}+{\frac {\partial g_{\alpha \epsilon }}{\partial x^{\sigma }}}-{\frac {\partial g_{\sigma \alpha }}{\partial x^{\epsilon }}}\right)

When the above expression is expanded using the Einstein summation convention, the first and third terms in the parentheses cancel (they differ only by an interchange of the dummy indices $\alpha$ and $\epsilon ,$ and $g^{\alpha \epsilon }$ is symmetric), leaving

\Gamma _{\sigma \alpha }^{\alpha }={\tfrac {1}{2}}g^{\alpha \epsilon }{\frac {\partial g_{\alpha \epsilon }}{\partial x^{\sigma }}}

From the definition of the contravariant metric tensor $g^{\mu \nu },$ we obtain

{\tfrac {1}{2}}g^{\alpha \epsilon }{\frac {\partial g_{\alpha \epsilon }}{\partial x^{\sigma }}}={\frac {1}{2g}}{\frac {\partial g}{\partial x^{\sigma }}}

where $g$ is the determinant as described above. From basic calculus, we obtain

{\frac {1}{2g}}{\frac {\partial g}{\partial x^{\sigma }}}={\frac {\partial }{\partial x^{\sigma }}}\ln {\sqrt {-g}},

the negative of $g$ being chosen so that the square root is real.

Hence,

\Gamma _{\sigma \alpha }^{\alpha }={\frac {\partial }{\partial x^{\sigma }}}\ln {\sqrt {-g}}

and by similar reasoning

\Gamma _{\epsilon \alpha }^{\alpha }={\frac {\partial }{\partial x^{\epsilon }}}\ln {\sqrt {-g}}

Substituting these into (N2), we obtain

{\begin{aligned}G_{\sigma \tau }&\equiv \Gamma _{\sigma \alpha }^{\epsilon }\Gamma _{\epsilon \tau }^{\alpha }+{\frac {\partial ^{2}}{\partial x^{\sigma }\partial x^{\tau }}}\ln {\sqrt {-g}}\\&\quad -{\frac {\partial }{\partial x^{\alpha }}}\Gamma _{\sigma \tau }^{\alpha }-\Gamma _{\sigma \tau }^{\epsilon }{\frac {\partial }{\partial x^{\epsilon }}}\ln {\sqrt {-g}}\\&=0\end{aligned}}

(Q3)

It is straightforward to demonstrate that interchange of $\sigma$ and $\tau$ in (Q3) leaves the equations unchanged. To start with, from the properties of the Christoffel symbol,

\Gamma _{\epsilon \tau }^{\alpha }=\Gamma _{\tau \epsilon }^{\alpha }

so that the two factors of the first term trade places but are otherwise unchanged ( $\epsilon$ and $\alpha$ are dummy variables that disappear upon expansion using the Einstein summation convention). The values of the second, third and fourth terms of (Q3) are likewise unaffected by swapping $\sigma$ and $\tau .$ Therefore,

G_{\sigma \tau }=G_{\tau \sigma }

so that the number of independent equations is reduced from sixteen to ten.^[1]^{: 237–255}

From ten equations to six

We refer the reader to treatments in standard textbooks such as Grøn & Næss (2011) for information on this step.^[5]^{: 217–224} The reduction of the ten equations of $G_{\mu \nu }=0$ to six is of considerable historical and physical importance, and took Einstein from 1913 to 1915 to resolve. He wished to be able to relate $G_{\mu \nu }$ to the energy-momentum tensor. Since energy and momentum are conserved, the four covariant derivatives of the energy-momentum tensor must be zero. Therefore the four covariant derivatives of the Einstein tensor must also be zero, but it was not obvious to Einstein how this should be the case. The mathematics demonstrating that this must be so had actually been developed many years earlier by Luigi Bianchi, but the Bianchi identities were unknown to Einstein in 1913. Furthermore, even if he could reduce the equations from ten to six, he still had the problem that the ten components of the metric tensor $g_{\mu \nu }$ would be underdetermined, since he would have only six equations to work with. It was not until the fall of 1915 that Einstein realized that he had a four-fold freedom in the choice of metric tensor, now called a gauge invariance, that reduced the ten $g\,{\text{'s}}$ to six, so that the number of unknowns would match the number of equations that he had available.^[1]^: 334

Categorizing the Christoffel symbols in the Ricci tensor

The Christoffel symbols in the expression for $G_{\sigma \tau }$ presented in (Q3) are highly degenerate, and over two hundred terms will drop out in the following analysis.^[1]^{: 237–255}

To accomplish this simplification, we first need to classify the Christoffel symbols in (Q3). We distinguish four classes of symbol:

Case A: Those where all the Greek letters are alike, i.e. $\Gamma _{\sigma \sigma }^{\sigma }$
Case B: Those of form $\Gamma _{\sigma \sigma }^{\tau }$
Case C: Those of form $\Gamma _{\sigma \tau }^{\tau }=\Gamma _{\tau \sigma }^{\tau }$
Case D: Those where the Greek letters are all different, i.e. $\Gamma _{\sigma \tau }^{\rho }$

According to the definition of the Christoffel symbol (K1),

\Gamma _{\sigma \sigma }^{\sigma }={\tfrac {1}{2}}g^{\sigma \alpha }\left({\frac {\partial g_{\sigma \alpha }}{\partial x^{\sigma }}}+{\frac {\partial g_{\sigma \alpha }}{\partial x^{\sigma }}}-{\frac {\partial g_{\sigma \sigma }}{\partial x^{\alpha }}}\right)

We had previously noted that $g_{\sigma \tau }=0$ when the indices are not alike. The $g\,{\text{'s}}$ are non-zero only when the indices are the same. Furthermore, $g^{\sigma \sigma }=1/g_{\sigma \sigma }.$ We use these facts to simplify the above equation:

\Gamma _{\sigma \sigma }^{\sigma }={\frac {1}{2g_{\sigma \sigma }}}\left({\frac {\partial g_{\sigma \sigma }}{\partial x^{\sigma }}}+{\frac {\partial g_{\sigma \sigma }}{\partial x^{\sigma }}}-{\frac {\partial g_{\sigma \sigma }}{\partial x^{\sigma }}}\right)

Two terms cancel, so that

\Gamma _{\sigma \sigma }^{\sigma }={\frac {1}{2g_{\sigma \sigma }}}{\frac {\partial g_{\sigma \sigma }}{\partial x^{\sigma }}}

which yields, from basic calculus,

Case A:

\Gamma _{\sigma \sigma }^{\sigma }={\frac {1}{2}}{\frac {\partial }{\partial x^{\sigma }}}\ln g_{\sigma \sigma }

One handles the second case in similar fashion:

\Gamma _{\sigma \sigma }^{\tau }={\tfrac {1}{2}}g^{\tau \alpha }\left({\frac {\partial g_{\sigma \alpha }}{\partial x^{\sigma }}}+{\frac {\partial g_{\sigma \alpha }}{\partial x^{\sigma }}}-{\frac {\partial g_{\sigma \sigma }}{\partial x^{\alpha }}}\right)

Here, $g^{\tau \alpha }$ is non-zero only when $\alpha =\tau .$ This case is distinguished from the first case because $\tau \neq \sigma ,$ so that the first two terms within the parentheses are zero. Hence,

\Gamma _{\sigma \sigma }^{\tau }=-{\tfrac {1}{2}}g^{\tau \tau }{\frac {\partial g_{\sigma \sigma }}{\partial x^{\tau }}}

which yields

Case B:

\Gamma _{\sigma \sigma }^{\tau }=-{\frac {1}{2g_{\tau \tau }}}{\frac {\partial g_{\sigma \sigma }}{\partial x^{\tau }}}

Likewise,

Case C:

\Gamma _{\sigma \tau }^{\tau }=\Gamma _{\tau \sigma }^{\tau }={\frac {1}{2}}{\frac {\partial }{\partial x^{\sigma }}}\ln g_{\tau \tau }

Case D:

\Gamma _{\sigma \tau }^{\rho }=0

Term-by-term analysis of Case A

For $\sigma =1,$ and remembering the relationships in (Q1),

\Gamma _{11}^{1}={\frac {1}{2}}{\frac {\partial }{\partial x^{1}}}\ln g_{11}=

{\frac {1}{2}}{\frac {\partial }{\partial r}}\ln(-e^{\lambda })

Then

\Gamma _{11}^{1}={\frac {1}{2}}{\frac {-e^{\lambda }}{-e^{\lambda }}}{\frac {\partial \lambda }{\partial r}}=

{\frac {1}{2}}{\frac {\partial \lambda }{\partial r}}={\tfrac {1}{2}}\lambda '\,,

where $\lambda '$ represents $\partial \lambda /\partial x^{1}$ or $\partial \lambda /\partial r\,.$

For $\sigma =2$

\Gamma _{22}^{2}={\frac {1}{2}}{\frac {\partial }{\partial x^{2}}}\ln g_{22}=

{\frac {1}{2}}{\frac {\partial }{\partial x^{2}}}\ln \left(-(x^{1})^{2}\right)={\frac {1}{2}}{\frac {\partial }{\partial \theta }}\ln(-r^{2})=0\,,

since $r$ and $\theta$ are independent variables.

For $\sigma =3$ and $\sigma =4,$ we have:

\Gamma _{33}^{3}=\Gamma _{44}^{4}=0\,.

Term-by-term analysis of Case B

Let us first look at $\sigma =1,\,\tau =2\,:$

\Gamma _{11}^{2}=-{\frac {1}{2g_{22}}}{\frac {\partial }{\partial x^{2}}}g_{11}=-{\frac {1}{2g_{22}}}{\frac {\partial }{\partial x^{2}}}(-e^{\lambda })

Since $\lambda$ was defined as being a function of $x^{1}\equiv r$ only, the partial with respect to $x^{2}\equiv \theta$ is equal to zero,

\Gamma _{11}^{2}=0.

In like manner, we can work through all of the other symbols of this case, and then through Cases C and D.^[1]^{: 237–255}
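The case-by-case evaluation can be cross-checked by brute force. The following Python sketch evaluates every Christoffel symbol of the metric (P2) directly from the definition (K1), using central finite differences for the derivatives of the $g\,{\text{'s}}$ and the fact that the metric is diagonal. It is a sketch under stated assumptions: the functions `lam` and `nu` are hypothetical stand-ins for $\lambda (r)$ and $\nu (r)$ chosen only for testing, the function names are illustrative, and indices in the returned keys are numbered 1 to 4 to match the text.

```python
import math

def lam(r):
    return 0.3 / r    # hypothetical lambda(r), chosen only for testing

def nu(r):
    return -0.2 / r   # hypothetical nu(r), chosen only for testing

def metric_diagonal(x):
    """Diagonal components g_11..g_44 of (P2) at x = (r, theta, phi, t)."""
    r, theta = x[0], x[1]
    return [-math.exp(lam(r)), -r**2, -(r * math.sin(theta))**2, math.exp(nu(r))]

def d_metric(x, a, k, h=1e-5):
    """Central-difference estimate of the derivative of g_aa along x^k."""
    forward, backward = list(x), list(x)
    forward[k] += h
    backward[k] -= h
    return (metric_diagonal(forward)[a] - metric_diagonal(backward)[a]) / (2 * h)

def christoffel(x, upper, low1, low2):
    """Gamma^upper_{low1 low2} from (K1); only g^{upper upper} contributes."""
    total = 0.0
    if low1 == upper:
        total += d_metric(x, upper, low2)
    if low2 == upper:
        total += d_metric(x, upper, low1)
    if low1 == low2:
        total -= d_metric(x, low1, upper)
    return total / (2 * metric_diagonal(x)[upper])

def nonzero_symbols(x, tol=1e-9):
    """Scan the 40 distinct index combinations and keep the non-zero ones."""
    found = {}
    for upper in range(4):
        for low1 in range(4):
            for low2 in range(low1, 4):   # lower indices are symmetric
                value = christoffel(x, upper, low1, low2)
                if abs(value) > tol:
                    found[(upper + 1, low1 + 1, low2 + 1)] = value
    return found

point = [2.0, 1.0, 0.5, 0.0]   # sample (r, theta, phi, t)
symbols = nonzero_symbols(point)
for key in sorted(symbols):
    print(key, symbols[key])
```

At a generic point the scan should leave only the nine symbols listed in (Q4), and the printed values can be compared with the closed forms there, for example $\Gamma _{12}^{2}=1/r$ and $\Gamma _{23}^{3}=\cot \theta .$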

Complete list of non-zero Christoffel symbols in $G_{\sigma \tau }$

In all, there are 4 specific examples of Case A,
$4\cdot 3=12$ combinations of $\sigma$ and $\tau$ for Case B,
$4\cdot 3=12$ combinations of $\sigma$ and $\tau$ for Case C,
and $(4\cdot 3\cdot 2)/2=12$ combinations of $\sigma ,\,\tau ,\,\rho$ for Case D (since the value of the Christoffel symbol is unchanged when the two lower indices are swapped).

Hence, there are 40 distinct combinations, 31 of which reduce to zero. The complete list of non-zero Christoffel symbols in $G_{\sigma \tau }$ is:^[1]^{: 237–255}

\left.{\begin{aligned}&\Gamma _{11}^{1}={\tfrac {1}{2}}\lambda '\\&\Gamma _{12}^{2}=\Gamma _{21}^{2}={\frac {1}{r}}\\&\Gamma _{13}^{3}=\Gamma _{31}^{3}={\frac {1}{r}}\\&\Gamma _{14}^{4}=\Gamma _{41}^{4}={\tfrac {1}{2}}\nu '\\&\Gamma _{22}^{1}=-re^{-\lambda }\\&\Gamma _{23}^{3}=\cot \theta \\&\Gamma _{33}^{1}=-r\sin ^{2}\theta \cdot e^{-\lambda }\\&\Gamma _{33}^{2}=-\sin \theta \cdot \cos \theta \\&\Gamma _{44}^{1}={\tfrac {1}{2}}e^{\nu -\lambda }\cdot \nu '\end{aligned}}\right\}

(Q4)

where $\nu '\equiv {\frac {\partial \nu }{\partial x^{1}}}\equiv {\frac {\partial \nu }{\partial r}}.$ After dropping all of the (over 200) zero terms from (Q3), there remain only five equations with a much reduced number of terms. Here are the remaining equations of $G_{\sigma \tau }=0$ after the zero terms have been eliminated:^[1]^{: 237–255}

{\begin{aligned}G_{11}=&\;0\\=&\;\Gamma _{11}^{1}\Gamma _{11}^{1}+\Gamma _{12}^{2}\Gamma _{21}^{2}+\Gamma _{13}^{3}\Gamma _{31}^{3}+\Gamma _{14}^{4}\Gamma _{41}^{4}\\&-{\frac {\partial }{\partial x^{1}}}\Gamma _{11}^{1}+{\frac {\partial ^{2}}{\partial (x^{1})^{2}}}\ln {\sqrt {-g}}\\&-\Gamma _{11}^{1}{\frac {\partial }{\partial x^{1}}}\ln {\sqrt {-g}}\end{aligned}}

{\begin{aligned}G_{22}=&\;0\\=&\;2\,\Gamma _{22}^{1}\Gamma _{12}^{2}+\Gamma _{23}^{3}\Gamma _{23}^{3}\\&-{\frac {\partial }{\partial x^{1}}}\Gamma _{22}^{1}+{\frac {\partial ^{2}}{\partial (x^{2})^{2}}}\ln {\sqrt {-g}}\\&-\Gamma _{22}^{1}{\frac {\partial }{\partial x^{1}}}\ln {\sqrt {-g}}\end{aligned}}

{\begin{aligned}G_{33}=&\;0\\=&\;2\,\Gamma _{33}^{1}\Gamma _{13}^{3}+2\,\Gamma _{33}^{2}\Gamma _{23}^{3}\\&-\Gamma _{33}^{1}{\frac {\partial }{\partial x^{1}}}\ln {\sqrt {-g}}\\&-\Gamma _{33}^{2}{\frac {\partial }{\partial x^{2}}}\ln {\sqrt {-g}}\end{aligned}}

{\begin{aligned}G_{44}=&\;0\\=&\;2\,\Gamma _{44}^{1}\Gamma _{14}^{4}-{\frac {\partial }{\partial x^{1}}}\Gamma _{44}^{1}\\&-\Gamma _{44}^{1}{\frac {\partial }{\partial x^{1}}}\ln {\sqrt {-g}}\end{aligned}}

{\begin{aligned}G_{12}=&\;0\\=&\,\Gamma _{13}^{3}\Gamma _{23}^{3}-\Gamma _{12}^{2}{\frac {\partial }{\partial x^{2}}}\ln {\sqrt {-g}}\end{aligned}}

We now substitute into the above five equations the values from (Q4) and the value of $g$ from (Q2):^[1]^{: 237–255}

{\begin{aligned}G_{11}=&\;0\\=&\;{\tfrac {1}{4}}\lambda '^{2}+{\frac {1}{r^{2}}}+{\frac {1}{r^{2}}}+{\tfrac {1}{4}}\nu '^{2}-{\tfrac {1}{2}}\lambda ''\\&\;+\left({\tfrac {1}{2}}\lambda ''+{\tfrac {1}{2}}\nu ''-{\frac {2}{r^{2}}}\right)-{\tfrac {1}{2}}\lambda '\left({\tfrac {1}{2}}\lambda '+{\tfrac {1}{2}}\nu '+{\frac {2}{r}}\right)\\=&\;{\tfrac {1}{4}}\nu '^{2}+{\tfrac {1}{2}}\nu ''-{\tfrac {1}{4}}\lambda '\nu '-{\frac {\lambda '}{r}}\end{aligned}}

{\begin{aligned}G_{22}=&\;0\\=&\;e^{-\lambda }\left[1+{\tfrac {1}{2}}r\left(\nu '-\lambda '\right)\right]-1\end{aligned}}

{\begin{aligned}G_{33}=&\;0\\=&\;\sin ^{2}\theta \cdot e^{-\lambda }\left[1+{\tfrac {1}{2}}r\left(\nu '-\lambda '\right)\right]-\sin ^{2}\theta \end{aligned}}

{\begin{aligned}G_{44}=&\;0\\=&\;e^{\nu -\lambda }\left(-{\tfrac {1}{2}}\nu ''+{\tfrac {1}{4}}\lambda '\nu '-{\tfrac {1}{4}}\nu '^{2}-{\frac {\nu '}{r}}\right)\end{aligned}}

where $\lambda ''={\frac {\partial ^{2}\lambda }{\partial r^{2}}}\,$ and $\,\nu ''={\frac {\partial ^{2}\nu }{\partial r^{2}}}$ ^{[note 15]}

On the other hand,

G_{12}={\frac {1}{r}}\cot \theta -{\frac {1}{r}}\cot \theta

which is identically zero and is therefore eliminated, leaving four equations.

Also note that the expression for $G_{33}$ contains the expression for $G_{22}.$ The two equations are not independent, so we are left with only three independent equations.

Solving for e^λ and e^ν: The Schwarzschild metric

If we divide $G_{44}$ by $e^{\nu -\lambda }$ and add to $G_{11},$ we get

\lambda '=-\nu '

(Q5)
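The cancellation behind (Q5) can be made explicit: when the reduced $G_{44}$ is divided by $e^{\nu -\lambda }$ and added to the reduced $G_{11},$ every term containing $\nu '',\,\nu '^{2}$ or $\lambda '\nu '$ drops out, leaving $-(\lambda '+\nu ')/r.$ The short Python sketch below checks this identity for arbitrary numbers; the function names and test values are illustrative assumptions.

```python
def g11_reduced(r, lam_p, nu_p, nu_pp):
    """Reduced G_11 in terms of lambda', nu', nu'' (primes are d/dr)."""
    return 0.25 * nu_p**2 + 0.5 * nu_pp - 0.25 * lam_p * nu_p - lam_p / r

def g44_over_exp(r, lam_p, nu_p, nu_pp):
    """Reduced G_44 after division by exp(nu - lambda)."""
    return -0.5 * nu_pp + 0.25 * lam_p * nu_p - 0.25 * nu_p**2 - nu_p / r

def combined(r, lam_p, nu_p, nu_pp):
    """Sum used to obtain (Q5); should equal -(lambda' + nu') / r."""
    return g11_reduced(r, lam_p, nu_p, nu_pp) + g44_over_exp(r, lam_p, nu_p, nu_pp)

# Arbitrary test values (assumptions): the sum does not depend on nu''.
print(combined(2.0, 0.3, -0.7, 1.1), -(0.3 + (-0.7)) / 2.0)
```

Since both $G_{11}$ and $G_{44}$ must vanish, the sum must vanish as well, which requires $\lambda '=-\nu '.$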

Integrating (Q5) yields $\,\lambda =-\nu +C\,$ where $C$ is a constant of integration. The value of the constant can be found by noting the following boundary condition on (P2): At points infinitely distant from gravitating masses, spacetime is flat so that the coefficients $e^{\lambda }$ and $e^{\nu }$ of $dr^{2}$ and $dt^{2}$ are both equal to one, i.e.

{\begin{aligned}&ds^{2}=-dr^{2}-r^{2}(d\theta ^{2}+\sin ^{2}\theta \cdot d\phi ^{2})+dt^{2}\\&{\text{or}}\\&ds^{2}=-(dx^{1})^{2}-(x^{1})^{2}((dx^{2})^{2}+\sin ^{2}x^{2}\cdot (dx^{3})^{2})+(dx^{4})^{2}\end{aligned}}

(Q6)

Infinitely distant from gravitating masses, therefore, $\lambda =-\nu =0$ and so $C$ must be zero.^[1]^{: 237–255} Hence,

\lambda =-\nu

(Q7)

Substituting (Q5) and (Q7) into the expression for $G_{22}$ above yields

{\begin{aligned}G_{22}&=0\\&=e^{\nu }(1+r\nu ')-1\end{aligned}}

which informs us that

e^{\nu }(1+r\nu ')=1

(Q8)

Let $\;\gamma =e^{\nu }\,$ which implies $\,\gamma '=e^{\nu }\nu '.\,$ Substituting into (Q8) and rearranging, we get the separable differential equation $\,\gamma +r\gamma '=1\,$ which yields

\gamma =1-{\frac {2m}{r}}

(Q9)

where $2m$ is a constant of integration expressed as such for reasons that will be discussed later on.^{[note 16]}
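One way to see where (Q9) comes from is to note that $\gamma +r\gamma '=d(r\gamma )/dr,$ so that the differential equation integrates at once to $r\gamma =r-2m.$ As an independent check, the Python sketch below integrates $\gamma '=(1-\gamma )/r$ numerically with a classical fourth-order Runge–Kutta scheme and compares the result with (Q9). The function names, the value $m=1$ and the integration range are assumptions made only for illustration.

```python
def gamma_exact(r, m):
    """Closed form (Q9)."""
    return 1.0 - 2.0 * m / r

def integrate_gamma(r0, gamma0, r1, steps=1000):
    """Integrate gamma' = (1 - gamma) / r from r0 to r1 with RK4."""
    def slope(r, g):
        return (1.0 - g) / r

    h = (r1 - r0) / steps
    r, g = r0, gamma0
    for _ in range(steps):
        k1 = slope(r, g)
        k2 = slope(r + h / 2, g + h * k1 / 2)
        k3 = slope(r + h / 2, g + h * k2 / 2)
        k4 = slope(r + h, g + h * k3)
        g += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        r += h
    return g

m = 1.0   # hypothetical geometric mass, in the same length units as r
numeric = integrate_gamma(10.0, gamma_exact(10.0, m), 20.0)
print(numeric, gamma_exact(20.0, m))
```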

We have thus determined $e^{\lambda }$ and $e^{\nu }$ :

e^{\nu }\;=\;1/e^{\lambda }\;=\;\gamma \;=\;

1-{\frac {2m}{r}}\;=\;1-{\frac {2m}{x^{1}}}

Equation (P2) therefore becomes

ds^{2}=-\gamma ^{-1}dr^{2}

-\;r^{2}(d\theta ^{2}+\sin ^{2}\theta \cdot d\phi ^{2})+\gamma dt^{2}

{\text{or}}

(Q10)

ds^{2}=-(1-{\frac {2m}{r}})^{-1}dr^{2}

-\;r^{2}(d\theta ^{2}+\sin ^{2}\theta \cdot d\phi ^{2})+(1-{\frac {2m}{r}})dt^{2}

This is the famous Schwarzschild metric.^[1]^{: 237–255}
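It is instructive to confirm that the result really does satisfy the three independent reduced equations $G_{11}=G_{22}=G_{44}=0.$ The Python sketch below inserts $\nu =\ln(1-2m/r)$ and $\lambda =-\nu$ into the reduced expressions given earlier. The function names and sample radii are illustrative assumptions, and the derivatives of $\nu$ were worked out by hand for this check.

```python
import math

def nu_and_derivatives(r, m):
    """nu = ln(1 - 2m/r) with its first and second r-derivatives."""
    nu = math.log(1.0 - 2.0 * m / r)
    nu_p = 2.0 * m / (r * (r - 2.0 * m))
    nu_pp = -4.0 * m * (r - m) / (r * (r - 2.0 * m))**2
    return nu, nu_p, nu_pp

def reduced_field_equations(r, m):
    """Return (G_11, G_22, G_44) for the Schwarzschild functions."""
    nu, nu_p, nu_pp = nu_and_derivatives(r, m)
    lam, lam_p = -nu, -nu_p                      # (Q7) and (Q5)
    g11 = 0.25 * nu_p**2 + 0.5 * nu_pp - 0.25 * lam_p * nu_p - lam_p / r
    g22 = math.exp(-lam) * (1.0 + 0.5 * r * (nu_p - lam_p)) - 1.0
    g44 = math.exp(nu - lam) * (-0.5 * nu_pp + 0.25 * lam_p * nu_p
                                - 0.25 * nu_p**2 - nu_p / r)
    return g11, g22, g44

for radius in (3.0, 10.0, 1000.0):   # sample radii outside r = 2m, with m = 1
    print(radius, reduced_field_equations(radius, 1.0))
```

All three values should vanish to within floating-point rounding at every radius $r>2m.$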

Anomalous perihelion precession of Mercury

Movement along geodesics

Figure 6–8. Calculus of variations

According to Newton's laws of motion, a planet orbiting the Sun would move in a straight line except for being pulled off course by the Sun's gravity. According to general relativity, there is no such thing as gravitational force. Rather, as discussed in section Basic propositions, a planet orbiting the Sun continuously follows the local "nearest thing to a straight line", which is to say, it follows a geodesic path.^[1]^{: 255–265}

Finding the equation of a geodesic requires knowing something about the calculus of variations, which is outside the scope of the typical undergraduate math curriculum, so we will not go into details of the analysis.^{[note 17]}

Determining the straightest path between two points resembles the task of finding the maximum or minimum of a function. In ordinary calculus, given the function $y=f(x),\,$ an "extremum" or "stationary point" may be found wherever the derivative of the function is zero.

In the calculus of variations, we instead seek the path that makes a functional (here, the length of the path between fixed start and end points) stationary. In the example shown in Fig. 6–8, this means finding the function for which

\delta \int _{A}^{B}ds=0

where $\delta$ is the variation and the integral of $ds$ is the length of the world-line.

Skipping the details of the derivation, the general formula for the equation of a geodesic is^[4]^: 103

{\frac {d^{2}x^{\sigma }}{ds^{2}}}+\Gamma _{\alpha \beta }^{\sigma }{\frac {dx^{\alpha }}{ds}}{\frac {dx^{\beta }}{ds}}=0

(R1)

valid for all dimensionalities and shapes of space(time). As a geometric expression, the derivative is with respect to the line element, whereas classical theory involves time derivatives.^[4]^: 103

Let us consider a flat, three dimensional Euclidean space using Cartesian coordinates. For such a space,

g_{11}=g_{22}=g_{33}=1\quad {\text{and}}\quad g_{\mu \nu }=0\quad {\text{for}}\quad \mu \neq \nu

The derivatives of the $g\,{\text{'s}}$ in the Christoffel symbol (K1) are all zero, so (R1) becomes

{\frac {d^{2}x^{\sigma }}{ds^{2}}}=0\quad \quad (n=3)

(R2)

After replacing $ds$ by the proper time $dt$ (the time along the timelike world line, i.e. the time experienced by the moving object) and expanding (R2), we get

{\frac {d^{2}x^{1}}{dt^{2}}}=0,\quad {\frac {d^{2}x^{2}}{dt^{2}}}=0,\quad {\frac {d^{2}x^{3}}{dt^{2}}}=0

(R3)

which is to say, an object freely moving in Euclidean three-space travels with unaccelerated motion along a straight line.^[1]^{: 255–265}

Orbital motion: Stability of the orbital plane

Equation (R1) is a general expression for the geodesic. To apply it to the gravitational field around the Sun, the $g\,{\text{'s}}$ in the Christoffel symbols must be replaced with those specific to the Schwarzschild metric.^[1]^{: 266–268}

Equations (Q4) present the values of $\Gamma _{\alpha \beta }^{\sigma }$ in terms of $\lambda ,\,\nu ,\,r,\,\theta$ while (Q7) allows simplification of the expression to terms of $\nu ,\,r,\,\theta .\,$ Since $e^{\nu }=\gamma$ and (Q9) allows us to express $\gamma$ in terms of $r$ , we can thus express $\Gamma _{\alpha \beta }^{\sigma }$ in terms of $r$ and $\theta .$

Remember that (R1) is actually four equations. In particular, $x^{\sigma }$ for $\sigma =2$ corresponds to $\theta$ in Fig. 6–7. Suppose we launch an object into orbit around the Sun with $\theta =\pi /2$ and an initial velocity in the $xy$ plane. How would the object subsequently behave? Equation (R1) for $x^{2}\equiv \theta$ becomes

{\frac {d^{2}\theta }{ds^{2}}}+\Gamma _{\alpha \beta }^{2}{\frac {dx^{\alpha }}{ds}}{\frac {dx^{\beta }}{ds}}=0

(R4)

From (Q4), we know that the non-zero Christoffel symbols for $\sigma =2$ are

\Gamma _{12}^{2}=\Gamma _{21}^{2}={\frac {1}{r}}

and

\Gamma _{33}^{2}=-\sin \theta \cdot \cos \theta

so that in summing (R4) over all values of $\alpha$ and $\beta ,$ we get

{\frac {d^{2}\theta }{ds^{2}}}+{\frac {2}{r}}{\frac {dr}{ds}}{\frac {d\theta }{ds}}-\sin \theta \cdot \cos \theta \left({\frac {d\phi }{ds}}\right)^{2}=0

(R5)

Since we stipulated an initial $\theta =\pi /2$ and an initial velocity in the $xy$ plane, $\cos \theta =0$ and $d\theta /ds=0$ so that (R5) becomes

{\frac {d^{2}\theta }{ds^{2}}}=0

(R6)

In other words, a planet launched into orbit around the Sun remains in the same plane in which it was launched, just as in Newtonian physics.^[1]^{: 266–268}

Orbital motion: Modified Keplerian ellipses

Starting with (R1), we explore the behavior of the other variables of the geodesic equation applied to the Schwarzschild metric:^[1]^{: 268–272}^[6]^{: 147–150}

For $\sigma =1,$ (R1) becomes

{\frac {d^{2}x^{1}}{ds^{2}}}+\Gamma _{11}^{1}\left({\frac {dx^{1}}{ds}}\right)^{2}+\Gamma _{22}^{1}\left({\frac {dx^{2}}{ds}}\right)^{2}

+\;\Gamma _{33}^{1}\left({\frac {dx^{3}}{ds}}\right)^{2}+\Gamma _{44}^{1}\left({\frac {dx^{4}}{ds}}\right)^{2}=0

or

{\frac {d^{2}r}{ds^{2}}}+{\tfrac {1}{2}}\lambda '\left({\frac {dr}{ds}}\right)^{2}-re^{-\lambda }\left({\frac {d\theta }{ds}}\right)^{2}

-\;r\cdot \sin ^{2}\theta \cdot e^{-\lambda }\left({\frac {d\phi }{ds}}\right)^{2}+{\tfrac {1}{2}}e^{\nu -\lambda }\nu '\left({\frac {dt}{ds}}\right)^{2}=0

Since we have stipulated that $\theta =\pi /2,\;$ $d\theta /ds=0\,$ and $\,\sin \theta =1,\,$ the above equation therefore becomes

{\frac {d^{2}r}{ds^{2}}}+{\tfrac {1}{2}}\lambda '\left({\frac {dr}{ds}}\right)^{2}-re^{-\lambda }\left({\frac {d\phi }{ds}}\right)^{2}+{\tfrac {1}{2}}e^{\nu -\lambda }\nu '\left({\frac {dt}{ds}}\right)^{2}=0

(R7)

Likewise, for $\sigma =3\,$ and $\,\sigma =4,\,$ we get

{\frac {d^{2}\phi }{ds^{2}}}+{\frac {2}{r}}{\frac {dr}{ds}}{\frac {d\phi }{ds}}=0

(R8)

{\frac {d^{2}t}{ds^{2}}}+\nu '{\frac {dr}{ds}}{\frac {dt}{ds}}=0

(R9)

(Q10), (R7), (R8), and (R9) may be combined to get:^[1]^{: 335–336}^[2]^{: 195–196}

\left.{\begin{aligned}&{\frac {d^{2}u}{d\phi ^{2}}}+u={\frac {m}{h^{2}}}+3mu^{2}\\&r^{2}{\frac {d\phi }{ds}}=h\end{aligned}}\right\}

(R10)

where $m$ and $h$ are constants of integration and $u=1/r.$

The equations above are those of an object in orbit around a central mass. The second of the two equations is essentially a statement of the conservation of angular momentum. The first of the two equations is expressed in this form so that it may be compared with the Binet equation, devised by Jacques Binet in the 1800s while exploring the shapes of orbits under alternative force laws.

For an inverse square law, the Binet equation predicts, in agreement with Newton, that orbits are conic sections.^[1]^{: 336–338} Given a Newtonian inverse square law, the equations of motion are:

\left.{\begin{aligned}&{\frac {d^{2}u}{d\phi ^{2}}}+u={\frac {m}{h^{2}}}\\&r^{2}{\frac {d\phi }{dt}}=h\end{aligned}}\right\}

(R11)

where $m$ is the mass of the Sun, $r$ is the orbital radius, and $d\phi /dt$ is the angular velocity of the planet.

The relativistic equations for orbital motion (R10) are observed to be nearly identical to the Newtonian equations (R11) except for the presence of $3mu^{2}$ in the relativistic equations and the use of $ds$ rather than $dt.$

The Binet equation provides the physical meaning of $m,$ which we had introduced as an arbitrary constant of integration in the derivation of the Schwarzschild metric in (Q9).^[1]^{: 268–272}^[6]^{: 147–150}

Orbital motion: Anomalous precession

Figure 6–9. Perihelion precession

The presence of the term $3mu^{2}$ in (R10) means that the orbit does not form a closed loop, but rather shifts slightly with each revolution, as illustrated (in much exaggerated form) in Fig. 6–9.^[1]^{: 272–276}^[2]^{: 195–198}
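The effect of the extra term can be seen by integrating the first equation of (R10) numerically. With $p=h^{2}/m$ and the rescaled variable $w=pu,$ that equation becomes $w''+w=1+3(m/p)w^{2}.$ The Python sketch below starts at a perihelion, integrates with a fourth-order Runge–Kutta scheme, and reports how far beyond $2\pi$ the next perihelion occurs. It is an illustration under stated assumptions: the function name is hypothetical, the ratio $m/p=10^{-3}$ is deliberately exaggerated (for Mercury it is of order $10^{-8}$), and the starting value $w=1+e$ treats $e$ only as an approximate eccentricity.

```python
import math

def perihelion_advance(m_over_p, e, step=1e-3):
    """Angle beyond 2*pi between successive perihelia of w'' + w = 1 + 3(m/p)w^2."""
    def accel(w):
        return 1.0 - w + 3.0 * m_over_p * w * w

    def rk4(w, v, h):
        k1w, k1v = v, accel(w)
        k2w, k2v = v + 0.5 * h * k1v, accel(w + 0.5 * h * k1w)
        k3w, k3v = v + 0.5 * h * k2v, accel(w + 0.5 * h * k2w)
        k4w, k4v = v + h * k3v, accel(w + h * k3w)
        return (w + h * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0,
                v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0)

    # Start at perihelion: u (hence w) is at a maximum, so dw/dphi = 0.
    phi, w, v = 0.0, 1.0 + e, 0.0
    while True:
        w_new, v_new = rk4(w, v, step)
        if v > 0.0 and v_new <= 0.0:
            # dw/dphi changes sign from + to -: next perihelion.
            # Linear interpolation locates the crossing inside the step.
            fraction = v / (v - v_new)
            return phi + fraction * step - 2.0 * math.pi
        phi, w, v = phi + step, w_new, v_new

advance = perihelion_advance(1e-3, 0.2)
print(advance, 6.0 * math.pi * 1e-3)   # numerical value vs first-order estimate
```

To first order in $m/p$ the advance per orbit is $6\pi m/p,$ and the numerical result should agree with this to within about a percent for the parameters chosen here. Setting `m_over_p` to zero recovers a closed Newtonian ellipse with no advance.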

In fact, a number of effects in the Solar System cause planetary orbits to deviate from closed Keplerian ellipses even in the absence of relativity. Newtonian theory predicts closed ellipses only for an isolated two-body system. The planets perturb each other's orbits, so that Mercury's perihelion, for instance, would precess by slightly over 532 arcsec/century due to these Newtonian effects.^[11]

In 1859, Urbain Le Verrier, after extensive analysis of historical data on timed transits of Mercury over the Sun's disk from 1697 to 1848, concluded that the precession of Mercury's orbit exceeded that predicted by these Newtonian effects by 38 arcseconds/century. (This estimate was later refined to 43 arcseconds/century by Simon Newcomb in 1882.) Over the next half-century, extensive observations definitively ruled out the hypothetical planet Vulcan, which Le Verrier had proposed as orbiting between Mercury and the Sun to account for this discrepancy.

Starting from (R10), the excess angular advance of Mercury's perihelion per orbit may be calculated:^[1]^{: 338–341}^[2]^{: 195–198}

\Delta \phi _{orbit}={\frac {6\pi m}{a(1-e^{2})}}={\frac {6\pi GM/c^{2}}{a(1-e^{2})}}\;,

(R12)

The first equality is in relativistic units, while the second equality is in MKS units. In the second equality, we replace $m,$ the geometric mass (units of length) with M, the mass in kilograms.

$G$ is the gravitational constant (6.672 × 10⁻¹¹ m³/kg-s²)
$M$ is the mass of the Sun (1.99 × 10³⁰ kg)
$c$ is the speed of light (2.998 × 10⁸ m/s)
$a$ is the semi-major axis of Mercury's orbit (5.791 × 10¹⁰ m)
$e$ is Mercury's orbital eccentricity (0.20563)

We find that

\Delta \phi _{orbit}=5.021\times 10^{-7}{\text{radian}}

which, since Mercury completes about 415 orbits per century, works out to 43 arcsec/century.^[1]^{: 338–341}^[2]^{: 195–198}
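The arithmetic of (R12) can be reproduced with a few lines of Python. The constants are the values quoted in the text, with $a$ (5.791 × 10¹⁰ m) taken as the semi-major axis of Mercury's orbit. Two quantities are assumptions not given in the text: Mercury's orbital period of 87.969 days and a Julian century of 36525 days. The function names are illustrative.

```python
import math

G = 6.672e-11               # gravitational constant, m^3 kg^-1 s^-2 (as quoted)
M_SUN = 1.99e30             # mass of the Sun, kg
C = 2.998e8                 # speed of light, m/s
A = 5.791e10                # semi-major axis of Mercury's orbit, m
ECC = 0.20563               # Mercury's orbital eccentricity
PERIOD_DAYS = 87.969        # assumption: Mercury's orbital period, not in the text
DAYS_PER_CENTURY = 36525.0  # assumption: Julian century

def advance_per_orbit(g, mass, c, a, e):
    """(R12): perihelion advance per orbit in radians."""
    m = g * mass / c**2     # geometric mass, in metres
    return 6.0 * math.pi * m / (a * (1.0 - e**2))

def arcsec_per_century(per_orbit, period_days):
    """Convert radians per orbit to arcseconds per century."""
    orbits = DAYS_PER_CENTURY / period_days
    return math.degrees(per_orbit * orbits) * 3600.0

delta = advance_per_orbit(G, M_SUN, C, A, ECC)
print(delta, arcsec_per_century(delta, PERIOD_DAYS))
```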

Deflection of light in a gravitational field

Figure 6–10. Deflection of light by the Sun

The most famous of the early tests of general relativity was the measurement of the gravitational deflection of starlight passing near the Sun. As noted before, anything moving freely in spacetime travels along the path of a geodesic. This includes light.

Consider Fig. 6–10. Line $AE$ represents the straight-line path of a ray of light in the absence of any large mass along its path. If the ray passes near the Sun, however, its path is deflected so that it follows the curved line $AF,\,$ which we illustrate as just grazing the Sun of radius $R.\,$ An observer situated at $F$ sees the star as apparently being at position $B$ rather than at its true position $A.$ The angle $\alpha$ is the angle between the true position of the star and its apparent position.^[1]^{: 276–289}^[2]^{: 199–201}

We have learned above, in the Spacetime interval section of this article, that the interval between two events on the world line of a particle moving at the speed of light is zero. Equations (R10) present the geodesic equation (R1) applied to the Schwarzschild metric (Q10). Substituting $\,ds=0\,$ in the second equation of (R10) gives $\,h=\infty ,\,$ which results in the first equation of (R10) becoming

{\frac {d^{2}u}{d\phi ^{2}}}+u=3mu^{2}

which is hence a differential equation describing the path of light passing by a massive spherical object. Solving this differential equation yields, in Cartesian coordinates:^[1]^{: 341–342}^[2]^{: 199–201}

x=R-{\frac {m}{R}}{\frac {x^{2}+2y^{2}}{\sqrt {x^{2}+y^{2}}}}

Given $\alpha$ a very small angle, the asymptotes of this curve are:

x=R\pm 2y{\frac {m}{R}},

where $m,\,$ in relativistic units, is a length.

The angle $\alpha$ may be calculated from the slopes of the asymptotes:

\tan \alpha ={\frac {4Rm}{R^{2}-4m^{2}}}

(S1)

which for very small $\,\alpha \,$ and $\,m\ll R\,$ becomes

\alpha ={\frac {4m}{R}}

(S2)

Plugging in $R=6.955\times 10^{5}{\text{km}}$ and $m=1.477\,{\text{km}},\,$ we get

\alpha =8.494\times 10^{-6}{\text{rad}}=1.75\,{\text{arcsec}}
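The numbers above can be reproduced, and the small-angle formula (S2) compared with the full expression (S1), with the following Python sketch. The function names are illustrative assumptions; the values of $R$ and $m$ are those quoted in the text.

```python
import math

R_SUN_KM = 6.955e5   # solar radius, km (as quoted)
M_SUN_KM = 1.477     # geometric mass of the Sun, km (as quoted)

def deflection_small_angle(m, radius):
    """(S2): alpha = 4m / R, in radians."""
    return 4.0 * m / radius

def deflection_from_asymptotes(m, radius):
    """(S1): tan(alpha) = 4Rm / (R^2 - 4m^2), solved for alpha in radians."""
    return math.atan(4.0 * radius * m / (radius**2 - 4.0 * m**2))

def to_arcsec(angle_rad):
    """Convert radians to arcseconds."""
    return math.degrees(angle_rad) * 3600.0

alpha = deflection_small_angle(M_SUN_KM, R_SUN_KM)
print(alpha, to_arcsec(alpha), deflection_from_asymptotes(M_SUN_KM, R_SUN_KM))
```

Because $m/R$ is of order $10^{-6}$ for the Sun, the two expressions differ only far beyond the precision quoted here.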

The earliest measurement of the gravitational deflection of light, the 1919 Eddington experiment, established the validity of this figure to within broad limits. Modern measurements have validated the accuracy of this prediction to the 0.03% level.^[12]

Gravitational redshift

The third of the classical tests of relativity is the prediction of gravitational red shift. This was initially thought to represent an important test of general relativity because the Schwarzschild solution was employed in its derivation. However, as demonstrated above in the section Curvature of time, red shift is predicted by any theory of gravitation that is consistent with the equivalence principle. This includes Newtonian gravitation.^[2]^{: 201–204}

The derivation presented in Curvature of time uses kinematic arguments and does not make use of the field equations. Nevertheless, it is instructive to compare the kinematic arguments presented earlier with the more geometric approach accorded by use of the Schwarzschild solution.^[6]^{: 152–154}

Let $ds$ represent the invariant proper time of the period (i.e. inverse frequency) of some well-defined spectral line of an element. We know from special relativity that although observers in different frames may measure different $dx,\,dy,\,dz,\,dt$ for an interval, the interval itself does not change with change of frame. Likewise the proper time of the period should not change with position in a gravitational potential field. Assume that a distant observer is at rest relative to an atom at the surface of the Sun as it emits light. In the Schwarzschild solution (Q10), we may write $dr=d\theta =d\phi =0,$ leaving $dt$ as the only non-zero term. The Schwarzschild solution reduces to

ds={\sqrt {1-{\frac {2m}{r}}}}dt

If $m\ll r$ ,

dt\approx (1+m/r)\,ds

(T1)

Plugging in the values for the Sun's geometric mass and radius, we conclude that the distant observer should observe the light emitted by the atom as being redshifted by a factor $1+1.477/695500=1.00000212\,.$ ^[1]^{: 289–299}
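The quality of the approximation (T1) is easy to check numerically. The Python sketch below compares the exact factor $1/{\sqrt {1-2m/r}}$ implied by the reduced Schwarzschild solution with the first-order expression $1+m/r,$ using the solar values quoted in the text. The function names are illustrative assumptions.

```python
import math

M_SUN_KM = 1.477      # geometric mass of the Sun, km (as quoted)
R_SUN_KM = 695500.0   # solar radius, km (as quoted)

def redshift_factor_exact(m, r):
    """dt/ds from ds = sqrt(1 - 2m/r) dt."""
    return 1.0 / math.sqrt(1.0 - 2.0 * m / r)

def redshift_factor_first_order(m, r):
    """(T1): dt/ds is approximately 1 + m/r when m << r."""
    return 1.0 + m / r

exact = redshift_factor_exact(M_SUN_KM, R_SUN_KM)
approx = redshift_factor_first_order(M_SUN_KM, R_SUN_KM)
print(exact, approx, exact - approx)
```

The difference between the two is of order $(m/r)^{2},$ far below the precision of the quoted factor.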

This is an extremely small factor of redshift, and confirmation took many years. See Gravitational redshift and time dilation for details.

Notes

^ D'Inverno writes that Lieber's book inspired him, as a young teenager, to take up relativity as his life's work. He warns, however: "This is a very bizarre book in appearance. The book is not set out in the usual way but rather as though it were concrete poetry. Moreover, it is interspersed by surrealist drawings by Hugh Lieber involving the symbols from the text. I must confess that at first sight the book looks rather cranky, but it is not." ^[2]^: 11
^ Lieber, as did Einstein, preferred to use subscripted $dx_{i}\,{\text{'s}}$ rather than superscripted $dx^{i}\,{\text{'s}}$ in the tensor formulas. Current practice is to use superscripts to emphasize that the $dx^{i}\,{\text{'s}}$ are displacement vectors that transform as contravariant vectors. Also, we have preferred to use $\Gamma _{\mu \nu }^{\lambda }$ rather than the outmoded $\{\mu \nu ,\lambda \}$ notation for the Christoffel symbol. Furthermore, Lieber used $B_{\sigma \tau \rho }^{\alpha }$ rather than the currently more commonly used $R_{\sigma \rho \tau }^{\alpha }=-R_{\sigma \tau \rho }^{\alpha }$ for the Riemann-Christoffel curvature tensor. Note that some textbook authors have adopted a definition of the curvature tensor that is of reverse sign to the definition adopted here.
^ The relation between the component and abstract views is rather like the relationship between analytic geometry, which uses Cartesian coordinate systems, and classic Greek geometry, which assumes a small set of intuitive axioms and fundamental definitions of points, lines, and curves, from which theorems are proven.^[4]^: 31
^ An important theorem states that if a tensor equation is true in one system of coordinates, then it is true in all systems, whether they be Cartesian, cylindrical, spherical, rotated or in relative motion, etc. This theorem provides a powerful method of proof for a tensor equation: it need only be proven true in one coordinate system (chosen for its ease of calculation) to be true in all.^[4]^: 45–46
^ To be precise, $f^{\mu }$ are assumed to be continuous, monotonic, one-to-one and infinitely differentiable, and as such, will have inverses.^[4]^: 33
^ Note: Certain superficially plausible manipulations in tensor calculus, performed by mistaken analogy with common algebraic manipulations, are in fact incorrect, as can be shown by expanding the terms following the notational rules that have been given. Contrast the following identities with the similar-looking but incorrect non-identities:^[7]^: 3
$a_{ij}(x_{j}+y_{j})\equiv a_{ij}x_{j}+a_{ij}y_{j}$

$a_{ij}(x_{i}+y_{j})\not \equiv a_{ij}x_{i}+a_{ij}y_{j}\quad$ NO!

$a_{ij}x_{i}y_{j}\equiv a_{ij}y_{j}x_{i}$

$a_{ij}x_{i}x_{j}\equiv a_{ji}x_{i}x_{j}$

$a_{ij}x_{i}y_{j}\not \equiv a_{ij}y_{i}x_{j}\quad$ NO!

$(a_{ij}+a_{ji})x_{i}x_{j}\equiv 2a_{ij}x_{i}x_{j}$

$(a_{ij}+a_{ji})x_{i}y_{j}\not \equiv 2a_{ij}x_{i}y_{j}\quad$ NO!

$(a_{ij}-a_{ji})x_{i}x_{j}\equiv 0$
^ Although one should be careful about accidentally misapplying concepts of single-variable calculus to multivariable calculus, the product rule in multivariable calculus looks almost identical to the rule in single-variable calculus: ${\frac {\partial }{\partial x}}(uv)=u{\frac {\partial v}{\partial x}}+v{\frac {\partial u}{\partial x}}$
^ Although this rearrangement of terms in the product is legitimate, various other manipulations that are common when working with full derivatives are not. In particular, one may not treat partial derivatives like fractions. Partial derivatives must be treated as complete entities whose numerators and denominators cannot be separated. So we should never pull them apart like ${\frac {\partial f}{\partial t}}=kxt^{2}\;\implies \;\partial f=kxt^{2}\partial t.$ Never do this. With full derivatives, this is permissible because full derivatives represent the ratio of two differentials. But there are no such things as partial differentials. $\partial f$ and $\partial t$ do not separately exist.
^ Except at "singular" points in space, which are points where matter is located.
^ It is sufficient to prove the Quotient Theorem true for a particular case, since it will be evident that the argument is of general application. For example, suppose $X_{\gamma \delta ...}^{\alpha \beta ...}A_{\alpha }$ is known to be a contravariant vector for all choices of the covariant vector $A_{\alpha }.$ Since $X_{\gamma \delta ...}^{\alpha \beta ...}A_{\alpha }$ is a contravariant vector, it follows the pattern of (D3):
${\bar {X}}_{\gamma \delta ...}^{\alpha \beta ...}{\bar {A}}_{\alpha }={\frac {\partial {\bar {x}}^{\beta }}{\partial x^{\nu }}}X_{\gamma \delta ...}^{\alpha \nu ...}A_{\alpha }$
Since we are given that $A_{\alpha }$ is a covariant vector,
${\bar {A}}_{\alpha }={\frac {\partial x^{\mu }}{\partial {\bar {x}}^{\alpha }}}A_{\mu }\quad$ or $\quad A_{\alpha }={\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\alpha }}}{\bar {A}}_{\mu }$
Substituting,
${\bar {X}}_{\gamma \delta ...}^{\alpha \beta ...}{\bar {A}}_{\alpha }={\frac {\partial {\bar {x}}^{\beta }}{\partial x^{\nu }}}X_{\gamma \delta ...}^{\alpha \nu ...}{\frac {\partial {\bar {x}}^{\mu }}{\partial x^{\alpha }}}{\bar {A}}_{\mu }$
Swapping the dummy indices $\alpha$ and $\mu$ on the right-hand side, then rearranging, we get
$\left[{\bar {X}}_{\gamma \delta ...}^{\alpha \beta ...}-{\frac {\partial {\bar {x}}^{\alpha }}{\partial x^{\mu }}}{\frac {\partial {\bar {x}}^{\beta }}{\partial x^{\nu }}}X_{\gamma \delta ...}^{\mu \nu ...}\right]{\bar {A}}_{\alpha }=0$
${\bar {A}}_{\alpha }$ would not generally be zero, therefore
${\bar {X}}_{\gamma \delta ...}^{\alpha \beta ...}={\frac {\partial {\bar {x}}^{\alpha }}{\partial x^{\mu }}}{\frac {\partial {\bar {x}}^{\beta }}{\partial x^{\nu }}}X_{\gamma \delta ...}^{\mu \nu ...}$
Comparison with (D4) shows that $X_{\gamma \delta ...}^{\mu \nu ...}$ transforms as a contravariant tensor of rank two.^[1]^{: 312–314}^[6]^: 94–95
^ Einstein introduced a powerful comma notation for the partial derivative of a function. He would simplify the appearance of (K1) as follows:^[5]^{: 149, 157} $\Gamma _{\mu \nu }^{\lambda }={\frac {1}{2}}g^{\lambda \alpha }\left(g_{\mu \alpha ,\nu }+g_{\nu \alpha ,\mu }-g_{\mu \nu ,\alpha }\right)$ We won't use this notation, but it is frequently found in the literature.
^ Especially in the older literature, one often sees covariant tensors of rank one referred to as "covectors", while contravariant tensors of rank one are referred to simply as "vectors".
^ The precise consequences of a finite speed of light depend on the mechanism assumed to underlie Newtonian gravitation. Laplace was considering a mechanism whereby gravity is caused by "the impulse of a fluid directed towards the centre of the attracting body". In an alternative mechanistic theory, the Earth would always be pulled toward the optical position of the Sun, which is displaced forward from its geometric position due to aberration. This would cause a pull ahead of the Earth, which would cause the orbit of the Earth to rapidly spiral outward. In reality, however, any finite speed of gravity would result in the violation of conservation of energy and conservation of angular momentum. Gravitational wave astronomers have confirmed that the speed of gravity equals c to a high degree of accuracy. The seeming paradox between the measured finite speed of gravity and the stability of the Earth's orbit is resolved by general relativity.
^ In the older literature, the recommended pronunciation is often given as "nabla square".
^ ${\frac {\partial }{\partial r}}\ln {\sqrt {-g}}={\frac {\partial }{\partial r}}(\ln {\sqrt {e^{\lambda +\nu }r^{4}\sin ^{2}\theta }})=\,$ ${\tfrac {1}{2}}\lambda '+{\tfrac {1}{2}}\nu '+{\frac {2}{r}}$
$\;{\frac {\partial ^{2}}{\partial r^{2}}}\ln {\sqrt {-g}}=\,$ ${\tfrac {1}{2}}\lambda ''+{\tfrac {1}{2}}\nu ''-{\frac {2}{r^{2}}}$
^ The constant $m$ is the mass of the central particle in relativistic units.^[1]^{: 315–316} It has dimensions of length and is often called the geometric mass. The identification of $m$ with geometric mass is often expressed as a boundary condition argument, for instance in Adler (2021),^[4]^{: 125–129} but in actuality, as explained in D'Inverno (1992),^[2]^{: 186–190} the field equations force this interpretation.
^ Very basic treatments of the subject may be found in D'Inverno (1992)^[2]^{: 82–83, 99–101} and in Lawden (2002).^[6]^{: 114–117}
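The scalar index identities listed in the note on superficially plausible manipulations can be spot-checked numerically by writing out the implied double sums. The sketch below is illustrative only: the 2×2 array `a` and the vectors `x` and `y` are arbitrary values chosen for this example (with `a` deliberately non-symmetric), and the helper `contract` is a hypothetical name, not notation from the cited sources.

```python
# Arbitrary illustrative values (not from the source); a is non-symmetric.
a = [[1, 2], [3, 4]]
x = [1, 2]
y = [3, 5]
n = len(x)


def contract(term):
    """Sum term(i, j) over both repeated indices i and j."""
    return sum(term(i, j) for i in range(n) for j in range(n))


a_xy = contract(lambda i, j: a[i][j] * x[i] * y[j])    # a_ij x_i y_j
a_yx = contract(lambda i, j: a[i][j] * y[i] * x[j])    # a_ij y_i x_j
a_xx = contract(lambda i, j: a[i][j] * x[i] * x[j])    # a_ij x_i x_j
at_xx = contract(lambda i, j: a[j][i] * x[i] * x[j])   # a_ji x_i x_j
sym_xx = contract(lambda i, j: (a[i][j] + a[j][i]) * x[i] * x[j])
sym_xy = contract(lambda i, j: (a[i][j] + a[j][i]) * x[i] * y[j])
anti_xx = contract(lambda i, j: (a[i][j] - a[j][i]) * x[i] * x[j])

# Identities: these hold for any a and x.
print(a_xx == at_xx)        # a_ij x_i x_j == a_ji x_i x_j
print(sym_xx == 2 * a_xx)   # (a_ij + a_ji) x_i x_j == 2 a_ij x_i x_j
print(anti_xx == 0)         # (a_ij - a_ji) x_i x_j == 0

# Non-identities: these fail for the values chosen here.
print(a_xy == a_yx)         # a_ij x_i y_j vs a_ij y_i x_j
print(sym_xy == 2 * a_xy)   # (a_ij + a_ji) x_i y_j vs 2 a_ij x_i y_j
```

A single numerical counterexample is enough to refute a proposed identity, whereas agreement for one choice of values does not prove one; the true identities follow from relabelling the dummy indices.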

Additional details

References

^ Lieber, Lillian R. (2008). The Einstein Theory of Relativity (1st Paul Dry Books ed.). Philadelphia: Paul Dry Books. ISBN 978-1-58988-044-3.
^ D'Inverno, Ray (1992). Introducing Einstein's Relativity. Oxford: Oxford University Press. ISBN 978-0-19-859686-8.
^ Schutz, Bernard (2009). A First Course in General Relativity (2nd ed.). Cambridge: Cambridge University Press. ISBN 978-0-521-88705-2.
^ Adler, Ronald J. (2021). General Relativity and Cosmology: A First Encounter. Switzerland: Springer. ISBN 978-3-030-61573-4.
^ Grøn, Øyvind; Næss, Arne (2011). Einstein's Theory: A Rigorous Introduction for the Mathematically Untrained. New York: Springer. ISBN 978-1-4614-0705-8.
^ Lawden, D. F. (2002). Introduction to Tensor Calculus, Relativity and Cosmology (3rd ed.). New York: Dover Publications, Inc. ISBN 978-0-486-42540-5.
^ Kay, David C. (2011). Tensor Calculus. New York: McGraw Hill. ISBN 978-0-07-175603-7.
^ Hentschke, Reinhard; Hölbling, Christian (2020). A Short Course in General Relativity and Cosmology. Switzerland: Springer Nature. ISBN 978-3-030-46383-0.
^ Stachel, John (2002). "The Rigidly Rotating Disk as the "Missing Link" in the History of General Relativity". Einstein from 'B' to 'Z'. Boston: Birkhäuser. pp. 245–260. ISBN 0-8176-4143-2.
^ Laplace, P. S. (1805). A Treatise in Celestial Mechanics. Volume IV, Book X, Chapter VII. Translated by N. Bowditch (Chelsea, New York, 1966).
^ Park, Ryan S.; et al. (2017). "Precession of Mercury's Perihelion from Ranging to the MESSENGER Spacecraft". The Astronomical Journal. 153 (3): 121. Bibcode:2017AJ....153..121P. doi:10.3847/1538-3881/aa5be2. hdl:1721.1/109312.
^ Fomalont, E. B.; Kopeikin, S. M.; Lanyi, G.; Benson, J. (July 2009). "Progress in Measurements of the Gravitational Bending of Radio Waves Using the VLBA". Astrophysical Journal. 699 (2): 1395–1402. arXiv:0904.3992. Bibcode:2009ApJ...699.1395F. doi:10.1088/0004-637X/699/2/1395. S2CID 4506243.


