Legendre transformation


In mathematics, the Legendre transformation (or Legendre transform), first introduced by Adrien-Marie Legendre in 1787 when studying the minimal surface problem,[1] is an involutive transformation on real-valued functions that are convex on a real variable. Specifically, if a real-valued multivariable function is convex on one of its independent real variables, then the Legendre transform with respect to this variable is applicable to the function.

Figure: The function $f(x)$ is defined on the interval $[a, b]$. For a given $p$, the difference $px - f(x)$ takes its maximum at $x'$. Thus, the Legendre transformation of $f(x)$ is $f^*(p) = px' - f(x')$.

In physical problems, the Legendre transform is used to convert functions of one quantity (such as position, pressure, or temperature) into functions of the conjugate quantity (momentum, volume, and entropy, respectively). In this way, it is commonly used in classical mechanics to derive the Hamiltonian formalism out of the Lagrangian formalism (or vice versa) and in thermodynamics to derive the thermodynamic potentials, as well as in the solution of differential equations of several variables.

For sufficiently smooth functions on the real line, the Legendre transform $f^*$ of a function $f$ can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other. This can be expressed in Euler's derivative notation as

$$Df(\cdot) = \left(Df^*\right)^{-1}(\cdot),$$

where $D$ is the differentiation operator, $\cdot$ represents an argument or input to the associated function, and $(\phi)^{-1}(\cdot)$ is the inverse function satisfying $(\phi)^{-1}(\phi(x)) = x$,

or equivalently, as $f'(f^{*\prime}(x^*)) = x^*$ and $f^{*\prime}(f'(x)) = x$ in Lagrange's notation.

The generalization of the Legendre transformation to affine spaces and non-convex functions is known as the convex conjugate (also called the Legendre–Fenchel transformation), which can be used to construct a function's convex hull.

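Definition

Definition in $\mathbb{R}$

Let $I \subset \mathbb{R}$ be an interval, and $f : I \to \mathbb{R}$ a convex function; then the Legendre transform of $f$ is the function $f^* : I^* \to \mathbb{R}$ defined by

$$f^*(x^*) = \sup_{x \in I}\left(x^* x - f(x)\right), \qquad I^* = \left\{x^* \in \mathbb{R} : \sup_{x \in I}\left(x^* x - f(x)\right) < \infty\right\},$$

where $\sup$ denotes the supremum over $I$, e.g., $x$ in $I$ is chosen such that $x^* x - f(x)$ is maximized at each $x^*$, or $x^*$ is such that $x^* x - f(x)$ stays bounded above throughout $I$ (e.g., when $f(x)$ is a linear function).

The transform is always well-defined when $f(x)$ is convex. This definition requires $x^* x - f(x)$ to be bounded from above in $I$ in order for the supremum to be finite; this is the condition that determines the domain $I^*$.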
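As a minimal numerical sketch of this definition, the supremum can be approximated by a maximum over sample points of the interval. The helper name `legendre_transform`, the grid on $[-10, 10]$, and the test function $f(x) = x^2$ (whose transform is ${x^*}^2/4$) are illustrative assumptions, not part of the definition.

```python
def legendre_transform(f, xs, p):
    """Approximate f*(p) = sup_x (p*x - f(x)) by a maximum over the sample points xs."""
    return max(p * x - f(x) for x in xs)


def f(x):
    # Convex test function (an assumption for illustration); its exact transform is p**2 / 4.
    return x * x


# Sample grid on the interval I = [-10, 10] with step 0.001 (an illustrative choice).
xs = [-10 + 20 * k / 20000 for k in range(20001)]

for p in (-3.0, 0.0, 1.0, 4.0):
    approx = legendre_transform(f, xs, p)
    exact = p * p / 4
    print(p, approx, exact)
```

Because the grid is bounded, the sketch returns a finite number even for slopes at which the supremum over an unbounded interval would be infinite, so it gives no information about the domain $I^*$.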

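Definition in $\mathbb{R}^n$

The generalization to convex functions $f : X \to \mathbb{R}$ on a convex set $X \subset \mathbb{R}^n$ is straightforward: $f^* : X^* \to \mathbb{R}$ has domain

$$X^* = \left\{x^* \in \mathbb{R}^n : \sup_{x \in X}\left(\langle x^*, x\rangle - f(x)\right) < \infty\right\}$$

and is defined by

$$f^*(x^*) = \sup_{x \in X}\left(\langle x^*, x\rangle - f(x)\right), \qquad x^* \in X^*,$$

where $\langle x^*, x\rangle$ denotes the dot product of $x^*$ and $x$.

The function $f^*$ is called the convex conjugate function of $f$. For historical reasons (rooted in analytic mechanics), the conjugate variable is often denoted $p$, instead of $x^*$. If the convex function $f$ is defined on the whole line and is everywhere differentiable, then

$$f^*(p) = \sup_{x \in I}\left(px - f(x)\right) = \left.\left(px - f(x)\right)\right|_{x = (f')^{-1}(p)}$$

can be interpreted as the negative of the $y$-intercept of the tangent line to the graph of $f$ that has slope $p$.

The Legendre transformation is an application of the duality relationship between points and lines. The functional relationship specified by $f$ can be represented equally well as a set of $(x, y)$ points, or as a set of tangent lines specified by their slope and intercept values.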

Understanding the Legendre transform in terms of derivatives

For a differentiable convex function $f$ on the real line with the first derivative $f'$ and its inverse $(f')^{-1}$, the Legendre transform of $f$, $f^*$, can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other, i.e., $f' = ((f^*)')^{-1}$ and $(f^*)' = (f')^{-1}$.

To see this, first note that if $f$, as a convex function on the real line, is differentiable and $\overline{x}$ is a critical point of the function $x \mapsto px - f(x)$, then the supremum is achieved at $\overline{x}$ (by convexity, see the first figure of this article). Therefore, the Legendre transform of $f$ is $f^*(p) = p\,\overline{x} - f(\overline{x})$.

Then, suppose that the first derivative $f'$ is invertible and let the inverse be $g = (f')^{-1}$. Then for each $p$, the point $g(p)$ is the unique critical point $\overline{x}$ of the function $x \mapsto px - f(x)$ (i.e., $\overline{x} = g(p)$) because $f'(g(p)) = p$ and the function's first derivative with respect to $x$ at $g(p)$ is $p - f'(g(p)) = 0$. Hence we have $f^*(p) = p\,g(p) - f(g(p))$ for each $p$. By differentiating with respect to $p$, we find

$$(f^*)'(p) = g(p) + p\,g'(p) - f'(g(p))\,g'(p).$$

Since $f'(g(p)) = p$, this simplifies to $(f^*)'(p) = g(p) = (f')^{-1}(p)$. In other words, $(f^*)'$ and $f'$ are inverses of each other.

In general, if $h'$ is the inverse of $f'$, i.e., $h' = (f')^{-1}$, then $h' = (f^*)'$, so integration gives $f^* = h + c$ with a constant $c$.

In practical terms, given $f(x)$, the parametric plot of $x f'(x) - f(x)$ versus $f'(x)$ amounts to the graph of $f^*(p)$ versus $p$.
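The parametric description can be sketched numerically. The example below assumes $f(x) = e^x$, for which the transform is known in closed form as $p(\ln p - 1)$ for $p > 0$; the function names are hypothetical and chosen only for this illustration.

```python
import math


def f(x):
    return math.exp(x)


def df(x):
    # First derivative of the assumed example f(x) = exp(x).
    return math.exp(x)


def f_star_exact(p):
    # Closed-form Legendre transform of exp(x), valid for p > 0.
    return p * (math.log(p) - 1.0)


def parametric_point(x):
    """Return the point (f'(x), x*f'(x) - f(x)) on the graph of f*."""
    p = df(x)
    return p, x * p - f(x)


points = [parametric_point(x) for x in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0)]
for p, value in points:
    print(p, value, f_star_exact(p))
```

Each printed pair of values agrees up to rounding, since $x e^x - e^x = e^x(x - 1)$ is exactly $p(\ln p - 1)$ evaluated at $p = e^x$.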

In some cases (e.g. thermodynamic potentials, below), a non-standard requirement is used, amounting to an alternative definition of $f^*$ with a minus sign,

$$f(x) - f^*(p) = xp.$$

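Formal Definition in Physics Context

In analytical mechanics and thermodynamics, the Legendre transformation is usually defined as follows: suppose $f$ is a function of $x$; then we have

$$\mathrm{d}f = \frac{\mathrm{d}f}{\mathrm{d}x}\,\mathrm{d}x.$$

Performing the Legendre transformation on this function means that we take $p = \frac{\mathrm{d}f}{\mathrm{d}x}$ as the independent variable, so that the above expression can be written as

$$\mathrm{d}f = p\,\mathrm{d}x,$$

and according to Leibniz's rule $\mathrm{d}(uv) = u\,\mathrm{d}v + v\,\mathrm{d}u$, we have

$$\mathrm{d}(f - xp) = -x\,\mathrm{d}p,$$

and taking $f^* = f - xp$, we have $\mathrm{d}f^* = -x\,\mathrm{d}p$, which means

$$\frac{\mathrm{d}f^*}{\mathrm{d}p} = -x.$$

When $f$ is a function of $n$ variables $x_1, x_2, \cdots, x_n$, we can perform the Legendre transformation on one or several of the variables: we have

$$\mathrm{d}f = p_1\,\mathrm{d}x_1 + p_2\,\mathrm{d}x_2 + \cdots + p_n\,\mathrm{d}x_n,$$

where $p_i = \frac{\partial f}{\partial x_i}$. Then if we want to perform the Legendre transformation on, e.g., $x_1$, we take $p_1$ together with $x_2, \cdots, x_n$ as independent variables, and with Leibniz's rule we have

$$\mathrm{d}(f - x_1 p_1) = -x_1\,\mathrm{d}p_1 + p_2\,\mathrm{d}x_2 + \cdots + p_n\,\mathrm{d}x_n,$$

so for the function $\varphi(p_1, x_2, \cdots, x_n) = f(x_1, x_2, \cdots, x_n) - x_1 p_1$, we have

$$\frac{\partial \varphi}{\partial p_1} = -x_1, \quad \frac{\partial \varphi}{\partial x_2} = p_2, \quad \cdots, \quad \frac{\partial \varphi}{\partial x_n} = p_n.$$

We can also do this transformation for the variables $x_2, \cdots, x_n$. If we do it to all the variables, then we have

$$\mathrm{d}\varphi = -x_1\,\mathrm{d}p_1 - x_2\,\mathrm{d}p_2 - \cdots - x_n\,\mathrm{d}p_n,$$ where $\varphi = f - x_1 p_1 - x_2 p_2 - \cdots - x_n p_n$.

In analytical mechanics, this transformation is performed on the variables $\dot{q}_1, \dot{q}_2, \cdots, \dot{q}_n$ of the Lagrangian $L(q_1, \cdots, q_n, \dot{q}_1, \cdots, \dot{q}_n)$ to get the Hamiltonian:

$$H(q_1, \cdots, q_n, p_1, \cdots, p_n) = \sum_{i=1}^{n} p_i \dot{q}_i - L(q_1, \cdots, q_n, \dot{q}_1, \cdots, \dot{q}_n),$$

and in thermodynamics, this transformation is performed on variables according to the type of thermodynamic system under study. E.g., starting from the cardinal function of state, the internal energy $U(S, V)$, we have

$$\mathrm{d}U = T\,\mathrm{d}S - p\,\mathrm{d}V,$$

and we can perform the Legendre transformation on either or both of $S, V$, yielding

$$\mathrm{d}H = \mathrm{d}(U + pV) = T\,\mathrm{d}S + V\,\mathrm{d}p,$$
$$\mathrm{d}F = \mathrm{d}(U - TS) = -S\,\mathrm{d}T - p\,\mathrm{d}V,$$
$$\mathrm{d}G = \mathrm{d}(U - TS + pV) = -S\,\mathrm{d}T + V\,\mathrm{d}p,$$

and each of these three expressions has a physical meaning.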
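To make the sign convention of this section concrete, here is a short one-variable worked example. The choice $f(x) = x^2$ is an assumption made only for illustration; the convention $f^* = f - xp$ is the one used in this section.

```latex
% Assumed example: f(x) = x^2, with the physics sign convention f^* = f - xp.
\begin{align*}
p &= \frac{\mathrm{d}f}{\mathrm{d}x} = 2x
  \quad\Longrightarrow\quad x = \frac{p}{2}, \\
f^* &= f - xp = x^2 - 2x^2 = -x^2 = -\frac{p^2}{4}, \\
\frac{\mathrm{d}f^*}{\mathrm{d}p} &= -\frac{p}{2} = -x .
\end{align*}
```

Up to the overall sign, this is the transform $p^2/4$ that the supremum definition gives for the same function.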

This definition of the Legendre transformation is the one originally introduced by Legendre in his work in 1787,[1] and it is still applied by physicists today. Indeed, this definition can be made mathematically rigorous if we treat all the variables and functions defined above, e.g. $f, x_1, \cdots, x_n, p_1, \cdots, p_n$, as differentiable functions defined on an open set of $\mathbb{R}^n$ or on a differentiable manifold, and $\mathrm{d}f, \mathrm{d}x_i, \mathrm{d}p_i$ as their differentials (which are treated as cotangent vector fields in the context of differentiable manifolds). This definition is equivalent to the modern mathematical definition, up to the sign convention, as long as $f$ is differentiable and convex in the variables $x_1, x_2, \cdots, x_n$.

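Properties

  • The Legendre transform of a convex function whose second derivative is positive everywhere is also a convex function whose second derivative is positive everywhere.
    Proof. Let us show this for a twice differentiable function $f(x)$ with positive second derivative everywhere and with a bijective (invertible) derivative.
    For a fixed $p$, let $\bar{x}$ maximize or make the function $px - f(x)$ bounded over $x$. Then the Legendre transformation of $f$ is $f^*(p) = p\bar{x} - f(\bar{x})$, thus,
    $$f'(\bar{x}) = p$$
    by the maximizing or bounding condition $\frac{d}{dx}\left(px - f(x)\right) = p - f'(x) = 0$. Note that $\bar{x}$ depends on $p$. (This can be seen in the first figure of this article.)
    Thus $\bar{x} = g(p)$ where $g \equiv (f')^{-1}$, meaning that $g$ is the inverse of $f'$, the derivative of $f$ (so $f'(g(p)) = p$).
    Note that $g$ is also differentiable with the following derivative (inverse function rule),
    $$\frac{dg(p)}{dp} = \frac{1}{f''(g(p))}.$$
    Thus, the Legendre transformation $f^*(p) = p\,g(p) - f(g(p))$ is a composition of differentiable functions, hence it is differentiable.
    Applying the product rule and the chain rule with the found equality $\bar{x} = g(p)$ yields
    $$\frac{d(f^*)}{dp} = g(p) + \left(p - f'(g(p))\right)\cdot\frac{dg(p)}{dp} = g(p),$$
    giving
    $$\frac{d^2(f^*)}{dp^2} = \frac{dg(p)}{dp} = \frac{1}{f''(g(p))} > 0,$$
    so $f^*$ is convex, with positive second derivative everywhere.
  • The Legendre transformation is an involution, i.e., $f^{**} = f$.
    Proof. By using the above identities $f'(\bar{x}) = p$, $\bar{x} = g(p)$, $f^*(p) = p\bar{x} - f(\bar{x})$ and its derivative $(f^*)'(p) = g(p)$,
    $$f^{**}(y) = \left.\left(y\,\bar{p} - f^*(\bar{p})\right)\right|_{(f^*)'(\bar{p}) = y} = g(\bar{p})\,\bar{p} - f^*(\bar{p}) = g(\bar{p})\,\bar{p} - \left(\bar{p}\,g(\bar{p}) - f(g(\bar{p}))\right) = f(g(\bar{p})) = f(y).$$
    Note that this derivation does not require the second derivative of the original function $f$ to be positive everywhere.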

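Identities

As shown above, for a convex function $f(x)$, with $x = \bar{x}$ maximizing or making $px - f(x)$ bounded at each $p$ to define the Legendre transform $f^*(p) = p\bar{x} - f(\bar{x})$ and with $g \equiv (f')^{-1}$, the following identities hold.

  • $f'(\bar{x}) = p$,
  • $\bar{x} = g(p)$,
  • $(f^*)'(p) = g(p)$.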
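These identities can be spot-checked numerically. The sketch below assumes the example $f(x) = x^4/4$ restricted to $p > 0$, so that $f'(x) = x^3$ has the inverse $g(p) = p^{1/3}$; the derivative of the transform is estimated with a central finite difference. All names are hypothetical.

```python
def f(x):
    return x ** 4 / 4.0


def df(x):
    # f'(x) for the assumed example f(x) = x**4 / 4.
    return x ** 3


def g(p):
    # Inverse of f' on p > 0 (real cube root of a positive number).
    return p ** (1.0 / 3.0)


def f_star(p):
    # f*(p) = p * x_bar - f(x_bar) evaluated at the maximizer x_bar = g(p).
    x_bar = g(p)
    return p * x_bar - f(x_bar)


def derivative(func, t, h=1e-6):
    # Central finite-difference estimate of func'(t).
    return (func(t + h) - func(t - h)) / (2.0 * h)


p = 2.0
x_bar = g(p)
print(df(x_bar), p)                 # identity f'(x_bar) = p
print(derivative(f_star, p), g(p))  # identity (f*)'(p) = g(p)
```

The finite difference is only an approximation, so the second pair of numbers agrees to roughly the accuracy of the step size rather than exactly.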

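Examples

Example 1

Figure: $f(x) = e^x$ over the domain $\mathbb{R}$ is plotted in red and its Legendre transform $f^*(x^*) = x^*(\ln(x^*) - 1)$ over the domain $(0, \infty)$ in dashed blue. Note that the Legendre transform appears convex.

Consider the exponential function $f(x) = e^x$, which has the domain $I = \mathbb{R}$. From the definition, the Legendre transform is

$$f^*(x^*) = \sup_{x \in \mathbb{R}}\left(x^* x - e^x\right), \qquad x^* \in I^*,$$
where $I^*$ remains to be determined. To evaluate the supremum, compute the derivative of $x^* x - e^x$ with respect to $x$ and set it equal to zero:
$$\frac{d}{dx}\left(x^* x - e^x\right) = x^* - e^x = 0.$$
The second derivative $-e^x$ is negative everywhere, so the maximal value is achieved at $x = \ln(x^*)$. Thus, the Legendre transform is
$$f^*(x^*) = x^* \ln(x^*) - e^{\ln(x^*)} = x^*\left(\ln(x^*) - 1\right)$$
and has domain $I^* = (0, \infty)$. This illustrates that the domains of a function and its Legendre transform can be different.

To find the Legendre transformation of the Legendre transformation of $f$,

$$f^{**}(x) = \sup_{x^* \in I^*}\left(x x^* - x^*\left(\ln(x^*) - 1\right)\right), \qquad x \in I,$$
where the variable $x$ is intentionally used as the argument of the function $f^{**}$ to show the involution property of the Legendre transform, $f^{**} = f$, we compute
$$0 = \frac{d}{dx^*}\left(x x^* - x^*\left(\ln(x^*) - 1\right)\right) = x - \ln(x^*),$$
thus the maximum occurs at $x^* = e^x$, because the second derivative of $x x^* - x^*(\ln(x^*) - 1)$ with respect to $x^*$ is $-\frac{1}{x^*} < 0$ over the domain $I^* = (0, \infty)$. As a result, $f^{**}$ is found as
$$f^{**}(x) = x e^x - e^x\left(\ln(e^x) - 1\right) = e^x,$$
thereby confirming that $f = f^{**}$, as expected.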
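The involution in this example can also be checked numerically. The sketch below takes the closed form $f^*(x^*) = x^*(\ln x^* - 1)$ on $x^* > 0$ and maximizes $x x^* - f^*(x^*)$ by a ternary search, which is valid here because that expression is strictly concave in $x^*$. The search bracket $[10^{-9}, 10^3]$ and all function names are assumptions made for the illustration.

```python
import math


def f_star(p):
    # Closed-form transform of exp(x), defined for p > 0.
    return p * (math.log(p) - 1.0)


def sup_concave(h, lo, hi, iters=200):
    """Maximize a strictly concave function h on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if h(m1) < h(m2):
            lo = m1
        else:
            hi = m2
    return h((lo + hi) / 2.0)


def f_double_star(x, lo=1e-9, hi=1e3):
    # f**(x) = sup over p > 0 of (x*p - f*(p)); the bracket must contain p = exp(x).
    return sup_concave(lambda p: x * p - f_star(p), lo, hi)


for x in (-2.0, 0.0, 1.5):
    print(x, f_double_star(x), math.exp(x))
```

The bracket limits the values of $x$ for which the sketch is meaningful: the maximizer $x^* = e^x$ has to lie inside it.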

Example 2

Let $f(x) = cx^2$ be defined on $\mathbb{R}$, where $c > 0$ is a fixed constant.

For $x^*$ fixed, the function of $x$, $x^* x - f(x) = x^* x - cx^2$, has the first derivative $x^* - 2cx$ and second derivative $-2c$; there is one stationary point at $x = x^*/(2c)$, which is always a maximum.

Thus, $I^* = \mathbb{R}$ and

$$f^*(x^*) = \frac{{x^*}^2}{4c}.$$

The first derivatives of $f$, $2cx$, and of $f^*$, $x^*/(2c)$, are inverse functions to each other. Clearly, furthermore,

$$f^{**}(x) = \frac{1}{4(1/4c)}x^2 = cx^2,$$
namely $f^{**} = f$.

Example 3

Let $f(x) = x^2$ for $x \in I = [2, 3]$.

For $x^*$ fixed, $x^* x - f(x)$ is continuous on the compact interval $I$, hence it always takes a finite maximum on it; it follows that the domain of the Legendre transform of $f$ is $I^* = \mathbb{R}$.

The stationary point at $x = x^*/2$ (found by setting the first derivative of $x^* x - f(x)$ with respect to $x$ equal to zero) is in the domain $[2, 3]$ if and only if $4 \le x^* \le 6$. Otherwise the maximum is taken either at $x = 2$ or at $x = 3$, because the second derivative of $x^* x - f(x)$ with respect to $x$ is negative, namely $-2$: for $x^* < 4$ the maximum of $x^* x - f(x)$ over $x \in [2, 3]$ is attained at $x = 2$, while for $x^* > 6$ it is attained at $x = 3$. Thus, it follows that

$$f^*(x^*) = \begin{cases} 2x^* - 4, & x^* < 4, \\ \dfrac{{x^*}^2}{4}, & 4 \le x^* \le 6, \\ 3x^* - 9, & x^* > 6. \end{cases}$$

Example 4

The function $f(x) = cx$ is convex on all of $\mathbb{R}$ (strict convexity is not required for the Legendre transformation to be well defined). Clearly $x^* x - f(x) = (x^* - c)x$ is never bounded from above as a function of $x$, unless $x^* - c = 0$. Hence $f^*$ is defined on $I^* = \{c\}$ and $f^*(c) = 0$. (The definition of the Legendre transform requires the existence of the supremum, which requires an upper bound.)

One may check involutivity: of course, $x^* x - f^*(x^*)$ is always bounded as a function of $x^* \in \{c\}$, hence $I^{**} = \mathbb{R}$. Then, for all $x$ one has

$$\sup_{x^* \in \{c\}}\left(x x^* - f^*(x^*)\right) = xc,$$
and hence $f^{**}(x) = cx = f(x)$.

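Example 5

As an example of a convex continuous function that is not everywhere differentiable, consider $f(x) = |x|$. This gives

$$f^*(x^*) = \sup_{x}\left(x x^* - |x|\right) = \max\left(\sup_{x \ge 0} x(x^* - 1),\ \sup_{x \le 0} x(x^* + 1)\right),$$
and thus $f^*(x^*) = 0$ on its domain $I^* = [-1, 1]$.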

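Example 6: several variables

Let

$$f(x) = \langle x, Ax\rangle + c$$
be defined on $X = \mathbb{R}^n$, where $A$ is a real, symmetric positive definite matrix.

Then $f$ is convex, and

$$\langle p, x\rangle - f(x) = \langle p, x\rangle - \langle x, Ax\rangle - c$$
has gradient $p - 2Ax$ and Hessian $-2A$, which is negative definite; hence the stationary point $x = A^{-1}p/2$ is a maximum.

We have $X^* = \mathbb{R}^n$, and

$$f^*(p) = \frac{1}{4}\langle p, A^{-1}p\rangle - c.$$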

Behavior of differentials under Legendre transforms

The Legendre transform is linked to integration by parts, $p\,dx = d(px) - x\,dp$.

Let $f(x, y)$ be a function of two independent variables $x$ and $y$, with the differential

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = p\,dx + v\,dy.$$

Assume that the function $f$ is convex in $x$ for all $y$, so that one may perform the Legendre transform on $f$ in $x$, with $p$ the variable conjugate to $x$ (for information, there is a relation $\left.\frac{\partial f}{\partial x}\right|_{\bar{x}} = p$, where $\bar{x}$ is a point in $x$ maximizing or making $px - f(x, y)$ bounded for given $p$ and $y$). Since the new independent variable of the transform with respect to $f$ is $p$, the differentials $dx$ and $dy$ in $df$ devolve to $dp$ and $dy$ in the differential of the transform, i.e., we build another function with its differential expressed in terms of the new basis $dp$ and $dy$.

We thus consider the function $g(p, y) = f - px$ so that

$$dg = df - p\,dx - x\,dp = -x\,dp + v\,dy,$$
$$x = -\frac{\partial g}{\partial p},$$
$$v = \frac{\partial g}{\partial y}.$$

The function $-g(p, y)$ is the Legendre transform of $f(x, y)$, where only the independent variable $x$ has been supplanted by $p$. This is widely used in thermodynamics, as illustrated below.
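A small worked example may help to see these relations at work. The function $f(x, y) = x^2 + xy$ is an assumption chosen for illustration (it is convex in $x$ for every $y$), and $g = f - px$ is the function considered in this section.

```latex
% Assumed example: f(x, y) = x^2 + x y, with g(p, y) = f - p x.
\begin{align*}
df &= (2x + y)\,dx + x\,dy
  \quad\Longrightarrow\quad p = 2x + y, \quad v = x, \\
x &= \frac{p - y}{2}, \qquad
g(p, y) = x^2 + xy - (2x + y)\,x = -x^2 = -\frac{(p - y)^2}{4}, \\
\frac{\partial g}{\partial p} &= -\frac{p - y}{2} = -x, \qquad
\frac{\partial g}{\partial y} = \frac{p - y}{2} = x = v .
\end{align*}
```

Both partial derivatives of $g$ reproduce the relations $x = -\partial g/\partial p$ and $v = \partial g/\partial y$.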

Applications

Analytical mechanics

A Legendre transform is used in classical mechanics to derive the Hamiltonian formulation from the Lagrangian formulation, and conversely. A typical Lagrangian has the form

$$L(v, q) = \tfrac{1}{2}\langle v, Mv\rangle - V(q),$$
where $(v, q)$ are coordinates on $\mathbb{R}^n \times \mathbb{R}^n$, $M$ is a positive definite real matrix, and
$$\langle x, y\rangle = \sum_j x_j y_j.$$

For every $q$ fixed, $L(v, q)$ is a convex function of $v$, while $V(q)$ plays the role of a constant.

Hence the Legendre transform of $L(v, q)$ as a function of $v$ is the Hamiltonian function,

$$H(p, q) = \tfrac{1}{2}\langle p, M^{-1}p\rangle + V(q).$$

In a more general setting, $(v, q)$ are local coordinates on the tangent bundle $T\mathcal{M}$ of a manifold $\mathcal{M}$. For each $q$, $L(v, q)$ is a convex function on the tangent space $V_q$. The Legendre transform gives the Hamiltonian $H(p, q)$ as a function of the coordinates $(p, q)$ of the cotangent bundle $T^*\mathcal{M}$; the inner product used to define the Legendre transform is inherited from the pertinent canonical symplectic structure. In this abstract setting, the Legendre transformation corresponds to the tautological one-form.
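For the typical quadratic Lagrangian in one dimension, the passage to the Hamiltonian can be sketched in a few lines. The mass $m$, the spring constant $k$, and the potential $V(q) = kq^2/2$ are assumptions chosen for illustration; the conjugate momentum is $p = \partial L/\partial v = mv$.

```python
def lagrangian(v, q, m=2.0, k=3.0):
    """L(v, q) = m v^2 / 2 - V(q), with the assumed potential V(q) = k q^2 / 2."""
    return 0.5 * m * v * v - 0.5 * k * q * q


def hamiltonian(p, q, m=2.0, k=3.0):
    """Legendre transform of L in v: evaluate p*v - L at the v solving p = dL/dv = m*v."""
    v = p / m
    return p * v - lagrangian(v, q, m, k)


def hamiltonian_closed_form(p, q, m=2.0, k=3.0):
    # Expected result p^2 / (2m) + V(q) for this Lagrangian.
    return p * p / (2.0 * m) + 0.5 * k * q * q


p, q = 1.5, -0.7
print(hamiltonian(p, q), hamiltonian_closed_form(p, q))

# The expression p*v - L(v, q) at other trial velocities never exceeds its value at v = p/m.
trial_velocities = [-5 + 0.01 * i for i in range(1001)]
is_maximum = all(
    p * v - lagrangian(v, q) <= hamiltonian(p, q) + 1e-12 for v in trial_velocities
)
print(is_maximum)
```

The last check illustrates that solving $p = \partial L/\partial v$ picks out the supremum in the definition, which relies on $L$ being convex in $v$.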

Thermodynamics

The strategy behind the use of Legendre transforms in thermodynamics is to shift from a function that depends on a variable to a new (conjugate) function that depends on a new variable, the conjugate of the original one. The new variable is the partial derivative of the original function with respect to the original variable. The new function is the difference between the original function and the product of the old and new variables. Typically, this transformation is useful because it shifts the dependence of, e.g., the energy from an extensive variable to its conjugate intensive variable, which can often be controlled more easily in a physical experiment.

For example, the internal energy $U$ is an explicit function of the extensive variables entropy $S$, volume $V$, and chemical composition $N_i$ (e.g., $i = 1, 2, 3, \ldots$)

$$U = U\left(S, V, \{N_i\}\right),$$
which has a total differential
$$dU = T\,dS - P\,dV + \sum_i \mu_i\,dN_i,$$

where $T = \left.\frac{\partial U}{\partial S}\right|_{V, N_{i\,\mathrm{all}}}$, $P = -\left.\frac{\partial U}{\partial V}\right|_{S, N_{i\,\mathrm{all}}}$, $\mu_i = \left.\frac{\partial U}{\partial N_i}\right|_{S, V, N_{j \ne i}}$.

(Subscripts are not necessary by the definition of partial derivatives but are left here to clarify which variables are held fixed.) Stipulating some common reference state, by using the (non-standard) Legendre transform of the internal energy $U$ with respect to volume $V$, the enthalpy $H$ may be obtained as follows.

To get the (standard) Legendre transform $U^*$ of the internal energy $U$ with respect to volume $V$, the function $u\left(p, S, V, \{N_i\}\right) = pV - U$ is defined first, and it is then maximized or bounded with respect to $V$. To do this, the condition $\frac{\partial u}{\partial V} = p - \frac{\partial U}{\partial V} = 0$, i.e. $p = \frac{\partial U}{\partial V}$, needs to be satisfied, so $U^* = \frac{\partial U}{\partial V}V - U$ is obtained. This approach is justified when $U$ is a convex function of $V$, as is the case for a thermodynamically stable system. The non-standard Legendre transform here is obtained by negating the standard version, so $-U^* = H = U - \frac{\partial U}{\partial V}V = U + PV$.

$H$ is a state function, as it is obtained by adding $PV$ ($P$ and $V$ being state variables) to a state function $U = U\left(S, V, \{N_i\}\right)$, so its differential is an exact differential. Because of $dH = T\,dS + V\,dP + \sum_i \mu_i\,dN_i$ and the fact that it must be an exact differential, $H = H\left(S, P, \{N_i\}\right)$.

The enthalpy is suitable for the description of processes in which the pressure is controlled from the surroundings.

It is likewise possible to shift the dependence of the energy from the extensive variable of entropy, $S$, to the (often more convenient) intensive variable $T$, resulting in the Helmholtz and Gibbs free energies. The Helmholtz free energy $A$ and the Gibbs energy $G$ are obtained by performing Legendre transforms of the internal energy and enthalpy, respectively,

$$A = U - TS,$$
$$G = H - TS = U + PV - TS.$$

The Helmholtz free energy is often the most useful thermodynamic potential when temperature and volume are controlled from the surroundings, while the Gibbs energy is often the most useful when temperature and pressure are controlled from the surroundings.

Variable capacitor

As another example from physics, consider a parallel conductive plate capacitor, in which the plates can move relative to one another. Such a capacitor would allow transfer of the electric energy which is stored in the capacitor into external mechanical work, done by the force acting on the plates. One may think of the electric charge as analogous to the "charge" of a gas in a cylinder, with the resulting mechanical force exerted on a piston.

Compute the force on the plates as a function of $x$, the distance which separates them. To find the force, compute the potential energy, and then apply the definition of force as the negative gradient of the potential energy function.

The electrostatic potential energy stored in a capacitor of capacitance $C(x)$ with a positive electric charge $+Q$ and a negative charge $-Q$ on its two conductive plates is (using the definition of the capacitance, $C = \frac{Q}{V}$)

$$U(Q, x) = \frac{1}{2}QV(Q, x) = \frac{1}{2}\frac{Q^2}{C(x)},$$

where the dependence on the area of the plates, the dielectric constant of the insulation material between the plates, and the separation $x$ are abstracted away as the capacitance $C(x)$. (For a parallel plate capacitor, this is proportional to the area of the plates and inversely proportional to the separation.)

The force $F$ between the plates due to the electric field created by the charge separation is then

$$F(x) = -\frac{dU}{dx}.$$

If the capacitor is not connected to any electric circuit, then the electric charges on the plates remain constant and the voltage varies when the plates move with respect to each other, and the force is the negative gradient of the electrostatic potential energy,

$$F(x) = \frac{1}{2}\frac{dC(x)}{dx}\frac{Q^2}{C(x)^2} = \frac{1}{2}\frac{dC(x)}{dx}V(x)^2,$$

where $V(Q, x) = V(x)$, as the charge is fixed in this configuration.

However, instead, suppose that the voltage between the plates $V$ is maintained constant as the plate moves by connection to a battery, which is a reservoir for electric charges at a constant potential difference. Then the amount of charge $Q$ is a variable instead of the voltage; $Q$ and $V$ are Legendre conjugates of each other. To find the force, first compute the non-standard Legendre transform $U^*$ with respect to $Q$ (also using $C = \frac{Q}{V}$),

$$U^* = U - \left.\frac{\partial U}{\partial Q}\right|_{x} \cdot Q = U - \frac{1}{2C(x)}\left.\frac{\partial Q^2}{\partial Q}\right|_{x} \cdot Q = U - QV = \frac{1}{2}QV - QV = -\frac{1}{2}QV = -\frac{1}{2}V^2 C(x).$$

This transformation is possible because $U = Q^2/(2C(x))$ is a convex function of $Q$. The force now becomes the negative gradient of this Legendre transform, resulting in the same force obtained from the original function $U$,

$$F(x) = -\frac{dU^*}{dx} = \frac{1}{2}\frac{dC(x)}{dx}V^2.$$

The two conjugate energies $U$ and $U^*$ happen to stand opposite to each other (their signs are opposite) only because of the linearity of the capacitance, except that now $Q$ is no longer a constant. They reflect the two different pathways of storing energy into the capacitor, resulting in, for instance, the same "pull" between a capacitor's plates.
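The agreement of the two forces can be checked numerically. The sketch below assumes an ideal parallel plate capacitor in vacuum with $C(x) = \varepsilon_0 A / x$; the plate area, separation, and voltage are arbitrary illustrative values, and the derivatives are taken by central finite differences. It compares $-\partial U/\partial x$ at fixed charge with $-\partial U^*/\partial x$ at fixed voltage, where $U = Q^2/(2C)$ and $U^* = -V^2 C/2$.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
AREA = 1e-2              # plate area in m^2 (illustrative value)


def capacitance(x):
    # Ideal parallel plate capacitor: C(x) = eps0 * A / x.
    return EPS0 * AREA / x


def energy_fixed_charge(q, x):
    # U(Q, x) = Q^2 / (2 C(x)), used when the capacitor is isolated.
    return q * q / (2.0 * capacitance(x))


def energy_fixed_voltage(v, x):
    # Non-standard Legendre transform U* = U - Q V = -V^2 C(x) / 2.
    return -0.5 * v * v * capacitance(x)


def derivative(func, t, h):
    # Central finite-difference estimate of func'(t).
    return (func(t + h) - func(t - h)) / (2.0 * h)


x0 = 1e-3                   # plate separation in m
v0 = 10.0                   # voltage in V
q0 = capacitance(x0) * v0   # matching charge, Q = C V

force_q = -derivative(lambda x: energy_fixed_charge(q0, x), x0, 1e-7)
force_v = -derivative(lambda x: energy_fixed_voltage(v0, x), x0, 1e-7)
force_closed_form = -0.5 * EPS0 * AREA * v0 ** 2 / x0 ** 2

print(force_q, force_v, force_closed_form)
```

The three numbers agree up to the finite-difference error, and the negative sign indicates that the force acts to reduce the separation.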

Probability theory

In large deviations theory, the rate function is defined as the Legendre transformation of the logarithm of the moment generating function of a random variable. An important application of the rate function is in the calculation of tail probabilities of sums of i.i.d. random variables, in particular in Cramér's theorem.

If $X_n$ are i.i.d. random variables, let $S_n = X_1 + \cdots + X_n$ be the associated random walk and $M(\xi)$ the moment generating function of $X_1$. For $\xi \in \mathbb{R}$, $E[e^{\xi S_n}] = M(\xi)^n$. Hence, by Markov's inequality, one has for $\xi \ge 0$ and $a \in \mathbb{R}$

$$P(S_n/n > a) \le e^{-n\xi a}M(\xi)^n = \exp\left[-n\left(\xi a - \Lambda(\xi)\right)\right],$$
where $\Lambda(\xi) = \log M(\xi)$. Since the left-hand side is independent of $\xi$, we may take the infimum of the right-hand side, which leads one to consider the supremum of $\xi a - \Lambda(\xi)$, i.e., the Legendre transform of $\Lambda$, evaluated at $x = a$.
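As an illustration of this construction, the sketch below computes the rate function for Bernoulli variables, an example chosen here and not taken from the text above. For $X \sim \mathrm{Bernoulli}(q)$ the logarithm of the moment generating function is $\Lambda(\xi) = \log(1 - q + q e^{\xi})$, and its Legendre transform at $a \in (0, 1)$ is known to equal $a \ln(a/q) + (1 - a)\ln((1 - a)/(1 - q))$. The supremum is found by a ternary search, which is valid because $\xi a - \Lambda(\xi)$ is concave in $\xi$; the search bracket and all names are assumptions.

```python
import math


def log_mgf(xi, q):
    """Lambda(xi) = log E[exp(xi * X)] for X ~ Bernoulli(q)."""
    return math.log(1.0 - q + q * math.exp(xi))


def sup_concave(h, lo, hi, iters=200):
    """Maximize a concave function h on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if h(m1) < h(m2):
            lo = m1
        else:
            hi = m2
    return h((lo + hi) / 2.0)


def rate_function(a, q, lo=-50.0, hi=50.0):
    # Legendre transform of Lambda evaluated at a; the bracket is an assumption.
    return sup_concave(lambda xi: xi * a - log_mgf(xi, q), lo, hi)


def relative_entropy(a, q):
    # Closed form of the Bernoulli rate function for 0 < a < 1.
    return a * math.log(a / q) + (1.0 - a) * math.log((1.0 - a) / (1.0 - q))


q, a, n = 0.5, 0.7, 100
rate = rate_function(a, q)
print(rate, relative_entropy(a, q))
print(math.exp(-n * rate))  # resulting upper bound on P(S_n / n > a)
```

For $a$ above the mean $q$, the maximizing $\xi$ is positive, so the unrestricted supremum coincides with the supremum over $\xi \ge 0$ required by the Markov inequality argument.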

Microeconomics

The Legendre transformation arises naturally in microeconomics in the process of finding the supply $S(P)$ of some product given a fixed price $P$ on the market, knowing the cost function $C(Q)$, i.e. the cost for the producer to make, mine, or otherwise obtain $Q$ units of the given product.

A simple theory explains the shape of the supply curve based solely on the cost function. Let us suppose the market price for one unit of our product is $P$. For a company selling this good, the best strategy is to adjust the production $Q$ so that its profit is maximized. We can maximize the profit

$$\text{profit} = \text{revenue} - \text{costs} = PQ - C(Q)$$
by differentiating with respect to $Q$ and solving
$$P - C'(Q_\text{opt}) = 0.$$

$Q_\text{opt}$ represents the optimal quantity $Q$ of goods that the producer is willing to supply, which is indeed the supply itself:

$$S(P) = Q_\text{opt}(P) = (C')^{-1}(P).$$

If we consider the maximal profit as a function of price, $\text{profit}_\text{max}(P)$, we see that it is the Legendre transform of the cost function $C(Q)$.
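A minimal sketch of this argument, assuming the quadratic cost function $C(Q) = cQ^2$ with $c = 0.5$ (a hypothetical choice for illustration): the first-order condition $P = C'(Q) = 2cQ$ gives the supply $Q = P/(2c)$, and the maximal profit is $P^2/(4c)$, the Legendre transform of the cost.

```python
def cost(quantity, c=0.5):
    # Assumed convex cost function C(Q) = c * Q^2.
    return c * quantity * quantity


def supply(price, c=0.5):
    # Solves the first-order condition P = C'(Q) = 2 c Q.
    return price / (2.0 * c)


def max_profit(price, c=0.5):
    # Profit P*Q - C(Q) evaluated at the optimal quantity.
    q_opt = supply(price, c)
    return price * q_opt - cost(q_opt, c)


price = 3.0
print(supply(price), max_profit(price))

# Brute-force comparison: no quantity on a grid yields a larger profit.
quantities = [0.01 * i for i in range(1001)]
best_on_grid = max(price * q - cost(q) for q in quantities)
print(best_on_grid)
```

The brute-force maximum over the grid matches the value obtained from the first-order condition, as expected for a convex cost function.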

Geometric interpretation

For a strictly convex function, the Legendre transformation can be interpreted as a mapping between the graph of the function and the family of tangents of the graph. (For a function of one variable, the tangents are well-defined at all but at most countably many points, since a convex function is differentiable at all but at most countably many points.)

The equation of a line with slope $p$ and $y$-intercept $b$ is given by $y = px + b$. For this line to be tangent to the graph of a function $f$ at the point $\left(x_0, f(x_0)\right)$ requires

$$f(x_0) = p x_0 + b$$
and
$$p = f'(x_0).$$

Being the derivative of a strictly convex function, the function $f'$ is strictly monotone and thus injective. The second equation can be solved for $x_0 = f'^{-1}(p)$, allowing elimination of $x_0$ from the first, and solving for the $y$-intercept $b$ of the tangent as a function of its slope $p$: $b = f(x_0) - p x_0 = f\left(f'^{-1}(p)\right) - p \cdot f'^{-1}(p) = -f^\star(p)$, where $f^\star$ denotes the Legendre transform of $f$.

The family of tangent lines of the graph of $f$ parameterized by the slope $p$ is therefore given by $y = px - f^\star(p)$, or, written implicitly, by the solutions of the equation

$$F(x, y, p) = y + f^\star(p) - px = 0.$$

The graph of the original function can be reconstructed from this family of lines as the envelope of this family by demanding

$$\frac{\partial F(x, y, p)}{\partial p} = f^{\star\prime}(p) - x = 0.$$

Eliminating $p$ from these two equations gives

$$y = x \cdot f^{\star\prime -1}(x) - f^\star\left(f^{\star\prime -1}(x)\right).$$

Identifying $y$ with $f(x)$ and recognizing the right side of the preceding equation as the Legendre transform of $f^\star$ yield $f(x) = f^{\star\star}(x)$.
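The envelope construction can be sketched numerically: each slope $p$ contributes the tangent line $y = px - f^\star(p)$, and the upper envelope of these lines recovers the graph. The example assumes $f(x) = x^2$ with $f^\star(p) = p^2/4$; the slope grid and the function names are illustrative assumptions.

```python
def f(x):
    return x * x


def f_star(p):
    # Legendre transform of the assumed example f(x) = x^2.
    return p * p / 4.0


def tangent_line(p, x):
    # Height at x of the tangent line with slope p: y = p*x - f*(p).
    return p * x - f_star(p)


def envelope(x, slopes):
    # Upper envelope of the family of tangent lines, evaluated at x.
    return max(tangent_line(p, x) for p in slopes)


slopes = [-10 + 0.001 * k for k in range(20001)]

for x in (-2.0, 0.0, 1.5):
    print(x, envelope(x, slopes), f(x))

# Every tangent line lies on or below the graph of the convex function.
lines_below_graph = all(tangent_line(p, 1.5) <= f(1.5) + 1e-12 for p in slopes)
print(lines_below_graph)
```

Only slopes in $[-10, 10]$ are sampled, so the envelope reproduces $f$ only where the tangent slope $2x$ falls inside that range.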

Legendre transformation in more than one dimension

For a differentiable real-valued function $f$ on an open convex subset $U$ of $\mathbb{R}^n$, the Legendre conjugate of the pair $(U, f)$ is defined to be the pair $(V, g)$, where $V$ is the image of $U$ under the gradient mapping $Df$, and $g$ is the function on $V$ given by the formula

$$g(y) = \langle y, x\rangle - f(x), \qquad x = (Df)^{-1}(y),$$
where
$$\langle u, v\rangle = \sum_{k=1}^{n} u_k \cdot v_k$$

is the scalar product on $\mathbb{R}^n$. The multidimensional transform can be interpreted as an encoding of the convex hull of the function's epigraph in terms of its supporting hyperplanes.[2] This can be seen as a consequence of the following two observations. On the one hand, the hyperplane tangent to the epigraph of $f$ at some point $(\mathbf{x}, f(\mathbf{x})) \in U \times \mathbb{R}$ has normal vector $(\nabla f(\mathbf{x}), -1) \in \mathbb{R}^{n+1}$. On the other hand, any closed convex set $C \subseteq \mathbb{R}^m$ can be characterized via the set of its supporting hyperplanes by the equations $\mathbf{x} \cdot \mathbf{n} = h_C(\mathbf{n})$, where $h_C(\mathbf{n})$ is the support function of $C$. But the definition of the Legendre transform via the maximization matches precisely that of the support function, that is, $f^*(\mathbf{p}) = h_{\operatorname{epi}(f)}(\mathbf{p}, -1)$. We thus conclude that the Legendre transform characterizes the epigraph in the sense that the tangent plane to the epigraph at any point $(\mathbf{x}, f(\mathbf{x}))$ is given explicitly by

$$\left\{\mathbf{z} \in \mathbb{R}^{n+1} : \mathbf{z} \cdot (\mathbf{p}, -1) = f^*(\mathbf{p})\right\}, \qquad \mathbf{p} = \nabla f(\mathbf{x}).$$

Alternatively, if $X$ is a vector space and $Y$ is its dual vector space, then for each point $x$ of $X$ and $y$ of $Y$, there is a natural identification of the cotangent spaces $T^*X_x$ with $Y$ and $T^*Y_y$ with $X$. If $f$ is a real differentiable function over $X$, then its exterior derivative, $df$, is a section of the cotangent bundle $T^*X$ and as such, we can construct a map from $X$ to $Y$. Similarly, if $g$ is a real differentiable function over $Y$, then $dg$ defines a map from $Y$ to $X$. If both maps happen to be inverses of each other, we say we have a Legendre transform. The notion of the tautological one-form is commonly used in this setting.

When the function is not differentiable, the Legendre transform can still be extended, and is known as the Legendre–Fenchel transformation. In this more general setting, a few properties are lost: for example, the Legendre transform is no longer its own inverse (unless there are extra assumptions, like convexity).

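Legendre transformation on manifolds

Let $M$ be a smooth manifold, let $E$ and $\pi : E \to M$ be a vector bundle on $M$ and its associated bundle projection, respectively. Let $L : E \to \mathbb{R}$ be a smooth function. We think of $L$ as a Lagrangian by analogy with the classical case where $M = \mathbb{R}$, $E = TM = \mathbb{R} \times \mathbb{R}$ and $L(x, v) = \frac{1}{2}mv^2 - V(x)$ for some positive number $m \in \mathbb{R}$ and function $V : M \to \mathbb{R}$.

As usual, the dual of $E$ is denoted by $E^*$. The fiber of $\pi$ over $x \in M$ is denoted $E_x$, and the restriction of $L$ to $E_x$ is denoted by $L|_{E_x} : E_x \to \mathbb{R}$. The Legendre transformation of $L$ is the smooth morphism

$$\mathbf{F}L : E \to E^*$$
defined by $\mathbf{F}L(v) = d(L|_{E_x})_v \in E_x^*$, where $x = \pi(v)$. In other words, $\mathbf{F}L(v) \in E_x^*$ is the covector that sends $w \in E_x$ to the directional derivative $\left.\frac{d}{dt}\right|_{t=0} L(v + tw) \in \mathbb{R}$.

To describe the Legendre transformation locally, let $U \subseteq M$ be a coordinate chart over which $E$ is trivial. Picking a trivialization of $E$ over $U$, we obtain charts $E_U \cong U \times \mathbb{R}^r$ and $E_U^* \cong U \times \mathbb{R}^r$. In terms of these charts, we have $\mathbf{F}L(x; v_1, \dotsc, v_r) = (x; p_1, \dotsc, p_r)$, where

$$p_i = \frac{\partial L}{\partial v_i}(x; v_1, \dotsc, v_r)$$
for all $i = 1, \dots, r$. If, as in the classical case, the restriction of $L : E \to \mathbb{R}$ to each fiber $E_x$ is strictly convex and bounded below by a positive definite quadratic form minus a constant, then the Legendre transform $\mathbf{F}L : E \to E^*$ is a diffeomorphism.[3] Suppose that $\mathbf{F}L$ is a diffeomorphism and let $H : E^* \to \mathbb{R}$ be the "Hamiltonian" function defined by
$$H(p) = p \cdot v - L(v),$$
where $v = (\mathbf{F}L)^{-1}(p)$. Using the natural isomorphism $E \cong E^{**}$, we may view the Legendre transformation of $H$ as a map $\mathbf{F}H : E^* \to E$. Then we have[3]
$$(\mathbf{F}L)^{-1} = \mathbf{F}H.$$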

Further properties

Scaling properties

The Legendre transformation has the following scaling properties: For $a > 0$,

$$f(x) = a \cdot g(x) \Rightarrow f^\star(p) = a \cdot g^\star\left(\frac{p}{a}\right)$$
$$f(x) = g(a \cdot x) \Rightarrow f^\star(p) = g^\star\left(\frac{p}{a}\right).$$

It follows that if a function is homogeneous of degree $r$ then its image under the Legendre transformation is a homogeneous function of degree $s$, where $1/r + 1/s = 1$. (Since $f(x) = x^r/r$, with $r > 1$, implies $f^*(p) = p^s/s$.) Thus, the only monomial whose degree is invariant under the Legendre transform is the quadratic.
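The parenthetical claim about power functions can be verified directly. The following worked computation restricts to $x > 0$ and $p > 0$, an assumption made here so that fractional powers are well defined.

```latex
% Worked check for x > 0, p > 0, r > 1 (positivity is an assumption of this sketch).
\begin{align*}
f(x) &= \frac{x^r}{r}, \qquad p = f'(x) = x^{r-1}
  \quad\Longrightarrow\quad x = p^{1/(r-1)}, \\
f^*(p) &= px - \frac{x^r}{r} = \left(1 - \frac{1}{r}\right) x^r
        = \frac{1}{s}\, p^{r/(r-1)} = \frac{p^s}{s},
  \qquad s = \frac{r}{r-1}, \\
\frac{1}{r} + \frac{1}{s} &= \frac{1}{r} + \frac{r-1}{r} = 1 .
\end{align*}
```

Setting $s = r$ in $1/r + 1/s = 1$ forces $r = 2$, which is why the quadratic is the only monomial whose degree is preserved.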

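Behavior under translation

$$f(x) = g(x) + b \Rightarrow f^\star(p) = g^\star(p) - b$$
$$f(x) = g(x + y) \Rightarrow f^\star(p) = g^\star(p) - p \cdot y$$

Behavior under inversion

$$f(x) = g^{-1}(x) \Rightarrow f^\star(p) = -p \cdot g^\star\left(\frac{1}{p}\right)$$

Behavior under linear transformations

Let $A : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. For any convex function $f$ on $\mathbb{R}^n$, one has

$$(Af)^\star = f^\star A^\star$$
where $A^\star$ is the adjoint operator of $A$ defined by
$$\langle Ax, y^\star\rangle = \langle x, A^\star y^\star\rangle,$$
and $Af$ is the push-forward of $f$ along $A$
$$(Af)(y) = \inf\{f(x) : x \in X, Ax = y\}.$$

A closed convex function $f$ is symmetric with respect to a given set $G$ of orthogonal linear transformations,

$$f(Ax) = f(x), \quad \forall x, \ \forall A \in G$$
if and only if $f^\star$ is symmetric with respect to $G$.

Infimal convolution

The infimal convolution of two functions $f$ and $g$ is defined as

$$\left(f \star_{\inf} g\right)(x) = \inf\left\{f(x - y) + g(y) \mid y \in \mathbb{R}^n\right\}.$$

Let $f_1, \ldots, f_m$ be proper convex functions on $\mathbb{R}^n$. Then

$$\left(f_1 \star_{\inf} \cdots \star_{\inf} f_m\right)^\star = f_1^\star + \cdots + f_m^\star.$$

Fenchel's inequality

For any function $f$ and its convex conjugate $f^\star$, Fenchel's inequality (also known as the Fenchel–Young inequality) holds for every $x \in X$ and $p \in X^\star$, i.e., for independent $x, p$ pairs,

$$\langle p, x\rangle \le f(x) + f^\star(p).$$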


References

  1. Legendre, Adrien-Marie (1789). Mémoire sur l'intégration de quelques équations aux différences partielles. In Histoire de l'Académie royale des sciences, avec les mémoires de mathématique et de physique (in French). Paris: Imprimerie royale. pp. 309–351.
  2. "Legendre Transform | Nick Alger // Maps, art, etc". Archived from the original on 2015-03-12. Retrieved 2011-01-26.
  3. Ana Cannas da Silva. Lectures on Symplectic Geometry, Corrected 2nd printing. Springer-Verlag, 2008. pp. 147–148. ISBN 978-3-540-42195-5.
