Exponential Tilting (ET), Exponential Twisting, or Exponential Change of Measure (ECM) is a distribution shifting technique used in many parts of mathematics. The family of all exponential tiltings of a random variable $X$ is known as the natural exponential family of $X$.

Exponential tilting is used in Monte Carlo estimation for rare-event simulation, in particular in rejection sampling and importance sampling. In mathematical finance,[1] exponential tilting is also known as Esscher tilting (or the Esscher transform); it is often combined with indirect Edgeworth approximation and is used in contexts such as insurance futures pricing.[2]

The earliest formalization of exponential tilting is often attributed to Esscher,[3] and its use in importance sampling to David Siegmund.[4]

Overview

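Given a random variable $X$ with probability distribution $\mathbb{P}$, density $f$, and moment generating function (MGF) $M_X(\theta) = \mathbb{E}[e^{\theta X}] < \infty$, the exponentially tilted measure $\mathbb{P}_\theta$ is defined as follows:

$$\mathbb{P}_\theta(X \in dx) = \frac{\mathbb{E}[e^{\theta X}\,\mathbb{I}[X \in dx]]}{M_X(\theta)} = e^{\theta x - \kappa(\theta)}\,\mathbb{P}(X \in dx),$$

where $\kappa(\theta)$ is the cumulant generating function (CGF) defined as

$$\kappa(\theta) = \log \mathbb{E}[e^{\theta X}] = \log M_X(\theta).$$

We call

$$f_\theta(x) = e^{\theta x - \kappa(\theta)} f(x)$$

the $\theta$-tilted density of $X$. It satisfies $f_\theta(x) \propto e^{\theta x} f(x)$.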

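The exponential tilting of a random vector $X$ has an analogous definition:

$$\mathbb{P}_\theta(X \in dx) = e^{\theta^\top x - \kappa(\theta)}\,\mathbb{P}(X \in dx),$$

where $\kappa(\theta) = \log \mathbb{E}[\exp\{\theta^\top X\}]$.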
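The definition can be illustrated on a distribution with finite support, where the tilted probabilities are computed directly from $e^{\theta x - \kappa(\theta)}$. The following is a minimal sketch; the function name `tilt_pmf`, the dictionary representation of the distribution, and the fair-die example are assumptions made for illustration only.

```python
import math

def tilt_pmf(pmf, theta):
    """Return the theta-tilted pmf of a finite discrete distribution.

    pmf maps each support point x to P(X = x).
    """
    # kappa(theta) = log E[exp(theta * X)], the cumulant generating function
    kappa = math.log(sum(p * math.exp(theta * x) for x, p in pmf.items()))
    # Tilted mass: exp(theta * x - kappa(theta)) * P(X = x)
    return {x: math.exp(theta * x - kappa) * p for x, p in pmf.items()}

# Fair six-sided die, tilted towards the large faces with theta = 0.5
die = {x: 1 / 6 for x in range(1, 7)}
tilted = tilt_pmf(die, theta=0.5)

total = sum(tilted.values())                   # tilted masses still sum to one
mean = sum(x * p for x, p in tilted.items())   # mean moves above 3.5
```

A positive $\theta$ shifts mass towards larger values and a negative $\theta$ towards smaller ones, while $\theta = 0$ returns the original distribution.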

Example

The exponentially tilted measure in many cases has the same parametric form as that of $\mathbb{P}$. One-dimensional examples include the normal distribution, the exponential distribution, the binomial distribution and the Poisson distribution.

For example, in the case of the normal distribution $N(\mu, \sigma^2)$, the tilted density $f_\theta(x)$ is the $N(\mu + \theta\sigma^2, \sigma^2)$ density. The table below provides more examples of tilted densities.

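Original distribution[5][6] | θ-Tilted distribution
$\mathrm{Gamma}(\alpha, \beta)$, with shape $\alpha$ and rate $\beta$ | $\mathrm{Gamma}(\alpha, \beta - \theta)$
$\mathrm{Binomial}(n, p)$ | $\mathrm{Binomial}\left(n, \frac{p e^{\theta}}{1 - p + p e^{\theta}}\right)$
$\mathrm{Poisson}(\lambda)$ | $\mathrm{Poisson}(\lambda e^{\theta})$
$\mathrm{Exponential}(\lambda)$ | $\mathrm{Exponential}(\lambda - \theta)$
$N(\mu, \sigma^2)$ | $N(\mu + \theta\sigma^2, \sigma^2)$
$N(\mu, \Sigma)$ | $N(\mu + \Sigma\theta, \Sigma)$
$\chi^2(k)$ | $\mathrm{Gamma}(k/2, \tfrac{1}{2} - \theta)$, with shape $k/2$ and rate $\tfrac{1}{2} - \theta$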

For some distributions, however, the exponentially tilted distribution does not belong to the same parametric family as $f$. An example of this is the Pareto distribution with $f(x) = \alpha/(1 + x)^{\alpha + 1}$, $x > 0$, where $f_\theta(x)$ is well defined for $\theta < 0$ but is not a standard distribution. In such examples, random variable generation may not be straightforward.[7]
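For families that are closed under tilting, the closed form of the tilted parameter can be checked numerically against the definition. The sketch below does this for the binomial case, where tilting $\mathrm{Binomial}(n, p)$ gives $\mathrm{Binomial}(n, p e^{\theta}/(1 - p + p e^{\theta}))$. The helper names `binom_pmf` and `tilt` and the parameter values are illustrative assumptions.

```python
import math

def binom_pmf(n, p):
    """Probability mass function of Binomial(n, p) as a list indexed by k."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def tilt(pmf, theta):
    """Tilt a pmf on {0, 1, ..., n} by theta using the definition directly."""
    weights = [q * math.exp(theta * k) for k, q in enumerate(pmf)]
    norm = sum(weights)  # equals the MGF M_X(theta)
    return [w / norm for w in weights]

n, p, theta = 10, 0.3, 0.7
p_tilted = p * math.exp(theta) / (1 - p + p * math.exp(theta))

from_definition = tilt(binom_pmf(n, p), theta)
from_closed_form = binom_pmf(n, p_tilted)
max_gap = max(abs(a - b) for a, b in zip(from_definition, from_closed_form))
```

The two probability vectors agree up to floating-point rounding, so sampling from the tilted distribution only requires a binomial generator with the adjusted success probability.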

In statistical mechanics, the energy of a system in equilibrium with a heat bath has the Boltzmann distribution: $\mathbb{P}(E \in dE) \propto e^{-\beta E}\,dE$, where $\beta$ is the inverse temperature. Exponential tilting then corresponds to changing the temperature: $\mathbb{P}_\theta(E \in dE) \propto e^{-(\beta - \theta)E}\,dE$.

Similarly, the energy and particle number of a system in equilibrium with a heat and particle bath has the grand canonical distribution: $\mathbb{P}((N, E) \in (dN, dE)) \propto e^{\beta\mu N - \beta E}\,dN\,dE$, where $\mu$ is the chemical potential. Exponential tilting then corresponds to changing both the temperature and the chemical potential.

Advantages

In many cases, the tilted distribution belongs to the same parametric family as the original. This is particularly true when the original density belongs to the exponential family of distributions. This simplifies random variable generation during Monte Carlo simulations. Exponential tilting may still be useful if this is not the case, though normalization must be possible and additional sampling algorithms may be needed.

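In addition, there exists a simple relationship between the original and tilted CGF,

$$\kappa_\theta(\eta) = \log \mathbb{E}_\theta[e^{\eta X}] = \kappa(\theta + \eta) - \kappa(\theta).$$

We can see this by observing that

$$F_\theta(x) = \int_{-\infty}^{x} \exp\{\theta y - \kappa(\theta)\}\,f(y)\,dy.$$

Thus,

$$\kappa_\theta(\eta) = \log \int e^{\eta x}\,dF_\theta(x) = \log \int e^{\eta x} e^{\theta x - \kappa(\theta)}\,dF(x) = \log \mathbb{E}\left[e^{(\eta + \theta)X - \kappa(\theta)}\right] = \kappa(\eta + \theta) - \kappa(\theta).$$

This relationship allows for easy calculation of the CGF of the tilted distribution and thus of the distribution's moments. Moreover, it results in a simple form of the likelihood ratio. Specifically,

$$\ell = \frac{d\mathbb{P}}{d\mathbb{P}_\theta} = \frac{f(x)}{f_\theta(x)} = e^{-\theta x + \kappa(\theta)}.$$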
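As a worked instance of the CGF relationship, consider the normal case. The derivation below is a sketch that assumes $X \sim N(\mu, \sigma^2)$, whose CGF is $\kappa(\theta) = \mu\theta + \tfrac{1}{2}\sigma^2\theta^2$; it recovers the tilted normal distribution and the corresponding likelihood ratio.

```latex
\begin{align*}
\kappa_\theta(\eta)
  &= \kappa(\theta + \eta) - \kappa(\theta) \\
  &= \mu(\theta + \eta) + \tfrac{1}{2}\sigma^2(\theta + \eta)^2
     - \mu\theta - \tfrac{1}{2}\sigma^2\theta^2 \\
  &= (\mu + \theta\sigma^2)\,\eta + \tfrac{1}{2}\sigma^2\eta^2 ,
\end{align*}
% which is the CGF of N(\mu + \theta\sigma^2, \sigma^2).
% The likelihood ratio between the original and the tilted law is
\begin{equation*}
\ell(x) = \exp\left\{-\theta x + \mu\theta + \tfrac{1}{2}\sigma^2\theta^2\right\}.
\end{equation*}
```

The tilt therefore shifts the mean by $\theta\sigma^2$ and leaves the variance unchanged.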

Properties

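  • If $\kappa(\eta) = \log \mathbb{E}[\exp(\eta X)]$ is the CGF of $X$, then the CGF of the $\theta$-tilted $X$ is
$$\kappa_\theta(\eta) = \kappa(\theta + \eta) - \kappa(\theta).$$
This means that the $i$-th cumulant of the tilted $X$ is $\kappa^{(i)}(\theta)$. In particular, the expectation of the tilted distribution is
$$\mathbb{E}_\theta[X] = \frac{d}{d\eta}\kappa_\theta(\eta)\Big|_{\eta = 0} = \kappa'(\theta).$$
The variance of the tilted distribution is
$$\operatorname{Var}_\theta[X] = \frac{d^2}{d\eta^2}\kappa_\theta(\eta)\Big|_{\eta = 0} = \kappa''(\theta).$$
  • Repeated tilting is additive. That is, tilting first by $\theta_1$ and then $\theta_2$ is the same as tilting once by $\theta_1 + \theta_2$.
  • If $X$ is the sum of independent, but not necessarily identical random variables $X_1, X_2, \ldots$, then the $\theta$-tilted distribution of $X$ is the distribution of the sum of $X_1, X_2, \ldots$ each $\theta$-tilted individually.
  • If $\mu = \mathbb{E}[X]$, then $\kappa(\theta) - \theta\mu$ is the Kullback–Leibler divergence
$$D_{\text{KL}}(P \parallel P_\theta) = \mathbb{E}\left[\log\frac{P}{P_\theta}\right]$$
between the tilted distribution $P_\theta$ and the original distribution $P$ of $X$.
  • Similarly, since $\mathbb{E}_\theta[X] = \kappa'(\theta)$, we have the Kullback–Leibler divergence as
$$D_{\text{KL}}(P_\theta \parallel P) = \mathbb{E}_\theta\left[\log\frac{P_\theta}{P}\right] = \theta\kappa'(\theta) - \kappa(\theta).$$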
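The cumulant and divergence identities above can be verified numerically. The sketch below assumes a Poisson distribution with rate `lam`, whose CGF is $\kappa(\theta) = \lambda(e^{\theta} - 1)$, and truncates the support at 80 terms, which is far into the negligible tail for the chosen parameters. All names and values are illustrative.

```python
import math

lam, theta = 2.0, 0.4

def kappa(t):
    # CGF of Poisson(lam): log E[exp(t * X)] = lam * (exp(t) - 1)
    return lam * (math.exp(t) - 1.0)

def pois_pmf(k, rate):
    # Computed on the log scale to avoid overflow in k!
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))

ks = range(80)
p = [pois_pmf(k, lam) for k in ks]
# Tilted masses from the definition exp(theta * k - kappa(theta)) * P(X = k)
p_theta = [math.exp(theta * k - kappa(theta)) * q for k, q in zip(ks, p)]

mean_theta = sum(k * q for k, q in zip(ks, p_theta))
var_theta = sum((k - mean_theta) ** 2 * q for k, q in zip(ks, p_theta))
kl_theta_vs_p = sum(q * math.log(q / r) for q, r in zip(p_theta, p))

kappa_prime = lam * math.exp(theta)    # kappa'(theta)
kappa_second = lam * math.exp(theta)   # kappa''(theta)
```

Here `mean_theta` and `var_theta` match $\kappa'(\theta)$ and $\kappa''(\theta)$, and `kl_theta_vs_p` matches $\theta\kappa'(\theta) - \kappa(\theta)$, up to rounding.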

Applications

Rare-event simulation

The exponential tilting of $X$, assuming it exists, supplies a family of distributions that can be used as proposal distributions for acceptance-rejection sampling or importance distributions for importance sampling. One common application is sampling from a distribution conditional on a sub-region of the domain, i.e. $X \mid X \in A$. With an appropriate choice of $\theta$, sampling from $\mathbb{P}_\theta$ can meaningfully reduce the required amount of sampling or the variance of an estimator.

Saddlepoint approximation

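The saddlepoint approximation method is a density approximation methodology often used for the distribution of sums and averages of independent, identically distributed random variables. It employs Edgeworth series but generally performs better at extreme values. Let $\bar{X}$ be the mean of $n$ such variables with common CGF $\kappa$, and let $f$ and $f_\theta$ denote the original and $\theta$-tilted densities of $\bar{X}$. From the definition of the natural exponential family, it follows that

$$f_\theta(\bar{x}) = f(\bar{x}) \exp\{n(\theta\bar{x} - \kappa(\theta))\}.$$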

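Applying the Edgeworth expansion for $f_\theta(\bar{x})$, we have

$$f_\theta(\bar{x}) = \phi(z)\,(\operatorname{Var}_\theta[\bar{X}])^{-1/2}\left\{1 + \frac{\rho_3(\theta)h_3(z)}{6\sqrt{n}} + \frac{1}{n}\left[\frac{\rho_4(\theta)h_4(z)}{24} + \frac{\rho_3(\theta)^2 h_6(z)}{72}\right] + \cdots\right\},$$

where $\phi$ is the standard normal density, $\operatorname{Var}_\theta[\bar{X}] = \kappa''(\theta)/n$ is the variance of the mean of the $n$ variables under the tilted distribution,

$$z = \frac{\bar{x} - \kappa'(\theta)}{\sqrt{\kappa''(\theta)/n}},$$
$$\rho_j(\theta) = \frac{\kappa^{(j)}(\theta)}{\kappa''(\theta)^{j/2}},$$

and $h_j$ are the Hermite polynomials.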

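When considering values of $\bar{x}$ progressively farther from the center of the distribution, $|z| \to \infty$ and the $h_j(z)$ terms become unbounded. However, for each value of $\bar{x}$, we can choose $\theta$ such that

$$\kappa'(\theta) = \bar{x}.$$

This value of $\theta$ is referred to as the saddle-point; with this choice $z = 0$, so the above expansion is always evaluated at the expectation of the tilted distribution. This choice of $\theta$ leads to the final representation of the approximation given by[8][9]

$$f(\bar{x}) \approx \left(\frac{n}{2\pi\kappa''(\theta)}\right)^{1/2} \exp\{n(\kappa(\theta) - \theta\bar{x})\}.$$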
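The final formula can be compared with a case where the exact density is known. The sketch below assumes i.i.d. $\mathrm{Exp}(1)$ variables, for which $\kappa(\theta) = -\log(1 - \theta)$ and the mean of $n$ variables is exactly Gamma distributed with shape $n$ and rate $n$. The function names are hypothetical and the choice of distribution is made only to allow this comparison.

```python
import math

def saddlepoint_density_exp_mean(xbar, n):
    """Saddlepoint approximation to the density of the mean of n i.i.d. Exp(1)."""
    theta = 1.0 - 1.0 / xbar            # solves kappa'(theta) = 1 / (1 - theta) = xbar
    kappa = -math.log(1.0 - theta)      # CGF of Exp(1), valid for theta < 1
    kappa2 = 1.0 / (1.0 - theta) ** 2   # kappa''(theta)
    return math.sqrt(n / (2.0 * math.pi * kappa2)) * math.exp(n * (kappa - theta * xbar))

def exact_density_exp_mean(xbar, n):
    """Exact density: the mean of n i.i.d. Exp(1) is Gamma(shape n, rate n)."""
    log_f = n * math.log(n) + (n - 1) * math.log(xbar) - n * xbar - math.lgamma(n)
    return math.exp(log_f)

n = 10
# Compare in the left tail, at the center, and in the right tail
ratios = [saddlepoint_density_exp_mean(x, n) / exact_density_exp_mean(x, n)
          for x in (0.3, 1.0, 2.5)]
```

In this particular example the ratio of approximate to exact density does not depend on $\bar{x}$: it equals the error of Stirling's approximation to $\Gamma(n)$, which is below one percent for $n = 10$. The relative accuracy is therefore the same in the tails as at the center.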

Rejection sampling

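Using the tilted distribution $\mathbb{P}_\theta$ as the proposal, the rejection sampling algorithm prescribes sampling from $f_\theta(x)$ and accepting with probability

$$\frac{1}{c}\exp(-\theta x + \kappa(\theta)),$$

where

$$c = \sup_{x} \frac{d\mathbb{P}}{d\mathbb{P}_\theta}(x).$$

That is, a uniformly distributed random variable $p \sim \operatorname{Unif}(0, 1)$ is generated, and the sample $x$ from $f_\theta$ is accepted if

$$p \le \frac{1}{c}\exp(-\theta x + \kappa(\theta)).$$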
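The acceptance rule can be sketched for a case where every quantity is available in closed form. The example assumes a target $\mathrm{Exp}(\lambda)$ and a tilt $0 < \theta < \lambda$, so that the proposal is $\mathrm{Exp}(\lambda - \theta)$, $\kappa(\theta) = \log(\lambda/(\lambda - \theta))$, and the supremum $c = e^{\kappa(\theta)}$ is attained at $x = 0$. The function name and parameter values are illustrative; sampling an exponential this way has no practical benefit and only demonstrates the mechanics.

```python
import math
import random

def sample_exp_via_tilted_proposal(lam, theta, rng):
    """Draw one Exp(lam) variate by rejection from the theta-tilted proposal.

    Assumes 0 < theta < lam, so that the proposal Exp(lam - theta) exists and
    dP/dP_theta = exp(-theta * x + kappa(theta)) is bounded on x >= 0.
    """
    kappa = math.log(lam / (lam - theta))   # CGF of Exp(lam) at theta
    c = math.exp(kappa)                     # sup of dP/dP_theta, attained at x = 0
    while True:
        x = rng.expovariate(lam - theta)    # proposal draw from f_theta
        p = rng.random()                    # p ~ Unif(0, 1)
        if p <= math.exp(-theta * x + kappa) / c:
            return x

rng = random.Random(0)
draws = [sample_exp_via_tilted_proposal(2.0, 1.0, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)       # should be close to 1 / lam = 0.5
```

The overall acceptance probability is $1/c$, here $(\lambda - \theta)/\lambda = 0.5$, so about two proposals are needed per accepted draw.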

Importance sampling

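Applying the exponentially tilted distribution as the importance distribution yields the equation

$$\mathbb{E}[h(X)] = \mathbb{E}_\theta[\ell(X)h(X)],$$

where

$$\ell(X) = \frac{d\mathbb{P}}{d\mathbb{P}_\theta} = e^{-\theta X + \kappa(\theta)}$$

is the likelihood ratio. So, one samples from $f_\theta$, evaluates $h$ under the importance distribution $\mathbb{P}_\theta$, and weights each sample by the likelihood ratio. The variance of a single weighted sample is given by

$$\operatorname{Var}_\theta[\ell(X)h(X)] = \mathbb{E}_\theta[(\ell(X)h(X))^2] - (\mathbb{E}[h(X)])^2.$$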

Example

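Assume independent and identically distributed $\{X_i\}$ such that $\kappa(\theta) < \infty$. In order to estimate $\mathbb{P}(X_1 + \cdots + X_n > c)$, we can employ importance sampling by taking

$$h(X) = \mathbb{I}\left(\sum_{i=1}^{n} X_i > c\right).$$

The constant $c$ can be rewritten as $na$ for some other constant $a$. Then,

$$\mathbb{P}\left(\sum_{i=1}^{n} X_i > na\right) = \mathbb{E}_{\theta_a}\left[\exp\left\{-\theta_a \sum_{i=1}^{n} X_i + n\kappa(\theta_a)\right\}\mathbb{I}\left(\sum_{i=1}^{n} X_i > na\right)\right],$$

where $\theta_a$ denotes the $\theta$ defined by the saddle-point equation

$$\kappa'(\theta_a) = \mathbb{E}_{\theta_a}[X] = a.$$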
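The estimator in this example can be sketched for standard normal summands, an assumption made here because it gives $\kappa(\theta) = \theta^2/2$, the saddle-point $\theta_a = a$, and an exact tail probability to compare against. The function name, seed, and sample size are illustrative.

```python
import math
import random

def tilted_tail_estimate(n, a, num_samples, rng):
    """Importance-sampling estimate of P(X_1 + ... + X_n > n * a) for X_i ~ N(0, 1)."""
    theta = a                      # saddle-point: kappa'(theta) = theta = a
    kappa = 0.5 * theta ** 2       # CGF of N(0, 1) at theta
    total = 0.0
    for _ in range(num_samples):
        # Under the tilted measure each X_i is N(theta, 1)
        s = sum(rng.gauss(theta, 1.0) for _ in range(n))
        if s > n * a:
            # Likelihood ratio exp(-theta * sum + n * kappa(theta))
            total += math.exp(-theta * s + n * kappa)
    return total / num_samples

rng = random.Random(1)
n, a = 10, 1.5
estimate = tilted_tail_estimate(n, a, 20000, rng)
# Exact value: the sum is N(0, n), so the tail is a normal survival function
exact = 0.5 * math.erfc(a * math.sqrt(n) / math.sqrt(2.0))
```

The target probability here is of order $10^{-6}$, so naive Monte Carlo with the same number of samples would almost always return zero, whereas under the tilted measure roughly half of the samples land in the event of interest.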

Stochastic processes

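Given the tilting of a normal random variable, it is intuitive that the exponential tilting of $X_t$, a Brownian motion with drift $\mu$ and variance $\sigma^2$, is a Brownian motion with drift $\mu + \theta\sigma^2$ and variance $\sigma^2$. Thus, any Brownian motion with drift under $\mathbb{P}$ can be thought of as a Brownian motion without drift under $\mathbb{P}_{\theta^*}$, where $\theta^* = -\mu/\sigma^2$. To observe this, consider the process $X_t = B_t + \mu t$ with $\sigma = 1$. Then $f(X_t) = f_{\theta^*}(X_t)\frac{d\mathbb{P}}{d\mathbb{P}_{\theta^*}} = f_{\theta^*}(X_t)\exp\{\mu X_t - \frac{1}{2}\mu^2 t\}$. The likelihood ratio term, $\exp\{\mu X_t - \frac{1}{2}\mu^2 t\}$, is a martingale under $\mathbb{P}_{\theta^*}$ and is commonly denoted $M_t$. Thus, a Brownian motion with drift process (as well as many other continuous processes adapted to the Brownian filtration) is a $\mathbb{P}_{\theta^*}$-martingale.[10][11]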

Stochastic Differential Equations

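The above leads to the alternate representation of the stochastic differential equation $dX(t) = \mu(t)\,dt + \sigma(t)\,dB(t)$: $dX_\theta(t) = \mu_\theta(t)\,dt + \sigma(t)\,dB(t)$, where $\mu_\theta(t) = \mu(t) + \theta\sigma(t)$. Girsanov's formula gives the likelihood ratio $\frac{d\mathbb{P}}{d\mathbb{P}_\theta} = \exp\left\{-\int_0^T \frac{\mu_\theta(t) - \mu(t)}{\sigma(t)}\,dB(t) + \frac{1}{2}\int_0^T \left(\frac{\mu_\theta(t) - \mu(t)}{\sigma(t)}\right)^2 dt\right\}$, where $B$ is a Brownian motion under $\mathbb{P}$. Therefore, Girsanov's formula can be used to implement importance sampling for certain SDEs.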

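Tilting can also be useful for simulating a process $X(t)$ via rejection sampling of the SDE $dX(t) = \mu(X(t))\,dt + dB(t)$. We may focus on the SDE since we know that $X(t)$ can be written $\int_0^t dX(s) + X(0)$. As previously stated, a Brownian motion with drift can be tilted to a Brownian motion without drift. Therefore, we choose $\mathbb{P}_{\text{proposal}} = \mathbb{P}_{\theta^*}$. The likelihood ratio is $\frac{d\mathbb{P}}{d\mathbb{P}_{\theta^*}}(X(s) : 0 \le s \le t) = \exp\left\{\int_0^t \mu(X(s))\,dX(s) - \frac{1}{2}\int_0^t \mu(X(s))^2\,ds\right\}$. This likelihood ratio will be denoted $M(t)$. To ensure this is a true likelihood ratio, it must be shown that $\mathbb{E}_{\theta^*}[M(t)] = 1$. Assuming this condition holds, it can be shown that $f_{X(t)}(y) = f^{\theta^*}_{X(t)}(y)\,\mathbb{E}_{\theta^*}[M(t) \mid X(t) = y]$. So, rejection sampling prescribes that one samples from a standard Brownian motion and accepts with probability $\frac{f_{X(t)}(y)}{f^{\theta^*}_{X(t)}(y)}\frac{1}{c} = \frac{1}{c}\,\mathbb{E}_{\theta^*}[M(t) \mid X(t) = y]$, where $c$ is an upper bound on this ratio.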

Choice of tilting parameter

Siegmund's algorithm

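Assume i.i.d. $X$'s with light-tailed distribution and $\mathbb{E}[X] < 0$. In order to estimate $\psi(c) = \mathbb{P}(\tau(c) < \infty)$ where $\tau(c) = \inf\{t : \sum_{i=1}^{t} X_i > c\}$, when $c$ is large and hence $\psi(c)$ small, the algorithm uses exponential tilting to derive the importance distribution. The algorithm is used in many settings, such as sequential tests[12] and G/G/1 queue waiting times, and $\psi$ is the probability of ultimate ruin in ruin theory. In this context, it is logical to ensure that $\mathbb{P}_\theta(\tau(c) < \infty) = 1$. The criterion $\theta > \theta_0$, where $\theta_0$ is such that $\kappa'(\theta_0) = 0$, achieves this. Siegmund's algorithm uses $\theta = \theta^*$, if it exists, where $\theta^*$ is the nonzero solution of $\kappa(\theta^*) = 0$. It has been shown that $\theta^*$ is the only tilting parameter producing bounded relative error ($\limsup_{c \to \infty} \operatorname{Var}_\theta[Z(c)]/\psi(c)^2 < \infty$, where $Z(c)$ denotes the importance sampling estimator of $\psi(c)$).[13]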
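A minimal sketch of this algorithm is given below for a random walk with negative drift. It assumes increments $X_i \sim N(-m, 1)$ with $m > 0$, for which $\kappa(\theta) = -m\theta + \theta^2/2$, the nonzero root is $\theta^* = 2m$, and the tilted increments are $N(m, 1)$. The function name, seed, and parameter values are illustrative and are not taken from the cited sources.

```python
import math
import random

def siegmund_estimate(m, c, num_paths, rng):
    """Estimate psi(c) = P(random walk ever exceeds c) for N(-m, 1) increments, m > 0."""
    theta_star = 2.0 * m   # nonzero root of kappa(theta) = -m * theta + theta**2 / 2
    total = 0.0
    for _ in range(num_paths):
        s = 0.0
        while s <= c:
            # Tilted increments are N(-m + theta_star, 1) = N(m, 1): the drift is
            # positive, so the level c is crossed with probability one.
            s += rng.gauss(m, 1.0)
        # Likelihood ratio at the stopping time; kappa(theta_star) = 0 removes
        # the term that would otherwise depend on the number of steps.
        total += math.exp(-theta_star * s)
    return total / num_paths

rng = random.Random(2)
m, c = 0.5, 10.0
psi_hat = siegmund_estimate(m, c, 5000, rng)
upper_bound = math.exp(-2.0 * m * c)   # exp(-theta_star * c)
```

Because the walk is stopped strictly above $c$, every sampled weight is below $e^{-\theta^* c}$, so the estimate always respects this bound, and its variability comes only from the overshoot over the level $c$.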

Black-Box algorithms

In a black-box setting, only the input and output of a system can be observed, and an algorithm may use only minimal information about its structure. When random numbers are generated in this way, the output distribution may not belong to a common parametric class such as the normal or exponential distributions, so an automated way of performing ECM is needed. Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution $G$; for simplicity we assume $X \ge 0$. Define $\mathfrak{F}_n = \sigma(X_1, \ldots, X_n, U_1, \ldots, U_n)$, where $U_1, U_2, \ldots$ are independent uniforms on $(0, 1)$. A randomized stopping time for $X_1, X_2, \ldots$ is then a stopping time with respect to the filtration $\{\mathfrak{F}_n\}$. Let further $\mathfrak{G}$ be a class of distributions $G$ on $[0, \infty)$ with $\kappa_G = \log\int_0^\infty e^{\theta x}\,G(dx) < \infty$ and define $G_\theta$ by $\frac{dG_\theta}{dG}(x) = e^{\theta x - \kappa_G}$. We define a black-box algorithm for ECM for the given $\theta$ and the given class $\mathfrak{G}$ of distributions as a pair of a randomized stopping time $\tau$ and an $\mathfrak{F}_\tau$-measurable random variable $Z$ such that $Z$ is distributed according to $G_\theta$ for any $G \in \mathfrak{G}$. Formally, we write this as $\mathbb{P}_G(Z < x) = G_\theta(x)$ for all $x$. In other words, the rules of the game are that the algorithm may use simulated values from $G$ and additional uniforms to produce a random variable from $G_\theta$.[14]

References

  1. Gerber, H.U. & Shiu, E.S.W. (1994). "Option pricing by Esscher transforms". Transactions of the Society of Actuaries. 46: 99–191.
  2. Cruz, Marcelo (2015). Fundamental Aspects of Operational Risk and Insurance Analytics. Wiley. pp. 784–796. ISBN 978-1-118-11839-9.
  3. Butler, Ronald (2007). Saddlepoint Approximations with Applications. Cambridge University Press. p. 156. ISBN 9780521872508.
  4. Siegmund, D. (1976). "Importance Sampling in the Monte Carlo Study of Sequential Tests". The Annals of Statistics. 4 (4): 673–684. doi:10.1214/aos/1176343541.
  5. Asmussen, Soren & Glynn, Peter (2007). Stochastic Simulation. Springer. p. 130. ISBN 978-0-387-30679-7.
  6. Fuh, Cheng-Der; Teng, Huei-Wen; Wang, Ren-Her (2013). "Efficient Importance Sampling for Rare Event Simulation with Applications".
  7. Asmussen, Soren & Glynn, Peter (2007). Stochastic Simulation. Springer. pp. 164–167. ISBN 978-0-387-30679-7.
  8. Butler, Ronald (2007). Saddlepoint Approximations with Applications. Cambridge University Press. pp. 156–157. ISBN 9780521872508.
  9. Seeber, G.U.H. (1992). Advances in GLIM and Statistical Modelling. Springer. pp. 195–200. ISBN 978-0-387-97873-4.
  10. Asmussen, Soren & Glynn, Peter (2007). Stochastic Simulation. Springer. p. 407. ISBN 978-0-387-30679-7.
  11. Steele, J. Michael (2001). Stochastic Calculus and Financial Applications. Springer. pp. 213–229. ISBN 978-1-4419-2862-7.
  12. Siegmund, D. (1985). Sequential Analysis. Springer-Verlag.
  13. Asmussen, Soren & Glynn, Peter (2007). Stochastic Simulation. Springer. pp. 164–167. ISBN 978-0-387-30679-7.
  14. Asmussen, Soren & Glynn, Peter (2007). Stochastic Simulation. Springer. pp. 416–420. ISBN 978-0-387-30679-7.