International Tables for Crystallography, Volume B: Reciprocal space
Edited by U. Shmueli

International Tables for Crystallography (2010). Vol. B, ch. 1.3, pp. 102–104.

Section 1.3.4.5.2.1. Analytical methods of probability theory

G. Bricogne

Global Phasing Ltd, Sheraton House, Suites 14–16, Castle Park, Cambridge CB3 0AX, England, and LURE, Bâtiment 209D, Université Paris-Sud, 91405 Orsay, France


The material in this section is not intended as an introduction to probability theory [for which the reader is referred to Cramér (1946), Petrov (1975) or Bhattacharya & Rao (1976)], but only as an illustration of the role played by the Fourier transformation in certain specific areas which are used in formulating and implementing direct methods of phase determination.

  • (a) Convolution of probability densities

    The addition of independent random variables or vectors leads to the convolution of their probability distributions: if \(\mathbf{X}_1\) and \(\mathbf{X}_2\) are two n-dimensional random vectors independently distributed with probability densities \(P_1\) and \(P_2\), respectively, then their sum \(\mathbf{X} = \mathbf{X}_1 + \mathbf{X}_2\) has probability density \(\mathscr{P}\) given by
    \[
    \mathscr{P}(\mathbf{X}) = \int_{\mathbb{R}^n} P_1(\mathbf{X}_1)\, P_2(\mathbf{X} - \mathbf{X}_1)\, \mathrm{d}^n\mathbf{X}_1 = \int_{\mathbb{R}^n} P_1(\mathbf{X} - \mathbf{X}_2)\, P_2(\mathbf{X}_2)\, \mathrm{d}^n\mathbf{X}_2,
    \]
    i.e.
    \[
    \mathscr{P} = P_1 * P_2.
    \]
    A numerical sketch of this convolution is given at the end of this item.

    This result can be extended to the case where \(P_1\) and \(P_2\) are singular measures (distributions of order zero, Section 1.3.2.3.4) and do not have a density with respect to the Lebesgue measure in \(\mathbb{R}^n\).
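    As a minimal numerical illustration of the absolutely continuous case (not from the text: the Gaussian and exponential test densities are hypothetical choices), the convolution integral can be approximated by a discrete convolution on a grid:

```python
import numpy as np

dx = 0.01
x = np.arange(-10.0, 10.0, dx)

# Two example densities (hypothetical choices): a unit Gaussian and a
# unit-rate exponential, both sampled on the same grid.
P1 = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
P2 = np.where(x >= 0, np.exp(-x), 0.0)

# Discrete analogue of the convolution integral; mode='same' keeps the
# result aligned with the grid (adequate here, since both tails are tiny).
P = np.convolve(P1, P2, mode="same") * dx

print("total mass of P1 * P2:", P.sum() * dx)  # close to 1, as for a density
```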

  • (b) Characteristic functions

    This convolution can be turned into a simple multiplication by considering the Fourier transforms (called the characteristic functions) of \(P_1\), \(P_2\) and \(\mathscr{P}\), defined with a slightly different normalization in that there is no factor of \(2\pi\) in the exponent (see Section 1.3.2.4.5), e.g.
    \[
    C(\mathbf{t}) = \int_{\mathbb{R}^n} P(\mathbf{X}) \exp(i\mathbf{t} \cdot \mathbf{X})\, \mathrm{d}^n\mathbf{X}.
    \]
    Then, by the convolution theorem,
    \[
    \mathscr{C}(\mathbf{t}) = C_1(\mathbf{t}) \times C_2(\mathbf{t}),
    \]
    so that \(\mathscr{P}(\mathbf{X})\) may be evaluated by Fourier inversion of its characteristic function as
    \[
    \mathscr{P}(\mathbf{X}) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} C_1(\mathbf{t})\, C_2(\mathbf{t}) \exp(-i\mathbf{t} \cdot \mathbf{X})\, \mathrm{d}^n\mathbf{t}
    \]
    (see Section 1.3.2.4.5 for the normalization factors).

    It follows from the differentiation theorem that the partial derivatives of the characteristic function \(C(\mathbf{t})\) at \(\mathbf{t} = \mathbf{0}\) are related to the moments of a distribution P by the identities
    \[
    \mu_{r_1 r_2 \ldots r_n} \equiv \int_{D} P(\mathbf{X})\, X_1^{r_1} X_2^{r_2} \ldots X_n^{r_n}\, \mathrm{d}^n\mathbf{X} = i^{-(r_1 + \ldots + r_n)} \left.\frac{\partial^{\,r_1 + \ldots + r_n} C}{\partial t_1^{r_1} \ldots \partial t_n^{r_n}}\right|_{\mathbf{t} = \mathbf{0}}
    \]
    for any n-tuple of non-negative integers \((r_1, r_2, \ldots, r_n)\).
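    The following sketch illustrates both results numerically in one dimension, under assumed Gaussian test densities: the characteristic functions are computed by direct quadrature, multiplied according to the convolution theorem, and Fourier-inverted at a single point.

```python
import numpy as np

dx, dt = 0.02, 0.05
x = np.arange(-15.0, 15.0, dx)
t = np.arange(-15.0, 15.0, dt)

P1 = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)        # N(0, 1)
P2 = np.exp(-(x - 1)**2 / 4) / np.sqrt(4 * np.pi)  # N(1, 2)

def char_fun(P):
    # C(t) = integral of P(X) exp(i t X) dX, by the rectangle rule
    return (P[None, :] * np.exp(1j * t[:, None] * x[None, :])).sum(axis=1) * dx

C = char_fun(P1) * char_fun(P2)  # convolution theorem: C = C1 x C2

# Fourier inversion at one point X; the sum N(0,1) + N(1,2) is N(1,3),
# whose density at X = 1 is 1/sqrt(6 pi), about 0.2303.
X = 1.0
P_X = (C * np.exp(-1j * t * X)).sum() * dt / (2 * np.pi)
print(P_X.real)
```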

  • (c) Moment-generating functions

    The above relation can be freed from powers of i by defining (at least formally) the moment-generating function
    \[
    M(\mathbf{t}) = \int_{\mathbb{R}^n} P(\mathbf{X}) \exp(\mathbf{t} \cdot \mathbf{X})\, \mathrm{d}^n\mathbf{X},
    \]
    which is related to \(C(\mathbf{t})\) by \(C(\mathbf{t}) = M(i\mathbf{t})\), so that the inversion formula reads
    \[
    \mathscr{P}(\mathbf{X}) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} M_1(i\mathbf{t})\, M_2(i\mathbf{t}) \exp(-i\mathbf{t} \cdot \mathbf{X})\, \mathrm{d}^n\mathbf{t}.
    \]
    The moment-generating function is well defined, in particular, for any probability distribution with compact support, in which case it may be continued analytically from a function over \(\mathbb{R}^n\) into an entire function of n complex variables by virtue of the Paley–Wiener theorem (Section 1.3.2.4.2.10). Its moment-generating properties are summed up in the following relations:
    \[
    \mu_{r_1 r_2 \ldots r_n} = \left.\frac{\partial^{\,r_1 + \ldots + r_n} M}{\partial t_1^{r_1} \ldots \partial t_n^{r_n}}\right|_{\mathbf{t} = \mathbf{0}}.
    \]
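    These relations can be checked symbolically in one dimension; the sketch below assumes the standard Gaussian moment-generating function \(M(t) = \exp(t^2/2)\) purely as a test case.

```python
import sympy as sp

t = sp.symbols('t')
M = sp.exp(t**2 / 2)  # MGF of N(0,1), used here as a test case

# mu_r is the r-th derivative of M at t = 0
for r in range(1, 7):
    mu_r = sp.diff(M, t, r).subs(t, 0)
    print(f"mu_{r} =", mu_r)  # 0, 1, 0, 3, 0, 15: the Gaussian moments
```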

  • (d) Cumulant-generating functions

    The multiplication of moment-generating functions may be further simplified into the addition of their logarithms:
    \[
    \log \mathscr{M} = \log M_1 + \log M_2,
    \]
    or equivalently of the coefficients of their Taylor series at \(\mathbf{t} = \mathbf{0}\), viz
    \[
    \kappa_{r_1 r_2 \ldots r_n} = \left.\frac{\partial^{\,r_1 + \ldots + r_n} (\log M)}{\partial t_1^{r_1} \ldots \partial t_n^{r_n}}\right|_{\mathbf{t} = \mathbf{0}}.
    \]
    These coefficients are called cumulants, since they add when the independent random vectors to which they belong are added, and log M is called the cumulant-generating function. The inversion formula for \(\mathscr{P}\) then reads
    \[
    \mathscr{P}(\mathbf{X}) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \exp[\log M_1(i\mathbf{t}) + \log M_2(i\mathbf{t}) - i\mathbf{t} \cdot \mathbf{X}]\, \mathrm{d}^n\mathbf{t}.
    \]
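    The additivity of cumulants can also be checked symbolically. The sketch below assumes two independent Poisson summands as a test case: the cumulant-generating function of a Poisson variable of mean λ is λ(e^t − 1), so every cumulant of the sum should equal λ1 + λ2.

```python
import sympy as sp

t, lam1, lam2 = sp.symbols('t lam1 lam2', positive=True)
logM1 = lam1 * (sp.exp(t) - 1)  # cumulant-generating function of Poisson(lam1)
logM2 = lam2 * (sp.exp(t) - 1)

logM = logM1 + logM2  # log M of the sum: cumulants add under convolution
for r in range(1, 4):
    kappa_r = sp.diff(logM, t, r).subs(t, 0)
    print(f"kappa_{r} =", sp.simplify(kappa_r))  # lam1 + lam2 each time
```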

  • (e) Asymptotic expansions and limit theorems

    Consider an n-dimensional random vector X of the form
    \[
    \mathbf{X} = \mathbf{X}_1 + \mathbf{X}_2 + \ldots + \mathbf{X}_N,
    \]
    where the N summands are independent n-dimensional random vectors, identically distributed with probability density P. Then the distribution \(\mathscr{P}\) of X may be written in closed form as a Fourier transform:
    \[
    \mathscr{P}(\mathbf{X}) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} M^N(i\mathbf{t}) \exp(-i\mathbf{t} \cdot \mathbf{X})\, \mathrm{d}^n\mathbf{t} = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \exp[N \log M(i\mathbf{t}) - i\mathbf{t} \cdot \mathbf{X}]\, \mathrm{d}^n\mathbf{t},
    \]
    where
    \[
    M(\mathbf{t}) = \int_{\mathbb{R}^n} P(\mathbf{Y}) \exp(\mathbf{t} \cdot \mathbf{Y})\, \mathrm{d}^n\mathbf{Y}
    \]
    is the moment-generating function common to all the summands.

    This is an exact expression for \(\mathscr{P}\), which may be exploited analytically or numerically in certain favourable cases. Supposing for instance that P has compact support, then its characteristic function \(M(i\mathbf{t})\) can be sampled finely enough to accommodate the bandwidth of the support of \(\mathscr{P} = P^{*N}\) (this sampling rate clearly depends on N), so that the above expression for \(\mathscr{P}\) can be used for its numerical evaluation as the discrete Fourier transform of \(M^N(i\mathbf{t})\). This exact method is practical only for small values of the dimension n.
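    The sketch below carries out this exact method in one dimension for an assumed uniform density on [0, 1) and N = 10; the grid covers [0, N), the full support of \(P^{*N}\), so the discrete Fourier inversion is free of aliasing.

```python
import numpy as np

N = 10
dx = 0.01
x = np.arange(0.0, N, dx)        # grid covering [0, N), the support of P^{*N}
P = np.where(x < 1.0, 1.0, 0.0)  # uniform density on [0, 1)

C = np.fft.fft(P) * dx           # sampled characteristic function
                                 # (numpy's sign convention)
PN = np.fft.ifft(C**N).real / dx # Fourier inversion of M^N(i t)

# Central-limit check at the mean N/2: each summand has variance 1/12,
# so the Gaussian value there is 1/sqrt(2 pi N/12).
k = int(round((N / 2) / dx))
print(PN[k], 1.0 / np.sqrt(2 * np.pi * N / 12.0))
```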

    In all other cases some form of approximation must be used in the Fourier inversion of \(M^N(i\mathbf{t})\). For this purpose it is customary (Cramér, 1946) to expand the cumulant-generating function around \(\mathbf{t} = \mathbf{0}\) with respect to the carrying variables t:
    \[
    \log[M^N(i\mathbf{t})] = \sum_{\mathbf{r} \in \mathbb{N}^n} \frac{N\kappa_{\mathbf{r}}}{\mathbf{r}!} (i\mathbf{t})^{\mathbf{r}},
    \]
    where \(\mathbf{r} = (r_1, r_2, \ldots, r_n)\) is a multi-index (Section 1.3.2.2.3). The first-order terms may be eliminated by recentring \(\mathscr{P}\) around its vector of first-order cumulants
    \[
    \langle \mathbf{X} \rangle = \sum_{j=1}^{N} \langle \mathbf{X}_j \rangle,
    \]
    where \(\langle \cdot \rangle\) denotes the mathematical expectation of a random vector. The second-order terms may be grouped separately from the terms of third or higher order to give
    \[
    M^N(i\mathbf{t}) = \exp(-\tfrac{1}{2} N \mathbf{t}^T \mathbf{Q}\mathbf{t}) \times \exp\left\{\sum_{|\mathbf{r}| \geq 3} \frac{N\kappa_{\mathbf{r}}}{\mathbf{r}!} (i\mathbf{t})^{\mathbf{r}}\right\},
    \]
    where \(\mathbf{Q} = \nabla\nabla^T(\log M)\) is the covariance matrix of the multivariate distribution P. Expanding the second exponential gives rise to a series of terms of the form
    \[
    \exp(-\tfrac{1}{2} N \mathbf{t}^T \mathbf{Q}\mathbf{t}) \times \text{monomial in } t_1, t_2, \ldots, t_n,
    \]
    each of which may now be subjected to a Fourier transformation to yield a Hermite function of t (Section 1.3.2.4.4.2) with coefficients involving the cumulants κ of P. Taking the transformed terms in natural order gives an asymptotic expansion of \(\mathscr{P}\) for large N called the Gram–Charlier series of \(\mathscr{P}\), while grouping the terms according to increasing powers of \(1/\sqrt{N}\) gives another asymptotic expansion called the Edgeworth series of \(\mathscr{P}\). Both expansions comprise a leading Gaussian term which embodies the central-limit theorem:
    \[
    \mathscr{P}(\mathbf{E}) = \frac{1}{\sqrt{\det(2\pi\mathbf{Q})}} \exp(-\tfrac{1}{2}\mathbf{E}^T \mathbf{Q}^{-1}\mathbf{E}), \quad \text{where } \mathbf{E} = \frac{\mathbf{X} - \langle\mathbf{X}\rangle}{\sqrt{N}}.
    \]
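    The following sketch compares the leading Gaussian term with a first-order Edgeworth correction for a sum of N = 20 exponential variables, a hypothetical test case chosen because the exact distribution of the sum is then a gamma distribution (and, with unit variance per summand, the standardized variable coincides with E above).

```python
import numpy as np
from scipy.stats import gamma, norm

# Exp(1) summands: mean 1, variance 1, third cumulant 2 (assumed test case).
N, mu, sigma, kappa3 = 20, 1.0, 1.0, 2.0
lam3 = kappa3 / sigma**3  # skewness of one summand

z = 1.5                   # standardized argument (X - <X>)/(sigma sqrt(N))
He3 = z**3 - 3 * z        # probabilists' Hermite polynomial He_3

gauss = norm.pdf(z)                                     # central-limit term
edgeworth = gauss * (1 + lam3 / (6 * np.sqrt(N)) * He3) # first correction
# Exact density of the standardized sum, from Gamma(N, 1):
exact = gamma.pdf(N * mu + sigma * np.sqrt(N) * z, a=N) * sigma * np.sqrt(N)

print(f"exact {exact:.5f}  gauss {gauss:.5f}  edgeworth {edgeworth:.5f}")
```

    Here the Edgeworth value lies visibly closer to the exact one than the plain Gaussian does, as the grouping by powers of \(1/\sqrt{N}\) suggests it should.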

  • (f) The saddlepoint approximation

    A limitation of the Edgeworth series is that it gives an accurate estimate of \(\mathscr{P}(\mathbf{X})\) only in the vicinity of \(\mathbf{X} = \langle\mathbf{X}\rangle\), i.e. for small values of E. These convergence difficulties are easily understood: one is substituting a local approximation to log M (viz a Taylor-series expansion valid near \(\mathbf{t} = \mathbf{0}\)) into an integral, whereas integration is a global process which consults values of log M far from \(\mathbf{t} = \mathbf{0}\).

    It is possible, however, to let the point t where log M is expanded as a Taylor series depend on the particular value \(\mathbf{X}^*\) of X for which an accurate evaluation of \(\mathscr{P}(\mathbf{X})\) is desired. This is the essence of the saddlepoint method (Fowler, 1936; Khinchin, 1949; Daniels, 1954; de Bruijn, 1970; Bleistein & Handelsman, 1986), which uses an analytical continuation of \(M(\mathbf{t})\) from a function over \(\mathbb{R}^n\) to a function over \(\mathbb{C}^n\) (see Section 1.3.2.4.2.10). Putting then \(\mathbf{t} = \mathbf{s} - i\boldsymbol{\tau}\), the \(\mathbb{C}^n\) version of Cauchy's theorem (Hörmander, 1973) gives rise to the identity
    \[
    \mathscr{P}(\mathbf{X}^*) = \frac{\exp(-\boldsymbol{\tau} \cdot \mathbf{X}^*)}{(2\pi)^n} \int_{\mathbb{R}^n} \exp\left\{N\left[\log M(\boldsymbol{\tau} + i\mathbf{s}) - i\mathbf{s} \cdot \frac{\mathbf{X}^*}{N}\right]\right\} \mathrm{d}^n\mathbf{s}
    \]
    for any \(\boldsymbol{\tau} \in \mathbb{R}^n\). By a convexity argument involving the positive-definiteness of the covariance matrix Q, there is a unique value of τ such that
    \[
    \nabla(\log M)\big|_{\mathbf{t} = -i\boldsymbol{\tau}} = \frac{\mathbf{X}^*}{N}.
    \]
    At the saddlepoint \(\mathbf{t}^* = -i\boldsymbol{\tau}\), the modulus of the integrand above is a maximum and its phase is stationary with respect to the integration variable s: as N tends to infinity, all contributions to the integral cancel because of rapid oscillation, except those coming from the immediate vicinity of \(\mathbf{t}^*\), where there is no oscillation. A Taylor expansion of \(\log M^N\) to second order with respect to s at \(\mathbf{t}^*\) then gives
    \[
    \log M^N(\boldsymbol{\tau} + i\mathbf{s}) \approx \log M^N(\boldsymbol{\tau}) + i\mathbf{s} \cdot \mathbf{X}^* - \frac{N}{2}[\mathbf{s}^T \mathbf{Q}\mathbf{s}]
    \]
    and hence
    \[
    \mathscr{P}(\mathbf{X}^*) \approx \exp[\log M^N(\boldsymbol{\tau}) - \boldsymbol{\tau} \cdot \mathbf{X}^*] \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \exp(-\tfrac{1}{2}\mathbf{s}^T \mathscr{Q}\mathbf{s})\, \mathrm{d}^n\mathbf{s}.
    \]
    The last integral is elementary and gives the `saddlepoint approximation'
    \[
    \mathscr{P}^{\rm SP}(\mathbf{X}^*) = \frac{\exp(\mathsf{S})}{\sqrt{\det(2\pi\mathscr{Q})}},
    \]
    where
    \[
    \mathsf{S} = \log M^N(\boldsymbol{\tau}) - \boldsymbol{\tau} \cdot \mathbf{X}^*
    \]
    and where
    \[
    \mathscr{Q} = \nabla\nabla^T(\log M^N) = N\mathbf{Q}.
    \]
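    In one dimension the whole scheme can be made explicit. The sketch below assumes a sum of N exponential variables as a test case, for which \(M(t) = 1/(1-t)\) for \(t < 1\) and the saddlepoint equation \(N(\log M)'(\tau) = X^*\) solves in closed form to \(\tau = 1 - N/X^*\); the exact density of the sum is Gamma(N, 1), available for comparison.

```python
import numpy as np
from scipy.stats import gamma

N, X_star = 20, 26.0

tau = 1.0 - N / X_star                      # root of N/(1 - tau) = X*
S = -N * np.log(1.0 - tau) - tau * X_star   # log M^N(tau) - tau . X*
Q = N / (1.0 - tau)**2                      # (log M^N)''(tau), the 'script Q'

P_sp = np.exp(S) / np.sqrt(2 * np.pi * Q)   # saddlepoint approximation
P_exact = gamma.pdf(X_star, a=N)            # exact density of the sum

print(f"saddlepoint {P_sp:.5f}   exact {P_exact:.5f}")
```

    Even though \(X^* = 26\) lies well away from the mean \(\langle X \rangle = 20\), the two printed values agree to within a fraction of a per cent, which is precisely the regime where the Edgeworth series deteriorates.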

    This approximation scheme amounts to using the `conjugate distribution' (Khinchin, 1949)
    \[
    P_{\boldsymbol{\tau}}(\mathbf{X}_j) = P(\mathbf{X}_j) \frac{\exp(\boldsymbol{\tau} \cdot \mathbf{X}_j)}{M(\boldsymbol{\tau})}
    \]
    instead of the original distribution \(P(\mathbf{X}_j) = P_{\mathbf{0}}(\mathbf{X}_j)\) for the common distribution of all N random vectors \(\mathbf{X}_j\). The exponential modulation results from the analytic continuation of the characteristic (or moment-generating) function into \(\mathbb{C}^n\), as in Section 1.3.2.4.2.10. The saddlepoint approximation \(\mathscr{P}^{\rm SP}\) is only the leading term of an asymptotic expansion (called the saddlepoint expansion) for \(\mathscr{P}\), which is actually the Edgeworth expansion associated with \(P_{\boldsymbol{\tau}}^{*N}\).
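    A numerical sketch of this conjugate distribution, continuing the exponential example above: tilting P by τ reweights it by \(\exp(\tau X)/M(\tau)\), and at the saddlepoint value of τ the mean of the tilted distribution equals \(X^*/N\), i.e. the conjugate distribution is recentred on the point of interest.

```python
import numpy as np

dx = 0.001
x = np.arange(0.0, 60.0, dx)
P = np.exp(-x)                            # Exp(1) density (test case as above)

tau = 1.0 - 20 / 26.0                     # saddlepoint value from the sketch above
M_tau = (P * np.exp(tau * x)).sum() * dx  # M(tau); analytically 1/(1 - tau)
P_tau = P * np.exp(tau * x) / M_tau       # conjugate (tilted) density

# Mean under P_tau should match X*/N = 26/20 = 1.3
print((x * P_tau).sum() * dx, 26.0 / 20)
```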

References

Bhattacharya, R. N. & Rao, R. R. (1976). Normal Approximation and Asymptotic Expansions. New York: John Wiley.
Bleistein, N. & Handelsman, R. A. (1986). Asymptotic Expansions of Integrals. New York: Dover Publications.
Bruijn, N. G. de (1970). Asymptotic Methods in Analysis, 3rd ed. Amsterdam: North-Holland.
Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.
Daniels, H. E. (1954). Saddlepoint approximation in statistics. Ann. Math. Stat. 25, 631–650.
Fowler, R. H. (1936). Statistical Mechanics, 2nd ed. Cambridge University Press.
Hörmander, L. (1973). An Introduction to Complex Analysis in Several Variables, 2nd ed. Amsterdam: North-Holland.
Khinchin, A. I. (1949). Mathematical Foundations of Statistical Mechanics. New York: Dover Publications.
Petrov, V. V. (1975). Sums of Independent Random Variables. Berlin: Springer-Verlag.
