International
Tables for
Crystallography
Volume B
Reciprocal space
Edited by U. Shmueli

International Tables for Crystallography (2010). Vol. B, ch. 1.3, pp. 102-106

Section 1.3.4.5.2. Application to probability theory and direct methods

G. Bricogne

Global Phasing Ltd, Sheraton House, Suites 14–16, Castle Park, Cambridge CB3 0AX, England, and LURE, Bâtiment 209D, Université Paris-Sud, 91405 Orsay, France

1.3.4.5.2. Application to probability theory and direct methods

The Fourier transformation plays a central role in the branch of probability theory concerned with the limiting behaviour of sums of large numbers of independent and identically distributed random variables or random vectors. This privileged role is a consequence of the convolution theorem and of the `moment-generating' properties which follow from the exchange between differentiation and multiplication by monomials. When the limit theorems are applied to the calculation of joint probability distributions of structure factors, which are themselves closely related to the Fourier transformation, a remarkable phenomenon occurs, which leads to the saddlepoint approximation and to the maximum-entropy method.

1.3.4.5.2.1. Analytical methods of probability theory

The material in this section is not intended as an introduction to probability theory [for which the reader is referred to Cramér (1946), Petrov (1975) or Bhattacharya & Rao (1976)], but only as an illustration of the role played by the Fourier transformation in certain specific areas which are used in formulating and implementing direct methods of phase determination.

  • (a) Convolution of probability densities

    The addition of independent random variables or vectors leads to the convolution of their probability distributions: if [{\bf X}_{1}] and [{\bf X}_{2}] are two n-dimensional random vectors independently distributed with probability densities [P_{1}] and [P_{2}], respectively, then their sum [{\bf X} = {\bf X}_{1} + {\bf X}_{2}] has probability density [{\scr P}] given by[\eqalign{{\scr P}({\bf X}) &= {\textstyle\int\limits_{{\bb R}^{n}}} P_{1}({\bf X}_{1}) P_{2}({\bf X} - {\bf X}_{1}) \hbox{ d}^{n}{\bf X}_{1}\cr &= {\textstyle\int\limits_{{\bb R}^{n}}} P_{1}({\bf X} - {\bf X}_{2}) P_{2}({\bf X}_{2}) \hbox{ d}^{n}{\bf X}_{2}}]i.e.[{\scr P} = P_{1} * P_{2}.]

    This result can be extended to the case where [P_{1}] and [P_{2}] are singular measures (distributions of order zero, Section 1.3.2.3.4) and do not have a density with respect to the Lebesgue measure in [{\bb R}^{n}].

  • (b) Characteristic functions

    This convolution can be turned into a simple multiplication by considering the Fourier transforms (called the characteristic functions) of [P_{1}], [P_{2}] and [{\scr P}], defined with a slightly different normalization in that there is no factor of [2\pi] in the exponent (see Section 1.3.2.4.5), e.g.[C({\bf t}) = {\textstyle\int\limits_{{\bb R}^{n}}} P({\bf X}) \exp (i{\bf t} \cdot {\bf X}) \hbox{ d}^{n}{\bf X}.]Then by the convolution theorem[{\scr C}({\bf t}) = C_{1}({\bf t})\times C_{2}({\bf t}),]so that [{\scr P}({\bf X})] may be evaluated by Fourier inversion of its characteristic function as[{\scr P}({\bf X}) = {1 \over (2\pi)^{n}} \int\limits_{{\bb R}^{n}} C_{1}({\bf t}) C_{2}({\bf t}) \exp (-i{\bf t} \cdot {\bf X}) \hbox{ d}^{n}{\bf t}](see Section 1.3.2.4.5 for the normalization factors).

    It follows from the differentiation theorem that the partial derivatives of the characteristic function [C({\bf t})] at [{\bf t} = {\bf 0}] are related to the moments of a distribution P by the identities[\eqalign{\mu_{r_{1}r_{2}\ldots r_{n}} &\equiv {\int\limits_{{\bb R}^{n}}} P({\bf X}) X_{1}^{r_{1}} X_{2}^{r_{2}}\ldots X_{n}^{r_{n}} \hbox{ d}^{n}{\bf X}\cr &= i^{-(r_{1} + \ldots + r_{n})} \left.{\partial^{\,r_{1} + \ldots + r_{n}} C \over \partial t_{1}^{r_{1}} \ldots \partial t_{n}^{r_{n}}}\right|_{{\bf t} = {\bf 0}}}]for any n-tuple of non-negative integers [(r_{1}, r_{2}, \ldots, r_{n})].
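    The convolution theorem of (a) and (b) can be checked numerically on a discrete analogue. The sketch below is only an illustration under stated assumptions: the two summands are fair dice (a hypothetical choice, not taken from the text), and the discrete Fourier transform of NumPy, whose exponent has the opposite sign to the convention used above, stands in for the characteristic function.

```python
import numpy as np

# Probability mass function of one fair die; the array index is the value (0..6).
p1 = np.array([0, 1, 1, 1, 1, 1, 1], dtype=float) / 6.0
p2 = p1.copy()

# Direct convolution: distribution of the sum X = X1 + X2 (values 0..12).
direct = np.convolve(p1, p2)

# Same distribution via the convolution theorem: multiply the discrete
# characteristic functions, then invert. Zero-padding to the full support
# of the sum prevents wrap-around (aliasing).
size = len(p1) + len(p2) - 1
c1 = np.fft.fft(p1, n=size)
c2 = np.fft.fft(p2, n=size)
via_fft = np.fft.ifft(c1 * c2).real

print(np.allclose(direct, via_fft))
```

    The sign convention of the transform does not matter here, since the same convention is used for the forward and inverse steps.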

  • (c) Moment-generating functions

    The above relation can be freed from powers of i by defining (at least formally) the moment-generating function:[M({\bf t}) = {\textstyle\int\limits_{{\bb R}^{n}}} P({\bf X}) \exp ({\bf t} \cdot {\bf X}) \hbox{ d}^{n}{\bf X}]which is related to [C({\bf t})] by [C({\bf t}) = {\bi M}(i{\bf t})] so that the inversion formula reads[{\scr P}({\bf X}) = {1 \over (2\pi)^{n}} \int\limits_{{\bb R}^{n}} M_{1}(i{\bf t}) M_{2}(i{\bf t}) \exp (-i{\bf t} \cdot {\bf X}) \hbox{ d}^{n}{\bf t}.]The moment-generating function is well defined, in particular, for any probability distribution with compact support, in which case it may be continued analytically from a function over [{\bb R}^{n}] into an entire function of n complex variables by virtue of the Paley–Wiener theorem (Section 1.3.2.4.2.10[link]). Its moment-generating properties are summed up in the following relations:[\mu_{r_{1}r_{2}\ldots r_{n}} = \left.{\partial^{\,r_{1} + \ldots + r_{n}} M \over \partial t_{1}^{r_{1}} \ldots \partial t_{n}^{r_{n}}}\right|_{{\bf t} = {\bf 0}}.]

  • (d) Cumulant-generating functions

    The multiplication of moment-generating functions may be further simplified into the addition of their logarithms:[\log {\scr M} = \log M_{1} + \log M_{2},]or equivalently of the coefficients of their Taylor series at [{\bf t} = {\bf 0}], viz:[\kappa_{r_{1}r_{2}\ldots r_{n}} = \left.{\partial^{\,r_{1} + \ldots + r_{n}} (\log M) \over \partial t_{1}^{r_{1}} \ldots \partial t_{n}^{r_{n}}}\right|_{{\bf t} = {\bf 0}}.]These coefficients are called cumulants, since they add when the independent random vectors to which they belong are added, and log M is called the cumulant-generating function. The inversion formula for [{\scr P}] then reads[{\scr P}({\bf X}) = {1 \over (2\pi)^{n}} \int\limits_{{\bb R}^{n}} \exp [\log M_{1}(i{\bf t}) + \log M_{2}(i{\bf t}) - i{\bf t} \cdot {\bf X}] \hbox{ d}^{n}{\bf t}.]
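    The additivity of cumulants can be illustrated on a small discrete example. The following is a minimal sketch, assuming two arbitrary (hypothetical) distributions on the non-negative integers; it uses the standard facts that the first three cumulants are the mean, the variance and the third central moment.

```python
import numpy as np

def cumulants(pmf):
    """First three cumulants of a distribution on the integers 0, 1, 2, ..."""
    values = np.arange(len(pmf))
    mean = np.sum(values * pmf)
    centred = values - mean
    variance = np.sum(centred**2 * pmf)
    third = np.sum(centred**3 * pmf)
    return np.array([mean, variance, third])

# Two hypothetical distributions, chosen only for illustration.
p1 = np.array([0.5, 0.3, 0.2])
p2 = np.array([0.1, 0.2, 0.3, 0.4])

# Distribution of the sum of the two independent variables.
p_sum = np.convolve(p1, p2)

k1, k2, k_sum = cumulants(p1), cumulants(p2), cumulants(p_sum)
print(np.allclose(k1 + k2, k_sum))
```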

  • (e) Asymptotic expansions and limit theorems

    Consider an n-dimensional random vector X of the form[{\bf X} = {\bf X}_{1} + {\bf X}_{2} + \ldots + {\bf X}_{N},]where the N summands are independent n-dimensional random vectors identically distributed with probability density P. Then the distribution [{\scr P}] of X may be written in closed form as a Fourier transform:[\eqalign{{\scr P}({\bf X}) &= {1 \over (2\pi)^{n}} \int\limits_{{\bb R}^{n}} M^{N} (i{\bf t}) \exp (-i{\bf t} \cdot {\bf X}) \hbox{ d}^{n}{\bf t}\cr &= {1 \over (2\pi)^{n}} \int\limits_{{\bb R}^{n}} \exp [N \log M(i{\bf t}) - i{\bf t} \cdot {\bf X}] \hbox{ d}^{n}{\bf t},}]where[M({\bf t}) = {\textstyle\int\limits_{{\bb R}^{n}}} P({\bf Y}) \exp ({\bf t} \cdot {\bf Y}) \hbox{ d}^{n}{\bf Y}]is the moment-generating function common to all the summands.

    This is an exact expression for [{\scr P}], which may be exploited analytically or numerically in certain favourable cases. Supposing for instance that P has compact support, then its characteristic function [{M}(i{\bf t})] can be sampled finely enough to accommodate the bandwidth of the support of [{\scr P} = P^{*N}] (this sampling rate clearly depends on n) so that the above expression for [{\scr P}] can be used for its numerical evaluation as the discrete Fourier transform of [{M}^{N}(i{\bf t})]. This exact method is practical only for small values of the dimension n.

    In all other cases some form of approximation must be used in the Fourier inversion of [{M}^{N}(i{\bf t})]. For this purpose it is customary (Cramér, 1946) to expand the cumulant-generating function around [{\bf t} = {\bf 0}] with respect to the carrying variables t:[\log [M^{N}(i{\bf t})] = \sum\limits_{{\bf r} \in {\bb N}^{n}} {N\kappa_{\bf r} \over {\bf r}!} (i{\bf t})^{{\bf r}},]where [{\bf r} = (r_{1}, r_{2}, \ldots, r_{n})] is a multi-index (Section 1.3.2.2.3). The first-order terms may be eliminated by recentring [{\scr P}] around its vector of first-order cumulants[\langle {\bf X}\rangle = {\textstyle\sum\limits_{j = 1}^{N}} \langle {\bf X}_{j}\rangle,]where [\langle \cdot \rangle] denotes the mathematical expectation of a random vector. The second-order terms may be grouped separately from the terms of third or higher order to give[\eqalign{M^{N}(i{\bf t}) &= \exp (-{\textstyle{1 \over 2}} N{\bf t}^{T}{\bf Qt})\cr &\quad \times \exp \left\{\sum\limits_{|{\bf r}| \geq 3} {N\kappa_{{\bf r}} \over {\bf r}!} (i{\bf t})^{{\bf r}}\right\},}]where [{\bf Q} = \nabla \nabla^{T}(\log {M})] is the covariance matrix of the multivariate distribution P. Expanding the exponential gives rise to a series of terms of the form[\exp (-{\textstyle{1 \over 2}} N{\bf t}^{T} {\bf Qt}) \times \hbox{monomial in } t_{1}, t_{2}, \ldots, t_{n},]each of which may now be subjected to a Fourier transformation to yield a Hermite function of t (Section 1.3.2.4.4.2) with coefficients involving the cumulants κ of P. Taking the transformed terms in natural order gives an asymptotic expansion of [{\scr P}] for large N called the Gram–Charlier series of [{\scr P}], while grouping the terms according to increasing powers of [1/\sqrt{N}] gives another asymptotic expansion called the Edgeworth series of [{\scr P}]. Both expansions comprise a leading Gaussian term which embodies the central-limit theorem:[{\scr P}({\bf E}) = {1 \over \sqrt{\det (2\pi {\bf Q})}} \exp (-{\textstyle{1 \over 2}} {\bf E}^{T} {\bf Q}^{-1} {\bf E}), \quad \hbox{where } {\bf E} = {{\bf X} - \langle {\bf X}\rangle \over \sqrt{N}}.]
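    The exact evaluation by discrete Fourier transform and the leading Gaussian term can be compared in one dimension. The sketch below assumes a hypothetical three-point distribution on {0, 1, 2} and N = 50 summands; neither value comes from the text. The Gaussian is written in the variable X, so its variance is N times that of a single summand.

```python
import numpy as np

N = 50
p = np.array([0.2, 0.5, 0.3])      # hypothetical distribution on {0, 1, 2}

# The N-fold convolution is supported on {0, ..., 2N}. Sampling the
# characteristic function on that many points ensures that raising it to
# the N-th power produces no aliasing on inversion.
size = N * (len(p) - 1) + 1
char_fn = np.fft.fft(p, n=size)
exact = np.fft.ifft(char_fn**N).real

# Leading (central-limit) term: Gaussian with mean N*mean and variance N*var.
values = np.arange(len(p))
mean = np.sum(values * p)
var = np.sum((values - mean) ** 2 * p)
x = np.arange(size)
gauss = np.exp(-0.5 * (x - N * mean) ** 2 / (N * var)) / np.sqrt(2 * np.pi * N * var)

print(np.max(np.abs(exact - gauss)))
```

    The residual difference between the two arrays is what the higher-order Gram–Charlier or Edgeworth terms would account for.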

  • (f) The saddlepoint approximation

    A limitation of the Edgeworth series is that it gives an accurate estimate of [{\scr P}({\bf X})] only in the vicinity of [{\bf X} = \langle {\bf X}\rangle], i.e. for small values of E. These convergence difficulties are easily understood: one is substituting a local approximation to log M (viz a Taylor-series expansion valid near [{\bf t} = {\bf 0}]) into an integral, whereas integration is a global process which consults values of log M far from [{\bf t} = {\bf 0}].

    It is possible, however, to let the point t where log M is expanded as a Taylor series depend on the particular value [{\bf X}^{*}] of X for which an accurate evaluation of [{\scr P}({\bf X})] is desired. This is the essence of the saddlepoint method (Fowler, 1936; Khinchin, 1949; Daniels, 1954; de Bruijn, 1970; Bleistein & Handelsman, 1986), which uses an analytical continuation of [{M}({\bf t})] from a function over [{\bb R}^{n}] to a function over [{\bb C}^{n}] (see Section 1.3.2.4.2.10). Putting then [{\bf t} = {\bf s} - i\boldtau], the [{\bb C}^{n}] version of Cauchy's theorem (Hörmander, 1973) gives rise to the identity[\eqalign{{\scr P}({\bf X}^{*}) &= {\exp (-{\boldtau} \cdot {\bf X}^{*}) \over (2\pi)^{n}}\cr &\quad \times \int\limits_{{\bb R}^{n}} \exp \left\{N \left[\log M ({\boldtau} + i{\bf s}) - i{\bf s} \cdot {{\bf X}^{*} \over N}\right]\right\}\, \hbox{d}^{n}{\bf s}}]for any [\boldtau \in {\bb R}^{n}]. By a convexity argument involving the positive-definiteness of the covariance matrix Q, there is a unique value of τ such that[\nabla (\log M)|_{{\bf t} = {\bf 0} - i{\boldtau}} = {{\bf X}^{*} \over N}.]At the saddlepoint [{\bf t}^{*} = {\bf 0} - i\boldtau], the modulus of the integrand above is a maximum and its phase is stationary with respect to the integration variable s: as N tends to infinity, all contributions to the integral cancel because of rapid oscillation, except those coming from the immediate vicinity of [{\bf t}^{*}] where there is no oscillation. A Taylor expansion of log [{M}^{N}] to second order with respect to s at [{\bf t}^{*}] then gives[\log M^{N} (\boldtau + i{\bf s}) \approx \log M^{N} (\boldtau) + i{\bf s} \cdot {\bf X}^{*} - {N \over 2} [{\bf s}^{T} {\bf Qs}]]and hence[{\scr P}({\bf X}^{*}) \approx \exp [\log M^{N} (\boldtau) - \boldtau \cdot {\bf X}^{*}] {1 \over (2\pi)^{n}} \int\limits_{{\bb R}^{n}} \exp (-{\textstyle{1 \over 2}} {\bf s}^{T} {\scr Q}{\bf s}) \hbox{ d}^{n}{\bf s}.]The last integral is elementary and gives the `saddlepoint approximation':[{\scr P}^{\rm SP}({\bf X}^{*}) = {\exp (\hbox{\sf S}) \over \sqrt{\det (2\pi {\scr Q})}},]where[{\sf S} = \log M^{N} (\boldtau) - \boldtau \cdot {\bf X}^{*}]and where[{\scr Q} = \nabla \nabla^{T} (\log M^{N}) = N{\bf Q}.]

    This approximation scheme amounts to using the `conjugate distribution' (Khinchin, 1949)[P_{\boldtau}({\bf X}_{j}) = P({\bf X}_{j}) {\exp (\boldtau \cdot {\bf X}_{j}) \over M(\boldtau)}]instead of the original distribution [{\bi P}({\bf X}_{j}) = {\bi P}_{{\bf 0}}({\bf X}_{j})] for the common distribution of all N random vectors [{\bf X}_{j}]. The exponential modulation results from the analytic continuation of the characteristic (or moment-generating) function into [{\bb C}^{n}], as in Section 1.3.2.4.2.10. The saddlepoint approximation [{\scr P}^{\rm SP}] is only the leading term of an asymptotic expansion (called the saddlepoint expansion) for [{\scr P}], which is actually the Edgeworth expansion associated with [P_{\boldtau}^{*N}].
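    A one-dimensional case in which every step of the saddlepoint recipe has a closed form may help fix ideas. The sketch below assumes summands with the unit exponential distribution, for which M(t) = 1/(1 − t) for t < 1; this choice is an illustration only (the distribution does not have compact support, so M is not entire), made because the exact density of the sum, a Gamma density, is available for comparison.

```python
import math

def saddlepoint_density(x, n_terms):
    """Saddlepoint approximation to the density of a sum of n_terms
    independent Exp(1) variables, using M(t) = 1 / (1 - t) for t < 1."""
    tau = 1.0 - n_terms / x                     # solves N * d(log M)/dt = x
    log_m = -math.log(1.0 - tau)                # log M(tau)
    curvature = n_terms / (1.0 - tau) ** 2      # N * d^2(log M)/dt^2 at tau
    exponent = n_terms * log_m - tau * x        # the quantity S of the text
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * curvature)

def exact_density(x, n_terms):
    """Exact Gamma(n_terms, 1) density of the same sum."""
    return math.exp((n_terms - 1) * math.log(x) - x - math.lgamma(n_terms))

N = 10
ratios = [saddlepoint_density(x, N) / exact_density(x, N) for x in (2.0, 10.0, 30.0)]
print(ratios)
```

    In this particular example the ratio does not depend on x, far into the tails included, and differs from unity only by the relative error of Stirling's formula for (N − 1)!, roughly 1/(12N).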

1.3.4.5.2.2. The statistical theory of phase determination

The methods of probability theory just surveyed were applied to various problems formally similar to the crystallographic phase problem [e.g. the `problem of the random walk' of Pearson (1905)] by Rayleigh (1880, 1899, 1905, 1918, 1919) and Kluyver (1906). They became the basis of the statistical theory of communication with the classic papers of Rice (1944, 1945).

The Gram–Charlier and Edgeworth series were introduced into crystallography by Bertaut (1955a,b,c, 1956a) and by Klug (1958), respectively, who showed them to constitute the mathematical basis of numerous formulae derived by Hauptman & Karle (1953). The saddlepoint approximation was introduced by Bricogne (1984) and was shown to be related to variational methods involving the maximization of certain entropy criteria. This connection exhibits most of the properties of the Fourier transform at play simultaneously, and will now be described as a final illustration.

  • (a) Definitions and conventions

    Let H be a set of unique non-origin reflections h for a crystal with lattice Λ and space group G. Let H contain [n_{\rm a}] acentric and [n_{\rm c}] centric reflections. Structure-factor values attached to all reflections in H will comprise [n = 2n_{\rm a} + n_{\rm c}] real numbers. For h acentric, [\alpha_{{\bf h}}] and [\beta_{{\bf h}}] will be the real and imaginary parts of the complex structure factor; for h centric, [\gamma_{{\bf h}}] will be the real coordinate of the (possibly complex) structure factor measured along a real axis rotated by one of the two angles [\theta_{{\bf h}}], π apart, to which the phase is restricted modulo [2\pi] (Section 1.3.4.2.2.5). These n real coordinates will be arranged as a column vector containing the acentric then the centric data, i.e. in the order[\alpha_{1}, \beta_{1}, \alpha_{2}, \beta_{2}, \ldots, \alpha_{n_{\rm a}}, \beta_{n_{\rm a}}, \gamma_{1}, \gamma_{2}, \ldots, \gamma_{n_{\rm c}}.]

  • (b) Vectors of trigonometric structure-factor expressions

    Let [\boldxi({\bf x})] denote the vector of trigonometric structure-factor expressions associated with [{\bf x} \in D], where D denotes the asymmetric unit. These are defined as follows:[\let\normalbaselines\relax\openup2pt\matrix{\alpha_{{\bf h}} ({\bf x}) + i\beta_{{\bf h}} ({\bf x}) = \Xi ({\bf h},{\bf x})\hfill & \hbox{for } {\bf h} \hbox{ acentric}\hfill\cr \gamma_{{\bf h}} ({\bf x}) = \exp(- i\theta_{{\bf h}}) \Xi ({\bf h},{\bf x})\hfill & \hbox{for } {\bf h} \hbox{ centric},\hfill}]where[\Xi ({\bf h},{\bf x}) = {1 \over |G_{{\bf x}}|} \sum\limits_{g \in G} \exp \{2 \pi i{\bf h} \cdot [S_{g} ({\bf x})]\}.]

    According to the convention above, the coordinates of [\boldxi ({\bf x})] in [{\bb R}^{n}] will be arranged in a column vector as follows:[\eqalign{\boldxi_{2r - 1} ({\bf x}) &= \alpha_{{\bf h}_{r}} ({\bf x}) \quad \hbox{for } r = 1, \ldots, n_{\rm a},\cr \boldxi_{2r} ({\bf x}) &= \beta_{{\bf h}_{r}} ({\bf x}) \quad \hbox{for } r = 1, \ldots, n_{\rm a},\cr \boldxi_{n_{\rm a} + r} ({\bf x}) &= \gamma_{{\bf h}_{r}} ({\bf x}) \quad \hbox{for } r = n_{\rm a} + 1, \ldots, n_{\rm a} + n_{\rm c}.}]
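    The definition of Ξ(h, x) translates directly into a short routine. The sketch below is a hedged illustration: the function name, its argument layout and the choice of space group P-1 with a general position (so that |G_x| = 1) are assumptions made for the example, and the sum is taken over one representative operation per coset of the translation subgroup, written as S_g(x) = R_g x + t_g in fractional coordinates.

```python
import numpy as np

def trig_structure_factor(h, x, rotations, translations, stabiliser_order=1):
    """Xi(h, x) = (1/|G_x|) * sum over g of exp(2*pi*i h . S_g(x)),
    with S_g(x) = R_g x + t_g in fractional coordinates."""
    h = np.asarray(h, dtype=float)
    x = np.asarray(x, dtype=float)
    total = 0.0 + 0.0j
    for rot, trans in zip(rotations, translations):
        image = np.asarray(rot, dtype=float) @ x + np.asarray(trans, dtype=float)
        total += np.exp(2j * np.pi * np.dot(h, image))
    return total / stabiliser_order

# Space group P-1: identity and inversion, both without translation part.
identity = np.eye(3)
rotations = [identity, -identity]
translations = [np.zeros(3), np.zeros(3)]

h = np.array([1, 2, -1])              # hypothetical reflection
x = np.array([0.13, 0.27, 0.41])      # hypothetical general position
xi = trig_structure_factor(h, x, rotations, translations)
print(xi)
```

    In P-1 every reflection is centric with θ_h = 0 or π, so Ξ is real and equal to 2 cos(2π h · x); the routine reproduces this.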

  • (c) Distributions of random atoms and moment-generating functions

    Let position x in D now become a random vector with probability density [m({\bf x})]. Then [\boldxi ({\bf x})] becomes itself a random vector in [{\bb R}^{n}], whose distribution [p (\boldxi)] is the image of distribution [m ({\bf x})] through the mapping [{\bf x} \rightarrow \boldxi ({\bf x})] just defined. The locus of [\boldxi ({\bf x})] in [{\bb R}^{n}] is a compact algebraic manifold [{\scr L}] (the multidimensional analogue of a Lissajous curve), so that p is a singular measure (a distribution of order 0, Section 1.3.2.3.4, concentrated on that manifold) with compact support. The average with respect to p of any function Ω over [{\bb R}^{n}] which is infinitely differentiable in a neighbourhood of [{\scr L}] may be calculated as an average with respect to m over D by the `induction formula':[\langle p, \Omega \rangle = {\textstyle\int\limits_{D}} m ({\bf x}) \Omega [\boldxi ({\bf x})] \hbox{ d}^{3} {\bf x}.]

    In particular, one can calculate the moment-generating function M for distribution p as[M ({\bf t}) \equiv \langle p_{\boldxi}, \exp ({\bf t} \cdot {\boldxi})\rangle = {\textstyle\int\limits_{D}} m ({\bf x}) \exp [{\bf t} \cdot {\boldxi} ({\bf x})] \hbox{ d}^{3} {\bf x}]and hence calculate the moments μ (respectively cumulants κ) of p by differentiation of M (respectively log M) at [{\bf t} = {\bf 0}]:[\eqalign{\mu_{r_{1} r_{2} \ldots r_{n}} &\equiv {\int\limits_{D}} m({\bf x}) \boldxi_{1}^{r_{1}} ({\bf x}) \boldxi_{2}^{r_{2}} ({\bf x}) \ldots \boldxi_{n}^{r_{n}} ({\bf x}) \hbox{ d}^{3} {\bf x}\cr &= {\partial^{\,r_{1} + \ldots + r_{n}} (M) \over \partial t_{1}^{r_{1}} \ldots \partial t_{n}^{r_{n}}}\cr \kappa_{r_{1} r_{2} \ldots r_{n}} &= {\partial^{\,r_{1} + \ldots + r_{n}} (\log M) \over \partial t_{1}^{r_{1}} \ldots \partial t_{n}^{r_{n}}}.}]The structure-factor algebra for group G (Section 1.3.4.2.2.9) then allows one to express products of [\boldxi]'s as linear combinations of other [\boldxi]'s, and hence to express all moments and cumulants of distribution [p(\boldxi)] as linear combinations of real and imaginary parts of Fourier coefficients of the prior distribution of atoms [m({\bf x})]. This plays a key role in the use of nonuniform distributions of atoms.
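    The induction formula lends itself to direct numerical quadrature. The sketch below uses a one-dimensional toy model which is an assumption of this example, not of the text: space group P-1 on the unit interval, asymmetric unit D = [0, 1/2], a uniform m(x) = 2 and a single centric reflection h, so that ξ(x) = 2 cos(2πhx). In this case M(t) equals the modified Bessel function I_0(2t), which provides an independent check through `np.i0`.

```python
import numpy as np

def mgf(t, h=1, n_grid=2000):
    """M(t) = integral over D of m(x) exp[t * xi(x)] dx by the midpoint rule,
    for the toy model D = [0, 1/2], m(x) = 2, xi(x) = 2 cos(2 pi h x)."""
    x = (np.arange(n_grid) + 0.5) / (2 * n_grid)    # midpoints of D
    dx = 0.5 / n_grid
    m = np.full(n_grid, 2.0)                        # uniform density on D
    xi = 2.0 * np.cos(2.0 * np.pi * h * x)
    return np.sum(m * np.exp(t * xi)) * dx

# Comparison with the closed form I_0(2t).
t = 0.7
print(mgf(t), np.i0(2.0 * t))

# Second cumulant by a central finite difference of log M at t = 0.
eps = 1e-3
kappa2 = (np.log(mgf(eps)) - 2.0 * np.log(mgf(0.0)) + np.log(mgf(-eps))) / eps**2
print(kappa2)
```

    The finite difference is used here only for brevity; in the text the cumulants are obtained analytically through structure-factor algebra.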

  • (d) The joint probability distribution of structure factors

    In the random-atom model of an equal-atom structure, N atoms are placed randomly, independently of each other, in the asymmetric unit D of the crystal with probability density [m({\bf x})]. For point atoms of unit weight, the vector F of structure-factor values for reflections [{\bf h} \in H] may be written[{\bf F} = {\textstyle\sum\limits_{I = 1}^{N}}\, {\boldxi}^{[I]},]where the N copies [\boldxi^{[I]}] of random vector ξ are independent and have the same distribution [p(\boldxi)].

    The joint probability distribution [{\scr P}({\bf F})] is then [Section 1.3.4.5.2.1(e)][{\scr P}({\bf F}) = {1 \over (2 \pi)^{n}} \int\limits_{{\bb R}^{n}} \exp [N \log M (i{\bf t}) - i{\bf t} \cdot {\bf F}] \hbox{ d}^{n} {\bf t}.]

    For low dimensionality n it is possible to carry out the Fourier transformation numerically after discretization, provided [M (i{\bf t})] is sampled sufficiently finely that no aliasing results from taking its Nth power (Barakat, 1974). This exact approach can also accommodate heterogeneity, and has been used first in the field of intensity statistics (Shmueli et al., 1984, 1985; Shmueli & Weiss, 1987, 1988), then in the study of the [\Sigma_{1}] and [\Sigma_{2}] relations in triclinic space groups (Shmueli & Weiss, 1985, 1986). Some of these applications are described in Chapter 2.1 of this volume. This method could be extended to the construction of any joint probability distribution (j.p.d.) in any space group by using the generic expression for the moment-generating function (m.g.f.) derived by Bricogne (1984). It is, however, limited to small values of n by the necessity to carry out n-dimensional FFTs on large arrays of sample values.

    The asymptotic expansions of Gram–Charlier and Edgeworth have good convergence properties only if [F_{{\bf h}}] lies in the vicinity of [\langle F_{{\bf h}}\rangle = N \bar{{\scr F}}[m]({\bf h})] for all [{\bf h} \in H]. Previous work on the j.p.d. of structure factors has used for [m({\bf x})] a uniform distribution, so that [\langle {\bf F}\rangle = {\bf 0}]; as a result, the corresponding expansions are accurate only if all moduli [|F_{{\bf h}}|] are small, in which case the j.p.d. contains little phase information.

    The saddlepoint method [Section 1.3.4.5.2.1(f)] constitutes the method of choice for evaluating the joint probability [{\scr P}({\bf F}^{*})] of structure factors when some of the moduli in [{\bf F}^{*}] are large. As shown previously, this approximation amounts to using the `conjugate distribution'[p_{{\boldtau}} ({\boldxi}) = p({\boldxi}) {\exp ({\boldtau} \cdot {\boldxi}) \over M (\boldtau)}]instead of the original distribution [p({\boldxi}) = p_{{\bf 0}} ({\boldxi})] for the distribution of random vector ξ. This conjugate distribution [p_{{\boldtau}}] is induced from the modified distribution of atoms[q_{{\boldtau}} ({\bf x}) = m({\bf x}) {\exp [{\boldtau} \cdot {\boldxi} ({\bf x})] \over M (\boldtau)}, \eqno(\hbox{SP}1)]where, by the induction formula, [M (\boldtau)] may be written as[M ({\boldtau}) = {\textstyle\int\limits_{D}} m({\bf x}) \exp [{\boldtau} \cdot {\boldxi} ({\bf x})] \hbox{ d}^{3} {\bf x} \eqno(\hbox{SP}2)]and where τ is the unique solution of the saddlepoint equation:[\nabla_{{\boldtau}} (\log M^{N}) = {\bf F}^{*}. \eqno(\hbox{SP}3)]The desired approximation is then[{\scr P}^{\rm SP} ({\bf F}^{*}) = {\exp ({\sf S}) \over \sqrt{\det (2 \pi {\scr Q})}},]where[{\sf S} = \log M^{N} ({\boldtau}) - {\boldtau} \cdot {\bf F}^{*}]and where[{\scr Q} = \nabla \nabla^{T} (\log M^{N}) = N{\bf Q}.]

    Finally, the elements of the Hessian matrix [{\bf Q} = \nabla \nabla^{T} (\log M)] are just the trigonometric second-order cumulants of distribution p, and hence can be calculated via structure-factor algebra from the Fourier coefficients of [q_{{\boldtau}} ({\bf x})]. All the quantities involved in the expression for [{\scr P}^{\rm SP} ({\bf F}^{*})] are therefore effectively computable from the initial data [m({\bf x})] and [{\bf F}^{*}].
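    Equations (SP1) to (SP3) can be solved numerically in a small case. The sketch below again assumes the one-dimensional toy model used for illustration only (D = [0, 1/2], uniform m(x) = 2, one centric reflection with ξ(x) = 2 cos(2πhx)); the function name and the use of Newton's method are choices of this example, not a description of any published implementation. The gradient and Hessian of log M are evaluated as the mean and variance of ξ under q_τ.

```python
import numpy as np

def solve_saddlepoint(f_target, n_atoms, h=1, n_grid=4000, max_iter=100):
    """Solve (SP3) for tau by Newton's method; return tau, q_tau on the grid,
    and the saddlepoint approximation to the probability density at f_target."""
    x = (np.arange(n_grid) + 0.5) / (2 * n_grid)    # midpoints of D = [0, 1/2]
    dx = 0.5 / n_grid
    m = np.full(n_grid, 2.0)                        # uniform prior on D
    xi = 2.0 * np.cos(2.0 * np.pi * h * x)

    def state(tau):
        weights = m * np.exp(tau * xi)
        mgf = np.sum(weights) * dx                  # (SP2)
        q = weights / mgf                           # (SP1)
        mean = np.sum(q * xi) * dx                  # d(log M)/d tau
        var = np.sum(q * (xi - mean) ** 2) * dx     # Q = d^2(log M)/d tau^2
        return mgf, q, mean, var

    tau = 0.0
    for _ in range(max_iter):
        mgf, q, mean, var = state(tau)
        residual = n_atoms * mean - f_target        # (SP3) residual
        if abs(residual) < 1e-10 * max(1.0, abs(f_target)):
            break
        tau -= residual / (n_atoms * var)           # Newton step
    mgf, q, mean, var = state(tau)
    log_prob = (n_atoms * np.log(mgf) - tau * f_target
                - 0.5 * np.log(2.0 * np.pi * n_atoms * var))
    return tau, q, np.exp(log_prob)

tau, q_tau, prob = solve_saddlepoint(f_target=20.0, n_atoms=20)
print(tau, prob)
```

    The target value must satisfy |F*| < 2N in this model, since |ξ| never exceeds 2. For F* = 0 the solution is τ = 0 and the result reduces to the central-limit value 1/√(4πN).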

  • (e) Maximum-entropy distributions of atoms

    One of the main results in Bricogne (1984) is that the modified distribution [q_{{\boldtau}} ({\bf x})] in (SP1) is the unique distribution which has maximum entropy [{\scr S}_{m} (q)] relative to [m({\bf x})], where[{\scr S}_{m} (q) = - \int\limits_{D} q({\bf x}) \log \left[{q({\bf x}) \over m({\bf x})}\right] \hbox{d}^{3} {\bf x},]under the constraint that [{\bf F}^{*}] be the centroid vector of the corresponding conjugate distribution [{\scr P}_{{\boldtau}} ({\bf F})]. The traditional notation of maximum-entropy (ME) theory (Jaynes, 1957, 1968, 1983) is in this case (Bricogne, 1984)[\eqalignno{q^{\rm ME} ({\bf x}) &= m({\bf x}) {\exp [\boldlambda \cdot {\boldxi} ({\bf x})] \over Z (\boldlambda)}&(\hbox{ME}1)\cr Z (\boldlambda) &= {\textstyle\int\limits_{D}} m({\bf x}) \exp [\boldlambda \cdot {\boldxi} ({\bf x})] \hbox{ d}^{3} {\bf x} &(\hbox{ME}2)\cr \nabla_{\lambda} (\log Z^{N}) &= {\bf F}^{*} &(\hbox{ME}3)\cr}]so that Z is identical to the m.g.f. M, and the coordinates [\boldtau] of the saddlepoint are the Lagrange multipliers λ for the constraints [{\bf F}^{*}].

    Jaynes's ME theory also gives an estimate for [{\scr P}({\bf F}^{*})]:[{\scr P}^{\rm ME} ({\bf F}^{*}) \approx \exp ({\scr S}),]where[{\scr S} = \log Z^{N} - \boldlambda \cdot {\bf F}^{*} = N {\scr S}_{m} (q^{\rm ME})]is the total entropy and is the counterpart to [{\sf S}] under the equivalence just established.

    [{\scr P}^{\rm ME}] is identical to [{\scr P}^{\rm SP}], but lacks the denominator. The latter, which is the normalization factor of a multivariate Gaussian with covariance matrix [{\scr Q}], may easily be seen to arise through Szegö's theorem (Sections 1.3.2.6.9.4, 1.3.4.2.1.10) from the extra logarithmic term in Stirling's formula[\log (q!) \approx q \log q - q + {\textstyle{1 \over 2}} \log (2 \pi q)](see, for instance, Reif, 1965) beyond the first two terms which serve to define entropy, since[{1 \over n} \log \det (2 \pi {\bf Q}) \approx {\int\limits_{{\bb R}^{3}/{\bb Z}^{3}}} \log 2 \pi q^{\rm ME} ({\bf x}) \hbox{ d}^{3} {\bf x}.]The relative effect of this extra normalization factor depends on the ratio[{n \over N} = {\hbox{dimension of {\bf F} over {\bb R}} \over \hbox{number of atoms}}.]

    The above relation between entropy maximization and the saddlepoint approximation is the basis of a Bayesian statistical approach to the phase problem (Bricogne, 1988) where the assumptions under which joint distributions of structure factors are sought incorporate many new ingredients (such as molecular boundaries, isomorphous substitutions, known fragments, noncrystallographic symmetries, multiple crystal forms) besides trial phase choices for basis reflections. The ME criterion intervenes in the construction of [q^{\rm ME} ({\bf x})] under these assumptions, and the distribution [q^{\rm ME} ({\bf x})] is a very useful computational intermediate in obtaining the approximate joint probability [{\scr P}^{\rm SP} ({\bf F}^{*})] and the associated conditional distributions and likelihood functions.
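    The identity between the total entropy log Z^N − λ · F* and N times the relative entropy of q^ME, stated above, can be verified numerically. The sketch below assumes the same one-dimensional toy model as a purely illustrative setting (D = [0, 1/2], uniform m(x) = 2, ξ(x) = 2 cos(2πx)) and picks an arbitrary multiplier λ, reading (ME3) as the definition of the constraint value F* that this λ fits.

```python
import numpy as np

n_atoms, lam, n_grid = 20, 0.6, 4000          # hypothetical values
x = (np.arange(n_grid) + 0.5) / (2 * n_grid)  # midpoints of D = [0, 1/2]
dx = 0.5 / n_grid
m = np.full(n_grid, 2.0)                      # uniform prior on D
xi = 2.0 * np.cos(2.0 * np.pi * x)

z = np.sum(m * np.exp(lam * xi)) * dx         # (ME2)
q_me = m * np.exp(lam * xi) / z               # (ME1)
f_star = n_atoms * np.sum(q_me * xi) * dx     # (ME3): constraint fitted by lam

# Relative entropy of q_me with respect to m, and total entropy.
relative_entropy = -np.sum(q_me * np.log(q_me / m)) * dx
total_entropy = n_atoms * np.log(z) - lam * f_star

print(total_entropy, n_atoms * relative_entropy)
```

    The two printed numbers agree to rounding error, and both are negative, since the relative entropy vanishes only when q^ME coincides with m.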

  • (f) Role of the Fourier transformation

    The formal developments presented above make use of the following properties of the Fourier transformation:

    • (i) the convolution theorem, which turns the convolution of probability distributions into the multiplication of their characteristic functions;

    • (ii) the differentiation property, which confers moment-generating properties to characteristic functions;

    • (iii) the reciprocity theorem, which allows the retrieval of a probability distribution from its characteristic or moment-generating function;

    • (iv) the Paley–Wiener theorem, which allows the analytic continuation of characteristic functions associated to probability distributions with compact support, and thus gives rise to conjugate families of distributions;

    • (v) Bertaut's structure-factor algebra (a discrete symmetrized version of the convolution theorem), which allows the calculation of all necessary moments and cumulants when the dimension n is small;

    • (vi) Szegö's theorem, which provides an asymptotic approximation of the normalization factor when n is large.

    This multi-faceted application seems an appropriate point at which to end this description of the Fourier transformation and of its use in crystallography.

References

Barakat, R. (1974). First-order statistics of combined random sinusoidal waves with applications to laser speckle patterns. Opt. Acta, 21, 903–921.
Bertaut, E. F. (1955a). La méthode statistique en cristallographie. I. Acta Cryst. 8, 537–543.
Bertaut, E. F. (1955b). La méthode statistique en cristallographie. II. Quelques applications. Acta Cryst. 8, 544–548.
Bertaut, E. F. (1955c). Fonction de répartition: application à l'approche directe des structures. Acta Cryst. 8, 823–832.
Bertaut, E. F. (1956a). Les groupes de translation non primitifs et la méthode statistique. Acta Cryst. 9, 322.
Bhattacharya, R. N. & Rao, R. R. (1976). Normal Approximation and Asymptotic Expansions. New York: John Wiley.
Bleistein, N. & Handelsman, R. A. (1986). Asymptotic Expansions of Integrals. New York: Dover Publications.
Bricogne, G. (1984). Maximum entropy and the foundations of direct methods. Acta Cryst. A40, 410–445.
Bricogne, G. (1988). A Bayesian statistical theory of the phase problem. I. A multichannel maximum entropy formalism for constructing generalised joint probability distributions of structure factors. Acta Cryst. A44, 517–545.
Bruijn, N. G. de (1970). Asymptotic Methods in Analysis, 3rd ed. Amsterdam: North-Holland.
Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.
Daniels, H. E. (1954). Saddlepoint approximation in statistics. Ann. Math. Stat. 25, 631–650.
Fowler, R. H. (1936). Statistical Mechanics, 2nd ed. Cambridge University Press.
Hauptman, H. & Karle, J. (1953). Solution of the Phase Problem. I. The Centrosymmetric Crystal. ACA Monograph No. 3. Pittsburgh: Polycrystal Book Service.
Hörmander, L. (1973). An Introduction to Complex Analysis in Several Variables, 2nd ed. Amsterdam: North-Holland.
Jaynes, E. T. (1957). Information theory and statistical mechanics. Phys. Rev. 106, 620–630.
Jaynes, E. T. (1968). Prior probabilities. IEEE Trans. SSC, 4, 227–241.
Jaynes, E. T. (1983). Papers on Probability, Statistics and Statistical Physics. Dordrecht: Kluwer Academic Publishers.
Khinchin, A. I. (1949). Mathematical Foundations of Statistical Mechanics. New York: Dover Publications.
Klug, A. (1958). Joint probability distributions of structure factors and the phase problem. Acta Cryst. 11, 515–543.
Kluyver, J. C. (1906). A local probability problem. K. Ned. Akad. Wet. Proc. 8, 341–350.
Pearson, K. (1905). The problem of the random walk. Nature (London), 72, 294, 342.
Petrov, V. V. (1975). Sums of Independent Random Variables. Berlin: Springer-Verlag.
Rayleigh (J. W. Strutt), Lord (1880). On the resultant of a large number of vibrations of the same pitch and arbitrary phase. Philos. Mag. 10, 73–78.
Rayleigh (J. W. Strutt), Lord (1899). On James Bernoulli's theorem in probabilities. Philos. Mag. 47, 246–251.
Rayleigh (J. W. Strutt), Lord (1905). The problem of the random walk. Nature (London), 72, 318.
Rayleigh (J. W. Strutt), Lord (1918). On the light emitted from a random distribution of luminous sources. Philos. Mag. 36, 429–449.
Rayleigh (J. W. Strutt), Lord (1919). On the problem of random flights in one, two or three dimensions. Philos. Mag. 37, 321–347.
Reif, F. (1965). Fundamentals of Statistical and Thermal Physics, Appendix A.6. New York: McGraw-Hill.
Rice, S. O. (1944, 1945). Mathematical analysis of random noise. Bell Syst. Tech. J. 23, 283–332 (parts I and II); 24, 46–156 (parts III and IV). [Reprinted in Selected Papers on Noise and Stochastic Processes (1954), edited by N. Wax, pp. 133–294. New York: Dover Publications.]
Shmueli, U. & Weiss, G. H. (1985). Exact joint probability distribution for centrosymmetric structure factors. Derivation and application to the Σ1 relationship in the space group [P\bar{1}]. Acta Cryst. A41, 401–408.
Shmueli, U. & Weiss, G. H. (1986). Exact joint distribution of [E_{\bf h}], [E_{\bf k}] and [E_{\bf h+k}], and the probability for the positive sign of the triple product in the space group [P{\bar {1}}]. Acta Cryst. A42, 240–246.
Shmueli, U. & Weiss, G. H. (1987). Exact random-walk models in crystallographic statistics. III. Distributions of [|E|] for space groups of low symmetry. Acta Cryst. A43, 93–98.
Shmueli, U. & Weiss, G. H. (1988). Exact random-walk models in crystallographic statistics. IV. P.d.f.'s of [|E|] allowing for atoms in special positions. Acta Cryst. A44, 413–417.
Shmueli, U., Weiss, G. H. & Kiefer, J. E. (1985). Exact random-walk models in crystallographic statistics. II. The bicentric distribution for the space group [P{\bar {1}}]. Acta Cryst. A41, 55–59.
Shmueli, U., Weiss, G. H., Kiefer, J. E. & Wilson, A. J. C. (1984). Exact random-walk models in crystallographic statistics. I. Space groups [P{\bar {1}}] and P1. Acta Cryst. A40, 651–660.







































