International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli.

International Tables for Crystallography (2006). Vol. B, ch. 1.3, pp. 25-98
https://doi.org/10.1107/97809553602060000551

Chapter 1.3. Fourier transforms in crystallography: theory, algorithms and applications

G. Bricogne

MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England, and LURE, Bâtiment 209D, Université Paris-Sud, 91405 Orsay, France

In the first part of this chapter, the mathematical theory of the Fourier transformation is cast in the language of Schwartz's theory of distributions, allowing Fourier transforms, Fourier series and discrete Fourier transforms to be treated together. Next the numerical computation of the discrete Fourier transform is discussed. One-dimensional algorithms are examined first, including the Cooley–Tukey algorithm, the Good (or prime factor) algorithm, the Rader algorithm and the Winograd algorithms. Multidimensional algorithms are then covered. The last part of the chapter surveys the crystallographic applications of Fourier transforms.

1.3.1. General introduction

Since the publication of Volume II of International Tables, most aspects of the theory, computation and applications of Fourier transforms have undergone considerable development, often to the point of being hardly recognizable.

The mathematical analysis of the Fourier transformation has been extensively reformulated within the framework of distribution theory, following Schwartz's work in the early 1950s.

The computation of Fourier transforms has been revolutionized by the advent of digital computers and of the Cooley–Tukey algorithm, and progress has been made at an ever-accelerating pace in the design of new types of algorithms and in optimizing their interplay with machine architecture.

These advances have transformed both theory and practice in several fields which rely heavily on Fourier methods; much of electrical engineering, for instance, has become digital signal processing.

By contrast, crystallography has remained relatively unaffected by these developments. From the conceptual point of view, old-fashioned Fourier series are still adequate for the quantitative description of X-ray diffraction, as this rarely entails consideration of molecular transforms between reciprocal-lattice points. From the practical point of view, three-dimensional Fourier transforms have mostly been used as a tool for visualizing electron-density maps, so that only moderate urgency was given to trying to achieve ultimate efficiency in these relatively infrequent calculations.

Recent advances in phasing and refinement methods, however, have placed renewed emphasis on concepts and techniques long used in digital signal processing, e.g. flexible sampling, Shannon interpolation, linear filtering, and interchange between convolution and multiplication. These methods are iterative in nature, and thus generate a strong incentive to design new crystallographic Fourier transform algorithms making the fullest possible use of all available symmetry to save both storage and computation.

As a result, need has arisen for a modern and coherent account of Fourier transform methods in crystallography which would provide:

  • (i) a simple and foolproof means of switching between the three different guises in which the Fourier transformation is encountered (Fourier transforms, Fourier series and discrete Fourier transforms), both formally and computationally;

  • (ii) an up-to-date presentation of the most important algorithms for the efficient numerical calculation of discrete Fourier transforms;

  • (iii) a systematic study of the incorporation of symmetry into the calculation of crystallographic discrete Fourier transforms;

  • (iv) a survey of the main types of crystallographic computations based on the Fourier transformation.

The rapid pace of progress in these fields implies that such an account would be struck by quasi-immediate obsolescence if it were written solely for the purpose of compiling a catalogue of results and formulae `customized' for crystallographic use. Instead, the emphasis has been placed on a mode of presentation in which most results and formulae are derived rather than listed. This does entail a substantial mathematical overhead, but has the advantage of preserving in its `native' form the context within which these results are obtained. It is this context, rather than any particular set of results, which constitutes the most fertile source of new ideas and new applications, and which alone can have any hope of remaining useful in the long run.

These conditions have led to the following choices:

  • (i) the mathematical theory of the Fourier transformation has been cast in the language of Schwartz's theory of distributions which has long been adopted in several applied fields, in particular electrical engineering, with considerable success; the extra work involved handsomely pays for itself by allowing the three different types of Fourier transformations to be treated together, and by making all properties of the Fourier transform consequences of a single property (the convolution theorem). This is particularly useful in all questions related to the sampling theorem;

  • (ii) the various numerical algorithms have been presented as the consequences of basic algebraic phenomena involving Abelian groups, rings and finite fields; this degree of formalization greatly helps the subsequent incorporation of symmetry;

  • (iii) the algebraic nature of space groups has been re-emphasized so as to build up a framework which can accommodate both the phenomena used to factor the discrete Fourier transform and those which underlie the existence (and lead to the classification) of space groups; this common ground is found in the notion of module over a group ring (i.e. integral representation theory), which is then applied to the formulation of a large number of algorithms, many of which are new;

  • (iv) the survey of the main types of crystallographic computations has tried to highlight the roles played by various properties of the Fourier transformation, and the ways in which a better exploitation of these properties has been the driving force behind the discovery of more powerful methods.

In keeping with this philosophy, the theory is presented first, followed by the crystallographic applications. There are `forward references' from mathematical results to the applications which later invoke them (thus giving `real-life' examples rather than artificial ones), and `backward references' as usual. In this way, the internal logic of the mathematical developments – the surest guide to future innovations – can be preserved, whereas the alternative solution of relegating these to appendices tends on the contrary to obscure that logic by subordinating it to that of the applications.

It is hoped that this attempt at an overall presentation of the main features of Fourier transforms and of their ubiquitous role in crystallography will be found useful by scientists both within and outside the field.

1.3.2. The mathematical theory of the Fourier transformation

1.3.2.1. Introduction

The Fourier transformation and the practical applications to which it gives rise occur in three different forms which, although they display a similar range of phenomena, normally require distinct formulations and different proof techniques:

  • (i) Fourier transforms, in which both function and transform depend on continuous variables;

  • (ii) Fourier series, which relate a periodic function to a discrete set of coefficients indexed by n-tuples of integers;

  • (iii) discrete Fourier transforms, which relate finite-dimensional vectors by linear operations representable by matrices.

At the same time, the most useful property of the Fourier transformation – the exchange between multiplication and convolution – is mathematically the most elusive and the one which requires the greatest caution in order to avoid writing down meaningless expressions.

It is the unique merit of Schwartz's theory of distributions (Schwartz, 1966[link]) that it affords complete control over all the troublesome phenomena which had previously forced mathematicians to settle for a piecemeal, fragmented theory of the Fourier transformation. By its ability to handle rigorously highly `singular' objects (especially δ-functions, their derivatives, their tensor products, their products with smooth functions, their translates and lattices of these translates), distribution theory can deal with all the major properties of the Fourier transformation as particular instances of a single basic result (the exchange between multiplication and convolution), and can at the same time accommodate the three previously distinct types of Fourier theories within a unique framework. This brings great simplification to matters of central importance in crystallography, such as the relations between

  • (a) periodization, and sampling or decimation;

  • (b) Shannon interpolation, and masking by an indicator function;

  • (c) section, and projection;

  • (d) differentiation, and multiplication by a monomial;

  • (e) translation, and phase shift.

All these properties become subsumed under the same theorem.

This striking synthesis comes at a slight price, which is the relative complexity of the notion of distribution. It is first necessary to establish the notion of topological vector space and to gain sufficient control (or, at least, understanding) over convergence behaviour in certain of these spaces. The key notion of metrizability cannot be circumvented, as it underlies most of the constructs and many of the proof techniques used in distribution theory. Most of Section 1.3.2.2[link] builds up to the fundamental result at the end of Section 1.3.2.2.6.2[link], which is basic to the definition of a distribution in Section 1.3.2.3.4[link] and to all subsequent developments.

The reader mostly interested in applications will probably want to reach this section by starting with his or her favourite topic in Section 1.3.4[link], and following the backward references to the relevant properties of the Fourier transformation, then to the proof of these properties, and finally to the definitions of the objects involved. Hopefully, he or she will then feel inclined to follow the forward references and thus explore the subject from the abstract to the practical. The books by Dieudonné (1969)[link] and Lang (1965)[link] are particularly recommended as general references for all aspects of analysis and algebra.

1.3.2.2. Preliminary notions and notation

Throughout this text, [{\bb R}] will denote the set of real numbers, [{\bb Z}] the set of rational (signed) integers and [ {\bb N}] the set of natural (unsigned) integers. The symbol [{\bb R}^{n}] will denote the Cartesian product of n copies of [{\bb R}]: [{\bb R}^{n} = {\bb R} \times \ldots \times {\bb R} \quad (n \hbox{ times}, n \geq 1),] so that an element x of [{\bb R}^{n}] is an n-tuple of real numbers: [{\bf x} = (x_{1}, \ldots, x_{n}).] Similar meanings will be attached to [{\bb Z}^{n}] and [{\bb N}^{n}].

The symbol [{\bb C}] will denote the set of complex numbers. If [z \in {\bb C}], its modulus will be denoted by [|z|], its conjugate by [\bar{z}] (not [z^{*}]), and its real and imaginary parts by [{\scr Re}\; (z)] and [{\scr Im}\; (z)]: [{\scr Re}\; (z) = {\textstyle{1 \over 2}} (z + \bar{z}), \qquad {\scr Im}\; (z) = {1 \over 2i} (z - \bar{z}).]

If X is a finite set, then [|X|] will denote the number of its elements. If a mapping f sends an element x of set X to the element [f(x)] of set Y, the notation [f: x \;\longmapsto\; f(x)] will be used; the plain arrow → will be reserved for denoting limits, as in [\lim\limits_{p \rightarrow \infty} \left(1 + {x \over p}\right)^{p} = e^{x}.]

If X is any set and S is a subset of X, the indicator function [\chi_{S}] of S is the real-valued function on X defined by [\eqalign{\chi_{S} (x) &= 1 \quad \hbox{if } x \in S\cr &= 0 \quad \hbox{if } x \;\notin\; S.}]

1.3.2.2.1. Metric and topological notions in [{\bb R}^{n}]

The set [{\bb R}^{n}] can be endowed with the structure of a vector space of dimension n over [{\bb R}], and can be made into a Euclidean space by treating its standard basis as an orthonormal basis and defining the Euclidean norm: [\|{\bf x}\| = \left({\textstyle\sum\limits_{i = 1}^{n}} x_{i}^{2}\right)^{1/2}.]

By misuse of notation, x will sometimes also designate the column vector of coordinates of [{\bf x} \in {\bb R}^{n}]; if these coordinates are referred to an orthonormal basis of [{\bb R}^{n}], then [\|{\bf x}\| = ({\bf x}^{T} {\bf x})^{1/2},] where [{\bf x}^{T}] denotes the transpose of x.

The distance between two points x and y defined by [d({\bf x},{\bf y}) = \|{\bf x} - {\bf y}\|] allows the topological structure of [{\bb R}] to be transferred to [{\bb R}^{n}], making it a metric space. The basic notions in a metric space are those of neighbourhoods, of open and closed sets, of limit, of continuity, and of convergence (see Section 1.3.2.2.6.1[link]).

A subset S of [{\bb R}^{n}] is bounded if sup [\|{\bf x} - {\bf y}\| \;\lt\; \infty] as x and y run through S; it is closed if it contains the limits of all convergent sequences with elements in S. A subset K of [{\bb R}^{n}] which is both bounded and closed has the property of being compact, i.e. that whenever K has been covered by a family of open sets, a finite subfamily can be found which suffices to cover K. Compactness is a very useful topological property for the purpose of proof, since it allows one to reduce the task of examining infinitely many local situations to that of examining only finitely many of them.

1.3.2.2.2. Functions over [{\bb R}^{n}]

Let ϕ be a complex-valued function over [{\bb R}^{n}]. The support of ϕ, denoted Supp ϕ, is the smallest closed subset of [{\bb R}^{n}] outside which ϕ vanishes identically. If Supp ϕ is compact, ϕ is said to have compact support.

If [{\bf t} \in {\bb R}^{n}], the translate of ϕ by t, denoted [\tau_{\bf t} \varphi], is defined by [(\tau_{\bf t} \varphi) ({\bf x}) = \varphi ({\bf x} - {\bf t}).] Its support is the geometric translate of that of ϕ: [\hbox{Supp } \tau_{\bf t} \varphi = \{{\bf x} + {\bf t} | {\bf x} \in \hbox{Supp } \varphi\}.]

If A is a non-singular linear transformation in [{\bb R}^{n}], the image of ϕ by A, denoted [A^{\#} \varphi], is defined by [(A^{\#} \varphi) ({\bf x}) = \varphi [A^{-1} ({\bf x})].] Its support is the geometric image of Supp ϕ under A: [\hbox{Supp } A^{\#} \varphi = \{A ({\bf x}) | {\bf x} \in \hbox{Supp } \varphi\}.]

If S is a non-singular affine transformation in [{\bb R}^{n}] of the form [S({\bf x}) = A({\bf x}) + {\bf b}] with A linear, the image of ϕ by S is [S^{\#} \varphi = \tau_{\bf b} (A^{\#} \varphi)], i.e. [(S^{\#} \varphi) ({\bf x}) = \varphi [A^{-1} ({\bf x} - {\bf b})].] Its support is the geometric image of Supp ϕ under S: [\hbox{Supp } S^{\#} \varphi = \{S({\bf x}) | {\bf x} \in \hbox{Supp } \varphi\}.]

It may be helpful to visualize the process of forming the image of a function by a geometric operation as consisting of applying that operation to the graph of that function, which is equivalent to applying the inverse transformation to the coordinates x. This use of the inverse later affords the `left-representation property' [see Section 1.3.4.2.2.2[link](e)[link]] when the geometric operations form a group, which is of fundamental importance in the treatment of crystallographic symmetry (Sections 1.3.4.2.2.4[link], 1.3.4.2.2.5[link]).
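A one-line computation, given here as an illustration, shows where this property comes from: for two non-singular affine transformations [S_{1}] and [S_{2}], [[S_{1}^{\#} (S_{2}^{\#} \varphi)] ({\bf x}) = (S_{2}^{\#} \varphi) [S_{1}^{-1} ({\bf x})] = \varphi [S_{2}^{-1} S_{1}^{-1} ({\bf x})] = \varphi [(S_{1} S_{2})^{-1} ({\bf x})] = [(S_{1} S_{2})^{\#} \varphi] ({\bf x}),] so that [(S_{1} S_{2})^{\#} = S_{1}^{\#} S_{2}^{\#}]: it is the use of the inverse in the definition which makes [S \;\longmapsto\; S^{\#}] a homomorphism rather than an antihomomorphism.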

1.3.2.2.3. Multi-index notation

When dealing with functions in n variables and their derivatives, considerable abbreviation of notation can be obtained through the use of multi-indices.

A multi-index [{\bf p} \in {\bb N}^{n}] is an n-tuple of natural integers: [{\bf p} = (p_{1}, \ldots, p_{n})]. The length of p is defined as [|{\bf p}| = {\textstyle\sum\limits_{i = 1}^{n}}\; p_{i},] and the following abbreviations will be used:

  • (i) [{\bf x}^{{\bf p}} = x_{1}^{p_{1}} \ldots x_{n}^{p_{n}}];

  • (ii) [D_{i}\; f = {\partial f \over \partial x_{i}} = \partial_{i}\; f];

  • (iii) [D^{{\bf p}}\; f = D_{1}^{p_{1}} \ldots D_{n}^{p_{n}}\; f = {\partial^{|{\bf p}|} f \over \partial x_{1}^{p_{1}} \ldots \partial x_{n}^{p_{n}}}];

  • (iv) [{\bf q} \leq {\bf p}] if and only if [q_{i} \leq p_{i}] for all [i = 1, \ldots, n];

  • (v) [{\bf p} - {\bf q} = (p_{1} - q_{1}, \ldots, p_{n} - q_{n})];

  • (vi) [{\bf p}! = p_{1}! \times \ldots \times p_{n}!];

  • (vii) [\pmatrix{{\bf p}\cr {\bf q}\cr} = \pmatrix{p_{1}\cr q_{1}\cr} \times \ldots \times \pmatrix{p_{n}\cr q_{n}\cr}].

Leibniz's formula for the repeated differentiation of products then assumes the concise form [D^{\bf p} (fg) = \sum\limits_{{\bf q} \leq {\bf p}} \pmatrix{{\bf p}\cr {\bf q}\cr} D^{{\bf p} - {\bf q}} f D^{\bf q} g,] while the Taylor expansion of f to order m about [{\bf x} = {\bf a}] reads [f({\bf x}) = \sum\limits_{|{\bf p}| \leq m} {1 \over {\bf p}!} [D^{\bf p} f ({\bf a})] ({\bf x} - {\bf a})^{\bf p} + o (\|{\bf x} - {\bf a}\|^{m}).]
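As a simple illustration of this notation, take [n = 2] and [{\bf p} = (2, 1)]: then [|{\bf p}| = 3], [{\bf x}^{\bf p} = x_{1}^{2} x_{2}], [D^{\bf p}\; f = \partial^{3} f / \partial x_{1}^{2} \partial x_{2}] and [{\bf p}! = 2! \times 1! = 2]; and Leibniz's formula for [{\bf p} = (1, 1)] expands into the expected four terms [D^{(1, 1)} (fg) = (D^{(1, 1)} f) g + (D^{(1, 0)} f)(D^{(0, 1)} g) + (D^{(0, 1)} f)(D^{(1, 0)} g) + f (D^{(1, 1)} g).]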

In certain sections the notation [\nabla f] will be used for the gradient vector of f, and the notation [(\nabla \nabla^{T})f] for the Hessian matrix of its mixed second-order partial derivatives: [\displaylines{\nabla = \pmatrix{\displaystyle{\partial \over \partial x_{1}}\cr \vdots\cr\noalign{\vskip6pt} {\displaystyle{\partial \over \partial x_{n}}}\cr}, \quad \nabla f = \pmatrix{\displaystyle{\partial f \over \partial x_{1}}\cr \vdots\cr\noalign{\vskip6pt}  {\displaystyle{\partial f \over \partial x_{n}}}\cr},\cr (\nabla \nabla^{T}) f = \pmatrix{\displaystyle{\partial^{2} f \over \partial x_{1}^{2}} &\ldots &{\displaystyle{\partial^{2} f \over \partial x_{1} \partial x_{n}}}\cr \vdots &\ddots &\vdots\cr\noalign{\vskip6pt}  {\displaystyle{\partial^{2} f \over \partial x_{n} \partial x_{1}}} &\ldots &{\displaystyle{\partial^{2} f \over \partial x_{n}^{2}}}\cr}.}]

1.3.2.2.4. Integration, [L^{p}] spaces

The Riemann integral used in elementary calculus suffers from the drawback that vector spaces of Riemann-integrable functions over [{\bb R}^{n}] are not complete for the topology of convergence in the mean: a Cauchy sequence of integrable functions may converge to a non-integrable function.

To obtain the property of completeness, which is fundamental in functional analysis, it was necessary to extend the notion of integral. This was accomplished by Lebesgue [see Berberian (1962)[link], Dieudonné (1970)[link], or Chapter 1 of Dym & McKean (1972)[link] and the references therein, or Chapter 9 of Sprecher (1970)[link]], and entailed identifying functions which differed only on a subset of zero measure in [{\bb R}^{n}] (such functions are said to be equal `almost everywhere'). The vector spaces [L^{p} ({\bb R}^{n})] consisting of function classes f modulo this identification for which [\|{\bf f}\|_{p} = \left({\textstyle\int\limits_{{\bb R}^{n}}} |\;f ({\bf x}) |^{p}\ {\rm d}^{n} {\bf x}\right)^{1/p} \;\lt\; \infty] are then complete for the topology induced by the norm [\|.\|_{p}]: the limit of every Cauchy sequence of functions in [L^{p}] is itself a function in [L^{p}] (Riesz–Fischer theorem).

The space [L^{1} ({\bb R}^{n})] consists of those function classes f such that [\|\;f \|_{1} = {\textstyle\int\limits_{{\bb R}^{n}}} |\;f ({\bf x})|\;\hbox{d}^{n} {\bf x} \;\lt\; \infty] which are called summable or absolutely integrable. The convolution product: [\eqalign{(\;f * g) ({\bf x}) &= {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf y}) g({\bf x} - {\bf y})\;\hbox{d}^{n} {\bf y}\cr &= {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x} - {\bf y}) g ({\bf y})\;\hbox{d}^{n} {\bf y} = (g * f) ({\bf x})}] is well defined; combined with the vector space structure of [L^{1}], it makes [L^{1}] into a (commutative) convolution algebra. However, this algebra has no unit element: there is no [f \in L^{1}] such that [f * g = g] for all [g \in L^{1}]; it has only approximate units, i.e. sequences [(f_{\nu })] such that [f_{\nu } * g] tends to g in the [L^{1}] topology as [\nu \rightarrow \infty]. This is one of the starting points of distribution theory.
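Although no unit exists in [L^{1}], the behaviour of approximate units is easy to observe numerically. The following sketch is purely illustrative: the grid, the Gaussian family [f_{\nu}] and the function g are choices made here, not taken from the text. It estimates [\|\;f_{\nu} * g - g \|_{1}] for increasing ν and shows it tending to 0.

```python
import numpy as np

# Minimal numerical sketch: f_nu * g -> g in the L^1 norm as nu -> infinity.
# The grid, the Gaussian family f_nu and the function g are illustrative choices.
x = np.linspace(-5.0, 5.0, 8001)
dx = x[1] - x[0]

def f_nu(x, nu):
    # Normalized Gaussian of width 1/nu (integral 1), concentrating at 0 as nu grows.
    return (nu / np.sqrt(2.0 * np.pi)) * np.exp(-0.5 * (nu * x)**2)

g = np.where(np.abs(x) < 1.0, 1.0, 0.0)   # indicator of [-1, 1]: a function in L^1(R)

for nu in (1.0, 4.0, 16.0, 64.0):
    conv = np.convolve(f_nu(x, nu), g, mode='same') * dx   # discrete approximation of f_nu * g
    print(nu, np.sum(np.abs(conv - g)) * dx)               # approximate ||f_nu * g - g||_1, decreasing
```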

The space [L^{2} ({\bb R}^{n})] of square-integrable functions can be endowed with a scalar product [(\;f, g) = {\textstyle\int\limits_{{\bb R}^{n}}} \overline{f({\bf x})} g({\bf x})\;\hbox{d}^{n} {\bf x}] which makes it into a Hilbert space. The Cauchy–Schwarz inequality [|(\;f, g)| \leq [(\;f, f) (g, g)]^{1/2}] generalizes the fact that the absolute value of the cosine of an angle is less than or equal to 1.

The space [L^{\infty} ({\bb R}^{n})] is defined as the space of functions f such that [\|\;f \|_{\infty} = \lim\limits_{p \rightarrow \infty} \|\;f \|_{p} = \lim\limits_{p \rightarrow \infty} \left({\textstyle\int\limits_{{\bb R}^{n}}} |\; f({\bf x}) |^{p} \;\hbox{d}^{n} {\bf x}\right)^{1/p} \;\lt\; \infty.] The quantity [\|\;f \|_{\infty}] is called the `essential sup norm' of f, as it is the smallest positive number which [|\;f({\bf x})|] exceeds only on a subset of zero measure in [{\bb R}^{n}]. A function [f \in L^{\infty}] is called essentially bounded.

1.3.2.2.5. Tensor products. Fubini's theorem

Let [f \in L^{1} ({\bb R}^{m})], [g \in L^{1} ({\bb R}^{n})]. Then the function [f \otimes g: ({\bf x},{\bf y}) \;\longmapsto\; f({\bf x}) g({\bf y})] is called the tensor product of f and g, and belongs to [L^{1} ({\bb R}^{m} \times {\bb R}^{n})]. The finite linear combinations of functions of the form [f \otimes g] span a subspace of [L^{1} ({\bb R}^{m} \times {\bb R}^{n})] called the tensor product of [L^{1} ({\bb R}^{m})] and [L^{1} ({\bb R}^{n})] and denoted [L^{1} ({\bb R}^{m}) \otimes L^{1} ({\bb R}^{n})].

The integration of a general function over [{\bb R}^{m} \times {\bb R}^{n}] may be accomplished in two steps according to Fubini's theorem. Given [F \in L^{1} ({\bb R}^{m} \times {\bb R}^{n})], the functions [\eqalign{F_{1} : {\bf x} &\;\longmapsto\; {\textstyle\int\limits_{{\bb R}^{n}}} F ({\bf x},{\bf y}) \;\hbox{d}^{n} {\bf y}\cr F_{2} : {\bf y} &\;\longmapsto\; {\textstyle\int\limits_{{\bb R}^{m}}} F ({\bf x},{\bf y}) \;\hbox{d}^{m} {\bf x}}] exist for almost all [{\bf x} \in {\bb R}^{m}] and almost all [{\bf y} \in {\bb R}^{n}], respectively, are integrable, and [\textstyle\int\limits_{{\bb R}^{m} \times {\bb R}^{n}} F ({\bf x},{\bf y}) \;\hbox{d}^{m} {\bf x} \;\hbox{d}^{n} {\bf y} = {\textstyle\int\limits_{{\bb R}^{m}}} F_{1} ({\bf x}) \;\hbox{d}^{m} {\bf x} = {\textstyle\int\limits_{{\bb R}^{n}}} F_{2} ({\bf y}) \;\hbox{d}^{n} {\bf y}.] Conversely, if any one of the integrals [\displaylines{\quad (\hbox{i})\qquad {\textstyle\int\limits_{{\bb R}^{m} \times {\bb R}^{n}}} |F ({\bf x},{ \bf y})| \;\hbox{d}^{m} {\bf x} \;\hbox{d}^{n} {\bf y}\qquad \hfill\cr \quad (\hbox{ii})\qquad {\textstyle\int\limits_{{\bb R}^{m}}} \left({\textstyle\int\limits_{{\bb R}^{n}}} |F ({\bf x},{ \bf y})| \;\hbox{d}^{n} {\bf y}\right) \;\hbox{d}^{m} {\bf x}\hfill\cr \quad (\hbox{iii})\qquad {\textstyle\int\limits_{{\bb R}^{n}}} \left({\textstyle\int\limits_{{\bb R}^{m}}} |F ({\bf x},{ \bf y})| \;\hbox{d}^{m} {\bf x}\right) \;\hbox{d}^{n} {\bf y}\hfill}] is finite, then so are the other two, and the identity above holds. It is then (and only then) permissible to change the order of integrations.
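The restriction to absolutely integrable F is essential, as a classical counterexample (given here only for illustration) shows: for [F(x, y) = (x^{2} - y^{2})/(x^{2} + y^{2})^{2}] on [(0, 1) \times (0, 1)], [{\textstyle\int_{0}^{1}} \left({\textstyle\int_{0}^{1}} F(x, y) \;\hbox{d}y\right) \hbox{d}x = {\textstyle\int_{0}^{1}} {\hbox{d}x \over 1 + x^{2}} = {\pi \over 4}, \qquad {\textstyle\int_{0}^{1}} \left({\textstyle\int_{0}^{1}} F(x, y) \;\hbox{d}x\right) \hbox{d}y = -{\pi \over 4};] the two iterated integrals exist but disagree, precisely because [{\textstyle\int\int} |F(x, y)| \;\hbox{d}x \;\hbox{d}y] is infinite.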

Fubini's theorem is of fundamental importance in the study of tensor products and convolutions of distributions.

1.3.2.2.6. Topology in function spaces

Geometric intuition, which often makes `obvious' the topological properties of the real line and of ordinary space, cannot be relied upon in the study of function spaces: the latter are infinite-dimensional, and several inequivalent notions of convergence may exist. A careful analysis of topological concepts and of their interrelationship is thus a necessary prerequisite to the study of these spaces. The reader may consult Dieudonné (1969[link], 1970[link]), Friedman (1970)[link], Trèves (1967)[link] and Yosida (1965)[link] for detailed expositions.

1.3.2.2.6.1. General topology

Most topological notions are first encountered in the setting of metric spaces. A metric space E is a set equipped with a distance function d from [E \times E] to the non-negative reals which satisfies: [\matrix{(\hbox{i})\hfill & d(x, y) = d(y, x)\hfill &\forall x, y \in E\hfill &\hbox{(symmetry);}\hfill\cr\cr (\hbox{ii})\hfill &d(x, y) = 0 \hfill &\hbox{iff } x = y\hfill &\hbox{(separation);}\hfill\cr\cr (\hbox{iii})\hfill & d(x, z) \leq d(x, y) + d(y, z)\hfill &\forall x, y, z \in E\hfill &\hbox{(triangular}\hfill\cr& & &\hbox{inequality).}\hfill}] By means of d, the following notions can be defined: open balls, neighbourhoods; open and closed sets, interior and closure; convergence of sequences, continuity of mappings; Cauchy sequences and completeness; compactness; connectedness. They suffice for the investigation of a great number of questions in analysis and geometry (see e.g. Dieudonné, 1969[link]).

Many of these notions turn out to depend only on the properties of the collection [{\scr O}(E)] of open subsets of E: two distance functions leading to the same [{\scr O}(E)] lead to identical topological properties. An axiomatic reformulation of topological notions is thus possible: a topology in E is a collection [{\scr O}(E)] of subsets of E which satisfy suitable axioms and are deemed open irrespective of the way they are obtained. From the practical standpoint, however, a topology which can be obtained from a distance function (called a metrizable topology) has the very useful property that the notions of closure, limit and continuity may be defined by means of sequences. For non-metrizable topologies, these notions are much more difficult to handle, requiring the use of `filters' instead of sequences.

In some spaces E, a topology may be most naturally defined by a family of pseudo-distances [(d_{\alpha})_{\alpha \in A}], where each [d_{\alpha}] satisfies (i) and (iii) but not (ii). Such spaces are called uniformizable. If for every pair [(x, y) \in E \times E] there exists [\alpha \in A] such that [d_{\alpha} (x, y) \neq 0], then the separation property can be recovered. If furthermore a countable subfamily of the [d_{\alpha}] suffices to define the topology of E, the latter can be shown to be metrizable, so that limiting processes in E may be studied by means of sequences.

1.3.2.2.6.2. Topological vector spaces

The function spaces E of interest in Fourier analysis have an underlying vector space structure over the field [{\bb C}] of complex numbers. A topology on E is said to be compatible with a vector space structure on E if vector addition [i.e. the map [({\bf x},{ \bf y}) \;\longmapsto\; {\bf x} + {\bf y}]] and scalar multiplication [i.e. the map [(\lambda, {\bf x}) \;\longmapsto\; \lambda {\bf x}]] are both continuous; E is then called a topological vector space. Such a topology may be defined by specifying a `fundamental system S of neighbourhoods of [{\bf 0}]', which can then be translated by vector addition to construct neighbourhoods of other points [{\bf x} \neq {\bf 0}].

A norm ν on a vector space E is a non-negative real-valued function on E such that [\displaylines{\quad (\hbox{i}')\;\;\quad\nu (\lambda {\bf x}) = |\lambda |\, \nu ({\bf x}) \quad \hbox{for all } \lambda \in {\bb C} \hbox{ and } {\bf x} \in E\hbox{;}\hfill\cr \quad (\hbox{ii}')\;\quad\nu ({\bf x}) = 0 \quad \hbox{if and only if } {\bf x} = {\bf 0}\hbox{;}\hfill\cr \quad (\hbox{iii}')\quad \nu ({\bf x} + {\bf y}) \leq \nu ({\bf x}) + \nu ({\bf y}) \quad \hbox{for all } {\bf x},{\bf y} \in E.\hfill}] Subsets of E defined by conditions of the form [\nu ({\bf x}) \leq r] with [r \gt 0] form a fundamental system of neighbourhoods of 0. The corresponding topology makes E a normed space. This topology is metrizable, since it is equivalent to that derived from the translation-invariant distance [d({\bf x},{\bf y}) = \nu ({\bf x} - {\bf y})]. Normed spaces which are complete, i.e. in which all Cauchy sequences converge, are called Banach spaces; they constitute the natural setting for the study of differential calculus.

A semi-norm σ on a vector space E is a non-negative real-valued function on E which satisfies (i′) and (iii′) but not (ii′). Given a set Σ of semi-norms on E such that any pair (x, y) in [E \times E] is separated by at least one [\sigma \in \Sigma], let B be the set of those subsets [\Gamma_{\sigma, \, r}] of E defined by a condition of the form [\sigma ({\bf x}) \leq r] with [\sigma \in \Sigma] and [r \gt 0]; and let S be the set of finite intersections of elements of B. Then there exists a unique topology on E for which S is a fundamental system of neighbourhoods of 0. This topology is uniformizable since it is equivalent to that derived from the family of translation-invariant pseudo-distances [({\bf x},{\bf y}) \;\longmapsto\; \sigma ({\bf x} - {\bf y})]. It is metrizable if and only if it can be constructed by the above procedure with Σ a countable set of semi-norms. If furthermore E is complete, E is called a Fréchet space.

If E is a topological vector space over [{\bb C}], its dual [E^{*}] is the set of all linear mappings from E to [{\bb C}] (which are also called linear forms, or linear functionals, over E). The subspace of [E^{*}] consisting of all linear forms which are continuous for the topology of E is called the topological dual of E and is denoted E′. If the topology on E is metrizable, then the continuity of a linear form [T \in E'] at [f \in E] can be ascertained by means of sequences, i.e. by checking that the sequence [[T(\;f_{j})]] of complex numbers converges to [T(\;f)] in [{\bb C}] whenever the sequence [(\;f_{j})] converges to f in E.

1.3.2.3. Elements of the theory of distributions

1.3.2.3.1. Origins

At the end of the 19th century, Heaviside proposed under the name of `operational calculus' a set of rules for solving a class of differential, partial differential and integral equations encountered in electrical engineering (today's `signal processing'). These rules worked remarkably well but were devoid of mathematical justification (see Whittaker, 1928[link]). In 1926, Dirac introduced his famous δ-function [see Dirac (1958)[link], pp. 58–61], which was found to be related to Heaviside's constructs. Other singular objects, together with procedures to handle them, had already appeared in several branches of analysis [Cauchy's `principal values'; Hadamard's `finite parts' (Hadamard, 1932[link], 1952[link]); Riesz's regularization methods for certain divergent integrals (Riesz, 1938[link], 1949[link])] as well as in the theories of Fourier series and integrals (see e.g. Bochner, 1932[link], 1959[link]). Their very definition often verged on violating the rigorous rules governing limiting processes in analysis, so that subsequent recourse to limiting processes could lead to erroneous results; ad hoc precautions thus had to be observed to avoid mistakes in handling these objects.

In 1945–1950, Laurent Schwartz proposed his theory of distributions (see Schwartz, 1966[link]), which provided a unified and definitive treatment of all these questions, with a striking combination of rigour and simplicity. Schwartz's treatment of Dirac's δ-function illustrates his approach in a most direct fashion. Dirac's original definition reads: [\displaylines{\quad (\hbox{i})\;\quad\delta ({\bf x}) = 0 \hbox{ for } {\bf x} \neq {\bf 0},\hfill\cr \quad (\hbox{ii})\quad {\textstyle\int_{{\bb R}^{n}}} \delta ({\bf x}) \;\hbox{d}^{n} {\bf x} = 1.\hfill}] These two conditions are irreconcilable with Lebesgue's theory of integration: by (i), δ vanishes almost everywhere, so that its integral in (ii) must be 0, not 1.

A better definition consists in specifying that [\displaylines{\quad (\hbox{iii})\quad {\textstyle\int_{{\bb R}^{n}}} \delta ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} = \varphi ({\bf 0})\hfill}] for any function ϕ sufficiently well behaved near [{\bf x} = {\bf 0}]. This is related to the problem of finding a unit for convolution (Section 1.3.2.2.4[link]). As will now be seen, this definition is still unsatisfactory. Let the sequence [(\;f_{\nu})] in [L^{1} ({\bb R}^{n})] be an approximate convolution unit, e.g. [f_{\nu} ({\bf x}) = \left({\nu^{2} \over 2\pi}\right)^{n/2} \exp (-{\textstyle{1 \over 2}} \nu^{2} \|{\bf x}\|^{2}).] Then for any well behaved function ϕ the integrals [{\textstyle\int\limits_{{\bb R}^{n}}} f_{\nu} ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x}] exist, and the sequence of their numerical values tends to [\varphi ({\bf 0})]. It is tempting to combine this with (iii) to conclude that δ is the limit of the sequence [(\;f_{\nu})] as [\nu \rightarrow \infty]. However, [\lim f_{\nu} ({\bf x}) = 0 \quad \hbox{as } \nu \rightarrow \infty] almost everywhere in [{\bb R}^{n}] and the crux of the problem is that [\eqalign{\varphi ({\bf 0}) &= \lim\limits_{\nu \rightarrow \infty} {\textstyle\int\limits_{{\bb R}^{n}}} f_{\nu} ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} \cr &\neq {\textstyle\int\limits_{{\bb R}^{n}}} \left[\lim\limits_{\nu \rightarrow \infty} f_{\nu} ({\bf x}) \right] \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} = 0}] because the sequence [(\;f_{\nu})] does not satisfy the hypotheses of Lebesgue's dominated convergence theorem.
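The two conflicting limits can be checked numerically. The sketch below is an illustration only (one-dimensional case; the grid and the compactly supported bump ϕ are choices made here, not taken from the text): the pairings [{\textstyle\int} f_{\nu} \varphi \;\hbox{d}x] approach [\varphi (0)], while the value of [f_{\nu}] at any fixed point [x \neq 0] collapses to 0.

```python
import numpy as np

# Minimal numerical sketch (one dimension): integral f_nu(x) phi(x) dx tends to phi(0),
# although f_nu(x) -> 0 for every fixed x != 0.
# The grid and the test function phi are illustrative choices, not taken from the text.
x = np.linspace(-5.0, 5.0, 200001)
dx = x[1] - x[0]

def f_nu(x, nu):
    # Normalized Gaussian of width 1/nu (the one-dimensional case of the family above).
    return (nu / np.sqrt(2.0 * np.pi)) * np.exp(-0.5 * (nu * x)**2)

def phi(x):
    # Smooth bump supported in [-1, 1]; phi(0) = exp(-1) ~ 0.3679.
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside]**2))
    return out

for nu in (1.0, 10.0, 100.0):
    pairing = np.sum(f_nu(x, nu) * phi(x)) * dx   # tends to phi(0) = exp(-1)
    print(nu, pairing, f_nu(0.5, nu))             # f_nu(0.5) -> 0 pointwise
```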

Schwartz's solution to this problem is deceptively simple: the regular behaviour one is trying to capture is an attribute not of the sequence of functions [(\;f_{\nu})], but of the sequence of continuous linear functionals [T_{\nu}: \varphi \;\longmapsto\; {\textstyle\int\limits_{{\bb R}^{n}}} f_{\nu} ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x}] which has as a limit the continuous functional [T: \varphi \;\longmapsto\; \varphi ({\bf 0}).] It is the latter functional which constitutes the proper definition of δ. The previous paradoxes arose because one insisted on writing down the simple linear operation T in terms of an integral.

The essence of Schwartz's theory of distributions is thus that, rather than try to define and handle `generalized functions' via sequences such as [(\;f_{\nu})] [an approach adopted e.g. by Lighthill (1958)[link] and Erdélyi (1962)[link]], one should instead look at them as continuous linear functionals over spaces of well behaved functions.

There are many books on distribution theory and its applications. The reader may consult in particular Schwartz (1965[link], 1966[link]), Gel'fand & Shilov (1964)[link], Bremermann (1965)[link], Trèves (1967)[link], Challifour (1972)[link], Friedlander (1982)[link], and the relevant chapters of Hörmander (1963)[link] and Yosida (1965)[link]. Schwartz (1965)[link] is especially recommended as an introduction.

1.3.2.3.2. Rationale

The guiding principle which leads to requiring that the functions ϕ above (traditionally called `test functions') should be well behaved is that correspondingly `wilder' behaviour can then be accommodated in the limiting behaviour of the [f_{\nu}] while still keeping the integrals [{\textstyle\int_{{\bb R}^{n}}} f_{\nu} \varphi \;\hbox{d}^{n} {\bf x}] under control. Thus

  • (i) to minimize restrictions on the limiting behaviour of the [f_{\nu}] at infinity, the ϕ's will be chosen to have compact support;

  • (ii) to minimize restrictions on the local behaviour of the [f_{\nu}], the ϕ's will be chosen infinitely differentiable.

To ensure further the continuity of functionals such as [T_{\nu}] with respect to the test function ϕ as the [f_{\nu}] go increasingly wild, very strong control will have to be exercised in the way in which a sequence [(\varphi_{j})] of test functions will be said to converge towards a limiting ϕ: conditions will have to be imposed not only on the values of the functions [\varphi_{j}], but also on those of all their derivatives. Hence, defining a strong enough topology on the space of test functions ϕ is an essential prerequisite to the development of a satisfactory theory of distributions.

1.3.2.3.3. Test-function spaces

With this rationale in mind, the following function spaces will be defined for any open subset Ω of [{\bb R}^{n}] (which may be the whole of [{\bb R}^{n}]):

  • (a) [{\scr E}(\Omega)] is the space of complex-valued functions over Ω which are indefinitely differentiable;

  • (b) [{\scr D}(\Omega)] is the subspace of [{\scr E}(\Omega)] consisting of functions with (unspecified) compact support contained in Ω;

  • (c) [{\scr D}_{K} (\Omega)] is the subspace of [{\scr D}(\Omega)] consisting of functions whose (compact) support is contained within a fixed compact subset K of Ω.

When Ω is unambiguously defined by the context, we will simply write [{\scr E},{\scr D},{\scr D}_{K}].

It sometimes suffices to require the existence of continuous derivatives only up to finite order m inclusive. The corresponding spaces are then denoted [{\scr E}^{(m)},{\scr D}^{(m)},{\scr D}_{K}^{(m)}] with the convention that if [m = 0], only continuity is required.

The topologies on these spaces constitute the most important ingredients of distribution theory, and will be outlined in some detail.

1.3.2.3.3.1. Topology on [{\scr E}(\Omega)]

It is defined by the family of semi-norms [\varphi \in {\scr E}(\Omega) \;\longmapsto\; \sigma_{{\bf p}, \, K} (\varphi) = \sup\limits_{{\bf x} \in K} |D^{{\bf p}} \varphi ({\bf x})|,] where p is a multi-index and K a compact subset of Ω. A fundamental system S of neighbourhoods of the origin in [{\scr E}(\Omega)] is given by subsets of [{\scr E}(\Omega)] of the form [V (m, \varepsilon, K) = \{\varphi \in {\scr E}(\Omega)| |{\bf p}| \leq m \Rightarrow \sigma_{{\bf p}, K} (\varphi) \;\lt\; \varepsilon\}] for all natural integers m, positive real ε, and compact subset K of Ω. Since a countable family of compact subsets K suffices to cover Ω, and since restricted values of ε of the form [\varepsilon = 1/N] lead to the same topology, S is equivalent to a countable system of neighbourhoods and hence [{\scr E}(\Omega)] is metrizable.

Convergence in [{\scr E}] may thus be defined by means of sequences. A sequence [(\varphi_{\nu})] in [{\scr E}] will be said to converge to 0 if for any given [V (m, \varepsilon, K)] there exists [\nu_{0}] such that [\varphi_{\nu} \in V (m, \varepsilon, K)] whenever [\nu \gt \nu_{0}]; in other words, if the [\varphi_{\nu}] and all their derivatives [D^{\bf p} \varphi_{\nu}] converge to 0 uniformly on any given compact K in Ω.

1.3.2.3.3.2. Topology on [{\scr D}_{K} (\Omega)]

It is defined by the family of semi-norms [\varphi \in {\scr D}_{K} (\Omega) \;\longmapsto\; \sigma_{\bf p} (\varphi) = \sup\limits_{{\bf x} \in K} |D^{{\bf p}} \varphi ({\bf x})|,] where K is now fixed. The fundamental system S of neighbourhoods of the origin in [{\scr D}_{K}] is given by sets of the form [V (m, \varepsilon) = \{\varphi \in {\scr D}_{K} (\Omega)| |{\bf p}| \leq m \Rightarrow \sigma_{\bf p} (\varphi) \;\lt\; \varepsilon\}.] It is equivalent to the countable subsystem of the [V (m, 1/N)], hence [{\scr D}_{K} (\Omega)] is metrizable.

Convergence in [{\scr D}_{K}] may thus be defined by means of sequences. A sequence [(\varphi_{\nu})] in [{\scr D}_{K}] will be said to converge to 0 if for any given [V(m, \varepsilon)] there exists [\nu_{0}] such that [\varphi_{\nu} \in V(m, \varepsilon)] whenever [\nu \gt \nu_{0}]; in other words, if the [\varphi_{\nu}] and all their derivatives [D^{\bf p} \varphi_{\nu}] converge to 0 uniformly in K.

1.3.2.3.3.3. Topology on [{\scr D}(\Omega)]

It is defined by the fundamental system of neighbourhoods of the origin consisting of sets of the form [\eqalign{&V((m), (\varepsilon)) \cr &\qquad = \left\{\varphi \in {\scr D}(\Omega)| |{\bf p}| \leq m_{\nu} \Rightarrow \sup\limits_{\|{\bf x}\| \leq \nu} |D^{{\bf p}} \varphi ({\bf x})| \;\lt\; \varepsilon_{\nu} \hbox{ for all } \nu\right\},}] where (m) is an increasing sequence [(m_{\nu})] of integers tending to [+ \infty] and (ε) is a decreasing sequence [(\varepsilon_{\nu})] of positive reals tending to 0, as [\nu \rightarrow \infty].

This topology is not metrizable, because the sets of sequences (m) and (ε) are essentially uncountable. It can, however, be shown to be the inductive limit of the topologies of the subspaces [{\scr D}_{K}], in the following sense: V is a neighbourhood of the origin in [{\scr D}] if and only if its intersection with [{\scr D}_{K}] is a neighbourhood of the origin in [{\scr D}_{K}] for any given compact K in Ω.

A sequence [(\varphi_{\nu})] in [{\scr D}] will thus be said to converge to 0 in [{\scr D}] if all the [\varphi_{\nu}] belong to some [{\scr D}_{K}] (with K a compact subset of Ω independent of ν) and if [(\varphi_{\nu})] converges to 0 in [{\scr D}_{K}].

As a result, a complex-valued functional T on [{\scr D}] will be said to be continuous for the topology of [{\scr D}] if and only if, for any given compact K in Ω, its restriction to [{\scr D}_{K}] is continuous for the topology of [{\scr D}_{K}], i.e. maps convergent sequences in [{\scr D}_{K}] to convergent sequences in [{\bb C}].

This property of [{\scr D}], i.e. having a non-metrizable topology which is the inductive limit of metrizable topologies in its subspaces [{\scr D}_{K}], conditions the whole structure of distribution theory and dictates that of many of its proofs.

1.3.2.3.3.4. Topologies on [{\scr E}^{(m)}, {\scr D}_{K}^{(m)}, {\scr D}^{(m)}]

These are defined similarly, but only involve conditions on derivatives up to order m.

1.3.2.3.4. Definition of distributions

A distribution T on Ω is a linear form over [{\scr D}(\Omega)], i.e. a map [T: \varphi \;\longmapsto\; \langle T, \varphi \rangle] which associates linearly a complex number [\langle T, \varphi \rangle] to any [\varphi \in {\scr D}(\Omega)], and which is continuous for the topology of that space. In the terminology of Section 1.3.2.2.6.2[link], T is an element of [{\scr D}\,'(\Omega)], the topological dual of [{\scr D}(\Omega)].

Continuity over [{\scr D}] is equivalent to continuity over [{\scr D}_{K}] for all compact K contained in Ω, and hence to the condition that for any sequence [(\varphi_{\nu})] in [{\scr D}] such that

  • (i) Supp [\varphi_{\nu}] is contained in some compact K independent of ν,

  • (ii) the sequences [(|D^{\bf p} \varphi_{\nu}|)] converge uniformly to 0 on K for all multi-indices p;

then the sequence of complex numbers [\langle T, \varphi_{\nu}\rangle] converges to 0 in [{\bb C}].

If the continuity of a distribution T requires (ii)[link] for [|{\bf p}| \leq m] only, T may be defined over [{\scr D}^{(m)}] and thus [T \in {\scr D}\,'^{(m)}]; T is said to be a distribution of finite order m. In particular, for [m = 0, {\scr D}^{(0)}] is the space of continuous functions with compact support, and a distribution [T \in {\scr D}\,'^{(0)}] is a (Radon) measure as used in the theory of integration. Thus measures are particular cases of distributions.

Generally speaking, the larger a space of test functions, the smaller its topological dual: [m \;\lt\; n \Rightarrow {\scr D}^{(m)} \supset {\scr D}^{(n)} \Rightarrow {\scr D}\,'^{(n)} \supset {\scr D}\,'^{(m)}.] This clearly results from the observation that if the ϕ's are allowed to be less regular, then less wildness can be accommodated in T if the continuity of the map [\varphi \;\longmapsto\; \langle T, \varphi \rangle] with respect to ϕ is to be preserved.

1.3.2.3.5. First examples of distributions

  • (i) The linear map [\varphi \;\longmapsto\; \langle \delta, \varphi \rangle = \varphi ({\bf 0})] is a measure (i.e. a zeroth-order distribution) called Dirac's measure or (improperly) Dirac's `δ-function'.

  • (ii) The linear map [\varphi \;\longmapsto\; \langle \delta_{({\bf a})}, \varphi \rangle = \varphi ({\bf a})] is called Dirac's measure at point [{\bf a} \in {\bb R}^{n}].

  • (iii) The linear map [\varphi\;\longmapsto\; (-1)^{|{\bf p}|} D^{\bf p} \varphi ({\bf a})] is a distribution of order [m = |{\bf p}| \gt 0], and hence is not a measure.

  • (iv) The linear map [\varphi \;\longmapsto\; {\textstyle\sum_{\nu \gt 0}} \varphi^{(\nu)} (\nu)] is a distribution of infinite order on [{\bb R}]: the order of differentiation is bounded for each ϕ (because ϕ has compact support) but is not bounded as ϕ varies.

  • (v) If [({\bf p}_{\nu})] is a sequence of multi-indices [{\bf p}_{\nu} = (p_{1\nu}, \ldots, p_{n\nu})] such that [|{\bf p}_{\nu}| \rightarrow \infty] as [\nu \rightarrow \infty], then the linear map [\varphi \;\longmapsto\; {\textstyle\sum_{\nu \gt 0}} (D^{{\bf p}_{\nu}} \varphi) ({\bf p}_{\nu})] is a distribution of infinite order on [{\bb R}^{n}].
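That the maps (i) and (ii) are indeed continuous for the topology of [{\scr D}] follows at once from the estimate [|\langle \delta, \varphi \rangle| = |\varphi ({\bf 0})| \leq \sup_{{\bf x} \in K} |\varphi ({\bf x})|] for [\varphi \in {\scr D}_{K}], which involves no derivatives of ϕ; the map (iii) is bounded in the same way by [\sup_{{\bf x} \in K} |D^{\bf p} \varphi ({\bf x})|] and is therefore continuous on [{\scr D}^{(m)}] with [m = |{\bf p}|], in agreement with the orders stated above.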

1.3.2.3.6. Distributions associated to locally integrable functions

Let f be a complex-valued function over Ω such that [{\textstyle\int_{K}} | \;f({\bf x}) | \;\hbox{d}^{n} {\bf x}] exists for any given compact K in Ω; f is then called locally integrable.

The linear mapping from [{\scr D}(\Omega)] to [{\bb C}] defined by [\varphi \;\longmapsto\; {\textstyle\int\limits_{\Omega}} f({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x}] may then be shown to be continuous over [{\scr D}(\Omega)]. It thus defines a distribution [T_{f} \in {\scr D}\,'(\Omega)]: [\langle T_{f}, \varphi \rangle = {\textstyle\int\limits_{\Omega}} f({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x}.] As the continuity of [T_{f}] only requires that [\varphi \in {\scr D}^{(0)} (\Omega)], [T_{f}] is actually a Radon measure.

It can be shown that two locally integrable functions f and g define the same distribution, i.e. [\langle T_{f}, \varphi \rangle = \langle T_{g}, \varphi \rangle \quad \hbox{for all } \varphi \in {\scr D},] if and only if they are equal almost everywhere. The classes of locally integrable functions modulo this equivalence form a vector space denoted [L_{\rm loc}^{1} (\Omega)]; each element of [L_{\rm loc}^{1} (\Omega)] may therefore be identified with the distribution [T_{f}] defined by any one of its representatives f.

1.3.2.3.7. Support of a distribution

A distribution [T \in {\scr D}\,'(\Omega)] is said to vanish on an open subset ω of Ω if it vanishes on all functions in [{\scr D}(\omega)], i.e. if [\langle T, \varphi \rangle = 0] whenever [\varphi \in {\scr D}(\omega)].

The support of a distribution T, denoted Supp T, is then defined as the complement of the set-theoretic union of those open subsets ω on which T vanishes; or equivalently as the smallest closed subset of Ω outside which T vanishes.

When [T = T_{f}] for [f \in L_{\rm loc}^{1} (\Omega)], then Supp [T = \hbox{Supp } f], so that the two notions coincide. Clearly, if Supp T and Supp ϕ are disjoint subsets of Ω, then [\langle T, \varphi \rangle = 0].

It can be shown that any distribution [T \in {\scr D}\,'] with compact support may be extended from [{\scr D}] to [{\scr E}] while remaining continuous, so that [T \in {\scr E}\,']; and that conversely, if [S \in {\scr E}\,'], then its restriction T to [{\scr D}] is a distribution with compact support. Thus, the topological dual [{\scr E}\,'] of [{\scr E}] consists of those distributions in [{\scr D}\,'] which have compact support. This is intuitively clear since, if the condition of having compact support is fulfilled by T, it need no longer be required of ϕ, which may then roam through [{\scr E}] rather than [{\scr D}].

1.3.2.3.8. Convergence of distributions

A sequence [(T_{j})] of distributions will be said to converge in [{\scr D}\,'] to a distribution T as [j \rightarrow \infty] if, for any given [\varphi \in {\scr D}], the sequence of complex numbers [(\langle T_{j}, \varphi \rangle)] converges in [{\bb C}] to the complex number [\langle T, \varphi \rangle].

A series [{\textstyle\sum_{j=0}^{\infty}} T_{j}] of distributions will be said to converge in [{\scr D}\,'] and to have distribution S as its sum if the sequence of partial sums [S_{k} = {\textstyle\sum_{j=0}^{k}} T_{j}] converges to S.

These definitions of convergence in [{\scr D}\,'] assume that the limits T and S are known in advance, and are distributions. This raises the question of the completeness of [{\scr D}\,']: if a sequence [(T_{j})] in [{\scr D}\,'] is such that the sequence [(\langle T_{j}, \varphi \rangle)] has a limit in [{\bb C}] for all [\varphi \in {\scr D}], does the map [\varphi \;\longmapsto\; \lim_{j \rightarrow \infty} \langle T_{j}, \varphi \rangle] define a distribution [T \in {\scr D}\,']? In other words, does the limiting process preserve continuity with respect to ϕ? It is a remarkable theorem that, because of the strong topology on [{\scr D}], this is actually the case. An analogous statement holds for series. This notion of convergence does not coincide with any of the classical notions used for ordinary functions: for example, the sequence [(\varphi_{\nu})] with [\varphi_{\nu} (x) = \cos \nu x] converges to 0 in [{\scr D}\,'({\bb R})], but fails to do so by any of the standard criteria.

An example of convergent sequences of distributions is provided by sequences which converge to δ. If [(\;f_{\nu})] is a sequence of locally summable functions on [{\bb R}^{n}] such that

  • (i) [\textstyle{\int_{\|{\bf x}\| \lt\; b}} \;f_{\nu} ({\bf x}) \;\hbox{d}^{n} {\bf x} \rightarrow 1] as [\nu \rightarrow \infty] for all [b \gt 0];

  • (ii) [{\textstyle\int_{a \leq \|{\bf x}\| \leq 1/a}} |\;f_{\nu} ({\bf x})| \;\hbox{d}^{n} {\bf x} \rightarrow 0] as [\nu \rightarrow \infty] for all [0 \;\lt\; a \;\lt\; 1];

  • (iii) there exists [d \gt 0] and [M \gt 0] such that [{\textstyle\int_{\|{\bf x}\|\lt\; d}} |\;f_{\nu} ({\bf x})| \;\hbox{d}^{n} {\bf x}\lt M] for all ν;

then the sequence [(T_{f_{\nu}})] of distributions converges to δ in [{\scr D}\,'({\bb R}^{n})].

1.3.2.3.9. Operations on distributions

As a general rule, the definitions are chosen so that the operations coincide with those on functions whenever a distribution is associated to a function.

Most definitions consist in transferring to a distribution T an operation which is well defined on [\varphi \in {\scr D}] by `transposing' it in the duality product [\langle T, \varphi \rangle]; this procedure will map T to a new distribution provided the original operation maps [{\scr D}] continuously into itself.

1.3.2.3.9.1. Differentiation

  • (a) Definition and elementary properties

    If T is a distribution on [{\bb R}^{n}], its partial derivative [\partial_{i} T] with respect to [x_{i}] is defined by [\langle \partial_{i} T, \varphi \rangle = - \langle T, \partial_{i} \varphi \rangle]

    for all [\varphi \in {\scr D}]. This does define a distribution, because the partial differentiations [\varphi \;\longmapsto\; \partial_{i} \varphi] are continuous for the topology of [{\scr D}].

    Suppose that [T = T_{f}] with f a locally integrable function such that [\partial_{i}\; f] exists and is almost everywhere continuous. Then integration by parts along the [x_{i}] axis gives [\eqalign{&{\textstyle\int\limits_{{\bb R}}} \partial_{i}\; f(x_{1}, \ldots, x_{i}, \ldots, x_{n}) \varphi (x_{1}, \ldots, x_{i}, \ldots, x_{n}) \;\hbox{d}x_{i} \cr &\quad = (\;f\varphi)(x_{1}, \ldots, + \infty, \ldots, x_{n}) - (\;f\varphi)(x_{1}, \ldots, - \infty, \ldots, x_{n}) \cr &\qquad - {\textstyle\int\limits_{{\bb R}}} f(x_{1}, \ldots, x_{i}, \ldots, x_{n}) \partial_{i} \varphi (x_{1}, \ldots, x_{i}, \ldots, x_{n}) \;\hbox{d}x_{i}\hbox{;}}] the integrated term vanishes, since ϕ has compact support, showing that [\partial_{i} T_{f} = T_{\partial_{i}\; f}].

    The test functions [\varphi \in {\scr D}] are infinitely differentiable. Therefore, transpositions like that used to define [\partial_{i} T] may be repeated, so that any distribution is infinitely differentiable. For instance, [\displaylines{\langle \partial_{ij}^{2} T, \varphi \rangle = - \langle \partial_{j} T, \partial_{i} \varphi \rangle = \langle T, \partial_{ij}^{2} \varphi \rangle, \cr \langle D^{\bf p} T, \varphi \rangle = (-1)^{|{\bf p}|} \langle T, D^{\bf p} \varphi \rangle, \cr \langle \Delta T, \varphi \rangle = \langle T, \Delta \varphi \rangle,}] where Δ is the Laplacian operator. The derivatives of Dirac's δ distribution are [\langle D^{\bf p} \delta, \varphi \rangle = (-1)^{|{\bf p}|} \langle \delta, D^{\bf p} \varphi \rangle = (-1)^{|{\bf p}|} D^{\bf p} \varphi ({\bf 0}).]

    It is remarkable that differentiation is a continuous operation for the topology on [{\scr D}\,']: if a sequence [(T_{j})] of distributions converges to distribution T, then the sequence [(D^{\bf p} T_{j})] of derivatives converges to [D^{\bf p} T] for any multi-index p, since as [j \rightarrow \infty] [\langle D^{\bf p} T_{j}, \varphi \rangle = (-1)^{|{\bf p}|} \langle T_{j}, D^{\bf p} \varphi \rangle \rightarrow (-1)^{|{\bf p}|} \langle T, D^{\bf p} \varphi \rangle = \langle D^{\bf p} T, \varphi \rangle.] An analogous statement holds for series: any convergent series of distributions may be differentiated termwise to all orders. This illustrates how `robust' the constructs of distribution theory are in comparison with those of ordinary function theory, where similar statements are notoriously untrue.

  • (b) Differentiation under the duality bracket

    Limiting processes and differentiation may also be carried out under the duality bracket [\langle ,\rangle] as under the integral sign with ordinary functions. Let the function [\varphi = \varphi ({\bf x}, \lambda)] depend on a parameter [\lambda \in \Lambda] and a vector [{\bf x} \in {\bb R}^{n}] in such a way that all functions [\varphi_{\lambda}: {\bf x} \;\longmapsto\; \varphi ({\bf x}, \lambda)] lie in [{\scr D}({\bb R}^{n})] for all [\lambda \in \Lambda]. Let [T \in {\scr D}^{\prime}({\bb R}^{n})] be a distribution, let [I(\lambda) = \langle T, \varphi_{\lambda}\rangle] and let [\lambda_{0} \in \Lambda] be a given parameter value. Suppose that, as λ runs through a small enough neighbourhood of [\lambda_{0}],

    • (i) all the [\varphi_{\lambda}] have their supports in a fixed compact subset K of [{\bb R}^{n}];

    • (ii) all the derivatives [D^{\bf p} \varphi_{\lambda}] have a partial derivative with respect to λ which is continuous with respect to x and λ.

    Under these hypotheses, [I(\lambda)] is differentiable (in the usual sense) with respect to λ near [\lambda_{0}], and its derivative may be obtained by `differentiation under the [\langle ,\rangle] sign': [{\hbox{d}I \over \hbox{d}\lambda} = \langle T, \partial_{\lambda} \varphi_{\lambda}\rangle.]

  • (c) Effect of discontinuities

    When a function f or its derivatives are no longer continuous, the derivatives [D^{\bf p} T_{f}] of the associated distribution [T_{f}] may no longer coincide with the distributions associated to the functions [D^{\bf p} f].

    In dimension 1, the simplest example is Heaviside's unit step function [Y\; [Y(x) = 0 \hbox{ for } x \;\lt\; 0, Y(x) = 1 \hbox{ for } x \geq 0]]: [\langle (T_{Y})', \varphi \rangle = - \langle (T_{Y}), \varphi'\rangle = - {\textstyle\int\limits_{0}^{+ \infty}} \varphi' (x) \;\hbox{d}x = \varphi (0) = \langle \delta, \varphi \rangle.] Hence [(T_{Y})' = \delta], a result long used `heuristically' by electrical engineers [see also Dirac (1958)[link]].

    Let f be infinitely differentiable for [x \;\lt\; 0] and [x \gt 0] but have discontinuous derivatives [f^{(m)}] at [x = 0] [[\;f^{(0)}] being f itself] with jumps [\sigma_{m} = f^{(m)} (0 +) - f^{(m)} (0 -)]. Consider the functions: [\eqalign{g_{0} &= f - \sigma_{0} Y \cr g_{1} &= g'_{0} - \sigma_{1} Y \cr &\;\vdots \cr g_{k} &= g'_{k - 1} - \sigma_{k} Y.}] The [g_{k}] are continuous, their derivatives [g'_{k}] are continuous almost everywhere [which implies that [(T_{g_{k}})' = T_{g'_{k}}] and [g'_{k} = f^{(k + 1)}] almost everywhere]. This yields immediately: [\eqalign{(T_{f})' &= T_{f'} + \sigma_{0} \delta \cr (T_{f})'' &= T_{f''} + \sigma_{0} \delta' + \sigma_{1} \delta \cr &\;\vdots \cr (T_{f})^{(m)} &= T_{f^{(m)}} + \sigma_{0} \delta^{(m - 1)} + \ldots + \sigma_{m - 1} \delta.}] Thus the `distributional derivatives' [(T_{f})^{(m)}] differ from the usual functional derivatives [T_{f^{(m)}}] by singular terms associated with discontinuities; a worked one-dimensional example is given at the end of this subsection.

    In dimension n, let f be infinitely differentiable everywhere except on a smooth hypersurface S, across which its partial derivatives show discontinuities. Let [\sigma_{0}] and [\sigma_{\nu}] denote the discontinuities of f and its normal derivative [\partial_{\nu}\; f] across S (both [\sigma_{0}] and [\sigma_{\nu}] are functions of position on S), and let [\delta_{(S)}] and [\partial_{\nu} \delta_{(S)}] be defined by [\eqalign{\langle \delta_{(S)}, \varphi \rangle &= {\textstyle\int\limits_{S}} \varphi \;\hbox{d}^{n - 1} S \cr \langle \partial_{\nu} \delta_{(S)}, \varphi \rangle &= - {\textstyle\int\limits_{S}} \partial_{\nu} \varphi \;\hbox{d}^{n - 1} S.}] Integration by parts shows that [\partial_{i} T_{f} = T_{\partial_{i}\; f} + \sigma_{0} \cos \theta_{i} \delta_{(S)},] where [\theta_{i}] is the angle between the [x_{i}] axis and the normal to S along which the jump [\sigma_{0}] occurs, and that the Laplacian of [T_{f}] is given by [\Delta (T_{f}) = T_{\Delta f} + \sigma_{\nu} \delta_{(S)} + \partial_{\nu} [\sigma_{0} \delta_{(S)}].] The latter result is a statement of Green's theorem in terms of distributions. It will be used in Section 1.3.4.4.3.5[link] to calculate the Fourier transform of the indicator function of a molecular envelope.

1.3.2.3.9.2. Integration of distributions in dimension 1


The reverse operation from differentiation, namely calculating the `indefinite integral' of a distribution S, consists in finding a distribution T such that [T' = S].

For all [\chi \in {\scr D}] such that [\chi = \psi'] with [\psi \in {\scr D}], we must have [\langle T, \chi \rangle = - \langle S, \psi \rangle .] This condition defines T in a `hyperplane' [{\scr H}] of [{\scr D}], whose equation [\langle 1, \chi \rangle \equiv \langle 1, \psi' \rangle = 0] reflects the fact that ψ has compact support.

To specify T in the whole of [{\scr D}], it suffices to specify the value of [\langle T, \varphi_{0} \rangle] where [\varphi_{0} \in {\scr D}] is such that [\langle 1, \varphi_{0} \rangle = 1]: then any [\varphi \in {\scr D}] may be written uniquely as [\varphi = \lambda \varphi_{0} + \psi'] with [\lambda = \langle 1, \varphi \rangle, \qquad \chi = \varphi - \lambda \varphi_{0}, \qquad \psi (x) = {\textstyle\int\limits_{-\infty}^{x}} \chi (t) \;\hbox{d}t,] and T is defined by [\langle T, \varphi \rangle = \lambda \langle T, \varphi_{0} \rangle - \langle S, \psi \rangle.] The freedom in the choice of [\varphi_{0}] means that T is defined up to an additive constant.
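
A small numerical sketch makes the decomposition concrete; here φ and [\varphi_{0}] are arbitrary rapidly decaying functions standing in for genuine elements of [{\scr D}]:

```python
# Sketch of the decomposition phi = lambda*phi0 + psi':
#     lambda = <1, phi>,  chi = phi - lambda*phi0,  psi(x) = int_{-inf}^{x} chi(t) dt.
# phi and phi0 are rapidly decaying stand-ins for genuine test functions.
import numpy as np

x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]

phi0 = np.exp(-x**2) / np.sqrt(np.pi)        # normalized: <1, phi0> = 1
phi  = (1.0 + x) * np.exp(-(x - 1.0)**2)     # an arbitrary 'test function'

lam = np.trapz(phi, x)                       # lambda = <1, phi>
chi = phi - lam*phi0                         # lies in the hyperplane <1, chi> = 0
psi = np.cumsum(chi)*dx                      # psi' = chi, psi vanishes far to the left

print(np.trapz(chi, x))                      # ~ 0
print(psi[-1])                               # ~ 0: psi also vanishes far to the right
print(np.max(np.abs(np.gradient(psi, dx) - chi)))   # small: psi' recovers chi
```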

1.3.2.3.9.3. Multiplication of distributions by functions


The product [\alpha T] of a distribution T on [{\bb R}^{n}] by a function α over [{\bb R}^{n}] will be defined by transposition: [\langle \alpha T, \varphi \rangle = \langle T, \alpha \varphi \rangle \quad \hbox{for all } \varphi \in {\scr D}.] In order that [\alpha T] be a distribution, the mapping [\varphi \;\longmapsto\; \alpha \varphi] must send [{\scr D}({\bb R}^{n})] continuously into itself; hence the multipliers α must be infinitely differentiable. The product of two general distributions cannot be defined. The need for a careful treatment of multipliers of distributions will become clear when it is later shown (Section 1.3.2.5.8[link]) that the Fourier transformation turns convolutions into multiplications and vice versa.

If T is a distribution of order m, then α needs only have continuous derivatives up to order m. For instance, δ is a distribution of order zero, and [\alpha \delta = \alpha ({\bf 0}) \delta] is a distribution provided α is continuous; this relation is of fundamental importance in the theory of sampling and of the properties of the Fourier transformation related to sampling (Sections 1.3.2.6.4[link], 1.3.2.6.6[link]). More generally, [D^{{\bf p}}\delta] is a distribution of order [|{\bf p}|], and the following formula holds for all [\alpha \in {\scr D}^{(m)}] with [m = |{\bf p}|]: [\alpha (D^{{\bf p}}\delta) = {\displaystyle\sum\limits_{{\bf q} \leq {\bf p}}} (-1)^{|{\bf p}-{\bf q}|} \pmatrix{{\bf p}\cr {\bf q}\cr} (D^{{\bf p}-{\bf q}} \alpha) ({\bf 0}) D^{\bf q}\delta.]
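
The formula may be checked symbolically in dimension 1 by letting both sides act on a test function; the multiplier α, the test function φ and the order p below are arbitrary illustrative choices:

```python
# Symbolic check (dimension 1, p = 3) of
#   alpha * D^p delta = sum_{q<=p} (-1)^(p-q) C(p,q) (D^{p-q} alpha)(0) D^q delta,
# by letting both sides act on a test function phi.  alpha and phi are
# arbitrary smooth illustrative choices.
import sympy as sp

x = sp.symbols('x')
alpha = sp.exp(x)*sp.cos(x)
phi   = sp.exp(-x**2)*(1 + x)**2
p = 3

direct = (-1)**p * sp.diff(alpha*phi, x, p).subs(x, 0)          # <D^p delta, alpha*phi>
expansion = sum((-1)**(p - q) * sp.binomial(p, q)
                * sp.diff(alpha, x, p - q).subs(x, 0)
                * (-1)**q * sp.diff(phi, x, q).subs(x, 0)       # <D^q delta, phi>
                for q in range(p + 1))
print(sp.simplify(direct - expansion))                          # 0
```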

The derivative of a product is easily shown to be [\partial_{i}(\alpha T) = (\partial_{i}\alpha) T + \alpha (\partial_{i}T)] and generally for any multi-index p [D^{\bf p}(\alpha T) = {\displaystyle\sum\limits_{{\bf q}\leq {\bf p}}} \pmatrix{{\bf p}\cr {\bf q}\cr} (D^{{\bf p}-{\bf q}} \alpha) D^{{\bf q}}T.]

1.3.2.3.9.4. Division of distributions by functions


Given a distribution S on [{\bb R}^{n}] and an infinitely differentiable multiplier function α, the division problem consists in finding a distribution T such that [\alpha T = S].

If α never vanishes, [T = S/\alpha] is the unique answer. If [n = 1], and if α has only isolated zeros of finite order, the problem can be reduced to a collection of cases where the multiplier is [x^{m}], for which the general solution can be shown to be of the form [T = U + {\textstyle\sum\limits_{i=0}^{m-1}} c_{i}\delta^{(i)},] where U is a particular solution of the division problem [x^{m} U = S] and the [c_{i}] are arbitrary constants.
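
For instance, take [m = 1] and [S = \delta]: since [x \delta ' = - \delta], a particular solution of [xU = \delta] is [U = - \delta '], and the general solution is [- \delta ' + c\delta]. A short symbolic check of the two facts used (φ denotes an unspecified test function):

```python
# Sketch for m = 1, S = delta: a particular solution of x*U = delta is U = -delta',
# and x*delta = 0 accounts for the arbitrary c*delta term.  phi is an
# unspecified test function.
import sympy as sp

x = sp.symbols('x')
phi = sp.Function('phi')

# <x*(-delta'), phi> = <-delta', x*phi> = d/dx (x*phi) at 0 = phi(0) = <delta, phi>
lhs = sp.diff(x*phi(x), x).subs(x, 0)
print(sp.simplify(lhs - phi(0)))      # 0

# <x*delta, phi> = (x*phi)(0) = 0, so adding c*delta leaves x*T unchanged
print((x*phi(x)).subs(x, 0))          # 0
```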

In dimension [n \gt 1], the problem is much more difficult, but is of fundamental importance in the theory of linear partial differential equations, since the Fourier transformation turns the problem of solving these into a division problem for distributions [see Hörmander (1963)[link]].

1.3.2.3.9.5. Transformation of coordinates


Let σ be a smooth non-singular change of variables in [{\bb R}^{n}], i.e. an infinitely differentiable mapping from an open subset Ω of [{\bb R}^{n}] to Ω′ in [{\bb R}^{n}], whose Jacobian [J(\sigma) = \det \left[{\partial \sigma ({\bf x}) \over \partial {\bf x}}\right]] vanishes nowhere in Ω. By the implicit function theorem, the inverse mapping [\sigma^{-1}] from Ω′ to Ω is well defined.

If f is a locally summable function on Ω, then the function [\sigma^{\#} f] defined by [(\sigma^{\#} f)({\bf x}) = f[\sigma^{-1}({\bf x})]] is a locally summable function on Ω′, and for any [\varphi \in {\scr D}(\Omega')] we may write: [\eqalign{{\textstyle\int\limits_{\Omega'}} (\sigma^{\#} f) ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} &= {\textstyle\int\limits_{\Omega'}} f[\sigma^{-1} ({\bf x})] \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} \cr &= {\textstyle\int\limits_{\Omega}} f({\bf y}) \varphi [\sigma ({\bf y})]|J(\sigma)| \;\hbox{d}^{n} {\bf y} \quad \hbox{by } {\bf x} = \sigma ({\bf y}).}] In terms of the associated distributions [\langle T_{\sigma^{\#} f}, \varphi \rangle = \langle T_{f}, |J(\sigma)|(\sigma^{-1})^{\#} \varphi \rangle.]

This operation can be extended to an arbitrary distribution T by defining its image [\sigma^{\#} T] under coordinate transformation σ through [\langle \sigma^{\#} T, \varphi \rangle = \langle T, |J(\sigma)|(\sigma^{-1})^{\#} \varphi \rangle,] which is well defined provided that σ is proper, i.e. that [\sigma^{-1}(K)] is compact whenever K is compact.

For instance, if [\sigma: {\bf x} \;\longmapsto\; {\bf x} + {\bf a}] is a translation by a vector a in [{\bb R}^{n}], then [|J(\sigma)| = 1]; [\sigma^{\#}] is denoted by [\tau_{\bf a}], and the translate [\tau_{\bf a} T] of a distribution T is defined by [\langle \tau_{\bf a} T, \varphi \rangle = \langle T, \tau_{-{\bf a}} \varphi \rangle.]

Let [A: {\bf x} \;\longmapsto\; {\bf Ax}] be a linear transformation defined by a non-singular matrix A. Then [J(A) = \det {\bf A}], and [\langle A^{\#} T, \varphi \rangle = |\det {\bf A}| \langle T, (A^{-1})^{\#} \varphi \rangle.] This formula will be shown later (Sections 1.3.2.6.5[link], 1.3.4.2.1.1[link]) to be the basis for the definition of the reciprocal lattice.

In particular, if [{\bf A} = -{\bf I}], where I is the identity matrix, A is an inversion through a centre of symmetry at the origin, and denoting [A^{\#} \varphi] by [\breve{\varphi}] we have: [\langle \breve{T}, \varphi \rangle = \langle T, \breve{\varphi} \rangle.] T is called an even distribution if [\breve{T} = T], an odd distribution if [\breve{T} = -T].

If [{\bf A} = \lambda {\bf I}] with [\lambda \gt 0], A is called a dilation and [\langle A^{\#} T, \varphi \rangle = \lambda^{n} \langle T, (A^{-1})^{\#} \varphi \rangle.] Writing symbolically δ as [\delta ({\bf x})] and [A^{\#} \delta] as [\delta ({\bf x}/\lambda)], we have: [\delta ({\bf x}/\lambda) = \lambda^{n} \delta ({\bf x}).] If [n = 1] and f is a function with isolated simple zeros [x_{j}], then in the same symbolic notation [\delta [\;f(x)] = \sum\limits_{j} {1 \over |\;f'(x_{j})|} \delta (x - x_{j}),] where each [\lambda_{j} = 1/|\;f'(x_{j})|] is analogous to a `Lorentz factor' at zero [x_{j}].
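
The last identity can be checked numerically by replacing δ with a narrow Gaussian (a `nascent' δ); the function f, the test function φ and the width [\varepsilon] below are arbitrary illustrative choices:

```python
# Sketch: replace delta by a narrow Gaussian delta_eps and check
#   int delta_eps[f(x)] phi(x) dx  ~  sum_j phi(x_j)/|f'(x_j)|.
# f (zeros at +/-1, |f'| = 2 there), phi and eps are illustrative choices.
import numpy as np

eps = 1e-3
x = np.linspace(-5.0, 5.0, 2_000_001)

f   = x**2 - 1.0
phi = np.exp(-(x - 0.5)**2)
delta_eps = np.exp(-f**2/(2.0*eps**2)) / (eps*np.sqrt(2.0*np.pi))   # nascent delta

lhs = np.trapz(delta_eps*phi, x)
rhs = (np.exp(-(1.0 - 0.5)**2) + np.exp(-(-1.0 - 0.5)**2)) / 2.0
print(lhs, rhs)         # the two agree closely, and better still as eps -> 0
```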

1.3.2.3.9.6. Tensor product of distributions


The purpose of this construction is to extend Fubini's theorem to distributions. Following Section 1.3.2.2.5[link], we may define the tensor product [L_{\rm loc}^{1} ({\bb R}^{m}) \otimes L_{\rm loc}^{1} ({\bb R}^{n})] as the vector space of finite linear combinations of functions of the form [f \otimes g: ({\bf x},{ \bf y}) \;\longmapsto\; f({\bf x})g({\bf y}),] where [{\bf x} \in {\bb R}^{m},{\bf y} \in {\bb R}^{n}, f \in L_{\rm loc}^{1} ({\bb R}^{m})] and [g \in L_{\rm loc}^{1} ({\bb R}^{n})].

Let [S_{\bf x}] and [T_{\bf y}] denote the distributions associated to f and g, respectively, the subscripts x and y acting as mnemonics for [{\bb R}^{m}] and [{\bb R}^{n}]. It follows from Fubini's theorem (Section 1.3.2.2.5[link]) that [f \otimes g \in L_{\rm loc}^{1} ({\bb R}^{m} \times {\bb R}^{n})], and hence defines a distribution over [{\bb R}^{m} \times {\bb R}^{n}]; the rearrangement of integral signs gives [\langle S_{\bf x} \otimes T_{\bf y}, \varphi_{{\bf x}, \,{\bf y}} \rangle = \langle S_{\bf x}, \langle T_{\bf y}, \varphi_{{\bf x}, \,{\bf y}} \rangle\rangle = \langle T_{\bf y}, \langle S_{\bf x}, \varphi_{{\bf x}, \, {\bf y}} \rangle\rangle] for all [\varphi_{{\bf x}, \,{\bf y}} \in {\scr D}({\bb R}^{m} \times {\bb R}^{n})]. In particular, if [\varphi ({\bf x},{ \bf y}) = u({\bf x}) v({\bf y})] with [u \in {\scr D}({\bb R}^{m}),v \in {\scr D}({\bb R}^{n})], then [\langle S \otimes T, u \otimes v \rangle = \langle S, u \rangle \langle T, v \rangle.]

This construction can be extended to general distributions [S \in {\scr D}\,'({\bb R}^{m})] and [T \in {\scr D}\,'({\bb R}^{n})]. Given any test function [\varphi \in {\scr D}({\bb R}^{m} \times {\bb R}^{n})], let [\varphi_{\bf x}] denote the map [{\bf y} \;\longmapsto\; \varphi ({\bf x}, {\bf y})]; let [\varphi_{\bf y}] denote the map [{\bf x} \;\longmapsto\; \varphi ({\bf x},{\bf y})]; and define the two functions [\theta ({\bf x}) = \langle T, \varphi_{\bf x} \rangle] and [\omega ({\bf y}) = \langle S, \varphi_{\bf y} \rangle]. Then, by the lemma on differentiation under the [\langle,\rangle] sign of Section 1.3.2.3.9.1[link], [\theta \in {\scr D}({\bb R}^{m}),\omega \in {\scr D}({\bb R}^{n})], and there exists a unique distribution [S \otimes T] such that [\langle S \otimes T, \varphi \rangle = \langle S, \theta \rangle = \langle T, \omega \rangle.] [S \otimes T] is called the tensor product of S and T.

With the mnemonic introduced above, this definition reads identically to that given above for distributions associated to locally integrable functions: [\langle S_{\bf x} \otimes T_{\bf y}, \varphi_{{\bf x}, \, {\bf y}} \rangle = \langle S_{\bf x}, \langle T_{\bf y}, \varphi_{{\bf x}, \, {\bf y}} \rangle\rangle = \langle T_{\bf y}, \langle S_{\bf x}, \varphi_{{\bf x}, \, {\bf y}} \rangle\rangle.]

The tensor product of distributions is associative: [(R \otimes S) \otimes T = R \otimes (S \otimes T).] Derivatives may be calculated by [D_{\bf x}^{\bf p} D_{\bf y}^{\bf q} (S_{\bf x} \otimes T_{\bf y}) = (D_{\bf x}^{\bf p} S_{\bf x}) \otimes (D_{\bf y}^{\bf q} T_{\bf y}).] The support of a tensor product is the Cartesian product of the supports of the two factors.

1.3.2.3.9.7. Convolution of distributions


The convolution [f * g] of two functions f and g on [{\bb R}^{n}] is defined by [(\;f * g) ({\bf x}) = {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf y}) g({\bf x} - {\bf y}) \;\hbox{d}^{n}{\bf y} = {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x} - {\bf y}) g ({\bf y}) \;\hbox{d}^{n}{\bf y}] whenever the integral exists. This is the case when f and g are both in [L^{1} ({\bb R}^{n})]; then [f * g] is also in [L^{1} ({\bb R}^{n})]. Let S, T and W denote the distributions associated to f, g and [f * g,] respectively: a change of variable immediately shows that for any [\varphi \in {\scr D}({\bb R}^{n})], [\langle W, \varphi \rangle = {\textstyle\int\limits_{{\bb R}^{n} \times {\bb R}^{n}}} f({\bf x}) g({\bf y}) \varphi ({\bf x} + {\bf y}) \;\hbox{d}^{n}{\bf x} \;\hbox{d}^{n}{\bf y}.] Introducing the map σ from [{\bb R}^{n} \times {\bb R}^{n}] to [{\bb R}^{n}] defined by [\sigma ({\bf x}, {\bf y}) = {\bf x} + {\bf y}], the latter expression may be written: [\langle S_{\bf x} \otimes T_{\bf y}, \varphi \circ \sigma \rangle] (where [\circ] denotes the composition of mappings) or by a slight abuse of notation: [\langle W, \varphi \rangle = \langle S_{\bf x} \otimes T_{\bf y}, \varphi ({\bf x} + {\bf y}) \rangle.]

A difficulty arises in extending this definition to general distributions S and T because the mapping σ is not proper: if K is compact in [{\bb R}^{n}], then [\sigma^{-1} (K)] is a cylinder with base K and generator the `second bisector' [{\bf x} + {\bf y} = {\bf 0}] in [{\bb R}^{n} \times {\bb R}^{n}]. However, [\langle S \otimes T, \varphi \circ \sigma \rangle] is defined whenever the intersection between Supp [(S \otimes T) = (\hbox{Supp } S) \times (\hbox{Supp } T)] and [\sigma^{-1} (\hbox{Supp } \varphi)] is compact.

We may therefore define the convolution [S * T] of two distributions S and T on [{\bb R}^{n}] by [\langle S * T, \varphi \rangle = \langle S \otimes T, \varphi \circ \sigma \rangle = \langle S_{\bf x} \otimes T_{\bf y}, \varphi ({\bf x} + {\bf y})\rangle] whenever the following support condition is fulfilled:

`the set [\{({\bf x},{\bf y})|{\bf x} \in A, {\bf y} \in B, {\bf x} + {\bf y} \in K\}] is compact in [{\bb R}^{n} \times {\bb R}^{n}] for all K compact in [{\bb R}^{n}]'.

The latter condition is met, in particular, if S or T has compact support. The support of [S * T] is easily seen to be contained in the closure of the vector sum [A + B = \{{\bf x} + {\bf y}|{\bf x} \in A, {\bf y} \in B\}.]

Convolution by a fixed distribution S is a continuous operation for the topology on [{\scr D}\,']: it maps convergent sequences [(T_{j})] to convergent sequences [(S * T_{j})]. Convolution is commutative: [S * T = T * S].

The convolution of p distributions [T_{1}, \ldots, T_{p}] with supports [A_{1}, \ldots, A_{p}] can be defined by [\langle T_{1} * \ldots * T_{p}, \varphi \rangle = \langle (T_{1})_{{\bf x}_{1}} \otimes \ldots \otimes (T_{p})_{{\bf x}_{p}}, \varphi ({\bf x}_{1} + \ldots + {\bf x}_{p})\rangle] whenever the following generalized support condition:

`the set [\{({\bf x}_{1}, \ldots, {\bf x}_{p})|{\bf x}_{1} \in A_{1}, \ldots, {\bf x}_{p} \in A_{p}, {\bf x}_{1} + \ldots + {\bf x}_{p} \in K\}] is compact in [({\bb R}^{n})^{p}] for all K compact in [{\bb R}^{n}]'

is satisfied. It is then associative. Interesting examples of associativity failure, which can be traced back to violations of the support condition, may be found in Bracewell (1986[link], pp. 436–437).

It follows from previous definitions that, for all distributions [T \in {\scr D}\,'], the following identities hold:

  • (i) [\delta * T = T]: [\delta] is the unit convolution;

  • (ii) [\delta_{({\bf a})} * T = \tau_{\bf a} T]: translation is a convolution with the corresponding translate of δ;

  • (iii) [(D^{{\bf p}} \delta) * T = D^{{\bf p}} T]: differentiation is a convolution with the corresponding derivative of δ;

  • (iv) translates or derivatives of a convolution may be obtained by translating or differentiating any one of the factors: convolution `commutes' with translation and differentiation, a property used in Section 1.3.4.4.7.7[link] to speed up least-squares model refinement for macromolecules.

The latter property is frequently used for the purpose of regularization: if T is a distribution, α an infinitely differentiable function, and at least one of the two has compact support, then [T * \alpha] is an infinitely differentiable ordinary function. Since sequences [(\alpha_{\nu})] of such functions α can be constructed which have compact support and converge to δ, it follows that any distribution T can be obtained as the limit of infinitely differentiable functions [T * \alpha_{\nu}]. In topological jargon: [{\scr D}({\bb R}^{n})] is `everywhere dense' in [{\scr D}\,'({\bb R}^{n})]. A standard function in [{\scr D}] which is often used for such proofs is defined as follows: put [\eqalign{\theta (x) &= {1 \over A} \exp \left(- {1 \over 1-x^{2}}\right) \quad \hbox{for } |x| \leq 1, \cr \theta (x) &= 0 \quad \hbox{for } |x| \geq 1,}] with [A = \int\limits_{-1}^{+1} \exp \left(- {1 \over 1-x^{2}}\right) \;\hbox{d}x] (so that θ is in [{\scr D}] and is normalized), and put [\eqalign{\theta_{\varepsilon} (x) &= {1 \over \varepsilon} \theta \left({x \over \varepsilon}\right)\quad \hbox{in dimension } 1,\cr \theta_{\varepsilon} ({\bf x}) &= \prod\limits_{j=1}^{n} \theta_{\varepsilon} (x_{j})\quad \hbox{in dimension } n.}]
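
The following numerical sketch builds [\theta_{\varepsilon}] as above and convolves it with the Heaviside step Y introduced earlier; the grid and the values of [\varepsilon] are arbitrary. The smoothed copies [Y * \theta_{\varepsilon}] converge to Y away from the jump:

```python
# Sketch: regularization by convolution with the bump function theta_eps defined
# above.  T is taken to be (the distribution of) the Heaviside step Y; grid and
# eps values are arbitrary.
import numpy as np

def theta(x):
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0/(1.0 - x[inside]**2))
    return out

x = np.linspace(-2.0, 2.0, 8001)
dx = x[1] - x[0]
A = np.trapz(theta(x), x)                    # the normalization constant A

Y = (x >= 0).astype(float)                   # Heaviside step

for eps in (0.2, 0.05, 0.01):
    theta_eps = theta(x/eps) / (A*eps)       # theta_eps -> delta as eps -> 0
    smooth = np.convolve(Y, theta_eps, mode='same') * dx      # Y * theta_eps
    mask = (np.abs(x) > 2*eps) & (np.abs(x) < 1.0)            # away from jump and edges
    print(eps, np.max(np.abs(smooth - Y)[mask]))              # -> 0: smooth copies tend to Y
```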

Another related result, also proved by convolution, is the structure theorem: the restriction of a distribution [T \in {\scr D}\,'({\bb R}^{n})] to a bounded open set Ω in [{\bb R}^{n}] is a derivative of finite order of a continuous function.

Properties (i)[link] to (iv)[link] are the basis of the symbolic or operational calculus (see Carslaw & Jaeger, 1948[link]; Van der Pol & Bremmer, 1955[link]; Churchill, 1958[link]; Erdélyi, 1962[link]; Moore, 1971[link]) for solving integro-differential equations with constant coefficients by turning them into convolution equations, then using factorization methods for convolution algebras (Schwartz, 1965[link]).

1.3.2.4. Fourier transforms of functions


1.3.2.4.1. Introduction


Given a complex-valued function f on [{\bb R}^{n}] subject to suitable regularity conditions, its Fourier transform [{\scr F}[\;f]] and Fourier cotransform [\bar{\scr F}[\;f]] are defined as follows: [\eqalign{{\scr F}[\;f] (\xi) &= {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x}) \exp (-2\pi i {\boldxi} \cdot {\bf x}) \;\hbox{d}^{n} {\bf x}\cr \bar{\scr F}[\;f] (\xi) &= {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x}) \exp (+2\pi i {\boldxi} \cdot {\bf x}) \;\hbox{d}^{n} {\bf x},}] where [{\boldxi} \cdot {\bf x} = {\textstyle\sum_{i=1}^{n}} \xi_{i}x_{i}] is the ordinary scalar product. The terminology and sign conventions given above are the standard ones in mathematics; those used in crystallography are slightly different (see Section 1.3.4.2.1.1[link]). These transforms enjoy a number of remarkable properties, whose natural settings entail different regularity assumptions on f: for instance, properties relating to convolution are best treated in [L^{1} ({\bb R}^{n})], while Parseval's theorem requires the Hilbert space structure of [L^{2} ({\bb R}^{n})]. After a brief review of these classical properties, the Fourier transformation will be examined in a space [{\scr S}({\bb R}^{n})] particularly well suited to accommodating the full range of its properties, which will later serve as a space of test functions to extend the Fourier transformation to distributions.

There exists an abundant literature on the `Fourier integral'. The books by Carslaw (1930)[link], Wiener (1933)[link], Titchmarsh (1948)[link], Katznelson (1968)[link], Sneddon (1951[link], 1972[link]), and Dym & McKean (1972)[link] are particularly recommended.

1.3.2.4.2. Fourier transforms in [L^{1}]


1.3.2.4.2.1. Linearity


Both transformations [{\scr F}] and [\bar{\scr F}] are obviously linear maps from [L^{1}] to [L^{\infty}] when these spaces are viewed as vector spaces over the field [{\bb C}] of complex numbers.

1.3.2.4.2.2. Effect of affine coordinate transformations


[{\scr F}] and [\bar{\scr F}] turn translations into phase shifts: [\eqalign{{\scr F}[\tau_{\bf a}\; f] ({\boldxi}) &= \exp (-2\pi i {\boldxi} \cdot {\bf a}) {\scr F}[\;f] ({\boldxi})\cr \bar{\scr F}[\tau_{\bf a}\; f] ({\boldxi}) &= \exp (+2\pi i {\boldxi} \cdot {\bf a}) \bar{\scr F}[\;f] ({\boldxi}).}]

Under a general linear change of variable [{\bf x} \;\longmapsto\; {\bf Ax}] with non-singular matrix A, the transform of [A^{\#} f] is [\eqalign{{\scr F}[A^{\#} f] ({\boldxi}) &= {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf A}^{-1} {\bf x}) \exp (-2\pi i {\boldxi} \cdot {\bf x}) \;\hbox{d}^{n} {\bf x}\cr &= {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf y}) \exp (-2\pi i (A^{T} {\boldxi}) \cdot {\bf y}) |\det {\bf A}| \;\hbox{d}^{n} {\bf y}\cr &\phantom{{\bb R}^{n}f({\bf y}) \exp (-2\pi i (A^{T} {\boldxi}) \cdot {\bf y}) |\det {\bf A}|} \hbox{ by } {\bf x} = {\bf Ay}\cr &= |\det {\bf A}| {\scr F}[\;f] ({\bf A}^{T} {\boldxi})}] i.e. [{\scr F}[A^{\#} f] = |\det {\bf A}| [({\bf A}^{-1})^{T}]^{\#} {\scr F}[\;f]] and similarly for [\bar{\scr F}]. The matrix [({\bf A}^{-1})^{T}] is called the contragredient of matrix A.

Under an affine change of coordinates [{\bf x} \;\longmapsto\; S({\bf x}) = {\bf Ax} + {\bf b}] with non-singular matrix A, the transform of [S^{\#} f] is given by [\eqalign{{\scr F}[S^{\#} f] ({\boldxi}) &= {\scr F}[\tau_{\bf b} (A^{\#} f)] ({\boldxi})\cr &= \exp (-2\pi i {\boldxi} \cdot {\bf b}) {\scr F}[A^{\#} f] ({\boldxi})\cr &= \exp (-2\pi i {\boldxi} \cdot {\bf b}) |\det {\bf A}| {\scr F}[\;f] ({\bf A}^{T} {\boldxi})}] with a similar result for [\bar{\scr F}], replacing −i by +i.
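
In dimension 1 the linear change-of-variable formula reduces to [{\scr F}[\;f(x/a)](\xi) = |a|\, {\scr F}[\;f](a \xi)], which is easily checked numerically; the Gaussian f, the scale a and the point ξ below are arbitrary choices:

```python
# Sketch of F[f(./a)](xi) = |a| F[f](a*xi) (the one-dimensional case of the
# change-of-variable formula).  f is the standard Gaussian, so the right-hand
# side is known exactly; a and xi are arbitrary.
import numpy as np
from scipy.integrate import quad

a, xi = 2.5, 0.3
f = lambda x: np.exp(-np.pi*x**2)                   # F[f](xi) = exp(-pi*xi^2)

re = quad(lambda x:  f(x/a)*np.cos(2*np.pi*xi*x), -15, 15)[0]
im = quad(lambda x: -f(x/a)*np.sin(2*np.pi*xi*x), -15, 15)[0]
print(re, im)                                        # imaginary part ~ 0
print(abs(a)*np.exp(-np.pi*(a*xi)**2))               # matches the real part
```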

1.3.2.4.2.3. Conjugate symmetry


The kernels of the Fourier transformations [{\scr F}] and [\bar{\scr F}] satisfy the following identities: [\exp (\pm 2\pi i {\boldxi} \cdot {\bf x}) = \exp \overline{[\pm 2\pi i {\boldxi} \cdot (-{\bf x})]} = \exp \overline{[\pm 2\pi i (-{\boldxi}) \cdot {\bf x}]}.] As a result the transformations [{\scr F}] and [\bar{\scr F}] themselves have the following `conjugate symmetry' properties [where the notation [\breve{f}({\bf x}) = f(-{\bf x})] of Section 1.3.2.2.2[link] will be used]: [\displaylines{{\scr F}[\;f] ({\boldxi}) = \overline{{\scr F}[\bar{\; f}] (-{\boldxi})} = \breve{\overline{{\scr F}[\bar{\; f}] ({\boldxi})}}\cr {\scr F}[\;f] ({\boldxi}) = \overline{{\scr F}[\breve{\bar{\;f}}] ({\boldxi})}.}] Therefore,

  • (i) f real [\Leftrightarrow f = \bar{f} \Leftrightarrow {\scr F}[\;f] = \breve{\overline{{\scr F}[\;f]}} \Leftrightarrow {\scr F}[\;f] ({\boldxi}) = \overline{{\scr F}[\;f] (-{\boldxi})}:{\scr F}[\;f]] is said to possess Hermitian symmetry;

  • (ii) f centrosymmetric [\Leftrightarrow f = \breve{f} \Leftrightarrow {\scr F}[\;f] = \overline{{\scr F}[\bar{\; f}]}];

  • (iii) f real centrosymmetric [\Leftrightarrow f = \bar{f} = \breve{f} \Leftrightarrow {\scr F}[\;f] = \overline{{\scr F}[\;f]} = \breve{\overline{{\scr F}[\;f]}} \Leftrightarrow {\scr F}[\;f]] real centrosymmetric.

Conjugate symmetry is the basis of Friedel's law (Section 1.3.4.2.1.4[link]) in crystallography.

1.3.2.4.2.4. Tensor product property


Another elementary property of [{\scr F}] is its naturality with respect to tensor products. Let [u \in L^{1} ({\bb R}^{m})] and [v \in L^{1} ({\bb R}^{n})], and let [{\scr F}_{\bf x},{\scr F}_{\bf y},{\scr F}_{{\bf x}, \,{\bf y}}] denote the Fourier transformations in [L^{1} ({\bb R}^{m}),L^{1} ({\bb R}^{n})] and [L^{1} ({\bb R}^{m} \times {\bb R}^{n})], respectively. Then [{\scr F}_{{\bf x}, \, {\bf y}} [u \otimes v] = {\scr F}_{\bf x} [u] \otimes {\scr F}_{\bf y} [v].] Furthermore, if [f \in L^{1} ({\bb R}^{m} \times {\bb R}^{n})], then [{\scr F}_{\bf y} [\;f] \in L^{1} ({\bb R}^{m})] as a function of x and [{\scr F}_{\bf x} [\;f] \in L^{1} ({\bb R}^{n})] as a function of y, and [{\scr F}_{{\bf x}, \,{\bf y}} [\;f] = {\scr F}_{\bf x} [{\scr F}_{\bf y} [\;f]] = {\scr F}_{\bf y} [{\scr F}_{\bf x} [\;f]].] This is easily proved by using Fubini's theorem and the fact that [({\boldxi}, {\boldeta}) \cdot ({\bf x},{ \bf y}) = {\boldxi} \cdot {\bf x} + {\boldeta} \cdot {\bf y}], where [{\bf x}, {\boldxi} \in {\bb R}^{m},{\bf y}, {\boldeta} \in {\bb R}^{n}]. This property may be written: [{\scr F}_{{\bf x}, \, {\bf y}} = {\scr F}_{\bf x} \otimes {\scr F}_{\bf y}.]

1.3.2.4.2.5. Convolution property


If f and g are summable, their convolution [f * g] exists and is summable, and [{\scr F}[\;f * g] ({\boldxi}) = {\textstyle\int\limits_{{\bb R}^{n}}} \left[{\textstyle\int\limits_{{\bb R}^{n}}} f({\bf y}) g({\bf x} - {\bf y}) \;\hbox{d}^{n} {\bf y}\right] \exp (-2\pi i {\boldxi} \cdot {\bf x}) \;\hbox{d}^{n} {\bf x}.] With [{\bf x} = {\bf y} + {\bf z}], so that [\exp (-2\pi i{\boldxi} \cdot {\bf x}) = \exp (-2\pi i{\boldxi} \cdot {\bf y}) \exp (-2\pi i{\boldxi} \cdot {\bf z}),] and with Fubini's theorem, rearrangement of the double integral gives: [{\scr F}[\;f * g] = {\scr F}[\;f] \times {\scr F}[g]] and similarly [\bar{\scr F}[\;f * g] = \bar{\scr F}[\;f] \times \bar{\scr F}[g].] Thus the Fourier transform and cotransform turn convolution into multiplication.
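
A numerical sketch of this convolution theorem, with two Gaussians of different widths chosen arbitrarily for f and g; the transforms are approximated by Riemann sums on a grid:

```python
# Sketch of F[f*g] = F[f] x F[g] for two summable functions (Gaussians of
# different widths, an arbitrary choice); transforms are Riemann-sum
# approximations on a grid.
import numpy as np

x = np.linspace(-20.0, 20.0, 8001)
dx = x[1] - x[0]
f = np.exp(-np.pi*x**2)
g = np.exp(-0.5*np.pi*x**2)

conv = np.convolve(f, g, mode='same')*dx            # (f*g) sampled on the same grid

def ft(h, xi):                                      # Riemann-sum Fourier transform
    return np.sum(h*np.exp(-2j*np.pi*xi*x))*dx

for xi in (0.0, 0.25, 0.7):
    print(ft(conv, xi), ft(f, xi)*ft(g, xi))        # the two sides agree
```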

1.3.2.4.2.6. Reciprocity property


In general, [{\scr F}[\;f]] and [\bar{\scr F}[\;f]] are not summable, and hence cannot be further transformed; however, as they are essentially bounded, their products with the Gaussians [G_{t} (\xi) = \exp (-2\pi^{2} \|\xi\|^{2} t)] are summable for all [t \gt 0], and it can be shown that [f = \lim\limits_{t\rightarrow 0} \bar{\scr F}[G_{t} {\scr F}[\;f]] = \lim\limits_{t \rightarrow 0} {\scr F}[G_{t} \bar{\scr F}[\;f]],] where the limit is taken in the topology of the [L^{1}] norm [\|.\|_{1}]. Thus [{\scr F}] and [\bar{\scr F}] are (in a sense) mutually inverse, which justifies the common practice of calling [\bar{\scr F}] the `inverse Fourier transformation'.

1.3.2.4.2.7. Riemann–Lebesgue lemma


If [f \in L^{1} ({\bb R}^{n})], i.e. is summable, then [{\scr F}[\;f]] and [\bar{\scr F}[\;f]] exist and are continuous and essentially bounded: [\|{\scr F}[\;f]\|_{\infty} = \|\bar{\scr F}[\;f]\|_{\infty} \leq \|\;f\|_{1}.] In fact one has the much stronger property, whose statement constitutes the Riemann–Lebesgue lemma, that [{\scr F}[\;f] ({\boldxi})] and [\bar{\scr F}[\;f] ({\boldxi})] both tend to zero as [\|{\boldxi}\| \rightarrow \infty].

1.3.2.4.2.8. Differentiation


Let us now suppose that [n = 1] and that [f \in L^{1} ({\bb R})] is differentiable with [f' \in L^{1} ({\bb R})]. Integration by parts yields [\eqalign{{\scr F}[\;f'] (\xi) &= {\textstyle\int\limits_{-\infty}^{+\infty}} f' (x) \exp (-2\pi i\xi \cdot x) \;\hbox{d}x\cr &= [\;f(x) \exp (-2\pi i\xi \cdot x)]_{-\infty}^{+\infty}\cr &\quad + 2\pi i\xi {\textstyle\int\limits_{-\infty}^{+\infty}} f(x) \exp (-2\pi i\xi \cdot x) \;\hbox{d}x.}] Since f′ is summable, f has a limit when [x \rightarrow \pm \infty], and this limit must be 0 since f is summable. Therefore [{\scr F}[\;f'] (\xi) = (2\pi i\xi) {\scr F}[\;f] (\xi)] with the bound [\|2\pi \xi {\scr F}[\;f]\|_{\infty} \leq \|\;f'\|_{1}] so that [|{\scr F}[\;f] (\xi)|] decreases faster than [1/|\xi|] as [|\xi| \rightarrow \infty].

This result can be easily extended to several dimensions and to any multi-index m: if f is summable and has continuous summable partial derivatives up to order [|{\bf m}|], then [{\scr F}[D^{{\bf m}} f] ({\boldxi}) = (2\pi i{\boldxi})^{{\bf m}} {\scr F}[\;f] ({\boldxi})] and [\|(2\pi {\boldxi})^{{\bf m}} {\scr F}[\;f]\|_{\infty} \leq \|D^{{\bf m}} f\|_{1}.]

Similar results hold for [\bar{\scr F}], with [2\pi i{\boldxi}] replaced by [-2\pi i{\boldxi}]. Thus, the more differentiable f is, with summable derivatives, the faster [{\scr F}[\;f]] and [\bar{\scr F}[\;f]] decrease at infinity.
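
The one-dimensional identity [{\scr F}[\;f'](\xi) = (2\pi i \xi) {\scr F}[\;f](\xi)] can be verified directly by quadrature; the Gaussian f and the point ξ below are arbitrary choices:

```python
# Sketch of F[f'](xi) = (2*pi*i*xi) F[f](xi) for a smooth summable f with
# summable derivative (standard Gaussian, arbitrary point xi).
import numpy as np
from scipy.integrate import quad

xi = 0.4
f  = lambda x: np.exp(-np.pi*x**2)
fp = lambda x: -2.0*np.pi*x*np.exp(-np.pi*x**2)

def ft(h, xi):
    re = quad(lambda x:  h(x)*np.cos(2*np.pi*xi*x), -10, 10)[0]
    im = quad(lambda x: -h(x)*np.sin(2*np.pi*xi*x), -10, 10)[0]
    return re + 1j*im

print(ft(fp, xi))                        # ~ 2*pi*i*xi*exp(-pi*xi^2)
print(2j*np.pi*xi*ft(f, xi))
```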

The property of turning differentiation into multiplication by a monomial has many important applications in crystallography, for instance differential syntheses (Sections 1.3.4.2.1.9[link], 1.3.4.4.7.2[link], 1.3.4.4.7.5[link]) and moment-generating functions [Section 1.3.4.5.2.1[link](c [link])].

1.3.2.4.2.9. Decrease at infinity


Conversely, assume that f is summable on [{\bb R}^{n}] and that f decreases fast enough at infinity for [{\bf x}^{{\bf m}} f] also to be summable, for some multi-index m. Then the integral defining [{\scr F}[\;f]] may be subjected to the differential operator [D^{{\bf m}}], still yielding a convergent integral: therefore [D^{{\bf m}} {\scr F}[\;f]] exists, and [D^{{\bf m}} ({\scr F}[\;f]) ({\boldxi}) = {\scr F}[(-2\pi i{\bf x})^{{\bf m}} f] ({\boldxi})] with the bound [\|D^{{\bf m}} ({\scr F}[\;f])\|_{\infty} \leq \|(2\pi {\bf x})^{{\bf m}} f\|_{1}.]

Similar results hold for [\bar{\scr F}], with [-2\pi i {\bf x}] replaced by [2\pi i{\bf x}]. Thus, the faster f decreases at infinity, the more [{\scr F}[\;f]] and [\bar{\scr F}[\;f]] are differentiable, with bounded derivatives. This property is the converse of that described in Section 1.3.2.4.2.8[link], and their combination is fundamental in the definition of the function space [{\scr S}] in Section 1.3.2.4.4.1[link], of tempered distributions in Section 1.3.2.5[link], and in the extension of the Fourier transformation to them.

1.3.2.4.2.10. The Paley–Wiener theorem


An extreme case of the last instance occurs when f has compact support: then [{\scr F}[\;f]] and [\bar{\scr F}[\;f]] are so regular that they may be analytically continued from [{\bb R}^{n}] to [{\bb C}^{n}] where they are entire functions, i.e. have no singularities at finite distance (Paley & Wiener, 1934[link]). This is easily seen for [{\scr F}[\;f]]: giving vector [{\boldxi} \in {\bb R}^{n}] a vector [{\boldeta} \in {\bb R}^{n}] of imaginary parts leads to [\eqalign{{\scr F}[\;f] ({\boldxi} + i{\boldeta}) &= {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x}) \exp [-2\pi i ({\boldxi} + i{\boldeta}) \cdot {\bf x}] \;\hbox{d}^{n} {\bf x}\cr &= {\scr F}[\exp (2\pi {\boldeta} \cdot {\bf x})f] ({\boldxi}),}] where the latter transform always exists since [\exp (2\pi {\boldeta} \cdot {\bf x})f] is summable with respect to x for all values of η. This analytic continuation forms the basis of the saddlepoint method in probability theory [Section 1.3.4.5.2.1[link](f)[link]] and leads to the use of maximum-entropy distributions in the statistical theory of direct phase determination [Section 1.3.4.5.2.2[link](e)[link]].
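
A simple instance: for f the indicator function of [[-{1 \over 2}, {1 \over 2}]], one has [{\scr F}[\;f](\xi) = \sin (\pi \xi)/(\pi \xi)], which extends to the entire function [\sin (\pi z)/(\pi z)] of [z = \xi + i \eta]. The sketch below (the complex argument z is an arbitrary choice) compares this entire function with the quadrature value of [{\scr F}[\exp (2\pi \eta x)\;f](\xi)]:

```python
# Sketch: for f = indicator of [-1/2, 1/2], F[f](z) = sin(pi z)/(pi z) is entire;
# its value at z = xi + i*eta equals F[exp(2*pi*eta*x) f](xi).  The complex
# argument z is an arbitrary choice.
import numpy as np
from scipy.integrate import quad

z = 0.7 + 0.4j
xi, eta = z.real, z.imag

re = quad(lambda x:  np.exp(2*np.pi*eta*x)*np.cos(2*np.pi*xi*x), -0.5, 0.5)[0]
im = quad(lambda x: -np.exp(2*np.pi*eta*x)*np.sin(2*np.pi*xi*x), -0.5, 0.5)[0]
print(re + 1j*im)
print(np.sin(np.pi*z)/(np.pi*z))         # the same entire function of z
```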

By Liouville's theorem, an entire function in [{\bb C}^{n}] cannot vanish identically on the complement of a compact subset of [{\bb R}^{n}] without vanishing everywhere: therefore [{\scr F}[\;f]] cannot have compact support if f has, and hence [{\scr D}({\bb R}^{n})] is not stable by Fourier transformation.

1.3.2.4.3. Fourier transforms in [L^{2}]


Let f belong to [L^{2} ({\bb R}^{n})], i.e. be such that [\|\;f\|_{2} = \left({\textstyle\int\limits_{{\bb R}^{n}}} |\;f({\bf x})|^{2} \;\hbox{d}^{n} {\bf x}\right)^{1/2} \;\lt\; \infty.]

1.3.2.4.3.1. Invariance of [L^{2}]


[{\scr F}[\;f]] and [\bar{\scr F}[\;f]] exist and are functions in [L^{2}], i.e. [{\scr F}L^{2} = L^{2}], [\bar{\scr F}L^{2} = L^{2}].

1.3.2.4.3.2. Reciprocity


[{\scr F}[\bar{\scr F}[\;f]] = f] and [\bar{\scr F}[{\scr F}[\;f]] = f], equality being taken as `almost everywhere' equality. This again leads to calling [\bar{\scr F}] the `inverse Fourier transformation' rather than the Fourier cotransformation.

1.3.2.4.3.3. Isometry


[{\scr F}] and [\bar{\scr F}] preserve the [L^{2}] norm: [\|{\scr F}[\;f]\|_{2} = \|\bar{\scr F}[\;f]\|_{2} = \|\;f\|_{2} \hbox{ (Parseval's/Plancherel's theorem)}.] This property, which may be written in terms of the inner product (,) in [L^{2}({\bb R}^{n})] as [({\scr F}[\;f], {\scr F}[g]) = (\bar{\scr F}[\;f], \bar{\scr F}[g]) = (\;f,g),] implies that [{\scr F}] and [\bar{\scr F}] are unitary transformations of [L^{2}({\bb R}^{n})] into itself, i.e. infinite-dimensional `rotations'.

1.3.2.4.3.4. Eigenspace decomposition of [L^{2}]


Some light can be shed on the geometric structure of these rotations by the following simple considerations. Note that [\eqalign{{\scr F}^{2}[\;f]({\bf x}) &= {\textstyle\int\limits_{{\bb R}^{n}}} {\scr F}[\;f]({\boldxi}) \exp (-2\pi i{\bf x}\cdot {\boldxi}) \;\hbox{d}^{n}{\boldxi}\cr &= \bar{\scr F}[{\scr F}[\;f]](-{\bf x}) = f(-{\bf x})}] so that [{\scr F}^{4}] (and similarly [\bar{\scr F}^{4}]) is the identity map. Any eigenvalue of [{\scr F}] or [\bar{\scr F}] is therefore a fourth root of unity, i.e. ±1 or [\pm i], and [L^{2}({\bb R}^{n})] splits into an orthogonal direct sum [{\bf H}_{0} \oplus {\bf H}_{1} \oplus {\bf H}_{2} \oplus {\bf H}_{3},] where [{\scr F}] (respectively [\bar{\scr F}]) acts in each subspace [{\bf H}_{k}(k = 0, 1, 2, 3)] by multiplication by [(-i)^{k}]. Orthonormal bases for these subspaces can be constructed from Hermite functions (cf. Section 1.3.2.4.4.2[link]). This method was used by Wiener (1933[link], pp. 51–71).

1.3.2.4.3.5. The convolution theorem and the isometry property


In [L^{2}], the convolution theorem (when applicable) and the Parseval/Plancherel theorem are not independent. Suppose that f, g, [f \times g] and [f * g] are all in [L^{2}] (without questioning whether these properties are independent). Then [f * g] may be written in terms of the inner product in [L^{2}] as follows: [(\;f * g)({\bf x}) = {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x} - {\bf y})g({\bf y}) \;\hbox{d}^{n}{\bf y} = {\textstyle\int\limits_{{\bb R}^{n}}} \overline{\breve{\bar{f}}({\bf y} - {\bf x})}g({\bf y}) \;\hbox{d}^{n}{\bf y},] i.e. [(\;f * g)({\bf x}) = (\tau_{\bf x}\;\breve{\bar{f}}, g).]

Invoking the isometry property, we may rewrite the right-hand side as [\eqalign{({\scr F}[\tau_{\bf x}\;\breve{\bar{f}}], {\scr F}[g]) &= (\exp (- 2\pi i{\bf x} \cdot {\boldxi}) \overline{{\scr F}[\;f]_{\boldxi}}, {\scr F}[g]_{\boldxi})\cr &= {\textstyle\int\limits_{{\bb R}^{n}}} ({\scr F}[\;f] \times {\scr F}[g])({\boldxi})\cr &\quad \times \exp (+ 2\pi i{\bf x} \cdot {\boldxi}) \;\hbox{d}^{n}{\boldxi}\cr &= \bar{\scr F}[{\scr F}[\;f] \times {\scr F}[g]]({\bf x}),}] so that the initial identity yields the convolution theorem.

To obtain the converse implication, note that [\eqalign{(\;f, g) &= {\textstyle\int\limits_{{\bb R}^{n}}} \overline{f({\bf y})}g({\bf y}) \;\hbox{d}^{n}{\bf y} = (\; \breve{\bar{f}} * g)({\bf 0})\cr &= \bar{\scr F}[{\scr F}[\;\breve{\bar{ f}}] \times {\scr F}[g]]({\bf 0})\cr &= {\textstyle\int\limits_{{\bb R}^{n}}} \overline{{\scr F}[\;f]({\boldxi})} {\scr F}[g]({\boldxi}) \;\hbox{d}^{n}{\boldxi} = ({\scr F}[\;f], {\scr F}[g]),}] where conjugate symmetry (Section 1.3.2.4.2.3[link]) has been used.

These relations have an important application in the calculation by Fourier transform methods of the derivatives used in the refinement of macromolecular structures (Section 1.3.4.4.7[link]).

1.3.2.4.4. Fourier transforms in [{\scr S}]


1.3.2.4.4.1. Definition and properties of [{\scr S}]


The duality established in Sections 1.3.2.4.2.8[link] and 1.3.2.4.2.9[link] between the local differentiability of a function and the rate of decrease at infinity of its Fourier transform prompts one to consider the space [{\scr S}({\bb R}^{n})] of functions f on [{\bb R}^{n}] which are infinitely differentiable and all of whose derivatives are rapidly decreasing, so that for all multi-indices k and p [({\bf x}^{\bf k}D^{\bf p}f)({\bf x})\rightarrow 0 \quad\hbox{as } \|{\bf x}\|\rightarrow \infty.] The product of [f \in {\scr S}] by any polynomial over [{\bb R}^{n}] is still in [{\scr S}] ([{\scr S}] is an algebra over the ring of polynomials). Furthermore, [{\scr S}] is invariant under translations and differentiation.

If [f \in {\scr S}], then its transforms [{\scr F}[\;f]] and [\bar{\scr F}[\;f]] are

  • (i) infinitely differentiable because f is rapidly decreasing;

  • (ii) rapidly decreasing because f is infinitely differentiable;

hence [{\scr F}[\;f]] and [\bar{\scr F}[\;f]] are in [{\scr S}]: [{\scr S}] is invariant under [{\scr F}] and [\bar{\scr F}].

Since [L^{1} \supset {\scr S}] and [L^{2} \supset {\scr S}], all properties of [{\scr F}] and [\bar{\scr F}] already encountered above are enjoyed by functions of [{\scr S}], with all restrictions on differentiability and/or integrability lifted. For instance, given two functions f and g in [{\scr S}], then both f g and [f * g] are in [{\scr S}] (which was not the case with [L^{1}] nor with [L^{2}]) so that the reciprocity theorem inherited from [L^{2}] [{\scr F}[\bar{\scr F}[\;f]] = f \quad\hbox{and}\quad \bar{\scr F}[{\scr F}[\;f]] = f] allows one to state the reverse of the convolution theorem first established in [L^{1}]: [\eqalign{{\scr F}[\;fg] &= {\scr F}[\;f] * {\scr F}[g]\cr \bar{\scr F}[\;fg] &= \bar{\scr F}[\;f] * \bar{\scr F}[g].}]

1.3.2.4.4.2. Gaussian functions and Hermite functions


Gaussian functions are particularly important elements of [{\scr S}]. In dimension 1, a well known contour integration (Schwartz, 1965[link], p. 184) yields [{\scr F}[\exp (- \pi x^{2})](\xi) = \bar{\scr F}[\exp (- \pi x^{2})](\xi) = \exp (- \pi \xi^{2}),] which shows that the `standard Gaussian' [\exp (- \pi x^{2})] is invariant under [{\scr F}] and [\bar{\scr F}]. By a tensor product construction, it follows that the same is true of the standard Gaussian [G({\bf x}) = \exp (- \pi \|{\bf x}\|^{2})] in dimension n: [{\scr F}[G]({\boldxi}) = \bar{\scr F}[G]({\boldxi}) = G({\boldxi}).] In other words, G is an eigenfunction of [{\scr F}] and [\bar{\scr F}] for eigenvalue 1 (Section 1.3.2.4.3.4[link]).

A complete system of eigenfunctions may be constructed as follows. In dimension 1, consider the family of functions [H_{m} = {D^{m}G^{2} \over G}\quad (m \geq 0),] where D denotes the differentiation operator. The first two members of the family [H_{0} = G,\qquad H_{1} = 2 DG,] are such that [{\scr F}[H_{0}] = H_{0}], as shown above, and [DG(x) = - 2\pi xG(x) = i(2\pi ix)G(x) = i{\scr F}[DG](x),] hence [{\scr F}[H_{1}] = (- i)H_{1}.] We may thus take as an induction hypothesis that [{\scr F}[H_{m}] = (-i)^{m}H_{m}.] The identity [D\left({D^{m}G^{2} \over G}\right) = {D^{m+1}G^{2} \over G} - {DG \over G} {D^{m}G^{2} \over G}] may be written [H_{m+1}(x) = (DH_{m})(x) - 2\pi xH_{m}(x),] and the two differentiation theorems give: [\eqalign{{\scr F}[DH_{m}](\xi) &= (2\pi i{\boldxi}) {\scr F}[H_{m}](\xi)\cr {\scr F}[-2\pi xH_{m}](\xi) &= - iD({\scr F}[H_{m}])(\xi).}] Combination of this with the induction hypothesis yields [\eqalign{{\scr F}[H_{m+1}](\xi) &= (-i)^{m+1}[(DH_{m})(\xi) - 2\pi \xi H_{m}(\xi)]\cr &= (-i)^{m+1} H_{m+1}(\xi),}] thus proving that [H_{m}] is an eigenfunction of [{\scr F}] for eigenvalue [(-i)^{m}] for all [m \geq 0]. The same proof holds for [\bar{\scr F}], with eigenvalue [i^{m}]. If these eigenfunctions are normalized as [{\scr H}_{m}(x) = {(-1)^{m}2^{1/4} \over \sqrt{m!}2^{m}\pi^{m/2}} H_{m}(x),] then it can be shown that the collection of Hermite functions [\{{\scr H}_{m}(x)\}_{m \geq 0}] constitutes an orthonormal basis of [L^{2}({\bb R})] such that [{\scr H}_{m}] is an eigenfunction of [{\scr F}] (respectively [\bar{\scr F}]) for eigenvalue [(-i)^{m}] (respectively [i^{m}]).
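
The eigenfunction property is easy to check numerically for small m; the sketch below constructs [H_{2} = D^{2}G^{2}/G] symbolically and compares [{\scr F}[H_{2}]], evaluated by quadrature, with [(-i)^{2}H_{2} = -H_{2}] at a few arbitrarily chosen points:

```python
# Sketch: H_2 = (D^2 G^2)/G is an eigenfunction of F for eigenvalue (-i)^2 = -1.
# H_2 is built symbolically, its transform evaluated by quadrature at a few
# arbitrary points xi.
import numpy as np
import sympy as sp
from scipy.integrate import quad

xs = sp.symbols('x')
G  = sp.exp(-sp.pi*xs**2)
m  = 2
Hm = sp.lambdify(xs, sp.simplify(sp.diff(G**2, xs, m)/G), 'numpy')   # H_m = D^m G^2 / G

def ft(h, xi):
    re = quad(lambda x:  h(x)*np.cos(2*np.pi*xi*x), -8, 8)[0]
    im = quad(lambda x: -h(x)*np.sin(2*np.pi*xi*x), -8, 8)[0]
    return re + 1j*im

for xi in (0.0, 0.3, 1.1):
    print(ft(Hm, xi), (-1j)**m * Hm(xi))                    # both sides agree
```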

In dimension n, the same construction can be extended by tensor product to yield the multivariate Hermite functions [{\scr H}_{\bf m}({\bf x}) = {\scr H}_{m_{1}}(x_{1}) \times {\scr H}_{m_{2}}(x_{2}) \times \ldots \times {\scr H}_{m_{n}}(x_{n})] (where [{\bf m} \geq {\bf 0}] is a multi-index). These constitute an orthonormal basis of [L^{2}({\bb R}^{n})], with [{\scr H}_{\bf m}] an eigenfunction of [{\scr F}] (respectively [\bar{\scr F}]) for eigenvalue [(-i)^{|{\bf m}|}] (respectively [i^{|{\bf m}|}]). Thus the subspaces [{\bf H}_{k}] of Section 1.3.2.4.3.4[link] are spanned by those [{\scr H}_{\bf m}] with [|{\bf m}| \equiv k\hbox{ mod } 4\ (k = 0, 1, 2, 3)].

General multivariate Gaussians are usually encountered in the non-standard form [G_{\bf A}({\bf x}) = \exp (- {\textstyle{1 \over 2}} {\bf x}^{T} \cdot {\bf Ax}),] where A is a symmetric positive-definite matrix. Diagonalizing A as [{\bf E}\boldLambda{\bf E}^{T}] with [{\bf EE}^{T}] the identity matrix, and putting [{\bf A}^{1/2} = {\bf E}{\boldLambda}^{1/2}{\bf E}^{T}], we may write [G_{\bf A}({\bf x}) = G\left[\left({{\bf A} \over 2 \pi}\right)^{1/2} {\bf x}\right]] i.e. [G_{\bf A} = [(2\pi {\bf A}^{-1})^{1/2}]^{\#} G\hbox{;}] hence (by Section 1.3.2.4.2.2[link]) [{\scr F}[G_{\bf A}] = |\det (2\pi {\bf A}^{-1})|^{1/2} \left[\left({{\bf A} \over 2 \pi}\right)^{1/2}\right]^{\#} G,] i.e. [{\scr F}[G_{\bf A}]({\boldxi}) = |\det (2\pi {\bf A}^{-1})|^{1/2} G[(2\pi {\bf A}^{-1})^{1/2}{\boldxi}],] i.e. finally [{\scr F}[G_{\bf A}] = |\det (2\pi {\bf A}^{-1})|^{1/2} G_{4\pi^{2}{\bf A}^{-1}}.]
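
A two-dimensional numerical sketch of the last identity; the symmetric positive-definite matrix A and the point ξ below are arbitrary choices, and the transform is evaluated by direct quadrature:

```python
# Two-dimensional sketch of F[G_A] = |det(2*pi*A^{-1})|^{1/2} G_{4*pi^2*A^{-1}},
# with G_A(x) = exp(-x^T A x / 2).  The matrix A and the point xi are arbitrary.
import numpy as np
from scipy.integrate import dblquad

A = np.array([[2.0, 0.6],
              [0.6, 1.0]])                     # symmetric positive definite
Ainv = np.linalg.inv(A)
xi = np.array([0.3, -0.2])

def integrand(y, x, trig):
    v = np.array([x, y])
    return np.exp(-0.5 * v @ A @ v) * trig(2*np.pi*np.dot(xi, v))

re =  dblquad(integrand, -10, 10, -10, 10, args=(np.cos,))[0]
im = -dblquad(integrand, -10, 10, -10, 10, args=(np.sin,))[0]

pred = np.sqrt(np.linalg.det(2*np.pi*Ainv)) * np.exp(-2*np.pi**2 * xi @ Ainv @ xi)
print(re, im)          # imaginary part ~ 0
print(pred)            # matches the real part
```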

This result is widely used in crystallography, e.g. to calculate form factors for anisotropic atoms (Section 1.3.4.2.2.6[link]) and to obtain transforms of derivatives of Gaussian atomic densities (Section 1.3.4.4.7.10[link]).

1.3.2.4.4.3. Heisenberg's inequality, Hardy's theorem


The result just obtained, which also holds for [\bar{\scr F}], shows that the `peakier' [G_{\bf A}], the `broader' [{\scr F}[G_{\bf A}]]. This is a general property of the Fourier transformation, expressed in dimension 1 by the Heisenberg inequality (Weyl, 1931[link]): [\eqalign{&\left({\int} x^{2}|\;f(x)|^{2} \;\hbox{d}x\right) \left({\int} \xi^{2}|{\scr F}[\;f]( \xi)|^{2} \;\hbox{d}\xi \right)\cr &\quad \geq {1 \over 16\pi^{2}} \left({\int} |\;f(x)|^{2} \;\hbox{d}x\right)^{2},}] where, by a beautiful theorem of Hardy (1933)[link], equality can only be attained for f Gaussian. Hardy's theorem is even stronger: if both f and [{\scr F}[\;f]] behave at infinity as constant multiples of G, then each of them is everywhere a constant multiple of G; if both f and [{\scr F}[\;f]] behave at infinity as constant multiples of [G \times \hbox{monomial}], then each of them is a finite linear combination of Hermite functions. Hardy's theorem is invoked in Section 1.3.4.4.5[link] to derive the optimal procedure for spreading atoms on a sampling grid in order to obtain the most accurate structure factors.
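
Both the inequality and the case of equality can be seen in a small numerical experiment; the two trial functions below (a Gaussian, for which equality holds, and x times a Gaussian, for which the inequality is strict) are arbitrary choices, and the transforms are approximated by Riemann sums:

```python
# Sketch of the Heisenberg inequality: equality for a Gaussian, strict
# inequality otherwise.  The trial functions and the grid are arbitrary;
# transforms are Riemann-sum approximations.
import numpy as np

x  = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
xi = np.linspace(-10.0, 10.0, 4001)

def check(f):
    F = np.array([np.sum(f*np.exp(-2j*np.pi*k*x))*dx for k in xi])
    left  = np.trapz(x**2*np.abs(f)**2, x) * np.trapz(xi**2*np.abs(F)**2, xi)
    right = np.trapz(np.abs(f)**2, x)**2 / (16*np.pi**2)
    return left, right

print(check(np.exp(-np.pi*x**2)))        # left ~ right: equality for the Gaussian
print(check(x*np.exp(-np.pi*x**2)))      # left > right: strict inequality
```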

The search for optimal compromises between the confinement of f to a compact domain in x-space and of [{\scr F}[\;f]] to a compact domain in ξ-space leads to consideration of prolate spheroidal wavefunctions (Pollack & Slepian, 1961[link]; Landau & Pollack, 1961[link], 1962[link]).

1.3.2.4.4.4. Symmetry property


A final formal property of the Fourier transform, best established in [{\scr S}], is its symmetry: if f and g are in [{\scr S}], then by Fubini's theorem [\eqalign{\langle {\scr F}[\;f], g\rangle &= {\textstyle\int\limits_{{\bb R}^{n}}} \left({\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x}) \exp (-2\pi i{\boldxi} \cdot {\bf x}) \;\hbox{d}^{n}{\bf x}\right) g({\boldxi}) \;\hbox{d}^{n}{\boldxi}\cr &= {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x}) \left({\textstyle\int\limits_{{\bb R}^{n}}} g({\boldxi}) \exp (-2\pi i{\boldxi} \cdot {\bf x}) \;\hbox{d}^{n}{\boldxi}\right) \;\hbox{d}^{n}{\bf x}\cr &= \langle f, {\scr F}[g]\rangle.}]

This possibility of `transposing' [{\scr F}] (and [\bar{\scr F}]) from the left to the right of the duality bracket will be used in Section 1.3.2.5.4[link] to extend the Fourier transformation to distributions.

1.3.2.4.5. Various writings of Fourier transforms


Other ways of writing Fourier transforms in [{\bb R}^{n}] exist besides the one used here. All have the form [{\scr F}_{h, \, \omega}[\;f]({\boldxi}) = {1 \over h^{n}} {\int\limits_{{\bb R}^{n}}} f({\bf x}) \exp (-i\omega {\boldxi} \cdot {\bf x}) \;\hbox{d}^{n}{\bf x},] where h is real positive and ω real non-zero, with the reciprocity formula written: [f({\bf x}) = {1 \over k^{n}} {\int\limits_{{\bb R}^{n}}} {\scr F}_{h, \,\omega}[\;f]({\boldxi}) \exp (+i\omega {\boldxi} \cdot {\bf x}) \;\hbox{d}^{n}{\boldxi}] with k real positive. The consistency condition between h, k and ω is [hk = {2\pi \over |\omega|}.]

The usual choices are:

  • (i) [\omega = \pm 2 \pi, h = k = 1] (as here);

  • (ii) [\omega = \pm 1, h = 1, k = 2 \pi] (in probability theory and in solid-state physics);

  • (iii) [\omega = \pm 1, h = k = \sqrt{2 \pi}] (in much of classical analysis).

It should be noted that conventions (ii) and (iii) introduce numerical factors of 2π in convolution and Parseval formulae, while (ii) breaks the symmetry between [{\scr F}] and [\bar{\scr F}].

1.3.2.4.6. Tables of Fourier transforms


The books by Campbell & Foster (1948)[link], Erdélyi (1954)[link], and Magnus et al. (1966)[link] contain extensive tables listing pairs of functions and their Fourier transforms. Bracewell (1986)[link] lists those pairs particularly relevant to electrical engineering applications.

1.3.2.5. Fourier transforms of tempered distributions


1.3.2.5.1. Introduction


It was found in Section 1.3.2.4.2[link] that the usual space of test functions [{\scr D}] is not invariant under [{\scr F}] and [\bar{\scr F}]. By contrast, the space [{\scr S}] of infinitely differentiable rapidly decreasing functions is invariant under [{\scr F}] and [\bar{\scr F}], and furthermore transposition formulae such as [\langle {\scr F}[\;f], g\rangle = \langle \;f, {\scr F}[g]\rangle] hold for all [f, g \in {\scr S}]. It is precisely this type of transposition which was used successfully in Sections 1.3.2.3.9.1[link] and 1.3.2.3.9.3[link] to define the derivatives of distributions and their products with smooth functions.

This suggests using [{\scr S}] instead of [{\scr D}] as a space of test functions ϕ, and defining the Fourier transform [{\scr F}[T]] of a distribution T by [\langle {\scr F}[T], \varphi \rangle = \langle T, {\scr F}[\varphi] \rangle] whenever T is capable of being extended from [{\scr D}] to [{\scr S}] while remaining continuous. It is this latter proviso which will be subsumed under the adjective `tempered'. As was the case with the construction of [{\scr D}\,'], it is the definition of a sufficiently strong topology (i.e. notion of convergence) in [{\scr S}] which will play a key role in transferring to the elements of its topological dual [{\scr S}\,'] (called tempered distributions) all the properties of the Fourier transformation.

Besides the general references to distribution theory mentioned in Section 1.3.2.3.1[link] the reader may consult the books by Zemanian (1965[link], 1968[link]). Lavoine (1963)[link] contains tables of Fourier transforms of distributions.

1.3.2.5.2. [{\scr S}] as a test-function space


A notion of convergence has to be introduced in [{\scr S}({\bb R}^{n})] in order to be able to define and test the continuity of linear functionals on it.

A sequence [(\varphi_{j})] of functions in [{\scr S}] will be said to converge to 0 if, for any given multi-indices k and p, the sequence [({\bf x}^{{\bf k}}D^{{\bf p}} \varphi_{j})] tends to 0 uniformly on [{\bb R}^{n}].

It can be shown that [{\scr D}({\bb R}^{n})] is dense in [{\scr S}({\bb R}^{n})]. Translation is continuous for this topology. For any linear differential operator [P(D) = {\textstyle\sum_{\bf p}} a_{\bf p} D^{{\bf p}}] and any polynomial [Q({\bf x})] over [{\bb R}^{n}], [(\varphi_{j}) \rightarrow 0] implies [[Q({\bf x}) \times P(D)\varphi_{j}] \rightarrow 0] in the topology of [{\scr S}]. Therefore, differentiation and multiplication by polynomials are continuous for the topology on [{\scr S}].

The Fourier transformations [{\scr F}] and [\bar{\scr F}] are also continuous for the topology of [{\scr S}]. Indeed, let [(\varphi_{j})] converge to 0 for the topology on [{\scr S}]. Then, by Section 1.3.2.4.2[link], [\|(2\pi \boldxi)^{{\bf m}} D^{{\bf p}} ({\scr F}[\varphi_{j}])\|_{\infty} \leq \| D^{{\bf m}} [(2\pi {\bf x})^{{\bf p}} \varphi_{j}]\|_{1}.] The right-hand side tends to 0 as [j \rightarrow \infty] by definition of convergence in [{\scr S}], hence [\|\boldxi\|^{{\bf m}} D^{{\bf p}} ({\scr F}[\varphi_{j}]) \rightarrow 0] uniformly, so that [({\scr F}[\varphi_{j}]) \rightarrow 0] in [{\scr S}] as [j \rightarrow \infty]. The same proof applies to [\bar{\scr F}].

1.3.2.5.3. Definition and examples of tempered distributions


A distribution [T \in {\scr D}\,'({\bb R}^{n})] is said to be tempered if it can be extended into a continuous linear functional on [{\scr S}].

If [{\scr S}\,'({\bb R}^{n})] is the topological dual of [{\scr S}({\bb R}^{n})], and if [S \in {\scr S}^{\prime}({\bb R}^{n})], then its restriction to [{\scr D}] is a tempered distribution; conversely, if [T \in {\scr D}\,'] is tempered, then its extension to [{\scr S}] is unique (because [{\scr D}] is dense in [{\scr S}]), hence it defines an element S of [{\scr S}\,']. We may therefore identify [{\scr S}\,'] and the space of tempered distributions.

A distribution with compact support is tempered, i.e. [{\scr S}\,' \supset {\scr E}\,']. By transposition of the corresponding properties of [{\scr S}], it is readily established that the derivative, translate or product by a polynomial of a tempered distribution is still a tempered distribution.

These inclusion relations may be summarized as follows: since [{\scr S}] contains [{\scr D}] but is contained in [{\scr E}], the reverse inclusions hold for the topological duals, and hence [{\scr S}\,'] contains [{\scr E}\,'] but is contained in [{\scr D}\,'].

A locally summable function f on [{\bb R}^{n}] will be said to be of polynomial growth if [|\;f({\bf x})|] can be majorized by a polynomial in [\|{\bf x}\|] as [\|{\bf x}\| \rightarrow \infty]. It is easily shown that such a function f defines a tempered distribution [T_{f}] via [\langle T_{f}, \varphi \rangle = {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x}.] In particular, polynomials over [{\bb R}^{n}] define tempered distributions, and so do functions in [{\scr S}]. The latter remark, together with the transposition identity (Section 1.3.2.4.4[link]), invites the extension of [{\scr F}] and [\bar{\scr F}] from [{\scr S}] to [{\scr S}\,'].

1.3.2.5.4. Fourier transforms of tempered distributions


The Fourier transform [{\scr F}[T]] and cotransform [\bar{\scr F}[T]] of a tempered distribution T are defined by [\eqalign{\langle {\scr F}[T], \varphi \rangle &= \langle T, {\scr F}[\varphi]\rangle \cr \langle \bar{\scr F}[T], \varphi \rangle &= \langle T, \bar{\scr F}[\varphi]\rangle}] for all test functions [\varphi \in {\scr S}]. Both [{\scr F}[T]] and [\bar{\scr F}[T]] are themselves tempered distributions, since the maps [\varphi \;\longmapsto\; {\scr F}[\varphi]] and [\varphi \;\longmapsto\; \bar{\scr F}[\varphi]] are both linear and continuous for the topology of [{\scr S}]. In the same way that x and ξ have been used consistently as arguments for ϕ and [{\scr F}[\varphi]], respectively, the notation [T_{\bf x}] and [{\scr F}[T]_{\boldxi}] will be used to indicate which variables are involved.

When T is a distribution with compact support, its Fourier transform may be written [{\scr F}[T_{\bf x}]_{\boldxi} = \langle T_{\bf x}, \exp (- 2\pi i \boldxi \cdot {\bf x})\rangle] since the function [{\bf x} \;\longmapsto\; \exp (- 2\pi i {\boldxi} \cdot {\bf x})] is in [{\scr E}] while [T_{\bf x} \in {\scr E}\,']. It can be shown, as in Section 1.3.2.4.2[link], to be analytically continuable into an entire function over [{\bb C}^{n}].

1.3.2.5.5. Transposition of basic properties


The duality between differentiation and multiplication by a monomial extends from [{\scr S}] to [{\scr S}\,'] by transposition: [\eqalign{{\scr F}[D_{\bf x}^{{\bf p}} T_{\bf x}]_{\boldxi} &= (2\pi i \boldxi)^{{\bf p}} {\scr F}[T_{\bf x}]_{\boldxi} \cr D_{\boldxi}^{{\bf p}} ({\scr F}[T_{\bf x}]_{\boldxi}) &= {\scr F}[(- 2\pi i {\bf x})^{{\bf p}} T_{\bf x}]_{\boldxi}.}] Analogous formulae hold for [\bar{\scr F}], with i replaced by −i.

The formulae expressing the duality between translation and phase shift, e.g. [\eqalign{{\scr F}[\tau_{\bf a} T_{\bf x}]_{\boldxi} &= \exp (-2\pi i{\bf a} \cdot {\boldxi}) {\scr F}[T_{\bf x}]_{\boldxi} \cr \tau_{\boldalpha} ({\scr F}[T_{\bf x}]_{\boldxi}) &= {\scr F}[\exp (2\pi i{\boldalpha} \cdot {\bf x}) T_{\bf x}]_{\boldxi}\hbox{;}}] between a linear change of variable and its contragredient, e.g. [{\scr F}[A^{\#} T] = |\hbox{det } {\bf A}| [({\bf A}^{-1})^{T}]^{\#} {\scr F}[T]\hbox{;}] are obtained similarly by transposition from the corresponding identities in [{\scr S}]. They give a transposition formula for an affine change of variables [{\bf x} \;\longmapsto\; S({\bf x}) = {\bf Ax} + {\bf b}] with non-singular matrix A: [\eqalign{{\scr F}[S^{\#} T] &= \exp (-2\pi i{\boldxi} \cdot {\bf b}) {\scr F}[A^{\#} T] \cr &= \exp (-2\pi i{\boldxi} \cdot {\bf b}) |\hbox{det } {\bf A}| [({\bf A}^{-1})^{T}]^{\#} {\scr F}[T],}] with a similar result for [\bar{\scr F}], replacing −i by +i.

Conjugate symmetry is obtained similarly: [{\scr F}[\bar{T}] = \breve{\overline{{\scr F}[T]}}, {\scr F}[\breve{\bar{T}}] = \overline{{\scr F}[T]},] with the same identities for [\bar{\scr F}].

The tensor product property also transposes to tempered distributions: if [U \in {\scr S}\,'({\bb R}^{m}), V \in {\scr S}\,'({\bb R}^{n})], [\eqalign{{\scr F}[U_{\bf x} \otimes V_{\bf y}] &= {\scr F}[U]_{\boldxi} \otimes {\scr F}[V]_{\boldeta} \cr \bar{\scr F}[U_{\bf x} \otimes V_{\bf y}] &= \bar{\scr F}[U]_{\boldxi} \otimes \bar{\scr F}[V]_{\boldeta}.}]

1.3.2.5.6. Transforms of δ-functions


Since δ has compact support, [{\scr F}[\delta_{\bf x}]_{\boldxi} = \langle \delta_{\bf x}, \exp (-2\pi i{\boldxi} \cdot {\bf x})\rangle = 1_{\boldxi},\quad i.e.\ {\scr F}[\delta] = 1.] It is instructive to show that conversely [{\scr F}[1] = \delta] without invoking the reciprocity theorem. Since [\partial_{j} 1 = 0] for all [j = 1, \ldots, n], it follows from Section 1.3.2.3.9.4[link] that [{\scr F}[1] = c\delta]; the constant c can be determined by using the invariance of the standard Gaussian G established in Section 1.3.2.4.3[link]: [\langle {\scr F}[1]_{\bf x}, G_{\bf x}\rangle = \langle 1_{\boldxi}, G_{\boldxi}\rangle = 1\hbox{;}] hence [c = 1]. Thus, [{\scr F}[1] = \delta].

The basic properties above then read (using multi-indices to denote differentiation): [\eqalign{{\scr F}[\delta_{\bf x}^{({\bf m})}]_{\boldxi} = (2\pi i{\boldxi})^{{\bf m}}, \quad &{\scr F}[{\bf x}^{{\bf m}}]_{\boldxi} = (-2\pi i)^{-|{\bf m}|} \delta_{\boldxi}^{({\bf m})}\hbox{;} \cr {\scr F}[\delta_{\bf a}]_{\boldxi} = \exp (-2\pi i{\bf a} \cdot {\boldxi}), \quad &{\scr F}[\exp (2\pi i{\boldalpha} \cdot {\bf x})]_{\boldxi} = \delta_{\boldalpha},}] with analogous relations for [\bar{\scr F}], i becoming −i. Thus derivatives of δ are mapped to monomials (and vice versa), while translates of δ are mapped to `phase factors' (and vice versa).

1.3.2.5.7. Reciprocity theorem


The previous results now allow a self-contained and rigorous proof of the reciprocity theorem between [{\scr F}] and [\bar{\scr F}] to be given, whereas in traditional settings (i.e. in [L^{1}] and [L^{2}]) the implicit handling of δ through a limiting process is always the sticking point.

Reciprocity is first established in [{\scr S}] as follows: [\eqalign{\bar{\scr F}[{\scr F}[\varphi]] ({\bf x}) &= {\textstyle\int\limits_{{\bb R}^{n}}} {\scr F}[\varphi] ({\boldxi}) \exp (2\pi i{\boldxi} \cdot {\bf x})\ {\rm d}^{n} {\boldxi} \cr &= {\textstyle\int\limits_{{\bb R}^{n}}} {\scr F}[\tau_{-{\bf x}} \varphi] ({\boldxi})\ {\rm d}^{n} {\boldxi} \cr &= \langle 1, {\scr F}[\tau_{-{\bf x}} \varphi]\rangle \cr &= \langle {\scr F}[1], \tau_{-{\bf x}} \varphi\rangle \cr &= \langle \tau_{\bf x} \delta, \varphi\rangle \cr &= \varphi ({\bf x})}] and similarly [{\scr F}[\bar{\scr F}[\varphi]] ({\bf x}) = \varphi ({\bf x}).]

The reciprocity theorem is then proved in [{\scr S}\,'] by transposition: [\bar{\scr F}[{\scr F}[T]] = {\scr F}[\bar{\scr F}[T]] = T \quad\hbox{for all } T \in {\scr S}\,'.] Thus the Fourier cotransformation [\bar{\scr F}] in [{\scr S}\,'] may legitimately be called the `inverse Fourier transformation'.

The method of Section 1.3.2.4.3[link] may then be used to show that [{\scr F}] and [\bar{\scr F}] both have period 4 in [{\scr S}\,'].

1.3.2.5.8. Multiplication and convolution

| top | pdf |

Multiplier functions [\alpha ({\bf x})] for tempered distributions must be infinitely differentiable, as for ordinary distributions; furthermore, they must grow sufficiently slowly as [\|x\| \rightarrow \infty] to ensure that [\alpha \varphi \in {\scr S}] for all [\varphi \in {\scr S}] and that the map [\varphi \;\longmapsto\; \alpha \varphi] is continuous for the topology of [{\scr S}]. This leads to choosing for multipliers the subspace [{\scr O}_{M}] consisting of functions [\alpha \in {\scr E}] of polynomial growth. It can be shown that if f is in [{\scr O}_{M}], then the associated distribution [T_{f}] is in [{\scr S}\,'] (i.e. is a tempered distribution); and that conversely if T is in [{\scr S}\,', \mu * T] is in [{\scr O}_{M}] for all [\mu \in {\scr D}].

Corresponding restrictions must be imposed to define the space [{\scr O}'_{C}] of those distributions T whose convolution [S * T] with a tempered distribution S is still a tempered distribution: T must be such that, for all [\varphi \in {\scr S}, \theta ({\bf x}) = \langle T_{\bf y}, \varphi ({\bf x} + {\bf y})\rangle] is in [{\scr S}]; and such that the map [\varphi \;\longmapsto\; \theta] be continuous for the topology of [{\scr S}]. This implies that T is `rapidly decreasing'. It can be shown that if f is in [{\scr S}], then the associated distribution [T_{f}] is in [{\scr O}'_{C}]; and that conversely if T is in [{\scr O}'_{C}, \mu * T] is in [{\scr S}] for all [\mu \in {\scr D}].

The two spaces [{\scr O}_{M}] and [{\scr O}'_{C}] are mapped into each other by the Fourier transformation [\eqalign{{\scr F}({\scr O}_{M}) &= \bar{\scr F}({\scr O}_{M}) = {\scr O}'_{C} \cr {\scr F}({\scr O}'_{C}) &= \bar{\scr F}({\scr O}'_{C}) = {\scr O}_{M}}] and the convolution theorem takes the form [\eqalign{{\scr F}[\alpha S] &= {\scr F}[\alpha] * {\scr F}[S] \quad\; S \in {\scr S}\,', \alpha \in {\scr O}_{M},{\scr F}[\alpha] \in {\scr O}'_{C}\hbox{;}\cr {\scr F}[S * T] &= {\scr F}[S] \times {\scr F}[T] \quad S \in {\scr S}\,', T \in {\scr O}'_{C},{\scr F}[T] \in {\scr O}_{M}.}] The same identities hold for [\bar{\scr F}]. Taken together with the reciprocity theorem, these show that [{\scr F}] and [\bar{\scr F}] establish mutually inverse isomorphisms between [{\scr O}_{M}] and [{\scr O}'_{C}], and exchange multiplication for convolution in [{\scr S}\,'].

It may be noticed that most of the basic properties of [{\scr F}] and [\bar{\scr F}] may be deduced from this theorem and from the properties of δ. Differentiation operators [D^{\bf m}] and translation operators [\tau_{\bf a}] are convolutions with [D^{\bf m}\delta] and [\tau_{\bf a} \delta]; they are turned, respectively, into multiplication by monomials [(\pm 2\pi i{\boldxi})^{{\bf m}}] (the transforms of [D^{{\bf m}}\delta]) or by phase factors [\exp(\pm 2 \pi i{\boldxi} \cdot {\bf a})] (the transforms of [\tau_{\bf a}\delta]).

Another consequence of the convolution theorem is the duality established by the Fourier transformation between sections and projections of a function and its transform. For instance, in [{\bb R}^{3}], the projection of [f(x, y, z)] on the x, y plane along the z axis may be written [(\delta_{x} \otimes \delta_{y} \otimes 1_{z}) * f\hbox{;}] its Fourier transform is then [(1_{\xi} \otimes 1_{\eta} \otimes \delta_{\zeta}) \times {\scr F}[\;f],] which is the section of [{\scr F}[\;f]] by the plane [\zeta = 0], orthogonal to the z axis used for projection. There are numerous applications of this property in crystallography (Section 1.3.4.2.1.8[link]) and in fibre diffraction (Section 1.3.4.5.1.3[link]).
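
The following Python/NumPy sketch checks a discrete two-dimensional analogue of this projection/section duality: summing the samples of f along y (projection) and transforming gives the η = 0 section of the two-dimensional transform. The grid sizes and the test function are arbitrary choices.

import numpy as np

# Projection along y versus section eta = 0 of the transform (discrete analogue);
# grid sizes and test function are arbitrary choices.
nx, ny = 32, 48
x = np.arange(nx)[:, None]
y = np.arange(ny)[None, :]
f = np.exp(-0.5 * ((x - 10.0) / 2.5) ** 2 - 0.5 * ((y - 30.0) / 4.0) ** 2)
projection = f.sum(axis=1)                              # project along y
F2 = np.fft.fft2(f)
assert np.allclose(np.fft.fft(projection), F2[:, 0])    # section eta = 0 of the 2D transform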

1.3.2.5.9. [L^{2}] aspects, Sobolev spaces

| top | pdf |

The special properties of [{\scr F}] in the space of square-integrable functions [L^{2}({\bb R}^{n})], such as Parseval's identity, can be accommodated within distribution theory: if [u \in L^{2}({\bb R}^{n})], then [T_{u}] is a tempered distribution in [{\scr S}\,'] (the map [u \;\longmapsto\; T_{u}] being continuous) and it can be shown that [S = {\scr F}[T_{u}]] is of the form [S_{v}], where [v = {\scr F}[u]] is the Fourier transform of u in [L^{2}({\bb R}^{n})]. By Plancherel's theorem, [\|u\|_{2} = \|v\|_{2}].

This embedding of [L^{2}] into [{\scr S}\,'] can be used to derive the convolution theorem for [L^{2}]. If u and v are in [L^{2}({\bb R}^{n})], then [u * v] can be shown to be a bounded continuous function; thus [u * v] is not in [L^{2}], but it is in [{\scr S}\,'], so that its Fourier transform is a distribution, and [{\scr F}[u * v] = {\scr F}[u] \times {\scr F}[v].]

Spaces of tempered distributions related to [L^{2}({\bb R}^{n})] can be defined as follows. For any real s, define the Sobolev space [H_{s}({\bb R}^{n})] to consist of all tempered distributions [S \in {\scr S}\,'({\bb R}^{n})] such that [(1 + |\boldxi|^{2})^{s/2} {\scr F}[S]_{\boldxi} \in L^{2}({\bb R}^{n}).]

These spaces play a fundamental role in the theory of partial differential equations, and in the mathematical theory of tomographic reconstruction – a subject not unrelated to the crystallographic phase problem (Natterer, 1986[link]).

1.3.2.6. Periodic distributions and Fourier series

| top | pdf |

1.3.2.6.1. Terminology

| top | pdf |

Let [{\bb Z}^{n}] be the subset of [{\bb R}^{n}] consisting of those points with (signed) integer coordinates; it is an n-dimensional lattice, i.e. a free Abelian group on n generators. A particularly simple set of n generators is given by the standard basis of [{\bb R}^{n}], and hence [{\bb Z}^{n}] will be called the standard lattice in [{\bb R}^{n}]. Any other `non-standard' n-dimensional lattice Λ in [{\bb R}^{n}] is the image of this standard lattice by a general linear transformation.

If we identify any two points in [{\bb R}^{n}] whose coordinates are congruent modulo [{\bb Z}^{n}], i.e. differ by a vector in [{\bb Z}^{n}], we obtain the standard n-torus [{\bb R}^{n}/{\bb Z}^{n}]. The latter may be viewed as [({\bb R}/{\bb Z})^{n}], i.e. as the Cartesian product of n circles. The same identification may be carried out modulo a non-standard lattice Λ, yielding a non-standard n-torus [{\bb R}^{n}/\Lambda]. The correspondence to crystallographic terminology is that `standard' coordinates over the standard 3-torus [{\bb R}^{3}/{\bb Z}^{3}] are called `fractional' coordinates over the unit cell; while Cartesian coordinates, e.g. in ångströms, constitute a set of non-standard coordinates.

Finally, we will denote by I the unit cube [[0, 1]^{n}] and by [C_{\varepsilon}] the subset [C_{\varepsilon} = \{{\bf x} \in {\bb R}^{n} \mid |x_{j}| \;\lt\; \varepsilon \hbox{ for all } j = 1, \ldots, n\}.]

1.3.2.6.2. [{\bb Z}^{n}]-periodic distributions in [{\bb R}^{n}]

| top | pdf |

A distribution [T \in {\scr D}\,' ({\bb R}^{n})] is called periodic with period lattice [{\bb Z}^{n}] (or [{\bb Z}^{n}]-periodic) if [\tau_{\bf m} T = T] for all [{\bf m} \in {\bb Z}^{n}] (in crystallography the period lattice is the direct lattice).

Given a distribution with compact support [T^{0} \in {\scr E}\,' ({\bb R}^{n})], then [T = {\textstyle\sum_{{\bf m} \in {\bb Z}^{n}}} \tau_{\bf m} T^{0}] is a [{\bb Z}^{n}]-periodic distribution. Note that we may write [T = r * T^{0}], where [r = {\textstyle\sum_{{\bf m} \in {\bb Z}^{n}}} \delta_{({\bf m})}] consists of Dirac δ's at all nodes of the period lattice [{\bb Z}^{n}].

Conversely, any [{\bb Z}^{n}]-periodic distribution T may be written as [r * T^{0}] for some [T^{0} \in {\scr E}\,']. To retrieve such a `motif' [T^{0}] from T, a function ψ will be constructed in such a way that [\psi \in {\scr D}] (hence has compact support) and [r * \psi = 1]; then [T^{0} = \psi T]. Indicator functions (Section 1.3.2.2[link]) such as [\chi_{1}] or [\chi_{C_{1/2}}] cannot be used directly, since they are discontinuous; but regularized versions of them may be constructed by convolution (see Section 1.3.2.3.9.7[link]) as [\psi_{0} = \chi_{C_{\varepsilon}} * \theta_{\eta}], with [\varepsilon] and η such that [\psi_{0} ({\bf x}) = 1] on [C_{1/2}] and [\psi_{0}({\bf x}) = 0] outside [C_{3/4}]. Then the function [\psi = {\psi_{0} \over {\textstyle\sum_{{\bf m} \in {\bb Z}^{n}}} \tau_{\bf m} \psi_{0}}] has the desired property. The sum in the denominator contains at most [2^{n}] non-zero terms at any given point x and acts as a smoothly varying `multiplicity correction'.
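
A one-dimensional Python/NumPy sketch of this construction follows. Only the smoothness, non-negativity and support properties of [\psi_{0}] matter for the normalization step, so an explicit bump is used here as an arbitrary stand-in for the regularized indicator of the text.

import numpy as np

# Construction of psi with r * psi = 1 in one dimension: psi0 is a smooth,
# non-negative bump, zero outside (-3/4, 3/4) and positive on [-1/2, 1/2]
# (an arbitrary stand-in for the regularized indicator); dividing it by its
# own periodization gives a psi whose integer translates sum to 1.
def psi0(t):
    return np.where(np.abs(t) < 0.75, np.cos(np.pi * t / 1.5) ** 2, 0.0)

x = np.linspace(0.0, 1.0, 400, endpoint=False)           # one period
shifts = (-1, 0, 1, 2)                                    # translates that can touch [0, 1)
denom = sum(psi0(x - m) for m in shifts)                  # periodization of psi0 (never zero)
# denom is 1-periodic, so psi0(x - m) / denom equals psi evaluated at x - m
psi_translates = [psi0(x - m) / denom for m in shifts]
assert np.allclose(sum(psi_translates), 1.0)              # r * psi = 1 on [0, 1)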

1.3.2.6.3. Identification with distributions over [{\bb R}^{n}/{\bb Z}^{n}]

| top | pdf |

Throughout this section, `periodic' will mean `[{\bb Z}^{n}]-periodic'.

Let [s \in {\bb R}], and let [s] denote the largest integer [\leq s]. For [x = (x_{1}, \ldots, x_{n}) \in {\bb R}^{n}], let [\tilde{{\bf x}}] be the unique vector [(\tilde{x}_{1}, \ldots, \tilde{x}_{n})] with [\tilde{x}_{j} = x_{j} - [x_{j}]]. If [{\bf x},{\bf y} \in {\bb R}^{n}], then [\tilde{{\bf x}} = \tilde{{\bf y}}] if and only if [{\bf x} - {\bf y} \in {\bb Z}^{n}]. The image of the map [{\bf x} \;\longmapsto\; \tilde{{\bf x}}] is thus [{\bb R}^{n}] modulo [{\bb Z}^{n}], or [{\bb R}^{n}/{\bb Z}^{n}].

If f is a periodic function over [{\bb R}^{n}], then [\tilde{{\bf x}} = \tilde{{\bf y}}] implies [f({\bf x}) = f({\bf y})]; we may thus define a function [\tilde{f}] over [{\bb R}^{n}/{\bb Z}^{n}] by putting [\tilde{f}(\tilde{{\bf x}}) = f({\bf x})] for any [{\bf x} \in {\bb R}^{n}] such that [{\bf x} - \tilde{{\bf x}} \in {\bb Z}^{n}]. Conversely, if [\tilde{f}] is a function over [{\bb R}^{n}/{\bb Z}^{n}], then we may define a function f over [{\bb R}^{n}] by putting [f({\bf x}) = \tilde{f}(\tilde{{\bf x}})], and f will be periodic. Periodic functions over [{\bb R}^{n}] may thus be identified with functions over [{\bb R}^{n}/{\bb Z}^{n}], and this identification preserves the notions of convergence, local summability and differentiability.

Given [\varphi^{0} \in {\scr D}({\bb R}^{n})], we may define [\varphi ({\bf x}) = {\textstyle\sum\limits_{{\bf m} \in {\bb Z}^{n}}} (\tau_{\bf m} \varphi^{0}) ({\bf x})] since the sum only contains finitely many non-zero terms; ϕ is periodic, and [\tilde{\varphi} \in {\scr D}({\bb R}^{n}/{\bb Z}^{n})]. Conversely, if [\tilde{\varphi} \in {\scr D}({\bb R}^{n}/{\bb Z}^{n})] we may define [\varphi \in {\scr E}({\bb R}^{n})] periodic by [\varphi ({\bf x}) = \tilde{\varphi} (\tilde{{\bf x}})], and [\varphi^{0} \in {\scr D}({\bb R}^{n})] by putting [\varphi^{0} = \psi \varphi] with ψ constructed as above.

By transposition, a distribution [\tilde{T} \in {\scr D}\,'({\bb R}^{n}/{\bb Z}^{n})] defines a unique periodic distribution [T \in {\scr D}\,'({\bb R}^{n})] by [\langle T, \varphi^{0} \rangle = \langle \tilde{T}, \tilde{\varphi} \rangle]; conversely, [T \in {\scr D}\,'({\bb R}^{n})] periodic defines uniquely [\tilde{T} \in {\scr D}\,'({\bb R}^{n}/{\bb Z}^{n})] by [\langle \tilde{T}, \tilde{\varphi}\rangle = \langle T, \varphi^{0}\rangle].

We may therefore identify [{\bb Z}^{n}]-periodic distributions over [{\bb R}^{n}] with distributions over [{\bb R}^{n}/{\bb Z}^{n}]. We will, however, use mostly the former presentation, as it is more closely related to the crystallographer's perception of periodicity (see Section 1.3.4.1[link]).

1.3.2.6.4. Fourier transforms of periodic distributions

| top | pdf |

The content of this section is perhaps the central result in the relation between Fourier theory and crystallography (Section 1.3.4.2.1.1[link]).

Let [T = r * T^{0}] with r defined as in Section 1.3.2.6.2[link]. Then [r \in {\scr S}\,'], [T^{0} \in {\scr E}\,'] hence [T^{0} \in {\scr O}'_{C}], so that [T \in {\scr S}\,']: [{\bb Z}^{n}]-periodic distributions are tempered, hence have a Fourier transform. The convolution theorem (Section 1.3.2.5.8[link]) is applicable, giving: [{\scr F}[T] = {\scr F}[r] \times {\scr F}[T^{0}]] and similarly for [\bar{\scr F}].

Since [{\scr F}[\delta_{({\bf m})}] (\xi) = \exp (-2 \pi i {\boldxi} \cdot {\bf m})], formally [{\scr F}[r]_{\boldxi} = {\textstyle\sum\limits_{{\bf m} \in {\bb Z}^{n}}} \exp (-2 \pi i \boldxi \cdot {\bf m}) = Q,] say.

It is readily shown that Q is tempered and periodic, so that [Q = {\textstyle\sum_{{\boldmu} \in {\bb Z}^{n}}} \tau_{{\boldmu}} (\psi Q)], while the periodicity of r implies that [[\exp (-2 \pi i \xi_{j}) - 1] \psi Q = 0, \quad j = 1, \ldots, n.] Since the first factors have single isolated zeros at [\xi_{j} = 0] in [C_{3/4}], [\psi Q = c\delta] (see Section 1.3.2.3.9.4[link]) and hence by periodicity [Q = cr]; convoluting with [\chi_{C_{1}}] shows that [c = 1]. Thus we have the fundamental result: [{\scr F}[r] = r] so that [{\scr F}[T] = r \times {\scr F}[T^{0}]\hbox{;}] i.e., according to Section 1.3.2.3.9.3[link], [{\scr F}[T]_{\boldxi} = {\textstyle\sum\limits_{{\boldmu} \in {\bb Z}^{n}}} {\scr F}[T^{0}] ({\boldmu}) \times \delta_{({\boldmu})}.]

The right-hand side is a weighted lattice distribution, whose nodes [{\boldmu} \in {\bb Z}^{n}] are weighted by the sample values [{\scr F}[T^{0}] ({\boldmu})] of the transform of the motif [T^{0}] at those nodes. Since [T^{0} \in {\scr E}\,'], the latter values may be written [{\scr F}[T^{0}]({\boldmu}) = \langle T_{\bf x}^{0}, \exp (-2 \pi i {\boldmu} \cdot {\bf x})\rangle.] By the structure theorem for distributions with compact support (Section 1.3.2.3.9.7[link]), [T^{0}] is a derivative of finite order of a continuous function; therefore, from Section 1.3.2.4.2.8[link] and Section 1.3.2.5.8[link], [{\scr F}[T^{0}]({\boldmu})] grows at most polynomially as [\|{\boldmu}\| \rightarrow \infty] (see also Section 1.3.2.6.10.3[link] about this property). Conversely, let [W = {\textstyle\sum_{{\boldmu} \in {\bb Z}^{n}}} w_{{\boldmu}} \delta_{({\boldmu})}] be a weighted lattice distribution such that the weights [w_{\boldmu}] grow at most polynomially as [\|{\boldmu}\| \rightarrow \infty]. Then W is a tempered distribution, whose Fourier cotransform [T_{\bf x} = {\textstyle\sum_{{\boldmu} \in {\bb Z}^{n}}} w_{\boldmu} \exp (+2 \pi i {\boldmu} \cdot {\bf x})] is periodic. If T is now written as [r * T^{0}] for some [T^{0} \in {\scr E}\,'], then by the reciprocity theorem [w_{\boldmu} = {\scr F}[T^{0}]({\boldmu}) = \langle T_{\bf x}^{0}, \exp (-2 \pi i {\boldmu} \cdot {\bf x})\rangle.] Although the choice of [T^{0}] is not unique, and need not yield back the same motif as may have been used to build T initially, different choices of [T^{0}] will lead to the same coefficients [w_{\boldmu}] because of the periodicity of [\exp (-2 \pi i {\boldmu} \cdot {\bf x})].

The Fourier transformation thus establishes a duality between periodic distributions and weighted lattice distributions. The pair of relations [\displaylines{\quad (\hbox{i})\hfill w_{\boldmu} = \langle T_{\bf x}^{0}, \exp (-2 \pi i {\boldmu} \cdot {\bf x})\rangle \quad\hfill\cr \quad(\hbox{ii})\hfill T_{\bf x} = {\textstyle\sum\limits_{{\boldmu} \in {\bb Z}^{n}}} w_{\boldmu} \exp (+2 \pi i {\boldmu} \cdot {\bf x}) \hfill}] are referred to as the Fourier analysis and the Fourier synthesis of T, respectively (there is a discrepancy between this terminology and the crystallographic one, see Section 1.3.4.2.1.1[link]). In other words, any periodic distribution [T \in {\scr S}\,'] may be represented by a Fourier series (ii), whose coefficients are calculated by (i). The convergence of (ii) towards T in [{\scr S}\,'] will be investigated later (Section 1.3.2.6.10[link]).
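
For a smooth [{\bb Z}]-periodic function in one dimension, the pair (i)/(ii) can be checked numerically. In the following Python/NumPy sketch the test function, the number of sample points and the truncation order of the series are arbitrary illustrative choices.

import numpy as np

# Fourier analysis (i) and synthesis (ii) for a smooth 1-periodic function.
M = 256
x = np.arange(M) / M
f = np.exp(np.cos(2 * np.pi * x))                     # arbitrary smooth 1-periodic test function
mu = np.arange(-20, 21)                               # truncation order of the series (arbitrary)
# (i) Fourier analysis: w_mu = integral over one period of f(x) exp(-2 pi i mu x) dx
w = np.array([np.mean(f * np.exp(-2j * np.pi * m * x)) for m in mu])
# (ii) Fourier synthesis: the (truncated) series reproduces f
f_back = sum(w[i] * np.exp(2j * np.pi * mu[i] * x) for i in range(mu.size))
assert np.allclose(f_back.real, f, atol=1e-6)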

1.3.2.6.5. The case of non-standard period lattices

| top | pdf |

Let Λ denote the non-standard lattice consisting of all vectors of the form [{\textstyle\sum_{j=1}^{n}} m_{j} {\bf a}_{j}], where the [m_{j}] are rational integers and [{\bf a}_{1}, \ldots, {\bf a}_{n}] are n linearly independent vectors in [{\bb R}^{n}]. Let R be the corresponding lattice distribution: [R = {\textstyle\sum_{{\bf x} \in \Lambda}} \delta_{({\bf x})}].

Let A be the non-singular [n \times n] matrix whose successive columns are the coordinates of vectors [{\bf a}_{1}, \ldots, {\bf a}_{n}] in the standard basis of [{\bb R}^{n}]; A will be called the period matrix of Λ, and the mapping [{\bf x} \;\longmapsto\; {\bf Ax}] will be denoted by A. According to Section 1.3.2.3.9.5[link] we have [\langle R, \varphi \rangle = {\textstyle\sum\limits_{{\bf m} \in {\bb Z}^{n}}} \varphi ({\bf Am}) = \langle r, (A^{-1})^{\#} \varphi \rangle = |\det {\bf A}|^{-1} \langle A^{\#} r, \varphi \rangle] for any [\varphi \in {\scr S}], and hence [R = |\det {\bf A}|^{-1} A^{\#} r]. By Fourier transformation, according to Section 1.3.2.5.5[link], [{\scr F}[R] = |\det {\bf A}|^{-1} {\scr F}[A^{\#} r] = [({\bf A}^{-1})^{T}]^{\#} {\scr F}[r] = [({\bf A}^{-1})^{T}]^{\#} r,] which we write: [{\scr F}[R] = |\det {\bf A}|^{-1} R^{*}] with [R^{*} = |\det {\bf A}| [({\bf A}^{-1})^{T}]^{\#} r.]

[R^{*}] is a lattice distribution: [R^{*} = {\textstyle\sum\limits_{{\boldmu} \in {\bb Z}^{n}}} \delta_{[({\bf A}^{-1})^{T} {\boldmu}]} = {\textstyle\sum\limits_{{\boldxi} \in \Lambda^{*}}} \delta_{({\boldxi})}] associated with the reciprocal lattice [\Lambda^{*}] whose basis vectors [{\bf a}_{1}^{*}, \ldots, {\bf a}_{n}^{*}] are the columns of [({\bf A}^{-1})^{T}]. Since the latter matrix is equal to the adjoint matrix (i.e. the matrix of co-factors) of A divided by det A, the components of the reciprocal basis vectors can be written down explicitly (see Section 1.3.4.2.1.1[link] for the crystallographic case [n = 3]).
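
In computational terms the reciprocal basis is simply given by the inverse transpose of the period matrix; the following Python/NumPy sketch checks the defining relation [{\bf a}_{i} \cdot {\bf a}_{j}^{*} = \delta_{ij}] for an arbitrary example of A.

import numpy as np

# Reciprocal basis as the columns of (A^{-1})^T; the period matrix A is an
# arbitrary example.
A = np.array([[3.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 4.0]])              # columns: direct basis a_1, a_2, a_3
A_star = np.linalg.inv(A).T                  # columns: reciprocal basis a*_1, a*_2, a*_3
assert np.allclose(A.T @ A_star, np.eye(3))  # a_i . a*_j = delta_ij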

A distribution T will be called Λ-periodic if [\tau_{\boldxi} T = T] for all [{\boldxi} \in \Lambda]; as previously, T may be written [R * T^{0}] for some motif distribution [T^{0}] with compact support. By Fourier transformation, [\eqalignno{{\scr F}[T] &= |\det {\bf A}|^{-1} R^{*} \cdot {\scr F}[T^{0}]\cr &= |\det {\bf A}|^{-1} {\textstyle\sum\limits_{{\boldxi} \in \Lambda^{*}}} {\scr F}[T^{0}] ({\boldxi}) \delta_{({\boldxi})}\cr &= |\det {\bf A}|^{-1} {\textstyle\sum\limits_{{\boldmu} \in {\bb Z}^{n}}} {\scr F}[T^{0}] [{({\bf A}^{-1})^{T}}{\boldmu}] \delta_{{[({\bf A}^{-1})^{T}} {\boldmu}]}}] so that [{\scr F}[T]] is a weighted reciprocal-lattice distribution, the weight attached to node [{\boldxi} \in \Lambda^{*}] being [|\det {\bf A}|^{-1}] times the value [{\scr F}[T^{0}](\boldxi)] of the Fourier transform of the motif [T^{0}].

This result may be further simplified if T and its motif [T^{0}] are referred to the standard period lattice [{\bb Z}^{n}] by defining t and [t^{0}] so that [T = A^{\#} t], [T^{0} = A^{\#} t^{0}], [t = r * t^{0}]. Then [{\scr F}[T^{0}] ({\boldxi}) = |\det {\bf A}| {\scr F}[t^{0}] ({\bf A}^{T} {\boldxi}),] hence [{\scr F}[T^{0}] [{({\bf A}^{-1})^{T}}{\boldmu}] = |\det {\bf A}| {\scr F}[t^{0}] ({\boldmu}),] so that [{\scr F}[T] = {\textstyle\sum\limits_{{\boldmu} \in {\bb Z}^{n}}} {\scr F}[t^{0}] ({\boldmu}) \delta_{[({\bf A}^{-1})^{T} {\boldmu}]}] in non-standard coordinates, while [{\scr F}[t] = {\textstyle\sum\limits_{{\boldmu} \in {\bb Z}^{n}}} {\scr F}[t^{0}] ({\boldmu}) \delta_{({\boldmu})}] in standard coordinates.

The reciprocity theorem may then be written: [\displaylines{\quad (\hbox{iii}) \hfill W_{\boldxi} = |\det {\bf A}|^{-1} \langle T_{\bf x}^{0}, \exp (-2 \pi i {\boldxi} \cdot {\bf x})\rangle, \quad {\boldxi} \in \boldLambda^{*} \hfill\cr \quad (\hbox{iv}) \hfill T_{\bf x} = {\textstyle\sum\limits_{{\boldxi} \in \Lambda^{*}}} W_{\boldxi} \exp (+2 \pi i {\boldxi} \cdot {\bf x})\qquad\qquad\qquad\quad\hfill}] in non-standard coordinates, or equivalently: [\displaylines{\quad (\hbox{v}) \hfill w_{\boldmu} = \langle t_{\bf x}^{0}, \exp (-2 \pi i {\boldmu} \cdot {\bf x})\rangle, \quad {\boldmu} \in {\bb Z}^{n} \hfill\cr \quad (\hbox{vi}) \hfill t_{\bf x} = {\textstyle\sum\limits_{{\boldmu} \in {\bb Z}^{n}}} w_{\boldmu} \exp (+2 \pi i {\boldmu} \cdot {\bf x}) \quad\qquad\hfill}] in standard coordinates. It gives an n-dimensional Fourier series representation for any periodic distribution over [{\bb R}^{n}]. The convergence of such series in [{\scr S}\,' ({\bb R}^{n})] will be examined in Section 1.3.2.6.10[link].

1.3.2.6.6. Duality between periodization and sampling

| top | pdf |

Let [T^{0}] be a distribution with compact support (the `motif'). Its Fourier transform [\bar{\scr F}[T^{0}]] is analytic (Section 1.3.2.5.4[link]) and may thus be used as a multiplier.

We may rephrase the preceding results as follows:

  • (i) if [T^{0}] is `periodized by R' to give [R * T^{0}], then [\bar{\scr F}[T^{0}]] is `sampled by [R^{*}]' to give [|\det {\bf A}|^{-1} R^{*} \cdot \bar{\scr F}[T^{0}]];

  • (ii) if [\bar{\scr F}[T^{0}]] is `sampled by [R^{*}]' to give [R^{*} \cdot \bar{\scr F}[T^{0}]], then [T^{0}] is `periodized by R' to give [|\det {\bf A}| R * T^{0}].

Thus the Fourier transformation establishes a duality between the periodization of a distribution by a period lattice Λ and the sampling of its transform at the nodes of lattice [\Lambda^{*}] reciprocal to Λ. This is a particular instance of the convolution theorem of Section 1.3.2.5.8[link].

At this point it is traditional to break the symmetry between [{\scr F}] and [\bar{\scr F}] which distribution theory has enabled us to preserve even in the presence of periodicity, and to perform two distinct identifications:

  • (i) a Λ-periodic distribution T will be handled as a distribution [\tilde{T}] on [{\bb R}^{n} / \Lambda], as was done in Section 1.3.2.6.3[link];

  • (ii) a weighted lattice distribution [W = {\textstyle\sum_{{\boldmu} \in {\bb Z}^{n}}} W_{\boldmu} \delta_{[({\bf A}^{-1})^{T} {\boldmu}]}] will be identified with the collection [\{W_{\boldmu}|{\boldmu} \in {\bb Z}^{n}\}] of its n-tuply indexed coefficients.

1.3.2.6.7. The Poisson summation formula

| top | pdf |

Let [\varphi \in {\scr S}], so that [{\scr F}[\varphi] \in {\scr S}]. Let R be the lattice distribution associated to lattice Λ, with period matrix A, and let [R^{*}] be associated to the reciprocal lattice [\Lambda^{*}]. Then we may write: [\eqalignno{\langle R, \varphi \rangle &= \langle R, \bar{\scr F}[{\scr F}[\varphi]]\rangle\cr &= \langle \bar{\scr F}[R], {\scr F}[\varphi]\rangle\cr &= |\det {\bf A}|^{-1} \langle R^{*}, {\scr F}[\varphi]\rangle}] i.e. [{\textstyle\sum\limits_{{\bf x} \in \Lambda}} \varphi ({\bf x}) = |\det {\bf A}|^{-1} {\textstyle\sum\limits_{{\boldxi} \in \Lambda^{*}}} {\scr F}[\varphi] ({\boldxi}).]

This identity, which also holds for [\bar{\scr F}], is called the Poisson summation formula. Its usefulness follows from the fact that the speed of decrease at infinity of ϕ and [{\scr F}[\varphi]] are inversely related (Section 1.3.2.4.4.3[link]), so that if one of the series (say, the left-hand side) is slowly convergent, the other (say, the right-hand side) will be rapidly convergent. This procedure has been used by Ewald (1921)[link] [see also Bertaut (1952)[link], Born & Huang (1954)[link]] to evaluate lattice sums (Madelung constants) involved in the calculation of the internal electrostatic energy of crystals (see Chapter 3.4[link] in this volume on convergence acceleration techniques for crystallographic lattice sums).

When ϕ is a multivariate Gaussian [\varphi ({\bf x}) = G_{\bf B} ({\bf x}) = \exp (-\textstyle{{1 \over 2}} {\bf x}^{T} {\bf Bx}),] then [{\scr F}[\varphi] (\boldxi) = |\det (2 \pi {\bf B}^{-1})|^{1/2} G_{4 \pi^{2} {\bf B}^{-1}} (\boldxi),] and Poisson's summation formula for a lattice with period matrix A reads: [\eqalignno{{\textstyle\sum\limits_{{\bf m} \in {\bb Z}^{n}}} G_{\bf B} ({\bf Am}) &= |\det {\bf A}|^{-1}| \det (2 \pi {\bf B}^{-1})|^{1/2}\cr &\quad \times \textstyle\sum\limits_{{\boldmu} \in {\bb Z}^{n}} G_{4 \pi^{2}{\bf B}^{-1}} [({\bf A}^{-1})^{T} {\boldmu}]}] or equivalently [{\textstyle\sum\limits_{{\bf m} \in {\bb Z}^{n}}} G_{C} ({\bf m}) = |\det (2 \pi {\bf C}^{-1})|^{1/2} {\textstyle\sum\limits_{{\boldmu} \in {\bb Z}^{n}}} G_{4 \pi^{2}{\bf C}^{-1}} ({\boldmu})] with [{\bf C} = {\bf A}^{T} {\bf BA}.]
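
A one-dimensional Python/NumPy sketch of this Gaussian form of the Poisson summation formula follows; the Gaussian parameter B, the period a and the truncation of the two (rapidly convergent) sums are arbitrary choices.

import numpy as np

# Poisson summation for the one-dimensional Gaussian G_B(x) = exp(-B x^2 / 2)
# over a lattice of period a; parameters and truncations are arbitrary choices.
B, a = 3.0, 0.7
m = np.arange(-50, 51)
lhs = np.sum(np.exp(-0.5 * B * (a * m) ** 2))
mu = np.arange(-50, 51)
rhs = (1.0 / a) * np.sqrt(2.0 * np.pi / B) * np.sum(
    np.exp(-0.5 * (4.0 * np.pi ** 2 / B) * (mu / a) ** 2))
assert np.isclose(lhs, rhs)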

1.3.2.6.8. Convolution of Fourier series

| top | pdf |

Let [S = R * S^{0}] and [T = R * T^{0}] be two Λ-periodic distributions, the motifs [S^{0}] and [T^{0}] having compact support. The convolution [S * T] does not exist, because S and T do not satisfy the support condition (Section 1.3.2.3.9.7[link]). However, the three distributions R, [S^{0}] and [T^{0}] do satisfy the generalized support condition, so that their convolution is defined; then, by associativity and commutativity: [R * S^{0} * T^{0} = S * T^{0} = S^{0} * T.]

By Fourier transformation and by the convolution theorem: [\eqalignno{R^{*} \times {\scr F}[S^{0} * T^{0}] &= (R^{*} \times {\scr F}[S^{0}]) \times {\scr F}[T^{0}]\cr &= {\scr F}[T^{0}] \times (R^{*} \times {\scr F}[S^{0}]).}] Let [\{U_{\boldxi}\}_{{\boldxi} \in \Lambda^{*}}], [\{V_{\boldxi}\}_{{\boldxi} \in \Lambda^{*}}] and [\{W_{\boldxi}\}_{{\boldxi} \in \Lambda^{*}}] be the sets of Fourier coefficients associated to S, T and [S * T^{0} (= S^{0} * T)], respectively. Identifying the coefficients of [\delta_{\boldxi}] for [{\boldxi} \in \Lambda^{*}] yields the forward version of the convolution theorem for Fourier series: [W_{\boldxi} = |\det {\bf A}| U_{\boldxi} V_{\boldxi}.]

The backward version of the theorem requires that T be infinitely differentiable. The distribution [S \times T] is then well defined and its Fourier coefficients [\{Q_{\boldxi}\}_{\boldxi \in \Lambda^{*}}] are given by [Q_{\boldxi} = {\textstyle\sum\limits_{{\boldeta} \in \Lambda^{*}}} U_{\boldeta} V_{{\boldxi} - {\boldeta}}.]
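
Both versions have exact discrete analogues on the standard 1-torus (so that [|\det {\bf A}| = 1]), with Fourier coefficients replaced by DFT coefficients over M samples; the following Python/NumPy sketch checks them on arbitrary test functions.

import numpy as np

# Discrete analogue of the convolution theorem for Fourier series on R/Z
# (|det A| = 1): coefficients of the periodic convolution are products of
# coefficients, and coefficients of the pointwise product are cyclic
# convolutions of coefficients. M and the test functions are arbitrary choices.
M = 128
x = np.arange(M) / M
s = np.cos(2 * np.pi * x) + 0.3 * np.sin(6 * np.pi * x)
t = np.exp(np.sin(2 * np.pi * x))
U = np.fft.fft(s) / M                                   # coefficients of s
V = np.fft.fft(t) / M                                   # coefficients of t
conv = np.fft.ifft(np.fft.fft(s) * np.fft.fft(t)) / M   # periodic convolution (integral over one period)
W = np.fft.fft(conv) / M                                # coefficients of the convolution
assert np.allclose(W, U * V)                            # forward version
Q = np.fft.fft(s * t) / M                               # coefficients of the pointwise product
assert np.allclose(Q, np.fft.ifft(np.fft.fft(U) * np.fft.fft(V)))   # backward version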

1.3.2.6.9. Toeplitz forms, Szegö's theorem

| top | pdf |

Toeplitz forms were first investigated by Toeplitz (1907[link], 1910[link], 1911a[link]). They occur in connection with the `trigonometric moment problem' (Shohat & Tamarkin, 1943[link]; Akhiezer, 1965[link]) and probability theory (Grenander, 1952[link]) and play an important role in several direct approaches to the crystallographic phase problem [see Sections 1.3.4.2.1.10[link], 1.3.4.5.2.2[link](e)][link]. Many aspects of their theory and applications are presented in the book by Grenander & Szegö (1958)[link].

1.3.2.6.9.1. Toeplitz forms

| top | pdf |

Let [f \in L^{1} ({\bb R} / {\bb Z})] be real-valued, so that its Fourier coefficients satisfy the relations [c_{-m} (\;f) = \overline{c_{m} (\;f)}]. The Hermitian form in [n + 1] complex variables [T_{n} [\;f] ({\bf u}) = {\textstyle\sum\limits_{\mu = 0}^{n}}\; {\textstyle\sum\limits_{\nu = 0}^{n}} \;\overline{u_{\mu}} c_{\mu - \nu}u_{\nu}] is called the nth Toeplitz form associated to f. It is a straightforward consequence of the convolution theorem and of Parseval's identity that [T_{n} [\;f]] may be written: [T_{n} [\;f] ({\bf u}) = {\textstyle\int\limits_{0}^{1}} \left|{\textstyle\sum\limits_{\nu = 0}^{n}} {u}_{\nu} \exp (2 \pi i\nu x)\right|^{2} f (x) \;\hbox{d}x.]

1.3.2.6.9.2. The Toeplitz–Carathéodory–Herglotz theorem

| top | pdf |

It was shown independently by Toeplitz (1911b)[link], Carathéodory (1911)[link] and Herglotz (1911)[link] that a function [f \in L^{1}] is almost everywhere non-negative if and only if the Toeplitz forms [T_{n} [\;f]] associated to f are positive semidefinite for all values of n.

This is equivalent to the infinite system of determinantal inequalities [D_{n} = \det \pmatrix{c_{0} &c_{-1} &\cdot &\cdot &c_{-n}\cr c_{1} &c_{0} &c_{-1} &\cdot &\cdot\cr \cdot &c_{1} &\cdot &\cdot &\cdot\cr \cdot &\cdot &\cdot &\cdot &c_{-1}\cr c_{n} &\cdot &\cdot &c_{1} &c_{0}\cr} \geq 0 \quad \hbox{for all } n.] The [D_{n}] are called Toeplitz determinants. Their application to the crystallographic phase problem is described in Section 1.3.4.2.1.10[link].
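
The following Python/NumPy sketch builds [T_{n}[\;f]] from the Fourier coefficients of an arbitrary non-negative trigonometric polynomial, checks the integral representation of the form given in Section 1.3.2.6.9.1, and verifies positive semidefiniteness together with [D_{n} \geq 0]; the test function, the order n and the random vector u are arbitrary choices.

import numpy as np

# Toeplitz form T_n[f] for a non-negative function f on the circle.
M = 512
x = np.arange(M) / M
f = (1.0 + np.cos(2 * np.pi * x)) ** 2                  # f >= 0, a trigonometric polynomial
c = np.fft.fft(f) / M                                   # c[m % M] = c_m(f) (exact here, no aliasing)
n = 6
T = np.array([[c[(mu - nu) % M] for nu in range(n + 1)] for mu in range(n + 1)])
# integral representation: u^H T u = int_0^1 |sum_nu u_nu exp(2 pi i nu x)|^2 f(x) dx
rng = np.random.default_rng(0)
u = rng.normal(size=n + 1) + 1j * rng.normal(size=n + 1)
poly = sum(u[nu] * np.exp(2j * np.pi * nu * x) for nu in range(n + 1))
assert np.isclose(np.conj(u) @ T @ u, np.mean(np.abs(poly) ** 2 * f))
# Toeplitz-Caratheodory-Herglotz: f >= 0 implies T_n[f] positive semidefinite and D_n >= 0
assert np.all(np.linalg.eigvalsh(T) >= -1e-10)
assert np.linalg.det(T).real >= -1e-10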

1.3.2.6.9.3. Asymptotic distribution of eigenvalues of Toeplitz forms

| top | pdf |

The eigenvalues of the Hermitian form [T_{n} [\;f]] are defined as the [n + 1] real roots of the characteristic equation [\det \{T_{n} [\;f - \lambda]\} = 0]. They will be denoted by [\lambda_{1}^{(n)}, \lambda_{2}^{(n)}, \ldots, \lambda_{n + 1}^{(n)}.]

It is easily shown that if [m \leq f(x) \leq M] for all x, then [m \leq \lambda_{\nu}^{(n)} \leq M] for all n and all [\nu = 1, \ldots, n + 1]. As [n \rightarrow \infty] these bounds, and the distribution of the [\lambda^{(n)}] within these bounds, can be made more precise by introducing two new notions.

  • (i) Essential bounds: define ess inf f as the largest m such that [f(x) \geq m] except for values of x forming a set of measure 0; and define ess sup f similarly.

  • (ii) Equal distribution. For each n, consider two sets of [n + 1] real numbers: [a_{1}^{(n)}, a_{2}^{(n)}, \ldots, a_{n + 1}^{(n)}, \quad\hbox{and}\quad b_{1}^{(n)}, b_{2}^{(n)}, \ldots, b_{n + 1}^{(n)}.] Assume that for each [\nu] and each n, [|a_{\nu}^{(n)}| \;\lt\; K] and [|b_{\nu}^{(n)}| \;\lt\; K] with K independent of [\nu] and n. The sets [\{a_{\nu}^{(n)}\}] and [\{b_{\nu}^{(n)}\}] are said to be equally distributed in [[-K, +K]] if, for any function F over [[-K, +K]], [\lim\limits_{n \rightarrow \infty} {1 \over n + 1} \sum\limits_{\nu = 1}^{n + 1} [F (a_{\nu}^{(n)}) - F (b_{\nu}^{(n)})] = 0.]

We may now state an important theorem of Szegö (1915[link], 1920[link]). Let [f \in L^{1}], and put [m = \hbox{ess inf}\; f], [M = \hbox{ess sup}\;f]. If m and M are finite, then for any continuous function [F(\lambda)] defined in the interval [m, M] we have [\lim\limits_{n \rightarrow \infty} {1 \over n + 1} \sum\limits_{\nu = 1}^{n + 1} F (\lambda_{\nu}^{(n)}) = \int\limits_{0}^{1} F[\;f(x)] \;\hbox{d}x.] In other words, the eigenvalues [\lambda_{\nu}^{(n)}] of the [T_{n}] and the values [f[\nu/(n + 2)]] of f on a regular subdivision of ]0, 1[ are equally distributed.
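
A Python/NumPy sketch of this equal-distribution statement with [F(\lambda) = \lambda^{2}] follows; the test function and the orders n shown are arbitrary choices, and the two printed averages approach each other as n grows.

import numpy as np

# Szego's theorem with F(lambda) = lambda^2: the eigenvalue average of T_n[f]
# tends to the average of f(x)^2 over one period. All choices are illustrative.
M = 2048
x = np.arange(M) / M
f = 2.0 + np.cos(2 * np.pi * x) + 0.5 * np.cos(4 * np.pi * x)
c = np.fft.fft(f) / M
target = np.mean(f ** 2)                       # integral of f(x)^2 over one period
for n in (8, 32, 128):
    T = np.array([[c[(mu - nu) % M] for nu in range(n + 1)] for mu in range(n + 1)])
    lam = np.linalg.eigvalsh(T)
    print(n, np.mean(lam ** 2), target)        # the first average approaches the second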

Further investigations into the spectra of Toeplitz matrices may be found in papers by Hartman & Wintner (1950[link], 1954[link]), Kac et al. (1953)[link], Widom (1965)[link], and in the notes by Hirschman & Hughes (1977)[link].

1.3.2.6.9.4. Consequences of Szegö's theorem

| top | pdf |

  • (i) If the λ's are ordered in ascending order, then [\lim\limits_{n \rightarrow \infty} \lambda_{1}^{(n)} = m = \hbox{ess inf}\; f, \quad \lim\limits_{n \rightarrow \infty} \lambda_{n + 1}^{(n)} = M = \hbox{ess sup}\; f.] Thus, when [f \geq 0], the condition number [\lambda_{n + 1}^{(n)} / \lambda_{1}^{(n)}] of [T_{n}[\;f]] tends towards the `essential dynamic range' [M/m] of f.

  • (ii) Let [F(\lambda) = \lambda^{s}] where s is a positive integer. Then [\lim\limits_{n \rightarrow \infty} {1 \over n + 1} \sum\limits_{\nu = 1}^{n + 1}\; [\lambda_{\nu}^{(n)}]^{s} = \int\limits_{0}^{1} [\;f(x)]^{s} \;\hbox{d}x.]

  • (iii) Let [m \gt 0], so that [\lambda_{\nu}^{(n)} \gt 0], and let [D_{n}(\;f) = \det T_{n}(\;f)]. Then [D_{n}(\;f) = \textstyle\prod\limits_{\nu = 1}^{n + 1} \lambda_{\nu}^{(n)},] hence [\log D_{n}(\;f) = {\textstyle\sum\limits_{\nu = 1}^{n + 1}} \log \lambda_{\nu}^{(n)}.]

    Putting [F(\lambda) = \log \lambda], it follows that [\lim\limits_{n \rightarrow \infty} [D_{n} (\;f)]^{1/(n + 1)} = \exp \left\{{\textstyle\int\limits_{0}^{1}} \log f(x) \;\hbox{d}x\right\}.]

Further terms in this limit were obtained by Szegö (1952)[link] and interpreted in probabilistic terms by Kac (1954)[link].

1.3.2.6.10. Convergence of Fourier series

| top | pdf |

The investigation of the convergence of Fourier series and of more general trigonometric series has been the subject of intense study for over 150 years [see e.g. Zygmund (1976)[link]]. It has been a constant source of new mathematical ideas and theories, being directly responsible for the birth of such fields as set theory, topology and functional analysis.

This section will briefly survey those aspects of the classical results in dimension 1 which are relevant to the practical use of Fourier series in crystallography. The books by Zygmund (1959)[link], Tolstov (1962)[link] and Katznelson (1968)[link] are standard references in the field, and Dym & McKean (1972)[link] is recommended as a stimulant.

1.3.2.6.10.1. Classical [L^{1}] theory

| top | pdf |

The space [L^{1} ({\bb R} / {\bb Z})] consists of (equivalence classes of) complex-valued functions f on the circle which are summable, i.e. for which [\|\;f \|_{1} \equiv {\textstyle\int\limits_{0}^{1}}\; | \;f(x) | \;\hbox{d}x \;\lt\; + \infty.] It is a convolution algebra: If f and g are in [L^{1}], then [f * g] is in [L^{1}].

The mth Fourier coefficient [c_{m} (\;f)] of f, [c_{m} (\;f) = {\textstyle\int\limits_{0}^{1}}\; f(x) \exp (-2 \pi imx) \;\hbox{d}x] is bounded: [|c_{m} (\;f)| \leq \|\;f \|_{1}], and by the Riemann–Lebesgue lemma [c_{m} (\;f) \rightarrow 0] as [m \rightarrow \infty]. By the convolution theorem, [c_{m} (\;f * g) = c_{m} (\;f) c_{m} (g)].

The pth partial sum [S_{p}(\;f)] of the Fourier series of f, [S_{p}(\;f) (x) = {\textstyle\sum\limits_{|m|\leq p}} c_{m} (\;f) \exp (2 \pi imx),] may be written, by virtue of the convolution theorem, as [S_{p}(\;f) = D_{p} * f], where [D_{p} (x) = {\sum\limits_{|m|\leq p}} \exp (2 \pi imx) = {\sin [(2p + 1) \pi x] \over \sin \pi x}] is the Dirichlet kernel. Because [D_{p}] comprises numerous slowly decaying oscillations, both positive and negative, [S_{p}(\;f)] may not converge towards f in a strong sense as [p \rightarrow \infty]. Indeed, spectacular pathologies are known to exist where the partial sums, examined pointwise, diverge everywhere (Zygmund, 1959[link], Chapter VIII). When f is piecewise continuous, but presents isolated jumps, convergence near these jumps is marred by the Gibbs phenomenon: [S_{p}(\;f)] always `overshoots the mark' by about 9%, the area under the spurious peak tending to 0 as [p \rightarrow \infty] but not its height [see Larmor (1934)[link] for the history of this phenomenon].

By contrast, the arithmetic mean of the partial sums, also called the pth Cesàro sum, [C_{p}(\;f) = {1 \over p + 1} [S_{0}(\;f) + \ldots + S_{p}(\;f)],] converges to f in the sense of the [L^{1}] norm: [\|C_{p}(\;f) - f\|_{1} \rightarrow 0] as [p \rightarrow \infty]. If furthermore f is continuous, then the convergence is uniform, i.e. the error is bounded everywhere by a quantity which goes to 0 as [p \rightarrow \infty]. It may be shown that [C_{p} (\;f) = F_{p} * f,] where [\eqalign{F_{p} (x) &= {\sum\limits_{|m| \leq p}} \left(1 - {|m| \over p + 1}\right) \exp (2 \pi imx) \cr &= {1 \over p + 1} \left[{\sin (p + 1) \pi x \over \sin \pi x}\right]^{2}}] is the Fejér kernel. [F_{p}] has over [D_{p}] the advantage of being everywhere positive, so that the Cesàro sums [C_{p} (\;f)] of a positive function f are always positive.
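
The contrast between the two kernels is easily demonstrated numerically. In the Python/NumPy sketch below, a step function on the circle is summed with the Dirichlet weights (partial sum, showing the roughly 9% Gibbs overshoot) and with the Fejér weights (Cesàro sum, which remains within the range of f); the order p and the sampling grid are arbitrary choices.

import numpy as np

# Gibbs phenomenon (Dirichlet kernel) versus Cesaro summation (Fejer kernel)
# for a step function on the circle; p and the grid are arbitrary choices.
M = 4096
x = np.arange(M) / M
f = np.where(x < 0.5, 1.0, 0.0)                                  # step function
p = 64
m = np.arange(-p, p + 1)
c = np.array([np.mean(f * np.exp(-2j * np.pi * k * x)) for k in m])
S_p = sum(c[i] * np.exp(2j * np.pi * m[i] * x) for i in range(m.size)).real
fejer = 1.0 - np.abs(m) / (p + 1)                                # triangular (Fejer) weights
C_p = sum(fejer[i] * c[i] * np.exp(2j * np.pi * m[i] * x) for i in range(m.size)).real
print(S_p.max())   # about 1.09: the partial sum overshoots the jump by ~9%
print(C_p.max())   # at most 1.00: the Cesaro sum of a function with 0 <= f <= 1 stays in [0, 1]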

The de la Vallée Poussin kernel [V_{p} (x) = 2 F_{2p + 1} (x) - F_{p} (x)] has a trapezoidal distribution of coefficients and is such that [c_{m} (V_{p}) = 1] if [|m| \leq p + 1]; therefore [V_{p} * f] is a trigonometric polynomial with the same Fourier coefficients as f over that range of values of m.

The Poisson kernel [\eqalign{P_{r} (x) &= 1 + 2 {\sum\limits_{m = 1}^{\infty}} r^{m} \cos 2 \pi mx \cr &= {1 - r^{2} \over 1 - 2r \cos 2 \pi x + r^{2}}}] with [0 \leq r \;\lt\; 1] gives rise to an Abel summation procedure [Tolstov (1962[link], p. 162); Whittaker & Watson (1927[link], p. 57)] since [(P_{r} * f) (x) = {\textstyle\sum\limits_{m \in {\bb Z}}} c_{m} (\;f) r^{|m|} \exp (2 \pi imx).] Compared with the other kernels, [P_{r}] has the disadvantage of not being a trigonometric polynomial; however, [P_{r}] is the real part of the Cauchy kernel (Cartan, 1961[link]; Ahlfors, 1966[link]): [P_{r} (x) = {\scr Re}\left[{1 + r \exp (2 \pi ix) \over 1 - r \exp (2 \pi ix)}\right]] and hence provides a link between trigonometric series and analytic functions of a complex variable.

Other methods of summation involve forming a moving average of f by convolution with other sequences of functions [\alpha_{p} ({\bf x})] besides [D_{p}] or [F_{p}] which `tend towards δ' as [p \rightarrow \infty]. The convolution is performed by multiplying the Fourier coefficients of f by those of [\alpha_{p}], so that one forms the quantities [S'_{p} (\;f) (x) = {\textstyle\sum\limits_{|m| \leq p}} c_{m} (\alpha_{p}) c_{m} (\;f) \exp (2 \pi imx).] For instance the `sigma factors' of Lanczos (Lanczos, 1966[link], p. 65), defined by [\sigma_{m} = {\sin [m \pi / p] \over m \pi /p},] lead to a summation procedure whose behaviour is intermediate between those using the Dirichlet and the Fejér kernels; it corresponds to forming a moving average of f by convolution with [\alpha_{p} = p\chi_{[-1/(2p), \, 1/(2p)]} * D_{p},] which is itself the convolution of a `rectangular pulse' of width [1/p] and of the Dirichlet kernel of order p.

A review of the summation problem in crystallography is given in Section 1.3.4.2.1.3[link].

1.3.2.6.10.2. Classical [L^{2}] theory

| top | pdf |

The space [L^{2}({\bb R}/{\bb Z})] of (equivalence classes of) square-integrable complex-valued functions f on the circle is contained in [L^{1}({\bb R}/{\bb Z})], since by the Cauchy–Schwarz inequality [\eqalign{\|\;f \|_{1}^{2} &= \left({\textstyle\int\limits_{0}^{1}} |\;f (x)| \times 1 \;\hbox{d}x\right)^{2} \cr &\leq \left({\textstyle\int\limits_{0}^{1}} |\;f (x)|^{2} \;\hbox{d}x\right) \left({\textstyle\int\limits_{0}^{1}} {1}^{2} \;\hbox{d}x\right) = \|\;f \|_{2}^{2} \leq \infty.}] Thus all the results derived for [L^{1}] hold for [L^{2}], a great simplification over the situation in [{\bb R}] or [{\bb R}^{n}] where neither [L^{1}] nor [L^{2}] was contained in the other.

However, more can be proved in [L^{2}], because [L^{2}] is a Hilbert space (Section 1.3.2.2.4[link]) for the inner product [(\;f, g) = {\textstyle\int\limits_{0}^{1}}\; \overline{f (x)} g (x) \;\hbox{d}x,] and because the family of functions [\{\exp (2 \pi imx)\}_{m \in {\bb Z}}] constitutes an orthonormal Hilbert basis for [L^{2}].

The sequence of Fourier coefficients [c_{m} (\;f)] of [f \in L^{2}] belongs to the space [\ell^{2}({\bb Z})] of square-summable sequences: [{\textstyle\sum\limits_{m \in {\bb Z}}} |c_{m} (\;f)|^{2} \;\lt\; \infty.] Conversely, every element [c = (c_{m})] of [\ell^{2}] is the sequence of Fourier coefficients of a unique function in [L^{2}]. The inner product [(c, d) = {\textstyle\sum\limits_{m \in {\bb Z}}} \overline{c_{m}} d_{m}] makes [\ell^{2}] into a Hilbert space, and the map from [L^{2}] to [\ell^{2}] established by the Fourier transformation is an isometry (Parseval/Plancherel): [\|\;f \|_{L^{2}} = \| c (\;f) \|_{{\ell}^{2}}] or equivalently: [(\;f, g) = (c (\;f), c (g)).] This is a useful property in applications, since (f, g) may be calculated either from f and g themselves, or from their Fourier coefficients [c(\;f)] and [c(g)] (see Section 1.3.4.4.6[link] for crystallographic applications).

By virtue of the orthogonality of the basis [\{\exp (2 \pi imx)\}_{m \in {\bb Z}}], the partial sum [S_{p} (\;f)] is the best mean-square fit to f in the linear subspace of [L^{2}] spanned by [\{\exp (2 \pi imx)\}_{|m| \leq p}], and hence (Bessel's inequality) [{\textstyle\sum\limits_{|m| \leq p}} |c_{m} (\;f)|^{2} = \|\;f \|_{2}^{2} - {\textstyle\sum\limits_{|m| \gt p}} |c_{m} (\;f)|^{2} \leq \|\;f \|_{2}^{2}.]

1.3.2.6.10.3. The viewpoint of distribution theory

| top | pdf |

The use of distributions enlarges considerably the range of behaviour which can be accommodated in a Fourier series, even in the case of general dimension n where classical theories meet with even more difficulties than in dimension 1.

Let [\{w_{m}\}_{m \in {\bb Z}}] be a sequence of complex numbers with [|w_{m}|] growing at most polynomially as [|m| \rightarrow \infty], say [|w_{m}| \leq C |m|^{K}]. Then the sequence [\{w_{m} / (2 \pi im)^{K + 2}\}_{m \in {\bb Z}}] is in [\ell^{2}] and even defines a continuous function [f \in L^{2}({\bb R}/{\bb Z})] and an associated tempered distribution [T_{f} \in {\scr D}\,'({\bb R}/{\bb Z})]. Differentiation of [T_{f}] [(K + 2)] times then yields a tempered distribution whose Fourier transform leads to the original sequence of coefficients. Conversely, by the structure theorem for distributions with compact support (Section 1.3.2.3.9.7[link]), the motif [T^{0}] of a [{\bb Z}]-periodic distribution is a derivative of finite order of a continuous function; hence its Fourier coefficients will grow at most polynomially with [|m|] as [|m| \rightarrow \infty].

Thus distribution theory allows the manipulation of Fourier series whose coefficients exhibit polynomial growth as their order goes to infinity, while those derived from functions had to tend to 0 by virtue of the Riemann–Lebesgue lemma. The distribution-theoretic approach to Fourier series holds even in the case of general dimension n, where classical theories meet with even more difficulties (see Ash, 1976[link]) than in dimension 1.

1.3.2.7. The discrete Fourier transformation

| top | pdf |

1.3.2.7.1. Shannon's sampling theorem and interpolation formula

| top | pdf |

Let [\varphi \in {\scr E} ({\bb R}^{n})] be such that [\Phi = {\scr F}[\varphi]] has compact support K. Let ϕ be sampled at the nodes of a lattice [\Lambda^{*}], yielding the lattice distribution [R^{*} \times \varphi]. The Fourier transform of this sampled version of ϕ is [{\scr F}[R^{*} \times \varphi] = | \det {\bf A}| (R * \Phi),] which is essentially Φ periodized by period lattice [\Lambda = (\Lambda^{*})^{*}], with period matrix A.

Let us assume that Λ is such that the translates of K by different period vectors of Λ are disjoint. Then we may recover Φ from [R * \Phi] by masking the contents of a `unit cell' [{\scr V}] of Λ (i.e. a fundamental domain for the action of Λ in [{\bb R}^{n}]) whose boundary does not meet K. If [\chi _{\scr V}] is the indicator function of [{\scr V}], then [\Phi = \chi_{\scr V}\times (R * \Phi).] Transforming both sides by [\bar{\scr F}] yields [\varphi = \bar{\scr F}\left[\chi_{\scr V}\times {1 \over |\det {\bf A}|} {\scr F}[R^{*} \times \varphi]\right],] i.e. [\varphi = \left({1 \over V} \bar{\scr F}[\chi_{\scr V}]\right) * (R^{*} \times \varphi)] since [|\det {\bf A}|] is the volume V of [{\scr V}].

This interpolation formula is traditionally credited to Shannon (1949)[link], although it was discovered much earlier by Whittaker (1915)[link]. It shows that ϕ may be recovered from its sample values on [\Lambda^{*}] (i.e. from [R^{*} \times \varphi]) provided [\Lambda^{*}] is sufficiently fine that no overlap (or `aliasing') occurs in the periodization of Φ by the dual lattice Λ. The interpolation kernel is the transform of the normalized indicator function of a unit cell of Λ containing the support K of Φ.

If K is contained in a sphere of radius [1/\Delta] and if Λ and [\Lambda^{*}] are rectangular, the length of each basis vector of Λ must be greater than [2/\Delta], and thus the sampling interval must be smaller than [\Delta /2]. This requirement constitutes the Shannon sampling criterion.
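
A one-dimensional Python/NumPy sketch of Shannon interpolation follows; the band-limited test function, the sampling step (chosen finer than the criterion requires) and the truncation of the sample set are arbitrary choices.

import numpy as np

# Shannon interpolation in one dimension: a band-limited function sampled
# finely enough is recovered at arbitrary points from sinc-weighted samples.
def phi(t):
    return np.sinc(3.0 * t) ** 2             # band-limited: support of F[phi] is [-3, 3]

dt = 1.0 / 8.0                                # sampling step; the criterion here requires dt <= 1/6
n = np.arange(-400, 401)                      # truncation of the (infinite) set of samples
samples = phi(n * dt)

def shannon(t):
    return np.sum(samples * np.sinc((t - n * dt) / dt))

for t in (0.123, 0.377, 0.811):
    assert abs(shannon(t) - phi(t)) < 1e-5    # recovery up to the truncation error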

1.3.2.7.2. Duality between subdivision and decimation of period lattices

| top | pdf |

1.3.2.7.2.1. Geometric description of sublattices

| top | pdf |

Let [\Lambda_{\bf A}] be a period lattice in [{\bb R}^{n}] with matrix A, and let [\Lambda_{\bf A}^{*}] be the lattice reciprocal to [\Lambda_{\bf A}], with period matrix [(A^{-1})^{T}]. Let [\Lambda_{\bf B}, {\bf B}, \Lambda_{\bf B}^{*}] be defined similarly, and let us suppose that [\Lambda_{\bf A}] is a sublattice of [\Lambda_{\bf B}], i.e. that [\Lambda_{\bf B} \supset \Lambda_{\bf A}] as a set.

The relation between [\Lambda_{\bf A}] and [\Lambda_{\bf B}] may be described in two different fashions: (i) multiplicatively, and (ii) additively.

  • (i) We may write [{\bf A} = {\bf BN}] for some non-singular matrix N with integer entries. N may be viewed as the period matrix of the coarser lattice [\Lambda_{\bf A}] with respect to the period basis of the finer lattice [\Lambda_{\bf B}]. It will be more convenient to write [{\bf A} = {\bf DB}], where [{\bf D} = {\bf BNB}^{-1}] is a rational matrix (with integer determinant since [\det {\bf D} = \det {\bf N}]) in terms of which the two lattices are related by [\Lambda_{\bf A} = {\bf D} \Lambda_{\bf B}.]

  • (ii) Call two vectors in [\Lambda_{\bf B}] congruent modulo [\Lambda_{\bf A}] if their difference lies in [\Lambda_{\bf A}]. Denote the set of congruence classes (or `cosets') by [\Lambda_{\bf B} / \Lambda_{\bf A}], and the number of these classes by [[\Lambda_{\bf B} : \Lambda_{\bf A}]]. The `coset decomposition' [\Lambda_{\bf B} = \bigcup_{{\boldell} \in \Lambda_{\bf B} / \Lambda_{\bf A}} ({\boldell} + \Lambda_{\bf A})] represents [\Lambda_{\bf B}] as the disjoint union of [[\Lambda_{\bf B} : \Lambda_{\bf A}]] translates of [\Lambda_{\bf A}]. [\Lambda_{\bf B} / \Lambda_{\bf A}] is a finite lattice with [[\Lambda_{\bf B} : \Lambda_{\bf A}]] elements, called the residual lattice of [\Lambda_{\bf B}] modulo [\Lambda_{\bf A}] (a small numerical sketch of this coset decomposition is given after the present list).

    The two descriptions are connected by the relation [[\Lambda_{\bf B} : \Lambda_{\bf A}] = \det {\bf D} = \det {\bf N}], which follows from a volume calculation. We may also combine (i)[link] and (ii)[link] into

  • [\displaylines{\quad({\rm iii})\hfill \Lambda_{\bf B} = \bigcup_{{\boldell} \in \Lambda_{\bf B} / \Lambda_{\bf A}} ({\boldell} + {\bf D} \Lambda_{\bf B})\hfill}] which may be viewed as the n-dimensional equivalent of the Euclidean algorithm for integer division: [\boldell] is the `remainder' of the division by [\Lambda_{\bf A}] of a vector in [\Lambda_{\bf B}], the quotient being the matrix D.
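
The following Python/NumPy sketch illustrates the coset decomposition for n = 2 and an arbitrary integer matrix N: each lattice vector v is reduced to the representative [{\bf v} - {\bf N} \lfloor {\bf N}^{-1} {\bf v}\rfloor], computed exactly via the adjugate matrix, and the number of distinct representatives found equals [\det {\bf N}].

import numpy as np
from itertools import product

# Residual lattice Z^2 / N Z^2 for an arbitrary integer matrix N: reducing
# each vector v to v - N * floor(N^{-1} v) yields exactly |det N| distinct
# coset representatives.
N = np.array([[2, 1],
              [0, 3]])
d = N[0, 0] * N[1, 1] - N[0, 1] * N[1, 0]                 # det N = 6 (positive here)
adj = np.array([[N[1, 1], -N[0, 1]],
                [-N[1, 0], N[0, 0]]])                     # adjugate: N^{-1} = adj / d

def representative(v):
    q = (adj @ v) // d                                    # floor(N^{-1} v), exact integer arithmetic
    return tuple(v - N @ q)

reps = {representative(np.array(v)) for v in product(range(-6, 7), repeat=2)}
assert len(reps) == d                                     # [Lambda_B : Lambda_A] = det N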

1.3.2.7.2.2. Sublattice relations for reciprocal lattices

| top | pdf |

Let us now consider the two reciprocal lattices [\Lambda_{\bf A}^{*}] and [\Lambda_{\bf B}^{*}]. Their period matrices [({\bf A}^{-1})^{T}] and [({\bf B}^{-1})^{T}] are related by: [({\bf B}^{-1})^{T} = ({\bf A}^{-1})^{T} {\bf N}^{T}], where [{\bf N}^{T}] is an integer matrix; or equivalently by [({\bf B}^{-1})^{T} = {\bf D}^{T} ({\bf A}^{-1})^{T}]. This shows that the roles are reversed in that [\Lambda_{\bf B}^{*}] is a sublattice of [\Lambda_{\bf A}^{*}], which we may write:

  • [\displaylines{\quad({\rm i})^*\hfill \Lambda_{\bf B}^{*} = {\bf D}^{T} \Lambda_{\bf A}^{*}\hfill}]

  • [\displaylines{\quad({\rm ii})^*\hfill\Lambda_{\bf A}^{*} = \bigcup_{{\boldell}^{*} \in \Lambda_{\bf A}^{*} / \Lambda_{\bf B}^{*}} ({\boldell}^{*} + \Lambda_{\bf B}^{*}).\hfill}] The residual lattice [\Lambda_{\bf A}^{*} / \Lambda_{\bf B}^{*}] is finite, with [[\Lambda_{\bf A}^{*}: \Lambda_{\bf B}^{*}] =] [ \det {\bf D} = \det {\bf N} = [\Lambda_{\bf B}: \Lambda_{\bf A}]], and we may again combine [(\hbox{i})^{*}] [link] and [(\hbox{ii})^{*}] [link] into

  • [\displaylines{\quad({\rm iii})^*\hfill\Lambda_{\bf A}^{*} = \bigcup_{{\boldell}^{*} \in \Lambda_{\bf A}^{*} / \Lambda_{\bf B}^{*}} ({\boldell}^{*} + {\bf D}^{T} \Lambda_{\bf A}^{*}).\hfill}]

1.3.2.7.2.3. Relation between lattice distributions

| top | pdf |

The above relations between lattices may be rewritten in terms of the corresponding lattice distributions as follows: [\displaylines{\quad (\hbox{i}) \hfill R_{\bf A} = {1 \over |\det {\bf D}|} {\bf D}^{\#} R_{\bf B} \;\hfill\cr \quad (\hbox{ii}) \hfill R_{\bf B} = T_{{\bf B} / {\bf A}} * R_{\bf A}\qquad \hfill\cr \quad (\hbox{i})^{*} \hfill \;\;R_{\bf B}^{*} = {1 \over |\det {\bf D}|} ({\bf D}^{T})^{\#} R_{\bf A}^{*} \hfill\cr \quad (\hbox{ii})^{*} \hfill R_{\bf A}^{*} = T_{{\bf A} / {\bf B}}^{*} * R_{\bf B}^{*} \qquad\;\;\hfill}] where [T_{{\bf B} / {\bf A}} = {\textstyle\sum\limits_{{\boldell} \in \Lambda_{\bf B} / \Lambda_{\bf A}}} \delta_{({\boldell})}] and [T_{{\bf A}/{\bf B}}^{*} = {\textstyle\sum\limits_{{\boldell}^{*} \in \Lambda_{\bf A}^{*} / \Lambda_{\bf B}^{*}}} \delta_{({\boldell}^{*})}] are (finite) residual-lattice distributions. We may incorporate the factor [1/|\det {\bf D}|] in (i) and [(\hbox{i})^{*}] into these distributions and define [S_{{\bf B}/{\bf A}} = {1 \over |\det {\bf D}|} T_{{\bf B}/{\bf A}},\quad S_{{\bf A}/{\bf B}}^{*} = {1 \over |\det {\bf D}|} T_{{\bf A}/{\bf B}}^{*}.]

Since [|\det {\bf D}| = [\Lambda_{\bf B}: \Lambda_{\bf A}] = [\Lambda_{\bf A}^{*}: \Lambda_{\bf B}^{*}]], convolution with [S_{{\bf B}/{\bf A}}] and [S_{{\bf A}/{\bf B}}^{*}] has the effect of averaging the translates of a distribution under the elements (or `cosets') of the residual lattices [\Lambda_{\bf B}/\Lambda_{\bf A}] and [\Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}], respectively. This process will be called `coset averaging'. Eliminating [R_{\bf A}] and [R_{\bf B}] between (i) and (ii), and [R_{\bf A}^{*}] and [R_{\bf B}^{*}] between [(\hbox{i})^{*}] and [(\hbox{ii})^{*}], we may write: [\displaylines{\quad (\hbox{i}')\hfill \! R_{\bf A} = {\bf D}^{\#} (S_{{\bf B}/{\bf A}} * R_{\bf A})\;\;\;\hfill\cr \quad (\hbox{ii}')\hfill \! R_{\bf B} = S_{{\bf B}/{\bf A}} * ({\bf D}^{\#} R_{\bf B})\;\;\;\;\hfill\cr \quad (\hbox{i}')^{*}\hfill R_{\bf B}^{*} = ({\bf D}^{T})^{\#} (S_{{\bf A}/{\bf B}}^{*} * R_{\bf B}^{*}) \hfill\cr \quad (\hbox{ii}')^{*}\hfill R_{\bf A}^{*} = S_{{\bf A}/{\bf B}}^{*} * [({\bf D}^{T})^{\#} R_{\bf A}^{*}]. \;\hfill}] These identities show that period subdivision by convolution with [S_{{\bf B}/{\bf A}}] (respectively [S_{{\bf A}/{\bf B}}^{*}]) on the one hand, and period decimation by `dilation' by [{\bf D}^{\#}] on the other hand, are mutually inverse operations on [R_{\bf A}] and [R_{\bf B}] (respectively [R_{\bf A}^{*}] and [R_{\bf B}^{*}]).

1.3.2.7.2.4. Relation between Fourier transforms

| top | pdf |

Finally, let us consider the relations between the Fourier transforms of these lattice distributions. Recalling the basic relation of Section 1.3.2.6.5[link], [\eqalign{{\scr F}[R_{\bf A}] &= {1 \over |\det {\bf A}|} R_{\bf A}^{*}\cr &= {1 \over |\det {\bf DB}|} T_{{\bf A}/{\bf B}}^{*} * R_{\bf B}^{*} \quad \quad \quad \quad \quad \quad \hbox{by (ii)}^{*}\cr &= \left({1 \over |\det {\bf D}|} T_{{\bf A}/{\bf B}}^{*}\right) * \left({1 \over |\det {\bf B}|} R_{\bf B}^{*}\right)}] i.e. [\displaylines{\quad (\hbox{iv})\hfill {\scr F}[R_{\bf A}] = S_{{\bf A}/{\bf B}}^{*} * {\scr F}[R_{\bf B}]\hfill}] and similarly: [\displaylines{\quad (\hbox{v})\hfill {\scr F}[R_{\bf B}^{*}] = S_{{\bf B}/{\bf A}} * {\scr F}[R_{\bf A}^{*}].\hfill}]

Thus [R_{\bf A}] (respectively [R_{\bf B}^{*}]), a decimated version of [R_{\bf B}] (respectively [R_{\bf A}^{*}]), is transformed by [{\scr F}] into a subdivided version of [{\scr F}[R_{\bf B}]] (respectively [{\scr F}[R_{\bf A}^{*}]]).

The converse is also true: [\eqalign{{\scr F}[R_{\bf B}] &= {1 \over |\det {\bf B}|} R_{\bf B}^{*}\cr &= {1 \over |\det {\bf B}|} {1 \over |\det {\bf D}|} ({\bf D}^{T})^{\#} R_{\bf A}^{*}\quad \quad \quad \quad \hbox{by (i)}^{*}\cr &= ({\bf D}^{T})^{\#} \left({1 \over |\det {\bf A}|} R_{\bf A}^{*}\right)}] i.e. [\displaylines{\quad (\hbox{iv}')\hfill {\scr F}[R_{\bf B}] = ({\bf D}^{T})^{\#} {\scr F}[R_{\bf A}]\hfill}] and similarly [\displaylines{\quad (\hbox{v}')\hfill {\scr F}[R_{\bf A}^{*}] = {\bf D}^{\#} {\scr F}[R_{\bf B}^{*}].\hfill}]

Thus [R_{\bf B}] (respectively [R_{\bf A}^{*}]), a subdivided version of [R_{\bf A}] (respectively [R_{\bf B}^{*}]) is transformed by [{\scr F}] into a decimated version of [{\scr F}[R_{\bf A}]] (respectively [{\scr F}[R_{\bf B}^{*}]]). Therefore, the Fourier transform exchanges subdivision and decimation of period lattices for lattice distributions.

Further insight into this phenomenon is provided by applying [\bar{\scr F}] to both sides of (iv) and (v) and invoking the convolution theorem: [\displaylines{\quad (\hbox{iv}'')\hfill \!\! R_{\bf A} = \bar{\scr F}[S_{{\bf A}/{\bf B}}^{*}] \times R_{\bf B} \;\hfill\cr \quad (\hbox{v}'')\hfill R_{\bf B}^{*} = \bar{\scr F}[S_{{\bf B}/{\bf A}}] \times R_{\bf A}^{*}. \hfill}] These identities show that multiplication by the transform of the period-subdividing distribution [S_{{\bf A}/{\bf B}}^{*}] (respectively [S_{{\bf B}/{\bf A}}]) has the effect of decimating [R_{\bf B}] to [R_{\bf A}] (respectively [R_{\bf A}^{*}] to [R_{\bf B}^{*}]). They clearly imply that, if [\boldell \in \Lambda_{\bf B}/\Lambda_{\bf A}] and [\boldell^{*} \in \Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}], then [\eqalign{\bar{\scr F}[S_{{\bf A}/{\bf B}}^{*}] ({\boldell}) &= 1 \hbox{ if } {\boldell} = {\bf 0} \;\;\quad (i.e. \hbox{ if } {\boldell} \hbox{ belongs}\cr &{\hbox to 66pt{}}\hbox{to the class of } \Lambda_{\bf A}),\cr &= 0 \hbox{ if } {\boldell} \neq {\bf 0}\hbox{;}\cr \bar{\scr F}[S_{{\bf B}/{\bf A}}] ({\boldell}^{*}) &= 1 \hbox{ if } {\boldell}^{*} = {\bf 0} \quad (i.e. \hbox{ if } {\boldell}^{*} \hbox{ belongs}\cr &{\hbox to 60pt{}} \hbox{ to the class of } \Lambda_{\bf B}^{*}),\cr &= 0 \hbox{ if } {\boldell}^{*} \neq {\bf 0}.}] Therefore, the duality between subdivision and decimation may be viewed as another aspect of that between convolution and multiplication.

There is clearly a strong analogy between the sampling/periodization duality of Section 1.3.2.6.6[link] and the decimation/subdivision duality, which is viewed most naturally in terms of subgroup relationships: both sampling and decimation involve restricting a function to a discrete additive subgroup of the domain over which it is initially given.

1.3.2.7.2.5. Sublattice relations in terms of periodic distributions

| top | pdf |

The usual presentation of this duality is not in terms of lattice distributions, but of periodic distributions obtained by convolving them with a motif.

Given [T^{0} \in {\scr E}\,' ({\bb R}^{n})], let us form [R_{\bf A} * T^{0}], then decimate its transform [(1/|\det {\bf A}|) R_{\bf A}^{*} \times \bar{\scr F}[T^{0}]] by keeping only its values at the points of the coarser lattice [\Lambda_{\bf B}^{*} = {\bf D}^{T} \Lambda_{\bf A}^{*}]; as a result, [R_{\bf A}^{*}] is replaced by [(1/|\det {\bf D}|) R_{\bf B}^{*}], and the reverse transform then yields [\displaylines{\hfill{1 \over |\det {\bf D}|} R_{\bf B} * T^{0} = S_{{\bf B}/{\bf A}} * (R_{\bf A} * T^{0})\hfill \hbox{by (ii)},}] which is the coset-averaged version of the original [R_{\bf A} * T^{0}]. The converse situation is analogous to that of Shannon's sampling theorem. Let a function [\varphi \in {\scr E}({\bb R}^{n})] whose transform [\Phi = {\scr F}[\varphi]] has compact support be sampled as [R_{\bf B} \times \varphi] at the nodes of [\Lambda_{\bf B}]. Then [{\scr F}[R_{\bf B} \times \varphi] = {1 \over |\det {\bf B}|} (R_{\bf B}^{*} * \Phi)] is periodic with period lattice [\Lambda_{\bf B}^{*}]. If the sampling lattice [\Lambda_{\bf B}] is decimated to [\Lambda_{\bf A} = {\bf D} \Lambda_{\bf B}], the inverse transform becomes [\eqalign{{\hbox to 48pt{}}{\scr F}[R_{\bf A} \times \varphi] &= {1 \over |\det {\bf D}|} (R_{\bf A}^{*} * \Phi)\cr &= S_{{\bf A}/{\bf B}}^{*} * (R_{\bf B}^{*} * \Phi){\hbox to 58pt{}}\hbox{by (ii)}^{*},}] hence becomes periodized more finely by averaging over the cosets of [\Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}]. With this finer periodization, the various copies of Supp Φ may start to overlap (a phenomenon called `aliasing'), indicating that decimation has produced too coarse a sampling of ϕ.

1.3.2.7.3. Discretization of the Fourier transformation

| top | pdf |

Let [\varphi^{0} \in {\scr E}({\bb R}^{n})] be such that [\Phi^{0} = {\scr F}[\varphi^{0}]] has compact support ([\varphi^{0}] is said to be band-limited). Then [\varphi = R_{\bf A} * \varphi^{0}] is [\Lambda_{\bf A}]-periodic, and [\Phi = {\scr F}[\varphi] = (1/|\det {\bf A}|) R_{\bf A}^{*} \times \Phi^{0}] is such that only a finite number of points [\lambda_{\bf A}^{*}] of [\Lambda_{\bf A}^{*}] have a non-zero Fourier coefficient [\Phi^{0} (\lambda_{\bf A}^{*})] attached to them. We may therefore find a decimation [\Lambda_{\bf B}^{*} = {\bf D}^{T} \Lambda_{\bf A}^{*}] of [\Lambda_{\bf A}^{*}] such that the distinct translates of Supp [\Phi^{0}] by vectors of [\Lambda_{\bf B}^{*}] do not intersect.

The distribution Φ can be uniquely recovered from [R_{\bf B}^{*} * \Phi] by the procedure of Section 1.3.2.7.1[link], and we may write: [\eqalign{R_{\bf B}^{*} * \Phi &= {1 \over |\det {\bf A}|} R_{\bf B}^{*} * (R_{\bf A}^{*} \times \Phi^{0})\cr &= {1 \over |\det {\bf A}|} R_{\bf A}^{*} \times (R_{\bf B}^{*} * \Phi^{0})\cr &= {1 \over |\det {\bf A}|} R_{\bf B}^{*} * [T_{{\bf A}/{\bf B}}^{*} \times (R_{\bf B}^{*} * \Phi^{0})]\hbox{;}}] these rearrangements being legitimate because [\Phi^{0}] and [T_{{\bf A}/{\bf B}}^{*}] have compact supports which are intersection-free under the action of [\Lambda_{\bf B}^{*}]. By virtue of its [\Lambda_{\bf B}^{*}]-periodicity, this distribution is entirely characterized by its `motif' [\tilde{\Phi}] with respect to [\Lambda_{\bf B}^{*}]: [\tilde{\Phi} = {1 \over |\det {\bf A}|} T_{{\bf A}/{\bf B}}^{*} \times (R_{\bf B}^{*} * \Phi^{0}).]

Similarly, ϕ may be uniquely recovered by Shannon interpolation from the distribution sampling its values at the nodes of [\Lambda_{\bf B} = {\bf D}^{-1} \Lambda_{\bf A}] ([\Lambda_{\bf B}] is a subdivision of [\Lambda_{\bf A}]). By virtue of its [\Lambda_{\bf A}]-periodicity, this distribution is completely characterized by its motif: [\tilde{\varphi} = T_{{\bf B}/{\bf A}} \times \varphi = T_{{\bf B}/{\bf A}} \times (R_{\bf A} * \varphi^{0}).]

Let [{\boldell} \in \Lambda_{\bf B}/\Lambda_{\bf A}] and [{\boldell}^{*} \in \Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}], and define the two sets of coefficients [\!\!\matrix{(1)& \tilde{\varphi} ({\boldell}) \hfill&= \varphi ({\boldell} + \boldlambda_{\bf A})\hfill&\hbox{for any } \boldlambda_{\bf A} \in \Lambda_{\bf A}\hfill&\cr &&&(\hbox{all choices of } \boldlambda_{\bf A} \hbox{ give the same } \tilde{\varphi}),\hfill&\cr (2)&\tilde{\Phi} ({\boldell}^{*}) \hfill&= \Phi^{0} ({\boldell}^{*} + \boldlambda_{\bf B}^{*})\hfill &\hbox{for the unique } \boldlambda_{\bf B}^{*} \hbox{ (if it exists)}\hfill&\cr &&&\hbox{such that } {\boldell}^{*} + \boldlambda_{\bf B}^{*} \in \hbox{Supp } \Phi^{0},\hfill&\cr &&= 0\hfill&\hbox{if no such } \boldlambda_{\bf B}^{*} \hbox{ exists}.\hfill}] Define the two distributions [\omega = {\textstyle\sum\limits_{{\boldell} \in \Lambda_{\bf B}/\Lambda_{\bf A}}} \tilde{\varphi} ({\boldell}) \delta_{({\boldell})}] and [\Omega = {\textstyle\sum\limits_{{\boldell}^{*} \in \Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}}} \tilde{\Phi} ({\boldell}^{*}) \delta_{({\boldell}^{*})}.] The relation between ω and Ω has two equivalent forms: [\displaylines{\quad (\hbox{i})\hfill \quad R_{\bf A} * \omega = {\scr F}[R_{\bf B}^{*} * \Omega] \hfill\cr \quad (\hbox{ii})\hfill \bar{\scr F}[R_{\bf A} * \omega] = R_{\bf B}^{*} * \Omega.\quad\;\;\;\hfill}]

By (i), [R_{\bf A} * \omega = |\det {\bf B}| R_{\bf B} \times {\scr F}[\Omega]]. Both sides are weighted lattice distributions concentrated at the nodes of [\Lambda_{\bf B}], and equating the weights at [\boldlambda_{\bf B} = \boldell + \boldlambda_{\bf A}] gives [\tilde{\varphi} ({\boldell}) = {1 \over |\det {\bf D}|} {\sum\limits_{{\boldell}^{*} \in \Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}}} \tilde{\Phi} ({\boldell}^{*}) \exp [-2\pi i {\boldell}^{*} \cdot ({\boldell} + \boldlambda_{\bf A})].] Since [\boldell^{*} \in \Lambda_{\bf A}^{*}], [\boldell^{*} \cdot \boldlambda_{\bf A}] is an integer, hence [\tilde{\varphi} ({\boldell}) = {1 \over |\det {\bf D}|} {\sum\limits_{{\boldell}^{*} \in \Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}}} \tilde{\Phi} ({\boldell}^{*}) \exp (-2\pi i {\boldell}^{*} \cdot {\boldell}).]

By (ii), we have [{1 \over |\det {\bf A}|} R_{\bf B}^{*} * [T_{{\bf A}/{\bf B}}^{*} \times (R_{\bf B}^{*} * \Phi^{0})] = {1 \over |\det {\bf A}|} \bar{\scr F}[R_{\bf A} * \omega].] Both sides are weighted lattice distributions concentrated at the nodes of [\Lambda_{\bf B}^{*}], and equating the weights at [{\boldlambda}_{\bf A}^{*} = \boldell^{*} + {\boldlambda}_{\bf B}^{*}] gives [\tilde{\Phi} ({\boldell}^{*}) = {\textstyle\sum\limits_{{\boldell} \in \Lambda_{\bf B}/\Lambda_{\bf A}}} \tilde{\varphi} ({\boldell}) \exp [+2\pi i {\boldell} \cdot ({\boldell}^{*} + {\boldlambda}_{\bf B}^{*})].] Since [\boldell \in \Lambda_{\bf B}], [\boldell \cdot {\boldlambda}^{*}_{\bf B}] is an integer, hence [\tilde{\Phi} ({\boldell}^{*}) = {\textstyle\sum\limits_{{\boldell} \in \Lambda_{\bf B}/\Lambda_{\bf A}}} \tilde{\varphi} ({\boldell}) \exp (+2\pi i {\boldell} \cdot {\boldell}^{*}).]

Now the decimation/subdivision relations between [\Lambda_{\bf A}] and [\Lambda_{\bf B}] may be written: [{\bf A} = {\bf DB} = {\bf BN},] so that [\eqalign{{\boldell} &= {\bf B}{\bf \scr k}\qquad\qquad\hbox{for } {\bf \scr k}\in {\bb Z}^{n}\cr {\boldell}^{*} &= ({\bf A}^{-1})^{T} {\scr k}^{*}\quad \hbox{ for } {\bf \scr k}^{*} \in {\bb Z}^{n}}] with [({\bf A}^{-1})^{T} = ({\bf B}^{-1})^{T} ({\bf N}^{-1})^{T}], hence finally [{\boldell}^{*} \cdot {\boldell} = {\boldell} \cdot {\boldell}^{*} = {\scr k}^{*} \cdot ({\bf N}^{-1} {\bf \scr k}).]

Denoting [\tilde{\varphi} ({\bf B{\scr k}})] by [\psi ({\scr k})] and [\tilde{\Phi}[({\bf A}^{-1})^{T} {\scr k}^{*}]] by [\Psi ({\scr k}^{*})], the relation between ω and Ω may be written in the equivalent form [\displaylines{(\hbox{i})\quad\hfill \psi ({\bf \scr k}) = {1 \over |\det {\bf N}|} {\sum\limits_{{\bf \scr k}^{*} \in {\bb Z}^{n}/{\bf N}^{T}{\bb Z}^{n}}} \Psi ({\bf \scr k}^{*}) \exp [-2 \pi i {\bf \scr k}^{*} \cdot ({\bf N}^{-1} {\bf \scr k})] \hfill\cr (\hbox{ii})\hfill \Psi ({\bf \scr k}^{*}) = {\sum\limits_{{\scr k}\in {\bb Z}^{n}/{\bf N}{\bb Z}^{\rm n}}} \psi ({\bf \scr k}) \exp [+2 \pi i {\bf \scr k}^{*} \cdot ({\bf N}^{-1} {\bf \scr k})], \quad\;\qquad\hfill}] where the summations are now over finite residual lattices in standard form.

Equations (i) and (ii) describe two mutually inverse linear transformations [{\scr F}({\bf N})] and [\bar{\scr F}({\bf N})] between two vector spaces [W_{\bf N}] and [W_{\bf N}^{*}] of dimension [|\det {\bf N}|]. [{\scr F}({\bf N})] [respectively [\bar{\scr F}({\bf N})]] is the discrete Fourier (respectively inverse Fourier) transform associated to matrix N.
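
As a concrete illustration, the following minimal sketch (in Python with NumPy; the names cosets and dft_pair are ours, and the brute-force enumeration of coset representatives is intended only for very small [|\det {\bf N}|]) evaluates relations (i) and (ii) literally for a non-diagonal decimation matrix and checks that [{\scr F}({\bf N})] and [\bar{\scr F}({\bf N})] are indeed mutually inverse.

    import numpy as np
    from itertools import product

    def cosets(M):
        """Brute-force list of |det M| coset representatives of Z^n / M Z^n (small cases only)."""
        n, d = M.shape[0], int(round(abs(np.linalg.det(M))))
        Minv = np.linalg.inv(M)
        reps, B = [], int(np.abs(M).sum())           # a box certain to meet every coset
        for k in product(range(-B, B + 1), repeat=n):
            k = np.array(k)
            new = True
            for r in reps:
                x = Minv @ (k - r)
                if np.allclose(x, np.round(x)):      # k and r differ by an element of M Z^n
                    new = False
                    break
            if new:
                reps.append(k)
            if len(reps) == d:
                break
        return reps, d

    def dft_pair(N):
        """Literal transcriptions of relations (i) and (ii) for decimation matrix N."""
        ks, d = cosets(N)                            # k  runs over Z^n / N Z^n
        kstars, _ = cosets(N.T)                      # k* runs over Z^n / N^T Z^n
        Ninv = np.linalg.inv(N)
        def F(Psi):                                  # (i):  psi(k)  = (1/|det N|) sum Psi(k*) e(-k*.N^{-1}k)
            return {tuple(k): sum(Psi[tuple(s)] * np.exp(-2j * np.pi * (s @ Ninv @ k))
                                  for s in kstars) / d for k in ks}
        def Fbar(psi):                               # (ii): Psi(k*) = sum psi(k) e(+k*.N^{-1}k)
            return {tuple(s): sum(psi[tuple(k)] * np.exp(+2j * np.pi * (s @ Ninv @ k))
                                  for k in ks) for s in kstars}
        return F, Fbar, ks

    N = np.array([[2, 1], [0, 3]])                   # non-diagonal decimation matrix, |det N| = 6
    F, Fbar, ks = dft_pair(N)
    rng = np.random.default_rng(0)
    psi = {tuple(k): complex(rng.normal(), rng.normal()) for k in ks}
    roundtrip = F(Fbar(psi))
    assert all(np.isclose(roundtrip[k], psi[k]) for k in roundtrip)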

The vector spaces [W_{\bf N}] and [W_{\bf N}^{*}] may be viewed from two different standpoints:

  • (1) as vector spaces of weighted residual-lattice distributions, of the form [\alpha ({\bf x}) T_{{\bf B}/{\bf A}}] and [\beta ({\bf x}) T_{{\bf A}/{\bf B}}^{*}]; the canonical basis of [W_{\bf N}] (respectively [W_{\bf N}^{*}]) then consists of the [\delta_{({\scr k})}] for [{\scr k}\in {\bb Z}^{n}/{\bf N}{\bb Z}^{n}] [respectively [\delta_{({\scr k}^{*})}] for [{\scr k}^{*} \in {\bb Z}^{n}/{\bf N}^{T} {\bb Z}^{n}]];

  • (2) as vector spaces of weight vectors for the [|\det {\bf N}|\ \delta]-functions involved in the expression for [T_{{\bf B}/{\bf A}}] (respectively [T_{{\bf A}/{\bf B}}^{*}]); the canonical basis of [W_{\bf N}] (respectively [W_{\bf N}^{*}]) consists of weight vectors [{\bf u}_{{\scr k}}] (respectively [{\bf v}_{{\scr k}^{*}}]) giving weight 1 to element [{\scr k}] (respectively [{\scr k}^{*}]) and 0 to the others.

These two spaces are said to be `isomorphic' (a relation denoted ≅), the isomorphism being given by the one-to-one correspondence: [\eqalign{\omega &= {\textstyle\sum\limits_{{\bf \scr k}}} \psi ({\bf \scr k}) \delta_{({\bf \scr k})} \qquad \leftrightarrow \quad \psi = {\textstyle\sum\limits_{{\bf \scr k}}} \psi ({\scr k}) {\bf u}_{{\bf \scr k}}\cr \Omega &= {\textstyle\sum\limits_{{\bf \scr k}^{*}}} \Psi ({\bf \scr k}^{*}) \delta_{({\bf \scr k}^{*})} \quad\; \leftrightarrow \quad \Psi = {\textstyle\sum\limits_{{\bf \scr k}^{*}}} \Psi ({\bf \scr k}^{*}) {\bf v}_{{\bf \scr k}^{*}}.}]

The second viewpoint will be adopted, as it involves only linear algebra. However, it is most helpful to keep the first one in mind and to think of the data or results of a discrete Fourier transform as representing (through their sets of unique weights) two periodic lattice distributions related by the full, distribution-theoretic Fourier transform.

We therefore view [W_{\bf N}] (respectively [W_{\bf N}^{*}]) as the vector space of complex-valued functions over the finite residual lattice [\Lambda_{\bf B}/\Lambda_{\bf A}] (respectively [\Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}]) and write: [\eqalign{W_{\bf N} &\cong L(\Lambda_{\bf B}/\Lambda_{\bf A}) \cong L({\bb Z}^{n}/{\bf N}{\bb Z}^{n}) \cr W_{\bf N}^{*} &\cong L(\Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}) \cong L({\bb Z}^{n}/{\bf N}^{T} {\bb Z}^{n})}] since a vector such as ψ is in fact the function [{\scr k} \;\longmapsto\; \psi ({\scr k})].

The two spaces [W_{\bf N}] and [W_{\bf N}^{*}] may be equipped with the following Hermitian inner products: [\eqalign{(\varphi, \psi)_{W} &= {\textstyle\sum\limits_{{\bf \scr k}}} \overline{\varphi ({\bf \scr k})} \psi ({\bf \scr k}) \cr (\Phi, \Psi)_{W^{*}} &= {\textstyle\sum\limits_{{\bf \scr k}^{*}}} \overline{\Phi ({\bf \scr k}^{*})} \Psi ({\bf \scr k}^{*}),}] which make each of them into a Hilbert space. The canonical bases [\{{\bf u}_{{\scr k}} | {\scr k}\in {\bb Z}^{n}/{\bf N} {\bb Z}^{n}\}] and [\{{\bf v}_{{\scr k}^{*}} | {\scr k}^{*} \in {\bb Z}^{n}/{\bf N}^{T} {\bb Z}^{n}\}] of [W_{\bf N}] and [W_{\bf N}^{*}] are orthonormal for their respective products.

1.3.2.7.4. Matrix representation of the discrete Fourier transform (DFT)

| top | pdf |

By virtue of definitions (i) and (ii), [\eqalign{{\scr F}({\bf N}) {\bf v}_{{\bf \scr k}^{*}} &= {1 \over |\det {\bf N}|} {\sum\limits_{{\bf \scr k}}} \exp [-2 \pi i {\bf \scr k}^{*} \cdot ({\bf N}^{-1} {\bf \scr k})] {\bf u}_{{\bf \scr k}} \cr \bar{\scr F}({\bf N}) {\bf u}_{{\bf \scr k}} &= {\sum\limits_{{\bf \scr k}^{*}}} \exp [+2 \pi i {\bf \scr k}^{*} \cdot ({\bf N}^{-1} {\bf \scr k})] {\bf v}_{{\bf \scr k}^{*}}}] so that [{\scr F}({\bf N})] and [\bar{\scr F}({\bf N})] may be represented, in the canonical bases of [W_{\bf N}] and [W_{\bf N}^{*}], by the following matrices: [\eqalign{[{\scr F}({\bf N})]_{{\bf {\bf \scr k}{\bf \scr k}}^{*}} &= {1 \over |\det {\bf N}|} \exp [-2 \pi i {\bf \scr k}^{*} \cdot ({\bf N}^{-1} {\bf \scr k})] \cr [\bar{\scr F}({\bf N})]_{{\bf \scr k}^{*} {\bf \scr k}} &= \exp [+2 \pi i {\bf \scr k}^{*} \cdot ({\bf N}^{-1} {\bf \scr k})].}]

When N is symmetric, [{\bb Z}^{n}/{\bf N} {\bb Z}^{n}] and [{\bb Z}^{n}/{\bf N}^{T} {\bb Z}^{n}] may be identified in a natural manner, and the above matrices are symmetric.

When N is diagonal, say [{\bf N} = \hbox{diag} (\nu_{1}, \nu_{2}, \ldots, \nu_{n})], then the tensor product structure of the full multidimensional Fourier transform (Section 1.3.2.4.2.4[link]) [{\scr F}_{\bf x} = {\scr F}_{x_{1}} \otimes {\scr F}_{x_{2}} \otimes \ldots \otimes {\scr F}_{x_{n}}] gives rise to a tensor product structure for the DFT matrices. The tensor product of matrices is defined as follows: [{\bf A} \otimes {\bf B} = \pmatrix{a_{11} {\bf B} &\ldots &a_{1n} {\bf B}\cr \vdots & &\vdots\cr a_{n1} {\bf B} &\ldots &a_{nn} {\bf B}\cr}.] Let the index vectors [{\scr k}] and [{\scr k}^{*}] be ordered in the same way as the elements in a Fortran array, e.g. for [{\scr k}] with [{\scr k}_{1}] increasing fastest, [{\scr k}_{2}] next fastest, [\ldots, {\scr k}_{n}] slowest; then [{\scr F}({\bf N}) = {\scr F}(\nu_{1}) \otimes {\scr F}(\nu_{2}) \otimes \ldots \otimes {\scr F}(\nu_{n}),] where [[{\scr F}(\nu_{j})]_{{\scr k}_{j}, \, {\scr k}_{j}^{*}} = {1 \over \nu_{j}} \exp \left(-2 \pi i {{\scr k}_{j}^{*} {\scr k}_{j} \over \nu_{j}}\right),] and [\bar{\scr F}({\bf N}) = \bar{\scr F}(\nu_{1}) \otimes \bar{\scr F}(\nu_{2}) \otimes \ldots \otimes \bar{\scr F}(\nu_{n}),] where [[\bar{\scr F}_{\nu_{j}}]_{{\scr k}_{j}^{*}, \, {\scr k}_{j}} = \exp \left(+2 \pi i {{\scr k}_{j}^{*} {\scr k}_{j} \over \nu_{j}}\right).]
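
This tensor product structure is readily checked numerically. The sketch below (Python with NumPy; names ours) builds the full matrix of [{\scr F}({\bf N})] for [{\bf N} = \hbox{diag}(3, 4)] directly from the definition and compares it with the Kronecker product of the two one-dimensional DFT matrices; the index vectors are flattened here with [{\scr k}_{2}] varying fastest, which matches the block structure of the tensor product of matrices as defined above.

    import numpy as np

    def F1(nu):
        """One-dimensional DFT matrix [F(nu)]_{k, k*} = (1/nu) exp(-2 pi i k* k / nu)."""
        k = np.arange(nu)
        return np.exp(-2j * np.pi * np.outer(k, k) / nu) / nu

    nu1, nu2 = 3, 4
    # Full matrix [F(N)]_{k, k*} for N = diag(nu1, nu2); the index vectors (k1, k2)
    # are flattened with k2 varying fastest.
    pairs = [(k1, k2) for k1 in range(nu1) for k2 in range(nu2)]
    full = np.array([[np.exp(-2j * np.pi * (s[0] * k[0] / nu1 + s[1] * k[1] / nu2))
                      for s in pairs] for k in pairs]) / (nu1 * nu2)
    assert np.allclose(full, np.kron(F1(nu1), F1(nu2)))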

1.3.2.7.5. Properties of the discrete Fourier transform

| top | pdf |

The DFT inherits most of the properties of the Fourier transforms, but with certain numerical factors (`Jacobians') due to the transition from continuous to discrete measure.

  • (1) Linearity is obvious.

  • (2) Shift property. If [(\tau_{{\bf {\scr a}}} \psi) ({\scr k}) = \psi ({\scr k} - {\bf {\scr a}})] and [(\tau_{{\bf {\scr a}}^{*}} \Psi) ({\scr k}^{*}) = \Psi ({\scr k}^{*} - {\bf {\scr a}}^{*})], where subtraction takes place by modular vector arithmetic in [{\bb Z}^{n}/{\bf N} {\bb Z}^{n}] and [{\bb Z}^{n}/{\bf N}^{T}{\bb Z}^{n}], respectively, then the following identities hold: [\eqalign{\bar{\scr F}({\bf N}) [\tau_{\bf \scr a} \psi] ({\bf \scr k}^{*}) &= \exp [+ 2 \pi i{\bf \scr k}^{*} \cdot ({\bf N}^{-1} {\bf \scr a})] \bar{\scr F}({\bf N})[\psi]({\bf \scr k}^{*}) \cr {\scr F}({\bf N})[\tau_{{\bf \scr a}^{*}} \Psi]({\bf \scr k}) &= \exp [- 2 \pi i{\bf \scr a}^{*} \cdot ({\bf N}^{-1} {\bf \scr k})] {\scr F}({\bf N})[\Psi]({\bf \scr k}).}]

  • (3) Differentiation identities. Let vectors ψ and Ψ be constructed from [\varphi^{0} \in {\scr E}({\bb R}^{n})] as in Section 1.3.2.7.3[link], hence be related by the DFT. If [D^{{\bf p}} \boldpsi] designates the vector of sample values of [D_{\bf x}^{{\bf p}} \varphi^{0}] at the points of [\Lambda_{\bf B}/\Lambda_{\bf A}], and [D^{{\bf p}} \boldPsi] the vector of values of [D_{\boldxi}^{{\bf p}} \Phi^{0}] at points of [\Lambda_{\bf A}^{*}/\Lambda_{\bf B}^{*}], then for all multi-indices [{\bf p} = (p_{1}, p_{2}, \ldots, p_{n})] [\eqalign{(D^{{\bf p}} \boldpsi) ({\bf \scr k}) &= \bar{\scr F}({\bf N}) [(+ 2 \pi i{\bf \scr k}^{*})^{{\bf p}} \boldPsi] ({\bf \scr k}) \cr (D^{{\bf p}} \boldPsi) ({\bf \scr k}^{*}) &= {\scr F}({\bf N}) [(- 2 \pi i{\bf \scr k})^{{\bf p}} \boldpsi] ({\bf \scr k}^{*})}] or equivalently [\eqalign{{\scr F}({\bf N}) [D^{{\bf p}} \boldpsi] ({\bf \scr k}^{*}) &= (+ 2 \pi i{\bf \scr k}^{*})^{{\bf p}} \boldPsi ({\bf \scr k}^{*}) \cr \bar{\scr F}({\bf N}) [D^{{\bf p}} \boldPsi] ({\bf \scr k}) &= (- 2 \pi i{\bf \scr k})^{{\bf p}} \boldpsi ({\bf \scr k}).}]

  • (4) Convolution property. Let [\boldvarphi \in W_{\bf N}] and [\boldPhi \in W_{\bf N}^{*}] (respectively ψ and Ψ) be related by the DFT, and define [\eqalign{(\boldvarphi * \boldpsi) ({\bf \scr k}) &= \textstyle\sum\limits_{{\bf \scr k}' \in {\bb Z}^{n}/{\bf N} {\bb Z}^{n}} \boldvarphi ({\bf \scr k}') \boldpsi ({\bf \scr k} - {\bf \scr k}') \cr (\boldPhi * \boldPsi) ({\bf \scr k}^{*}) &= \textstyle\sum\limits_{{\bf \scr k}^{*'} \in {\bb Z}^{n}/{\bf N}^{T} {\bb Z}^{n}} \boldPhi ({\bf \scr k}^{*'}) {\boldPsi} ({\bf \scr k}^{*} - {\bf \scr k}^{*'}).}] Then [\eqalign{{\scr F}({\bf N}) [\boldPhi * \boldPsi] ({\bf \scr k}) &= |\det {\bf N}| \boldvarphi ({\bf \scr k}) \boldpsi ({\bf \scr k}) \cr \bar{\scr F}({\bf N}) [\boldvarphi * \boldpsi] ({\bf \scr k}^{*}) &= \boldPhi ({\bf \scr k}^{*}) \boldPsi ({\bf \scr k}^{*})}] and [\eqalign{\bar{\scr F}({\bf N}) [\boldvarphi \times \boldpsi] ({\bf \scr k}^{*}) &= {1 \over |\det {\bf N}|} (\boldPhi * \boldPsi) ({\bf \scr k}^{*}) \cr {\scr F}({\bf N}) [\boldPhi \times \boldPsi] ({\bf \scr k}) &= (\boldvarphi * \boldpsi) ({\bf \scr k}).}] Since addition on [{\bb Z}^{n}/{\bf N}{\bb Z}^{n}] and [{\bb Z}^{n}/{\bf N}^{T} {\bb Z}^{n}] is modular, this type of convolution is called cyclic convolution. A numerical check of these identities is sketched at the end of this list.

  • (5) Parseval/Plancherel property. If ϕ, ψ, Φ, Ψ are as above, then [\eqalign{({\scr F}({\bf N}) [\boldPhi], {\scr F}({\bf N}) [\boldPsi])_{W} &= {1 \over |\det {\bf N}|} (\boldPhi, \boldPsi)_{W^{*}} \cr (\bar{\scr F}({\bf N}) [\boldvarphi], \bar{\scr F}({\bf N}) [\boldpsi])_{W^{*}} &= |\det {\bf N}|\; (\boldvarphi, \boldpsi)_{W}.}]

  • (6) Period 4. When N is symmetric, so that the ranges of indices [{\scr k}] and [{\scr k}^{*}] can be identified, it makes sense to speak of powers of [{\scr F}({\bf N})] and [\bar{\scr F}({\bf N})]. Then the `standardized' matrices [(1/|\det {\bf N}|^{1/2}){\scr F}({\bf N})] and [(1/|\det {\bf N}|^{1/2}) \bar{\scr F}({\bf N})] are unitary matrices whose fourth power is the identity matrix (Section 1.3.2.4.3.4[link]); their eigenvalues are therefore [\pm 1] and [\pm i].
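
The convolution and multiplication identities of property (4) are easily verified numerically. The sketch below (Python with NumPy; a one-dimensional case [{\bf N} = (N)], all names ours) builds [{\scr F}(N)] and [\bar{\scr F}(N)] from their defining sums and checks the four identities on random data.

    import numpy as np

    def F(X):        # F(N):    X(k)   = (1/N) sum_{k*} X*(k*) e(-k* k / N)
        N, k = len(X), np.arange(len(X))
        return np.exp(-2j * np.pi * np.outer(k, k) / N) @ X / N

    def Fbar(X):     # Fbar(N): X*(k*) = sum_k X(k) e(+k* k / N)
        N, k = len(X), np.arange(len(X))
        return np.exp(+2j * np.pi * np.outer(k, k) / N) @ X

    def cyclic(u, v):
        """Cyclic convolution: (u * v)(n) = sum_m u(m) v(n - m), indices taken mod N."""
        N = len(u)
        return np.array([sum(u[m] * v[(n - m) % N] for m in range(N)) for n in range(N)])

    rng = np.random.default_rng(1)
    N = 8
    phi, psi = rng.normal(size=N), rng.normal(size=N)
    Phi, Psi = Fbar(phi), Fbar(psi)                   # phi <-> Phi and psi <-> Psi related by the DFT

    assert np.allclose(F(cyclic(Phi, Psi)), N * phi * psi)      # F(N)[Phi * Psi]    = |det N| phi x psi
    assert np.allclose(Fbar(cyclic(phi, psi)), Phi * Psi)       # Fbar(N)[phi * psi] = Phi x Psi
    assert np.allclose(Fbar(phi * psi), cyclic(Phi, Psi) / N)   # Fbar(N)[phi x psi] = (1/|det N|) Phi * Psi
    assert np.allclose(F(Phi * Psi), cyclic(phi, psi))          # F(N)[Phi x Psi]    = phi * psi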

1.3.3. Numerical computation of the discrete Fourier transform

| top | pdf |

1.3.3.1. Introduction

| top | pdf |

The Fourier transformation's most remarkable property is undoubtedly that of turning convolution into multiplication. As distribution theory has shown, other valuable properties – such as the shift property, the conversion of differentiation into multiplication by monomials, and the duality between periodicity and sampling – are special instances of the convolution theorem.

This property is exploited in many areas of applied mathematics and engineering (Campbell & Foster, 1948[link]; Sneddon, 1951[link]; Champeney, 1973[link]; Bracewell, 1986[link]). For example, the passing of a signal through a linear filter, which results in its being convolved with the response of the filter to a δ-function `impulse', may be modelled as a multiplication of the signal's transform by the transform of the impulse response (also called transfer function). Similarly, the solution of systems of partial differential equations may be turned by Fourier transformation into a division problem for distributions. In both cases, the formulations obtained after Fourier transformation are considerably simpler than the initial ones, and lend themselves to constructive solution techniques.

Whenever the functions to which the Fourier transform is applied are band-limited, or can be well approximated by band-limited functions, the discrete Fourier transform (DFT) provides a means of constructing explicit numerical solutions to the problems at hand. A great variety of investigations in physics, engineering and applied mathematics thus lead to DFT calculations, to such a degree that, at the time of writing, about 50% of all supercomputer CPU time is alleged to be spent calculating DFTs.

The straightforward use of the defining formulae for the DFT leads to calculations of size [N^{2}] for N sample points, which become unfeasible for any but the smallest problems. Much ingenuity has therefore been exerted on the design and implementation of faster algorithms for calculating the DFT (McClellan & Rader, 1979[link]; Nussbaumer, 1981[link]; Blahut, 1985[link]; Brigham, 1988[link]). The most famous is that of Cooley & Tukey (1965)[link] which heralded the age of digital signal processing. However, it had been preceded by the prime factor algorithm of Good (1958[link], 1960[link]), which has lately been the basis of many new developments. Recent historical research (Goldstine, 1977[link], pp. 249–253; Heideman et al., 1984[link]) has shown that Gauss essentially knew the Cooley–Tukey algorithm as early as 1805 (before Fourier's 1807 work on harmonic analysis!); while it has long been clear that Dirichlet knew of the basis of the prime factor algorithm and used it extensively in his theory of multiplicative characters [see e.g. Chapter I of Ayoub (1963)[link], and Chapters 6 and 8 of Apostol (1976)[link]]. Thus the computation of the DFT, far from being a purely technical and rather narrow piece of specialized numerical analysis, turns out to have very rich connections with such central areas of pure mathematics as number theory (algebraic and analytic), the representation theory of certain Lie groups and coding theory – to list only a few. The interested reader may consult Auslander & Tolimieri (1979)[link]; Auslander, Feig & Winograd (1982[link], 1984[link]); Auslander & Tolimieri (1985)[link]; Tolimieri (1985)[link].

One-dimensional algorithms are examined first. The Sande mixed-radix version of the Cooley–Tukey algorithm only calls upon the additive structure of congruence classes of integers. The prime factor algorithm of Good begins to exploit some of their multiplicative structure, and the use of relatively prime factors leads to a stronger factorization than that of Sande. Fuller use of the multiplicative structure, via the group of units, leads to the Rader algorithm; and the factorization of short convolutions then yields the Winograd algorithms.

Multidimensional algorithms are at first built as tensor products of one-dimensional elements. The problem of factoring the DFT in several dimensions simultaneously is then examined. The section ends with a survey of attempts at formalizing the interplay between algorithm structure and computer architecture for the purpose of automating the design of optimal DFT code.

It was originally intended to incorporate into this section a survey of all the basic notions and results of abstract algebra which are called upon in the course of these developments, but time limitations have made this impossible. This material, however, is adequately covered by the first chapter of Tolimieri et al. (1989)[link] in a form tailored for the same purposes. Similarly, the inclusion of numerous detailed examples of the algorithms described here has had to be postponed to a later edition, but an abundant supply of such examples may be found in the signal processing literature, for instance in the books by McClellan & Rader (1979)[link], Blahut (1985)[link], and Tolimieri et al. (1989)[link].

1.3.3.2. One-dimensional algorithms

| top | pdf |

Throughout this section we will denote by [e(t)] the expression [\exp (2 \pi it)], [t \in {\bb R}]. The mapping [t \;\longmapsto\; e(t)] has the following properties: [\eqalign{e(t_{1} + t_{2}) &= e(t_{1}) e(t_{2}) \cr e(-t) &= \overline{e(t)} = [e(t)]^{-1} \cr e(t) &= 1 \Leftrightarrow t \in {\bb Z}.}] Thus e defines an isomorphism between the additive group [{\bb R} /{\bb Z}] (the reals modulo the integers) and the multiplicative group of complex numbers of modulus 1. It follows that the mapping [\ell \;\longmapsto\; e(\ell/N)], where [\ell \in {\bb Z}] and N is a positive integer, defines an isomorphism between the one-dimensional residual lattice [{\bb Z}/N {\bb Z}] and the multiplicative group of Nth roots of unity.

The DFT on N points then relates vectors X and [{\bf X}^{*}] in W and [W^{*}] through the linear transformations: [\eqalign{&F(N): \quad X(k) = {1 \over N} {\sum\limits_{k^{*} \in {\bb Z}/N {\bb Z}}} X^{*} (k^{*}) e(-k^{*} k/N) \cr &\bar{F}(N): \quad X^{*} (k^{*}) = {\sum\limits_{k \in {\bb Z}/N {\bb Z}}} X(k) e(k^{*} k/N).}]
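
These defining sums translate directly into code. In the sketch below (Python with NumPy; function names ours), the identification of [\bar{F}(N)] with N times NumPy's inverse FFT, and of [F(N)] with NumPy's forward FFT divided by N, is our own mapping between conventions and serves only as a cross-check.

    import numpy as np

    def e(t):
        """e(t) = exp(2 pi i t)."""
        return np.exp(2j * np.pi * t)

    def F(Xstar):                # X(k)   = (1/N) sum_{k*} X*(k*) e(-k* k / N)
        N = len(Xstar)
        return np.array([sum(Xstar[ks] * e(-ks * k / N) for ks in range(N))
                         for k in range(N)]) / N

    def Fbar(X):                 # X*(k*) = sum_k X(k) e(+k* k / N)
        N = len(X)
        return np.array([sum(X[k] * e(ks * k / N) for k in range(N))
                         for ks in range(N)])

    X = np.random.default_rng(2).normal(size=6)
    assert np.allclose(Fbar(X), len(X) * np.fft.ifft(X))    # Fbar(N) = N x inverse FFT (NumPy convention)
    assert np.allclose(F(X), np.fft.fft(X) / len(X))        # F(N)    = forward FFT / N
    assert np.allclose(F(Fbar(X)), X)                       # mutually inverse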

1.3.3.2.1. The Cooley–Tukey algorithm

| top | pdf |

The presentation of Gentleman & Sande (1966)[link] will be followed first [see also Cochran et al. (1967)[link]]. It will then be reinterpreted in geometric terms which will prepare the way for the treatment of multidimensional transforms in Section 1.3.3.3.[link]

Suppose that the number of sample points N is composite, say [N = N_{1} N_{2}]. We may write k to the base [N_{1}] and [k^{*}] to the base [N_{2}] as follows: [\eqalign{k &= k_{1} + N_{1} k_{2} \quad\; k_{1} \in {\bb Z}/N_{1} {\bb Z}, \quad k_{2} \in {\bb Z}/N_{2} {\bb Z} \cr k^{*} &= k_{2}^{*} + k_{1}^{*} N_{2} \quad\;k_{1}^{*} \in {\bb Z}/N_{1} {\bb Z}, \quad k_{2}^{*} \in {\bb Z}/N_{2} {\bb Z}.}] The defining relation for [\bar{F}(N)] may then be written: [\eqalign{X^{*} (k_{2}^{*} + k_{1}^{*} N_{2}) &= {\sum\limits_{k_{1} \in {\bb Z}/N_{1} {\bb Z}}}\; {\sum\limits_{k_{2} \in {\bb Z}/N_{2} {\bb Z}}} X (k_{1} + N_{1} k_{2}) \cr &\quad \times e \left[{(k_{2}^{*} + k_{1}^{*} N_{2}) (k_{1} + N_{1} k_{2}) \over N_{1} N_{2}}\right].}] The argument of [e[.]] may be expanded as [{k_{2}^{*} k_{1} \over N} + {k_{1}^{*} k_{1} \over N_{1}} + {k_{2}^{*} k_{2} \over N_{2}} + k_{1}^{*} k_{2},] and the last summand, being an integer, may be dropped: [\eqalign{&X^{*} (k_{2}^{*} + k_{1}^{*} N_{2})\cr &\quad = {\sum\limits_{k_{1}}} \left\{e \left({k_{2}^{*} k_{1} \over N}\right) \left[{\sum\limits_{k_{2}}}\; X (k_{1} + N_{1} k_{2}) e \left({k_{2}^{*} k_{2} \over N_{2}}\right)\right]\right\}\cr &\qquad \times e \left({k_{1}^{*} k_{1} \over N_{1}}\right).}] This computation may be decomposed into five stages, as follows:

  • (i) form the [N_{1}] vectors [{\bf Y}_{k_{1}}] of length [N_{2}] by the prescription [Y_{k_{1}} (k_{2}) = X (k_{1} + N_{1} k_{2}),\quad k_{1} \in {\bb Z}/N_{1} {\bb Z},\quad k_{2} \in {\bb Z}/N_{2} {\bb Z}\hbox{;}]

  • (ii) calculate the [N_{1}] transforms [{\bf Y}_{k_{1}}^{*}] on [N_{2}] points: [{\bf Y}_{k_{1}}^{*} = \bar{F} (N_{2}) [{\bf Y}_{k_{1}}],\quad k_{1} \in {\bb Z}/N_{1} {\bb Z}\hbox{;}]

  • (iii) form the [N_{2}] vectors [{\bf Z}_{k_{2}^{*}}] of length [N_{1}] by the prescription [{\bf Z}_{k_{2}^{*}} (k_{1}) = e \left({k_{2}^{*} k_{1} \over N}\right) Y_{k_{1}}^{*} (k_{2}^{*}),\quad k_{1} \in {\bb Z}/N_{1} {\bb Z},\quad k_{2}^{*} \in {\bb Z}/N_{2} {\bb Z}\hbox{;}]

  • (iv) calculate the [N_{2}] transforms [{\bf Z}_{k_{2}^{*}}^{*}] on [N_{1}] points: [{\bf Z}_{k_{2}^{*}}^{*} = \bar{F} (N_{1}) [{\bf Z}_{k_{2}^{*}}],\quad k_{2}^{*} \in {\bb Z}/N_{2} {\bb Z}\hbox{;}]

  • (v) collect [X^{*} (k_{2}^{*} + k_{1}^{*} N_{2})] as [Z_{k_{2}^{*}}^{*} (k_{1}^{*})].

If the intermediate transforms in stages (ii)[link] and (iv)[link] are performed in place, i.e. with the results overwriting the data, then at stage (v)[link] the result [X^{*} (k_{2}^{*} + k_{1}^{*} N_{2})] will be found at address [k_{1}^{*} + N_{1} k_{2}^{*}]. This phenomenon is called scrambling by `digit reversal', and stage (v)[link] is accordingly known as unscrambling.

The initial N-point transform [\bar{F} (N)] has thus been performed as [N_{1}] transforms [\bar{F} (N_{2})] on [N_{2}] points, followed by [N_{2}] transforms [\bar{F} (N_{1})] on [N_{1}] points, thereby reducing the arithmetic cost from [(N_{1} N_{2})^{2}] to [N_{1} N_{2} (N_{1} + N_{2})]. The phase shifts applied at stage (iii)[link] are traditionally called `twiddle factors', and the transposition between [k_{1}] and [k_{2}^{*}] can be performed by the fast recursive technique of Eklundh (1972)[link]. Clearly, this procedure can be applied recursively if [N_{1}] and [N_{2}] are themselves composite, leading to an overall arithmetic cost of order N log N if N has no large prime factors.
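
The five stages may be transcribed literally, as in the following sketch (Python with NumPy; names ours), which performs [\bar{F}(15)] as [N_{1} = 3] transforms of length [N_{2} = 5] followed by five transforms of length 3, with the twiddle factors of stage (iii) applied in between, and compares the result with the defining sum.

    import numpy as np

    def e(t): return np.exp(2j * np.pi * t)

    def fbar(X):
        """Direct evaluation of Fbar: X*(k*) = sum_k X(k) e(k* k / N)."""
        N = len(X)
        return np.array([sum(X[k] * e(ks * k / N) for k in range(N)) for ks in range(N)])

    def fbar_cooley_tukey(X, N1, N2):
        """Fbar(N), N = N1*N2, by the five stages (i)-(v)."""
        N = N1 * N2
        Xstar = np.empty(N, complex)
        Y = {k1: np.array([X[k1 + N1 * k2] for k2 in range(N2)])   # (i)   decimated subsequences
             for k1 in range(N1)}
        Ystar = {k1: fbar(Y[k1]) for k1 in range(N1)}              # (ii)  N1 transforms of length N2
        for k2s in range(N2):
            Z = np.array([e(k2s * k1 / N) * Ystar[k1][k2s]         # (iii) twiddle factors e(k2* k1 / N)
                          for k1 in range(N1)])
            Zstar = fbar(Z)                                        # (iv)  N2 transforms of length N1
            for k1s in range(N1):
                Xstar[k2s + k1s * N2] = Zstar[k1s]                 # (v)   collect (unscramble)
        return Xstar

    X = np.random.default_rng(3).normal(size=15)
    assert np.allclose(fbar_cooley_tukey(X, 3, 5), fbar(X))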

The Cooley–Tukey factorization may also be derived from a geometric rather than arithmetic argument. The decomposition [k = k_{1} + N_{1} k_{2}] is associated to a geometric partition of the residual lattice [{\bb Z}/N {\bb Z}] into [N_{1}] copies of [{\bb Z}/N_{2} {\bb Z}], each translated by [k_{1} \in {\bb Z}/N_{1} {\bb Z}] and `blown up' by a factor [N_{1}]. This partition in turn induces a (direct sum) decomposition of X as [{\bf X} = {\textstyle\sum\limits_{k_{1}}}\; {\bf X}_{k_{1}},] where [\eqalign{X_{k_{1}} (k) &= X (k)\quad \hbox{if } k \equiv k_{1} \hbox{ mod } N_{1},\cr &= 0{\hbox to 23pt{}} \hbox{otherwise}.}]

According to (i)[link], [{\bf X}_{k_{1}}] is related to [{\bf Y}_{k_{1}}] by decimation by [N_{1}] and offset by [k_{1}]. By Section 1.3.2.7.2[link], [\bar{F} (N) [{\bf X}_{k_{1}}]] is related to [\bar{F} (N_{2}) [{\bf Y}_{k_{1}}]] by periodization by [N_{2}] and phase shift by [e (k^{*} k_{1}/N)], so that [X^{*} (k^{*}) = {\sum\limits_{k_{1}}} e \left({k^{*} k_{1} \over N}\right) Y_{k_{1}}^{*} (k_{2}^{*}),] the periodization by [N_{2}] being reflected by the fact that [Y_{k_{1}}^{*}] does not depend on [k_{1}^{*}]. Writing [k^{*} = k_{2}^{*} + k_{1}^{*} N_{2}] and expanding [k^{*} k_{1}] shows that the phase shift contains both the twiddle factor [e (k_{2}^{*} k_{1}/N)] and the kernel [e (k_{1}^{*} k_{1}/N_{1})] of [\bar{F} (N_{1})]. The Cooley–Tukey algorithm is thus naturally associated to the coset decomposition of a lattice modulo a sublattice (Section 1.3.2.7.2[link]).

It is readily seen that essentially the same factorization can be obtained for [F(N)], up to the complex conjugation of the twiddle factors. The normalizing constant [1/N] arises from the normalizing constants [1/N_{1}] and [1/N_{2}] in [F (N_{1})] and [F (N_{2})], respectively.

Factors of 2 are particularly simple to deal with and give rise to a characteristic computational structure called a `butterfly loop'. If [N = 2M], then two options exist:

  • (a) using [N_{1} = 2] and [N_{2} = M] leads to collecting the even-numbered coordinates of X into [{\bf Y}_{0}] and the odd-numbered coordinates into [{\bf Y}_{1}] [\eqalign{Y_{0} (k_{2}) &= X (2k_{2}),\quad \qquad k_{2} = 0, \ldots, M - 1,\cr Y_{1} (k_{2}) &= X (2k_{2} + 1),\quad \;k_{2} = 0, \ldots, M - 1,}] and writing: [\eqalign{X^{*} (k_{2}^{*}) = \;&Y_{0}^{*} (k_{2}^{*}) + e (k_{2}^{*}/N) Y_{1}^{*} (k_{2}^{*}),\cr \quad &k_{2}^{*} = 0, \ldots, M - 1\hbox{;}\cr X^{*} (k_{2}^{*} + M) =\; &Y_{0}^{*} (k_{2}^{*}) - e (k_{2}^{*}/N) Y_{1}^{*} (k_{2}^{*}),\cr &k_{2}^{*} = 0, \ldots, M - 1.}] This is the original version of Cooley & Tukey, and the process of formation of [{\bf Y}_{0}] and [{\bf Y}_{1}] is referred to as `decimation in time' (i.e. decimation along the data index k).

  • (b) using [N_{1} = M] and [N_{2} = 2] leads to forming [\eqalign{Z_{0} (k_{1}) &= X (k_{1}) + X (k_{1} + M),\;\;\quad \quad \quad \qquad k_{1} = 0, \ldots, M - 1,\cr Z_{1} (k_{1}) &= [X (k_{1}) - X (k_{1} + M)] e \left({k_{1} \over N}\right),{\hbox to 20pt{}} k_{1} = 0, \ldots, M - 1,}] then obtaining separately the even-numbered and odd-numbered components of [{\bf X}^{*}] by transforming [{\bf Z}_{0}] and [{\bf Z}_{1}]: [\eqalign{X^{*} (2k_{1}^{*}) &= Z_{0}^{*} (k_{1}^{*}),\quad k_{1}^{*} = 0, \ldots, M - 1\hbox{;}\cr X^{*} (2k_{1}^{*} + 1) &= Z_{1}^{*} (k_{1}^{*}),\quad k_{1}^{*} = 0, \ldots, M - 1.}] This version is due to Sande (Gentleman & Sande, 1966[link]), and the process of separately obtaining even-numbered and odd-numbered results has led to its being referred to as `decimation in frequency' (i.e. decimation along the result index [k^{*}]).

By repeated factoring of the number N of sample points, the calculation of [F(N)] and [\bar{F} (N)] can be reduced to a succession of stages, the smallest of which operate on single prime factors of N. The reader is referred to Gentleman & Sande (1966)[link] for a particularly lucid analysis of the programming considerations which help implement this factorization efficiently; see also Singleton (1969)[link]. Powers of two are often grouped together into factors of 4 or 8, which are advantageous in that they require fewer complex multiplications than the repeated use of factors of 2. In this approach, large prime factors P are detrimental, since they require a full [P^{2}]-size computation according to the defining formula.
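
For [N = 2^{\nu}], repeated application of option (a) gives the familiar recursive radix-2 `decimation in time' procedure, sketched below (Python with NumPy; names ours).

    import numpy as np

    def e(t): return np.exp(2j * np.pi * t)

    def fbar_radix2(X):
        """Recursive radix-2 decimation-in-time evaluation of Fbar(N), N a power of 2."""
        N = len(X)
        if N == 1:
            return np.asarray(X, complex)
        Y0s = fbar_radix2(X[0::2])                    # transform of the even-numbered samples
        Y1s = fbar_radix2(X[1::2])                    # transform of the odd-numbered samples
        t = e(np.arange(N // 2) / N) * Y1s            # twiddle factors e(k2*/N)
        return np.concatenate([Y0s + t, Y0s - t])     # butterfly: X*(k2*) and X*(k2* + M)

    X = np.random.default_rng(4).normal(size=16)
    k = np.arange(16)
    direct = np.array([np.sum(X * e(ks * k / 16)) for ks in range(16)])
    assert np.allclose(fbar_radix2(X), direct)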

1.3.3.2.2. The Good (or prime factor) algorithm

| top | pdf |

1.3.3.2.2.1. Ring structure on [{\bb Z}/N{\bb Z}]

| top | pdf |

The set [{\bb Z}/N {\bb Z}] of congruence classes of integers modulo an integer N [see e.g. Apostol (1976)[link], Chapter 5] inherits from [{\bb Z}] not only the additive structure used in deriving the Cooley–Tukey factorization, but also a multiplicative structure in which the product of two congruence classes mod N is uniquely defined as the class of the ordinary product (in [{\bb Z}]) of representatives of each class. The multiplication can be distributed over addition in the usual way, endowing [{\bb Z}/N {\bb Z}] with the structure of a commutative ring.

If N is composite, the ring [{\bb Z}/N {\bb Z}] has zero divisors. For example, let [N = N_{1} N_{2}], let [n_{1} \equiv N_{1}] mod N, and let [n_{2} \equiv N_{2}] mod N: then [n_{1} n_{2} \equiv 0] mod N. In the general case, a product of non-zero elements will be zero whenever these elements collect together all the factors of N. These circumstances give rise to a fundamental theorem in the theory of commutative rings, the Chinese Remainder Theorem (CRT), which will now be stated and proved [see Apostol (1976[link]), Chapter 5; Schroeder (1986[link]), Chapter 16].

1.3.3.2.2.2. The Chinese remainder theorem

| top | pdf |

Let [N = N_{1} N_{2} \ldots N_{d}] be factored into a product of pairwise coprime integers, so that g.c.d. [(N_{i}, N_{j}) = 1] for [i \neq j]. Then the system of congruence equations [{\ell} \equiv {\ell}_{j} \hbox{ mod } N_{j},\qquad j = 1, \ldots, d,] has a unique solution [\ell] mod N. In other words, each [\ell \in {\bb Z}/N {\bb Z}] is associated in a one-to-one fashion to the d-tuple [(\ell_{1}, \ell_{2}, \ldots, \ell_{d})] of its residue classes in [{\bb Z}/N_{1} {\bb Z}, {\bb Z}/N_{2} {\bb Z}, \ldots, {\bb Z}/N_{d} {\bb Z}].

The proof of the CRT goes as follows. Let [Q_{j} = {N \over N_{j}} = {\prod\limits_{i \neq j}}\; N_{i}.] Since g.c.d. [(N_{j}, Q_{j}) = 1] there exist integers [n_{j}] and [q_{j}] such that [n_{j} N_{j} + q_{j} Q_{j} = 1,\qquad j = 1, \ldots, d,] then the integer [{\ell} = {\textstyle\sum\limits_{i = 1}^{d}}\; {\ell}_{i} q_{i} Q_{i} \hbox{ mod } N] is the solution. Indeed, [{\ell} \equiv {\ell}_{j} q_{j} Q_{j} \hbox{ mod } N_{j}] because all terms with [i \neq j] contain [N_{j}] as a factor; and [q_{j} Q_{j} \equiv 1 \hbox{ mod } N_{j}] by the defining relation for [q_{j}].
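
A compact transcription of this reconstruction is sketched below (Python; the use of the built-in modular inverse pow(Q_j, -1, N_j) in place of an explicit Euclidean computation of [n_{j} N_{j} + q_{j} Q_{j} = 1] is our shortcut).

    def crt(residues, moduli):
        """Reconstruct l mod N from residues l_j mod N_j, the N_j being pairwise coprime."""
        N = 1
        for Nj in moduli:
            N *= Nj
        l = 0
        for lj, Nj in zip(residues, moduli):
            Qj = N // Nj
            qj = pow(Qj, -1, Nj)          # q_j Q_j = 1 mod N_j (g.c.d.(N_j, Q_j) = 1)
            l += lj * qj * Qj             # l = sum_j l_j q_j Q_j mod N
        return l % N

    moduli = (4, 9, 5)                    # N = 180
    l = 97
    assert crt([l % Nj for Nj in moduli], moduli) == l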

It may be noted that [\eqalign{(q_{i} Q_{i}) (q_{j} Q_{j}) &\equiv 0{\phantom{Q_j}}\quad\quad\hbox{ mod } N \hbox{ for } i \neq j,\cr (q_{j} Q_{j})^{2} &\equiv q_{j} Q_{j}\quad\quad\hbox{mod } N, \;\;j = 1, \ldots, d,}] so that the [q_{j} Q_{j}] are mutually orthogonal idempotents in the ring [{\bb Z}/N {\bb Z}], with properties formally similar to those of mutually orthogonal projectors onto subspaces in linear algebra. The analogy is exact, since by virtue of the CRT the ring [{\bb Z}/N {\bb Z}] may be considered as the direct product [{\bb Z}/N_{1} {\bb Z} \times {\bb Z}/N_{2} {\bb Z} \times \ldots \times {\bb Z}/N_{d} {\bb Z}] via the two mutually inverse mappings:

  • (i) [{\ell} \;\longmapsto\; (\ell_{1}, \ell_{2}, \ldots, \ell_{d})] by [\ell \equiv \ell_{j}] mod [N_{j}] for each j;

  • (ii) [(\ell_{1}, \ell_{2}, \ldots, \ell_{d}) \;\longmapsto\; \ell \hbox { by } \ell = {\textstyle\sum_{i = 1}^{d}} \ell_{i} q_{i} Q_{i}\hbox{ mod } N].

The mapping defined by (ii)[link] is sometimes called the `CRT reconstruction' of [\ell] from the [\ell_{j}].

These two mappings have the property of sending sums to sums and products to products, i.e: [\displaylines{\quad (\hbox{i})\hfill {\ell} + {\ell}' \;\longmapsto\; ({\ell}_{1} + {\ell}'_{1}, {\ell}_{2} + {\ell}'_{2}, \ldots, {\ell}_{d} + {\ell}'_{d}) \hfill\cr \hfill {\ell} {\ell}' \;\longmapsto\; ({\ell}_{1} {\ell}'_{1}, {\ell}_{2} {\ell}'_{2}, \ldots, {\ell}_{d} {\ell}'_{d}) \quad\;\;\; \phantom{(\hbox{i})} \hfill\cr \quad (\hbox{ii}) \hfill ({\ell}_{1} + {\ell}'_{1}, {\ell}_{2} + {\ell}'_{2}, \ldots, {\ell}_{d} + {\ell}'_{d}) \;\longmapsto\; {\ell} + {\ell}' \;\;\hfill\cr \hfill ({\ell}_{1} {\ell}'_{1}, {\ell}_{2} {\ell}'_{2}, \ldots, {\ell}_{d} {\ell}'_{d}) \;\longmapsto\; {\ell} {\ell}' \quad \phantom{(\hbox{i})} \;\;\;\;\hfill}] (the last proof requires using the properties of the idempotents [q_{j} Q_{j}]). This may be described formally by stating that the CRT establishes a ring isomorphism: [{\bb Z}/N {\bb Z} \cong ({\bb Z}/N_{1} {\bb Z}) \times \ldots \times ({\bb Z}/N_{d} {\bb Z}).]

1.3.3.2.2.3. The prime factor algorithm

| top | pdf |

The CRT will now be used to factor the N-point DFT into a tensor product of d transforms, the jth of length [N_{j}].

Let the indices k and [k^{*}] be subjected to the following mappings:

  • (i) [k \;\longmapsto\; (k_{1}, k_{2}, \ldots, k_{d}), k_{j} \in {\bb Z}/N_{j} {\bb Z}], by [k_{j} \equiv k] mod [N_{j}] for each j, with reconstruction formula [k = {\textstyle\sum\limits_{i = 1}^{d}} \;k_{i} q_{i} Q_{i} \hbox{ mod } N\hbox{;}]

  • (ii) [k^{*} \;\longmapsto\; (k_{1}^{*}, k_{2}^{*}, \ldots, k_{d}^{*}), k_{j}^{*} \in {\bb Z}/N_{j} {\bb Z}], by [k_{j}^{*} \equiv q_{j} k^{*}] mod [N_{j}] for each j, with reconstruction formula [k^{*} = {\textstyle\sum\limits_{i = 1}^{d}} \;k_{i}^{*} Q_{i} \hbox{ mod } N.]

Then [\eqalign{k^{*} k &= \left({\textstyle\sum\limits_{i = 1}^{d}}\; k_{i}^{*} Q_{i}\right) \left({\textstyle\sum\limits_{j = 1}^{d}} \;k_{j} q_{j} Q_{j}\right) \hbox{ mod } N\cr &= {\textstyle\sum\limits_{i, \, j = 1}^{d}} k_{i}^{*} k_{j} Q_{i} q_{j} Q_{j} \hbox{ mod } N.}] Cross terms with [i \neq j] vanish since they contain all the factors of N, hence [\eqalign{k^{*} k &= {\textstyle\sum\limits_{j = 1}^{d}}\; q_{j} Q_{j}^{2} k_{j}^{*} k_{j} \hbox{ mod } N\cr &= {\textstyle\sum\limits_{j = 1}^{d}} (1 - n_{j} N_{j}) Q_{j} k_{j}^{*} k_{j} \hbox{ mod } N.}] Dividing by N, which may be written as [N_{j} Q_{j}] for each j, yields [\eqalign{{k^{*} k \over N} &= {\sum\limits_{j = 1}^{d}} (1 - n_{j} N_{j}) {Q_{j} \over N_{j} Q_{j}} k_{j}^{*} k_{j} \hbox{ mod } 1\cr &= {\sum\limits_{j = 1}^{d}} \left({1 \over N_{j}} - n_{j}\right) k_{j}^{*} k_{j} \hbox{ mod } 1,}] and hence [{k^{*} k \over N} \equiv {\sum\limits_{j = 1}^{d}} {k_{j}^{*} k_{j} \over N_{j}} \hbox{ mod } 1.] Therefore, by the multiplicative property of [e(.)], [e \left({k^{*} k \over N}\right) \equiv \bigotimes\limits_{j = 1}^{d} e \left({k_{j}^{*} k_{j} \over N_{j}}\right).]

Let [{\bf X} \in L ({\bb Z}/N {\bb Z})] be described by a one-dimensional array [X(k)] indexed by k. The index mapping (i)[link] turns X into an element of [L ({\bb Z}/N_{1} {\bb Z} \times \ldots \times {\bb Z}/N_{d} {\bb Z})] described by a d-dimensional array [X (k_{1}, \ldots, k_{d})]; the latter may be transformed by [\bar{F} (N_{1}) \bigotimes \ldots \bigotimes \bar{F} (N_{d})] into a new array [X^{*} (k_{1}^{*}, k_{2}^{*}, \ldots, k_{d}^{*})]. Finally, the one-dimensional array of results [X^{*} (k^{*})] will be obtained by reconstructing [k^{*}] according to (ii)[link].
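
For two coprime factors this procedure reads as in the sketch below (Python with NumPy; names ours): the input map of (i) scatters X into an [N_{1} \times N_{2}] array, a tensor-product transform is applied, and the output map of (ii) gathers the results.

    import numpy as np

    def e(t): return np.exp(2j * np.pi * t)

    def fbar_matrix(N):
        k = np.arange(N)
        return e(np.outer(k, k) / N)              # [Fbar(N)]_{k*, k} = e(k* k / N)

    def fbar_good(X, N1, N2):
        """Prime factor algorithm for N = N1*N2 with g.c.d.(N1, N2) = 1: no twiddle factors."""
        N = N1 * N2
        q1, q2 = pow(N2, -1, N1), pow(N1, -1, N2)         # q_j Q_j = 1 mod N_j, with Q_1 = N2, Q_2 = N1
        x = np.empty((N1, N2), complex)
        for k in range(N):
            x[k % N1, k % N2] = X[k]                      # (i)  CRT input mapping
        xstar = fbar_matrix(N1) @ x @ fbar_matrix(N2).T   # tensor product of the two transforms
        return np.array([xstar[(q1 * ks) % N1, (q2 * ks) % N2]   # (ii) CRT output mapping
                         for ks in range(N)])

    X = np.random.default_rng(5).normal(size=20)
    k = np.arange(20)
    direct = np.array([np.sum(X * e(ks * k / 20)) for ks in range(20)])
    assert np.allclose(fbar_good(X, 4, 5), direct)

In contrast with the Cooley–Tukey sketch of Section 1.3.3.2.1, no twiddle factors appear between the two partial transforms.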

The prime factor algorithm, like the Cooley–Tukey algorithm, reindexes a 1D transform to turn it into d separate transforms, but the use of coprime factors and CRT index mapping leads to the further gain that no twiddle factors need to be applied between the successive transforms (see Good, 1971[link]). This makes up for the cost of the added complexity of the CRT index mapping.

The natural factorization of N for the prime factor algorithm is thus its factorization into prime powers: [\bar{F}(N)] is then the tensor product of separate transforms (one for each prime power factor [N_{j} = p_{j}^{\nu_{j}}]) whose results can be reassembled without twiddle factors. The repeated factors of [p_{j}] within each [N_{j}] must then be dealt with by another algorithm (e.g. Cooley–Tukey, which does require twiddle factors). Thus, the DFT on a prime number of points remains undecomposable.

1.3.3.2.3. The Rader algorithm

| top | pdf |

The previous two algorithms essentially reduce the calculation of the DFT on N points for N composite to the calculation of smaller DFTs on prime numbers of points, the latter remaining irreducible. However, Rader (1968)[link] showed that the p-point DFT for p an odd prime can itself be factored by invoking some extra arithmetic structure present in [{\bb Z} / p {\bb Z}].

1.3.3.2.3.1. N an odd prime

| top | pdf |

The ring [{\bb Z} / p {\bb Z} = \{0,1,2,\ldots,p - 1\}] has the property that its [p - 1] non-zero elements, called units, form a multiplicative group [U(p)]. In particular, all units [r \in U(p)] have a unique multiplicative inverse in [{\bb Z} / p {\bb Z}], i.e. a unit [s \in U(p)] such that [rs \equiv 1\hbox { mod } p]. This endows [{\bb Z} / p {\bb Z}] with the structure of a finite field.

Furthermore, [U(p)] is a cyclic group, i.e. consists of the successive powers [g^{m}\hbox{ mod } p] of a generator g called a primitive root mod p (such a g may not be unique, but it always exists). For instance, for [p = 7], [U(7) = \{1,2,3,4,5,6\}] is generated by [g = 3], whose successive powers mod 7 are: [g^{0} = 1, \quad g^{1} = 3, \quad g^{2} = 2, \quad g^{3} = 6, \quad g^{4} = 4, \quad g^{5} = 5] [see Apostol (1976[link]), Chapter 10].

The basis of Rader's algorithm is to bring to light a hidden regularity in the matrix [F(p)] by permuting the basis vectors [{\bf u}_{k}] and [{\bf v}_{k^{*}}] of [L({\bb Z} / p {\bb Z})] as follows: [\eqalign{{\bf u}'_{0} &= {\bf u}_{0} \cr {\bf u}'_{m} &= {\bf u}_{k} {\hbox to 12pt{}}\hbox{with } k = g^{m}, {\hbox to 15pt{}} m = 1, \ldots, p - 1\hbox{;} \cr {\bf v}'_{0} &= {\bf v}_{0} \cr {\bf v}'_{m^{*}} &= {\bf v}_{k^{*}} \quad \hbox{with } k^{*} = g^{m^{*}}, \quad m^{*} = 1, \ldots, p - 1\hbox{;}}] where g is a primitive root mod p.

With respect to these new bases, the matrix representing [\bar{F}(p)] will have the following elements: [\eqalign{\hbox{element } (0,0) &= 1 \cr \hbox{element } (0, m + 1) &= 1 \quad \hbox{for all } m = 0, \ldots, p - 2, \cr \hbox{element } (m^{*} + 1,0) &= 1 \quad \hbox{for all } m^{*} = 0, \ldots, p - 2, \cr \hbox{element } (m^{*} + 1, m + 1) &= e \left({k^{*}k \over p}\right) \cr &= e(g^{m^{*} + m}/p) \cr &\qquad \quad \hbox{for all } m, m^{*} = 0, \ldots, p - 2.}] Thus the `core' [\bar{C}(p)] of matrix [\bar{F}(p)], of size [(p - 1) \times (p - 1)], formed by the elements with two non-zero indices, has a so-called skew-circulant structure because element [(m^{*}, m)] depends only on [m^{*} + m]. Simplification may now occur because multiplication by [\bar{C}(p)] is closely related to a cyclic convolution. Introducing the notation [C(m) = e(g^{m}/p)] we may write the relation [{\bf Y}^{*} = \bar{F}(p){\bf Y}] in the permuted bases as [\eqalign{Y^{*} (0) &= {\textstyle\sum\limits_{k}} Y(k) \cr Y^{*} (m^{*} + 1) &= Y(0) + {\textstyle\sum\limits_{m = 0}^{p - 2}} C(m^{*} + m) Y(m + 1) \cr &= Y(0) + {\textstyle\sum\limits_{m = 0}^{p - 2}} C(m^{*} - m) Z(m) \cr &= Y(0) + ({\bf C} * {\bf Z}) (m^{*}), \quad m^{*} = 0, \ldots, p - 2,}] where Z is defined by [Z(m) = Y(p - m - 2)], [m = 0, \ldots, p - 2].

Thus [{\bf Y}^{*}] may be obtained by cyclic convolution of C and Z, which may for instance be calculated by [{\bf C} * {\bf Z} = F(p - 1) [\bar{F}(p - 1) [{\bf C}] \times \bar{F} (p - 1) [{\bf Z}]],] where × denotes the component-wise multiplication of vectors. Since p is odd, [p - 1] is always divisible by 2 and may even be highly composite. In that case, factoring [\bar{F} (p - 1)] by means of the Cooley–Tukey or Good methods leads to an algorithm of complexity p log p rather than [p^{2}] for [\bar{F}(p)]. An added bonus is that, because [g^{(p-1) / 2} = -1], the elements of [\bar{F} (p - 1) [{\bf C}]] can be shown to be either purely real or purely imaginary, which halves the number of real multiplications involved.
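
The following sketch (Python with NumPy) carries out [\bar{F}(7)] in this way; the reordering of the data into the vector fed to the cyclic convolution uses our own indexing, and np.fft is used only to perform the length-[(p - 1)] convolution.

    import numpy as np

    def e(t): return np.exp(2j * np.pi * t)

    p, g = 7, 3                                     # odd prime and a primitive root mod p
    q = p - 1
    units = [pow(g, m, p) for m in range(q)]        # g^0, ..., g^{p-2}: the units of Z/pZ

    def fbar_rader(X):
        """Fbar(p) via Rader reindexing: one cyclic convolution of length p - 1."""
        Xstar = np.empty(p, complex)
        Xstar[0] = X.sum()                                      # X*(0) = sum of all samples
        C = np.array([e(u / p) for u in units])                 # C(m) = e(g^m / p)
        a = np.array([X[u] for u in units])                     # data indexed by the powers of g
        z = np.array([a[(-m) % q] for m in range(q)])           # reorder so that the sum becomes C * z
        conv = np.fft.ifft(np.fft.fft(C) * np.fft.fft(z))       # cyclic convolution by length-(p-1) FFTs
        for mstar in range(q):
            Xstar[units[mstar]] = X[0] + conv[mstar]            # X*(g^{m*}) = X(0) + sum_m C(m*+m) X(g^m)
        return Xstar

    X = np.random.default_rng(6).normal(size=p)
    k = np.arange(p)
    direct = np.array([np.sum(X * e(ks * k / p)) for ks in range(p)])
    assert np.allclose(fbar_rader(X), direct)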

1.3.3.2.3.2. N a power of an odd prime

| top | pdf |

This idea was extended by Winograd (1976[link], 1978[link]) to the treatment of prime powers [N = p^{\nu}], using the cyclic structure of the multiplicative group of units [U(p^{\nu})]. The latter consists of all those elements of [{\bb Z} / p^{\nu} {\bb Z}] which are not divisible by p, and thus has [q_{\nu} = p^{\nu - 1} (p - 1)] elements. It is cyclic, and there exist primitive roots g modulo [p^{\nu}] such that [U(p^{\nu}) = \{1, g, g^{2}, g^{3}, \ldots, g^{q_{\nu} - 1}\}.] The [p^{\nu - 1}] elements divisible by p, which are divisors of zero, have to be treated separately just as 0 had to be treated separately for [N = p].

When [k^{*} \not\in U(p^{\nu})], then [k^{*} = pk_{1}^{*}] with [k_{1}^{*} \in {\bb Z} / p^{\nu - 1} {\bb Z}]. The results [X^{*} (pk_{1}^{*})] are p-decimated, hence can be obtained via the [p^{\nu - 1}]-point DFT of the [p^{\nu - 1}]-periodized data Y: [X^{*} (pk_{1}^{*}) = \bar{F} (p^{\nu - 1}) [{\bf Y}] (k_{1}^{*})] with [Y(k_{1}) = {\textstyle\sum\limits_{k_{2} \in {\bb Z} / p {\bb Z}}} X(k_{1} + p^{\nu - 1} k_{2}).]

When [k^{*} \in U(p^{\nu})], then we may write [X^{*} (k^{*}) = X_{0}^{*} (k^{*}) + X_{1}^{*} (k^{*}),] where [{\bf X}_{0}^{*}] contains the contributions from [k\; \notin\; U(p^{\nu})] and [{\bf X}_{1}^{*}] those from [k \in U(p^{\nu})]. By a converse of the previous calculation, [{\bf X}_{0}^{*}] arises from p-decimated data Z, hence is the [p^{\nu - 1}]-periodization of the [p^{\nu - 1}]-point DFT of these data: [X_{0}^{*} (p^{\nu - 1} k_{1}^{*} + k_{2}^{*}) = \bar{F} (p^{\nu - 1}) [{\bf Z}] (k_{2}^{*})] with [Z(k_{2}) = X(pk_{2}), \qquad k_{2} \in {\bb Z} / p^{\nu - 1} {\bb Z}] (the [p^{\nu - 1}]-periodicity follows implicitly from the fact that the transform on the right-hand side is independent of [k_{1}^{*} \in {\bb Z} / p {\bb Z}]).

Finally, the contribution [X_{1}^{*}] from all [k \in U(p^{\nu})] may be calculated by reindexing by the powers of a primitive root g modulo [p^{\nu}], i.e. by writing [X_{1}^{*} (g^{m^{*}}) = {\textstyle\sum\limits_{m = 0}^{q_{\nu} - 1}} X(g^{m}) e(g^{m^{*} + m} / p^{\nu})] then carrying out the multiplication by the skew-circulant matrix core as a convolution.

Thus the DFT of size [p^{\nu}] may be reduced to two DFTs of size [p^{\nu - 1}] (dealing, respectively, with p-decimated results and p-decimated data) and a convolution of size [q_{\nu} = p^{\nu - 1} (p - 1)]. The latter may be `diagonalized' into a multiplication by purely real or purely imaginary numbers (because [g^{(q_{\nu} / 2)} = -1]) by two DFTs, whose factoring in turn leads to DFTs of size [p^{\nu - 1}] and [p - 1]. This method, applied recursively, allows the complete decomposition of the DFT on [p^{\nu}] points into arbitrarily small DFTs.

1.3.3.2.3.3. N a power of 2

| top | pdf |

When [N = 2^{\nu}], the same method can be applied, except for a slight modification in the calculation of [{\bf X}_{1}^{*}]. There is no primitive root modulo [2^{\nu}] for [\nu \gt 2]: the group [U(2^{\nu})] is the direct product of two cyclic groups, the first (of order 2) generated by −1, the second (of order [N/4]) generated by 3 or 5. One then uses a representation [\eqalign{k &= (-1)^{m_{1}} 5^{m_{2}} \cr k^{*} &= (-1)^{m_{1}^{*}} 5^{m_{2}^{*}}}] and the reindexed core matrix gives rise to a two-dimensional convolution. The latter may be carried out by means of two 2D DFTs on [2 \times (N/4)] points.

1.3.3.2.4. The Winograd algorithms

| top | pdf |

The cyclic convolutions generated by Rader's multiplicative reindexing may be evaluated more economically than through DFTs if they are re-examined within a new algebraic setting, namely the theory of congruence classes of polynomials [see, for instance, Blahut (1985[link]), Chapter 2; Schroeder (1986[link]), Chapter 24].

The set, denoted [{\bb K}[X]], of polynomials in one variable with coefficients in a given field [{\bb K}] has many of the formal properties of the set [{\bb Z}] of rational integers: it is a ring with no zero divisors and has a Euclidean algorithm on which a theory of divisibility can be built.

Given a polynomial [P(z)], then for every [H(z)] there exist unique polynomials [Q(z)] and [R(z)] such that [H(z) = P(z) Q(z) + R(z)] and [\hbox{degree } (R) \;\lt\; \hbox{degree } (P).] [R(z)] is called the residue of [H(z)] modulo [P(z)]. Two polynomials [H_{1}(z)] and [H_{2}(z)] having the same residue modulo [P(z)] are said to be congruent modulo [P(z)], which is denoted by [H_{1}(z) \equiv H_{2}(z) \hbox{ mod } P(z).]

If [H(z) \equiv 0\hbox{ mod } P(z),\; H(z)] is said to be divisible by [P(z)]. If [H(z)] only has divisors of degree zero in [{\bb K}[X]], it is said to be irreducible over [{\bb K}] (this notion depends on [{\bb K}]). Irreducible polynomials play in [{\bb K}[X]] a role analogous to that of prime numbers in [{\bb Z}], and any polynomial over [{\bb K}] has an essentially unique factorization as a product of irreducible polynomials.

There exists a Chinese remainder theorem (CRT) for polynomials. Let [P(z) = P_{1}(z) \ldots P_{d}(z)] be factored into a product of pairwise coprime polynomials [i.e. [P_{i}(z)] and [P_{j}(z)] have no common factor for [i \neq j]]. Then the system of congruence equations [H(z) \equiv H_{j}(z) \hbox{ mod } P_{j}(z), \quad j = 1, \ldots, d,] has a unique solution [H(z)] modulo [P(z)]. This solution may be constructed by a procedure similar to that used for integers. Let [Q_{j}(z) = P(z) / P_{j}(z) = {\textstyle\prod\limits_{i \neq j}} \;P_{i}(z).] Then [P_{j}] and [Q_{j}] are coprime, and the Euclidean algorithm may be used to obtain polynomials [p_{j}(z)] and [q_{j}(z)] such that [p_{j}(z) P_{j}(z) + q_{j}(z) Q_{j}(z) = 1.] With [S_{i}(z) = q_{i}(z) Q_{i}(z)], the polynomial [H(z) = {\textstyle\sum\limits_{i = 1}^{d}} \;S_{i}(z) H_{i}(z) \hbox{ mod } P(z)] is easily shown to be the desired solution.

As with integers, it can be shown that the 1:1 correspondence between [H(z)] and [H_{j}(z)] sends sums to sums and products to products, i.e. establishes a ring isomorphism: [{\bb K}[X] \hbox{ mod } P \cong ({\bb K}[X] \hbox{ mod } P_{1}) \times \ldots \times ({\bb K}[X] \hbox{ mod } P_{d}).]

These results will now be applied to the efficient calculation of cyclic convolutions. Let [{\bf U} = (u_{0}, u_{1}, \ldots, u_{N - 1})] and [{\bf V} = (v_{0}, v_{1}, \ldots, v_{N - 1})] be two vectors of length N, and let [{\bf W} = (w_{0}, w_{1}, \ldots, w_{N - 1})] be obtained by cyclic convolution of U and V: [w_{n} = {\textstyle\sum\limits_{m = 0}^{N - 1}} u_{m} v_{n - m}, \quad n = 0, \ldots, N - 1.] The very simple but crucial result is that this cyclic convolution may be carried out by polynomial multiplication modulo [(z^{N} - 1)]: if [\eqalign{U(z) &= {\textstyle\sum\limits_{l = 0}^{N - 1}} u_{l} z^{l} \cr V(z) &= {\textstyle\sum\limits_{m = 0}^{N - 1}} v_{m} z^{m} \cr W(z) &= {\textstyle\sum\limits_{n = 0}^{N - 1}} w_{n} z^{n}}] then the above relation is equivalent to [W(z) \equiv U(z) V(z) \hbox{ mod } (z^{N} - 1).] Now the polynomial [z^{N} - 1] can be factored over the field of rational numbers into irreducible factors called cyclotomic polynomials: if d is the number of divisors of N, including 1 and N, then [z^{N} - 1 = {\textstyle\prod\limits_{i = 1}^{d}} P_{i}(z),] where the cyclotomics [P_{i}(z)] are well known (Nussbaumer, 1981[link]; Schroeder, 1986[link], Chapter 22). We may now invoke the CRT, and exploit the ring isomorphism it establishes to simplify the calculation of [W(z)] from [U(z)] and [V(z)] as follows:

  • (i) compute the d residual polynomials [\eqalign{U_{i}(z) &\equiv U(z) \hbox{ mod } P_{i}(z), \quad i = 1, \ldots, d,\cr V_{i}(z) &\equiv V(z) \hbox{ mod } P_{i}(z), \quad i = 1, \ldots, d\hbox{;}}]

  • (ii) compute the d polynomial products [W_{i}(z) \equiv U_{i}(z) V_{i}(z) \hbox{ mod } P_{i}(z), \quad i = 1, \ldots, d\hbox{;}]

  • (iii) use the CRT reconstruction formula just proved to recover [W(z)] from the [W_{i} (z)]: [W (z) \equiv {\textstyle\sum\limits_{i = 1}^{d}} S_{i} (z) W_{i} (z) \hbox{ mod } (z^{N} - 1).]

When N is not too large, i.e. for `short cyclic convolutions', the [P_{i} (z)] are very simple, with coefficients 0 or ±1, so that (i)[link] only involves a small number of additions. Furthermore, special techniques have been developed to multiply general polynomials modulo cyclotomic polynomials, thus helping keep the number of multiplications in (ii)[link] and (iii)[link] to a minimum. As a result, cyclic convolutions can be calculated rapidly when N is sufficiently composite.
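
The underlying equivalence between cyclic convolution and polynomial multiplication modulo [(z^{N} - 1)] is checked by the short sketch below (Python with NumPy; names ours); the CRT splitting over the cyclotomic factors is not carried out here, the reduction modulo [(z^{N} - 1)] being simply the wrapping of coefficients.

    import numpy as np

    def cyclic(u, v):
        N = len(u)
        return np.array([sum(u[m] * v[(n - m) % N] for m in range(N)) for n in range(N)])

    def mult_mod_zN_minus_1(u, v):
        """Coefficients of U(z) V(z) reduced modulo z^N - 1 (z^{N+j} is replaced by z^j)."""
        N = len(u)
        full = np.convolve(u, v)                  # ordinary polynomial product, degree <= 2N - 2
        w = np.zeros(N)
        for j, c in enumerate(full):
            w[j % N] += c
        return w

    rng = np.random.default_rng(7)
    u = rng.integers(-5, 5, size=6).astype(float)
    v = rng.integers(-5, 5, size=6).astype(float)
    assert np.allclose(cyclic(u, v), mult_mod_zN_minus_1(u, v))
    # For N = 6, the cyclotomic factorization used in step (i) would be
    # z^6 - 1 = (z - 1)(z + 1)(z^2 + z + 1)(z^2 - z + 1).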

It will be recalled that Rader's multiplicative indexing often gives rise to cyclic convolutions of length [p - 1] for p an odd prime. Since [p - 1] is highly composite for all [p \leq 50] other than 23 and 47, these cyclic convolutions can be performed more efficiently by the above procedure than by DFT.

These combined algorithms are due to Winograd (1977[link], 1978[link], 1980[link]), and are known collectively as `Winograd small FFT algorithms'. Winograd also showed that they can be thought of as bringing the DFT matrix F to the following `normal form': [{\bf F} = {\bf CBA},] where

  • A is an integer matrix with entries 0, [\pm 1], defining the `pre-additions',

  • B is a diagonal matrix of multiplications,

  • C is a matrix with entries 0, [\pm 1], [\pm i], defining the `post-additions'.

The elements on the diagonal of B can be shown to be either real or pure imaginary, by the same argument as in Section 1.3.3.2.3.1.[link] Matrices A and C may be rectangular rather than square, so that intermediate results may require extra storage space.

1.3.3.3. Multidimensional algorithms

| top | pdf |

From an algorithmic point of view, the distinction between one-dimensional (1D) and multidimensional DFTs is somewhat blurred by the fact that some factoring techniques turn a 1D transform into a multidimensional one. The distinction made here, however, is a practical one and is based on the dimensionality of the indexing sets for data and results. This section will therefore be concerned with the problem of factoring the DFT when the indexing sets for the input data and output results are multidimensional.

1.3.3.3.1. The method of successive one-dimensional transforms

| top | pdf |

The DFT was defined in Section 1.3.2.7.4[link] in an n-dimensional setting and it was shown that when the decimation matrix N is diagonal, say [{\bf N} = \hbox{diag} (N^{(1)}, N^{(2)}, \ldots, N^{(n)})], then [\bar{F} ({\bf N})] has a tensor product structure: [\bar{F} ({\bf N}) = \bar{F} (N^{(1)}) \otimes \bar{F} (N^{(2)}) \otimes \ldots \otimes \bar{F} (N^{(n)}).] This may be rewritten as follows: [\eqalign{\bar{F} ({\bf N}) &= [\bar{F} (N^{(1)}) \otimes I_{N^{(2)}} \otimes \ldots \otimes I_{N^{(n)}}] \cr &\quad \times [I_{N^{(1)}} \otimes \bar{F} (N^{(2)}) \otimes \ldots \otimes I_{N^{(n)}}] \cr &\quad \times \ldots \cr &\quad \times [I_{N^{(1)}} \otimes I_{N^{(2)}} \otimes \ldots \otimes \bar{F} (N^{(n)})],}] where the I's are identity matrices and × denotes ordinary matrix multiplication. The matrix within each bracket represents a one-dimensional DFT along one of the n dimensions, the other dimensions being left untransformed. As these matrices commute, the order in which the successive 1D DFTs are performed is immaterial.

This is the most straightforward method for building an n-dimensional algorithm from existing 1D algorithms. It is known in crystallography under the name of `Beevers–Lipson factorization' (Section 1.3.4.3.1[link]), and in signal processing as the `row–column method'.
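
A literal sketch of this method for a diagonal decimation matrix follows (Python with NumPy; names ours): the array is transformed by a one-dimensional DFT along each axis in turn, and the result is checked against the full two-dimensional defining sum.

    import numpy as np

    def e(t): return np.exp(2j * np.pi * t)

    def fbar_matrix(N):
        k = np.arange(N)
        return e(np.outer(k, k) / N)             # [Fbar(N)]_{k*, k} = e(k* k / N)

    def fbar_row_column(X):
        """Successive 1D transforms along each axis (Beevers-Lipson / row-column factorization)."""
        out = np.asarray(X, complex)
        for axis in range(out.ndim):
            out = np.apply_along_axis(lambda v: fbar_matrix(len(v)) @ v, axis, out)
        return out

    X = np.random.default_rng(8).normal(size=(3, 4))
    N1, N2 = X.shape
    direct = np.array([[sum(X[k1, k2] * e(k1s * k1 / N1 + k2s * k2 / N2)
                            for k1 in range(N1) for k2 in range(N2))
                        for k2s in range(N2)] for k1s in range(N1)])
    assert np.allclose(fbar_row_column(X), direct)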

1.3.3.3.2. Multidimensional factorization

| top | pdf |

Substantial reductions in the arithmetic cost, as well as gains in flexibility, can be obtained if the factoring of the DFT is carried out in several dimensions simultaneously. The presentation given here is a generalization of that of Mersereau & Speake (1981)[link], using the abstract setting established independently by Auslander, Tolimieri & Winograd (1982)[link].

Let us return to the general n-dimensional setting of Section 1.3.2.7.4[link], where the DFT was defined for an arbitrary decimation matrix N by the formulae (where [|{\bf N}|] denotes [| \hbox{det }{\bf N}|]): [\eqalign{F ({\bf N})&:\quad X ({\bf k}) \;\;\;= {1 \over |{\bf N}|} {\sum\limits_{{\bf k}^{*}}} \;X^{*} ({\bf k}^{*}) e[-{\bf k}^{*} \cdot ({\bf N}^{-1} {\bf k})] \cr \bar{F} ({\bf N})&:\quad X^{*} ({\bf k}^{*}) = \phantom{{1 \over |{\bf N}|}} {\sum\limits_{{\bf k}}} \;X ({\bf k}) e[{\bf k}^{*} \cdot ({\bf N}^{-1} {\bf k})]}] with [{\bf k} \in {\bb Z}^{n} / {\bf N} {\bb Z}^{n},\quad {\bf k}^{*} \in {\bb Z}^{n} / {\bf N}^{T} {\bb Z}^{n}.]

1.3.3.3.2.1. Multidimensional Cooley–Tukey factorization

| top | pdf |

Let us now assume that this decimation can be factored into d successive decimations, i.e. that [{\bf N} = {\bf N}_{1} {\bf N}_{2} \ldots {\bf N}_{d-1} {\bf N}_{d}] and hence [{\bf N}^{T} = {\bf N}_{d}^{T} {\bf N}_{d - 1}^{T} \ldots {\bf N}_{2}^{T} {\bf N}_{1}^{T}.] Then the coset decomposition formulae corresponding to these successive decimations (Section 1.3.2.7.1[link]) can be combined as follows: [\eqalign{{\bb Z}^{n} &= \bigcup_{{\bf k}_{1}}\; ({\bf k}_{1} + {\bf N}_{1} {\bb Z}^{n}) \cr &= \bigcup_{{\bf k}_{1}}\; \left\{{\bf k}_{1} + {\bf N}_{1} \left[\bigcup_{{\bf k}_{2}}\; ({\bf k}_{2} + {\bf N}_{2} {\bb Z}^{n})\right]\right\} \cr &= \ldots \cr &= \bigcup_{{\bf k}_{1}} \ldots \bigcup_{{\bf k}_{d}}\; ({\bf k}_{1} + {\bf N}_{1} {\bf k}_{2} + \ldots + {\bf N}_{1} {\bf N}_{2} \times \ldots \times {\bf N}_{d - 1} {\bf k}_{d} + {\bf N} {\bb Z}^{n})}] with [{\bf k}_{j} \in {\bb Z}^{n} / {\bf N}_{j} {\bb Z}^{n}]. Therefore, any [{\bf k} \in {\bb Z}^{n} / {\bf N} {\bb Z}^{n}] may be written uniquely as [{\bf k} = {\bf k}_{1} + {\bf N}_{1} {\bf k}_{2} + \ldots + {\bf N}_{1} {\bf N}_{2} \times \ldots \times {\bf N}_{d - 1} {\bf k}_{d}.] Similarly: [\eqalign{{\bb Z}^{n} &= \bigcup_{{\bf k}_{d}^{*}}\; ({\bf k}_{d}^{*} + {\bf N}_{d}^{T} {\bb Z}^{n}) \cr &= \ldots \cr &= \bigcup_{{\bf k}_{d}^{*}} \ldots \bigcup_{{\bf k}_{1}^{*}} \;({\bf k}_{d}^{*} + {\bf N}_{d}^{T} {\bf k}_{d - 1}^{*} + \ldots + {\bf N}_{d}^{T} \times \ldots \times {\bf N}_{2}^{T} {\bf k}_{1}^{*} \cr &\quad + {\bf N}^{T} {\bb Z}^{n})}] so that any [{\bf k}^{*} \in {\bb Z}^{n} / {\bf N}^{T} {\bb Z}^{n}] may be written uniquely as [{\bf k}^{*} = {\bf k}_{d}^{*} + {\bf N}_{d}^{T} {\bf k}_{d - 1}^{*} + \ldots + {\bf N}_{d}^{T} \times \ldots \times {\bf N}_{2}^{T} {\bf k}_{1}^{*}] with [{\bf k}_{j}^{*} \in {\bb Z}^{n} / {\bf N}_{j}^{T} {\bb Z}^{n}]. These decompositions are the vector analogues of the multi-radix number representation systems used in the Cooley–Tukey factorization.

We may then write the definition of [\bar{F} ({\bf N})] with [d = 2] factors as [\eqalign{X^{*} ({\bf k}_{2}^{*} + {\bf N}_{2}^{T} {\bf k}_{1}^{*}) &= {\textstyle\sum\limits_{{\bf k}_{1}}} {\textstyle\sum\limits_{{\bf k}_{2}}}\; X ({\bf k}_{1} + {\bf N}_{1} {\bf k}_{2}) \cr &\quad \times e[({\bf k}_{2}^{*T} + {\bf k}_{1}^{*T}{\bf N}_2) {\bf N}_{2}^{-1} {\bf N}_{1}^{-1} ({\bf k}_{1} + {\bf N}_{1} {\bf k}_{2})].}] The argument of e(–) may be expanded as [{\bf k}_{2}^{*} \cdot ({\bf N}^{-1} {\bf k}_{1}) + {\bf k}_{1}^{*} \cdot ({\bf N}_{1}^{-1} {\bf k}_{1}) + {\bf k}_{2}^{*} \cdot ({\bf N}_{2}^{-1} {\bf k}_{2}) + {\bf k}_{1}^{*} \cdot {\bf k}_{2}.] The first summand may be recognized as a twiddle factor, the second and third as the kernels of [\bar{F} ({\bf N}_{1})] and [\bar{F} ({\bf N}_{2})], respectively, while the fourth is an integer which may be dropped. We are thus led to a `vector-radix' version of the Cooley–Tukey algorithm, in which the successive decimations may be introduced in all n dimensions simultaneously by general integer matrices. The computation may be decomposed into five stages analogous to those of the one-dimensional algorithm of Section 1.3.3.2.1[link]:

  • (i) form the [|{\bf N}_{1}|] vectors [{\bf Y}_{{\bf k}_{1}}] of shape [{\bf N}_{2}] by [Y_{{\bf k}_{1}} ({\bf k}_{2}) = X ({\bf k}_{1} + {\bf N}_{1} {\bf k}_{2}),\quad {\bf k}_{1} \in {\bb Z}^{n} / {\bf N}_{1} {\bb Z}^{n},\quad {\bf k}_{2} \in {\bb Z}^{n} / {\bf N}_{2} {\bb Z}^{n}\hbox{;}]

  • (ii) calculate the [|{\bf N}_{1}|] transforms [{\bf Y}_{{\bf k}_{1}}^{*}] on [|{\bf N}_{2}|] points: [Y_{{\bf k}_{1}}^{*} ({\bf k}_{2}^{*}) = {\textstyle\sum\limits_{{\bf k}_{2}}}\; e[{\bf k}_{2}^{*} \cdot ({\bf N}_{2}^{-1} {\bf k}_{2})] Y_{{\bf k}_{1}} ({\bf k}_{2}),\quad {\bf k}_{1} \in {\bb Z}^{n} / {\bf N}_{1} {\bb Z}^{n}\hbox{;}]

  • (iii) form the [|{\bf N}_{2}|] vectors [{\bf Z}_{{\bf k}_{2}^{*}}] of shape [{\bf N}_{1}] by [\displaylines{Z_{{\bf k}_{2}^{*}} ({\bf k}_{1}) = e[{\bf k}_{2}^{*} \cdot ({\bf N}^{-1} {\bf k}_{1})] Y_{{\bf k}_{1}}^{*} ({\bf k}_{2}^{*}),\quad {\bf k}_{1} \in {\bb Z}^{n} / {\bf N}_{1} {\bb Z}^{n},\cr {\bf k}_{2}^{*} \in {\bb Z}^{n} / {\bf N}_{2}^{T} {\bb Z}^{n}\hbox{;}}]

  • (iv) calculate the [|{\bf N}_{2}|] transforms [{\bf Z}_{{\bf k}_{2}^{*}}^{*}] on [|{\bf N}_{1}|] points: [Z_{{\bf k}_{2}^{*}}^{*} ({\bf k}_{1}^{*}) = {\textstyle\sum\limits_{{\bf k}_{1}}}\; e[{\bf k}_{1}^{*} \cdot ({\bf N}_{1}^{-1} {\bf k}_{1})] Z_{{\bf k}_{2}^{*}} ({\bf k}_{1}),\quad {\bf k}_{2}^{*} \in {\bb Z}^{n} / {\bf N}_{2}^{T} {\bb Z}^{n}\hbox{;}]

  • (v) collect [X^{*} ({\bf k}_{2}^{*} + {\bf N}_{2}^{T} {\bf k}_{1}^{*})] as [Z_{{\bf k}_{2}^{*}}^{*} ({\bf k}_{1}^{*})].

The initial [|{\bf N}|]-point transform [\bar{F} ({\bf N})] can thus be performed as [|{\bf N}_{1}|] transforms [\bar{F} ({\bf N}_{2})] on [|{\bf N}_{2}|] points, followed by [|{\bf N}_{2}|] transforms [\bar{F} ({\bf N}_{1})] on [|{\bf N}_{1}|] points. This process can be applied successively to all d factors. The same decomposition applies to [F ({\bf N})], up to the complex conjugation of twiddle factors, the normalization factor [1 / |{\bf N}|] being obtained as the product of the factors [1 / |{\bf N}_{j}|] in the successive partial transforms [F ({\bf N}_{j})].

The geometric interpretation of this factorization in terms of partial transforms on translates of sublattices applies in full to this n-dimensional setting; in particular, the twiddle factors are seen to be related to the residual translations which place the sublattices in register within the big lattice. If the intermediate transforms are performed in place, then the quantity [X^{*} ({\bf k}_{d}^{*} + {\bf N}_{d}^{T} {\bf k}_{d - 1}^{*} + \ldots + {\bf N}_{d}^{T} {\bf N}_{d - 1}^{T} \times \ldots \times {\bf N}_{2}^{T} {\bf k}_{1}^{*})] will eventually be found at location [{\bf k}_{1}^{*} + {\bf N}_{1} {\bf k}_{2}^{*} + \ldots + {\bf N}_{1} {\bf N}_{2} \times \ldots \times {\bf N}_{d - 1} {\bf k}_{d}^{*},] so that the final results will have to be unscrambled by a process which may be called `coset reversal', the vector equivalent of digit reversal.

Factoring by 2 in all n dimensions simultaneously, i.e. taking [{\bf N} = 2{\bf M}], leads to `n-dimensional butterflies'. Decimation in time corresponds to the choice [{\bf N}_{1} = 2{\bf I}, {\bf N}_{2} = {\bf M}], so that [{\bf k}_{1} \in {\bb Z}^{n} / 2{\bb Z}^{n}] is an n-dimensional parity class; the calculation then proceeds by [\displaylines{Y_{{\bf k}_{1}} ({\bf k}_{2}) = X ({\bf k}_{1} + 2{\bf k}_{2}),\quad{\bf k}_{1} \in {\bb Z}^{n} / 2{\bb Z}^{n},\quad {\bf k}_{2} \in {\bb Z}^{n} / {\bf M}{\bb Z}^{n}, \cr Y_{{\bf k}_{1}}^{*} = \bar{F} ({\bf M}) [{\bf Y}_{{\bf k}_{1}}],\quad{\bf k}_{1} \in {\bb Z}^{n} / 2{\bb Z}^{n}\hbox{;} \cr \eqalign{X^{*} ({\bf k}_{2}^{*} + {\bf M}^{T} {\bf k}_{1}^{*}) &= {\textstyle\sum\limits_{{\bf k}_{1} \in {\bb Z}^{n} / 2{\bb Z}^{n}}} (-1)^{{\bf k}_{1}^{*} \cdot {\bf k}_{1}} \cr &\quad \times e[{\bf k}_{2}^{*} \cdot ({\bf N}^{-1} {\bf k}_{1})] Y_{{\bf k}_{1}}^{*} ({\bf k}_{2}^{*}).}\cr}] Decimation in frequency corresponds to the choice [{\bf N}_{1} = {\bf M}], [{\bf N}_{2} = 2{\bf I}], so that [{\bf k}_{2} \in {\bb Z}^{n} / 2{\bb Z}^{n}] labels `octant' blocks of shape M; the calculation then proceeds through the following steps: [\eqalign{Z_{{\bf k}_{2}^{*}} ({\bf k}_{1}) &= \left[{\textstyle\sum\limits_{{\bf k}_{2} \in {\bb Z}^{n} / 2{\bb Z}^{n}}} (-1)^{{\bf k}_{2}^{*} \cdot {\bf k}_{2}} X ({\bf k}_{1} + {\bf M}{\bf k}_{2})\right] \cr &\quad \times e[{\bf k}_{2}^{*} \cdot ({\bf N}^{-1} {\bf k}_{1})], \cr {\bf Z}_{{\bf k}_{2}^{*}}^{*} &= \bar{F} ({\bf M}) [{\bf Z}_{{\bf k}_{2}^{*}}], \cr X^{*} ({\bf k}_{2}^{*} + 2{\bf k}_{1}^{*}) &= Z_{{\bf k}_{2}^{*}}^{*} ({\bf k}_{1}^{*}),}] i.e. the [2^{n}] parity classes of results, corresponding to the different [{\bf k}_{2}^{*} \in {\bb Z}^{n} / 2{\bb Z}^{n}], are obtained separately. When the dimension n is 2 and the decimating matrix is diagonal, this analysis reduces to the `vector radix FFT' algorithms proposed by Rivard (1977)[link] and Harris et al. (1977)[link]. These lead to substantial reductions in the number M of multiplications compared to the row–column method: M is reduced to [3M/4] by simultaneous [2 \times 2] factoring, and to [15M/32] by simultaneous [4 \times 4] factoring.

The use of a non-diagonal decimating matrix may bring savings in computing time if the spectrum of the band-limited function under study is of such a shape as to pack more compactly in a non-rectangular than in a rectangular lattice (Mersereau, 1979[link]). If, for instance, the support K of the spectrum Φ is contained in a sphere, then a decimation matrix producing a close packing of these spheres will yield an aliasing-free DFT algorithm with fewer sample points than the standard algorithm using a rectangular lattice.

1.3.3.3.2.2. Multidimensional prime factor algorithm

| top | pdf |

Suppose that the decimation matrix N is diagonal [{\bf N} = \hbox{diag } (N^{(1)}, N^{(2)}, \ldots, N^{(n)})] and let each diagonal element be written in terms of its prime factors: [N^{(i)} = {\textstyle\prod\limits_{j = 1}^{m}} \;p_{j}^{\nu (i, \, \;j)},] where m is the total number of distinct prime factors present in the [N^{(i)}].

The CRT may be used to turn each 1D transform along dimension i [(i = 1, \ldots, n)] into a multidimensional transform with a separate `pseudo-dimension' for each distinct prime factor of [N^{(i)}]; the number [\mu_{i}] of these pseudo-dimensions is equal to the cardinality of the set: [\{\;j \in \{1, \ldots, m\} | \nu (i,j) \gt 0\}.] The full n-dimensional transform thus becomes μ-dimensional, with [\mu = {\textstyle\sum_{i = 1}^{n}} \mu_{i}].

We may now permute the μ pseudo-dimensions so as to bring into contiguous position those corresponding to the same prime factor [p_{j}]; the m resulting groups of pseudo-dimensions are said to define `p-primary' blocks. The initial transform is now written as a tensor product of m p-primary transforms, where transform j is on [p_{j}^{\nu (1, \, \;j)} \times p_{j}^{\nu (2, \, j)} \times \ldots \times p_{j}^{\nu (n, \, j)}] points [by convention, dimension i is not transformed if [\nu (i,j) = 0]]. These p-primary transforms may be computed, for instance, by multidimensional Cooley–Tukey factorization (Section 1.3.3.3.1[link]), which is faster than the straightforward row–column method. The final results may then be obtained by reversing all the permutations used.

The extra gain with respect to the multidimensional Cooley–Tukey method is that there are no twiddle factors between p-primary pieces corresponding to different primes p.
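The reindexing that underlies this property can be made concrete with a small sketch (illustrative names and sizes only): for one dimension with [N = N_{1}N_{2}] and [N_{1}, N_{2}] coprime, the CRT input and output maps turn the N-point transform into an [N_{1} \times N_{2}] transform containing no twiddle factors.

```python
import numpy as np

def e(t):
    # e(t) = exp(2*pi*i*t)
    return np.exp(2j * np.pi * t)

N1, N2 = 4, 3                     # coprime factors, N = 12
N = N1 * N2
T1 = pow(N1, -1, N2)              # N1^{-1} mod N2
T2 = pow(N2, -1, N1)              # N2^{-1} mod N1

rng = np.random.default_rng(1)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# direct 1D transform with the e(+) kernel
X_direct = np.array([sum(x[n] * e(n * k / N) for n in range(N)) for k in range(N)])

# CRT maps:  n = (N2*n1 + N1*n2) mod N  and  k = (N2*T2*k1 + N1*T1*k2) mod N
Y = np.array([[x[(N2 * n1 + N1 * n2) % N] for n2 in range(N2)] for n1 in range(N1)])
Z = np.array([[sum(Y[n1, n2] * e(n1 * k1 / N1) * e(n2 * k2 / N2)
                   for n1 in range(N1) for n2 in range(N2))
               for k2 in range(N2)] for k1 in range(N1)])   # pure 2D DFT, no twiddles

X_pfa = np.zeros(N, dtype=complex)
for k1 in range(N1):
    for k2 in range(N2):
        X_pfa[(N2 * T2 * k1 + N1 * T1 * k2) % N] = Z[k1, k2]

assert np.allclose(X_pfa, X_direct)
```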

The case where N is not diagonal has been examined by Guessoum & Mersereau (1986)[link].

1.3.3.3.2.3. Nesting of Winograd small FFTs

| top | pdf |

Suppose that the CRT has been used as above to map an n-dimensional DFT to a μ-dimensional DFT. For each [\kappa = 1, \ldots, \mu] [κ runs over those pairs (i, j) such that [\nu (i,j) \gt 0]], the Rader/Winograd procedure may be applied to put the matrix of the κth 1D DFT in the CBA normal form of a Winograd small FFT. The full DFT matrix may then be written, up to permutation of data and results, as [\bigotimes_{\kappa = 1}^{\mu}({\bf C}_{\kappa} {\bf B}_{\kappa} {\bf A}_{\kappa}).]

A well known property of the tensor product of matrices allows this to be rewritten as [\left(\bigotimes_{\gamma = 1}^{\mu} {\bf C}_{\gamma}\right) \times \left(\bigotimes_{\beta = 1}^{\mu} {\bf B}_{\beta}\right) \times \left(\bigotimes_{\alpha = 1}^{\mu} {\bf A}_{\alpha}\right)] and thus to form a matrix in which the combined pre-addition, multiplication and post-addition matrices have been precomputed. This procedure, called nesting, can be shown to afford a reduction of the arithmetic operation count compared to the row–column method (Morris, 1978[link]).
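The rearrangement rests on the mixed-product property of the tensor (Kronecker) product, which is easily checked numerically; the sketch below (arbitrary matrix sizes, and with B not constrained to be diagonal as it would be in a true CBA normal form) verifies it with NumPy.

```python
import numpy as np

rng = np.random.default_rng(2)
C1, B1, A1 = (rng.standard_normal((3, 3)) for _ in range(3))
C2, B2, A2 = (rng.standard_normal((4, 4)) for _ in range(3))

# (C1 B1 A1) (x) (C2 B2 A2)  ==  (C1 (x) C2)(B1 (x) B2)(A1 (x) A2)
lhs = np.kron(C1 @ B1 @ A1, C2 @ B2 @ A2)
rhs = np.kron(C1, C2) @ np.kron(B1, B2) @ np.kron(A1, A2)
assert np.allclose(lhs, rhs)
```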

Clearly, the nesting rearrangement need not be applied to all μ dimensions, but can be restricted to any desired subset of them.

1.3.3.3.2.4. The Nussbaumer–Quandalle algorithm

| top | pdf |

Nussbaumer's approach views the DFT as the evaluation of certain polynomials constructed from the data (as in Section 1.3.3.2.4[link]). For instance, putting [\omega = e(1/N)], the 1D N-point DFT [X^{*}(k^{*}) = {\textstyle\sum\limits_{k = 0}^{N - 1}} X(k) \omega^{k^{*}k}] may be written [X^{*}(k^{*}) = Q(\omega^{k^{*}}),] where the polynomial Q is defined by [Q(z) = {\textstyle\sum\limits_{k = 0}^{N - 1}} X(k)z^{k}.]

Let us consider (Nussbaumer & Quandalle, 1979[link]) a 2D transform of size [N \times N]: [X^{*}(k_{1}^{*}, k_{2}^{*}) = {\textstyle\sum\limits_{k_{1} = 0}^{N - 1}}\; {\textstyle\sum\limits_{k_{2} = 0}^{N - 1}} X(k_{1}, k_{2}) \omega^{k_{1}^{*} k_{1} + k_{2}^{*} k_{2}}.] By introduction of the polynomials [\eqalign{Q_{k_{2}}(z) &= {\textstyle\sum\limits_{k_{1}}}\; X (k_{1}, k_{2})z^{k_{1}} \cr R_{k_{2}^{*}}(z) &= {\textstyle\sum\limits_{k_{2}}}\; \omega^{k_{2}^{*} k_{2}} Q_{k_{2}}(z),}] this may be rewritten: [X^{*}(k_{1}^{*}, k_{2}^{*}) = R_{k_{2}^{*}} (\omega^{k_{1}^{*}}) = {\textstyle\sum\limits_{k_{2}}}\; \omega^{k_{2}^{*} k_{2}} Q_{k_{2}} (\omega^{k_{1}^{*}}).]

Let us now suppose that [k_{1}^{*}] is coprime to N. Then [k_{1}^{*}] has a unique inverse modulo N (denoted by [1/k_{1}^{*}]), so that multiplication by [k_{1}^{*}] simply permutes the elements of [{\bb Z}/N {\bb Z}] and hence [{\textstyle\sum\limits_{k_{2} = 0}^{N - 1}} f(k_{2}) = {\textstyle\sum\limits_{k_{2} = 0}^{N - 1}} f(k_{1}^{*} k_{2})] for any function f over [{\bb Z}/N {\bb Z}]; in particular, relabelling the second output index by [k_{2}^{*} \;\longmapsto\; k_{1}^{*} k_{2}^{*}] merely permutes the results along each line [k_{1}^{*} = \hbox{constant}]. We may thus write: [\eqalign{X^{*}(k_{1}^{*}, k_{1}^{*} k_{2}^{*}) &= {\textstyle\sum\limits_{k_{2}}} \;\omega^{k_{1}^{*} k_{2}^{*} k_{2}} Q_{k_{2}} (\omega^{k_{1}^{*}}) \cr &= S_{k_{2}^{*}} (\omega^{k_{1}^{*}})}] where [S_{k^{*}}(z) = {\textstyle\sum\limits_{k_{2}}}\; z^{k^{*} k_{2}} Q_{k_{2}}(z).] Since only the value of polynomial [S_{k^{*}}(z)] at [z = \omega^{k_{1}^{*}}] is involved in the result, the computation of [S_{k^{*}}] may be carried out modulo the unique cyclotomic polynomial [P(z)] such that [P(\omega^{k_{1}^{*}}) = 0]. Thus, if we define: [T_{k^{*}}(z) = {\textstyle\sum\limits_{k_{2}}} \;z^{k^{*} k_{2}} Q_{k_{2}}(z) \hbox{ mod } P(z)] we may write: [X^{*}(k_{1}^{*}, k_{1}^{*} k_{2}^{*}) = T_{k_{2}^{*}} (\omega^{k_{1}^{*}})] or equivalently [X^{*} (k_{1}^{*}, k_{2}^{*}) = T_{k_{2}^{*}/k_{1}^{*}} (\omega^{k_{1}^{*}}).]

For N an odd prime p, all non-zero values of [k_{1}^{*}] are coprime with p so that the [p \times p]-point DFT may be calculated as follows:

  • (1) form the polynomials [T_{k_{2}^{*}}(z) = {\textstyle\sum\limits_{k_{1}}} {\textstyle\sum\limits_{k_{2}}} \;X(k_{1}, k_{2})z^{k_{1} + k_{2}^{*} k_{2}} \hbox{ mod } P(z)] for [k_{2}^{*} = 0, \ldots, p - 1];

  • (2) evaluate [T_{k_{2}^{*}} (\omega^{k_{1}^{*}})] for [k_{1}^{*} = 1, \ldots, p - 1];

  • (3) put [X^{*}(k_{1}^{*}, k_{1}^{*} k_{2}^{*}) = T_{k_{2}^{*}} (\omega^{k_{1}^{*}})];

  • (4) calculate the terms for [k_{1}^{*} = 0] separately by [X^{*}(0, k_{2}^{*}) = {\textstyle\sum\limits_{k_{2}}} \left[{\textstyle\sum\limits_{k_{1}}} \;X(k_{1}, k_{2})\right] \omega^{k_{2}^{*} k_{2}}.]

Step (1)[link] is a set of p `polynomial transforms' involving no multiplications; step (2)[link] consists of p DFTs on p points each since if [T_{k_{2}^{*}}(z) = {\textstyle\sum\limits_{k_{1}}}\; Y_{k_{2}^{*}}(k_{1})z^{k_{1}}] then [T_{k_{2}^{*}} (\omega^{k_{1}^{*}}) = {\textstyle\sum\limits_{k_{1}}} \;Y_{k_{2}^{*}}(k_{1}) \omega^{k_{1}^{*} k_{1}} = Y_{k_{2}^{*}}^{*}(k_{1}^{*})\hbox{;}] step (3)[link] is a permutation; and step (4)[link] is a p-point DFT. Thus the 2D DFT on [p \times p] points, which takes 2p p-point DFTs by the row–column method, involves only [(p + 1)] p-point DFTs; the other DFTs have been replaced by polynomial transforms involving only additions.
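A compact numerical sketch of this procedure is given below (illustrative only); for simplicity the polynomial arithmetic is carried out modulo [z^{p} - 1] rather than modulo the cyclotomic polynomial [P(z)], which is legitimate since the polynomials are only ever evaluated at pth roots of unity, and the result is checked against a direct [p \times p] transform.

```python
import numpy as np

p = 5
w = np.exp(2j * np.pi / p)                       # omega = e(1/p)
rng = np.random.default_rng(3)
X = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))

# step (1): polynomial transforms (additions only); T[k2s, m] is the
# coefficient of z^m in T_{k2*}(z), exponents reduced via z^p = 1
T = np.zeros((p, p), dtype=complex)
for k2s in range(p):
    for k1 in range(p):
        for k2 in range(p):
            T[k2s, (k1 + k2s * k2) % p] += X[k1, k2]

Xstar = np.zeros((p, p), dtype=complex)
# steps (2)-(3): p-point DFTs of the coefficient vectors, then a permutation
for k2s in range(p):
    for k1s in range(1, p):
        Xstar[k1s, (k1s * k2s) % p] = sum(T[k2s, m] * w ** (k1s * m) for m in range(p))
# step (4): the k1* = 0 line is one ordinary p-point DFT of the column sums
col = X.sum(axis=0)
for k2s in range(p):
    Xstar[0, k2s] = sum(col[k2] * w ** (k2s * k2) for k2 in range(p))

# check against the direct 2D DFT with the omega^(k1*.k1 + k2*.k2) kernel
direct = np.array([[sum(X[k1, k2] * w ** (k1s * k1 + k2s * k2)
                        for k1 in range(p) for k2 in range(p))
                    for k2s in range(p)] for k1s in range(p)])
assert np.allclose(Xstar, direct)
```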

This procedure can be extended to n dimensions, and reduces the number of 1D p-point DFTs from [np^{n - 1}] for the row–column method to [(p^{n} -1)/(p - 1)], at the cost of introducing extra additions in the polynomial transforms.

A similar algorithm has been formulated by Auslander et al. (1983)[link] in terms of Galois theory.

1.3.3.3.3. Global algorithm design

| top | pdf |

1.3.3.3.3.1. From local pieces to global algorithms

| top | pdf |

The mathematical analysis of the structure of DFT computations has brought to light a broad variety of possibilities for reducing or reshaping their arithmetic complexity. All of them are `analytic' in that they break down large transforms into a succession of smaller ones.

These results may now be considered from the converse `synthetic' viewpoint as providing a list of procedures for assembling them:

  • (i) the building blocks are one-dimensional p-point algorithms for p a small prime;

  • (ii) the low-level connectors are the multiplicative reindexing methods of Rader and Winograd, or the polynomial transform reindexing method of Nussbaumer and Quandalle, which allow the construction of efficient algorithms for larger primes p, for prime powers [p^{\nu}], and for p-primary pieces of shape [p^{\nu} \times \ldots \times p^{\nu}];

  • (iii) the high-level connectors are the additive reindexing scheme of Cooley–Tukey, the Chinese remainder theorem reindexing, and the tensor product construction;

  • (iv) nesting may be viewed as the `glue' which seals all elements.

A given DFT may then be assembled into a global algorithm in many different ways. The diagrams in Fig. 1.3.3.1[link] illustrate a few of the options available to compute a 400-point DFT. They may differ greatly in their arithmetic operation counts.


Figure 1.3.3.1 | top | pdf |

A few global algorithms for computing a 400-point DFT. CT: Cooley–Tukey factorization. PF: prime factor (or Good) factorization. W: Winograd algorithm.

1.3.3.3.3.2. Computer architecture considerations

| top | pdf |

To obtain a truly useful measure of the computational complexity of a DFT algorithm, its arithmetic operation count must be tempered by computer architecture considerations. Three main types of trade-offs must be borne in mind:

  • (i) reductions in floating-point (f.p.) arithmetic count are obtained by reindexing, hence at the cost of an increase in integer arithmetic on addresses, although some shortcuts may be found (Uhrich, 1969[link]; Burrus & Eschenbacher, 1981[link]);

  • (ii) reduction in the f.p. multiplication count usually leads to a large increase in the f.p. addition count (Morris, 1978[link]);

  • (iii) nesting can increase execution speed, but causes a loss of modularity and hence complicates program development (Silverman, 1977[link]; Kolba & Parks, 1977[link]).

Many of the mathematical developments above took place in the context of single-processor serial computers, where f.p. addition is substantially cheaper than f.p. multiplication but where integer address arithmetic has to compete with f.p. arithmetic for processor cycles. As a result, the alternatives to the Cooley–Tukey algorithm hardly ever led to particularly favourable trade-offs, thus creating the impression that there was little to gain by switching to more exotic algorithms.

The advent of new machine architectures with vector and/or parallel processing features has greatly altered this picture (Pease, 1968[link]; Korn & Lambiotte, 1979[link]; Fornberg, 1981[link]; Swarztrauber, 1984[link]):

  • (i) pipelining equalizes the cost of f.p. addition and f.p. multiplication, and the ideal `blend' of the two types of operations depends solely on the number of adder and multiplier units available in each machine;

  • (ii) integer address arithmetic is delegated to specialized arithmetic and logical units (ALUs) operating concurrently with the f.p. units, so that complex reindexing schemes may be used without loss of overall efficiency.

Another major consideration is that of data flow [see e.g. Nawab & McClellan (1979)[link]]. Serial machines have only a few registers and few paths connecting them, and allow little or no overlap between computation and data movement. New architectures, on the other hand, comprise banks of vector registers (or `cache memory') besides the usual internal registers, and dedicated ALUs can service data transfers between several of them simultaneously and concurrently with computation.

In this new context, the devices described in Sections 1.3.3.2[link] and 1.3.3.3[link] for altering the balance between the various types of arithmetic operations, and reshaping the data flow during the computation, are invaluable. The field of machine-dependent DFT algorithm design is thriving on them [see e.g. Temperton (1983a[link],b[link],c[link], 1985[link]); Agarwal & Cooley (1986[link], 1987[link])].

1.3.3.3.3.3. The Johnson–Burrus family of algorithms

| top | pdf |

In order to explore systematically all possible algorithms for carrying out a given DFT computation, and to pick the one best suited to a given machine, attempts have been made to develop:

  • (i) a high-level notation for describing all the ingredients of a DFT computation, including data permutation and data flow;

  • (ii) a formal calculus capable of operating on these descriptions so as to represent all possible reorganizations of the computation;

  • (iii) an automatic procedure for evaluating the performance of a given algorithm on a specific architecture.

Task (i)[link] can be accomplished by systematic use of a tensor product notation to represent the various stages into which the DFT can be factored (reindexing, small transforms on subsets of indices, twiddle factors, digit-reversal permutations).

Task (ii)[link] may for instance use the Winograd CBA normal form for each small transform, then apply the rules governing the rearrangement of tensor product [\bigotimes] and ordinary product × operations on matrices. The matching of these rearrangements to the architecture of a vector and/or parallel computer can be formalized algebraically [see e.g. Chapter 2 of Tolimieri et al. (1989)[link]].

Task (iii)[link] is a complex search which requires techniques such as dynamic programming (Bellman, 1958[link]).

Johnson & Burrus (1983)[link] have proposed and tested such a scheme to identify the optimal trade-offs between prime factor nesting and Winograd nesting of small Winograd transforms. In step (ii)[link], they further decomposed the pre-addition matrix A and post-addition matrix C into several factors, so that the number of design options available becomes very large: the N-point DFT when N has four factors can be calculated in over [10^{12}] distinct ways.

This large family of nested algorithms contains the prime factor algorithm and the Winograd algorithms as particular cases, but usually achieves greater efficiency than either by reducing the f.p. multiplication count while keeping the number of f.p. additions small.

There is little doubt that this systematic approach will be extended so as to incorporate all available methods of restructuring the DFT.

1.3.4. Crystallographic applications of Fourier transforms

| top | pdf |

1.3.4.1. Introduction

| top | pdf |

The central role of the Fourier transformation in X-ray crystallography is a consequence of the kinematic approximation used in the description of the scattering of X-rays by a distribution of electrons (Bragg, 1915[link]; Duane, 1925[link]; Havighurst, 1925a[link],b[link]; Zachariasen, 1945[link]; James, 1948a[link], Chapters 1 and 2; Lipson & Cochran, 1953[link], Chapter 1; Bragg, 1975[link]).

Let [\rho ({\bf X})] be the density of electrons in a sample of matter contained in a finite region V which is being illuminated by a parallel monochromatic X-ray beam with wavevector [{\bf K}_{0}]. Then the far-field amplitude scattered in a direction corresponding to wavevector [{\bf K} = {\bf K}_{0} + {\bf H}] is proportional to [\eqalign{F({\bf H}) &= {\textstyle\int\limits_{V}} \rho ({\bf X}) \exp (2\pi i{\bf H} \cdot {\bf X}) \;\hbox{d}^{3}{\bf X}\cr &= \bar{\scr F}[\rho]({\bf H})\cr &= \langle \rho_{\bf x}, \exp (2\pi i{\bf H} \cdot {\bf X})\rangle.}]

In certain model calculations, the `sample' may contain not only volume charges, but also point, line and surface charges. These singularities may be accommodated by letting ρ be a distribution, and writing [F({\bf H}) = \bar{\scr F}[\rho]({\bf H}) = \langle \rho_{\bf x}, \exp (2\pi i{\bf H} \cdot {\bf X})\rangle.] F is still a well behaved function (analytic, by Section 1.3.2.4.2.10[link]) because ρ has been assumed to have compact support.

If the sample is assumed to be an infinite crystal, so that ρ is now a periodic distribution, the customary limiting process by which it is shown that F becomes a discrete series of peaks at reciprocal-lattice points (see e.g. von Laue, 1936[link]; Ewald, 1940[link]; James, 1948a[link] p. 9; Lipson & Taylor, 1958[link], pp. 14–27; Ewald, 1962[link], pp. 82–101; Warren, 1969[link], pp. 27–30) is already subsumed under the treatment of Section 1.3.2.6[link].

1.3.4.2. Crystallographic Fourier transform theory

| top | pdf |

1.3.4.2.1. Crystal periodicity

| top | pdf |

1.3.4.2.1.1. Period lattice, reciprocal lattice and structure factors

| top | pdf |

Let ρ be the distribution of electrons in a crystal. Then, by definition of a crystal, ρ is Λ-periodic for some period lattice Λ (Section 1.3.2.6.5[link]) so that there exists a motif distribution [\rho^{0}] with compact support such that [\rho = R * \rho^{0},] where [R = {\textstyle\sum_{{\bf X}\in \Lambda}} \delta_{({\bf X})}]. The lattice Λ is usually taken to be the finest for which the above representation holds.

Let Λ have a basis [({\bf a}_{1}, {\bf a}_{2}, {\bf a}_{3})] over the integers, these basis vectors being expressed in terms of a standard orthonormal basis [({\bf e}_{1}, {\bf e}_{2}, {\bf e}_{3})] as [{\bf a}_{k} = {\textstyle\sum\limits_{j = 1}^{3}} a_{jk} {\bf e}_{j}.] Then the matrix [{\bf A} = \pmatrix{a_{11} &a_{12} &a_{13}\cr a_{21} &a_{22} &a_{23}\cr a_{31} &a_{32} &a_{33}\cr}] is the period matrix of Λ (Section 1.3.2.6.5[link]) with respect to the unit lattice with basis [({\bf e}_{1}, {\bf e}_{2}, {\bf e}_{3})], and the volume V of the unit cell is given by [V = |\det {\bf A}|].

By Fourier transformation [\bar{\scr F}[\rho] = R^{*} \times \bar{\scr F}[\rho^{0}],] where [R^{*} = {\textstyle\sum_{{\bf H}\in \Lambda^{*}}} \delta_{({\bf H})}] is the lattice distribution associated to the reciprocal lattice [\Lambda^{*}]. The basis vectors [({\bf a}_{1}^{*}, {\bf a}_{2}^{*}, {\bf a}_{3}^{*})] have coordinates in [({\bf e}_{1}, {\bf e}_{2}, {\bf e}_{3})] given by the columns of [({\bf A}^{-1})^{T}], whose expression in terms of the cofactors of A (see Section 1.3.2.6.5[link]) gives the familiar formulae involving the cross product of vectors for [n = 3]. The H-distribution F of scattered amplitudes may be written [F = \bar{\scr F}[\rho]_{{\bf H}} = {\textstyle\sum\limits_{{\bf H}\in \Lambda^{*}}} \bar{\scr F}[\rho^{0}]({\bf H})\delta_{({\bf H})} = {\textstyle\sum\limits_{{\bf H}\in \Lambda^{*}}} F_{{\bf H}}\delta_{({\bf H})}] and is thus a weighted reciprocal-lattice distribution, the weight [F_{{\bf H}}] attached to each node [{\bf H} \in \Lambda^{*}] being the value at H of the transform [\bar{\scr F}[\rho^{0}]] of the motif [\rho^{0}]. Taken in conjunction with the assumption that the scattering is elastic, i.e. that H only changes the direction but not the magnitude of the incident wavevector [{\bf K}_{0}], this result yields the usual forms (Laue or Bragg) of the diffraction conditions: [{\bf H} \in \Lambda^{*}], and simultaneously H lies on the Ewald sphere.

By the reciprocity theorem, [\rho^{0}] can be recovered if F is known for all [{\bf H} \in \Lambda^{*}] as follows [Section 1.3.2.6.5[link], e.g. (iv)]: [\rho_{\bf x} = {1 \over V} {\sum\limits_{{\bf H}\in \Lambda^{*}}} F_{{\bf H}} \exp (-2\pi i{\bf H} \cdot {\bf X}).]

These relations may be rewritten in terms of standard, or `fractional crystallographic', coordinates by putting [{\bf X} = {\bf Ax}, \quad {\bf H} = ({\bf A}^{-1})^{T}{\bf h},] so that a unit cell of the crystal corresponds to [{\bf x} \in {\bb R}^{3}/{\bb Z}^{3}], and that [{\bf h} \in {\bb Z}^{3}]. Defining [\rho\llap{$-\!$}] and [\rho\llap{$-\!$}^{0}] by [\rho = {1 \over V} A^{\#} \rho\llap{$-\!$}, \quad \rho^{0} = {1 \over V} A^{\#} \rho\llap{$-\!$}^{0}] so that [\rho ({\bf X}) \;\hbox{d}^{3}{\bf X} = \rho\llap{$-\!$} ({\bf x}) \;\hbox{d}^{3}{\bf x}, \quad \rho^{0} ({\bf X}) \;\hbox{d}^{3}{\bf X} = \rho\llap{$-\!$}^{0} ({\bf x}) \;\hbox{d}^{3}{\bf x},] we have [\eqalign{\bar{\scr F}[\rho\llap{$-\!$}]_{{\bf h}} &= {\textstyle\sum\limits_{{\bf h}\in {\bb Z}^{3}}} F({\bf h})\delta_{({\bf h})},\cr F({\bf h}) &= \langle \rho\llap{$-\!$}_{\bf x}^{0}, \exp (2\pi i{\bf h} \cdot {\bf x})\rangle\cr &= {\textstyle\int\limits_{{\bb R}^{3}/{\bb Z}^{3}}} \rho\llap{$-\!$}^{0} ({\bf x}) \exp (2\pi i{\bf h} \cdot {\bf x}) \;\hbox{d}^{3}{\bf x} \quad \hbox{if } \rho\llap{$-\!$}^{0} \in L_{\rm loc}^{1} ({\bb R}^{3}/{\bb Z}^{3}),\cr \rho\llap{$-\!$}_{\bf x} &= {\textstyle\sum\limits_{{\bf h}\in {\bb Z}^{3}}} F({\bf h}) \exp (-2\pi i{\bf h} \cdot {\bf x}).}] These formulae are valid for an arbitrary motif distribution [\rho\llap{$-\!$}^{0}], provided the convergence of the Fourier series for [\rho\llap{$-\!$}] is considered from the viewpoint of distribution theory (Section 1.3.2.6.10.3[link]).

The experienced crystallographer may notice the absence of the familiar factor [1/V] from the expression for [\rho\llap{$-\!$}] just given. This is because we use the (mathematically) natural unit for [\rho\llap{$-\!$}], the electron per unit cell, which matches the dimensionless nature of the crystallographic coordinates x and of the associated volume element [\hbox{d}^{3}{\bf x}]. The traditional factor [1/V] was the result of the somewhat inconsistent use of x as an argument but of [\hbox{d}^{3}{\bf X}] as a volume element to obtain ρ in electrons per unit volume (e.g. Å3). A fortunate consequence of the present convention is that nuisance factors of V or [1/V], which used to abound in convolution or scalar product formulae, are now absent.

It should be noted at this point that the crystallographic terminology regarding [{\scr F}] and [\bar{\scr F}] differs from the standard mathematical terminology introduced in Section 1.3.2.4.1[link] and applied to periodic distributions in Section 1.3.2.6.4[link]: F is the inverse Fourier transform of ρ rather than its Fourier transform, and the calculation of ρ is called a Fourier synthesis in crystallography even though it is mathematically a Fourier analysis. The origin of this discrepancy may be traced to the fact that the mathematical theory of the Fourier transformation originated with the study of temporal periodicity, while crystallography deals with spatial periodicity; since the expression for the phase factor of a plane wave is [\exp [2 \pi i(\nu t - {\bf K} \cdot {\bf X})]], the difference in sign between the contributions from time versus spatial displacements makes this conflict unavoidable.

1.3.4.2.1.2. Structure factors in terms of form factors

| top | pdf |

In many cases, [\rho\llap{$-\!$}^{0}] is a sum of translates of atomic electron-density distributions. Assume there are n distinct chemical types of atoms, with [N_{j}] identical isotropic atoms of type j described by an electron distribution [\rho\llap{$-\!$}_{j}] about their centre of mass. According to quantum mechanics each [\rho\llap{$-\!$}_{j}] is a smooth rapidly decreasing function of x, i.e. [\rho\llap{$-\!$}_{j} \in {\scr S}], hence [\rho\llap{$-\!$}^{0} \in {\scr S}] and (ignoring the effect of thermal agitation) [\rho\llap{$-\!$}^{0}({\bf x}) = {\textstyle\sum\limits_{j=1}^{n}} \left[{\textstyle\sum\limits_{k_{j}=1}^{N_{j}}} \rho\llap{$-\!$}_{j} ({\bf x} - {\bf x}_{k_{j}})\right],] which may be written (Section 1.3.2.5.8[link]) [\rho\llap{$-\!$}^{0} = {\textstyle\sum\limits_{j=1}^{n}} \left[\rho\llap{$-\!$}_{j} * \left({\textstyle\sum\limits_{k_{j}=1}^{N_{j}}} \delta_{({\bf x}_{k_{j}})}\right)\right].] By Fourier transformation: [F({\bf h}) = {\textstyle\sum\limits_{j=1}^{n}} \left\{\bar{\scr F}[\rho\llap{$-\!$}_{j}] ({\bf h}) \times \left[{\textstyle\sum\limits_{k_{j}=1}^{N_{j}}} \exp (2\pi i{\bf h} \cdot {\bf x}_{k_{j}})\right]\right\}.] Defining the form factor [f_{j}] of atom j as a function of h to be [f_{j}({\bf h}) = \bar{\scr F}[\rho\llap{$-\!$}_{j}] ({\bf h})] we have [F({\bf h}) = {\textstyle\sum\limits_{j=1}^{n}}\; f_{j}({\bf h}) \times \left[{\textstyle\sum\limits_{k_{j}=1}^{N_{j}}} \exp (2\pi i{\bf h} \cdot {\bf x}_{k_{j}})\right].] If [{\bf X} = {\bf Ax}] and [{\bf H} = ({\bf A}^{-1})^{T} {\bf h}] are the real- and reciprocal-space coordinates in Å and Å−1, and if [\rho_{j}(\|{\bf X}\|)] is the spherically symmetric electron-density function for atom type j, then [f_{j}({\bf H}) = \int\limits_{0}^{\infty} 4\pi \|{\bf X}\|^{2} \rho_{j} (\|{\bf X}\|) {\sin (2\pi \|{\bf H}\| \|{\bf X}\|) \over 2\pi \|{\bf H}\| \|{\bf X}\|} \;\hbox{d}\|{\bf X}\|.]
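As a small illustration (with entirely invented numbers and a crude single-Gaussian form factor standing in for the tabulated expansions), the structure-factor expression above may be evaluated directly:

```python
import numpy as np

def form_factor(h, A, a, b):
    # crude isotropic model f(|H|) = a*exp(-b*|H|^2/4), with H = (A^{-1})^T h
    H = np.linalg.inv(A).T @ np.asarray(h, dtype=float)
    return a * np.exp(-b * (H @ H) / 4.0)

def structure_factor(h, A, atom_types):
    # atom_types: list of (a, b, fractional positions x_kj) per chemical type j
    F = 0.0j
    for a, b, positions in atom_types:
        f = form_factor(h, A, a, b)
        F += f * sum(np.exp(2j * np.pi * np.dot(h, x)) for x in positions)
    return F

A = np.diag([10.0, 12.0, 15.0])                       # orthorhombic cell, in angstroms
atom_types = [(6.0, 15.0, [np.array([0.10, 0.20, 0.30]),
                           np.array([0.40, 0.10, 0.70])])]
print(structure_factor((1, 2, 3), A, atom_types))
```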

More complex expansions are used for electron-density studies (see Chapter 1.2[link] in this volume). Anisotropic Gaussian atoms may be dealt with through the formulae given in Section 1.3.2.4.4.2[link].

1.3.4.2.1.3. Fourier series for the electron density and its summation

| top | pdf |

The convergence of the Fourier series for [\rho\llap{$-\!$}] [\rho\llap{$-\!$}({\bf x}) = {\textstyle\sum\limits_{{\bf h}\in {\bb Z}^{3}}} F({\bf h}) \exp (-2\pi i {\bf h} \cdot {\bf x})] is usually examined from the classical point of view (Section 1.3.2.6.10[link]). The summation of multiple Fourier series meets with considerable difficulties, because there is no natural order in [{\bb Z}^{n}] to play the role of the natural order in [{\bb Z}] (Ash, 1976[link]). In crystallography, however, the structure factors [F({\bf h})] are often obtained within spheres [\|{\bf H}\| \leq \Delta^{-1}] for increasing resolution (decreasing Δ). Therefore, successive estimates of [\rho\llap{$-\!$}] are most naturally calculated as the corresponding partial sums (Section 1.3.2.6.10.1[link]): [S_{\Delta} (\rho\llap{$-\!$})({\bf x}) = {\textstyle\sum\limits_{\|({\bf A}^{-1})^{T} {\bf h}\| \leq \Delta^{-1}}} F({\bf h}) \exp (-2\pi i{\bf h} \cdot {\bf x}).] This may be written [S_{\Delta} (\rho\llap{$-\!$})({\bf x}) = (D_{\Delta} * \rho\llap{$-\!$})({\bf x}),] where [D_{\Delta}] is the `spherical Dirichlet kernel' [D_{\Delta}({\bf x}) = {\textstyle\sum\limits_{\|({\bf A}^{-1})^{T} {\bf h}\| \leq \Delta^{-1}}} \exp (-2\pi i{\bf h} \cdot {\bf x}).] [D_{\Delta}] exhibits numerous negative ripples around its central peak. Thus the `series termination errors' incurred by using [S_{\Delta}(\rho\llap{$-\!$})] instead of [\rho\llap{$-\!$}] consist of negative ripples around each atom, and may lead to a Gibbs-like phenomenon (Section 1.3.2.6.10.1[link]) near a molecular boundary.

As in one dimension, Cesàro sums (arithmetic means of partial sums) have better convergence properties, as they lead to a convolution by a `spherical Fejér kernel' which is everywhere positive. Thus Cesàro summation will always produce positive approximations to a positive electron density. Other positive summation kernels were investigated by Pepinsky (1952)[link] and by Waser & Schomaker (1953)[link].
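A one-dimensional analogue (illustrative values only) makes the effect visible: the sharply truncated series of a narrow non-negative peak develops negative ripples, while the Cesàro (Fejér-weighted) sum of the same coefficients remains non-negative.

```python
import numpy as np

n, H = 512, 10
x = np.arange(n) / n
rho = np.exp(-0.5 * ((x - 0.5) / 0.01) ** 2)          # narrow non-negative 'atom'
F = np.fft.ifft(rho)                                  # F(h) = (1/n) sum rho_m e(+h m/n)

h = np.fft.fftfreq(n, d=1.0 / n)                      # signed indices
dirichlet = np.where(np.abs(h) <= H, 1.0, 0.0)        # sharp cutoff |h| <= H
fejer = np.clip(1.0 - np.abs(h) / (H + 1), 0.0, None) # triangular (Cesaro) weights

partial = np.fft.fft(F * dirichlet).real              # S_H(rho) on the grid
cesaro = np.fft.fft(F * fejer).real                   # Fejer sum on the grid
assert partial.min() < 0.0 and cesaro.min() > -1e-12  # ripples vs positivity
```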

1.3.4.2.1.4. Friedel's law, anomalous scatterers

| top | pdf |

If the wavelength λ of the incident X-rays is far from any absorption edge of the atoms in the crystal, there is a constant phase shift in the scattering, and the electron density may be considered to be real-valued. Then [\eqalign{F({\bf h}) &= {\textstyle\int\limits_{{\bb R}^{3}/{\bb Z}^{3}}} \rho\llap{$-\!$} ({\bf x}) \exp (2\pi i {\bf h} \cdot {\bf x}) \;\hbox{d}^{3} {\bf x} \cr &= \overline{{\textstyle\int\limits_{{\bb R}^{3}/{\bb Z}^{3}}} \overline{\rho\llap{$-\!$} ({\bf x})} \exp [2\pi i (-{\bf h}) \cdot {\bf x}] \;\hbox{d}^{3} {\bf x}} \cr &= \overline{F (-{\bf h})} \hbox{ since } \overline{\rho\llap{$-\!$} ({\bf x})} = \rho\llap{$-\!$} ({\bf x}).}] Thus if [F({\bf h}) = |F({\bf h})| \exp (i\varphi ({\bf h})),] then [|F(-{\bf h})| = |F({\bf h})|\quad \hbox{ and } \quad \varphi (-{\bf h}) = - \varphi ({\bf h}).] This is Friedel's law (Friedel, 1913[link]). The set [\{F_{{\bf h}}\}] of Fourier coefficients is said to have Hermitian symmetry.

If λ is close to some absorption edge(s), the proximity to resonance induces an extra phase shift, whose effect may be represented by letting [\rho\llap{$-\!$} ({\bf x})] take on complex values. Let [\rho\llap{$-\!$} ({\bf x}) = \rho\llap{$-\!$}^{R} ({\bf x}) + i\rho\llap{$-\!$}^{I} ({\bf x})] and correspondingly, by termwise Fourier transformation [F({\bf h}) = F^{R} ({\bf h}) + iF^{I} ({\bf h}).]

Since [\rho\llap{$-\!$}^{R} ({\bf x})] and [\rho\llap{$-\!$}^{I} ({\bf x})] are both real, [F^{R} ({\bf h})] and [F^{I} ({\bf h})] are both Hermitian symmetric, hence [F(-{\bf h}) = \overline{F^{R} ({\bf h})} + i\overline{F^{I} ({\bf h})},] while [\overline{F({\bf h})} = \overline{F^{R}({\bf h})} - i\overline{F^{I}({\bf h})}.] Thus [F(-{\bf h}) \neq \overline{F({\bf h})}], so that Friedel's law is violated. The components [F^{R}({\bf h})] and [F^{I}({\bf h})], which do obey Friedel's law, may be expressed as: [\eqalign{F^{R}({\bf h}) &= {\textstyle{1 \over 2}} [F({\bf h}) + \overline{F(-{\bf h})}],\cr F^{I}({\bf h}) &= {1 \over 2i}[F({\bf h}) - \overline{F(-{\bf h})}].}]
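These relations are easily checked numerically; the one-dimensional sketch below (arbitrary test densities) verifies Friedel's law for a real density and the recovery of [F^{R}] and [F^{I}] for a complex one.

```python
import numpy as np

n = 32
x = np.arange(n) / n
def coeffs(r):
    # F(h) = integral of r(x) e(h x) dx, approximated on the grid; h taken mod n
    return np.array([(r * np.exp(2j * np.pi * h * x)).mean() for h in range(n)])

minus = (-np.arange(n)) % n                              # index of -h

rho_R = 1.0 + np.cos(2 * np.pi * x) + 0.5 * np.sin(6 * np.pi * x)
F = coeffs(rho_R)
assert np.allclose(F, np.conj(F[minus]))                 # Friedel: F(-h) = conj F(h)

rho_I = 0.3 * np.sin(4 * np.pi * x)                      # 'anomalous' imaginary part
F = coeffs(rho_R + 1j * rho_I)
FR = 0.5 * (F + np.conj(F[minus]))
FI = (F - np.conj(F[minus])) / 2j
assert np.allclose(FR, coeffs(rho_R)) and np.allclose(FI, coeffs(rho_I))
```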

1.3.4.2.1.5. Parseval's identity and other [L^{2}] theorems

| top | pdf |

By Section 1.3.2.4.3.3[link] and Section 1.3.2.6.10.2[link], [{\textstyle\sum\limits_{{\bf h}\in {\bb Z}^{3}}} |F({\bf h})|^{2} = {\textstyle\int\limits_{{\bb R}^{3} / {\bb Z}^{3}}} |\rho\llap{$-\!$} ({\bf x})|^{2} \;\hbox{d}^{3} {\bf x} = V {\textstyle\int\limits_{{\bb R}^{3} / {\Lambda}}} |\rho ({\bf X})|^{2} \;\hbox{d}^{3} {\bf X}.] Usually [\rho\llap{$-\!$} ({\bf x})] is real and positive, hence [|\rho\llap{$-\!$} ({\bf x})| = \rho\llap{$-\!$} ({\bf x})], but the identity remains valid even when [\rho\llap{$-\!$} ({\bf x})] is made complex-valued by the presence of anomalous scatterers.

If [\{G_{\bf h}\}] is the collection of structure factors belonging to another electron density [\sigma = A^{\#} \sigma\llap{$-$}] with the same period lattice as ρ, then [\eqalign{{\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} \overline{F({\bf h})}G({\bf h}) &= {\textstyle\int\limits_{{\bb R}^{3} / {\bb Z}^{3}}} \overline{\rho\llap{$-\!$} ({\bf x})} \sigma\llap{$-$} ({\bf x}) \;\hbox{d}^{3} {\bf x} \cr &= V {\textstyle\int\limits_{{\bb R}^{3} / {\Lambda}}} \rho ({\bf X}) \sigma ({\bf X}) \;\hbox{d}^{3} {\bf X}.}] Thus, norms and inner products may be evaluated either from structure factors or from `maps'.
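Both identities are readily checked on a grid; the minimal sketch below verifies the first (Parseval) form for a band-limited test density, in the electrons-per-cell convention used here.

```python
import numpy as np

n = 64
x = np.arange(n) / n
rho = 2.0 + np.cos(2 * np.pi * x) + 0.7 * np.cos(10 * np.pi * x) ** 2
F = np.fft.ifft(rho)               # F(h) = (1/n) sum_m rho_m e(+h m/n)
# sum_h |F(h)|^2  =  integral over the unit cell of |rho-bar(x)|^2
assert np.isclose((np.abs(F) ** 2).sum(), (np.abs(rho) ** 2).mean())
```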

1.3.4.2.1.6. Convolution, correlation and Patterson function

| top | pdf |

Let [\rho\llap{$-\!$} = r * \rho\llap{$-\!$}^{0}] and [\sigma\llap{$-$} = r * \sigma\llap{$-$}^{0}] be two electron densities referred to crystallographic coordinates, with structure factors [\{F_{{\bf h}}\}_{{\bf h} \in {\bb Z}^{3}}] and [\{G_{{\bf h}}\}_{{\bf h} \in {\bb Z}^{3}}], so that [\eqalign{\rho\llap{$-\!$}_{\bf x} &= {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} F({\bf h}) \exp (-2 \pi i {\bf h} \cdot {\bf x}), \cr \sigma\llap{$-$}_{\bf x} &= {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} G({\bf h}) \exp (-2 \pi i {\bf h} \cdot {\bf x}).}]

The distribution [\omega = r * (\rho\llap{$-\!$}^{0} * \sigma\llap{$-$}^{0})] is well defined, since the generalized support condition (Section 1.3.2.3.9.7[link]) is satisfied. The forward version of the convolution theorem implies that if [\omega_{\bf x} = {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} W({\bf h}) \exp (-2 \pi i {\bf h} \cdot {\bf x}),] then [W({\bf h}) = F({\bf h}) G({\bf h}).]

If either [\rho\llap{$-\!$}^{0}] or [\sigma\llap{$-$}^{0}] is infinitely differentiable, then the distribution [\psi = \rho\llap{$-\!$} \times \sigma\llap{$-$}] exists, and if we analyse it as [\psi_{\bf x} = {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} Y({\bf h}) \exp (-2 \pi i {\bf h} \cdot {\bf x}),] then the backward version of the convolution theorem reads: [Y({\bf h}) = {\textstyle\sum\limits_{{\bf k} \in {\bb Z}^{3}}} F({\bf k}) G({\bf h} - {\bf k}).]

The cross correlation [\kappa [\rho\llap{$-\!$}, \sigma\llap{$-$}]] between [\rho\llap{$-\!$}] and [\sigma\llap{$-$}] is the [{\bb Z}^{3}]-periodic distribution defined by: [\kappa = \breve{\rho\llap{$-\!$}}^{0} * \sigma\llap{$-$}.] If [\rho\llap{$-\!$}^{0}] and [\sigma\llap{$-$}^{0}] are locally integrable, [\eqalign{\kappa [\rho\llap{$-\!$}, \sigma\llap{$-$}] ({\bf t)} &= {\textstyle\int\limits_{{\bb R}^{3}}} \rho\llap{$-\!$}^{0} ({\bf x})\sigma\llap{$-$}({\bf x} + {\bf t}) \;\hbox{d}^{3} {\bf x} \cr &= {\textstyle\int\limits_{{\bb R}^{3} / {\bb Z}^{3}}} \rho\llap{$-\!$}({\bf x})\sigma\llap{$-$}({\bf x} + {\bf t}) \;\hbox{d}^{3} {\bf x}.}] Let [\kappa ({\bf t}) = {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} K({\bf h}) \exp (-2\pi i {\bf h} \cdot {\bf t}).] The combined use of the shift property and of the forward convolution theorem then gives immediately: [K({\bf h}) = \overline{F({\bf h})} G({\bf h})\hbox{;}] hence the Fourier series representation of [\kappa [\rho\llap{$-\!$}, \sigma\llap{$-$}]]: [\kappa [\rho\llap{$-\!$}, \sigma\llap{$-$}]({\bf t}) = {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}} \overline{F({\bf h})}} G({\bf h}) \exp (-2\pi i {\bf h} \cdot {\bf t}).] Clearly, [\kappa [\rho\llap{$-\!$}, \sigma\llap{$-$}] = (\kappa [\sigma\llap{$-$}, \rho\llap{$-\!$}]){\breve{}}], as shown by the fact that permuting F and G changes [K({\bf h})] into its complex conjugate.

The auto-correlation of [\rho\llap{$-\!$}] is defined as [\kappa [\rho\llap{$-\!$},\rho\llap{$-\!$}]] and is called the Patterson function of [\rho\llap{$-\!$}]. If [\rho\llap{$-\!$}] consists of point atoms, i.e. [\rho\llap{$-\!$}^{0} = {\textstyle\sum\limits_{j = 1}^{N}} \;Z_{j}\delta_{({\bf x}_{j})},] then [\kappa [\rho\llap{$-\!$}, \rho\llap{$-\!$}] = r * \left[{\textstyle\sum\limits_{j = 1}^{N}} \;{\textstyle\sum\limits_{k = 1}^{N}}\; Z_{j}Z_{k}\delta_{({\bf x}_{j} - {\bf x}_{k})}\right]] contains information about interatomic vectors. It has the Fourier series representation [\kappa [\rho\llap{$-\!$}, \rho\llap{$-\!$}]({\bf t}) = {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} |F({\bf h})|^{2} \exp (-2\pi i {\bf h} \cdot {\bf t}),] and is therefore calculable from the diffraction intensities alone. It was first proposed by Patterson (1934[link], 1935a[link],b[link]) as an extension to crystals of the radially averaged correlation function used by Warren & Gingrich (1934)[link] in the study of powders.
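The fact that the Patterson function is calculable from the intensities alone is illustrated by the following sketch (a one-dimensional toy structure of point atoms), which synthesizes κ from [|F({\bf h})|^{2}] and compares it with the directly computed autocorrelation.

```python
import numpy as np

n = 128
rho = np.zeros(n)
for pos, Z in [(10, 6.0), (37, 8.0), (80, 16.0)]:   # three 'point atoms'
    rho[pos] = Z

F = np.fft.ifft(rho)                                 # F(h) with the e(+) kernel
patterson = np.fft.fft(np.abs(F) ** 2).real          # sum_h |F(h)|^2 e(-h t)

# direct autocorrelation over the unit cell: kappa(t) = integral rho(x) rho(x+t) dx
auto = np.array([(rho * np.roll(rho, -t)).mean() for t in range(n)])
assert np.allclose(patterson, auto)
```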

1.3.4.2.1.7. Sampling theorems, continuous transforms, interpolation

| top | pdf |

Shannon's sampling and interpolation theorem (Section 1.3.2.7.1[link]) takes two different forms, according to whether the property of finite bandwidth is assumed in real space or in reciprocal space.

  • (1) The most usual setting is in reciprocal space (see Sayre, 1952c[link]). Only a finite number of diffraction intensities can be recorded and phased, and for physical reasons the cutoff criterion is the resolution [\Delta = 1/\|{\bf H}\|_{\max}]. Electron-density maps are thus calculated as partial sums (Section 1.3.4.2.1.3[link]), which may be written in Cartesian coordinates as [S_{\Delta}(\rho)({\bf X}) = {\textstyle\sum\limits_{{\bf H} \in \Lambda^{*}, \, \|{\bf H}\| \leq \Delta^{-1}}} F({\bf H}) \exp (-2\pi i {\bf H} \cdot {\bf X}).] [S_{\Delta}(\rho)] is band-limited, the support of its spectrum being contained in the solid sphere [{\Sigma_{\Delta}}] defined by [\|{\bf H}\| \leq \Delta^{-1}]. Let [\chi_{\Delta}] be the indicator function of [{\Sigma_{\Delta}}]. The transform of the normalized version of [\chi_{\Delta}] is (see below, Section 1.3.4.4.3.5[link]) [\eqalign{I_{\Delta}({\bf X}) &= {3\Delta^{3} \over 4\pi} {\scr F}[\chi_{\Delta}]({\bf X}) \cr &= {3 \over u^{3}} (\sin u - u \cos u) \quad\hbox{where } u = 2\pi {\|{\bf X}\| \over \Delta}.}] By Shannon's theorem, it suffices to calculate [S_{\Delta}(\rho)] on an integral subdivision Γ of the period lattice Λ such that the sampling criterion is satisfied (i.e. that the translates of [{\Sigma_{\Delta}}] by vectors of [\Gamma^{*}] do not overlap). Values of [S_{\Delta}(\rho)] may then be calculated at an arbitrary point X by the interpolation formula: [S_{\Delta}(\rho)({\bf X}) = {\textstyle\sum\limits_{{\bf Y} \in \Gamma}} I_{\Delta}({\bf X} - {\bf Y})S_{\Delta}(\rho)({\bf Y}).]

  • (2) The reverse situation occurs whenever the support of the motif [\rho\llap{$-\!$}^{0}] does not fill the whole unit cell, i.e. whenever there exists a region M (the `molecular envelope'), strictly smaller than the unit cell, such that the translates of M by vectors of r do not overlap and that [\chi_{M} \times \rho\llap{$-\!$}^{0} = \rho\llap{$-\!$}^{0}.] It then follows that [\rho\llap{$-\!$} = r * (\chi_{M} \times \rho\llap{$-\!$}).] Defining the `interference function' G as the normalized indicator function of M according to [G(\boldeta) = {1 \over \hbox{vol} (M)} \bar{\scr F}[\chi_{M}](\boldeta)] we may invoke Shannon's theorem to calculate the value [\bar{\scr F}[\rho\llap{$-\!$}^{0}](\boldxi)] at an arbitrary point ξ of reciprocal space from its sample values [F({\bf h}) = \bar{\scr F}[\rho\llap{$-\!$}^{0}]({\bf h})] at points of the reciprocal lattice as [\bar{\scr F}[\rho\llap{$-\!$}^{0}](\boldxi) = {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} {\bf G}(\boldxi - {\bf h}) F({\bf h}).] This aspect of Shannon's theorem constitutes the mathematical basis of phasing methods based on geometric redundancies created by solvent regions and/or noncrystallographic symmetries (Bricogne, 1974[link]). The connection between Shannon's theorem and the phase problem was first noticed by Sayre (1952b)[link]. He pointed out that the Patterson function of [\rho\llap{$-\!$}], written as [\kappa [\rho\llap{$-\!$}, \rho\llap{$-\!$}] = r * (\breve{\rho\llap{$-\!$}}^{0} * \rho\llap{$-\!$}^{0})], may be viewed as consisting of a motif [\kappa^{0} = \breve{\rho\llap{$-\!$}}^{0} * \rho\llap{$-\!$}^{0}] (containing all the internal interatomic vectors) which is periodized by convolution with r. As the translates of [\kappa^{0}] by vectors of [{\bb Z}^{3}] do overlap, the sample values of the intensities [|F({\bf h})|^{2}] at nodes of the reciprocal lattice do not provide enough data to interpolate intensities [|F(\boldxi)|^{2}] at arbitrary points of reciprocal space. Thus the loss of phase is intimately related to the impossibility of intensity interpolation, implying in return that any indication of intensity values attached to non-integral points of the reciprocal lattice is a potential source of phase information.

1.3.4.2.1.8. Sections and projections

| top | pdf |

It was shown at the end of Section 1.3.2.5.8[link] that the convolution theorem establishes, under appropriate assumptions, a duality between sectioning a smooth function (viewed as a multiplication by a δ-function in the sectioning coordinate) and projecting its transform (viewed as a convolution with the function 1 everywhere equal to 1 as a function of the projection coordinate). This duality follows from the fact that [{\scr F}] and [\bar{\scr F}] map [{\bf 1}_{{x}_{i}}] to [\delta_{{x}_{i}}] and [\delta_{{x}_{i}}] to [{\bf 1}_{{x}_{i}}] (Section 1.3.2.5.6[link]), and from the tensor product property (Section 1.3.2.5.5[link]).

In the case of periodic distributions, projection and section must be performed with respect to directions or subspaces which are integral with respect to the period lattice if the result is to be periodic; furthermore, projections must be performed only on the contents of one repeating unit along the direction of projection, or else the result would diverge. The same relations then hold between principal central sections and projections of the electron density and the dual principal central projections and sections of the weighted reciprocal lattice, e.g. [\eqalign{&\rho\llap{$-\!$}(x_{1}, 0, 0) \leftrightarrow {\textstyle\sum\limits_{h_{2}, \, h_{3}}} F(h_{1}, h_{2}, h_{3}),\cr &\rho\llap{$-\!$}(x_{1}, x_{2}, 0) \leftrightarrow {\textstyle\sum\limits_{h_{3}}}\; F(h_{1}, h_{2}, h_{3}),\cr \rho\llap{$-\!$}_{1, \,2}(x_{3}) &= {\textstyle\int\limits_{{\bb R}^{2}/{\bb Z}^{2}}} \rho\llap{$-\!$}(x_{1}, x_{2}, x_{3}) \;\hbox{d}x_{1} \;\hbox{d}x_{2} \leftrightarrow F(0, 0, h_{3}), \cr \rho\llap{$-\!$}_{1}(x_{2}, x_{3}) &= {\textstyle\int\limits_{{\bb R}/{\bb Z}}} \rho\llap{$-\!$}(x_{1}, x_{2}, x_{3}) \;\hbox{d}x_{1} \phantom{\hbox{d}x_{12}^{12}.}\;\leftrightarrow F(0, h_{2}, h_{3})}] etc.

When the sections are principal but not central, it suffices to use the shift property of Section 1.3.2.5.5[link]. When the sections or projections are not principal, they can be made principal by changing to new primitive bases B and [B^{*}] for Λ and [\Lambda^{*}], respectively, the transition matrices P and [{\bf P}^{*}] to these new bases being related by [{\bf P}^{*} = ({\bf P}^{-1})^{T}] in order to preserve duality. This change of basis must be such that one of these matrices (say, P) should have a given integer vector u as its first column, u being related to the line or plane defining the section or projection of interest.

The problem of constructing a matrix P given u received an erroneous solution in Volume II of International Tables (Patterson, 1959[link]), which was subsequently corrected in 1962. Unfortunately, the solution proposed there is complicated and does not suggest a general approach to the problem. It therefore seems worthwhile to record here an effective procedure which solves this problem in any dimension n (Watson, 1970[link]).

Let [{\bf u} = \pmatrix{u_{1}\cr \vdots \cr u_{n}\cr}] be a primitive integral vector, i.e. g.c.d. [(u_{1},\ldots, u_{n}) = 1]. Then an [n \times n] integral matrix P with det [{\bf P} = 1] having u as its first column can be constructed by induction as follows. For [n = 1] the result is trivial. For [n = 2] it can be solved by means of the Euclidean algorithm, which yields [z_{1}, z_{2}] such that [u_{1}z_{2} - u_{2}z_{1} = 1], so that we may take [{\bf P} = \pmatrix{u_{1} &z_{1}\cr u_{2} &z_{2}\cr}]. Note that, if [{\bf z} = \pmatrix{z_{1}\cr z_{2}\cr}] is a solution, then [{\bf z} + m{\bf u}] is another solution for any [m \in {\bb Z}]. For [n \geq 3], write [{\bf u} = \pmatrix{u_{1}\cr d{\bf z}\cr}] with [d = \hbox{g.c.d. } (u_{2}, \ldots, u_{n})] so that both [{\bf z} = \pmatrix{z_{2}\cr \vdots\cr z_{n}\cr}] and [\pmatrix{u_{1}\cr d\cr}] are primitive. By the inductive hypothesis there is an integral [2 \times 2] matrix V with [\pmatrix{u_{1}\cr d\cr}] as its first column, and an integral [(n - 1) \times (n - 1)] matrix Z with z as its first column, with [\det {\bf V} = 1] and [\det {\bf Z} = 1].

Now put [{\bf P} = \pmatrix{1 &\cr &{\bf Z}\cr} \pmatrix{{\bf V} &\cr &{\bf I}_{n - 2}\cr},] i.e. [{\bf P} = \pmatrix{1 &0 &0 &. &0\cr 0 &z_{2} &* &. &*\cr 0 &z_{3} &* &. &*\cr . &. &. &. &.\cr 0 &z_{n} &* &. &*\cr} \pmatrix{u_{1} &* &0 &. &0\cr d &* &0 &. &0\cr 0 &0 &1 &. &0\cr . &. &. &. &.\cr 0 &0 &0 &. &1\cr}.] The first column of P is [\pmatrix{u_{1}\cr dz_{2}\cr .\cr .\cr dz_{n}\cr} = {\bf u},] and its determinant is 1, QED.

The incremental step from dimension [n - 1] to dimension n is the construction of [2 \times 2] matrix V, for which there exist infinitely many solutions labelled by an integer [m_{n - 1}]. Therefore, the collection of matrices P which solve the problem is labelled by [n - 1] arbitrary integers [(m_{1}, m_{2}, \ldots, m_{n - 1})]. This freedom can be used to adjust the shape of the basis B.
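The construction lends itself to a short recursive implementation; the sketch below (illustrative function names) builds such a matrix P for any primitive integer vector u by means of the extended Euclidean algorithm and checks that det P = 1.

```python
import functools
import math
import numpy as np

def ext_gcd(a, b):
    # returns (g, s, t) with s*a + t*b = g = gcd(a, b)
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t

def unimodular_with_first_column(u):
    # integral n x n matrix of determinant 1 whose first column is the
    # primitive vector u, built by the induction described above
    u = list(u)
    n = len(u)
    if n == 1:
        return np.array([[u[0]]])                    # primitivity => u[0] = +/- 1
    if n == 2:
        g, s, t = ext_gcd(u[0], u[1])                # s*u1 + t*u2 = 1
        return np.array([[u[0], -t], [u[1], s]])     # det = s*u1 + t*u2 = 1
    d = functools.reduce(math.gcd, u[1:])
    z = [ui // d for ui in u[1:]]
    V = unimodular_with_first_column([u[0], d])      # 2 x 2 block
    Z = unimodular_with_first_column(z)              # (n-1) x (n-1) block
    P1 = np.eye(n, dtype=int); P1[1:, 1:] = Z
    P2 = np.eye(n, dtype=int); P2[:2, :2] = V
    return P1 @ P2

u = [6, 10, 15]                                      # g.c.d.(6, 10, 15) = 1
P = unimodular_with_first_column(u)
assert (P[:, 0] == np.array(u)).all() and round(np.linalg.det(P)) == 1
print(P)
```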

Once P has been chosen, the calculation of general sections and projections is transformed into that of principal sections and projections by the changes of coordinates: [{\bf x} = {\bf Px}', \qquad {\bf h} = {\bf P}^{*} {\bf h}',] and an appeal to the tensor product property.

Booth (1945a)[link] made use of the convolution theorem to form the Fourier coefficients of `bounded projections', which provided a compromise between 2D and 3D Fourier syntheses. The projection on the (x, y) plane of the electron density lying between the planes [z = z_{1}] and [z = z_{2}] may be written as [[\rho\llap{$-\!$} \times ({\bf 1}_{x} \otimes {\bf 1}_{y} \otimes \chi_{[z_{1}, \, z_{2}]})] * (\delta_{x} \otimes \delta_{y} \otimes {\bf 1}_{z}).] Its transform is then [[F * (\delta_{h} \otimes \delta_{k} \otimes \bar{\scr F}[\chi_{[z_{1}, \, z_{2}]}])] \times ({\bf 1}_{h} \otimes {\bf 1}_{k} \otimes \delta_{l}),] giving for coefficient [(h, k)]: [{\sum\limits_{l \in {\bb Z}}} \;F (h, k, l) \exp \{2 \pi il[(z_{1} + z_{2})/2]\} \times {\sin \pi l (z_{1} - z_{2}) \over \pi l}.]

1.3.4.2.1.9. Differential syntheses

| top | pdf |

Another particular instance of the convolution theorem is the duality between differentiation and multiplication by a monomial (Sections 1.3.2.4.2.8[link], 1.3.2.5.8[link]).

In the present context, this result may be written [\eqalign{&\bar{\scr F}\left[{\partial^{m_{1} + m_{2} + m_{3}} \rho \over \partial X_{1}^{m_{1}} \partial X_{2}^{m_{2}} \partial X_{3}^{m_{3}}}\right] ({\bf H}) \cr &\quad = (-2 \pi i)^{m_{1} + m_{2} + m_{3}} H_{1}^{m_{1}} H_{2}^{m_{2}} H_{3}^{m_{3}} F ({\bf A}^{T} {\bf H})}] in Cartesian coordinates, and [\bar{\scr F}\left[{\partial^{m_{1} + m_{2} + m_{3}} \rho\llap{$-\!$} \over \partial x_{1}^{m_{1}} \partial x_{2}^{m_{2}} \partial x_{3}^{m_{3}}}\right] ({\bf h}) = (-2 \pi i)^{m_{1} + m_{2} + m_{3}} h_{1}^{m_{1}} h_{2}^{m_{2}} h_{3}^{m_{3}} F ({\bf h})] in crystallographic coordinates.

A particular case of the first formula is [-4 \pi^{2} {\textstyle\sum\limits_{{\bf H} \in \Lambda^{*}}} \|{\bf H}\|^{2} F ({\bf A}^{T} {\bf H}) \exp (-2 \pi i {\bf H} \cdot {\bf X}) = \Delta \rho ({\bf X}),] where [\Delta \rho = \sum\limits_{j = 1}^{3} {\partial^{2} \rho \over \partial X_{j}^{2}}] is the Laplacian of ρ.

The second formula has been used with [|{\bf m}| = 1] or 2 to compute `differential syntheses' and refine the location of maxima (or other stationary points) in electron-density maps. Indeed, the values at x of the gradient vector [\nabla \rho\llap{$-\!$}] and Hessian matrix [(\nabla \nabla^{T}) \rho\llap{$-\!$}] are readily obtained as [\eqalign{(\nabla \rho\llap{$-\!$}) ({\bf x}) &= {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} (-2 \pi i {\bf h}) F ({\bf h}) \exp (-2 \pi i {\bf h} \cdot {\bf x}), \cr [(\nabla \nabla^{T}) \rho\llap{$-\!$}] ({\bf x}) &= {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} (-4 \pi^{2} {\bf hh}^{T}) F ({\bf h}) \exp (-2 \pi i {\bf h} \cdot {\bf x}),}] and a step of Newton iteration towards the nearest stationary point of [\rho\llap{$-\!$}] will proceed by [{\bf x} \;\longmapsto\; {\bf x} - \{[(\nabla \nabla^{T}) \rho\llap{$-\!$}] ({\bf x})\}^{-1} (\nabla \rho\llap{$-\!$}) ({\bf x}).]
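The following sketch (toy coefficients for a single Gaussian-like `atom') evaluates the gradient and Hessian of [\rho\llap{$-\!$}] directly from the [F({\bf h})] and performs the Newton iteration just described to locate the peak.

```python
import itertools
import numpy as np

hs = np.array(list(itertools.product(range(-4, 5), repeat=3)))       # |h_i| <= 4
x0 = np.array([0.30, 0.60, 0.10])                                     # peak position
F = np.exp(2j * np.pi * hs @ x0) * np.exp(-0.05 * (hs ** 2).sum(1))   # toy F(h)

def grad(x):
    ph = F * np.exp(-2j * np.pi * hs @ x)
    return ((-2j * np.pi) * (hs.T @ ph)).real            # sum_h (-2 pi i h) F(h) e(-h.x)

def hessian(x):
    ph = F * np.exp(-2j * np.pi * hs @ x)
    return (-4 * np.pi ** 2) * np.einsum('ki,kj,k->ij', hs, hs, ph).real

x = np.array([0.29, 0.61, 0.11])                          # starting point near the peak
for _ in range(5):
    x = x - np.linalg.solve(hessian(x), grad(x))          # Newton step
assert np.allclose(x, x0, atol=1e-6)
```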

The modern use of Fourier transforms to speed up the computation of derivatives for model refinement will be described in Section 1.3.4.4.7[link].

The converse property is also useful: it relates the derivatives of the continuous transform [\bar{\scr F}[\rho^{0}]] to the moments of [\rho^{0}]: [{\partial^{m_{1} + m_{2} + m_{3}} \bar{\scr F}[\rho^{0}] \over \partial H_{1}^{m_{1}} \partial H_{2}^{m_{2}} \partial H_{3}^{m_{3}}} ({\bf H}) = \bar{\scr F}[(2 \pi i)^{m_{1} + m_{2} + m_{3}} X_{1}^{m_{1}} X_{2}^{m_{2}} X_{3}^{m_{3}} \rho_{{\bf x}}^{0}] ({\bf H}).] For [|{\bf m}| = 2] and [{\bf H} = {\bf 0}], this identity gives the well known relation between the Hessian matrix of the transform [\bar{\scr F}[\rho^{0}]] at the origin of reciprocal space and the inertia tensor of the motif [\rho^{0}]. This is a particular case of the moment-generating properties of [\bar{\scr F}], which will be further developed in Section 1.3.4.5.2[link].

1.3.4.2.1.10. Toeplitz forms, determinantal inequalities and Szegö's theorem

| top | pdf |

The classical results presented in Section 1.3.2.6.9[link] can be readily generalized to the case of triple Fourier series; no new concept is needed, only an obvious extension of the notation.

Let [\rho\llap{$-\!$}] be real-valued, so that Friedel's law holds and [F (-{\bf h}) = \overline{F ({\bf h})}]. Let [{\sf H}] be a finite set of indices comprising the origin: [{\sf H} = \{{\bf h}_{0} = {\bf 0},{\bf h}_{1}, \ldots, {\bf h}_{n}\}]. Then the Hermitian form in [n + 1] complex variables [T_{{\sf H}} [\rho\llap{$-\!$}] ({\bf u}) = {\textstyle\sum\limits_{j, \, \;k = 0}^{n}} F ({\bf h}_{j} - {\bf h}_{k}) \overline{u_{j}} u_{k}] is called the Toeplitz form of order [{\sf H}] associated to [\rho\llap{$-\!$}]. By the convolution theorem and Parseval's identity, [T_{\sf H} [\rho\llap{$-\!$}] ({\bf u}) = {\textstyle\int\limits_{{\bb R}^{3}/{\bb Z}^{3}}} \rho\llap{$-\!$} ({\bf x}) \left|{\textstyle\sum\limits_{j = 0}^{n} u_{j}} \exp (2 \pi i {\bf h}_{j} \cdot {\bf x})\right|^{2} \;\hbox{d}^{3} {\bf x}.] If [\rho\llap{$-\!$}] is almost everywhere non-negative, then for all [{\sf H}] the forms [T_{{\sf H}} [\rho\llap{$-\!$}]] are positive semi-definite and therefore all Toeplitz determinants [D_{{\sf H}} [\rho\llap{$-\!$}]] are non-negative, where [D_{{\sf H}} [\rho\llap{$-\!$}] = \det \{[F ({\bf h}_{j} - {\bf h}_{k})]\}.]

The Toeplitz–Carathéodory–Herglotz theorem given in Section 1.3.2.6.9.2[link] states that the converse is true: if [D_{{\sf H}} [\rho] \geq 0] for all [{\sf H}], then [\rho\llap{$-\!$}] is almost everywhere non-negative. This result is known in the crystallographic literature through the papers of Karle & Hauptman (1950)[link], MacGillavry (1950)[link], and Goedkoop (1950)[link], following previous work by Harker & Kasper (1948)[link] and Gillis (1948a[link],b[link]).
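A quick one-dimensional check (arbitrary test density) of the direct statement: for a non-negative [\rho\llap{$-\!$}], the Karle–Hauptman matrices [[F ({\bf h}_{j} - {\bf h}_{k})]] are positive semi-definite.

```python
import numpy as np

n = 64
x = np.arange(n) / n
rho = np.exp(np.cos(2 * np.pi * x)) + np.exp(np.cos(2 * np.pi * (x - 0.3)))   # >= 0
F = lambda h: (rho * np.exp(2j * np.pi * h * x)).mean()    # F(h) = integral rho e(hx)

hset = [0, 1, 2, 3, 5, 8]                                  # finite index set, origin in
T = np.array([[F(hj - hk) for hk in hset] for hj in hset]) # Toeplitz form matrix
assert np.linalg.eigvalsh(T).min() > -1e-12                # positive semi-definite
```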

Szegö's study of the asymptotic distribution of the eigenvalues of Toeplitz forms as their order tends to infinity remains valid. Some precautions are needed, however, to define the notion of a sequence [({\sf H}_{k})] of finite subsets of indices tending to infinity: it suffices that the [{\sf H}_{k}] should consist essentially of the reciprocal-lattice points h contained within a domain of the form [k\Omega] (k-fold dilation of Ω) where Ω is a convex domain in [{\bb R}^{3}] containing the origin (Widom, 1960[link]). Under these circumstances, the eigenvalues [\lambda_{\nu}^{(n)}] of the Toeplitz forms [T_{{\sf H}_{k}} [\rho\llap{$-\!$}]] become equidistributed with the sample values [\rho\llap{$-\!$}_{\nu'}^{(n)}] of [\rho\llap{$-\!$}] on a grid satisfying the Shannon sampling criterion for the data in [{\sf H}_{k}] (cf. Section 1.3.2.6.9.3[link]).

A particular consequence of this equidistribution is that the geometric means of the [\lambda_{\nu}^{(n)}] and of the [\rho\llap{$-\!$}_{\nu'}^{(n)}] are equal, and hence as in Section 1.3.2.6.9.4[link] [\lim\limits_{k \rightarrow \infty} \{D_{{\sf H}_{k}} [\rho\llap{$-\!$}]\}^{1/|{\sf H}_{k}|} = \exp \left\{{\textstyle\int\limits_{{\bb R}^{3}/{\bb Z}^{3}}} \log \rho\llap{$-\!$} ({\bf x}) \;\hbox{d}^{3} {\bf x}\right\},] where [|{\sf H}_{k}|] denotes the number of reflections in [{\sf H}_{k}]. Complementary terms giving a better comparison of the two sides were obtained by Widom (1960[link], 1975[link]) and Linnik (1975)[link].

This formula played an important role in the solution of the 2D Ising model by Onsager (1944)[link] (see Montroll et al., 1963[link]). It is also encountered in phasing methods involving the `Burg entropy' (Britten & Collins, 1982[link]; Narayan & Nityananda, 1982[link]; Bricogne, 1982[link], 1984[link], 1988[link]).

1.3.4.2.2. Crystal symmetry

| top | pdf |

1.3.4.2.2.1. Crystallographic groups

| top | pdf |

The description of a crystal given so far has dealt only with its invariance under the action of the (discrete Abelian) group of translations by vectors of its period lattice Λ.

Let the crystal now be embedded in Euclidean 3-space, so that it may be acted upon by the group [M(3)] of rigid (i.e. distance-preserving) motions of that space. The group [M(3)] contains a normal subgroup [T(3)] of translations, and the quotient group [M(3)/T(3)] may be identified with the 3-dimensional orthogonal group [O(3)]. The period lattice Λ of a crystal is a discrete uniform subgroup of [T(3)].

The possible invariance properties of a crystal under the action of [M(3)] are captured by the following definition: a crystallographic group is a subgroup Γ of [M(3)] if

  • (i) [\Gamma \cap T(3) = \Lambda], a period lattice and a normal subgroup of Γ;

  • (ii) the factor group [G = \Gamma/\Lambda] is finite.

The two properties are not independent: by a theorem of Bieberbach (1911)[link], they follow from the assumption that Γ is a discrete subgroup of [M(3)] which operates without accumulation point and with a compact fundamental domain (see Auslander, 1965[link]). These two assumptions imply that G acts on Λ through an integral representation, and this observation leads to a complete enumeration of all distinct Γ's. The mathematical theory of these groups is still an active research topic (see, for instance, Farkas, 1981[link]), and has applications to Riemannian geometry (Wolf, 1967[link]).

This classification of crystallographic groups is described elsewhere in these Tables (Wondratschek, 2005[link]), but it will be surveyed briefly in Section 1.3.4.2.2.3[link] for the purpose of establishing further terminology and notation, after recalling basic notions and results concerning groups and group actions in Section 1.3.4.2.2.2[link].

1.3.4.2.2.2. Groups and group actions

| top | pdf |

The books by Hall (1959)[link] and Scott (1964)[link] are recommended as reference works on group theory.

  • (a) Left and right actions

    Let G be a group with identity element e, and let X be a set. An action of G on X is a mapping from [G \times X] to X with the property that, if g x denotes the image of [(g, x)], then [\displaylines{\quad \hbox{(i)}\;\; (g_{1} g_{2}) x = g_{1} (g_{2}x)\quad \hbox{for all } g_{1}, g_{2} \in G \hbox{ and all } x \in X, \hfill\cr \quad \hbox{(ii)}{\hbox to 22pt{}}ex = x {\hbox to 34pt{}}\hbox{for all } x \in X. \hfill}] An element g of G thus induces a mapping [T_{g}] of X into itself defined by [T_{g} (x) = gx], with the `representation property': [\displaylines{\quad \hbox{(iii) }T_{g_{1} g_{2}} = T_{g_{1}} T_{g_{2}} \quad \hbox{for all } g_{1}, g_{2} \in G.\hfill}] Since G is a group, every g has an inverse [g^{-1}]; hence every mapping [T_{g}] has an inverse [T_{g^{-1}}], so that each [T_{g}] is a permutation of X.

    Strictly speaking, what has just been defined is a left action. A right action of G on X is defined similarly as a mapping [(g, x) \;\longmapsto\; xg] such that [\displaylines{\quad (\hbox{i}')\;\;\; x(g_{1} g_{2}) = (xg_{1}) g_{2}\quad \;\hbox{ for all } g_{1}, g_{2} \in G \hbox{ and all } x \in X, \hfill\cr \quad (\hbox{ii}'){\hbox to 25pt{}} xe = x{\hbox to 39pt{}}\hbox{for all } x \in X. \hfill}] The mapping [T'_{g}] defined by [T'_{g}(x) = xg] then has the `right-representation' property: [\displaylines{\quad (\hbox{iii}')\ T'_{g_{1} g_{2}} = T'_{g_{2}} T'_{g_{1}}\quad \hbox{for all } g_{1}, g_{2} \in G.\hfill}]

    The essential difference between left and right actions is of course not whether the elements of G are written on the left or right of those of X: it lies in the difference between (iii) and (iii′). In a left action the product [g_{1} g_{2}] in G operates on [x \in X] by [g_{2}] operating first, then [g_{1}] operating on the result; in a right action, [g_{1}] operates first, then [g_{2}]. This distinction will be of importance in Sections 1.3.4.2.2.4[link] and 1.3.4.2.2.5[link]. In the sequel, we will use left actions unless otherwise stated.

  • (b) Orbits and isotropy subgroups

    Let x be a fixed element of X. Two fundamental entities are associated to x:

    • (1) the subset of G consisting of all g such that [gx = x] is a subgroup of G, called the isotropy subgroup of x and denoted [G_{x}];

    • (2) the subset of X consisting of all elements g x with g running through G is called the orbit of x under G and is denoted Gx.

    Through these definitions, the action of G on X can be related to the internal structure of G, as follows. Let [G / G_{x}] denote the collection of distinct left cosets of [G_{x}] in G, i.e. of distinct subsets of G of the form [gG_{x}]. Let [|G|, |G_{x}|, |Gx|] and [|G / G_{x}|] denote the numbers of elements in the corresponding sets. The number [|G / G_{x}|] of distinct cosets of [G_{x}] in G is also denoted [[G : G_{x}]] and is called the index of [G_{x}] in G; by Lagrange's theorem [[G : G_{x}] = |G/G_{x}| = {|G| \over |G_{x}|}.] Now if [g_{1}] and [g_{2}] are in the same coset of [G_{x}], then [g_{2} = g_{1}g'] with [g' \in G_{x}], and hence [g_{1}x = g_{2}x]; the converse is obviously true. Therefore, the mapping from cosets to orbit elements [gG_{x} \;\longmapsto\; gx] establishes a one-to-one correspondence between the distinct left cosets of [G_{x}] in G and the elements of the orbit of x under G. It follows that the number of distinct elements in the orbit of x is equal to the index of [G_{x}] in G: [|Gx| = [G : G_{x}] = {|G| \over |G_{x}|},] and that the elements of the orbit of x may be listed without repetition in the form [Gx = \{\gamma x | \gamma \in G/G_{x}\}.]

    Similar definitions may be given for a right action of G on X. The set of distinct right cosets [G_{x}g] in G, denoted [G_{x} \backslash G], is then in one-to-one correspondence with the distinct elements in the orbit xG of x.

  • (c) Fundamental domain and orbit decomposition

    The group properties of G imply that two orbits under G are either disjoint or equal. The set X may thus be written as the disjoint union [X = \bigcup\limits_{i \in I} Gx_{i},] where the [x_{i}] are elements of distinct orbits and I is an indexing set labelling them. The subset [D = \{x_{i}\}_{i\in I}] is said to constitute a fundamental domain (mathematical terminology) or an asymmetric unit (crystallographic terminology) for the action of G on X: it contains one representative [x_{i}] of each distinct orbit. Clearly, other fundamental domains may be obtained by choosing different representatives for these orbits.

    If X is finite and if f is an arbitrary complex-valued function over X, the `integral' of f over X may be written as a sum of integrals over the distinct orbits, yielding the orbit decomposition formula: [\eqalign{{\sum\limits_{x\in X}}\; f(x) &= {\sum\limits_{i\in I}} \left({\sum\limits_{y_{i}\in Gx_{i}}} f(y_{i})\right) = {\sum\limits_{i\in I}} \left({\sum\limits_{\gamma_{i}\in G/G_{x_{i}}}} f(\gamma_{i} x_{i})\right) \cr &= \sum\limits_{i\in I} {1 \over |G_{x_{i}}|} \left(\sum\limits_{g_{i}\in G} f(g_{i} x_{i})\right).}] In particular, taking [f(x) = 1] for all x and denoting by [|X|] the number of elements of X: [|X| = \sum\limits_{i\in I} |Gx_{i}| = \sum\limits_{i\in I} |G/G_{x_{i}}| = \sum\limits_{i\in I} {|G| \over |G_{x_{i}}|}.]
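
These notions lend themselves to direct computation. The following minimal Python sketch (a hypothetical toy action, not a crystallographic one: the cyclic group of order 4 acting on the torus [({\bb Z}/4{\bb Z})^{2}] by quarter-turns) enumerates orbits, isotropy subgroups and a fundamental domain by brute force, and checks the relation [|Gx| = [G : G_{x}]] and the orbit decomposition formula numerically.

```python
# Hypothetical toy action: the cyclic group of order 4 acting on the torus
# (Z/4Z)^2 by quarter-turns, g(i, j) = (j, -i mod 4).  Orbits, isotropy
# subgroups and the orbit decomposition formula are computed by brute force.

def rot(p, k):
    """Apply the k-th power of the four-fold rotation to the point p."""
    i, j = p
    for _ in range(k % 4):
        i, j = j, (-i) % 4
    return (i, j)

G = range(4)                                        # the four powers of the rotation
X = [(i, j) for i in range(4) for j in range(4)]

fundamental_domain = []                             # one representative per orbit
seen = set()
for x in X:
    if x in seen:
        continue
    orbit = {rot(x, k) for k in G}                  # Gx
    stabilizer = [k for k in G if rot(x, k) == x]   # G_x
    assert len(orbit) * len(stabilizer) == len(G)   # |Gx| = [G : G_x]
    fundamental_domain.append((x, len(stabilizer)))
    seen |= orbit

# orbit decomposition of the 'integral' of an arbitrary function f over X
f = lambda p: (p[0] + 1) * (p[1] + 2)
lhs = sum(f(x) for x in X)
rhs = sum(sum(f(rot(x, k)) for k in G) / n_stab
          for x, n_stab in fundamental_domain)
assert abs(lhs - rhs) < 1e-9
print(len(fundamental_domain), "orbits; sum over X =", lhs)
```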

  • (d) Conjugation, normal subgroups, semi-direct products

    A group G acts on itself by conjugation, i.e. by associating to [g \in G] the mapping [C_{g}] defined by [C_{g} (h) = ghg^{-1}.] Indeed, [C_{g} (hk) = C_{g} (h) C_{g} (k)] and [(C_{g})^{-1} = C_{g^{-1}}], so that each [C_{g}] is an automorphism of G. In particular, [C_{g}] operates on the set of subgroups of G, two subgroups H and K being called conjugate if [H = C_{g} (K)] for some [g \in G]; for example, it is easily checked that [G_{gx} = C_{g}(G_{x})]. The orbits under this action are the conjugacy classes of subgroups of G, and the isotropy subgroup of H under this action is called the normalizer of H in G.

    If [\{H\}] is a one-element orbit, H is called a self-conjugate or normal subgroup of G; the cosets of H in G then form a group [G/H] called the factor group of G by H.

    Let G and H be two groups, and suppose that G acts on H by automorphisms of H, i.e. in such a way that [\eqalign{g (h_{1}h_{2}) &= g(h_{1})g(h_{2}) \cr g (e_{H}) &= e_{H}\qquad\quad (\hbox{where } e_{H} \hbox{ is the identity element of } H). \cr g (h^{-1}) &= (g(h))^{-1}}]

    Then the symbols [g, h] with [g \in G], [h \in H] form a group K under the product rule: [[g_{1}, h_{1}] [g_{2}, h_{2}] = [g_{1}g_{2}, h_{1}g_{1}(h_{2})]] {associativity checks; [[e_{G},e_{H}]] is the identity; [[g,h]] has inverse [[g^{-1}, g^{-1} (h^{-1})]]}. The group K is called the semi-direct product of H by G, denoted [K = H\; \triangleright\kern-4pt \lt G].

    The elements [[g, e_{H}]] form a subgroup of K isomorphic to G, the elements [[e_{G}, h]] form a normal subgroup of K isomorphic to H, and the action of G on H may be represented as an action by conjugation in the sense that [C_{[g, \,  e_{H}]} ([e_{G}, h]) = [e_{G}, g(h)].]

    A familiar example of semi-direct product is provided by the group of Euclidean motions [M(3)] (Section 1.3.4.2.2.1[link]). An element S of [M(3)] may be written [S = [R, t]] with [R \in O(3)], the orthogonal group, and [t \in T(3)], the translation group, and the product law [[R_{1}, t_{1}] [R_{2}, t_{2}] = [R_{1}R_{2}, t_{1} + R_{1}(t_{2})]] shows that [ M(3) = T(3) \; \triangleright\kern-4pt \lt O(3)] with [O(3)] acting on [T(3)] by rotating the translation vectors.

  • (e) Associated actions in function spaces

    For every left action [T_{g}] of G in X, there is an associated left action [T_{g}^{\#}] of G on the space [L(X)] of complex-valued functions over X, defined by `change of variable' (Section 1.3.2.3.9.5[link]): [[T_{g}^{\#} f](x) = f((T_{g})^{-1} x) = f(g^{-1} x).] Indeed for any [g_{1},g_{2}] in G, [\eqalign{[T_{g_{1}}^{\#}[T_{g_{2}}^{\#}\; f]](x) &= [T_{g_{2}}^{\#}\; f] ((T_{g_{1}})^{-1} x) = f[T_{g_{2}}^{-1} T_{g_{1}}^{-1} x] \cr &= f((T_{g_{1}} T_{g_{2}})^{-1} x)\hbox{;}}] since [T_{g_{1}} T_{g_{2}} = T_{g_{1}g_{2}}], it follows that [T_{g_{1}}^{\#} T_{g_{2}}^{\#} = T_{g_{1}g_{2}}^{\#}.] It is clear that the change of variable must involve the action of [g^{-1}] (not g) if [T^{\#}] is to define a left action; using g instead would yield a right action.

    The linear representation operators [T_{g}^{\#}] on [L(X)] provide the most natural instrument for stating and exploiting symmetry properties which a function may possess with respect to the action of G. Thus a function [f \in L(X)] will be called G-invariant if [f(gx) = f(x)] for all [g \in G] and all [x \in X]. The value [f(x)] then depends on x only through its orbit G x, and f is uniquely defined once it is specified on a fundamental domain [D = \{x_{i}\}_{i\in I}]; its integral over X is then a weighted sum of its values in D: [{\textstyle\sum\limits_{x \in X}}\; f(x) = {\textstyle\sum\limits_{i \in I}}\; [G:G_{x_{i}}]\; f(x_{i}).]

    The G-invariance of f may be written: [T_{g}^{\#}f = f \quad \hbox{for all } g \in G.] Thus f is invariant under each [T_{g}^{\#}], which obviously implies that f is invariant under the linear operator in [L(X)] [A_{G} = {1 \over |G|} \sum\limits_{g \in G} T_{g}^{\#},] which averages an arbitrary function by the action of G. Conversely, if [A_{G}\;f = f], then [T_{g_{0}}^{\#}\; f = T_{g_{0}}^{\#} (A_{G}\;f) = (T_{g_{0}}^{\#} A_{G})f = A_{G}\;f = f \quad \hbox{for all } g_{0} \in G,] so that the two statements of the G-invariance of f are equivalent. The identity [T_{g_{0}}^{\#} A_{G} = A_{G} \quad \hbox{for all } g_{0} \in G] is easily proved by observing that the map [g \;\longmapsto\; g_{0}g] ([g_{0}] being any element of G) is a one-to-one map from G into itself, so that [{\textstyle\sum\limits_{g \in G}} T_{g}^{\#} = {\textstyle\sum\limits_{g \in G}} T_{g_{0}g}^{\#}] as these sums differ only by the order of the terms. The same identity implies that [A_{G}] is a projector: [(A_{G})^{2} = A_{G},] and hence that its eigenvalues are either 0 or 1. In summary, we may say that the invariance of f under G is equivalent to f being an eigenfunction of the associated projector [A_{G}] for eigenvalue 1.
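
The averaging operator and its projector property are easily checked numerically. The sketch below (again a hypothetical toy action: [G = {\bb Z}/3{\bb Z}] acting on [X = {\bb Z}/6{\bb Z}] by repeated shifts of 2) builds the operators [T_{g}^{\#}] by change of variable, verifies the left-representation property, and confirms that [A_{G}] is a projector whose image consists of G-invariant functions.

```python
# Hypothetical toy action: G = Z/3Z acting on X = Z/6Z by x -> x + 2 (mod 6).
# The induced operators T_g^# on functions, the averaging operator A_G and its
# projector property are checked numerically.
import numpy as np

n, G = 6, [0, 1, 2]                       # X = {0,...,5}, G = powers of the shift
act = lambda k, x: (x + 2 * k) % n        # left action of g^k on X

def T_sharp(k, f):
    """[T_g^# f](x) = f(g^{-1} x): change of variable by the inverse element."""
    return np.array([f[act(-k, x)] for x in range(n)])

def A_G(f):
    """Average of f over the group: A_G = (1/|G|) sum_g T_g^#."""
    return sum(T_sharp(k, f) for k in G) / len(G)

f = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])    # an arbitrary function on X

# left-representation property: T_{g1}^# T_{g2}^# = T_{g1 g2}^#  (here g^1 g^2 = g^3 = e)
assert np.allclose(T_sharp(1, T_sharp(2, f)), T_sharp(0, f))
# A_G is a projector and A_G f is G-invariant
assert np.allclose(A_G(A_G(f)), A_G(f))
assert all(np.allclose(T_sharp(k, A_G(f)), A_G(f)) for k in G)
print("A_G f =", A_G(f))
```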

  • (f) Orbit exchange

    One final result about group actions which will be used repeatedly later is concerned with the case when X has the structure of a Cartesian product: [X = X_{1} \times X_{2} \times \ldots \times X_{n}] and when G acts diagonally on X, i.e. acts on each [X_{j}] separately: [gx = g(x_{1}, x_{2}, \ldots, x_{n}) = (gx_{1}, gx_{2}, \ldots, gx_{n}).] Then complete sets (but not usually minimal sets) of representatives of the distinct orbits for the action of G in X may be obtained in the form [D_{k} = X_{1} \times \ldots \times X_{k-1} \times \{x_{i_{k}}^{(k)}\}_{i_{k} \in I_{k}} \times X_{k+1} \times \ldots \times X_{n}] for each [k = 1, 2, \ldots, n], i.e. by taking a fundamental domain in [X_{k}] and all the elements in [X_{j}] with [j \neq k]. The action of G on each [D_{k}] does indeed generate the whole of X: given an arbitrary element [y = (y_{1}, y_{2}, \ldots, y_{n})] of X, there is an index [i_{k} \in I_{k}] such that [y_{k} \in Gx_{i_{k}}^{(k)}] and a coset of [G_{x_{i_{k}}^{(k)}}] in G such that [y_{k} = \gamma x_{i_{k}}^{(k)}] for any representative γ of that coset; then [y = \gamma (\gamma^{-1} y_{1}, \ldots, \gamma^{-1} y_{k-1}, x_{i_{k}}^{(k)}, \gamma^{-1} y_{k+1}, \ldots, \gamma^{-1} y_{n})] which is of the form [y = \gamma d_{k}] with [d_{k} \in D_{k}].

    The various [D_{k}] are related in a simple manner by `transposition' or `orbit exchange' (the latter name is due to J. W. Cooley). For instance, [D_{j}] may be obtained from [D_{k}(\;j \neq k)] as follows: for each [y_{j} \in X_{j}] there exists [g(y_{j}) \in G] and [i_{j}(y_{j}) \in I_{j}] such that [y_{j} = g(y_{j})x_{i_{j}(y_{j})}^{(j)}]; therefore [D_{j} = \bigcup\limits_{y_{j} \in X_{j}} [g(y_{j})]^{-1} D_{k},] since the fundamental domain of [X_{k}] is thus expanded to the whole of [X_{k}], while [X_{j}] is reduced to its fundamental domain. In other words: orbits are simultaneously collapsed in the jth factor and expanded in the kth.

    When G operates without fixed points in each [X_{k}] (i.e. [G_{x_{k}} = \{e\}] for all [x_{k} \in X_{k}]), then each [D_{k}] is a fundamental domain for the action of G in X. The existence of fixed points in some or all of the [X_{k}] complicates the situation in that for each k and each [x_{k} \in X_{k}] such that [G_{x_{k}} \neq \{e\}] the action of [G/G_{x_{k}}] on the other factors must be examined. Shenefelt (1988)[link] has made a systematic study of orbit exchange for space group P622 and its subgroups.

    Orbit exchange will be encountered, in a great diversity of forms, as the basic mechanism by which intermediate results may be rearranged between the successive stages of the computation of crystallographic Fourier transforms (Section 1.3.4.3)[link].
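
A toy verification of orbit exchange is sketched below (hypothetical example: [G = {\bb Z}/2{\bb Z}] acting freely and diagonally on [X = X_{1} \times X_{2}] with each factor [{\bb Z}/4{\bb Z}]): each set [D_{k}] is built from a fundamental domain in one factor and the whole of the other, and regenerates X under the action of G.

```python
# Hypothetical toy example: orbit exchange for the diagonal action of G = Z/2Z
# on X = X1 x X2, G acting freely on each factor X_k = Z/4Z by x -> x + 2 (mod 4).
from itertools import product

n, G = 4, [0, 1]
act = lambda k, x: (x + 2 * k) % n                  # action on one factor
diag = lambda k, p: tuple(act(k, x) for x in p)     # diagonal action on X

X = list(product(range(n), repeat=2))
fund_1d = [0, 1]                                    # fundamental domain in one factor

# D_1: fundamental domain in the first factor times all of the second factor;
# D_2: all of the first factor times a fundamental domain in the second.
D1 = [p for p in X if p[0] in fund_1d]
D2 = [p for p in X if p[1] in fund_1d]

for D in (D1, D2):
    expanded = {diag(k, p) for k in G for p in D}   # G acting on D regenerates X
    assert expanded == set(X)
    assert len(D) == len(X) // len(G)               # free action: D is a true
                                                    # fundamental domain for X
print(len(D1), len(D2), len(X))
```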

1.3.4.2.2.3. Classification of crystallographic groups


Let Γ be a crystallographic group, Λ the normal subgroup of its lattice translations, and G the finite factor group [\Gamma/\Lambda]. Then G acts on Λ by conjugation [Section 1.3.4.2.2.2[link](d)[link]] and this action, being a mapping of a lattice into itself, is representable by matrices with integer entries.

The classification of crystallographic groups proceeds from this observation in the following three steps:

  • Step 1: find all possible finite abstract groups G which can be represented by [3 \times 3] integer matrices.

  • Step 2: for each such G find all its inequivalent representations by [3 \times 3] integer matrices, equivalence being defined by a change of primitive lattice basis (i.e. conjugation by a [3 \times 3] integer matrix with determinant ±1).

  • Step 3: for each G and each equivalence class of integral representations of G, find all inequivalent extensions of the action of G from Λ to [T(3)], equivalence being defined by an affine coordinate change [i.e. conjugation by an element of [A(3)]].

Step 1[link] leads to the following groups, listed in association with the crystal system to which they later give rise: [\matrix{{\bb Z}/2{\bb Z}\hfill &\hbox{monoclinic}\hfill \cr {\bb Z}/2{\bb Z} \oplus {\bb Z}/2{\bb Z}\hfill & \hbox{orthorhombic}\hfill \cr{\bb Z}/3{\bb Z}, ({\bb Z}/3{\bb Z})\; \triangleright\kern-4pt \lt \{\alpha\}\hfill &\hbox{trigonal}\hfill \cr {\bb Z}/4{\bb Z}, ({\bb Z}/4{\bb Z}) \; \triangleright\kern-4pt \lt \{\alpha\}\hfill &\hbox{tetragonal}\hfill \cr {\bb Z}/6{\bb Z}, ({\bb Z}/6{\bb Z}) \; \triangleright\kern-4pt \lt \{\alpha\}\hfill &\hbox{hexagonal}\hfill \cr ({\bb Z}/2{\bb Z} \oplus {\bb Z}/2{\bb Z}) \; \triangleright\kern-4pt \lt \{S_{3}\}\hfill &\hbox{cubic}\hfill}] and the extension of these groups by a centre of inversion. In this list ⋉ denotes a semi-direct product [Section 1.3.4.2.2.2[link](d)[link]], α denotes the automorphism [g \;\longmapsto\; g^{-1}], and [S_{3}] (the group of permutations on three letters) operates by permuting the copies of [{\bb Z}/2{\bb Z}] (using the subgroup [A_{3}] of cyclic permutations gives the tetrahedral subsystem).

Step 2[link] leads to a list of 73 equivalence classes called arithmetic classes of representations [g \;\longmapsto\; {\bf R}_{g}], where [{\bf R}_{g}] is a [3 \times 3] integer matrix, with [{\bf R}_{g_{1} g_{2}} = {\bf R}_{g_{1}} {\bf R}_{g_{2}}] and [{\bf R}_{e} = {\bf I}_{3}]. This enumeration is more familiar if equivalence is relaxed so as to allow conjugation by rational [3 \times 3] matrices with determinant ± 1: this leads to the 32 crystal classes. The difference between an arithmetic class and its rational class resides in the choice of a lattice mode [(P,\ A/B/C,\ I,\ F \hbox { or } R)]. Arithmetic classes always refer to a primitive lattice, but may use inequivalent integral representations for a given geometric symmetry element; while crystallographers prefer to change over to a non-primitive lattice, if necessary, in order to preserve the same integral representation for a given geometric symmetry element. The matrices P and [{\bf Q} = {\bf P}^{-1}] describing the changes of basis between primitive and centred lattices are listed in Table 5.1.3.1[link] and illustrated in Figs. 5.1.3.2[link] to 5.1.3.8[link] , pp. 80–85, of Volume A of International Tables (Arnold, 2005[link]).

Step 3[link] gives rise to a system of congruences for the systems of non-primitive translations [\{{\bf t}_{g}\}_{g \in G}] which may be associated to the matrices [\{{\bf R}_{g}\}_{g \in G}] of a given arithmetic class, namely: [{\bf t}_{g_{1}g_{2}} \equiv {\bf R}_{g_{1}} {\bf t}_{g_{2}} + {\bf t}_{g_{1}} \hbox{ mod } \Lambda,] first derived by Frobenius (1911)[link]. If equivalence under the action of [A(3)] is taken into account, 219 classes are found. If equivalence is defined with respect to the action of the subgroup [A^{+}(3)] of [A(3)] consisting only of transformations with determinant +1, then 230 classes called space-group types are obtained. In particular, associating to each of the 73 arithmetic classes a trivial set of non-primitive translations [({\bf t}_{g} = {\bf 0} \hbox { for all } g \in G)] yields the 73 symmorphic space groups. This third step may also be treated as an abstract problem concerning group extensions, using cohomological methods [Ascher & Janner (1965)[link]; see Janssen (1973)[link] for a summary]; the connection with Frobenius's approach, as generalized by Zassenhaus (1948)[link], is examined in Ascher & Janner (1968)[link].

The finiteness of the number of space-group types, established here in dimension 3, was shown by Bieberbach (1912[link]) to hold in arbitrary dimension. The reader interested in N-dimensional space-group theory for [N \gt 3] may consult Brown (1969)[link], Brown et al. (1978)[link], Schwarzenberger (1980[link]), and Engel (1986)[link]. The standard reference for integral representation theory is Curtis & Reiner (1962)[link].

All three-dimensional space groups G have the property of being solvable, i.e. that there exists a chain of subgroups [G = G_{r} \gt G_{r-1} \gt \ldots \gt G_{1} \gt G_{0} = \{e\},] where each [G_{i-1}] is a normal subgroup of [G_{i}] and the factor group [G_{i}/G_{i-1}] is a cyclic group of some order [m_{i}] [(1 \leq i \leq r)]. This property may be established by inspection, or deduced from a famous theorem of Burnside [see Burnside (1911[link]), pp. 322–323] according to which any group G such that [|G| = p^{\alpha} q^{\beta}], with p and q distinct primes, is solvable; in the case at hand, [p = 2] and [q = 3]. The whole classification of 3D space groups can be performed swiftly by a judicious use of the solvability property (L. Auslander, personal communication).

Solvability facilitates the indexing of elements of G in terms of generators and relations (Coxeter & Moser, 1972[link]; Magnus et al., 1976[link]) for the purpose of calculation. By definition of solvability, elements [g_{1}, g_{2}, \ldots, g_{r}] may be chosen in such a way that the cyclic factor group [G_{i}/G_{i-1}] is generated by the coset [g_{i}G_{i-1}]. The set [\{g_{1}, g_{2}, \ldots, g_{r}\}] is then a system of generators for G such that the defining relations [see Brown et al. (1978[link]), pp. 26–27] have the particularly simple form [\eqalign{g_{1}^{m_{1}} &= e, \cr g_{i}^{m_{i}} &= g_{i-1}^{a(i, \, i-1)} g_{i-2}^{a(i, \, i-2)} \ldots g_{1}^{a(i, \, 1)}\phantom{1,2,}\quad \hbox{for } 2 \leq i \leq r, \cr g_{i}^{-1} g_{j}^{-1} g_{i}g_{j} &= g_{j-1}^{b(i, \,  j, \,  j-1)} g_{j-2}^{b(i, \,  j, \,  j-2)} \ldots g_{1}^{b(i, \,  j, \,  1)}\quad \hbox{for } 1 \leq i \;\lt\; j \leq r,}] with [0 \leq a(i, h) \lt m_{h}] and [0 \leq b(i, j, h) \lt m_{h}]. Each element g of G may then be obtained uniquely as an `ordered word': [g = g_{r}^{k_{r}} g_{r-1}^{k_{r-1}} \ldots g_{1}^{k_{1}},] with [0 \leq k_{i} \lt m_{i} \hbox{ for all } i = 1, \ldots, r], using the algorithm of Jürgensen (1970)[link]. Such generating sets and defining relations are tabulated in Brown et al. (1978[link], pp. 61–76). An alternative list is given in Janssen (1973[link], Table 4.3, pp. 121–123, and Appendix D, pp. 262–271).
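
As a small illustration of such indexing, the Python sketch below (hypothetical choice of generators) enumerates the elements of the point group 222, which is solvable with [m_{1} = m_{2} = 2], as ordered words [g_{2}^{k_{2}} g_{1}^{k_{1}}] and checks that each element is obtained exactly once.

```python
# Hypothetical example: enumerating the elements of the point group 222
# (order 4, solvable with m_1 = m_2 = 2) as ordered words g_2^{k_2} g_1^{k_1},
# 0 <= k_i < m_i, the generators being two perpendicular 2-fold rotations
# given as integer matrices.
import numpy as np
from itertools import product

g1 = np.diag([-1, -1, 1])      # 2-fold rotation about z
g2 = np.diag([1, -1, -1])      # 2-fold rotation about x
orders = [2, 2]

elements = []
for k1, k2 in product(range(orders[0]), range(orders[1])):
    # ordered word g_2^{k_2} g_1^{k_1}
    w = np.linalg.matrix_power(g2, k2) @ np.linalg.matrix_power(g1, k1)
    elements.append(w)

# each group element is obtained exactly once
as_tuples = {tuple(map(tuple, w)) for w in elements}
assert len(as_tuples) == orders[0] * orders[1]
print(len(as_tuples), "distinct elements")
```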

1.3.4.2.2.4. Crystallographic group action in real space


The action of a crystallographic group Γ may be written in terms of standard coordinates in [{\bb R}^{3}/{\bb Z}^{3}] as [(g, {\bf x}) \;\longmapsto\; S_{g} ({\bf x}) = {\bf R}_{g} {\bf x} + {\bf t}_{g} \hbox{ mod } \Lambda, \quad g \in G,] with [S_{g_{1} g_{2}} = S_{g_{1}} S_{g_{2}}.]
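
This composition law is easy to exercise numerically. The sketch below assumes the standard setting of space group [P2_{1}] (two-fold screw axis along b) and checks that composing the Seitz operator of the screw with itself gives the identity modulo the period lattice, and that composition of operators agrees with successive application to a point.

```python
# Sketch assuming the standard setting of space group P2_1 (2_1 screw along b):
# Seitz operators S_g = (R_g, t_g) acting on fractional coordinates as
# x -> R_g x + t_g mod Z^3.
import numpy as np

def apply(op, x):
    R, t = op
    return (R @ x + t) % 1.0              # reduce modulo the period lattice

def compose(op1, op2):
    """S_{g1 g2} = S_{g1} S_{g2}: (R1 R2, t1 + R1 t2), with t reduced mod 1."""
    (R1, t1), (R2, t2) = op1, op2
    return (R1 @ R2, (t1 + R1 @ t2) % 1.0)

identity = (np.eye(3, dtype=int), np.zeros(3))
screw = (np.diag([-1, 1, -1]), np.array([0.0, 0.5, 0.0]))   # 2_1 along b

x = np.array([0.12, 0.30, 0.45])
assert np.allclose(apply(compose(screw, screw), x), apply(identity, x))
assert np.allclose(apply(screw, apply(screw, x)), apply(compose(screw, screw), x))
print("S_g(x) =", apply(screw, x))
```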

An important characteristic of the representation [\theta : g \;\longmapsto\; S_{g}] is its reducibility, i.e. whether or not it has invariant subspaces other than [\{{\bf 0}\}] and the whole of [{\bb R}^{3}/{\bb Z}^{3}]. For triclinic, monoclinic and orthorhombic space groups, θ is reducible to a direct sum of three one-dimensional representations: [{\bf R}_{g} = \pmatrix{{\bf R}_{g}^{(1)} &{\bf 0} &{\bf 0}\cr {\bf 0} &{\bf R}_{g}^{(2)} &{\bf 0}\cr {\bf 0} &{\bf 0} &{\bf R}_{g}^{(3)}\cr}\hbox{;}] for trigonal, tetragonal and hexagonal groups, it is reducible to a direct sum of two representations, of dimension 2 and 1, respectively; while for tetrahedral and cubic groups, it is irreducible.

By Schur's lemma (see e.g. Ledermann, 1987[link]), any matrix which commutes with all the matrices [{\bf R}_{g}] for [g \in G] must be a scalar multiple of the identity in each invariant subspace.

In the reducible cases, the reductions involve changes of basis which will be rational, not integral, for those arithmetic classes corresponding to non-primitive lattices. Thus the simplification of having maximally reduced representation has as its counterpart the use of non-primitive lattices.

The notions of orbit, isotropy subgroup and fundamental domain (or asymmetric unit) for the action of G on [{\bb R}^{3}/{\bb Z}^{3}] are inherited directly from the general setting of Section 1.3.4.2.2.2.[link] Points x for which [G_{{\bf x}} \neq \{e\}] are called special positions, and the various types of isotropy subgroups which may be encountered in crystallographic groups have been labelled by means of Wyckoff symbols. The representation operators [S_{g}^{\#}] in [L({\bb R}^{3}/{\bb Z}^{3})] have the form: [[S_{g}^{\#} f] ({\bf x}) = f[S_{g}^{-1} ({\bf x})] = f[{\bf R}_{g}^{-1} ({\bf x} - {\bf t}_{g})].] The operators [R_{g}^{\#}] associated to the purely rotational part of each transformation [S_{g}] will also be used. Note the relation: [S_{g}^{\#} = \tau_{{\bf t}_{g}} R_{g}^{\#}.]

Let a crystal structure be described by the list of the atoms in its unit cell, indexed by [k \in K]. Let the electron-density distribution about the centre of mass of atom k be described by [\rho\llap{$-\!$}_{k}] with respect to the standard coordinates x. Then the motif [\rho\llap{$-\!$}^{0}] may be written as a sum of translates: [\rho\llap{$-\!$}^{0} = {\textstyle\sum\limits_{k \in K}} \tau_{{\bf x}_{k}}\rho\llap{$-\!$}_{k}] and the crystal electron density is [\rho\llap{$-\!$} = r * \rho\llap{$-\!$}^{0}].

Suppose that [\rho\llap{$-\!$}] is invariant under Γ. If [{\bf x}_{k_{1}}] and [{\bf x}_{k_{2}}] are in the same orbit, say [{\bf x}_{k_{2}} = S_{g}({\bf x}_{k_{1}})], then [\tau_{{\bf x}_{k_{2}}} \rho\llap{$-\!$}_{k_{2}} = S_{g}^{\#} (\tau_{{\bf x}_{k_{1}}} \rho\llap{$-\!$}_{k_{1}}).] Therefore if [{\bf x}_{k}] is a special position and thus [G_{{\bf x}_{k}} \neq \{e\}], then [S_{g}^{\#} (\tau_{{\bf x}_{k}} \rho\llap{$-\!$}_{k}) = \tau_{{\bf x}_{k}} \rho\llap{$-\!$}_{k} \quad \hbox{for all } g \in G_{{\bf x}_{k}}.] This identity implies that [{\bf R}_{g}{\bf x}_{k} + {\bf t}_{g} \equiv {\bf x}_{k} \hbox{ mod } \Lambda] (the special position condition), and that [\rho\llap{$-\!$}_{k} = R_{g}^{\#} \rho\llap{$-\!$}_{k},] i.e. that [\rho\llap{$-\!$}_{k}] must be invariant by the pure rotational part of [G_{{\bf x}_{k}}]. Trueblood (1956)[link] investigated the consequences of this invariance on the thermal vibration tensor of an atom in a special position (see Section 1.3.4.2.2.6[link] below).

Let J be a subset of K such that [\{{\bf x}_{j}\}_{j \in J}] contains exactly one atom from each orbit. An orbit decomposition yields an expression for [\rho\llap{$-\!$}^{0}] in terms of symmetry-unique atoms: [\rho\llap{$-\!$}^{0} = {\textstyle\sum\limits_{j \in J}} \left({\textstyle\sum\limits_{\gamma_{j} \in G/G_{{\bf x}_{j}}}} S_{\gamma_{j}}^{\#} (\tau_{{\bf x}_{j}} \rho\llap{$-\!$}_{j})\right)] or equivalently [\rho\llap{$-\!$}^{0}({\bf x}) = {\textstyle\sum\limits_{j \in J}} \left\{{\textstyle\sum\limits_{\gamma_{j} \in G/G_{{\bf x}_{j}}}} \rho\llap{$-\!$}_{j}[{\bf R}_{\gamma_{j}}^{-1} ({\bf x} - {\bf t}_{\gamma_{j}}) - {\bf x}_{j}]\right\}.] If the atoms are assumed to be Gaussian, write [\eqalign{ \rho_{j}({\bf X}) &= {Z_{j} \over |\det (2\pi {\bf U}_{j})|^{1/2}}\cr &\quad\times \exp (- {\textstyle{1 \over 2}}{\bf X}^{T} {\bf U}_{j}^{-1} {\bf X}) \hbox{ in Cartesian \AA{} coordinates},}] where [Z_{j}] is the total number of electrons, and where the matrix [{\bf U}_{j}] combines the Gaussian spread of the electrons in atom j at rest with the covariance matrix of the random positional fluctuations of atom j caused by thermal agitation.

In crystallographic coordinates: [\eqalign{ \rho\llap{$-\!$}_{j} ({\bf x}) &= {Z_{j} \over |\hbox{det } (2\pi {\bf Q}_{j})|^{1/2}}\cr &\quad\times \exp (- {\textstyle{1 \over 2}}{\bf x}^{T} {\bf Q}_{j}^{-1}{\bf x}) \hbox{ with } {\bf Q}_{j} = {\bf A}^{-1} {\bf U}_{j} ({\bf A}^{-1})^{T}.}]

If atom k is in a special position [{\bf x}_{k}], then the matrix [{\bf Q}_{k}] must satisfy the identity [{\bf R}_{g} {\bf Q}_{k} {\bf R}_{g}^{T} = {\bf Q}_{k}] for all g in the isotropy subgroup of [{\bf x}_{k}]. This condition may also be written in Cartesian coordinates as [{\bf T}_{g} {\bf U}_{k} {\bf T}_{g}^{-1} = {\bf U}_{k},] where [{\bf T}_{g} = {\bf AR}_{g} {\bf A}^{-1}.] This is a condensed form of the symmetry properties derived by Trueblood (1956)[link].
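
A common computational device consistent with this constraint is to average a trial displacement matrix over the isotropy subgroup, which projects it onto the invariant form. The sketch below uses hypothetical numerical values and a two-fold axis along the Cartesian z direction; the symmetrized matrix then satisfies the condition above (here [U_{13} = U_{23} = 0]).

```python
# Hypothetical numbers: enforcing the Trueblood-type constraint on the
# displacement matrix U of an atom sitting on a 2-fold axis along z (Cartesian
# setting, isotropy subgroup {I, T} with T = diag(-1, -1, 1)).  Averaging U
# over the isotropy subgroup projects it onto the invariant form.
import numpy as np

T = np.diag([-1, -1, 1])
G_x = [np.eye(3), T]                       # isotropy subgroup of the special position

U = np.array([[0.030, 0.005, 0.002],       # an unconstrained trial matrix
              [0.005, 0.025, 0.001],
              [0.002, 0.001, 0.040]])

U_sym = sum(R @ U @ R.T for R in G_x) / len(G_x)

# the symmetrized matrix satisfies T U T^T = U, i.e. U13 = U23 = 0 here
assert np.allclose(T @ U_sym @ T.T, U_sym)
print(U_sym)
```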

1.3.4.2.2.5. Crystallographic group action in reciprocal space


An elementary discussion of this topic may be found in Chapter 1.4[link] of this volume.

Having established that the symmetry of a crystal may be most conveniently stated and handled via the left representation [g \;\longmapsto\; S_{g}^{\#}] of G given by its action on electron-density distributions, it is natural to transpose this action by the identity of Section 1.3.2.5.5[link]: [\eqalign{ \bar{{\scr F}}[S_{g}^{\#} T]_{\boldxi} &= \bar{{\scr F}}[\tau_{{\bf t}_{{g}}} (R_{g}^{\#} T)]_{\boldxi}\cr &= \exp (2 \pi i {\boldxi} \cdot {\bf t}_{{g}}) [({\bf R}_{g}^{-1})^{T \#} \bar{{\scr F}}[T]]_{\boldxi}}] for any tempered distribution T, i.e. [\bar{{\scr F}}[S_{g}^{\#}T] ({\boldxi}) = \exp (2 \pi i {\boldxi} \cdot {\bf t}_{g}) \bar{{\scr F}}[T]({\bf R}_{g}^{T} {\boldxi})] whenever the transforms are functions.

Putting [T = \rho\llap{$-\!$}], a [{\bb Z}^{3}]-periodic distribution, this relation defines a left action [S_{g}^{*}] of G on [L({\bb Z}^{3})] given by [(S_{g}^{*} F) ({\bf h}) = \exp (2 \pi i {\bf h} \cdot {\bf t}_{g}) F ({\bf R}_{g}^{T} {\bf h})] which is conjugate to the action [S_{g}^{\#}] in the sense that [\bar{{\scr F}}[S_{g}^{\#} \rho\llap{$-\!$}] = S_{g}^{*} \bar{{\scr F}}[\rho\llap{$-\!$}], \quad i.e.\ S_{g}^{*} = \bar{{\scr F}}S_{g}^{\#} {\scr F}.] The identity [S_{g}^{\#} \rho\llap{$-\!$} = \rho\llap{$-\!$}] expressing the G-invariance of [\rho\llap{$-\!$}] is then equivalent to the identity [S_{g}^{*} F = F] between its structure factors, i.e. (Waser, 1955a[link]) [F({\bf h}) = \exp (2 \pi i {\bf h} \cdot {\bf t}_{g}) F ({\bf R}_{g}^{T} {\bf h}).]

If G is made to act on [{\bb Z}^{3}] via [\theta^{*}: \quad (g, {\bf h}) \;\longmapsto\; ({\bf R}_{g}^{-1})^{T} {\bf h},] the usual notions of orbit, isotropy subgroup (denoted [G_{{\bf h}}]) and fundamental domain may be attached to this action. The above relation then shows that the spectrum [\{F ({\bf h})\}_{{\bf h} \in {\bb Z}^{3}}] is entirely known if it is specified on a fundamental domain [D^{*}] containing one reciprocal-lattice point from each orbit of this action.

A reflection h is called special if [G_{{\bf h}} \neq \{e\}]. Then for any [g \in G_{{\bf h}}] we have [{\bf R}_{g}^{T} {\bf h} = {\bf h}], and hence [F ({\bf h}) = \exp (2 \pi i {\bf h} \cdot {\bf t}_{g}) F ({\bf h}),] implying that [F ({\bf h}) = 0] unless [{\bf h} \cdot {\bf t}_{g} \equiv 0\hbox{ mod } 1]. Special reflections h for which [{\bf h} \cdot {\bf t}_{g} \not \equiv 0\hbox{ mod } 1] for some [g \in G_{{\bf h}}] are thus systematically absent. This phenomenon is an instance of the duality between periodization and decimation of Section 1.3.2.7.2[link]: if [{\bf t}_{g} \neq {\bf 0}], the projection of [\rho\llap{$-\!$}] on the direction of h has period [({\bf t}_{g} \cdot {\bf h}) / ({\bf h} \cdot {\bf h}) \;\lt\; 1], hence its transform (which is the portion of F supported by the central line through h) will be decimated, giving rise to the above condition.
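
The absence condition can be applied mechanically to a list of candidate reflections. The following sketch assumes space group [P2_{1}] with the screw axis along b and flags the reflections [(0, k, 0)] with k odd as systematically absent.

```python
# Standard example, assuming space group P2_1 with the screw axis along b:
# find the systematically absent reflections h for which R_g^T h = h but
# h . t_g is not an integer.
import numpy as np
from itertools import product

R = np.diag([-1, 1, -1])                   # rotation part of the 2_1 operation
t = np.array([0.0, 0.5, 0.0])              # non-primitive translation

absences = []
for h in product(range(-2, 3), repeat=3):  # a small block of reciprocal space
    h = np.array(h)
    if np.array_equal(R.T @ h, h):         # h is special for this operation
        if not np.isclose((h @ t) % 1.0, 0.0):
            absences.append(tuple(h))      # F(h) must vanish

print(absences)                            # the (0, k, 0) reflections with k odd
```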

A reflection h is called centric if [G{\bf h} = G(-{\bf h})], i.e. if the orbit of h contains [-{\bf h}]. Then [{\bf R}_{\gamma}^{T} {\bf h} = -{\bf h}] for some coset γ in [G/G_{{\bf h}}], so that the following relation must hold: [|F({\bf h})| \exp (i\varphi_{{\bf h}}) = \exp (2\pi i {\bf h} \cdot {\bf t}_{\gamma})|F(-{\bf h})| \exp (i\varphi_{-{\bf h}}).] In the absence of dispersion, Friedel's law gives rise to the phase restriction: [\varphi_{{\bf h}} \equiv \pi {\bf h} \cdot {\bf t}_{\gamma} \hbox{ mod } \pi.] The value of the restricted phase is independent of the choice of coset representative γ. Indeed, if [\gamma '] is another choice, then [\gamma' = g\gamma] with [g \in G_{{\bf h}}] and by the Frobenius congruences [{\bf t}_{\gamma'} = {\bf R}_{g} {\bf t}_{\gamma} + {\bf t}_{g}], so that [{\bf h} \cdot {\bf t}_{\gamma'} \equiv ({\bf R}_{g}^{T} {\bf h}) \cdot {\bf t}_{\gamma} + {\bf h} \cdot {\bf t}_{g} \hbox{ mod } 1.] Since [g \in G_{{\bf h}}], [{\bf R}_{g}^{T} {\bf h} = {\bf h}] and [{\bf h} \cdot {\bf t}_{g} \equiv 0\hbox{ mod } 1] if h is not a systematic absence: thus [\pi {\bf h} \cdot {\bf t}_{\gamma'} \equiv \pi {\bf h} \cdot {\bf t}_{\gamma} \hbox{ mod } \pi.]

The treatment of centred lattices may be viewed as another instance of the duality between periodization and decimation (Section 1.3.2.7.2[link]): the periodization of the electron density by the non-primitive lattice translations has as its counterpart in reciprocal space the decimation of the transform by the `reflection conditions' describing the allowed reflections, the decimation and periodization matrices being each other's contragredient.

The reader may consult the papers by Bienenstock & Ewald (1962)[link] and Wells (1965)[link] for earlier approaches to this material.

1.3.4.2.2.6. Structure-factor calculation


Structure factors may be calculated from a list of symmetry-unique atoms by Fourier transformation of the orbit decomposition formula for the motif [\rho\llap{$-\!$}^{0}] given in Section 1.3.4.2.2.4[link]: [\eqalignno{F({\bf h}) &= \bar{{\scr F}}[\rho\llap{$-\!$}^{0}] ({\bf h})&\cr &= \bar{{\scr F}}\left[{\textstyle\sum\limits_{j \in J}} \left({\textstyle\sum\limits_{\gamma_{j} \in G/G_{{\bf x}_{j}}}} S_{\gamma_{j}}^{\#} (\tau_{{\bf x}_{j}} \rho\llap{$-\!$}_{j})\right)\right] ({\bf h})&\cr &= {\textstyle\sum\limits_{j \in J}} {\textstyle\sum\limits_{\gamma_{j} \in G/G_{{\bf x}_{j}}}} \bar{{\scr F}}[\tau_{{\bf t}_{\gamma_{j}}} {\bf R}_{\gamma_{j}}^{\#} \tau_{{\bf x}_{j}} \rho\llap{$-\!$}_{j}] ({\bf h})&\cr &= {\textstyle\sum\limits_{j \in J}} {\textstyle\sum\limits_{\gamma_{j} \in G/G_{{\bf x}_{j}}}} \exp (2\pi i {\bf h} \cdot {\bf t}_{\gamma_{j}})&\cr &\quad \times [({\bf R}_{\gamma_{j}}^{-1})^{T_{\#}} [\exp (2\pi i{\boldxi} \cdot {\bf x}_{j}) \bar{{\scr F}}[\rho\llap{$-\!$}_{j}]_{{\boldxi}}]] ({\bf h}) &\cr &= {\textstyle\sum\limits_{j \in J}}\; {\textstyle\sum\limits_{\gamma_{j} \in G/G_{{\bf x}_{j}}}} \exp (2\pi i{\bf h} \cdot {\bf t}_{\gamma_{j}})&\cr &\quad \times \exp [2\pi i({\bf R}_{\gamma_{j}}^{T} {\bf h}) \cdot {\bf x}_{j}] \bar{{\scr F}}[\rho\llap{$-\!$}_{j}] ({\bf R}_{\gamma_{j}}^{T} {\bf h})\hbox{;}&}] i.e. finally: [F({\bf h}) = {\textstyle\sum\limits_{j \in J}}\; {\textstyle\sum\limits_{\gamma_{j} \in G/G_{{\bf x}_{j}}}} \exp \{2\pi i{\bf h} \cdot [S_{\gamma_{j}} ({\bf x}_{j})]\} \bar{{\scr F}}[\rho\llap{$-\!$}_{j}] ({\bf R}_{\gamma_{j}}^{T} {\bf h}).]

In the case of Gaussian atoms, the atomic transforms are [\bar{{\scr F}}[\rho\llap{$-\!$}_{j}] ({\bf h}) = Z_{j} \exp [- {\textstyle{1 \over 2}} {\bf h}^{T} (4\pi^{2} {\bf Q}_{j}) {\bf h}]] or equivalently [\bar{{\scr F}}[\rho_{j}] ({\bf H}) = Z_{j} \exp [-{\textstyle{1 \over 2}} {\bf H}^{T} (4\pi^{2} {\bf U}_{j}) {\bf H}].]

Two common forms of equivalent temperature factors (incorporating both atomic form and thermal motion) are

  • (i) isotropic B: [\bar{{\scr F}}[\rho\llap{$-\!$}_{j}] ({\bf h}) = Z_{j} \exp (-{\textstyle{1 \over 4}} B_{j} {\bf H}^{T} {\bf H}),] so that [{\bf U}_{j} = (B_{j}/8\pi^{2}) {\bf I}], or [{\bf Q}_{j} = (B_{j}/8\pi^{2}) {\bf A}^{-1} ({\bf A}^{-1})^{T}];

  • (ii) anisotropic β's: [\bar{{\scr F}}[\rho\llap{$-\!$}_{j}] ({\bf h}) = Z_{j} \exp (-{\bf h}^{T} {\boldbeta}_{j} {\bf h}),] so that [{\boldbeta}_{j} = 2\pi^{2} {\bf Q}_{j} = 2\pi^{2} {\bf A}^{-1} {\bf U}_{j} ({\bf A}^{-1})^{T}], or [{\bf U}_{j} = (1/2\pi^{2}) {\bf A}\beta_{j} {\bf A}^{T}].

In the first case, [\bar{{\scr F}}[\rho\llap{$-\!$}_{j}] ({\bf R}_{\gamma_{j}}^{T} {\bf h})] does not depend on [\gamma_{j}], and therefore: [\eqalign{ F({\bf h}) &= {\textstyle\sum\limits_{j \in J}} \;Z_{j} \exp \{-{\textstyle{1 \over 4}} B_{j} {\bf h}^{T} [{\bf A}^{-1} ({\bf A}^{-1})^{T}] {\bf h}\}\cr &\quad \times {\textstyle\sum\limits_{\gamma_{j} \in G/G_{{\bf x}_{j}}}} \exp \{2\pi i{\bf h} \cdot [S_{\gamma_{j}} ({\bf x}_{j})]\}.}] In the second case, however, no such simplification can occur: [\eqalign{ F({\bf h}) &= {\textstyle\sum\limits_{j \in J}} \;Z_{j} {\textstyle\sum\limits_{\gamma_{j} \in G/G_{{\bf x}_{j}}}} \exp [-{\bf h}^{T} ({\bf R}_{\gamma_{j}} {\boldbeta}_{j} {\bf R}_{\gamma_{j}}^{T}) {\bf h}]\cr &\quad \times \exp \{2\pi i{\bf h} \cdot [S_{\gamma_{j}} ({\bf x}_{j})]\}.}] These formulae, or special cases of them, were derived by Rollett & Davies (1955)[link], Waser (1955b)[link], and Trueblood (1956)[link].
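
A direct transcription of the first (isotropic) formula is sketched below for a hypothetical two-atom model in space group [P2_{1}], with a diagonal orthogonalization matrix A so that [{\bf h}^{T} {\bf A}^{-1} ({\bf A}^{-1})^{T} {\bf h}] reduces to [(h/a)^{2} + (k/b)^{2} + (l/c)^{2}]; all numerical values are illustrative only.

```python
# Hypothetical two-atom model, isotropic B, space group P2_1, a cell with a
# diagonal orthogonalization matrix A: structure factors from the
# symmetry-unique atoms by the isotropic formula above.
import numpy as np

a, b, c = 10.0, 12.0, 8.0                          # cell edges in angstroms
ops = [(np.eye(3, dtype=int), np.zeros(3)),        # (R_gamma, t_gamma) for P2_1
       (np.diag([-1, 1, -1]), np.array([0.0, 0.5, 0.0]))]

# symmetry-unique atoms: (Z_j, B_j in A^2, fractional coordinates x_j)
atoms = [(6, 15.0, np.array([0.10, 0.20, 0.30])),
         (8, 12.0, np.array([0.40, 0.15, 0.70]))]

def F(h):
    h = np.asarray(h, dtype=float)
    s2 = (h[0] / a) ** 2 + (h[1] / b) ** 2 + (h[2] / c) ** 2   # h^T A^-1 A^-T h
    total = 0.0 + 0.0j
    for Z, B, x in atoms:
        form = Z * np.exp(-0.25 * B * s2)          # isotropic temperature factor
        for R, t in ops:                           # sum over G/G_x (general positions)
            total += form * np.exp(2j * np.pi * (h @ (R @ x + t)))
    return total

print(F((0, 2, 0)), F((0, 1, 0)))                  # the second is a 2_1 absence
```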

The computation of structure factors by applying the discrete Fourier transform to a set of electron-density values calculated on a grid will be examined in Section 1.3.4.4.5[link].

1.3.4.2.2.7. Electron-density calculations


A formula for the Fourier synthesis of electron-density maps from symmetry-unique structure factors is readily obtained by orbit decomposition: [\eqalign{\rho\llap{$-\!$} ({\bf x}) &= {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} F({\bf h}) \exp (-2\pi i{\bf h} \cdot {\bf x})\cr &= {\textstyle\sum\limits_{l \in L}} \left[{\textstyle\sum\limits_{\gamma_{l} \in G/G_{{\bf h}_{l}}}} F({\bf R}_{\gamma_{l}}^{T} {\bf h}_{l}) \exp [-2\pi i({\bf R}_{\gamma_{l}}^{T} {\bf h}_{l}) \cdot {\bf x}]\right]\cr &= {\textstyle\sum\limits_{l \in L}} \;F({\bf h}_{l}) \left[{\textstyle\sum\limits_{\gamma_{l} \in G/G_{{\bf h}_{l}}}} \exp \{-2\pi i{\bf h}_{l} \cdot [S_{\gamma_{l}} ({\bf x})]\}\right],}] where L is a subset of [{\bb Z}^{3}] such that [\{{\bf h}_{l}\}_{l \in L}] contains exactly one point of each orbit for the action [\theta^{*}: (g, {\bf h}) \;\longmapsto\; ({\bf R}_{g}^{-1})^{T} {\bf h}] of G on [{\bb Z}^{3}]. The physical electron density per cubic ångström is then [\rho ({\bf X}) = {1 \over V} \rho\llap{$-\!$} ({\bf A}^{-1}{\bf X})] with V in Å³.

In the absence of anomalous scatterers in the crystal and of a centre of inversion −I in Γ, the spectrum [\{F({\bf h})\}_{{\bf h} \in {\bb Z}^{3}}] has an extra symmetry, namely the Hermitian symmetry expressing Friedel's law (Section 1.3.4.2.1.4[link]). The action of a centre of inversion may be added to that of Γ to obtain further simplification in the above formula: under this extra action, an orbit [G{\bf h}_{l}] with [{\bf h}_{l} \neq {\bf 0}] is either mapped into itself or into the disjoint orbit [G(-{\bf h}_{l})]; the terms corresponding to [+{\bf h}_{l}] and [-{\bf h}_{l}] may then be grouped within the common orbit in the first case, and between the two orbits in the second case.

  • Case 1: [G (-{\bf h}_{l}) = G{\bf h}_{l}, {\bf h}_{l}] is centric. The cosets in [G/G_{{\bf h}_{l}}] may be partitioned into two disjoint classes by picking one coset in each of the two-coset orbits of the action of −I. Let [(G/G_{{\bf h}_{l}})^{+}] denote one such class: then the reduced orbit [\{{\bf R}_{\gamma_{l}}^{T} {\bf h}_{l} | \gamma_{l} \in (G/G_{{\bf h}_{l}})^{+}\}] contains exactly once the Friedel-unique half of the full orbit [G{\bf h}_{l}], and thus [|(G/G_{{\bf h}_{l}})^{+}| = {\textstyle{1 \over 2}} |G/G_{{\bf h}_{l}}|.] Grouping the summands for [+{\bf h}_{l}] and [- {\bf h}_{l}] yields a real-valued summand [2|F({\bf h}_{l})| {\textstyle\sum\limits_{\gamma_{l} \in (G/G_{{\bf h}_{l}})^{+}}} \cos [2\pi {\bf h}_{l} \cdot [S_{\gamma_{l}} ({\bf x})] - \varphi_{{\bf h}_{l}}].]

  • Case 2: [G(- {\bf h}_{l}) \neq G{\bf h}_{l},\ {\bf h}_{l}] is acentric. The two orbits are then disjoint, and the summands corresponding to [+ {\bf h}_{l}] and [- {\bf h}_{l}] may be grouped together into a single real-valued summand [2|F({\bf h}_{l})| {\textstyle\sum\limits_{\gamma_{l} \in G/G_{{\bf h}_{l}}}} \cos [2\pi {\bf h}_{l} \cdot [S_{\gamma_{l}} ({\bf x})] - \varphi_{{\bf h}_{l}}].]

    In order to reindex the collection of all summands of [\rho\llap{$-\!$}], put [L = L_{c} \cup L_{a},] where [L_{c}] labels the Friedel-unique centric reflections in L and [L_{a}] the acentric ones, and let [L_{a}^{+}] stand for a subset of [L_{a}] containing a unique element of each pair [\{+ {\bf h}_{l}, - {\bf h}_{l}\}] for [l \in L_{a}]. Then [\eqalign{\rho\llap{$-\!$} ({\bf x}) &= F ({\bf 0})\cr &\quad + {\textstyle\sum\limits_{c \in L_{c}}} \left[2|F ({\bf h}_{c})| {\textstyle\sum\limits_{\gamma_{c} \in (G/G_{{\bf h}_{c}})^{+}}} \cos [2\pi {\bf h}_{c} \cdot [S_{\gamma_{c}} ({\bf x})] - \varphi_{{\bf h}_{c}}]\right]\cr &\quad + {\textstyle\sum\limits_{a \in L_{a}^{+}}} \left[2|F ({\bf h}_{a})| {\textstyle\sum\limits_{\gamma_{a} \in G/G_{{\bf h}_{a}}}} \cos [2 \pi {\bf h}_{a} \cdot [S_{\gamma_{a}} ({\bf x})] - \varphi_{{\bf h}_{a}}]\right].}]
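
Numerically, the synthesis from symmetry-unique data amounts to expanding each unique structure factor over its orbit with the phase shifts of Section 1.3.4.2.2.5 and summing; the sketch below (space group [P2_{1}], hypothetical acentric reflections at general positions) does this directly and checks that the resulting map is invariant under the group.

```python
# Hypothetical F values, space group P2_1, acentric general reflections only:
# expand symmetry-unique structure factors over their orbits and check that
# the resulting Fourier synthesis is invariant under the group.
import numpy as np

ops = [(np.eye(3, dtype=int), np.zeros(3)),
       (np.diag([-1, 1, -1]), np.array([0.0, 0.5, 0.0]))]      # P2_1

unique = {(1, 2, 3): 5.0 * np.exp(0.7j), (2, 1, 1): 3.0 * np.exp(-1.2j)}

# expand each unique h0 to its orbit: F(R_g^T h0) = exp(-2 pi i h0 . t_g) F(h0)
full = {}
for h0, F0 in unique.items():
    for R, t in ops:
        h = tuple(R.T @ np.array(h0))
        full[h] = np.exp(-2j * np.pi * np.dot(h0, t)) * F0

def rho(x):
    return sum(F * np.exp(-2j * np.pi * np.dot(h, x)) for h, F in full.items())

x = np.array([0.11, 0.37, 0.52])
for R, t in ops:                       # rho is invariant under every S_g
    assert np.isclose(rho((R @ x + t) % 1.0), rho(x))
print(rho(x))
```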

1.3.4.2.2.8. Parseval's theorem with crystallographic symmetry


The general statement of Parseval's theorem given in Section 1.3.4.2.1.5[link] may be rewritten in terms of symmetry-unique structure factors and electron densities by means of orbit decomposition.

In reciprocal space, [{\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3}}} \overline{F_{1} ({\bf h})} F_{2} ({\bf h}) = {\textstyle\sum\limits_{l \in L}}\; {\textstyle\sum\limits_{\gamma_{l} \in G/G_{{\bf h}_{l}}}} \overline{F_{1} ({\bf R}_{\gamma_{l}}^{T} {\bf h}_{l})} F_{2} ({\bf R}_{\gamma_{l}}^{T} {\bf h}_{l})\hbox{;}] for each l, the summands corresponding to the various [\gamma_{l}] are equal, so that the left-hand side is equal to [\eqalign{&F_{1} ({\bf 0}) F_{2} ({\bf 0})\cr &\quad + {\textstyle\sum\limits_{c \in L_{c}}} 2|(G/G_{{\bf h}_{c}})^{+}|\, |F_{1} ({\bf h}_{c})|\, |F_{2} ({\bf h}_{c})| \cos [\varphi_{1} ({\bf h}_{c}) - \varphi_{2} ({\bf h}_{c})]\cr &\quad + {\textstyle\sum\limits_{a \in L_{a}^{+}}} 2|G/G_{{\bf h}_{a}}|\, |F_{1} ({\bf h}_{a})|\, |F_{2} ({\bf h}_{a})| \cos [\varphi_{1} ({\bf h}_{a}) - \varphi_{2} ({\bf h}_{a})].}]

In real space, the triple integral may be rewritten as [{\textstyle\int\limits_{{\bb R}^{3}/{\bb Z}^{3}}} \overline{\rho\llap{$-\!$}_{1} ({\bf x})} \rho\llap{$-\!$}_{2} ({\bf x}) \hbox{ d}^{3} {\bf x} = |G| {\textstyle\int\limits_{D}} \overline{\rho\llap{$-\!$}_{1} ({\bf x})} \rho\llap{$-\!$}_{2} ({\bf x}) \hbox{ d}^{3} {\bf x}] (where D is the asymmetric unit) if [\rho\llap{$-\!$}_{1}] and [\rho\llap{$-\!$}_{2}] are smooth densities, since the set of special positions has measure zero. If, however, the integral is approximated as a sum over a G-invariant grid defined by decimation matrix N, special positions on this grid must be taken into account: [\eqalign{&{1 \over |{\bf N}|} \sum\limits_{{\bf k} \in {\bb Z}^{3}/{\bf N}{\bb Z}^{3}} \overline{\rho\llap{$-\!$}_{1} ({\bf x})} \rho\llap{$-\!$}_{2} ({\bf x})\cr &\qquad = {1 \over |{\bf N}|} \sum\limits_{{\bf x} \in D}\; [G:G_{\bf x}] \overline{\rho\llap{$-\!$}_{1} ({\bf x})} \rho\llap{$-\!$}_{2} ({\bf x})\cr &\qquad = {|G| \over |{\bf N}|} \sum\limits_{{\bf x} \in D} {1 \over |G_{\bf x}|} \overline{\rho\llap{$-\!$}_{1} ({\bf x})} \rho\llap{$-\!$}_{2} ({\bf x}),}] where the discrete asymmetric unit D contains exactly one point in each orbit of G in [{\bb Z}^{3}/{\bf N}{\bb Z}^{3}].

1.3.4.2.2.9. Convolution theorems with crystallographic symmetry


The standard convolution theorems derived in the absence of symmetry are readily seen to follow from simple properties of functions [e^{\pm} ({\bf h},{\bf x}) = \exp (\pm 2\pi i {\bf h} \cdot {\bf x})] (denoted simply e in formulae which are valid for both signs), namely: [\eqalign{ (\hbox{i})\ \qquad e ({\bf h},{\bf x}) \times e ({\bf k},{\bf x}) &= e ({\bf h} + {\bf k},{\bf x}),\cr (\hbox{ii}) \qquad e ({\bf h},{\bf x}) \times e ({\bf h},{\bf y}) &= e ({\bf h},{\bf x} + {\bf y}).}] These relations imply that the families of functions [\eqalign{&\qquad\{{\bf x} \;\longmapsto\; e ({\bf h},{\bf x})\}_{{\bf h} \in {\bb Z}^{3}}\qquad\hbox{in real space}\cr {\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!}\hbox{and}\cr &\qquad\{{\bf h} \;\longmapsto\; e ({\bf h},{\bf x})\}_{{\bf x} \in {\bb R}^{3}/{\bb Z}^{3}} \quad \hbox{in reciprocal space}}] both generate an algebra of functions, i.e. a vector space endowed with an internal multiplication, since (i) and (ii) show how to `linearize products'.

Friedel's law (when applicable) on the one hand, and the Fourier relation between intensities and the Patterson function on the other hand, both follow from the property [(\hbox{iii}) \qquad \overline{e ({\bf h},{\bf x})} = e (-{\bf h},{\bf x}) = e ({\bf h}, -{\bf x}).]

When crystallographic symmetry is present, the convolution theorems remain valid in their original form if written out in terms of `expanded' data, but acquire a different form when rewritten in terms of symmetry-unique data only. This rewriting is made possible by the extra relation (Section 1.3.4.2.2.5[link]) [(\hbox{iv}) \qquad S_{g^{-1}}^{\#} e ({\bf h},{\bf x}) \equiv e [{\bf h}, S_{g} ({\bf x})] = e ({\bf h},{\bf t}_{g}) e ({\bf R}_{g}^{T} {\bf h},{\bf x})] or equivalently [\eqalign{ (\hbox{iv}') \qquad S_{g}^{\#} e ({\bf h},{\bf x}) &\equiv e [{\bf h}, S_{g}^{-1} ({\bf x})]\cr &= e [(-{\bf R}_{g}^{-1})^{T} {\bf h},{\bf t}_{g}] e [({\bf R}_{g}^{-1})^{T} {\bf h},{\bf x}].}]

The kernels of symmetrized Fourier transforms are not the functions e but rather the symmetrized sums [\Xi^{\pm} ({\bf h},{\bf x}) = {\textstyle\sum\limits_{g \in G}} e^{\pm} [{\bf h}, S_{g} ({\bf x})] = {\textstyle\sum\limits_{g \in G}} e^{\pm} [{\bf h}, S_{g}^{-1} ({\bf x})]] for which the linearization formulae are readily obtained using (i), (ii) and (iv) as [\eqalign{ (\hbox{i})_{G}\ \quad \Xi^{\pm} ({\bf h},{\bf x}) \Xi^{\pm} ({\bf k}, {\bf x}) &= {\textstyle\sum\limits_{g \in G}} e^{\pm} ({\bf k},{\bf t}_{g}) \Xi^{\pm} ({\bf h} + {\bf R}_{g}^{T} {\bf k},{\bf x}),\cr (\hbox{ii})_{G} \quad \Xi^{\pm} ({\bf h},{\bf x}) \Xi^{\pm} ({\bf h},{\bf y}) &= {\textstyle\sum\limits_{g \in G}} \Xi^{\pm} [{\bf h},{\bf x} + S_{g} ({\bf y})],}] where the choice of sign in ± must be the same throughout each formula.

Formulae [\hbox{(i)}_{G}] defining the `structure-factor algebra' associated to G were derived by Bertaut (1955c[link], 1956b[link],c[link], 1959a[link],b[link]) and Bertaut & Waser (1957)[link] in another context.

The forward convolution theorem (in discrete form) then follows. Let [\eqalign{ F_{1} ({\bf h}) &= \sum\limits_{{\bf y} \in D} {1 \over |G_{\bf y}|} \rho\llap{$-\!$}_{1} ({\bf y}) \Xi^{+} ({\bf h},{\bf y}),\cr F_{2} ({\bf h}) &= \sum\limits_{{\bf z} \in D} {1 \over |G_{\bf z}|} \rho\llap{$-\!$}_{2} ({\bf z}) \Xi^{+} ({\bf h},{\bf z}),}] then [F_{1} ({\bf h}) F_{2} ({\bf h}) = \sum\limits_{{\bf x} \in D} {1 \over |G_{\bf x}|} \sigma\llap{$-$} ({\bf x}) \Xi^{+} ({\bf h},{\bf x})] with [\sigma\llap{$-$} ({\bf x}) = {1 \over |{\bf N}|} \sum\limits_{{\bf z} \in D} \sum\limits_{g \in G} {|G_{\bf x}| \over |G_{{\bf x} - S_{g} ({\bf z})}| \times |G_{{\bf z}}|} \rho\llap{$-\!$}_{1} [{\bf x} - S_{g} ({\bf z})] \rho\llap{$-\!$}_{2} ({\bf z}).]

The backward convolution theorem is derived similarly. Let [\eqalign{ \rho\llap{$-\!$}_{1} ({\bf x}) &= \sum\limits_{{\bf k} \in D^{*}} {1 \over |G_{{\bf k}}|} F_{1} ({\bf k}) \Xi^{-} ({\bf k},{\bf x}),\cr \rho\llap{$-\!$}_{2} ({\bf x}) &= \sum\limits_{{\bf l} \in D^{*}} {1 \over |G_{{\bf l}}|} F_{2} ({\bf l}) \Xi^{-} ({\bf l},{\bf x}),}] then [\rho\llap{$-\!$}_{1} ({\bf x}) \rho\llap{$-\!$}_{2} ({\bf x}) = \sum\limits_{{\bf h} \in D^{*}} {1 \over |G_{{\bf h}}|} F({\bf h}) \Xi^{-} ({\bf h},{\bf x})] with [F({\bf h}) = \sum\limits_{{\bf l} \in D^{*}} \sum\limits_{g \in G} {|G_{\bf h}| \over |G_{{\bf h} - {\bf R}_{g}^{T} ({\bf l})}| \times |G_{{\bf l}}|} e^{-} ({\bf l},{\bf t}_{g}) F_{1} ({\bf h} - {\bf R}_{g}^{T} {\bf l}) F_{2} ({\bf l}).] Both formulae are simply orbit decompositions of their symmetry-free counterparts.

1.3.4.2.2.10. Correlation and Patterson functions


Consider two model electron densities [\rho\llap{$-\!$}_{1}] and [\rho\llap{$-\!$}_{2}] with the same period lattice [{\bb Z}^{3}] and the same space group G. Write their motifs in terms of atomic electron densities (Section 1.3.4.2.2.4[link]) as [\eqalign{ \rho\llap{$-\!$}_{1}^{0} &= {\textstyle\sum\limits_{j_{1} \in J_{1}}} \left({\textstyle\sum\limits_{\gamma_{j_{1}} \in G/G_{{\bf x}_{j_{1}}^{(1)}}}} S_{\gamma_{j_{1}}}^{\#} (\tau_{{\bf x}_{j_{1}}^{(1)}} \rho\llap{$-\!$}_{j_{1}}^{(1)})\right),\cr \rho\llap{$-\!$}_{2}^{0} &= {\textstyle\sum\limits_{j_{2} \in J_{2}}} \left({\textstyle\sum\limits_{\gamma_{j_{2}} \in G/G_{{\bf x}_{j_{2}}^{(2)}}}} S_{\gamma_{j_{2}}}^{\#} (\tau_{{\bf x}_{j_{2}}^{(2)}} \rho\llap{$-\!$}_{j_{2}}^{(2)})\right),}] where [J_{1}] and [J_{2}] label the symmetry-unique atoms placed at positions [\{{\bf x}_{j_{1}}^{(1)}\}_{j_{1} \in J_{1}}] and [\{{\bf x}_{j_{2}}^{(2)}\}_{j_{2} \in J_{2}}], respectively.

To calculate the correlation between [\rho\llap{$-\!$}_{1}] and [\rho\llap{$-\!$}_{2}] we need the following preliminary formulae, which are easily established: if [S({\bf x}) = {\bf Rx} + {\bf t}] and f is an arbitrary function on [{\bb R}^{3}], then [(R^{\#} f)\breve{} = R^{\#} \breve{f}, \quad (\tau_{{\bf x}}\; f)\breve{} = \tau_{-{\bf x}} \;\breve{f}, \quad R^{\#} (\tau_{{\bf x}}\; f) = \tau_{{\bf Rx}}\; f,] hence [S^{\#} (\tau_{{\bf x}} \;f) = \tau_{S({\bf x})} R^{\#} f \quad \hbox{and} \quad [S^{\#} (\tau_{{\bf x}} \;f)]\;\breve{} = \tau_{-S({\bf x})} R^{\#} \breve{f}\hbox{;}] and [S_{1}^{\#} f_{1} * S_{2}^{\#} f_{2} = S_{1}^{\#} [\;f_{1} * (S_{1}^{-1} S_{2})^{\#} f_{2}] = S_{2}^{\#} [(S_{2}^{-1} S_{1})^{\#} f_{1} * f_{2}].]

The cross correlation [\breve{\rho\llap{$-\!$}}_{1}^{0} * \rho\llap{$-\!$}_{2}^{0}] between motifs is therefore [\eqalign{ \breve{\rho\llap{$-\!$}}_{1}^{0} * \rho\llap{$-\!$}_{2}^{0} &= {\textstyle\sum\limits_{j_{1}}} {\textstyle\sum\limits_{j_{2}}} {\textstyle\sum\limits_{\gamma_{j_{1}}}} {\textstyle\sum\limits_{\gamma_{j_{2}}}} [S_{\gamma_{j_{1}}}^{\#} (\tau_{{\bf x}_{j_{1}}^{(1)}} \rho\llap{$-\!$}_{j_{1}}^{(1)})]\;\breve{} * [S_{\gamma_{j_{2}}}^{\#} (\tau_{{\bf x}_{j_{2}}^{(2)}} \rho\llap{$-\!$}_{j_{2}}^{(2)})]\cr &= {\textstyle\sum\limits_{j_{1}}} {\textstyle\sum\limits_{j_{2}}} {\textstyle\sum\limits_{\gamma_{j_{1}}}} {\textstyle\sum\limits_{\gamma_{j_{2}}}} \tau_{S_{\gamma_{j_{2}}} ({\bf x}_{j_{2}}^{(2)}) - S_{\gamma_{j_{1}}} ({\bf x}_{j_{1}}^{(1)})} [(R_{\gamma_{j_{1}}}^{\#} \breve{\rho\llap{$-\!$}}_{j_{1}}^{(1)}) * (R_{\gamma_{j_{2}}}^{\#} \rho\llap{$-\!$}_{j_{2}}^{(2)})]}] which contains a peak of shape [(R_{\gamma_{j_{1}}}^{\#} \breve{\rho\llap{$-\!$}}_{j_{1}}^{(1)}) * (R_{\gamma_{j_{2}}}^{\#} \rho\llap{$-\!$}_{j_{2}}^{(2)})] at the interatomic vector [S_{\gamma_{j_{2}}} ({\bf x}_{j_{2}}^{(2)}) - S_{\gamma_{j_{1}}} ({\bf x}_{j_{1}}^{(1)})] for each [j_{1} \in J_{1}], [j_{2} \in J_{2}], [\gamma_{j_{1}} \in G/G_{{\bf x}_{j_{1}}^{(1)}}], [\gamma_{j_{2}} \in G/G_{{\bf x}_{j_{2}}^{(2)}}].

The cross-correlation [r * \breve{\rho\llap{$-\!$}}_{1}^{0} * \rho\llap{$-\!$}_{2}^{0}] between the original electron densities is then obtained by further periodizing by [{\bb Z}^{3}].

Note that these expressions are valid for any choice of `atomic' density functions [\rho\llap{$-\!$}_{j_{1}}^{(1)}] and [\rho\llap{$-\!$}_{j_{2}}^{(2)}], which may be taken as molecular fragments if desired (see Section 1.3.4.4.8[link]).

If G contains elements g such that [{\bf R}_{g}] has an eigenspace [E_{1}] with eigenvalue 1 and an invariant complementary subspace [E_{2}], while [{\bf t}_{g}] has a non-zero component [{\bf t}_{g}^{(1)}] in [E_{1}], then the Patterson function [r * \breve{\rho\llap{$-\!$}}^{0} * \rho\llap{$-\!$}^{0}] will contain Harker peaks (Harker, 1936[link]) of the form [S_{g} ({\bf x}) - {\bf x} = {\bf t}_{g}^{(1)} \oplus (S_{g}^{(2)} ({\bf x}) - {\bf x})] [where [S_{g}^{(2)}] represents the action of g in [E_{2}]] in the translate of [E_{1}] by [{\bf t}_{g}^{(1)}].
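
The Harker construction is readily checked numerically; the sketch below assumes space group [P2_{1}], for which the vectors [S_{g}({\bf x}) - {\bf x}] generated by the screw operation all lie on the section [v = {1 \over 2}].

```python
# Standard P2_1 example: Harker vectors S_g(x) - x generated by the 2_1
# operation all fall on the Harker section v = 1/2 of the Patterson map.
import numpy as np

R = np.diag([-1, 1, -1])
t = np.array([0.0, 0.5, 0.0])

rng = np.random.default_rng(0)
for x in rng.random((5, 3)):                   # a few arbitrary atomic positions
    u = (R @ x + t - x) % 1.0                  # interatomic (Harker) vector
    assert np.isclose(u[1], 0.5)               # component t_g^(1) along the axis
    print(np.round(u, 3))
```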

1.3.4.3. Crystallographic discrete Fourier transform algorithms


1.3.4.3.1. Historical introduction


In 1929, W. L. Bragg demonstrated the practical usefulness of the Fourier transform relation between electron density and structure factors by determining the structure of diopside from three principal projections calculated numerically by 2D Fourier summation (Bragg, 1929[link]). It was immediately realized that the systematic use of this powerful method, and of its extension to three dimensions, would entail considerable amounts of numerical computation which had to be organized efficiently. As no other branch of applied science had yet needed this type of computation, crystallographers had to invent their own techniques.

The first step was taken by Beevers & Lipson (1934)[link] who pointed out that a 2D summation could be factored into successive 1D summations. This is essentially the tensor product property of the Fourier transform (Sections 1.3.2.4.2.4[link], 1.3.3.3.1[link]), although its aspect is rendered somewhat complicated by the use of sines and cosines instead of complex exponentials. Computation is economized to the extent that the cost of an [N \times N] transform grows with N as [2N^{3}] rather than [N^{4}]. Generalization to 3D is immediate, reducing computation size from [N^{6}] to [3N^{4}] for an [N \times N \times N] transform. The complication introduced by using expressions in terms of sines and cosines is turned to advantage when symmetry is present, as certain families of terms are systematically absent or are simply related to each other; multiplicity corrections must, however, be introduced. The necessary information was tabulated for each space group by Lonsdale (1936)[link], and was later incorporated into Volume I of International Tables.

The second step was taken by Beevers & Lipson (1936)[link] and Lipson & Beevers (1936)[link] in the form of the invention of the `Beevers–Lipson strips', a practical device which was to assist a whole generation of crystallographers in the numerical computation of crystallographic Fourier sums. The strips comprise a set of `cosine strips' tabulating the functions [A \cos \left({2\pi hm \over 60}\right)\;\; (A = 1, 2, \ldots, 99\hbox{; } h = 1, 2, \ldots, 99)] and a set of `sine strips' tabulating the functions [B \sin \left({2\pi hm \over 60}\right) \;\;(B = 1, 2, \ldots, 99\hbox{; } h = 1, 2, \ldots, 99)] for the 16 arguments [m = 0, 1, \ldots, 15]. Function values are rounded to the nearest integer, and those for other arguments m may be obtained by using the symmetry properties of the sine and cosine functions. A Fourier summation of the form [Y(m) = \sum\limits_{j = 1}^{n} \left[A_{j} \cos \left({2\pi h_{j}m \over 60}\right) + B_{j} \sin \left({2\pi h_{j}m \over 60}\right)\right]] is then performed by selecting the n cosine strips labelled [(A_{j}, h_{j})] and the n sine strips labelled [(B_{j}, h_{j})], placing them in register, and adding the tabulated values columnwise. The number 60 was chosen as the l.c.m. of 12 (itself the l.c.m. of the orders of all possible non-primitive translations) and of 10 (for decimal convenience). The limited accuracy imposed by the two-digit tabulation was later improved by Robertson's sorting board (Robertson, 1936a[link],b[link]) or by the use of separate strips for each decimal digit of the amplitude (Booth, 1948b[link]), which allowed three-digit tabulation while keeping the set of strips within manageable size. Cochran (1948a)[link] found that, for most structures under study at the time, the numerical inaccuracies of the method were less than the level of error in the experimental data. The sampling rate was subsequently increased from 60 to 120 (Beevers, 1952[link]) to cope with larger unit cells.
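
The strip summation is easy to emulate; the sketch below (with purely illustrative coefficients) tabulates rounded cosine and sine strips for the sixteen arguments and adds them columnwise, as a user of the original strips would have done.

```python
# Illustrative only: emulating a Beevers-Lipson strip summation.  Each 'strip'
# tabulates A cos(2 pi h m / 60) or B sin(2 pi h m / 60), rounded to the
# nearest integer, for the sixteen arguments m = 0..15; the synthesis is the
# columnwise sum of the selected strips.
import numpy as np

m = np.arange(16)
cos_strip = lambda A, h: np.rint(A * np.cos(2 * np.pi * h * m / 60)).astype(int)
sin_strip = lambda B, h: np.rint(B * np.sin(2 * np.pi * h * m / 60)).astype(int)

# hypothetical Fourier coefficients (A_j, B_j, h_j)
terms = [(34, 12, 1), (21, 7, 2), (9, 15, 3)]

Y = sum(cos_strip(A, h) + sin_strip(B, h) for A, B, h in terms)
print(Y)          # the summation Y(m) for m = 0..15, to strip accuracy
```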

Further gains in speed and accuracy were sought through the construction of special-purpose mechanical, electro-mechanical, electronic or optical devices. Two striking examples are the mechanical computer RUFUS built by Robertson (1954[link], 1955[link], 1961[link]) on the principle of previous strip methods (see also Robertson, 1932[link]) and the electronic analogue computer X-RAC built by Pepinsky, capable of real-time calculation and display of 2D and 3D Fourier syntheses (Pepinsky, 1947[link]; Pepinsky & Sayre, 1948[link]; Pepinsky et al., 1961[link]; see also Suryan, 1957[link]). The optical methods of Lipson & Taylor (1951[link], 1958[link]) also deserve mention. Many other ingenious devices were invented, whose descriptions may be found in Booth (1948b)[link], Niggli (1961)[link], and Lipson & Cochran (1968)[link].

Later, commercial punched-card machines were programmed to carry out Fourier summations or structure-factor calculations (Shaffer et al., 1946a[link],b[link]; Cox et al., 1947[link], 1949[link]; Cox & Jeffrey, 1949[link]; Donohue & Schomaker, 1949[link]; Grems & Kasper, 1949[link]; Hodgson et al., 1949[link]; Greenhalgh & Jeffrey, 1950[link]; Kitz & Marchington, 1953[link]).

The modern era of digital electronic computation of Fourier series was initiated by the work of Bennett & Kendrew (1952)[link], Mayer & Trueblood (1953)[link], Ahmed & Cruickshank (1953b)[link], Sparks et al. (1956)[link] and Fowweather (1955)[link]. Their Fourier-synthesis programs used Beevers–Lipson factorization, the program by Sparks et al. being the first 3D Fourier program useable for all space groups (although these were treated as P1 or [P\bar{1}] by data expansion). Ahmed & Barnes (1958)[link] then proposed a general programming technique to allow full use of symmetry elements (orthorhombic or lower) in the 3D Beevers–Lipson factorization process, including multiplicity corrections. Their method was later adopted by Shoemaker & Sly (1961)[link], and by crystallographic program writers at large.

The discovery of the FFT algorithm by Cooley & Tukey in 1965, which instantly transformed electrical engineering and several other disciplines, paradoxically failed to have an immediate impact on crystallographic computing. A plausible explanation is that the calculation of large 3D Fourier maps was a relatively infrequent task which was not thought to constitute a bottleneck, as crystallographers had learned to settle most structural questions by means of cheaper 2D sections or projections. It is significant in this respect that the first use of the FFT in crystallography by Barrett & Zwick (1971)[link] should have occurred as part of an iterative scheme for improving protein phases by density modification in real space, which required a much greater number of Fourier transformations than any previous method. Independently, Bondot (1971)[link] had attracted attention to the merits of the FFT algorithm.

The FFT program used by Barrett & Zwick had been written for signal-processing applications. It was restricted to sampling rates of the form [2^{n}], and was not designed to take advantage of crystallographic symmetry at any stage of the calculation; Bantz & Zwick (1974)[link] later improved this situation somewhat.

It was the work of Ten Eyck (1973)[link] and Immirzi (1973[link], 1976[link]) which led to the general adoption of the FFT in crystallographic computing. Immirzi treated all space groups as P1 by data expansion. Ten Eyck based his program on a versatile multi-radix FFT routine (Gentleman & Sande, 1966[link]) coupled with a flexible indexing scheme for dealing efficiently with multidimensional transforms. He also addressed the problems of incorporating symmetry elements of order 2 into the factorization of 1D transforms, and of transposing intermediate results by other symmetry elements. He was thus able to show that in a large number of space groups (including the 74 space groups having orthorhombic or lower symmetry) it is possible to calculate only the unique results from the unique data within the logic of the FFT algorithm. Ten Eyck wrote and circulated a package of programs for computing Fourier maps and re-analysing them into structure factors in some simple space groups (P1, [P\bar{1}], P2, P2/m, [P2_{1}], P222, [P2_{1}2_{1}2_{1}], Pmmm). This package was later augmented by a handful of new space-group-specific programs contributed by other crystallographers ([P2_{1}2_{1}2], I222, [P3_{1}21], [P4_{1}2_{1}2]). The writing of such programs is an undertaking of substantial complexity, which has deterred all but the bravest: the usual practice is now to expand data for a high-symmetry space group to the largest subgroup for which a specific FFT program exists in the package, rather than attempt to write a new program. Attempts have been made to introduce more modern approaches to the calculation of crystallographic Fourier transforms (Auslander, Feig & Winograd, 1982[link]; Auslander & Shenefelt, 1987[link]; Auslander et al., 1988[link]) but have not gone beyond the stage of preliminary studies.

The task of fully exploiting the FFT algorithm in crystallographic computations is therefore still unfinished, and it is the purpose of this section to provide a systematic treatment such as that (say) of Ahmed & Barnes (1958)[link] for the Beevers–Lipson algorithm.

Ten Eyck's approach, based on the reducibility of certain space groups, is extended by the derivation of a universal transposition formula for intermediate results. It is then shown that space groups which are not completely reducible may nevertheless be treated by three-dimensional Cooley–Tukey factorization in such a way that their symmetry may be fully exploited, whatever the shape of their asymmetric unit. Finally, new factorization methods with built-in symmetries are presented. The unifying concept throughout this presentation is that of `group action' on indexing sets, and of `orbit exchange' when this action has a composite structure; it affords new ways of rationalizing the use of symmetry, or of improving computational speed, or both.

1.3.4.3.2. Defining relations and symmetry considerations

| top | pdf |

A finite set of reflections [\{F_{{\bf h}_{l}}\}_{l \in L}] can be periodized without aliasing by the translations of a suitable sublattice [{\bf N}^{T} \Lambda^{*}] of the reciprocal lattice [\Lambda^{*}]; the converse operation in real space is the sampling of ρ at points X of a grid of the form [{\bf N}^{-1} \Lambda] (Section 1.3.2.7.3[link]). In standard coordinates, [\{F_{{\bf h}_{l}}\}_{l \in L}] is periodized by [{\bf N}^{T} {\bb Z}^{3}], and [\rho\llap{$-\!$}] is sampled at points [{\bf x} \in {\bf N}^{-1} {\bb Z}^{3}].

In the absence of symmetry, the unique data are

  • – the [F_{{\bf h}}] indexed by [{\bf h} \in {\bb Z}^{3} / {\bf N}^{T} {\bb Z}^{3}] in reciprocal space;

  • – the [\rho\llap{$-\!$}_{{\bf x}}] indexed by [{\bf x} \in ({\bf N}^{-1} {\bb Z}^{3}) / {\bb Z}^{3}]; or equivalently the [\rho\llap{$-\!$}_{{\bf m}}] indexed by [{\bf m} \in {\bb Z}^{3} / {\bf N} {\bb Z}^{3}], where [{\bf x} = {\bf N}^{-1} {\bf m}].

They are connected by the ordinary DFT relations: [F_{{\bf h}} = {1 \over |\det {\bf N}|} {\sum\limits_{{\bf x} \in ({\bf N}^{-1} {\bb Z}^{3}) / {\bb Z}^{3}}} \rho\llap{$-\!$}_{{\bf x}} \exp (2\pi i {\bf h} \cdot {\bf x})] or [F_{{\bf h}} = {1 \over |\det {\bf N}|} {\sum\limits_{{\bf m} \in {\bb Z}^{3} / {\bf N}{\bb Z}^{3}}} \rho\llap{$-\!$}_{{\bf m}} \exp [2\pi i {\bf h} \cdot ({\bf N}^{-1} {\bf m})]] and [\rho\llap{$-\!$}_{{\bf x}} = {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3} / {\bf N}^{T} {\bb Z}^{3}}} F_{{\bf h}} \exp (-2\pi i {\bf h} \cdot {\bf x})] or [\rho\llap{$-\!$}_{{\bf m}} = {\textstyle\sum\limits_{{\bf h} \in {\bb Z}^{3} / {\bf N}^{T} {\bb Z}^{3}}} F_{{\bf h}} \exp [-2\pi i {\bf h} \cdot ({\bf N}^{-1} {\bf m})].]
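
These relations are straightforward to verify numerically. The following minimal sketch (in Python/NumPy, not part of the original presentation) evaluates both summations by brute force for an assumed diagonal decimation matrix N = diag(2, 3, 4) and random sample values; all names and sizes are illustrative only. Since this N is diagonal, [{\bf N}^{T} = {\bf N}] and the two index sets coincide.

```python
import numpy as np
from itertools import product

# Brute-force sketch of the DFT relations above for an assumed diagonal
# decimation matrix N = diag(2, 3, 4); data and names are illustrative only.
dims = (2, 3, 4)
N = np.diag(dims)
Ninv = np.linalg.inv(N)
det_N = abs(np.linalg.det(N))
grid = [np.array(m) for m in product(*(range(d) for d in dims))]   # Z^3 / N Z^3

rng = np.random.default_rng(0)
rho = {tuple(m): rng.standard_normal() for m in grid}              # rho_m samples

def F(h):
    # F_h = (1/|det N|) sum_m rho_m exp(+2 pi i h.(N^{-1} m))
    return sum(rho[tuple(m)] * np.exp(2j * np.pi * (h @ Ninv @ m))
               for m in grid) / det_N

def rho_back(m):
    # rho_m = sum_h F_h exp(-2 pi i h.(N^{-1} m)),  h in Z^3 / N^T Z^3
    return sum(F(np.array(h)) * np.exp(-2j * np.pi * (np.array(h) @ Ninv @ m))
               for h in product(*(range(d) for d in dims)))

m0 = np.array([1, 2, 3])
assert np.isclose(rho_back(m0), rho[tuple(m0)])   # the two summations are inverse
```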

In the presence of symmetry, the unique data are

[\{\rho\llap{$-\!$}_{{\bf x}}\}_{{\bf x} \in D}] or [\{\rho\llap{$-\!$}_{{\bf m}}\}_{{\bf m} \in D}] in real space (by abuse of notation, D will denote an asymmetric unit for x or for m indifferently);

[\{F_{{\bf h}}\}_{{\bf h} \in D^{*}}] in reciprocal space.

The previous summations may then be subjected to orbital decomposition, to yield the following `crystallographic DFT' (CDFT) defining relations: [\eqalign{F_{{\bf h}} &= {1 \over |\det {\bf N}|} {\sum\limits_{{\bf x} \in D}} \rho\llap{$-\!$}_{{\bf x}} \left[{\textstyle\sum\limits_{\gamma \in G / G_{{\bf x}}}} \exp \{2 \pi i {\bf h} \cdot [S_{\gamma} ({\bf x})]\}\right]\cr &= {1 \over |\det {\bf N}|} {\sum\limits_{{\bf x} \in D}} \rho\llap{$-\!$}_{{\bf x}} \left[{1 \over |G_{{\bf x}}|} {\textstyle\sum\limits_{g \in G}} \exp \{2\pi i {\bf h} \cdot [S_{g} ({\bf x})]\}\right],\cr &\rho\llap{$-\!$}_{{\bf x}} = {\textstyle\sum\limits_{{\bf h} \in D^{*}}} F_{{\bf h}} \left[{\textstyle\sum\limits_{\gamma \in G / G_{{\bf h}}}} \exp \{-2 \pi i {\bf h} \cdot [S_{\gamma} ({\bf x})]\}\right]\cr &\quad = {\textstyle\sum\limits_{{\bf h} \in D^{*}}} F_{{\bf h}} \left[{1 \over |G_{{\bf h}}|} {\textstyle\sum\limits_{g \in G}} \exp \{-2 \pi i {\bf h} \cdot [S_{g} ({\bf x})]\}\right],}] with the obvious alternatives in terms of [\rho\llap{$-\!$}_{{\bf m}}, {\bf m} = {\bf Nx}]. Our problem is to evaluate the CDFT for a given space group as efficiently as possible, in spite of the fact that the group action has spoilt the simple tensor-product structure of the ordinary three-dimensional DFT (Section 1.3.3.3.1[link]).
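
As a small consistency check of these defining relations, the following sketch (assuming space group [P\bar{1}], a grid with [{\bf N} = 3{\bf I}] and synthetic G-invariant data; all choices are illustrative only) compares the CDFT evaluated from one representative per orbit, weighted by the site multiplicity [|G_{{\bf x}}|], against the ordinary DFT over the full grid.

```python
import numpy as np
from itertools import product

# Consistency check of the CDFT relations, assuming space group P-1
# (G = {e, -e}, inversion at the origin) and N = 3I; illustrative only.
n = 3
ops = [(np.eye(3, dtype=int), np.zeros(3, dtype=int)),
       (-np.eye(3, dtype=int), np.zeros(3, dtype=int))]
grid = [np.array(m) for m in product(range(n), repeat=3)]
Ninv = np.eye(3) / n
det_N = float(n ** 3)

def act(R, t, m):                              # S_g in grid coordinates, mod N Z^3
    return tuple((R @ m + t) % n)

# G-invariant test data: constant on each orbit
rng = np.random.default_rng(1)
rho = {}
for m in grid:
    if tuple(m) not in rho:
        val = rng.standard_normal()
        for R, t in ops:
            rho[act(R, t, m)] = val

# asymmetric unit D: one representative x per orbit, together with |G_x|
D, seen = [], set()
for m in grid:
    if tuple(m) in seen:
        continue
    orbit = {act(R, t, m) for R, t in ops}
    seen |= orbit
    D.append((m, len(ops) // len(orbit)))

def F_full(h):                                 # ordinary DFT over the whole grid
    return sum(rho[tuple(m)] * np.exp(2j * np.pi * (h @ Ninv @ m))
               for m in grid) / det_N

def F_cdft(h):                                 # CDFT: unique data, inner group sum
    total = 0
    for x, Gx in D:
        inner = sum(np.exp(2j * np.pi * (h @ Ninv @ ((R @ x + t) % n)))
                    for R, t in ops) / Gx
        total += rho[tuple(x)] * inner
    return total / det_N

h = np.array([1, 2, 0])
assert np.isclose(F_full(h), F_cdft(h))
```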

Two procedures are available to carry out the 3D summations involved as a succession of smaller summations:

  • (1) decomposition into successive transforms of fewer dimensions but on the same number of points along these dimensions. This possibility depends on the reducibility of the space group, as defined in Section 1.3.4.2.2.4[link], and simply invokes the tensor product property of the DFT;

  • (2) factorization of the transform into transforms of the same number of dimensions as the original one, but on fewer points along each dimension. This possibility depends on the arithmetic factorability of the decimation matrix N, as described in Section 1.3.3.3.2[link].

Clearly, a symmetry expansion to the largest fully reducible subgroup of the space group will give maximal decomposability, but will require computing more than the unique results from more than the unique data. Economy will follow from factoring the transforms in the subspaces within which the space group acts irreducibly.

For irreducible subspaces of dimension 1, the group action is readily incorporated into the factorization of the transform, as first shown by Ten Eyck (1973)[link].

For irreducible subspaces of dimension 2 or 3, the ease of incorporation of symmetry into the factorization depends on the type of factorization method used. The multidimensional Cooley–Tukey method (Section 1.3.3.3.1[link]) is rather complicated; the multidimensional Good method (Section 1.3.3.3.2.2[link]) is somewhat simpler; and the Rader/Winograd factorization admits a generalization, based on the arithmetic of certain rings of algebraic integers, which accommodates 2D crystallographic symmetries in a most powerful and pleasing fashion.

At each stage of the calculation, it is necessary to keep track of the definition of the asymmetric unit and of the symmetry properties of the numbers being manipulated. This requirement applies not only to the initial data and to the final results, where these are familiar; but also to all the intermediate quantities produced by partial transforms (on subsets of factors, or subsets of dimensions, or both), where they are less familiar. Here, the general formalism of transposition (or `orbit exchange') described in Section 1.3.4.2.2.2[link] plays a central role.

1.3.4.3.3. Interaction between symmetry and decomposition

| top | pdf |

Suppose that the space-group action is reducible, i.e. that for each [g \in G] [{\bf R}_{g} = \pmatrix{{\bf R}'_{g} &{\bf 0}\cr {\bf 0} &{\bf R}''_{g}\cr},\qquad {\bf t}_{g} = \openup 2pt\pmatrix{{\bf t}'_{g}\cr {\bf t}''_{g}\cr}\hbox{;}] by Schur's lemma, the decimation matrix must then be of the form [{\bf N} = \pmatrix{{\bf N}' &{\bf 0}\cr {\bf 0} &{\bf N}''\cr}] if it is to commute with all the [{\bf R}_{g}].

Putting [{\bf x} = \pmatrix{{\bf x}'\cr {\bf x}''\cr}] and [{\bf h} = \pmatrix{{\bf h}'\cr {\bf h}''\cr}], we may define [\eqalign{S'_{g} ({\bf x}') &= {\bf R}'_{g} {\bf x}' + {\bf t}'_{g},\cr S''_{g} ({\bf x}'') &= {\bf R}''_{g} {\bf x}'' + {\bf t}''_{g},}] and write [S_{g} = S'_{g} \oplus S''_{g}] (direct sum) as a shorthand for [S_{g} ({\bf x}) = \openup2pt\pmatrix{S'_{g} ({\bf x}')\cr S''_{g} ({\bf x}'')\cr}.]

We may also define the representation operators [S_{g}^{'\#}] and [S_{g}^{''\#}] acting on functions of [{\bf x}'] and [{\bf x}''], respectively (as in Section 1.3.4.2.2.4[link]), and the operators [S_{g}^{'*}] and [S_{g}^{''*}] acting on functions of [{\bf h}'] and [{\bf h}''], respectively (as in Section 1.3.4.2.2.5[link]). Then we may write [S_{g}^{\#} = (S'_{g})^{\#} \oplus (S''_{g})^{\#}] and [S_{g}^{*} = (S'_{g})^{*} \oplus (S''_{g})^{*}] in the sense that g acts on [f({\bf x}) \equiv f({\bf x}', {\bf x}'')] by [(S_{g}^{\#} f)({\bf x}', {\bf x}'') = f[(S'_{g})^{-1} ({\bf x}'), (S''_{g})^{-1} ({\bf x}'')]] and on [\Phi ({\bf h}) \equiv \Phi ({\bf h}', {\bf h}'')] by [\eqalign{(S_{g}^{*} \Phi)({\bf h}', {\bf h}'') &= \exp (2\pi i{\bf h}' \cdot {\bf t}'_{g}) \exp (2\pi i{\bf h}'' \cdot {\bf t}''_{g})\cr &\quad \times \Phi [{\bf R}_{g}^{'T} {\bf h}', {\bf R}_{g}^{''T} {\bf h}''].}]

Thus equipped we may now derive concisely a general identity describing the symmetry properties of intermediate quantities of the form [\eqalign{ T ({\bf x}', {\bf h}'') &= {\sum\limits_{{\bf h}^\prime}}\; F({\bf h}', {\bf h}'') \exp (-2\pi i{\bf h}' \cdot {\bf x}')\cr &= {1 \over |\det {\bf N}'|} {\sum\limits_{{\bf x}''}}\; \rho\llap{$-\!$} ({\bf x}', {\bf x}'') \exp (+2\pi i{\bf h}'' \cdot {\bf x}''),}] which arise through partial transformation of F on [{\bf h}'] or of [\rho\llap{$-\!$}] on [{\bf x}'']. The action of [g \in G] on these quantities will be

  • (i) through [(S'_{g})^{\#}] on the function [{\bf x}'\;\longmapsto\; T ({\bf x}', {\bf h}'')],

  • (ii) through [(S''_{g})^{*}] on the function [{\bf h}'' \;\longmapsto\; T ({\bf x}', {\bf h}'')],

and hence the symmetry properties of T are expressed by the identity [T = [(S'_{g})^{\#} \oplus (S''_{g})^{*}] T.] Applying this relation not to T but to [[(S'_{g^{-1}})^{\#} \oplus (S''_{e})^{*}] T] gives [[(S'_{g^{-1}})^{\#} \oplus (S''_{e})^{*}] T = [(S'_{e})^{\#} \oplus (S''_{g})^{*}] T,] i.e. [T [S'_{g} ({\bf x}'), {\bf h}''] = \exp (2\pi i{\bf h}'' \cdot {\bf t}''_{g})\; T [{\bf x}', {\bf R}_{g}^{''T} {\bf h}''].]

If the unique [F({\bf h}) \equiv F({\bf h}', {\bf h}'')] were initially indexed by [(\hbox{all } {\bf h}') \times (\hbox{unique } {\bf h}'')] (see Section 1.3.4.2.2.2[link]), this formula allows the reindexing of the intermediate results [T ({\bf x}', {\bf h}'')] from the initial form [(\hbox{all } {\bf x}') \times (\hbox{unique } {\bf h}'')] to the final form [(\hbox{unique } {\bf x}') \times (\hbox{all } {\bf h}''),] on which the second transform (on [{\bf h}'']) may now be performed, giving the final results [\rho\llap{$-\!$} ({\bf x}', {\bf x}'')] indexed by [(\hbox{unique } {\bf x}') \times (\hbox{all } {\bf x}''),] which is an asymmetric unit. An analogous interpretation holds if one is going from [\rho\llap{$-\!$}] to F.

The above formula solves the general problem of transposing from one invariant subspace to another, and is the main device for decomposing the CDFT. Particular instances of this formula were derived and used by Ten Eyck (1973)[link]; it is useful for orthorhombic groups, and for dihedral groups containing screw axes [n_{m}] with g.c.d. [(m, n) = 1]. For comparison with later uses of orbit exchange, it should be noted that the type of intermediate results just dealt with is obtained after transforming on all factors in one summand.

A central piece of information for driving such a decomposition is the definition of the full asymmetric unit in terms of the asymmetric units in the invariant subspaces. As indicated at the end of Section 1.3.4.2.2.2[link], this is straightforward when G acts without fixed points, but becomes more involved if fixed points do exist. To this day, no systematic `calculus of asymmetric units' exists which can automatically generate a complete description of the asymmetric unit of an arbitrary space group in a form suitable for directing the orbit exchange process, although Shenefelt (1988)[link] has outlined a procedure for dealing with space group P622 and its subgroups. The asymmetric unit definitions given in Volume A of International Tables are incomplete in this respect, in that they do not specify the possible residual symmetries which may exist on the boundaries of the domains.

1.3.4.3.4. Interaction between symmetry and factorization

| top | pdf |

Methods for factoring the DFT in the absence of symmetry were examined in Sections 1.3.3.2[link] and 1.3.3.3[link]. They are based on the observation that the finite sets which index both data and results are endowed with certain algebraic structures (e.g. are Abelian groups, or rings), and that subsets of indices may be found which are not merely subsets but substructures (e.g. subgroups or subrings). Summation over these substructures leads to partial transforms, and the way in which substructures fit into the global structure indicates how to reassemble the partial results into the final results. As a rule, the richer the algebraic structure which is identified in the indexing set, the more powerful the factoring method.

The ability of a given factoring method to accommodate crystallographic symmetry will thus be determined by the extent to which the crystallographic group action respects (or fails to respect) the partitioning of the index set into the substructures pertaining to that method. This remark justifies trying to gain an overall view of the algebraic structures involved, and of the possibilities of a crystallographic group acting `naturally' on them.

The index sets [\{{\bf m} | {\bf m} \in {\bb Z}^{3}/{\bf N}{\bb Z}^{3}\}] and [\{{\bf h} | {\bf h} \in {\bb Z}^{3}/{\bf N}^{T} {\bb Z}^{3}\}] are finite Abelian groups under component-wise addition. If an iterated addition is viewed as an action of an integer scalar [n \in {\bb Z}] via [\matrix{n{\bf h} = {\bf h} + {\bf h} + \ldots + {\bf h}\hfill & (n \hbox{ times})\hfill & \hbox{for } n \gt 0,\hfill \cr \phantom{n{\bf h}}= {\bf 0}\hfill & & \hbox{for } n = 0,\hfill \cr \phantom{n{\bf h}}= -({\bf h} + {\bf h} + \ldots + {\bf h})\hfill &(|n| \hbox{ times})\hfill &\hbox{for } n \;\lt\; 0,\hfill}] then an Abelian group becomes a module over the ring [{\bb Z}] (or, for short, a [{\bb Z}]-module), a module being analogous to a vector space but with scalars drawn from a ring rather than a field. The left actions of a crystallographic group G by [\displaylines{g: \quad {\bf m} \;\longmapsto\; {\bf R}_{g} {\bf m} + {\bf Nt}_{g} \quad \hbox{mod } {\bf N}{\bb Z}^{3}\cr \hbox{and by}\hfill\cr \;\;g: \quad {\bf h} \;\longmapsto\; ({\bf R}_{g}^{-1})^{T} {\bf h} {\hbox to 22.5pt{}}\hbox{mod } {\bf N}^{T} {\bb Z}^{3}}] can be combined with this [{\bb Z}] action as follows: [\displaylines{\quad {\textstyle\sum\limits_{g \in G}} n_{g} g:\qquad {\bf m} \;\longmapsto\; {\textstyle\sum\limits_{g \in G}} n_{g} ({\bf R}_{g} {\bf m} + {\bf Nt}_{g})\qquad \hbox{ mod } {\bf N}{\bb Z}^{3},\hfill\cr \quad {\textstyle\sum\limits_{g \in G}} n_{g} g:\qquad {\bf h} \;\longmapsto\; {\textstyle\sum\limits_{g \in G}} n_{g} [({\bf R}_{g}^{-1})^{T} {\bf h}] {\hbox to 15pt{}} \phantom{\hbox{mod } {\bf}} \hbox{ mod } {\bf N}^{T} {\bb Z}^{3}.\hfill\cr}] This provides a left action, on the indexing sets, of the set [{\bb Z} G = \left\{{\textstyle\sum\limits_{g \in G}} n_{g} g\big| n_{g} \in {\bb Z} \hbox{ for each } g \in G\right\}] of symbolic linear combinations of elements of G with integral coefficients. If addition and multiplication are defined in [{\bb Z}G] by [\left({\textstyle\sum\limits_{g_{1} \in G}} a_{g_{1}} g_{1}\right) + \left({\textstyle\sum\limits_{g_{2} \in G}} b_{g_{2}} g_{2}\right) = {\textstyle\sum\limits_{g \in G}} (a_{g} + b_{g})g] and [\left({\textstyle\sum\limits_{g_{1} \in G}} a_{g_{1}} g_{1}\right) \times \left({\textstyle\sum\limits_{g_{2} \in G}} b_{g_{2}} g_{2}\right) = {\textstyle\sum\limits_{g \in G}} c_{g} g,] with [c_{g} = {\textstyle\sum\limits_{g' \in G}} a_{g'} b_{(g')^{-1}g},] then [{\bb Z}G] is a ring, and the action defined above makes the indexing sets into [{\bb Z}G]-modules. The ring [{\bb Z}G] is called the integral group ring of G (Curtis & Reiner, 1962[link], p. 44).

From the algebraic standpoint, therefore, the interaction between symmetry and factorization can be expected to be favourable whenever the indexing sets of partial transforms are [{\bb Z}G]-submodules of the main [{\bb Z}G]-modules.

1.3.4.3.4.1. Multidimensional Cooley–Tukey factorization

| top | pdf |

Suppose, as in Section 1.3.3.3.2.1[link], that the decimation matrix N may be factored as [{\bf N}_{1} {\bf N}_{2}]. Then any grid point index [{\bf m} \in {\bb Z}^{3}/{\bf N} {\bb Z}^{3}] in real space may be written [{\bf m} = {\bf m}_{1} + {\bf N}_{1} {\bf m}_{2}] with [{\bf m}_{1} \in {\bb Z}^{3}/{\bf N}_{1} {\bb Z}^{3}] and [{\bf m}_{2} \in {\bb Z}^{3}/{\bf N}_{2} {\bb Z}^{3}] determined by [\let\normalbaselines\relax\openup4pt\matrix{ {\bf m}_{1} = {\bf m}\hfill &\hbox{ mod } {\bf N}_{1} {\bb Z}^{3},\hfill\cr {\bf m}_{2} = {\bf N}_{1}^{-1} ({\bf m} - {\bf m}_{1})\hfill &\hbox{ mod } {\bf N}_{2} {\bb Z}^{3}.\hfill}] These relations establish a one-to-one correspondence [{\bf m} \leftrightarrow ({\bf m}_{1}, {\bf m}_{2})] between [I = {\bb Z}^{3}/{\bf N} {\bb Z}^{3}] and the Cartesian product [I_{1} \times I_{2}] of [I_{1} = {\bb Z}^{3}/{\bf N}_{1} {\bb Z}^{3}] and [I_{2} = {\bb Z}^{3}/{\bf N}_{2} {\bb Z}^{3}], and hence [I \cong I_{1} \times I_{2}] as a set. However [I \not \cong I_{1} \times I_{2}] as an Abelian group, since in general [{\bf m} + {\bf m}'\;\; {\not{\hbox to -7pt{}}\longleftrightarrow} ({\bf m}_{1} + {\bf m}'_{1}, {\bf m}_{2} + {\bf m}'_{2})] because there can be a `carry' from the addition of the first components into the second components; therefore, [I \not \cong I_{1} \times I_{2}] as a [{\bb Z}G]-module, which shows that the incorporation of symmetry into the Cooley–Tukey algorithm is not a trivial matter.
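
The following short sketch (with assumed diagonal factors [{\bf N}_{1} = 2{\bf I}] and [{\bf N}_{2} = 3{\bf I}]; purely illustrative) exhibits the one-to-one correspondence [{\bf m} \leftrightarrow ({\bf m}_{1}, {\bf m}_{2})] and the `carry' that spoils its compatibility with addition.

```python
import numpy as np
from itertools import product

# Short sketch of the correspondence m <-> (m1, m2) for N = N1 N2, with the
# assumed diagonal factors N1 = 2I and N2 = 3I (so N = 6I); illustrative only.
d1, d2 = np.array([2, 2, 2]), np.array([3, 3, 3])
d = d1 * d2                                    # diagonal of N

def split(m):
    m1 = m % d1                                # m1 = m               mod N1 Z^3
    m2 = ((m - m1) // d1) % d2                 # m2 = N1^{-1}(m - m1) mod N2 Z^3
    return m1, m2

def join(m1, m2):
    return (m1 + d1 * m2) % d                  # m = m1 + N1 m2

# one-to-one as sets:
for m in product(range(6), repeat=3):
    m = np.array(m)
    assert np.array_equal(join(*split(m)), m)

# ... but not compatible with addition: a 'carry' appears in the m2 component
a = np.array([1, 0, 0])
a1, a2 = split(a)
s1, s2 = split((a + a) % d)
assert np.array_equal(s1, (a1 + a1) % d1)      # residues mod N1 do add
assert not np.array_equal(s2, (a2 + a2) % d2)  # carry from m1 into m2
```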

Let [g \in G] act on I through [g: \quad {\bf m} \;\longmapsto\; S_{g} ({\bf m}) = {\bf R}_{g} {\bf m} + {\bf Nt}_{g} \hbox{ mod } {\bf N}{\bb Z}^{3}] and suppose that N `integerizes' all the non-primitive translations [{\bf t}_{g}] so that we may write [{\bf Nt}_{g} = {\bf t}_{g}^{(1)} + {\bf N}_{1} {\bf t}_{g}^{(2)},] with [{\bf t}_{g}^{(1)} \in I_{1}] and [{\bf t}_{g}^{(2)} \in I_{2}] determined as above. Suppose further that N, [{\bf N}_{1}] and [{\bf N}_{2}] commute with [{\bf R}_{g}] for all [g \in G], i.e. (by Schur's lemma, Section 1.3.4.2.2.4[link]) that these matrices are integer multiples of the identity in each G-invariant subspace. The action of g on [{\bf m} = {\bf Nx} \hbox{ mod } {\bf N}{\bb Z}^{3}] leads to [\let\normalbaselines\relax\openup4pt\matrix{S_{g} ({\bf m}) = {\bf N} [{\bf R}_{g} ({\bf N}^{-1} {\bf m}) + {\bf t}_{g}]\hfill &\hbox{ mod } {\bf N}{\bb Z}^{3}\hfill \cr \phantom{S_{g} ({\bf m})}= {\bf NR}_{g} {\bf N}^{-1} ({\bf m}_{1} + {\bf N}_{1} {\bf m}_{2}) + {\bf t}_{g}^{(1)} + {\bf N}_{1} {\bf t}_{g}^{(2)}\hfill &\hbox{ mod } {\bf N}{\bb Z}^{3}\hfill \cr  \phantom{S_{g} ({\bf m})}= {\bf R}_{g} {\bf m}_{{1}} + {\bf t}_{g}^{(1)} + {\bf N}_{1} ({\bf R}_{g} {\bf m}_{2} + {\bf t}_{g}^{(2)})\hfill &\hbox{ mod } {\bf N}{\bb Z}^{3},\hfill}] which we may decompose as [S_{g} ({\bf m}) = [S_{g} ({\bf m})]_{1} + {\bf N}_{1} [S_{g} ({\bf m})]_{2}] with [[S_{g} ({\bf m})]_{1} \equiv S_{g} ({\bf m}) \qquad\hbox{mod } {\bf N}_{1}{\bb Z}^{3}] and [[S_{g} ({\bf m})]_{2} \equiv {\bf N}_{1}^{-1} \{S_{g} ({\bf m}) - [S_{g} ({\bf m})]_{1}\} \qquad\hbox{mod } {\bf N}_{2}{\bb Z}^{3}.]

Introducing the notation [\eqalign{ S_{g}^{(1)} ({\bf m}_{1}) &= {\bf R}_{g} {\bf m}_{1} + {\bf t}_{g}^{(1)} \hbox{ mod } {\bf N}_{1}{\bb Z}^{3},\cr S_{g}^{(2)} ({\bf m}_{2}) &= {\bf R}_{g} {\bf m}_{2} + {\bf t}_{g}^{(2)} \hbox{ mod } {\bf N}_{2}{\bb Z}^{3},}] the two components of [S_{g} ({\bf m})] may be written [\eqalign{ [S_{g} ({\bf m})]_{1} &= S_{g}^{(1)} ({\bf m}_{1}),\cr [S_{g} ({\bf m})]_{2} &= S_{g}^{(2)} ({\bf m}_{2}) + {\boldmu}_{2} (g, {\bf m}_{1}) \hbox{ mod } {\bf N}_{2}{\bb Z}^{3},}] with [{\boldmu}_{2} (g, {\bf m}_{1}) = {\bf N}_{1}^{-1} \{({\bf R}_{g} {\bf m}_{1} + {\bf t}_{g}^{(1)}) - [S_{g} ({\bf m}_{1})]_{1}\} \hbox{ mod } {\bf N}_{2}{\bb Z}^{3}.]

The term [\boldmu_{2}] is the geometric equivalent of a carry or borrow: it arises because [{\bf R}_{g} {\bf m}_{1} + {\bf t}_{g}^{(1)}], calculated as a vector in [{\bb Z}^{3}/{\bf N}{\bb Z}^{3}], may be outside the unit cell [{\bf N}_{1} [0, 1]^{3}], and may need to be brought back into it by a `large' translation with a non-zero component in the [{\bf m}_{2}] space; equivalently, the action of g may need to be applied around different permissible origins for different values of [{\bf m}_{1}], so as to map the unit cell into itself without any recourse to lattice translations. [Readers familiar with the cohomology of groups (see e.g. Hall, 1959[link]; MacLane, 1963[link]) will recognize [\boldmu_{2}] as the cocycle of the extension of [{\bb Z}] G-modules described by the exact sequence [0 \rightarrow I_{2} \rightarrow I \rightarrow I_{1} \rightarrow 0].]
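
A minimal numerical illustration of this carry, assuming the centrosymmetric group G = {e, -e} with [{\bf N}_{1} = 2{\bf I}] and [{\bf N}_{2} = 4{\bf I}] (all choices illustrative only), is the following sketch, which checks the component decomposition of [S_{g} ({\bf m})] given above and evaluates [\boldmu_{2} (g, {\bf m}_{1})] explicitly.

```python
import numpy as np

# Sketch of the 'carry' mu_2, assuming G = {e, -e} (inversion at the origin),
# N1 = 2I, N2 = 4I, N = 8I; all choices are illustrative only.
Rg = -np.eye(3, dtype=int)                     # R_g for -e
Ntg = np.zeros(3, dtype=int)                   # N t_g = 0 here
tg1 = Ntg % 2                                  # t_g^(1)
tg2 = ((Ntg - tg1) // 2) % 4                   # t_g^(2)

def S(m):                                      # S_g(m) = R_g m + N t_g  mod N Z^3
    return (Rg @ m + Ntg) % 8

def split(m):                                  # m = m1 + N1 m2
    m1 = m % 2
    return m1, ((m - m1) // 2) % 4

def S1(m1):                                    # S_g^(1)(m1)
    return (Rg @ m1 + tg1) % 2

def S2(m2):                                    # S_g^(2)(m2)
    return (Rg @ m2 + tg2) % 4

def mu2(m1):                                   # mu_2(g, m1): the integer 'carry'
    A = Rg @ m1 + tg1
    return ((A - A % 2) // 2) % 4

# check the component decomposition of S_g(m) quoted above, on every grid point
for m in np.ndindex(8, 8, 8):
    m = np.array(m)
    m1, m2 = split(m)
    s1, s2 = split(S(m))
    assert np.array_equal(s1, S1(m1))
    assert np.array_equal(s2, (S2(m2) + mu2(m1)) % 4)

print(mu2(np.array([1, 1, 0])))                # [3 3 0]: a non-zero carry
```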

Thus G acts on I in a rather complicated fashion: although [g \;\longmapsto\; S_{g}^{(1)}] does define a left action in [I_{1}] alone, no action can be defined in [I_{2}] alone because [\boldmu_{2}] depends on [{\bf m}_{1}]. However, because [S_{g}], [S_{g}^{(1)}] and [S_{g}^{(2)}] are left actions, it follows that [\boldmu_{2}] satisfies the identity [{\boldmu}_{2} (gg', {\bf m}_{1}) = S_{g}^{(2)} [{\boldmu}_{2} (g', {\bf m}_{1})] + {\boldmu}_{2} [g, S_{g}^{(1)} ({\bf m}_{1})] \qquad\hbox{mod } {\bf N}_{2} {\bb Z}^{3}] for all g, [g'] in G and all [{\bf m}_{1}] in [I_{1}]. In particular, [\boldmu_{2} (\hbox{e}, {\bf m}_{1}) = {\bf 0}] for all [{\bf m}_{1}], and [{\boldmu}_{2} (g^{-1}, {\bf m}_{1}) = -S_{g^{-1}}^{(2)} \{{\boldmu}_{2} [g, S_{g^{-1}}^{(1)} ({\bf m}_{1})]\} \hbox{ mod } {\bf N}_{2} {\bb Z}^{3}.]

This action will now be used to achieve optimal use of symmetry in the multidimensional Cooley–Tukey algorithm of Section 1.3.3.3.2.1.[link] Let us form an array Y according to [Y({\bf m}_{1}, {\bf m}_{2}) = \rho ({\bf m}_{1} + {\bf N}_{1} {\bf m}_{2})] for all [{\bf m}_{2} \in I_{2}] but only for the unique [{\bf m}_{1}] under the action [S_{g}^{(1)}] of G in [I_{1}]. Except in special cases which will be examined later, these vectors contain essentially an asymmetric unit of electron-density data, up to some redundancies on boundaries. We may then compute the partial transform on [{\bf m}_{2}]: [Y^{*} ({\bf m}_{1}, {\bf h}_{2}) = {1 \over |\hbox{det } {\bf N}_{2}|} {\sum\limits_{{\bf m}_{2} \in I_{2}}} Y({\bf m}_{1}, {\bf m}_{2}) e[{\bf h}_{2} \cdot ({\bf N}_{2}^{-1} {\bf m}_{2})].] Using the symmetry of [\rho\llap{$-\!$}] in the form [\rho\llap{$-\!$} = S_{g}^{\#} \rho\llap{$-\!$}] yields by the procedure of Section 1.3.3.3.2[link] the transposition formula [\eqalign{ Y^{*} (S_{g}^{(1)} ({\bf m}_{1}), {\bf h}_{2}) &= e\{{\bf h}_{2} \cdot [{\bf N}_{2}^{-1} ({\bf t}_{g}^{(2)} + {\boldmu}_{2} (g, {\bf m}_{1}))]\}\cr &\quad \times Y^{*} ({\bf m}_{1}, [{\bf R}_{g}^{(2)}]^{T} {\bf h}_{2}).}]

By means of this identity we can transpose intermediate results [Y^{*}] initially indexed by [(\hbox{unique } {\bf m}_{1}) \times (\hbox{all } {\bf h}_{2}),] so as to have them indexed by [(\hbox{all } {\bf m}_{1}) \times (\hbox{unique } {\bf h}_{2}).] We may then apply twiddle factors to get [Z({\bf m}_{1}, {\bf h}_{2}) = e[{\bf h}_{2} \cdot ({\bf N}^{-1} {\bf m}_{1})] Y^{*} ({\bf m}_{1}, {\bf h}_{2})] and carry out the second transform [Z^{*} ({\bf h}_{1}, {\bf h}_{2}) = {1 \over |\hbox{det } {\bf N}_{1}|} {\sum\limits_{{\bf m}_{1} \in I_{1}}} Z({\bf m}_{1}, {\bf h}_{2}) e[{\bf h}_{1} \cdot ({\bf N}_{1}^{-1} {\bf m}_{1})].] The final results are indexed by [(\hbox{all } {\bf h}_{1}) \times (\hbox{unique } {\bf h}_{2}),] which yield essentially an asymmetric unit of structure factors after unscrambling by: [F({\bf h}_{2} + {\bf N}_{2}^{T} {\bf h}_{1}) = Z^{*} ({\bf h}_{1}, {\bf h}_{2}).]
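
For orientation, the following sketch reproduces this data flow (decimation, partial transform on [{\bf m}_{2}], twiddle factors, transform on [{\bf m}_{1}], coset reversal) in the absence of symmetry, for assumed diagonal factors [{\bf N}_{1} = 2{\bf I}] and [{\bf N}_{2} = 3{\bf I}], and checks it against a direct evaluation of the DFT; the transposition (orbit-exchange) step of the symmetrized algorithm is omitted, and all names are illustrative only.

```python
import numpy as np
from itertools import product

# Two-stage evaluation (decimation, partial transform on m2, twiddle factors,
# transform on m1, coset reversal) without symmetry, for the assumed diagonal
# factors N1 = 2I, N2 = 3I (N = 6I); checked against a direct DFT.
n1, n2, n = 2, 3, 6
I1 = list(product(range(n1), repeat=3))
I2 = list(product(range(n2), repeat=3))

rng = np.random.default_rng(2)
rho = {m: rng.standard_normal() for m in product(range(n), repeat=3)}

def e(x):                                      # e[x] = exp(2 pi i x)
    return np.exp(2j * np.pi * x)

# Y(m1, m2) = rho(m1 + N1 m2)
Y = {(m1, m2): rho[tuple((np.array(m1) + n1 * np.array(m2)) % n)]
     for m1 in I1 for m2 in I2}

# first (partial) transform, on m2
Ys = {(m1, h2): sum(Y[m1, m2] * e(np.dot(h2, m2) / n2) for m2 in I2) / n2**3
      for m1 in I1 for h2 in I2}

# twiddle factors, then second transform, on m1
Z = {(m1, h2): e(np.dot(h2, m1) / n) * Ys[m1, h2] for m1 in I1 for h2 in I2}
Zs = {(h1, h2): sum(Z[m1, h2] * e(np.dot(h1, m1) / n1) for m1 in I1) / n1**3
      for h1 in I1 for h2 in I2}

def F_direct(h):                               # reference: one-stage DFT
    return sum(v * e(np.dot(h, m) / n) for m, v in rho.items()) / n**3

# unscrambling: F(h2 + N2^T h1) = Z*(h1, h2)
for h1 in I1:
    for h2 in I2:
        h = (np.array(h2) + n2 * np.array(h1)) % n
        assert np.isclose(Zs[h1, h2], F_direct(h))
```

In the symmetrized version described in the text, the transposition formula would be applied to [Y^{*}] between the first transform and the twiddle-factor stage, so that only unique values of [{\bf m}_{1}] and [{\bf h}_{2}] need ever be stored.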

The transposition formula above applies to intermediate results when going backwards from F to [\rho\llap{$-\!$}], provided these results are considered after the twiddle-factor stage. A transposition formula applicable before that stage can be obtained by characterizing the action of G on h (including the effects of periodization by [{\bf N}^{T} {\bb Z}^{3}]) in a manner similar to that used for m.

Let [{\bf h} = {\bf h}_{2} + {\bf N}_{2}^{T} {\bf h}_{1},] with [\let\normalbaselines\relax\openup4pt\matrix{{\bf h}_{2} = {\bf h}\hfill &\hbox{mod } {\bf N}_{2}^{T} {\bb Z}^{3},\hfill \cr {\bf h}_{1} = ({\bf N}_{2}^{-1})^{T} ({\bf h} - {\bf h}_{2}) \hfill & \hbox{mod } {\bf N}_{1}^{T} {\bb Z}^{3}.\hfill}] We may then write [{\bf R}_{g}^{T} {\bf h} = [{\bf R}_{g}^{T} {\bf h}]_{2} + {\bf N}_{2}^{T} [{\bf R}_{g}^{T} {\bf h}]_{1},] with [\let\normalbaselines\relax\openup4pt\matrix{ [{\bf R}_{g}^{T} {\bf h}]_{2} = [{\bf R}_{g}^{(2)}]^{T} {\bf h}_{2} \hfill &\hbox{mod } {\bf N}_{2}^{T} {\bb Z}^{3},\hfill \cr [{\bf R}_{g}^{T} {\bf h}]_{1} = [{\bf R}_{g}^{(1)}]^{T} {\bf h}_{1} + {\boldeta}_{1} (g, {\bf h}_{2}) \hfill &\hbox{mod } {\bf N}_{1}^{T} {\bb Z}^{3}.\hfill}] Here [[{\bf R}_{g}^{(2)}]^{T}, [{\bf R}_{g}^{(1)}]^{T}] and [\boldeta_{1}] are defined by [\eqalign{ [{\bf R}_{g}^{(2)}]^{T} {\bf h}_{2} &= {\bf R}_{g}^{T} {\bf h}\quad \quad \hbox{ mod } {\bf N}_{2}^{T} {\bb Z}^{3},\cr [{\bf R}_{g}^{(1)}]^{T} {\bf h}_{1} &= {\bf R}_{g}^{T} {\bf h}\quad \quad \hbox{ mod } {\bf N}_{1}^{T} {\bb Z}^{3}}] and [{\boldeta}_{1} (g, {\bf h}_{2}) = ({\bf N}_{2}^{-1})^{T} ({\bf R}_{g}^{T} {\bf h}_{2} - [{\bf R}_{g}^{(2)}]^{T} {\bf h}_{2}) \hbox{ mod } {\bf N}_{1}^{T} {\bb Z}^{3}.]

Let us then form an array [Z^{*}] according to [Z^{*} ({\bf h}'_{1}, {\bf h}'_{2}) = F({\bf h}'_{2} + {\bf N}_{2}^{T} {\bf h}'_{1})] for all [{\bf h}'_{1}] but only for the unique [{\bf h}'_{2}] under the action of G in [{\bb Z}^{3} / {\bf N}_{2}^{T} {\bb Z}^{3}], and transform on [{\bf h}'_{1}] to obtain [Z({\bf m}_{1}, {\bf h}_{2}) = {\textstyle\sum\limits_{{\bf h}'_{1} \in {\bb Z}^{3} / {\bf N}_{1}^{T} {\bb Z}^{3}}} Z^{*}({\bf h}'_{1}, {\bf h}'_{2}) e[-{\bf h}'_{1} \cdot ({\bf N}_{1}^{-1} {\bf m}_{1})].] Putting [{\bf h}' = {\bf R}_{g}^{T}{\bf h}] and using the symmetry of F in the form [F({\bf h}') = F({\bf h}) \exp (-2\pi i{\bf h} \cdot {\bf t}_{g}),] where [\eqalign{ {\bf h} \cdot {\bf t}_{g} &= ({\bf h}_{2}^{T} + {\bf h}_{1}^{T} {\bf N}_{2}) ({\bf N}_{2}^{-1} {\bf N}_{1}^{-1}) ({\bf t}_{g}^{(1)} + {\bf N}_{1} {\bf t}_{g}^{(2)})\cr &\equiv {\bf h}_{2} \cdot {\bf t}_{g} + {\bf h}_{1} \cdot ({\bf N}_{1}^{-1} {\bf t}_{g}^{(1)}) \hbox{ mod } 1}] yields by a straightforward rearrangement [\eqalign{Z({\bf m}_{1}, [{\bf R}_{g}^{(2)}]^{T} {\bf h}_{2}) &= e[-\{{\bf h}_{2} \cdot {\bf t}_{g} + \boldeta_{1} (g, {\bf h}_{2}) \cdot ({\bf N}_{1}^{-1} {\bf m}_{1})\}]\cr &\quad \times Z\{S_{g}^{(1)} ({\bf m}_{1}), {\bf h}_{2}\}.}]

This formula allows the transposition of intermediate results Z from an indexing by [(\hbox{all } {\bf m}_{1}) \times (\hbox{unique } {\bf h}_{2})] to an indexing by [(\hbox{unique } {\bf m}_{1}) \times (\hbox{all } {\bf h}_{2}).] We may then apply the twiddle factors to obtain [Y^{*} ({\bf m}_{1}, {\bf h}_{2}) = e[-{\bf h}_{2} \cdot ({\bf N}^{-1} {\bf m}_{1})] Z ({\bf m}_{1}, {\bf h}_{2})] and carry out the second transform on [{\bf h}_{2}] [Y({\bf m}_{1}, {\bf m}_{2}) = {\textstyle\sum\limits_{{\bf h}_{2} \in {\bb Z}^{3}/{\bf N}_{2}^{T} {\bb Z}^{3}}} Y^{*} ({\bf m}_{1}, {\bf h}_{2}) e[-{\bf h}_{2} \cdot ({\bf N}_{2}^{-1} {\bf m}_{2})].] The results, indexed by [(\hbox{unique } {\bf m}_{1}) \times (\hbox{all } {\bf m}_{2})] yield essentially an asymmetric unit of electron densities by the rearrangement [\rho\llap{$-\!$} ({\bf m}_{1} + {\bf N}_{1}{\bf m}_{2}) = Y({\bf m}_{1}, {\bf m}_{2}).]

The equivalence of the two transposition formulae up to the intervening twiddle factors is readily established, using the relation [{\bf h}_{2} \cdot [{\bf N}_{2}^{-1} {\boldmu}_{2} (g, {\bf m}_{1})] = \boldeta_{1} (g, {\bf h}_{2}) \cdot ({\bf N}_{1}^{-1} {\bf m}_{1}) \hbox{ mod } 1] which is itself a straightforward consequence of the identity [{\bf h} \cdot [{\bf N}^{-1} S_{g} ({\bf m})] = {\bf h} \cdot {\bf t}_{g} + ({\bf R}_{g}^{T}{\bf h}) \cdot ({\bf N}^{-1} {\bf m}).]

To complete the characterization of the effect of symmetry on the Cooley–Tukey factorization, and of the economy of computation it allows, it remains to consider the possibility that some values of [{\bf m}_{1}] may be invariant under some transformations [g \in G] under the action [{\bf m}_{1} \;\longmapsto\; S_{g}^{(1)} ({\bf m}_{1})].

Suppose that [{\bf m}_{1}] has a non-trivial isotropy subgroup [G_{{\bf m}_{1}}], and let [g \in G_{{\bf m}_{1}}]. Then each subarray [Y_{{\bf m}_{1}}] defined by [Y_{{\bf m}_{1}} ({\bf m}_{2}) = Y({\bf m}_{1}, {\bf m}_{2}) = \rho ({\bf m}_{1} + {\bf N}_{1}{\bf m}_{2})] satisfies the identity [\eqalign{ Y_{{\bf m}_{1}} ({\bf m}_{2}) &= Y_{S_{g}^{(1)} ({\bf m}_{1})} [S_{g}^{(2)} ({\bf m}_{2}) + {\boldmu}_{2} (g, {\bf m}_{1})]\cr &= Y_{{\bf m}_{1}} [S_{g}^{(2)} ({\bf m}_{2}) + {\boldmu}_{2} (g, {\bf m}_{1})]}] so that the data for the transform on [{\bf m}_{2}] have residual symmetry properties. In this case the identity satisfied by [\boldmu_{2}] simplifies to [{\boldmu}_{2}(gg', {\bf m}_{1}) = S_{g}^{(2)} [{\boldmu}_{2} (g', {\bf m}_{1})] + {\boldmu}_{2} (g, {\bf m}_{1}) \hbox{ mod } {\bf N}_{2} {\bb Z}^{3},] which shows that the mapping [g \;\longmapsto\; \boldmu_{2} (g, {\bf m}_{1})] satisfies the Frobenius congruences (Section 1.3.4.2.2.3[link]). Thus the internal symmetry of subarray [Y_{{\bf m}_{1}}] with respect to the action of G on [{\bf m}_{2}] is given by [G_{{\bf m}_{1}}] acting on [{\bb Z}^{3} / {\bf N}_{2} {\bb Z}^{3}] via [{\bf m}_{2} \;\longmapsto\; S_{g}^{(2)} ({\bf m}_{2}) + {\boldmu}_{2} (g, {\bf m}_{1}) \hbox{ mod } {\bf N}_{2} {\bb Z}^{3}.]

The transform on [{\bf m}_{2}] need only be performed for one out of [[G:G_{{\bf m}_{1}}]] distinct arrays [Y_{{\bf m}_{1}}] (results for the others being obtainable by the transposition formula), and this transform is [G_{{\bf m}_{1}}]-symmetric. In other words, the following cases occur: [\let\normalbaselines\relax\openup3pt\matrix{(\hbox{i})\hfill &G_{{\bf m}_{1}} = \{e\}\hfill & \hbox{maximum saving in computation}\hfill\cr & & (\hbox{by } |G|)\hbox{;}\hfill\cr &&{\bf m}_{2}\hbox{-transform has no symmetry}.\hfill\cr (\hbox{ii})\hfill & G_{{\bf m}_{1}} = G' \;\lt\; G\hfill &\hbox{saving in computation by a factor}\hfill\cr && \hbox{of } [G:G']\hbox{;}\hfill\cr && {\bf m}_{2}\hbox{-transform is } G'\hbox{-symmetric}.\hfill \cr (\hbox{iii})\hfill & G_{{\bf m}_{1}} = G \hfill & \hbox{no saving in computation}\hbox{;}\hfill\cr && {\bf m}_{2}\hbox{-transform is } G\hbox{-symmetric}.\hfill}]

The symmetry properties of the [{\bf m}_{2}]-transform may themselves be exploited in a similar way if [{\bf N}_{2}] can be factored as a product of smaller decimation matrices; otherwise, an appropriate symmetrized DFT routine may be provided, using for instance the idea of `multiplexing/demultiplexing' (Section 1.3.4.3.5[link]). We thus have a recursive descent procedure, in which the deeper stages of the recursion deal with transforms on fewer points, or of lower symmetry (usually both).

The same analysis applies to the [{\bf h}_{1}]-transforms on the subarrays [Z_{{\bf h}_{2}}^{*}], and leads to a similar descent procedure.

In conclusion, crystallographic symmetry can be fully exploited to reduce the amount of computation to the minimum required to obtain the unique results from the unique data. No such analysis was previously available in cases where the asymmetric units in real and reciprocal space are not parallelepipeds. An example of this procedure will be given in Section 1.3.4.3.6.5[link].

1.3.4.3.4.2. Multidimensional Good factorization

| top | pdf |

This procedure was described in Section 1.3.3.3.2.2[link]. The main difference from the Cooley–Tukey factorization is that if [{\bf N} = {\bf N}_{1}{\bf N}_{2} \ldots {\bf N}_{d-1} {\bf N}_{d}], where the different factors are pairwise coprime, then the Chinese remainder theorem reindexing makes [{\bb Z}^{3}/{\bf N}{\bb Z}^{3}] isomorphic to the direct sum [{\bb Z}^{3}/{\bf N}{\bb Z}^{3} \cong ({\bb Z}^{3}/{\bf N}_{1}{\bb Z}^{3}) \oplus \ldots \oplus ({\bb Z}^{3}/{\bf N}_{d}{\bb Z}^{3}),] where each p-primary piece is endowed with an induced [{\bb Z}G]-module structure by letting G operate in the usual way but with the corresponding modular arithmetic. The situation is thus more favourable than with the Cooley–Tukey method, since there is no interference between the factors (no `carry'). In the terminology of Section 1.3.4.2.2.2[link], G acts diagonally on this direct sum, and results of a partial transform may be transposed by orbit exchange as in Section 1.3.4.3.4.1[link] but without the extra terms μ or η. The analysis of the symmetry properties of partial transforms also carries over, again without the extra terms. Further simplification occurs for all p-primary pieces with p other than 2 or 3, since all non-primitive translations (including those associated with lattice centring) disappear modulo p.
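
The absence of a carry can be made explicit in a small sketch, assuming the coprime diagonal factors [{\bf N}_{1} = 3{\bf I}] and [{\bf N}_{2} = 4{\bf I}] (illustrative only): the CRT reindexing is bijective and respects addition component-wise, in contrast with the Cooley–Tukey splitting.

```python
import numpy as np
from itertools import product

# Sketch of the CRT reindexing for the assumed coprime diagonal factors
# N1 = 3I and N2 = 4I (so N = 12I); illustrative only.
n1, n2, n = 3, 4, 12

def crt_split(m):                     # Z^3/N Z^3 -> (Z^3/N1 Z^3) (+) (Z^3/N2 Z^3)
    return m % n1, m % n2

def crt_join(m1, m2):                 # 4*(4^{-1} mod 3) = 4,  3*(3^{-1} mod 4) = 9
    return (4 * m1 + 9 * m2) % n

shift = np.array([5, 7, 11])
for m in product(range(n), repeat=3):
    m = np.array(m)
    m1, m2 = crt_split(m)
    assert np.array_equal(crt_join(m1, m2), m)            # bijective
    p1, p2 = crt_split((m + shift) % n)
    assert np.array_equal(p1, (m1 + shift) % n1)          # addition acts
    assert np.array_equal(p2, (m2 + shift) % n2)          # component-wise: no carry
```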

Thus the cost of the CRT reindexing is compensated by the computational savings due to the absence of twiddle factors and of other phase shifts associated with non-primitive translations and with geometric `carries'.

Within each p-primary piece, however, higher powers of p may need to be split up by a Cooley–Tukey factorization, or carried out directly by a suitably adapted Winograd algorithm.

1.3.4.3.4.3. Crystallographic extension of the Rader/Winograd factorization

| top | pdf |

As was the case in the absence of symmetry, the two previous classes of algorithms can only factor the global transform into partial transforms on prime numbers of points, but cannot break the latter down any further. Rader's idea of using the action of the group of units [U(p)] to obtain further factorization of a p-primary transform has been used in `scalar' form by Auslander & Shenefelt (1987)[link], Shenefelt (1988)[link], and Auslander et al. (1988)[link]. It will be shown here that it can be adapted to the crystallographic case so as to take advantage also of the possible existence of n-fold cyclic symmetry elements [(n = 3,\;4,\;6)] in a two-dimensional transform (Bricogne & Tolimieri, 1990[link]). This adaptation entails the use of certain rings of algebraic integers rather than ordinary integers, whose connection with the handling of cyclic symmetry will now be examined.

Let G be the group associated with a threefold axis of symmetry: [G = \{e, g, g^{2}\}] with [g^{3} = e]. In a standard trigonal basis, G has matrix representation [{\bf R}_{e} = \pmatrix{1 &0\cr 0 &1\cr} = {\bf I}, \quad {\bf R}_{g} = \pmatrix{0 &-1\cr 1 &-1\cr}, \quad {\bf R}_{g^{2}} = \pmatrix{-1 &1\cr -1 &0\cr}] in real space, [{\bf R}_{e}^{*} = \pmatrix{1 &0\cr 0 &1\cr} = {\bf I}, \quad {\bf R}_{g}^{*} = \pmatrix{-1 &-1\cr \;\;\;1 &\;\;\;0\cr}, \quad {\bf R}_{g^{2}}^{*} = \pmatrix{\;\;\;0 &\;\;\;1\cr -1 &-1\cr}] in reciprocal space. Note that [{\bf R}_{g^{2}}^{*} = [{\bf R}_{g^{2}}^{-1}]^{T} = {\bf R}_{g}^{T},] and that [{\bf R}_{g}^{T} = {\bf J}^{-1} {\bf R}_{g}{\bf J},\quad \hbox{ where } {\bf J} = \pmatrix{1 &\;\;\;0\cr 0 &-1\cr}] so that [{\bf R}_{g}] and [{\bf R}_{g}^{T}] are conjugate in the group of [2\times 2] unimodular integer matrices. The group ring [{\bb Z}G] is commutative, and has the structure of the polynomial ring [{\bb Z}[X]] with the single relation [X^{2} + X + 1 = 0] corresponding to the minimal polynomial of [{\bf R}_{g}]. In the terminology of Section 1.3.3.2.4[link], the ring structure of [{\bb Z}G] is obtained from that of [{\bb Z}[X]] by carrying out polynomial addition and multiplication modulo [X^{2} + X + 1], then replacing X by any generator of G. This type of construction forms the very basis of algebraic number theory [see Artin (1944[link], Section IIc) for an illustration of this viewpoint], and [{\bb Z}G] as just defined is isomorphic to the ring [{\bb Z}[\omega]] of algebraic integers of the form [a + b\omega\ [a, b\in {\bb Z},\omega = \exp (2\pi i/3)]] under the identification [X \leftrightarrow \omega]. Addition in this ring is defined component-wise, while multiplication is defined by [\eqalign{(a_{1} + b_{1} \omega) \times (a_{2} + b_{2} \omega) &= (a_{1} a_{2} - b_{1} b_{2}) \cr&\quad+{ [(a_{1} - b_{1}) b_{2} + b_{1} a_{2}] \omega.}\cr}]
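
The multiplication law just quoted is easily checked against ordinary complex arithmetic, as in the following short sketch (illustrative only); the first assertion verifies that [{\bf R}_{g}] indeed satisfies its minimal polynomial [X^{2} + X + 1].

```python
import numpy as np

# Check of the multiplication law in Z[omega] quoted above against ordinary
# complex arithmetic, together with the minimal polynomial of R_g.
Rg = np.array([[0, -1], [1, -1]])
assert np.array_equal(Rg @ Rg + Rg + np.eye(2, dtype=int), np.zeros((2, 2), dtype=int))

omega = np.exp(2j * np.pi / 3)

def mult(a1, b1, a2, b2):
    # (a1 + b1*omega)(a2 + b2*omega) = (a1*a2 - b1*b2) + [(a1 - b1)*b2 + b1*a2]*omega
    return a1 * a2 - b1 * b2, (a1 - b1) * b2 + b1 * a2

rng = np.random.default_rng(0)
for _ in range(100):
    a1, b1, a2, b2 = rng.integers(-10, 10, size=4)
    a, b = mult(a1, b1, a2, b2)
    assert np.isclose((a1 + b1 * omega) * (a2 + b2 * omega), a + b * omega)
```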

In the case of a fourfold axis, [G = \{e, g, g^{2}, g^{3}\}] with [g^{4} = e], and [{\bf R}_{g} = \pmatrix{0 &-1\cr 1 &\;\;\;0\cr} = {\bf R}_{g}^{*}, \quad \hbox{with again } {\bf R}_{g}^{T} = {\bf J}^{-1} {\bf R}_{g}{\bf J}.] [{\bb Z}G] is obtained from [{\bb Z}[X]] by carrying out polynomial arithmetic modulo [X^{2} + 1]. This identifies [{\bb Z}G] with the ring [{\bb Z}[i]] of Gaussian integers of the form [a + bi], in which addition takes place component-wise while multiplication is defined by [(a_{1} + b_{1} i) \times (a_{2} + b_{2} i) = (a_{1} a_{2} - b_{1} b_{2}) + (a_{1} b_{2} + b_{1} a_{2})i.]

In the case of a sixfold axis, [G = \{e, g, g^{2}, g^{3}, g^{4}, g^{5}\}] with [g^{6} = e], and [{\bf R}_{g} = \pmatrix{1 &-1\cr 1 &\;\;\;0\cr}, \quad {\bf R}_{g}^{*} = \pmatrix{0 &-1\cr 1 &\;\;\;1\cr}, \quad {\bf R}_{g}^{T} = {\bf J}^{-1} {\bf R}_{g}{\bf J}.] [{\bb Z}G] is isomorphic to [{\bb Z}[\omega]] under the mapping [g \leftrightarrow 1 + \omega] since [(1 + \omega)^{6} = 1].

Thus in all cases [{\bb Z}G \cong {\bb Z}[X]/P(X)] where [P(X)] is an irreducible quadratic polynomial with integer coefficients.

The actions of G on lattices in real and reciprocal space (Sections 1.3.4.2.2.4[link], 1.3.4.2.2.5[link]) extend naturally to actions of [{\bb Z}G] on [{\bb Z}^{2}] in which an element [z = a + bg] of [{\bb Z}G] acts via [{\bf m} = \pmatrix{m_{1}\cr m_{2}\cr} \;\longmapsto\; z{\bf m} = (a{\bf I} + b{\bf R}_{g}) \pmatrix{{\bf m}_{1}\cr {\bf m}_{2}\cr}] in real space, and via [{\bf h} = \pmatrix{h_{1}\cr h_{2}\cr} \;\longmapsto\; z{\bf h} = (a{\bf I} + b{\bf R}_{g}^{T}) \pmatrix{h_{1}\cr h_{2}\cr}] in reciprocal space. These two actions are related by conjugation, since [(a{\bf I} + b{\bf R}_{g}^{T}) = {\bf J}^{-1} (a{\bf I} + b{\bf R}_{g}){\bf J}] and the following identity (which is fundamental in the sequel) holds: [(z{\bf h}) \cdot {\bf m} = {\bf h} \cdot (z{\bf m}) \quad \hbox{for all } {\bf m},{\bf h} \in {\bb Z}^{2}.]
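
This identity may be checked numerically as follows (a brief sketch for the threefold case; values are illustrative only).

```python
import numpy as np

# Numerical check of the identity (z h).m = h.(z m) for the threefold case:
# z = a + b g acts through aI + bR_g in real space and aI + bR_g^T in
# reciprocal space; values are illustrative only.
Rg = np.array([[0, -1], [1, -1]])
I2 = np.eye(2, dtype=int)
rng = np.random.default_rng(0)
for _ in range(200):
    a, b = rng.integers(-5, 6, size=2)
    m = rng.integers(-10, 10, size=2)
    h = rng.integers(-10, 10, size=2)
    assert ((a * I2 + b * Rg.T) @ h) @ m == h @ ((a * I2 + b * Rg) @ m)
```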

Let us now consider the calculation of a [p \times p] two-dimensional DFT with n-fold cyclic symmetry [(n = 3,\;4,\;6)] for an odd prime [p \geq 5]. Denote [{\bb Z}/p {\bb Z}] by [{\bb Z}_{p}]. Both the data and the results of the DFT are indexed by [{\bb Z}_{p} \times {\bb Z}_{p}]: hence the action of [{\bb Z}G] on these indices is in fact an action of [{\bb Z}_{p}G], the latter being obtained from [{\bb Z}G] by carrying out all integer arithmetic in [{\bb Z}G] modulo p. The algebraic structure of [{\bb Z}_{p}G] combines the symmetry-carrying ring structure of [{\bb Z}G] with the finite field structure of [{\bb Z}_{p}] used in Section 1.3.3.2.3.1[link], and holds the key to a symmetry-adapted factorization of the DFT at hand.

The structure of [{\bb Z}_{p}G] depends on whether [P(X)] remains irreducible when considered as a polynomial over [{\bb Z}_{p}]. Thus two cases arise:

  • (1) [P(X)] remains irreducible mod p, i.e. there is no nth root of unity in [{\bb Z}_{p}];

  • (2) [P(X)] factors as [(X - u)(X - v)], i.e. there are nth roots of unity in [{\bb Z}_{p}].

These two cases require different developments.

  • Case 1. [{\bb Z}_{p}G] is a finite field with [p^{2}] elements. There is essentially (i.e. up to isomorphism) only one such field, denoted [GF(p^{2})], and its group of units is a cyclic group with [p^{2} - 1] elements. If γ is a generator of this group of units, the input data [\rho_{{\bf m}}] with [{\bf m} \ne {\bf 0}] may be reordered as [{\bf m}_{0}, \gamma {\bf m}_{0}, \gamma^{2} {\bf m}_{0}, \gamma^{3} {\bf m}_{0}, \ldots, \gamma^{p^{2}-2} {\bf m}_{0}] by the real-space action of γ; while the results [F_{\bf h}] with [{\bf h} \neq {\bf 0}] may be reordered as [{\bf h}_{0}, \gamma {\bf h}_{0}, \gamma^{2} {\bf h}_{0}, \gamma^{3} {\bf h}_{0}, \ldots, \gamma^{p^{2}-2} {\bf h}_{0}] by the reciprocal-space action of γ, where [{\bf m}_{0}] and [{\bf h}_{0}] are arbitrary non-zero indices.

    The core [{\bf C}_{p \times p}] of the DFT matrix, defined by [{\bf F}_{p \times p} = \pmatrix{1 &1 &\ldots &1\cr 1 & & &\cr \vdots & &{\bf C}_{p \times p} &\cr 1 & & &\cr},] will then have a skew-circulant structure (Section 1.3.3.2.3.1[link]) since [({\bf C}_{p \times p})_{jk} = e \left[{(\gamma\hskip 2pt^{j} {\bf h}_{0}) \cdot (\gamma^{k} {\bf m}_{0}) \over p}\right] = e \left[{{\bf h}_{0} \cdot (\gamma\hskip 2pt^{j + k} {\bf m}_{0}) \over p}\right]] depends only on [j + k]. Multiplication by [{\bf C}_{p \times p}] may then be turned into a cyclic convolution of length [p^{2} - 1], which may be factored by two DFTs (Section 1.3.3.2.3.1[link]) or by Winograd's techniques (Section 1.3.3.2.4[link]). The latter factorization is always favourable, as it is easily shown that [p^{2} - 1] is divisible by 24 for any odd prime [p \geq 5]. This procedure is applicable even if no symmetry is present in the data.

    Assume now that cyclic symmetry of order [n = 3], 4 or 6 is present. Since n divides 24 hence divides [p^{2} - 1], the generator g of this symmetry is representable as [\gamma^{(p^{2} - 1)/n}] for a suitable generator γ of the group of units. The reordered data will then be [(p^{2} - 1)/n]-periodic rather than simply [(p^{2} - 1)]-periodic; hence the reindexed results will be n-decimated (Section 1.3.2.7.2[link]), and the [(p^{2} - 1)/n] non-zero results can be calculated by applying the DFT to the [(p^{2} - 1)/n] unique input data. In this way, the n-fold symmetry can be used in full to calculate the core contributions from the unique data to the unique results by a DFT of length [(p^{2} - 1)/n].

    It is a simple matter to incorporate non-primitive translations into this scheme. For example, when going from structure factors to electron densities, reordered data items separated by [(p^{2} - 1)/n] are not equal but differ by a phase shift proportional to their index mod p, whose effect is simply to shift the origin of the n-decimated transformed sequence. The same economy of computation can therefore be achieved as in the purely cyclic case.

    Dihedral symmetry elements, which map g to [g^{-1}] (Section 1.3.4.2.2.3[link]), induce extra one-dimensional symmetries of order 2 in the reordered data which can also be fully exploited to reduce computation.

  • Case 2. If [p \geq 5], it can be shown that the two roots u and v are always distinct. Then, by the Chinese remainder theorem (CRT) for polynomials (Section 1.3.3.2.4[link]) we have a ring isomorphism [{\bb Z}_{p} [X] / P(X) \cong \{{\bb Z}_{p} [X] / (X - u)\} \times \{{\bb Z}_{p} [X] / (X - v)\}] defined by sending a polynomial [Q(X)] from the left-hand-side ring to its two residue classes modulo [X - u] and [X - v], respectively. Since the latter are simply the constants [Q(u)] and [Q(v)], the CRT reindexing has the particularly simple form [a + bX \;\longmapsto\; (a + bu, a + bv) = (\alpha, \beta)] or equivalently [\pmatrix{a\cr b\cr} \;\longmapsto\; \pmatrix{\alpha\cr \beta\cr} = {\bf M} \pmatrix{a\cr b\cr} \hbox{mod } p, \quad \hbox{with } {\bf M} = \pmatrix{1 &u\cr 1 &v\cr}.] The CRT reconstruction formula similarly simplifies to [\displaylines{\pmatrix{\alpha\cr \beta\cr} \;\longmapsto\; \pmatrix{a\cr b\cr} = {\bf M}^{-1} \pmatrix{\alpha\cr \beta\cr} \hbox{mod } p,\cr \hbox{with } {\bf M}^{-1} = {1 \over v - u} \pmatrix{v &-u\cr -1 &1\cr}.}] The use of the CRT therefore amounts to the simultaneous diagonalization (by M) of all the matrices representing the elements of [{\bb Z}_{p}G] in the basis (1, X).

    A first consequence of this diagonalization is that the internal structure of [{\bb Z}_{p}G] becomes clearly visible. Indeed, [{\bb Z}_{p}G] is mapped isomorphically to a direct product of two copies of [{\bb Z}_{p}], in which arithmetic is carried out component-wise between eigenvalues α and β. Thus if [\eqalign{z &= a + bX {\buildrel {\scriptstyle {\rm CRT}} \over {\longleftrightarrow}} (\alpha, \beta),\cr z' &= a' + b' X {\buildrel {\scriptstyle {\rm CRT}} \over {\longleftrightarrow}} (\alpha', \beta'),}] then [\eqalign{z + z' &{\buildrel{\scriptstyle {\rm CRT}} \over {\longleftrightarrow}} (\alpha + \alpha', \beta + \beta'),\cr zz' &{\buildrel{\scriptstyle {\rm CRT}} \over {\longleftrightarrow}} (\alpha \alpha', \beta \beta').}] Taking in particular [\eqalign{z &{\buildrel{\scriptstyle {\rm CRT}} \over {\longleftrightarrow}} (\alpha, 0) \neq (0, 0),\cr z' &{\buildrel {\scriptstyle {\rm CRT}} \over {\longleftrightarrow}} (0, \beta) \neq (0, 0),}] we have [zz' = 0], so that [{\bb Z}_{p}G] contains zero divisors; therefore [{\bb Z}_{p}G] is not a field. On the other hand, if [z {\buildrel {\rm CRT} \over {\longleftrightarrow}} (\alpha, \beta)] with [\alpha \neq 0] and [\beta \neq 0], then α and β belong to the group of units [U(p)] (Section 1.3.3.2.3.1[link]) and hence have inverses [\alpha^{-1}] and [\beta^{-1}]; it follows that z is a unit in [{\bb Z}_{p}G], with inverse [{z}^{-1} {\buildrel {\rm CRT} \over {\longleftrightarrow}} (\alpha^{-1}, \beta^{-1})]. Therefore, [{\bb Z}_{p}G] consists of four distinct pieces: [\eqalign{0 &{\buildrel{\scriptstyle {\rm CRT}} \over {\longleftrightarrow}} \{(0, 0)\},\cr D_{1} &{\buildrel{\scriptstyle {\rm CRT}} \over {\longleftrightarrow}} \{(\alpha, 0) | \alpha \in U (p)\} \cong U (p),\cr D_{2} &{\buildrel{\scriptstyle {\rm CRT}}\over {\longleftrightarrow}} \{(0, \beta) | \beta \in U (p)\} \cong U (p),\cr U &{\buildrel{\scriptstyle {\rm CRT}}\over {\longleftrightarrow}} \{(\alpha, \beta) |\alpha \in U (p), \beta \in U (p)\} \cong U (p) \times U (p).}]

    A second consequence of this diagonalization is that the actions of [{\bb Z}_{p}G] on indices m and h can themselves be brought to diagonal form by basis changes: [\displaylines{\hfill {\bf m} \;\longmapsto\; (a {\bf I} + b {\bf R}_{g}) {\bf m}\hfill\cr \hfill \hbox{becomes } {\boldmu} \;\longmapsto\; \pmatrix{\alpha &0\cr 0 &\beta\cr} {\boldmu} \quad \hbox{with } {\boldmu} = {\bf Mm},\hfill\cr \hfill {\bf h} \;\longmapsto\; (a {\bf I} + b {\bf R}_{g}^{T}) {\bf h}\hfill\cr \hbox{becomes}\; \boldeta \;\longmapsto\; \pmatrix{\alpha &0\cr 0 &\beta\cr} \boldeta \quad \hbox{with } \boldeta = {\bf MJh.}}] Thus the sets of indices μ and η can be split into four pieces as [{\bb Z}_{p}G] itself, according as these indices have none, one or two of their coordinates in [U(p)]. These pieces will be labelled by the same symbols – 0, [D_{1}], [D_{2}] and U – as those of [{\bb Z}_{p}G].

    The scalar product [{\bf h} \cdot {\bf m}] may be written in terms of η and μ as [{\bf h} \cdot {\bf m} = [\boldeta \cdot (({\bf M}^{-1})^{T} {\bf JM}^{-1}) {\boldmu}],] and an elementary calculation shows that the matrix [\boldDelta = ({\bf M}^{-1})^{T} {\bf JM}^{-1}] is diagonal by virtue of the relation [uv = \hbox{constant term in } P(X) = 1.] Therefore, [{\bf h} \cdot {\bf m} = 0] if [{\bf h} \in D_{1}] and [{\boldmu} \in D_{2}] or vice versa.

    We are now in a position to rearrange the DFT matrix [{\bf F}_{p \times p}]. Clearly, the structure of [{\bf F}_{p \times p}] is more complex than in case 1, as there are three types of `core' matrices: [\eqalign{\hbox{type 1:} \quad &D \times D\ (\hbox {with } D = D_{1} \hbox{ or } D_{2})\hbox{;}\cr \hbox{type 2:} \quad &D \times U \hbox{ or } U \times D\hbox{;}\cr \hbox{type 3:} \quad &U \times U.}] (Submatrices of type [D_{1} \times D_{2}] and [D_{2} \times D_{1}] have all their elements equal to 1 by the previous remark.)

    Let γ be a generator of [U(p)]. We may reorder the elements in [D_{1}], [D_{2}] and U – and hence the data and results indexed by these elements – according to powers of γ. This requires one exponent in each of [D_{1}] and [D_{2}], and two exponents in U. For instance, in the h-index space: [\eqalign{D_{1} &= \left\{\pmatrix{\gamma &0\cr 0 &0\cr}^{j} \pmatrix{\eta_{1}\cr 0\cr}_{0} \Big|\; j = 1, \ldots, p - 1\right\}\cr D_{2} &= \left\{\pmatrix{0 &0\cr 0 &\gamma\cr}^{j} \pmatrix{0\cr \eta_{2}\cr}_{0} \Big|\; j = 1, \ldots, p - 1\right\}\cr U &= \left\{\pmatrix{\gamma &0\cr 0 &1\cr}^{j_{1}} \pmatrix{1 &0\cr 0 &\gamma\cr}^{j_{2}} \pmatrix{\eta_{1}\cr \eta_{2}\cr}_{0} \Big|\;j_{1} = 1, \ldots, p - 1\hbox{;}\right.\cr &\left. {\hbox to 137pt{}}j_{2} = 1, \ldots, p - 1\right.\biggr\}}] and similarly for the μ index.

    Since the diagonal matrix Δ commutes with all the matrices representing the action of γ, this rearrangement will induce skew-circulant structures in all the core matrices. The corresponding cyclic convolutions may be carried out by Rader's method, i.e. by diagonalizing them by means of two ([p - 1])-point one-dimensional DFTs in the [D \times D] pieces and of two [(p - 1) \times (p - 1)]-point two-dimensional DFTs in the [U \times U] piece (the [U \times D] and [D \times U] pieces involve extra section and projection operations).

    In the absence of symmetry, no computational saving is achieved, since the same reordering could have been applied to the initial [{\bb Z}_{p} \times {\bb Z}_{p}] indexing, without the CRT reindexing.

    In the presence of n-fold cyclic symmetry, however, the rearranged [{\bf F}_{p \times p}] lends itself to an n-fold reduction in size. The basic fact is that whenever case 2[link] occurs, [p - 1] is divisible by n (i.e. [p - 1] is divisible by 6 when [n = 3] or 6, and by 4 when [n = 4]), say [p - 1 = nq]. If g is a generator of the cyclic symmetry, the generator γ of [U(p)] may be chosen in such a way that [g = \gamma^{q}]. The action of g is then to increment the j index in [D_{1}] and [D_{2}] by q, and the [(j_{1}, j_{2})] index in U by (q, q). Since the data items whose indices are related in this way have identical values, the DFTs used to diagonalize the Rader cyclic convolutions will operate on periodized data, hence yield decimated results; and the non-zero results will be obtained from the unique data by DFTs n times smaller than their counterparts in the absence of symmetry.

    A more thorough analysis is needed to obtain a Winograd factorization into the normal form CBA in the presence of symmetry (see Bricogne & Tolimieri, 1990[link]).

    Non-primitive translations and dihedral symmetry may also be accommodated within this framework, as in case 1[link].

    This reindexing by means of algebraic integers yields larger orbits, hence more efficient algorithms, than that of Auslander et al. (1988)[link] which only uses ordinary integers acting by scalar dilation.
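
As a small numerical illustration of case 2 above, the following sketch (assuming [n = 3] and [p = 7], for which [P(X) = X^{2} + X + 1] has the roots [u = 2] and [v = 4] modulo 7; all choices are illustrative only) verifies that the matrix M built from these roots diagonalizes every element [a{\bf I} + b{\bf R}_{g}] of [{\bb Z}_{p}G] simultaneously, and that [uv = 1] as required.

```python
import numpy as np

# Sketch of the case-2 diagonalization for n = 3, p = 7, where
# P(X) = X^2 + X + 1 factors mod 7 with roots u = 2 and v = 4; illustrative only.
p, u, v = 7, 2, 4
assert (u * u + u + 1) % p == 0 and (v * v + v + 1) % p == 0
assert (u * v) % p == 1                        # uv = constant term of P(X)

Rg = np.array([[0, -1], [1, -1]])              # threefold axis, real space
M = np.array([[1, u], [1, v]])
det_inv = pow((v - u) % p, -1, p)              # (v - u)^{-1} mod p
Minv = (det_inv * np.array([[v, -u], [-1, 1]])) % p
assert np.array_equal((M @ Minv) % p, np.eye(2, dtype=int))

# M diagonalizes every element aI + bR_g of Z_pG simultaneously
for a in range(p):
    for b in range(p):
        z = (a * np.eye(2, dtype=int) + b * Rg) % p
        d = (M @ z @ Minv) % p
        assert d[0, 1] == 0 and d[1, 0] == 0
        assert d[0, 0] == (a + b * u) % p and d[1, 1] == (a + b * v) % p
```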

1.3.4.3.5. Treatment of conjugate and parity-related symmetry properties

| top | pdf |

Most crystallographic Fourier syntheses are real-valued and originate from Hermitian-symmetric collections of Fourier coefficients. Hermitian symmetry is closely related to the action of a centre of inversion in reciprocal space, and thus interacts strongly with all other genuinely crystallographic symmetry elements of order 2. All these symmetry properties are best treated by factoring by 2 and reducing the computation of the initial transform to that of a collection of smaller transforms with less symmetry or none at all.

1.3.4.3.5.1. Hermitian-symmetric or real-valued transforms

| top | pdf |

The computation of a DFT with Hermitian-symmetric or real-valued data can be carried out at a cost of half that of an ordinary transform, essentially by `multiplexing' pairs of special partial transforms into general complex transforms, and then `demultiplexing' the results on the basis of their symmetry properties. The treatment given below is for general dimension n; a subset of cases for [n = 1] was treated by Ten Eyck (1973)[link].
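
The essence of the multiplexing–demultiplexing idea can be conveyed by a one-dimensional sketch (using NumPy's sign and normalization conventions rather than those of this chapter; purely illustrative): two real vectors are packed into one complex vector, transformed together, and separated again by means of the Hermitian symmetry of their individual transforms.

```python
import numpy as np

# One-dimensional sketch of multiplexing/demultiplexing (NumPy conventions).
# Two real vectors are transformed at the cost of one complex transform.
rng = np.random.default_rng(0)
M = 12
y1, y2 = rng.standard_normal(M), rng.standard_normal(M)

Y = np.fft.fft(y1 + 1j * y2)                   # one complex transform
Yrev = np.conj(np.roll(Y[::-1], 1))            # conj(Y(-h)): Hermitian counterpart

Y1 = (Y + Yrev) / 2                            # demultiplexed transform of y1
Y2 = (Y - Yrev) / 2j                           # demultiplexed transform of y2

assert np.allclose(Y1, np.fft.fft(y1))
assert np.allclose(Y2, np.fft.fft(y2))
```

The same device, applied to pairs of partial transforms rather than to pairs of one-dimensional vectors, underlies the treatment given below.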

  • (a) Underlying group action

    Hermitian symmetry is not a geometric symmetry, but it is defined in terms of the action in reciprocal space of point group [G = \bar{1}], i.e. [G = \{e, -e\}], where e acts as I (the [n \times n] identity matrix) and [-e] acts as [-{\bf I}].

    This group action on [{\bb Z}^{n} / {\bf N}{\bb Z}^{n}] with [{\bf N} = {\bf N}_{1} {\bf N}_{2}] will now be characterized by the calculation of the cocycle [{\boldeta}_{1}] (Section 1.3.4.3.4.1[link]) under the assumption that [{\bf N}_{1}] and [{\bf N}_{2}] are both diagonal. For this purpose it is convenient to associate to any integer vector [{\bf v} = \pmatrix{v_{1}\cr \vdots\cr v_{n}\cr}] in [{\bb Z}^{n}] the vector [\boldzeta (\bf v)] whose jth component is [\cases{0 \hbox{ if } &$v_{j} = 0$\cr 1 \hbox{ if } &$v_{j} \neq 0$.\cr}]

    Let [{\bf m} = {\bf m}_{1} + {\bf N}_{1} {\bf m}_{2}], and hence [{\bf h} = {\bf h}_{2} + {\bf N}_{2} {\bf h}_{1}]. Then [\eqalign{&-{\bf h}_{2} \hbox{ mod } {\bf N} {\bb Z}^{n} = {\bf N} {\boldzeta} ({\bf h}_{2}) - {\bf h}_{2},\cr &-{\bf h}_{2} \hbox{ mod } {\bf N}_{2} {\bb Z}^{n} = {\bf N}_{2} {\boldzeta} ({\bf h}_{2}) - {\bf h}_{2},}] hence [\eqalign{{\boldeta}_{1} (-{e, {\bf h}}_{2}) &= {\bf N}_{2}^{-1} \{[{\bf N} {\boldzeta} ({\bf h}_{2}) - {\bf h}_{2}] - [{\bf N}_{2} {\boldzeta} ({\bf h}_{2}) - {\bf h}_{2}]\} \hbox{ mod } {\bf N}_{1} {\bb Z}^{n}\cr &= -{\boldzeta} ({\bf h}_{2}) \hbox{ mod } {\bf N}_{1} {\bb Z}^{n}.}] Therefore −e acts by [({\bf h}_{2}, {\bf h}_{1}) \;\longmapsto\; [{\bf N}_{2} {\boldzeta} ({\bf h}_{2}) - {\bf h}_{2}, {\bf N}_{1} {\boldzeta} ({\bf h}_{1}) - {\bf h}_{1} - {\boldzeta} ({\bf h}_{2})].]

    Hermitian symmetry is traditionally dealt with by factoring by 2, i.e. by assuming [{\bf N} = 2{\bf M}]. If [{\bf N}_{2} = 2{\bf I}], then each [{\bf h}_{2}] is invariant under G, so that each partial vector [{\bf Z}_{{\bf h}_{2}}^{*}] (Section 1.3.4.3.4.1[link]) inherits the symmetry internally, with a `modulation' by [{\boldeta}_{1} (g, {\bf h}_{2})]. The `multiplexing–demultiplexing' technique provides an efficient treatment of this singular case.

  • (b) Calculation of structure factors

    The computation may be summarized as follows: [\rho\llap{$-\!$} {\buildrel{{\bf dec}({\bf N}_{1})} \over {\;\longmapsto\;}} {\bf Y} {\buildrel {\bar{F} ({\bf N}_{2})} \over {\;\longmapsto\;}} {\bf Y}^{*} {\buildrel {\scriptstyle{\rm TW}} \over {\;\longmapsto\;}} {\bf Z} {\buildrel{\bar{F} ({\bf N}_{1})} \over {\;\longmapsto\;}} {\bf Z}^{*} {\buildrel{{\bf rev}({\bf N}_{2})} \over {\;\longmapsto\;}} {\bf F}] where [{\bf dec}({\bf N}_{1})] is the initial decimation given by [Y_{{\bf m}_{1}} ({\bf m}_{2}) = \rho\llap{$-\!$} ({\bf m}_{1} + {\bf N}_{1} {\bf m}_{2})], TW is the transposition and twiddle-factor stage, and [{\bf rev}({\bf N}_{2})] is the final unscrambling by coset reversal given by [F ({\bf h}_{2} + {\bf N}_{2} {\bf h}_{1}) = {\bf Z}_{{\bf h}_{2}}^{*} ({\bf h}_{1})].

    • (i) Decimation in time [({\bf N}_{1} = 2{\bf I},{\bf N}_{2} = {\bf M})]

      The decimated vectors [{\bf Y}_{{\bf m}_{1}}] are real and hence have Hermitian transforms [{\bf Y}_{{\bf m}_{1}}^{*}]. The [2^{n}] values of [{\bf m}_{1}] may be grouped into [2^{n - 1}] pairs [({\bf m}'_{1}, {\bf m}''_{1})] and the vectors corresponding to each pair may be multiplexed into a general complex vector [{\bf Y} = {\bf Y}_{{\bf m}'_{1}} + i {\bf Y}_{{\bf m}''_{1}}.] The transform [{\bf Y}^{*} = \bar{F} ({\bf M}) [{\bf Y}]] can then be resolved into the separate transforms [{\bf Y}_{{\bf m}'_{1}}^{*}] and [{\bf Y}_{{\bf m}''_{1}}^{*}] by using the Hermitian symmetry of the latter, which yields the demultiplexing formulae [\eqalign{Y_{{\bf m}'_{1}}^{*} ({\bf h}_{2}) + iY_{{\bf m}''_{1}}^{*} ({\bf h}_{2}) &= Y^{*} ({\bf h}_{2})\cr \overline{Y_{{\bf m}'_{1}}^{*} ({\bf h}_{2})} + \overline{iY_{{\bf m}''_{1}}^{*} ({\bf h}_{2})} &= Y^{*} [{\bf M} {\boldzeta} ({\bf h}_{2}) - {\bf h}_{2}].}] The number of partial transforms [\bar{F} ({\bf M})] is thus reduced from [2^{n}] to [2^{n - 1}]. Once this separation has been achieved, the remaining steps need only be carried out for a unique half of the values of [{\bf h}_{2}].

    • (ii) Decimation in frequency [({\bf N}_{1} = {\bf M},{\bf N}_{2} = 2{\bf I})]

      Since [{\bf h}_{2} \in {\bb Z}^{n} / 2{\bb Z}^{n}] we have [-{\bf h}_{2} = {\bf h}_{2}] and [{\boldzeta} ({\bf h}_{2}) = {\bf h}_{2}] mod [2{\bb Z}^{n}]. The vectors of decimated and scrambled results [{\bf Z}_{{\bf h}_{2}}^{*}] then obey the symmetry relations [Z_{{\bf h}_{2}}^{*} ({\bf h}_{1} - {\bf h}_{2}) = \overline{Z_{{\bf h}_{2}}^{*} [{\bf M} {\boldzeta} ({\bf h}_{1}) - {\bf h}_{1}]}] which can be used to halve the number of [\bar{F} ({\bf M})] necessary to compute them, as follows.

      Having formed the vectors [{\bf Z}_{{\bf h}_{2}}] given by [Z_{{\bf h}_{2}} ({\bf m}_{1}) = \left[\sum\limits_{{\bf m}_{2} \in {\bb Z}^{n} / 2{\bb Z}^{n}} {(-1)^{{\bf h}_{2} \cdot {\bf m}_{2}} \over 2^{n}} \rho\llap{$-\!$} ({\bf m}_{1} + {\bf Mm}_{2})\right] e[{\bf h}_{2} \cdot ({\bf N}^{-1} {\bf m}_{1})],] we may group the [2^{n}] values of [{\bf h}_{2}] into [2^{n - 1}] pairs [({\bf h}'_{2}, {\bf h}''_{2})] and for each pair form the multiplexed vector: [{\bf Z} = {\bf Z}_{{\bf h}'_{2}} + i{\bf Z}_{{\bf h}''_{2}}.] After calculating the [2^{n - 1}] transforms [{\bf Z}^{*} = \bar{F} ({\bf M}) [{\bf Z}]], the [2^{n}] individual transforms [{\bf Z}_{{\bf h}'_{2}}^{*}] and [{\bf Z}_{{\bf h}''_{2}}^{*}] can be separated by using for each pair the demultiplexing formulae [\eqalign{Z_{{\bf h}'_{2}}^{*} ({\bf h}_{1}) + iZ_{{\bf h}''_{2}}^{*} ({\bf h}_{1}) &= Z^{*} ({\bf h}_{1})\cr Z_{{\bf h}'_{2}}^{*} ({\bf h}_{1} - {\bf h}'_{2}) + iZ_{{\bf h}''_{2}}^{*} ({\bf h}_{1} - {\bf h}''_{2}) &= \overline{Z^{*} [{\bf M} {\boldzeta} ({\bf h}_{1}) - {\bf h}_{1}]}}] which can be solved recursively. If all pairs are chosen so that they differ only in the jth coordinate [({\bf h}_{2})_{j}], the recursion is along [({\bf h}_{1})_{j}] and can be initiated by introducing the (real) values of [Z_{{\bf h}'_{2}}^{*}] and [Z_{{\bf h}''_{2}}^{*}] at [({\bf h}_{1})_{j} = 0] and [({\bf h}_{1})_{j} = M_{j}], accumulated e.g. while forming Z for that pair. Only points with [({\bf h}_{1})_{j}] going from 0 to [{1 \over 2} M_{j}] need be resolved, and they contain the unique half of the Hermitian-symmetric transform F.

  • (c) Calculation of electron densities

    The computation may be summarized as follows: [{\bf F} {\buildrel{{\bf scr}({\bf N}_{2})} \over {\;\longmapsto\;}} {\bf Z}^{*} {\buildrel{F({\bf N}_{1})} \over {\;\longmapsto\;}} {\bf Z} {\buildrel{\scriptstyle{\rm TW}} \over {\;\longmapsto\;}} {\bf Y}^{*} {\buildrel{F({\bf N}_{2})} \over {\;\longmapsto\;}} {\bf Y} {\buildrel{{\bf nat}({\bf N}_{1})} \over {\;\longmapsto\;}} \rho\llap{$-\!$}] where [{\bf scr}({\bf N}_{2})] is the decimation with coset reversal given by [{\bf Z}_{{\bf h}_{2}}^{*} ({\bf h}_{1}) = F ({\bf h}_{2} + {\bf N}_{2} {\bf h}_{1})], TW is the transposition and twiddle-factor stage, and [{\bf nat}({\bf N}_{1})] is the recovery in natural order given by [\rho\llap{$-\!$} ({\bf m}_{1} + {\bf N}_{1} {\bf m}_{2}) = Y_{{\bf m}_{1}} ({\bf m}_{2})].

    • (i) Decimation in time [({\bf N}_{1} = {\bf M},{\bf N}_{2} = 2{\bf I})]

      The last transformation [F(2{\bf I})] has a real-valued matrix, and the final result [\rho\llap{$-\!$}] is real-valued. It follows that the vectors [{\bf Y}_{{\bf m}_{1}}^{*}] of intermediate results after the twiddle-factor stage are real-valued, hence lend themselves to multiplexing along the real and imaginary components of half as many general complex vectors.

      Let the [2^{n}] initial vectors [{\bf Z}_{{\bf h}_{2}}^{*}] be multiplexed into [2^{n - 1}] vectors [{\bf Z}^{*} = {\bf Z}_{{\bf h}'_{2}}^{*} + i{\bf Z}_{{\bf h}''_{2}}^{*}] [one for each pair [({\bf h}'_{2}, {\bf h}''_{2})]], each of which yields by F(M) a vector [{\bf Z} = {\bf Z}_{{\bf h}'_{2}} + i{\bf Z}_{{\bf h}''_{2}}.] The real-valuedness of the [{\bf Y}_{{\bf m}_{1}}^{*}] may be used to recover the separate result vectors for [{\bf h}'_{2}] and [{\bf h}''_{2}]. For this purpose, introduce the abbreviated notation [\eqalign{&e[-{\bf h}'_{2} \cdot ({\bf N}^{-1} {\bf m}_{1})] = (c' + is') ({\bf m}_{1})\cr &e[-{\bf h}''_{2} \cdot ({\bf N}^{-1} {\bf m}_{1})] = (c'' + is'') ({\bf m}_{1})\cr &\qquad \quad R_{{\bf h}_{2}} ({\bf m}_{1}) = Y_{{\bf m}_{1}}^{*} ({\bf h}_{2})\cr &\qquad {\bf R}' = {\bf R}_{{\bf h}'_{2}},\quad {\bf R}'' = {\bf R}_{{\bf h}''_{2}}.}] Then we may write [\eqalign{{\bf Z} &= (c' - is') {\bf R}' + i(c'' - is'') {\bf R}''\cr &= (c' {\bf R}' + s'' {\bf R}'') + i(-s' {\bf R}' + c'' {\bf R}'')}] or, equivalently, for each [{\bf m}_{1}], [\pmatrix{{\scr Re} \;Z\cr {\scr Im}\;Z\cr} = \pmatrix{c' &s''\cr -s' &c''\cr} \pmatrix{R'\cr R''\cr}.] Therefore [{\bf R}'] and [{\bf R}''] may be retrieved from Z by the `demultiplexing' formula: [\pmatrix{R'\cr R''\cr} = {1 \over c' c'' + s' s''} \pmatrix{c'' &-s''\cr s' &c'\cr} \pmatrix{{\scr Re}\; Z\cr {\scr Im} \;Z\cr}] which is valid at all points [{\bf m}_{1}] where [c' c'' + s' s'' \neq 0], i.e. where [\cos [2 \pi ({\bf h}'_{2} - {\bf h}''_{2}) \cdot ({\bf N}^{-1} {\bf m}_{1})] \neq 0.] Demultiplexing fails when [({\bf h}'_{2} - {\bf h}''_{2}) \cdot ({\bf N}^{-1} {\bf m}_{1}) = {\textstyle{1 \over 4}} \hbox{ mod } {\textstyle{1 \over 2}}.] If the pairs [({\bf h}'_{2}, {\bf h}''_{2})] are chosen so that their members differ only in one coordinate (the jth, say), then the exceptional points are at [({\bf m}_{1})_{j} = {1 \over 2} M_{j}] and the missing transform values are easily obtained e.g. by accumulation while forming [{\bf Z}^{*}].
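
      A small numerical check of this demultiplexing step (an added sketch for a single point [{\bf m}_{1}], with arbitrary angles standing in for the actual twiddle phases; all names are invented):

          import numpy as np

          # form Re Z and Im Z from two real intermediates R', R'' with the matrix
          # ((c', s''), (-s', c'')) and recover them with the inverse quoted above;
          # this is valid wherever the determinant c'c'' + s's'' is non-zero
          rng = np.random.default_rng(0)
          Rp, Rpp = rng.standard_normal(2)        # the real intermediates R', R''
          cp, sp = np.cos(0.3), np.sin(0.3)       # c', s'  (arbitrary phase)
          cpp, spp = np.cos(1.1), np.sin(1.1)     # c'', s'' (arbitrary phase)
          ReZ = cp * Rp + spp * Rpp
          ImZ = -sp * Rp + cpp * Rpp
          det = cp * cpp + sp * spp
          Rp_rec = (cpp * ReZ - spp * ImZ) / det
          Rpp_rec = (sp * ReZ + cp * ImZ) / det
          assert np.allclose([Rp_rec, Rpp_rec], [Rp, Rpp])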

      The final stage of the calculation is then [\rho\llap{$-\!$} ({\bf m}_{1} + {\bf Mm}_{2}) = {\textstyle\sum\limits_{{\bf h}_{2} \in {\bb Z}^{n} / 2{\bb Z}^{n}}} (-1)^{{\bf h}_{2} \cdot {\bf m}_{2}} R_{{\bf h}_{2}} ({\bf m}_{1}).]
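
      In one dimension, for instance, this final stage reduces to the single butterfly [\rho\llap{$-\!$} (m_{1}) = R_{0} (m_{1}) + R_{1} (m_{1})] and [\rho\llap{$-\!$} (m_{1} + M) = R_{0} (m_{1}) - R_{1} (m_{1})].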

    • (ii) Decimation in frequency [({\bf N}_{1} = 2{\bf I},{\bf N}_{2} = {\bf M})]

      The last transformation F(M) gives the real-valued results [\rho\llap{$-\!$}]; therefore the vectors [{\bf Y}_{{\bf m}_{1}}^{*}] after the twiddle-factor stage each have Hermitian symmetry.

      A first consequence is that the intermediate vectors [{\bf Z}_{{\bf h}_{2}}] need only be computed for the unique half of the values of [{\bf h}_{2}], the other half being related by the Hermitian symmetry of [{\bf Y}_{{\bf m}_{1}}^{*}].

      A second consequence is that the [2^{n}] vectors [{\bf Y}_{{\bf m}_{1}}^{*}] may be condensed into [2^{n - 1}] general complex vectors [{\bf Y}^{*} = {\bf Y}_{{\bf m}'_{1}}^{*} + i{\bf Y}_{{\bf m}''_{1}}^{*}] [one for each pair [({\bf m}'_{1}, {\bf m}''_{1})]] to which a general complex F(M) may be applied to yield [{\bf Y} = {\bf Y}_{{\bf m}'_{1}} + i{\bf Y}_{{\bf m}''_{1}}] with [{\bf Y}_{{\bf m}'_{1}}] and [{\bf Y}_{{\bf m}''_{1}}] real-valued. The final results can therefore be retrieved by the particularly simple demultiplexing formulae: [\eqalign{\rho\llap{$-\!$} ({\bf m}'_{1} + 2{\bf m}_{2}) &= {\scr Re} \;Y({\bf m}_{2}),\cr \rho\llap{$-\!$} ({\bf m}''_{1} + 2{\bf m}_{2}) &= {\scr Im}\; Y({\bf m}_{2}).}]
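
      As a one-dimensional sketch of this converse device (an added example in numpy's conventions, ignoring the decimated indexing; the names are invented), two Hermitian-symmetric spectra are packed into one complex vector, a single complex inverse transform is performed, and the two real results are read off its real and imaginary parts:

          import numpy as np

          x1, x2 = np.random.rand(8), np.random.rand(8)
          F1, F2 = np.fft.fft(x1), np.fft.fft(x2)   # two Hermitian-symmetric spectra
          y = np.fft.ifft(F1 + 1j * F2)             # one complex inverse transform
          assert np.allclose(y.real, x1) and np.allclose(y.imag, x2)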

1.3.4.3.5.2. Hermitian-antisymmetric or pure imaginary transforms

| top | pdf |

A vector [{\bf X} = \{X ({\bf k}) | {\bf k} \in {\bb Z}^{n} / {\bf N}{\bb Z}^{n}\}] is said to be Hermitian-antisymmetric if [X ({\bf k}) = -\overline{X (-{\bf k})} \hbox{ for all } {\bf k.}] Its transform [{\bf X}^{*}] then satisfies [X^{*} ({\bf k}^{*}) = -\overline{X^{*} ({\bf k}^{*})} \hbox{ for all } {\bf k}^{*},] i.e. is purely imaginary.

If X is Hermitian-antisymmetric, then [{\bf F} = \pm i{\bf X}] is Hermitian-symmetric, with [\rho\llap{$-\!$} = \pm i{\bf X}^{*}] real-valued. The treatment of Section 1.3.4.3.5.1[link] may therefore be adapted, with trivial factors of i or [-1], or used as such in conjunction with changes of variable by multiplication by [\pm i].
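
A minimal sketch of this device (an added example in numpy's conventions, which differ from the [F], [\bar{F}] and scale conventions of the text, although the [\pm i] bookkeeping is the same; the names are invented):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.standard_normal(8)
    X = np.fft.fft(1j * x)           # a Hermitian-antisymmetric test vector
    F = 1j * X                       # multiplication by i makes it Hermitian-symmetric
    rho = np.fft.ifft(F)             # so its inverse transform is real-valued
    assert np.allclose(rho.imag, 0)
    Xstar = -1j * rho                # undo the factor of i: the purely imaginary result
    assert np.allclose(Xstar, np.fft.ifft(X))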

1.3.4.3.5.3. Complex symmetric and antisymmetric transforms

| top | pdf |

The matrix [-{\bf I}] is its own contragredient, and hence (Section 1.3.2.4.2.2[link]) the transform of a symmetric (respectively antisymmetric) function is symmetric (respectively antisymmetric). In this case the group [G = \{e, -e\}] acts in both real and reciprocal space as [\{{\bf I}, -{\bf I}\}]. If [{\bf N} = {\bf N}_{1} {\bf N}_{2}] with both factors diagonal, then [-e] acts by [\eqalign{ ({\bf m}_{1}, {\bf m}_{2}) \;&\longmapsto\; [{\bf N}_{1} {\boldzeta} ({\bf m}_{1}) - {\bf m}_{1}, {\bf N}_{2} {\boldzeta} ({\bf m}_{2}) - {\bf m}_{2} - {\boldzeta} ({\bf m}_{1})],\cr ({\bf h}_{2}, {\bf h}_{1}) \;&\longmapsto\; [{\bf N}_{2} {\boldzeta} ({\bf h}_{2}) - {\bf h}_{2}, {\bf N}_{1} {\boldzeta} ({\bf h}_{1}) - {\bf h}_{1} - {\boldzeta} ({\bf h}_{2})],}] i.e. [\eqalign{{\boldmu}_{2} (-e, {\bf m}_{1}) &= -{\boldzeta} ({\bf m}_{1}) \hbox{ mod } {\bf N}_{2} {\bb Z}^{n},\cr {\boldeta}_{1} (-e, {\bf h}_{2}) &= -{\boldzeta} ({\bf h}_{2}) \hbox{ mod } {\bf N}_{1} {\bb Z}^{n}.}]

The symmetry or antisymmetry properties of X may be written [X (-{\bf m}) = \varepsilon X ({\bf m}) \hbox{ for all } {\bf m},] with [\varepsilon = + 1] for symmetry and [\varepsilon = -1] for antisymmetry.

The computation will be summarized as [{\bf X} {\buildrel{{\bf dec}({\bf N}_{1})} \over {\;\longmapsto\;}} {\bf Y} {\buildrel{\bar{F}({\bf N}_{2})} \over {\;\longmapsto\;}} {\bf Y}^{*} {\buildrel{\scriptstyle{\rm TW}} \over {\;\longmapsto\;}} {\bf Z} {\buildrel{\bar{F}({\bf N}_{1})} \over {\;\longmapsto\;}} {\bf Z}^{*} {\buildrel{{\bf rev}({\bf N}_{2})} \over {\;\longmapsto\;}} {\bf X}^{*}] with the same indexing as that used for structure-factor calculation. In both cases it will be shown that a transform [F({\bf N})] with [{\bf N} = {\bf 2M}] and M diagonal can be computed using only [2^{n-1}] partial transforms [F({\bf M})] instead of [2^{n}].

  • (i) Decimation in time [({\bf N}_{1} = 2{\bf I},{\bf N}_{2} = {\bf M})]

    Since [{\bf m}_{1} \in {\bb Z}^{n}/2{\bb Z}^{n}] we have [-{\bf m}_{1} = {\bf m}_{1}] and [{\boldzeta} ({\bf m}_{1}) = {\bf m}_{1}] mod [2{\bb Z}^{n}], so that the symmetry relations for each parity class of data [{\bf Y}_{{\bf m}_{1}}] read [Y_{{\bf m}_{1}} [{\bf M} {\boldzeta} ({\bf m}_{2}) - {\bf m}_{2} - {\bf m}_{1}] = \varepsilon Y_{{\bf m}_{1}} ({\bf m}_{2})] or equivalently [\tau_{{\bf m}_{1}} {\bf Y}_{{\bf m}_{1}} = \varepsilon \breve{{\bf Y}}_{{\bf m}_{1}}.] Transforming by [F({\bf M})], this relation becomes [e [-{\bf h}_{2} \cdot ({\bf M}^{-1} {\bf m}_{1})] {\bf Y}_{{\bf m}_{1}}^{*} = \varepsilon \breve{{\bf Y}}_{{\bf m}_{1}}^{*}.] Each parity class thus obeys a different symmetry relation, so that we may multiplex them in pairs by forming for each pair [({\bf m}'_{1}, {\bf m}''_{1})] the vector [{\bf Y} = {\bf Y}_{{\bf m}'_{1}} + {\bf Y}_{{\bf m}''_{1}}.] Putting [\eqalign{e [-{\bf h}_{2} \cdot ({\bf M}^{-1} {\bf m}'_{1})] &= (c' + is') ({\bf h}_{2})\cr e [-{\bf h}_{2} \cdot ({\bf M}^{-1} {\bf m}''_{1})] &= (c'' + is'') ({\bf h}_{2})}] we then have the demultiplexing relations for each [{\bf h}_{2}]: [\displaylines{Y_{{\bf m}'_{1}}^{*} ({\bf h}_{2}) + Y_{{\bf m}''_{1}}^{*} ({\bf h}_{2})= Y^{*} ({\bf h}_{2})\cr (c' + is') ({\bf h}_{2}) Y_{{\bf m}'_{1}}^{*} ({\bf h}_{2}) + (c'' + is'') ({\bf h}_{2})Y_{{\bf m}''_{1}}^{*} ({\bf h}_{2})\cr \qquad\qquad\quad\; = \varepsilon Y^{*} [{\bf M} \boldzeta ({\bf h}_{2}) - {\bf h}_{2}]\hfill}] which can be solved recursively. Transform values at the exceptional points [{\bf h}_{2}] where demultiplexing fails (i.e. where [c' + is' = c'' + is'']) can be accumulated while forming Y.
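
    For each [{\bf h}_{2}] these are two linear equations in the two unknowns [Y_{{\bf m}'_{1}}^{*} ({\bf h}_{2})] and [Y_{{\bf m}''_{1}}^{*} ({\bf h}_{2})], with matrix [\pmatrix{1 &1\cr (c' + is') ({\bf h}_{2}) &(c'' + is'') ({\bf h}_{2})\cr}] and determinant [(c'' - c') + i(s'' - s')], which vanishes precisely at the exceptional points just mentioned, where the two twiddle factors coincide.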

    Only the unique half of the values of [{\bf h}_{2}] need to be considered at the demultiplexing stage and at the subsequent TW and F(2I) stages.

  • (ii) Decimation in frequency [({\bf N}_{1} = {\bf M},{\bf N}_{2} = 2{\bf I})]

    The vectors of final results [{\bf Z}_{{\bf h}_{2}}^{*}] for each parity class [{\bf h}_{2}] obey the symmetry relations [\tau_{{\bf h}_{2}} {\bf Z}_{{\bf h}_{2}}^{*} = \varepsilon \check{{\bf Z}}_{{\bf h}_{2}}^{*},] which are different for each [{\bf h}_{2}]. The vectors [{\bf Z}_{{\bf h}_{2}}] of intermediate results after the twiddle-factor stage may then be multiplexed in pairs as [{\bf Z} = {\bf Z}_{{\bf h}'_{2}} + {\bf Z}_{{\bf h}''_{2}}.]

    After transforming by [F({\bf M})], the results [{\bf Z}^{*}] may be demultiplexed by using the relations [\eqalign{Z_{{\bf h}'_{2}}^{*} ({\bf h}_{1})\quad &+ Z_{{\bf h}''_{2}}^{*} ({\bf h}_{1}) \phantom{- {\bf h}''_{2})} = Z^{*} ({\bf h}_{1})\cr Z_{{\bf h}'_{2}}^{*} ({\bf h}_{1} - {\bf h}'_{2}) &+ Z_{{\bf h}''_{2}}^{*} ({\bf h}_{1} - {\bf h}''_{2}) = \varepsilon Z^{*} [{\bf M} \boldzeta ({\bf h}_{1}) - {\bf h}_{1}]}] which can be solved recursively as in Section 1.3.4.3.5.1[link](b)(ii)[link].

1.3.4.3.5.4. Real symmetric transforms

| top | pdf |

Conjugate symmetry (Section 1.3.2.4.2.3[link]) implies that if the data X are real and symmetric [i.e. [X({\bf k}) = \overline{X({\bf k})}] and [X (- {\bf k}) = X ({\bf k})]], then so are the results [{\bf X}^{*}]. Thus if [\rho\llap{$-\!$}] contains a centre of symmetry, F is real symmetric. There is no distinction (other than notation) between structure-factor and electron-density calculation; the algorithms will be described in terms of the former. It will be shown that if [{\bf N} = 2{\bf M}], a real symmetric transform can be computed with only [2^{n-2}] partial transforms [F({\bf M})] instead of [2^{n}].

  • (i) Decimation in time [({\bf N}_{1} = 2{\bf I},{\bf N}_{2} = {\bf M})]

    Since [{\bf m}_{1} \in {\bb Z}^{n}/2{\bb Z}^{n}] we have [-{\bf m}_{1} = {\bf m}_{1}] and [{\boldzeta} ({\bf m}_{1})=] [ {\bf m}_{1} \hbox{ mod } 2{\bb Z}^{n}]. The decimated vectors [{\bf Y}_{{\bf m}_{1}}] are not only real, but have an internal symmetry expressed by [{\bf Y}_{{\bf m}_{1}} [{\bf M} {\boldzeta} ({\bf m}_{2}) - {\bf m}_{2} - {\bf m}_{1}] = \varepsilon {\bf Y}_{{\bf m}_{1}} ({\bf m}_{2}).] This symmetry, however, is different for each [{\bf m}_{1}] so that we may multiplex two such vectors [{\bf Y}_{{\bf m}'_{1}}] and [{\bf Y}_{{\bf m}''_{1}}] into a general real vector [{\bf Y} = {\bf Y}_{{\bf m}'_{1}} + {\bf Y}_{{\bf m}''_{1}},] for each of the [2^{n-1}] pairs [({\bf m}'_{1}, {\bf m}''_{1})]. The [2^{n-1}] Hermitian-symmetric transform vectors [{\bf Y}^{*} = {\bf Y}_{{\bf m}'_{1}}^{*} + {\bf Y}_{{\bf m}''_{1}}^{*}] can then be evaluated by the methods of Section 1.3.4.3.5.1[link](b)[link] at the cost of only [2^{n-2}] general complex [F({\bf M})].

    The demultiplexing relations by which the separate vectors [{\bf Y}_{{\bf m}'_{1}}^{*}] and [{\bf Y}_{{\bf m}''_{1}}^{*}] may be recovered are most simply obtained by observing that the vectors Z after the twiddle-factor stage are real-valued since F(2I) has a real matrix. Thus, as in Section 1.3.4.3.5.1[link](c)(i)[link], [\eqalign{{\bf Y}_{{\bf m}'_{1}}^{*} &= (c' - is') {\bf R}'\cr {\bf Y}_{{\bf m}''_{1}}^{*} &= (c'' - is'') {\bf R}'',}] where [{\bf R}'] and [{\bf R}''] are real vectors and where the multipliers [(c' - is')] and [(c'' - is'')] are the inverse twiddle factors. Therefore, [\eqalign{{\bf Y}^{*} &= (c' - is') {\bf R}' + (c'' - is'') {\bf R}''\cr &= (c' {\bf R}' + c'' {\bf R}'') - i(s' {\bf R}' + s'' {\bf R}'')}] and hence the demultiplexing relation for each [{\bf h}_{2}]: [\pmatrix{R'\cr R''\cr} = {1 \over c' s'' - s' c''} \pmatrix{s'' &-c''\cr -s' &c'\cr} \pmatrix{{\scr Re}\; Y^{*}\cr -{\scr Im} \;Y^{*}\cr}.] The values of [R'_{{\bf h}_{2}}] and [R''_{{\bf h}_{2}}] at those points [{\bf h}_{2}] where [c' s'' - s' c'' = 0] can be evaluated directly while forming Y. This demultiplexing and the final stage of the calculation, namely [F ({\bf h}_{2} + {\bf Mh}_{1}) = {1 \over 2^{n}} \sum\limits_{{\bf m}_{1} \in {\bf Z}^{n}/2{\bf Z}^{n}} (-1)^{{\bf h}_{1} \cdot {\bf m}_{1}} R_{{\bf m}_{1}} ({\bf h}_{2})] need only be carried out for the unique half of the range of [{\bf h}_{2}].

  • (ii) Decimation in frequency [({\bf N}_{1} = {\bf M}, {\bf N}_{2} = 2{\bf I})]

    Similarly, the vectors [{\bf Z}_{{\bf h}_{2}}^{*}] of decimated and scrambled results are real and obey internal symmetries [\tau_{{\bf h}_{2}} {\bf Z}_{{\bf h}_{2}}^{*} = \varepsilon \breve{{\bf Z}}_{{\bf h}_{2}}^{*}] which are different for each [{\bf h}_{2}]. For each of the [2^{n-1}] pairs [({\bf h}'_{2}, {\bf h}''_{2})] the multiplexed vector [{\bf Z} = {\bf Z}_{{\bf h}'_{2}} + {\bf Z}_{{\bf h}''_{2}}] is a Hermitian-symmetric vector without internal symmetry, and the [2^{n-1}] real vectors