International
Tables for
Crystallography
Volume B
Reciprocal space
Edited by U. Shmueli

International Tables for Crystallography (2010). Vol. B, ch. 3.5, pp. 474-480

Section 3.5.4. Computational efficiency

T. A. Dardena*

aLaboratory of Structural Biology, National Institute of Environmental Health Sciences, 111 T. W. Alexander Drive, Research Triangle Park, NC 27709, USA
Correspondence e-mail: darden@gamera.niehs.nih.gov

3.5.4. Computational efficiency

| top | pdf |

The Ewald-sum methods discussed above lead to calculations that are much faster than straightforward summation, as well as clarifying the nature of the conditional convergence. However, there are opportunities for further significant speed-ups, which we cover in this section. The computational cost of the Ewald sum is dominated by the direct and reciprocal sum. We cover these in turn.

3.5.4.1. The direct sum

| top | pdf |

For all the cases discussed above – point charge, normalized spherical Gaussians and the more complex densities made up of sums over a basis of normalized Hermite Gaussians – the Ewald direct sum involves terms like [|{\bf R}_{12} - {\bf n}| ] and in the Hermite case partial derivatives with respect to the x, y and z components of these terms. Given the efficiency opportunities in the reciprocal sum to be discussed below, it is advantageous to choose an Ewald parameter (referred to above as α in the point-charge case or β for the compact–compact Gaussian or Hermite Gaussian interactions) so that direct-sum terms can be neglected past a fixed cutoff, i.e. when [|{\bf R}_{12} - {\bf n}| > R_C]. Given a fixed Ewald parameter α, the cutoff may be fixed independently of the unit-cell size. At constant or near constant material density (number of atoms per cubic ångstrom) this leads to `linear scaling', i.e. the computational cost of the Ewald direct sum grows linearly with the unit-cell volume or number of atoms in the unit cell U.

For the point-charge or normalized spherical Gaussian case, the remaining issue is efficient evaluation of the erfc function. Various table lookup schemes have been advanced, as well as a polynomial expression (Allen & Tildesley, 1987[link]). Ewald summation for ideal multipoles up to quadrupole have been discussed by Smith (1982[link], 1998[link]) and Aguado & Madden (2003[link]), using optimized combinations of terms to minimize the cost of the direct sum. A more general scheme, covering higher-order multipoles, was proposed by Sagui et al. (2004[link]). In the latter scheme the McMurchie–Davidson recursion (McMurchie & Davidson, 1978[link]; Helgaker et al., 2000[link]) was used to calculate the higher-order derivatives of [{\rm erfc}(\alpha^{1/2} r) / r ] for the multipole–multipole interaction tensor. This same approach was used by Cisneros et al. (2006[link]) to calculate the interaction tensor for normalized Hermite Gaussian interactions. Although originally formulated for Hermite Gaussian interactions, it applies to more general partial derivative calculations. Let r = (x, y, z) be a point in [\Re^3] with [|{\bf r}| = r], and let g(r) be a smooth function of r. For example, [g(r) = {\rm erf}(\alpha^{1/2} r)/r ]. We are interested in the partial derivatives with respect to x, y and z of g(r). Let R(0, 0, 0, 0) = g(r), and for n > 0 let R(0, 0, 0, n) = (1/r)(d/dr)R(0, 0, 0, n − 1). For non-negative integers t, u, v, let [R(t,u,v,n)] = [(\partial/\partial x)^t (\partial/\partial y)^u (\partial/\partial z)^v R(0,0,0,n) ]. Then we have the following recursion:[\eqalign{R(t+1,u,v,n) &= x R(t,u,v,n+1) + t R(t-1,u,v,n+1), \cr R(t,u+1,v,n) &= y R(t,u,v,n+1) + u R(t,u-1,v,n+1), \cr R(t,u,v+1,n) &= z R(t,u,v,n+1) + v R(t,u,v-1,n+1).} ]

The proof of the last recursion depends on the fact that [\partial / \partial z = (z/r) \partial / \partial r ] and on Leibniz' generalization of the product rule for differentiation, namely [(fg)^{(k)} = \textstyle\sum_{j=0}^k {k \choose j} f^{(j)}g^{(k-j)}], and is as follows:[\eqalign{R(t,u,v+1,n) &= \left(\partial / \partial x\right)^t \left(\partial / \partial y\right)^u \left(\partial / \partial z\right)^v \left(\partial / \partial z\right) R(0,0,0,n) \cr &= \left(\partial / \partial x\right)^t \left(\partial / \partial y\right)^u \left(\partial / \partial z\right)^v z (1/r) \cr &\quad\times ({\rm d}/{\rm d}r) R(0,0,0,n) \cr &= \left(\partial / \partial x\right)^t \left(\partial / \partial y\right)^u \left(\partial / \partial z\right)^v z R(0,0,0,n+1) \cr &= \left(\partial / \partial x\right)^t \left(\partial / \partial y\right)^u [z \left(\partial / \partial z\right)^v R(0,0,0,n+1) \cr &\quad + v \left(\partial / \partial z\right)^{v-1} R(0,0,0,n+1) ]\cr &= z R(t,u,v,n+1) + v R(t,u,v-1,n+1).} ]The other two recursions are proved similarly. To apply this to the case [g(r) = {\rm erf}(\alpha^{1/2} r)/r] we recall from equation (3.5.3.4)[link] that [(2 /\pi^{1/2}) F_0(x^2) = {\rm erf}(|x|) / |x|] and that for the higher-order Boy's functions [F_n(x)], [({\rm d}/{\rm d}x )F_n(x) = -F_{n+1}(x) ]. Efficient evaluation of the Boy's functions is discussed by Helgaker et al. (2000[link]).

Further economies for point multipoles can be achieved by using symmetry. For example, it is only necessary to keep track of six quadrupolar terms, with a similar reduction for Hermite Gaussians. A general discussion, including rotation of multipoles or Hermite Gaussians from local molecule frame to laboratory frame, is provided in Sagui et al. (2004[link]) as well as in Cisneros et al. (2006[link]).

3.5.4.2. Reciprocal-sum speed-ups: the smooth particle mesh Ewald and fast Fourier Poisson methods

| top | pdf |

The smooth particle mesh Ewald (PME) (Essmann et al., 1995[link]) and fast Fourier Poisson (FFP) (York & Yang, 1994[link]) methods both rely on the three-dimensional fast Fourier transform (3DFFT) for computational efficiency. In combination with the above use of cutoffs in the direct sum, they lower the computational cost of the Ewald sum from [O (N^{3/2})] (Perram et al., 1988[link]) to [O [N \log(N)]], where N is the number of particles and e.g. [O (N^{3/2})] denotes a quantity such that [O (N^{3/2})/N^{3/2}] remains bounded as [N \to \infty]. For the point-charge case, the key insight is that if the point charges happen to be located on a regular grid, the structure factors [equation (3.5.2.14)[link]] become discrete Fourier transforms and can be rapidly calculated using FFTs.

Like other variants of the original particle–particle particle–mesh (P3M) method (Hockney & Eastwood, 1981[link]), the smooth PME method interpolates charge onto grid positions. The smooth PME, like the original PME algorithm (Darden et al., 1993[link]), formulates this problem as one of interpolation of the complex exponentials appearing in the structure factor. This interpolation process is very accurate at low frequency, i.e. for small [|{\bf m}|], and becomes progressively less accurate as [|{\bf m}|] grows. However, the weight function [\exp(-\pi^2 {\bf m}^2 / \alpha) / {\bf m}^2] multiplying the structure factor in the reciprocal sum severely damps the high-frequency (large [|{\bf m}|]) terms, thus preserving overall accuracy. In contrast, the FFP method relies on the fact that [\exp(-\pi^2 {\bf m}^2 / \alpha) / {\bf m}^2 S({\bf m}) ] is the Fourier transform of a superposition of normalized spherical Gaussian charge distributions [q_j \rho_j], centred at [{\bf r}_j ], as discussed above. This Fourier transform is approximated by the FFT of the gridded charge density obtained by direct sampling of the Gaussians onto the grid (similar to the use of the FFT in macromolecular X-ray crystallography). Note that the smooth PME method has been easily extended to the treatment of London dispersion interactions, since these involve similar structure factors with a different weight function [g_p, p=6,8,10]. The extension is not so straightforward with the FFP approach. However, as discussed below, the FFP approach has advantages in the case of continuous charge density when more than one Gaussian exponent is involved.

For the case of point charges, P3M methods based on least-squares approximation of the exact Ewald reciprocal sum gradient (Pollock & Glosli, 1996[link]; Darden et al., 1997[link]) can be made more efficient than the smooth PME method at the same accuracy, despite requiring more FFTs (Deserno & Holm, 1998[link]). However, for the case of ideal multipoles or normalized Hermite Gaussians, the number of such FFTs becomes prohibitive, whereas the smooth PME, which relies on analytic differentiation of B-splines, does not need more FFTs as the order of differentiation (t + u + v) in equation (3.5.3.12)[link] grows.

3.5.4.2.1. The smooth PME approach

| top | pdf |

We begin by re-expressing the complex exponentials appearing in the structure factors in equations (3.5.2.14)[link], (3.5.2.26)[link] and (3.5.3.12)[link]. Recall the form of m from equation (3.5.2.2)[link], and let [M_1,M_2,M_3] be positive integers of the scale of the unit-cell dimensions. Then for [{\bf r} \in U ] we can write[\eqalign{{\bf m}\cdot{\bf r} &= m_1 {\bf a}_1^{*} \cdot {\bf r} + m_2 {\bf a}_2^{*} \cdot {\bf r} + m_3 {\bf a}_3^{*} \cdot {\bf r} \cr &= z_1 w_1 + z_2 w_2 + z_3 w_3,} ]where [w_\alpha = M_\alpha {\bf a}_\alpha^{*} \cdot {\bf r}] and [z_\alpha = m_\alpha / M_\alpha], [\alpha = 1,2,3], and thus[\exp(2 \pi i {\bf m} \cdot {\bf r}) = \exp(2 \pi i z_1 w_1)\exp(2 \pi i z_2 w_2) \exp(2 \pi i z_3 w_3) . ]

To motivate the smooth PME approach, suppose we can approximate [\exp(2 \pi i z w) ] by some function [g(w;z)] of the form[g(w;z) = \lambda(z)\textstyle \sum\limits_{k=-\infty} ^ \infty B(w-k) \exp(2 \pi i z k),\eqno(3.5.4.1) ]where [B(w)] is a well behaved function with continuous derivatives as needed for the general structure factor in equation (3.5.3.12)[link]. We first examine the structure factor for point-charge Ewald sums, defined in equation (3.5.2.14)[link]. This can be approximated as[\eqalignno{S({\bf m}) &= \textstyle\sum\limits_{j=1}^N q_j \exp(2 \pi i z_1 w_{1j})\exp(2 \pi i z_2 w_{2j})\exp(2 \pi i z_3 w_{3j}) & \cr &\simeq \lambda(z_1)\lambda(z_2) \lambda(z_3) & \cr &\quad\times \textstyle\sum\limits_{k_1=-\infty} ^ \infty \textstyle\sum\limits_{k_2=-\infty} ^ \infty \textstyle\sum\limits_{k_3=-\infty} ^ \infty \textstyle\sum\limits_{j=1}^N q_j B(w_{1j} - k_1) B(w_{2j} - k_2) & \cr &\quad\times B(w_{3j} - k_3) \exp[2\pi i (z_1 k_1+z_2 k_2 + z_3 k_3)]& \cr &= \lambda(z_1)\lambda(z_2) \lambda(z_3) & \cr &\quad\times \textstyle\sum\limits_{k_1 = -{({M_1}/{2})} + 1}^{{{M_1}/{2}}}\ \textstyle\sum\limits_{k_2 = -{({M_2}/{2})}+1}^{{{M_2}/{2}}}\ \textstyle\sum\limits_{k_3 = -{({M_3}/{2})}+1}^{{{M_3}/{2}}} Q(k_1,k_2,k_3) & \cr &\quad\times \exp\left[2 \pi i \left({{m_1 k_1}\over{M_1}} + {{m_2 k_2}\over{M_2}} + {{m_3 k_3}\over{M_3}}\right)\right],&(3.5.4.2)} ]where [Q(k_1,k_2,k_3)] is given by[\eqalignno{Q(k_1,k_2,k_3) &= \textstyle\sum\limits_{j=1}^N q_j \textstyle\sum\limits_{l_1=-\infty}^\infty B(w_{1j} - k_1 - l_1 M_1)&\cr &\quad\times \textstyle\sum\limits_{l_2=-\infty}^\infty B(w_{2j} - k_2 - l_2 M_2) & \cr &\quad\times \textstyle\sum\limits_{l_3=-\infty}^\infty B(w_{3j} - k_3 - l_3 M_3) &(3.5.4.3)} ]and we have used the fact that [\exp[2 \pi i m_\alpha (k_\alpha + M_\alpha)/M_\alpha]] = [\exp(2 \pi i m_\alpha k_\alpha/M_\alpha)] for [\alpha = 1,2,3].

Next we turn to the general structure factors [S_l({\bf m})] from equation (3.5.3.12)[link]. Define [w_{\alpha j}], [\alpha=1,2,3] as above for expansion points [{\bf R}_j \in U], and note that[\eqalign{{{\partial}\over{\partial {\bf R}_{jx}}} &= {{\partial}\over{\partial w_{1j}}}\ {{\partial w_{1j}}\over{\partial {\bf R}_{jx}}} + {{\partial}\over{\partial w_{2j}}}\ {{\partial w_{2j}}\over{\partial {\bf R}_{jx}}} + {{\partial}\over{\partial w_{3j}}}\ {{\partial w_{3j}}\over{\partial {\bf R}_{jx}}} \cr &= b_1\ {{\partial}\over{\partial w_{1j}}} + b_2\ {{\partial}\over{\partial w_{2j}}} + b_3\ {{\partial}\over{\partial w_{3j}}},} ]where [b_1, b_2, b_3] are constants depending on the reciprocal bases [{\bf a}_\alpha^{*}] and on [M_\alpha], [\alpha = 1,2,3 ]. Similar expansions hold for the other partial derivatives. Thus we can approximate the general structure factor [S_l({\bf m})] by[\eqalignno{ S_l({\bf m}) &= \sum_{j=1}^N \sum_{tuv} {\bf c}_{j,l,tuv} \left({{\partial }\over{\partial{\bf R}_{jx}}}\right)^t \left({{\partial }\over{\partial{\bf R}_{jy}}}\right)^u \left({{\partial }\over{\partial{\bf R}_{jz}}}\right)^v & \cr &\quad\times \exp(2 \pi i {\bf m}\cdot{\bf R}_j) & \cr &= \sum_{j=1}^N \sum_{tuv} {\bf d}_{j,l,tuv} \left({{\partial }\over{\partial w_{1j}}}\right)^t \left({{\partial }\over{\partial w_{2j}}}\right)^u \left({{\partial }\over{\partial w_{3j}}}\right)^v & \cr &\quad\times \exp(2 \pi i z_1 w_{1j})\exp(2 \pi i z_2 w_{2j})\exp(2 \pi i z_3 w_{3j}) & \cr &\simeq \lambda(z_1)\lambda(z_2) \lambda(z_3) & \cr &\quad\times \sum_{k_1 = -{({M_1}/{2})} + 1}^{{{M_1}/{2}}} \ \sum_{k_2 = -{({M_2}/{2})}+1}^{{{M_2}/{2}}} \ \sum_{k_3 = -{({M_3}/{2})}+1}^{{{M_3}/{2}}} Q_l(k_1,k_2,k_3) & \cr &\quad\times \exp\left[2 \pi i \left({{m_1 k_1}\over{M_1}} + {{m_2 k_2}\over{M_2}} + {{m_3 k_3}\over{M_3}}\right)\right], &(3.5.4.4)} ]where [{\bf d}_{j,l,tuv}] are the transformed Hermite coefficients obtained from [{\bf c}_{j,l,tuv}] by change of variables, using [b_1,b_2,b_3] and where [Q_l(k_1,k_2,k_3)] is given by[\eqalign{Q_l(k_1,k_2,k_3) &= \sum_{j=1}^N \sum_{tuv} {\bf d}_{j,l,tuv} \cr &\quad\times \sum_{l_1=-\infty}^\infty \left({{\partial }\over{\partial w_{1j}}}\right)^t B(w_{1j} - k_1 - l_1 M_1) \cr &\quad\times \sum_{l_2=-\infty}^\infty \left({{\partial }\over{\partial w_{2j}}}\right)^t B(w_{2j} - k_2 - l_2 M_2) \cr &\quad\times \sum_{l_3=-\infty}^\infty \left({{\partial }\over{\partial w_{3j}}}\right)^t B(w_{3j} - k_3 - l_3 M_3) .} ]

In equation (3.5.4.2)[link] we recognize the approximation to [S({\bf m}) ] as [\lambda(z_1)\lambda(z_2) \lambda(z_3)] times the three-dimensional discrete Fourier transform of [Q(k_1,k_2,k_3)], which can be calculated rapidly by the 3DFFT. If the function B(w) has finite support then [Q(k_1,k_2,k_3)] can be calculated rapidly [in order(N) time] as well. Similarly we have the 3DFFT of [Q_l(k_1,k_2,k_3)] in equation (3.5.4.4)[link]. Thus the efficient calculation of structure factors depends on the approximation given in equation (3.5.4.1)[link], which in turn rests on the choice of B(w). The smooth PME relies on B-splines.

3.5.4.2.1.1. Properties of B-splines

| top | pdf |

We proceed by deriving the facts needed about B-splines. The B-splines of order n, Bn(w) are functions of a real variable defined as follows:[\eqalign{B_1(w) &= \left\{\let\normalbaselines\relax\openup1pt\matrix{ 1\hfill & \hbox{ if }0 \le w \le 1\hfill \cr 0\hfill& \hbox{otherwise}\hfill} \right. \cr&\cr B_2(w) &= \left\{\let\normalbaselines\relax\openup1pt\matrix{w\hfill & \hbox{ if }0 \le w \le 1\hfill\cr 2-w\hfill & \hbox{ if }1 \le w \le 2\hfill \cr 0 \hfill & \hbox{ otherwise}\hfill}\right. \cr &\cr B_{n+1}(w) &= \textstyle\int\limits_{-\infty}^\infty B_n(w-v) B_1(v) \,{\rm d}v \cr &= \textstyle\int\limits_0^1 B_n(w-v) B_1(v) \,{\rm d}v.} ]

We use the following properties of the B-splines:

  • (1) for [n \ge 1] and all w, [0 \le B_n(w) ] and [B_n(w) = 0] for [w \,\lt\, 0] or [w > n];

  • (2) for [n \ge 1] and all w, [\textstyle\sum_{j = -\infty}^\infty B_n(w-j) = 1 ];

  • (3) for [n \ge 1, \ 0 \le w \le n], [B_n(n-w) = B_n(w) ];

  • (4) for [n \ge 2], [0 \le w \le n], [({\rm d}/{\rm d}w) B_{n}(w) = B_{n-1}(w)\ -] [B_{n-1}(w-1)];

  • (5) for [n \ge 2], [0 \le w \le n], [B_n(w) = [w B_{n-1}(w)] + [(n - w) B_{n-1}(w-1)] / (n-1) ].

The first property to be established is clearly true for n = 1 and if true for n, then in the above integral recursively defining Bn+1 if [w \,\lt\, 0] or [w > (n+1)] the integrand must be zero for [0 \le v \le 1], so [B_{n+1}(w) = 0 ]. Thus it is true in general by mathematical induction.

The second property is also clearly true for n = 1. Assume it is true for n. Then [\textstyle\sum_{j = -\infty}^\infty B_{n+1}(w-j) ] = [\textstyle\int_0^1 [\textstyle\sum_{j=-\infty}^\infty B_n(w-v -j)] B_1(v) \,{\rm d}v ] = [\textstyle\int_0^1 B_1(v) \,{\rm d}v = 1]. Thus it is true in general by mathematical induction.

Next we show symmetry: [B_n(n-w) = B_n(w), \ 0 \le w \le n]. Again this is clearly true for n = 1. Assuming it is true for n, we have[\eqalign{B_{n+1}(n+1 - w) &= \textstyle\int\limits_0^1 B_n(n+1 - w - v) B_1(v) \,{\rm d}v \cr &= \textstyle\int\limits_0^1 B_n[n+1 - w - (1-v')] B_1(1-v') \,{\rm d}v' \cr &= \textstyle\int\limits_0^1 B_n[n - (w-v')] B_1(1-v')\,{\rm d}v' \cr &= \textstyle\int\limits_0^1 B_n(w - v')B_1(v') \,{\rm d}v' \cr &= B_{n+1}(w), } ]where we substituted [v = (1-v')] in the second line above.

The fourth property is an efficient recursion for the derivatives of B-splines. It can be seen directly for n = 2 for [w \ne 0,1, \hbox{or} \ 2 ]. Assume it is true for [n \ge 2]. Then for [0 \le w \le n+1 ],[\eqalign{{{\rm d}\over{{\rm d}w}} B_{n+1}(w) &= \int_0^1 {{\rm d}\over{{\rm d}w}} B_n(w-v) B_1(v) \,{\rm d}v \cr &= \int_0^1 B_{n-1}(w-v) B_1(v)\,{\rm d}v \cr &\quad- \int_0^1 B_{n-1}(w-1-v) B_1(v) \,{\rm d}v \cr &= B_n(w) - B_n(w-1),} ]thus it is true in general by induction.

The final property is an efficient recursion for [B_n(w)]. It can be seen directly for n = 2. Assume it is true for [n \ge 2]. Then for [0 \le w \le n+1],[\eqalign{B_{n+1}(w) &= \int_0^1 \left[{{w-v}\over{n-1}} B_{n-1}(w-v) \right.\cr &\quad + \left. {{n-(w-v)}\over{n-1}} B_{n-1}(w-v-1)\right] B_1(v)\,{\rm d}v \cr &= {{n}\over{n-1}} B_n(w-1) \cr &\quad + {{1}\over{n-1}} \int_0^1 (w-v) {{\rm d}\over{{\rm d}w}} B_n(w-v) B_1(v)\,{\rm d}v \cr &= {{n}\over{n-1}} B_n(w-1) \cr &\quad - {{1}\over{n-1}} \int_0^1 (w-v) {{\rm d}\over{{\rm d}v}} B_n(w-v) B_1(v)\,{\rm d}v \cr &= {{n}\over{n-1}} B_n(w-1) \cr &\quad- {{1}\over{n-1}}[(w-1)B_n(w-1) - w B_n(w)] \cr &\quad + {{1}\over{n-1}} \int_0^1 B_n(w-v) {{\rm d}\over{{\rm d}v}}(w-v)B_1(v) \,{\rm d}v \cr &= {{n+1 - w}\over{n-1}} B_n(w-1) + {{w}\over{n-1}} B_n(w) \cr &\quad- {{1}\over{n-1}} B_{n+1}(w),} ]where we have used property (4[link]) in the third line and integration by parts in the fifth and sixth lines. The induction step follows by moving the last term on the right-hand side over to the left side of the equation.

3.5.4.2.1.2. B-spline approximation to trigonometric functions

| top | pdf |

Next we examine properties of the B-spline approximations to the complex exponentials that appear in the structure factors. Let [g_n(w;z)] be given by[g_n(w;z) = \lambda_n(z) \textstyle\sum\limits_{j=-\infty}^\infty B_n(w - j) \exp(2 \pi i z j).\eqno(3.5.4.5) ]The complex number [\lambda_n(z)] is chosen to optimize the approximation of [\exp(2 \pi i z w)] by [g_n(w;z)] as shown below. It is convenient to first study the properties of the function [h_n(w;z)] given by[\eqalign{h_n(w\semi z) &= \exp(-2 \pi i z w) g_n(u\semi z) \cr &= \lambda_n(z)\textstyle\sum\limits_{j=-\infty}^\infty B_n(w - j) \exp[-2 \pi i z (w-j)],} ]which is an approximation to the constant one. From its definition it is clear that [h_n(w;z)] is periodic in w with period one, i.e. [h_n(w+1;z) = h_n(w;z)]. Therefore it can be expanded in a Fourier series:[h_n(w;z) = \textstyle\sum\limits_{k=-\infty}^\infty c_n(k;z) \exp(2 \pi i k w).\eqno(3.5.4.6) ]The Fourier coefficients are given by[\eqalign{c_n(k\semi z) &= \textstyle\int\limits_0^1 h_n(w\semi z) \exp(-2 \pi i k w) \,{\rm d}w \cr &= \lambda_n(z)\textstyle\int\limits_0^1 \textstyle\sum\limits_{j=-\infty}^\infty B_n(w - j) \cr &\quad\times \exp[-2 \pi i z (w-j)] \exp(-2 \pi i k w) \,{\rm d}u\cr &= \lambda_n(z)\textstyle\sum\limits_{j=-\infty}^\infty \textstyle\int\limits_0^1 B_n(w - j) \cr &\quad\times \exp[-2 \pi i (z+k) (w-j)]\, {\rm d}u\cr &= \lambda_n(z)\textstyle\int\limits_{-\infty}^\infty B_n(v) \exp[-2 \pi i (z+k) v]\, {\rm d}v \cr &= \lambda_n(z) \ \hat{B}_n (z+k),} ]where [\hat{B}_n] denotes the Fourier transform of [B_n]. Since [B_n] are defined recursively by convolutions, its Fourier transform is obtained simply from that of [B_1]: [\hat{B}_n = (\hat{B}_1)^n ]. Here we are using the fact that if f, g and h are related by convolution, [h(w) = (f\star g)(w) = \textstyle\int_{-\infty}^\infty f(w-v) g(v)\,{\rm d}v ], then [\hat{h} = \hat{f}\hat{g}]. Since [\hat{B}_1] can be obtained by straightforward integration, we have[c_n(k;z) = \left\{\let\normalbaselines\relax\openup2pt\matrix{ \lambda_n(z)\hfill& \hbox{if }z + k = 0\hfill \cr \lambda_n(z) {\displaystyle\left[{{1 -\exp(-2 \pi i z)}\over{2 \pi i}}\right]^n \left({{1}\over{z+k}}\right)^n} \hfill& \hbox{otherwise.}\hfill}\right. ]

The original smooth PME method set [\lambda_n(z) = \bar{\lambda}_n(z) ], where [\bar{\lambda}_n(z)] was chosen to enforce interpolation: [g_n(w;z) = \exp(2\pi i zw)] or [h_n(w;z) = 1] for integer values of w. This was related to the so-called Euler spline, studied for more general complex-valued interpolation problems by Schoenberg (1973[link]). The special case of interpolating the function [\exp(2\pi i zw) ] is simpler, due to the Fourier-series representation, allowing us to to derive necessary properties in a self-contained way. From equations (3.5.4.6)[link] and the above expression for [c_n(k;z)], noting that [c_n(k;z) = 0] for integral z with [z+k \ne 0], the interpolation condition can be written[1 = \left\{\let\normalbasesline\relax\openup3pt\matrix{\bar{\lambda}_n(z)\hfill & z \hbox{ an integer} \hfill\cr \bar{\lambda}_n(z) \displaystyle{\left[{{1 - \exp(-2 \pi i z)}\over{2 \pi i}}\right]^n\sum_{k=-\infty}^\infty \left({{1}\over{z+k}}\right)^n}\hfill & \hbox{otherwise.} \hfill}\right. ]

Note that except for [\bar{\lambda}_n(z)], the expressions appearing in the interpolation condition are periodic in z with period one. The interpolation condition thus forces [\bar{\lambda}_n(z)] to be periodic as well, and so we focus attention on the range [-\textstyle{1\over2} \le z \le {1\over 2}]. If [n \ge 2] the sum in the above interpolation condition is bounded, so in order to be able to choose [\bar{\lambda}_n(z)] to satisfy the interpolation condition it is sufficient to demonstrate that [\textstyle\sum_{k=-\infty}^\infty [{{1}/({z+k})}]^n \ne 0 ] for [-\textstyle{1\over2} \le z \le \textstyle{1\over2}]. If n is even, the terms in this sum are all positive so it is always possible to interpolate. If n is odd and z = ½ then the sum is zero, since [(1/z)^n + [1 / (-1 +z)]^n = 0 ], [[1/(1+z)^n + 1 / (-2+z)]^n = 0] etc. Similarly the interpolation condition fails for z = −½. However, for [0 \,\lt\, z \,\lt\, \textstyle{1\over2}], [(1/z)^n + [1 / (-1 +z)]^n > 0], [[1/(1+z)^n + 1 / (-2+z)]^n > 0] etc. Similarly for [-\textstyle{1\over2} \,\lt\, z \,\lt\, 0 ]. Thus we can say that for [n \ge 2] and [-\textstyle{1\over2} \,\lt\, z \,\lt\, \textstyle{1\over2} ] the interpolation condition holds, and in this case we can write [h_n] as[h_n(w;z) = \textstyle\sum\limits_{k = -\infty}^\infty \gamma_n(k;z) \exp(2 \pi i k w),\eqno(3.5.4.7) ]where[\gamma_n(k;z) = \left({{1}\over{z + k}}\right)^n \bigg/ \sum_{j= -\infty}^\infty \left({{1}\over{z + j}}\right)^n.\eqno(3.5.4.8) ]The interpolating [g_n(w;z)] is obtained from this [h_n] by multiplying by [\exp(2 \pi i z w)]. Numerically, it is more convenient to use the alternative expression from equation (3.5.4.5)[link], with [\bar{\lambda}_n(z) ] given by[\bar{\lambda}_n(z) = \left[\textstyle\sum\limits_{j=0}^n B_n(j) \exp(-2 \pi i z j)\right]^{-1}. ]Note that due to the above results for [h_n], [\textstyle\sum_{j=0}^n B_n(j) \exp(-2 \pi i z j) \ne 0 ] if [n \ge 2] and [-\textstyle{1\over2} \,\lt\, z \,\lt\, \textstyle{1\over2}].

Subsequent to the work by Essmann et al. (1995[link]), we found (Darden et al., 1997[link]) that a more accurate approximation was given by least-squares fitting of [\exp(2 \pi i z w) ] by [g_n(w;z)], or equivalently 1 by [h_n(w;z)]. Thus, with [h_n] given by equation (3.5.4.7)[link] we are looking for a complex number [\tilde{\lambda} = \tilde{\lambda}_n(z)] that minimizes the mean-square error MSE, given by[\eqalign{{\rm MSE} &= \textstyle\int\limits_0^1 [1 - \tilde{\lambda} h_n(w\semi z) ][1 - \tilde{\lambda} h_n(w\semi z) ]^{*} \,{\rm d}w \cr &= 1 - 2 Re(\tilde{\lambda}) \gamma_n(0\semi z)+ [Re(\tilde{\lambda})^2 + Im(\tilde{\lambda})^2]\textstyle\sum\limits_{k=-\infty}^\infty \gamma_n^2(k\semi z),} ]where [Re(\tilde{\lambda})] and [Im(\tilde{\lambda})] denote the real and imaginary parts of [\tilde{\lambda}]. Since [\textstyle\sum_{k=-\infty}^\infty \gamma_n^2(k;z) > 0 ], it is clear that we can set [Im(\tilde{\lambda})] to zero, that is, restrict ourselves to real [\tilde{\lambda}]. Then, minimizing over real [\lambda] we have[\tilde{\lambda}_n(z) = \gamma_n(0\semi z) \bigg/ \textstyle\sum\limits_{k=-\infty}^\infty \gamma_n^2(k\semi z), ]with [\gamma_n(k;z)] given in equation (3.5.4.8)[link].

Thus we arrive at the least-squares optimal approximation to [\exp(2 \pi i z u) ] by setting [\lambda_n(z) = \tilde{\lambda}_n(z)\bar{\lambda}_n(z) ] in equation (3.5.4.5)[link]. The mean-square error due to this can be seen to be[\eqalign{{\rm MSE} &= 1 - \gamma_n^2(0\semi z) \bigg/ \textstyle\sum\limits_{k=-\infty}^\infty \gamma_n^2(k\semi z) \cr &= \textstyle\sum\limits_{k \ne 0} \gamma_n^2(k\semi z) \bigg/ \textstyle\sum\limits_{k=-\infty}^\infty \gamma_n^2(k\semi z)} ]and thus the error in the approximation is seen to be small for small [|z|] but increasing as [|z| \to \textstyle{1\over2}], as discussed above (the reciprocal Ewald weight term compensates for the increased error by rapidly damping the larger [|z|] terms). For [0 \,\lt\, |z| \,\lt\, \textstyle{1\over2}] the error decreases to zero as the B-spline order [n \to \infty]. The accuracy of the PME is thus increased by raising the order of interpolation and/or the grid size [M_1,M_2,M_3], which lowers [|z|] for a given m. When derivatives with respect to w of [g_n(w;z)] are considered in this latter representation, their effect is to effectively lower the order of interpolation in the coefficients [\gamma_n]. Thus the order of interpolation needs to be higher, the higher the order of differentiation considered.

3.5.4.2.2. The fast Fourier Poisson approach

| top | pdf |

The fast Fourier Poisson (FFP) method (York & Yang, 1994[link]) accelerates the calculation of the structure factors in equations (3.5.2.14)[link] and (3.5.3.12)[link] in much the same way as FFT methods accelerate structure-factor and density-map calculations in macromolecular structure determination. Originally the method was used to calculate structure factors appearing in Ewald sums over point charges. It is easier to see the basic idea if we rewrite the Ewald reciprocal sum from equation (3.5.2.15)[link] as follows:[\eqalign{E_{\rm r}(\underline{\bf r}^{\lbrace\bf N\rbrace}) &= {{1}\over{2 \pi V}} \sum_{{\bf m} \ne {\bf 0}} {{\exp(-\pi^2 {\bf m}^2 / \alpha)}\over{{\bf m}^2}} S({\bf m})S({-\bf m})\cr &= {{1}\over{2 \pi V}} \sum_{{\bf m} \ne {\bf 0}} {{1}\over{{\bf m}^2}} \tilde{S}({\bf m}) \tilde{S}(-{\bf m}),} ]where the modified structure factor [\tilde{S}({\bf m})] is given by[\tilde{S}({\bf m}) = \exp(-\pi^2 {\bf m}^2 / 2 \alpha)\textstyle\sum\limits_{j=1}^N q_j \exp(2 \pi i {\bf m}\cdot{\bf r}_j). ]

We recognize [\tilde{S}({\bf m})] as the three-dimensional Fourier transform, evaluated at m, of a density [\tilde{\rho}({\bf r})] given by[\tilde{\rho}({\bf r}) = \textstyle\sum\limits_{j=1}^N q_j \rho_{2 \alpha}({\bf r} - {\bf r}_j), ]where [\rho_{2 \alpha}] is a normalized Gaussian with Gaussian exponent 2α.

In turn we approximate this integral by a discrete Fourier transform. As above let [M_1,M_2,M_3] be positive integers of the scale of the unit-cell dimensions. Let [\delta v = V / (M_1 M_2 M_3)], where V is the volume of the unit cell. Recall the lattice basis vectors [{\bf a}_\alpha ], [\alpha=1,2,3]. Given integers [k_1,k_2,k_3] define [\tilde{\rho}_0(k_1,k_2,k_3)] by [\tilde{\rho}_0(k_1,k_2,k_3) = \tilde{\rho}[(k_1/M_1) {\bf a}_1 + (k_2/M_2) {\bf a}_2 + (k_3/M_3) {\bf a}_3] ]. Then we can approximate the modified structure factor [\tilde{S}({\bf m}) ] by[\eqalign{\tilde{S}({\bf m}) &\simeq \sum_{k_1 = -{({M_1}/{2})} + 1}^{{{M_1}/{2}}} \ \sum_{k_2 = -{({M_2}/{2})}+1}^{{{M_2}/{2}}} \ \sum_{k_3 = -{({M_3}/{2})}+1}^{{{M_3}/{2}}} Q(k_1,k_2,k_3) \cr &\quad\times \exp\left[2 \pi i \left({{m_1 k_1}\over{M_1}} + {{m_2 k_2}\over{M_2}} + {{m_3 k_3}\over{M_3}}\right)\right],} ]where [Q(k_1,k_2,k_3)] is given by[Q(k_1,k_2,k_3) = \textstyle\sum\limits_{l_1,l_2,l_3=-\infty}^\infty \tilde{\rho}_0(k_1+l_1M_1,k_2+l_2M_2,k_3+l_3M_3) \cdot \delta v. ]

The FFP can be generalized to Hermite Gaussians by sampling that density in place of the superposition of spherical Gaussians. In equation (3.5.3.11)[link] the term[\eqalign{&({1}/{2 \pi V}) \textstyle\sum\limits_{{\bf m} \ne {\bf 0}}({1}/{{\bf m}^2})\exp(-\pi^2 {\bf m}^2 /2 \beta) \textstyle\sum\limits_{l_1\in C}S_{l_1}({\bf m}) \cr&\quad\times\exp(-\pi^2 {\bf m}^2 /2 \beta) \textstyle\sum\limits_{l_2\in C} S_{l_2}(-{\bf m})} ]can be handled by the methods just outlined. However the term[\eqalign{& ({1}/{2 \pi V})\textstyle\sum\limits_{{\bf m} \ne {\bf 0}} ({1}/{{\bf m}^2})\textstyle\sum\limits_{(l_1,l_2)\not\in C\times C} \exp(-\pi^2 {\bf m}^2 / \theta_{l_1}) S_{l_1}({\bf m}) \cr &\quad\times \exp(-\pi^2 {\bf m}^2 / \theta_{l_2}) S_{l_2}(-{\bf m})} ]can present problems. The difficulty comes in the case when, for example, [l_1] is compact but [l_2] is not. Then the combined Gaussian weights are sufficiently damping to efficiently converge the series, but it can be very inefficient to directly sample the superposition of Hermite Gaussians with a compact exponent. A similar problem was faced by Fusti-Molnar & Pulay (2002[link]). Their solution was to filter out high frequencies in the Fourier-transformed compact densities and back transform them to obtain diffuse real-space densities. Instead, Cisneros et al. (2006[link]) adjusted the exponents of both the compact and diffuse Gaussians. Choose a [\bar{\theta}] such that if [\theta_l] is diffuse [1/\theta_l > 1 / \bar{\theta}]. For example, choose [\bar{\theta} = 1 / (4 \beta) ]. As before, define [\tilde{S}_l({\bf m})] by [\tilde{S}_l({\bf m})\ = ] [\exp(-\pi^2 {\bf m}^2 / \theta_l) S_l({\bf m}) ], and [\tilde{S}_{l+}({\bf m}),\tilde{S}_{l-}({\bf m})] by[\eqalign{\tilde{S}_{l+}({\bf M}) &= \exp\left[-\pi^2 {\bf m}^2 \left({{1}\over{\theta_l}}+{{1}\over{4 \beta}}\right)\right] S_l({\bf m})\cr \tilde{S}_{l-}({\bf M}) &= \exp\left[-\pi^2 {\bf m}^2 \left({{1}\over{\theta_l}}-{{1}\over{4 \beta}}\right)\right] S_l({\bf m}).} ]Then, if [l_1 \in C] and [l_2 \in D],[\textstyle\sum\limits_{l_1 \in C} \textstyle\sum\limits_{l_2 \in D} \tilde{S}_{l_1}({\bf m}) \tilde{S}_{l_2}({\bf m}) = \textstyle\sum\limits_{l_1 \in C} \textstyle\sum\limits_{l_2 \in D} \tilde{S}_{l_1+}({\bf m}) \tilde{S}_{l_2-}({\bf m}). ]Thus instead of approximating [\tilde{S}_l({\bf m})] by sampling the Gaussian followed by applying the 3DFFT, they approximated [\tilde{S}_{l+}({\bf m}) ] or [\tilde{S}_{l-}({\bf m})]. Note that with the above choice of [\bar{\theta}], for any [l \in C], [\theta_{l+} \,\lt\, 4 \beta ] and for any [l \in D], [\theta_{l-} \,\lt\, 4 \beta]. That is, all the modified Gaussians can be sampled on a uniform grid. Finally, note that for the diffuse–diffuse interactions[\eqalign{&\textstyle\sum\limits_{l_1 \in D} \textstyle\sum\limits_{l_2 \in D} \tilde{S}_{l_1}({\bf m}) \tilde{S}_{l_2}({\bf m})\cr &\quad= \textstyle\sum\limits_{l_1 \in D} \textstyle\sum\limits_{l_2 \in D} \tilde{S}_{l_1-}({\bf m}) \tilde{S}_{l_2-}({\bf m}) \exp(-\pi^2 {\bf m}^2 / 2 \beta),} ]i.e. the interaction between adjusted diffuse Gaussians can be corrected efficiently in reciprocal space.

To test the efficiency of these reciprocal-space approaches, the electrostatic energy and intermolecular forces were calculated for a series of water boxes (64, 128, 256, 512 and 1024 water molecules) under periodic boundary conditions. Similar calculations were performed by Cisneros et al. (2006[link]). As in that paper, the electron density of the water molecule was modelled by a five-site expansion consisting of the oxygen, the two hydrogens and the two bond midpoints, and using standard auxiliary basis functions, denoted A2 and P1, from the density-functional literature. These basis functions were obtained from the Environmental Molecular Sciences Laboratory at Pacific Northwest Laboratories (http://www.emsl.pnl.gov/forms/basisform.html ). The A2 basis set expansion uses 21 Hermite primitives with 102 Hermite coefficients per water molecule, resulting in a root-mean-square relative molecular force error of about 8% compared to 6–31G* B3LYP intermolecular electrostatics (Cisneros et al., 2006[link]). The more complex P1 basis set expansion uses 38 Hermite primitives with 200 Hermite coefficients per water molecule, reducing the relative molecular force error to about 2%. For each water box the electrostatic energy and forces for the model A2 or P1 densities were calculated to a relative root-mean-square molecular force error of 10−3 or 10−4, compared to exact Ewald summation of the respective model density.

The time to perform these calculations is given in Table 3.5.4.1[link]. Note that the times given here are considerably faster than those reported in Figs. 13 and 14 of Cisneros et al. (2006[link]). This is partly due to the faster processor used here (3.6 GHz Intel Xeon versus 3.2 GHz Xeon used previously; Intel compiler with similar options used in both cases) but mainly due to the more efficient algorithms described herein. Note that the PME and FFP timings increase in an approximately linear fashion as a function of box size whereas the Ewald sum timings behave quadratically. For each method the timings are a continuous function of the compact–diffuse separating exponent 2β described above in Section 3.5.3.1[link], except when 2β equals a Gaussian exponent in the model basis set. When this happens, a set of Hermite primitives change from compact to diffuse or vice versa and the relative balance of direct-sum or reciprocal-sum calculation changes discontinuously. For the smooth PME calculations it was optimal to choose the separating exponent at one of these boundaries. This separating exponent value was then retained for the FFP and Ewald calculations, although the FFP method could be made slightly more efficient by further varying this parameter, and the Ewald summation could be considerably accelerated to an order N3/2 method, where N is the number of water molecules, by making the separating exponent a decreasing function of system size. Neither of these optimizations would change the relative efficiencies, however.

Table 3.5.4.1| top | pdf |
Timing versus accuracy for smooth PME, FFP and regular Ewald sum reciprocal-space approaches

(a) Time in seconds for A2 model density.

Box sizeRMS force error 10−3RMS force error 10−4
PMEFFPEwaldPMEFFPEwald
64 0.106 0.142 0.365 0.144 0.274 0.520
128 0.218 0.271 1.336 0.287 0.380 1.869
256 0.387 0.528 5.239 0.517 0.846 7.256
512 0.837 1.100 17.881 1.104 1.534 25.511
1024 1.701 2.236 71.513 2.249 3.152 108.158

(b) Time in seconds for P1 model density

Box sizeRMS force error 10−3RMS force error 10−4
PMEFFPEwaldPMEFFPEwald
64 0.310 0.399 2.321 0.478 0.688 3.858
128 0.591 0.832 7.706 0.923 1.576 11.056
256 1.186 1.920 35.178 1.805 2.736 49.107
512 2.549 3.825 119.863 3.794 5.684 183.487
1024 4.953 6.890 486.384 6.947 11.307 714.644

The error in the FFP for a given Gaussian exponent is controlled by the size of the grid [M_1,M_2,M_3] as in the PME method, and also by the completeness in the Gaussian sampling (the cutoff in the tails of the Gaussian). Typically, for equivalent accuracy, the FFP requires more extensive sampling onto grid points than does the smooth PME. That is, each Gaussian centred at an expansion point must be sampled over a considerably larger set of points than the tensor product B-spline in equation (3.5.4.3)[link]. The additional cost due to this has been ameliorated somewhat by the Gaussian split Ewald approach (Shan et al., 2005[link]). Additionally, in rectangular unit cells fast Poisson solvers such as multigrid (Beck, 2000[link]; Sagui & Darden, 2001[link]) can be applied, which in principle should parallelize more efficiently than the 3DFFT. Finally, although the smooth PME is typically more efficient for a single Gaussian exponent structure factor, the FFP approach has the strong advantage that the various Gaussian superpositions having possibly different exponents can be sampled and then fast Fourier transformed together, whereas they must be kept separate in the PME approach. Thus, as here Cisneros et al. (2006[link]) found that the smooth PME approach was more efficient for smaller auxiliary basis sets, whereas they found that the FFP approach became more efficient for large auxiliary basis sets involving many different Gaussian exponents.

References

Aguado, A. & Madden, P. A. (2003). Ewald summation of electrostatic multipole interactions up to the quadrupolar level. J. Chem. Phys. 119, 7471–7483.
Allen, M. P. & Tildesley, D. J. (1987). Computer Simulation of Liquids. Oxford: Clarendon Press.
Beck, T. L. (2000). Real-space mesh techniques in density functional theory. Rev. Mod. Phys. 72, 1041–1080.
Cisneros, G. A., Piquemal, J. P. & Darden, T. A. (2006). Generalization of the Gaussian electrostatic model: Extension to arbitrary angular momentum, distributed multipoles, and speedup with reciprocal space methods. J. Chem. Phys. 125, 184101.
Darden, T. A., Toukmaji, A. & Pedersen, L. (1997). Long-range electrostatic effects in biomolecular simulations. J. Chim. Phys. 94, 1346–1364.
Darden, T. A., York, D. M. & Pedersen, L. G. (1993). Particle mesh Ewald: An N log(N) method for Ewald sums in large systems. J. Chem. Phys. 98, 10089–10092.
Deserno, M. & Holm, C. (1998). How to mesh up Ewald sums I: A theoretical and numerical comparison of various particle mesh routines. J. Chem. Phys. 109, 7678–7693.
Essmann, U., Perera, L., Berkowitz, M. L., Darden, T., Lee, H. & Pedersen, L. G. (1995). A smooth particle mesh Ewald method. J. Chem. Phys. 103, 8577–8593.
Fusti-Molnar, L. & Pulay, P. (2002). The Fourier transform Coulomb method: Efficient and accurate calculation of the Coulomb operator in a Gaussian basis. J. Chem. Phys. 117, 7827–7835.
Helgaker, T., Jorgensen, P. & Olsen, J. (2000). Molecular Electronic-Structure Theory. Chichester: John Wiley and Sons.
Hockney, R. W. & Eastwood, J. W. (1981). Computer Simulation Using Particles. New York: McGraw-Hill.
McMurchie, L. E. & Davidson, E. R. (1978). One-electron and 2-electron integrals over Cartesian Gaussian functions. J. Comput. Phys. 26, 218–231.
Perram, J. W., Petersen, H. G. & DeLeeuw, S. W. (1988). An algorithm for the simulation of condensed matter that grows as the 3/2 power of the number of particles. Mol. Phys. 65, 875–893.
Pollock, E. L. & Glosli, J. (1996). Comments on p(3)m, fmm and the Ewald method for large periodic Coulombic systems. Comput. Phys. Comm. 95, 93–110.
Sagui, C. & Darden, T. (2001). Multigrid methods for classical molecular dynamics simulations of biomolecules. J. Chem. Phys. 114, 6578–6591.
Sagui, C., Pedersen, L. G. & Darden, T. A. (2004). Towards an accurate representation of electrostatics in classical force fields: Efficient implementation of multipolar interactions in biomolecular simulations. J. Chem. Phys. 120, 73–87.
Schoenberg, I. J. (1973). Cardinal Spline Interpolation. Philadelphia: Society for Industrial and Applied Mathematics.
Shan, Y. B., Klepeis, J. L., Eastwood, M. P., Dror, R. O. & Shaw, D. E. (2005). Gaussian split Ewald: A fast Ewald mesh method for molecular simulations. J. Chem. Phys. 122, 054101.
Smith, W. (1982). Point multipoles in the Ewald summation. CCP5 Inf. Q. 4, 13–25.
Smith, W. (1998). Point multipoles in the Ewald summation (revisited). CCP5 Inf. Q. 46, 18–30.
York, D. & Yang, W. (1994). The fast Fourier Poisson (FFP) method for calculating Ewald sums. J. Chem. Phys. 101, 3298–3300.








































to end of page
to top of page