International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli. © International Union of Crystallography 2010
International Tables for Crystallography (2010). Vol. B, ch. 2.1, pp. 203–207
Section 2.1.7. Non-ideal distributions: the correction-factor approach
^{a}School of Chemistry, Tel Aviv University, Tel Aviv 69 978, Israel, and ^{b}St John's College, Cambridge, England
The probability density functions (p.d.f.'s) of the magnitude of the structure factor, presented in Section 2.1.5, are based on the central-limit theorem discussed above. In particular, the centric and acentric p.d.f.'s given by equations (2.1.5.11) and (2.1.5.8), respectively, are expected to account for the statistical properties of diffraction patterns obtained from crystals consisting of nearly equal atoms, which obey the fundamental assumptions of uniformity and independence of the atomic contributions and are not affected by non-crystallographic symmetry and dispersion. It is also assumed there that the number of atoms in the asymmetric unit is large. Distributions of structure-factor magnitudes which are based on the central-limit theorem, and thus obey the above assumptions, have been termed `ideal'. The subjects of the following sections are those distributions for which some of the above assumptions/restrictions are not fulfilled; these will be called `non-ideal'.
We recall that the assumption of uniformity consists of the requirement that the fractional part of the scalar product $hx + ky + lz$ be uniformly distributed over the [0, 1] interval, which holds well if $x$, $y$, $z$ are rationally independent (Hauptman & Karle, 1953), and permits one to regard the atomic contribution to the structure factor as a random variable. This is of course a necessary requirement for any statistical treatment. If, however, the atomic composition of the asymmetric unit is widely heterogeneous, the structure factor is then a sum of unequally distributed random variables and the Lindeberg–Lévy version of the central-limit theorem (cf. Section 2.1.4.4) cannot be expected to apply. Other versions of this theorem might still predict a normal p.d.f. of the sum, but at the expense of a correspondingly large number of terms/atoms. It is well known that atomic heterogeneity gives rise to severe deviations from ideal behaviour (e.g. Howells et al., 1950), and one of the aims of crystallographic statistics has been the introduction of a correct dependence on the atomic composition into the non-ideal p.d.f.'s [for a review of the early work on non-ideal distributions see Srinivasan & Parthasarathy (1976)]. A somewhat less well known fact is that the dependence of the p.d.f.'s of $|E|$ on space-group symmetry becomes more conspicuous as the composition becomes more heterogeneous (e.g. Shmueli, 1979; Shmueli & Wilson, 1981). Hence both the composition and the symmetry dependence of the intensity statistics are of interest. Other problems, which likewise give rise to non-ideal p.d.f.'s, are the presence of heavy atoms in (variable) special positions, heterogeneous structures with complete or partial non-crystallographic symmetry, and the presence of outstandingly heavy dispersive scatterers.
The need for theoretical representations of non-ideal p.d.f.'s is exemplified in Fig. 2.1.7.1, which shows the ideal centric and acentric p.d.f.'s together with a frequency histogram of $|E|$ values, recalculated for a centrosymmetric structure containing a platinum atom in the asymmetric unit (Faggiani et al., 1980). Clearly, the deviation from the Gaussian p.d.f. predicted by the central-limit theorem is here very large, and a comparison with the possible ideal distributions can (in this case) lead to wrong conclusions.
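The effect of a single heavy atom can be reproduced with a small simulation. The sketch below is not the calculation behind Fig. 2.1.7.1: it assumes a hypothetical centrosymmetric model in which each atom of the asymmetric unit and its inversion mate contribute $2f_j\cos\theta_j$ to the structure factor, draws the phases $\theta_j$ uniformly (the uniformity assumption), and crudely replaces scattering factors by atomic numbers. Under these assumptions the fourth moment is $\langle |E|^4\rangle = 3 - \tfrac{3}{2}\sum_j f_j^4/(\sum_j f_j^2)^2$, which the simulation should approach; the ideal centric value is 3.

```python
import math
import random


def simulate_e4_centric(f, n_refl=20000, seed=0):
    """Monte Carlo estimate of <|E|^4> for a hypothetical centrosymmetric model.

    Each atom j (with its inversion mate) contributes 2*f_j*cos(theta_j),
    where theta_j is drawn uniformly from [0, 2*pi).
    """
    rng = random.Random(seed)
    mean_f2 = 2.0 * sum(fj * fj for fj in f)  # <F^2> for this model
    total = 0.0
    for _ in range(n_refl):
        big_f = sum(2.0 * fj * math.cos(rng.uniform(0.0, 2.0 * math.pi)) for fj in f)
        total += (big_f * big_f / mean_f2) ** 2  # |E|^4 for this reflection
    return total / n_refl


def e4_centric_theory(f):
    """<|E|^4> = 3 - 1.5 * sum(f^4) / (sum(f^2))^2 for the same model."""
    s2 = sum(fj ** 2 for fj in f)
    s4 = sum(fj ** 4 for fj in f)
    return 3.0 - 1.5 * s4 / s2 ** 2


# Hypothetical compositions (scattering factors approximated by atomic numbers):
heavy = [78.0] + [6.0] * 20   # one Pt-like atom plus 20 C-like atoms
equal = [6.0] * 21            # 21 equal C-like atoms

e4_heavy = simulate_e4_centric(heavy)
e4_equal = simulate_e4_centric(equal)

print("heavy-atom model: simulated", e4_heavy, "theory", e4_centric_theory(heavy))
print("equal-atom model: simulated", e4_equal, "theory", e4_centric_theory(equal))
```

In this toy model the equal-atom composition stays close to the ideal value of 3, whereas the heavy-atom composition falls far below it, which is the kind of deviation the correction-factor approach is meant to describe.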
Two general approaches have so far been employed in derivations of non-ideal p.d.f.'s which account for the above-mentioned problems: the correction-factor approach, to be dealt with in the following sections, and the more recently introduced Fourier method, to which Section 2.1.8 is dedicated. In what follows, we introduce briefly the mathematical background of the correction-factor approach, apply this formalism to centric and acentric non-ideal p.d.f.'s, and present the numerical values of the moments of the trigonometric structure factor which permit an approximate evaluation of such p.d.f.'s for all the three-dimensional space groups.
Suppose that $p(x)$ is a p.d.f. which accurately describes the experimental distribution of the random variable $x$, where $x$ is related to a sum of random variables and can be assumed to obey (to some approximation) an ideal p.d.f., say $p^{(0)}(x)$, based on the central-limit theorem. In the correction-factor approach we seek to represent $p(x)$ as

$$p(x) = p^{(0)}(x) \sum_{k} d_k f_k(x), \qquad (2.1.7.1)$$

where $d_k$ are coefficients which depend on the cause of the deviation of $p(x)$ from the central-limit theorem approximation and $f_k(x)$ are suitably chosen functions of $x$. A choice of the set $\{f_k\}$ is deemed suitable, if only from a practical point of view, if it allows the convenient introduction of the cause of the above deviation of $p(x)$ into the expansion coefficients $d_k$. This requirement is satisfied, also from a theoretical point of view, by taking $f_k(x)$ as a set of polynomials which are orthogonal with respect to the ideal p.d.f., taken as their weight function (e.g. Cramér, 1951). That is, the functions so chosen have to obey the relationship

$$\int_a^b f_k(x) f_m(x) p^{(0)}(x)\,dx = \delta_{km} = \begin{cases} 1, & k = m \\ 0, & k \neq m, \end{cases} \qquad (2.1.7.2)$$

where $[a, b]$ is the range of existence of all the functions involved. It can be readily shown that the coefficients $d_k$ are given by

$$d_k = \int_a^b f_k(x) p(x)\,dx = \langle f_k(x) \rangle = \sum_{n=0}^{k} c_n^{(k)} \langle x^n \rangle, \qquad (2.1.7.3)$$

where the brackets in equation (2.1.7.3) denote averaging with respect to the unknown p.d.f. $p(x)$ and $c_n^{(k)}$ is the coefficient of the $n$th power of $x$ in the polynomial $f_k(x)$. The coefficients $d_k$ are thus directly related to the moments of the non-ideal distribution and the coefficients of the powers of $x$ in the orthogonal polynomials. The latter coefficients can be obtained by the Gram–Schmidt procedure (e.g. Spiegel, 1974), or by direct use of the Szegö determinants (e.g. Cramér, 1951), for any weight function that has finite moments. However, the feasibility of the present approach depends on our ability to obtain the moments $\langle x^n \rangle$ without the knowledge of the non-ideal p.d.f., $p(x)$.
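The procedure just described can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: it takes as the variable $u = |E|^2$ with the ideal acentric weight $\exp(-u)$ on $[0, \infty)$, whose moments are $n!$, builds orthonormal polynomials by the Gram–Schmidt procedure from those moments alone, and then forms the coefficients $d_k$ as sums of polynomial coefficients times moments of the distribution of interest. The function names and the 'non-ideal' moment values are hypothetical.

```python
import math


def inner(p, q, mu):
    """Weighted inner product of two polynomials (coefficients, lowest power first).

    mu[n] is the n-th moment of the weight function, so <x^i, x^j> = mu[i + j].
    """
    return sum(a * b * mu[i + j] for i, a in enumerate(p) for j, b in enumerate(q))


def orthonormal_polynomials(mu, kmax):
    """Gram-Schmidt on 1, x, x^2, ...; needs the moments mu[0 .. 2*kmax]."""
    basis = []
    for k in range(kmax + 1):
        v = [0.0] * k + [1.0]                # the monomial x^k
        for f in basis:                      # remove components along earlier f's
            proj = inner(v, f, mu)
            for i, c in enumerate(f):
                v[i] -= proj * c
        norm = math.sqrt(inner(v, v, mu))
        basis.append([c / norm for c in v])
    return basis


def expansion_coefficients(basis, moments):
    """d_k = sum_n c_n^(k) <x^n>, with <x^n> supplied by the caller."""
    return [sum(c * moments[n] for n, c in enumerate(f)) for f in basis]


KMAX = 3
# Moments of the ideal weight exp(-u) on [0, inf): mu_n = n!
ideal_mu = [float(math.factorial(n)) for n in range(2 * KMAX + 1)]
basis = orthonormal_polynomials(ideal_mu, KMAX)

# With the ideal moments themselves, only d_0 survives.
d_ideal = expansion_coefficients(basis, ideal_mu)

# Hypothetical moments <u^0>, <u^1>, <u^2>, <u^3> of some non-ideal distribution.
nonideal_moments = [1.0, 1.0, 1.6, 3.5]
d_nonideal = expansion_coefficients(basis, nonideal_moments)

print("f_2 coefficients:", basis[2])
print("d_k (ideal):     ", d_ideal)
print("d_k (non-ideal): ", d_nonideal)
```

For this weight the construction returns the Laguerre polynomials up to sign (for example $f_2 = 1 - 2u + u^2/2$), and the coefficients $d_k$ with $k \geq 1$ vanish when the ideal moments are supplied, so any non-zero $d_k$ measures a departure from the ideal distribution.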
We shall summarize here the non-ideal centric and acentric distributions of the magnitude of the normalized structure factor $E$ (e.g. Shmueli & Wilson, 1981; Shmueli, 1982). We assume that (i) all the atoms are located in general positions and have rationally independent coordinates, (ii) all the scatterers are dispersionless, and (iii) there is no non-crystallographic symmetry. Arbitrary atomic composition and space-group symmetry are admitted. The appropriate weight functions and the corresponding orthogonal polynomials are

centric: $p^{(0)}(|E|) = (2/\pi)^{1/2}\exp(-|E|^2/2)$, $f_k = He_{2k}(|E|)/[(2k)!]^{1/2}$;
acentric: $p^{(0)}(|E|) = 2|E|\exp(-|E|^2)$, $f_k = L_k(|E|^2)$, (2.1.7.4)

where $He_k$ and $L_k$ are Hermite and Laguerre polynomials, respectively, as defined, for example, by Abramowitz & Stegun (1972). Equations (2.1.7.2), (2.1.7.3) and (2.1.7.4) suffice for the general formulation of the above non-ideal p.d.f.'s of $|E|$. Their full derivation entails (i) the expression of a sufficient number of moments of $|E|$ in terms of absolute moments of the trigonometric structure factor (e.g. Shmueli & Wilson, 1981; Shmueli, 1982) and (ii) calculation of the latter moments for the various symmetries (Wilson, 1978b; Shmueli & Kaldor, 1981, 1983). The notation below is similar to that employed by Shmueli (1982).
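The orthonormality required of the polynomials can be checked numerically. The sketch below assumes the ideal centric p.d.f. $(2/\pi)^{1/2}\exp(-|E|^2/2)$ paired with $He_{2k}(|E|)/[(2k)!]^{1/2}$ and the ideal acentric p.d.f. $2|E|\exp(-|E|^2)$ paired with $L_k(|E|^2)$, evaluates the polynomials by their standard three-term recurrences, and integrates with Simpson's rule on a truncated range. The helper names and the cutoff at $|E| = 12$ are choices made for this illustration only.

```python
import math


def hermite_he(n, x):
    """Hermite polynomial He_n(x) via He_{k+1} = x*He_k - k*He_{k-1}."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def laguerre(n, x):
    """Laguerre polynomial L_n(x) via (k+1)*L_{k+1} = (2k+1-x)*L_k - k*L_{k-1}."""
    if n == 0:
        return 1.0
    l_prev, l = 1.0, 1.0 - x
    for k in range(1, n):
        l_prev, l = l, ((2 * k + 1 - x) * l - k * l_prev) / (k + 1)
    return l


def simpson(func, a, b, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    s = func(a) + func(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * func(a + i * h)
    return s * h / 3.0


def centric_weight(e):
    return math.sqrt(2.0 / math.pi) * math.exp(-e * e / 2.0)


def acentric_weight(e):
    return 2.0 * e * math.exp(-e * e)


def f_centric(k, e):
    return hermite_he(2 * k, e) / math.sqrt(math.factorial(2 * k))


def f_acentric(k, e):
    return laguerre(k, e * e)


def gram_matrix(weight, f, kmax, upper=12.0):
    """Matrix of integrals of weight * f_k * f_m over [0, upper]."""
    return [[simpson(lambda e: weight(e) * f(k, e) * f(m, e), 0.0, upper)
             for m in range(kmax + 1)] for k in range(kmax + 1)]


gram_c = gram_matrix(centric_weight, f_centric, 3)
gram_a = gram_matrix(acentric_weight, f_acentric, 3)
for row in gram_c:
    print(["%.6f" % v for v in row])
```

Both matrices should come out as the identity to within the integration error, which is the content of the orthonormality relation for these two weight functions.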
These non-ideal p.d.f.'s of $|E|$, for which the first five expansion terms are available, are given by equations (2.1.7.5) and (2.1.7.6) for centrosymmetric and noncentrosymmetric space groups, respectively, in which the ideal centric and acentric p.d.f.'s [see (2.1.7.4)] appear as the weight functions. The unified form of the expansion coefficients of the two series, for k = 2, 3, 4 and 5, is given by equation (2.1.7.7) (Shmueli, 1982), where U = 35 or 18, V = 210 or 100 and W = 3150 or 900 according as the centric or the acentric coefficient is required, respectively; the other quantities in equation (2.1.7.7) are described below. The composition-dependent terms in equation (2.1.7.7) are functions of the scattering factors of the m atoms in the asymmetric unit. The symmetry dependence is expressed by the remaining coefficients in equation (2.1.7.7), which are defined through equation (2.1.7.9); their form depends on whether the space group is centrosymmetric or noncentrosymmetric, and they involve the kth absolute moments of the trigonometric structure factor, equation (2.1.7.12). In equation (2.1.7.12), g is the number of general equivalent positions listed in IT A (2005) for the space group in question, times the multiplicity of the Bravais lattice; the trigonometric structure factor is built from the g space-group operators, labelled by s, acting on an atomic position vector.
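As an illustration of how absolute moments of the trigonometric structure factor can be estimated, the sketch below assumes the usual definition $T(\mathbf{h}) = \sum_{s=1}^{g} \exp[2\pi i\,\mathbf{h}\cdot(\mathbf{P}_s\mathbf{r} + \mathbf{t}_s)]$, with $(\mathbf{P}_s, \mathbf{t}_s)$ the rotation and translation parts of the $s$th operator, and averages $|T|^k$ over positions $\mathbf{r}$ drawn uniformly in the unit cell. This is a Monte Carlo stand-in, not the analytical method used for the published tables; the function names, the reflection (1, 2, 3) and the sample size are arbitrary choices. Only the primitive groups $P1$ and $P\bar{1}$ are shown, so no lattice-centring operators are needed.

```python
import cmath
import math
import random


def trig_structure_factor(h, r, operators):
    """T(h) = sum over operators (P, t) of exp[2*pi*i * h . (P r + t)]."""
    total = 0j
    for rot, trans in operators:
        rs = [sum(rot[i][j] * r[j] for j in range(3)) + trans[i] for i in range(3)]
        phase = 2.0 * math.pi * sum(h[i] * rs[i] for i in range(3))
        total += cmath.exp(1j * phase)
    return total


def abs_moment(k, h, operators, n_samples=20000, seed=0):
    """Monte Carlo estimate of <|T|^k> over uniformly distributed positions r."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        r = [rng.random() for _ in range(3)]
        acc += abs(trig_structure_factor(h, r, operators)) ** k
    return acc / n_samples


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
INVERSION = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]
NO_SHIFT = [0.0, 0.0, 0.0]

P1 = [(IDENTITY, NO_SHIFT)]
P1BAR = [(IDENTITY, NO_SHIFT), (INVERSION, NO_SHIFT)]

h = (1, 2, 3)  # an arbitrary general reflection
m2_p1 = abs_moment(2, h, P1)
m2_p1bar = abs_moment(2, h, P1BAR)
m4_p1bar = abs_moment(4, h, P1BAR)

print("P1:    <|T|^2> =", m2_p1)
print("P-1:   <|T|^2> =", m2_p1bar, " <|T|^4> =", m4_p1bar)
print("P-1:   <|T|^4>/<|T|^2>^2 =", m4_p1bar / m2_p1bar ** 2)
```

For $P\bar{1}$ the factor reduces to $2\cos(2\pi\mathbf{h}\cdot\mathbf{r})$, whose even moments are the central binomial coefficients (2 and 6 for the second and fourth), so the estimates can be checked against exact values.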
The cumulative distribution functions, obtained by integrating equations (2.1.7.5) and (2.1.7.6), are given by equations (2.1.7.13) and (2.1.7.14) for centrosymmetric and noncentrosymmetric space groups, respectively, where the coefficients are defined in equations (2.1.7.7)–(2.1.7.12). Note that the first term on the right-hand side of equation (2.1.7.13) and the first two terms on the right-hand side of equation (2.1.7.14) are just the cumulative distributions derived from the ideal centric and acentric p.d.f.'s in Section 2.1.5.6.
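The ideal parts of the cumulative distributions can be checked directly. The sketch below assumes the ideal centric and acentric p.d.f.'s $(2/\pi)^{1/2}\exp(-|E|^2/2)$ and $2|E|\exp(-|E|^2)$, whose integrals from 0 to $|E|$ are $\mathrm{erf}(|E|/2^{1/2})$ and $1 - \exp(-|E|^2)$, and compares them with numerical integration. It also checks the standard identity by which a Hermite correction term integrates in closed form, $\int_0^{x} p^{(0)}(t)\,He_n(t)\,dt = -p^{(0)}(x)\,He_{n-1}(x)$ for even $n \geq 2$; this illustrates why term-by-term integration is convenient and is not a reproduction of equations (2.1.7.13) and (2.1.7.14). The function names and the test point are arbitrary.

```python
import math


def ideal_centric_pdf(e):
    return math.sqrt(2.0 / math.pi) * math.exp(-e * e / 2.0)


def ideal_acentric_pdf(e):
    return 2.0 * e * math.exp(-e * e)


def ideal_centric_cdf(e):
    return math.erf(e / math.sqrt(2.0))


def ideal_acentric_cdf(e):
    return 1.0 - math.exp(-e * e)


def hermite_he(n, x):
    """Hermite polynomial He_n(x) via He_{k+1} = x*He_k - k*He_{k-1}."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    s = func(a) + func(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * func(a + i * h)
    return s * h / 3.0


def integrated_hermite_term(n, e):
    """Closed form of the integral over [0, e] of centric pdf * He_n, n even >= 2."""
    return -ideal_centric_pdf(e) * hermite_he(n - 1, e)


E_TEST = 1.3  # arbitrary test point
num_centric = simpson(ideal_centric_pdf, 0.0, E_TEST)
num_acentric = simpson(ideal_acentric_pdf, 0.0, E_TEST)
num_he4 = simpson(lambda e: ideal_centric_pdf(e) * hermite_he(4, e), 0.0, E_TEST)

print("centric :", num_centric, ideal_centric_cdf(E_TEST))
print("acentric:", num_acentric, ideal_acentric_cdf(E_TEST))
print("He_4 term:", num_he4, integrated_hermite_term(4, E_TEST))
```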
The even moments $\langle |T|^{2k} \rangle$ of the trigonometric structure factor were compiled for all the space groups by Wilson (1978b) for k = 1 and 2, and by Shmueli & Kaldor (1981, 1983) for k = 1, 2, 3 and 4. These results are presented in Table 2.1.7.1. Closed expressions for the normalized moments were obtained by Shmueli (1982) for the triclinic, monoclinic and orthorhombic space groups, with two exceptions (see Table 2.1.7.2). The composition-dependent terms are most conveniently computed as weighted averages over the ranges of $(\sin\theta)/\lambda$ which were used in the construction of the Wilson plot for the computation of the $|E|$ values.
Note. hkl subsets [conditions for subsets (1)–(14), (20) and (21) not reproduced]: (15) hkl all even; (16) only one index odd; (17) only one index even; (18) hkl all odd; (19) two indices odd.
^{†}And the enantiomorphous space group.


As noted in Section 2.1.8.7 below, the Fourier representation of the probability distribution of the structure-factor magnitude is usually much better than the particular orthogonal-function representation discussed in Section 2.1.7.3. Many, perhaps most, non-ideal centric distributions look like slight distortions of the ideal (Gaussian) distribution and have no resemblance to a cosine function. The empirical observation thus seems paradoxical. The probable explanation has been pointed out by Wilson (1986b). A truncated Fourier series is a best approximation, in the least-squares sense, to the function represented. The particular orthogonal-function approach used in equation (2.1.7.5), on the other hand, is not a least-squares approximation to $p(|E|)$, but is a least-squares approximation to $p(|E|)\exp(|E|^2/4)$. The usual expansions (often known as Gram–Charlier or Edgeworth) thus give great weight to fitting the distribution of the (comparatively few) strong reflections, at the expense of a poor fit for the (much more numerous) weak-to-medium ones. Presumably, a similar situation exists for the representation of acentric distributions, but this has not been investigated in detail. Since the centric distributions often look nearly Gaussian, one is led to ask if there is an expansion in orthogonal functions that (i) has the ideal centric p.d.f. as its leading term and (ii) is a least-squares (as well as an orthogonal-function) fit to $p(|E|)$. One does exist, based on the orthogonal functions $f_k = n(x)\,He_k(2^{1/2}x)$, where $n(x)$ is the Gaussian distribution (Myller-Lebedeff, 1907). Unfortunately, no reasonably simple relationship between the expansion coefficients and readily evaluated properties of $p(|E|)$ has been found, and the Myller-Lebedeff expansion has not, as yet, been applied in crystallography. Although Stuart & Ord (1994, p. 112) dismiss it in a three-line footnote, it does have important applications in astronomy (van der Marel & Franx, 1993; Gerhard, 1993).
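The weighting argument can be made explicit with a short derivation. It is a sketch in assumed notation: $p(x)$ is the non-ideal p.d.f., $p^{(0)}(x)$ the ideal one, $f_k(x)$ are polynomials orthonormal with respect to the weight $p^{(0)}(x)$, and $K$ is the order at which the series is truncated.

```latex
% Residual weighted by 1/p^{(0)}: minimizing it reproduces the moment-based coefficients.
\begin{align*}
S(d_0,\dots,d_K)
  &= \int \frac{\bigl[\,p(x) - p^{(0)}(x)\sum_{k=0}^{K} d_k f_k(x)\bigr]^2}{p^{(0)}(x)}\,dx \\
  &= \int \frac{p(x)^2}{p^{(0)}(x)}\,dx
     \;-\; 2\sum_{k=0}^{K} d_k \langle f_k \rangle
     \;+\; \sum_{k=0}^{K} d_k^{2}, \\
\frac{\partial S}{\partial d_k} = 0
  &\;\Longrightarrow\; d_k = \langle f_k \rangle = \int f_k(x)\,p(x)\,dx .
\end{align*}
% Equivalently, S is the unweighted squared residual of p / \sqrt{p^{(0)}}.
% For the centric case, p^{(0)} \propto \exp(-|E|^2/2), so
% p / \sqrt{p^{(0)}} \propto p(|E|)\exp(|E|^2/4).
```

The coefficients obtained from the moments are therefore exactly those that minimize the residual weighted by $1/p^{(0)}$, which grows rapidly with $|E|$; this is why the fit is dominated by the strong reflections.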
References
International Tables for Crystallography (2005). Vol. A, Space-Group Symmetry, edited by Th. Hahn. Heidelberg: Springer.
Abramowitz, M. & Stegun, I. A. (1972). Handbook of Mathematical Functions. New York: Dover.
Cramér, H. (1951). Mathematical Methods of Statistics. Princeton University Press.
Faggiani, R., Lippert, B. & Lock, C. J. L. (1980). Heavy transition metal complexes of biologically important molecules. 4. Crystal and molecular structure of pentahydroxonium chloro(uracilato-N(1))(ethylenediamine)platinum(II)chloride (H_{5}O_{2})[PtCl(NH_{2}CH_{2}CH_{2}NH_{2})(C_{4}H_{5}N_{2}O_{2})]Cl, and chloro(thyminato-N(1))(ethylenediamine)platinum(II), PtCl(NH_{2}CH_{2}CH_{2}NH_{2})(C_{5}H_{5}N_{2}O_{2}). Inorg. Chem. 19, 295–300.
Gerhard, O. E. (1993). Line-of-sight velocity profiles in spherical galaxies: breaking the degeneracy between anisotropy and mass. Mon. Not. R. Astron. Soc. 265, 213–230.
Hauptman, H. & Karle, J. (1953). Solution of the Phase Problem. I. The Centrosymmetric Crystal. Am. Crystallogr. Assoc. Monograph No. 3. Dayton, Ohio: Polycrystal Book Service.
Howells, E. R., Phillips, D. C. & Rogers, D. (1950). The probability distribution of X-ray intensities. II. Experimental investigation and the X-ray detection of centers of symmetry. Acta Cryst. 3, 210–214.
Marel, R. P. van der & Franx, M. (1993). A new method for the identification of non-Gaussian line profiles in elliptical galaxies. Astrophys. J. 407, 525–539.
Myller-Lebedeff, W. (1907). Die Theorie der Integralgleichungen in Anwendung auf einige Reihenentwicklungen. Math. Ann. 64, 388–416.
Shmueli, U. (1979). Symmetry- and composition-dependent cumulative distributions of the normalized structure amplitude for use in intensity statistics. Acta Cryst. A35, 282–286.
Shmueli, U. (1982). A study of generalized intensity statistics: extension of the theory and practical examples. Acta Cryst. A38, 362–371.
Shmueli, U. & Kaldor, U. (1981). Calculation of even moments of the trigonometric structure factor. Methods and results. Acta Cryst. A37, 76–80.
Shmueli, U. & Kaldor, U. (1983). Moments of the trigonometric structure factor. Acta Cryst. A39, 615–621.
Shmueli, U. & Wilson, A. J. C. (1981). Effects of space-group symmetry and atomic heterogeneity on intensity statistics. Acta Cryst. A37, 342–353.
Spiegel, M. R. (1974). Theory and Problems of Fourier Analysis. Schaum's Outline Series. New York: McGraw-Hill.
Srinivasan, R. & Parthasarathy, S. (1976). Some Statistical Applications in X-ray Crystallography. Oxford: Pergamon Press.
Stuart, A. & Ord, K. (1994). Kendall's Advanced Theory of Statistics, Vol. 1, Distribution Theory, 6th ed. London: Edward Arnold.
Wilson, A. J. C. (1978b). Variance of X-ray intensities: effect of dispersion and higher symmetries. Acta Cryst. A34, 986–994.
Wilson, A. J. C. (1986b). Fourier versus Hermite representations of probability distributions. Acta Cryst. A42, 81–83.