International Tables for Crystallography
Volume B
Reciprocal space
Edited by U. Shmueli

International Tables for Crystallography (2010). Vol. B, ch. 2.2, pp. 231-232

Section 2.2.7. Scheme of procedure for phase determination: the small-molecule case

C. Giacovazzo

Dipartimento Geomineralogico, Campus Universitario, 70125 Bari, Italy, and Institute of Crystallography, Via G. Amendola, 122/O, 70125 Bari, Italy



A traditional procedure for phase assignment may be schematically presented as follows:

  • Stage 1: Normalization of structure factors (s.f.'s). See Section 2.2.4.

  • Stage 2: (Possible) estimation of one-phase structure seminvariants (s.s.'s). The computing program recognizes the one-phase s.s.'s and applies the proper formulae.

    Each phase is associated with a reliability value, to allow the user to regard as known only those phases with reliability higher than a given threshold.

  • Stage 3: Search for the triplets. The reflections are listed in order of decreasing [|E|] and, for each reflection, all the triplets in which it takes part are reported (this is the so-called [\textstyle\sum_{2}] list). The value [G = 2|E_{\bf h} E_{\bf k} E_{{\bf h}-{\bf k}}|/\sqrt{N}] is associated with every triplet as a measure of its reliability. Usually reflections with [|E|\,\lt\, E_{s}] ([E_{s}] may range from 1.2 to 1.6) are omitted from this stage onward.
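
The triplet search of Stage 3 can be sketched as follows. This is a minimal illustration, not the procedure of any particular program: the function name `find_triplets`, the dictionary layout and the default threshold are assumptions, N is taken as the number of atoms in the unit cell, and the input is assumed to contain symmetry equivalents and Friedel mates already.

```python
from math import sqrt


def find_triplets(e_values, n_atoms, e_min=1.2):
    """List triplets (h, k, h-k) among strong reflections, with their G values.

    e_values: dict mapping Miller-index tuples (h, k, l) to |E| (assumed layout).
    n_atoms:  N, taken here as the number of atoms in the unit cell.
    e_min:    the threshold E_s below which reflections are ignored.
    """
    strong = {hkl: e for hkl, e in e_values.items() if e >= e_min}
    triplets = []
    # Visit reflections in order of decreasing |E|, as in the sigma-2 list.
    for h in sorted(strong, key=strong.get, reverse=True):
        for k in strong:
            h_minus_k = tuple(a - b for a, b in zip(h, k))
            # The test k <= h_minus_k keeps one copy of each (k, h-k) pair.
            if h_minus_k in strong and k <= h_minus_k:
                g = 2.0 * strong[h] * strong[k] * strong[h_minus_k] / sqrt(n_atoms)
                triplets.append((h, k, h_minus_k, g))
    return triplets
```

Each entry pairs a reflection h with one (k, h − k) combination and its G value, so grouping the output by h gives a list of the kind described above.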

  • Stage 4: Definition of the origin and enantiomorph. This stage is carried out according to the theory developed in Section 2.2.3. Phases chosen for defining the origin and enantiomorph, one-phase seminvariants estimated at stage 2, and symbolic phases described at stage 5 are the only phases known at the beginning of the phasing procedure. This set of phases is conventionally referred to as the starting set, from which iterative application of the tangent formula will derive new phase estimates.

  • Stage 5: Assignment of one or more (symbolic or numerical) phases. In complex structures the number of phases assigned for fixing the origin and the enantiomorph may be inadequate as a basis for further phase determination. Furthermore, only a few one-phase s.s.'s can be determined with sufficient reliability to qualify as members of the starting set. Symbolic phases may then be associated with some (generally from 1 to 6) high-modulus reflections (symbolic addition procedures). Iterative application of triplet relations leads to the determination of other phases which, in part, will remain expressed by symbols (Karle & Karle, 1966).

    In other procedures (multisolution procedures) each symbol is assigned four phase values in turn: [\pi / 4, 3\pi / 4, 5\pi / 4, 7\pi / 4]. If p symbols are used, in at least one of the possible [4^{p}] solutions each symbolic phase has unit probability of being within [45^{\circ}] of its true value, with a mean error of [22.5^{\circ}].
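
The quadrant permutation used in multisolution procedures can be illustrated with a short sketch. The names `phase_permutations`, `angular_error` and `best_trial` are hypothetical; the last one needs the true phases and so only serves to check the 45° statement on test data.

```python
from itertools import product
from math import pi

# The four trial values assigned in turn to each symbolic phase.
QUADRANT_PHASES = (pi / 4, 3 * pi / 4, 5 * pi / 4, 7 * pi / 4)


def phase_permutations(p):
    """Return all 4**p combinations of trial values for p symbolic phases."""
    return list(product(QUADRANT_PHASES, repeat=p))


def angular_error(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % (2 * pi)
    return min(d, 2 * pi - d)


def best_trial(true_phases):
    """Pick the trial whose worst phase error is smallest (test data only)."""
    trials = phase_permutations(len(true_phases))
    return min(trials,
               key=lambda t: max(angular_error(x, y) for x, y in zip(t, true_phases)))
```

For any set of true phases the trial returned by `best_trial` has every phase within π/4 of its true value, which is the property stated above.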

    To find a good starting set, a convergence method (Germain et al., 1970) is used, according to which:

    • (a) [\langle \alpha_{\bf h}\rangle = \textstyle\sum\limits_{j} G_{j} I_{1} (G_{j}) / I_{0} (G_{j})] is calculated for all reflections (j runs over the set of triplets containing h);

    • (b) the reflection with the smallest [\langle \alpha \rangle] not already in the starting set is found; it is retained to define the origin if the origin cannot be defined without it;

    • (c) the reflection is eliminated if it is not used for origin definition. Its [\langle \alpha \rangle] is recorded and the [\langle \alpha \rangle] values for the other reflections are updated;

    • (d) the cycle is repeated from (b) until all reflections are eliminated;

    • (e) the reflections with the smallest [\langle \alpha \rangle] at the time of elimination go into the starting set;

    • (f) the cycle from (a) is repeated until all reflections have been chosen.
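
The elimination loop of steps (a) to (d) can be sketched as below. This is a simplified illustration under stated assumptions: the origin-definition check of step (b) is omitted, ⟨α⟩ is recomputed from scratch at every cycle, and the names `bessel_ratio` and `convergence_order` are hypothetical. The Bessel ratio is evaluated from the power series, which is adequate only for moderate arguments.

```python
def bessel_ratio(x, terms=80):
    """I1(x)/I0(x) from the power series of the modified Bessel functions."""
    half_sq = (x / 2.0) ** 2
    term0, term1 = 1.0, x / 2.0
    i0, i1 = term0, term1
    for k in range(1, terms):
        term0 *= half_sq / (k * k)
        term1 *= half_sq / (k * (k + 1))
        i0 += term0
        i1 += term1
    return i1 / i0


def convergence_order(triplets):
    """Eliminate reflections one at a time, weakest-linked first.

    triplets: list of (h, k, l, G) with hashable reflection labels.
    Returns a list of (reflection, <alpha> at the time of elimination).
    """
    remaining = []
    for h, k, l, _ in triplets:
        for r in (h, k, l):
            if r not in remaining:
                remaining.append(r)
    active = list(triplets)
    order = []
    while remaining:
        # Step (a): <alpha> from the triplets that are still active.
        alpha = {r: 0.0 for r in remaining}
        for h, k, l, g in active:
            for r in (h, k, l):
                alpha[r] += g * bessel_ratio(g)
        # Steps (b) and (c): eliminate the reflection with the smallest <alpha>.
        weakest = min(remaining, key=alpha.get)
        order.append((weakest, alpha[weakest]))
        remaining.remove(weakest)
        active = [t for t in active if weakest not in t[:3]]
    return order
```

Reflections whose recorded ⟨α⟩ is zero or very small at elimination are the candidates for the starting set, and reading the returned list backwards gives the order in which phases would be determined.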

  • Stage 6: Application of the tangent formula. Phases are determined in reverse order of elimination in the convergence procedure. To ensure that poorly determined phases [\varphi_{{\bf k}_j}] and [\varphi_{{\bf h}-{\bf k}_j}] have little effect on the determination of other phases, a weighted tangent formula is normally used (Germain et al., 1971):

    [\tan \varphi_{\bf h} = {{\textstyle\sum_{j}} w_{{\bf k}_{j}} w_{{\bf h}-{\bf k}_{j}} |E_{{\bf k}_{j}} E_{{\bf h}-{\bf k}_{j}} |\sin (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h}-{\bf k}_{j}})\over {\textstyle\sum_{j}} w_{{\bf k}_{j}} w_{{\bf h}-{\bf k}_{j}} |E_{{\bf k}_{j}} E_{{\bf h}-{\bf k}_{j}} | \cos (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h}-{\bf k}_{j}})},]

    where [w_{\bf h} = \min\;(0.2 \alpha_{\bf h}, 1)]. Once a large number of contributions to a given [\varphi_{\bf h}] are available in this formula, the value of [\alpha_{\bf h}] quickly becomes greater than 5, so that an unrealistic unit weight is assigned to [\varphi_{\bf h}]. To overcome this, a different weighting scheme was proposed (Hull & Irwin, 1978), according to which

    [w = \psi \exp (-x^{2}) \textstyle\int\limits_{0}^{x} \exp (t^{2}) \;\hbox{d}t,]

    where [x = \alpha / \langle \alpha \rangle] and [\psi = 1.8585] is a constant chosen so that [w = 1] when [x = 1]. Except for ψ, the right-hand side of this expression is the Dawson integral, which assumes its maximum value close to [x = 1] (see the figure below): when α is appreciably larger or smaller than [\langle \alpha \rangle], [w\,\lt \,1], and so agreement between α and [\langle \alpha \rangle] is promoted.
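
One step of the weighted tangent formula with the weight min(0.2α, 1) might look as follows. The function name and the tuple layout are hypothetical. Taking α as 2|E_h|/√N times the modulus of the weighted sum is an assumption of this sketch, and N is again taken as the number of atoms in the unit cell.

```python
from math import atan2, cos, hypot, pi, sin, sqrt


def weighted_tangent(e_h, contributions, n_atoms):
    """Estimate phi_h, alpha_h and the weight w_h for one reflection.

    contributions: list of (|E_k|, |E_{h-k}|, phi_k, phi_{h-k}, w_k, w_{h-k}).
    """
    top = bottom = 0.0
    for e_k, e_hk, phi_k, phi_hk, w_k, w_hk in contributions:
        amplitude = w_k * w_hk * e_k * e_hk
        top += amplitude * sin(phi_k + phi_hk)
        bottom += amplitude * cos(phi_k + phi_hk)
    # atan2 resolves the quadrant that a bare tangent would leave ambiguous.
    phi_h = atan2(top, bottom) % (2.0 * pi)
    # Assumed definition of alpha_h from the weighted sums (see lead-in).
    alpha_h = 2.0 * e_h / sqrt(n_atoms) * hypot(top, bottom)
    w_h = min(0.2 * alpha_h, 1.0)
    return phi_h, alpha_h, w_h
```

With this weight any reflection reaching α = 5 is treated as fully reliable, which is the saturation that the later weighting schemes try to avoid.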


    Figure. The form of w in the weighting scheme of Hull & Irwin (1978).
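
The Hull & Irwin weight can be evaluated numerically with a few lines of standard-library code. This is a sketch only: the Dawson integral is computed by Simpson's rule, the step count is an arbitrary choice, and the function names are hypothetical. The direct use of exp(t²) limits it to moderate values of x.

```python
from math import exp

PSI = 1.8585  # constant quoted in the text, giving w = 1 at x = 1


def dawson(x, steps=2000):
    """exp(-x^2) * integral_0^x exp(t^2) dt by Simpson's rule (steps is even)."""
    if x == 0.0:
        return 0.0
    h = x / steps
    total = 1.0 + exp(x * x)  # end points t = 0 and t = x
    for i in range(1, steps):
        t = i * h
        total += (4.0 if i % 2 else 2.0) * exp(t * t)
    return exp(-x * x) * total * h / 3.0


def hull_irwin_weight(alpha, mean_alpha):
    """w = psi * Dawson(x) with x = alpha / <alpha>."""
    return PSI * dawson(alpha / mean_alpha)
```

Evaluated this way, the weight is 1 at x = 1 and falls off on both sides. The numerical maximum of the Dawson integral lies slightly below x = 1, near x ≈ 0.92, so w marginally exceeds 1 there.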

    Alternative weighting schemes for the tangent formula are frequently used [for example, see Debaerdemaeker et al. (1985)]. In one (Giacovazzo, 1979b), the values [\alpha_{{\bf k}_{j}}] and [\alpha_{{\bf h} - {\bf k}_{j}}] (which are usually available in direct procedures) are considered as additional a priori information, so that the weighted tangent formula above may be replaced by

    [\tan \varphi_{{\bf h}} \simeq {{\textstyle\sum_{j}} \beta_{j} \sin (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h - k}_{j}})\over {\textstyle\sum_{j}} \beta_{j} \cos (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h - k}_{j}})},]

    where [\beta_{j}] is the solution of the equation

    [D_{1} (\beta_{j}) = D_{1} (G_{j}) D_{1} (\alpha_{{\bf k}_{j}}) D_{1} (\alpha_{{\bf h} - {\bf k}_{j}}).]

    In this equation, [G_{j} = 2 | E_{\bf h} E_{{\bf k}_{j}} E_{{\bf h - k}_{j}} | / \sqrt{N}] or the corresponding second representation parameter, and [D_{1} (x) = I_{1} (x) / I_{0} (x)] is the ratio of two modified Bessel functions.
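
Since D₁ increases monotonically from 0 towards 1, the equation for β can be solved by bisection. The sketch below is one possible way of doing this and is not taken from the cited paper; `bessel_ratio` and `solve_beta` are hypothetical names, and the series evaluation of the Bessel ratio is adequate only for moderate arguments.

```python
def bessel_ratio(x, terms=80):
    """D1(x) = I1(x)/I0(x) from the power series of the Bessel functions."""
    half_sq = (x / 2.0) ** 2
    term0, term1 = 1.0, x / 2.0
    i0, i1 = term0, term1
    for k in range(1, terms):
        term0 *= half_sq / (k * k)
        term1 *= half_sq / (k * (k + 1))
        i0 += term0
        i1 += term1
    return i1 / i0


def solve_beta(g, alpha_k, alpha_hk, tol=1e-10):
    """Solve D1(beta) = D1(G) D1(alpha_k) D1(alpha_{h-k}) by bisection."""
    target = bessel_ratio(g) * bessel_ratio(alpha_k) * bessel_ratio(alpha_hk)
    # Each factor is at most 1, so target <= D1(G) and beta lies in [0, G].
    low, high = 0.0, g
    while high - low > tol:
        mid = 0.5 * (low + high)
        if bessel_ratio(mid) < target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)
```

The result behaves as expected at the limits: β approaches G when both contributing phases are very well determined, and it tends to zero when either α is zero.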

    To promote agreement between α and [\langle \alpha \rangle] (in accordance with the aims of Hull and Irwin), the distribution of α may be used (Cascarano, Giacovazzo, Burla et al., 1984; Burla et al., 1987), in particular its first two moments. Accordingly, the weight

    [w = \left\{\exp \left[{- (\alpha - \langle \alpha \rangle)^{2}\over 2 \sigma_{\alpha}^{2}}\right]\right\}^{1/3}]

    may be used, where [\sigma_{\alpha}^{2}] is the estimated variance of α.

  • Stage 7: Figures of merit. The correct solution is identified among several trial solutions by means of figures of merit (FOMs), which are expected to take extreme values for the correct solution. Widely used are the following (Germain et al., 1970):

    • (a) [\hbox{ABSFOM} = \textstyle\sum\limits_{\bf h} \alpha_{\bf h} / \textstyle\sum\limits_{\bf h} \langle \alpha_{\bf h} \rangle], which is expected to be unity for the correct solution.

    • (b) [\hbox{PSI0} = {{\textstyle\sum_{{\bf h}}} \left| {\textstyle\sum_{{\bf k}}} E_{{\bf k}} E_{{\bf h - k}} \right|\over {\textstyle\sum_{{\bf h}}} \left({\textstyle\sum_{{\bf k}}} | E_{{\bf k}} E_{{\bf h - k}} |^{2}\right)^{1/2}}.] The summation over k includes (Cochran & Douglas, 1957) the strong [|E|]'s for which phases have been determined, and indices h correspond to very small [|E_{\bf h}|]. Minimal values of PSI0 (≤ 1.20) are expected to be associated with the correct solution.

    • (c) [R_{\alpha} = {{\textstyle\sum_{{\bf h}}} | \alpha_{\bf h} - \langle \alpha_{\bf h} \rangle|\over {\textstyle\sum_{{\bf h}}} \langle \alpha_{\bf h} \rangle}.] This is the Karle & Karle (1966) residual between the actual and the estimated α's. After scaling of [\alpha_{\bf h}] on [\langle \alpha_{\bf h}\rangle], the correct solution should be characterized by the smallest [R_{\alpha}] value.

    • (d) [\hbox{NQEST} = \textstyle\sum\limits_{j} G_{j} \cos \Phi_{j}], where G is defined for quartets earlier in this chapter and [\Phi = \varphi_{\bf h} - \varphi_{\bf k} - \varphi_{\bf l} - \varphi_{{\bf h}-{\bf k}-{\bf l}}] are quartet invariants characterized by large basis magnitudes and small cross magnitudes (De Titta et al., 1975; Giacovazzo, 1976). Since G is expected to be negative, as is [\cos \Phi], the value of NQEST is expected to be positive and a maximum for the correct solution.
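
The four figures of merit can be written down almost directly from their definitions. In this sketch the function names and input layouts are hypothetical, and the scaling of α on ⟨α⟩ in `r_alpha` is interpreted as a single scale factor that equalizes the two sums, which is an assumption.

```python
from math import cos, sqrt


def absfom(alpha, alpha_expected):
    """Sum of actual alpha values over the sum of expected values."""
    return sum(alpha) / sum(alpha_expected)


def psi0(weak_reflections):
    """weak_reflections: for each weak h, a list of complex (E_k, E_{h-k}) pairs."""
    numerator = denominator = 0.0
    for pairs in weak_reflections:
        products = [e_k * e_hk for e_k, e_hk in pairs]
        numerator += abs(sum(products))
        denominator += sqrt(sum(abs(p) ** 2 for p in products))
    return numerator / denominator


def r_alpha(alpha, alpha_expected):
    """Residual between scaled actual and expected alpha values."""
    scale = sum(alpha_expected) / sum(alpha)  # assumed form of the scaling
    residual = sum(abs(scale * a - e) for a, e in zip(alpha, alpha_expected))
    return residual / sum(alpha_expected)


def nqest(quartets):
    """quartets: list of (G_j, Phi_j) for the selected negative quartets."""
    return sum(g * cos(phi) for g, phi in quartets)
```

The complex E values passed to `psi0` carry the phases of the trial solution, so the numerator is small when the strong reflections correctly predict weak ones.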

    Figures of merit are then combined as

    [\eqalign{\hbox{CFOM} &= w_{1} {\hbox{ABSFOM} - \hbox{ABSFOM}_{\min}\over \hbox{ABSFOM}_{\max} - \hbox{ABSFOM}_{\min}}\cr &\quad + w_{2} {\hbox{PSI0}_{\max} - \hbox{PSI0}\over \hbox{PSI0}_{\max} - \hbox{PSI0}_{\min}}\cr &\quad + w_{3} {R_{{\alpha}_{\max}} - R_{\alpha}\over R_{{\alpha}_{\max}} - R_{{\alpha}_{\min}}}\cr &\quad + w_{4} {\hbox{NQEST} - \hbox{NQEST}_{\min}\over \hbox{NQEST}_{\max} - \hbox{NQEST}_{\min}},}]

    where [w_{i}] are empirical weights proportional to the confidence of the user in the various FOMs.

    Other FOMs are sometimes used in combination with those described above: for example, enantiomorph triplets and quartets are supplementary FOMs (Van der Putten & Schenk, 1977; Cascarano, Giacovazzo & Viterbo, 1987).

    Different schemes of calculating and combining FOMs are also used: one scheme (Cascarano, Giacovazzo & Viterbo, 1987) uses
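
The combination of figures of merit amounts to rescaling each FOM to the range 0 to 1 over the set of trial solutions and forming a weighted sum. A minimal sketch follows, assuming each solution is a dictionary with the hypothetical keys 'ABSFOM', 'PSI0', 'RALPHA' and 'NQEST'. Returning zero when a FOM takes the same value for all solutions is a choice made here, not part of the definition.

```python
def combined_fom(solutions, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return the CFOM of every trial solution, in input order."""

    def scaled(key, value, larger_is_better):
        values = [s[key] for s in solutions]
        low, high = min(values), max(values)
        if high == low:
            return 0.0  # this FOM cannot discriminate between solutions
        if larger_is_better:
            return (value - low) / (high - low)
        return (high - value) / (high - low)

    w1, w2, w3, w4 = weights
    return [
        w1 * scaled("ABSFOM", s["ABSFOM"], True)
        + w2 * scaled("PSI0", s["PSI0"], False)
        + w3 * scaled("RALPHA", s["RALPHA"], False)
        + w4 * scaled("NQEST", s["NQEST"], True)
        for s in solutions
    ]
```

A solution that is best on every criterion reaches the sum of the weights, and one that is worst on every criterion scores zero, so the CFOM ranks solutions only relative to each other.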

    • (a1) [\hbox{CPHASE} = {\sum w_{j} G_{j} \cos (\Phi_{j} - \theta_{j}) + \sum w_{j} G_{j} \cos \Phi_{j}\over {\textstyle\sum_{{\rm s.i.} + {\rm s.s.}}} w_{j} G_{j} D_{1} (G_{j})},] where the first summation in the numerator extends over symmetry-restricted one-phase and two-phase s.s.'s, and the second summation in the numerator extends over negative triplets estimated via the second representation formula and over negative quartets. The value of CPHASE is expected to be close to unity for the correct solution.

    • (a2) [\alpha_{\bf h}] for strong triplets and [E_{\bf k} E_{{\bf h} - {\bf k}}] contributions for PSI0 triplets may be considered random variables: the agreements between their actual and their expected distributions are considered as criteria for identifying the correct solution.

    • (a3) correlation among some FOMs is taken into account.

    According to this scheme, each FOM (as well as the CFOM) is expected to be unity for the correct solution. Thus one or more figures are available which constitute a sort of criterion (on an absolute scale) concerning the correctness of the various solutions: FOMs (and CFOM) [\simeq 1] probably denote correct solutions, CFOMs [\ll 1] should indicate incorrect solutions.

  • Stage 8: Interpretation of E maps. This is carried out in up to four stages (Koch, 1974; Main & Hull, 1978; Declercq et al., 1973):

    • (a) peak search;

    • (b) separation of peaks into potentially bonded clusters;

    • (c) application of stereochemical criteria to identify possible molecular fragments;

    • (d) comparison of the fragments with the expected molecular structure.
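
Step (b) of the map interpretation, the separation of peaks into potentially bonded clusters, can be illustrated by a simple connectivity search. The sketch assumes that peak positions are already in Cartesian coordinates in ångströms. The distance limits `d_min` and `d_max` are hypothetical defaults chosen for illustration, not values taken from the cited programs.

```python
from math import dist


def bonded_clusters(peaks, d_min=1.1, d_max=1.95):
    """Group peak indices into clusters linked by plausible bond distances.

    peaks: list of (x, y, z) Cartesian coordinates in angstroms (assumed).
    """
    unvisited = set(range(len(peaks)))
    clusters = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            # Peaks not yet assigned that lie at a bonding distance from peak i.
            linked = [j for j in unvisited
                      if d_min <= dist(peaks[i], peaks[j]) <= d_max]
            for j in linked:
                unvisited.remove(j)
            cluster.extend(linked)
            frontier.extend(linked)
        clusters.append(sorted(cluster))
    return clusters
```

Each cluster would then be passed to the stereochemical checks of step (c). A real implementation would also have to apply the space-group symmetry when measuring distances, which this sketch ignores.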


Burla, M. C., Cascarano, G., Giacovazzo, C., Nunzi, A. & Polidori, G. (1987). A weighting scheme for tangent formula development. III. The weighting scheme of the SIR program. Acta Cryst. A43, 370–374.
Cascarano, G., Giacovazzo, C., Burla, M. C., Nunzi, A. & Polidori, G. (1984). The distribution of [\alpha_{\bf h}]. Acta Cryst. A40, 389–394.
Cascarano, G., Giacovazzo, C. & Viterbo, D. (1987). Figures of merit in direct methods: a new point of view. Acta Cryst. A43, 22–29.
Cochran, W. & Douglas, A. S. (1957). The use of a high-speed digital computer for the direct determination of crystal structure. II. Proc. R. Soc. London Ser. A, 243, 281–288.
De Titta, G. T., Edmonds, J. W., Langs, D. A. & Hauptman, H. (1975). Use of the negative quartet cosine invariants as a phasing figure of merit: NQEST. Acta Cryst. A31, 472–479.
Debaerdemaeker, T., Tate, C. & Woolfson, M. M. (1985). On the application of phase relationships to complex structures. XXIV. The Sayre tangent formula. Acta Cryst. A41, 286–290.
Declercq, J.-P., Germain, G., Main, P. & Woolfson, M. M. (1973). On the application of phase relationships to complex structures. V. Finding the solution. Acta Cryst. A29, 231–234.
Germain, G., Main, P. & Woolfson, M. M. (1970). On the application of phase relationships to complex structures. II. Getting a good start. Acta Cryst. B26, 274–285.
Germain, G., Main, P. & Woolfson, M. M. (1971). The application of phase relationships to complex structures. III. The optimum use of phase relationships. Acta Cryst. A27, 368–376.
Giacovazzo, C. (1976). A probabilistic theory of the cosine invariant [\cos (\varphi_{\bf h} + \varphi_{\bf k} + \varphi_{\bf l} - \varphi_{{\bf h}+{\bf k}+{\bf l}})]. Acta Cryst. A32, 91–99.
Giacovazzo, C. (1979b). A theoretical weighting scheme for tangent-formula development and refinement and Fourier synthesis. Acta Cryst. A35, 757–764.
Hull, S. E. & Irwin, M. J. (1978). On the application of phase relationships to complex structures. XIV. The additional use of statistical information in tangent-formula refinement. Acta Cryst. A34, 863–870.
Karle, J. & Karle, I. L. (1966). The symbolic addition procedure for phase determination for centrosymmetric and non-centrosymmetric crystals. Acta Cryst. 21, 849–859.
Koch, M. H. J. (1974). On the application of phase relationships to complex structures. IV. Automatic interpretation of electron-density maps for organic structures. Acta Cryst. A30, 67–70.
Main, P. & Hull, S. E. (1978). The recognition of molecular fragments in E maps and electron density maps. Acta Cryst. A34, 353–361.
Van der Putten, N. & Schenk, H. (1977). On the conditional probability of quintets. Acta Cryst. A33, 856–858.
