International Tables for Crystallography
Volume B
Reciprocal space
Edited by U. Shmueli

International Tables for Crystallography (2006). Vol. B, ch. 2.2, pp. 210-234
doi: 10.1107/97809553602060000555

Chapter 2.2. Direct methods

C. Giacovazzo

Dipartimento Geomineralogico, Campus Universitario, I-70125 Bari, Italy

Direct methods are essentially reciprocal-space techniques, developed historically to solve small-molecule crystal structures. Their success in this area (in practice, they solve the phase problem) is based on numerous theoretical achievements which concern origin specification (Section 2.2.3), the concepts of structure invariants and seminvariants, normalization of the structure factors (Section 2.2.4), inequalities among structure factors, and probabilistic phase relationships (Section 2.2.5). Probabilistic phase relationships are at the core of direct methods: triplet (via Cochran and via the P10 formula), quartet (according to Hauptman and to Giacovazzo) and quintet invariant phase estimates are described, along with estimates of one-phase and of two-phase structure seminvariants (via representation theory). Determinantal formulas are also quoted. Techniques for estimating invariants by using some prior information and their real-space counterparts are discussed. The success of direct methods is intimately connected with the phasing procedures. The most important tools of the procedures (e.g. the tangent formula, magic integers, random-start approaches and figures of merit for recognizing the correct solution) are analysed. The integration of direct methods with macromolecular crystallography is discussed in Section 2.2.10: in particular we refer to the ab initio methods for solving protein structures (in combination with direct-space techniques like electron-density modification, envelope determination, histogram matching etc.) as well as to the combination of direct methods with molecular replacement and SAD–MAD techniques (both in 'one-step' procedures and in the two-step method, the latter requiring the prior determination of the heavy-atom or anomalous-scatterer substructure).

2.2.1. List of symbols and abbreviations

[f_{j}] atomic scattering factor of jth atom
[Z_{j}] atomic number of jth atom
N number of atoms in the unit cell
m order of the point group
[[\sigma_{r}]_{p}, [\sigma_{r}]_{q}, [\sigma_{r}]_{N}, \ldots = \textstyle\sum\limits_{j = 1}^{p} Z_{j}^{r}, \textstyle\sum\limits_{j = 1}^{q} Z_{j}^{r}, \textstyle\sum\limits_{j = 1}^{N} Z_{j}^{r}, \ldots]
[[\sigma_{r}]_{N}] is always abbreviated to [\sigma_{r}] when N is the number of atoms in the cell
[\textstyle\sum\nolimits_{p}, \textstyle\sum\nolimits_{q}, \textstyle\sum\nolimits_{N},\ldots = \textstyle\sum\limits_{j = 1}^{p} f_{j}^{2}, \textstyle\sum\limits_{j = 1}^{q} f_{j}^{2}, \textstyle\sum\limits_{j = 1}^{N} f_{j}^{2}, \ldots]
s.f. structure factor
n.s.f. normalized structure factor
cs. centrosymmetric
ncs. noncentrosymmetric
s.i. structure invariant
s.s. structure seminvariant
[{\bf C} = ({\bf R}, {\bf T})] symmetry operator; R is the rotational part, T the translational part
[\varphi_{\bf h}] phase of the structure factor [F_{\bf h} = |F_{\bf h}| \exp (i\varphi_{\bf h})]

2.2.2. Introduction

Direct methods are today the most widely used tool for solving small crystal structures. They work well both for equal-atom molecules and when a few heavy atoms exist in the structure. In recent years the theoretical background of direct methods has been improved to take into account a large variety of prior information (the form of the molecule, its orientation, a partial structure, the presence of pseudosymmetry or of a superstructure, the availability of isomorphous data or of data affected by anomalous-dispersion effects, …). Owing to this progress and to the increasing availability of powerful computers, a number of effective, highly automated packages for the practical solution of the phase problem are today available to the scientific community.

The ab initio crystal structure solution of macromolecules seems not to exceed the potential of direct methods. Many efforts will certainly be devoted to this task in the near future: a report of the first achievements is given in Section 2.2.10.

This chapter describes both the traditional direct methods tools and the most recent and revolutionary techniques suitable for macromolecules.

The theoretical background and tables useful for origin specification are given in Section 2.2.3; in Section 2.2.4 the procedures for normalizing structure factors are summarized. Phase-determining formulae (inequalities, probabilistic formulae for triplet, quartet and quintet invariants, and for one- and two-phase s.s.'s, determinantal formulae) are given in Section 2.2.5. In Section 2.2.6 the connection between direct methods and related techniques in real space is discussed. Practical procedures for solving crystal structures are described in Sections 2.2.7 and 2.2.8, and references to the most extensively used packages are given in Section 2.2.9. The techniques suitable for the ab initio crystal structure solution of macromolecules are described in Section 2.2.10, which also briefly describes the integration of direct methods with isomorphous-replacement and anomalous-dispersion techniques.

The reader will find full coverage of the most important aspects of direct methods in the recent books by Giacovazzo (1998) and Woolfson & Fan (1995).

2.2.3. Origin specification

  • (a) Once the origin has been chosen, the symmetry operators [{\bf C}_{s} \equiv ({\bf R}_{s}, {\bf T}_{s})] and, through them, the algebraic form of the s.f. remain fixed.

    A shift of the origin through a vector with coordinates [{\bf X}_{0}] transforms [\varphi_{\bf h}] into [\varphi'_{\bf h} = \varphi_{\bf h} - 2\pi {\bf h} \cdot {\bf X}_{0}] and the symmetry operators [{\bf C}_{s}] into [{\bf C}'_{s} = ({\bf R}'_{s}, {\bf T}'_{s})], where [{\bf R}'_{s} = {\bf R}_{s}\hbox{;} \;{\bf T}'_{s} = {\bf T}_{s} + ({\bf R}_{s} - {\bf I}) {\bf X}_{0} \quad s = 1, 2, \ldots, m.]

  • (b) Allowed or permissible origins (Hauptman & Karle, 1953, 1959) for a given algebraic form of the s.f. are all those points in direct space which, when taken as origin, maintain the same symmetry operators [{\bf C}_{s}]. The allowed origins will therefore correspond to those points having the same symmetry environment in the sense that they are related to the symmetry elements in the same way. For instance, if [{\bf T}_{s} = 0] for [s = 1, \ldots, 8], then the allowed origins in Pmmm are the eight inversion centres.

    To each functional form of the s.f. a set of permissible origins will correspond.

  • (c) A translation between permissible origins will be called a permissible or allowed translation. Trivial allowed translations correspond to the lattice periods or to their multiples. A change of origin by an allowed translation does not change the algebraic form of the s.f. Thus, according to the transformation of [{\bf T}_{s}] given in (a), all origins allowed by a fixed functional form of the s.f. will be connected by translational vectors [{\bf X}_{p}] such that [({\bf R}_{s} - {\bf I}) {\bf X}_{p} = {\bf V}, \quad s = 1, 2, \ldots, m,] where V is a vector with integer (possibly zero) components.

    In centred space groups, an origin translation corresponding to a centring vector [{\bf B}_{v}] does not change the functional form of the s.f. Therefore all vectors [{\bf B}_{v}] represent permissible translations. [{\bf X}_{p}] will then be an allowed translation (Giacovazzo, 1974) not only when, as imposed by the condition [({\bf R}_{s} - {\bf I}) {\bf X}_{p} = {\bf V}], the difference [{\bf T}'_{s} - {\bf T}_{s}] is equal to one or more lattice units, but also when, for any s, the condition [({\bf R}_{s} - {\bf I}) {\bf X}_{p} = {\bf V} + \alpha {\bf B}_{v}, \quad s = 1, 2, \ldots, m\hbox{;} \quad \alpha = 0, 1] is satisfied.

    We will call any set of cs. or ncs. space groups having the same allowed origin translations a Hauptman–Karle group (H–K group). The 94 ncs. primitive space groups, the 62 primitive cs. groups, the 44 ncs. centred space groups and the 30 cs. centred space groups can be collected into 13, 4, 14 and 5 H–K groups, respectively (Hauptman & Karle, 1953, 1956; Karle & Hauptman, 1961; Lessinger & Wondratschek, 1975). In the four tables that follow, the H–K groups are given together with the allowed origin translations.

    Table: Allowed origin translations, seminvariant moduli and phases for centrosymmetric primitive space groups

    H–K group [(h, k, l){\underline P}(2, 2, 2)]
      Space groups: [P\bar{1}], [P2/m], [P2_{1}/m], [P2/c], [P2_{1}/c], Pmmm, Pnnn, Pccm, Pban, Pmma, Pnna, Pmna, Pcca, Pbam, Pccn, Pbcm, Pnnm, Pmmn, Pbcn, Pbca, Pnma
      Allowed origin translations: (0, 0, 0); [(0, {1\over 2}, {1\over 2})]; [({1\over 2}, 0, 0)]; [({1\over 2}, 0, {1\over 2})]; [(0, {1\over 2}, 0)]; [({1\over 2}, {1\over 2}, 0)]; [(0, 0, {1\over 2})]; [({1\over 2}, {1\over 2}, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h, k, l)]
      Seminvariant modulus [\boldomega _{s}]: (2, 2, 2)
      Seminvariant phases: [\varphi_{eee}]
      Number of semindependent phases to be specified: 3

    H–K group [(h + k, l){\underline P}(2, 2)]
      Space groups: [P4/m], [P4_{2}/m], [P4/n], [P4_{2}/n], [P4/mmm], [P4/mcc], [P4/nbm], [P4/nnc], [P4/mbm], [P4/mnc], [P4/nmm], [P4/ncc], [P4_{2}/mmc], [P4_{2}/mcm], [P4_{2}/nbc], [P4_{2}/nnm], [P4_{2}/mbc], [P4_{2}/mnm], [P4_{2}/nmc], [P4_{2}/ncm]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]; [({1\over 2}, {1\over 2}, 0)]; [({1\over 2}, {1\over 2}, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h + k, l)]
      Seminvariant modulus [\boldomega _{s}]: (2, 2)
      Seminvariant phases: [\varphi_{eee}\hbox{; } \varphi_{ooe}]
      Number of semindependent phases to be specified: 2

    H–K group [(l){\underline P}(2)]
      Space groups: [P\bar{3}], [P\bar{3}1m], [P\bar{3}1c], [P\bar{3}m1], [P\bar{3}c1], [P6/m], [P6_{3}/m], [P6/mmm], [P6/mcc], [P6_{3}/mcm], [P6_{3}/mmc]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (l)
      Seminvariant modulus [\boldomega _{s}]: (2)
      Seminvariant phases: [\varphi_{eee}\hbox{; } \varphi_{eoe}\hbox{; } \varphi_{oee}\hbox{; } \varphi_{ooe}]
      Number of semindependent phases to be specified: 1

    H–K group [(h + k + l){\underline P}(2)]
      Space groups: [R\bar{3}], [R\bar{3}m], [R\bar{3}c], [Pm\bar{3}], [Pn\bar{3}], [Pa\bar{3}], [Pm\bar{3}m], [Pn\bar{3}n], [Pm\bar{3}n], [Pn\bar{3}m]
      Allowed origin translations: (0, 0, 0); [({1\over 2}, {1\over 2}, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h + k + l)]
      Seminvariant modulus [\boldomega _{s}]: (2)
      Seminvariant phases: [\varphi_{eee}\hbox{; } \varphi_{ooe}\hbox{; } \varphi_{oeo}\hbox{; } \varphi_{eoo}]
      Number of semindependent phases to be specified: 1

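    Table: Allowed origin translations, seminvariant moduli and phases for noncentrosymmetric primitive space groups

    H–K group [(h, k, l)P(0, 0, 0)]
      Space groups: P1
      Allowed origin translations: (x, y, z)
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, k, l)
      Seminvariant modulus [\boldomega_{s}]: (0, 0, 0)
      Seminvariant phases: [\varphi_{000}]
      Allowed variations for the semindependent phases: [\|\infty\|]
      Number of semindependent phases to be specified: 3

    H–K group [(h, k, l)P(2, 0, 2)]
      Space groups: P2, [P2_{1}]
      Allowed origin translations: (0, y, 0); [(0, y, {1\over 2})]; [({1\over 2}, y, 0)]; [({1\over 2}, y, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, k, l)
      Seminvariant modulus [\boldomega_{s}]: (2, 0, 2)
      Seminvariant phases: [\varphi_{e0e}]
      Allowed variations for the semindependent phases: [\|\infty\|], [\|2\|] if [k = 0]
      Number of semindependent phases to be specified: 3

    H–K group [(h, k, l)P(0, 2, 0)]
      Space groups: Pm, Pc
      Allowed origin translations: (x, 0, z); [(x, {1\over 2}, z)]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, k, l)
      Seminvariant modulus [\boldomega_{s}]: (0, 2, 0)
      Seminvariant phases: [\varphi_{0e0}]
      Allowed variations for the semindependent phases: [\|\infty\|], [\|2\|] if [h = l = 0]
      Number of semindependent phases to be specified: 3

    H–K group [(h, k, l)P(2, 2, 2)]
      Space groups: P222, [P222_{1}], [P2_{1}2_{1}2], [P2_{1}2_{1}2_{1}]
      Allowed origin translations: (0, 0, 0); [({1\over 2}, 0, 0)]; [(0, {1\over 2}, 0)]; [(0, 0, {1\over 2})]; [(0, {1\over 2}, {1\over 2})]; [({1\over 2}, 0, {1\over 2})]; [({1\over 2}, {1\over 2}, 0)]; [({1\over 2}, {1\over 2}, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, k, l)
      Seminvariant modulus [\boldomega_{s}]: (2, 2, 2)
      Seminvariant phases: [\varphi_{eee}]
      Allowed variations for the semindependent phases: [\|2\|]
      Number of semindependent phases to be specified: 3

    H–K group [(h, k, l)P(2, 2, 0)]
      Space groups: Pmm2, [Pmc2_{1}], Pcc2, Pma2, [Pca2_{1}], Pnc2, [Pmn2_{1}], Pba2, [Pna2_{1}], Pnn2
      Allowed origin translations: (0, 0, z); [(0, {1\over 2}, z)]; [({1\over 2}, 0, z)]; [({1\over 2}, {1\over 2}, z)]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, k, l)
      Seminvariant modulus [\boldomega_{s}]: (2, 2, 0)
      Seminvariant phases: [\varphi_{ee0}]
      Allowed variations for the semindependent phases: [\|\infty\|], [\|2\|] if [l = 0]
      Number of semindependent phases to be specified: 3

    H–K group [(h + k, l)P(2, 0)]
      Space groups: P4, [P4_{1}], [P4_{2}], [P4_{3}], P4mm, P4bm, [P4_{2}cm], [P4_{2}nm], P4cc, P4nc, [P4_{2}mc], [P4_{2}bc]
      Allowed origin translations: (0, 0, z); [({1\over 2}, {1\over 2}, z)]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h + k, l)]
      Seminvariant modulus [\boldomega_{s}]: (2, 0)
      Seminvariant phases: [\varphi_{ee0}]; [\varphi_{oo0}]
      Allowed variations for the semindependent phases: [\|\infty\|], [\|2\|] if [l = 0]
      Number of semindependent phases to be specified: 2

    H–K group [(h + k, l)P(2, 2)]
      Space groups: [P\bar{4}], P422, [P42_{1}2], [P4_{1}22], [P4_{1}2_{1}2], [P4_{2}22], [P4_{2}2_{1}2], [P4_{3}22], [P4_{3}2_{1}2], [P\bar{4}2m], [P\bar{4}2c], [P\bar{4}2_{1}m], [P\bar{4}2_{1}c], [P\bar{4}m2], [P\bar{4}c2], [P\bar{4}b2], [P\bar{4}n2]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]; [({1\over 2}, {1\over 2}, 0)]; [({1\over 2}, {1\over 2}, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h + k, l)]
      Seminvariant modulus [\boldomega_{s}]: (2, 2)
      Seminvariant phases: [\varphi_{eee}]; [\varphi_{ooe}]
      Allowed variations for the semindependent phases: [\|2\|]
      Number of semindependent phases to be specified: 2

    H–K group [(h - k, l)P(3, 0)]
      Space groups: P3, [P3_{1}], [P3_{2}], P3m1, P3c1
      Allowed origin translations: (0, 0, z); [({1\over 3}, {2\over 3}, z)]; [({2\over 3}, {1\over 3}, z)]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h - k, l)]
      Seminvariant modulus [\boldomega_{s}]: (3, 0)
      Seminvariant phases: [\varphi_{hk0}] if [h - k \equiv 0] (mod 3)
      Allowed variations for the semindependent phases: [\|\infty\|], [\|3\|] if [l = 0]
      Number of semindependent phases to be specified: 2

    H–K group [(2h + 4k + 3l)P(6)]
      Space groups: P312, [P3_{1}12], [P3_{2}12], [P\bar{6}], [P\bar{6}m2], [P\bar{6}c2]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]; [({1\over 3}, {2\over 3}, 0)]; [({1\over 3}, {2\over 3}, {1\over 2})]; [({2\over 3}, {1\over 3}, 0)]; [({2\over 3}, {1\over 3}, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(2h + 4k + 3l)]
      Seminvariant modulus [\boldomega_{s}]: (6)
      Seminvariant phases: [\varphi_{hkl}] if [2h + 4k + 3l \equiv 0] (mod 6)
      Allowed variations for the semindependent phases: [\|2\|] if [h \equiv k] (mod 3); [\|3\|] if [l \equiv 0] (mod 2)
      Number of semindependent phases to be specified: 1

    H–K group [(l)P(0)]
      Space groups: P31m, P31c, P6, [P6_{1}], [P6_{5}], [P6_{4}], [P6_{3}], [P6_{2}], P6mm, P6cc, [P6_{3}cm], [P6_{3}mc]
      Allowed origin translations: (0, 0, z)
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (l)
      Seminvariant modulus [\boldomega_{s}]: (0)
      Seminvariant phases: [\varphi_{hk0}]
      Allowed variations for the semindependent phases: [\|\infty\|]
      Number of semindependent phases to be specified: 1

    H–K group [(l)P(2)]
      Space groups: P321, [P3_{1}21], [P3_{2}21], P622, [P6_{1}22], [P6_{5}22], [P6_{2}22], [P6_{4}22], [P6_{3}22], [P\bar{6}2m], [P\bar{6}2c]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (l)
      Seminvariant modulus [\boldomega_{s}]: (2)
      Seminvariant phases: [\varphi_{hke}]
      Allowed variations for the semindependent phases: [\|2\|]
      Number of semindependent phases to be specified: 1

    H–K group [(h + k + l)P(0)]
      Space groups: R3, R3m, R3c
      Allowed origin translations: (x, x, x)
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h + k + l)]
      Seminvariant modulus [\boldomega_{s}]: (0)
      Seminvariant phases: [\varphi_{h, \, k, \, \bar{h} + \bar{k}}]
      Allowed variations for the semindependent phases: [\|\infty\|]
      Number of semindependent phases to be specified: 1

    H–K group [(h + k + l)P(2)]
      Space groups: R32, P23, [P2_{1}3], P432, [P4_{2}32], [P4_{3}32], [P4_{1}32], [P\bar{4}3m], [P\bar{4}3n]
      Allowed origin translations: (0, 0, 0); [({1\over 2}, {1\over 2}, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h + k + l)]
      Seminvariant modulus [\boldomega_{s}]: (2)
      Seminvariant phases: [\varphi_{eee}]; [\varphi_{ooe}]; [\varphi_{oeo}]; [\varphi_{eoo}]
      Allowed variations for the semindependent phases: [\|2\|]
      Number of semindependent phases to be specified: 1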

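    Table: Allowed origin translations, seminvariant moduli and phases for centrosymmetric non-primitive space groups

    H–K group [(h, l) {\underline C} (2, 2)]
      Space groups: [C2/m], [C2/c], Cmcm, Cmca, Cmmm, Cccm, Cmma, Ccca
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]; [({1\over 2}, 0, 0)]; [({1\over 2}, 0, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h, l)]
      Seminvariant modulus [\boldomega _{s}]: (2, 2)
      Seminvariant phases: [\varphi_{eee}]
      Number of semindependent phases to be specified: 2

    H–K group [(k, l) {\underline I}(2, 2)]
      Space groups: Immm, Ibam, Ibca, Imma
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]; [(0, {1\over 2}, 0)]; [({1\over 2}, 0, 0)]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(k, l)]
      Seminvariant modulus [\boldomega _{s}]: (2, 2)
      Seminvariant phases: [\varphi_{eee}]
      Number of semindependent phases to be specified: 2

    H–K group [(h + k + l) {\underline F}(2)]
      Space groups: Fmmm, Fddd, [Fm\bar{3}], [Fd\bar{3}], [Fm\bar{3}m], [Fm\bar{3}c], [Fd\bar{3}m], [Fd\bar{3}c]
      Allowed origin translations: (0, 0, 0); [({1\over 2}, {1\over 2}, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h + k + l)]
      Seminvariant modulus [\boldomega _{s}]: (2)
      Seminvariant phases: [\varphi_{eee}]
      Number of semindependent phases to be specified: 1

    H–K group [(l) {\underline I} (2)]
      Space groups: [I4/m], [I4_{1}/a], [I4/mmm], [I4/mcm], [I4_{1}/amd], [I4_{1}/acd]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (l)
      Seminvariant modulus [\boldomega _{s}]: (2)
      Seminvariant phases: [\varphi_{eoe}]; [\varphi_{eee}]; [\varphi_{ooe}]; [\varphi_{oee}]
      Number of semindependent phases to be specified: 1

    H–K group [{\underline I}]
      Space groups: [Im\bar{3}], [Ia\bar{3}], [Im\bar{3}m], [Ia\bar{3}d]
      Allowed origin translations: (0, 0, 0)
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h, k, l)]
      Seminvariant modulus [\boldomega _{s}]: (1, 1, 1)
      Seminvariant phases: All
      Number of semindependent phases to be specified: 0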

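    Table: Allowed origin translations, seminvariant moduli and phases for noncentrosymmetric non-primitive space groups

    H–K group [(k, l)C(0, 2)]
      Space groups: C2
      Allowed origin translations: (0, y, 0); [(0, y, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (k, l)
      Seminvariant modulus [\boldomega _{s}]: (0, 2)
      Seminvariant phases: [\varphi_{e0e}]
      Allowed variations for the semindependent phases: [\|\infty\|], [\|2\|] if [k = 0]
      Number of semindependent phases to be specified: 2

    H–K group [(h, l)C(0, 0)]
      Space groups: Cm, Cc
      Allowed origin translations: (x, 0, z)
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, l)
      Seminvariant modulus [\boldomega _{s}]: (0, 0)
      Seminvariant phases: [\varphi_{0e0}]
      Allowed variations for the semindependent phases: [\|\infty\|]
      Number of semindependent phases to be specified: 2

    H–K group [(h, l)C(2, 0)]
      Space groups: Cmm2, [Cmc2_{1}], Ccc2
      Allowed origin translations: (0, 0, z); [({1\over 2}, 0, z)]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, l)
      Seminvariant modulus [\boldomega _{s}]: (2, 0)
      Seminvariant phases: [\varphi_{ee0}]
      Allowed variations for the semindependent phases: [\|\infty\|], [\|2\|] if [l = 0]
      Number of semindependent phases to be specified: 2

    H–K group [(h, l)C(2, 2)]
      Space groups: C222, [C222_{1}]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]; [({1\over 2}, 0, 0)]; [({1\over 2}, 0, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, l)
      Seminvariant modulus [\boldomega _{s}]: (2, 2)
      Seminvariant phases: [\varphi_{eee}]
      Allowed variations for the semindependent phases: [\|2\|]
      Number of semindependent phases to be specified: 2

    H–K group [(h, l)A(2, 0)]
      Space groups: Amm2, Abm2, Ama2, Aba2
      Allowed origin translations: (0, 0, z); [({1\over 2}, 0, z)]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, l)
      Seminvariant modulus [\boldomega _{s}]: (2, 0)
      Seminvariant phases: [\varphi_{ee0}]
      Allowed variations for the semindependent phases: [\|\infty\|], [\|2\|] if [l = 0]
      Number of semindependent phases to be specified: 2

    H–K group [(h, l)I(2, 0)]
      Space groups: Imm2, Iba2, Ima2
      Allowed origin translations: (0, 0, z); [({1\over 2}, 0, z)]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, l)
      Seminvariant modulus [\boldomega _{s}]: (2, 0)
      Seminvariant phases: [\varphi_{ee0}]
      Allowed variations for the semindependent phases: [\|\infty\|], [\|2\|] if [l = 0]
      Number of semindependent phases to be specified: 2

    H–K group [(h, l)I(2, 2)]
      Space groups: I222, [I2_{1}2_{1}2_{1}]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]; [(0, {1\over 2}, 0)]; [({1\over 2}, 0, 0)]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (h, l)
      Seminvariant modulus [\boldomega _{s}]: (2, 2)
      Seminvariant phases: [\varphi_{eee}]
      Allowed variations for the semindependent phases: [\|2\|]
      Number of semindependent phases to be specified: 2

    H–K group [(h + k + l)F(2)]
      Space groups: F432, [F4_{1}32]
      Allowed origin translations: (0, 0, 0); [({1\over 2}, {1\over 2}, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h + k + l)]
      Seminvariant modulus [\boldomega _{s}]: (2)
      Seminvariant phases: [\varphi_{eee}]
      Allowed variations for the semindependent phases: [\|2\|]
      Number of semindependent phases to be specified: 1

    H–K group [(h + k + l)F(4)]
      Space groups: F222, F23, [F\bar{4}3m], [F\bar{4}3c]
      Allowed origin translations: (0, 0, 0); [({1\over 4}, {1\over 4}, {1\over 4})]; [({1\over 2}, {1\over 2}, {1\over 2})]; [({3\over 4}, {3\over 4}, {3\over 4})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h + k + l)]
      Seminvariant modulus [\boldomega _{s}]: (4)
      Seminvariant phases: [\varphi_{hkl}] with [h + k + l \equiv 0] (mod 4)
      Allowed variations for the semindependent phases: [\|2\|] if [h + k + l \equiv 0] (mod 2); [\|4\|] if [h + k + l \equiv 1] (mod 2)
      Number of semindependent phases to be specified: 1

    H–K group [(l)I(0)]
      Space groups: I4, [I4_{1}], I4mm, I4cm, [I4_{1}md], [I4_{1}cd]
      Allowed origin translations: (0, 0, z)
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (l)
      Seminvariant modulus [\boldomega _{s}]: (0)
      Seminvariant phases: [\varphi_{hk0}]
      Allowed variations for the semindependent phases: [\|\infty\|]
      Number of semindependent phases to be specified: 1

    H–K group [(l)I(2)]
      Space groups: I422, [I4_{1}22], [I\bar{4}2m], [I\bar{4}2d]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (l)
      Seminvariant modulus [\boldomega _{s}]: (2)
      Seminvariant phases: [\varphi_{hke}]
      Allowed variations for the semindependent phases: [\|2\|]
      Number of semindependent phases to be specified: 1

    H–K group [(2k - l)I(4)]
      Space groups: [I\bar{4}], [I\bar{4}m2], [I\bar{4}c2]
      Allowed origin translations: (0, 0, 0); [(0, 0, {1\over 2})]; [({1\over 2}, 0, {3\over 4})]; [({1\over 2}, 0, {1\over 4})]
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(2k - l)]
      Seminvariant modulus [\boldomega _{s}]: (4)
      Seminvariant phases: [\varphi_{hkl}] with [(2k - l) \equiv 0] (mod 4)
      Allowed variations for the semindependent phases: [\|2\|] if [h + k + l \equiv 0] (mod 2); [\|4\|] if [2k - l \equiv 1] (mod 2)
      Number of semindependent phases to be specified: 1

    H–K group [(l)F(0)]
      Space groups: Fmm2, Fdd2
      Allowed origin translations: (0, 0, z)
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: (l)
      Seminvariant modulus [\boldomega _{s}]: (0)
      Seminvariant phases: [\varphi_{hk0}]
      Allowed variations for the semindependent phases: [\|\infty\|]
      Number of semindependent phases to be specified: 1

    H–K group [I]
      Space groups: I23, [I2_{1}3], I432, [I4_{1}32], [I\bar{4}3m], [I\bar{4}3d]
      Allowed origin translations: (0, 0, 0)
      Vector [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)]: [(h, k, l)]
      Seminvariant modulus [\boldomega _{s}]: (1, 1, 1)
      Seminvariant phases: All
      Allowed variations for the semindependent phases: All
      Number of semindependent phases to be specified: 0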
  • (d) Let us consider a product of structure factors [\eqalign{F_{{\bf h}_{1}}^{A_{1}} \times F_{{\bf h}_{2}}^{A_{2}} \times \ldots \times F_{{\bf h}_{n}}^{A_{n}} &= \textstyle\prod\limits_{j = 1}^{n} F_{{\bf h}_{j}}^{A_{j}}\cr &= \exp \left(i \textstyle\sum\limits_{j = 1}^{n} A_{j}\varphi_{{\bf h}_{j}}\right) \textstyle\prod\limits_{j = 1}^{n} |F_{{\bf h}_{j}}|^{A_{j}},}] [A_{j}] being integer numbers.

    The quantity [\textstyle\sum_{j = 1}^{n} A_{j}\varphi_{{\bf h}_{j}}] is the phase of this product. A structure invariant (s.i.) is such a product for which [\textstyle\sum\limits_{j = 1}^{n} A_{j} {\bf h}_{j} = 0.] Since the [|F_{{\bf h}_{j}}|] are usually known from experiment, it is often said that s.i.'s are combinations of phases [\textstyle\sum\limits_{j = 1}^{n} A_{j}\varphi_{{\bf h}_{j}}] for which this zero-sum condition on the indices holds.

    [F_{0}], [F_{\bf h} F_{-{\bf h}}], [F_{\bf h} F_{\bf k} F_{\overline{{\bf h} + {\bf k}}}], [F_{\bf h} F_{\bf k} F_{\bf l} F_{\overline{{\bf h} + {\bf k} + {\bf l}}}], [F_{\bf h} F_{\bf k} F_{\bf l} F_{\bf p} F_{\overline{{\bf h} + {\bf k} + {\bf l} + {\bf p}}}] are examples of s.i.'s for [n = 1, 2, 3, 4, 5].

    The value of any s.i. does not change with an arbitrary shift of the space-group origin and thus it will depend on the crystal structure only.

  • (e) A structure seminvariant (s.s.) is a product of structure factors (or a combination of phases [\textstyle\sum_{j = 1}^{n} A_{j}\varphi_{{\bf h}_{j}}]) whose value is unchanged when the origin is moved by an allowed translation.

    Let the [{\bf X}_{p}] be the permissible origin translations of the space group. Then the product of structure factors (or the corresponding sum of phases) is an s.s. if, in accordance with the origin-shift relation [\varphi'_{\bf h} = \varphi_{\bf h} - 2\pi {\bf h} \cdot {\bf X}_{0}], [\textstyle\sum\limits_{j = 1}^{n} A_{j} ({\bf h}_{j} \cdot {\bf X}_{p}) = r, \quad p = 1, 2, \ldots,] where r is any integer (positive, zero or negative).

    These conditions can be written in the following more useful form (Hauptman & Karle, 1953): [\textstyle\sum\limits_{j = 1}^{n} A_{j} {\bf h}_{s_{j}} \equiv 0 \quad (\hbox{mod}\ \boldomega _{s}),] where [{\bf h}_{s_{j}}] is the vector seminvariantly associated with the vector [{\bf h}_{j}] and [\boldomega _{s}] is the seminvariant modulus. In the tables given under (c), the reflection [{\bf h}_{s}] seminvariantly associated with [{\bf h} = (h, k, l)], the seminvariant modulus [\boldomega _{s}] and seminvariant phases are given for every H–K group.

    The symbol of any group (cf. Giacovazzo, 1974) has the structure [{\bf h}_{s} L \boldomega _{s}], where L stands for the lattice symbol. This symbol is underlined if the space group is cs.

    By definition, if the class of permissible origin has been chosen, that is to say, if the algebraic form of the symmetry operators has been fixed, then the value of an s.s. does not depend on the origin but on the crystal structure only.

  • (f) Suppose that we have chosen the symmetry operators [{\bf C}_{s}] and thus fixed the functional form of the s.f.'s and the set of allowed origins. In order to describe the structure in direct space a unique reference origin must be fixed. Thus the phase-determining process must also require a unique permissible origin congruent to the values assigned to the phases. More specifically, at the beginning of the structure-determining process by direct methods we shall assign as many phases as necessary to define a unique origin among those allowed (and, as we shall see, possibly to fix the enantiomorph). From the theory developed so far it is obvious that arbitrary phases can be assigned to one or more s.f.'s if there is at least one allowed origin which, fixed as the origin of the unit cell, will give those phase values to the chosen reflections. The concept of linear dependence will help us to fix the origin.

  • (g) n phases [\varphi_{{\bf h}_{j}}] are linearly semidependent (Hauptman & Karle, 1956) when the n vectors [{\bf h}_{s_{j}}] seminvariantly associated with the [{\bf h}_{j}] are linearly dependent modulo [\boldomega _{s}], [\boldomega _{s}] being the seminvariant modulus of the space group. In other words, when [\textstyle\sum\limits_{j = 1}^{n} A_{j} {\bf h}_{s_{j}} \equiv 0 \quad (\hbox{mod}\ \boldomega _{s}), \qquad A_{q} \not\equiv 0 \quad (\hbox{mod}\ \boldomega _{s})] is satisfied. The second condition means that at least one [A_q] exists that is not congruent to zero modulo each of the components of [\boldomega _{s}]. If this pair of conditions is not satisfied for any n-set of integers [A_{j}], the phases [\varphi_{{\bf h}_{j}}] are linearly semindependent. If it is valid for [n = 1] and [A = 1], then [{\bf h}_{1}] is said to be linearly semidependent and [\varphi_{{\bf h}_{1}}] is an s.s. It may be concluded that a seminvariant phase is linearly semidependent and, vice versa, that a linearly semidependent phase is an s.s. In the tables given under (c) the allowed variations (which are those due to the allowed origin translations) for the semindependent phases are given for every H–K group. If [\varphi_{{\bf h}_{1}}] is linearly semindependent its value can be fixed arbitrarily because at least one origin compatible with the given value exists. Once [\varphi_{{\bf h}_{1}}] is assigned, the necessary condition to be able to fix a second phase [\varphi_{{\bf h}_{2}}] is that it should be linearly semindependent of [\varphi_{{\bf h}_{1}}].

    Similarly, the necessary condition to be able arbitrarily to assign a third phase [\varphi_{{\bf h}_{3}}] is that it should be linearly semindependent from [\varphi_{{\bf h}_{1}}] and [\varphi_{{\bf h}_{2}}].

    In general, the number of linearly semindependent phases is equal to the dimension of the seminvariant vector [\boldomega _{s}] (see the tables given under (c)). The reader will easily verify in (h, k, l) P (2, 2, 2) that the three phases [\varphi_{oee}], [\varphi_{eoe}], [\varphi_{eoo}] define the origin (o indicates odd, e even).

  • (h) From the theory summarized so far it is clear that a number of semindependent phases [\varphi_{{\bf h}_{j}}], equal to the dimension of the seminvariant vector [\boldomega _{s}], may be arbitrarily assigned in order to fix the origin. However, it is not always true that only one allowed origin compatible with the given phases exists. An additional condition is required such that only one permissible origin should lie at the intersection of the lattice planes corresponding to the origin-fixing reflections (or on the lattice plane h if one reflection is sufficient to define the origin). It may be shown that this condition is satisfied if the determinant formed with the vectors seminvariantly associated with the origin reflections, reduced modulo [\boldomega _{s}], has the value ±1. In other words, such a determinant should be primitive modulo [\boldomega _{s}].

    For example, in [P\bar{1}] the three reflections [{\bf h}_{1} = (345), {\bf h}_{2} = (139), {\bf h}_{3} = (784)] define the origin uniquely because [\left|\matrix{3 &4 &5\cr 1 &3 &9\cr 7 &8 &4\cr}\right| {_{\rm reduced\; mod\; (2, 2, 2)} \atop \hbox{\rightarrowfill}} \left|\matrix{1 &0 &1\cr 1 &1 &1\cr 1 &0 &0\cr}\right| = -1.] Furthermore, in [P4mm] [[{\bf h}_{s} = (h + k, l), \boldomega _{s} = (2, 0)]] [{\bf h}_{1} = (5, 2, 0), \quad{\bf h}_{2} = (6, 2, 1)] define the origin uniquely since [\left|\matrix{7 &0\cr 8 &1\cr}\right| {_{\rm reduced\; mod\; (2, 0)} \atop \hbox{\rightarrowfill}} \left|\matrix{1 &0\cr 0 &1\cr}\right| = 1.]

  • (i) If an s.s. or an s.i. has a general value [\varphi] for a given structure, it will have the value [-\varphi] for the enantiomorph structure. If [\varphi = 0] or [\pi], the s.s. has the same value for both enantiomorphs. Once the origin has been assigned, in ncs. space groups the sign of a given s.s. with [\varphi \neq 0, \pi] can be assigned to fix the enantiomorph. In practice it is often advisable to use an s.s. or an s.i. whose value is as near as possible to [\pm \pi/2].
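
The seminvariance condition in (e) and the origin-fixing determinant test in (h) can be checked numerically. The following is a minimal Python sketch and not part of the original chapter: the function names (`reduce_mod`, `is_seminvariant`, `determinant`, `fixes_origin_uniquely`, `hs_p4mm`) are hypothetical, the vectors [{\bf h}_{s}] and the modulus [\boldomega _{s}] have to be supplied by hand from the H–K tables, and a zero component of the modulus is assumed to mean that no reduction is applied (so the corresponding sum must vanish exactly). The determinant test simply mirrors the two worked examples in (h).

```python
def reduce_mod(vec, modulus):
    """Reduce each component modulo the matching modulus component.
    Assumption: a modulus component of 0 means 'do not reduce'."""
    return tuple(v % m if m else v for v, m in zip(vec, modulus))


def is_seminvariant(coeffs, hs_vectors, modulus):
    """True if sum_j A_j * h_s_j is congruent to 0 (mod omega_s)."""
    totals = [sum(a * h[i] for a, h in zip(coeffs, hs_vectors))
              for i in range(len(modulus))]
    return all((t % m == 0) if m else (t == 0)
               for t, m in zip(totals, modulus))


def determinant(rows):
    """Integer determinant by Laplace expansion (fine for 1x1 to 3x3)."""
    if len(rows) == 1:
        return rows[0][0]
    return sum((-1) ** j * rows[0][j]
               * determinant([r[:j] + r[j + 1:] for r in rows[1:]])
               for j in range(len(rows)))


def fixes_origin_uniquely(hs_vectors, modulus):
    """True if the determinant of the reduced h_s vectors equals +1 or -1."""
    reduced = [reduce_mod(h, modulus) for h in hs_vectors]
    return abs(determinant(reduced)) == 1


def hs_p4mm(h, k, l):
    """h_s = (h + k, l), as listed for the H-K group (h + k, l)P(2, 0)."""
    return (h + k, l)


# Worked example from (h): P-1, h_s = (h, k, l), modulus (2, 2, 2).
p1bar_reflections = [(3, 4, 5), (1, 3, 9), (7, 8, 4)]
print(fixes_origin_uniquely(p1bar_reflections, (2, 2, 2)))

# Worked example from (h): P4mm, modulus (2, 0).
p4mm_reflections = [hs_p4mm(5, 2, 0), hs_p4mm(6, 2, 1)]
print(fixes_origin_uniquely(p4mm_reflections, (2, 0)))

# A single phase with all-even indices is an s.s. in P-1.
print(is_seminvariant([1], [(2, 4, 6)], (2, 2, 2)))
```

Any triplet [{\bf h}, {\bf k}, \overline{{\bf h} + {\bf k}}] passes `is_seminvariant` for every modulus, since the index sum is exactly zero; this reflects the fact that every s.i. is also an s.s.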

2.2.4. Normalized structure factors

Definition of normalized structure factor

The normalized structure factors E (see also Chapter 2.1) are calculated according to (Hauptman & Karle, 1953) [|E_{\bf h}|^{2} = |F_{\bf h}|^{2}/\langle |F_{\bf h}|^{2}\rangle,] where [|F_{\bf h}|^{2}] is the squared observed structure-factor magnitude on the absolute scale and [\langle |F_{\bf h}|^{2}\rangle] is the expected value of [|F_{\bf h}|^{2}].

[\langle |F_{\bf h}|^{2}\rangle] depends on the available a priori information. Often, but not always, this may be considered as a combination of several typical situations. We mention:

  • (a) No structural information. The atomic positions are considered random variables. Then [\langle |F_{\bf h}|^{2}\rangle = \varepsilon_{\bf h} \textstyle\sum\limits_{j = 1}^{N} f_{j}^{2} = \varepsilon_{\bf h} \textstyle\sum\nolimits_{N}] so that [E_{\bf h} = {F_{\bf h} \over (\varepsilon_{\bf h} \sum\nolimits_{N})^{1/2}}.] [\varepsilon_{\bf h}] takes account of the effect of space-group symmetry (see Chapter 2.1).

  • (b) P atomic groups having a known configuration but with unknown orientation and position (Main, 1976). Then a certain number of interatomic distances [r_{j_{1}j_{2}}] are known and [\langle |F_{\bf h}|^{2}\rangle = \varepsilon_{\bf h} \left(\sum\nolimits_{N} + \sum\limits_{i = 1}^{P} \sum\limits_{j_{1} \neq j_{2} = 1}^{M_{i}} f_{j_{1}} f_{j_{2}} {\sin 2 \pi qr_{j_{1}j_{2}} \over 2 \pi qr_{j_{1}j_{2}}}\right),] where [M_{i}] is the number of atoms in the ith molecular fragment and [q = |{\bf h}|].

  • (c) P atomic groups with a known configuration, correctly oriented, but with unknown position (Main, 1976). Then a certain group of interatomic vectors [{\bf r}_{j_{1} j_{2}}] is fixed and [\langle |F_{{\bf h}}|^{2} \rangle = \varepsilon_{{\bf h}} \left(\textstyle\sum\nolimits_{N} + \textstyle\sum\limits_{i=1}^{P} \textstyle\sum\limits_{j_{1} \neq j_{2} = 1}^{M_{i}} f_{j_{1}} f_{j_{2}} \exp 2\pi i{\bf h} \cdot {\bf r}_{j_{1} j_{2}}\right).] The above formula has been derived on the assumption that primitive positional random variables are uniformly distributed over the unit cell. Such an assumption may be considered unfavourable (Giacovazzo, 1988) in space groups for which the allowed shifts of origin, consistent with the chosen algebraic form for the symmetry operators [{\bf C}_{s}], are arbitrary displacements along any polar axes. Owing to the indeterminacy in the choice of origin, the first of the shifts [\boldtau _{i}] (to be applied to the ith fragment in order to translate atoms to the correct positions) may be restricted to a region which is smaller than the unit cell (e.g. in P2 we are free to specify the origin along the diad axis by restricting [\boldtau _{1}] to the family of vectors [\{\boldtau _{1}\}] of type [[x0z]]). The practical consequence is that [\langle |F_{{\bf h}}|^{2} \rangle] is significantly modified in polar space groups if h satisfies [{\bf h} \cdot \boldtau _{1} = 0,] where [\boldtau _{1}] belongs to the family of restricted vectors [\{\boldtau _{1}\}].

  • (d) Atomic groups correctly positioned. Then (Main, 1976; Giacovazzo, 1983a) [\langle |F_{\bf h}|^{2} \rangle = |F_{p, \,  {\bf h}}|^{2} + \varepsilon_{\bf h} \textstyle\sum\nolimits_{q},] where [F_{p, {\bf h}}] is the structure factor of the partial known structure and q are the atoms with unknown positions.

  • (e) A pseudotranslational symmetry is present. Let [{\bf u}_{1}, {\bf u}_{2}, {\bf u}_{3}, \ldots] be the pseudotranslation vectors of order [n_{1}, n_{2}, n_{3}, \ldots], respectively. Furthermore, let p be the number of atoms (symmetry equivalents included) whose positions are related by pseudotranslational symmetry and q the number of atoms (symmetry equivalents included) whose positions are not related by any pseudotranslation. Then (Cascarano et al., 1985a,b) [\langle |F_{\bf h}|^{2} \rangle = \varepsilon_{\bf h} (\zeta_{\bf h} \textstyle\sum\nolimits_{p} + \textstyle\sum\nolimits_{q}),] where [\zeta_{\bf h} = {(n_{1} n_{2} n_{3} \ldots) \gamma_{\bf h}\over m}] and [\gamma_{\bf h}] is the number of times for which algebraic congruences [{\bf h} \cdot {\bf R}_{s} {\bf u}_{i} \equiv 0\ (\hbox{mod 1})\quad \hbox{for } i = 1, 2, 3, \ldots] are simultaneously satisfied when s varies from 1 to m. If [\gamma_{\bf h} = 0] then [F_{\bf h}] is said to be a superstructure reflection, otherwise it is a substructure reflection.

    Often substructures are not ideal: e.g. atoms related by pseudotranslational symmetry are ideally located but of different type (replacive deviations from ideality); or they are equal but not ideally located (displacive deviations); or a combination of the two situations occurs. In these cases a correlation exists between the substructure and the superstructure. It has been shown (Mackay, 1953[link]; Cascarano et al., 1988[link] a) that the scattering power of the substructural part may be estimated via a statistical analysis of diffraction data for ideal pseudotranslational symmetry or for displacive deviations from it, while it is not estimable in the case of replacive deviations. Definition of quasi-normalized structure factor

When probability theory is not used, the quasi-normalized structure factors [ \hbox{\scr E}_{\bf h}] and the unitary structure factors [U_{\bf h}] are often used. [ \hbox{\scr E}_{\bf h}] and [U_{\bf h}] are defined according to [ \eqalign{|\hbox{\scr E}_{\bf h}|^{2} &= \varepsilon_{\bf h}|E_{\bf h}|^{2}\cr U_{\bf h} &= F_{\bf h} \Big/ \left(\textstyle\sum\limits_{j=1}^{N} f_{j}\right).}] Since [\sum_{j=1}^{N} f_{j}] is the largest possible value for [F_{\bf h}], [U_{\bf h}] represents the fraction of [F_{\bf h}] with respect to its largest possible value. Therefore [0 \leq |U_{\bf h}| \leq 1.] If atoms are equal, then [ U_{\bf h} = \hbox{\scr E}_{\bf h} / N^{1/2}].

The calculation of normalized structure factors

N.s.f.'s cannot be calculated by applying ([link] to observed s.f.'s because: (a) the observed magnitudes [I_{\bf h}] (already corrected for Lp factor, absorption, …) are on a relative scale; (b) [\langle |F_{\bf h}|^{2} \rangle] cannot be calculated without having estimated the vibrational motion of the atoms.

Both the scale factor and the overall vibrational motion are usually estimated by means of the well known Wilson plot (Wilson, 1942[link]), according to which observed data are divided into ranges of [s^{2} = \sin^{2} \theta / \lambda^{2}] and averages of intensity [\langle I_{\bf h} \rangle] are taken in each shell. Reflection multiplicities and other effects of space-group symmetry on intensities must be taken into account when such averages are calculated. The shells are symmetrically overlapped in order to reduce statistical fluctuations and are restricted so that the number of reflections in each shell is reasonably large. For each shell the relation [K \langle I \rangle = \langle |F|^{2} \rangle = \langle |F^{o}|^{2} \rangle \exp (- 2 Bs^{2}) \eqno(] should hold, where K is the scale factor needed to place X-ray intensities on the absolute scale, B is the overall thermal parameter and [\langle |F^{o}|^{2} \rangle] is the expected value of [|F|^{2}] in which it is assumed that all the atoms are at rest. [\langle |F^{o}|^{2} \rangle] depends upon the structural information that is available (see Section[link] for some examples).

Equation ([link] may be rewritten as [\ln \left\{{\langle I \rangle\over \langle |F^{o}|^{2} \rangle}\right\} = - \ln K - 2Bs^{2},] which, plotted against [s^{2}], should be a straight line whose slope (−2B) and intercept (−ln K) on the logarithmic axis can be obtained by applying a linear least-squares procedure.

Very often molecular geometries produce perceptible departures from linearity in the logarithmic Wilson plot. However, the more extensive the available a priori information on the structure is, the closer, on the average, are the Wilson-plot curves to their least-squares straight lines.
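The straight-line fit described above can be sketched in a few lines. The following is only an illustration under stated assumptions: the function name `wilson_fit`, the synthetic shell values and the use of an unweighted fit are all choices made here, not part of the chapter, and a real implementation would also handle shell overlap, multiplicities and weak data as discussed in the text.

```python
import math

def wilson_fit(s2, mean_I, mean_F0sq):
    """Unweighted least-squares line through ln(<I>/<|F0|^2>) versus s^2.

    Uses ln(<I>/<|F0|^2>) = -ln K - 2 B s^2, so that
    B = -slope / 2 and K = exp(-intercept).
    """
    y = [math.log(i / f) for i, f in zip(mean_I, mean_F0sq)]
    n = len(s2)
    mx = sum(s2) / n
    my = sum(y) / n
    sxx = sum((x - mx) ** 2 for x in s2)
    sxy = sum((x - mx) * (v - my) for x, v in zip(s2, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return math.exp(-intercept), -slope / 2.0   # (K, B)

# Synthetic, noise-free shells (hypothetical numbers, for illustration only).
K_true, B_true = 2.5, 3.0
s2 = [0.02 * (i + 1) for i in range(10)]
F0sq = [100.0 / (1.0 + 5.0 * x) for x in s2]           # stand-in for <|F0|^2>
mean_I = [f * math.exp(-2.0 * B_true * x) / K_true for f, x in zip(F0sq, s2)]

K_fit, B_fit = wilson_fit(s2, mean_I, F0sq)
```

With noise-free synthetic data the fit recovers the K and B used to generate the shells; with real data the scatter about the line reflects the departures from linearity mentioned above.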

Accurate estimates of B and K require good strategies (Rogers & Wilson, 1953[link]) for:

  • (1) treatment of weak measured data. If weak data are set to zero, there will be bias in the statistics. Methods are, however, available (French & Wilson, 1978[link]) that provide an a posteriori estimate of weak (even negative) intensities by means of Bayesian statistics.

  • (2) treatment of missing weak data (Rogers et al., 1955[link]; Vicković & Viterbo, 1979[link]). All unobserved reflections may assume [\eqalign{\mu &= |F_{o\min}|^{2}/3 \hbox{ for cs. space groups}\cr \mu &= |F_{o\min}|^{2}/2 \hbox{ for ncs. space groups,}}] where the subscript `o min' refers to the minimum observed intensity.

Once K and B have been estimated, [E_{\bf h}] values can be obtained from experimental data by [|E_{\bf h}|^{2} = {KI_{\bf h}\over \langle |F_{\bf h}^{o}|^{2} \rangle \exp (- 2Bs^{2})},] where [\langle |F_{\bf h}^{o}|^{2} \rangle] is the expected value of [|F_{\bf h}^{o}|^{2}] for the reflection h on the basis of the available a priori information.

Probability distributions of normalized structure factors

Under some fairly general assumptions (see Chapter 2.1[link]) probability distribution functions for the variable [|E|] for cs. and ncs. structures are (see Fig.[link]) [{}_{\bar{1}}P(|E|) \;\hbox{d}|E| = \sqrt{{2\over \pi}} \exp \left(- {E^{2}\over 2}\right) \;\hbox{d}|E| \eqno (] and [{}_{1}P(|E|) \;\hbox{d}|E| = 2|E| \exp (- |E|^{2}) \;\hbox{d}|E|, \eqno (] respectively. Corresponding cumulative functions are (see Fig.[link]) [\eqalign{_{\bar{1}}N(|E|) &= \sqrt{{2\over \pi}} \int\limits_{0}^{|E|} \exp \left(- {t^{2}\over 2}\right) \;\hbox{d}t = \hbox{erf} \left({|E|\over \sqrt{2}}\right),\cr {}_{1}N(|E|) &= \int\limits_{0}^{|E|} 2t \exp (- t^{2}) \;\hbox{d}t = 1 - \exp (- |E|^{2}).}]


Figure. Probability density functions for cs. and ncs. crystals.

Figure. Cumulative distribution functions for cs. and ncs. crystals.

Some moments of the distributions ([link] and ([link] are listed in Table[link]. In the absence of other indications for a given crystal structure, a cs. or an ncs. space group will be preferred according to whether the statistical tests yield values closer to column 2 or to column 3 of Table[link].

Table. Moments of the distributions ([link] and ([link]

[R(E_{s})] is the fraction of n.s.f.'s with amplitude greater than the threshold [E_{s}].

Criterion                            Centrosymmetric distribution    Noncentrosymmetric distribution
[\langle |E|\rangle]                 0.798                           0.886
[\langle |E|^{2}\rangle]             1.000                           1.000
[\langle |E|^{3}\rangle]             1.596                           1.329
[\langle |E|^{4}\rangle]             3.000                           2.000
[\langle |E|^{5}\rangle]             6.383                           3.323
[\langle |E|^{6}\rangle]             15.000                          6.000
[\langle |E^{2} - 1|\rangle]         0.968                           0.736
[\langle (E^{2} - 1)^{2}\rangle]     2.000                           1.000
[\langle (E^{2} - 1)^{3}\rangle]     8.000                           2.000
[\langle |E^{2} - 1|^{3}\rangle]     8.691                           2.415
R(1)                                 0.320                           0.368
R(2)                                 0.050                           0.018
R(3)                                 0.003                           0.0001

For further details about the distribution of intensities see Chapter 2.1[link] .
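The tabulated moments follow in closed form from the two ideal distributions, which gives a quick way to check them. The sketch below assumes only those distributions (|E| half-normal in the cs. case, |E|² exponentially distributed in the ncs. case); the function names are introduced here for illustration.

```python
import math

def moment_cs(n):
    """<|E|^n> for the centrosymmetric distribution (|E| is half-normal)."""
    return 2.0 ** (n / 2.0) * math.gamma((n + 1) / 2.0) / math.sqrt(math.pi)

def moment_ncs(n):
    """<|E|^n> for the noncentrosymmetric distribution (|E|^2 is exponential)."""
    return math.gamma(n / 2.0 + 1.0)

def tail_cs(t):
    """Fraction of |E| values above t, i.e. 1 - erf(t / sqrt(2))."""
    return math.erfc(t / math.sqrt(2.0))

def tail_ncs(t):
    """Fraction of |E| values above t, i.e. exp(-t^2)."""
    return math.exp(-t * t)

# Print the first six moments side by side for comparison with the table.
for n in range(1, 7):
    print(n, round(moment_cs(n), 3), round(moment_ncs(n), 3))
```

The moments of order one to six reproduce the tabulated values to three decimals. The tail fractions come out close to the tabulated R values but not always identical in the last digit (for example about 0.317 against the tabulated 0.320 for the cs. case at threshold 1).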

2.2.5. Phase-determining formulae

From the earliest periods of X-ray structure analysis several authors (Ott, 1927[link]; Banerjee, 1933[link]; Avrami, 1938[link]) have tried to determine atomic positions directly from diffraction intensities. Significant developments are the derivation of inequalities and the introduction of probabilistic techniques via the use of joint probability distribution methods (Hauptman & Karle, 1953[link]).

Inequalities among structure factors

An extensive system of inequalities exists for the coefficients of a Fourier series which represents a positive function. This can restrict the allowed values for the phases of the s.f.'s in terms of measured structure-factor magnitudes. Harker & Kasper (1948[link]) derived two types of inequalities:

Type 1. A modulus is bound by a combination of structure factors: [|U_{\bf h}|^{2}\leq {1\over m} \sum\limits_{s = 1}^{m} a_{s} (-{\bf h}) U_{{\bf h} ({\bf I}-{\bf R}_{s})}, \eqno(] where m is the order of the point group and [a_{s}(-{\bf h}) = \exp (-2\pi i {\bf h}\cdot {\bf T}_{s})].

Applied to low-order space groups, ([link] gives [\eqalign{ P1: &\quad |U_{h, \,  k, \,  l}|^{2}\leq 1\cr P\bar{1}: &\quad U_{h, \,  k, \,  l}^{2}\leq 0.5 + 0.5 U_{2h, \,  2k, \,  2l}\cr P2_{1}: &\quad |U_{h, \,  k, \,  l}|^{2}\leq 0.5 + 0.5(-1)^{k} U_{2h, \,  0, \,  2l}.}] The meaning of each inequality is easily understandable: in [P\bar{1}], for example, [U_{2h, \,  2k, \,  2l}] must be positive if [|U_{h, \,  k, \,  l}|] is large enough.

Type 2. The modulus of the sum or of the difference of two structure factors is bound by a combination of structure factors: [ \eqalignno{|U_{\bf h} \pm U_{{\bf h}'}|^{2} &\leq {1\over m} \left\{\sum\limits_{s = 1}^{m} a_{s} (-{\bf h}) U_{{\bf h} ({\bf I}-{\bf R}_{s})} + \sum\limits_{s = 1}^{m} a_{s} (-{\bf h}') U_{{\bf h}' ({\bf I}-{\bf R}_{s})}\right.\cr &\quad \left. \pm 2{\scr Re} \left[\sum\limits_{s = 1}^{m} a_{s} (-{\bf h}') U_{{\bf h}-{\bf h}' {\bf R}_{s}}\right]\right\}&(}] where [{\scr Re}] stands for `real part of'. Equation ([link] applied to P1 gives [|U_{\bf h}\pm U_{{\bf h}'}|^{2} \leq 2\pm 2 |U_{{\bf h}-{\bf h}'}| \cos \varphi_{{\bf h}-{\bf h}'}.]

A variant of ([link] valid for cs. space groups is [(U_{\bf h}\pm U_{{\bf h}'})^{2}\leq (1\pm U_{{\bf h}+{\bf h}'}) (1\pm U_{{\bf h}-{\bf h}'}).] After Harker & Kasper's contributions, several other inequalities were discovered (Gillis, 1948[link]; Goedkoop, 1950[link]; Okaya & Nitta, 1952[link]; de Wolff & Bouman, 1954[link]; Bouman, 1956[link]; Oda et al., 1961[link]). The most general are the Karle–Hauptman inequalities (Karle & Hauptman, 1950[link]): [D_{m} = \left|\matrix{U_{0} &U_{-{\bf h}_{1}} &U_{-{\bf h}_{2}} &\ldots &U_{-{\bf h}_{n}}\cr U_{{\bf h}_{1}} &U_{0} &U_{{\bf h}_{1}-{\bf h}_{2}} &\ldots &U_{{\bf h}_{1}-{\bf h}_{n}}\cr U_{{\bf h}_{2}} &U_{{\bf h}_{2}-{\bf h}_{1}} &U_{0} &\ldots &U_{{\bf h}_{2}-{\bf h}_{n}}\cr \vdots &\vdots &\vdots &\ddots &\vdots\cr U_{{\bf h}_{n}} &U_{{\bf h}_{n}-{\bf h}_{1}} &U_{{\bf h}_{n}-{\bf h}_{2}} &\ldots &U_{0}\cr}\right|\geq 0. \eqno(] The determinant can be of any order but the leading column (or row) must consist of U's with different indices, although, within the column, symmetry-related U's may occur. For [n = 2] and [{\bf h}_{2} = 2{\bf h}_{1} = 2{\bf h}], equation ([link] reduces to [D_{3} = \left|\matrix{U_{0} &U_{-{\bf h}} &U_{-2{\bf h}}\cr U_{\bf h} &U_{0} &U_{-{\bf h}}\cr U_{2{\bf h}} &U_{\bf h} &U_{0}\cr}\right|\geq 0,] which, for cs. structures, gives the Harker & Kasper inequality [U_{\bf h}^{2} \leq 0.5 + 0.5 U_{2{\bf h}}.] For [m = 3], equation ([link] becomes [D_{3} = \left|\matrix{U_{0} &U_{-{\bf h}} &U_{-{\bf k}}\cr U_{\bf h} &U_{0} &U_{{\bf h}-{\bf k}}\cr U_{\bf k} &U_{{\bf k}-{\bf h}} &U_{0}\cr}\right| \geq 0,] from which [{1 - |U_{\bf h}|^{2} - |U_{\bf k}|^{2} - |U_{{\bf h}-{\bf k}}|^{2} + 2|U_{\bf h} U_{\bf k} U_{{\bf h}-{\bf k}}| \cos \alpha_{{\bf h}, \,  {\bf k}} \geq 0,} \eqno(] where [\alpha_{{\bf h}, \,  {\bf k}} = \varphi_{\bf h} - \varphi_{\bf k} - \varphi_{{\bf h} - {\bf k}}.] 
If the moduli [|U_{\bf h}|], [|U_{\bf k}|], [|U_{{\bf h} - {\bf k}}|] are large enough, ([link] is not satisfied for all values of [\alpha_{{\bf h}, \,  {\bf k}}]. In cs. structures, if one of the two possible values of [\alpha_{{\bf h}, \,  {\bf k}}] (0 or π) is found not to satisfy ([link], the sign of the product [U_{\bf h} U_{\bf k} U_{{\bf h} - {\bf k}}] is identified unambiguously.
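As a small illustration of how the third-order determinant restricts signs in the cs. case, the sketch below evaluates the expanded determinant for the two possible values of the cosine (+1 and −1) and reports which signs of the triple product remain allowed. The function names and the numerical moduli are hypothetical.

```python
def d3(u_h, u_k, u_hk, cos_alpha):
    """Expanded third-order Karle-Hauptman determinant for moduli u_h, u_k, u_hk."""
    return (1.0 - u_h ** 2 - u_k ** 2 - u_hk ** 2
            + 2.0 * u_h * u_k * u_hk * cos_alpha)

def allowed_signs(u_h, u_k, u_hk):
    """Signs (+1 or -1) of U_h U_k U_{h-k} compatible with D3 >= 0 in a cs. structure."""
    return [s for s in (+1, -1) if d3(u_h, u_k, u_hk, s) >= 0.0]

strong = allowed_signs(0.6, 0.6, 0.6)   # large moduli: only one sign survives
weak = allowed_signs(0.3, 0.3, 0.3)     # small moduli: the inequality is silent
```

For three moduli of 0.6 the negative sign violates the inequality, so the product must be positive; for moduli of 0.3 both signs satisfy it and nothing can be concluded from the inequality alone.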

It was observed (Gillis, 1948[link]) that `there was a number of cases in which both signs satisfied the inequality, one of them by a comfortable margin and the other by only a relatively small margin. In almost all such cases it was the former sign which was the correct one. That suggests that the method may have some power in reserve in the sense that there are still fundamentally stronger inequalities to be discovered'. Today we identify this power in reserve in the use of probability theory.

Probabilistic phase relationships for structure invariants

For any space group (see Section 2.2.3[link]) there are linear combinations of phases with cosines that are, in principle, fixed by the [|E|] magnitudes alone (s.i.'s) or by the [|E|] values and the trigonometric form of the structure factor (s.s.'s). This result greatly stimulated the calculation of conditional distribution functions [P(\Phi |\{R\}), \eqno(] where [R_{\bf h} = |E_{\bf h}|], [\Phi = \sum A_{i} \varphi_{{\bf h}_{i}}] is an s.i. or an s.s. and [\{R\}] is a suitable set of diffraction magnitudes. The method was first proposed by Hauptman & Karle (1953[link]) and was developed further by several authors (Bertaut, 1955a[link],b[link], 1960[link]; Klug, 1958[link]; Naya et al., 1964[link], 1965[link]; Giacovazzo, 1980a[link]). From a probabilistic point of view the crystallographic problem is clear: the joint distribution [P(E_{{\bf h}_{1}}, \ldots, E_{{\bf h}_{n}})], from which the conditional distributions ([link] can be derived, involves a number of normalized structure factors each of which is a linear sum of random variables (the atomic contributions to the structure factors). So, for the probabilistic interpretation of the phase problem, the atomic positions and the reciprocal vectors may be considered as random variables. A further problem is that of identifying, for a given Φ, a suitable set of magnitudes [|E|] on which Φ primarily depends. The nested neighbourhood principle (Hauptman, 1975[link]) first fixed the idea of defining a sequence of sets of reflections, each contained in the succeeding one, with the property that any s.i. or s.s. may be estimated via the magnitudes constituting the various neighbourhoods. A subsequent more general theory, the representation method (Giacovazzo, 1977a[link], 1980b[link]), arranges for any Φ the set of intensities in a sequence of subsets in order of their expected effectiveness (in the statistical sense) for the estimation of Φ.

In the following sections the main formulae estimating low-order invariants and seminvariants or relating phases to other phases and diffraction magnitudes are given.

Triplet relationships

The basic formula for the estimation of the triplet phase [{\Phi = \varphi_{\bf h}} - \varphi_{\bf k} - \varphi_{{\bf h} - {\bf k}}] given the parameter [G = 2\sigma_{3} \sigma_{2}^{-3/2}\times R_{\bf h} R_{\bf k} R_{{\bf h} - {\bf k}}] is Cochran's (1955[link]) formula [P(\Phi) = [2\pi I_{0} (G)]^{-1} \exp (G \cos \Phi), \eqno(] where [\sigma_{n} = \sum_{j = 1}^{N} Z_{j}^{n}], [Z_{j}] is the atomic number of the jth atom and [I_{n}] is the modified Bessel function of order n. In Fig.[link] the distribution [P(\Phi)] is shown for different values of G.


Figure. Curves of ([link] for some values of [G = 2\sigma_{3} \sigma_{2}^{-3/2} |E_{\bf h} E_{\bf k} E_{{\bf h} - {\bf k}}|].
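Cochran's distribution is easy to evaluate numerically, which helps in seeing how quickly it sharpens as G grows. The following sketch is an illustration only: the helper names are introduced here, the modified Bessel function is summed from its power series (adequate for the moderate G values met in practice), and the 36-atom equal-atom example is hypothetical.

```python
import math

def bessel_i0(x):
    """Modified Bessel function I0(x) from its power series."""
    total = term = 1.0
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total

def cochran_G(R_h, R_k, R_hk, Z):
    """G = 2 sigma3 sigma2^(-3/2) R_h R_k R_{h-k}, with sigma_n = sum of Z_j^n."""
    sigma2 = sum(z ** 2 for z in Z)
    sigma3 = sum(z ** 3 for z in Z)
    return 2.0 * sigma3 * sigma2 ** -1.5 * R_h * R_k * R_hk

def cochran_pdf(phi, G):
    """P(phi) = exp(G cos phi) / (2 pi I0(G)), phi in radians."""
    return math.exp(G * math.cos(phi)) / (2.0 * math.pi * bessel_i0(G))

# Hypothetical example: 36 equal atoms (Z = 6), three magnitudes of 2.0.
G = cochran_G(2.0, 2.0, 2.0, [6] * 36)

# Numerical check that the density integrates to one over a full period.
n = 2000
dphi = 2.0 * math.pi / n
area = sum(cochran_pdf(i * dphi, G) for i in range(n)) * dphi
```

For equal atoms the prefactor reduces to N^(-1/2), so the example gives G = 16/6; the density is largest at Φ = 0 and smallest at Φ = π.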

The conditional probability distribution for [\varphi_{\bf h}], given a set of [(\varphi_{{\bf k}_{j}} + \varphi_{{{\bf h} - {\bf k}}_{j}})] and [G_{j} = 2\sigma_{3} \sigma_{2}^{-3/2} R_{\bf h} R_{{\bf k}_{j}} R_{{{\bf h} - {\bf k}}_{j}}], is given (Karle & Hauptman, 1956[link]; Karle & Karle, 1966[link]) by [P(\varphi_{\bf h}) = [2\pi I_{0} (\alpha)]^{-1} \exp [\alpha \cos (\varphi_{\bf h} - \beta_{\bf h})], \eqno(] where [\eqalignno{\alpha^{2} &= \left[\textstyle\sum\limits_{j = 1}^{r} G_{{{\bf h}, \,  {\bf k}}_{j}} \cos (\varphi_{{\bf k}_{j}} + \varphi_{{{\bf h} - {\bf k}}_{j}})\right]^{2}\cr &\quad + \left[\textstyle\sum\limits_{j = 1}^{r} G_{{{\bf h}, \,  {\bf k}}_{j}} \sin (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h}-{\bf k}_{j}})\right]^{2} &(}] [\tan \beta_{\bf h} = {{\textstyle\sum_{j}} G_{{{\bf h}, \,  {\bf k}}_{j}} \sin (\varphi_{{\bf k}_{j}} + \varphi_{{{\bf h} - {\bf k}}_{j}})\over {\textstyle\sum_{j}} G_{{{\bf h}, \,  {\bf k}}_{j}} \cos (\varphi_{{\bf k}_{j}} + \varphi_{{{\bf h} - {\bf k}}_{j}})}. \eqno (] [\beta_{\bf h}] is the most probable value for [\varphi_{\bf h}]. The variance of [\varphi_{\bf h}] may be obtained from ([link] and is given by [\eqalignno{V_{\bf h} &= {\pi^{2}\over 3} + [I_{0} (\alpha)]^{-1} \sum\limits_{n = 1}^{\infty} {I_{2n} (\alpha)\over n^{2}}\cr &\quad -4[I_{0} (\alpha)]^{-1} \sum\limits_{n = 0}^{\infty} {I_{2n + 1} (\alpha)\over (2n + 1)^{2}}, &(}] which is plotted in Fig.[link].


Figure. Variance (in square radians) as a function of α.

Equation ([link] is the so-called tangent formula. According to ([link], the larger α is, the more reliable is the relation [\varphi_{\bf h} = \beta_{\bf h}].
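A minimal sketch of one tangent-formula evaluation is given below. The function name and input layout are assumptions made here: each contribution is a pair consisting of the weight G and the phase sum for one known pair of reflections, in radians. The two sums are combined with `atan2`, which selects the quadrant from the signs of numerator and denominator, since the tangent alone leaves β ambiguous by π.

```python
import math

def tangent_formula(contributions):
    """Return (alpha, beta) from pairs (G_j, phi_k_j + phi_{h-k_j}) in radians."""
    pairs = list(contributions)
    c = sum(g * math.cos(p) for g, p in pairs)   # denominator of tan(beta)
    s = sum(g * math.sin(p) for g, p in pairs)   # numerator of tan(beta)
    alpha = math.hypot(s, c)                     # reliability parameter
    beta = math.atan2(s, c) % (2.0 * math.pi)    # most probable phase in [0, 2 pi)
    return alpha, beta

# Two consistent indications reinforce each other ...
alpha_ok, beta_ok = tangent_formula([(2.0, 0.5), (1.0, 0.5)])
# ... while two contradictory ones cancel and leave the phase undetermined.
alpha_bad, _ = tangent_formula([(1.5, 0.0), (1.5, math.pi)])
```

In the first case α is the sum of the two weights and β equals the common phase sum; in the second α is essentially zero, so the indicated β carries no weight.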

For an equal-atom structure [\sigma_{3} \sigma_{2}^{-3/2} = N^{-1/2}].

The basic conditional formula for sign determination of [E_{\bf h}] in cs. crystals is Cochran & Woolfson's (1955[link]) formula [P^{+} = {\textstyle{1\over 2}} + {\textstyle{1\over 2}} \tanh \left(\sigma_{3} \sigma_{2}^{-3/2} |E_{\bf h}| \textstyle\sum\limits_{j = 1}^{r} E_{{\bf k}_{j}} E_{{{\bf h} - {\bf k}}_{j}}\right), \eqno(] where [P^{+}] is the probability that [E_{\bf h}] is positive and k ranges over the set of known values [E_{\bf k} E_{{\bf h} - {\bf k}}]. The larger the absolute value of the argument of tanh, the more reliable is the phase indication.
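The sign probability above can be evaluated directly. The sketch below assumes an equal-atom structure, so that the prefactor is N^(-1/2); the function name and the numerical values are hypothetical.

```python
import math

def sign_probability(abs_E_h, products, N):
    """P+ for E_h in a cs. equal-atom structure of N atoms.

    products holds the signed values E_k * E_{h-k} for the known pairs.
    """
    argument = abs_E_h * sum(products) / math.sqrt(N)
    return 0.5 + 0.5 * math.tanh(argument)

# Two known pairs, both indicating a positive sign (hypothetical values).
p_plus = sign_probability(2.0, [2.5 * 2.0, 1.8 * 2.2], 50)
# With no known pair the formula gives no preference.
p_none = sign_probability(2.0, [], 50)
```

Reversing the sign of every product turns P⁺ into 1 − P⁺, as expected from the symmetry of tanh.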

An auxiliary formula exploiting all the [|E|]'s in reciprocal space in order to estimate a single Φ is the [B_{3, \,  0}] formula (Hauptman & Karle, 1958[link]; Karle & Hauptman, 1958[link]) given by [\eqalignno{&|E_{{\bf h}_{1}} E_{{\bf h}_{2}} E_{-{\bf h}_{1}-{\bf h}_{2}}| \cos (\varphi_{{\bf h}_{1}} + \varphi_{{\bf h}_{2}} - \varphi_{{\bf h}_{1} + {\bf h}_{2}})\cr &\quad \simeq C\langle (|E_{\bf k}|^{p} - \overline{|E|^{p}}) (|E_{{\bf h}_{1} + {\bf k}}|^{p} - \overline{|E|^{p}}) (|E_{{\bf h}_{1}+{\bf h}_{2}+{\bf k}}|^{p} - \overline{|E|^{p}})\rangle_{\bf k}\cr &\qquad - {2\sigma_{6}\over \sigma_{4}^{3/2}} + {\sigma_{8}^{1/2}\over \sigma_{4}} (|E_{{\bf h}_{1}}|^{2} + |E_{{\bf h}_{2}}|^{2} + |E_{{\bf h}_{1}+{\bf h}_{2}}|^{2}) \ldots, &(}] where C is a constant which differs for cs. and ncs. crystals, [\overline{|E|^{p}}] is the average value of [|E|^{p}] and p is normally chosen to be some small number. Several modifications of ([link] have been proposed (Hauptman, 1964[link], 1970[link]; Karle, 1970a[link]; Giacovazzo, 1977b[link]).

A recent formula (Cascarano, Giacovazzo, Camalli et al., 1984[link]) exploits information contained within the second representation of Φ, that is to say, within the collection of special quintets (see Section[link]): [\varphi_{{\bf h}_{1}} + \varphi_{{\bf h}_{2}} - \varphi_{{\bf h}_{1}+{\bf h}_{2}} + \varphi_{\bf k} - \varphi_{\bf k},] where k is a free vector. The formula retains the same algebraic form as ([link], but [G = {2R_{{\bf h}_{1}} R_{{\bf h}_{2}} R_{{\bf h}_{3}}\over \sqrt{N}} (1 + Q), \eqno(] where [{\bf h}_{3} = -({\bf h}_{1} + {\bf h}_{2})], [\eqalign{Q &= \sum\limits_{\bf k} {{\textstyle\sum_{i = 1}^{'m}} A_{{\bf k}, \,  i} / N\over 1 + \left(\varepsilon_{{\bf h}_{1}} \varepsilon_{{\bf h}_{2}} \varepsilon_{{\bf h}_{3}} + {\textstyle\sum_{i = 1}^{'m}} B_{{\bf k}, \,  i}\right) \bigg/ 2N},\cr A_{{\bf k}, \,  i} &= \varepsilon_{\bf k} [\varepsilon_{{\bf h}_{1}+{\bf kR}_{i}} (\varepsilon_{{\bf h}_{2}-{\bf kR}_{i}} + \varepsilon_{{\bf h}_{3}-{\bf kR}_{i}})\cr &\quad + \varepsilon_{{\bf h}_{2}+{\bf kR}_{i}} (\varepsilon_{{\bf h}_{1}-{\bf kR}_{i}} + \varepsilon_{{\bf h}_{3}-{\bf kR}_{i}})\cr &\quad + \varepsilon_{{\bf h}_{3}+{\bf kR}_{i}} (\varepsilon_{{\bf h}_{1}-{\bf kR}_{i}} + \varepsilon_{{\bf h}_{2}-{\bf kR}_{i}})],\cr B_{{\bf k}, \,  i} &= \varepsilon_{{\bf h}_{1}} [\varepsilon_{\bf k} (\varepsilon_{{\bf h}_{1}+{\bf kR}_{i}} + \varepsilon_{{\bf h}_{1}-{\bf kR}_{i}})\cr &\quad + \varepsilon_{{\bf h}_{2}+{\bf kR}_{i}} \varepsilon_{{\bf h}_{3}-{\bf kR}_{i}} + \varepsilon_{{\bf h}_{2}-{\bf kR}_{i}} \varepsilon_{{\bf h}_{3}+{\bf kR}_{i}}]\cr &\quad + \varepsilon_{{\bf h}_{2}} [\varepsilon_{\bf k} (\varepsilon_{{\bf h}_{2}+{\bf kR}_{i}} + \varepsilon_{{\bf h}_{2}-{\bf kR}_{i}})\cr &\quad + \varepsilon_{{\bf h}_{1}+{\bf kR}_{i}} \varepsilon_{{\bf h}_{3}-{\bf kR}_{i}} + \varepsilon_{{\bf h}_{1}-{\bf kR}_{i}} \varepsilon_{{\bf h}_{3}+{\bf kR}_{i}}]\cr &\quad + \varepsilon_{{\bf h}_{3}} [\varepsilon_{\bf k} (\varepsilon_{{\bf h}_{3}+{\bf kR}_{i}} + \varepsilon_{{\bf h}_{3}-{\bf kR}_{i}})\cr &\quad + \varepsilon_{{\bf h}_{1}+{\bf kR}_{i}} \varepsilon_{{\bf h}_{2}-{\bf kR}_{i}} + \varepsilon_{{\bf h}_{1}-{\bf kR}_{i}} \varepsilon_{{\bf h}_{2}+{\bf kR}_{i}}]\hbox{;}}] [\varepsilon = |E|^{2} - 1] and [(\varepsilon_{{\bf h}_{1}} \varepsilon_{{\bf h}_{2}} \varepsilon_{{\bf h}_{3}} + \sum_{i=1}^{'m} B_{{\bf k}, \,  i})] is assumed to be zero if it is experimentally negative. The prime to the summation warns the reader that precautions have to be taken in order to avoid duplications in the contributions.

G may be positive or negative. In particular, if [G\lt 0] the triplet is estimated negative.

The accuracy with which the value of Φ is estimated strongly depends on [\varepsilon_{\bf k}]. Thus, in practice, only a subset of reciprocal space (the reflections k with large values of [\varepsilon]) may be used for estimating Φ.

([link] proved to be quite useful in practice. Positive triplet cosines are ranked in order of reliability by ([link] markedly better than by Cochran's parameters. Negative estimated triplet cosines may be excluded from the phasing process and may be used as a figure of merit for finding the correct solution in a multisolution procedure.

Triplet relationships using structural information

A strength of direct methods is that no knowledge of structure is required for their application. However, when some a priori information is available, it would certainly be a weakness of the methods not to make use of this knowledge. The conditional distribution of Φ given [R_{\bf h}R_{\bf k}R_{{\bf h}-{\bf k}}] and the first three of the five kinds of a priori information described in Section[link] is (Main, 1976[link]; Heinermann, 1977a[link]) [P (\Phi) \simeq {\exp \{2QR_{1}R_{2}R_{3} \cos (\Phi - q)\}\over 2\pi I_{0} (2QR_{1}R_{2}R_{3})}, \eqno(] where [Q \exp (iq) = {{\textstyle\sum_{i=1}^{p}} g_{i}({\bf h}_{1},{\bf h}_{2},{\bf h}_{3})\over \langle |F_{{\bf h}_{1}}|^{2} \rangle^{1/2} \langle |F_{{\bf h}_{2}}|^{2} \rangle^{1/2} \langle |F_{{\bf h}_{3}}|^{2} \rangle^{1/2}}.] [{\bf h}_{1}, {\bf h}_{2}, {\bf h}_{3}] stand for h, [-{\bf k}], [-{\bf h} + {\bf k}], and [R_{1}, R_{2}, R_{3}] for [R_{\bf h}, R_{\bf k}, R_{{\bf h} - {\bf k}}]. The quantities [\langle |F_{{\bf h}_{i}}|^{2} \rangle] have been calculated in Section[link] according to different categories: [g_{i}({\bf h}_{1}, {\bf h}_{2}, {\bf h}_{3})] is a suitable average of the product of three scattering factors for the ith atomic group, p is the number of atomic groups in the cell including those related by symmetry elements. We have the following categories.

  • (a) No structural information

    ([link] then reduces to ([link].

  • (b) Randomly positioned and randomly oriented atomic groups

    Then [g_{i}({\bf h}_{1}, {\bf h}_{2}, {\bf h}_{3}) = \textstyle\sum\limits_{j, \,  k, \,  l} f_{j}\;f_{k}\;f_{l} \langle \exp [2\pi i({\bf h}_{1} \cdot {\bf r}_{kj} + {\bf h}_{2} \cdot {\bf r}_{lj})]\rangle_{R},] where [\langle \ldots \rangle_{R}] means rotational average. The average of the exponential term extends over all orientations of the triangle formed by the atoms j, k and l, and is given (Hauptman, 1965[link]) by [\eqalign{B(z, t) &= \langle \exp [2\pi i ({\bf h} \cdot {\bf r} + {\bf h}' \cdot {\bf r}')] \rangle\cr &= \left({\pi\over 2z}\right)^{1/2} \sum\limits_{n=0}^{\infty} {t^{2n}\over (n!)^{2}} J_{(4n+1)/2} (z),}] where [z = 2 \pi [q^{2} r^{2} + 2 qrq' r' \cos \varphi_{q} \cos \varphi_{r} + q'^{2} r'^{2}]^{1/2}] and [t = [2 \pi^{2} qrq' r' \sin \varphi_{q} \sin \varphi_{r}]/z\hbox{;}] q, q′, r and r′ are the magnitudes of h, h′, r and r′, respectively; [\varphi_{q}] and [\varphi_{r}] are the angles [({\bf h},{\bf h}')] and [({\bf r},{\bf r}')], respectively.

  • (c) Randomly positioned but correctly oriented atomic groups

    Then [\eqalign{g_{i} ({\bf h}_{1}, {\bf h}_{2}, {\bf h}_{3}) &= \textstyle\sum\limits_{s = 1}^{m} \textstyle\sum\limits_{j, \,  k, \,  l} f_{j}\; f_{k}\; f_{l}\cr &\quad \times \exp [2\pi i ({\bf h}_{1} \cdot {\bf R}_{s} {\bf r}_{kj} + {\bf h}_{2} \cdot {\bf R}_{s} {\bf r}_{lk})],}] where the summations over j, k, l are taken over all the atoms in the ith group.

    A modified expression for [g_{i}] has to be used in polar space groups for special triplets (Giacovazzo, 1988[link]).

    Translation functions [see Chapter 2.3[link] ; for an overview, see also Beurskens et al. (1987)[link]] are also used to determine the position of a correctly oriented molecular fragment.

    Such functions can work in direct space [expressed as Patterson convolutions (Buerger, 1959[link]; Nordman, 1985[link]) or electron-density convolutions (Rossmann et al., 1964[link]; Argos & Rossmann, 1980[link])] or in reciprocal space [expressed as correlation functions (Crowther & Blow, 1967[link]; Karle, 1972[link]; Langs, 1985[link]) or residual functions (Rae, 1977[link])]. Both the probabilistic methods and the translation functions are quite efficient tools: the decision as to which one to use is often a personal choice.

  • (d) Atomic groups correctly positioned

    Let p be the number of atoms with known position, q the number of atoms with unknown position, [F_{p}] and [F_{q}] the corresponding structure factors.

    Tangent recycling methods (Karle, 1970b[link]) may be used for recovering the complete crystal structure. The phase [\varphi_{p, \,   {\bf h}}] is accepted in the starting set as a useful approximation of [\varphi_{\bf h}] if [|F_{p, \,   {\bf h}}|\gt \eta |F_{\bf h}|], where η is the fraction of the total scattering power contained in the fragment and where [|F_{\bf h}|] is associated with [|E_{\bf h}|\gt 1.5].

    Tangent recycling methods are applied (Beurskens et al., 1979[link]) with greater effectiveness to difference s.f.'s [\Delta F = (|F| - |F_{p}|) \exp (i \varphi_{p})]. The weighted tangent formula uses [\Delta F_{\bf h}] values in order to convert them to more probable [F_{q, \,  {\bf h}}] values.

    From a probabilistic point of view (Giacovazzo, 1983a[link]; Camalli et al., 1985[link]) the distribution of [\varphi_{\bf h}], given [E'_{p, \,  {\bf h}}] and some products [(E'_{\bf k} - E'_{p, \,  {\bf k}}) (E'_{{\bf h}-{\bf k}} - E'_{p, \,  {\bf h}-{\bf k}})], is the von Mises function [P(\varphi_{\bf h}| \ldots) = [2\pi I_{0} (\alpha)]^{-1} \exp [\alpha \cos (\varphi_{\bf h} - \theta_{\bf h})], \eqno(] where [\theta_{\bf h}], the most probable value of [\varphi_{\bf h}], is given by [\eqalignno{\tan \theta_{\bf h}& \simeq \alpha'_{2}/\alpha'_{1}, &(\cr \alpha^2&=\alpha_1^{'2}+\alpha_2^{'2}&\cr}] and [ \eqalign{\alpha'_{1} &= 2 R'_{\bf h} \left\{\hbox{\scr R} \left[E'_{p, \,  {\bf h}} + q^{-1/2} \textstyle\sum_{\bf k} (E'_{\bf k} - E'_{p, \,  {\bf k}})\right.\right.\cr &\qquad \vphantom{\sum_{k}}\times (E'_{{\bf h}-{\bf k}} - E'_{p, \, {\bf h}- {\bf k}})\Big]\Big\}\cr \alpha'_{2} &= 2 R'_{\bf h} \left\{\hbox{\scr I} \left[E'_{p, \,  {\bf h}} + q^{-1/2} \textstyle\sum_{\bf k} (E'_{\bf k} - E'_{p, \,  {\bf k}}) \right.\right.\cr &\qquad \vphantom{\sum_{k}}\times (E'_{{\bf h}-{\bf k}} - E'_{p, \,  {\bf h}-{\bf k}})\Big]\Big\}.}] [ \hbox{\scr R}] and [ \hbox{\scr I}] stand for `real and imaginary part of', respectively. Furthermore, [E' = F/\sum_{q}^{1/2}] is a pseudo-normalized s.f. If no pair [(\varphi_{\bf k}, \varphi_{{\bf h}-{\bf k}})] is known, then [\eqalign{\alpha'_{1} &= 2 R'_{\bf h} R'_{p, \,  {\bf h}} \cos \varphi_{p, \,  {\bf h}}\cr \alpha'_{2} &= 2 R'_{\bf h} R'_{p, \,  {\bf h}} \sin \varphi_{p, \,  {\bf h}}}] and ([link] reduces to Sim's (1959[link]) equation [P(\varphi_{\bf h}) \simeq [2\pi I_{0} (G)]^{-1} \exp [G \cos (\varphi_{\bf h} - \varphi_{p, \,  {\bf h}})], \eqno(] where [G = 2 R'_{\bf h} R'_{p, \,  {\bf h}}]. In this case [\varphi_{p, \,  {\bf h}}] is the most probable value of [\varphi_{\bf h}].

  • (e) Pseudotranslational symmetry is present

    Substructure and superstructure reflections are then described by different forms of the structure-factor equation (Böhme, 1982[link]; Gramlich, 1984[link]; Fan et al., 1983[link]), so that probabilistic formulae estimating triplet cosines derived on the assumption that atoms are uniformly dispersed in the unit cell cannot hold. In particular, the reliability of each triplet also depends on, besides [R_{\bf h}, R_{\bf k}, R_{{\bf h} - {\bf k}}], the actual h, k, [{\bf h}-{\bf k}] indices and on the nature of the pseudotranslation. It has been shown (Cascarano et al., 1985b[link]; Cascarano, Giacovazzo & Luić, 1987[link]) that ([link], ([link], ([link] still hold provided [G_{{{\bf h}, \,  {\bf k}}_{j}}] is replaced by [G'_{{{\bf h}, \,  {\bf k}}_{j}} = {2 R_{\bf h} R_{{\bf k}_{j}} R_{{\bf h}-{\bf k}_{j}}\over \sqrt{N_{{\bf h}, \,  {\bf k}}}},] where factors E and [n_{i}] are defined according to Section[link], [N_{{\bf h},{\bf k}} = {(\zeta_{\bf h} [\sigma_{2}]_{p} + [\sigma_{2}]_{q}) (\zeta_{\bf k} [\sigma_{2}]_{p} + [\sigma_{2}]_{q}) (\zeta_{{\bf h} - {\bf k}} [\sigma_{2}]_{p} + [\sigma_{2}]_{q})\over \{(\beta / m) [\sigma_{3}]_{p} (n_{1}^{2} n_{2}^{2} n_{3}^{2} \ldots) + [\sigma_{3}]_{q}\}^{2}},] and β is the number of times for which [\displaylines{{\bf hR}_{s} \cdot {\bf u}_{1} \equiv 0 \ (\hbox{mod} \ 1) \qquad {\bf hR}_{s} \cdot {\bf u}_{2} \equiv 0 \ (\hbox{mod} \ 1) \qquad {\bf hR}_{s} \cdot {\bf u}_{3} \equiv 0 \ (\hbox{mod} \ 1) \ldots\cr {\bf kR}_{s} \cdot {\bf u}_{1} \equiv 0 \ (\hbox{mod} \ 1) \qquad {\bf kR}_{s} \cdot {\bf u}_{2} \equiv 0 \ (\hbox{mod} \ 1) \qquad {\bf kR}_{s} \cdot {\bf u}_{3} \equiv 0 \ (\hbox{mod} \ 1) \ldots\cr ({\bf h - k}){\bf R}_{s} \cdot {\bf u}_{1} \equiv 0 \ (\hbox{mod} \ 1)\qquad ({\bf h - k}){\bf R}_{s} \cdot {\bf u}_{2} \equiv 0 \ (\hbox{mod} \ 1)\cr ({\bf h} - {\bf k}) {\bf R}_{s} \cdot {\bf u}_{3} \equiv 0 \ (\hbox{mod} \ 1) \ldots}] are simultaneously satisfied when s varies from 1 to m. 
The above formulae have been generalized (Cascarano et al., 1988b[link]) to the case in which deviations both of replacive and of displacive type from ideal pseudo-translational symmetry occur.

Quartet phase relationships

In early papers (Hauptman & Karle, 1953[link]; Simerska, 1956[link]) the phase [\Phi = \varphi_{\bf h} + \varphi_{\bf k} + \varphi_{\bf l} - \varphi_{{\bf h} + {\bf k} + {\bf l}}] was always expected to be zero. Schenk (1973a[link],b[link]) [see also Hauptman (1974[link])] suggested that Φ primarily depends on the seven magnitudes: [R_{\bf h}, R_{\bf k}, R_{\bf l}, R_{{\bf h} + {\bf k} + {\bf l}}], called basis magnitudes, and [R_{{\bf h} + {\bf k}}, R_{{\bf h} + {\bf l}}, R_{{\bf k} + {\bf l}}], called cross magnitudes.

The conditional probability of Φ in P1 given seven magnitudes [(R_{1} = R_{\bf h}, \ldots,\ R_{4} = R_{{\bf h} + {\bf k} + {\bf l}},\ R_{5} = R_{{\bf h} + {\bf k}},\; R_{6} = R_{{\bf h} + {\bf l}},\; R_{7} = R_{{\bf k} + {\bf l}})] according to Hauptman (1975[link]) is [\eqalign{P_{7} (\Phi) &= {1\over L} \exp (- 2 B \cos \Phi) I_{0} (2 \sigma_{3} \sigma_{2}^{-3/2} R_{5} Y_{5})\cr &\quad \times I_{0} (2 \sigma_{3} \sigma_{2}^{-3/2} R_{6} Y_{6}) I_{0} (2 \sigma_{3} \sigma_{2}^{-3/2} R_{7} Y_{7}),}] where L is a suitable normalizing constant which can be derived numerically, [\eqalign{B &= \sigma_{2}^{-3} (3 \sigma_{3}^{2} - \sigma_{2} \sigma_{4}) R_{1} R_{2} R_{3} R_{4}\cr Y_{5} &= [R_{1}^{2} R_{2}^{2} + R_{3}^{2} R_{4}^{2} + 2 R_{1} R_{2} R_{3} R_{4} \cos \Phi]^{1/2}\cr Y_{6} &= [R_{3}^{2} R_{1}^{2} + R_{2}^{2} R_{4}^{2} + 2 R_{1} R_{2} R_{3} R_{4} \cos \Phi]^{1/2}\cr Y_{7} &= [R_{2}^{2} R_{3}^{2} + R_{1}^{2} R_{4}^{2} + 2 R_{1} R_{2} R_{3} R_{4} \cos \Phi]^{1/2}.}] For equal atoms [\sigma_{2}^{-3} (3 \sigma_{3}^{2} - \sigma_{2} \sigma_{4}) = 2/N]. Denoting [\displaylines{C = R_{1} R_{2} R_{3} R_{4} / N, \cr Z_{5} = 2 Y_{5} / \sqrt{N}, \quad Z_{6} = 2 Y_{6} / \sqrt{N}, \quad Z_{7} = 2 Y_{7} / \sqrt{N}}] gives [\eqalignno{P_{7} (\Phi) &= {1\over L} \exp (- 4 C \cos \Phi)\cr &\quad \times I_{0} (R_{5} Z_{5}) I_{0} (R_{6} Z_{6}) I_{0} (R_{7} Z_{7}). &(}] Fig.[link] shows the distribution ([link] for three typical cases. It is clear from the figure that the cosine estimated near π or in the middle range will be in poorer agreement with the true values than the cosine near 0 because of the relatively larger values of the variance. In principle, however, the formula is able to estimate negative or enantiomorph-sensitive quartet cosines from the seven magnitudes.


Figure. Distributions ([link] (solid curve) and ([link] (dashed curve) for the indicated [|E|] values in three typical cases.
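The seven-magnitude distribution can be tabulated numerically, with the normalizing constant L obtained by summing over a grid of Φ values. The sketch below uses the equal-atom form in terms of C and Z; the function names, the grid size and the example magnitudes are assumptions made for illustration, and the modified Bessel function is again summed from its power series.

```python
import math

def bessel_i0(x):
    """Modified Bessel function I0(x) from its power series."""
    total = term = 1.0
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total

def p7(R, N, n_grid=720):
    """Tabulate P7(phi) on [0, 2 pi) for R = (R1, ..., R7) and N equal atoms."""
    R1, R2, R3, R4, R5, R6, R7 = R
    C = R1 * R2 * R3 * R4 / N
    dphi = 2.0 * math.pi / n_grid
    phis = [i * dphi for i in range(n_grid)]
    values = []
    for phi in phis:
        c = math.cos(phi)
        mixed = 2.0 * R1 * R2 * R3 * R4 * c
        # max(0, ...) guards against tiny negative values from rounding.
        Y5 = math.sqrt(max(0.0, (R1 * R2) ** 2 + (R3 * R4) ** 2 + mixed))
        Y6 = math.sqrt(max(0.0, (R1 * R3) ** 2 + (R2 * R4) ** 2 + mixed))
        Y7 = math.sqrt(max(0.0, (R2 * R3) ** 2 + (R1 * R4) ** 2 + mixed))
        Z5, Z6, Z7 = (2.0 * Y / math.sqrt(N) for Y in (Y5, Y6, Y7))
        values.append(math.exp(-4.0 * C * c)
                      * bessel_i0(R5 * Z5) * bessel_i0(R6 * Z6) * bessel_i0(R7 * Z7))
    L = sum(values) * dphi                      # numerical normalizing constant
    return phis, [v / L for v in values], dphi

def mode(phis, probs):
    """Grid value of phi at which the tabulated density is largest."""
    return phis[probs.index(max(probs))]

# Hypothetical quartets with N = 20: large and vanishing cross magnitudes.
phis_s, probs_s, dphi = p7((2, 2, 2, 2, 2.0, 2.0, 2.0), 20)
phis_w, probs_w, _ = p7((2, 2, 2, 2, 0.0, 0.0, 0.0), 20)
```

With large cross magnitudes the density peaks at Φ = 0, while with vanishing cross magnitudes it peaks at Φ = π, which is the behaviour the text describes for positive and negative quartets.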

In the cs. case ([link] is replaced (Hauptman & Green, 1976[link]) by [\eqalignno{P^{\pm} \simeq &{1\over L} \exp (\mp 2 C) \cosh (R_{5} Z_{5}^{\pm})\cr &\times \cosh (R_{6} Z_{6}^{\pm}) \cosh (R_{7} Z_{7}^{\pm}), &(}] where [P^{\pm}] is the probability that the sign of [E_{1} E_{2} E_{3} E_{4}] is positive or negative, and [\eqalign{Z_{5}^{\pm} &= {1\over {N^{1/2}}} (R_{1} R_{2} \pm R_{3} R_{4}),\cr Z_{6}^{\pm} &= {1\over {N^{1/2}}} (R_{1} R_{3} \pm R_{2} R_{4}),\cr Z_{7}^{\pm} &= {1\over {N^{1/2}}} (R_{1} R_{4} \pm R_{2} R_{3}).}] The normalized probability may be derived by [P^{+} / (P^{+} + P^{-})]. Simpler probabilistic formulae were derived independently by Giacovazzo (1975[link], 1976[link]): [P_{7} (\Phi) = [2\pi I_{0} (G)]^{-1} \exp (G \cos \Phi), \eqno(] where [G = {2C (1 + \varepsilon_{5} + \varepsilon_{6} + \varepsilon_{7}) \over 1 + Q / (2N)}\eqno (] [{Q = (\varepsilon_{1} \varepsilon_{2} + \varepsilon_{3} \varepsilon_{4}) \varepsilon_{5} + (\varepsilon_{1} \varepsilon_{3} + \varepsilon_{2} \varepsilon_{4}) \varepsilon_{6} + (\varepsilon_{1} \varepsilon_{4} + \varepsilon_{2} \varepsilon_{3})} \varepsilon_{7} \eqno (] and [\varepsilon_{i} = (|E_{i}|^{2} - 1)]. Q is never allowed to be negative.

According to ([link] [\cos\Phi] is expected to be positive or negative according to whether [(\varepsilon_{5} + \varepsilon_{6} + \varepsilon_{7} + 1)] is positive or negative: the larger is C, the more reliable is the phase indication. For [N \geq 150], ([link] and ([link] are practically equivalent in all cases. If N is small, ([link] is in good agreement with ([link] for quartets strongly defined as positive or negative, but in poor agreement for enantiomorph-sensitive quartets (see Fig.[link]).

In cs. cases the sign probability for [E_{1} E_{2} E_{3} E_{4}] is [P^{+} = {\textstyle{1 \over 2}} + {\textstyle{1 \over 2}} \tanh (G / 2), \eqno(] where G is defined by ([link].
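The Giacovazzo quartet estimate can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the rule "Q is never allowed to be negative" is interpreted here as resetting a negative Q to zero (an assumption), and an unmeasured cross magnitude is passed as `None` so that its ε is taken as zero, as the chapter prescribes for unmeasured reflections.

```python
import math


def quartet_G(R, cross, N):
    """Concentration parameter G of the Giacovazzo quartet formula.

    R     -- the four basis magnitudes (R1..R4)
    cross -- the three cross magnitudes (R5..R7); None marks an unmeasured one
    N     -- number of (equal) atoms in the unit cell
    """
    e1, e2, e3, e4 = [r * r - 1.0 for r in R]
    # An unmeasured cross magnitude contributes epsilon = 0.
    e5, e6, e7 = [0.0 if r is None else r * r - 1.0 for r in cross]
    C = R[0] * R[1] * R[2] * R[3] / N
    Q = (e1 * e2 + e3 * e4) * e5 + (e1 * e3 + e2 * e4) * e6 + (e1 * e4 + e2 * e3) * e7
    Q = max(Q, 0.0)  # assumption: a negative Q is reset to zero
    return 2.0 * C * (1.0 + e5 + e6 + e7) / (1.0 + Q / (2.0 * N))


def sign_probability(G):
    """P+ for the cs. case: 0.5 + 0.5 tanh(G / 2)."""
    return 0.5 + 0.5 * math.tanh(G / 2.0)


# Illustrative magnitudes (not taken from any real data set).
g_negative = quartet_G([2.0] * 4, [0.0, 0.0, 0.0], N=20)  # weak cross terms
g_positive = quartet_G([2.0] * 4, [2.0, 2.0, 2.0], N=20)  # strong cross terms
g_no_cross = quartet_G([2.0] * 4, [None, None, None], N=20)
```

With all three cross magnitudes unmeasured, G reduces to 2C, which agrees with the four-magnitude marginal distribution proportional to exp(2C cos Φ) quoted in the text.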

The three cross magnitudes are not always all present in the set of measured reflections. For such cases the following formulae arise from marginal distributions (Giacovazzo, 1977c[link]; Heinermann, 1977b[link]):

  • (a) in the ncs. case, if [R_{7}], or [R_{6}] and [R_{7}], or [R_{5}] and [R_{6}] and [R_{7}], are not in the measurements, then ([link] is replaced by [P(\Phi | R_{1}, \ldots, R_{6}) \simeq {1 \over L'} \exp (-2 C \cos \Phi) I_{0} (R_{5} Z_{5}) I_{0} (R_{6} Z_{6}),] or [P (\Phi | R_{1}, \ldots, R_{5}) \simeq {1 \over L''} I_{0} (R_{5} Z_{5}),] or [P (\Phi | R_{1}, \ldots, R_{4}) \simeq {1 \over L'''} \exp (2C \cos \Phi),] respectively.

  • (b) in the same situations, we have for cs. cases [P^{\pm} \simeq {1 \over L'} \exp (\mp C) \cosh (R_{5} Z_{5}^{\pm}) \cosh (R_{6} Z_{6}^{\pm}),] or [P^{\pm} \simeq {1 \over L''} \cosh (R_{5} Z_{5}^{\pm})] or [P^{\pm} = {1 \over L'''} \exp (\pm C) \simeq 0.5 + 0.5 \tanh (\pm C),] respectively.

Equations ([link] and ([link] are easily modifiable when some cross magnitudes are not in the measurements. If [R_{i}] is not measured then ([link] or ([link] are still valid provided that in G it is assumed that [\varepsilon_{i} = 0]. For example, if [R_{7}] and [R_{6}] are not in the data then ([link] and ([link] become [G = {2C (1 + \varepsilon_{5}) \over 1 + Q / (2N)},\qquad Q = (\varepsilon_{1} \varepsilon_{2} + \varepsilon_{3} \varepsilon_{4}) \varepsilon_{5}.] In space groups with symmetry higher than [P\bar{1}] more symmetry-equivalent quartets can exist of the type [\psi = \varphi_{{\bf h} {\bi R}_{\alpha}} + \varphi_{{\bf k}{\bi R}_{\beta}} + \varphi_{{\bf l} {\bi R}_{\gamma}} + \varphi_{(\overline{{\bf h} + {\bf k} + {\bf l}}) {\bi R}_{\delta}},] where [{\bi R}_{\alpha}, {\bi R}_{\beta}, {\bi R}_{\gamma}, {\bi R}_{\delta}] are rotation matrices of the space group. The set [\{\psi\}] is called the first representation of Φ. In this case Φ primarily depends on more than seven magnitudes. For example, let us consider in Pmmm the quartet [\Phi = \varphi_{123} + \varphi_{\bar{1}5\bar{3}} + \varphi_{\bar{1}\bar{5}8} + \varphi_{1\bar{2}\bar{8}}.] Quartets symmetry equivalent to Φ and respective cross terms are given in Table[link].

Table
List of quartets symmetry equivalent to [\Phi = \Phi_{1}] in the class mmm

Quartets   Basis vectors   Cross vectors
[\Phi_{1}] (1, 2, 3) [(\bar{1}, 5, \bar{3})] [(\bar{1}, \bar{5}, {8})] [(1, \bar{2}, \bar{8})] (0, 7, 0) [(0, \bar{3}, 11)] [(\bar{2}, 0, 5)]
[\Phi_{2}] [(\bar{1}, 2, 3)] [(1, 5, \bar{3})] [(\bar{1}, \bar{5}, 8)] [(1, \bar{2}, \bar{8})] (0, 7, 0) [(\bar{2}, \bar{3}, 11)] (0, 0, 5)
[\Phi_{3}] [(1, 2, \bar{3})] [(\bar{1}, 5, 3)] [(\bar{1}, \bar{5}, 8)] [(1, \bar{2}, \bar{8})] (0, 7, 0) [(0, \bar{3}, 5)] [(\bar{2}, 0, 11)]
[\Phi_{4}] [(\bar{1}, 2, \bar{3})] (1, 5, 3) [(\bar{1}, \bar{5}, 8)] [(1, \bar{2}, \bar{8})] (0, 7, 0) [(\bar{2}, \bar{3}, 5)] (0, 0, 11)
[\Phi_{5}] [(\bar{1}, 2, 3)] [(\bar{1}, 5, \bar{3})] [(1, \bar{5}, 8)] [(1, \bar{2}, \bar{8})] [(\bar{2}, 7, 0)] [(0, \bar{3}, 11)] (0, 0, 5)
[\Phi_{6}] [(1, 2, 3)] [(\bar{1}, \bar{5}, \bar{3})] [(\bar{1}, 5, 8)] [(1, \bar{2}, \bar{8})] [(0, \bar{3}, 0)] (0, 7, 11) [(\bar{2}, 0, 5)]
[\Phi_{7}] [(\bar{1}, 2, 3)] [(1, \bar{5}, \bar{3})] [(\bar{1}, 5, 8)] [(1, \bar{2}, \bar{8})] [(0, \bar{3}, 0)] [(\bar{2}, 7, 11)] (0, 0, 5)
[\Phi_{8}] [(\bar{1}, 2, \bar{3})] [(\bar{1}, 5, 3)] [(1, \bar{5}, 8)] [(1, \bar{2}, \bar{8})] [(\bar{2}, 7, 0)] [(0, \bar{3}, 5)] (0, 0, 11)
[\Phi_{9}] [(1, 2, \bar{3})] [(\bar{1}, \bar{5}, 3)] [(\bar{1}, 5, 8)] [(1, \bar{2}, \bar{8})] [(0, \bar{3}, 0)] (0, 7, 5) [(\bar{2}, 0, 11)]
[\Phi_{10}] [(\bar{1}, 2, \bar{3})] [(1, \bar{5}, 3)] [(\bar{1}, 5, 8)] [(1, \bar{2}, \bar{8})] [(0, \bar{3}, 0)] [(\bar{2}, 7, 5)] (0, 0, 11)
[\Phi_{11}] [(\bar{1}, 2, 3)] [(\bar{1}, \bar{5}, \bar{3})] (1, 5, 8) [(1, \bar{2}, \bar{8})] [(\bar{2}, \bar{3}, 0)] (0, 7, 11) (0, 0, 5)
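The cross vectors in the table follow mechanically from the first three basis vectors of each quartet: they are the pairwise sums h + k, h + l and k + l, while the fourth basis vector is −(h + k + l). A minimal Python sketch (the helper names are hypothetical) reproduces the first two rows:

```python
def add(u, v):
    """Component-wise sum of two reciprocal-lattice vectors."""
    return tuple(a + b for a, b in zip(u, v))


def quartet_vectors(h, k, l):
    """Return the fourth basis vector and the three cross vectors of a quartet."""
    fourth = tuple(-x for x in add(add(h, k), l))  # -(h + k + l)
    cross = (add(h, k), add(h, l), add(k, l))
    return fourth, cross


# Rows Phi_1 and Phi_2 of the table.
phi_1 = quartet_vectors((1, 2, 3), (-1, 5, -3), (-1, -5, 8))
phi_2 = quartet_vectors((-1, 2, 3), (1, 5, -3), (-1, -5, 8))
```

Collecting the distinct cross vectors over all symmetry-equivalent quartets gives the enlarged set of magnitudes on which Φ depends in its first representation.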

Experimental tests on the application of the representation concept to quartets have recently been made (Busetta et al., 1980[link]). It was shown that quartets with more than three cross magnitudes are more accurately estimated than other quartets. Quartets with a systematically absent cross reflection were also shown to be of significant importance in direct methods. In this context it is noted that systematically absent reflections are not usually included in the set of diffraction data. This custom, unobjectionable when only triplet relations are used, can lead to a loss of information when quartets are used. In fact the usual direct-methods programs discard quartets as soon as one of the cross reflections is not measured, so that systematic absences are treated in the same way as reflections that lie outside the sphere of measurements.

Quintet phase relationships

A quintet phase [\Phi = \varphi_{\bf h} + \varphi_{\bf k} + \varphi_{\bf l} + \varphi_{\bf m} + \varphi_{\overline{{\bf h} + {\bf k} + {\bf l} + {\bf m}}}] may be considered as the sum of three suitable triplets or the sum of a triplet and a quartet, i.e. [\eqalign{\Phi &= (\varphi_{\bf h} + \varphi_{\bf k} - \varphi_{{\bf h} + {\bf k}}) + (\varphi_{\bf l} + \varphi_{\bf m} - \varphi_{{\bf l} + {\bf m}})\cr &\quad + (\varphi_{{\bf h} + {\bf k}} + \varphi_{{\bf l} + {\bf m}} + \varphi_{\overline{{\bf h} + {\bf k} + {\bf l} + {\bf m}}})}] or [\Phi = (\varphi_{\bf h} + \varphi_{\bf k} - \varphi_{{\bf h} + {\bf k}}) + (\varphi_{\bf l} + \varphi_{\bf m} + \varphi_{\overline{{\bf h} + {\bf k} + {\bf l} + {\bf m}}} + \varphi_{{\bf h} + {\bf k}}).] It depends primarily on 15 magnitudes: the five basis magnitudes [R_{\bf h},\quad R_{\bf k},\quad R_{\bf l},\quad R_{\bf m},\quad R_{{\bf h} + {\bf k} + {\bf l} + {\bf m}},] and the ten cross magnitudes [\displaylines{R_{{\bf h} + {\bf k}},\quad R_{{\bf h} + {\bf l}},\quad R_{{\bf h} + {\bf m}},\quad R_{{\bf k} + {\bf l} + {\bf m}},\quad R_{{\bf k} + {\bf l}},\cr \noalign{\vskip5pt} R_{{\bf k} + {\bf m}},\quad R_{{\bf h} + {\bf l} + {\bf m}},\quad R_{{\bf l} + {\bf m}},\quad R_{{\bf h} + {\bf k} + {\bf m}},\quad R_{{\bf h} + {\bf k} + {\bf l}}.}] In the following we will denote [R_{1} = R_{\bf h},\quad R_{2} = R_{\bf k}, \ldots,\quad R_{15} = R_{{\bf h} + {\bf k} + {\bf l}}.] Conditional distributions of Φ in P1 and [P\bar{1}] given the 15 magnitudes have been derived by several authors. In favourable circumstances they allow the quintets having Φ near 0, near π or near [\pm \pi / 2] to be identified in ncs. space groups. Among others, we recall:

  • (a) the semi-empirical expression for [P_{15}(\Phi)] suggested by Van der Putten & Schenk (1977[link]): [P (\Phi | \ldots) \simeq {1 \over L} \exp \left[\left(6 - \sum\limits_{j = 6}^{15} R_{j}^{2}\right) 2C \cos \Phi \right] \prod\limits_{j = 6}^{15} I_{0} (2 R_{j} Y_{j}),] where [C = N^{-3/2} R_{1} R_{2} R_{3} R_{4} R_{5}] and [Y_{j}] is an expression related to the jth of the ten quartets connected with the quintet Φ;

  • (b) the formula by Fortier & Hauptman (1977[link]), valid in [P\bar{1}], which is able to predict the sign of a quintet by means of an expression which involves a summation over 1024 sets of signs;

  • (c) the expression by Giacovazzo (1977d[link]), according to which [P_{15} (\Phi) \simeq [2 \pi I_{0} (G)]^{-1} \exp (G \cos \Phi), \eqno(] where [G = {2C\over{1+6(N)^{1/2}}} \left[{1 + A + B \over 1 + D / (2N)}\right] \eqno(] and where [\eqalign{A &= \textstyle\sum\limits_{i = 6}^{15} \varepsilon_{i},\cr B &= \varepsilon_{6} \varepsilon_{13} + \varepsilon_{6} \varepsilon_{15} + \varepsilon_{6} \varepsilon_{14} + \varepsilon_{7} \varepsilon_{11} + \varepsilon_{7} \varepsilon_{15} + \varepsilon_{7} \varepsilon_{12}\cr &\quad + \varepsilon_{8} \varepsilon_{10} + \varepsilon_{8} \varepsilon_{14} + \varepsilon_{8} \varepsilon_{12} + \varepsilon_{10} \varepsilon_{15} + \varepsilon_{10} \varepsilon_{9} + \varepsilon_{11} \varepsilon_{14}\cr &\quad + \varepsilon_{11} \varepsilon_{9} + \varepsilon_{13} \varepsilon_{9} + \varepsilon_{13} \varepsilon_{12},\cr D &= \varepsilon_{1} \varepsilon_{2} \varepsilon_{6} + \varepsilon_{1} \varepsilon_{3} \varepsilon_{7} + \varepsilon_{1} \varepsilon_{4} \varepsilon_{8} + \varepsilon_{1} \varepsilon_{5} \varepsilon_{9} + \varepsilon_{1} \varepsilon_{10} \varepsilon_{15}\cr &\quad + \varepsilon_{1} \varepsilon_{11} \varepsilon_{14} + \varepsilon_{1} \varepsilon_{13} \varepsilon_{12} + \varepsilon_{2} \varepsilon_{3} \varepsilon_{10} + \varepsilon_{2} \varepsilon_{4} \varepsilon_{11}\cr &\quad + \varepsilon_{2} \varepsilon_{5} \varepsilon_{12} + \varepsilon_{2} \varepsilon_{7} \varepsilon_{15} + \varepsilon_{2} \varepsilon_{8} \varepsilon_{14} + \varepsilon_{2} \varepsilon_{13} \varepsilon_{9} + \varepsilon_{3} \varepsilon_{4} \varepsilon_{13}\cr &\quad + \varepsilon_{3} \varepsilon_{5} \varepsilon_{14} + \varepsilon_{3} \varepsilon_{6} \varepsilon_{15} + \varepsilon_{3} \varepsilon_{8} \varepsilon_{12} + \varepsilon_{3} \varepsilon_{11} \varepsilon_{9} + \varepsilon_{4} \varepsilon_{5} \varepsilon_{15}\cr &\quad + \varepsilon_{4} \varepsilon_{6} \varepsilon_{14} + \varepsilon_{4} \varepsilon_{7} \varepsilon_{12} + \varepsilon_{4} 
\varepsilon_{10} \varepsilon_{9} + \varepsilon_{5} \varepsilon_{6} \varepsilon_{13} + \varepsilon_{5} \varepsilon_{7} \varepsilon_{11}\cr &\quad + \varepsilon_{5} \varepsilon_{8} \varepsilon_{10}.}]

For cs. cases ([link] reduces to [P^{+} \simeq 0.5 + 0.5 \tanh (G / 2). \eqno(] Positive or negative quintets may be identified according to whether G is larger or smaller than zero.

If [R_{i}] is not measured then ([link] and ([link] are still valid provided that in ([link] [\varepsilon_{i} = 0].

If the symmetry is higher than in [P\bar{1}] then more symmetry-equivalent quintets can exist of the type [\psi = \varphi_{{\bf h}{\bi R}_{\alpha}} + \varphi_{{\bf k}{\bi R}_{\beta}} + \varphi_{{\bf l}{\bi R}_{\gamma}} + \varphi_{{\bf m}{\bi R}_{\delta}} + \varphi_{(\overline{{\bf h} + {\bf k} + {\bf l} + {\bf m}}){\bi R}_{\varepsilon}},] where [{\bi R}_{\alpha}, \ldots, {\bi R}_{\varepsilon}] are rotation matrices of the space group. The set [\{\psi\}] is called the first representation of Φ. In this case Φ primarily depends on more than 15 magnitudes, all of which have to be taken into account for a careful estimation of Φ (Giacovazzo, 1980a[link]).

Two factors prevent a wide use of quintet invariants in direct-methods procedures: (a) the large correlation of positive quintet cosines with positive triplets; (b) the large computing time necessary for their estimation [quintets are phase relationships of order [1/(N\sqrt{N})], so a large number of quintets have to be estimated in order to pick up a sufficient percentage of reliable ones].

Determinantal formulae

In a crystal structure with N identical atoms the joint probability distribution of n normalized s.f.'s [E_{{\bf h}_{1} + {\bf k}}, E_{{\bf h}_{2} + {\bf k}}, \ldots, E_{{\bf h}_{n} + {\bf k}}] under the following conditions:

  • (a) the structure is kept fixed whereas k is the primitive random variable;

  • (b) [E_{{\bf h}_{i} - {\bf h}_{j}},\ i, j = 1, \ldots, n], have values which are known a priori; is given (Tsoucaris, 1970[link]) [see also Castellano et al. (1973[link]) and Heinermann et al. (1979[link])] by [P(E_{1}, E_{2}, \ldots, E_{n}) = (2\pi)^{-n/2} D_{n}^{-1/2} \exp (-{\textstyle{1 \over 2}} Q_{n}) \eqno(] for cs. structures and [P(E_{1}, E_{2}, \ldots, E_{n}) = (2\pi)^{-n} D_{n}^{-1/2} \exp (-Q_{n}) \eqno(] for ncs. structures. In ([link] and ([link] we have denoted [\displaylines{D_{n} = \lambda,\qquad Q_{n} = \textstyle\sum\limits_{p, \,  q = 1}^{n} \Lambda_{pq} E_{p} E_{q}^{*}\cr E_{j} = E_{{\bf h}_{j} + {\bf k}},\qquad U_{pq} = U_{{\bf h}_{p} - {\bf h}_{q}},\qquad j, p, q = 1, \ldots, n.}] [\Lambda_{pq}] is an element of [\boldlambda ^{-1}], and [\boldlambda ] is the covariance matrix with elements [\displaylines{\langle E_{{\bf h}_{p} + {\bf k}} E_{{\bf h}_{q} + {\bf k}}\rangle = U_{{\bf h}_{p} - {\bf h}_{q}}\cr \boldlambda = \left|\matrix{1 &U_{12} &\ldots &U_{1q} &\ldots &U_{1n}\cr U_{21} &1 &\ldots &U_{2q} &\ldots &U_{2n}\cr \vdots &\vdots &\ddots &\vdots &\ddots &\vdots\cr U_{p1} &U_{p2} &\ldots &U_{pq} &\ldots &U_{pn}\cr \vdots &\vdots &\ddots &\vdots &\ddots &\vdots\cr U_{n1} &U_{n2} &\ldots &U_{nq} &\ldots &1\cr}\right|.}] [\lambda] is a K–H determinant: therefore [D_{n} \geq 0]. Let us call [\Delta_{n + 1} = {1 \over N} \left|\matrix{1 &U_{12} &\ldots &U_{1n} &E_{{\bf h}_{1} + {\bf k}}\cr U_{21} &1 &\ldots &U_{2n} &E_{{\bf h}_{2} + {\bf k}}\cr \vdots &\vdots &\ddots &\vdots &\vdots\cr U_{n1} &U_{n2} &\ldots &1 &E_{{\bf h}_{n} + {\bf k}}\cr E_{-{\bf h}_{1} - {\bf k}} &E_{-{\bf h}_{2} - {\bf k}} &\ldots &E_{-{\bf h}_{n} - {\bf k}} &N\cr}\right|\hbox{;}] the K–H determinant obtained by adding to [\boldlambda ] the last column and line formed by [E_{1}, E_{2}, \ldots, E_{n}], and [E_{1}^{*}, E_{2}^{*}, \ldots, E_{n}^{*}], respectively. 
Then ([link] and ([link] may be written [\eqalignno{&P(E_{1}, E_{2}, \ldots, E_{n})\cr &\quad = (2\pi)^{-n/2} D_{n}^{-1/2} \exp \left[N {\Delta_{n + 1} - D_{n}\over 2D_{n}}\right] &(}] and [\eqalignno{&P(E_{1}, E_{2}, \ldots, E_{n})\cr &\quad= (2\pi)^{-n} D_{n}^{-1/2} \exp \left[N {\Delta_{n + 1} - D_{n}\over D_{n}}\right], &(}] respectively. Because [D_{n}] is a constant, the maximum values of the conditional joint probabilities ([link] and ([link] are obtained when [\Delta_{n + 1}] is a maximum. Thus the maximum determinant rule may be stated (Tsoucaris, 1970[link]; Lajzérowicz & Lajzérowicz, 1966[link]): among all sets of phases which are compatible with the inequality [\Delta_{n + 1} (E_{1}, E_{2}, \ldots, E_{n}) \geq 0] the most probable one is that which leads to a maximum value of [\Delta_{n + 1}].
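The passage from the exponent in [Q_{n}] to the determinantal form rests on a standard block-determinant (Schur complement) identity. The following sketch assumes that [\boldlambda] is invertible and reads [Q_{n}] as the Hermitian quadratic form built from the column vector E of [E_{1}, \ldots, E_{n}]; the index ordering in that form is an assumption made here for illustration.

```latex
\begin{aligned}
\Delta_{n+1}
  &= \frac{1}{N}\det\begin{pmatrix}
       \boldsymbol{\lambda} & \mathbf{E} \\
       \mathbf{E}^{\dagger} & N
     \end{pmatrix}
   = \frac{1}{N}\,\det(\boldsymbol{\lambda})
     \left(N - \mathbf{E}^{\dagger}\boldsymbol{\lambda}^{-1}\mathbf{E}\right) \\
  &= D_{n}\left(1 - \frac{Q_{n}}{N}\right)
  \quad\Longrightarrow\quad
  -Q_{n} = N\,\frac{\Delta_{n+1} - D_{n}}{D_{n}} .
\end{aligned}
```

Substituting this expression for [-Q_{n}] into exp(−Q_n) gives the ncs. form, and into exp(−Q_n/2) the cs. form. Since [D_{n}] is fixed by the known U values, maximizing the probability is equivalent to maximizing [\Delta_{n + 1}].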

    If only one phase, i.e. [\varphi_{q}], is unknown whereas all other phases and moduli are known then (de Rango et al., 1974[link]; Podjarny et al., 1976[link]) for cs. crystals [P^{\pm} (E_{q}) \simeq 0.5 + 0.5 \ \tanh \ \left\{\pm |E_{q}| \textstyle\sum\limits_{p = 1 \atop p \neq q}^{n} \Lambda_{pq} E_{p}\right\}, \eqno(] and for ncs. crystals [P(\varphi_{q}) = [2\pi I_{0} (G_{q})]^{-1} \exp \{G_{q} \cos (\varphi_{q} - \theta_{q})\}, \eqno(] where [G_{q} \exp (i\theta_{q}) = 2 |E_{q}| \textstyle\sum\limits_{p \neq q = 1}^{n} \Lambda_{pq} E_{p}.] Equations ([link] and ([link] generalize ([link] and ([link], respectively, and reduce to them for [n = 3]. Fourth-order determinantal formulae estimating triplet invariants in cs. and ncs. crystals, and making use of the entire data set, have recently been secured (Karle, 1979[link], 1980[link]).

Advantages, limitations and applications of determinantal formulae can be found in the literature (Heinermann et al., 1979[link]; de Rango et al., 1975[link], 1985[link]). Taylor et al. (1978[link]) combined K–H determinants with a magic-integer approach. The computing time, however, was larger than that required by standard computing techniques. The use of K–H matrices has been made faster and more effective by de Gelder et al. (1990[link]) (see also de Gelder, 1992[link]). They developed a phasing procedure (CRUNCH) which uses random phases as starting points for the maximization of the K–H determinants.

Algebraic relationships for structure seminvariants

According to the representations method (Giacovazzo, 1977a[link], 1980a[link],b[link]):

  • (i) any s.s. Φ may be estimated via one or more s.i.'s [\{\psi\}], whose values differ from Φ by a constant arising because of symmetry;

  • (ii) two types of s.s.'s exist, first-rank and second-rank s.s.'s, with different algebraic properties;

  • (iii) conditions characterizing s.s.'s of first rank for any space group may be expressed in terms of seminvariant moduli and seminvariantly associated vectors. For example, for all the space groups with point group 422 [Hauptman–Karle group [(h + k, l)] P(2, 2)] the one-phase s.s.'s of first rank are characterized by [\eqalign{(h, k, l) &\equiv 0 \hbox{ mod } (2,2,0) \hbox{ or } (2,0,2) \hbox{ or } (0,2,2)\cr (h \pm k, l) &\equiv 0 \hbox{ mod } (0, 2) \hbox{ or } (2, 0).}]

The more general expressions for the s.s.'s of first rank are

  • (a) [\Phi = \varphi_{\bf u} = \varphi_{{\bf h} ({\bi I} - {\bi R}_{\alpha})}] for one-phase s.s.'s;

  • (b) [\Phi = \varphi_{{\bf u}_{1}} + \varphi_{{\bf u}_{2}} = \varphi_{{\bf h}_{1} - {\bf h}_{2} {\bi R}_{\beta}} + \varphi_{{\bf h}_{2} - {\bf h}_{1} {\bi R}_{\alpha}}] for two-phase s.s.'s;

  • (c) [\Phi = \varphi_{{\bf u}_{1}} + \varphi_{{\bf u}_{2}} + \varphi_{{\bf u}_{3}} = \varphi_{{\bf h}_{1} - {\bf h}_{2} {\bi R}_{\beta}} + \varphi_{{\bf h}_{2} - {\bf h}_{3}{\bi R}_{\gamma}} + \varphi_{{\bf h}_{3} - {\bf h}_{1} {\bi R}_{\alpha}}] for three-phase s.s.'s;

  • [\eqalign{\quad(d)\; \Phi &= \varphi_{{\bf u}_{1}} + \varphi_{{\bf u}_{2}} + \varphi_{{\bf u}_{3}} + \varphi_{{\bf u}_{4}} \cr &= \varphi_{{\bf h}_{1} - {\bf h}_{2} {\bi R}_{\beta}} + \varphi_{{\bf h}_{2} - {\bf h}_{3} {\bi R}_{\gamma}} + \varphi_{{\bf h}_{3} - {\bf h}_{4} {\bi R}_{\delta}} + \varphi_{{\bf h}_{4} - {\bf h}_{1} {\bi R}_{\alpha}}}\hfill] for four-phase s.s.'s; etc.

In other words:

  • (a) [\varphi_{\bf u}] is an s.s. of first rank if at least one h and at least one rotation matrix [{\bi R}_{\alpha}] exist such that [{\bf u} = {\bf h}({\bi I} - {\bi R}_{\alpha})]. [\varphi_{\bf u}] may be estimated via the special triplet invariants [\{\psi\} = \varphi_{\bf u} - \varphi_{\bf h} + \varphi_{{\bf h} {\bi R}_{\alpha}}. \eqno(] The set [\{\psi\}] is called the first representation of [\varphi_{\bf u}].

  • (b) [\Phi = \varphi_{{\bf u}_{1}} + \varphi_{{\bf u}_{2}}] is an s.s. of first rank if at least two vectors [{\bf h}_{1}] and [{\bf h}_{2}] and two rotation matrices [{\bi R}_{\alpha}] and [{\bi R}_{\beta}] exist such that [\cases{{\bf u}_{1} = {\bf h}_{1} - {\bf h}_{2} {\bi R}_{\beta}\cr {\bf u}_{2} = {\bf h}_{2} - {\bf h}_{1} {\bi R}_{\alpha}.\cr} \eqno(] Φ may then be estimated via the special quartet invariants [\{\psi\} = \varphi_{{\bf u}_{1} {\bi R}_{\alpha}} + \varphi_{{\bf u}_{2}} - \varphi_{{\bf h}_{2}} + \varphi_{{\bf h}_{2} {\bi R}_{\beta} {\bi R}_{\alpha}} \eqno(] and [\{\psi\} = \{\varphi_{{\bf u}_{1}} + \varphi_{{\bf u}_{2} {\bi R}_{\beta}} - \varphi_{{\bf h}_{1}} + \varphi_{{\bf h}_{1} {\bi R}_{\alpha} {\bi R}_{\beta}}\}. \eqno(] For example, [\Phi = \varphi_{123} + \varphi_{\bar{7}\bar{2}\bar{5}}] in [P2_{1}] may be estimated via [\{\psi\} = \varphi_{123} + \varphi_{\bar{7}\bar{2}\bar{5}} - \varphi_{\bar{3}K\bar{1}} + \varphi_{3K1}] and [\{\psi\} = \varphi_{123} + \varphi_{7\bar{2}5} - \varphi_{4K4} + \varphi_{\bar{4}K\bar{4}},] where K is a free index.

The special quartets given by the two expressions for [\{\psi\}] above together constitute the first representation of Φ.
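The [P2_{1}] example can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, [{\bi R}_{\alpha}] is taken as the identity and [{\bi R}_{\beta}] as the twofold rotation about b (a choice inferred here because it reproduces the quoted indices), and the free index is fixed arbitrarily. It builds both special quartets and verifies that their signed indices sum to zero, as required for a structure invariant.

```python
def vec_mat(h, R):
    """Row vector h multiplied by a 3x3 matrix R."""
    return tuple(sum(h[i] * R[i][j] for i in range(3)) for j in range(3))


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
TWOFOLD_B = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))  # twofold axis along b


def special_quartets(h1, h2, Ra, Rb):
    """Return u1, u2 and the two special quartets as lists of (sign, index)."""
    u1 = sub(h1, vec_mat(h2, Rb))
    u2 = sub(h2, vec_mat(h1, Ra))
    q1 = [(+1, vec_mat(u1, Ra)), (+1, u2), (-1, h2),
          (+1, vec_mat(vec_mat(h2, Rb), Ra))]
    q2 = [(+1, u1), (+1, vec_mat(u2, Rb)), (-1, h1),
          (+1, vec_mat(vec_mat(h1, Ra), Rb))]
    return u1, u2, q1, q2


def index_sum(quartet):
    """Signed sum of the indices; (0, 0, 0) for a structure invariant."""
    return tuple(sum(s * idx[j] for s, idx in quartet) for j in range(3))


# h2 = (-3, K, -1) with the free index K = 1 (arbitrary); h1 = (4, K + 2, 4).
u1, u2, q1, q2 = special_quartets((4, 3, 4), (-3, 1, -1), IDENTITY, TWOFOLD_B)
```

Changing the free index alters only the k components of the last two phases of each quartet, so the same Φ is reached through a whole family of quartets.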

Structure seminvariants of the second rank can be characterized as follows: suppose that, for a given seminvariant Φ, it is not possible to find a vectorial index h and a rotation matrix [{\bi R}_{\alpha}] such that [\Phi - \varphi_{\bf h} + \varphi_{{\bf h} {\bi R}_{\alpha}}] is a structure invariant. Then Φ is a structure seminvariant of the second rank and a set of structure invariants ψ can certainly be formed, of type [\{\psi\} = \Phi + \varphi_{{\bf h} {\bi R}_{p}} - \varphi_{{\bf h} {\bi R}_{q}} + \varphi_{{\bf l} {\bi R}_{i}} - \varphi_{{\bf l} {\bi R}_{j}},] by means of suitable indices h and l and rotation matrices [{\bi R}_{p}, {\bi R}_{q}, {\bi R}_{i}] and [{\bi R}_{j}]. As an example, for symmetry class 222, [\varphi_{240}] or [\varphi_{024}] or [\varphi_{204}] are s.s.'s of the first rank while [\varphi_{246}] is an s.s. of the second rank.
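The class 222 example can be verified with a short test of the first-rank condition u = h(I − R). The sketch below is restricted to diagonal rotation matrices, which is sufficient for class 222 but is a simplifying assumption; the function name is hypothetical, and the function tests only the first-rank condition, not whether u is a seminvariant at all.

```python
# Diagonals of the three twofold rotation matrices of point group 222.
ROTATIONS_222 = [(1, -1, -1), (-1, 1, -1), (-1, -1, 1)]


def is_first_rank_one_phase(u, rotations=ROTATIONS_222):
    """True if u = h(I - R) has an integer solution h for some diagonal R."""
    for diagonal in rotations:
        solvable = True
        for component, r in zip(u, diagonal):
            factor = 1 - r  # 0 or 2 for diagonal entries of +1 or -1
            if factor == 0:
                if component != 0:
                    solvable = False
            elif component % factor != 0:
                solvable = False
        if solvable:
            return True
    return False


first_rank = [is_first_rank_one_phase(u) for u in [(2, 4, 0), (0, 2, 4), (2, 0, 4)]]
second_rank_candidate = is_first_rank_one_phase((2, 4, 6))
```

Every I − R in this class has one zero diagonal entry, so an index with three non-zero components, such as (2, 4, 6), can never be written as h(I − R).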

The procedure may easily be generalized to s.s.'s of any order of the first and of the second rank. So far only the role of one-phase and two-phase s.s.'s of the first rank in direct procedures is well documented (see references quoted in Sections [link] and [link]).

Formulae estimating one-phase structure seminvariants of the first rank

Let [E_{\bf H}] be our one-phase s.s. of the first rank, where [{\bf H} = {\bf h} ({\bi I} - {\bi R}_{n}). \eqno(] In general, more than one rotation matrix [{\bi R}_{n}] and more than one vector h are compatible with ([link]. The set of special triplets [\{\psi\} = \{\varphi_{\bf H} - \varphi_{\bf h} + \varphi_{{\bf h} {\bi R}_{n}}\}] is the first representation of [E_{\bf H}]. In cs. space groups the probability that [E_{\bf H}\gt 0], given [|E_{\bf H}|] and the set [\{|E_{\bf h}|\}], may be estimated (Hauptman & Karle, 1953[link]; Naya et al., 1964[link]; Cochran & Woolfson, 1955[link]) by [P^{+} (E_{\bf H}) \simeq 0.5 + 0.5 \tanh \textstyle\sum\limits_{{\bf h}, \,   n} G_{{\bf h}, \,   n} (-1)^{2{\bf h}\cdot {\bf T}_{n}}, \eqno(] where [G_{{\bf h}, \,   n} = |E_{\bf H}|\varepsilon_{\bf h}/(2\sqrt{N}), \hbox{ and } \varepsilon = |E|^{2} - 1.] In ([link], the summation over n goes within the set of matrices [{\bi R}_{n}] for which (,b) is compatible, and h varies within the set of vectors which satisfy ([link] for each [{\bi R}_{n}]. Equation ([link] is actually a generalized way of writing the so-called [\sum_{1}] relationships (Hauptman & Karle, 1953[link]).
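A minimal Python sketch of the generalized [\sum_{1}] sign probability follows. The function name and the input layout are hypothetical: each contribution is a pair consisting of [|E_{\bf h}|] and the integer [2{\bf h}\cdot{\bf T}_{n}], and it is assumed that this quantity is indeed an integer for the cs. space group in question.

```python
import math


def sigma1_probability(E_H, contributions, N):
    """P+(E_H) = 0.5 + 0.5 tanh( sum of G_{h,n} (-1)^(2 h.T_n) ).

    E_H           -- magnitude |E_H| of the one-phase seminvariant
    contributions -- iterable of (|E_h|, two_h_dot_T) pairs, one per (h, n)
    N             -- number of (equal) atoms in the unit cell
    """
    total = 0.0
    for E_h, two_h_dot_T in contributions:
        G = E_H * (E_h * E_h - 1.0) / (2.0 * math.sqrt(N))
        sign = -1.0 if two_h_dot_T % 2 else 1.0  # (-1) ** (2 h.T_n)
        total += sign * G
    return 0.5 + 0.5 * math.tanh(total)


# Illustrative values only: a single strong contributor, N = 25.
p_even = sigma1_probability(2.0, [(2.0, 0)], N=25)  # no translational shift
p_odd = sigma1_probability(2.0, [(2.0, 1)], N=25)   # 2 h.T_n odd flips the sign
```

A weak reflection h (|E_h| < 1) gives a negative G and therefore pushes the estimate in the opposite direction to a strong one.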

If [\varphi_{\bf H}] is a phase restricted by symmetry to [\theta_{\bf H}] and [\theta_{\bf H} + \pi] in an ncs. space group then (Giacovazzo, 1978[link]) [{P(\varphi_{\bf H} = \theta_{\bf H}) \simeq 0.5 + 0.5 \tanh \left\{\sum_{{\bf h}, \,   n} G_{{\bf h}, \,   n} \cos (\theta_{\bf H} - 2\pi {\bf h} \cdot {\bf T}_{n})\right\}}. \eqno(] If [\varphi_{\bf H}] is a general phase then [\varphi_{\bf H}] is distributed according to [P(\varphi_{\bf H}) \simeq {1\over L} \exp \{\alpha \cos (\varphi_{\bf H} - \theta_{\bf H})\},] where [\tan \theta_{\bf H} = {\left({\textstyle\sum\limits_{{\bf h}, \,   n}} G_{{\bf h}, \,   n} \sin 2\pi {\bf h} \cdot {\bf T}_{n}\right)\over \left({\textstyle\sum\limits_{{\bf h}, \,   n}} G_{{\bf h}, \,   n} \cos 2\pi {\bf h} \cdot {\bf T}_{n}\right)} \eqno(] with a reliability measured by [\eqalign{\alpha &= \left\{\left(\textstyle\sum\limits_{{\bf h}, \,   n} G_{{\bf h}, \,   n} \sin 2\pi {\bf h} \cdot {\bf T}_{n}\right)^{2}\right.\cr &\quad \left. + \left(\textstyle\sum\limits_{{\bf h}, \,  n} G_{{\bf h}, \,   n} \cos 2\pi {\bf h} \cdot {\bf T}_{n}\right)^{2}\right\}^{1/2}.}] The second representation of [\varphi_{\bf H}] is the set of special quintets [\{\psi\} = \{\varphi_{\bf H} - \varphi_{\bf h} + \varphi_{{\bf h} {\bi R}_{n}} + \varphi_{{\bf k} {\bi R}_{j}} - \varphi_{{\bf k} {\bi R}_{j}}\} \eqno(] provided that h and [{\bi R}_{n}] vary over the vectors and matrices for which ([link] is compatible, k over the asymmetric region of the reciprocal space, and [{\bi R}_{j}] over the rotation matrices in the space group. Formulae estimating [\varphi_{\bf H}] via the second representation in all the space groups [all the base and cross magnitudes of the quintets ([link] now constitute the a priori information] have recently been secured (Giacovazzo, 1978[link]; Cascarano & Giacovazzo, 1983[link]; Cascarano, Giacovazzo, Calabrese et al., 1984[link]). 
Such formulae contain, besides the contribution of order [N^{-1/2}] provided by the first representation, a supplementary (not negligible) contribution of order [N^{-3/2}] arising from quintets.

Denoting [\eqalign{E_{1} &= E_{\bf H}, \; E_{2} = E_{\bf h}, \; E_{3} = E_{\bf k},\cr E_{4, \,  j} &= E_{{\bf h} + {\bf k} {\bi R}_{j}}, \; {E}_{5, \,  j} = { E}_{{\bf H} + {\bf k} {\bi R}_{j}},}] formulae ([link], ([link], ([link] still hold provided that [\sum_{{\bf h}, \,  n} G_{{\bf h}, \,  n}] is replaced by [\sum_{{\bf h}, \,  n} G_{{\bf h}, \,  n} + {\sum_{{{\bf h}, \,  {\bf k}}, \,   n}}' {|E_{\bf H}|\over 2N^{3/2}} {A_{{{{\bf h}, \,  {\bf k}}}, \,  n}\over 1 + B_{{{\bf h}, \,  {\bf k}}, \,  n}},] where [\eqalign{A_{{{\bf h}, \,  {\bf k}}, \,  n} &= \left[(2|E_{2}|^{2} - 1) \varepsilon_{3} \left(\sum_{{{\bi R}_{i} = {\bi R}_{j} \atop {\bi R}_{j} + {\bi R}_{i}{\bi R}_{n} = 0}} \varepsilon_{4, \,  i} \varepsilon_{5, \,  j} + \sum_{{{\bi R}_{j} = {\bi R}_{i}{\bi R}_{n} \atop {\bi R}_{i} = {\bi R}_{j}{\bi R}_{n}}} \varepsilon_{4, \,  i} \varepsilon_{4, \,  j}\right)\right.\cr &\quad \left. - {\varepsilon_{3}\over 2} \sum_{j = 1}^{m} \varepsilon_{4, \,  j} - {\textstyle{1\over 2}} \sum_{{\bi R}_{j} = {\bi R}_{i} \atop {\bi R}_{j} + {\bi R}_{i}{\bi R}_{n} = 0} \varepsilon_{4, \,  i} \varepsilon_{5, \,  j}\right] \Bigg/N,\cr B_{{{\bf h}, \,  {\bf k}}, \,   n} &= \left[\varepsilon_{1} \varepsilon_{3} \sum_{j = 1}^{m} \varepsilon_{5, \,  j} + \varepsilon_{1} \sum_{{\bi R}_{j} = {\bi R}_{i}{\bi R}_{n} \atop {\bi R}_{i} = {\bi R}_{j}{\bi R}_{n}} \varepsilon_{4, \,  i} \varepsilon_{4, \, j} + \varepsilon_{2} \varepsilon_{3} \sum_{j = 1}^{m} \varepsilon_{4, \,  j}\right.\cr &\quad \left. +\ \varepsilon_{2} \sum_{{\bi R}_{j} = {\bi R}_{i} \atop {\bi R}_{j} + {\bi R}_{i}{\bi R}_{n} = 0} \varepsilon_{4, \,  i} \varepsilon_{5, \,  j} + {\textstyle{1\over 4}} \varepsilon_{1} H_{4} (E_{2})\right] \Bigg/ (2N).}] m is the number of symmetry operators and [H_{4}(E) = E^{4} - 6E^{2} + 3] is the Hermite polynomial of order four.

[B_{{{\bf h}, \,  {\bf k}}, \,   n}] is set to zero if its computed value is negative. The prime on the summation warns the reader that precautions have to be taken in order to avoid duplication in the contributions.

Formulae estimating two-phase structure seminvariants of the first rank

Two-phase s.s.'s of the first rank were first evaluated in some cs. space groups by the method of coincidence by Grant et al. (1957[link]); the idea was extended to ncs. space groups by Debaerdemaeker & Woolfson (1972[link]), and in a more general way by Giacovazzo (1977e[link],f[link]).

The technique was based on the combination of the two triplets [\eqalign{\varphi_{{\bf h}_{1}} + \varphi_{{\bf h}_{2}} &\simeq \varphi_{{\bf h}_{1} + {\bf h}_{2}}\cr \varphi_{{\bf h}_{1}} + \varphi_{{\bf h}_{2} {\bi R}} &\simeq \varphi_{{\bf h}_{1} + {\bf h}_{2} {\bi R}},}] which, subtracted from one another, give [\varphi_{{\bf h}_{1} + {\bf h}_{2} {\bi R}} - \varphi_{{\bf h}_{1} + {\bf h}_{2}} \simeq \varphi_{{\bf h}_{2} {\bi R}} - \varphi_{{\bf h}_{2}} \simeq - 2 \pi {\bf h} \cdot {\bf T}.] If all four [|E|]'s are sufficiently large, an estimate of the two-phase seminvariant [\varphi_{{\bf h}_{1} + {\bf h}_{2} {\bi R}} - \varphi_{{\bf h}_{1} + {\bf h}_{2}}] is available.

Probability distributions valid in [P2_{1}] according to the neighbourhood principle have been given by Hauptman & Green (1978[link]). Finally, the theory of representations was combined by Giacovazzo (1979a[link]) with the joint probability distribution method in order to estimate two-phase s.s.'s in all the space groups.

According to representation theory, the problem is that of evaluating [\Phi = \varphi_{{\bf u}_{1}} + \varphi_{{\bf u}_{2}}] via the special quartets ([link]) and ([link]). Thus, contributions of order [N^{-1}] will appear in the probabilistic formulae, which will be functions of the basis and of the cross magnitudes of the quartets ([link] [link]. Since more pairs of matrices [{\bi R}_{\alpha}] and [{\bi R}_{\beta}] can be compatible with ([link], and for each pair [({\bi R}_{\alpha}, {\bi R}_{\beta})] more pairs of vectors [{\bf h}_{1}] and [{\bf h}_{2}] may satisfy ([link], several quartets can in general be exploited for estimating Φ. The simplest case occurs in [P\bar{1}] where the two quartets ([link] [link] suggest the calculation of the six-variate distribution function [({\bf u}_{1} = {\bf h}_{1} + {\bf h}_{2}, {\bf u}_{2} = {\bf h}_{1} - {\bf h}_{2})] [P (E_{{\bf h}_{1}}, E_{{\bf h}_{2}}, E_{{\bf h}_{1} + {\bf h}_{2}}, E_{{\bf h}_{1} - {\bf h}_{2}}, E_{2{\bf h}_{1}}, E_{2{\bf h}_{2}})] which leads to the probability formula [P^{+} \simeq 0.5 + 0.5 \tanh \left({|E_{{\bf h}_{1} + {\bf h}_{2}} E_{{\bf h}_{1} - {\bf h}_{2}}|\over 2N} \cdot {A\over 1 + B}\right),] where [P^{+}] is the probability that the product [E_{{\bf h}_{1} + {\bf h}_{2}} E_{{\bf h}_{1} - {\bf h}_{2}}] is positive, and [\eqalign{A &= \varepsilon_{{\bf h}_{1}} + \varepsilon_{{\bf h}_{2}} + 2\varepsilon_{{\bf h}_{1}} \varepsilon_{{\bf h}_{2}} + \varepsilon_{{\bf h}_{1}} \varepsilon_{2{\bf h}_{1}} + \varepsilon_{{\bf h}_{2}} \varepsilon_{2{\bf h}_{2}}\cr B &= (\varepsilon_{{\bf h}_{1}} \varepsilon_{{\bf h}_{2}} \varepsilon_{{\bf u}_{1}} + \varepsilon_{{\bf h}_{1}} \varepsilon_{{\bf h}_{2}} \varepsilon_{{\bf u}_{2}}\cr &\quad + \varepsilon_{{\bf u}_{1}} \varepsilon_{{\bf u}_{2}} \varepsilon_{2{\bf h}_{1}} + \varepsilon_{{\bf u}_{1}} \varepsilon_{{\bf u}_{2}} \varepsilon_{2{\bf h}_{2}})/(2N).}] It may be seen that in favourable cases [P^{+}\lt 0.5].
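The [P\bar{1}] formula is simple enough to evaluate directly. The sketch below uses a hypothetical function name and illustrative magnitudes; the text gives no safeguard for the case 1 + B ≤ 0, so raising an error there is a choice made for this sketch only.

```python
import math


def two_phase_probability(E_h1, E_h2, E_u1, E_u2, E_2h1, E_2h2, N):
    """P+ that E_{h1+h2} E_{h1-h2} is positive in P-1 (six-magnitude formula)."""
    def eps(E):
        return E * E - 1.0

    e_h1, e_h2 = eps(E_h1), eps(E_h2)
    e_u1, e_u2 = eps(E_u1), eps(E_u2)
    e_2h1, e_2h2 = eps(E_2h1), eps(E_2h2)

    A = e_h1 + e_h2 + 2.0 * e_h1 * e_h2 + e_h1 * e_2h1 + e_h2 * e_2h2
    B = (e_h1 * e_h2 * e_u1 + e_h1 * e_h2 * e_u2
         + e_u1 * e_u2 * e_2h1 + e_u1 * e_u2 * e_2h2) / (2.0 * N)
    if 1.0 + B <= 0.0:
        # Not covered by the text; treated as an error in this sketch.
        raise ValueError("1 + B is not positive; formula not applicable")
    argument = abs(E_u1 * E_u2) / (2.0 * N) * A / (1.0 + B)
    return 0.5 + 0.5 * math.tanh(argument)


# Weak h1 and h2 with strong 2h1 and 2h2: a negative indication (N = 30).
p_negative = two_phase_probability(0.0, 0.0, 2.0, 2.0, 2.0, 2.0, N=30)
# All magnitudes equal to 1: every epsilon vanishes, so no information.
p_neutral = two_phase_probability(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, N=30)
```

The first case illustrates the closing remark of the paragraph: A becomes negative when the reflections h1 and h2 are weak while 2h1 and 2h2 are strong, which drives P+ below 0.5.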

For the sake of brevity, the probabilistic formulae for the general case are not given and the reader is referred to the original papers.

2.2.6. Direct methods in real and reciprocal space: Sayre's equation


The statistical treatment suggested by Wilson for scaling observed intensities corresponds, in direct space, to the origin peak of the Patterson function, so it is not surprising that a general correspondence exists between probabilistic formulation in reciprocal space and algebraic properties in direct space.

For a structure containing atoms which are fully resolved from one another, the operation of raising [\rho({\bf r})] to the nth power retains the condition of resolved atoms but changes the shape of each atom. Let [\rho ({\bf r}) = \textstyle\sum\limits_{j=1}^{N} \rho_{j} ({\bf r} - {\bf r}_{j}),] where [\rho_{j}({\bf r})] is an atomic function and [{\bf r}_{j}] is the coordinate of the `centre' of the atom. Then the Fourier transform of the electron density can be written as [\eqalignno{F_{\bf h} &= \textstyle\sum\limits_{j=1}^{N} \int\limits_{V} \rho_{j} ({\bf r} - {\bf r}_{j}) \exp (2\pi i {\bf h} \cdot {\bf r}) \;\hbox{d}V &\cr &= \textstyle\sum\limits_{j=1}^{N} f_{j} \exp (2\pi i {\bf h} \cdot {\bf r}_{j}). &(}] If the atoms do not overlap [\rho^{n} ({\bf r}) = \left[\textstyle\sum\limits_{j=1}^{N} \rho_{j} ({\bf r} - {\bf r}_{j})\right]^{n} \simeq \textstyle\sum\limits_{j=1}^{N} \rho_{j}^{n} ({\bf r} - {\bf r}_{j})] and its Fourier transform gives [\eqalignno{{}_{n}F_{\bf h} &= \textstyle\int\limits_{V} \rho^{n} ({\bf r}) \exp (2\pi i {\bf h} \cdot {\bf r}) \;\hbox{d}V &\cr &= \textstyle\sum\limits_{j=1}^{N} {_{n}}\;f{_{j}} \exp (2\pi i {\bf h} \cdot {\bf r}_{j}). &(}] [{}_{n}\;f_{j}] is the scattering factor for the jth peak of [\rho^{n} ({\bf r})]: [_{n}\;f_{j} ({\bf h}) = \textstyle\int\limits_{V} \rho_{j}^{n} ({\bf r}) \exp (2\pi i {\bf h} \cdot {\bf r}) \;\hbox{d}{\bf r}.]

We now introduce the condition that all atoms are equal, so that [f_{j} \equiv f] and [{}_{n}\;f_{j} \equiv {}_{n}\;f] for any j. From the two expressions above for [F_{\bf h}] and [{}_{n}F_{\bf h}] we may write [F_{\bf h} = {f\over {}_{n}\;f}\; {}_{n}F_{\bf h} = \theta_{n} \ {}_{n}F_{\bf h},] where [\theta_{n}] is a function which corrects for the difference of shape of the atoms with electron distributions [\rho ({\bf r})] and [\rho^{n} ({\bf r})]. Since [\eqalign{\rho^{n} ({\bf r}) &= \rho ({\bf r}) \ldots \rho ({\bf r})\cr &= {1\over V^{n}} \sum\limits_{{\bf h}_{1}, \, \ldots, \,  {\bf h}_{n}\atop -\infty}^{+\infty} F_{{\bf h}_{1}} \ldots F_{{\bf h}_{n}} \exp [-2\pi i ({\bf h}_{1} + \ldots + {\bf h}_{n}) \cdot {\bf r}],}] the Fourier transform of both sides gives [\eqalign{{}_{n}F_{\bf h} &= {1\over V^{n}} \sum\limits_{{\bf h}_{1}, \,  \ldots, \,  {\bf h}_{n}\atop -\infty}^{+\infty} F_{{\bf h}_{1}} \ldots F_{{\bf h}_{n}} \int\limits_{V} \exp [2\pi i ({\bf h} - {\bf h}_{1} - \ldots - {\bf h}_{n}) \cdot {\bf r}] \;\hbox{d}V\cr &= {1\over V^{n-1}} \sum\limits_{{\bf h}_{1}, \,  \ldots, \,  {\bf h}_{n-1}\atop -\infty}^{+\infty} F_{{\bf h}_{1}} F_{{\bf h}_{2}} \ldots F_{{\bf h} - {\bf h}_{1} - {\bf h}_{2} - \ldots - {\bf h}_{n-1}},}] from which the following relation arises: [F_{\bf h} = \theta_{n} {1\over V^{n-1}} \sum\limits_{{\bf h}_{1}, \,  \ldots, \,  {\bf h}_{n-1}\atop -\infty}^{+\infty} F_{{\bf h}_{1}} F_{{\bf h}_{2}} \ldots F_{{\bf h} - {\bf h}_{1} - {\bf h}_{2} - \ldots -{\bf h}_{n-1}}.] For [n = 2], this relation reduces to Sayre's (1952) equation [but see also Hughes (1953)] [F_{\bf h} = \theta_{2} {1\over V} \sum\limits_{\bf k} F_{\bf k} F_{{\bf h} - {\bf k}}.]

If the structure contains resolved isotropic atoms of two types, P and Q, it is impossible to find a factor [\theta_{2}] such that the relation [F_{\bf h} = \theta_{2}\;{}_{2}F_{\bf h}] holds, since this would imply values of [\theta_{2}] such that [(\;f)_{P} = \theta_{2} ({}_{2}\;f)_{P}] and [(\;f)_{Q} = \theta_{2} ({}_{2}\;f)_{Q}] simultaneously. However, the following relationship can be stated (Woolfson, 1958): [F_{\bf h} = {A_{s}\over V} \sum\limits_{\bf k} F_{\bf k} F_{{\bf h}-{\bf k}} + {B_{s}\over V^{2}} \sum\limits_{{\bf k}, \,  {\bf l}} F_{\bf k} F_{\bf l} F_{{\bf h}-{\bf k}-{\bf l}},] where [A_{s}] and [B_{s}] are adjustable parameters depending on [(\sin \theta)/\lambda]. This relationship can easily be generalized to the case of structures containing resolved atoms of more than two types (von Eller, 1973).
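
The equal-atom relation between [F_{\bf h}] and [{}_{2}F_{\bf h}] can be checked numerically. The following is a minimal sketch rather than part of the original treatment: it assumes a one-dimensional cell of length V = 1, three equal Gaussian atoms whose width and positions are hypothetical, and it evaluates both transforms by direct summation on a grid. For Gaussian atoms of width s the shape factor has the closed form θ₂(h) = √2 exp(−π²s²h²), which is used here in place of a tabulated scattering-factor ratio.

```python
import cmath
import math

N_GRID = 512                      # sampling points across the cell (assumed)
SIGMA = 0.02                      # Gaussian atomic width (assumed)
POSITIONS = (0.10, 0.37, 0.71)    # fractional coordinates (assumed, well resolved)


def density(x):
    """Electron density at fractional coordinate x: a sum of equal Gaussian atoms."""
    total = 0.0
    for xj in POSITIONS:
        d = (x - xj + 0.5) % 1.0 - 0.5      # distance to the nearest periodic image
        total += math.exp(-d * d / (2.0 * SIGMA ** 2))
    return total


def transform(values, h):
    """Approximate the integral of values(x) * exp(2*pi*i*h*x) over the cell (V = 1)."""
    n = len(values)
    return sum(v * cmath.exp(2j * math.pi * h * j / n)
               for j, v in enumerate(values)) / n


rho = [density(j / N_GRID) for j in range(N_GRID)]
rho_squared = [r * r for r in rho]


def theta_2(h):
    """Shape factor f / 2f for Gaussian atoms of width SIGMA."""
    return math.sqrt(2.0) * math.exp(-(math.pi * SIGMA * h) ** 2)


def sayre_residual(h):
    """|F_h - theta_2 * 2F_h|, expected to vanish for equal, resolved atoms."""
    return abs(transform(rho, h) - theta_2(h) * transform(rho_squared, h))


max_residual = max(sayre_residual(h) for h in range(0, 11))
```

With these assumed parameters the atoms are separated by more than ten widths, so `max_residual` is at the level of rounding error. Making the atoms unequal, or letting them overlap, would be expected to break the agreement, in line with the two-atom-type argument above.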

Besides the algebraic properties of the electron density, Patterson methods can also be developed so that they provide phase indications. For example, it is possible to find the reciprocal counterpart of the function [P_{n} ({\bf u}_{1}, {\bf u}_{2}, \ldots, {\bf u}_{n}) = \textstyle\int\limits_{V} \rho ({\bf r}) \rho ({\bf r} + {\bf u}_{1}) \ldots \rho ({\bf r} + {\bf u}_{n}) \;\hbox{d}V.] For [n = 1] this function coincides with the usual Patterson function [P({\bf u})]; for [n = 2] it reduces to the double Patterson function [P_{2} ({\bf u}_{1}, {\bf u}_{2})] introduced by Sayre (1953). Expansion of [P_{2} ({\bf u}_{1}, {\bf u}_{2})] as a Fourier series yields [P_{2} ({\bf u}_{1}, {\bf u}_{2}) = {1\over V^{2}} \sum\limits_{{\bf h}_{1}, \, {\bf h}_{2}} E_{{\bf h}_{1}} E_{{\bf h}_{2}} E_{{\bf h}_{3}} \exp [-2\pi i ({\bf h}_{1} \cdot {\bf u}_{1} + {\bf h}_{2} \cdot {\bf u}_{2})],] where [{\bf h}_{1} + {\bf h}_{2} + {\bf h}_{3} = 0]. Vice versa, the value of a triplet invariant may be considered as the Fourier transform of the double Patterson.

Among the main results relating direct- and reciprocal-space properties, the following may be recalled:

  • (a) from the properties of [P_{2} ({\bf u}_{1}, {\bf u}_{2})] the following relationship may be obtained (Vaughan, 1958[link]) [\eqalign{&E_{{\bf h}_{1}} E_{{\bf h}_{2}} E_{{\bf h}_{1} + {\bf h}_{2}} - N^{-3/2}\cr &\qquad \simeq A_{1} \langle (|E_{\bf k}|^{2} - 1) (|E_{{\bf h}_{1} + {\bf k}}|^{2} - 1) (|E_{-{\bf h}_{2} + {\bf k}}|^{2} - 1)\rangle_{\bf k} - B_{1},}] which is clearly related to ([link];

  • (b) the zero points in the Patterson function provide information about the value of a triplet invariant (Anzenhofer & Hoppe, 1962[link]; Allegra, 1979[link]);

  • (c) the Hoppe sections (Hoppe, 1963[link]) of the double Patterson provide useful information for determining the triplet signs (Krabbendam & Kroon, 1971[link]; Simonov & Weissberg, 1970[link]);

  • (d) one-phase s.s.'s of the first rank can be estimated via the Fourier transform of single Harker sections of the Patterson (Ardito et al., 1985), i.e. [F_{\bf H} \sim {1\over L} \exp (2\pi i{\bf h} \cdot {\bf T}_{n}) \int\limits_{HS({\bf I}, \,   {\bf C}_{n})} P({\bf u}) \exp (2\pi i {\bf h} \cdot {\bf u}) \;\hbox{d}{\bf u},] where (see Section[link]) [{\bf H} = {\bf h}({\bf I} - {\bf R}_{n})] is the s.s., u varies over the complete Harker section corresponding to the operator [{\bf C}_{n}], in symbols [HS({\bf I}, {\bf C}_{n})], and L is a constant which takes into account the dimensionality of the Harker section.

    If no spurious peak lies on the Harker section, then this relationship is exact. Owing to the finiteness of experimental data and to the presence of spurious peaks, it cannot be considered exact in practice: it works better when the structure contains heavy atoms.

    More recently (Cascarano, Giacovazzo, Luić et al., 1987[link]), a special least-squares procedure has been proposed for discriminating spurious peaks among those lying on Harker sections and for improving positional and thermal parameters of heavy atoms.

  • (e) translation and rotation functions (see Chapter 2.3[link] ), when defined in direct space, always have their counterpart in reciprocal space.

2.2.7. Scheme of procedure for phase determination

A traditional procedure for phase assignment may be schematically presented as follows:

  • Stage 1: Normalization of s.f.'s. See Section 2.2.4.[link]

  • Stage 2: (Possible) estimation of one-phase s.s.'s. The computing program recognizes the one-phase s.s.'s and applies the proper formulae (see Section[link]).

    Each phase is associated with a reliability value, to allow the user to regard as known only those phases with reliability higher than a given threshold.

  • Stage 3: Search of the triplets. The reflections are listed in order of decreasing [|E|] and, for each reflection, all possible triplets are reported (this is the so-called [\sum_{2}] list). The value [G = 2|E_{\bf h} E_{\bf k} E_{{\bf h}-{\bf k}}|/\sqrt{N}] is associated with every triplet as a measure of its reliability. Usually reflections with [|E|\lt E_{s}] ([E_{s}] may range from 1.2 to 1.6) are omitted from this stage onward.

  • Stage 4: Definition of the origin and enantiomorph. This stage is carried out according to the theory developed in Section 2.2.3.[link] Phases chosen for defining the origin and enantiomorph, one-phase seminvariants estimated at stage 2, and symbolic phases described at stage 5 are the only phases known at the beginning of the phasing procedure. This set of phases is conventionally referred to as the starting set, from which iterative application of the tangent formula will derive new phase estimates.

  • Stage 5: Assignment of one or more (symbolic or numerical) phases. In complex structures the number of phases assigned for fixing the origin and the enantiomorph may be inadequate as a basis for further phase determination. Furthermore, only a few one-phase s.s.'s can be determined with sufficient reliability to make them qualify as members of the starting set. Symbolic phases may then be associated with some (generally from 1 to 6) high-modulus reflections (symbolic addition procedures). Iterative application of triplet relations leads to the determination of other phases which, in part, will remain expressed by symbols (Karle & Karle, 1966[link]).

    In other procedures (multisolution procedures) each symbol is assigned four phase values in turn: [\pi / 4, 3\pi / 4, 5\pi / 4, 7\pi / 4]. If p symbols are used, in at least one of the possible [4^{p}] solutions each symbolic phase has unit probability of being within [45^{\circ}] of its true value, with a mean error of [22.5^{\circ}].

    To find a good starting set a convergence method (Germain et al., 1970[link]) is used according to which: (a) [\langle \alpha_{\bf h}\rangle = \textstyle\sum\limits_{j} G_{j} I_{1} (G_{j}) / I_{0} (G_{j})] is calculated for all reflections (j runs over the set of triplets containing h); (b) the reflection is found with smallest [\langle \alpha \rangle] not already in the starting set; it is retained to define the origin if the origin cannot be defined without it; (c) the reflection is eliminated if it is not used for origin definition. Its [\langle \alpha \rangle] is recorded and [\langle \alpha \rangle] values for other reflections are updated; (d) the cycle is repeated from (b) until all reflections are eliminated; (e) the reflections with the smallest [\langle \alpha \rangle] at the time of elimination go into the starting set; (f) the cycle from (a) is repeated until all reflections have been chosen.

  • Stage 6: Application of tangent formula. Phases are determined in reverse order of elimination in the convergence procedure. In order to ensure that poorly determined phases [\varphi_{{\bf k}_j}] and [\varphi_{{\bf h}-{\bf k}_j}] have little effect in the determination of other phases a weighted tangent formula is normally used (Germain et al., 1971): [\tan \varphi_{\bf h} = {{\textstyle\sum_{j}} w_{{\bf k}_{j}} w_{{\bf h}-{\bf k}_{j}} |E_{{\bf k}_{j}} E_{{\bf h}-{\bf k}_{j}} |\sin (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h}-{\bf k}_{j}})\over {\textstyle\sum_{j}} w_{{\bf k}_{j}} w_{{\bf h}-{\bf k}_{j}} |E_{{\bf k}_{j}} E_{{\bf h}-{\bf k}_{j}} | \cos (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h}-{\bf k}_{j}})},] where [w_{\bf h} = \min\;(0.2 \alpha_{\bf h}, 1).] Once a large number of contributions are available in the tangent formula for a given [\varphi_{\bf h}], the value of [\alpha_{\bf h}] quickly becomes greater than 5, and so an unrealistic unitary weight is assigned to [\varphi_{\bf h}]. To avoid this, a different weighting scheme may be used (Hull & Irwin, 1978), according to which [w = \psi \exp (-x^{2}) \textstyle\int\limits_{0}^{x} \exp (t^{2}) \;\hbox{d}t,] where [x = \alpha / \langle \alpha \rangle] and [\psi = 1.8585] is a constant chosen so that [w = 1] when [x = 1]. Except for ψ, the right-hand side is the Dawson integral, which assumes its maximum value close to [x = 1] (see Fig.[link]): when α departs appreciably from [\langle \alpha \rangle] in either direction then [w\lt 1], and so the agreement between α and [\langle \alpha \rangle] is promoted.


    Figure. The form of w as given by the Hull & Irwin (1978) weighting scheme.

    Alternative weighting schemes for the tangent formula are frequently used [for example, see Debaerdemaeker et al. (1985)]. In one (Giacovazzo, 1979b), the values [\alpha_{{\bf k}_{j}}] and [\alpha_{{\bf h} - {\bf k}_{j}}] (which are usually available in direct procedures) are considered as additional a priori information so that the weighted tangent formula may be replaced by [\tan \varphi_{{\bf h}} \simeq {{\textstyle\sum_{j}} \beta_{j} \sin (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h - k}_{j}})\over {\textstyle\sum_{j}} \beta_{j} \cos (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h - k}_{j}})},] where [\beta_{j}] is the solution of the equation [D_{1} (\beta_{j}) = D_{1} (G_{j}) D_{1} (\alpha_{{\bf k}_{j}}) D_{1} (\alpha_{{\bf h} - {\bf k}_{j}}).] Here [G_{j} = 2 | E_{\bf h} E_{{\bf k}_{j}} E_{{\bf h - k}_{j}} | / \sqrt{N}] or the corresponding second representation parameter, and [D_{1} (x) = I_{1} (x) / I_{0} (x)] is the ratio of two modified Bessel functions.

    In order to promote (in accordance with the aims of Hull and Irwin) the agreement between α and [\langle \alpha \rangle], the distribution of α may be used (Cascarano, Giacovazzo, Burla et al., 1984[link]; Burla et al., 1987[link]); in particular, the first two moments of the distribution: accordingly, [w = \left\{\exp \left[{- (\alpha - \langle \alpha \rangle)^{2}\over 2 \sigma_{\alpha}^{2}}\right]\right\}^{1/3}] may be used, where [\sigma_{\alpha}^{2}] is the estimated variance of α.

  • Stage 7: Figures of merit. The correct solution is found among several by means of figures of merit (FOMs) which are expected to be extreme for the correct solution. Largely used are (Germain et al., 1970[link])

    • [\hbox{ABSFOM} = \textstyle\sum\limits_{\bf h} \alpha_{\bf h} / \textstyle\sum\limits_{\bf h} \langle \alpha_{\bf h} \rangle, \leqno(a)] which is expected to be unity for the correct solution.

    • [\hbox{PSI0} = {{\textstyle\sum_{{\bf h}}} \left| {\textstyle\sum_{{\bf k}}} E_{{\bf k}} E_{{\bf h - k}} \right|\over {\textstyle\sum_{{\bf h}}} \left({\textstyle\sum_{{\bf k}}} | E_{{\bf k}} E_{{\bf h - k}} |^{2}\right)^{1/2}}. \leqno(b)] The summation over k includes (Cochran & Douglas, 1957[link]) the strong [|E|]'s for which phases have been determined, and indices h correspond to very small [|E_{\bf h}|]. Minimal values of PSI0 (≤ 1.20) are expected to be associated with the correct solution.

    • [R_{\alpha} = {{\textstyle\sum_{{\bf h}}} | \alpha_{\bf h} - \langle \alpha_{\bf h} \rangle|\over {\textstyle\sum_{{\bf h}}} \langle \alpha_{\bf h} \rangle}. \leqno(c)] That is, the Karle & Karle (1966[link]) residual between the actual and the estimated α's. After scaling of [\alpha_{\bf h}] on [\langle \alpha_{\bf h}\rangle] the correct solution should be characterized by the smallest [R_{\alpha}] values.

    • [\hbox{NQEST} = \textstyle\sum\limits_{j} G_{j} \cos \Phi_{j}, \leqno(d)] where G is defined by ([link] and [\Phi = \varphi_{\bf h} - \varphi_{\bf k} - \varphi_{\bf l} - \varphi_{{\bf h}-{\bf k}-{\bf l}}] are quartet invariants characterized by large basis magnitudes and small cross magnitudes (De Titta et al., 1975[link]; Giacovazzo, 1976[link]). Since G is expected to be negative as well as [\cos \Phi], the value of NQEST is expected to be positive and a maximum for the correct solution.

    Figures of merit are then combined as [\eqalign{\hbox{CFOM} &= w_{1} {\hbox{ABSFOM} - \hbox{ABSFOM}_{\min}\over \hbox{ABSFOM}_{\max} - \hbox{ABSFOM}_{\min}}\cr &\quad + w_{2} {\hbox{PSI0}_{\max} - \hbox{PSI0}\over \hbox{PSI0}_{\max} - \hbox{PSI0}_{\min}}\cr &\quad + w_{3} {R_{{\alpha}_{\max}} - R_{\alpha}\over R_{{\alpha}_{\max}} - R_{{\alpha}_{\min}}}\cr &\quad + w_{4} {\hbox{NQEST} - \hbox{NQEST}_{\min}\over \hbox{NQEST}_{\max} - \hbox{NQEST}_{\min}},}] where [w_{i}] are empirical weights proportional to the confidence of the user in the various FOMs.

    Different FOMs are often used by some authors in combination with those described above: for example, enantiomorph triplets and quartets are supplementary FOMs (Van der Putten & Schenk, 1977[link]; Cascarano, Giacovazzo & Viterbo, 1987[link]).

    Different schemes of calculating and combining FOMs are also used: a recent scheme (Cascarano, Giacovazzo & Viterbo, 1987[link]) uses

    • [\hbox{CPHASE} = {{\textstyle\sum} w_{j} G_{j} \cos (\Phi_{j} - \theta_{j}) + {\textstyle\sum} w_{j} G_{j} \cos \Phi_{j}\over {\textstyle\sum_{{\rm s.i.} + {\rm s.s.}}} w_{j} G_{j} D_{1} (G_{j})},\leqno{\quad(a1)}] where the first summation in the numerator extends over symmetry-restricted one-phase and two-phase s.s.'s (see Sections[link] and[link]), and the second summation in the numerator extends over negative triplets estimated via the second representation formula [equation ([link])] and over negative quartets. The value of CPHASE is expected to be close to unity for the correct solution.

    • (a2) [\alpha_{\bf h}] for strong triplets and [E_{\bf k} E_{{\bf h} - {\bf k}}] contributions for PSI0 triplets may be considered random variables: the agreements between their actual and their expected distributions are considered as criteria for identifying the correct solution.

    • (a3) correlation among some FOMs is taken into account.

    According to this scheme, each FOM (as well as the CFOM) is expected to be unity for the correct solution. Thus one or more figures are available which constitute a sort of criterion (on an absolute scale) concerning the correctness of the various solutions: FOMs (and CFOM) [\simeq 1] probably denote correct solutions, CFOMs [\ll 1] should indicate incorrect solutions.

  • Stage 8: Interpretation of E maps. This is carried out in up to four stages (Koch, 1974[link]; Main & Hull, 1978[link]; Declercq et al., 1973[link]):

    • (a) peak search;

    • (b) separation of peaks into potentially bonded clusters;

    • (c) application of stereochemical criteria to identify possible molecular fragments;

    • (d) comparison of the fragments with the expected molecular structure.
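
The weighted tangent formula of Stage 6 and the two weighting schemes quoted there can be sketched as follows. This is an illustrative outline, not the code of any particular package: the function names are hypothetical, the reliability parameter is assumed to take the equal-atom form α = 2|E_h|(T² + B²)^{1/2}/√N (with T and B the numerator and denominator of the tangent formula), and the Dawson integral is evaluated with a plain Simpson rule.

```python
import math

PSI = 1.8585  # constant of the Hull & Irwin scheme, as quoted in Stage 6


def tangent_phase(contributions):
    """Return (phi_h, resultant) from the weighted tangent formula.

    contributions: iterable of (w_k, w_hk, E_k, E_hk, phi_k, phi_hk),
    with phases in radians.
    """
    top = 0.0
    bottom = 0.0
    for w_k, w_hk, e_k, e_hk, phi_k, phi_hk in contributions:
        weight = w_k * w_hk * abs(e_k * e_hk)
        top += weight * math.sin(phi_k + phi_hk)
        bottom += weight * math.cos(phi_k + phi_hk)
    # atan2 resolves the quadrant, which the tangent alone leaves ambiguous
    return math.atan2(top, bottom), math.hypot(top, bottom)


def alpha(e_h, resultant, n_atoms):
    """Assumed reliability parameter: 2 |E_h| * resultant / sqrt(N)."""
    return 2.0 * abs(e_h) * resultant / math.sqrt(n_atoms)


def germain_weight(alpha_h):
    """w = min(0.2 * alpha, 1)."""
    return min(0.2 * alpha_h, 1.0)


def dawson(x, steps=400):
    """exp(-x^2) * integral_0^x exp(t^2) dt by Simpson's rule (steps even)."""
    if x == 0.0:
        return 0.0
    h = x / steps
    total = 1.0 + math.exp(x * x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * math.exp((i * h) ** 2)
    return math.exp(-x * x) * total * h / 3.0


def hull_irwin_weight(alpha_h, alpha_expected):
    """w = PSI * Dawson(alpha / <alpha>)."""
    return PSI * dawson(alpha_h / alpha_expected)


# One hypothetical contribution: |E_k| = |E_h-k| = 2, phases 0.3 and 0.5 rad
phi_h, resultant = tangent_phase([(1.0, 1.0, 2.0, 2.0, 0.3, 0.5)])
alpha_h = alpha(2.0, resultant, n_atoms=16)
```

In this toy case the single contribution fixes the new phase at the sum of the two known phases. The first weight saturates at 1 as soon as α reaches 5, whereas the second falls off when α is either much smaller or much larger than its expected value.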

2.2.8. Other multisolution methods applied to small molecules

In very complex structures a large initial set of known phases seems to be a basic requirement for a structure to be determined. This aim can be achieved, for example, by introducing a large number of permutable phases into the initial set. However, the introduction of every new symbol implies a fourfold increase in computing time, which, even on fast computers, quickly leads to computing-time limitations. On the other hand, a relatively large starting set is not in itself enough to ensure a successful structure determination. This is the case, for example, when the triplet invariants used in the initial steps differ significantly from zero. New strategies have therefore been devised to solve more complex structures.

  • (1) Magic-integer methods

    In the classical procedure described in Section 2.2.7, the unknown phases in the starting set are assigned all combinations of the values [\pm \pi / 4, \pm 3 \pi / 4]. For n unknown phases in the starting set, [4^{n}] sets of phases arise by quadrant permutation; this is a number that increases very rapidly with n. According to White & Woolfson (1975), n phases can be represented by a sequence of n integers [m_{i}] through the equations [\varphi_{i} = m_{i}x \ (\hbox{mod } 2 \pi), \quad i = 1, \ldots, n.] This set of equations can be regarded as the parametric equation of a straight line in n-dimensional phase space. The nature and size of errors connected with magic-integer representations have been investigated by Main (1977), who also gave a recipe for deriving magic-integer sequences which minimize the r.m.s. errors in the represented phases (see Table[link]). To assign a phase value, the variable x in these equations is given a series of values at equal intervals in the range [0\lt x\lt 2 \pi]. The enantiomorph is defined by exploring only the appropriate half of the n-dimensional space.

    Table. Magic-integer sequences for small numbers of phases (n) together with the number of sets produced and the root-mean-square error in the phases

    n   Sequence                   No. of sets   R.m.s. error (°)
    1   1                                    4                 26
    2   2 3                                 12                 29
    3   3 4 5                               20                 37
    4   5 7 8 9                             32                 42
    5   8 11 13 14 15                       50                 45
    6   13 18 21 23 24 25                   80                 47
    7   21 29 34 37 39 40 41               128                 48
    8   34 47 55 60 63 65 66 67            206                 49

    A different way of using the magic-integer method (Declercq et al., 1975[link]) is the primary–secondary P–S method which may be described schematically in the following way:

    • (a) Origin- and enantiomorph-fixing phases are chosen and some one-phase s.s.'s are estimated.

    • (b) Nine phases [this is only an example: very long magic-integer sequences may be used to represent primary phases (Hull et al., 1981; Debaerdemaeker & Woolfson, 1983)] are represented with the approximate relationships: [\cases{\varphi_{i_{1}} = 3 x\cr \varphi_{i_{2}} = 4 x\cr \varphi_{i_{3}} = 5 x\cr}\qquad \cases{\varphi_{j_{1}} = 3 y\cr \varphi_{j_{2}} = 4 y\cr \varphi_{j_{3}} = 5 y\cr}\qquad \cases{\varphi_{p_{1}} = 3 z\cr \varphi_{p_{2}} = 4 z\cr \varphi_{p_{3}} = 5 z.\cr}] Phases in (a) and (b) constitute the primary set.

    • (c) The phases in the secondary set are those defined through [\sum_{2}] relationships involving pairs of phases from the primary set: they, too, can be expressed in magic-integer form.

    • (d) All the triplets that link together the phases in the combined primary and secondary set are now found, other than triplets used to obtain secondary reflections from the primary ones. The general algebraic form of these triplets will be [m_{1}x + m_{2}y + m_{3}z + b \equiv 0\ (\hbox{mod } 1),] where b is a phase constant which arises from symmetry translation. It may be expected that the `best' value of the unknown x, y, z corresponds to a maximum of the function [\psi (x, y, z) = \textstyle\sum |E_{1} E_{2} E_{3}| \cos 2 \pi (m_{1}x + m_{2}y + m_{3}z + b),] with [0\leq x, y, z\lt 1]. It should be noticed that ψ is a Fourier summation which can easily be evaluated. In fact, ψ is essentially a figure of merit for a large number of phases evaluated in terms of a small number of magic-integer variables and gives a measure of the internal consistency of [\sum_{2}] relationships. The ψ map generally presents several peaks and therefore can provide several solutions for the variables.

  • (2) The random-start method

    These are procedures which try to solve crystal structures by starting from random initial phases (Baggio et al., 1978[link]; Yao, 1981[link]). They may be so described:

    • (a) A number of reflections (say NUM ∼ 100 or larger) at the bottom of the CONVERGE map are selected. These, and the relationships which link them, form the system for which trial phases will be found.

    • (b) A pseudo-random number generator is used to generate M sets of NUM random phases. Each of the M sets is refined and extended by the tangent formula or similar methods.

  • (3) Accurate calculation of s.i.'s and s.s.'s with 1, 2, 3, 4, …, n phases

    Having a large set of good phase relationships allows one to overcome difficulties in the early stages and in the refinement process of the phasing procedure. Accurate estimates of s.i.'s and s.s.'s may be achieved by the application of techniques such as the representation method or the neighbourhood principle (Hauptman, 1975[link]; Giacovazzo, 1977a[link], 1980b[link]). So far, second-representation formulae are available for triplets and one-phase seminvariants; in particular, reliably estimated negative triplets can be recognized, which is of great help in the phasing process (Cascarano, Giacovazzo, Camalli et al., 1984[link]). Estimation of higher-order s.s.'s with upper representations or upper neighbourhoods is rather difficult, both because the procedures are time consuming and because the efficiency of the present joint probability distribution techniques deteriorates with complexity. However, further progress can be expected in the field.

  • (4) Modified tangent formulae and least-squares determination and refinement of phases

    The problem of deriving the individual phase angles from triplet relationships is greatly overdetermined: the number of triplets greatly exceeds the number of phases, so that any [\varphi_{\bf h}] may be determined by a least-squares approach (Hauptman et al., 1969). The function to be minimized may be [{M} = {{\textstyle\sum_{\bf k}} w_{\bf k}[\cos (\varphi_{\bf h} - \varphi_{\bf k} - \varphi_{{\bf h}-{\bf k}}) - C_{\bf k}]^{2}\over {\textstyle\sum_{\bf k}} w_{\bf k}},] where [C_{\bf k}] is the estimate of the cosine obtained by probabilistic or other methods.

    Effective least-squares procedures based on linear equations (Debaerdemaeker & Woolfson, 1983; Woolfson, 1977) can also be used. A triplet relationship is usually represented by [(\varphi_{p} \pm \varphi_{q} \pm \varphi_{r} + b) \approx 0\ (\hbox{mod } 2 \pi),] where b is a factor arising from translational symmetry. If this relationship is expressed in cycles and suitably weighted, then it may be written as [w (\varphi_{p} \pm \varphi_{q} \pm \varphi_{r} + b) = wn,] where n is some integer. If the integers were known then the set of equations would appear (in matrix notation) as [{\bi A}\boldPhi = {\bi C},] giving the least-squares solution [{\boldPhi } = ({\bi A}^{T}{\bi A})^{-1} {\bi A}^{T}{\bi C}.] When approximate phases are available, the nearest integers may be found and these two matrix equations constitute the basis for further refinement.

    Modified tangent procedures are also used, such as (Sint & Schenk, 1975[link]; Busetta, 1976[link]) [\tan \varphi_{\bf h} \simeq {{\textstyle\sum_{j}} G_{{{\bf h}, \,  {\bf k}}_{j}} \sin (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h}-{\bf k}_{j}} - \Delta_{j})\over \sum G_{{{\bf h}, \,  {\bf k}}_{j}} \cos (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h}-{\bf k}_{j}} - \Delta_{j})},] where [\Delta_{j}] is an estimate for the triplet phase sum [(\varphi_{\bf h} - \varphi_{{\bf k}_{j}} - \varphi_{{\bf h}-{\bf k}_{j}})].

  • (5) Techniques based on the positivity of Karle–Hauptman determinants

    (The main formulae have been briefly described in Section[link].) The maximum determinant rule has been applied to solve small structures (de Rango, 1969[link]; Vermin & de Graaff, 1978[link]) via determinants of small order. It has, however, been found that their use (Taylor et al., 1978[link]) is not of sufficient power to justify the larger amount of computing time required by the technique as compared to that required by the tangent formula.

  • (6) Tangent techniques using simultaneously triplets, quartets,…

    The availability of a large number of phase relationships, in particular during the first stages of a direct procedure, makes the phasing process easier. However, quartets are sums of two triplets with a common reflection. If the phase of this reflection (and/or of the other cross terms) is known then the quartet probability formulae described in Section[link] cannot hold. Similar considerations may be made for quintet relationships. Thus triplet, quartet and quintet formulae described in the preceding paragraphs, if used without modifications, will certainly introduce systematic errors in the tangent refinement process.

    A method which takes into account correlation between triplets and quartets has been described (Giacovazzo, 1980c[link]) [see also Freer & Gilmore (1980[link]) for a first application], according to which [\tan \varphi_{\bf h} \simeq {{\textstyle\sum\limits_{\bf k}} G \sin (\varphi_{\bf k} + \varphi_{{\bf h}-{\bf k}}) - {\textstyle\sum\limits_{{{\bf k}, \,  {\bf l}}}} G' \sin (\varphi_{\bf k} + \varphi_{\bf l} + \varphi_{{\bf h}-{\bf k}-{\bf l}})\over {\textstyle\sum\limits_{\bf k}} G \cos (\varphi_{\bf k} + \varphi_{{\bf h}-{\bf k}}) - {\textstyle\sum\limits_{{\bf k}, \,  {\bf l}}} G' \cos (\varphi_{\bf k} + \varphi_{\bf l} + \varphi_{{\bf h}-{\bf k}-{\bf l}})},] where G′ takes into account both the magnitudes of the cross terms of the quartet and the fact that their phases may be known.

  • (7) Integration of Patterson techniques and direct methods (Egert & Sheldrick, 1985[link]) [see also Egert (1983[link], and references therein)]

    A fragment of known geometry is oriented in the unit cell by real-space Patterson rotation search (see Chapter 2.3[link] ) and its position is found by application of a translation function (see Section[link] and Chapter 2.3[link] ) or by maximizing the weighted sum of the cosines of a small number of strong translation-sensitive triple phase invariants, starting from random positions. Suitable FOMs rank the most reliable solutions.

  • (8) Maximum entropy methods

    A common starting point for all direct methods is a stochastic process according to which crystal structures are thought of as being generated by randomly placing atoms in the asymmetric unit of the unit cell according to some a priori distribution. A non-uniform prior distribution of atoms p(r) gives rise to a source of random atomic positions with entropy (Jaynes, 1957[link]) [H(p) = - \textstyle\int\limits_{V} p({\bf r}) \log p({\bf r}) \;\hbox{d}{\bf r}.] The maximum value [H_{\max} = \log V] is reached for a uniform prior [p({\bf r}) = 1/V].

    The strength of the restrictions introduced by p(r) is not measured by [H(p)] but by [H(p) - H_{\max}], given by [H(p) - H_{\max} = - \textstyle\int\limits_{V} p({\bf r}) \log [\;p({\bf r})/m({\bf r})] \;\hbox{d}{\bf r},] where [m({\bf r}) = 1/V]. Accordingly, if a prior prejudice m(r) exists, which maximizes H, the revised relative entropy is [S(p) = - \textstyle\int\limits_{V} p({\bf r}) \log [\;p({\bf r})/m({\bf r})] \;\hbox{d}{\bf r}.] The maximization problem was solved by Jaynes (1957[link]). If [G_{j}(p)] are linear constraint functionals defined by given constraint functions [C_{j}({\bf r})] and constraint values [c_{j}], i.e. [G_{j}(p) = \textstyle\int\limits_{V} p({\bf r})C_{j}({\bf r}) \;\hbox{d}{\bf r} = c_{j},] the most unbiased probability density p(r) under prior prejudice m(r) is obtained by maximizing the entropy of p(r) relative to m(r). A standard variational technique suggests that the constrained maximization is equivalent to the unconstrained maximization of the functional [S(p) + \textstyle\sum\limits_{j} \lambda_{j}G_{j}(p),] where the [\lambda_{j}]'s are Lagrange multipliers whose values can be determined from the constraints.

    Such a technique has been applied to the problem of finding good electron-density maps in different ways by various authors (Wilkins et al., 1983[link]; Bricogne, 1984[link]; Navaza, 1985[link]; Navaza et al., 1983[link]).

    Maximum entropy methods are strictly connected with traditional direct methods: in particular it has been shown that:

    • (a) the maximum determinant rule (see Section[link]) is strictly connected (Britten & Collins, 1982[link]; Piro, 1983[link]; Narayan & Nityananda, 1982[link]; Bricogne, 1984[link]);

    • (b) the construction of conditional probability distributions of structure factors amounts precisely to a reciprocal-space evaluation of the entropy functional [S(p)] (Bricogne, 1984[link]).

    Maximum entropy methods are under strong development: important contributions can be expected in the near future even if a multipurpose robust program has not yet been written.
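
The magic-integer representation of item (1) lends itself to a short illustration. The sketch below is a hedged example, not the procedure of any cited program: the function names, the scan size and the test value of x are all hypothetical. It generates phases from the sequence 3, 4, 5 and then scans x at equal intervals in 0 < x < 2π to find the value that best reproduces a given set of target phases.

```python
import math

TWO_PI = 2.0 * math.pi


def magic_phases(integers, x):
    """Phases phi_i = m_i * x (mod 2*pi) for a magic-integer sequence."""
    return [(m * x) % TWO_PI for m in integers]


def phase_difference(a, b):
    """Smallest absolute difference between two phases, in radians."""
    d = (a - b) % TWO_PI
    return min(d, TWO_PI - d)


def best_representation(target, integers, n_steps):
    """Scan x at equal intervals in 0 < x < 2*pi.

    Returns (x, rms) for the trial set closest to the target phases.
    """
    best_x, best_rms = None, float("inf")
    for step in range(1, n_steps):
        x = TWO_PI * step / n_steps
        trial = magic_phases(integers, x)
        rms = math.sqrt(sum(phase_difference(t, p) ** 2
                            for t, p in zip(trial, target)) / len(target))
        if rms < best_rms:
            best_x, best_rms = x, rms
    return best_x, best_rms


sequence = (3, 4, 5)                  # the n = 3 sequence listed in the table
true_x = TWO_PI * 7 / 64              # hypothetical value lying on the scan grid
target = magic_phases(sequence, true_x)
found_x, rms_error = best_representation(target, sequence, n_steps=64)
```

Here the target phases lie exactly on the line traced by the sequence, so the scan recovers x with no error. For arbitrary target phases the best `rms` would be non-zero, which is the representation error that the tabulated r.m.s. values summarize.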

2.2.9. Some references to direct-methods packages

Some references for direct-methods packages are given below. Other useful packages using symbolic addition or multisolution procedures do exist but are not well documented.

CRUNCH: Gelder, R. de, de Graaff, R. A. G. & Schenk, H. (1993[link]). Automatic determination of crystal structures using Karle–Hauptman matrices. Acta Cryst. A49, 287–293.

DIRDIF: Beurskens, P. T., Beurskens, G., de Gelder, R., Garcia-Granda, S., Gould, R. O., Israel, R. & Smits, J. M. M. (1999). The DIRDIF-99 program system. Crystallography Laboratory, University of Nijmegen, The Netherlands.

MITHRIL: Gilmore, C. J. (1984[link]). MITHRIL. An integrated direct-methods computer program. J. Appl. Cryst. 17, 42–46.

MULTAN88: Main, P., Fiske, S. J., Germain, G., Hull, S. E., Declercq, J.-P., Lessinger, L. & Woolfson, M. M. (1999[link]). Crystallographic software: teXsan for Windows.

PATSEE: Egert, E. & Sheldrick, G. M. (1985[link]). Search for a fragment of known geometry by integrated Patterson and direct methods. Acta Cryst. A41, 262–268.

SAPI: Fan, H.-F. (1999[link]). Crystallographic software: teXsan for Windows.

SnB: Weeks, C. M. & Miller, R. (1999[link]). The design and implementation of SnB version 2.0. J. Appl. Cryst. 32, 120–124.

SHELX97: Sheldrick, G. M. (2000a). The SHELX home page.

SHELXS: Sheldrick, G. M. (2000b). SHELX.

SIR97: Altomare, A., Burla, M. C., Camalli, M., Cascarano, G. L., Giacovazzo, C., Guagliardi, A., Moliterni, A. G. G., Polidori, G. & Spagna, R. (1999). SIR97: a new tool for crystal structure determination and refinement. J. Appl. Cryst. 32, 115–119.

XTAL3.6.1: Hall, S. R., du Boulay, D. J. & Olthof-Hazekamp, R. (1999). Xtal3.6 crystallographic software.

2.2.10. Direct methods in macromolecular crystallography

Introduction

Protein structures cannot be solved ab initio by traditional direct methods (i.e., by application of the tangent formula alone). Accordingly, the first applications were focused on two tasks:

  • (a) improvement of the accuracy of the available phases (refinement process);

  • (b) extension of phases from lower to higher resolution (phase-extension process).

The application of standard tangent techniques to (a) and (b) has not been found to be very satisfactory (Coulter & Dewar, 1971; Hendrickson et al., 1973; Weinzierl et al., 1969). Tangent methods, in fact, require atomicity and non-negativity of the electron density. Neither property is satisfied if data do not extend to atomic resolution [(d\gt 2\;\hbox{\AA})]. Because of series termination and other errors the electron-density map at [d\gt 2\;\hbox{\AA}] presents large negative regions which will appear as false peaks in the squared structure. However, tangent methods use only a part of the information given by the Sayre equation. In fact, the Sayre equation expresses two equations relating the radial and angular parts of the two sides, so providing a large degree of overdetermination of the phases. To exploit this, Sayre (1972) [see also Sayre & Toupin (1975)] suggested minimizing, by least squares as a function of the phases, the quantity [\textstyle\sum\limits_{\bf h} \left|a_{\bf h} F_{\bf h} - \textstyle\sum\limits_{\bf k} F_{\bf k} F_{{\bf h}-{\bf k}}\right|^{2}.] Even if tests on rubredoxin (extensions of phases from 2.5 to 1.5 Å resolution) and insulin (Cutfield et al., 1975) (from 1.9 to 1.5 Å resolution) were successful, the limitations of the method are its high cost and, especially, the higher efficiency of the least-squares method. Equivalent considerations hold for the application of determinantal methods to proteins [see Podjarny et al. (1981); de Rango et al. (1985) and literature cited therein].

A question now arises: why is the tangent formula unable to solve protein structures? Fan et al. (1991[link]) considered the question from a first-principle approach and concluded that:

  • (1) the triplet phase probability distribution is very flat for proteins (N is very large) and close to the uniform distribution;

  • (2) low-resolution data create additional problems for direct methods since the number of available phase relationships per reflection is small.
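Point (1) can be made concrete with a small numerical sketch. It assumes, as an illustration only, a triplet density proportional to exp(G cos Φ) with the equal-atom concentration parameter G = 2|E_h E_k E_{h+k}|/N^{1/2}; the function names are hypothetical. The ratio between the density at Φ = 0 and at Φ = π is then exp(2G), which tends to 1 (a uniform distribution) as N grows.

```python
import math

def cochran_G(e_h, e_k, e_hk, n_atoms):
    """Concentration parameter G = 2|E_h E_k E_(h+k)| / sqrt(N), equal-atom case."""
    return 2.0 * abs(e_h * e_k * e_hk) / math.sqrt(n_atoms)

def peak_to_trough(G):
    """Ratio P(0)/P(pi) for a density proportional to exp(G cos(Phi))."""
    return math.exp(2.0 * G)

# Same (rather strong) normalized moduli, increasing number of atoms in the cell.
for n_atoms in (50, 500, 5000, 50000):
    G = cochran_G(2.0, 2.0, 2.0, n_atoms)
    print(n_atoms, round(G, 3), round(peak_to_trough(G), 2))
```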

Sheldrick (1990[link]) suggested that direct methods are not expected to succeed if fewer than half of the reflections in the range 1.1–1.2 Å are observed with [|F|\gt 4\sigma(|F|)] (a condition seldom satisfied by protein data).

The most complete analysis of the problem has been made by Giacovazzo, Guagliardi et al. (1994[link]). They observed that the expected value of α (see Section 2.2.7[link]) suggested by the tangent formula for proteins is comparable with the variance of the α parameter. In other words, for proteins the signal determining the phase is comparable with the noise, and therefore the phase indication is expected to be unreliable.

Ab initio direct phasing of proteins

Section[link] suggests that the mere use of the tangent formula or the Sayre equation cannot solve ab initio protein structures of usual size. However, even in an ab initio situation, there is a source of supplementary information which may be used. Good examples are the `peaklist optimization' procedure (Sheldrick & Gould, 1995[link]) and the SIR97 procedure (Altomare et al., 1999[link]) for refining and completing the trial structure offered by the first E map.

In both cases there are reasons to suspect that the correct structure is sometimes extracted from a totally incorrect direct-methods solution. These results suggest that a direct-space procedure can provide some form of structural information complementary to that used in reciprocal space by the tangent or similar formulae. The combination of real- and reciprocal-space techniques could therefore enlarge the size of crystal structures solvable by direct methods. The first program to explicitly propose the combined use of direct and reciprocal space was Shake and Bake (SnB), which inspired a second package, half-bake (HB). A third program, SIR99, uses a different algorithm.

The SnB method (DeTitta et al., 1994[link]; Weeks et al., 1994[link]; Hauptman, 1995[link]) is the heir of the cosine least-squares method described in Section 2.2.8[link], point (4[link]). It is based on the function [R(\Phi)={\textstyle\sum_{j}G_j[\cos\Phi_j-D_1(G_j)]^2 \over \textstyle\sum_{j}G_j},] where [\Phi] is the triplet phase, [G= 2|E_{\bf h}E_{\bf k}E_{{\bf h}+{\bf k}}|/(N)^{1/2}] and [D_1(x)=I_1(x)/I_0(x)].

[R(\Phi)] is expected to have a global minimum, provided the number of phases involved is sufficiently large, when all the phases are equal to their true values for some choice of origin and enantiomorph. Thus the phasing problem reduces to that of finding the global minimum of [R(\Phi)] (the minimum principle).
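The function [R(\Phi)] is simple to evaluate once the triplets and their G values are available. The following is a minimal sketch, not the SnB implementation: the names `bessel_ratio` and `minimal_function` are hypothetical, and the Bessel functions are summed from their power series, which is adequate only for moderate arguments.

```python
import math

def bessel_ratio(x, terms=60):
    """D1(x) = I1(x)/I0(x), with both modified Bessel functions from power series."""
    half = x / 2.0
    term0 = 1.0    # (x/2)^(2m) / (m!)^2 at m = 0
    term1 = half   # (x/2)^(2m+1) / (m! (m+1)!) at m = 0
    i0, i1 = term0, term1
    for m in range(1, terms):
        term0 *= half * half / (m * m)
        term1 *= half * half / (m * (m + 1))
        i0 += term0
        i1 += term1
    return i1 / i0

def minimal_function(triplets):
    """R = sum_j G_j [cos(Phi_j) - D1(G_j)]^2 / sum_j G_j.

    triplets: sequence of (G_j, Phi_j) pairs, with Phi_j in radians.
    """
    triplets = list(triplets)
    num = sum(G * (math.cos(phi) - bessel_ratio(G)) ** 2 for G, phi in triplets)
    den = sum(G for G, _ in triplets)
    return num / den

# A triplet whose cosine equals its expected value contributes nothing to R.
G = 1.0
phi_expected = math.acos(bessel_ratio(G))
print(minimal_function([(G, phi_expected)]))
print(minimal_function([(G, 0.0), (2.0, math.pi)]))
```

In a phasing run the triplet phases [\Phi_j] are recomputed from the current trial phases at every cycle, and R is used to rank trials; that bookkeeping is omitted here.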

SnB comprises a shake step (phase refinement) and a bake step (electron-density modification), the second step aiming to impose phase constraints implicit in real space. Accordingly, the program requires two Fourier transforms per cycle, and numerous cycles. Thus it may be very time consuming and it is not competitive with other direct methods for the solution of the crystal structures of small molecules. However, it demonstrated the great usefulness of intensive computation for the direct solution of complex crystal structures.

HB, due to Sheldrick (1997[link]), does most of its work in direct space. Random atomic positions are generated, to which a modified peaklist optimization process is applied. A number of peaks are eliminated subject to the condition that [\textstyle\sum|E_c|(|E_0|^2-1)] remains as large as possible (only reflections with [|E_0|\gt|E_{\rm min}|] are involved, where [|E_{\rm min}|\simeq1.4]). The phases of a suitable subset of reflections are then used as input for a tangent expansion. Then an E map is calculated from which peaks are selected: these are submitted to the elimination procedure.

Typically 5–20 cycles of this internal loop are performed. Then a correlation coefficient (CC) between [|E_0|] and [|E_c|] is calculated for all the data. If the CC is good (i.e. larger than a given threshold), then a new loop is performed: a new E map is obtained, from which a list of peaks is selected for submission to the elimination procedure. The criterion now is the value of the CC, which is calculated for all the reflections. Typically two to five cycles of this external loop are performed.

The program works indefinitely, restarting from random atoms until interrupted. It may work either by applying the true space-group symmetry or after having expanded the data to P1.
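The two quantities that steer the HB loops, the elimination criterion [\textstyle\sum|E_c|(|E_0|^2-1)] and the correlation coefficient between observed and calculated normalized moduli, can be sketched as follows. This is an illustration under stated assumptions, not Sheldrick's code: the function names are hypothetical, and the correlation coefficient is taken to be the ordinary Pearson coefficient, which the text does not specify.

```python
import math

def elimination_score(e_obs, e_calc, e_min=1.4):
    """Sum of |E_c| (|E_o|^2 - 1) over reflections with |E_o| > e_min."""
    return sum(ec * (eo * eo - 1.0)
               for eo, ec in zip(e_obs, e_calc) if eo > e_min)

def correlation_coefficient(x, y):
    """Pearson correlation coefficient (assumed form of the CC) between two lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented moduli, for illustration only.
e_obs = [2.0, 1.5, 1.0]
e_calc = [1.0, 2.0, 5.0]
print(elimination_score(e_obs, e_calc))        # the third reflection is excluded
print(correlation_coefficient(e_obs, e_calc))
```

In the internal loop a peak would be removed when its removal keeps the score as large as possible; in the external loop the correlation coefficient over all data replaces the score as the criterion.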

The SIR99 procedure (Burla et al., 1999[link]) may be divided into two distinct parts: the tangent section (i.e., a double tangent process using triplet and quartet invariants) is followed by a real-space refinement procedure. As in SIR97, the reciprocal-space part is followed by real-space refinement, but here the real-space part is much more complex. It involves three different techniques: EDM (an electron-density modification process), the HAFR part (in which all the peaks are associated with the heaviest atomic species) and the DLSQ procedure (a least-squares Fourier refinement process). The atomicity is gradually introduced into the procedure. The entire process requires, for each trial, several cycles of EDM and HAFR: the real-space part is able to lead to the correct solution even when the tangent formula does not provide favourable phase values.

Integration of direct methods with isomorphous replacement techniques

The modulus of the isomorphous difference [\Delta F = |F_{PH}| - |F_{P}|] may be assumed, to a first approximation, as an estimate of the heavy-atom structure factor [F_{H}]. Normalization of the [|\Delta F|]'s and application of the tangent formula may reveal the heavy-atom structure (Wilson, 1978[link]).

The theoretical basis for integrating the techniques of direct methods and isomorphous replacement was introduced by Hauptman (1982a[link]). According to his notation let us denote by [f_{j}] and [g_{j}] atomic scattering factors for the atom labelled j in a pair of isomorphous structures, and let [E_{\bf h}] and [G_{\bf h}] denote corresponding normalized structure factors. Then [\eqalign{E_{\bf h} &= |E_{\bf h}| \exp (i\varphi_{\bf h}) = \alpha_{20}^{-1/2} \textstyle\sum\limits_{j=1}^{N} f_{j} \exp (2 \pi i {\bf h} \cdot {\bf r}_{j}),\cr G_{\bf h} &= |G_{\bf h}| \exp (i\psi_{\bf h}) = \alpha_{02}^{-1/2} \textstyle\sum\limits_{j=1}^{N} g_{j} \exp (2 \pi i {\bf h} \cdot {\bf r}_{j}),}] where [\alpha_{mn} = \textstyle\sum\limits_{j=1}^{N} f_{j}^{m} g_{j}^{n}.] The conditional probability of the two-phase structure invariant [\Phi = \varphi_{\bf h} - \psi_{\bf h}] given [|E_{\bf h}|] and [|G_{\bf h}|] is (Hauptman, 1982a[link]) [P(\Phi |\; |E|, |G|) \simeq [2\pi I_{0} (Q)]^{-1} \exp (Q \cos \Phi),] where [\eqalign{Q &= |EG| [2\alpha / (1 - \alpha^{2})],\cr \alpha &= \alpha_{11} / (\alpha_{20}^{1/2} \alpha_{02}^{1/2}).}]

Three-phase structure invariants were evaluated by considering that eight invariants exist for a given triple of indices h, k, l [({\bf h} + {\bf k} + {\bf l} = 0)]: [\eqalign{\Phi_{1} &= \varphi_{\bf h} + \varphi_{\bf k} + \varphi_{\bf l} \qquad \Phi_{2} = \varphi_{\bf h} + \varphi_{\bf k} + \psi_{\bf l}\cr \Phi_{3} &= \varphi_{\bf h} + \psi_{\bf k} + \varphi_{\bf l} \qquad \Phi_{4} = \psi_{\bf h} + \varphi_{\bf k} + \varphi_{\bf l}\cr \Phi_{5} &= \varphi_{\bf h} + \psi_{\bf k} + \psi_{\bf l} \qquad \Phi_{6} = \psi_{\bf h} + \varphi_{\bf k} + \psi_{\bf l}\cr \Phi_{7} &= \psi_{\bf h} + \psi_{\bf k} + \varphi_{\bf l} \qquad \Phi_{8} = \psi_{\bf h} + \psi_{\bf k} + \psi_{\bf l}.}] So, for the estimation of any [\Phi_{j}], the joint probability distribution [P (E_{\bf h}, E_{\bf k}, E_{\bf l}, G_{\bf h}, G_{\bf k}, G_{\bf l})] has to be studied, from which eight conditional probability densities can be obtained: [\eqalign{&P (\Phi_{j} |\; |E_{\bf h}|, |E_{\bf k}|, |E_{\bf l}|, |G_{\bf h}|, |G_{\bf k}|, |G_{\bf l}|)\cr &\quad \simeq [2 \pi I_{0} (Q_{j})]^{-1} \exp [Q_{j} \cos \Phi_{j}]}] for [j = 1, \ldots, 8].
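The two-phase formula lends itself to a short numerical sketch. The code below evaluates [\alpha_{mn}], α and Q from two lists of scattering factors and then the density [[2\pi I_{0}(Q)]^{-1}\exp(Q\cos\Phi)]; the function names and the toy scattering factors are our own assumptions, and [I_0] is summed from its power series. Note that identical structures (f = g) give α = 1, for which Q is not defined.

```python
import math

def alpha_mn(f, g, m, n):
    """alpha_mn = sum_j f_j^m g_j^n."""
    return sum(fj ** m * gj ** n for fj, gj in zip(f, g))

def two_phase_Q(e_mod, g_mod, f, g):
    """Q = |E G| * 2 alpha / (1 - alpha^2), alpha = alpha_11 / sqrt(alpha_20 alpha_02)."""
    alpha = alpha_mn(f, g, 1, 1) / math.sqrt(alpha_mn(f, g, 2, 0) * alpha_mn(f, g, 0, 2))
    return abs(e_mod * g_mod) * 2.0 * alpha / (1.0 - alpha * alpha)

def bessel_i0(x, terms=80):
    """Modified Bessel function I0 from its power series (moderate x only)."""
    half_sq = (x / 2.0) ** 2
    term = total = 1.0
    for m in range(1, terms):
        term *= half_sq / (m * m)
        total += term
    return total

def two_phase_density(phi, Q):
    """P(Phi) = exp(Q cos Phi) / (2 pi I0(Q))."""
    return math.exp(Q * math.cos(phi)) / (2.0 * math.pi * bessel_i0(Q))

# Toy pair of isomorphous structures: the second atom scatters three times more
# strongly in the derivative (invented values, for illustration only).
f = [1.0, 1.0]
g = [1.0, 3.0]
Q = two_phase_Q(1.0, 1.0, f, g)

# Numerical check that the density integrates to 1 over a full period.
steps = 3600
integral = sum(two_phase_density(2.0 * math.pi * i / steps, Q)
               for i in range(steps)) * 2.0 * math.pi / steps
print(Q, integral)
```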

The analytical expressions of [Q_{j}] are too intricate and are not given here (the reader is referred to the original paper). We only say that [Q_{j}] may be positive or negative, so that reliable triplet phase estimates near 0 or near π are possible: the larger [|Q_{j}|], the more reliable the phase estimate.

A useful interpretation of the formulae in terms of experimental parameters was suggested by Fortier et al. (1984[link]): according to them, distributions do not depend, as in the case of the traditional three-phase invariants, on the total number of atoms per unit cell but rather on the scattering difference between the native protein and the derivative (that is, on the scattering of the heavy atoms in the derivative).

Hauptman's formulae were generalized by Giacovazzo et al. (1988[link]): the new expressions were able to take into account the resolution effects on distribution parameters. The formulae are completely general and include as special cases native protein and heavy-atom isomorphous derivatives as well as X-ray and neutron diffraction data. Their complicated algebraic forms are easily reduced to a simple expression in the case of a native protein and one heavy-atom derivative: in particular, the reliability parameter for [\Phi_{1}] is [Q_{1} = 2[\sigma_{3} / \sigma_{2}^{3/2}]_{P} |E_{\bf h} E_{\bf k} E_{\bf l}| + 2[\sigma_{3} / \sigma_{2}^{3/2}]_{H} \Delta_{\bf h} \Delta_{\bf k} \Delta_{\bf l},] where indices P and H indicate that parameters have to be calculated over protein atoms and over heavy atoms, respectively, and [\Delta = (F_{PH} - F_{P}) / (\textstyle\sum f_{j}^{2})_{H}^{1/2}.] Δ is a pseudo-normalized difference (with respect to the heavy-atom structure) between moduli of structure factors.

Equation ([link]) may be compared with Karle's (1983[link]) qualitative rule: if the sign of [[(F_{\bf h})_{PH} - (F_{\bf h})_{P}] [(F_{\bf k})_{PH} - (F_{\bf k})_{P}] [(F_{\bf l})_{PH} - (F_{\bf l})_{P}]] is plus then the value of [\Phi_{1}] is estimated to be zero; if its sign is minus then the expected value of [\Phi_{1}] is close to π. In practice Karle's rule agrees with ([link]) only if the Cochran-type term in ([link]) may be neglected. Furthermore, ([link]) shows that large reliability values do not depend on the triple product of structure-factor differences, but on the triple product of pseudo-normalized differences. A series of papers (Giacovazzo, Siliqi & Ralph, 1994[link]; Giacovazzo, Siliqi & Spagna, 1994[link]; Giacovazzo, Siliqi & Platas, 1995[link]; Giacovazzo, Siliqi & Zanotti, 1995[link]; Giacovazzo et al., 1996[link]) shows how equation ([link]) may be implemented in a direct procedure which proved to be able to estimate the protein phases correctly without any preliminary information on the heavy-atom substructure.
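The comparison between the expression for [Q_1] and Karle's rule can be illustrated numerically. The sketch assumes, as is usual in this chapter but not restated in this section, that [\sigma_n = \textstyle\sum_j Z_j^n] with [Z_j] the atomic number; the function names and the toy composition (100 carbon-like atoms, one heavy atom) are hypothetical.

```python
import math

def sigma_ratio(z):
    """sigma_3 / sigma_2^(3/2), assuming sigma_n = sum_j Z_j^n."""
    s2 = sum(zj ** 2 for zj in z)
    s3 = sum(zj ** 3 for zj in z)
    return s3 / s2 ** 1.5

def q1(e_triple, delta_triple, z_protein, z_heavy):
    """Q1 = 2[s3/s2^1.5]_P |E_h E_k E_l| + 2[s3/s2^1.5]_H D_h D_k D_l."""
    e_prod = abs(e_triple[0] * e_triple[1] * e_triple[2])
    d_prod = delta_triple[0] * delta_triple[1] * delta_triple[2]
    return 2.0 * sigma_ratio(z_protein) * e_prod + 2.0 * sigma_ratio(z_heavy) * d_prod

def karle_estimate(diff_triple):
    """Karle's qualitative rule: 0 for a positive triple product, pi otherwise."""
    prod = diff_triple[0] * diff_triple[1] * diff_triple[2]
    return 0.0 if prod > 0 else math.pi

z_protein = [6] * 100   # toy 'protein' of 100 equal light atoms
z_heavy = [80]          # one heavy atom
e_triple = (2.0, 2.0, 2.0)

for deltas in [(-1.0, 1.0, 1.0), (-0.5, 1.0, 1.0)]:
    print(deltas, q1(e_triple, deltas, z_protein, z_heavy), karle_estimate(deltas))
```

In the first case [Q_1] is negative and both criteria point to a value near π. In the second the Cochran-type term dominates, [Q_1] is positive, and the two estimates disagree, which is the situation described in the text.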

Combination of direct methods with the two-derivative case is also possible (Fortier et al., 1984[link]) and leads to more accurate estimates of triplet invariants provided experimental data are of sufficient accuracy.

Integration of anomalous-dispersion techniques with direct methods

If the frequency of the radiation is close to an absorption edge of an atom, then that atom will scatter the X-rays anomalously (see Chapter 2.4[link] ) according to [f = f' + if'']. This results in the breakdown of Friedel's law. It was soon realized that the Bijvoet difference could also be used in the determination of phases (Peerdeman & Bijvoet, 1956[link]; Ramachandran & Raman, 1956[link]; Okaya & Pepinsky, 1956[link]). Since then, a great deal of work has been done both from algebraic (see Chapter 2.4[link] ) and from probabilistic points of view. In this section we are only interested in the second.

We will mention the following different cases:

  • (1) The OAS (one-wavelength anomalous scattering) case, also called SAS (single-wavelength anomalous scattering).

  • (2) The SIRAS (single isomorphous replacement combined with anomalous scattering) case. Typically, native protein and heavy-atom-derivative data are simultaneously available, with heavy atoms as anomalous scatterers.

  • (3) The MIRAS case, which generalizes the SIRAS case.

  • (4) The MAD case, a multiple-wavelength technique.

One-wavelength techniques

Probability distributions of diffraction intensities and of selected functions of diffraction intensities for dispersive structures have been given by various authors [Parthasarathy & Srinivasan (1964[link]), see also Srinivasan & Parthasarathy (1976[link]) and relevant literature cited therein]. We describe here some probabilistic formulae for estimating invariants of low order.

  • (a) Estimation of two-phase structure invariants. The conditional probability distribution of [\Phi = \varphi_{\bf h} + \varphi_{-{\bf h}}] given [R_{\bf h}] and [G_{\bf h}] (normalized moduli of [F_{\bf h}] and [F_{-{\bf h}}], respectively) (Hauptman, 1982b[link]; Giacovazzo, 1983b[link]) is [P(\Phi | R_{\bf h}, G_{\bf h}) \simeq [2\pi I_{0} (Q)]^{-1} \exp [Q \cos (\Phi - q)],] where [\eqalign{Q &= {2 R_{\bf h} G_{\bf h}\over \sqrt{c}} [c_{1}^{2} + c_{2}^{2}]^{1/2},\cr \cos q &= {c_{1}\over [c_{1}^{2} + c_{2}^{2}]^{1/2}}, \qquad \sin q = {c_{2}\over [c_{1}^{2} + c_{2}^{2}]^{1/2}},\cr c_{1} &= \textstyle\sum\limits_{j=1}^{N} ({f'_{j}}^{2} - {f''_{j}}^{2}) / \textstyle\sum,\cr c_{2} &= 2 \textstyle\sum\limits_{j=1}^{N} f'_{j} f''_{j} / \textstyle\sum,\cr c &= [1 - (c_{1}^{2} + c_{2}^{2})]^{2},\cr \textstyle\sum &= \textstyle\sum\limits_{j=1}^{N} ({\;f'_{j}}^{2} + {f''_{j}}^{2}).}] q is the most probable value of Φ: a large value of the parameter Q suggests that the phase relation [\Phi = q] is reliable. Large values of Q are often available in practice: q, however, may be considered an estimate of [|\Phi|] rather than of [\Phi] because the enantiomorph is not fixed in ([link]). A formula for the estimation of [\Phi] in centrosymmetric structures has been provided by Giacovazzo (1987[link]).

    If the positions of the p anomalous scatterers are known a priori [let [F_{p{\bf h}} = |F_{p{\bf h}}| \exp (i\varphi_{p{\bf h}})] be the structure factor of the partial structure], then an estimate of [\Phi' = \varphi_{\bf h} - \varphi_{p{\bf h}}] is given (Cascarano & Giacovazzo, 1985[link]) by [P (\Phi' |R_{\bf h}, R_{p{\bf h}}) \simeq [2 \pi I_{0} (Q')]^{-1} \exp [Q' \cos \Phi'],] where [Q' = 2R^{+} R_{p}^{+} / \left(1 - \textstyle\sum\limits_{p} / \textstyle\sum\right),\quad \textstyle\sum\limits_{p} = \textstyle\sum\limits_{j=1}^{p} ({\;f'_{j}}^{2} + {f''_{j}}^{2}).] ([link]) may be considered the generalization of Sim's distribution ([link]) to dispersive structures.

  • (b) Estimation of triplet invariants. Kroon et al. (1977[link]) first incorporated anomalous diffraction in order to estimate triplet invariants. Their work was based on an analysis of the complex double Patterson function. Subsequent probabilistic considerations (Heinermann et al., 1978[link]) confirmed their results, which can be expressed as [\sin \bar{\Phi} = {|\tau |^{2} - |\bar{\tau }|^{2}\over 4\tau '' [{1\over 2} (|\tau |^{2} + |\bar{\tau }|^{2}) - |\tau ''|^{2}]^{1/2}},] where [({\bf h} + {\bf k} + {\bf l} = 0)] [\eqalign{\tau &= E_{\bf h} E_{\bf k} E_{\bf l} = R_{\bf h} R_{\bf k} R_{\bf l} \exp (i \Phi_{{\bf h}, \,  {\bf k}}),\cr \bar{\tau } &= E_{-{\bf h}} E_{-{\bf k}} E_{-{\bf l}} = G_{\bf h} G_{\bf k} G_{\bf l} \exp (i \Phi_{\bar{\bf h}, \,   \bar{\bf k}}),\cr \bar{\Phi} &= {\textstyle{1\over 2}} (\Phi_{{\bf h}, \,  {\bf k}} - \Phi_{\bar{\bf h}, \,   \bar{\bf k}}),}] and [\tau ''] is the contribution of the imaginary part of τ, which may be approximated in favourable conditions by [\eqalign{\tau '' &= 2f'' [\;f'_{\bf h} \;f'_{\bf k} + f'_{\bf h} \;f'_{\bf l} + f'_{\bf k} \;f'_{\bf l}]\cr &\quad \times [1 + S (R_{\bf h}^{2} + R_{\bf k}^{2} + R_{\bf l}^{2} - 3)],}] where S is a suitable scale factor.

    Equation ([link]) gives two possible values for [\bar{\Phi}] (Φ and [\pi - \Phi]). Only if [R_{\bf h} R_{\bf k} R_{\overline{{\bf h} + {\bf k}}}] is large enough may this phase ambiguity be resolved by choosing the angle nearest to zero.

    The evaluation of triplet phases by means of anomalous dispersion has been further pursued by Hauptman (1982b[link]) and Giacovazzo (1983b[link]). Owing to the breakdown of Friedel's law there are eight distinct triplet invariants which can contemporaneously be exploited: [\eqalign{\Phi_{1} &= \varphi_{\bf h} + \varphi_{\bf k} + \varphi_{\bf l}, \qquad \quad \Phi_{2} = - \varphi_{- {\bf h}} + \varphi_{\bf k} + \varphi_{\bf l}\cr \Phi_{3} &= \varphi_{\bf h} - \varphi_{- {\bf k}} + \varphi_{\bf l}, \qquad\ \Phi_{4} = \varphi_{\bf h} + \varphi_{\bf k} - \varphi_{- {\bf l}}\cr \Phi_{5} &= \varphi_{- {\bf h}} + \varphi_{- {\bf k}} + \varphi_{- {\bf l}}, \quad\ \Phi_{6} = - \varphi_{\bf h} + \varphi_{- {\bf k}} + \varphi_{- {\bf l}}\cr \Phi_{7} &= \varphi_{- {\bf h}} - \varphi_{\bf k} + \varphi_{- {\bf l}}, \qquad \Phi_{8} = \varphi_{- {\bf h}} + \varphi_{- {\bf k}} - \varphi_{\bf l}.}] The conditional probability distribution for each of the eight triplet invariants, given [R_{\bf h}], [R_{\bf k}], [R_{\bf l}], [G_{\bf h}], [G_{\bf k}], [G_{\bf l}], is [P_{j} (\Phi_{j}) \simeq {1\over L_{j}} \exp [A_{j} \cos (\Phi_{j} - \omega_{j})].] The definitions of [A_{j}], [L_{j}] and [\omega_{j}] are rather extensive and so the reader is referred to the published papers. [A_{j}] and [L_{j}] are positive values, so [\omega_{j}] is the expected value of [\Phi_{j}]. It may lie anywhere between 0 and 2π.

    An algebraic analysis of triplet phase invariants coupled with probabilistic considerations has been carried out by Karle (1984[link], 1985[link]). The rules permit the qualitative selection of triplet phase invariants that have values close to [\pi/2], [-\pi/2], 0, and other values in the range from −π to π.
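The formulae quoted in (a) and (b) can be turned into a short numerical sketch. The first function evaluates Q and q for the two-phase invariant from lists of f' and f'' values; the second returns the two candidate values of [\bar{\Phi}] allowed by the expression for [\sin\bar{\Phi}]. Function names and input values are hypothetical, chosen only for illustration.

```python
import math

def oas_two_phase(r_mod, g_mod, f1, f2):
    """Return (Q, q) for Phi = phi_h + phi_-h from lists of f' (f1) and f'' (f2)."""
    total = sum(a * a + b * b for a, b in zip(f1, f2))
    c1 = sum(a * a - b * b for a, b in zip(f1, f2)) / total
    c2 = 2.0 * sum(a * b for a, b in zip(f1, f2)) / total
    modulus = math.hypot(c1, c2)                     # [c1^2 + c2^2]^(1/2)
    sqrt_c = abs(1.0 - (c1 * c1 + c2 * c2))          # sqrt(c), c = [1 - (c1^2 + c2^2)]^2
    # sqrt_c is zero (division error) when no atom scatters anomalously
    # or when all atoms are identical anomalous scatterers.
    Q = 2.0 * r_mod * g_mod * modulus / sqrt_c
    q = math.atan2(c2, c1)                           # cos q = c1/modulus, sin q = c2/modulus
    return Q, q

def kroon_candidates(tau_mod, tau_bar_mod, tau_pp):
    """Two candidate mean triplet phases from sin(Phi_bar); tau_pp stands for tau''."""
    num = tau_mod ** 2 - tau_bar_mod ** 2
    den = 4.0 * tau_pp * math.sqrt(0.5 * (tau_mod ** 2 + tau_bar_mod ** 2) - tau_pp ** 2)
    s = num / den
    if abs(s) > 1.0:
        raise ValueError("sin(Phi_bar) outside [-1, 1]: data inconsistent with the formula")
    phi = math.asin(s)
    return phi, math.pi - phi

# Three light atoms without anomalous scattering and one anomalous scatterer.
Q, q = oas_two_phase(1.5, 1.4, [6.0, 6.0, 6.0, 30.0], [0.0, 0.0, 0.0, 10.0])
phi_a, phi_b = kroon_candidates(2.0, 1.8, 0.2)
print(Q, q, phi_a, phi_b)
```

The choice between the two candidates returned by `kroon_candidates` is not made by the code; as stated above, it can be resolved in favour of the angle nearest to zero only when the product of the normalized moduli is large enough.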

Let us now describe some practical aspects of the integration of direct methods with OAS techniques.

Anomalous difference structure factors [\Delta_{\rm ano}=|F^+|-|F^-|] can be used for locating the positions of the anomalous scatterers (Mukherjee et al., 1989[link]). Tests prove that accuracy in the difference magnitudes is critical for the success of the phasing process.

Suppose now that the positions of the heavy atoms have been found. How do we estimate the phase values for the protein? The phase ambiguity strictly connected with OAS techniques can be overcome by different methods: we quote the Qs method by Hao & Woolfson (1989[link]), the Wilson distribution method and the MPS method by Ralph & Woolfson (1991[link]), and the Bijvoet–Ramachandran–Raman method by Peerdeman & Bijvoet (1956[link]), Raman (1959[link]) and Moncrief & Lipscomb (1966[link]). More recently, a probabilistic method by Fan & Gu (1985[link]) provided additional insight into the problem.

The SIRAS, MIRAS and MAD cases

Isomorphous replacement and anomalous scattering are discussed in Chapter 2.4[link] and in IT F (2001[link]). We observe here only that the SIRAS case can lead algebraically to unambiguous phase determination provided the experimental data are sufficiently good. Thus, any probabilistic treatment must take into consideration errors in the measurements.

In the MIRAS and MAD cases the system is overconditioned: again any probabilistic treatment must consider errors in the measurements, but now overconditioning allows the reduction of the perverse effects of the experimental errors and (in MIRAS) of the lack of isomorphism.

A particular application of extreme relevance concerns the location of anomalous scatterers when selenomethionine-substituted proteins and MAD data are available (Hendrickson & Ogata, 1997[link]; Smith, 1998[link]). In this case, many selenium sites must be identified and the usual Patterson-interpretation methods can be expected to fail. The successes of SnB and HB prove the essential role of direct methods in this important area.


Allegra, G. (1979). Derivation of three-phase invariants from the Patterson function. Acta Cryst. A35, 213–220.
Altomare, A., Burla, M. C., Camalli, M., Cascarano, G. L., Giacovazzo, C., Guagliardi, A., Moliterni, A. G. G., Polidori, G. & Spagna, R. (1999). SIR97: a new tool for crystal structure determination and refinement. J. Appl. Cryst. 32, 115–119.
Anzenhofer, K. & Hoppe, W. (1962). Phys. Verh. Mosbach. 13, 119.
Ardito, G., Cascarano, G., Giacovazzo, C. & Luić, M. (1985). 1-Phase seminvariants and Harker sections. Z. Kristallogr. 172, 25–34.
Argos, P. & Rossmann, M. G. (1980). Molecular replacement method. In Theory and practice of direct methods in crystallography, edited by M. F. C. Ladd & R. A. Palmer, pp. 381–389. New York: Plenum.
Avrami, M. (1938). Direct determination of crystal structure from X-ray data. Phys. Rev. 54, 300–303.
Baggio, R., Woolfson, M. M., Declercq, J.-P. & Germain, G. (1978). On the application of phase relationships to complex structures. XVI. A random approach to structure determination. Acta Cryst. A34, 883–892.
Banerjee, K. (1933). Determination of the signs of the Fourier terms in complete crystal structure analysis. Proc. R. Soc. London Ser. A, 141, 188–193.
Bertaut, E. F. (1955a). La méthode statistique en cristallographie. I. Acta Cryst. 8, 537–543.
Bertaut, E. F. (1955b). La méthode statistique en cristallographie. II. Quelques applications. Acta Cryst. 8, 544–548.
Bertaut, E. F. (1960). Ordre logarithmique des densités de répartition. I. Acta Cryst. 13, 546–552.
Beurskens, P. T., Beurskens, G., de Gelder, R., Garcia-Granda, S., Gould, R. O., Israel, R. & Smits, J. M. M. (1999). The DIRDIF-99 program system. Crystallography Laboratory, University of Nijmegen, The Netherlands.
Beurskens, P. T., Gould, R. O., Bruins Slot, H. J. & Bosman, W. P. (1987). Translation functions for the positioning of a well oriented molecular fragment. Z. Kristallogr. 179, 127–159.
Beurskens, P. T., Prick, A. J., Doesburg, H. M. & Gould, R. O. (1979). Statistical properties of normalized difference-structure factors for non-centrosymmetric structures. Acta Cryst. A35, 765–772.
Böhme, R. (1982). Direkte Methoden für Strukturen mit Uberstruktureffekten. Acta Cryst. A38, 318–326.
Bouman, J. (1956). A general theory of inequalities. Acta Cryst. 9, 777–780.
Bricogne, G. (1984). Maximum entropy and the foundation of direct methods. Acta Cryst. A40, 410–415.
Britten, P. L. & Collins, D. M. (1982). Information theory as a basis for the maximum determinant. Acta Cryst. A38, 129–132.
Buerger, M. J. (1959). Vector space and its applications in crystal structure investigation. New York: John Wiley.
Burla, M. C., Camalli, M., Carrozzini, B., Cascarano, G. L., Giacovazzo, C., Polidori, G. & Spagna, R. (1999). SIR99, a program for the automatic solution of small and large crystal structures. Acta Cryst. A55, 991–999.
Burla, M. C., Cascarano, G., Giacovazzo, C., Nunzi, A. & Polidori, G. (1987). A weighting scheme for tangent formula development. III. The weighting scheme of the SIR program. Acta Cryst. A43, 370–374.
Busetta, B. (1976). An example of the use of quartet and triplet structure invariants when enantiomorph discrimination is difficult. Acta Cryst. A32, 139–143.
Busetta, B., Giacovazzo, C., Burla, M. C., Nunzi, A., Polidori, G. & Viterbo, D. (1980). The SIR program. I. Use of negative quartets. Acta Cryst. A36, 68–74.
Camalli, M., Giacovazzo, C. & Spagna, R. (1985). From a partial to the complete crystal structure. II. The procedure and its applications. Acta Cryst. A41, 605–613.
Cascarano, G. & Giacovazzo, C. (1983). One-phase seminvariants of first rank. I. Algebraic considerations. Z. Kristallogr. 165, 169–174.
Cascarano, G. & Giacovazzo, C. (1985). One-wavelength technique: some probabilistic formulas using the anomalous dispersion effect. Acta Cryst. A41, 408–413.
Cascarano, G., Giacovazzo, C., Burla, M. C., Nunzi, A. & Polidori, G. (1984). The distribution of [\alpha_{\bf h}]. Acta Cryst. A40, 389–394.
Cascarano, G., Giacovazzo, C., Calabrese, G., Burla, M. C., Nunzi, A., Polidori, G. & Viterbo, D. (1984). One-phase seminvariants of first rank. II. Probabilistic considerations. Z. Kristallogr. 167, 37–47.
Cascarano, G., Giacovazzo, C., Camalli, M., Spagna, R., Burla, M. C., Nunzi, A. & Polidori, G. (1984). The method of representations of structure seminvariants. The strengthening of triplet relationships. Acta Cryst. A40, 278–283.
Cascarano, G., Giacovazzo, C. & Luić, M. (1985a). Non-crystallographic translational symmetry: effects on diffraction-intensity statistics. In Structure and statistics in crystallography, edited by A. J. C. Wilson, pp. 67–77. Guilderland, USA: Adenine Press.
Cascarano, G., Giacovazzo, C. & Luić, M. (1985b). Direct methods and superstructures. I. Effects of the pseudotranslation on the reciprocal space. Acta Cryst. A41, 544–551.
Cascarano, G., Giacovazzo, C. & Luić, M. (1987). Direct methods and structures showing superstructure effects. II. A probabilistic theory of triplet invariants. Acta Cryst. A43, 14–22.
Cascarano, G., Giacovazzo, C. & Luić, M. (1988a). Direct methods and structures showing superstructure effects. III. A general mathematical model. Acta Cryst. A44, 176–183.
Cascarano, G., Giacovazzo, C. & Luić, M. (1988b). Direct methods and structures showing superstructure effects. IV. A new approach for phase solution. Acta Cryst. A44, 183–188.
Cascarano, G., Giacovazzo, C., Luić, M., Pifferi, A. & Spagna, R. (1987). 1-Phase seminvariants and Harker sections. II. A new procedure. Z. Kristallogr. 179, 113–125.
Cascarano, G., Giacovazzo, C. & Viterbo, D. (1987). Figures of merit in direct methods: a new point of view. Acta Cryst. A43, 22–29.
Castellano, E. E., Podjarny, A. D. & Navaza, J. (1973). A multivariate joint probability distribution of phase determination. Acta Cryst. A29, 609–615.
Cochran, W. (1955). Relations between the phases of structure factors. Acta Cryst. 8, 473–478.
Cochran, W. & Douglas, A. S. (1957). The use of a high-speed digital computer for the direct determination of crystal structure. II. Proc. R. Soc. London Ser. A, 243, 281–288.
Cochran, W. & Woolfson, M. M. (1955). The theory of sign relations between structure factors. Acta Cryst. 8, 1–12.
Coulter, C. L. & Dewar, R. B. K. (1971). Tangent formula applications in protein crystallography: an evaluation. Acta Cryst. B27, 1730–1740.
Crowther, R. A. & Blow, D. M. (1967). A method of positioning of a known molecule in an unknown crystal structure. Acta Cryst. 23, 544–548.
Cutfield, J. F., Dodson, E. J., Dodson, G. G., Hodgkin, D. C., Isaacs, N. W., Sakabe, K. & Sakabe, N. (1975). The high resolution structure of insulin: a comparison of results obtained from least-squares phase refinement and difference Fourier refinement. Acta Cryst. A31, S21.
Debaerdemaeker, T., Tate, C. & Woolfson, M. M. (1985). On the application of phase relationships to complex structures. XXIV. The Sayre tangent formula. Acta Cryst. A41, 286–290.
Debaerdemaeker, T. & Woolfson, M. M. (1972). On the application of phase relationships to complex structures. IV. The coincidence method applied to general phases. Acta Cryst. A28, 477–481.
Debaerdemaeker, T. & Woolfson, M. M. (1983). On the application of phase relationships to complex structures. XXII. Techniques for random refinement. Acta Cryst. A39, 193–196.
Declercq, J.-P., Germain, G., Main, P. & Woolfson, M. M. (1973). On the application of phase relationships to complex structures. V. Finding the solution. Acta Cryst. A29, 231–234.
Declercq, J.-P., Germain, G. & Woolfson, M. M. (1975). On the application of phase relationships to complex structures. VIII. Extension of the magic-integer approach. Acta Cryst. A31, 367–372.
De Titta, G. T., Edmonds, J. W., Langs, D. A. & Hauptman, H. (1975). Use of the negative quartet cosine invariants as a phasing figure of merit: NQEST. Acta Cryst. A31, 472–479.
DeTitta, G. T., Weeks, C. M., Thuman, P., Miller, R. & Hauptman, H. A. (1994). Structure solution by minimal-function phase refinement and Fourier filtering. I. Theoretical basis. Acta Cryst. A50, 203–210.
Egert, E. (1983). Patterson search – an alternative to direct methods. Acta Cryst. A39, 936–940.
Egert, E. & Sheldrick, G. M. (1985). Search for a fragment of known geometry by integrated Patterson and direct methods. Acta Cryst. A41, 262–268.
Eller, G. von (1973). Génération de formules statistiques entre facteurs de structure: la méthode du polynome. Acta Cryst. A29, 63–67.
Fan, H.-F. (1999). Crystallographic software: teXsan for Windows.
Fan, H.-F. & Gu, Y.-X. (1985). Combining direct methods with isomorphous replacement or anomalous scattering data. III. The incorporation of partial structure information. Acta Cryst. A41, 280–284.
Fan, H.-F., Hao, Q. & Woolfson, M. M. (1991). Proteins and direct methods. Z. Kristallogr. 197, 197–208.
Fan, H.-F., Yao, J.-X., Main, P. & Woolfson, M. M. (1983). On the application of phase relationships to complex structures. XXIII. Automatic determination of crystal structures having pseudo-translational symmetry by a modified MULTAN procedure. Acta Cryst. A39, 566–569.
Fortier, S. & Hauptman, H. (1977). Quintets in [P\bar{1}]: probabilistic theory of the five-phase structure invariant in the space group [P\bar{1}]. Acta Cryst. A33, 829–833.
Fortier, S., Weeks, C. M. & Hauptman, H. (1984). On integrating the techniques of direct methods and isomorphous replacement. III. The three-phase invariant for the native and two-derivative case. Acta Cryst. A40, 646–651.
Freer, A. A. & Gilmore, C. J. (1980). The use of higher invariants in MULTAN. Acta Cryst. A36, 470–475.
French, S. & Wilson, K. (1978). On the treatment of negative intensity observations. Acta Cryst. A34, 517–525.
Gelder, R. de (1992). Thesis. University of Leiden, The Netherlands.
Gelder, R. de, de Graaff, R. A. G. & Schenk, H. (1990). On the construction of Karle–Hauptman matrices. Acta Cryst. A46, 688–692.
Gelder, R. de, de Graaff, R. A. G. & Schenk, H. (1993). Automatic determination of crystal structures using Karle–Hauptman matrices. Acta Cryst. A49, 287–293.
Germain, G., Main, P. & Woolfson, M. M. (1970). On the application of phase relationships to complex structures. II. Getting a good start. Acta Cryst. B26, 274–285.
Germain, G., Main, P. & Woolfson, M. M. (1971). The application of phase relationships to complex structures. III. The optimum use of phase relationships. Acta Cryst. A27, 368–376.
Giacovazzo, C. (1974). A new scheme for seminvariant tables in all space groups. Acta Cryst. A30, 390–395.
Giacovazzo, C. (1975). A probabilistic theory in [P\bar{1}] of the invariant [E_{\bf h} E_{\bf k} E_{\bf l} E_{{\bf h}+{\bf k}+{\bf l}}]. Acta Cryst. A31, 252–259.
Giacovazzo, C. (1976). A probabilistic theory of the cosine invariant [\cos (\varphi_{\bf h} + \varphi_{\bf k} + \varphi_{\bf l} - \varphi_{{\bf h}+{\bf k}+{\bf l}})]. Acta Cryst. A32, 91–99.
Giacovazzo, C. (1977a). A general approach to phase relationships: the method of representations. Acta Cryst. A33, 933–944.
Giacovazzo, C. (1977b). Strengthening of the triplet relationships. II. A new probabilistic approach in [P\bar{1}]. Acta Cryst. A33, 527–531.
Giacovazzo, C. (1977c). On different probabilistic approaches to quartet theory. Acta Cryst. A33, 50–54.
Giacovazzo, C. (1977d). Quintets in [P\bar{1}] and related phase relationships: a probabilistic approach. Acta Cryst. A33, 944–948.
Giacovazzo, C. (1977e). A probabilistic theory of the coincidence method. I. Centrosymmetric space groups. Acta Cryst. A33, 531–538.
Giacovazzo, C. (1977f). A probabilistic theory of the coincidence method. II. Non-centrosymmetric space groups. Acta Cryst. A33, 539–547.
Giacovazzo, C. (1978). The estimation of the one-phase structure seminvariants of first rank by means of their first and second representation. Acta Cryst. A34, 562–574.
Giacovazzo, C. (1979a). A probabilistic theory of two-phase seminvariants of first rank via the method of representations. III. Acta Cryst. A35, 296–305.
Giacovazzo, C. (1979b). A theoretical weighting scheme for tangent-formula development and refinement and Fourier synthesis. Acta Cryst. A35, 757–764.
Giacovazzo, C. (1980a). Direct methods in crystallography. London: Academic Press.
Giacovazzo, C. (1980b). The method of representations of structure seminvariants. II. New theoretical and practical aspects. Acta Cryst. A36, 362–372.
Giacovazzo, C. (1980c). Triplet and quartet relations: their use in direct procedures. Acta Cryst. A36, 74–82.
Giacovazzo, C. (1983a). From a partial to the complete crystal structure. Acta Cryst. A39, 685–692.
Giacovazzo, C. (1983b). The estimation of two-phase invariants in [P\bar{1}] when anomalous scatterers are present. Acta Cryst. A39, 585–592.
Giacovazzo, C. (1987). One wavelength technique: estimation of centrosymmetrical two-phase invariants in dispersive structures. Acta Cryst. A43, 73–75.
Giacovazzo, C. (1988a). New probabilistic formulas for finding the positions of correctly oriented atomic groups. Acta Cryst. A44, 294–300.
Giacovazzo, C. (1988b). Direct phasing in crystallography. New York: IUCr, Oxford University Press.
Giacovazzo, C., Cascarano, G. & Zheng, C.-D. (1988). On integrating the techniques of direct methods and isomorphous replacement. A new probabilistic formula for triplet invariants. Acta Cryst. A44, 45–51.
Giacovazzo, C., Guagliardi, A., Ravelli, R. & Siliqi, D. (1994). Ab initio direct phasing of proteins: the limits. Z. Kristallogr. 209, 136–142.
Giacovazzo, C., Siliqi, D. & Platas, J. G. (1995). The ab initio crystal structure solution of proteins by direct methods. V. A new normalizing procedure. Acta Cryst. A51, 811–820.
Giacovazzo, C., Siliqi, D., Platas, J. G., Hecht, H.-J., Zanotti, G. & York, B. (1996). The ab initio crystal structure solution of proteins by direct methods. VI. Complete phasing up to derivative resolution. Acta Cryst. D52, 813–825.
Giacovazzo, C., Siliqi, D. & Ralph, A. (1994). The ab initio crystal structure solution of proteins by direct methods. I. Feasibility. Acta Cryst. A50, 503–510.
Giacovazzo, C., Siliqi, D. & Spagna, R. (1994). The ab initio crystal structure solution of proteins by direct methods. II. The procedure and its first applications. Acta Cryst. A50, 609–621.
Giacovazzo, C., Siliqi, D. & Zanotti, G. (1995). The ab initio crystal structure solution of proteins by direct methods. III. The phase extension process. Acta Cryst. A51, 177–188.
Gillis, J. (1948). Structure factor relations and phase determination. Acta Cryst. 1, 76–80.
Gilmore, C. J. (1984). MITHRIL. An integrated direct-methods computer program. J. Appl. Cryst. 17, 42–46.
Goedkoop, J. A. (1950). Remarks on the theory of phase limiting inequalities and equalities. Acta Cryst. 3, 374–378.
Gramlich, V. (1984). The influence of rational dependence on the probability distribution of structure factors. Acta Cryst. A40, 610–616.
Grant, D. F., Howells, R. G. & Rogers, D. (1957). A method for the systematic application of sign relations. Acta Cryst. 10, 489–497.
Hall, S. R., du Boulay, D. J. & Olthof-Hazekamp, R. (1999). Xtal3.6 crystallographic software.
Hao, Q. & Woolfson, M. M. (1989). Application of the Ps-function method to macromolecular structure determination. Acta Cryst. A45, 794–797.
Harker, D. & Kasper, J. S. (1948). Phases of Fourier coefficients directly from crystal diffraction data. Acta Cryst. 1, 70–75.
Hauptman, H. (1964). The role of molecular structure in the direct determination of phase. Acta Cryst. 17, 1421–1433.
Hauptman, H. (1965). The average value of [\exp \{2 \pi i ({\bf h} \cdot {\bf r} + {\bf h}' \cdot {\bf r}')\}]. Z. Kristallogr. 121, 1–8.
Hauptman, H. (1970). Communication at New Orleans Meeting of Am. Crystallogr. Assoc.
Hauptman, H. (1974). On the identity and estimation of those cosine invariants, [\cos (\varphi_{\bf m} + \varphi_{\bf n} + \varphi_{\bf p} + \varphi_{\bf q})], which are probably negative. Acta Cryst. A30, 472–476.
Hauptman, H. (1975). A new method in the probabilistic theory of the structure invariants. Acta Cryst. A31, 680–687.
Hauptman, H. (1982a). On integrating the techniques of direct methods and isomorphous replacement. I. The theoretical basis. Acta Cryst. A38, 289–294.
Hauptman, H. (1982b). On integrating the techniques of direct methods with anomalous dispersion. I. The theoretical basis. Acta Cryst. A38, 632–641.
Hauptman, H. (1995). Looking ahead. Acta Cryst. B51, 416–422.
Hauptman, H., Fisher, J., Hancock, H. & Norton, D. A. (1969). Phase determination for the estriol structure. Acta Cryst. B25, 811–814.
Hauptman, H. & Green, E. A. (1976). Conditional probability distributions of the four-phase structure invariant [\varphi_{\bf h} + \varphi_{\bf k} + \varphi_{\bf l} + \varphi_{\bf m}] in [P\bar{1}]. Acta Cryst. A32, 45–49.
Hauptman, H. & Green, E. A. (1978). Pairs in [P2_1]: probability distributions which lead to estimates of the two-phase structure seminvariants in the vicinity of π/2. Acta Cryst. A34, 224–229.
Hauptman, H. & Karle, J. (1953). Solution of the phase problem. I. The centrosymmetric crystal. Am. Crystallogr. Assoc. Monograph No. 3. Dayton, Ohio: Polycrystal Book Service.
Hauptman, H. & Karle, J. (1956). Structure invariants and seminvariants for non-centrosymmetric space groups. Acta Cryst. 9, 45–55.
Hauptman, H. & Karle, J. (1958). Phase determination from new joint probability distributions: space group [P\bar{1}]. Acta Cryst. 11, 149–157.
Hauptman, H. & Karle, J. (1959). Table 2. Equivalence classes, seminvariant vectors and seminvariant moduli for the centered centrosymmetric space groups, referred to a primitive unit cell. Acta Cryst. 12, 93–97.
Heinermann, J. J. L. (1977a). The use of structural information in the phase probability of a triple product. Acta Cryst. A33, 100–106.
Heinermann, J. J. L. (1977b). Thesis. University of Utrecht.
Heinermann, J. J. L., Krabbendam, H. & Kroon, J. (1979). The joint probability distribution of the structure factors in a Karle–Hauptman matrix. Acta Cryst. A35, 101–105.
Heinermann, J. J. L., Krabbendam, H., Kroon, J. & Spek, A. L. (1978). Direct phase determination of triple products from Bijvoet inequalities. II. A probabilistic approach. Acta Cryst. A34, 447–450.
Hendrickson, W. A., Love, W. E. & Karle, J. (1973). Crystal structure analysis of sea lamprey hemoglobin at 2 Å resolution. J. Mol. Biol. 74, 331–361.
Hendrickson, W. A. & Ogata, C. M. (1997). Phase determination from multiwavelength anomalous diffraction measurements. Methods Enzymol. 276, 494–523.
Hoppe, W. (1963). Phase determination and zero points in the Patterson function. Acta Cryst. 16, 1056–1057.
Hughes, E. W. (1953). The signs of products of structure factors. Acta Cryst. 6, 871.
Hull, S. E. & Irwin, M. J. (1978). On the application of phase relationships to complex structures. XIV. The additional use of statistical information in tangent-formula refinement. Acta Cryst. A34, 863–870.
Hull, S. E., Viterbo, D., Woolfson, M. M. & Shao-Hui, Z. (1981). On the application of phase relationships to complex structures. XIX. Magic-integer representation of a large set of phases: the MAGEX procedure. Acta Cryst. A37, 566–572.
International Tables for Crystallography (2001). Vol. F. Macromolecular crystallography, edited by M. G. Rossmann & E. Arnold. Dordrecht: Kluwer Academic Publishers.
Jaynes, E. T. (1957). Information theory and statistical mechanics. Phys. Rev. 106, 620–630.
Karle, J. (1970a). An alternative form for [B_{3.0}], a phase determining formula. Acta Cryst. B26, 1614–1617.
Karle, J. (1970b). Partial structures and use of the tangent formula and translation functions. In Crystallographic computing, pp. 155–164. Copenhagen: Munksgaard.
Karle, J. (1972). Translation functions and direct methods. Acta Cryst. B28, 820–824.
Karle, J. (1979). Triple phase invariants: formula for centric case from fourth-order determinantal joint probability distributions. Proc. Natl Acad. Sci. USA, 76, 2089–2093.
Karle, J. (1980). Triplet phase invariants: formula for acentric case from fourth-order determinantal joint probability distributions. Proc. Natl Acad. Sci. USA, 77, 5–9.
Karle, J. (1983). A simple rule for finding and distinguishing triplet phase invariants with values near 0 or π with isomorphous replacement data. Acta Cryst. A39, 800–805.
Karle, J. (1984). Rules for evaluating triplet phase invariants by use of anomalous dispersion data. Acta Cryst. A40, 4–11.
Karle, J. (1985). Many algebraic formulas for the evaluation of triplet phase invariants from isomorphous replacement and anomalous dispersion data. Acta Cryst. A41, 182–189.
Karle, J. & Hauptman, H. (1950). The phases and magnitudes of the structure factors. Acta Cryst. 3, 181–187.
Karle, J. & Hauptman, H. (1956). A theory of phase determination for the four types of non-centrosymmetric space groups [1P 222{\it ,}\, 2P 22{\it ,} \, 3P_{1}2{\it ,} \, 3P_{2}2]. Acta Cryst. 9, 635–651.
Karle, J. & Hauptman, H. (1958). Phase determination from new joint probability distributions: space group [P1]. Acta Cryst. 11, 264–269.
Karle, J. & Hauptman, H. (1961). Seminvariants for non-centrosymmetric space groups with conventional centered cells. Acta Cryst. 14, 217–223.
Karle, J. & Karle, I. L. (1966). The symbolic addition procedure for phase determination for centrosymmetric and non-centrosymmetric crystals. Acta Cryst. 21, 849–859.
Klug, A. (1958). Joint probability distributions of structure factors and the phase problem. Acta Cryst. 11, 515–543.
Koch, M. H. J. (1974). On the application of phase relationships to complex structures. IV. Automatic interpretation of electron-density maps for organic structures. Acta Cryst. A30, 67–70.
Krabbendam, H. & Kroon, J. (1971). A relation between structure factor, triple products and a single Patterson vector, and its application to sign determination. Acta Cryst. A27, 362–367.
Kroon, J., Spek, A. L. & Krabbendam, H. (1977). Direct phase determination of triple products from Bijvoet inequalities. Acta Cryst. A33, 382–385.
Lajzérowicz, J. & Lajzérowicz, J. (1966). Loi de distribution des facteurs de structure pour une répartition non uniforme des atomes. Acta Cryst. 21, 8–12.
Langs, D. A. (1985). Translation functions: the elimination of structure-dependent spurious maxima. Acta Cryst. A41, 305–308.
Lessinger, L. & Wondratschek, H. (1975). Seminvariants for space groups [I\bar{4}2m] and [I\bar{4}2d]. Acta Cryst. A31, 521.
Mackay, A. L. (1953). A statistical treatment of superlattice reflexions. Acta Cryst. 6, 214–215.
Main, P. (1976). Recent developments in the MULTAN system. The use of molecular structure. In Crystallographic computing techniques, edited by F. R. Ahmed, pp. 97–105. Copenhagen: Munksgaard.
Main, P. (1977). On the application of phase relationships to complex structures. XI. A theory of magic integers. Acta Cryst. A33, 750–757.
Main, P., Fiske, S. J., Germain, G., Hull, S. E., Declercq, J.-P., Lessinger, L. & Woolfson, M. M. (1999). Crystallographic software: teXsan for Windows.
Main, P. & Hull, S. E. (1978). The recognition of molecular fragments in E maps and electron density maps. Acta Cryst. A34, 353–361.
Moncrief, J. W. & Lipscomb, W. N. (1966). Structure of leurocristine methiodide dihydrate by anomalous scattering methods; relation to leurocristine (vincristine) and vincaleukoblastine (vinblastine). Acta Cryst. 21, 322–331.
Mukherjee, A. K., Helliwell, J. R. & Main, P. (1989). The use of MULTAN to locate the positions of anomalous scatterers. Acta Cryst. A45, 715–718.
Narayan, R. & Nityananda, R. (1982). The maximum determinant method and the maximum entropy method. Acta Cryst. A38, 122–128.
Navaza, J. (1985). On the maximum-entropy estimate of the electron density function. Acta Cryst. A41, 232–244.
Navaza, J., Castellano, E. E. & Tsoucaris, G. (1983). Constrained density modifications by variational techniques. Acta Cryst. A39, 622–631.
Naya, S., Nitta, I. & Oda, T. (1964). A study on the statistical method for determination of signs of structure factors. Acta Cryst. 17, 421–433.
Naya, S., Nitta, I. & Oda, T. (1965). Affinement tridimensional du sulfanilamide β. Acta Cryst. 19, 734–747.
Nordman, C. E. (1985). Introduction to Patterson search methods. In Crystallographic computing 3. Data collection, structure determination, proteins and databases, edited by G. M. Sheldrick, G. Kruger & R. Goddard, pp. 232–244. Oxford: Clarendon Press.
Oda, T., Naya, S. & Taguchi, I. (1961). Matrix theoretical derivation of inequalities. II. Acta Cryst. 14, 456–458.
Okaya, Y. & Nitta, I. (1952). Linear structure factor inequalities and the application to the structure determination of tetragonal ethylenediamine sulphate. Acta Cryst. 5, 564–570.
Okaya, Y. & Pepinsky, R. (1956). New formulation and solution of the phase problem in X-ray analysis of non-centric crystals containing anomalous scatterers. Phys. Rev. 103, 1645–1647.
Ott, H. (1927). Zur Methodik der Strukturanalyse. Z. Kristallogr. 66, 136–153.
Parthasarathy, S. & Srinivasan, R. (1964). The probability distribution of Bijvoet differences. Acta Cryst. 17, 1400–1407.
Peerdeman, A. F. & Bijvoet, J. M. (1956). The indexing of reflexions in investigations involving the use of the anomalous scattering effect. Acta Cryst. 9, 1012–1015.
Piro, O. E. (1983). Information theory and the phase problem in crystallography. Acta Cryst. A39, 61–68.
Podjarny, A. D., Schevitz, R. W. & Sigler, P. B. (1981). Phasing low-resolution macromolecular structure factors by matricial direct methods. Acta Cryst. A37, 662–668.
Podjarny, A. D., Yonath, A. & Traub, W. (1976). Application of multivariate distribution theory to phase extension for a crystalline protein. Acta Cryst. A32, 281–292.
Rae, A. D. (1977). The use of structure factors to find the origin of an oriented molecular fragment. Acta Cryst. A33, 423–425.
Ralph, A. C. & Woolfson, M. M. (1991). On the application of one-wavelength anomalous scattering. III. The Wilson-distribution and MPS methods. Acta Cryst. A47, 533–537.
Ramachandran, G. N. & Raman, S. (1956). A new method for the structure analysis of non-centrosymmetric crystals. Curr. Sci. (India), 25, 348.
Raman, S. (1959). Syntheses for the deconvolution of the Patterson function. Part II. Detailed theory for non-centrosymmetric crystals. Acta Cryst. 12, 964–975.
Rango, C. de (1969). Thesis. Paris.
Rango, C. de, Mauguen, Y. & Tsoucaris, G. (1975). Use of high-order probability laws in phase refinement and extension of protein structures. Acta Cryst. A31, 227–233.
Rango, C. de, Mauguen, Y., Tsoucaris, G., Dodson, E. J., Dodson, G. G. & Taylor, D. J. (1985). The extension and refinement of the 1.9 Å spacing isomorphous phases to 1.5 Å spacing in 2Zn insulin by determinantal methods. Acta Cryst. A41, 3–17.
Rango, C. de, Tsoucaris, G. & Zelwer, C. (1974). Phase determination from the Karle–Hauptman determinant. II. Connexion between inequalities and probabilities. Acta Cryst. A30, 342–353.
Rogers, D., Stanley, E. & Wilson, A. J. C. (1955). The probability distribution of intensities. VI. The influence of intensity errors on the statistical tests. Acta Cryst. 8, 383–393.
Rogers, D. & Wilson, A. J. C. (1953). The probability distribution of X-ray intensities. V. A note on some hypersymmetric distributions. Acta Cryst. 6, 439–449.
Rossmann, M. G., Blow, D. M., Harding, M. M. & Coller, E. (1964). The relative positions of independent molecules within the same asymmetric unit. Acta Cryst. 17, 338–342.
Sayre, D. (1952). The squaring method: a new method for phase determination. Acta Cryst. 5, 60–65.
Sayre, D. (1953). Double Patterson function. Acta Cryst. 6, 430–431.
Sayre, D. (1972). On least-squares refinement of the phases of crystallographic structure factors. Acta Cryst. A28, 210–212.
Sayre, D. & Toupin, R. (1975). Major increase in speed of least-squares phase refinement. Acta Cryst. A31, S20.
Schenk, H. (1973a). Direct structure determination in [P1] and other non-centrosymmetric symmorphic space groups. Acta Cryst. A29, 480–481.
Schenk, H. (1973b). The use of phase relationships between quartets of reflexions. Acta Cryst. A29, 77–82.
Sheldrick, G. M. (1990). Phase annealing in SHELX-90: direct methods for larger structures. Acta Cryst. A46, 467–473.
Sheldrick, G. M. (1997). In Direct methods for solving macromolecular structures. NATO Advanced Study Institute, Erice, Italy.
Sheldrick, G. M. (2000a). The SHELX home page.
Sheldrick, G. M. (2000b). SHELX.
Sheldrick, G. M. & Gould, R. O. (1995). Structure solution by iterative peaklist optimization and tangent expansion in space group [P1]. Acta Cryst. B51, 423–431.
Sim, G. A. (1959). The distribution of phase angles for structures containing heavy atoms. II. A modification of the normal heavy-atoms method for non-centrosymmetrical structures. Acta Cryst. 12, 813–815.
Simerska, M. (1956). Czech. J. Phys. 6, 1.
Simonov, V. I. & Weissberg, A. M. (1970). Calculation of the signs of structure amplitudes by a binary function section of interatomic vectors. Sov. Phys. Dokl. 15, 321–323. [Translated from Dokl. Akad. Nauk SSSR, 191, 1050–1052.]
Sint, L. & Schenk, H. (1975). Phase extension and refinement in non-centrosymmetric structures containing large molecules. Acta Cryst. A31, S22.
Smith, J. L. (1998). Multiwavelength anomalous diffraction in macromolecular crystallography. In Direct methods for solving macromolecular structures, edited by S. Fortier, pp. 221–225. Dordrecht: Kluwer Academic Publishers.
Srinivasan, R. & Parthasarathy, S. (1976). Some statistical applications in X-ray crystallography. Oxford: Pergamon Press.
Taylor, D. J., Woolfson, M. M. & Main, P. (1978). On the application of phase relationships to complex structures. XV. Magic determinants. Acta Cryst. A34, 870–883.
Tsoucaris, G. (1970). A new method for phase determination. The maximum determinant rule. Acta Cryst. A26, 492–499.
Van der Putten, N. & Schenk, H. (1977). On the conditional probability of quintets. Acta Cryst. A33, 856–858.
Vaughan, P. A. (1958). A phase-determining procedure related to the vector-coincidence method. Acta Cryst. 11, 111–115.
Vermin, W. J. & de Graaff, R. A. G. (1978). The use of Karle–Hauptman determinants in small-structure determinations. Acta Cryst. A34, 892–894.
Vicković, I. & Viterbo, D. (1979). A simple statistical treatment of unobserved reflexions. Application to two organic substances. Acta Cryst. A35, 500–501.
Weeks, C. M., DeTitta, G. T., Hauptman, H. A., Thuman, P. & Miller, R. (1994). Structure solution by minimal-function phase refinement and Fourier filtering. II. Implementation and applications. Acta Cryst. A50, 210–220.
Weeks, C. M. & Miller, R. (1999). The design and implementation of SnB version 2.0. J. Appl. Cryst. 32, 120–124.
Weinzierl, J. E., Eisenberg, D. & Dickerson, R. E. (1969). Refinement of protein phases with the Karle–Hauptman tangent formula. Acta Cryst. B25, 380–387.
White, P. & Woolfson, M. M. (1975). The application of phase relationships to complex structures. VII. Magic integers. Acta Cryst. A31, 53–56.
Wilkins, S. W., Varghese, J. N. & Lehmann, M. S. (1983). Statistical geometry. I. A self-consistent approach to the crystallographic inversion problem based on information theory. Acta Cryst. A39, 47–60.
Wilson, A. J. C. (1942). Determination of absolute from relative X-ray intensity data. Nature (London), 150, 151–152.
Wilson, K. S. (1978). The application of MULTAN to the analysis of isomorphous derivatives in protein crystallography. Acta Cryst. B34, 1599–1608.
Wolff, P. M. de & Bouman, J. (1954). A fundamental set of structure factor inequalities. Acta Cryst. 7, 328–333.
Woolfson, M. M. (1958). Crystal and molecular structure of p,p′-dimethoxybenzophenone by the direct probability method. Acta Cryst. 11, 277–283.
Woolfson, M. M. (1977). On the application of phase relationships to complex structures. X. MAGLIN – a successor to MULTAN. Acta Cryst. A33, 219–225.
Woolfson, M. & Fan, H.-F. (1995). Physical and non-physical methods of solving crystal structures. Cambridge University Press.
Yao, J.-X. (1981). On the application of phase relationships to complex structures. XVIII. RANTAN – random MULTAN. Acta Cryst. A37, 642–644.