International Tables for Crystallography
Volume B: Reciprocal space
Edited by U. Shmueli

International Tables for Crystallography (2010). Vol. B, ch. 2.2, pp. 232-234

Section 2.2.8. Other multisolution methods applied to small molecules

C. Giacovazzo*

Dipartimento Geomineralogico, Campus Universitario, 70125 Bari, Italy, and Institute of Crystallography, Via G. Amendola, 122/O, 70125 Bari, Italy
Correspondence e-mail: carmelo.giacovazzo@ic.cnr.it

In very complex structures a large initial set of known phases seems to be a basic requirement for a successful structure determination. Such a set can be obtained, for example, by introducing a large number of permutable phases into the starting set. However, every new symbol implies a fourfold increase in computing time, which, even on fast computers, quickly becomes prohibitive. On the other hand, a relatively large starting set is not in itself enough to ensure a successful structure determination: this is the case, for example, when the triplet invariants used in the initial steps differ significantly from zero. New strategies have therefore been devised to solve more complex structures.

  • (1) Magic-integer methods

    In the classical procedure described in Section 2.2.7, the unknown phases in the starting set are assigned all combinations of the values [\pm \pi / 4, \pm 3 \pi / 4]. For n unknown phases in the starting set, [4^{n}] sets of phases arise by quadrant permutation, a number that increases very rapidly with n. According to White & Woolfson (1975), the n phases can instead be represented through a sequence of n integers [m_{i}] by the equations [\varphi_{i} = m_{i}x \ (\hbox{mod } 2 \pi), \quad i = 1, \ldots, n. \eqno(2.2.8.1)] The set of equations can be regarded as the parametric equation of a straight line in n-dimensional phase space. The nature and size of the errors connected with magic-integer representations have been investigated by Main (1977), who also gave a recipe for deriving magic-integer sequences which minimize the r.m.s. errors in the represented phases (see Table 2.2.8.1). To assign phase values, the variable x in equation (2.2.8.1) is given a series of values at equal intervals in the range [0\,\lt\, x\,\lt \,2 \pi]. The enantiomorph is defined by exploring only the appropriate half of the n-dimensional space.

    Table 2.2.8.1
    Magic-integer sequences for small numbers of phases (n) together with the number of sets produced and the root-mean-square error in the phases

    n   Sequence                   No. of sets   R.m.s. error (°)
    1   1                          4             26
    2   2 3                        12            29
    3   3 4 5                      20            37
    4   5 7 8 9                    32            42
    5   8 11 13 14 15              50            45
    6   13 18 21 23 24 25          80            47
    7   21 29 34 37 39 40 41       128           48
    8   34 47 55 60 63 65 66 67    206           49
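
As an illustration of equation (2.2.8.1), the following minimal Python sketch generates phase sets from a magic-integer sequence by stepping x at equal intervals. The function names are hypothetical, and the half-step offset used to keep x strictly inside the open range is an assumption of this sketch, not part of the published recipe.

```python
import math

TWO_PI = 2.0 * math.pi

def magic_integer_phases(magic_integers, x):
    """Return the phases phi_i = m_i * x (mod 2*pi), in radians."""
    return [(m * x) % TWO_PI for m in magic_integers]

def sample_phase_sets(magic_integers, n_sets):
    """Give x a series of n_sets equally spaced values and build one phase set per value.

    Assumption: a half-step offset keeps x strictly inside (0, 2*pi).
    """
    step = TWO_PI / n_sets
    return [
        magic_integer_phases(magic_integers, (k + 0.5) * step)
        for k in range(n_sets)
    ]

# The n = 3 row of Table 2.2.8.1: sequence 3 4 5, 20 sets.
sequence = [3, 4, 5]
phase_sets = sample_phase_sets(sequence, 20)

# Quadrant permutation of the same three phases would need 4**3 sets.
quadrant_sets = 4 ** len(sequence)
```

For three phases the sketch produces 20 trial sets instead of the 64 required by quadrant permutation, at the price of the r.m.s. phase error listed in the table.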

    A different way of using the magic-integer method (Declercq et al., 1975) is the primary–secondary (P–S) method, which may be described schematically in the following way:

    • (a) Origin- and enantiomorph-fixing phases are chosen and some one-phase s.s.'s are estimated.

    • (b) Nine phases [this is only an example: very long magic-integer sequences may be used to represent primary phases (Hull et al., 1981; Debaerdemaeker & Woolfson, 1983)] are represented by the approximate relationships [\cases{\varphi_{i_{1}} = 3 x\cr \varphi_{i_{2}} = 4 x\cr \varphi_{i_{3}} = 5 x\cr}\qquad \cases{\varphi_{j_{1}} = 3 y\cr \varphi_{j_{2}} = 4 y\cr \varphi_{j_{3}} = 5 y\cr}\qquad \cases{\varphi_{p_{1}} = 3 z\cr \varphi_{p_{2}} = 4 z\cr \varphi_{p_{3}} = 5 z.\cr}] The phases in (a) and (b) constitute the primary set.

    • (c) The phases in the secondary set are those defined through [\textstyle\sum_{2}] relationships involving pairs of phases from the primary set: they, too, can be expressed in magic-integer form.

    • (d) All the triplets that link together the phases in the combined primary and secondary set are now found, other than triplets used to obtain secondary reflections from the primary ones. The general algebraic form of these triplets will be [m_{1}x + m_{2}y + m_{3}z + b \equiv 0\ (\hbox{mod } 1),] where b is a phase constant which arises from symmetry translation. It may be expected that the `best' values of the unknowns x, y, z correspond to a maximum of the function [\psi (x, y, z) = \textstyle\sum |E_{1} E_{2} E_{3}| \cos 2 \pi (m_{1}x + m_{2}y + m_{3}z + b),] with [0\leq x, y, z\,\lt\, 1]. Note that ψ is a Fourier summation which can easily be evaluated. In fact, ψ is essentially a figure of merit for a large number of phases evaluated in terms of a small number of magic-integer variables and gives a measure of the internal consistency of the [\textstyle\sum_{2}] relationships. The ψ map generally presents several peaks and can therefore provide several solutions for the variables.
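
Step (d) can be sketched as a direct evaluation of ψ on a grid. The triplet list below is invented purely for illustration (each entry is a hypothetical weight |E1 E2 E3| with integers m1, m2, m3 and constant b), and the brute-force summation stands in for the fast Fourier evaluation that would presumably be used in practice.

```python
import math
from itertools import product

def psi(triplets, x, y, z):
    """psi(x, y, z) = sum |E1 E2 E3| cos 2*pi*(m1*x + m2*y + m3*z + b), variables in cycles."""
    return sum(
        weight * math.cos(2.0 * math.pi * (m1 * x + m2 * y + m3 * z + b))
        for weight, m1, m2, m3, b in triplets
    )

def psi_peaks(triplets, grid=16, n_peaks=3):
    """Evaluate psi on a grid over [0, 1)^3 and return the n_peaks highest grid points."""
    points = [
        (psi(triplets, i / grid, j / grid, k / grid), i / grid, j / grid, k / grid)
        for i, j, k in product(range(grid), repeat=3)
    ]
    points.sort(reverse=True)
    return points[:n_peaks]

# Hypothetical triplets (weight, m1, m2, m3, b), all satisfied at (0.25, 0.5, 0.75).
toy_triplets = [
    (2.5, 1, 0, 0, 0.75),
    (2.0, 0, 1, 0, 0.50),
    (1.8, 0, 0, 1, 0.25),
    (1.5, 1, 1, 1, 0.50),
]
best_value, best_x, best_y, best_z = psi_peaks(toy_triplets)[0]
```

Returning several peaks rather than only the highest one mirrors the remark that the ψ map can provide several candidate solutions.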

  • (2) The random-start method

    These are procedures which try to solve crystal structures by starting from random initial phases (Baggio et al., 1978; Yao, 1981). They may be described as follows:

    • (a) A number of reflections (say NUM ∼ 100 or larger) at the bottom of the CONVERGE map are selected. These, and the relationships which link them, form the system for which trial phases will be found.

    • (b) A pseudo-random number generator is used to generate M sets of NUM random phases. Each of the M sets is refined and extended by the tangent formula or similar methods.
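
Steps (a) and (b) can be sketched as follows. This is a toy illustration under stated assumptions, not the RANTAN algorithm itself: reflections are plain list indices, each hypothetical relationship (h, k, l, weight) stands for φ_h ≈ φ_k + φ_l, an unweighted-by-reliability tangent pass is used for refinement, and trial sets are ranked by a simple sum of weighted cosines chosen here for convenience.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def random_phase_sets(num_reflections, num_sets, seed=0):
    """Generate num_sets sets of num_reflections pseudo-random phases in [0, 2*pi)."""
    rng = random.Random(seed)
    return [
        [rng.uniform(0.0, TWO_PI) for _ in range(num_reflections)]
        for _ in range(num_sets)
    ]

def tangent_pass(phases, relationships):
    """One tangent-formula pass over all reflections, using the previous cycle's phases."""
    sin_sum = [0.0] * len(phases)
    cos_sum = [0.0] * len(phases)
    for h, k, l, weight in relationships:
        sin_sum[h] += weight * math.sin(phases[k] + phases[l])
        cos_sum[h] += weight * math.cos(phases[k] + phases[l])
    refined = []
    for h, old_phase in enumerate(phases):
        if sin_sum[h] == 0.0 and cos_sum[h] == 0.0:
            refined.append(old_phase)  # no contributors: keep the current value
        else:
            refined.append(math.atan2(sin_sum[h], cos_sum[h]) % TWO_PI)
    return refined

def consistency(phases, relationships):
    """Hypothetical ranking score: sum of weight * cos(phi_h - phi_k - phi_l)."""
    return sum(
        weight * math.cos(phases[h] - phases[k] - phases[l])
        for h, k, l, weight in relationships
    )

def random_start(num_reflections, relationships, num_sets=20, cycles=30, seed=0):
    """Refine every random trial set and return (score, phases) pairs, best first."""
    results = []
    for phases in random_phase_sets(num_reflections, num_sets, seed):
        for _ in range(cycles):
            phases = tangent_pass(phases, relationships)
        results.append((consistency(phases, relationships), phases))
    results.sort(key=lambda item: item[0], reverse=True)
    return results

# Invented relationships among five reflections, for illustration only.
toy_relationships = [(2, 0, 1, 2.0), (3, 1, 2, 1.5), (4, 0, 3, 1.8), (4, 2, 2, 1.2)]
ranked = random_start(5, toy_relationships, num_sets=8)
```

The toy relationships form a simple chain, so the example says nothing about convergence behaviour on real data; it only shows how M trial sets are generated, refined and ranked.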

  • (3) Accurate calculation of s.i.'s and s.s.'s with 1, 2, 3, 4, …, n phases

    Having a large set of good phase relationships allows one to overcome difficulties in the early stages and in the refinement process of the phasing procedure. Accurate estimates of s.i.'s and s.s.'s may be achieved by the application of techniques such as the representation method or the neighbourhood principle (Hauptman, 1975; Giacovazzo, 1977a, 1980b). So far, second-representation formulae are available for triplets and one-phase seminvariants; in particular, reliably estimated negative triplets can be recognized, which is of great help in the phasing process (Cascarano, Giacovazzo, Camalli et al., 1984). Estimation of higher-order s.s.'s with upper representations or upper neighbourhoods is rather difficult, both because the procedures are time consuming and because the efficiency of the present joint probability distribution techniques deteriorates with complexity. However, further progress can be expected in the field.

  • (4) Modified tangent formulae and least-squares determination and refinement of phases

    The problem of deriving the individual phase angles from triplet relationships is greatly overdetermined: the number of triplets greatly exceeds the number of phases, so that any [\varphi_{\bf h}] may be determined by a least-squares approach (Hauptman et al., 1969). The function to be minimized may be [{M} = {{\textstyle\sum_{\bf k}} w_{\bf k}[\cos (\varphi_{\bf h} - \varphi_{\bf k} - \varphi_{{\bf h}-{\bf k}}) - C_{\bf k}]^{2}\over {\textstyle\sum_{\bf k}} w_{\bf k}},] where [C_{\bf k}] is the estimate of the cosine obtained by probabilistic or other methods.
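
The function M can be evaluated directly for a single reflection. In the sketch below the contributor tuples (weight, φ_k, φ_{h−k}, C_k) are invented, and a coarse grid search over φ_h is used only as a simple stand-in for a proper least-squares minimization.

```python
import math

def cosine_residual(phi_h, contributors):
    """M = sum_k w_k [cos(phi_h - phi_k - phi_hk) - C_k]^2 / sum_k w_k (phases in radians)."""
    numerator = sum(
        w * (math.cos(phi_h - phi_k - phi_hk) - c_est) ** 2
        for w, phi_k, phi_hk, c_est in contributors
    )
    return numerator / sum(w for w, _, _, _ in contributors)

def best_phase(contributors, steps=360):
    """Grid search for the phi_h in [0, 2*pi) that minimizes M (assumption: 1 degree steps)."""
    candidates = [2.0 * math.pi * i / steps for i in range(steps)]
    return min(candidates, key=lambda phi: cosine_residual(phi, contributors))

# Hypothetical contributors: both pairs sum to pi/2 and both cosines are estimated as 1.
toy_contributors = [
    (2.0, math.pi / 6, math.pi / 3, 1.0),
    (1.0, math.pi / 4, math.pi / 4, 1.0),
]
phi_best = best_phase(toy_contributors)
```

With cosine estimates smaller than 1 the residual has two symmetric minima per contributor, so several contributors are needed before a single phase value stands out.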

    Effective least-squares procedures based on linear equations (Debaerdemaeker & Woolfson, 1983; Woolfson, 1977) can also be used. A triplet relationship is usually represented by [(\varphi_{p} \pm \varphi_{q} \pm \varphi_{r} + b) \approx 0\ (\hbox{mod } 2 \pi), \eqno(2.2.8.2)] where b is a constant arising from translational symmetry. If (2.2.8.2) is expressed in cycles and suitably weighted, then it may be written as [w (\varphi_{p} \pm \varphi_{q} \pm \varphi_{r} + b) = wn,] where n is some integer. If the integers were known then the set of equations would appear (in matrix notation) as [{\bi A}\boldPhi = {\bi C}, \eqno(2.2.8.3)] giving the least-squares solution [{\boldPhi } = ({\bi A}^{T}{\bi A})^{-1} {\bi A}^{T}{\bi C}. \eqno(2.2.8.4)] When approximate phases are available, the nearest integers may be found and equations (2.2.8.3) and (2.2.8.4) constitute the basis for further refinement.
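
Equations (2.2.8.3) and (2.2.8.4) can be illustrated with a small pure-Python sketch. Everything below is a toy under stated assumptions: phases are in cycles, each hypothetical equation is a tuple (signed coefficients, b, weight), the constants b are synthesized so that an invented set of 'true' phases satisfies every relationship exactly, and the normal equations are solved by plain Gaussian elimination.

```python
def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

def least_squares_phases(equations, approx_phases):
    """Refine phases (cycles) from weighted rows w*(sum c_i*phi_i + b) = w*n."""
    size = len(approx_phases)
    rows, targets = [], []
    for coeffs, b, weight in equations:
        total = sum(c * phi for c, phi in zip(coeffs, approx_phases)) + b
        n = round(total)  # nearest integer, found from the approximate phases
        rows.append([weight * c for c in coeffs])
        targets.append(weight * (n - b))
    # Normal equations (A^T A) Phi = A^T C.
    normal = [[sum(row[i] * row[j] for row in rows) for j in range(size)] for i in range(size)]
    rhs = [sum(row[i] * t for row, t in zip(rows, targets)) for i in range(size)]
    return solve_linear(normal, rhs)

# Synthetic data: invented 'true' phases, sign patterns and weights.
true_phases = [0.10, 0.30, 0.40, 0.70]
sign_rows = [(1, 1, -1, 0), (0, 1, 1, -1), (1, 0, 1, 1), (1, -1, 0, 1), (1, 1, 0, -1)]
weights = [2.0, 1.5, 1.8, 1.2, 1.0]
equations = [
    (list(c), (-sum(ci * t for ci, t in zip(c, true_phases))) % 1.0, w)
    for c, w in zip(sign_rows, weights)
]
approx = [t + d for t, d in zip(true_phases, [0.04, -0.03, 0.05, -0.02])]
refined = least_squares_phases(equations, approx)
```

The sketch assumes that the matrix A^T A is non-singular; with real data this requires, among other things, that the origin has been fixed.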

    Modified tangent procedures are also used, such as (Sint & Schenk, 1975; Busetta, 1976) [\tan \varphi_{\bf h} \simeq {{\textstyle\sum_{j}} G_{{\bf h}, \, {\bf k}_{j}} \sin (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h}-{\bf k}_{j}} - \Delta_{j})\over {\textstyle\sum_{j}} G_{{\bf h}, \, {\bf k}_{j}} \cos (\varphi_{{\bf k}_{j}} + \varphi_{{\bf h}-{\bf k}_{j}} - \Delta_{j})},] where [\Delta_{j}] is an estimate for the triplet phase sum [(\varphi_{\bf h} - \varphi_{{\bf k}_{j}} - \varphi_{{\bf h}-{\bf k}_{j}})].

  • (5) Techniques based on the positivity of Karle–Hauptman determinants

    (The main formulae have been briefly described in Section 2.2.5.7.) The maximum determinant rule has been applied to solve small structures (de Rango, 1969; Vermin & de Graaff, 1978) via determinants of small order. It has, however, been found (Taylor et al., 1978) that their use is not of sufficient power to justify the larger amount of computing time required by the technique as compared with that required by the tangent formula.

  • (6) Tangent techniques using simultaneously triplets, quartets,…

    The availability of a large number of phase relationships, in particular during the first stages of a direct procedure, makes the phasing process easier. However, quartets are sums of two triplets with a common reflection. If the phase of this reflection (and/or of the other cross terms) is known then the quartet probability formulae described in Section 2.2.5.5 cannot hold. Similar considerations may be made for quintet relationships. Thus triplet, quartet and quintet formulae described in the preceding paragraphs, if used without modifications, will certainly introduce systematic errors in the tangent refinement process.

    A method which takes into account correlation between triplets and quartets has been described (Giacovazzo, 1980c) [see also Freer & Gilmore (1980) for a first application], according to which [\tan \varphi_{\bf h} \simeq {{\textstyle\sum\limits_{\bf k}} G \sin (\varphi_{\bf k} + \varphi_{{\bf h}-{\bf k}}) - {\textstyle\sum\limits_{{{\bf k}, \, {\bf l}}}} G' \sin (\varphi_{\bf k} + \varphi_{\bf l} + \varphi_{{\bf h}-{\bf k}-{\bf l}})\over {\textstyle\sum\limits_{\bf k}} G \cos (\varphi_{\bf k} + \varphi_{{\bf h}-{\bf k}}) - {\textstyle\sum\limits_{{\bf k}, \, {\bf l}}} G' \cos (\varphi_{\bf k} + \varphi_{\bf l} + \varphi_{{\bf h}-{\bf k}-{\bf l}})},] where G′ takes into account both the magnitudes of the cross terms of the quartet and the fact that their phases may be known.
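
The combined formula can be transcribed almost literally. In this sketch the weights G and G′ are simply supplied by the caller as numbers (how G′ is derived from the cross terms is not reproduced here), and the tuple layouts and function name are hypothetical.

```python
import math

def tangent_triplets_quartets(triplet_terms, quartet_terms):
    """Estimate phi_h (radians) from triplet and quartet contributions.

    triplet_terms: iterable of (G, phi_k, phi_h_minus_k)
    quartet_terms: iterable of (G_prime, phi_k, phi_l, phi_h_minus_k_minus_l)
    """
    numerator = sum(g * math.sin(pk + phk) for g, pk, phk in triplet_terms) - sum(
        gq * math.sin(pk + pl + phkl) for gq, pk, pl, phkl in quartet_terms
    )
    denominator = sum(g * math.cos(pk + phk) for g, pk, phk in triplet_terms) - sum(
        gq * math.cos(pk + pl + phkl) for gq, pk, pl, phkl in quartet_terms
    )
    return math.atan2(numerator, denominator) % (2.0 * math.pi)

# Invented contributions: both triplets indicate 1.0 rad; the quartet phase sum is 1.0 + pi.
toy_triplet_terms = [(2.0, 0.3, 0.7), (1.5, 0.6, 0.4)]
toy_quartet_terms = [(0.5, 0.2, 0.3, 0.5 + math.pi)]
phi_h = tangent_triplets_quartets(toy_triplet_terms, toy_quartet_terms)
```

Because the quartet sum enters with a minus sign, a quartet term whose phase sum is shifted by π reinforces the triplet indication in this example.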

  • (7) Integration of Patterson techniques and direct methods (Egert & Sheldrick, 1985) [see also Egert (1983) and references therein]

    A fragment of known geometry is oriented in the unit cell by a real-space Patterson rotation search (see Chapter 2.3) and its position is found by application of a translation function (see Section 2.2.5.4 and Chapter 2.3) or by maximizing the weighted sum of the cosines of a small number of strong translation-sensitive triple phase invariants, starting from random positions. Suitable figures of merit (FOMs) rank the most reliable solutions.

  • (8) Maximum entropy methods

    A common starting point for all direct methods is a stochastic process according to which crystal structures are thought of as being generated by randomly placing atoms in the asymmetric unit of the unit cell according to some a priori distribution. A non-uniform prior distribution of atoms p(r) gives rise to a source of random atomic positions with entropy (Jaynes, 1957) [H(p) = - \textstyle\int\limits_{V} p({\bf r}) \log p({\bf r}) \;\hbox{d}{\bf r}.] The maximum value [H_{\max} = \log V] is reached for a uniform prior [p({\bf r}) = 1/V].

    The strength of the restrictions introduced by p(r) is not measured by [H(p)] but by [H(p) - H_{\max}], given by [H(p) - H_{\max} = - \textstyle\int\limits_{V} p({\bf r}) \log [\;p({\bf r})/m({\bf r})] \;\hbox{d}{\bf r},] where [m({\bf r}) = 1/V]. Accordingly, if a prior prejudice m(r) exists, which maximizes H, the revised relative entropy is [S(p) = - \textstyle\int\limits_{V} p({\bf r}) \log [\;p({\bf r})/m({\bf r})] \;\hbox{d}{\bf r}.] The maximization problem was solved by Jaynes (1957). If [G_{j}(p)] are linear constraint functionals defined by given constraint functions [C_{j}({\bf r})] and constraint values [c_{j}], i.e. [G_{j}(p) = \textstyle\int\limits_{V} p({\bf r})C_{j}({\bf r}) \;\hbox{d}{\bf r} = c_{j},] the most unbiased probability density p(r) under prior prejudice m(r) is obtained by maximizing the entropy of p(r) relative to m(r). A standard variational technique suggests that the constrained maximization is equivalent to the unconstrained maximization of the functional [S(p) + \textstyle\sum\limits_{j} \lambda_{j}G_{j}(p),] where the [\lambda_{j}]'s are Lagrange multipliers whose values can be determined from the constraints.
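
A discretized sketch may help to make the variational statement concrete. Setting the variation of the functional to zero (with normalization of p imposed) gives the standard exponential form p(r) ∝ m(r) exp[Σ λ_j C_j(r)]. The example below assumes a one-dimensional cell divided into 64 bins, a uniform prior, a single invented constraint function C(r) = cos 2πr with target value 0.3, and a bisection search for the multiplier; none of these choices comes from the cited papers.

```python
import math

def maxent_distribution(prior, constraint, lam):
    """Discrete p_i proportional to m_i * exp(lam * C_i), normalized to sum to 1."""
    weights = [m * math.exp(lam * c) for m, c in zip(prior, constraint)]
    total = sum(weights)
    return [w / total for w in weights]

def expectation(p, constraint):
    """Discrete analogue of the constraint functional G(p) = integral of p * C."""
    return sum(pi * ci for pi, ci in zip(p, constraint))

def solve_multiplier(prior, constraint, target, lo=-50.0, hi=50.0, iterations=100):
    """Find lam by bisection; the expectation of C increases monotonically with lam."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expectation(maxent_distribution(prior, constraint, mid), constraint) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_entropy(p, prior):
    """S(p) = -sum p_i log(p_i / m_i); zero for p equal to the prior, negative otherwise."""
    return -sum(pi * math.log(pi / mi) for pi, mi in zip(p, prior) if pi > 0.0)

n_bins = 64
prior = [1.0 / n_bins] * n_bins
constraint = [math.cos(2.0 * math.pi * i / n_bins) for i in range(n_bins)]
lam = solve_multiplier(prior, constraint, target=0.3)
p = maxent_distribution(prior, constraint, lam)
entropy_loss = relative_entropy(p, prior)
```

With several constraints the one-dimensional bisection would have to be replaced by a multidimensional solver for the λ_j, which is where most of the practical difficulty lies.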

    Such a technique has been applied to the problem of finding good electron-density maps in different ways by various authors (Wilkins et al., 1983; Bricogne, 1984; Navaza, 1985; Navaza et al., 1983).

    Maximum entropy methods are strictly connected with traditional direct methods. In particular, it has been shown that:

    • (a) the maximum determinant rule (see Section 2.2.5.7) is strictly connected with the maximum-entropy principle (Britten & Collins, 1982; Piro, 1983; Narayan & Nityananda, 1982; Bricogne, 1984);

    • (b) the construction of conditional probability distributions of structure factors amounts precisely to a reciprocal-space evaluation of the entropy functional [S(p)] (Bricogne, 1984).

    Maximum entropy methods are under strong development: important contributions can be expected in the near future even if a multipurpose robust program has not yet been written.

References

Baggio, R., Woolfson, M. M., Declercq, J.-P. & Germain, G. (1978). On the application of phase relationships to complex structures. XVI. A random approach to structure determination. Acta Cryst. A34, 883–892.
Bricogne, G. (1984). Maximum entropy and the foundation of direct methods. Acta Cryst. A40, 410–415.
Britten, P. L. & Collins, D. M. (1982). Information theory as a basis for the maximum determinant. Acta Cryst. A38, 129–132.
Busetta, B. (1976). An example of the use of quartet and triplet structure invariants when enantiomorph discrimination is difficult. Acta Cryst. A32, 139–143.
Cascarano, G., Giacovazzo, C., Camalli, M., Spagna, R., Burla, M. C., Nunzi, A. & Polidori, G. (1984). The method of representations of structure seminvariants. The strengthening of triplet relationships. Acta Cryst. A40, 278–283.
Debaerdemaeker, T. & Woolfson, M. M. (1983). On the application of phase relationships to complex structures. XXII. Techniques for random refinement. Acta Cryst. A39, 193–196.
Declercq, J.-P., Germain, G. & Woolfson, M. M. (1975). On the application of phase relationships to complex structures. VIII. Extension of the magic-integer approach. Acta Cryst. A31, 367–372.
Egert, E. (1983). Patterson search – an alternative to direct methods. Acta Cryst. A39, 936–940.
Egert, E. & Sheldrick, G. M. (1985). Search for a fragment of known geometry by integrated Patterson and direct methods. Acta Cryst. A41, 262–268.
Freer, A. A. & Gilmore, C. J. (1980). The use of higher invariants in MULTAN. Acta Cryst. A36, 470–475.
Giacovazzo, C. (1977a). A general approach to phase relationships: the method of representations. Acta Cryst. A33, 933–944.
Giacovazzo, C. (1980b). The method of representations of structure seminvariants. II. New theoretical and practical aspects. Acta Cryst. A36, 362–372.
Giacovazzo, C. (1980c). Triplet and quartet relations: their use in direct procedures. Acta Cryst. A36, 74–82.
Hauptman, H. (1975). A new method in the probabilistic theory of the structure invariants. Acta Cryst. A31, 680–687.
Hauptman, H., Fisher, J., Hancock, H. & Norton, D. A. (1969). Phase determination for the estriol structure. Acta Cryst. B25, 811–814.
Hull, S. E., Viterbo, D., Woolfson, M. M. & Shao-Hui, Z. (1981). On the application of phase relationships to complex structures. XIX. Magic-integer representation of a large set of phases: the MAGEX procedure. Acta Cryst. A37, 566–572.
Jaynes, E. T. (1957). Information theory and statistical mechanics. Phys. Rev. 106, 620–630.
Main, P. (1977). On the application of phase relationships to complex structures. XI. A theory of magic integers. Acta Cryst. A33, 750–757.
Narayan, R. & Nityananda, R. (1982). The maximum determinant method and the maximum entropy method. Acta Cryst. A38, 122–128.
Navaza, J. (1985). On the maximum-entropy estimate of the electron density function. Acta Cryst. A41, 232–244.
Navaza, J., Castellano, E. E. & Tsoucaris, G. (1983). Constrained density modifications by variational techniques. Acta Cryst. A39, 622–631.
Piro, O. E. (1983). Information theory and the phase problem in crystallography. Acta Cryst. A39, 61–68.
Rango, C. de (1969). Thesis. Paris.
Sint, L. & Schenk, H. (1975). Phase extension and refinement in non-centrosymmetric structures containing large molecules. Acta Cryst. A31, S22.
Taylor, D. J., Woolfson, M. M. & Main, P. (1978). On the application of phase relationships to complex structures. XV. Magic determinants. Acta Cryst. A34, 870–883.
Vermin, W. J. & de Graaff, R. A. G. (1978). The use of Karle–Hauptman determinants in small-structure determinations. Acta Cryst. A34, 892–894.
White, P. & Woolfson, M. M. (1975). The application of phase relationships to complex structures. VII. Magic integers. Acta Cryst. A31, 53–56.
Wilkins, S. W., Varghese, J. N. & Lehmann, M. S. (1983). Statistical geometry. I. A self-consistent approach to the crystallographic inversion problem based on information theory. Acta Cryst. A39, 47–60.
Woolfson, M. M. (1977). On the application of phase relationships to complex structures. X. MAGLIN – a successor to MULTAN. Acta Cryst. A33, 219–225.
Yao, J.-X. (1981). On the application of phase relationships to complex structures. XVIII. RANTAN – random MULTAN. Acta Cryst. A37, 642–664.