International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by M. G. Rossmann and E. Arnold. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. F, ch. 15.2, pp. 325–331
https://doi.org/10.1107/97809553602060000688
Chapter 15.2. Model phases: probabilities, bias and maps
^{a}Department of Haematology, University of Cambridge, Wellcome Trust Centre for Molecular Mechanisms in Disease, CIMR, Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 2XY, England
The optimal use of model phase information requires an estimate of its reliability, specifically the probability that various values of the phase angle are true. This chapter covers the importance of phase in model bias; structure-factor probability relationships; figure-of-merit weighting for model phases; map coefficients to reduce model bias; difference-map coefficients; refinement bias; and maximum-likelihood structure refinement.
The intensities of X-ray diffraction spots measured from a crystal give us only the amplitudes of the diffracted waves. To reconstruct a map of the electron density in the crystal, the unmeasured phase information is also required. In fact, the phases are much more important to the appearance of the map than the measured amplitudes. When phases are supplied by an atomic model, therefore, some degree of model bias is inevitable.
The optimal use of model phase information requires an estimate of its reliability, specifically the probability that various values of the phase angle are true. Such a probability distribution can be derived, starting first with the relationship between the structure factor (amplitude and phase) of the model and that of the true crystal structure. The phase probability distribution can then be obtained from this and used, for instance, to provide a figure-of-merit weighting that minimizes the r.m.s. error from the true electron density.
Even with figure-of-merit weighting, model-phased electron density is biased towards the model. The systematic bias component of model-phased map coefficients can be predicted, allowing the derivation of map coefficients that give electron-density maps with reduced model bias. With the help of a few simple assumptions, a correction for bias can also be made when different sources of phase information are combined.
Finally, the refinement of a model against the observed amplitudes allows a certain amount of overfitting of the data, which leads to an extra 'refinement bias'. Fortunately, the use of appropriate refinement strategies, including maximum-likelihood targets, can reduce the severity of this problem.
Dramatic illustrations of the importance of the phase have been published. For instance, Ramachandran & Srinivasan (1961) calculated an electron-density map using phases from one structure and amplitudes from another. In this map there are peaks at the positions of the atoms in the structure that contributed the phase information, but not in the structure that contributed the amplitudes. Similar calculations with two-dimensional Fourier transforms of photographs (Oppenheim & Lim, 1981; Read, 1997) show that the phases of one completely overwhelm the amplitudes of the other.
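The amplitude/phase swap described above can be imitated in one dimension. The following is a minimal sketch, not the published calculation: it assumes two hypothetical random 'structures' of equal point atoms on a 256-point grid, and all function names are illustrative. It combines the amplitudes of one structure with the phases of the other and compares the resulting map with both parents.

```python
import cmath
import math
import random


def dft(density):
    """Naive discrete Fourier transform of a real 1D density (O(n^2))."""
    n = len(density)
    return [
        sum(rho * cmath.exp(-2j * math.pi * h * x / n) for x, rho in enumerate(density))
        for h in range(n)
    ]


def inverse_dft_real(coeffs):
    """Inverse transform, keeping the real part of the synthesized density."""
    n = len(coeffs)
    return [
        sum(f * cmath.exp(2j * math.pi * h * x / n) for h, f in enumerate(coeffs)).real / n
        for x in range(n)
    ]


def random_density(n_grid, n_atoms, rng):
    """Place equal point atoms on distinct, randomly chosen grid points."""
    density = [0.0] * n_grid
    for position in rng.sample(range(n_grid), n_atoms):
        density[position] = 1.0
    return density


def correlation(a, b):
    """Pearson correlation coefficient between two densities."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def hybrid_map(amplitude_source, phase_source):
    """Combine amplitudes from one density with phases from another."""
    f_amp = dft(amplitude_source)
    f_phase = dft(phase_source)
    coeffs = [abs(fa) * cmath.exp(1j * cmath.phase(fp)) for fa, fp in zip(f_amp, f_phase)]
    return inverse_dft_real(coeffs)


# Hypothetical test case: two unrelated 40-atom structures on a 256-point grid.
rng = random.Random(0)
structure_a = random_density(256, 40, rng)
structure_b = random_density(256, 40, rng)
hybrid = hybrid_map(amplitude_source=structure_a, phase_source=structure_b)
cc_phase_source = correlation(hybrid, structure_b)
cc_amplitude_source = correlation(hybrid, structure_a)
print(f"CC with phase source:     {cc_phase_source:.2f}")
print(f"CC with amplitude source: {cc_amplitude_source:.2f}")
```

In this toy setting the hybrid map should correlate far better with the structure that supplied the phases than with the one that supplied the amplitudes.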
These examples, though dramatic, are not completely representative of the normal situation, where the structure contributing the phases is partially or even nearly correct. Nonetheless, model phases always contribute bias, so that the resulting map tends to bear too close a resemblance to the model.
The importance of the phase can be understood most easily in terms of Parseval's theorem, a result that is important to the understanding of many aspects of the Fourier transform and its use in crystallography. Parseval's theorem states that the mean-square value of the variable on one side of a Fourier transform is proportional to the mean-square value of the variable on the other side. Since the Fourier transform is additive, Parseval's theorem also applies to sums or differences.
If $\rho_1$ and $\rho_2$ are, for instance, the true electron density and the electron density of the model, respectively, Parseval's theorem tells us that the r.m.s. error in the electron density is proportional to the r.m.s. error in the structure factor. (The structure-factor error is a vector error in the complex plane.)
This understanding of error in electron-density maps explains why the phase is much more important than the amplitude in determining the appearance of an electron-density map. As illustrated in Fig. 15.2.2.1, a random choice of phase (from a uniform distribution of all possible phases) will generally give a larger error in the complex plane than a random choice of amplitude [from a Wilson (1949) distribution of amplitudes].
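The comparison between a random phase and a random amplitude can be checked numerically. The sketch below assumes acentric reflections drawn from a Wilson distribution with $\langle|\mathbf{F}|^2\rangle = 1$ (an arbitrary scale chosen here); the function names are illustrative. Under that assumption the mean-square vector error is expected to be close to 2 for a random phase and close to $2 - \pi/2 \approx 0.43$ for a random amplitude.

```python
import cmath
import math
import random


def wilson_acentric(rng, sigma=1.0):
    """Draw a structure factor from a 2D Gaussian with <|F|^2> = sigma."""
    s = math.sqrt(sigma / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def mean_square_errors(n_reflections=20000, seed=1):
    """Mean-square complex-plane error for random-phase and random-amplitude models."""
    rng = random.Random(seed)
    err_phase = 0.0
    err_amp = 0.0
    for _ in range(n_reflections):
        f_true = wilson_acentric(rng)
        # Correct amplitude, phase drawn uniformly from 0 to 2*pi.
        f_random_phase = cmath.rect(abs(f_true), rng.uniform(0.0, 2.0 * math.pi))
        # Correct phase, amplitude drawn from an independent Wilson distribution.
        f_random_amp = cmath.rect(abs(wilson_acentric(rng)), cmath.phase(f_true))
        err_phase += abs(f_true - f_random_phase) ** 2
        err_amp += abs(f_true - f_random_amp) ** 2
    return err_phase / n_reflections, err_amp / n_reflections


ms_phase, ms_amp = mean_square_errors()
print(f"random phase:     {ms_phase:.2f}")
print(f"random amplitude: {ms_amp:.2f}")
```

By Parseval's theorem the same ratio carries over to the mean-square error in the electron density.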
To use model phase information optimally, the probability distribution for the true phase (or, equivalently, the distribution of the error in the model phase) needs to be known. Such a distribution can be derived by first working out the probability distribution for the true structure factor (or the distribution of the vector difference between the model and true structure factors). Then the phase probability distribution is obtained by fixing the known value of the structure-factor amplitude and renormalizing.
A number of related structure-factor distributions have been derived, differing in the amount of information available about the structure and in the assumed form of errors in the model. These range from the Wilson distribution, which applies when none of the atomic positions is known, to a distribution that applies when there are a variety of sources of error in an atomic model.
For the Wilson distribution (Wilson, 1949), it is assumed that the atoms in a crystal structure in space group P1 are scattered randomly and independently through the unit cell. In fact, it is sufficient to make the much less restrictive assumption that the atoms are placed randomly with respect to the Bragg planes defined by the Miller indices. The assumption of independence is somewhat more problematic, since there are restrictions on the distances between atoms, large volumes of protein crystals are occupied by disordered solvent and many protein crystals display noncrystallographic symmetry; as discussed elsewhere (Vellieux & Read, 1997), the resulting relationships among structure factors are exploited implicitly in averaging and solvent-flattening procedures. The higher-order relationships among structure factors are used explicitly in direct methods for solving small-molecule structures and are being developed for use in protein structures (Bricogne, 1993). For the purposes of simpler relationships between the calculated and true structure factors for a single hkl, however, the lack of complete independence does not seem to create serious problems.
When atoms are placed randomly relative to the Bragg planes, the contribution of each atom to the structure factor will have a phase varying randomly from 0 to 2π. The overall structure factor can then be considered to be the result of a random walk in the complex plane, which can be treated as an application of the central limit theorem. The structure factor is the sum of the independent atomic scattering contributions, each of which has a probability distribution defined as a circle in the complex plane centred on the origin, with a radius of $f_j$. The centroid of this atomic distribution is at the origin, and the variance for each of the real and imaginary parts is $f_j^2/2$. The probability distribution of the structure factor that is the sum of these contributions is a two-dimensional Gaussian, the product of the one-dimensional Gaussians for the real and imaginary parts. Because the variances are equal in the real and imaginary directions, it can be simplified, as shown below, and expressed in terms of a single distribution parameter, $\Sigma_N$:
$$p(\mathbf{F}) = \frac{1}{\pi\Sigma_N}\exp\left(-\frac{|\mathbf{F}|^2}{\Sigma_N}\right), \qquad \Sigma_N = \sum_{j=1}^{N} f_j^2.$$
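The random-walk argument can be illustrated by simulation. This is a minimal sketch under stated assumptions: a hypothetical P1 structure of 50 atoms with fixed, resolution-independent scattering factors, each given an independent uniform phase. It checks that the mean intensity approaches the sum of the squared scattering factors and that the variance of the real part approaches half of that sum.

```python
import cmath
import math
import random


def random_walk_structure_factor(scattering_factors, rng):
    """Sum atomic contributions of radius f_j with independent uniform phases."""
    return sum(
        f * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for f in scattering_factors
    )


def wilson_statistics(scattering_factors, n_trials=10000, seed=2):
    """Return Sigma_N and Monte Carlo estimates of <|F|^2> and var(Re F)."""
    rng = random.Random(seed)
    sigma_n = sum(f * f for f in scattering_factors)
    samples = [
        random_walk_structure_factor(scattering_factors, rng) for _ in range(n_trials)
    ]
    mean_intensity = sum(abs(f) ** 2 for f in samples) / n_trials
    mean_real = sum(f.real for f in samples) / n_trials
    var_real = sum((f.real - mean_real) ** 2 for f in samples) / n_trials
    return sigma_n, mean_intensity, var_real


# Hypothetical composition: 30 carbon-like, 10 nitrogen-like, 10 oxygen-like atoms.
factors = [6.0] * 30 + [7.0] * 10 + [8.0] * 10
sigma_n, mean_intensity, var_real = wilson_statistics(factors)
print(f"Sigma_N = {sigma_n:.0f}")
print(f"<|F|^2> / Sigma_N      = {mean_intensity / sigma_n:.3f}")
print(f"var(Re F) / (Sigma_N/2) = {var_real / (sigma_n / 2.0):.3f}")
```

Both printed ratios should lie close to 1, within the sampling error of the simulation.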
The Sim distribution (Sim, 1959), which is relevant when the positions of some of the atoms are known, has a very similar basis, except that the structure factor is now considered to arise from a random walk starting from the position of the structure factor corresponding to the known part, $\mathbf{F}_P$. Atoms with known positions do not contribute to the variance, while each of the atoms with unknown positions (the 'Q' atoms) contributes $f_j^2/2$ to each of the real and imaginary parts, as in the Wilson distribution. The distribution parameter in this case is referred to as $\Sigma_Q$. The Sim distribution is a conditional probability distribution, depending on the value of $\mathbf{F}_P$:
$$p(\mathbf{F};\mathbf{F}_P) = \frac{1}{\pi\Sigma_Q}\exp\left(-\frac{|\mathbf{F}-\mathbf{F}_P|^2}{\Sigma_Q}\right), \qquad \Sigma_Q = \sum_{j\in Q} f_j^2.$$
The Wilson (1949) and Woolfson (1956) distributions for space group $P\bar{1}$ are obtained similarly, except that the random walks are along a line and the resulting Gaussian distributions are one-dimensional. (The Woolfson distribution is the centric equivalent of the Sim distribution.) For more complicated space groups, it is reasonable to assume that acentric reflections follow the P1 distribution and that centric reflections follow the $P\bar{1}$ distribution. However, for any zone of the reciprocal lattice in which symmetry-related atoms are constrained to scatter in phase, the variances must be multiplied by the expected intensity factor, $\varepsilon$, for the zone, because the symmetry-related contributions are no longer independent.
In the Sim distribution, an atom is considered to be either exactly known or completely unknown in its position. These are extreme cases, since there will normally be varying degrees of uncertainty in the positions of various atoms in a model. The treatment can be generalized by allowing a probability distribution of coordinate errors for each atom. In this case, the centroid for the individual atomic contribution to the structure factor will no longer be obtained by multiplying by either zero or one. Averaged over the circle corresponding to possible phase errors, the centroid will generally be reduced in magnitude, as illustrated in Fig. 15.2.3.1. In fact, averaging to obtain the centroid is equivalent to weighting the atomic scattering contribution by the Fourier transform of the coordinate-error probability distribution, $d_j$. By the convolution theorem, this in turn is equivalent to convoluting the atomic density with the coordinate-error distribution. Intuitively, the atom is smeared over all of its possible positions. The weighting factor, $d_j$, is thus analogous to the thermal-motion term in the structure-factor expression.

Figure 15.2.3.1. Centroid of the structure-factor contribution from a single atom. The probability of a phase for the contribution is indicated by the thickness of the line.
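The reduction of the centroid can be made concrete for one simple case. The sketch below assumes a Gaussian coordinate error with standard deviation `sigma_x` (in Å) along the diffraction vector, and a reflection at resolution d with s = 1/d; these names and the chosen values are illustrative. For that assumed error model, the Fourier transform of the error distribution is $\exp(-2\pi^2 s^2\sigma_x^2)$, which the code compares with a direct average of the phase factor over sampled errors.

```python
import cmath
import math
import random


def centroid_weight_monte_carlo(s, sigma_x, n_samples=20000, seed=3):
    """Average the unit phase factor exp(2*pi*i*s*dx) over Gaussian errors dx."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_samples):
        dx = rng.gauss(0.0, sigma_x)
        total += cmath.exp(2j * math.pi * s * dx)
    return total / n_samples


def centroid_weight_gaussian(s, sigma_x):
    """Fourier transform of a 1D Gaussian error distribution, evaluated at s."""
    return math.exp(-2.0 * math.pi ** 2 * s ** 2 * sigma_x ** 2)


# Hypothetical case: 0.3 A error component, reflection at 3 A resolution.
s = 1.0 / 3.0
sigma_x = 0.3
simulated = centroid_weight_monte_carlo(s, sigma_x)
analytic = centroid_weight_gaussian(s, sigma_x)
print(f"simulated centroid weight: {simulated.real:.3f} {simulated.imag:+.3f}i")
print(f"analytic weight:           {analytic:.3f}")
```

The centroid stays on the real axis (the phase of the error-free contribution) but shrinks, and it shrinks faster at higher resolution, like a B-factor term.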
The variances for the individual atomic contributions will differ in magnitude, but if there are a sufficient number of independent sources of error, we can invoke the central limit theorem again and assume that the probability distribution for the structure factor will be a Gaussian centred on $\sum_j d_j f_j \exp(2\pi i\,\mathbf{h}\cdot\mathbf{x}_j)$, where $d_j$ is the weighting factor for atom j. If the coordinate-error distribution is Gaussian, and if each atom in the model is subject to the same errors, the resulting structure-factor probability distribution is the Luzzati (1952) distribution. In this special case, $d_j = D$ for all atoms, where D is the Fourier transform of a Gaussian and behaves like the application of an overall B factor.
The Wilson, Sim, Luzzati and variable-error distributions have very similar forms, because they are all Gaussians arising from the application of the central limit theorem. The central limit theorem is valid under many circumstances; even when there are errors in position, scattering factor and B factor, as well as missing atoms, a similar distribution still applies. As long as these sources of error are independent, the true structure factor will have a Gaussian distribution centred on $D\mathbf{F}_C$ (Fig. 15.2.3.2), where D now includes effects of all sources of error, as well as compensating for errors in the overall scale and B factor (Read, 1990):
$$p(\mathbf{F};\mathbf{F}_C) = \frac{1}{\pi\varepsilon\sigma_\Delta^2}\exp\left(-\frac{|\mathbf{F}-D\mathbf{F}_C|^2}{\varepsilon\sigma_\Delta^2}\right)$$
in the acentric case, where $\sigma_\Delta^2 = \Sigma_N - D^2\Sigma_P$, $\varepsilon$ is the expected intensity factor and $\Sigma_P$ is the Wilson distribution parameter for the model.

Figure 15.2.3.2. Schematic illustration of the general structure-factor distribution, relevant in the case of any set of independent random errors in the atomic model.
For centric reflections, the scattering differences are distributed along a line, so the probability distribution is a one-dimensional Gaussian:
$$p(\mathbf{F};\mathbf{F}_C) = \frac{1}{(2\pi\varepsilon\sigma_\Delta^2)^{1/2}}\exp\left(-\frac{|\mathbf{F}-D\mathbf{F}_C|^2}{2\varepsilon\sigma_\Delta^2}\right).$$
Srinivasan (1966) showed that the Sim and Luzzati distributions could be combined into a single distribution that had a particularly elegant form when expressed in terms of normalized structure factors, or E values. This functional form still applies to the general distribution that reflects a variety of sources of error; the only difference is the interpretation placed on the parameters (Read, 1990). If $\mathbf{F}$ and $\mathbf{F}_C$ are replaced by the corresponding E values, a parameter $\sigma_A$ plays the role of D, and the variance parameter $\sigma_\Delta^2$ of the general distribution reduces to $(1-\sigma_A^2)$. [The parameter $\sigma_A$ is equivalent to D after correction for model completeness; $\sigma_A = D(\Sigma_P/\Sigma_N)^{1/2}$, where $\Sigma_P$ and $\Sigma_N$ are the Wilson distribution parameters for the model and for the true structure.] When the structure factors are normalized, overall scale and B-factor effects are also eliminated. The parameter $\sigma_A$ that characterizes this probability distribution varies as a function of resolution. It must be deduced from the amplitudes $|F_O|$ and $|F_C|$, since the phase (thus the phase difference) is unknown.
A general approach to estimating parameters for probability distributions is to maximize a likelihood function. The likelihood function is the overall joint probability of making the entire set of observations, which is a function of the desired parameters. The parameters that maximize the probability of making the set of observations are the most consistent with the data. The idea of using maximum likelihood to estimate model phase errors was introduced by Lunin & Urzhumtsev (1984), who gave a treatment that was valid for space group P1. In a more general treatment that applies to higher-symmetry space groups, allowance is made for the statistical effects of crystal symmetry (centric zones and differing expected intensity factors) (Read, 1986).
The $\sigma_A$ values are estimated by maximizing the joint probability of making the set of observations of $|F_O|$. If the structure factors are all assumed to be independent, the joint probability distribution is the product of all the individual distributions. The assumption of independence is not completely justified in theory, but the results are fairly accurate in practice. The required probability distribution, $p(|F_O|;|F_C|)$, is derived from $p(\mathbf{F};\mathbf{F}_C)$ by integrating over all possible phase differences and neglecting the errors in $|F_O|$ as a measure of $|\mathbf{F}|$. The form of this distribution, which is given in other publications (Read, 1986, 1990), differs for centric and acentric reflections. (It is important to note that although the distributions for structure factors are Gaussian, the distributions for amplitudes obtained by integrating out the phase are not.) It is more convenient to deal with a sum than a product, so the log likelihood function is maximized instead. In the program SIGMAA, reciprocal space is divided into spherical shells, and a value of the parameter $\sigma_A$ is refined for each resolution shell. Details of the algorithm are given elsewhere (Read, 1986).
The resolution shells must be thick enough to contain several hundred to a thousand reflections each, in order to provide $\sigma_A$ estimates with a sufficiently small statistical error. A larger number of shells (fewer reflections per shell) can be used for refined structures, since estimates of $\sigma_A$ become more precise as the true value approaches 1. If there are sufficient reflections per shell, the estimates will vary smoothly with resolution. As discussed below, the smooth variation with resolution can also be exploited through a restraint that allows $\sigma_A$ values to be estimated from fewer reflections.
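The likelihood estimate for a single shell can be sketched as follows. This is an illustration only, not the SIGMAA algorithm: it treats acentric reflections alone, works with normalized amplitudes, uses a plain grid search instead of refinement, and tests itself on simulated data with a known value of 0.7. The acentric amplitude distribution assumed here is the Rice form obtained by integrating a two-dimensional Gaussian of variance $1-\sigma_A^2$ centred on $\sigma_A\mathbf{E}_C$ over the phase; all function names are hypothetical.

```python
import math
import random


def log_bessel_i0(x):
    """log I0(x): power series for small x, asymptotic expansion for large x."""
    if x < 20.0:
        q = x * x / 4.0
        term = total = 1.0
        k = 0
        while term > 1e-16 * total:
            k += 1
            term *= q / (k * k)
            total += term
        return math.log(total)
    correction = 1.0 / (8.0 * x) + 9.0 / (128.0 * x * x)
    return x - 0.5 * math.log(2.0 * math.pi * x) + math.log1p(correction)


def log_likelihood(sigma_a, pairs):
    """Acentric log likelihood of (|E_O|, |E_C|) pairs, omitting sigma_A-free terms."""
    variance = 1.0 - sigma_a ** 2
    total = 0.0
    for e_obs, e_calc in pairs:
        total += (
            -math.log(variance)
            - (e_obs ** 2 + sigma_a ** 2 * e_calc ** 2) / variance
            + log_bessel_i0(2.0 * sigma_a * e_obs * e_calc / variance)
        )
    return total


def estimate_sigma_a(pairs, step=0.01):
    """Grid search over sigma_A in (0, 1) for the maximum of the log likelihood."""
    candidates = [i * step for i in range(1, int(round(1.0 / step)))]
    return max(candidates, key=lambda s: log_likelihood(s, pairs))


def simulate_shell(true_sigma_a, n_reflections, seed):
    """Simulated normalized amplitudes for one resolution shell."""
    rng = random.Random(seed)
    c = math.sqrt(0.5)
    pairs = []
    for _ in range(n_reflections):
        e_calc = complex(rng.gauss(0.0, c), rng.gauss(0.0, c))
        noise = complex(rng.gauss(0.0, c), rng.gauss(0.0, c))
        e_true = true_sigma_a * e_calc + math.sqrt(1.0 - true_sigma_a ** 2) * noise
        pairs.append((abs(e_true), abs(e_calc)))
    return pairs


shell = simulate_shell(true_sigma_a=0.7, n_reflections=1000, seed=4)
sigma_a_estimate = estimate_sigma_a(shell)
print(f"estimated sigma_A = {sigma_a_estimate:.2f}")
```

Rerunning the simulation with fewer reflections per shell gives a feel for the statistical error that the shell thickness is meant to control.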
Blow & Crick (1959) and Sim (1959) showed that the electron-density map with the least r.m.s. error is calculated from centroid structure factors. This conclusion follows from Parseval's theorem, because the centroid structure factor (its probability-weighted average value or expected value) minimizes the r.m.s. error of the structure factor. Since the structure-factor distribution is symmetrical about $\mathbf{F}_C$, the expected value of $\mathbf{F}$ will have the same phase as $\mathbf{F}_C$, but the averaging around the phase circle will reduce its magnitude if there is any uncertainty in the phase value (Fig. 15.2.4.1). We treat the reduction in magnitude by applying a weighting factor called the figure of merit, m, which is equivalent to the expected value of the cosine of the phase error.
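The figure of merit can be evaluated directly as the expected cosine of the phase error. The sketch below assumes the normalized form of the structure-factor distribution, for which fixing the amplitude gives an acentric phase-error probability proportional to $\exp(X\cos\Delta\alpha)$ with $X = 2\sigma_A|E_O||E_C|/(1-\sigma_A^2)$, and a two-valued centric distribution giving $m = \tanh(X/2)$. The function names and the sample values are illustrative.

```python
import math


def agreement_parameter(sigma_a, e_obs, e_calc):
    """X = 2 sigma_A |E_O| |E_C| / (1 - sigma_A^2)."""
    return 2.0 * sigma_a * e_obs * e_calc / (1.0 - sigma_a ** 2)


def figure_of_merit_acentric(x, n_steps=720):
    """Expected cos(phase error) for p(dphi) proportional to exp(x cos(dphi))."""
    numerator = 0.0
    denominator = 0.0
    for k in range(n_steps):
        dphi = 2.0 * math.pi * (k + 0.5) / n_steps
        # Subtracting 1 in the exponent rescales the weights and avoids overflow.
        weight = math.exp(x * (math.cos(dphi) - 1.0))
        numerator += weight * math.cos(dphi)
        denominator += weight
    return numerator / denominator


def figure_of_merit_centric(x):
    """Difference between the probabilities of the correct and wrong sign."""
    return math.tanh(x / 2.0)


for sigma_a in (0.3, 0.6, 0.9):
    x = agreement_parameter(sigma_a, e_obs=1.2, e_calc=1.1)
    print(
        f"sigma_A={sigma_a:.1f}  X={x:6.2f}  "
        f"m(acentric)={figure_of_merit_acentric(x):.3f}  "
        f"m(centric)={figure_of_merit_centric(x):.3f}"
    )
```

The weights rise from 0 towards 1 as the model improves or as the amplitudes become larger, which is the behaviour expected of a reliability estimate.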
A figure-of-merit weighted map, calculated with coefficients $m|F_O|\exp(i\alpha_C)$, where $\alpha_C$ is the model phase, has the least r.m.s. error from the true map. According to the normal statistical (minimum variance) criteria, then, it is the best map. However, such a map will suffer from model bias; if its purpose is to allow the detection and repair of errors in the model, this is a serious qualitative defect. Fortunately, it is possible to predict the systematic errors leading to model bias and to make some correction for them.
Main (1979) dealt with this problem in the case of a perfect partial structure. Since the relationships among structure factors are the same in the general case of a partial structure with various errors, once $D\mathbf{F}_C$ is substituted for $\mathbf{F}_C$, all that is required to apply Main's results more generally is a change of variables (Read, 1986, 1990).
In Main's approach, the cosine law is used to introduce the cosine of the phase error, which is converted into a figure of merit by taking expected values. Some manipulations allow us to solve for the figure-of-merit weighted map coefficient, which is approximated as a linear combination of the true structure factor and the model structure factor, $m|F_O|\exp(i\alpha_C) \approx \mathbf{F}/2 + D\mathbf{F}_C/2$ (Main, 1979; Read, 1986). Finally, we can solve for an approximation to the true structure factor, $\mathbf{F} \approx (2m|F_O| - D|F_C|)\exp(i\alpha_C)$, giving map coefficients from which the systematic model bias component has been removed.
A similar analysis for centric structure factors shows that there is no systematic model bias in figure-of-merit weighted map coefficients, so no bias correction is needed in the centric case.
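Per reflection, the bias-corrected coefficients can be assembled as in the following sketch. It takes the acentric coefficient as $(2m|F_O| - D|F_C|)\exp(i\alpha_C)$ and the centric one as $m|F_O|\exp(i\alpha_C)$ (Read, 1986); the function name, argument names and sample numbers are hypothetical, and the weights m and D are assumed to have been estimated beforehand.

```python
import cmath


def bias_reduced_coefficient(f_obs, f_calc_amp, alpha_calc, m, d, centric=False):
    """Map coefficient with the systematic model-bias component removed.

    f_obs, f_calc_amp: observed and calculated amplitudes
    alpha_calc:        model phase in radians
    m, d:              figure of merit and the weight D for this reflection
    """
    if centric:
        # No systematic bias in the centric case: plain figure-of-merit weighting.
        amplitude = m * f_obs
    else:
        amplitude = 2.0 * m * f_obs - d * f_calc_amp
    return amplitude * cmath.exp(1j * alpha_calc)


# Hypothetical reflection: |F_O| = 120, |F_C| = 100, phase 0.5 rad, m = 0.8, D = 0.9.
acentric = bias_reduced_coefficient(120.0, 100.0, 0.5, m=0.8, d=0.9)
centric = bias_reduced_coefficient(120.0, 100.0, 0.0, m=0.8, d=0.9, centric=True)
print(f"acentric amplitude: {abs(acentric):.1f}")
print(f"centric amplitude:  {abs(centric):.1f}")
```

With a perfect model (m = D = 1 and equal amplitudes) the acentric expression reduces to the calculated structure factor itself, as it should.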
When model phase information is combined with, for instance, multiple isomorphous replacement (MIR) phase information, there will still be model bias in the acentric map coefficients, to the extent that the model influences the final phases. However, it is inappropriate to continue using the same map coefficients to reduce model bias, because some phases could be determined almost completely by the MIR phase information. It makes much more sense to have map coefficients that reduce to the coefficients appropriate for either model or MIR phases, in extreme cases where there is only one source of phase information, and that vary smoothly between those extremes.
Map coefficients that satisfy these criteria (even if they are not rigorously derived) are implemented in the program SIGMAA. The resulting maps are reasonably successful in reducing model bias. Two assumptions are made: (1) the model bias component in the figure-of-merit weighted map coefficient for the combined phase, $m_{com}|F_O|\exp(i\alpha_{com})$, is proportional to the influence that the model phase has had on the combined phase; and (2) the relative influence of a source of phase information can be measured by the information content, H (Guiasu, 1977), of the phase probability distribution. The first assumption corresponds to the idea that the figure-of-merit weighted map coefficient is a linear combination of the MIR and model phase cases,
$$m_{com}|F_O|\exp(i\alpha_{com}) \approx \left(1-\frac{w}{2}\right)\mathbf{F} + \frac{w}{2}D\mathbf{F}_C,$$
where
$$w = \frac{H_C}{H_C + H_{MIR}}$$
and
$$H = \int_0^{2\pi} p(\alpha)\ln\frac{p(\alpha)}{p_0(\alpha)}\,d\alpha, \qquad p_0(\alpha) = \frac{1}{2\pi}.$$
Solving for an approximation to the true F gives the following expression, which can be seen to reduce appropriately when w is 0 (no model influence) or 1 (no MIR influence):
$$\mathbf{F} \approx \frac{2m_{com}|F_O|\exp(i\alpha_{com}) - wD\mathbf{F}_C}{2-w}.$$
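The weighting scheme can be sketched numerically. The code below is an illustration under stated assumptions, not the SIGMAA implementation: phase probabilities are sampled on a uniform grid, the information content is taken as $H = \int p\ln(p/p_0)\,d\alpha$ with $p_0 = 1/2\pi$, and the coefficient is $[2m_{com}|F_O|\exp(i\alpha_{com}) - wD\mathbf{F}_C]/(2-w)$. The function names and the two sample distributions are hypothetical.

```python
import cmath
import math


def information_content(prob):
    """H = integral of p ln(p / p0) over the phase circle, with p0 = 1/(2*pi).

    `prob` holds (possibly unnormalized) probabilities on a uniform phase grid.
    """
    step = 2.0 * math.pi / len(prob)
    norm = sum(prob) * step
    h = 0.0
    for value in prob:
        p = value / norm
        if p > 0.0:
            h += p * math.log(2.0 * math.pi * p) * step
    return h


def model_weight(h_model, h_mir):
    """w = H_C / (H_C + H_MIR); 0 is returned (arbitrarily) if both are zero."""
    total = h_model + h_mir
    return h_model / total if total > 0.0 else 0.0


def combined_coefficient(m_com, f_obs, alpha_com, d, f_calc, w):
    """Approximate true F from the combined-phase centroid and the model."""
    best = m_com * f_obs * cmath.exp(1j * alpha_com)
    return (2.0 * best - w * d * f_calc) / (2.0 - w)


# Hypothetical unimodal phase distributions: sharp for the model, broad for MIR.
grid = [2.0 * math.pi * k / 360 for k in range(360)]
p_model = [math.exp(3.0 * math.cos(a - 1.0)) for a in grid]
p_mir = [math.exp(1.0 * math.cos(a - 1.4)) for a in grid]
h_model = information_content(p_model)
h_mir = information_content(p_mir)
w = model_weight(h_model, h_mir)
print(f"H_C = {h_model:.3f}, H_MIR = {h_mir:.3f}, w = {w:.3f}")
```

A flat phase distribution carries no information (H = 0), so a reflection phased entirely by MIR gets w = 0 and its coefficient is left as the figure-of-merit weighted one.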
In principle, since the distribution of observed and calculated amplitudes is determined largely by the coordinate errors of the model, one can determine whether a particular coordinate-error distribution is consistent with the amplitudes. Unfortunately, it turns out that the coordinate errors cannot be deduced unambiguously, because many distributions of coordinate errors are consistent with a particular distribution of amplitudes (Read, 1990).
If the simplifying assumption is made that all the atoms are subject to a single error distribution, then the parameter D (and thus the related parameter $\sigma_A$) varies with resolution as the Fourier transform of the error distribution, as discussed above. Two related methods to estimate overall coordinate error are based on the even more specific assumption that the coordinate-error distribution is Gaussian: the Luzzati plot (Luzzati, 1952) and the $\sigma_A$ plot (Read, 1986). Unfortunately, the central assumption is not justified; atoms that scatter more strongly (heavier atoms or atoms with lower B factors) tend to have smaller coordinate errors than weakly scattering atoms. The proportion of the structure factor contributed by well ordered atoms increases at high resolution, so that the structure factors agree better at high resolution than if there were a single error distribution.
It is often stated, optimistically, that the Luzzati plot provides an upper bound to the coordinate error, because the observation errors in $|F_O|$ have been ignored. This is misleading, because there are other effects that cause the Luzzati and $\sigma_A$ plots to give underestimates (Read, 1990). Chief among these are the correlation of errors and scattering power and the overfitting of the amplitudes in structure refinement (discussed below). These estimates of overall coordinate error should not be interpreted too literally; at best, they provide a comparative measure.
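For orientation, the fit behind such a plot can be sketched as below. It assumes exactly the single isotropic Gaussian error model that the text cautions against, for which $\ln\sigma_A = \tfrac{1}{2}\ln(\Sigma_P/\Sigma_N) - \pi^3\langle|\Delta r|\rangle^2(\sin\theta/\lambda)^2$ (Read, 1986), and it uses noise-free synthetic shell values, so it only shows the mechanics. The function name and the input numbers are hypothetical.

```python
import math


def fit_sigma_a_plot(resolutions, sigma_a_values):
    """Least-squares line through ln(sigma_A) against (sin(theta)/lambda)^2.

    Returns (mean coordinate error in A, completeness Sigma_P/Sigma_N),
    valid only under the single Gaussian error model.
    """
    xs = [1.0 / (4.0 * d * d) for d in resolutions]  # (sin(theta)/lambda)^2
    ys = [math.log(s) for s in sigma_a_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A non-negative slope would imply no measurable error; clamp it at zero.
    mean_error = math.sqrt(max(0.0, -slope) / math.pi ** 3)
    completeness = math.exp(2.0 * intercept)
    return mean_error, completeness


# Synthetic shells for a model with 0.3 A mean error and 90% of the scattering.
shell_resolutions = [6.0, 4.0, 3.0, 2.5, 2.2, 2.0]
shell_sigma_a = [
    math.sqrt(0.9) * math.exp(-math.pi ** 3 * 0.3 ** 2 / (4.0 * d * d))
    for d in shell_resolutions
]
mean_error, completeness = fit_sigma_a_plot(shell_resolutions, shell_sigma_a)
print(f"mean coordinate error = {mean_error:.2f} A, completeness = {completeness:.2f}")
```

With real data the fitted value is best read as a comparative number, for the reasons given above.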
The computer program SIGMAA (Read, 1986) has been developed to implement the results described here. Apart from the two types of map coefficient discussed above, two types of difference-map coefficient can also be produced:
(1) model-phased difference map, computed with coefficients $(m|F_O| - D|F_C|)\exp(i\alpha_C)$;
(2) general difference map, using combined phase information, computed with coefficients $m_{com}|F_O|\exp(i\alpha_{com}) - D\mathbf{F}_C$.
The general difference map, it should be noted, uses a vector difference between the figure-of-merit weighted combined phase coefficient (the 'best' estimate of the true structure factor) and the calculated structure factor. When additional phase information is available, it should provide a clearer picture of the errors in the model.
The structure-factor probabilities discussed above depend on the atoms having independent errors (or at least a sufficient number of groups of atoms having independent errors). Unfortunately, this assumption breaks down when a structure is refined against the observed diffraction data. Few protein crystals diffract to sufficiently high resolution to provide a large number of observations for every refinable parameter. The refinement problem is, therefore, not sufficiently overdetermined, so it is possible to overfit the data. If there is an error in the model that is outside the range of convergence of the refinement method, it is possible to introduce compensating errors in the rest of the structure to give a better, and misleading, agreement in the amplitudes. As a result, the phase accuracy (hence the weighting factors m and D) is overestimated, and model bias is poorly removed. Because simulated annealing is a more effective minimizer than gradient methods (Brünger et al., 1987), it is also more effective at locating local minima, so structures refined by simulated annealing probably tend to suffer more severely from refinement bias.
There is another interpretation to the problem of refinement bias. As Silva & Rossmann (1985) point out, minimizing the r.m.s. difference between the amplitudes $|F_O|$ and $|F_C|$ is equivalent (by Parseval's theorem) to minimizing the difference between the model electron density and the density corresponding to the map coefficients $|F_O|\exp(i\alpha_C)$; a lower residual is obtained either by making the model look more like the true structure, or by making the model-phased map look more like the model through the introduction of systematic phase errors.
A number of strategies are available to reduce the degree or impact of refinement bias. The overestimation of phase accuracy has been overcome in a new version of SIGMAA that is under development (Read, unpublished). Cross-validation data, which are normally used to compute $R_{free}$ as an unbiased indicator of refinement progress (Brünger, 1992), are used to obtain unbiased $\sigma_A$ estimates. Because of the high statistical error of $\sigma_A$ estimates computed from small numbers of reflections, reliable values can only be obtained by exploiting the smoothness of the $\sigma_A$ curve as a function of resolution. This can be achieved either by fitting a functional form or by adding a penalty to points that deviate from the line connecting their neighbours. Lunin & Skovoroda (1995) have independently proposed the use of cross-validation data for this purpose, but as their algorithm is equivalent to the conventional SIGMAA algorithm, it will suffer severely from statistical error.
The degree of refinement bias can be reduced by placing less weight on the agreement of structure-factor amplitudes. Anecdotal evidence suggests that the problem is less serious, in structures refined using X-PLOR (Brünger et al., 1987), when the Engh & Huber (1991) parameter set is used for the energy terms. In this new parameter set, the deviations from standard geometry are much more strictly restrained, so in effect the pressure on the agreement of structure-factor amplitudes is reduced. The use of maximum-likelihood targets for refinement (discussed below) also helps to reduce overfitting.
If errors are suspected in certain parts of the structure, 'omit refinement' (in which the questionable parts are omitted from the model) can be a very effective way to eliminate refinement bias in those regions (James et al., 1980; Hodel et al., 1992).
If MIR or MAD (multiwavelength anomalous dispersion) phases are available, combined phase maps tend to suffer less from refinement bias, depending on the extent to which the experimental phases influence the combined phases. Finally, it is always a good idea to refer occasionally to the original MIR or MAD map, which cannot suffer at all from model bias or refinement bias.
In the past, conventional structure refinement was based on a least-squares target, which would be justified if the observed and calculated structure-factor amplitudes were related by a Gaussian probability distribution. Unfortunately, the relationship between $|F_O|$ and $|F_C|$ is not Gaussian, and the distribution for $|F_O|$ is not even centred on $|F_C|$. Because of this, it was suggested (Read, 1990; Bricogne, 1991) that a maximum-likelihood target should be used instead, and that it should be based on probability distributions such as those described above.
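The statement that the amplitude distribution is not centred on the calculated amplitude can be illustrated by simulation. The sketch below assumes normalized acentric structure factors with the true value drawn from a two-dimensional Gaussian of variance $1-\sigma_A^2$ centred on $\sigma_A\mathbf{E}_C$; the function name and the chosen values ($\sigma_A = 0.5$, calculated amplitudes 2.0 and 0.2) are illustrative only.

```python
import math
import random


def mean_observed_amplitude(e_calc_amp, sigma_a, n_samples=20000, seed=5):
    """Monte Carlo mean of |E| given |E_C| for the acentric distribution."""
    rng = random.Random(seed)
    spread = math.sqrt((1.0 - sigma_a ** 2) / 2.0)  # per real/imaginary component
    total = 0.0
    for _ in range(n_samples):
        # The model phase is set to zero without loss of generality.
        e_true = complex(
            sigma_a * e_calc_amp + rng.gauss(0.0, spread), rng.gauss(0.0, spread)
        )
        total += abs(e_true)
    return total / n_samples


mean_for_strong = mean_observed_amplitude(e_calc_amp=2.0, sigma_a=0.5)
mean_for_weak = mean_observed_amplitude(e_calc_amp=0.2, sigma_a=0.5)
print(f"|E_C| = 2.0 -> mean |E| = {mean_for_strong:.2f}")
print(f"|E_C| = 0.2 -> mean |E| = {mean_for_weak:.2f}")
```

For a model of this quality, the expected true amplitude lies well below a large calculated amplitude and well above a small one, so a least-squares residual on amplitudes pulls in the wrong direction for many reflections.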
Three implementations of maximum-likelihood structure refinement have now been reported (Pannu & Read, 1996; Murshudov et al., 1997; Bricogne & Irwin, 1996). As expected, there is a decrease in refinement bias, as the calculated structure-factor amplitudes will not be forced to be equal to the observed amplitudes. Maximum-likelihood targets have been shown to work much better than least-squares targets, particularly when the starting models are poor.
Prior phase information can also be incorporated into a maximum-likelihood target (Pannu et al., 1998). Tests show that even weak phase information can have a dramatic effect on the success of refinement, and that the amount of overfitting is even further reduced (Pannu et al., 1998).
Acknowledgements
This chapter is a revised version of a contribution to Methods in Enzymology (Read, 1997).
References
Blow, D. M. & Crick, F. H. C. (1959). The treatment of errors in the isomorphous replacement method. Acta Cryst. 12, 794–802.
Bricogne, G. (1991). A multisolution method of phase determination by combined maximization of entropy and likelihood. III. Extension to powder diffraction data. Acta Cryst. A47, 803–829.
Bricogne, G. (1993). Direct phase determination by entropy maximization and likelihood ranking: status report and perspectives. Acta Cryst. D49, 37–60.
Bricogne, G. & Irwin, J. (1996). In Proceedings of the CCP4 study weekend. Macromolecular refinement, edited by E. Dodson, M. Moore, A. Ralph & S. Bailey, pp. 85–92. Warrington: Daresbury Laboratory.
Brünger, A. T. (1992). Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature (London), 355, 472–474.
Brünger, A. T., Kuriyan, J. & Karplus, M. (1987). Crystallographic R factor refinement by molecular dynamics. Science, 235, 458–460.
Engh, R. A. & Huber, R. (1991). Accurate bond and angle parameters for X-ray protein structure refinement. Acta Cryst. A47, 392–400.
Guiasu, S. (1977). Information theory with applications. London: McGraw-Hill.
Hodel, A., Kim, S.-H. & Brünger, A. T. (1992). Model bias in macromolecular crystal structures. Acta Cryst. A48, 851–858.
James, M. N. G., Sielecki, A. R., Brayer, G. D., Delbaere, L. T. J. & Bauer, C.-A. (1980). Structures of product and inhibitor complexes of Streptomyces griseus protease A at 1.8 Å resolution – a model for serine protease catalysis. J. Mol. Biol. 144, 43–88.
Lunin, V. Yu. & Skovoroda, T. P. (1995). R-free likelihood-based estimates of errors for phases calculated from atomic models. Acta Cryst. A51, 880–887.
Lunin, V. Yu. & Urzhumtsev, A. G. (1984). Improvement of protein phases by coarse model modification. Acta Cryst. A40, 269–277.
Luzzati, V. (1952). Traitement statistique des erreurs dans la determination des structures cristallines. Acta Cryst. 5, 802–810.
Main, P. (1979). A theoretical comparison of the β, γ′ and 2F_{o} − F_{c} syntheses. Acta Cryst. A35, 779–785.
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Cryst. D53, 240–255.
Oppenheim, A. V. & Lim, J. S. (1981). The importance of phase in signals. Proc. IEEE, 69, 529–541.
Pannu, N. S., Murshudov, G. N., Dodson, E. J. & Read, R. J. (1998). Incorporation of prior phase information strengthens maximum-likelihood structure refinement. Acta Cryst. D54, 1285–1294.
Pannu, N. S. & Read, R. J. (1996). Improved structure refinement through maximum likelihood. Acta Cryst. A52, 659–668.
Ramachandran, G. N. & Srinivasan, R. (1961). An apparent paradox in crystal structure analysis. Nature (London), 190, 159–161.
Read, R. J. (1986). Improved Fourier coefficients for maps using phases from partial structures with errors. Acta Cryst. A42, 140–149.
Read, R. J. (1990). Structure-factor probabilities for related structures. Acta Cryst. A46, 900–912.
Read, R. J. (1997). Model phases: probabilities and bias. Methods Enzymol. 277, 110–128.
Silva, A. M. & Rossmann, M. G. (1985). The refinement of southern bean mosaic virus in reciprocal space. Acta Cryst. B41, 147–157.
Sim, G. A. (1959). The distribution of phase angles for structures containing heavy atoms. II. A modification of the normal heavy-atom method for non-centrosymmetrical structures. Acta Cryst. 12, 813–815.
Srinivasan, R. (1966). Weighting functions for use in the early stages of structure analysis when a part of the structure is known. Acta Cryst. 20, 143–144.
Vellieux, F. M. D. & Read, R. J. (1997). Noncrystallographic symmetry averaging in phase refinement and extension. Methods Enzymol. 277, 18–53.
Wilson, A. J. C. (1949). The probability distribution of X-ray intensities. Acta Cryst. 2, 318–321.
Woolfson, M. M. (1956). An improvement of the 'heavy-atom' method of solving crystal structures. Acta Cryst. 9, 804–810.