International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by E. Arnold, D. M. Himmel and M. G. Rossmann. © International Union of Crystallography 2012
International Tables for Crystallography (2012). Vol. F, ch. 15.1, pp. 385–400
https://doi.org/10.1107/97809553602060000847
Chapter 15.1. Phase improvement by iterative density modification
^{a}Division of Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N., Seattle, WA 90109, USA, ^{b}Department of Chemistry, University of York, York YO1 5DD, England, and ^{c}Department of Physics, University of York, York YO1 5DD, England
Density modification is a method for improving phase estimates arising from sources such as MIR/MAD and molecular replacement. This is achieved by use of chemical knowledge concerning the properties of well phased electron-density maps, including such features as solvent flatness, atomic composition and noncrystallographic symmetry. The calculation is performed iteratively, with alternating stages of map modification in real space and phase weighting in reciprocal space.
Density modification is a technique for improving the quality of an approximate electron-density map based on some conserved features of the correct electron-density map. These conserved features are independent of the unknown fine detail of the structural conformation. They are often expressed as constraints on the electron density in various forms, either in real or reciprocal space. Since the structure-factor amplitudes are known, these constraints restrict the values of phases and can therefore be used for phase improvement.
The structure-factor amplitudes and phases are independent of each other if we know nothing about the electron density. Therefore, the phases are indeterminable given only the amplitudes (Baker, Krukowski & Agard, 1993). The information about the electron density provides the missing link between structure-factor amplitudes and phases. It is only through the knowledge of the chemical or physical properties of the electron density that the phases can be retrieved. Density modification is usually the most straightforward application of the constraints on electron density. However, this is only a matter of convenience in implementation. Sometimes the constraints can be more readily implemented in reciprocal space on structure factors.
Density-modification methods are usually implemented as an iterative procedure that alternates between density modification in real space and phase combination in reciprocal space. This paradigm was first proposed by Hoppe & Gassmann (1968) in their 'phase correction' method. This approach takes advantage of the particular properties of the constraints and uses them in a way that is most convenient to implement.
Density-modification methods usually require an initial map with substantial phase information. In most cases, these phases are obtained from multiple isomorphous replacement (MIR) or multiwavelength anomalous dispersion (MAD), but it is also possible to improve maps from other sources, such as molecular replacement. The amount of information in the initial map is dependent on phase accuracy, data resolution and completeness. As more powerful constraints are incorporated, the density modification can be initiated from lower-resolution maps with less accurate phases. Ab initio phasing would be achieved if a density-modification method could start from a map generated from random phases. Therefore, density modification can potentially lead to ab initio phasing methods, although it does not seek direct solution to the phase problem as its immediate goal.
There are two major components in a density-modification procedure. One is the type of electron-density constraints. The other is the way the constraints are exploited. These two components combined determine the phasing power of the procedure. In this chapter, we will review various electron-density constraints and the way they are exploited for phase improvement.
The aim of density-modification calculations is to obtain new or improved phase estimates for observed structure-factor amplitudes. Often, this includes calculation of phases for previously unphased reflections, for example, in the case of phase extension. The calculation of weights, which indicate the degree of confidence in the new phase estimates, is also an important part of the calculation. Improved phase estimates are obtained by bringing the initial phase estimates into consistency with additional sources of structural information.
One difficulty in combining information from various sources is that the amplitudes and phases are represented in reciprocal space and include good estimates of error, whereas the other constraints are in real space and, in general, represent expectations about the structure which may be hard to quantify. As a result, the method that has been adopted is iterative and divided into real- and reciprocal-space steps. A weighted map is calculated and used as a basis for applying all the real-space modifications. The modified map is then back-transformed to produce a set of amplitudes and phases. The agreement between the observed amplitudes and the amplitudes calculated from the modified map is then used to estimate weights for the modified phases, which are used to combine the modified phases with experimental phases to produce new phases. This process is shown diagrammatically in Fig. 15.1.2.1.

Figure 15.1.2.1. Density-modification calculation showing iterative application of real-space and reciprocal-space constraints.
A broad range of techniques have been applied to electron-density maps to impose chemical or physical information. Some sources of information used in density modification are summarized in Table 15.1.2.1. The list included here is not exhaustive, but covers the most widely used methods. Here, we describe some of the constraints and the techniques through which these constraints are implemented for phase improvement.

Solvent flattening exploits the fact that the electron density in the solvent region is flat at medium resolution, owing to the high thermal motion and disorder of solvent molecules. The flattening of the solvent region suppresses noise in the map and therefore improves phases.
Biological molecules are typically irregular in shape, often taking roughly globular forms. When they are packed regularly to form a crystal lattice, there are gaps left between them, and these spaces are filled with the solvent in which the crystallization was performed. This solvent is a disordered liquid, and thus the arrangement of atoms in the solvent regions varies between unit cells, except in those small regions near the surface of the protein. The X-ray image forms an average of electron density over many cells, so the electron density over much of the solvent region appears to be constant to a good approximation.
The existence of a flat solvent region in a crystal places strong constraints on the structure-factor phases. The constraint of solvent flatness is implemented by identifying the molecular boundaries and replacing the densities in the solvent region by their mean density value.
When solving a structure, the contents of the unit cell are usually known, and so an estimate can be formed of how much of the cell volume is taken up by solvent (Matthews, 1968). If the solvent region can be located in the cell, then we can improve an electron-density map by setting the electron density in this region to the expected constant solvent density. Once the resulting modified phases are combined with the experimental data, an improvement can often be seen in the protein regions of the map (Bricogne, 1974).
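The solvent-content estimate mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the usual Matthews-coefficient relation and a typical protein partial specific volume of 0.74 cm³ g⁻¹; the function name, argument names and example numbers are hypothetical and not taken from the chapter.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def matthews_solvent_fraction(cell_volume_A3, n_molecules, mol_weight_Da,
                              partial_specific_volume=0.74):
    """Return (V_M in A^3/Da, estimated solvent fraction of the cell).

    Assumption: the protein partial specific volume is about 0.74 cm^3/g,
    which gives roughly 1.23 A^3 of protein per dalton.
    """
    v_m = cell_volume_A3 / (n_molecules * mol_weight_Da)
    # Convert cm^3/g to A^3/Da: 1 cm^3 = 1e24 A^3, 1 g = N_A daltons.
    protein_A3_per_Da = partial_specific_volume * 1e24 / AVOGADRO
    return v_m, 1.0 - protein_A3_per_Da / v_m

# Hypothetical example: 4 molecules of 20 kDa in a 200 000 A^3 cell.
v_m, solvent_fraction = matthews_solvent_fraction(200000.0, 4, 20000.0)
```

The resulting fraction is the kind of number that is later used to choose the density cutoff separating protein from solvent.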
The solvent region of a unit cell may usually be determined even from a poor MIR map using the following features:
A good method for locating the solvent region therefore takes into account information from both low- and high-resolution structure factors. Many methods have been proposed to locate the protein–solvent boundary. The first of these were the visual identification methods. The boundary was identified by digitizing a mini-map with the aid of a graphic tablet (Hendrickson et al., 1975; Schevitz et al., 1981). The hand-digitizing procedure was very time consuming and prone to subjective judgmental errors. Nevertheless, these methods demonstrated the potential of solvent flattening and stimulated further improvement on boundary-identification methods. An automated method using a linked, high-density approach was first proposed by Bhat & Blow (1982). Based on the fact that the densities are generally higher in the protein region than in the solvent region, they defined the molecular boundary by locating the protein as a region of linked, high-density points.
Convolution techniques were subsequently adopted as an efficient method of molecular-boundary identification. Reynolds et al. (1985) proposed a high mean absolute density value approach. The electron density within the protein region was expected to have greater excursions from the mean density value than the solvent region, which is relatively featureless. The molecular boundary was located based on the value of a smoothed 'modulus' electron density, which is the sum of the absolute values of all density points within a small box.
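Wang (1985) suggested an automated convolution method for identifying the solvent region which has achieved widespread use. His method involved first calculating a truncated map:
$$\rho_{\rm trunc}(\mathbf{x}) = \begin{cases} \rho(\mathbf{x}), & \rho(\mathbf{x}) > \rho_{\rm solv} \\ \rho_{\rm solv}, & \rho(\mathbf{x}) \le \rho_{\rm solv}. \end{cases} \qquad (15.1.2.1)$$
The electron density is simply truncated at the expected solvent value, $\rho_{\rm solv}$; however, since the variations in density in the protein region are much larger than the variations in the solvent region, it is generally only the protein region which will be affected. Thus, the mean density over the protein region is increased. Similar results may be obtained using the mean-squared difference of the density from the expected solvent value.
A smoothed map is then formed by calculating at each point in the map the mean density over a surrounding sphere of radius R. This operation can be written as a convolution of the truncated map, $\rho_{\rm trunc}(\mathbf{x})$, with a spherical weighting function, $w(\mathbf{r})$,
$$\rho_{\rm ave}(\mathbf{x}) = \sum_{\mathbf{r}} w(\mathbf{r})\,\rho_{\rm trunc}(\mathbf{x} - \mathbf{r}), \qquad (15.1.2.2)$$
where
$$w(\mathbf{r}) = \begin{cases} 1 - |\mathbf{r}|/R, & |\mathbf{r}| < R \\ 0, & |\mathbf{r}| \ge R. \end{cases} \qquad (15.1.2.3)$$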
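Leslie (1987) noted that the convolution operation required in equation (15.1.2.2) can be very efficiently performed in reciprocal space using fast Fourier transforms (FFTs),
$$\rho_{\rm ave}(\mathbf{x}) = \mathcal{F}^{-1}\{\mathcal{F}[\rho_{\rm trunc}(\mathbf{x})]\,\mathcal{F}[w(\mathbf{r})]\}, \qquad (15.1.2.4)$$
where $\rho_{\rm ave}$ is the smoothed map, $\rho_{\rm trunc}$ is the truncated map, $w$ is the weighting function, $\mathcal{F}$ denotes a Fourier transform and $\mathcal{F}^{-1}$ represents an inverse Fourier transform.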
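The Fourier transform of the truncated density can be readily calculated using FFTs. The Fourier transform of the weighting function can be calculated analytically. For a weighting function of the form $w(\mathbf{r}) = 1 - |\mathbf{r}|/R$ inside the sphere of radius R and zero outside, it is
$$\mathcal{F}[w(\mathbf{r})] = g(s) = \frac{4\pi R^3}{3}\left\{\frac{3[\sin Z - Z\cos Z]}{Z^3} - \frac{3[2Z\sin Z - (Z^2 - 2)\cos Z - 2]}{Z^4}\right\}, \qquad (15.1.2.5)$$
where $Z = 2\pi R s$ and $s$ is the magnitude of the reciprocal-space vector.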
Therefore, the averaging of the truncated electron density by a spherical weighting function can be achieved by two FFTs. This greatly reduced the time required for calculating the averaged density. Other weighting functions may be implemented by the same approach.
A cutoff value, $\rho_{\rm cut}$, is then calculated, which divides the unit cell into two portions occupying the correct volumes for the protein and solvent regions. All points in the map where the smoothed density satisfies $\rho_{\rm ave}(\mathbf{x}) < \rho_{\rm cut}$ can then be assumed to be in the solvent region. A typical mask obtained from an MIR map by this means, and the modified map, are shown in Fig. 15.1.2.2.
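The truncation, FFT-based smoothing and cutoff steps described above can be sketched with NumPy as follows. This is only an illustrative sketch under stated assumptions: the map is sampled on a periodic grid with cubic voxels, the radius is given in voxels, the weighting function is taken as 1 − r/R inside the sphere, and the function names and the synthetic test map are hypothetical.

```python
import numpy as np

def spherical_weight(shape, radius_vox):
    """w(r) = 1 - r/R inside the sphere, 0 outside, centred on the grid origin."""
    axes = [np.fft.fftfreq(n) * n for n in shape]   # signed voxel offsets
    grids = np.meshgrid(*axes, indexing="ij")
    r = np.sqrt(sum(g ** 2 for g in grids))
    return np.clip(1.0 - r / radius_vox, 0.0, None)

def wang_solvent_mask(rho, rho_solv, radius_vox, solvent_fraction):
    """Return (solvent mask, smoothed map) for a periodic density grid."""
    # Truncate the map at the expected solvent density.
    rho_trunc = np.maximum(rho, rho_solv)
    # Normalise the weights so the smoothed map is a local weighted mean.
    w = spherical_weight(rho.shape, radius_vox)
    w /= w.sum()
    # Convolution theorem: two forward FFTs and one inverse FFT.
    rho_ave = np.fft.ifftn(np.fft.fftn(rho_trunc) * np.fft.fftn(w)).real
    # Cutoff chosen so that the requested volume fraction falls below it.
    rho_cut = np.quantile(rho_ave, solvent_fraction)
    return rho_ave < rho_cut, rho_ave

# Synthetic test map: flat solvent, noisy 'protein' cube filling 1/8 of the cell.
rng = np.random.default_rng(0)
rho = np.zeros((32, 32, 32))
rho[8:24, 8:24, 8:24] = 1.0 + rng.standard_normal((16, 16, 16))
mask, rho_ave = wang_solvent_mask(rho, rho_solv=0.0, radius_vox=4.0,
                                  solvent_fraction=0.875)
```

A real implementation would work in the crystal's own cell geometry and symmetry; the cubic-voxel assumption is made here only to keep the distance calculation short.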
The radius of the sphere, R, used in equation (15.1.2.3) for the averaging of electron densities is generally around 8 Å. The molecular envelope derived from such an averaged map tends to lose details of the protein molecular surface. Paradoxically, a large averaging sphere is required for the identification of the protein–solvent boundary based on the difference between the mean density of the protein and solvent, which is very small and can only be distinguished when a sufficiently large area of the map is averaged. Abrahams & Leslie (1996) proposed an alternative method of molecular-boundary identification that uses the standard deviation of the electron density within a given radius relative to the overall mean at every grid point of a map. The local-standard-deviation map is the square root of a convolution of a sphere and the squared map, which can be calculated in reciprocal space in a similar way to the procedure described in equations (15.1.2.4) and (15.1.2.5) as proposed by Leslie (1987). By integrating the histogram of the local-standard-deviation map, the cutoff value of the local standard deviation corresponding to the solvent fraction can be calculated. Using this procedure, a molecular envelope that contains more details of the protein molecular surface can be obtained, since the radius of the averaging sphere can be as low as 4 Å (Abrahams & Leslie, 1996).
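A local-standard-deviation map of the kind described above can be sketched as the square root of a sphere convolved with the squared deviation from the overall mean. The sketch below assumes a periodic grid with cubic voxels and a uniform sphere whose radius is given in voxels; the function name and the synthetic map are hypothetical, and the quantile call stands in for integrating the histogram.

```python
import numpy as np

def local_std_map(rho, radius_vox):
    """Local r.m.s. deviation from the overall mean within a sphere."""
    axes = [np.fft.fftfreq(n) * n for n in rho.shape]   # signed voxel offsets
    grids = np.meshgrid(*axes, indexing="ij")
    r2 = sum(g ** 2 for g in grids)
    sphere = (r2 <= radius_vox ** 2).astype(float)
    sphere /= sphere.sum()                              # uniform local average
    squared = (rho - rho.mean()) ** 2
    local_var = np.fft.ifftn(np.fft.fftn(squared) * np.fft.fftn(sphere)).real
    # Clip tiny negative values caused by floating-point round-off.
    return np.sqrt(np.clip(local_var, 0.0, None))

rng = np.random.default_rng(0)
rho = np.zeros((32, 32, 32))
rho[8:24, 8:24, 8:24] = 1.0 + rng.standard_normal((16, 16, 16))
sigma_local = local_std_map(rho, radius_vox=3.0)
# Cutoff such that the expected solvent fraction lies below it.
solvent_mask = sigma_local < np.quantile(sigma_local, 0.875)
```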
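Once the envelope has been determined, solvent flattening is performed by simply setting the density in the solvent region to the expected value, $\rho_{\rm solv}$:
$$\rho_{\rm mod}(\mathbf{x}) = \begin{cases} \rho(\mathbf{x}), & \mathbf{x} \text{ in the protein region} \\ \rho_{\rm solv}, & \mathbf{x} \text{ in the solvent region}. \end{cases} \qquad (15.1.2.6)$$
If the electron density has not been calculated on an absolute scale, the solvent density may be set to its mean value.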
A related method is solvent flipping, developed by Abrahams & Leslie (1996). In this approach, the flattening operation is modified by the introduction of a relaxation factor, γ, where γ is positive: instead of being removed, the deviations of the density from the solvent value are inverted, effectively 'flipping' the density in the solvent region. The effect of this modification is to correct for the problem of independence in phase combination and is discussed in Section 15.1.4.3.
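The difference between flattening and flipping can be illustrated on a masked array. This is a sketch only: the flip factor `k_flip` is a hypothetical stand-in for whatever positive factor follows from the relaxation factor γ, whose exact form is not reproduced here, and the function names are illustrative.

```python
import numpy as np

def flatten_solvent(rho, solvent_mask, rho_solv=None):
    """Set every solvent point to rho_solv (default: mean solvent density)."""
    if rho_solv is None:
        rho_solv = rho[solvent_mask].mean()
    out = rho.astype(float)          # astype returns a copy
    out[solvent_mask] = rho_solv
    return out

def flip_solvent(rho, solvent_mask, k_flip, rho_solv=None):
    """Invert solvent deviations from rho_solv, scaled by a positive k_flip.

    Assumption: k_flip is an illustrative parameter, not the chapter's gamma.
    """
    if rho_solv is None:
        rho_solv = rho[solvent_mask].mean()
    out = rho.astype(float)
    out[solvent_mask] = rho_solv - k_flip * (rho[solvent_mask] - rho_solv)
    return out

rho = np.array([1.0, 3.0, 10.0, 12.0])
solvent = np.array([True, True, False, False])
flat = flatten_solvent(rho, solvent)
flipped = flip_solvent(rho, solvent, k_flip=0.5)
```

In both cases the protein points are left untouched; only the treatment of the solvent deviations differs.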
Histogram matching seeks to bring the distribution of electron-density values of a map to that of an ideal map. The density histogram of a map is the probability distribution of electron-density values. It provides a global description of the appearance of the map, and all spatial information is discarded. The comparison of the histogram for a given map with that expected for an ideal map can serve as a measure of quality. Furthermore, the initial map can be improved by adjusting density values in a systematic way to make its histogram match the ideal histogram.
Histogram matching is a standard technique in image processing. It is aimed at bringing the density distribution of an image to an ideal distribution, thereby improving the image quality. The first attempt at modifying the electron-density distribution was that by Hoppe & Gassmann (1968), who proposed the '3–2' rule. The electron density was first normalized to a maximum of 1 and modified by imposing positivity. Subsequently, the electron density was modified by $\rho' = 3\rho^2 - 2\rho^3$. Podjarny & Yonath (1977) used the skewness of the density histogram as a measure of quality of the modified map. Harrison (1988) used a Gaussian function as the ideal histogram in his histogram-specification method for protein phase refinement and extension. The choice of the Gaussian function as the ideal electron-density distribution was based on theoretical arguments instead of experimental evaluation. The Gaussian function was also made independent of resolution. Lunin (1988) used the electron-density distribution to retrieve the values of low-angle structure factors whose amplitudes had not been measured during an X-ray experiment. The electron-density distribution was thought to be structure specific and was derived from a homologous structure. Moreover, the histogram was derived from the entire unit cell, including both the protein and the solvent. Zhang & Main (1988) systematically examined the electron-density histogram of several proteins and found that the ideal density histogram is dependent on resolution, the overall temperature factor and the phase error. It is, however, independent of structural conformation. The sensitivity to phase error suggests that the density histogram could be used for phase improvement. The structural conformation independence made it possible to predict the ideal histogram for unknown structures.
Polypeptide structures in particular, and biological macromolecules in general, display a broadly similar atomic composition, and the way in which these atoms bond together is also conserved across a wide range of structures. These similarities between different protein structures can be used to predict the ideal histogram even when positional information for individual atoms is not available in a map. If the positional information is removed from an electron-density map, then what remains is an unlabelled list of density values. This list is the histogram of the electron-density distribution, which is independent of the relative disposition of these densities. The shape of the histogram is primarily based on the presence of atoms and their characteristic distances from each other. This is true for all polypeptide structures.
The frequency distribution, $P(\rho)$, of electron-density values in a map can be constructed by sampling the map and counting the density values in different ranges. In practice, once the electron-density map has been sampled on a discrete grid, this frequency distribution becomes a histogram, but for convenience, it is treated here as a continuous distribution.
At resolutions of better than 6.0 Å and after exclusion of the solvent region, the frequency distribution of electron-density values for protein density over a wide range of proteins varies only with resolution and overall temperature factor to a good approximation. If the overall temperature factor is artificially adjusted, for example, by sharpening to $B_{\rm overall} = 0$, then the frequency distributions may be treated as a function of resolution only. Therefore, once a good approximation to the molecular envelope is known, the frequency distribution of electron densities in the protein region as a function of resolution may be assumed to be known. Therefore, the ideal density histogram for an unknown map at a given resolution can be taken from any known structure at the same resolution (Zhang & Main, 1988, 1990a).
The ideal electron-density histogram can also be predicted by an analytical formula (Lunin & Skovoroda, 1991; Main, 1990a). The method adopted by Main (1990a) represents the density histogram by components that correspond to three types of electron density in the map. The first component is the region of overlapping densities, which can be represented by a randomly distributed background noise. The second component is the region of partially overlapping densities. The third component is the region of non-overlapping atomic peaks, which can be represented by a Gaussian.
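The histogram for the overlapping part of the density can be represented by a Gaussian distribution,
$$P_{o}(\rho) = N\exp\left[-\frac{(\rho - \bar\rho)^2}{2\sigma^2}\right], \qquad (15.1.2.8)$$
where $\bar\rho$ is the mean density and σ is the standard deviation. The region of partially overlapping densities can be modelled by a cubic polynomial function,
$$P_{po}(\rho) = N(a\rho^3 + b\rho^2 + c\rho + d). \qquad (15.1.2.9)$$
The histogram for the non-overlapping part of the density can be derived analytically from a Gaussian atom,
$$P_{no}(\rho) = N\,\frac{A}{\rho}\left[\ln\left(\frac{\rho_0}{\rho}\right)\right]^{1/2}, \qquad (15.1.2.10)$$
where $\rho_0$ is the maximum density, N is a normalizing factor and A is the relative weight of the terms between equation (15.1.2.8) and equation (15.1.2.10).
If we use two threshold values, $\rho_1$ and $\rho_2$, to divide the three density regions, the complete formula can be expressed as
$$P(\rho) = \begin{cases} N\exp[-(\rho - \bar\rho)^2/2\sigma^2], & \rho \le \rho_1 \\ N(a\rho^3 + b\rho^2 + c\rho + d), & \rho_1 < \rho \le \rho_2 \\ N(A/\rho)[\ln(\rho_0/\rho)]^{1/2}, & \rho_2 < \rho \le \rho_0. \end{cases} \qquad (15.1.2.11)$$
The parameters a, b, c, d in the cubic polynomial are calculated by matching function values and gradients at $\rho_1$ and $\rho_2$. The parameters in the histogram formula, $\bar\rho$, σ, A, $\rho_1$, $\rho_2$, $\rho_0$, can be obtained from histograms of known structures.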
Zhang & Main (1990a) demonstrated that, at better than 4 Å resolution, the histogram for an MIR map is generally significantly different from the ideal distribution calculated from atomic coordinates. The obvious course is therefore to alter the map in such a way as to make its density histogram equal to the ideal distribution. Unfortunately, there are an infinite number of maps corresponding to any chosen density distribution, so we must choose a systematic method of altering the map.
The conventional method of performing such a modification is to retain the ordering of the density values in the map. The highest point in the original map will be the highest point in the modified map, the second highest points will correspond in the same way, and so on.
Mathematically, this transformation is represented as follows. Let $P(\rho)$ be the current density histogram and $P'(\rho)$ be the desired distribution, normalized such that their sums are equal to 1. The cumulative distribution functions, $N(\rho)$ and $N'(\rho)$, may then be calculated:
$$N(\rho) = \int_{\rho_{\min}}^{\rho} P(\rho)\,{\rm d}\rho, \qquad N'(\rho') = \int_{\rho'_{\min}}^{\rho'} P'(\rho)\,{\rm d}\rho. \qquad (15.1.2.12)$$
The cumulative distribution function of a variable transforms a value chosen from the distribution into a number between 0 and 1, representing the position of that value in an ordered list of values chosen from the distribution.
The transformation may, therefore, be performed in two stages. A density value is taken from the initial distribution and the cumulative distribution function of the initial distribution is applied to obtain the position of that value in the distribution. The inverse of the cumulative distribution function for the desired distribution is applied to this value to obtain the density value for the corresponding point in the desired distribution. Thus, given a density value, ρ, from the initial distribution, the modified value, ρ′, is obtained by
$$\rho' = N'^{-1}[N(\rho)]. \qquad (15.1.2.13)$$
The distribution of ρ′ will then match the desired distribution after the above transformation. The transformation of an electron-density value by this method is illustrated in Fig. 15.1.2.3. The transformation in equation (15.1.2.13) can be achieved through a linear transform represented by
$$\rho'_i = a_i\rho_i + b_i, \qquad (15.1.2.14)$$
where $a_i$ and $b_i$ are the slope and intercept of the straight line that maps the boundaries of the ith bin of the current distribution onto the corresponding boundaries of the desired distribution, and n is the number of density bins. The above linear transform is sufficient if the number of density bins is large enough. An n value of about 200 is usually quite satisfactory.
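The two-stage transformation can be sketched with empirical cumulative distributions. Note the assumptions: instead of the binned, piecewise-linear transform described above, this sketch ranks the map values directly and reads the desired distribution from a sample of target densities; the function name and the random test data are hypothetical.

```python
import numpy as np

def histogram_match(rho, target_values):
    """Rank-preserving remap of rho onto the distribution of target_values."""
    flat = rho.ravel()
    # Stage 1: position of each value in the ordered list, scaled into (0, 1).
    order = np.argsort(flat, kind="stable")
    position = np.empty(flat.size, dtype=float)
    position[order] = (np.arange(flat.size) + 0.5) / flat.size
    # Stage 2: inverse cumulative distribution (quantile function) of the target.
    matched = np.quantile(target_values, position)
    return matched.reshape(rho.shape)

rng = np.random.default_rng(0)
rho = rng.normal(size=(8, 8, 8))              # stand-in for a protein-region map
target = rng.exponential(size=5000)           # stand-in for an ideal distribution
rho_matched = histogram_match(rho, target)
```

Because only the ranks of the input are used, the highest point of the original map remains the highest point of the modified map, and so on down the ordered list.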
Various properties of the electron density are specified in the density histogram, such as the minimum, maximum and mean density, the density variance, and the entropy of the map. The mean density of the ideal map can be obtained by
$$\bar\rho = \int P(\rho)\,\rho\,{\rm d}\rho. \qquad (15.1.2.15)$$
The variance of the density in the ideal map can be obtained by
$$\sigma^2(\rho) = \overline{\rho^2} - \bar\rho^{\,2}, \qquad (15.1.2.16)$$
where
$$\overline{\rho^2} = \int P(\rho)\,\rho^2\,{\rm d}\rho. \qquad (15.1.2.17)$$
Therefore, the process of histogram matching applies a minimum and a maximum value to the electron density, imposes the correct mean and variance, and defines the entropy of the new map. The order of electron-density values remains unchanged after histogram matching.
Histogram matching is complementary to solvent flattening since it is applied to the protein region of a map, whereas solvent flattening only operates on the solvent region of the map. The same envelope that was used for isolating the solvent region can be used to determine the protein region of the cell. An alternative approach is to define separate solvent and protein masks, with uncertain regions excluded from either mask and allowed to keep their unmodified values.
15.1.2.2.4. Scaling the observed structure-factor amplitudes according to the ideal density histogram
In the process of density modification, electron density or structure factors from different sources are compared and combined. It is, therefore, crucial to ensure that all the structure factors and maps are on the same scale. The observed structure factors can be put on the absolute scale by Wilson statistics (Wilson, 1949) using a scale and an overall temperature factor. This is accurate when atomic or near-atomic resolution data are available. The scale and overall temperature factor obtained from Wilson statistics are less accurate when only medium- to low-resolution data are available. A more robust method of scaling non-atomic-resolution data is through the density histogram (Cowtan & Main, 1993; Zhang, 1993).
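The ideal density histogram defines the mean and variance of an electron density, as shown in equations (15.1.2.15) and (15.1.2.16). We can scale the observed structure-factor amplitudes to be consistent with the target histogram using the following formulae, obtained from the structure-factor equation and Parseval's theorem. The mean density and the density variance of the observed map over the unit cell of volume V can be calculated as
$$\bar\rho_{\rm cell} = \frac{F(000)}{V}, \qquad (15.1.2.18)$$
$$\sigma^2_{\rm cell}(\rho) = \frac{1}{V^2}\sum_{\mathbf{h} \ne 0}|F(\mathbf{h})|^2. \qquad (15.1.2.19)$$
The mean and variance of the electron-density map at the desired resolution are calculated using the target histogram, the mean value of the solvent density, $\rho_{\rm solv}$, and the solvent volume of the cell, $V_{\rm solv}$. The F(000) term can then be evaluated from equations (15.1.2.15) and (15.1.2.18) as the volume-weighted sum of the mean protein density, $\bar\rho$, and the solvent density: $F(000) = \bar\rho\,(V - V_{\rm solv}) + \rho_{\rm solv}V_{\rm solv}$. The scale of the observed amplitudes can be obtained from equations (15.1.2.16) and (15.1.2.19) by requiring the variance of the scaled observed map to equal the variance expected from the target histogram. This method is adequate for scaling observed structure factors at any resolution.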
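The Parseval relations used for scaling can be checked numerically. The sketch below assumes structure factors defined as the discrete Fourier transform of the map multiplied by V/N (N grid points), with the sum running over the full set of reflections including Friedel mates; the function names, the random map and the target variance are hypothetical.

```python
import numpy as np

def map_mean_and_variance(F, volume):
    """Mean and variance of a map from its structure factors (Parseval)."""
    f000 = F.flat[0]
    mean = f000.real / volume
    variance = ((np.abs(F) ** 2).sum() - np.abs(f000) ** 2) / volume ** 2
    return mean, variance

def amplitude_scale(F_obs, volume, target_variance):
    """Scale factor that brings the map variance to the target value."""
    _, observed_variance = map_mean_and_variance(F_obs, volume)
    return np.sqrt(target_variance / observed_variance)

rng = np.random.default_rng(0)
rho = rng.normal(0.4, 0.2, size=(12, 10, 8))      # arbitrary test map
volume = 5000.0                                   # hypothetical cell volume
F = np.fft.fftn(rho) * volume / rho.size          # F(000) = V * mean density
mean, variance = map_mean_and_variance(F, volume)
k = amplitude_scale(F, volume, target_variance=0.09)
```

In practice F(000) is not measured, which is why the text evaluates it separately from the target histogram and the solvent density.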
The averaging method enforces the equivalence of electron-density values between grid points in the map related by noncrystallographic symmetry. The averaging procedure can filter noise, correct systematic error and even determine the phases ab initio in favourable cases (Chapman et al., 1992; Tsao et al., 1992).
Noncrystallographic symmetry (NCS) arises in crystals when there are two or more copies of the same molecule in one asymmetric unit. Such symmetries are local, since they only apply within a subregion of a single unit cell. A fivefold axis, for example, must be noncrystallographic, since it is not possible to tessellate objects with fivefold symmetry. Since the symmetry does not map the crystal lattice back onto itself, the individual molecules that are related by the noncrystallographic symmetry will be in different environments; therefore, the symmetry relationships are only approximate.
Noncrystallographic symmetries provide phase information by the following means. Firstly, the related regions of the map may be averaged together, increasing the ratio of signal to noise in the map. Secondly, since the asymmetric unit must be proportionally larger to hold multiple copies of the molecule, the number of independent diffraction amplitudes available at any resolution is also proportionally larger. This redundancy in sampling the molecular transform leads to additional phase information which can be used for phase improvement.
The self-rotation symmetry is now routinely solved by the use of a Patterson rotation function (Rossmann & Blow, 1962). The translation symmetry can be determined by a translation function (Crowther & Blow, 1967) when a search model, either an approximate structure of the protein to be determined or the structure of a homologous protein, is available. The searches of the Patterson rotation and translation functions are achieved typically using fast automatic methods, such as X-PLOR (Brünger et al., 1987) or AMoRe (Navaza, 1994). In cases where no search model is available or the Patterson translation function is unsolvable, either the whole electron-density map, or a region which is expected to contain a molecule, may be rotated using the rotation solution and used as a search model in a phased translation function (Read & Schierbeek, 1988).
Once the averaging operators are determined, the mask can be determined using the local density correlation function as developed by Vellieux et al. (1995). This is achieved by a systematic search for extended peaks in the local density correlation, which must be carried out over a volume of several unit cells in order to guarantee finding the whole molecule. The local correlation function distinguishes those volumes of crystal space which map onto similar density under transformation by the averaging operator. Thus, in the case of improper NCS, a local correlation mask will cover only one monomer. In the case of a proper symmetry, a local correlation mask will cover the whole complex (Fig. 15.1.2.4a,b).
Special cases arise when there are combinations of crystallographic and noncrystallographic symmetries, of proper and improper symmetries, or when a noncrystallographic symmetry element maps a cell edge onto itself. In the latter case, the volume of matching density is infinite, and arbitrary limits must be placed upon the mask along one crystal axis.
The initial NCS operation obtained from rotation and translation functions or heavy-atom positions can be fine-tuned by a density-space R-factor search in the six-dimensional rotation and translation space. The density-space R factor is defined as
$$R = \frac{\sum_{\mathbf{r}}|\rho(\mathbf{r}) - \rho(\mathbf{r}')|}{\sum_{\mathbf{r}}|\rho(\mathbf{r}) + \rho(\mathbf{r}')|},$$
where $\mathbf{r}$ is the set of Cartesian coordinates, $\mathbf{r}' = \Omega(\mathbf{r})$ is the NCS-related set of coordinates of $\mathbf{r}$ and Ω represents the NCS operator.
The six-dimensional search is very time consuming. The search rate can be increased by using only a representative subset of grid points. The NCS operation is systematically altered to find the lowest density-space R factor for the selected subset of grid points.
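A density-space R-factor search can be illustrated on a toy problem. The sketch below is deliberately simplified and rests on several assumptions: the map is two-dimensional and periodic, the NCS operation is a pure translation by whole grid steps (so no interpolation or rotation is needed), the R factor is taken as the sum of absolute differences over the sum of absolute sums, and all names and numbers are hypothetical.

```python
import numpy as np

def density_r_factor(rho_r, rho_rp):
    """R = sum|rho(r) - rho(r')| / sum|rho(r) + rho(r')| over selected points."""
    return np.abs(rho_r - rho_rp).sum() / np.abs(rho_r + rho_rp).sum()

def blob(shape, centres, heights, width):
    """Sum of Gaussian peaks standing in for one molecule."""
    ii, jj = np.indices(shape)
    out = np.zeros(shape)
    for (ci, cj), h in zip(centres, heights):
        out += h * np.exp(-((ii - ci) ** 2 + (jj - cj) ** 2) / (2 * width ** 2))
    return out

shape = (32, 32)
centres = [(8, 8), (10, 7), (7, 11)]
heights = [1.0, 0.8, 0.6]
rng = np.random.default_rng(1)
# Two copies of the same 'molecule', related by a translation of (14, 11).
rho = blob(shape, centres, heights, 1.2)
rho += blob(shape, [(i + 14, j + 11) for i, j in centres], heights, 1.2)
rho += rng.normal(0.0, 0.01, shape)

# Subset of grid points covering the first copy only.
ii, jj = np.indices(shape)
mask = (ii - 8) ** 2 + (jj - 9) ** 2 <= 25

# Systematically alter the operator around an initial guess.
best, best_r = None, np.inf
for di in range(11, 18):
    for dj in range(8, 15):
        rho_rp = np.roll(rho, shift=(-di, -dj), axis=(0, 1))  # rho(r + t)
        r = density_r_factor(rho[mask], rho_rp[mask])
        if r < best_r:
            best, best_r = (di, dj), r
```

A real search also varies the three rotational parameters and interpolates the density at non-grid positions, which is what makes the full six-dimensional search expensive.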
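The solution of the NCS operation from the six-dimensional search can be further refined by the following least-squares procedure. If $\mathbf{r}'$ is related to $\mathbf{r}$ by the NCS operation, Ω, then
$$\mathbf{r}' = \Omega(\mathbf{r}).$$
Here, Ω is a function of $\boldsymbol{\omega}$, where $\boldsymbol{\omega}$ represents the rotation and translation components of the NCS operation. The solution to the NCS parameters, $\boldsymbol{\omega}$, can be obtained by minimizing the density residual between the NCS-related molecules,
$$\sum_{\mathbf{r}}[\rho(\mathbf{r}) - \rho(\mathbf{r}')]^2,$$
using a least-squares formula of the form
$$\left\{\sum_{\mathbf{r}}\left[\frac{\partial\rho(\mathbf{r}')}{\partial\boldsymbol{\omega}}\right]^{T}\left[\frac{\partial\rho(\mathbf{r}')}{\partial\boldsymbol{\omega}}\right]\right\}\Delta\boldsymbol{\omega} = \sum_{\mathbf{r}}\left[\frac{\partial\rho(\mathbf{r}')}{\partial\boldsymbol{\omega}}\right]^{T}[\rho(\mathbf{r}) - \rho(\mathbf{r}')],$$
where $\Delta\boldsymbol{\omega}$ is the shift to the NCS parameters. Here,
$$\frac{\partial\rho(\mathbf{r}')}{\partial\boldsymbol{\omega}} = \frac{\partial\rho(\mathbf{r}')}{\partial\mathbf{r}'}\,\frac{\partial\mathbf{r}'}{\partial\boldsymbol{\omega}}.$$
The partial derivatives, $\partial\rho(\mathbf{r}')/\partial\mathbf{r}'$, can be calculated by Fourier transforms, or more efficiently with a single Fourier transform by the use of spectral B-splines (Cowtan & Main, 1998). $\partial\mathbf{r}'/\partial\boldsymbol{\omega}$ is derived analytically based on the relationship between the Cartesian coordinates, $\mathbf{r}$, and the rotational and translational coordinates of the NCS operation, $\boldsymbol{\omega}$.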
Once the mask and matrices are determined, the electron-density map may be modified by averaging. This can be achieved in one or two stages. The density for each copy of the molecule in the asymmetric unit may be replaced by the averaged density from every copy; however, this becomes slow for high-order NCS (Fig. 15.1.2.4c). Alternatively, a single averaged copy of the molecule may be created in an artificial cell [referred to by Rossmann et al. (1992) as an H-cell], and then each copy of the molecule may be reconstructed in the asymmetric unit from this copy (Fig. 15.1.2.4d). This is more efficient for high-order NCS, but additional errors are introduced in the second interpolation.
Interpolation of electron-density values at non-map grid sites is usually required, since the NCS operators will not normally map grid points onto each other. To obtain accurate interpolated values, either a fine grid or a complex interpolation function is required; suitable functions are described in Bricogne (1974) and Cowtan & Main (1998). Solvent flattening and histogram matching are frequently applied after averaging, since histogram matching tends to correct for any smoothing introduced by density interpolation.
In the case of flexible proteins, it may be necessary to average only part of the molecule, in which case the averaging mask will exclude some parts of the unit cell which are indicated as protein by the solvent mask. In other cases, it may be necessary to apply multidomain averaging; in this case, the protein is divided into rigid domains which can appear in differing orientations. Each domain must then have a separate mask and set of averaging matrices.
Averaging may also be performed across similar molecules in multiple crystal forms (Schuller, 1996); in this case, density modification is performed on each crystal form simultaneously, with averaging of the molecular density across all copies of the molecule in all crystal forms. This is a powerful technique for phase improvement, even when no phasing is available in some crystal forms.
The skeletonization method enhances connectivity in the map. This is achieved by locating ridges of density, constructing a graph of linked peaks, and then building a new map using cylinders of density around the graph peaks.
At worse than atomic resolution, the density peaks for bonded atoms are no longer resolved, and so interpretation of the density in terms of atomic positions involves recognition of common motifs in the pattern of ridges in the density. Skeletonization was a tool developed by Greer (1985) to assist model building by tracing high ridges in the electron density to describe the connectivity in the map.
Skeletonization has more recently been adapted to the problem of density modification (Baker, Bystroff et al., 1993; Bystroff et al., 1993; Wilson & Agard, 1993). A skeleton is constructed by tracing the ridges in the map. The resulting ridges form connected 'trees'. These trees may be pruned to remove small unconnected fragments and break circuits to select for protein-like features. A new map may then be built by building density around the links of the skeleton using the profile of a cylindrically averaged atom at the appropriate resolution.
The skeletonization method has been used to add new features to a partial model of a molecule (Baker, Bystroff et al., 1993). An efficient alternative algorithm for tracing density ridges is given by Swanson (1994).
Sayre's equation constrains the local shape of electron density. It provides a link between all structurefactor amplitudes and phases. It is an exact equation at atomic resolution in an equalatom system. It is, therefore, very powerful for phase refinement and extension for small molecules at atomic resolution (Sayre, 1952, 1972, 1974). However, its power diminishes as resolution decreases. It can still be an effective tool for macromolecular phase refinement and extension if the shape function can be modified to accommodate the overlap of atoms at nonatomic resolution (Zhang & Main, 1990b).
Sayre's equation (Sayre, 1952, 1972, 1974) expresses the constraint on structure factors when the atoms in a structure are equal and resolved, and the equation has formed the foundation of direct methods. In protein calculations, the resolution is generally too poor for atoms to be resolved, and this is reflected in the bulk of the terms required to calculate the equation for any particular missing structure factor.
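For equal and resolved atoms, squaring the electron density changes only the shape of the atomic peaks and not their positions. The original density may therefore be restored by convoluting with some smoothing function, $\psi(\mathbf{x})$, which is a function of atomic shape,

$$\rho(\mathbf{x}) = \frac{V}{N}\sum_{\mathbf{y}} \rho^{2}(\mathbf{y})\,\psi(\mathbf{x}-\mathbf{y}), \qquad (15.1.2.30)$$

where the sum runs over the N grid points of the unit cell and

$$\psi(\mathbf{x}) = \frac{1}{V}\sum_{\mathbf{h}} \theta(\mathbf{h})\exp(-2\pi i\,\mathbf{h}\cdot\mathbf{x}).$$

Here, $\theta(\mathbf{h})$ is the ratio of scattering factors of real, $f(\mathbf{h})$, and `squared', $g(\mathbf{h})$, atoms, and V is the unit-cell volume, i.e.,

$$\theta(\mathbf{h}) = f(\mathbf{h})/g(\mathbf{h}).$$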
Sayre's equation states that the convolution of the squared electron density with a shape function restores the original electron density. It can be seen from equation (15.1.2.30) that Sayre's equation puts constraints on the local shape of electron density. The local shape function is the Fourier transform of the ratio of scattering factors of the real and `squared' atoms.
Sayre's equation is more frequently expressed in reciprocal space as a system of equations relating structure factors in amplitude and phase:

$$F(\mathbf{h}) = \frac{\theta(\mathbf{h})}{V}\sum_{\mathbf{k}} F(\mathbf{k})\,F(\mathbf{h}-\mathbf{k}).$$

The reciprocal-space expression of Sayre's equation can be obtained directly from a Fourier transformation of both sides of equation (15.1.2.30) and the application of the convolution theorem.
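The relation can be checked numerically on a toy one-dimensional cell. The following is a minimal sketch, not taken from the chapter: it assumes a unit cell of length V = 1, three equal Gaussian atoms that are well resolved, and the analytic Gaussian form of the shape function for such atoms. The function name `sayre_check` and all parameter values are illustrative.

```python
import numpy as np

def sayre_check(n_grid=256, sigma=0.02, positions=(0.2, 0.5, 0.8)):
    """Return (F, theta * G) for a 1D cell (V = 1) of equal, resolved Gaussian atoms."""
    x = np.arange(n_grid) / n_grid
    rho = np.zeros(n_grid)
    for x0 in positions:
        d = (x - x0 + 0.5) % 1.0 - 0.5          # shortest periodic distance to the atom
        rho += np.exp(-d**2 / (2.0 * sigma**2))
    # Fourier coefficients of the density and of the squared density
    F = np.fft.fft(rho) / n_grid
    G = np.fft.fft(rho**2) / n_grid              # equals sum_k F(k) F(h - k)
    h = np.fft.fftfreq(n_grid, d=1.0 / n_grid)   # integer index of each Fourier term
    # theta = f/g for Gaussian atoms of width sigma (squared atoms have width sigma/sqrt(2))
    theta = np.sqrt(2.0) * np.exp(-(np.pi * sigma * h) ** 2)
    return F, theta * G

F, F_sayre = sayre_check()
max_dev = np.max(np.abs(F - F_sayre))   # tiny when the atoms do not overlap
```

Moving the atoms closer together, or broadening them so that they overlap, makes the two sides disagree, which is the situation at non-atomic resolution.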
15.1.2.5.2. The application of Sayre's equation to macromolecules at non-atomic resolution – the θ(s) curve
Sayre's equation is exact for an equal-atom structure at atomic resolution. The reciprocal-space shape function, $\theta(\mathbf{h})$, can be calculated analytically from the ratio of the scattering factors of real and `squared' atoms, which can both be represented by a Gaussian function. At infinite resolution, we expect $\theta(\mathbf{h})$ to be a spherically symmetric function that decreases smoothly with increased h. However, for data at non-atomic resolution, the $\theta(\mathbf{h})$ curve will behave differently because atomic overlap changes the peak shapes. Therefore, a spherical-averaging method is adopted to obtain an estimate of the shape function empirically from the ratio of the observed structure factors and the structure factors from the squared electron density using the formula

$$\theta(s) = \left\langle \frac{V\,F(\mathbf{h})}{\sum_{\mathbf{k}} F(\mathbf{k})\,F(\mathbf{h}-\mathbf{k})} \right\rangle_{|\mathbf{h}|},$$

where the averaging is carried out over ranges of $|\mathbf{h}|$, i.e., over spherical shells, each covering a narrow resolution range. Here, s represents the modulus of h.
The empirically derived shape function $\theta(s)$ only extends to the resolution of the experimentally observed phases. This is sufficient for phase refinement. However, there are no experimentally observed phases to give the empirical $\theta(s)$ for phase extension. Therefore, a Gaussian function of the form $\theta(s) = K\exp(-Bs^{2})$ is fitted to the available values of $\theta(s)$, and the parameters K and B are obtained using a least-squares method. The shape function for the resolution beyond that of the observed phases is extrapolated using the fitted Gaussian function. The derivation of the shape function from a combination of spherical averaging and Gaussian extrapolation is the key to the successful application of Sayre's equation for phase improvement at non-atomic resolution (Zhang & Main, 1990b).
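The fit and extrapolation can be sketched as follows. This is an illustration only: it assumes the least-squares fit is done on the linearized form ln θ = ln K − B s², which requires positive θ values, and the shell values used here are synthetic rather than measured. The function name `fit_theta_gaussian` is hypothetical.

```python
import numpy as np

def fit_theta_gaussian(s, theta):
    """Fit theta(s) = K exp(-B s^2) by a straight-line fit of ln(theta) against s^2."""
    slope, intercept = np.polyfit(s**2, np.log(theta), 1)
    return np.exp(intercept), -slope             # K, B

# Synthetic shell-averaged values out to the limit of the phased data (s in 1/angstrom)
s_obs = np.linspace(0.05, 0.27, 12)
theta_obs = 1.4 * np.exp(-6.0 * s_obs**2)
K, B = fit_theta_gaussian(s_obs, theta_obs)

# Extrapolate the shape function to shells beyond the phased data
s_ext = np.linspace(0.28, 0.48, 5)
theta_ext = K * np.exp(-B * s_ext**2)
```

With real data the shell averages are noisy, so a weighted fit or a direct nonlinear fit may be preferable to the log-linear form used in this sketch.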
The atomization method uses the fact that the structure underlying the map consists of discrete atoms. It attempts to interpret the map by automatically placing atoms and refining their positions.
Agarwal & Isaacs (1977) proposed a method for the extension of phases to higher resolutions by interpreting an electrondensity map in terms of `dummy' atoms. These are so called because at the initial resolution of 3.0 Å, true atom peaks could not be resolved. The placement of `dummy atoms' is subject to constraints of bonding distance and the number of neighbours. The coordinates and temperature factors of these dummy atoms may then be refined against all the available diffraction amplitudes. Structure factors may then be calculated from the refined coordinates to provide phases for the highresolution reflections and to improve the phases of the starting set.
The atomization approach has been extended in the ARP program (Lamzin & Wilson, 1997) by the use of differencemap criteria to test dummyatom assignments, with the aim of removing wrong atoms and introducing missing atoms. With modern refinement algorithms, this technique has become very effective for the solution of structures at high resolution from a poor molecularreplacement model, or even directly from an MIR/MAD map.
Map improvement has also been demonstrated at intermediate resolutions by Perrakis et al. (1997) using a multisolution variant of the ARP method, and by Vellieux (1998).
The interpretation of an approximately phased map has also been applied very successfully as part of the `Shake n' Bake' directmethods procedure (Miller et al., 1993; Weeks et al., 1993). The alternating application of phase refinement by the minimum principle in reciprocal space (`Shake') and atomization in real space (`Bake') has proved to be a very powerful method for solving small protein structures at atomic resolution using only structurefactor amplitudes.
Density modification, although mostly performed in real space for ease of application, can be understood in terms of reciprocalspace constraints on structurefactor amplitudes and phases.
Main & Rossmann (1966) showed that the NCS-averaging operation in real space can be expressed in reciprocal space as the convolution of the structure factors and the Fourier transform of the molecular envelope and the NCS matrices. Similarly, the solvent-flattening operation can be considered a multiplication of the map by some mask, $g_{\rm sf}(\mathbf{x})$, where $g_{\rm sf}(\mathbf{x}) = 1$ in the protein region and $g_{\rm sf}(\mathbf{x}) = 0$ in the solvent region. Thus

$$\rho_{\rm mod}(\mathbf{x}) = g_{\rm sf}(\mathbf{x})\,\rho_{\rm init}(\mathbf{x}).$$

This assumes that the solvent level is zero, which can be achieved by suitable adjustment of the $F(000)$ term.
If we transform this equation to reciprocal space, then the product becomes a convolution; thus

$$F_{\rm mod}(\mathbf{h}) = \frac{1}{V}\sum_{\mathbf{k}} G_{\rm sf}(\mathbf{k})\,F_{\rm init}(\mathbf{h}-\mathbf{k}), \qquad (15.1.3.2)$$

where $G_{\rm sf}(\mathbf{h})$ is the Fourier transform of the mask $g_{\rm sf}(\mathbf{x})$. The solvent mask shows the outline of the molecule with no internal detail, so must be a low-resolution image. Therefore, all but the lowest-resolution terms of $G_{\rm sf}(\mathbf{h})$ will be negligible.
The convolution expresses the relationship between phases in reciprocal space from the constraint of solvent flatness in real space. Since only the terms near the origin of $G_{\rm sf}(\mathbf{h})$ are nonzero, the convolution can only relate phases that are local to each other in reciprocal space. Thus, it can only provide phase information for structure factors near the current phasing resolution limit.
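The equivalence between masking in real space and convolution in reciprocal space can be verified directly on a small grid. The sketch below is an illustration under simple assumptions: a one-dimensional cell with V = 1, Fourier coefficients normalized by the number of grid points, a random stand-in for the starting map, and an arbitrary box-shaped mask. All variable names are illustrative.

```python
import numpy as np

n = 64
rng = np.random.default_rng(0)
rho_init = rng.normal(size=n)               # stand-in for a noisy starting map
g_sf = np.zeros(n)
g_sf[10:40] = 1.0                           # 1 in 'protein', 0 in 'solvent'

F_init = np.fft.fft(rho_init) / n
G_sf = np.fft.fft(g_sf) / n

# Real space: multiply the map by the mask, then transform
F_mod_real = np.fft.fft(g_sf * rho_init) / n

# Reciprocal space: circular convolution sum_k G(k) F(h - k)
idx = np.arange(n)
F_mod_conv = np.array([np.sum(G_sf * F_init[(h - idx) % n]) for h in range(n)])
```

The two routes give the same coefficients. The real-space route costs two FFTs, whereas the explicit convolution scales with the square of the number of reflections, which is one reason the modification is normally applied to the map.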
This reasoning may also be applied to other density modifications. Histogram matching applies a nonlinear rescaling to the current density in the protein region. The equivalent multiplier, $g_{\rm hm}(\mathbf{x})$, shows variations of about 1.0 that are related to the features in the initial map. The function $G_{\rm hm}(\mathbf{h})$ for histogram matching is, therefore, dominated by its origin term, but shows significant features to the same resolution as the current map or further, as the density rescaling becomes more nonlinear. Histogram matching can therefore give phase indications to twice the resolution of the initial map or beyond, although phase indications will be weak and contain errors related to the level of error in the initial map.
Averaging may be described as the summation of a number of reoriented copies of the electron density within the region of the averaging mask (Main & Rossmann, 1966), i.e.

$$\rho_{\rm mod}(\mathbf{x}) = \frac{1}{N_{\rm ncs}}\sum_{i} g_{\rm ncs}(\mathbf{x})\,\rho_{i}(\mathbf{x}),$$

where $\rho_{i}(\mathbf{x})$ is the initial density, $\rho_{\rm init}(\mathbf{x})$, transformed by the ith NCS operator, $N_{\rm ncs}$ is the number of NCS copies and $g_{\rm ncs}(\mathbf{x})$ is the mask of the molecule to be averaged. This summation is repeated for each copy of the molecule in the whole unit cell. The reciprocal-space averaging function, $G_{\rm ncs}(\mathbf{h})$, is the Fourier transform of a mask, as for solvent flattening, but since the mask covers only a single molecule, rather than the molecular density in the whole unit cell, the extent of $G_{\rm ncs}(\mathbf{h})$ in reciprocal space is greater.
Sayre's equation is already expressed as a convolution, although in this case the function $G(\mathbf{h})$ is given by the structure factors themselves. It is, therefore, the most powerful method for phase extension. However, as resolution decreases, more of the reflections required to form the convolution are missing, and the error increases.
The functions $g(\mathbf{x})$ and $G(\mathbf{h})$ for these density modifications are illustrated in Fig. 15.1.3.1 for a simple one-dimensional structure.
Phase combination is used to filter the noise in the modified phases and eliminate the incorrect component of the modified phases through a statistical process. The observed structure-factor amplitudes are used to estimate the reliability of the phases after density modification. The estimated probability of the modified phases is combined with the probability of observed phases to produce a more reliable phase estimate,

$$P_{\rm new}[\varphi(\mathbf{h})] = P_{\rm obs}[\varphi(\mathbf{h})]\,P_{\rm mod}[\varphi(\mathbf{h})]. \qquad (15.1.4.1)$$
Once a modified map has been obtained, modified phases and amplitudes may be derived from an inverse Fourier transform. The modified phases are normally combined with the initial phases by multiplication of their probability distributions. The probability distribution for the experimentally observed phases is usually described in terms of a best phase and figure of merit (Blow & Rossmann, 1961) or by Hendrickson–Lattman coefficients (Hendrickson & Lattman, 1970). In order to estimate a unimodal probability distribution for the modified phase, some estimate of the associated error must be made; this is usually achieved using the Sim weighting scheme (Sim, 1959).
Recombination with the initial phases assumes independence between the initial and modified phases and is a source of difficulties. However, in the absence of some form of phase constraint, most densitymodification constraints are too weak to guarantee convergence to a reasonable solution. The exception is when highorder NCS is present; in this case, the combination of NCS and observed amplitudes is sufficient to determine the phases (Chapman et al., 1992; Tsao et al., 1992), and phase combination may be omitted; however, weighting of the phases is still necessary. In this case, it is also possible to restore missing reflections in both amplitude and phase.
The phase probability distribution for the density-modified phase is conventionally generated under assumptions that were made for the combination of a partial atomic model with experimental data. It assumes that the calculated amplitudes and phases arise from a density map in which some atoms are present and correctly positioned, and the remainder are completely absent (Sim, 1959). Thus, the difference between the true structure factor and the calculated value must be the effective structure factor due to the missing density alone. If the phase of this quantity is random and the amplitude is drawn from a Wilson distribution (Wilson, 1949), the following expression is obtained:

$$P_{\rm mod}[\varphi(\mathbf{h})] \propto \exp\{X\cos[\varphi(\mathbf{h}) - \varphi_{\rm mod}(\mathbf{h})]\},$$

where

$$X = 2|F_{\rm obs}(\mathbf{h})||F_{\rm mod}(\mathbf{h})|/\Sigma_{Q}$$

and where $\Sigma_{Q}$ is the variance parameter in the Wilson distribution for the missing part of the structure. The figure of merit, w, can be derived from

$$w(\mathbf{h}) = I_{1}(X)/I_{0}(X),$$

where $I_{0}$ and $I_{1}$ are zero- and first-order modified Bessel functions. A similar argument follows for centric reflections.
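The combination of phase distributions and the resulting figure of merit can be illustrated numerically for a single acentric reflection. The sketch below is an assumption-laden illustration, not the chapter's implementation: both distributions are taken to have the unimodal form exp[X cos(φ − φ_best)], they are sampled on a 1° phase grid, and the figure of merit is computed as the length of the centroid of the distribution. The function names and the numerical values are hypothetical.

```python
import numpy as np

PHI = np.deg2rad(np.arange(0.0, 360.0, 1.0))     # phase grid in 1 degree steps

def sim_x(f_obs, f_mod, sigma_q):
    """Sim-type parameter X = 2 |F_obs| |F_mod| / Sigma_Q."""
    return 2.0 * f_obs * f_mod / sigma_q

def phase_distribution(x_param, phi_best):
    """Unnormalized P(phi) proportional to exp[X cos(phi - phi_best)], sampled on PHI."""
    return np.exp(x_param * np.cos(PHI - phi_best))

def best_phase_and_fom(p):
    """Centroid of a sampled distribution: (best phase in radians, figure of merit)."""
    centroid = np.sum(p * np.exp(1j * PHI)) / np.sum(p)
    return np.angle(centroid), np.abs(centroid)

p_obs = phase_distribution(1.0, np.deg2rad(0.0))                        # experimental phase
p_mod = phase_distribution(sim_x(10.0, 8.0, 160.0), np.deg2rad(90.0))   # X = 1
p_new = p_obs * p_mod                                                   # phase combination

phi_new, w_new = best_phase_and_fom(p_new)
_, w_single = best_phase_and_fom(p_mod)          # numerically equal to I1(X)/I0(X)
```

Because the two inputs here are equally sharp, the combined best phase falls midway between them and the combined figure of merit is higher than that of either input, which is exactly the behaviour that becomes misleading when the two distributions are not truly independent.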
The error estimate for the phase depends on the effective amount of missing structure, which is estimated on the basis of the agreement of the modified amplitudes with their measured values. The corresponding variance parameter may be estimated by a number of means, for example from the discrepancy between the modified and observed amplitudes (Bricogne, 1976), with the average normally taken over all reflections at a particular resolution. A more sophisticated approach is the method of Read (1986), which allows for errors in the atomic model and has also been used in density modification (Chapter 15.2).
Although these approaches have been applied with some success, the assumption in equation (15.1.4.1) that the densitymodified amplitudes and phases are independent of the initial values is invalid. Since the density constraints are typically underdetermined, it is possible to achieve an arbitrarily good agreement between the model amplitudes and their observed values without improving the phases. As a result, phase weights from density modification are typically overestimated.
This problem has traditionally been addressed by limiting the number of cycles of density modification in which weakly phased reflections are included. Typically, density modification is started with only some subset of the data, such as those reflections well phased from MIR data. Only these reflections are included in the phase recombination, with other reflections set to zero. As the calculation progresses, more reflections are introduced until all the data are included. The figures of merit of reflections that undergo fewer cycles of phase recombination will be correspondingly smaller (e.g. Leslie, 1987; Zhang & Main, 1990a). In averaging calculations where considerable phase information is available from highorder NCS, it is still typically necessary to perform phase extension over hundreds of cycles and to add a very thin resolution shell of new reflections at each cycle.
The phases and figure of merit generated from density modification are more suited to the calculation of weighted $F_{o}$ maps than $2F_{o} - F_{c}$ maps. The $2F_{o} - F_{c}$ map is designed to aid the structure completion from a partial model (Main, 1979). The $2F_{o} - F_{c}$ map will restore features missing from the current model at full weight if the following conditions are fulfilled. First, the model phases must be close to their true values. Secondly, the difference between the model and observed amplitudes is a good indicator of the phase error and the difference between the calculated and observed amplitudes decreases as the phases approach their true values. Neither of these assumptions is necessarily true for density modification, since it may be applied to very poor maps with almost random phases, and under most density-modification schemes the structure-factor amplitudes may be overfitted to the observed values.
The modified map may be made more independent of the original map, as was assumed when multiplying the phase probability distributions in equation (15.1.4.1), through a reciprocalspace analogue of the omit map, the reflectionomit method.
The reflections are divided into (typically 10 or 20) sets and densitymodification calculations are performed, excluding each set in turn from the calculation of the starting map, in a manner similar to a freeRvalue calculation (Brünger, 1992). Density modification is applied to each map in turn, and the modified reflections from each of the free sets are combined to give a new, complete data set. This data set should be less dependent on the original amplitudes; therefore, the amplitudes may be expected to give a better indication of the quality of the modified phases.
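The bookkeeping of one reflection-omit cycle can be sketched in one dimension. This is a toy illustration under stated assumptions: reflections are held as the half-space coefficients returned by `np.fft.rfft`, so each reflection and its Friedel mate are omitted together; the origin term is never omitted; and `solvent_flatten` is a deliberately crude stand-in for a real density-modification step. All names are hypothetical.

```python
import numpy as np

def solvent_flatten(rho, mask):
    """Toy modification: keep density inside the mask, set the rest to its mean."""
    out = rho.copy()
    out[~mask] = rho[~mask].mean()
    return out

def reflection_omit(F_half, n_grid, modify, n_sets=10, seed=0):
    """One reflection-omit cycle on half-space coefficients F_half (from np.fft.rfft)."""
    rng = np.random.default_rng(seed)
    set_id = rng.integers(0, n_sets, size=F_half.size)   # free set of each reflection
    set_id[0] = -1                                       # keep the origin term in every map
    F_new = F_half.copy()
    for s in range(n_sets):
        free = set_id == s
        F_start = np.where(free, 0.0, F_half)            # starting map lacks the free set
        rho_mod = modify(np.fft.irfft(F_start, n=n_grid))
        F_new[free] = np.fft.rfft(rho_mod)[free]         # keep only the free-set estimates
    return F_new, set_id

n_grid = 64
rho = np.random.default_rng(1).normal(size=n_grid)
mask = np.zeros(n_grid, dtype=bool)
mask[8:40] = True
F_half = np.fft.rfft(rho)

F_omit, set_id = reflection_omit(F_half, n_grid, lambda r: solvent_flatten(r, mask))
F_identity, _ = reflection_omit(F_half, n_grid, lambda r: r)   # no modification at all
```

With the identity in place of a real modification, every omitted reflection comes back as zero, which shows that the returned values carry no direct memory of the reflection's own starting value.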
The resulting maps obtained using solvent flattening and/or histogram matching are dramatically improved using the reflectionomit method (Cowtan & Main, 1996). In the case of averaging calculations, however, the reflectionomit approach makes little difference, since omitted reflections tend to be restored through noncrystallographic symmetry relationships to other regions of reciprocal space. It is possible that further improvements may be achieved by selecting reflection sets that approximately obey the NCS relationships.
Abrahams & Leslie (1996) have shown that solvent flipping is dramatically more effective as a density modification than solvent flattening. This may be shown to be theoretically equivalent to performing a reflectionomit calculation for each reflection individually (Abrahams, 1997).
Solvent flattening is represented in reciprocal space by convolution of the structure factors with a function, $G(\mathbf{h})$, as shown in equation (15.1.3.2). If the origin term of G is set to zero, then the modified structure factor, $F_{\rm mod}(\mathbf{h})$, will depend on the values of all the structure factors except itself; this is equivalent to performing a reflection-omit calculation with that reflection alone omitted.
Let the origin-removed G be called $G'(\mathbf{h})$ and its Fourier transform $g'(\mathbf{x})$: then

$$F_{\rm mod}(\mathbf{h}) = \frac{1}{V}\sum_{\mathbf{k}} G'(\mathbf{k})\,F_{\rm init}(\mathbf{h}-\mathbf{k}).$$

The convolution of the reflection data with $G'(\mathbf{h})$ is equivalent to performing a reflection-omit calculation, omitting every reflection in turn. However, the convolution may still be performed in real space; thus, the full omit calculation becomes a simple multiplication of the map by $g'(\mathbf{x})$:

$$\rho_{\rm mod}(\mathbf{x}) = g'(\mathbf{x})\,\rho_{\rm init}(\mathbf{x}).$$

In a solvent-flattening calculation, $g'(\mathbf{x})$ will be equal to $g(\mathbf{x})$ minus the fraction of the cell that is protein. In the case of a cell with 50% solvent, $g'(\mathbf{x})$ has a value of 0.5 in the protein and −0.5 in the solvent. Multiplication of the map by this function results in flipping of the solvent.
If the origin term of the G function, γ, can be determined, then the flipping calculation may alternatively be performed by subtracting a copy of the initial map scaled by γ from the modified map. This is the γ correction of Abrahams (1997). This approach may be generalized to arbitrary densitymodification methods by use of the perturbation γ (Cowtan, 1999). In this approach, a random perturbation is applied to the starting data. Density modification is applied to both the perturbed and unperturbed maps. The relative size of the perturbation signal in the modified map gives an estimate for γ. The perturbation γ provides effective bias correction for any combination of solvent flattening, histogram matching and averaging. γ may also be estimated as a function of resolution, allowing successful application to multiresolution modification and possibly atomization as well.
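For plain solvent flattening, the two routes to a flipped map can be compared directly. The sketch below is illustrative only and assumes a one-dimensional cell with 50% solvent, a solvent level of zero, and γ taken as the origin term of the mask transform, that is, the protein fraction. Variable names are hypothetical.

```python
import numpy as np

n = 100
rho_init = np.random.default_rng(2).normal(size=n)
g = np.zeros(n)
g[:50] = 1.0                               # 50% protein, 50% solvent

gamma = g.mean()                           # origin term of G: the protein fraction
g_prime = g - gamma                        # origin-removed mask in real space

rho_flat = g * rho_init                    # ordinary solvent flattening
rho_flip_mask = g_prime * rho_init         # multiplication of the map by g'
rho_flip_gamma = rho_flat - gamma * rho_init   # gamma correction of the flattened map
```

The two flipped maps are identical: the protein density is scaled by 0.5 and the solvent density is scaled by −0.5, so solvent features change sign.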
The Fourier coefficients of the densitymodified map include nonzero contributions for reflections that were not present in the original electrondensity map. These values are commonly used to restore the values of reflections that were missing from the original data (including lowresolution reflections falling behind the backstop). However, they have recently also been employed to extrapolate structure factors beyond the resolution of the observed data (Caliandro et al., 2005; Usón et al., 2007).
In the scheme of Usón et al., the Fourier coefficients of the densitymodified map are normalized to obey a Wilson distribution (calculated using the overall temperature factor of the observed data). These are then used in the calculation of the next electrondensity map. At high resolution (1.35 Å in the case reported), this procedure leads to both improved phases for the observed structure factors at lower resolutions and good estimates for the extrapolated structure factors – these can be judged by comparison with the calculated values obtained from the final refined model. The resulting electrondensity map obtained using density modification alone can rival the quality of the final `best' map obtained from refinement.
The theoretical basis for the effectiveness of this approach has yet to be fully explained. Current implementations are mainly effective at high resolution, where the extrapolated resolution can approach atomic resolution.
The chemical and physical information of the underlying structure that the electron density represents serves as constraints on the phases. For small molecules, the constraints of positivity and atomicity are sufficient to solve the phase problem ab initio (Hauptman, 1986; Karle, 1986; Woolfson, 1987), because crystals of small molecules generally diffract to atomic resolution. However, no single constraint at our disposal is powerful enough to render the macromolecular phase problem determinable, because macromolecule crystals rarely diffract to atomic resolution. Therefore, individual constraints are combined to produce a more powerful densitymodification protocol. This is because these constraints represent different characteristic features of the electron density and they contain independent phasing information.
The phasing power of a method increases with the number of independent constraints employed, the number of density points affected and the amplitude of changes imposed on the electron density. It also depends on the physical nature and accuracy of the constraints and how the constraints are applied. One obvious way of implementing several constraints is to apply them one after the other to the electron density. This sequential application, although easy to implement, suffers some drawbacks. The cyclic application of all constraints may not converge easily, since some constraints may contain contradicting information as to how the density should be modified. An alternative way of implementing various constraints is simultaneous application. The density solution that satisfies all the constraints is obtained by a global minimization procedure (Main, 1990b; Zhang & Main, 1990b).
The constraints used in SQUASH/DM can be divided into three categories. The first category comprises the linear constraints, such as solvent flatness, density histogram and equal molecules. The second category comprises the nonlinear constraints, such as the local shape of electron density as expressed in Sayre's equation. The third category comprises the available structural data, such as the observed structurefactor amplitudes and the experimental phases. The first and second categories of constraints are used to solve new electrondensity values. The third category of constraints is used as a means to filter the modified phases.
The modification to the density value at a grid point by a linear constraint is independent of the values at other grid points. These constraints include solvent flattening, histogram matching and molecular averaging. These density-modification methods construct an improved map directly from an initial density map as expressed by

$$\rho(\mathbf{x}) = H(\mathbf{x}), \qquad (15.1.5.1)$$

where $H(\mathbf{x})$ is the target electron density produced by these linear constraints.
The new electron density that satisfies both the linear constraints represented by equation (15.1.5.1) and the nonlinear constraints expressed by Sayre's equation (15.1.2.30) can be obtained by solving the systems of simultaneous equations (Zhang & Main, 1990b)

$$\begin{cases} \rho(\mathbf{x}) = \dfrac{V}{N}\displaystyle\sum_{\mathbf{y}} \rho^{2}(\mathbf{y})\,\psi(\mathbf{x}-\mathbf{y}), \\[2ex] \rho(\mathbf{x}) = H(\mathbf{x}). \end{cases} \qquad (15.1.5.2)$$

Equation (15.1.5.2) represents a system of nonlinear simultaneous equations with as many unknowns as the number of grid points in the asymmetric unit of the map and with twice as many equations as unknowns. The functions $\psi(\mathbf{x})$ and $H(\mathbf{x})$ are both known. The least-squares solution, using either the full matrix or the diagonal approximation, is obtained using the Newton–Raphson technique with fast Fourier transforms, as described in the next section (Main, 1990b).
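For a system of nonlinear equations of electron density,

$$F_{i}(\rho_{1}, \ldots, \rho_{n}) = 0, \quad i = 1, \ldots, m, \quad \text{i.e.} \quad \mathbf{F}(\boldsymbol{\rho}) = \mathbf{0}, \qquad (15.1.5.3)$$

where 0 is a null vector, n is the number of grid points and m is the number of equations, the Newton–Raphson method of solution is to find a set of shifts, $\delta\rho_{1}$ to $\delta\rho_{n}$, through a system of linear equations,

$$\mathbf{J}\,\delta\boldsymbol{\rho} = -\boldsymbol{\varepsilon}, \qquad (15.1.5.4)$$

where J is a matrix of partial derivatives of F with respect to $\boldsymbol{\rho}$ and is called the Jacobian matrix, $\boldsymbol{\varepsilon} = \mathbf{F}(\boldsymbol{\rho}_{k})$ is a vector of residuals to equation (15.1.5.3) for a trial solution, $\boldsymbol{\rho}_{k}$, and $\delta\boldsymbol{\rho}$ is a vector of shifts to the density. Hence, the solution for $\boldsymbol{\rho}$ is achieved in an iterative manner,

$$\boldsymbol{\rho}_{k+1} = \boldsymbol{\rho}_{k} + \delta\boldsymbol{\rho}. \qquad (15.1.5.6)$$

Therefore, the problem of solving a system of nonlinear equations (15.1.5.3) is transformed into solving a system of linear equations (15.1.5.4), which forms one cycle of Newton–Raphson iteration.
If there are more equations than unknowns ($m > n$), the unknowns are obtained through a least-squares solution to equations (15.1.5.4),

$$\mathbf{J}^{T}\mathbf{J}\,\delta\boldsymbol{\rho} = -\mathbf{J}^{T}\boldsymbol{\varepsilon}. \qquad (15.1.5.7)$$

Theoretically, the above system of equations could be solved by matrix multiplication and inversion, i.e.

$$\delta\boldsymbol{\rho} = -(\mathbf{J}^{T}\mathbf{J})^{-1}\mathbf{J}^{T}\boldsymbol{\varepsilon}.$$

However, the amount of calculation involved in setting up the normal matrix of least squares is huge for the problem presented by protein structures. This can be completely avoided by using the conjugate-gradient technique for solving the system of linear equations.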
The conjugategradient method does not require the inversion of the normal matrix, and therefore the solution to a large system of linear equations can be achieved very quickly.
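Starting from a trial solution to equations (15.1.5.4), such as a null vector, $\delta\boldsymbol{\rho}_{0} = \mathbf{0}$, the initial residual is

$$\mathbf{r}_{0} = -\mathbf{J}^{T}\boldsymbol{\varepsilon} \qquad (15.1.5.10)$$

and the initial search step is

$$\mathbf{p}_{0} = \mathbf{r}_{0}.$$

The iterative process is as follows. The new shift to the density is

$$\delta\boldsymbol{\rho}_{k+1} = \delta\boldsymbol{\rho}_{k} + \alpha_{k}\mathbf{p}_{k}, \qquad (15.1.5.12)$$

where

$$\alpha_{k} = \frac{\mathbf{r}_{k}^{T}\mathbf{r}_{k}}{\mathbf{q}_{k}^{T}\mathbf{q}_{k}} \qquad (15.1.5.13)$$

and

$$\mathbf{q}_{k} = \mathbf{J}\mathbf{p}_{k}. \qquad (15.1.5.14)$$

The new residual is

$$\mathbf{r}_{k+1} = \mathbf{r}_{k} - \alpha_{k}\mathbf{s}_{k}, \qquad (15.1.5.15)$$

where

$$\mathbf{s}_{k} = \mathbf{J}^{T}\mathbf{q}_{k}. \qquad (15.1.5.16)$$

The next search step which conjugates with the residual is

$$\mathbf{p}_{k+1} = \mathbf{r}_{k+1} + \beta_{k}\mathbf{p}_{k}, \qquad (15.1.5.17)$$

where

$$\beta_{k} = \frac{\mathbf{r}_{k+1}^{T}\mathbf{r}_{k+1}}{\mathbf{r}_{k}^{T}\mathbf{r}_{k}}. \qquad (15.1.5.18)$$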
The process is iterated by increasing k until convergence is reached, when
The number of iterations required for an exact solution is equal to the number of unknowns, because the search vector at each step is orthogonal with all the previous steps. However, a very satisfactory solution can normally be reached after very few iterations. This makes the conjugategradient method a very efficient and fast procedure for solving a system of equations. Note that the normal matrix never appears explicitly, although it is implicit in (15.1.5.10) and (15.1.5.16). The inversion of the normal matrix and matrix multiplication is completely avoided. Most of the calculation comes from the formation of the matrixvector products in (15.1.5.10), (15.1.5.14) and (15.1.5.16). These can be expressed as convolutions and can be performed using FFTs, thus saving considerably more time.
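The conjugate-gradient procedure described above can be sketched as a matrix-free routine that needs only the two products J·v and Jᵀ·v. This is a generic illustration of conjugate gradients applied to a least-squares problem, not the SQUASH/DM code: the names `cgls`, `apply_J` and `apply_JT` are hypothetical, and in the test a small dense matrix stands in for the FFT-based products used in practice.

```python
import numpy as np

def cgls(apply_J, apply_JT, eps, n_unknowns, n_iter=50, tol=1e-20):
    """Least-squares solution of J d = -eps without forming or inverting J^T J."""
    d = np.zeros(n_unknowns)           # trial solution: null vector
    r = -apply_JT(eps)                 # initial residual of the normal equations
    p = r.copy()                       # initial search step
    rr = r @ r
    for _ in range(n_iter):
        if rr < tol:                   # converged: residual is negligible
            break
        q = apply_J(p)                 # matrix-vector product J p
        alpha = rr / (q @ q)
        d += alpha * p                 # new shift
        r -= alpha * apply_JT(q)       # new residual, using J^T q
        rr_new = r @ r
        p = r + (rr_new / rr) * p      # next search step
        rr = rr_new
    return d

# Small dense test problem with twice as many equations as unknowns
rng = np.random.default_rng(3)
J = rng.normal(size=(40, 20))
eps = rng.normal(size=40)
d_cg = cgls(lambda v: J @ v, lambda v: J.T @ v, eps, n_unknowns=20)
d_ref = np.linalg.lstsq(J, -eps, rcond=None)[0]
```

Each pass through the loop costs one product with J and one with its transpose, so replacing the two lambdas with FFT-based convolutions leaves the rest of the routine unchanged.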
The solution to $\delta\boldsymbol{\rho}$ at the end of conjugate-gradient iteration is substituted into equation (15.1.5.6) to get a new solution for $\boldsymbol{\rho}$. The solution to the system of nonlinear equations (15.1.5.3) is obtained when the Newton–Raphson iteration has reached convergence.
The equations to be solved for the electrondensity shifts, , are from the Jacobian of equation (15.1.5.2), where is the residual to Sayre's equation, and is the residual to the linear densitymodification equations, Starting from a trial solution of , the initial residual vector is where and Thus, only three FFTs are required to calculate the initial residual. The residual of Sayre's equation is given in equation (15.1.5.23).
The calculation of in equation (15.1.5.14) is achieved in a similar manner using FFTs, where the vector is partitioned as shown above, and
Similarly, vector in equation (15.1.5.16) is obtained from where is defined in equation (15.1.5.26).
The remaining calculations in equations (15.1.5.12), (15.1.5.13), (15.1.5.15), (15.1.5.17) and (15.1.5.18) require either the inner product of a pair of vectors or a linear combination of vectors, both of which are very quick to calculate. Each iteration of the conjugate gradient requires four FFTs, as described in equations (15.1.5.26)–(15.1.5.29).
The fullmatrix solution to equation (15.1.5.4) requires a significant amount of computing, although it can be achieved using FFTs. The diagonal approximation to the normal matrix has been used as an alternative method of solution to the electrondensity shift in equation (15.1.5.4) (Main, 1990b). As with the fullmatrix calculation, it can be done entirely by FFTs and a linear combination of vectors.
The diagonal element of the normal matrix, , in equation (15.1.5.7) is The righthand side of equation (15.1.5.7), , is identical to the residual vector, , which can be calculated from equation (15.1.5.22). Therefore, the solution to the electrondensity shift, , can be calculated from
Compared with the fullmatrix solution, all the calculations involved in between equations (15.1.5.12) and (15.1.5.18) and the subsequent iterations are spared in the diagonal approximation. This makes calculation by the diagonal approximation much faster than by the fullmatrix method.
Statistical densitymodification methods arise from a reinterpretation of the problem of phase improvement in statistical terms, and as a result reduce the problems of bias associated with the classical densitymodification methods described above.
This is achieved by expressing all information about probable electrondensity values in the map in terms of probability distributions, with a probability distribution of electrondensity values assigned to each point in the electrondensity map. The probable electrondensity values at any point in the map may depend, for example, on whether that point lies in the solvent or protein region, and on any NCS relationships with other regions of the map. These distributions are then used to infer corresponding phase probability distributions in reciprocal space, which may then be combined with any existing phase information. This avoids working with a single map representing a single sample from the phase probability distributions.
This formulation in turn weakens the link between the newly introduced information and the initial phase probability distributions, thus reducing the bias introduced in a single cycle of phase improvement. Since the current `best' map is not used as a starting map to be modified, the phase probability distributions from which the `best' map is derived are not directly included in any new phase information. The only way in which the current phases are used is in the classification of the asymmetric unit into regions of different density types, e.g. solvent and protein.
The result of these changes is that statistical density-modification techniques lead to reduced phase bias and more realistic estimates of the figures of merit. The resulting method has been implemented in the RESOLVE software (Terwilliger, 1999). In addition to its application to conventional density-modification problems, it has been particularly effective in removing bias from maps phased from an atomic model through the `prime-and-switch' approach (Terwilliger, 2004). The statistical approach to density modification requires substantially more computation than the simpler classical methods.
To demonstrate the effect of different constraints on phase improvement, various densitymodification techniques were applied to an MIR data set for which the refined structure coordinates are available. The test structure is 5carboxymethyl2hydroxymuconate isomerase, solved by Wigley et al. (1989). MIR phases were available to 3.7 Å, with SIR information to 2.6 Å. Density modification was used to improve and extend phases to the limit of the data at 2.1 Å. The structure includes threefold noncrystallographic symmetry.
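The MIR and density-modified phases are compared by plotting the mean of the cosine of the phase error, weighted by the figure of merit and structure-factor amplitude, as a function of resolution (Zhang et al., 1997),

$$\langle\cos\Delta\varphi\rangle = \frac{\sum_{\mathbf{h}} w(\mathbf{h})\,|F_{\rm obs}(\mathbf{h})|\cos[\varphi(\mathbf{h}) - \varphi_{\rm calc}(\mathbf{h})]}{\sum_{\mathbf{h}} w(\mathbf{h})\,|F_{\rm obs}(\mathbf{h})|},$$

where $\varphi_{\rm calc}(\mathbf{h})$ is the phase calculated from the refined structure. This phase correlation over all reflections is equivalent to map correlation. The results of density modification by various techniques, using the reflection-omit method for phase combination, are shown in Fig. 15.1.7.1.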
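A weighted mean phase cosine of this kind can be computed per resolution shell as sketched below. This is a minimal illustration that assumes weights equal to the product of figure of merit and observed amplitude, shells of equal width in s = 1/d, and synthetic input arrays. The function names are hypothetical.

```python
import numpy as np

def weighted_mean_cos(phi, phi_ref, fom, f_obs):
    """Mean cosine of the phase error, weighted by figure of merit times amplitude."""
    weight = fom * f_obs
    return np.sum(weight * np.cos(phi - phi_ref)) / np.sum(weight)

def by_resolution(s, phi, phi_ref, fom, f_obs, n_bins=5):
    """Weighted mean phase cosine in n_bins shells of equal width in s = 1/d."""
    edges = np.linspace(s.min(), s.max(), n_bins + 1)
    shell = np.clip(np.digitize(s, edges) - 1, 0, n_bins - 1)
    return np.array([weighted_mean_cos(phi[shell == b], phi_ref[shell == b],
                                       fom[shell == b], f_obs[shell == b])
                     for b in range(n_bins)])

# Synthetic reflections: phases in radians, s in 1/angstrom
rng = np.random.default_rng(4)
s = np.linspace(0.1, 0.5, 200)
phi_ref = rng.uniform(0.0, 2.0 * np.pi, size=200)
fom = rng.uniform(0.2, 1.0, size=200)
f_obs = rng.uniform(1.0, 100.0, size=200)

perfect = by_resolution(s, phi_ref, phi_ref, fom, f_obs)
inverted = by_resolution(s, phi_ref + np.pi, phi_ref, fom, f_obs)
```

A value of 1 in a shell means the phases agree exactly with the reference, 0 corresponds to random phases on average, and −1 to phases that are systematically inverted.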
Solvent flattening alone has slightly improved the phases at low resolution but has not led to significant phase extension. The solvent-flattening function in Fig. 15.1.3.1 only has nonzero amplitudes close to the origin. It relates structure factors only in a very thin resolution shell. Therefore, solvent flattening is weak on phase extension.
Histogram matching alone improves the lowresolution phases and gives significant phase extension to higher resolutions. The histogrammatching function in Fig. 15.1.3.1 showed much stronger highresolution amplitudes. Therefore, it could relate structure factors in a larger resolution shell. Moreover, there is always an ideal histogram specified at a given target resolution for phase extension. These two reasons combined make histogram matching a more powerful technique in phase extension than solvent flattening.
The combination of histogram matching and solvent flattening is only slightly more powerful than histogram matching alone, because histogram matching sharpens the protein density and so already implies an element of solvent flattening. Solvent flattening and averaging give a significant improvement at low resolution, but little phase extension. Averaging is powerful for phase refinement, but is weak for phase extension if no special precautions are taken. If there are flexible loop regions on the protein surface, these regions should be excluded from the molecular mask for averaging. The phasing power of averaging weakens at high resolution, where the differences between NCS-related molecules become significant. Solvent flattening, histogram matching and averaging combined give a dramatic improvement at all resolutions. The addition of Sayre's equation gives a slight further improvement at high resolution.
Sayre's equation is very effective for phase refinement and extension at atomic or near-atomic resolution. It becomes ineffective at low resolution or when the initial map is poor. Under these circumstances, it is better to apply other density-modification methods first to refine the phases and extend them to a higher resolution before Sayre's equation is applied. Sayre's equation also decreases in power as the solvent content increases, since it is only applicable to the protein regions of the map.
The fact that the best results were obtained when all the constraints were combined indicates that each constraint contains some degree of independent phasing information. Moreover, it also suggests that the strengths of these constraints are complementary. Each constraint, when applied in isolation, may introduce systematic errors that are difficult to overcome when a different constraint is subsequently applied. This problem is greatly reduced when the constraints are applied simultaneously and the combined process iterates much further towards the desired density map.
Density-modification methods have become sufficiently powerful that it is possible to solve structures from comparatively poor initial maps. This has reduced the amount of effort required to find more heavy-atom derivatives and to collect additional diffraction data sets. Density modification may simplify the process of map interpretation, even when good phase information is available. Density modification can also be used to obtain phases ab initio when high-order noncrystallographic symmetry is present.
Acknowledgements
KYJZ acknowledges the US National Institutes of Health for support (grant GM55663). KDC acknowledges the UK BBSRC for support (grant 87/B03785). Some of the material used in this article is reprinted from Cowtan & Zhang (1999) with permission from Elsevier Science.
References
Abrahams, J. P. (1997). Bias reduction in phase refinement by modified interference functions: introducing the γ correction. Acta Cryst. D53, 371–376.
Abrahams, J. P. & Leslie, A. G. W. (1996). Methods used in the structure determination of bovine mitochondrial F_{1} ATPase. Acta Cryst. D52, 30–42.
Agarwal, R. C. & Isaacs, N. W. (1977). Method for obtaining a high resolution protein map starting from a low resolution map. Proc. Natl Acad. Sci. USA, 74, 2835–2839.
Baker, D., Bystroff, C., Fletterick, R. J. & Agard, D. A. (1993). PRISM: topologically constrained phase refinement for macromolecular crystallography. Acta Cryst. D49, 429–439.
Baker, D., Krukowski, A. E. & Agard, D. A. (1993). Uniqueness and the ab initio phase problem in macromolecular crystallography. Acta Cryst. D49, 186–192.
Bhat, T. N. & Blow, D. M. (1982). A density-modification method for the improvement of poorly resolved protein electron-density maps. Acta Cryst. A38, 21–29.
Blow, D. M. & Rossmann, M. G. (1961). The single isomorphous replacement method. Acta Cryst. 14, 1195–1202.
Bricogne, G. (1974). Geometric sources of redundancy in intensity data and their use for phase determination. Acta Cryst. A30, 395–405.
Bricogne, G. (1976). Methods and programs for direct-space exploitation of geometric redundancies. Acta Cryst. A32, 832–847.
Brünger, A. T. (1992). Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature (London), 355, 472–475.
Brünger, A. T., Kuriyan, J. & Karplus, M. (1987). Crystallographic R factor refinement by molecular dynamics. Science, 235, 458–460.
Bystroff, C., Baker, D., Fletterick, R. J. & Agard, D. A. (1993). PRISM: application to the solution of two protein structures. Acta Cryst. D49, 440–448.
Caliandro, R., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C. & Siliqi, D. (2005). Phasing at resolution higher than the experimental resolution. Acta Cryst. D61, 556–565.
Chapman, M. S., Tsao, J. & Rossmann, M. G. (1992). Ab initio phase determination for spherical viruses: parameter determination for spherical-shell models. Acta Cryst. A48, 301–312.
Cowtan, K. D. (1999). Error estimation and bias correction in phase-improvement calculations. Acta Cryst. D55, 1555–1567.
Cowtan, K. D. & Main, P. (1993). Improvement of macromolecular electrondensity maps by the simultaneous application of real and reciprocal space constraints. Acta Cryst. D49, 148–157.
Cowtan, K. D. & Main, P. (1996). Phase combination and cross validation in iterated density-modification calculations. Acta Cryst. D52, 43–48.
Cowtan, K. D. & Main, P. (1998). Miscellaneous algorithms for density modification. Acta Cryst. D54, 487–493.
Cowtan, K. D. & Zhang, K. Y. J. (1999). Density modification for macromolecular phase improvement. Prog. Biophys. Mol. Biol. 72, 245–270.
Crowther, R. A. & Blow, D. M. (1967). A method of positioning a known molecule in an unknown crystal structure. Acta Cryst. 23, 544–548.
Greer, J. (1985). Computer skeletonization and automatic electron-density map analysis. In Diffraction Methods for Biological Macromolecules, edited by H. W. Wyckoff, C. H. W. Hirs & S. N. Timasheff, Vol. 115, pp. 206–224. Orlando: Academic Press.
Harrison, R. W. (1988). Histogram specification as a method of density modification. J. Appl. Cryst. 21, 949–952.
Hauptman, H. (1986). The direct methods of X-ray crystallography. Science, 233, 178–183.
Hendrickson, W. A., Klippenstein, G. L. & Ward, K. B. (1975). Tertiary structure of myohemerythrin at low resolution. Proc. Natl Acad. Sci. USA, 72, 2160–2164.
Hendrickson, W. A. & Lattman, E. E. (1970). Representation of phase probability distributions for simplified combination of independent phase information. Acta Cryst. B26, 136–143.
Hoppe, W. & Gassmann, J. (1968). Phase correction, a new method to solve partially known structures. Acta Cryst. B24, 97–107.
Karle, J. (1986). Recovering phase information from intensity data. Science, 232, 837–843.
Lamzin, V. S. & Wilson, K. S. (1997). Automated refinement for protein crystallography. Methods Enzymol. 277, 269–305.
Leslie, A. G. W. (1987). A reciprocal-space method for calculating a molecular envelope using the algorithm of B. C. Wang. Acta Cryst. A43, 134–136.
Lunin, V. Yu. (1988). Use of the information on electron density distribution in macromolecules. Acta Cryst. A44, 144–150.
Lunin, V. Yu. & Skovoroda, T. P. (1991). Frequency-restrained structure-factor refinement. I. Histogram simulation. Acta Cryst. A47, 45–52.
Main, P. (1979). A theoretical comparison of the β, γ′ and 2F_{o} − F_{c} syntheses. Acta Cryst. A35, 779–785.
Main, P. (1990a). A formula for electron density histograms for equal-atom structures. Acta Cryst. A46, 507–509.
Main, P. (1990b). The use of Sayre's equation with constraints for the direct determination of phases. Acta Cryst. A46, 372–377.
Main, P. & Rossmann, M. G. (1966). Relationships among structure factors due to identical molecules in different crystallographic environments. Acta Cryst. 21, 67–72.
Matthews, B. W. (1968). Solvent content of protein crystals. J. Mol. Biol. 33, 491–497.
Miller, R., DeTitta, G. T., Jones, R., Langs, D. A., Weeks, C. M. & Hauptman, H. A. (1993). On the application of the minimal principle to solve unknown structures. Science, 259, 1430–1433.
Navaza, J. (1994). AMoRe: an automated package for molecular replacement. Acta Cryst. A50, 157–163.
Perrakis, A., Sixma, T. K., Wilson, K. S. & Lamzin, V. S. (1997). wARP: improvement and extension of crystallographic phases by weighted averaging of multiple-refined dummy atomic models. Acta Cryst. D53, 448–455.
Podjarny, A. D. & Yonath, A. (1977). Use of matrix direct methods for low-resolution phase extension for tRNA. Acta Cryst. A33, 655–661.
Read, R. J. (1986). Improved Fourier coefficients for maps using phases from partial structures with errors. Acta Cryst. A42, 140–149.
Read, R. J. & Schierbeek, A. J. (1988). A phased translation function. J. Appl. Cryst. 21, 490–495.
Reynolds, R. A., Remington, S. J., Weaver, L. H., Fisher, R. G., Anderson, W. F., Ammon, H. L. & Matthews, B. W. (1985). Structure of a serine protease from rat mast cells determined from twinned crystals by isomorphous and molecular replacement. Acta Cryst. B41, 139–147.
Rossmann, M. G. & Blow, D. M. (1962). The detection of subunits within the crystallographic asymmetric unit. Acta Cryst. 15, 24–31.
Rossmann, M. G., McKenna, R., Tong, L., Xia, D., Dai, J.-B., Wu, H., Choi, H.-K. & Lynch, R. E. (1992). Molecular replacement real-space averaging. J. Appl. Cryst. 25, 166–180.
Sayre, D. (1952). The squaring method: a new method for phase determination. Acta Cryst. 5, 60–65.
Sayre, D. (1972). On least-squares refinement of the phases of crystallographic structure factors. Acta Cryst. A28, 210–212.
Sayre, D. (1974). Least-squares phase refinement. II. High-resolution phasing of a small protein. Acta Cryst. A30, 180–184.
Schevitz, R. W., Podjarny, A. D., Zwick, M., Hughes, J. J. & Sigler, P. B. (1981). Improving and extending the phases of medium- and low-resolution macromolecular structure factors by density modification. Acta Cryst. A37, 669–677.
Schuller, D. J. (1996). MAGICSQUASH: more versatile noncrystallographic averaging with multiple constraints. Acta Cryst. D52, 425–434.
Sim, G. A. (1959). The distribution of phase angles for structures containing heavy atoms. II. A modification of the normal heavy-atom method for noncentrosymmetrical structures. Acta Cryst. 12, 813–815.
Swanson, S. M. (1994). Core tracing: depicting connections between features in electron density. Acta Cryst. D50, 695–708.
Terwilliger, T. C. (1999). Reciprocal-space solvent flattening. Acta Cryst. D55, 1863–1871.
Terwilliger, T. C. (2004). Using prime-and-switch phasing to reduce model bias in molecular replacement. Acta Cryst. D60, 2144–2149.
Tsao, J., Chapman, M. S. & Rossmann, M. G. (1992). Ab initio phase determination for viruses with high symmetry: a feasibility study. Acta Cryst. A48, 293–301.
Usón, I., Stevenson, C. E. M., Lawson, D. M. & Sheldrick, G. M. (2007). Structure determination of the O-methyltransferase NovP using the 'free lunch algorithm' as implemented in SHELXE. Acta Cryst. D63, 1069–1074.
Vellieux, F. M. D. (1998). A comparison of two algorithms for electron-density map improvement by introduction of atomicity: skeletonization, and map sorting followed by refinement. Acta Cryst. D54, 81–85.
Vellieux, F. M. D. A. P., Hunt, J. F., Roy, S. & Read, R. J. (1995). DEMON/ANGEL: a suite of programs to carry out density modification. J. Appl. Cryst. 28, 347–351.
Wang, B. C. (1985). Resolution of phase ambiguity in macromolecular crystallography. In Diffraction Methods for Biological Macromolecules, edited by H. W. Wyckoff, C. H. W. Hirs & S. N. Timasheff, Vol. 115, pp. 90–113. Orlando: Academic Press.
Weeks, C. M., DeTitta, G. T., Miller, R. & Hauptman, H. A. (1993). Applications of the minimal principle to peptide structures. Acta Cryst. D49, 179–181.
Wigley, D. B., Roper, D. I. & Cooper, R. A. (1989). Preliminary crystallographic analysis of 5-carboxymethyl-2-hydroxymuconate isomerase from Escherichia coli. J. Mol. Biol. 210, 881–882.
Wilson, A. J. C. (1949). The probability distribution of X-ray intensities. Acta Cryst. 2, 318–321.
Wilson, C. & Agard, D. A. (1993). PRISM: automated crystallographic phase refinement by iterative skeletonization. Acta Cryst. A49, 97–104.
Woolfson, M. M. (1987). Direct methods – from birth to maturity. Acta Cryst. A43, 593–612.
Zhang, K. Y. J. (1993). SQUASH – combining constraints for macromolecular phase refinement and extension. Acta Cryst. D49, 213–222.
Zhang, K. Y. J., Cowtan, K. D. & Main, P. (1997). Combining constraints for electron density modification. In Macromolecular Crystallography, edited by C. W. Carter & R. M. Sweet, Vol. 277, pp. 53–64. New York: Academic Press.
Zhang, K. Y. J. & Main, P. (1988). Histogram matching as a density modification technique for phase refinement and extension of protein molecules. In Improving Protein Phases, edited by S. Bailey, E. Dodson & S. Phillips. Report DL/SCI/R26, pp. 57–64. Warrington: Daresbury Laboratory.
Zhang, K. Y. J. & Main, P. (1990a). Histogram matching as a new density modification technique for phase refinement and extension of protein molecules. Acta Cryst. A46, 41–46.
Zhang, K. Y. J. & Main, P. (1990b). The use of Sayre's equation with solvent flattening and histogram matching for phase extension and refinement of protein structures. Acta Cryst. A46, 377–381.