International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by E. Arnold, D. M. Himmel and M. G. Rossmann. © International Union of Crystallography 2012
International Tables for Crystallography (2012). Vol. F, ch. 14.1, pp. 367–372
https://doi.org/10.1107/97809553602060000844
Chapter 14.1. Heavy-atom location and phase determination with single-wavelength diffraction data
Institute of Molecular Biology, Howard Hughes Medical Institute and Department of Physics, University of Oregon, Eugene, OR 97403, USA
The information from anomalous scattering and that from isomorphous replacement are complementary. Ways in which these two sources of information can be combined to facilitate the determination of the crystal structures of macromolecules are reviewed. The use of a single isomorphous replacement leads to an ambiguous phase determination, but this ambiguity can be resolved by incorporation of data from anomalous scattering. Likewise, information from isomorphous replacement and anomalous scattering can be combined to assist in the location of heavy atoms.
As is well known, the successful introduction of the method of isomorphous replacement by Green et al. (1954) was the turning point in the subsequent development of protein crystallography as we now know it.
The idea that the phases of X-ray reflections from a protein crystal could be obtained by the introduction of heavy atoms into the crystal was not new, having been suggested by J. D. Bernal in 1939 (Bernal, 1939). The isomorphous-replacement method was used as early as 1927 by Cork (1927) in studying the alums. Bokhoven et al. (1951) subsequently extended the method to the study of a noncentrosymmetric projection of strychnine sulfate, using what would now be termed the method of single isomorphous replacement. They also suggested that by using a double isomorphous replacement, a unique phase determination could be obtained, even for noncentrosymmetric reflections. The details of the double (or multiple) isomorphous-replacement method were worked out by Harker (1956), who introduced the very useful concept of phase circles. Another contribution which was of great practical value, and which will provide the basis for much of the subsequent discussion, is the method introduced by Blow & Crick (1959) for the treatment of errors in the isomorphous-replacement method. In addition to the determination of protein phases by the method of substitution with heavy atoms, it is now routine to supplement this information by utilizing the anomalous scattering of the substituted atoms. The underlying principles trace back to articles by Bijvoet (1954), Ramachandran & Raman (1956), and Okaya & Pepinsky (1960). The first application of the anomalous-scattering method to protein crystallography was by Blow (1958), who used the anomalous scattering of the iron atoms to determine phase information for a noncentrosymmetric projection of horse oxyhaemoglobin.
In the following discussion, we first review the classical method of phase determination by isomorphous replacement, then discuss the inclusion of single-wavelength anomalous-scattering data, and conclude by discussing the use of such data for heavy-atom location. Part of the review is based on Matthews (1970).
Consider a protein crystal with an isomorphous heavy-atom derivative, i.e. a modified crystal in which heavy atoms occupy specific sites throughout the crystal, but which is in all other respects identical to the unsubstituted 'parent' crystal. Let the structure factors of the protein crystal be $\mathbf{F}_P$, of the isomorph be $\mathbf{F}_{PH}$, and of the heavy atoms $\mathbf{F}_H$. (Note: Structure amplitudes are indicated by italic type, e.g. $F_P$, and vectors by boldface type, e.g. $\mathbf{F}_P$.) In practice, one can measure the structure amplitudes $F_P$ and $F_{PH}$, and it is desired to obtain from these observable quantities the value of the phase angle of $\mathbf{F}_P$ so that a Fourier synthesis showing the electron density of the protein structure may be calculated. It will be assumed, for the moment, that the positions and occupancy of the sites of heavy-atom binding have been determined as accurately as possible.
From the heavy-atom parameters, the corresponding structure factor $\mathbf{F}_H$ is calculated. To determine ϕ, the phase of $\mathbf{F}_P$, we construct a set of phase circles, as proposed by Harker (1956). From a chosen origin O (Fig. 14.1.2.1a), the vector OA is drawn equal to $-\mathbf{F}_H$. Circles of radius $F_P$ and $F_{PH}$ are then drawn about O and A, respectively. The intersections of the phase circles at B and B′ define two possible phase angles for $\mathbf{F}_P$. Note that the angles are symmetrical about $\mathbf{F}_H$. This ambiguity may in principle be resolved in two ways: (a) by using a second heavy-atom isomorphous derivative or (b) by utilizing the anomalous-scattering effects for the first isomorph.
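The Harker construction for a single derivative amounts to solving the triangle formed by $\mathbf{F}_P$, $\mathbf{F}_H$ and $\mathbf{F}_{PH}$ with the cosine rule. The following minimal Python sketch illustrates the resulting twofold ambiguity. It assumes the vector convention $\mathbf{F}_{PH} = \mathbf{F}_P + \mathbf{F}_H$ and error-free amplitudes; the function name `sir_phases` and all numerical values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def sir_phases(f_p, f_ph, f_h):
    """Return the two candidate protein phases (radians) for one derivative.

    f_p, f_ph : measured amplitudes of parent and derivative
    f_h       : complex heavy-atom structure factor calculated from the sites
    Assumes F_PH = F_P + F_H as vectors and error-free amplitudes.
    """
    amp_h = abs(f_h)
    alpha_h = cmath.phase(f_h)
    # Cosine rule: F_PH^2 = F_P^2 + F_H^2 + 2 F_P F_H cos(phi - alpha_H)
    cos_term = (f_ph**2 - f_p**2 - amp_h**2) / (2.0 * f_p * amp_h)
    cos_term = max(-1.0, min(1.0, cos_term))  # guard against rounding
    delta = math.acos(cos_term)
    # The two solutions lie symmetrically about the heavy-atom phase.
    return alpha_h + delta, alpha_h - delta


# Synthetic, illustrative reflection (not experimental data).
true_phase = math.radians(70.0)
F_P = cmath.rect(100.0, true_phase)
F_H = cmath.rect(20.0, math.radians(20.0))
F_PH = F_P + F_H

phi_1, phi_2 = sir_phases(abs(F_P), abs(F_PH), F_H)
print(math.degrees(phi_1), math.degrees(phi_2))
```

For this synthetic reflection, one of the two returned phases reproduces the true phase and the other is its mirror image about the heavy-atom phase; amplitudes alone cannot distinguish them.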
The phase information provided by a second isomorph is illustrated in Fig. 14.1.3.1(a). In theory, the three phase circles will intersect at a point and the phase ambiguity will be resolved. In practice, there will be errors in the observed amplitudes $F_P$ and $F_{PH}$ and in the heavy-atom parameters (and thus in $\mathbf{F}_H$). Also, the isomorphism may be imperfect. As a result, the intersections of the three phase circles may not coincide. Another complication arises from the fact that for reflections where $F_H$ is small, the circles will be essentially concentric and will not have well defined points of intersection. In other words, the phase determination will become indeterminate. The method of Blow & Crick (1959) was introduced as a way to take all these factors into account. It has had an extraordinary impact, not only as a practical method for protein phase determination, but also in influencing all subsequent thinking in this area.
Blow & Crick pointed out that in practice the phase angle ϕ can never be determined with complete certainty. Rather, there is a finite probability that any arbitrary phase angle may be the correct one. Consider the vector diagram shown in Fig. 14.1.4.1, in which $\mathbf{F}_H$ is known and we wish to determine the probability that the arbitrary phase angle ϕ is the correct phase of $\mathbf{F}_P$. Strictly, one should allow for the possibility of errors in $F_P$ and $F_{PH}$, and should consider the probability that the vector $\mathbf{F}_P$ occupies all possible positions in the Argand diagram. However, Blow & Crick suggested that the analysis might be considerably simplified by assuming that $F_P$ and $\mathbf{F}_H$ are known accurately and that all the error lies in the observation of $F_{PH}$. In other words, it was assumed that the vector $\mathbf{F}_P$ must lie on the circle of radius $F_P$, and the probability distribution of $\mathbf{F}_P$ could be evaluated as a function of ϕ only.
For an arbitrary phase angle ϕ, the phase triangle (Fig. 14.1.4.1) will not close exactly. If we define $\mathbf{D}(\phi)$ to be the vector sum of $\mathbf{F}_H$ and $\mathbf{F}_P$, with amplitude $D(\phi)$, then the lack of closure of the phase triangle is given by

$$\varepsilon(\phi) = F_{PH} - D(\phi). \tag{14.1.4.1}$$

Following Blow & Crick, if E is the r.m.s. error associated with the measurements, and the distribution of error is assumed to be Gaussian, then the probability P(ϕ) of the phase ϕ being the true phase is

$$P(\phi) = N \exp\left[-\varepsilon^{2}(\phi)/2E^{2}\right], \tag{14.1.4.2}$$

where N is a normalizing factor such that the sum of all probabilities is unity. The unnormalized probability distribution corresponding to Fig. 14.1.4.1 (and Fig. 14.1.2.1a) is shown in Fig. 14.1.2.1(b). The two most probable phase angles (labelled here $\phi_1$ and $\phi_2$) are the alternative phases of $\mathbf{F}_P$ for which the phase triangle is closed.
Individual probability distributions for the additional heavy-atom derivatives are derived in an analogous manner and may be multiplied together to give an overall probability distribution. The joint probability distribution corresponding to Fig. 14.1.3.1(a) is shown in Fig. 14.1.3.1(b), and in this case the most probable phase is that which simultaneously fits best the observed data for the two isomorphous derivatives.
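The lack-of-closure treatment and the multiplication of per-derivative distributions can be sketched numerically. The example below is a minimal illustration, assuming a Gaussian error model of the form $\exp[-\varepsilon^2(\phi)/2E^2]$ with $\varepsilon(\phi) = F_{PH} - |F_P\exp(i\phi) + \mathbf{F}_H|$; the function name `blow_crick_probability`, the two derivatives and the value of E are hypothetical and synthetic.

```python
import numpy as np


def blow_crick_probability(phi, f_p, f_ph, f_h, rms_error):
    """Unnormalised phase probability on a grid of trial phases.

    phi       : array of trial protein phases (radians)
    f_p, f_ph : measured parent and derivative amplitudes
    f_h       : complex calculated heavy-atom structure factor
    rms_error : E, the assumed r.m.s. lack-of-closure error
    """
    d = np.abs(f_p * np.exp(1j * phi) + f_h)   # |F_P exp(i phi) + F_H|
    eps = f_ph - d                             # lack of closure
    return np.exp(-eps**2 / (2.0 * rms_error**2))


phi_grid = np.radians(np.arange(360.0))        # 1 degree steps
true_fp = 100.0 * np.exp(1j * np.radians(70.0))

# Two hypothetical derivatives with different heavy-atom vectors.
f_h1 = 20.0 * np.exp(1j * np.radians(20.0))
f_h2 = 25.0 * np.exp(1j * np.radians(150.0))

p1 = blow_crick_probability(phi_grid, 100.0, abs(true_fp + f_h1), f_h1, 5.0)
p2 = blow_crick_probability(phi_grid, 100.0, abs(true_fp + f_h2), f_h2, 5.0)

joint = p1 * p2
joint /= joint.sum()                           # normalise so the sum is unity
best_index = int(np.argmax(joint))             # index equals phase in degrees
```

Each single-derivative distribution is bimodal, whereas the product retains only the peak common to both derivatives.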
The main objection which may be made to the Blow & Crick treatment is that it assumes that there is no error in $F_P$. In practice, however, this is not a serious limitation.
A protein crystallographer desires to obtain a Fourier synthesis that can most readily be interpreted in terms of an atomic model of the structure. One synthesis which could be calculated is the 'most probable Fourier', obtained by choosing the value of ϕ for each reflection which corresponds to the highest value of P(ϕ). Blow & Crick pointed out that although this Fourier is the most likely to be correct, it has certain disadvantages. In the first place, it might tend to give too much weight to uncertain or unreliable phases, and, in the second place, for cases where P(ϕ) is bimodal, there is a strong chance of making a large error in the phase angle. Blow & Crick suggested that in cases such as this, a compromise is needed, and that the centroid of the phase probability distribution provides just the required compromise. They showed that the corresponding synthesis is the 'best Fourier', which is defined to be that Fourier transform which is expected to have the minimum mean-square difference from the Fourier transform of the true F's when averaged over the whole unit cell.
The centroid of the phase probability distribution may be defined as a point on the phase diagram with polar coordinates $(mF_P, \phi_B)$, where $\phi_B$ is the 'best' phase angle. The quantity m, which acts as a weighting factor for $F_P$, is called the 'figure of merit' of the phase determination. Its magnitude, between 0 and 1, is a measure of the reliability of the phase determination.
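The centroid can be evaluated numerically as $m\exp(i\phi_B) = \sum P(\phi)\exp(i\phi) / \sum P(\phi)$. The sketch below is an illustration under stated assumptions: the probability curves are simple Gaussian-shaped peaks standing in for real phase distributions, and the helper names `best_phase_and_fom` and `gaussian_peak` are hypothetical.

```python
import numpy as np


def best_phase_and_fom(phi, prob):
    """Return the centroid ('best') phase in radians and the figure of merit m."""
    centroid = np.sum(prob * np.exp(1j * phi)) / np.sum(prob)
    return np.angle(centroid), np.abs(centroid)


def gaussian_peak(phi, centre, sigma):
    """Gaussian-shaped peak on the phase circle (stand-in for a real P(phi))."""
    diff = np.angle(np.exp(1j * (phi - centre)))   # wrap difference to (-pi, pi]
    return np.exp(-diff**2 / (2.0 * sigma**2))


phi_grid = np.radians(np.arange(360.0))
sigma = np.radians(2.0)

# One sharp peak: a well determined phase.
unimodal = gaussian_peak(phi_grid, np.radians(70.0), sigma)
# Two equal sharp peaks: the single-derivative ambiguity.
bimodal = unimodal + gaussian_peak(phi_grid, np.radians(-30.0), sigma)

phase_uni, m_uni = best_phase_and_fom(phi_grid, unimodal)
phase_bi, m_bi = best_phase_and_fom(phi_grid, bimodal)
print(np.degrees(phase_uni), m_uni)
print(np.degrees(phase_bi), m_bi)
```

For the sharp single peak, m is close to 1 and the best phase coincides with the peak. For the bimodal case, the best phase falls midway between the two peaks and m drops to roughly the cosine of half their separation, which is how the figure of merit down-weights ambiguous reflections.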
All atoms, particularly those used in preparing heavy-atom isomorphs, give rise to anomalous scattering, especially if the energy of the scattered X-rays is close to an absorption edge. The atomic scattering factor of the atom in question can be expressed as

$$f = f_0 + \Delta f' + if'',$$

where $f_0$ is the normal scattering factor far from an absorption edge, and Δf ′ and f ″ are the correction terms which arise due to dispersion effects. The quantity Δf ′, in phase with $f_0$, is usually negative, and f ″, the imaginary part, is always π/2 ahead of the phase of the real part $(f_0 + \Delta f')$. It may be noted that by using different wavelengths, the term Δf ′ is equivalent to a change in scattering power of the heavy atom and produces intensity differences similar to a normal isomorphous replacement, except that in this case the isomorphism is exact (Ramaseshan, 1964). This is the basis of the multiwavelength-anomalous-dispersion (MAD) method (Hendrickson, 1991) discussed in Chapter 14.2. Here we focus on measurements based on a single wavelength, traditionally referred to as the 'anomalous-scattering method'.
The anomalous scattering of a heavy atom is always considerably less than the normal scattering (for Cu Kα radiation, $2f''/f'$, with $f' = f_0 + \Delta f'$, ranges from about 0.24 to 0.36), but there are several factors which tend to offset this reduction in magnitude (e.g. see Blow, 1958; North, 1965).
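Suppose that two isomorphous crystals are differentiated by N heavy atoms of position $\mathbf{r}_j$ and scattering factor $f_j = f'_j + if''_j$. Then, for the reflection hkl (reciprocal-lattice vector $\mathbf{h}$), the calculated structure factor of the N atoms is

$$\sum_{j=1}^{N} f'_j \exp(2\pi i\,\mathbf{h}\cdot\mathbf{r}_j) + i\sum_{j=1}^{N} f''_j \exp(2\pi i\,\mathbf{h}\cdot\mathbf{r}_j) = \mathbf{F}_H + \mathbf{F}''_H.$$

If the heavy atoms are all of the same type, i.e. they all have the same ratio of $f''_j/f'_j$, then $\mathbf{F}_H$ and $\mathbf{F}''_H$ are orthogonal, and $F''_H = (f''/f')F_H$.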
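The orthogonality of the normal and anomalous heavy-atom contributions for a single atom type can be checked directly. In the sketch below, the site coordinates and the scattering-factor values are arbitrary illustrative numbers (not tabulated values for any element), the real and imaginary parts of the scattering factor are passed as `f_prime` and `f_dprime`, and the function name is hypothetical.

```python
import numpy as np


def heavy_atom_structure_factors(hkl, positions, f_prime, f_dprime):
    """Return (F_H, F_H'') for N atoms of one type at fractional positions."""
    phase_terms = np.exp(2j * np.pi * (positions @ hkl))
    f_h = f_prime * phase_terms.sum()          # normal (real f') part
    f_h_dd = 1j * f_dprime * phase_terms.sum() # anomalous (i f'') part
    return f_h, f_h_dd


# Arbitrary fractional coordinates for three sites (illustrative only).
positions = np.array([[0.10, 0.25, 0.40],
                      [0.35, 0.60, 0.15],
                      [0.80, 0.05, 0.70]])
hkl = np.array([2, 1, 3])
f_prime, f_dprime = 75.0, 8.0                  # assumed values, same for all atoms

F_H, F_H_dd = heavy_atom_structure_factors(hkl, positions, f_prime, f_dprime)

# Dot product of the two vectors in the Argand diagram; zero means orthogonal.
dot = (F_H * np.conj(F_H_dd)).real
ratio = abs(F_H_dd) / abs(F_H)                 # should equal f''/f'
```

If the sites were occupied by atoms with different ratios of f ″ to f ′, the two sums would no longer share a common phase factor and the orthogonality would be lost.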
The relation between the structure factors of the reflection hkl and its Friedel mate $\bar{h}\bar{k}\bar{l}$ is illustrated in Fig. 14.1.7.1(a). The situation can be conveniently represented (Fig. 14.1.7.1b) by reflecting the $\bar{h}\bar{k}\bar{l}$ diagram through the real axis onto the hkl diagram. In cases such as this, where Friedel's law breaks down, we shall refer to the difference $(F_{PH}^{+} - F_{PH}^{-})$ as the Bijvoet difference, or simply the anomalous-scattering difference. The Harker phase circles corresponding to Fig. 14.1.7.1(b) are shown in Fig. 14.1.7.2. It will be seen that single isomorphous replacement alone, and similarly the anomalous-scattering data alone, each give an ambiguous phase determination. When the two are combined, in the absence of error, the three phase circles (Fig. 14.1.7.2) will meet at a point, resolving the phase ambiguity and giving a unique solution for the phase of $\mathbf{F}_P$. The isomorphous-replacement method gives phase information symmetrical about the vector $\mathbf{F}_H$, whereas the anomalous-scattering phase information for $\mathbf{F}_{PH}$ is symmetrical about $\mathbf{F}''_H$, which, for heavy atoms of the same type, is at right angles to $\mathbf{F}_H$. In other words, the two methods complement each other, one method providing exactly that information which is not given by the other.

Figure 14.1.7.1. (a) Vector diagrams illustrating anomalous scattering for the reflections hkl and $\bar{h}\bar{k}\bar{l}$. (b) Combined vector diagram for reflections hkl and $\bar{h}\bar{k}\bar{l}$.

Figure 14.1.7.2. Harker construction for a single isomorphous replacement with anomalous scattering, in the absence of errors.
On average, the experimentally measured isomorphous-replacement difference, $(F_{PH} - F_P)$, will be larger than the anomalous-scattering difference, $(F_{PH}^{+} - F_{PH}^{-})$. The former, however, relies on measurements from different crystals and is also susceptible to errors due to non-isomorphism between the parent and derivative crystals. The latter can be obtained from measurements on the same crystal, under closely similar experimental conditions, and is not affected by non-isomorphism. Therefore, it is desirable to use methods that take into account the different sources of error in the respective measurements (Blow & Rossmann, 1961; North, 1965; Matthews, 1966b). One method is as follows.
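From Fig. 14.1.8.1, it can be seen that the most probable phase angle will be the one for which $(F_{PH}^{+} - F_{PH}^{-})_{\rm obs} = (F_{PH}^{+} - F_{PH}^{-})_{\rm calc}$. At any other phase angle, there will be an 'anomalous-scattering lack of closure' which we define to be $\varepsilon'(\phi) = (F_{PH}^{+} - F_{PH}^{-})_{\rm obs} - (F_{PH}^{+} - F_{PH}^{-})_{\rm calc}$. The value of $(F_{PH}^{+} - F_{PH}^{-})_{\rm calc}$ can readily be calculated as a function of ϕ (Matthews, 1966b; Hendrickson, 1979). Thus, if the r.m.s. error in $(F_{PH}^{+} - F_{PH}^{-})$ is $E'$, and the distribution of error is assumed to be Gaussian, then from measurements of anomalous scattering, the probability $P_{\rm ano}(\phi)$ of phase ϕ being the true phase of $\mathbf{F}_P$ can be estimated using an equation exactly analogous to equation (14.1.4.2), i.e. a Gaussian in $\varepsilon'(\phi)$ with r.m.s. error $E'$.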
An example of an anomalous-scattering phase probability distribution is shown by the dotted curve in Fig. 14.1.8.2. The asymmetry of the distribution arises from the fact that the anomalous-scattering distribution $P_{\rm ano}(\phi)$ is the phase probability distribution for $\mathbf{F}_P$ rather than that of $\mathbf{F}_{PH}$, which would be symmetrical about the phase of $\mathbf{F}''_H$. The overall probability distribution obtained by combining the anomalous-scattering data with the previous isomorphous-replacement data (Fig. 14.1.2.1b) is given by the product

$$P(\phi) = P_{\rm iso}(\phi)\,P_{\rm ano}(\phi)$$

and is illustrated in Fig. 14.1.8.2.
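The way the anomalous-scattering term removes the single-derivative ambiguity can be sketched as follows. This is a minimal illustration rather than a production phasing routine. It assumes heavy atoms of one type with an assumed ratio f ″/f ′ of 0.15, a protein that does not itself scatter anomalously, Gaussian errors for both terms, and synthetic error-free 'observations'; the function names and the r.m.s. errors (5.0 and 1.0) are hypothetical.

```python
import numpy as np


def iso_probability(phi, f_p, f_ph, f_h, e_iso):
    """Isomorphous-replacement term: lack of closure on the amplitude F_PH."""
    eps = f_ph - np.abs(f_p * np.exp(1j * phi) + f_h)
    return np.exp(-eps**2 / (2.0 * e_iso**2))


def ano_probability(phi, f_p, bijvoet_obs, f_h, f_h_dd, e_ano):
    """Anomalous term: lack of closure on the Bijvoet difference."""
    trial = f_p * np.exp(1j * phi) + f_h
    # F_PH(+) = |F_P + F_H + F_H''|, F_PH(-) = |F_P + F_H - F_H''|
    bijvoet_calc = np.abs(trial + f_h_dd) - np.abs(trial - f_h_dd)
    eps = bijvoet_obs - bijvoet_calc
    return np.exp(-eps**2 / (2.0 * e_ano**2))


phi_grid = np.radians(np.arange(360.0))       # 1 degree steps
true_fp = 100.0 * np.exp(1j * np.radians(70.0))
f_h = 20.0 * np.exp(1j * np.radians(20.0))
f_h_dd = 0.15j * f_h                          # assumed f''/f' = 0.15, one atom type

f_plus = abs(true_fp + f_h + f_h_dd)          # synthetic "observed" F_PH(+)
f_minus = abs(true_fp + f_h - f_h_dd)         # synthetic "observed" F_PH(-)

p_iso = iso_probability(phi_grid, 100.0, 0.5 * (f_plus + f_minus), f_h, 5.0)
p_ano = ano_probability(phi_grid, 100.0, f_plus - f_minus, f_h, f_h_dd, 1.0)
combined = p_iso * p_ano                      # unnormalised product
best_index = int(np.argmax(combined))         # index equals phase in degrees
```

The isomorphous term alone is equally high at both Harker solutions; multiplying by the anomalous term suppresses the false one because the calculated Bijvoet difference there has the wrong sign.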
The treatment outlined above of phase determination by anomalous scattering assumed that data were available for a parent crystal devoid of anomalous scatterers and an anomalously scattering isomorphous heavy-atom derivative. It is not uncommon that the native protein itself contains atoms which scatter anomalously or has been engineered to contain such scatterers. In such cases, measurements will usually be made at multiple wavelengths in order to exploit MAD phasing (Hendrickson, 1991). If, however, measurements are available only at a single wavelength, they can be utilized to obtain some phase information (e.g. Matthews, 1970).
During the development of protein crystallography, it was understood that heavy-atom sites might be located from difference Patterson functions, but there was substantial debate as to the type of function that was preferable (Perutz, 1956).
Blow (1958), and also Rossmann (1960), advocated a Patterson function with amplitudes $(F_{PH} - F_P)^2$. It relies on the admittedly crude assumption that the desired scattering amplitude of the heavy atoms, $F_H$, can be approximated by

$$F_H \simeq |F_{PH} - F_P|. \tag{14.1.10.1}$$

The approximation does have one very helpful characteristic, namely, that it tends to be most accurate when $|F_{PH} - F_P|$ is large, i.e. when $\mathbf{F}_H$ is parallel or antiparallel to $\mathbf{F}_P$ (cf. Fig. 14.1.4.1). Thus, the numerically largest coefficients in the Patterson function tend to represent $F_H^2$ correctly. Given a well behaved isomorphous heavy-atom derivative, and accurately measured data, experience has shown that a map with coefficients $(F_{PH} - F_P)^2$ can give an excellent representation of the desired heavy-atom–heavy-atom vector peaks.
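The statement that the isomorphous difference approximates the heavy-atom amplitude best when the difference is large can be illustrated with a small numerical experiment. The sketch below assumes fixed, hypothetical amplitudes for the protein and heavy-atom vectors and sweeps the angle between them uniformly; it is not based on experimental data.

```python
import numpy as np

f_p_amp, f_h_amp = 100.0, 20.0                 # hypothetical amplitudes
angle = np.radians(np.arange(360.0))           # angle between F_H and F_P

# Derivative amplitude for each relative orientation of F_H and F_P.
f_ph_amp = np.abs(f_p_amp + f_h_amp * np.exp(1j * angle))
iso_diff = np.abs(f_ph_amp - f_p_amp)          # |F_PH - F_P|
ratio = iso_diff / f_h_amp                     # 1.0 means F_H is recovered exactly

order = np.argsort(iso_diff)
top = order[-36:]                              # the largest 10% of differences
mean_all = ratio.mean()
mean_top = ratio[top].mean()
print(mean_all, mean_top)
```

Averaged over all orientations, the difference recovers only about two thirds of the heavy-atom amplitude, whereas for the largest differences the recovery is close to complete. Squaring the coefficients for the Patterson function weights the map further towards these reliable terms.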
A relation exactly analogous to equation (14.1.10.1) can be used to approximate the anomalous heavy-atom scattering amplitude, namely, $F''_H \simeq \tfrac{1}{2}|F_{PH}^{+} - F_{PH}^{-}|$ (see Fig. 14.1.7.1b). As noted above, if all the heavy atoms are the same, $F''_H$ is proportional to $F_H$. Thus, a Patterson function with coefficients $(F_{PH}^{+} - F_{PH}^{-})^2$ should also show the desired heavy-atom–heavy-atom vector peaks (Blow, 1957; Rossmann, 1961).
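For each individual reflection, however, and as is also the case for phase determination, the information that is provided by the isomorphous-replacement difference $(F_{PH} - F_P)$ is exactly complementary to that provided by the anomalous-scattering measurement $(F_{PH}^{+} - F_{PH}^{-})$. By combining both sets of experimental measurements, it is possible to obtain a much better estimate of the heavy-atom scattering, $F_H$, for every reflection (Kartha & Parthasarathy, 1965a,b; Matthews, 1966a; Singh & Ramaseshan, 1966). One formulation (Matthews, 1966a) can be written as

$$F_H^2 = F_P^2 + F_{PH}^2 - 2F_PF_{PH}\left\{1 - \left[\frac{wk(F_{PH}^{+} - F_{PH}^{-})}{2F_P}\right]^2\right\}^{1/2},$$

where $k = f'/f''$ (the ratio of the real to the imaginary part of the heavy-atom scattering factor), $F_{PH}$ is the mean of $F_{PH}^{+}$ and $F_{PH}^{-}$, and w is a weighting factor (from 0 to 1) that is an estimate of the relative reliability of the measurements of $(F_{PH}^{+} - F_{PH}^{-})$ compared with $(F_{PH} - F_P)$.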
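A combined estimate of this kind follows from applying the cosine rule to the phase triangle, with the sine of the angle between $\mathbf{F}_P$ and $\mathbf{F}_{PH}$ estimated from the Bijvoet difference. The sketch below implements one such expression under stated assumptions: all heavy atoms are of one type, $k = f'/f''$, the angle between $\mathbf{F}_P$ and $\mathbf{F}_{PH}$ is acute, and w scales the anomalous term. The function name and the synthetic data are hypothetical, and the expression is a reconstruction rather than a verbatim transcription of any published formula.

```python
import cmath
import math


def combined_heavy_atom_amplitude(f_p, f_ph_plus, f_ph_minus, k, w=1.0):
    """Estimate F_H from isomorphous and anomalous differences together.

    f_p                   : parent amplitude
    f_ph_plus, f_ph_minus : derivative amplitudes for hkl and its Friedel mate
    k                     : assumed ratio f'/f'' for the heavy atom
    w                     : weight (0 to 1) placed on the anomalous measurement
    """
    f_ph = 0.5 * (f_ph_plus + f_ph_minus)
    sin_term = w * k * (f_ph_plus - f_ph_minus) / (2.0 * f_p)
    cos_term = math.sqrt(max(0.0, 1.0 - sin_term**2))   # clamp against noise
    f_h_sq = f_p**2 + f_ph**2 - 2.0 * f_p * f_ph * cos_term
    return math.sqrt(max(0.0, f_h_sq))


# Synthetic reflection: F_H is deliberately far from parallel to F_P.
F_P = cmath.rect(100.0, math.radians(70.0))
F_H = cmath.rect(20.0, math.radians(20.0))
F_H_dd = 0.15j * F_H                                    # assumed f''/f' = 0.15

f_plus = abs(F_P + F_H + F_H_dd)
f_minus = abs(F_P + F_H - F_H_dd)

iso_only = abs(0.5 * (f_plus + f_minus) - abs(F_P))     # |F_PH - F_P|
combined = combined_heavy_atom_amplitude(abs(F_P), f_plus, f_minus, k=1.0 / 0.15)
print(iso_only, combined)
```

For this reflection the isomorphous difference alone underestimates the heavy-atom amplitude substantially, while the combined estimate recovers it closely. Setting `w=0` reduces the expression to the isomorphous difference.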
The discussion above has focused on the use of difference Patterson functions to locate heavy-atom sites. Once one or more putative sites have been located, they can be used to calculate approximate protein phases, which, in turn, can be used to calculate difference Fourier series with coefficients in the form

$$m(F_{PH} - F_P)\exp(i\phi_B),$$

where m is the figure of merit and $\phi_B$ is the 'best', albeit approximate, phase of the protein structure factor. Putting aside errors due to inaccuracies in $\phi_B$, such maps do not give the true heavy-atom vector, $\mathbf{F}_H$. Rather, they give, essentially, the projection of $\mathbf{F}_H$ along $\mathbf{F}_P$ (cf. Fig. 14.1.4.1). Nevertheless, subject to certain limitations, such difference maps are extraordinarily powerful in locating secondary sites in a given heavy-atom derivative, or in using approximate phases from one derivative to search for heavy-atom sites in other putative derivatives. It is in this context, however, that certain limitations of the single-isomorphous-replacement (SIR) method have to be kept in mind. These are noted in the next section.
Although phase determination from a single heavy-atom derivative in the absence of anomalous-scattering data is, in principle, ambiguous, it was realized early on that useful phase information can still be obtained (Blow & Rossmann, 1961). As shown in Fig. 14.1.2.1(a), the two possible phases for the protein are $\phi_1$ or $\phi_2$. In terms of the analysis of Blow & Crick (1959), the 'best' phase to use for the protein is the average of $\phi_1$ and $\phi_2$. This is also equivalent to using both $\phi_1$ and $\phi_2$. With this in mind, a situation that is of special concern is one in which the heavy-atom distribution used to determine the phases happens to have a centre of symmetry. One common way in which this can occur is when one has a heavy-atom derivative with a single site in space group $P2_1$. A related situation occurs when there are multiple sites in space group $P2_1$, but all have the same y coordinate. If the origin of coordinates is considered to be at the site of centrosymmetry, then all of the heavy-atom vectors (Fig. 14.1.2.1a) will necessarily have phases of 0 or π. If such phases are used, for example, to try to identify heavy-atom-binding sites in a second derivative, the map will show the correct sites, but will also show spurious peaks of equal height related by the centre of symmetry. Faced with this choice, one must arbitrarily choose one of the alternative peaks which, in turn, will define an overall handedness for the heavy-atom arrangement. In the absence of any anomalous-scattering data, one can proceed with the structure determination in the standard way, but it must be kept in mind that either the correct electron-density map or its mirror image will ultimately be obtained.
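The centrosymmetry of a single-site heavy-atom arrangement can be verified numerically. The sketch below assumes a monoclinic space group with a twofold screw axis along b, with equivalent positions (x, y, z) and (−x, y + ½, −z), so that the two symmetry-related atoms have a centre of symmetry at (0, y + ¼, 0). The site coordinates are arbitrary, a unit scattering factor is used, and the function name is hypothetical.

```python
import numpy as np


def screw_pair_structure_factor(hkl, site):
    """Structure factor of one site and its 2(1) screw mate, origin at their centre."""
    x, y, z = site
    mates = np.array([[x, y, z],
                      [-x, y + 0.5, -z]])      # assumed equivalent positions
    centre = np.array([0.0, y + 0.25, 0.0])    # midpoint of the two atoms
    shifted = mates - centre                   # move origin to the centre of symmetry
    return np.exp(2j * np.pi * (shifted @ hkl)).sum()


site = (0.13, 0.42, 0.31)                      # arbitrary illustrative coordinates
reflections = [np.array([h, k, l])
               for h in range(-3, 4)
               for k in range(0, 4)
               for l in range(-3, 4)]
values = np.array([screw_pair_structure_factor(hkl, site) for hkl in reflections])

# With the origin at the centre of symmetry every F_H is real (phase 0 or pi).
max_imag = np.max(np.abs(values.imag))
print(max_imag)
```

Because every heavy-atom structure factor is real with this choice of origin, the single-derivative phases cannot distinguish a structure from its inverse, which is the origin of the spurious, symmetry-related peaks in the difference map.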
An alternative approach is to include anomalous-scattering data in the initial phase determination, i.e. to use single isomorphous replacement with anomalous scattering (SIRAS). It must be remembered that in calculating phases from anomalous-scattering data, it is first necessary to determine the coordinates of the heavy atoms in their absolute configuration. If the wrong hand is used in the SIRAS method (illustrated in Fig. 14.1.13.1), the resultant electron-density map will generally bear no relation to the correct electron density.
The recommended procedure, therefore, is as follows. One arbitrarily chooses one possible heavy-atom arrangement for heavy-atom derivative 1, calculates SIRAS phases and calculates a difference electron-density map for derivative 2. The handedness of the derivative 1 coordinates is then inverted and the overall calculation repeated. The calculation based on the correct heavy-atom arrangement should show peaks at the heavy-atom sites of the second derivative. The calculation based on the incorrect arrangement shows noise (Matthews, 1966a). This procedure determines the absolute configuration of the heavy-atom arrangement and, at the same time, shows the derived sites for the second and subsequent derivatives.
Acknowledgements
This work was supported in part by NIH grant GM21967.
References
Bernal, J. D. (1939). Structure of proteins. Nature (London), 143, 663–667.
Bijvoet, J. M. (1954). Structure of optically active compounds in the solid state. Nature (London), 173, 888–891.
Blow, D. M. (1957). X-ray Analysis of Haemoglobin: Determination of Phase Angles by Isomorphous Substitution. PhD thesis, University of Cambridge.
Blow, D. M. (1958). The structure of haemoglobin. VII. Determination of phase angles in the noncentrosymmetric [100] zone. Proc. R. Soc. London Ser. A, 247, 302–336.
Blow, D. M. & Crick, F. H. C. (1959). The treatment of errors in the isomorphous replacement method. Acta Cryst. 12, 794–802.
Blow, D. M. & Rossmann, M. G. (1961). The single isomorphous replacement method. Acta Cryst. 14, 1195–1202.
Bokhoven, C., Schoone, J. C. & Bijvoet, J. M. (1951). The Fourier synthesis of the crystal structure of strychnine sulphate pentahydrate. Acta Cryst. 4, 275–280.
Cork, J. M. (1927). The crystal structure of some of the alums. Philos. Mag. 4, 688–698.
Green, D. W., Ingram, V. M. & Perutz, M. F. (1954). The structure of haemoglobin. IV. Sign determination by the isomorphous replacement method. Proc. R. Soc. London Ser. A, 225, 287–307.
Harker, D. (1956). The determination of the phases of the structure factors of noncentrosymmetric crystals by the method of double isomorphous replacement. Acta Cryst. 9, 1–9.
Hendrickson, W. A. (1979). Phase information from anomalous-scattering measurements. Acta Cryst. A35, 245–247.
Hendrickson, W. A. (1991). Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science, 254, 51–58.
Kartha, G. & Parthasarathy, R. (1965a). Combination of multiple isomorphous replacement and anomalous dispersion data for protein structure determination. I. Determination of heavy-atom positions in protein derivatives. Acta Cryst. 18, 745–749.
Kartha, G. & Parthasarathy, R. (1965b). Combination of multiple isomorphous replacement and anomalous dispersion data for protein structure determination. II. Correlation of the heavy-atom positions in different isomorphous protein crystals. Acta Cryst. 18, 749–753.
Matthews, B. W. (1966a). The determination of the position of the anomalously scattering heavy atom groups in protein crystals. Acta Cryst. 20, 230–239.
Matthews, B. W. (1966b). The extension of the isomorphous replacement method to include anomalous scattering measurements. Acta Cryst. 20, 82–86.
Matthews, B. W. (1970). Determination and refinement of phases for proteins. In Crystallographic Computing, edited by F. R. Ahmed, S. R. Hall & C. P. Huber, pp. 146–159. Copenhagen: Munksgaard.
North, A. C. T. (1965). The combination of isomorphous replacement and anomalous scattering data in phase determination of noncentrosymmetric reflexions. Acta Cryst. 18, 212–216.
Okaya, Y. & Pepinsky, R. (1960). New developments in the anomalous dispersion method for structure analysis. In Computing Methods and the Phase Problem in X-ray Crystal Analysis, pp. 273–299. London: Pergamon Press.
Perutz, M. F. (1956). Isomorphous replacement and phase determination in noncentrosymmetric space groups. Acta Cryst. 9, 867–873.
Ramachandran, G. N. & Raman, S. (1956). A new method for the structure analysis of noncentrosymmetric crystals. Curr. Sci. 25, 348–351.
Ramaseshan, S. (1964). The use of anomalous scattering in crystal structure analysis. In Advanced Methods of Crystallography, edited by G. N. Ramachandran, pp. 67–95. London: Academic Press.
Rossmann, M. G. (1960). The accurate determination of the position and shape of heavy-atom replacement groups in proteins. Acta Cryst. 13, 221–226.
Rossmann, M. G. (1961). The position of anomalous scatterers in protein crystals. Acta Cryst. 14, 383–388.
Singh, A. K. & Ramaseshan, S. (1966). The determination of heavy atom positions in protein derivatives. Acta Cryst. 21, 279–280.