International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. B, ch. 2.3, pp. 242-246
## Section 2.3.3. Isomorphous replacement difference Pattersons

One of the initial stages in the application of the isomorphous replacement method is the determination of heavy-atom positions. Indeed, this step of a structure determination can often be the most challenging. Not only may the number of heavy-atom sites be unknown and their substitution incomplete, but the various isomorphous compounds may also lack isomorphism. To compound these problems, the error in the measurement of the isomorphous difference in structure amplitudes is often comparable to the differences themselves. Clearly, therefore, the ease with which a particular problem can be solved is closely correlated with the quality of the data-measuring procedure.

The isomorphous replacement method was used incidentally by Bragg in the solution of NaCl and KCl. It was later formalized by J. M. Robertson in the analysis of phthalocyanine where the coordination centre could be Pt, Ni and other metals (Robertson, 1935, 1936; Robertson & Woodward, 1937). In this and similar cases, there was no difficulty in finding the heavy-atom positions. Not only were the heavy atoms frequently in special positions, but they dominated the total scattering effect. It was not until Perutz and his colleagues (Green *et al*., 1954; Bragg & Perutz, 1954) applied the technique to the solution of haemoglobin, a protein of 68 000 Da, that it was necessary to consider methods for detecting heavy atoms. A single heavy atom, even uranium, can have only a very marginal effect on the structure amplitudes of a crystalline macromolecule. Hence, techniques had to be developed which were dependent on the difference of the isomorphous structure amplitudes rather than on the solution of the Patterson of the heavy-atom-derivative compound on its own.

Phases in a centrosymmetric projection will be 0 or π if the origin is chosen at the centre of symmetry. Hence, the native structure factor, $\mathbf{F}_N$, and the heavy-atom-derivative structure factor, $\mathbf{F}_{NH}$, will be collinear. It follows that the structure amplitude, $|\mathbf{F}_H|$, of the heavy atoms alone in the cell will be given by

$$|\mathbf{F}_H| = \big|\,|\mathbf{F}_{NH}| \pm |\mathbf{F}_N|\,\big| + \varepsilon,$$

where $\varepsilon$ is the error on the parenthetic sum or difference. Three different cases may arise (Fig. 2.3.3.1). Since the situation shown in Fig. 2.3.3.1(*c*) is rare, in general

$$|\mathbf{F}_H|^2 \simeq \big(|\mathbf{F}_{NH}| - |\mathbf{F}_N|\big)^2. \tag{2.3.3.1}$$

Thus, a Patterson computed with the square of the differences between the native and derivative structure amplitudes of a centrosymmetric projection will approximate to a Patterson of the heavy atoms alone.

The approximation (2.3.3.1) is valid if the heavy-atom substitution is small enough to make $|\mathbf{F}_H| \ll |\mathbf{F}_{NH}|$ for most reflections, but sufficiently large to make $\varepsilon \ll \big|\,|\mathbf{F}_{NH}| - |\mathbf{F}_N|\,\big|$. It is also assumed that the native and heavy-atom-derivative data have been placed on the same relative scale. Hence, the relation (2.3.3.1) should be re-written as

$$|\mathbf{F}_H|^2 \simeq \big(|\mathbf{F}_{NH}| - k|\mathbf{F}_N|\big)^2,$$

where *k* is an experimentally determined scale factor (see Section 2.3.3.7). Uncertainty in the determination of *k* will contribute further to $\varepsilon$, albeit in a systematic manner.
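The centrosymmetric relation can be checked with a few numbers. The following is a minimal sketch, not part of the original text: the function name `difference_coefficients` and the three signed structure factors are invented for illustration. It assumes real (signed) structure factors, as in a centrosymmetric projection, and coefficients of the form $(|F_{NH}| - k|F_N|)^2$. The third reflection is a deliberate example of the rare 'crossover' case, in which the native and derivative structure factors have opposite signs and the difference coefficient underestimates $|F_H|^2$.

```python
def difference_coefficients(f_nh, f_n, k=1.0):
    """Return (|F_NH| - k|F_N|)^2 for paired lists of structure amplitudes."""
    if len(f_nh) != len(f_n):
        raise ValueError("amplitude lists must have the same length")
    return [(a - k * b) ** 2 for a, b in zip(f_nh, f_n)]

# Hypothetical signed (centrosymmetric) structure factors for three reflections.
F_N = [10.0, -8.0, 1.0]                      # native
F_H = [2.0, 1.5, -3.0]                       # heavy atoms alone
F_NH = [n + h for n, h in zip(F_N, F_H)]     # derivative: 12.0, -6.5, -2.0

coeffs = difference_coefficients([abs(f) for f in F_NH], [abs(f) for f in F_N])
true_fh_sq = [h * h for h in F_H]

for c, t, n, nh in zip(coeffs, true_fh_sq, F_N, F_NH):
    crossover = (n * nh) < 0   # opposite signs: |F_H| = |F_NH| + |F_N|
    print(f"coefficient={c:5.2f}  |F_H|^2={t:5.2f}  crossover={crossover}")
```

In this toy case the first two coefficients equal $|F_H|^2$ exactly, while the crossover reflection gives 1 instead of 9. In practice the scale factor `k` would have to be estimated from the data before the coefficients are formed.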

Centrosymmetric projections were used extensively for the determination of heavy-atom sites in early work on proteins such as haemoglobin (Green *et al*., 1954), myoglobin (Bluhm *et al*., 1958) and lysozyme (Poljak, 1963). However, with the advent of faster data-collecting techniques, low-resolution (*e.g.* a 5 Å limit) three-dimensional data are to be preferred for calculating difference Pattersons. For noncentrosymmetric reflections, the approximation (2.3.3.1) is still valid but less exact (Section 2.3.3.3). However, the larger number of three-dimensional differences compared to projection differences will enhance the signal of the real Patterson peaks relative to the noise. If there are *N* terms in the Patterson synthesis, then the peak-to-noise ratio will be proportional to $\sqrt{N}$ and to $1/\varepsilon$. With the subscripts 2 and 3 representing two- and three-dimensional syntheses, respectively, the latter will be more powerful than the former whenever

$$\frac{\sqrt{N_3}}{\varepsilon_3} > \frac{\sqrt{N_2}}{\varepsilon_2}.$$

Now, as $\varepsilon_3 \simeq \sqrt{2}\,\varepsilon_2$, it follows that $N_3$ must be greater than $2N_2$ if the three-dimensional noncentrosymmetric computation is to be more powerful. This condition must almost invariably be true.

A Patterson of a native bio-macromolecular structure (coefficients $F_N^2$) can be considered as being, at least approximately, a vector map of all the light atoms (carbons, nitrogens, oxygens, some sulfurs, and also phosphorus for nucleic acids) other than hydrogen atoms. These interactions will be designated as *LL*. Similarly, a Patterson of the heavy-atom derivative will contain $HH + HL + LL$ interactions, where *H* represents the heavy atoms. Thus, a true difference Patterson, with coefficients $F_{NH}^2 - F_N^2$, will contain only the interactions $HH + HL$. In general, the carpet of *HL* vectors completely dominates the *HH* vectors except for very small proteins such as insulin (Adams *et al*., 1969). Therefore, it would be preferable to compute a Patterson containing only *HH* interactions in order to interpret the map in terms of specific heavy-atom sites.

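Blow (1958) and Rossmann (1960) showed that a Patterson with coefficients $(|\mathbf{F}_{NH}| - |\mathbf{F}_N|)^2$ approximated to a Patterson containing only *HH* vectors. If the phase angle between $\mathbf{F}_N$ and $\mathbf{F}_{NH}$ is ϕ (Fig. 2.3.3.2), then

$$|\mathbf{F}_H|^2 = |\mathbf{F}_N|^2 + |\mathbf{F}_{NH}|^2 - 2|\mathbf{F}_N||\mathbf{F}_{NH}|\cos\phi.$$

In general, however, $|\mathbf{F}_H| \ll |\mathbf{F}_N|$. Hence, ϕ is small and

$$|\mathbf{F}_H|^2 \simeq \big(|\mathbf{F}_{NH}| - |\mathbf{F}_N|\big)^2,$$

which is the same relation as (2.3.3.1) for centrosymmetric approximations. Since the direction of $\mathbf{F}_H$ is random compared to $\mathbf{F}_N$, the root-mean-square projected length of $\mathbf{F}_H$ onto $\mathbf{F}_{NH}$ will be $|\mathbf{F}_H|/\sqrt{2}$. Thus it follows that a better approximation is

$$|\mathbf{F}_H|^2 \simeq 2\big(|\mathbf{F}_{NH}| - |\mathbf{F}_N|\big)^2, \tag{2.3.3.2}$$

which accounts for the assumption (Section 2.3.3.2) that $\varepsilon_3 = \sqrt{2}\,\varepsilon_2$. The almost universal method for the initial determination of major heavy-atom sites in an isomorphous derivative utilizes a Patterson with $(|\mathbf{F}_{NH}| - |\mathbf{F}_N|)^2$ coefficients. Approximation (2.3.3.2) is also the basis for the refinement of heavy-atom parameters in a single isomorphous replacement pair (Rossmann, 1960; Cullis *et al*., 1962; Terwilliger & Eisenberg, 1983).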
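The r.m.s. projection argument can be tested numerically. The sketch below is an illustration only, under stated assumptions: a fixed native amplitude of 100, a fixed heavy-atom amplitude of 10, uniformly random relative phases, and a hypothetical helper named `mean_squared_difference`. It averages the squared amplitude difference over many simulated reflections and compares it with $|F_H|^2$; if the projection argument holds, the ratio should be close to 2.

```python
import cmath
import math
import random

def mean_squared_difference(f_n=100.0, f_h=10.0, n_refl=20000, seed=0):
    """Average (|F_NH| - |F_N|)^2 over uniformly random heavy-atom phases."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_refl):
        # F_N is placed on the real axis; only the relative phase matters.
        f_heavy = cmath.rect(f_h, rng.uniform(0.0, 2.0 * math.pi))
        f_deriv = complex(f_n, 0.0) + f_heavy
        diff = abs(f_deriv) - f_n
        total += diff * diff
    return total / n_refl

f_h = 10.0
ratio = f_h ** 2 / mean_squared_difference(f_n=100.0, f_h=f_h)
print(f"|F_H|^2 / <(|F_NH| - |F_N|)^2> = {ratio:.2f}")
```

The ratio approaches 2 only while the heavy-atom contribution is small compared with the native amplitude; raising `f_h` towards `f_n` in the sketch shows how the approximation degrades.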

In the most general case of a triclinic space group, it will be necessary to select an origin arbitrarily, usually coincident with a heavy atom. All other heavy atoms (and subsequently also the macromolecular atoms) will be referred to this reference atom. However, the choice of an origin will be independent in the interpretation of each derivative's difference Patterson. It will then be necessary to correlate the various, arbitrarily chosen, origins. The same problem occurs in space groups lacking symmetry axes perpendicular to the primary rotation axis (*e.g.* $P2_1$, $P6$ *etc.*), although only one coordinate, namely parallel to the unique rotation axis, will require correlation. This problem gave rise to some concern in the 1950s. Bragg (1958), Blow (1958), Perutz (1956), Hoppe (1959) and Bodo *et al*. (1959) developed a variety of techniques, none of which were entirely satisfactory. Rossmann (1960) proposed the $(|\mathbf{F}_{NH1}| - |\mathbf{F}_{NH2}|)^2$ synthesis and applied it successfully to the heavy-atom determination of horse haemoglobin. This function gives positive peaks at the end of vectors between the heavy-atom sites in the first compound, positive peaks between the sites in the second compound, and negative peaks between sites in the first and second compound (Fig. 2.3.3.3). It is thus the negative peaks which provide the necessary correlation. The function is unique in that it is a Patterson containing significant information in both positive and negative peaks. Steinrauf (1963) suggested using the coefficients $(|\mathbf{F}_{NH1}| - |\mathbf{F}_N|)\cdot(|\mathbf{F}_{NH2}| - |\mathbf{F}_N|)$ in order to eliminate the positive $H1H1$ and $H2H2$ vectors.
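The sign behaviour of the correlation function can be illustrated with a one-dimensional toy model. This is a sketch under invented assumptions, not a reproduction of the published procedure: a 1D cell in $P1$, 400 unit-weight 'light' atoms at random positions, one heavy atom of weight 4 in each of two derivatives (at 0.10 and 0.40), and coefficients taken as the squared difference of the two derivative amplitudes. All names are hypothetical. Because each derivative has a single site, the positive self vectors fall on the origin, and the map away from the origin should be dominated by a negative feature at the cross vector between the two sites.

```python
import cmath
import math
import random

def structure_factor(h, atoms):
    """Sum of weight * exp(2*pi*i*h*x) over (x, weight) pairs in a 1D cell."""
    return sum(w * cmath.exp(2j * math.pi * h * x) for x, w in atoms)

rng = random.Random(1)
native = [(rng.random(), 1.0) for _ in range(400)]   # hypothetical light atoms
site_1, site_2, weight = 0.10, 0.40, 4.0              # one heavy atom per derivative
h_max = 200

coeffs = []
for h in range(1, h_max + 1):
    f_n = structure_factor(h, native)
    f_nh1 = f_n + structure_factor(h, [(site_1, weight)])
    f_nh2 = f_n + structure_factor(h, [(site_2, weight)])
    coeffs.append((abs(f_nh1) - abs(f_nh2)) ** 2)

def correlation_map(u):
    """Cosine synthesis with the (|F_NH1| - |F_NH2|)^2 coefficients (h = 1..h_max)."""
    return sum(c * math.cos(2.0 * math.pi * (i + 1) * u) for i, c in enumerate(coeffs))

# The map is even in u, so half the cell suffices; the origin peak is skipped.
grid = [i / 1000.0 for i in range(50, 501)]
u_min = min(grid, key=correlation_map)
print(f"deepest negative feature at u = {u_min:.3f}")
```

The deepest minimum should lie near $u = 0.30$, the separation of the two sites, which is the information needed to place both derivatives on a common origin.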

Although the problem of correlation was a serious concern in the early structural determination of proteins during the late 1950s and early 1960s, the problem has now been by-passed. Blow & Rossmann (1961) and Kartha (1961) independently showed that it was possible to compute usable phases from a single isomorphous replacement (SIR) derivative. This contradicted the previously accepted notion that it was necessary to have at least two isomorphous derivatives to be able to determine a noncentrosymmetric reflection's phase (Harker, 1956). Hence, currently, the procedure used to correlate origins in different derivatives is to compute SIR phases from the first compound and apply them to a difference electron-density map of the second heavy-atom derivative. Thus, the origin of the second derivative will be referred to the arbitrarily chosen origin of the first compound. More important, however, the interpretation of such a 'feedback' difference Fourier is easier than that of a difference Patterson. Hence, once one heavy-atom derivative has been solved for its heavy-atom sites, the solution of other derivatives is almost assured. This concept is examined more closely in the following section.

Difference Pattersons have usually been manually interpreted in terms of point atoms. In more complex situations, such as crystalline viruses, a systematic approach may be necessary to analyse the Patterson. That is especially true when the structure contains noncrystallographic symmetry (Argos & Rossmann, 1976). Such methods are in principle dependent on the comparison of the observed Patterson, $P_1(\mathbf{x})$, with a calculated Patterson, $P_2(\mathbf{x})$. A criterion, $C_P$, based on the sum of the Patterson densities at all test vectors within the unit-cell volume *V*, would be

$$C_P = \int_V P_1(\mathbf{x})\,P_2(\mathbf{x})\,\mathrm{d}\mathbf{x}.$$

$C_P$ can be evaluated for all reasonable heavy-atom distributions. Each different set of trial sites corresponds to a different $P_2$ Patterson. It is then easily shown that, to within a constant factor,

$$C_P = \sum_{\mathbf{h}} \Delta_{\mathbf{h}}^2\,E_{\mathbf{h}}^2,$$

where the sum is taken over all **h** reflections in reciprocal space, $\Delta_{\mathbf{h}}^2$ are the observed differences and $E_{\mathbf{h}}^2$ are the structure factors of the trial point Patterson. (The symbol *E* is used here because of its close relation to normalized structure factors.)

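Let there be *n* noncrystallographic asymmetric units within the crystallographic asymmetric unit and *m* crystallographic asymmetric units within the crystal unit cell. Then there are *L* symmetry-related heavy-atom sites where $L = nm$. Let the scattering contribution of the *i*th site have $a_i$ and $b_i$ real and imaginary structure-factor components with respect to an arbitrary origin. Hence, for reflection **h**

$$E_{\mathbf{h}}^2 = \Big(\sum_{i=1}^{L} a_i\Big)^2 + \Big(\sum_{i=1}^{L} b_i\Big)^2.$$

Therefore,

$$C_P = \sum_{\mathbf{h}} \Delta_{\mathbf{h}}^2 \Big[\sum_{i=1}^{L}\big(a_i^2 + b_i^2\big) + \sum_{i \neq j}\big(a_i a_j + b_i b_j\big)\Big].$$

But the criterion must be independent of the number, *L*, of heavy-atom sites per cell, and for point atoms the self terms $\sum_i (a_i^2 + b_i^2)$ depend only on *L* and not on the positions of the sites. Thus the criterion can be re-written as

$$C'_P = \sum_{\mathbf{h}} \Delta_{\mathbf{h}}^2 \sum_{i \neq j}\big(a_i a_j + b_i b_j\big).$$

More generally, if some sites have already been tentatively determined, and if these sites give rise to the structure-factor components $A_{\mathbf{h}}$ and $B_{\mathbf{h}}$, then

$$E_{\mathbf{h}}^2 = \Big(A_{\mathbf{h}} + \sum_{i=1}^{L} a_i\Big)^2 + \Big(B_{\mathbf{h}} + \sum_{i=1}^{L} b_i\Big)^2.$$

Following the same procedure as above, and again dropping the terms that do not depend on the positions of the trial sites, it follows that

$$C'_P = \sum_{\mathbf{h}} \Delta_{\mathbf{h}}^2 \Big[2\big(A_{\mathbf{h}} a_{\mathbf{h}} + B_{\mathbf{h}} b_{\mathbf{h}}\big) + \sum_{i \neq j}\big(a_i a_j + b_i b_j\big)\Big], \tag{2.3.3.5}$$

where $a_{\mathbf{h}} = \sum_{i=1}^{L} a_i$ and $b_{\mathbf{h}} = \sum_{i=1}^{L} b_i$.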
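The origin-removed search criterion can be sketched in a few lines. The code below is a hedged illustration, not the published implementation: it assumes unit-weight point atoms in space group $P1$ (so that $a_i = \cos 2\pi\mathbf{h}\cdot\mathbf{x}_i$ and $b_i = \sin 2\pi\mathbf{h}\cdot\mathbf{x}_i$), no previously determined sites, and synthetic 'observed' differences generated from two invented true sites. The function names `cross_term` and `patterson_search_score` are hypothetical.

```python
import math
from itertools import product

def cross_term(hkl, sites):
    """Sum over i != j of (a_i*a_j + b_i*b_j) for unit point atoms at `sites`."""
    phases = [2.0 * math.pi * sum(h * x for h, x in zip(hkl, xyz)) for xyz in sites]
    a = [math.cos(p) for p in phases]
    b = [math.sin(p) for p in phases]
    self_terms = sum(ai * ai + bi * bi for ai, bi in zip(a, b))
    return sum(a) ** 2 + sum(b) ** 2 - self_terms

def patterson_search_score(delta_sq, sites):
    """Origin-removed criterion: sum_h Delta_h^2 * sum_{i != j}(a_i a_j + b_i b_j)."""
    return sum(d2 * cross_term(hkl, sites) for hkl, d2 in delta_sq.items())

# Synthetic "observed" squared differences from two hypothetical true sites.
true_sites = [(0.2, 0.3, 0.1), (0.5, 0.1, 0.4)]
reflections = [hkl for hkl in product(range(-3, 4), repeat=3) if hkl != (0, 0, 0)]
delta_sq = {hkl: cross_term(hkl, true_sites) + len(true_sites) for hkl in reflections}

wrong_sites = [(0.2, 0.3, 0.1), (0.3, 0.4, 0.2)]
score_true = patterson_search_score(delta_sq, true_sites)
score_wrong = patterson_search_score(delta_sq, wrong_sites)
print(f"true sites: {score_true:.1f}   wrong sites: {score_wrong:.1f}")
```

With error-free synthetic data the true pair of sites scores far higher than the wrong pair. A real search would generate the symmetry-related sites from the crystallographic and noncrystallographic operators and scan the trial position over a grid.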

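Expression (2.3.3.5) will now be compared with the 'feedback' method (Dickerson *et al*., 1967, 1968) of verifying heavy-atom sites using SIR phasing. Inspection of Fig. 2.3.3.4 shows that the native phase, α, will be determined as $\alpha = \phi + \pi$ (ϕ is the structure-factor phase corresponding to the presumed heavy-atom positions) when $|\mathbf{F}_N| > |\mathbf{F}_{NH}|$ and $\alpha = \phi$ when $|\mathbf{F}_N| \le |\mathbf{F}_{NH}|$. Thus, an SIR difference electron density, $\Delta\rho(\mathbf{x})$, can be synthesized by the Fourier summation

$$\Delta\rho(\mathbf{x}) = \frac{1}{V}\sum_{\mathbf{h}} m\,\big(|\mathbf{F}_{NH}| - |\mathbf{F}_N|\big)\cos\big(2\pi\mathbf{h}\cdot\mathbf{x} - \alpha_{\mathbf{h}}\big),$$

where *m* is a figure of merit of the phase reliability (Blow & Crick, 1959; Dickerson *et al*., 1961). Now, $|\mathbf{F}_H|\cos\phi_{\mathbf{h}} = A_{\mathbf{h}}$ and $|\mathbf{F}_H|\sin\phi_{\mathbf{h}} = B_{\mathbf{h}}$, where $A_{\mathbf{h}}$ and $B_{\mathbf{h}}$ are the real and imaginary components of the presumed heavy-atom sites. Therefore, since the sign of the amplitude difference compensates the phase shift of π,

$$\Delta\rho(\mathbf{x}) = \frac{1}{V}\sum_{\mathbf{h}} \frac{m\,\big|\,|\mathbf{F}_{NH}| - |\mathbf{F}_N|\,\big|}{|\mathbf{F}_H|}\big(A_{\mathbf{h}}\cos 2\pi\mathbf{h}\cdot\mathbf{x} + B_{\mathbf{h}}\sin 2\pi\mathbf{h}\cdot\mathbf{x}\big).$$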

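If this SIR difference electron-density map shows significant peaks at sites related by noncrystallographic symmetry, then those sites will be at the position of a further set of heavy atoms. Hence, a suitable criterion for finding heavy-atom sites is

$$C_{\rm SIR} = \sum_{j=1}^{n} \Delta\rho(\mathbf{x}_j),$$

or by substitution

$$C_{\rm SIR} = \frac{1}{V}\sum_{\mathbf{h}} \frac{m\,\big|\,|\mathbf{F}_{NH}| - |\mathbf{F}_N|\,\big|}{|\mathbf{F}_H|}\Big(A_{\mathbf{h}}\sum_{j=1}^{n}\cos 2\pi\mathbf{h}\cdot\mathbf{x}_j + B_{\mathbf{h}}\sum_{j=1}^{n}\sin 2\pi\mathbf{h}\cdot\mathbf{x}_j\Big).$$

But $a_{\mathbf{h}} = \sum_{j}\cos 2\pi\mathbf{h}\cdot\mathbf{x}_j$ and $b_{\mathbf{h}} = \sum_{j}\sin 2\pi\mathbf{h}\cdot\mathbf{x}_j$. Therefore,

$$C_{\rm SIR} = \frac{1}{V}\sum_{\mathbf{h}} \frac{m\,\big|\,|\mathbf{F}_{NH}| - |\mathbf{F}_N|\,\big|}{|\mathbf{F}_H|}\big(A_{\mathbf{h}} a_{\mathbf{h}} + B_{\mathbf{h}} b_{\mathbf{h}}\big). \tag{2.3.3.6}$$

This expression is similar to (2.3.3.5) derived by consideration of a Patterson search. It differs from (2.3.3.5) in two respects: the Fourier coefficients are different and expression (2.3.3.6) is lacking a second term. Now the figure of merit *m* will be small whenever $\big|\,|\mathbf{F}_{NH}| - |\mathbf{F}_N|\,\big|$ is small as the SIR phase cannot be determined well under those conditions. Hence, effectively, the coefficients are a function of $(|\mathbf{F}_{NH}| - |\mathbf{F}_N|)^2$, and the coefficients of the functions (2.3.3.5) and (2.3.3.6) are indeed rather similar. The second term in (2.3.3.5) relates to the use of the search atoms in phasing and could be included in (2.3.3.6), provided the actual feedback sites in each of the *n* electron-density functions tested by $C_{\rm SIR}$ are omitted in turn. Thus, a systematic Patterson search and an SIR difference Fourier search are very similar in character and power.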

The difference Patterson computed with coefficients $F_{NH}^2 - F_N^2$ contains information on the heavy atoms (*HH* vectors) and the macromolecular structure (*HL* vectors) (Section 2.3.3.3). If the scaling between the $|\mathbf{F}_{NH}|$ and $|\mathbf{F}_N|$ data sets is not perfect there will also be noise. Rossmann (1961*b*) was partially successful in determining the low-resolution horse haemoglobin structure by using a series of superpositions based on the known heavy-atom sites. Nevertheless, Patterson superposition methods have not been used for the structure determination of proteins owing to the successful error treatment of the isomorphous replacement method in reciprocal space. However, the superposition approach is of some interest here, for it gives an alternative insight into SIR phasing.

The deconvolution of an arbitrary molecule, represented as '?', from an $F_{NH}^2 - F_N^2$ Patterson, is demonstrated in Fig. 2.3.3.5. The original structure is shown in Fig. 2.3.3.5(*a*) and the corresponding Patterson in Fig. 2.3.3.5(*b*). Superposition with respect to one of the heavy-atom sites is shown in Fig. 2.3.3.5(*c*) and the other in Fig. 2.3.3.5(*d*). Both Figs. 2.3.3.5(*c*) and (*d*) contain a centre of symmetry because the use of only a single *HH* vector implies a centre of symmetry half way between the two sites. The centre is broken on combining information from all three sites (which together lack a centre of symmetry) by superimposing Figs. 2.3.3.5(*c*) and (*d*) to obtain either the original structure (Fig. 2.3.3.5*a*) or its enantiomorph. Thus it is clear, in principle, that there is sufficient information in a single isomorphous derivative data set, when used in conjunction with a native data set, to solve a structure completely. However, the procedure shown in Fig. 2.3.3.5 does not consider the accumulation of error in the selection of individual images when these intersect with another image. In this sense the reciprocal-space isomorphous replacement technique has greater elegance and provides more insight, whereas the alternative view given by the Patterson method was the original stimulus for the discovery of the SIR phasing technique (Blow & Rossmann, 1961).

Other Patterson functions for the deconvolution of SIR data have been proposed by Ramachandran & Raman (1959), as well as others. The principles are similar but the coefficients of the functions are optimized to emphasize various aspects of the signal representing the molecular structure.

It is insufficient to discuss Patterson techniques for locating heavy-atom substitutions without also considering errors of all kinds. First, it must be recognized that most heavy-atom labels are not a single atom but a small compound containing one or more heavy atoms. The compound itself will displace water or ions and locally alter the conformation of the protein or nucleic acid. Hence, a simple Gaussian approximation will suffice to represent individual heavy-atom scatterers responsible for the difference between native and heavy-atom derivatives. Furthermore, the heavy-atom compound often introduces small global structural changes which can be detected only at higher resolution. These problems were considered with some rigour by Crick & Magdoff (1956). In general, lack of isomorphism is exhibited by an increase in the size of the isomorphous differences with increasing resolution (Fig. 2.3.3.6).

Crick & Magdoff (1956) also derived the approximate expression

$$\sqrt{\frac{2N_H}{N_P}}\;\frac{f_H}{f_P}$$

to estimate the r.m.s. fractional change in intensity as a function of heavy-atom substitution. Here, $N_H$ represents the number of heavy atoms attached to a protein (or other large molecule) which contains $N_P$ light atoms. $f_H$ and $f_P$ are the scattering powers of the average heavy and protein atom, respectively. This function was tabulated by Eisenberg (1970) as a function of molecular weight (proportional to $N_P$). For instance, for a single, fully substituted, Hg atom the formula predicts an r.m.s. intensity change of around 25% in a molecule of 100 000 Da. However, the error of measurement of a reflection intensity is likely to be around 10% of *I*, implying perhaps an error of around 14% of *I* on a difference measurement. Thus, the isomorphous replacement difference measurement for almost half the reflections will be buried in error for this case.
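The size of the expected signal can be estimated with a short calculation. The sketch below assumes the Crick & Magdoff estimate has the form $\sqrt{2N_H/N_P}\,(f_H/f_P)$ for noncentrosymmetric reflections. The numerical inputs are illustrative assumptions rather than values from the text: about 14 Da per non-hydrogen protein atom, $f_P \approx 7$ electrons and $f_H = 80$ electrons for mercury at low angle. The function name is hypothetical.

```python
import math

def rms_fractional_intensity_change(n_heavy, n_light, f_heavy, f_light):
    """Crick & Magdoff style estimate: sqrt(2 * N_H / N_P) * (f_H / f_P)."""
    return math.sqrt(2.0 * n_heavy / n_light) * (f_heavy / f_light)

# Illustrative assumptions (not from the text): ~14 Da per non-hydrogen
# protein atom, f_P ~ 7 electrons, f_H = 80 electrons for Hg at low angle.
molecular_mass = 100_000.0
n_light = molecular_mass / 14.0
signal = rms_fractional_intensity_change(1, n_light, 80.0, 7.0)

# Two independent 10% intensity errors combine in quadrature on a difference.
difference_error = math.sqrt(2.0) * 0.10

print(f"r.m.s. fractional change ~ {signal:.2f}")
print(f"error on a difference    ~ {difference_error:.2f}")
```

With these inputs the estimate is about 0.19, the same order as the figure quoted above; the exact value depends on the assumed scattering powers and atom count. The difference error of about 0.14 follows from combining two 10% errors in quadrature.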

Scaling of the different heavy-atom-derivative data sets onto a common relative scale is clearly important if error is to be reduced. Blundell & Johnson (1976, pp. 333–336) give a careful discussion of this subject. Suffice it to say here only that a linear scale factor is seldom acceptable as the heavy-atom-derivative crystals frequently suffer from greater disorder than the native crystals. The heavy-atom derivative should, in general, have a slightly larger mean value for the structure factors on account of the additional heavy atoms (Green *et al*., 1954). The usual effect is to make the summed derivative intensities slightly larger than the summed native intensities (Phillips, 1966).

As the amount of heavy atom is usually unknown in a yet unsolved heavy-atom derivative, it is usual practice either to apply a scale factor of the form $k\exp[-B(\sin\theta/\lambda)^2]$ or, more generally, to use local scaling (Matthews & Czerwinski, 1975). The latter has the advantage of not making any assumption about the physical nature of the relative intensity decay with resolution.
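A resolution-dependent scale of this kind can be estimated by a straight-line fit. The following is a minimal sketch under stated assumptions: the scale factor is taken to have the exponential form $k\exp[-B(\sin\theta/\lambda)^2]$ and to relate derivative to native amplitudes, so that $\ln(|F_{NH}|/|F_N|) = \ln k - B(\sin\theta/\lambda)^2$. The function name `fit_scale` and the synthetic data are invented for illustration; which data set the scale is applied to is a convention, and this is not the local-scaling method of Matthews & Czerwinski.

```python
import math

def fit_scale(s_sq, f_nh, f_n):
    """Least-squares fit of ln(|F_NH|/|F_N|) = ln(k) - B * s^2; returns (k, B)."""
    y = [math.log(a / b) for a, b in zip(f_nh, f_n)]
    n = len(s_sq)
    mean_x = sum(s_sq) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in s_sq)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(s_sq, y))
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_x), -slope

# Synthetic check with invented values k = 1.2 and B = 5.0 (B in A^2).
s_sq = [0.001 * i for i in range(1, 11)]          # (sin(theta)/lambda)^2 in A^-2
f_n = [100.0 - 5.0 * i for i in range(10)]
f_nh = [1.2 * math.exp(-5.0 * x) * f for x, f in zip(s_sq, f_n)]

k, b = fit_scale(s_sq, f_nh, f_n)
print(f"k = {k:.3f}, B = {b:.3f}")
```

Real amplitude ratios scatter widely because of the heavy-atom contribution itself, so in practice such a fit would more plausibly be made on sums of amplitudes or intensities within resolution shells rather than on individual reflections.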

### References

Adams, M. J., Blundell, T. L., Dodson, E. J., Dodson, G. G., Vijayan, M., Baker, E. N., Harding, M. M., Hodgkin, D. C., Rimmer, B. & Sheat, S. (1969). *Structure of rhombohedral 2 zinc insulin crystals. Nature (London)*, **224**, 491–495.

Argos, P. & Rossmann, M. G. (1976). *A method to determine heavy-atom positions for virus structures. Acta Cryst.* **B32**, 2975–2979.

Blow, D. M. (1958). *The structure of haemoglobin. VII. Determination of phase angles in the noncentrosymmetric [100] zone. Proc. R. Soc. London Ser. A*, **247**, 302–336.

Blow, D. M. & Crick, F. H. C. (1959). *The treatment of errors in the isomorphous replacement method. Acta Cryst.* **12**, 794–802.

Blow, D. M. & Rossmann, M. G. (1961). *The single isomorphous replacement method. Acta Cryst.* **14**, 1195–1202.

Bluhm, M. M., Bodo, G., Dintzis, H. M. & Kendrew, J. C. (1958). *The crystal structure of myoglobin. IV. A Fourier projection of sperm-whale myoglobin by the method of isomorphous replacement. Proc. R. Soc. London Ser. A*, **246**, 369–389.

Blundell, T. L. & Johnson, L. N. (1976). *Protein crystallography.* New York: Academic Press.

Bodo, G., Dintzis, H. M., Kendrew, J. C. & Wyckoff, H. W. (1959). *The crystal structure of myoglobin. V. A low-resolution three-dimensional Fourier synthesis of sperm-whale myoglobin crystals. Proc. R. Soc. London Ser. A*, **253**, 70–102.

Bragg, W. L. (1958). *The determination of the coordinates of heavy atoms in protein crystals. Acta Cryst.* **11**, 70–75.

Bragg, W. L. & Perutz, M. F. (1954). *The structure of haemoglobin. VI. Fourier projections on the 010 plane. Proc. R. Soc. London Ser. A*, **225**, 315–329.

Crick, F. H. C. & Magdoff, B. S. (1956). *The theory of the method of isomorphous replacement for protein crystals. I. Acta Cryst.* **9**, 901–908.

Cullis, A. F., Muirhead, H., Perutz, M. F., Rossmann, M. G. & North, A. C. T. (1962). *The structure of haemoglobin. IX. A three-dimensional Fourier synthesis at 5.5 Å resolution: description of the structure. Proc. R. Soc. London Ser. A*, **265**, 161–187.

Dickerson, R. E., Kendrew, J. C. & Strandberg, B. E. (1961). *The crystal structure of myoglobin: phase determination to a resolution of 2 Å by the method of isomorphous replacement. Acta Cryst.* **14**, 1188–1195.

Dickerson, R. E., Kopka, M. L., Varnum, J. C. & Weinzierl, J. E. (1967). *Bias, feedback and reliability in isomorphous phase analysis. Acta Cryst.* **23**, 511–522.

Dickerson, R. E., Weinzierl, J. E. & Palmer, R. A. (1968). *A least-squares refinement method for isomorphous replacement. Acta Cryst.* **B24**, 997–1003.

Eisenberg, D. (1970). *X-ray crystallography and enzyme structure.* In *The enzymes*, edited by P. D. Boyer, Vol. I, 3rd ed., pp. 1–89. New York: Academic Press.

Green, D. W., Ingram, V. M. & Perutz, M. F. (1954). *The structure of haemoglobin. IV. Sign determination by the isomorphous replacement method. Proc. R. Soc. London Ser. A*, **225**, 287–307.

Harker, D. (1956). *The determination of the phases of the structure factors of non-centrosymmetric crystals by the method of double isomorphous replacement. Acta Cryst.* **9**, 1–9.

Hoppe, W. (1959). *Die Bestimmung genauer Schweratom-parameter in isomorphen azentrischen Kristallen. Acta Cryst.* **12**, 665–674.

Kartha, G. (1961). *Isomorphous replacement method in non-centrosymmetric structures. Acta Cryst.* **14**, 680–686.

Matthews, B. W. & Czerwinski, E. W. (1975). *Local scaling: a method to reduce systematic errors in isomorphous replacement and anomalous scattering measurements. Acta Cryst.* **A31**, 480–487.

Perutz, M. F. (1956). *Isomorphous replacement and phase determination in non-centrosymmetric space groups. Acta Cryst.* **9**, 867–873.

Phillips, D. C. (1966). *Advances in protein crystallography.* In *Advances in structure research by diffraction methods*, Vol. 2, edited by R. Brill & R. Mason, pp. 75–140. New York: John Wiley.

Poljak, R. J. (1963). *Heavy-atom attachment to crystalline lysozyme. J. Mol. Biol.* **6**, 244–246.

Ramachandran, G. N. & Raman, S. (1959). *Syntheses for the deconvolution of the Patterson function. Part I. General principles. Acta Cryst.* **12**, 957–964.

Robertson, J. M. (1935). *An X-ray study of the structure of phthalocyanines. Part I. The metal-free, nickel, copper, and platinum compounds. J. Chem. Soc.* pp. 615–621.

Robertson, J. M. (1936). *An X-ray study of the phthalocyanines. Part II. Quantitative structure determination of the metal-free compound. J. Chem. Soc.* pp. 1195–1209.

Robertson, J. M. & Woodward, I. (1937). *An X-ray study of the phthalocyanines. Part III. Quantitative structure determination of nickel phthalocyanine. J. Chem. Soc.* pp. 219–230.

Rossmann, M. G. (1960). *The accurate determination of the position and shape of heavy-atom replacement groups in proteins. Acta Cryst.* **13**, 221–226.

Rossmann, M. G. (1961*b*). *Application of the Buerger minimum function to protein structures.* In *Computing methods and the phase problem in X-ray crystal analysis*, edited by R. Pepinsky, J. M. Robertson & J. C. Speakman, pp. 252–265. Oxford: Pergamon Press.

Steinrauf, L. K. (1963). *Two Fourier functions for use in protein crystallography. Acta Cryst.* **16**, 317–319.

Terwilliger, T. C. & Eisenberg, D. (1983). *Unbiased three-dimensional refinement of heavy-atom parameters by correlation of origin-removed Patterson functions. Acta Cryst.* **A39**, 813–817.