International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli

International Tables for Crystallography (2006). Vol. B, ch. 2.4, pp. 269-274

M. Vijayan (a)* and S. Ramaseshan (b)

(a) Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India, and (b) Raman Research Institute, Bangalore 560 080, India
Correspondence e-mail: mv@mbu.iisc.ernet.in

2.4.4. Isomorphous replacement and anomalous scattering in protein crystallography

2.4.4.1. Protein heavy-atom derivatives

Perhaps the most spectacular applications of isomorphous replacement and anomalous-scattering methods have been in the structure solution of large biological macromolecules, primarily proteins. Since its first successful application to myoglobin and haemoglobin, the isomorphous replacement method, which is often used in conjunction with the anomalous-scattering method, has been employed in the solution of scores of proteins. The application of this method involves the preparation of protein heavy-atom derivatives, i.e. the attachment of heavy atoms like mercury, uranium and lead, or chemical groups containing them, to protein crystals in a coherent manner without changing the conformation of the molecules and their crystal packing. This is only rarely possible in ordinary crystals as the molecules in them are closely packed. Protein crystals, however, contain large solvent regions, and isomorphous derivatives can be prepared by replacing the disordered solvent molecules by heavy-atom-containing groups without disturbing the original arrangement of protein molecules.

2.4.4.2. Determination of heavy-atom parameters

For any given reflection, the structure factor of the native protein crystal [({\bf F}_{N})], that of a heavy-atom derivative [({\bf F}_{NH})], and the contribution of the heavy atoms in that derivative [({\bf F}_{H})] are related by the equation [{\bf F}_{NH} = {\bf F}_{N} + {\bf F}_{H}. \eqno(2.4.4.1)] The value of [{\bf F}_{H}] depends not only on the positional and thermal parameters of the heavy atoms, but also on their occupancy factors, because the heavy atom at a given position is often not present in all the unit cells. For example, if the heavy atom is present at a given position in only half the unit cells in the crystal, then the occupancy factor of the site is said to be 0.5.

For the successful determination of the heavy-atom parameters, and for the subsequent phase determination, the data sets from the native and the derivative crystals should have the same relative scale. The different data sets should also have the same overall temperature factor. Different scaling procedures have been suggested (Blundell & Johnson, 1976) and, among them, the following procedure, based on Wilson's (1942) statistics, appears to be the most feasible in the early stages of structure analysis.

Assuming that the data from the native and the derivative crystals obey Wilson's statistics, we have, for any range of [\sin^{2} \theta/\lambda^{2}], [\ln \left\{{\sum f_{Nj}^{2}\over \langle F_{N}^{2}\rangle}\right\} = \ln K_{N} + 2B_{N} {\sin^{2} \theta\over \lambda^{2}} \eqno(2.4.4.2)] and [\ln \left\{{\sum f_{Nj}^{2} + \sum f_{Hj}^{2}\over \langle F_{NH}^{2}\rangle}\right\} = \ln K_{NH} + 2B_{NH} {\sin^{2} \theta\over \lambda^{2}}, \eqno(2.4.4.3)] where [f_{Nj}] and [f_{Hj}] refer to the atomic scattering factors of protein atoms and heavy atoms, respectively. [K_{N}] and [K_{NH}] are the scale factors to be applied to the intensities from the native and the derivative crystals, respectively, and [B_{N}] and [B_{NH}] the temperature factors of the respective structure factors. Normally one would be able to derive the absolute scale factor and the temperature factor for both the data sets from (2.4.4.2)[link] and (2.4.4.3)[link] using the well known Wilson plot. The data from protein crystals, however, do not follow Wilson's statistics as protein molecules contain highly non-random features. Therefore, in practice, it is difficult to fit a straight line through the points in a Wilson plot, thus rendering the parameters derived from it unreliable. (2.4.4.2)[link] and (2.4.4.3)[link] can, however, be used in a different way. From the two equations we obtain [\eqalignno{ &\ln \left\{{\sum f_{Nj}^{2} + \sum f_{Hj}^{2}\over \sum f_{Nj}^{2}} \cdot {\langle F_{N}^{2}\rangle\over \langle F_{NH}^{2}\rangle}\right\} \cr &\quad = \ln \left({K_{NH}\over K_{N}}\right) + 2(B_{NH} - B_{N}) {\sin^{2} \theta\over \lambda^{2}}. &(2.4.4.4)}] The effects of structural non-randomness in the crystals obviously cancel out in (2.4.4.4)[link]. When the left-hand side of (2.4.4.4)[link] is plotted against [(\sin^{2} \theta)/\lambda^{2}], it is called a comparison or difference Wilson plot. 
Such plots yield the ratio between the scales of the derivative and the native data, and the additional temperature factor of the derivative data. Initially, the number and the occupancy factors of heavy-atom sites are unknown, and are roughly estimated from intensity differences to evaluate [\sum f_{Hj}^{2}]. These estimates usually undergo considerable revision in the course of the determination and the refinement of heavy-atom parameters.
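
The comparison Wilson plot of (2.4.4.4) can be sketched numerically as follows. This is a minimal illustration, not a reference implementation: it assumes that the reflections have already been grouped into shells of [\sin^{2} \theta/\lambda^{2}], and the function name, argument names and the unweighted straight-line fit are choices made here for illustration only.

```python
import math

def difference_wilson_fit(s2, mean_fn2, mean_fnh2, sum_fn2, sum_fh2):
    """Fit a straight line to the comparison Wilson plot of eq. (2.4.4.4).

    Each argument is a list with one entry per resolution shell:
      s2         sin^2(theta)/lambda^2 at the shell centre
      mean_fn2   <F_N^2>,  mean_fnh2  <F_NH^2>
      sum_fn2    sum of f_Nj^2,  sum_fh2  sum of f_Hj^2
    Returns (K_NH / K_N, B_NH - B_N).
    """
    # Left-hand side of eq. (2.4.4.4) for every shell.
    y = [math.log((a + h) / a * n / d)
         for a, h, n, d in zip(sum_fn2, sum_fh2, mean_fn2, mean_fnh2)]
    count = len(s2)
    mean_x = sum(s2) / count
    mean_y = sum(y) / count
    sxx = sum((x - mean_x) ** 2 for x in s2)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(s2, y))
    slope = sxy / sxx                    # equals 2 (B_NH - B_N)
    intercept = mean_y - slope * mean_x  # equals ln(K_NH / K_N)
    return math.exp(intercept), slope / 2.0
```

Because [\sum f_{Hj}^{2}] depends on the estimated number and occupancies of the heavy-atom sites, the fit would normally be repeated as those estimates are revised.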

Heavy-atom positions are most often determined initially by Patterson syntheses of one type or another. Such syntheses are discussed in some detail in Chapter 2.3 and are therefore treated only briefly here.

Equation (2.4.2.6)[link] holds when the data are centric. [F_{H}] is usually small compared to [F_{N}] and [F_{NH}], and the minus sign is then relevant on the left-hand side of (2.4.2.6)[link]. Thus the difference between the magnitudes of [{\bf F}_{NH}] and [{\bf F}_{N}], which can be obtained experimentally, normally gives a correct estimate of the magnitude of [{\bf F}_{H}] for most reflections. Then a Patterson synthesis with [(F_{NH} - F_{N})^{2}] as coefficients corresponds to the distribution of vectors between heavy atoms, when the data are centric. But proteins are made up of L-amino acids and hence cannot crystallize in centrosymmetric space groups. However, many proteins crystallize in space groups with centrosymmetric projections. The centric data corresponding to these projections can then be used for determining heavy-atom positions through a Patterson synthesis of the type outlined above.

The situation is more complex for three-dimensional acentric data. It has been shown (Rossmann, 1961[link]) that [(F_{NH} - F_{N})^{2} \simeq F_{H}^{2} \cos^{2} (\alpha_{NH} - \alpha_{H}) \eqno(2.4.4.5)] when [F_{H}] is small compared to [F_{NH}] and [F_{N}]. Patterson synthesis with [(F_{NH} - F_{N})^{2}] as coefficients would, therefore, give an approximation to the heavy-atom vector distribution. An isomorphous difference Patterson synthesis of this type has been used extensively in protein crystallography to determine heavy-atom positions. The properties of this synthesis have been extensively studied (Ramachandran & Srinivasan, 1970[link]; Rossmann, 1960[link]; Phillips, 1966[link]; Dodson & Vijayan, 1971[link]) and it has been shown that this Patterson synthesis would provide a good approximation to the heavy-atom vector distribution even when [F_{H}] is large compared to [F_{N}] (Dodson & Vijayan, 1971[link]).
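
The quality of the approximation in (2.4.4.5) can be checked numerically for a single acentric reflection. The sketch below builds [{\bf F}_{NH}] from assumed values of [{\bf F}_{N}] and [{\bf F}_{H}] through (2.4.4.1) and returns both sides of (2.4.4.5); the function name and the numerical values used with it are illustrative assumptions.

```python
import cmath
import math

def isomorphous_difference_terms(f_n, alpha_n, f_h, alpha_h):
    """Return both sides of eq. (2.4.4.5) for one reflection (angles in radians)."""
    f_nh_vec = cmath.rect(f_n, alpha_n) + cmath.rect(f_h, alpha_h)  # eq. (2.4.4.1)
    f_nh = abs(f_nh_vec)
    alpha_nh = cmath.phase(f_nh_vec)
    lhs = (f_nh - f_n) ** 2                          # (F_NH - F_N)^2
    rhs = (f_h * math.cos(alpha_nh - alpha_h)) ** 2  # F_H^2 cos^2(alpha_NH - alpha_H)
    return lhs, rhs
```

For example, with [F_{N} = 100] at a phase of 60 degrees and a heavy-atom contribution at zero phase, the two sides agree to within a few per cent for [F_{H} = 2] and differ by roughly ten per cent for [F_{H} = 10], in line with the condition that [F_{H}] be small.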

As indicated earlier (see Section 2.4.3.1[link]), heavy atoms are always anomalous scatterers, and the structure factors of any given reflection and its Friedel equivalent from a heavy-atom derivative have unequal magnitudes. If these structure factors are denoted by [{\bf F}_{NH}(+)] and [{\bf F}_{NH}(-)] and the real component of the heavy-atom contributions (including the real component of the dispersion correction) by [{\bf F}_{H}], then it can be shown (Kartha & Parthasarathy, 1965[link]) that [\left({k\over 2}\right)^{2} [F_{NH}(+) - F_{NH}(-)]^{2} = F^{2}_{H} \sin^{2} (\alpha_{NH} - \alpha_{H}), \eqno(2.4.4.6)] where [k = (f_{H} + f'_{H})/f''_{H}]. Here it has been assumed that all the anomalous scatterers are of the same type with atomic scattering factor [f_{H}] and dispersion-correction terms [f'_{H}] and [f''_{H}]. A Patterson synthesis with the left-hand side of (2.4.4.6)[link] as coefficients would also yield the vector distribution corresponding to the heavy-atom positions (Rossmann, 1961[link]; Kartha & Parthasarathy, 1965[link]). However, [F_{NH}(+) - F_{NH}(-)] is a small difference between two large quantities and is liable to be in considerable error. Patterson syntheses of this type are therefore rarely used to determine heavy-atom positions.

It is interesting to note (Kartha & Parthasarathy, 1965[link]) that addition of (2.4.4.5)[link] and (2.4.4.6)[link] readily leads to [(F_{NH} - F_{N})^{2} + \left({k\over 2}\right)^{2} [F_{NH}(+) - F_{NH}(-)]^{2} \simeq F^{2}_{H}. \eqno(2.4.4.7)] Thus, the magnitude of the heavy-atom contribution can be estimated if intensities of Friedel equivalents have been measured from the derivative crystal. [F_{NH}] is then not readily available, but to a good approximation [F_{NH} = [F_{NH}(+) + F_{NH}(-)]/2. \eqno(2.4.4.8)] A different and more accurate expression for estimating [F^{2}_{H}] from isomorphous and anomalous differences was derived by Matthews (1966[link]). According to a still more accurate expression derived by Singh & Ramaseshan (1966[link]), [\eqalignno{ F^{2}_{H} &= F^{2}_{NH} + F^{2}_{N} - 2F_{NH}F_{N} \cos (\alpha_{N} - \alpha_{NH})\cr &= F^{2}_{NH} + F^{2}_{N} \pm 2F_{NH}F_{N}\cr &\quad \times (1 - \{k[F_{NH}(+) - F_{NH}(-)]/2F_{N}\}^{2})^{1/2}. &(2.4.4.9)}] The lower estimate in (2.4.4.9)[link] is relevant when [|\alpha_{N} - \alpha_{NH}| \lt  90^{\circ}] and the upper estimate is relevant when [|\alpha_{N} - \alpha_{NH}| \gt  90^{\circ}]. The lower and the upper estimates may be referred to as [F_{HLE}] and [F_{HUE}] , respectively. It can be readily shown (Dodson & Vijayan, 1971[link]) that the lower estimate would represent the correct value of [F_{H}] for a vast majority of reflections. Thus, a Patterson synthesis with [F^{2}_{HLE}] as coefficients would yield the vector distribution of heavy atoms in the derivative. Such a synthesis would normally be superior to those with the left-hand sides of (2.4.4.5)[link] and (2.4.4.6)[link] as coefficients. However, when the level of heavy-atom substitution is low, the anomalous differences are also low and susceptible to large percentage errors. 
In such a situation, a synthesis with [(F_{NH} - F_{N})^{2}] as coefficients is likely to yield better results than that with [F^{2}_{HLE}] as coefficients (Vijayan, 1981).
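
The estimates of [F_{H}] given by (2.4.4.7) and (2.4.4.9) can be evaluated for one reflection as sketched below. The function name is hypothetical, [F_{NH}] is taken as the mean of the Friedel pair as in (2.4.4.8), and the clamping of negative square-root arguments is an assumption added here to cope with noisy data; it is not part of the original expressions.

```python
import math

def heavy_atom_magnitudes(f_n, f_nh_plus, f_nh_minus, k):
    """Estimate F_H for one reflection from isomorphous and anomalous differences.

    Returns (F_H from eq. 2.4.4.7, F_HLE and F_HUE from eq. 2.4.4.9).
    """
    f_nh = 0.5 * (f_nh_plus + f_nh_minus)             # eq. (2.4.4.8)
    ano = 0.5 * k * (f_nh_plus - f_nh_minus)          # (k/2)[F_NH(+) - F_NH(-)]
    fh_sum = math.sqrt((f_nh - f_n) ** 2 + ano ** 2)  # eq. (2.4.4.7)
    # Clamp so that noisy data cannot push a square-root argument below zero.
    root = math.sqrt(max(0.0, 1.0 - (ano / f_n) ** 2))
    base = f_nh ** 2 + f_n ** 2
    cross = 2.0 * f_nh * f_n * root
    fh_le = math.sqrt(max(0.0, base - cross))         # lower estimate, F_HLE
    fh_ue = math.sqrt(base + cross)                   # upper estimate, F_HUE
    return fh_sum, fh_le, fh_ue
```

For error-free data with [|\alpha_{N} - \alpha_{NH}| \lt 90^{\circ}], the lower estimate reproduces [F_{H}] exactly, whereas (2.4.4.7) is only approximate.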

Direct methods employing different methodologies have also been used successfully for the determination of heavy-atom positions (Navia & Sigler, 1974[link]). These methods, developed primarily for the analysis of smaller structures, have not yet been successful in a priori analysis of protein structures. The very size of protein structures makes the probability relations used in these methods weak. In addition, data from protein crystals do not normally extend to high enough angles to permit resolution of individual atoms in the structure and the feasibility of using many of the currently popular direct-method procedures in such a situation has been a topic of much discussion. The heavy atoms in protein derivative crystals, however, are small in number and are normally situated far apart from one another. They are thus expected to be resolved even when low-resolution X-ray data are used. In most applications, the magnitudes of the differences between [F_{NH}] and [F_{N}] are formally considered as the `observed structure factors' of the heavy-atom distribution and conventional direct-method procedures are then applied to them.

Once the heavy-atom parameters in one or more derivatives have been determined, approximate protein phase angles, [\alpha_{N}]'s, can be derived using methods described later. These phase angles can then be readily used to determine the heavy-atom parameters in a new derivative employing a difference Fourier synthesis with coefficients [(F_{NH} - F_{N}) \exp (i\alpha_{N}). \eqno(2.4.4.10)] Such syntheses are also used to confirm and to improve upon the information on heavy-atom parameters obtained through Patterson or direct methods. They are obviously very powerful when centric data corresponding to centrosymmetric projections are used. The synthesis yields satisfactory results even when the data are acentric although the difference Fourier technique becomes progressively less powerful as the level of heavy-atom substitution increases (Dodson & Vijayan, 1971[link]).

While the positional parameters of heavy atoms can be determined with a reasonable degree of confidence using the above-mentioned methods, the corresponding temperature and occupancy factors cannot. Rough estimates of the latter are usually made from the strength and the size of appropriate peaks in difference syntheses. The estimated values are then refined, along with the positional parameters, using the techniques outlined below.

2.4.4.3. Refinement of heavy-atom parameters

The least-squares method with different types of minimization functions is used for refining the heavy-atom parameters, including the occupancy factors. The most widely used method (Dickerson et al., 1961; Muirhead et al., 1967; Dickerson et al., 1968) involves the minimization of the function [\varphi = \textstyle\sum w(F_{NH} - |{\bf F}_{N} + {\bf F}_{H}|)^{2}, \eqno(2.4.4.11)] where the summation is over all the reflections and w is the weight factor associated with each reflection. Here [F_{NH}] is the observed magnitude of the structure factor for the particular derivative and [{\bf F}_{N} + {\bf F}_{H}] is the calculated structure factor. The latter obviously depends upon the protein phase angle [\alpha_{N}], and the magnitude and the phase angle of [{\bf F}_{H}] which are in turn dependent on the heavy-atom parameters. Let us assume that we have three derivatives A, B and C, and that we have already determined the heavy-atom parameters [HA_{i}], [HB_{i}] and [HC_{i}]. Then, [\eqalignno{ {\bf F}_{HA} &= {\bf F}_{HA} (HA_{i})\cr {\bf F}_{HB} &= {\bf F}_{HB} (HB_{i}) &(2.4.4.12)\cr {\bf F}_{HC} &= {\bf F}_{HC} (HC_{i}). }] A set of approximate protein phase angles is first calculated, employing methods described later, making use of the unrefined heavy-atom parameters. These phase angles are used to construct [{\bf F}_{N} + {\bf F}_{H}] for each derivative. The function (2.4.4.11) is then minimized, separately for each derivative, by varying [HA_{i}] for derivative A, [HB_{i}] for derivative B, and [HC_{i}] for derivative C. The refined values of [HA_{i}], [HB_{i}] and [HC_{i}] are subsequently used to calculate a new set of protein phase angles. Alternate cycles of parameter refinement and phase-angle calculation are carried out until convergence is reached. The progress of refinement may be monitored by computing an R factor defined as (Kraut et al., 1962) [R_{K} = {\sum |F_{NH} - |{\bf F}_{N} + {\bf F}_{H}||\over \sum F_{NH}}. \eqno(2.4.4.13)]
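
As an illustration of the quantities monitored in each cycle, the sketch below evaluates the minimization function (2.4.4.11) and the Kraut R factor (2.4.4.13) for one derivative, taking the denominator of the R factor as the sum of [F_{NH}] over all reflections. The function name and argument layout are assumptions; the heavy-atom structure factors are supplied as complex numbers computed elsewhere from the current heavy-atom parameters.

```python
import cmath

def closure_statistics(f_nh_obs, f_n, alpha_n, f_h_calc, weights=None):
    """Return (phi, R_K) of eqs. (2.4.4.11) and (2.4.4.13) for one derivative.

    f_nh_obs, f_n : observed magnitudes; alpha_n : protein phases in radians;
    f_h_calc      : complex heavy-atom structure factors from the current model.
    """
    if weights is None:
        weights = [1.0] * len(f_nh_obs)  # unit weights unless supplied
    phi = 0.0
    numerator = 0.0
    for obs, fn, an, fh, w in zip(f_nh_obs, f_n, alpha_n, f_h_calc, weights):
        calc = abs(cmath.rect(fn, an) + fh)  # |F_N + F_H|
        phi += w * (obs - calc) ** 2
        numerator += abs(obs - calc)
    return phi, numerator / sum(f_nh_obs)
```

A least-squares routine would vary the heavy-atom parameters that determine `f_h_calc` so as to reduce `phi`, with the protein phases recalculated between cycles.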

The above method has been successfully used for the refinement of heavy-atom parameters in the X-ray analysis of many proteins. However, it has one major drawback in that the refined parameters in one derivative are dependent on those in other derivatives through the calculation of protein phase angles. Therefore, it is important to ensure that the derivative, the heavy-atom parameters of which are being refined, is omitted from the phase-angle calculation (Blow & Matthews, 1973[link]). Even when this is done, serious problems might arise when different derivatives are related by common sites. In practice, the occupancy factors of the common sites tend to be overestimated compared to those of the others (Vijayan, 1981[link]; Dodson & Vijayan, 1971[link]). Yet another factor which affects the occupancy factors is the accuracy of the phase angles. The inclusion of poorly phased reflections tends to result in the underestimation of occupancy factors. It is therefore advisable to omit from refinement cycles reflections with figures of merit less than a minimum threshold value or to assign a weight proportional to the figure of merit (as defined later) to each term in the minimization function (Dodson & Vijayan, 1971[link]; Blow & Matthews, 1973[link]).

If anomalous-scattering data from derivative crystals are available, the values of [F_{H}] can be estimated using (2.4.4.7) or (2.4.4.9) and these can be used as the `observed' magnitudes of the heavy-atom contributions for the refinement of heavy-atom parameters, as has been done by many workers (Watenpaugh et al., 1975; Vijayan, 1981; Kartha, 1965). If (2.4.4.9) is used for estimating [F_{H}], the minimization function has the form [\varphi = \textstyle\sum w(F_{HLE} - F_{H})^{2}. \eqno(2.4.4.14)] The progress of refinement may be monitored using a reliability index defined as [R = {\sum |F_{HLE} - F_{H}|\over \sum F_{HLE}}. \eqno(2.4.4.15)]

The major advantage of using [F_{HLE}]'s in refinement is that the heavy-atom parameters in each derivative can now be refined independently of all other derivatives. Care should, however, be taken to omit from calculations all reflections for which [F_{HUE}] is likely to be the correct estimate of [F_{H}]. This can be achieved in practice by excluding from least-squares calculations all reflections for which [F_{HUE}] has a value less than the maximum expected value of [F_{H}] for the given derivative (Vijayan, 1981; Dodson & Vijayan, 1971).
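
The exclusion rule and the reliability index (2.4.4.15) can be combined as in the following sketch. The function name is hypothetical, and the maximum expected value of [F_{H}] is passed in as a user-supplied number because the text does not prescribe how it should be computed.

```python
def hle_r_factor(f_hle, f_hue, f_h_calc, f_h_max):
    """Reliability index of eq. (2.4.4.15) over reflections that pass the F_HUE test.

    A reflection is dropped when F_HUE < f_h_max, because the upper estimate
    could then be the correct one. Returns (R, number of reflections kept).
    """
    kept = [(obs, calc) for obs, upper, calc in zip(f_hle, f_hue, f_h_calc)
            if upper >= f_h_max]
    numerator = sum(abs(obs - calc) for obs, calc in kept)
    denominator = sum(obs for obs, _ in kept)
    return numerator / denominator, len(kept)
```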

A major problem associated with this refinement method is concerned with the effect of experimental errors on refined parameters. The values of [F_{NH}(+) - F_{NH}(-)] are often comparable to the experimental errors associated with [F_{NH}(+)] and [F_{NH}(-)]. In such a situation, even random errors in [F_{NH}(+)] and [F_{NH}(-)] tend to increase systematically the observed difference between them (Dodson & Vijayan, 1971[link]). In (2.4.4.7)[link] and (2.4.4.9)[link], this difference is multiplied by k or [k/2], a quantity much greater than unity, and then squared. This could lead to the systematic overestimation of [F_{HLE}]'s and the consequent overestimation of occupancy factors. The situation can be improved by employing empirical values of k, evaluated using the relation (Kartha & Parthasarathy, 1965[link]; Matthews, 1966[link]) [k = {2 \sum |F_{NH} - F_{N}|\over \sum |F_{NH}(+) - F_{NH}(-)|}, \eqno(2.4.4.16)] for estimating [F_{HLE}] or by judiciously choosing the weighting factors in (2.4.4.14) (Dodson & Vijayan, 1971[link]). The use of a modified form of [F_{HLE}], arrived at through statistical considerations, along with appropriate weighting factors, has also been advocated (Dodson et al., 1975[link]).
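
The empirical value of k in (2.4.4.16) follows directly from the measured magnitudes. In the sketch below, which uses a hypothetical function name, [F_{NH}] is taken as the mean of the Friedel pair as in (2.4.4.8).

```python
def empirical_k(f_n, f_nh_plus, f_nh_minus):
    """Empirical k of eq. (2.4.4.16) from lists of observed magnitudes."""
    iso = 0.0
    ano = 0.0
    for fn, plus, minus in zip(f_n, f_nh_plus, f_nh_minus):
        f_nh = 0.5 * (plus + minus)  # eq. (2.4.4.8)
        iso += abs(f_nh - fn)        # isomorphous difference
        ano += abs(plus - minus)     # anomalous difference
    return 2.0 * iso / ano
```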

When the data are centric, (2.4.4.9)[link] reduces to [F_{H} = F_{NH} \pm F_{N}. \eqno(2.4.4.17)] Here, again, the lower estimate most often corresponds to the correct value of [F_{H}]. (2.4.4.17) does not involve [F_{NH}(+) - F_{NH}(-)] which, as indicated earlier, is prone to substantial error. Therefore, [F_{H}]'s estimated using centric data are more reliable than those estimated using acentric data. Consequently, centric reflections, when available, are extensively used for the refinement of heavy-atom parameters. It may also be noted that in conditions under which [F_{HLE}] corresponds to the correct estimate of [F_{H}], minimization functions (2.4.4.11) and (2.4.4.14) are identical for centric data.

A Patterson function correlation method with a minimization function of the type [\varphi = \textstyle\sum w[(F_{NH} - F_{N})^{2} - F^{2}_{H}]^{2} \eqno(2.4.4.18)] was among the earliest procedures suggested for heavy-atom-parameter refinement (Rossmann, 1960). This procedure would obviously work well when centric reflections are used. A modified version of this procedure, in which the origins of the Patterson functions are removed from the correlation, and centric and acentric data are treated separately, has been proposed (Terwilliger & Eisenberg, 1983).

2.4.4.4. Treatment of errors in phase evaluation: Blow and Crick formulation

As shown in Section 2.4.2.3, ideally, protein phase angles can be evaluated if two isomorphous heavy-atom derivatives are available. However, in practice, conditions are far from ideal on account of several factors such as imperfect isomorphism, errors in the estimation of heavy-atom parameters, and the experimental errors in the measurement of intensity from the native and the derivative crystals. It is therefore desirable to use as many derivatives as are available for phase determination. When isomorphism is imperfect and errors exist in data and heavy-atom parameters, the circles in a Harker diagram would not all intersect at a single point; instead, there would be a distribution of intersections, such as that illustrated in Fig. 2.4.4.1. Consequently, a unique solution for the phase angle cannot be deduced.

Figure 2.4.4.1. Distribution of intersections in the Harker construction under non-ideal conditions.

The statistical procedure for computing protein phase angles using multiple isomorphous replacement (MIR) was derived by Blow & Crick (1959). In their treatment, Blow and Crick assume, for mathematical convenience, that all errors, including those arising from imperfect isomorphism, could be considered as residing in the magnitudes of the derivative structure factors only. They further assume that these errors could be described by a Gaussian distribution. With these simplifying assumptions, the statistical procedure for phase determination could be derived in the following manner.

Consider the vector diagram, shown in Fig. 2.4.4.2[link], for a reflection from the ith derivative for an arbitrary value [\alpha] for the protein phase angle. Then, [D_{Hi}(\alpha) = [F^{2}_{N} + F^{2}_{Hi} + 2F_{N}F_{Hi} \cos (\alpha_{Hi} - \alpha)]^{1/2}. \eqno(2.4.4.19)] If [\alpha] corresponds to the true protein phase angle [\alpha_{N}], then [D_{Hi}] coincides with [F_{NHi}]. The amount by which [D_{Hi}(\alpha)] differs from [F_{NHi}], namely, [\xi_{Hi}(\alpha) = F_{NHi} - D_{Hi}(\alpha), \eqno(2.4.4.20)] is a measure of the departure of [\alpha] from [\alpha_{N}]. [\xi] is called the lack of closure. The probability for [\alpha] being the correct protein phase angle could now be defined as [P_{i}(\alpha) = N_{i} \exp [-\xi_{Hi}^{2}(\alpha)/2E_{i}^{2}], \eqno(2.4.4.21)] where [N_{i}] is the normalization constant and [E_{i}] is the estimated r.m.s. error. The methods for estimating [E_{i}] will be outlined later. When several derivatives are used for phase determination, the total probability of the phase angle [\alpha] being the protein phase angle would be [P(\alpha) = \textstyle\prod P_{i}(\alpha) = N \exp \left\{ -{\textstyle\sum\limits_{i}} [\xi_{Hi}^{2}(\alpha)/2E_{i}^{2}]\right\}, \eqno(2.4.4.22)] where the summation is over all the derivatives. A typical distribution of [P(\alpha)] plotted around a circle of unit radius is shown in Fig. 2.4.4.3[link]. The phase angle corresponding to the highest value of [P(\alpha)] would obviously be the most probable protein phase, [\alpha_{M}], of the given reflection. The most probable electron-density distribution is obtained if each [F_{N}] is associated with the corresponding [\alpha_{M}] in a Fourier synthesis.
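
The probability distribution defined by (2.4.4.19) to (2.4.4.22) can be sampled on a grid of trial phase angles as sketched below. The function names, the tuple layout used for each derivative and the normalization to unit sum over the grid are assumptions made for this illustration; the r.m.s. errors [E_{i}] are taken as given.

```python
import math

def lack_of_closure(alpha, f_n, f_nh, f_h, alpha_h):
    """xi_Hi(alpha) of eq. (2.4.4.20), with D_Hi(alpha) from eq. (2.4.4.19)."""
    d2 = f_n ** 2 + f_h ** 2 + 2.0 * f_n * f_h * math.cos(alpha_h - alpha)
    return f_nh - math.sqrt(max(0.0, d2))

def phase_probability(f_n, derivatives, step_deg=5):
    """Sample P(alpha) of eq. (2.4.4.22) on a grid of phase angles.

    derivatives is a list of (F_NH, F_H, alpha_H, E) tuples, angles in radians.
    Returns (angles_in_degrees, probabilities normalised to unit sum).
    """
    grid = list(range(0, 360, step_deg))
    values = []
    for deg in grid:
        alpha = math.radians(deg)
        # Sum of xi^2 / 2E^2 over all derivatives, as in eq. (2.4.4.22).
        exponent = sum(
            lack_of_closure(alpha, f_n, f_nh, f_h, a_h) ** 2 / (2.0 * e * e)
            for f_nh, f_h, a_h, e in derivatives)
        values.append(math.exp(-exponent))
    total = sum(values)
    return grid, [v / total for v in values]
```

The grid angle with the largest probability is the sampled approximation to [\alpha_{M}]. With a single derivative the distribution is generally bimodal, and a second derivative with a different [\alpha_{H}] is needed to resolve the ambiguity.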

Figure 2.4.4.2. Vector diagram indicating the calculated structure factor, [{\bf D}_{Hi}(\alpha)], of the ith heavy-atom derivative for an arbitrary value [\alpha] for the phase angle of the structure factor of the native protein.

Figure 2.4.4.3. The probability distribution of the protein phase angle. The point P is the centroid of the distribution.

Blow and Crick suggested a different way of using the probability distribution. In Fig. 2.4.4.3[link], the centroid of the probability distribution is denoted by P. The polar coordinates of P are m and [\alpha_{B}], where m, a fractional positive number with a maximum value of unity, and [\alpha_{B}] are referred to as the `figure of merit' and the `best phase', respectively. One can then compute a `best Fourier' with coefficients [mF_{N} \exp (i\alpha_{B}).] The best Fourier is expected to provide an electron-density distribution with the lowest r.m.s. error. The figure of merit and the best phase are usually calculated using the equations [\eqalign{ m \cos \alpha_{B} &= {\textstyle\sum\limits_{i}} P(\alpha_{i}) \cos (\alpha_{i})/{\textstyle\sum\limits_{i}} P(\alpha_{i})\cr m \sin \alpha_{B} &= {\textstyle\sum\limits_{i}} P(\alpha_{i}) \sin (\alpha_{i})/{\textstyle\sum\limits_{i}} P(\alpha_{i}),} \eqno(2.4.4.23)] where [P(\alpha_{i})] are calculated, say, at [5^{\circ}] intervals (Dickerson et al., 1961[link]). The figure of merit is statistically interpreted as the cosine of the expected error in the calculated phase angle and it is obviously a measure of the precision of phase determination. In general, m is high when [\alpha_{M}] and [\alpha_{B}] are close to each other and low when they are far apart.
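
The centroid calculation of (2.4.4.23) can be sketched as follows, assuming that the probabilities have already been sampled at a set of phase angles (for instance at 5 degree intervals). The function name is hypothetical.

```python
import math

def best_phase_and_merit(angles_deg, probabilities):
    """Return (alpha_B in degrees, figure of merit m) from eq. (2.4.4.23)."""
    total = sum(probabilities)
    x = sum(p * math.cos(math.radians(a))
            for a, p in zip(angles_deg, probabilities)) / total  # m cos(alpha_B)
    y = sum(p * math.sin(math.radians(a))
            for a, p in zip(angles_deg, probabilities)) / total  # m sin(alpha_B)
    merit = math.hypot(x, y)
    alpha_b = math.degrees(math.atan2(y, x)) % 360.0
    return alpha_b, merit
```

Two equally probable phases 60 degrees apart, for example, give a best phase midway between them and a figure of merit equal to the cosine of 30 degrees, while two equally probable phases 180 degrees apart give a figure of merit of zero.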

2.4.4.5. Use of anomalous scattering in phase evaluation

When anomalous-scattering data have been collected from derivative crystals, [F_{NH}(+)] and [F_{NH}(-)] can be formally treated as arising from two independent derivatives. The corresponding Harker diagram is shown in Fig. 2.4.4.4[link]. Thus, in principle, protein phase angles can be determined using a single derivative when anomalous-scattering effects are also made use of. It is interesting to note that the information obtained from isomorphous differences, [F_{NH} - F_{N}], and that obtained from anomalous differences, [F_{NH}(+) - F_{NH}(-)], are complementary. The isomorphous difference for any given reflection is a maximum when [F_{N}] and [F_{H}] are parallel or antiparallel. The anomalous difference is then zero, if all the anomalous scatterers are of the same type, and [\alpha_{N}] is determined uniquely on the basis of the isomorphous difference. The isomorphous difference decreases and the anomalous difference increases as the inclination between [{\bf F}_{N}] and [{\bf F}_{H}] increases. The isomorphous difference tends to be small and the anomalous difference tends to have the maximum possible value when [{\bf F}_{N}] and [{\bf F}_{H}] are perpendicular to each other. The anomalous difference then has the predominant influence in determining the phase angle.

Figure 2.4.4.4. Harker construction using anomalous-scattering data from a single derivative.

Although isomorphous and anomalous differences have a complementary role in phase determination, their magnitudes are obviously unequal. Therefore, when [F_{NH}(+)] and [F_{NH}(-)] are treated as arising from two derivatives, the effect of anomalous differences on phase determination would be only marginal as, for any given reflection, [F_{NH}(+) - F_{NH}(-)] is usually much smaller than [F_{NH} - F_{N}]. However, the magnitude of the error in the anomalous difference would normally be much smaller than that in the corresponding isomorphous difference. Firstly, the former is obviously free from the effects of imperfect isomorphism. Secondly, [F_{NH}(+)] and [F_{NH}(-)] are expected to have the same systematic errors as they are measured from the same crystal. These errors are eliminated in the difference between the two quantities. Therefore, as pointed out by North (1965[link]), the r.m.s. error used for anomalous differences should be much smaller than that used for isomorphous differences. Denoting the r.m.s. error in anomalous differences by E′, the new expression for the probability distribution of protein phase angle may be written as [\eqalignno{ P_{i}(\alpha) &= N_{i} \exp [-\xi_{Hi}^{2}(\alpha)/2E_{i}^{2}]\cr &\quad \times \exp \{-[\Delta H_{i} - \Delta H_{i{\rm cal}}(\alpha)]^{2}/2E_{i}^{'2}\}, &(2.4.4.24)}] where [\Delta H_{i} = F_{NHi}(+) - F_{NHi}(-)] and [\Delta H_{i{\rm cal}}(\alpha) = 2F''_{Hi} \sin (\alpha_{Di} - \alpha_{Hi}).] Here [\alpha_{Di}] is the phase angle of [D_{Hi}(\alpha)] [see (2.4.4.19) and Fig. 2.4.4.2[link]]. [\Delta H_{i{\rm cal}}(\alpha)] is the anomalous difference calculated for the assumed protein phase angle [\alpha]. [F_{NHi}] may be taken as the average of [F_{NHi}(+)] and [F_{NHi}(-)] for calculating [\xi_{Hi}^{2}(\alpha)] using (2.4.4.20).
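
A minimal sketch of (2.4.4.24) for a single derivative is given below. The function and argument names are assumptions; [F''_{Hi}] is supplied as a magnitude, the sign convention for the anomalous difference follows the expressions above, and the distribution is normalized to unit sum over the sampled angles.

```python
import cmath
import math

def joint_phase_probability(f_n, f_nh, f_h, alpha_h, f_h_imag, delta_h_obs,
                            e_iso, e_ano, step_deg=5):
    """Sample P_i(alpha) of eq. (2.4.4.24) for one derivative.

    f_h, alpha_h : magnitude and phase (radians) of the real heavy-atom part
    f_h_imag     : F''_H, magnitude of the anomalous (imaginary) part
    delta_h_obs  : observed F_NH(+) - F_NH(-)
    Returns (angles_in_degrees, probabilities normalised to unit sum).
    """
    grid = list(range(0, 360, step_deg))
    values = []
    for deg in grid:
        # D_Hi(alpha) as a vector; its modulus is given by eq. (2.4.4.19).
        d_vec = cmath.rect(f_n, math.radians(deg)) + cmath.rect(f_h, alpha_h)
        xi = f_nh - abs(d_vec)  # eq. (2.4.4.20)
        delta_h_cal = 2.0 * f_h_imag * math.sin(cmath.phase(d_vec) - alpha_h)
        values.append(math.exp(
            -xi ** 2 / (2.0 * e_iso ** 2)
            - (delta_h_obs - delta_h_cal) ** 2 / (2.0 * e_ano ** 2)))
    total = sum(values)
    return grid, [v / total for v in values]
```

The isomorphous term alone leaves two equally probable phases placed symmetrically about [\alpha_{H}]; the anomalous term has opposite signs at the two positions and therefore selects one of them.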

2.4.4.6. Estimation of r.m.s. error

Perhaps the most important parameters that control the reliability of phase evaluation using the Blow and Crick formulation are the isomorphous r.m.s. error [E_{i}] and the anomalous r.m.s. error [E'_{i}]. For a given derivative, the sharpness of the peak in the phase probability distribution obviously depends upon the value of E and that of E′ when anomalous-scattering data have also been used. When several derivatives are used, an overall underestimation of r.m.s. errors leads to artificially sharp peaks, the movement of [\alpha_{B}] towards [\alpha_{M}], and deceptively high figures of merit. Opposite effects result when E's are overestimated. Underestimation or overestimation of the r.m.s. error in the data from a particular derivative leads to distortions in the relative contribution of that derivative to the overall phase probability distributions. It is therefore important that the r.m.s. error in each derivative is correctly estimated.

Centric reflections, when present, obviously provide the best means for evaluating E using the expression [E^{2} = {\textstyle\sum\limits_{n}} (|F_{NH} \pm F_{N}| - F_{H})^{2}/n. \eqno(2.4.4.25)] Here [|F_{NH} \pm F_{N}|] is the centric estimate of the heavy-atom contribution [see (2.4.4.17)] and [F_{H}] is its calculated magnitude. As suggested by Blow & Crick (1959), values of E thus estimated can be used for acentric reflections as well. Once a set of approximate protein phase angles is available, [E_{i}] can be calculated as the r.m.s. lack of closure corresponding to [\alpha_{B}] [i.e. [\alpha = \alpha_{B}] in (2.4.4.20)] (Kartha, 1976). [E'_{i}] can be similarly evaluated as the r.m.s. difference between the observed anomalous difference and the anomalous difference calculated for [\alpha_{B}] [see (2.4.4.24)]. Normally, the value of [E'_{i}] is about a third of that of [E_{i}] (North, 1965).

A different method, outlined below, can also be used to evaluate E and E′ when anomalous scattering is present (Vijayan, 1981[link]; Adams, 1968[link]). From Fig. 2.4.2.2[link], we have [\cos \psi = (F_{NH}^{2} + F_{H}^{2} - F_{N}^{2})/2F_{NH}F_{H} \eqno(2.4.4.26)] and [F_{N}^{2} = F_{NH}^{2} + F_{H}^{2} - 2F_{NH}F_{H} \cos \psi, \eqno(2.4.4.27)] where [\psi = \alpha_{NH} - \alpha_{H}]. Using arguments similar to those used in deriving (2.4.3.5)[link], we obtain [\sin \psi = [F_{NH}^{2}(+) - F_{NH}^{2}(-)]/4F_{NH}F''_{H}. \eqno(2.4.4.28)] If [F_{NH}] is considered to be equal to [[F_{NH}(+) + F_{NH}(-)]/2], we obtain from (2.4.4.28) [F_{NH}(+) - F_{NH}(-) = 2F''_{H} \sin \psi. \eqno(2.4.4.29)] We obtain what may be called [\psi_{\rm iso}] if the magnitude of [\psi] is determined from (2.4.4.26) and the quadrant from (2.4.4.28). Similarly, we obtain [\psi_{\rm ano}] if the magnitude of [\psi] is determined from (2.4.4.28) and the quadrant from (2.4.4.26). Ideally, [\psi_{\rm iso}] and [\psi_{\rm ano}] should have the same value and the difference between them is a measure of the errors in the data. [F_{N}] obtained from (2.4.4.27) using [\psi_{\rm ano}] may be considered as its calculated value [(F_{N{\rm cal}})]. Then, assuming all errors to lie in [F_{N}], we may write [E^{2} = {\textstyle\sum\limits_{n}} (F_{N} - F_{N{\rm cal}})^{2}/n. \eqno(2.4.4.30)] Similarly, the calculated anomalous difference [(\Delta H_{\rm cal})] may be evaluated from (2.4.4.29) using [\psi_{\rm iso}]. Then [E'^{2} = {\textstyle\sum\limits_{n}} [|F_{NH}(+) - F_{NH}(-)| - \Delta H_{\rm cal}]^{2}/n. \eqno(2.4.4.31)] If all errors are assumed to reside in [F_{H}], E can be evaluated in yet another way using the expression [E^{2} = {\textstyle\sum\limits_{n}} (F_{HLE} - F_{H})^{2}/n. \eqno(2.4.4.32)]
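
The construction of [\psi_{\rm iso}] and [\psi_{\rm ano}] for a single reflection can be sketched as follows. This is one reading of the procedure, offered under stated assumptions: 'magnitude from the cosine, quadrant from the sine' is implemented by giving the arccosine the sign of [\sin \psi], and conversely the arcsine is reflected into the second or third quadrant when [\cos \psi] is negative. The function names are hypothetical, [F_{NH}] is taken as the mean of the Friedel pair, and the clamping to the interval from -1 to 1 is added here to tolerate noisy data.

```python
import math

def _clamp(value):
    """Restrict a cosine or sine estimate to the interval [-1, 1]."""
    return max(-1.0, min(1.0, value))

def psi_estimates(f_n, f_nh_plus, f_nh_minus, f_h, f_h_imag):
    """Return (psi_iso, psi_ano, F_Ncal, delta_H_cal) for one reflection.

    Angles are in radians in the range (-pi, pi]; f_h_imag is F''_H.
    """
    f_nh = 0.5 * (f_nh_plus + f_nh_minus)
    cos_psi = _clamp((f_nh ** 2 + f_h ** 2 - f_n ** 2)
                     / (2.0 * f_nh * f_h))                     # eq. (2.4.4.26)
    sin_psi = _clamp((f_nh_plus - f_nh_minus) / (2.0 * f_h_imag))  # eq. (2.4.4.29)
    # psi_iso: magnitude from the cosine, sign taken from the sine.
    psi_iso = math.copysign(math.acos(cos_psi), sin_psi)
    # psi_ano: magnitude from the sine, quadrant taken from the cosine.
    psi_ano = math.asin(sin_psi)
    if cos_psi < 0.0:
        psi_ano = math.copysign(math.pi, sin_psi) - psi_ano
    f_n_cal = math.sqrt(max(0.0, f_nh ** 2 + f_h ** 2
                            - 2.0 * f_nh * f_h * math.cos(psi_ano)))  # eq. (2.4.4.27)
    delta_h_cal = 2.0 * f_h_imag * math.sin(psi_iso)           # eq. (2.4.4.29)
    return psi_iso, psi_ano, f_n_cal, delta_h_cal
```

Accumulating the squared differences between the observed and calculated quantities over all reflections and dividing by their number then gives [E^{2}] and [E'^{2}] as in (2.4.4.30) and (2.4.4.31); note that (2.4.4.31) compares the magnitude of the observed anomalous difference with the calculated value, whereas the sketch returns a signed calculated difference.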

2.4.4.7. Suggested modifications to Blow and Crick formulation and the inclusion of phase information from other sources

Modifications to the Blow and Crick procedure of phase evaluation have been suggested by several workers, although none represent a fundamental departure from the essential features of their formulation. In one of the modifications (Cullis et al., 1961a; Ashida, 1976), all [E_{i}]'s are assumed to be the same, but the lack-of-closure error [\xi_{Hi}] for the ith derivative is measured as the distance from the mean of all intersections between phase circles to the point of intersection of the phase circle of that derivative with the phase circle of the native protein. Alternatively, individual values of [E_{i}] are retained, but the lack of closure is measured from the weighted mean of all intersections (Ashida, 1976). This is obviously designed to undo the effects of the unduly high weight given to [F_{N}] in the Blow and Crick formulation. In another modification (Raiz & Andreeva, 1970; Einstein, 1977), suggested for the same purpose, the [F_{N}] and [F_{NHi}] circles are treated as circular bands, the width of each band being related to the error in the appropriate structure factor. A comprehensive set of modifications suggested by Green (1979) treats different types of errors separately. In particular, errors arising from imperfect isomorphism are treated in a comprehensive manner.

Although the isomorphous replacement method still remains the method of choice for the ab initio determination of protein structures, additional items of phase information from other sources are increasingly being used to replace, supplement, or extend the information obtained through the application of isomorphous replacement. Methods have been developed for the routine refinement of protein structures (Watenpaugh et al., 1973; Huber et al., 1974; Sussman et al., 1977; Jack & Levitt, 1978; Isaacs & Agarwal, 1978; Hendrickson & Konnert, 1980) and they provide a rich source of phase information. However, the nature of the problem and the inherent limitations of the Fourier technique are such that the possibility of refinement yielding misleading results exists (Vijayan, 1980a,b). It is therefore sometimes desirable to combine the phases obtained during refinement with the original isomorphous replacement phases. The other sources of phase information include molecular replacement (see Chapter 2.3), direct methods (Hendrickson & Karle, 1973; Sayre, 1974; de Rango et al., 1975; see also Chapter 2.2) and different types of electron-density modifications (Hoppe & Gassmann, 1968; Collins, 1975; Schevitz et al., 1981; Bhat & Blow, 1982; Agard & Stroud, 1982; Cannillo et al., 1983; Raghavan & Tulinsky, 1979; Wang, 1985).

The problem of combining isomorphous replacement phases with those obtained by other methods was first addressed by Rossmann & Blow (1961). The problem was subsequently examined by Hendrickson & Lattman (1970) and their method, which involves a modification of the Blow and Crick formulation, is perhaps the most widely used for combining phase information from different sources.

The Blow and Crick procedure is based on an assumed Gaussian `lumped' error in [F_{NHi}] which leads to a lack of closure, [\xi_{Hi}(\alpha)], in [F_{NHi}] defined by (2.4.4.20). Hendrickson and Lattman make an equally legitimate assumption that the lumped error, again assumed to be Gaussian, is associated with [F_{NHi}^{2}]. Then, as in (2.4.4.20), we have [\xi''_{Hi}(\alpha) = F_{NHi}^{2} - D_{Hi}^{2}(\alpha), \eqno(2.4.4.33)] where [\xi''_{Hi}(\alpha)] is the lack of closure associated with [F_{NHi}^{2}] for an assumed protein phase angle [\alpha]. Then the probability for [\alpha] being the correct phase angle can be expressed as [\hfil\qquad P_{i} (\alpha) = N_{i} \exp [-\xi''^{2}_{Hi} (\alpha)/2E''^{2}_{i}],\eqno(2.4.4.34)] where [E''_{i}] is the r.m.s. error in [F^{2}_{NHi}], which can be evaluated using methods similar to those employed for evaluating [E_{i}]. Hendrickson and Lattman have shown that the exponent in the probability expression (2.4.4.34) can be readily expressed as a linear combination of five terms in the following manner. [\eqalignno{ -\xi''^{2}_{Hi} (\alpha)/2E''^{2}_{i} &= K_{i} + A_{i} \cos \alpha + B_{i} \sin \alpha + C_{i} \cos 2\alpha\cr &\quad + D_{i} \sin 2\alpha, &(2.4.4.35)}] where [K_{i}, A_{i}, B_{i}, C_{i}] and [D_{i}] are constants dependent on [F_{N}, F_{Hi}, F_{NHi}] and [E''_{i}]. Thus, five constants are enough to store the complete probability distribution of any reflection. Expressions for the five constants have been derived for phase information from anomalous scattering, tangent formula, partial structure and molecular replacement. The combination of the phase information from all sources can then be achieved by simply taking the total value of each constant. Thus, the total probability of the protein phase angle being [\alpha] is given by [\eqalignno{ P(\alpha) = \textstyle\prod P_{s}(\alpha) &= N \exp \left({\textstyle\sum\limits_{s}} K_{s} + {\textstyle\sum\limits_{s}} A_{s} \cos \alpha + {\textstyle\sum\limits_{s}} B_{s} \sin \alpha \right.\cr &\quad \left. + {\textstyle\sum\limits_{s}} C_{s} \cos 2\alpha + {\textstyle\sum\limits_{s}} D_{s} \sin 2\alpha\right),\cr & &(2.4.4.36)}] where [K_{s}, A_{s}] etc. are the constants appropriate for the sth source and N is the normalization constant.
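The five-constant representation and the summation in (2.4.4.36) can be illustrated with a short Python sketch using NumPy. It rests on an assumption that is not stated in this section, namely that [D_{Hi}^{2}(\alpha) = F_{N}^{2} + F_{Hi}^{2} + 2F_{N}F_{Hi}\cos(\alpha - \alpha_{Hi})], where [\alpha_{Hi}] is the heavy-atom phase. Expanding the square of (2.4.4.33) under that assumption gives the isomorphous constants coded below. The function names, the argument names and the 1° sampling grid are hypothetical choices made for the example.

```python
import numpy as np

def hl_isomorphous(f_n, f_nh, f_h, alpha_h, e2):
    """Constants (K, A, B, C, D) of (2.4.4.35) for one derivative.

    Assumes D_H^2(alpha) = F_N^2 + F_H^2 + 2 F_N F_H cos(alpha - alpha_H);
    e2 is the r.m.s. error E'' in F_NH^2. Angles are in radians.
    """
    q = f_nh**2 - f_n**2 - f_h**2      # part of xi'' independent of alpha
    p = f_n * f_h                      # xi'' = q - 2 p cos(alpha - alpha_H)
    k = -(q**2 + 2.0 * p**2) / (2.0 * e2**2)
    a = 2.0 * q * p * np.cos(alpha_h) / e2**2
    b = 2.0 * q * p * np.sin(alpha_h) / e2**2
    c = -(p**2) * np.cos(2.0 * alpha_h) / e2**2
    d = -(p**2) * np.sin(2.0 * alpha_h) / e2**2
    return np.array([k, a, b, c, d])

def combined_probability(coeff_sets, n_points=360):
    """Sum the constants over all sources and sample P(alpha) as in (2.4.4.36)."""
    k, a, b, c, d = np.sum(coeff_sets, axis=0)
    alpha = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    exponent = (k + a * np.cos(alpha) + b * np.sin(alpha)
                + c * np.cos(2.0 * alpha) + d * np.sin(2.0 * alpha))
    exponent -= exponent.max()         # constant shift, absorbed into N
    p = np.exp(exponent)
    p /= p.sum() * (2.0 * np.pi / n_points)   # integral of P over alpha is 1
    return alpha, p
```

Constants from other sources, once expressed in the same form, would simply be appended to `coeff_sets` before the call to `combined_probability`. Because the K term only rescales the distribution, it drops out on normalization.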

2.4.4.8. Fourier representation of anomalous scatterers

It is often useful to have a Fourier representation of only the anomalous scatterers in a protein. The imaginary component of the electron-density distribution obviously provides such a representation. When the structure is known and [{F}_{N}(+)] and [{F}_{N}(-)] have been experimentally determined, Chacko & Srinivasan (1970) have shown that this representation is obtained in a Fourier synthesis with [i[{\bf F}_{N}(+) + {\bf F}_{N}^{*}(-)]/2] as coefficients, where [{\bf F}_{N}^{*}(-)], whose magnitude is [{F}_{N}(-)], is the complex conjugate of [{\bf F}_{N}(-)]. They also indicated a method for calculating the phase angles of [{\bf F}_{N}(+)] and [{\bf F}_{N}^{*}(-)]. It has been shown (Hendrickson & Sheriff, 1987) that the Bijvoet-difference Fourier synthesis proposed earlier by Kraut (1968) is an approximation of the true imaginary component of the electron density. The imaginary synthesis can be useful in identifying minor anomalous-scattering centres when the major centres are known, in providing an independent check on the locations of anomalous scatterers, and in distinguishing between anomalous scatterers with nearly equal atomic numbers (Sheriff & Hendrickson, 1987; Kitagawa et al., 1987).

References

Adams, M. J. (1968). DPhil thesis, Oxford University, England.
Agard, D. A. & Stroud, R. M. (1982). α-Bungarotoxin structure revealed by a rapid method for averaging electron density of non-crystallographically translationally related molecules. Acta Cryst. A38, 186–194.
Ashida, T. (1976). Some remarks on the phase angle determination by the isomorphous replacement method. In Crystallographic computing techniques, edited by F. R. Ahmed, pp. 282–284. Copenhagen: Munksgaard.
Bhat, T. N. & Blow, D. M. (1982). A density-modification method for improvement of poorly resolved protein electron-density maps. Acta Cryst. A38, 21–29.
Blow, D. M. & Crick, F. H. C. (1959). The treatment of errors in the isomorphous replacement method. Acta Cryst. 12, 794–802.
Blow, D. M. & Matthews, B. W. (1973). Parameter refinement in the multiple isomorphous-replacement method. Acta Cryst. A29, 56–62.
Blundell, T. L. & Johnson, L. N. (1976). Protein crystallography. London: Academic Press.
Cannillo, E., Oberti, R. & Ungaretti, L. (1983). Phase extension and refinement by density modification in protein crystallography. Acta Cryst. A39, 68–74.
Chacko, K. K. & Srinivasan, R. (1970). On the Fourier refinement of anomalous dispersion corrections in X-ray diffraction data. Z. Kristallogr. 131, 88–94.
Collins, D. M. (1975). Efficiency in Fourier phase refinement for protein crystal structures. Acta Cryst. A31, 388–389.
Cullis, A. F., Muirhead, H., Perutz, M. F., Rossmann, M. G. & North, A. C. T. (1961a). The structure of haemoglobin. VIII. A three-dimensional Fourier synthesis at 5.5 Å resolution: determination of the phase angles. Proc. R. Soc. London Ser. A, 265, 15–38.
Dickerson, R. E., Kendrew, J. C. & Strandberg, B. E. (1961). The crystal structure of myoglobin: phase determination to a resolution of 2 Å by the method of isomorphous replacement. Acta Cryst. 14, 1188–1195.
Dickerson, R. E., Weinzierl, J. E. & Palmer, R. A. (1968). A least-squares refinement method for isomorphous replacement. Acta Cryst. B24, 997–1003.
Dodson, E., Evans, P. & French, S. (1975). The use of anomalous scattering in refining heavy atom parameters in proteins. In Anomalous scattering, edited by S. Ramaseshan & S. C. Abrahams, pp. 423–436. Copenhagen: Munksgaard.
Dodson, E. & Vijayan, M. (1971). The determination and refinement of heavy-atom parameters in protein heavy-atom derivatives. Some model calculations using acentric reflexions. Acta Cryst. B27, 2402–2411.
Einstein, J. E. (1977). An improved method for combining isomorphous replacement and anomalous scattering diffraction data for macromolecular crystals. Acta Cryst. A33, 75–85.
Green, E. A. (1979). A new statistical model for describing errors in isomorphous replacement data: the case of one derivative. Acta Cryst. A35, 351–359.
Hendrickson, W. A. & Karle, J. (1973). Carp muscle calcium-binding protein. III. Phase refinement using the tangent formula. J. Biol. Chem. 248, 3327–3334.
Hendrickson, W. A. & Konnert, J. H. (1980). Incorporation of stereochemical information into crystallographic refinement. In Computing in crystallography, edited by R. Diamond, S. Ramaseshan & K. Venkatesan, pp. 13.01–13.23. Bangalore: Indian Academy of Sciences.
Hendrickson, W. A. & Lattman, E. E. (1970). Representation of phase probability distributions in simplified combinations of independent phase information. Acta Cryst. B26, 136–143.
Hendrickson, W. A. & Sheriff, S. (1987). General density function corresponding to X-ray diffraction with anomalous scattering included. Acta Cryst. A43, 121–125.
Hoppe, W. & Gassmann, J. (1968). Phase correction, a new method to solve partially known structures. Acta Cryst. B24, 97–107.
Huber, R., Kukla, D., Bode, W., Schwager, P., Bartels, K., Deisenhofer, J. & Steigemann, W. (1974). Structure of the complex formed by bovine trypsin and bovine pancreatic trypsin inhibitor. II. Crystallographic refinement at 1.9 Å resolution. J. Mol. Biol. 89, 73–101.
Isaacs, N. W. & Agarwal, R. C. (1978). Experience with fast Fourier least squares in the refinement of the crystal structure of rhombohedral 2-zinc insulin at 1.5 Å resolution. Acta Cryst. A34, 782–791.
Jack, A. & Levitt, M. (1978). Refinement of large structures by simultaneous minimization of energy and R factor. Acta Cryst. A34, 931–935.
Kartha, G. (1965). Combination of multiple isomorphous replacement and anomalous dispersion data for protein structure determination. III. Refinement of heavy atom positions by the least-squares method. Acta Cryst. 19, 883–885.
Kartha, G. (1976). Protein phase evaluation: multiple isomorphous series and anomalous scattering methods. In Crystallographic computing techniques, edited by F. R. Ahmed, pp. 269–281. Copenhagen: Munksgaard.
Kartha, G. & Parthasarathy, R. (1965). Combination of multiple isomorphous replacement and anomalous dispersion data for protein structure determination. I. Determination of heavy-atom positions in protein derivatives. Acta Cryst. 18, 745–749.
Kitagawa, Y., Tanaka, N., Hata, Y., Katsube, Y. & Satow, Y. (1987). Distinction between Cu²⁺ and Zn²⁺ ions in a crystal of spinach superoxide dismutase by use of anomalous dispersion and tuneable synchrotron radiation. Acta Cryst. B43, 272–275.
Kraut, J. (1968). Bijvoet-difference Fourier function. J. Mol. Biol. 35, 511–512.
Kraut, J., Sieker, L. C., High, D. F. & Freer, S. T. (1962). Chymotrypsin: a three-dimensional Fourier synthesis at 5 Å resolution. Proc. Natl Acad. Sci. USA, 48, 1417–1424.
Matthews, B. W. (1966). The determination of the position of anomalously scattering heavy atom groups in protein crystals. Acta Cryst. 20, 230–239.
Muirhead, H., Cox, J. M., Mazzarella, L. & Perutz, M. F. (1967). Structure and function of haemoglobin. III. A three-dimensional Fourier synthesis of human deoxyhaemoglobin at 5.5 Å resolution. J. Mol. Biol. 28, 156–177.
Navia, M. A. & Sigler, P. B. (1974). The application of direct methods to the analysis of heavy-atom derivatives. Acta Cryst. A30, 706–712.
North, A. C. T. (1965). The combination of isomorphous replacement and anomalous scattering data in phase determination of non-centrosymmetric reflexions. Acta Cryst. 18, 212–216.
Phillips, D. C. (1966). Advances in protein crystallography. In Advances in structure research by diffraction methods, Vol. 2, edited by R. Brill & R. Mason, pp. 75–140. New York and London: Interscience.
Raghavan, N. V. & Tulinsky, A. (1979). The structure of α-chymotrypsin. II. Fourier phase refinement and extension of the dimeric structure at 1.8 Å resolution by density modification. Acta Cryst. B35, 1776–1785.
Raiz, V. Sh. & Andreeva, N. S. (1970). Determining the coefficients of the Fourier series of the electron density function of protein crystals. Sov. Phys. Crystallogr. 15, 206–210.
Ramachandran, G. N. & Srinivasan, R. (1970). Fourier methods in crystallography. New York: Wiley–Interscience.
Rango, C. de, Mauguen, Y. & Tsoucaris, G. (1975). Use of high-order probability laws in phase refinement and extension of protein structures. Acta Cryst. A31, 227–233.
Rossmann, M. G. (1960). The accurate determination of the position and shape of heavy-atom replacement groups in proteins. Acta Cryst. 13, 221–226.
Rossmann, M. G. (1961). The position of anomalous scatterers in protein crystals. Acta Cryst. 14, 383–388.
Rossmann, M. G. & Blow, D. M. (1961). The refinement of structures partially determined by the isomorphous replacement method. Acta Cryst. 14, 641–647.
Sayre, D. (1974). Least-squares phase refinement. II. High-resolution phasing of a small protein. Acta Cryst. A30, 180–184.
Schevitz, R. W., Podjarny, A. D., Zwick, M., Hughes, J. J. & Sigler, P. B. (1981). Improving and extending the phases of medium- and low-resolution macromolecular structure factors by density modification. Acta Cryst. A37, 669–677.
Sheriff, S. & Hendrickson, W. A. (1987). Location of iron and sulfur atoms in myohemerythrin from anomalous-scattering measurements. Acta Cryst. B43, 209–212.
Singh, A. K. & Ramaseshan, S. (1966). The determination of heavy atom positions in protein derivatives. Acta Cryst. 21, 279–280.
Sussman, J. L., Holbrook, S. R., Church, G. M. & Kim, S.-H. (1977). A structure-factor least-squares refinement procedure for macromolecular structures using constrained and restrained parameters. Acta Cryst. A33, 800–804.
Terwilliger, T. C. & Eisenberg, D. (1983). Unbiased three-dimensional refinement of heavy-atom parameters by correlation of origin-removed Patterson functions. Acta Cryst. A39, 813–817.
Vijayan, M. (1980a). On the Fourier refinement of protein structures. Acta Cryst. A36, 295–298.
Vijayan, M. (1980b). Phase evaluation and some aspects of the Fourier refinement of macromolecules. In Computing in crystallography, edited by R. Diamond, S. Ramaseshan & K. Venkatesan, pp. 19.01–19.25. Bangalore: Indian Academy of Sciences.
Vijayan, M. (1981). X-ray analysis of 2Zn insulin: some crystallographic problems. In Structural studies on molecules of biological interest, edited by G. Dodson, J. P. Glusker & D. Sayre, pp. 260–273. Oxford: Clarendon Press.
Wang, B. C. (1985). Resolution of phase ambiguity in macromolecular crystallography. Methods Enzymol. 115, 90–112.
Watenpaugh, K. D., Sieker, L. C., Herriot, J. R. & Jensen, L. H. (1973). Refinement of the model of a protein: rubredoxin at 1.5 Å resolution. Acta Cryst. B29, 943–956.
Watenpaugh, K. D., Sieker, L. C. & Jensen, L. H. (1975). Anomalous scattering in protein structure analysis. In Anomalous scattering, edited by S. Ramaseshan & S. C. Abrahams, pp. 393–405. Copenhagen: Munksgaard.
Wilson, A. J. C. (1942). Determination of absolute from relative X-ray intensity data. Nature (London), 150, 151–152.