International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by E. Arnold, D. M. Himmel and M. G. Rossmann

International Tables for Crystallography (2012). Vol. F, ch. 16.1, p. 425

Section Substructure applications

G. M. Sheldrick (a), C. J. Gilmore (b), H. A. Hauptman (c), C. M. Weeks (c)*, R. Miller (c) and I. Usón (d)

(a) Lehrstuhl für Strukturchemie, Georg-August-Universität Göttingen, Tammannstrasse 4, D-37077 Göttingen, Germany; (b) Department of Chemistry, University of Glasgow, Glasgow G12 8QQ, UK; (c) Hauptman–Woodward Medical Research Institute, Inc., 700 Ellicott Street, Buffalo, NY 14203-1102, USA; and (d) Institució Catalana de Recerca i Estudis Avançats at IBMB-CSIC, Barcelona Science Park, Baldiri Reixach 15, 08028 Barcelona, Spain


It has been known for some time that conventional direct methods can be a valuable tool for locating the positions of heavy-atom substructures using isomorphous (Wilson, 1978) and anomalous (Mukherjee et al., 1989) difference structure factors. Experience has shown that successful substructure applications are highly dependent on the accuracy of the difference magnitudes. As the technology for producing selenomethionine-substituted proteins and collecting accurate multiple-wavelength anomalous diffraction (MAD) data has improved (Hendrickson & Ogata, 1997; Smith, 1998), there has been an increased need to locate many selenium sites. For larger structures (e.g. more than about 30 Se atoms), automated Patterson interpretation methods can be expected to run into difficulties, since the number of unique peaks to be analysed increases with the square of the number of atoms. Experimentally measured difference data are only an approximation to the data for the hypothetical substructure, and it is reasonable to expect that conventional direct methods might run into difficulties sooner when applied to such data. Dual-space direct methods provide a more robust foundation for handling such data, which are often extremely noisy. Dual-space methods also have the added advantage that the expected number of Se atoms, $N_u$, which is usually known, can be exploited directly by picking the top $N_u$ peaks. Successful applications require great care in data processing, especially if the $|F_{A}|$ values resulting from a MAD experiment are to be used.
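As a rough illustration of the quadratic growth mentioned above, consider a simplified count that is an assumption made here rather than a statement from the chapter: $N$ independent sites in space group P1, with no overlapping peaks and the origin peak excluded. Each pair of sites then contributes one unique interatomic vector, so the number of unique Patterson peaks is

\[ n_{\rm peaks} = \frac{N(N-1)}{2}, \qquad n_{\rm peaks}(30) = \frac{30 \times 29}{2} = 435, \qquad n_{\rm peaks}(60) = \frac{60 \times 59}{2} = 1770. \]

Under these assumptions, doubling the number of sites from 30 to 60 roughly quadruples the number of peaks to be interpreted. Crystallographic symmetry adds further vectors between symmetry-equivalent sites, so real cases are likely to be harder than this estimate suggests.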

SHELXD is frequently and successfully used with $|F_{A}|$ values derived from MAD data by, for example, the programs SHELXC (Sheldrick, 2008, 2010) or XPREP (Bruker AXS, Madison, WI). The resolution at which the data should be truncated for substructure determination is best decided on the basis of the correlation coefficients between the signed anomalous differences (Schneider & Sheldrick, 2002). SnB, on the other hand, is normally applied separately to anomalous and dispersive differences. In many cases, both approaches lead to successful substructure solution. The real advantage of MAD data is that they provide more experimental phase information (i.e. better maps), and this is most important at medium to low resolution.

The amount of data available for substructure problems is much larger than for full-structure problems with a comparable number of atoms to be located. Consequently, the user can afford to be stringent in eliminating data with uncertain measurements. Guidelines for rejecting uncertain data have been suggested (Smith et al., 1998). Consideration should be limited to those data pairs $(|E_{1}|, |E_{2}|)$ [i.e. isomorphous pairs $(|E_{\rm nat}|, |E_{\rm der}|)$ and anomalous pairs $(|E_{+{\bf H}}|, |E_{-{\bf H}}|)$] for which

\[ \min \left[ \frac{|E_{1}|}{\sigma (|E_{1}|)}, \frac{|E_{2}|}{\sigma (|E_{2}|)} \right] \geq x_{\min} \]

and

\[ \frac{\bigl|\,|E_{1}| - |E_{2}|\,\bigr|}{[\sigma^{2} (|E_{1}|) + \sigma^{2} (|E_{2}|)]^{1/2}} \geq y_{\min}, \]

where typically $x_{\min} = 3$ and $y_{\min} = 1$. The final choice of the maximum resolution to be used should be based on inspection of the spherical shell averages $\langle |E_{\Delta}|^{2} \rangle_s$ versus $\langle s \rangle$, where $s = \sin(\theta)/\lambda$.
The purpose of this precaution is to avoid spuriously large $|E_{\Delta}|$ values for high-resolution data pairs measured with large uncertainties owing to imperfect isomorphism or the general fall-off of scattering intensity with increasing scattering angle. Only those $|E_{\Delta}|$ values for which

\[ \frac{|E_{\Delta}|}{\sigma (|E_{\Delta}|)} \geq z_{\min} \]

(typically $z_{\min} = 3$) should be deemed sufficiently reliable for subsequent phasing. The probability of very large difference $|E|$ values (e.g. $> 5$) is remote, and data sets that appear to have many such measurements should be examined critically for measurement errors. If a few such data remain even after the adoption of rigorous rejection criteria, it may be best to eliminate them individually. A paper by Blessing & Smith (1999) elaborates further data-selection criteria.

On the other hand, it is also important that the phase:invariant ratio be maintained at 1:10 in order to ensure that the phases are overdetermined. Since the largest $|E|$ values for the substructure cell are more widely separated than they are in a true small-molecule cell, the relative number of possible triplets involving the largest reciprocal-lattice vectors may turn out to be too small. Consequently, a relatively small number of substructure phases (e.g. $10N_u$) may not have a sufficient number (i.e. $100N_u$) of invariants. Since the number of triplets increases rapidly with the number of reflections considered, the appropriate action in such cases is to increase the number of reflections, as suggested in the corresponding table. This will typically produce the desired overdetermination.
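The data-selection criteria described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure implemented in SnB, SHELXD or any other program: the function names (`pair_is_reliable`, `difference_is_reliable`, `shell_averages`) are hypothetical, the inputs are assumed to be already normalized magnitudes with their standard uncertainties, and the shells are assumed to contain equal numbers of reflections, a binning choice that the text does not specify.

```python
import math


def pair_is_reliable(e1, sig1, e2, sig2, x_min=3.0, y_min=1.0):
    """Apply the x_min and y_min tests to one data pair (|E1|, |E2|)."""
    # Both members of the pair must be well measured relative to their sigmas.
    weakest = min(e1 / sig1, e2 / sig2)
    # The difference must be significant relative to its combined sigma.
    significance = abs(e1 - e2) / math.sqrt(sig1 ** 2 + sig2 ** 2)
    return weakest >= x_min and significance >= y_min


def difference_is_reliable(e_delta, sig_delta, z_min=3.0):
    """Apply the z_min test to one normalized difference |E_delta|."""
    return e_delta / sig_delta >= z_min


def shell_averages(s_values, e_deltas, n_shells=10):
    """Return a list of (<s>, <|E_delta|^2>) tuples, one per resolution shell.

    Assumption: shells hold (nearly) equal numbers of reflections, sorted by
    s = sin(theta)/lambda. Other binning schemes are equally plausible.
    """
    pairs = sorted(zip(s_values, e_deltas))
    if not pairs:
        return []
    size = math.ceil(len(pairs) / n_shells)
    shells = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        mean_s = sum(s for s, _ in chunk) / len(chunk)
        mean_e2 = sum(e * e for _, e in chunk) / len(chunk)
        shells.append((mean_s, mean_e2))
    return shells
```

In use, one might plot the second element of each tuple returned by `shell_averages` against the first and truncate the data where the averages start to rise sharply, then keep only the pairs and differences that pass the two predicate functions. The default thresholds are the typical values quoted in the text.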

It is rare for Se atoms to be closer to each other than 5 Å, and the application of SnB to AdoHcy hydrolase data truncated to 4 and 5 Å has been successful. Success rates were lower for the lower-resolution data, but the CPU time required per trial was also reduced, primarily because much smaller Fourier grids were necessary. Consequently, there was no net increase in the CPU time needed to find a solution.


References

Blessing, R. H. & Smith, G. D. (1999). Difference structure-factor normalization for heavy-atom or anomalous-scattering substructure determinations. J. Appl. Cryst. 32, 664–670.
Hendrickson, W. A. & Ogata, C. M. (1997). Phase determination from multiwavelength anomalous diffraction measurements. Methods Enzymol. 276, 494–523.
Mukherjee, A. K., Helliwell, J. R. & Main, P. (1989). The use of MULTAN to locate the positions of anomalous scatterers. Acta Cryst. A45, 715–718.
Schneider, T. R. & Sheldrick, G. M. (2002). Substructure solution with SHELXD. Acta Cryst. D58, 1772–1779.
Sheldrick, G. M. (2008). A short history of SHELX. Acta Cryst. A64, 112–122.
Sheldrick, G. M. (2010). Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta Cryst. D66, 479–485.
Smith, G. D., Nagar, B., Rini, J. M., Hauptman, H. A. & Blessing, R. H. (1998). The use of SnB to determine an anomalous scattering substructure. Acta Cryst. D54, 799–804.
Smith, J. L. (1998). Multiwavelength anomalous diffraction in macromolecular crystallography. In Direct Methods for Solving Macromolecular Structures, edited by S. Fortier, pp. 211–225. Dordrecht: Kluwer Academic Publishers.
Wilson, K. S. (1978). The application of MULTAN to the analysis of isomorphous derivatives in protein crystallography. Acta Cryst. B34, 1599–1608.
