International
Tables for Crystallography Volume F Crystallography of biological macromolecules Edited by E. Arnold, D. M. Himmel and M. G. Rossmann © International Union of Crystallography 2012 |
International Tables for Crystallography (2012). Vol. F, ch. 18.6, pp. 514-515
Section 18.6.5. Symbolic target function
A. T. Brunger,^{a}^{*} P. D. Adams,^{b} W. L. DeLano,^{c} P. Gros,^{d} R. W. Grosse-Kunstleve,^{b} J.-S. Jiang,^{e} N. S. Pannu,^{f} R. J. Read,^{g} L. M. Rice^{h} and T. Simonson^{i}
^{a}Howard Hughes Medical Institute, and Departments of Molecular and Cellular Physiology, Neurology and Neurological Sciences, and Stanford Synchrotron Radiation Laboratory (SSRL), Stanford University, 1201 Welch Road, MSLS P210, Stanford, CA 94305, USA,^{b}The Howard Hughes Medical Institute and Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511, USA,^{c}Graduate Group in Biophysics, Box 0448, University of California, San Francisco, CA 94143, USA,^{d}Crystal and Structural Chemistry, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands,^{e}Protein Data Bank, Biology Department, Brookhaven National Laboratory, Upton, NY 11973–5000, USA,^{f}Department of Mathematical Sciences, University of Alberta, Edmonton, Alberta, Canada T6G 2G1,^{g}Department of Haematology, University of Cambridge, Wellcome Trust Centre for Molecular Mechanisms in Disease, CIMR, Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 2XY, England,^{h}Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511, USA, and ^{i}Laboratoire de Biologie Structurale (CNRS), IGBMC, 1 rue Laurent Fries, 67404 Illkirch (CU de Strasbourg), France |
One of the key innovative features of CNS is the ability to symbolically define target functions and their first derivatives for crystallographic searches and refinement. This allows one conveniently to implement new crystallographic methodologies as they are being developed.
The power of symbolic target functions is illustrated by two examples. In the first example, a target function is defined for simultaneous heavy-atom parameter refinement of three derivatives. The sites for each of the three derivatives can be disjoint or identical, depending on the particular situation. For simplicity, the Blow & Crick (1959) approach is used, although maximum-likelihood targets are also possible (see below). The heavy-atom sites are refined against the target
, and are complex structure factors corresponding to the three sets of heavy-atom sites, represents the structure factors of the native crystal, , and are the structure-factor amplitudes of the derivatives, and , and are the variances of the three lack-of-closure expressions. The corresponding target expression and its first derivatives with respect to the calculated structure factors are shown in Fig. 18.6.5.1(a). The derivatives of the target function with respect to each of the three associated structure-factor arrays are specified with the `dtarget' expressions. The `tselection' statement specifies the selected subset of reflections to be used in the target function (e.g. excluding outliers), and the `cvselection' statement specifies a subset of reflections to be used for cross-validation (Brünger, 1992) (i.e. the subset is not used during refinement but only as a monitor for the progress of refinement).
The second example is the refinement of a perfectly twinned crystal with overlapping reflections from two independent crystal lattices. Refinement of the model is carried out against the residual The symbolic definition of this target is shown in Fig. 18.6.5.1(b). The twinning operation itself is imposed as a relationship between the two sets of selected atoms (not shown). This example assumes that the two calculated structure-factor arrays (`fcalc1' and `fcalc2') that correspond to the two lattices have been appropriately scaled with respect to the observed structure factors, and the twinning fractions have been incorporated into the scale factors. However, a more sophisticated target function could be defined which incorporates scaling.
A major advantage of the symbolic definition of the target function and its derivatives is that any arbitrary function of structure-factor arrays can be used. This means that the scope of possible targets is not limited to least-squares targets. Symbolic definition of numerical integration over unknown variables (such as phase angles) is also possible. Thus, even complicated maximum-likelihood target functions (Bricogne, 1984; Otwinowski, 1991; Pannu & Read, 1996; Pannu et al., 1998) can be defined using the CNS language. This is particularly valuable at the prototype stage. For greater efficiency, the standard maximum-likelihood targets are provided through CNS source code which can be accessed as functions in the CNS language. For example, the maximum-likelihood target function MLF (Pannu & Read, 1996) and its derivative with respect to the calculated structure factors are defined as where `mlf( )' and `dmlf( )' refer to internal maximum-likelihood functions, `fobs' and `sigma' are the observed structure-factor amplitudes and corresponding σ values, `fcalc' is the (complex) calculated structure-factor array, `fbulk' is the structure-factor array for a bulk solvent model, and `d' and are the cross-validated D and functions (Read, 1990; Kleywegt & Brünger, 1996; Read, 1997) which are precomputed prior to invoking the MLF target function using the test set of reflections. The availability of internal Fortran subroutines for the most computing-intensive target functions and the symbolic definitions involving structure-factor arrays allow for maximal flexibility and efficiency. Other examples of available maximum-likelihood target functions include MLI (intensity-based maximum-likelihood refinement), MLHL [crystallographic model refinement with prior phase information (Pannu et al., 1998)], and maximum-likelihood heavy-atom parameter refinement for multiple isomorphous replacement (Otwinowski, 1991) and MAD phasing (Hendrickson, 1991; Burling et al., 1996). Work is in progress to define target functions that include correlations between different heavy-atom derivatives (Read, 1994).
References
Blow, D. M. & Crick, F. H. C. (1959). The treatment of errors in the isomorphous replacement method. Acta Cryst. 12, 794–802.Bricogne, G. (1984). Maximum entropy and the foundations of direct methods. Acta Cryst. A40, 410–445.
Brünger, A. T. (1992). Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature (London), 355, 472–475.
Burling, F. T., Weis, W. I., Flaherty, K. M. & Brünger, A. T. (1996). Direct observation of protein solvation and discrete disorder with experimental crystallographic phases. Science, 271, 72–77.
Hendrickson, W. A. (1991). Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science, 254, 51–58.
Kleywegt, G. J. & Brünger, A. T. (1996). Checking your imagination: applications of the free R value. Structure, 4, 897–904.
Otwinowski, Z. (1991). In Proceedings of the CCP4 Study Weekend. Isomorphous Replacement and Anomalous Scattering, edited by W. Wolf, P. R. Evans & A. G. W. Leslie, pp. 80–86. Warrington: Daresbury Laboratory.
Pannu, N. S., Murshudov, G. N., Dodson, E. J. & Read, R. J. (1998). Incorporation of prior phase information strengthens maximum-likelihood structure refinement. Acta Cryst. D54, 1285–1294.
Pannu, N. S. & Read, R. J. (1996). Improved structure refinement through maximum likelihood. Acta Cryst. A52, 659–668.
Read, R. J. (1990). Structure-factor probabilities for related structures. Acta Cryst. A46, 900–912.
Read, R. J. (1994). Maximum likelihood refinement of heavy atoms. Lecture notes for a workshop on isomorphous replacement methods in macromolecular crystallography. American Crystallographic Association Annual Meeting, 1994, Atlanta, GA, USA.
Read, R. J. (1997). Model phases: probabilities and bias. Methods Enzymol. 277, 110–128.