International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by M. G. Rossmann and E. Arnold. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. F, ch. 11.4, pp. 229-231
Section 11.4.5. Experimental assumptions
^{a}UT Southwestern Medical Center at Dallas, 5323 Harry Hines Boulevard, Dallas, TX 75390-9038, USA, and ^{b}Department of Molecular Physiology and Biological Physics, University of Virginia, 1300 Jefferson Park Avenue, Charlottesville, VA 22908, USA
To achieve the main target of a diffraction experiment – the estimation of structure factors – three components need to be determined, with maximum possible precision:
The main difficulty of data analysis in protein crystallography is the complexity of the process that determines these components. HKL can determine all three directly from the data produced by the analogue-to-digital converter (ADC). The only extra program needed is one that sends the raw ADC signal to the computer disk. For charge-coupled-device (CCD) detectors, spatial detector distortion and sensitivity per pixel functions need to be established in a separate experiment. Usually it is worthwhile to establish a geometrical description of the detector in a separate diffraction experiment. A precise determination requires a well diffracting, high symmetry, non-slipping crystal and a special data-collection procedure.
The crystal response function consists of two types of factors included in the analysis: additive factors, which are represented by the background, and a number of multiplicative factors, such as exposed crystal volume, overall and resolution-dependent decay, Lorentz factor, flux variation, polarization, etc. Other factors, like extinction and non-decay radiation damage (radiation damage can result not only in decay, but also in a change in the crystal lattice, often a main source of error in an experiment), are ignored by HKL, except for their contribution to error estimates.
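The distinction between additive and multiplicative factors can be illustrated with a minimal Python sketch. It assumes that the predicted count is the background plus the intensity scaled by the product of the multiplicative factors. The function names and the factor values are hypothetical and are not taken from HKL; in practice each factor varies between reflections and images.

```python
import math

def predicted_counts(intensity, background, factors):
    """Additive background plus intensity times the product of multiplicative factors."""
    return background + math.prod(factors.values()) * intensity

def corrected_intensity(counts, background, factors):
    """Invert the model: subtract the background, then divide out the factors."""
    return (counts - background) / math.prod(factors.values())

# Hypothetical factor values for a single reflection (illustration only).
factors = {"lorentz": 2.0, "polarization": 0.5, "decay": 0.8, "exposed_volume": 1.5}
counts = predicted_counts(100.0, 10.0, factors)
recovered = corrected_intensity(counts, 10.0, factors)
```

In this sketch an error in the background shifts the recovered intensity by a fixed amount, whereas an error in any multiplicative factor rescales it.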
The detector response function is the main component for the data model. HKL supports
HKL supports most data formats, which represent particular combinations of the above features. The formats define the coordinate system, the pixel size, the detector size, the active area and the fundamental shape (cylindrical, spherical, flat rectangular or circular, single or multi-module) of the detector.
The main complexity of the data-analysis program and the difficulties in using it are not in application of the data model but rather in the determination of the unknown data-model parameters. The refinement of the data-model parameters is an order of magnitude more complex (in terms of the computer code) than the integration of the Bragg peaks when the parameters are known.
The data model is a compromise between an attempt to describe the measurement process precisely and the ability to find parameters describing this process. For example, the overlap between the Bragg peaks is typically ignored due to the complexity of spot-shape determination when reflections overlap. The issue is not only to implement the parameterization, but also to do it with acceptable speed and stability of the numerical algorithms. A more complex data model can be more precise (realistic) under specific circumstances, but can result in a less stable refinement and produce less precise final results in most cases. An apparently more realistic (complex) data model may end up being inferior to a simpler and more robust approach. The complexity of model-quality analysis is due to the fact that some types of errors may be much less significant than others. In particular, an error that changes the intensities of all reflections by the same factor only changes the overall scale factor between the data and the atomic model. Truncation of the integration area results in a systematic reduction of calculated reflection intensities. A variable integration area may result in a different fraction of a reflection being omitted for different reflections. The goal of an integration method is to minimize the variation in the omitted fraction, rather than its magnitude. Similarly, if there is an error in predicting reflection-profile shape, this constant error has a smaller impact than a variable error of the same magnitude.
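The argument about the omitted fraction can be checked numerically. The sketch below is an illustration with made-up intensities and fractions, not part of HKL: it compares a constant omitted fraction with a variable one of the same average magnitude and reports how far the measured-to-true ratios scatter around their mean.

```python
def integrate(true_intensities, omitted_fractions):
    """Measured intensity when a given fraction of each reflection is truncated."""
    return [i * (1.0 - f) for i, f in zip(true_intensities, omitted_fractions)]

def relative_spread(true_intensities, measured):
    """Largest deviation of measured/true from its mean, as a fraction of the mean."""
    ratios = [m / t for m, t in zip(measured, true_intensities)]
    mean = sum(ratios) / len(ratios)
    return max(abs(r - mean) for r in ratios) / mean

true_i = [100.0, 400.0, 900.0]                                # hypothetical intensities
constant = integrate(true_i, [0.10, 0.10, 0.10])              # same 10% lost everywhere
variable = integrate(true_i, [0.02, 0.10, 0.18])              # 10% lost on average
spread_constant = relative_spread(true_i, constant)
spread_variable = relative_spread(true_i, variable)
```

With the constant fraction the spread is zero to rounding precision, so the loss is absorbed by the overall scale factor. With the variable fraction the ratios differ by several per cent between reflections, which no single scale factor can correct.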
The magnitude and types of errors are very different in different experiments. The compensation of errors also differs between experiments, making it hard to generalize about an optimal approach to data analysis when the data do not fully satisfy the assumptions of the data model. For intense reflections, when counting statistics are not a limiting factor, none of the current data models accounts for all reproducible errors in experiments. This issue is critical in measuring small differences originating from dispersive effects.
The parameters of the data model can be classified into four groups:
The least-squares method is based on minimization of a function that is a sum of contributions of the following type: $(\mathrm{pred} - \mathrm{obs})^2/\sigma^2$, where pred is a prediction based on some parameterized model, obs is the measured value of the predicted quantity and $\sigma$ is an estimate of the measurement and the prediction uncertainty. DENZO has the following least-squares refinements:
SCALEPACK can refine the following parameters by least-squares methods:
Occasionally, the refinement can be unstable due to high correlation between some parameters. High correlation results in the errors in one parameter compensating for the errors in other parameters. In the case where compensation is 100%, the parameter would be undefined, but the error compensation by other parameters would make the predicted pattern correct. In such cases, eigenvalue filtering [related to singular value decomposition, described by Press et al. (1989) in Numerical Recipes] is employed to remove the most correlated components from the refinement to make it more stable. Eigenvalue filtering works reliably when starting parameters are close to the correct values, but may fail to correct large errors in the input parameters if the correlation is close to, but not exactly, 100%. Once the whole data set is integrated, global refinement [also called post refinement: Rossmann et al. (1979); Winkler et al. (1979); Evans (1987); Greenhough (1987); Evans (1993); Kabsch (1993)] can refine crystal parameters (unit cell and orientation) more precisely and without correlation with detector parameters. The unit cell used in structure-determination calculations should come from the global refinement (in SCALEPACK) and not from DENZO refinement.
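The idea behind eigenvalue filtering can be sketched for the simplest case of two parameters. The Python code below is an illustration under stated assumptions, not the DENZO implementation: the function name, the relative threshold rel_tol and the input layout are hypothetical. It builds the weighted normal matrix from the derivatives of the prediction, diagonalizes it analytically, and applies shifts only along eigenvectors whose eigenvalues are not negligible compared with the largest one.

```python
import math

def filtered_shifts(rows, rel_tol=1e-6):
    """Least-squares shifts for two parameters with eigenvalue filtering.

    rows: iterable of (d_pred/d_p0, d_pred/d_p1, obs - pred, sigma).
    Returns ((shift_p0, shift_p1), number_of_eigenvalues_kept).
    """
    a = b = c = g0 = g1 = 0.0
    for d0, d1, resid, sigma in rows:
        w = 1.0 / sigma ** 2              # weight 1/sigma^2
        a += w * d0 * d0
        b += w * d0 * d1
        c += w * d1 * d1
        g0 += w * d0 * resid
        g1 += w * d1 * resid
    # Analytic eigen-decomposition of the symmetric normal matrix [[a, b], [b, c]].
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    pairs = [
        (mean + radius, (math.cos(theta), math.sin(theta))),
        (mean - radius, (-math.sin(theta), math.cos(theta))),
    ]
    largest = pairs[0][0]
    shift = [0.0, 0.0]
    kept = 0
    for eigenvalue, (v0, v1) in pairs:
        if largest <= 0.0 or eigenvalue <= rel_tol * largest:
            continue                      # undetermined combination: leave unchanged
        coeff = (v0 * g0 + v1 * g1) / eigenvalue
        shift[0] += coeff * v0
        shift[1] += coeff * v1
        kept += 1
    return (shift[0], shift[1]), kept

# Two parameters that affect every observation identically (100% correlated).
correlated = [(1.0, 1.0, 0.2, 1.0), (1.0, 1.0, 0.4, 1.0), (1.0, 1.0, 0.3, 1.0)]
shifts, kept = filtered_shifts(correlated)
```

For the fully correlated input only one eigenvalue survives, and the mean residual of 0.3 is split equally between the two parameters. The difference between the parameters, which the data cannot determine, is left at its starting value; this is why the starting values need to be close to correct.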
The crystal and detector orientation parameters can be refined for each group of images or for each processed image separately. Refinement performed separately for each image allows for robust data processing, even when the crystal slips considerably during data collection.
Not every pixel represents a valid measurement. Specification of the active detector area in DENZO is derived from the format and the definition of the detector size. Detector calibration with flood-field exposure will calculate the sensitivity for each pixel and will also determine which pixels should be ignored. The input command can additionally label some areas of the detector to be ignored, most frequently the shadow caused by the beam stop and its support. To define the shape of the area shadowed by the beam stop, the useful commands are ignore circle and ignore quadrilateral. There are also commands to ignore triangular shapes, margins of the detector and a particular line or pixel.
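How circular and quadrilateral exclusion regions translate into a per-pixel mask can be sketched as follows. This Python code is an illustration only: the function names are hypothetical, the DENZO command syntax is not reproduced, and the quadrilateral is assumed to be convex with its corners listed in order around the perimeter.

```python
def in_circle(x, y, cx, cy, radius):
    """True if pixel (x, y) lies within the circle centred at (cx, cy)."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2

def in_convex_quadrilateral(x, y, corners):
    """True if (x, y) lies inside or on a convex quadrilateral.

    corners: four (x, y) vertices listed in order around the perimeter.
    """
    signs = []
    for i in range(4):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % 4]
        cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
        if cross != 0:
            signs.append(cross > 0)
    # Inside when the point is on the same side of every edge.
    return all(signs) or not any(signs)

def build_ignore_mask(width, height, circles, quadrilaterals):
    """mask[y][x] is True for pixels that should be ignored."""
    return [
        [
            any(in_circle(x, y, *c) for c in circles)
            or any(in_convex_quadrilateral(x, y, q) for q in quadrilaterals)
            for x in range(width)
        ]
        for y in range(height)
    ]

# Hypothetical beam stop (circle) and its support (quadrilateral) on a 10 x 10 detector.
mask = build_ignore_mask(
    10, 10,
    circles=[(5, 5, 2)],
    quadrilaterals=[[(5, 4), (9, 4), (9, 6), (5, 6)]],
)
```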
The basic method for calibration of the spatial dependence of detector sensitivity is to measure the response to a flood-field exposure. The amount of relative exposure per pixel needs to be known. DENZO allows for either a uniform or an isotropic source. If the source is at the crystal position, DENZO refinement (with a separate crystal exposure) can be used to define the geometry of the source relative to the detector. To calculate the flood-field response, an earlier determination of the detector distortion is required. The flood-field response is converted to a sensitivity function. Large deviations from the local average are used to define inactive pixels. The edge of the active area needs special treatment, depending on the method of phosphor deposition.
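The conversion of a flood-field response into a sensitivity function, and the flagging of inactive pixels, can be sketched in Python. This is a simplified illustration, not the DENZO procedure: the function names, the 3 x 3 neighbourhood and the 50% threshold are assumptions, the distortion correction is taken as already applied, and the image is assumed to contain at least one non-zero count.

```python
def sensitivity_map(flood_counts, expected_exposure):
    """Relative sensitivity: counts per unit of expected exposure, normalized to mean 1."""
    rel = [
        [c / e for c, e in zip(count_row, exposure_row)]
        for count_row, exposure_row in zip(flood_counts, expected_exposure)
    ]
    n_pixels = sum(len(row) for row in rel)
    mean = sum(sum(row) for row in rel) / n_pixels
    return [[value / mean for value in row] for row in rel]

def inactive_pixels(sensitivity, max_rel_dev=0.5):
    """Set of (x, y) pixels deviating strongly from the average of their neighbours."""
    height, width = len(sensitivity), len(sensitivity[0])
    flagged = set()
    for y in range(height):
        for x in range(width):
            neighbours = [
                sensitivity[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
                if (i, j) != (x, y)
            ]
            if not neighbours:
                continue
            local = sum(neighbours) / len(neighbours)
            if local <= 0.0 or abs(sensitivity[y][x] - local) > max_rel_dev * local:
                flagged.add((x, y))
    return flagged

# Hypothetical 5 x 5 flood field from a uniform source with one dead pixel.
flood = [[1000.0] * 5 for _ in range(5)]
flood[2][2] = 0.0
exposure = [[1.0] * 5 for _ in range(5)]
sens = sensitivity_map(flood, exposure)
dead = inactive_pixels(sens)
```

For an isotropic rather than uniform source, only the expected_exposure array would change, since it carries the relative exposure per pixel.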
Absolute configuration is defined relative to the data-coordinate system and is only affected by the sign of the parameter y scale. A mirror transformation of the data does not affect the self-consistency of the data. Thus, the correctness of the absolute configuration cannot be verified by data-reduction programs.
HKL can also generate data corrected for the above factors and/or for geometrical conversion and distortion in uncompressed, lossless compressed and lossy (non-reversible to the last digit) compressed modes in linear or 16 bit floating-point encoded format. Fig. 11.4.5.1 shows data from the APS-1 detector in (a) uncorrected mode, (b) transformed to an ideal rectangular detector and (c) transformed to a spherical detector.
The detector goniostat in DENZO can have only one rotation axis – 2θ. In the complex transformations described in equation (11.4.2.8), the geometrical scale is affected by pixel-to-millimetre conversion and distortion. For different instruments, the scale is defined differently. For detectors without distortion, the scale is defined by the value of the pixel size in the 'slow' direction. For detectors with distortion characterized by polynomials (e.g. CCD detectors), the scale is also defined by the way the distortion was determined. In such a case, the source of scale is the separation between holes in the reference grid mask or, alternatively, the goniostat translation. As the distance of the detector active surface from the crystal cannot be measured precisely, the difference between the two distances is the ultimate source of the scale reference. The angle between the detector distance translation and the X-ray beam completes the definition of the detector goniostat in HKL.
The physical goniostat is defined by six angles. Two angles define the direction of the main axis (ω) in the DENZO coordinate system. The third angle defines the zero position of the ω axis. The fourth is the angle between ω and the second axis (κ or χ). The fifth defines the zero position of the second axis. The sixth is the angle between the second and the third axes. This type of goniostat definition allows for the specification of any three-axis goniostat (EEC Cooperative Workshop on Position-Sensitive Detector Software, 1986). Each type of goniostat is represented by six angles. Misalignment of the goniostat is represented as an adjustment to these angles, which can be refined by the HKL system.
Crystal orientation specified by the three angles needs a definition of a zero point. Any crystal axis, or its equivalent reciprocal-space zone perpendicular to it, can be used as a reference. The definition of zero point aligns the crystal axis with the beam direction and one of the reciprocal axes with the x direction. The user can specify both axes.
Both the refinement and calibration procedures determine the properties of the instrument. The principal difference between refinement and calibration is that calibration is performed with data obtained outside the current diffraction experiment, and refinement uses data obtained during the current diffraction experiment. DENZO performs both refinement and calibration, and in some cases the difference between calibration and refinement is a question of semantics, as the refined data from one experiment can be used as a reference for another experiment, or even as a reference for a subsequent refinement cycle or for another part of the same experiment.
References
Blessing, R. H. (1995). An empirical correction for absorption anisotropy. Acta Cryst. A51, 33–38.
EEC Cooperative Workshop on Position-Sensitive Detector Software (1986). Phase I and II, LURE, Paris, 16 May–7 June; Phase III, LURE, Paris, 12–19 November.
Evans, P. (1993). Data reduction: data collection and processing. In Proceedings of the CCP4 study weekend. Data collection and processing, 29–30 January, edited by L. Sawyer, N. Isaac & S. Bailey, pp. 114–123. Warrington: Daresbury Laboratory.
Evans, P. R. (1987). Postrefinement of oscillation camera data. In Proceedings of the Daresbury study weekend at Daresbury Laboratory, 23–24 January, edited by J. R. Helliwell, P. A. Machin and M. Z. Papiz, pp. 58–66. Warrington: Daresbury Laboratory.
Greenhough, A. G. W. (1987). Partials and partiality. In Proceedings of the Daresbury study weekend at Daresbury Laboratory, 23–24 January, edited by J. R. Helliwell, P. A. Machin and M. Z. Papiz, pp. 51–57. Warrington: Daresbury Laboratory.
Kabsch, W. (1993). Automatic processing of rotation diffraction data from crystals of initially unknown symmetry and cell constants. J. Appl. Cryst. 26, 795–800.
Katayama, C. (1986). An analytical function for absorption correction. Acta Cryst. A42, 19–23.
Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. (1989). Numerical recipes – the art of scientific computing. Cambridge University Press.
Rossmann, M. G., Leslie, A. G. W., Abdel-Meguid, S. S. & Tsukihara, T. (1979). Processing and post-refinement of oscillation camera data. J. Appl. Cryst. 12, 570–581.
Winkler, F. K., Schutt, C. E. & Harrison, S. C. (1979). The oscillation method for crystals with very large unit cells. Acta Cryst. A35, 901–911.