International Tables for Crystallography Volume F Crystallography of biological macromolecules Edited by E. Arnold, D. M. Himmel and M. G. Rossmann © International Union of Crystallography 2012
International Tables for Crystallography (2012). Vol. F, ch. 11.4, pp. 282-295
https://doi.org/10.1107/97809553602060000833
Chapter 11.4. DENZO and SCALEPACK
^{a}UT Southwestern Medical Center at Dallas, 5323 Harry Hines Boulevard, Dallas, TX 75390–9038, USA, and ^{b}Department of Molecular Physiology and Biological Physics, University of Virginia, 1300 Jefferson Park Avenue, Charlottesville, VA 22908, USA
This chapter describes the analysis of raw diffraction data, used to produce scaled and merged diffraction amplitudes. The general description of the process is provided within the context of how it is performed in the programs DENZO and SCALEPACK. The data analysis determines many parameters of the diffraction experiment that impact the quality of the final result, providing a flexible and robust approach to problems encountered by crystallographers. Topics covered include: diffraction from a perfect crystal lattice; autoindexing; coordinate systems; experimental assumptions; prediction of the diffraction pattern; detector diagnostics; multiplicative corrections (scaling); global or post refinement; and control through a graphical interface.
X-ray diffraction data analysis, as performed by the HKL package (Otwinowski, 1993; Otwinowski & Minor, 1997) or similar programs (Rossmann, 1979; Leslie & Tsukihara, 1980; Howard et al., 1985; Blum et al., 1987; Bricogne, 1987; Howard et al., 1987; Messerschmidt & Pflugrath, 1987; Kabsch, 1988; Higashi, 1990; Sakabe, 1991), is used to obtain the following results:
Other results, such as the indexing of diffraction patterns, are in most cases only intermediate steps in the process of reaching the above goals. The HKL system and other programs have tools to validate these intermediate results by self-consistency checks.
The fundamental stages of data analysis are:
The order of these stages represents the natural flow of data reduction, but quite often these steps are repeated iteratively to introduce information gained at a later stage.
The three basic questions in collecting diffraction data are:
These questions and steps (1)–(7) of data analysis are intimately intertwined.
Data analysis makes specific assumptions which the collected data should satisfy. However, the experimenter can verify whether the data satisfy those assumptions only by data analysis. This circular logic can be broken by an iterative process. Data analysis in real time provides immediate feedback during data collection and can remove the guesswork about whether, what and how from the process. The description of data analysis and algorithms that follows will make frequent references to the assumptions about the data and offer guidelines on how to make the experiment fulfil these assumptions.
This article uses the HKL package coordinate system to describe data analysis and its algorithms. However, as most equations are written in vector notation, they can be easily adapted to conventions used in other programs.
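X-ray photons can scatter from individual electrons by inelastic and incoherent processes. The coherent scattering by the whole crystal is called diffraction.^{1} Energy conservation, when expressed in photon momentum vectors, is equivalent to

$$\mathbf{S}\cdot\mathbf{S} = -2\,\mathbf{S}_0\cdot\mathbf{S}, \qquad (11.4.2.1)$$

where $\mathbf{S}$ is the diffraction vector, defined as the change of photon momentum in the scattering process, and $\mathbf{S}_0$ is the incident-beam vector, which is oriented in the direction of the beam and has length $1/\lambda$. Diffraction from a perfect crystal lattice occurs when scattering from all repeating crystal elements is in phase, which can be stated in vector algebra as

$$\mathbf{a}\cdot\mathbf{S} = h, \qquad (11.4.2.2)$$
$$\mathbf{b}\cdot\mathbf{S} = k, \qquad (11.4.2.3)$$
$$\mathbf{c}\cdot\mathbf{S} = l, \qquad (11.4.2.4)$$

where h, k, l are integers called Miller indices and $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ are the real-space crystal periodicity vectors, defining the orientation matrix:

$$[A] = \begin{pmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{pmatrix}.$$

In shorter notation, equations (11.4.2.2)–(11.4.2.4) may be written as $[A]\mathbf{S} = \mathbf{h}$, which in reciprocal space is equivalent to $\mathbf{S} = [A]^{-1}\mathbf{h}$, with the orientation matrix defined as the inverse of [A].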
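As a minimal numerical sketch of equations (11.4.2.1)–(11.4.2.4), the following Python fragment tests both conditions for a trial vector S. It assumes that the rows of [A] are the real-space vectors a, b, c and that S is the scattered minus the incident photon momentum, so that energy conservation reads S·S = −2 S_0·S. The function names, cell dimensions and tolerance are illustrative assumptions and are not taken from the HKL package.

```python
import numpy as np

def miller_indices(A, S):
    """Return the products a.S, b.S, c.S (integers only for a lattice point)."""
    return A @ S

def satisfies_energy_conservation(S, S0, tol=1e-9):
    """|S0 + S| = |S0| is equivalent to S.S = -2 S0.S."""
    return abs(S @ S + 2.0 * (S0 @ S)) < tol

wavelength = 1.0                               # angstrom (assumed value)
S0 = np.array([0.0, 0.0, 1.0 / wavelength])    # beam along +z
A = np.diag([50.0, 60.0, 70.0])                # hypothetical cell aligned with the lab axes

# The lattice point (1, 2, 0) satisfies the lattice condition, but in this
# crystal orientation it does not satisfy energy conservation.
S_lattice = np.linalg.solve(A, np.array([1.0, 2.0, 0.0]))
hkl = miller_indices(A, S_lattice)
on_sphere = satisfies_energy_conservation(S_lattice, S0)

# Any elastically scattered photon satisfies energy conservation,
# whether or not S is a lattice point.
direction = np.array([0.1, 0.0, 1.0])
S_elastic = direction / np.linalg.norm(direction) / wavelength - S0
elastic_ok = satisfies_energy_conservation(S_elastic, S0)
```

A reflection is observed only when both tests succeed for the same S, which is why the crystal has to be rotated during data collection.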
The condition for crystal diffraction with Miller indices h, k, l is the existence of a (unique) vector S which is a solution to equations (11.4.2.1)–(11.4.2.4). Equation (11.4.2.1) states the diffraction condition for vector S. The space of the solutions to equations (11.4.2.2)–(11.4.2.4) is called reciprocal space, and vector S belongs to this space. However, the following presentation does not depend on the properties of reciprocal space. The laboratory coordinate system used here has its origin at the position of the crystal. A diffraction peak at the detector position $\mathbf{X}$ in three-dimensional laboratory space corresponds to vector S:

$$\mathbf{S} = \frac{\mathbf{X}}{\lambda|\mathbf{X}|} - \mathbf{S}_0, \qquad (11.4.2.5)$$

where $\mathbf{S}_0$ is the incident-beam vector of equation (11.4.2.1) and $\lambda$ is the wavelength. Rotation of the crystal around the goniostat axes can be described by expressing the periodicity vectors [equations (11.4.2.2)–(11.4.2.4)] as a function of the goniostat angles [equation (11.4.2.6)], starting from the vectors that represent the crystal orientation at the zero position of the goniostat. These rotations can be described by equation (11.4.2.7) (Bricogne, 1987), in which each rotation axis is specified by its direction cosines. To complete the description of the diffraction geometry, we need a function X(p, q), describing the position in experimental space of each pixel of the detector with integer coordinates (p, q). This function is detector-specific and includes the description of the detector geometry and distortion. For a planar detector, equation (11.4.2.8) builds X(p, q) from the following components: matrices describing the detector's departure from a nominal orientation; the rotation around the 2θ axis; the translation between the detector and the crystal; L, an operator describing one of the eight possible axis-direction conventions used by the detector manufacturer; K, an operator scaling pixels to millimetres; D, a detector-specific distortion function; and B, the beam position on the detector surface.
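A rotation about an axis given by its direction cosines can be written explicitly with Rodrigues' formula. The sketch below is a generic implementation under that assumption. It is not claimed to be the exact parameterization printed in Bricogne (1987), and the function name and example axis are hypothetical.

```python
import numpy as np

def axis_rotation(direction_cosines, angle_deg):
    """Rotation matrix for a right-handed rotation about the given axis."""
    n = np.asarray(direction_cosines, dtype=float)
    n = n / np.linalg.norm(n)                  # guard against unnormalized input
    t = np.radians(angle_deg)
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])         # cross-product matrix of n
    return np.eye(3) + np.sin(t) * K + (1.0 - np.cos(t)) * (K @ K)

# Rotate a zero-position crystal vector by 90 degrees about the x axis.
R = axis_rotation([1.0, 0.0, 0.0], 90.0)
a_zero = np.array([0.0, 1.0, 0.0])
a_rotated = R @ a_zero                         # points along +z
```

For a multi-axis goniostat, the matrices of the individual axes are multiplied together. The order of multiplication depends on how the axes are mounted on each other and is not fixed by this sketch.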
Equations (11.4.2.1)–(11.4.2.8) define the necessary and sufficient conditions needed to fully describe both the existence and position on the detector of the diffraction peaks, and provide all information needed to perform autoindexing.
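The chain from a detector pixel to a diffraction vector can be illustrated for the simplest possible case. The sketch below assumes an ideal flat detector perpendicular to the beam, with no 2θ swing, no tilt and no distortion, so it replaces the full function X(p, q) by a deliberately simplified stand-in. All names and numerical values are hypothetical.

```python
import numpy as np

def pixel_to_lab(p, q, beam_px, pixel_mm, distance_mm):
    """Idealized X(p, q): flat detector normal to the beam, which runs along +z."""
    x = (p - beam_px[0]) * pixel_mm
    y = (q - beam_px[1]) * pixel_mm
    return np.array([x, y, distance_mm])

def diffraction_vector(X, wavelength):
    """S = X / (lambda |X|) - S0, with the crystal at the origin."""
    S0 = np.array([0.0, 0.0, 1.0 / wavelength])
    return X / (wavelength * np.linalg.norm(X)) - S0

# A peak 500 pixels (50 mm) from the beam position at 200 mm distance.
X = pixel_to_lab(1500, 1000, beam_px=(1000, 1000), pixel_mm=0.1, distance_mm=200.0)
S = diffraction_vector(X, wavelength=1.0)
resolution = 1.0 / np.linalg.norm(S)           # in angstrom, about 4.09
```

The same two steps apply to a real detector once the orientation, 2θ, distortion and axis-convention terms are included in the pixel-to-laboratory mapping.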
Among the autoindexing algorithms that have been proposed (Vriend & Rossmann, 1987; Kabsch, 1988; Kim, 1989; Higashi, 1990; Leslie, 1993), the method based on periodicity of the reciprocal lattice tends to be the most reliable (Otwinowski & Minor, 1997; Steller et al., 1997).
Autoindexing starts with a peak search, which results in a set of triplets, each giving the detector position of a peak and the number i of the image in which it was found. Equation (11.4.3.1) holds for any given rotation matrix [R], with [R] defined by equation (11.4.3.2). When equation (11.4.3.1) is applied to equation (11.4.2.2), it results in equation (11.4.3.3), which contains a three-dimensional vector with as yet unknown components. Note that the matrix [R] represents crystal rotation when the crystal is in the diffraction condition defined by the existence of the solution to equations (11.4.2.1)–(11.4.2.4), described by vector S. For data collected in wide oscillation mode,^{2} the angle at which diffraction occurs is not known a priori; however, it can be approximated by the middle of the oscillation range of the image. Combining the peak position with equations (11.4.2.5) and (11.4.2.8) provides an estimate of the vector S. Therefore, equation (11.4.3.3) and similar equations for the k and l Miller indices are approximately satisfied, given approximation and experimental errors. The purpose of autoindexing is to determine the three unknown vectors appearing in these equations and the {h, k, l} triplet for each diffraction peak. To accomplish this, three equations for each peak [equation (11.4.3.3) and the analogous equations for k and l] must be solved. DENZO introduced a method based on the observation that the maxima of function (11.4.3.4) are the approximate solutions to this set of equations. To speed up the search for all significant maxima, a two-step process is used. The first step is the search for maxima of function (11.4.3.4) on a three-dimensional uniform grid, made very fast owing to the use of a fast Fourier transform (FFT) to evaluate function (11.4.3.4). Function (11.4.3.4) is identical to structure-factor calculations in the space group P1, which allows the use of the crystallographic FFT. Because the maxima at the grid points (HKL uses a grid) only approximate the maxima of function (11.4.3.4), the vectors resulting from a grid search are optimized by Newton's method.
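The grid-search step can be sketched with an ordinary FFT. The fragment below assumes that the periodicity function has the form Σ_j cos(2π a·S_j), i.e. the real part of a P1 structure-factor sum with unit-weight 'atoms' at the observed diffraction vectors S_j. This is an assumed form consistent with the description above, not a transcription of function (11.4.3.4). The 64-point grid, the 10 Å cubic test lattice and all names are arbitrary choices for illustration, and the Newton optimization of the grid maxima is omitted.

```python
import numpy as np

def periodicity_map(S, s_max, n):
    """Evaluate sum_j cos(2 pi a.S_j) for trial vectors a on an n^3 grid via FFT.

    S     : (N, 3) array of diffraction vectors, components within [-s_max, s_max)
    Returns the map and the real-space grid step (same length unit as 1/S).
    """
    ds = 2.0 * s_max / n                               # reciprocal-space bin width
    idx = np.rint(S / ds).astype(int) % n              # nearest bin, wrapped
    rho = np.zeros((n, n, n))
    np.add.at(rho, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return np.fft.fftn(rho).real, 1.0 / (2.0 * s_max)

# Synthetic peak list: reciprocal-lattice points of a 10 angstrom cubic cell.
cell = 10.0
hkl = np.array([(h, k, l) for h in range(-3, 4)
                for k in range(-3, 4) for l in range(-3, 4)], dtype=float)
S = hkl / cell

F, step = periodicity_map(S, s_max=0.4, n=64)

# Grid point (8, 0, 0) corresponds to the trial vector (8 * step, 0, 0) = (10, 0, 0).
value_at_cell_edge = F[8, 0, 0]      # every term equals 1: the sum is len(S)
value_at_half_edge = F[4, 0, 0]      # terms alternate in sign: the sum is small
```

In this contrived example the lattice points fall exactly on grid nodes. With real data the binning is only approximate, which is one reason why the grid maxima need the subsequent optimization described above.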
Function (11.4.3.4) has maxima not only for the basic periodic vectors of the lattice, but also for any integer linear combination of them. Any set of three such vectors with a minimal nonzero determinant can be used to describe the crystal lattice. Steller et al. (1997) describe the algorithm that finds the most reliable set of three vectors. This set needs to be converted to the one conventionally used by crystallographers, as defined in International Tables for Crystallography Volume A (IT A) (2005).
To generate the conventional solution, a two-step procedure is used. In step 1, the reduced primitive triclinic cell is found using the algorithm provided by IT A. Subsequently, step 2 finds conventional cells in the other Bravais lattices of higher symmetry.
The relationship between a higher-symmetry cell and the reduced primitive triclinic cell can be described by

$$[A] = [M][P], \qquad (11.4.3.5)$$

where [A] and [P] are matrices built from real-space periodicity vectors in the same way as the orientation matrix, with [P] representing the reduced triclinic primitive cell, and [M] is one of the 44 matrices listed in IT A. If [A] is generated using equation (11.4.3.5) from an experimentally determined [P], owing to experimental errors it will not exactly satisfy the symmetry restraints. DENZO introduced a novel index to evaluate the significance of this violation of symmetry. This index is based on the observation that from [A] one can deduce the parameters of the unit cell, apply symmetry restraints to the unit cell and calculate a matrix for the unit cell that satisfies these symmetry restraints. If [A] perfectly satisfies the symmetry restraints, the matrix [U] that relates [A] to this symmetry-restrained matrix will be unitary, i.e. $[U]^{T}[U] = [I]$. The index of distortion calculated and presented by DENZO measures the departure of [U] from unitarity, accumulated over the indices i and j of the 3 × 3 matrix [U].
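The idea behind the distortion index can be sketched numerically. The fragment below derives the cell parameters from a matrix whose rows are a, b, c, imposes symmetry restraints, rebuilds a restrained matrix and measures how far the connecting matrix U is from being unitary. The specific measure used here, the sum of the absolute elements of U Uᵀ − I, is an assumption made for illustration. The expression and normalization used by DENZO may differ, and all function names are hypothetical.

```python
import numpy as np

def cell_parameters(A):
    """Cell lengths and angles (degrees) from a matrix whose rows are a, b, c."""
    G = A @ A.T                                    # metric tensor
    a, b, c = np.sqrt(np.diag(G))
    alpha = np.degrees(np.arccos(G[1, 2] / (b * c)))
    beta = np.degrees(np.arccos(G[0, 2] / (a * c)))
    gamma = np.degrees(np.arccos(G[0, 1] / (a * b)))
    return a, b, c, alpha, beta, gamma

def cell_matrix(a, b, c, alpha, beta, gamma):
    """Rows a, b, c in a standard orientation (a along x, b in the xy plane)."""
    al, be, ga = np.radians([alpha, beta, gamma])
    cx = c * np.cos(be)
    cy = c * (np.cos(al) - np.cos(be) * np.cos(ga)) / np.sin(ga)
    cz = np.sqrt(c * c - cx * cx - cy * cy)
    return np.array([[a, 0.0, 0.0],
                     [b * np.cos(ga), b * np.sin(ga), 0.0],
                     [cx, cy, cz]])

def triclinic(a, b, c, alpha, beta, gamma):
    return a, b, c, alpha, beta, gamma             # no restraints

def orthorhombic(a, b, c, alpha, beta, gamma):
    return a, b, c, 90.0, 90.0, 90.0               # all angles restrained

def distortion_index(A, restrain):
    """Assumed measure: sum of |U U^T - I| with U = A_sym^{-1} A."""
    A_sym = cell_matrix(*restrain(*cell_parameters(A)))
    U = np.linalg.solve(A_sym, A)
    return np.abs(U @ U.T - np.eye(3)).sum()

# A monoclinic cell with beta = 91 degrees, tested against two lattices.
A = cell_matrix(50.0, 60.0, 70.0, 90.0, 91.0, 90.0)
index_triclinic = distortion_index(A, triclinic)
index_orthorhombic = distortion_index(A, orthorhombic)
```

With this measure the triclinic index is zero to rounding error, while the orthorhombic index is about 0.035, reflecting the one-degree departure of β from 90°. Because U Uᵀ depends on A only through its metric tensor, the result does not depend on the orientation of the crystal.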
The value of this index increases as additional symmetry restraints are imposed, starting from zero for a triclinic cell. Autoindexing in DENZO always finishes with a table of distortion indices for the 14 possible Bravais lattices, but does not automatically make a choice of lattice.
The cell-reduction procedure cannot determine lattice symmetry, since it cannot distinguish true lattice symmetry from a lattice accidentally having higher symmetry within experimental error (e.g. a monoclinic lattice with β ≈ 90° is approximately orthorhombic). If one is not certain about the lattice symmetry, the safe choice is to assume space group P1, with a primitive triclinic lattice for the crystal, and to check the table again after refinement of the diffraction-geometry parameters. A reliable symmetry analysis can be done only by comparing intensities of symmetry-related reflections, which is done later in SCALEPACK or another scaling/merging program.
The total oscillation range needed for autoindexing has to cover a sufficient number of spots to establish the periodicity of the diffraction pattern in three dimensions. It is also important that the oscillation range of each image is small enough so that the lunes (i.e. rings of spots from one reciprocal plane) are resolved. One should note that the requirement for lune separation is distinct from the requirement for spot separation. If lunes overlap, spots may have more than one index consistent with a particular position on the detector.
The autoindexing procedure described above is not dependent on prior knowledge of the crystal unit cell; however, for efficiency reasons, the search is restricted to a reasonable range of unit-cell dimensions, obtained, for example, from the requirement of spot separation. In DENZO, this default can be overridden by the keyword longest vector, but the need to use this keyword is a sign of a problem that should be fixed. Either the defined spot size should be decreased or data should be recollected with the detector further away from the crystal.
Autoindexing is sensitive to inaccuracy in the description of the detector geometry. The specified position of the beam on the detector should correspond to the origin of the Bragg-peaks lattice (i.e. the peak with Miller index 000). Autoindexing will shift the origin of the lattice to the Bragg lattice point closest to the specified beam position. An incorrect specification of the beam position will result in the incorrect placement of the Miller index 000. In such a situation, all reflections will have incorrectly determined indices. Such misindexing can be totally self-consistent until the intensities of symmetry-related reflections are compared. This dependence of the indexing correctness on the assumed beam position is one of the main sources of difficulties in indexing (Otwinowski & Minor, 1997; Gewirth, 2003). The beam position has to be precise, as the largest acceptable error is one half of the shortest distance between spots.
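The tolerance on the beam position can be estimated before the experiment. The sketch below uses the small-angle approximation, in which neighbouring spots along a cell axis of length a are separated on the detector by roughly λD/a, where D is the crystal-to-detector distance. The function names and example values are illustrative assumptions.

```python
def spot_separation_mm(wavelength_A, distance_mm, longest_axis_A):
    """Small-angle estimate of the shortest distance between neighbouring spots."""
    return wavelength_A * distance_mm / longest_axis_A

def beam_position_tolerance_mm(wavelength_A, distance_mm, longest_axis_A):
    """Largest acceptable beam-position error: half the shortest spot separation."""
    return 0.5 * spot_separation_mm(wavelength_A, distance_mm, longest_axis_A)

# 1.0 angstrom radiation, 200 mm distance, 100 angstrom longest cell axis.
tolerance = beam_position_tolerance_mm(1.0, 200.0, 100.0)   # 1.0 mm
```

For a 0.1 mm pixel this corresponds to ten pixels, so a long cell axis or a short detector distance quickly makes the requirement demanding.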
The process of determining {h, k, l} triplets is not very sensitive to other detector parameters. Errors of a degree or two in rotation or by 10% in distance are unlikely to produce wrong values of h, k and l. Sometimes data with even very large errors, e.g. if the detector distance is too large by a factor of 5, will still produce the correct {h, k, l} triplets. The detector position error is compensated by an error in the lattice determined by autoindexing. For this reason, the accuracy of the lattice is not a function of the autoindexing procedure, but depends mainly on the accuracy of the detector description. By the same token, the distortion of the lattice also depends on the accuracy of the detector parameters.
Special care has to be taken if more than one crystal contributes to the diffraction image. When there is a large disproportion between the volumes of the crystals (e.g. the presence of a satellite crystal), autoindexing may work without any modifications. In the case of crystals of similar volumes, manual removal of weaker reflections from the peak-search list and resolution cuts can make the proportion of reflections from one crystal in the peak list large enough for the autoindexing method to succeed. If the multiple crystals have similar orientations, using only very low resolution data may be the right method. In the case of twinned crystals, the autoindexing procedure sometimes finds a superlattice that simultaneously assigns integer indices to reflections from both crystals. In such a case, DENZO solves the problem of finding the best three-dimensional lattice that incorporates all of the observed peaks. Unfortunately, for a twinned crystal, the periodicity of the overall diffraction pattern may not in all cases be the proper criterion to index two different lattices. Alternative approaches have been described in the literature for indexing such cases.
There are four natural coordinate systems used to describe a diffraction experiment, defined by the order in which the diffraction-image pixels are stored in the detector, the beam and gravity, or the beam and the goniostat axis (spindle or 2θ). These coordinate systems will be called data, beam–gravity, beam–spindle and beam–2θ, respectively.
To visualize a diffraction pattern, beam–gravity is the coordinate system clearly preferred by human physiology. The universal preference to relate to the gravity direction is revealed by the observation that people generally perceive an image in a mirror as inverted left–right rather than top–down. Hence XdisplayF uses the beam–gravity coordinate system.^{3}
The first (1983) DENZO implementation used the data coordinate system to describe the beam position on the detector and to define the integration box. This is still the case in order to keep backward compatibility. There are eight ways of relating detector-data order to the beam–gravity coordinate system. All of them have been encountered in various detector formats and are part of the detector-format description in DENZO.
Initially, DENZO supported only a single-axis goniostat and used a beam–spindle coordinate system to define the crystal and detector orientation, as well as polarization. The goniostat spindle axis was assumed to be horizontal, so the direction perpendicular to the beam and spindle was described by the keyword vertical, which in reality may not relate to the gravity direction for some goniostats. The keyword rotx relates to rotation around the spindle axis, roty around the vertical axis and rotz around the beam axis. The definition of the orientation matrix in the file used for communication between DENZO and SCALEPACK uses an unintuitive convention: the letter y in roty relates to the first element of the vector, x in rotx to the second and z in rotz to the third. However, the matrix always has a positive determinant, so this convention has no impact on the handedness of the coordinate system. This unfortunate choice of convention, preserved for backward compatibility reasons, appears only in the communication file and has no significance for anybody who does not inspect the matrix.
The description of generalized multiple-axis goniostats introduced a conceptual change in the DENZO coordinate system. The data-collection axis can be oriented in any direction, so in principle rotx, roty and rotz no longer need to be defined relative to the data-collection axis. However, to keep the useful correlations between refinable parameters (for example, crystal rotz and detector rotz are typically close to 100% correlated), one real and two virtual goniostats are used simultaneously in DENZO. Refinable crystal parameters (crystal rotx, roty, rotz) are still defined, as in the past, by the data-collection axis and the beam. This means that the directions of rotations defined by fit crystal rotx, roty and rotz do not rotate around the data-collection axis as the program advances from one image to another. This coordinate system changes with the change in direction of the data-collection axis. The crystal orientation is defined by three constant, perpendicular axes, which do not have to be aligned with the physical crystal goniostat. However, the so-called 2 theta rotation has a fixed axis, and, if it exists, it defines the DENZO coordinate system together with the beam axis. Thus, the current coordinate system in DENZO should be called beam–2θ. Fortunately for the user, the conversions between different coordinate systems are handled transparently. For example, the refined change in the crystal orientation is converted from the refined axes goniostat to the crystal-orientation goniostat. The movements of the physical goniostat are converted into appropriate changes in the diffraction pattern. The physical goniostat appears only to describe the data collection and, optionally, to calculate the physical goniostat angles needed to produce particular crystal alignments.
The DENZO coordinate system (Gewirth, 2003) is used in the definition of crystal goniostats, 2θ goniostat and polarization. The curvature of cylindrical detectors and Weissenberg coupling are described in this coordinate system as well.
This discussion of the coordinate systems shows that the conceptual complexity of the program description can be compatible with – or even lead to – simplicity in the use of the program. The success of data analysis does not require a full understanding of the relations between internal DENZO goniostats and the coordinate systems when using the programs. The reason for this complexity was to create a simple pattern of correlations between crystal and detector parameters in DENZO refinement. This in turn allows for simple and easy-to-understand control of the refinement process and simplifies problem diagnostics. For example: the definition of refined crystal rotx as rotation around the data-collection axis makes the spindle and shutter problems manifest only as fluctuations of crystal rotx. Constant nonzero values of refined shifts between frames of crystal roty and rotz are a sign of misalignment of the data-collection axis. Although the program compensates for this misalignment with changes in crystal orientation, this introduces a small error in the Lorentz factor (which will be, to a good approximation, compensated for in scaling). The nature of these problems is such that they do not result in a complete failure of the experiment, but they do have an impact on the quality of the result. It is up to the experimenter and the instrument manager to assess the significance of these indications.
To achieve the main goal of a diffraction experiment – the estimation of structure-factor amplitudes – three components need to be determined, with highest possible precision:
The main difficulty of data analysis in protein crystallography is the complexity of the process that determines these components. HKL can determine all three directly from the data produced by the analogue-to-digital converter (ADC). The only extra program needed is one that sends the raw ADC signal to the computer disk. For charge-coupled-device (CCD) detectors, spatial detector distortion and sensitivity per pixel functions need to be established in a separate experiment. An important part of such calibration is the flood-field exposure, which can provide a pixel-by-pixel response correction for calculating the background in the X-ray exposure. There was hope that it would also correct for localized variations in response to Bragg-peak intensities, but it has been shown that light diffusion in reducing fibre optic taper depends on the stretching process during individual manufacture of such tapers. Consequently, there are substantial effects, in addition to the flood-field exposure, on Bragg-peak intensities towards the edges of such tapers. For single taper (single CCD) detectors, with the beam in the centre, the Bragg peaks are underestimated towards the edge in a resolution-dependent way, but without noticeably affecting merging statistics. The resolution-dependent effect is mostly compensated for by a slight adjustment in temperature factor of the model, so it has little practical impact. The issue is much more severe for larger, multi-taper (multi-CCD) detectors, where the localized effects may greatly affect the phasing signal. This effect can be corrected to a degree during scaling, in which geometric details of the detector construction are taken into account.
Additionally, determination of detector-geometry distortion can be accomplished with diffraction experiments on a well diffracting, high-symmetry, non-slipping crystal and a special data-collection procedure. Such a procedure can be used for experimental detectors or to correct for manufacturers' inaccuracies, which are occasionally noticeable.
In practice, many manufacturers of CCD detectors provide post-processed data rather than ADC output, to hide features and flaws of their instruments. This practice complicates data analysis, as the post-processing does not correct well for variations in CCD detector sensitivity, changes the shapes of diffraction-peak profiles and creates incorrect estimates of diffraction-peak intensities in places where flaws were hidden. Better results would be obtained by allowing output of data directly from the ADC, letting CCD detectors approach the lower level of systematic errors observed for image-plate detectors.
The crystal response function consists of two types of factors, which are included in the analysis: additive factors represented by the background, and a number of multiplicative factors, such as exposed crystal volume, resolution-dependent decay, Lorentz factor, flux variation, absorption, polarization etc. Other factors, like extinction, are currently ignored by HKL, except for their contribution to error estimates. Specific radiation-induced damage (Borek et al., 2007, 2010) is approached differently, as it is a merging, rather than scaling-factor, correction.
The detector response function is one of the main components of the data model. The detector formats define the coordinate system, the pixel size, the detector size, the active area and the fundamental shape (cylindrical, spherical, flat rectangular or circular, single or multi-module) of the detector.
HKL supports most of the data formats representing particular combinations of the features listed below:
The main complexity of the data-analysis program and the difficulties in using it are not in the application of the data model but rather in the determination of the unknown data-model parameters. The refinement of the data-model parameters is an order of magnitude more complex in terms of the computer code than the integration of Bragg peaks when the parameters are known.
The data model is a compromise between an attempt to describe the measurement process precisely and the ability to find parameters describing this process. For example, the overlap between Bragg peaks is typically ignored due to the complexity of spot-shape determination when reflections overlap. The issue is not only to implement the parameterization, but also to do it with acceptable speed and stability of the numerical algorithms. A more complex data model can be more precise under specific circumstances, but can result in a less stable refinement and produce less precise final results in most cases. An apparently more realistic but also more complex data model may end up being inferior to a simpler and more robust approach. The complexity of model-quality analysis is due to the fact that some types of errors may be much less significant than others. In particular, an error that changes the intensities of all reflections by the same factor only changes the overall scale factor between the data and the atomic model. Truncation of the integration area results in a systematic reduction of calculated reflection intensities. A variable integration area may result in a different fraction of a reflection being omitted for different reflections. The goal of an integration method is to minimize the variation in the omitted fraction, rather than its magnitude. Similarly, if there is an error in predicting reflection-profile shape, a constant error has a smaller impact than a variable error of the same magnitude.
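The difference between a constant and a variable omitted fraction can be made concrete with a toy calculation. In the sketch below, four hypothetical intensities lose either the same 10% or a reflection-dependent fraction averaging 10%, and the error is measured after the best single scale factor has been applied. All numbers are invented for illustration.

```python
import numpy as np

true_I = np.array([1000.0, 400.0, 250.0, 50.0])

# Same fraction omitted from every reflection.
constant_loss = true_I * (1.0 - 0.10)
# Different fraction omitted from each reflection (mean omitted fraction 0.10).
variable_loss = true_I * (1.0 - np.array([0.05, 0.15, 0.08, 0.12]))

def scaled_rms_error(measured, reference):
    """Relative r.m.s. error remaining after a least-squares overall scale factor."""
    k = (measured @ reference) / (measured @ measured)
    return np.sqrt(np.mean(((k * measured - reference) / reference) ** 2))

err_constant = scaled_rms_error(constant_loss, true_I)   # absorbed by the scale
err_variable = scaled_rms_error(variable_loss, true_I)   # several per cent remain
```

The constant loss disappears into the overall scale factor, whereas the variable loss leaves a residual error of several per cent although the average omitted fraction is the same.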
The magnitudes and types of errors are very different in different experiments. The compensation of errors also differs between experiments, making it hard to generalize about an optimal approach to data analysis when the data do not fully satisfy the assumptions of the data model. For intense reflections, when counting statistics are not a limiting factor, none of the current data models accounts for all reproducible errors in experiments. This issue is critical in measuring small differences originating from dispersive effects, i.e. anomalous scattering.
The parameters of the data model can be classified into four groups:
The least-squares method is based on minimization of a function that is a sum of terms of the following type:

$$\frac{(\mathrm{pred} - \mathrm{obs})^{2}}{\sigma^{2}},$$

where pred is a prediction based on some parameterized model, obs is the value of this prediction's measurement and $\sigma^{2}$ is a combined estimate of the measurement and the prediction uncertainties (variance). DENZO refines the following parameters by least-squares methods:
SCALEPACK can refine the following parameters by least-squares methods:
SCALEPACK can also refine, by Tikhonov-stabilized least-squares methods (Borek et al., 2010), multiple structure-factor components contributing to a group of reflections with the same unique index:
Occasionally, the refinement can be unstable due to high correlation between some parameters. Refinement of highly correlated parameters, for example positional refinement of unit-cell parameters and the distance between the crystal and the detector, results in errors in one parameter compensating for errors in other parameters. However, such error compensation may still lead to mostly correct predictions; for example, incorrect distance and unit-cell parameters can result in correct predictions of the diffraction pattern. In an extreme case, where compensation is 100%, the individual parameters involved are undefined even when the diffraction-pattern prediction is successful.
To prevent instability in calculations, eigenvalue filtering (Reeke, 1984) is employed to remove the most correlated components from the refinement. Eigenvalue filtering works reliably when the starting parameters are close to the correct values. It may fail, however, to correct large errors in the input parameters if the correlation between parameters is close to 100%. All refinements also have built-in Tikhonov or Tikhonov-like stabilizers (Tikhonov & Arsenin, 1977) to prevent unreasonable values of parameters or differences between them.
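The effect of eigenvalue filtering can be shown on a deliberately degenerate problem. The sketch below solves weighted normal equations for two parameters whose effects on the predictions are 100% correlated, discarding eigen-directions of the normal matrix with negligible eigenvalues. It is a generic illustration of the technique, not the implementation used in DENZO or SCALEPACK, and the cutoff value and names are assumptions.

```python
import numpy as np

def filtered_least_squares(J, residual, sigma, rel_cutoff=1e-8):
    """Parameter shifts from normal equations with small eigenvalues filtered out.

    J        : (n_obs, n_par) derivatives of predictions with respect to parameters
    residual : (n_obs,) obs - pred at the current parameter values
    sigma    : (n_obs,) standard uncertainties
    """
    w = 1.0 / sigma ** 2
    N = J.T @ (w[:, None] * J)                 # normal matrix
    g = J.T @ (w * residual)                   # right-hand side
    eigval, eigvec = np.linalg.eigh(N)
    keep = eigval > rel_cutoff * eigval.max()  # drop ill-determined directions
    coeff = (eigvec[:, keep].T @ g) / eigval[keep]
    return eigvec[:, keep] @ coeff, int(keep.sum())

x = np.array([1.0, 2.0, 3.0, 4.0])
J = np.column_stack([x, x])                    # identical columns: 100% correlation
residual = 2.0 * x
sigma = np.ones_like(x)

shift, n_kept = filtered_least_squares(J, residual, sigma)
```

Here the normal matrix is singular, so an unfiltered solution does not exist. The filtered solution keeps one eigen-direction and returns equal shifts of 1.0 for both parameters, which reproduce the residuals exactly even though the individual parameters remain undetermined.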
Once the whole data set is integrated, post refinement (Rossmann, 1979; Winkler et al., 1979; Evans, 1987; Greenhough, 1987; Kabsch, 1993) can refine unit-cell parameters and crystal-orientation parameters more precisely and without correlation with detector parameters. For this reason, the unit-cell parameters used in structure-determination calculations should come from the post refinement performed in SCALEPACK and not from initial refinement in DENZO.
The crystal and detector orientation parameters can be refined either for a group of images or for each processed image separately. Refinement performed separately for each image allows for robust data processing, even when the crystal slips considerably during data collection.
Not every pixel represents a valid measurement. Specification of the active detector area in DENZO is derived from the format and the definition of the detector size. Detector calibration with flood-field exposure will calculate the sensitivity for each pixel and will also determine which pixels should be ignored. Unfortunately, rather than ignoring bad areas, some manufacturers enter values interpolated from nearby pixels, in particular to present a fake contiguous image for multi-module detectors. This practice results in observations in these areas not scaling well with their symmetry equivalents.
The input command in DENZO can additionally label some areas of the detector to be ignored, most frequently the shadow caused by the beam stop and its support. There are also commands to ignore triangular shapes, margins of the detector and a particular line or pixel.
The basic method for calibration of the spatial dependence of detector sensitivity is to measure the response of the detector to a flood-field exposure. The amount of relative exposure per pixel needs to be known. DENZO allows for either a uniform or an isotropic source. If the source is at the crystal position, DENZO refinement (with a separate crystal exposure) can be used to define the geometry of the source relative to the detector. To calculate the flood-field response, a prior determination of the detector distortion is required. The flood-field response is converted to a sensitivity function. Large deviations from the local average are used to define inactive pixels. The edge of the active area needs special treatment, dependent on the method of phosphor deposition.
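The conversion of a flood-field image into a sensitivity function and an inactive-pixel mask can be sketched as follows. The fragment assumes that the expected relative exposure per pixel is already known and that distortion has been dealt with. It uses a 3 × 3 local median as the 'local average' and a 50% relative deviation as the rejection threshold. Both choices, like the function name, are assumptions for illustration and not DENZO's actual criteria. `sliding_window_view` requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sensitivity_and_mask(flood, expected, max_rel_dev=0.5):
    """Return the per-pixel sensitivity and a boolean mask of inactive pixels."""
    response = flood / expected                      # response per unit exposure
    sensitivity = response / np.median(response)     # normalize to a typical pixel
    padded = np.pad(sensitivity, 1, mode="edge")     # replicate edges for windows
    local = np.median(sliding_window_view(padded, (3, 3)), axis=(2, 3))
    inactive = np.abs(sensitivity - local) > max_rel_dev * local
    return sensitivity, inactive

# Synthetic flood field: uniform response, one dead pixel, one 10% hot pixel.
flood = np.full((5, 5), 1000.0)
flood[2, 2] = 0.0
flood[0, 4] = 1100.0
expected = np.ones((5, 5))                           # uniform source assumed

sensitivity, inactive = sensitivity_and_mask(flood, expected)
```

Only the dead pixel is flagged. The 10% variation at the corner is kept as a legitimate sensitivity correction. A real calibration would need separate handling of the edge of the active area, as noted above.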
Absolute configuration is defined relative to the data-coordinate system and is only affected by the sign of the parameter y scale. A combined mirror transformation of the image data and of the goniostat description does not affect the self-consistency of the data. Thus, the correctness of the absolute configuration cannot be verified by data-reduction programs. Owing to inherent ambiguities in indexing when merging data from different crystals, re-indexing is sometimes necessary, which may cause unjustified concern that the re-indexing may somehow affect the sign of Bijvoet differences. SCALEPACK checks for the positive sign of the re-indexing matrix, and thus the procedure cannot affect the absolute configuration or the Bijvoet differences.
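The determinant test on a re-indexing matrix is easy to illustrate. The sketch below applies a re-indexing matrix to a list of Miller indices only if its determinant is positive, mirroring the check described above. It is not SCALEPACK code, and the function name and example matrices are hypothetical.

```python
import numpy as np

def apply_reindexing(M, hkl):
    """Re-index Miller indices with matrix M, refusing hand-inverting matrices."""
    M = np.asarray(M)
    det = int(round(np.linalg.det(M)))
    if det <= 0:
        raise ValueError("re-indexing matrix must have a positive determinant")
    return np.asarray(hkl) @ M.T

keeps_hand = [[0, 1, 0], [1, 0, 0], [0, 0, -1]]   # (h, k, l) -> (k, h, -l), det = +1
flips_hand = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]    # (h, k, l) -> (k, h, l),  det = -1

new_hkl = apply_reindexing(keeps_hand, [[1, 2, 3]])   # gives (2, 1, -3)
```

Swapping h and k alone would invert the hand of the index system, so the accepted operation also changes the sign of l.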
The detector goniostat in DENZO can have only one rotation axis – 2θ. In the complex transformations described in equation (11.4.2.8), the geometrical scale is affected by pixel-to-millimetre conversion and distortion. For different instruments, the scale of the rotation is defined differently. For detectors without distortion, the scale is defined by the value of the pixel size in the 'slow' direction. For detectors with distortion characterized by polynomials (e.g. CCD detectors), the scale is defined in the way the distortion was determined. In such a case, the scale is derived either from the separation between holes in the reference grid mask or the detector translation. As the distance of the detector active surface from the crystal cannot be measured precisely, in the latter approach the difference between the two distances is the ultimate source of the scale reference. The angle between the detector distance translation and the X-ray beam completes the definition of the detector goniostat in HKL.
Each type of physical goniostat is defined by six angles. Two angles define the direction of the main axis (ω) in the DENZO coordinate system. The third angle defines the zero position of the ω axis. The fourth is the angle between ω and the second axis (κ or χ). The fifth defines the zero position of the second axis. The sixth is the angle between the second and the third axes. This type of goniostat definition allows for the specification of any three-axis goniostat (EEC Cooperative Workshop on Position-Sensitive Detector Software, 1986). Misalignment of the goniostat is represented as an adjustment to these angles, which can be refined by the HKL system.
Crystal orientation specified by the three angles needs a definition of a zero point. Any crystal axis, or the equivalent reciprocal-space zone perpendicular to it, can be used as a reference. The definition of zero point aligns the crystal axis with the beam direction and one of the reciprocal axes with the x direction in the DENZO coordinate system. The user can specify which crystal axes will be aligned at zero crystal orientation angles by specifying the so-called reference zone.
Both the refinement and calibration procedures determine the properties of the instrument. The principal difference between refinement and calibration is that calibration is performed with data not obtained during the current diffraction experiment, and refinement uses data obtained during the current diffraction experiment. DENZO performs both refinement and calibration. In some cases, the difference between calibration and refinement is a question of semantics, as the refined data from one experiment can be used as a reference for another experiment, or even as a reference for a subsequent refinement cycle or for another part of the same experiment.
The autoindexing procedure assigns Miller indices only to strong spots, those that can be found through a peak-search procedure. The target of the experiment is to estimate structure-factor amplitudes for all reflections captured by the detector. Therefore, the positions of all spots need to be predicted for all possible index triplets h. For each h, a matrix [A] must be found that generates the vector S that satisfies the diffraction condition [equation (11.4.2.1)], knowing that the matrix [A] is a function of the crystal orientation at the goniostat angles that generated the reflection with indices h [equation (11.4.2.6)]. The rotation of the crystal during the experiment creates a straightforward algebraic problem that results in a complex equation defining the angle at which the reflection occurs. This angle consequently defines the image or images on which the reflection appears. Knowing this angle, the vector S can be calculated and, from equation (11.4.2.5), the direction of the vector X can be found. Calculation of the length of vector X requires a knowledge of detector orientation, which, for flat detectors, is described here by vector G, perpendicular to the detector and with length equal to the crystal-to-detector distance. Then, by inverting equation (11.4.2.8), the position in pixels of the reflection can be calculated.
The precision of the integration step depends on precise knowledge of the peak positions. The autoindexing step provides only an approximate orientation of the crystal, and the result of that step is imprecise if the initial values of the detector parameters are poorly known. A nonlinear least-squares refinement process is used to improve the prediction (EEC Cooperative Workshop on Position-Sensitive Detector Software, 1986). Depending on the particulars of the experiment, the same parameters (e.g. crystal-to-detector distance) may either be known more precisely a priori, or are better estimated from the diffraction data. DENZO allows either fixing or refining of each of the parameters separately. This flexibility is important when characterizing a detector, but when detector parameters are already known, the fit all option and detector-specific default values are quite reliable.
DENZO can refine the six parameters describing the position and orientation of the detector in space. It can also refine internal parameters of the detector including:
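Detector- and crystal-parameter refinement in DENZO is achieved by minimizing the sum of three functions [equations (11.4.6.7), (11.4.6.8) and (11.4.6.11)] of the type in equation (11.4.5.1). The contributions resulting from the measurement of the position {p, q} of the reflection are
$$\chi_p^2 = \sum \frac{(p_{\rm pred} - p_{\rm obs})^2}{\sigma_p^2} \qquad (11.4.6.7)$$
and
$$\chi_q^2 = \sum \frac{(q_{\rm pred} - q_{\rm obs})^2}{\sigma_q^2}, \qquad (11.4.6.8)$$
where $\{p_{\rm pred}, q_{\rm pred}\}$ is the predicted position of the reflection in pixel coordinates, $\{p_{\rm obs}, q_{\rm obs}\}$ is the centroid position of the observed reflection in pixel coordinates, and $\sigma_p$ and $\sigma_q$ are combined estimates of uncertainties of the observed and predicted positions.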
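The positional contribution to the refinement target can be illustrated by a short sketch. It assumes the usual least-squares form (squared difference between predicted and observed position, divided by the squared uncertainty); the function name and the data layout are hypothetical.

```python
def positional_chi2(predicted, observed, sigmas):
    """Sum of squared, uncertainty-normalized positional residuals.

    predicted, observed -- lists of (p, q) pixel coordinates
    sigmas              -- list of (sigma_p, sigma_q) combined uncertainties
    """
    total = 0.0
    for (pp, qp), (po, qo), (sp, sq) in zip(predicted, observed, sigmas):
        total += ((pp - po) / sp) ** 2 + ((qp - qo) / sq) ** 2
    return total

# Two hypothetical reflections with 0.5-pixel positional uncertainty.
predicted = [(10.0, 20.0), (30.0, 40.0)]
observed = [(10.5, 20.0), (30.0, 39.0)]
sigmas = [(0.5, 0.5), (0.5, 0.5)]
chi2 = positional_chi2(predicted, observed, sigmas)
```

A refinement program would minimize such a term, together with the partiality term, with respect to the detector and crystal parameters that determine the predicted positions.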
The Bragg condition [equation (11.4.2.1)] assumes diffraction from ideal crystals and a parallel X-ray beam. In reality, crystals are mosaic and the beam has some angular spread. The value of the mosaicity keyword describes the range of orientations of the crystal lattice within a sample. As the impacts of mosaicity and the beam's angular spread on the angular width of reflections are equivalent, the keyword mosaicity describes the sum of both effects.
DENZO assumes a model M of the angular shape of diffraction peaks as a function of the oscillation angle [equation (11.4.6.9)], where mos is the observed angular width of a diffraction peak yet to be increased by the Lorentz factor. The model is centred on the predicted oscillation angle at which the diffraction condition is fulfilled and is non-zero only within a limited range of oscillation angles around it; otherwise M = 0. The actual width of the reflection is different due to Lorentz-factor variability over the image, so equation (11.4.6.9) describes only the common component of the angular width.
Using equation (11.4.6.9) we can calculate the predicted partiality P [equation (11.4.6.10)] of data collected by oscillating between the start and end angles of one image. It is a number that represents what fraction of the reflection intensity is present in one image. If the partiality is 1, such reflections are called fully recorded; otherwise, they are called partials. For partials, predictions of partiality can be compared with the observed fraction of the reflection intensity present in one image. The partiality model contributes its own term [equation (11.4.6.11)] to the refinement. The combined positional [described by equations (11.4.6.7) and (11.4.6.8)] and partiality refinement [equation (11.4.6.11)] used in DENZO is both stable and very accurate. The power of this method lies in proper weighting by estimated errors of two very different terms – one describing positional differences and the other describing intensity differences. Both detector and crystal variables are uniformly treated in the refinement process.
The design of detectors results in pixels not being positioned on an exact square or rectangular grid. A correct understanding of the detector distortions is essential to accurate positional refinement. The types of distortions are detector-specific. The primary sources of error include misalignment of the detector position sensors and optical or magnetic distortion in CCD-based detectors. If the detector distortion can be parameterized, then these parameters should be added to the refinement. For example, in the case of spiral scanners, there are two parameters describing the end position of the scanning head. In a perfectly adjusted scanner, these parameters would be zero. In practice, however, they may deviate from zero by as much as 1 mm. Such misalignment parameters can correlate very strongly with other detector and crystal parameters, particularly for low-symmetry lattices or for low-resolution data. If the distortions are stable, it is better to determine them in a separate experiment optimized for that task.
Fibre-optic tapers used in many CCD detectors have distortions that have to be individually determined for each instrument. The distortion is stable over time and its spatial characteristics are dominated by a smooth component and a small local shear. In high-quality tapers used in X-ray instruments, the small local shear can be ignored. The smooth component can be parameterized in a number of ways, for example by splines or polynomials (Messerschmidt & Pflugrath, 1987). DENZO uses two-dimensional Chebyshev polynomials (Press et al., 1989) in {x, y} or {p, q} coordinates, with each coordinate normalized to the range [−1, 1]. Typically, fifth- or seventh-order polynomials result in a positional error (r.m.s.) lower than 7 µm, which is about one tenth of a typical detector pixel. DENZO can use either a grid mask pattern or the X-ray diffraction pattern to refine the coefficients of the Chebyshev polynomials. If a grid mask is used, it has to be precisely made and positioned. The use of crystallographic data requires precise knowledge of detector and crystal parameters that are not known a priori with the required precision. The crystal and detector parameters can be determined in the same experiment as detector distortion. However, this experiment needs to be designed to minimize the impact of correlations between the parameters involved. The data analysis requires the description of the distortion function and its inverse. In DENZO, both are approximated in terms of Chebyshev polynomials. The magnitude of the approximation error is the same for the distortion function and its inverse.
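A two-dimensional Chebyshev series of the kind described above can be evaluated with the standard three-term recurrence. The sketch below is illustrative only: the coefficient layout `coeffs[i][j]` (a square table), the square detector of side `size` and the function names are assumptions, not the DENZO parameterization.

```python
def cheb_values(t, order):
    """Return [T_0(t), ..., T_order(t)] using T_{n+1} = 2 t T_n - T_{n-1}."""
    vals = [1.0, t]
    for _ in range(2, order + 1):
        vals.append(2.0 * t * vals[-1] - vals[-2])
    return vals[:order + 1]

def normalize(v, lo, hi):
    """Map a coordinate from [lo, hi] to the Chebyshev domain [-1, 1]."""
    return 2.0 * (v - lo) / (hi - lo) - 1.0

def distortion_shift(x, y, coeffs, size):
    """Evaluate sum_ij coeffs[i][j] * T_i(u) * T_j(v) at detector point (x, y).

    coeffs is a square (order + 1) x (order + 1) table; u and v are the
    normalized coordinates of x and y on a detector of side `size`.
    """
    order = len(coeffs) - 1
    tu = cheb_values(normalize(x, 0.0, size), order)
    tv = cheb_values(normalize(y, 0.0, size), order)
    return sum(coeffs[i][j] * tu[i] * tv[j]
               for i in range(order + 1) for j in range(order + 1))

# Hypothetical second-order coefficients (in pixels).
coeffs = [[0.1, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.2]]
shift_at_centre = distortion_shift(50.0, 50.0, coeffs, 100.0)
```

One such series would describe the displacement along each detector coordinate, and a second pair of series would approximate the inverse mapping.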
To accurately integrate diffraction peaks, the spot position has to be predicted accurately. Each integration program has its own procedure for predicting and fitting diffraction profiles. In DENZO, profile-shape prediction is defined by a weighted average of other reflection profiles present within some radius from the spot of interest. Each spot has its own prediction, which is continuously adjusted to variations of spot shapes over the detector. The profiles are added by shifting them to the same position, generating a normalized profile P_{i}, where $\sum_i P_i = 1$. In the second step, the measured pixel values are fitted to the function
$$B_i + I P_i, \qquad (11.4.7.1)$$
where B_{i} is the predicted value for the background in pixel i and I is the diffraction intensity of the spot. The profile-fitting procedure minimizes the function
$$\sum_i \frac{(M_i - B_i - I P_i)^2}{V_i}, \qquad (11.4.7.2)$$
where M_{i} are the measured pixel values and V_{i} are the variances of these measurements. The minimum of this function defines the value of the profile-fitted intensity (Otwinowski & Minor, 1997):
$$I = \frac{\sum_i P_i (M_i - B_i)/V_i}{\sum_i P_i^2/V_i}. \qquad (11.4.7.3)$$
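The profile-fitted intensity is the closed-form minimum of a weighted least-squares fit of the model B_{i} + I P_{i} to the measured pixel values. A minimal sketch follows; the function name and the example numbers are hypothetical, and real implementations also handle rejected pixels and the estimation of the variances themselves.

```python
def profile_fit_intensity(measured, background, profile, variance):
    """Weighted least-squares estimate of the spot intensity I.

    Minimizes sum_i (M_i - B_i - I * P_i)**2 / V_i with respect to I,
    which gives I = sum(P (M - B) / V) / sum(P**2 / V).
    """
    num = sum(p * (m - b) / v
              for m, b, p, v in zip(measured, background, profile, variance))
    den = sum(p * p / v for p, v in zip(profile, variance))
    return num / den

# Hypothetical noise-free spot: I = 50 on a flat background of 10.
profile = [0.1, 0.2, 0.4, 0.2, 0.1]            # normalized, sums to 1
background = [10.0] * 5
measured = [b + 50.0 * p for b, p in zip(background, profile)]
variance = measured[:]                          # Poisson-like variances
intensity = profile_fit_intensity(measured, background, profile, variance)
```

For noise-free data the estimate reproduces the true intensity regardless of the variances; with noisy data the variances determine how strongly each pixel contributes.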
The relation between the measured intensity I(hkl) of reflection hkl and its squared structure-factor amplitude |F(hkl)|^{2} is described by equation (11.4.8.1), where I_{b} is the flux density of the primary beam; r_{e} is the classical electron radius (2.818 × 10^{−12} mm); λ is the wavelength of the beam; the Lorentz factor is formed from the cross product between the diffraction vector S and the projection of the crystal-rotation-speed vector on the plane perpendicular to the primary beam; P is a polarization factor (Azaroff, 1955); T is the transmission of the beam (related to the absorbance); v_{u} is the volume of a primitive crystal unit cell; V is the volume of the crystal exposed to the beam; |F(hkl)|^{2} is the square of the structure-factor amplitude for the given reflection hkl; D_{A} is the absorption of X-rays by the detector's active material; and D_{C} is the detector's response to a single absorbed X-ray photon.
SCALEPACK determines the components of the total scale factor, i.e. the product of all factors that multiply the structure-factor amplitudes squared in equation (11.4.8.1):
$$I(hkl) = K |F(hkl)|^2, \qquad (11.4.8.2)$$
where I is the intensity and K is the total scale factor.
All components of the total scale factor K can be calculated from non-diffraction measurements and calibration of the data-collection system. However, the absolute calibration of the whole system is rarely available and part of the scale factor (the overall scale factor k_{o}) is determined by comparing the scaled data to the squared structure-factor amplitudes predicted from an atomic model obtained after the structure is solved:
$$K = k_o k_r. \qquad (11.4.8.3)$$
The relative scale factor k_{r} is calculated, but in practice we assume that some of its components are known from detector calibration, beam monitoring and the diffraction geometry. The scaling procedure determines the remaining parts of the scale factor (often iteratively), based on knowledge from subsequent stages of crystallographic analysis. When iterative procedures are applied, the scaling model has to include information about experimental uncertainties (discussed in Section 11.4.10).
SCALEPACK uses an exponential modelling approach (Otwinowski et al., 2003), which is flexible with regard to correlations among parameters and provides a uniform description of the parameter optimization process for various scaling models. In SCALEPACK, the scale factor s for each observation is calculated using a set of a priori unknown parameters p_{i}:
$$s(hkl) = \exp\Big[\sum_i p_i f_i(hkl)\Big], \qquad (11.4.8.4)$$
where f_{i} are pre-defined modelling functions of experimental conditions and i is a hierarchical index referring both to the type of correction and to the indices of the functional parameters describing this correction.
The simplest scaling model has a separate scale factor for each group (or batch) of data, e.g. one scale factor per image. In such a case,
$$f_i = \delta_{ij}, \qquad (11.4.8.5)$$
where j is the batch index for a particular reflection. From equation (11.4.8.4) we obtain p_{i} as the logarithm of the scale factor of batch i.
To correct for a radiation-damage component represented as resolution-dependent decay, one temperature factor is used per batch of data,
$$f_i = \delta_{ij} \frac{{\bf S}\cdot{\bf S}}{2}, \qquad (11.4.8.6)$$
where S is the scattering vector for each reflection. Using the same approach as for equation (11.4.8.5), p_{i} is obtained as the relative temperature factor of batch i.
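The exponential model with one scale and one temperature factor per batch can be sketched as follows. The form of the temperature-factor term, B(S·S)/2, is an assumption of this sketch, as are the function name and argument layout.

```python
import math

def scale_factor(batch, s_squared, log_scales, b_factors):
    """Scale factor of one observation under the exponential model.

    batch      -- batch (e.g. image) index j of the observation
    s_squared  -- S.S for the reflection
    log_scales -- per-batch logarithms of the scale factors (parameters p)
    b_factors  -- per-batch relative temperature factors (parameters p)
    The Kronecker delta in the modelling functions simply selects the
    parameters of the batch to which the observation belongs.
    """
    exponent = log_scales[batch] + b_factors[batch] * s_squared / 2.0
    return math.exp(exponent)

# Two hypothetical batches: the second is scaled by 2 and has B = 4.
log_scales = [0.0, math.log(2.0)]
b_factors = [0.0, 4.0]
k_low = scale_factor(1, 0.0, log_scales, b_factors)    # zero resolution
k_high = scale_factor(1, 0.25, log_scales, b_factors)  # S.S = 0.25
```

Because all corrections enter through the exponent, adding another correction amounts to adding another term p_{i} f_{i} to the sum.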
Another, more complex, multiplicative correction addresses unknown crystal absorption using an average of absorbances in the incoming and diffracted beam directions (Kopfmann & Huber, 1968). The correction is parameterized by real spherical harmonics as a function of the direction of the incoming beam and the diffraction vector S, expressed in the polar coordinate system of the rotating crystal (Katayama, 1986; Blessing, 1995) [equation (11.4.8.7)], where as, lm and ac, lm are parts of the hierarchical index i from equation (11.4.8.4); l, m are indices of the spherical harmonics; P_{lm} is a Legendre polynomial; and the angular arguments are the polar coordinates of the incoming (index i) and the outgoing (index o) directions in the crystal coordinate system. The odd-order spherical harmonics should have zero coefficients when describing pure absorption of X-rays. However, owing to correlation with other effects, the scaling may benefit from the inclusion of low-order odd harmonics.
It is beneficial to correct even for a small discrepancy between the actual and assumed directions of the crystal rotation axis. The inaccuracy results in an error in the calculated value of the Lorentz factor [equation (11.4.8.1)]. The scale factor to correct for this error can be described using the parameter p_{l}, the value of which represents a small angular error, together with a corresponding modelling function. There are additional effects that are parameterized using exponential modelling [equation (11.4.8.4)], for example, a correction for uneven crystal rotation and/or exposure (Otwinowski et al., 2003), or a correction for uneven detector response, which is very important for multi-CCD detectors. This approach can be extended to other experimental factors if there is a need to correct them during scaling without changing the overall logic of scale-factor determination (Otwinowski et al., 2003).
In principle, global scaling can be followed by local scaling (Matthews & Czerwinski, 1975). Local scaling is mostly applied to calculate differences of phasing signal, where it is assumed that a group of measurements, e.g. those close together in reciprocal space or detector space, should be on a similar scale. A flexible parameterization by the exponential modelling allows for a good description of all kinds of smooth corrections. Local scaling is much more limited in terms of what type of smooth variation is being corrected for, so it is unlikely to provide additional benefit to the general scaling method described here. In practice, if there is an improvement from such procedures at the stage of heavy-atom search, it implies that the scaling parameters for global scaling were not properly chosen.
The unknown parameters in equation (11.4.8.4) are estimated with various levels of uncertainty depending on the multiplicity of observations and how symmetry-equivalent reflections are related to each other. Potentially, this may result in unreasonable values of scaling parameters due to insufficient information to determine the values of parameters. In SCALEPACK, the method to stabilize such ill-conditioned calculations is closely related to Tikhonov stabilization (Tikhonov & Arsenin, 1977), where additional, a priori knowledge about the expected magnitude of the physical effect modelled is used to restrain the solutions, based on the same argument as in the case of restraints in the atomic refinement.
For example, logarithms of scale factors typically do not fluctuate by more than w_{s} between frames, where expectation about w_{s} is a function of the data-collection stability (beam stability, goniostat and/or crystal vibrations). This knowledge is described by adding a penalty term (scale restraint) to the function being optimized. A similar approach can be used in calculations of absorption coefficients. For smooth absorption with the expectation of decreasing magnitude of parameters for high orders of spherical harmonics [equation (11.4.8.7)], a reasonable restraint term is parameterized by w_{a}. If we do not want to penalize high-order terms more than the low-order ones, a different restraint can be used.
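A restraint of this kind can be sketched as a quadratic penalty on the differences between the logarithms of the scale factors of consecutive frames. The quadratic form and the function name are assumptions of this sketch; the text above specifies only that a penalty term parameterized by w_{s} is added to the optimized function.

```python
def scale_restraint_penalty(log_scales, w_s):
    """Quadratic penalty on frame-to-frame changes of log scale factors.

    Each consecutive pair contributes ((p_i - p_{i+1}) / w_s)**2, so a
    change equal to the expected fluctuation w_s costs one unit.
    """
    return sum(((a - b) / w_s) ** 2
               for a, b in zip(log_scales, log_scales[1:]))

# Hypothetical log scale factors of four consecutive frames.
log_scales = [0.0, 0.1, 0.1, -0.1]
penalty = scale_restraint_penalty(log_scales, w_s=0.1)
```

Adding such a term to the least-squares target keeps poorly determined scale factors close to their neighbours, while well determined ones are barely affected.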
The process of refining crystal parameters using the combined reflection intensity measurements is known as global refinement or post refinement (Rossmann, 1979; Evans, 1993). The implementation of this method in SCALEPACK allows for separate refinement of the orientation of each image, but with the same unit-cell value for the whole data set. In each batch of data (a batch is typically one image), different unit-cell parameters may be poorly determined. However, in a typical data set there are enough orientations to determine all unit-cell lengths and angles precisely. Global refinement is also more precise than the processing of a single image in the determination of crystal mosaicity and the orientation of each image.
Proper error estimation requires the use of Bayesian reasoning and a multi-component error model (Schwarzenbach et al., 1989; Evans, 1993). In principle, the error estimates may be derived solely from a theoretical understanding of the measurement process. However, the complexity of error propagation and correlations between various sources of effects have led crystallographers to rely on hybrid approaches also involving self-consistency analysis of symmetry-equivalent reflections.
The random errors in DENZO are estimated by a heuristic procedure that also accounts for small components of systematic errors (Borek et al., 2003). Initially, DENZO estimated errors of integrated diffraction peaks from X-ray film. After introducing detectors with larger dynamic range, the procedure was adjusted accordingly.
The initial estimates of errors are obtained from equation (11.4.10.1), where n_{b} is the number of pixels used in background estimation and e_{d} is the error-density parameter defined for each instrument, which can also be overridden by the user (Gewirth, 2003); the other variables are defined in equation (11.4.7.1). The sums are calculated over all the pixels in a reflection profile. The expression within the braces { } describes two components of uncertainty: the left sum accounts for contributions resulting from pixels in the peak area, whereas the right sum adds an adjustment resulting from uncertainty of the background estimate. The denominator in front of the expression in braces is derived from error propagation for the profile-fitted intensity [equation (11.4.7.3)].
Next, the goodness-of-profile-fitting factor g is calculated [equation (11.4.10.2)], where n_{i} is the number of pixels in a reflection profile. For weak reflections the parameter g should be relatively close to 1. If it is systematically off by a large factor, the error-density parameter e_{d} should be adjusted (Borek et al., 2003). SCALEPACK applies an additional level of adjustment to the estimates produced by DENZO (Borek et al., 2003) [equation (11.4.10.3)]. The result is scaled either by the user or by an automatically adjustable factor E_{S} (called the error scale factor) to make it consistent with the disagreements among symmetry-related measurements [equation (11.4.10.4)]. Even this scaled estimate of random error σ_{I} does not account for all types of errors and additional adjustments for systematic effects are needed.
The multiplicative scale factor has its own uncertainty independent of random errors with typical values in the range of a few per cent. However, even such small errors are important in calculations of the phase signal. Errors in the scale factors have a correlated component that equally affects measurements of intensities in phasing differences, so it does not impact on the differences themselves. The important part is estimating the magnitude of the remaining component of scaling errors, described by σ_{K}. Comparing symmetry-related reflections estimates only the relevant component of multiplicative errors. The total scaling error would have to be estimated differently, but typically it has little relevance to macromolecular crystallography and can be ignored.
The σ_{I} [equation (11.4.10.4)] can be combined in quadrature with σ_{K} to obtain the final estimated error of the scaled measurement:
$$\sigma_E^2 = \sigma_I^2 + (\sigma_K I)^2. \qquad (11.4.10.5)$$
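The combination of the random and multiplicative error components can be illustrated numerically. The sketch assumes that the two components are independent and add in quadrature, with the multiplicative component proportional to the intensity; the function name is hypothetical.

```python
import math

def combined_sigma(intensity, sigma_i, sigma_k):
    """Total uncertainty of a scaled measurement.

    sigma_i -- scaled random-error estimate of the intensity
    sigma_k -- relative (fractional) uncertainty of the scale factor
    Assumes independent components added in quadrature.
    """
    return math.sqrt(sigma_i ** 2 + (sigma_k * intensity) ** 2)

# Hypothetical reflection: I = 100, sigma_I = 3, 4% scaling error.
sigma_e = combined_sigma(100.0, 3.0, 0.04)
```

For strong reflections the multiplicative term dominates, which is why scaling errors of only a few per cent matter when small phasing differences are measured.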
Symmetry-related scaled measurements I(hkl) and their uncertainty estimates σ_{E} are used to obtain merged intensities by a standard weighted averaging formula:
$$\langle I \rangle = \frac{\sum_j I_j/\sigma_{E,j}^2}{\sum_j 1/\sigma_{E,j}^2}. \qquad (11.4.10.6)$$
This allows for calculations of validation statistics, called goodness-of-fit or normalized χ^{2}, for each unique index:
$$\chi^2 = \frac{1}{n-1} \sum_j \frac{(I_j - \langle I \rangle)^2}{\sigma_{E,j}^2}, \qquad (11.4.10.7)$$
where n represents the number of observations of a given unique index. This χ^{2} statistic is then averaged in resolution shells or over intensity bins or batch number. If the error model accounts properly for all effects, the χ^{2} statistic should fluctuate around a value of unity. If χ^{2} values depart from this expectation it may indicate a number of possibilities, e.g. various problems at earlier stages (poorly edited beam-stop shadow, hardware failures, mistakes in processing or other sources of outliers, etc.), inadequacy of the error model or variations in the structure factors within the symmetry-related observations. The instrumental problems or mistakes in processing should be corrected. The effects that cannot be corrected may be handled by adjusting the error model. However, if the more detailed analysis eliminates the obvious source for such problems, then the most likely source of discrepancies between symmetry-related measurements results from violation of Friedel symmetry. SCALEPACK calculates merging statistics both for the Bijvoet pairs merged together and separately. Differences in χ^{2} values between these two merging outputs are very reliable estimates of anomalous signal strength. When a more detailed analysis eliminates the obvious reasons for high χ^{2} values, the most likely remaining source of error is non-isomorphism (Borek et al., 2007, 2010).
The HKL package has a number of tools that can detect possible detector or experimental setup problems (Otwinowski & Minor, 1997; Otwinowski et al., 2003). Visual inspection of the image may provide only a very rough estimate of data quality. A check of the analogue-to-digital converter can provide rough diagnostics of detector electronics. Examination of the background can provide information about detector noise, especially when uncorrected images can be examined in the areas exposed to X-rays and areas where pure read-out noise can be observed. DENZO provides several diagnostic tools during the integration stage, as the crystallographer may observe crystal slippage, a change of unit-cell parameters or a change of the values of the positional and angular χ^{2} during the refinement.
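Merging of symmetry-related measurements and the associated normalized χ^{2} can be sketched with an inverse-variance weighted average. The 1/(n − 1) normalization is an assumption of this sketch, as are the function name and the returned tuple.

```python
def merge(intensities, sigmas):
    """Inverse-variance weighted mean, its sigma and the normalized chi^2.

    Returns (mean, sigma_mean, chi2); chi2 is None for a single
    observation, for which no self-consistency check is possible.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    wsum = sum(weights)
    mean = sum(w * i for w, i in zip(weights, intensities)) / wsum
    sigma_mean = (1.0 / wsum) ** 0.5
    n = len(intensities)
    if n < 2:
        return mean, sigma_mean, None
    chi2 = sum(w * (i - mean) ** 2
               for w, i in zip(weights, intensities)) / (n - 1)
    return mean, sigma_mean, chi2

# Three hypothetical symmetry-related observations of one unique index.
mean, sigma_mean, chi2 = merge([100.0, 110.0, 90.0], [10.0, 10.0, 10.0])
```

Here the scatter of the observations matches their stated uncertainties, so χ^{2} equals 1; values systematically above 1 would suggest underestimated errors or unmodelled systematic effects.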
Even more tools are provided at the data-scaling stage. By observing scale factors, poor crystal alignment can be detected. Other tools may help diagnose X-ray shutter malfunction, spindle-axis alignment and internal detector-alignment problems. The final inspection of outliers may again provide valuable information about detector quality. The clustering of outliers in one area of the detector may indicate a damaged surface; if most outliers are partials, it may indicate a problem with spindle backlash or shutter control. The zoom mode may be used to display the area around the outliers to identify the source of a problem: for example, the existence of a satellite crystal or single pixel spikes due to electronic failure. Sometimes, even for very strong data, a histogram of the pixel intensities may stop below the maximum valid pixel value, indicating saturation of the data-acquisition hardware or software.
DENZO and SCALEPACK form the numerical processing core of the HKL package. These programs can be used directly by editing commands in input scripts, but most of the time they are run through HKL-2000 and its more expanded version HKL-3000.
The basic mode of HKL-2000 reads in previously collected data as input and produces scaled and merged reflections as output. This use can be extended in three directions: control of the data-collection process by HKL-2000 and HKL-3000 (Minor et al., 2002, 2006), incorporation of methods of structure solution by HKL-3000 (Minor et al., 2006), and storage of critical intermediate results in an external database (Grabowski et al., 2007).
Crystallographic structure determinations encompass a wide range of project dynamics. There are many projects where all the steps are executed serially, while others involve substantial iterative improvements, where one or a few stages are repeated a number of times. HKL-2000 and HKL-3000 are designed to make both types of project more effective. To accomplish this, information is automatically propagated between various stages of analysis, and many necessary data transformations are performed to accommodate the interface requirements of many programs and beamline controls. At the same time, crystallography requires the experimenter to be actively engaged in decision making, depending on the nature of a particular project and types of problems encountered. The experimenter may need to assess the quality of a set of crystals, decide how to collect data sets from a chosen subset, determine the symmetry of each diffraction pattern, reassess the crystal quality based on the integration and merging steps, and then solve the structure using an appropriate method. Not all programs are fully automatic, and the experimenter may need to be involved in defining how a particular procedure should be executed. As a consequence, both HKL-2000 and HKL-3000 have a multiplicity of interface screens (accessed by tabs in the graphical control centre), each of them designed to control a particular process.
The versions of HKL-2000 and HKL-3000 that interface with data-collection systems can coordinate all parameters of the diffraction experiment. This facilitates interactive experiments in which data analysis is done online and results are automatically updated when new data are collected. In such experiments, it is possible to adjust the data-collection strategy to guarantee the desired result, particularly with regard to data completeness. The strategy takes into account limitations arising from radiation damage. Radiation damage can be estimated from past experience with similar crystals, by theoretical calculations of decay based on beam intensity, and by evaluating scale- and B-factor changes in real time.
The graphical control centre of HKL-2000 (and HKL-3000) consists of three components: an internal database (optionally connected to an external one), a transition-state engine and a graphical user interface (GUI). The internal database stores all the information about data processing and data collection. It can describe not only the data already collected, but also those being collected and even those planned for collection. Each datum entered or program executed, including the data-collection interface, induces a change in the database by the transition-state engine. One of the main functions of the GUI is input to and editing of the database. The other major function is generation of reports from the database (to visualize its status).
The internal database abstraction is based on the following hierarchy: instrument type; site; experiment; crystal; three-dimensional (3D) group of diffraction images; and diffraction image. Each lower level of the hierarchy inherits the properties of the higher levels. When a program finishes analysing data at a particular level, the parent (higher-level) data are updated, so that data at the same level communicate only through the change of state of their common parent. The site-level data are created only when diffraction from a new detector is seen or when the parameters of the detector are changed, which is done rarely and typically by the X-ray equipment administrator. The experiment-level data describe diffraction data from one or more crystals of the same space group. A uniform series of diffraction images form 3D groups. There is no limit to the number of 3D groups, and, in the case of non-uniformity in the series (e.g. as found during data analysis), one 3D group can be split into two or more smaller 3D groups. The smallest 3D group can consist of one image. A crystal datum contains a set of 3D groups with a relative orientation and exposure level known a priori. In practice, this means that diffraction data encapsulated within a single crystal datum were collected from one sample at one site with potentially different settings of goniostat, data-collection axis, crystal translation, detector position, detector mode (e.g. binned/unbinned) or exposure level.
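The inheritance of properties down the hierarchy can be illustrated with a small sketch based on chained dictionaries. This is only an analogy, not the HKL-2000 data model: the level and property names are hypothetical, and the updating of parent data by the transition-state engine is not represented.

```python
from collections import ChainMap

def make_level(parent=None, **props):
    """Create one hierarchy level; lookups fall back to the parent levels."""
    if parent is None:
        return ChainMap(props)
    return parent.new_child(props)

# Hypothetical properties at four of the six levels of the hierarchy.
site = make_level(detector="CCD", pixel_size_mm=0.1)
experiment = make_level(site, space_group="P212121")
group_3d = make_level(experiment, distance_mm=150.0, exposure_s=1.0)
image = make_level(group_3d, exposure_s=2.0)    # overrides the group value

pixel_size = image["pixel_size_mm"]     # inherited from the site level
exposure = image["exposure_s"]          # defined at the image level
```

A value defined at a lower level takes precedence over the inherited one, while everything else is looked up in the enclosing levels.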
HKL-3000 combines a number of existing macromolecular crystallographic computer programs [SHELX (Sheldrick, 2008), CCP4 (Collaborative Computational Project, No. 4, 1994), SOLVE/RESOLVE (Terwilliger, 2004), ARP/wARP (Perrakis et al., 1999) and COOT (Emsley & Cowtan, 2004)] and decision-making algorithms into a powerful expert system (Fig. 11.4.12.1). The typical end result of HKL-3000 is an interpretable electron-density map with a partially built structure and, in some cases, an almost complete and refined model.
The HKL-3000 system is designed to evaluate the results of laboratory and synchrotron diffraction experiments very quickly. The system supports most common types of macromolecular crystallography experiments: isomorphous replacement (IR) and molecular replacement (MR), multiwavelength anomalous diffraction (MAD)/single-wavelength anomalous diffraction (SAD), and native data collection for high-resolution refinement of previously solved models of proteins or protein–ligand complexes. SAD experiments have become popular in recent years (Chruszcz et al., 2008), at least partly owing to the development of integrated systems like HKL-3000, which allow the experimenter to validate whether experimental data collected at one wavelength are of sufficient quality to solve the structure. The concurrent data collection, data processing and quick preliminary structure solution performed by HKL-3000 verify the success of the X-ray experiment and allow optimization of the data-collection strategy while the crystal is still on the goniostat. This allows the experimenter to decide whether the experiment has been successfully completed and thus whether the crystal can be removed.
The methods presented here have been used to solve tens of thousands of crystal structures deposited in public data repositories. The results range from crystal structures of inorganic molecules with 3 Å unit-cell parameters to a crystal structure of a 700 Å-diameter virus, which crystallized in a 700 × 1000 × 1400 Å unit cell. The precision of signal estimation was sufficient to determine a structure where the amplitudes of the imaginary Bijvoet components were approximately 0.15% of the total amplitudes (Borek et al., 2007), and for the results of integration and scaling analyses to be used in charge-density studies (Dominiak et al., 2006). The success of these methods is a consequence of the robust and stable implementation described above.
Acknowledgements
This work was supported by NIH grant GM-53163. We would like to acknowledge the contributions of the many researchers who provided us with diffraction data representing a wide range of experimental problems. We are indebted to the following people who developed other diffraction-data-analysis programs, interactions with whom contributed to HKL program development and to the ideas presented here: M. G. Rossmann, G. Bricogne, P. Evans, A. Howard, W. Kabsch, A. Leslie, J. Pflugrath and G. Sheldrick. We would also like to thank H. Czarnocka, R. Henderson, W. Majewski, A. Pertsemlidis, D. Tomchick and M. Zimmerman for help in preparing this manuscript.
References
Azaroff, L. V. (1955). Polarization correction for crystal-monochromatized X-radiation. Acta Cryst. 8, 701–704.
Blessing, R. H. (1995). An empirical correction for absorption anisotropy. Acta Cryst. A51, 33–38.
Blum, M., Metcalf, P., Harrison, S. C. & Wiley, D. C. (1987). A system for collection and on-line integration of X-ray diffraction data from a multiwire area detector. J. Appl. Cryst. 20, 235–242.
Borek, D., Cymborowski, M., Machius, M., Minor, W. & Otwinowski, Z. (2010). Diffraction data analysis in the presence of radiation damage. Acta Cryst. D66, 426–436.
Borek, D., Ginell, S. L., Cymborowski, M., Minor, W. & Otwinowski, Z. (2007). The many faces of radiation-induced changes. J. Synchrotron Rad. 14, 24–33.
Borek, D., Minor, W. & Otwinowski, Z. (2003). Measurement errors and their consequences in protein crystallography. Acta Cryst. D59, 2031–2038.
Bricogne, G. (1987). The EEC Cooperative Programming Workshop on Position-sensitive Detector Software. In Proceedings of the Daresbury Study Weekend at Daresbury Laboratory, 23–24 January, edited by J. R. Helliwell, P. A. Machin & M. Z. Papiz, pp. 120–146. Warrington: Daresbury Laboratory.
Chruszcz, M., Wlodawer, A. & Minor, W. (2008). Determination of protein structures – a series of fortunate events. Biophys. J. 95, 1–9.
Collaborative Computational Project, Number 4 (1994). The CCP4 suite: programs for protein crystallography. Acta Cryst. D50, 760–763.
Dominiak, P. M., Makal, A., Mallinson, P. R., Trzcinska, K., Eilmes, J., Grech, E., Chruszcz, M., Minor, W. & Wozniak, K. (2006). Continua of interactions between pairs of atoms in molecular crystals. Chem. Eur. J. 12, 1941–1949.
EEC Cooperative Workshop on Position-Sensitive Detector Software (1986). Phase I and II, LURE, Paris, 16 May – 7 June; Phase III, LURE, Paris, 12–19 November.
Emsley, P. & Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta Cryst. D60, 2126–2132.
Evans, P. R. (1987). Postrefinement of oscillation camera data. In Proceedings of the Daresbury Study Weekend at Daresbury Laboratory, 23–24 January, edited by J. R. Helliwell, P. A. Machin & M. Z. Papiz, pp. 58–66. Warrington: Daresbury Laboratory.
Evans, P. R. (1993). Data reduction: data collection and processing. In Proceedings of the CCP4 Study Weekend. Data Collection and Processing, 29–30 January, edited by L. Sawyer, N. Isaacs & S. Bailey, pp. 114–123. Warrington: Daresbury Laboratory.
Gewirth, D. (2003). HKL Manual. 6th ed. HKL Research, Charlottesville, USA.
Grabowski, M., Joachimiak, A., Otwinowski, Z. & Minor, W. (2007). Structural genomics: keeping up with expanding knowledge of the protein universe. Curr. Opin. Struct. Biol. 17, 347–353.
Greenhough, T. J. (1987). Partials and partiality. In Proceedings of the Daresbury Study Weekend at Daresbury Laboratory, 23–24 January, edited by J. R. Helliwell, P. A. Machin & M. Z. Papiz, pp. 51–57. Warrington: Daresbury Laboratory.
Higashi, T. (1990). Auto-indexing of oscillation images. J. Appl. Cryst. 23, 253–257.
Howard, A. J., Gilliland, G. L., Finzel, B. C., Poulos, T. L., Ohlendorf, D. H. & Salemme, F. R. (1987). The use of an imaging proportional counter in macromolecular crystallography. J. Appl. Cryst. 20, 383–387.
Howard, A. J., Nielsen, C. & Xuong, Ng. H. (1985). Software for a diffractometer with multiwire area detector. Methods Enzymol. 114, 452–472.
International Tables for Crystallography (2005). Vol. A. Space-Group Symmetry, edited by Th. Hahn. Heidelberg: Springer.
Kabsch, W. (1988). Evaluation of single-crystal X-ray diffraction data from a position sensitive detector. J. Appl. Cryst. 21, 916–924.
Kabsch, W. (1993). Automatic processing of rotation diffraction data from crystals of initially unknown symmetry and cell constants. J. Appl. Cryst. 26, 795–800.
Katayama, C. (1986). An analytical function for absorption correction. Acta Cryst. A42, 19–23.
Kim, S. (1989). Auto-indexing oscillation photographs. J. Appl. Cryst. 22, 53–60.
Kopfmann, G. & Huber, R. (1968). A method of absorption correction by X-ray intensity measurements. Acta Cryst. A24, 348–351.
Leslie, A. (1993). Autoindexing of rotation diffraction images and parameter refinement. In Proceedings of the CCP4 Study Weekend. Data Collection and Processing, 29–30 January, edited by L. Sawyer, N. Isaacs & S. Bailey, pp. 44–51. Warrington: Daresbury Laboratory.
Leslie, A. G. W. & Tsukihara, T. (1980). A strategy for collecting isomorphous derivative data with the oscillation method. J. Appl. Cryst. 13, 304–305.
Matthews, B. W. & Czerwinski, E. W. (1975). Local scaling: a method to reduce systematic errors in isomorphous replacement and anomalous scattering measurements. Acta Cryst. A31, 480–487.
Messerschmidt, A. & Pflugrath, J. W. (1987). Crystal orientation and X-ray pattern prediction routines for area-detector diffraction systems in macromolecular crystallography. J. Appl. Cryst. 20, 306–315.
Minor, W., Cymborowski, M. & Otwinowski, Z. (2002). Automatic system for crystallographic data collection and analysis. Acta Phys. Pol. A, 101, 613–619.
Minor, W., Cymborowski, M., Otwinowski, Z. & Chruszcz, M. (2006). HKL-3000: the integration of data reduction and structure solution – from diffraction images to an initial model in minutes. Acta Cryst. D62, 859–866.
Naday, I., Ross, S., Westbrook, E. M. & Zentai, G. (1998). Charge-coupled device/fiber optic taper array X-ray detector for protein crystallography. Opt. Eng. 37, 1235–1244.
Otwinowski, Z. (1993). Oscillation data reduction program. In Proceedings of the Daresbury CCP4 Study Weekend. Data Reduction and Processing, edited by L. Sawyer, N. Isaacs & S. Bailey, pp. 56–62. Warrington: Daresbury Laboratory.
Otwinowski, Z., Borek, D., Majewski, W. & Minor, W. (2003). Multiparametric scaling of diffraction intensities. Acta Cryst. A59, 228–234.
Otwinowski, Z. & Minor, W. (1997). Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326.
Perrakis, A., Morris, R. & Lamzin, V. S. (1999). Automated protein model building combined with iterative structure refinement. Nat. Struct. Biol. 6, 458–463.
Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. (1989). Numerical Recipes – the Art of Scientific Computing. Cambridge University Press.
Reeke, G. N. Jr (1984). Eigenvalue filtering in the refinement of crystal and orientation parameters for oscillation photography. J. Appl. Cryst. 17, 238–243.
Rossmann, M. G. (1979). Processing oscillation diffraction data for very large unit cells with an automatic convolution technique and profile fitting. J. Appl. Cryst. 12, 225–238.
Sakabe, N. (1991). X-ray diffraction data collection system for modern protein crystallography with a Weissenberg camera and an imaging plate using synchrotron radiation. Nucl. Instrum. Methods A, 303, 448–463.
Schwarzenbach, D., Abrahams, S. C., Flack, H. D., Gonschorek, W., Hahn, Th., Huml, K., Marsh, R. E., Prince, E., Robertson, B. E., Rollett, J. S. & Wilson, A. J. C. (1989). Statistical descriptors in crystallography. Report of the IUCr Subcommittee on Statistical Descriptors. Acta Cryst. A45, 63–75.
Sheldrick, G. M. (2008). A short history of SHELX. Acta Cryst. A64, 112–122.
Steller, I., Bolotovsky, R. & Rossmann, M. G. (1997). An algorithm for automatic indexing of oscillation images using Fourier analysis. J. Appl. Cryst. 30, 1036–1040.
Terwilliger, T. (2004). SOLVE and RESOLVE: automated structure solution, density modification and model building. J. Synchrotron Rad. 11, 49–52.
Tikhonov, A. N. & Arsenin, V. I. A. (1977). Solutions of ill-posed problems. Washington, New York, Winston: Halsted Press.
Vriend, G. & Rossmann, M. G. (1987). Determination of the orientation of a randomly placed crystal from a single oscillation photograph. J. Appl. Cryst. 20, 338–343.
Winkler, F. K., Schutt, C. E. & Harrison, S. C. (1979). The oscillation method for crystals with very large unit cells. Acta Cryst. A35, 901–911.