International
Tables for
Crystallography
Volume F
Crystallography of biological macromolecules
Edited by E. Arnold, D. M. Himmel and M. G. Rossmann

International Tables for Crystallography (2012). Vol. F, ch. 18.2, pp. 466-473   | 1 | 2 |
https://doi.org/10.1107/97809553602060000856

Chapter 18.2. Enhanced macromolecular refinement by simulated annealing

A. T. Brunger,a* P. D. Adamsb and L. M. Ricec

aThe Howard Hughes Medical Institute, and Departments of Molecular and Cellular Physiology, Neurology and Neurological Sciences, and Stanford Synchrotron Radiation Laboratory, Stanford Universty, 1201 Welch Road, MSLS P210, Stanford, CA 94305–5489, USA,bThe Howard Hughes Medical Institute and Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511, USA, and cDepartment of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511, USA
Correspondence e-mail:  brunger@stanford.edu

The analysis of X-diffraction data generally requires sophisticated computational procedures that culminate in refinement and structure validation. The refinement procedure can be formulated as the chemically constrained or restrained nonlinear optimization of a target function, which usually measures the agreement between observed diffraction data and data computed from an atomic model. The ultimate goal of refinement is to simultaneously optimize the agreement of an atomic model with observed diffraction data and with a priori chemical information. Simulated annealing is an optimization technique particularly well suited to overcoming the multiple minima problem. Unlike gradient-descent methods, simulated annealing can cross barriers between minima and thus can explore a greater volume of the parameter space to find better models (deeper minima). Following its introduction to crystallographic refinement, there have been major improvements of the original method in four principal areas: the measure of model quality, the search of the parameter space, the target function and the modelling of conformational variability. These developments are discussed in this chapter.

18.2.1. Introduction

| top | pdf |

The analysis of X-ray diffraction data generally requires sophisticated computational procedures that culminate in refinement and structure validation. The refinement procedure can be formulated as the chemically constrained or restrained nonlinear optimization of a target function, which usually measures the agreement between observed diffraction data and data computed from an atomic model. The ultimate goal of refinement is to optimize simultaneously the agreement of an atomic model with observed diffraction data and with a priori chemical information.

The target function used for this optimization normally depends on several atomic parameters and, most importantly, on atomic coordinates. The large number of adjustable parameters (typically at least three times the number of atoms in the model) gives rise to a very complicated target function. This, in turn, produces what is known as the multiple minima problem: the target function contains many local minima in addition to the global minimum, and this tends to defeat gradient-descent optimization techniques such as conjugate gradient or least-squares methods (Press et al., 1986[link]). These methods are unable to sample molecular conformations thoroughly enough to find the optimal model if the starting one is far from the correct structure.

The challenges of crystallographic refinement arise not only from the high dimensionality of the parameter space, but also from the phase problem. For new crystal structures, initial electron-density maps must be computed from a combination of observed diffraction amplitudes and experimental phases, where the latter are typically of poorer quality and/or at a lower resolution than the former. A different problem arises when structures are solved by molecular replacement (Hoppe, 1957[link]; Rossmann & Blow, 1962[link]), which uses a similar structure as a search model to calculate initial phases. In this case, the resulting electron-density maps can be severely `model-biased', that is, they sometimes seem to confirm the existence of the search model without providing clear evidence of actual differences between it and the true crystal structure. In both cases, initial atomic models usually contain significant errors and require extensive refinement.

Simulated annealing (Kirkpatrick et al., 1983[link]) is an optimization technique particularly well suited to overcoming the multiple minima problem. Unlike gradient-descent methods, simulated annealing can cross barriers between minima and, thus, can explore a greater volume of the parameter space to find better models (deeper minima). Following its introduction to crystallographic refinement (Brünger et al., 1987[link]), there have been major improvements of the original method in four principal areas: the measure of model quality, the search of the parameter space, the target function and the modelling of conformational variability.

For crystallographic refinement, the introduction of cross validation and the free R value (Brünger, 1992[link]) has significantly reduced the danger of overfitting the diffraction data during refinement. Cross validation also produces more realistic coordinate-error estimates based on the Luzzati or [\sigma_{A}] methods (Kleywegt & Brünger, 1996[link]). The complexity of the conformational space has been reduced by the introduction of torsion-angle refinement methods (Diamond, 1971[link]; Rice & Brünger, 1994[link]), which decrease the number of adjustable parameters that describe a model approximately tenfold. The target function has been improved by using a maximum-likelihood approach which takes into account model error, model incompleteness and errors in the experimental data (Bricogne, 1991[link]; Pannu & Read, 1996[link]). Cross validation of parameters for the maximum-likelihood target function was essential in order to obtain better results than with conventional target functions (Pannu & Read, 1996[link]; Adams et al., 1997[link]; Read, 1997[link]). Finally, the sampling power of simulated annealing has been used for exploring the molecule's conformational space in cases where the molecule undergoes dynamic motion or exhibits static disorder (Kuriyan et al., 1991[link]; Burling & Brünger, 1994[link]; Burling et al., 1996[link]).

18.2.2. Cross validation

| top | pdf |

Cross validation (Brünger, 1992[link]) plays a fundamental role in the maximum-likelihood target functions described below. A few remarks about this method are therefore warranted (for reviews see Kleywegt & Brünger, 1996[link]; Brünger, 1997[link]). For cross validation, the diffraction data are divided into two sets: a large working set (usually comprising 90% of the data) and a complementary test set (comprising the remaining 10%). The diffraction data in the working set are used in the normal crystallographic refinement process, whereas the test data are not. The cross-validated (or `free') R value computed with the test-set data is a more faithful indicator of model quality. It provides a more objective guide during the model building and refinement process than the conventional R value. It also ensures that introduction of additional parameters (e.g. water molecules, relaxation of noncrystallographic symmetry restraints, or multi-conformer models) improves the quality of the model, rather than increasing overfitting.

Since the conventional R value shows little correlation with the accuracy of a model, coordinate-error estimates derived from the Luzzati (1952)[link] or [\sigma_{A}] (Read, 1986[link]) methods are unrealistically low. Kleywegt & Brünger (1996)[link] showed that more reliable coordinate errors can be obtained by cross validation of the Luzzati or [\sigma_{A}] coordinate-error estimates. An example is shown in Fig. 18.2.2.1[link] using the crystal structure and diffraction data of penicillopepsin (Hsu et al., 1977[link]). At 1.8 Å resolution, the model has an estimated coordinate error of ∼0.2 Å as assessed by multiple independent refinements. As the resolution of the diffraction data is artificially truncated and the model re-refined, the coordinate error (assessed by the atomic root-mean-square difference to the refined model at 1.8 Å resolution) increases monotonically. The conventional R value improves as the resolution decreases and the quality of the model worsens. Consequently, coordinate-error estimates do not display the correct behaviour either: the error estimates are approximately constant, regardless of the resolution and actual coordinate error of the models. However, when cross validation is used (i.e., the test reflections are used to compute the estimated coordinate errors), the results are much better: the cross-validated errors are close to the actual coordinate error, and they show the correct trend as a function of resolution (Fig. 18.2.2.1)[link].

[Figure 18.2.2.1]

Figure 18.2.2.1 | top | pdf |

Effect of resolution on coordinate-error estimates: accuracy as a function of resolution. Refinements were begun with the crystal structure of penicillopepsin (Hsu et al., 1977[link]) with water molecules omitted and with uniform temperature factors. The low-resolution limit was set to 6 Å. Inclusion of all low-resolution diffraction data does not change the conclusions (Adams et al., 1997[link]). The penicillopepsin diffraction data were artificially truncated to the specified high-resolution limit. Each refinement consisted of simulated annealing using a Cartesian-space slow-cooling protocol starting at 2000 K, overall B-factor refinement and individual restrained B-factor refinement. All refinements were carried out with 10% of the diffraction data randomly omitted for cross validation. (a) Coordinate-error estimates of the refined structures using the methods of Luzzati (1952)[link] and Read (1986)[link]. All observed diffraction data were used, i.e. no cross validation was performed. The actual coordinate errors (r.m.s. differences to the original crystal structure) are shown for comparison. (b) Cross-validated coordinate-error estimates. The test set was used to compute the coordinate-error estimates (Kleywegt & Brünger, 1996[link]).

18.2.3. The target function

| top | pdf |

Crystallographic refinement is a search for the global minimum of the target [E = E_{\rm chem} + w_{\rm X{\hbox{-}}ray}E_{\rm X{\hbox{-}}ray} \eqno(18.2.3.1)]as a function of the parameters of an atomic model, in particular, atomic coordinates. [E_{\rm chem}] comprises empirical information about chemical interactions; it is a function of all atomic positions, describing covalent (bond lengths, bond angles, torsion angles, chiral centres and planarity of aromatic rings) and non-bonded (intramolecular as well as intermolecular and symmetry-related) interactions (Hendrickson, 1985[link]). [E_{\rm X{\hbox{-}}ray}] is related to the difference between observed and calculated data, and [w_{\rm X{\hbox{-}}ray}] is a weight appropriately chosen to balance the gradients (with respect to atomic parameters) arising from the two terms.

18.2.3.1. X-ray diffraction data versus model

| top | pdf |

The traditional form of [E_{\rm X{\hbox{-}}ray}] consists of the crystallographic residual, [E^{\rm LSQ}], defined as the sum over the squared differences between the observed [(|{\bf F}_{o}|)] and calculated [(|{\bf F}_{c}|)] structure-factor amplitudes for a particular atomic model: [E_{\rm X{\hbox{-}}ray} = E^{\rm LSQ} = \textstyle{\sum\limits_{hkl\in {\rm working \,\, set}}} (|{\bf F}_{o}| - k|{\bf F}_{c}|)^{2}, \eqno(18.2.3.2)]where hkl are the indices of the reciprocal-lattice points of the crystal and k is a relative scale factor.

Minimization of [E^{\rm LSQ}] can produce improvement in the atomic model, but it can also accumulate systematic errors in the model by fitting noise in the diffraction data (Silva & Rossmann, 1985[link]). The least-squares residual is a limiting case of the more general maximum-likelihood theory and is only justified if the model is nearly complete and error-free. These assumptions may be violated during the initial stages of refinement. Improved targets for macromolecular refinement have been obtained using the more general maximum-likelihood formulation (Bricogne, 1991[link]; Pannu & Read, 1996[link]; Adams et al., 1997[link]; Murshudov et al., 1997[link]). The goal of the maximum-likelihood method is to determine the likelihood of the model, given estimates of the model's errors and those of the measured intensities.

A starting point for the maximum-likelihood formulation of crystallographic refinement is the Sim (1959)[link] distribution, i.e., the Gaussian conditional probability distribution of the `true' structure factors, F, given a partial model with structure factors [{\bf F}_{c}] and the model's error (Fig. 18.2.3.1)[link] (Srinivasan, 1966[link]; Read, 1986[link], 1990[link]) (for simplicity we will only discuss the case of acentric reflections), [P_{a} ({\bf F}\semi {\bf F}_{c}) = (1/\pi \varepsilon \sigma_{\Delta}^{2}) \exp [- ({\bf F} - D{\bf F}_{c})^{2} / \varepsilon \sigma_{\Delta}^{2}], \eqno(18.2.3.3)]where [\sigma_{\Delta}] is a parameter that incorporates the effect of the fraction of the asymmetric unit that is missing from the model and errors in the partial structure. Assuming a Wilson distribution of intensities, it can be shown that (Read, 1990[link]) [\sigma_{\Delta}^{2} = \langle |{\bf F}_{o}|^{2}\rangle - D^{2} \langle |{\bf F}_{c}|^{2}\rangle, \eqno(18.2.3.4)]where D is a factor that takes into account model error: it is unity in the limiting case of an error-free model and it is zero if no model is available (Luzzati, 1952[link]; Read, 1986[link]). For a complete and error-free model, [\sigma_{\Delta}] therefore becomes zero, and the probability distribution, [P_{a}(\hbox{{\bf F}; {\bf F}}_{c})], is infinitely sharp.

[Figure 18.2.3.1]

Figure 18.2.3.1 | top | pdf |

The Gaussian probability distribution forms the basis of maximum-likelihood targets in crystallographic refinement. The conditional probability of the true structure factor, F, given model structure factors, is a Gaussian in the complex plane [equation (18.2.3.3)[link]]. The expected value of the probability distribution is [D\hbox{\bf F}_{c}] with variance [\sigma_{\Delta}], where D and [\sigma_{\Delta}] account for missing or incorrectly placed atoms in the model.

Taking measurement errors into account requires multiplication of equation (18.2.3.3)[link] with an appropriate probability distribution (usually a conditional Gaussian distribution with standard deviation [\sigma_{o}]) of the observed structure-factor amplitudes [(|{\bf F}_{o}|)] around the `true' structure-factor amplitudes [(|{\bf F}|)], [P_{\rm meas} (|{\bf F}_{o}|\semi |{\bf F}|). \eqno(18.2.3.5)]

Prior knowledge of the phases of the structure factors can be incorporated by multiplying equation (18.2.3.3)[link] with a phase probability distribution [P_{\rm phase} (\varphi) \eqno(18.2.3.6)]and rewriting equation (18.2.3.3)[link] in terms of the structure-factor moduli and amplitudes of [{\bf F} = |{\bf F}| \exp (i\varphi)].

The unknown variables [|{\bf F}|] and ϕ in equations (18.2.3.3)[link]–(18.2.3.5)[link] [link] have to be eliminated by integration in order to obtain the conditional probability distribution of the observed structure-factor amplitudes, given a partial model with errors, the amplitude measurement errors and prior phase information: [\eqalignno{P_{a} (|{\bf F}_{o}|\semi {\bf F}_{c}) &= (1/\pi \varepsilon \sigma_{\Delta}^{2}) \textstyle{\int} \hbox{d}\varphi \,\hbox{d} |{\bf F}|\semi |{\bf F}| P_{\rm meas} (|{\bf F}_{o}|\semi |{\bf F}|)&\cr &\quad\times P_{\rm phase} (\varphi) \exp \left\{ - [|{\bf F}| \exp (i\varphi) - D{\bf F}_{o}]^{2}/\varepsilon \sigma_{\Delta}^{2}\right\}.\cr &&(18.2.3.7)}]

The likelihood, [{\cal L}], of the model is defined as the joint probability distribution of the structure factors of all reflections in the working set. Assuming independent and uncorrelated structure factors, [{\cal L}] is simply the product of the distributions in equation (18.2.3.7)[link] for all reflections. Instead of maximizing the likelihood, it is more common to minimize the negative logarithm of the likelihood, [E_{\rm X{\hbox{-}}ray} = {\cal L} = - \textstyle{\sum\limits_{hkl\in {\rm working \, set}}} \log [P_{a} (|{\bf F}_{o}|\semi {\bf F}_{c})]. \eqno(18.2.3.8)]

Empirical estimates of [\sigma_{\Delta}] [and D through equation (18.2.3.4)[link]] can be obtained by minimizing [{\cal L}] for a particular atomic model. It is generally assumed that [\sigma_{\Delta}] and D show relatively little variation among neighbouring reflections. Accepting this assumption, [\sigma_{\Delta}] and D can be estimated by considering narrow resolution shells of reflections and assuming that the two parameters are constant in these shells. Minimization of [{\cal L}] can then be performed as a function of these constant shell parameters while keeping the atomic model fixed (Read, 1986[link], 1997[link]). Alternatively, one can assume a two-term Gaussian model for [\sigma_{\Delta}] (Murshudov et al., 1997[link]) and minimize [{\cal L}] as a function of the Gaussian parameters. Note that individual atomic B factors are taken into account by the calculated model structure factors [({\bf F}_{c})].

This empirical approach to estimate [\sigma_{\Delta}] and D requires occasional recomputation of these values as the model improves. Refinement methods that improve the model structure factors, [{\bf F}_{c}], will therefore have a beneficial effect on [\sigma_{\Delta}] and D. Better estimates of these values will then enhance the next refinement cycle. Thus, powerful optimization methods and maximum-likelihood targets are expected to interact in a synergistic fashion (cf. Fig. 18.2.5.1[link]). Structure-factor averaging of multi-start refinement models can provide another layer of improvement by producing a better description of [{\bf F}_{c}] if the model shows significant variability due to errors or intrinsic flexibility (see below).

In order to achieve an improvement over the least-squares residual [equation (18.2.3.2[link])], cross validation was found to be essential (Pannu & Read, 1996[link]; Adams et al., 1997[link]) for the estimation of model incompleteness and errors ([\sigma_{\Delta}] and D). Since the test set typically contains only 10% of the diffraction data, these cross-validated quantities can show significant statistical fluctuations as a function of resolution. In order to reduce these fluctuations, Read (1997)[link] devised a smoothing method by applying restraints to [\sigma_{A}] values between neighbouring resolution shells where [\sigma_{A} = \left[1 - (\sigma_{\Delta} / \langle |{\bf F}_{o}|^{2}\rangle )\right]^{1/2}. \eqno(18.2.3.9)]

Pannu & Read (1996)[link] have developed an efficient Gaussian approximation of equation (18.2.3.7)[link] in cases of no prior phase information, termed the `MLF' target function. In the limit of a perfect model (i.e. [\sigma_{\Delta} = 0] and [D = 1]), MLF reduces to the traditional least-squares residual [equation (18.2.3.2)[link]] with [1/\sigma_{o}^{2}] weighting. In the case of prior phase information, the integration over the phase angles has been carried out numerically in equation (18.2.3.7)[link], termed the `MLHL' target (Pannu et al., 1998[link]). A maximum-likelihood function which expresses equation (18.2.3.7)[link] in terms of observed intensities has also been developed, termed `MLI' (Pannu & Read, 1996[link]).

18.2.3.2. A priori chemical information

| top | pdf |

The parameters for the covalent terms in [E_{\rm chem}] [equation (18.2.3.1)[link]] can be derived from the average geometry and (r.m.s.) deviations observed in a small-molecule database. Extensive statistical analyses were undertaken for the chemical moieties of proteins (Engh & Huber, 1991[link]) and polynucleotides (Parkinson et al., 1996[link]) using the Cambridge Structural Database (Allen et al., 1983[link]). Analysis of the ever-increasing number of atomic resolution macromolecular crystal structures will no doubt cause some modifications of these parameters in the future.

It is common to use a purely repulsive quartic function [(E_{\rm repulsive})] for the non-bonded interactions that are included in [E_{\rm chem}] (Hendrickson, 1985[link]): [E_{\rm repulsive} = \textstyle{\sum\limits_{ij}} [(cR_{ij}^{\min})^{n} - R_{ij}^{n}]^{m}, \eqno(18.2.3.10)]where [R_{ij}] is the distance between two atoms i and j, [R_{ij}^{\min}] is the van der Waals radius for a particular atom pair ij, [c \leq 1] is a constant that is sometimes used to reduce the radii, and n = 2, m = 2 or n = 1, m = 4. van der Waals attraction and electrostatic interactions are usually not included in crystallographic refinement. These simplifications are valid since the diffraction data contain information that is able to produce atomic conformations consistent with actual non-bonded interactions. In fact, atomic resolution crystal structures can be used to derive parameters for electrostatic charge distributions (Pearlman & Kim, 1990[link]).

18.2.4. Searching conformational space

| top | pdf |

Annealing denotes a physical process wherein a solid is heated until all particles randomly arrange themselves in a liquid phase and is then cooled slowly so that all particles arrange themselves in the lowest energy state. By formally defining the target, E [equation (18.2.3.1)[link]], to be the equivalent of the potential energy of the system, one can simulate such an annealing process (Kirkpatrick et al., 1983[link]). There is no guarantee that simulated annealing will find the global minimum (Laarhoven & Aarts, 1987[link]). However, compared to conjugate-gradient minimization, where search directions must follow the gradient, simulated annealing achieves more optimal solutions by allowing motion against the gradient (Kirkpatrick et al., 1983[link]). The likelihood of uphill motion is determined by a control parameter referred to as temperature. The higher the temperature, the more likely it is that simulated annealing will overcome barriers (Fig. 18.2.4.1)[link]. It should be noted that the simulated-annealing temperature normally has no physical meaning and merely determines the likelihood of overcoming barriers of the target function in equation (18.2.3.1)[link].

[Figure 18.2.4.1]

Figure 18.2.4.1 | top | pdf |

Illustration of simulated annealing for minimization of a one-dimensional function. The kinetic energy of the system (a `ball' rolling on the one-dimensional surface) allows local conformational transitions with barriers smaller than the kinetic energy. If a larger drop in energy is encountered, the excess kinetic energy is dissipated. It is thus unlikely that the system can climb out of the global minimum once it has reached it.

The simulated-annealing algorithm requires a mechanism to create a Boltzmann distribution at a given temperature, T, and an annealing schedule, that is, a sequence of temperatures [T_{1} \geq T_{2} \geq \ldots \geq T_{l}] at which the Boltzmann distribution is computed. Implementations differ in the way they generate a transition, or move, from one set of parameters to another that is consistent with the Boltzmann distribution at a given temperature. The two most widely used methods are Metropolis Monte Carlo (Metropolis et al., 1953[link]) and molecular dynamics (Verlet, 1967[link]) simulations. For X-ray crystallographic refinement, molecular dynamics has proven extremely successful (Brünger et al., 1987[link]) because it limits the search to physically reasonable `moves'.

18.2.4.1. Molecular dynamics

| top | pdf |

A suitably chosen set of atomic parameters can be viewed as generalized coordinates that are propagated in time by the classical equations of motion (Goldstein, 1980[link]). If the generalized coordinates represent the x, y, z positions of the atoms of a molecule, the classical equations of motion reduce to the familiar Newton's second law: [m_{i} {\partial^{2}{\bf r}_{i} \over \partial t^{2}} = -\nabla_{i}E. \eqno(18.2.4.1)]The quantities [m_{i}] and [{\bf r}_{i}] are, respectively, the mass and coordinates of atom i, and E is given by equation (18.2.3.1)[link]. The solution of the partial differential equations (18.2.4.1)[link] can be achieved numerically using finite-difference methods (Verlet, 1967[link]; Abramowitz & Stegun, 1968[link]). This approach is referred to as molecular dynamics.

Initial velocities for the integration of equation (18.2.4.1)[link] are usually assigned randomly from a Maxwell distribution at the appropriate temperature. Assignment of different initial velocities will generally produce a somewhat different structure after simulated annealing. By performing several refinements with different initial velocities, one can therefore improve the chances of success of simulated-annealing refinement. Furthermore, this improved sampling can be used to study discrete disorder and conformational variability, especially when using torsion-angle molecular dynamics (see below).

Although Cartesian (i.e. flexible bond lengths and bond angles) molecular dynamics places restraints on bond lengths and bond angles [through [E_{\rm chem}], equation (18.2.3.1)[link]], one might want to implement these restrictions as constraints, i.e., fixed bond lengths and bond angles (Diamond, 1971[link]). This is supported by the observation that the deviations from ideal bond lengths and bond angles are usually small in macromolecular X-ray crystal structures. Indeed, fixed-length constraints have been applied to crystallographic refinement by least-squares minimization (Diamond, 1971[link]). It is only recently, however, that efficient and robust algorithms have become available for molecular dynamics in torsion-angle space (Bae & Haug, 1987[link], 1988[link]; Jain et al., 1993[link]; Rice & Brünger, 1994[link]). We chose an approach that retains the Cartesian-coordinate formulation of the target function and its derivatives with respect to atomic coordinates, so that the calculation remains relatively straightforward and can be applied to any macromolecule or their complexes (Rice & Brünger, 1994[link]). In this formulation, the expression for the acceleration becomes a function of positions and velocities. Iterative equations of motion for constrained dynamics in this formulation can be derived and solved by finite-difference methods (Abramowitz & Stegun, 1968[link]). This method is numerically very robust and has a significantly increased radius of convergence in crystallographic refinement compared to Cartesian molecular dynamics (Rice & Brünger, 1994[link]).

18.2.4.2. Temperature control

| top | pdf |

Simulated annealing requires the control of the temperature during molecular dynamics. The current temperature of the simulation [(T_{\rm curr})] is computed from the kinetic energy [E_{\rm kin} = \sum\limits_{i}^{n}{\textstyle{1 \over 2}} m_{i} {\displaystyle\left({\partial r_{i} \over \partial t}\right)}^{2} \eqno(18.2.4.2)]of the molecular-dynamics simulation, [T_{\rm curr} = 2E_{\rm kin}/3nk_{B}. \eqno(18.2.4.3)]Here, n is the number of atoms, [m_{i}] is the mass of the atom and [k_{B}] is Boltzmann's constant. One commonly used approach to control the temperature of the simulation consists of coupling the equations of motion to a heat bath through a `friction' term (Berendsen et al., 1984[link]). Another approach is to rescale periodically the velocities in order to match [T_{\rm curr}] with the target temperature.

18.2.4.3. Annealing schedules

| top | pdf |

The simulated-annealing temperature needs to be high enough to allow conformational transitions, but not so high that the model moves too far away from the correct structure. The optimal temperature for a given starting structure is a matter of trial and error. Starting temperatures that work for the average case have been determined for a variety of simulated-annealing protocols (Brünger, 1988[link]; Adams et al., 1997[link]). However, it might be worth trying a different temperature if a particularly difficult refinement problem is encountered. In particular, significantly higher temperatures are attainable using torsion-angle molecular dynamics. Note that each simulated-annealing refinement is subject to `chance' by using a random-number generator to generate the initial velocities. Thus, multiple simulated annealing runs can be carried out in order to increase the success rate of the refinement. The best structure(s) (as determined by the free R value) among a set of refinements using different initial velocities and/or temperatures can be taken for further refinement or structure-factor averaging (see below).

The annealing schedule can, in principle, be any function of the simulation step (or `time' domain). The two most commonly used protocols are linear slow-cooling or constant-temperature followed by quenching. A slight advantage is obtained with slow cooling (Brünger et al., 1990[link]). The duration of the annealing schedule is another parameter. Too short a protocol does not allow sufficient sampling of conformational space. Too long a protocol may waste computer time, since it is more efficient to run multiple trials than one long refinement protocol (unpublished results).

18.2.4.4. An intuitive explanation of simulated annealing

| top | pdf |

The goal of any optimization problem is to find the global minimum of a target function. In the case of crystallographic refinement, one searches for the conformation or conformations of the molecule that best fit the diffraction data and that simultaneously maintain reasonable covalent and non-covalent interactions. Simulated-annealing refinement has a much larger radius of convergence than conjugate-gradient minimization (see below). It must, therefore, be able to find a lower minimum of the target E [equation (18.2.3.1)[link]] than the local minimum found by simply moving along the negative gradient of E.

It is most easy to visualize this property of simulated annealing in the case of a one-dimensional problem, where the goal is to find the global minimum of a function with multiple minima (Fig. 18.2.4.1)[link]. An intuitive way to understand a molecular-dynamics simulation is to envisage a ball rolling on this one-dimensional surface. When the ball is far from the global minimum, it gains a certain momentum which allows it to cross barriers of the target function [equation (18.2.4.3)[link]]. Slow-cooling temperature control ensures that the ball will eventually reach the global minimum rather than just bouncing across the surface. The initial temperature must be large enough to overcome smaller barriers, but low enough to ensure that the system will not escape the global minimum if it manages to arrive there.

While temperature itself is a global parameter of the system, temperature fluctuations arise principally from local conformational transitions, for example, from an amino-acid side chain falling into the correct orientation. These local changes tend to lower the value of the target E, thus increasing the kinetic energy, and hence the temperature, of the system. Once the temperature control has removed this excess kinetic energy through `heat dissipation', the reverse transition is very unlikely, since it would require a localized increase in kinetic energy where the conformational change occurred in the first place (Fig. 18.2.4.1)[link]. Temperature control maintains a sufficient amount of kinetic energy to allow local conformational corrections, but does not supply enough to allow escape from the global minimum. This explains the observation that, on average, the agreement with the diffraction data will improve, rather than worsen, with simulated annealing.

18.2.5. Examples

| top | pdf |

Many examples have shown that simulated-annealing refinement starting from initial models (obtained by standard crystallographic techniques) produces significantly better final models compared to those produced by least-squares or conjugate-gradient minimization (Brünger et al., 1987[link]; Brünger, 1988[link]; Fujinaga et al., 1989[link]; Kuriyan et al., 1989[link]; Rice & Brünger, 1994[link]; Adams et al., 1997[link]). In another realistic test case (Adams et al., 1999[link]), a series of models for the aspartic proteinase penicillopepsin were generated from homologous structures present in the Protein Data Bank. The sequence identity among these structures ranged from 100% to 25%, thus providing a set of models with increasing coordinate error compared to the refined structure of penicillopepsin. These models, after truncation of all residues to alanine, were all used as search models in molecular replacement against the native penicillopepsin diffraction data. In all cases, the correct placement of the model in the penicillopepsin unit cell was found.

Both conjugate-gradient minimization and simulated annealing were carried out in order to compare the performance of the [E^{\rm LSQ}] least-squares residual [equation (18.2.3.2)[link]], MLF (the maximum-likelihood target using amplitudes) and MLHL (the maximum-likelihood target using amplitudes and experimental phase information). In the latter case, phases from single isomorphous replacement (SIR) were used. A very large number of conjugate-gradient cycles were carried out in order to make the computational requirements equivalent for both minimization and simulated annealing. The conjugate-gradient minimizations were converged, i.e. there was no change when further cycles were carried out.

For a given target function, simulated annealing always out­performed minimization (Fig. 18.2.5.1)[link]. For a given starting model, the maximum-likelihood targets outperformed the least-squares-residual target for both minimization and simulated annealing, producing models with lower phase errors and higher map correlation coefficients when compared with the published penicillopepsin crystal structure (Fig. 18.2.5.1)[link]. This improvement is illustrated in [\sigma_{A}]-weighted electron-density maps obtained from the resulting models (Fig. 18.2.5.2)[link]. The incorporation of experimental phase information further improved the refinement significantly despite the ambiguity in the SIR phase probability distributions. Thus, the most efficient refinement will make use of simulated annealing and phase information in the MLHL maximum-likelihood target function.

[Figure 18.2.5.1]

Figure 18.2.5.1 | top | pdf |

Simulated annealing produces better models than extensive conjugate-gradient minimization. Map correlation coefficients were computed before and after refinement against the native penicillopepsin diffraction data (Hsu et al., 1977[link]) for the polyalanine model derived from Rhizopuspepsin (Suguna et al., 1987[link], PDB code 2APR). Correlation coefficients are between [\sigma_{A}]-weighted maps calculated from each model and from the published penicillopepsin structure. The observed penicillopepsin diffraction data were in space group [C{2}] with cell dimensions [a = 97.37], [b = 46.64], [c = 65.47] Å and [\beta = 115.4^{\circ}]. All refinements were carried out using diffraction data from the lowest-resolution limit of 22.0 Å up to 2.0 Å. The MLHL refinements used single isomorphous phases from a K3UO2F5 derivative of the penicillopepsin crystal structure, which covered a resolution range of 22.0 Å to 2.8 Å. Simulated-annealing refinements were repeated five times with different initial velocities. The numerical averages of the map correlation coefficients for the five refinements are shown as hashed bars. The best map correlation coefficients from simulated annealing are shown as white bars.

[Figure 18.2.5.2]

Figure 18.2.5.2 | top | pdf |

Maximum-likelihood targets significantly decrease model bias in simulated-annealing refinement. [\sigma_{A}]-weighted electron-density maps contoured at 1.25σ for models from simulated-annealing refinement with different targets are shown. Residues 233 to 237 are shown for the published penicillopepsin crystal structure (Hsu et al., 1977[link]) as solid lines, and for the model with the lowest free R value from five independent refinements as dashed lines.

Cross validation is essential in the calculation of the maximum-likelihood target (Kleywegt & Brünger, 1996[link]; Pannu & Read, 1996[link]; Adams et al., 1997[link]). Maximum-likelihood refinement without cross validation gives much poorer results, as indicated by higher free R values, [R_{\rm free} - R] differences and phase errors (Adams et al., 1997[link]). It should be noted that the final normal R value is in general increased, compared to refinements with the least-squares target, when using the cross-validated maximum-likelihood formulation. This is a consequence of the reduction of overfitting by this method.

18.2.6. Multi-start refinement and structure-factor averaging

| top | pdf |

Multiple simulated-annealing refinements starting from the same model, termed `multi-start' refinement, will generally produce somewhat different structures. Even well refined structures will show some variation consistent with the estimated coordinate error of the model (cf. results for 1.8 Å resolution in Fig. 18.2.2.1[link]). More importantly, the poorer the model, the more variation is observed (Brünger, 1988[link]). Some of the models resulting from multi-start refinement may be better than others, for example, as judged by the free R value. Thus, if computer time is available, multi-start refinement has several advantages. A more optimal single model than that produced by a single simulated-annealing calculation can usually be obtained. Furthermore, each separate model coming from a multi-start refinement fits the data slightly differently. This could be the result of intrinsic flexibility in the molecule (see below) or the result of model-building error. Regions in the starting model that contain significant errors often show increased variability after multi-start refinement, and a visual inspection of the ensemble of models produced can be helpful in identifying these incorrectly modelled regions.

To better identify the correct conformation, structure factors from each of the models can be averaged (Rice et al., 1998[link]). This averaging tends to reduce the effect of local errors (noise) that are presumably different in each member of the family. The average structure factor can produce phases that contain less model bias than phases computed from a single model. It should also produce better estimates of [\sigma_{\Delta}] and D for maximum-likelihood targets and for [\sigma_{A}]-weighted electron-density maps, because [{\bf F}_{c}] is used in the computation of these parameters [equation (18.2.3.7)[link]]. Because it is inherently a noise-reducing technique, multi-start refinement followed by structure-factor averaging should be most useful in situations where there is significant noise, namely when the data-to-parameter ratio is low (e.g. if only moderate-resolution diffraction data are available).

18.2.7. Ensemble models

| top | pdf |

In cases of conformational variability or discrete disorder, there is not one single correct solution to the global minimization of equation (18.2.3.1)[link]. Rather, the X-ray diffraction data represent a spatial and temporal average over all conformations that are assumed by the molecule. Ensembles of structures, which are simultaneously refined against the observed data, may thus be a more appropriate description of the diffraction data. This has been used for some time when alternate conformations are modelled locally. Alternate conformations can be generalized to global conformations (Gros et al., 1990[link]; Kuriyan et al., 1991[link]; Burling & Brünger, 1994[link]), i.e., the model is duplicated n-fold, the calculated structure factors corresponding to each copy of the model are summed, and this composite structure factor is refined against the observed X-ray diffraction data. Each member of the family is chemically `invisible' to all other members. The optimal number, n, can be determined by cross validation (Burling & Brünger, 1994[link]; Burling et al., 1996[link]).

An advantage of a multi-conformer model is that it directly incorporates many possible types of disorder and motion (global disorder, local side-chain disorder, local wagging and rocking motions). Furthermore, it can be used to detect automatically the most variable regions of the molecule by inspecting the atomic r.m.s. difference around the mean as a function of residue number. Thermal factors of single-conformer models may some­times be misleading because they underestimate the degree of motion or disorder (Kuriyan et al., 1986[link]), and, thus, the multiple-conformer model can be a more faithful representation of the diffraction data. A disadvantage of the multi-conformer model is that it introduces many more parameters in the refinement.

Although there are some similarities between averaging structure factors of individually refined structures and performing multi-conformer refinement, there are also fundamental differences. For example, multi-start averaging seeks to improve the calculated electron-density map by averaging out the noise present in the individual models, each of which is still a good representation of the diffraction data. This method is most useful at the early stages of refinement when the model still contains errors. In contrast, multi-conformer refinement seeks to create an ensemble of structures at the final stages of refinement which, taken together, best represent the data. It should be noted that each individual conformer of the ensemble does not necessarily remain a good description of the diffraction data, since the whole ensemble is refined against the data. Clearly, multi-conformer refinement requires a high observable-to-parameter ratio.

18.2.8. Conclusions

| top | pdf |

Simulated annealing has dramatically improved the efficiency of crystallographic refinement. A case in point is the combination of torsion-angle molecular dynamics with cross-validated maximum-likelihood targets. These two independent developments interact synergistically to produce less model bias than any other method to date. The combined method dramatically increases the radius of convergence, allowing the productive refinement of poor initial models, e.g. those obtained by weak molecular-replacement solutions (Rice & Brünger, 1994[link]; Adams et al., 1997[link], 1999[link]).

Simulated annealing can also be used to provide new physical insights into molecular function which may depend on conformational variability. The sampling characteristics of simulated annealing allow the generation of multi-conformer models that can represent molecular motion and discrete disorder, especially when combined with the acquisition of high-quality data (Burling et al., 1996[link]). Thus, simulated annealing is also a stepping stone towards development of improved models of macromolecules in solution and in the crystalline state.

The computational developments discussed in this review are implemented in the software suite Crystallography & NMR System (Brunger et al., 1998[link]). A pre-release of the software suite is available upon request.

Acknowledgements

LMR is an HHMI predoctoral fellow. This work was funded in part by grants from the National Science Foundation to ATB (BIR 9514819 and ASC 93–181159).

References

Abramowitz, M. & Stegun, I. (1968). Handbook of Mathematical Functions. Applied Mathematics Series, Vol. 55, p. 896. New York: Dover Publications.
Adams, P. D., Pannu, N. S., Read, R. J. & Brünger, A. T. (1997). Cross-validated maximum likelihood enhances crystallographic simulated annealing refinement. Proc. Natl Acad. Sci. USA, 94, 5018–5023.
Adams, P. D., Pannu, N. S., Read, R. J. & Brunger, A. T. (1999). Extending the limits of molecular replacement through combined simulated annealing and maximum-likelihood refinement. Acta Cryst. D55, 181–190.
Allen, F. H., Kennard, O. & Taylor, R. (1983). Systematic analysis of structural data as a research technique in organic chemistry. Acc. Chem. Res. 16, 146–153.
Bae, D.-S. & Haug, E. J. (1987). A recursive formulation for constrained mechanical system dynamics: Part I. Open loop systems. Mech. Struct. Mach. 15, 359–382.
Bae, D.-S. & Haug, E. J. (1988). A recursive formulation for constrained mechanical system dynamics: Part II. Closed loop systems. Mech. Struct. Mach. 15, 481–506.
Berendsen, H. J. C., Postma, J. P. M., van Gunsteren, W. F., DiNola, A. & Haak, J. R. (1984). Molecular dynamics with coupling to an external bath. J. Chem. Phys. 81, 3684–3690.
Bricogne, G. (1991). A multisolution method of phase determination by combined maximization of entropy and likelihood. III. Extension to powder diffraction data. Acta Cryst. A47, 803–829.
Brunger, A. T. (1988). Crystallographic refinement by simulated annealing: application to a 2.8 Å resolution structure of aspartate aminotransferase. J. Mol. Biol. 203, 803–816.
Brunger, A. T. (1992). The free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature (London), 355, 472–474.
Brunger, A. T. (1997). Free R value: cross-validation in crystallography. Methods Enzymol. 277, 366–396.
Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T. & Warren, G. L. (1998). Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Cryst. D54, 905–921.
Brunger, A. T., Krukowski, A. & Erickson, J. W. (1990). Slow-cooling protocols for crystallographic refinement by simulated annealing. Acta Cryst. A46, 585–593.
Brunger, A. T., Kuriyan, J. & Karplus, M. (1987). Crystallographic R factor refinement by molecular dynamics. Science, 235, 458–460.
Burling, F. T. & Brunger, A. T. (1994). Thermal motion and conformational disorder in protein crystal structures: comparison of multi-conformer and time-averaging models. Isr. J. Chem. 34, 165–175.
Burling, F. T., Weis, W. I., Flaherty, K. M. & Brunger, A. T. (1996). Direct observation of protein solvation and discrete disorder with experimental crystallographic phases. Science, 271, 72–77.
Diamond, R. (1971). A real-space refinement procedure for proteins. Acta Cryst. A27, 436–452.
Engh, R. A. & Huber, R. (1991). Accurate bond and angle parameters for X-ray structure refinement. Acta Cryst. A47, 392–400.
Fujinaga, M., Gros, P. & van Gunsteren, W. F. (1989). Testing the method of crystallographic refinement using molecular dynamics. J. Appl. Cryst. 22, 1–8.
Goldstein, H. (1980). Classical Mechanics. 2nd ed. Reading, Massachusetts: Addison-Wesley.
Gros, P., van Gunsteren, W. F. & Hol, W. G. J. (1990). Inclusion of thermal motion in crystallographic structures by restrained molecular dynamics. Science, 249, 1149–1152.
Hendrickson, W. A. (1985). Stereochemically restrained refinement of macromolecular structures. Methods Enzymol. 115, 252–270.
Hoppe, W. (1957). Die `Faltmolekülmethode' – eine neue Methode zur Bestimmung der Kristallstruktur bei ganz oder teilweise bekannter Molekülstruktur. Acta Cryst. 10, 750–751.
Hsu, I. N., Delbaere, L. T. J., James, M. N. G. & Hoffman, T. (1977). Penicillopepsin from Penicillium janthinellum crystal structure at 2.8 Å and sequence homology with porcine pepsin. Nature (London), 266, 140–145.
Jain, A., Vaidehi, N. & Rodriguez, G. (1993). A fast recursive algorithm for molecular dynamics simulation. J. Comput. Phys. 106, 258–268.
Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Jr (1983). Optimization by simulated annealing. Science, 220, 671–680.
Kleywegt, G. J. & Brunger, A. T. (1996). Cross-validation in crystallography: practice and applications. Structure, 4, 897–904.
Kuriyan, J., Brunger, A. T., Karplus, M. & Hendrickson, W. A. (1989). X-ray refinement of protein structures by simulated annealing: test of the method on myohemerythrin. Acta Cryst. A45, 396–409.
Kuriyan, J., Ösapay, K., Burley, S. K., Brunger, A. T., Hendrickson, W. A. & Karplus, M. (1991). Exploration of disorder in protein structures by X-ray restrained molecular dynamics. Proteins, 10, 340–358.
Kuriyan, J., Petsko, G. A., Levy, R. M. & Karplus, M. (1986). Effect of anisotropy and anharmonicity on protein crystallographic refinement. J. Mol. Biol. 190, 227–254.
Laarhoven, P. J. M. & Aarts, E. H. L. (1987). Editors. Simulated Annealing: Theory and Applications. Dordrecht: D. Reidel Publishing Company.
Luzzati, V. (1952). Traitement statistique des erreurs dans la determination des structures cristallines. Acta Cryst. 5, 802–810.
Metropolis, N., Rosenbluth, M., Rosenbluth, A., Teller, A. & Teller, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 1087–1092.
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Cryst. D53, 240–255.
Pannu, N. S., Murshudov, G. N., Dodson, E. J. & Read, R. J. (1998). Incorporation of prior phase information strengthens maximum-likelihood structure refinement. Acta Cryst. D54, 1285–1294.
Pannu, N. S. & Read, R. J. (1996). Improved structure refinement through maximum likelihood. Acta Cryst. A52, 659–668.
Parkinson, G., Vojtechovsky, J., Clowney, L., Brunger, A. T. & Berman, H. M. (1996). New parameters for the refinement of nucleic acid-containing structures. Acta Cryst. D52, 57–64.
Pearlman, D. A. & Kim, S.-H. (1990). Atomic charges for DNA constituents derived from single-crystal X-ray diffraction data. J. Mol. Biol. 211, 171–187.
Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. (1986). Editors. Numerical Recipes, pp. 498–546. Cambridge University Press.
Read, R. J. (1986). Improved Fourier coefficients for maps using phases from partial structures with errors. Acta Cryst. A42, 140–149.
Read, R. J. (1990). Structure-factor probabilities for related structures. Acta Cryst. A46, 900–912.
Read, R. J. (1997). Model phases: probabilities and bias. Methods Enzymol. 278, 110–128.
Rice, L. M. & Brunger, A. T. (1994). Torsion angle dynamics: reduced variable conformational sampling enhances crystallographic structure refinement. Proteins Struct. Funct. Genet. 19, 277–290.
Rice, L. M., Shamoo, Y. & Brunger, A. T. (1998). Phase improvement by multi-start simulated annealing refinement and structure-factor averaging. J. Appl. Cryst. 31, 798–805.
Rossmann, M. G. & Blow, D. M. (1962). The detection of sub-units within the crystallographic asymmetric unit. Acta Cryst. 15, 24–51.
Silva, A. M. & Rossmann, M. G. (1985). The refinement of southern bean mosaic virus in reciprocal space. Acta Cryst. B41, 147–157.
Sim, G. A. (1959). The distribution of phase angles for structures containing heavy atoms. II. A modification of the normal heavy-atom method for non-centrosymmetrical structures. Acta Cryst. 12, 813–815.
Srinivasan, R. (1966). Weighting functions for use in the early stages of structure analysis when a part of the structure is known. Acta Cryst. 20, 143–144.
Suguna, K., Bott, R. R., Padlan, E. A., Subramanian, E., Sheriff, S., Cohen, G. H. & Davies, D. R. (1987). Structure and refinement at 1.8 Å resolution of the aspartic proteinase from Rhizopus chinensis. J. Mol. Biol. 196, 877–900.
Verlet, L. (1967). Computer experiments on classical fluids. I. Thermodynamical properties of Lennard–Jones molecules. Phys. Rev. 159, 98–105.








































to end of page
to top of page