International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by M. G. Rossmann and E. Arnold. © International Union of Crystallography 2006

International Tables for Crystallography (2006). Vol. F, ch. 13.3, pp. 275-278

https://doi.org/10.1107/97809553602060000683

## Chapter 13.3. Translation functions

Translation functions are used to determine the positions of search models by the molecular-replacement method. The functions are generally based on comparisons of observed and calculated structure-factor amplitudes.

### 13.3.1. Introduction

A structure determination by the molecular-replacement method traditionally proceeds in two steps (Rossmann, 1972, 1990). The first step involves the determination of the orientation of the search model in the unknown crystal unit cell by the rotation functions (see Chapter 13.2). Once the orientation of the search model is known, translation functions are employed in the second step to determine the location of the model in the crystal unit cell. This essentially reduces a six-dimensional problem (three rotational and three translational degrees of freedom) to two three-dimensional problems, which are computationally more manageable. With the speed of modern computers, a strict division between the rotational and the translational components of a molecular-replacement structure solution may no longer be necessary (see Section 13.3.7).

Translation functions are normally formulated to achieve minimum or maximum values when the search molecule is at its correct position in the crystal unit cell. As with the rotation problem, the translation problem is solved as a search. The positional parameters of the model are varied in the unit cell, generally on a grid. Translation functions are evaluated at these search grid points in order to identify those that minimize or maximize the functions.

Most translation functions involve a comparison between the observed structure-factor amplitudes (or squared amplitudes) and those calculated based on the search model. The *R* factor and the correlation coefficient can be used as indicators for translation searches (Section 13.3.2). The correlation between the observed Patterson map and that which is calculated based on the search model is the foundation of another translation function (Section 13.3.3). If phase information is available from other sources, the correlation between the electron-density maps is the basis for the phased translation function (Section 13.3.4). The power of the translation functions can be enhanced in the presence of noncrystallographic symmetry (Section 13.3.8). Proper packing of the search model in the crystal unit cell is an essential component of a solution to the translation problem (Section 13.3.5).

### 13.3.2. *R*-factor and correlation-coefficient translation functions

The crystallographic *R* factor is often used as an indicator in translation searches. It is a measure of the percentage difference between the observed ($|F_h^o|$) and the calculated ($|F_h^c|$) structure-factor amplitudes,

$$R = \frac{\sum_h \big|\,|F_h^o| - k|F_h^c|\,\big|}{\sum_h |F_h^o|}. \qquad (13.3.2.1)$$

A similar *R* factor can be defined based on the square of the structure-factor amplitudes, *i.e.* an *R* factor based on intensity,

$$R_I = \frac{\sum_h \big|\,|F_h^o|^2 - k_I|F_h^c|^2\,\big|}{2\sum_h |F_h^o|^2}. \qquad (13.3.2.2)$$

A factor of 2 is introduced in equation (13.3.2.2) to make $R_I$ values fall in the same range as $R$. In equations (13.3.2.1) and (13.3.2.2), $k$ and $k_I$ are scale factors that bring the observed and the calculated structure factors to the same level. These scale factors are generally calculated in shells of equal reciprocal volume, which can compensate for differences in the displacement factors between the observed and the calculated structure-factor amplitudes.

In addition to the *R* factor, the correlation coefficient between the observed and the calculated structure factors is also used in translation searches. Like the *R* factors, correlation coefficients can be defined based on the amplitude or the intensity of the reflections, and that based on the amplitude is shown below,

$$cc_F = \frac{\sum_h |F_h^o||F_h^c| - N\langle|F^o|\rangle\langle|F^c|\rangle}{\left[\left(\sum_h |F_h^o|^2 - N\langle|F^o|\rangle^2\right)\left(\sum_h |F_h^c|^2 - N\langle|F^c|\rangle^2\right)\right]^{1/2}}. \qquad (13.3.2.3)$$

In (13.3.2.3), *N* is the number of reflections that are used in the calculation and $\langle|F|\rangle$ denotes the average structure-factor amplitude over the reflections. Unlike the *R* factors, the correlation coefficients do not depend on the overall scale factor between the observed and the calculated structure factors. However, they can be affected by large differences in the overall displacement factors between the observed and the calculated structure factors.
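The two indicators can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the conventional amplitude *R* factor and the linear (Pearson) correlation coefficient, a single overall scale factor rather than scaling in resolution shells, and made-up amplitudes. The function names are hypothetical.

```python
import math
from typing import Sequence


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Amplitude R factor with one overall scale k = sum|Fo| / sum|Fc|.

    Assumption: a single scale factor, not scaling in resolution shells.
    """
    k = sum(f_obs) / sum(f_calc)
    return sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)


def correlation_coefficient(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Linear (Pearson) correlation between observed and calculated amplitudes."""
    n = len(f_obs)
    mean_o = sum(f_obs) / n
    mean_c = sum(f_calc) / n
    cov = sum(fo * fc for fo, fc in zip(f_obs, f_calc)) - n * mean_o * mean_c
    var_o = sum(fo * fo for fo in f_obs) - n * mean_o ** 2
    var_c = sum(fc * fc for fc in f_calc) - n * mean_c ** 2
    return cov / math.sqrt(var_o * var_c)


f_obs = [10.0, 20.0, 30.0, 40.0]   # made-up amplitudes
f_calc = [20.0, 10.0, 40.0, 30.0]  # made-up amplitudes
print(r_factor(f_obs, f_calc))                 # 0.4
print(correlation_coefficient(f_obs, f_calc))  # 0.6
```

Multiplying `f_calc` by any positive constant leaves the correlation coefficient unchanged, which is the scale independence noted in the text. The *R* factor is only insensitive to such a constant here because the scale factor is recomputed on every call.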

In order to evaluate *R* factors and correlation coefficients for a translation search, structure factors need to be calculated for the search model, with a given orientation, at different positions in the unit cell. For this special case, where only the positional parameters of the search model are varied, the calculation of the structure factors can be simplified (Nixon & North, 1976; Rae, 1977). The structure-factor equation can be written as a double summation, first over the atoms in one asymmetric unit of the unit cell and then over all the asymmetric units,

$$F_h^c = \sum_n \sum_j f_j \exp\{2\pi i\,\mathbf{h}([T_n]\mathbf{x}_j + \mathbf{t}_n)\}, \qquad (13.3.2.4)$$

where $f_j$ is the scattering factor of atom *j*, *j* goes over all the atoms in the asymmetric unit and *n* goes over all the crystallographic asymmetric units. The *n*th crystallographic symmetry operator is given by

$$\mathbf{x}_{j,n} = [T_n]\mathbf{x}_j + \mathbf{t}_n, \qquad (13.3.2.5)$$

where $[T_n]$ is the rotational component and $\mathbf{t}_n$ is the translational component of the symmetry operator.

For simplicity, first consider the case where there is only one molecule in the asymmetric unit. In the translation search, the model will be placed at different positions in the unit cell,

$$\mathbf{x}_j = \mathbf{x}_j^0 + \mathbf{v}_0, \qquad (13.3.2.6)$$

where $\mathbf{v}_0$ is a translation vector which is applied to move the model from its starting position ($\mathbf{x}_j^0$). Substituting (13.3.2.6) into (13.3.2.4) gives

$$F_h^c(\mathbf{v}_0) = \sum_n f_{h,n} \exp(2\pi i\,\mathbf{h}[T_n]\mathbf{v}_0), \qquad (13.3.2.7)$$

where $f_{h,n}$ is the structure factor calculated based only on the *n*th symmetry-related molecule,

$$f_{h,n} = \sum_j f_j \exp\{2\pi i\,\mathbf{h}([T_n]\mathbf{x}_j^0 + \mathbf{t}_n)\}. \qquad (13.3.2.8)$$

It can be calculated by placing the search model in a *P*1 unit cell having the same cell dimensions as the unknown-crystal unit cell. The structure factors calculated for this *P*1 cell, $F^{P1}$, are related to $f_{h,n}$ by

$$f_{h,n} = F^{P1}_{\mathbf{h}[T_n]} \exp(2\pi i\,\mathbf{h}\cdot\mathbf{t}_n). \qquad (13.3.2.9)$$

Therefore, the summation over the atoms in the structure-factor calculation, a rather time-consuming process, needs to be performed only once, for the search model at the starting position. Subsequent structure-factor calculations after translation of the model are no longer dependent on the number of atoms present in the unit cell [equation (13.3.2.7)]. The starting position is usually chosen such that the centre of the search model is at (0, 0, 0). Then the vector $\mathbf{v}_0$ that is determined from the translation searches will define the centre of the model in the unit cell.

Equation (13.3.2.7) can be generalized to allow for the presence of other molecules that are to remain stationary during the translation search:

$$F_h^c(\mathbf{v}_0) = A_h + \sum_n f_{h,n} \exp(2\pi i\,\mathbf{h}[T_n]\mathbf{v}_0), \qquad (13.3.2.10)$$

where $A_h$ is the contribution from the stationary molecules. This formulation is useful if there is more than one molecule in the asymmetric unit. The position of one of the molecules can be determined first, and the model is then included as a stationary molecule for the position search of the next molecule.
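The saving described above, summing over atoms once and then applying only a phase factor per symmetry operator for each trial translation, can be checked numerically. The sketch below is a toy example under stated assumptions: space group *P*2₁ (*b*-unique), point atoms with unit scattering factors, no stationary molecules, and made-up fractional coordinates. All function names are hypothetical.

```python
import cmath
import math

# Assumed toy space group P2_1 (b unique): (x, y, z) and (-x, y + 1/2, -z).
SYMOPS = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0.0, 0.0, 0.0)),
    (((-1, 0, 0), (0, 1, 0), (0, 0, -1)), (0.0, 0.5, 0.0)),
]


def apply_rot(rot, x):
    """Return [T] x for a 3x3 matrix and a fractional coordinate."""
    return tuple(sum(rot[r][c] * x[c] for c in range(3)) for r in range(3))


def phase(h, x):
    """exp(2 pi i h.x)."""
    return cmath.exp(2j * math.pi * sum(p * q for p, q in zip(h, x)))


def molecular_terms(h, atoms):
    """One term per symmetry operator: sum over atoms of exp{2 pi i h.([T_n] x + t_n)}."""
    terms = []
    for rot, trans in SYMOPS:
        total = 0j
        for x in atoms:
            rx = apply_rot(rot, x)
            total += phase(h, tuple(r + t for r, t in zip(rx, trans)))
        terms.append(total)
    return terms


def fc_translated(h, terms, v0):
    """Structure factor after translation by v0, with no loop over atoms."""
    return sum(f * phase(h, apply_rot(rot, v0)) for f, (rot, _) in zip(terms, SYMOPS))


def fc_direct(h, atoms, v0):
    """Reference value: move every atom, then redo the full summation."""
    moved = [tuple(a + v for a, v in zip(x, v0)) for x in atoms]
    return sum(molecular_terms(h, moved))


atoms = [(0.05, 0.02, -0.03), (-0.04, 0.01, 0.06), (0.02, -0.05, 0.01)]  # made up
h = (3, -2, 5)
v0 = (0.21, 0.37, 0.48)
terms = molecular_terms(h, atoms)  # computed once, at the starting position
print(abs(fc_translated(h, terms, v0) - fc_direct(h, atoms, v0)) < 1e-9)  # True
```

For a real search the `terms` would be stored for every reflection, so the cost per trial translation scales with the number of reflections and symmetry operators rather than with the number of atoms.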

Evaluation of the *R* factor and the correlation coefficient [equations (13.3.2.1) and (13.3.2.3)] in a translation search is generally rather slow. A method has been developed to calculate the correlation coefficient by the fast Fourier transform (FFT) technique, which involves reciprocal vectors up to four times the resolution of the reflection data (Navaza & Vernoslova, 1995).

Equation (13.3.2.7) can also be generalized to allow for the presence of two (or more) search models that are to move independently of each other during the translation search. However, this will generally lead to a six- (or more) dimensional problem and is extremely expensive in computation time. With recent improvements in computer technology (especially parallel processing), it might be feasible to carry out such searches in special cases. However, this aspect of translation functions will not be discussed further here.

### 13.3.3. Patterson-correlation translation function

The most commonly used translation-search indicator is based on the correlation between the observed and the calculated Patterson maps (Crowther & Blow, 1967). Rotation functions are based on the overlap of only a subset of the interatomic vectors in the Patterson map, *i.e.* only those near the origin of the unit cell, which generally contain the self vectors within each crystallographically unique molecule. The correct orientation and position of a search model in the crystal unit cell should lead to the maximal overlap of both the self and the cross vectors, *i.e.* maximal overlap between the observed and the calculated Patterson maps throughout the entire unit cell (Tong, 1993),

$$PC(\mathbf{v}_0) = \sum_h |F_h^o|^2\,|F_h^c(\mathbf{v}_0)|^2. \qquad (13.3.3.1)$$

The calculated structure factor is a function of the translation vector [equation (13.3.2.10)]. Combining equations (13.3.2.10) and (13.3.3.1) gives

$$PC(\mathbf{v}_0) = \sum_h |F_h^o|^2 \Big( |A_h|^2 + \sum_n |f_{h,n}|^2 + 2\,\mathrm{Re}\Big[A_h^{*}\sum_n f_{h,n}\exp(2\pi i\,\mathbf{h}[T_n]\mathbf{v}_0)\Big] + \sum_n\sum_{m\neq n} f_{h,n}f_{h,m}^{*}\exp\{2\pi i\,\mathbf{h}([T_n]-[T_m])\mathbf{v}_0\}\Big). \qquad (13.3.3.2)$$

Here $A_h$ is the contribution from any stationary molecules, $f_{h,n}$ is the structure factor of the *n*th symmetry-related copy of the search model at its starting position and $[T_n]$ is the rotational part of the *n*th symmetry operator. The first two terms in equation (13.3.3.2) contribute a constant to the correlation and are generally ignored in this calculation (but see Section 13.3.8). The Patterson-correlation translation function is therefore a Fourier transform, with $\mathbf{h}[T_n]$ (from the third term) and $\mathbf{h}([T_n]-[T_m])$ (from the fourth term) as the indices. This function can be evaluated quickly by the FFT technique. The term $\mathbf{h}([T_n]-[T_m])$ often leads to the doubling of the original reflection indices. For example, $\mathbf{h}([T_1]-[T_2])$ will give rise to (2*h*, 0, 2*l*) for monoclinic space groups (*b*-unique setting). Therefore, the Patterson-correlation Fourier transform should normally be sampled with a grid size about 1/6 of the maximum resolution of the reflection data used in the calculation.
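The index doubling mentioned for the monoclinic case can be verified with a small calculation. The sketch below assumes the two rotational parts of a *b*-unique monoclinic space group, the identity and diag(-1, 1, -1), and treats the reflection index as a row vector multiplying the matrix. The function names are hypothetical.

```python
def row_times_matrix(h, rot):
    """Row vector h times a 3x3 matrix, i.e. the transformed index h[T]."""
    return tuple(sum(h[r] * rot[r][c] for r in range(3)) for c in range(3))


def cross_term_index(h, rot_n, rot_m):
    """Fourier index h([T_n] - [T_m]) carried by a cross term between two symmetry mates."""
    a = row_times_matrix(h, rot_n)
    b = row_times_matrix(h, rot_m)
    return tuple(p - q for p, q in zip(a, b))


# Rotational parts assumed for a monoclinic space group, b-unique setting.
T1 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
T2 = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))

print(cross_term_index((3, 4, 5), T1, T2))  # (6, 0, 10)
```

Because the indices double, the finest Fourier component in the map has a period of half the data resolution. A grid spacing of about 1/6 of the resolution therefore samples that period about three times.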

A disadvantage of the Patterson-correlation translation function as formulated in equation (13.3.3.2) is that the results from the calculation are on an arbitrary scale. This makes it difficult to compare results from different calculations. The *R* factor or the correlation coefficient can be calculated for the top peaks in the Patterson-correlation translation function [equation (13.3.3.2)] to place the results on an 'absolute' scale. Alternatively, the correlation, as defined by equation (13.3.3.1), can be normalized to become a Patterson-correlation coefficient (Harada *et al.*, 1981) [equation (13.3.3.3)]. This correlation coefficient is equivalent to that defined by equation (13.3.2.3) for reflection intensities (Fujinaga & Read, 1987). The FFT technique can be used to evaluate the term in the denominator of equation (13.3.3.3), which involves reciprocal vectors up to four times the data resolution (Navaza & Vernoslova, 1995). Alternatively, this term can be approximated by an expression which also measures the packing of the search molecules in the crystal (see Section 13.3.5).

The intramolecular vectors can be removed (Crowther & Blow, 1967) from the Patterson maps by subtracting, after appropriate scaling, the structure factors calculated from individual molecules [equation (13.3.2.8)].

The translation functions, as defined above, are based on the structure-factor amplitudes. Normalized structure factors (*E*'s) may provide better results under certain circumstances, since they increase the weight of the high-resolution reflections in the translation function (Harada *et al.*, 1981; Tickle, 1985; Tollin, 1966). Since the Patterson-correlation translation function is also based on reflection intensities, the 'large term' approach can be used to accelerate the calculation (Tollin & Rossmann, 1966).

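### 13.3.4. Phased translation function

If an atomic model needs to be placed in an electron-density map that has been obtained through other methods (*e.g.* the multiple-isomorphous-replacement method or partial model phases), the phased translation function can be used (Bentley & Houdusse, 1992; Colman *et al.*, 1976; Read & Schierbeek, 1988; Tong, 1993). It is essentially based on the correlation between observed and calculated electron-density values throughout the unit cell:

$$PTF(\mathbf{v}_0) = \sum_h F_h^o\,[F_h^c(\mathbf{v}_0)]^{*} = \sum_h\sum_n F_h^o f_{h,n}^{*}\exp(-2\pi i\,\mathbf{h}[T_n]\mathbf{v}_0), \qquad (13.3.4.1)$$

where $F_h^o = |F_h^o|\exp(i\varphi_h)$ is the observed structure factor with the prior phase $\varphi_h$, $f_{h,n}$ is the structure factor of the *n*th symmetry-related copy of the search model at its starting position and $[T_n]$ is the rotational part of the *n*th symmetry operator. This is a Fourier transform, so the phased translation function can also be evaluated by the FFT technique. As with the Patterson-correlation translation function, the phased translation function can be placed on an 'absolute' scale by introducing appropriate normalizing factors or by converting the results to *R* factors or correlation coefficients. It should be noted that the prior phase could be in the wrong hand, so both $\varphi_h$ and $-\varphi_h$ may need to be tried in the phased translation function.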
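The idea behind the phased translation function can be illustrated with a deliberately small example. The sketch below makes strong simplifying assumptions: a one-dimensional *P*1 cell, point atoms of unit weight, error-free 'observed' structure factors generated from the shifted model itself, and direct summation on a grid in place of an FFT. All names are hypothetical.

```python
import cmath
import math


def structure_factors(xs, h_max):
    """1-D, P1, unit point atoms: F_h = sum_j exp(2 pi i h x_j) for -h_max <= h <= h_max."""
    return {
        h: sum(cmath.exp(2j * math.pi * h * x) for x in xs)
        for h in range(-h_max, h_max + 1)
    }


def phased_translation_function(f_obs, f_model, n_grid):
    """Sum over h of F_obs(h) * conj(f_model(h)) * exp(-2 pi i h v), at v = g / n_grid."""
    values = []
    for g in range(n_grid):
        v = g / n_grid
        total = sum(
            f_obs[h] * f_model[h].conjugate() * cmath.exp(-2j * math.pi * h * v)
            for h in f_obs
        )
        values.append(total.real)  # imaginary parts cancel between h and -h
    return values


model = [0.00, 0.07, 0.19]  # made-up fractional coordinates
true_shift = 13 / 40
f_model = structure_factors(model, 8)
# Stand-in for phased observed data: the same model at the true position.
f_obs = structure_factors([x + true_shift for x in model], 8)

ptf = phased_translation_function(f_obs, f_model, 40)
best = max(range(40), key=ptf.__getitem__)
print(best / 40)  # 0.325
```

With exact phases every term adds in phase at the true shift, so the maximum is unambiguous. With real data the prior phases carry errors and the peak is correspondingly weaker.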

The stationary molecules contribute a constant to the phased translation function and are not shown in equation (13.3.4.1). However, the phase information from the stationary molecules can be applied to the observed structure-factor amplitudes, and the phased translation function, rather than the Patterson-correlation translation function, can be used in the search for additional molecules (Bentley & Houdusse, 1992; Driessen *et al.*, 1991; Read & Schierbeek, 1988). This could prove especially useful in locating the last few molecules in cases where there are several molecules in the asymmetric unit.

### 13.3.5. Packing check in translation functions

A correct molecular-replacement solution should lead to the placement of the search model at the correct orientation and position in the crystal unit cell. For this solution, there should be no or minimal steric clashes among the crystallographically related and noncrystallographically related molecules in the unit cell. Therefore, proper packing of the search model in the crystal unit cell is an important component of the molecular-replacement structure solution.

The packing of the search model in the unit cell can be estimated by determining the electron-density overlap among the molecules. This overlap can be calculated numerically, given the molecular envelope (Hendrickson & Ward, 1976). It can also be estimated by an analytical function (Harada *et al.*, 1981),

$$O(\mathbf{v}_0) = \frac{\sum_h |F_h^c(\mathbf{v}_0)|^2}{N\sum_h |f_{h,1}|^2}, \qquad (13.3.5.1)$$

where $F_h^c(\mathbf{v}_0)$ is the structure factor calculated with the search model translated by $\mathbf{v}_0$, $f_{h,1}$ is the structure factor of a single molecule and *N* is the number of crystallographic symmetry operators. This overlap function assumes a value of 1 when there is no overlap among the molecules, and higher values when there is overlap. This function has been used to replace the term in the denominator of equation (13.3.3.3) (Harada *et al.*, 1981). Consequently, those positions that lead to steric clashes among the molecules will be down-weighted, thereby increasing the signal for the correct solution.

The overlap functions provide an overall estimate for the packing of the search model in the unit cell. A more detailed packing analysis can be based on the checking of atomic contacts. For example, the number of contacts below a pre-specified distance cutoff (normally between 2 and 3 Å) in a protein crystal can be determined. Too many such contacts would indicate significant overlap of the molecules. For nucleic acid structures, a set of representative atoms (for example, P and N1) can be selected from each nucleotide for this packing analysis.
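A contact count of this kind can be sketched as follows. The example assumes Cartesian coordinates in Å for a set of representative atoms of the search model and of one neighbouring (symmetry-related) molecule that has already been generated. The function name, the 3 Å default and the coordinates are illustrative only.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def count_close_contacts(mol_a: Sequence[Point], mol_b: Sequence[Point],
                         cutoff: float = 3.0) -> int:
    """Number of atom pairs (one atom from each molecule) closer than cutoff, in angstroms."""
    return sum(1 for a in mol_a for b in mol_b if math.dist(a, b) < cutoff)


# Made-up coordinates for two small sets of representative atoms.
molecule = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
neighbour = [(2.0, 0.0, 0.0), (10.0, 0.0, 2.5), (30.0, 0.0, 0.0)]
print(count_close_contacts(molecule, neighbour))  # 2
```

The double loop is quadratic in the number of atoms. That is acceptable for a handful of representative atoms per residue, but a full-atom check would normally use a spatial grid to limit the pairs that are tested.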

### 13.3.6. The unique region of a translation function (the Cheshire group)

The region of the unit cell that should be covered during a translation search does not generally correspond to the asymmetric unit of the space group. Since the search model has a defined orientation, it can only reside in one of the asymmetric units in the unit cell. Lacking knowledge as to which asymmetric unit the model occupies, the entire unit cell would need to be searched. However, most space groups possess alternative origins, which means the position of a molecule in the unit cell can only be determined to within certain sets of translations. For example, in space group $P2_12_12_1$, there are eight alternative origins at (0, 0, 0), (1/2, 0, 0), (0, 1/2, 0), (0, 0, 1/2), (1/2, 1/2, 0), (1/2, 0, 1/2), (0, 1/2, 1/2) and (1/2, 1/2, 1/2). This implies that the region that should be searched to locate a molecule need only be 1/8 of the volume of the unit cell [for example, $0 \le x, y, z < 1/2$]. In addition, for polar space groups, the position of the molecule along the polar axis is arbitrary. The symmetry, as defined by these unique regions, is also known as the Cheshire group (Hirshfeld, 1968), and has been defined for all the 230 space groups.
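The effect of alternative origins on the search volume can be made concrete with a short sketch. It assumes, as an example, a space group with alternative origins every 1/2 along all three axes (such as $P2_12_12_1$), giving eight origins in total. Polar axes are not handled, and the names are hypothetical.

```python
from typing import Tuple

Vec = Tuple[float, float, float]

# Assumed example: alternative origins every 1/2 along a, b and c.
ORIGIN_SPACING: Vec = (0.5, 0.5, 0.5)


def reduce_to_unique_region(v: Vec, spacing: Vec = ORIGIN_SPACING) -> Vec:
    """Map a fractional translation to the equivalent one with 0 <= coordinate < spacing."""
    return tuple(c % s for c, s in zip(v, spacing))


def unique_region_fraction(spacing: Vec = ORIGIN_SPACING) -> float:
    """Fraction of the unit cell that has to be searched for the first molecule."""
    return spacing[0] * spacing[1] * spacing[2]


print(unique_region_fraction())                      # 0.125
print(reduce_to_unique_region((0.75, 0.25, -0.25)))  # (0.25, 0.25, 0.25)
```

Translations that reduce to the same point differ only by a choice of origin, so they describe the same solution for the first molecule.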

Once the first molecule is positioned, the origin of the unit cell is fixed as well. The search for subsequent molecules will need to cover the entire unit cell.

### 13.3.7. Combined molecular replacement

The traditional division of the molecular-replacement problem into two steps is partly due to limited computer power. Such a division has placed more pressure on the rotation function, since generally, only a few rotation angles are examined by translation functions. The correct orientation, therefore, needs to be among the top few peaks either directly in the rotation functions or after Patterson-correlation refinement (Brünger, 1990).

With modern computers, it is no longer necessary to maintain the strict division between the rotational and the translational components. Even though a full six-dimensional search is still generally impractical, a limited six-dimensional search can certainly be performed. The Patterson-correlation translation function is preferred for this limited six-dimensional search since it can be evaluated quickly with the FFT technique. Using the *R* factor or the correlation coefficient as the translation function would severely limit either the exploration of rotational space (Fujinaga & Read, 1987) or the reflection data that are used in the calculation (Rabinovich & Shakked, 1984).

Recently, an automated molecular-replacement protocol has been implemented which automatically examines the top peaks in the rotation function by translation functions (Navaza, 1994). This protocol has proven to be remarkably powerful. It assumes that the correct rotation solution is near the top peaks in the rotation function. A more general assumption is that the correct rotation for the search model should produce high values in the rotation function, even though they may not be near peaks in the rotation function. An error of 6° between the correct rotation angles and the peak in the rotation function, which often occurs, can make it impossible to obtain the correct translation-function solution (Fujinaga & Read, 1987).

The combined-molecular-replacement protocol (Tong, 1996*a*) therefore consists of examining all the grid points in the rotation function with heights greater than a defined cutoff value using the Patterson-correlation translation function. The top peaks (usually 10 to 20) in each translation function are all examined as possible solutions. The results from these translation functions are converted to the *R* factor or the correlation coefficient, enabling comparisons among the various orientations. This protocol allows the automatic examination of not only the top peaks in the rotation functions, but also those angles that produce high rotation-function values. In addition, packing of the model in the crystal is examined automatically to eliminate those solutions that have severe clashes among the molecules (see Section 13.3.5). This generalized protocol has proven more powerful than conventional methods in a few structure determinations (Tong, 1996*a*; Wu *et al.*, 1997).
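The control flow of such a protocol can be outlined in code. The sketch below is not the published implementation. The rotation-function values, the translation search, the correlation score and the packing test are all passed in as hypothetical callables, and only the loop described in the text is shown.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple

Rotation = Tuple[float, float, float]
Translation = Tuple[float, float, float]


class Solution(NamedTuple):
    rotation: Rotation
    translation: Translation
    score: float


def combined_molecular_replacement(
    rotation_grid: Iterable[Tuple[Rotation, float]],
    rf_cutoff: float,
    translation_peaks: Callable[[Rotation, int], List[Translation]],
    correlation: Callable[[Rotation, Translation], float],
    has_clashes: Callable[[Rotation, Translation], bool],
    n_peaks: int = 10,
) -> List[Solution]:
    """Examine every rotation grid point above the cutoff with a translation search."""
    solutions = []
    for rotation, rf_value in rotation_grid:
        if rf_value <= rf_cutoff:
            continue  # only grid points with heights above the cutoff are examined
        for translation in translation_peaks(rotation, n_peaks):
            if has_clashes(rotation, translation):
                continue  # packing check removes severely clashing solutions
            # Scoring on a common scale allows comparison across orientations.
            score = correlation(rotation, translation)
            solutions.append(Solution(rotation, translation, score))
    return sorted(solutions, key=lambda s: s.score, reverse=True)
```

In practice `translation_peaks` would wrap a fast Patterson-correlation search and `correlation` would convert each peak to a correlation coefficient or *R* factor. Both are left abstract here.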

An alternative approach, examining the neighbourhood surrounding the rotation-function peaks, is also possible (Urzhumtsev & Podjarny, 1995).

### 13.3.8. The locked translation function

In the presence of noncrystallographic symmetry (NCS), locked self-rotation functions can be used to determine the orientation of the NCS elements in the crystal unit cell (Tong & Rossmann, 1990). Often, an atomic model for the monomer of the NCS assembly is available, but not the model of the entire assembly. This atomic model can be used in ordinary cross-rotation-function calculations. A more powerful technique is to use the locked cross-rotation function, which can define the orientations of all the molecules within the assembly at the same time (Tong & Rossmann, 1997). With the knowledge of the orientations, several translation searches are needed to locate the individual monomers of the assembly. For cases where the assembly has high NCS, the translation searches to locate the first few molecules may not be very successful, since the search model only represents a small portion of the diffracting power of the crystal.

A locked translation function takes into account contributions from all the monomers of the assembly at the same time (Tong, 1996*b*). It can determine the position of the monomer search model relative to the centre of the NCS assembly. With this knowledge, the entire assembly can be generated and can then be used in an ordinary translation search to locate the centre of this NCS assembly in the unit cell.

Given the atomic model, $\mathbf{X}^0$ (in Cartesian coordinates), for the monomer at a starting position and the rotation, [*F*], that brings it into the same orientation as that of a monomer in the standard orientation, the model of the entire assembly in the standard orientation is given by

$$\mathbf{X}_n = [I_n]\left([F]\mathbf{X}^0 + \mathbf{V}_0\right), \qquad (13.3.8.1)$$

where $\mathbf{V}_0$ is a translation vector and the centre of the assembly is placed at (0, 0, 0). $[I_n]$ is the set of rotation matrices for the NCS point group in the standard orientation. The correct translation vector should give rise to the maximal overlap between the self vectors within the NCS assembly and the observed Patterson map. This overlap is given by the second term of equation (13.3.3.2). The locked translation function, equation (13.3.8.2), is defined from this overlap. In it, [*E*] is the rotation matrix that brings the standard orientation to that of the assembly in the crystal and [α] is the de-orthogonalization matrix (Rossmann & Blow, 1962). Equation (13.3.8.2) can be evaluated indirectly by the FFT technique (Tong, 1996*b*). As with the Patterson-correlation translation function, equation (13.3.8.2) can be converted to a correlation coefficient, although the evaluation will become more time-consuming. It should be noted that equation (13.3.8.2) bears much resemblance to equation (13.3.3.2), with the crystallographic quantities interchanged with the corresponding noncrystallographic quantities.

### 13.3.9. Miscellaneous translation functions

A variety of translation functions have been developed for special purposes. For example, a function has been proposed that can determine the translational component along an (NCS) twofold axis (Rossmann & Blow, 1962).

If initial phase information is available for a crystal that possesses NCS, averaging among the NCS monomers is a powerful technique for improving the phase information (Rossmann, 1990). Accurate parameters for the NCS elements (orientation and position) are essential for this averaging process. The orientations of the NCS elements can often be determined by self-rotation functions. The positions of the NCS elements can be determined by a special translation function based on the electron-density overlap (in a spherical volume) of NCS-related monomers (Tong, 1993) [equation (13.3.9.1)]. This function depends on the centres of the two molecules related by NCS and on [*C*], the rotation matrix for the NCS. This equation can be used to obtain an initial estimate for the position of the NCS axis (given its orientation) (Tong, 1993), to identify positions related by the NCS (Blow *et al.*, 1964) and to refine the NCS parameters iteratively (Tong & Rossmann, 1997).

### References

Bentley, G. A. & Houdusse, A. (1992). *Some applications of the phased translation function in macromolecular structure determination*. *Acta Cryst.* A**48**, 312–322.

Blow, D. M., Rossmann, M. G. & Jeffery, B. A. (1964). *The arrangement of α-chymotrypsin molecules in the monoclinic crystal form*. *J. Mol. Biol.* **8**, 65–78.

Brünger, A. T. (1990). *Extension of molecular replacement: a new search strategy based on Patterson correlation refinement*. *Acta Cryst.* A**46**, 46–57.

Colman, P. M., Fehlhammer, H. & Bartels, K. (1976). *Patterson search methods in protein structure determination: β-trypsin and immunoglobulin fragments*. In *Crystallographic computing techniques*, edited by F. R. Ahmed, K. Huml & B. Sedlacek, pp. 248–258. Copenhagen: Munksgaard.

Crowther, R. A. & Blow, D. M. (1967). *A method of positioning a known molecule in an unknown crystal structure*. *Acta Cryst.* **23**, 544–548.

Driessen, H. P. C., Bax, B., Slingsby, C., Lindley, P. F., Mahadevan, D., Moss, D. S. & Tickle, I. J. (1991). *Structure of oligomeric βB2-crystallin: an application of the* $T_2$ *translation function to an asymmetric unit containing two dimers*. *Acta Cryst.* B**47**, 987–997.

Fujinaga, M. & Read, R. J. (1987). *Experiences with a new translation-function program*. *J. Appl. Cryst.* **20**, 517–521.

Harada, Y., Lifchitz, A., Berthou, J. & Jolles, P. (1981). *A translation function combining packing and diffraction information: an application to lysozyme (high-temperature form)*. *Acta Cryst.* A**37**, 398–406.

Hendrickson, W. A. & Ward, K. B. (1976). *A packing function for delimiting the allowable locations of crystallized macromolecules*. *Acta Cryst.* A**32**, 778–780.

Hirshfeld, F. L. (1968). *Symmetry in the generation of trial structures*. *Acta Cryst.* A**24**, 301–311.

Navaza, J. (1994). *AMoRe: an automated package for molecular replacement*. *Acta Cryst.* A**50**, 157–163.

Navaza, J. & Vernoslova, E. (1995). *On the fast translation functions for molecular replacement*. *Acta Cryst.* A**51**, 445–449.

Nixon, P. E. & North, A. C. T. (1976). *Crystallographic relationship between human and hen-egg lysozymes. I. Methods for the establishment of molecular orientational and positional parameters*. *Acta Cryst.* A**32**, 320–325.

Rabinovich, D. & Shakked, Z. (1984). *A new approach to structure determination of large molecules by multi-dimensional search methods*. *Acta Cryst.* A**40**, 195–200.

Rae, A. D. (1977). *The use of structure factors to find the origin of an oriented molecular fragment*. *Acta Cryst.* A**33**, 423–425.

Read, R. J. & Schierbeek, A. J. (1988). *A phased translation function*. *J. Appl. Cryst.* **21**, 490–495.

Rossmann, M. G. (1972). Editor. *The molecular replacement method*. New York: Gordon & Breach.

Rossmann, M. G. (1990). *The molecular replacement method*. *Acta Cryst.* A**46**, 73–82.

Rossmann, M. G. & Blow, D. M. (1962). *The detection of sub-units within the crystallographic asymmetric unit*. *Acta Cryst.* **15**, 24–31.

Tickle, I. J. (1985). *Review of space group general translation functions that make use of known structure information and can be expanded as Fourier series*. In *Proceedings of the Daresbury study weekend. Molecular replacement*, edited by P. A. Machin, pp. 22–26. Warrington: Daresbury Laboratory.

Tollin, P. (1966). *On the determination of molecular location*. *Acta Cryst.* **21**, 613–614.

Tollin, P. & Rossmann, M. G. (1966). *A description of various rotation function programs*. *Acta Cryst.* **21**, 872–876.

Tong, L. (1993). *Replace, a suite of computer programs for molecular-replacement calculations*. *J. Appl. Cryst.* **26**, 748–751.

Tong, L. (1996*a*). *Combined molecular replacement*. *Acta Cryst.* A**52**, 782–784.

Tong, L. (1996*b*). *The locked translation function and other applications of a Patterson correlation function*. *Acta Cryst.* A**52**, 476–479.

Tong, L. & Rossmann, M. G. (1990). *The locked rotation function*. *Acta Cryst.* A**46**, 783–792.

Tong, L. & Rossmann, M. G. (1997). *Rotation function calculations with GLRF program*. *Methods Enzymol.* **276**, 594–611.

Urzhumtsev, A. & Podjarny, A. (1995). *On the solution of the molecular-replacement problem at very low resolution: application to large complexes*. *Acta Cryst.* D**51**, 888–895.

Wu, H., Kwong, P. D. & Hendrickson, W. A. (1997). *Dimeric association and segmental variability in the structure of human CD4*. *Nature (London)*, **387**, 527–530.