International
Tables for Crystallography Volume C Mathematical, physical and chemical tables Edited by E. Prince © International Union of Crystallography 2006 
International Tables for Crystallography (2006). Vol. C, ch. 8.3, pp. 693698
Section 8.3.1. Constrained models^{a}NIST Center for Neutron Research, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA,^{b}Geophysical Laboratory, Carnegie Institution of Washington, 5251 Broad Branch Road NW, Washington, DC 200151305, USA, and ^{c}Laboratory for the Structure of Matter, Code 6030, Naval Research Laboratory, Washington, DC 203755000, USA 
The techniques of least squares are applicable for refining almost any model, but the question of the suitability of the model remains. The addition of parameters may reduce the residual disagreement, but lead to solutions that have no physical or chemical validity. Addition of constraints is one method of constricting the solutions.
The classical technique for application of constraints is the use of Lagrange undetermined multipliers, in which the set of p parameters, , is augmented by additional unknowns, λ_{k}, one for each constraint relationship desired. The problem may be stated in the form: find the minimum of subject to the condition This may be shown (Gill, Murray & Wright, 1981) to be equivalent to the problem: find a point at which the gradient of vanishes. Solving for the stationary point leads to a set of simultaneous equations of the form and Thus, the number of equations, and the number of unknowns, is increased from p to 2p − q. In cases where the number of constraint relations is small, and where it may be difficult to solve the relations for some of the parameters in terms of the rest, this method yields the desired results without too much additional computation (Ralph & Finger, 1982). With the large numbers of parameters, and large numbers of constraints, that arise in many crystallographic problems, however, the use of Lagrange multipliers is computationally inefficient and cumbersome.
In most cases encountered in crystallography, constraints may be applied directly, thus reducing rather than increasing the size of the normalequations matrix. For each constraint introduced, one of the parameters becomes dependent on the remaining set, and the rank of the remaining system is reduced by one. For p parameters and p − q constraints, the problem reduces to q parameters. If the Gauss–Newton algorithm is used (Section 8.1.4 ), the normalequations matrix is A^{T}WA, where and W is a weight matrix. A constrained model, , maybe constructed using relations of the form Applying the chain rule for differentiation, the normalequations matrix for the constrained model is B^{T}WB, where This may be written in matrix form B = AC, where defines a constraint matrix. The application of constraints involves (a) determination of the model to be used, (b) calculation of the elements of C, and (c) computation of the modified normalequations matrix.
The construction of matrix C by a procedure known as the variable reduction method may be presented formally as follows: Designate by Z the matrix whose elements are and partition Z in the form Z = (U, V), where V is composed of (p − q) columns of Z chosen to be linearly independent, so that V is nonsingular. [V is shown as the last (p − q) columns only for convenience. Any linearly independent set may be chosen.] The rows of Z form a basis for a (p − q)dimensional subspace of the pdimensional parameter space, and we wish to construct a basis for z, a qdimensional subspace that is orthogonal to it, so that all shifts within that subspace starting at a point where the constraints are satisfied, a feasible point, leave the values of the constraint relations unchanged. This basis is used for the columns of C, which is given by In this formulation, the columns of V correspond to dependent parameters that are functions of the independent parameters corresponding to the columns of U.
Most existing programs provide for the calculation of the structure factor, F, and its partial derivatives with respect to a conventional set of parameters, including occupancy, position, isotropic or anisotropic atomic displacement factors, and possibly higher cumulants of an atomic density function (Prince, 1994). The constrained calculation is usually performed by evaluating selected elements, . Because the constraint matrix is often extremely sparse, calculation of a limited sum involving only the nonzero elements is usually computationally superior to a full matrix multiplication. After adjustment of the z's, equations (8.3.1.5) are used to update the parameters. Using this procedure, it is not necessary to express the structure factor, or its derivatives, in terms of the refined parameters. This is particularly important when the constrained model involves arbitrary molecular shapes or rigidbody thermal motions.
The need for constraints arises most frequently when the crystal structure contains atoms in special positions. Here, certain parameters will be constant or linearly related to others. If a parameter is constrained to be a constant, the corresponding row of C will contain zeros, and that column will be ignored. When parameters are linearly dependent on others, which may occur in trigonal, hexagonal, tetragonal and cubic space groups, the modification indicated in (8.3.1.6) cannot be avoided. The constraint relationships among position parameters are trivial. Levy (1956) described an algorithm for determining the constraints that pertain to second and higher cumulants in the structurefactor formula. Table 8.3.1.1 is a summary of relations that are found for anisotropic atomic displacement factors, with a listing of the space groups in which they occur. Johnson (1970) provides a table listing the number of unique coefficients for each possible site symmetry for tensors of various ranks, which is useful information for verification of constraint relationships.

Another important use of constraints applies to the occupancies of certain sites in the crystal where, for example, a molecule is disordered in two or more possible orientations or (very common in minerals) several elements are distributed among several sites. In both cases, refinement of all of the fractional occupancies tends to be extremely ill conditioned, because of high correlations between occupancies and atomic displacement parameters. The overall chemistry, however, may be known from electron microprobe (Finger, 1969) or other analytic techniques to much better precision than it is possible to determine it using diffraction data alone. The constraining equations for the occupancies of n species in m sites have the form where is the multiplicity of the ith site, is the fractional occupancy of the jth species in the ith site, and p is the total number of atoms of species j per unit cell. For a given crystal structure and composition, the bs and ps are known, and, furthermore, it is possible to write an additional constraint for the total occupancy of each site, If necessary, vacancies may be included as one of the n species present. In theory, (8.3.1.9) and (8.3.1.10) could be solved for (n − 1) × (m − 1) unknown parameters, , with m + n − 1 constraint relations, but, in practice, at most one occupancy factor per site can be refined. When constraints are applied, the correlations between occupancies and displacement factors are greatly reduced.
In the analysis of a crystal structure, it may be desirable to test various constraints on the shape or symmetry of a molecule. For example, the molecule of a particular compound may have orthorhombic symmetry in the liquid or vapour phase, but crystallize with a monoclinic or triclinic space group. Without constraints, it is impossible to determine whether the crystallization has caused changes in the molecular conformation. Residual errors in the observations will invariably lead to deviations from the original molecular geometry, but these may or may not be meaningful.
With molecularshape constraints, it is possible to constrain the geometry to any desired conformation. The first step is to describe the molecule in a special, orthonormal coordinate system that has a well defined relationship between the coordinate axes and the symmetry elements. If this system is properly chosen, the description of the molecule is easy. The next step is to describe the transformation between this orthonormal system and the crystallographic axes. A standard, orthonormal coordinate system (Prince, 1994) can be constructed with its x axis parallel to a and its z axis parallel to c*. If the special system is translated with respect to the standard system so that they share a common origin, Eulerian angles, ω, χ, and ϕ, may be used to define a matrix that rotates the special coordinates into the standard system. Angle ω is defined as the clockwise rotation through which the special system must be rotated about the z axis of the standard system to bring the z axis of the special system into the x, z plane of the standard system. Similarly, angle χ is the clockwise angle through which the resulting, special system must be rotated about the y axis of the standard system to bring the z axes into coincidence, and, finally, angle ϕ is the clockwise angle through which the special system must be rotated about the common z axes to bring the other axes into coincidence. The overall transformation is given by
The overall transformation of a vector, x′, from the special coordinate system to the crystallographic system is given by where t is the origin offset between the two systems and D is the upper triangular Cholesky factor (Subsection 8.1.1.1 ) of the metric tensor, G, which is defined by Equations (8.3.1.12) are the constraint relationships, and the refinable parameters include the adjustable parameters in the special system, the origin offset, and the three rotation angles. This set of parameters, although it is written in a very different manner, is a linear transformation of a subset of the conventional crystallographic parameters, so that statistical tests based on the F ratio or Hamilton's R ratio (Section 8.4.2 ; Hamilton, 1964) may be used to assess significance. Shape constraints differ from those owing to space group or chemical conditions in that the constraint equations (8.3.1.12) are not linear functions of the independent parameters. Thus, the elements of C are not constants and must therefore be evaluated in each iteration of the refinement algorithm.
Another area in which application of constraints is important arises whenever some portion of the structure undergoes thermal motion as a rigid entity. One means of determining rigidmotion parameters is to refine the conventional, anisotropic atomic displacement factors of all atoms individually and to fit a librational model to the resulting thermal factors (Schomaker & Trueblood, 1968). A problem arises with this approach because the presence of libration implies curvilinear motion in the crystallographic system, and thus the probability density function for an atom that does not lie on the axis of libration cannot be described by a Gaussian function in a rectilinear coordinate system. For neutron diffraction, where H atoms have major scattering power, the effect may be large enough to affect convergence (Prince, Dickens & Rush, 1974). Anharmonic (thirdcumulant) terms could be used, but the number of parameters increases rapidly, because there are as many as ten, independent, third cumulant tensor elements per atom.
Thermal motions of rigid bodies are represented by a symmetric, translation tensor, T, a symmetric, libration tensor, L, and a nonsymmetric, screw correlation tensor, S (Cruickshank, 1961; Schomaker & Trueblood, 1968). Any sequence of rotations of a rigid body about a fixed point is equivalent to a single, finite rotation about some axis passing through the fixed point. This rotation can be represented (Prince, 1994) by an axial vector, , where  is the magnitude of the rotation, and the direction cosines of the axis with respect to some system of orthogonal axes are given by . An exact expression for the displacement, u, of a point in the rigid body, located by a vector r from the centre of mass, owing to a rotation about an axis passing through the centre of mass is For small rotations, the trigonometric functions can be replaced by powerseries expansions, and, because of the extremely rapid convergence of these series, (8.3.1.14) is approximated extremely well, even for values of  as large as 0.5 rad, by By expansion of the vector products, this can be written where the coefficients, , , , and are multiples of components of r. For example, , so that A(r)_{11} = 0, A(r)_{12} = r_{3}, and A(r)_{13} = −r_{2}. These coefficients have been tabulated by Sygusch (1976), and expressed in Fortran source code by Prince (1994).
If the centre of mass of the rigid body also moves, the total displacement of the point at r is v = u + t, where t is the displacement of the centre of mass from its equilibrium position. A discussion of the effects of rigidbody motion on diffraction intensities involves quantities like , , and so forth, the ensemble averages of these quantities over many unit cells, which may be assumed to be equal to the time averages for one unit cell over a long time. The rigidbodymotion tensors are defined by = , = , and = . The distributions of and can usually be assumed to be approximately Gaussian, so that fourth moments can be expressed in terms of second moments. Thus, = , = , and so forth. If the elements of t and are measured with respect to their mean positions, = = 0. Third moments, quantities like , do not necessarily vanish, except when the rigid body is centrosymmetric, but their effects virtually always are small, and can be neglected.
A particle that is part of a librating, rigid body undergoes a curvilinear motion that results in its having a mean position that is displaced from its equilibrium position. This causes an apparent shortening of interatomic distances, which must be corrected for if accurate values of bond lengths are to be derived. The displacement, d, from the equilibrium position to the mean position is (Prince & Finger, 1973)
Anisotropic atomic displacement factors, , , or , are related by simple, linear transformations that are functions of the unitcell constants to the quantity . If the rigid body has a centre of symmetry, so that the elements of S vanish, this is given by
Expressions including elements of S have been given by Sygusch (1976) and, in Fortran source code, by Prince (1994). Expressions for anisotropic atomic displacement factors in terms of T, L, and S that included only terms linear in the tensor elements were given by Schomaker & Trueblood (1968), who pointed out that the diagonal elements of S never appeared individually, but only as the differences of pairs, so that the expressions were invariant under the addition of a constant to all three elements. This `trace of S singularity' was resolved by applying the additional constraint S_{11} + S_{22} + S_{33} = 0. As was pointed out by Sygusch (1976), the inclusion of terms that are quadratic in the tensor elements removes this indeterminacy, but the effects of the additional terms are so small that the problem remains extremely ill conditioned. In practice, therefore, these elements should still be treated as underdetermined.
Prince (1994) lists the symmetry restrictions for each type of tensor for various point groups. Although the description of thermal motion is essentially harmonic within the rigidbody system, the structurefactor formulation must include what appear to be anharmonic terms. Prince also presents computer routines that contain the relations between the elements of T, L, and S and the second and thirdcumulant tensor elements. As in the case of shape constraints, the equations are nonlinear, and the elements of C must be reevaluated in each iteration.
References
Cruickshank, D. W. J. (1961). Coordinate errors due to rotational oscillations of molecules. Acta Cryst. 14, 896–897.Finger, L. W. (1969). The crystal structure and cation distribution of a grunerite. Mineral. Soc. Am. Spec. Pap. 2, 95–100.
Gill, P. E., Murray, W. & Wright, M. M. (1981). Practical optimization. New York: Academic Press.
Hamilton, W. C. (1964). Statistics in physical science: estimation, hypothesis testing and least squares. New York: Ronald Press.
Johnson, C. K. (1970). Generalized treatments for thermal motion. Thermal neutron diffraction, edited by B. T. M. Willis, pp. 132–160. Oxford University Press.
Levy, H. A. (1956). Symmetry relations among coefficients of anisotropic temperature factors. Acta Cryst. 9, 679.
Prince, E. (1994). Mathematical techniques in crystallography and materials science, 2nd ed. Berlin/Heidelberg/New York/London/Paris/Tokyo/Hong Kong/Barcelona/Budapest: SpringVerlag.
Prince, E., Dickens, B. & Rush, J. J. (1974). A study of onedimensional hindered rotation in NH_{3}OHClO_{4}. Acta Cryst. B30, 1167–1172.
Prince, E. & Finger, L. W. (1973). Use of constraints on thermal motion in structure refinement of molecules with librating side groups. Acta Cryst. B29, 179–183.
Ralph, R. L. & Finger, L. W. (1982). A computer program for refinement of crystal orientation matrix and lattice constants from diffractometer data with lattice symmetry constraints. J. Appl. Cryst. 15, 537–539.
Schomaker, V. & Trueblood, K. N. (1968). On the rigidbody motion of molecules in crystals. Acta Cryst. B24, 63–76.
Sygusch, J. (1976). Constrained thermal motion refinement for a rigid molecule with librating side groups. Acta Cryst. B32, 3295–3298.