International Tables for Crystallography
Volume C
Mathematical, physical and chemical tables
Edited by E. Prince

International Tables for Crystallography (2006). Vol. C, ch. 8.2, pp. 691-692

Section Some examples

E. Prince (a) and D. M. Collins (b)

(a) NIST Center for Neutron Research, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA, and (b) Laboratory for the Structure of Matter, Code 6030, Naval Research Laboratory, Washington, DC 20375-5341, USA


For an example of the application of the maximum-entropy method, consider (Collins, 1984) a collection of diffraction intensities in which various subsets have been measured under different conditions, such as on different films or with different crystals. All systematic corrections have been made, but it is necessary to put the different subsets onto a common scale. Assume that every subset has measurements in common with some other subset, and that no collection of subsets is isolated from the others. Let the measurement of intensity [I_h] in subset i be [J_{hi}], and let the scale factor that puts intensity [I_h] on the scale of subset i be [k_i]. Equation ([link]) becomes

[S=-\sum_{h=1}^{n}\sum_{i=1}^{m}(k_iI_h)^{\prime}\ln\left[\frac{(k_iI_h)^{\prime}}{J_{hi}^{\prime}}\right],]

where the term is zero if [I_h] does not appear in subset i. Because [k_i] and [I_h] are parameters of the model, equations ([link]) become

[\sum_{i=1}^{m}k_i\ln\left[\frac{(k_iI_h)^{\prime}}{J_{hi}^{\prime}}\right]-\sum_{h=1}^{n}\sum_{i=1}^{m}(k_iI_h)^{\prime}\left(\sum_{l=1}^{m}k_l\right)\ln\left[\frac{(k_iI_h)^{\prime}}{J_{hi}^{\prime}}\right]=0]

and

[\sum_{h=1}^{n}I_h\ln\left[\frac{(k_iI_h)^{\prime}}{J_{hi}^{\prime}}\right]-\sum_{h=1}^{n}\sum_{i=1}^{m}(k_iI_h)^{\prime}\left(\sum_{l=1}^{n}I_l\right)\ln\left[\frac{(k_iI_h)^{\prime}}{J_{hi}^{\prime}}\right]=0.]

These simplify to

[\ln I_h=Q-\sum_{i=1}^{m}k_i^{\prime}\ln(k_i/J_{hi})]

and

[\ln k_i=Q-\sum_{h=1}^{n}I_h^{\prime}\ln(I_h/J_{hi}),]

where

[Q=\sum_{h=1}^{n}\sum_{i=1}^{m}(k_iI_h)^{\prime}\ln[(k_iI_h)/J_{hi}].]

Equations ([link][link][link]) may be solved iteratively, starting with the approximations [k_i=\sum_{h=1}^{n}J_{hi}] and Q = 0.
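The iteration described above can be sketched in Python with NumPy. This is a minimal sketch and not the published implementation: the function name `maxent_scale`, the use of `np.nan` to mark unmeasured entries, the convergence test and the order in which the three equations are updated are all assumptions made for illustration. It also assumes that the normalized quantities [k_i^{\prime}] and [I_h^{\prime}] are normalized over the measured entries only, which is what the rule 'the term is zero if [I_h] does not appear in subset i' seems to imply.

```python
import numpy as np

def maxent_scale(J, n_iter=500, tol=1e-12):
    """Iterate the simplified equations for ln I_h, ln k_i and Q.

    J is an (n, m) array of positive measurements J_hi, with np.nan where
    reflection h was not measured in subset i (those terms contribute zero).
    Assumption: k_i' and I_h' are normalized over the measured entries only.
    """
    J = np.asarray(J, dtype=float)
    obs = np.isfinite(J)                     # True where J_hi was measured
    log_J = np.log(np.where(obs, J, 1.0))    # dummy 1.0 keeps the log finite
    log_k = np.log(np.where(obs, J, 0.0).sum(axis=0))  # start: k_i = sum_h J_hi
    log_I = np.zeros(J.shape[0])
    Q = 0.0                                  # start: Q = 0
    for _ in range(n_iter):
        # ln I_h = Q - sum_i k_i' ln(k_i / J_hi)
        wk = obs * np.exp(log_k)
        wk /= wk.sum(axis=1, keepdims=True)
        new_log_I = Q - (wk * (log_k - log_J)).sum(axis=1)
        # ln k_i = Q - sum_h I_h' ln(I_h / J_hi)
        wI = obs * np.exp(new_log_I)[:, None]
        wI /= wI.sum(axis=0, keepdims=True)
        new_log_k = Q - (wI * (new_log_I[:, None] - log_J)).sum(axis=0)
        # Q = sum_h sum_i (k_i I_h)' ln(k_i I_h / J_hi)
        log_q = new_log_I[:, None] + new_log_k
        q = obs * np.exp(log_q)
        Q = (q / q.sum() * (log_q - log_J)).sum()
        change = max(np.abs(new_log_I - log_I).max(),
                     np.abs(new_log_k - log_k).max())
        log_I, log_k = new_log_I, new_log_k
        if change < tol:
            break
    return np.exp(log_I), np.exp(log_k), Q

# Hypothetical noise-free data: three subsets, each missing two reflections.
k_true = np.array([1.0, 2.5, 0.4])
I_true = np.array([10.0, 20.0, 5.0, 8.0, 30.0, 12.0])
J_demo = np.outer(I_true, k_true)
J_demo[4:, 0] = np.nan
J_demo[:2, 1] = np.nan
J_demo[2:4, 2] = np.nan
I_fit, k_fit, Q_fit = maxent_scale(J_demo)
print(k_fit / k_fit[0])   # relative scale factors
```

Only ratios of the returned scale factors (and of the returned intensities) are meaningful, so the demonstration prints the scale factors relative to the first subset. Every reflection must appear in at least one subset, and every measurement must be positive, for the logarithms and normalizations to be defined.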

The standard uncertainties of scale factors and intensities are not used in the solution of equations ([link][link][link]), and must be computed separately. They may be estimated on a fractional basis from the variances of estimated population means [\left\langle J_{hi}/I_h\right\rangle] for a scale factor and [\left\langle J_{hi}/k_i\right\rangle] for an intensity, respectively. The maximum-entropy scale factors and scaled intensities are relative, and either set may be multiplied by an arbitrary, positive constant without affecting the solution.
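One possible reading of this estimate is sketched below. It is an interpretation and not a procedure given in the text: for each scale factor, the ratios [J_{hi}/I_h] over the reflections measured in subset i are treated as a sample, the standard error of their mean is divided by the mean to give a fractional uncertainty, and that fraction is applied to [k_i]; intensities are treated the same way using [J_{hi}/k_i]. The function names `fractional_su` and `scale_and_intensity_su` are hypothetical, and unmeasured entries are assumed to be marked with `np.nan`.

```python
import numpy as np

def fractional_su(ratios):
    """Standard error of the mean of the finite ratios, divided by that mean."""
    r = np.asarray(ratios, dtype=float)
    r = r[np.isfinite(r)]            # drop unmeasured (nan) entries
    if r.size < 2:
        return np.nan                # a single ratio carries no variance estimate
    return np.sqrt(r.var(ddof=1) / r.size) / r.mean()

def scale_and_intensity_su(J, I, k):
    """Approximate standard uncertainties of k_i and I_h (assumed reading)."""
    J, I, k = (np.asarray(v, dtype=float) for v in (J, I, k))
    ratio_k = J / I[:, None]         # each column i samples k_i
    ratio_I = J / k[None, :]         # each row h samples I_h
    su_k = k * np.array([fractional_su(ratio_k[:, i]) for i in range(J.shape[1])])
    su_I = I * np.array([fractional_su(ratio_I[h, :]) for h in range(J.shape[0])])
    return su_k, su_I
```

Because the estimate is fractional, multiplying all scale factors or all intensities by a positive constant multiplies the corresponding uncertainties by the same constant, in keeping with the relative nature of the solution.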

For another example, consider the maximum-entropy fit of a linear function to a set of independently distributed variables. Let [y_i] represent an observation drawn from a population with mean [a_0+a_1x_i] and finite variance [\sigma_i^2]; we wish to find the maximum-entropy estimate of [a_0] and [a_1]. Assume that the mismatch between the observation and the model is normally distributed, so that its probability density is the positive proportion

[\varphi_i=\varphi(\Delta_i)=(2\pi\sigma_i^2)^{-1/2}\exp(-\Delta_i^2/2\sigma_i^2),]

where [\Delta_i=y_i-(a_0+a_1x_i)]. The prior proportion is given by

[\mu_i=\varphi(0)=(2\pi\sigma_i^2)^{-1/2}.]

Letting [A_\varphi=1\big/\sum\varphi_i], equations ([link]) become

[\sum_{i=1}^{n}\left[\varphi_i\Delta_i/\sigma_i^2-A_\varphi\,\varphi_i\left(\sum_{j=1}^{n}\varphi_j\Delta_j/\sigma_j^2\right)\right]\Delta_i^2/\sigma_i^2=0]

and

[\sum_{i=1}^{n}\left[\varphi_i\Delta_i\,x_i/\sigma_i^2-A_\varphi\,\varphi_i\left(\sum_{j=1}^{n}\varphi_j\Delta_jx_j/\sigma_j^2\right)\right]\Delta_i^2/\sigma_i^2=0,]

which simplify to

[\begin{pmatrix}\sum_{i=1}^{n}w_i & \sum_{i=1}^{n}w_ix_i\\ \sum_{i=1}^{n}w_ix_i & \sum_{i=1}^{n}w_ix_i^2\end{pmatrix}\begin{pmatrix}a_0\\ a_1\end{pmatrix}
=\begin{pmatrix}\sum_{i=1}^{n}w_i\left(y_i-\sigma_i^2A_\varphi\sum_{j=1}^{n}\varphi_j\Delta_j/\sigma_j^2\right)\\ \sum_{i=1}^{n}w_i\left(y_ix_i-\sigma_i^2A_\varphi\sum_{j=1}^{n}\varphi_j\Delta_jx_j/\sigma_j^2\right)\end{pmatrix},]

where [w_i] may be interpreted as a weight and is given by [w_i=\varphi_i\Delta_i^2/\sigma_i^4]. Equations ([link]) may be solved iteratively, starting with the approximations that the sums over j on the right-hand side are zero and [w_i=1.0] for all i, that is, using the solutions to the corresponding, unweighted least-squares problem.
Resetting [w_i] after each iteration by only half the indicated amount defeats a tendency towards oscillation. Approximate standard uncertainties for the parameters, [a_0] and [a_1], may be computed by conventional means after setting to zero the sums over j on the right-hand side of equations ([link]). (See, however, a discussion of computing variance–covariance matrices in Section 8.1.2.) Note that [w_i] is small for both small and large values of [\left|\Delta_i\right|]. Thus, in contrast to the robust/resistant methods (Section 8.2.2), which de-emphasize only the large differences, this method down-weights both the small and the large differences and adjusts the parameters on the basis of the moderate-size mismatches between model and data. The procedure used in this two-dimensional, linear model can be extended to linear models, and linear approximations to nonlinear models, in any number of dimensions using methods discussed in Chapter 8.1.
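The iterative solution of the 2 × 2 system can be sketched as follows. This is an illustration under stated assumptions and not a reference implementation: the function names `maxent_weight` and `maxent_line_fit` are hypothetical, 'half the indicated amount' is read as moving each weight halfway from its current value towards [\varphi_i\Delta_i^2/\sigma_i^4], and a fixed number of iterations is used because no convergence criterion is given in the text.

```python
import numpy as np

def gaussian_phi(delta, sigma):
    """phi(Delta) = (2 pi sigma^2)^(-1/2) exp(-Delta^2 / 2 sigma^2)."""
    return np.exp(-delta**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def maxent_weight(delta, sigma):
    """w = phi(Delta) Delta^2 / sigma^4; zero at Delta = 0, small for large |Delta|."""
    return gaussian_phi(delta, sigma) * delta**2 / sigma**4

def maxent_line_fit(x, y, sigma, n_iter=20):
    """Return the sequence of (a0, a1) iterates and the final damped weights."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), x.shape)
    w = np.ones_like(x)      # start: w_i = 1 (unweighted least squares)
    c0 = c1 = 0.0            # start: sums over j on the right-hand side are zero
    history = []
    for _ in range(n_iter + 1):
        M = np.array([[w.sum(), (w * x).sum()],
                      [(w * x).sum(), (w * x * x).sum()]])
        rhs = np.array([(w * (y - sigma**2 * c0)).sum(),
                        (w * (y * x - sigma**2 * c1)).sum()])
        a0, a1 = np.linalg.solve(M, rhs)
        history.append((a0, a1))
        delta = y - (a0 + a1 * x)
        phi = gaussian_phi(delta, sigma)
        A = 1.0 / phi.sum()
        c0 = A * (phi * delta / sigma**2).sum()       # A_phi sum_j phi_j Delta_j / sigma_j^2
        c1 = A * (phi * delta * x / sigma**2).sum()   # A_phi sum_j phi_j Delta_j x_j / sigma_j^2
        # Assumed damping: move each weight only halfway to its indicated value.
        w = w + 0.5 * (maxent_weight(delta, sigma) - w)
    return np.array(history), w
```

The first entry of the returned history is the ordinary least-squares line, since the weights start at 1 and the correction sums at zero. For the Gaussian [\varphi] used here the weight [\varphi_i\Delta_i^2/\sigma_i^4] vanishes at [\Delta_i=0] and is largest at [\left|\Delta_i\right|=\sqrt{2}\,\sigma_i], which is the sense in which moderate-size mismatches dominate the fit. A call such as `maxent_line_fit(x, y, sigma, n_iter=20)` returns every iterate, so any oscillation can be inspected directly.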

The maximum-entropy method has been described (Jaynes, 1979) as being 'maximally noncommittal with respect to all other matters; it is as uniform (by the criterion of the Shannon information measure) as it can be without violating the given constraint[s]'. Least squares, because it gives minimum variance estimates of the parameters of a model, and therefore of all functions of the model including the predicted values of any additional data points, might be similarly described as 'maximally committal' with regard to the collection of more data. Least squares and maximum entropy can therefore be viewed as the extremes of a range of methods, classified according to the degree of a priori confidence in the correctness of the model, with the robust/resistant methods lying somewhere in between (although generally closer to least squares). Maximum-entropy methods can be used when it is desirable to avoid prejudice in favour of a model because of doubt as to the model's correctness.


Collins, D. M. (1984). Scaling by entropy maximization. Acta Cryst. A40, 705–708.
Jaynes, E. T. (1979). Where do we stand on maximum entropy? The maximum entropy formalism, edited by R. D. Levine & M. Tribus, pp. 44–49. Cambridge, MA: Massachusetts Institute of Technology.
