International Tables for Crystallography, Volume C: Mathematical, physical and chemical tables. Edited by E. Prince. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. C, ch. 8.4, pp. 704–705

Tests based on F or the R ratio have several limitations. An important one is that they are applicable only when the parameters of one model form a subset of the parameters of the other. Also, the F test makes no distinction between an improvement in fit that results from small improvements throughout the entire data set and one that results from a large improvement in a small number of critically sensitive data points. A test that can be used for comparing arbitrary pairs of models, and that focuses attention on those data points that are most sensitive to differences in the models, was introduced by Williams & Kloot (1953; see also Himmelblau, 1970; Prince, 1982).
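Consider a set of observations, $y_{0i}$, and two models that predict values for these observations, $y_{1i}$ and $y_{2i}$, respectively. We determine the slope of the regression line $z = \lambda x$, where $z_i = [y_{0i} - \tfrac{1}{2}(y_{1i} + y_{2i})]/\sigma_i$, and $x_i = (y_{1i} - y_{2i})/\sigma_i$, with $\sigma_i$ the standard uncertainty of $y_{0i}$. Suppose model 1 is a perfect fit to the data, which have been measured with great precision, so that $y_{0i} = y_{1i}$ for all $i$. Under these conditions, $\lambda = +1/2$. Similarly, if model 2 is a perfect fit, $\lambda = -1/2$. Real experimental data, of course, are subject to random error, and $|\lambda|$ in general would be expected to be less than 1/2. A least-squares estimate of $\lambda$ is

$$\hat{\lambda} = \sum_{i=1}^{n} z_i x_i \Big/ \sum_{i=1}^{n} x_i^2,$$

and it has an estimated variance

$$\hat{\sigma}^2 = \sum_{i=1}^{n} (z_i - \hat{\lambda} x_i)^2 \Big/ \Big[(n - 1) \sum_{i=1}^{n} x_i^2\Big].$$

The hypothesis that the two models give equally good fits to the data can be tested by considering $\hat{\lambda}$ to be an unconstrained, one-parameter fit that is to be compared with a constrained, zero-parameter fit for which $\lambda = 0$. A p.d.f. for making this comparison can be derived from an F distribution with $\nu_1 = 1$ and $\nu_2 = \nu = (n - 1)$. If we let $|t| = F^{1/2}$, and use

$$\int_{-t_0}^{+t_0} \Phi(t, \nu)\,\mathrm{d}t = \int_{0}^{F_0} \Phi(F, 1, \nu)\,\mathrm{d}F, \qquad F_0 = t_0^2,$$

we can derive a p.d.f. for $t$, which is

$$\Phi(t, \nu) = \frac{\Gamma[(\nu + 1)/2]}{(\pi\nu)^{1/2}\,\Gamma(\nu/2)} \left(1 + t^2/\nu\right)^{-(\nu + 1)/2}.$$

This p.d.f. is known as Student's t distribution with $\nu$ degrees of freedom. Setting $t = \hat{\lambda}/\hat{\sigma}$, the c.d.f. $\Psi(t, \nu)$ can be used to test the alternative hypotheses $\lambda = 0$ and $\lambda = \pm 1/2$. Table 8.4.3.1 gives the values of $t$ for which the c.d.f. $\Psi(t, \nu)$ has various values for various values of $\nu$. Fortran code for the program from which this table was generated appears in Prince (1994).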
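The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Fortran program of Prince (1994). It assumes the regression variables are z_i = [y_0i − (y_1i + y_2i)/2]/σ_i and x_i = (y_1i − y_2i)/σ_i, that the slope is estimated by ordinary least squares through the origin with ν = n − 1 degrees of freedom, and that the c.d.f. may be approximated by Simpson's rule applied to the Student's t p.d.f. The function names (`williams_kloot`, `student_t_pdf`, `student_t_cdf`) are hypothetical and chosen for this example.

```python
import math


def williams_kloot(y_obs, y_model1, y_model2, sigma):
    """Return (lam, sd, t, nu) for the two-model comparison.

    lam is the least-squares slope of z on x, sd its estimated
    standard deviation, t = lam / sd, and nu = n - 1.
    """
    n = len(y_obs)
    if n < 2:
        raise ValueError("need at least two observations")
    # Weighted difference between the two models
    x = [(y1 - y2) / s for y1, y2, s in zip(y_model1, y_model2, sigma)]
    # Weighted deviation of the data from the mean of the two models
    z = [(y0 - 0.5 * (y1 + y2)) / s
         for y0, y1, y2, s in zip(y_obs, y_model1, y_model2, sigma)]
    sxx = sum(xi * xi for xi in x)
    if sxx == 0.0:
        raise ValueError("the two models predict identical values")
    lam = sum(zi * xi for zi, xi in zip(z, x)) / sxx
    nu = n - 1
    resid = sum((zi - lam * xi) ** 2 for zi, xi in zip(z, x))
    sd = math.sqrt(resid / (nu * sxx))
    # A perfect fit gives sd = 0; report an infinite t in that case
    t = lam / sd if sd > 0.0 else math.copysign(math.inf, lam)
    return lam, sd, t, nu


def student_t_pdf(t, nu):
    """Student's t p.d.f. with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(math.pi * nu))
    return math.exp(log_norm - 0.5 * (nu + 1) * math.log1p(t * t / nu))


def student_t_cdf(t, nu, steps=2000):
    """Approximate c.d.f. via Simpson's rule on [0, |t|] (steps must be even)."""
    if math.isinf(t):
        return 1.0 if t > 0 else 0.0
    a = abs(t)
    h = a / steps
    total = student_t_pdf(0.0, nu) + student_t_pdf(a, nu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * student_t_pdf(k * h, nu)
    half = total * h / 3.0  # integral of the p.d.f. from 0 to |t|
    # The p.d.f. is symmetric about zero, so the c.d.f. is 0.5 +/- half
    return 0.5 + half if t >= 0 else 0.5 - half


if __name__ == "__main__":
    # Made-up numbers, for illustration only
    y_obs = [1.1, 1.9, 3.2, 3.8]
    y_m1 = [1.0, 2.0, 3.0, 4.0]
    y_m2 = [1.5, 2.5, 2.0, 5.0]
    sig = [1.0, 1.0, 1.0, 1.0]
    lam, sd, t, nu = williams_kloot(y_obs, y_m1, y_m2, sig)
    print(lam, sd, t, nu, student_t_cdf(t, nu))
```

A value of the c.d.f. close to 1 for t = λ̂/σ̂ would argue against the hypothesis λ = 0 in favour of model 1, and a value close to 0 in favour of model 2. For routine work a library routine for the t distribution is preferable to the simple quadrature used here.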

Again, it must be understood that the results of these statistical comparisons do not imply that either model is a correct one. A statistical indication of a good fit says only that, given the model, the experimenter should not be surprised at having observed the data values that were observed. It says nothing about whether the model is plausible in terms of compatibility with the laws of physics and chemistry. Nor does it rule out the existence of other models that describe the data as well as or better than any of the models tested.
References
Himmelblau, D. M. (1970). Process analysis by statistical methods. New York: John Wiley.
Prince, E. (1982). Comparison of the fits of two models to the same data set. Acta Cryst. B38, 1099–1100.
Prince, E. (1994). Mathematical techniques in crystallography and materials science, 2nd ed. Berlin/Heidelberg/New York/London/Paris/Tokyo/Hong Kong/Barcelona/Budapest: Springer-Verlag.
Williams, E. J. & Kloot, N. H. (1953). Interpolation in a series of correlated observations. Aust. J. Appl. Sci. 4, 1–17.