International Tables for Crystallography
Volume C
Mathematical, physical and chemical tables
Edited by E. Prince

International Tables for Crystallography (2006). Vol. C, ch. 8.1, pp. 679-680

Section Statistics

E. Prince^a and P. T. Boggs^b

^a NIST Center for Neutron Research, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA, and ^b Scientific Computing Division, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA


A probability density function, which will be abbreviated p.d.f., is a function, Φ(x), such that the probability of finding the random variable x in the interval [a\leq x\leq b] is given by [p(a\leq x\leq b)=\int_a^b\Phi (x)\,{\rm d}x.] A p.d.f. has the properties [\Phi (x)\geq 0,\qquad -\infty < x < +\infty,] and [\int_{-\infty }^{+\infty }\Phi (x)\,{\rm d}x=1.] A cumulative distribution function, which will be abbreviated c.d.f., is defined by [\Psi (x)=\int_{-\infty }^x\Phi (t)\,{\rm d}t.] The properties of Φ(x) imply that [0\leq \Psi (x)\leq 1], and Φ(x) = dΨ(x)/dx. The expected value of a function, f(x), of random variable x is defined by [\langle f(x)\rangle =\int_{-\infty }^{+\infty }f(x)\Phi(x)\,{\rm d}x.] If [f(x)=x^n], [\langle f(x)\rangle =\langle x^n\rangle] is the nth moment of Φ(x). The first moment, often denoted by μ, is the mean of Φ(x). The second moment about the mean, [\langle (x-\langle x\rangle)^2\rangle], usually denoted by [\sigma^2], is the variance of Φ(x). The positive square root of the variance is the standard deviation.
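The definitions above can be checked numerically. The following is a minimal sketch, assuming a Gaussian density as an illustrative choice of Φ(x) (the text does not prescribe any particular p.d.f.) and a finite integration range standing in for (−∞, +∞). The helper names `gaussian_pdf`, `integrate`, `expected` and `cdf` are hypothetical and introduced only for this example.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian density, used only as an illustrative Phi(x)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

MU, SIGMA = 1.5, 0.5
# Assumption: +/- 10 standard deviations is an adequate finite stand-in
# for the infinite limits of integration.
LO, HI = MU - 10.0 * SIGMA, MU + 10.0 * SIGMA

def phi(x):
    return gaussian_pdf(x, MU, SIGMA)

def expected(f):
    """<f(x)> = integral of f(x) * Phi(x) dx."""
    return integrate(lambda x: f(x) * phi(x), LO, HI)

def cdf(x):
    """Psi(x) = integral of Phi(t) dt from the lower limit to x."""
    return integrate(phi, LO, x)

norm = expected(lambda x: 1.0)                  # normalization, should be close to 1
mean = expected(lambda x: x)                    # first moment
variance = expected(lambda x: (x - mean) ** 2)  # second moment about the mean
std_dev = math.sqrt(variance)                   # standard deviation
prob = cdf(2.0) - cdf(1.0)                      # p(1 <= x <= 2)

print(norm, mean, variance, std_dev, prob)
```

Writing the interval probability as Ψ(b) − Ψ(a) follows directly from the definition of the c.d.f. as the integral of the p.d.f.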

For a vector, x, of random variables, [x_{1},x_{2},\ldots,x_{n}], the joint probability density function, or joint p.d.f., is a function, [\Phi_J({\bf x})], such that [\begin{aligned}p(&a_{1} \leq x_{1}\leq b_{1};\ a_{2}\leq x_{2}\leq b_{2};\ \ldots;\ a_{n}\leq x_{n}\leq b_{n}) \\ &=\int_{a_{1}}^{b_{1}}\int_{a_{2}}^{b_{2}}\ldots \int_{a_{n}}^{b_{n}}\Phi _{J}({\bf x})\,{\rm d}x_{1}\,{\rm d}x_{2}\ldots {\rm d}x_{n}.\end{aligned}] The marginal p.d.f. of an element (or a subset of elements), [x_{i}], is a function, [\Phi _{M}(x_{i})], such that [\begin{aligned}p(a_{i} \leq x_{i}\leq b_{i})&=\int_{a_{i}}^{b_{i}}\Phi _{M}(x_{i})\,{\rm d}x_i \\ &=\int_{-\infty }^{+\infty }\ldots \int_{a_{i}}^{b_{i}}\ldots \int_{-\infty }^{+\infty }\Phi _{J}({\bf x})\,{\rm d}x_{1}\ldots {\rm d}x_{i}\ldots {\rm d}x_{n}.\end{aligned}] This is a p.d.f. for [x_{i}] alone, irrespective of the values that may be found for any other element of x. For two random variables, x and y (either or both of which may be vectors), the conditional p.d.f. of x given [y=y_0] is defined by [\Phi _{C}(x|y_{0})=c\,\Phi _{J}(x,y)_{y=y_{0}},] where [c=1/\Phi _{M}(y_{0})] is a renormalizing factor. This is a p.d.f. for x when it is known that [y=y_0]. If [\Phi _{C}(x|y)=\Phi _{M}(x)] for all y, or, equivalently, if [\Phi _{J}(x,y)=\Phi _{M}(x)\Phi _{M}(y)], the random variables x and y are said to be statistically independent.
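As an illustration of the marginal and conditional definitions, the sketch below assumes a simple joint p.d.f., Φ_J(x, y) = x + y on the unit square and zero elsewhere. This density is a hypothetical example chosen because its marginals are easy to verify by hand; it does not come from the text. The function names are likewise assumptions made for the example.

```python
def joint(x, y):
    """Illustrative joint p.d.f.: x + y on the unit square, zero elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0

def integrate(f, a, b, n=2000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def marginal_x(x):
    """Phi_M(x): integrate the joint p.d.f. over all y."""
    return integrate(lambda y: joint(x, y), 0.0, 1.0)

def marginal_y(y):
    """Phi_M(y): integrate the joint p.d.f. over all x."""
    return integrate(lambda x: joint(x, y), 0.0, 1.0)

y0 = 0.25
c = 1.0 / marginal_y(y0)          # renormalizing factor c = 1 / Phi_M(y0)

def conditional_x(x):
    """Phi_C(x | y0) = c * Phi_J(x, y0)."""
    return c * joint(x, y0)

total = integrate(marginal_x, 0.0, 1.0, n=200)         # joint p.d.f. integrates to 1
cond_norm = integrate(conditional_x, 0.0, 1.0, n=200)  # conditional integrates to 1

# Independence test at one point: compare Phi_J(x, y) with Phi_M(x) * Phi_M(y).
x_test = 0.3
lhs = joint(x_test, y0)
rhs = marginal_x(x_test) * marginal_y(y0)
independent_here = abs(lhs - rhs) < 1e-9

print(total, cond_norm, lhs, rhs, independent_here)
```

For this particular density the product of the marginals differs from the joint p.d.f., so x and y are not statistically independent. A single-point comparison can only disprove independence; establishing it would require the factorization to hold for all x and y.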

Moments may be defined for multivariate p.d.f.s in a manner analogous to the one-dimensional case. The mean is a vector defined by [\mu _i=\langle x_i\rangle =\int x_i\Phi ({\bf x})\,{\rm d}{\bf x},] where the volume of integration is the entire domain of x. The variance–covariance matrix is defined by [\begin{aligned}V_{ij} &=\left \langle \left (x_i-\langle x_i\rangle\right)\left(x_j-\langle x_j\rangle\right ) \right \rangle \\ &=\int \left(x_i-\langle x_i\rangle\right) \left(x_j-\langle x_j\rangle\right) \Phi _J({\bf x})\,{\rm d}{\bf x}.\end{aligned}] The diagonal elements of V are the variances of the marginal p.d.f.s of the elements of x, that is, [V_{ii}=\sigma _i^2]. It can be shown that, if [x_i] and [x_j] are statistically independent, [V_{ij}=0] when [i\neq j]. If two vectors of random variables, x and y, are related by a linear transformation, x = By, the means of their joint p.d.f.s are related by [{\boldsymbol \mu}_x={\bf B}{\boldsymbol \mu}_y], and their variance–covariance matrices are related by [{\bf V}_x={\bf B}{\bf V}_y{\bf B}^T].
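The transformation rules for the mean and the variance–covariance matrix can be illustrated with sample moments, for which the same identities hold up to rounding error. The sketch below is an assumption-laden example rather than anything prescribed by the text: the matrix `B`, the Gaussian samples, the seed, and all helper names are hypothetical.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def mean_vector(samples):
    """Sample estimate of mu_i = <x_i>."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]

def covariance(samples):
    """Sample estimate of V_ij = <(x_i - <x_i>)(x_j - <x_j>)>."""
    mu = mean_vector(samples)
    n, d = len(samples), len(mu)
    return [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / n
             for j in range(d)] for i in range(d)]

rng = random.Random(0)  # fixed seed so the run is reproducible

# y: two independently drawn components with standard deviations 1 and 2.
y_samples = [[rng.gauss(0.0, 1.0), rng.gauss(3.0, 2.0)] for _ in range(5000)]

# Hypothetical 3 x 2 transformation matrix B, so that x = B y has three elements.
B = [[1.0, 1.0],
     [1.0, -1.0],
     [2.0, 0.5]]
x_samples = [[sum(B[i][k] * y[k] for k in range(2)) for i in range(3)]
             for y in y_samples]

mu_y, V_y = mean_vector(y_samples), covariance(y_samples)
mu_x, V_x = mean_vector(x_samples), covariance(x_samples)

# Predictions from the transformation rules.
mu_x_pred = [sum(B[i][k] * mu_y[k] for k in range(2)) for i in range(3)]
V_x_pred = matmul(matmul(B, V_y), transpose(B))

max_mean_err = max(abs(mu_x[i] - mu_x_pred[i]) for i in range(3))
max_cov_err = max(abs(V_x[i][j] - V_x_pred[i][j])
                  for i in range(3) for j in range(3))
print(max_mean_err, max_cov_err, V_y[0][1])
```

Because the two components of y are drawn independently, the off-diagonal element of the sample `V_y` should be small compared with the diagonal elements, although it will not be exactly zero for a finite sample. The elements of x are generally correlated even so, since each is a combination of the same components of y.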
