General considerations when defining a CIF data item
B. McMahon. International Tables for Crystallography (2006). Vol. G, ch. 3.1, pp. 73-91  [ doi:10.1107/97809553602060000733 ]


CIF dictionaries provide a formal taxonomy of crystallographic terms and ideas. Dictionary entries are constructed in a structured, machine-readable manner that facilitates validation and structuring of data. New entries may be devised for public or private dictionaries. A candidate data-name definition should fulfil the following conditions: (i) it should describe a specific and well defined concept, since precision of definition is essential for an effective interchange mechanism; (ii) it should have appropriate granularity: a data name can define a very small piece of information (a standard uncertainty on a particular physical measurable) or a very large amount (the text of a scientific paper), so an appropriate choice should be made (and, for DDL2, formalized through membership of subcategories, categories and category groups, as appropriate); (iii) it should have well defined relationships with other data items (through its assigned category membership and parent/child links); for DDL2, the prior construction of a formal entity/relationship schema may be helpful; (iv) it should specify constraints on the data type and permissible values where applicable; (v) its name should be globally unique; this is achieved through monitoring of names in public dictionaries by a regulatory committee (COMCIFS) and by registration of prefix strings for exclusive use in local dictionaries. Some thought may need to be given to the choice of DDL appropriate for a candidate dictionary. Rules are described for the structuring of dictionaries, and the protocol is outlined for merging separate dictionary files to provide access to a global distributed dictionary.
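
As an illustration only, some of the conditions above (category membership, a declared data type and name uniqueness) can be sketched as a simple pre-submission check. This is not a COMCIFS tool or an official procedure: the Definition class, the check_candidate function, the example names and the prefixes "xyz" and "abc" are all hypothetical. The sketch assumes DDL1-style data names in which a registered prefix directly follows the leading underscore, and it treats data names as case-insensitive.

```python
from dataclasses import dataclass


@dataclass
class Definition:
    """Hypothetical, much reduced stand-in for a dictionary entry."""
    name: str        # data name, e.g. "_cell_length_a"
    category: str    # category the item is assigned to
    data_type: str   # DDL1-style type code: "numb" or "char"


def check_candidate(candidate, public_names, reserved_prefixes, local_prefix=None):
    """Return a list of problems found with a candidate definition."""
    problems = []
    name = candidate.name.lower()  # data names are compared case-insensitively

    if not name.startswith("_"):
        problems.append("data name must begin with an underscore")
    if not candidate.category:
        problems.append("no category assigned")
    if candidate.data_type not in ("numb", "char"):
        problems.append("unknown data type")

    # Condition (v): the name must not clash with a public definition.
    if name in {n.lower() for n in public_names}:
        problems.append("name already defined in a public dictionary")

    # A reserved prefix may be used only by the dictionary that registered it.
    own = (local_prefix or "").lower()
    for prefix in reserved_prefixes:
        p = prefix.lower()
        if name.startswith("_" + p + "_") and p != own:
            problems.append(f"prefix '{prefix}' is reserved for another dictionary")

    return problems


public = {"_cell_length_a", "_atom_site_label"}
reserved = {"xyz", "abc"}

candidate = Definition("_xyz_sample_colour", "xyz_sample", "char")
print(check_candidate(candidate, public, reserved, local_prefix="xyz"))
```

A real check would consult the current public dictionaries and the registry of reserved prefixes rather than hard-coded sets, and the conditions on precision of definition and granularity still call for human judgement.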

