International Tables for Crystallography, Volume G
Definition and exchange of crystallographic data
Edited by S. R. Hall and B. McMahon

International Tables for Crystallography (2006). Vol. G, ch. 2.4, pp. 44-45

Section 2.4.2. Historical background

F. H. Allen,a* J. M. Barnard,b A. P. F. Cookb and S. R. Hallc

aCambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, England, bBCI Ltd, 46 Uppergate Road, Stannington, Sheffield S6 6BX, England, and cSchool of Biomedical and Chemical Sciences, University of Western Australia, Crawley, Perth, WA 6009, Australia
Correspondence e-mail: allen@ccdc.cam.ac.uk


The Standard Molecular Data (SMD) format was initially developed by a group of European pharmaceutical companies in the mid-1980s. Draft documents were made available from 1987 and the specification was published (Bebak et al., 1989). A meeting in Frankfurt in 1988 established a series of technical working groups under the auspices of the Chemical Structure Association (CSA) to examine the format specifications in detail and to make recommendations for any revision. As a result, a draft form of a revised format, described as SMD Version 5.0, was published in February 1990 (Barnard, 1990). A document describing the core format, i.e. those data items regarded as essential in any exchange file, was prepared by one of us (JMB) for consideration by Subcommittee E49.51 of the American Society for Testing and Materials (ASTM).

In December 1993, ASTM Subcommittee E49.51 approved a standard specification for the content (i.e. recommended data items) of computerized chemical structural files (ASTM, 1994), although the subcommittee did not publish any proposals for a format specification. Later, the Chemical Abstracts Service (CAS) circulated a draft proposal for a connection-table-based exchange format for chemical substances and queries. It used some ideas that are similar to the 1990 SMD proposal and is expressed within the framework of the Abstract Syntax Notation 1 (ISO, 2002a,b). MDL Information Systems Inc. has also published a description of their proprietary formats (Dalby et al., 1992) and a number of other software systems now provide interfaces to these formats.

During this period, the IUCr Working Party on Crystallographic Information had commissioned one of us (SRH) to coordinate the development of a universal file to replace the existing fixed-format Standard Crystallographic File Structure (SCFS: Brown, 1988). As documented in Chapter 1.1, the CIF approach was adopted as the international standard in 1990 and published by Hall et al. (1991). Although the small-molecule CIF is able to store a representation of 2D chemical topology, its data definitions do not meet all the needs of the chemical community. In 1991, the IUCr became interested in further extending CIF into the chemical arena and discussions took place between representatives of the CIF project and of the SMD Technical Working Group. These meetings decided that an integration of the SMD format and the STAR syntax was desirable because it provided a number of advantages over the existing SMD specifications (Barnard & Cook, 1992). In particular, SMD/STAR provides for a clearer separation of the data structure and the data content roles, together with more flexible data extensibility in future versions. In addition, automated data validation of SMD/STAR files is possible using electronic data dictionaries. In a wider context, there were obvious opportunities for integrating with other applications of the STAR File.

References

ASTM (1994). Standard specification for the content of computerized chemical structural information files or data sets. ASTM Standard E 1586-93. American Society for Testing and Materials, Philadelphia, PA, USA.
Barnard, J. M. (1990). Draft specification for revised version of the Standard Molecular Data (SMD) format. J. Chem. Inf. Comput. Sci. 30, 81–96.
Barnard, J. M. & Cook, A. P. F. (1992). The Molecular Information File (MIF): a standard format for molecular information. Report. Chemical Structure Association, London, England.
Bebak, H., Buse, C., Donner, W. T., Hoever, P., Jacob, H., Klaus, H., Pesch, J., Roemelt, J., Schilling, P., Woost, B. & Zirz, C. (1989). The Standard Molecular Data format (SMD format) as an integration tool in computer chemistry. J. Chem. Inf. Comput. Sci. 29, 1–5.
Brown, I. D. (1988). Standard Crystallographic File Structure-87. Acta Cryst. A44, 232.
Dalby, A., Nourse, J. G., Hounshell, W. D., Gushurst, A. K. I., Grier, D. L., Leland, B. A. & Laufer, J. (1992). Description of several chemical structure file formats used by computer programs developed at Molecular Design Limited. J. Chem. Inf. Comput. Sci. 32, 244–255.
Hall, S. R., Allen, F. H. & Brown, I. D. (1991). The Crystallographic Information File (CIF): a new standard archive file for crystallography. Acta Cryst. A47, 655–685.
ISO (2002a). ISO/IEC 8824-1. Abstract Syntax Notation One (ASN.1). Specification of basic notation. Geneva: International Organization for Standardization.
ISO (2002b). ISO/IEC 8825-1. ASN.1 encoding rules. Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER). Geneva: International Organization for Standardization.