International
Tables for
Crystallography
Volume G
Definition and exchange of crystallographic data
Edited by S. R. Hall and B. McMahon

International Tables for Crystallography (2006). Vol. G, ch. 5.1, p. 484

Section 5.1.3.2. Using existing CIF libraries and APIs

H. J. Bernsteina*

aDepartment of Mathematics and Computer Science, Kramer Science Center, Dowling College, Idle Hour Blvd, Oakdale, NY 11769, USA
Correspondence e-mail: yaya@bernstein-plus-sons.com

5.1.3.2. Using existing CIF libraries and APIs

| top | pdf |

Another approach to making an existing application CIF-aware or to design a new CIF-aware application is to make use of one (or more) of the existing CIF libraries and application programming interfaces (APIs). Because the data involved need not be reprocessed, code that uses a library directly is often faster than equivalent code working with filter programs. The code within an application can be tuned to the internal data structures and coding conventions of the application.

The approach to internal design depends on the language, data structures and operating environment of the application. A few years ago, the precise details of language version and operating system would have been major stumbling blocks to conversion. Today, however, almost every platform supports a variation of the Unix application programming interface and many languages have viable interfaces to C and/or C++. Therefore it is often feasible to consider use of C, C++ or Objective-C libraries, even for Fortran applications. Star_Base (Spadaccini & Hall, 1994[link]; Chapter 5.2[link] ) is a program for extracting data from STAR Files. It is written in ANSI C and includes the code needed to parse a STAR File. OOSTAR (Chang & Bourne, 1998[link]; Chapter 5.2[link] ) is an Objective-C package that includes another parser for STAR Files (http://www.sdsc.edu/pb/cif/OOSTAR.html ). CIFLIB (Westbrook et al., 1997[link]) provides a CIF-specific API. CIFPARSE (Tosic & Westbrook, 1998[link]) is another C-based library for CIF. CBFlib (Chapter 5.6[link] ) is an ANSI C API for both CIF and CBF/imgCIF files. The CifSieve package (Hester & Okamura, 1998[link]) provides specialized code generation for retrieval of particular data items in either C or Fortran (see Chapter 5.3[link] for more details). The package cciflib (Keller, 1996[link]) (http://www.ccp4.ac.uk/dist/html/mmcifformat.html ) is used by the CCP4 program suite to support mmCIF in both C and Fortran applications. If an application in Fortran is to be converted with a purely Fortran-based library, the package CIFtbx (Hall, 1993[link]; Hall & Bernstein, 1996[link]) is a solution. See Chapter 5.4[link] for more details.

The common interface provided in C-based applications is for the library to buffer the entire CIF file into an internal data structure (usually a tree), essentially creating a memory-resident database (see Fig. 5.1.3.4[link]). This preload greatly reduces any demands on the application to deal with the order-independence of CIF, at the expense of what can be a very high demand for memory. The problem of excessive memory demand is dealt with in CBFlib by keeping large text fields on disk, with only pointers to them in memory. In some libraries, validation of tags against dictionaries is handled by the API. In others it is the responsibility of the application programmer. While the former approach helps to catch errors early, the second, `lightweight' approach is more popular when fast performance is required.

[Figure 5.1.3.4]

Figure 5.1.3.4 | top | pdf |

Typical dataflow of a C-based CIF API.

The most commonly used versions of Fortran do not include dynamic memory management. In order to preload an arbitrary CIF, one needs to use one of the C-based libraries. Alternatively, a pure Fortran application can transfer CIFs being read to a disk-based random access file. CIFtbx does this each time it opens a CIF. The user never works directly with the original CIF data set. This provides a clean and simple interface for reading, but slows all read access to CIFs. In Fortran, compromises are often necessary, with critical tables handled in memory rather than on disk, but this may force changes in dimensions and then recompilation when dictionaries or data sets become larger than anticipated.

References

Chang, W. & Bourne, P. E. (1998). CIF applications. IX. A new approach for representing and manipulating STAR files. J. Appl. Cryst. 31, 505–509.
Hall, S. R. (1993). CIF applications. IV. CIFtbx: a tool box for manipulating CIFs. J. Appl. Cryst. 26, 482–494.
Hall, S. R. & Bernstein, H. J. (1996). CIF applications. V. CIFtbx2: extended tool box for manipulating CIFs. J. Appl. Cryst. 29, 598–603.
Hester, J. R. & Okamura, F. P. (1998). CIF applications. X. Automatic construction of CIF input functions: CifSieve. J. Appl. Cryst. 31, 965–968.
Keller, P. A. (1996). A mmCIF toolbox for CCP4 applications. Acta Cryst. A52 (Suppl.), C-576.
Spadaccini, N. & Hall, S. R. (1994). Star_Base: accessing STAR File data. J. Chem. Inf. Comput. Sci. 34, 509–516.
Tosic, O. & Westbrook, J. D. (1998). CIFPARSE: A library of access tools for mmCIF. Reference guide. Version 3.1. Nucleic Acid Database Project, Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, USA. http://sw-tools.pdb.org/apps/CIFPARSE/cifparse/cifparse.html .
Westbrook, J. D., Hsieh, S.-H. & Fitzgerald, P. M. D. (1997). CIF applications. VI. CIFLIB: an application program interface to CIF dictionaries and data files. J. Appl. Cryst. 30, 79–83.








































to end of page
to top of page