International Tables for Crystallography, Volume F
Crystallography of biological macromolecules
Edited by E. Arnold, D. M. Himmel and M. G. Rossmann

International Tables for Crystallography (2012). Vol. F, ch. 22.2, pp. 713–720
https://doi.org/10.1107/97809553602060000886

Chapter 22.2. Molecular surfaces: calculations, uses and representations

M. S. Chapman (a)* and M. L. Connolly (b)

(a) Department of Chemistry & Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306–4380, USA, and (b) 1259 El Camino Real 184, Menlo Park, CA 94025, USA
Correspondence e-mail: chapman@sb.fsu.edu

In this chapter, the calculation of molecular surface areas, the uses of surface-area calculations and the representation of molecular surfaces are discussed.

22.2.1. Introduction

22.2.1.1. Uses of surface-area calculations

Interactions between molecules are most likely to be mediated by the properties of residues at their surfaces, so surfaces have figured prominently in functional interpretations of macromolecular structure. Which residues are most likely to interact with other molecules? What are their properties: charged, polar, or hydrophobic? What would be the estimated energy of interaction? How do the shapes and properties complement one another? Which surfaces are most conserved among a homologous family? These questions are often asked at the start of a structural interpretation, and answering them depends on the calculation of the molecular and/or accessible surfaces.

Surface-area calculations are used in two ways. Graphical surface representations help to obtain a quick intuitive understanding of potential molecular functions and interactions through visualization of properties such as shape, charge distribution, polarity, or sequence conservation on the molecular surface. Quantitative calculations of surface area are used en route to approximations of the free energy of interactions in binding complexes.

Part of this subject area was the topic of an excellent review by Richards (1985), to which the reader is referred for greater coverage of many of the methods of calculation. This review will attempt to incorporate more recent developments, particularly in the use of graphics, both realistic and schematic.

22.2.1.2. Molecular, solvent-accessible and occluded surface areas

The concept of molecular surface derives from the behaviour of non-bonded atoms as they approach each other. As indicated by the Lennard–Jones potential, strong unfavourable interactions of overlapping non-bonding electron orbitals increase sharply according to [1/r^{12}], and atoms behave almost as if they were hard spheres with van der Waals radii that are characteristic for each atom type and nearly independent of chemical context. Of course, when orbitals combine in a covalent bond, atoms approach much more closely. Lower-energy attractions between atoms, such as hydrogen bonds or aromatic ring stacking, lead to modest reductions in the distance of closest approach. The van der Waals surface is the area of a volume formed by placing van der Waals spheres at the centre of each atom in a molecule.

Non-bonded atoms of the same molecule contact each other over (at most) a very small proportion of their van der Waals surface. The surface is complicated with gaps and crevices. Much of this surface is inaccessible to other atoms or molecules, because there is insufficient space to place an atom without forbidden overlap of non-bonded van der Waals spheres (Fig. 22.2.1.1). These crevices are excluded from the molecular surface area. The molecular surface, also known as the solvent-excluding surface, is the outer surface of the volume from which solvent molecules are excluded. Strictly, this would depend on the orientation of non-spherically symmetric solvents such as water. However, since hydrogen atoms are smaller than oxygen atoms, for current purposes it is sufficient to consider water as a sphere with a radius of 1.4 to 1.7 Å, approximating the `average' distance from the centre of the oxygen atom to the van der Waals surface of water. The practical definition of the molecular surface is, then, the surface of the volume excluded to a spherical probe of 1.4 to 1.7 Å radius.

Figure 22.2.1.1. Surfaces in a plane cut through a hypothetical molecule. The molecular surface consists of the sum of the atomic surfaces that can be contacted by solvent molecules and the surface of the space between atoms from which solvent molecules are excluded. The solvent-accessible surface is the surface formed by the set of the centres of spheres that are in closest contact with the molecular surface.

As an aside, it is important to note that surface-area calculations depend on inexact parameterization. For example, there is no radius of any hard-sphere model that can give a realistic representation of the solvent. Furthermore, the choice of van der Waals radii can depend on whether the distance of zero or minimum potential energy is estimated and the potential-energy function or experimental data used. (Tables of common values are given by Gerstein & Richards in Chapter 22.1.) Thus, calculations of molecular and accessible surfaces are approximate. However, when the errors are averaged over large areas of a macromolecule, the numbers can be precise enough to give important insights into function.

Fig. 22.2.1.1 shows that the molecular surface consists of two components. The contact surface is the part of the van der Waals surface that a probe can touch, i.e. the exterior surfaces of atoms. The re-entrant surface is made up of the inward-facing parts of the surfaces of probes placed in positions where they are in contact with the van der Waals surfaces of two or more atoms. Together, the two components enclose the atoms and the interstitial volume between them.

The occluded molecular surface is an approximate complement to the solvent-accessible surface. It is the part of the surface that would be inaccessible to solvent because of steric conflict with neighbouring macromolecular atoms. It is an approximation in that current calculations use van der Waals surfaces, ignoring the differences between atomic and re-entrant surfaces (see below), and the volume of the probe is not fully accounted for (Pattabiraman et al., 1995). Occluded area is defined as the atomic area whose normals cannot be extended 2.8 Å (the presumptive diameter of a water molecule) without intersecting the van der Waals volume of another atom. This crude approximation to the surface that is inaccessible to water not only increases the speed of calculation, but enables surface areas to be partitioned between the atoms. It is used primarily to evaluate model protein structures by comparing the fraction of each amino acid's surface area that is occluded with that calculated for the same residue types in a database of accurate structures.

22.2.1.3. Hydration surface

Whether graphically displaying a molecule or examining potential docking interactions, it is usually the molecular surface or solvent-accessible surface that is used. However, macromolecules also interact through the small (solvent) molecules that are more or less tightly bound (Gerstein & Lynden-Bell, 1993). There is a gradation of how tightly solvent molecules are bound and how many are bound around different side chains. With dynamics simulations, Gerstein & Lynden-Bell (1993) showed that the second hydration shell was a reasonable, practical `average' limit to which water molecules should be considered significantly perturbed by the protein. They defined a hydration surface as the surface of this second shell and presented evidence that it approximates the boundary between bound and bulk solvent. They presented calculations that showed that molecules interact significantly when their hydration surfaces interact, and not just when they are close enough for their molecular surfaces to form contacts. It may be computationally impractical to perform the simulations required to calculate the hydration surfaces of many proteins, but this work reminds us that energetically significant interactions occur over a wider area than the commonly computed contact molecular-surface area.

22.2.1.4. Hydrophobicity

The hydrophobic effect (Kauzmann, 1959; Tanford, 1997) has its origins in unfavourable entropic terms for water molecules immediately surrounding a hydrophobic group. In the bulk solvent, each water molecule can be oriented in a variety of ways with favourable hydrogen bonding. At the interface with a hydrophobic group, hydrogen bonds are possible only in some directions, with some configurations of the water molecules. When a hydrophobic group is embedded in water, the surrounding solvent molecules have a more restricted set of hydrogen-bonding configurations, resulting in an unfavourable entropic term. The magnitude of the entropic term should be proportional to the number of solvent molecules immediately surrounding the hydrophobic group. This integer number can be considered very approximately proportional to the area of the surface made by the centres of the set of possible solvent probes contacting the solute, i.e. the solvent-accessible surface area (Fig. 22.2.1.1). When large areas are considered, summed over many hydrophobic atoms, the errors of this non-integer approximation are insignificant. It is now common practice to estimate the hydrophobic effect free-energy contribution by multiplying the change in macromolecular surface area by an energy per unit area [~80 J mol−1 Å−2 (Richards, 1985), but see also below].

22.2.2. Calculation of surface area and energies of interaction

22.2.2.1. Introduction

The first method to be discussed allows the calculation of an accessible surface. The first method for calculating molecular surface involved raining water down on a model of a macromolecule and constructing a surface by making a net under the spheres in their landing positions (Greer & Bush, 1978). This ignored overhangs and was replaced by the dot surface method. More recently, methods were developed to make polyhedral surfaces of triangles by contouring between lattice points or by delimiting with arcs the spherical and toroidal surfaces and then subdividing the piece-wise quartic molecular surface. The surface is then composed of patches whose areas can be precisely integrated. van der Waals surfaces consist of convex spherical triangles whose areas can be estimated by the Gauss–Bonnet theorem. Re-entrant surfaces are composed of concave spherical triangles, whose areas can be similarly estimated, and toroidal saddle-shaped patches, whose areas can be calculated by analytical geometry and calculus.

22.2.2.2. Lee & Richards planar slices

The first method for calculating the accessible surface area overlaid the molecule on a regular stack of finely spaced parallel planes (Lee & Richards, 1971). The advantage of this method was the ease with which the area could be calculated. The intersections of the atomic surfaces with the planes were circular arcs whose lengths were readily calculated and multiplied by the planar spacing to give an approximation to the surface area. Programs that are currently distributed use more sophisticated methods.

22.2.2.3. Connolly dot surface algorithm

A molecular dot surface is a smooth envelope of points on the molecular surface. A probe sphere is placed at a set of approximately evenly spaced points so that the probe and van der Waals surfaces of a given atom are tangential. If the probe sphere does not overlap any other atom, the point is designated as surface. To define the re-entrant surface, sphere centres are also sampled that are tangential to both van der Waals spheres of a pair of neighbouring atoms and are equidistant from the interatomic axis. Arcs are then drawn between surface points and the arcs are subdivided into a set of finely spaced points to define the re-entrant surface. Similarly, spheres contacting triplets of neighbouring atoms are tested, and approximately evenly spaced points within the concave triangle defined by the three contact points are added to the re-entrant surface.
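The contact part of this procedure can be sketched in a few lines of Python. This is a minimal illustration only, not Connolly's program: it handles the single-atom tangency test and omits the re-entrant arcs and concave triangles. The function names, the golden-spiral point distribution, the 1.4 Å default probe radius and the number of sample points are all assumptions made for the example.

```python
import numpy as np

def sphere_points(n):
    """Approximately evenly spaced unit vectors (golden-spiral lattice)."""
    i = np.arange(n) + 0.5
    z = 1.0 - 2.0 * i / n
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i
    r = np.sqrt(1.0 - z * z)
    return np.column_stack((r * np.cos(phi), r * np.sin(phi), z))

def contact_dots(centres, radii, probe=1.4, n=500):
    """Return contact-surface dots and an estimated contact area per atom."""
    centres = np.asarray(centres, float)
    radii = np.asarray(radii, float)
    unit = sphere_points(n)
    dots, areas = [], np.zeros(len(radii))
    for i, (c, r) in enumerate(zip(centres, radii)):
        # Probe centres for which the probe is tangential to atom i.
        probe_centres = c + (r + probe) * unit
        free = np.ones(n, bool)
        for j, (cj, rj) in enumerate(zip(centres, radii)):
            if j != i:
                d = np.linalg.norm(probe_centres - cj, axis=1)
                free &= d >= rj + probe      # probe must not overlap atom j
        dots.append(c + r * unit[free])      # tangent points on the vdW sphere
        areas[i] = 4.0 * np.pi * r * r * free.sum() / n
    return np.vstack(dots), areas
```

For example, `contact_dots([[0, 0, 0], [0, 0, 3.0]], [1.5, 1.5])` returns the dots of two hypothetical atoms together with the fraction of each van der Waals sphere that the probe can touch, expressed as an area.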

22.2.2.4. Marching-cube algorithm

This is conceptually the simplest method and is used in the program GRASP (Nicholls et al., 1991). First, grid points of a cubic lattice overlaid on the molecule are segregated into `interior' and `exterior' as follows. All points farther from an atom than the sum of the van der Waals radius and a probe radius are flagged as external. External points with an internal neighbour are flagged as an approximate `accessible surface'. All grid points falling within probe spheres centred at each surface point now join the set of exterior points. Points that remain `interior' define the volume enclosed by the molecular surface.

All that remains is to contour the molecular surface that lies between interior and exterior grid points. This is a little complicated in three dimensions and is achieved by the marching-cube algorithm. Lattice cubes whose corners include both interior and exterior grid points are used to define potential polyhedral vertices. Triangles are defined by joining the midpoints of cube edges that have one interior and one exterior end point. The triangles are joined at their edges in a consistent manner to create a polyhedral surface.
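The grid classification that precedes the contouring can be sketched as follows. This is a rough illustration under stated assumptions, not the GRASP implementation: the function name, the grid spacing, the margin, the 1.4 Å probe and the use of six nearest neighbours to define `neighbour' are choices made for the example, and the triangulation step itself is not shown.

```python
import numpy as np

def classify_grid(centres, radii, probe=1.4, spacing=0.5, margin=4.0):
    """Flag lattice points enclosed by the molecular surface; return flags and volume."""
    centres = np.asarray(centres, float)
    radii = np.asarray(radii, float)
    lo = centres.min(axis=0) - radii.max() - margin
    hi = centres.max(axis=0) + radii.max() + margin
    axes = [np.arange(l, h + spacing, spacing) for l, h in zip(lo, hi)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)  # (nx, ny, nz, 3)

    # Step 1: points within (vdW radius + probe radius) of any atom are interior.
    interior = np.zeros(grid.shape[:3], bool)
    for c, r in zip(centres, radii):
        interior |= np.linalg.norm(grid - c, axis=-1) <= r + probe

    # Step 2: exterior points with an interior neighbour approximate the
    # accessible surface (the margin keeps the wrap-around of np.roll harmless).
    near = np.zeros_like(interior)
    for axis in range(3):
        for shift in (1, -1):
            near |= np.roll(interior, shift, axis=axis)
    surface = ~interior & near

    # Step 3: points inside a probe sphere centred on a surface point become exterior.
    inside = interior.copy()
    for p in grid[surface]:
        inside &= np.linalg.norm(grid - p, axis=-1) > probe
    return inside, spacing ** 3 * inside.sum()
```

With a coarse lattice the enclosed volume is overestimated, because the flagged surface points lie up to one grid spacing outside the true accessible surface; a finer spacing reduces the error at the cost of more probe tests.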

22.2.2.5. Complete and connected rolling algorithms

Several algorithms start by dividing the surface into regions within which the surface is smooth and continuous. The surface can be efficiently described in terms of a set of arcs and their start and end points. In complete rolling, the probe is placed in all possible positions at which it contacts the van der Waals spheres of three neighbouring atoms. Those surrounding the same atom are paired as the start and end points of an arc. The complete rolling algorithm does not distinguish outer and inner (cavity) surfaces. In the connected rolling algorithm, the process starts at a triple contact point that is far from the centre of mass and therefore likely to be external. The probe is then rolled only along crevices between two atoms, pursuing all alternatives, stopping each pathway only when the probe returns to a place that has already been probed. This algorithm therefore produces only the outer surface.

22.2.2.6. Analytic surface calculations and the Gauss–Bonnet theorem

An analytical method was also proposed for calculating approximate accessible areas (Wodak & Janin, 1980). It assumed random distributions of neighbouring atoms, but this can be a sufficient approximation when calculating the area of an entire molecule. The areas of spherical and toroidal pieces of surface can be calculated exactly by analytic and differential geometry (Richmond, 1984; Connolly, 1983). An advantage of analytical expressions over the prior numerical approximations is that analytical derivatives of the areas can be calculated, albeit with significant difficulty. This then provides the opportunity to optimize atomic positions with respect to surface area. Pseudo-energy functions that approximate the hydrophobic contribution to free energy with a term proportional to the accessible surface area (Richards, 1977) can therefore be incorporated in energy-minimization programs. Although rigorous, these methods are computationally cumbersome and are not used in all energy-minimization routines. Incorporation of solvent effects may become more universal with the Gaussian atom approximations discussed below.
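The simplest case in which both an exact area and its analytical derivative are available is a pair of atoms, where the buried part of each probe-expanded sphere is a spherical cap. The sketch below covers only this two-atom case and is not the general method of Richmond or Connolly; the function name and the 1.4 Å default probe radius are assumptions for the example.

```python
import math

def two_sphere_accessible_area(r_i, r_j, d, probe=1.4):
    """Accessible area of atom i next to atom j, and its derivative with respect to d.

    r_i, r_j are van der Waals radii and d is the distance between centres.
    """
    Ri, Rj = r_i + probe, r_j + probe          # probe-expanded radii
    full = 4.0 * math.pi * Ri * Ri
    if d >= Ri + Rj:                           # spheres do not intersect
        return full, 0.0
    if d <= abs(Ri - Rj):                      # one sphere lies inside the other
        return (0.0 if Ri < Rj else full), 0.0
    # Distance from centre i to the plane of the intersection circle.
    x = (d * d + Ri * Ri - Rj * Rj) / (2.0 * d)
    area = full - 2.0 * math.pi * Ri * (Ri - x)    # remove the buried cap
    darea = math.pi * Ri * (1.0 - (Ri * Ri - Rj * Rj) / (d * d))
    return area, darea
```

For two hypothetical 1.5 Å atoms 3.0 Å apart, each expanded sphere has a radius of 2.9 Å and loses a cap of height 1.4 Å, leaving about 80.2 Å² accessible per atom. With many neighbours the caps overlap one another, which is where the Gauss–Bonnet treatment becomes necessary.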

22.2.2.7. Approximations to the surface

The methods discussed above are computationally quite cumbersome, especially if they need to be repeated many times. Thus, they are not well suited to comparisons of many structures. They are also not well suited to the calculation of surface-area-dependent energy terms during dynamics simulation or energy minimization, which require the calculation of the derivatives of the surface area with respect to atomic position. It has been argued by several (including A. Nicholls and K. Sharp, personal communications) that simplifying approximations to the surface-area calculations are in order, because the common uses of surface area already embody crude ad hoc approximations, such as non-integer numbers of spherical solvent molecules.

In the treatments discussed earlier, the volume of the protein is (implicitly) described by a set of overlapping step functions that have a constant value if close enough to an atom, or zero if not. Several authors have replaced these step functions with continuous spherical Gaussian functions centred on each atom (Gerstein, 1992; Grant & Pickup, 1995) in treatments reminiscent of Ten Eyck's electron-density calculations (Ten Eyck, 1977). This speeds up the calculation and also facilitates the calculation of analytical derivatives of the surface area. A surface can be calculated for graphical display by contouring the continuous function at an appropriate threshold. The final envelope can be modified by using iterative procedures that fill cavities and crevices that are (nearly) surrounded by protein atoms (Gerstein, 1992).
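A minimal sketch of such a Gaussian description is given here. The functional form, the peak height of 2 and the contour level of 1 are assumptions chosen so that an isolated atom is contoured exactly at its van der Waals radius; they are not the parameters of Gerstein (1992) or Grant & Pickup (1995). The code returns the density and its analytical gradient, which is the ingredient that makes derivatives tractable.

```python
import numpy as np

def gaussian_density(x, centres, radii, height=2.0):
    """Sum of spherical Gaussians at point x, plus the analytical gradient.

    Each atom contributes height * exp(-ln(height) * d^2 / r^2), which equals 1
    at d = r, so contouring at 1.0 recovers the sphere of an isolated atom.
    """
    x = np.asarray(x, float)
    centres = np.asarray(centres, float)
    radii = np.asarray(radii, float)
    diff = x - centres                              # (n_atoms, 3)
    k = np.log(height) / radii ** 2                 # decay constant per atom
    terms = height * np.exp(-k * np.sum(diff * diff, axis=1))
    gradient = ((-2.0 * k * terms)[:, None] * diff).sum(axis=0)
    return terms.sum(), gradient
```

A point is then taken to lie inside the envelope when the returned density is at least the chosen threshold; evaluating the function on a lattice and contouring gives a displayable surface.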

22.2.2.8. Extended atoms account for missing hydrogen atoms

Structures of macromolecules determined by X-ray crystallography rarely reveal the positions of the hydrogen atoms. It is, of course, possible to add explicit hydrogen atoms at the stereochemically most likely positions, but this is rarely done for surface-area calculations. Instead, their average effect is approximately and implicitly accounted for by increasing the heteroatom van der Waals radius by 0.1 to 0.3 Å. (It is not usual to smear atoms to account for thermal motion.)

22.2.3. Estimation of binding energies

22.2.3.1. Hydrophobicity

As previously introduced, hydrophobic energies result primarily from the reduced entropy of water molecules at the macromolecule–solvent interface and can be estimated from the accessible surface area. A number of different constants relating area to free energy of transfer from a hydrophobic to aqueous environment have been proposed in the range of 67 to 130 J mol−1 Å−2 (Reynolds et al., 1974; Chothia, 1976; Hermann, 1977; Eisenberg & McLachlan, 1986), but if a single value is to be used for all of the protein surface, the consensus among crystallographers has been about 80 J mol−1 Å−2 (Richards, 1985).

There are two widely used enhancements of the basic method. Atomic solvation parameters (ASPs, Δσ) remove the assumption that all protein atoms have equal potential influence on the hydrophobic free energy. Eisenberg & McLachlan (1986) determined separate ASPs for atom types C, N/O, O−, N+ and S (treating hydrogen atoms implicitly) by fitting these constants to the experimentally determined octanol/water relative transfer free energies of the 20 amino-acid side chains of Fauchere & Pliska (1983), assuming standard conformations of the side chains. A much improved estimate of the free-energy change of solvation can then be obtained from [\Delta G = \sum_{{\rm atoms}\;i} \Delta\sigma_{i}A_{i}], where the summation is over all atoms, [A_{i}] is the accessible area of atom i and [\Delta\sigma_{i}] is specific for the atom type. Their estimates of ASPs are given in Table 22.2.3.1. Use of ASPs rather than a single value for all atoms makes substantial differences to the estimated free energies of association of macromolecular assemblies (Xie & Chapman, 1996). Through calculation of the overall energy of solvation, calculations with ASPs also allow discrimination between proposed structures that are correctly folded (with hydrophobic side chains that are predominantly internal) and those that are not (Eisenberg & McLachlan, 1986).

Table 22.2.3.1. The atomic solvation parameters of Eisenberg & McLachlan (1986)

Atom    Δσ(atom) (J mol−1 Å−2)
C       67 (8)
N/O     −25 (17)
O−      −101 (42)
N+      −210 (38)
S       88 (42)
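The summation over atom types is easily expressed in code. The following sketch uses the tabulated values; the function name, the dictionary keys (with "O-" standing for the charged-oxygen row, the third in the table) and the example areas are assumptions made for illustration.

```python
# Atomic solvation parameters from Table 22.2.3.1, in J mol^-1 A^-2.
# "O-" is the charged-oxygen type (third row of the table).
ASP = {"C": 67.0, "N/O": -25.0, "O-": -101.0, "N+": -210.0, "S": 88.0}

def solvation_free_energy(atoms):
    """Sum ASP * accessible area over (atom_type, area in A^2) pairs; result in J mol^-1."""
    return sum(ASP[atom_type] * area for atom_type, area in atoms)

# Hypothetical accessible areas for three atoms.
example = [("C", 30.0), ("N/O", 10.0), ("O-", 5.0)]
print(solvation_free_energy(example))   # 67*30 - 25*10 - 101*5 = 1255 J mol^-1
```

The positive contribution from carbon and the negative contributions from polar and charged atoms show why the result can differ substantially from that obtained with a single constant for all atoms.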

The work of Sharp et al. (1991) indicates that hydrophobicity depends not only on surface area, but also on curvature. Sharp et al. were trying to reconcile long-apparent differences between microscopic and macroscopic measurements of hydrophobicity (Tanford, 1979). Microscopic measurements, the basis of all of our preceding discussions, are derived from the partitioning of dilute solutes between solvents. Macroscopic values can come from the measurements of the surface tension between a liquid bulk of the molecule of interest and water. Macroscopic values for aliphatic carbons are much higher, ~302 J mol−1 Å−2. Postulating that the entropic effects at the heart of hydrophobicity depended on the number of water molecules in contact with each other at the molecular surface (Nicholls et al., 1991), Sharp et al. pointed out that not all surfaces were equivalent. Relative to a plane, concave solute surfaces would accommodate fewer solvent molecules neighbouring the molecular surface, whereas convex surfaces would accommodate more. Their treatment could be considered to be a second-order approximation to the number of interfacial solvent molecules, compared to the prior first-order consideration of only area.

To calculate the curvature of point a on the accessible surface (relative to that of a plane), a sphere of twice the solvent radius is drawn around a (Nicholls et al., 1991). This represents the locus of the centres of solvent molecules that could be in contact with a solvent molecule at a. A curvature correction, c, is the proportion of points on the spherical surface that are inside the inaccessible volume, relative to that for a planar accessible surface ([{1 \over 2}]). In calculating the free energy of transfer, each element of the accessible area is multiplied by its curvature correction. When this is done, the increasingly convex surfaces of small aliphatic molecules account for most of the discrepancy between microscopic and macroscopic hydrophobicities (Nicholls et al., 1991). Furthermore, it emphasizes that, just by their shape, concave surfaces can become relatively hydrophobic. This has been clearly illustrated with GRASP surface representations (see below) in which the accessible surface is coloured according to the local curvature (Nicholls et al., 1991). Consideration of curvature also indicates that the energy of macromolecular association is slightly less than it would otherwise be due to the generation of a concave collar at the interface between two binding macromolecules (Nicholls et al., 1991).
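The curvature correction described above can be estimated numerically by sampling the sphere of radius twice the solvent radius. The sketch below is an illustration under stated assumptions rather than the published implementation: the inaccessible volume is taken to be the union of probe-expanded atomic spheres, and the function name, the 1.4 Å probe and the number of sample points are choices made for the example.

```python
import numpy as np

def curvature_correction(a, centres, radii, probe=1.4, n=2000):
    """Fraction of a sphere of radius 2*probe about point a that lies inside the
    inaccessible volume, divided by the planar value of 1/2."""
    centres = np.asarray(centres, float)
    radii = np.asarray(radii, float)
    # Approximately even sampling of the unit sphere (golden-spiral lattice).
    i = np.arange(n) + 0.5
    z = 1.0 - 2.0 * i / n
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i
    s = np.sqrt(1.0 - z * z)
    unit = np.column_stack((s * np.cos(phi), s * np.sin(phi), z))
    points = np.asarray(a, float) + 2.0 * probe * unit
    d = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
    inside = (d < radii + probe).any(axis=1)   # within any probe-expanded sphere
    return inside.mean() / 0.5
```

For a single isolated atom the result can be checked against geometry: with expanded radius R, the correction is 1 − r_probe/R, so a hypothetical 1.9 Å atom gives about 0.58, whereas a very large, nearly planar sphere gives a value close to 1. Values above 1 indicate a concave environment.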

22.2.3.2. Estimates of binding energies

In a molecular association in which (as is often the case) hydrophobic interactions dominate, the binding energy can be estimated from the surfaces of the individual molecules that become buried upon association (Richards, 1985). The buried area is simply the sum of the surfaces of the two molecules (calculated independently) minus the surface of the complex, calculated as if one molecule. Usually, all heteroatoms are regarded as equivalent, and the buried area is multiplied by a uniform constant, say 80 J mol−1 Å−2 (Richards, 1985). It is only slightly more complicated to use the different ASPs (Eisenberg & McLachlan, 1986) for different atom types and/or to account for curvature (Nicholls et al., 1991). It should be noted that in many crystal structures, the distinction between atom types in some side chains remains indeterminate, e.g. N and C in histidines, O and O− in carboxylates, and N and N+ in arginines. In such cases, average values of the two ASPs can be used (Xie & Chapman, 1996). Such energy calculations have been put to several uses, including attempts to predict assembly and disassembly pathways for viral capsid assemblies (Arnold & Rossmann, 1990; Xie & Chapman, 1996, and citations therein).
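The uniform-constant estimate amounts to two lines of arithmetic, sketched here. The function names and the example areas are hypothetical; the default constant is the 80 J mol−1 Å−2 quoted in the text, and the result is returned as a magnitude without a sign convention.

```python
def buried_area(area_a, area_b, area_complex):
    """Area (A^2) buried on association: separate surfaces minus the surface of the complex."""
    return area_a + area_b - area_complex

def hydrophobic_binding_energy(area_a, area_b, area_complex, gamma=80.0):
    """Estimated magnitude of the hydrophobic binding energy in J mol^-1."""
    return gamma * buried_area(area_a, area_b, area_complex)

# Hypothetical accessible areas: 5000 and 6000 A^2 apart, 9800 A^2 as a complex.
print(buried_area(5000.0, 6000.0, 9800.0))                 # 1200.0 A^2
print(hydrophobic_binding_energy(5000.0, 6000.0, 9800.0))  # 96000.0 J mol^-1
```

Replacing the single constant by a sum over atom types, with the buried area of each atom weighted by its ASP, gives the refined estimate described in the text.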

22.2.3.3. Other non-graphical interpretive methods using surface area

Which are the amino acids most likely to interact with other molecules? It is reasonable to expect them to be surface-accessible. In determining which residues are most surface-exposed, it is necessary to partition molecular or accessible surfaces between atoms. Contact surfaces (Fig. 22.2.1.1) are atom specific. Re-entrant or accessible surfaces can be divided among surface atoms by proximity. Surface areas can then be summed over the atoms in a residue. Accessible surface areas are sometimes reported as accessibilities (Lee & Richards, 1971), i.e. fractions of a maximum where the standard is evaluated from a tripeptide in which the residue of interest is surrounded by glycines. A different approach to assessing surface exposure is to ask what is the largest molecular fragment that could contact a given atom. This is commonly assayed by determining the largest sphere that can be placed tangentially to the van der Waals surface without intersecting any other atom. An alternative approach to locating functionally important surface regions was proposed in the mid-1980s, but is currently not used very often. The local irregularity of surface texture was characterized through measurement of the fractal dimension (Lewis & Rees, 1985).

Substrates, drugs and ligands often bind in clefts or pockets that are concave in shape. Conversely, it is the most exposed convex regions that are likely to be antigenic. The surface shape can be determined by placing a large (say 6 Å radius) sphere at each vertex of the polyhedral molecular surface. If more than half of the sphere's volume overlaps the molecular volume, then the surface is concave, while if less than half, the surface is convex.
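The half-volume test can be sketched by sampling the large sphere on a regular lattice. In this illustration the molecular volume is approximated by the union of van der Waals spheres, which ignores the re-entrant regions; that simplification, the function name and the sampling density are assumptions made for the example.

```python
import numpy as np

def classify_vertex(vertex, centres, radii, big=6.0, n=20):
    """Fraction of a sphere of radius `big` about a surface vertex that overlaps
    the (approximate) molecular volume, and the resulting shape label."""
    centres = np.asarray(centres, float)
    radii = np.asarray(radii, float)
    g = np.linspace(-big, big, 2 * n + 1)
    pts = np.stack(np.meshgrid(g, g, g, indexing="ij"), axis=-1).reshape(-1, 3)
    pts = pts[np.linalg.norm(pts, axis=1) <= big] + np.asarray(vertex, float)
    d = np.linalg.norm(pts[:, None, :] - centres[None, :, :], axis=2)
    fraction = (d <= radii).any(axis=1).mean()   # inside any atom
    return fraction, ("concave" if fraction > 0.5 else "convex")
```

A vertex on an isolated atom is classified as convex because almost all of the 6 Å sphere is empty, whereas a vertex at the foot of a protrusion on an otherwise flat surface is classified as concave.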

Are there similarities in the shapes of surfaces at the interfaces of macromolecular complexes? For example, are there similarities between the shapes of evolutionarily related antigens or the hypervariable regions of antibodies that bind to them? Quantitative comparison of surface topologies is far from trivial, with questions of three-dimensional (3D) alignment, the metrics to be used in quantifying topology etc. In addition to real differences between molecules, their surfaces may appear to differ due to the resolutions at which their structures were determined. Gerstein (1992) has proposed that comparisons be made in reciprocal space so that correlations can be judged as a function of resolution. Coordinates are aligned. Spherical Gaussian functions are placed at each atom, and an envelope is calculated at some threshold value and modified to remove cavities. Gerstein found that comparison of the envelope structure-factor vectors, obtained by Fourier transformation, led to a plausible classification of the hypervariable regions of known antibody structures.

22.2.4. Graphical representations of shape and properties

22.2.4.1. Realistic

22.2.4.1.1. Shaded backbone

With very large complexes, such as viruses, the surface features to be viewed are obvious at low resolution. In a very simple yet effective representation popularized by the laboratories of David Stuart and Jim Hogle, a Cα trace is `depth cued' (shaded) according to the distance from the centre of mass (see, for example, Fig. 1 of Acharya et al., 1990). The impression of three dimensions probably results from the similarity of the shading to highlighting. The method is most effective for large complexes in which there are sufficient Cα atoms to give a dense impression of a surface.
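The shading itself is a simple mapping from radial distance to brightness, as in the sketch below. The linear ramp, the brightness limits, the choice of making outer atoms brighter and the use of the unweighted centroid of the Cα atoms as the centre of mass are all assumptions for the example.

```python
import numpy as np

def depth_cue(ca_xyz, dark=0.2, bright=1.0):
    """Brightness per C-alpha atom, increasing linearly with distance from the centroid."""
    xyz = np.asarray(ca_xyz, float)
    dist = np.linalg.norm(xyz - xyz.mean(axis=0), axis=1)
    span = dist.max() - dist.min()
    t = (dist - dist.min()) / span if span > 0 else np.ones_like(dist)
    return dark + (bright - dark) * t
```

The returned values can be used directly as grey levels, or to scale a colour, when the trace is drawn.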

22.2.4.1.2. `Connolly' and solid polyhedral surfaces

In one of the earliest surface graphical representations, dots were drawn for each Connolly surface dot, using vector-graphics terminals. With the improved graphics capability of modern computers, dot representations have been replaced by ones in which solid polyhedra are drawn with a large enough number of small triangular faces such that the surface appears smooth. These representations are clearer, because atoms in the foreground obscure those in the background.

22.2.4.1.3. Photorealistic rendering

Depth and three-dimensional relationships are most easily represented by stereovision or rotation of objects in real time on a computer screen. Graphics engines for interactive computers compromise quality for the speed necessary for interactive response, but simple depth cueing (combined with motion or stereo) is sufficient for good 3D representation. For still and/or non-stereo images more common in publications, more sophisticated rendering is helpful and possible now that speed is not a constraint. In Raster3D (Merritt & Bacon, 1997), multiple-light-source shading and highlighting are added, with individual calculations for each fine pixel. These are dependent on the directions of the normals to the surface, which are calculated analytically for spherical surfaces. More complicated surfaces, input as connected triangles, are rendered as a raster, pixel by pixel, by interpolating between the surface-normal vectors at the vertices of the surrounding triangle. Together, this leads to a high-quality smooth image that conveys much of the three-dimensionality of molecular surfaces.
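The interpolation of normals across a triangle can be illustrated as follows. This is a generic sketch of the idea, not Raster3D code: the barycentric weighting, the simple diffuse (Lambertian) term and the function names are assumptions for the example, and specular highlights are omitted.

```python
import numpy as np

def interpolated_normal(bary, vertex_normals):
    """Unit normal at a pixel from barycentric weights and the three vertex normals."""
    n = np.asarray(bary, float) @ np.asarray(vertex_normals, float)
    return n / np.linalg.norm(n)

def diffuse_shade(normal, light_dirs, intensities):
    """Sum of diffuse contributions from several directional light sources."""
    lights = np.asarray(light_dirs, float)
    lights = lights / np.linalg.norm(lights, axis=1, keepdims=True)
    return float(np.clip(lights @ normal, 0.0, None) @ np.asarray(intensities, float))
```

Because the normal varies smoothly from pixel to pixel, the facets of the triangulated surface are not visible in the shaded image, even though the geometry remains polyhedral.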

22.2.4.1.4. GRASP surfaces

GRASP is currently perhaps the most popular program for the display of molecular surfaces. Readers are referred to the program documentation (Nicholls, 1992) or a paper that tangentially describes an early implementation (Nicholls et al., 1991). The molecular or accessible surface is determined by the marching-cube algorithm. The surface is filled using methods that make modest compromises on photorealistic light reflection etc., but take advantage of machine-dependent Silicon Graphics surface rendering to perform the display fast enough for interactive adjustment of the view.

The most powerful part of the program is the ability to colour according to properties mapped to the surface (see Fig. 22.2.4.1). These may be values of (say) electrostatic potential interpolated from a three-dimensional lattice. Much has been learned about many proteins from the potentials determined by solution of the Poisson–Boltzmann equation (Nicholls & Honig, 1991). The electrostatic complementarity of binding surfaces has often been readily apparent in ways that were not obvious from Coulombic calculations that ignore screening or from calculations and graphics representations that treat the charges of individual atoms as independent entities.

Figure 22.2.4.1. GRASP example. The larger picture shows the molecular surface of arginine kinase (Zhou et al., 1998) with the domains and a loop moved to the open configuration seen in a homologous creatine kinase structure (Fritz-Wolf et al., 1996). The surface, coloured with positive charge blue and negative charge red, demonstrates that the active-site pocket (centre) is the most positively charged part of the structure. It complements the negatively charged phosphates of the transition-state analogue components that are shown, moved as a rigid body to the bottom right. They are shown in van der Waals representation, in which oxygens are red, carbons black and nitrogens blue.

Many other properties can be mapped to the surface. These include properties of the atoms associated with that part of the surface (such as thermal factors), curvature of the surface calculated from adjacent atoms (Nicholls & Honig, 1991), or distance to the nearest part of the surface of an adjacent molecule. GRASP is now used to illustrate complicated molecular structures, in part because it also supports the superimposition of other objects over the molecular surface. These include the representation of molecules with Corey–Pauling–Koltun (CPK) spheres and/or bonds, and the representation of electrostatic potentials with field lines, dipole vectors etc.

22.2.4.1.5. Implementations in popular packages

Commercial packages use variants of the methods discussed above. For example, surfaces are drawn in the Insight II molecular modelling system using the Connolly dot algorithm (Molecular Structure Corporation, 1995).

22.2.4.2. Schematic and two-dimensional representations such as `roadmap'

For their work on viruses, Rossmann & Palmenberg (1988) introduced a highly schematic representation in which individual amino acids were labelled. The methods were extended by Chapman (1993) to other proteins and to the automatic display of features such as topology, sequence similarity and hydrophobicity. Roadmaps sacrifice a realistic impression of shape for the ability to show the locations and properties of constituent surface atoms or residues. This has been important in combining the power of structure and molecular biology in understanding function. Potential sites of mutation are readily identified without substantial molecular-graphics resources, and phenotypes of mutants are readily mapped to the surface and compared with the physicochemical properties to reveal structure–function correlations.

For a set of projection vectors, the intersection points with the first van der Waals (or solvent-accessible) surface of an atom are calculated by basic vector algebra. The atom is identified so that when the projection is mapped to a plane for display, the boundaries of each atom or amino acid can be determined. The atoms or amino acids can then be coloured, shaded, outlined, contoured, or labelled according to parameters that are either calculated from the coordinates (such as distance from the centre of mass), read from a file (such as sequence similarity), or follow properties that are dependent on the residue type (e.g. hydrophobicity) or atom type [e.g. atomic solvation parameters (Eisenberg & McLachlan, 1986)].
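
The vector algebra involved can be sketched as a ray–sphere intersection test, here applied to a set of parallel projection vectors on a two-dimensional grid. This is a minimal illustration rather than the published roadmap implementation; the names `first_hit` and `parallel_roadmap`, the `(x, y, z, radius)` atom tuples and the choice of the −z viewing direction are assumptions of the example.

```python
import math


def first_hit(origin, direction, atoms, probe=0.0):
    """Return (atom_index, distance) for the first sphere met by a ray, or None.

    atoms: list of (x, y, z, radius) tuples; direction must be a unit vector.
    probe > 0 inflates every radius, giving the solvent-accessible surface
    instead of the van der Waals surface.
    """
    best = None
    for index, (x, y, z, radius) in enumerate(atoms):
        r = radius + probe
        # Vector from the sphere centre to the ray origin.
        oc = (origin[0] - x, origin[1] - y, origin[2] - z)
        b = sum(d * c for d, c in zip(direction, oc))
        c = sum(v * v for v in oc) - r * r
        disc = b * b - c
        if disc < 0:
            continue  # the ray misses this sphere
        t = -b - math.sqrt(disc)  # nearer of the two intersection points
        if t >= 0 and (best is None or t < best[1]):
            best = (index, t)
    return best


def parallel_roadmap(atoms, xs, ys, z_start, probe=0.0):
    """Cast parallel rays along -z from a 2D grid; store the atom seen in each cell."""
    direction = (0.0, 0.0, -1.0)
    grid = []
    for y in ys:
        row = []
        for x in xs:
            hit = first_hit((x, y, z_start), direction, atoms, probe)
            row.append(None if hit is None else hit[0])
        grid.append(row)
    return grid
```

Boundaries between atoms (or residues, once atom indices are mapped to residue numbers) then lie wherever neighbouring grid cells hold different indices, and each cell can be coloured by any per-atom or per-residue property.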

Several types of projections can be used. The simplest is similar to that used by most other surface-imaging programs. A set of parallel projection vectors is mapped to a 2D grid. An example is shown in Fig. 22.2.4.2. This view avoids distortions, but only one side of the molecule is visualized. Roadmaps are flat, two-dimensional projections that cannot be rotated in real time to reveal other views. Three-dimensionality is limited to an extension by Jean-Yves Sgro that maps the parallel projection of one view to a three-dimensional surface shell that can be rotated with interactive graphics and/or viewed with stereo imaging (Harber et al., 1995; Sgro, 1996). However, the schematic nature of roadmaps makes it possible to view all parts of the molecule simultaneously.

[Figure 22.2.4.2]

Figure 22.2.4.2

(a) Solvent-accessible surface topology of a rhinovirus 14–drug complex (Kim et al., 1993). The triangle shows one of the 60 symmetry-equivalent faces of an icosadeltahedron that constitute the entire virus surface. The surface is coloured and contoured according to distance from the centre of the virus, red being the most elevated. Residues are marked with dotted lines and labelled with residue type and number. A letter starting the residue label indicates a symmetry equivalent. The first numeral indicates the protein number (1 to 4), which is followed by the three-digit residue number. A depression, the `canyon', is where the cellular receptor is bound (Olson et al., 1993). The locations of the dominant neutralizing immunogenic (NIm) sites were determined through the sequencing of escape mutants (Sherry & Rueckert, 1985; Sherry et al., 1986) and are labelled `NIm'. (b) The same view is coloured according to sequence similarity (Palmenberg, 1989; Chapman, 1994), with blue being the most conserved rhinoviral amino acids and red being the most variable. Comparison of diagrams like these suggested the `canyon hypothesis' (Rossmann, 1989). The hypothesis, which has proved largely true, predicted that the sites of receptor attachment in several picornaviruses would be depressed areas whose sequences could be more highly conserved because they were partially inaccessible to antibodies and therefore not under the same selective pressure to mutate. In this and other applications, the schematic nature of these diagrams has helped in the collation of structure with data arising from the known phenotypes of site-directed or natural mutants. Part (b) is reproduced from Chapman (1993). Copyright (1993) The Protein Society. Reprinted with the permission of Cambridge University Press.

To view all parts of the molecule, cylindrical projections are used that are similar to those used in atlases. This is possible because the representation is schematic (not realistic), and longitudinal distortion, similar to that near the poles in world maps, is acceptable. The surface is projected outwards radially onto a cylinder that wraps around the macromolecule (Fig. 22.2.4.3). Active-site clefts, drug or inhibitor binding sites and pores can be similarly illustrated by projecting their surfaces outward (from the axis) onto a cylinder that encloses the pore, pocket, or cleft. Such clefts are rarely straight, but with some distortion a satisfactory representation is possible by segmenting the cylinder, so that its axis follows the (curved) centre of the binding site or pore (Fig. 22.2.4.3).
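
A simplified form of the cylindrical projection around a whole molecule can be sketched as follows. Each cell of the unrolled cylinder corresponds to an angle and a height; a ray is sent from the cylinder wall towards the axis, and the first atom met and its distance from the axis are recorded. The sketch assumes a straight cylinder whose axis is the z axis and rays perpendicular to that axis, which is simpler than the published procedure; the names `ray_sphere` and `cylindrical_map` and the default probe radius of 1.4 Å are assumptions of the example.

```python
import math


def ray_sphere(origin, direction, centre, radius):
    """Distance along a unit-direction ray to the near side of a sphere, or None."""
    oc = [o - c for o, c in zip(origin, centre)]
    b = sum(d * v for d, v in zip(direction, oc))
    disc = b * b - (sum(v * v for v in oc) - radius * radius)
    if disc < 0:
        return None  # the ray misses the sphere
    t = -b - math.sqrt(disc)
    return t if t >= 0 else None


def cylindrical_map(atoms, cyl_radius, n_theta, z_values, probe=1.4):
    """Unrolled cylinder map: rows follow z, columns follow the angle theta.

    atoms: list of (x, y, z, radius). Each cell holds (atom_index, elevation),
    where elevation is the distance of the surface point from the z axis,
    or None where the inward ray meets no atom before reaching the axis.
    """
    rows = []
    for z in z_values:
        row = []
        for step in range(n_theta):
            theta = 2.0 * math.pi * step / n_theta
            c, s = math.cos(theta), math.sin(theta)
            origin = (cyl_radius * c, cyl_radius * s, z)
            inward = (-c, -s, 0.0)  # unit vector pointing at the axis
            best = None
            for index, (x, y, zc, r) in enumerate(atoms):
                t = ray_sphere(origin, inward, (x, y, zc), r + probe)
                # Ignore hits beyond the axis; they belong to the far side.
                if t is not None and t <= cyl_radius:
                    if best is None or t < best[1]:
                        best = (index, t)
            row.append(None if best is None else (best[0], cyl_radius - best[1]))
        rows.append(row)
    return rows
```

Shading each cell by its elevation gives a map in which clefts appear as regions of low elevation. Projecting a pore or cleft outwards from its own axis would reverse the ray direction, and a segmented cylinder would apply the same step with a separate axis for each segment.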

[Figure 22.2.4.3]

Figure 22.2.4.3

Different projections illustrated with lysozyme. (a) Lysozyme (Blake et al., 1965; Diamond, 1974) is sketched with MOLSCRIPT (Kraulis, 1991) and shown with a ribbon indicating the active-site cleft (Kelly et al., 1979). A cylindrical surface shown unrolled in (b) is shown in (a) wrapped around lysozyme. Vectors orthogonal to the now cylindrical illustrative surface are extended inwards until they intersect with the sphere. Vectors then run towards the centre of the molecule, and their intersections with the solvent-accessible surface are projected back upon the cylinder [unrolled in (b)]. (b) The surface is shaded according to distance from the centre, revealing the substrate-binding cleft as lighter shading. Details of active-site residues are revealed in (c) with a different type of projection. A segmented bent cylinder was traced along the substrate-binding cleft. The surface shows the projection outwards from points on the cylindrical axis. This reveals the amino acids likely to be in most intimate contact with the substrate. Similar plots, coloured according to charge, atomic solvation parameters, or hydrophobicity, can reveal the nature of predominant chemical interactions. This figure is reproduced from Chapman (1993). Copyright (1993) The Protein Society. Reprinted with the permission of Cambridge University Press.

22.2.5. Conclusion

Both quantitative and qualitative analyses of the surfaces of biomolecules are among the most powerful methods of elucidating functional mechanism from three-dimensional structures. A wide array of methods has been developed to help understand binding interactions and macromolecular assembly and to visualize the shape and physicochemical surface properties of macromolecules. Visualization methods range from those that depict a realistic impression of the topology to those that are more schematic and facilitate collation of structural and genetic information.

Acknowledgements

The authors thank Genfa Zhou for providing Fig. 22.2.4.1. MSC gratefully acknowledges the support of the National Science Foundation (BIR 94–18741 and DBI 98–08098), the National Institutes of Health (GM 55837) and the Markey Charitable Trust.

References

Acharya, R., Fry, E., Logan, D., Stuart, D., Brown, F., Fox, G. & Rowlands, D. (1990). The three-dimensional structure of foot-and-mouth disease virus. New Aspects of Positive-Strand RNA Viruses, edited by M. A. Brinton & S. X. Heinz, pp. 319–327. Washington DC: American Society for Microbiology.
Arnold, E. & Rossmann, M. G. (1990). Analysis of the structure of a common cold virus, human rhinovirus 14, refined at a resolution of 3.0 Å. J. Mol. Biol. 211, 763–801.
Blake, C. C. F., Koenig, D. F., Mair, G. A., North, A. C. T., Phillips, D. C. & Sarma, V. R. (1965). Structure of hen egg-white lysozyme, a three-dimensional Fourier synthesis at 2 Å resolution. Nature (London), 206, 757–761.
Chapman, M. S. (1993). Mapping the surface properties of macromolecules. Protein Sci. 2, 459–469.
Chapman, M. S. (1994). Sequence similarity scores and the inference of structure/function relationships. Comput. Appl. Biosci. (CABIOS), 10, 111–119.
Chothia, C. (1976). The nature of the accessible and buried surfaces in proteins. J. Mol. Biol. 105, 1–12.
Connolly, M. L. (1983). Analytical molecular surface calculation. J. Appl. Cryst. 16, 548–558.
Diamond, R. (1974). Real-space refinement of the structure of hen egg-white lysozyme. J. Mol. Biol. 82, 371–391.
Eisenberg, D. & McLachlan, A. D. (1986). Solvation energy in protein folding and binding. Nature (London), 319, 199–203.
Fauchere, J.-L. & Pliska, V. (1983). Hydrophobic parameters π of amino-acid side chains from the partitioning of N-acetyl-amino-acid amides. Eur. J. Med. Chem. Chim. Ther. 18, 369–375.
Fritz-Wolf, K., Schnyder, T., Wallimann, T. & Kabsch, W. (1996). Structure of mitochondrial creatine kinase. Nature (London), 381, 341–345.
Gerstein, M. (1992). A resolution-sensitive procedure for comparing surfaces and its application to the comparison of antigen-combining sites. Acta Cryst. A48, 271–276.
Gerstein, M. & Lynden-Bell, R. M. (1993). What is the natural boundary for a protein in solution? J. Mol. Biol. 230, 641–650.
Grant, J. A. & Pickup, B. T. (1995). A Gaussian description of molecular shape. J. Phys. Chem. 99, 3503–3510.
Greer, J. & Bush, B. L. (1978). Macromolecular shape and surface maps by solvent exclusion. Proc. Natl Acad. Sci. USA, 75, 303–307.
Harber, J., Bernhardt, G., Lu, H.-H., Sgro, J.-Y. & Wimmer, E. (1995). Canyon rim residues, including antigenic determinants, modulate serotype-specific binding of polioviruses to mutants of the poliovirus receptor. Virology, 214, 559–570.
Hermann, R. B. (1977). Use of solvent cavity area and number of packed solvent molecules around a solute in regard to hydrocarbon solubilities and hydrophobic interactions. Proc. Natl Acad. Sci. USA, 74, 4144–4195.
Kauzmann, W. (1959). Some factors in the interpretation of protein denaturation. Adv. Protein Chem. 14, 1–63.
Kelly, J. A., Sielecki, A. R., Sykes, B. D., James, M. N. & Phillips, D. C. (1979). X-ray crystallography of the binding of the bacterial cell wall trisaccharide NAM-NAG-NAM to lysozymes. Nature (London), 282, 875–878.
Kim, K. H., Willingmann, P., Gong, Z. X., Kremer, M. J., Chapman, M. S., Minor, I., Oliveira, M. A., Rossmann, M. G., Andries, K., Diana, G. D., Dutko, F. J., McKinlay, M. A. & Pevear, D. C. (1993). A comparison of the anti-rhinoviral drug binding pocket in HRV14 and HRV1A. J. Mol. Biol. 230, 206–227.
Kraulis, P. J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Cryst. 24, 946–950.
Lee, B. & Richards, F. M. (1971). The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55, 379–400.
Lewis, M. & Rees, D. C. (1985). Fractal surfaces of proteins. Science, 230, 1163–1165.
Merritt, E. A. & Bacon, D. J. (1997). Raster3D: photorealistic molecular graphics. Methods Enzymol. 277, 505–525.
Molecular Structure Corporation (1995). Insight II User Guide. Biosym/MSI, San Diego.
Nicholls, A. (1992). GRASP: Graphical Representation and Analysis of Surface Properties. New York: Columbia University.
Nicholls, A. & Honig, B. (1991). A rapid finite difference algorithm, utilizing successive over-relaxation to solve the Poisson–Boltzmann equation. J. Comput. Chem. 12, 435–445.
Nicholls, A., Sharp, K. & Honig, B. (1991). Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins, 11, 281–296.
Olson, N., Kolatkar, P., Oliveira, M. A., Cheng, R. H., Greve, J. M., McClelland, A., Baker, T. S. & Rossmann, M. G. (1993). Structure of a human rhinovirus complexed with its receptor molecule. Proc. Natl Acad. Sci. USA, 90, 507–511.
Palmenberg, A. C. (1989). Sequence alignments of picornaviral capsid proteins. In Molecular Aspects of Picornavirus Infection and Detection, edited by B. L. Semler & E. Ehrenfeld, pp. 211–241. Washington DC: American Society for Microbiology.
Pattabiraman, N., Ward, K. B. & Fleming, P. J. (1995). Occluded molecular surface: analysis of protein packing. J. Mol. Recognit. 8, 334–344.
Reynolds, J. A., Gilbert, D. B. & Tanford, C. (1974). Empirical correlation between hydrophobic free energy and aqueous cavity surface area. Proc. Natl Acad. Sci. USA, 71, 2925–2927.
Richards, F. M. (1977). Areas, volumes, packing, and protein structure. Annu. Rev. Biophys. Bioeng. 6, 151–176.
Richards, F. M. (1985). Calculation of molecular volumes and areas for structures of known geometry. Methods Enzymol. 115, 440–464.
Richmond, T. J. (1984). Solvent accessible surface area and excluded volume in proteins: analytical equations for overlapping spheres and implications for the hydrophobic effect. J. Mol. Biol. 178, 63–89.
Rossmann, M. G. (1989). The canyon hypothesis. J. Biol. Chem. 264, 14587–14590.
Rossmann, M. G. & Palmenberg, A. C. (1988). Conservation of the putative receptor attachment site in picornaviruses. Virology, 164, 373–382.
Sgro, J.-Y. (1996). Virus visualization. In Encyclopedia of Virology Plus (CD-ROM version), edited by R. G. Webster & A. Granoff. San Diego: Academic Press.
Sharp, K. A., Nicholls, A., Fine, R. F. & Honig, B. (1991). Reconciling the magnitude of the microscopic and macroscopic hydrophobic effects. Science, 252, 107–109.
Sherry, B., Mosser, A. G., Colonno, R. J. & Rueckert, R. R. (1986). Use of monoclonal antibodies to identify four neutralization immunogens on a common cold picornavirus, human rhinovirus 14. J. Virol. 57, 246–257.
Sherry, B. & Rueckert, R. (1985). Evidence for at least two dominant neutralization antigens on human rhinovirus 14. J. Virol. 53, 137–143.
Tanford, C. (1997). How protein chemists learned about the hydrophobicity factor. Protein Sci. 6, 1358–1366.
Tanford, C. H. (1979). Interfacial free energy and the hydrophobic effect. Proc. Natl Acad. Sci. USA, 76, 4175–4176.
Ten Eyck, L. F. (1977). Efficient structure-factor calculation for large molecules by the fast Fourier transform. Acta Cryst. A33, 486–492.
Wodak, S. J. & Janin, J. (1980). Analytical approximation to the accessible surface areas of proteins. Proc. Natl Acad. Sci. USA, 77, 1736–1740.
Xie, Q. & Chapman, M. S. (1996). Canine parvovirus capsid structure, analyzed at 2.9 Å resolution. J. Mol. Biol. 264, 497–520.
Zhou, G., Somasundaram, T., Blanc, E., Parthasarathy, G., Ellington, W. R. & Chapman, M. S. (1998). Transition state structure of arginine kinase: implications for catalysis of bimolecular reactions. Proc. Natl Acad. Sci. USA, 95, 8449–8454.