International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli. © International Union of Crystallography 2010
International Tables for Crystallography (2010). Vol. B, ch. 2.5, pp. 375–388
Section 2.5.7. Single-particle reconstruction
P. A. Penczek^{g}

Cryo-electron microscopy (cryo-EM) in combination with the single-particle approach is a new method of structure determination for large macromolecular assemblies. Currently, resolution in the range 10 to 30 Å can be reached routinely, although in a number of pilot studies it has been possible to obtain structures at 4 to 8 Å. Theoretically, electron microscopy can yield data exceeding atomic resolution, but the difficulties in overcoming the very low signal-to-noise ratio (SNR) and low contrast in the data, combined with the adverse effects of the contrast transfer function (CTF) of the microscope, hamper progress in fulfilling the potential of the technique. However, in recent years, cryo-EM has proven its power in the structure determination of large macromolecular assemblies and machines which are too large and complex for the more traditional techniques of structural biology, i.e., X-ray crystallography and NMR spectroscopy.
Single-particle reconstruction is based on the assumption that a protein exists in solution in multiple copies of the same basic structure. Unlike in crystallography, no ordering of the structure within a crystal grid is required; the enhancement of the SNR is achieved by bringing projection images of different (but structurally identical) proteins into register and averaging them. This is why the technique is sometimes called `crystallography without crystals'.
Within the linear weak-phase-object approximation of the image formation process in the microscope [see equation (2.5.2.43) in Section 2.5.2], 2D projections represent line integrals of the Coulomb potential of the particle under examination convoluted with the point-spread function of the microscope, s, as introduced in Section 2.5.1. In addition, we have to consider the translation t of the projection in the plane of the micrograph, suppression of high-frequency information by the envelope function E of the microscope, and two additive noises m^{B} and m^{S}. The first one is a coloured background noise, while the second is attributed to the residual scattering by the solvent or the supporting thin layer of carbon, if used, and is assumed to be white and affected by the transfer function of the microscope in the same way as the imaged protein. In order to have the image formation model correspond more closely to the physical reality of data collection, we write equation (2.5.6.4) from Section 2.5.6 such that the projection operation is always realized in the z direction of the coordinate system (corresponding to the direction of propagation of the electron beam), while the molecule is rotated arbitrarily by three Eulerian angles:
$$d_n(\mathbf{x}) = s_n(\mathbf{x}) * e_n(\mathbf{x}) * \left[\int f(\mathbf{T}_n \mathbf{r})\,\mathrm{d}r_z + m_n^{S}(\mathbf{x})\right] + m_n^{B}(\mathbf{x}), \quad n = 1, \ldots, N. \tag{2.5.7.1}$$
Here f represents the three-dimensional (3D) electron density of the imaged macromolecule and d_{n} is the nth observed two-dimensional (2D) projection image. The total number of projection images N depends on the structure determination project, and can vary from a few hundred to hundreds of thousands. Further, e is the inverse Fourier transform of the envelope function, x = [x y]^{T} is a vector of coordinates in the plane of projections, r = [r_{x} r_{y} r_{z} 1]^{T} is a vector of coordinates associated with the nth macromolecule and T is the 4 × 4 transformation matrix given by
$$\mathbf{T}(\mathbf{R}, \mathbf{t}) = \begin{bmatrix} \mathbf{R} & \mathbf{t} \\ \mathbf{0} & 1 \end{bmatrix}, \tag{2.5.7.2}$$
with t = [t_{x} t_{y} 0]^{T} being the shift vector of translation of the object (and its projection) in the xy plane (translation in z is irrelevant due to the projection operation) and R being the 3 × 3 rotation matrix specified by three Eulerian angles.
As in Section 2.5.6, two of the angles define the direction of projection, while the third angle results in rotation of the projection image in the plane of the formed image xy; changing this angle does not provide any additional information about the structure f. Both types of noise are assumed to be mutually uncorrelated and independent between projection images (i.e., ⟨m_{i}^{k} m_{j}^{l}⟩ = 0 for i ≠ j or k ≠ l; k, l = S, B) and also uncorrelated with the signal (⟨m^{k} f⟩ = 0; k = S, B). Model (2.5.7.1) is semi-empirical in that, unlike in the standard model, we have two contributions to the noise. Although in principle amorphous ice should not be affected by the CTF, so the term m^{S} should be absorbed into m^{B}, in practice the buffer in which the protein is purified is not pure water and it is possible to observe CTF effects by imaging frozen buffer alone. Moreover, if a thin support carbon is used, it will be a source of very strong CTF-affected noise, which is also included in m^{S}.
In Fourier space, (2.5.7.1) is written by taking advantage of the central section theorem [equation (2.5.6.8) of Section 2.5.6]: the Fourier transform of a projection is extracted as a Fourier plane uv of a rotated Fourier transform of a 3D object:
$$D_n(\mathbf{u}) = \mathrm{CTF}(\mathbf{u}; \Delta z_n, q)\, E_n(\mathbf{u}) \left\{ \left[F(\mathbf{T}_n \boldsymbol{\upsilon})\right]_{w=0} + M_n^{S}(\mathbf{u}) \right\} + M_n^{B}(\mathbf{u}). \tag{2.5.7.3}$$
Here υ is the vector of 3D spatial frequencies and u = [u v]^{T} is its restriction to the central plane w = 0. The capital letters denote Fourier transforms of objects appearing in (2.5.7.1), while the CTF (a Fourier transform of s) depends, among other parameters that are set very accurately (such as the accelerating voltage of the microscope), on the defocus setting Δz and the amplitude contrast ratio q, which reflects the presence of the amplitude contrast that is due to the removal of widely scattered electrons [the real term in (2.5.5.14)]. For the range of frequency considered, q is assumed to be constant and the CTF is written in terms of the phase perturbation function χ [given by equation (2.5.2.33)] as
$$\mathrm{CTF}(u; \Delta z, q) = (1 - q^2)^{1/2} \sin[\chi(u)] - q \cos[\chi(u)], \tag{2.5.7.4}$$
where for simplicity we assumed no astigmatism. Finally, the rotationally averaged power spectrum of the observed image, calculated as the expectation value of its squared Fourier intensities (2.5.7.3), is given by
$$P_d(u) = \mathrm{CTF}^2(u)\, E^2(u) \left[P_f(u) + P_S(u)\right] + P_B(u), \tag{2.5.7.5}$$
where u = |u| is the modulus of spatial frequency.
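The CTF and power-spectrum model described above can be illustrated with a minimal numerical sketch. Several ingredients here are assumptions rather than statements from this section: the CTF is taken as (1 − q²)^{1/2} sin χ − q cos χ, the phase perturbation function is taken in the commonly used form χ(u) = 2π(−Δz λ u²/2 + C_s λ³ u⁴/4) with positive Δz meaning underfocus, and the function names, the 300 kV voltage and the 2 mm spherical aberration constant are illustrative choices only.

```python
import numpy as np

def electron_wavelength(voltage_kv):
    """Relativistic electron wavelength in Å for an accelerating voltage in kV."""
    v = voltage_kv * 1.0e3  # volts
    return 12.2643 / np.sqrt(v + 0.97845e-6 * v * v)

def ctf_1d(u, defocus, q, voltage_kv=300.0, cs_mm=2.0):
    """Non-astigmatic CTF; u in 1/Å, defocus in Å (assumed positive for underfocus)."""
    lam = electron_wavelength(voltage_kv)
    cs = cs_mm * 1.0e7  # mm -> Å
    # Assumed form of the phase perturbation function chi(u).
    chi = 2.0 * np.pi * (-0.5 * defocus * lam * u**2 + 0.25 * cs * lam**3 * u**4)
    return np.sqrt(1.0 - q * q) * np.sin(chi) - q * np.cos(chi)

def power_spectrum_model(ctf, envelope, p_f, p_s, p_b):
    """Rotationally averaged power spectrum: CTF^2 E^2 (P_f + P_S) + P_B."""
    return ctf**2 * envelope**2 * (p_f + p_s) + p_b

# Example: 2 µm underfocus, 7% amplitude contrast, frequencies up to 0.25 1/Å.
u = np.linspace(0.0, 0.25, 501)
ctf = ctf_1d(u, defocus=20000.0, q=0.07)
```

With these conventions the CTF equals −q at zero frequency and never exceeds unity in modulus, which is a quick sanity check on any implementation.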
The goal of single-particle reconstruction is to determine the 3D electron-density map f of a biological macromolecule such that its projections agree in a least-squares sense with a large number of collected 2D electron-microscopy projection images, d_{n} (n = 1, 2, …, N), of isolated (single) particles with random and unknown orientations. Thus, we seek a least-squares solution to the problem stated by (2.5.7.1) [or, equivalently, in Fourier space, to (2.5.7.3)]. This is formally written as a nonlinear optimization problem (Yang et al., 2005),
$$\min_{\mathbf{T}_1, \ldots, \mathbf{T}_N,\, f} L(\mathbf{T}_1, \ldots, \mathbf{T}_N, f) = \tfrac{1}{2} \sum_{n=1}^{N} \left\| d_n(\mathbf{x}) - s_n(\mathbf{x}) * e_n(\mathbf{x}) * \int f(\mathbf{T}_n \mathbf{r})\,\mathrm{d}r_z \right\|^2. \tag{2.5.7.6}$$
The factor of ½ is included merely for convenience. The objective function in (2.5.7.6) is clearly nonlinear due to the coupling between the orientation parameters T_{n} (n = 1, 2, …, N) and the 3D density f.
The parameters in (2.5.7.6) to be determined can be separated into two groups. (1) The orientation parameters, which have to be determined entirely by solving (2.5.7.6) and for which there are no initial guesses, and the structure f itself, for which we may or may not have an initial guess. The number of parameters in this group is very large: n^{3} + 5m. Note that in single-particle reconstruction, the number of projection data m is far greater than the linear size of the data in pixels, i.e., m ≫ n. (2) Various parameters which we will broadly call the parameters of the image formation model (2.5.7.1)–(2.5.7.4): the defocus settings of the microscope Δz_{n}, the amplitude contrast ratio q and, if analytical forms of the envelope function E, the power spectrum of the background noise M, or the structure F are adopted, the parameters of these equations. Some of the parameters in the second group are usually known very accurately or can be estimated from micrograph data before one attempts to solve (2.5.7.6) (see Section 2.5.7.4), but they can also be refined during the structure determination process [for the method for correcting the defocus settings, see Mouche et al. (2001)].
Owing to the very large number of parameters in (2.5.7.6) and the nonlinearities present, one almost never attempts to solve the problem directly. Instead, structure determination using the single-particle technique involves several steps. (i) The macromolecular complex is prepared with a purity of at least 90%. (ii) The sample is flash-frozen in liquid ethane. Alternatively, cryo-negative stain techniques or traditional negative stain methods can be used. (iii) Pictures of the macromolecular complexes are taken. (iv) Exhaustive analysis of 2D particle images aimed at increasing the SNR of the data and evaluation of the homogeneity of the sample is performed. (v) An initial low-resolution model of the structure is established using either experimental techniques or computational methods. (vi) The initial structure is refined in order to increase the resolution using an enlarged data set. Only in this step does one attempt to minimize (2.5.7.6) more or less directly. (vii) Visualization and interpretation of the resulting 3D electron-density map is the last step; it often involves docking of X-ray structures of molecules into EM density maps in order to reveal the arrangement of known molecules within the EM envelope (Fig. 2.5.7.1). As within the weak-phase-object approximation of the image formation in EM the relation between densities in collected images and the 3D electron density of the imaged macromolecule is linear [(2.5.7.1)], all data-processing methods employed in the structure determination project should be linear, so the densities in the cryo-EM 3D model can be interpreted in terms of the electron density of the protein.
In an actual single-particle project not all the steps have to be executed in the order outlined above. The technique has proved to be particularly useful in studies of functional complexes of proteins whose base state is known to a certain resolution or even of functional complexes whose atomic (X-ray crystallographic) structure is known. In these cases, steps (iv) and (v) can be omitted and the structure of the functional complex (for example with ligands bound to it) can be relatively easily determined using the native structure as a starting point for step (vi).
In addition to difficulties with obtaining good cryo-EM data, the technique is computationally intensive. The reason is that in order to obtain a sufficient SNR in the 3D structure, processing of hundreds of thousands of EM projection images of the molecule might be necessary. For each, five orientation parameters have to be determined, and this is in addition to determination of the image-formation parameters required for the optimization of correlation searches. In effect, it is not unusual for single-particle projects to consume weeks of the computer time of multiprocessing clusters. This also explains why the knowledge of the base structure simplifies the work to a large degree: when it is known, initial values of the orientation parameters can be easily established, reducing not only the computational time, but also possibilities of errors in the structure-determination process.
The electron microscope is a phase imaging system; i.e., in order to create contrast in images, they have to be underfocused. Owing to the particular form of the CTF of the microscope [(2.5.7.4)], not only are the amplitudes of the image in Fourier space modified, but information in some ranges of spatial frequencies is set to zero and some phases have reversed sign. Therefore, in order to obtain coverage of Fourier space that is as uniform as possible, the standard practice is to take pictures using different defocus settings and merge them computationally in order to fill gaps in Fourier space. The problem is compounded by the relation between underfocus and the envelope function of the microscope. Far-from-focus images have high contrast, but the envelope function has a relatively steep fall-off limiting the range of useful spatial frequencies. Conversely, close-to-focus images have little contrast, but the envelope function decreases slowly, extending useful information to high spatial frequencies. In effect, it is easier to process far-from-focus data computationally and to obtain accurate alignment of particles, but the results have severely limited resolution. Processing of close-to-focus data is challenging and results tend to be less accurate, but there is the potential to obtain high-resolution information.
The experimental techniques of initial structure determination (random conical tilt, tomography) require collection of tilt data. This is facilitated by dedicated microscope stages that can be rotated inside the microscope column yielding additional views of the same field. However, collection of high-quality tilt images is difficult. The quality of tilted images tends to be adversely affected by charging and drift effects. Moreover, as the stage is tilted the effective ice thickness increases (in inverse proportion to the cosine of the tilt angle, so at 60° the factor is two) and the contrast of the images decreases correspondingly. Finally, the defocus in tilted micrographs varies depending on the position in the field, often forcing users to restrict the particle selection only to regions in the vicinity of the tilt axis. However, tilting establishes geometrical relations between different projections of the same particle, unambiguously allowing for robust determination of an initial 3D model and the handedness of the quaternary structure of the complex.
Electron microscope images can be either recorded on film and subsequently converted to digital format, or they can be recorded using a charge-coupled device (CCD) camera in a digital format directly on a microscope. In either case, it is necessary to select the magnification of the microscope and the eventual pixel size of the digitized data before the data-collection session. High magnification can potentially yield high-resolution data, but at the same time it decreases the yield of particles. Lower magnification values can be used when images are recorded on film, which does not attenuate high spatial frequencies to the same extent as CCD cameras tend to do.
The pixel size has to be adjusted according to the expected resolution of the final structure. Although it is tempting to adopt a small pixel size (in the hope of achieving high resolution of the results), in most cases this is counterproductive, as it results in very large computer files that are difficult to handle and in excessively long data-processing times. Theoretically, the optimum pixel size is tied to the maximum frequency present in the data by Shannon's sampling theorem, which states that no information is lost if the signal is sampled at twice the maximum frequency present in the signal, and no additional information is gained by sampling using higher frequency. Thus, if the expected resolution is 12 Å, it should be sufficient to use a pixel size (on the specimen scale) of 6 Å. In practice, various image-processing operations performed during alignment of the data and 3D reconstruction of the complex significantly lower the range of useful frequencies. This is because in currently available single-particle reconstruction software packages rather unsophisticated interpolation schemes are employed, which were selected mainly for the speed of calculations. Therefore, it is advisable to oversample the data by a factor of 1.5 or even 3.0. For an expected resolution of 12 Å this corresponds to pixel sizes of 4 and 2 Å, respectively.
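The sampling arithmetic in the preceding paragraph can be written out as a short sketch. The function name and its arguments are hypothetical; the only rule encoded is that the pixel size equals the expected resolution divided by twice the oversampling factor.

```python
def nyquist_pixel_size(resolution, oversampling=1.0):
    """Pixel size (same units as resolution) for a given oversampling factor.

    oversampling=1.0 corresponds to sampling exactly at the Shannon limit,
    i.e. a pixel size of half the expected resolution.
    """
    if resolution <= 0 or oversampling <= 0:
        raise ValueError("resolution and oversampling must be positive")
    return resolution / (2.0 * oversampling)

# Expected resolution of 12 Å with no, 1.5-fold and 3-fold oversampling.
for factor in (1.0, 1.5, 3.0):
    print(factor, nyquist_pixel_size(12.0, factor))
```

For a 12 Å target this reproduces the pixel sizes of 6, 4 and 2 Å quoted in the text.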
The windowed particles have to be normalized to adjust the image densities to a common framework of reference. The reason for this step is that microscopy conditions are never exactly the same and also within the same micrograph field the background densities can vary by a significant margin due to uneven ice thickness and other factors. A sensible approach to normalization is to assume that the statistical distribution of noise in areas surrounding particles should be the same (Boisset et al., 1993). Hence a large portion of one of the micrographs from the processed set is selected and a reference histogram of its pixel values is generated. Next, assuming a linear transformation of pixel values, the two parameters of this transformation are found in such a way that the histogram of the transformed pixel values surrounding the particle optimally matches the reference histogram using χ^{2} statistics as a discrepancy measure.
The initial assessment of the quality of the micrographs is usually performed during the data collection and in most cases before the micrographs are digitized. The micrographs are examined visually and those that have noticeable drift, astigmatism or contamination, or too low a number of particles to justify further analysis, are discarded. After digitization of the accepted micrographs, the first step is estimation of the power spectrum, which will be examined for the presence of Thon rings (thus confirming that the micrograph is indeed usable) and astigmatism.
The method of averaged overlapping periodograms (Welch, 1967) is commonly used in EM to calculate the power spectrum. It is designed to improve the statistical properties of the estimate by taking advantage of the fact that when K identically distributed independent measurements are averaged, the variance of the average is decreased with respect to the individual variance by the ratio 1/K. Thus, instead of calculating a periodogram (squared moduli of the discrete Fourier transform) of the entire micrograph field, one subdivides it into much smaller windows, calculates their periodograms and averages them. Typically, one would choose a window size of 512 × 512 pixels and an overlap of 50%, which will result in the reduction of the variance of the estimate to a few percent with respect to the variance of the periodogram of the entire field (Fernandez et al., 1997; Zhu et al., 1997). Further reduction of the variance is achieved by rotational averaging of the 2D power-spectrum estimate. The resulting one-dimensional (1D) profile is finally used in the third step of our procedure.
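The averaging of overlapping periodograms followed by rotational averaging can be sketched as follows. This is an illustrative implementation only: the function names are hypothetical, no tapering window is applied to the tiles (an assumed simplification), and the periodogram is normalized by the number of pixels per tile so that white noise of unit variance gives a flat spectrum of about one.

```python
import numpy as np

def welch_power_spectrum(micrograph, window=512, overlap=0.5):
    """Average periodograms of overlapping square tiles; returns (spectrum, tile count)."""
    step = max(1, int(window * (1.0 - overlap)))
    ny, nx = micrograph.shape
    acc = np.zeros((window, window))
    count = 0
    for y in range(0, ny - window + 1, step):
        for x in range(0, nx - window + 1, step):
            tile = micrograph[y:y + window, x:x + window]
            tile = tile - tile.mean()  # remove the constant term of each tile
            acc += np.abs(np.fft.fft2(tile)) ** 2 / tile.size
            count += 1
    if count == 0:
        raise ValueError("micrograph is smaller than the window")
    # Shift the origin of Fourier space to the centre of the array.
    return np.fft.fftshift(acc / count), count

def rotational_average(ps2d):
    """1D profile: mean of the centred 2D spectrum over rings of integer radius."""
    c = ps2d.shape[0] // 2
    y, x = np.indices(ps2d.shape)
    r = np.rint(np.hypot(y - c, x - c)).astype(int)
    sums = np.bincount(r.ravel(), weights=ps2d.ravel())
    counts = np.bincount(r.ravel())
    return (sums / counts)[: c + 1]
```

Calling `welch_power_spectrum(micrograph, 512, 0.5)` and passing the result to `rotational_average` yields the 1D profile that the text uses in subsequent defocus estimation.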
For a set of micrographs the power spectra can be evaluated either visually or computationally in an automated fashion. Of main concern are the presence of Thon rings, the astigmatism and the extent to which Thon rings can be detected. Although in principle astigmatic data could be used in subsequent analysis (in fact, astigmatism could be considered advantageous, as particles from the same micrograph would contain complementary information in Fourier space), in practice they are discarded as currently there is no software that can process astigmatic data efficiently. The extent of Thon rings indicates the `resolution' of the data, i.e., the maximum frequency to which information in the data can be present.
A number of well established programs can assist the user in the calculation of power spectra and automated estimation of defocus and astigmatism (Huang et al., 2003; Mindell & Grigorieff, 2003; Sander et al., 2003; Mallick et al., 2005). Given the analytical form of the CTF [(2.5.7.4)], the problem is solved by a robust fitting of the CTF parameters such that the analytical form of the CTF matches the power spectrum of the micrograph. Usually, the steps employed are: (1) robust estimation of the power spectrum; (2) calculation of the rotational average of the power spectrum; (3) subtraction from this rotational average of the slowly decreasing background [roughly corresponding to P_{B} in (2.5.7.5)]; (4) fitting of the defocus value using known settings of the microscope (voltage, spherical aberration constant, …) and usually assuming a constant and known value of the amplitude contrast ratio q (for cryo-EM data, q should be in the range 0.02–0.10); and (5) using the established defocus value, analysis of the 2D power spectrum and fitting of the astigmatism amplitude and angle while refining the defocus. As long as the defocus value is not too small and there are at least two detectable zeros of the CTF, all available programs give very good and comparable results.
In some single-particle packages, the automated calculation of defocus is integrated with the estimation of additional characteristics of the image-formation parameters that are required for advanced application of a Wiener filter [(2.5.7.18)] (Saad et al., 2001; Huang et al., 2003), i.e., the power spectra of two noise distributions P_{S} and P_{B} and the envelope function of the microscope for each micrograph. A possible approach is to select slowly varying functions and fit their parameters to match the estimates of P_{S}, P_{B} and P_{d} obtained from the data. Finally, it is necessary to have a description of the 1D rotationally averaged power spectrum of the complex P_{f}. One possibility is to carry out X-ray solution scattering experiments (Gabashvili et al., 2000; Saad et al., 2001) that yield a 1D power spectrum of the complex in solution. However, these experiments require large amounts of purified sample and the accuracy of the results in terms of the overall fall-off of the power spectrum can be disputed. For the purpose of cryo-EM, a simple approximation of the protein power spectrum by analytical functions is satisfactory.
Depending on the properties of the imaged complex and the magnification used, a single micrograph can yield from a few to thousands of individual particle projections. The first step of the data processing is identification of particle projections in micrographs and their selection. The particles have to be windowed (boxed) using a window size exceeding the particle size by a 30–50% margin. Thus, for example, in order to determine to 12 Å resolution the structure of a 550 kDa complex that has a diameter of ~120 Å, it is appropriate to choose a pixel size of 3 Å and a window size of 60 pixels.
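The window-size example above follows from simple arithmetic, sketched here with a hypothetical helper. The margin is passed as an integer percentage to keep the arithmetic exact.

```python
import math

def window_size_pixels(diameter, pixel_size, margin_percent):
    """Window edge in pixels for a particle diameter and pixel size in the same units."""
    particle_px = math.ceil(diameter / pixel_size)
    return math.ceil(particle_px * (100 + margin_percent) / 100)

# A ~120 Å particle sampled at 3 Å per pixel spans 40 pixels.
print(window_size_pixels(120.0, 3.0, 30))  # lower end of the 30-50% margin
print(window_size_pixels(120.0, 3.0, 50))  # upper end of the 30-50% margin
```

The 30–50% margin therefore corresponds to windows of 52 to 60 pixels, and the 60-pixel window quoted in the text sits at the upper end of that range.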
The selection of particles is a labour-intensive process; however, the quality of selected particle projections is a major factor in the subsequent steps of analysis and the inclusion of too many imperfect images may preclude successful determination of the 3D structure. There are three possible approaches: (1) manual selection; (2) semi-automated selection; (3) fully automated selection. In the early stages of analysis, particularly when little is known about the shape of the protein and the distribution of projection views, the manual approach is preferable. The researcher displays the micrograph on a computer screen (usually preprocessed by Fourier filtration and contrast-adjusted for better visibility of the protein) and interactively identifies locations of particle views. A trained and careful operator can yield much better results than automated approaches. The main risk is in inherent bias of a human operator – there is a tendency to focus on more familiar and more easily visible particle projections, omitting less frequently appearing orientations and in effect jeopardizing successful structure determination. In semi-automated approaches, an initial step in which putative particle projections in a micrograph are chosen is performed by a computer, all candidates are windowed and the user screens a gallery of possible particles instead of the full micrograph. Algorithms that perform the initial identification of particle views range from very simple (for example a band-pass filtration of a micrograph with subsequent selection of peaks that are no closer to each other than half of the expected particle size) to sophisticated nonlinear noise-suppression methods [for details on various algorithms see the Special Issue of the Journal of Structural Biology (Zhu et al., 2004)]. Since the human operator will be responsible for the ultimate decision, preference is given to the faster method.
In most cases, semi-automated methods are implemented within a framework of a user-friendly graphical user interface that can greatly facilitate the work. Fully automated methods are currently actively under development but, curiously, even for proteins whose high-resolution structure is known, the success rate cannot match that of a human operator (Zhu et al., 2004).
The automated procedures can be divided into three groups: (1) those that rely on ad hoc steps of denoising and contrast enhancement followed by the search for regions of known size that emerge above the background level (Adiga et al., 2004); (2) those that extract orientation-independent statistical features from regions of the micrograph that may contain particles and proceed with classification (Lata et al., 1995; Hall & Patwardhan, 2004); and (3) those that employ templates, i.e., either class averages of particles selected from micrographs or projections of a known 3D structure of the complex (Huang & Penczek, 2004; Sigworth, 2004).
The advantage of the first two approaches is that they do not require template images, i.e., since they are based on a very broadly defined description of particles (general size, shape or abstract features derived from examples of typical particles), they are applicable in cases when no 3D structure of the complex is available. Methods from the second category usually require a training session for the algorithm to construct a set of weights for the predefined features. The methods that take advantage of the availability of templates vary greatly in complexity from straightforward cross-correlation with a generic shape (a Gaussian function, a low-passed circle) (Frank & Wagenknecht, 1984) to matched filters with a large number of templates and parameters derived from the image-formation model of the micrograph (CTF and envelope functions) (Huang & Penczek, 2004; Sigworth, 2004). The motivation is clear: given an ideal object and image-formation parameters, it should be only a matter of sheer computer power and the user's patience to have all particles matching the template selected. In practice, the problem is much more challenging and the success rate of template-based methods does not necessarily exceed the success rate of carefully tuned ad hoc methods.
One of the difficulties with the application of correlation techniques to the particle-picking problem is the unevenness of micrographs, which is caused by uneven illumination by the electron beam and, to a much larger degree, by the uneven thickness of the ice layer and, when used, the supporting carbon. A possible remedy is to calculate a `locally normalized' cross-correlation function, in which the total variance of the micrograph is replaced by the local variance of the micrograph calculated within a window of n pixels centred on the current location l. This method has a fast implementation in Fourier space (van Heel, 1982; Roseman, 2003). A faster method is to just apply a high-pass filtration of the micrograph using a high-pass Gaussian Fourier filter with a half-width 1/(np) Å^{−1}, where p is the pixel size. This simple step will all but eliminate the unevenness of the micrograph background.
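The high-pass filtration mentioned above can be sketched as a multiplication in Fourier space. The function name is hypothetical, and interpreting the half-width as the standard deviation of the Gaussian is an assumption of this sketch. Frequencies are expressed in cycles per pixel, so that the half-width 1/(np) Å^{−1} becomes 1/n.

```python
import numpy as np

def gaussian_highpass(micrograph, n_pixels):
    """Suppress background variations slower than about n_pixels pixels."""
    ny, nx = micrograph.shape
    fy = np.fft.fftfreq(ny)[:, None]  # cycles per pixel along y
    fx = np.fft.fftfreq(nx)[None, :]  # cycles per pixel along x
    sigma = 1.0 / n_pixels            # assumed meaning of the filter half-width
    # High-pass Gaussian: zero at the origin, approaching one at high frequency.
    hp = 1.0 - np.exp(-(fx**2 + fy**2) / (2.0 * sigma**2))
    return np.real(np.fft.ifft2(np.fft.fft2(micrograph) * hp))
```

Because the filter is exactly zero at the origin of Fourier space, the filtered micrograph has zero mean, and slow gradients such as those caused by uneven ice thickness are strongly attenuated while particle-scale detail passes almost unchanged.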
The main difficulty with the correlation technique is the computational complexity of the problem arising from the very large number of templates that have to be considered. The particles in the micrograph are projections of a 3D object with arbitrary in-plane rotations. In effect, to perform an exhaustive search, it is necessary to sample quasi-uniformly three Eulerian angles [equation (2.5.7.17) with ]. For example, a very crude angular step of results in ~13 000 2D templates! A reduction in the number of templates can be achieved either using clustering techniques (Huang & Penczek, 2004; Wong et al., 2004) or by exploring the eigenstructure of the whole set of templates (Sigworth, 2004).
Alignment of pairs of 2D images is a fundamental step in single-particle reconstruction. It is aimed at bringing into register various particle projections by determining three orientation parameters (a rotation angle and x and y translations) and is employed in 2D alignment of large sets of 2D noisy data and in 3D structure-refinement algorithms. The computational efficiency and numerical accuracy of this step are deciding factors in achieving high-quality structural results in an acceptable time.
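All 2D alignment methods considered are aimed at finding transformation parameters such that the least-squares discrepancy between two images f and g is minimized,
$$\|f(\mathbf{x}) - g(\mathbf{T}\mathbf{x})\|^2 = \int \left[f(\mathbf{x}) - g(\mathbf{T}\mathbf{x})\right]^2 \mathrm{d}\mathbf{x} \rightarrow \min, \tag{2.5.7.7}$$
where x = [x y 1]^{T} is a vector containing the coordinates. T is the transformation matrix given by
$$\mathbf{T}(\alpha, t_x, t_y) = \begin{bmatrix} \cos\alpha & -\sin\alpha & t_x \\ \sin\alpha & \cos\alpha & t_y \\ 0 & 0 & 1 \end{bmatrix} \tag{2.5.7.8}$$
and is dependent on three transformation parameters: rotation angle α and two translations t_{x} and t_{y}. It has to be noted that a minimum of (2.5.7.7) can be found rapidly using the fast Fourier transform (FFT) algorithm if only the xy translation is sought (2D FFT), or if only the rotation angle is needed (1D FFT).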
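The translation-only case can be illustrated with a short sketch. Since the norms of f and g do not change under a shift, minimizing the least-squares discrepancy is equivalent to maximizing the cross-correlation, which a 2D FFT evaluates for all shifts at once. The function name is hypothetical, and the sketch assumes integer shifts with periodic (wrap-around) boundaries.

```python
import numpy as np

def find_translation(f, g):
    """Integer (dy, dx) such that np.roll(g, (dy, dx), axis=(0, 1)) best matches f."""
    # Circular cross-correlation of f and g evaluated for all shifts at once.
    ccf = np.real(np.fft.ifft2(np.fft.fft2(f) * np.conj(np.fft.fft2(g))))
    peak = np.unravel_index(np.argmax(ccf), ccf.shape)
    # Map peak positions beyond half the image size to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, ccf.shape))
```

In practice the peak would be refined to sub-pixel accuracy by interpolation around the maximum; that refinement is omitted here.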
2D alignment methods can be divided into three classes: (1) those that employ exhaustive searches in order to find three orientation parameters; (2) those that perform exhaustive searches either by using simplifications (separate searches for translation and rotation parameters) (Penczek et al., 1992) or by taking advantage of invariant image representations (Schatz & van Heel, 1990; Frank et al., 1992 and the following discussion; Schatz & van Heel, 1992; Marabini & Carazo, 1996); and finally (3) those that are aimed at improvement of previously determined parameters and employ local searches.
In practice, as the windowed particles are approximately centred, the search for translation parameters can be restricted to relatively small values. A very efficient algorithm that takes advantage of the geometry is based on resampling to polar coordinates of the area of the image that roughly corresponds to the particle size. The resampling is done around centres placed on pixels located within a distance from the image centre that corresponds to a preset maximum translation (Joyeux & Penczek, 2002) (Fig. 2.5.7.2). For each translation, a 1D rotational cross-correlation function in polar coordinates is calculated. Overall, the alignment method based on resampling to polar coordinates comprises the following steps: (1) the image is resampled to polar coordinates; (2) 1D FFTs of various lengths are calculated, appropriately weighted and padded with zeros to equalize their lengths; (3) complex multiplications with 1D Fourier transforms of the similarly processed reference image are calculated; (4) the inverse 1D FFT is calculated and the position of the maximum is found. The last step yields the rotation angle. Steps (1)–(4) are repeated with the image that is being aligned shifted to account for translations. In addition, the rotation angle for one of the images being mirrored is efficiently calculated in parallel with step (3) by repeating the multiplication with the 1D Fourier transforms of the reference image complex conjugated. This additional check is a necessity in the analysis of single-particle data sets, as usually one can expect on average half of the images to be mirrored versions of the other half in the data set. Overall, the method is very accurate, because only data under the circular mask enter the calculation.
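Steps (2)–(4), together with the mirror check, can be sketched for images that have already been resampled to polar coordinates. Several simplifications are assumed: every ring carries the same number of angular samples (so no zero padding is needed), rings are weighted by their radius, and which factor is complex conjugated for the direct and mirrored cases is a convention of this sketch rather than necessarily that of the published algorithm. The function name is hypothetical.

```python
import numpy as np

def rotational_alignment(ref_polar, img_polar, radii):
    """Return (angle in degrees, mirrored flag) bringing img_polar onto ref_polar.

    ref_polar, img_polar: arrays of shape (n_rings, n_angles) on the same polar grid.
    radii: radius of each ring, used as the weight of that ring.
    """
    w = np.asarray(radii, dtype=float)[:, None]
    ref_ft = np.fft.fft(ref_polar, axis=1)
    img_ft = np.fft.fft(img_polar, axis=1)
    # Rotational cross-correlation, summed over rings, for the image as given ...
    direct = np.fft.ifft((w * ref_ft * np.conj(img_ft)).sum(axis=0)).real
    # ... and for its mirrored version (conjugation dropped on the image transform).
    mirrored = np.fft.ifft((w * ref_ft * img_ft).sum(axis=0)).real
    n_angles = ref_polar.shape[1]
    if mirrored.max() > direct.max():
        return 360.0 * int(np.argmax(mirrored)) / n_angles, True
    return 360.0 * int(np.argmax(direct)) / n_angles, False
```

To include translations, this function would be called once for every trial centre of the polar resampling, keeping the combination of shift, angle and mirror flag that gives the highest correlation peak.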
For a set of N images containing the same object in various orientations and corrupted by an additive noise, the problem of alignment would be relatively simple. For proteins that have strong preferred orientation and particularly when a staining technique is used for grid preparation, this is certainly the case. In the procedure called reference-based alignment, one of the images that appears `typical' is selected and used as a reference to align the remaining images. After all available images are aligned their average is calculated and used as a reference in a repeated alignment of all images. The process is iterated until the orientations of the images stabilize (Frank et al., 1982).
More formally, Frank et al. (1988) proposed that a set of N images f_{k}, k = 1, …, N, be defined as aligned if a set of transformations T_{k}, k = 1, …, N, (rotation angles and translations) is found such that all pairs of images are mutually brought into register, so the expression
$$L_2 = \sum_{k=1}^{N-1} \sum_{l=k+1}^{N} \int \left[f_k(\mathbf{T}_k \mathbf{x}) - f_l(\mathbf{T}_l \mathbf{x})\right]^2 \mathrm{d}\mathbf{x} \tag{2.5.7.10}$$
is minimized. Although there is no simple way to minimize L_{2}, the interesting observation is that there is no requirement for the images to represent the same particle, or even a similar one. This leads to the conclusion that if the minimum of L_{2} could be found, a set of diverse images could be aligned; moreover, upon alignment similar images would have similar orientation and subsequent classification of such an aligned data set would reveal subsets of similar images.
A practical method of minimizing (2.5.7.10), called reference-free alignment, was proposed by Penczek et al. (1992), who showed that minimization of (2.5.7.10) is equivalent to maximization of

$$\sum_{k=1}^{N}\int f_k(T_k\mathbf{x})\,\bar{f}_k(\mathbf{x})\,\mathrm{d}\mathbf{x}, \qquad (2.5.7.11)$$

where

$$\bar{f}_k(\mathbf{x}) = \frac{1}{N-1}\sum_{l=1,\,l\neq k}^{N} f_l(T_l\mathbf{x})$$

is the partial average of the set of images calculated with the exclusion of the kth image. The method is based on the observation that, given a set of approximately aligned images, it should be possible to minimize $L_2$ by sequentially correcting the alignments of individual images using the cross-correlation function between each image and the average of the remaining ones. On each step, depending on whether the orientation of the image changes or not, (2.5.7.10) will decrease or remain constant.
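The sequential correction against the partial average can be sketched in a deliberately reduced setting. The example below is an illustration under stated assumptions, not the algorithm of Penczek et al. (1992): the "images" are 1D periodic signals, the only transformation is a circular shift, and the names `l2`, `partial_average` and `reference_free_align` are hypothetical. Because circular shifts preserve the norm of each signal, the pairwise criterion equals $(N-1)$ times the difference between the total signal energy and the sum of correlations with the partial averages, which is why maximizing the latter minimizes the former.

```python
import numpy as np

def l2(stack):
    """Pairwise criterion: sum of squared differences over all pairs of signals."""
    n = len(stack)
    return sum(np.sum((stack[k] - stack[l]) ** 2)
               for k in range(n - 1) for l in range(k + 1, n))

def partial_average(stack, k):
    """Average of all signals except the k-th one."""
    return (stack.sum(axis=0) - stack[k]) / (len(stack) - 1)

def reference_free_align(signals, n_iter=5):
    """Sequentially re-align each signal to the average of the remaining ones."""
    stack = np.array(signals, dtype=float)
    for _ in range(n_iter):
        for k in range(len(stack)):
            avg = partial_average(stack, k)
            # ccf[s] = sum_t stack[k][t - s] * avg[t], computed with FFTs
            ccf = np.fft.ifft(np.conj(np.fft.fft(stack[k])) * np.fft.fft(avg)).real
            stack[k] = np.roll(stack[k], int(np.argmax(ccf)))
    return stack

# Hypothetical data: noisy, randomly shifted copies of one two-peak template.
rng = np.random.default_rng(0)
t = np.arange(64)
template = np.exp(-(t - 20) ** 2 / 18.0) + 0.5 * np.exp(-(t - 40) ** 2 / 8.0)
signals = np.array([np.roll(template, int(rng.integers(0, 64)))
                    + 0.1 * rng.normal(size=64) for _ in range(8)])
aligned = reference_free_align(signals)
before, after = l2(signals), l2(aligned)
```

Each update can only increase (or leave unchanged) the correlation of one signal with its partial average, so `after` cannot exceed `before`, mirroring the monotonic behaviour described above.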
The outcome of the reference-free alignment algorithm is an aligned set of N images, so all particles that have similar shapes will have similar orientations. Thus, it is natural (and, because of the alignment, possible) to divide the data set into classes of images that have similar shapes and orientations, i.e., to cluster them. A number of well known clustering algorithms have been adopted for EM applications (Frank, 1990). The general purpose of clustering is to organize objects (in the case of EM, images) into classes whose members are similar to each other, while dissimilar to objects from other classes.
Reference-free alignment with subsequent clustering works well as long as all particles share the same overall shape (i.e., the very low frequency component), as is the case for ribosomes. However, some molecules yield projections that have quite different shapes, as is the case, for example, for the barrel-like protein GroEL (Roseman et al., 1996), with rectangular side views and circular end views, or for the flat and rectangular hemocyanin (Boisset et al., 1995). In this case, reference-free alignment tends to be unstable, as (2.5.7.10) has multiple local minima, which in practice means that the global average of the whole data set can vary significantly depending on the initiation of the procedure. In general, reference-free alignment is an `alignment first, classification second' approach. It is possible to reverse this order by using invariants, with the supporting rationale that once approximately homogeneous classes of images have been found, it should be easy to align them subsequently, as within each class they will share the same motif.
A practical approach to reference-free alignment known as alignment by classification (Dube et al., 1993) is based on the observation that for a very large data set and centred particles one can expect that, although the in-plane rotation is arbitrary, there is a high chance that at least some of the similar images will be in the same rotational orientation. Therefore, in this approach the images are first (approximately) centred, then subjected to classification, and subsequently aligned.
In its simplest form, multi-reference alignment belongs to the class of supervised classification methods: given a set of templates (i.e., reference images; these can be selected unprocessed particle projections, or class averages that resulted from a preceding analysis, or projections of a previously determined EM structure, or projections of an X-ray crystallographic structure), each of the images from the available data set is compared (using a selected discrepancy measure) with all templates and assigned to the class represented by the most similar one. Equally often, multi-reference alignment is understood as a form of unsupervised classification, more precisely K-means classification, even if the description is not formalized in terms of the latter. Given a number of initial 2D templates, the images are compared with all templates and assigned to the most similar one. New templates are calculated by averaging the images assigned to their predecessors and the whole procedure is repeated until a stable solution is reached.
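The K-means reading of multi-reference alignment can be sketched as follows. This is a simplified illustration, not a production procedure: the images are assumed to be already aligned, so the rotational and translational search against each template is omitted, the discrepancy measure is taken to be the squared Euclidean distance, and the function name `multireference_classify` and the test data are hypothetical.

```python
import numpy as np

def multireference_classify(images, templates, n_iter=20):
    """Assign each image to its nearest template, then re-average; repeat until stable."""
    images = np.asarray(images, dtype=float).reshape(len(images), -1)
    templates = np.asarray(templates, dtype=float).reshape(len(templates), -1).copy()
    labels = np.full(len(images), -1)
    for _ in range(n_iter):
        # squared Euclidean distance of every image to every template
        d = ((images[:, None, :] - templates[None, :, :]) ** 2).sum(axis=2)
        new_labels = d.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                                    # stable solution reached
        labels = new_labels
        for c in range(len(templates)):
            if np.any(labels == c):                  # keep old template if class is empty
                templates[c] = images[labels == c].mean(axis=0)
    return labels, templates

# Hypothetical data: two distinct 8 x 8 patterns, ten noisy copies of each.
rng = np.random.default_rng(0)
a, b = np.zeros((8, 8)), np.zeros((8, 8))
a[:, :4], b[:, 4:] = 1.0, 1.0
images = np.array([p + 0.1 * rng.normal(size=(8, 8)) for p in [a] * 10 + [b] * 10])
labels, templates = multireference_classify(images, [images[0], images[10]])
```

In a full multi-reference alignment the distance calculation would be replaced by an alignment of each image to each template followed by evaluation of the discrepancy at the best alignment.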
The 2D analysis of projection images provides insight into the behaviour of the protein on the grid in terms of the structural consistency and the number and shape of projection images. In order to obtain 3D information, it is necessary to find geometrical relations between different observed 2D images. The most robust and historically the earliest approach is based on tilt experiments. By tilting the stage in the microscope and acquiring additional pictures of the same area of the grid it is possible to collect projection images of the same molecule with some of the required Eulerian angles determined accurately by the setting of the goniometer of the microscope.
In random conical tilt (RCT) reconstruction (Radermacher et al., 1987), two micrographs of the same specimen area are collected: the first one is recorded at a tilt angle of ~50° while the second one is recorded at 0° (Fig. 2.5.7.3). If particles have a preferred orientation on the support carbon film (or within the amorphous ice layer, if no carbon support is used), the projections of particles in the tilted micrographs form a conical tilt series. Since the in-plane rotations of particles are random, the azimuthal angles of the projections of tilted particles are also randomly distributed; hence the name of the method. The untilted image is required for two reasons: (i) the particle projections from the untilted image are classified, thus a subset corresponding to possibly identical images can be selected, ensuring that the projections originated from similar and similarly oriented structures; and (ii) the in-plane rotation angle found during alignment corresponds to the azimuthal angle in three dimensions (one of the three Eulerian angles needed). The second Eulerian angle, the tilt, is either taken from the microscope setting of the goniometer or calculated based on geometrical relations between tilted and untilted micrographs. The third Eulerian angle corresponds to the angle of the tilt axis of the microscope stage and is also calculated using the geometrical relations between the two micrographs. In addition, it is necessary to centre the particle projections selected from tilted micrographs; although various correlation-based schemes have been proposed, the problem is difficult as the tilt data tend to be very noisy and have very low contrast.
Given three Eulerian angles and centred tilted projections, a 3D reconstruction is calculated. There are numerous advantages of the RCT method. (i) Assuming the sign of the tilt angle is read correctly (it can be confirmed by analysing the defocus gradient in the tilted micrographs), the method yields the correct hand of the structure. (ii) With the exception of the in-plane rotation of untilted projections, which can be found relatively easily using alignment procedures, the remaining parameters are determined by the experimental settings. Even if they are not extremely accurate, the possibility of a gross error is eliminated, which positively distinguishes the method from the ab initio computational approaches that use only untilted data. (iii) The computational analysis is entirely done using the untilted data, which have high contrast. (iv) The RCT method is often the only method of obtaining 3D information if the molecule has a strongly preferential orientation and only one view is observed in untilted micrographs. The main disadvantage is that the conical projection series leaves a significant portion of the Fourier space undetermined. This follows from the central section theorem [equation (2.5.6.8) of Section 2.5.6]: as the tilt angle is less than 90°, the undetermined region can be thought to form a cone in three dimensions and is referred to as the missing cone. The problem can be overcome if the molecule has more than one preferred orientation. Subsets of particles that have similar untilted appearance (as determined by clustering) are processed independently and for each a separate 3D structure is calculated.
If the preferred orientations are sufficiently different, i.e., the orientations of the original particles in three dimensions are sufficiently different in terms of their angles with respect to the z axis, the 3D structures can be aligned and merged, all but eliminating the problem of the missing cone and yielding a robust, if resolution-limited, initial model of the molecule (Penczek et al., 1994). It should be noted that RCT by itself almost never results in a high-resolution 3D model of the molecule. This is due to a variety of reasons, the main ones being the already mentioned poor quality of high-tilt data and difficulties with the collection of large numbers of high-quality tilted micrographs (they are often marred by drift).
In cases when the molecule does not have well defined preferred orientations, it is possible to use electron tomography to obtain the initial model. In this method, a single-axis tilt series of projection images of the same specimen area is collected using an angular step of ~2° and a maximum tilt angle not exceeding 60° (Crowther, DeRosier & Klug, 1970). The single-axis tilt data-collection geometry yields worse coverage of the Fourier space than the RCT method, leaving missing wedges uncovered (Penczek & Frank, 2006). This results in severe artifacts in real space, which make smaller objects virtually unrecognizable. The situation can be largely rectified using so-called double-axis tomography, in which a second single-axis tilt series of data is collected after rotating the specimen grid in-plane by 90° (Penczek et al., 1995). This reduces the undetermined region to a missing pyramid and makes the resolution almost isotropic in the xy plane.
The tomographic projection data have to be aligned. This is done either by using correlation techniques that enforce pairwise alignment of images (Frank & McEwen, 1992) or by taking advantage of fiducial markers and enforcing their consistency with respect to a 3D model (Lawrence, 1992; Penczek et al., 1995). In the application to single-particle work, it is possible to use the locations of the protein in the micrographs as markers. After the 3D reconstruction is calculated, regions containing individual molecules are windowed from the volume and all molecules are aligned in three dimensions (Walz et al., 1997). While generally robust, the procedure is labour- and computer-intensive. Unlike RCT, where only two exposures of the same field are required, electron tomography may require over one hundred images, raising serious concerns about radiation damage. Moreover, most of the data have to be collected at high tilt angles, and thus are of lower quality. Particularly troublesome is the alignment of 3D molecules deteriorated by the missing wedge/pyramid artifacts, with the directions of the artifacts different for each object. However, when successful, electron tomography yields a very good initial model of the molecule, free from artifacts related to missing Fourier space and with defined handedness.
The experiment-based methods of initial 3D structure determination (RCT and electron tomography) are quite powerful, but rather challenging to employ in practice. Particularly frustrating is the fact that a large volume of difficult-to-record tilt data has to be collected, even though these data cannot be used for subsequent high-resolution work. Therefore, whenever possible, preference is given to computational methods in which the 3D geometrical relations between particle projections are established by mathematical approaches that use only untilted data.
The most straightforward approach, and historically the earliest, is based on the central section theorem [equation (2.5.6.8) of Section 2.5.6]: because the Fourier transforms of 2D projections of a 3D object are central sections of the 3D Fourier transform of this object, it follows that the Fourier transforms of any two projections intersect along a line, henceforth called a common line. (Two trivial exceptions are the case of projections in the same direction, in which their Fourier transforms coincide with possible differences in in-plane rotation, and the case of projections in opposite directions, in which they are mirror versions of each other.) This fact was originally used by Crowther, DeRosier & Klug (1970) to solve the structure of viruses with icosahedral (60-fold) symmetry. In this case, the Fourier transform of each projection intersects itself (or rather the symmetry-related copies of itself) 37 times, with the exception of the degenerate cases of projections in the direction of one of the three symmetry axes, for which the number of common lines is smaller. Thus, it is possible to find the orientation of a single projection with respect to the chosen system of symmetry axes.
For asymmetric objects, a set of three projections that do not intersect along the same line (which would correspond to the single-axis tilt geometry) uniquely determines their respective orientations (with the exception of the overall rotation, which remains arbitrary, and the handedness of the solution, which remains undetermined). Indeed, three projections span three common lines, and each common line yields two angles: for each of the intersecting sections it is the angle between the x axis in the system of coordinates of this section and the common line in the plane of this section. Thus, we have a total of six angles. At the same time, by arbitrarily setting the orientation of the first projection in 3D space to three Eulerian angles equal to zero (or the corresponding rotation matrix $\mathbf{R}_1 = \mathbf{I}$), we need to determine two rotation matrices $\mathbf{R}_2$ and $\mathbf{R}_3$ (or two sets of three Eulerian angles) for the remaining two projections, respectively. So, given six in-plane angles we have to find a solution for six Eulerian angles. Let the angle of the common line between the ith and jth projections in the plane of the ith projection be $\alpha_{ij}$ and the corresponding unit vector in the plane of the ith projection be

$$\mathbf{n}_{ij} = [\cos\alpha_{ij},\ \sin\alpha_{ij},\ 0]^{T},$$

where we added the third coordinate for convenience. The orientations of the unit vectors $\mathbf{n}_{ij}$ in 3D space have to be related by rotation matrices; for example, vector $\mathbf{n}_{21}$ (the direction of the common line between the first and second projections in the plane of the second projection) should coincide with vector $\mathbf{n}_{12}$ (the direction of the same common line, but in the plane of the first projection) upon rotation by the (unknown) matrix $\mathbf{R}_2$. All possible relations are

$$\mathbf{n}_{12} = \mathbf{R}_2\mathbf{n}_{21},\qquad \mathbf{n}_{13} = \mathbf{R}_3\mathbf{n}_{31},\qquad \mathbf{R}_2\mathbf{n}_{23} = \mathbf{R}_3\mathbf{n}_{32}. \qquad (2.5.7.13)$$

Equations (2.5.7.13) have two solutions corresponding to two different hands of the molecule [for details see Farrow & Ottensmeyer (1992)].
The common-lines method works very well in the absence of noise. However, even a modest amount of noise can yield quite erroneous results or no results at all. The reason is that the solution of (2.5.7.13) is highly nonlinear with respect to the six given angles, and small errors in the location of peaks in cross-correlation functions can lead to quite large discrepancies from the correct solution. The main difficulty with the application of the common-lines method is that the analytical solution in the form of (2.5.7.13) exists only for three projections (Goncharov et al., 1987).
As a working approach to ab initio structure determination, the common-lines method has been implemented under the name of angular reconstitution in IMAGIC (van Heel, 1987a; van Heel et al., 1996). In order to reduce sensitivity to noise, the method is applied not to individual projection images, but to class averages resulting from the multi-reference alignment of the input data (van Heel et al., 2000). In order to overcome the lack of an analytical solution for more than three projections, the user has to begin by selecting three judiciously chosen class averages, obtain the solution using (2.5.7.13) and subsequently include (`angle') additional class averages using a brute-force approach, in which the Eulerian angles of the new projection are calculated using a similarity measure based on common lines, with the already-angled set serving as a reference.
Some of the disadvantages of angular reconstitution were addressed in the common-lines-based method for determining the orientations of N particle projections simultaneously (Penczek et al., 1996). In this method, the problem is formulated in terms of minimization of the variance of the 3D structure, as expressed in terms of the common-lines discrepancy between N projections. In a sense, the design of the method is the exact opposite of the `standard' common-lines approach: instead of trying to determine the Eulerian angles (rotation matrices $\mathbf{R}_i$) based on the angles of common lines in the planes of the projections, one assumes that the rotation matrices $\mathbf{R}_i$ are known, finds the set of angles of common lines and computes the overall discrepancy along these lines. For a pair of projections i and j, the in-plane angles of common lines are found by solving the system of equations

$$\mathbf{R}_i\mathbf{n}_{ij} = \mathbf{R}_j\mathbf{n}_{ji} \qquad (2.5.7.14)$$

for $\alpha_{ij}$ and $\alpha_{ji}$. The discrepancy minimized in the method is the variance of the 3D structure, which is defined by analogy to the 2D case [(2.5.7.10) and (2.5.7.11)] in equation (2.5.7.15); there, the partial average $\bar{F}_k$ is written in Fourier 3D polar coordinates and $F_k$ is the Fourier transform of the kth projection in 2D polar coordinates with the orientation in 3D Fourier space given by the rotation matrix $\mathbf{R}_k$. All Fourier planes $F_k$ are considered to have zero thickness, so all discrepancies are calculated only along common lines and the `partial average' is in fact an arrangement of N − 1 Fourier planes in 3D space. An approximation to the integration weights is calculated by equating their values to the areas of the Voronoi diagram cells constructed on a unit sphere for the points of intersection of common lines with this unit sphere (see Section 2.5.6.6). Generally the method performs very well, particularly if the projection images cover 3D angular space evenly.
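The step of finding the in-plane angles of a common line from known rotation matrices can be sketched geometrically. The sketch below rests on assumptions that are not fixed by the text: rotation matrices map coordinates of the projection frame into the molecule frame (so that the common line satisfies $\mathbf{R}_i\mathbf{n}_{ij} = \mathbf{R}_j\mathbf{n}_{ji}$), the Eulerian angles follow a ZYZ convention, and the function names `euler_zyz` and `common_line_angles` are hypothetical. The common line is perpendicular to both projection directions, so it is obtained from their cross product.

```python
import numpy as np

def euler_zyz(phi, theta, psi):
    """Rotation matrix from ZYZ Eulerian angles in radians (assumed convention)."""
    def rz(a):
        c, s = np.cos(a), np.sin(a)
        return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    def ry(a):
        c, s = np.cos(a), np.sin(a)
        return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return rz(phi) @ ry(theta) @ rz(psi)

def common_line_angles(Ri, Rj):
    """In-plane angles (alpha_ij, alpha_ji) of the common line of two central sections."""
    # Projection directions are the rotated z axes; the common line is normal to both.
    c = np.cross(Ri[:, 2], Rj[:, 2])
    c = c / np.linalg.norm(c)            # undefined for parallel or opposite directions
    nij, nji = Ri.T @ c, Rj.T @ c        # the same line expressed in each section's frame
    return np.arctan2(nij[1], nij[0]), np.arctan2(nji[1], nji[0])

R1 = euler_zyz(0.0, 0.0, 0.0)            # first projection fixed, R1 = I
R2 = euler_zyz(0.4, 1.1, 2.0)
a12, a21 = common_line_angles(R1, R2)
n12 = np.array([np.cos(a12), np.sin(a12), 0.0])
n21 = np.array([np.cos(a21), np.sin(a21), 0.0])
```

With these assumptions `R1 @ n12` and `R2 @ n21` describe the same direction in 3D. The sign of the cross product is arbitrary, which reflects the fact that a common line is defined only up to a rotation by 180° within each section.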
Some macromolecules, particularly those that have an elongated barrel-like shape, will have a strongly preferential orientation with respect to the direction of the electron beam, showing only what are often called `side views', i.e., projections perpendicular to rotation along one axis corresponding to single-axis tilt data-collection geometry. These ortho-axial projections form a single-axis tilt reconstruction geometry. In this case, the Fourier transforms of all projections share only one common line, the line coinciding with the rotation axis, and clearly the common-lines-based method is not applicable to the ab initio structure determination. To cope with this situation, a method termed Sidewinder was developed (Pullan et al., 2006). It is based on the observation that the Fourier transform of a finite object with a diameter D can be considered to have a nonzero thickness 1/D (Fig. 2.5.6.5). Thus, if the angle between two central sections of the 3D Fourier transform of this object, as derived from 2D Fourier transforms of its projections, is not too large, then these two sections will share information in Fourier space that is proportional to the amount of overlap of the two `slabs' in Fourier space. Using this observation, the general idea employed in Sidewinder is to calculate pairwise cross-correlation coefficients (CCCs) between class averages of side views and to use this information to deduce the values of the azimuthal Eulerian angles using the Monte Carlo minimization method (Fishman, 1995).
For structures that have reasonably high symmetry and for those for which it is possible to collect high-quality EM data, it is sometimes possible to determine the initial structure using the 3D projection alignment method, which will be described in the next section. However, the approach is extremely computationally intensive and it is virtually impossible to try the method repeatedly to verify that the approach converges to more-or-less the same 3D structure, as is recommended for the other ab initio methods described in this section. When the method is successful it is quite powerful, as an intermediate-resolution structure can be obtained without going through intermediate and quite laborious steps of analysis of the data. A word of caution is warranted: with the direct method, unless there is external evidence that the obtained structure is correct, it is possible to obtain a self-consistent but entirely incorrect model of the molecule.
In the absence of reliable objective measures of the correctness of the structure, one can apply common sense in order to spot definitely improbable 3D maps. Given the mass of the complex it is possible to calculate the corresponding volume, and thus the threshold at which the map should be examined (Section 2.5.7.11). If at this threshold the mass density is discontinuous or there are pieces of mass surrounding the structure, the map is most likely to be incorrect. Similarly, strong directional artifacts appearing as streaks permeating the structure indicate that either the collected projection images are dominated by one or two views of the structure or that the angular assignment is incorrect. In addition, the 3D map should be centred in the window box; although the centring is not, strictly speaking, a mathematical requirement for a successful reconstruction, all single-particle structure-determination software packages take advantage of the fact that for centred objects orientation searches are easier to perform. So, if the map is not centred it is a clear indication of the failure of the procedure. Finally, for symmetric structures there should be no large pieces of mass on the symmetry axes.
Given an initial low-resolution model of the 3D structure and a data set of 2D projection images of the complex that have Fourier-space information extending beyond the resolution of the model, it is possible to refine the structure such that the full extent of the resolution information in the data will be utilized. In some cases, it is also possible to use the structure of a homologous protein as the initial structure in the refinement procedure, thus avoiding the process of ab initio structure determination altogether. The goal of the refinement is to find, for each of the particle projections, the orientation parameters for which (2.5.7.6) is minimized. There exist various implementations of the structure-refinement strategy and they can be roughly divided into those that perform exhaustive searches for all five orientation parameters (two translations and three Eulerian angles per 2D projection image) and those that perform local searches, usually by employing gradient information. Finally, the strategies may differ in how the correction for the CTF is implemented.
The original 3D projection-matching strategy (Penczek et al., 1994) is based on the observation that, given an ideal structure f and the necessary parameters of the CTF and image-formation model, it is straightforward to find five orientation parameters for each projection image. One begins with the determination of the sufficient angular step: assuming the structure is properly sampled at the Nyquist frequency and has a real-space radius of r voxels, the angular step is given by

$$\Delta\theta = \arctan(1/r). \qquad (2.5.7.16)$$

Next, keeping in mind that projection directions are parametrized by two Eulerian angles $(\varphi, \theta)$, one generates a set of projection directions quasi-uniformly distributed over half a unit sphere (or, in the case of a symmetric structure, over an asymmetric subunit) by taking fixed steps along the altitude or tilt angle $\theta$ and a number of samples azimuthally in proportion to $|\sin\theta|$ (Penczek et al., 1994). So, for a chosen constant increment $\Delta\theta$ and a given angle $\theta$, the increment of the angle $\varphi$ varies according to

$$\Delta\varphi = \Delta\theta/|\sin\theta|. \qquad (2.5.7.17)$$

If all three Eulerian angles are to be sampled, as is necessary in some applications, then $\psi$ is sampled uniformly in steps of $\Delta\theta$.
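The angular sampling just described can be sketched as follows, assuming that the angular step is arctan(1/r) and that the azimuthal increment is the tilt increment divided by |sin θ|. The function names `angular_step` and `projection_directions` are hypothetical, and rounding the number of azimuthal samples so that each ring closes exactly is a choice made for this illustration.

```python
import numpy as np

def angular_step(radius_voxels):
    """Angular step in degrees for a structure of the given real-space radius."""
    return np.degrees(np.arctan(1.0 / radius_voxels))

def projection_directions(delta_deg):
    """Quasi-uniform (phi, theta) pairs in degrees over half a unit sphere."""
    dirs = []
    for theta in np.arange(0.0, 90.0 + 1e-9, delta_deg):
        if theta == 0.0:
            dirs.append((0.0, 0.0))                  # a single direction at the pole
            continue
        dphi = delta_deg / abs(np.sin(np.radians(theta)))
        n_phi = max(1, int(round(360.0 / dphi)))     # samples on this ring
        dirs.extend((360.0 * i / n_phi, theta) for i in range(n_phi))
    return dirs

step = angular_step(32)              # about 1.8 degrees for a radius of 32 voxels
dirs = projection_directions(15.0)   # coarse grid for illustration
```

The number of directions per ring grows from one at the pole to 360/Δθ at the equator, which keeps the spacing between neighbouring directions approximately constant.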
In order to find the orientation parameters of projection images, one step of projection matching is performed. The reference structure is projected in all directions given by (2.5.7.17), yielding a set of reference images. Next, for each projection image, 2D cross-correlation functions with all reference images are calculated using one of the methods described in Section 2.5.7.6, and the overall maximum yields the translation, the in-plane rotation angle, the number of the most similar reference image (thus the remaining two Eulerian angles) and information about whether the image should be mirrored. Given this, a new 3D structure can be calculated using a 3D reconstruction algorithm (see Section 2.5.6). This simple protocol constitutes the core of 3D projection alignment (Fig. 2.5.7.4).
In a simple implementation of the 3D projection-matching procedure, all projection data are assembled into defocus groups, i.e., groups of projection images that have similar defocus settings (Frank et al., 2000). During refinement, for each defocus group the reference volume is multiplied by the CTF with the appropriate defocus value, one step of projection matching is performed and a refined structure is reconstructed for this group (Fig. 2.5.7.5). In addition, the within-group resolution is estimated using the Fourier shell correlation (FSC) approach (2.5.7.19) applied to two volumes calculated from two subsets of projection images randomly split into halves. After all defocus groups have been processed, the individual refined volumes are merged in Fourier space with a CTF correction using Wiener-filter methodology (Penczek et al., 1997),

$$F(\mathbf{u}) = \frac{\sum_{k}\mathrm{SSNR}_k(u)\,\mathrm{CTF}_k(u)\,F_k(\mathbf{u})}{\sum_{k}\mathrm{SSNR}_k(u)\,\mathrm{CTF}_k^2(u) + 1}, \qquad (2.5.7.18)$$

where $F_k$ is the Fourier transform of the volume refined for the kth defocus group and $\mathrm{SSNR}_k$ is the spectral signal-to-noise ratio estimated for each defocus group using (2.5.7.22). Subsequently, the resolution of the merged volume is estimated by merging the half-volumes into two half-merged volumes using (2.5.7.18) and comparing them using (2.5.7.19). Next, the merged volume is filtered using (2.5.7.25) and the structure is centred so that its centre of mass is placed at the centre of the volume in which it is embedded.
Figure 2.5.7.5. Schematic of 3D projection alignment with CTF correction performed on the level of 3D maps reconstructed from projection images sorted into groups that share similar defocus settings.
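The merging of defocus groups can be sketched numerically. The example assumes that the Wiener-type merge has the commonly used form in which the numerator is the sum over groups of SSNR × CTF × Fourier transform of the group volume and the denominator is the sum of SSNR × CTF² plus one; the function name `merge_defocus_groups`, the 1D frequency axis and the toy CTFs are hypothetical and serve only to check the algebra.

```python
import numpy as np

def merge_defocus_groups(F_groups, ctfs, ssnrs):
    """Wiener-type merge of per-group Fourier data (first axis = defocus group)."""
    F_groups, ctfs, ssnrs = (np.asarray(a) for a in (F_groups, ctfs, ssnrs))
    numerator = (ssnrs * ctfs * F_groups).sum(axis=0)
    denominator = (ssnrs * ctfs ** 2).sum(axis=0) + 1.0
    return numerator / denominator

# Toy example on a 1D frequency axis: three groups with different oscillating CTFs.
u = np.linspace(0.0, 0.5, 50)
F_true = np.exp(-10.0 * u) * (1.0 + 0.5j)
ctfs = np.array([np.sin(2 * np.pi * (k + 2) * u ** 2 * 10 + 0.1) for k in range(3)])
ssnrs = np.full_like(ctfs, 5.0)
F_groups = ctfs * F_true                 # noise-free data affected by the CTF only
merged = merge_defocus_groups(F_groups, ctfs, ssnrs)
S = (ssnrs * ctfs ** 2).sum(axis=0)      # combined SSNR after merging
```

For noise-free input the merged transform equals the true one multiplied by S/(S + 1), so frequencies where all CTFs are close to zero are damped rather than amplified.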
The 3D projection-matching approach works very well during the initial stages of the refinement, as it constitutes a very efficient approach to an exhaustive search for the orientation parameters of all projection data. Once the orientation parameters are known to a degree of accuracy, it is straightforward to modify the procedure such that only subsets of reference projections are generated at a time and projection images are compared only with reference projections within a specified angular distance from their angular direction established during the previous iteration. This modification speeds up the procedure significantly and makes it possible to refine structures to very high resolution by using a very small angular step $\Delta\theta$. Another possible modification is to introduce an additional step of 2D alignment of the projection data that share the same angular direction (Ludtke et al., 1999). The advantage is that this can correct possible errors of alignment to the projection of a limited-resolution reference structure and also, to an extent, reduces the danger of bias from artifacts in the reference structure. Finally, it is also possible to incorporate into the refinement strategy a correction for the envelope function of the microscope (Ludtke et al., 1999). The 3D projection-matching strategy is widely popular and most EM software packages have implementations of various versions of the basic strategies outlined above (Frank et al., 1996; Ludtke et al., 1999; Hohn et al., 2007).
A possible improvement over the 3D projection-matching procedure can be achieved by working in transformed spaces in which the distinction between orientation search and 3D reconstruction is removed: (1) spherical harmonics (Provencher & Vogel, 1988; Vogel & Provencher, 1988), which have found applications exclusively in the determination of icosahedral structures (Yin et al., 2001, 2003); (2) the Radon transform (Radermacher, 1994), with selected applications in the determination of asymmetric particles (Ruiz et al., 2003); or (3) the Fourier transform, implemented in the FREALIGN package (Grigorieff, 2007). In FREALIGN, the transformation between the arbitrarily oriented Fourier 2D central section and the 3D Fourier Euclidean grid is implemented using trilinear interpolation that includes an ad hoc correction for the CTF effects. In high-resolution structure-refinement mode, the program uses a gradient-based Powell optimization algorithm (Powell, 1973), thus overcoming the main deficiency of 3D projection-matching algorithms.
A unified approach to direct minimization of (2.5.7.6) was proposed by Yang et al. (2005) and is implemented in the SPARX package as the YNP method (Hohn et al., 2007). The premise of the YNP method is that the orientation parameters are approximately known (thus the initial 3D map) and both the orientation parameters and the density map are updated simultaneously in a gradient-based optimization scheme. In the YNP method, the derivatives with respect to the density distribution are calculated analytically and the derivatives with respect to the orientation parameters are calculated using finite-difference approximations. The YNP method is very efficient and its major advantage is that it avoids many problems associated with approximate solutions inherent in methods that work in transform spaces. The projection/backprojection operations are carried out rapidly using linear interpolation, which, owing to sufficient oversampling of the data, does not have a significant adverse impact on the solution. Moreover, because the density map f is updated simultaneously with the orientation parameters, the computationally demanding separate step of 3D reconstruction is eliminated.
The development of resolution measures in EM was greatly influenced by earlier work in X-ray crystallography. In EM, the problem is somewhat more difficult as, unlike in crystallography, both the amplitude and the phase information in the data are affected by alignment procedures (which we consider distant analogues of phase-extension methods in crystallography). Therefore, resolution measures in EM reflect the self-consistency of the results; however, as the data are subject to alignment, there is a significant risk of introducing artifacts resulting from the alignment of the noise component in the data. Ultimately, these artifacts will unduly `improve' the resolution of the map.
The resolution measures used in EM fall into two categories: measures based on averaging of the Fourier transforms of individual images and measures based on comparisons of averages calculated for subsets of the data. In the first group, we have the Q-factor (van Heel & Hollenberg, 1980; Kessel et al., 1985) and the spectral signal-to-noise ratio (SSNR), introduced for the 2D case by Unser and co-workers (Unser et al., 1987) and, for the 3D case, for a class of reconstruction algorithms that are based on direct Fourier inversion, by Penczek (2002). The second group of measures includes the differential phase residual (DPR) (Frank et al., 1981) and the Fourier ring correlation (FRC) (Saxton & Baumeister, 1982). A marked advantage of the measures in the second group is that they are equally well applicable to 2D or 3D data. In the latter case, the volumes resulting from 3D reconstruction algorithms take the place of the 2D averages.
The resolution measures used in single-particle reconstruction are designed to evaluate the SSNR in the reconstruction as a function of spatial frequency (Penczek, 2002). The `resolution' of the reconstruction is reported as a spatial frequency limit beyond which the SSNR drops below a selected level, for example below one.
The FSC is evaluated by taking advantage of the large number of single-particle images: the total data set is randomly split into halves; for each subset a 3D reconstruction is calculated (in two dimensions, a simple average); and the two maps f and g are compared in Fourier space,

$$\mathrm{FSC}(f,g;u) = \frac{\sum_{\left||\mathbf{u}_n|-u\right|\le\varepsilon} F(\mathbf{u}_n)\,G^{*}(\mathbf{u}_n)}{\left\{\sum_{\left||\mathbf{u}_n|-u\right|\le\varepsilon}|F(\mathbf{u}_n)|^2\ \sum_{\left||\mathbf{u}_n|-u\right|\le\varepsilon}|G(\mathbf{u}_n)|^2\right\}^{1/2}}, \qquad (2.5.7.19)$$

where F and G are the Fourier transforms of f and g, respectively. In (2.5.7.19), $2\varepsilon$ is a preselected ring/shell thickness, the $\mathbf{u}_n$ form a uniform grid in Fourier space, $u = |\mathbf{u}_n|$ is the magnitude of the spatial frequency and the sums extend over the Fourier voxels in the shell corresponding to frequency u. The FSC yields a 1D curve of correlation coefficients as a function of u. Note that the FSC is insensitive to linear transformations of the densities of the objects. An FSC curve everywhere close to one reflects strong similarity between f and g; an FSC curve with values close to zero indicates the lack of similarity between f and g. Particularly convenient for the interpretation of the results in terms of `resolution' is the relation between the FSC and the SSNR, which is easily derived by taking the expectation of (2.5.7.19) under the assumption that both f and g are sums of the same signal and different realizations of the noise, which are uncorrelated with the signal and with each other (Saxton, 1978):

$$E[\mathrm{FSC}] \simeq \frac{\mathrm{SSNR}}{\mathrm{SSNR}+1}. \qquad (2.5.7.20)$$

By solving (2.5.7.20) for SSNR we obtain

$$\mathrm{SSNR} = \frac{\mathrm{FSC}}{1-\mathrm{FSC}}, \qquad (2.5.7.21)$$

which, taking into account that the FSC was calculated from the data set split into halves, has to be modified to (Unser et al., 1987)

$$\mathrm{SSNR} = \frac{2\,\mathrm{FSC}}{1-\mathrm{FSC}}. \qquad (2.5.7.22)$$

In order to calculate the FSC that corresponds to a given SSNR, one inverts (2.5.7.22) to

$$\mathrm{FSC} = \frac{\mathrm{SSNR}}{\mathrm{SSNR}+2}. \qquad (2.5.7.23)$$

Equations (2.5.7.21) and (2.5.7.22) serve as a basis for various `resolution criteria' used in EM. The often-used 3σ criterion (van Heel, 1987b) equates resolution with the point at which the FSC is larger than zero at a 3σ level, where σ is the expected standard deviation of an FSC that has an expected value of zero, in essence finding a frequency for which the SSNR is significantly larger than zero.
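A minimal NumPy sketch of the FSC calculation is given below. It assumes cubic (or, for the FRC, square) arrays of equal size, shells one Fourier voxel thick obtained by rounding the frequency magnitude to the nearest integer, and frequencies restricted to the range up to Nyquist; the function name `fsc` and the random test volumes are hypothetical.

```python
import numpy as np

def fsc(f, g):
    """Fourier shell (ring) correlation of two equally sized cubic (square) arrays."""
    n = f.shape[0]
    F, G = np.fft.fftn(f), np.fft.fftn(g)
    k = np.fft.fftfreq(n) * n                        # integer frequency indices
    grids = np.meshgrid(*([k] * f.ndim), indexing="ij")
    shell = np.rint(np.sqrt(sum(q ** 2 for q in grids))).astype(int).ravel()
    n_shells = n // 2 + 1                            # shells up to Nyquist
    # For real input the shell sum of F G* is real, so only the real part is kept.
    cross = np.bincount(shell, weights=(F * np.conj(G)).real.ravel())[:n_shells]
    pf = np.bincount(shell, weights=(np.abs(F) ** 2).ravel())[:n_shells]
    pg = np.bincount(shell, weights=(np.abs(G) ** 2).ravel())[:n_shells]
    return cross / np.sqrt(pf * pg)

# Hypothetical half-maps: the same signal with independent noise realizations.
rng = np.random.default_rng(1)
signal = rng.normal(size=(16, 16, 16))
f = signal + 0.5 * rng.normal(size=signal.shape)
g = signal + 0.5 * rng.normal(size=signal.shape)
curve = fsc(f, g)
```

The same function applied to 2D arrays yields the FRC; in practice the two inputs would be reconstructions (or averages) calculated from randomly selected halves of the data set.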
The 3σ criterion has the distinct disadvantage of reporting the resolution at a frequency at which there is no significant signal, while tempting the user to interpret the detail in the map at this resolution. Moreover, as the FSC approaches zero, its relative error increases, so the curve oscillates widely around the zero level, increasing the chance of selecting an incorrect resolution point. In other criteria one tries to equate the resolution with the frequency at which noise begins to dominate the signal. A good choice of the cutoff level is SSNR = 1.0, a level at which the power of the signal in the reconstruction is equal to the power of the noise. According to (2.5.7.22), this corresponds to FSC = 0.333. Another often-used cutoff level is FSC = 0.5, at which the SSNR in the reconstruction is 2.0 (Böttcher et al., 1997; Conway et al., 1997; Penczek, 1998).
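Reading a resolution off an FSC curve at a chosen cutoff can be illustrated as follows. The sketch assumes the curve is sampled at frequencies given in Å^{−1}, takes the first downward crossing of the cutoff, and interpolates linearly between the two bracketing samples; the function name and the interpolation choice are illustrative assumptions.

```python
import numpy as np


def resolution_at_cutoff(freq, fsc, cutoff):
    """Resolution (1/frequency) at the first downward crossing of `cutoff`, or None."""
    freq = np.asarray(freq, dtype=float)
    fsc = np.asarray(fsc, dtype=float)
    below = np.nonzero(fsc < cutoff)[0]
    if below.size == 0 or below[0] == 0:
        return None  # the curve never crosses, or already starts below the cutoff
    i = below[0]
    # Linear interpolation between the last sample above and the first sample below.
    t = (fsc[i - 1] - cutoff) / (fsc[i - 1] - fsc[i])
    u = freq[i - 1] + t * (freq[i] - freq[i - 1])
    return 1.0 / u if u > 0 else None
```

Calling it with `cutoff=1/3` applies the SSNR = 1 criterion and with `cutoff=0.5` the SSNR = 2 criterion. Using the first crossing sidesteps the oscillations of the curve around zero at higher frequencies.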
The main reason behind the determination of the resolution of the EM maps is the necessary step of low-pass filtration of the results before the interpretation of the map is attempted. In order to avoid mistakes, particularly the danger of overinterpretation, one has to remove from the map unreliable Fourier coefficients. Inclusion of Fourier coefficients with a low SNR will result in the creation of spurious details and artifacts in the map. Thus, the optimal filtration should be based on the SSNR distribution in the map and the solution is given by a Wiener filter:

$$W(u)=\frac{\mathrm{SSNR}(u)}{\mathrm{SSNR}(u)+1}.\qquad(2.5.7.24)$$

Based on the relation of FSC to SSNR (2.5.7.22), we can write (2.5.7.24) as

$$W(u)=\frac{2\,\mathrm{FSC}(u)}{1+\mathrm{FSC}(u)}.\qquad(2.5.7.25)$$

In practice, because of the irregular shape of typical FSC curves (particularly for small values of FSC) it is preferable to approximate the shape of the Wiener filter (2.5.7.25) by one of the standard low-pass filters, such as Butterworth (Gonzalez & Woods, 2002) or hyperbolic tangent (Basokur, 1998).
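As an illustration, the FSC-based Wiener weights of (2.5.7.25) and a smooth low-pass approximation can be written as below. The sketch makes several assumptions that are not in the text: negative FSC values are clipped to zero, the Butterworth profile uses the form 1/[1 + (u/u_c)^{2n}], frequencies are in cycles per pixel, and all function names are hypothetical.

```python
import numpy as np


def wiener_from_fsc(fsc):
    """Wiener weights W = 2 FSC / (1 + FSC); negative FSC is clipped to zero (assumption)."""
    fsc = np.clip(np.asarray(fsc, dtype=float), 0.0, 1.0)
    return 2.0 * fsc / (1.0 + fsc)


def butterworth_lowpass(freq, cutoff, order=4):
    """Butterworth-type profile; equals 0.5 at freq == cutoff."""
    freq = np.asarray(freq, dtype=float)
    return 1.0 / (1.0 + (freq / cutoff) ** (2 * order))


def lowpass_filter_map(volume, cutoff, order=4):
    """Apply the radially symmetric low-pass profile to a map in Fourier space."""
    grids = np.meshgrid(*[np.fft.fftfreq(n) for n in volume.shape], indexing="ij")
    radius = np.sqrt(sum(k ** 2 for k in grids))  # cycles per pixel
    filtered = np.fft.fftn(volume) * butterworth_lowpass(radius, cutoff, order)
    return np.fft.ifftn(filtered).real
```

One way to use the pair is to compute `wiener_from_fsc` on the measured curve, pick `cutoff` and `order` so that the Butterworth profile follows its overall fall-off, and filter the map with the smooth profile instead of the noisy weights.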
The FRC/FSC methodology can be used to compare a noise-corrupted map with a noise-free ideal version of the same object. In single-particle reconstruction this situation emerges when an X-ray crystallographic structure of either the entire EM-determined structure or of some of its domains is available (Penczek et al., 1999). In this case, we assume that in (2.5.7.19) f represents a sum of the signal and additive uncorrelated noise and g represents the noise-free signal, so it is straightforward to calculate the expectation of (2.5.7.19) in order to obtain the relation between the cross-resolution (CRC) and the SSNR:

$$E[\mathrm{CRC}(u)]\simeq\left[\frac{\mathrm{SSNR}(u)}{\mathrm{SSNR}(u)+1}\right]^{1/2}.$$

Thus

$$\mathrm{SSNR}(u)=\frac{\mathrm{CRC}^{2}(u)}{1-\mathrm{CRC}^{2}(u)}.$$

Interestingly, for the same SSNR cutoff levels, corresponding values of CRC are higher than those for FSC. For example, for SSNR = 1, CRC = 0.71, while FSC = 0.33. For SSNR = 2, CRC = 0.82, while FSC = 0.5.
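The expectation step can be spelled out for a single shell. The following is a sketch under simplifying assumptions that are not stated in the text: the shell contains K Fourier voxels, the mean signal and noise powers in the shell are written σ_S² and σ_N², and the expectation of the ratio is approximated by the ratio of expectations.

```latex
% f = s + n (signal plus uncorrelated noise), g = s (noise-free signal).
% F = S + N and G = S are the Fourier transforms; sums run over the K voxels of one shell.
\begin{align*}
E\Big[\sum F\,G^{*}\Big] &= \sum |S|^{2} = K\sigma_S^{2}, \\
E\Big[\sum |F|^{2}\Big] &= K\big(\sigma_S^{2}+\sigma_N^{2}\big), \qquad
\sum |G|^{2} = K\sigma_S^{2}, \\
E[\mathrm{CRC}] &\simeq \frac{\sigma_S^{2}}{\big[(\sigma_S^{2}+\sigma_N^{2})\,\sigma_S^{2}\big]^{1/2}}
 = \left(\frac{\mathrm{SSNR}}{\mathrm{SSNR}+1}\right)^{1/2},
 \qquad \mathrm{SSNR}=\frac{\sigma_S^{2}}{\sigma_N^{2}}.
\end{align*}
% Check against the quoted values: SSNR = 1 gives (1/2)^{1/2} = 0.71, SSNR = 2 gives (2/3)^{1/2} = 0.82.
```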
The amount of structural information that can be derived from a structure of a macromolecular complex determined by cryo-EM depends on two factors: the resolution of the map and the availability of additional structural information about the system. Generally, we will refer to complexes at a resolution better than 7 Å as high-resolution structures, as at this resolution the elements of secondary structure become directly visible. Maps at resolution lower than that we will call intermediate resolution, as at this scale of detail one can only determine a general arrangement of subunits. However, it is good to realize that there is a huge difference between the amount and reliability of information derived from a map of the same complex determined at 10 Å as compared to a map determined at 30 Å resolution. Similarly, very large complexes determined at 50 Å resolution will yield more information than very small complexes determined at 15 Å resolution. On the other hand, even intermediate-resolution EM maps provide extremely valuable information if they can be placed in the context of other structural work. The single-particle structure can also be investigated within the context of a more complex system using other, lower-resolution techniques, for example electron tomography. In this case, by using docking approaches one can determine the distribution, orientation and general arrangement of smaller cryo-EM-determined complexes within larger subcellular systems. On a different scale of resolution, it is quite common to have structures of some domains or even of the entire complexes determined to atomic resolution by X-ray crystallography. Again, by using docking techniques it is possible to determine whether the conformation of the EM structure differs from that determined by X-ray crystallography or to map subunits and domains of the larger complex by fitting available atomic resolution structures.
The basic mode of visualization of cryo-EM maps is surface representation. The first step involves the choice of an appropriate threshold level for the displayed surface, particularly when the scaling of the cryo-EM data is arbitrary. A good guide is provided by the total molecular mass of the complex: given a pixel size of $p$ Å, an average protein density $d = 1.36 \times 10^{-24}$ g Å$^{-3}$ and the total molecular mass of the complex $M$ Da, the number of voxels $N_v$ occupied by the complex is

$$N_v=\frac{M}{p^{3}\,d\,N_A},$$

where $N_A$ is Avogadro's number ($6.02 \times 10^{23}$ atoms mole$^{-1}$). Based on that, one can find the threshold that for a given structure encompasses the determined number of voxels $N_v$ [appropriate functions are implemented in SPIDER (Agrawal et al., 1996; Frank et al., 1996) and SPARX (Hohn et al., 2007)]. At a sufficiently high resolution, cryo-EM maps can be analysed in the same manner as X-ray crystallographic maps and using the same graphical/analytical packages (`backbone tracing') (Jones et al., 1991) (Fig. 2.5.7.6).
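The mass-based choice of threshold can be sketched as follows. The constants are those quoted above; the function names are hypothetical and are not the SPIDER or SPARX routines. The sketch assumes that the macromolecule has higher density values than the background.

```python
import numpy as np

AVOGADRO = 6.02e23         # per mole, value quoted in the text
PROTEIN_DENSITY = 1.36e-24  # g per cubic angstrom, value quoted in the text


def voxels_for_mass(mass_da, pixel_size):
    """Number of voxels occupied by a complex of `mass_da` daltons at `pixel_size` angstroms."""
    return mass_da / (AVOGADRO * PROTEIN_DENSITY * pixel_size ** 3)


def threshold_for_voxel_count(volume, n_voxels):
    """Density level such that `n_voxels` voxels are at or above it (ties aside)."""
    values = np.sort(np.asarray(volume, dtype=float).ravel())[::-1]  # descending
    n = int(round(n_voxels))
    n = min(max(n, 1), values.size)
    return values[n - 1]
```

For example, `threshold_for_voxel_count(volume, voxels_for_mass(2.5e6, 2.8))` would give a display level for a 2.5 MDa complex sampled at 2.8 Å per pixel; both numbers here are made up for illustration.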
The complexity of cryo-EM maps of large macromolecular assemblies, combined with their limited resolution, invites attempts to automate some of the steps of analysis in order to make the results more robust and less dependent on the researcher's bias. A good example of semi-automated analysis is the nucleic acid–protein separation in an 11.5 Å cryo-EM map of the 70S E. coli ribosome (Spahn et al., 2000). In the procedure, the (continuous-valued) densities were analysed making use of (i) the difference in scattering density between protein and nucleic acids; (ii) continuity constraints that the image of any nucleic acid molecule must obey; and (iii) knowledge of the molecular volumes of all proteins. As a result, it was possible to reproduce boundary assignments between ribosomal RNA (rRNA) and proteins made from higher-resolution X-ray maps of the ribosomal subunits with a high degree of accuracy, and to make plausible predictions for the placements of proteins and RNA components as yet unassigned. One of the conclusions derived from this separation was that the 23S rRNA is solely responsible for the catalysis of peptide-bond formation; thus, the ribosome is a ribozyme. The same conclusion was reached independently in the studies of the X-ray crystallographic structure of the 70S ribosome (Nissen et al., 2000). The method of Spahn et al. cannot be easily extended to other macromolecules that comprise only protein, and generally it is very difficult to delineate subunits of large macromolecular assemblies at intermediate resolution, automatically or not, in the absence of independent knowledge about their shape. The reason is that both the density and shape of the subunits are affected by the limited resolution differently depending on their spatial context.
In general, subunits that are isolated and located on the surface or protruding from the structure will have relatively lower density while at the same time their overall shape will be better preserved and easier to discern. Subunits located inside the structure and surrounded by other structural elements, while having higher density, are more difficult to recognize, as they fuse with the surrounding mass densities. Therefore, it is difficult to provide a general method that could cope with the problem of automated mass-density analysis.
As most cryo-EM structures are determined at intermediate resolution, the most common mode of analysis is either to compare the map with the available X-ray crystallographic structures of its domains or to consider the result in the context of larger, subcellular structures obtained by electron tomography. In both cases correlation techniques are used extensively to obtain objective results or to validate the results obtained by manual fitting.
In docking of X-ray crystallographic structures into EM maps, the first step is the conversion of atomic coordinates from X-ray molecular models, as given in Protein Data Bank (PDB) files, into an electron-density map in a way that would mimic the physical image formation process. Although sophisticated methods of computational emulation of the image-formation process in the electron microscope are available, very simple approaches to conversion yield quite satisfactory results at the resolution of the EM results. The most common one is to assume that the Coulomb potential of an atom is proportional to its atomic number and add these atomic numbers within a Euclidean grid with a cell size equal to the EM pixel size in Å. The atomic coordinates of atoms are interpolated within the grid using trilinear interpolation. After such conversion, the X-ray map can be handled using the general image-processing tools of a single-particle software package. Initial orientation (or orientations, if the general placement is not immediately visually apparent) of the X-ray map can be easily performed manually within any number of graphical packages, for example Chimera (Pettersen et al., 2004). The initial six orientation parameters (three translations and three Eulerian angles) are next transferred to the EM package (for details see Baldwin & Penczek, 2007) and the manual docking is refined using correlation techniques (Fig. 2.5.7.7). Similarly, the handedness of the EM map can be established or confirmed by performing fitting of the X-ray-determined structure to two EM maps that differ by their hand.
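The simple conversion described above (atomic numbers deposited on a grid with trilinear interpolation) can be sketched as below. Parsing of PDB files is deliberately left out: the coordinates (in Å, relative to the corner of the box) and atomic numbers are assumed to be already available as arrays, and the function name is hypothetical.

```python
import numpy as np


def atoms_to_density(coords, atomic_numbers, box, pixel_size):
    """Deposit atomic numbers on a box^3 grid, sharing each atom among its 8 neighbours."""
    density = np.zeros((box, box, box))
    pos = np.asarray(coords, dtype=float) / pixel_size  # positions in voxel units
    z = np.asarray(atomic_numbers, dtype=float)
    base = np.floor(pos).astype(int)                    # lower corner of the enclosing cell
    frac = pos - base                                   # fractional offset inside the cell
    for corner in np.ndindex(2, 2, 2):
        offset = np.array(corner)
        # Per-axis weight: frac for the far corner, 1 - frac for the near corner.
        w = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=1) * z
        idx = base + offset
        inside = np.all((idx >= 0) & (idx < box), axis=1)  # drop contributions outside the box
        np.add.at(density, tuple(idx[inside].T), w[inside])
    return density
```

`np.add.at` is used instead of fancy-index assignment so that several atoms falling on the same voxel accumulate rather than overwrite each other. The resulting map would normally be low-pass filtered to the resolution of the EM map before fitting.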
Docking of EM maps into the broader cellular context of structures determined by electron tomography can provide information about the distribution of complexes and their interactions within the cell. Conceptually, the approach is very similar to that of particle picking, i.e., template matching, with the main difference being that calculations are performed in three instead of two dimensions. Given a 3D structure of a single-particle EM complex, a set of 3D templates is prepared by rotating the template around its centre of mass using the quasi-uniformly distributed three Eulerian angles [equation (2.5.7.17) with ]. However, in application to tomography the angular step can be relatively large, resulting in a much smaller number of templates than in two dimensions, the reason being the rather low resolution of typical electron tomograms (not exceeding 50 Å). Next, a brute-force 3D cross-correlation search with all templates is performed (Frangakis et al., 2002). After windowing out 3D subvolumes containing putative complexes, subsequent averaging and classification can be performed.
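A brute-force search over a list of pre-rotated templates can be sketched with FFT-based cross-correlation. This is a simplified illustration: the correlation is circular and not locally normalized, the templates are assumed to be supplied already rotated (generation of the quasi-uniform angular grid is not shown), and the function names are hypothetical.

```python
import numpy as np


def cross_correlate(tomogram, template):
    """Circular cross-correlation map; entry p scores the template placed with its corner at p."""
    padded = np.zeros_like(tomogram, dtype=float)
    region = tuple(slice(0, s) for s in template.shape)
    padded[region] = template - template.mean()  # remove the mean so flat regions score zero
    product = np.fft.fftn(tomogram) * np.conj(np.fft.fftn(padded))
    return np.fft.ifftn(product).real


def best_template_match(tomogram, templates):
    """Return (peak value, template index, peak position) over all templates."""
    best = None
    for index, template in enumerate(templates):
        cc = cross_correlate(tomogram, template)
        pos = np.unravel_index(np.argmax(cc), cc.shape)
        if best is None or cc[pos] > best[0]:
            best = (float(cc[pos]), index, tuple(int(p) for p in pos))
    return best
```

In a real search one would keep many peaks per template rather than only the global maximum, and window out the corresponding subvolumes for the averaging and classification step mentioned above.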
Cryo-EM is a unique structural technique in its ability to detect conformational variability of large molecular assemblies within one sample that may contain a mixture of complexes in various conformational states. In addition to the expected conformational heterogeneity of the assemblies, due to fluctuations of the structure around the ground state one can expect to capture molecules in different functional states, especially if the binding of a ligand induces a conformational change in the macromolecular assembly. Therefore, a data set of images from an EM experiment must be interpreted as a mixture of projections from similar but not identical structures. The analysis of the extent of the resulting variability requires the calculation of the real-space distribution of 3D variance/covariance in macromolecules reconstructed from a set of their projections. The problem is difficult, as there is no clear relation between the variance in sets of projections that have the same angular direction and the variance of the 3D structure calculated from these projections. Penczek, Chao et al. (2006) proposed calculating the variance in the 3D mass distribution of the structure using a statistical bootstrap resampling technique, in which a new set of projections is selected with replacement from the available whole set of N projections. In the new set, some of the original projections will appear more than once, while others will be omitted. This selection process is repeated a number of times and for each new set of projections the corresponding 3D volume is calculated. Next, the voxel-by-voxel bootstrap variance of the resulting set of volumes is calculated.
The target variance is obtained using a relationship between the variance of arithmetic means for sampling with replacement and the sample variance,

$$\sigma^{2}=N\sigma_{B}^{2},$$

where $\sigma_{B}^{2}$ is the bootstrap variance and N is the number of projections. The estimated structure-variance map can be used for (i) detection of different functional states (for example, those characterized by binding of a ligand) and subsequent classification of the data set into homogeneous groups (Penczek, Frank & Spahn, 2006a), (ii) analysis of the significance of small details in 3D reconstructions, (iii) analysis of the significance of details in difference maps, and (iv) docking of known structural domains into EM density maps.
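The resampling scheme can be sketched on a toy problem. In this illustration the 3D reconstruction step is replaced, by default, with a simple average of the resampled images (the two-dimensional case); a real application would pass a reconstruction routine as `reconstruct`. The function name and parameters are hypothetical.

```python
import numpy as np


def bootstrap_variance(images, n_bootstrap=200, reconstruct=None, seed=None):
    """Variance map estimated as N times the bootstrap variance of the 'reconstructions'."""
    images = np.asarray(images, dtype=float)
    n = images.shape[0]
    rng = np.random.default_rng(seed)
    if reconstruct is None:
        # Stand-in for a 3D reconstruction: a plain average (assumption for this sketch).
        reconstruct = lambda subset: subset.mean(axis=0)
    volumes = np.stack([
        reconstruct(images[rng.integers(0, n, size=n)])  # N draws with replacement
        for _ in range(n_bootstrap)
    ])
    var_bootstrap = volumes.var(axis=0, ddof=1)  # voxel-by-voxel bootstrap variance
    return n * var_bootstrap, volumes
```

The returned stack of bootstrap volumes is kept because it is also the input for the covariance and eigenvolume analysis; the number of bootstrap repetitions controls the statistical error of the variance map.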
The bootstrap technique also leads to the analysis of conformational modes of macromolecular complexes, because the covariance matrix of the structure can be calculated directly from the bootstrap volumes. The covariance matrix obtained this way would be very large. One possibility is to calculate only correlation coefficients between regions of interest that have large variance (Penczek, Chao et al., 2006). Another possibility is to use the iterative Lanczos technique (Parlett, 1980) and calculate eigenvolumes directly from bootstrap volumes without forming the covariance matrix. These eigenvolumes are related to conformational modes of the molecule, as captured by the projection data of the sample (Penczek, Frank & Spahn, 2006b). This direct relation to the actual cryo-EM projection data positively distinguishes this approach from other techniques in which conformations are postulated based on flexible models of the EM map (Ming et al., 2002; Mitra et al., 2005).
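The idea of obtaining eigenvolumes without forming the covariance matrix can be illustrated with a thin singular value decomposition of the mean-subtracted bootstrap volumes. Note that this sketch swaps in a dense SVD for the iterative Lanczos technique named above; the two yield the same leading eigenvectors of the covariance matrix, but the SVD is practical only when the stack of volumes fits in memory. The function name is hypothetical.

```python
import numpy as np


def eigenvolumes(volumes, n_modes=3):
    """Leading covariance eigenvalues and eigenvolumes of a stack of volumes."""
    vols = np.asarray(volumes, dtype=float)
    n_vols = vols.shape[0]
    flat = vols.reshape(n_vols, -1)       # one row per volume
    centered = flat - flat.mean(axis=0)   # remove the average volume
    # Rows of vt are eigenvectors of centered.T @ centered / (n_vols - 1),
    # obtained without ever building that (voxels x voxels) matrix.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_modes, vt.shape[0])
    eigenvalues = s[:k] ** 2 / (n_vols - 1)
    modes = vt[:k].reshape((k,) + vols.shape[1:])
    return eigenvalues, modes
```

Each returned eigenvolume has unit norm and an arbitrary sign, so comparisons between runs should use the absolute value of the inner product.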
References
Adiga, P. S., Malladi, R., Baxter, W. & Glaeser, R. M. (2004). A binary segmentation approach for boxing ribosome particles in cryo EM micrographs. J. Struct. Biol. 145, 142–151.
Agrawal, R. K., Penczek, P., Grassucci, R. A., Li, Y., Leith, A., Nierhaus, K. H. & Frank, J. (1996). Direct visualization of A-, P-, and E-site transfer RNAs in the Escherichia coli ribosome. Science, 271, 1000–1002.
Baldwin, P. R. & Penczek, P. A. (2007). The transform class in SPARX and EMAN2. J. Struct. Biol. 157, 250–261.
Basokur, A. T. (1998). Digital filter design using the hyperbolic tangent functions. J. Balkan Geophys. Soc. 1, 14–18.
Boisset, N., Penczek, P., Pochon, F., Frank, J. & Lamy, J. (1993). Three-dimensional architecture of human alpha 2-macroglobulin transformed with methylamine. J. Mol. Biol. 232, 522–529.
Boisset, N., Penczek, P., Taveau, J. C., Lamy, J. & Frank, J. (1995). Three-dimensional reconstruction of Androctonus australis hemocyanin labeled with a monoclonal Fab fragment. J. Struct. Biol. 115, 16–29.
Böttcher, B., Wynne, S. A. & Crowther, R. A. (1997). Determination of the fold of the core protein of hepatitis B virus by electron cryomicroscopy. Nature (London), 386, 88–91.
Conway, J. F., Cheng, N., Zlotnick, A., Wingfield, P. T., Stahl, S. J. & Steven, A. C. (1997). Visualization of a 4-helix bundle in the hepatitis B virus capsid by cryo-electron microscopy. Nature (London), 386, 91–94.
Crowther, R. A., DeRosier, D. J. & Klug, A. (1970). The reconstruction of a three-dimensional structure from projections and its application to electron microscopy. Proc. R. Soc. London Ser. A, 317, 319–340.
Dube, P., Tavares, P., Lurz, R. & van Heel, M. (1993). The portal protein of bacteriophage SPP1: a DNA pump with 13-fold symmetry. EMBO J. 12, 1303–1309.
Farrow, N. A. & Ottensmeyer, F. P. (1992). A posteriori determination of relative projection directions of arbitrarily oriented macromolecules. J. Opt. Soc. Am. A, 9, 1749–1760.
Fernandez, J.-J., Sanjurjo, J. R. & Carazo, J. M. (1997). A spectral estimation approach to contrast transfer function detection in electron microscopy. Ultramicroscopy, 68, 267–295.
Fishman, G. (1995). Monte Carlo: Concepts, Algorithms, and Applications. New York: Springer.
Frangakis, A. S., Bohm, J., Forster, F., Nickell, S., Nicastro, D., Typke, D., Hegerl, R. & Baumeister, W. (2002). Identification of macromolecular complexes in cryo-electron tomograms of phantom cells. Proc. Natl Acad. Sci. USA, 99, 14153–14158.
Frank, J. (1990). Classification of macromolecular assemblies studied as `single particles'. Quart. Rev. Biophys. 23, 281–329.
Frank, J. & McEwen, B. (1992). Alignment by cross-correlation. In Electron Tomography, edited by J. Frank, pp. 205–214. New York: Plenum.
Frank, J., Penczek, P., Agrawal, R. K., Grassucci, R. A. & Heagle, A. B. (2000). Three-dimensional cryo-electron microscopy of ribosomes. Methods Enzymol. 317, 276–291.
Frank, J., Penczek, P. & Liu, W. (1992). Alignment, classification, and three-dimensional reconstruction of single particles embedded in ice. Scan. Microsc. Suppl. 6, 11–20.
Frank, J., Radermacher, M., Penczek, P., Zhu, J., Li, Y., Ladjadj, M. & Leith, A. (1996). SPIDER and WEB: processing and visualization of images in 3D electron microscopy and related fields. J. Struct. Biol. 116, 190–199.
Frank, J., Radermacher, M., Wagenknecht, T. & Verschoor, A. (1988). Studying ribosome structure by electron microscopy and computer-image processing. Methods Enzymol. 164, 3–35.
Frank, J., Verschoor, A. & Boublik, M. (1981). Computer averaging of electron micrographs of 40S ribosomal subunits. Science, 214, 1353–1355.
Frank, J., Verschoor, A. & Boublik, M. (1982). Multivariate statistical analysis of ribosome electron micrographs. L and R lateral views of the 40 S subunit from HeLa cells. J. Mol. Biol. 161, 107–133.
Frank, J. & Wagenknecht, T. (1984). Automatic selection of molecular images from electron micrographs. Ultramicroscopy, 12, 169–176.
Gabashvili, I. S., Agrawal, R. K., Spahn, C. M., Grassucci, R. A., Svergun, D. I., Frank, J. & Penczek, P. (2000). Solution structure of the E. coli 70S ribosome at 11.5 Å resolution. Cell, 100, 537–549.
Goncharov, A. B., Vainshtein, B. K., Ryskin, A. I. & Vagin, A. A. (1987). Three-dimensional reconstruction of arbitrarily oriented identical particles from their electron photomicrographs. Sov. Phys. Crystallogr. 32, 504–509.
Gonzalez, R. F. & Woods, R. E. (2002). Digital Image Processing. Upper Saddle River: Prentice Hall.
Grigorieff, N. (2007). FREALIGN: High-resolution refinement of single particle structures. J. Struct. Biol. 157, 117.
Hall, R. J. & Patwardhan, A. (2004). A two step approach for semi-automated particle selection from low contrast cryo-electron micrographs. J. Struct. Biol. 145, 19–28.
Heel, M. van (1982). Detection of objects in quantum-noise limited images. Ultramicroscopy, 8, 331–342.
Heel, M. van (1987a). Angular reconstitution: a posteriori assignment of projection directions for 3D reconstruction. Ultramicroscopy, 21, 111–124.
Heel, M. van (1987b). Similarity measures between images. Ultramicroscopy, 21, 95–100.
Heel, M. van, Gowen, B., Matadeen, R., Orlova, E. V., Finn, R., Pape, T., Cohen, D., Stark, H., Schmidt, R., Schatz, M. & Patwardhan, A. (2000). Single-particle electron cryomicroscopy: towards atomic resolution. Quart. Rev. Biophys. 33, 307–369.
Heel, M. van, Harauz, G. & Orlova, E. V. (1996). A new generation of the IMAGIC image processing system. J. Struct. Biol. 116, 17–24.
Heel, M. van & Hollenberg, J. (1980). The stretching of distorted images of two-dimensional crystals. In Electron Microscopy at Molecular Dimensions, edited by W. Baumeister, pp. 256–260. Berlin: Springer.
Hohn, M., Tang, G., Goodyear, G., Baldwin, P. R., Huang, Z., Penczek, P. A., Yang, C., Glaeser, R. M., Adams, P. D. & Ludtke, S. J. (2007). SPARX, a new environment for cryo-EM image processing. J. Struct. Biol. 157, 47–55.
Huang, Z., Baldwin, P. R., Mullapudi, S. R. & Penczek, P. A. (2003). Automated determination of parameters describing power spectra of micrograph images in electron microscopy. J. Struct. Biol. 144, 79–94.
Huang, Z. & Penczek, P. A. (2004). Application of template matching technique to particle detection in electron micrographs. J. Struct. Biol. 145, 29–40.
Jones, T. A., Zou, J.-Y., Cowan, S. W. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Cryst. A47, 110–119.
Joyeux, L. & Penczek, P. A. (2002). Efficiency of 2D alignment methods. Ultramicroscopy, 92, 33–46.
Kessel, M., Radermacher, M. & Frank, J. (1985). The structure of the stalk surface layer of a brine pond microorganism: correlation averaging applied to a double layered lattice structure. J. Microsc. 139, 63–74.
Lata, K. R., Penczek, P. & Frank, J. (1995). Automatic particle picking from electron micrographs. Ultramicroscopy, 58, 381–391.
Lawrence, M. C. (1992). Least-squares method of alignment using markers. In Electron Tomography, edited by J. Frank, pp. 197–204. New York: Plenum Press.
Ludtke, S. J., Baldwin, P. R. & Chiu, W. (1999). EMAN: semi-automated software for high-resolution single-particle reconstructions. J. Struct. Biol. 128, 82–97.
Mallick, S. P., Carragher, B., Potter, C. S. & Kriegman, D. J. (2005). ACE: automated CTF estimation. Ultramicroscopy, 104, 8–29.
Marabini, R. & Carazo, J. M. (1996). On a new computationally fast image invariant based on bispectral projections. Pattern Recognit. Lett. 17, 959–967.
Mindell, J. A. & Grigorieff, N. (2003). Accurate determination of local defocus and specimen tilt in electron microscopy. J. Struct. Biol. 142, 334–347.
Ming, D. M., Kong, Y. F., Lambert, M. A., Huang, Z. & Ma, J. P. (2002). How to describe the movement of protein without amino acids sequence and coordinates. Proc. Natl Acad. Sci. USA, 13, 8620–8625.
Mitra, K., Schaffitzel, C., Shaikh, T., Tama, F., Jenni, S., Brooks, C. L. III, Ban, N. & Frank, J. (2005). Structure of the E. coli protein-conducting channel bound to a translating ribosome. Nature (London), 438, 318–324.
Mouche, F., Boisset, N. & Penczek, P. A. (2001). Lumbricus terrestris hemoglobin – the architecture of linker chains and structural variation of the central toroid. J. Struct. Biol. 133, 176–192.
Nissen, P., Hansen, J., Ban, N., Moore, P. B. & Steitz, T. A. (2000). The structural basis of ribosome activity in peptide bond synthesis. Science, 289, 920–930.
Parlett, B. N. (1980). A new look at the Lanczos algorithm for solving symmetric systems of linear equations. Linear Algebr. Its Appl. 29, 323–346.
Penczek, P. (1998). Measures of resolution using Fourier shell correlation. J. Mol. Biol. 280, 115–116.
Penczek, P., Ban, N., Grassucci, R. A., Agrawal, R. K. & Frank, J. (1999). Haloarcula marismortui 50S subunit – complementarity of electron microscopy and X-ray crystallographic information. J. Struct. Biol. 128, 44–50.
Penczek, P., Marko, M., Buttle, K. & Frank, J. (1995). Double-tilt electron tomography. Ultramicroscopy, 60, 393–410.
Penczek, P., Radermacher, M. & Frank, J. (1992). Three-dimensional reconstruction of single particles embedded in ice. Ultramicroscopy, 40, 33–53.
Penczek, P. A. (2002). Three-dimensional spectral signal-to-noise ratio for a class of reconstruction algorithms. J. Struct. Biol. 138, 34–46.
Penczek, P. A., Chao, Y., Frank, J. & Spahn, C. M. T. (2006). Estimation of variance in single particle reconstruction using the bootstrap technique. J. Struct. Biol. 154, 168–183.
Penczek, P. A. & Frank, J. (2006). Resolution in electron tomography. In Electron Tomography: Methods for Three-Dimensional Visualization of Structures in the Cell, 2nd ed., edited by J. Frank, pp. 307–330. Berlin: Springer.
Penczek, P. A., Frank, J. & Spahn, C. M. T. (2006a). A method of focused classification, based on the bootstrap 3D variance analysis, and its application to EF-G-dependent translocation. J. Struct. Biol. 154, 184–194.
Penczek, P. A., Frank, J. & Spahn, C. M. T. (2006b). Conformational analysis of macromolecules analyzed by cryo-electron microscopy. In Microscopy and Microanalysis, edited by P. Kotula, M. Marko, J. H. Scott et al., p. CD386. Chicago: Cambridge University Press.
Penczek, P. A., Grassucci, R. A. & Frank, J. (1994). The ribosome at improved resolution: new techniques for merging and orientation refinement in 3D cryo-electron microscopy of biological particles. Ultramicroscopy, 53, 251–270.
Penczek, P. A., Zhu, J. & Frank, J. (1996). A common-lines based method for determining orientations for N > 3 particle projections simultaneously. Ultramicroscopy, 63, 205–218.
Penczek, P. A., Zhu, J., Schröder, R. & Frank, J. (1997). Three-dimensional reconstruction with contrast transfer function compensation from defocus series. Scan. Microsc. Suppl. 11, 1–10.
Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M., Meng, E. C. & Ferrin, T. E. (2004). UCSF Chimera – a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612.
Powell, M. J. D. (1973). On search directions for minimization algorithm. Math. Program. 4, 193–201.
Provencher, S. W. & Vogel, R. H. (1988). Three-dimensional reconstruction from electron micrographs of disordered specimens. I. Method. Ultramicroscopy, 25, 209–221.
Pullan, L., Mullapudi, S., Huang, Z., Baldwin, P. R., Chin, C., Sun, W., Tsujimoto, S., Kolodziej, S., Stoops, J. K., Lee, J. C., Waxham, M. N., Bean, A. J. & Penczek, P. A. (2006). The endosome-associated protein Hrs is hexameric and controls cargo sorting as a `master molecule'. Structure, 14, 661–671.
Radermacher, M. (1994). Three-dimensional reconstruction from random projections: orientational alignment via Radon transforms. Ultramicroscopy, 53, 121–136.
Radermacher, M., Wagenknecht, T., Verschoor, A. & Frank, J. (1987). Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50S ribosomal subunit of Escherichia coli. J. Microsc. 146, 113–136.
Roseman, A. M. (2003). Particle finding in electron micrographs using a fast local correlation algorithm. Ultramicroscopy, 94, 225–236.
Roseman, A. M., Chen, S., White, H., Braig, K. & Saibil, H. R. (1996). The chaperonin ATPase cycle: mechanism of allosteric switching and movements of substrate-binding domains in GroEL. Cell, 87, 241–251.
Ruiz, T., Mechin, I., Bar, J., Rypniewski, W., Kopperschlager, G. & Radermacher, M. (2003). The 10.8-Å structure of Saccharomyces cerevisiae phosphofructokinase determined by cryo-electron microscopy: localization of the putative fructose 6-phosphate binding sites. J. Struct. Biol. 143, 124–134.
Saad, A., Ludtke, S. J., Jakana, J., Rixon, F. J., Tsuruta, H. & Chiu, W. (2001). Fourier amplitude decay of electron cryomicroscopic images of single particles and effects on structure determination. J. Struct. Biol. 133, 32–42.
Sander, B., Golas, M. M. & Stark, H. (2003). Automatic CTF correction for single particles based upon multivariate statistical analysis of individual power spectra. J. Struct. Biol. 142, 392–401.
Saxton, W. O. (1978). Computer Techniques for Image Processing of Electron Microscopy. New York: Academic Press.
Saxton, W. O. & Baumeister, W. (1982). The correlation averaging of a regularly arranged bacterial envelope protein. J. Microsc. 127, 127–138.
Schatz, M. & van Heel, M. (1990). Invariant classification of molecular views in electron micrographs. Ultramicroscopy, 32, 255–264.
Schatz, M. & van Heel, M. (1992). Invariant recognition of molecular projections in vitreous ice preparations. Ultramicroscopy, 45, 15–22.
Sigworth, F. J. (2004). Classical detection theory and the cryo-EM particle selection problem. J. Struct. Biol. 145, 111–122.
Spahn, C. M. T., Penczek, P., Leith, A. & Frank, J. (2000). A method for differentiating proteins from nucleic acids in intermediate-resolution density maps: cryo-electron microscopy defines the quaternary structure of the Escherichia coli 70S ribosome. Struct. Fold. Des. 8, 937–948.
Unser, M., Trus, B. L. & Steven, A. C. (1987). A new resolution criterion based on spectral signal-to-noise ratios. Ultramicroscopy, 23, 39–51.
Vogel, R. H. & Provencher, S. W. (1988). Three-dimensional reconstruction from electron micrographs of disordered specimens. II. Implementation and results. Ultramicroscopy, 25, 223–239.
Walz, J., Typke, D., Nitsch, M., Koster, A. J., Hegerl, R. & Baumeister, W. (1997). Electron tomography of single ice-embedded macromolecules: three-dimensional alignment and classification. J. Struct. Biol. 120, 387–395.
Welch, P. D. (1967). The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short modified periodograms. IEEE Trans. Audio Electroacoust. AU-15, 70–73.
Wong, H. C., Chen, J., Mouche, F., Rouiller, I. & Bern, M. (2004). Model-based particle picking for cryo-electron microscopy. J. Struct. Biol. 145, 157–167.
Yang, C., Ng, E. G. & Penczek, P. A. (2005). Unified 3D structure and projection orientation refinement using quasi-Newton algorithm. J. Struct. Biol. 149, 53–64.
Yin, Z. H., Zheng, Y. L., Doerschuk, P. C., Natarajan, P. & Johnson, J. E. (2003). A statistical approach to computer processing of cryo-electron microscope images: virion classification and 3D reconstruction. J. Struct. Biol. 144, 24–50.
Yin, Z. Y., Zheng, Y. L. & Doerschuk, P. C. (2001). An ab initio algorithm for low-resolution 3D reconstructions from cryo-electron microscopy images. J. Struct. Biol. 133, 132–142.
Zhu, J., Penczek, P. A., Schröder, R. & Frank, J. (1997). Three-dimensional reconstruction with contrast transfer function correction from energy-filtered cryo-electron micrographs: procedure and application to the 70S Escherichia coli ribosome. J. Struct. Biol. 118, 197–219.
Zhu, Y., Carragher, B., Glaeser, R. M., Fellmann, D., Bajaj, C., Bern, M., Mouche, F., de Haas, F., Hall, R. J., Kriegman, D. J., Ludtke, S. C., Mallick, S. P., Penczek, P. A., Roseman, A. M., Sigworth, F. J., Volkmann, N. & Potter, C. S. (2004). Automatic particle selection: results of a comparative study. J. Struct. Biol. 145, 3–14.