International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by E. Arnold, D. M. Himmel and M. G. Rossmann

International Tables for Crystallography (2012). Vol. F, ch. 4.4, p. 140

Section 4.4.2. Design of multiple constructs: bioinformatics analysis of genome sequences

K. H. Choi^a*

^aDepartment of Biochemistry and Molecular Biology, 6.614C Basic Science, The University of Texas Medical Branch, University Blvd, Galveston, TX 77555–0647, USA
Correspondence e-mail:


The two primary bottlenecks in crystal structure determination are often obtaining a soluble protein and achieving initial crystallization. If initial trials do not yield crystals, diffraction-quality crystals or a structure determination, modifying the protein itself is often more successful in producing useful crystals than exhaustive screening and optimization of crystallization conditions. In the high-throughput (HT) approach, stable domains with variable N or C termini can be generated in parallel, vastly reducing time and resource requirements. The clones can then be screened systematically for soluble, active proteins amenable to crystallization. This approach is particularly suited to problems in which one has no a priori estimate of which constructs are likely to be most successful at the expression, purification or crystallization steps.

It may be difficult to predict which modifications will influence protein solubility and the protein's ability to crystallize. A common and often useful strategy is to crystallize a similar protein from an alternative biological source. If an alternative source is either unavailable or equally problematic, gene fusion with large affinity tags can be used to increase protein expression and solubility. Several fusion partners, including thioredoxin, maltose binding protein, glutathione S-transferase, intein, SUMO (small ubiquitin-like modifier) protein and calmodulin-binding protein, have been used to generate soluble proteins. However, the fused tag may prevent crystallization because of conformational heterogeneity resulting from a flexible linker; hence, the tag must often be removed in an additional purification step. Protein engineering by amino-acid substitutions and chemical modifications (such as methylation of exposed lysines) has been shown to improve crystallization of some proteins (Rayment, 1997; Walter et al., 2006; Wingren et al., 2003). Large, flexible solvent-exposed residues (e.g. Lys or Glu) on the surface of proteins can be substituted with smaller residues (e.g. Ala) to facilitate the formation of intermolecular contacts that stabilize the crystal lattice (the surface-entropy reduction approach; Cooper et al., 2007; Derewenda, 2004). Random mutagenesis followed by selection for a desired phenotype [e.g. folding ability assessed with a green fluorescent protein (GFP) reporter, or solubility assays] has been used to produce soluble proteins from insoluble wild-type proteins (the directed-evolution approach; Pédelacq et al., 2002). Since symmetric molecules such as homodimers may crystallize more readily than monomeric proteins, new Cys residues can be introduced into a monomeric protein to induce homodimer formation via intermolecular disulfide bonds (the synthetic symmetrization approach; Banatao et al., 2006).
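As a rough illustration of how candidate sites for surface-entropy reduction might be listed from sequence alone, the following Python sketch flags runs of consecutive Lys/Glu residues and proposes Ala substitutions for them. It is not the procedure of Cooper et al. (2007) or Derewenda (2004): the function name `ser_candidates`, the run-length criterion `min_run` and the toy sequence are all assumptions made for this example, and the sketch has no knowledge of which residues are actually solvent exposed.

```python
import re

def ser_candidates(seq, min_run=2, targets="KE"):
    """List Ala substitutions for each run of >= min_run consecutive
    target residues (Lys/Glu by default).

    Hypothetical helper: the run-length rule is an assumption of this
    sketch, not a published criterion. Residue numbers are 1-based.
    """
    pattern = re.compile(f"[{targets}]{{{min_run},}}")
    clusters = []
    for match in pattern.finditer(seq):
        clusters.append(
            [f"{seq[i]}{i + 1}A" for i in range(match.start(), match.end())]
        )
    return clusters

# Toy sequence, invented for illustration only.
toy_seq = "MSGKDLEELLKKAGSPTDVIVEVRGNSKEW"
for cluster in ser_candidates(toy_seq):
    print("/".join(cluster))
```

In practice each proposed cluster would still need to be checked against structural or predicted-exposure information before any mutant is made.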

Bioinformatics tools, including multiple sequence alignment and sequence motif searches, as well as prediction of secondary structures, domain boundaries, membrane-spanning regions and disordered regions, can be used to aid the rational design of constructs. General considerations in designing truncation mutations are to avoid truncations in the middle of predicted secondary-structural elements, to avoid hydrophobic residues at the termini, and to eliminate membrane-spanning regions from the construct. In the case of multi-domain proteins, truncation of the protein to smaller functional domains can be effective. The exact start and end of a domain are still difficult to predict even with domain search programs, and thus biochemical approaches such as limited proteolysis can aid the determination of domain boundaries. The optimal step size for truncation mutations cannot be predicted, but protein constructs varying by approximately five residues in length often show significant differences in solubility and crystallization behaviour (Choi et al., 2004; Gräslund, Sagemark et al., 2008).
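The truncation guidelines above can be sketched as a small Python routine that enumerates candidate constructs. This is a minimal sketch under stated assumptions, not a published protocol: the function names, the one-letter secondary-structure encoding ('H' helix, 'E' strand, '-' coil), the hydrophobic residue set and the toy sequence are all hypothetical. A terminus is accepted only if the terminal residue lies in predicted coil and is not hydrophobic; removal of membrane-spanning regions is not handled here.

```python
# Assumed hydrophobic residue set; other classifications are possible.
HYDROPHOBIC = set("AVILMFW")

def boundary_ok(seq, ss, i):
    """True if residue i (0-based) is in predicted coil and not hydrophobic."""
    return ss[i] == "-" and seq[i] not in HYDROPHOBIC

def enumerate_constructs(seq, ss, start_range, end_range, step=5):
    """Return (start, end) pairs, 0-based and inclusive, for all constructs
    whose termini pass boundary_ok.

    start_range / end_range are inclusive (first, last) index windows that
    are sampled every `step` residues (about five, as suggested in the text).
    """
    if len(seq) != len(ss):
        raise ValueError("sequence and prediction must have equal length")
    starts = [i for i in range(start_range[0], start_range[1] + 1, step)
              if boundary_ok(seq, ss, i)]
    ends = [j for j in range(end_range[0], end_range[1] + 1, step)
            if boundary_ok(seq, ss, j)]
    return [(i, j) for i in starts for j in ends if i < j]

# Toy sequence and prediction, invented for illustration only.
seq = "MSGKDLEELLKKAGSPTDVIVEVRGNSKEW"
ss = "---HHHHHHHHH-----EEEEEEE------"
constructs = enumerate_constructs(seq, ss, (0, 5), (24, 29), step=1)
for i, j in constructs:
    print(f"{i + 1}-{j + 1}", seq[i:j + 1])
```

Each (start, end) pair corresponds to one clone to be made and screened in parallel. Because domain boundaries are hard to predict, the windows would normally be centred on boundaries suggested by domain searches or limited proteolysis.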


Banatao, D. R., Cascio, D., Crowley, C. S., Fleissner, M. R., Tienson, H. L. & Yeates, T. O. (2006). An approach to crystallizing proteins by synthetic symmetrization. Proc. Natl Acad. Sci. USA, 103, 16230–16235.
Choi, K. H., Groarke, J. M., Young, D. C., Rossmann, M. G., Pevear, D. C., Kuhn, R. J. & Smith, J. L. (2004). Design, expression, and purification of a flaviviridae polymerase using a high-throughput approach to facilitate crystal structure determination. Protein Sci. 13, 2685–2692.
Cooper, D. R., Boczek, T., Grelewska, K., Pinkowska, M., Sikorska, M., Zawadzki, M. & Derewenda, Z. (2007). Protein crystallization by surface entropy reduction: optimization of the SER strategy. Acta Cryst. D63, 636–645.
Derewenda, Z. S. (2004). Rational protein crystallization by mutational surface engineering. Structure, 12, 529–535.
Gräslund, S., Sagemark, J., Berglund, H., Dahlgren, L. G., Flores, A., Hammarström, M., Johansson, I., Kotenyova, T., Nilsson, M., Nordlund, P. & Weigelt, J. (2008). The use of systematic N- and C-terminal deletions to promote production and structural studies of recombinant proteins. Protein Expr. Purif. 58, 210–221.
Pédelacq, J. D., Piltch, E., Liong, E. C., Berendzen, J., Kim, C. Y., Rho, B. S., Park, M. S., Terwilliger, T. C. & Waldo, G. S. (2002). Engineering soluble proteins for structural genomics. Nat. Biotechnol. 20, 927–932.
Rayment, I. (1997). Reductive alkylation of lysine residues to alter crystallization properties of proteins. Methods Enzymol. 276, 171–179.
Walter, T. S., Meier, C., Assenberg, R., Au, K. F., Ren, J., Verma, A., Nettleship, J. E., Owens, R. J., Stuart, D. I. & Grimes, J. M. (2006). Lysine methylation as a routine rescue strategy for protein crystallization. Structure, 14, 1617–1622.
Wingren, C., Edmundson, A. B. & Borrebaeck, C. A. (2003). Designing proteins to crystallize through beta-strand pairing. Protein Eng. 16, 255–264.
